

Page 1: Structured Low-Rank Matrix Approximation in Gaussian ...cpslab.snu.ac.kr/publications/papers/2015_icra_factspsd.pdf · Structured Low-Rank Matrix Approximation in Gaussian Process

Structured Low-Rank Matrix Approximation in Gaussian Process Regression for Autonomous Robot Navigation

Eunwoo Kim, Sungjoon Choi, and Songhwai Oh

Abstract— This paper considers the problem of approximating a kernel matrix in autoregressive Gaussian process regression (AR-GP) in the presence of measurement noise or natural errors, for modeling complex motions of pedestrians in a crowded environment. While a number of methods have been proposed to robustly predict future motions of humans, prediction remains a difficult problem in the presence of measurement noise. This paper addresses this issue by proposing a structured low-rank matrix approximation method using nuclear-norm regularized $l_1$-norm minimization in AR-GP for robust motion prediction of dynamic obstacles. The proposed method approximates a kernel matrix by finding an orthogonal basis using low-rank symmetric positive semi-definite matrix approximation, assuming that a kernel matrix can be well represented by a small number of dominating basis vectors. The proposed method is suitable for predicting the motion of a pedestrian, so that it can be used for safe autonomous robot navigation in a crowded environment. The proposed method is applied to well-known regression and motion prediction problems to demonstrate its robustness and excellent performance compared to existing approaches.

I. INTRODUCTION

We are witnessing service robots appearing in public places, offices, hospitals, and homes, interacting with humans by performing routine tasks such as household chores and delivering medicine and supplies. In the near future, more service robots will be assisting and cooperating with humans in many dynamic and complex real-world environments. In such environments, it is difficult to operate successfully without accurate prediction of dynamic obstacles or moving humans. Since safe operation is an important requirement for the success of service robots, the ability to predict motions of pedestrians and moving objects is of paramount importance [1].

For safe navigation of a mobile robot in a dynamic and crowded environment, autonomous robot navigation has been studied extensively in recent years [2]–[9], and it requires predicting the trajectories of pedestrians or moving objects precisely. In [2], a probabilistic model based on a partially observable Markov decision process is used to predict trajectories of pedestrians for autonomous robot navigation. Fulgenzi et al. [3] proposed a motion pattern model of pedestrians using a Gaussian process (GP). Wang et al. [10] proposed Gaussian process dynamical models and their application to learning models of human motion. Henry et al. [4] proposed an inverse reinforcement learning based approach for human-like navigation in a crowded environment. Trautman et al. [7] developed a novel cooperative navigation approach using a GP and conducted the first trial of robot navigation in a crowded cafeteria. In general, it is assumed that the current positions of a robot and moving obstacles are available [8] or can be estimated from an external device [7]. However, this assumption makes existing approaches impractical in many real environments, since collecting exact locations using an external device can be costly and may be possible only in a laboratory setting.

This work was supported in part by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2013R1A1A2065551) and by the ICT R&D program of MSIP/IITP (14-824-09-014, Basic Software Research in Human-Level Lifelong Machine Learning).

E. Kim, S. Choi, and S. Oh are with the Department of Electrical and Computer Engineering, ASRI, Seoul National University, Seoul 151-744, Korea (e-mail: {eunwoo.kim, sungjoon.choi, songhwai.oh}@cpslab.snu.ac.kr).

Recently, Choi et al. [9] proposed an autoregressive Gaussian process (AR-GP) to model complex motions of a pedestrian from the egocentric view of a mobile robot. AR-GP is capable of capturing complex human motions using Gaussian process regression (GPR), a nonparametric regression method, whereas parametric models, e.g., a linear model, can only handle simple human motions [9]. This work was extended in [11] to a robust AR-GP motion model by removing the effects of measurement noise and outliers in the training set using low-rank kernel matrix approximation based on the $l_1$-norm. While the approximation method proposed in [11] is robust against outliers, it can fail to find a feasible solution since the algorithm does not guarantee the positive semi-definiteness of its solution, which approximates the kernel matrix in AR-GP. Since an incorrect estimate of the kernel matrix can lead to an unstable situation when a robot navigates using AR-GP, it is necessary to approximate a kernel matrix while preserving its positive semi-definite structure.

In this paper, we propose a novel structured low-rank matrix approximation, which finds a low-rank solution of a symmetric positive semi-definite matrix using nuclear-norm regularized $l_1$-norm minimization, for robust autoregressive Gaussian process regression (AR-GP). The proposed method approximates a kernel matrix used in AR-GP using its low-rank kernel approximation, assuming that the kernel matrix can be represented using a small number of dominating principal components, eliminating outliers and erroneous aspects in the training data set. The proposed method is applied to motion prediction problems to demonstrate its robustness against unwanted corruptions. Furthermore, the method is applied in a physical experiment using a Pioneer 3DX mobile robot and a Microsoft Kinect camera for motion prediction and autonomous robot navigation.

The remainder of this paper is organized as follows: In Section II, we briefly review low-rank matrix approximation and Gaussian process regression. In Section III, we propose a structured low-rank matrix approximation algorithm. Then, we present various experimental results to evaluate the proposed method in Section IV.

II. PRELIMINARIES

A. Low-rank matrix approximation

Low-rank matrix approximation is a minimization problem in which the cost function measures the fit between an observation matrix and an approximating matrix, subject to the constraint that the approximating matrix has a reduced rank. The problem arises in a number of applications in machine learning and computer vision, such as image denoising, collaborative filtering, background modeling, and data reconstruction, to name a few [12], [13].

Let us consider the $l_2$ approximation of a matrix $G$. The problem is to minimize the following cost function for a given $G$:

$$\arg\min_{P,X} \|G - PX\|_F, \tag{1}$$

where $G \in \mathbb{R}^{m \times n}$, $P \in \mathbb{R}^{m \times r}$, and $X \in \mathbb{R}^{r \times n}$ are the observation, projection, and coefficient matrices, respectively. Here, $r$ is a predefined parameter less than $\min(m, n)$ and $PX$ is a low-rank approximation of $G$. However, the $l_2$-based approximation is highly sensitive to non-Gaussian noise. To overcome this disadvantage, methods based on the $l_1$-norm have emerged in many fields [12]–[14].
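As a point of reference for problem (1), the following is a minimal NumPy sketch, not taken from the paper: by the Eckart–Young theorem, a truncated SVD gives a minimizer of the Frobenius-norm cost for a fixed rank $r$. The function name `low_rank_l2` and the test matrix are illustrative assumptions.

```python
import numpy as np

def low_rank_l2(G, r):
    """Rank-r approximation of G minimizing ||G - P X||_F (truncated SVD)."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    P = U[:, :r]                  # m x r projection matrix, orthonormal columns
    X = s[:r, None] * Vt[:r, :]   # r x n coefficient matrix
    return P, X

# Illustrative data: an 8 x 6 matrix that is exactly rank 3.
rng = np.random.default_rng(0)
G = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 6))
P, X = low_rank_l2(G, r=3)
residual = np.linalg.norm(G - P @ X, "fro")  # close to zero for this G
```

Because the cost is a sum of squared entries, a single grossly corrupted entry of `G` can dominate the fit, which is the sensitivity the text refers to.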

There is another family of approaches using recent advances in nuclear-norm minimization, which is also known as robust principal component analysis (RPCA) [14]. RPCA decomposes the observation matrix into a low-rank matrix and a sparse matrix by solving the $l_1$-norm regularized nuclear-norm minimization problem:

$$\min_{D,E} \|D\|_* + \lambda \|E\|_1 \quad \text{s.t.} \quad G = D + E, \tag{2}$$

where $D$, $E$, and $G$ are the low-rank, sparse error, and observation matrices, respectively. Here, the nuclear norm of a matrix is the sum of its singular values, i.e., $\|\Sigma\|_* = \sum_i \sigma_i$, where $\sigma_i$ is a singular value of $\Sigma$. RPCA has recently achieved many successful results in machine learning and computer vision [14], [15].
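To make the two terms of (2) concrete, here is a small sketch that evaluates the RPCA objective for a hand-built decomposition. It assumes the usual RPCA convention that $\|E\|_1$ is the entrywise sum of absolute values; the helper names and the example matrices are illustrative, not from the paper.

```python
import numpy as np

def nuclear_norm(A):
    """Sum of the singular values of A."""
    return np.linalg.svd(A, compute_uv=False).sum()

def rpca_objective(D, E, lam):
    """||D||_* + lam * ||E||_1, with ||.||_1 taken entrywise (assumption)."""
    return nuclear_norm(D) + lam * np.abs(E).sum()

# Rank-1 part: outer product of vectors with norms 3 and 5, so the only
# nonzero singular value is 15.
D = np.outer([1.0, 2.0, 2.0], [3.0, 0.0, 4.0])
# Sparse part: a single corrupted entry.
E = np.zeros((3, 3))
E[0, 1] = 7.0
G = D + E  # observation satisfying the constraint G = D + E
value = rpca_objective(D, E, lam=0.5)  # 15 + 0.5 * 7
```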

B. Gaussian process regression

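A Gaussian process (GP) is a collection of random variables which has a joint Gaussian distribution and is specified by its mean function $m(\mathbf{x})$ and covariance function $k(\mathbf{x}, \mathbf{x}')$ [16]. A Gaussian process $f(\mathbf{x})$ is expressed as:

$$f(\mathbf{x}) \sim GP(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}')). \tag{3}$$

Suppose that $\mathbf{x} \in \mathbb{R}^n$ is an input and $y_i \in \mathbb{R}$ is an output. For a noisy observation set $\mathcal{D} = \{(\mathbf{x}_i, y_i) \mid i = 1, \ldots, n\}$, we can consider the following observation model:

$$y_i = f(\mathbf{x}_i) + w_i, \tag{4}$$

where $w_i \in \mathbb{R}$ is zero-mean Gaussian noise with variance $\sigma_w^2$. Then the covariance of $y_i$ and $y_j$ can be expressed as

$$\mathrm{cov}(y_i, y_j) = k(\mathbf{x}_i, \mathbf{x}_j) + \sigma_w^2 \delta_{ij}, \tag{5}$$

where $\delta_{ij}$ is the Kronecker delta function, which is 1 if $i = j$ and 0 otherwise. $k(\mathbf{x}_i, \mathbf{x}_j) = \phi(\mathbf{x}_i) \cdot \phi(\mathbf{x}_j)$ is a covariance function based on some nonlinear mapping function $\phi$. The function $k$ is also known as a kernel function.

We can represent (5) in matrix form as follows:

$$\mathrm{cov}(\mathbf{y}) = K + \sigma_w^2 I, \tag{6}$$

where $\mathbf{y} = [y_1 \ldots y_n]^T$ and $K$ is a kernel matrix such that $[K]_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$.

The conditional distribution of a new output $y_*$ at a new input $\mathbf{x}_*$ given $\mathcal{D}$ becomes

$$y_* \mid \mathcal{D}, \mathbf{x}_* \sim \mathcal{N}(\bar{y}_*, \mathbb{V}(y_*)), \tag{7}$$

where

$$\bar{y}_* = \mathbf{k}_*^T (K + \sigma_w^2 I)^{-1} \mathbf{y} = \mathbf{k}_*^T \Lambda \mathbf{y}, \tag{8}$$

where $\Lambda = (K + \sigma_w^2 I)^{-1}$ and the covariance of $y_*$ is

$$\mathbb{V}(y_*) = k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}_*^T (K + \sigma_w^2 I)^{-1} \mathbf{k}_*. \tag{9}$$

Here, $\mathbf{k}_* \in \mathbb{R}^n$ is a covariance vector between the new data $\mathbf{x}_*$ and the existing data, such that $[\mathbf{k}_*]_i = k(\mathbf{x}_*, \mathbf{x}_i)$. Note that when making predictions from a collected training set, the computational cost of GP can be reduced by pre-computing the inverse of a kernel matrix [9].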
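Equations (8) and (9) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the RBF kernel form, its hyperparameters (`length_scale`, `signal_var`), the noise variance, and the toy sine data are all chosen here for the example and are not the paper's settings.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between rows of A and rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * sq / length_scale**2)

def gp_predict(X, y, x_star, noise_var=1e-2):
    """Predictive mean (8) and variance (9) at a single test input x_star."""
    K = rbf_kernel(X, X)
    # Lambda = (K + sigma_w^2 I)^{-1}; it depends only on the training set,
    # so it can be computed once and reused for every prediction.
    Lam = np.linalg.inv(K + noise_var * np.eye(len(X)))
    k_star = rbf_kernel(X, x_star[None, :])[:, 0]
    mean = k_star @ Lam @ y
    var = rbf_kernel(x_star[None, :], x_star[None, :])[0, 0] - k_star @ Lam @ k_star
    return mean, var

# Toy data: noise-free samples of sin(x) on [0, 5].
X = np.linspace(0.0, 5.0, 20)[:, None]
y = np.sin(X[:, 0])
mean, var = gp_predict(X, y, np.array([2.5]))
```

In practice a Cholesky factorization is usually preferred over an explicit inverse for numerical stability; the explicit inverse is kept here only to mirror the symbol $\Lambda$ in (8).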

An autoregressive Gaussian process (AR-GP) is a method to predict future positions of a moving object given a finite number of past positions [9], [11], which is applied in this paper for predicting future motions of pedestrians.
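The autoregressive setup described above can be illustrated by how a training set might be assembled from one trajectory: each input stacks the $p$ most recent positions and the target is the next position. This is a generic lag embedding written for illustration; the exact input construction used in [9] (for example, any egocentric coordinate transform) is not given in this text, and `make_ar_dataset` is a hypothetical helper.

```python
import numpy as np

def make_ar_dataset(traj, p):
    """Build (input, target) pairs from a trajectory of shape (T, d).

    Input t stacks positions t-p, ..., t-1; target t is the position at t.
    """
    T = len(traj)
    X = np.stack([traj[t - p:t].ravel() for t in range(p, T)])
    Y = traj[p:]
    return X, Y

# Five 2-D positions along a straight line, autoregressive order p = 3.
traj = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0], [3.0, 1.5], [4.0, 2.0]])
X, Y = make_ar_dataset(traj, p=3)
```

A GP regressor such as the one in (8) and (9) would then be fitted on `X` against each coordinate of `Y`.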

Kim et al. [11] proposed a low-rank kernel matrix approximation algorithm using the relationship between GPR and low-rank kernel matrix approximation and showed the robustness of that method in the presence of outliers and measurement noise. However, the approximated matrix in [11] is not a proper kernel matrix since positive semi-definiteness is not assumed in the algorithm. In the next section, we propose a structured low-rank approximation algorithm which guarantees the positive semi-definiteness of its solution.

III. THE PROPOSED METHOD

A. Formulation

In this section, we propose a structured low-rank matrix approximation method for approximating a kernel matrix while making sure that the approximated matrix is positive semi-definite. For robustness of the proposed method in the presence of erroneous data, we use robust measures in the cost function. Instead of methods based on the $l_2$-norm, which are known to be highly sensitive to outliers, the proposed method is based on the robust principal component analysis (RPCA) framework [14] to reduce the effect of outliers with an automatic rank search. Hence, we approximate a kernel matrix using a nuclear-norm regularized $l_1$-norm minimization problem for robust approximation.

We formulate the problem of nuclear-norm regularized $l_1$-norm minimization as shown below:

$$\min_{P,M} \|K - PMP^T\|_1 + \lambda \|PMP^T\|_*, \tag{10}$$

where $K \in \mathbb{R}^{n \times n}$ is a kernel or symmetric positive semi-definite matrix and $P \in \mathbb{R}^{n \times r}$ and $M \in \mathbb{R}^{r \times r}$ are optimization variables. $\|\cdot\|_*$ denotes the nuclear norm or trace norm, and $\lambda > 0$ is a regularization parameter. In the cost function, we use the nuclear-norm regularizer to reduce the rank of $PMP^T$, an approximation of $K$, to the desired one by adjusting the parameter $\lambda$, since we do not know the exact rank. The nuclear norm has been used as a convex surrogate for the rank in many rank minimization problems [14], [15]. This problem is non-convex and its solution can be obtained using the augmented Lagrangian framework [14].

To reduce the computational complexity and speed up convergence, it is reasonable to enforce an orthogonality constraint on the basis matrix $P$, which shrinks the solution space of $P$. Based on these observations, we reformulate the low-rank matrix approximation problem as follows:

$$\min_{P,M} \|K - PMP^T\|_1 + \lambda \|M\|_* \quad \text{s.t.} \quad P^T P = I_r,\; M \succeq 0, \tag{11}$$

where $I_r$ is an $r \times r$ identity matrix and $M$ is a positive semi-definite matrix. By enforcing the orthogonality constraint on $P$, we only need the small matrix $M$ instead of $PMP^T$ when computing the nuclear norm. Figure 1 shows an overview of the proposed structured low-rank matrix approximation method. Due to the difficulty of solving problem (11) directly, we introduce two auxiliary variables $D$ and $\hat{M}$ and solve the following problem:

$$\min_{P,M,D,\hat{M}} \|K - D\|_1 + \lambda \|M\|_* \quad \text{s.t.} \quad D = P\hat{M}P^T,\; \hat{M} = M,\; P^T P = I_r,\; M \succeq 0. \tag{12}$$
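The step from (10) to (11) replaces $\|PMP^T\|_*$ with $\|M\|_*$. A quick numerical check of why this is valid when $P$ has orthonormal columns and $M$ is positive semi-definite is sketched below; the sizes and random matrices are arbitrary choices made for this illustration.

```python
import numpy as np

def nuclear_norm(A):
    """Sum of the singular values of A."""
    return np.linalg.svd(A, compute_uv=False).sum()

rng = np.random.default_rng(1)
n, r = 6, 3
# P: n x r with orthonormal columns, obtained from a reduced QR factorization.
P, _ = np.linalg.qr(rng.standard_normal((n, r)))
# M: r x r symmetric positive semi-definite.
B = rng.standard_normal((r, r))
M = B @ B.T

lhs = nuclear_norm(P @ M @ P.T)  # nuclear norm of the n x n approximation
rhs = nuclear_norm(M)            # nuclear norm of the small r x r matrix
```

Both values agree, and for a positive semi-definite $M$ they also equal its trace, so the regularizer only requires work on an $r \times r$ matrix.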

The augmented Lagrangian framework [14] is used to solve (12) by converting the constrained optimization problem into the following unconstrained problem:

$$\mathcal{L}(K, P, M, D, \hat{M}) = \|K - D\|_1 + \lambda \|M\|_* + \mathrm{tr}\left(\Lambda_1^T (D - P\hat{M}P^T)\right) + \mathrm{tr}\left(\Lambda_2^T (\hat{M} - M)\right) + \frac{\beta}{2}\left(\|D - P\hat{M}P^T\|_F^2 + \|\hat{M} - M\|_F^2\right), \tag{13}$$

where $\Lambda_1 \in \mathbb{R}^{n \times n}$ and $\Lambda_2 \in \mathbb{R}^{r \times r}$ are Lagrange multipliers and $\beta > 0$ is a small penalty parameter. Here, we have not included the orthogonality constraint on $P$, but it is taken into account when we optimize (13) with respect to $P$. We apply the alternating minimization approach iteratively, which estimates one variable while the other variables are held fixed. Each step of the proposed algorithm is described in the following section.

Fig. 1. A graphical illustration of the proposed method. A kernel matrix $K$ can be approximated by the product of $P$, $M$, and $P^T$. We can predict future motions of moving objects using AR-GP based on the rank-reduced kernel matrix.

B. Algorithm

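To solve for $M$, we fix the other variables and solve the following optimization problem:

$$\begin{aligned} M_+ &= \arg\min_M \frac{\lambda}{\beta}\|M\|_* + \frac{1}{2}\left\|\hat{M} - M + \frac{\Lambda_2}{\beta}\right\|_F^2 \\ &= \arg\min_M \frac{\lambda}{\beta}\|M\|_* + \frac{1}{2}\|M - A\|_F^2, \quad \text{s.t. } M \succeq 0, \end{aligned} \tag{14}$$

where $A = \hat{M} + \frac{\Lambda_2}{\beta}$, which follows from matching the two lines of (14). If $A$ is not a symmetric matrix, we make it symmetric by $A \leftarrow \frac{A + A^T}{2}$ and find $M_+$. Then, this problem can be solved using eigenvalue thresholding (EVT) [17] and its solution is

$$M_+ = Q\,\mathrm{diag}\left[\max\left(\gamma - \frac{\lambda}{\beta}, 0\right)\right] Q^T, \tag{15}$$

where $Q$ and $\Gamma$ are matrices which contain the eigenvectors and eigenvalues, respectively, from the eigenvalue decomposition of $A$, i.e., $A = Q\Gamma Q^T$ and $\Gamma = \mathrm{diag}(\gamma)$.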
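The eigenvalue thresholding step (15) can be sketched as follows. The function name `evt` and the diagonal test matrix are illustrative; `tau` plays the role of $\lambda/\beta$.

```python
import numpy as np

def evt(A, tau):
    """Eigenvalue thresholding: Q diag(max(gamma - tau, 0)) Q^T."""
    A = 0.5 * (A + A.T)               # symmetrize, as described in the text
    gamma, Q = np.linalg.eigh(A)      # A = Q diag(gamma) Q^T
    # Scaling the columns of Q by the thresholded eigenvalues and multiplying
    # by Q^T rebuilds the matrix with the shrunken spectrum.
    return (Q * np.maximum(gamma - tau, 0.0)) @ Q.T

# Eigenvalues 3, 1, -2 with threshold 1.5: only the first survives (as 1.5).
A = np.diag([3.0, 1.0, -2.0])
M_plus = evt(A, tau=1.5)
```

Since every retained eigenvalue is non-negative, the result is symmetric positive semi-definite by construction, which is how the constraint $M \succeq 0$ in (14) is met.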

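For $D$, we solve the following problem:

$$\begin{aligned} D_+ &= \arg\min_D \|K - D\|_1 + \mathrm{tr}\left(\Lambda_1^T (D - P\hat{M}P^T)\right) + \frac{\beta}{2}\|D - P\hat{M}P^T\|_F^2 \\ &= \arg\min_D \|K - D\|_1 + \frac{\beta}{2}\left\|D - P\hat{M}P^T + \frac{\Lambda_1}{\beta}\right\|_F^2, \end{aligned} \tag{16}$$

and the solution can be computed using the shrinkage (soft-thresholding) operator [14]:

$$D_+ \leftarrow K - S\left(K - P\hat{M}P^T + \frac{\Lambda_1}{\beta}, \frac{1}{\beta}\right), \tag{17}$$

where $S(x, \tau) = \mathrm{sign}(x) \cdot \max(|x| - \tau, 0)$ for a variable $x$.

With the other variables fixed, we have the following optimization problem for finding $P$:

$$\begin{aligned} P_+ &= \arg\min_P \mathrm{tr}\left(\Lambda_1^T (D - P\hat{M}P^T)\right) + \frac{\beta}{2}\|D - P\hat{M}P^T\|_F^2 \\ &= \arg\min_P \frac{\beta}{2}\left\|D + \frac{\Lambda_1}{\beta} - P\hat{M}P^T\right\|_F^2, \quad \text{s.t. } P^T P = I_r. \end{aligned} \tag{18}$$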

The above problem is a least squares problem with an orthogonality constraint. Let $R = D + \frac{\Lambda_1}{\beta}$ and $L = P\hat{M}$; then $L$ can be represented by $L = R(P^T)^+ = R(P^T)^T = RP$, where $(P^T)^+$ is the pseudo-inverse of the matrix $P^T$. Therefore, from [18], we can obtain the orthogonal matrix $P = \mathrm{QR}(RP) = \mathrm{QR}(L)$, where $\mathrm{QR}(A)$ denotes the orthogonal factor obtained from the QR factorization of $A$.
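The $D$ update (17) and the $P$ update described above can be sketched together. The helper names, the random test data, and the value of `beta` are assumptions made for this illustration; the $P$ update uses NumPy's reduced QR factorization so that the returned factor is $n \times r$.

```python
import numpy as np

def shrink(X, tau):
    """Entrywise soft-thresholding S(x, tau) = sign(x) * max(|x| - tau, 0)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def update_D(K, P, M_hat, Lam1, beta):
    """D update from (17)."""
    return K - shrink(K - P @ M_hat @ P.T + Lam1 / beta, 1.0 / beta)

def update_P(D, P, Lam1, beta):
    """P update: orthogonal factor of the QR factorization of R P."""
    R = D + Lam1 / beta
    Q, _ = np.linalg.qr(R @ P)  # reduced mode: Q has orthonormal columns
    return Q

# Illustrative data (not from the paper).
rng = np.random.default_rng(0)
n, r, beta = 6, 2, 0.5
B = rng.standard_normal((n, n))
K = B @ B.T
P0, _ = np.linalg.qr(rng.standard_normal((n, r)))
M_hat = np.eye(r)
Lam1 = np.zeros((n, n))

D = update_D(K, P0, M_hat, Lam1, beta)
P1 = update_P(D, P0, Lam1, beta)
```

Entries of the residual smaller than $1/\beta$ in magnitude are set to zero by the shrinkage, so a small $\beta$ early in the iterations keeps $D$ close to $K$.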

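To update $\hat{M}$, we consider the following equation:

$$\hat{M}_+ = \arg\min_{\hat{M}} \mathrm{tr}\left(\Lambda_1^T (D - P\hat{M}P^T)\right) + \mathrm{tr}\left(\Lambda_2^T (\hat{M} - M)\right) + \frac{\beta}{2}\left(\|D - P\hat{M}P^T\|_F^2 + \|\hat{M} - M\|_F^2\right), \tag{19}$$

and its solution is computed by taking a derivative as

$$\hat{M}_+ = \frac{1}{2}\left(P^T D P + \frac{1}{\beta} P^T \Lambda_1 P + M - \frac{1}{\beta}\Lambda_2\right). \tag{20}$$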
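As a sanity check on (20), the sketch below evaluates the closed-form update and confirms numerically that the gradient of (19) with respect to $\hat{M}$ vanishes there when $P^T P = I_r$. The gradient expression is derived here from (19) for the purpose of the check, and the random test data are arbitrary.

```python
import numpy as np

def update_M_hat(D, P, M, Lam1, Lam2, beta):
    """Closed-form M_hat update from (20)."""
    return 0.5 * (P.T @ D @ P + P.T @ Lam1 @ P / beta + M - Lam2 / beta)

def grad_M_hat(M_hat, D, P, M, Lam1, Lam2, beta):
    """Gradient of the objective in (19) with respect to M_hat."""
    return (-P.T @ Lam1 @ P + Lam2
            - beta * P.T @ (D - P @ M_hat @ P.T) @ P
            + beta * (M_hat - M))

# Arbitrary test data with an orthonormal P.
rng = np.random.default_rng(2)
n, r, beta = 7, 3, 0.8
P, _ = np.linalg.qr(rng.standard_normal((n, r)))
D = rng.standard_normal((n, n))
M = rng.standard_normal((r, r))
Lam1 = rng.standard_normal((n, n))
Lam2 = rng.standard_normal((r, r))

M_hat = update_M_hat(D, P, M, Lam1, Lam2, beta)
grad = grad_M_hat(M_hat, D, P, M, Lam1, Lam2, beta)
```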

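Finally, we update the Lagrange multipliers $\Lambda_1$ and $\Lambda_2$ as follows:

$$\begin{aligned} \Lambda_1 &\leftarrow \Lambda_1 + \beta(D - P\hat{M}P^T), \\ \Lambda_2 &\leftarrow \Lambda_2 + \beta(\hat{M} - M). \end{aligned} \tag{21}$$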

The proposed structured low-rank matrix approximation algorithm is summarized in Algorithm 1. Since it is a symmetric positive semi-definite matrix factorization algorithm, we name it factSPSD. Similar to [11], we apply KPCA to the approximated low-rank kernel matrix after running the algorithm for faster computation when GPR is applied, reducing the computational complexity from $O(n^3)$ to $O(rn^2)$. In the algorithm, we have assumed a normalized observation matrix. Hence, the output matrices are obtained by rescaling them using the scaling factor. The alternating minimization order of the optimization variables can be different, but we found empirically that the order in Algorithm 1 performed better than other orders. We set the initial values to all-zero matrices since the algorithm is not sensitive to the choice of initial values. We set the parameters of the algorithm as $\lambda = 10^{-3}$, $\beta = 10^{-5}$, and $\rho = 2$. The number of inner iterations of the algorithm (from line 5 to line 10) was set to 10 since this is enough to converge to a local solution. Although it is difficult to guarantee the convergence of the proposed method to a local minimum, the alternating optimization will converge to a finite limit since the cost function consists of non-negative functions. The stopping criterion (line 13 of Algorithm 1) is chosen as

$$\|D - P\hat{M}P^T\|_1 < \epsilon \quad \text{or} \quad \|\hat{M} - M\|_1 < \epsilon, \tag{22}$$

and $\epsilon = 10^{-5}$, which shows good results in our experiments.
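To illustrate why a factorized kernel makes the GPR inverse cheaper, the sketch below solves $(PMP^T + \sigma_w^2 I)\mathbf{z} = \mathbf{y}$ using only an $r \times r$ eigendecomposition. This is one possible route to such a saving, written under the assumptions $P^T P = I_r$ and $M \succeq 0$; it is not claimed to be the KPCA procedure of [11], and `lowrank_solve` is a hypothetical helper.

```python
import numpy as np

def lowrank_solve(P, M, noise_var, y):
    """Solve (P M P^T + noise_var * I) z = y without forming an n x n inverse.

    With M = Q diag(gamma) Q^T and U = P Q (orthonormal columns):
    (U diag(gamma) U^T + s I)^{-1} = (I - U diag(gamma / (gamma + s)) U^T) / s.
    """
    gamma, Q = np.linalg.eigh(0.5 * (M + M.T))
    U = P @ Q
    w = gamma / (gamma + noise_var)
    return (y - U @ (w * (U.T @ y))) / noise_var

# Arbitrary test data.
rng = np.random.default_rng(3)
n, r, noise_var = 30, 4, 0.1
P, _ = np.linalg.qr(rng.standard_normal((n, r)))
B = rng.standard_normal((r, r))
M = B @ B.T
y = rng.standard_normal(n)

z_fast = lowrank_solve(P, M, noise_var, y)
z_ref = np.linalg.solve(P @ M @ P.T + noise_var * np.eye(n), y)
```

The dense solve in the last line is included only as a reference for comparison.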

IV. EXPERIMENTAL RESULTS

In this section, we evaluate the performance of the proposed method (factSPSD) by experimenting with various data sets and comparing with other sparse Gaussian process regression methods (SPGP¹ [19], PITC [20], GPLasso² [21], and PCGP-$l_1$ [11]) along with the standard full GP. In our experiments, we used the RBF Gaussian kernel for all GP methods, and hyperparameters are learned using a conjugate gradient method [16]. The prediction or regression accuracy is measured by the root mean squared error (RMSE).

¹ Available at http://www.gatsby.ucl.ac.uk/~snelson/.

² Available at https://www.cs.purdue.edu/homes/alanqi/softwares/softwares.htm.

Fig. 2. Simulation results on a synthetic example (axes: x, y; curves: reference field, GP train data, Full-GP, factSPSD, PITC). (a) No outliers. (b) 30% outliers.

Algorithm 1 factSPSD for optimizing (12)
 1: Input: $K \in \mathbb{R}^{n \times n}$, rank $r$, $\lambda$, $\beta$, and $\rho$
 2: Output: $P \in \mathbb{R}^{n \times r}$, $D \in \mathbb{R}^{n \times n}$, and $M \in \mathbb{R}^{r \times r}$
 3: Initialization: $M = P = D = \hat{M} = 0$ and $\beta_{max} = 10^{10}$
 4: while not converged do
 5:     while not converged do
 6:         Update $M$ by (15)
 7:         Update $P \leftarrow \mathrm{QR}(RP)$ where $R = D + \frac{\Lambda_1}{\beta}$
 8:         Update $\hat{M}$ by (20)
 9:         Update $D$ by (17)
10:     end while
11:     Update the Lagrange multipliers $\Lambda_1$ and $\Lambda_2$ by (21)
12:     Update $\beta = \min(\rho\beta, \beta_{max})$
13:     Check the convergence condition
14: end while
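The steps of Algorithm 1 can be put together in a compact NumPy sketch. It is a reading of the update rules (15), (17), (20), (21), and (22), not the authors' implementation. Several details are assumptions made here: the kernel is normalized by its largest absolute entry (the text only says a normalized matrix is assumed), $\|\cdot\|_1$ in (22) is taken entrywise, the thresholded matrix in the $M$ update is $\hat{M} + \Lambda_2/\beta$ as implied by (13), and `max_outer` is a hypothetical safety cap on the outer loop.

```python
import numpy as np

def shrink(X, tau):
    """Entrywise soft-thresholding."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def evt(A, tau):
    """Eigenvalue thresholding of the symmetrized matrix."""
    gamma, Q = np.linalg.eigh(0.5 * (A + A.T))
    return (Q * np.maximum(gamma - tau, 0.0)) @ Q.T

def fact_spsd(K, r, lam=1e-3, beta=1e-5, rho=2.0, beta_max=1e10,
              eps=1e-5, inner_iters=10, max_outer=100):
    n = K.shape[0]
    scale = np.abs(K).max()  # assumed normalization factor
    Kn = K / scale
    P = np.zeros((n, r))
    D = np.zeros((n, n))
    M = np.zeros((r, r))
    M_hat = np.zeros((r, r))
    Lam1 = np.zeros((n, n))
    Lam2 = np.zeros((r, r))
    for _ in range(max_outer):
        for _ in range(inner_iters):
            M = evt(M_hat + Lam2 / beta, lam / beta)                    # (15)
            P, _ = np.linalg.qr((D + Lam1 / beta) @ P)                  # QR(RP)
            M_hat = 0.5 * (P.T @ D @ P + P.T @ Lam1 @ P / beta
                           + M - Lam2 / beta)                           # (20)
            D = Kn - shrink(Kn - P @ M_hat @ P.T + Lam1 / beta,
                            1.0 / beta)                                 # (17)
        Lam1 = Lam1 + beta * (D - P @ M_hat @ P.T)                      # (21)
        Lam2 = Lam2 + beta * (M_hat - M)
        beta = min(rho * beta, beta_max)
        if (np.abs(D - P @ M_hat @ P.T).sum() < eps
                or np.abs(M_hat - M).sum() < eps):                      # (22)
            break
    return P, scale * M, scale * D  # rescale outputs back to the input scale

# Illustrative use on a small RBF kernel matrix built from random 1-D points.
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 5.0, size=(20, 1))
K = np.exp(-0.5 * (pts - pts.T) ** 2)
P, M, D = fact_spsd(K, r=5)
K_approx = P @ M @ P.T
```

Whatever the quality of the fit, `P` always has orthonormal columns (it comes from a QR factorization) and `M` is positive semi-definite (it comes from eigenvalue thresholding), so `K_approx` is a symmetric positive semi-definite matrix of rank at most `r`.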

A. Regression problems

First, we tested on a synthetic example to compare the proposed structured low-rank matrix approximation with other Gaussian process methods [16], [20] in a regression problem with and without outliers. Since we are interested in how the proposed method performs in the presence of outliers, we have compared factSPSD to a sparse GP (PITC [20]) and the full GP [16].

Figure 2 shows the simulation results of the regression problem for two outlier levels (no outliers and 30% outliers). As shown in Figure 2(a), the full GP and the proposed method give nearly exact results when there are no outliers, whereas PITC does not fit the reference field exactly. Thus the proposed method is competitive with the other sparse GP methods. If we add some outliers as shown in Figure 2(b), the full GP and PITC try to fit the outliers and therefore fluctuate, but factSPSD is less affected by outliers than the full GP and PITC, showing its robustness against outliers. Although a kernel function can have a smoothing effect, the effect of outliers still remains in the kernel matrix. From this experiment, we see the clear benefit of the proposed low-dimensional learning method for a regression problem when the training set contains outliers.

Fig. 3. Regression results of the proposed method compared with other GP methods according to basis conditions for two benchmark data sets (x-axis: basis ratio (x100) (%); y-axis: RMSE; curves: FullGP, SPGP, PITC, GPLasso, factSPSD): (a) Pumadyn-8nm, (b) Kin-8nm.

To verify the proposed algorithm on real-world data sets, we have tested the algorithms using two well-known data sets, Pumadyn-8nm and Kin-8nm³, which are benchmark data sets in the Gaussian process regression literature [21]. Pumadyn-8nm is a data set which consists of puma forward dynamics of 8 inputs, and Kin-8nm is a data set which consists of the forward kinematics of an 8-link robot arm. For each data set, we randomly collected 1,000 training and 800 test samples. To verify the robustness of the proposed method in the presence of various outliers, we added 30 percent outliers which are randomly selected from [-25, 25] to the data sets, whereas the data sets are in the range of [-2, 2]. The simulation results of the proposed method with other sparse GP methods (SPGP [19], PITC [20], and GPLasso [21]) for various basis ratios (10% ∼ 50%) are shown in Figure 3. As shown in Figure 3(a), the proposed method gives the lowest error among the methods regardless of the basis conditions. In particular, it performs better than the full GP, whereas the sparsity-based methods show lower error than the full GP when the basis ratio is large. In Figure 3(b), the proposed method also gives lower errors than the other sparsity-based methods. Although its performance is worse than the full GP when the basis ratio is small, the difference is the smallest.
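The corruption and evaluation protocol described above can be sketched as follows. The text does not say whether outliers replace or perturb values, nor which variables are corrupted; this sketch assumes that a random 30 percent of the training targets are replaced by draws from a uniform distribution on [-25, 25]. The helper names and the synthetic targets are illustrative only.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def add_outliers(y, ratio, low, high, rng):
    """Replace a random fraction of a 1-D target vector with uniform draws."""
    y = np.array(y, dtype=float, copy=True)
    k = int(round(ratio * y.size))
    idx = rng.choice(y.size, size=k, replace=False)
    y[idx] = rng.uniform(low, high, size=k)
    return y, idx

rng = np.random.default_rng(0)
y_clean = rng.uniform(-2.0, 2.0, size=1000)  # stand-in for targets in [-2, 2]
y_noisy, idx = add_outliers(y_clean, ratio=0.3, low=-25.0, high=25.0, rng=rng)
error_check = rmse([0.0, 0.0], [3.0, 4.0])   # sqrt((9 + 16) / 2)
```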

B. Motion prediction of human trajectories

For an actual experiment, we collected trajectories of moving pedestrians using a Pioneer 3DX differential drive mobile robot and a Microsoft Kinect camera, which is mounted on top of the robot as shown in Figure 6. All algorithms are written in MATLAB using the mex-compiled ARIA package⁴ on a notebook with a 2.1 GHz quad-core CPU and 8 GB RAM. The position of a pedestrian is detected using the skeleton grab API for Kinect.

We performed experiments in our laboratory to predict the future position of a person. To model the future positions of a pedestrian, our algorithm is applied to an inversion of a kernel matrix in the autoregressive Gaussian process (AR-GP) motion model [9], which estimates the current position of a pedestrian based on $p$ recent positions of the human by a nonlinear model of an autoregressive process under the Gaussian process framework. To make a training set from the collected trajectories, we uniformly sampled positions to have ten samples in a trajectory when a trajectory has many detected positions.

³ Available at http://www.cs.toronto.edu/~delve/methods/mars3.6-bag-1/mars3.6-bag-1.html.

⁴ Available at http://robots.mobilerobots.com/wiki/ARIA.

Fig. 4. Motion prediction results using Kinect-based human trajectories (y-axis: RMSE (m); curves: factSPSD, PCGP-L1, PITC, GPLasso): (a) Various basis ratio with 30 percent outliers (x-axis: basis ratio (%); panel title: "Trajectory prediction (Outlier 20%)"). (b) Various outlier ratio with 30 percent basis vectors (x-axis: outlier ratio (%); panel title: "Trajectory prediction (Basis 30%)").

We compared the proposed method with state-of-the-art approaches (PCGP-$l_1$ [11], GPLasso [21], and PITC [20]) on the collected human trajectories. We divided the collected trajectories into training and test sets with autoregressive order $p = 3$. Using the data set, we have experimented with two cases: various rank (basis) conditions with a fixed outlier level, and various outlier conditions with a fixed rank. We added outliers drawn from [-10, 10] to randomly selected positions of the collected trajectories, whereas the data sets are in the range of [-5, 5]. Figure 4 shows the results for the two cases. As shown in Figure 4(a), the proposed factSPSD shows the best results compared to the other sparse GP methods in all cases. PCGP-$l_1$ gives the second best results regardless of the basis ratio. We interpret this as the proposed algorithm approximating the kernel matrix used in AR-GP better than PCGP-$l_1$, since the proposed algorithm guarantees positive semi-definiteness, whereas PCGP-$l_1$ does not assume positive semi-definiteness. The RMSE results for a fixed rank ($r/n \times 100 = 30\%$) under various outlier conditions are shown in Figure 4(b). As shown in the figure, the proposed method gives the best results regardless of outlier conditions. From the two figures, we can see that the proposed method is robust against outliers, recovering from measurement noise and erroneous trajectories. Figure 5 shows some snapshots of a motion prediction experiment using two Microsoft Kinect cameras (about 110° field of view) in our laboratory.

C. Motion control

We have applied the proposed method to a motion controlproblem. A non-parametric Bayesian motion controller cannavigate through crowded dynamic environments [9] and theproposed method based AR-GP motion model is applied tothe Gaussian process motion controller [9] for autonomousrobot navigation. We performed the motion controller ex-periments using the Pioneer 3DX mobile robot and twoMicrosoft Kinect cameras, to enlarge the field of view ofa robot, in various dynamic and crowded environments. Thenumber of pedestrians varies from one to seven to verify

Page 6: Structured Low-Rank Matrix Approximation in Gaussian ...cpslab.snu.ac.kr/publications/papers/2015_icra_factspsd.pdf · Structured Low-Rank Matrix Approximation in Gaussian Process

Fig. 5. A motion prediction experiment using the proposed algorithm. A pink circle represents a prediction of a pedestrian given the observed positions. Each experiment consists of a photo taken by a camera and the robot’s internal state. Best viewed in color.

Fig. 6. Snapshots from a real experiment in a crowded school cafeteria. A pink circle represents a prediction of a pedestrian given the observed positions. Each experiment consists of photos taken by a camera and the robot’s internal state. Best viewed in color.

the performance of the proposed method in crowded environments. Figure 6 shows some snapshots from the experiments in a crowded school cafeteria. In the experiments, we verified that the robot successfully avoided moving pedestrians and obstacles without any collisions and arrived at the goal position.

V. CONCLUSION

In this paper, we have proposed a structured low-rank matrix approximation method using nuclear-norm regularized l1-norm minimization and applied it to robust autoregressive Gaussian process regression for autonomous robot navigation. The motivation is that modeling a complex pedestrian motion pattern is difficult in the presence of measurement noises or outliers. To overcome the limitation of the state-of-the-art low-rank approximation method, we have presented a novel optimization formulation and an efficient algorithm for it that obtains a symmetric positive semi-definite matrix. The proposed method is applied to various well-known regression data sets and to experiments using a Pioneer 3DX mobile robot and two Microsoft Kinect cameras. The experimental results show the robustness of the proposed method against outliers and sensor errors compared to existing methods.

REFERENCES

[1] C. H. Chen, Y. Weng, and S.-T. Sun, “Toward the human-robot coexistence society: on safety intelligence for next generation robots,” International Journal of Social Robotics, vol. 1, no. 4, pp. 267–282, 2009.

[2] A. F. Foka and P. E. Trahanias, “Predictive autonomous robot navigation,” in Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2002.

[3] C. Fulgenzi, C. Tay, A. Spalanzani, and C. Laugier, “Probabilistic navigation in dynamic environment using rapidly-exploring random trees and Gaussian processes,” in Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2008.

[4] P. Henry, C. Vollmer, B. Ferris, and D. Fox, “Learning to navigate through crowded environments,” in Proc. of IEEE International Conference on Robotics and Automation (ICRA), 2010.

[5] R. Asaula, D. Fontanelli, and L. Palopoli, “Safety provisions for human/robot interactions using stochastic discrete abstractions,” in Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2010.

[6] J. J. Park, C. Johnson, and B. Kuipers, “Robot navigation with model predictive equilibrium point control,” in Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012.

[7] P. Trautman, J. Ma, R. M. Murray, and A. Krause, “Robot navigation in dense human crowds: the case for cooperation,” in Proc. of IEEE International Conference on Robotics and Automation (ICRA), 2013.

[8] G. S. Aoude, B. D. Luders, J. M. Joseph, N. Roy, and J. P. How, “Probabilistically safe motion planning to avoid dynamic obstacles with uncertain motion patterns,” Autonomous Robots, no. 1, pp. 51–76, 2013.

[9] S. Choi, E. Kim, and S. Oh, “Real-time navigation in crowded dynamic environments using Gaussian process motion control,” in Proc. of IEEE International Conference on Robotics and Automation (ICRA), 2014.

[10] J. M. Wang and D. J. Fleet, “Gaussian process dynamical models for human motion,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 283–298, 2008.

[11] E. Kim, S. Choi, and S. Oh, “A robust autoregressive Gaussian process motion model using l1-norm based low-rank kernel matrix approximation,” in Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2014.

[12] Q. Ke and T. Kanade, “Robust l1 norm factorization in the presence of outliers and missing data by alternative convex programming,” in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2005.

[13] E. Kim, M. Lee, C.-H. Choi, N. Kwak, and S. Oh, “Efficient l1-norm-based low-rank matrix approximations for large-scale problems using alternating rectified gradient method,” IEEE Trans. on Neural Networks and Learning Systems (TNNLS), vol. 26, no. 2, pp. 237–251, 2015.

[14] E. J. Candes, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?” Journal of the ACM, vol. 58, pp. 11:1–11:37, 2011.

[15] B. Recht, M. Fazel, and P. A. Parrilo, “Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization,” SIAM Review, vol. 52, pp. 471–501, 2010.

[16] C. Rasmussen and C. Williams, Gaussian processes for machine learning. MIT Press, 2006.

[17] Y. Ni, J. Sun, X. Yuan, S. Yan, and L.-F. Cheong, “Robust low-rank subspace segmentation with semidefinite guarantees,” in Proc. of the IEEE International Conf. on Data Mining Workshops, 2010, pp. 1179–1188.

[18] Z. Wen, W. Yin, and Y. Zhang, “Solving a low-rank factorization model for matrix completion by a non-linear successive over-relaxation algorithm,” Rice University CAAM Technical Report TR10-07, 2010.

[19] E. Snelson and Z. Ghahramani, “Sparse Gaussian processes using pseudo-inputs,” in Advances in Neural Information Processing Systems (NIPS), 2005.

[20] J. Quiñonero-Candela, C. E. Rasmussen, and R. Herbrich, “A unifying view of sparse approximate Gaussian process regression,” Journal of Machine Learning Research (JMLR), vol. 6, pp. 1939–1959, 2005.

[21] F. Yan and Y. Qi, “Sparse Gaussian process regression via l1 penalization,” in Proc. of the International Conference on Machine Learning (ICML), 2010.