Bayesian Robust Principal Component Analysis
Presenter: Raghu Ranganathan
ECE / CMR
Tennessee Technological University
January 21, 2011
Reading Group
(Xinghao Ding, Lihan He, and Lawrence Carin)
Paper contribution
■ The problem of matrix decomposition into low-rank and sparse components is considered employing a hierarchical approach
■ The matrix is assumed noisy, with unknown and possibly non-stationary noise statistics
■ The Bayesian framework approximately infers the noise statistics in addition to the low-rank and sparse outlier contributions
■ The model proposed is robust to a broad range of noise levels without having to change the hyper-parameter settings
■ In addition, a Markov dependency between successive rows of the matrix is inferred by the Bayesian model to exploit additional structure in the observed matrix, particularly, in video applications
Introduction
■ Most high-dimensional data such as images, biological data, and social network data (Netflix data) reside in a low-dimensional subspace or low-dimensional manifold
Noise models
■ In low-rank matrix representations, two types of noise models are usually considered
■ One causes small scale perturbation to all the matrix elements, e.g. i.i.d. Gaussian noise added to each element.
■ In this case, if the noise energy is small compared to the dominant singular values of the SVD, it does not significantly affect the principal vectors
■ The second is sparse noise of arbitrary magnitude that impacts only a small subset of matrix elements; for example, a moving object against a static background in video manifests as such sparse noise
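The contrast between the two noise types can be illustrated with a small NumPy sketch (dimensions, noise levels, and seed are invented for illustration): small dense Gaussian noise barely rotates the leading principal direction, while a few large-magnitude outliers can move it substantially.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rank-2 matrix built from an outer product (sizes are arbitrary).
m, n, r = 60, 40, 2
L = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

def top_left_vector(M):
    # Leading left singular vector of M.
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, 0]

u_clean = top_left_vector(L)

# Type 1: small i.i.d. Gaussian noise on every entry.
dense = L + 0.01 * rng.standard_normal((m, n))
# Type 2: sparse noise of large magnitude on a few entries.
sparse = L.copy()
sparse.flat[rng.choice(m * n, size=10, replace=False)] += 100.0

# |cosine| between clean and perturbed principal vectors (1 = unchanged).
align_dense = abs(u_clean @ top_left_vector(dense))
align_sparse = abs(u_clean @ top_left_vector(sparse))
print(align_dense, align_sparse)
```

The absolute value is taken because singular vectors are defined only up to sign.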
Convex optimization approach
Bayesian approach
■ The observation matrix is considered to be of the form Y = L (low-rank) + S (sparse) + E (noise), with both a sparse noise term S and a dense noise term E present
■ In the proposed Bayesian model, the noise statistics of E are approximately learned, along with S and L
■ The proposed model is robust to a broad range of noise variances
■ The Bayesian model infers approximation to the posterior distributions on the model parameters, and obtains approximate probability distributions for L, S, and E
■ The advantage of the Bayesian model is that prior knowledge is employed in the inference
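A minimal synthetic instance of the assumed observation model Y = L + S + E can make the three components concrete (sizes, rank, outlier fraction, and magnitudes below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic Y = L + S + E: low-rank L, sparse S, dense Gaussian E.
m, n, r = 100, 50, 3
L = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))    # low-rank

S = np.zeros((m, n))
support = rng.choice(m * n, size=int(0.05 * m * n), replace=False)
S.flat[support] = 10.0 * rng.standard_normal(support.size)       # 5% outliers

E = 0.1 * rng.standard_normal((m, n))                            # dense noise
Y = L + S + E

rank_L = np.linalg.matrix_rank(L)
nnz_S = int(np.count_nonzero(S))
print(rank_L, nnz_S)
```

Recovering L and S from Y alone is exactly the decomposition problem the paper addresses.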
Bayesian approach
■ The Bayesian framework exploits the anticipated structure in the sparse component.
■ In video analysis, it is desired to separate the spatially localized moving objects (sparse component) from the static or quasi-static background (low-rank component) in the presence of frame-dependent additive noise E
■ The correlation between the sparse components of the video from frame to frame (column to column in the matrix) has to be considered
■ In this paper, a Markov dependency in time and space is assumed between the sparse components of consecutive matrix columns
■ This structure is incorporated into the Bayesian framework, with the Markov parameters inferred through the observed matrix
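The kind of temporal persistence this Markov structure exploits can be sketched with a two-state chain on a single pixel's foreground indicator (the transition probabilities here are invented, not values inferred by the paper's model):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-state Markov chain for one pixel's indicator z_t
# (1 = pixel belongs to the sparse/moving component in frame t).
p01, p11 = 0.02, 0.95   # P(on | previously off), P(on | previously on)
T = 500

z = np.zeros(T, dtype=int)
for t in range(1, T):
    on_prob = p11 if z[t - 1] == 1 else p01
    z[t] = int(rng.random() < on_prob)

# Fraction of steps where the state persists: high, reflecting that a
# foreground pixel tends to stay foreground across consecutive frames.
persistence = float(np.mean(z[1:] == z[:-1]))
print(persistence)
```

An i.i.d. sparsity prior with the same marginal activity would scatter the active entries independently across frames, losing exactly this persistence.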
Bayesian Robust PCA
■ The work in this paper is closely related to the low-rank matrix completion problem where we try to approximate a matrix (with noisy entries) by a low-rank matrix and to predict the missing entries
■ The matrix Y = L + S + E is missing random entries; the proposed model can make estimates for the missing entries (in terms of the low-rank term L)
■ The S term is defined as a sparse set of matrix entries; the location of S must be inferred while estimating the values of L, S, and E
■ Typically, in Bayesian inference, a sparseness promoting prior is imposed on the desired signal, and the posterior distribution of the sparse signal is inferred.
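One such sparseness-promoting construction, used in this model family, is the beta-Bernoulli prior. The toy draw below (hyperparameters a, b chosen arbitrarily so the prior expects roughly 1% activity) shows why it yields exactly zero entries, unlike a Laplacian prior, which only concentrates mass near zero:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy beta-Bernoulli draw: s_i = b_i * w_i with b_i ~ Bernoulli(pi),
# pi ~ Beta(a, b), w_i ~ N(0, 1).  Entries with b_i = 0 are exactly zero.
n = 10_000
a, b = 1.0, 99.0                     # prior mean activity a/(a+b) = 1%
pi = rng.beta(a, b)
mask = rng.random(n) < pi            # Bernoulli indicators b_i
s = mask * rng.standard_normal(n)    # Gaussian amplitudes on the active set

frac_exact_zero = float(np.mean(s == 0.0))
print(pi, frac_exact_zero)
```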
Bayesian Low-rank and Sparse Model
C. Noise component
■ The measurement noise is drawn i.i.d. from a Gaussian distribution, and the noise affects all measurements
■ The noise variance is assumed unknown and is learned within the model inference; on the slide, the noise is modeled as zero-mean Gaussian with a gamma prior placed on its precision
■ The model can learn different noise variances for different parts of E, i.e., each column of Y (each frame) in general has its own noise level; the noise structure is modified accordingly, with a separate precision per column
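A quick moment-based illustration (not the paper's inference procedure, which places priors on the precisions) of why per-column noise levels are identifiable when each frame has its own variance:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy non-stationary noise: each column (frame) of E has its own standard
# deviation, E[:, j] ~ N(0, sigma_j^2 I).  Sizes and levels are made up.
n_rows, n_cols = 2000, 5
true_sigma = np.array([0.1, 0.5, 1.0, 2.0, 4.0])
E = rng.standard_normal((n_rows, n_cols)) * true_sigma  # per-column scaling

# A simple per-column moment estimate recovers each frame's noise level.
est_sigma = E.std(axis=0)
print(est_sigma)
```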
Relation to the optimization based approach
■ In the Bayesian model, the noise variance need not be known a priori; the model learns the noise statistics during inference
■ For the low-rank component, instead of the nuclear-norm constraint (an ℓ1 penalty imposing sparseness on the singular values), the Gaussian prior together with the beta-Bernoulli distribution is used (a Gaussian prior corresponds to an ℓ2-type penalty)
■ For the sparse component, instead of the ℓ1 constraint, the Gaussian prior together with the beta-Bernoulli distribution is employed to enforce sparsity
■ Compared to the Laplacian prior (which gives many small entries close to 0), the beta-Bernoulli prior yields exactly zero values
■ In Bayesian learning, numerical methods are used to estimate the distributions of the unknown parameters, whereas the optimization-based approach seeks the minimum of a function similar to the negative log-posterior −log p(Y | ·, H)
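For reference, the convex counterpart (Principal Component Pursuit) scores a split Y = L + S by the objective ‖L‖* + λ‖S‖1, the nuclear norm acting as an ℓ1 penalty on singular values. The sketch below (sizes and spike pattern invented; λ = 1/√max(m,n) as in the PCP literature) just evaluates this objective and checks that the true split beats the naive one that lumps everything into L:

```python
import numpy as np

rng = np.random.default_rng(4)

def pcp_objective(L, S, lam):
    # ||L||_* (sum of singular values) + lam * ||S||_1
    return np.linalg.svd(L, compute_uv=False).sum() + lam * np.abs(S).sum()

# Synthetic instance: rank-2 L0 plus 5 large spikes in distinct rows/cols.
m, n, r = 30, 20, 2
L0 = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
S0 = np.zeros((m, n))
for k in range(5):
    S0[k, k] = 100.0
Y = L0 + S0

lam = 1.0 / np.sqrt(max(m, n))       # weight commonly used for PCP
true_split = pcp_objective(L0, S0, lam)
naive_split = pcp_objective(Y, np.zeros_like(Y), lam)  # L = Y, S = 0
print(true_split, naive_split)
```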
Markov dependency of Sparse Term in Time and Space
Posterior inference
Experimental results
B. Video example
■ The application of video surveillance with a fixed camera is considered
■ The objective is to reconstruct a near static background and moving foreground from a video sequence
■ The data are organized such that column m of Y is constructed by concatenating all pixels of frame m from a grayscale video sequence
■ The background is modeled as the low-rank component, and the moving foreground as the sparse component.
■ The rank r is usually small for a static background, and the sparse components across frames (columns of Y) are strongly correlated, modeled by a Markov dependency
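The described data layout can be sketched directly (frame size and count below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical frame stack: T grayscale frames of size h x w.
h, w, T = 48, 64, 10
video = rng.random((T, h, w))

# Column m of Y concatenates all pixels of frame m.
Y = video.reshape(T, h * w).T
assert Y.shape == (h * w, T)

# Round trip: column 3 reshapes back exactly to frame 3.
frame3 = Y[:, 3].reshape(h, w)
roundtrip_ok = bool(np.array_equal(frame3, video[3]))
print(roundtrip_ok)
```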
Conclusions
■ The authors have developed a new robust Bayesian PCA framework for analysis of matrices with sparsely distributed noise of arbitrary magnitude
■ The Bayesian approach is found to be robust to densely distributed noise, and the noise statistics may be inferred based on the data, with no tuning of hyperparameters
■ In addition, using the Markov property, the model allows the noise statistics to vary from frame to frame
■ Future research directions involve a moving camera, for which the background would reside in a low-dimensional manifold rather than a low-dimensional linear subspace
■ The Bayesian framework may be extended to infer the properties of the low-dimensional manifold