scalable low-rank nonlinear subspace tracking€¦ · acknowledgement: nsf grants 1500713, 1514056,...

15
Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F. Sheikholeslami, D. Berberidis, and G. B. Giannakis Dept. of ECE and DTC, University of Minnesota

Upload: others

Post on 07-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01

Scalable Low-Rank Nonlinear Subspace Tracking

F. Sheikholeslami, D. Berberidis, and G. B. Giannakis

Dept. of ECE and DTC, University of Minnesota

Page 2: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

Learning from “Big Data”

Big size ( and/or )

Challenges

Incomplete

Noise and outliers

Fast streaming

2

Opportunities in key tasks

Dimensionality reduction

Online and robust

regression, classification

and clustering

Denoising and imputation

Page 3: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

3/36

Outline

Scalable kernel-based learning

Sparsity-aware low-rank approximation of lifted data

Theoretical guarantees and test results

Online subspace tracking

Affordable memory and computation

Future directions

Page 4: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

4/36

Linear or nonlinear functions for learning?

e.g.,

Regression or classification: Given , find

Memory requirement , and complexity

Pre-select kernel (inner product) function

RKHS basis expansion

Lift via nonlinear map to linear

Kernel-based nonparametric ridge regression

4

Page 5: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

5/36

Prior art

Kernel matrix approximation

Tractable by leveraging low-rank attribute (in dual domain)

o Nystrom approximation [Williamson-Seeger’00, Si et al. ‘14, Wang et al. ‘14]

o Incomplete Cholesky decomposition [Fine-Scheinberg’01]

Random feature approximation

Kernel approximation via feature approximation

o Random sampling and kernel matrix factorization [Rahimi-Recht’08, Zhang et al.’11]

Functional gradient descent

Tractable gradient descent (batch form in primal domain)

o Pegasus [Shaleve-Schwart et al.’11]; Norma [Kivinen et al.’04]; [Wang et al. ’12, Lu et al. ‘16]

Reduced-dimensionality features

Tractability through online extraction of low-dimensional features

o Online (kernel) PCA [Honnein’012]; linear subspace tracking [Mardani et al.’15]

Page 6: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

6

Low-rank approximation of mapped features

Low-rank approximation

S2. Update features while fixing subspace

S1. Update virtual subspace while fixing feature vectors

Alternating minimization

Nonlinear mapping

Page 7: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

7

Sparsity-aware low-rank approximation

Row-sparsity of fewer “basis vectors” for

BCD

S1. Update with fixed

Group shrinkage

S2. Update with fixed

Solvers

i. BCD-Exact: Exact minimization- more expensive, larger descent

ii. BCD-PG: Proximal gradient-based update, inexpensive, smaller descent

F. Sheikholeslami and G. B. Giannakis, "Scalable Kernel-based Learning via Low-rank Approximation of

Lifted Data," Proc. of 55th Allerton Conf. on Comm., Control, and Computing, Oct. 4-6, 2017.

Page 8: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

8

Convergence and generalization bounds

Proposition 1 Sequences generated by BCD-Ex and BCD-PG

iterations converge to a stationary point.

Proposition 2. If , and , then for it holds wp >

Generalization: performance guarantees on unseen data

Equivalent optimization

Define

F. Sheikholeslami and G. B. Giannakis, "Scalable Kernel-based Learning via Low-rank Approximation of

Lifted Data," Proc. Allerton, 2017.

Page 9: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

9

Linear regression!

Linearized kernel regression and classification

Proposition 3. If iid with kernel matrix

can be approximated as , and wp >

Kernel matrix approximation

Bounds also on support vector machines for regression and classification

F. Sheikholeslami, D. K. Berberidis, and G. B. Giannakis, "Kernel-based Low-rank

Feature Extraction on a Budget for Big Data Streams,“ arxiv:1601.07947.

Page 10: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

10

Simulation tests

F. Sheikholeslami and G. B. Giannakis, "Scalable Kernel-based Learning via Low-rank Approximation of Lifted Data," Proc. Allerton Conf. 2017.

IJCNN DatasetAdult Dataset Slice Dataset

N=53,500 , d= 384 N=32,650, d= 123 N=140,230 , d= 22

Datasets available at UCI repository: http://archive.ics.uci.edu/ml/datasets.html

Page 11: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

11

Online function approximation on a budget

S1. Find projection coefficients

S2. Update subspace factor (via stochastic gradient descent)

Iteration : and available

If budget exceeded, remove the row of A with minimum -norm

Low-rank (r) subspace tracking [Mardani-Mateos-GG’14] here on lifted data

Censoring can mitigate curse of dimensionality as n

Rank surrogate

-0.2 -0.1 0 0.1 0.20

0.1

0.2

Budget

maintenance

F. Sheikholeslami, D. K. Berberidis, and G. B. Giannakis, "Kernel-based Low-rank Feature Extraction on

a Budget for Big Data Streams," arxiv:1601.07947.

Page 12: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

12

OK-FEB with linear classification and regression

Slice dataset (regression) Year dataset (regression)

N=53,500 , d= 384, r=10, B=15 N=463,700 , d= 90, r=10, B=15

OK-FEB LSVM outperforms budgeted K-SVM/SVR variants in classification/regression

Page 13: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

13

Tracking dynamic subspaces

Synthetic dataset

Recency-aware removal rule

Parameter trades off tracking for precision

Recency factor

Removal rule

• Associate factor to the i-th SV in the budget

• Decay with inclusion of a new SV,

and drawn from two subspaces

Smaller values of provide faster tracking, while larger values increase fitting precision

Successful use of affordable budget for nonlinear subspace tracking

Page 14: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

14

Online physical activity tracking

Subjects asked to do various physical activities

d =13 quantities measured from chest,

ankle and wrist via wireless MUs

F. Sheikholeslami, D. K. Berberidis, and G. B. Giannakis, "Kernel-based Low-rank Feature Extraction

on a Budget for Big Data Streams,“ arxiv:1601.07947.

Gaussian kernel; and

Average LS-fit

Page 15: Scalable Low-Rank Nonlinear Subspace Tracking€¦ · Acknowledgement: NSF grants 1500713, 1514056, and NIH grant 1R01GM104975-01 Scalable Low-Rank Nonlinear Subspace Tracking F

15

Summary

Sparsity-aware low-rank approximation of lifted data

Theoretical guarantees and test results

Kernel-based learning

Nuclear norm regularization for online approximation

Budget enforcement for affordable memory and computation

Future directions

Nonlinear feature extraction for canonical correlation analysis

Kernel-based feature extraction over 2-D signals on networks

Thank you!