SpAM: Sparse Additive Models (ranger.uta.edu/~heng/CSE6389_15_slides/SpAM_ppt.pdf)
TRANSCRIPT
![Page 1](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/1.jpg)
SpAM: Sparse Additive Models
Authors: Pradeep Ravikumar, Han Liu, John Lafferty, and Larry Wasserman
![Page 2](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/2.jpg)
Outline
● 1) The SpAM Optimization Problem
● 2) A backfitting algorithm
● 3) Properties of SpAM
● 4) Simulations showing the estimator's behavior
![Page 3](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/3.jpg)
Outline
● 1) The SpAM Optimization Problem
● 2) A backfitting algorithm
● 3) Properties of SpAM
● 4) Simulations showing the estimator's behavior
![Page 4](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/4.jpg)
● How sparsity is achieved
Recall the standard additive model optimization problem:
The SpAM Optimization Problem
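One standard way to write this problem, with each component f_j lying in a Hilbert space H_j of mean-zero functions, is:

```latex
\min_{f_j \in \mathcal{H}_j}\ \mathbb{E}\Big(Y - \sum_{j=1}^{p} f_j(X_j)\Big)^{2}
```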
![Page 5](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/5.jpg)
● How sparsity is achieved
Some modification:
The SpAM Optimization Problem
![Page 6](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/6.jpg)
● Consider the following modification, which imposes additional constraints:
The SpAM Optimization Problem
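Decomposing each component as f_j = β_j g_j and constraining the scales gives the sparse formulation; in the paper this problem reads:

```latex
\min_{\beta,\,g}\ \mathbb{E}\Big(Y-\sum_{j=1}^{p}\beta_j\,g_j(X_j)\Big)^{2}
\quad\text{s.t.}\quad \sum_{j=1}^{p}|\beta_j|\le L,\qquad
\mathbb{E}\big[g_j^{2}(X_j)\big]=1,\quad \mathbb{E}\big[g_j(X_j)\big]=0 .
```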
![Page 7](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/7.jpg)
● The L1 penalty encourages sparsity of the estimated β
The SpAM Optimization Problem
![Page 8](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/8.jpg)
● Drawback: the optimization problem is convex in β and { g } separately
● But not convex in β and { g } jointly, so consider a related optimization problem
● Call the original optimization problem P
The SpAM Optimization Problem
![Page 9](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/9.jpg)
● A related optimization problem, called Q:
This problem is convex in { f }, and problems P and Q are equivalent
The SpAM Optimization Problem
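Written out, problem Q penalizes each component's L2 norm directly:

```latex
\min_{f_j\in\mathcal{H}_j}\ \mathbb{E}\Big(Y-\sum_{j=1}^{p}f_j(X_j)\Big)^{2}
\quad\text{s.t.}\quad \sum_{j=1}^{p}\sqrt{\mathbb{E}\big[f_j^{2}(X_j)\big]}\le L,
\qquad \mathbb{E}\big[f_j(X_j)\big]=0 .
```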
![Page 10](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/10.jpg)
● Problems P and Q are equivalent:
● The optimization problem Q is convex
● Why it encourages sparsity is not intuitive
The SpAM Optimization Problem
![Page 11](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/11.jpg)
● Why it encourages sparsity is not intuitive
● Consider an example to provide some insight
The projection π12C onto the first two components looks like an L2 ball; the projection π13C onto the first and third components looks like an L1 ball
The SpAM Optimization Problem
![Page 12](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/12.jpg)
The SpAM Optimization Problem
● Acts as an L1 constraint across components (sparsity)
● Acts as an L2 constraint within components (smoothness)
When the { f } are linear, the optimization problem reduces to the lasso
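The reduction is immediate: for linear components f_j(X_j) = β_j X_j with standardized covariates (E[X_j²] = 1), each component norm collapses to |β_j|:

```latex
\sqrt{\mathbb{E}\big[f_j^{2}(X_j)\big]}
=\sqrt{\beta_j^{2}\,\mathbb{E}\big[X_j^{2}\big]}=|\beta_j|
\quad\Longrightarrow\quad
\sum_{j=1}^{p}\sqrt{\mathbb{E}\big[f_j^{2}\big]}\le L
\ \equiv\ \sum_{j=1}^{p}|\beta_j|\le L ,
```

which is exactly the lasso constraint.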
![Page 13](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/13.jpg)
Outline
● 1) The SpAM Optimization Problem
● 2) A backfitting algorithm
● 3) Properties of SpAM
● 4) Simulations showing the estimator's behavior
![Page 14](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/14.jpg)
● A coordinate descent algorithm
● Write the Lagrangian for problem Q:
A backfitting algorithm
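The Lagrangian for problem Q, with multiplier λ on the norm constraint, is:

```latex
\mathcal{L}(f;\lambda)=\tfrac{1}{2}\,\mathbb{E}\Big(Y-\sum_{j=1}^{p}f_j(X_j)\Big)^{2}
+\lambda\sum_{j=1}^{p}\sqrt{\mathbb{E}\big[f_j^{2}(X_j)\big]} .
```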
![Page 15](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/15.jpg)
● Define the j-th residual:
● The condition for minimizing the Lagrangian as a function of fj, expressed via the Fréchet derivative, is:
where:
A backfitting algorithm
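In symbols, the j-th residual and the resulting stationary condition (with v_j a subgradient of the norm term) are:

```latex
R_j = Y-\sum_{k\neq j} f_k(X_k),\qquad
\mathbb{E}\big[(f_j - R_j)\,\eta_j(X_j)\big]+\lambda\,\mathbb{E}\big[v_j\,\eta_j(X_j)\big]=0
\quad\text{for all }\eta_j\in\mathcal{H}_j ,
```

where $v_j = f_j/\sqrt{\mathbb{E}[f_j^2]}$ whenever $f_j \neq 0$.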
![Page 16](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/16.jpg)
● Conditioning on Xj, the Fréchet derivative condition becomes:
● Letting Pj = E [ Rj | Xj ] denote the projection of the residual onto Hj, the solution satisfies
A backfitting algorithm
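With P_j = E[ R_j | X_j ], the condition reads:

```latex
f_j+\lambda\,v_j = P_j ,
```

with $v_j=f_j/\sqrt{\mathbb{E}[f_j^2]}$ if $f_j\neq 0$, and $\sqrt{\mathbb{E}[v_j^2]}\le 1$ otherwise.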
![Page 17](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/17.jpg)
● This form implies that fj = 0 when the projection is small (below λ), or otherwise that fj satisfies a rescaled fixed-point equation
A backfitting algorithm
![Page 18](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/18.jpg)
● From this form, we arrive at the following soft-thresholding update:
A backfitting algorithm
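The update can be written as:

```latex
f_j \;=\; \Big[\,1-\frac{\lambda}{\sqrt{\mathbb{E}\big[P_j^{2}\big]}}\,\Big]_{+}\, P_j ,
```

where [·]₊ denotes the positive part: the whole component is set to zero whenever sqrt(E[P_j²]) ≤ λ, which is where the sparsity comes from.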
![Page 19](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/19.jpg)
● Two terms need to be estimated
● 1) As in standard backfitting, the projection Pj = E [ Rj | Xj ] is estimated by a smooth of the residuals
Sj is a linear smoother, such as a local linear or kernel smoother
A backfitting algorithm
![Page 20](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/20.jpg)
● Two terms need to be estimated
● 2) A simple but biased estimate of the denominator:
A backfitting algorithm
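In sample terms, a plug-in estimate of the denominator sqrt(E[P_j²]) is the root mean square of the smoothed residuals:

```latex
\widehat{s}_j=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\big(\mathcal{S}_j R_j\big)_i^{2}} .
```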
![Page 21](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/21.jpg)
● Putting these pieces together, we have derived the SpAM backfitting algorithm
A backfitting algorithm
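As a minimal sketch of the backfitting loop in Python: the Nadaraya-Watson smoother, the bandwidth, and all names below are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def nw_smoother(x, r, bandwidth=0.3):
    """Nadaraya-Watson kernel smooth of residuals r against covariate x (plays the role of S_j)."""
    d = (x[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)                 # Gaussian kernel weights
    return (w @ r) / w.sum(axis=1)

def spam_backfit(X, y, lam, n_iter=20, bandwidth=0.3):
    """Soft-thresholded backfitting: smooth each partial residual, then shrink by [1 - lam/s_j]_+."""
    n, p = X.shape
    y_c = y - y.mean()                        # center the response
    F = np.zeros((n, p))                      # F[:, j] holds the fitted values f_j(x_ij)
    for _ in range(n_iter):
        for j in range(p):
            R_j = y_c - F.sum(axis=1) + F[:, j]          # j-th partial residual
            P_j = nw_smoother(X[:, j], R_j, bandwidth)   # estimate of E[R_j | X_j]
            s_j = np.sqrt(np.mean(P_j ** 2))             # plug-in estimate of sqrt(E[P_j^2])
            shrink = max(0.0, 1.0 - lam / s_j) if s_j > 0 else 0.0
            F[:, j] = shrink * P_j                        # soft-thresholding step
            F[:, j] -= F[:, j].mean()                     # keep each component centered
    return F

# Toy run: only the first of five covariates carries signal, so a moderate
# lam should zero out the remaining four components.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(150, 5))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(150)
F = spam_backfit(X, y, lam=0.15)
active = [j for j in range(5) if np.mean(F[:, j] ** 2) > 1e-12]
print(active)
```

The inner loop mirrors the two estimated terms from the previous slides: a linear smooth of the partial residuals, then shrinkage by the plug-in norm estimate.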
![Page 22](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/22.jpg)
Outline
● 1) The SpAM Optimization Problem
● 2) A backfitting algorithm
● 3) Properties of SpAM
● 4) Simulations showing the estimator's behavior
![Page 23](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/23.jpg)
● 1) SpAM is persistent
● 2) SpAM is sparsistent
Properties of SpAM
![Page 24](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/24.jpg)
● 1) SpAM is persistent
"Persistence" is short for "predictive consistency"
Def: Let (X, Y) be a new data pair, and consider the predictive risk when predicting Y with f(X)
Properties of SpAM
![Page 25](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/25.jpg)
● 1) SpAM is persistent
Def: Let (X, Y) be a new data pair, and consider the predictive risk when predicting Y with f(X)
We say an estimator is persistent relative to a class of functions Mn if
where:
Properties of SpAM
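Following Greenshtein and Ritov's notion of persistence (the one used in the paper), with predictive risk R(f) = E(Y - f(X))², persistence relative to the class M_n means:

```latex
R(f)=\mathbb{E}\big(Y-f(X)\big)^{2},\qquad
R\big(\widehat{f}_n\big)-\inf_{f\in\mathcal{M}_n}R(f)\ \xrightarrow{\;P\;}\ 0 .
```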
![Page 26](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/26.jpg)
● 2) SpAM is sparsistent
Def: the support of β is the set of locations of its nonzero elements:
supp(β) = { j : βj ≠ 0 }
Then the estimate of β is sparsistent if
Properties of SpAM
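In symbols, sparsistency (sparsity-pattern consistency) requires the estimated support to recover the true support with probability tending to one:

```latex
\mathrm{supp}(\beta)=\{\,j:\beta_j\neq 0\,\},\qquad
\mathbb{P}\big(\mathrm{supp}(\widehat{\beta}_n)=\mathrm{supp}(\beta)\big)\ \longrightarrow\ 1 .
```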
![Page 27](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/27.jpg)
Outline
● 1) The SpAM Optimization Problem
● 2) A backfitting algorithm
● 3) Properties of SpAM
● 4) Simulations showing the estimator's behavior
![Page 28](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/28.jpg)
● A simulated dataset
Sample size n = 150, generated from a 200-dimensional additive model (196 irrelevant dimensions)
Experiments
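As a sketch, a dataset with these dimensions can be generated as follows; the four relevant component functions below are hypothetical stand-ins, not the paper's exact choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 150, 200                                  # 150 samples, 200 covariates
X = rng.uniform(-2.5, 2.5, size=(n, p))

# Four relevant components (hypothetical stand-ins); the other 196 columns are irrelevant.
components = [np.sin, lambda x: x ** 2 - 1.0, np.tanh, lambda x: np.exp(-x)]
signal = sum(f(X[:, j]) for j, f in enumerate(components))
y = signal + rng.standard_normal(n)              # additive Gaussian noise
print(X.shape, y.shape)
```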
![Page 29](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/29.jpg)
● Boston Housing (506 observations with 10 covariates)
Then 20 irrelevant variables were added:
1) 10 drawn randomly from Uniform(0, 1)
2) 10 random permutations of the original ten covariates
● The result shows that SpAM correctly zeros out both types of irrelevant variables and identifies 6 nonzero components
Experiments
![Page 30](https://reader033.vdocuments.net/reader033/viewer/2022052614/60612d4dbe445a639269427d/html5/thumbnails/30.jpg)
● Thank you!