
A Generalized Iterated Shrinkage Algorithm for Non-convex Sparse Coding

Wangmeng Zuo, Deyu Meng, Lei Zhang, Xiangchu Feng, David Zhang

ICCV 2013

[email protected] Institute of Technology


Overview

• From L1-norm sparse coding to Lp-norm sparse coding
  – Existing solvers for Lp-minimization

• Generalized shrinkage / thresholding function
  – Algorithm and analysis
  – Connections with soft/hard-thresholding functions

• Generalized Iterated Shrinkage Algorithms

• Experimental results

Overcomplete Representation

• Compressed sensing, image restoration, image classification, machine learning, …

• Overcomplete representation: Ax = y with more unknowns than equations
  – Infinitely many solutions x
  – Which one is optimal?

L0-Sparse Coding

• Impose a sparsity prior (constraint) on x:
  – Sparser is better

• Problems
  – Is the sparsest solution unique?
  – How can we obtain the optimal solution?

\min_x \|x\|_0 \quad \text{s.t.} \quad Ax = y

\min_x \|x\|_0 \quad \text{s.t.} \quad \|Ax - y\|_2 \le \epsilon

\min_x \|Ax - y\|_2^2 + \lambda \|x\|_0

Theory: Uniqueness of Sparse Solution (L0)

• Nonconvex optimization, intractable in general
• Greedy algorithms: matching pursuit (MP), orthogonal matching pursuit (OMP)

Convex Relaxation: L1-Sparse Coding

• L1-Sparse Coding

• Problems
  – When do L1- and L0-sparse coding have the same solution?
  – Algorithms for L1-sparse coding


\min_x \|x\|_1 \quad \text{s.t.} \quad Ax = y

\min_x \|x\|_1 \quad \text{s.t.} \quad \|Ax - y\|_2 \le \epsilon

\min_x \|Ax - y\|_2^2 + \lambda \|x\|_1

Theory: Uniqueness of Sparse Solution (L1)

• Restricted Isometry Property

• Convex; many algorithms have been proposed.

Algorithms for L1-Sparse Coding

• Iterative shrinkage/thresholding algorithm

• Augmented Lagrangian method

• Accelerated Proximal Gradient

• Homotopy

• Primal-Dual Interior-Point Method

• …


Allen Y. Yang, Zihan Zhou, Arvind Ganesh, Shankar Sastry, and Yi Ma. Fast l1-minimization algorithms for robust face recognition. IEEE Transactions on Image Processing, 2013.

Lp-norm Approximation

• L0-norm: the number of non-zero values

• Lp-norm
  – L1-norm: the convex envelope of L0
  – L0-norm: the limit of \|x\|_p^p as p → 0
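Written out explicitly (standard definitions, not spelled out on the slide):

```latex
\|x\|_p = \Big( \sum_i |x_i|^p \Big)^{1/p}, \qquad
\|x\|_p^p = \sum_i |x_i|^p, \qquad
\lim_{p \to 0^+} \|x\|_p^p = \|x\|_0 .
```

For 0 < p < 1 this is only a quasi-norm (the triangle inequality fails) and the penalty is non-convex, which is why Lp-sparse coding is harder than the L1 case.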

Theory: Uniqueness of Sparse Solution (Lp)

• Weaker restricted isometry property is sufficient to guarantee perfect recovery in the Lp case.

\min_x \|x\|_p^p \quad \text{s.t.} \quad Ax = y

R. Chartrand and V. Staneva. Restricted isometry properties and nonconvex compressive sensing. Inverse Problems, 24(035020):1–14, 2008.

Existing Lp-sparse coding algorithms

• Analytic solutions: only suitable for some special cases, e.g., p = 1/2 or p = 1/3.

• IRLS, IRL1, ITM_Lp: may fail to converge to the globally optimal solution, even for the simplest problem below.

• Lookup table: efficient, but requires pre-computation.

\min_x \|x - y\|_2^2 + \lambda \|x\|_p^p

\min_x \|Ax - y\|_2^2 + \lambda \|x\|_p^p

IRLS for Lp-sparse Coding


\min_x \|Ax - y\|_2^2 + \lambda \|x\|_p^p

• IRLS
  – (1) Update the weights: w_i^{(k)} = \big( (x_i^{(k)})^2 + \epsilon \big)^{p/2 - 1}
  – (2) Solve the reweighted least-squares problem:

\min_x \|Ax - y\|_2^2 + \lambda \sum_i w_i^{(k)} x_i^2

M. Lai, J. Wang. An unconstrained lq minimization with 0 < q < 1 for sparse solution of under-determined linear systems. SIAM Journal on Optimization, 21(1):82–101, 2011.

IRL1 for Lp-Sparse Coding

\min_x \|Ax - y\|_2^2 + \lambda \|x\|_p^p

• IRL1
  – (1) Solve the reweighted L1 problem:

x^{(k+1)} = \arg\min_x \frac{1}{2} \|y - Ax\|_2^2 + \sum_i w_i^{(k)} |x_i|

  – (2) Update the weights:

w_i^{(k)} = \lambda p \big( |x_i^{(k)}| + \epsilon \big)^{p-1}

E. J. Candes, M. Wakin, S. Boyd. Enhancing sparsity by reweighted l1 minimization. Journal of Fourier Analysis and Applications, 14(5):877–905, 2008.

ITM_Lp for Lp-Sparse Coding

• ITM_Lp solves \min_x \|x - y\|_2^2 + \lambda \|x\|_p^p by iterative thresholding with the operator

T_p^{ITM}(y; \lambda) = \begin{cases} 0, & \text{if } |y| < \tau_p^{ITM}(\lambda) \\ \operatorname{sgn}(y)\, S_p(|y|; \lambda), & \text{if } |y| \ge \tau_p^{ITM}(\lambda) \end{cases}

where the shrinkage value S_p(|y|; \lambda) is the root of the equation

g_p(x) = x + \lambda p\, x^{p-1} - |y| = 0

and \tau_p^{ITM}(\lambda) is the corresponding threshold determined by p and \lambda.

Y. She. Thresholding-based iterative selection procedures for model selection and shrinkage. Electronic Journal of Statistics, 3:384–415, 2009.


[Figure: the objective \min_x \|x - y\|_2^2 + \lambda \|x\|_p^p plotted for p = 0.5, λ = 1, and y = 1.3]

Generalized Shrinkage / Thresholding

• Keys of soft-thresholding, the closed-form solution of \min_x \frac{1}{2}(x - y)^2 + \lambda |x|
  – Thresholding rule: the output is 0 when |y| \le \lambda
  – Shrinkage rule: \operatorname{sgn}(y)(|y| - \lambda) when |y| > \lambda

• Generalization of soft-thresholding to \min_x \frac{1}{2}(x - y)^2 + \lambda |x|^p
  – What is the thresholding value for Lp?
  – How should the shrinkage rule be modified?
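As a concrete reference, the soft-thresholding operator that solves the L1 scalar problem in closed form can be sketched in NumPy (the function name is our own):

```python
import numpy as np

def soft_threshold(y, lam):
    """Closed-form minimizer of 0.5*(x - y)**2 + lam*|x|.

    Zero when |y| <= lam (thresholding rule); otherwise the
    magnitude shrinks by lam (shrinkage rule)."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

print(soft_threshold(np.array([1.3, -0.4, 2.0]), 0.5))
```

The question on this slide is how to generalize exactly these two rules when the penalty is λ|x|^p with 0 < p < 1.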

Motivation

[Figure: the objective \min_x \frac{1}{2}(x - y)^2 + \lambda |x|^{0.5} plotted for (a) y = 1, (b) y = 1.19, (c) y = 1.3, (d) y = 1.5, and (e) y = 1.6]

Determining the threshold

• The first derivative at the nonzero extreme point is zero:

x - y + \lambda p\, x^{p-1} = 0

• The second derivative at the nonzero extreme point is positive

• The objective value at the nonzero extreme point equals the value at zero:

\frac{1}{2} \big( x_p^{GST,*} - \tau_p^{GST}(\lambda) \big)^2 + \lambda \big( x_p^{GST,*} \big)^p = \frac{1}{2} \big( \tau_p^{GST}(\lambda) \big)^2

Solving these conditions gives

x_p^{GST,*} = \big( 2\lambda(1 - p) \big)^{\frac{1}{2-p}}

\tau_p^{GST}(\lambda) = \big( 2\lambda(1 - p) \big)^{\frac{1}{2-p}} + \lambda p \big( 2\lambda(1 - p) \big)^{\frac{p-1}{2-p}}
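The threshold derivation can be sanity-checked numerically. The sketch below (our own verification code, not from the slides) confirms that at y = τ_p^GST(λ) the nonzero stationary point ties with zero:

```python
def gst_threshold(lam, p):
    """tau_p^GST(lam): the threshold derived above."""
    x_star = (2.0 * lam * (1.0 - p)) ** (1.0 / (2.0 - p))
    return x_star + lam * p * x_star ** (p - 1.0)

p, lam = 0.5, 1.0
x_star = (2.0 * lam * (1.0 - p)) ** (1.0 / (2.0 - p))  # nonzero extreme point
tau = gst_threshold(lam, p)

# Scalar objective at the critical value y = tau.
f = lambda x: 0.5 * (x - tau) ** 2 + lam * abs(x) ** p
print(tau)                 # prints 1.5
print(f(x_star) - f(0.0))  # prints 0.0 (ties with zero at the threshold)
```

For p = 0.5 and λ = 1 this gives x* = 1 and τ = 1.5, consistent with the y = 1.3 example above falling below the threshold.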

Determining the shrinkage operator

• For |y| > \tau_p^{GST}(\lambda), compute the shrunken magnitude by fixed-point iteration:
  – k = 0, x^{(0)} = |y|
  – Iterate for k = 0, 1, ..., J:

x^{(k+1)} = |y| - \lambda p \big( x^{(k)} \big)^{p-1}

    k \leftarrow k + 1
  – Output T_p^{GST}(y; \lambda) = \operatorname{sgn}(y)\, x^{(J)}
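The whole operator (threshold plus fixed-point shrinkage) fits in a few lines of NumPy; this is a sketch built from the formulas above, with the function name and vectorization our own:

```python
import numpy as np

def gst(y, lam, p, J=10):
    """Generalized soft-thresholding T_p^GST(y; lam), elementwise on arrays."""
    # Threshold tau_p^GST(lam) from the derivation above.
    tau = (2.0 * lam * (1.0 - p)) ** (1.0 / (2.0 - p)) \
        + lam * p * (2.0 * lam * (1.0 - p)) ** ((p - 1.0) / (2.0 - p))
    y = np.asarray(y, dtype=float)
    x = np.zeros_like(y)
    nz = np.abs(y) > tau               # thresholding rule: below tau -> 0
    a = np.abs(y[nz])
    t = a.copy()                       # x^(0) = |y|
    for _ in range(J):                 # fixed-point shrinkage iteration
        t = a - lam * p * t ** (p - 1.0)
    x[nz] = np.sign(y[nz]) * t         # restore the sign
    return x
```

For p = 1 this reduces to soft-thresholding with threshold λ, and for p = 0 to hard-thresholding with threshold √(2λ), matching the connections discussed on the following slide.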

Generalized Shrinkage / Thresholding Function


GST: Theoretical Analysis

Connections with soft / hard-thresholding functions

• p = 1: GST is equivalent to soft-thresholding

T_1^{GST}(y; \lambda) = \begin{cases} 0, & \text{if } |y| \le \lambda \\ \operatorname{sgn}(y)(|y| - \lambda), & \text{if } |y| > \lambda \end{cases}

• p = 0: GST is equivalent to hard-thresholding

T_0^{GST}(y; \lambda) = \begin{cases} 0, & \text{if } |y| \le \sqrt{2\lambda} \\ y, & \text{if } |y| > \sqrt{2\lambda} \end{cases}

Generalized Iterated Shrinkage Algorithms

• Lp-sparse coding: \min_x \|Ax - y\|_2^2 + \lambda \|x\|_p^p

  – Gradient descent on the data term:

x^{(k+0.5)} = x^{(k)} - t A^T \big( A x^{(k)} - y \big)

  – Generalized shrinkage / thresholding:

x^{(k+1)} = \operatorname{GST}\big( x^{(k+0.5)}, \lambda t, p, J \big)
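Putting the two steps together gives a compact loop. The sketch below is our own minimal NumPy version; the step size is an assumption (set to 1/‖A‖₂², which keeps the gradient step stable) since the slides leave it unspecified:

```python
import numpy as np

def gst(y, lam, p, J=10):
    """Generalized soft-thresholding T_p^GST(y; lam), elementwise."""
    tau = (2.0 * lam * (1.0 - p)) ** (1.0 / (2.0 - p)) \
        + lam * p * (2.0 * lam * (1.0 - p)) ** ((p - 1.0) / (2.0 - p))
    y = np.asarray(y, dtype=float)
    x = np.zeros_like(y)
    nz = np.abs(y) > tau
    a = np.abs(y[nz])
    t = a.copy()
    for _ in range(J):
        t = a - lam * p * t ** (p - 1.0)
    x[nz] = np.sign(y[nz]) * t
    return x

def gisa(A, y, lam, p, step=None, J=10, iters=200):
    """GISA sketch: alternate a gradient step on the data term with GST."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2       # assumed step size t
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x_half = x - step * A.T @ (A @ x - y)        # x^(k+0.5): gradient step
        x = gst(x_half, lam * step, p, J)            # x^(k+1): thresholding
    return x
```

On a small synthetic problem with an exactly sparse ground truth, this loop recovers the sparse coefficients to high accuracy.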

Comparison with Iterated Shrinkage Algorithms

• Iterative shrinkage / thresholding for \min_x \|Ax - y\|_2^2 + \lambda \|x\|_1

  – Gradient descent:

x^{(k+0.5)} = x^{(k)} - t A^T \big( A x^{(k)} - y \big)

  – Soft thresholding:

T\big( x_i^{(k+0.5)}, \lambda t \big) = \begin{cases} 0, & \text{if } |x_i^{(k+0.5)}| \le \lambda t \\ \operatorname{sgn}\big( x_i^{(k+0.5)} \big) \big( |x_i^{(k+0.5)}| - \lambda t \big), & \text{else} \end{cases}

• GISA: \min_x \|Ax - y\|_2^2 + \lambda \|x\|_p^p — the same scheme, with soft thresholding replaced by GST

Sparse gradient based image deconvolution

\min_x \frac{1}{2} \| k \otimes x - y \|_2^2 + \lambda \|Dx\|_p^p

• Variable splitting with an auxiliary variable d:

\min_{x, d} \frac{1}{2} \| k \otimes x - y \|_2^2 + \frac{\beta}{2} \|Dx - d\|_2^2 + \lambda \|d\|_p^p
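One possible realization of this splitting is to alternate between the two blocks: the d-subproblem decouples elementwise and is solved exactly by GST, while the x-subproblem is quadratic. Below is a small 1-D sketch; the kernel, λ, β, and the dense solves are illustrative choices of ours, not the paper's implementation:

```python
import numpy as np

def gst(y, lam, p, J=10):
    """Generalized soft-thresholding T_p^GST(y; lam), elementwise."""
    tau = (2.0 * lam * (1.0 - p)) ** (1.0 / (2.0 - p)) \
        + lam * p * (2.0 * lam * (1.0 - p)) ** ((p - 1.0) / (2.0 - p))
    y = np.asarray(y, dtype=float)
    x = np.zeros_like(y)
    nz = np.abs(y) > tau
    a = np.abs(y[nz])
    t = a.copy()
    for _ in range(J):
        t = a - lam * p * t ** (p - 1.0)
    x[nz] = np.sign(y[nz]) * t
    return x

# 1-D "deconvolution": K is a small blur matrix, D takes first differences.
n = 32
x_true = np.zeros(n)
x_true[8:20] = 1.0                                 # piecewise-constant signal
K = np.zeros((n, n))
for i in range(n):                                 # 3-tap blur [0.2, 0.6, 0.2]
    for j, w in ((i - 1, 0.2), (i, 0.6), (i + 1, 0.2)):
        K[i, min(max(j, 0), n - 1)] += w
D = np.diff(np.eye(n), axis=0)                     # finite-difference operator
rng = np.random.default_rng(1)
y = K @ x_true + 0.01 * rng.standard_normal(n)

lam, p, beta = 0.01, 0.5, 10.0
x = y.copy()
for _ in range(50):
    d = gst(D @ x, lam / beta, p)                  # d-subproblem: GST elementwise
    # x-subproblem: quadratic; solve its normal equations directly
    x = np.linalg.solve(K.T @ K + beta * D.T @ D, K.T @ y + beta * D.T @ d)
```

For images the quadratic x-subproblem would typically be solved in the Fourier domain rather than by a dense solve; the alternation structure is the same.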

Application I: Deconvolution

Application II: Face Recognition

• Extended YaleB

Conclusion

• Compared with state-of-the-art methods, GISA is theoretically solid, easy to understand, efficient to implement, and converges to a more accurate solution.

• Compared with LUT, GISA is more general and does not need to compute and store the look-up tables.

• GISA can be readily used to solve many Lp-norm minimization problems in various vision and learning applications.


Looking forward

• Applications to other vision problems.

• Incorporation of the primal-dual algorithm for better solutions

• Extension of GISA for constrained Lp-minimization, e.g.,


\min_x \|Ax - y\|_2^2 \quad \text{s.t.} \quad \|x\|_1 \le \epsilon