TRANSCRIPT
A Generalized Iterated Shrinkage Algorithm for Non-convex Sparse Coding
Wangmeng Zuo, Deyu Meng, Lei Zhang, Xiangchu Feng, David Zhang
ICCV 2013
Harbin Institute of Technology
Overview
• From L1-norm sparse coding to Lp-norm sparse coding
  – Existing solvers for Lp-minimization
• Generalized shrinkage / thresholding function
  – Algorithm and analysis
  – Connections with soft/hard-thresholding functions
• Generalized Iterated Shrinkage Algorithms
• Experimental results
Overcomplete Representation
• Compressed sensing, image restoration, image classification, machine learning, …
• Overcomplete representation: the system $Ax = y$ is underdetermined
  – Infinitely many solutions $x$; which one is optimal?
L0-Sparse Coding
• Impose a sparsity prior (constraint) on $x$: the sparser, the better
$$\min_x \|x\|_0 \quad \text{s.t.} \quad Ax = y$$
$$\min_x \|x\|_0 \quad \text{s.t.} \quad \|Ax - y\|_2 \le \varepsilon$$
$$\min_x \|Ax - y\|_2^2 + \lambda\|x\|_0$$
• Problems
  – Is the sparsest solution unique?
  – How can we obtain the optimal solution?
Theory: Uniqueness of Sparse Solution (L0)
• Nonconvex optimization, intractable
• Greedy algorithms: matching pursuit (MP), orthogonal matching pursuit (OMP)
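To make the greedy approach concrete, here is a minimal OMP sketch in Python/NumPy. The function name, interface, and stopping rule (a fixed number of atoms `k`) are illustrative choices, not the paper's code.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily select k columns (atoms) of A,
    re-fitting y on the selected atoms by least squares at each step."""
    support, residual = [], y.astype(float)
    for _ in range(k):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares fit on the selected atoms, then update the residual
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x
```

For instance, with $A = I$ and a 2-sparse $y$, two OMP steps recover $y$ exactly.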
Convex Relaxation: L1-Sparse Coding
• L1-sparse coding
$$\min_x \|x\|_1 \quad \text{s.t.} \quad Ax = y$$
$$\min_x \|x\|_1 \quad \text{s.t.} \quad \|Ax - y\|_2 \le \varepsilon$$
$$\min_x \|Ax - y\|_2^2 + \lambda\|x\|_1$$
• Problems
  – When do L1- and L0-sparse coding have the same solution?
  – Algorithms for L1-sparse coding
Theory: Uniqueness of Sparse Solution (L1)
• Restricted Isometry Property
• Convex; various efficient algorithms have been proposed.
Algorithms for L1-Sparse Coding
• Iterative shrinkage/thresholding algorithm
• Augmented Lagrangian method
• Accelerated Proximal Gradient
• Homotopy
• Primal-Dual Interior-Point Method
• …
Allen Y. Yang, Zihan Zhou, Arvind Ganesh, Shankar Sastry, and Yi Ma. Fast l1-minimization algorithms for robust face recognition. IEEE Transactions on Image Processing, 2013.
Lp-norm Approximation
• L0-norm: the number of non-zero entries
• Lp-norm: $\|x\|_p^p = \sum_i |x_i|^p$
  – $p = 1$: convex envelope of the L0-norm
  – $p \to 0$: $\|x\|_p^p \to \|x\|_0$
Theory: Uniqueness of Sparse Solution (Lp)
• Weaker restricted isometry property is sufficient to guarantee perfect recovery in the Lp case.
$$\min_x \|x\|_p^p \quad \text{s.t.} \quad Ax = y$$
R. Chartrand, V. Staneva. Restricted isometry properties and nonconvex compressive sensing. Inverse Problems, 24(3):035020, 2008.
Existing Lp-sparse coding algorithms
• Analytic solutions: only available for special cases, e.g., $p = 1/2$ or $p = 2/3$.
• IRLS, IRL1, ITM_Lp: may fail to converge to the global optimum even for the simplest problem
$$\min_x \|x - y\|_2^2 + \lambda\|x\|_p^p$$
• Lookup table: efficient, but requires pre-computation
$$\min_x \|Ax - y\|_2^2 + \lambda\|x\|_p^p$$
IRLS for Lp-sparse Coding
$$\min_x \|Ax - y\|_2^2 + \lambda\|x\|_p^p$$
• IRLS
  – (1) Reweight: $w_i^{(k)} = \big((x_i^{(k)})^2 + \varepsilon\big)^{p/2 - 1}$
  – (2) Solve the weighted least-squares problem: $x^{(k+1)} = \arg\min_x \|Ax - y\|_2^2 + \lambda\sum_i w_i^{(k)} x_i^2$
M. Lai, J. Wang. An unconstrained lq minimization with 0 < q < 1 for sparse solution of under-determined linear systems. SIAM Journal on Optimization, 21(1):82–101, 2011.
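The two IRLS steps can be sketched compactly in Python/NumPy. The smoothing constant `eps`, the pseudo-inverse initialization, and the fixed iteration count are illustrative choices, not from the paper.

```python
import numpy as np

def irls_lp(A, y, lam, p, eps=1e-8, iters=100):
    """IRLS sketch for min_x ||Ax - y||_2^2 + lam * sum_i |x_i|^p.
    Step (1) reweights with w_i = (x_i^2 + eps)^(p/2 - 1); step (2)
    solves the resulting weighted ridge problem in closed form."""
    x = np.linalg.pinv(A) @ y                     # least-squares initialization
    for _ in range(iters):
        w = (x ** 2 + eps) ** (p / 2 - 1)         # (1) reweight
        # (2) solve (A^T A + lam * diag(w)) x = A^T y
        x = np.linalg.solve(A.T @ A + lam * np.diag(w), A.T @ y)
    return x
```

Note the quadratic majorization: large entries get small weights (little shrinkage), while entries near zero get large weights and are driven toward zero.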
IRL1 for Lp-Sparse Coding
• IRL1: for $\min_x \tfrac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_p^p$
  – (1) Reweight: $w_i^{(k)} = p\,\big(|x_i^{(k)}| + \varepsilon\big)^{p-1}$
  – (2) Solve the weighted L1 subproblem: $x^{(k+1)} = \arg\min_x \tfrac{1}{2}\|y - Ax\|_2^2 + \lambda\sum_i w_i^{(k)} |x_i|$
E. J. Candes, M. Wakin, S. Boyd. Enhancing sparsity by reweighted l1 minimization. Journal of Fourier Analysis and Applications, 14(5):877–905, 2008.
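Step (2) of IRL1 needs a weighted-L1 solver; for the special case $A = I$ that subproblem is just a weighted soft-thresholding, which makes a compact sketch possible. The `eps` smoothing and iteration count below are illustrative choices.

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding: prox of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def irl1_lp(y, lam, p, eps=1e-8, iters=50):
    """IRL1 sketch for min_x (1/2)||y - x||_2^2 + lam * sum_i |x_i|^p
    (the A = I case): reweight, then solve the weighted-L1 subproblem,
    which here reduces to elementwise weighted soft-thresholding."""
    x = y.astype(float)
    for _ in range(iters):
        w = p * (np.abs(x) + eps) ** (p - 1)      # (1) reweight
        x = soft(y, lam * w)                      # (2) weighted-L1 prox
    return x
```

With a general $A$, step (2) would itself call an L1 solver (e.g. an ISTA-type loop) with the per-entry weights $\lambda w_i^{(k)}$.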
ITM_Lp for Lp-Sparse Coding
• ITM_Lp: for $\min_x \tfrac{1}{2}(x - y)^2 + \lambda|x|^p$,
$$T_p^{ITM}(y; \lambda) = \begin{cases} 0, & \text{if } |y| < \tau_p^{ITM}(\lambda) \\ \operatorname{sgn}(y)\, S_p(|y|; \lambda), & \text{if } |y| \ge \tau_p^{ITM}(\lambda) \end{cases}$$
where $S_p(|y|; \lambda)$ is the root of the equation $g_p(x) = x + \lambda p\, x^{p-1} = |y|$, and
$$\tau_p^{ITM}(\lambda) = (2 - p)\big[\lambda p \,/\, (1 - p)^{1-p}\big]^{1/(2-p)}$$
Y. She. Thresholding-based iterative selection procedures for model selection and shrinkage. Electronic Journal of Statistics, 3:384–415, 2009.
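A sketch of the ITM_Lp operator in Python: the threshold uses the closed form above (as reconstructed from She's stationarity derivation), and the root of $g_p$ is found by bisection on $[x_{\min}, |y|]$, where $x_{\min}$ minimizes $g_p$. The bisection tolerance is an illustrative choice.

```python
import numpy as np

def itm_lp(y, lam, p, tol=1e-10):
    """ITM_Lp thresholding sketch for min_x (1/2)(x - y)^2 + lam*|x|^p,
    0 < p < 1. Below the threshold tau the output is 0; above it, the
    output is the root of g_p(x) = x + lam*p*x^(p-1) = |y|."""
    tau = (2 - p) * (lam * p / (1 - p) ** (1 - p)) ** (1 / (2 - p))
    if abs(y) <= tau:
        return 0.0
    g = lambda x: x + lam * p * x ** (p - 1)
    lo = (lam * p * (1 - p)) ** (1 / (2 - p))    # minimizer of g_p
    hi = abs(y)                                  # g_p(|y|) > |y|, so root is inside
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < abs(y):
            lo = mid
        else:
            hi = mid
    return np.sign(y) * 0.5 * (lo + hi)
```

Bisection is valid because $g_p$ is strictly increasing to the right of its minimizer.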
Generalized Shrinkage / Thresholding
• Keys of soft-thresholding, i.e., of $\min_x \tfrac{1}{2}(x - y)^2 + \lambda|x|$
  – Thresholding rule: output $0$ when $|y| \le \lambda$
  – Shrinkage rule: $\operatorname{sgn}(y)\,(|y| - \lambda)$
• Generalization to $\min_x \tfrac{1}{2}(x - y)^2 + \lambda|x|^p$
  – What is the thresholding value for Lp?
  – How should the shrinkage rule be modified?
Motivation
[Figure: the objective $\tfrac{1}{2}(x - y)^2 + |x|^{0.5}$ plotted for (a) y = 1, (b) y = 1.19, (c) y = 1.3, (d) y = 1.5, and (e) y = 1.6; as y grows, a nonzero local minimum appears and eventually overtakes the minimum at zero]
Determining the threshold
• The first derivative at the nonzero extreme point is zero: $x - |y| + \lambda p\, x^{p-1} = 0$
• The second derivative at the nonzero extreme point is positive
• The objective value at the nonzero extreme point equals the value at zero:
$$\tfrac{1}{2}\big(x_p^* - \tau_p^{GST}(\lambda)\big)^2 + \lambda\big(x_p^*\big)^p = \tfrac{1}{2}\big(\tau_p^{GST}(\lambda)\big)^2$$
• These conditions give
$$x_p^* = \big(2\lambda(1 - p)\big)^{1/(2-p)}$$
$$\tau_p^{GST}(\lambda) = \big(2\lambda(1-p)\big)^{1/(2-p)} + \lambda p\,\big(2\lambda(1-p)\big)^{(p-1)/(2-p)}$$
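The closed forms for $x_p^*$ and $\tau_p^{GST}(\lambda)$ can be checked numerically; the values of `lam` and `p` below are arbitrary illustrative choices.

```python
import numpy as np

lam, p = 0.3, 0.5
f = lambda x, y: 0.5 * (x - y) ** 2 + lam * np.abs(x) ** p

# closed forms from the derivation
x_star = (2 * lam * (1 - p)) ** (1 / (2 - p))
tau = x_star + lam * p * x_star ** (p - 1)       # equals tau_p^GST(lam)

# at y = tau: the nonzero extreme point is stationary ...
grad = x_star - tau + lam * p * x_star ** (p - 1)
# ... and its objective value ties with the value at zero
gap = f(x_star, tau) - f(0.0, tau)
assert abs(grad) < 1e-12 and abs(gap) < 1e-12
```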
Determining the shrinkage operator
• For $|y| > \tau_p^{GST}(\lambda)$:
  – $k = 0$, $x^{(0)} = |y|$
  – Iterate for $k = 0, 1, \ldots, J$: $x^{(k+1)} = |y| - \lambda p\,\big(x^{(k)}\big)^{p-1}$, then $k \leftarrow k + 1$
  – Output $T_p^{GST}(y; \lambda) = \operatorname{sgn}(y)\, x^{(k)}$
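The procedure above is short to implement; a scalar sketch in Python (the iteration count `J` is a parameter, as on the slide):

```python
import numpy as np

def gst(y, lam, p, J=10):
    """T_p^GST(y; lam) for min_x (1/2)(x - y)^2 + lam*|x|^p:
    threshold at tau_p^GST(lam), then run the fixed-point iteration
    x <- |y| - lam*p*x^(p-1) started from x = |y|."""
    tau = (2 * lam * (1 - p)) ** (1 / (2 - p)) \
        + lam * p * (2 * lam * (1 - p)) ** ((p - 1) / (2 - p))
    if abs(y) <= tau:
        return 0.0
    x = abs(y)                                   # k = 0: x^(0) = |y|
    for _ in range(J):
        x = abs(y) - lam * p * x ** (p - 1)      # fixed-point update
    return np.sign(y) * x
```

For $p = 1$ this returns $\operatorname{sgn}(y)(|y| - \lambda)$ (soft-thresholding), and for $p = 0$ it returns $y$ whenever $|y| > \sqrt{2\lambda}$ (hard-thresholding).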
Connections with soft / hard-thresholding functions
• $p = 1$: GST is equivalent to soft-thresholding
$$T_1^{GST}(y; \lambda) = \begin{cases} 0, & \text{if } |y| \le \lambda \\ \operatorname{sgn}(y)\,(|y| - \lambda), & \text{if } |y| > \lambda \end{cases}$$
• $p = 0$: GST is equivalent to hard-thresholding
$$T_0^{GST}(y; \lambda) = \begin{cases} 0, & \text{if } |y| \le \sqrt{2\lambda} \\ y, & \text{if } |y| > \sqrt{2\lambda} \end{cases}$$
Generalized Iterated Shrinkage Algorithms
• Lp-sparse coding:
$$\min_x \|Ax - y\|_2^2 + \lambda\|x\|_p^p$$
• Gradient descent step:
$$x^{(k+0.5)} = x^{(k)} - t\,A^T\big(Ax^{(k)} - y\big)$$
• Generalized shrinkage / thresholding step:
$$x^{(k+1)} = \mathrm{GST}\big(x^{(k+0.5)},\, \lambda t,\, p,\, J\big)$$
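Putting the two steps together gives a GISA sketch in Python/NumPy. The step size $t = 1/(2\|A\|_2^2)$, zero initialization, and iteration counts are illustrative choices, not prescribed by the slide.

```python
import numpy as np

def gst(y, lam, p, J=5):
    """Vectorized T_p^GST applied elementwise to y."""
    y = np.asarray(y, dtype=float)
    tau = (2 * lam * (1 - p)) ** (1 / (2 - p)) \
        + lam * p * (2 * lam * (1 - p)) ** ((p - 1) / (2 - p))
    out = np.zeros_like(y)
    big = np.abs(y) > tau                         # entries above the threshold
    x = np.abs(y[big])
    for _ in range(J):
        x = np.abs(y[big]) - lam * p * x ** (p - 1)
    out[big] = np.sign(y[big]) * x
    return out

def gisa(A, y, lam, p, iters=300, J=5):
    """GISA sketch for min_x ||Ax - y||_2^2 + lam*||x||_p^p:
    a gradient step on the data term, then elementwise GST."""
    t = 1.0 / (2 * np.linalg.norm(A, 2) ** 2)     # step from the gradient's Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        v = x - t * 2 * A.T @ (A @ x - y)         # x^(k+0.5)
        x = gst(v, lam * t, p, J)                 # x^(k+1) = T_p^GST(v; lam*t)
    return x
```

The proximal weight is $\lambda t$ because the gradient step majorizes the data term with a quadratic of curvature $1/t$.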
Comparison with Iterated Shrinkage Algorithms
• Iterative shrinkage / thresholding for $\min_x \|Ax - y\|_2^2 + \lambda\|x\|_1$
  – Gradient descent step:
$$x^{(k+0.5)} = x^{(k)} - t\,A^T\big(Ax^{(k)} - y\big)$$
  – Soft-thresholding, applied elementwise:
$$x_i^{(k+1)} = \begin{cases} 0, & \text{if } \big|x_i^{(k+0.5)}\big| \le \lambda t \\ \operatorname{sgn}\big(x_i^{(k+0.5)}\big)\big(\big|x_i^{(k+0.5)}\big| - \lambda t\big), & \text{otherwise} \end{cases}$$
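For comparison, the $p = 1$ counterpart (ISTA) keeps the same gradient step but uses plain soft-thresholding; a sketch with the same illustrative step-size choice:

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding: prox of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, iters=500):
    """ISTA sketch for min_x ||Ax - y||_2^2 + lam*||x||_1:
    gradient step on the data term, then soft-thresholding."""
    t = 1.0 / (2 * np.linalg.norm(A, 2) ** 2)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft(x - t * 2 * A.T @ (A @ x - y), lam * t)
    return x
```

The only structural difference from GISA is the proximal map: soft-thresholding here versus the GST iteration for general $p$.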
Sparse gradient based image deconvolution
$$\min_x \tfrac{1}{2}\|x \otimes k - y\|_2^2 + \lambda\|Dx\|_p^p$$
• Half-quadratic splitting with an auxiliary variable $d$ for the gradients:
$$\min_{x,\,d} \tfrac{1}{2}\|x \otimes k - y\|_2^2 + \tfrac{\beta}{2}\|Dx - d\|_2^2 + \lambda\|d\|_p^p$$
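The split problem is solved by alternating minimization: the $d$-step is an elementwise GST on the gradients, and the $x$-step is quadratic. A 1-D sketch in Python/NumPy; the explicit circulant matrices, the fixed $\beta$, and the single difference filter are illustrative simplifications of the image-domain algorithm (which would solve the $x$-step with FFTs).

```python
import numpy as np

def gst(y, lam, p, J=5):
    """Vectorized T_p^GST applied elementwise to y."""
    y = np.asarray(y, dtype=float)
    tau = (2 * lam * (1 - p)) ** (1 / (2 - p)) \
        + lam * p * (2 * lam * (1 - p)) ** ((p - 1) / (2 - p))
    out = np.zeros_like(y)
    big = np.abs(y) > tau
    x = np.abs(y[big])
    for _ in range(J):
        x = np.abs(y[big]) - lam * p * x ** (p - 1)
    out[big] = np.sign(y[big]) * x
    return out

def deconv_lp(y, kernel, lam, p, beta=100.0, iters=30):
    """Half-quadratic splitting sketch for
    min_x (1/2)||x (*) k - y||^2 + lam*||Dx||_p^p  (1-D, circular):
    alternate a GST d-step with a quadratic x-step."""
    n = len(y)
    # circulant matrices for the blur kernel and the difference operator D
    K = sum(c * np.roll(np.eye(n), s, axis=0) for s, c in enumerate(kernel))
    D = np.roll(np.eye(n), -1, axis=1) - np.eye(n)
    x = y.astype(float)
    for _ in range(iters):
        d = gst(D @ x, lam / beta, p)             # d-step: prox of (lam/beta)|.|^p
        x = np.linalg.solve(K.T @ K + beta * D.T @ D,
                            K.T @ y + beta * D.T @ d)  # x-step: linear solve
    return x
```

With the identity kernel this reduces to Lp-gradient denoising; in practice $\beta$ is usually increased over a continuation schedule rather than held fixed.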
Conclusion
• Compared with the state-of-the-art methods, GISA is theoretically solid, easy to understand, efficient to implement, and converges to a more accurate solution.
• Compared with LUT, GISA is more general and does not need to compute and store look-up tables.
• GISA can readily be used to solve the many Lp-norm minimization problems arising in vision and learning applications.