
Page 1: Blind Compressed Sensing Using Sparsifying Transforms

Saiprasad Ravishankar and Yoram Bresler

Department of Electrical and Computer Engineering and Coordinated Science Laboratory

University of Illinois at Urbana-Champaign

May 29, 2015

Page 2: Key Topics of Talk

Non-adaptive Compressed Sensing (CS)

Synthesis dictionary learning-basedblind compressed sensing

Transform learning vs. Dictionary learning

Transform learning-based blindcompressed sensing

Application to magnetic resonanceimaging (MRI)

Transform learning based MRI (TLMRI)

Conclusions

Page 3: Compressed Sensing (CS)

CS enables accurate recovery of images from far fewer measurements than the number of unknowns

Sparsity of image in transform domain or dictionary

Measurement procedure incoherent with transform

Reconstruction non-linear and expensive

Reconstruction problem (NP-hard):

min_x ‖Ax − y‖₂² + λ ‖Ψx‖₀   (1)

(first term: data fidelity; second term: regularizer)

x ∈ C^P: image as a vector, y ∈ C^m: measurements.

A ∈ C^{m×P}: sensing matrix (m < P), Ψ: transform (Wavelets, Contourlets, Total Variation). The ℓ0 "norm" counts non-zeros.

Iterative algorithms for CS reconstruction are usually expensive.
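To make the cost concrete, here is a minimal numpy sketch (an illustration, not the talk's method) of ISTA, a standard solver for the ℓ1 relaxation of (1); it assumes real-valued data and a unitary Ψ so the proximal step has a closed form:

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding: the proximal operator of t*||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_cs(A, y, Psi, lam, n_iter=200):
    """ISTA for the convex relaxation min_x 0.5*||Ax - y||_2^2 + lam*||Psi x||_1.
    Assumes real-valued data and a unitary Psi (so prox = Psi^T soft(Psi v, t))."""
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1/L, L = Lipschitz const of gradient
    for _ in range(n_iter):
        v = x - step * (A.T @ (A @ x - y))        # gradient step on data fidelity
        x = Psi.T @ soft(Psi @ v, step * lam)     # proximal step through unitary Psi
    return x
```

Hundreds of such iterations, each applying A, A^T, Ψ, and Ψ^T, are typical, which is what makes CS reconstruction expensive.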

Page 4: Application: Compressed Sensing MRI (CSMRI)

Data: samples in k-space of the spatial Fourier transform of the object, acquired sequentially.

Acquisition rate limited by MR physics and physiological constraints on RF energy deposition.

CS accelerates the data acquisition in MRI.

CSMRI with non-adaptive transforms or dictionaries limited to 2.5-3 fold undersampling [Ma et al. '08].

Two directions to improve CSMRI:

better or adaptive sparse modeling

better choice of sampling pattern (Fu) [EMBC, 2011]

Fig. from Lustig et al. ’07

Page 5: Synthesis Model for Sparse Representation

Given a signal y ∈ R^n and a dictionary D ∈ R^{n×K}, we assume y = Dx with ‖x‖₀ ≪ K ⇒ a union-of-subspaces model.

Real-world signals are modeled as y = Dx + e, where e is a deviation term.

Given D and sparsity level s, the synthesis sparse coding problem is

x̂ = argmin_x ‖y − Dx‖₂²  s.t. ‖x‖₀ ≤ s

This problem is NP-hard.
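A standard greedy approximation is orthogonal matching pursuit (OMP); the sketch below (an illustration, assuming unit-norm columns in D) selects one atom per pass and re-fits the coefficients by least squares:

```python
import numpy as np

def omp(y, D, s):
    """Greedy OMP sketch for the NP-hard synthesis sparse coding problem.
    Assumes unit-norm columns in D and s >= 1."""
    n, K = D.shape
    residual = y.copy()
    support = []
    x = np.zeros(K)
    for _ in range(s):
        # pick the atom most correlated with the current residual
        k = int(np.argmax(np.abs(D.T @ residual)))
        support.append(k)
        # re-fit coefficients on the chosen support by least squares
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x
```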

Page 6: Synthesis Dictionary Learning

The DL problem (NP-hard):

min_{D,B} Σ_{j=1}^N ‖R_j x − D b_j‖₂²  s.t. ‖d_k‖₂ = 1 ∀ k, ‖b_j‖₀ ≤ s ∀ j   (2)

R_j x ∈ C^n: the √n × √n patch indexed by its location in the image.

R_j ∈ C^{n×P} extracts the patch.

D ∈ C^{n×K}: patch-based dictionary.

b_j ∈ C^K: sparse code, R_j x ≈ D b_j.

s: sparsity level, B = [b₁ | b₂ | ... | b_N].

DL minimizes the fitting error of all patches using sparse representations w.r.t. D.

Page 7: Synthesis-based Blind Compressed Sensing (BCS)

(P0) min_{x,D,B} Σ_{j=1}^N ‖R_j x − D b_j‖₂² + ν ‖Ax − y‖₂²
     s.t. ‖d_k‖₂ = 1 ∀ k, ‖b_j‖₀ ≤ s ∀ j.

(first term: sparse fitting regularizer; second term: data fidelity)

B ∈ C^{K×N}: matrix that has the sparse codes b_j as its columns.

(P0) learns D ∈ C^{n×K} and reconstructs x from only the undersampled y ⇒ dictionary adaptive to the underlying image.

(P0) is NP-hard, and non-convex even if the ℓ0 "norm" is relaxed to ℓ1.

DLMRI¹ solves (P0) for MRI and works better than non-adaptive CS.

Synthesis BCS algorithms have no guarantees and are expensive.

1 [Ravishankar & Bresler ’11]

Page 8: 2D Random Sampling - 6 fold undersampling

[Figure: LDP² reconstruction (22 dB) and LDP error magnitude; DLMRI reconstruction (32 dB) and DLMRI error magnitude.]

MRI data from Miki Lustig. 2 [Lustig et al. ’07]

Page 9: Alternative: Sparsifying Transform Model

Given a signal y ∈ R^n and a transform W ∈ R^{m×n}, we model Wy = x + η with ‖x‖₀ ≪ m, where η is an error term.

Natural signals are approximately sparse in Wavelets, DCT.

Given W and sparsity s, transform sparse coding is

x̂ = argmin_x ‖Wy − x‖₂²  s.t. ‖x‖₀ ≤ s

x̂ = H_s(Wy) is computed by thresholding Wy to its s largest-magnitude elements (a sketch appears below). Sparse coding is cheap. The signal is recovered as W†x̂.

Sparsifying transforms exploited for compression (JPEG2000), etc.
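In contrast to synthesis sparse coding, the transform-model projection is a single thresholding pass; a minimal sketch (assuming W and y are given as numpy arrays):

```python
import numpy as np

def transform_sparse_code(y, W, s):
    """Transform sparse coding: x_hat = H_s(W y), keeping only the
    s largest-magnitude entries of W y (a sketch)."""
    z = W @ y
    x = np.zeros_like(z)
    idx = np.argsort(np.abs(z))[::-1][:s]   # indices of the s largest magnitudes
    x[idx] = z[idx]
    return x

# Signal estimate back in the original domain: np.linalg.pinv(W) @ x
```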

Page 10: Alternative: Sparsifying Transform Learning

Square Transform Models

Unstructured transform learning [IEEE TSP, 2013 & 2015]

Doubly sparse transform learning [IEEE TIP, 2013]

Online learning for Big Data [IEEE JSTSP, 2015]

Convex formulations for transform learning [ICASSP, 2014]

Overcomplete Transform Models

Unstructured overcomplete transform learning [ICASSP, 2013]

Learning structured overcomplete transforms with block cosparsity (OCTOBOS) [IJCV, 2014]

Applications: Sparse representation, Image & Video denoising, Classification, Blind compressed sensing (BCS) for imaging.

Page 11: Square Transform Learning Formulation

(P1) min_{W,B} Σ_{j=1}^N ‖W R_j x − b_j‖₂² + λ (0.5 ‖W‖_F² − log |det W|)
     s.t. ‖b_j‖₀ ≤ s ∀ j

(first term: sparsification error; second term: regularizer)

Sparsification error: measures the deviation of the data from perfect sparsity in the transform domain.

The regularizer enables complete control over the conditioning and scaling of W.

If ∃ (W, B) such that the condition number κ(W) = 1, W R_j x = b_j, and ‖b_j‖₀ ≤ s ∀ j ⇒ globally identifiable by solving (P1).

(P1) favors both a low sparsification error and good conditioning.

The solution to (P1) is unitary as λ → ∞.

Page 12: Transform-based Blind Compressed Sensing (BCS)

(P2) min_{x,W,B} Σ_{j=1}^N ‖W R_j x − b_j‖₂² + ν ‖Ax − y‖₂² + λ v(W)
     s.t. Σ_{j=1}^N ‖b_j‖₀ ≤ s, ‖x‖₂ ≤ C.

(terms: sparsification error, data fidelity, regularizer)

(P2) learns W ∈ C^{n×n} and reconstructs x from only the undersampled y ⇒ transform adaptive to the underlying image.

v(W) ≜ −log |det W| + 0.5 ‖W‖_F² controls the scaling and condition number κ of W.

‖x‖₂ ≤ C is an energy/range constraint, with C > 0.

Page 13: Transform BCS: Identifiability & Uniqueness

Proposition 1

Let x ∈ C^p, and let y = Ax with A ∈ C^{m×p}. Suppose

‖x‖₂ ≤ C

W ∈ C^{n×n} is a unitary transform

Σ_{j=1}^N ‖W R_j x‖₀ ≤ s

Further, let B denote the matrix that has W R_j x as its columns. Then (x, W, B) is a global minimizer of Problem (P2), i.e., it is identifiable by solving (P2).

Given a minimizer (x, W, B) of (P2), (x, ΘW, ΘB) is another equivalent minimizer for all Θ such that Θ^H Θ = I and Σ_j ‖Θ b_j‖₀ ≤ s. The optimal x is invariant to such transformations of (W, B).

When W is constrained to be doubly sparse and unitary, uniqueness can be guaranteed under additional (e.g., spark) conditions.

Page 14: Alternative Transform BCS Formulations

(P3) min_{x,W,B} Σ_{j=1}^N ‖W R_j x − b_j‖₂² + ν ‖Ax − y‖₂²
     s.t. W^H W = I, Σ_{j=1}^N ‖b_j‖₀ ≤ s, ‖x‖₂ ≤ C.

(P3) is also a unitary synthesis dictionary-based BCS problem, with W^H the synthesis dictionary.

(P4) min_{x,W,B} Σ_{j=1}^N ‖W R_j x − b_j‖₂² + ν ‖Ax − y‖₂² + λ v(W) + η² Σ_{j=1}^N ‖b_j‖₀
     s.t. ‖x‖₂ ≤ C.

Page 15: Block Coordinate Descent (BCD) Algorithm for (P2)

(P2) is solved by alternating between updating W, B, and x.

Alternate a few times between the W and B updates before performing an image update.

Sparse Coding Step: solves (P2) for B with fixed x, W.

min_B Σ_{j=1}^N ‖W R_j x − b_j‖₂²  s.t. Σ_{j=1}^N ‖b_j‖₀ ≤ s.   (3)

Cheap solution: let Z ∈ C^{n×N} be the matrix with W R_j x as its columns. The solution B̂ = H_s(Z) is computed exactly by zeroing out all but the s largest-magnitude coefficients in Z (a sketch follows below).
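A minimal numpy sketch of this step (illustrative names; the patch matrix X with columns R_j x is assumed precomputed):

```python
import numpy as np

def sparse_code_step(W, X, s):
    """Sparse coding step (3): B = H_s(W X), zeroing all but the s
    largest-magnitude coefficients across the whole matrix."""
    Z = W @ X
    B = np.zeros_like(Z)
    # positions of the s largest magnitudes over all entries of Z
    idx = np.unravel_index(np.argsort(np.abs(Z), axis=None)[::-1][:s], Z.shape)
    B[idx] = Z[idx]
    return B
```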

Page 16: BCD Algorithm for (P2)

Transform Update Step: solves (P2) for W with fixed x, B.

min_W Σ_{j=1}^N ‖W R_j x − b_j‖₂² + 0.5 λ ‖W‖_F² − λ log |det W|   (4)

Let X ∈ C^{n×N} be the matrix with R_j x as its columns.

Closed-form solution (sketched in code below):

Ŵ = 0.5 R (Σ + (Σ² + 2λI)^{1/2}) V^H L^{−1}   (5)

where X X^H + 0.5 λ I = L L^H, and L^{−1} X B^H has a full SVD of V Σ R^H.

The solution is unique if and only if X B^H is non-singular.
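In code, (5) amounts to one Cholesky factorization and one SVD; a sketch with illustrative names, assuming complex-valued data:

```python
import numpy as np

def transform_update(X, B, lam):
    """Closed-form transform update (5). X (n x N) holds the patches
    R_j x as columns; B holds their sparse codes."""
    n = X.shape[0]
    # Factor X X^H + 0.5*lam*I = L L^H
    L = np.linalg.cholesky(X @ X.conj().T + 0.5 * lam * np.eye(n))
    Linv = np.linalg.inv(L)
    # Full SVD of L^{-1} X B^H = V Sigma R^H (numpy: U = V, Vh = R^H)
    V, sig, RH = np.linalg.svd(Linv @ X @ B.conj().T)
    d = 0.5 * (sig + np.sqrt(sig ** 2 + 2.0 * lam))  # 0.5*(Sigma + (Sigma^2 + 2*lam*I)^(1/2))
    return (RH.conj().T * d) @ V.conj().T @ Linv     # R diag(d) V^H L^{-1}
```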

Page 17: BCD Algorithm for (P2)

Image Update Step: solves (P2) for x with fixed W, B.

min_x Σ_{j=1}^N ‖W R_j x − b_j‖₂² + ν ‖Ax − y‖₂²  s.t. ‖x‖₂ ≤ C.   (6)

Least squares problem with an ℓ2-norm constraint.

The solution is unique as long as the set of overlapping patches covers all image pixels.

Solve the least squares Lagrangian formulation:

min_x Σ_{j=1}^N ‖W R_j x − b_j‖₂² + ν ‖Ax − y‖₂² + µ (‖x‖₂² − C)   (7)

The optimal multiplier µ ∈ R₊ is the smallest real such that ‖x̂‖₂ ≤ C. Both µ and x̂ can be found cheaply (a CG-based sketch follows below).
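For a fixed µ, (7) is a linear system in x; a conjugate-gradient sketch, where `extract`/`assemble` (the patch operator R_j and its adjoint accumulation Σ_j R_j^H) and the sensing matrix A are assumed inputs:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def image_update(W, B, A, y, extract, assemble, nu, mu, P):
    """Image update (7) for a fixed multiplier mu, via CG on the normal
    equations (sum_j R_j^H W^H W R_j + nu A^H A + mu I) x = rhs.
    `extract(x)` builds the n x N patch matrix [R_1 x ... R_N x] and
    `assemble(M)` computes sum_j R_j^H m_j; both are assumed helpers."""
    def matvec(x):
        X = extract(x)
        patch_term = assemble(W.conj().T @ (W @ X))       # sum_j R_j^H W^H W R_j x
        return patch_term + nu * (A.conj().T @ (A @ x)) + mu * x

    rhs = assemble(W.conj().T @ B) + nu * (A.conj().T @ y)
    op = LinearOperator((P, P), matvec=matvec, dtype=complex)
    x_hat, _ = cg(op, rhs)
    return x_hat
```

In this sketch, one outer BCD iteration chains sparse_code_step, transform_update, and image_update, increasing µ from 0 until ‖x̂‖₂ ≤ C holds.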

Page 18: BCS Convergence Guarantees - Notations

Define the barrier function ψ_s(B) as

ψ_s(B) = 0 if Σ_{j=1}^N ‖b_j‖₀ ≤ s, and +∞ otherwise.

χ_C(x) is the barrier function corresponding to ‖x‖₂ ≤ C.

(P2) is equivalent to the problem of minimizing the objective (evaluated in the sketch below)

g(W, B, x) = Σ_{j=1}^N ‖W R_j x − b_j‖₂² + ν ‖Ax − y‖₂² + λ v(W) + ψ_s(B) + χ_C(x)

For H ∈ C^{p×q}, ρ_j(H) is the magnitude of the j-th largest element (magnitude-wise) of H.

X ∈ C^{n×N} denotes the matrix with R_j x, 1 ≤ j ≤ N, as its columns.
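Spelled out in code, g is cheap to evaluate, which is handy for monitoring the convergence behavior established next; a sketch with illustrative names, the patch matrix X assumed precomputed:

```python
import numpy as np

def bcs_objective(W, B, X, A, x, y, nu, lam, s, C):
    """Equivalent unconstrained objective g(W, B, x)."""
    sparsification = np.linalg.norm(W @ X - B, 'fro') ** 2
    fidelity = nu * np.linalg.norm(A @ x - y) ** 2
    _, logdet = np.linalg.slogdet(W)                    # log|det W|
    v = 0.5 * np.linalg.norm(W, 'fro') ** 2 - logdet    # regularizer v(W)
    psi = 0.0 if np.count_nonzero(B) <= s else np.inf   # barrier psi_s(B)
    chi = 0.0 if np.linalg.norm(x) <= C else np.inf     # barrier chi_C(x)
    return sparsification + fidelity + lam * v + psi + chi
```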

Page 19: Transform BCS Convergence Guarantees

Theorem 1

For the sequence {W^t, B^t, x^t} generated by the BCD algorithm with initial (W^0, B^0, x^0), we have:

{g(W^t, B^t, x^t)} → g* = g*(W^0, B^0, x^0).

{W^t, B^t, x^t} is bounded, and all its accumulation points are equivalent, i.e., they achieve the same value g* of the objective.

‖x^t − x^{t−1}‖₂ → 0 as t → ∞.

Every accumulation point (W, B, x) is a critical point of g satisfying the following partial global optimality conditions:

x ∈ argmin_{x̃} g(W, B, x̃)   (8)

W ∈ argmin_{W̃} g(W̃, B, x),   B ∈ argmin_{B̃} g(W, B̃, x)   (9)

Page 20: Transform BCS Convergence Guarantees

Theorem 2

Each accumulation point (W, B, x) of {W^t, B^t, x^t} also satisfies the following partial local optimality conditions:

g(W + ∆W, B + ∆B, x) ≥ g(W, B, x) = g*   (10)

g(W, B + ∆B, x + ∆x) ≥ g(W, B, x) = g*   (11)

The conditions each hold for all ∆x ∈ C^p, all ∆W ∈ C^{n×n} satisfying ‖∆W‖_F ≤ ε for some ε = ε(W) > 0, and all ∆B ∈ C^{n×N} in R1 ∪ R2:

R1. The half-space Re(tr{(WX − B) ∆B^H}) ≤ 0.

R2. The local region defined by ‖∆B‖_∞ < ρ_s(WX).

Furthermore, if ‖WX‖₀ ≤ s, then ∆B can be arbitrary.

Page 21: Global Convergence Guarantees

Proposition 2

For each initialization, the iterate sequence in the BCD algorithm converges to an equivalence class (same objective value) of critical points of the objective that are also partial global/local minimizers.

Proposition 3

The BCD algorithm is globally convergent to a subset of the set of critical points of the objective. The subset includes all (W, B, x) that are at least partial global and partial local minimizers.

Page 22: Computational Advantages of Transform BCS

Cost per iteration of transform BCS: O(p⁴NL)

N overlapping patches of size p × p, W ∈ C^{n×n}, n ≜ p².

L: number of inner alternations between transform update and sparse coding.

Cost per iteration of the synthesis BCS method DLMRI³: O(p⁶NJ)

D ∈ C^{n×K}, n ≜ p², K ∝ n, sparsity s ∝ n.

J: number of inner iterations of dictionary learning using K-SVD⁴.

In practice, transform BCS converges quickly and is much cheaper for large p.

In 3D or 4D imaging, n = p³ or p⁴, and the gain in computations is about a factor of n in order.

3 [Ravishankar & Bresler ’11] 4 [Aharon et al. ’06]

Page 23: TLMRI Convergence - 4x Undersampling (s = 3.4%)

[Figure: reference image and sampling mask; plots of the objective function vs. iteration number, and ‖x^t − x^{t−1}‖₂ vs. t.]

Page 24: Convergence & Learning - 4x Undersampling (s = 3.4%)

[Figure: zero-filling reconstruction (28.94 dB) and its error map; TLMRI reconstruction (32.66 dB); real (top) and imaginary (bottom) parts of the learnt 36×36 W.]

Page 25: Comparison (PSNR & Runtime) to Recent Methods

Reconstruction PSNR (dB) and average runtime:

Sampling scheme  | Undersampling | Zero-filling | LDP⁵ | PBDWS⁶ | DLMRI⁷ | PANO⁸ | TLMRI
2D Random        | 4x            | 25.3         | 30.3 | 32.6   | 32.91  | 32.2  | 33.04
2D Random        | 7x            | 25.3         | 27.3 | 31.3   | 31.46  | 30.2  | 31.81
Cartesian        | 4x            | 28.9         | 30.2 | 32.0   | 32.46  | 31.6  | 32.64
Cartesian        | 7x            | 27.9         | 25.5 | 30.1   | 30.72  | 30.4  | 31.04
Avg. runtime (s) |               |              | 251  | 794    | 2051   | 664   | 211

TLMRI is up to 5.5 dB better than LDP, which uses Wavelets + TV.

TLMRI provides up to 1 dB improvement in PSNR over the PBDWS method, which uses redundant Wavelets and trained patch-based geometric directions, and is up to 1.6 dB better than the non-local PANO method.

It is up to 0.35 dB better than DLMRI, which learns a 4x overcomplete dictionary.

TLMRI is 10x faster than DLMRI, and 4x faster than the PBDWS method.

TLMRI provides the best reconstructions, and is the fastest.

5 [Lustig et al. ’07] 6 [Ning et al. ’13] 7 [Ravishankar & Bresler ’11] 8 [Qu et al. ’14]

Page 26: Example - 2D random 5x Undersampling

[Figure: reference image, DLMRI reconstruction (28.54 dB), TLMRI reconstruction (30.47 dB); sampling mask, DLMRI error map, TLMRI error map.]

Page 27: Conclusions

We introduced a transform-based BCS framework.

The proposed BCS algorithms have a low computational cost.

We provided novel convergence guarantees for the algorithms that do not require any restrictive assumptions.

For CSMRI, the proposed approach outperforms leading image reconstruction methods, while being much faster.

Future work: convergence of the algorithm to a global minimizer, and convergence rate analysis.

Page 28: Thank you! Questions?

Page 29: Convergence Guarantees - Definitions

Definition 1

Let φ : R^q → (−∞, +∞] be a proper function and let z ∈ dom φ. The Fréchet sub-differential of the function φ at z is the following set:

∂̂φ(z) ≜ { h ∈ R^q : lim inf_{b→z, b≠z} (1/‖b − z‖) (φ(b) − φ(z) − ⟨b − z, h⟩) ≥ 0 }   (12)

If z ∉ dom φ, then ∂̂φ(z) = ∅. The sub-differential of φ at z is defined as

∂φ(z) ≜ { h ∈ R^q : ∃ z_k → z, φ(z_k) → φ(z), h_k ∈ ∂̂φ(z_k) → h }.   (13)

Lemma 1

A necessary condition for z ∈ R^q to be a minimizer of the function φ : R^q → (−∞, +∞] is that z is a critical point of φ, i.e., 0 ∈ ∂φ(z). If φ is a convex function, this condition is also sufficient.
