
Online Dictionary Learning for Sparse Coding

Julien Mairal¹, Francis Bach¹, Jean Ponce², Guillermo Sapiro³

¹INRIA–Willow project, ²École Normale Supérieure, ³University of Minnesota

ICML, Montréal, June 2009

What this talk is about

- Efficiently learning dictionaries (basis sets) for sparse coding.
- Solving a large-scale matrix factorization problem.
- Making some large-scale image processing problems tractable.
- Proposing an algorithm which extends to NMF, sparse PCA, ...

1 The Dictionary Learning Problem

2 Online Dictionary Learning

3 Extensions


The Dictionary Learning Problem

$$\underbrace{\mathbf{y}}_{\text{measurements}} = \underbrace{\mathbf{x}_{\text{orig}}}_{\text{original image}} + \underbrace{\mathbf{w}}_{\text{noise}}$$

The Dictionary Learning Problem [Elad & Aharon ('06)]

Solving the denoising problem

Extract all overlapping 8 × 8 patches $\mathbf{x}_i$.

Solve a matrix factorization problem:

$$\min_{\alpha_i,\, \mathbf{D} \in \mathcal{C}} \sum_{i=1}^{n} \underbrace{\frac{1}{2}\|\mathbf{x}_i - \mathbf{D}\alpha_i\|_2^2}_{\text{reconstruction}} + \underbrace{\lambda\|\alpha_i\|_1}_{\text{sparsity}},$$

with $n > 100{,}000$.

Average the reconstruction of each patch.
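A minimal sketch of this patch-based pipeline, assuming scikit-learn's patch utilities and its MiniBatchDictionaryLearning as an off-the-shelf stand-in for the factorization above (the input image and all parameters are illustrative, not the authors' setup):

```python
import numpy as np
from sklearn.feature_extraction.image import (extract_patches_2d,
                                              reconstruct_from_patches_2d)
from sklearn.decomposition import MiniBatchDictionaryLearning

# Hypothetical input: a noisy grayscale image as a 2-D float array.
noisy = np.random.default_rng(0).random((256, 256))

# Extract all overlapping 8x8 patches and flatten them to vectors x_i.
patches = extract_patches_2d(noisy, (8, 8))
X = patches.reshape(len(patches), -1)
mean = X.mean(axis=1, keepdims=True)
X = X - mean  # code each patch minus its mean (common practice)

# Learn D and the sparse codes alpha_i jointly.
dl = MiniBatchDictionaryLearning(n_components=256, alpha=1.0)
codes = dl.fit_transform(X)

# Reconstruct each patch as D alpha_i, then average overlapping patches.
recon = codes @ dl.components_ + mean
denoised = reconstruct_from_patches_2d(recon.reshape(patches.shape),
                                       noisy.shape)
```

Here `reconstruct_from_patches_2d` performs the averaging step: each pixel is covered by many overlapping patches, and their reconstructions are averaged.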

The Dictionary Learning Problem [Mairal, Bach, Ponce, Sapiro & Zisserman ('09)]

Denoising result

The Dictionary Learning Problem [Mairal, Sapiro & Elad ('08)]

Image completion example

The Dictionary Learning Problem

What does D look like?

The Dictionary Learning Problem

$$\min_{\substack{\alpha \in \mathbb{R}^{k \times n} \\ \mathbf{D} \in \mathcal{C}}} \sum_{i=1}^{n} \frac{1}{2}\|\mathbf{x}_i - \mathbf{D}\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1$$

$$\mathcal{C} \triangleq \{\mathbf{D} \in \mathbb{R}^{m \times k} \ \text{s.t.} \ \forall j = 1, \ldots, k, \ \|\mathbf{d}_j\|_2 \leq 1\}.$$

Classical optimization alternates between D and α.

Good results, but very slow!
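As a point of reference, scikit-learn exposes a batch alternating solver of exactly this flavor; a hedged usage sketch (synthetic data and parameters are illustrative):

```python
import numpy as np
from sklearn.decomposition import dict_learning

# Synthetic stand-in for the n x m matrix of vectorized patches.
X = np.random.default_rng(0).standard_normal((1000, 64))

# Alternates between sparse coding (LARS by default) and a dictionary
# update; `alpha` plays the role of lambda. Note sklearn's convention is
# transposed relative to the slides: rows of `dictionary` are the atoms d_j.
codes, dictionary, errors = dict_learning(X, n_components=128, alpha=1.0,
                                          max_iter=20)
```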


Online Dictionary Learning

Classical formulation of dictionary learning

$$\min_{\mathbf{D} \in \mathcal{C}} f_n(\mathbf{D}) = \min_{\mathbf{D} \in \mathcal{C}} \frac{1}{n} \sum_{i=1}^{n} l(\mathbf{x}_i, \mathbf{D}),$$

where

$$l(\mathbf{x}, \mathbf{D}) \triangleq \min_{\alpha \in \mathbb{R}^k} \frac{1}{2}\|\mathbf{x} - \mathbf{D}\alpha\|_2^2 + \lambda\|\alpha\|_1.$$

Online Dictionary Learning

Which formulation are we interested in?

$$\min_{\mathbf{D} \in \mathcal{C}} \Big[\, f(\mathbf{D}) \triangleq \mathbb{E}_{\mathbf{x}}[l(\mathbf{x}, \mathbf{D})] \approx \lim_{n \to +\infty} \frac{1}{n} \sum_{i=1}^{n} l(\mathbf{x}_i, \mathbf{D}) \,\Big]$$

Online Dictionary Learning

Online learning can:

- handle potentially infinite datasets,
- adapt to dynamic training sets,
- be dramatically faster than batch algorithms [Bottou & Bousquet ('08)].

Online Dictionary Learning

Proposed approach

1: for t = 1, ..., T do
2:   Draw x_t
3:   Sparse Coding

$$\alpha_t \leftarrow \arg\min_{\alpha \in \mathbb{R}^k} \frac{1}{2}\|\mathbf{x}_t - \mathbf{D}_{t-1}\alpha\|_2^2 + \lambda\|\alpha\|_1,$$

4:   Dictionary Learning

$$\mathbf{D}_t \leftarrow \arg\min_{\mathbf{D} \in \mathcal{C}} \frac{1}{t} \sum_{i=1}^{t} \Big( \frac{1}{2}\|\mathbf{x}_i - \mathbf{D}\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1 \Big),$$

5: end for

Online Dictionary Learning

Proposed approach

Implementation details:

- Use LARS for the sparse coding step,
- Use a block-coordinate approach for the dictionary update, with warm restart,
- Use a mini-batch (see the sketch below).
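A minimal sketch of the loop above, assuming NumPy and scikit-learn. `Lasso` stands in for LARS, the mini-batch size is one, and the function name and defaults are ours, not the authors' implementation:

```python
import numpy as np
from sklearn.linear_model import Lasso  # stand-in for the LARS solver

def online_dictionary_learning(X, k, lam=0.1, n_iter=1000, seed=0):
    """Minimal sketch of the online loop; names and defaults are ours."""
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    # Initialize D with unit-norm columns (projection onto C).
    D = rng.standard_normal((m, k))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((k, k))  # running sum of alpha_i alpha_i^T
    B = np.zeros((m, k))  # running sum of x_i alpha_i^T

    for t in range(n_iter):
        x = X[rng.integers(len(X))]  # draw x_t
        # Sparse coding: argmin_a 1/2 ||x - D a||_2^2 + lam ||a||_1.
        # (sklearn scales its objective by 1/m, hence alpha=lam/m.)
        alpha = Lasso(alpha=lam / m, fit_intercept=False).fit(D, x).coef_
        A += np.outer(alpha, alpha)
        B += np.outer(x, alpha)
        # Dictionary update: one pass of block-coordinate descent on the
        # columns of D, warm-restarted from the previous D.
        for j in range(k):
            if A[j, j] > 1e-12:
                u = D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j]
                D[:, j] = u / max(np.linalg.norm(u), 1.0)  # project onto C
    return D
```

The matrices A and B are sufficient statistics: they let the dictionary update account for all past pairs (x_i, α_i) without storing them. scikit-learn's MiniBatchDictionaryLearning provides an off-the-shelf implementation in this family.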

Online Dictionary Learning

Proposed approach

Which guarantees do we have?

Under a few reasonable assumptions:

- we build a surrogate function $\hat{f}_t$ of the expected cost $f$ verifying

$$\lim_{t \to +\infty} \hat{f}_t(\mathbf{D}_t) - f(\mathbf{D}_t) = 0,$$

- $\mathbf{D}_t$ is asymptotically close to a stationary point.
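For concreteness, the surrogate can be read off the dictionary-update step above: it keeps the past codes $\alpha_i$ fixed rather than re-minimizing over them, so each term upper-bounds $l(\mathbf{x}_i, \mathbf{D})$ and hence

$$\hat{f}_t(\mathbf{D}) \triangleq \frac{1}{t} \sum_{i=1}^{t} \Big( \frac{1}{2}\|\mathbf{x}_i - \mathbf{D}\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1 \Big) \geq f_t(\mathbf{D}).$$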

Online Dictionary Learning

Experimental results, batch vs online

[Plots across three slides: m = 8 × 8, k = 256; m = 12 × 12 × 3, k = 512; m = 16 × 16, k = 1024]

Online Dictionary Learning

Experimental results, ODL vs SGD

[Plots across three slides, same configurations: m = 8 × 8, k = 256; m = 12 × 12 × 3, k = 512; m = 16 × 16, k = 1024]

Online Dictionary Learning

Inpainting a 12-Mpixel photograph

[Image sequence across four slides: the inpainting example]


Extension to NMF and sparse PCA

NMF extension

$$\min_{\substack{\alpha \in \mathbb{R}^{k \times n} \\ \mathbf{D} \in \mathcal{C}}} \sum_{i=1}^{n} \frac{1}{2}\|\mathbf{x}_i - \mathbf{D}\alpha_i\|_2^2 \quad \text{s.t.} \quad \alpha_i \geq 0, \ \mathbf{D} \geq 0.$$
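A hedged sketch of how the online loop above can be adapted for NMF (this adaptation is ours: SciPy's nonnegative least squares replaces the Lasso, and the block-coordinate update is clipped at zero):

```python
import numpy as np
from scipy.optimize import nnls

# Inside the online loop from the earlier sketch (D, x, A, B, k as there):

# Coding step: nonnegative least squares; the NMF objective above has
# no l1 term, so there is no lambda here.
alpha, _ = nnls(D, x)

# Dictionary update: same block-coordinate step, clipped at zero so
# that D stays nonnegative before the norm projection.
for j in range(k):
    if A[j, j] > 1e-12:
        u = np.maximum(D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j], 0.0)
        D[:, j] = u / max(np.linalg.norm(u), 1.0)
```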

SPCA extension

$$\min_{\substack{\alpha \in \mathbb{R}^{k \times n} \\ \mathbf{D} \in \mathcal{C}'}} \sum_{i=1}^{n} \frac{1}{2}\|\mathbf{x}_i - \mathbf{D}\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1$$

$$\mathcal{C}' \triangleq \{\mathbf{D} \in \mathbb{R}^{m \times k} \ \text{s.t.} \ \forall j, \ \|\mathbf{d}_j\|_2^2 + \gamma\|\mathbf{d}_j\|_1 \leq 1\}.$$
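For the sparse PCA variant, scikit-learn ships an online solver in this family (MiniBatchSparsePCA, which its documentation attributes to this line of work); a hedged usage example with synthetic data and illustrative parameters:

```python
import numpy as np
from sklearn.decomposition import MiniBatchSparsePCA

X = np.random.default_rng(0).standard_normal((5000, 64))  # synthetic data
spca = MiniBatchSparsePCA(n_components=16, alpha=1.0, batch_size=32)
codes = spca.fit_transform(X)   # loadings, shape (5000, 16)
atoms = spca.components_        # sparse components (atoms d_j), (16, 64)
```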

Extension to NMF and sparse PCA

Faces: Extended Yale Database B

(a) PCA (b) NNMF (c) DL
(d) SPCA, τ = 70% (e) SPCA, τ = 30% (f) SPCA, τ = 10%

Extension to NMF and sparse PCA

Natural Patches

(a) PCA (b) NNMF (c) DL
(d) SPCA, τ = 70% (e) SPCA, τ = 30% (f) SPCA, τ = 10%

Conclusion

Take-home message

- Online techniques are well suited to the dictionary learning problem.
- Our method makes some large-scale image processing tasks tractable...
- ...and extends to various matrix factorization problems.
