



LEARNING A LOW-RANK SHARED DICTIONARY FOR OBJECT CLASSIFICATION
Tiep Vu, Vishal Monga

School of Electrical Engineering and Computer Science, The Pennsylvania State University, USA

Motivation

Different objects typically share common patterns (the brown bases in D).

⇒ The sparse code matrix is not block diagonal.

[Y_1 … Y_c … Y_C] ≈ [D_1 … D_c … D_C] × X, with diagonal code blocks X_1^1, …, X_c^c, …, X_C^C.

Main Contributions

▶ A new low-rank shared dictionary learning (LRSDL) framework for extracting discriminative and shared patterns.

▶ New accurate and efficient algorithms for selected existing and proposed dictionary learning methods.

▶ We derive the computational complexity of numerous dictionary learning methods.

▶ Numerous sparse coding and dictionary learning algorithms in the manuscript are reproducible via a user-friendly toolbox.

LRSDL - Idea visualization

In real problems, different classes share some common features (represented by D_0).

We model the shared dictionary D_0 as a low-rank matrix.
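The low-rank preference on D_0 is enforced through a nuclear-norm penalty η‖D_0‖_* in the cost function below, whose proximal operator is singular value thresholding. A minimal NumPy sketch of that operator (the function name is ours, not the poster's):

```python
import numpy as np

def svt(D0, eta):
    """Singular value thresholding: the proximal operator of eta * ||.||_*.
    Shrinks every singular value of D0 by eta, driving the smallest ones
    to zero and hence lowering the rank of the shared dictionary."""
    U, s, Vt = np.linalg.svd(D0, full_matrices=False)
    return U @ np.diag(np.maximum(s - eta, 0.0)) @ Vt
```

Applied inside an alternating-minimization loop, this step is what keeps D_0 low rank while the other terms fit the data.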

[Y_1 … Y_c … Y_C] ≈ [D_1 … D_c … D_C  D_0] × X̄, where X̄ stacks the class-specific codes X = [X_1 … X_C] (with blocks X_i^j) on top of the shared codes X^0 = [X_1^0 … X_C^0].

With no constraint, each class decomposes as Y_c ≈ D_0 X_c^0 + D_1 X_c^1 + … + D_c X_c^c + … + D_C X_c^C; LRSDL additionally constrains the parts. Let Ȳ_c = Y_c − D_0 X_c^0 denote the data with the shared part removed.

Fidelity goals:
‖Y_c − D_c X_c^c − D_0 X_c^0‖_F^2 small;
‖D_i X_c^i‖_F^2 small for i ≠ c.

Discrimination goals (m_i is the mean code of class i, m the overall mean, m_0 the mean of X^0; M_i, M, and M_0 are the corresponding mean matrices):
‖X_c − M_c‖_F^2 (intra-class) small;
‖M_c − M‖_F^2 (inter-class) large;
‖X^0 − M_0‖_F^2 small.

LRSDL cost function and efficient algorithms

With \bar{D} = [D\ D_0] and \bar{X} = [X;\ X^0]:

f(D, D_0, X, X^0) = f_1(\bar{D}, \bar{X}) + \lambda_1 \|X\|_1 + \lambda_2 f_2(\bar{D}, \bar{X}) + \lambda_1 \|X^0\|_1 + \eta \|D_0\|_*

f_1(\bar{D}, \bar{X}) = \|Y - D_0 X^0 - DX\|_F^2 + \sum_{i=1}^{C} \Big( \|Y_i - D_0 X_i^0 - D_i X_i^i\|_F^2 + \sum_{j \neq i} \|D_j X_i^j\|_F^2 \Big)

f_2(\bar{D}, \bar{X}) = \sum_{i=1}^{C} \big( \|X_i - M_i\|_F^2 - \|M_i - M\|_F^2 \big) + \|X\|_F^2 + \|X^0 - M_0\|_F^2

Without the terms involving D_0 and X^0 (shown in red on the poster), LRSDL reduces to FDDL (M. Yang, ICCV, 2011; IJCV, 2014).
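To make the objective concrete, here is a didactic NumPy evaluation of the cost above on class-partitioned data (function and variable names are ours; this is a sketch, not the authors' implementation):

```python
import numpy as np

def lrsdl_cost(Y_list, D_list, D0, X_blocks, X0_blocks, lam1, lam2, eta):
    """Evaluate the LRSDL cost.
    Y_list[i]      : data of class i, shape (d, n_i)
    D_list[j]      : dictionary of class j, shape (d, k_j)
    D0             : shared dictionary, shape (d, k0)
    X_blocks[i][j] : code X_i^j of class i on dictionary j, shape (k_j, n_i)
    X0_blocks[i]   : shared code X_i^0 of class i, shape (k0, n_i)
    """
    C = len(Y_list)
    D = np.hstack(D_list)
    Xi_list = [np.vstack(X_blocks[i]) for i in range(C)]  # full code of class i
    X = np.hstack(Xi_list)
    X0 = np.hstack(X0_blocks)
    Y = np.hstack(Y_list)

    # f1: global fit + per-class fit + suppression of cross-class codes
    f1 = np.linalg.norm(Y - D0 @ X0 - D @ X) ** 2
    for i in range(C):
        f1 += np.linalg.norm(Y_list[i] - D0 @ X0_blocks[i]
                             - D_list[i] @ X_blocks[i][i]) ** 2
        f1 += sum(np.linalg.norm(D_list[j] @ X_blocks[i][j]) ** 2
                  for j in range(C) if j != i)

    # f2: Fisher-type discrimination on the codes
    m = X.mean(axis=1, keepdims=True)    # overall mean of X
    m0 = X0.mean(axis=1, keepdims=True)  # mean of the shared codes
    f2 = np.linalg.norm(X) ** 2 + np.linalg.norm(X0 - m0) ** 2
    for i in range(C):
        Xi = Xi_list[i]
        mi = Xi.mean(axis=1, keepdims=True)
        f2 += np.linalg.norm(Xi - mi) ** 2 - Xi.shape[1] * np.linalg.norm(mi - m) ** 2

    nuclear = np.linalg.svd(D0, compute_uv=False).sum()  # ||D0||_*
    return (f1 + lam1 * (np.abs(X).sum() + np.abs(X0).sum())
            + lam2 * f2 + eta * nuclear)
```

Note that ‖M_i − M‖_F^2 replicates the mean vectors across the n_i columns of class i, which is why the mean-difference term is scaled by the class size.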

LRSDL – cost function

Definition: for a block matrix A with blocks A_{ij},

M: A = \begin{bmatrix} A_{11} & \dots & A_{1C} \\ A_{21} & \dots & A_{2C} \\ \vdots & & \vdots \\ A_{C1} & \dots & A_{CC} \end{bmatrix} \mapsto M(A) = \begin{bmatrix} A_{11} & \dots & 0 \\ 0 & \dots & 0 \\ \vdots & & \vdots \\ 0 & \dots & A_{CC} \end{bmatrix}

⇒ the function M(·) requires a low computational cost.
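As a concrete illustration, M(·) amounts to copying the diagonal blocks and zeroing the rest, so its cost is linear in the kept entries. A didactic sketch (function and argument names are ours):

```python
import numpy as np

def M(A, row_sizes, col_sizes):
    """Keep only the diagonal blocks of A; zero out everything else.

    A is partitioned into C x C blocks; row_sizes[i] and col_sizes[i]
    give the shape of the i-th diagonal block A_ii.
    """
    out = np.zeros_like(A)
    r = c = 0
    for rs, cs in zip(row_sizes, col_sizes):
        out[r:r + rs, c:c + cs] = A[r:r + rs, c:c + cs]
        r += rs
        c += cs
    return out
```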

Lemma 1: Efficient FDDL update of D using ODL (J. Mairal, JMLR, 2010):

D = \arg\min_D \; -2\,\mathrm{trace}(E D^T) + \mathrm{trace}(F D^T D), \quad \text{where } E = Y M(X)^T,\ F = M(X X^T).
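Lemma 1 reduces the D-update to a quadratic in D, which ODL solves by cycling over the dictionary columns under the usual unit-norm constraint. A sketch of that column-wise update (names are ours, not the authors' code):

```python
import numpy as np

def odl_update(D, E, F, n_iter=20, eps=1e-10):
    """Block coordinate descent for
        D = argmin_D -2 trace(E D^T) + trace(F D^T D),  s.t. ||d_j||_2 <= 1,
    the ODL-style update of Lemma 1 (E = Y M(X)^T, F = M(X X^T))."""
    D = D.copy()
    for _ in range(n_iter):
        for j in range(D.shape[1]):
            if F[j, j] < eps:
                continue  # atom j unused; leave it as is
            # exact minimizer in column j, then projection onto the unit ball
            u = D[:, j] + (E[:, j] - D @ F[:, j]) / F[j, j]
            D[:, j] = u / max(1.0, np.linalg.norm(u))
    return D
```

Because the quadratic in each column is isotropic, projecting the unconstrained minimizer onto the unit ball gives the exact constrained column update, so the cost never increases.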

Lemma 2: Efficient FDDL update of X using FISTA (A. Beck, SIAM J. Imaging Sci., 2009):

\frac{\partial \tfrac{1}{2} f_1}{\partial X} = M(D^T D)\,X - M(D^T Y), \qquad \frac{\partial \tfrac{1}{2} f_2}{\partial X} = 2X + M - 2\,[M_1\ M_2\ \dots\ M_C].
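Lemma 2 plugs these gradients into FISTA together with entrywise soft thresholding for the ℓ1 term. A generic sketch of that solver, assuming a gradient oracle `grad` for the smooth part and a Lipschitz constant `L` (function names and the lasso usage in the test are ours):

```python
import numpy as np

def soft_threshold(V, tau):
    """Proximal operator of tau * ||.||_1 (entrywise soft thresholding)."""
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)

def fista(grad, X0, L, lam, n_iter=100):
    """FISTA for min_X g(X) + lam * ||X||_1, where grad(X) is the gradient
    of the smooth part g and L is a Lipschitz constant of that gradient.
    Lemma 2 supplies such gradients for the FDDL/LRSDL X-update."""
    X = X0.copy()
    Z = X0.copy()  # extrapolated point
    t = 1.0
    for _ in range(n_iter):
        X_next = soft_threshold(Z - grad(Z) / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Z = X_next + ((t - 1.0) / t_next) * (X_next - X)
        X, t = X_next, t_next
    return X
```

The momentum sequence t gives FISTA its O(1/k^2) convergence rate, versus O(1/k) for plain proximal gradient descent.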

Convergence rate comparison (cost and running time)

Figure: Original vs. Proposed Efficient Algorithms – convergence rate comparisons. Panels (left to right): FDDL (LRSDL), DLSI (I. Ramirez, CVPR, 2010), and COPAR (S. Kong, ECCV, 2012); the top row plots cost vs. iteration, the bottom row plots running time (s) vs. iteration.

Table: Complexity analysis for different dictionary learning methods

Method    Complexity                                             Plugging numbers
O-DLSI    Ck(kd + dn + qkn) + Cqkd^3                             6.25 × 10^12
E-DLSI    Ck(kd + dn + qkn) + Cd^3 + Cqdk(qk + d)                3.75 × 10^10
O-FDDL    C^2dk(n + Ck + Cn) + Ck^2q(d + C^2n)                   2.51 × 10^11
E-FDDL    C^2k((q + 1)k(d + Cn) + 2dn)                           1.29 × 10^11
O-COPAR   C^3k^2(2d + Ck + qn) + Cqkd^3                          6.55 × 10^12
E-COPAR   C^3k^2(2d + Ck + qn) + Cd^3 + Cqdk(qk + d)             3.38 × 10^11
LRSDL     C^2k((q + 1)k(d + Cn) + 2dn) + C^2dkn + (q + q^2)dk^2  1.3 × 10^11

Simulated data

Basic elements: four class-specific bases (classes 1–4) plus a shared component, with samples drawn from them. Bases learned by each method, with classification accuracy:

- DLSI bases: accuracy 95.15%.
- LCKSVD1¹ bases: accuracy 45.15%.
- LCKSVD2¹ bases: accuracy 48.15%.
- FDDL bases: accuracy 97.25%.
- COPAR bases (shared component recovered): accuracy 99.25%.
- LRSDL bases (shared component recovered): accuracy 100%.

¹ Z. Jiang, TPAMI, 2013

Datasets

a) Extended YaleB; b) AR face; c) AR gender (males, females); d) Oxford Flower (bluebell, fritillary, sunflower, daisy, dandelion); e) Caltech 101 (laptop, chair, motorbike, dragonfly, airplane).

Effect of the shared dictionary

Plot: overall accuracy (%) vs. size of the shared dictionary (k_0), comparing COPAR with LRSDL at η = 0, 0.01, and 0.1. Dependence of overall accuracy on the shared dictionary (AR gender dataset).

Overall accuracy (%) vs. # training samples per class

Plots on YaleB, AR face, AR gender, Oxford Flower, and Caltech 101, comparing SRC², LCKSVD1, LCKSVD2, DLSI, FDDL, D²L²R², COPAR, and LRSDL.

² J. Wright, TPAMI, 2009

Research was supported by an Office of Naval Research Grant no. N00014-15-1-2042. Email: [email protected] Website: http://signal.ee.psu.edu