sparse dictionary-based representation and recognition of...

Sparse Dictionary-based

Representation and Recognition

of Action Attributes

Qiang Qiu, Zhuolin Jiang, Rama Chellappa

Center for Automation Research,

Institute for Advanced Computer Studies

University of Maryland, College Park

[email protected], {zhuolin, rama}@umiacs.umd.edu

Action Feature Representation

2

Shape

Motion

HOG

Action Sparse Representation

3

=

0 0 0.64 0.53 -0.40 0.35 0 0 0

0.43 0.63 0 0 -0.33 0 -0.36 0 0

= 0.43 × + 0.63 × - 0.33 × - 0.36 ×

= 0.64× + 0.53× - 0.40 × +0.35 ×

Action Dictionary

Sparse code

K-SVD

4

=

Y

y2 d1 d2 d3 …

0 0 0.64 0.53 -0.40 0.35 0 0 0

0.43 0.63 0 0 -0.33 0 -0.36 0 0

x1 x2

y1

D

X

K-SVD [1]

Input: signals Y, dictionary size, sparisty T

Output: dictionary D, sparse codes X

arg min |Y- DX|2 s.t. i , |xi|0 ≤ T D,X

Input signals Dictionary

Sparse codes

[1] M. Aharon and M. Elad and A. Bruckstein, K-SVD: An Algorithm for Designing

Overcomplete Dictionries for Sparse Representation, IEEE Trans. on Signal Process, 2006

5

Compact Discriminative and

Dictionary.

Learn a

Objective

Probabilistic Model for Sparse

Representation

A Gaussian Process

Dictionary Class Distribution

6

7

= 0.43 0.63 0 0 -0.33 0 -0.36 0 0 0 0 0 0

0 0 0.64 0.53 -0.40 0.35 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 -0.28 0.698 0.37 0.25 0

0 0 0 0 -0.42 0 0 0 0 0.42 0.47 0 0.32

y2 y1 y4 y3

l1 l1 l2 l2

d1 d2 d3 …

x1 x2 x3 x4

xd1

l1 l1 l2 l2

More Views of Sparse Representation

8

y2 y1 y4 y3

l1 l1 l2 l2

xd1 0.43 0 0 0

x1 x2 x3 x4

0.63 0 0 0

0 0.64 0 0

0 0.53 0 0

-0.33 -0.40 0 -0.42

… …

xd2

xd3

xd4

xd5

l1 l1 l2 l2

d1

d2

d3

d4

d5

A Gaussian Process

• Covariance function entry: K(i,j) = cov(xdi, xdj)

• P(Xd*|XD*) is a Gaussian with a closed-form conditional variance

A Gaussian Process

9

y2 y1 y4 y3

l1 l1 l2 l2

xd1 0.43 0 0 0

x1 x2 x3 x4

0.63 0 0 0

0 0.64 0 0

0 0.53 0 0

-0.33 -0.40 0 -0.42

… …

xd2

xd3

xd4

xd5

l1 l1 l2 l2

d1

d2

d3

d4

d5


• P(L|di), L [1, M]

• aggregate |xdi| based on class labels to obtain a M sized vector

• P(L=l1|d5) = (0.33+0.40)/(0.33+0.40+0.42) = 0.6348

• P(L=l2|d5) = (0+0.42)/(0.33+0.40+0.42) = 0.37


Dictionary Learning Approaches

Maximization of Joint Entropy (ME)

Maximization of Mutual Information (MMI)

Unsupervised Learning (MMI-1)

Supervised Learning (MMI-2)

10

11

Maximization of Joint Entropy (ME)

- Initialize dictionary using k-SVD

Do =

- Start with D* = - Untill |D*|=k, iteratively choose d* from Do\D*,

d* = arg max H(d|D*)

Where

-A good approximation to ME criteria

arg max H(D)

d

D

ME dictionary

12

Maximization of Mutual Information

for Unsupervised Learning (MMI-1)


Do =


d* = arg max H(d|D*) - H(d|Do\(D* d))

-A near-optimal approximation to MMI criteria

arg max I(D; Do\D)

Within (1-1/e) of the optimum

d

D

MMI dictionary

13


• P(L|di), L [1, M]

• aggregate|xdi|based on class labels to obtain a M sized vector

• P(l1|d5) = (0.33+0.40)/(0.33+0.40+0.42) = 0.6348

• P(l2|d5) = (0+0.42)/(0.33+0.40+0.42) = 0.37

• P(Ld) = P(L|d) • P(LD) = P(L|D) , where

y2 y1 y4 y3

l1 l1 l2 l2

xd1 0.43 0 0 0

x1 x2 x3 x4

0.63 0 0 0

0 0.64 0 0

0 0.53 0 0

-0.33 -0.40 0 -0.42

… …

xd2

xd3

xd4

xd5

l1 l1 l2 l2

d1

d2

d3

d4

d5

Revisit

14

Maximization of Mutual Information

for Supervised Learning (MMI-2)


Do =


d d* = arg max [H(d|D*) - H(d|Do\(D* d))]

+ λ[H(Ld|LD*) – H(Ld|LDo\(D* d))]

- MMI-1 is a special case of MMI-2 with λ=0.

Other learning methods

K-means

Liu-shah [1]

15

where

[1] J. Liu and M. Shah, Learning Human Actions via Information Maximization, CVPR 2008

Purity and Compactness

16

Representation Consistency

17

Keck gesture dataset

18

Recognition Accuracy

19

The recognition accuracy using initial dictionary Do: (a) 0.23 (b) 0.42 (c) 0.71

sparse dictionary-based representation and recognition of...

Documents