sparse dictionary-based representation and recognition of...
TRANSCRIPT
Sparse Dictionary-based
Representation and Recognition
of Action Attributes
Qiang Qiu, Zhuolin Jiang, Rama Chellappa
Center for Automation Research,
Institute for Advanced Computer Studies
University of Maryland, College Park
[email protected], {zhuolin, rama}@umiacs.umd.edu
Action Feature Representation
2
Shape
Motion
HOG
Action Sparse Representation
3
=
0 0 0.64 0.53 -0.40 0.35 0 0 0
0.43 0.63 0 0 -0.33 0 -0.36 0 0
= 0.43 × + 0.63 × - 0.33 × - 0.36 ×
= 0.64× + 0.53× - 0.40 × +0.35 ×
Action Dictionary
Sparse code
K-SVD
4
=
Y
y2 d1 d2 d3 …
0 0 0.64 0.53 -0.40 0.35 0 0 0
0.43 0.63 0 0 -0.33 0 -0.36 0 0
x1 x2
y1
D
X
K-SVD [1]
Input: signals Y, dictionary size, sparisty T
Output: dictionary D, sparse codes X
arg min |Y- DX|2 s.t. i , |xi|0 ≤ T D,X
Input signals Dictionary
Sparse codes
[1] M. Aharon and M. Elad and A. Bruckstein, K-SVD: An Algorithm for Designing
Overcomplete Dictionries for Sparse Representation, IEEE Trans. on Signal Process, 2006
5
Compact Discriminative and
Dictionary.
Learn a
Objective
Probabilistic Model for Sparse
Representation
A Gaussian Process
Dictionary Class Distribution
6
7
= 0.43 0.63 0 0 -0.33 0 -0.36 0 0 0 0 0 0
0 0 0.64 0.53 -0.40 0.35 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 -0.28 0.698 0.37 0.25 0
0 0 0 0 -0.42 0 0 0 0 0.42 0.47 0 0.32
y2 y1 y4 y3
l1 l1 l2 l2
d1 d2 d3 …
x1 x2 x3 x4
xd1
l1 l1 l2 l2
More Views of Sparse Representation
8
y2 y1 y4 y3
l1 l1 l2 l2
xd1 0.43 0 0 0
x1 x2 x3 x4
0.63 0 0 0
0 0.64 0 0
0 0.53 0 0
-0.33 -0.40 0 -0.42
… …
xd2
xd3
xd4
xd5
l1 l1 l2 l2
d1
d2
d3
d4
d5
A Gaussian Process
• Covariance function entry: K(i,j) = cov(xdi, xdj)
• P(Xd*|XD*) is a Gaussian with a closed-form conditional variance
A Gaussian Process
9
y2 y1 y4 y3
l1 l1 l2 l2
xd1 0.43 0 0 0
x1 x2 x3 x4
0.63 0 0 0
0 0.64 0 0
0 0.53 0 0
-0.33 -0.40 0 -0.42
… …
xd2
xd3
xd4
xd5
l1 l1 l2 l2
d1
d2
d3
d4
d5
Dictionary Class Distribution
• P(L|di), L [1, M]
• aggregate |xdi| based on class labels to obtain a M sized vector
• P(L=l1|d5) = (0.33+0.40)/(0.33+0.40+0.42) = 0.6348
• P(L=l2|d5) = (0+0.42)/(0.33+0.40+0.42) = 0.37
Dictionary Class Distribution
Dictionary Learning Approaches
Maximization of Joint Entropy (ME)
Maximization of Mutual Information (MMI)
Unsupervised Learning (MMI-1)
Supervised Learning (MMI-2)
10
11
Maximization of Joint Entropy (ME)
- Initialize dictionary using k-SVD
Do =
- Start with D* = - Untill |D*|=k, iteratively choose d* from Do\D*,
d* = arg max H(d|D*)
Where
-A good approximation to ME criteria
arg max H(D)
d
D
ME dictionary
12
Maximization of Mutual Information
for Unsupervised Learning (MMI-1)
- Initialize dictionary using k-SVD
Do =
- Start with D* = - Untill |D*|=k, iteratively choose d* from Do\D*,
d* = arg max H(d|D*) - H(d|Do\(D* d))
-A near-optimal approximation to MMI criteria
arg max I(D; Do\D)
Within (1-1/e) of the optimum
d
D
MMI dictionary
13
Dictionary Class Distribution
• P(L|di), L [1, M]
• aggregate|xdi|based on class labels to obtain a M sized vector
• P(l1|d5) = (0.33+0.40)/(0.33+0.40+0.42) = 0.6348
• P(l2|d5) = (0+0.42)/(0.33+0.40+0.42) = 0.37
• P(Ld) = P(L|d) • P(LD) = P(L|D) , where
y2 y1 y4 y3
l1 l1 l2 l2
xd1 0.43 0 0 0
x1 x2 x3 x4
0.63 0 0 0
0 0.64 0 0
0 0.53 0 0
-0.33 -0.40 0 -0.42
… …
xd2
xd3
xd4
xd5
l1 l1 l2 l2
d1
d2
d3
d4
d5
Revisit
14
Maximization of Mutual Information
for Supervised Learning (MMI-2)
- Initialize dictionary using k-SVD
Do =
- Start with D* = - Untill |D*|=k, iteratively choose d* from Do\D*,
d d* = arg max [H(d|D*) - H(d|Do\(D* d))]
+ λ[H(Ld|LD*) – H(Ld|LDo\(D* d))]
- MMI-1 is a special case of MMI-2 with λ=0.
Other learning methods
K-means
Liu-shah [1]
15
where
[1] J. Liu and M. Shah, Learning Human Actions via Information Maximization, CVPR 2008
Purity and Compactness
16
Representation Consistency
17
Keck gesture dataset
18
Recognition Accuracy
19
The recognition accuracy using initial dictionary Do: (a) 0.23 (b) 0.42 (c) 0.71