lior wolf and noga levy the svm-minus similarity score for video face recognition 24.06.2013...

24
Lior Wolf and Noga Levy The SVM-minus Similarity Score for Video Face Recognition 24.06.2013 Makarand Tapaswi CVPR Reading Group @ VGG 1

Upload: natalie-heaphy

Post on 14-Dec-2015

220 views

Category:

Documents


1 download

TRANSCRIPT

1

Lior Wolf and Noga Levy

The SVM-minus Similarity Score for Video Face Recognition

24.06.2013Makarand Tapaswi

CVPR Reading Group @ VGG

2

Same / Not Same ?

3

One liner

“How similar is the face in one video sequence to the other, where the similarity is uncorrelated

with pose-induced similarity”

• illumination, expression, image quality, pose• classifier should– discriminate positive/negative AND– uncorrelate w.r.to additional feature set

5

Basic Notation

• : the background(negative) set

8

Matched Background Similaritysame person

B1

X1

B2X2

B

9

MBGSdifferent persons

B1

X1

B2

X2

B

10

MBGS

11

SVM-minus Classifier

• Inputs– Training set – Privileged info. – Labels + + + + + - - - - -

12

• = train

• = test signed dist. to hyperplane• Un-correlate with

SVM-minus classifier

13

SVM-minus Classifier (2)

• Split– into and – into and – necessary since classifiers are correlated

• Normalization– and each feature-dimension to 0 mean– and to mean 0, and

14

SVM– loss function

• Pearson correlation coefficient

• Convexity: ignore denom. and square num.

15

Reduce to standard SVM

Can be reduced to standard SVM in the dual form.

In the dual and signed by

16

• thus is positive definite• is p.d., Cholesky decomposition

Projection Matrix

• Cancel influence from pose+ve scoring poses need not be same person–ve scoring poses need not be different person

SVM-minus Similaritysame person

Pan angle

18

One-Side SVM-minus Similarity

19

SVM-minus Similarity

• Use one-side SVM-minus for online tasks

20

YouTube Faces DB

• DB from [36]• Video LFW

– 3,425 videos; 1,595 people– 2.15 videos / person– min-duration: 48 frames– Average-clip length: 181.3 frames

• Evaluation– 5000 pairs– 10 fold cross-validation– 250+, 250–– Person exclusive splits (person appears only in one split)

21

Experimental info

• Detect face, expand bbox, align, resize 100x100

• Extract features– LBP– Center-Symm. LBP– Four-Patch LBP

• 3D head orientation () from face.com API*

22

MBGS Results from [36]Lior Wolf, Tal Hassner and Itay Maoz. Face Recognition in Unconstrained Videos with

Matched Background Similarity. CVPR 2011.

23

This paper results

Results where SVM– did most better than MBGS

24

Results

• MBGS > SVM– at Accuracy• but, MBGS + SVM– wins • Combination done by stacking– learning yet another SVM for the 2D scores

25

Is it really useful?

• Combined score “statistically significant” for [FP]LBP• Use entire background set, AUC: 83.6% to 79.9%• Online applications (one-side), AUC: 83.6% to 81.9%• Correlations:– Within method higher, different scores– Across methods, highest for same feature (as expected)

26

Conclusion

• SVM– : unlearn using additional features• MBGS : be choosy about the negative set• 3D Pose : a good “privileged” information source

• They don’t talk about pose estimation accuracy• Different types of privileged info that might work• Metric learning (and relatives) not compared

Thank You!

27

Some more results from other sourcesYouTube Faces DB

Ref Method Accuracy ± SE AUC EER[1] MBGS L2 mean, LBP 76.4 ± 1.8 82.6 25.3[2] MBGS+SVM- 78.9 ±1.9 86.9 21.2

[3] APEM-FUSION 79.1 ±1.5 86.6 21.4

[4] STFRD+PMML 79.5 ±2.5 88.6 19.9

References:[1] Lior Wolf, Tal Hassner and Itay Maoz. Face Recognition in Unconstrained Videos with Matched Background Similarity. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2011.[2] Lior Wolf and Noga Levy. The SVM-minus Similarity Score for Video Face Recognition. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013.[3] Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang. Probabilistic Elastic Matching for Pose Variant Face Verification. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013.[4] Zhen Cui, Wen Li, Dong Xu, Shiguang Shan and Xilin Chen. Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013.