1
Lior Wolf and Noga Levy
The SVM-minus Similarity Score for Video Face Recognition
24.06.2013
Makarand Tapaswi
CVPR Reading Group @ VGG
3
One liner
“How similar is the face in one video sequence to the other, where the similarity is uncorrelated
with pose-induced similarity”
• Nuisance factors: illumination, expression, image quality, pose
• The classifier should
– discriminate positive/negative pairs, AND
– be uncorrelated with the additional feature set
13
SVM-minus Classifier (2)
• Split the data into two disjoint subsets, one for training each classifier
– necessary since the two classifiers are correlated
• Normalization
– each feature dimension is normalized to zero mean
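A minimal sketch of the decorrelation idea (toy data, hypothetical variable names; the paper solves a joint optimization, while this is a simplified post-hoc version): train one SVM on appearance features and a second on pose features, using disjoint halves of the data, then subtract the pose-predictable part of the main score.

```python
import numpy as np
from sklearn.svm import SVC

# Toy stand-ins (hypothetical data): appearance features X, pose features P,
# and same/not-same labels y.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 50))
P = rng.normal(size=(n, 3))            # e.g. head-orientation angles
y = rng.integers(0, 2, size=n)

# Normalize each feature dimension to zero mean, as on the slide.
X = X - X.mean(axis=0)
P = P - P.mean(axis=0)

# Train the two classifiers on disjoint halves of the data;
# training them on the same samples would correlate their scores by construction.
half = n // 2
main = SVC(kernel="linear").fit(X[:half], y[:half])
pose = SVC(kernel="linear").fit(P[half:], y[half:])

# SVM-minus-style score: the main margin minus its pose-predictable component
# (least-squares projection), so the residual is orthogonal to the pose score.
s_main = main.decision_function(X)
s_pose = pose.decision_function(P)
alpha = np.dot(s_main, s_pose) / np.dot(s_pose, s_pose)
score = s_main - alpha * s_pose
```

By construction `score` has zero inner product with `s_pose`, mirroring the goal of a similarity that is uncorrelated with pose-induced similarity.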
15
Reduce to standard SVM
The SVM-minus problem can be reduced to a standard SVM in its dual form.
• Cancels the influence of pose
– pairs whose poses score positively need not be the same person
– pairs whose poses score negatively need not be different people
[Figure: SVM-minus similarity vs. pan angle for same-person pairs]
20
YouTube Faces DB
• DB from [36]
• A video analogue of LFW
– 3,425 videos; 1,595 people
– 2.15 videos / person on average
– minimum duration: 48 frames
– average clip length: 181.3 frames
• Evaluation
– 5,000 pairs
– 10-fold cross-validation
– 250 positive, 250 negative pairs per fold
– person-exclusive splits (each person appears in only one split)
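The person-exclusive protocol can be illustrated with scikit-learn's GroupKFold (toy data; for simplicity each pair is tagged with a single hypothetical person id, whereas the benchmark constrains both identities in a pair):

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical pair data: each pair tagged with one person id.
rng = np.random.default_rng(0)
n_pairs = 50
person = rng.integers(0, 10, size=n_pairs)   # person id per pair
X = rng.normal(size=(n_pairs, 4))
y = rng.integers(0, 2, size=n_pairs)

# Person-exclusive folds: GroupKFold keeps each group (person)
# entirely on one side of every train/test split.
splits = list(GroupKFold(n_splits=5).split(X, y, groups=person))
for train_idx, test_idx in splits:
    assert set(person[train_idx]).isdisjoint(person[test_idx])
```

This matters because a person appearing in both train and test folds would let the classifier exploit identity-specific cues and inflate accuracy.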
21
Experimental info
• Detect the face, expand the bounding box, align, resize to 100×100
• Extract features
– LBP
– Center-Symmetric LBP (CSLBP)
– Four-Patch LBP (FPLBP)
• 3D head orientation from the face.com API*
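For intuition on the descriptors above, a basic 8-neighbour LBP can be sketched as follows (a simplified illustration; the paper uses the CSLBP and FPLBP variants, which differ in the comparisons made):

```python
import numpy as np

def lbp_codes(img):
    """Basic LBP: threshold each 3x3 neighbourhood at its centre pixel,
    packing the 8 comparison results into one byte per pixel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # clockwise neighbour offsets, starting at the top-left
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    c = img[1:-1, 1:-1]                      # centre pixels
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (nb >= c).astype(np.uint8) << bit
    return out

# A 5x5 intensity ramp yields a 3x3 map of codes.
codes = lbp_codes(np.arange(25, dtype=np.float32).reshape(5, 5))
```

A histogram of these codes over image blocks then serves as the face descriptor.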
22
MBGS Results from [36]
Lior Wolf, Tal Hassner and Itay Maoz. Face Recognition in Unconstrained Videos with Matched Background Similarity. CVPR 2011.
24
Results
• MBGS > SVM– in accuracy
• but MBGS + SVM– wins overall
• Combination done by stacking
– learning yet another SVM on the 2D score vectors
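The stacking step can be sketched as follows (made-up scores and a hypothetical train/test split; the slide only says a second SVM is learned on the 2D score vectors):

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical per-pair scores from the two methods, plus same/not-same labels.
rng = np.random.default_rng(0)
n = 300
s_mbgs = rng.normal(size=n)
s_svmminus = rng.normal(size=n)
y = rng.integers(0, 2, size=n)

# Stacking: the two scalar scores form a 2D feature vector for a second SVM.
Z = np.column_stack([s_mbgs, s_svmminus])
stacker = SVC(kernel="linear").fit(Z[:200], y[:200])
combined = stacker.decision_function(Z[200:])   # final similarity scores
```

The stacker simply learns how to weight the two scores against each other, which is why the combination can win even when one method dominates the other on its own.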
25
Is it really useful?
• The combined score is a “statistically significant” improvement for LBP and FPLBP
• Using the entire background set: AUC drops from 83.6% to 79.9%
• Online applications (one-sided): AUC drops from 83.6% to 81.9%
• Correlations:
– within a method, higher between different scores
– across methods, highest for the same feature (as expected)
26
Conclusion
• SVM-minus: unlearn using additional features
• MBGS: be choosy about the negative set
• 3D pose: a good source of “privileged” information
• They don’t discuss pose-estimation accuracy
• Other types of privileged information might also work
• Metric learning (and relatives) not compared against
Thank You!
27
Some more results from other sources
YouTube Faces DB
Ref  Method               Accuracy ± SE   AUC    EER
[1]  MBGS L2 mean, LBP    76.4 ± 1.8      82.6   25.3
[2]  MBGS+SVM-            78.9 ± 1.9      86.9   21.2
[3]  APEM-FUSION          79.1 ± 1.5      86.6   21.4
[4]  STFRD+PMML           79.5 ± 2.5      88.6   19.9
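The AUC and EER columns can be computed from raw pair scores; a small sketch with made-up scores and labels (sklearn):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical similarity scores and same/not-same labels.
y = np.array([1, 1, 1, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.3, 0.7, 0.2, 0.1])

# AUC: probability a positive pair outscores a negative pair.
auc = roc_auc_score(y, scores)

# EER: the operating point where the false-accept rate
# equals the false-reject rate on the ROC curve.
fpr, tpr, _ = roc_curve(y, scores)
fnr = 1 - tpr
eer = fpr[np.argmin(np.abs(fpr - fnr))]
```

Reporting EER alongside AUC fixes a single threshold-free operating point, which makes methods with differently scaled scores directly comparable.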
References:
[1] Lior Wolf, Tal Hassner and Itay Maoz. Face Recognition in Unconstrained Videos with Matched Background Similarity. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2011.
[2] Lior Wolf and Noga Levy. The SVM-minus Similarity Score for Video Face Recognition. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013.
[3] Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang. Probabilistic Elastic Matching for Pose Variant Face Verification. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013.
[4] Zhen Cui, Wen Li, Dong Xu, Shiguang Shan and Xilin Chen. Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013.