
Unconstrained Face Recognition: Deep Learning Approaches

Chun-Ting Huang


2016/7/22, USC Multimedia Communication Lab


http://www.nytimes.com/2015/08/13/us/facial-recognition-software-moves-from-overseas-wars-to-local-police.html?_r=0


Why Face?

▪ Among biometric traits, facial features scored the highest compatibility in a Machine Readable Travel Documents (MRTD) system


Hietmeyer, R.: Biometric identification promises fast and secure processing of airline passengers. ICAO J. 55(9), 10–11 (2000)


Outline

▪ Introduction

▪ Unconstrained face dataset

▪ Unconstrained face recognition with deep learning

▪ Papers from industry

▪ Papers from academia

▪ Discussion and conclusion


Introduction


Categorization

▪ A face recognition system operates in two modes

▪ Face verification (authentication)

▪ Face identification (recognition)

▪ Face verification

▪ One-to-one match

▪ A query face image is compared against a single enrollment face image

▪ Face identification

▪ One-to-many match

▪ A query face is compared against multiple faces in the enrollment database
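The two modes above can be sketched in a few lines. This is a minimal illustration, not a method from the talk: it assumes faces are already represented as feature vectors, uses cosine similarity as the matching score, and the threshold of 0.8 and the toy vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(query, enrolled, threshold=0.8):
    """One-to-one match: accept the claimed identity if the score clears the threshold."""
    return cosine_similarity(query, enrolled) >= threshold

def identify(query, gallery):
    """One-to-many match: return the gallery identity with the highest score."""
    return max(gallery, key=lambda name: cosine_similarity(query, gallery[name]))

# Toy 3-dimensional features (hypothetical); real systems use learned descriptors.
gallery = {"alice": [0.9, 0.1, 0.0], "bob": [0.0, 1.0, 0.2]}
query = [0.8, 0.2, 0.1]
is_alice = verify(query, gallery["alice"])
best_match = identify(query, gallery)
```

Verification needs a threshold because it answers a yes/no question, while identification as sketched here only ranks the gallery.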


Face Recognition Processing Flow


Jain, Anil K., and Stan Z. Li. Handbook of Face Recognition. Vol. 1. New York: Springer, 2011


Face Subspace


Jain, Anil K., and Stan Z. Li. Handbook of Face Recognition. Vol. 1. New York: Springer, 2011


Frontal Face Recognition


Conventional Approaches


▪ Template matching

▪ PCA: M. Turk, A. Pentland, Eigenfaces for Recognition, Journal of Cognitive Neuroscience, Vol. 3, No. 1, Win. 1991

▪ LDA: Kamran Etemad and Rama Chellappa, "Discriminant analysis for recognition of human face images", JOSA A, 1997

▪ HOG: Dalal, Navneet, and Bill Triggs. "Histograms of oriented gradients for human detection." Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. Vol. 1. IEEE, 2005

▪ LBP: Ahonen, Timo and Hadid, Abdenour and Pietikainen, Matti, “Face description with local binary patterns: Application to face recognition”, Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2006


Frontal is NOT Enough


Facial Landmark Localization

▪ Model based approach

▪ ASM: T.F. Cootes and C.J. Taylor and D.H. Cooper and J. Graham (1995). "Active shape models - their training and application". Computer Vision and Image Understanding (61): 38–59

▪ AAM: T.F. Cootes, G. J. Edwards, and C. J. Taylor. Active appearance models. ECCV, 2:484–498, 1998

▪ Regression based approach

▪ Cascaded pose regression: P. Dollár, P. Welinder, and P. Perona. "Cascaded pose regression". In CVPR. IEEE, 2010

▪ Explicit shape regression: X. Cao, Y. Wei, F. Wen, and J. Sun. "Face alignment by explicit shape regression". In CVPR. IEEE, 2012


Explicit Shape Regression


[Figure: shape estimates on image 𝐼 at stages t = 0, t = 1, t = 2, …, t = 10; the shape is initialized from the face detector, with an affine transform applied and then transformed back]


Unconstrained Face Dataset


Labeled Faces in the Wild

▪ Contains 13,233 images

▪ Covers 5,749 people

▪ 1,680 people have two or more images

▪ Proposed in ICCV 2007

▪ Photos were collected from the internet

▪ Aligned faces are also provided, using three types of alignment methods


Gary B. Huang, Manu Ramesh, Tamara Berg, and Erik Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. University of Massachusetts, Amherst, Technical Report 07-49, October, 2007.


LFW: Performance (Image-Restricted)


LFW: Performance (Image-Unrestricted)


YouTube Faces Database

▪ Lior Wolf, Tal Hassner and Itay Maoz, Face Recognition in Unconstrained Videos with Matched Background Similarity. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2011

▪ 3,425 videos of 1,595 people


YTF: Performance (Image-Restricted)


YTF: Performance (Image-Restricted)

▪ EER - the error rate at the ROC operating point where the false positive and false negative rates are equal
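A minimal sketch of how an EER could be estimated from match scores. It assumes higher scores mean "more likely the same person" and approximates the crossing point by sweeping thresholds over the observed scores; the score lists are made-up toy values.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping thresholds over all observed scores.

    genuine_scores: similarity scores of same-identity pairs.
    impostor_scores: similarity scores of different-identity pairs.
    A pair is accepted when its score is >= threshold.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        # False negative rate: genuine pairs rejected at this threshold.
        fnr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        # False positive rate: impostor pairs accepted at this threshold.
        fpr = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (fpr + fnr) / 2
    return eer

genuine = [0.9, 0.8, 0.7, 0.4]    # hypothetical same-identity scores
impostor = [0.6, 0.3, 0.2, 0.1]   # hypothetical different-identity scores
eer = equal_error_rate(genuine, impostor)
```

On this toy data both error rates equal 0.25 at a threshold of 0.6, so a table reporting 100%-EER would list 75.0.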


Parkhi, Omkar M., Andrea Vedaldi, and Andrew Zisserman. "Deep face recognition." Proceedings of the British Machine Vision Conference (BMVC), 2015.


IARPA Janus Benchmark A (IJB-A)

▪ Klare et al. Pushing the Frontiers of Unconstrained Face Detection and Recognition: IARPA Janus Benchmark A, CVPR, June 2015

▪ All faces are labeled with manually annotated bounding boxes and fiducial landmarks

▪ Amazon Mechanical Turk (AMT)

▪ LFW is not fully unconstrained:

▪ A commodity face detector was used to detect all faces

▪ This restricts the pose variation, occlusion, and illumination conditions it covers

▪ Three landmarks: two eyes, and base of nose

▪ Geographic distribution


IJB-A Labeled Information

▪ 10-fold gallery / probe image set

▪ 17,000 images for training (333 subjects)

▪ Gallery set: 3000 images (167 subjects)

▪ Probe set: 13,700 images (includes non-gallery subjects)

▪ (x, y) coordinates of the eyes and nose base

▪ Face yaw angle (if applicable)

▪ Observation labeling: FOREHEAD_VISIBLE, EYES_VISIBLE, NOSE_MOUTH_VISIBLE, INDOOR, GENDER, SKIN_TONE (6 levels), AGE (5 levels), FACIAL_HAIR


Pose Variant


IJB-A Released Benchmark (1/29/2016)


Unconstrained Face Recognition With Deep Learning


Facebook: DeepFace

▪ DeepFace: Closing the Gap to Human-Level Performance in Face Verification

▪ Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 1701-1708

▪ Claimed contributions

▪ Facial alignment with 3D modeling

▪ Advance LFW benchmark performance

▪ Reaching near-human performance

▪ Advance YTF benchmark performance


3D Facial Alignment

▪ Detected face provided with 6 initial fiducial points

▪ 2D-aligned crop

▪ 67 fiducial points with their Delaunay triangulation

▪ 3D shape transform

▪ Triangle visibility w.r.t. the fitted 3D-2D camera

▪ Affine warping

▪ Final frontalized crop


DeepFace Architecture


DeepFace: Performance

▪ Results on Labeled Faces in the Wild (LFW) and YouTube Faces (YTF) databases


DeepID

▪ Sun, Yi, Xiaogang Wang, and Xiaoou Tang. "Deep learning face representation from predicting 10,000 classes." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014


60 patches


DeepID


DeepID Performance (1)


160-dimensional feature


DeepID Performance (2)


o: outside dataset

u: unrestricted protocol

r: restricted protocol


Google: FaceNet

▪ Schroff, Florian, Dmitry Kalenichenko, and James Philbin. "FaceNet: A unified embedding for face recognition and clustering." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015


FaceNet

▪ Objective - learning a Euclidean embedding per image with DNN

▪ Map the face images to a compact Euclidean space

▪ Distance in space = Face Similarity

▪ Approach – DNN with triplet loss


Triplet Loss

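▪ Embedding: $f(x) \in \mathbb{R}^d$

▪ Input images: $x_i^a$ (anchor), $x_i^p$ (positive), and $x_i^n$ (negative)

▪ $\alpha$ is a margin enforced between positive and negative pairs

▪ Corresponding loss function: $L = \sum_{i=1}^{N} \left[ \|f(x_i^a) - f(x_i^p)\|_2^2 - \|f(x_i^a) - f(x_i^n)\|_2^2 + \alpha \right]_+$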
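A minimal sketch of the triplet loss for a single triplet: the squared anchor-positive distance should be smaller than the squared anchor-negative distance by at least the margin, and any shortfall is penalized. The margin value 0.2 and the 2-D embeddings are assumptions chosen for illustration.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Hinge on d(anchor, positive) - d(anchor, negative) + alpha for one triplet."""
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + alpha)

# Toy 2-D embeddings (hypothetical values).
anchor, positive = [0.0, 1.0], [0.1, 0.9]
easy_negative, hard_negative = [1.0, 0.0], [0.2, 0.9]
loss_easy = triplet_loss(anchor, positive, easy_negative)   # already separated
loss_hard = triplet_loss(anchor, positive, hard_negative)   # violates the margin
```

The easy negative contributes zero loss and therefore no gradient, which is why the choice of triplets matters for convergence.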


Triplet Selection

▪ To achieve fast convergence of the triplet loss

▪ Select $x_i^p$ as $\arg\max_{x_i^p} \|f(x_i^a) - f(x_i^p)\|_2^2$ (hard positive)

▪ Select $x_i^n$ as $\arg\min_{x_i^n} \|f(x_i^a) - f(x_i^n)\|_2^2$ (hard negative)

▪ The training set is sampled with

▪ 40 faces per identity in each mini-batch as positive exemplars

▪ Randomly sampled negative faces are added

▪ To avoid converging to bad local minima

▪ $\|f(x_i^a) - f(x_i^p)\|_2^2 < \|f(x_i^a) - f(x_i^n)\|_2^2$ (semi-hard)
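A minimal sketch of semi-hard negative selection within one mini-batch: keep only negatives that are farther from the anchor than the positive but still inside the margin, so the loss stays non-zero. Picking the closest such candidate, the margin of 0.2, and the toy points are assumptions made for this illustration.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def pick_semi_hard_negative(anchor, positive, negatives, alpha=0.2):
    """Return the index of a semi-hard negative, or None if there is none.

    Semi-hard: farther from the anchor than the positive, but still inside
    the margin, so the triplet loss is non-zero.
    """
    d_pos = squared_distance(anchor, positive)
    candidates = [
        (squared_distance(anchor, n), i)
        for i, n in enumerate(negatives)
        if d_pos < squared_distance(anchor, n) < d_pos + alpha
    ]
    # Among semi-hard candidates, take the one closest to the anchor.
    return min(candidates)[1] if candidates else None

anchor, positive = [0.0, 0.0], [0.3, 0.0]   # squared distance 0.09
negatives = [[0.1, 0.0], [0.4, 0.0], [0.5, 0.0], [2.0, 0.0]]
chosen = pick_semi_hard_negative(anchor, positive, negatives)
```

Here the first negative is closer than the positive (a hard negative) and is skipped, the last one is already beyond the margin, and index 1 is selected.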


Deep Convolutional Networks

▪ CNN is trained using Stochastic Gradient Descent (SGD) with standard backpropagation

▪ Two types of architectures

▪ Zeiler&Fergus architecture

▪ GoogLeNet style Inception model

▪ Trained on a CPU cluster for 1,000 to 2,000 hours

▪ 100M-200M training face thumbnails covering 8M identities

▪ Input sizes range from 96x96 to 224x224 pixels


Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." Computer Vision–ECCV 2014. Springer International Publishing, 2014. 818-833.

Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.


Network Details


Performance

▪ Validation rate VAL (true accepts / same identity pairs) on 1M hold-out test set

▪ VAL as a function of the output (embedding) dimension
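A minimal sketch of the validation rate defined above, paired with the false accept rate that is usually fixed when VAL is reported. It assumes pairs are compared by embedding distance and accepted when the distance is at or below a threshold; the distance lists are toy values.

```python
def val_and_far(same_distances, diff_distances, threshold):
    """Validation rate and false accept rate at a distance threshold.

    same_distances: distances of same-identity pairs.
    diff_distances: distances of different-identity pairs.
    """
    true_accepts = sum(d <= threshold for d in same_distances)
    false_accepts = sum(d <= threshold for d in diff_distances)
    return true_accepts / len(same_distances), false_accepts / len(diff_distances)

same = [0.2, 0.5, 0.9, 1.4]   # hypothetical same-identity distances
diff = [0.8, 1.6, 1.9, 2.3]   # hypothetical different-identity distances
val, far = val_and_far(same, diff, threshold=1.0)
```

Lowering the threshold reduces false accepts at the cost of a lower VAL, so the two numbers are only meaningful together.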


Sensitivity to Image Quality


Deep Face Recognition

▪ Parkhi, Omkar M., Andrea Vedaldi, and Andrew Zisserman. "Deep face recognition." Proceedings of the British Machine Vision Conference (BMVC), 2015.

▪ Achieved similar performance on the LFW and YTF datasets

▪ With fewer training images and identities

▪ 2.6M images collected from Google images and Bing with keyword "actor"

▪ Same triplet loss strategy as FaceNet


Fine-tuned with VGG Model

▪ The “Very Deep” Architecture

▪ Different from previous architectures proposed

▪ Network Details:

▪ 3 x 3 Convolution Kernels (Very small)

▪ Conv. Stride 1 px.

▪ ReLU non-linearity

▪ No local contrast normalisation

▪ 3 Fully connected layers
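As a rough check on the network details above, this sketch tracks the spatial size of the feature maps through five convolution and max-pooling stages. The 224-pixel input, the padding of 1, and the 2x2 pooling with stride 2 are assumptions taken from common VGG configurations; only the 3x3 kernel and stride 1 come from the slide.

```python
def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial size after a convolution (padding of 1 is an assumption)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_output_size(size, window=2, stride=2):
    """Spatial size after max-pooling (2x2 window, stride 2 is an assumption)."""
    return (size - window) // stride + 1

size = 224            # assumed input resolution
sizes = [size]
for _ in range(5):    # five conv stages, each followed by max-pooling
    size = conv_output_size(size)   # 3x3, stride 1, padding 1 keeps the size
    size = pool_output_size(size)   # pooling halves it
    sizes.append(size)
```

Under these assumptions the small kernels never shrink the maps themselves; all downsampling comes from the pooling layers.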


[Figure: layer diagram from the image input through Conv-64, Conv-128, Conv-256, and Conv-512 blocks separated by max-pooling, followed by fc-4096, fc-4096, fc-2622, and Softmax]

Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).


Training

• MatConvNet Toolbox

• Nvidia CuDNN bindings

• Multi GPU Training (approx 3.5x speedup)

• Nvidia Titan Black

• 7 days of training

• Stochastic Gradient Descent with back prop.

• Accumulator Descent for large batch sizes

• Batch Size: 256

• Incremental FC layer training

• 2622 way multi class criterion (soft max)


Vedaldi, Andrea, and Karel Lenc. "MatConvNet: Convolutional neural networks for MATLAB." Proceedings of the 23rd Annual ACM Conference on Multimedia Conference. ACM, 2015.


Performance on LFW


No. | Method              | # Training Images | # Networks | Accuracy
1   | Fisher Vector Faces | -                 | -          | 93.10
2   | DeepFace            | 4 M               | 3          | 97.35
3   | DeepFace Fusion     | 500 M             | 5          | 98.37
4   | DeepID-2,3          | Full              | 200        | 99.47
5   | FaceNet             | 200 M             | 1          | 98.87
6   | FaceNet + Alignment | 200 M             | 1          | 99.63
7   | VGG Face            | 2.6 M             | 1          | 98.95


Performance on YTF


No. | Method                    | # Training Images | # Networks | 100%-EER | Accuracy
1   | Video Fisher Vector Faces | -                 | -          | 87.7     | 93.10
2   | DeepFace                  | 4 M               | 1          | 91.4     | 91.4
4   | DeepID-2,2+,3             |                   | 200        | -        | 93.2
5   | FaceNet + Alignment       | 200 M             | 1          | -        | 95.1
7   | VGG Face                  | 2.6 M             | 1          | 97.4     | 97.3


Lightened CNN

▪ Wu, Xiang, Ran He, and Zhenan Sun. "A Lightened CNN for Deep Face Representation." arXiv preprint arXiv:1511.02683 (2015).

▪ Obtained competitive performance with previous models

▪ Composed of two networks

▪ New activation function: Max-Feature-Map (MFM) to replace ReLU
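A minimal sketch of the Max-Feature-Map idea: the 2n input feature maps are split into two halves, map k is paired with map k + n, and the element-wise maximum is kept, so the output has n maps. Representing each map as a flat list of activations is a simplification made for this illustration.

```python
def max_feature_map(channels):
    """Apply MFM to a list of 2n feature maps, returning n maps.

    Each map is a flat list of activations; map k is paired with map k + n
    and the element-wise maximum is kept.
    """
    if len(channels) % 2 != 0:
        raise ValueError("MFM needs an even number of feature maps")
    n = len(channels) // 2
    return [
        [max(a, b) for a, b in zip(channels[k], channels[k + n])]
        for k in range(n)
    ]

# Four toy feature maps with three activations each (hypothetical values).
feature_maps = [[1.0, -2.0, 0.5], [0.0, 0.0, 0.0],
                [-1.0, 3.0, 0.5], [-4.0, 2.0, -1.0]]
reduced = max_feature_map(feature_maps)
```

Unlike ReLU, which thresholds each activation against zero, this operation compares two maps against each other and halves the channel count.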


Max-Feature-Map


Performance

▪ On LFW:

▪ On YTF:


Deep Learning Applications Other than Recognition


Incorrect Alignment


Liu, Ziwei, et al. "Deep learning face attributes in the wild." Proceedings of the IEEE International Conference on Computer Vision. 2015.


Deep Learning Face Attributes


Details of the Networks

▪ Applied AlexNet directly for LNet

▪ Pre-trained with ImageNet 1000 object categories

▪ Fine-tuning LNet using attribute tags


Face Localization Performance (LNet)


Face Localization Performance (LNet)


Face Attributes Visualization


Attribute Accuracy


Discussion and Conclusion


LFW Survey

▪ Labeled Faces in the Wild: A Survey: Erik Learned-Miller, Gary Huang, Aruni RoyChowdhury, Haoxiang Li, Gang Hua

▪ The future of face recognition

▪ Verification versus identification

▪ It is not uncommon for two random individuals to differ greatly in appearance

▪ The more people in a gallery, the greater the chance that two individuals have a similar appearance

▪ New face datasets

▪ IJB-A

▪ CASIA

▪ FaceScrub

▪ MegaFace
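The gallery-size point in the verification versus identification bullets can be made concrete with a toy calculation. It assumes, purely for illustration, that each gallery subject independently has the same small probability of being confusable with the probe; real look-alike rates are neither independent nor uniform.

```python
def lookalike_probability(p_pair, gallery_size):
    """Chance that at least one of gallery_size subjects resembles the probe,

    assuming each does so independently with probability p_pair.
    """
    return 1.0 - (1.0 - p_pair) ** gallery_size

p_pair = 0.001                                          # hypothetical per-pair rate
single_pair = lookalike_probability(p_pair, 1)          # verification-like case
large_gallery = lookalike_probability(p_pair, 1000)     # identification-like case
```

Under this assumption a confusion that is rare for one pair becomes more likely than not once the gallery holds a thousand people, which is why identification is the harder setting.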


Discussion

▪ Unconstrained face recognition is a competitive field

▪ Target dataset: IJB-A

▪ Testing different approaches (with source code / trained models)

▪ Working on checking the effectiveness of the Lightened CNN

▪ Facial attributes may serve an auxiliary purpose


Large-scale CelebFaces Attributes (CelebA) Dataset

▪ S. Yang, P. Luo, C. C. Loy, and X. Tang, "From Facial Parts Responses to Face Detection: A Deep Learning Approach", in IEEE International Conference on Computer Vision (ICCV), 2015

▪ 10,177 identities

▪ 202,599 face images

▪ 5 landmark locations and 40 binary attribute annotations per image

▪ Available for download

▪ 1.34 GB for 202,599 aligned and cropped face images

▪ Faces are aligned by a similarity transformation based on the two eye locations, then resized to 218*178

▪ 9.8 GB for 202,599 original web face images
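A minimal sketch of a similarity transformation determined by two eye locations, as mentioned for the aligned and cropped images. Points are treated as complex numbers so that one complex factor carries rotation and uniform scale. The eye detections and the target eye positions are hypothetical; the dataset's actual target coordinates are not given here.

```python
def similarity_from_eyes(src_left, src_right, dst_left, dst_right):
    """Return (a, b) such that z -> a * z + b maps the source eyes onto the targets.

    Points are (x, y) tuples treated as complex numbers x + iy, so the complex
    factor a encodes rotation and uniform scale, and b encodes translation.
    """
    s_l, s_r = complex(*src_left), complex(*src_right)
    d_l, d_r = complex(*dst_left), complex(*dst_right)
    a = (d_r - d_l) / (s_r - s_l)
    b = d_l - a * s_l
    return a, b

def apply_similarity(a, b, point):
    """Map an (x, y) point through the transform z -> a * z + b."""
    z = a * complex(*point) + b
    return (z.real, z.imag)

# Hypothetical eye detections in a web image and hypothetical target positions
# inside the aligned crop.
a, b = similarity_from_eyes((120, 200), (200, 240), (69, 111), (109, 111))
mapped_left = apply_similarity(a, b, (120, 200))
mapped_right = apply_similarity(a, b, (200, 240))
scale = abs(a)   # uniform scale factor applied to the face
```

Two point correspondences fix exactly the four degrees of freedom of a similarity transform (rotation, scale, and 2-D translation), which is why the eyes alone suffice.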


Large-scale CelebFaces Attributes (CelebA) Dataset


Deep Face Dreams


[Figure panels: Representative Image; Neuron Inversion]

Mahendran, Aravindh, and Andrea Vedaldi. "Understanding deep image representations by inverting them." Computer Vision and Pattern Recognition (CVPR), 2015


Questions?