Face Recognition and Deep Learning, by Dr. Sanparith Marukatat


TRANSCRIPT

Face Recognition & Deep Learning

sanparith.marukatat@nectec.or.th

Standard procedure

• Image capturing: camera, webcam, surveillance

• Face detection: locate faces in the image

• Face alignment: normalize size, rectify rotation

• Face matching

• 1:1 Face verification

• 1:N Face recognition

Viola-Jones Haar-like detector (OpenCV haarcascade_frontalface_alt2.xml)

(Example detections: face sizes ~35x35 to 80x80 pixels; typical misses: faces that are too small, occlusion, rotation. A detection sketch follows.)
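For reference, a minimal detection sketch using OpenCV's bundled cascade; the input filename is illustrative, and cv2.data.haarcascades assumes a recent opencv-python build:

    import cv2

    # Load the stock Viola-Jones frontal-face cascade shipped with OpenCV
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_alt2.xml"
    detector = cv2.CascadeClassifier(cascade_path)

    img = cv2.imread("group_photo.jpg")              # hypothetical input image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # minSize drops detections that are too small to recognize reliably
    faces = detector.detectMultiScale(gray, scaleFactor=1.1,
                                      minNeighbors=5, minSize=(35, 35))
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite("detected.jpg", img)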

Recognition = compare these faces to known faces

Controlled environment, face size 218x218 pixels

• Viola-Jones eye detector: eyes distance = 81 pixels, eyes angle = -0.7 degrees

• After alignment: face size = 180x200 pixels, eyes distance = 100 pixels, eyes angle = 0 degrees
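A minimal alignment sketch, assuming the two eye centers have already been found by the eye detector; it rotates and rescales so the eyes become horizontal with a fixed 100-pixel separation (the function name and parameters are illustrative, and a complete pipeline would also translate the eye midpoint to a canonical spot before cropping to 180x200):

    import cv2
    import numpy as np

    def align_face(img, left_eye, right_eye, target_dist=100, out_size=(180, 200)):
        (lx, ly), (rx, ry) = left_eye, right_eye
        angle = np.degrees(np.arctan2(ry - ly, rx - lx))   # e.g. -0.7 degrees
        dist = np.hypot(rx - lx, ry - ly)                  # e.g. 81 pixels
        scale = target_dist / dist                         # bring eye distance to 100 px
        center = ((lx + rx) / 2.0, (ly + ry) / 2.0)
        M = cv2.getRotationMatrix2D(center, angle, scale)  # rectify rotation, normalize size
        return cv2.warpAffine(img, M, out_size)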

Comparing faces

• Face image

• Bitmap of size 180x200 pixels

• Grayscale (0-255)

• 36,000 values/face image

• Given 2 face images x1 and x2

• x1(x,y) - x2(x,y)

• | x1(x,y) - x2(x,y) |

• (x1(x,y) - x2(x,y))^2

• What should be used?

Basic Maths

• 1 Face image = 1 vector

• 36,000 dimensions (d)

• matrix with 1 column

• Distance

• Euclidean distance

• Norm-p distance

• Norm-1 distance

• Norm-infinity distance
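A small NumPy sketch of the distances listed above, treating each 180x200 grayscale face as one 36,000-dimensional vector (random arrays stand in for real aligned faces):

    import numpy as np

    rng = np.random.default_rng(0)
    x1 = rng.integers(0, 256, size=200 * 180).astype(float)   # stand-in face vector
    x2 = rng.integers(0, 256, size=200 * 180).astype(float)

    diff = x1 - x2
    d_euclid = np.sqrt(np.sum(diff ** 2))          # Euclidean (norm-2) distance
    d_norm1 = np.sum(np.abs(diff))                 # norm-1 distance
    d_inf = np.max(np.abs(diff))                   # norm-infinity distance
    d_p = np.sum(np.abs(diff) ** 3) ** (1 / 3)     # norm-p distance, here p = 3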

Pixel importance and projection

• Not all pixels have the same importance

• Pixel with low variation -> not important

• Pixel with large variation -> could be important

Projection: when ||w||=1, wTx is the projection of x on axis w

Subspace projection

• What should be the axis w?

• How many axes do we need?

Principal Component Analysis PCA (1)

• Basic idea

• Measure of information = variance

• Variance of z1,…,zN for real numbers zt: Var = (1/N) Σt (zt - z̄)^2, with z̄ the mean

• Given a set of face vectors x1,…,xN and axis w, the variance of wTx1,…,wTxN is wTCw

• C = (1/N) Σt (xt - x̄)(xt - x̄)T is the covariance matrix

Principal Component Analysis PCA (2)

• Best axis w is obtained by maximizing wTCw with constraint ||w||=1

• w is an eigenvector of C : Cw = a w

• Variance wTCw=a is the corresponding eigenvalue of w

• PCA

• Construct Covariance matrix C

• Eigen-decompose C

• Select the eigenvectors with the m largest eigenvalues (see the sketch below)
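A compact NumPy sketch of these three steps (variable names are illustrative; this direct form is only practical when d is small, which is exactly the issue the next slide deals with):

    import numpy as np

    def pca(X, m):
        # X: N face vectors of dimension d, one per row
        mean = X.mean(axis=0)
        Xc = X - mean
        C = Xc.T @ Xc / X.shape[0]               # d x d covariance matrix
        eigvals, eigvecs = np.linalg.eigh(C)     # eigh because C is symmetric
        order = np.argsort(eigvals)[::-1][:m]    # indices of the m largest eigenvalues
        return mean, eigvecs[:, order]           # d x m projection matrix W

    X = np.random.default_rng(0).normal(size=(40, 500))   # 40 faces, d = 500 for illustration
    mean, W = pca(X, m=10)
    codes = (X - mean) @ W                                 # N x 10 vectors of projected values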

Eigenface (1)

• What is the problem with face data? The covariance matrix is dxd (36,000x36,000): too large to build and eigen-decompose directly

• Solution: eigen-decompose the NxN dot-product matrix of the training faces instead, then map its eigenvectors back to d dimensions
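A sketch of that trick, assuming N training faces stacked as the rows of X with N much smaller than d: eigen-decompose the small NxN matrix, then map each of its eigenvectors back to d dimensions.

    import numpy as np

    def eigenfaces(X, m):                       # X: N x d, with N << d
        mean = X.mean(axis=0)
        Xc = X - mean
        G = Xc @ Xc.T / X.shape[0]              # N x N dot-product matrix
        eigvals, U = np.linalg.eigh(G)
        order = np.argsort(eigvals)[::-1][:m]
        W = Xc.T @ U[:, order]                  # map back to d dimensions: d x m
        W /= np.linalg.norm(W, axis=0)          # renormalize each eigenface to unit length
        return mean, W

This works because if Gu = au, then C(XcT u) = a (XcT u), so each mapped vector is an eigenvector of the full dxd covariance matrix with the same eigenvalue.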

Eigenface (2)

• We work with vectors of projected values

(Diagram: projected values for faces x1, x2, …, x40; enrollment of a new face x stores its projected vector as the template)

Eigenface (3)

• Vector of raw intensity: 36,000 dimensions

• Vector of Eigenface coefficients: 10 dimensions

• Eigenfaces with large eigenvalues = large variation

• Eigenfaces with small eigenvalues = noise
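A minimal enrollment and matching sketch on top of the eigenfaces helper above; templates is a hypothetical dict mapping person IDs to stored coefficient vectors:

    import numpy as np

    def enroll(face_vec, mean, W):
        # Template = short vector of Eigenface coefficients (e.g. 10 values)
        return W.T @ (face_vec - mean)

    def identify(face_vec, templates, mean, W):
        probe = enroll(face_vec, mean, W)
        best_id = min(templates, key=lambda pid: np.linalg.norm(probe - templates[pid]))
        return best_id, np.linalg.norm(probe - templates[best_id])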

Related techniques

• Fisherface (LDA)

• Nullspace LDA

• Laplacianface

• Locality Sensitive Discriminant Analysis

• 2DPCA

• 2DLDA

• 2DPCA+2DLDA

Result on ORL (~10 years ago)

Technique       Accuracy (%)   #dim
Eigenface       90-95          200
Fisherface      91-97          50
NLDA            92-97          40
Laplacianface   89-95          50
LSDA            91-97          50
2DPCA           91.5           -
2DLDA           90.5           -
2DPCA+2DLDA     93.5           -

Limitations

• Occlusion: glasses, beard

• Lighting condition

• Facial expression

• Pose

• Make-up

Evaluation

• Accuracy: find closest template and check the ID

• Verification (access control)

• Live captured image VS. stored image

• We have distance -> Should we accept or not?

• False Accept (FA) VS. False Reject (FR)

• From a set of face images

• Compute distances between all pairs

• Select threshold T that gives 0 FA and X FR

• Number of tries

(Plot: distribution of pairwise distances, with the chosen threshold T marked)
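A sketch of that threshold selection, assuming the pairwise distances have already been split into genuine scores (same person) and impostor scores (different people); both names are placeholders:

    import numpy as np

    def pick_threshold(genuine, impostor):
        # Largest T that still accepts no impostor pair, i.e. 0 False Accepts;
        # the False Rejects are the genuine pairs whose distance exceeds T.
        T = np.min(impostor) - 1e-9
        false_rejects = int(np.sum(np.asarray(genuine) > T))
        return T, false_rejects

    # At verification time: accept if distance(live image, stored image) <= T.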

Labeled Faces in the Wild

• Large number of subjects (>5,000)

• Unconstrained conditions

• Human performance 97-99%

• Traditional methods fail

• New alignment technique: funneling

LFW results

Use outside data to train the model

Deep Learning

Neural Network timeline

McCulloch & Pitts Neuron model (1943)

Perceptron limitation (1969)

Backprop algorithm (1970s-80s)

SVM (1992)

Deep Learning (2006)

• Return of Neural Network

• Focus on Deep Structure

• Take advantage of today's computing power

Neural Networks (1)

• Neurons are connected via synapses

• A neuron receives signals from other neurons

• When the activation reaches a threshold, it fires a signal to other neurons

http://en.wikipedia.org/wiki/Neuron

Neural Networks (2)

• Universal Approximator

• Classical structure: MLP

• #hidden nodes, learning rate

• Backprop algorithm

• Gradient

• Direction of change that increases value of objective function

• Vector of partial derivatives wrt. each parameter

• Works on all structures and all objective functions

• Stopping criteria, local optima, vanishing/exploding gradients
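To make the gradient and the update concrete, a toy sketch (not from the talk): a one-hidden-layer MLP trained by gradient descent on XOR, where each step moves the parameters against the vector of partial derivatives of the squared error:

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)   # hidden layer (8 nodes)
    W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)   # output layer
    lr = 0.5                                                    # learning rate

    for epoch in range(5000):
        h = np.tanh(X @ W1 + b1)                    # forward pass
        out = 1 / (1 + np.exp(-(h @ W2 + b2)))      # sigmoid output
        d_out = (out - y) * out * (1 - out)         # backprop: gradient wrt. output pre-activation
        d_h = (d_out @ W2.T) * (1 - h ** 2)         # gradient wrt. hidden pre-activation
        W2 -= lr * (h.T @ d_out);  b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_h);    b1 -= lr * d_h.sum(axis=0)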

Deep Learning

• 2006 Hinton et al.: layer by layer construction -> pre-training

• Stack of RBMs, Stack of Autoencoders

• Convolutional NN (CNN)

• Shared weights

• Take advantage of GPU

CNN today

• Common components

• Convolution layer, Max-pooling layer

• ReLU

• Drop-out, Sampling+flip training data

• GPU

• Tools: Caffe, TensorFlow, Theano, Torch

• Structure: LeNet, AlexNet, GoogLeNet
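As an illustration of those components, a small LeNet-style network written with PyTorch (any of the listed tools would work; the layer sizes assume 28x28 grayscale inputs and 10 output classes):

    import torch.nn as nn

    lenet_like = nn.Sequential(
        nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),    # 28x28 -> 24x24 -> 12x12
        nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),   # 12x12 -> 8x8 -> 4x4
        nn.Flatten(),
        nn.Linear(16 * 4 * 4, 120), nn.ReLU(), nn.Dropout(0.5),        # fully connected + drop-out
        nn.Linear(120, 84), nn.ReLU(),
        nn.Linear(84, 10),
    )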

LeNet

AlexNet

GoogLeNet

Microsoft deep residual network: 150 layers!

DeepID (Sun et al., CVPR 2014)

• 160 dim, 60 regions, flipped

• 19,200 dimensions!!

• Input to other model

• CelebFace

• Refine training

Learning techniques for deep structures + Big data + Computing power (GPU, etc.)
