Face detection with boosted Gaussian features


Page 1: Face detection with boosted Gaussian features

Face detection with boosted Gaussian features

Pattern Recognition, Feb. 2007. Presented by 井民全

Page 2: Face detection with boosted Gaussian features

Outline

• Introduction
• A brief overview of AdaBoost
• The VC-Dimension concept
• The features
  – Anisotropic Gaussian filters
  – Gaussian vs. Haar-like
• Experiments and results

Page 3: Face detection with boosted Gaussian features

Introduction

• Automatic face detection is a key step in any face processing system

• It is far from a trivial task
  – faces are highly deformable objects
  – lighting conditions, poses
• Holistic methods
  – consider the face as a global object
• Feature-based methods
  – recognize parts of the face and assemble them to take the final decision

Page 4: Face detection with boosted Gaussian features

Introduction

• The classical approach for face detection
  Step 1: scan the input image with a sliding window
  Step 2: at each position, the window is classified as either face or non-face
• The efficient exploration of the search space is a key ingredient for obtaining a fast face detector
  – skin color, a coarse-to-fine approach, etc.

Page 5: Face detection with boosted Gaussian features

Introduction

• A fast algorithm was proposed by Viola and Jones
  – Three main ideas
    • first, train a strong classifier by boosting weak classifiers based on Haar-like features
    • use the so-called integral image as the image representation, so that the features can be computed very efficiently
    • arrange the classifiers in a cascade structure to speed up detection

Page 6: Face detection with boosted Gaussian features

A brief overview of AdaBoost

[Figure: weak classifiers h1, h2, h3, h4, …, hT-1, hT, each computed on a 24x24 window, are combined into a strong classifier]

Page 7: Face detection with boosted Gaussian features

Cascaded Classifiers Structure

[Figure: at each stage an AdaBoost learner selects features from the feature set and builds the stage classifier. Stage 1 uses h1; Stage 2 uses h1, h2, …, h10; Stage 3 uses h1, h2 and more. A window that passes a stage goes on to the next stage; a window that fails is rejected (False).]

• Reject as many negatives as possible (minimize the false negatives)
• Per-stage target: 100% detection rate, 50% false positive rate
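The early-rejection logic of the cascade can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the `Stage` type, `make_threshold_stage`, and the toy thresholds are hypothetical names and values chosen only for the example.

```python
from typing import Callable, List, Sequence

# Hypothetical stage type: returns True to pass the window on, False to reject it.
Stage = Callable[[Sequence[float]], bool]


def cascade_classify(window_features: Sequence[float], stages: List[Stage]) -> bool:
    """Return True (face) only if every stage passes the window."""
    for stage in stages:
        if not stage(window_features):
            return False  # rejected early; the later stages never run
    return True


def make_threshold_stage(index: int, threshold: float) -> Stage:
    # Toy stage (an assumption for illustration): pass when one feature
    # value is below a threshold.
    return lambda feats: feats[index] < threshold


# Two toy stages with made-up thresholds.
stages = [make_threshold_stage(0, 30.0), make_threshold_stage(1, 50.0)]
print(cascade_classify([10.0, 20.0], stages))  # passes both stages
print(cascade_classify([40.0, 20.0], stages))  # rejected by the first stage
```

If every stage kept all faces and passed half of the non-faces, and the stages behaved independently (an idealization), roughly 0.5^n of the non-face windows would survive n stages, which is why most windows are discarded after only a few cheap tests.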

Page 8: Face detection with boosted Gaussian features

Haar-like features

Two-Rectangle Feature
• The difference between the sums of the pixels within two rectangular regions
• The regions have the same size and shape and are horizontally or vertically adjacent

Three-Rectangle Feature
• The sum within two outside rectangles subtracted from the sum in a center rectangle

Four-Rectangle Feature
• The difference between the diagonal pairs of rectangles

The base resolution is 24x24. The exhaustive set of rectangle features is large: over 180,000.
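The integral image mentioned in the introduction is what makes these rectangle sums cheap. The sketch below, using NumPy, computes an integral image and a horizontal two-rectangle feature. The function names, the zero-padded layout, and the sign convention (left minus right) are assumptions made for illustration.

```python
import numpy as np


def integral_image(img: np.ndarray) -> np.ndarray:
    """ii[r, c] holds the sum of img[:r, :c]; a zero row and column are prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii


def rect_sum(ii: np.ndarray, top: int, left: int, height: int, width: int) -> int:
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return int(ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left])


def two_rect_horizontal(ii: np.ndarray, top: int, left: int, height: int, width: int) -> int:
    """Left rectangle sum minus the sum of the adjacent right rectangle of equal size."""
    return rect_sum(ii, top, left, height, width) - rect_sum(ii, top, left + width, height, width)


# Toy 24x24 "image" (the base resolution on the slide) with arbitrary pixel values.
img = np.arange(24 * 24).reshape(24, 24)
ii = integral_image(img)
print(two_rect_horizontal(ii, top=0, left=0, height=4, width=3))
```

Each rectangle costs four lookups regardless of its size, so evaluating one of the 180,000 features takes constant time once the table is built.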

Page 9: Face detection with boosted Gaussian features

[Figure: a 24x24 sub-image]

Over 180,000 rectangle features are associated with each sub-image.

The feature values: an example

Page 10: Face detection with boosted Gaussian features

The training process for a weak learner

• Let’s see an example

Page 11: Face detection with boosted Gaussian features

The training process for a weak classifier (an example)

Each example x is described by its feature values {f1(x), f2(x), …, fj(x), …, f180,000(x)}.

Example xi            fj(xi)   yi
{10,23,…,5,…}         5        1
{7,20,…,25,…}         25       1
{15,21,…,100,…}       100      0
{15,21,…,20,…}        20       0

h1(xi) = 1 if fj(xi) < 30, 0 otherwise

Searching for the feature whose training error is minimal!
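The search for the feature and threshold with minimal training error can be sketched as an exhaustive decision-stump search. This is a hedged illustration: `best_stump` and `stump_error` are hypothetical helpers, the polarity `p` is borrowed from the weak-classifier definition later in the talk, and only the fully listed feature column fj (5, 25, 100, 20) is used because the other values are elided on the slide.

```python
import numpy as np


def stump_error(f, y, w, theta, p):
    """Weighted error of the stump: predict 1 if p * f < p * theta, else 0."""
    pred = (p * f < p * theta).astype(int)
    return float(np.sum(w * np.abs(pred - y)))


def best_stump(F, y, w):
    """Try every feature, candidate threshold, and polarity; keep the lowest error.

    F: (n_examples, n_features) feature values, y: labels in {0, 1}, w: weights.
    Returns (error, feature_index, threshold, polarity).
    """
    best = None
    for j in range(F.shape[1]):
        values = np.unique(F[:, j])
        # Candidate thresholds: midpoints between sorted values, plus one beyond each end.
        thresholds = np.concatenate(
            ([values[0] - 1], (values[:-1] + values[1:]) / 2, [values[-1] + 1])
        )
        for theta in thresholds:
            for p in (1, -1):
                err = stump_error(F[:, j], y, w, theta, p)
                if best is None or err < best[0]:
                    best = (err, j, float(theta), p)
    return best


# The fj column from the slide, with labels 1, 1, 0, 0 and uniform weights.
F = np.array([[5.0], [25.0], [100.0], [20.0]])
y = np.array([1, 1, 0, 0])
w = np.full(4, 0.25)

print(stump_error(F[:, 0], y, w, theta=30, p=1))  # the slide's rule fj < 30
print(best_stump(F, y, w))
```

On this single column no threshold does better than the slide's rule: the example with value 20 and label 0 sits between two positives, so at least one example is always misclassified.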

Page 12: Face detection with boosted Gaussian features

The 1st iteration

h1(xi) = 1 if fj(xi) < 30, 0 otherwise

Example xi            fj(xi)   yi   h1(xi)
{10,23,…,5,…}         5        1    1
{7,20,…,25,…}         25       1    1
{15,21,…,100,…}       100      0    0
{7,23,…,20,…}         20       0    1   (False positive)

Page 13: Face detection with boosted Gaussian features

[Figure: h1 applied to the examples, with h1(xi) = 1 if fj(xi) < 30, 0 otherwise. One non-face passes h1 as a false positive.]

Page 14: Face detection with boosted Gaussian features

The training error for h1

$\epsilon_j = \sum_i w_i \, |h_1(x_i) - y_i|$

Example xi            fj(xi)   yi   h1(xi)   weight w1,i   error
{10,23,…,5,…}         5        1    1        1/4           0
{7,20,…,25,…}         25       1    1        1/4           0
{15,21,…,100,…}       100      0    0        1/4           0
{7,23,…,20,…}         20       0    1        1/4           1/4   (False positive)

Training error for h1: 0 + 0 + 0 + 1/4 = 1/4

Page 15: Face detection with boosted Gaussian features

Update the weight (1/2)

$w_{t+1,i} = \begin{cases} w_{t,i}\,\beta_t & \text{if example } x_i \text{ is classified correctly} \\ w_{t,i} & \text{otherwise} \end{cases}$

Distribute the contribution!

Page 16: Face detection with boosted Gaussian features

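Update the weight (2/2)

$w_{t+1,i} = w_{t,i}\,\beta_t$ for correctly classified examples, with $\beta_t = \epsilon_t / (1 - \epsilon_t) = 0.25 / 0.75$

Example xi            fj(xi)   yi   h1(xi)   error   updated weight w2,i
{10,23,…,5,…}         5        1    1        0       1/4 × 0.25/0.75   (decreases)
{7,20,…,25,…}         25       1    1        0       1/4 × 0.25/0.75   (decreases)
{15,21,…,100,…}       100      0    0        0       1/4 × 0.25/0.75   (decreases)
{7,23,…,20,…}         20       0    1        1/4     1/4               (unchanged, False positive)

Weighted error for h1: 0 + 0 + 0 + 1/4 = 1/4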

Page 17: Face detection with boosted Gaussian features

Normalizing the weights

$w_{t,i} \leftarrow \dfrac{w_{t,i}}{\sum_{k=1}^{n} w_{t,k}}$

n = number of examples

Page 18: Face detection with boosted Gaussian features

$w_{t,i} \leftarrow \dfrac{w_{t,i}}{\sum_{k=1}^{n} w_{t,k}}$

Example xi            fj(xi)   yi   h1(xi)   normalized weight w2,i
{10,23,…,5,…}         5        1    1        0.166
{7,20,…,25,…}         25       1    1        0.166
{15,21,…,100,…}       100      0    0        0.166
{7,23,…,20,…}         20       0    1        0.5   (False positive)

(The example that was just misclassified gets a larger weight: from 1/4 to 0.5.)
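The arithmetic of this example can be checked with exact fractions. The snippet below is only a worked check of the slide's numbers, assuming the update rule from the previous slides (multiply the weight of each correctly classified example by β = ε / (1 − ε), then normalize); the variable names are chosen for the example.

```python
from fractions import Fraction

w1 = [Fraction(1, 4)] * 4   # initial weights of the four examples
y = [1, 1, 0, 0]            # true labels
h1 = [1, 1, 0, 1]           # outputs of h1; the last one is a false positive

# Weighted error of h1: sum of the weights of the misclassified examples.
eps = sum(w for w, pred, label in zip(w1, h1, y) if pred != label)
beta = eps / (1 - eps)

# Shrink the weights of correctly classified examples, keep the others.
updated = [w * beta if pred == label else w for w, pred, label in zip(w1, h1, y)]

# Normalize so that the weights sum to 1.
total = sum(updated)
w2 = [w / total for w in updated]

print(eps, beta)   # 1/4 1/3
print(updated)     # three weights of 1/12 and one of 1/4
print(w2)          # three weights of 1/6 and one of 1/2
```

The exact values 1/6 correspond to the 0.166 shown on the slide, and the misclassified example ends up carrying half of the total weight.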

Page 19: Face detection with boosted Gaussian features

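Analysis

• The classifier is chosen by selecting the feature that produces the smallest overall classification error.
• The error cost of the examples misclassified in the previous round has increased, so the whole training process tends to avoid misclassifying those examples again in the current round.

$\epsilon_j = \sum_i w_i \, |h_j(x_i) - y_i|$

where $w_i$ is the weight (error cost) of each example and $\epsilon_j$ is the overall training error when feature j is used for classification.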

Page 20: Face detection with boosted Gaussian features

Cascaded Classifiers Structure

[Figure: the non-faces that pass h1 as false positives are handed on to h2]

Page 21: Face detection with boosted Gaussian features

The Boost algorithm for classifier learning

Step 1: Given example images $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, where $x_i$ is an image and $y_i$ is 1 for positive and 0 for negative examples.

Step 2: Initialize the weights
$w_{1,i} = \frac{1}{2m}, \frac{1}{2l}$ for $y_i = 0, 1$ respectively, where m and l are the numbers of negatives and positives.

For t = 1, …, T
  1. Normalize the weights,
     $w_{t,i} \leftarrow \dfrac{w_{t,i}}{\sum_{k=1}^{n} w_{t,k}}$
     so that $w_t$ is a probability distribution.
  2. For each feature j, train a classifier $h_j$ which is restricted to using a single feature (weak learner constructor). The error is evaluated with respect to $w_t$:
     $\epsilon_j = \sum_i w_i \, |h_j(x_i) - y_i|$
     Choose the classifier $h_t$ with the lowest error $\epsilon_t$.
  3. Update the weights:
     $w_{t+1,i} = \begin{cases} w_{t,i}\,\beta_t & \text{if example } x_i \text{ is classified correctly} \\ w_{t,i} & \text{otherwise} \end{cases}$
     where $\beta_t = \epsilon_t / (1 - \epsilon_t)$.

Page 22: Face detection with boosted Gaussian features

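The final strong classifier

$h(x) = \begin{cases} 1 & \text{if } \sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0 & \text{otherwise} \end{cases}$, where $\alpha_t = \log \frac{1}{\beta_t}$

• If more than half of the (weighted) votes are in favor, the window passes.
• The weight of each vote is determined by the accuracy of the voter.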
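Putting the steps together, the boosting loop and the weighted vote can be sketched as below. This is a compact illustration under stated assumptions rather than the authors' code: weak classifiers are single-feature threshold stumps, thresholds are taken from the observed feature values, the error is clamped away from 0 and 1 to keep the logarithm finite, and the toy feature values are made up.

```python
import math
import numpy as np


def stump_predict(F, j, theta, p):
    """Predict 1 where p * F[:, j] < p * theta, else 0."""
    return (p * F[:, j] < p * theta).astype(int)


def train_stump(F, y, w):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(F.shape[1]):
        for theta in np.unique(F[:, j]):
            for p in (1, -1):
                err = float(np.sum(w * np.abs(stump_predict(F, j, theta, p) - y)))
                if best is None or err < best[0]:
                    best = (err, j, float(theta), p)
    return best


def adaboost(F, y, T):
    m, l = np.sum(y == 0), np.sum(y == 1)            # numbers of negatives and positives
    w = np.where(y == 0, 1.0 / (2 * m), 1.0 / (2 * l))
    strong = []
    for _ in range(T):
        w = w / w.sum()                               # 1. normalize the weights
        eps, j, theta, p = train_stump(F, y, w)       # 2. pick the lowest-error stump
        eps = min(max(eps, 1e-10), 1 - 1e-10)         # guard (an assumption): avoid log(1/0)
        beta = eps / (1 - eps)
        correct = stump_predict(F, j, theta, p) == y
        w = np.where(correct, w * beta, w)            # 3. shrink correctly classified weights
        strong.append((math.log(1 / beta), j, theta, p))
    return strong


def strong_predict(F, strong):
    """Weighted vote: 1 if the vote reaches half of the total alpha."""
    votes = sum(a * stump_predict(F, j, th, p) for a, j, th, p in strong)
    return (votes >= 0.5 * sum(a for a, _, _, _ in strong)).astype(int)


# Toy, linearly separable data (hypothetical values for illustration).
F = np.array([[5.0], [25.0], [100.0], [120.0]])
y = np.array([1, 1, 0, 0])
model = adaboost(F, y, T=3)
print(strong_predict(F, model))
```

The exhaustive stump search is quadratic in the number of examples per feature; a practical detector would sort each feature once, but the simpler form keeps the correspondence with the three numbered steps visible.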

Page 23: Face detection with boosted Gaussian features

• Introduction
• A brief overview of AdaBoost
• The VC-Dimension concept
• The features
  – Anisotropic Gaussian filters
  – Gaussian vs. Haar-like
• Experiments and results

Page 24: Face detection with boosted Gaussian features

The VC-Dimension concept

• A learning machine f takes an input x and transforms it, somehow using weights α, into a predicted output $y^{est} = \pm 1$
  – α is some vector of adjustable parameters
  – in some papers, the definition is $y^{est} \in \{0, 1\}$

[Figure: x → f → $y^{est}$]

Page 25: Face detection with boosted Gaussian features

Examples

[Figure: x → f → $y^{est}$]

Page 26: Face detection with boosted Gaussian features

Examples

[Figure: x → f → $y^{est}$]

Page 27: Face detection with boosted Gaussian features

Examples

[Figure: x → f → $y^{est}$]

Page 28: Face detection with boosted Gaussian features

How do we characterize “power”?

• Different machines have different amounts of “power”

• Tradeoff between:
  – More power: can model more complex classifiers but might overfit
  – Less power: not going to overfit, but restricted in what it can model
• How do we characterize the amount of power?

Page 29: Face detection with boosted Gaussian features

Some definitions

• Given some machine f
• And under the assumption that all training points (xk, yk) were drawn i.i.d. from some distribution
• And under the assumption that future test points will be drawn from the same distribution

i.i.d. = independent and identically distributed

Page 30: Face detection with boosted Gaussian features

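Probability of misclassification

$R(\alpha) = \mathrm{TESTERR}(\alpha) = E\left[ \tfrac{1}{2} \, |y - f(x, \alpha)| \right]$

Fraction of the training set misclassified

$R^{emp}(\alpha) = \mathrm{TRAINERR}(\alpha) = \dfrac{1}{R} \sum_{k=1}^{R} \tfrac{1}{2} \, |y_k - f(x_k, \alpha)|$

R = number of training examples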

Page 31: Face detection with boosted Gaussian features

Vapnik-Chervonenkis dimension

• Given some machine f, let h be its VC dimension

• Vapnik showed that

with probability 1 − η, where the training error and R (the number of training examples) are known

Page 32: Face detection with boosted Gaussian features

This gives us a way to estimate the error on future data based only on the training error and the VC-dimension of f.

Page 33: Face detection with boosted Gaussian features

• But given machine f, how do we define and compute h, the VC-dimension of f?

Page 34: Face detection with boosted Gaussian features

Shattering

• Machine f can shatter a set of points x1, x2, …, xr if and only if:
  – for every possible training set of the form $(x_1, y_1), (x_2, y_2), \ldots, (x_r, y_r)$,
  – there exists some value of α that gets zero training error.

Page 35: Face detection with boosted Gaussian features

Question

• Can the following f shatter the following points?

Page 36: Face detection with boosted Gaussian features

Answer: No problem

• There are four training sets to consider

Horizontal line (ok), diagonal line (ok), diagonal line with the signs flipped (ok), horizontal line with the signs flipped (ok)

Page 37: Face detection with boosted Gaussian features

Question

• Can the following f shatter the following points?

Page 38: Face detection with boosted Gaussian features

Answer: No way my friend

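One class outside the circle, the other inside the circle (ok)

One class outside the circle, the other inside the circle (ok)

Conflict: no change of parameters can resolve it (because x cannot be controlled in f(x,b))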

Page 39: Face detection with boosted Gaussian features

Definition of VC dimension

• Given machine f, the VC-dimension h is the maximum number of points that can be arranged so that f shatters them
  – that is, the largest number of examples that this machine never misclassifies under every possible labeling of those examples
• What's the VC dimension of …? Ans: 1

Page 40: Face detection with boosted Gaussian features

VC dim of line machine

• For 2-d inputs, what’s VC-dim of f(x,w,b) = sign(w.x+b)?

• Well, can we find four points that f can shatter?
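Shattering can be probed numerically for small point sets. The sketch below enumerates every labeling and uses a perceptron to look for a separating line sign(w.x + b). It is a heuristic illustration: the perceptron only proves separability when it converges, so hitting the epoch budget is treated as "not separable" (an assumption), and checking one arrangement of four points does not by itself establish the VC-dimension.

```python
import itertools
import numpy as np


def linearly_separable(X, labels, max_epochs=1000):
    """True if a perceptron finds w, b with sign(w.x + b) matching the labels."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append 1 so that b is folded into w
    w = np.zeros(Xb.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(Xb, labels):
            if yi * np.dot(w, xi) <= 0:         # misclassified (or on the boundary)
                w += yi * xi
                mistakes += 1
        if mistakes == 0:
            return True
    return False  # budget exhausted; treated as not separable


def can_shatter(points):
    """True if every +/-1 labeling of the points is linearly separable."""
    X = np.asarray(points, dtype=float)
    return all(
        linearly_separable(X, labels)
        for labels in itertools.product((-1, 1), repeat=len(X))
    )


three = [(0, 0), (1, 0), (0, 1)]
four = [(0, 0), (1, 1), (1, 0), (0, 1)]
print(can_shatter(three))  # all 8 labelings can be separated
print(can_shatter(four))   # the XOR labeling cannot be separated
```

The known result is that lines in the plane shatter some set of three points but no set of four, so the VC-dimension of this machine for 2-d inputs is 3.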

Page 41: Face detection with boosted Gaussian features

• The larger the VC-dimension of a machine, the greater its power.

Page 42: Face detection with boosted Gaussian features

Structural Risk Minimization

• Considers a sequence of hypothesis spaces $f_i$ of increasing complexity
  – For example, polynomials of increasing degree

Page 43: Face detection with boosted Gaussian features

Structural Risk Minimization

• We're trying to decide which machine to use
• We train each machine and make a table…

i   f_i               TrainErr   VC Conf   Prob. upper bound on TestErr   Choice
1   (simplest)
2
3
4
5   (most complex)

Page 44: Face detection with boosted Gaussian features

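Analysis

• Vapnik-Chervonenkis theory tells us that the TestError of any machine is related to its VC-dimension (the complexity of the machine).
• For the same data set, the more complex the machine, the more it overfits the training data.

[Figure: training examples]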

Page 45: Face detection with boosted Gaussian features

Generalization error for AdaBoost, proposed by Freund

The weak classifier:

$h_j(x) = \begin{cases} 1 & \text{if } P_j f_j(x) < P_j \theta_j \\ -1 & \text{otherwise} \end{cases}$

where $\theta_j$ is a threshold, $P_j$ indicates the direction of the inequality sign, and $f_j$ is a feature.

Terms of the bound:
• TRAINERR: the training error
• N: the number of examples
• $f_T$: the decision function output by AdaBoost
• d: the VC-dimension
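The bound that these labels annotate is not reproduced in the transcript. A form commonly attributed to Freund and Schapire, written with the slide's symbols, is sketched below; treat it as an assumption about what the slide showed, with T taken to be the number of boosting rounds.

```latex
\Pr\left[ f_T(x) \neq y \right]
  \;\le\; \mathrm{TRAINERR}
  \;+\; \tilde{O}\!\left( \sqrt{\frac{T\,d}{N}} \right)
```

Read this way, the second term grows with the number of rounds T and with the VC-dimension d of the weak classifiers, and shrinks as the number of examples N grows, which matches the overfitting concern raised for classifiers built from thousands of features.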

Page 46: Face detection with boosted Gaussian features

• For example
  – An AdaBoost algorithm proposed by [1]
  – Total number of features in all layers: 6061
• AdaBoost has an important drawback
  – It tends to overfit the training examples

Page 47: Face detection with boosted Gaussian features

• Introduction
• A brief overview of AdaBoost
• The VC-Dimension concept
• The features
  – Anisotropic Gaussian filters
  – Gaussian vs. Haar-like
• Experiments and results

Page 48: Face detection with boosted Gaussian features

The proposed new features – Anisotropic Gaussian filters

• The generating function $\phi(x, y): \mathbb{R}^2 \to \mathbb{R}$

  $\phi(x, y) = x \exp(-|x| - y^2)$

• It efficiently captures contour singularities with a smooth low-resolution function

Page 49: Face detection with boosted Gaussian features

The transformations

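• Translation by $(x_0, y_0)$

  $T_{x_0, y_0}\,\phi(x, y) = \phi(x - x_0,\; y - y_0)$

• Rotation by $\theta$

  $R_\theta\,\phi(x, y) = \phi(x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta)$

• Bending by r

  $B_r\,\phi(x, y) = \begin{cases} \phi\left(r - \sqrt{(x - r)^2 + y^2},\; r\tan^{-1}\dfrac{y}{r - x}\right) & \text{if } x < r \\ \phi\left(r - |y|,\; x - r + \dfrac{\pi}{2}\,r\right) & \text{if } x \ge r \end{cases}$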

Page 50: Face detection with boosted Gaussian features

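• Anisotropic scaling by $(s_x, s_y)$

  $S_{s_x, s_y}\,\phi(x, y) = \phi\left(\dfrac{x}{s_x},\; \dfrac{y}{s_y}\right)$

• By combining these four basic transformations,

  $\psi_i(x, y) = \psi_{s_x, s_y, \theta, r, x_0, y_0}(x, y) = T_{x_0, y_0}\, R_\theta\, B_r\, S_{s_x, s_y}\, \phi(x, y)$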
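A filter from this family can be generated on a pixel grid as sketched below. Several points are assumptions: the generating function is taken as φ(x, y) = x exp(−|x| − y²), the operators are applied to the coordinates in the order T, R, B, S, the bending map is the piecewise form r − sqrt((x − r)² + y²), r·arctan(y / (r − x)) for x < r and r − |y|, x − r + rπ/2 otherwise, and all function names and parameter values are hypothetical.

```python
import numpy as np


def phi(x, y):
    """Assumed generating function: x * exp(-|x| - y^2)."""
    return x * np.exp(-np.abs(x) - y ** 2)


def bend(x, y, r):
    """Assumed bending map for r > 0 (see the lead-in for the piecewise form)."""
    left = x < r
    xb = np.where(left, r - np.sqrt((x - r) ** 2 + y ** 2), r - np.abs(y))
    # arctan2(y, r - x) equals arctan(y / (r - x)) wherever r - x > 0.
    yb = np.where(left, r * np.arctan2(y, r - x), x - r + r * np.pi / 2)
    return xb, yb


def gaussian_filter(height, width, x0, y0, theta, sx, sy, r=None):
    """Sample the transformed generating function on a height x width grid."""
    ys, xs = np.mgrid[0:height, 0:width].astype(float)
    x, y = xs - x0, ys - y0                                   # translation
    x, y = (x * np.cos(theta) - y * np.sin(theta),
            x * np.sin(theta) + y * np.cos(theta))            # rotation
    if r is not None:
        x, y = bend(x, y, r)                                  # bending
    return phi(x / sx, y / sy)                                # anisotropic scaling


def feature_value(filt, image):
    """Discrete approximation of the integral of filter times image."""
    return float(np.sum(filt * image))


# Toy 15x21 grid centered on the filter, so the grid is symmetric in x.
filt = gaussian_filter(15, 21, x0=10, y0=7, theta=0.0, sx=1.0, sy=1.0)
print(filt[7, 11], feature_value(filt, np.ones((15, 21))))
```

Because the generating function is odd in x, the response to a constant image is zero, which illustrates why such filters react to contours rather than to uniform brightness.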

Page 51: Face detection with boosted Gaussian features

Anisotropic Gaussian filters with different rotating and bending parameters

Page 52: Face detection with boosted Gaussian features

Some of the features selected by the proposed method

[Figure: input image, particular filter, feature]

$f_j(x_i) = \iint_{X \times Y} \psi_j(x, y)\, I_i(x, y)\, dx\, dy$

$h_j(x) = \begin{cases} 1 & \text{if } P_j f_j(x) < P_j \theta_j \\ -1 & \text{otherwise} \end{cases}$

where $\theta_j$ is a threshold, $P_j$ indicates the direction of the inequality sign, and $f_j$ is a feature.

Page 53: Face detection with boosted Gaussian features

• The features are particularly well adapted to capture local contours
  – that are insensitive to changes of the lighting conditions

Page 54: Face detection with boosted Gaussian features

Gaussian vs. Haar-like

The Haar-like templates

Comparison

Haar filters model global contrasts. They
• are more sensitive to the direction of the light source (Bad)
• capture the contrast between image regions well (Good)
• are limited for modeling the smooth transitions present in facial images (Bad)

Page 55: Face detection with boosted Gaussian features

Gaussian vs. Haar-like

The first and second features selected by AdaBoost proposed by P. Viola

Page 56: Face detection with boosted Gaussian features

Gaussian vs. Haar-like

As the number of stages increases:

Haar: the test error stops decreasing

Gaussian: the test error keeps decreasing

Page 57: Face detection with boosted Gaussian features

• AdaBoost focuses on the hard-to-classify examples
  – the simplistic Haar features (HF) are not discriminant enough to separate the two classes

Page 58: Face detection with boosted Gaussian features

The receiver operating characteristic analysis

Page 59: Face detection with boosted Gaussian features

Experimental Result

Page 60: Face detection with boosted Gaussian features

• A 20x15 pixel window is used to scan the image
• For different scales
  – The image is then dilated by powers of 1.2
• For training models
  – Face images: XM2VTS, BioID, FERET (scale, in-plane rotation, shift), 9,500 face images
  – Non-face dataset: randomly selected images without human faces, 500,000 non-face images
• Total set of Gaussian features (GF) used for training the classifier
  – 202,200 features
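The scanning procedure can be sketched as two small helpers: one listing the scale factors and one enumerating window positions at a single scale. The function names, the one-pixel step, and reading 20x15 as width 20 by height 15 are assumptions; the source does not say whether the image or the window is rescaled, so the sketch simply divides the image size by successive powers of 1.2.

```python
def pyramid_scales(image_w, image_h, win_w=20, win_h=15, factor=1.2):
    """Scale factors for which the window still fits in the rescaled image."""
    scales, s = [], 1.0
    while image_w / s >= win_w and image_h / s >= win_h:
        scales.append(s)
        s *= factor
    return scales


def window_positions(image_w, image_h, win_w=20, win_h=15, step=1):
    """Yield the (top, left) corner of every window position at one scale."""
    for top in range(0, image_h - win_h + 1, step):
        for left in range(0, image_w - win_w + 1, step):
            yield top, left


# Toy image sizes chosen for illustration.
print(pyramid_scales(40, 30))
print(len(list(window_positions(22, 16))))
```

Every position at every scale yields one window to classify, which is why the cost per window, and hence early rejection, dominates the running time.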

Page 61: Face detection with boosted Gaussian features

The evaluation protocol

• The Popovici score:

[Figure labels: between-eyes distances; angle between the eyes axes; annotated position; detected position]

Page 62: Face detection with boosted Gaussian features

Because every output is included in the score calculation, a high number of false positives also lowers the score.

12,480 images in the French and English datasets

Page 63: Face detection with boosted Gaussian features

23 images with 155 low-resolution faces

123 images with 483 faces (manually selected)

Page 64: Face detection with boosted Gaussian features

130 images with 507 faces (complete set)

Page 65: Face detection with boosted Gaussian features
Page 66: Face detection with boosted Gaussian features
Page 67: Face detection with boosted Gaussian features
Page 68: Face detection with boosted Gaussian features
Page 69: Face detection with boosted Gaussian features

Because most false positives are filtered out by the fast 5-stage HF cascade, it runs faster than the pure 12-stage GF cascade.

Page 70: Face detection with boosted Gaussian features

• Thank you