Video-to-Video Face Matching: Establishing a Baseline for...

Video-to-Video Face Matching: Establishing a Baseline for Unconstrained Face Recognition. Lacey Best-Rowden, Brendan Klare, Joshua Klontz, and Anil K. Jain. Biometrics: Theory, Applications, and Systems, Washington DC, USA, September 30, 2013


Page 1: Video-to-Video Face Matching: Establishing a Baseline for ...biometrics.cse.msu.edu/Presentations/LaceyBestRowden...Commercial-off-the-Shelf (COTS) Performance on Still-to-Still FR

Video-to-Video Face Matching: Establishing a Baseline for Unconstrained Face Recognition

Lacey Best-Rowden, Brendan Klare, Joshua Klontz, and Anil K. Jain

Biometrics: Theory, Applications, and Systems, Washington DC, USA, September 30, 2013

Page 2:

Face Recognition in Video

• Abundance of video data
  – Ubiquity of surveillance and mobile phone cameras
  – Low-cost digital cameras

• Forensic and security applications
  – 2011 London riots
  – 2011 Vancouver riots
  – 2013 Boston bombings

Page 3:

Face Recognition Scenarios

• Still-to-Still
• Still-to-Video
• Video-to-Video

Page 4:

Commercial-off-the-Shelf (COTS) Performance on Still-to-Still FR

[Figure: bar charts of TAR (%) at FAR = 0.1%, axis from 0 to 100, comparing Constrained and Unconstrained still-to-still matching. Labels in the order extracted: FRGC Database, 99%; LFW Database; MBGC Database; 84%; 54%.]
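The metric on this slide, TAR at a fixed FAR, can be illustrated with a short sketch. The function name `tar_at_far` and the toy scores are assumptions made for illustration; they are not taken from the slides or from any COTS SDK. The sketch accepts a pair when its similarity score is at or above the threshold, and it picks the lowest threshold whose false accept rate stays within the target.

```python
import math


def tar_at_far(genuine, impostor, far):
    """Return (TAR, threshold) for the lowest threshold whose FAR is <= `far`.

    A pair is accepted when its similarity score is >= threshold.
    `genuine` holds scores of same-subject pairs, `impostor` holds scores
    of different-subject pairs. Both lists are hypothetical inputs.
    """
    # Only observed scores need to be tried: any threshold between two
    # observed scores accepts the same pairs as the next score above it.
    candidates = sorted(set(genuine) | set(impostor))
    for t in candidates:  # ascending, so the first valid t gives the highest TAR
        false_accepts = sum(s >= t for s in impostor)
        if false_accepts / len(impostor) <= far:
            true_accepts = sum(s >= t for s in genuine)
            return true_accepts / len(genuine), t
    # No observed score is strict enough: reject everything.
    return 0.0, math.inf


# Toy scores, made up for illustration only.
genuine_scores = [0.9, 0.8, 0.75, 0.4]
impostor_scores = [0.7, 0.3, 0.2, 0.1]
tar, threshold = tar_at_far(genuine_scores, impostor_scores, far=0.25)
```

At FAR = 0.1%, at least 1,000 impostor comparisons are needed before a single false accept can be tolerated, so small test sets cannot resolve that operating point.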

Page 5:

Approaches to Video-based FR

• Sequence of face images with temporal ordering
  – Explicitly leverage temporal dynamics between frames
    • Simultaneous tracking and recognition [Zhou et al., TIP, 2004]

• Unordered set of face images
  – Fuse information prior to matching
    • Output single representation or single face image
    • 3-D modeling [Park and Jain, CVPR, 2007]
    • Super-resolution [Arandjelovic and Cipolla, ICCV, 2007]
    • Manifold-based methods [Wang et al., CVPR, 2008]
  – Fuse information after matching
    • Combine match scores from static face matchers
    • Frame subset selection based on face quality and/or diversity [Thomas et al., IJCV, 2007]

Page 6:

Video Face Databases

Page 7:

Motivation: Representative Baselines

• Methods that can quickly be integrated into operational environments are preferred over those that have been demonstrated as proof of concept

• Video matching algorithms are often compared to static frame-based matching

• We provide a baseline accuracy for unconstrained video-based face recognition by using state-of-the-art commercial-off-the-shelf (COTS) face matchers

Page 8:

COTS Face Matcher Multi-Frame Fusion

Face Track Extraction:

$U = \{u_1, u_2, \ldots, u_a\}$
$V = \{v_1, v_2, \ldots, v_b\}$

All Frame Pairs: $a \times b$ Similarity Matrix

$$\begin{bmatrix} s(u_1, v_1) & \cdots & s(u_1, v_b) \\ \vdots & \ddots & \vdots \\ s(u_a, v_1) & \cdots & s(u_a, v_b) \end{bmatrix}$$

Multi-frame Score-level Fusion:

• mean • median • max • min

Fused score: $s(U, V)$

• Same if $s(U, V) \geq t$
• Not Same if $s(U, V) < t$
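The pipeline on this slide (all frame pairs, an a × b similarity matrix, score-level fusion, then a threshold) can be sketched as follows. The COTS matchers are closed systems, so `toy_match` below is a made-up stand-in that treats each frame as a single number. The function names and the example tracks are likewise assumptions for illustration.

```python
from statistics import mean, median

# The four fusion rules listed on the slide.
FUSION_RULES = {"mean": mean, "median": median, "max": max, "min": min}


def similarity_matrix(track_u, track_v, match):
    """Compare every frame of U with every frame of V (a x b scores)."""
    return [[match(u, v) for v in track_v] for u in track_u]


def fuse_track_scores(sim, rule="mean"):
    """Collapse the a x b similarity matrix into one score s(U, V)."""
    scores = [s for row in sim for s in row]
    return FUSION_RULES[rule](scores)


def same_person(track_u, track_v, match, threshold, rule="mean"):
    """Declare 'Same' when the fused score is at or above the threshold t."""
    sim = similarity_matrix(track_u, track_v, match)
    return fuse_track_scores(sim, rule) >= threshold


def toy_match(u, v):
    """Hypothetical frame matcher: frames are plain numbers in [0, 1]."""
    return 1.0 - abs(u - v)


track_u = [0.0, 0.5]        # a = 2 frames
track_v = [0.0, 1.0, 0.5]   # b = 3 frames
sim = similarity_matrix(track_u, track_v, toy_match)
fused = {rule: fuse_track_scores(sim, rule) for rule in FUSION_RULES}
decision = same_person(track_u, track_v, toy_match, threshold=0.5)
```

Because the matcher is passed in as a function, the same fusion code works for any still-image matcher that returns a similarity score.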

Page 9:

Fusion of Multiple Matchers

• Multi-Matcher Multi-Frame (MMMF) Fusion: $S_A, S_B$ → (Multi-Matcher) → $S_{AB}$ → (Multi-Frame) → $s_{AB}$
• Multi-Frame Multi-Matcher (MFMM) Fusion: $S_A$ → (Multi-Frame) → $s_A$ and $S_B$ → (Multi-Frame) → $s_B$; then $s_A, s_B$ → (Multi-Matcher) → $s_{AB}$
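The two fusion orders can be compared in a minimal sketch. It assumes tanh score normalization, sum-rule fusion across matchers, and mean fusion across frames, matching the "MMMF (tanh, sum, mean)" label in the results legend. The exact normalization used in the talk is not given in the slides, so the tanh form below, with its 0.01 constant and per-matcher mean and standard deviation, is an assumption based on the commonly cited formulation. All function names are hypothetical.

```python
import math
from statistics import mean


def tanh_normalize(score, mu, sigma):
    """Map a raw matcher score into (0, 1).

    mu and sigma are the mean and standard deviation of that matcher's
    scores on some training set (an assumption for this sketch).
    """
    return 0.5 * (math.tanh(0.01 * (score - mu) / sigma) + 1.0)


def mmmf_score(sim_a, sim_b, stats_a, stats_b):
    """Multi-Matcher then Multi-Frame: fuse S_A and S_B into S_AB, then average."""
    fused_matrix = [
        tanh_normalize(a, *stats_a) + tanh_normalize(b, *stats_b)
        for row_a, row_b in zip(sim_a, sim_b)
        for a, b in zip(row_a, row_b)
    ]
    return mean(fused_matrix)


def mfmm_score(sim_a, sim_b, stats_a, stats_b):
    """Multi-Frame then Multi-Matcher: average each matrix, then fuse s_A and s_B."""
    s_a = mean(s for row in sim_a for s in row)
    s_b = mean(s for row in sim_b for s in row)
    return tanh_normalize(s_a, *stats_a) + tanh_normalize(s_b, *stats_b)


# Made-up similarity matrices from two matchers with different score ranges.
sim_a = [[10.0, 20.0], [30.0, 40.0]]
sim_b = [[0.1, 0.2], [0.3, 0.4]]
stats_a = (25.0, 10.0)   # hypothetical (mu, sigma) for matcher A
stats_b = (0.25, 0.1)    # hypothetical (mu, sigma) for matcher B
s_ab_mmmf = mmmf_score(sim_a, sim_b, stats_a, stats_b)
s_ab_mfmm = mfmm_score(sim_a, sim_b, stats_a, stats_b)
```

Normalization is needed because the two matchers report scores on different scales. Since tanh is nonlinear, the two orders generally give different fused scores, although they coincide for symmetric toy inputs like the ones above.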

Page 10:

Experimental Details

YouTube Faces (YTF) Database [Wolf et al., CVPR 2011]:
• 1,447 subjects
• 3,226 videos
  – 1 to 6 (average ≈ 2) videos per subject
  – 48 to 2,157 (average ≈ 182) frames per video

• Faces detected with Viola-Jones detector
  – 24 fps
  – Aligned and cropped to 300×300 pixels

• Experimental protocol
  – 10-fold cross-validation, pairwise tests
    • 250 same, 250 not same face pairs per split
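A pairwise cross-validation protocol of this kind can be sketched as below. The slides state only that the tests are 10-fold and pairwise, so choosing the decision threshold on the training splits and reporting mean accuracy over the held-out splits is an assumption of this sketch. The function names and the toy data are hypothetical.

```python
from statistics import mean


def accuracy(pairs, threshold):
    """Fraction of (score, is_same) pairs classified correctly at `threshold`."""
    return mean(
        1.0 if (score >= threshold) == is_same else 0.0
        for score, is_same in pairs
    )


def best_threshold(pairs):
    """Pick the observed score that maximizes accuracy on the given pairs."""
    candidates = sorted({score for score, _ in pairs})
    return max(candidates, key=lambda t: accuracy(pairs, t))


def cross_validate(splits):
    """Hold out each split in turn; tune the threshold on the remaining splits."""
    accuracies = []
    for i, test_split in enumerate(splits):
        train = [p for j, split in enumerate(splits) if j != i for p in split]
        threshold = best_threshold(train)
        accuracies.append(accuracy(test_split, threshold))
    return mean(accuracies)


# Toy data: 3 splits of 4 pairs each (the real protocol uses 10 splits of 500).
toy_split = [(0.9, True), (0.8, True), (0.2, False), (0.1, False)]
mean_accuracy = cross_validate([toy_split, toy_split, toy_split])
```

Tuning the threshold only on the training splits keeps the held-out pairs unseen, which is the usual reason for reporting cross-validated results.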

Page 11:

Experimental Details

COTS Matchers:
• COTS-A, COTS-B, COTS-C
• All participated in 2010 NIST MBE

Previous Results on YTF DB:
• Matched Background Similarity (MBGS) [Wolf et al., CVPR, 2011]
• Adaptive Probabilistic Elastic Matching (APEM) Fusion [Li et al., CVPR, 2013]
• Spatial-Temporal Face Region Descriptor + Pairwise-constrained Multiple Metric Learning (STFRD+PMML) [Cui et al., CVPR, 2013]
• Rank Aggregation [Bhatt et al., ICIP, 2013]

Page 12:

Experimental Results: Multi-Frame Fusion

Page 13:

Experimental Results: Quality-based Frame Subset Selection

[Figure: ROC curves for COTS-B, True Accept Rate (TAR, 0 to 1) versus False Accept Rate (FAR, log scale from 10^-3 to 10^-1). Curves: all frames (mean); 30 frames (faceness); 30 frames (near-frontal); 1 frame (faceness); 1 frame (near-frontal).]
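Quality-based frame subset selection can be sketched as keeping the k highest-quality frames of each track before matching. The slides name two quality measures (faceness and near-frontal pose) but do not say how they are computed, so the `quality` list below is a hypothetical per-frame score and `select_top_k_frames` is an illustrative name.

```python
def select_top_k_frames(frames, quality, k):
    """Keep the k frames with the highest quality score.

    `quality[i]` is a hypothetical per-frame score (for example a faceness
    or frontalness value) where higher means better. The selected frames
    are returned in their original temporal order.
    """
    ranked = sorted(range(len(frames)), key=lambda i: quality[i], reverse=True)
    keep = sorted(ranked[:k])
    return [frames[i] for i in keep]


# Toy track of four frames with made-up quality scores.
frames = ["f0", "f1", "f2", "f3"]
quality = [0.2, 0.9, 0.5, 0.7]
subset = select_top_k_frames(frames, quality, k=2)

# Matching two tracks of a and b frames needs a * b comparisons;
# after selection it needs at most k * k.
comparisons_before = len(frames) * len(frames)
comparisons_after = len(subset) * len(subset)
```

With k = 30, as in the plotted curves, a pair of 182-frame tracks drops from 182 × 182 = 33,124 frame comparisons to 900.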

Page 14:

Experimental Results

[Figure: ROC curves, True Accept Rate (TAR, 0 to 1) versus False Accept Rate (FAR, log scale from 10^-3 to 10^0). Curves: COTS-A (mean); COTS-B (mean); COTS-C (mean); MMMF (tanh, sum, mean); MBGS; APEM Fusion; STFRD+PMML; Rank Aggregation.]

Page 15:

Experimental Results

[Figure: example face track pairs. Genuine Examples and Impostor Examples, each shown for High Score and Low Score.]

Page 16:

Experimental Results

• Multi-matcher fusion overcomes the face enrollment problem
  – No. of frames that fail to enroll (587,035 total frames)

• Face detection and landmark localization are crucial to leverage all available frames in a face track

Page 17:

Conclusions

• All three COTS face matchers outperformed current published results on the YTF database

• Fusion of three COTS matchers improved performance

• Subsequent research on face matching should use COTS matchers as baselines

• Face tracks contain redundant facial information
  – Quality-based key-frame selection can be used to reduce the number of frames for matching

Page 18:

Thank you!