Transcript
Page 1: POP:$Person$Re+Iden.fica.on$Post+RankOp.misaccloy/files/iccv_2013_poster_reid.pdf · Introduction Incremental Construction of Affinity Graph POP:$Person$Re+Iden.fica.on$Post+RankOp.misa.on

Incremental Construction of Affinity Graph Introduction

POP:  Person  Re-­‐Iden.fica.on  Post-­‐Rank  Op.misa.on Chen  Change  Loy  

The  Chinese  University  of  Hong  Kong  [email protected]  

Shaogang  Gong  Queen  Mary  University  of  London    

[email protected]  

Project page: h"p://personal.ie.cuhk.edu.hk/~ccloy/project_pop/

1

Negative Propagation over Graph

Post-Rank Optimsation by Negative Mining

Chunxiao  Liu  Tsinghua  University  

[email protected]  

Guijin  Wang  Tsinghua  University  

[email protected]  

Problem:   •  Visual  ambigui.es  and  dispari.es  •  Off-­‐line  learning  scalability  (minimising  user  feedback  clicks)

Contribu.ons:   •  Formulate  a  systema.c  framework  for  fast  re-­‐iden.fica.on  post-­‐rank  op.misa.on,  with  significant  

increase  in  recogni.on  accuracy  (over  30%  increase  for  rank-­‐1  recogni4on  rate  on  VIPeR  dataset).  •  Minimise  human-­‐in-­‐the-­‐loop  effort  by  one-­‐shot  nega.ve  feedback  selec.on  (weak/strong  nega.ve).  •  Formulate  a  new  visual  expansion  model  for  overcoming  insufficient  training  samples.  •  Incremental  affinity  graph  construc.on  for  exploi.ng  large  quan..es  of  unlabelled  data.

visual  ambigui.es  and  dispari.es          probe                  weak                            strong

2

Nega.ve  selec.on  :  A  user  selects  one  (any)  strong  nega.ve  from  the  top  N  ranked  instances,  denoted  as                    .                  

Cross-Camera View Visual Expansion 3

Mo.va.on:  

•  A  single  strong  nega.ve  selected  by  user  is  insufficient  for  learning  a  post-­‐rank  func.on.  •  Due   to   large   feature   inconsistency   between   different   camera   views,   a   probe   image   from   the   probe  

camera  view  cannot  be  readily  used  as  posi.ve  sample  in  the  gallery  view.  

p  Step  1  :  Learning  an  appearance  mapping  space:  

 

 

p  Step  2  :  Genera.ng  synthesised  probe  instance:  

Method: Regression forest

x

p

{x̃p}synthesised probe instances

The  visual  varia4ons  between  a  probe  and  a  gallery  camera  view  are  accounted  by  mul4-­‐output  regression  forest.

•               is  the  regression  predictor  for  the  t-­‐th  regression  tree.  •  The  subscript                                                                              is  a  randomly  sampled  index.    

This  process  can  be  repeated  to  generate  more  synthesised  probe  instances  if  desired.

4

Mo.va.on: Propagate  the  sparse  labelled  samples  to  the  large  quan.ty  of  unlabelled  set.

Method: a)    Clustering  forest    for  graph  construc.on

•  Its  implicit  feature  selec.on  mechanism  is  beneficial  to  mi.ga.ng  noisy  visual  features.  •  It  offers  scalable  and  tractable  solu.on  to  our  incremental  graph  construc.on  requirement            so  to  accommodate  varying  number  of  selected  nega.ves  accumula.ng  on-­‐the-­‐fly.

b)    Incremental  construc.on

p Step  1:    construct  a  graph  for  the  gallery  instances                        .  p Step  2:    include  synthesised  posi.ves  in  the  construc.on  of  the  affinity  graph.  

{xg}

Comparative Evaluations

Ini.al  Ranking

L1-­‐norm RankSVM PRDC MCC

VIPeR one-­‐shot

0 R1 R3 R2 0 R1 R3 R2 9.43 31.90 47.56 42.88 9.43 30.41 50.13 44.21

14.87 16.01 17.85

59.05 71.08 67.06 14.87 58.48 71.58 67.85 59.91 72.03 67.88 16.01 59.49 72.22 68.35 60.13 66.87 64.08 17.85 60.06 66.20 63.64

0 R1 R3 R2 0 R1 R3 R2 29.60 67.60 75.60 73.20 29.60 67.60 81.80 75.40 29.80 31.40 30.00

73.40 79.80 77.20 29.80 73.40 82.40 79.40 70.20 78.00 75.60 31.40 70.20 80.00 77.00 69.80 76.60 73.60 30.00 69.20 80.40 74.80

mul.-­‐shots i-­‐LIDS

one-­‐shot mul.-­‐shots

5

Informa.on  propaga.on

Final  score:

Negative Accumulation 6

(a)  Three-­‐dimensional  embedding  of  gallery  images  obtained  using  mul.-­‐dimensional  scaling  afer  the  first  round  of  nega.ve  selec.on.  

(b)  The  embedding  afer  the  second  round.

Behavioural Studies

8

7

2.6  4mes  reduc4on  of  searching  4me

[1]  B.  Prosser,  et  al,  “Person  re-­‐idenBficaBon  by  support  vector  ranking”,  BMVC,  2010  [2]  W.  Zheng,  et  al,  “Re-­‐idenBficaBon  by  relaBve  distance  comparison”,  TPAMI,  2012.  [3]  A.  Globerson,  et  al,  “Metric  learning  by  collapsing  classers”,  NIPS,  2005.  

(a)  POP  vs.  L1-­‐norm,  RankSVM,  PRDC,  MCC

(b)  POP  vs.  other  Post-­‐Rank  Models

(c)  Benefits  from  Visual  Expansion

ü   The  rank-­‐1  average  recogni.on  rates  are  boosted  by  38.33%  and  40.05%  on  VIPeR  and  i-­‐LIDS  respec.vely    for  all  four  different  ini.al  ranking  models.  

ü   Improvements  of  over  12.00%  and  3.70%  at  rank-­‐5  recogni.on  rate  on  VIPeR  and  i-­‐LIDS  respec.vely,  afer  4  rounds  feedback.

ü  Improving  POP  from  37.66%  to  51.39%  afer  4  rounds  feedback  on  average.  

0 50 100 150 200 250 3000

50

100

150

200

Initial Rank Order

Sear

ch T

ime

(/sec

onds

)

Exhaustive searchPOP

5 10 15 20 25 300

0.2

0.4

0.6

0.8

1

CMC order

Rec

ogni

tion

rate

RankSVMIteration 1Iteration 2Iteration 3

VIPeR VIPeR i-LIDS i-LIDS

5 10 15 20 25 300

0.2

0.4

0.6

0.8

1

CMC order

Rec

ogni

tion

rate

PRDCIteration 1Iteration 2Iteration 3

5 10 15 20 25 300

0.2

0.4

0.6

0.8

1

CMC order

Rec

ogni

tion

rate

RankSVMIteration 1Iteration 2Iteration 3

VIPeR VIPeR i-LIDS i-LIDS

5 10 15 20 25 300

0.2

0.4

0.6

0.8

1

CMC order

Rec

ogni

tion

rate

viper(one click)

RankSVMIteration 1Iteration 2Iteration 3

5 10 15 20 25 300

0.2

0.4

0.6

0.8

1

CMC order

Rec

ogni

tion

rate

viper(one click)

PRDCIteration 1Iteration 2Iteration 3

Top Related