
One Shot Learning via Compositions of Meaningful Patches

Alex Wong, Alan L. Yuille
CCVL, University of California, Los Angeles
http://ccvl.stat.ucla.edu/

motivation

Current state-of-the-art algorithms perform very well on most common datasets when trained on thousands of examples. However, humans are able to learn a concept from very few examples, perhaps even just one.

what is one shot learning?

One shot learning is an object categorization task where very few examples (1-5) are given for training.

our approach

• Learn a meaningful patch-based representation of the underlying structure of an object without human supervision
• Build a compositional model composed of a set of compact dictionaries of meaningful patches
• Reconstruct the target image with deformations of the meaningful patch dictionaries by patch matching
• Select the class of the best proposed reconstruction as the label (a sketch of this classify-by-reconstruction step follows below)
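The final step can be sketched concretely. This is a minimal illustration, not the authors' implementation: the per-class dictionary structure, the reconstruct() callback, and the squared-error score are assumptions used only to show classification by best reconstruction.

# Minimal sketch of classification by best reconstruction (not the authors' code).
# Assumes class_dictionaries maps each class label to the list of patch arrays
# learned from its one-shot training example(s), and that reconstruct(image, patches)
# is a placeholder for the patch-matching step that assembles a reconstruction
# from (deformed) dictionary patches.
import numpy as np

def reconstruction_error(test_image, reconstruction):
    """Pixel-wise squared error between the test image and its reconstruction."""
    diff = test_image.astype(float) - reconstruction.astype(float)
    return float(np.sum(diff ** 2))

def classify_by_reconstruction(test_image, class_dictionaries, reconstruct):
    """Label the test image with the class whose dictionary reconstructs it best."""
    best_label, best_error = None, np.inf
    for label, patches in class_dictionaries.items():
        recon = reconstruct(test_image, patches)
        err = reconstruction_error(test_image, recon)
        if err < best_error:
            best_label, best_error = label, err
    return best_label

The point of the design is that no discriminative classifier is ever trained: the per-class dictionaries built from one (or a few) examples compete to explain the test image, and the class with the cheapest explanation wins.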

[Pipeline overview figure] Training Images → Segmented Images into Parts (Features) → Compositional Model (Parts Representing Each Region, Dictionaries of Deformed Parts) → Test Image → Reconstruction → Select Best Fit

conclusion

• Our compositional model outperforms popular algorithms on the recognition task under one shot learning
• The extracted features are semantically meaningful
• The model generalizes beyond the training set and demonstrates transferability between separate datasets

experimental  results  

feature extraction

• Symmetry axis acts as a robust object descriptor
• Branch points separate one meaningful part from another
• Small segments are merged with nearby meaningful parts (a sketch of this skeleton-based extraction follows below)
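A minimal sketch of the skeleton-based idea, assuming scikit-image and SciPy are available. The branch-point rule (a skeleton pixel with three or more skeleton neighbors) and the merge heuristic for small segments are illustrative assumptions, not the authors' exact feature-extraction procedure.

# Illustrative skeleton-based part extraction (assumption-laden sketch, not the
# authors' pipeline): the symmetry axis is approximated by the morphological
# skeleton, which is cut at branch points to obtain candidate parts.
import numpy as np
from scipy import ndimage
from skimage.morphology import skeletonize

def extract_parts(binary_image, min_segment_size=10):
    """Split a binary object into parts by cutting its skeleton at branch points."""
    skeleton = skeletonize(binary_image > 0)  # symmetry-axis descriptor
    # Count skeleton neighbors of every pixel; 3 or more marks a branch point.
    neighbors = ndimage.convolve(skeleton.astype(int), np.ones((3, 3), int),
                                 mode='constant') - skeleton
    branch_points = skeleton & (neighbors >= 3)
    # Removing branch points disconnects the skeleton into candidate segments.
    segments, num = ndimage.label(skeleton & ~branch_points)
    parts = [segments == i for i in range(1, num + 1)]
    # Merge tiny segments into the nearest sufficiently large part.
    large = [p for p in parts if p.sum() >= min_segment_size]
    small = [p for p in parts if p.sum() < min_segment_size]
    for s in small:
        if not large:
            break
        sy, sx = np.nonzero(s)
        cy, cx = sy.mean(), sx.mean()
        dists = []
        for p in large:
            py, px = np.nonzero(p)
            dists.append(np.min((py - cy) ** 2 + (px - cx) ** 2))
        large[int(np.argmin(dists))] |= s
    return large

Cutting the symmetry axis at branch points is what turns a single silhouette into a handful of reusable parts, and folding tiny fragments into their nearest neighbor keeps the part set compact.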

compositional model

• Similar parts, defined by a high match score via Normalized Cross Correlation, are merged to create a compact dictionary (a sketch of this NCC-based merging follows below)
• An AND-OR graph of the part relations is constructed for m patches for samples t and u
• Deformations are applied to the meaningful patches
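A minimal sketch of NCC-based merging, assuming NumPy; the 0.8 threshold and the greedy averaging merge are illustrative assumptions rather than the authors' exact dictionary construction.

# Illustrative dictionary compaction via Normalized Cross-Correlation (NCC).
# Patches whose NCC with an existing entry exceeds the threshold are merged
# into that entry; otherwise they start a new dictionary entry.
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equal-sized patches, in [-1, 1]."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def build_compact_dictionary(patches, threshold=0.8):
    """Greedily merge patches whose NCC exceeds the threshold into one entry."""
    dictionary = []
    for patch in patches:
        merged = False
        for i, entry in enumerate(dictionary):
            if patch.shape == entry.shape and ncc(patch, entry) > threshold:
                # Merge by averaging; the representative absorbs the new patch.
                dictionary[i] = (entry + patch) / 2.0
                merged = True
                break
        if not merged:
            dictionary.append(patch.astype(float))
    return dictionary

Because NCC normalizes for mean and scale, visually similar strokes drawn from different examples collapse into one dictionary entry, which is what keeps each class dictionary compact.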

acknowledgements

We would like to thank Brian Taylor for performing experiments and editing the manuscript. This work was supported by NSF STC award CCF-1231216 and ONR N00014-12-1-0883.

sample reconstructions

[Sample reconstruction figures] Panels: MNIST; USPS (trained on MNIST); Images in the Wild (trained on MNIST); Misclassifications. *Left image denotes the test image, right image denotes the reconstruction.

Misclassified examples (truth, predicted label): truth=1 label=2, truth=1 label=6, truth=2 label=7, truth=2 label=4, truth=3 label=2, truth=3 label=5, truth=4 label=9, truth=4 label=9, truth=5 label=4, truth=5 label=3, truth=6 label=4, truth=6 label=5, truth=7 label=3, truth=7 label=2, truth=8 label=6, truth=8 label=2, truth=9 label=8, truth=9 label=4, truth=0 label=3, truth=0 label=9