02 - iccv2009_classical_methods - bag of words models - part-based models - and discriminative...

Upload: antiw

Post on 09-Apr-2018


  • 8/8/2019 02 - ICCV2009_classical_methods - Bag of Words Models - Part-Based Models - And Discriminative Models

    1/56

    Classical Methods for Object Recognition

    Rob Fergus (NYU)


    2/56

    Classical Methods

    1. Bag of words approaches

    2. Parts and structure approaches

    3. Discriminative methods

    Condensed version of sections from 2007 edition of tutorial


    3/56

    Bag of Words Models


    4/56

    Object, Bag of words


    5/56

    Bag of Words

    Independent features

    Histogram representation


    6/56

    1. Feature detection and representation

    Local interest operator or regular grid

    Detect patches

    [Mikolajczyk and Schmid 02]

    [Matas, Chum, Urban & Pajdla, 02]

    [Sivic & Zisserman, 03]

    Normalize patch

    Compute descriptor, e.g. SIFT [Lowe 99]

    Slide credit: Josef Sivic


    7/56

    1. Feature detection and representation


    8/56

    2. Codewords dictionary formation

    128-D SIFT space


    9/56

    2. Codewords dictionary formation

    Vector quantization

    [Figure: codewords (+) marked in 128-D SIFT space]

    Slide credit: Josef Sivic
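    A minimal sketch of codebook formation by k-means, assuming toy 2-D descriptors in place of 128-D SIFT vectors. The function names and the evenly spaced initialisation are illustrative choices, not taken from the slides.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_codeword(descriptor, codebook):
    """Vector quantization: index of the codeword closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))

def build_codebook(descriptors, num_words, iterations=20):
    """Plain k-means; returns num_words cluster centres (the codewords)."""
    # Assumption: initialise from evenly spaced descriptors to stay deterministic.
    codebook = [list(descriptors[(k * len(descriptors)) // num_words])
                for k in range(num_words)]
    for _ in range(iterations):
        clusters = [[] for _ in range(num_words)]
        for d in descriptors:
            clusters[nearest_codeword(d, codebook)].append(d)
        for k, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster is empty
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

# Toy descriptors forming two well-separated groups.
toy_descriptors = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0),
                   (5.0, 5.1), (5.1, 5.0), (5.0, 5.0)]
toy_codebook = build_codebook(toy_descriptors, num_words=2)
```

    In practice the codebook would be learned from descriptors pooled over many training images, with far more codewords.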


    10/56

    Image patch examples of codewords

    Sivic et al. 2005


    11/56

    Image representation

    Histogram of features assigned to each cluster

    [Figure: histogram; x-axis: codewords, y-axis: frequency]
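    The histogram representation can be sketched as follows, assuming a codebook is already available. The toy codebook and descriptors are made up, and the normalisation step is an illustrative choice.

```python
def bow_histogram(descriptors, codebook):
    """Fraction of descriptors assigned to each codeword (nearest-centre rule)."""
    counts = [0] * len(codebook)
    for d in descriptors:
        distances = [sum((x - c) ** 2 for x, c in zip(d, word)) for word in codebook]
        counts[distances.index(min(distances))] += 1
    total = sum(counts)
    # Normalise so images with different numbers of patches are comparable.
    return [c / total for c in counts] if total else counts

example_codebook = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
example_descriptors = [(0.2, -0.1), (4.8, 5.3), (5.1, 4.9), (0.1, 0.3)]
example_histogram = bow_histogram(example_descriptors, example_codebook)
```

    Note that the positions of the patches are discarded: only the counts per codeword survive.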


    12/56

    Uses of BoW representation

    Treat as feature vector for standard classifier

    e.g. SVM

    Cluster BoW vectors over image collection

    Discover visual themes

    Hierarchical models

    Decompose scene/object


    13/56

    BoW as input to classifier

    SVM for object classification: Csurka, Bray, Dance & Fan, 2004

    Naïve Bayes: see 2007 edition of this course
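    A minimal sketch of a Naïve Bayes classifier on BoW counts, assuming a multinomial model with Laplace smoothing. This is a generic textbook formulation, not the code from the 2007 edition, and the class names and counts are made up.

```python
import math

def train_naive_bayes(histograms, labels, alpha=1.0):
    """Estimate log P(class) and log P(word | class) from codeword counts."""
    num_words = len(histograms[0])
    model = {}
    for c in sorted(set(labels)):
        rows = [h for h, y in zip(histograms, labels) if y == c]
        word_counts = [sum(col) for col in zip(*rows)]
        total = sum(word_counts) + alpha * num_words  # Laplace smoothing
        model[c] = {
            "log_prior": math.log(len(rows) / len(histograms)),
            "log_likelihood": [math.log((n + alpha) / total) for n in word_counts],
        }
    return model

def classify(model, histogram):
    """Pick the class with the highest log posterior (words treated as independent)."""
    def score(c):
        return model[c]["log_prior"] + sum(
            n * lw for n, lw in zip(histogram, model[c]["log_likelihood"]))
    return max(model, key=score)

# Hypothetical training data: codeword counts for a 3-word dictionary.
train_counts = [[8, 1, 0], [7, 2, 1], [1, 0, 9], [0, 2, 8]]
train_labels = ["face", "face", "car", "car"]
nb_model = train_naive_bayes(train_counts, train_labels)
```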


    14/56

    Clustering BoW vectors

    Use models from text document literature

    Probabilistic latent semantic analysis (pLSA)

    Latent Dirichlet allocation (LDA)

    See 2007 edition for explanation/code

    d = image, w = visual word, z = topic (cluster)
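    For reference, in this notation pLSA models each image as a mixture of topics. The expression below is the standard pLSA decomposition from the text literature, not transcribed from the slide:

```latex
P(w \mid d) = \sum_{z} P(w \mid z)\, P(z \mid d)
```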


    15/56

    Clustering BoW vectors

    Scene classification (supervised)

    Vogel & Schiele, 2004

    Fei-Fei & Perona, 2005

    Bosch, Zisserman & Munoz, 2006

    Object discovery (unsupervised)

    Each cluster corresponds to visual theme

    Sivic, Russell, Efros, Freeman & Zisserman, 2005


    16/56

    Related work

    Early bag of words models: mostly texture recognition

    Cula & Dana, 2001; Leung & Malik, 2001; Mori, Belongie & Malik, 2001; Schmid, 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

    Hierarchical Bayesian models for documents (pLSA, LDA, etc.)

    Hofmann, 1999; Blei, Ng & Jordan, 2004; Teh, Jordan, Beal & Blei, 2004

    Object categorization

    Csurka, Bray, Dance & Fan, 2004; Sivic, Russell, Efros, Freeman & Zisserman, 2005; Sudderth, Torralba, Freeman & Willsky, 2005

    Natural scene categorization

    Vogel & Schiele, 2004; Fei-Fei & Perona, 2005; Bosch, Zisserman & Munoz, 2006


    17/56

    What about spatial info?


    18/56

    Adding spatial info. to BoW

    Feature level

    Spatial influence through correlogram features: Savarese, Winn and Criminisi, CVPR 2006


    19/56

    Adding spatial info. to BoW

    Feature level

    Generative models

    Sudderth, Torralba, Freeman & Willsky, 2005,2006

    Hierarchical model of scene/objects/parts


    20/56

    Adding spatial info. to BoW

    Feature level

    Generative models

    Sudderth, Torralba, Freeman & Willsky, 2005,2006

    Niebles & Fei-Fei, CVPR 2007

    [Figure: parts P1, P2, P3, P4 and background (Bg); labels: Image, w]


    21/56

    Adding spatial info. to BoW

    Feature level

    Generative models

    Discriminative methods

    Lazebnik, Schmid & Ponce, 2006


    22/56

    Part-based Models


    23/56

    Problem with bag-of-words

    All have equal probability for bag-of-words methods

    Location information is important

    BoW + location still doesn't give correspondence


    24/56

    Model: Parts and Structure


    25/56

    Representation

    Object as set of parts

    Generative representation

    Model:

    Relative locations between parts

    Appearance of part

    Issues:

    How to model location

    How to represent appearance

    How to handle occlusion/clutter

    [Figure from Fischler & Elschlager 73]


    26/56

    History of Parts and Structure approaches

    Fischler & Elschlager 1973

    Yuille 91

    Brunelli & Poggio 93

    Lades, v.d. Malsburg et al. 93

    Cootes, Lanitis, Taylor et al. 95

    Amit & Geman 95, 99

    Perona et al. 95, 96, 98, 00, 03, 04, 05

    Felzenszwalb & Huttenlocher 00, 04

    Crandall & Huttenlocher 05, 06

    Leibe & Schiele 03, 04

    Many papers since 2000


    27/56

    Sparse representation

    + Computationally tractable (10^5 pixels -> 10^1 to 10^2 parts)

    + Generative representation of class

    + Avoid modeling global variability

    + Success in specific object recognition

    - Throw away most image information

    - Parts need to be distinctive to separate from other


    28/56

    The correspondence problem

    Model with P parts

    Image with N possible assignments for each part

    Consider mapping to be 1-1

    N^P combinations!!!
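    To get a feel for the size of this search space, a small sketch with assumed values N = 100 candidate locations and P = 6 parts (both numbers are illustrative). It also shows the exact count when the mapping is strictly 1-1, which is slightly below N^P.

```python
import math

def assignments_with_repeats(num_candidates, num_parts):
    """Each part picks any of the N candidates independently: N^P."""
    return num_candidates ** num_parts

def one_to_one_assignments(num_candidates, num_parts):
    """No candidate used twice: N! / (N - P)! ordered selections."""
    return math.perm(num_candidates, num_parts)

n_example, p_example = 100, 6  # hypothetical values
upper_bound = assignments_with_repeats(n_example, p_example)
exact_one_to_one = one_to_one_assignments(n_example, p_example)
```

    Either way the count grows exponentially in the number of parts, which is why exhaustive matching is impractical.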


    29/56

    Different connectivity structures

    Figure from "Sparse Flexible Models of Local Features", Gustavo Carneiro and David Lowe, ECCV 2006

    Complexities shown in the figure: O(N^6), O(N^2), O(N^3), O(N^2)

    Fergus et al. 03

    Fei-Fei et al. 03

    Crandall et al. 05

    Fergus et al. 05

    Crandall et al. 05

    Felzenszwalb & Huttenlocher 00

    Bouchard & Triggs 05

    Carneiro & Lowe 06

    Csurka 04

    Vasconcelos 00


    30/56

    Efficient methods

    Distance transforms

    Felzenszwalb and Huttenlocher 00 and 05

    O(N^2 P) -> O(NP) for tree-structured models

    Removes need for region detectors
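    A minimal sketch of the idea behind distance transforms, assuming a 1-D image and an L1 (absolute-offset) deformation cost. The method of Felzenszwalb and Huttenlocher also handles quadratic costs with a lower-envelope algorithm; the simpler L1 case is used here only to show how a minimum over all N source locations can be computed for all N targets in linear time.

```python
def l1_distance_transform(cost):
    """d[i] = min over j of (cost[j] + |i - j|), in two linear passes."""
    d = list(cost)
    for i in range(1, len(d)):            # forward pass: best source to the left
        d[i] = min(d[i], d[i - 1] + 1)
    for i in range(len(d) - 2, -1, -1):   # backward pass: best source to the right
        d[i] = min(d[i], d[i + 1] + 1)
    return d

def brute_force_transform(cost):
    """Reference O(N^2) version of the same minimisation."""
    n = len(cost)
    return [min(cost[j] + abs(i - j) for j in range(n)) for i in range(n)]

# Hypothetical appearance cost of one part at each of 8 locations.
part_cost = [9, 4, 7, 0, 8, 8, 2, 9]
fast_result = l1_distance_transform(part_cost)
```

    Here `fast_result[i]` is the best cost of placing the part anywhere, given that its ideal position is `i`.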


    31/56

    How much does shape help?

    Crandall, Felzenszwalb, Huttenlocher, CVPR 05

    Shape variance increases with increasing model complexity

    Do get some benefit from shape


    32/56

    Appearance representation

    Decision trees

    Figure from Winn & Shotton, CVPR

    SIFT

    PCA

    [Lepetit and Fua, CVPR 2005]


    33/56

    Learn Appearance

    Generative models of appearance

    Can learn with little supervision

    E.g. Fergus et al. 03

    Discriminative training of part appearance model

    SVM part detectors

    Felzenszwalb, McAllester, Ramanan, CVPR 2008

    Much better performance


    34/56

    Felzenszwalb, McAllester, Ramanan, CVPR 2008

    2-scale model: whole object and parts

    HOG representation + SVM training to obtain robust part detectors

    Distance transforms allow examination of every location in the image


    35/56

    Hierarchical Representations

    Pixels -> Pixel groupings -> Parts -> Object

    [Images from Amit 98]

    Multi-scale approach increases number of low-level features

    Amit and Geman 98

    Ullman et al.

    Bouchard & Triggs 05

    Zhu and Mumford

    Jin & Geman 06

    Zhu & Yuille 07

    Fidler & Leonardis 07


    36/56

    Stochastic Grammar of Images

    S.C. Zhu et al. and D. Mumford


    37/56

    Context and Hierarchy in a Probabilistic Image Model, Jin & Geman (2006)

    e.g. animals, trees, rocks

    e.g. contours, intermediate objects

    e.g. linelets, curvelets, T-junctions

    e.g. discontinuities, gradient

    animal head instantiated by tiger head

    animal head instantiated by bear head


    38/56

    A Hierarchical Compositional System for Rapid Object Detection

    Long Zhu, Alan L. Yuille, 2007

    Able to learn # parts at each level


    39/56

    Learning a Compositional Hierarchy of Object Structure

    Fidler & Leonardis, CVPR 07; Fidler, Boben & Leonardis, CVPR 2008

    The architecture

    Parts model

    Learned parts


    40/56

    Parts and Structure models: Summary

    Explicit notion of correspondence between image and model

    Efficient methods for large # parts and # positions in image

    With powerful part detectors, can get state-of-the-art performance


    41/56

    Classifier-based methods


    42/56

    Classifier based methods

    Object detection and recognition is formulated as a classification problem.

    The image is partitioned into a set of overlapping windows, and a decision is taken at each window about whether it contains a target object or not.

    Where are the screens?

    Bag of image patches

    [Figure: in some feature space, a decision boundary separates "Computer screen" from "Background"]
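    The overlapping-window scan can be sketched as below. The window size, stride, and the `classify_window` callback are assumptions for illustration; a real detector would also repeat the scan over several image scales.

```python
def sliding_windows(image_width, image_height, window_size, stride):
    """Yield (left, top, right, bottom) boxes; stride < window size gives overlap."""
    win_w, win_h = window_size
    for top in range(0, image_height - win_h + 1, stride):
        for left in range(0, image_width - win_w + 1, stride):
            yield (left, top, left + win_w, top + win_h)

def detect(image_width, image_height, window_size, stride, classify_window):
    """Keep every window the classifier labels as object (positive score)."""
    return [box
            for box in sliding_windows(image_width, image_height, window_size, stride)
            if classify_window(box) > 0]

# Toy 8x6 image scanned with 4x4 windows that overlap by half.
example_boxes = list(sliding_windows(8, 6, window_size=(4, 4), stride=2))
```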


    43/56

    Discriminative vs. generative

    [Figure: three plots over x = data, titled "Generative model", "Discriminative model", and "Classification function"; annotations: "(The artist)" and "(The lousy painter)"]
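    One way to read the three plots: a generative model describes how the data x is distributed within each class, a discriminative model gives the probability of the class given x, and the classification function keeps only the decision. A minimal sketch, assuming two 1-D Gaussian classes with made-up parameters:

```python
import math

def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def posterior_object(x, object_params, background_params, prior_object=0.5):
    """p(object | x) via Bayes' rule from the two class-conditional densities."""
    p_obj = gaussian_pdf(x, *object_params) * prior_object
    p_bg = gaussian_pdf(x, *background_params) * (1 - prior_object)
    return p_obj / (p_obj + p_bg)

def classification_function(x, object_params, background_params):
    """+1 for object, -1 for background: only the decision is kept."""
    return 1 if posterior_object(x, object_params, background_params) > 0.5 else -1

# Hypothetical (mean, std) of the feature for each class.
OBJECT = (20.0, 5.0)
BACKGROUND = (45.0, 8.0)
```

    A discriminative method would fit the posterior or the decision function directly, without modelling the two densities.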


    44/56

    Formulation: binary classification

    Features x = x1, x2, x3, ..., xN (training data); xN+1, xN+2, ..., xN+M (test data)

    Labels y = +1, -1, -1, -1 (training data); ?, ?, ? (test data)

    Training data: each image patch is labeled as containing the object or background

    Classification function: $\hat{y} = F(x)$, where $F$ belongs to some family of functions

    Minimize misclassification error

    (Not that simple: we need some guarantees that there will be generalization)
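    A minimal sketch of this formulation, assuming 1-D features and taking threshold functions as the family. The family, the data, and the function names are illustrative choices, and only training error is measured, which says nothing about generalization.

```python
def misclassification_error(predict, features, labels):
    """Fraction of examples whose predicted label differs from the true one."""
    wrong = sum(1 for x, y in zip(features, labels) if predict(x) != y)
    return wrong / len(labels)

def best_threshold_classifier(features, labels):
    """Search F(x) = s if x >= t else -s for the lowest training error."""
    best = None
    for t in sorted(set(features)):
        for s in (+1, -1):
            predict = lambda x, t=t, s=s: s if x >= t else -s
            err = misclassification_error(predict, features, labels)
            if best is None or err < best[0]:
                best = (err, t, s)
    return best  # (training error, threshold, sign)

# Hypothetical training set: one feature value per image patch.
train_x = [0.2, 0.9, 0.4, 0.8, 0.1, 0.7]
train_y = [-1, +1, -1, +1, -1, +1]
best_fit = best_threshold_classifier(train_x, train_y)
```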


    45/56

    Face detection

    The representation and matching of pictorial structures - Fischler, Elschlager (1973)

    Face recognition using eigenfaces - M. Turk and A. Pentland (1991)

    Human Face Detection in Visual Scenes - Rowley, Baluja, Kanade (1995)

    Graded Learning for Object Detection - Fleuret, Geman (1999)

    Robust Real-time Object Detection - Viola, Jones (2001)

    Feature Reduction and Hierarchy of Classifiers for Fast Object Detection in Video Images - Heisele, Serre, Mukherjee, Poggio (2001)


    46/56

    Features: Haar filters

    Haar filters and integral image

    Viola and Jones, ICCV 2001

    Haar wavelets

    Papageorgiou & Poggio (2000)
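    A minimal sketch of the integral image and a two-rectangle Haar-like feature, assuming a grayscale image stored as a list of rows. The function names and the toy image are illustrative.

```python
def integral_image(image):
    """ii[r][c] = sum of image[0..r-1][0..c-1]; padded with a zero row and column."""
    rows, cols = len(image), len(image[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ii[r + 1][c + 1] = image[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 using four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_haar(ii, top, left, height, width):
    """Left half minus right half of a window; width is assumed to be even."""
    mid = left + width // 2
    return (box_sum(ii, top, left, top + height, mid)
            - box_sum(ii, top, mid, top + height, left + width))

# Toy image: dark left half, bright right half.
toy_image = [[1, 1, 9, 9],
             [1, 1, 9, 9],
             [1, 1, 9, 9]]
toy_ii = integral_image(toy_image)
```

    Once the integral image is built, every rectangle sum costs the same regardless of its size, which is what makes dense evaluation of such features cheap.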


    47/56

    Features: Edges and chamfer distance

    Gavrila, Philomin, ICCV 1999
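    A minimal sketch of chamfer matching, assuming a binary edge map and a city-block distance transform. This is a generic illustration, not the hierarchical matching scheme of the cited paper, and all names are hypothetical.

```python
def distance_to_edges(edge_map):
    """City-block distance from every pixel to the nearest edge pixel (two passes)."""
    rows, cols = len(edge_map), len(edge_map[0])
    far = rows + cols  # larger than any city-block distance inside the map
    dist = [[0 if edge_map[r][c] else far for c in range(cols)] for r in range(rows)]
    for r in range(rows):                      # forward pass
        for c in range(cols):
            if r > 0:
                dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
            if c > 0:
                dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)
    for r in range(rows - 1, -1, -1):          # backward pass
        for c in range(cols - 1, -1, -1):
            if r < rows - 1:
                dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
            if c < cols - 1:
                dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist

def chamfer_distance(template_points, dist_map, offset=(0, 0)):
    """Mean distance from each shifted template edge point to the nearest image edge."""
    dr, dc = offset
    return sum(dist_map[r + dr][c + dc] for r, c in template_points) / len(template_points)

# Toy edge map with a vertical edge in column 2, and a vertical-line template.
edge_image = [[0, 0, 1, 0] for _ in range(4)]
edge_distances = distance_to_edges(edge_image)
line_template = [(0, 0), (1, 0), (2, 0)]
```

    A low chamfer distance at some offset means the template's edges lie close to edges in the image at that position.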


    48/56

    Features: Edge fragments

    Weak detector = k edge fragments and threshold. Chamfer distance uses 8 orientation planes

    Opelt, Pinz, Zisserman, ECCV 2006


    49/56

    Features: Histograms of oriented gradients

    Dalal & Triggs, 2006

    Shape context

    Belongie, Malik, Puzicha, NIPS 2000

    SIFT, D. Lowe, ICCV 1999


    50/56

    Classifier: Nearest Neighbor

    Berg, Berg and Malik, 2005

    10^6 examples

    Shakhnarovich, Viola, Darrell, 2003
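    A minimal sketch of nearest-neighbor classification by exhaustive search, assuming small feature vectors and made-up labels. With around 10^6 examples, as on the slide, a linear scan like this would be slow, and this sketch does not attempt any speed-up.

```python
from collections import Counter

def knn_classify(query, examples, k=1):
    """examples: list of (feature_vector, label); majority vote over the k closest."""
    by_distance = sorted(
        examples,
        key=lambda ex: sum((q - x) ** 2 for q, x in zip(query, ex[0])))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled examples in a 2-D feature space.
labeled_examples = [((0.0, 0.0), "background"), ((0.0, 1.0), "background"),
                    ((5.0, 5.0), "object"), ((6.0, 5.0), "object"),
                    ((5.0, 6.0), "object")]
```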


    51/56

    Classifier: Neural Networks

    Fukushima's Neocognitron, 1980

    Rowley, Baluja, Kanade 1998

    LeCun, Bottou, Bengio, Haffner 1998

    Serre et al. 2005

    LeNet convolutional architecture (LeCun 1998)

    Riesenhuber, M. and Poggio, T. 1999


    52/56

    Classifier: Support Vector Machine

    Guyon, Vapnik

    Heisele, Serre, Poggio, 2001

    Dalal & Triggs, CVPR 2005

    [Figure panels: image; HOG descriptor; HOG descriptor weighted by +ve SVM weights; HOG descriptor weighted by -ve SVM weights]

    HOG = Histogram of Oriented Gradients

    Learn weighting of descriptor with linear SVM
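    A simplified sketch of the two ingredients named on this slide: an orientation histogram for one cell, and a linear SVM score as a weighted sum of descriptor entries. It assumes 9 unsigned orientation bins and omits the bin interpolation and block normalization that a full HOG descriptor uses; the weights would come from SVM training, which is not shown.

```python
import math

def cell_orientation_histogram(cell, num_bins=9):
    """Gradient-magnitude-weighted histogram of unsigned orientations (0 to 180 deg)."""
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * num_bins
    for r in range(1, rows - 1):          # interior pixels only (central differences)
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / (180.0 / num_bins)) % num_bins] += magnitude
    return hist

def linear_svm_score(descriptor, weights, bias):
    """Decision value w . x + b; positive means 'object' under the usual convention."""
    return sum(w * x for w, x in zip(weights, descriptor)) + bias

# Toy 4x4 cells: one with a vertical edge, one with a horizontal edge.
vertical_edge_cell = [[0, 0, 10, 10] for _ in range(4)]
horizontal_edge_cell = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]
```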


    53/56

    Classifier: Boosting

    Viola & Jones 2001

    Haar features via Integral Image

    Cascade

    Real-time performance

    Torralba et al., 2004

    Part-based Boosting

    Each weak classifier is a part

    Part location modeled by offset mask
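    A minimal sketch of boosting, assuming discrete AdaBoost with threshold stumps on a single 1-D feature. This is the generic algorithm, not the cascade of Viola & Jones or the part-based variant of Torralba et al.; the data and names are made up.

```python
import math

def train_adaboost(features, labels, rounds=3):
    """Discrete AdaBoost over threshold stumps; labels are +1 or -1."""
    n = len(features)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, threshold, sign)
    for _ in range(rounds):
        best = None
        for t in sorted(set(features)):
            for s in (+1, -1):
                preds = [s if x >= t else -s for x in features]
                err = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
                if best is None or err < best[0]:
                    best = (err, t, s, preds)
        err, t, s, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, s))
        # Increase the weight of misclassified examples, then renormalise.
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, labels, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def strong_classify(ensemble, x):
    """Sign of the alpha-weighted vote of all weak classifiers."""
    score = sum(alpha * (s if x >= t else -s) for alpha, t, s in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single threshold can separate.
boost_x = [1, 2, 3, 4, 5, 6]
boost_y = [+1, +1, -1, -1, +1, +1]
boosted = train_adaboost(boost_x, boost_y, rounds=3)
```

    On this toy set no single stump is correct everywhere, but the weighted vote of three stumps is.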


    54/56

    Summary of classifier-based methods

    Many techniques for training discriminative models are used

    Many not mentioned here

    Conditional random fields

    Kernels for object recognition

    Learning object similarities

    .....


    55/56

    Dalal & Triggs HOG detector


    56/56

    Dalal & Triggs HOG detector

    [Figure panels: image; HOG descriptor; HOG descriptor weighted by +ve SVM weights; HOG descriptor weighted by -ve SVM weights]

    HOG = Histogram of Oriented Gradients

    Careful selection of spatial bin size / # orientation bins / normalization

    Learn weighting of descriptor with linear SVM