02 - iccv2009_classical_methods - bag of words models - part-based models - and discriminative...
TRANSCRIPT
-
8/8/2019 02 - ICCV2009_classical_methods - Bag of Words Models - Part-Based Models - And Discriminative Models
1/56
Classical Methods
for Object Recognition
Rob Fergus (NYU)
-
Classical Methods
1. Bag of words approaches
2. Parts and structure approaches
3. Discriminative methods
4. Condensed version of sections from 2007 edition of tutorial
-
Bag of Words Models
-
Object
Bag of words
-
Bag of Words
Independent features
Histogram representation
-
1. Feature detection and representation
Detect patches
[Mikolajczyk and Schmid 02]
[Matas, Chum, Urban & Pajdla, 02]
[Sivic & Zisserman, 03]
Local interest operator or regular grid
Normalize patch
Compute descriptor
e.g. SIFT [Lowe 99]
Slide credit: Josef Sivic
-
1. Feature detection and representation
-
2. Codewords dictionary formation
128-D SIFT space
-
2. Codewords dictionary formation
Vector quantization
128-D SIFT space
Codewords (marked +)
Slide credit: Josef Sivic
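The slide names vector quantization but not a specific algorithm. A common choice is k-means, so the sketch below assumes plain k-means (Lloyd's algorithm) with Euclidean distance. The function name `build_codebook`, the parameters, and the random toy data standing in for SIFT descriptors are all illustrative assumptions, not taken from the tutorial.

```python
import numpy as np

def build_codebook(descriptors, k, n_iters=20, seed=0):
    """Cluster descriptors (n x d array) into k codewords with plain k-means."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from k distinct descriptors chosen at random.
    start = rng.choice(len(descriptors), size=k, replace=False)
    codewords = descriptors[start].astype(float)
    for _ in range(n_iters):
        # Squared Euclidean distance from every descriptor to every codeword.
        d2 = ((descriptors[:, None, :] - codewords[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # an empty cluster keeps its previous codeword
                codewords[j] = members.mean(axis=0)
    return codewords

# Toy usage: random 128-D vectors stand in for real SIFT descriptors.
toy_descriptors = np.random.default_rng(1).random((500, 128))
codebook = build_codebook(toy_descriptors, k=10)
```

Each row of `codebook` plays the role of one "+" in the 128-D SIFT space figure.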
-
Image patch examples of codewords
Sivic et al. 2005
-
Image representation
Histogram of features assigned to each cluster
[Figure: histogram; x-axis: codewords, y-axis: frequency]
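A minimal sketch of the histogram step, assuming nearest-codeword assignment under Euclidean distance. The function name `bow_histogram` and the tiny 2-D toy data are illustrative assumptions; real descriptors would be 128-D SIFT vectors.

```python
import numpy as np

def bow_histogram(descriptors, codewords):
    """Count how many descriptors fall closest to each codeword."""
    d2 = ((descriptors[:, None, :] - codewords[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of nearest codeword per descriptor
    return np.bincount(words, minlength=len(codewords))

# Toy 2-D example: three codewords, four descriptors.
codewords = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
descriptors = np.array([[0.1, 0.2], [9.0, 9.5], [0.3, -0.1], [1.0, 9.0]])
hist = bow_histogram(descriptors, codewords)
print(hist)  # [2 1 1]
```

The resulting vector has one bin per codeword and discards where in the image each feature was found.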
-
Uses of BoW representation
Treat as feature vector for standard classifier
e.g. SVM
Cluster BoW vectors over image collection
Discover visual themes
Hierarchical models
Decompose scene/object
-
BoW as input to classifier
SVM for object classification
Csurka, Bray, Dance & Fan, 2004
Naïve Bayes
See 2007 edition of this course
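As a rough illustration of feeding BoW histograms to a classifier, here is a small multinomial Naïve Bayes sketch with Laplace smoothing. This is not the code from the 2007 edition; the function names, the smoothing constant `alpha`, and the toy 3-codeword histograms are assumptions made for the example. It also assumes every class has at least one training histogram.

```python
import numpy as np

def train_naive_bayes(histograms, labels, n_classes, alpha=1.0):
    """Estimate log class priors and log P(word | class) from BoW histograms."""
    histograms = np.asarray(histograms, dtype=float)
    labels = np.asarray(labels)
    log_prior = np.zeros(n_classes)
    log_word = np.zeros((n_classes, histograms.shape[1]))
    for c in range(n_classes):
        rows = histograms[labels == c]
        log_prior[c] = np.log(len(rows) / len(histograms))
        counts = rows.sum(axis=0) + alpha  # Laplace smoothing avoids log(0)
        log_word[c] = np.log(counts / counts.sum())
    return log_prior, log_word

def predict_naive_bayes(histogram, log_prior, log_word):
    """Return the class with the highest posterior log-score."""
    scores = log_prior + log_word @ np.asarray(histogram, dtype=float)
    return int(np.argmax(scores))

# Toy data: class 0 images are dominated by codeword 0, class 1 by codeword 2.
train_h = [[8, 1, 1], [7, 2, 0], [1, 1, 9], [0, 2, 8]]
train_y = [0, 0, 1, 1]
log_prior, log_word = train_naive_bayes(train_h, train_y, n_classes=2)
pred = predict_naive_bayes([6, 1, 2], log_prior, log_word)
```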
-
Clustering BoW vectors
Use models from text document literature
Probabilistic latent semantic analysis (pLSA)
Latent Dirichlet allocation (LDA)
See 2007 edition for explanation/code
d = image, w = visual word, z = topic (cluster)
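With the slide's symbols (d = image, w = visual word, z = topic), the standard pLSA decomposition can be written as follows. This is the textbook form of the model, given here as a reminder rather than as the tutorial's own derivation:

```latex
P(w \mid d) = \sum_{z} P(w \mid z)\, P(z \mid d)
```

Each image is thus modelled as a mixture of topics, and each topic as a distribution over visual words.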
-
Clustering BoW vectors
Scene classification (supervised)
Vogel & Schiele, 2004
Fei-Fei & Perona, 2005
Bosch, Zisserman & Munoz, 2006
Object discovery (unsupervised)
Each cluster corresponds to visual theme
Sivic, Russell, Efros, Freeman & Zisserman, 2005
-
Related work
Early bag of words models: mostly texture recognition
Cula & Dana, 2001; Leung & Malik, 2001; Mori, Belongie & Malik, 2001; Schmid, 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003
Hierarchical Bayesian models for documents (pLSA, LDA, etc.)
Hofmann, 1999; Blei, Ng & Jordan, 2004; Teh, Jordan, Beal & Blei, 2004
Object categorization
Csurka, Bray, Dance & Fan, 2004; Sivic, Russell, Efros, Freeman & Zisserman, 2005; Sudderth, Torralba, Freeman & Willsky, 2005
Natural scene categorization
Vogel & Schiele, 2004; Fei-Fei & Perona, 2005; Bosch, Zisserman & Munoz, 2006
-
What about spatial info?
-
Adding spatial info. to BoW
Feature level
Spatial influence through correlogram features:
Savarese, Winn and Criminisi, CVPR 2006
-
Adding spatial info. to BoW
Feature level
Generative models
Sudderth, Torralba, Freeman & Willsky, 2005,2006
Hierarchical model of scene/objects/parts
-
Adding spatial info. to BoW
Feature level
Generative models
Sudderth, Torralba, Freeman & Willsky, 2005,2006
Niebles & Fei-Fei, CVPR 2007
[Figure: model diagram with parts P1, P2, P3, P4, background (Bg), image, and visual words w]
-
Adding spatial info. to BoW
Feature level
Generative models
Discriminative methods
Lazebnik, Schmid & Ponce, 2006
-
Part-based Models
-
Problem with bag-of-words
All have equal probability for bag-of-words methods
Location information is important
BoW + location still doesn't give correspondence
-
Model: Parts and Structure
-
Representation
Object as set of parts
Generative representation
Model:
Relative locations between parts
Appearance of part
Issues:
How to model location
How to represent appearance
How to handle occlusion/clutter
[Figure from Fischler & Elschlager 73]
-
History of Parts and Structure approaches
Fischler & Elschlager 1973
Yuille 91
Brunelli & Poggio 93
Lades, v.d. Malsburg et al. 93
Cootes, Lanitis, Taylor et al. 95
Amit & Geman 95, 99
Perona et al. 95, 96, 98, 00, 03, 04, 05
Felzenszwalb & Huttenlocher 00, 04
Crandall & Huttenlocher 05, 06
Leibe & Schiele 03, 04
Many papers since 2000
-
Sparse representation
+ Computationally tractable (10^5 pixels → 10^1 -- 10^2 parts)
+ Generative representation of class
+ Avoid modeling global variability
+ Success in specific object recognition
- Throw away most image information
- Parts need to be distinctive to separate from other classes
-
The correspondence problem
Model with P parts
Image with N possible assignments for each part
Consider mapping to be 1-1
N^P combinations!!!
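To get a feel for how fast the search space grows, the sketch below counts assignments for given N and P. It reports both the N^P figure from the slide and the exact count under a strict 1-1 mapping, N!/(N-P)!, which is slightly smaller and approaches N^P when N is much larger than P. The function name and the example sizes are illustrative assumptions.

```python
from itertools import permutations, product
from math import perm

def count_assignments(n_locations, n_parts):
    """Return (N**P, N!/(N-P)!): unconstrained and strictly 1-1 assignment counts."""
    unconstrained = n_locations ** n_parts
    one_to_one = perm(n_locations, n_parts)  # requires Python 3.8+
    return unconstrained, one_to_one

# Explicit enumeration for a tiny case: N = 4 candidate locations, P = 2 parts.
all_pairs = list(product(range(4), repeat=2))    # parts may share a location
distinct_pairs = list(permutations(range(4), 2)) # each location used at most once

# A still-modest model: N = 100 candidate locations, P = 5 parts.
big = count_assignments(100, 5)
```

Even at N = 100 and P = 5 the count is on the order of 10^10, which is why exhaustive search over correspondences is impractical.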
-
Different connectivity structures
from Sparse Flexible Models of Local Features, Gustavo Carneiro and David Lowe, ECCV 2006
[Figure: part-connectivity graphs, labelled O(N^6), O(N^2), O(N^3), O(N^2)]
Fergus et al. 03; Fei-Fei et al. 03
Crandall et al. 05; Fergus et al. 05
Crandall et al. 05
Felzenszwalb & Huttenlocher 00
Bouchard & Triggs 05
Carneiro & Lowe 06
Csurka 04; Vasconcelos 00
-
Efficient methods
Distance transforms
Felzenszwalb and Huttenlocher 00 and 05
O(N^2 P) → O(NP) for tree structured models
Removes need for region detectors
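The speed-up from distance transforms rests on computing min over q of (f[q] + (p - q)^2) for every p in linear time. Below is a sketch of the 1-D lower-envelope algorithm in the style of Felzenszwalb and Huttenlocher, assuming a quadratic deformation cost with unit weight and finite input costs. The function name and the toy cost vector are assumptions for illustration.

```python
import numpy as np

def distance_transform_1d(f):
    """Return d[p] = min_q (f[q] + (p - q)**2) in O(n) using a lower envelope of parabolas."""
    n = len(f)
    d = np.empty(n)
    v = np.zeros(n, dtype=int)  # grid positions of parabolas in the lower envelope
    z = np.empty(n + 1)         # boundaries between consecutive envelope parabolas
    k = 0
    z[0], z[1] = -np.inf, np.inf
    for q in range(1, n):
        while True:
            p = v[k]
            # Intersection of the parabola rooted at q with the one rooted at p.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2.0 * q - 2.0 * p)
            if s <= z[k]:
                k -= 1          # parabola p is hidden; drop it and retry
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = np.inf
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d

# Toy usage: f holds a part's appearance cost at each of 5 positions.
f = np.array([3.0, 0.0, 5.0, 9.0, 1.0])
d = distance_transform_1d(f)
```

Running this once per part, instead of comparing all pairs of positions, is what brings the per-edge cost down from quadratic to linear in the number of positions.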
-
How much does shape help?
Crandall, Felzenszwalb, Huttenlocher CVPR05
Shape variance increases with increasing model complexity
Do get some benefit from shape
-
Appearance representation
SIFT
PCA
Decision trees
[Lepetit and Fua CVPR 2005]
Figure from Winn & Shotton, CVPR
-
Learn Appearance
Generative models of appearance
Can learn with little supervision
E.g. Fergus et al 03
Discriminative training of part appearance model
SVM part detectors
Felzenszwalb, McAllester, Ramanan, CVPR 2008
Much better performance
-
Felzenszwalb, McAllester, Ramanan, CVPR 2008
2-scale model
Whole object
Parts
HOG representation + SVM training to obtain robust part detectors
Distance transforms allow examination of every location in the image
-
Hierarchical Representations
Pixels → Pixel groupings → Parts → Object
[Images from Amit 98]
Multi-scale approach increases number of low-level features
Amit and Geman 98
Ullman et al.
Bouchard & Triggs 05
Zhu and Mumford
Jin & Geman 06
Zhu & Yuille 07
Fidler & Leonardis 07
-
Stochastic Grammar of Images
S.C. Zhu et al. and D. Mumford
-
Context and Hierarchy in a Probabilistic Image Model
Jin & Geman (2006)
animal head instantiated by tiger head
animal head instantiated by bear head
e.g. animals, trees, rocks
e.g. contours, intermediate objects
e.g. linelets, curvelets, T-junctions
e.g. discontinuities, gradient
-
A Hierarchical Compositional System for Rapid Object Detection
Long Zhu, Alan L. Yuille, 2007
Able to learn # parts at each level
-
Learning a Compositional Hierarchy of Object Structure
Fidler & Leonardis, CVPR07; Fidler, Boben & Leonardis, CVPR 2008
The architecture
Parts model
Learned parts
-
Parts and Structure models: Summary
Explicit notion of correspondence between image and model
Efficient methods for large # parts and # positions in image
With powerful part detectors, can get state-of-the-art performance
-
Classifier-based methods
-
Classifier based methods
Object detection and recognition is formulated as a classification problem.
The image is partitioned into a set of overlapping windows, and a decision is taken at each window about whether it contains a target object or not.
Where are the screens?
Bag of image patches
In some feature space
Decision boundary
Computer screen
Background
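The window-by-window decision can be sketched as a simple loop. The window size, stride, threshold, and especially `score_fn` are hypothetical stand-ins: `score_fn` represents any trained classifier that returns a positive score for the target class, and the toy image just contains one bright square.

```python
import numpy as np

def sliding_windows(image, window=(64, 64), stride=32):
    """Yield (top, left, patch) for every overlapping window that fits in the image."""
    h, w = image.shape[:2]
    wh, ww = window
    for top in range(0, h - wh + 1, stride):
        for left in range(0, w - ww + 1, stride):
            yield top, left, image[top:top + wh, left:left + ww]

def detect(image, score_fn, threshold=0.0, **kwargs):
    """Return the (top, left) corners of windows whose score exceeds the threshold."""
    return [(top, left)
            for top, left, patch in sliding_windows(image, **kwargs)
            if score_fn(patch) > threshold]

# Toy usage: a bright 64x64 square plays the role of the object.
image = np.zeros((128, 192))
image[64:128, 32:96] = 1.0
hits = detect(image, score_fn=lambda patch: patch.mean() - 0.9)
```

A practical detector would also scan over scales and merge overlapping detections, which this sketch leaves out.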
-
Discriminative vs. generative
[Figure: three plots against x = data. Generative model (The artist); Discriminative model (The lousy painter); Classification function]
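One way to relate the three plots, written in notation that is an assumption here (x for the data, y in {+1, -1} for the class, F for the classification function): a generative model describes p(x | y) together with a prior p(y), a discriminative model describes p(y | x) directly, and the classification function only keeps the decision.

```latex
p(y \mid x) = \frac{p(x \mid y)\, p(y)}{\sum_{y'} p(x \mid y')\, p(y')},
\qquad
F(x) = \operatorname{sign}\bigl( p(y = +1 \mid x) - p(y = -1 \mid x) \bigr)
```

This matches the vertical ranges of the plots: class-conditional densities are small positive values, posteriors lie between 0 and 1, and the classification function takes values -1 or +1.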
-
Formulation: binary classification
Features x = x1, x2, x3, ..., xN (training data); xN+1, xN+2, ..., xN+M (test data)
Labels y = +1, -1, -1, ..., -1 (training data); ?, ?, ..., ? (test data)
Training data: each image patch is labeled as containing the object or background
Classification function: y = F(x), where F belongs to some family of functions
Minimize misclassification error
(Not that simple: we need some guarantees that there will be generalization)
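Written out, minimizing misclassification error on the N labeled training patches corresponds to the following objective. The symbols F for the classification function and \mathcal{F} for its family of functions are assumed notation, since the formula on the slide did not survive extraction:

```latex
F^{*} = \arg\min_{F \in \mathcal{F}} \; \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\bigl[ F(x_i) \neq y_i \bigr]
```

The caveat about generalization is that a low value of this training error says nothing by itself about the M test patches, which is why the family \mathcal{F} has to be restricted or regularized.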
-
Face detection
The representation and matching of pictorial structures - Fischler, Elschlager (1973)
Face recognition using eigenfaces - M. Turk and A. Pentland (1991)
Human Face Detection in Visual Scenes - Rowley, Baluja, Kanade (1995)
Graded Learning for Object Detection - Fleuret, Geman (1999)
Robust Real-time Object Detection - Viola, Jones (2001)
Feature Reduction and Hierarchy of Classifiers for Fast Object Detection in Video Images - Heisele, Serre, Mukherjee, Poggio (2001)
-
Features: Haar filters
Haar filters and integral image
Viola and Jones, ICCV 2001
Haar wavelets
Papageorgiou & Poggio (2000)
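A small sketch of why the integral image makes Haar filters cheap: once the cumulative sums are stored, the sum over any rectangle needs only four lookups, so a two-rectangle filter costs a handful of additions regardless of its size. The function names and the particular left-minus-right filter are illustrative assumptions, not the exact feature set used by Viola and Jones.

```python
import numpy as np

def integral_image(img):
    """Integral image with a zero row and column prepended for easy indexing."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_haar(ii, top, left, height, width):
    """Left half minus right half of a window (responds to vertical edges); width must be even."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))

# Toy usage on a 4x4 image with values 0..15.
img = np.arange(16, dtype=float).reshape(4, 4)
ii = integral_image(img)
inner = rect_sum(ii, 1, 1, 2, 2)            # same as img[1:3, 1:3].sum()
response = two_rect_haar(ii, 0, 0, 4, 4)
```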
-
Features: Edges and chamfer distance
Gavrila, Philomin, ICCV 1999
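As a reminder of what chamfer distance measures, the sketch below averages, over the template's edge points, the distance to the nearest edge point in the image. It is a brute-force version for illustration only; practical systems typically precompute a distance transform of the edge map instead. The function name and the toy point sets are assumptions.

```python
import numpy as np

def chamfer_distance(template_pts, image_edge_pts):
    """Mean distance from each template edge point to its nearest image edge point."""
    t = np.asarray(template_pts, dtype=float)
    e = np.asarray(image_edge_pts, dtype=float)
    # Pairwise Euclidean distances, shape (num_template_pts, num_image_pts).
    dists = np.sqrt(((t[:, None, :] - e[None, :, :]) ** 2).sum(axis=2))
    return dists.min(axis=1).mean()

# Toy usage: two template points land on image edges, the third is 1 pixel off.
template = [(0, 0), (0, 1), (0, 2)]
edges = [(0, 0), (0, 1), (3, 2), (9, 9)]
score = chamfer_distance(template, edges)
```

A low score means the template's edges are well explained by nearby image edges at that placement.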
-
Features: Edge fragments
Weak detector = k edge fragments and threshold.
Chamfer distance uses 8 orientation planes
Opelt, Pinz, Zisserman, ECCV 2006
-
Features: Histograms of oriented gradients
Dalal & Triggs, 2006
Shape context
Belongie, Malik, Puzicha, NIPS 2000
SIFT, D. Lowe, ICCV 1999
-
Classifier: Nearest Neighbor
Berg, Berg and Malik, 2005
10^6 examples
Shakhnarovich, Viola, Darrell, 2003
-
Classifier: Neural Networks
Fukushima's Neocognitron, 1980
Rowley, Baluja, Kanade 1998
LeCun, Bottou, Bengio, Haffner 1998
Serre et al. 2005
LeNet convolutional architecture (LeCun 1998)
Riesenhuber, M. and Poggio, T. 1999
-
Classifier: Support Vector Machine
Guyon, Vapnik
Heisele, Serre, Poggio, 2001
Dalal & Triggs, CVPR 2005
HOG: Histogram of Oriented Gradients
Learn weighting of descriptor with linear SVM
[Figure: image; HOG descriptor; HOG descriptor weighted by +ve SVM weights; HOG descriptor weighted by -ve SVM weights]
-
Classifier: Boosting
Viola & Jones 2001
Haar features via Integral Image
Cascade
Real-time performance
Torralba et al., 2004
Part-based Boosting
Each weak classifier is a part
Part location modeled by offset mask
-
Summary of classifier-based methods
Many techniques for training discriminative models are used
Many not mentioned here
Conditional random fields
Kernels for object recognition
Learning object similarities
.....
-
Dalal & Triggs HOG detector
-
Dalal & Triggs HOG detector
HOG: Histogram of Oriented Gradients
Careful selection of spatial bin size / # orientation bins / normalization
Learn weighting of descriptor with linear SVM
[Figure: image; HOG descriptor; HOG descriptor weighted by +ve SVM weights; HOG descriptor weighted by -ve SVM weights]
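To make the "orientation bins" concrete, here is a heavily simplified sketch of the histogram for a single cell: gradients are computed, unsigned orientations are binned, and each pixel votes with its gradient magnitude. It omits the block normalization and vote interpolation of the full Dalal & Triggs descriptor; the function name, the 9-bin default, and the 8x8 ramp images are assumptions for illustration.

```python
import numpy as np

def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0 to 180 degrees)."""
    gy, gx = np.gradient(cell.astype(float))  # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, 180.0), weights=magnitude)
    return hist

# Toy usage: intensity ramps whose gradients point along x and along y.
ramp_x = np.tile(np.arange(8.0), (8, 1))
ramp_y = ramp_x.T
hist_x = cell_orientation_histogram(ramp_x)  # all votes fall in the first bin
hist_y = cell_orientation_histogram(ramp_y)  # all votes fall in the middle bin
```

Concatenating such histograms over a grid of cells gives a descriptor vector x, and a linear SVM then scores a window as w·x + b, which is the learned weighting of the descriptor.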