![Page 1: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/1.jpg)
1
Tamara Berg
Object Recognition – BoF models
790-133
Recognizing People, Objects, & Actions
![Page 2: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/2.jpg)
2
Topic Presentations
• Hopefully you have met your topic presentation group members?
• Group 1 – see me to run through slides this week or Monday at the latest (I’m traveling Thurs/Friday). Send me links to 2-3 papers for the class to read.
• Sign up for class google group (790-133). To find the group go to groups.google.com and search for 790-133 (sorted by date). Use this to post/answer questions related to the class.
![Page 3: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/3.jpg)
3
Bag-of-features models
Object → Bag of ‘features’
source: Svetlana Lazebnik
![Page 4: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/4.jpg)
4
Exchangeability
• Exchangeability (the bag-of-words assumption): the joint probability distribution underlying the data is invariant to permutation.
• De Finetti's theorem: an infinitely exchangeable sequence can be represented as a mixture of i.i.d. sequences.
![Page 5: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/5.jpg)
5
Origin 2: Bag-of-words models
US Presidential Speeches Tag Cloud: http://chir.ag/phernalia/preztags/
• Orderless document representation: frequencies of words from a dictionary (Salton & McGill, 1983)
source: Svetlana Lazebnik
![Page 6: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/6.jpg)
6
Bag of words for text
· Represent documents as “bags of words”
![Page 7: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/7.jpg)
7
Example
• Doc1 = “the quick brown fox jumped”
• Doc2 = “brown quick jumped fox the”
Would a bag of words model represent these two documents differently?
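One way to check the question above is to build both bags and compare them. This is a minimal sketch; the helper name `bag_of_words` and the whitespace tokenization are assumptions made for illustration, not something the slides specify.

```python
from collections import Counter


def bag_of_words(text):
    """Return word counts for a document, discarding word order."""
    return Counter(text.lower().split())


doc1 = "the quick brown fox jumped"
doc2 = "brown quick jumped fox the"

bag1 = bag_of_words(doc1)
bag2 = bag_of_words(doc2)

# Both documents contain the same words with the same counts,
# so an orderless representation cannot tell them apart.
same_bag = bag1 == bag2
print(same_bag)
```

Because only counts are stored, any permutation of the same words maps to the same bag.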
![Page 8: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/8.jpg)
8
Bag of words for images
· Represent images as a “bag of features”
![Page 9: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/9.jpg)
9
Bag of features: outline
1. Extract features
source: Svetlana Lazebnik
![Page 10: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/10.jpg)
10
Bag of features: outline
1. Extract features
2. Learn “visual vocabulary”
source: Svetlana Lazebnik
![Page 11: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/11.jpg)
11
Bag of features: outline
1. Extract features
2. Learn “visual vocabulary”
3. Represent images by frequencies of “visual words”
source: Svetlana Lazebnik
![Page 12: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/12.jpg)
12
2. Learning the visual vocabulary
Clustering
…
Slide credit: Josef Sivic
![Page 13: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/13.jpg)
13
2. Learning the visual vocabulary
Clustering
…
Slide credit: Josef Sivic
Visual vocabulary
![Page 14: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/14.jpg)
14
K-means clustering (reminder)
• Want to minimize the sum of squared Euclidean distances between points $x_i$ and their nearest cluster centers $m_k$:

$$D(X, M) = \sum_{\text{cluster } k} \;\; \sum_{\text{point } i \text{ in cluster } k} (x_i - m_k)^2$$

Algorithm:
• Randomly initialize K cluster centers
• Iterate until convergence:
  • Assign each data point to the nearest center
  • Recompute each cluster center as the mean of all points assigned to it
source: Svetlana Lazebnik
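The two alternating steps can be sketched in plain Python. This is an illustrative sketch only: the function names, the choice to initialize centers by sampling data points, the fixed seed, and the toy 2-D points are all assumptions, and real descriptors (e.g. SIFT) would be much higher-dimensional.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initialization from the data
    assignments = None
    for _ in range(max_iters):
        # Step 1: assign each point to its nearest center.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no assignment changed
        assignments = new_assignments
        # Step 2: recompute each center as the mean of its points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean(members)
    return centers, assignments


def objective(points, centers, assignments):
    """D(X, M): sum of squared distances to the assigned centers."""
    return sum(squared_distance(p, centers[a]) for p, a in zip(points, assignments))


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, assignments = kmeans(points, k=2)
print(assignments, objective(points, centers, assignments))
```

K-means only finds a local minimum of D(X, M), so in practice it is often run from several initializations and the lowest-objective result is kept.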
![Page 15: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/15.jpg)
15
Example visual vocabulary
Fei-Fei et al. 2005
![Page 16: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/16.jpg)
Image Representation
• For a query image:
– Extract features
– Associate each feature with the nearest cluster center (visual word)
– Accumulate visual word frequencies over the image

[Figure: visual vocabulary, with features marked “x”]
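The quantize-and-count steps can be sketched as follows. The vocabulary and descriptors here are made-up 2-D toy values, and the function names are hypothetical; the sketch assumes the vocabulary has already been learned (for instance by k-means).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the cluster center (visual word) closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda k: squared_distance(descriptor, vocabulary[k]))


def bag_of_features(descriptors, vocabulary):
    """Histogram of visual-word counts for one image."""
    histogram = [0] * len(vocabulary)
    for d in descriptors:
        histogram[nearest_word(d, vocabulary)] += 1
    return histogram


# Toy vocabulary of three visual words and five descriptors from one image.
vocabulary = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 0.5), (9.0, 9.5), (0.5, 0.2), (1.0, 9.0), (8.0, 11.0)]

histogram = bag_of_features(descriptors, vocabulary)
print(histogram)  # [2, 2, 1]
```

The resulting histogram has one bin per visual word regardless of how many features the image produced, which is what lets a standard classifier consume it.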
![Page 17: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/17.jpg)
17
3. Image representation
…..
[Figure: histogram of codeword frequencies (x-axis: codewords, y-axis: frequency)]
source: Svetlana Lazebnik
![Page 18: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/18.jpg)
18
4. Image classification
…..
[Figure: histogram of codeword frequencies (x-axis: codewords, y-axis: frequency)]
source: Svetlana Lazebnik
Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
CAR
![Page 19: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/19.jpg)
Image Categorization
Choose from many categories
What is this? helicopter
![Page 20: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/20.jpg)
Image Categorization
Choose from many categories
What is this?
SVM/NB: Csurka et al. (Caltech 4/7)
Nearest Neighbor: Berg et al. (Caltech 101)
Kernel + SVM: Grauman et al. (Caltech 101)
Multiple Kernel Learning + SVMs: Varma et al. (Caltech 101)
…
![Page 21: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/21.jpg)
21
Visual Categorization with Bags of Keypoints
Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cédric Bray
![Page 22: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/22.jpg)
22
Data
• Images in 7 classes: faces, buildings, trees, cars, phones, bikes, books
• Caltech 4 dataset: faces, airplanes, cars (rear and side), motorbikes, background
![Page 23: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/23.jpg)
23
Method
Steps:
– Detect and describe image patches.
– Assign patch descriptors to a set of predetermined clusters (a visual vocabulary).
– Construct a bag of keypoints, which counts the number of patches assigned to each cluster.
– Apply a classifier (SVM or Naïve Bayes), treating the bag of keypoints as the feature vector.
– Determine which category or categories to assign to the image.
![Page 24: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/24.jpg)
24
Bag-of-Keypoints Approach
Pipeline: Interesting Point Detection → Key Patch Extraction → Feature Descriptors → Bag of Keypoints → Multi-class Classifier
Slide credit: Yun-hsueh Liu
![Page 25: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/25.jpg)
25
SIFT Descriptors
Pipeline: Interesting Point Detection → Key Patch Extraction → Feature Descriptors → Bag of Keypoints → Multi-class Classifier
Slide credit: Yun-hsueh Liu
![Page 26: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/26.jpg)
26
Bag of Keypoints (1)
• Construction of a vocabulary
– K-means clustering finds “centroids” (computed over all the descriptors extracted from all the training images)
– Define a “vocabulary” as a set of “centroids”, where every centroid represents a “word”.

Pipeline: Interesting Point Detection → Key Patch Extraction → Feature Descriptors → Bag of Keypoints → Multi-class Classifier
Slide credit: Yun-hsueh Liu
![Page 27: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/27.jpg)
27
Bag of Keypoints (2)
• Histogram
– Counts the number of occurrences of different visual words in each image

Pipeline: Interesting Point Detection → Key Patch Extraction → Feature Descriptors → Bag of Keypoints → Multi-class Classifier
Slide credit: Yun-hsueh Liu
![Page 28: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/28.jpg)
28
Multi-class Classifier
• In this paper, classification is based on conventional machine learning approaches
– Support Vector Machine (SVM)
– Naïve Bayes

Pipeline: Interesting Point Detection → Key Patch Extraction → Feature Descriptors → Bag of Keypoints → Multi-class Classifier
Slide credit: Yun-hsueh Liu
![Page 29: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/29.jpg)
29
SVM
![Page 30: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/30.jpg)
Reminder: Linear SVM
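[Figure: two classes in the ($x_1$, $x_2$) plane separated by the hyperplane $\mathbf{w}^T\mathbf{x} + b = 0$, with margin boundaries $\mathbf{w}^T\mathbf{x} + b = 1$ and $\mathbf{w}^T\mathbf{x} + b = -1$; the points $x^+$ and $x^-$ on the boundaries are the support vectors.]

$$g(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b$$

$$\text{minimize } \frac{1}{2}\|\mathbf{w}\|^2 \quad \text{s.t. } \; y_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1$$

Slide credit: Jinwei Gu (slide 30 of 113)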
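To make the constraint concrete, here is a small sketch that evaluates g(x) = wᵀx + b and checks y_i(wᵀx_i + b) ≥ 1 for a hand-picked separator. The weights, bias, and toy samples are assumptions chosen for illustration; no optimization is performed. The margin width 2/‖w‖ used below is the standard geometric result, which is why minimizing ‖w‖² maximizes the margin.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def g(w, b, x):
    """Linear discriminant g(x) = w^T x + b."""
    return dot(w, x) + b


def satisfies_margin(w, b, samples):
    """True if every labeled sample (x, y) satisfies y * g(x) >= 1."""
    return all(y * g(w, b, x) >= 1 for x, y in samples)


def margin_width(w):
    """Distance between the hyperplanes g(x) = +1 and g(x) = -1."""
    return 2 / math.sqrt(dot(w, w))


# Hypothetical toy data: labels are +1 / -1.
samples = [((2.0, 2.0), +1), ((3.0, 3.0), +1), ((0.0, 0.0), -1)]
w, b = (0.5, 0.5), -1.0

feasible = satisfies_margin(w, b, samples)
width = margin_width(w)
print(feasible, width)
```

Here (2, 2) and (0, 0) sit exactly on the two margin boundaries, so they play the role of support vectors for this separator.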
![Page 31: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/31.jpg)
31
Nonlinear SVMs: The Kernel Trick
With this mapping $\phi(\cdot)$, our discriminant function becomes:

$$g(\mathbf{x}) = \mathbf{w}^T\phi(\mathbf{x}) + b = \sum_{i \in SV} \alpha_i \, \phi(\mathbf{x}_i)^T \phi(\mathbf{x}) + b$$

No need to know this mapping explicitly, because we only use the dot product of feature vectors in both training and testing.
A kernel function is defined as a function that corresponds to a dot product of two feature vectors in some expanded feature space:

$$K(\mathbf{x}_i, \mathbf{x}_j) = \phi(\mathbf{x}_i)^T \phi(\mathbf{x}_j)$$
Slide credit: Jinwei Gu
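A quick numerical check of the idea: for 2-D inputs, the homogeneous quadratic kernel K(x, z) = (xᵀz)² equals a dot product under the explicit map φ(x) = (x₁², √2·x₁x₂, x₂²). This particular kernel and map are a textbook example chosen for illustration, not one taken from the slides.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def phi(x):
    """Explicit feature map for the 2-D kernel K(x, z) = (x . z)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def kernel(x, z):
    """Same quantity computed without ever forming phi."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))  # dot product in the expanded space
implicit = kernel(x, z)         # kernel evaluated in the input space
print(explicit, implicit)
```

The two values agree up to floating-point rounding, which is the whole point: the classifier only ever needs K, never φ itself.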
![Page 32: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/32.jpg)
32
Nonlinear SVMs: The Kernel Trick
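Examples of commonly-used kernel functions:

Linear kernel:
$$K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i^T \mathbf{x}_j$$

Polynomial kernel:
$$K(\mathbf{x}_i, \mathbf{x}_j) = (1 + \mathbf{x}_i^T \mathbf{x}_j)^p$$

Gaussian (Radial-Basis Function (RBF)) kernel:
$$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{2\sigma^2}\right)$$

Sigmoid:
$$K(\mathbf{x}_i, \mathbf{x}_j) = \tanh(\beta_0 \, \mathbf{x}_i^T \mathbf{x}_j + \beta_1)$$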
Slide credit: Jinwei Gu
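The four kernels can be written directly from their formulas. This is a plain-Python sketch; the function names and the default values for p, σ, β₀ and β₁ are arbitrary assumptions, and in practice these parameters would be tuned.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def linear_kernel(xi, xj):
    return dot(xi, xj)


def polynomial_kernel(xi, xj, p=2):
    return (1 + dot(xi, xj)) ** p


def rbf_kernel(xi, xj, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def sigmoid_kernel(xi, xj, beta0=1.0, beta1=0.0):
    return math.tanh(beta0 * dot(xi, xj) + beta1)


xi, xj = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(xi, xj))      # 11.0
print(polynomial_kernel(xi, xj))  # 144.0
print(rbf_kernel(xi, xi))         # 1.0 (identical points)
print(sigmoid_kernel(xi, xj))
```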
![Page 33: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/33.jpg)
34
SVM for image classification
• Train k binary 1-vs-all SVMs (one per class)
• For a test instance, evaluate with each classifier
• Assign the instance to the class with the largest SVM output
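The prediction step of this scheme can be sketched as an argmax over per-class outputs. Training is not shown; the class names, weights, and biases below are hypothetical stand-ins for k already-trained linear SVMs, and the input is an assumed normalized visual-word histogram.

```python
def svm_output(w, b, x):
    """Output of one binary linear SVM: w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_one_vs_all(classifiers, x):
    """Return the label whose 1-vs-all SVM gives the largest output."""
    return max(classifiers, key=lambda label: svm_output(*classifiers[label], x))


# Hypothetical trained parameters: label -> (weights, bias).
classifiers = {
    "car":  ((1.0, 0.0, 0.0), -0.5),
    "face": ((0.0, 1.0, 0.0), -0.5),
    "bike": ((0.0, 0.0, 1.0), -0.5),
}

x = (0.1, 0.7, 0.2)  # toy bag-of-features histogram
predicted = predict_one_vs_all(classifiers, x)
print(predicted)  # face
```

Comparing raw outputs across independently trained classifiers assumes their scores are on comparable scales, which is a known caveat of the 1-vs-all approach.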
![Page 34: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/34.jpg)
35
Naïve Bayes
![Page 35: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/35.jpg)
36
Naïve Bayes Model
C – Class, F – Features
We only specify (parameters):
• $P(C)$: prior over class labels
• $P(F_i \mid C)$: how each feature depends on the class
![Page 36: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/36.jpg)
37
Slide from Dan Klein
Example:
![Page 37: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/37.jpg)
38
Slide from Dan Klein
![Page 38: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/38.jpg)
39
Slide from Dan Klein
![Page 39: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/39.jpg)
40
Percentage of documents in training set labeled as spam/ham
Slide from Dan Klein
![Page 40: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/40.jpg)
41
In the documents labeled as spam, occurrence percentage of each word (e.g. # times “the” occurred/# total words).
Slide from Dan Klein
![Page 41: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/41.jpg)
42
In the documents labeled as ham, occurrence percentage of each word (e.g. # times “the” occurred/# total words).
Slide from Dan Klein
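The two kinds of parameters described on these slides can be estimated by counting. A minimal sketch, assuming a tiny made-up training set and whitespace tokenization; the function name and data are hypothetical, and no smoothing is applied here.

```python
from collections import Counter


def estimate_parameters(labeled_docs):
    """Estimate P(class) and P(word | class) by relative frequency."""
    doc_counts = Counter(label for label, _ in labeled_docs)
    word_counts = {label: Counter() for label in doc_counts}
    for label, text in labeled_docs:
        word_counts[label].update(text.split())

    # Prior: fraction of training documents with each label.
    priors = {label: n / len(labeled_docs) for label, n in doc_counts.items()}

    # Likelihood: (# times word occurs in the class) / (# total words in the class).
    likelihoods = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())
        likelihoods[label] = {word: c / total for word, c in counts.items()}
    return priors, likelihoods


training = [
    ("spam", "win money now"),
    ("spam", "free money"),
    ("ham", "meeting at noon"),
    ("ham", "lunch at noon now"),
]
priors, likelihoods = estimate_parameters(training)
print(priors["spam"], likelihoods["spam"]["money"])
```

Words never seen in a class get no entry at all here, which is the zero-probability problem that smoothing addresses.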
![Page 42: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/42.jpg)
43
Classification
The class that maximizes:
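The expression on this slide did not survive extraction. For a Naïve Bayes model with a class prior and per-feature conditionals, it would presumably be the standard decision rule below, where $c$ ranges over classes and $f_1, \dots, f_n$ are the observed features (this notation is an assumption, not taken from the slide):

$$c^* = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(f_i \mid c)$$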
![Page 47: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/47.jpg)
48
Classification
• In practice
– Multiplying lots of small probabilities can result in floating point underflow.
– Since log(xy) = log(x) + log(y), we can sum log probabilities instead of multiplying probabilities.
– Since log is a monotonic function, the class with the highest score does not change.
– So, what we usually compute in practice is:
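The expression itself is missing from this transcript; it is presumably the log form of the decision rule, $c^* = \arg\max_c \left[\log P(c) + \sum_i \log P(f_i \mid c)\right]$. The sketch below first shows the underflow and then classifies in log space. The priors and likelihoods are made-up numbers, and the function names are hypothetical.

```python
import math

# Underflow demo: 400 probabilities of 1e-3 multiply to 1e-1200,
# which is far below the smallest positive double.
probs = [1e-3] * 400
naive_product = 1.0
for p in probs:
    naive_product *= p
log_sum = sum(math.log(p) for p in probs)
print(naive_product, log_sum)  # the product collapses to 0.0; the log sum stays finite


def log_score(prior, likelihood, features):
    """log P(c) + sum_i log P(f_i | c) for one class."""
    return math.log(prior) + sum(math.log(likelihood[f]) for f in features)


def classify(priors, likelihoods, features):
    """Class with the highest log score."""
    return max(priors, key=lambda c: log_score(priors[c], likelihoods[c], features))


# Hypothetical parameters for two classes.
priors = {"spam": 0.5, "ham": 0.5}
likelihoods = {
    "spam": {"money": 0.4, "noon": 0.05},
    "ham": {"money": 0.05, "noon": 0.3},
}
label = classify(priors, likelihoods, ["money", "money", "noon"])
print(label)  # spam
```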
![Page 48: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/48.jpg)
49
Naïve Bayes on images
![Page 49: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/49.jpg)
50
Naïve Bayes
C – Class, F – Features
We only specify (parameters):
• $P(C)$: prior over class labels
• $P(F_i \mid C)$: how each feature depends on the class
![Page 50: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/50.jpg)
51
Naive Bayes Parameters
Problem: Categorize images as one of k object classes using a Naïve Bayes classifier:
– Classes: object categories (face, car, bicycle, etc.)
– Features: images are represented as a histogram of visual words; the features $F_i$ are visual words.

$P(C)$ is treated as uniform.
$P(F_i \mid C)$ is learned from training data (images labeled with category): the probability of a visual word given an image category.
![Page 51: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/51.jpg)
52
Multi-class Classifier – Naïve Bayes (1)
• Let $V = \{v_t\}$, $t = 1, \dots, N$, be a visual vocabulary, in which each $v_t$ represents a visual word (cluster center) from the feature space.
• A set of labeled images $I = \{I_i\}$.
• Let $C_j$, $j = 1, \dots, M$, denote the classes.
• $N(t, i)$ = number of times $v_t$ occurs in image $I_i$
• Compute $P(C_j \mid I_i)$:
Slide credit: Yun-hsueh Liu
![Page 52: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/52.jpg)
53
Multi-class Classifier – Naïve Bayes (2)
• Goal - Find maximum probability class Cj:
• In order to avoid zero probability, use Laplace smoothing:
Slide credit: Yun-hsueh Liu
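The formulas on these two slides are missing from the transcript. A common multinomial form, which I assume matches them, is $P(C_j \mid I_i) \propto P(C_j) \prod_t P(v_t \mid C_j)^{N(t,i)}$ with the Laplace-smoothed estimate $P(v_t \mid C_j) = \dfrac{1 + \sum_{I_i \in C_j} N(t,i)}{N + \sum_{s} \sum_{I_i \in C_j} N(s,i)}$, where $N$ is the vocabulary size. The sketch below implements that assumed form with a uniform class prior; the function names and the toy histograms are hypothetical.

```python
import math


def smoothed_word_probs(histograms):
    """Laplace-smoothed P(v_t | C_j) from the visual-word histograms of one class."""
    vocab_size = len(histograms[0])
    totals = [sum(h[t] for h in histograms) for t in range(vocab_size)]
    denom = vocab_size + sum(totals)
    # Adding 1 to every count keeps unseen visual words from getting probability 0.
    return [(1 + n) / denom for n in totals]


def train(images_by_class):
    """Map each class label to its smoothed visual-word distribution."""
    return {label: smoothed_word_probs(hs) for label, hs in images_by_class.items()}


def classify(model, histogram):
    """Maximum-probability class under a uniform prior, computed in log space."""
    def log_likelihood(label):
        return sum(n * math.log(p) for n, p in zip(histogram, model[label]))
    return max(model, key=log_likelihood)


# Toy training set: 3 visual words, two images per class.
images_by_class = {
    "car": [[5, 1, 0], [4, 2, 0]],
    "face": [[0, 1, 5], [1, 0, 4]],
}
model = train(images_by_class)
print(model["car"])                 # third word never seen for "car", yet nonzero
print(classify(model, [3, 1, 0]))   # car
print(classify(model, [0, 0, 2]))   # face
```

Because the prior is uniform it adds the same constant to every class score, so it is omitted from the comparison.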
![Page 53: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/53.jpg)
Results
![Page 54: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/54.jpg)
55
Results
![Page 55: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/55.jpg)
56
Results
![Page 56: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/56.jpg)
57
Results
Results on Dataset 2
![Page 57: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/57.jpg)
58
Results
![Page 58: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/58.jpg)
59
Results
![Page 59: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/59.jpg)
60
Results
![Page 60: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/60.jpg)
Thoughts?
• Pros?
• Cons?
![Page 61: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/61.jpg)
62
Related BoF models
pLSA, LDA, …
![Page 62: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/62.jpg)
63
pLSA
[Figure labels: word, topic, document]
![Page 63: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/63.jpg)
64
pLSA
![Page 64: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/64.jpg)
67
pLSA on images
![Page 65: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/65.jpg)
68
Discovering objects and their location in images
Josef Sivic, Bryan C. Russell, Alexei A. Efros, Andrew Zisserman, William T. Freeman
Documents – images
Words – visual words (vector quantized SIFT descriptors)
Topics – object categories
Images are modeled as a mixture of topics (objects).
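In pLSA the mixture takes the form $P(w \mid d) = \sum_z P(w \mid z)\, P(z \mid d)$: each image has its own topic weights, and each topic has its own distribution over visual words. The sketch below only evaluates this mixture for given parameters; it does not fit them (pLSA fits them with EM, which is omitted here). The topic names and all numbers are made up for illustration.

```python
def word_distribution(topic_weights, topic_word_probs):
    """P(w | d) = sum_z P(w | z) * P(z | d) for every visual word w."""
    n_words = len(topic_word_probs[0])
    return [
        sum(p_z * p_w_given_z[w]
            for p_z, p_w_given_z in zip(topic_weights, topic_word_probs))
        for w in range(n_words)
    ]


# Hypothetical P(w | z) over 3 visual words for two topics ("face", "car").
topic_word_probs = [
    [0.7, 0.2, 0.1],  # face topic
    [0.1, 0.2, 0.7],  # car topic
]

# Hypothetical P(z | d): an image that is half face, half car.
topic_weights = [0.5, 0.5]

p_w_given_d = word_distribution(topic_weights, topic_word_probs)
print(p_w_given_d)
```

An image dominated by one object would put most of its weight on a single topic, which is what makes the learned P(z | d) usable for classification.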
![Page 66: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/66.jpg)
69
Goals
They investigate three areas:
– (i) topic discovery, where categories are discovered by pLSA clustering on all available images.
– (ii) classification of unseen images, where topics corresponding to object categories are learnt on one set of images, and then used to determine the object categories present in another set.
– (iii) object detection, where you want to determine the location and approximate segmentation of object(s) in each image.
![Page 67: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/67.jpg)
70
(i) Topic Discovery
Most likely words for 4 learnt topics (face, motorbike, airplane, car)
![Page 68: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/68.jpg)
71
(ii) Image Classification
Confusion table for unseen test images against pLSA trained on images containing four object categories, but no background images.
![Page 69: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/69.jpg)
72
(ii) Image Classification
Confusion table for unseen test images against pLSA trained on images containing four object categories, and background images. Performance is not quite as good.
![Page 70: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/70.jpg)
73
(iii) Topic Segmentation
![Page 71: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/71.jpg)
74
(iii) Topic Segmentation
![Page 72: Tamara Berg Object Recognition – BoF models 790-133 Recognizing People, Objects, & Actions 1](https://reader036.vdocuments.net/reader036/viewer/2022062307/551bec03550346c3588b6338/html5/thumbnails/72.jpg)
75
(iii) Topic Segmentation