bag of words - carnegie mellon university16720.courses.cs.cmu.edu/lec/bagwords.pdf · • matlab...
TRANSCRIPT
Text Analysis
Political observers say that the government of Zorgia does not control the political situation. The government will not hold elections …
The ZH-20 unit is a 200Gigahertz processor with 2Gigabyte memory. Its strength is its bus and high-speed memory……
How to compare the two articles?
Bag-of-words
Words from vocabulary
Gov
ernm
ent
Obs
erve
rs
Pol
itica
l
Gov
ernm
ent
Gig
abyt
e
Obs
erve
rs
Ele
ctio
n
Mem
ory
Gig
aher
tz
Bus
Freq
uenc
y of
occ
urre
nce
Histogram from training “computer” fragments
Pol
itica
l
Gig
abyt
e
Ele
ctio
n
Mem
ory
Gig
aher
tz
Bus
Freq
uenc
y of
occ
urre
nce
Histogram from training “political” fragments
Words from vocabulary
Compare histograms
Analogy: Text fragment !" Image region
Word !" Texton
Representing textures as bags-of-visual words
Universal texton dictionary
histogram
Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003
Source: Lana Lazebnik
Bag-of-visual-words
Training images Filter responses
Clustering
Given a large set of vectorized image patches: and a bank of vectorized filters F = [f1,f2,..fb]
1. Project each patch into basis spanned by F: (does this basis span RM^2? Is it orthonormal?)
2. Cluster patches in this projected space
x 2 R
M⇥M ) x 2 R
M2
y = F
Tx, y 2 R
b
Use pseudoinverse of filter bank to visualize cluster means in original space
y = F
Tx, x 2 R
M2
, y 2 R
B
x ⇡ (FT )+y
V is(dj) ⇡ (FT )+dj
Given a M X M image patch ‘x’ (reshaped into a M2 vector) and a filter bank of B filters, filter bank responses can be seen as a change of basis
Recognition with bag-of-words
▪ Summarize entire image based on its distribution (histogram) of word occurrences.
▪ Compare to stored library of images (or class-specific models)
9Image credit: Fei-Fei Li
Pyramid match kernel
Number of newly matched pairs at level i
Measure of difficulty of a match at level i
Approximate partial match
similarity
Counting new matches
Difference in histogram intersections across levels counts number of new pairs matched
matches at this level matches at previous level
Histogram intersection
Pyramid match kernel
• Weights inversely proportional to bin size
• Normalize kernel values to avoid favoring large sets
measure of difficulty of a match at level i
histogram pyramids
number of newly matched pairs at level i
Spatial Pyramid MatchingQuantize features into words, but build pyramid in space
4
Level 0
Level 1
Level 2
Feature histograms:
Level 3
Total weight (value of pyramid match kernel):
Pyramid matchingFind maximum-weight matching (weight is inversely proportional to distance)
Indyk & Thaper (2003), Grauman & Darrell (2005)
Original images
Multiresolution representations are powerful!