slides: machine learning for computer vision
TRANSCRIPT
![Page 1: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/1.jpg)
8/2/2019 slides: Machine Learning for Computer Vision
http://slidepdf.com/reader/full/slides-machine-learning-for-computer-vision 1/135
Machine Learning in
Computer Vision
Fei-Fei Li
![Page 2: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/2.jpg)
What is (computer) vision?
• When we “see” something, what does it involve?
• Take a picture with a camera: it is just a bunch of colored dots (pixels)
• Want to make computers understand images
• Looks easy, but not really…
Image (or video) Sensing device Interpreting device Interpretations
Corn / mature corn in a cornfield / plant / blue sky in the background
Etc.
![Page 3: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/3.jpg)
What is it related to?
Computer Vision
Neuroscience
Machine learning
Speech
Information retrieval
Maths
Computer Science
Information Engineering
Physics
Biology
Robotics
Cognitive sciences
Psychology
![Page 4: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/4.jpg)
Quiz?
![Page 5: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/5.jpg)
What about this?
![Page 6: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/6.jpg)
A picture is worth a thousand words. --- Confucius
or Printers’ Ink Ad (1921)
![Page 7: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/7.jpg)
horizontal lines
vertical
blue on the top
porous
oblique
white
shadow to the left
textured
large green patches
A picture is worth a thousand words. --- Confucius
or Printers’ Ink Ad (1921)
![Page 8: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/8.jpg)
A picture is worth a thousand words. --- Confucius
or Printers’ Ink Ad (1921)
building court yard
clear sky
campus
trees
autumn leaves
people
day time
bicycles
talking outdoor
![Page 9: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/9.jpg)
Today: machine learning methods
for object recognition
![Page 10: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/10.jpg)
Outline
• Intro to object categorization
• Brief overview
  – Generative
  – Discriminative
• Generative models
• Discriminative models
![Page 11: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/11.jpg)
How many object categories are there?
Biederman 1987
![Page 12: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/12.jpg)
Challenges 1: view point variation
Michelangelo 1475-1564
![Page 13: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/13.jpg)
Challenges 2: illumination
slide credit: S. Ullman
![Page 14: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/14.jpg)
Challenges 3: occlusion
Magritte, 1957
![Page 15: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/15.jpg)
Challenges 4: scale
![Page 16: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/16.jpg)
Challenges 5: deformation
Xu, Beihong 1943
![Page 17: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/17.jpg)
Challenges 6: background clutter
Klimt, 1913
![Page 18: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/18.jpg)
History: single object recognition
![Page 19: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/19.jpg)
History: single object recognition
• Lowe, et al. 1999, 2003
• Mahamud and Hebert, 2000
• Ferrari, Tuytelaars, and Van Gool, 2004
• Rothganger, Lazebnik, and Ponce, 2004
• Moreels and Perona, 2005
• …
![Page 20: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/20.jpg)
Challenges 7: intra-class variation
![Page 21: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/21.jpg)
Object categorization: the statistical viewpoint
$p(\text{zebra} \mid \text{image})$ vs. $p(\text{no zebra} \mid \text{image})$
• Bayes rule:
$$\underbrace{\frac{p(\text{zebra} \mid \text{image})}{p(\text{no zebra} \mid \text{image})}}_{\text{posterior ratio}} = \underbrace{\frac{p(\text{image} \mid \text{zebra})}{p(\text{image} \mid \text{no zebra})}}_{\text{likelihood ratio}} \cdot \underbrace{\frac{p(\text{zebra})}{p(\text{no zebra})}}_{\text{prior ratio}}$$
![Page 22: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/22.jpg)
Object categorization: the statistical viewpoint
$$\underbrace{\frac{p(\text{zebra} \mid \text{image})}{p(\text{no zebra} \mid \text{image})}}_{\text{posterior ratio}} = \underbrace{\frac{p(\text{image} \mid \text{zebra})}{p(\text{image} \mid \text{no zebra})}}_{\text{likelihood ratio}} \cdot \underbrace{\frac{p(\text{zebra})}{p(\text{no zebra})}}_{\text{prior ratio}}$$
• Discriminative methods model the posterior
• Generative methods model the likelihood and the prior
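A minimal sketch of the posterior-ratio decision, assuming the two likelihoods and the prior are already known. The function name and all numeric values are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def posterior_ratio(lik_zebra, lik_no_zebra, prior_zebra):
    """Posterior ratio = likelihood ratio * prior ratio (Bayes rule)."""
    likelihood_ratio = lik_zebra / lik_no_zebra
    prior_ratio = prior_zebra / (1.0 - prior_zebra)
    return likelihood_ratio * prior_ratio


# Hypothetical values: the image is 20x more likely under "zebra",
# but zebras are assumed to be rare a priori (1% of images).
ratio = posterior_ratio(lik_zebra=0.02, lik_no_zebra=0.001, prior_zebra=0.01)

# Decide "zebra" only when the posterior ratio exceeds 1.
decision = "zebra" if ratio > 1.0 else "no zebra"
print(ratio, decision)
```

With these made-up numbers the strong likelihood ratio is outweighed by the small prior, so the decision is "no zebra".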
![Page 23: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/23.jpg)
Discriminative
• Direct modeling of $\dfrac{p(\text{zebra} \mid \text{image})}{p(\text{no zebra} \mid \text{image})}$
[Figure: decision boundary separating zebra from non-zebra examples]
![Page 24: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/24.jpg)
Generative
• Model $p(\text{image} \mid \text{zebra})$ and $p(\text{image} \mid \text{no zebra})$
[Figure: example images rated Low, Middle, or High under each of the two likelihoods]
![Page 25: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/25.jpg)
Three main issues
• Representation
  – How to represent an object category
• Learning
  – How to form the classifier, given training data
• Recognition
  – How the classifier is to be used on novel data
![Page 26: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/26.jpg)
Representation
– Generative / discriminative / hybrid
![Page 27: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/27.jpg)
Representation
– Generative / discriminative / hybrid
– Appearance only or location and appearance
![Page 28: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/28.jpg)
Representation
– Generative / discriminative / hybrid
– Appearance only or location and appearance
– Invariances
  • View point
  • Illumination
  • Occlusion
  • Scale
  • Deformation
  • Clutter
  • etc.
![Page 29: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/29.jpg)
Representation
– Generative / discriminative / hybrid
– Appearance only or location and appearance
– Invariances
– Part-based or global w/ sub-window
![Page 30: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/30.jpg)
Representation
– Generative / discriminative / hybrid
– Appearance only or location and appearance
– Invariances
– Parts or global w/ sub-window
– Use set of features or each pixel in image
![Page 31: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/31.jpg)
Learning
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference, hence the current interest in machine learning
![Page 32: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/32.jpg)
Learning
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference, hence the current interest in machine learning
– Methods of training: generative vs. discriminative
[Figure: left panel, class densities p(x|C1) and p(x|C2) versus x; right panel, posterior probabilities p(C1|x) and p(C2|x) versus x]
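The relation between class densities (what a generative method models) and posterior probabilities (what a discriminative method models) can be sketched numerically. The Gaussian shapes, their parameters, and the equal priors below are assumptions made for illustration; they are not read off the slide's plot.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


x = np.linspace(0.0, 1.0, 101)

# Hypothetical class densities p(x|C1), p(x|C2) and priors p(C1), p(C2).
lik_c1 = gaussian_pdf(x, mean=0.3, std=0.10)
lik_c2 = gaussian_pdf(x, mean=0.7, std=0.15)
prior_c1, prior_c2 = 0.5, 0.5

# Bayes rule turns densities and priors into posteriors p(C1|x), p(C2|x).
evidence = lik_c1 * prior_c1 + lik_c2 * prior_c2
post_c1 = lik_c1 * prior_c1 / evidence
post_c2 = lik_c2 * prior_c2 / evidence

# First grid point at which C2 becomes the more probable class.
boundary = x[np.argmax(post_c1 < 0.5)]
print(boundary)
```

A discriminative method only needs the posterior curves (or just the boundary); a generative method estimates the densities and priors and derives the posteriors from them.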
![Page 33: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/33.jpg)
Learning
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference, hence the current interest in machine learning
– What are you maximizing? Likelihood (Gen.) or performance on the train/validation set (Disc.)
– Level of supervision
  • Manual segmentation; bounding box; image labels; noisy labels
[Figure: example image labeled “Contains a motorbike”]
![Page 34: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/34.jpg)
Learning
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference, hence the current interest in machine learning
– What are you maximizing? Likelihood (Gen.) or performance on the train/validation set (Disc.)
– Level of supervision
  • Manual segmentation; bounding box; image labels; noisy labels
– Batch/incremental (on category and image level; user feedback)
![Page 35: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/35.jpg)
Learning
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference, hence the current interest in machine learning
– What are you maximizing? Likelihood (Gen.) or performance on the train/validation set (Disc.)
– Level of supervision
  • Manual segmentation; bounding box; image labels; noisy labels
– Batch/incremental (on category and image level; user feedback)
– Training images:
  • Issue of overfitting
  • Negative images for discriminative methods
![Page 36: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/36.jpg)
Learning
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference, hence the current interest in machine learning
– What are you maximizing? Likelihood (Gen.) or performance on the train/validation set (Disc.)
– Level of supervision
  • Manual segmentation; bounding box; image labels; noisy labels
– Batch/incremental (on category and image level; user feedback)
– Training images:
  • Issue of overfitting
  • Negative images for discriminative methods
– Priors
![Page 37: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/37.jpg)
Recognition
– Scale / orientation range to search over
– Speed
![Page 38: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/38.jpg)
Bag-of-words models
![Page 39: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/39.jpg)
Object → Bag of ‘words’
![Page 40: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/40.jpg)
Analogy to documents
Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted point by point to visual centers in the brain; the cerebral cortex was a movie screen, so to speak, upon which the image in the eye was projected. Through the discoveries of Hubel and Wiesel we now know that behind the origin of the visual perception in the brain there is a considerably more complicated course of events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a step-wise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image.
sensory, brain, visual, perception, retinal, cerebral cortex, eye, cell, optical nerve, image, Hubel, Wiesel
China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value.
China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, foreign, increase, trade, value
![Page 41: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/41.jpg)
learning | recognition
![Page 42: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/42.jpg)
category decision
feature detection & representation
codewords dictionary
image representation
category models (and/or) classifiers
Representation
![Page 43: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/43.jpg)
1. feature detection & representation
2. codewords dictionary
3. image representation
1. Feature detection and representation
![Page 44: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/44.jpg)
1. Feature detection and representation
![Page 45: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/45.jpg)
Detect patches [Mikolajczyk and Schmid ’02] [Matas et al. ’02] [Sivic et al. ’03]
Normalize patch
Compute SIFT descriptor [Lowe ’99]
Slide credit: Josef Sivic
1. Feature detection and representation
![Page 46: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/46.jpg)
2. Codewords dictionary formation
![Page 47: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/47.jpg)
2. Codewords dictionary formation
![Page 48: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/48.jpg)
Vector quantization
Slide credit: Josef Sivic
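Vector quantization of patch descriptors can be sketched with plain k-means, where the cluster centers play the role of codewords. The slides do not say which clustering algorithm was used, so k-means is an assumption here, and `build_codebook` plus the random 128-D descriptors are hypothetical stand-ins for real SIFT output.

```python
import numpy as np


def build_codebook(descriptors, num_codewords, num_iters=20, seed=0):
    """Cluster descriptors with plain k-means; the centers act as codewords."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with randomly chosen descriptors.
    idx = rng.choice(len(descriptors), size=num_codewords, replace=False)
    centers = descriptors[idx].astype(float)
    for _ in range(num_iters):
        # Squared Euclidean distance from every descriptor to every center.
        diffs = descriptors[:, None, :] - centers[None, :, :]
        labels = (diffs ** 2).sum(axis=2).argmin(axis=1)
        for k in range(num_codewords):
            members = descriptors[labels == k]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[k] = members.mean(axis=0)
    return centers


# Toy stand-in for SIFT descriptors: 300 random 128-D vectors.
rng = np.random.default_rng(1)
descriptors = rng.normal(size=(300, 128))
codebook = build_codebook(descriptors, num_codewords=10)
print(codebook.shape)
```

In practice the descriptors would be pooled from many training images before clustering, and the number of codewords is a tuning choice.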
2. Codewords dictionary formation
![Page 49: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/49.jpg)
Fei-Fei et al. 2005
3. Image representation
![Page 50: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/50.jpg)
[Figure: histogram of codeword frequency; x-axis: codewords, y-axis: frequency]
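A minimal sketch of the image representation step: each descriptor is assigned to its nearest codeword, and the image becomes a normalized histogram of codeword counts. The function name, the nearest-neighbor assignment rule, and the tiny 2-D codebook are illustrative assumptions.

```python
import numpy as np


def bag_of_words_histogram(descriptors, codebook):
    """Map each descriptor to its nearest codeword and count occurrences."""
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    words = (diffs ** 2).sum(axis=2).argmin(axis=1)
    counts = np.bincount(words, minlength=len(codebook))
    return counts / counts.sum()  # normalize to relative frequencies


# Toy 2-D example with three codewords and four patch descriptors.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]])
hist = bag_of_words_histogram(descriptors, codebook)
print(hist)  # [0.25 0.5  0.25]
```

The histogram discards where each patch was found, which is what makes the representation a "bag".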
Representation
![Page 51: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/51.jpg)
1. feature detection & representation
2. codewords dictionary
3. image representation
Learning and Recognition
![Page 52: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/52.jpg)
category decision
codewords dictionary
category models (and/or) classifiers
2 case studies
![Page 53: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/53.jpg)
1. Naïve Bayes classifier
– Csurka et al. 2004
2. Hierarchical Bayesian text models (pLSA and LDA)
– Background: Hofmann 2001, Blei et al. 2004
– Object categorization: Sivic et al. 2005, Sudderth et al. 2005
– Natural scene categorization: Fei-Fei et al. 2005
First, some notations
![Page 54: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/54.jpg)
• $w_n$: each patch in an image
  – $w_n = [0, 0, \dots, 1, \dots, 0, 0]^T$
• $\mathbf{w}$: a collection of all $N$ patches in an image
  – $\mathbf{w} = [w_1, w_2, \dots, w_N]$
• $d_j$: the $j$th image in an image collection
• $c$: category of the image
• $z$: theme or topic of the patch
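The notation can be made concrete with a small sketch: each patch is a one-hot vector over the codeword vocabulary, and an image stacks its patch vectors as columns. The vocabulary size and codeword indices below are hypothetical.

```python
import numpy as np


def one_hot(codeword_index, vocab_size):
    """Return w_n: a vector with a single 1 at the patch's codeword index."""
    w_n = np.zeros(vocab_size, dtype=int)
    w_n[codeword_index] = 1
    return w_n


vocab_size = 5
patch_codewords = [2, 0, 2, 4]  # hypothetical codeword index of each patch

# w = [w_1, ..., w_N]: one column per patch, shape (vocab_size, N).
w = np.stack([one_hot(i, vocab_size) for i in patch_codewords], axis=1)

# Summing over patches recovers the codeword counts for the image.
counts = w.sum(axis=1)
print(counts)  # [1 0 2 0 1]
```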
Case #1: the Naïve Bayes model
![Page 55: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/55.jpg)
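[Graphical model: nodes c and w; plate N]
$$c^{*} = \arg\max_{c}\, p(c \mid \mathbf{w}) \propto p(c)\, p(\mathbf{w} \mid c) = p(c) \prod_{n=1}^{N} p(w_n \mid c)$$
• $c^{*}$: object class decision
• $p(c)$: prior prob. of the object classes
• $p(\mathbf{w} \mid c)$: image likelihood given the class
Csurka et al. 2004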
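The decision rule above can be sketched over codeword-count histograms, working in log space so the product over patches becomes a sum. This is an illustrative sketch, not the implementation of Csurka et al.: the function names, the toy data, and the Laplace smoothing term `alpha` are assumptions, and every class is assumed to have at least one training image.

```python
import numpy as np


def train_naive_bayes(histograms, labels, num_classes, alpha=1.0):
    """histograms: (num_images, V) codeword counts; labels: (num_images,) ints."""
    num_images, vocab_size = histograms.shape
    log_prior = np.zeros(num_classes)
    log_lik = np.zeros((num_classes, vocab_size))
    for c in range(num_classes):
        in_class = histograms[labels == c]
        log_prior[c] = np.log(len(in_class) / num_images)  # log p(c)
        # Laplace smoothing (an added assumption) avoids log(0) for unseen codewords.
        word_counts = in_class.sum(axis=0) + alpha
        log_lik[c] = np.log(word_counts / word_counts.sum())  # log p(w | c)
    return log_prior, log_lik


def classify(histogram, log_prior, log_lik):
    """c* = argmax_c [log p(c) + sum_n log p(w_n | c)], using codeword counts."""
    scores = log_prior + log_lik @ histogram
    return int(np.argmax(scores))


# Toy data: class 0 favors codeword 0, class 1 favors codeword 2.
train = np.array([[8, 1, 1], [7, 2, 1], [1, 1, 8], [0, 2, 8]])
labels = np.array([0, 0, 1, 1])
log_prior, log_lik = train_naive_bayes(train, labels, num_classes=2)
print(classify(np.array([6, 1, 0]), log_prior, log_lik))  # 0
print(classify(np.array([1, 0, 5]), log_prior, log_lik))  # 1
```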
Case #2: Hierarchical Bayesian text models
![Page 56: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/56.jpg)
Probabilistic Latent Semantic Analysis (pLSA), Hofmann, 2001
[Graphical model: nodes d, z, w; plates N and D]
Latent Dirichlet Allocation (LDA), Blei et al., 2001
[Graphical model: nodes c, π, z, w; plates N and D]
Case #2: Hierarchical Bayesian text models
![Page 57: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/57.jpg)
Probabilistic Latent Semantic Analysis (pLSA)
[Graphical model: nodes d, z, w; plates N and D]
[Figure label: “face”]
Sivic et al. ICCV 2005
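A compact sketch of how pLSA can be fitted with EM on a document-by-codeword count matrix, using the standard decomposition p(w|d) = Σ_z p(w|z) p(z|d). This is a generic textbook-style version, not the code used by Sivic et al.; the function name, the toy counts, the iteration count, and the small constant guarding against division by zero are assumptions.

```python
import numpy as np


def plsa(counts, num_topics, num_iters=50, seed=0):
    """counts: (D, V) matrix n(d, w). Returns p(w|z) as (Z, V) and p(z|d) as (D, Z)."""
    rng = np.random.default_rng(seed)
    num_docs, vocab_size = counts.shape
    p_w_z = rng.random((num_topics, vocab_size))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((num_docs, num_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(num_iters):
        # E-step: p(z | d, w) is proportional to p(w | z) p(z | d); shape (D, Z, V).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        resp = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * resp  # n(d, w) p(z | d, w)
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_w_z, p_z_d


# Toy counts: images 0-1 mostly use codewords 0-1, images 2-3 mostly use 2-3.
counts = np.array([[5, 4, 0, 0], [6, 3, 0, 1], [0, 0, 4, 6], [1, 0, 5, 5]])
p_w_z, p_z_d = plsa(counts, num_topics=2)
print(np.round(p_z_d, 2))
```

Each row of `p_z_d` gives the topic mixture of one image; in the object-discovery setting, a topic such as “face” would correspond to one row of `p_w_z`.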
Case #2: Hierarchical Bayesian text models
![Page 58: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/58.jpg)
Latent Dirichlet Allocation (LDA)
[Graphical model: nodes c, π, z, w; plates N and D]
[Figure label: “beach”]
Fei-Fei et al. ICCV 2005
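The generative story behind an LDA-style model with the nodes c, π, z, w can be sketched as sampling. This is a simplified illustration and not the exact model of Fei-Fei et al.: it assumes the category c selects a Dirichlet parameter vector, π holds the theme proportions of one image, each patch draws a theme z, and each theme draws a codeword w. The names `alpha` and `beta` and all numeric values are hypothetical.

```python
import numpy as np


def sample_image(category, alpha, beta, num_patches, rng):
    """Sample codeword indices for one image.

    alpha: (C, K) Dirichlet parameters per category (assumed form).
    beta:  (K, V) matrix whose rows are p(w | z).
    """
    pi = rng.dirichlet(alpha[category])              # theme proportions for this image
    z = rng.choice(len(pi), size=num_patches, p=pi)  # one theme per patch
    w = np.array([rng.choice(beta.shape[1], p=beta[k]) for k in z])
    return pi, z, w


rng = np.random.default_rng(0)
alpha = np.array([[5.0, 1.0], [1.0, 5.0]])  # category 0 favors theme 0, category 1 theme 1
beta = np.array([[0.7, 0.2, 0.1, 0.0],      # theme 0 favors low-index codewords
                 [0.0, 0.1, 0.2, 0.7]])     # theme 1 favors high-index codewords
pi, z, w = sample_image(category=0, alpha=alpha, beta=beta, num_patches=20, rng=rng)
print(pi, w)
```

Learning inverts this process: given only the codewords w (and, during training, the category c), the themes and their distributions are inferred.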
![Page 59: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/59.jpg)
Another application
• Human action classification
Invariance issues
![Page 60: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/60.jpg)
• Scale and rotation
  – Implicit
  – Detectors and descriptors
Kadir and Brady, 2003
Invariance issues
![Page 61: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/61.jpg)
• Scale and rotation
• Occlusion
  – Implicit in the models
  – Codeword distribution: small variations
  – (In theory) Theme (z) distribution: different occlusion patterns
Invariance issues
![Page 62: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/62.jpg)
• Scale and rotation
• Occlusion
• Translation
  – Encode (relative) location information
Sudderth et al. 2005
Invariance issues
![Page 63: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/63.jpg)
• Scale and rotation
• Occlusion
• Translation
• View point (in theory)
  – Codewords: detector and descriptor
  – Theme distributions: different view points
Fergus et al. 2005
Model properties
![Page 64: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/64.jpg)
Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted point by point to visual centers in the brain; the cerebral cortex was a movie screen, so to speak, upon which the image in the eye was projected. Through the discoveries of Hubel and Wiesel we now know that behind the origin of the visual perception in the brain there is a considerably more complicated course of events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a step-wise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image.
sensory, brain, visual, perception, retinal, cerebral cortex, eye, cell, optical nerve, image, Hubel, Wiesel
• Intuitive
– Analogy to documents
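The document analogy can be sketched in a few lines: a text is reduced to counts of vocabulary words, and an image is reduced in the same way to counts of quantized patch descriptors ("codewords"). This is only an illustration; the function names, the codebook, and the descriptor arrays are assumptions, not part of the original slides.

```python
from collections import Counter
import numpy as np

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in a document."""
    tokens = [t.strip(".,;:").lower() for t in text.split()]
    counts = Counter(t for t in tokens if t in vocabulary)
    return [counts[w] for w in vocabulary]

def bag_of_codewords(descriptors, codebook):
    """Histogram of nearest codewords for a set of patch descriptors.

    descriptors: (n, d) array, one row per image patch (hypothetical input).
    codebook:    (k, d) array of codeword centers (hypothetical input).
    """
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return np.bincount(nearest, minlength=len(codebook))
```

In both cases the order of the words (or the position of the patches) is discarded and only the histogram is kept.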
Model properties
![Page 65: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/65.jpg)
• Intuitive
• (Could use) generative models
– Convenient for weakly- or un-supervised training
– Prior information
– Hierarchical Bayesian framework
Sivic et al., 2005, Sudderth et al., 2005
Model properties
![Page 66: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/66.jpg)
• Intuitive
• (Could use) generative models
• Learning and recognition relatively fast
– Compare to other methods
Weakness of the model
![Page 67: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/67.jpg)
• No rigorous geometric information of the object components
• It’s intuitive to most of us that objects are made of parts – no such information
• Not extensively tested yet for
– View point invariance
– Scale invariance
• Segmentation and localization unclear
![Page 68: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/68.jpg)
part-based models
Slides courtesy of Rob Fergus for “part-based models”
![Page 69: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/69.jpg)
One-shot learning of object categories
Fei-Fei et al. ‘03, ‘04, ‘06
![Page 70: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/70.jpg)
One-shot learning
of object categories
P. Bruegel, 1562
Fei-Fei et al. ‘03, ‘04, ‘06
model representation
![Page 71: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/71.jpg)
One-shot learning of object categories
Fei-Fei et al. ‘03, ‘04, ‘06
X (location)
![Page 72: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/72.jpg)
(x,y) coords. of region center
A (appearance): 11 x 11 patch → normalize → projection onto PCA basis → coefficients c1, c2, …, c10
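The appearance pipeline on this slide (11 x 11 patch, normalize, project onto a PCA basis, keep c1 … c10) could be sketched roughly as follows. The normalization used here (zero mean, unit norm) and all function names are assumptions for illustration; the slides do not say which normalization was used.

```python
import numpy as np

def normalize_patch(patch):
    """Zero-mean, unit-norm version of an 11 x 11 patch, flattened to 121 values."""
    v = patch.astype(float).ravel()
    v = v - v.mean()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def fit_pca_basis(patches, n_components=10):
    """Learn a PCA basis from normalized, flattened patches of shape (n, 121)."""
    mean = patches.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:n_components]

def appearance(patch, mean, basis):
    """Project a normalized patch onto the PCA basis: returns c1 .. c10."""
    return basis @ (normalize_patch(patch) - mean)
```

Each detected region is then described by its location X and this short coefficient vector A instead of the raw 121 pixel values.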
X (location)
The Generative Model
![Page 73: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/73.jpg)
(x,y) coords. of region center
A (appearance): 11 x 11 patch → normalize → projection onto PCA basis → coefficients c1, c2, …, c10
[Graphical model: nodes X, A, h, I; parameters μX, ΓX, μA, ΓA]
The Generative Model
![Page 74: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/74.jpg)
[Graphical model: nodes X, A, h, I; parameters μX, ΓX, μA, ΓA. Legend: observed variables, hidden variable, parameters]
The Generative Model
![Page 75: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/75.jpg)
[Graphical model: nodes X, A, h, I; parameters μX, ΓX, μA, ΓA]
θ1, θ2, …, θn, where θ = {µX, ΓX, µA, ΓA}
ML/MAP
Weber et al. ’98 ’00, Fergus et al. ’03
The Generative Model
![Page 76: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/76.jpg)
θ1, θ2, …, θn, where θ = {µX, ΓX, µA, ΓA}
ML/MAP
μX, ΓX: shape model
μA, ΓA: appearance model
The Generative Model
![Page 77: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/77.jpg)
[Graphical model: nodes X, A, h, I; parameters μX, ΓX, μA, ΓA]
θ1, θ2, …, θn
ML/MAP
The Generative Model
![Page 78: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/78.jpg)
[Graphical model: nodes X, A, h, I, P; parameters μX, ΓX, μA, ΓA; also shown: m0X, β0X, a0X, B0X, m0A, β0A, a0A, B0A]
Bayesian
Parameters to estimate: {mX, βX, aX, BX, mA, βA, aA, BA}, i.e. parameters of the Normal-Wishart distribution
θ1, θ2, …, θn
Fei-Fei et al. ‘03, ‘04, ‘06
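To give a feel for what the Normal-Wishart parameters {m, β, a, B} do, here is the textbook conjugate update for a single Gaussian with unknown mean and precision. It is a simplified sketch, not the method of the slides: it assumes fully observed data (no hidden variable h, so no variational EM is needed) and assumes B plays the role of an inverse scale matrix in the Wishart density. The function name is hypothetical.

```python
import numpy as np

def normal_wishart_update(x, m0, beta0, a0, B0):
    """Posterior Normal-Wishart parameters after observing data x of shape (n, d).

    Assumed prior: p(mu, Gamma) = N(mu | m0, (beta0 * Gamma)^-1) * W(Gamma | a0, B0),
    where Gamma is the precision matrix and B0 is an inverse scale matrix.
    """
    n = x.shape[0]
    xbar = x.mean(axis=0)
    centered = x - xbar
    scatter = centered.T @ centered          # sum of outer products around the mean
    beta = beta0 + n
    a = a0 + n
    m = (beta0 * m0 + n * xbar) / beta       # prior mean pulled toward the sample mean
    diff = (xbar - m0)[:, None]
    B = B0 + scatter + (beta0 * n / beta) * (diff @ diff.T)
    return m, beta, a, B
```

With very few training examples the posterior stays close to the prior, which is the intuition behind using priors for one-shot learning.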
The Generative Model
![Page 79: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/79.jpg)
[Graphical model: nodes X, A, h, I, P. Parameters: μX, ΓX, μA, ΓA. Priors: m0X, β0X, a0X, B0X, m0A, β0A, a0A, B0A]
Fei-Fei et al. ‘03, ‘04, ‘06
The Generative Model
![Page 80: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/80.jpg)
[Graphical model: nodes X, A, h, I, P. Parameters: μX, ΓX, μA, ΓA. Priors: m0X, β0X, a0X, B0X, m0A, β0A, a0A, B0A]
Prior distribution
Fei-Fei et al. ‘03, ‘04, ‘06
1. human vision
![Page 81: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/81.jpg)
2. model representation
3. learning & inferences
4. evaluation & dataset & application
One-shot learning
of object categories
learning & inferences
![Page 82: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/82.jpg)
One-shot learning of object categories
Fei-Fei et al. 2003, 2004, 2006
No labeling No segmentation No alignment
learning & inferences
![Page 83: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/83.jpg)
One-shot learning of object categories
Fei-Fei et al. 2003, 2004, 2006
Bayesian
θ1, θ2, …, θn
Random initialization
Variational EM
![Page 84: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/84.jpg)
E-Step
new estimate of p(θ|train)
M-Step
prior knowledge of p(θ)
Attias, Jordan, Hinton etc.
evaluation & dataset
![Page 85: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/85.jpg)
One-shot learning of object categories
Fei-Fei et al. 2004, 2006a, 2006b
evaluation & dataset -- Caltech 101 Dataset
![Page 86: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/86.jpg)
One-shot learning of object categories
Fei-Fei et al. 2004, 2006a, 2006b
evaluation & dataset -- Caltech 101 Dataset
![Page 87: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/87.jpg)
One-shot learning of object categories
Fei-Fei et al. 2004, 2006a, 2006b
![Page 88: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/88.jpg)
Part 3: discriminative methods
Discriminative methods
Object detection and recognition is formulated as a classification problem.
![Page 89: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/89.jpg)
Where are the screens?
The image is partitioned into a set of overlapping windows … and a decision is taken at each window about whether it contains a target object or not.
Bag of image patches
In some feature space: a decision boundary separates “Computer screen” from “Background”
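A minimal sketch of the window-based formulation: enumerate overlapping windows and keep those the classifier accepts. The window size, stride, threshold, and the `score_window` callable are hypothetical placeholders, not values from the slides.

```python
def sliding_windows(height, width, win=32, stride=16):
    """Yield (top, left, bottom, right) for overlapping windows over an image."""
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            yield top, left, top + win, left + win

def detect(image_shape, score_window, threshold=0.0, win=32, stride=16):
    """Keep the windows whose classifier score exceeds the threshold.

    score_window is a hypothetical callable: box -> real-valued score.
    """
    height, width = image_shape
    return [box for box in sliding_windows(height, width, win, stride)
            if score_window(box) > threshold]
```

A stride smaller than the window size is what makes the windows overlap; a real detector would also repeat this over several scales.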
Discriminative vs. generative
![Page 90: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/90.jpg)
• Generative model (The artist) [plot; horizontal axis: x = data]
• Discriminative model (The lousy painter) [plot; horizontal axis: x = data]
• Classification function [plot; horizontal axis: x = data; values between -1 and 1]
Discriminative methods
![Page 91: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/91.jpg)
Nearest neighbor (10^6 examples)
Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005; …
Neural networks
LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998; …
Support Vector Machines and Kernels
Guyon, Vapnik; Heisele, Serre, Poggio, 2001; …
Conditional Random Fields
McCallum, Freitag, Pereira 2000; Kumar, Hebert 2003; …
Formulation
![Page 92: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/92.jpg)
• Formulation: binary classification
Features x = x1, x2, x3, …, xN (training data); xN+1, xN+2, …, xN+M (test data)
Labels y = +1, -1, -1, …, -1 (training data); ?, ?, …, ? (test data)
Training data: each image patch is labeled as containing the object or background
• Classification function, which belongs to some family of functions
• Minimize misclassification error (Not that simple: we need some guarantees that there will be generalization)
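As a small illustration of the objective, the misclassification error of a real-valued classification function on labeled patches can be computed as below. The name `F` and the convention of taking the sign of its output are assumptions; the slide's own formula did not survive extraction.

```python
import numpy as np

def misclassification_error(F, X, y):
    """Fraction of samples where sign(F(x)) disagrees with the label y in {-1, +1}.

    F is a hypothetical callable mapping an (n, d) feature array to n real scores.
    """
    predictions = np.where(F(X) >= 0, 1, -1)
    return float(np.mean(predictions != y))
```

Measuring this error on held-out test patches rather than on the training patches is the usual way to check the generalization caveat mentioned on the slide.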
Overview of section
![Page 93: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/93.jpg)
• Object detection with classifiers
• Boosting
– Gentle boosting
– Weak detectors
– Object model
– Object detection
• Multiclass object detection
Why boosting?
![Page 94: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/94.jpg)
• A simple algorithm for learning robust classifiers
– Freund & Schapire, 1995
– Friedman, Hastie, Tibshirani, 1998
• Provides efficient algorithm for sparse visual feature selection
– Tieu & Viola, 2000
– Viola & Jones, 2003
• Easy to implement, does not require external optimization tools.
Boosting
![Page 95: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/95.jpg)
• Defines a classifier using an additive model:
[Equation labels: strong classifier; weight; weak classifier; features vector]
Boosting
![Page 96: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/96.jpg)
• Defines a classifier using an additive model:
[Equation labels: strong classifier; weight; weak classifier, from a family of weak classifiers; features vector]
• We need to define a family of weak classifiers
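The equation on this slide did not survive extraction. Going by the labels that remain (strong classifier, weight, weak classifier, features vector), an additive model would take a form like the one below. The symbols F, α_t, f_t, x and the number of rounds T are assumed here, chosen to match the f1 … f4 labels of the toy example; they are not recovered from the slide itself.

```latex
F(x) = \sum_{t=1}^{T} \alpha_t \, f_t(x), \qquad \hat{y} = \operatorname{sign}\bigl(F(x)\bigr)
```

Here F is the strong classifier, each f_t is a weak classifier drawn from the chosen family, α_t is its weight, and x is the features vector.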
Boosting
• It is a sequential procedure:
![Page 97: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/97.jpg)
• It is a sequential procedure:
Data points: xt=1, xt=2, …, xt
Each data point has a class label yt = +1 or -1, and a weight wt = 1.
Toy example
![Page 98: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/98.jpg)
Weak learners from the family of lines
h => p(error) = 0.5: it is at chance
Each data point has a class label yt = +1 or -1, and a weight wt = 1.
Toy example
![Page 99: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/99.jpg)
This one seems to be the best
Each data point has a class label yt = +1 or -1, and a weight wt = 1.
This is a ‘weak classifier’: It performs slightly better than chance.
Toy example
![Page 100: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/100.jpg)
We set a new problem for which the previous weak classifier performs at chance again.
Each data point has a class label yt = +1 or -1.
We update the weights: wt ← wt exp{-yt Ht}
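The sequential procedure and the weight update can be sketched as a small gentle-boosting loop. Several things here are assumptions rather than the slides' method: axis-aligned regression stumps stand in for the "family of lines" of the toy example, Ht is read as the output of the weak learner chosen in the current round, and the weights are renormalized each round for numerical convenience. All function names are hypothetical.

```python
import numpy as np

def fit_stump(X, y, w):
    """Weighted least-squares regression stump: f(x) = a if x[d] > th else b."""
    best = None
    for d in range(X.shape[1]):
        for th in np.unique(X[:, d]):
            above = X[:, d] > th
            wa, wb = w[above].sum(), w[~above].sum()
            a = (w[above] * y[above]).sum() / wa if wa > 0 else 0.0
            b = (w[~above] * y[~above]).sum() / wb if wb > 0 else 0.0
            pred = np.where(above, a, b)
            err = (w * (y - pred) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, d, th, a, b)
    return best[1:]

def stump_output(stump, X):
    """Evaluate one stump on all rows of X."""
    d, th, a, b = stump
    return np.where(X[:, d] > th, a, b)

def gentle_boost(X, y, rounds=10):
    """Sequentially add weak learners, reweighting the data after each round."""
    w = np.ones(len(y)) / len(y)
    stumps = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        f = stump_output(stump, X)
        w = w * np.exp(-y * f)   # misclassified points gain weight
        w = w / w.sum()
        stumps.append(stump)
    return stumps

def strong_classifier(stumps, X):
    """Sign of the sum of all weak learner outputs."""
    return np.sign(sum(stump_output(s, X) for s in stumps))
```

After each update, the points the last weak learner got wrong carry more weight, so the next weak learner is pushed toward a different part of the data.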
Toy example
![Page 101: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/101.jpg)
We set a new problem for which the previous weak classifier performs at chance again.
Each data point has a class label yt = +1 or -1.
We update the weights: wt ← wt exp{-yt Ht}
Toy example
![Page 102: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/102.jpg)
We set a new problem for which the previous weak classifier performs at chance again.
Each data point has a class label yt = +1 or -1.
We update the weights: wt ← wt exp{-yt Ht}
Toy example
![Page 103: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/103.jpg)
We set a new problem for which the previous weak classifier performs at chance again.
Each data point has a class label yt = +1 or -1.
We update the weights: wt ← wt exp{-yt Ht}
Toy example
f1 f2
![Page 104: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/104.jpg)
The strong (non-linear) classifier is built as the combination of all the weak (linear) classifiers.
f3 f4
From images to features: Weak detectors
![Page 105: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/105.jpg)
We will now define a family of visual features that can be used as weak classifiers (“weak detectors”).
Each takes an image as input and outputs a binary response. The output is a weak detector.
Weak detectors
Textures of textures
![Page 106: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/106.jpg)
Textures of textures
Tieu and Viola, CVPR 2000
Every combination of three filters generates a different feature
This gives thousands of features. Boosting selects a sparse subset, so computations at test time are very efficient. Boosting also avoids overfitting to some extent.
Weak detectors
Haar filters and integral image
![Page 107: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/107.jpg)
Haar filters and integral image
Viola and Jones, ICCV 2001
The average intensity in the block is computed with four sums independently of the block size.
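The four-lookup trick can be shown in a short sketch. The function names and the particular two-rectangle feature are illustrative assumptions; the point is only that any block sum costs four table reads once the integral image is built.

```python
import numpy as np

def integral_image(img):
    """Cumulative sum table with a zero row and column padded on the top and left."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def block_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] from four table lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def block_mean(ii, top, left, bottom, right):
    """Average intensity of the block; the cost does not depend on block size."""
    return block_sum(ii, top, left, bottom, right) / ((bottom - top) * (right - left))

def haar_two_rect(ii, top, left, height, width):
    """Two-rectangle Haar-like response: left half minus right half."""
    mid = left + width // 2
    return (block_sum(ii, top, left, top + height, mid)
            - block_sum(ii, top, mid, top + height, left + width))
```

The padded zero row and column avoid special cases for blocks that touch the image border.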
Weak detectors
Other weak detectors:
![Page 108: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/108.jpg)
Other weak detectors:
• Carmichael, Hebert 2004
• Yuille, Snow, Nitzberg, 1998
• Amit, Geman 1998
• Papageorgiou, Poggio, 2000
• Heisele, Serre, Poggio, 2001
• Agarwal, Awan, Roth, 2004
• Schneiderman, Kanade 2004
• …
Weak detectors
![Page 109: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/109.jpg)
Part based: similar to part-based generative models. We create weak detectors by using parts and voting for the object center location.
Car model Screen model
These features are used for the detector on the course web site.
Weak detectors
![Page 110: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/110.jpg)
First we collect a set of part templates from a set of training objects.
Vidal-Naquet, Ullman (2003)
…
Weak detectors
We now define a family of “weak detectors” as:
![Page 111: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/111.jpg)
[Figure: weak detector diagram (operators: *, =)]
Better than chance
Weak detectors
We can do a better job using filtered images
![Page 112: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/112.jpg)
Still a weak detector but better than before
[Figure: weak detector diagram using filtered images (operators: *, =)]
Training
First we evaluate all the N features on all the training images.
![Page 113: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/113.jpg)
Then, we sample the feature outputs on the object center and at random locations in the background:
Representation and object model
Selected features for the screen detector
![Page 114: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/114.jpg)
[Selected features: 1, 2, 3, 4, …, 10, …, 100]
Lousy painter
Representation and object model
Selected features for the car detector
![Page 115: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/115.jpg)
[Selected features: 1, 2, 3, 4, …, 10, …, 100]
Overview of section
![Page 116: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/116.jpg)
• Object detection with classifiers
• Boosting
– Gentle boosting
– Weak detectors
– Object model
– Object detection
• Multiclass object detection
Example: screen detection
Feature output
![Page 117: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/117.jpg)
Example: screen detection
Feature output
Thresholded output
![Page 118: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/118.jpg)
Weak ‘detector’
Produces many false alarms.
Example: screen detection
Feature output
Thresholded output
Strong classifier at iteration 1
![Page 119: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/119.jpg)
Example: screen detection
Feature output
Thresholded output
Strong classifier
![Page 120: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/120.jpg)
Second weak ‘detector’
Produces a different set of false alarms.
Example: screen detection
Feature output
Thresholded output
Strong classifier
![Page 121: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/121.jpg)
+
Strong classifier at iteration 2
Example: screen detection
Feature output
Thresholded output
Strong classifier
![Page 122: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/122.jpg)
+
…
Strong classifier at iteration 10
Example: screen detection
Feature output
Thresholded output
Strong classifier
![Page 123: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/123.jpg)
+
…
Adding features
Final classification
Strong classifier at iteration 200
applications
![Page 124: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/124.jpg)
![Page 125: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/125.jpg)
![Page 126: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/126.jpg)
![Page 127: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/127.jpg)
Document Analysis
![Page 128: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/128.jpg)
Digit recognition, AT&T labs
http://www.research.att.com/~yann/
Medical Imaging
![Page 129: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/129.jpg)
Robotics
![Page 130: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/130.jpg)
Toys and robots
![Page 131: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/131.jpg)
Finger prints
![Page 132: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/132.jpg)
Surveillance
![Page 133: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/133.jpg)
Security
![Page 134: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/134.jpg)
Searching the web
![Page 135: slides: Machine Learning for Computer Vision](https://reader033.vdocuments.net/reader033/viewer/2022052319/577d1ef11a28ab4e1e8f973e/html5/thumbnails/135.jpg)