![Page 1: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/1.jpg)
Object Recognition: History and Overview
Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce
![Page 2: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/2.jpg)
How many visual object categories are there?
Biederman 1987
![Page 3: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/3.jpg)
![Page 4: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/4.jpg)
OBJECTS
OBJECTS → ANIMALS, PLANTS, INANIMATE
ANIMALS → VERTEBRATE, …..
VERTEBRATE → MAMMALS, BIRDS
MAMMALS → TAPIR, BOAR
BIRDS → GROUSE
INANIMATE → NATURAL, MAN-MADE
MAN-MADE → CAMERA
![Page 5: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/5.jpg)
So what does object recognition involve?
![Page 6: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/6.jpg)
Scene categorization
• outdoor
• city
• …
![Page 7: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/7.jpg)
Image-level annotation: are there people?
• outdoor
• city
• …
![Page 8: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/8.jpg)
Object detection: where are the people?
![Page 9: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/9.jpg)
Image parsing
mountain
building
tree
banner
vendor
people
street lamp
![Page 10: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/10.jpg)
Variability: Camera position, Illumination, Shape parameters
Within-class variations?
Modeling variability
![Page 11: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/11.jpg)
Within-class variations
![Page 12: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/12.jpg)
Variability: Camera position, Illumination
Alignment
Roberts (1965); Lowe (1987); Faugeras & Hebert (1986); Grimson & Lozano-Perez (1986); Huttenlocher & Ullman (1987)
Shape: assumed known
![Page 13: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/13.jpg)
Recall: Alignment
• Alignment: fitting a model to a transformation between pairs of features (matches) in two images
Find the transformation $T$ that minimizes

$$\sum_i \operatorname{residual}\big(T(x_i),\, x_i'\big)$$

where $x_i$ and $x_i'$ are matched features in the two images.
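As a concrete instance of the alignment objective above, the sketch below fits a 2D similarity transform (rotation, uniform scale, translation) to matched points by least squares. The choice of a similarity transform, the squared-distance residual, and the function names `fit_similarity` and `total_residual` are assumptions made for illustration; the slide does not fix a particular transformation family or residual.

```python
def fit_similarity(src, dst):
    """Least-squares fit of T(x) = a*x + b, with 2D points treated as complex numbers.

    The complex number a encodes rotation and uniform scale; b encodes translation.
    src and dst are equal-length lists of matched (x, y) pairs.
    """
    xs = [complex(px, py) for px, py in src]
    ys = [complex(px, py) for px, py in dst]
    n = len(xs)
    mx = sum(xs) / n  # centroid of the source points
    my = sum(ys) / n  # centroid of the target points
    # Closed-form solution of the complex linear regression y = a*x + b.
    num = sum((x - mx).conjugate() * (y - my) for x, y in zip(xs, ys))
    den = sum(abs(x - mx) ** 2 for x in xs)
    a = num / den
    b = my - a * mx
    return a, b


def total_residual(a, b, src, dst):
    """Sum of squared distances between T(x_i) and x_i' over all matches."""
    return sum(abs(a * complex(*p) + b - complex(*q)) ** 2
               for p, q in zip(src, dst))
```

With noisy or partly wrong matches, a plain least-squares fit like this is usually wrapped in a robust procedure; that step is left out here.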
![Page 14: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/14.jpg)
Recall: Origins of computer vision
L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.
![Page 15: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/15.jpg)
Alignment: Huttenlocher & Ullman (1987)
![Page 16: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/16.jpg)
Variability
Invariance to: Camera position, Illumination, Internal parameters
Duda & Hart (1972); Weiss (1987); Mundy et al. (1992-94); Rothwell et al. (1992); Burns et al. (1993)
![Page 17: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/17.jpg)
General 3D objects do not admit monocular viewpoint invariants (Burns et al., 1993)
Projective invariants (Rothwell et al., 1992):
Recall: invariant to similarity transformations computed from four points
[Figure: four points A, B, C, D]
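One way to obtain a similarity invariant from four points is to express C and D in the coordinate frame defined by A and B. The sketch below is an assumed construction chosen for illustration; the slide does not say which invariant it has in mind. The function name `similarity_invariant` is hypothetical. The result is unchanged by rotation, uniform scaling, and translation, but not by reflection.

```python
def similarity_invariant(a, b, c, d):
    """Coordinates of C and D in the frame where A maps to 0 and B maps to 1.

    Points are (x, y) pairs, handled as complex numbers. A and B must be distinct.
    """
    A, B, C, D = (complex(*p) for p in (a, b, c, d))
    base = B - A
    # A similarity z -> s*z + t multiplies every difference by s, so the ratios cancel it.
    return (C - A) / base, (D - A) / base
```

Two four-point configurations related by a similarity transform give the same pair of values, so the pair can serve as a lookup key for matching.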
![Page 18: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/18.jpg)
ACRONYM (Brooks and Binford, 1981)
Representing and recognizing object categories is harder...
Binford (1971), Nevatia & Binford (1972), Marr & Nishihara (1978)
![Page 19: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/19.jpg)
Recognition by components
Geons (Biederman 1987)
???
![Page 20: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/20.jpg)
Zisserman et al. (1995)
Generalized cylinders
Ponce et al. (1989)
Forsyth (2000)
General shape primitives?
![Page 21: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/21.jpg)
Empirical models of image variability
Appearance-based techniques
Turk & Pentland (1991); Murase & Nayar (1995); etc.
![Page 22: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/22.jpg)
Eigenfaces (Turk & Pentland, 1991)
![Page 23: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/23.jpg)
Color Histograms
Swain and Ballard, Color Indexing, IJCV 1991.
![Page 24: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/24.jpg)
H. Murase and S. Nayar, Visual learning and recognition of 3-d objects from appearance, IJCV 1995
Appearance manifolds
![Page 25: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/25.jpg)
Limitations of global appearance models
• Requires global registration of patterns
• Not robust to clutter, occlusion, geometric transformations
![Page 26: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/26.jpg)
Sliding window approaches
• Turk and Pentland, 1991
• Belhumeur, Hespanha, & Kriegman, 1997
• Schneiderman & Kanade, 2004
• Viola and Jones, 2000
• Agarwal and Roth, 2002
• Poggio et al., 1993
![Page 27: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/27.jpg)
Sliding window approaches
– Scale / orientation range to search over
– Speed
– Context
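The scale-range and speed concerns listed above are easier to see in a minimal sketch of the window enumeration. Everything here is an illustrative assumption: square windows, the default base size, stride fraction and scale step, and the names `sliding_windows`, `detect`, and `score_window` (a caller-supplied classifier). Orientation search is omitted.

```python
def sliding_windows(img_w, img_h, base=24, stride_frac=0.25, scale_step=1.25):
    """Yield (x, y, size) for square windows over a range of scales.

    The window side grows geometrically from `base` until it no longer fits,
    and the stride is a fixed fraction of the current window side.
    """
    size = float(base)
    while size <= min(img_w, img_h):
        s = int(round(size))
        step = max(1, int(s * stride_frac))
        for y in range(0, img_h - s + 1, step):
            for x in range(0, img_w - s + 1, step):
                yield x, y, s
        size *= scale_step


def detect(img_w, img_h, score_window, threshold=0.5, **window_args):
    """Return every window whose classifier score reaches the threshold."""
    return [(x, y, s)
            for x, y, s in sliding_windows(img_w, img_h, **window_args)
            if score_window(x, y, s) >= threshold]
```

The number of windows grows quickly as the stride shrinks or the scale range widens, and each window is scored without reference to the rest of the scene.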
![Page 28: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/28.jpg)
Local features
Combining local appearance, spatial constraints, invariants, and classification techniques from machine learning.
Schmid & Mohr ’97; Lowe ’02; Mahamud & Hebert ’03
![Page 29: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/29.jpg)
Local features for recognition of object instances
![Page 30: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/30.jpg)
Local features for recognition of object instances
• Lowe et al., 1999, 2003
• Mahamud and Hebert, 2000
• Ferrari, Tuytelaars, and Van Gool, 2004
• Rothganger, Lazebnik, and Ponce, 2004
• Moreels and Perona, 2005
• …
![Page 31: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/31.jpg)
Representing categories: Parts and Structure
Weber, Welling & Perona (2000), Fergus, Perona & Zisserman (2003)
![Page 32: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/32.jpg)
Parts-and-shape representation
• Model:
– Object as a set of parts
– Relative locations between parts
– Appearance of part
Figure from [Fischler & Elschlager 73]
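The three ingredients of the model (parts, relative locations, part appearance) can be combined into a single cost for a candidate placement of the parts. The sketch below is only an illustration under stated assumptions: the quadratic "spring" penalty, the dictionary-based inputs, the brute-force search, and the names `configuration_cost` and `best_configuration` are all hypothetical choices, not the formulation of any cited paper.

```python
from itertools import product


def configuration_cost(locations, appearance_cost, springs):
    """Appearance mismatch of each part plus deformation of each connected pair.

    locations       : one (x, y) per part
    appearance_cost : list of dicts, appearance_cost[p][(x, y)] = mismatch of part p there
    springs         : dict mapping (i, j) to the preferred offset (dx, dy) from part i to part j
    """
    cost = sum(appearance_cost[p][loc] for p, loc in enumerate(locations))
    for (i, j), (dx, dy) in springs.items():
        (xi, yi), (xj, yj) = locations[i], locations[j]
        cost += (xj - xi - dx) ** 2 + (yj - yi - dy) ** 2
    return cost


def best_configuration(candidates, appearance_cost, springs):
    """Exhaustively try every assignment of candidate locations to parts."""
    n_parts = len(appearance_cost)
    return min(product(candidates, repeat=n_parts),
               key=lambda locs: configuration_cost(locs, appearance_cost, springs))
```

Exhaustive search is exponential in the number of parts and is used here only to keep the example short.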
![Page 33: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/33.jpg)
Bag-of-features models
Object → Bag of ‘words’
![Page 34: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/34.jpg)
Objects as texture
• All of these are treated as being the same
• No distinction between foreground and background: scene recognition?
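The "objects as texture" view can be made concrete with a small sketch: each local descriptor is assigned to its nearest visual word, and only the word counts are kept. The codebook is assumed to be given (it would typically come from clustering training descriptors), and the names `nearest_word` and `bag_of_words` are hypothetical.

```python
def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])))


def bag_of_words(descriptors, codebook):
    """Normalized histogram of visual-word occurrences; feature positions are discarded."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

Because only counts survive, any rearrangement of the same features yields the same histogram, and features from the background contribute exactly like those from the object.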
![Page 35: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/35.jpg)
Timeline of recognition
• 1965-late 1980s: alignment, geometric primitives
• Early 1990s: invariants, appearance-based methods
• Mid-late 1990s: sliding window approaches
• Late 1990s: feature-based methods
• Early 2000s: parts-and-shape models
• 2003 – present: bags of features
• Present trends: combination of local and global methods, modeling context, emphasis on “image parsing”
![Page 36: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/36.jpg)
Global scene context
• The “gist” of a scene: Oliva & Torralba (2001)
http://people.csail.mit.edu/torralba/code/spatialenvelope/
![Page 37: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/37.jpg)
J. Hays and A. Efros, Scene Completion using Millions of Photographs, SIGGRAPH 2007
![Page 38: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/38.jpg)
Scene-level context for image parsing
J. Tighe and S. Lazebnik, ECCV 2010 submission
![Page 39: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/39.jpg)
D. Hoiem, A. Efros, and M. Hebert. Putting Objects in Perspective. CVPR 2006.
Geometric context
![Page 40: Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce](https://reader035.vdocuments.net/reader035/viewer/2022062500/56649d4b5503460f94a2959b/html5/thumbnails/40.jpg)
What “works” today
• Reading license plates, zip codes, checks
• Fingerprint recognition
• Face detection
• Recognition of flat textured objects (CD covers, book covers, etc.)