CS664 Lecture #21: SIFT, object recognition, dynamic programming
Some material taken from:
Sebastian Thrun, Stanford: http://cs223b.stanford.edu/
Yuri Boykov, Western Ontario
David Lowe, UBC: http://www.cs.ubc.ca/~lowe/keypoints/
Announcements
Paper report due on 11/15
Next quiz Tuesday 11/15
– Coverage through next lecture
PS#2 due today (November 8)
– Code is due today; you can hand in the writeup without penalty until 11:59PM Thursday (November 10)
There will be a (short) PS3, due on the last day of classes.
Invariant local features
– Invariant to affine transformations, or changes in camera gain and bias
SIFT Features
Keypoint detection
Laplacian is a center-surround filter
– Very high response at a dark point surrounded by bright stuff
• Very low response at the opposite
In practice, often computed as a difference of Gaussians (DOG) filter:
– $(I * h_{\sigma_1}) - (I * h_{\sigma_2})$, where $*$ is convolution and $\sigma_1/\sigma_2$ is around 2
– Scale parameter σ is important
Keypoints are maxima (minima) of DOG that occur at multiple scales
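The DOG filter above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names (`gaussian_kernel`, `gaussian_blur`, `difference_of_gaussians`), the truncation of the kernel at about 3σ, and the border clamping are choices made here, not details from the slides.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel, truncated at about 3 sigma (assumption)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(k * row[min(max(x + j - radius, 0), width - 1)]
             for j, k in enumerate(kernel))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def difference_of_gaussians(image, sigma1, sigma2):
    """Return (I * h_sigma1) - (I * h_sigma2), pixel by pixel."""
    wide = gaussian_blur(image, sigma1)
    narrow = gaussian_blur(image, sigma2)
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(wide, narrow)]

# A single dark pixel on a bright background, with sigma1 / sigma2 = 2.
image = [[1.0] * 9 for _ in range(9)]
image[4][4] = 0.0
dog = difference_of_gaussians(image, sigma1=2.0, sigma2=1.0)
```

With the wider Gaussian first, the response is strongest at the dark pixel, which matches the center-surround behavior described for the Laplacian.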
Scale-space pyramid
All scales must be examined to identify scale-invariant features
DOG pyramid (Burt & Adelson, 1983)
Blur
Resample
Subtract
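A rough 1D sketch of the blur, resample, subtract loop follows. It is a simplification, not the pyramid used in the lecture or in SIFT itself: the `[1, 2, 1] / 4` smoothing kernel, the function names, and taking the difference before downsampling (so each level keeps the resolution it was computed at) are all assumptions made for illustration.

```python
def blur(signal):
    """Smooth with a small [1, 2, 1] / 4 kernel, clamping at both ends."""
    n = len(signal)
    return [
        (signal[max(i - 1, 0)] + 2 * signal[i] + signal[min(i + 1, n - 1)]) / 4.0
        for i in range(n)
    ]

def dog_pyramid(signal, levels):
    """Return one band-pass (difference) level per scale."""
    pyramid = []
    current = list(signal)
    for _ in range(levels):
        if len(current) < 2:
            break
        smoothed = blur(current)                                     # Blur
        pyramid.append([c - s for c, s in zip(current, smoothed)])   # Subtract
        current = smoothed[::2]                                      # Resample
    return pyramid

# A step edge produces a response near the step at every level.
levels = dog_pyramid([0.0] * 8 + [1.0] * 8, levels=3)
```

Each level halves the resolution, so a fixed-size filter covers a feature twice as large as at the level before.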
Rotation invariance
Create histogram of local gradient directions computed at selected scale
Assign canonical orientation at peak of smoothed histogram
Each key specifies stable 2D coordinates (x, y, scale, orientation)
[Figure: histogram of gradient directions, horizontal axis from 0 to 2π]
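The orientation assignment can be sketched as below. The bin count of 36, the weighting of each vote by gradient magnitude, the central-difference gradients, and the `[1, 2, 1] / 4` circular smoothing are assumptions for this example; the slides only say that the histogram is smoothed and its peak is taken.

```python
import math

def canonical_orientation(patch, num_bins=36):
    """Return the dominant gradient direction of a patch, in [0, 2*pi)."""
    histogram = [0.0] * num_bins
    height, width = len(patch), len(patch[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0.0:
                continue
            angle = math.atan2(dy, dx) % (2 * math.pi)
            bin_index = int(angle / (2 * math.pi) * num_bins) % num_bins
            histogram[bin_index] += magnitude
    # Circular smoothing: bin 0 and the last bin are neighbors.
    smoothed = [
        (histogram[i - 1] + 2 * histogram[i] + histogram[(i + 1) % num_bins]) / 4.0
        for i in range(num_bins)
    ]
    peak = max(range(num_bins), key=lambda i: smoothed[i])
    # Report the center of the winning bin.
    return (peak + 0.5) * 2 * math.pi / num_bins

# Intensity increasing along x gives a direction near 0;
# intensity increasing along y gives a direction near pi / 2.
ramp_x = [[float(x) for x in range(7)] for _ in range(7)]
ramp_y = [[float(y)] * 7 for y in range(7)]
angle_x = canonical_orientation(ramp_x)
angle_y = canonical_orientation(ramp_y)
```

Rotating the descriptor's coordinate frame by this angle is what makes the resulting feature independent of image rotation.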
SIFT feature vector
Note: this is somewhat simplified; there are a number of somewhat ad hoc steps, but the whole thing works pretty well
Hough transform
Motivation: find global features
Example: vanishing points
From edges to lines
An edge should “vote” for all lines that go (roughly) through it
– Find the line with lots of votes
– A line is parameterized by m and b
• This is actually a lousy choice, as it turns out
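A small voting sketch follows. Because slope m is unbounded for near-vertical lines, this example swaps in the common (θ, ρ) parameterization, ρ = x cos θ + y sin θ, in place of (m, b); that substitution, the function name `hough_best_line`, and the bin sizes are assumptions made here rather than something taken from the slides.

```python
import math

def hough_best_line(points, num_thetas=180, rho_step=1.0):
    """Each point votes for every (theta, rho) cell it lies on.

    Returns (theta, rho, votes) for the cell with the most votes.
    """
    votes = {}
    for x, y in points:
        for t in range(num_thetas):
            theta = math.pi * t / num_thetas
            rho = x * math.cos(theta) + y * math.sin(theta)
            cell = (t, int(round(rho / rho_step)))
            votes[cell] = votes.get(cell, 0) + 1
    (t, r), count = max(votes.items(), key=lambda item: item[1])
    return math.pi * t / num_thetas, r * rho_step, count

# Ten edge points on the horizontal line y = 5, plus two outliers.
points = [(x, 5) for x in range(0, 100, 10)] + [(20, 40), (70, 90)]
theta, rho, count = hough_best_line(points)
```

The outliers spread their votes over cells that few other points share, so the cell for the true line wins easily.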
Hough transform for lines
[Figure: line parameter space with axes m and b]
SIFT-based recognition
Given: a database of features
– Computed from model library
We want to probe for the features we see in the image
Use approximate nearest-neighbor scheme
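As a stand-in for the approximate nearest-neighbor scheme, the sketch below uses an exact brute-force search, which is simpler but far slower on a large database. The distance-ratio check and its 0.8 threshold are borrowed from Lowe's SIFT work and are assumptions relative to these slides, as are the function and parameter names.

```python
import math

def match_features(query, database, max_ratio=0.8):
    """Return (query_index, database_index) pairs that pass the ratio test.

    A match is kept only if the best distance is clearly smaller than
    the second-best distance, which filters out ambiguous features.
    """
    matches = []
    for qi, q in enumerate(query):
        distances = sorted(
            (math.dist(q, d), di) for di, d in enumerate(database)
        )
        if len(distances) < 2:
            continue
        best, best_index = distances[0]
        second, _ = distances[1]
        if best < max_ratio * second:
            matches.append((qi, best_index))
    return matches

# Toy 2D "descriptors"; real SIFT vectors are much longer.
database = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
query = [(0.5, 0.0), (5.0, 0.0)]  # the second one is ambiguous
matches = match_features(query, database)
```

The second query point is equidistant from two database entries, so it is rejected instead of being matched arbitrarily.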
SIFT is quite robust
SIFT DEMO!
Recognition
Classical recognition (Roberts, 1962)
– http://www.packet.cc/files/mach-per-3D-solids.html
• Influenced by J. J. Gibson
Given: set of objects of known fixed shape
Find: position and pose (“placement”)
Match model features to image features
Models and/or image can be 2D or 3D
– 2D to 2D example: OCR
– Common case is 3D model, 2D image
Face recognition
Extensively studied special case
Approaches: intensities or features
– Intensities: SSD (L2 distance) or variants
– Features: extract eyes, nose, chin, etc.
Intensities seem to work more reliably
– Images need to be registered
– Famous application of PCA: eigenfaces
Nothing really works with serious changes in lighting, profile, appearance
– FERET database has good evaluation metrics
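The intensity-based approach can be illustrated with a minimal SSD comparison. The sketch assumes the images are already registered and have the same size; the gallery layout (a dictionary from name to image) and the function names are hypothetical.

```python
def ssd(image_a, image_b):
    """Sum of squared differences between two equally sized images."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    )

def recognize(probe, gallery):
    """Return the gallery name whose image has the smallest SSD to the probe."""
    return min(gallery, key=lambda name: ssd(probe, gallery[name]))

# Tiny 2x2 "faces" with made-up labels.
gallery = {
    "person_a": [[10, 10], [200, 200]],
    "person_b": [[200, 10], [10, 200]],
}
probe = [[12, 9], [198, 205]]
best_name = recognize(probe, gallery)
```

A global brightness change alters every pixel difference at once, which is one reason this kind of comparison breaks down under serious lighting changes.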
Combinatorial search
Possible formulation of recognition: match each model feature to an image feature
– Some model features can be occluded
This leads to an intractable problem with lots of backtracking
– “Interpretation tree” search
– Especially bad with unreliable features
The methods that work tend to avoid explicit search over matchings
– Robust to feature unreliability
Distance-based matching
Intuition: all points (features) in model should be close to some point in image
– We will assume binary features, usually edges
– All points assumption means no occlusions
– Many image points will be unmatched
Naively posed, this is very hard
– For each point in the model, find the distance to the nearest point in the image
– Do this for each placement of the model
– How can we make this fast?
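The naive formulation can be written down directly, which makes its cost visible: every placement scans every model point against every image point. In this sketch, placements are restricted to integer translations and the per-point distances are summed; both choices, and the function names, are assumptions made for illustration.

```python
import math

def match_cost(model, image_points, dx, dy):
    """Sum, over model points, of the distance to the nearest image point."""
    return sum(
        min(math.hypot(mx + dx - ix, my + dy - iy) for ix, iy in image_points)
        for mx, my in model
    )

def best_translation(model, image_points, search_range):
    """Try every integer translation in the range and keep the cheapest."""
    best = None
    for dy in search_range:
        for dx in search_range:
            cost = match_cost(model, image_points, dx, dy)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# The model appears in the image shifted by (3, 4), with two clutter points.
model = [(0, 0), (2, 0), (0, 2)]
image_points = [(3, 4), (5, 4), (3, 6), (9, 9), (1, 7)]
cost, dx, dy = best_translation(model, image_points, range(0, 8))
```

The running time is proportional to (number of placements) × (model points) × (image points), which is what motivates the search for a faster method.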
Dynamic programming
General technique to speed up computations by re-using results
– Many successful applications in vision
Canonical examples:
– Shortest paths (Dijkstra’s algorithm)
• Many applications in vision (curves)
– Integral images
• Efficiently compute the sum of any quantity over an arbitrary rectangle
• Useful for image smoothing, stereo, face detection, etc.
Shortest paths via DP
Dijkstra’s algorithm
[Figure: graph search from node A toward node B, with three kinds of marked nodes]
– processed nodes (distance to A is known)
– active nodes (front)
– active node with the smallest distance value
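The three node types in the legend map onto a short heap-based version of Dijkstra's algorithm. The graph representation (a dictionary from node to a list of `(neighbor, weight)` pairs) and the variable names are assumptions for this sketch; weights must be non-negative.

```python
import heapq

def dijkstra(graph, source):
    """Return a dict of shortest distances from source to each reachable node."""
    distance = {source: 0}
    processed = set()            # nodes whose distance to the source is final
    front = [(0, source)]        # active nodes, ordered by tentative distance
    while front:
        # The active node with the smallest distance value is processed next.
        d, node = heapq.heappop(front)
        if node in processed:
            continue             # stale heap entry for an already final node
        processed.add(node)
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < distance.get(neighbor, float("inf")):
                distance[neighbor] = candidate
                heapq.heappush(front, (candidate, neighbor))
    return distance

# The detour through C (1 + 2) beats the direct edge from A to B (5).
graph = {"A": [("C", 1), ("B", 5)], "C": [("B", 2)], "B": []}
distances = dijkstra(graph, "A")
```

The re-use of results is the point: once a node is processed, its distance is never recomputed, and every later path through it builds on that stored value.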
Integral images via DP
Suppose we want to compute the sum in D
– At each pixel (x,y), compute the sum in the rectangle [(0,0),(x,y)]
– Gives: A+C, A+B, A+B+C+D
– (A+B+C+D) - (A+C) - (A+B) + A = D
– Can compute rectangle sums by same trick
• Row major scan
[Figure: rectangle split into four regions, A (top left), B (top right), C (bottom left), D (bottom right)]
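The identity above turns into a few lines of code. This sketch pads the table with a zero row and column so the formula needs no special cases at the border; the padding and the function names are choices made here, not something stated on the slide.

```python
def integral_image(image):
    """table[y][x] holds the sum of image rows 0..y-1 and columns 0..x-1."""
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):          # row major scan
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            # Sum above (already computed) plus the running sum of this row.
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table

def rect_sum(table, top, left, bottom, right):
    """Sum of image[top..bottom][left..right], inclusive, in four lookups."""
    return (table[bottom + 1][right + 1]   # A + B + C + D
            - table[top][right + 1]        # A + B
            - table[bottom + 1][left]      # A + C
            + table[top][left])            # A

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
bottom_right = rect_sum(table, 1, 1, 2, 2)  # 5 + 6 + 8 + 9
```

After one pass to build the table, every rectangle sum costs the same four lookups regardless of the rectangle's size.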