lecture 10 - interest point detectors & ransac
TRANSCRIPT
Interest Point Detectors & RANSAC
Instructor - Simon Lucey
16-423 - Designing Computer Vision Apps
Today
• Image Gradients.
• Interest Point Detectors
• Matching Regions.
• RANSAC.
[Figure: I(x, y) plotted against I(x + 1, y + 1), I(x + 8, y + 8), I(x + 16, y + 16) and I(x + 50, y + 50). Simoncelli & Olshausen 2001]
Pixel Coherence
• The pixel coherence assumption states that pixels within natural images are heavily correlated with one another within a local neighborhood (e.g., +/- 5 pixels).
• For example, consider the missing center value of this patch (Image Filtering; CS143 Intro to Computer Vision, Sept. 2007, © Michael J. Black):
3 4 3
2 ? 5
5 4 2
What assumptions are you making to infer the center value?
“Typically assume that pixels within +/- 3, 5 or 7 pixels are highly correlated.”
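One way to make the question about the 3 x 3 patch concrete is to guess the missing value from its neighbors. The sketch below assumes a plain average of the eight surrounding pixels; the averaging rule and the name `infer_center` are illustrative assumptions, not something the slides prescribe.

```python
def infer_center(patch):
    """Estimate the center of a 3x3 patch as the mean of its 8 neighbors."""
    neighbors = [patch[r][c]
                 for r in range(3) for c in range(3)
                 if (r, c) != (1, 1)]
    return sum(neighbors) / len(neighbors)


# The patch from the slide, with the unknown center marked as None.
patch = [[3, 4, 3],
         [2, None, 5],
         [5, 4, 2]]

estimate = infer_center(patch)
print(estimate)  # 3.5
```

Averaging only makes sense if nearby pixels are correlated, which is exactly the pixel coherence assumption.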
Estimating Gradients
• Images are a discretely sampled representation of a continuous signal, $I(x)$. (Black)
[Figure: “Images as Functions”; what is $I(x_0 + dx)$ for small $dx < 1$? CS143 Intro to Computer Vision, Sept. 2007, © Michael J. Black]
• What if I want to know $I(x_0 + \Delta x)$ given that I have only the appearance at $I(x_0)$ and …?
• Again, simply take the Taylor series approximation,
$I(x_0 + \Delta x) \approx I(x_0) + \frac{\partial I(x_0)}{\partial x}^T \Delta x$
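The first-order Taylor approximation can be checked numerically on a 1D signal. This is a minimal sketch under stated assumptions: the gradient is estimated with a central difference over integer samples, and the sinusoidal `signal` is a made-up stand-in for a smooth image row.

```python
import math


def signal(x):
    """Hypothetical smooth 1D 'image row' used only for illustration."""
    return 100.0 + 20.0 * math.sin(0.1 * x)


def central_difference(f, x0, h=1.0):
    """Approximate dI/dx at x0 from the samples one step either side."""
    return (f(x0 + h) - f(x0 - h)) / (2.0 * h)


def taylor_estimate(f, x0, dx, h=1.0):
    """First-order Taylor estimate: I(x0 + dx) ~= I(x0) + I'(x0) * dx."""
    return f(x0) + central_difference(f, x0, h) * dx


approx = taylor_estimate(signal, x0=10.0, dx=0.3)
exact = signal(10.3)
error = abs(approx - exact)
print(approx, exact, error)
```

For a sub-pixel step on a smooth signal the error stays small; it grows as the step gets larger or the signal less smooth.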
Gradients through Filters
• The traditional method for calculating gradients in vision is through the use of edge filters (e.g., Sobel, Prewitt),
$\frac{\partial I(x)}{\partial x} = [\nabla I_x(x), \nabla I_y(x)]^T$
• where $\nabla I_x$ is the “Horizontal” and $\nabla I_y$ the “Vertical” filter response of the image $I$.
“Often have to apply a smoothing filter as well.”
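A small sketch of gradient estimation with the Sobel filters mentioned on the slide. The 3 x 3 kernels are the standard Sobel pair; the helper `correlate3x3` and the toy step-edge image are assumptions made for illustration, and only the interior ("valid") pixels are computed.

```python
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]   # responds to intensity change along x
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]    # responds to intensity change along y


def correlate3x3(image, kernel):
    """Cross-correlate a 2D list with a 3x3 kernel (interior pixels only)."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out.append([
            sum(kernel[i][j] * image[r + i - 1][c + j - 1]
                for i in range(3) for j in range(3))
            for c in range(1, cols - 1)
        ])
    return out


# Toy image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]

grad_x = correlate3x3(image, SOBEL_X)
grad_y = correlate3x3(image, SOBEL_Y)
print(grad_x[0])  # [0, 40, 40, 0]
print(grad_y[0])  # [0, 0, 0, 0]
```

The [1, 2, 1] weights inside each Sobel kernel already smooth across the edge direction, which is one reading of the remark about applying a smoothing filter.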
Today
• Natural Image Statistics.
• Interest Point Detectors.
• Matching Regions.
• RANSAC.
Interest Point Detectors
Interest Point Detectors
• Interest point detectors, in general, detect corners.
• A dense approach (using all pixels) will be far too slow.
• Matching lines suffers from the “aperture problem” (i.e., has the line moved along itself?).
• Corners are:
  • relatively sparse
  • well-localized
  • good for matching
  • reasonably cheap to compute
  • appear quite robustly
  • useful for geometry.
Interest Point Detectors
• Approach is based on the concept of auto-correlation,
$A(\Delta x, \Delta y) = \sum_{(x_k, y_k) \in N(x, y)} ||I(x_k, y_k) - I(x_k + \Delta x, y_k + \Delta y)||^2$
• or, “in vector form”,
$A(\Delta x) = \sum_{x_k \in N(x)} ||I(x_k) - I(x_k + \Delta x)||^2$
• where $N(x)$ is the “local neighborhood” of $x$.
Interest Point Detectors
• Which patches match the auto-correlation responses?
[Figure: image patches alongside auto-correlation surfaces $A(\Delta x)$, with both axes running from 5 to 25.]
$A(\Delta x) = \sum_{x_k \in N(x)} ||I(x_k) - I(x_k + \Delta x)||^2$
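The auto-correlation measure can be evaluated directly on toy images. In this sketch the 7 x 7 `edge` and `corner` images, the 3 x 3 neighborhood, and the function name are illustrative assumptions.

```python
def autocorrelation(image, center, shift, radius=1):
    """A(dx, dy): sum of squared differences between a patch and its shifted copy."""
    cx, cy = center
    dx, dy = shift
    total = 0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            diff = image[y][x] - image[y + dy][x + dx]
            total += diff * diff
    return total


# Toy 7x7 images: a vertical edge, and a corner (bright lower-right quadrant).
edge = [[10 if x >= 3 else 0 for x in range(7)] for y in range(7)]
corner = [[10 if (x >= 3 and y >= 3) else 0 for x in range(7)] for y in range(7)]

print(autocorrelation(edge, (3, 3), (1, 0)))    # 300: shifting across the edge
print(autocorrelation(edge, (3, 3), (0, 1)))    # 0: shifting along the edge
print(autocorrelation(corner, (3, 3), (1, 0)))  # 200
print(autocorrelation(corner, (3, 3), (0, 1)))  # 200
```

The edge patch looks identical to itself when shifted along the edge (the aperture problem), whereas the corner patch changes for shifts in both directions.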
Harris Detector
• Harris & Stephens (1988) proposed,
$A(\Delta x) = \sum_{x_k \in N(x)} ||I(x_k) - I(x_k + \Delta x)||^2$
• We can simplify this through a linear approximation,
$A(\Delta x) \approx \sum_{x_k \in N(x)} ||I(x_k) - I(x_k) - \frac{\partial I(x_k)}{\partial x}^T \Delta x||^2 = \Delta x^T H \Delta x$
• where,
$H = \sum_{x_k \in N(x)} \frac{\partial I(x_k)}{\partial x} \frac{\partial I(x_k)}{\partial x}^T$
Harris Detector
• How can one characterize these three situations?
Harris Corner Detector
• Make decision based on the image structure tensor $H$.
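A sketch of a decision rule built on the structure tensor H. The three situations are commonly described as flat, edge, and corner, and the score used here is the Harris & Stephens response R = det(H) - k trace(H)^2. The value k = 0.05, the central-difference gradients, the 3 x 3 window, and the toy images are all assumptions made for illustration.

```python
def structure_tensor(image, center, radius=1):
    """Sum the outer products of central-difference gradients over a window."""
    cx, cy = center
    hxx = hxy = hyy = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            hxx += gx * gx
            hxy += gx * gy
            hyy += gy * gy
    return hxx, hxy, hyy


def harris_response(image, center, radius=1, k=0.05):
    """R = det(H) - k * trace(H)^2; k is an assumed, commonly quoted value."""
    hxx, hxy, hyy = structure_tensor(image, center, radius)
    det = hxx * hyy - hxy * hxy
    trace = hxx + hyy
    return det - k * trace * trace


flat = [[0 for x in range(7)] for y in range(7)]
edge = [[10 if x >= 3 else 0 for x in range(7)] for y in range(7)]
corner = [[10 if (x >= 3 and y >= 3) else 0 for x in range(7)] for y in range(7)]

for name, img in [("flat", flat), ("edge", edge), ("corner", corner)]:
    print(name, harris_response(img, (3, 3)))
```

On these toy images the response is zero for the flat patch, negative for the edge, and positive for the corner, so a threshold on R separates the three cases.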
Effects of Scale
• Consider a single row or column of the image
  – Plotting intensity as a function of position gives a signal
• Where is the edge?
Source: S. Seitz
Search Multiple Scales
[Figure: the signal f, the smoothing kernel g, and the smoothed signal f * g.]
• To find edges, look for peaks in the derivative of the smoothed signal, $\frac{d}{dx}(f * g)$.
Source: S. Seitz
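A 1D sketch of why smoothing comes before edge detection. The noisy step signal, the Gaussian choice for g, and its parameters (sigma = 3, radius = 9) are assumptions for illustration; the noise is a deterministic alternating pattern so the result is reproducible.

```python
import math


def gaussian_kernel(sigma, radius):
    """Sampled 1D Gaussian g, normalized to sum to one."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_valid(f, g):
    """f * g at positions where the kernel fits entirely inside f.
    g is symmetric, so correlation and convolution coincide."""
    radius = len(g) // 2
    return [sum(w * f[i + j - radius] for j, w in enumerate(g))
            for i in range(radius, len(f) - radius)]


def strongest_step(values):
    """Index i that maximizes |values[i + 1] - values[i]|."""
    diffs = [values[i + 1] - values[i] for i in range(len(values) - 1)]
    return max(range(len(diffs)), key=lambda i: abs(diffs[i]))


# Step of height 10 between samples 30 and 31, buried in +/-8 alternating noise.
f = [(10.0 if i >= 31 else 0.0) + (8.0 if i % 2 == 0 else -8.0)
     for i in range(60)]

radius = 9
g = gaussian_kernel(sigma=3.0, radius=radius)

raw_edge = strongest_step(f)                               # dominated by noise
smoothed_edge = strongest_step(smooth_valid(f, g)) + radius
print(raw_edge, smoothed_edge)  # 0 30
```

Differentiating the raw signal picks up the noise, while the derivative of f * g peaks at the true step between samples 30 and 31.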
SIFT Detector
• Filter with difference of Gaussian filters at increasing scales
• Build image stack (scale space)
• Find extrema in this 3D volume
Coarse to Fine
[Excerpt shown on slide: “Scale & Affine Invariant Interest Point Detectors”, p. 67]
scale estimates the characteristic length of the corresponding image structures, in a similar manner as the notion of characteristic length is used in physics. The selected scale is characteristic in the quantitative sense, since it measures the scale at which there is maximum similarity between the feature detection operator and the local image structures. This scale estimate will (for a given image operator) obey perfect scale invariance under rescaling of the image pattern.
Given a point in an image and a scale selection operator we compute the operator responses for a set of scales $\sigma_n$ (Fig. 1). The characteristic scale corresponds to the local extremum of the responses. Note that there might be several maxima or minima, that is several characteristic scales corresponding to different local structures centered on this point. The characteristic scale is relatively independent of the image resolution. It is related to the structure and not to the resolution at which the structure is represented. The ratio of the scales at which the extrema are found for corresponding points is the actual scale factor between the point neighborhoods. In Mikolajczyk and Schmid (2001) we compared several differential operators and we noticed that the scale-adapted Harris measure rarely attains maxima over scales in a scale-space representation. If too few interest points are detected, the image content is not reliably represented. Furthermore, the experiments showed that Laplacian-of-Gaussians finds the highest percentage of correct characteristic scales to be found.
Figure 1. Example of characteristic scales. The top row shows two images taken with different focal lengths. The bottom row shows the response $F_{norm}(x, \sigma_n)$ over scales where $F_{norm}$ is the normalized LoG (cf. Eq. (3)). The characteristic scales are 10.1 and 3.89 for the left and right image, respectively. The ratio of scales corresponds to the scale factor (2.5) between the two images. The radius of displayed regions in the top row is equal to 3 times the characteristic scale.
$|LoG(x, \sigma_n)| = \sigma_n^2 |L_{xx}(x, \sigma_n) + L_{yy}(x, \sigma_n)|$ (3)
When the size of the LoG kernel matches with the size of a blob-like structure the response attains an extremum. The LoG kernel can therefore be interpreted as a matching filter (Duda and Hart, 1973). The LoG is well adapted to blob detection due to its circular symmetry, but it also provides a good estimation of the characteristic scale for other local structures such as corners, edges, ridges and multi-junctions. Many previous results confirm the usefulness of the Laplacian function for scale selection (Chomat et al., 2000; Lindeberg, 1993, 1998; Lowe, 1999).
2.2. Harris-Laplace Detector
In the following we explain in detail our scale invariant feature detection algorithm. The Harris-Laplace detector uses the scale-adapted Harris function (Eq. (2)) to localize points in scale-space. It then selects the points for which the Laplacian-of-Gaussian, Eq. (3), attains a maximum over scale. We propose two algorithms. The first one is an iterative algorithm which detects simultaneously the location and the scale of characteristic regions. The second one is a simplified algorithm, which is less accurate but more efficient.
SIFT Detector
[Figure panels: Identified corners; Remove those on edges; Remove those where contrast is low]
Today
• Natural Image Statistics.
• Interest Point Detectors.
• Matching Regions.
• RANSAC.
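Matching Regions
• Once we have detected local regions, we must then match them.
[Figure: a patch centered at $(x_1, y_1)$ in image 1 and a patch centered at $(x_2, y_2)$ in image 2.]
• SSD: sum of squared differences,
$\sum_{[\Delta x, \Delta y]^T \in N} ||I_1(x_1 + \Delta x, y_1 + \Delta y) - I_2(x_2 + \Delta x, y_2 + \Delta y)||_2^2$
• Small difference values → similar patches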
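SSD matching can be sketched as a search over candidate locations in the second image. The synthetic images (image 2 is image 1 shifted one pixel to the right), the 3 x 3 neighborhood, and the helper names are assumptions for illustration.

```python
def ssd(image1, p1, image2, p2, radius=1):
    """Sum of squared differences between patches centered at p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    total = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            diff = image1[y1 + dy][x1 + dx] - image2[y2 + dy][x2 + dx]
            total += diff * diff
    return total


def best_match(image1, p1, image2, candidates, radius=1):
    """Candidate in image2 whose patch has the smallest SSD to the patch at p1."""
    return min(candidates, key=lambda p2: ssd(image1, p1, image2, p2, radius))


# Synthetic textured image, and a copy shifted one pixel to the right.
image1 = [[(3 * x + 5 * y * y) % 11 for x in range(6)] for y in range(5)]
image2 = [[0] + row for row in image1]

candidates = [(2, 2), (3, 2), (4, 2)]
match = best_match(image1, (2, 2), image2, candidates)
print(match, ssd(image1, (2, 2), image2, match))  # (3, 2) 0
```

The patch at (2, 2) in image 1 is found at (3, 2) in image 2 with zero SSD, consistent with the one-pixel shift.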
Problems with SSD
• SSD measures are sensitive to both,
  • geometric and,
  • photometric variation.
• Common practice to use descriptors.
[Figure: patches $I_1$ and $I_2$]
Planar Affine Patch Assumption
[Figure: the patch as seen in “View 1” and “View 2”]
• A number of techniques have been proposed to detect affine covariant regions (Schmid et al. 2004).
Rotation Invariance
• Estimation of the dominant orientation
  – extract gradient orientation
  – histogram over gradient orientation
  – peak in this histogram
• Rotate patch in dominant direction
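The orientation-estimation steps can be sketched as follows. The choice of 36 bins, the magnitude weighting, the central-difference gradients, and the ramp test images are assumptions for illustration; the function returns the center of the peak bin in degrees and leaves the patch rotation itself out.

```python
import math


def dominant_orientation(image, num_bins=36):
    """Center (in degrees) of the peak bin of a magnitude-weighted
    histogram of gradient orientations over the interior pixels."""
    histogram = [0.0] * num_bins
    bin_width = 360.0 / num_bins
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            histogram[int(angle // bin_width) % num_bins] += magnitude
    peak = max(range(num_bins), key=lambda b: histogram[b])
    return (peak + 0.5) * bin_width


# Intensity ramps: one increasing along x, one increasing along the diagonal.
ramp_x = [[float(x) for x in range(6)] for y in range(6)]
ramp_diag = [[float(x + y) for x in range(6)] for y in range(6)]

print(dominant_orientation(ramp_x))     # 5.0  (bin covering 0 to 10 degrees)
print(dominant_orientation(ramp_diag))  # 45.0 (bin covering 40 to 50 degrees)
```

Rotating each patch by its own dominant orientation before comparison is what makes the subsequent matching insensitive to in-plane rotation.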
SIFT Descriptor
1. Compute image gradients
2. Pool into local histograms
3. Concatenate histograms
4. Normalize histograms
More on this in future lectures.
Why Pool?
[Figure: convolution (*) with an “average” filter, labelled “histogram”, “pooling”, “blurring”]
MATLAB Example
• Example in MATLAB,
>> h = fspecial('gaussian', [25,25], 3);  % 25 x 25 Gaussian kernel, sigma = 3
>> resp = imfilter(im, h);                % filter the image im with h
[Figure: im * h = resp]
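For readers without MATLAB, the same blurring operation can be sketched in plain Python. This is an approximation of the idea rather than a re-implementation of `fspecial` or `imfilter`: the helper names are hypothetical, the kernel is a sampled Gaussian normalized to sum to one, and pixels outside the image are treated as zero.

```python
import math


def gaussian_kernel_2d(size, sigma):
    """Normalized size x size sampled Gaussian kernel."""
    half = (size - 1) / 2.0
    kernel = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]


def filter_zero_padded(image, kernel):
    """Correlate image with kernel; pixels outside the image count as zero."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i, kernel_row in enumerate(kernel):
                for j, w in enumerate(kernel_row):
                    rr, cc = r + i - half, c + j - half
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += w * image[rr][cc]
            out[r][c] = acc
    return out


h = gaussian_kernel_2d(25, 3.0)

# Test image: a single bright pixel, so resp is a copy of the kernel.
im = [[0.0] * 31 for _ in range(31)]
im[15][15] = 1.0
resp = filter_zero_padded(im, h)
print(resp[15][15], h[12][12])
```

Filtering a single bright pixel spreads it into a copy of the kernel, which is the "blurring" view of pooling.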
Other Descriptors
• Since SIFT, a plethora of other descriptors have been proposed.
• SIFT is sometimes problematic to use in practice as it is protected under existing patents.
• In OpenCV alone there are,
  • SURF (patent protected)
  • BRIEF
  • FREAK
  • ORB
  • BRISK
  • FAST
• See following link for tutorial in OpenCV.
Today
• Natural Image Statistics.
• Interest Point Detectors.
• Matching Regions.
• RANSAC.
Robust Estimation
• The least squares criterion is not robust to outliers.
• For example, the two outliers here cause the fitted line to be quite wrong.
• One approach to fitting under these circumstances is to use RANSAC (“RANdom SAmple Consensus”).
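The line-fitting case can be sketched with a small RANSAC loop. Everything specific here is an assumption for illustration: the synthetic points, the vertical-residual inlier test, the threshold of 0.5, the 100 iterations, and the fixed random seed. It fits a line, not a homography.

```python
import random


def fit_line_least_squares(points):
    """Ordinary least squares fit of y = a * x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac_line(points, iterations=100, threshold=0.5, seed=0):
    """Repeatedly fit a line to two random points, keep the largest
    consensus set, then refit by least squares on those inliers."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical candidate cannot be written as y = a * x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in points
                   if abs(y - (a * x + b)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return fit_line_least_squares(best_inliers), best_inliers


# Ten points exactly on y = 2x + 1, plus two gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(2.0, 30.0), (8.0, -20.0)]

ls_a, ls_b = fit_line_least_squares(points)
(ransac_a, ransac_b), inliers = ransac_line(points)
print(ls_a, ls_b)          # pulled far from the true line by the outliers
print(ransac_a, ransac_b)  # close to 2 and 1
```

The same loop structure carries over to homography fitting by swapping the two-point line model for a four-correspondence homography estimate.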
RANSAC
RANSAC
Fitting a homography with RANSAC
[Figure panels: Original images; Initial matches; Inliers from RANSAC]
Things to try in iOS - GPUImage
See - https://github.com/BradLarson/GPUImage for more details.
More to read…
• Prince et al.
  • Chapter 13, Sections 2-3.
• Corke et al.
  • Chapter 13, Section 3.
• E. Rublee et al. “ORB: an efficient alternative to SIFT or SURF”, ICCV 2011.