lecture 10 - interest point detectors & ransac

Interest Point Detectors & RANSAC

Instructor - Simon Lucey

16-423 - Designing Computer Vision Apps

Today

• Image Gradients.

• Interest Point Detectors

• Matching Regions.

• RANSAC.

[Figure: pixel intensity I(x, y) compared with I(x + 1, y + 1) and with I(x + 8, y + 8). Simoncelli & Olshausen 2001]


Pixel Coherence

• The pixel coherence assumption states that pixels within natural images are heavily correlated with one another within a local neighborhood (e.g., +/- 5 pixels).

• For example (from “Image Filtering”, CS143 Intro to Computer Vision, Sept. 2007, ©Michael J. Black),

3 4 3
2 ? 5
5 4 2

What assumptions are you making to infer the center value?


“Typically assume that pixels within +/- 3, 5 or 7 pixels are highly correlated.”
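One simple way to act on the coherence assumption, sketched here, is to estimate the unknown center value of the 3 x 3 example as the mean of its eight neighbors. The helper name `estimate_center` is hypothetical, and plain averaging is only one possible choice, not necessarily the answer the slide intends.

```python
def estimate_center(patch):
    """Estimate the missing center of a 3x3 patch as the mean of its 8 neighbors."""
    neighbors = [
        patch[r][c]
        for r in range(3)
        for c in range(3)
        if (r, c) != (1, 1)  # skip the unknown center pixel
    ]
    return sum(neighbors) / len(neighbors)


# The 3x3 patch from the slide; None marks the unknown center value.
patch = [
    [3, 4, 3],
    [2, None, 5],
    [5, 4, 2],
]
estimate = estimate_center(patch)
print(estimate)
```

The estimate is only as good as the assumption behind it: at an edge or corner the neighbors are no longer a reliable guide to the center.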

Estimating Gradients

• Images are a discretely sampled representation of a continuous signal, I(x).

[Figure: a signal I(x) plotted against x. (Black)]

• What if I want to know $I(\mathbf{x}_0 + \Delta\mathbf{x})$, given that I have only the appearance at $I(\mathbf{x}_0)$ and $\frac{\partial I(\mathbf{x}_0)}{\partial \mathbf{x}}$?

[Figure: I(x) plotted against x, marking I(x_0) and I(x_0 + Δx) for a small displacement Δx. (Black)]

• Again, simply take the Taylor series approximation,

$$I(\mathbf{x}_0 + \Delta\mathbf{x}) \approx I(\mathbf{x}_0) + \frac{\partial I(\mathbf{x}_0)}{\partial \mathbf{x}}^T \Delta\mathbf{x}$$
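A minimal 1-D sketch of the first-order Taylor estimate follows. The signal I(x) = x^2, the central-difference gradient, and the function names are all assumptions made for illustration; they are not taken from the slides.

```python
def central_difference(samples, i):
    """Approximate dI/dx at integer position i from the two neighboring samples."""
    return (samples[i + 1] - samples[i - 1]) / 2.0


def taylor_estimate(samples, i, dx):
    """First-order Taylor estimate of I(i + dx) from I(i) and the gradient at i."""
    return samples[i] + central_difference(samples, i) * dx


# Hypothetical 1-D signal: I(x) = x^2 sampled at x = 0..4.
samples = [float(x * x) for x in range(5)]
approx = taylor_estimate(samples, 2, 0.5)  # estimate I(2.5) from I(2) and dI/dx at 2
exact = 2.5 ** 2
print(approx, exact)
```

The gap between `approx` and `exact` comes from the neglected higher-order terms, which is why the approximation is only trusted for small displacements.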

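Gradients through Filters

• The traditional method for calculating gradients in vision is through the use of edge filters (e.g., Sobel, Prewitt),

$$\frac{\partial I(\mathbf{x})}{\partial \mathbf{x}} = [\nabla I_x(\mathbf{x}), \nabla I_y(\mathbf{x})]^T$$

• where $\nabla I_x$ is the “Horizontal” and $\nabla I_y$ the “Vertical” response obtained by filtering the image $I$ with the corresponding edge filter.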


“Often have to apply a smoothing filter as well.”
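As a rough illustration of gradients through filters, the sketch below applies the standard 3 x 3 Sobel kernels to a small synthetic step image. The helper `filter2d` is hypothetical; it computes a correlation over the "valid" region only, and the test image is an assumption.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to intensity change along x
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to intensity change along y


def filter2d(image, kernel):
    """Correlate image with a 3x3 kernel, keeping only fully covered ("valid") pixels."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            acc = 0
            for i in range(3):
                for j in range(3):
                    acc += kernel[i][j] * image[r + i - 1][c + j - 1]
            out_row.append(acc)
        out.append(out_row)
    return out


# Synthetic 5x5 image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 10, 10, 10] for _ in range(5)]
grad_x = filter2d(image, SOBEL_X)
grad_y = filter2d(image, SOBEL_Y)
print(grad_x)
print(grad_y)
```

Only `grad_x` responds here, because the intensity changes along x and is constant along y. The [1, 2, 1] weighting inside each Sobel kernel already provides a small amount of smoothing across the edge direction.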

Today

• Natural Image Statistics.

• Interest Point Detectors.


• RANSAC.

Interest Point Detectors


• Interest point detectors, in general, detect corners.
• A dense approach (using all pixels) will be far too slow.
• Matching lines suffers from the “aperture problem” (i.e., has the line moved along itself?).
• Corners are:
  • relatively sparse
  • well-localized
  • good for matching
  • reasonably cheap to compute
  • appear quite robustly
  • useful for geometry.

• Approach is based on the concept of auto-correlation,

$$A(\Delta x, \Delta y) = \sum_{(x_k, y_k) \in \mathcal{N}(x, y)} \| I(x_k, y_k) - I(x_k + \Delta x, y_k + \Delta y) \|^2$$

• or, “in vector form”,

$$A(\Delta\mathbf{x}) = \sum_{\mathbf{x}_k \in \mathcal{N}(\mathbf{x})} \| I(\mathbf{x}_k) - I(\mathbf{x}_k + \Delta\mathbf{x}) \|^2$$

where $\mathcal{N}(\mathbf{x})$ is the “local neighborhood” of $\mathbf{x}$.

[Figure: auto-correlation responses A(Δx) for several image patches; both axes are ticked from 5 to 25.]

• Which patches match the auto-correlation responses?
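The auto-correlation measure can be evaluated directly on toy images. The sketch below assumes three synthetic 9 x 9 patches (flat, edge, corner) and a hypothetical helper `autocorrelation` that sums squared differences over a square window.

```python
def autocorrelation(image, cx, cy, dx, dy, radius=2):
    """A(dx, dy): summed squared difference between a window and its shifted copy."""
    total = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            diff = image[y][x] - image[y + dy][x + dx]
            total += diff * diff
    return total


size = 9
flat = [[0.0] * size for _ in range(size)]
edge = [[1.0 if x >= 4 else 0.0 for x in range(size)] for y in range(size)]
corner = [[1.0 if (x >= 4 and y >= 4) else 0.0 for x in range(size)] for y in range(size)]

for name, img in (("flat", flat), ("edge", edge), ("corner", corner)):
    a_x = autocorrelation(img, 4, 4, 1, 0)  # shift by one pixel in x
    a_y = autocorrelation(img, 4, 4, 0, 1)  # shift by one pixel in y
    print(name, a_x, a_y)
```

The flat patch is unchanged by any shift, the edge patch changes only for shifts across the edge, and the corner patch changes for shifts in both directions. That last property is what makes corners easy to localize.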

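Harris Detector

• How can one characterize these three situations?

• Harris & Stephens (1988) proposed,

$$A(\Delta\mathbf{x}) = \sum_{\mathbf{x}_k \in \mathcal{N}(\mathbf{x})} \| I(\mathbf{x}_k) - I(\mathbf{x}_k + \Delta\mathbf{x}) \|^2$$

• We can simplify this through a linear approximation,

$$A(\Delta\mathbf{x}) \approx \sum_{i \in \mathcal{N}} \left\| I(\mathbf{x}_i) - I(\mathbf{x}_i) - \frac{\partial I(\mathbf{x}_i)}{\partial \mathbf{x}}^T \Delta\mathbf{x} \right\|^2 \approx \Delta\mathbf{x}^T \mathbf{H} \Delta\mathbf{x}$$

• where,

$$\mathbf{H} = \sum_{i \in \mathcal{N}} \frac{\partial I(\mathbf{x}_i)}{\partial \mathbf{x}} \frac{\partial I(\mathbf{x}_i)}{\partial \mathbf{x}}^T$$

Harris Corner Detector

Make decision based on the image structure tensor $\mathbf{H}$.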
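To make the structure tensor concrete, here is a small sketch that builds H from central-difference gradients over a square window and inspects its eigenvalues. The toy images, the window radius, and the function names are assumptions. The response det(H) - k trace(H)^2 with k = 0.04 is the commonly used Harris & Stephens score; the slides themselves do not state it.

```python
import math


def structure_tensor(image, cx, cy, radius=2):
    """Sum the outer products of central-difference gradients over a square window."""
    hxx = hxy = hyy = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            hxx += gx * gx
            hxy += gx * gy
            hyy += gy * gy
    return hxx, hxy, hyy


def eigenvalues(hxx, hxy, hyy):
    """Closed-form eigenvalues (largest first) of [[hxx, hxy], [hxy, hyy]]."""
    mean = (hxx + hyy) / 2.0
    spread = math.sqrt(((hxx - hyy) / 2.0) ** 2 + hxy ** 2)
    return mean + spread, mean - spread


def harris_response(hxx, hxy, hyy, k=0.04):
    """det(H) - k * trace(H)^2; k = 0.04 is a conventional value, not from the slides."""
    return (hxx * hyy - hxy * hxy) - k * (hxx + hyy) ** 2


size = 9
flat = [[0.0] * size for _ in range(size)]
edge = [[1.0 if x >= 4 else 0.0 for x in range(size)] for y in range(size)]
corner = [[1.0 if (x >= 4 and y >= 4) else 0.0 for x in range(size)] for y in range(size)]

for name, img in (("flat", flat), ("edge", edge), ("corner", corner)):
    h = structure_tensor(img, 4, 4)
    print(name, eigenvalues(*h), harris_response(*h))
```

Two small eigenvalues indicate a flat region, one large and one small an edge, and two large eigenvalues a corner. The response score encodes the same decision without computing the eigenvalues explicitly.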

Effects of Scale

• Consider a single row or column of the image
  – Plotting intensity as a function of position gives a signal

• Where is the edge?

Source: S. Seitz

Search Multiple Scales

[Figure: the signal f, the kernel g, and their convolution f * g. Source: S. Seitz]

• To find edges, look for peaks in the derivative of f * g.
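The idea of smoothing before differentiating can be sketched in 1-D as follows. The test signal (a unit step buried in strong alternating noise), the Gaussian truncation at 3 sigma, and all function names are assumptions made for illustration.

```python
import math


def gaussian_kernel(sigma):
    """Sampled, normalized 1-D Gaussian truncated at 3 * sigma."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(j * j) / (2.0 * sigma * sigma)) for j in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_valid(signal, kernel):
    """Filter the signal, returning only positions where the kernel fits entirely."""
    radius = len(kernel) // 2
    return [
        sum(kernel[j + radius] * signal[i + j] for j in range(-radius, radius + 1))
        for i in range(radius, len(signal) - radius)
    ]


def strongest_edge(signal, sigma):
    """Index i such that the smoothed signal rises most steeply between i and i + 1."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    smoothed = smooth_valid(signal, kernel)
    deriv = [smoothed[k + 1] - smoothed[k] for k in range(len(smoothed) - 1)]
    best = max(range(len(deriv)), key=lambda k: deriv[k])
    return best + radius  # convert back to an index in the original signal


# Hypothetical signal: a unit step between samples 49 and 50 plus alternating "noise".
signal = [(1.0 if i >= 50 else 0.0) + (0.8 if i % 2 else -0.8) for i in range(100)]

raw = [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]
raw_peak = max(range(len(raw)), key=lambda i: raw[i])
print("raw derivative peak:", raw_peak)
for sigma in (1.0, 3.0):
    print("sigma", sigma, "edge near index", strongest_edge(signal, sigma))
```

Differentiating the raw signal mostly measures the noise, while differentiating f * g recovers the step. Repeating the search at several values of sigma is one simple way to search multiple scales.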

SIFT Detector

• Filter with difference of Gaussian filters at increasing scales
• Build image stack (scale space)
• Find extrema in this 3D volume

Coarse to Fine


Excerpt from “Scale & Affine Invariant Interest Point Detectors” (p. 67):

scale estimates the characteristic length of the corresponding image structures, in a similar manner as the notion of characteristic length is used in physics. The selected scale is characteristic in the quantitative sense, since it measures the scale at which there is maximum similarity between the feature detection operator and the local image structures. This scale estimate will (for a given image operator) obey perfect scale invariance under rescaling of the image pattern.

Given a point in an image and a scale selection operator we compute the operator responses for a set of scales σn (Fig. 1). The characteristic scale corresponds to the local extremum of the responses. Note that there might be several maxima or minima, that is several characteristic scales corresponding to different local structures centered on this point. The characteristic scale is relatively independent of the image resolution. It is related to the structure and not to the resolution at which the structure is represented. The ratio of the scales at which the extrema are found for corresponding points is the actual scale factor between the point neighborhoods. In Mikolajczyk and Schmid (2001) we compared several differential operators and we noticed that the scale-adapted Harris measure rarely attains maxima over scales in a scale-space representation. If too few interest points are detected, the image content is not reliably represented. Furthermore, the experiments showed that Laplacian-of-Gaussians finds the highest percentage of correct characteristic scales to be found.

$$|LoG(\mathbf{x}, \sigma_n)| = \sigma_n^2 \, |L_{xx}(\mathbf{x}, \sigma_n) + L_{yy}(\mathbf{x}, \sigma_n)| \qquad (3)$$

When the size of the LoG kernel matches with the size of a blob-like structure the response attains an extremum. The LoG kernel can therefore be interpreted as a matching filter (Duda and Hart, 1973). The LoG is well adapted to blob detection due to its circular symmetry, but it also provides a good estimation of the characteristic scale for other local structures such as corners, edges, ridges and multi-junctions. Many previous results confirm the usefulness of the Laplacian function for scale selection (Chomat et al., 2000; Lindeberg, 1993, 1998; Lowe, 1999).

Figure 1. Example of characteristic scales. The top row shows two images taken with different focal lengths. The bottom row shows the response Fnorm(x, σn) over scales where Fnorm is the normalized LoG (cf. Eq. (3)). The characteristic scales are 10.1 and 3.89 for the left and right image, respectively. The ratio of scales corresponds to the scale factor (2.5) between the two images. The radius of displayed regions in the top row is equal to 3 times the characteristic scale.

2.2. Harris-Laplace Detector

In the following we explain in detail our scale invariant feature detection algorithm. The Harris-Laplace detector uses the scale-adapted Harris function (Eq. (2)) to localize points in scale-space. It then selects the points for which the Laplacian-of-Gaussian, Eq. (3), attains a maximum over scale. We propose two algorithms. The first one is an iterative algorithm which detects simultaneously the location and the scale of characteristic regions. The second one is a simplified algorithm, which is less accurate but more efficient.
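Characteristic scale selection can be illustrated with a 1-D analogue of the scale-normalized LoG, in which only the second derivative L_xx is used. The bar-shaped signal, the octave-spaced scale list, and the function names below are assumptions; this is a sketch of the principle, not the Harris-Laplace algorithm.

```python
import math


def normalized_log_response(signal, center, sigma):
    """sigma^2 * L_xx at `center`, using a sampled second derivative of a Gaussian."""
    response = 0.0
    for i, value in enumerate(signal):
        u = i - center
        gauss = math.exp(-(u * u) / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))
        second_derivative = (u * u / sigma ** 4 - 1.0 / sigma ** 2) * gauss
        response += second_derivative * value
    return sigma * sigma * response


def characteristic_scale(signal, center, scales):
    """Scale at which the magnitude of the normalized response is largest."""
    return max(scales, key=lambda s: abs(normalized_log_response(signal, center, s)))


# Hypothetical 1-D "blob": a bright bar of half-width 8 centered at index 100.
center, half_width = 100, 8
signal = [1.0 if abs(i - center) <= half_width else 0.0 for i in range(201)]
scales = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
best = characteristic_scale(signal, center, scales)
print(best)
```

In this 1-D setting the normalized response peaks when sigma is close to the half-width of the bar, so the selected scale tracks the size of the structure rather than the resolution of the sampling.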

SIFT Detector

[Figure panels: identified corners; remove those on edges; remove those where contrast is low.]


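Matching Regions

• Once we have detected local regions, we must then match them.

[Figure: a patch centered at (x1, y1) in image 1 and a patch centered at (x2, y2) in image 2.]

SSD: sum of squared differences,

$$\sum_{[\Delta x, \Delta y]^T \in \mathcal{N}} \| I_1(x_1 + \Delta x, y_1 + \Delta y) - I_2(x_2 + \Delta x, y_2 + \Delta y) \|_2^2$$

Small difference values → similar patches.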
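A brute-force SSD matcher can be sketched in a few lines. The synthetic texture, the two-pixel shift between the images, and the helper names `ssd` and `best_match` are assumptions for illustration.

```python
def ssd(image1, center1, image2, center2, radius=1):
    """Sum of squared differences between two square patches of the same size."""
    (x1, y1), (x2, y2) = center1, center2
    total = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            diff = image1[y1 + dy][x1 + dx] - image2[y2 + dy][x2 + dx]
            total += diff * diff
    return total


def best_match(image1, center1, image2, candidates, radius=1):
    """Candidate center in image2 whose patch has the smallest SSD to the patch in image1."""
    return min(candidates, key=lambda c: ssd(image1, center1, image2, c, radius))


def texture(x, y):
    """Arbitrary synthetic texture used to fill the toy images."""
    return float((3 * x + 5 * y * y) % 11)


size = 7
image1 = [[texture(x, y) for x in range(size)] for y in range(size)]
image2 = [[texture(x - 2, y) for x in range(size)] for y in range(size)]  # image1 shifted right by 2

candidates = [(x, y) for y in range(1, size - 1) for x in range(1, size - 1)]
match = best_match(image1, (2, 3), image2, candidates)
print(match, ssd(image1, (2, 3), image2, match))
```

Here the match is exact because the second image is a pure translation of the first. Any change in brightness, rotation, or scale between the views would raise the SSD even at the correct location.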

Problems with SSD

• SSD measures are sensitive to both,
  • geometric and,
  • photometric variation.

• It is common practice to use descriptors.

[Figure: images I1 and I2.]

Planar Affine Patch Assumption

[Figure: a planar patch seen in “View 1” and “View 2”.]

• A number of techniques have been proposed to detect affine covariant regions (Schmid et al. 2004).

Rotation Invariance

• Estimation of the dominant orientation
  – extract gradient orientation
  – histogram over gradient orientation
  – peak in this histogram

• Rotate patch in dominant direction
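The orientation-estimation steps above can be sketched as follows. The 36-bin histogram, the magnitude weighting, the ramp-shaped test patch, and the function name are assumptions; the final rotation of the patch is not implemented here.

```python
import math


def dominant_orientation(patch, num_bins=36):
    """Center angle (degrees) of the peak bin of a magnitude-weighted orientation histogram."""
    bin_width = 360.0 / num_bins
    histogram = [0.0] * num_bins
    rows, cols = len(patch), len(patch[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Extract the gradient orientation with central differences.
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            # Histogram over gradient orientation, weighted by gradient magnitude.
            histogram[int(angle // bin_width) % num_bins] += math.hypot(gx, gy)
    # The peak of the histogram gives the dominant orientation.
    peak = max(range(num_bins), key=lambda b: histogram[b])
    return (peak + 0.5) * bin_width


# Hypothetical patch: an intensity ramp I(x, y) = x + y, whose gradient points along 45 degrees.
patch = [[float(x + y) for x in range(8)] for y in range(8)]
angle = dominant_orientation(patch)
print(angle)
```

The returned angle is quantized to the center of the winning bin, so its resolution is limited by the number of bins chosen.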

SIFT Descriptor

1. Compute image gradients
2. Pool into local histograms
3. Concatenate histograms
4. Normalize histograms

More on this in future lectures.
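The four steps can be mimicked with a much-simplified toy descriptor. This sketch assumes a square patch, 4 x 4 pixel cells, 8 orientation bins, and a hypothetical function name `toy_descriptor`; it omits the Gaussian weighting and interpolation used by the actual SIFT descriptor.

```python
import math


def toy_descriptor(patch, cell=4, num_bins=8):
    """Gradient-orientation histograms pooled per cell, concatenated, then L2-normalized."""
    size = len(patch) - 2          # interior pixels that have all four neighbors
    cells = size // cell
    bin_width = 360.0 / num_bins
    hists = [[0.0] * num_bins for _ in range(cells * cells)]
    for y in range(1, 1 + cells * cell):
        for x in range(1, 1 + cells * cell):
            # 1. Compute image gradients (central differences).
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            b = int(angle // bin_width) % num_bins
            # 2. Pool into the local histogram of the cell containing this pixel.
            cell_index = ((y - 1) // cell) * cells + (x - 1) // cell
            hists[cell_index][b] += math.hypot(gx, gy)
    # 3. Concatenate histograms.
    descriptor = [v for h in hists for v in h]
    # 4. Normalize (unit L2 norm; left unchanged if all gradients are zero).
    norm = math.sqrt(sum(v * v for v in descriptor))
    return [v / norm for v in descriptor] if norm > 0 else descriptor


# Hypothetical 10x10 patch, giving 2x2 cells of 4x4 pixels and a 32-dimensional descriptor.
patch = [[float(x * y) for x in range(10)] for y in range(10)]
desc = toy_descriptor(patch)
brighter = [[v + 50.0 for v in row] for row in patch]
desc_brighter = toy_descriptor(brighter)
print(len(desc))
```

Because the descriptor is built from gradients and then normalized, adding a constant brightness offset to the patch leaves it unchanged, which is one reason descriptors are preferred over raw SSD.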

Why Pool?

[Figure: pooling shown as an “average”, and as a convolution (∗) linking “histogram”, “pooling” and “blurring”.]

MATLAB Example

• Example in MATLAB,

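>> h = fspecial('gaussian', [25,25], 3);   % 25x25 Gaussian filter, sigma = 3
>> resp = imfilter(im, h);                  % filter the image im with h

[Figure: the image filtered (∗) with h gives resp.]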

Other Descriptors

• Since SIFT, a plethora of other descriptors have been proposed.

• SIFT is sometimes problematic to use in practice as it is protected under existing patents.

• In OpenCV alone there are,
  • SURF (patent protected)
  • BRIEF
  • FREAK
  • ORB
  • BRISK
  • FAST

• See the following link for a tutorial in OpenCV.

http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_feature2d/py_table_of_contents_feature2d/py_table_of_contents_feature2d.html


Robust Estimation

• Least squares criterion is not robust to outliers

• For example, the two outliers here cause the fitted line to be quite wrong.

• One approach to fitting under these circumstances is to use RANSAC – “Random Sample Consensus”.
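A minimal RANSAC sketch for the line-fitting case is given below. The synthetic points (ten on y = 2x + 1 plus two gross outliers), the inlier threshold, the iteration count, and all function names are assumptions chosen for illustration.

```python
import random


def line_through(p, q):
    """Line a*x + b*y + c = 0 through two distinct points, with (a, b) of unit length."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    return a / norm, b / norm, -(a * x1 + b * y1) / norm


def least_squares_line(points):
    """Ordinary least-squares fit y = m*x + c."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def ransac_line(points, iterations=100, threshold=0.5, seed=0):
    """Keep the two-point hypothesis with the most inliers, then refit on those inliers."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)             # minimal sample: two points define a line
        a, b, c = line_through(p, q)
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= threshold]
        if len(inliers) > len(best_inliers):     # consensus: largest inlier set wins
            best_inliers = inliers
    return least_squares_line(best_inliers), best_inliers


# Hypothetical data: ten points on y = 2x + 1 plus two gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)] + [(2.0, 30.0), (7.0, -20.0)]

m_all, c_all = least_squares_line(points)
(m_ransac, c_ransac), inliers = ransac_line(points)
print("least squares on all points:", m_all, c_all)
print("RANSAC:", m_ransac, c_ransac, "with", len(inliers), "inliers")
```

Plain least squares is pulled far from the true line by the two outliers, while the RANSAC estimate is computed only from the points that agree with the best hypothesis. The same loop applies to homographies by replacing the two-point line with a four-correspondence homography estimate.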

RANSAC

Fitting a homography with RANSAC

[Figure panels: original images; initial matches; inliers from RANSAC.]

Things to try in iOS - GPUImage

See https://github.com/BradLarson/GPUImage for more details.

More to read…

• Prince et al. • Chapter 13, Sections 2-3.

• Corke et al. • Chapter 13, Section 3.

• E. Rublee et al. “ORB: an efficient alternative to SIFT or SURF”, ICCV 2011.

https://www.willowgarage.com/sites/default/files/orb_final.pdf
