Phuong phap SIFT (The SIFT Method)

Scale Invariant Feature Transform
Harri Auvinen, Tapio Leppälampi, Joni Taipale and Maria Teplykh
Lappeenranta University of Technology
Machine Vision and Digital Image Analysis
November 24th, 2009


7/30/2019 Phuong phap SIFT

http://slidepdf.com/reader/full/phuong-phap-sift 1/27



Introduction

Scale-Invariant Feature Transform (SIFT)

A method developed by David G. Lowe

Feature extraction method

Invariance in feature extraction

A method should locate features

The extraction method should be robust: it should handle different types of changes between images
  Illumination
  Affine transform
  Scale
  Rotation


Algorithm

The steps of the SIFT algorithm:

1. Scale-space extrema detection
   Search over scales and image locations
   Locate local extrema

2. Keypoint localization
   Select keypoints from the local extrema
   Keypoints are selected based on measures of their stability

3. Orientation assignment
   Orientations are assigned to each keypoint based on local image gradient directions

4. Keypoint descriptor
   Local image gradients are measured at the selected scale in the region around each keypoint
   These are transformed into a representation that allows for significant levels of local shape distortion and change in illumination


Scale-space extrema detection

The first stage in keypoint detection is to find the local extrema in scale space. It involves
  A cascade filtering approach
  Creation of octaves and scale-space images for each octave


The original image $I(x, y)$ is blurred with a Gaussian filter,

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y), \qquad (1)$$

where

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right). \qquad (2)$$

The procedure is repeated $s$ times, each time multiplying the scale $\sigma$ by the factor $k$. Then the difference of Gaussians (DoG),

$$D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma), \qquad (3)$$

is calculated for adjacent blurred images. According to Lowe, to achieve stable keypoints one should set $s = 3$ and $k = 2^{1/s}$.
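The blurring and differencing steps might be sketched as follows in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the border clamping, the kernel radius of three standard deviations, the starting scale `sigma0 = 1.6` and the choice of `s + 3` blurred images per octave are assumptions (the last two follow Lowe's 2004 paper, not the slides).

```python
import math

def gaussian_kernel_1d(sigma):
    """Sampled, normalised 1-D Gaussian; the 2-D Gaussian is separable in x and y."""
    radius = max(1, int(math.ceil(3 * sigma)))  # assumed cut-off at 3 sigma
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(kernel[j + radius] * row[min(max(x + j, 0), width - 1)]
             for j in range(-radius, radius + 1))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_blur(image, sigma):
    """L(x, y, sigma): a horizontal pass followed by a vertical pass."""
    kernel = gaussian_kernel_1d(sigma)
    rows_done = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(rows_done), kernel))

def dog_octave(image, sigma0=1.6, s=3):
    """Return (sigmas, DoG images) for one octave, with k = 2 ** (1 / s)."""
    k = 2 ** (1.0 / s)
    sigmas = [sigma0 * k ** i for i in range(s + 3)]
    blurred = [gaussian_blur(image, sigma) for sigma in sigmas]
    dogs = [
        [[hi_v - lo_v for lo_v, hi_v in zip(lo_row, hi_row)]
         for lo_row, hi_row in zip(lo, hi)]
        for lo, hi in zip(blurred, blurred[1:])
    ]
    return sigmas, dogs
```

With `s = 3`, the blurred image at index 3 has exactly twice the starting scale, and a constant image produces DoG images that are zero everywhere, since the normalised kernel leaves constants unchanged.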


The procedure to calculate the differences of Gaussians is then repeated for each octave. The creation of the next octave:
  Select the Gaussian-blurred image whose σ value is twice that of the original
  Subsample the image and use the output as the starting point for the next octave
  Subsampling is done by selecting every second pixel from the rows and columns of the image
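A small sketch of this octave step, assuming images are stored as lists of rows. The helper names `subsample` and `next_octave_base` are illustrative only.

```python
def subsample(image):
    """Keep every second pixel along both the rows and the columns."""
    return [row[::2] for row in image[::2]]

def next_octave_base(blurred, sigmas, sigma0):
    """Pick the blurred image whose sigma is closest to 2 * sigma0, then halve it."""
    target = 2 * sigma0
    index = min(range(len(sigmas)), key=lambda i: abs(sigmas[i] - target))
    return subsample(blurred[index])
```

Because the selected image has already been smoothed to twice the starting scale, dropping every second pixel yields an image whose scale, measured in its own pixel units, is again the starting scale.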


Keypoint localization

Detection of local extrema/keypoints
  Find the extrema points in the DoG pyramid
  Improve the localization of the keypoint to subpixel accuracy by using a second-order Taylor series expansion

Elimination of keypoints
  Eliminate some points from the candidate list of keypoints by finding those that have low contrast or are poorly localised on an edge
  Contrast thresholding
  Cornerness thresholding


Detection of local extrema/keypoints

To detect the local maxima and minima of $D(x, y, \sigma)$, each point is compared with all its 26 neighbours (8 in the same DoG image and 9 in each of the two adjacent scales)

If its value is the minimum or maximum among them, the point is an extremum
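The 26-neighbour comparison might look like this, assuming the DoG images of one octave are stacked as `dogs[scale][row][column]`. Requiring a strict maximum or minimum is a choice made for this sketch; the slides do not say how ties are handled.

```python
def is_extremum(dogs, s, y, x):
    """True if dogs[s][y][x] is strictly above or strictly below all 26 neighbours."""
    centre = dogs[s][y][x]
    neighbours = [
        dogs[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return centre > max(neighbours) or centre < min(neighbours)

def find_extrema(dogs):
    """Scan the interior points of a DoG stack and collect (scale, row, column)."""
    found = []
    for s in range(1, len(dogs) - 1):
        for y in range(1, len(dogs[s]) - 1):
            for x in range(1, len(dogs[s][y]) - 1):
                if is_extremum(dogs, s, y, x):
                    found.append((s, y, x))
    return found
```

Only interior points are tested, so the first and last DoG image of an octave serve purely as neighbours.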


Detection of local extrema/keypoints: the Brown and Lowe method

Improvement to matching and stability

The approach uses the Taylor expansion of the scale-space function, $D(x, y, \sigma)$, shifted so that the origin is at the sample point:

$$D(X) = D + \frac{\partial D^T}{\partial X} X + \frac{1}{2} X^T \frac{\partial^2 D}{\partial X^2} X \qquad (4)$$

where $D$ and its derivatives are evaluated at the sample point and $X = (x, y, \sigma)^T$ is the offset from this point. The location of the extremum $\hat{X}$ is determined by taking the derivative of this function with respect to $X$ and setting it to zero, giving

$$\hat{X} = -\left(\frac{\partial^2 D}{\partial X^2}\right)^{-1} \frac{\partial D}{\partial X} \qquad (5)$$

If the offset $\hat{X}$ is larger than 0.5 in any dimension, the extremum lies closer to a different sample point. In this case, the sample point is changed and the interpolation is performed about that point instead.
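A sketch of the offset computation, assuming a DoG stack indexed as `dogs[scale][row][column]` and derivatives estimated by central differences of neighbouring samples. The helper names and the small Gaussian-elimination solver are illustrative; a real implementation would typically call a linear-algebra library.

```python
def solve_3x3(a, b):
    """Solve a * x = b for a 3x3 system by Gaussian elimination with pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular Hessian")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x

def refine_offset(dogs, s, y, x):
    """Return (offset, gradient) at a sample point, both ordered as (x, y, sigma)."""
    def d(ds, dy, dx):
        return dogs[s + ds][y + dy][x + dx]

    grad = [
        (d(0, 0, 1) - d(0, 0, -1)) / 2.0,
        (d(0, 1, 0) - d(0, -1, 0)) / 2.0,
        (d(1, 0, 0) - d(-1, 0, 0)) / 2.0,
    ]
    dxx = d(0, 0, 1) - 2 * d(0, 0, 0) + d(0, 0, -1)
    dyy = d(0, 1, 0) - 2 * d(0, 0, 0) + d(0, -1, 0)
    dss = d(1, 0, 0) - 2 * d(0, 0, 0) + d(-1, 0, 0)
    dxy = (d(0, 1, 1) - d(0, 1, -1) - d(0, -1, 1) + d(0, -1, -1)) / 4.0
    dxs = (d(1, 0, 1) - d(1, 0, -1) - d(-1, 0, 1) + d(-1, 0, -1)) / 4.0
    dys = (d(1, 1, 0) - d(1, -1, 0) - d(-1, 1, 0) + d(-1, -1, 0)) / 4.0
    hessian = [[dxx, dxy, dxs], [dxy, dyy, dys], [dxs, dys, dss]]
    # Equation (5): solve Hessian * offset = -gradient instead of inverting.
    offset = solve_3x3(hessian, [-g for g in grad])
    return offset, grad
```

For a purely quadratic function the central differences are exact, so the returned offset lands on the true extremum; on real DoG data it is only an estimate, and an offset above 0.5 in any component signals that a neighbouring sample point should be used.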


Elimination of keypoints

(a) The 233x189 pixel original image
(b) The initial 832 keypoint locations at maxima and minima of the difference-of-Gaussian function
(c) After applying a threshold on minimum contrast, 729 keypoints remain
(d) The final 536 keypoints that remain following an additional threshold on the ratio of principal curvatures


Contrast thresholding

The function value at the extremum, $D(\hat{X})$, is useful for rejecting unstable extrema with low contrast. It can be obtained by substituting equation (5) into (4), giving

$$D(\hat{X}) = D + \frac{1}{2} \frac{\partial D^T}{\partial X} \hat{X}. \qquad (6)$$

If the absolute function value at $\hat{X}$ is below a threshold, the point is excluded. For (c), all extrema with $|D(\hat{X})| < 0.03$ were discarded.
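Equation (6) and the contrast test are short enough to write out directly. The sketch assumes the sample value, the gradient and the interpolated offset are already available as plain numbers and 3-element lists. The function names are illustrative, and the 0.03 default presumes pixel values scaled to the range [0, 1], as in Lowe's paper.

```python
def interpolated_value(d_sample, gradient, offset):
    """Equation (6): the sample value plus half the gradient-offset dot product."""
    return d_sample + 0.5 * sum(g * o for g, o in zip(gradient, offset))

def passes_contrast_test(d_sample, gradient, offset, threshold=0.03):
    """Keep the keypoint only if the interpolated |D| reaches the threshold."""
    return abs(interpolated_value(d_sample, gradient, offset)) >= threshold
```

A candidate with sample value 0.01, gradient (0.01, 0, 0) and offset (0.2, 0, 0) has an interpolated value of 0.011 and would be discarded.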


Cornerness thresholding

A poorly defined peak in the difference-of-Gaussian function will have a large principal curvature across the edge but a small one in the perpendicular direction. The principal curvatures can be computed from a 2x2 Hessian matrix, $\mathbf{H}$, computed at the location and scale of the keypoint:

$$\mathbf{H} = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix} \qquad (7)$$

The derivatives are estimated by taking differences of neighboring sample points. The eigenvalues of $\mathbf{H}$ are proportional to the principal curvatures of $D$.


Let $\alpha$ be the eigenvalue with the largest magnitude and $\beta$ be the smaller one. Then

$$\mathrm{Tr}(\mathbf{H}) = D_{xx} + D_{yy} = \alpha + \beta \qquad (8)$$

$$\mathrm{Det}(\mathbf{H}) = D_{xx} D_{yy} - (D_{xy})^2 = \alpha\beta \qquad (9)$$

Let $r$ be the ratio between the largest-magnitude eigenvalue and the smaller one, so that $\alpha = r\beta$. Then,

$$\frac{\mathrm{Tr}(\mathbf{H})^2}{\mathrm{Det}(\mathbf{H})} = \frac{(\alpha + \beta)^2}{\alpha\beta} = \frac{(r\beta + \beta)^2}{r\beta^2} = \frac{(r + 1)^2}{r} \qquad (10)$$

The quantity $(r + 1)^2 / r$ is at a minimum when the two eigenvalues are equal and it increases with $r$. Therefore, to check that the ratio of principal curvatures is below some threshold, $r$, we only need to check

$$\frac{\mathrm{Tr}(\mathbf{H})^2}{\mathrm{Det}(\mathbf{H})} < \frac{(r + 1)^2}{r} \qquad (11)$$

The transition from (c) to (d) was obtained with $r = 10$.
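The check in (11) needs only the three second derivatives, never the eigenvalues themselves. A minimal sketch follows. The function name is illustrative, and rejecting points with a non-positive determinant follows Lowe's 2004 paper and is not stated on the slide.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Inequality (11): Tr(H)^2 / Det(H) < (r + 1)^2 / r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures of opposite sign (or a degenerate Hessian): reject the point.
        return False
    return trace * trace / det < (r + 1) ** 2 / r
```

With `r = 10` the bound is 12.1. Equal curvatures (`dxx = dyy = 2`, `dxy = 0`) give a ratio of 4 and pass, while an edge-like point (`dxx = 20`, `dyy = 1`, `dxy = 0`) gives 22.05 and is rejected.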


Orientation assignment (2)

Left: The point in the middle is the keypoint candidate. The orientations of the points in the square area around this point are precomputed using pixel differences.

Right: Each bin in the histogram covers 10 degrees, so the 36 bins cover the whole 360 degrees. The value of each bin is the sum of the gradient magnitudes of all the precomputed points whose orientation falls within that bin.
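A sketch of the 36-bin histogram, assuming a small square intensity patch around the keypoint and central pixel differences for the gradients. The function names are illustrative, taking the highest bin as the keypoint orientation is an assumption based on Lowe's paper, and no additional weighting of the samples is applied here.

```python
import math

def orientation_histogram(patch, num_bins=36):
    """Sum gradient magnitudes into orientation bins over the patch interior."""
    hist = [0.0] * num_bins
    bin_width = 360.0 / num_bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[y]) - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            hist[int(angle // bin_width) % num_bins] += magnitude
    return hist

def dominant_orientation(hist):
    """Centre angle, in degrees, of the bin with the largest sum."""
    bin_width = 360.0 / len(hist)
    peak = max(range(len(hist)), key=hist.__getitem__)
    return (peak + 0.5) * bin_width
```

For a patch whose intensity increases equally along both axes, every gradient points at roughly 45 degrees, so all of the magnitude falls into the 40 to 50 degree bin.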


Keypoint descriptor

Keypoint samples are accumulated into orientation histograms summarizing the contents over 4x4 subregions

The best result is obtained with a 4x4 array of histograms with 8 orientation bins in each

As a result, a 4x4x8 = 128 element feature vector is generated for each keypoint
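How the 128 elements arise can be sketched with hard (non-interpolated) binning. The assumptions here are that each of the 4x4 subregions covers 4x4 samples, so the patch holds 16x16 samples plus a one-pixel border for the differences, and that rotation, Gaussian weighting and interpolation are left out. The function name is illustrative.

```python
import math

def raw_descriptor(patch, grid=4, cell=4, num_bins=8):
    """Build a grid x grid x num_bins histogram vector from an intensity patch.

    The patch must have grid * cell + 2 rows and columns; the one-pixel
    border is used only for central differences.
    """
    size = grid * cell
    if len(patch) != size + 2 or any(len(row) != size + 2 for row in patch):
        raise ValueError("patch must be %d x %d" % (size + 2, size + 2))
    desc = [0.0] * (grid * grid * num_bins)
    bin_width = 360.0 / num_bins
    for y in range(size):
        for x in range(size):
            dx = patch[y + 1][x + 2] - patch[y + 1][x]
            dy = patch[y + 2][x + 1] - patch[y][x + 1]
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            o = int(angle // bin_width) % num_bins
            # Subregion (row, column) and orientation bin give the vector index.
            index = ((y // cell) * grid + (x // cell)) * num_bins + o
            desc[index] += math.hypot(dx, dy)
    return desc
```

Each of the 16 subregions owns 8 consecutive entries of the vector, which gives the 4x4x8 = 128 layout.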


Orientation invariance

In order to achieve orientation invariance, the coordinates of the descriptor and the gradient orientations are rotated relative to the keypoint orientation

For efficiency, the gradients are precomputed for all levels of the pyramid

A Gaussian weighting function with σ equal to one half the width of the descriptor window is used to assign a weight to the magnitude of each sample point

The purpose of the Gaussian window is
  To avoid sudden changes in the descriptor with small changes in the position of the window
  To give less emphasis to gradients that are far from the center of the descriptor, as these are most affected by misregistration errors
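The weighting window can be sketched as a table of per-sample weights. The function name and the 16-sample window width are assumptions of this example; only the rule that σ equals half the window width comes from the text.

```python
import math

def gaussian_window(size=16):
    """size x size weights centred on the window, with sigma = size / 2."""
    sigma = 0.5 * size
    centre = (size - 1) / 2.0
    return [
        [math.exp(-((x - centre) ** 2 + (y - centre) ** 2) / (2 * sigma * sigma))
         for x in range(size)]
        for y in range(size)
    ]
```

Each gradient magnitude would be multiplied by the weight at its position before being added to a histogram. With such a wide σ the fall-off is gentle: the corner samples still keep a weight of about 0.17 relative to the centre.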


Boundary effects

To avoid boundary effects
  Trilinear interpolation is used to distribute the value of each gradient sample into adjacent histogram bins
  In other words, each entry into a bin is multiplied by a weight of 1 − d for each dimension
  d is the distance of the sample from the central value of the bin, as measured in units of the histogram bin spacing
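The 1 − d weighting is easiest to see along a single dimension. The sketch below assumes bin i is centred at i + 0.5 in bin units, and the function name is illustrative. Trilinear interpolation applies the same rule along the two spatial axes and the orientation axis and multiplies the three weights.

```python
import math

def soft_bin_1d(position, value, num_bins, circular=False):
    """Split value between the two bins whose centres are nearest to position."""
    hist = [0.0] * num_bins
    shifted = position - 0.5           # bin i is centred at i + 0.5
    lower = math.floor(shifted)
    d = shifted - lower                # distance to the lower bin centre
    for index, weight in ((lower, 1.0 - d), (lower + 1, d)):
        if circular:
            hist[index % num_bins] += value * weight   # orientation wraps around
        elif 0 <= index < num_bins:
            hist[index] += value * weight              # spatial bins do not wrap
    return hist
```

A sample exactly halfway between two bin centres is split equally, and a sample at a bin centre goes entirely into that bin, so a small shift of the sample changes the histogram gradually rather than abruptly.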


Effect of illumination

The feature vector is modified to reduce the effects of illumination change
  First, the vector is normalized to unit length
  Second, the values in the unit feature vector are thresholded
  Finally, the vector is renormalized to unit length
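The three steps might be written as below. The function name is illustrative and the clipping value of 0.2 is taken from Lowe's 2004 paper; the slide does not state a threshold.

```python
import math

def normalize_descriptor(vec, clip=0.2):
    """Normalise to unit length, clip large entries, then renormalise."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm > 0 else list(v)

    clipped = [min(x, clip) for x in unit(vec)]
    return unit(clipped)
```

The first normalisation cancels a uniform scaling of the gradient magnitudes, so a vector and the same vector multiplied by ten give the same result. The clipping limits the influence of a few very large gradient magnitudes.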


Demo and applications

Search for the sample in the image
Classification of remote sensed imagery [Yang & Newsam, 2008]
Model images of planar objects
Recognition of 3D objects
Recognising panoramas
People re-detection [Hu et al., 2008]


Object recognition


Recognising panoramas


Comparison and Modifications

So far, SIFT has proven to be the most reliable descriptor among those compared [Ancuti & Bekaert, 2007; Mikolajczyk & Schmid, 2005].

Modifications
  CSIFT: A SIFT Descriptor with Color Invariant Characteristics [Abdel-Hakim & Farag]
  SIFT-CCH: Increasing the SIFT distinctness by Color Co-occurrence Histograms [Ancuti & Bekaert, 2007]
  PCA-SIFT: A More Distinctive Representation for Local Image Descriptors [Ke et al., 2004]
    ". . . instead of using SIFT's smoothed weighted histograms, we apply Principal Components Analysis (PCA) to the normalized gradient patch."
    ". . . more distinctive and more compact leading to significant improvements in matching accuracy (and speed) for both controlled and real-world conditions."


References

David G. Lowe, Object Recognition from Local Scale-Invariant Features, Proc. of the International Conference on Computer Vision, 1999

David G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, 2004

M. Brown and D. G. Lowe, Recognising Panoramas, International Conference on Computer Vision, 2002

Andrea Vedaldi, SIFT for Matlab, http://www.vlfeat.org/~vedaldi/code/sift.html

Cosmin Ancuti and Philippe Bekaert, SIFT-CCH: Increasing the SIFT distinctness by Color Co-occurrence Histograms, Proceedings of 5th IEEE International Symposium on Image and Signal Processing and Analysis, 2007


K. Mikolajczyk and C. Schmid, A performance evaluation of local descriptors, IEEE PAMI, 2005

Alaa E. Abdel-Hakim and Aly A. Farag, CSIFT: A SIFT Descriptor with Color Invariant Characteristics

Y. Ke, R. Sukthankar and L. Hutson, PCA-SIFT: A more distinctive representation for local image descriptors, Proc. of CVPR, 2004

Y. Yang and S. Newsam, Comparing SIFT Descriptors and Gabor Texture Features for Classification of Remote Sensed Imagery, IEEE International Conference on Image Processing, 2008

Lei Hu, Shuqiang Jiang, Qingming Huang, Yizhou Wang and Wen Gao, People Re-detection Using AdaBoost with SIFT and Color Correlogram, The International Conference on Image Processing (ICIP 2008)