The SIFT method

Scale Invariant Feature Transform

Harri Auvinen, Tapio Leppalampi, Joni Taipale and Maria Teplykh

Lappeenranta University of Technology
Machine Vision and Digital Image Analysis

November 24th, 2009

Introduction

Scale-Invariant Feature Transform (SIFT)

A method developed by David G. Lowe

Feature extraction method

Invariance in feature extraction

A method should locate features

The extraction method should be robust: it should handle different types of changes between images

  Illumination
  Affine transform
  Scale
  Rotation

Algorithm

The steps of the SIFT algorithm:

1. Scale-space extrema detection
   Search over scales and image locations
   Locate local extrema

2. Keypoint localization
   Select keypoints from the local extrema
   Keypoints are selected based on measures of their stability

3. Orientation assignment
   Orientations are assigned to each keypoint based on local image gradient directions

4. Keypoint descriptor
   Local image gradients are measured at the selected scale in the region around each keypoint
   These are transformed into a representation that allows for significant levels of local shape distortion and change in illumination

Scale-space extrema detection

The first stage in keypoint detection is to find the local extrema in scale space. It consists of

  A cascade filtering approach
  Creation of octaves and scale-space images for each octave

The original image $I(x, y)$ is blurred with a Gaussian filter,

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y), \tag{1}$$

where $*$ denotes convolution and

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right). \tag{2}$$

The blurring is repeated $s$ times, each time multiplying the scale $\sigma$ by a constant factor $k$. Then the difference of Gaussians (DoG),

$$D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma), \tag{3}$$

is calculated for adjacent blurred images. According to Lowe, to achieve stable keypoints one should set $s = 3$ and $k = 2^{1/s}$.
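As an illustration of equation (3) and the choice $k = 2^{1/s}$, the following minimal sketch lists the blur levels of one octave and subtracts two already-blurred images. The function names and the starting value `sigma0 = 1.6` are assumptions made for this example (the slides do not state an initial σ), and the Gaussian blurring itself is not implemented here.

```python
def octave_scales(sigma0=1.6, s=3):
    """Blur levels sigma0 * k**i for one octave, with k = 2**(1/s).

    sigma0 = 1.6 is an assumed starting scale, not a value from the slides.
    """
    k = 2.0 ** (1.0 / s)
    # s + 1 levels, so the last level has twice the sigma of the first.
    return [sigma0 * k ** i for i in range(s + 1)]


def difference_of_gaussians(blurred_upper, blurred_lower):
    """Pixel-wise D = L(k*sigma) - L(sigma) for two equally sized images."""
    return [
        [hi - lo for hi, lo in zip(row_hi, row_lo)]
        for row_hi, row_lo in zip(blurred_upper, blurred_lower)
    ]


scales = octave_scales()
dog = difference_of_gaussians([[5, 7], [9, 4]], [[4, 7], [6, 6]])
```

With `s = 3` the scales grow by a factor of about 1.26 per step, and the fourth level doubles the first.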

The procedure to calculate the differences of Gaussians is then repeated for each octave. The next octave is created as follows:

  Select the Gaussian-blurred image whose σ is twice the initial value
  Subsample that image and use the output as the starting point for the next octave
  Subsampling is done by selecting every second pixel from the rows and columns of the image
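The subsampling step can be sketched in a few lines. This is only an illustration on a nested-list image; the function name `subsample` is chosen for this example.

```python
def subsample(image):
    """Keep every second pixel along both rows and columns."""
    return [row[::2] for row in image[::2]]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
half = subsample(image)
```

The result has half the width and half the height of the input, so each new octave costs roughly a quarter of the previous one.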

Keypoint localization

Detection of local extrema/keypoints
  Find the extrema in the DoG pyramid
  Improve the localization of each keypoint to subpixel accuracy by using a second-order Taylor series expansion

Elimination of keypoints
  Eliminate points from the candidate list of keypoints that have low contrast or are poorly localised along an edge
  Contrast thresholding
  Cornerness thresholding

Detection of local extrema/keypoints

To detect the local maxima and minima of D(x, y, σ), each sample point is compared with all of its 26 neighbours: 8 in the same DoG image and 9 in each of the two adjacent scales

If its value is smaller or larger than all of them, the point is an extremum
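The 26-neighbour comparison can be sketched as follows. The data layout (`dog[scale][row][column]` as nested lists) and the function name are assumptions for this example, and the caller is expected to pass only interior points.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly larger or strictly smaller than
    all 26 neighbours in the surrounding 3x3x3 cube."""
    value = dog[s][y][x]
    neighbours = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return value > max(neighbours) or value < min(neighbours)


def cube_with_centre(centre):
    """3x3x3 block of zeros with the given centre value (test helper)."""
    cube = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    cube[1][1][1] = centre
    return cube
```

For example, `is_extremum(cube_with_centre(1.0), 1, 1, 1)` is true, while a flat block is rejected because the comparison is strict.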

Detection of local extrema/keypoints: the Brown and Lowe method

Improvement to matching and stability

The approach uses the Taylor expansion of the scale-space function $D(x, y, \sigma)$, up to the quadratic terms, shifted so that the origin is at the sample point:

$$D(X) = D + \frac{\partial D}{\partial X}^{T} X + \frac{1}{2} X^{T} \frac{\partial^2 D}{\partial X^2} X, \tag{4}$$

where $D$ and its derivatives are evaluated at the sample point and $X = (x, y, \sigma)^T$ is the offset from this point. The location of the extremum $\hat{X}$ is determined by taking the derivative of this function with respect to $X$ and setting it to zero, giving

$$\hat{X} = -\left(\frac{\partial^2 D}{\partial X^2}\right)^{-1} \frac{\partial D}{\partial X}. \tag{5}$$

If the offset $\hat{X}$ is larger than 0.5 in any dimension, the extremum lies closer to a different sample point. In this case, the sample point is changed and the interpolation is performed about that point instead.
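Equation (5) amounts to solving a 3x3 linear system. The sketch below does so with Cramer's rule, assuming the gradient and Hessian of D have already been estimated by finite differences; the function names are illustrative and not taken from any library.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def refine_offset(gradient, hessian):
    """Return X_hat = -H^{-1} g as in equation (5), or None if H is singular."""
    d = det3(hessian)
    if d == 0:
        return None
    offset = []
    for col in range(3):
        # Cramer's rule: replace one column of H by the gradient g.
        replaced = [
            [gradient[r] if c == col else hessian[r][c] for c in range(3)]
            for r in range(3)
        ]
        offset.append(-det3(replaced) / d)
    return offset


def needs_resampling(offset):
    """True if the extremum lies closer to a different sample point."""
    return any(abs(v) > 0.5 for v in offset)


x_hat = refine_offset([1.0, -1.0, 2.0], [[2.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 8.0]])
```

For the diagonal Hessian used here each component is simply the negated gradient entry divided by the matching diagonal entry, which makes the result easy to check by hand.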

Elimination of keypoints

(a) The 233x189 pixel original image

(b) The initial 832 keypoint locations at maxima and minima of the difference-of-Gaussian function

(c) After applying a threshold on minimum contrast, 729 keypoints remain

(d) The final 536 keypoints that remain following an additional threshold on the ratio of principal curvatures

Contrast thresholding

The function value at the extremum, $D(\hat{X})$, is useful for rejecting unstable extrema with low contrast. It can be obtained by substituting equation (5) into (4), giving

$$D(\hat{X}) = D + \frac{1}{2} \frac{\partial D}{\partial X}^{T} \hat{X}. \tag{6}$$

If the magnitude of the function value at $\hat{X}$ is below a threshold, the point is excluded. For panel (c), all extrema with $|D(\hat{X})| < 0.03$ were discarded.
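Equation (6) and the 0.03 threshold translate into a short check. This sketch assumes the gradient and the refined offset are given as 3-element sequences and that pixel values are scaled to the range [0, 1], which is the convention under which a threshold of 0.03 is meaningful; the function names are illustrative.

```python
def contrast_at_extremum(d, gradient, offset):
    """D(X_hat) = D + 0.5 * g^T X_hat, as in equation (6)."""
    return d + 0.5 * sum(g * o for g, o in zip(gradient, offset))


def is_low_contrast(d, gradient, offset, threshold=0.03):
    """True if the keypoint should be discarded for low contrast."""
    return abs(contrast_at_extremum(d, gradient, offset)) < threshold


value = contrast_at_extremum(0.05, [0.02, 0.0, 0.0], [0.5, 0.0, 0.0])
```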

Cornerness thresholding

A poorly defined peak in the difference-of-Gaussian function will have a large principal curvature across the edge but a small one in the perpendicular direction. The principal curvatures can be computed from a 2x2 Hessian matrix, $\mathbf{H}$, computed at the location and scale of the keypoint:

$$\mathbf{H} = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix} \tag{7}$$

The derivatives are estimated by taking differences of neighboring sample points. The eigenvalues of $\mathbf{H}$ are proportional to the principal curvatures of $D$.

Cornerness thresholding

Let $\alpha$ be the eigenvalue with the largest magnitude and $\beta$ be the smaller one. Then

$$\mathrm{Tr}(\mathbf{H}) = D_{xx} + D_{yy} = \alpha + \beta \tag{8}$$

$$\mathrm{Det}(\mathbf{H}) = D_{xx} D_{yy} - (D_{xy})^2 = \alpha\beta \tag{9}$$

Let $r$ be the ratio between the largest-magnitude eigenvalue and the smaller one, so that $\alpha = r\beta$. Then

$$\frac{\mathrm{Tr}(\mathbf{H})^2}{\mathrm{Det}(\mathbf{H})} = \frac{(\alpha + \beta)^2}{\alpha\beta} = \frac{(r\beta + \beta)^2}{r\beta^2} = \frac{(r + 1)^2}{r} \tag{10}$$

The quantity $(r + 1)^2 / r$ is at a minimum when the two eigenvalues are equal and it increases with $r$. Therefore, to check that the ratio of principal curvatures is below some threshold, $r$, we only need to check

$$\frac{\mathrm{Tr}(\mathbf{H})^2}{\mathrm{Det}(\mathbf{H})} < \frac{(r + 1)^2}{r} \tag{11}$$

The transition from (c) to (d) was obtained with $r = 10$.
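The test in equation (11) needs only the three second derivatives, with no explicit eigenvalue computation. A minimal sketch follows; the function name is illustrative, and rejecting points with a non-positive determinant (curvatures of opposite sign) is an assumption added here so that the division is always safe.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """True if Tr(H)^2 / Det(H) < (r + 1)^2 / r, as in equation (11)."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Assumption: curvatures of opposite sign (or a degenerate Hessian)
        # are treated as failing the test.
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r
```

With `r = 10` the bound is 12.1: a blob-like point with `dxx = dyy = 1`, `dxy = 0` gives a ratio of 4 and is kept, whereas an edge-like point with `dxx = 10`, `dyy = 0.1` gives a ratio above 100 and is rejected.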

Orientation assignment (2)

Left: The point in the middle is the keypoint candidate. The orientations of the points in the square area around this point are precomputed using pixel differences.

Right: Each bin in the histogram covers 10 degrees, so 36 bins cover the whole 360 degrees. The value of each bin is the sum of the gradient magnitudes of all the precomputed points whose orientation falls within that bin.
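The 36-bin histogram described above can be sketched as follows. The input format (a list of `(dx, dy)` pixel differences from the neighbourhood) and the function name are assumptions for this example, and the sketch sums plain magnitudes as the slide describes, without any additional weighting.

```python
import math


def orientation_histogram(gradients, bins=36):
    """Sum gradient magnitudes into orientation bins of 360/bins degrees."""
    bin_width = 360.0 / bins
    hist = [0.0] * bins
    for dx, dy in gradients:
        magnitude = math.hypot(dx, dy)
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        hist[int(angle // bin_width) % bins] += magnitude
    return hist


hist = orientation_histogram([(1.0, 1.0), (1.0, 1.0), (-1.0, -1.0)])
dominant_bin = max(range(len(hist)), key=hist.__getitem__)
```

Here two samples point at 45 degrees and one at 225 degrees, so the dominant bin is the one covering 40 to 50 degrees.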

Keypoint descriptor

Keypoint samples are accumulated into orientation histograms summarizing the contents over 4x4 subregions

The best results are obtained with a 4x4 array of histograms with 8 orientation bins in each

As a result, a 4x4x8 = 128 element feature vector is generated for each keypoint
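To make the 4x4x8 layout concrete, the sketch below accumulates samples into a flat 128-element vector. The 16x16 sample window, the sample format `(x, y, magnitude, orientation_deg)` and the function name are assumptions for this example; interpolation and weighting are deliberately left out.

```python
def build_descriptor(samples, window=16, cells=4, bins=8):
    """Accumulate samples into a cells x cells array of orientation histograms.

    Each sample is (x, y, magnitude, orientation_deg) with 0 <= x, y < window.
    """
    cell_size = window // cells
    bin_width = 360.0 / bins
    descriptor = [0.0] * (cells * cells * bins)
    for x, y, magnitude, orientation in samples:
        cx = int(x) // cell_size
        cy = int(y) // cell_size
        b = int((orientation % 360.0) // bin_width)
        descriptor[(cy * cells + cx) * bins + b] += magnitude
    return descriptor


descriptor = build_descriptor([(0, 0, 1.0, 10.0), (15, 15, 2.0, 350.0)])
```

The first sample lands in the first bin of the first histogram, the second in the last bin of the last histogram.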

Orientation invariance

In order to achieve orientation invariance, the coordinates of the descriptor and the gradient orientations are rotated relative to the keypoint orientation

For efficiency, the gradients are precomputed for all levels of the pyramid

A Gaussian weighting function with σ equal to one half the width of the descriptor window is used to assign a weight to the magnitude of each sample point

The purpose of the Gaussian window is
  To avoid sudden changes in the descriptor with small changes in the position of the window
  To give less emphasis to gradients that are far from the center of the descriptor, as these are most affected by misregistration errors
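The rotation into the keypoint frame and the Gaussian weighting can be sketched as below. The function names, the use of degrees, and the default window width of 16 samples are assumptions for this example.

```python
import math


def to_keypoint_frame(dx, dy, gradient_deg, keypoint_deg):
    """Rotate a sample offset and its gradient orientation so that the
    keypoint orientation becomes the reference direction."""
    theta = math.radians(keypoint_deg)
    c, s = math.cos(theta), math.sin(theta)
    # Rotation by -theta: the keypoint orientation maps onto the x axis.
    rx = c * dx + s * dy
    ry = -s * dx + c * dy
    return rx, ry, (gradient_deg - keypoint_deg) % 360.0


def gaussian_weight(rx, ry, window_width=16.0):
    """Gaussian weight with sigma equal to half the descriptor window width."""
    sigma = 0.5 * window_width
    return math.exp(-(rx * rx + ry * ry) / (2.0 * sigma * sigma))


rx, ry, relative_deg = to_keypoint_frame(0.0, 1.0, 100.0, 90.0)
```

A sample one pixel along the keypoint direction ends up on the positive x axis of the descriptor frame, and its gradient orientation is expressed relative to the keypoint orientation (10 degrees here).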

Boundary effects

To avoid all boundary effects, in which the descriptor changes abruptly as a sample shifts smoothly from one histogram to another or from one orientation to another:

  Trilinear interpolation is used to distribute the value of each gradient sample into adjacent histogram bins

  In other words, each entry into a bin is multiplied by a weight of 1 − d for each dimension

  d is the distance of the sample from the central value of the bin, measured in units of the histogram bin spacing
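The 1 − d weighting can be illustrated by computing the eight weights that one sample contributes to its neighbouring bins. In this sketch `dx`, `dy` and `do` are assumed to be the fractional distances (in bin units, between 0 and 1) from the lower neighbouring bin centre in the two spatial dimensions and the orientation dimension; the function name is illustrative.

```python
from itertools import product


def trilinear_weights(dx, dy, do):
    """Weights for the 8 neighbouring bins of one gradient sample.

    Key (0, 0, 0) is the lower bin in every dimension, (1, 1, 1) the upper.
    Each factor is 1 - (distance to that bin's centre) in that dimension.
    """
    weights = {}
    for corner in product((0, 1), repeat=3):
        w = 1.0
        for bit, d in zip(corner, (dx, dy, do)):
            w *= d if bit else 1.0 - d
        weights[corner] = w
    return weights


weights = trilinear_weights(0.25, 0.5, 0.0)
```

The eight weights always sum to one, so the total gradient magnitude entering the descriptor is preserved.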

Effect of illumination

The feature vector is modified to reduce the effects of illumination change

First, the vector is normalized to unit length

Second, the values in the unit feature vector are thresholded

Finally, the vector is renormalized to unit length
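The three steps can be sketched as follows. The slides do not give the threshold value; the default of 0.2 used here is an assumption taken from Lowe's 2004 paper, and the function name is illustrative.

```python
import math


def normalize_descriptor(vector, clamp=0.2):
    """Normalize to unit length, cap each entry at `clamp`, renormalize.

    clamp = 0.2 is an assumed value; entries are expected to be non-negative.
    """
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        # Leave an all-zero vector unchanged instead of dividing by zero.
        return [x / norm for x in v] if norm > 0 else list(v)

    capped = [min(x, clamp) for x in unit(vector)]
    return unit(capped)


result = normalize_descriptor([3.0, 4.0])
```

In this two-element example both normalized entries exceed the cap, so after renormalization they become equal.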

Demo and applications

Search for the sample in the image

Classification of remote sensed imagery [Yang & Newsam, 2008]

Model images of planar objects

Recognition of 3D objects

Recognising panoramas

People redetection [Hu et al., 2008]

Object recognition

Recognising panoramas

Comparison and Modifications

In comparative evaluations so far, SIFT has proven to be the most reliable of the descriptors tested [Ancuti & Bekaert, 2007; Mikolajczyk & Schmid, 2005]

Modifications

CSIFT: A SIFT Descriptor with Color Invariant Characteristics [Abdel-Hakim & Farag]

SIFT-CCH: Increasing the SIFT distinctness by Color Co-occurrence Histograms [Ancuti & Bekaert, 2007]

PCA-SIFT: A More Distinctive Representation for Local Image Descriptors [Ke et al., 2004]

  ". . . instead of using SIFT’s smoothed weighted histograms, we apply Principal Components Analysis (PCA) to the normalized gradient patch."

  ". . . more distinctive and more compact leading to significant improvements in matching accuracy (and speed) for both controlled and real-world conditions."

References

David G. Lowe, Object Recognition from Local Scale-Invariant Features, Proc. of the International Conference on Computer Vision, 1999

David G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, 2004

M. Brown and D.G. Lowe, Recognising Panoramas, International Conference on Computer Vision, 2002

Andrea Vedaldi, SIFT for Matlab, http://www.vlfeat.org/~vedaldi/code/sift.html

Cosmin Ancuti and Philippe Bekaert, SIFT-CCH: Increasing the SIFT distinctness by Color Co-occurrence Histograms, Proceedings of 5th IEEE International Symposium on Image and Signal Processing and Analysis, 2007

K. Mikolajczyk and C. Schmid, A performance evaluation of local descriptors, IEEE PAMI, 2005

Alaa E. Abdel-Hakim and Aly A. Farag, CSIFT: A SIFT Descriptor with Color Invariant Characteristics

Y. Ke, R. Sukthankar and L. Hutson, PCA-SIFT: a more distinctive representation for local image descriptors, Proc. of CVPR, 2004

Y. Yang and S. Newsam, Comparing SIFT Descriptors and Gabor Texture Features for Classification of Remote Sensed Imagery, IEEE International Conference on Image Processing, 2008

Lei Hu, Shuqiang Jiang, Qingming Huang, Yizhou Wang and Wen Gao, People Re-detection Using AdaBoost with SIFT and Color Correlogram, International Conference on Image Processing (ICIP 2008)
