
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 37, NO. 5, SEPTEMBER 1999 2351

A Feature-Based Image Registration Algorithm Using Improved Chain-Code Representation Combined with Invariant Moments

Xiaolong Dai, Member, IEEE, and Siamak Khorram, Member, IEEE

Abstract—In this paper, a new feature-based approach to automated image-to-image registration is presented. The characteristic of this approach is that it combines an invariant-moment shape descriptor with improved chain-code matching to establish correspondences between the potentially matched regions detected from the two images. It is robust in that it overcomes the difficulties of control-point correspondence by matching the images both in the feature space, using the principle of minimum distance classifier (based on the combined criteria), and sequentially in the image space, using the rule of root mean-square error (RMSE). In image segmentation, the performance of the Laplacian of Gaussian operators is improved by introducing a new algorithm called thin and robust zero crossing. After the detected edge points are refined and sorted, regions are defined. Region correspondences are then performed by an image-matching algorithm developed in this research. The centers of gravity are then extracted from the matched regions and are used as control points. Transformation parameters are estimated based on the final matched control-point pairs. The algorithm proposed is automated, robust, and of significant value in an operational context. Experimental results using multitemporal Landsat TM imagery are presented.

Index Terms—Automated, feature extraction, image matching, image registration, satellite imagery.

I. INTRODUCTION

IMAGE registration is the process of spatially matching two images so that corresponding pixels in the two images correspond to the same physical region of the scene being imaged. Image registration is an inevitable problem arising in many image-processing applications whenever two or more images of the same scene have to be compared pixel by pixel. These applications include the following.

Multisource Data Fusion: Fusion of multisource data usually requires establishing pixel-to-pixel correspondence [1].

Time-Varying Image Analysis for Changes: Change analysis requires reliable and accurate multitemporal image registration. Most existing change detection algorithms depend considerably on reasonably accurate image registration to be workable [2].

Manuscript received March 25, 1998; revised December 11, 1998. This work was supported in part by the Coastal Change Analysis Program (C-CAP) of the National Oceanic and Atmospheric Administration (NOAA), funded through the North Carolina Sea Grant Program and by SGI/Cray Research, Inc., St. Louis, MO.

The authors are with the Center for Earth Observation, North Carolina State University, Raleigh, NC 27695-7106 USA.

Publisher Item Identifier S 0196-2892(99)06150-1.

Image Mosaicking: In many regional and/or global remote sensing applications involving multiple scenes, a single coverage is needed, and individual satellite scenes must be mosaicked for further processing [3].

Motion Detection and Object Recognition: Image registration also is a required process in other examples such as virtual reality construction, stereo matching and generation, image composite generation, and object height estimation [4].

Existing image registration techniques fall into two broad categories: manual registration and automated registration. Manual registration techniques commonly have been used in practical applications. In these methods, ground control points (GCP's) are located visually on both the sensed image and the reference image by choosing special and easily recognizable points such as road intersections. By doing so, the steps of feature identification and matching are done simultaneously by human operators. In order to get reasonably good registration results, a large number of control points must be selected across the whole image [4]. This is very tedious, labor-intensive, and repetitive work, especially when multiple wide-area scenes, such as thematic mapper (TM) scenes (over 6000 × 6000 pixels for each TM scene), are involved in a regional and/or global change-detection implementation. These techniques also are subject to the problems of inconsistency, limited accuracy, and, in many instances, lack of availability of reference data for locating GCP's. Therefore, there is a critical need to develop an automated technique that requires little or no operator supervision to coregister multitemporal and/or multisensor images when higher accuracy is desired and image analysis is subject to time constraints [5].

The current automated registration techniques can be classified into two broad categories: area-based and feature-based techniques. In the area-based algorithms, a small window of points in the sensed image is compared statistically with windows of the same size in the reference image [6]. Window correspondence is based on the similarity measure between two given windows. The measure of similarity is usually the normalized cross correlation. Other useful similarity measures are the correlation coefficient and the sequential-similarity detection [7]. Area-based techniques can be implemented by the Fourier transform using the fast Fourier transform (FFT) [8]. The centers of the matched windows then are used as control points to solve for the transformation parameters between the two images. Although a novel approach has been proposed by [9] to register images with large misalignment by

0196–2892/99$10.00 © 1999 IEEE


correcting the image rotation first and then establishing correspondences, a majority of the area-based methods have the limitation of registering only images with small misalignment, and therefore, the images must be roughly aligned with each other initially. The correlation measures become unreliable when the images have multiple modalities and the gray-level characteristics vary (e.g., TM and synthetic aperture radar (SAR) data).
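As a concrete illustration of area-based window correspondence by normalized cross correlation, the following sketch slides a window from the sensed image over the reference image and keeps the best-scoring location; the exhaustive search strategy and all function names are our assumptions for illustration, not a particular published implementation.

```python
import numpy as np

def ncc(window, patch):
    """Normalized cross correlation between two equal-size windows."""
    w = window - window.mean()
    p = patch - patch.mean()
    denom = np.sqrt((w * w).sum() * (p * p).sum())
    return (w * p).sum() / denom if denom > 0 else 0.0

def match_window(sensed_win, reference, step=1):
    """Slide sensed_win over reference; return center of best-matching window."""
    h, w = sensed_win.shape
    best, best_rc = -2.0, (0, 0)
    for r in range(0, reference.shape[0] - h + 1, step):
        for c in range(0, reference.shape[1] - w + 1, step):
            score = ncc(sensed_win, reference[r:r + h, c:c + w])
            if score > best:
                best, best_rc = score, (r, c)
    # the window center serves as a candidate control point
    return (best_rc[0] + h // 2, best_rc[1] + w // 2), best
```

As the surrounding text notes, such a search assumes the images are roughly aligned; a coarse `step` is often used first to keep the cost manageable.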

In contrast, the feature-based methods are more robust and more suitable in these cases. There are two critical procedures generally involved in the feature-based techniques: feature extraction and feature correspondence. A variety of image-segmentation techniques have been used for extraction of edge and boundary features, such as the Canny operator in [10] and [11], the Laplacian of Gaussian (LoG) operator in [12] and [13], the thresholding technique in [14], the classification method in [15], and the region growing in [16]. Images are represented by a set of features after extraction either in the spatial domain [17], [18] or in the transform domain [20]–[22]. Spatial features usually include edges, boundaries, road intersections, and special structures. In the transform domain, a set of transform coefficients represents the images. Feature representations, which are invariant to scaling, rotation, and translation, are desirable in the matching process. The general feature representations are chain code, moment invariants, Fourier descriptors, shape signatures, centroidal profiles, and shape matrices [23]. Centers of gravity of closed boundaries generally have been used as control points [14], [16], [17]. Other points, such as salient points in [6] and locations of maximum curvature in [12], also have been used as control points. Feature correspondence is performed based on the characteristics of the features detected. A robust algorithm to establish control-point correspondences is one of the most important tasks in automated image registration. Existing feature-matching algorithms include binary correlation, distance transform and Chamfer matching, dynamic programming, structural matching, chain-code correlation, distance of invariant moments, and relaxation algorithms [16], [23]. In most existing feature-based techniques, feature correspondence is still the most challenging problem.

This paper proposes a new feature-based approach to automated registration of remotely sensed images of the same scene using combined criteria of invariant-moment distance and chain-code matching, as shown in Fig. 1. The critical elements for an automated image registration procedure are explored. These elements include feature identification, feature matching, and transformation parameter estimation. Special attention has been paid to feature extraction by image segmentation and control-point correspondence by region matching. In image segmentation, the performance of the LoG zero-crossing edge detector is improved by developing a selective zero-crossing scheme and defining an edge strength (ES) array. Images are segmented by the improved LoG edge-extraction technique. In region matching, a new method for the determination of corresponding regions is proposed using combined criteria of invariant-moment distance and chain-code matching between the detected regions. Each region is represented by affine invariant moments and improved chain codes that are affine-invariant features describing the shape of a region. Region matching then is implemented in the feature space and is implemented sequentially in the image space. In the feature space, minimum distance classification is used to identify the most robust control points for initial image transformation. In the image space, region-to-region correspondence is established by the root mean-square error (RMSE) rule.

The rest of the paper is organized as follows. In Section II, the image segmentation technique developed in this work is described. In Section III, we focus on the image-matching algorithm proposed to achieve a robust scheme for control-point correspondence. The experimental results are presented in Section IV. The final conclusions are included in Section V.

II. FEATURE EXTRACTION: IMAGE SEGMENTATION

Image segmentation for the purpose of obtaining boundaries is an important step in a feature-based registration system. There are many segmentation techniques available that could be used. However, there is no unique segmentation technique that can perform best on all types of images, and most segmentation techniques are image-dependent [14]. In this paper, we use a technique that can produce a number of desirable boundaries to be used in the control-point finding process. The Laplacian of Gaussian (LoG) operator combines smoothing and second-derivative operations and produces closed-edge contours. Based on the image obtained by convolving the original image with the LoG, edges are found at zero-crossing points. Therefore, the LoG zero-crossing operator is deemed appropriate in this study. The LoG filter is well understood, and there are abundant references on the technique itself and its implementations (see [6], [10], and [23]). However, determination of the zero crossings is not without problems.

Conventionally, the resultant convolved image is scanned to detect pixels that have a zero value or pixels at which a change of sign has occurred. These pixels are marked as edge pixels. To avoid noisy pixels, a threshold is used to cut the edge pixels that have small convolution values [23]. However, this method causes discontinuity at the weak edge pixels. Simple zero-crossing patterns, which are composed of signs of pixel values of the filtered image, are detected along both vertical and horizontal directions in [6]. The authors then used a two-threshold method based on the ES at each zero-crossing point. This was used to refine the edge points and assure continuities at weak edge points. A method that conceptually corresponds to intersecting the convolution surface with a horizontal plane at convolution value zero was introduced in [12]. This assured not only closed contours but also correct connections at T-junctions. Starting with a seed pixel, its neighboring pixels were expanded until a sign change occurred. Pixels on the boundary were flagged as zero crossings. These methods have two general drawbacks when we use the edge points for the purpose of image registration: thick edges and discontinuity at weak edge points.

To overcome these drawbacks, we propose a method, called Thin and Robust Zero-Crossing, which produces thin edges and assures continuities at weak edge points as well. This technique


Fig. 1. Flowchart of the proposed automatic image registration system.

can be performed in two steps: selective zero-crossing point marking and edge refinement and sorting. At the first stage, we mark as an edge point every pixel that satisfies the following three conditions.

1) The pixel is a zero-crossing point (sign change on the convolved image).

2) The pixel lies in the direction of the steepest gradient change.

3) The pixel is the closest pixel to the virtual zero plane of the LoG image among its eight neighbors.

Equivalently, if a pixel (i, j) is a zero-crossing point and its LoG value satisfies the following condition:

|LoG(i, j)| ≤ min { |LoG(k, l)| : (k, l) ∈ N8(i, j), sign(LoG(k, l)) ≠ sign(LoG(i, j)) }     (1)

where LoG(k, l) is the LoG value of pixel (k, l) in the 8-neighborhood N8(i, j), and sign(·) is the signum function, it is flagged as an edge point. This condition usually preserves the continuities at the T-junctions and assures the detected edge is only one pixel thick, which indicates the best locational approximation of edge points. For every marked edge point, ES is defined as the steepest gradient slope change along the eight directions in its 8-neighborhood, based on the LoG values of itself and its eight neighbors [11]

ES(i, j) = max { |LoG(i, j) − LoG(k, l)| : (k, l) ∈ N8(i, j), sign(LoG(k, l)) ≠ sign(LoG(i, j)) }.     (2)

The results from this stage may be “noisy,” as shown in Fig. 2(a) and (b), but the edges are continuous and thin. More importantly, the noisy edge points can be discarded easily in the edge sorting and refinement processing using the ES array defined above.
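A minimal sketch of the selective marking stage, under our reading of conditions (1) and (2): the LoG-filtered image is assumed to be precomputed, image borders are skipped for brevity, and the function name is ours.

```python
import numpy as np

# 8-neighborhood offsets
N8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def mark_zero_crossings(log_img):
    """Selective zero-crossing marking: flag a pixel when some 8-neighbor has
    the opposite LoG sign and the pixel is the closer of the pair to the zero
    plane; its ES is the largest LoG difference across the crossing."""
    es = np.zeros_like(log_img, dtype=float)
    for r in range(1, log_img.shape[0] - 1):
        for c in range(1, log_img.shape[1] - 1):
            v = log_img[r, c]
            s = np.sign(v)
            if s == 0:
                continue
            opp = [log_img[r + dr, c + dc] for dr, dc in N8
                   if np.sign(log_img[r + dr, c + dc]) == -s]
            if opp and all(abs(v) <= abs(u) for u in opp):
                es[r, c] = max(abs(v - u) for u in opp)
    return es
```

On a vertical step, only the column closer to the zero plane is marked, which is the one-pixel-thick behavior the text describes; a nonzero ES entry doubles as the edge-point flag.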

At the edge refinement and sorting stage, we use the two-threshold method [6] to search and clean the marked edge points to detect contours. This scheme applies the following two conditions to the detected zero-crossing points.

1) The ES at each point along the edges is greater than T1.

2) At least one point on the contour has an ES greater than T2.

Here, T1 and T2 are the preset thresholds, and T1 < T2. T1 preserves the whole contour around the region boundary without incurring discontinuity at weak edge points. Fig. 2(c) and (d) show the results after the threshold T1 was applied, where T1 was set to ten. T2 is set large enough to avoid



Fig. 2. Examples of region extraction by the improved LoG zero-crossing edge detector: (a) edges from improved LoG, 1988, (b) edges from improved LoG, 1994, (c) weak edges removed by ES, 1988, (d) weak edges removed, 1994, (e) edges after contour sorting, 1988, (f) edges after contour sorting, 1994, (g) closed contours detected, 1988, and (h) closed contours detected, 1994.


spurious edges and initiates a new contour search. This two-threshold scheme is implemented by the contour search and sorting program (Fig. 1). A new contour search is initiated whenever one point with a value greater than T2 is detected. When a new search for another contour is initiated, the neighboring pixels with a value greater than T1 are accepted as contour points until no neighboring pixel can satisfy this condition. Then, all the ES values along the detected contour are disabled by setting them to zeros such that these points will not be visited again. The search operation continues until the whole ES array has been scanned. The detected edge points are shown in Fig. 2(e) and (f). The detected closed-contour points for each contour are then sorted and stored in order in an array for further processing. Sorted closed contours are then marked as region boundaries, as shown in Fig. 2(g) and (h).
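The two-threshold search just described can be sketched as follows; the threshold argument names t1 and t2 and the breadth-first growth order are our assumptions about the contour search and sorting program.

```python
import numpy as np
from collections import deque

# 8-neighborhood offsets
N8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def contour_search(es, t1, t2):
    """Two-threshold contour search over an edge-strength (ES) array.

    A contour is seeded at any point with ES > t2 and grown through
    8-connected neighbors with ES > t1; accepted points are zeroed so they
    are not visited again.  Returns a list of contours (lists of points)."""
    es = es.copy()
    contours = []
    for r in range(es.shape[0]):
        for c in range(es.shape[1]):
            if es[r, c] > t2:
                contour, q = [], deque([(r, c)])
                es[r, c] = 0.0
                while q:
                    y, x = q.popleft()
                    contour.append((y, x))
                    for dy, dx in N8:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < es.shape[0] and 0 <= nx < es.shape[1]
                                and es[ny, nx] > t1):
                            es[ny, nx] = 0.0
                            q.append((ny, nx))
                contours.append(contour)
    return contours
```

Weak points survive only when connected to a strong seed, which is exactly how the scheme keeps weak segments of real contours while rejecting isolated noise.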

III. CONTROL-POINT CORRESPONDENCE: IMAGE MATCHING

After the closed contours are detected from two images, a correspondence mechanism between the regions must be established to match these two images. Region similarity is obtained by comparing the shapes of the segmented regions. Since the images may have translational, rotational, and scaling differences, the shape measures should be invariant with respect to translation, rotation, and scaling. Some of the techniques that measure shape similarity in this fashion are Fourier descriptors, chain-code matching, shape signatures, centroidal profiles, invariant moments, and shape matrices. In practice, these descriptors have been applied individually or sequentially in the process of image matching. A comparison between these similarity measures can be found in [23]. To achieve better discrimination capability, the modified chain-code matching and the invariant moment are combined to build the correspondence between regions in the two corresponding images.

A. Moment Representation of Regions: Invariant Moments

One set of the useful shape descriptors is based on the theory of moments. The moment invariants are moment-based descriptors of planar shapes, which are invariant under general translational, rotational, scaling, and reflection transformations [24]. The moments of a binary image f(x, y) are given by

m_pq = Σ_x Σ_y x^p y^q f(x, y)     (3)

where p and q define the order of the moment. The center of gravity of the object can be given by using object moments

x̄ = m10 / m00,   ȳ = m01 / m00.     (4)

The central moments then are defined as

μ_pq = Σ_x Σ_y (x − x̄)^p (y − ȳ)^q f(x, y).     (5)

The zero-order moment μ00 represents the binary object area. Second-order moments express the distribution of matter around the center of gravity. In the case of objects with mass, they are called moments of inertia. Third-order moments express the basic properties of symmetry of the object. They are equal to zero for objects with central symmetry. Moments of higher order describe more slight variations in shape, but they are more sensitive to noise. Central moments are translational invariants. Under a scale change x′ = αx and y′ = αy, the moments μ_pq change to μ′_pq = α^(p+q+2) μ_pq. The normalized moments are then defined as

η_pq = μ_pq / μ00^γ,   where γ = (p + q)/2 + 1.     (6)

Central moments and the scaling-invariant moment representation can be employed to produce a set of invariant moments that are further invariant to rotational and reflectional differences. The lower-order invariants, which are composed of the second- and third-order central moments, are usually sufficient for most registration tasks on remotely sensed image matching. Detailed definitions of these invariants can be found in the literature (see [17] and [25]). The following seven invariants are used in this research:

φ1 = η20 + η02     (7)

φ2 = (η20 − η02)² + 4η11²     (8)

φ3 = (η30 − 3η12)² + (3η21 − η03)²     (9)

φ4 = (η30 + η12)² + (η21 + η03)²     (10)

φ5 = (η30 − 3η12)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²] + (3η21 − η03)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]     (11)

φ6 = (η20 − η02)[(η30 + η12)² − (η21 + η03)²] + 4η11(η30 + η12)(η21 + η03)     (12)

φ7 = (3η21 − η03)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²] − (η30 − 3η12)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]     (13)

All moments φ1–φ7 are scale, rotation, and translation invariant. Moments φ1–φ6 are reflection invariant as well. The magnitude of φ7 is reflection invariant. However, its sign changes under reflection.
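The chain (3)-(6) feeding the invariants can be sketched for a binary region mask as follows; only the first two invariants, (7) and (8), are shown, and the helper names are ours.

```python
import numpy as np

def hu_moments(img):
    """First two invariant moments (phi1, phi2) of a binary image via (3)-(6)."""
    ys, xs = np.nonzero(img)
    m00 = len(xs)                         # zero-order moment: object area
    xbar, ybar = xs.mean(), ys.mean()     # center of gravity (4)
    def mu(p, q):                         # central moment (5)
        return ((xs - xbar) ** p * (ys - ybar) ** q).sum()
    def eta(p, q):                        # normalized moment (6)
        return mu(p, q) / m00 ** ((p + q) / 2 + 1)
    phi1 = eta(2, 0) + eta(0, 2)                               # (7)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2   # (8)
    return phi1, phi2
```

Translating the region leaves both values unchanged exactly; for a 90-degree rotation, η20 and η02 swap, so (7) and (8) are also unchanged.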

B. Improved Chain-Code Representation of Regions

Another efficient representation of digital curves is the modified Freeman chain code. The standard Freeman chain code has certain drawbacks for correlation between two potentially matched contours, such as being noisy and variant to rotational, scaling, and translational differences. Therefore, the standard chain-code representation is improved by the following four operations, as described in [6].

1) Shift Operation: A modified version of the standard chain code is produced by a


shifting operation defined recursively by

c′_1 = c_1,   c′_i ≡ c_i (mod 8),   with c′_i chosen from {c_i + 8k : k an integer} such that |c′_i − c′_{i−1}| is minimized.     (14)

The shift operation eliminates or reduces the problem of wraparound.

2) Smoothing Operation: Curve smoothing is obtained by a smoothing filter, such as a Gaussian.

3) Normalization Operation: The difference between the means of the corresponding chain-code representations measures the rotational difference between two curves. This rotational difference can be removed by subtracting the means.

4) Resampling Operation: To be scale invariant, the chain code is resampled to the same length as that of its corresponding region.

The improvement in similarity measure gained by introducing these operations can be quantified easily with matching coefficients for a pair of chain codes. For a visual demonstration, a graphical comparison is presented in Fig. 3(a)–(h). The standard chain codes, the shifted chain codes, the shifted and smoothed chain codes, and the modified chain codes are computed by all four operations for a pair of example-matched contours and displayed for comparison. The improvement achieved by these four operations easily can be seen from the plots in terms of the accuracy with the similarity measure of matched contours.
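The four operations can be sketched together as follows; a 3-point circular average stands in for the Gaussian smoothing and linear interpolation stands in for the resampling, and both substitutions (and the function name) are our assumptions.

```python
import numpy as np

def improve_chain_code(code, target_len):
    """Apply the four operations to a Freeman chain code of a closed contour:
    shift (remove wraparound), smooth (circular 3-point average), normalize
    (subtract the mean to remove rotation), resample (common length)."""
    shifted = [float(code[0])]
    for c in code[1:]:
        k = round((shifted[-1] - c) / 8.0)    # integer multiple of 8 that
        shifted.append(c + 8.0 * k)           # minimizes the successive jump
    shifted = np.array(shifted)
    smoothed = (np.roll(shifted, 1) + shifted + np.roll(shifted, -1)) / 3.0
    normalized = smoothed - smoothed.mean()   # removes rotational difference
    old = np.linspace(0.0, 1.0, len(normalized))
    new = np.linspace(0.0, 1.0, target_len)
    return np.interp(new, old, normalized)    # removes scale difference
```

Two codes that describe the same curve at different orientations (their elements differ by a constant) collapse to identical representations, and a 7→0 wraparound jump is replaced by a smooth 7→8 step.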

C. Control-Point Correspondence by Region Matching

Control-point correspondence is performed in two steps. The first step includes region matching in the feature space based on combined criteria of minimum distance of seven invariant moments and maximum chain-code matching coefficients between a pair of regions. In this step, two similarity matrices between detected regions in two images are first computed. The first similarity matrix is the Euclidean distance matrix in seven-dimensional (7-D) invariant-moment space. The second similarity matrix is the chain-code matching matrix. Based on these two similarity matrices, the minimum distance classifier is used to identify the most robust matches. For example, a minimum of three GCP's are required to fit a general affine-bivariate polynomial transformation. The centers of gravity for these three matched regions are used as the GCP's. The second step is performed in image space based on the image registration result of the first step. Given the three most robust GCP's from the result of the first step, the parameters of the first-order bivariate polynomial can be calculated simply. Based on this transformation, we can map the other potential regions in the sensed image into the reference image. The distance between the center of gravity of the mapped region in the sensed image and the actual center of gravity in the reference image is computed. This distance is actually the RMSE at this pair of corresponding points. A threshold easily can be set to avoid false matches. The final image transformation parameters are based on the matched points resulting from the second step of the image match algorithm.
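The image-space step above can be sketched as follows: fit the first-order bivariate polynomial to the initial GCP's in a least-squares sense, then score any further candidate pair by the distance between the mapped and actual centers of gravity. The function names are our assumptions.

```python
import numpy as np

def fit_first_order(src, dst):
    """Least-squares fit of x' = a0 + a1*x + a2*y and y' = b0 + b1*x + b2*y."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    A = np.column_stack([np.ones(len(src)), src[:, 0], src[:, 1]])
    ax, *_ = np.linalg.lstsq(A, dst[:, 0], rcond=None)  # x-coefficients
    ay, *_ = np.linalg.lstsq(A, dst[:, 1], rcond=None)  # y-coefficients
    return ax, ay

def point_rmse(ax, ay, p, q):
    """Distance between the mapped center p and the actual center q."""
    x = ax[0] + ax[1] * p[0] + ax[2] * p[1]
    y = ay[0] + ay[1] * p[0] + ay[2] * p[1]
    return float(np.hypot(x - q[0], y - q[1]))
```

With exactly three GCP's the fit is exact; with more, the least-squares solution spreads the residual, and a threshold on `point_rmse` rejects false matches as the text describes.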

1) Invariant-Moment Distance Matrix: Let φ^r_k = (φ^r_{k,1}, …, φ^r_{k,7}), k = 1, …, M, and φ^s_l = (φ^s_{l,1}, …, φ^s_{l,7}), l = 1, …, N, denote the values of the seven invariant moments of the M regions detected in the reference image and the N regions detected in the sensed image, respectively. For every region k in the reference image and region l in the sensed image, we compute the invariant-moment distance between regions k and l

d(k, l) = [ Σ_{i=1}^{7} (φ^r_{k,i} − φ^s_{l,i})² ]^{1/2}.     (15)

The result is an M × N distance matrix in 7-D feature space, which represents the similarity relationship between the regions in two images. To simplify the search process, the impossible matches between region pairs that have more than a certain amount of difference in their lengths (say 35%) are left out by setting the distance values to one (a special flag for ignoring these region pairs) in the distance matrix. Based on the principle of minimum distance classifier, we conclude that the smaller the distance, the more similar the shapes of two regions.
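Equation (15) together with the 35% length screen can be sketched as follows; the flag value of one follows the text, while the function and parameter names are ours.

```python
import numpy as np

def moment_distance_matrix(phi_ref, phi_sen, len_ref, len_sen,
                           max_len_diff=0.35):
    """Euclidean distance matrix in 7-D invariant-moment space, per (15).

    Pairs whose contour lengths differ by more than max_len_diff are flagged
    with distance 1.0 so the search skips them."""
    phi_ref, phi_sen = np.asarray(phi_ref), np.asarray(phi_sen)
    # pairwise Euclidean distances: result is an M x N matrix
    d = np.sqrt(((phi_ref[:, None, :] - phi_sen[None, :, :]) ** 2).sum(axis=2))
    for k, lr in enumerate(len_ref):
        for l, ls in enumerate(len_sen):
            if abs(lr - ls) / max(lr, ls) > max_len_diff:
                d[k, l] = 1.0   # special flag: ignore this region pair
    return d
```

The length screen prunes obviously impossible pairs before any moment comparison is made, which is where the search-complexity saving comes from.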

2) Chain-Code Matching Matrix: A correlation measure of a region pair proposed in [6] can be used to derive the matching matrix. Let contour A be represented by an n-point shifted and smoothed chain code a = (a_1, …, a_n), and let contour B be represented by an m-point shifted and smoothed chain code b = (b_1, …, b_m). The longer contour (for example, contour B) is resampled to the same length (denote it as n) as the shorter contour (contour A), denoted as b̃. For any two region contours A and B in the reference image and the sensed image, respectively, we can define the following chain-code matching coefficient by sliding the normalized contour a over the normalized and resampled contour b̃, where j ∈ [0, n − 1]:

ρ(A, B) = Max_j { x_j }     (16)

b̃_{i+j} = b̃_{(i+j) mod n}     (17)

x_j = (1/n) Σ_{i=1}^{n} cos[ (π/4)( a_i − b̃_{(i+j) mod n} ) ].     (18)

The chain-code curve of a closed contour is periodic and can be represented by a modulus operation, as shown in (17). The sliding process between contours is achieved by the maximum operation in (16) to find the best fit between the contour pair. The matching measure is similar to the MSE of two signals. The cosine function guarantees the matching measure to be less than one. When ρ = 1, there is a perfect match.

The result is also an M × N matrix in which each element represents another reliable similarity measure between the regions in two images. By the same token, the impossible matches between the region pairs which have more than a certain amount of difference (say 35%) are disabled by setting the matching coefficient value to zero in the matching matrix. We can notice


Fig. 3. Improved chain-code representation of contours: (a) and (b) matched regions, (c) and (d) original chain-code representations, (e) and (f) shifted chain-code representations, (g) and (h) smoothed and shifted chain-code representations, and (i) and (j) resampled and normalized chain-code representations.

that the greater the chain-code matching coefficient, the more the contours resemble each other in shape.
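The length-based pruning described above can be sketched as follows. The function name is hypothetical, and measuring the difference relative to the longer boundary is an assumption; the paper does not spell out the exact ratio used for the 35% cutoff:

```python
import numpy as np

def mask_impossible(rho, ref_lengths, sen_lengths, max_diff=0.35):
    # Zero out entries of the chain-code matching matrix for region pairs
    # whose boundary lengths differ by more than max_diff (35% here).
    rho = np.array(rho, dtype=float)
    for k, lr in enumerate(ref_lengths):
        for l, ls in enumerate(sen_lengths):
            if abs(lr - ls) / max(lr, ls) > max_diff:
                rho[k, l] = 0.0
    return rho
```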

3) The Algorithm of Image Matching: Based on the two similarity matrices defined above, the following matching algorithm, "ImageMatch," is developed.

Algorithm ImageMatch:

Step 1: Based on the output of the contour search and sorting program, the $N_r$ regions detected in the reference image and the $N_s$ regions detected in the sensed image are identified and ordered.

Step 2: Compute two feature matrices: the invariant-moment distance matrix [defined in (15)] and the chain-code matching matrix [defined in (16)]. In the invariant-moment computation, we also build two matrices containing the coordinates of the centers of gravity: an $N_r \times 2$ matrix for the regions in the reference image and an $N_s \times 2$ matrix for the regions in the sensed image.

Step 3: Searching for potential matches is based on the following condition. For any pair of regions $(k, l)$, if $d_{kl} < T_d$ and $\rho_{kl} > T_\rho$, then the pair is accepted as a potential match, where $T_d$ and $T_\rho$ are preset thresholds. Let $P$ denote the set of potential matches detected by the condition above. This step dramatically reduces the search complexity and limits the following search to the matches with the highest probabilities.

Step 4: Working with the reduced potential-match set $P$, minimum distance classification is performed based on the combined criterion in (19), and the three pairs of regions with the minimum values in this classification are identified as the initial matched regions.

Step 5: Based on the three initial matched pairs of regions, their corresponding centers of gravity are used as the GCP's to fit a first-order bivariate polynomial transformation, solving the following two equations in a least-square sense for the transformation parameters $a_0, a_1, a_2$ and $b_0, b_1, b_2$:

$$x' = a_0 + a_1 x + a_2 y, \qquad y' = b_0 + b_1 x + b_2 y \qquad (20)$$

where $(x, y)$ and $(x', y')$ are the coordinates of the corresponding matched points in the reference image and the sensed image, respectively. At the same time, these three pairs of potential GCP's are removed from $P$, and the remainder of the potential matches is denoted as $P'$.

Step 6: Based on the transformed image in the image space and working with $P'$, any additional matches can be detected by examining the following condition:

$$\sqrt{(x_i' - \hat{x}_i')^2 + (y_i' - \hat{y}_i')^2} \le T_E \qquad (21)$$

where the $i$th potentially matched pair has coordinates $(x_i, y_i)$ and $(x_i', y_i')$ in the reference image and the sensed image, respectively,

$$\hat{x}_i' = a_0 + a_1 x_i + a_2 y_i \qquad \text{and} \qquad \hat{y}_i' = b_0 + b_1 x_i + b_2 y_i$$

and $T_E$ is a preset threshold that is actually the maximum RMSE allowed at the GCP's for the final image registration. If this condition is satisfied, the pair of points is considered a GCP pair. If not, this pair of points is removed from the potential matches.

The output from Step 6 of the algorithm described above is a set of matched regions that are considered the final matches between the detected regions in the two images. Their centers of gravity are then used as the ground control points in the estimation of the transformation parameters.
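Steps 5 and 6 of ImageMatch can be sketched as follows. This is a minimal illustration with hypothetical helper names, assuming the first-order polynomial of (20) and the residual test of (21):

```python
import numpy as np

def fit_first_order(ref_pts, sen_pts):
    # Least-square fit of (20): x' = a0 + a1*x + a2*y, y' = b0 + b1*x + b2*y.
    A = np.column_stack([np.ones(len(ref_pts)), np.asarray(ref_pts, float)])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(sen_pts, float), rcond=None)
    return coeffs                    # (3, 2): column 0 -> a's, column 1 -> b's

def refine_matches(init_ref, init_sen, candidates, t_e):
    # Step 5: estimate the transformation from the three initial pairs.
    # Step 6: accept a remaining candidate only if its residual under the
    # fitted transformation is within the RMSE threshold of (21).
    coeffs = fit_first_order(init_ref, init_sen)
    kept = []
    for (x, y), (xp, yp) in candidates:
        xh, yh = np.array([1.0, x, y]) @ coeffs
        if np.hypot(xp - xh, yp - yh) <= t_e:
            kept.append(((x, y), (xp, yp)))
    return coeffs, kept
```

Three non-collinear point pairs determine the six parameters exactly; additional candidate pairs are then kept or discarded purely by their residual against that initial fit.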

IV. EXPERIMENTAL RESULTS

In this section, we provide the experimental results of using the proposed algorithms for automated image registration. In the experiments, two multitemporal images of the coastal plain of North Carolina were used. The reference image (image 1) was taken by Landsat 5 TM in November 1988. The sensed image (image 2, the image to be transformed) was also taken by Landsat 5 TM, but in December 1994. Subscenes of 512 × 512 pixels from the original images were used. Water boundaries are usually defined better in the infrared bands than in the visible bands. Therefore, TM band 5 images were used in the registration experiments, as shown in Fig. 4(a) and (c), respectively.

A. Image Segmentation and Region Extraction

In the image segmentation, the standard deviation of the LoG operator was set to 1.5, and the kernel size was 15 × 15 pixels. The minimum value of ES for edge points was set to ten, and the value of ES to initiate a contour search was set to 40. The minimum region boundary length was set to 40 pixels for a region to qualify for region matching. With these parameters set, 22 regions in the reference image and 29 regions in the sensed image were detected by the contour search and sorting program. These detected features are shown in Fig. 4(b) and (d), respectively. The main region boundaries of interest are clearly recognizable in the stretched gray-scale images. The rest of the feature boundaries, which are short in length and embedded in the image features, are for the most part detected falsely. These features were to be examined for correspondence and removed by the ImageMatch algorithm.
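For reference, a LoG kernel with the parameters used in these experiments can be constructed as follows (a sketch, up to a constant scale factor; the paper's thin and robust zero-crossing refinement is a separate step not shown here):

```python
import numpy as np

def log_kernel(size=15, sigma=1.5):
    # Laplacian-of-Gaussian kernel with the experimental parameters
    # (15 x 15 window, sigma = 1.5), up to a constant scale factor.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x * x + y * y
    s2 = sigma * sigma
    k = (r2 - 2.0 * s2) / (s2 * s2) * np.exp(-r2 / (2.0 * s2))
    return k - k.mean()   # force zero mean so flat areas give no response
```

Subtracting the mean enforces a zero-sum kernel over the finite window, so constant-intensity regions produce exactly zero response and zero crossings occur only at intensity changes.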

B. Feature Extraction and Control-Point Correspondence

From the detected regions in the two images, we constructed the invariant-moment distance matrix and the chain-code matching matrix. Step 3 in the ImageMatch algorithm identified 21 potential matches out of the 22 × 29 possibilities, where the invariant-moment threshold was set to 0.1 and the chain-code matching threshold was set to 0.8. Three pairs of GCP's with minimum distance and highest correlation were detected and used initially to estimate the transformation parameters. In the transformed image space, four additional matches were detected by Step 6 of the ImageMatch algorithm. The centers of gravity of the corresponding regions were used as control


Fig. 4. Test windows and image segmentation results: (a) reference image: TM band 5, 1988, (b) detected regions from the reference image, (c) sensed image: TM band 5, 1994, and (d) detected regions from the sensed image.

TABLE I: COORDINATES OF THE DETECTED CONTROL POINTS

points. The complete matching results determined by the image-matching algorithm are summarized in Table I.
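The candidate search of Step 3, with the thresholds used in these experiments (0.1 for the invariant-moment distance, 0.8 for the chain-code matching coefficient), can be sketched as follows (hypothetical function name):

```python
import numpy as np

def potential_matches(d, rho, t_d=0.1, t_rho=0.8):
    # A pair (k, l) survives only if its invariant-moment distance is
    # below t_d and its chain-code matching coefficient is above t_rho.
    mask = (np.asarray(d, float) < t_d) & (np.asarray(rho, float) > t_rho)
    ks, ls = np.nonzero(mask)
    return list(zip(ks.tolist(), ls.tolist()))
```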

C. Estimation of Registration Parameters

Given a number of corresponding ground-control points from two images, the transformation parameters can be estimated for the images. The registration of images with rotational, translational, scaling, and image-skew differences can be approximated by the following general affine relationship [16]:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix} \qquad (22)$$

where $(x, y)$ and $(x', y')$ are two corresponding points in the reference image and the sensed image, respectively, and $m_{11}, m_{12}, m_{21}, m_{22}, t_x, t_y$ are the transformation parameters that express the scaling, rotational, translational, and skew differences between the two images. The affine mapping function is appropriate for data with flat topography. For some data with nonlinear, local geometric distortions (such as hilly areas with terrain changes), more complex transformation functions may be needed to produce better interpolation results. These mapping functions include thin-plate spline interpolation [26], [27] and the adaptive mapping function [28]. For general affine


TABLE II: ROOT MEAN-SQUARE ERRORS AT CONTROL POINTS (IN PIXELS)

transformation, a least-square approach is used to determine the six transformation parameters $\{m_{11}, m_{12}, m_{21}, m_{22}, t_x, t_y\}$ from the matched control points:

$$P = (A^T A)^{-1} A^T B \qquad (23)$$

where

$$A = \begin{bmatrix} x_1 & y_1 & 1 \\ \vdots & \vdots & \vdots \\ x_n & y_n & 1 \end{bmatrix} \qquad (24)$$

and

$$B = \begin{bmatrix} x_1' & y_1' \\ \vdots & \vdots \\ x_n' & y_n' \end{bmatrix} \qquad (25)$$

with $P$ the $3 \times 2$ matrix of transformation parameters.
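The least-square solution of (23) translates directly into code (a sketch; rows of reference coordinates form A, the corresponding sensed-image coordinates form B):

```python
import numpy as np

def solve_affine(ref_pts, sen_pts):
    # Solve P = (A^T A)^{-1} A^T B for the six affine parameters of (22),
    # with A holding rows [x, y, 1] from the reference image and B the
    # corresponding sensed-image coordinates.
    A = np.column_stack([np.asarray(ref_pts, float), np.ones(len(ref_pts))])
    B = np.asarray(sen_pts, float)
    return np.linalg.solve(A.T @ A, A.T @ B)   # (3, 2) parameter matrix
```

With more than three control points the system is overdetermined, and the normal equations yield the parameter set that minimizes the summed squared residuals at the GCP's.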

The RMSE at the GCP's is commonly used to evaluate the performance of the least-square fitting and the accuracy of the image registration; it is usually defined as shown in (26).
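The RMSE of (26) at the control points is then straightforward to compute (a sketch; `transform` is any callable mapping a reference point to sensed-image coordinates):

```python
import numpy as np

def control_point_rmse(ref_pts, sen_pts, transform):
    # Root mean-square residual between the sensed-image control points
    # and the transformed reference-image control points, as in (26).
    pred = np.array([transform(x, y) for x, y in ref_pts])
    err = np.asarray(sen_pts, float) - pred
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))
```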

D. Image Resampling and Performance Comparison

Since the determination of the parameters of the general affine transformation (22) requires knowledge of only three control points, seven point pairs were sufficient. Given a number of corresponding control points in two images, the transformation parameters can be estimated using the least-square technique in (23), and the transformation equations were obtained accordingly.

Knowing the mapping function, the sensed image was transformed and resampled using cubic convolution interpolation. The resampled image is shown in Fig. 5(a). The resampled image was then mosaicked with the reference image, as shown in Fig. 5(b). The registration accuracy was estimated using (26) at every control point, as shown in Table II.

To do a fair comparison with the result of manual registration, we visually located and digitized the seven best control points using a magnifier on the standard false-color composite image (TM bands 4, 3, 2) displayed on a 24-bit screen. These seven points were then used in the estimation of the image transformation parameters. The RMSE in both the $x$ and $y$ directions is below 0.7 pixel, which is about the best accuracy that the manual registration method can achieve. It may seem that manual registration results can be improved by changing control points or by other means, but in practice this improvement or refinement is limited. The limiting factors mainly include the inaccuracy of reference data for image-to-map registration and the inevitable positional error or deviation during visual correspondence. These factors are the sources of systematic error in the manual registration method. The manual registration result was then compared to that of the proposed automated algorithm. The result of this comparison is shown in Table III. The results show that the proposed automated algorithm performs better than manual registration, with an improvement in registration accuracy of more than half a pixel on average.

V. CONCLUSIONS

This paper has dealt with the automatic registration of remotely sensed imagery. This procedure appears often in the processing of multitemporal and/or multisensor satellite image analyses, such as multisource data fusion, multitemporal change detection, and image mosaicking. In this work, we explored the critical elements of an automated image registration system using Landsat TM imagery. These elements included image segmentation, control-point selection and correspondence, and transformation parameter estimation. Attention has been paid to region-feature extraction and automated control-point correspondence. In image segmentation, we developed a method called thin and robust zero-crossing based on the conventional LoG operator, and we improved the performance of the LoG zero-crossing edge detector by selectively picking zero-crossing points and defining an ES array. The images (both the reference image and the sensed image) were then segmented by the thin and robust zero-crossing edge-extraction technique, and the regions with closed boundaries were extracted. Each region was then represented by modified chain codes and invariant moments, which are affine-invariant descriptors describing the shape of a region. The correspondence

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[ (x_i' - \hat{x}_i')^2 + (y_i' - \hat{y}_i')^2 \right]} \qquad (26)$$


Fig. 5. (a) Registered image, (b) mosaic of the two images after registration, and (c) overlapped area between the two images after registration, which is ready for further processing (such as digital change detection).

between regions was established using the combined criteria of invariant-moment distance and improved chain-code matching. Region matching was first implemented in the feature space and was then implemented sequentially in the image space.

TABLE III: COMPARISON BETWEEN MANUAL REGISTRATION AND THE PROPOSED ALGORITHM

In the feature space, minimum distance classification was used to identify the most robust control points for the initial image transformation. In the image space, region-to-region correspondence was established by the RMSE rule. After the correspondence between the regions had been established, the centers of gravity of the corresponding regions were used as control points. The parameters of the transformation were computed by the least-square rule based on the general affine transformation.

The performance of the proposed algorithm has been demonstrated by registering two multitemporal Landsat TM images taken in different years. Registration accuracy of less than one-third of a pixel has been achieved in the experiments. The proposed automated algorithm outperforms manual registration by over half a pixel on average (in terms of the RMSE at the GCP's). In summary, the technique of automated image registration developed in this work is potentially powerful in terms of its registration accuracy, its degree of automation, and its significant value in an operational context. The approach is also robust, since it overcomes the difficulties of control-point correspondence caused by the problem of feature inconsistency. It has numerous applications in automated change detection, image fusion, motion detection, and stereo matching.

ACKNOWLEDGMENT

The authors would like to thank W. E. Snyder and G. L. Bilbro of the Department of Electrical and Computer Engineering and J. Lu of the Department of Biological and Agricultural Engineering, North Carolina State University, Raleigh, for the stimulating discussion and generous help in computer coding. The authors are also thankful to the anonymous referees for their critical review and constructive comments.

REFERENCES

[1] X. Dai and S. Khorram, "A hierarchical methodology framework for multisource data fusion in vegetation classification," Int. J. Remote Sensing, vol. 19, pp. 3697-3701, Dec. 1998.

[2] ——, "The effects of image misregistration on the accuracy of remotely sensed change detection," IEEE Trans. Geosci. Remote Sensing, vol. 36, pp. 1566-1577, Sept. 1998.

[3] M. S. Tanaka, S. Tamura, and K. Tanaka, "On assembling subpictures into a mosaic picture," IEEE Trans. Syst., Man, Cybern., vol. SMC-7, pp. 42-48, Mar. 1977.

[4] L. M. G. Fonseca and B. S. Manjunath, "Registration techniques for multisensor remotely sensed imagery," Photogramm. Eng. Remote Sensing, vol. 62, pp. 1049-1056, Sept. 1996.

[5] X. Dai and S. Khorram, "Development of a feature-based approach to automated image registration of multisensor and multitemporal remotely sensed imagery," in Proc. 1997 IEEE Int. Geosci. Remote Sensing Symp. (IGARSS'97), Singapore, Aug. 1997, pp. 243-245.

[6] H. Li, B. S. Manjunath, and S. K. Mitra, "A contour-based approach to multisensor image registration," IEEE Trans. Image Processing, vol. 4, pp. 320-334, Mar. 1995.

[7] L. G. Brown, "A survey of image registration techniques," Comput. Surv., vol. 24, no. 4, pp. 325-376, 1992.

[8] A. V. Cideciyan, S. G. Jacobson, C. M. Kemp, R. W. Knighton, and J. H. Nagel, "Registration of high resolution images of the retina," in Proc. SPIE, Medical Imaging VI: Image Processing, Newport Beach, CA, Feb. 1992, vol. 1652, pp. 310-322.

[9] Q. Zheng and R. Chellappa, "A computational vision approach to image registration," IEEE Trans. Image Processing, vol. 2, pp. 311-326, July 1993.

[10] E. J. M. Rignot, R. Kwok, J. C. Curlander, and S. S. Pang, "Automated multisensor registration: Requirements and techniques," Photogramm. Eng. Remote Sensing, vol. 57, pp. 1029-1038, Aug. 1991.

[11] G. O. Mason and K. W. Wong, "Image alignment by line triples," Photogramm. Eng. Remote Sensing, vol. 58, pp. 1329-1334, Sept. 1992.

[12] T. Schenk, J. Li, and C. Toth, "Toward an autonomous system for orienting digital stereopairs," Photogramm. Eng. Remote Sensing, vol. 57, pp. 1057-1064, Aug. 1991.

[13] J. S. Greenfeld, "An operator-based matching system," Photogramm. Eng. Remote Sensing, vol. 57, pp. 1049-1055, Aug. 1991.

[14] A. Goshtasby, G. C. Stockman, and C. V. Page, "A region-based approach with subpixel accuracy," IEEE Trans. Geosci. Remote Sensing, vol. GE-24, pp. 390-399, May 1986.

[15] A. D. Ventura, A. Rampini, and R. Schettini, "Image registration by recognition of corresponding structures," IEEE Trans. Geosci. Remote Sensing, vol. 28, pp. 305-314, May 1990.

[16] J. Ton and A. K. Jain, "Registering Landsat images by point matching," IEEE Trans. Geosci. Remote Sensing, vol. 27, pp. 642-651, Sept. 1989.

[17] J. Flusser and T. Suk, "A moment-based approach to registration of images with affine geometric distortion," IEEE Trans. Geosci. Remote Sensing, vol. 32, pp. 382-387, May 1994.

[18] A. Goshtasby, "Registration of images with geometric distortions," IEEE Trans. Geosci. Remote Sensing, vol. 26, pp. 60-64, Jan. 1988.

[19] H. H. Li and Y. T. Zhou, "Automatic visual/IR image registration," Opt. Eng., vol. 35, pp. 391-400, Feb. 1996.

[20] B. S. Reddy and B. N. Chatterji, "An FFT-based technique for translation, rotation, and scale-invariant image registration," IEEE Trans. Image Processing, vol. 5, pp. 1266-1271, Aug. 1996.

[21] J. Le Moigne, "Parallel registration of multi-sensor remotely sensed imagery using wavelet coefficients," in Proc. SPIE: Aerosense Wavelet Conf., Orlando, FL, Apr. 1994, vol. 2242, pp. 432-443.

[22] J. P. Djamdji, A. Bijaoui, and R. Maniere, "Geometrical registration of images: The multiresolution approach," Photogramm. Eng. Remote Sensing, vol. 59, no. 5, pp. 645-653, 1993.

[23] R. M. Haralick and L. G. Shapiro, Computer and Robot Vision, vol. 1. Reading, MA: Addison-Wesley, 1993.

[24] A. K. Jain, Fundamentals of Digital Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1989.

[25] M. K. Hu, "Visual pattern recognition by moment invariants," IEEE Trans. Inform. Theory, vol. IT-8, no. 2, pp. 179-187, 1962.

[26] F. L. Bookstein, "Principal warps: Thin-plate splines and the decomposition of deformations," IEEE Trans. Pattern Anal. Machine Intell., vol. 11, pp. 567-585, June 1989.

[27] I. Barrodale, D. Skea, M. Berkley, R. Kuwahara, and R. Poeckert, "Warping digital images using thin plate splines," Pattern Recognit., vol. 26, no. 2, pp. 375-376, 1993.

[28] J. Flusser, "An adaptive method for image registration," Pattern Recognit., vol. 25, no. 1, pp. 45-54, 1992.

Xiaolong Dai (S'95-M'97) received the B.S.E.E. degree in electrical engineering from the Department of Electronics and Information Engineering, Huazhong University of Science and Technology (HUST), Wuhan, China, in 1986, the M.S. degree in systems ecology under a joint graduate program from the Chinese Academy of Forestry and Tsinghua University, Beijing, China, in 1989, and the Ph.D. degree in both electrical engineering and forestry from North Carolina State University (NCSU), Raleigh, in 1997.

He was with the Chinese Academy of Sciences from 1989 to 1992, where he worked as an Engineer and Assistant Director in the Laboratory of Systems Ecology, Beijing, China. He has been affiliated with the Center for Earth Observation (CEO) (formerly the Computer Graphics Center), NCSU, since 1992. He is currently a Senior Research Scientist with CEO. Over the past decade, he has been involved in conducting research in digital image processing and machine vision as applied to remotely sensed data and medical imagery. He has published one book in human ecology and more than 35 articles in refereed journals and international conference proceedings in remote sensing, medical imaging, and image analysis. His interests span a wide gamut of interdisciplinary and multidisciplinary fields in science and technology, from electronic engineering to geosciences and information technology. His primary research interests include the theory, technology, and applications of information extraction from imagery, including automated image analysis systems, digital image processing, pattern analysis and machine intelligence, geographic information systems, data visualization and virtual reality, and multisource data fusion.

Siamak Khorram (M'84) received the M.S. degree in engineering and the M.S. degree in ecology from the University of California, Davis (UCD). In 1975, he received the Ph.D. degree with emphasis on remote sensing and image processing under a joint study program from the University of California, Berkeley, and from the Water Science and Engineering Department, UCD.

Since 1975, he has been involved in teaching and conducting research in remote sensing, geoinformatics, environmental science and engineering, the development of computer-based information technologies, and systems integration. In 1982, he established the Computer Graphics Center at North Carolina State University (NCSU), Raleigh, as a university-wide organized research center involved in conducting and facilitating research and training in remote sensing and geoinformatics. In 1995 and 1996, he served as the Dean and Vice President for Academic Programs, International Space University (ISU), Strasbourg, France. Currently, he serves on the Board of Trustees of ISU and is a founding member of the Peaceful Uses of Space for All Humanity, Switzerland. In 1997, he established the Center for Earth Observation and serves as the Professor and the Director of this Center. He has taught courses in air photo interpretation and photogrammetry, environmental remote sensing, digital image processing, and information systems. He has authored more than 130 technical publications.