

Image Morphing in Frequency Domain

M. Shahid Farid · Arif Mahmood


Abstract Image morphing is often used in the film and television industry to create synthetic visual effects by smoothly transforming one object into another. Several image morphing techniques based upon the spatial representation of images have been proposed. Simple spatial techniques, for example cross-dissolve, lack a smooth transformation, while better quality techniques, like mesh warping or field warping, have significant computational complexity. In this paper we present a simple but good quality image morphing technique based upon the frequency domain representation of images. Transformation from a source image to a target image takes place by mixing low frequencies of the source image and high frequencies of the target image in varying proportions. The proposed technique has been applied to a wide variety of images. The resulting image sequences are better in visual quality and faster to generate.

Keywords Image Morphing · Metamorphosis · Frequency Domain Morphing · Image Transformation · Warping

This work was partially supported by a research grant from University of the Punjab, Lahore. We are thankful to the VC of the University of the Punjab, Prof. Dr. Mujahid Kamran, for approving this research grant to carry out this research.

M. Shahid Farid
Punjab University College of Information Technology
University of the Punjab, Lahore - 54000, Pakistan
Tel.: +92-42-99212506
Fax: +92-42-9212505
E-mail: [email protected]

Arif Mahmood
Punjab University College of Information Technology
University of the Punjab, Lahore - 54000, Pakistan
E-mail: [email protected]

1 Introduction

Image morphing is an important area due to its applications in the creation of special visual effects for entertainment in the film and television industry [39,32], in education and in the field of medicine [9]. These effects are generated by smooth transformation of the objects and the colors of one image to the objects and the colors of another image. For this purpose, several spatial domain image morphing techniques have been proposed [18,22,30,31,25,44]. Some of these techniques are computationally simple, for example the cross-dissolve technique [39]; however, these simple techniques do not produce good quality results. Other techniques, like mesh warping [22], field warping [18] and multilevel free form deformation [25], are better in quality but exhibit significant computational complexity. In this paper we propose a new morphing technique which is computationally simple and yields good quality results.

The image morphing problem may be considered as the transformation of a source image into a target image by generating a sequence of intermediate images. If these images are frames of a video sequence, the source image may be considered to be ‘fading out’ and the target image to be ‘fading in’ with the passage of time. Each frame in the intermediate sequence contains information from both the source and the target images. Therefore areas in the intermediate frames where the source and the target images are misaligned may contain a ‘ghost-effect’, which is undesirable and should be properly handled. Simple image morphing techniques like cross-dissolve do not handle the ghost-effect properly, while the more complex techniques, like mesh and field warping, handle this artifact by introducing an additional step that aligns features in the source and the target images.

Image morphing techniques which effectively handle the ghost-effect may be organized into three basic steps: (1) feature specification, (2) warp generation and (3) transition control. In step (1), perceptually similar point correspondences are marked across the source and the target images. The two images may have very different contents; automatic detection of perceptually similar points is therefore a difficult process, and these correspondences have generally been marked manually in image morphing techniques. In the field morphing approach, corresponding lines are specified in the source and the target images instead of point correspondences. In mesh morphing, correspondences are specified in the form of feature polygons.

In step (2), a geometric transformation model is computed by using the feature correspondences marked in step (1). That model is then used to geometrically align the target image to the source image in order to minimize the ghost-effect. If objects in the source and the target images are significantly different in size, then a sequence of geometric models may be required to generate a sequence of transformed target frames. Information from each of the transformed target frames is then mixed with the source image to generate an intermediate frame. This mixing of information is controlled through a process known as ‘transition control’.

In the intermediate frames, the source image information is gradually reduced and the target image information is gradually increased. This variation of the source and the target image information is referred to as the ‘transition control’. A proper transition control function ensures a visually smooth transformation of the source image into the target image.

Our proposed image morphing technique is based upon the frequency domain representation of the source and the target frames. We observe that in the frequency domain representation, high frequencies capture local details while low frequencies capture the global structure of the image. We exploit this fact to create a smooth transformation of the source image into the target image by blending low frequencies of the source image with the high frequencies of the target image. Our transition-control process consists of a set of low pass and high pass filters with gradually varying cutoff frequencies. To generate a particular intermediate frame, the source image is low passed, the target image is high passed, and the filtered image frequencies are blended together.

The proposed algorithm has been implemented and tested on different types of target and source images. The experimental results demonstrate that the proposed algorithm produces results better in visual quality and faster in running time than other morphing techniques.

2 Related Work

Several morphing techniques have been introduced in the spatial domain in the last couple of decades. Cross-dissolve is the most fundamental way to morph two images [38]. In this approach the source image fades out with the passage of time while the target image fades in. In simple words, each pixel in the source image gradually transforms to the corresponding pixel of the target image. The problem with the cross-dissolve method is the double exposure in the misaligned regions. This effect is referred to as ‘ghost’ and is particularly apparent in the middle frames.
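As a point of reference for the rest of the paper, cross-dissolve can be sketched as a per-pixel linear blend. The function name `cross_dissolve` and the list-of-rows image representation are assumptions made for this illustration only; the blend weight `alpha` plays the role of normalized time.

```python
def cross_dissolve(source, target, alpha):
    """Blend two equally sized grayscale images given as lists of rows.

    alpha = 0 returns the source, alpha = 1 returns the target
    (hypothetical helper, not taken from the paper).
    """
    return [
        [(1.0 - alpha) * s + alpha * t for s, t in zip(src_row, tgt_row)]
        for src_row, tgt_row in zip(source, target)
    ]


source = [[0.0, 100.0], [200.0, 50.0]]
target = [[100.0, 100.0], [0.0, 150.0]]

# The middle frame mixes both images equally, which is where
# misaligned features show up as double exposure.
middle = cross_dissolve(source, target, 0.5)
```

Because every pixel is blended independently of its neighbours, nothing in this scheme compensates for misalignment between the two images.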

Fig. 1 ‘a’ source image, ‘d’ target image, ‘b’ and ‘c’ are two intermediate frames.

Mesh warping, or mesh morphing, was pioneered at Industrial Light & Magic (ILM) by Douglas Smythe for use in the movie Willow in 1988 [22]. Mesh warping is organized in three steps: feature specification, warp generation and transition control. Matching features of the source and target images were specified to align the images, which was referred to as warping or warp generation. In transition control, selected control points of the source image are mapped to those of the target image over time. In mesh warping, ‘ghost’ was removed by performing transition control locally rather than globally. Corresponding features in the two images were specified in the form of meshes. The points of the source image were mapped over time to coincide with the points in the target image, together with a color transition.

Meshes may be a convenient way to specify pairs of feature points; however, sometimes meshes turn out to be cumbersome to use. The field morphing approach developed by Beier and Neely [18] simplified the feature specification by means of line pairs. A pair of corresponding lines in the source and the target images was used to define a coordinate mapping between the source and the target images. The pixels in the target image were then warped with respect to their distance from the feature lines. Since multiple lines were usually specified, the displacement of a point in the source image was a weighted sum of the mappings due to each line pair, with the weights determined by the distance and the line length.

An important part of any morphing algorithm is the alignment of the two images, which has often been achieved by defining a mapping function between them. Feature points have often been marked using correspondences, lines, meshes and snakes [42,19]. An image alignment technique based on lines is described in [36]. A line segment based morphing algorithm is given in [18]. Wolfgang Kruger described a line segment based image alignment technique in [20] to compute the mapping function between line segments of a map and line segments extracted from an image. A frequency domain image registration technique is presented in [21]; the entire registration process is done in the frequency domain. Polyaffine, an image registration framework that warps images with a small number of degrees of freedom (DOF), is presented in [5]. A 3D real time registration technique is given by S. Y. Park et al. in [28]. The technique is point-to-point based and exploits the power of the graphics processing unit (GPU) for image registration. The technique uses point pairs to define the line segments, and a mapping function is defined for registration. There are several other techniques [6,11,15,13,29] to align digital images.

Ruprecht and Muller constructed a new mapping function for interpolation that was a combination of several basis functions [30]. This approach was named Radial Basis Functions or Thin Splines. A level-set approach to image blending [37] generates the intermediate images by successively minimizing the difference matrix of the source and the target image. S. Lee et al. [24] extended the idea of morphing two reference images to n reference images. They formulated the (n − 1)-dimensional simplex model of n images, which is a polyhedron with n vertices. Any coordinate in this polyhedron represents a morphed image, and the barycentric coordinates of that point determine the weights used to blend the n images. In the skeleton-driven approach [7], a feature graph is built from skeletons of the source and the target images and then intermediate skeletons are constructed.

In the layer based morphing technique [12], objects are segmented into regions in the form of separate layers. Morphing is done between specific object segments. In this technique, the ghost effect is locally minimized. Image morphing using piecewise linear curves is proposed in [43]. In this approach, two images are represented by piecewise linear curves and a hierarchical approach is then used to find the corresponding feature transformation. Ju Young Kang and B. S. Lee applied morphing to hull form generation [17]. Gotsman and Surazhsky [2] extended morphing to polygons such that the intermediate polygons are simple and intersection free. Image morphing is used in [3] to generate intermediate images between the top and bottom slices of coronal loops. A morphing technique in which objects are represented by contours has been described in [1]. In this technique, intermediate frames are generated by using a physics based formulation. Image morphing using the elastic body spline is developed in [4]. There are several other morphing techniques, for example energy minimization [31], multilevel free form deformation [25] and morphing based on optimal mass preserving mapping [44,23,16].

View morphing is an extension of image morphing in which a view is constructed from two different views. It was introduced in [35] and correctly handles 3D projective camera and scene transformations. Manning and Dyer [26] extended this approach to dynamic scenes. Xiao and Shah [40] further extended their work to generate video from three uncalibrated images without using a 3D model. Tri-view morphing [41] is a recent approach used to create 3D scenes using multiple image morphing. It uses the trifocal tensor to generate the warping transforms among three views. Often the objects to morph have different poses or views; in such cases, simple morphing techniques do not create pleasing effects. Seitz and Dyer [34,33] showed that from two different viewpoints of an object, any view along the line connecting these viewpoints can be created using morphing.

Most of the existing techniques have been implemented in the spatial domain. In this work, we propose a morphing technique based on the frequency domain. Our work is motivated by the concept of Hybrid Images [27]. In the hybrid image technique, a third image is generated by mixing high and low frequencies of two different images; this image may produce different perceptions as the viewing distance changes. The hybrid image technique is not an image morphing technique. In our work, we have extended the same idea to the application of image morphing. To the best of our knowledge, no such morphing algorithm has been proposed before. Our complete algorithm is discussed in the following section.

3 Image Morphing in Frequency Domain

Numerous spatial domain image morphing techniques were discussed in the previous section. In this section, we describe our proposed technique, which we have named Image Morphing in Frequency Domain (IMFD). Our proposed technique is based on two steps: image alignment and transition control. Each of these steps is explained in the following subsections.


3.1 Image Alignment

In order to minimize the ghost-effect, the target image should be geometrically aligned with the source image. For this purpose, in our proposed algorithm, the animator has to manually specify the perceptually similar feature correspondences in the form of points or lines. We observe that for images with little fine detail, features may be specified as points, whereas for non-regular, complex objects line segments produce better results. In the following subsections, the alignment techniques used in our proposed morphing algorithm are described.

3.1.1 Point based feature alignment

In point based feature alignment, the animator selects perceptually similar feature point correspondences in the source and the target images. The minimum number of correspondences depends upon the number of parameters in the assumed geometric model between the source and the target images. For an affine transformation model at least three correspondences are required, and for the projective model at least four correspondences are required. Specifying a larger number of correspondences than the minimum required reduces the effect of point click errors when the model is fitted with the least squared error approach. Figure 2 shows six feature point correspondences specified in the source and the target images.

Fig. 2 Six perceptually similar point correspondences are manually marked in the source image (left) and the target image (right). The dotted line shows the order of clicking the feature points.

Once the feature correspondences are specified, the geometric model that defines the spatial relationship between the source and the target images may be computed. We empirically observed that the projective transformation model can sufficiently align the target image with the source image. Each correspondence has two points, {(x, y), (x′, y′)}: the point (x, y) is in the source image and the point (x′, y′) is in the target image. For one correspondence, the projective transformation is given by:

$$
x' = \frac{a_1 x + a_2 y + b_1}{c_1 x + c_2 y + 1}, \tag{1}
$$

$$
y' = \frac{a_3 x + a_4 y + b_2}{c_1 x + c_2 y + 1} \tag{2}
$$

where a1, a2, a3, a4, b1, b2 and c1, c2 are the projective parameters. These equations may also be written in matrix form:

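$$
\begin{bmatrix}
x & y & 1 & 0 & 0 & 0 & -x x' & -y x' \\
0 & 0 & 0 & x & y & 1 & -x y' & -y y'
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ b_1 \\ a_3 \\ a_4 \\ b_2 \\ c_1 \\ c_2 \end{bmatrix}
=
\begin{bmatrix} x' \\ y' \end{bmatrix}. \tag{3}
$$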
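The relation between Equations (1)-(2) and the matrix form (3) can be checked numerically. The sketch below is illustrative only: the helper names and the parameter values are assumptions, and the parameter order follows the column vector of Equation (3).

```python
def apply_projective(params, x, y):
    """Map (x, y) to (x', y') with Equations (1) and (2)."""
    a1, a2, b1, a3, a4, b2, c1, c2 = params
    w = c1 * x + c2 * y + 1.0  # shared denominator
    return (a1 * x + a2 * y + b1) / w, (a3 * x + a4 * y + b2) / w


def correspondence_rows(x, y, xp, yp):
    """Two rows of the matrix in Equation (3) for one correspondence."""
    return [
        [x, y, 1.0, 0.0, 0.0, 0.0, -x * xp, -y * xp],
        [0.0, 0.0, 0.0, x, y, 1.0, -x * yp, -y * yp],
    ]


# Arbitrary example parameters (a1, a2, b1, a3, a4, b2, c1, c2).
params = [1.2, 0.1, 5.0, -0.2, 0.9, 3.0, 0.001, 0.002]
x, y = 10.0, 20.0
xp, yp = apply_projective(params, x, y)

# Multiplying the two rows by the parameter vector should give (x', y').
rows = correspondence_rows(x, y, xp, yp)
lhs = [sum(r * p for r, p in zip(row, params)) for row in rows]
```

Here `lhs` agrees with `(xp, yp)` up to rounding, which is exactly the statement of Equation (3): multiplying Equations (1) and (2) by their common denominator makes them linear in the eight parameters.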

Since there are eight projective parameters and one correspondence yields two equations, we need at least four correspondences to solve for the eight parameters:

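$$
\begin{bmatrix}
x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1 x'_1 & -y_1 x'_1 \\
0 & 0 & 0 & x_1 & y_1 & 1 & -x_1 y'_1 & -y_1 y'_1 \\
x_2 & y_2 & 1 & 0 & 0 & 0 & -x_2 x'_2 & -y_2 x'_2 \\
0 & 0 & 0 & x_2 & y_2 & 1 & -x_2 y'_2 & -y_2 y'_2 \\
x_3 & y_3 & 1 & 0 & 0 & 0 & -x_3 x'_3 & -y_3 x'_3 \\
0 & 0 & 0 & x_3 & y_3 & 1 & -x_3 y'_3 & -y_3 y'_3 \\
x_4 & y_4 & 1 & 0 & 0 & 0 & -x_4 x'_4 & -y_4 x'_4 \\
0 & 0 & 0 & x_4 & y_4 & 1 & -x_4 y'_4 & -y_4 y'_4
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ b_1 \\ a_3 \\ a_4 \\ b_2 \\ c_1 \\ c_2 \end{bmatrix}
=
\begin{bmatrix} x'_1 \\ y'_1 \\ x'_2 \\ y'_2 \\ x'_3 \\ y'_3 \\ x'_4 \\ y'_4 \end{bmatrix}. \tag{4}
$$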

Equation (4) may be written as:

$$
XP = Y, \tag{5}
$$

where X is an (8 × 8) matrix, P is (8 × 1) and Y is also (8 × 1). P is the geometric model that defines the spatial relationship between the source and target images. For 4 correspondences, X will be non-singular if no three of these 4 points lie on a straight line. In this case:

$$
P = X^{-1} Y. \tag{6}
$$

If the number of feature points, M, is greater than 4, X becomes (2M × 8) and Y becomes (2M × 1). As X⁻¹ does not exist for non-square matrices, the system of linear equations may be solved using the pseudo-inverse technique:

$$
P = (X^{t} X)^{-1} (X^{t} Y), \tag{7}
$$


which yields the same results as the least squared error model fitting technique. The transformation P is applied to the source image so that it gets aligned with the target image. As many features as practical should be used for better alignment. Figure 3 shows the two images of figure 2 after feature alignment; the selected feature points are aligned.
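The fitting step of Equations (4)-(7) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names, the synthetic point set and the use of Gaussian elimination to solve the normal equations are choices made here, not details given in the paper.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_projective(src_pts, tgt_pts):
    """Least squares estimate of P via the normal equations of Eq. (7)."""
    rows, rhs = [], []
    for (x, y), (xp, yp) in zip(src_pts, tgt_pts):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -x * xp, -y * xp])
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -x * yp, -y * yp])
        rhs.extend([xp, yp])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
    xty = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(8)]
    return solve_linear(xtx, xty)


def apply_projective(params, x, y):
    """Equations (1) and (2)."""
    a1, a2, b1, a3, a4, b2, c1, c2 = params
    w = c1 * x + c2 * y + 1.0
    return (a1 * x + a2 * y + b1) / w, (a3 * x + a4 * y + b2) / w


# Synthetic check: six source points mapped by a known model (made-up values).
true_params = [1.1, 0.2, 0.05, -0.1, 0.9, 0.1, 0.05, 0.02]
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.2), (0.3, 0.8)]
tgt = [apply_projective(true_params, x, y) for x, y in src]
estimated = fit_projective(src, tgt)
```

With noise-free correspondences the estimate should match `true_params` up to rounding. The coordinates are kept in the unit square here because forming XᵗX squares the condition number of X; for pixel coordinates a normalization step or a library least squares routine would likely be preferable.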

Fig. 3 Source and target images, shown in figure 2, after alignment.

Using points to specify the feature correspondences between the source and the target image is easy and computationally faster than other alignment techniques. However, we identify three main limitations of the point based alignment technique.

1. Local versus global alignment: If the two images to be aligned do not follow one global transformation model, a local transformation method should be preferred. The line segment based feature alignment technique is a local alignment technique and may therefore be preferred in such cases.

2. Time consuming: The quality of alignment is directly affected by the number of feature points. To obtain a better alignment, the animator may have to specify a large number of points, which is a tedious and time consuming process.

3. Distorted image: A small inaccuracy while marking feature points may result in a distorted image instead of an aligned image.

Therefore, when the images contain complex 3D objects like faces, global geometric transformation models, like affine and projective, may not properly align the two images. In such cases, we may use line segments to specify the feature correspondences.

3.1.2 Line segment based feature alignment

In line segment based feature alignment, the animator marks lines across a feature in the two images. A line covers a large feature length (that may otherwise require a large number of points), and it is easier to mark a few lines than a large number of feature points. Each correspondence in this case consists of two lines, each defined by two pixels. Let AB = {(x1, y1), (x2, y2)} be a feature line segment in the target image and A′B′ = {(x′1, y′1), (x′2, y′2)} be the corresponding feature line in the source image. Let P = (x, y) be a pixel in the target image and P′ = (x′, y′) be the new coordinates of pixel P in the source image, computed by the approach outlined in [18]. Let u be the position of the point P along the line AB (figure 4); it may be computed as:

$$
u = \frac{(P - A) \cdot (B - A)}{|B - A|^2} \tag{8}
$$

where · is the dot product of the vectors (P − A) and (B − A), and |B − A| is the magnitude of the vector B − A. As P moves from A to B, the value of u changes from 0 to 1. The value of u is less than 0 or greater than 1 if point P is outside the line AB, that is:

$$
u
\begin{cases}
\in [0, 1] & \text{if } A \le P \le B \\
< 0 & \text{if } A > P \\
> 1 & \text{if } B < P
\end{cases}
$$

v is the signed perpendicular distance, in pixels, of P from the line AB:

$$
v = \frac{(P - A) \cdot (B - A)^{\perp}}{|B - A|} \tag{9}
$$

(B − A)⊥ is the vector perpendicular to (B − A) and may be computed as:

$$
(B - A)^{\perp} =
\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
(B - A)
$$

The new position P′ of point P is obtained from A′ and the displacements of P along and perpendicular to A′B′, computed using the values of u and v:

$$
P' = A' + u\,(B' - A') + \frac{v\,(B' - A')^{\perp}}{|B' - A'|} \tag{10}
$$

Generally, multiple lines are marked in the source and the target images. In that case, the position P′ is computed by considering the effect of all the line pairs. The reader is referred to [18] for the complete algorithm. The next section describes the transition control procedure used to generate the intermediate images between the source and target images.
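For a single line pair, Equations (8)-(10) can be sketched as below. The function names and the sample coordinates are assumptions made for illustration; the weighting over several line pairs described in [18] is deliberately left out.

```python
import math


def perp(vec):
    """Perpendicular vector, i.e. multiplication by [[0, -1], [1, 0]]."""
    return (-vec[1], vec[0])


def warp_point(p, a, b, a2, b2):
    """Map pixel p relative to line AB onto the line A'B' (one line pair only)."""
    ab = (b[0] - a[0], b[1] - a[1])
    ap = (p[0] - a[0], p[1] - a[1])
    length = math.hypot(ab[0], ab[1])

    # Equation (8): normalized position of p along AB.
    u = (ap[0] * ab[0] + ap[1] * ab[1]) / (length * length)

    # Equation (9): signed perpendicular distance of p from AB.
    n = perp(ab)
    v = (ap[0] * n[0] + ap[1] * n[1]) / length

    # Equation (10): rebuild the point relative to A'B'.
    ab2 = (b2[0] - a2[0], b2[1] - a2[1])
    n2 = perp(ab2)
    length2 = math.hypot(ab2[0], ab2[1])
    return (
        a2[0] + u * ab2[0] + v * n2[0] / length2,
        a2[1] + u * ab2[1] + v * n2[1] / length2,
    )


# AB lies on the x-axis; A'B' is vertical and twice as long.
A, B = (0.0, 0.0), (10.0, 0.0)
A2, B2 = (5.0, 5.0), (5.0, 25.0)
mapped = warp_point((5.0, 2.0), A, B, A2, B2)
```

In this example the pixel sits halfway along AB (u = 0.5) at distance v = 2, so it lands halfway along A′B′ and two pixels to its side, at (3, 15). Note that u scales with the line length while v is kept in pixels.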

3.2 Transition Control

In transition control, a morph sequence is generated, which is a set of intermediate frames between the source and the target images. When the intermediate frames are viewed in sequence, the source image transforms into the target image seamlessly. Let s and t be the aligned source and target images, each of size m × n pixels, that are to be transformed from the spatial domain into the frequency domain. The most commonly used transforms include the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT). We have experimented with both DFT and DCT; however, we prefer to use DCT due to the following reasons:

Fig. 4 Points A, B and A′, B′ are marked by the animator. P is any pixel in the target image and P′ is the new position of P in the source image.

1. The space complexity of DCT is less than that of DFT, because DCT coefficients are real whereas DFT coefficients are complex numbers having real as well as imaginary components.

2. In DCT, the transition function may consist of a set of low pass and high pass filters with cutoff frequencies varying with a uniform step size. In contrast, in the case of DFT, the step size must be non-uniform in order to produce a seamless transition, which complicates the transition function. This is because different frequency bands in DFT have more variation in perceptual significance than DCT coefficients.

Let S and T be the frequency domain representations of the source and the target images and let (u, v) be the indices in the frequency domain: 0 ≤ u < m, 0 ≤ v < n, with S = DCT(s) and T = DCT(t).

In the frequency domain, the frequencies near (u, v) = (0, 0) are low frequencies, and as we move away from (0, 0) we get high frequencies. The highest frequencies are around the other corners of the image: (m − 1, 0), (0, n − 1) and (m − 1, n − 1). We observe that the low frequencies correspond to the structure or big details in the image while the high frequencies add the fine details. The coefficient at (u, v) = (0, 0) is the DC component, which is proportional to the average of all the image pixels. The size of the image when represented in the frequency domain is the same as in the spatial domain. The distance of any point from the origin (0, 0) may be computed using the Euclidean formula. The distance of a point at (u, v) from the origin (u, v) = (0, 0) is:

$$
D(u, v) = \sqrt{u^2 + v^2} \tag{11}
$$

The maximum distance is at the corner (m − 1, n − 1) and is √((m − 1)² + (n − 1)²).

To extract a particular band of frequencies we use different types of low pass and high pass filters. A filter that cuts off all the high frequencies from the transform is called a low pass filter, and a filter that cuts off all the low frequencies is a high pass filter [14]. Commonly used filters for extracting certain frequencies from an image include the Ideal, Butterworth and Gaussian filters. We use Gaussian filters in our algorithm because they do not produce the ringing effect in the filtered images. In the frequency domain, the Gaussian low pass filter may be written as:

$$
H_{lp}(u, v) = e^{-D^2(u, v) / (2 D_0^2)} \tag{12}
$$

where D(u, v) is given by Equation 11 and D0 is the cut-off frequency. For an image of size m × n pixels, the value of the cut-off frequency D0 ranges from D(0, 0) = 0 to D(m − 1, n − 1) = √((m − 1)² + (n − 1)²), see figure 6. Figure 5 shows eight low pass filtered images of the target image shown in Figure 3, obtained by applying the Gaussian low pass filter with the different cut-off frequencies D0 listed in table 1. Since the Gaussian filter is not defined for zero variance, the first cut-off frequency is D0 = 0.001.
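Equations (11) and (12) translate directly into a filter mask over the DCT indices. The following sketch builds such a mask in plain Python; the function name `gaussian_lowpass` and the nested-list layout are assumptions made for illustration.

```python
import math


def gaussian_lowpass(m, n, d0):
    """Gaussian low pass mask H_lp(u, v) of Eq. (12) for an m x n transform.

    d0 must be positive: the mask is undefined for d0 = 0, which is why a
    tiny value such as 0.001 is used for the lowest cut-off.
    """
    two_d0_sq = 2.0 * d0 * d0
    return [
        [math.exp(-(u * u + v * v) / two_d0_sq) for v in range(n)]
        for u in range(m)
    ]


mask = gaussian_lowpass(4, 4, 2.0)
# mask[0][0] is 1 (the DC term always passes unchanged);
# mask[3][3] is exp(-18 / 8), the most attenuated coefficient.

nearly_dc_only = gaussian_lowpass(4, 4, 0.001)
# With a tiny cut-off every entry except (0, 0) underflows to 0.0.
```

The mask is multiplied element-wise with the DCT coefficients, so a large `d0` leaves the image almost unchanged while a small `d0` keeps little more than the DC term.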

Fig. 5 Eight low pass filtered images of the target shown in figure 3 with different cut-off frequencies, mentioned in table 1.

Fig. 6 For image size m × n, the cut-off frequencies vary from (0, 0) to (m − 1, n − 1) in equal steps for the DCT transform.

Table 1 Details of cut-off frequencies D0 for each image in figure 5.

Image label    Cut-off frequency D0
a              ≈0
b              30
c              60
d              90
e              120
f              150
g              180
h              210

Any intermediate frame is obtained by blending the low frequencies of the source image S with the high frequencies of the target image T. We apply Gaussian low pass filters with gradually decreasing cut-off frequencies to filter the high frequencies from S, and Gaussian high pass filters to remove the low frequencies from T. The blend of the corresponding low pass filtered source images with the high pass filtered target images produces the morph sequence. In the transition control function, D0 is varied from the maximum value to the minimum value in equal steps. If there are N intermediate frames to be generated between the source and the target images, the total range of D0 is divided into N equal steps with step size:

$$
\text{Step Size} = \frac{\sqrt{(m - 1)^2 + (n - 1)^2}}{N} \tag{13}
$$
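As a quick numerical illustration of Equation (13), the sketch below computes the step size and the resulting list of cut-off frequencies (taking the ith cut-off as i times the step). The function names are assumptions; the image size 175 × 129 and N = 8 are the values of the first experiment reported in Table 2.

```python
import math


def max_cutoff(m, n):
    """Largest distance D(m - 1, n - 1) from the DCT origin."""
    return math.hypot(m - 1, n - 1)


def cutoff_schedule(m, n, num_frames):
    """Cut-off frequencies i * step for i = 1..N, with step from Eq. (13)."""
    step = max_cutoff(m, n) / num_frames
    return [i * step for i in range(1, num_frames + 1)]


cutoffs = cutoff_schedule(175, 129, 8)
# The step is about 27 and the last cut-off about 216, which agrees with
# the step size and maximum D0 listed for experiment 1 in Table 2.
```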

For the ith intermediate frame, the cut-off frequency is given by:

$$
D_0^i = \frac{i}{N} \sqrt{(m - 1)^2 + (n - 1)^2} \tag{14}
$$

where i = 1, 2, · · · , N. For each intermediate frame, S is multiplied by the ith Gaussian low pass filter $H^i_{lp}(u, v)$:

$$
H^i_{lp}(u, v) = e^{-D^2(u, v) / (2 (D_0^i)^2)} \tag{15}
$$

and T is multiplied by the ith Gaussian high pass filter $H^i_{hp}(u, v)$:

$$
H^i_{hp}(u, v) = 1 - H^i_{lp}(u, v) \tag{16}
$$

Figure 7 shows eight high pass filtered images of the target, obtained by applying Gaussian high pass filters with the same cut-off frequencies as used in figure 5. Both filtered spectra are then added up:

$$
F_i(u, v) = S(u, v)\, H^i_{lp}(u, v) + T(u, v)\, H^i_{hp}(u, v), \tag{17}
$$

where F_i(u, v) is the frequency domain representation of the ith intermediate frame. Equation (17) can be simplified by substituting the value of $H^i_{hp}(u, v)$ from Equation (16):

$$
F_i(u, v) = (S(u, v) - T(u, v))\, H^i_{lp}(u, v) + T(u, v). \tag{18}
$$

For all intermediate frames, (S(u, v) − T(u, v)) remains the same; therefore it may be computed only once to reduce the computational cost. An inverse DCT of F_i(u, v) yields the intermediate frame in the spatial domain:

$$
f_i(x, y) = \mathrm{iDCT}(F_i(u, v)) \tag{19}
$$
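The equivalence of Equations (17) and (18) can be confirmed on a small set of made-up coefficients. In the sketch below, `S`, `T` and the helper names are hypothetical; the coefficient values are arbitrary and do not come from any image in the paper.

```python
import math


def gaussian_lowpass(m, n, d0):
    """Gaussian low pass mask as in Eq. (15)."""
    return [
        [math.exp(-(u * u + v * v) / (2.0 * d0 * d0)) for v in range(n)]
        for u in range(m)
    ]


def blend_eq17(S, T, H):
    """Low passed source plus high passed target, with H_hp = 1 - H_lp."""
    return [
        [s * h + t * (1.0 - h) for s, t, h in zip(rs, rt, rh)]
        for rs, rt, rh in zip(S, T, H)
    ]


def blend_eq18(S, T, H):
    """Simplified form: (S - T) * H_lp + T."""
    return [
        [(s - t) * h + t for s, t, h in zip(rs, rt, rh)]
        for rs, rt, rh in zip(S, T, H)
    ]


S = [[10.0, 4.0], [2.0, 1.0]]    # made-up source coefficients
T = [[6.0, -3.0], [5.0, 0.5]]    # made-up target coefficients
H = gaussian_lowpass(2, 2, 1.0)

f17 = blend_eq17(S, T, H)
f18 = blend_eq18(S, T, H)
```

Both forms give the same coefficients up to rounding. The second form needs one multiplication per coefficient instead of two, and the difference S − T can be reused across frames.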

The morph sequence can now be generated between the two images shown in figure 3 by adding the low pass filtered images of the source image to the corresponding high pass filtered images of the target. Figure 10 shows the resulting morph sequence. The algorithm for generating N intermediate frames is summarized in the following listing, and Figure 9 is the block diagram of the same algorithm. When the images are colored, the DCT is applied to each color component.

Fig. 7 Eight high pass filtered images of the target shown in Figure 3 with different cut-off frequencies, mentioned in table 1.


Algorithm 1 IMFD
  S(u, v) ← DCT(s(x, y))
  T(u, v) ← DCT(t(x, y))
  for i = 1 to N do
    D_0^i ← (i / N) √((m − 1)² + (n − 1)²)
    H_lp^i(u, v) ← Gaussian low pass filter with radius D_0^i
    F_i(u, v) ← (S(u, v) − T(u, v)) H_lp^i(u, v) + T(u, v)
    f_i ← iDCT(F_i(u, v)) {iDCT is the inverse of DCT, f_i is the ith intermediate frame}
    SAVE frame f_i
  end for
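Algorithm 1 can be sketched end to end in plain Python for tiny grayscale images. This is a minimal illustration under stated assumptions, not the Matlab implementation used in the experiments: all function names are hypothetical, the DCT is computed with explicit orthonormal basis matrices (slow, but dependency free), the cut-off follows Equation (14), and frames are returned in a list instead of being saved.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis; row k holds scale_k * cos(pi/n * (i + 1/2) * k)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2(img):
    """2D DCT as C_m * img * C_n^T."""
    cm, cn = dct_matrix(len(img)), dct_matrix(len(img[0]))
    return matmul(matmul(cm, img), transpose(cn))


def idct2(coeffs):
    """Inverse 2D DCT; the basis is orthogonal, so the inverse is the transpose."""
    cm, cn = dct_matrix(len(coeffs)), dct_matrix(len(coeffs[0]))
    return matmul(matmul(transpose(cm), coeffs), cn)


def imfd_frames(source, target, num_frames):
    """Intermediate frames following the steps of Algorithm 1."""
    m, n = len(source), len(source[0])
    S, T = dct2(source), dct2(target)
    # S - T is the same for every frame, so it is computed once.
    diff = [[s - t for s, t in zip(rs, rt)] for rs, rt in zip(S, T)]
    d_max = math.hypot(m - 1, n - 1)
    frames = []
    for i in range(1, num_frames + 1):
        d0 = i / num_frames * d_max
        F = [
            [diff[u][v] * math.exp(-(u * u + v * v) / (2.0 * d0 * d0)) + T[u][v]
             for v in range(n)]
            for u in range(m)
        ]
        frames.append(idct2(F))
    return frames


source = [[10.0, 20.0, 30.0, 40.0] for _ in range(4)]
target = [[float((r * 4 + c) % 7) for c in range(4)] for r in range(4)]
frames = imfd_frames(source, target, 4)
```

Since H_lp(0, 0) = 1 for every cut-off, the DC coefficient of each frame in this sketch comes from the source, so every frame has the same mean intensity as the source image. For realistic image sizes a fast DCT routine would replace the matrix products used here.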

Fig. 8 From left to right: the first and second images are the source and target images respectively. The third image is the source image after alignment and the last image is an intermediate image. The ghost effect is visible around the eyes, nose and lips.

3.3 Choice of features versus quality of the morphed sequence

The quality of the morphed sequence depends on two main factors: the quality of the features selected for alignment and the number of features selected. For feature selection and alignment, either point based or line segment based alignment techniques may be used, depending on the complexity of the images. Each approach has its own advantages and disadvantages, as already discussed in Section 3.1.

3.3.1 Effect of the quality of selected features on the quality of morphed sequence

Selection of good quality features is very important for any morphing algorithm. The animator should choose the feature points carefully because a bad choice may lead to a badly morphed image sequence. Human vision is more sensitive to the structure of an object than to its color. We propose that the main structural features in an image should be selected as points of correspondence. For example, in any facial image the eyes, nose, ears, eyebrows, chin and lips are the structural features and hence are good candidates. We have repeated the same experiment with different features, as shown in Figure 8, where the ghost effect is clearly visible, especially around the eyes, nose and lips. This shows that the choice of the selected features is very important to the quality of the morphed sequence.

Fig. 9 IMFD Algorithm

3.3.2 Effect of number of features on the quality of the morphed sequence

The quality of the morphed sequence also depends on the number of features selected. The greater the number of features, the better the alignment and hence the better the quality of the morphed sequence. The number of features required for good quality morphing depends on the details in the two images and the similarity between the objects to morph. We empirically observed that morphing one human facial image to another requires about 20 features, whereas morphing a human face to a horse face requires around 100 features. The next section describes the time and space complexity analysis of the proposed technique.

3.4 Execution time and space complexity analysis of IMFD

The time complexity of generating one intermediate frame between the two images, after alignment, is dominated by the time spent on domain transformation, that is, converting the image between the spatial domain and the discrete cosine transform domain. The discrete cosine transform (DCT) of an image I of size m × n is calculated as:


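$$
F(u, v) = \alpha_u \beta_v \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} I(i, j)
\cos\left[\frac{\pi}{m}\left(i + \frac{1}{2}\right) u\right]
\cos\left[\frac{\pi}{n}\left(j + \frac{1}{2}\right) v\right] \tag{20}
$$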

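where

$$
\alpha_u =
\begin{cases}
\frac{1}{\sqrt{m}} & \text{if } u = 0 \\
\sqrt{\frac{2}{m}} & \text{if } 0 < u < m
\end{cases}
$$

and

$$
\beta_v =
\begin{cases}
\frac{1}{\sqrt{n}} & \text{if } v = 0 \\
\sqrt{\frac{2}{n}} & \text{if } 0 < v < n
\end{cases}
$$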
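A direct, unoptimized evaluation of Equation (20) makes the quadratic cost visible: each of the mn coefficients requires a sum over all mn pixels. The function name `dct2_direct` and the sample image below are assumptions made for illustration.

```python
import math


def dct2_direct(image):
    """Evaluate Eq. (20) literally; four nested loops, (mn)^2 multiply-adds."""
    m, n = len(image), len(image[0])
    out = [[0.0] * n for _ in range(m)]
    for u in range(m):
        alpha = math.sqrt(1.0 / m) if u == 0 else math.sqrt(2.0 / m)
        for v in range(n):
            beta = math.sqrt(1.0 / n) if v == 0 else math.sqrt(2.0 / n)
            total = 0.0
            for i in range(m):
                for j in range(n):
                    total += (
                        image[i][j]
                        * math.cos(math.pi / m * (i + 0.5) * u)
                        * math.cos(math.pi / n * (j + 0.5) * v)
                    )
            out[u][v] = alpha * beta * total
    return out


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
coeffs = dct2_direct(image)
# coeffs[0][0] equals the pixel sum divided by sqrt(m * n), i.e. 21 / sqrt(6).
```

With this scaling the transform is orthonormal, so the sum of squared coefficients equals the sum of squared pixel values, and the DC term is the pixel sum divided by √(mn).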

Direct implementation of Equation 20 requires O(N²) time, where N = mn. The discrete cosine transform can be computed through the fast Fourier transform (FFT) with O(N) pre-processing and O(N) post-processing steps [8]. For N elements, the FFT can be computed in O(N log N) time. Frigo and Johnson [10] gave an adaptive, parallel algorithm to compute the discrete Fourier transform; this implementation is based on the SIMD (Single Instruction Multiple Data) architecture. In this sense the proposed algorithm is optimal. The asymptotic time complexity of the proposed technique to generate one intermediate frame between source and target images of size m × n is O(mn log(mn)). Often the one dimensional FFT is used to compute the two dimensional FFT, resulting in O(mn log m + mn log n). Hence, the time complexity of IMFD to generate k intermediate frames is O(k mn log(mn)). The proposed technique uses O(mn) extra space to store the discrete cosine transforms of the source image, the target image and the intermediate frame, each of size m × n.

4 Experiments and Results

The proposed algorithm has been implemented in Matlab and tested for the quality of morphing, in terms of the smoothness of the transformation, and for execution time on a wide variety of images. The results of five different experiments are reported in this paper. In each experiment, a different number of correspondences is marked for alignment, using either the point-based or the line-segment-based technique. The intermediate frames in each experiment are generated by using a uniform step size in the cut-off frequency of the low pass and high pass filters. The image size, the number of intermediate frames, the maximum value of the cut-off frequency, the step size used in each experiment and the total execution time taken by each experiment are reported in Table 2. The execution time reported in each experiment is the total time required to generate the full sequence of intermediate frames, including the file I/O time. These execution times were measured on a Dell Optiplex 330 with an Intel Core 2 Duo 2.2 GHz CPU and 1 GB RAM.

Table 2 Experimental details: image size in pixels, maximum cut-off frequency (maxDo), step size, number of intermediate frames (IF) and the execution time in seconds.

Exp.  Size       maxDo  IF  StepSize  Time
1     175 × 129  216    8   27        0.781
2     477 × 360  596.2  55  10.84     28.363
3     258 × 273  374.2  80  4.67      19.625
4     252 × 218  331.7  50  6.63      7.016
5     380 × 300  482.7  35  13.79     11.688
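The uniform step size can be made concrete with a short Python sketch. The step sizes reported in Table 2 are consistent, up to rounding, with the maximum cut-off frequency divided by the number of intermediate frames. The sketch below assumes that rule and, as a further assumption, that the cut-off decreases from maxDo towards zero by one step per frame; the helper name cutoff_schedule is hypothetical.

```python
# Values copied from Table 2: (experiment, maxDo, intermediate frames, reported step size).
TABLE_2 = [
    (1, 216.0, 8, 27.0),
    (2, 596.2, 55, 10.84),
    (3, 374.2, 80, 4.67),
    (4, 331.7, 50, 6.63),
    (5, 482.7, 35, 13.79),
]

def cutoff_schedule(max_d0, num_frames):
    """Return (step, cutoffs) for a uniform step in the cut-off frequency.

    Assumption: frame i (1-based) uses the cut-off max_d0 - i * step, so the
    cut-off shrinks uniformly and reaches zero at the last frame.
    """
    step = max_d0 / num_frames
    cutoffs = [max_d0 - i * step for i in range(1, num_frames + 1)]
    return step, cutoffs

for exp, max_d0, frames, reported in TABLE_2:
    step, cutoffs = cutoff_schedule(max_d0, frames)
    print(f"Exp. {exp}: computed step {step:.3f}, reported {reported}, "
          f"first cut-off {cutoffs[0]:.1f}")
```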

In the first experiment, shown in Figure 10, a facial image of a girl transforms into a lion. The alignment of the two images is done using the point-based technique. Six feature points are marked in the source and the target images, as shown in Figure 2. In this experiment, 8 intermediate frames are generated in 0.781 seconds. In the next experiment, shown in Figure 11, a man transforms into a woman. In this experiment, 55 intermediate frames are generated in 28.363 seconds (Table 2), costing about 0.516 seconds per frame. Figure 11 shows only 8 of the intermediate frames. In this experiment, line-segment-based alignment is done with 12 feature lines.

Figure 12 shows the third experiment¹, where a boy transforms into another boy. In this experiment, alignment is done using the line-segment-based technique with 8 feature lines, and 80 intermediate frames are generated in 19.625 seconds. In the fourth experiment, shown in Figure 13, a brown cat transforms into a black cat. Fifty intermediate frames are generated in 7.016 seconds. Fifteen feature lines are marked to align the source and the target images. Figure 13 shows every fifth frame in the morphed sequence. Figure 14 shows the fifth experiment, where alignment is done using the line-segment-based technique and 35 intermediate frames are generated. Figure 14 shows every third frame in the morphed sequence.

4.1 Comparison with Mesh Warping and Field Morphing Techniques

The proposed algorithm has also been compared with field morphing and mesh warping in terms of execution time speedup and quality of the morphed sequences. The five experiments listed in Table 2 were also carried out using the Mesh Warping and Field Morphing techniques in the same environment and on the same machine using the same tool, Matlab.

1 Thanks to Mr. Umair and Mr. Hassan for the images used in the third experiment.

Fig. 10 Experiment 1: Top row: Eight low passed source images with decreasing cut-off frequency from top to bottom. Middle row: Eight high passed target images with the same cut-off frequencies as the corresponding images in the left column. Bottom row: The resultant eight intermediate frames. See Table 2 for more details of Experiment 1.

Fig. 11 Experiment 2: A man transforms into a woman. The left image in the top row is the source image and the left image in the bottom row is the target image. The images from left to right in the top row and from right to left in the bottom row are intermediate images.

Fig. 12 Experiment 3: A boy transforms into another boy. The left image in the top row is the source image and the left image in the bottom row is the target image. The images from left to right in the top row and from right to left in the bottom row are intermediate images.

Fig. 13 Experiment 4: A brown cat transforms into a black cat. The left image in the top row is the source image and the left image in the bottom row is the target image. The images from left to right in the top row and from right to left in the bottom row are intermediate images.

Fig. 14 Experiment 5: A dog transforms into a cat. The left image in the top row is the source image and the left image in the bottom row is the target image. The images from left to right in the top row and from right to left in the bottom row are intermediate images.

In Section 4.1.1 the quality of the morphed sequences generated by the mesh warping technique is compared to that of the proposed technique. The execution time for each experiment is also recorded and presented in Table 3. Section 4.1.2 compares the quality of the Field Morphing technique with that of the proposed technique. The execution time of each experiment is reported in Table 4. Section 4.1.3 compares the execution time of the proposed technique with that of mesh warping and field morphing.

4.1.1 Comparison with Mesh Warping

Figures 15, 16, 17, 18 and 19 show the same five experiments carried out using the mesh warping technique. One problem can be observed in the morphed sequences generated in these experiments: the edge regions are blurred and look unnatural. This is because the pixels are moved in blocks or meshes in this technique. This effect is evident in the region around the shoulders in Figure 16 and in the background transition in Figure 17, especially in the second and third images in the top row and the fourth image in the bottom row. The problem is clearly visible in the right eye in the transformation of the brown cat into the black cat in Figure 18. A blur wave is visible in Figure 19 at the bottom of the intermediate images. From all these experiments, it is evident that the quality of the proposed technique is better than that of mesh warping. Table 3 reports the execution time of each experiment along with the number of frames.

Fig. 15 Experiment 1 - with mesh warping: Woman transforms into a tiger.

Fig. 16 Experiment 2 - with mesh warping: A man transforms into a woman.

Fig. 17 Experiment 3 - with mesh warping: One facial image transforms into another.

Table 3 Execution time of the five experiments listed in Table 2 using the Mesh Warping technique with the same number of intermediate frames (IF). Time is in seconds.

Exp.  No. of IF  Execution Time
1     8          2.593
2     55         35.890
3     80         29.938
4     50         19.922
5     35         21.827

4.1.2 Comparison with Field Morphing

Figures 20, 21, 22, 23 and 24 show the five experiments carried out using the field morphing technique. There are two major problems with the field morphing technique. First, it is very slow, as each pixel in every intermediate frame is computed with respect to all the feature lines. The second problem is related to the quality of the morphed sequence. As each pixel is interpolated with respect to the feature lines, unexpected interpolation errors sometimes occur and the object is distorted in the intermediate frame. This problem is visible in Figure 21: the region of the right cheek and the chin is distorted, especially in the middle intermediate images. In Figure 11, where the same experiment is done with the proposed technique, there is no such distortion. Experiment 3 in Figure 22 suffers from this problem as well: the lips in the morphed sequence are distorted, which is clearly visible in the middle intermediate images. Compare this sequence with the sequence generated by the proposed technique in Figure 12, which is free from distortion of the lips and the region around them. The execution time of each experiment is reported in Table 4.

Fig. 18 Experiment 4 - with mesh warping: A brown cat transforms into a black cat.

Fig. 19 Experiment 5 - with mesh warping: A dog transforms into a cat.

Table 4 Execution time of the five experiments listed in Table 2 using the Field Morphing technique. Time is in seconds.

Exp.  No. of IF  Execution Time
1     8          3.312
2     55         68.180
3     80         60.612
4     50         30.493
5     35         37.153


Table 5 Execution time comparison of the proposed technique (IMFD) with the Mesh Warping (MW) and Field Morphing (FM) techniques. Time is in seconds.

Exp.  No. of IF  IMFD    MW      Speedup over MW  FM      Speedup over FM
1     8          0.781   2.593   3.32             3.312   4.24
2     55         28.363  35.890  1.26             68.180  2.41
3     80         19.625  29.938  1.52             60.612  3.09
4     50         7.016   19.922  2.82             30.493  4.34
5     35         11.688  21.827  1.86             37.153  3.17

Fig. 20 Experiment 1 - with field morphing: Woman transforms into a tiger.

Fig. 21 Experiment 2 - with field morphing: A man transforms into a woman.

Fig. 22 Experiment 3 - with field morphing: One facial image transforms into another.

Fig. 23 Experiment 4 - with field morphing: A brown cat transforms into a black cat.

Fig. 24 Experiment 5 - with field morphing: A dog transforms into a cat.

4.1.3 Time Speedup over Mesh Warping and Field Morphing

The time spent to generate the same number of intermediate frames for each experiment using the proposed technique, Mesh Warping and Field Morphing is reported in Tables 2, 3 and 4 respectively. Table 5 summarizes the timing results of the whole experimental section. The execution time speedup of our algorithm is 2.16 on average over mesh warping and 3.45 on average over field morphing for the same number of intermediate frames. We observe that the proposed technique is much faster than field morphing and significantly faster than mesh warping, without compromising the quality of the morphed sequence. This is because in both of these techniques all feature points are referenced to calculate one pixel in an intermediate frame. That is, their execution time depends directly on the number of corresponding features, whereas there is no such dependency in the proposed technique.
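The average speedups quoted above can be recomputed from the timings in Table 5. The following Python sketch takes the speedup to be the baseline time divided by the IMFD time, which is an assumption about how the table was derived; the names TIMES, speedups and mean are hypothetical. Individual ratios recomputed this way may differ from the tabulated ones in the last digit because of rounding.

```python
# Execution times in seconds copied from Table 5: (IMFD, mesh warping, field morphing).
TIMES = {
    1: (0.781, 2.593, 3.312),
    2: (28.363, 35.890, 68.180),
    3: (19.625, 29.938, 60.612),
    4: (7.016, 19.922, 30.493),
    5: (11.688, 21.827, 37.153),
}

def speedups(times):
    """Return {experiment: (speedup over MW, speedup over FM)} as baseline / IMFD."""
    return {exp: (mw / imfd, fm / imfd) for exp, (imfd, mw, fm) in times.items()}

def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

ratios = speedups(TIMES)
avg_mw = mean(r[0] for r in ratios.values())
avg_fm = mean(r[1] for r in ratios.values())
print(f"average speedup over mesh warping:   {avg_mw:.2f}")
print(f"average speedup over field morphing: {avg_fm:.2f}")
```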

5 Conclusion

In this paper a new image morphing technique based upon the frequency domain representation of images is presented. The proposed technique is significantly faster in terms of execution time than existing spatial domain techniques, without compromising the quality. The technique works in two main steps. In the first step, the source and the target images are aligned using either the point-based or the line-based alignment technique. In the point-based feature alignment approach, a global geometric transformation model is computed, while in the line-based approach a local transformation model is computed at each pixel. We find that when the image pair does not follow one global geometric transformation model, the local approach produces better results. In the second step, intermediate frames are generated between the source image and the target image by blending low frequencies of the source image with high frequencies of the target image by using a transition function. The proposed technique is useful when a large number of intermediate frames are to be generated or when morphing has to be done between images of larger sizes.
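The second step can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the Matlab implementation evaluated above: it assumes already aligned grayscale images stored as nested lists, an orthonormal discrete cosine transform, and a hard (ideal) cut-off on the distance sqrt(u^2 + v^2) from the DC coefficient in place of the transition function. The names dct_matrix, matmul, transpose, dct2, idct2 and morph_frame are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT matrix C with C[u][x] = alpha(u) * cos((2x + 1) * u * pi / (2n))."""
    rows = []
    for u in range(n):
        alpha = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([alpha * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def matmul(a, b):
    """Plain nested-list matrix product (adequate for tiny illustrative images)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(img):
    """2-D DCT along rows and columns: F = C_m * img * C_n^T."""
    cm, cn = dct_matrix(len(img)), dct_matrix(len(img[0]))
    return matmul(matmul(cm, img), transpose(cn))

def idct2(coef):
    """Inverse 2-D DCT; the matrices are orthonormal, so img = C_m^T * F * C_n."""
    cm, cn = dct_matrix(len(coef)), dct_matrix(len(coef[0]))
    return matmul(matmul(transpose(cm), coef), cn)

def morph_frame(source, target, d0):
    """Low frequencies (distance <= d0) from the source, the rest from the target."""
    s, t = dct2(source), dct2(target)
    mixed = [[s[u][v] if math.hypot(u, v) <= d0 else t[u][v]
              for v in range(len(s[0]))] for u in range(len(s))]
    return idct2(mixed)

source = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
target = [[9.0, 8.0, 7.0], [6.0, 5.0, 4.0]]
max_d0 = math.hypot(len(source) - 1, len(source[0]) - 1)
# Decreasing cut-off: the first frame reproduces the source, the last one the target.
frames = [morph_frame(source, target, d0) for d0 in (max_d0, 1.0, 0.0, -1.0)]
```

With the largest cut-off every coefficient comes from the source, and with a negative cut-off every coefficient comes from the target, so the sequence moves from one image to the other as the cut-off decreases. A smooth filter in place of the hard threshold would be a drop-in change inside morph_frame.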

Acknowledgements We are thankful to Stephen Mullens for the Mesh Warping code. We are also thankful to Prof. Dr. Syed Mansoor Sarwar for his support.

We are thankful to the referees and the Editor of J Math Imaging Vis for their constructive feedback and comments that resulted in a significant quality improvement in the original manuscript.

References

1. Planar shape recognition by shape morphing. Pattern Recognition 33(10), 1683–1699 (2000)

2. Guaranteed intersection-free polygon morphing. Computers & Graphics 25(1), 67–75 (2001)

3. A parallel application for 3d reconstruction of coronal loops using image morphing. Image and Vision Computing 25(1), 95–102 (2007). SIBGRAPI

4. Aboul-Ella, H., Karam, H., Nakajima, M.: Image metamorphosis transformation of facial images based on elastic body splines. Signal Process. 70, 129–137 (1998)

5. Arsigny, V., Commowick, O., Ayache, N., Pennec, X.: A fast and log-euclidean polyaffine framework for locally linear registration. J. Math. Imaging Vis. 33, 222–238 (2009)

6. Bigot, J., Gadat, S., Loubes, J.M.: Statistical m-estimation and consistency in large deformable models for image warping. J. Math. Imaging Vis. 34, 270–290 (2009)

7. Che, W., Yang, X., Wang, G.: Skeleton-driven 2d distance field metamorphosis using intrinsic shape parameters. Graph. Models 66, 261–261 (2004)

8. Chen, W.H., Smith, C., Fralick, S.: A fast computational algorithm for the discrete cosine transform. Communications, IEEE Transactions on 25(9), 1004–1009 (1977)

9. Dykstra, C., Celler, A., Greer, K., Jaszczak, R.: The use of image morphing to improve the detection of tumors in emission imaging. Nuclear Science, IEEE Transactions on 46(3), 673–679 (1999)

10. Frigo, M., Johnson, S.: The design and implementation of fftw3. Proceedings of the IEEE 93(2), 216–231 (2005)

11. Fuchs, M., Juttler, B., Scherzer, O., Yang, H.: Shape metrics based on elastic deformations. J. Math. Imaging Vis. 35, 86–102 (2009)

12. Gong, M., Yang, Y.H.: Layer-based morphing. Graph. Models 63, 45–59 (2001)

13. Gonzalez, J., Arevalo, V.: Mesh topological optimization for improving piecewise-linear image registration. J. Math. Imaging Vis. 37, 166–182 (2010)

14. Gonzalez, R.C., Woods, R.E.: Digital Image Processing (3rd Edition). Prentice-Hall, Inc., Upper Saddle River, NJ, USA (2006)

15. Hagege, R., Francos, J.M.: Parametric estimation of affine transformations: An exact linear solution. J. Math. Imaging Vis. 37, 1–16 (2010)

16. Johan, H., Koiso, Y., Nishita, T.: Morphing using curves and shape interpolation techniques. In: Proceedings of the 8th Pacific Conference on Computer Graphics and Applications, PG ’00, pp. 348– (2000)

17. Kang, J.Y., Lee, B.S.: Mesh-based morphing method for rapid hull form generation. Comput. Aided Des. 42, 970–976 (2010)

18. Karam, H., Hassanien, A., Nakajima, M.: Feature-based image metamorphosis optimization algorithm. In: Proceedings of the Seventh International Conference on Virtual Systems and Multimedia (VSMM’01), VSMM ’01, pp. 553– (2001)

19. Kass, M., Witkin, A., Terzopoulos, D.: Snakes: Active contour models. International Journal of Computer Vision 1(4), 321–331 (1988)

20. Kruger, W.: Robust and efficient map-to-image registration with line segments. Mach. Vision Appl. 13, 38–50 (2001)

21. Larrey-Ruiz, J., Verdu-Monedero, R., Morales-Sanchez, J.: A fourier domain framework for variational image registration. J. Math. Imaging Vis. 32, 57–72 (2008)

22. Lee, S., Wolberg, G., Chwa, K.Y., Shin, S.Y.: Image metamorphosis with scattered feature constraints. IEEE Transactions on Visualization and Computer Graphics 2, 337–354 (1996)

23. Lee, S., Wolberg, G., Shin, S.Y.: Scattered data interpolation with multilevel b-splines. IEEE Transactions on Visualization and Computer Graphics 3, 228–244 (1997)

24. Lee, S., Wolberg, G., Shin, S.Y.: Polymorph: Morphing among multiple images. IEEE Comput. Graph. Appl. 18, 58–71 (1998)

25. Lee, S.Y., Chwa, K.Y., Shin, S.Y.: Image metamorphosis using snakes and free-form deformations. In: Proceedings of the 22nd annual conference on Computer graphics and interactive techniques, SIGGRAPH ’95, pp. 439–448 (1995)

26. Manning, R.A., Dyer, C.R.: Dynamic view morphing. In: Proc. SIGGRAPH 96, pp. 21–30 (1996)

27. Oliva, A., Torralba, A., Schyns, P.G.: Hybrid images. In: ACM SIGGRAPH 2006 Papers, SIGGRAPH ’06, pp. 527–532 (2006)

28. Park, S.Y., Choi, S.I., Kim, J., Chae, J.: Real-time 3d registration using gpu. Machine Vision and Applications pp. 1–14

29. Reyes-Lozano, L., Medioni, G., Bayro-Carrochano, E.: Registration of 2d points using geometric algebra and tensor voting. J. Math. Imaging Vis. 37, 249–266 (2010)

30. Ruprecht, D., Muller, H.: Image warping with scattered data interpolation. IEEE Comput. Graph. Appl. 15, 37–43 (1995)

31. S. Lee K. Y. Chwa, J.H., Shin, S.: Image morphing using deformation techniques. Visualization and Computer Animation 7

32. Seitz, S.: Bringing photographs to life with view morphing. In: Proc. Imagina 97 Conf, pp. 153–158 (1997)

33. Seitz, S.M., Dyer, C.R.: Physically-valid view synthesis by image interpolation. In: Proc. IEEE Workshop on Representations of Visual Scenes, pp. 18–25 (1995)

34. Seitz, S.M., Dyer, C.R.: Toward image-based scene representation using view morphing. In: Proc. 13th Int. Conf. on Pattern Recognition, pp. 84–89 (1996)

35. Seitz, S.M., Dyer, C.R.: View morphing. In: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, SIGGRAPH ’96, pp. 21–30 (1996)


36. Wang, W.H., Chen, Y.C.: Image registration by control points pairing using the invariant properties of line segments. Pattern Recogn. Lett. 18, 269–281 (1997)

37. Whitaker, R.T.: A level-set approach to image blending. IEEE Transactions on Image Processing 9(11), 1849–1861 (2000)

38. Wolberg, G.: Digital Image Warping, 1st edn. IEEE Computer Society Press, Los Alamitos, CA, USA (1994)

39. Wolberg, G.: Image morphing: a survey. The Visual Computer 14(8/9), 360–372 (1998)

40. Xiao, J., Shah, M.: From images to video: View morphing of three images. In: VMV, pp. 495–502 (2003)

41. Xiao, J., Shah, M.: Tri-view morphing. Comput. Vis. Image Underst. 96, 345–366 (2004)

42. Xu, C., Prince, J.L.: Snakes, shapes, and gradient vector flow. IEEE Transactions on Image Processing 7(3), 359–369 (1998)

43. Yang, W., Feng, J.: Technical section: 2d shape morphing via automatic feature matching and hierarchical interpolation. Comput. Graph. 33, 414–423 (2009)

44. Zhu, L., Yang, Y., Haker, S., Tannenbaum, A.: An image morphing technique based on optimal mass preserving mapping. Image Processing, IEEE Transactions on 16(6), 1481–1495 (2007). DOI 10.1109/TIP.2007.896637
