video motion interpolation for special effect applications timothy k. shih, senior member, ieee,...

Video Motion Interpolation for Special Effect Applications

Timothy K. Shih, Senior Member, IEEE, Nick C. Tang, Joseph C. Tsai, and Jenq-Neng Hwang, Fellow, IEEE

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 41, NO. 5, SEPTEMBER 2011

Outline• Introduction• System Overview•Motion Layer Segmentation and Tracking•Motion Interpolation Using Video Inpainting• Experimental Results• Conclusion

Introduction

Background• Video forgery (video falsifying):

• A technique for generating fake videos by altering, combining, or creating new video contents

• For instance, the outcome of a 100 m race in the olympic game is changed.

Introduction• Example of video forgery :

Original video frame Falsifying result

Objective• To create a forged video, which is almost

indistinguishable from the original video

• To create special effects in video editing applications

Introduction• To change the content of video, the

following techniques are commonly used:• object tracking• motion interpolation• video inpainting• video layer fusing

Introduction• Contributions of this paper:

• 1) It is the first time that video forgery is attempted based on video inpainting techniques.

• 2) A new concept called guided inpainting for motion interpolation of video objects is proposed.

• 3) A guided quasi-3-D (i.e., X, Y, and time) video inpainting mechanism is proposed.

System Overview

System Overview• 1) Motion Layer Segmentation: • Separates background and tracked object• 2) Motion Prediction:• Finds Reference Stick-Figure to predict cycle

of motion • 3) Motion Interpolation: • Motion analysis• Patch assertion• Motion completion via inpainting.

System Overview• 4) Background Inpainting: • Inpaints background of different camera

motions• 5) Layer Fusion: • Merges an object layer and a background

layer

Motion Layer Segmentation and

Tracking

Motion Layer Segmentation and Tracking• Separate target objects from the background

• Adopt Mean Shift Feature Space Analysis Algorithm[2] for color region segmentation

[2] D. Comaniciu and P.Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603–619, May 2002.

• Initial segmentation of objects from their background

• ALGORITHM: REFERENCE STICK FIGURE TRACKING

Manually Selected

Mean Shift Algorithm

C0

C0’

Frame 0


Revised Fast Tracking Mechanism[6]

Bounding Box B1

Frame 1

[6] K. Hariharakrishnan and D. Schonfeld, “Fast object tracking using adaptive blockmatching,” IEEE Trans.Multimedia, vol. 7, no. 5, pp. 853–859, Oct. 2005.


Mean Shift Algorithm

B1’

Comparing Color Segments of C0’ and B1’

C1’

Frame 1 Frame 0


Applying Dilation C1’ C1*

Frame 1


Frame 0

CoC1*

Comparing corresponding

pixel p

If (p in C1*) - (p in C0) > L2

　 Exclude p in C1’Else　 Keep p in C1’

Frame 1

‧Set L2 = 2‧Using L in LUV color space

Motion Segmentation• Different parts of the target may move in

different directions

• Decomposing an object into different regions

• Using revised block searching algorithm[10] to compute motion map

[10] J. Jia, Y.-W. Tai, T.-P.Wu, and C.-K. Tang, “Video repairing under variable illumination using cyclic motions,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 5, pp. 832–839, May 2006.

Motion Segmentation• The Mean Shift color segmentation can also be

revised to deal with motion segmentation:

• Based on blocks, not pixels

• Important for video inpainting

• Ghost shadows can be eliminated

Original Video Frame

Corresponding result of color segmentation by using [2]

The example of tracked object and estimated vectors

Motion Interpolation Using Video Inpainting

Motion Interpolation Using Video Inpainting•Motion Interpolation :

• Motions of the target object need to be interpolated.

• Video Inpainting :

• In order to obtain the interpolated figures

• Motion interpolation may create background holes.

Motion Interpolation of Target Objects• A target object can be segmented into a layer.

•Motion interpolation is required to produce a slow motion of the target layer.

Original OriginalInterpolated Interpolated

tn tn+1 tn+2 tn+3

General Inpainting Strategy

• In order to obtain the interpolated figures

• Using a rule-based thinning algorithm[1] :

• To obain the stick figures of target objects

• Stick figures:

• Used to guide the selection of patches

• Copied from the original video

[1] M. Ahmed and R. Ward, “A rotation invariant rule-based thinning algorithmfor character recognition,” IEEE Trans. Pattern Anal. Mach. Intell.,vol. 24, no. 12, pp. 1672–1678, Dec. 2002.

General Inpainting Strategy

• Quasi-3-D video space (2-D plus tIme)

• Using 3-D patches in quasi-3-D video inpainting

• produce a smooth movement

Prediction and Interpolation of Cyclic Motion• Consider the following scenario:

• 1) It’s common for target objects to perform actions in a repeated cycle .

• 2) A stick figure can be used to estimate the relative positions of patches .• (e.g., head, body, and legs).

Prediction and Interpolation of Cyclic Motion• Stick figures and the contours of target objects

can be used to predict repeated cycles.

Prediction and Interpolation of Cyclic Motion•Missing stick figure can be reproduced by:

• 1) Searching for similar reference stick figures in a repeated motion cycle

• 2) Interpolation of two known stick figures

• ALGORITHM: REFERENCE STICK FIGURE SEARCHING

• ALGORITHM: REFERENCE STICK FIGURE SEARCHING

x ………… x+r

r : the number of frames in a repeated cycle

index range function of a given frame number x as idx(x) = [x + r − 2, x + r − 1, x + r, x + r + 1, x + r + 2] [x − r − 2, x − r − 1, x − r, x − r + 1, x − r + 2].∪

x-r

• STICK FIGURE INTERPOLATION

Oa Union of Oa and ObObThinning

result

Motion Interpolation Alogorithm

• extending our image inpainting algorithm for motion interpolation

• consider a video as a 2-D plus time domain

Motion Interpolation Alogorithm

• I3 = Φ3 Ω∪ 3 •Φ3 is a source space•Ω3 is a target space•Φ3 ∩ Ω3= ∅ (an empty set)

• ALGORITHM: PATCH ASSERTION

• Example for patch assertion:

• ALGORITHM: PATCH ASSERTION

Patches on stick figure

Contour ω of (a)(searched in the nearby motion cycle)

Result of patch assertion

• The main algorithm

• Let ∂Ω3 be:

• a front surface on Ω3

• adjacent to Φ3

• ALGORITHM: MOTION INTERPOLATION

3

33

3Source Region

Target Region

• Given a 3-D patch Ψp centered at the point p

• Let Ψp ‘s priority P(p) = C(p) × D(p)


3

3

3

33

3Source Region

Target Region

• C(p) : Confidence term:

• The percentage of useful information inside a patch centered at p

the size of 3-D patch is denoted as |Ψ3| = 27 pixels


• D(p) : Data term

• Compute the percentage of edge pixels in the patch ( Instead of computing the isophote[3] )

var(Ψp ) : the color variation of the patch [21]


3

[3] A. Criminisi, P. Perez, and K. Toyama, “Region filling and object removal by exemplar-based image inpainting,” IEEE Trans. Image Process., vol. 13, no. 9, pp. 1200–1212, Sep. 2004.

[21] T. K. Shih, N. C. Tang,W.-S. Yeh, T.-J. Chen, andW. Lee, “Video inpainting and implant via diversified temporal continuations,” in Proc. 2006 ACM Multimedia Conf., Santa Barbara, CA, Oct. 23–27, 2006, pp. 133–136.

• Example of motion interpolation via inpainting


Inpainting Camera Motions

• Using mechanism proposed in [22]

• Ensures that there is no “ghost shadows” created in the background

• Segment motions into different regions

• The inpainted area in the previous frame needs to be incorporated.

[22] T. K. Shih, N. C. Tang, and J.-N. Hwang, “Ghost shadow removal in multilayeredvideo inpainting,” in proc. IEEE 2007 Int. Conf.Multimedia Expo, Beijing, China, Jul. 2–5, pp. 1471–1474.

Layer Fusion• Need to merge video layers to produce

forged video.

• The fusion process merges an object layer and a background layer.

(With contour of object layer computed based on the object tracking)

•

• ALGORITHM: LAYER FUSION


Object layer obj Dilation area of obj δob j

Corresponding area on background layer

δbkgδob j pixel intensity → υL

Object Background

•


(-2,70)

Histogram of the intensity difference in δob j and δbkg

υL

Intensity Difference

Count

Experimental Results

Experimental Results• Block size used in Patch Assertion:• 10-by-10 pixels

• Patch size used in Motion Interpolation:• 3-by-3-by-3 pixels

• Hardware:• CPU 2.1G with 2G RAM

Evaluation•Motion Layer Segmentation

•Motion Prediction

•Motion Interpolation

• Background Inpainting

• Layer Fusion

Evaluation

Limitations of the Proposed Mechanism• Shadows cannot be tracked precisely.

• Only use intensity to merge layers (without considering the chrominance information)

• A sophisticated 3-D reconstruction mechanism needs to be investigated.

• Only produces actions two times slower for slow motion.

Conclusion

Conclusion• This paper Proposes an interesting technology

to alter the behavior of moving objects in a video

• Effectively extends the inpainting technique to a quasi-3-D space

• Allows a video • to be separated into several layers,• played in different speeds,• and then merged

Conclusion• A series of difficult problems are solved.

• Solutions are successfully integrated.

•Mostly, it is a subjective feeling of how a fake video looks real.

• Necessary to develop an authoring tool to allow the users to specify the spatiotemporal fluctuation property

video motion interpolation for special effect applications timothy k. shih, senior member, ieee,...

Documents