video motion interpolation for special effect applications timothy k. shih, senior member, ieee,...
TRANSCRIPT
Video Motion Interpolation for Special Effect Applications
Timothy K. Shih, Senior Member, IEEE, Nick C. Tang, Joseph C. Tsai, and Jenq-Neng Hwang, Fellow, IEEE
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 41, NO. 5, SEPTEMBER 2011
Outline• Introduction• System Overview•Motion Layer Segmentation and Tracking•Motion Interpolation Using Video Inpainting• Experimental Results• Conclusion
Background• Video forgery (video falsifying):
• A technique for generating fake videos by altering, combining, or creating new video contents
• For instance, the outcome of a 100 m race in the olympic game is changed.
Objective• To create a forged video, which is almost
indistinguishable from the original video
• To create special effects in video editing applications
Introduction• To change the content of video, the
following techniques are commonly used:• object tracking• motion interpolation• video inpainting• video layer fusing
Introduction• Contributions of this paper:
• 1) It is the first time that video forgery is attempted based on video inpainting techniques.
• 2) A new concept called guided inpainting for motion interpolation of video objects is proposed.
• 3) A guided quasi-3-D (i.e., X, Y, and time) video inpainting mechanism is proposed.
System Overview• 1) Motion Layer Segmentation: • Separates background and tracked object• 2) Motion Prediction:• Finds Reference Stick-Figure to predict cycle
of motion • 3) Motion Interpolation: • Motion analysis• Patch assertion• Motion completion via inpainting.
System Overview• 4) Background Inpainting: • Inpaints background of different camera
motions• 5) Layer Fusion: • Merges an object layer and a background
layer
Motion Layer Segmentation and Tracking• Separate target objects from the background
• Adopt Mean Shift Feature Space Analysis Algorithm[2] for color region segmentation
[2] D. Comaniciu and P.Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603–619, May 2002.
• ALGORITHM: REFERENCE STICK FIGURE TRACKING
Revised Fast Tracking Mechanism[6]
Bounding Box B1
Frame 1
[6] K. Hariharakrishnan and D. Schonfeld, “Fast object tracking using adaptive blockmatching,” IEEE Trans.Multimedia, vol. 7, no. 5, pp. 853–859, Oct. 2005.
• ALGORITHM: REFERENCE STICK FIGURE TRACKING
Mean Shift Algorithm
B1’
Comparing Color Segments of C0’ and B1’
C1’
Frame 1 Frame 0
• ALGORITHM: REFERENCE STICK FIGURE TRACKING
Frame 0
CoC1*
Comparing corresponding
pixel p
If (p in C1*) - (p in C0) > L2
Exclude p in C1’Else Keep p in C1’
Frame 1
‧Set L2 = 2‧Using L in LUV color space
Motion Segmentation• Different parts of the target may move in
different directions
• Decomposing an object into different regions
• Using revised block searching algorithm[10] to compute motion map
[10] J. Jia, Y.-W. Tai, T.-P.Wu, and C.-K. Tang, “Video repairing under variable illumination using cyclic motions,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 5, pp. 832–839, May 2006.
Motion Segmentation• The Mean Shift color segmentation can also be
revised to deal with motion segmentation:
• Based on blocks, not pixels
• Important for video inpainting
• Ghost shadows can be eliminated
Original Video Frame
Corresponding result of color segmentation by using [2]
The example of tracked object and estimated vectors
Motion Interpolation Using Video Inpainting•Motion Interpolation :
• Motions of the target object need to be interpolated.
• Video Inpainting :
• In order to obtain the interpolated figures
• Motion interpolation may create background holes.
Motion Interpolation of Target Objects• A target object can be segmented into a layer.
•Motion interpolation is required to produce a slow motion of the target layer.
Original OriginalInterpolated Interpolated
tn tn+1 tn+2 tn+3
General Inpainting Strategy
• In order to obtain the interpolated figures
• Using a rule-based thinning algorithm[1] :
• To obain the stick figures of target objects
• Stick figures:
• Used to guide the selection of patches
• Copied from the original video
[1] M. Ahmed and R. Ward, “A rotation invariant rule-based thinning algorithmfor character recognition,” IEEE Trans. Pattern Anal. Mach. Intell.,vol. 24, no. 12, pp. 1672–1678, Dec. 2002.
General Inpainting Strategy
• Quasi-3-D video space (2-D plus tIme)
• Using 3-D patches in quasi-3-D video inpainting
• produce a smooth movement
Prediction and Interpolation of Cyclic Motion• Consider the following scenario:
• 1) It’s common for target objects to perform actions in a repeated cycle .
• 2) A stick figure can be used to estimate the relative positions of patches .• (e.g., head, body, and legs).
Prediction and Interpolation of Cyclic Motion• Stick figures and the contours of target objects
can be used to predict repeated cycles.
Prediction and Interpolation of Cyclic Motion•Missing stick figure can be reproduced by:
• 1) Searching for similar reference stick figures in a repeated motion cycle
• 2) Interpolation of two known stick figures
• ALGORITHM: REFERENCE STICK FIGURE SEARCHING
x ………… x+r
r : the number of frames in a repeated cycle
index range function of a given frame number x as idx(x) = [x + r − 2, x + r − 1, x + r, x + r + 1, x + r + 2] [x − r − 2, x − r − 1, x − r, x − r + 1, x − r + 2].∪
x-r
Motion Interpolation Alogorithm
• extending our image inpainting algorithm for motion interpolation
• consider a video as a 2-D plus time domain
Motion Interpolation Alogorithm
• I3 = Φ3 Ω∪ 3 •Φ3 is a source space•Ω3 is a target space•Φ3 ∩ Ω3= ∅ (an empty set)
• Example for patch assertion:
• ALGORITHM: PATCH ASSERTION
Patches on stick figure
Contour ω of (a)(searched in the nearby motion cycle)
Result of patch assertion
• The main algorithm
• Let ∂Ω3 be:
• a front surface on Ω3
• adjacent to Φ3
• ALGORITHM: MOTION INTERPOLATION
3
33
3Source Region
Target Region
• Given a 3-D patch Ψp centered at the point p
• Let Ψp ‘s priority P(p) = C(p) × D(p)
• ALGORITHM: MOTION INTERPOLATION
3
3
3
33
3Source Region
Target Region
• C(p) : Confidence term:
• The percentage of useful information inside a patch centered at p
the size of 3-D patch is denoted as |Ψ3| = 27 pixels
• ALGORITHM: MOTION INTERPOLATION
• D(p) : Data term
• Compute the percentage of edge pixels in the patch ( Instead of computing the isophote[3] )
var(Ψp ) : the color variation of the patch [21]
• ALGORITHM: MOTION INTERPOLATION
3
[3] A. Criminisi, P. Perez, and K. Toyama, “Region filling and object removal by exemplar-based image inpainting,” IEEE Trans. Image Process., vol. 13, no. 9, pp. 1200–1212, Sep. 2004.
[21] T. K. Shih, N. C. Tang,W.-S. Yeh, T.-J. Chen, andW. Lee, “Video inpainting and implant via diversified temporal continuations,” in Proc. 2006 ACM Multimedia Conf., Santa Barbara, CA, Oct. 23–27, 2006, pp. 133–136.
Inpainting Camera Motions
• Using mechanism proposed in [22]
• Ensures that there is no “ghost shadows” created in the background
• Segment motions into different regions
• The inpainted area in the previous frame needs to be incorporated.
[22] T. K. Shih, N. C. Tang, and J.-N. Hwang, “Ghost shadow removal in multilayeredvideo inpainting,” in proc. IEEE 2007 Int. Conf.Multimedia Expo, Beijing, China, Jul. 2–5, pp. 1471–1474.
Layer Fusion• Need to merge video layers to produce
forged video.
• The fusion process merges an object layer and a background layer.
(With contour of object layer computed based on the object tracking)
• ALGORITHM: LAYER FUSION
Object layer obj Dilation area of obj δob j
Corresponding area on background layer
δbkgδob j pixel intensity → υL
Object Background
•
• ALGORITHM: LAYER FUSION
(-2,70)
Histogram of the intensity difference in δob j and δbkg
υL
Intensity Difference
Count
Experimental Results• Block size used in Patch Assertion:• 10-by-10 pixels
• Patch size used in Motion Interpolation:• 3-by-3-by-3 pixels
• Hardware:• CPU 2.1G with 2G RAM
Evaluation•Motion Layer Segmentation
•Motion Prediction
•Motion Interpolation
• Background Inpainting
• Layer Fusion
Limitations of the Proposed Mechanism• Shadows cannot be tracked precisely.
• Only use intensity to merge layers (without considering the chrominance information)
• A sophisticated 3-D reconstruction mechanism needs to be investigated.
• Only produces actions two times slower for slow motion.
Conclusion• This paper Proposes an interesting technology
to alter the behavior of moving objects in a video
• Effectively extends the inpainting technique to a quasi-3-D space
• Allows a video • to be separated into several layers,• played in different speeds,• and then merged