Oct. 18, 2004
Universityof Toronto
Modelling Motion Patterns with Video Epitomes
Machine Learning Group MeetingUniversity of Toronto
Oct. 18, 2004
Vincent Cheung
Probabilistic and Statistical Inference GroupElectrical & Computer Engineering
University of TorontoToronto, Ontario, Canada
Advisor: Dr. Brendan J. Frey
Cheung2 / 18
ML Group Meeting, Oct. 18, 2004
Outline
● Image epitome► What?► Why?► How?
● Implementation computation issues► Efficiently implementing the learning algorithm
● Video epitome► Extension to videos► Filling-in missing information
Cheung3 / 18
ML Group Meeting, Oct. 18, 2004Im
age
Lea
rnin
gV
ide
o
Image Epitome
● Jojic, N., Frey, B., & Kannan, A. (2003). Epitomic analysis of appearance and shape. In Proc. IEEE ICCV.
● Miniature, condensed version of the image
● Models the image’s shape and textural components.
● Applications► object detection► texture segmentation► image retrieval► compression
Cheung4 / 18
ML Group Meeting, Oct. 18, 2004Im
age
Lea
rnin
gV
ide
o
Image Epitome Examples
Cheung5 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Learning the Image Epitome (1)
Cheung6 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Learning the Image Epitome (2)
Epitome
Input image
Training Set
SamplePatches
UnsupervisedLearning
e
Z1 Z2 ZM
…
TMT2T1Bayesiannetwork
e – epitomeTk – mappingZk – image patch
E:
M:
Cheung7 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Xe(i,j)
KK
N
N
Cumsum
2
N
N
Xe(i,j)
KK
N
N
Cumsum
2
N
N
Shifted Cumulative Sum Algorithm
(2, 2), (i+1, j+1)
(1, 2), (i, j+1)
(2, 1), (i+1, j)
– col 1+ col (P+1)
– row 1+ row (P+1)
–
+
+ + pixel (1,1)+ pixel (P+1, P+1)
(1, 1), (i, j)
(2, 2), (i+1, j+1)(2, 2), (i+1, j+1)
(1, 2), (i, j+1)(1, 2), (i, j+1)
(2, 1), (i+1, j)(2, 1), (i+1, j)
– col 1+ col (P+1)
– row 1+ row (P+1)
–
+
+ + pixel (1,1)+ pixel (P+1, P+1)
(1, 1), (i, j)(1, 1), (i, j)
+-
-+
Cheung8 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
X e
P
P
P
P
Ta
Tß
X e
P
P
P
P
Ta
Tß
Collecting Sufficient Statistics
Cheung9 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Extending Epitomes to Videos
● Desire a miniature, condensed version of a video sequence
● Want the epitome to model the basic shape, textural, and motion patterns of the video
● Applications► optic flow► segmentation► texture transfer► layer separation► compression► noise reduction► fill-in / inpainting
Cheung10 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Input Video
Frame 1Frame 2
Frame 3
Training Set
SamplePatches
Video Epitome
UnsupervisedLearning
Video Epitome
Cheung11 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Video Epitome Example
Temporally Compressed
Spatially Compressed
Cheung12 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Video Inpainting (1)
● Fill in missing portions of a video► damaged films► occluding objects
● Reconstruct the missing pixels from the video epitome
Cheung13 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Video Inpainting (2)
Cheung14 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Filling-in Missing Data
Likelihood
Joint
VariationalApprox
E-Step
M-Step
Cheung15 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Missing Channels
● Generalization of the video inpainting problem
● Inpainting► Missing entire pixels
● Missing Channels► Missing one or more of the red, green, or blue (RGB)
components of a given pixel
● Epitome must consolidate multiple patches together to piece together the missing channel information
► No training patch contains all the channel information► Use the epitome to fill-in the missing data
Cheung16 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Image Missing Channels Fill-in
Cheung17 / 18
Imag
eL
earn
ing
Vid
eo
ML Group Meeting, Oct. 18, 2004
Video Missing Channels Fill-in
Cheung18 / 18
ML Group Meeting, Oct. 18, 2004
Conclusion
● Improved the efficiency of learning image epitomes
● Extended the concept of epitomes to video sequences
● Demonstrated the ability of video epitomes to model motion patterns through video inpainting
Cheung19 / 18
ML Group Meeting, Oct. 18, 2004
Future Work
● Determining the size of the epitome► Dependent on the complexity of the image / video
■ Minimum description length■ Variational Bayesian
● Optimal patch size(s)► Problem specific
● Additional transformations into the epitome► Rotation► Scale
● Additional video epitome applications► Super-resolution► Layer separation► Object recognition