Layered Scene Representations
Vision for Graphics
CSE 590SS, Winter 2001
Richard Szeliski
2/12/2001 Vision for Graphics 2
Motion representations
How can we describe this scene?
Block-based motion prediction
Break image up into square blocks
Estimate translation for each block
Use this to predict next frame, code difference (MPEG-2)
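A minimal sketch of the three steps above, assuming grayscale frames stored as lists of rows and an exhaustive integer search scored by the sum of absolute differences (SAD). The function names, block size, and search radius are illustrative choices, not details of MPEG-2 itself:

```python
def sad(prev, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of `cur` and a shifted block of `prev`."""
    total = 0
    for y in range(by, by + size):
        for x in range(bx, bx + size):
            total += abs(cur[y][x] - prev[y + dy][x + dx])
    return total


def estimate_block_motion(prev, cur, size=4, radius=2):
    """Return {(bx, by): (dx, dy)}: the best integer translation for each block."""
    h, w = len(cur), len(cur[0])
    motion = {}
    for by in range(0, h - size + 1, size):
        for bx in range(0, w - size + 1, size):
            best = None
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Skip candidates that would read outside the previous frame.
                    if not (0 <= by + dy and by + dy + size <= h
                            and 0 <= bx + dx and bx + dx + size <= w):
                        continue
                    cost = sad(prev, cur, bx, by, dx, dy, size)
                    if best is None or cost < best[0]:
                        best = (cost, dx, dy)
            motion[(bx, by)] = (best[1], best[2])
    return motion


def predict_and_residual(prev, cur, motion, size=4):
    """Predict `cur` from `prev` with the block motions; return (prediction, residual)."""
    h, w = len(cur), len(cur[0])
    pred = [[0] * w for _ in range(h)]
    for (bx, by), (dx, dy) in motion.items():
        for y in range(by, by + size):
            for x in range(bx, bx + size):
                pred[y][x] = prev[y + dy][x + dx]
    resid = [[cur[y][x] - pred[y][x] for x in range(w)] for y in range(h)]
    return pred, resid
```

Only the per-block translations and the residual would need to be coded; the decoder rebuilds the frame as prediction plus residual.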
Layered motion
Break image sequence up into “layers”:
[Figure: image sequence decomposed into layers]
Describe each layer’s motion
Outline
• Why layers?
• 2-D layers [Wang & Adelson 94; Weiss 97]
• 3-D layers [Baker et al. 98]
• Layered Depth Images [Shade et al. 98]
• Transparency [Szeliski et al. 00]
• Bayesian estimation [Torr et al. 99]
Layered motion
Advantages:
• can represent occlusions / disocclusions
• each layer’s motion can be smooth
• video segmentation for semantic processing
Difficulties:
• how do we determine the correct number of layers?
• how do we assign pixels to layers?
• how do we model the motion?
Layers for video summarization
Background modeling (MPEG-4)
Convert masked images into a background sprite for layered video coding
[Figure: masked images combined (+) into a single background sprite (=)]
What are layers?
[Wang & Adelson, 1994]
• intensities
• alphas
• velocities
How do we composite them?
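One common answer is back-to-front alpha blending. The sketch below assumes each layer has already been warped by its velocity into the output frame, that intensities are not premultiplied by alpha, and that the backmost layer is opaque; the slide itself only shows a figure, so these are assumptions rather than the authors' exact procedure:

```python
def composite(layers):
    """Composite layers back to front.

    Each layer is (intensity, alpha): two same-sized 2-D lists, alpha in [0, 1].
    """
    base_intensity, _ = layers[0]
    out = [row[:] for row in base_intensity]  # backmost layer is treated as opaque
    for intensity, alpha in layers[1:]:
        for y, row in enumerate(out):
            for x in range(len(row)):
                a = alpha[y][x]
                # Blend the nearer layer over everything composited so far.
                row[x] = a * intensity[y][x] + (1 - a) * row[x]
    return out
```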
How do we form them?
How do we form them?
How do we estimate the layers?
1. compute coarse-to-fine flow
2. estimate affine motion in blocks (regression)
3. cluster with k-means
4. assign pixels to best fitting affine region
5. re-estimate affine motions in each region…
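Step 4 can be sketched as follows, assuming a dense flow field is available and each affine model is the six-parameter form u = a0 + a1·x + a2·y, v = a3 + a4·x + a5·y. The parameter ordering and function names are illustrative assumptions:

```python
def affine_flow(params, x, y):
    """Flow (u, v) predicted at pixel (x, y) by a six-parameter affine model."""
    a0, a1, a2, a3, a4, a5 = params
    return (a0 + a1 * x + a2 * y, a3 + a4 * x + a5 * y)


def assign_pixels(flow, models):
    """flow[y][x] = (u, v). Returns labels[y][x] = index of the best-fitting model."""
    labels = []
    for y, row in enumerate(flow):
        out = []
        for x, (u, v) in enumerate(row):
            errors = []
            for params in models:
                mu, mv = affine_flow(params, x, y)
                errors.append((u - mu) ** 2 + (v - mv) ** 2)
            out.append(errors.index(min(errors)))
        labels.append(out)
    return labels
```

Step 5 would then refit each model by regression over the pixels that carry its label, and steps 4 and 5 repeat.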
Layer synthesis
For each layer:
• stabilize the sequence with the affine motion
• compute median value at each pixel
Determine occlusion relationships
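The median step might look like this, assuming the frames have already been stabilized and that `None` marks pixels a frame does not cover (a convention chosen for this sketch):

```python
from statistics import median


def median_composite(stabilized_frames):
    """Per-pixel median over stabilized frames; None marks uncovered pixels."""
    h, w = len(stabilized_frames[0]), len(stabilized_frames[0][0])
    layer = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            samples = [f[y][x] for f in stabilized_frames if f[y][x] is not None]
            if samples:
                layer[y][x] = median(samples)
    return layer
```

The median rejects values from frames in which another layer briefly occludes the pixel, as long as the layer is visible in more than half of the samples.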
Results
What if the motion is not affine?
Use a “regularized” (smooth) motion field [Weiss, CVPR’97]
A Layered Approach To Stereo Reconstruction
Simon Baker, Richard Szeliski and P. Anandan
CVPR’98
Volumetric Approaches to Stereo
Examples:
• Disparity-Spaces [Intille and Bobick, ‘94] [Scharstein and Szeliski, ‘96]
• Space-Coloring [Seitz and Dyer, ‘97]
• Maximum-Flow Stereo [Roy and Cox, ‘98]
Advantages:
• Modeling occlusions [Intille and Bobick, ‘94]
• Mixed pixels + transparency [Szeliski and Golland, ‘98]
• Equal treatment of many images [Collins, ‘96]
[Figure: Camera 1 and Camera 2 viewing a volume with x, y, z axes]
2.5-D Layered Approach
Additional advantages over volumetric approaches:
• Fewer degrees of freedom
• Fewer resampling artifacts
• Robustness of global model + local correction
  – cf. “Plane + Parallax” and “Model-Based Stereo”
• Output particularly suitable for certain applications
  – e.g. image-based rendering and interactive editing
[Figure: Camera 1 and Camera 2 viewing Layer 1, Layer 2, and Layer 3]
Layered Stereo
Use arbitrarily oriented sprites
Estimate 3D plane equation for each sprite
[Figure: layers (“sprites”)]
Layer Representation
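World point (relative to the world origin): $\mathbf{x} = (x, y, z, 1)^T$
Plane vector: $\mathbf{n}_l = (n_x, n_y, n_z, n_d)^T$
Plane equation: $\mathbf{n}_l \cdot \mathbf{x} = 0$
Coordinate frame defined by $\mathbf{Q}_l$: $\mathbf{u} = (u, v, 1)^T = \mathbf{Q}_l \mathbf{x}$
Layer sprite: $L_l = (\alpha \cdot r, \alpha \cdot g, \alpha \cdot b, \alpha)$
Residual depth: $Z_l$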
Image Formation
[Figure: scene containing layer $l$, camera $P_k$, image $I_k$, Boolean mask $B_{kl}$, and masked image $M_{kl}$, each drawn with $(u, v)$ image axes]
Overview
Input: images $I_k$ and cameras $P_k$
1. Initialize layer assignment $B_{kl}$
2. Estimate plane vectors $\mathbf{n}_l$
3. Estimate sprite images $L_l$
4. Estimate residual depth $Z_l$
5. Re-assign pixels to layers $B_{kl}$
6. Refine layer sprites $L_l$
Output: $\mathbf{n}_l$, $L_l$, and $Z_l$
Layer Initialization Alternatives
Iterate dominant motion estimation
• e.g. [Irani et al., ‘95]
Apply simple stereo algorithm + fit planes
Color segmentation
• e.g. [Sawhney and Ayer, ‘94]
Human initialization
• e.g. [Debevec et al., ‘96]
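Estimation of Plane Equations
[Figure: cameras $P_i$, $P_j$, $P_k$ view layer $l$; masked images $M_{jl}$ and $M_{kl}$ are warped by the homographies $H^l_{ij}$ and $H^l_{ik}$ and compared with $M_{il}$]
Warped images $H^l_{ij} \circ M_{jl}$, $H^l_{ik} \circ M_{kl}$, … are functions of $\mathbf{n}_l$ only
Minimize image variance using hierarchical gradient descent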
Estimation of Layer Sprites
[Figure: cameras $P_i$, $P_j$, $P_k$ with masked images $M_{il}$, $M_{jl}$, $M_{kl}$ of the plane $\mathbf{n}_l$]
“Blend” the masked images, warped onto the layer plane
Estimation of Residual Depth
Per-pixel residual depth estimation
• plane plus parallax [Anandan et al.]
• model-based stereo [Debevec et al.]
• better accuracy / fidelity
• makes forward warping more difficult
Estimation of Residual Depth
[Figure: cameras $P_i$, $P_j$, $P_k$ with masked images $M_{il}$, $M_{jl}$, $M_{kl}$ and the perturbed plane $\mathbf{n}_l + (0, 0, 0, d)^T$]
Warp masked images onto perturbed plane
Compute variance image
For each pixel, choose $d$ that minimizes variance
Smooth, incorporating confidence weighting [Szeliski & Golland, ‘98]
Recompute sprite using “Plane + Parallax” warp
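The per-pixel choice of d can be sketched as below, assuming the masked images have already been warped onto each perturbed plane by some external routine. The dictionary layout and function names are hypothetical, and the smoothing and confidence weighting steps are omitted:

```python
def variance(samples):
    """Population variance of a list of numbers."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def best_residual_depth(warped_by_d):
    """warped_by_d maps each candidate d to a list of images (one per camera),
    already warped onto the plane perturbed by d.
    Returns, per pixel, the d whose warped images agree best (minimum variance)."""
    candidates = sorted(warped_by_d)
    first = warped_by_d[candidates[0]][0]
    h, w = len(first), len(first[0])
    depth = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            depth[y][x] = min(
                candidates,
                key=lambda d: variance([img[y][x] for img in warped_by_d[d]]),
            )
    return depth
```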
Pixel Assignment
[Figure: camera $P_i$ with masked image $M_{il}$, plane $\mathbf{n}_l$, sprite $L_l$, and the un-warped difference image]
• Warp masked image onto each layer plane
• Compute difference images
• Un-warp difference images
• For each pixel, choose the best difference across layers
• Smooth pixel assignment
Flower Garden Results
[Figure panels: Image 1; Image 9; initial segmentation; grey-coded planar depth]
Flower Garden Results
[Figure panels: recovered sprite without residual depth estimation; recovered sprite with residual depth estimation]
Graphics Symposium Results
[Figure panels: image 1 of 5; initial segmentation; grey-coded planar depth; residual depth]
Graphics Symposium Results
Resulting sprite collection
Graphics Symposium Results
[Figure panels: original image 3; re-synthesized image 3; novel view without residual depth; novel view with residual depth]
Layered Stereo Demo
SpriteViewer: renders sprites with depth
Discussion
Layer initialization:
• Can tolerate bad initial plane estimates
• Residual depth estimation:
  – Plane sweep algorithm, similar to [Szeliski and Golland, ‘98]
Pixel assignment:
• Combine color and residual depth estimates
• Currently under investigation
Summary
New approach to stereo matching:
• represent scene as collection of layers
• each layer has a 3-D plane equation, an alpha-matted color image, and an optional residual depth
• generalizes layered motion to 3-D
Computation:
• plane equations by warping mosaics of masked images
• residual depth by perturbing planes
• iteratively refine color values and pixel assignments
Layered Depth Images
Jonathan Shade, Steven Gortler, Li-wei He, Richard Szeliski
SIGGRAPH’98
How to render a layer + parallax?
Can’t use inverse warping [Laveau 94]
3D Sprites with Depth
3D sprite consists of:
• alpha-matted image $I_1(x_1, y_1)$
• 4×4 camera matrix $C_1$: $[w_1 x_1 \;\; w_1 y_1 \;\; w_1 d_1 \;\; w_1]^T = C_1 [X \;\; Y \;\; Z \;\; 1]^T$
• plane equation $AX + BY + CZ + D = 0$ (forms third row of $C_1$)
• optional per-pixel depth $d_1(x_1, y_1)$
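A small sketch of the camera-matrix equation above. Because the plane equation forms the third row of the matrix, any world point on the sprite's plane projects with d = 0, and points off the plane get a nonzero displacement. The function name and the example matrix are illustrative assumptions:

```python
def project(C, X, Y, Z):
    """Apply a 4x4 sprite camera matrix to a world point.

    Returns (x, y, d): sprite coordinates and out-of-plane displacement,
    after dividing the homogeneous result by w.
    """
    p = (X, Y, Z, 1.0)
    wx, wy, wd, w = (sum(C[r][c] * p[c] for c in range(4)) for r in range(4))
    return wx / w, wy / w, wd / w


# Hypothetical camera whose third row encodes the plane Z - 2 = 0.
C1 = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, -2],
    [0, 0, 0, 1],
]
on_plane = project(C1, 3, 4, 2)   # d component is 0: the point lies on the plane
off_plane = project(C1, 3, 4, 5)  # d component is 3: the point is off the plane
```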
Sprites with Depth
Store $d_1(x_1, y_1)$ (scaled displacement) along with each sprite image $I_1(x_1, y_1)$
[Figure: sprite images $I_1$ shown with their displacement maps $d_1$]
3D Sprites — Reprojection
[Figure: sprites → new view]
use standard texture mapping (projective warp)
Forward Mapping
Mapping equation with per-pixel depth $d_1$:
$[w_2 x_2 \;\; w_2 y_2 \;\; w_2]^T = H_{1,2} [x_1 \;\; y_1 \;\; 1]^T + d_1 e_{1,2}$
[Figure: $I_1$, $d_1$, and the forward-mapped result $I_2$]
Problems: gaps and aliasing
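A sketch of forward mapping under this equation, assuming a 3×3 homography `H`, a 3-vector `e`, nearest-pixel splatting, and no z-buffer (all simplifications made for this sketch). Pixels that receive no source sample stay `None`, which is exactly the gap problem noted above:

```python
def forward_map(H, e, x1, y1, d1):
    """Map pixel (x1, y1) with displacement d1 from image 1 into image 2."""
    p = [H[r][0] * x1 + H[r][1] * y1 + H[r][2] + d1 * e[r] for r in range(3)]
    return p[0] / p[2], p[1] / p[2]


def forward_warp(image, depth, H, e):
    """Splat every source pixel to its nearest destination pixel."""
    h, w = len(image), len(image[0])
    out = [[None] * w for _ in range(h)]
    for y1 in range(h):
        for x1 in range(w):
            x2, y2 = forward_map(H, e, x1, y1, depth[y1][x1])
            xi, yi = round(x2), round(y2)
            if 0 <= xi < w and 0 <= yi < h:
                out[yi][xi] = image[y1][x1]  # last write wins; no depth test
    return out
```

For example, with an identity `H` and `e = (1, 0, 0)`, a pixel with `d1 = 2` moves two pixels to the right and leaves a hole at its old position.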
Inverse Mapping
Reverse order of images 1 & 2:
$[w_1 x_1 \;\; w_1 y_1 \;\; w_1]^T = H_{2,1} [x_2 \;\; y_2 \;\; 1]^T + d_2 e_{2,1}$
[Figure: $I_1$, the unknown $d_2$, and $I_2$]
Problem: we don’t know d2!
Crude perspective map
How to map $d_1 \to d_2$?
Simple idea: use perspective transform $H_{2,1}$
[Figure: $I_1$, $d_1$, $d_2$, $I_2$]
Works well for small amounts of motion
Better forward map
How to map $d_1 \to d_2$?
Better idea: use the full forward map $H_{1,2} x_1 + d_1 e_{1,2}$
[Figure: $I_1$, $d_1$, $d_2$, $I_2$]
Works better for moderate amounts of motion
2-pass Mapping
Why is 2-pass mapping ($d_1 \to d_2$ forward followed by $I_1 \to I_2$ backward) a good idea?
• can tolerate bigger errors in $d_1$ mapping (since $d_1$ is typically smooth)
• can store/process $d_1$ at lower resolution
• can use better filtering on color image
Sprites with Depth — Demo
Demo
Refinements
Only forward map $d_1$ with parallax component
Use affine approximation to parallax flow
Better gap filling
Forward map $(u, v)$ flow instead of $d_1$ depth
Layered Depth Images (LDIs)
Store multiple (color, z) values at each pixel
Similar to [sparse] volumetric representation
Render with forward warp (splat)
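A minimal sketch of the storage side of an LDI, assuming each pixel keeps a list of (color, z) samples ordered front to back. The class and method names are hypothetical, and rendering by forward warping is left out:

```python
class LayeredDepthImage:
    """Each pixel holds zero or more (color, z) samples, nearest first."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.pixels = [[[] for _ in range(width)] for _ in range(height)]

    def add_sample(self, x, y, color, z):
        """Insert a sample and keep the list sorted by increasing depth."""
        samples = self.pixels[y][x]
        samples.append((color, z))
        samples.sort(key=lambda s: s[1])

    def samples(self, x, y):
        """All (color, z) samples along the ray through pixel (x, y)."""
        return list(self.pixels[y][x])

    def front_color(self, x, y):
        """Color of the nearest sample, or None for an empty pixel."""
        samples = self.pixels[y][x]
        return samples[0][0] if samples else None
```

Unlike a dense voxel grid, only occupied depths are stored, which is why the representation resembles a sparse volume.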
Layer extraction from multiple images containing reflections and transparency
Richard Szeliski, Shai Avidan, P. Anandan
CVPR’2000
Transparent motion
Photograph (Lee) and reflection (Michael)
Previous work
Physics-based vision and polarization [Shafer et al.; Wolff; Nayar et al.]
Perception of transparency [Adelson…]
Transparent motion estimation [Shizawa & Mase; Bergen et al.; Irani et al.; Darrell & Simoncelli]
3-frame layer recovery [Bergen et al.]
Problem formulation
[Figure: each input image $i$ is modeled as $\mathrm{Motion}_{X,i}(X) + \mathrm{Motion}_{Y,i}(Y)$ for two layers $X$ and $Y$]
Image formation model
Pure additive mixing of positive signals
$m_k(x) = \sum_l W_{kl} f_l(x)$
or
$m_k = \sum_l W_{kl} f_l$
Assume motion is planar (perspective transform, aka homography)
Two processing stages
Estimate the motions and initial layer estimates
Compute optimal layer estimates (for known motion)
Dominant motion estimation
Stabilize sequence by dominant motion
robust affine [Bergen et al. 92; Szeliski & Shum]
Dominant layer estimate
How do we form composite (estimate)?
[Plot: intensity vs. time]
Average?
[Plot: intensity vs. time]
Median?
Hint: all layers are non-negative
[Plot: intensity vs. time]
Min-composite
Smallest value is over-estimate of layer
[Plot: intensity vs. time]
Difference sequence
Subtract min-composite from original image
[Figure: original − min-composite = difference image]
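The min-composite and the difference sequence can be sketched as follows, assuming the frames are already stabilized on the dominant motion and stored as lists of rows. The function names are illustrative:

```python
def min_composite(stabilized):
    """Per-pixel minimum over the stabilized frames."""
    h, w = len(stabilized[0]), len(stabilized[0][0])
    return [[min(f[y][x] for f in stabilized) for x in range(w)] for y in range(h)]


def difference_sequence(stabilized, composite):
    """Subtract the composite from every frame; every result is non-negative."""
    return [
        [[v - c for v, c in zip(frame_row, comp_row)]
         for frame_row, comp_row in zip(frame, composite)]
        for frame in stabilized
    ]
```

Because every layer is non-negative, each frame is at least as bright as the dominant layer alone, so the minimum over time can only over-estimate that layer, and the differences can only under-estimate what is left.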
Min composite
[Plot: intensity vs. time (overestimate of background layer)]
Difference sequence
[Plot: intensity vs. time (underestimate of foreground layer)]
Stabilizing secondary motion
[Plot: intensity vs. time]
How do we form composite (estimate)?
Max-composite
[Plot: intensity vs. time]
Largest value is under-estimate of layer
Min-max alternation
Subtract secondary layer (under-estimate) from original sequence
Re-compute dominant motion and better min-composite
Iterate …
Does this process converge?
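A toy 1-D sketch of the alternation, assuming the motions are known, the background is static (frames already stabilized on it), and the foreground moves by a cyclic shift. Motion re-estimation is omitted, and the names are hypothetical:

```python
def shift(signal, k):
    """Cyclic shift right by k samples (stand-in for the secondary layer's motion)."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def min_max_alternation(frames, shifts, rounds=3):
    """frames[k] = background + shift(foreground, shifts[k]).

    Returns (bg_est, fg_est): an over-estimate of the background and an
    under-estimate of the foreground, tightened on every round.
    """
    n = len(frames[0])
    fg_est = [0] * n
    bg_est = [0] * n
    for _ in range(rounds):
        # Subtract the current foreground under-estimate, then min-composite.
        cleaned = [[m - f for m, f in zip(frame, shift(fg_est, s))]
                   for frame, s in zip(frames, shifts)]
        bg_est = [min(c[i] for c in cleaned) for i in range(n)]
        # Difference images, stabilized on the foreground motion, then max-composite.
        diffs = [shift([m - b for m, b in zip(frame, bg_est)], -s)
                 for frame, s in zip(frames, shifts)]
        fg_est = [max(d[i] for d in diffs) for i in range(n)]
    return bg_est, fg_est
```

With exact integer shifts and noise-free data, as in this toy setting, the bounds never cross the true layers.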
Min-max alternation
Does this process converge?
• in theory: yes
• each iteration reduces number of mis-estimated pixels (tightens the bounds); proof in paper
Min-max alternation
Does this process converge?
• in practice: no
• resampling errors and noise both lead to divergence; discussion in paper
[Figure panels: resampling error; noisy]
Two processing stages
Estimate the motions and initial layer estimates
Compute optimal layer estimates (for known motion)
Optimal estimation
Recall: additive mixing of positive signals
$m_k = \sum_l W_{kl} f_l$
Use constrained least squares (quadratic programming)
$\min \sum_k \left| \sum_l W_{kl} f_l - m_k \right|^2 \quad \text{s.t. } f_l \ge 0$
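A toy sketch of the constrained problem, using projected gradient descent in place of a general quadratic programming solver (a swapped-in technique chosen for brevity). It assumes two 1-D layers, a static first layer, and a second layer moved by known cyclic shifts; all names are hypothetical:

```python
def shift(signal, k):
    """Cyclic shift right by k samples (the known warp of the second layer)."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def residuals(f0, f1, frames, shifts):
    """Per-frame residual: f0 + shift(f1, s_k) - m_k."""
    return [[a + b - m for a, b, m in zip(f0, shift(f1, s), frame)]
            for frame, s in zip(frames, shifts)]


def cost(f0, f1, frames, shifts):
    """Sum of squared residuals over all frames."""
    return sum(r * r for res in residuals(f0, f1, frames, shifts) for r in res)


def constrained_ls(frames, shifts, iters=1000):
    """Minimize the cost subject to f0 >= 0 and f1 >= 0."""
    n, num_frames = len(frames[0]), len(frames)
    f0, f1 = [0.0] * n, [0.0] * n
    step = 1.0 / (2 * num_frames)  # conservative step for this two-layer model
    for _ in range(iters):
        res = residuals(f0, f1, frames, shifts)
        g0 = [sum(r[i] for r in res) for i in range(n)]
        back = [shift(r, -s) for r, s in zip(res, shifts)]  # transpose of the warp
        g1 = [sum(b[i] for b in back) for i in range(n)]
        # Gradient step followed by projection onto the non-negative orthant.
        f0 = [max(0.0, a - step * g) for a, g in zip(f0, g0)]
        f1 = [max(0.0, a - step * g) for a, g in zip(f1, g1)]
    return f0, f1
```

The recovered layers reproduce the inputs but need not equal the layers that generated them, since a constant offset can move between layers without changing any frame.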
Least squares example
[Figure panels: background; foreground. Blue: least squares; red: constrained LS]
Uniqueness of solution
If any layer does not have a “black” region, i.e., if $f_l \ge c$ for some constant $c > 0$, then this offset can be added to another layer (and subtracted from $f_l$)
[Figure panels: background; foreground]
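A short worked version of this argument for two layers, assuming each warp $W_{kl}$ maps a constant image to the same constant (reasonable for interpolating resamplers away from image borders; this assumption is added here and is not stated on the slide):

$$
\begin{aligned}
W_{k1}(f_1 - c) + W_{k2}(f_2 + c)
  &= W_{k1} f_1 + W_{k2} f_2 - W_{k1} c + W_{k2} c \\
  &= W_{k1} f_1 + W_{k2} f_2 - c + c \\
  &= m_k .
\end{aligned}
$$

So if $f_1 \ge c > 0$ everywhere, both $(f_1, f_2)$ and $(f_1 - c,\, f_2 + c)$ are non-negative and reproduce every input $m_k$, and the least-squares problem cannot tell them apart.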
Degeneracies in solution
If motion is degenerate (e.g., horizontal), regions (scanlines) decouple (w/o MRF)
[Figure panels: mixed; recovered; scaled errors]
Noise sensitivity
In general, low-frequency components hard to recover for small motions
[Figure panels: mixed; recovered; scaled errors]
Three-layer example
Three layers with general motion work well
[Figure: input = layer 1 + layer 2 + layer 3]
Complete algorithm
Dominant motion with min-composites
Difference (residual) images
Non-dominant motion on differences
Improve the motion estimates
Unconstrained least-squares problem
Constrained least-squares problem
Complete example
[Figure panels: original; stabilized; min-composite]
Complete example
[Figure panels: difference; stabilized; max-composite]
Final Results
[Figure: input image = layer 1 + layer 2]
Another example
[Figure panels: original; stabilized; min-comp.; resid. 2]
Results: Anne and books
[Figure: original = background + foreground (photo)]
Transparent layer recovery
Pure (additive) mixing of intensities
• simple constrained least squares problem
• degeneracies for simple or small motions
Processing stages
• dominant motion estimation
• min- and max-composites to initialize
• optimization of motion and layers
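The constrained least-squares step can be sketched as follows, assuming the motions are already known and fixed, 1-D signals, and circular shifts in place of image warps. The solver here is plain projected gradient descent started from zeros, chosen for brevity; it is an assumption of this sketch and not necessarily the optimizer or the composite-based initialization used in the lecture. All function names and the toy data are hypothetical.

```python
def shift(signal, k):
    """Circularly shift a 1-D signal right by k samples (stand-in for a warp)."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def predict(layer1, layer2, u, v):
    """Additive mixing: a frame is the sum of the two shifted layers."""
    return [a + b for a, b in zip(shift(layer1, u), shift(layer2, v))]


def recover_layers(frames, motions1, motions2, iterations=5000):
    """Minimize the summed squared prediction error subject to layers >= 0."""
    n = len(frames[0])
    layer1 = [0.0] * n
    layer2 = [0.0] * n
    # Each frame contributes two shift operators, so 4 * (number of frames)
    # bounds the Lipschitz constant of the gradient; its inverse is a safe step.
    step = 1.0 / (4 * len(frames))
    for _ in range(iterations):
        grad1 = [0.0] * n
        grad2 = [0.0] * n
        for frame, u, v in zip(frames, motions1, motions2):
            error = [p - f for p, f in zip(predict(layer1, layer2, u, v), frame)]
            # Shifting the error back applies the transpose of each warp.
            for i, e in enumerate(shift(error, -u)):
                grad1[i] += 2.0 * e
            for i, e in enumerate(shift(error, -v)):
                grad2[i] += 2.0 * e
        # Gradient step followed by projection onto the non-negative orthant.
        layer1 = [max(0.0, x - step * g) for x, g in zip(layer1, grad1)]
        layer2 = [max(0.0, x - step * g) for x, g in zip(layer2, grad2)]
    return layer1, layer2


# Toy data (hypothetical): synthesize frames from known layers and motions.
true1 = [5, 1, 4, 2, 3, 6]
true2 = [0, 2, 0, 0, 1, 0]
motions1 = [0, 1, 2]
motions2 = [0, 3, 5]
frames = [predict(true1, true2, u, v) for u, v in zip(motions1, motions2)]

est1, est2 = recover_layers(frames, motions1, motions2)
worst_error = max(
    abs(p - f)
    for frame, u, v in zip(frames, motions1, motions2)
    for p, f in zip(predict(est1, est2, u, v), frame)
)
```

The recovered layers reproduce the frames but need not match `true1` and `true2`: moving a constant from one layer to the other leaves every frame unchanged, which is one simple instance of the degeneracies listed above. The non-negativity constraint narrows that ambiguity without removing it.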
![Page 84: Layered Scene Representations Vision for Graphics CSE 590SS, Winter 2001 Richard Szeliski](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1ace7f8b9ab059970a2e/html5/thumbnails/84.jpg)
Future work
Mitigating degeneracies (regularization)
Opaque layers (α estimation)
Non-planar geometry (parallax)
![Page 85: Layered Scene Representations Vision for Graphics CSE 590SS, Winter 2001 Richard Szeliski](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1ace7f8b9ab059970a2e/html5/thumbnails/85.jpg)
Bibliography
J. Y. A. Wang and E. H. Adelson. Representing moving images with layers. IEEE Transactions on Image Processing, 3(5):625--638, September 1994.
Y. Weiss and E. H. Adelson. A unified mixture framework for motion segmentation: Incorporating spatial coherence and estimating the number of models. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'96), pages 321--326, San Francisco, California, June 1996.
Y. Weiss. Smoothness in layers: Motion segmentation using nonparametric mixture estimation. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'97), pages 520--526, San Juan, Puerto Rico, June 1997.
P. R. Hsu, P. Anandan, and S. Peleg. Accurate computation of optical flow by using layered motion representations. In Twelfth International Conference on Pattern Recognition (ICPR'94), pages 743--746, Jerusalem, Israel, October 1994. IEEE Computer Society Press.
![Page 86: Layered Scene Representations Vision for Graphics CSE 590SS, Winter 2001 Richard Szeliski](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1ace7f8b9ab059970a2e/html5/thumbnails/86.jpg)
Bibliography
T. Darrell and A. Pentland. Cooperative robust estimation using layers of support. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(5):474--487, May 1995.
S. X. Ju, M. J. Black, and A. D. Jepson. Skin and bones: Multi-layer, locally affine, optical flow and regularization with transparency. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'96), pages 307--314, San Francisco, California, June 1996.
M. Irani, B. Rousso, and S. Peleg. Computing occluding and transparent motions. International Journal of Computer Vision, 12(1):5--16, January 1994.
H. S. Sawhney and S. Ayer. Compact representation of videos through dominant multiple motion estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8):814--830, August 1996.
M.-C. Lee et al. A layered video object coding system using sprite and affine motion model. IEEE Transactions on Circuits and Systems for Video Technology, 7(1):130--145, February 1997.
![Page 87: Layered Scene Representations Vision for Graphics CSE 590SS, Winter 2001 Richard Szeliski](https://reader035.vdocuments.net/reader035/viewer/2022062905/5a4d1ace7f8b9ab059970a2e/html5/thumbnails/87.jpg)
Bibliography
S. Baker, R. Szeliski, and P. Anandan. A layered approach to stereo reconstruction. In IEEE CVPR'98, pages 434--441, Santa Barbara, June 1998.
R. Szeliski, S. Avidan, and P. Anandan. Layer extraction from multiple images containing reflections and transparency. In IEEE CVPR'2000, volume 1, pages 246--253, Hilton Head Island, June 2000.
J. Shade, S. Gortler, L.-W. He, and R. Szeliski. Layered depth images. In Computer Graphics (SIGGRAPH'98) Proceedings, pages 231--242, Orlando, July 1998. ACM SIGGRAPH.
S. Laveau and O. D. Faugeras. 3-d scene representation as a collection of images. In Twelfth International Conference on Pattern Recognition (ICPR'94), volume A, pages 689--691, Jerusalem, Israel, October 1994. IEEE Computer Society Press.
P. H. S. Torr, R. Szeliski, and P. Anandan. An integrated Bayesian approach to layer extraction from image sequences. In Seventh International Conference on Computer Vision (ICCV'99), pages 983--990, Kerkyra, Greece, September 1999.