Structure-from-Motion Algorithm to Capture 3D Information from a
Sequence of Video Images
By Rishabh Malhotra
Supervisor: Dr. Kunio Takaya
TRLabs / University of Saskatchewan
New Media
EE 990 Seminar Dec 04, 2003
Outline of the Presentation
• Introduction: Problem Definition
• The “Structure-from-Motion” Algorithm with an illustration
• Conclusion
• Applications
Defining The Problem
• At least 2 views are required.
• 2D information is already available; the third dimension must be found.
• The depth information obtained creates a 3D model only for the visible part, hence it is called a 2.5D model.
• Motion of the object changes the intensity at the same pixels of the object; this change is used to calculate the depth information, leading to the 3D structure.
What is SfM?
Specific computation of 3D geometry (Structure) from given 2D geometry frames (Motion).
For an example –
What is VRML?
• Specification for displaying 3-D objects on the WWW; the 3-D equivalent of HTML.
• Needs a VRML browser or a VRML plug-in to a Web browser, e.g. the Cortona plug-in from Parallel Graphics.
• Produces a hyperspace (or a world), a 3-D space that appears on the display screen.
• The user can figuratively move within this space.
The Scene (Sphere and Spot Light) remains fixed. Only the Camera moves in pure Translational motion showing the different regions of shadow on the Sphere.
Scene moves to LEFT or Camera moves to RIGHT
Scene moves to RIGHT or Camera moves to LEFT
Camera, Spot Light and Sphere are Collinear
This world was developed in VRML (Virtual Reality Modeling Language)
Surface and Depth (of Third Dimension) Estimation
Camera Moves a very small distance to the Right
Different intensity values on the same pixel of the two frames
Conclusion: Surface is Concave and it must be given a lower elevation (third dimension)
Assumptions:
• Relatively large Sphere radius.
• Very small Camera displacement.
The Concept:
The Phong reflection model tells us how light reflects from surfaces. Phong Shading is a form of interpolated shading for approximating curved surfaces. Instead of interpolating intensities, it interpolates the vertex normals.
Why is it needed here?
For Depth estimation at a location of the object.
•Phong Lighting: an empirical model to calculate illumination at a point on a surface.
•Phong Shading: linearly interpolating the surface normal across the facet, applying the Phong lighting model at every pixel (normal-vector interpolation-shading)
Examples of Images made using Phong's Model:
Phong Lighting and Shading Model
Has 3 Components:
• Ambient Light
• Diffuse Reflection
• Specular Reflection
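The Phong equation:
$$I = \frac{1}{a + bd + cd^{2}}\left(k_d L_d\,(\mathbf{l}\cdot\mathbf{n}) + k_s L_s\,(\mathbf{r}\cdot\mathbf{v})^{\gamma}\right) + k_a L_a$$
where $k_a L_a$ is the Ambient Intensity, $k_d L_d\,(\mathbf{l}\cdot\mathbf{n})$ is the Diffuse Reflection, $k_s L_s\,(\mathbf{r}\cdot\mathbf{v})^{\gamma}$ is the Specular Reflection, and $\gamma$ is the Shininess Coefficient.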
As the Shininess Coefficient (γ) is increased, the reflected light is concentrated in a narrower region, centred on the angle of the perfect reflector.
Quadratic Attenuation Term: $1/(a + bd + cd^{2})$
Uses 4 vectors:
• From Source (L)
• To Viewer (V)
• Normal (N)
• Perfect Reflector (R)
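As a minimal sketch of how the three Phong components and the quadratic attenuation term combine, the snippet below evaluates the intensity at a single surface point. It is an illustration under stated assumptions, not the presenter's implementation: the function name `phong_intensity` and its parameter names are hypothetical, the light vector `l` is assumed to point from the surface toward the source, and negative dot products are clamped to zero.

```python
import numpy as np

def normalize(v):
    """Return v scaled to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def phong_intensity(l, n, v, d, k_a, k_d, k_s, L_a, L_d, L_s, gamma,
                    a=1.0, b=0.0, c=0.0):
    """Phong intensity at one surface point (hypothetical helper).

    l: direction toward the light (assumption), n: surface normal,
    v: direction toward the viewer, d: distance to the light.
    """
    l, n, v = normalize(l), normalize(n), normalize(v)
    # Perfect reflector: mirror the light direction about the normal.
    r = 2.0 * np.dot(l, n) * n - l
    # Clamp at zero so light arriving from behind contributes nothing.
    diffuse = k_d * L_d * max(np.dot(l, n), 0.0)
    specular = k_s * L_s * max(np.dot(r, v), 0.0) ** gamma
    attenuation = 1.0 / (a + b * d + c * d ** 2)
    return attenuation * (diffuse + specular) + k_a * L_a

# Viewer slightly off the mirror direction: a larger gamma gives a
# smaller specular contribution, i.e. a narrower highlight.
off_axis = (np.sin(0.3), 0.0, np.cos(0.3))
params = dict(d=1.0, k_a=0.1, k_d=0.5, k_s=0.4, L_a=1.0, L_d=1.0, L_s=1.0)
soft = phong_intensity((0, 0, 1), (0, 0, 1), off_axis, gamma=5, **params)
sharp = phong_intensity((0, 0, 1), (0, 0, 1), off_axis, gamma=50, **params)
```

With these inputs `sharp` comes out smaller than `soft`, which matches the statement that increasing the shininess coefficient concentrates the reflected light around the perfect reflector.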
Phong Lighting and Shading Model
from Akenine-Moller & Haines
Ambient Diffuse Specular
An Example:
The Structure-from-Motion Algorithm
1) Gradient Vector Flow (GVF)
2) Image Segmentation using 2D Wavelets
3) Motion Vector Estimation using Berkeley MPEG Tools
4) Phong Lighting and Shading Model
Step 1: Gradient Vector Flow Calculation
Gradient is a vector quantity and is a 2D first derivative measure of change.
By Definition,
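The gradient of an image $I(x, y)$ of continuous spatial coordinates x and y is
$$G[I(x,y)] = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \dfrac{\partial I(x,y)}{\partial x} \\[2mm] \dfrac{\partial I(x,y)}{\partial y} \end{bmatrix}$$
Hence,
$$\text{Magnitude of } G[I(x,y)] = \sqrt{G_x^{2} + G_y^{2}} \quad \text{and} \quad \text{Direction of } G[I(x,y)] = \tan^{-1}\!\left(\frac{G_y}{G_x}\right)$$
where
$$\frac{\partial I(x,y)}{\partial x} = \frac{I(x+1,y) - I(x-1,y)}{2}, \qquad \frac{\partial I(x,y)}{\partial y} = \frac{I(x,y+1) - I(x,y-1)}{2}$$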
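The gradient definition above, with each partial derivative approximated by a central difference (the difference of the two neighbouring pixels divided by 2), can be sketched in NumPy as follows. This is an illustrative sketch, not the code used for the slides: the function name `image_gradient` is hypothetical, x is assumed to index columns and y rows, border pixels are simply left at zero, and `arctan2` is used in place of a plain inverse tangent so that the direction stays defined when G_x is zero.

```python
import numpy as np

def image_gradient(image):
    """Central-difference gradient of a 2D grayscale image.

    Returns (gx, gy, magnitude, direction); direction is in radians.
    Border pixels are left at zero (an assumption of this sketch).
    """
    I = np.asarray(image, dtype=float)
    gx = np.zeros_like(I)
    gy = np.zeros_like(I)
    # dI/dx ~ (I(x+1, y) - I(x-1, y)) / 2, with x along columns.
    gx[:, 1:-1] = (I[:, 2:] - I[:, :-2]) / 2.0
    # dI/dy ~ (I(x, y+1) - I(x, y-1)) / 2, with y along rows.
    gy[1:-1, :] = (I[2:, :] - I[:-2, :]) / 2.0
    magnitude = np.hypot(gx, gy)
    direction = np.arctan2(gy, gx)
    return gx, gy, magnitude, direction

# Linear ramp I = 3x + 4y: the interior gradient is (3, 4), magnitude 5.
ys, xs = np.mgrid[0:5, 0:5]
gx, gy, mag, ang = image_gradient(3 * xs + 4 * ys)
```

Plotting `gx` and `gy` as arrows at each pixel gives a vector map of the kind shown on the results slide.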
Results using Gradient Vector Flow Calculation
The Simplest Image: A Sphere (an oversimplified image as it has no edges)
Original Image (100 x 100 pixels)
Gradient Vector Flow Map
Zoom In View
Step 2: Image Segmentation using 2D Wavelets
Process of separating objects from the background, as well as from each other by deciding which pixels belong to each object.
Wavelet Transform applied to the vector potential defined in a 2D image.
a) Sub-band filtering applied to the vector potential can produce contour images of different scales.
b) The Mallat or Haar Wavelet is considered for Image Segmentation.
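To make the sub-band idea in the two points above concrete, here is a minimal sketch of one level of a 2D Haar decomposition. It is only an illustration: the slides apply the transform to a vector potential, whereas this sketch accepts any 2D array; the function name `haar2d_level` is hypothetical; and the averaging normalisation (divide by 2) is chosen for readability rather than the orthonormal scaling.

```python
import numpy as np

def haar2d_level(image):
    """One level of a 2D Haar decomposition (averaging normalisation).

    Returns four half-resolution sub-bands (ll, lh, hl, hh).
    Both image dimensions must be even.
    """
    x = np.asarray(image, dtype=float)
    if x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("image dimensions must be even")
    # Filter along each row: average and difference of column pairs.
    lo = (x[:, 0::2] + x[:, 1::2]) / 2.0
    hi = (x[:, 0::2] - x[:, 1::2]) / 2.0
    # Filter along each column: average and difference of row pairs.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0  # coarse approximation
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0  # changes between rows
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0  # changes between columns
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0  # diagonal detail
    return ll, lh, hl, hh

ll, lh, hl, hh = haar2d_level(np.full((4, 4), 7.0))
```

Applying `haar2d_level` again to `ll` gives the next coarser scale, which is one way to obtain detail images at different scales.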
Active contours or snakes are computer-generated curves that move within images to find object boundaries.
GVF Snake Method
This is a GVF field for a U-shaped object. These vectors will pull an active contour toward the boundary of the object.
A GVF snake can start far from the boundary and will converge to boundary concavities.
Active Contour
Image Segmentation using Edge Detection for a more complicated image: Face
Original Image (640 x 480 pixels)
Edge Detection using x-direction Sobel operator (Threshold: 153)
Other Popular Image Segmentation methods include:
• Edge Detection
• Segmentation based on color
• Region Growing and Shrinking
• Clustering
• Morphological Filtering
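The x-direction Sobel edge detection with a threshold, used for the face image above, can be sketched as follows. This is a hedged illustration rather than the tool used for the slide: the function name `sobel_x_edges` is hypothetical, and it is an assumption that the threshold of 153 is compared against the absolute filter response of an image with intensities in the 0 to 255 range. Border pixels are left unmarked.

```python
import numpy as np

# Standard 3x3 Sobel kernel responding to intensity changes along x.
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

def sobel_x_edges(image, threshold=153):
    """Boolean edge map from the x-direction Sobel response."""
    I = np.asarray(image, dtype=float)
    h, w = I.shape
    response = np.zeros((h, w))
    # Accumulate the 3x3 weighted sum for every interior pixel.
    for dy in range(3):
        for dx in range(3):
            response[1:-1, 1:-1] += SOBEL_X[dy, dx] * I[dy:h - 2 + dy, dx:w - 2 + dx]
    return np.abs(response) > threshold

# Dark left half, bright right half: one vertical edge in the middle.
test_image = np.zeros((5, 6))
test_image[:, 3:] = 255
edges = sobel_x_edges(test_image)
```

Because the kernel only differentiates along x, horizontal boundaries are not detected; a y-direction kernel (the transpose of `SOBEL_X`) would be needed for those.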
Step 3: Motion Vector Estimation using Berkeley MPEG Tools
Previous Approaches:
1. Full Search Algorithm (most precise matching, but computational complexity (2w+1)^2)
2. Conjugate Direction Search (complexity is reduced noticeably: 3+2w)
3. Modified Logarithmic Search (efficient and fast: 2+7log(w))
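To get a feel for the three complexity expressions listed above, the short sketch below evaluates them for a given search range w. The function name `search_costs` is hypothetical, and the logarithm is assumed to be base 2, which the slide does not state.

```python
import math

def search_costs(w):
    """Number of candidate positions per block for search range w."""
    return {
        "full_search": (2 * w + 1) ** 2,
        "conjugate_direction": 3 + 2 * w,
        # Base-2 logarithm is an assumption of this sketch.
        "modified_logarithmic": 2 + 7 * math.log2(w),
    }

costs = search_costs(8)
```

For w = 8 the full search examines 289 positions per block, while the conjugate direction and modified logarithmic searches need 19 and 23 under the base-2 assumption.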
What is needed?
-Novel motion vector prediction technique
-A highly localized search pattern
-A computational constraint explicitly incorporated into the cost measure
The Block Matching Algorithm (BMA) partitions the current frame into small, fixed-size blocks and matches them in the previous frame in order to estimate each block's displacement (referred to as a motion vector) between two successive frames.
Block Matching Algorithm
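The block matching idea can be sketched as an exhaustive (full) search that compares one block of the current frame with every candidate position within ±w pixels in the reference frame. This is an illustrative sketch, not the Berkeley MPEG Tools implementation: the function name `block_match`, the sum of absolute differences (SAD) as matching cost, and the default block size and search range are all assumptions.

```python
import numpy as np

def block_match(current, reference, top, left, block=8, w=7):
    """Full-search block matching for one block.

    Returns ((dx, dy), sad): the displacement of the best-matching
    block in the reference frame and its SAD cost.
    """
    cur = np.asarray(current, dtype=float)
    ref = np.asarray(reference, dtype=float)
    target = cur[top:top + block, left:left + block]
    best_sad, best_mv = np.inf, (0, 0)
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            sad = np.abs(target - ref[y:y + block, x:x + block]).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad

# Synthetic frames: the current frame is the reference shifted 2 pixels right,
# so the matching block lies 2 pixels to the left in the reference.
rng = np.random.default_rng(0)
reference = rng.random((32, 32))
current = np.roll(reference, shift=2, axis=1)
mv, sad = block_match(current, reference, top=8, left=12, block=8, w=4)
```

The double loop visits up to (2w+1)^2 candidates per block, which is the full-search cost quoted among the previous approaches and the reason faster search patterns are of interest.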
To find the “best” block from an earlier frame to construct an area of the current frame
Motion Estimation technique – using Block Matching Algorithm
Translate motion-vectors into motion-predictions
Step 3: Motion Vector Estimation using Berkeley MPEG Tools
Frame #1, Frame #2, Frame #3
Motion vector: <2,0>
Is <2,0> valid?
Apply block-matching algorithm to compute motion-vectors
Motion vector: The displacement of the closest matching block in the reference frame (past or future) for a block in the current frame
Conclusion
A 2.5-dimensional figure of the object, similar to a carving in relief, is produced as a result of the series of processing steps.
Applications
1. 3D Model Reconstruction
– 3D Motion Matching
– Camera Calibration
– 3D Vision
2. Stereo Television
– Conversion of ordinary 2D films to a stereo movie to be displayed on a stereo TV.
– Will become available as the next generation Television.
Thank You
Questions ?
Shininess Coefficient and Specular Component
I1 B1 B2 B3 P1 B4 B5 B6 P2 B7 B8 B9 I2
MPEG Encoding
Frame Types
I   Intra                       Encode complete image, similar to JPEG
P   Forward Predicted           Motion relative to previous I and P frames
B   Bidirectionally Predicted   Motion relative to previous and future I and P frames
Step 3: Motion Vector Estimation using Berkeley MPEG Tools