
Structure-from-Motion Algorithm to Capture 3D Information from a Sequence of Video Images

By Rishabh Malhotra

Supervisor: Dr. Kunio Takaya

TRLabs / University of Saskatchewan

New Media

EE 990 Seminar Dec 04, 2003

http://www.trlabs.ca/

Outline of the Presentation

• Introduction: Problem Definition
• The “Structure-from-Motion” Algorithm, with an illustration
• Conclusion
• Applications


Defining The Problem

• At least 2 views are required.
• The 2D information is already available; the third dimension must be found.
• The depth information obtained creates a 3D model of the visible part only, hence it is called a 2.5D model.
• Motion of the object changes the intensity at the same pixels of the object. This change is used to calculate the depth information, leading to the 3D structure.

What is SfM?

The computation of 3D geometry (Structure) from a given sequence of 2D frames taken under motion (Motion).


For an example –

What is VRML?
• Specification for displaying 3-D objects on the WWW; the 3-D equivalent of HTML.
• Needs a VRML browser or a VRML plug-in for a Web browser, e.g. the Cortona plug-in from Parallel Graphics.
• Produces a hyperspace (or a world): a 3-D space that appears on the display screen.
• The viewer can figuratively move within this space.

The Scene (Sphere and Spot Light) remains fixed. Only the Camera moves in pure Translational motion showing the different regions of shadow on the Sphere.

Scene moves to LEFT, or Camera moves to RIGHT

Scene moves to RIGHT, or Camera moves to LEFT

Camera, Spot Light and Sphere are Collinear

This world was developed in VRML (Virtual Reality Modeling Language)


Surface and Depth (of Third Dimension) Estimation


Camera Moves a very small distance to the Right

Different intensity values on the same pixel of the two frames

Conclusion: Surface is Concave and it must be given a lower elevation (third dimension)

Assumptions:
• Relatively large Sphere radius.
• Very small Camera displacement.

The Concept:

The Phong reflection model tells us how light reflects from surfaces. Phong Shading is a form of interpolated shading for approximating curved surfaces. Instead of interpolating intensities, it interpolates the vertex normals.

Why is it needed here?

For Depth estimation at a location of the object.

• Phong Lighting: an empirical model to calculate illumination at a point on a surface.

• Phong Shading: linearly interpolating the surface normal across the facet and applying the Phong lighting model at every pixel (normal-vector interpolation shading).

Examples of Images made using Phong's Model:

Phong Lighting and Shading Model


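Has 3 Components:
• Ambient Light
• Diffuse Reflection
• Specular Reflection

The Phong equation:

$$ I = \frac{1}{a + bd + cd^{2}} \left( k_d L_d \, (l \cdot n) + k_s L_s \, (r \cdot v)^{\gamma} \right) + k_a L_a $$

• $k_a L_a$: Ambient Intensity
• $k_d L_d \, (l \cdot n)$: Diffuse Reflection
• $k_s L_s \, (r \cdot v)^{\gamma}$: Specular Reflection
• $\gamma$: Shininess Coefficient
• $1 / (a + bd + cd^{2})$: Quadratic Attenuation Term, where d is the distance from the light source

As the Shininess Coefficient (γ) is increased, the reflected light is concentrated in a narrower region, centred on the angle of the perfect reflector.

Uses 4 vectors:
• From Source (l)
• To Viewer (v)
• Normal (n)
• Perfect Reflector (r)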
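The Phong equation can be evaluated at a single surface point with a few lines of NumPy. This is only a minimal sketch: the function name `phong_intensity`, the default coefficient values, and the clamping of negative dot products to zero are assumptions made for illustration and are not taken from the slides. The perfect reflector r is computed from l and n with the standard mirror formula r = 2(l·n)n − l.

```python
import numpy as np

def normalize(vec):
    """Return vec scaled to unit length."""
    vec = np.asarray(vec, dtype=float)
    return vec / np.linalg.norm(vec)

def phong_intensity(n, l, v, d,
                    k_a=0.1, k_d=0.6, k_s=0.3,      # illustrative reflection coefficients
                    L_a=1.0, L_d=1.0, L_s=1.0,      # illustrative light intensities
                    gamma=10.0,                     # shininess coefficient
                    a=1.0, b=0.0, c=0.0):           # attenuation constants
    """Evaluate I = (k_d L_d (l.n) + k_s L_s (r.v)^gamma) / (a + b d + c d^2) + k_a L_a."""
    n, l, v = normalize(n), normalize(l), normalize(v)
    r = 2.0 * np.dot(l, n) * n - l                  # perfect reflector of l about n
    # Clamping to zero (an assumption) avoids negative light on back-facing points.
    diffuse = k_d * L_d * max(np.dot(l, n), 0.0)
    specular = k_s * L_s * max(np.dot(r, v), 0.0) ** gamma
    attenuation = 1.0 / (a + b * d + c * d * d)
    return attenuation * (diffuse + specular) + k_a * L_a

# Viewer slightly off the mirror direction: a larger gamma narrows the highlight,
# so the specular contribution (and the total intensity) drops.
low_gamma = phong_intensity([0, 0, 1], [0, 0, 1], [0.3, 0, 1], d=1.0, gamma=5.0)
high_gamma = phong_intensity([0, 0, 1], [0, 0, 1], [0.3, 0, 1], d=1.0, gamma=50.0)
```

Comparing `low_gamma` and `high_gamma` illustrates the statement about the Shininess Coefficient: the same off-axis viewer receives less specular light as γ grows.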

Phong Lighting and Shading Model

An Example (from Akenine-Moller & Haines): the Ambient, Diffuse and Specular components.

The Structure-from-Motion Algorithm

1) Gradient Vector Flow (GVF)

2) Image Segmentation using 2D Wavelets

3) Motion Vector Estimation using Berkeley MPEG Tools

4) Phong Lighting and Shading Model


Step 1: Gradient Vector Flow Calculation

The gradient is a vector quantity: a 2D first-derivative measure of change.

By Definition,

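The gradient of an image $I(x,y)$ of continuous spatial coordinates x and y is

$$ G[I(x,y)] = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \dfrac{\partial I(x,y)}{\partial x} \\[2mm] \dfrac{\partial I(x,y)}{\partial y} \end{bmatrix} $$

Hence,

Magnitude of $G[I(x,y)] = \sqrt{G_x^{2} + G_y^{2}}$ and Direction of $G[I(x,y)] = \tan^{-1}\left( \dfrac{G_y}{G_x} \right)$

where

$$ \frac{\partial I(x,y)}{\partial x} \approx \frac{I(x+1,y) - I(x-1,y)}{2}, \qquad \frac{\partial I(x,y)}{\partial y} \approx \frac{I(x,y+1) - I(x,y-1)}{2} $$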
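The central-difference formulas above translate directly into array slicing. The sketch below is an illustration, not the code used for the results in this presentation: the function name `central_gradient` is hypothetical, x is assumed to index columns and y rows, border pixels are simply left at zero, and `arctan2` is used in place of a plain inverse tangent so that the direction keeps its quadrant and G_x = 0 causes no division by zero.

```python
import numpy as np

def central_gradient(image):
    """Return (Gx, Gy, magnitude, direction) using central differences.

    Assumes image[y, x] indexing; border pixels are left at zero.
    """
    I = np.asarray(image, dtype=float)
    Gx = np.zeros_like(I)
    Gy = np.zeros_like(I)
    Gx[:, 1:-1] = (I[:, 2:] - I[:, :-2]) / 2.0   # (I(x+1,y) - I(x-1,y)) / 2
    Gy[1:-1, :] = (I[2:, :] - I[:-2, :]) / 2.0   # (I(x,y+1) - I(x,y-1)) / 2
    magnitude = np.hypot(Gx, Gy)                 # sqrt(Gx^2 + Gy^2)
    direction = np.arctan2(Gy, Gx)               # quadrant-aware tan^-1(Gy / Gx)
    return Gx, Gy, magnitude, direction

# A linear ramp I(x, y) = 3x + 2y has a constant gradient (3, 2) in the interior.
ys, xs = np.mgrid[0:6, 0:6]
Gx, Gy, mag, ang = central_gradient(3 * xs + 2 * ys)
```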

Results using Gradient Vector Flow Calculation

The Simplest Image: A Sphere (an oversimplified image as it has no edges)

[Figure panels: Original Image (100 x 100 pixels); Gradient Vector Flow Map; Zoom In View]

Step 2: Image Segmentation using 2D Wavelets

Process of separating objects from the background, as well as from each other by deciding which pixels belong to each object.

Wavelet Transform applied to the vector potential defined in a 2D image.

a) Sub-band filtering applied to the vector potential can produce contour images of different scales.

b) The Mallat or Haar Wavelet is considered for Image Segmentation.
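Sub-band filtering with the Haar wavelet can be sketched as one level of a 2D decomposition. The code below is an illustrative assumption rather than the method used in this work: it operates on a plain intensity array instead of the vector potential, uses the simple averaging/differencing form of the Haar filters (not the orthonormal scaling), and the name `haar2d_level` is hypothetical.

```python
import numpy as np

def haar2d_level(image):
    """One level of a 2D Haar decomposition (averaging/differencing form).

    Returns four half-resolution sub-bands (LL, LH, HL, HH).
    Assumes image[y, x] indexing and even height and width.
    """
    I = np.asarray(image, dtype=float)
    if I.shape[0] % 2 or I.shape[1] % 2:
        raise ValueError("image height and width must be even")
    # Filter along x: low-pass = pair average, high-pass = pair difference.
    lo = (I[:, 0::2] + I[:, 1::2]) / 2.0
    hi = (I[:, 0::2] - I[:, 1::2]) / 2.0
    # Filter along y on both results.
    LL = (lo[0::2, :] + lo[1::2, :]) / 2.0   # coarse approximation
    LH = (lo[0::2, :] - lo[1::2, :]) / 2.0   # intensity changes along y
    HL = (hi[0::2, :] + hi[1::2, :]) / 2.0   # intensity changes along x
    HH = (hi[0::2, :] - hi[1::2, :]) / 2.0   # diagonal detail
    return LL, LH, HL, HH

# A 2 x 2 patch whose intensity steps from 0 to 1 along x.
LL, LH, HL, HH = haar2d_level([[0, 1], [0, 1]])
```

Applying `haar2d_level` again to the returned `LL` band gives the next, coarser scale, which is how contour images of different scales could be obtained.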

Active Contour: GVF Snake Method

Active contours, or snakes, are computer-generated curves that move within images to find object boundaries.

[Figure: a GVF field for a U-shaped object. These vectors will pull an active contour toward the boundary of the object.]

A GVF snake can start far from the boundary and will converge to boundary concavities.

Image Segmentation using Edge Detection for a more complicated image: Face

[Figure panels: Original Image (640 x 480 pixels); Edge Detection using the x-direction Sobel operator (Threshold: 153)]
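Edge detection with the x-direction Sobel operator can be sketched as follows. The 3 x 3 kernel is the standard Sobel x kernel; everything else is an assumption for illustration: the name `sobel_x_edges` is hypothetical, the threshold of 153 is applied to the absolute value of the raw filter response (the slides do not say whether the response was rescaled first), and border pixels are never marked as edges.

```python
import numpy as np

# Standard x-direction Sobel kernel: responds to intensity changes along x.
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

def sobel_x_edges(image, threshold=153.0):
    """Return a boolean edge map: |Sobel-x response| > threshold.

    Assumes a 2D grayscale image indexed as image[y, x].
    """
    I = np.asarray(image, dtype=float)
    h, w = I.shape
    response = np.zeros((h, w))
    # Accumulate the 3 x 3 weighted sum for every interior pixel.
    for ky in range(3):
        for kx in range(3):
            response[1:-1, 1:-1] += SOBEL_X[ky, kx] * I[ky:h - 2 + ky, kx:w - 2 + kx]
    return np.abs(response) > threshold

# Test image: left half black (0), right half white (255).
step = np.zeros((5, 6))
step[:, 3:] = 255.0
edges = sobel_x_edges(step)
```

In this test image only the two columns adjacent to the black/white step are marked in the interior rows.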

Other Popular Image Segmentation methods include:

• Edge Detection

• Segmentation based on color

• Region Growing and Shrinking

• Clustering

• Morphological Filtering

Step 3: Motion Vector Estimation using Berkeley MPEG Tools

Previous Approaches:
1. Full Search Algorithm (most precise matching, but computational complexity of (2w+1)^2 matching operations)
2. Conjugate Direction Search (complexity is reduced noticeably: 3+2w)
3. Modified Logarithmic Search (efficient and fast: 2+7log(w))
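To get a feel for these three cost expressions, they can be evaluated for a given w. This is a small worked example under stated assumptions: w is taken to be the maximum displacement searched in each direction, the logarithm is assumed to be base 2 (the slides do not state the base), and the function name `search_costs` is hypothetical.

```python
import math

def search_costs(w):
    """Number of matching operations per block for search range w (log base 2 assumed)."""
    return {
        "full_search": (2 * w + 1) ** 2,
        "conjugate_direction": 3 + 2 * w,
        "modified_logarithmic": 2 + 7 * math.log2(w),
    }

costs = search_costs(8)
```

For w = 8 the full search needs 17 x 17 = 289 matches per block, against 19 for the conjugate direction search and 2 + 7 x 3 = 23 for the modified logarithmic search.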

What is needed?

-Novel motion vector prediction technique

-A highly localized search pattern

-A computational constraint explicitly incorporated into the cost measure

The BMA partitions the current frame into small, fixed-size blocks and matches them in the previous frame in order to estimate the displacement of each block (referred to as motion vectors) between two successive frames.

Block Matching Algorithm


Motion Estimation Technique using the Block Matching Algorithm

Goal: to find the “best” block from an earlier frame to construct an area of the current frame.

• Apply the block-matching algorithm to compute motion vectors.
• Translate motion vectors into motion predictions.

[Figure: Frame #1, Frame #2 and Frame #3, with motion vector <2,0>. Is <2,0> valid?]

Motion vector: the displacement of the closest matching block in the reference frame (past or future) for a block in the current frame.
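The block-matching idea can be sketched with an exhaustive (full) search. This is not the Berkeley MPEG Tools implementation; it is a minimal illustration in which the function name `full_search`, the 8 x 8 block size, the search range w = 4 and the sum of absolute differences (SAD) as matching cost are all assumptions.

```python
import numpy as np

def full_search(current, reference, top, left, block=8, w=4):
    """Find the motion vector (dx, dy) of one block by exhaustive search.

    The block at (top, left) in `current` is compared, using the sum of
    absolute differences (SAD), with every candidate block in `reference`
    displaced by at most w pixels in each direction.
    Returns (best_vector, best_sad, candidates_checked).
    """
    cur = np.asarray(current, dtype=float)
    ref = np.asarray(reference, dtype=float)
    target = cur[top:top + block, left:left + block]
    best_vec, best_sad, checked = (0, 0), np.inf, 0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            sad = np.abs(target - ref[y:y + block, x:x + block]).sum()
            checked += 1
            if sad < best_sad:
                best_sad, best_vec = sad, (dx, dy)
    return best_vec, best_sad, checked

# Synthetic test: the current frame is the reference shifted 2 pixels to the right,
# so each block's best match lies 2 pixels to the left in the reference frame.
rng = np.random.default_rng(0)
frame1 = rng.integers(0, 256, size=(32, 32))
frame2 = np.roll(frame1, shift=2, axis=1)
vector, sad, checked = full_search(frame2, frame1, top=12, left=12)
```

With w = 4 and a block well inside the frame, all (2w+1)^2 = 81 candidates are examined, which is the cost the faster search strategies aim to reduce.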

Conclusion

As a result of this series of processing steps, a 2.5-dimensional figure of the object is produced, similar to a figure carved in relief.

Applications

1. 3D Model Reconstruction
– 3D Motion Matching
– Camera Calibration
– 3D Vision

2. Stereo Television
– Conversion of ordinary 2D films to a stereo movie to be displayed on a stereo TV.
– Will become available as the next-generation television.

Thank You

Questions?

Shininess Coefficient and Specular Component

MPEG Encoding

Frame sequence: I1 B1 B2 B3 P1 B4 B5 B6 P2 B7 B8 B9 I2

Frame Types:
• I (Intra): encodes the complete image, similar to JPEG.
• P (Forward Predicted): motion relative to previous I and P frames.
• B (Bidirectionally Predicted): motion relative to previous and future I and P frames.
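The dependencies between frame types can be made concrete by listing, for each frame in the sequence above, the nearest frames it is predicted from. This is a minimal sketch, not part of any MPEG tool: the function name `reference_frames` is hypothetical, frames are identified by labels such as "B4", and the sequence is assumed to start with an I frame and to have an I or P frame on both sides of every B frame.

```python
def reference_frames(display_order):
    """Map each frame label to the nearest I/P frames it is predicted from."""
    # I and P frames act as anchors for prediction.
    anchors = [i for i, name in enumerate(display_order) if name[0] in "IP"]
    refs = {}
    for i, name in enumerate(display_order):
        earlier = [a for a in anchors if a < i]
        later = [a for a in anchors if a > i]
        if name[0] == "I":
            refs[name] = []                                   # coded on its own
        elif name[0] == "P":
            refs[name] = [display_order[earlier[-1]]]         # previous anchor only
        else:
            refs[name] = [display_order[earlier[-1]],         # previous anchor
                          display_order[later[0]]]            # and next anchor
    return refs

sequence = "I1 B1 B2 B3 P1 B4 B5 B6 P2 B7 B8 B9 I2".split()
refs = reference_frames(sequence)
```

For example, `refs["B4"]` lists P1 and P2, which shows why a B frame cannot be decoded until the following anchor frame is available.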

