
Structure from motion

Class 12

Read Chapter 5

Assignment 2

Chris MS regions

Nathan …

Brian M&S LoG features

Li SIFT features

Chad MS regions

Seon Joo SIFT features

Jason SIFT features

Sudipta T&VG elliptic features

Sriram …

Christine …

Assignment 3

• Collect potential matches from all algorithms for all pairs

• Matlab ASCII format, exchange data

• Implement RANSAC that uses combined match dataset

• Compute consistent set of matches and epipolar geometry

• Report thresholds used, match sets used, number of consistent matches obtained, epipolar geometry, show matches and epipolar geometry (plot some epipolar lines).

Due next Tuesday, Nov. 2

naming convention: firstname_ij.dat

chris_56.dat

[F,inliers]=FRANSAC([chris_56; brian_56; …])

Papers

• Each student should present a paper for 20-25 minutes, followed by discussion. Partially outside of the class schedule to make up for missed classes. (When?)

• List of proposed papers will come on-line by Thursday, feel free to propose your own (suggestion: something related to your project).

• Make choice by Thursday, assignments will be made in class.

• Everybody should have read papers that are being discussed.

3D photography course schedule

Introduction

Aug. 24, 26: (no course) | (no course)

Aug. 31, Sep. 2: (no course) | (no course)

Sep. 7, 9: (no course) | (no course)

Sep. 14, 16: Projective Geometry | Camera Model and Calibration; (assignment 1)

Sep. 21, 23: Camera Calib. and SVM | Feature matching; (assignment 2)

Sep. 28, 30: Feature tracking | Epipolar geometry; (assignment 3)

Oct. 5, 7: Computing F | Triangulation and MVG

Oct. 12, 14: (university day) | (fall break)

Oct. 19, 21: Stereo | Active ranging

Oct. 26, 28: Structure from motion | Self-calibration

Nov. 2, 4: Shape-from-silhouettes | Space carving

Nov. 9, 11: 3D modeling | Appearance Modeling; Nov. 12: papers (2-3pm, SN115)

Nov. 16, 18: (VMV’04) | (VMV’04)

Nov. 23, 25: papers & discussion | (Thanksgiving)

Nov. 30, Dec. 2: papers & discussion | papers & discussion; Dec. 3: papers (2-3pm, SN115)

Dec. 7?: Project presentations

Ideas for a project?

Chris Wide-area display reconstruction

Nathan ?

Brian ?

Li Visual-hulls with occlusions

Chad Laser scanner for 3D environments

Seon Joo Collaborative 3D tracking

Jason SfM for long sequences

Sudipta Combining exact silhouettes and photoconsistency

Sriram Panoramic cameras self-calibration

Christine desktop lamp scanner

Today’s class

• Structure from motion

• factorization

• sequential

• bundle adjustment

Factorization

• Factorise observations in structure of the scene and motion/calibration of the camera

• Use all points in all images at the same time

Affine factorisation Projective factorisation

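Affine camera

The affine projection equations are

$$\begin{bmatrix} x_{ij} \\ y_{ij} \end{bmatrix} = \begin{bmatrix} P_i^x \\ P_i^y \end{bmatrix} \begin{bmatrix} X_j \\ Y_j \\ Z_j \\ 1 \end{bmatrix}$$

or, in homogeneous form,

$$\begin{bmatrix} x_{ij} \\ y_{ij} \\ 1 \end{bmatrix} = \begin{bmatrix} P_i^x \\ P_i^y \\ 0\ 0\ 0\ 1 \end{bmatrix} \begin{bmatrix} X_j \\ Y_j \\ Z_j \\ 1 \end{bmatrix}$$

Moving the translation part (the fourth entries $P_{i4}^x$, $P_{i4}^y$ of the two rows) to the left-hand side gives

$$\begin{bmatrix} x_{ij} - P_{i4}^x \\ y_{ij} - P_{i4}^y \end{bmatrix} = \begin{bmatrix} \tilde{x}_{ij} \\ \tilde{y}_{ij} \end{bmatrix} = \begin{bmatrix} \tilde{P}_i^x \\ \tilde{P}_i^y \end{bmatrix} \begin{bmatrix} X_j \\ Y_j \\ Z_j \end{bmatrix}$$

How to find the origin? Or, for that matter, a 3D reference point?

Affine projection preserves the center of gravity, so the centroid of the n points in each image can serve as the origin:

$$\tilde{x}_{ij} = x_{ij} - \frac{1}{n}\sum_{k=1}^{n} x_{ik}, \qquad \tilde{y}_{ij} = y_{ij} - \frac{1}{n}\sum_{k=1}^{n} y_{ik}$$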
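The claim that centering removes the translation can be checked numerically. The following is a minimal sketch, assuming a made-up 2x4 affine camera and a few made-up 3D points (none of these values come from the course); the helper names `project_affine` and `center` are hypothetical.

```python
def project_affine(P, X):
    """Project a 3D point X = (X, Y, Z) with a 2x4 affine camera P."""
    Xh = list(X) + [1.0]
    return [sum(P[r][c] * Xh[c] for c in range(4)) for r in range(2)]


def center(points):
    """Subtract the centroid from a list of equal-length coordinate lists."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    return [[p[d] - mean[d] for d in range(dim)] for p in points]


# Hypothetical affine camera: 2x3 linear part plus a translation column.
P = [[0.8, 0.1, -0.3, 5.0],
     [0.2, 0.9, 0.4, -2.0]]
X = [[0.0, 0.0, 0.0], [1.0, 2.0, 0.5], [-1.0, 0.5, 3.0], [2.0, -1.0, 1.0]]

x = [project_affine(P, Xj) for Xj in X]
x_tilde = center(x)   # centred image points
X_tilde = center(X)   # 3D points with the origin at their centre of gravity

# Centred image points depend only on the 2x3 part of P: translation drops out.
P_lin = [row[:3] for row in P]
x_pred = [[sum(P_lin[r][c] * Xj[c] for c in range(3)) for r in range(2)]
          for Xj in X_tilde]
max_diff = max(abs(a - b) for u, v in zip(x_tilde, x_pred) for a, b in zip(u, v))
```

Here `max_diff` stays at rounding-error level, which is the numerical counterpart of "affine projection preserves the center of gravity".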

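Orthographic factorization

The orthographic projection equations are

$$\tilde{m}_{ij} = \tilde{P}_i \tilde{M}_j, \qquad i = 1,\dots,m, \quad j = 1,\dots,n$$

where

$$\tilde{m}_{ij} = \begin{bmatrix} \tilde{x}_{ij} \\ \tilde{y}_{ij} \end{bmatrix}, \quad \tilde{P}_i = \begin{bmatrix} \tilde{P}_i^x \\ \tilde{P}_i^y \end{bmatrix}, \quad \tilde{M}_j = \begin{bmatrix} X_j \\ Y_j \\ Z_j \end{bmatrix}$$

All equations can be collected for all i and j

$$\tilde{m} = \tilde{P}\tilde{M}$$

where

$$\tilde{m} = \begin{bmatrix} \tilde{m}_{11} & \tilde{m}_{12} & \cdots & \tilde{m}_{1n} \\ \tilde{m}_{21} & \tilde{m}_{22} & \cdots & \tilde{m}_{2n} \\ \vdots & \vdots & & \vdots \\ \tilde{m}_{m1} & \tilde{m}_{m2} & \cdots & \tilde{m}_{mn} \end{bmatrix}, \quad \tilde{P} = \begin{bmatrix} \tilde{P}_1 \\ \tilde{P}_2 \\ \vdots \\ \tilde{P}_m \end{bmatrix}, \quad \tilde{M} = [\tilde{M}_1, \tilde{M}_2, \dots, \tilde{M}_n]$$

Note that $\tilde{P}$ and $\tilde{M}$ are resp. 2m x 3 and 3 x n matrices and therefore the rank of $\tilde{m}$ is at most 3

(Tomasi Kanade’92)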

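Orthographic factorization

Factorize $\tilde{m}$ through singular value decomposition

$$\tilde{m} = U \Sigma V^{\top}$$

An affine reconstruction is obtained as follows

$$\tilde{P} = \tilde{U}, \qquad \tilde{M} = \tilde{\Sigma}\tilde{V}^{\top}$$

where $\tilde{U}$, $\tilde{\Sigma}$, $\tilde{V}$ keep only the three largest singular values and the corresponding columns.

$$\min \left\| \begin{bmatrix} \tilde{m}_{11} & \tilde{m}_{12} & \cdots & \tilde{m}_{1n} \\ \tilde{m}_{21} & \tilde{m}_{22} & \cdots & \tilde{m}_{2n} \\ \vdots & \vdots & & \vdots \\ \tilde{m}_{m1} & \tilde{m}_{m2} & \cdots & \tilde{m}_{mn} \end{bmatrix} - \begin{bmatrix} \tilde{P}_1 \\ \tilde{P}_2 \\ \vdots \\ \tilde{P}_m \end{bmatrix} [\tilde{M}_1, \tilde{M}_2, \dots, \tilde{M}_n] \right\|$$

Closest rank-3 approximation yields MLE!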
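Why the truncated SVD is the closest rank-3 matrix can be stated in one line (the Eckart-Young result). The noise model is an assumption spelled out here, not on the slide: least squares coincides with MLE when every image coordinate carries independent, isotropic, zero-mean Gaussian noise and all points are seen in all views. The $\sigma_k$ denote the singular values of the centred measurement matrix, in decreasing order.

```latex
\min_{\operatorname{rank}(\hat m)\le 3}\ \bigl\|\tilde m-\hat m\bigr\|_F^2
  \;=\; \sum_{k\ge 4}\sigma_k^2 ,
\qquad
\hat m \;=\; U\,\operatorname{diag}(\sigma_1,\sigma_2,\sigma_3,0,\dots,0)\,V^{\top}
```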

Orthographic factorization

A metric reconstruction is obtained as follows

$$\hat{P} = \tilde{P}A^{-1}, \qquad \hat{M} = A\tilde{M}$$

where A is computed from

$$\tilde{P}_i^x A^{-1}A^{-\top} \tilde{P}_i^{x\top} = 1, \qquad \tilde{P}_i^y A^{-1}A^{-\top} \tilde{P}_i^{y\top} = 1, \qquad \tilde{P}_i^x A^{-1}A^{-\top} \tilde{P}_i^{y\top} = 0$$

Writing $C = A^{-1}A^{-\top}$, these become

$$\tilde{P}_i^x C \tilde{P}_i^{x\top} = 1, \qquad \tilde{P}_i^y C \tilde{P}_i^{y\top} = 1, \qquad \tilde{P}_i^x C \tilde{P}_i^{y\top} = 0$$

3 linear equations per view on symmetric matrix C (6 DOF)

A can be obtained from C through Cholesky factorisation and inversion
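The "3 linear equations per view" can be written out explicitly. The sketch below is an illustration only: it assumes the six unknowns of the symmetric matrix C are ordered (C11, C12, C13, C22, C23, C33), and the function names `bilinear_row` and `metric_constraints` are hypothetical. It builds the coefficient rows for one view and checks them on a view that is already metric, for which C must be the identity.

```python
def bilinear_row(p, q):
    """Coefficients of p C q^T in the unknowns (C11, C12, C13, C22, C23, C33)."""
    return [p[0] * q[0],
            p[0] * q[1] + p[1] * q[0],
            p[0] * q[2] + p[2] * q[0],
            p[1] * q[1],
            p[1] * q[2] + p[2] * q[1],
            p[2] * q[2]]


def metric_constraints(px, py):
    """Three linear equations (coefficient rows, right-hand sides) for one view."""
    rows = [bilinear_row(px, px), bilinear_row(py, py), bilinear_row(px, py)]
    rhs = [1.0, 1.0, 0.0]
    return rows, rhs


# Sanity check: px, py are orthonormal rows, so C = I satisfies all three.
px = [0.6, 0.8, 0.0]
py = [-0.8, 0.6, 0.0]
c_identity = [1.0, 0.0, 0.0, 1.0, 0.0, 1.0]
rows, rhs = metric_constraints(px, py)
residuals = [sum(a * b for a, b in zip(row, c_identity)) - r
             for row, r in zip(rows, rhs)]
```

Stacking the rows of two or more views gives at least six equations for the six unknowns; solving that system in the least-squares sense is left to any linear algebra routine.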


Examples

Tomasi Kanade’92,Poelman & Kanade’94

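Perspective factorization

The camera equations

$$\lambda_{ij} m_{ij} = P_i M_j, \qquad i = 1,\dots,m, \quad j = 1,\dots,n$$

for a fixed image i can be written in matrix form as

$$m_i \Lambda_i = P_i M$$

where

$$m_i = [m_{i1}, m_{i2}, \dots, m_{in}], \quad M = [M_1, M_2, \dots, M_n], \quad \Lambda_i = \operatorname{diag}(\lambda_{i1}, \lambda_{i2}, \dots, \lambda_{in})$$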

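Perspective factorization

All equations can be collected for all i as

$$m = PM$$

where

$$m = \begin{bmatrix} m_1 \Lambda_1 \\ m_2 \Lambda_2 \\ \vdots \\ m_m \Lambda_m \end{bmatrix}, \qquad P = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_m \end{bmatrix}$$

In these formulas the image points $m_{ij}$ are known, but $\Lambda_i$, P and M are unknown

Observe that PM is a product of a 3m x 4 matrix and a 4 x n matrix, i.e. it has rank at most 4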

Perspective factorization algorithm

Assume that $\Lambda_i$ are known, then PM is known.

Use the singular value decomposition $PM = U \Sigma V^{\top}$

In the noise-free case

$$\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \sigma_3, \sigma_4, 0, \dots, 0)$$

and a reconstruction can be obtained by setting:

P = the first four columns of $U\Sigma$.

M = the first four rows of $V^{\top}$.

Iterative perspective factorization

When $\Lambda_i$ are unknown the following algorithm can be used:

1. Set $\lambda_{ij} = 1$ (affine approximation).

2. Factorize PM and obtain an estimate of P and M. If $\sigma_5$ is sufficiently small then STOP.

3. Use m, P and M to estimate $\Lambda_i$ from the camera equations (linearly) $m_i \Lambda_i = P_i M$

4. Goto 2.

In general the algorithm minimizes the proximity measure $P(\Lambda, P, M) = \sigma_5$

Note that structure and motion are recovered up to an arbitrary projective transformation
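Step 3 of the iteration says the projective depths are estimated linearly from the camera equations. One way to do this, given here as an assumption rather than as the lecture's stated choice, is an independent least-squares fit of each depth $\lambda_{ij}$ in $\lambda_{ij} m_{ij} = P_i M_j$, with $m_{ij}$ a homogeneous 3-vector:

```latex
\lambda_{ij}
  \;=\; \arg\min_{\lambda}\ \bigl\|\lambda\, m_{ij} - P_i M_j\bigr\|^2
  \;=\; \frac{m_{ij}^{\top} P_i M_j}{m_{ij}^{\top} m_{ij}}
```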

Further Factorization work

Factorization with uncertainty (Irani & Anandan, IJCV’02)

Factorization for dynamic scenes (Costeira and Kanade ‘94) (Bregler et al. 2000, Brand 2001)

practical structure and motion recovery from images

• Obtain reliable matches using matching or tracking and 2/3-view relations

• Compute initial structure and motion

• Refine structure and motion

• Auto-calibrate

• Refine metric structure and motion

Sequential Structure and Motion Computation

Initialize Motion (P1, P2 compatible with F)

Initialize Structure (minimize reprojection error)

Extend motion (compute pose through matches seen in 2 or more previous views)

Extend structure (initialize new structure, refine existing structure)

Computation of initial structure and motion

According to Hartley and Zisserman, “this area is still to some extent a black-art”

• Not all features are visible in all images

• No direct method (factorization not applicable)

• Build partial reconstructions and assemble (more views is more stable, but fewer correspondences)

1) Sequential structure and motion recovery

2) Hierarchical structure and motion recovery

Sequential structure and motion recovery

• Initialize structure and motion from two views

• For each additional view

  • Determine pose

  • Refine and extend structure

• Determine correspondences robustly by jointly estimating matches and epipolar geometry

Initial structure and motion

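Epipolar geometry:

$$m_2^{\top} F m_1 = 0$$

Projective calibration, compatible with F:

$$P_1 = [\, I \mid 0 \,], \qquad P_2 = [\, [e]_{\times} F + e a^{\top} \mid e \,]$$

where e is the epipole in the second image ($e^{\top} F = 0$) and a is an arbitrary 3-vector.

Yields correct projective camera setup (Faugeras’92, Hartley’92)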
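A camera pair compatible with F can be built directly from F. The following is a minimal sketch using the canonical form P1 = [I | 0], P2 = [[e]x F + e a^T | e] with the arbitrary vector a set to zero; that choice, the toy F, and all function names are assumptions for illustration. The epipole e in the second image satisfies e^T F = 0, so it is orthogonal to every column of F and can be taken as a cross product of two columns. The check at the end uses the fact that P2^T F P1 is skew-symmetric when the pair is compatible with F.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def skew(v):
    """Matrix [v]_x such that [v]_x w equals the cross product v x w."""
    return [[0.0, -v[2], v[1]],
            [v[2], 0.0, -v[0]],
            [-v[1], v[0], 0.0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def epipole_image2(F):
    """Left null vector e of a rank-2 F (e^T F = 0)."""
    cols = transpose(F)
    candidates = [cross(cols[0], cols[1]), cross(cols[0], cols[2]),
                  cross(cols[1], cols[2])]
    # Use the cross product with the largest norm for numerical stability.
    return max(candidates, key=lambda v: sum(x * x for x in v))


def cameras_from_F(F):
    """Return P1 = [I | 0] and P2 = [[e]_x F | e] (the choice a = 0)."""
    e = epipole_image2(F)
    A = matmul(skew(e), F)
    P1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
    P2 = [A[r] + [e[r]] for r in range(3)]
    return P1, P2


# Hypothetical F for a pure sideways translation: F = [t]_x with t = (1, 0, 0).
F = skew([1.0, 0.0, 0.0])
P1, P2 = cameras_from_F(F)
e = epipole_image2(F)
eTF = [sum(e[k] * F[k][j] for k in range(3)) for j in range(3)]
S = matmul(matmul(transpose(P2), F), P1)   # should be skew-symmetric
asym = max(abs(S[i][j] + S[j][i]) for i in range(4) for j in range(4))
```

With real, noisy F estimates the epipole is usually taken from an SVD instead; the cross-product shortcut only keeps this sketch dependency-free.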

Obtain structure through triangulation

Use reprojection error for minimization

Avoid measurements in projective space

Determine pose towards existing structure

Compute $P_{i+1}$ using robust approach (6-point RANSAC)

Extend and refine reconstruction

$$x_i = P_i X(x_1, \dots, x_{i-1})$$

[Figure: new view; 3D point M with image points $m_i$ and $m_{i+1}$; 2D-2D and 2D-3D matches]

Compute P with 6-point RANSAC

• Generate hypothesis using 6 points

• Count inliers

  • Projection error: $d(P_i X(x_1, \dots, x_{i-1}), x_i) < t$ ?

  • Back-projection error: $d(F_{ij} x_i, x_j) < t$ ?, $\forall j < i$

  • Re-projection error: $d(P_i X(x_1, \dots, x_{i-1}, x_i), x_i) < t$

  • 3D error: $d_{3D}(P_i^{-1} x_i, X) < t$ ?

  • Projection error with covariance: $d_{\Sigma}(P_i X(x_1, \dots, x_{i-1}), x_i) < t$

• Expensive testing? Abort early if not promising

  • Verify at random, abort if e.g. P(wrong) > 0.95 (Chum and Matas, BMVC’02)

Dealing with dominant planar scenes

• USaM fails when common features are all in a plane

• Solution, part 1: model selection to detect problem

(Pollefeys et al., ECCV‘02)

Dealing with dominant planar scenes

• USaM fails when common features are all in a plane

• Solution, part 2: delay ambiguous computations until after self-calibration (couple self-calibration over all 3D parts)

(Pollefeys et al., ECCV‘02)

Non-sequential image collections

[Figure: 64 images, 3792 points, 4.8 im/pt]

Problem: Features are lost and reinitialized as new features

Solution: Match with other close views

For every view i
  Extract features
  Compute two view geometry i-1/i and matches
  Compute pose using robust algorithm
  Refine existing structure
  Initialize new structure

Relating to more views

Problem: find close views in projective frame

For every view i
  Extract features
  Compute two view geometry i-1/i and matches
  Compute pose using robust algorithm
  For all close views k
    Compute two view geometry k/i and matches
    Infer new 2D-3D matches and add to list
  Refine pose using all 2D-3D matches
  Refine existing structure
  Initialize new structure

Determining close views

• If viewpoints are close then most image changes can be modelled through a planar homography

• Qualitative distance measure is obtained by looking at the residual error on the best possible planar homography

$$\text{Distance} = \min_{H}\ \operatorname{median}\ D(Hm, m')$$
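The inner part of this distance, the median residual of the matches under a given homography, is easy to sketch. The code below is an illustration under stated assumptions: the homography H is taken as given (estimating the best H, the "min" over H, is not shown), matches are (m, m') pairs of pixel coordinates, and the function names are hypothetical.

```python
from statistics import median


def apply_homography(H, m):
    """Map a 2D point m = (x, y) through the 3x3 homography H."""
    x, y = m
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def median_transfer_distance(H, matches):
    """Median Euclidean distance between H m and m' over all matches (m, m')."""
    dists = []
    for m, m_prime in matches:
        u, v = apply_homography(H, m)
        dists.append(((u - m_prime[0]) ** 2 + (v - m_prime[1]) ** 2) ** 0.5)
    return median(dists)


# Made-up matches: three follow a pure shift of (2, 0), the fourth is an outlier.
H_shift = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
matches = [((0.0, 0.0), (2.0, 0.0)), ((1.0, 1.0), (3.0, 1.0)),
           ((4.0, 2.0), (6.0, 2.0)), ((5.0, 5.0), (10.0, 5.0))]
d_shift = median_transfer_distance(H_shift, matches)
d_identity = median_transfer_distance(H_identity, matches)
```

Using the median rather than the mean keeps a single bad match (the last pair above) from dominating the measure.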

Non-sequential image collections (2)

[Figure: 64 images in both cases; 3792 points at 4.8 im/pt versus 2170 points at 9.8 im/pt]

Hierarchical structure and motion recovery

• Compute 2-view

• Compute 3-view

• Stitch 3-view reconstructions

• Merge and refine reconstruction

[Figure labels: F, T, H, PM]

Stitching 3-view reconstructions

Different possibilities

1. Align (P2, P3) with (P’1, P’2)

$$\arg\min_{H}\ d_A(P_2, P'_1 H^{-1}) + d_A(P_3, P'_2 H^{-1})$$

2. Align X, X’ (and C’C’)

$$\arg\min_{H}\ \sum_j d_A(X_j, H X'_j)$$

3. Minimize reproj. error

$$\arg\min_{H}\ \sum_j d(P H X'_j, x_j) + d(P' H^{-1} X_j, x'_j)$$

4. MLE (merge)

$$\arg\min_{P, X}\ \sum_j d(P X_j, x_j)$$

Refining structure and motion

• Minimize reprojection error

$$\min_{\hat{P}_k, \hat{M}_i}\ \sum_{k=1}^{m} \sum_{i=1}^{n} D(m_{ki}, \hat{P}_k \hat{M}_i)^2$$

• Maximum Likelihood Estimation (if error is zero-mean Gaussian noise)

• Huge problem but can be solved efficiently (Bundle adjustment)
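The quantity that bundle adjustment minimizes, the sum of squared image distances between measured and reprojected points, can be evaluated with a few lines. This is a sketch with made-up toy data; the function names and the dictionary layout for observations (keyed by view index k and point index i, zero-based) are assumptions for illustration.

```python
def project(P, M):
    """Project a homogeneous 3D point M (4-vector) with a 3x4 camera P."""
    h = [sum(P[r][c] * M[c] for c in range(4)) for r in range(3)]
    return (h[0] / h[2], h[1] / h[2])


def reprojection_cost(cameras, points, observations):
    """Sum over all observations of D(m_ki, P_k M_i)^2.

    observations maps (k, i) to the measured (x, y); unseen pairs are absent.
    """
    total = 0.0
    for (k, i), (x, y) in observations.items():
        u, v = project(cameras[k], points[i])
        total += (u - x) ** 2 + (v - y) ** 2
    return total


# Hypothetical toy data: two cameras, two points.
cameras = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
    [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
]
points = [[0.0, 0.0, 2.0, 1.0], [1.0, 1.0, 4.0, 1.0]]
observations = {
    (0, 0): (0.0, 0.0), (0, 1): (0.25, 0.25),
    (1, 0): (-0.5, 0.0), (1, 1): (0.0, 0.5),   # this one is 0.25 off in y
}
cost = reprojection_cost(cameras, points, observations)
```

Only the last observation disagrees with its reprojection, so the cost equals 0.25 squared.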

Non-linear least-squares

$$X = f(P), \qquad \arg\min_{P} \| X - f(P) \|$$

• Newton iteration

• Levenberg-Marquardt

• Sparse Levenberg-Marquardt

Newton iteration

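Taylor approximation

$$f(P_0 + \Delta) \approx f(P_0) + J\Delta, \qquad J = \frac{\partial X}{\partial P} \quad \text{(Jacobian)}$$

$$X - f(P_1) \approx X - f(P_0) - J\Delta = e_0 - J\Delta$$

Minimizing $\| e_0 - J\Delta \|$ gives the normal eq.

$$J^{\top} J \Delta = J^{\top} e_0 \quad \Rightarrow \quad \Delta = (J^{\top} J)^{-1} J^{\top} e_0$$

$$P_{i+1} = P_i + \Delta$$

With measurement covariance $\Sigma$:

$$\Delta = (J^{\top} \Sigma^{-1} J)^{-1} J^{\top} \Sigma^{-1} e_0$$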

Levenberg-Marquardt

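Normal equations

$$J^{\top} J \Delta = N \Delta = J^{\top} e_0$$

Augmented normal equations

$$N' \Delta = J^{\top} e_0, \qquad N' = J^{\top} J + \lambda \operatorname{diag}(J^{\top} J)$$

$$\lambda_0 = 10^{-3}$$

success: accept, $\lambda_{i+1} = \lambda_i / 10$

failure: $\lambda_i = 10 \lambda_i$, solve again

small $\lambda$ ~ Newton (quadratic convergence)

large $\lambda$ ~ descent (guaranteed decrease)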
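The damping schedule above can be tried on a tiny problem. The sketch below is not bundle adjustment: it fits the two parameters of a made-up model f(x) = a exp(b x) to noise-free synthetic data, so the 2x2 augmented normal equations can be solved in closed form. The model, the data, the start value and the function names are all assumptions for illustration; only the update rule (N' = J^T J + lambda diag(J^T J), lambda divided by 10 on success, multiplied by 10 on failure, starting at 1e-3) follows the slide.

```python
import math


def residuals(params, xs, ys):
    a, b = params
    return [y - a * math.exp(b * x) for x, y in zip(xs, ys)]


def jacobian(params, xs):
    """Rows hold the derivatives of f(x) = a * exp(b * x) w.r.t. (a, b)."""
    a, b = params
    return [[math.exp(b * x), a * x * math.exp(b * x)] for x in xs]


def levenberg_marquardt(params, xs, ys, lam=1e-3, iterations=100):
    e = residuals(params, xs, ys)
    cost = sum(r * r for r in e)
    for _ in range(iterations):
        J = jacobian(params, xs)
        # Normal equations: N = J^T J and right-hand side g = J^T e.
        n00 = sum(r[0] * r[0] for r in J)
        n01 = sum(r[0] * r[1] for r in J)
        n11 = sum(r[1] * r[1] for r in J)
        g0 = sum(r[0] * ei for r, ei in zip(J, e))
        g1 = sum(r[1] * ei for r, ei in zip(J, e))
        # Augmented matrix N' = J^T J + lam * diag(J^T J), solved as a 2x2 system.
        a00, a11 = n00 * (1.0 + lam), n11 * (1.0 + lam)
        det = a00 * a11 - n01 * n01
        d0 = (a11 * g0 - n01 * g1) / det
        d1 = (a00 * g1 - n01 * g0) / det
        trial = (params[0] + d0, params[1] + d1)
        e_trial = residuals(trial, xs, ys)
        cost_trial = sum(r * r for r in e_trial)
        if cost_trial < cost:      # success: accept the step, relax damping
            params, e, cost = trial, e_trial, cost_trial
            lam /= 10.0
        else:                      # failure: keep parameters, damp more
            lam *= 10.0
    return params, cost


xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]   # synthetic data with a = 2, b = 0.5
(a_hat, b_hat), final_cost = levenberg_marquardt((1.5, 0.4), xs, ys)
```

Because a step is accepted only when the cost decreases, the cost never goes up; once the fit has converged, further steps are rejected and only the damping grows.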

Levenberg-Marquardt

Requirements for minimization

• Function to compute f

• Start value P0

• Optionally, function to compute J (but numerical ok, too)

Sparse Levenberg-Marquardt

• complexity for solving $\Delta = N'^{-1} J^{\top} e_0$ is $N^3$

• prohibitive for large problems (100 views, 10,000 points ~ 30,000 unknowns)

• Partition parameters

  • partition A

  • partition B (only dependent on A and itself)

Sparse bundle adjustment

residuals:

normal equations:

with

note: tie points should be in partition A


normal equations:

modified normal equations:

solve in two parts:


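Jacobian of $\sum_{k=1}^{m} \sum_{i=1}^{n} D(m_{ki}, \hat{P}_k \hat{M}_i)^2$ has sparse block structure

$$N = J^{\top} J$$

[Figure: J with column blocks P1, P2, P3 (12 x m in total) and M (3 x n, in general much larger), one row block of image points per view ("im.pts. view 1", ...); N with blocks U1, U2, U3, W, W^T, V]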

Needed for non-linear minimization

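Sparse bundle adjustment

• Eliminate dependence of camera/motion parameters on structure parameters

Note in general 3n >> 11m

$$\begin{bmatrix} I & -W V^{-1} \\ 0 & I \end{bmatrix} \times N = \begin{bmatrix} U - W V^{-1} W^{\top} & 0 \\ W^{\top} & V \end{bmatrix}, \qquad N = \begin{bmatrix} U & W \\ W^{\top} & V \end{bmatrix}$$

(block sizes 11 x m and 3 x n)

Allows much more efficient computations

e.g. 100 views, 10000 points: solve 1000x1000, not 30000x30000

Often still band diagonal: use sparse linear algebra algorithms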
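The elimination can be written out as a two-part solve. The block names U, V, W are those of the slides; the symbols $\epsilon_a$, $\epsilon_b$ for the camera and structure halves of $J^{\top} e_0$, and $\Delta_a$, $\Delta_b$ for the two halves of the update, are names assumed here for illustration.

```latex
\begin{bmatrix} U & W \\ W^{\top} & V \end{bmatrix}
\begin{bmatrix} \Delta_a \\ \Delta_b \end{bmatrix}
=
\begin{bmatrix} \epsilon_a \\ \epsilon_b \end{bmatrix}
\quad\Longrightarrow\quad
\begin{bmatrix} U - W V^{-1} W^{\top} & 0 \\ W^{\top} & V \end{bmatrix}
\begin{bmatrix} \Delta_a \\ \Delta_b \end{bmatrix}
=
\begin{bmatrix} \epsilon_a - W V^{-1} \epsilon_b \\ \epsilon_b \end{bmatrix}

% Part 1: small dense system in the camera parameters only
\bigl(U - W V^{-1} W^{\top}\bigr)\,\Delta_a = \epsilon_a - W V^{-1}\epsilon_b

% Part 2: back-substitution for the structure parameters
V\,\Delta_b = \epsilon_b - W^{\top}\Delta_a
```

Since V is block diagonal with one 3x3 block per point, forming $V^{-1}$ is cheap, and the only sizeable system left is the first one, whose dimension grows with the number of views rather than the number of points.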




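• Covariance estimation

$$Y = W V^{-1}$$

$$\Sigma_a = (U - W V^{-1} W^{\top})^{+}$$

$$\Sigma_b = Y^{\top} \Sigma_a Y + V^{-1}$$

$$\Sigma_{ab} = -\Sigma_a Y$$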

Next class: self-calibration

[Figure labels: projection, constraints]
