introduction 3d reconstruction c classical …ocho.uwaterloo.ca/research/posters/icip00_tj.pdf ·...

❖INTRODUCTION ❖

GOAL: To recover the structure of a rigid object using a sequence of stereo images for such aerospace applications as autonomous precision landing, satellite servicing and retrieving payloads.

BASIC ASSUMPTIONS:

1. Localised point features such as corners are readily available.

2. Structure is represented by a collection of 3D points.

3. There is a single, unknown rigid motion between the cameras and the object.

RESULT: An integrated framework for reconstructing an incrementally accurate and dense representation of a rigid object.

❖3D RECONSTRUCTION ❖

PROBLEM FORMULATION:

$k$: frame number in image sequence
$c$: camera viewpoint
$g$: perspective projection function

A set of geometric or textural features is represented as 3D points $X(k) = \{X_i(k),\ i = 1, \dots, N\}$, and their 2D projections onto image $I^c(k)$ are $x(k) = g[c, X(k)]$.

OBJECTIVE:

Find $X(k)$ given a set of images $\{I^c(k)\}$ varying in $k$ and/or $c$.

CHALLENGES:

1. Feature correspondence: how to locate the projections of a physical 3D point on two different images?

♦ search problem

♦ ill-posed, often ambiguous

2. Structure estimation: how to recover the depth information from feature correspondences and how accurate are the estimates?

♦ want to reduce as much as possible the sensitivity to noise and outliers in correspondences

3. Implementation: how to lower computational complexity and data storage requirements?

❖CLASSICAL APPROACHES ❖

STEREO

Set-up:

♦ spatially varying images

♦ large baseline (a)

♦ usually known stereo geometry

Method:

♦ area-based or feature-based stereo matching

♦ reconstruction by triangulation

Properties:

♦ large baseline
  - accurate depth estimates
  - correspondence difficult due to geometric distortion, occlusion, changes in specular reflection, etc.

(a) baseline = separation between two images in terms of relative distance between camera positions
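Reconstruction by triangulation can be illustrated with a minimal sketch. It assumes a rectified stereo pair (parallel optical axes, equal focal lengths, image coordinates measured from the principal point), which the poster does not state; the function name, the focal length in pixels and the baseline in millimetres are all hypothetical choices for illustration.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline_mm):
    """Return (X, Y, Z) in the left-camera frame for one rectified stereo match.

    Assumption of this sketch: rectified cameras, so depth follows from
    disparity alone as Z = f * B / d.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: point at or beyond infinity")
    z = focal_px * baseline_mm / disparity
    return (x_left * z / focal_px, y * z / focal_px, z)


point = triangulate_rectified(x_left=60.0, x_right=20.0, y=10.0,
                              focal_px=800.0, baseline_mm=125.0)
print(point)  # (187.5, 31.25, 2500.0)
```

With a general (non-rectified) known stereo geometry, the same idea applies but the two viewing rays must be intersected explicitly.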

MOTION

Set-up:

♦ temporally varying images

♦ small baseline

♦ known or unknown motion

Method:

♦ correspondence by optical flow or feature tracking

♦ motion estimation complementary to shape recovery

♦ recursive or batch processing of long sequences

Properties:

♦ small baseline
  - easy correspondence
  - depth estimates sensitive to error in 2D feature positions
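The sensitivity of depth to 2D position error at small baselines can be made concrete with a first-order estimate. The sketch below assumes the rectified relation Z = fB/d, so that a disparity error δd maps to a depth error of roughly Z²/(fB)·δd; the numbers and the function name are illustrative assumptions, not values from the poster.

```python
def depth_error(depth, focal_px, baseline, disparity_error_px):
    """First-order depth error for rectified stereo: |dZ| ~= Z^2 / (f * B) * |dd|."""
    return depth ** 2 / (focal_px * baseline) * disparity_error_px


for baseline_mm in (10.0, 100.0):
    err = depth_error(depth=2500.0, focal_px=800.0,
                      baseline=baseline_mm, disparity_error_px=0.5)
    print(f"baseline {baseline_mm:6.1f} mm -> depth error ~ {err:.1f} mm")
# baseline   10.0 mm -> depth error ~ 390.6 mm
# baseline  100.0 mm -> depth error ~ 39.1 mm
```

Under these assumptions the same half-pixel feature error costs ten times more depth accuracy at one tenth of the baseline.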

COMBINED STEREO AND MOTION

Set-up:

♦ two consecutive pairs of stereo images or a long stereo sequence

♦ known stereo geometry

♦ known or unknown motion

Method: Adaptations/extensions of existing stereo and/or motion techniques, e.g.,

♦ refine depth estimates for known initial structure

♦ known motion to constrain stereo matching

♦ extend optical flow to stereo pairs

Properties:

♦ stereo and motion complement each other to overcome individual weaknesses

♦ lack of unified framework to address all of feature correspondence, motion and structure estimation

❖PROPOSED APPROACH ❖

BASIC IDEA:

♦ Feature matching, 3D reconstruction, feature tracking and motion estimation bootstrap each other;

♦ Initially unambiguous stereo correspondences provide 3D points for unique determination of motion estimates;

♦ Ambiguities do not need to be resolved immediately at each frame. Matching candidates are treated as hypotheses to be tested in future frames;

♦ Motion estimates give additional constraints for feature tracking and stereo matching
  - may resolve previous matching ambiguities
  - generate more 3D points for more accurate motion estimation

NOTATION:

$z^L_i(k)$: image feature $i$ extracted from $I^L(k)$
$z^R_j(k)$: image feature $j$ extracted from $I^R(k)$
$H_l(k, i, j)$: hypothesis that $z^L_i(k)$ and $z^R_j(k)$ are stereo correspondences
$x^L_i(k)$: true projection of a 3D feature on $I^L(k)$
$x^R_j(k)$: true projection of a 3D feature on $I^R(k)$
$\hat{X}_{ij}(k)$: 3D point reconstructed from $\hat{x}^L_i(k)$ and $\hat{x}^R_j(k)$

MOTION AND MEASUREMENT MODELS

2D MOTION MODEL: Initially, a second-order motion estimator is used for each 2D feature point in both left and right images:

$$x_l(k+1) = \begin{bmatrix} 3 & -3 & 1 \end{bmatrix} \begin{bmatrix} x_l(k) \\ x_l(k-1) \\ x_l(k-2) \end{bmatrix} + \Gamma(k)\, w(k)$$

where $w(k) \sim N(0, I)$ and $\Gamma(k)$ models the process noise.
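The deterministic part of a second-order predictor can be sketched as follows. The sketch assumes the coefficients 3, -3, 1 on the three most recent positions (a constant-acceleration extrapolation) and omits the process noise term; `predict_next` and the sample track are hypothetical.

```python
def predict_next(x_k, x_km1, x_km2):
    """Extrapolate x(k+1) = 3 x(k) - 3 x(k-1) + x(k-2), applied per image coordinate."""
    return tuple(3 * a - 3 * b + c for a, b, c in zip(x_k, x_km1, x_km2))


# Hypothetical feature track with constant acceleration:
# u = k^2 and v = 2k + 1 for frames k = 0, 1, 2.
track = [(k * k, 2 * k + 1) for k in range(3)]
prediction = predict_next(track[2], track[1], track[0])
print(prediction)  # (9, 7)
```

The prediction equals the true position at k = 3 because the extrapolation is exact for any trajectory that is quadratic in the frame number.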

2D MEASUREMENT MODEL: Feature extraction errors are modelled as $v(k) \sim N(0, \Sigma)$:

$$z_l(k) = x_l(k) + v(k) = g[c, X(k)] + v(k)$$

3D MOTION MODEL: After 3D motion estimates become available, the rigidity constraint for the whole object is enforced using a single consistent motion:

$$X(k+1) = R(k)\, X(k) + t(k)$$

where $R(k)$ is a $3 \times 3$ rotation matrix and $t(k)$ is a translation vector.
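A single rigid motion applied to every 3D point can be sketched in a few lines. The rotation about the optical (Z) axis, the translation and the two points below are hypothetical values chosen for illustration; the helper names are not from the poster.

```python
import math


def rotation_z(angle_rad):
    """3x3 rotation matrix (list of rows) about the Z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_rigid_motion(rotation, translation, point):
    """Return R * X + t for a 3x3 rotation and 3-vectors t and X."""
    return tuple(
        sum(r * x for r, x in zip(row, point)) + t
        for row, t in zip(rotation, translation)
    )


R = rotation_z(math.pi / 2)          # hypothetical 90 degree rotation
t = (0.0, 0.0, -50.0)                # hypothetical translation in mm
points_k = [(100.0, 0.0, 2500.0), (0.0, 200.0, 2600.0)]
points_k1 = [apply_rigid_motion(R, t, p) for p in points_k]
print(points_k1)
```

Because the same rotation and translation are used for all points, distances between points are preserved from one frame to the next, which is what the rigidity constraint expresses.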

3D MEASUREMENT MODEL: The measurement vector now consists of the extracted features on both left and right images:

$$\begin{bmatrix} z^L(k) \\ z^R(k) \end{bmatrix} = \begin{bmatrix} x^L(k) \\ x^R(k) \end{bmatrix} + \begin{bmatrix} v(k) \\ v(k) \end{bmatrix} = \begin{bmatrix} g[L, X(k)] \\ g[R, X(k)] \end{bmatrix} + \begin{bmatrix} v(k) \\ v(k) \end{bmatrix}$$
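A stacked left/right measurement of one 3D point might be simulated as below. This is a sketch under assumptions the poster does not state: a pinhole camera model, a rectified pair with the right camera displaced along the X axis by the baseline, and independent Gaussian noise per coordinate. The names `project` and `stereo_measurement` are hypothetical.

```python
import random


def project(camera_x_mm, point, focal_px):
    """Pinhole projection for a camera at (camera_x_mm, 0, 0) looking along +Z."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal_px * (x - camera_x_mm) / z, focal_px * y / z)


def stereo_measurement(point, focal_px, baseline_mm, noise_std_px=0.0, rng=random):
    """Stack the left and right projections of one 3D point, plus Gaussian noise."""
    left = project(0.0, point, focal_px)
    right = project(baseline_mm, point, focal_px)
    return tuple(value + rng.gauss(0.0, noise_std_px) for value in left + right)


measurement = stereo_measurement((187.5, 31.25, 2500.0),
                                 focal_px=800.0, baseline_mm=125.0)
print(measurement)  # (60.0, 10.0, 20.0, 10.0)
```

Setting `noise_std_px` above zero perturbs each of the four coordinates, which is one simple way to imitate feature extraction error in a synthetic sequence.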

ALGORITHM:

1. $\forall\, i, j$: if $z^L_i(k)$ and $z^R_j(k)$ satisfy the set of epipolar and minimum/maximum depth constraints
   ⇒ Create $\{H_l(k, i, j)\}$
   ⇒ Reconstruct $\{\hat{X}_{ij}(k)\}$

2. For each $H_l$, generate predictions $\hat{z}^L_i(k+1 \mid k)$ and $\hat{z}^R_j(k+1 \mid k)$.

3. Match image features at frame $k+1$ with predictions.

   If $z^L_m(k+1)$ is matched with $\hat{z}^L_i(k+1 \mid k)$, and $z^R_n(k+1)$ is matched with $\hat{z}^R_j(k+1 \mid k)$, and $\{z^L_m(k+1), z^R_n(k+1)\}$ satisfy the epipolar constraints
   ⇒ Create $H_l(k+1, m, n)$
   ⇒ Update $\hat{X}_{mn}(k+1)$

   If $z^L_m(k+1)$ has only one stereo matching candidate
   ⇒ $\hat{X}_{mn}(k+1) \in S(k+1)$

4. Estimate new 3D motion parameters $R(k)$ and $t(k)$ using $S(k)$ and $S(k+1)$.

5. Repeat from 1.
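Step 1 of the algorithm, generating stereo match hypotheses from epipolar and depth-range constraints, can be sketched as follows. The sketch assumes a rectified pair, so the epipolar constraint reduces to a row tolerance and the depth limits become disparity limits; all parameter names and feature coordinates are hypothetical. The `unambiguous` helper uses a stricter rule than the poster states (both the left and the right feature must appear in exactly one pair).

```python
def stereo_hypotheses(left_features, right_features, focal_px, baseline_mm,
                      min_depth_mm, max_depth_mm, row_tolerance_px=1.0):
    """Return {(i, j): (X, Y, Z)} for every left/right pair passing both constraints."""
    d_min = focal_px * baseline_mm / max_depth_mm   # far limit -> smallest disparity
    d_max = focal_px * baseline_mm / min_depth_mm   # near limit -> largest disparity
    hypotheses = {}
    for i, (xl, yl) in enumerate(left_features):
        for j, (xr, yr) in enumerate(right_features):
            if abs(yl - yr) > row_tolerance_px:     # epipolar constraint (rectified)
                continue
            disparity = xl - xr
            if not d_min <= disparity <= d_max:     # depth-range constraint
                continue
            depth = focal_px * baseline_mm / disparity
            hypotheses[(i, j)] = (xl * depth / focal_px, yl * depth / focal_px, depth)
    return hypotheses


def unambiguous(hypotheses):
    """Keep pairs whose left and right features each occur in exactly one hypothesis."""
    left_count, right_count = {}, {}
    for i, j in hypotheses:
        left_count[i] = left_count.get(i, 0) + 1
        right_count[j] = right_count.get(j, 0) + 1
    return {ij: p for ij, p in hypotheses.items()
            if left_count[ij[0]] == 1 and right_count[ij[1]] == 1}


left = [(60.0, 10.0), (-40.0, -25.0), (-35.0, -25.0)]
right = [(20.0, 10.0), (-75.0, -25.0), (-80.0, -25.0)]
H = stereo_hypotheses(left, right, focal_px=800.0, baseline_mm=125.0,
                      min_depth_mm=2000.0, max_depth_mm=3000.0)
print(sorted(H))               # [(0, 0), (1, 1), (1, 2), (2, 1), (2, 2)]
print(sorted(unambiguous(H)))  # [(0, 0)]
```

In this example two features on the same image row produce four competing hypotheses; only the isolated pair is unambiguous at the first frame, and the rest would be carried forward for testing in later frames.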

[Figure: The Incremental Reconstruction Algorithm. Block labels: 2D left image features; 2D right image features; Multiple hypothesis tracking and stereo matching; Validated stereo correspondences; Validated motion correspondences; 3D reconstruction; Motion estimation; 3D structure representation; Motion parameters.]

[Figure: Multiple hypothesis tracking and stereo matching. Block labels: Image features; Matching; Predicted feature locations; Validated stereo & motion correspondences; Generate new hypotheses; Stereo match hypotheses at frame f; Hypothesis management (pruning, merging); Stereo match hypotheses at frame f+1; Delay; For each hypothesis, generate predictions; Motion parameters, 3D structure.]

[Figure: Without 3D motion parameters. Left and right images at frame f and frame f+1. Labels: stereo match hypothesis; 2D dynamics; predicted feature locations.]

[Figure: With 3D motion parameters. Left and right images and 3D point X at frame f and frame f+1. Labels: stereo match hypothesis; 3D dynamics; projection; predicted feature locations.]
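Matching image features against predicted feature locations, followed by pruning of unsupported hypotheses, might look like the sketch below. It handles a single image and uses a circular gate of fixed radius in pixels; the gate, the hypothesis labels and the coordinates are assumptions made for illustration, and the poster's full procedure also checks the other image and the epipolar constraint.

```python
import math


def match_to_predictions(predictions, features, gate_px):
    """Map each hypothesis id to the indices of features within gate_px of its prediction."""
    return {
        hyp_id: [m for m, f in enumerate(features) if math.dist(pred, f) <= gate_px]
        for hyp_id, pred in predictions.items()
    }


def prune(candidates):
    """Drop hypotheses that have no supporting feature in the new frame."""
    return {hyp_id: matched for hyp_id, matched in candidates.items() if matched}


# Hypothetical predicted locations at frame f+1 and extracted features at frame f+1.
predictions = {"H0": (62.0, 11.0), "H1": (-38.0, -24.0), "H2": (200.0, 90.0)}
features = [(61.5, 11.2), (-37.0, -24.5), (-39.5, -23.0)]

candidates = match_to_predictions(predictions, features, gate_px=3.0)
print(candidates)         # {'H0': [0], 'H1': [1, 2], 'H2': []}
print(prune(candidates))  # {'H0': [0], 'H1': [1, 2]}
```

A hypothesis with several candidates, such as `H1` here, stays ambiguous and is kept for later frames, while one with no support is removed.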

❖RESULTS ❖

SYNTHETIC PROBLEM

♦ Thirty 3D data points randomly generated on synthetic model

♦ Simulated stereo set-up and motion to create a stereo image sequence

♦ Occlusion not modelled

♦ Random noise with distribution […] added to simulate feature extraction noise

SUMMARY OF RESULTS:

♦ Increased number of reconstructed points and decreased number of stereo matching hypotheses over the first few frames

♦ 3D motion estimates incorporated after frame 6; lost track of some features but reconstruction accuracy improved

[Plot: number of points (0 to 60) versus frame number (0 to 20). Curves: active hypotheses; reconstructed points; mismatched points; visible features.]

REAL IMAGE SEQUENCE

♦ 30 corner features extracted from each image in sequence

♦ Many disappearing features due to lighting changes

♦ New features at each frame are added to list of hypotheses

[Figure: left image and right image. One sample pair of images from the real sequence. Extracted features are shown as white points.]

SUMMARY OF RESULTS:

♦ Results not as satisfactory as synthetic problem

♦ No ground truth to assess accuracy

♦ Many ambiguities unresolved by motion and epipolar constraint alone

♦ Motion estimates affected by outliers

[Plot: number of points (0 to 60) versus frame number (0 to 20). Curves: active hypotheses; reconstructed points; existing (visible) features.]

❖CONCLUSIONS ❖

♦ presented incremental 3D reconstruction using a stereo image sequence

♦ all of feature matching, tracking, motion and structure estimation integrated into one single framework

♦ demonstrated potential in a synthetic problem

♦ motion and epipolar constraints alone not sufficient for real sequence

♦ future work includes: occlusion modelling, robust motion estimation, integrating other stereo matching techniques

Acknowledgments:

Research in this paper is funded in part by the Natural Sciences and Engineering Research Council of Canada. Images are courtesy of Macdonald Dettwiler Space and Advanced Robotics Ltd.


[Figure: Results of Reconstruction, showing ground truth and reconstruction. Each frame has a front view (X (mm) versus Y (mm)) and a top view (X (mm) versus Z (mm)).

Frame 1: all the points that initially have unambiguous stereo matches are reconstructed.

Frame 5: more points are reconstructed as some of the previous ambiguities are resolved.

Frame 10: 3D motion estimates have been incorporated. The accuracy of the depth estimates improved.

Frame 20: the depth estimates of some of the points become even more accurate.]
