binocular stereo

42
Computer Vision March 2007 L1.1 © 2007 by Davi Geiger Binocular Stereo Binocular Stereo Left Image Right Image

Upload: misae

Post on 12-Jan-2016

67 views

Category:

Documents


0 download

DESCRIPTION

Binocular Stereo. Left Image Right Image. Binocular Stereo. Binocular Stereo. There are various different methods of extracting relative depth from images, some of the “passive ones” are based on relative size of known objects, - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Binocular Stereo

Computer Vision March 2007 L1.1© 2007 by Davi Geiger

Binocular Stereo

Binocular Stereo

Left Image Right Image

Page 2: Binocular Stereo

Computer Vision March 2007 L1.2© 2007 by Davi Geiger

There are various different methods of extracting relative depth from images, some of the “passive ones” are based on

(i) relative size of known objects, (ii) occlusion cues, such as presence of T-Junctions,(iii) motion information, (iv) focusing and defocusing, (v) relative brightness

Moreover, there are active methods such as (i) Radar , which requires beams of sound waves or (ii) Laser, uses beam of light

Stereo vision is unique because it is both passive and accurate.

Binocular Stereo

Page 3: Binocular Stereo

Computer Vision March 2007 L1.3© 2007 by Davi Geiger

Julesz’s Random Dot Stereogram. The left image, a black and white image, is generated by a program that assigns black or white values at each pixel according to a random number generator.

The right image is constructed from by copying the left image, but an imaginary square inside the left image is displaced a few pixels to the left and the empty space filled with black and white values chosen at random. When the stereo pair is shown, the observers can identify/match the imaginary square on both images and consequently “see” a square in front of the background. It shows that stereo matching can occur without recognition.

Human Stereo: Random Dot Stereogram

Page 4: Binocular Stereo

Computer Vision March 2007 L1.4© 2007 by Davi Geiger

Not even the identification/matching of illusory contour is known a priori of the stereo process. These pairs gives evidence that the human visual system does not process illusory contours/surfaces before processing binocular vision. Accordingly, binocular vision will be thereafter described as a process that does not require any recognition or contour detection a priori.

Human Stereo: Illusory Contours

Here not only illusory figures on left and right images don’t match, but also stereo matching yields illusory figures not seen on either left or right images alone.

Stereo matching occurs in the presence of illusory.

Page 5: Binocular Stereo

Computer Vision March 2007 L1.5© 2007 by Davi Geiger

Human Stereo: Half Occlusions

Left Right

Left Right

An important aspect of the stereo geometry are half-occlusions. There are regions of a left image that will have no match in the right image, and vice-versa. Unmatched regions, or half-occlusion, contain important information about the reconstruction of the scene. Even though these regions can be small they affect the overall matching scheme, because the rest of the matching must reconstruct a scene that accounts for the half-occlusion.

Leonardo DaVinci had noted that the larger is the discontinuity between two surfaces the larger is the half-occlusion. Nakayama and Shimojo in 1991 have first shown stereo pair images where by adding one dot to one image, like above, therefore inducing occlusions, affected the overall matching of the stereo pair.

Page 6: Binocular Stereo

Computer Vision March 2007 L1.6© 2007 by Davi Geiger

Projective CameraLet be a point in the 3D world represented by a “world” coordinate system. Let be the center of projection of a camera where a camera reference frame is placed. The camera coordinate system has the z component perpendicular to the camera frame (where the image is produced) and the distance between the center and the camera frame is the focal length, . In this coordinate system the point is described by the vector and the projection of this point to the image (the intersection of the line with the camera frame) is given by the point , where

z

x

Po=(Xo,Yo,Zo)

0PZ

fpo

po=(xo,yo,f)f

y

OO

f

),,( ZYXP

),,( OOOO ZYXP

PO ),,( fyxp ooo

),,( ZYXP

O

0PZ

fpo

Page 7: Binocular Stereo

Computer Vision March 2007 L1.7© 2007 by Davi Geiger

y

x

)1,,( 0 yoxo qqq

xo

yoO

,)(;)( 0000 yyyxxx soqysoqx

We have neglected to account for the radial distortion of the lenses, which would give an additional intrinsic parameter. Equation above can be described by the linear transformation

fooss yxyx );,();,(where the intrinsic parameters of the camera, , represent the size of the pixels (say in millimeters) along x and y directions, the coordinate in pixels of the image (also called the principal point) and the focal length of the camera.

f

oss

oss

Q yyy

xxx

00

0

01

01qQpo

0pQqo

f

f

o

s

f

o

s

Q y

y

x

x

100

10

01

Projective Camera Coordinate System

pixel coordinates

Page 8: Binocular Stereo

Computer Vision March 2007 L1.8© 2007 by Davi Geiger

Ol

xl

yl

P=(X,Y,Z)

pl=(xo,yo,f)

xr

yr

zr

Or

zl

f pr=(xo,yo,f)

f

A 3D point P projected on both cameras. The transformation of coordinate system, from left to right is described by a rotation matrix R and a translation vector T. More precisely, a point P described as Pl in the left frame will be described in the right frame as

rl OO T

lP

rP

)(1 TPRP lr

Two Projective Cameras

Page 9: Binocular Stereo

Computer Vision March 2007 L1.9© 2007 by Davi Geiger

Ol

xl

ylP=(X,Y,Z)

xr

yr

zr

el

epipolar lines

Or

zl

pr=(xo,yo,f)

er

Each 3D point P defines a plane . This plane intersects the two camera frames creating two corresponding epipolar lines. The line will intersect the camera planes at and , known as the epipoles. The line is common to every plane POlOl and thus the two epipoles belong to all pairs of epipolar lines (the epipoles are the “center/intersection” of all epipolar lines.)

rl OO

le

re

rlOPO

rl OO T

lP

)(1 TPRP lr

rl OO

pl=(xo,yo,f)

Two Projective CamerasEpipolar Lines

Page 10: Binocular Stereo

Computer Vision March 2007 L1.10© 2007 by Davi Geiger

0),(0),()(0)()(

0)()(0)()(0)()(''''

'

lrlrlrlr

llllll

pTREpPTREPPTSRPPTPR

PTTPPTTPPTTP

matrixessentialtheisTSRTREand

TT

TT

TT

TS T

xy

xz

yz

)(),(

0

0

0

)(

0),,,(0),(0),( '1'' lrlrll

Trrlr qiiTRFqqQTREQqpTREp

Estimating Epipolar Lines and Epipoles

The two vectors, , span a 2 dimensional space and their cross product , , is perpendicular to this 2 dimensional space. Therefore

lPT

,)( lPT

ll PP

'

where

F is known as the fundamental matrix and needs to be estimated

rl ZZ

f

2

)(1 TPRP lr

Page 11: Binocular Stereo

Computer Vision March 2007 L1.11© 2007 by Davi Geiger

“Eight point algorithm”:

(i) Given two images, we need to identify eight points or more on both images, i.e., we provide n 8 points with their correspondence. The points have to be non-degenerate.

(ii) Then we have n linear and homogeneous equations with 9 unknowns, the components of F. We need to estimate F only up to

some scale factors, so there are only 8 unknowns to be computed from the n 8 linear and homogeneous equations.

(i) If n=8 there is a unique solution (with non-degenerate points), and if n > 8 the solution is overdetermined and we can use the SVD decomposition to find the best fit solution.

Computing F (fundamental matrix)

0),,,(' lrlr qiiTRFq

Page 12: Binocular Stereo

Computer Vision March 2007 L1.12© 2007 by Davi Geiger

Each potential match is represented by a square. The black ones represent the most likely scene to “explain” the images, but other combinations could have given rise to the same images (e.g., red)

Stereo Correspondence: Ambiguities

What makes the set of black squares preferred/unique is that they have similar disparity values, the ordering constraint is satisfied and there is a unique match for each point. Any other set that could have given rise to the two images would have disparity values varying more, and either the ordering constraint violated or the uniqueness violated. The disparity values are inversely proportional to the depth values

Page 13: Binocular Stereo

Computer Vision March 2007 L1.13© 2007 by Davi Geiger

A BC

DE F

AB

A

CD

DC

F

FE

Stereo Correspondence: Matching Space

Rig

ht

boundary

no m

atc

hBoundary no match

Left

depth discontinuity

Surface orientation

discontinuity

F D C B A

AC

D

E

F

Note 2: Due to pixel discretization, points A and C in the right frame are neighbors.

Note 1: Depth discontinuities and very tilted surfaces can/will yield the same images ( with half occluded pixels)

In the matching space, a point (or a node) represents a match of a pixel in the left image with a pixel in the right image

Page 14: Binocular Stereo

Computer Vision March 2007 L1.14© 2007 by Davi Geiger

Cyclopean Eye

2and

22 and

2

wxl

wxr

lrw

lrx

w

x

l

r

l

r

w

x

11

11

2

1

11

11

2

1

The cyclopean eye “sees” the world in 3D where x represents the coordinate system of this eye and w is the disparity axis

For manipulating with integer coordinate values, one can also use the following representation

w

x

l

r

l

r

w

x

11

11

2

1

11

11

Restricted to integer values. Thus, for l,r=0,…,N-1 we have x=0,…2N-2 and w=-N+1, .., 0, …, N-1

Note: Not every pair (x,w) have a correspondence to (l,r), when only integer coordinates are considered. For x+w even we have integer values for pixels r and l and for x+w odd we have supixel locations.

w

w=2

Right Epipolar Line

l-1 l=3 l+1

r+1

r=5

r-1

x

x=8

Page 15: Binocular Stereo

Computer Vision March 2007 L1.15© 2007 by Davi Geiger

Left Epipolar Line

w

w=2

Right Epipolar Line x

x=8

Smoothness

Surface Constraints ISmoothness : In nature most surfaces are smooth in depth compared to their distance to the observer, but depth discontinuities also occur.  Usually smoothness implies an ordering constraint, where points to the right of must match points to the right of lq

rq

r+1

r=5

r-1

l-1 l=3 l+1

x=8

YES

YES x=8

NO: Ordering Violation

NO: Ordering Violation

YES

YES

Given that the l=3 and r=5 are matched (blue square), then the red squares represent violations of the ordering constraint while the yellow squares represent smooth matches.

Page 16: Binocular Stereo

Computer Vision March 2007 L1.16© 2007 by Davi Geiger

Surface Constraints IIUniqueness: There should be only one disparity value associated to each cyclopean coordinate x. Note: multiple matches for left eye points or right eye points are allowed.

Uniqueness

Left Epipolar Line

w

w=2

Right Epipolar Linex

x=8

r+1

r=5

r-1

l-1 l=3 l+1

NO: Uniqueness

x=8

YES, but note that it is a multiple match for the right eye

YES, but note that it is a multiple match for the left eye

NO: Uniqueness

Given that the l=3 and r=5 are matched (blue square), then the red squares represent violations of the uniqeness constraint while the yellow squares represent unique matches, in the cyclopean coordinate system but multiple matches in the left or right eyes coordinate system.

Page 17: Binocular Stereo

Computer Vision March 2007 L1.17© 2007 by Davi Geiger

Bayesian Formulation

)}),(),,(({

))},(({)}),({|)},(),,(({)}),(),,({|)},(({

eqIeqIP

exwPexweqIeqIPeqIeqIexwP

rR

lL

rR

lL

rR

lL

The probability of a surface w(x,e) to account for the left and right image can be described by the Bayes formula as

where e index the epipolar lines. Let us develop formulas for both probability terms on the numerator. The denominator can be computed as the normalization constant to make the probability sum to 1.

Page 18: Binocular Stereo

Computer Vision March 2007 L1.18© 2007 by Davi Geiger

Peven(e,x,w) Є [0,1], for x+w even, represents how similar the images are between pixels (e,l) in the left image and (e,r) in the right image, given that they match. The epipolar lines are indexed by e.

odd if0

even if1),(where

),,()),(1(),,(),()}),({|)},(),,(({1

0

12

0

wx

wxwx

wxePwxwxePwxexweqIeqIPN

e

N

xoddevenr

Rl

L

The Image Formation

Podd(e,x,w) Є [0,1], for x+w odd, represents how similar intensity edges are in between (e,l -> e,l+1) in the left image and at (e,r ->e,r+1) in the right image.

Page 19: Binocular Stereo

Computer Vision March 2007 L1.19© 2007 by Davi Geiger

P(e,x,w) Є [0,1], for x+w even, represents how similar the images are between pixels (e,l) in the left image and (e,r) in the right image, given that they match.

3),,(),,(

1),,(

T

wxeW

T

wxeW

even eZ

wxeP

We use “left” and “right” windows to account for occlusions. We also “spread” thedifference in intensities that are below a mean value as much as possible, assuming differences above the mean to be all “unacceptable”, i.e., P(e,x,w)=0.

The Image Formation I (x+w even)

even,)5,,,(ˆ)5,,,(ˆ,)5,0,,(ˆ)5,0,,(ˆmin),,(22

wxerIelIerIelIwxeW RLRL

                                                

                                                                       

                                                                                                                       

left right left right

Page 20: Binocular Stereo

Computer Vision March 2007 L1.20© 2007 by Davi Geiger

even,)5,,,(ˆ)5,,,(ˆ,)5,0,,(ˆ)5,0,,(ˆmin),,(22

wxerIelIerIelIwxeW RLRL

The Image Formation I (x+w even, cont...)

3

3

2),,(

1

3

2),,(

T

W

T

W

ewxePZ

wxeP

33 ),,(),,(

2

1),,(),,(

3

4

3

21),,(

T

wxeW

T

wxeW

T

wxeW

T

wxeW

even eZ

wxeP

a. W~0, good, but not as good as edge to edge matching, so P ~ 2/3

b. W~T then, match is still good P ~ 1/2

1

1

1

4

3ln

3

2

2

1),,( ewxeP

3

1

1

4

3

3

2),,(

T

W

T

W

wxeP

c. W~2T then, match should not be good P 0.16~1/6

15821

1

4

3

4

3

4

3

3

216.0),,(

821

1582

1

1

wxeP

Page 21: Binocular Stereo

Computer Vision March 2007 L1.21© 2007 by Davi Geiger

The Image Formation II (x+w odd)

2

r ,2

where, oddwxwx

lwx

),0,(),0,(max

image. righton edgean match toimageleft on edgean havingfor Threshold

,),0,(),0,(,),0,(),0,(

,,max elDIerDID

T

elDIerDIDelDIerDID

LRrle

D

LRLR

where

22),,(

1),,(

1),,(

1),,(

DT

wxeD

wxeD

wxeD

odd eZ

wxeP

Page 22: Binocular Stereo

Computer Vision March 2007 L1.22© 2007 by Davi Geiger

a. Large D+ ~ TD, small D- ~ 0 P ~ 1

22

11

1 D

DT

T

eZZ

e

b. D+ ~0, D- ~0, ok, not great (just like W~0 match), so P ~ 2/3 1)1( 2

3

21

3

2 DTZeZ

22),,(

1),,(

1),,(

2

3),,(

DT

wxeD

wxeD

wxeD

wxeP

The Image Formation II (cont...)

32

3

3

2),,(

14

wxeP

c. D+ ~ (TD), D- ~ (TD) , e.g., DL ~ 0 or Dr ~ 0, not good, so say P ~1/5~(2/3)^4

d. D- > D+ is bad news, so as D- TD and D+ ~ 0 then P 0

02

3),,(

631 2

DD TT

wxeP

222),,(

31),,(

1),,(6

),,(

1),,(

1),,(

2

31),,(

DDD T

wxeD

wxeD

wxeDTT

wxeD

wxeD

wxeD

odd eZ

wxeP

Say TD>6 1 Zand3

2 e

Page 23: Binocular Stereo

Computer Vision March 2007 L1.23© 2007 by Davi Geiger

Summary of Probabilities

],[

2

3),,(

DwxCe

wxeP

We can normalize across w, i.e.,

D

Dw

wxeP

wxePwxe

),,(

),,(),,(

3

4

2

3 where

72.0

odd if),,(

31),,(

1),,(

even if),,(),,(

36.01

],[

matriceson data thestorecan we, lineepipolar each For

22

3

wxT

wxeD

wxeD

wxeD

wxT

wxeW

T

wxeW

DwxC

e

D

e

Page 24: Binocular Stereo

Computer Vision March 2007 L1.24© 2007 by Davi Geiger

Problems: Flat or (Double) Tilted ?

w

left righ

t

left righ

t

083.02

1

4

3

3

2

321

3

2

32

32

1

3

2

32

32

32

1

3

2

32

32

1

3

2

321

3

223

Tiltflat PP

The probabilities for (a) and (b) are the same. More precisely, after normalization, we have

The preference for flat surfaces must be built as a prior model of surfaces.

w=2

x

l=1 l=2 l=3

r=5

r=4

r=3

x=8

X

X

a.Flat Plane b.Doubled Tilted

Page 25: Binocular Stereo

Computer Vision March 2007 L1.25© 2007 by Davi Geiger

left righ

t

Occluded

left righ

t

Flat

w=2

x

l=1 l=2 l=3

r=5

r=4

r=3

r=2

x=8

X

X

083.02

1

4

3

3

2

321

3

2

32

32

1

3

2

32

32

32

1

3

2

32

32

1

3

2

321

3

223

Occludedflat PP

The probabilities for (a) and (b) are the same. More precisely, after normalization, we have

The occlusions are computed as matches between edges, e.g., (x=6,w=1), and we should correct this.Also, and related, the preference for flat surfaces must be built as a prior model of surfaces.

Problems: Flat or Occluded?

a.Flat Plane b.Occluded

X

X

Page 26: Binocular Stereo

Computer Vision March 2007 L1.26© 2007 by Davi Geiger

Problems: Occluded or Tilted?

left righ

t

59.013

15

3

2

321

3

2

1

11

32

51

1

3

2

32

51

1

3

2

321

3

222

TiltP

left righ

t

x

w

w=2

l=2 l=3

r=6

r=5

r=4

r=3x=

8X

X

X

0)1,5,()2,6,()3,7,()3,9,(

)1,6,( and,)1,6,(

)2,7,( and,)2,7,(

0)3,8,( and2)3,8,(

Say

wxeWwxeWwxeWwxeW

TwxeDTwxeD

TwxeDTwxeD

wxeDTwxeD

DD

DD

D

05.013

15

5

1

321

3

2

1

11

32

51

1

5

1

32

51

1

5

1

321

3

222

OcclusionP

Occlusions are computed as matches between edges, e.g., (x7=,w2) and we should correct this.The balance (or preference) for flat and occluded surfaces must be built as a prior model of surfaces.

a.Flat and Occluded Planes b.Tilted Planes

Page 27: Binocular Stereo

Computer Vision March 2007 L1.27© 2007 by Davi Geiger

1,1,' and

odd if0

even if1),(where

)',()),(1()',(),(

)}),(({1

0

12

0),(),(

wwww

wx

wxwx

wwTwxwwTwx

exwPN

e

N

xxeoddxeeven

Prior Pairwise Model

We will introduce a bias cost for tilt surfaces and another one for occlusions

Page 28: Binocular Stereo

Computer Vision March 2007 L1.28© 2007 by Davi Geiger

TitltCostwwwwwwTitltCost

xeeven CC

wwT

evenwx

2

321

2

31)',(

1,,1''

,

Prior Model for Tilted Surfaces (x+w even)

We introduce a cost for tilt surfaces

w

w=2

Right Epipolar Line

l-1 l=3 l+1

r+1

r=5

r-1

x

x=8w

w=2

Right Epipolar Line

l-1 l=3 l+1

r+1

r=5

r-1

x

x=8X

22

3

3

),,(3

1),,(

1),,(

)1,1,()1,1,(36.01

)1,1,()1,1,(36.01

2

3

2

3

2

3

2

31)',,1,,(

DT

wxeD

wxeD

wxeD

T

wxeW

T

wxeWTitltCost

T

wxeW

T

wxeW

even CwwxxeP

Tilted: w’=w-1, w+1

Flat w’=w

How to set the value for TiltCost? Check two neighbor probabilities

Flat should win over Tilt even under some edge-edge matching errors (pixel-pixel being the same).

it. do will75.02

3

2

3~

2

3

2

3 :case bad""

175.1

2

3~

3

41

2

131

72.02

TiltCostTitltCost

TitltCost

Page 29: Binocular Stereo

Computer Vision March 2007 L1.29© 2007 by Davi Geiger

2

max

223

2

max

),,(1

),,(3

1),,(

1),,(),1,(),1,(36.01

),,(1

2

3

2

3

2

3

2

3

1)',,1,,(

R

R

D

L

L

D

wxeDostOcclusionC

T

wxeD

wxeD

wxeD

T

wxeW

T

wxeW

D

wxeDostOcclusionC

odd CwwxxeP

w

w=2

Right Epipolar Line

l-1 l=3 l+1

r+1

r=5

r-1

x

x=8

XX

X

When transitioning from edge to edge, it does not represent edges being matched but rather an occlusion. So a different metric must be introduced.

Flat w’=w

Prior Model for Occluded Surfaces (x+w odd)

1'),,(

1

'0

1'),,(

1

)',,(

2

3)',(

2

max

2

max

)',,(

),(

wwD

wxeD

ww

wwD

wxeD

wwx

wwT

oddwx

R

R

L

L

wwxostOcclusionC

xeodd

For two neighbor matches we have

Moreover, when an occlusion transitions occurs ww’=w-1, w+1, we do not want the edge-edge match cost, C[x,w+D]

Page 30: Binocular Stereo

Computer Vision March 2007 L1.30© 2007 by Davi Geiger

Problems: Tilted or Occluded ?

left righ

t

left righ

t

x

w

w=2

l=2 l=3

r=6

r=5

r=4

r=3

x=8

X

X

X

32

max

),,(),,(36.01

)1,,(1

2

3

2

3

2

3

3,7

T

wxeW

T

wxeWTitltCost

D

wxeDostOcclusionC

L

L

wx

4424

1~

2

3

2

3 0.5 ~for

2

3

2

3

2

3

14

1

max

1)1,,(

1

2

max

TiltCostostOcclusionCTiltCostostOcclusionC

D

DTitltCostostOcclusionC

L

L

TitltCostD

wxeDostOcclusionC

L

L

Occluded with intensity edges wins

Page 31: Binocular Stereo

Computer Vision March 2007 L1.31© 2007 by Davi Geiger

Problems: Tilted or Occluded ?

left righ

t

left righ

t

x

w

w=2

l=2 l=3

r=6

r=5

r=4

r=3

x=8

X

X

X

Tilted wins

3),,(),,(36.012

max

2

3

2

3

3

2)1,,(

1T

wxeW

T

wxeWL

TitltCostD

wxeDostOcclusionC

TiltCostostOcclusionCD

D

wx

TitltCostostOcclusionC

L

L

TitltCostD

wxeDostOcclusionC

L

L

181.0~2

3

2

3 0.1 ~for

2

3

2

3

2

3

3,7

181.0

max

1)1,,(

1

2

max

Page 32: Binocular Stereo

Computer Vision March 2007 L1.32© 2007 by Davi Geiger

01.12

321

2

31)',(

101,,1''10

),(

CC

wwTwwwwww

xeeven

1,1,' andodd if0

even if1),(where

)',()),(1()',(),()}),(({1

0

12

0),(),(

wwwwwx

wxwx

wwTwxwwTwxexwPN

e

N

xxeoddxeeven

Final Prior Pairwise Model

1'),,(

1

'0

1'),,(

1

)',,(where2

31)',(

2

max

2

max]',1[|'|)',,(25

),(

wwD

wxeD

ww

wwD

wxeD

wwxC

wwT

R

R

L

L

wxCwwwwx

xeodd

25 and10 issolution A

181.0&44&75.0Since

ostOcclusionCTitltCost

TitltCostostOcclusionCTitltCostostOcclusionCTitltCost

Page 33: Binocular Stereo

Computer Vision March 2007 L1.33© 2007 by Davi Geiger

Limit DisparityThe search is within a range of

disparity : 2D+1 , i.e.,

The rational is:

(i) Less computations

(ii) Larger disparity matches imply larger errors in 3D estimation.

(iii) Humans only fuse stereo images within a limit. It is called the Panum’s limit.

Dlrw ||||

w=-3

w

w=2

Right Epipolar LineSmoothness (+Ordering)

l-1 l=3 l+1

r+1

r=5

r-1

x

x=8

D=3

Dw ||

Startin

g x

We may start the computations at x=D to avoid limiting the range of w values. In this case, we also limit the computations to up to x=2N-2-D

Page 34: Binocular Stereo

Computer Vision March 2007 L1.34© 2007 by Davi Geiger

The Posterior Model

odd if0

even if1),(where

)1(),()(,,)))(,(1(

)1(),()(,,))(,(

)}),(),,({|)},(({

is, modelposterior final Then the

1

0

12

0 ),(

),(

wx

wxwx

xwxwTxwxePxwx

xwxwTxwxePxwx

eqIeqIexwP

N

e

N

x xeoddodd

xeeveneven

rR

lL

For the optimization process we are only interested on the cost function associated to the probability.

We then store the cost functions associated to P(e,x,w) and T(w,w’) in arrays as follows

Page 35: Binocular Stereo

Computer Vision March 2007 L1.35© 2007 by Davi Geiger

odd

1),,(

125

0],[

1),,(

125

even1,0,1],1[|)|1(10

]1,,[

2

max

2

max

wx

iD

wxeD

iDwxC

iD

wxeD

wxiDiwxCii

iDwxF

R

R

L

L

odd if),,(

31),,(

1),,(

even if),,(),,(

36.01

],[

matriceson data thestorecan we, lineepipolar each For

22

3

wxT

wxeD

wxeD

wxeD

wxT

wxeW

T

wxeW

DwxC

e

D

Cost function arrays

Page 36: Binocular Stereo

Computer Vision March 2007 L1.36© 2007 by Davi Geiger

w=D

w

w=-D

?

States (2D+1)

?

1 2 3 … x-1 x ... 2N-2

?

Dynamic Programming

F[x, w+D, i+1]

],[ DwxC

x-1*[w+i+D]

x*[w+D]= (x,w) C[x, w+D] + mini=-1,0,1{x-1*[w+i+D]+F[x,w+D, i+1]}

Page 37: Binocular Stereo

Computer Vision March 2007 L1.37© 2007 by Davi Geiger

Stereo-Matching DP( ImageLeft, ImageRight, D, e ) (to be solved line by line)InitializeCreate the Graph (V(x,w), E(x,w,x-1,w’) (length is 2N-1 and width is 2D+1)/* precomputation of the match cost and transition cost, stored in arrays C and F */ loop for v=(x,w) in V (i.e., loop for x and loop for w) Set-Array C[x,w+D] (see previous slides for the formula) loop for i=-1,0,1 such that (x’=x-1, w’=w+i) Set-Array F[x, w+D, i+1] (see previous slides for the formula) end loop end loop Main loop loop for x=D,2N-1-D loop for w=-D,…,0,…,D Cost loop for i=-1,0,1 (check –D <= w+i = <D ) /* (w’=w-1, w, w+1) */ Temp= x-1*[w+i+D]+ F[x, w+D, i+1]; if Temp < Cost);

Cost= Temp; back[x,w+D] =w+i; end loopx*[w+D]= Cost+ if (x+w=even) C[x,w+D];

end loop end loop

Dynamic Programming

Page 38: Binocular Stereo

Computer Vision March 2007 L1.38© 2007 by Davi Geiger

leftright

Epipolar Line Interaction

1

0

22

0

))1,(),,(,,(1),(

N

e

N

x

exwexwxeD

eZ

exwP

1)5,2

,1),,((ˆ)5,2

,1),,((ˆ)5,2

3,),,((ˆ)5,

23

,),,((ˆ

)1,(),(

)))1,(),,(,,(

11

1

ewxrIewxlIewxrIewxlI

exwexw

exwexwxeD

eR

eL

eR

eL

ee

Epipolar interaction: the larger the intensity edges the less the cost (the higher the probability) to have disparity changes across epipolar lines

Page 39: Binocular Stereo

Computer Vision March 2007 L1.39© 2007 by Davi Geiger

1

0

22

0

))1,(),,(,,()),1(),,(,,(),),,((1),|)},(({

N

e

N

x

exwexwxeDexwexwxeFexexwCRL e

ZIIexwP

Stereo Correspondence: Belief Propagation (BP)

We want to obtain/compute the marginal

We have now the posterior distribution for disparity values (surface {w(x,e)})

D

DNexw

D

DNexw

D

DNeNxw

RL

D

Dexw

D

Deexxw

D

DeNxw

D

Dexw

D

Dexw

D

DeNxw

D

Dexw

D

Dexw

D

DeNxw

IIexwP

exexwP

)',0'( )','( ),22'(

)',0'( )'&'( )',22'(

)1,0'( )1','( )1',22'(

)0',0'( )0','( )0',22'(

),|)}','(({......

...

......

...

......

......),),,((

These are exponential computations on the size (length) of the grid N

Page 40: Binocular Stereo

Computer Vision March 2007 L1.40© 2007 by Davi Geiger

12 w

x

e e

x

“Horizontal” Belief Tree

)),((h exwP )),((v exwP

“Vertical” Belief Tree

Kai Ju’s approximation to BP

12 w

We use Kai Ju’s Ph.D. thesis work to approximate the (x,e) graph/lattice by horizontal and vertical graphs, which are singly connected. Thus, exact computation of the marginal in these graphs can be obtained in linear time. We combine the probabilities obtained for the horizontal and vertical graphs, for each lattice site, by “picking” the “best” one (the ones with lower entropy, where .)

D

Dw

exwPexwPexS )),((log)),((),(

Page 41: Binocular Stereo

Computer Vision March 2007 L1.41© 2007 by Davi Geiger

Result

Page 42: Binocular Stereo

Computer Vision March 2007 L1.42© 2007 by Davi Geiger

Region A

Region B

Region A Left

Region A Right

Region B Left

Region B Right

Junctions and its properties (false matches that reveal information from vertical disparities (see Malik 94, ECCV)

Some Issues in Stereo: