binocular stereo - university of california, san...

Post on 27-Jul-2020

Binocular Stereo

Yangyue Wan

Binocular Stereo

● Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

● Efficient Deep Learning for Stereo Matching

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

What is Stereo Matching?

● Two views from horizontally displaced cameras

● Correspondence

● Disparity

Four Steps of a Stereo Algorithm

1. Matching cost computation

2. Cost aggregation

3. Optimization

4. Disparity refinement

Q: What is matching cost?

What is Matching Cost?

● Matching cost measures how dissimilar two pixels are

● The corresponding pixel is chosen so that the similarity between the pixels is high, which means the matching cost is low

● “Winner-takes-all”: for every pixel, select the disparity with the lowest cost
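The winner-takes-all rule can be sketched in a few lines. This is a minimal illustration, assuming the matching costs are stored in a NumPy array of shape (disparities, height, width); the function name and the toy values are hypothetical.

```python
import numpy as np

def winner_takes_all(cost_volume: np.ndarray) -> np.ndarray:
    """Return, for every pixel, the disparity index with the lowest cost.

    cost_volume is assumed to have shape (num_disparities, height, width).
    """
    return np.argmin(cost_volume, axis=0)

# Toy cost volume: 3 candidate disparities for a 1x2 image.
toy_costs = np.array([
    [[0.9, 0.2]],   # costs at disparity 0
    [[0.1, 0.5]],   # costs at disparity 1
    [[0.4, 0.7]],   # costs at disparity 2
])
disparity_map = winner_takes_all(toy_costs)
```

Here the first pixel picks disparity 1 and the second picks disparity 0, because those entries hold the lowest cost in their columns.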

Matching Cost by Learning Similarity

● Inspiration

● Construct dataset

○ Equal numbers of positive and negative training examples (pairs of patches) from KITTI/Middlebury

● Network architectures

○ Fast

○ Accurate

Network Architectures

Fast: Cosine similarity

Loss:

Q: Why this loss?
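As a hint for the question above: as far as the paper by Žbontar and LeCun describes it, the fast architecture is trained with a hinge loss over a positive and a negative pair that share the same left patch. The sketch below assumes the patch features are plain vectors; the function names are hypothetical and the margin of 0.2 is an assumption for illustration.

```python
import numpy as np

def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def hinge_loss(s_pos: float, s_neg: float, margin: float = 0.2) -> float:
    """Zero once the positive pair beats the negative pair by at least `margin`.

    The margin value is an assumption made for this sketch.
    """
    return max(0.0, margin + s_neg - s_pos)

# Toy features: the positive patch matches the left patch, the negative does not.
left_patch = [1.0, 0.0]
s_pos = cosine_similarity(left_patch, [1.0, 0.0])
s_neg = cosine_similarity(left_patch, [0.0, 1.0])
loss = hinge_loss(s_pos, s_neg)
```

The loss only asks that the positive pair score higher than the negative pair by a margin, which fits a similarity score that is used for ranking disparities rather than as a probability.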

Network Architectures

Accurate: FC layers

Loss:

Q: Why this loss?
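As a hint for the question above: to my understanding of the paper, the accurate architecture ends in a single sigmoid output and is trained with the binary cross-entropy loss, treating each patch pair as a match/non-match classification. A minimal sketch, with a hypothetical function name and a clipping constant added only for numerical safety:

```python
import math

def binary_cross_entropy(score: float, target: int) -> float:
    """Binary cross-entropy for one patch pair.

    score  : network output in (0, 1), read as the probability of a match
    target : 1 for a positive (matching) pair, 0 for a negative pair
    """
    eps = 1e-12  # clipping constant, an assumption to avoid log(0)
    score = min(max(score, eps), 1.0 - eps)
    return -(target * math.log(score) + (1 - target) * math.log(1.0 - score))

loss_confident_right = binary_cross_entropy(0.9, 1)
loss_confident_wrong = binary_cross_entropy(0.1, 1)
```

A confident correct prediction gives a small loss and a confident wrong one gives a large loss, which is the behaviour one wants when the output is interpreted as a match probability.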

Matching Cost

● Inspiration

● Construct dataset

● Network architectures

● Computing the matching cost

○ Perform the forward pass for each image location and each disparity under consideration

○ Running time?
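On the running-time question, one common saving is to compute the per-patch features once per image and then evaluate only the similarity for each disparity. The sketch below assumes feature maps of shape (height, width, features) that are already L2-normalized, so the cosine similarity reduces to a dot product, and it takes the matching cost to be the negative similarity. All names are hypothetical.

```python
import numpy as np

def cost_volume_from_features(left_feat, right_feat, max_disp):
    """Build a (max_disp, H, W) cost volume from unit-norm feature maps.

    Left pixel x is compared with right pixel x - d. Positions where
    x - d falls outside the image keep an infinite cost.
    """
    height, width, _ = left_feat.shape
    cost = np.full((max_disp, height, width), np.inf)
    for d in range(max_disp):
        # Dot product of unit vectors = cosine similarity.
        sim = np.sum(left_feat[:, d:, :] * right_feat[:, :width - d, :], axis=-1)
        cost[d, :, d:] = -sim
    return cost

# Toy data: the right feature map is the left one shifted by 2 pixels.
rng = np.random.default_rng(0)
left = rng.normal(size=(4, 8, 16))
left /= np.linalg.norm(left, axis=-1, keepdims=True)
right = np.roll(left, -2, axis=1)
cost = cost_volume_from_features(left, right, max_disp=4)
best = np.argmin(cost, axis=0)
```

With this layout the expensive part (the feature extractor) runs once per image, and the loop over disparities costs only one element-wise product per candidate.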

Stereo Method

The raw outputs of the previous steps are not enough to produce an accurate disparity map, so post-processing steps are needed:

● Cross-based cost aggregation

● Semiglobal matching

● Computing the disparity image

○ Interpolation

○ Subpixel enhancement

○ Refinement

Stereo Method

● Cross-based cost aggregation: costs are collected only from pixels of the same physical object

○ Support region for position p

○ Combined support region

○ Averaged matching cost
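The support region and the averaged cost can be sketched for a single image as follows. This is a simplified illustration, assuming a grayscale image, an intensity threshold `tau` and a maximum arm length `eta` (both names are assumptions); it leaves out the combined support region, which also involves the support region in the other image.

```python
import numpy as np

def arm_length(image, y, x, dy, dx, tau, eta):
    """Number of steps the arm extends from (y, x) in direction (dy, dx).

    The arm stops at the image border, when the intensity differs from the
    anchor pixel by tau or more, or when the next step would reach eta.
    """
    height, width = image.shape
    length = 0
    while length + 1 < eta:
        ny, nx = y + (length + 1) * dy, x + (length + 1) * dx
        if not (0 <= ny < height and 0 <= nx < width):
            break
        if abs(float(image[ny, nx]) - float(image[y, x])) >= tau:
            break
        length += 1
    return length

def support_region(image, y, x, tau, eta):
    """Boolean mask: horizontal arms of every pixel on the vertical arm of (y, x)."""
    mask = np.zeros(image.shape, dtype=bool)
    up = arm_length(image, y, x, -1, 0, tau, eta)
    down = arm_length(image, y, x, 1, 0, tau, eta)
    for qy in range(y - up, y + down + 1):
        left_arm = arm_length(image, qy, x, 0, -1, tau, eta)
        right_arm = arm_length(image, qy, x, 0, 1, tau, eta)
        mask[qy, x - left_arm:x + right_arm + 1] = True
    return mask

# Toy image with two "objects": dark left half, bright right half.
image = np.zeros((5, 6))
image[:, 3:] = 100.0
mask = support_region(image, 2, 1, tau=10, eta=10)

# Average one disparity slice of the cost volume over the support region.
cost_slice = np.arange(30, dtype=float).reshape(5, 6)
aggregated_cost = float(cost_slice[mask].mean())
```

The mask covers only the dark half, so costs from the bright object do not leak into the average for a pixel on the dark object.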

Stereo Method

● Semiglobal matching

○ Understand basic semiglobal matching

○ Energy function

○ Cost function in order to minimize E(D)

○ Choose P1 and P2

○ Final cost

○ Repeat cross-based cost aggregation
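The cost function used to minimize E(D) can be sketched along one direction. The code below follows the standard semiglobal matching recurrence, C_r(p, d) = C(p, d) - min_k C_r(p-r, k) + min{C_r(p-r, d), C_r(p-r, d±1) + P1, min_k C_r(p-r, k) + P2}, for a left-to-right pass only; the final cost would combine several such directions. The function name and the toy values are hypothetical.

```python
import numpy as np

def aggregate_left_to_right(cost, p1, p2):
    """One-directional semiglobal aggregation over a (D, H, W) cost volume."""
    cost = np.asarray(cost, dtype=float)
    num_disp, height, width = cost.shape
    agg = np.empty_like(cost)
    agg[:, :, 0] = cost[:, :, 0]
    pad = np.full((1, height), np.inf)
    for x in range(1, width):
        prev = agg[:, :, x - 1]                      # shape (D, H)
        prev_min = prev.min(axis=0)                  # shape (H,)
        minus = np.vstack([pad, prev[:-1]]) + p1     # previous cost at d - 1
        plus = np.vstack([prev[1:], pad]) + p1       # previous cost at d + 1
        jump = np.broadcast_to(prev_min + p2, prev.shape)
        best = np.minimum.reduce([prev, minus, plus, jump])
        # Subtracting prev_min keeps the values from growing along the path.
        agg[:, :, x] = cost[:, :, x] + best - prev_min
    return agg

# Toy volume: 2 disparities, 1 row, 2 columns.
toy = np.array([[[0.0, 3.0]],
                [[5.0, 1.0]]])
smoothed = aggregate_left_to_right(toy, p1=100.0, p2=100.0)
```

With large penalties the second pixel ends up with costs [3, 6], so it keeps disparity 0 like its neighbour even though its raw cost favoured disparity 1. Small P1 and P2 would let it switch.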

Stereo Method

● Compute the disparity image

○ Interpolation

○ Subpixel enhancement
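Subpixel enhancement is commonly done by fitting a parabola through the cost at the chosen disparity and its two neighbours and taking the minimum of that parabola. A minimal sketch, with a hypothetical function name and a fallback to the integer disparity when the three costs do not form a strict minimum (the fallback is an assumption of this sketch):

```python
def subpixel_disparity(d, cost_minus, cost, cost_plus):
    """Refine integer disparity d using the costs at d - 1, d and d + 1."""
    denom = 2.0 * (cost_plus - 2.0 * cost + cost_minus)
    if denom <= 0:
        # The three points do not bend upwards; keep the integer value.
        return float(d)
    return d - (cost_plus - cost_minus) / denom

# Costs sampled from the parabola (d - 4.3)^2 at d = 3, 4, 5.
refined = subpixel_disparity(4, 1.69, 0.09, 0.49)
```

For costs that really lie on a parabola the fit recovers its minimum, here 4.3 instead of the integer 4.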

Stereo Method

● Compute the disparity image

○ Refinement

■ 5x5 median filter

■ Bilateral filter
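The first refinement step, a 5x5 median filter, can be sketched with NumPy alone. The edge padding is an assumption of this sketch, and the bilateral filter that follows it is not shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_5x5(disparity: np.ndarray) -> np.ndarray:
    """Replace each pixel by the median of its 5x5 neighbourhood."""
    padded = np.pad(disparity, 2, mode="edge")       # keep the output size
    windows = sliding_window_view(padded, (5, 5))    # shape (H, W, 5, 5)
    return np.median(windows, axis=(-2, -1))

# Toy disparity map with a single outlier.
noisy = np.full((7, 7), 10.0)
noisy[3, 3] = 99.0
filtered = median_filter_5x5(noisy)
```

The isolated outlier disappears because it never makes up the majority of any 5x5 window.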

Experiments

● Datasets: KITTI 2012, KITTI 2015, Middlebury

● Details of learning (skip)

● Dataset augmentation

● Runtime

● Comparison of approaches for

○ Computing matching cost

○ Stereo method

● Effects of dataset size

● Transfer learning

Q: What is transfer learning?

● Hyperparameters (skip)

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

● Learn similarity on pairs of patches to compute matching cost

● Two networks are used, one for speed and one for accuracy

● Trained in a supervised way

● Output of the CNN is used to initialize the stereo matching cost

● A series of post-processing steps follows

Efficient Deep Learning for Stereo Matching

Introduction

● Older methods use hand-crafted cost/energy functions

● Current CNN-based methods are very time-consuming

● The authors propose a new and faster network (similar to the fast architecture in the previous paper)

Network Architecture

● Siamese network, remove ReLU from last layer

● Use a product layer instead of another network

Training

● Size of inputs

○ Left = receptive field size

○ Right > receptive field size

Q: Why?

● Size of outputs

○ Left = 64

○ Right =

Q: Why?

● Softmax

● Cross-entropy loss
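The product layer, softmax and cross-entropy loss listed above can be sketched together. The code assumes one feature vector for the left patch and one feature vector per candidate disparity for the right patch, and it uses a one-hot target for simplicity, which is an assumption of this sketch. All names and the tiny 2-dimensional features are hypothetical.

```python
import numpy as np

def disparity_log_probs(left_vec, right_vecs):
    """Log-softmax over disparities of the inner products (the product layer)."""
    scores = right_vecs @ left_vec              # one score per candidate disparity
    scores = scores - scores.max()              # stabilise the exponentials
    return scores - np.log(np.exp(scores).sum())

def cross_entropy(log_probs, target_dist):
    """Cross-entropy between a target distribution and the predicted one."""
    return float(-(target_dist * log_probs).sum())

# Toy example with 3 candidate disparities; the true disparity is index 1.
left_vec = np.array([1.0, 0.0])
right_vecs = np.array([[0.0, 1.0],
                       [1.0, 0.0],
                       [0.0, -1.0]])
log_probs = disparity_log_probs(left_vec, right_vecs)
target = np.array([0.0, 1.0, 0.0])
loss = cross_entropy(log_probs, target)
```

Because all disparities are scored in one softmax, a single training example carries a signal about every candidate at once, rather than about one positive and one negative pair.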

Smoothing Deep Net Outputs

● Cost aggregation

○ Simply performs average pooling over a window of size 5 x 5

● Semiglobal block matching

○ Energy function

● Slanted plane (not very clear in the paper)

● Sophisticated post-processing

○ In contrast to the “Compute the Disparity Image” part of the previous paper, only interpolation is used here, since the other two steps were found not to improve performance
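The cost aggregation step in this list, average pooling over a 5 x 5 window, can be sketched on a cost volume of shape (disparities, height, width). The edge padding used to keep the output size is an assumption of this sketch, and the function name is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def average_pool_5x5(cost: np.ndarray) -> np.ndarray:
    """Average each disparity slice of a (D, H, W) volume over 5x5 windows."""
    padded = np.pad(cost, ((0, 0), (2, 2), (2, 2)), mode="edge")
    # Windows are taken over the two spatial axes only: shape (D, H, W, 5, 5).
    windows = sliding_window_view(padded, (5, 5), axis=(1, 2))
    return windows.mean(axis=(-2, -1))

# Toy volume: a constant cost must stay constant after pooling.
volume = np.full((3, 6, 7), 3.0)
pooled = average_pool_5x5(volume)
```

Unlike cross-based aggregation, this window is the same for every pixel, so it is cheap but can blend costs across object boundaries.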

Experiments

● Hyperparameters (skip)

● Datasets: KITTI 2012, KITTI 2015

● KITTI 2012

Q: How to explain?

● KITTI 2015

Discussion

The two papers were almost concurrent work. How are they related?

What are the strengths and weaknesses of each when compared with the other?

Thank you!
