3d photography: features & correspondences - cvg

75
3D Photography: Features & Correspondences

Upload: others

Post on 23-Mar-2022

20 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 3D Photography: Features & Correspondences - CVG

3D Photography:Features &

Correspondences

Page 2: 3D Photography: Features & Correspondences - CVG

Feb 17 Introduction

Feb 24 Geometry, Camera Model, Calibration

Mar 3 Features, Tracking / Matching

Mar 10 Project Proposals by Students

Mar 17 Structure from Motion (SfM) + 2 papers

Mar 24 Dense Correspondence (stereo / optical flow) + 2 papers

Mar 31 Bundle Adjustment & SLAM + 2 papers

Apr 7 Multi-View Stereo & Volumetric Modeling + 2 papers

Apr 14 Project Updates

Apr 21 Easter

Apr 28 3D Modeling with Depth Sensors + 2 papers

May 5 3D Scene Understanding + 2 papers

May 12 4D Video & Dynamic Scenes + 2 papers

May 19 Guest lecture: KinectFusion by Shahram Izadi

May 26 Final Demos

Schedule (tentative)

Page 3: 3D Photography: Features & Correspondences - CVG

2D-2D

2D-3D 2D-3D

mimi+1

M

Correspondences are at the heart of 3D reconstruction from images

Today: Features & Correspondences

Page 4: 3D Photography: Features & Correspondences - CVG

Feature matching vs. tracking

Extract features independently and then match by comparing descriptors

Extract features in first images and then try to find same feature back in next view

What is a good feature?

Image-to-image correspondences are key to passive triangulation-based 3D reconstruction

Page 5: 3D Photography: Features & Correspondences - CVG

Compare intensities pixel-by-pixel

Comparing image regions

I(x,y) I´(x,y)

Sum of Square Differences

Dissimilarity measures

Page 6: 3D Photography: Features & Correspondences - CVG

Feature points

• Required properties:

• Well-localized

• Stable across views, „repeatable“

(i.e. same 3D point should be extractedas feature for neighboring viewpoints)

Page 7: 3D Photography: Features & Correspondences - CVG

Feature point extraction

homogeneous

edge

corner

Find points (local image patches) that differ as much as possible from all neighboring points

Page 8: 3D Photography: Features & Correspondences - CVG

Feature point extraction

• Approximate SSD for small displacement ∆

• Image difference, square difference for pixel

• SSD for window

Page 9: 3D Photography: Features & Correspondences - CVG

Feature point extraction

homogeneous

edge

corner

Find points for which the following is maximum

i.e. maximize smallest eigenvalue of M

Page 10: 3D Photography: Features & Correspondences - CVG

Harris corner detector

• Only use local maxima, subpixel accuracy through second order surface fitting

• Select strongest features over whole image and over each tile (e.g. 1000/image, 2/tile)

• Use small local window:

• Maximize „cornerness“:

Page 11: 3D Photography: Features & Correspondences - CVG

Simple matching• for each corner in image 1 find the corner in

image 2 that is most similar and vice-versa

• Only compare geometrically compatible points

• Keep mutual best matches

What transformations does this work for?

Page 12: 3D Photography: Features & Correspondences - CVG

Compare intensities pixel-by-pixel

Comparing image regions

I(x,y) I´(x,y)

Sum of Square Differences

Dissimilarity measures

Page 13: 3D Photography: Features & Correspondences - CVG

Compare intensities pixel-by-pixel

Comparing image regions

I(x,y) I´(x,y)

Zero-mean Normalized Cross Correlation

Similarity measures

Page 14: 3D Photography: Features & Correspondences - CVG

Feature matching: example

0.96 -0.40 -0.16 -0.39 0.19

-0.05 0.75 -0.47 0.51 0.72

-0.18 -0.39 0.73 0.15 -0.75

-0.27 0.49 0.16 0.79 0.21

0.08 0.50 -0.45 0.28 0.99

1 5

24

3

1 5

24

3

What transformations does this work for?

What level of transformation do we need?

Page 15: 3D Photography: Features & Correspondences - CVG

Wide baseline matching

• Requirement to cope with larger variations between images• Translation, rotation, scaling

• Foreshortening

• Non-diffuse reflections

• Illumination

} geometric

transformations

photometric

changes}

Page 16: 3D Photography: Features & Correspondences - CVG

Invariant detectors

Scale invariant

Affine invariant(approximately invariant

w.r.t. perspective/viewpoint)

Rotation invariant

Page 17: 3D Photography: Features & Correspondences - CVG

2D Transformations of a Local Patch

Block

Matching

e.g. MSER

In practise

hardly observable

in small

patches !

Page 18: 3D Photography: Features & Correspondences - CVG

Example: Find Correspondences between these images using the MSER Detector [Matas‘02]

Page 19: 3D Photography: Features & Correspondences - CVG

MSER Features

Local regions, not points !

Page 20: 3D Photography: Features & Correspondences - CVG

Extremal Regions:

- Much Brighter than

Surrounding

- Use intensity threshold

Page 21: 3D Photography: Features & Correspondences - CVG

Extremal Regions:

- OR: Much Darker than

Surrounding

- Use intensity threshold

Page 22: 3D Photography: Features & Correspondences - CVG

Regions: Connected Pixels at some threshold

- Region Size = # Pixels

- Maximally stable: Size Constant near some

threshold

Page 23: 3D Photography: Features & Correspondences - CVG

A Sample Feature

Page 24: 3D Photography: Features & Correspondences - CVG

„T“ is maximally stable with respect to surrounding

Page 25: 3D Photography: Features & Correspondences - CVG

- Compute „center of gravity“

- Compute Scatter (PCA / Ellipsoid)

Page 26: 3D Photography: Features & Correspondences - CVG

Different Images: Different positions/sizes/shapes

Ellipse abstracts from pixels !

Geometric representation: position/size/shape

Page 27: 3D Photography: Features & Correspondences - CVG

Still: How to compare ?

Idea: Normalize to „Default“ Position/Size/Shape !

⇒ e.g. Circle of Radius 16 Pixels !

Page 28: 3D Photography: Features & Correspondences - CVG

Idea: Normalize to „Default“ Position/Size/Shape !

Ok, but 2D orientation ?

Page 29: 3D Photography: Features & Correspondences - CVG

• Idea (Lowe‘99): Run over all Pixels:Chart Local Gradient Orientation in Histogram

• Find Dominant Orientation in Histogram

• Rotate Local Patch into Dominant Orientation

Page 30: 3D Photography: Features & Correspondences - CVG

Each normalized patch obtained from single image !

Page 31: 3D Photography: Features & Correspondences - CVG

Wrap-up MSER

• detect sets of pixels brighter/darker than surr.

• fit elliptical shape to pixel set

• warp image so that ellipse becomes circle

• rotate to dominant gradient direction [otherconstructions possible as well]

⇒Affine normalization of feature leads to similarpatches in different views !

Two MSERegions: Are they in correspondence ?

Page 32: 3D Photography: Features & Correspondences - CVG

Traditional Matching Approach:

Compare Regions (Sum of Squared Differences)

- Small Misalignment

- Brightness Change

⇒ More Tolerant Comparison ?

Page 33: 3D Photography: Features & Correspondences - CVG

SIFT Descriptor [Lowe‘99]:

- Brightness offset: use only gradients !

For each Sector:

- Store Orientations of Gradients !

Gradient Magnitude

Partition into Sectors

Gradient Orientation/Magnitude

Page 34: 3D Photography: Features & Correspondences - CVG

Quantize Gradient Orientation, e.g. 45° Steps

Orientation Histogram per SectorGradient Orientation/Magnitude

Orientation Histogram (Magnitude as Weight)

m Sectors with n Orientations: (m·n) values

Construct Vector

35

12

10

25

29

Page 35: 3D Photography: Features & Correspondences - CVG

35

12

10

25

29

0.12

0.04

0.03

0.08

0.10

Normalize

(Suppresses

Changing

Contrast

Effects)

… …

Summed Gradient

Magnitudes for

Different Sectors and

Orientations

„SIFT

Descriptor“:

128 Bytes

Memory Comparison

11x11 Patch:

Raw Grey Values =

121 Bytes

Page 36: 3D Photography: Features & Correspondences - CVG

Wrap Up:

Normalized Patch Comparison vs. Descriptor

Usage of Gradients:

⇒ Intensity Offset Compensation

Subdivision into Sectors / Per-Sector Histogram:

⇒Small Alignment Error Compensation

Normalization of Histogram Vector:

⇒ Image Contrast Compensation

But most important:

Avoid sudden „descriptor jumps“

Page 37: 3D Photography: Features & Correspondences - CVG

Classical Histogram (Quantization 45°):

22° quantized/rounded to 0°

23° quantized/rounded to 45°

Small Differences can lead to different bins !

Feature Position, Size, Shape, Orientation uncertain,

Image Content noisy !

Descriptor MUST tolerate this (no sudden changes !)

Solution: „Soft-Binning“ !

Page 38: 3D Photography: Features & Correspondences - CVG

Histogram (Quantization 45°):

20

°

0

°

45

°

90

°

Classical

(closest bin)

1.0

22

°

2.0

If orientation is 3°

different, all

measurements go to

second bin !

=> Sudden Change in

Histogram from

(2 0 0 0) to (0 2 0 0)

Page 39: 3D Photography: Features & Correspondences - CVG

Histogram (Quantization 45°):

20

°

0

°

45

°

90

°

Soft-Binning

0.56

22

°

1.07

If orientation is 3°

different, descriptor

changes only gradually !

0.440.93

0.56 0.44

Soft Weights:

„Bin Correctness“

Page 40: 3D Photography: Features & Correspondences - CVG

Wrap-up

Detector:

- Find interesting regions (position/size/shape)

- Assign dominant gradient orientation

- Normalize regions

Descriptor:

- Compute „signature“ upon normalized region

- Behave smoothly in presence of distortions:

brightness changes / normalization inaccuracies

Page 41: 3D Photography: Features & Correspondences - CVG

How to find correspondences ?

For each Region: 128-dim. Descriptor

Page 42: 3D Photography: Features & Correspondences - CVG

Matching Scenario I

Two images in a dense foto sequence:

- think about maximum movement d (e.g. 50 pixel)

- Search in a window +/- d of old position

- Compare descriptors, choose most similar

Page 43: 3D Photography: Features & Correspondences - CVG

Matching Scenario II

Two arbitrary images / Wide baseline

- Compare every descriptor with every other (e.g. GPU)

- OR: Find small set of matches, predict others

- OR: Find nearest neighbor in descriptor space

Page 44: 3D Photography: Features & Correspondences - CVG

Searching Descriptor Space

Key Ideas:

- Each descriptor consists of 128 numbers

Imagine vector from IR128

- Correspondending descriptors: not far apart !

- Arrange all descriptors of image 1 in kd-tree

(imagine „octree“ but with more dims)

- For each descriptor of image 2:

Find (approximate) nearest neighbor in tree

Page 45: 3D Photography: Features & Correspondences - CVG

Searching Descriptor Space

„Learn“ important dimensions of 128D space for a

given scene, e.g. PCA or LDA

Project descriptors to important dimensions, use kd-tree

Page 46: 3D Photography: Features & Correspondences - CVG

Matching Techniques

Spatial Search Window:

- Requires/exploits good prediction

- Can avoid far away similar-looking features

- Good for sequences

Descriptor Space:

- Initial tree setup

- Fast lookup for huge amounts of features

- More sophisticated outlier detection required

- Good for asymmetric (offline/online) problems,

registration, initialization, object recognition, wide

baseline matching

Page 47: 3D Photography: Features & Correspondences - CVG

Correspondence Verification

Features have only very local view => Mismatches

How to detect ?

- Discard Matches with low similarity

- Delete „non-distinctive“ features (those with close

match in same image or with similar 2nd best match)

- Check for bi-directional consistency

- Geometric verification, e.g. RANSAC

Page 48: 3D Photography: Features & Correspondences - CVG

Object Detection / Pose Estimation using single MSER

Affine feature: position(2x), shape(3x), orientation(1x)

6 degree of freedom, more than 2D simple point !

Page 49: 3D Photography: Features & Correspondences - CVG

2D Transformations of a Local Patch

Block

Matching

e.g. MSERe.g. SIFT

Page 50: 3D Photography: Features & Correspondences - CVG

Lowe’s SIFT features

Detector + Descriptor

Recover features with position, orientation and scale

(Lowe, ICCV99)

Page 51: 3D Photography: Features & Correspondences - CVG

Position

• Look for strong responses of DOG filter (Difference-Of-Gaussian)

• Only consider local maxima

Page 52: 3D Photography: Features & Correspondences - CVG

Scale

• Look for strong responses of DOG filter (Difference-Of-Gaussian) over scale space

• Only consider local maxima in both position and scale

• Fit quadratic around maxima for subpixel

Page 53: 3D Photography: Features & Correspondences - CVG

Minimum contrast and “cornerness”

Page 54: 3D Photography: Features & Correspondences - CVG

Orientation

• Create histogram of local gradient directions computed at selected scale

• Assign canonical orientation at peak of smoothed histogram

• Each key specifies stable 2D coordinates (x, y, scale, orientation) 0 2π

Page 55: 3D Photography: Features & Correspondences - CVG

SIFT descriptor

• Thresholded image gradients are sampled over 16x16 array of locations in scale space

• Create array of orientation histograms

• 8 orientations x 4x4 histogram array = 128 dimensions

Page 56: 3D Photography: Features & Correspondences - CVG
Page 57: 3D Photography: Features & Correspondences - CVG

• Affine feature evaluation + binaries:http://www.robots.ox.ac.uk/~vgg/research/affine/

• SIFT + MSER + some tools:http://vlfeat.org

• SURF:http://www.vision.ee.ethz.ch/~surf/

• GPU-SIFT:http://www.cs.unc.edu/~ccwu/siftgpu/

• DAISY (dense descriptors)http://cvlab.epfl.ch/~tola/daisy.html

• FAST[er] (simple but …)http://svr-www.eng.cam.ac.uk/~er258/work/fast.html

Some Feature Resources

Check also opencv + try to google

Page 58: 3D Photography: Features & Correspondences - CVG

• BRIEF[Calonder10]: binary descriptor (tests=position a darker than b), compare descriptors by XOR (Hamming) + POPCNT

• RIFF[Takacs10]: CENSURE + gradients tangential/radial

• ORB[Rublee11] FAST+orientation

• BRISK[Leutenegger11] FAST+scale+BRIEF

• FREAK[Alahi12] FAST + “daisy”-BRIEF

• Lucid[Ziegler12]: “sort intensities”

• D-BRIEF[Trzcinski12]:Box-Filter+learned projection+BRIEF

• LDA HASH[Strecha12]: binary testson descriptor

Recent Variants and Accelerations (much faster, but this usually comes at a price)

Page 59: 3D Photography: Features & Correspondences - CVG

Features:

local normalization + robust descriptors

Allow

• changing perspective/illumination/scale

• scenarios with many occlusions

• different reasoning (descriptor vectors)

Require• (Nearly) planarity across feature in 3d• „detectable“ regions, enough structure

But• fewer features / slower• invariance vs. descriptiveness:

Descriptors can be similar when regions are not !

=> What level of invariance / speed / properties doYOU really need ?

Page 60: 3D Photography: Features & Correspondences - CVG

Feature tracking

• Identify features and track them over video

• Small difference between frames

• potential large difference overall

• Standard approach:

KLT (Kanade-Lukas-Tomasi)

Page 61: 3D Photography: Features & Correspondences - CVG

Tracking corners through video

Page 62: 3D Photography: Features & Correspondences - CVG

Good features to track

• Use same window in feature selection as for tracking itself

• Compute motion assuming it is small

Affine is also possible, but a bit harder (6x6 instead of 2x2)

differentiate:

Page 63: 3D Photography: Features & Correspondences - CVG

Example

Simple displacement is sufficient between consecutive frames, but not to compare to reference template

Page 64: 3D Photography: Features & Correspondences - CVG

Example

Page 65: 3D Photography: Features & Correspondences - CVG

Synthetic example

Page 66: 3D Photography: Features & Correspondences - CVG

Good features to keep tracking

Perform affine alignment between first and last frame

Stop tracking features with too large errors

Page 67: 3D Photography: Features & Correspondences - CVG

• Brightness constancy assumption

Intensity Linearization

(small motion)

• 1D example

possibility for iterative refinement

Page 68: 3D Photography: Features & Correspondences - CVG

• Brightness constancy assumption

Intensity Linearization

(small motion)

• 2D example

(2 unknowns)

(1 constraint)?

isophote I(t)=I

isophote I(t+1)=I

the “aperture” problem

Page 69: 3D Photography: Features & Correspondences - CVG

Intensity Linearization

• How to deal with aperture problem?

Assume neighbors have same displacement

(3 constraints if color gradients are different)

Page 70: 3D Photography: Features & Correspondences - CVG

Lucas-Kanade

Assume neighbors have same displacement

least-squares:

Page 71: 3D Photography: Features & Correspondences - CVG

Revisiting the small motion assumption

• Is this motion small enough?

• Probably not—it’s much larger than one pixel (1st order Taylor not sufficient)

• How might we solve this problem?* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Page 72: 3D Photography: Features & Correspondences - CVG

Reduce the resolution!

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Page 73: 3D Photography: Features & Correspondences - CVG

image It-1 image I

Gaussian pyramid of image It-1 Gaussian pyramid of image I

image Iimage It-1u=10 pixels

u=5 pixels

u=2.5 pixels

u=1.25 pixels

Coarse-to-fine optical flow estimation

slides from

Bradsky and Thrun

Page 74: 3D Photography: Features & Correspondences - CVG

image Iimage J

Gaussian pyramid of image It-1 Gaussian pyramid of image I

image Iimage It-1

Coarse-to-fine optical flow estimation

run iterative L-K

run iterative L-K

warp & upsample

.

.

.

slides from

Bradsky and Thrun

Page 75: 3D Photography: Features & Correspondences - CVG

Next week: Project Proposals