Image Registration – Uppsala University
Registration – To Read?!
Unfortunately, the course book does not cover all aspects of image registration. You will find pieces of it, however.
Matching 6.4
ICP 12.2.6
Transformations 5.2, 11.2
Interest points 5.3.10, 5.3.11
Examples of good reading material on the topic include: Image Alignment and Stitching: A Tutorial
http://www.cs.washington.edu/education/courses/cse576/05sp/papers/MSR-TR-2004-92.pdf
Image Registration Methods: A Survey http://library.utia.cas.cz/prace/20030125.pdf
Exercises
Registration – What is it?
“register” “To adjust so as to be properly aligned.”
“fusion” “something new created by a mixture of qualities, ideas, or things”
“warping” “become or cause to become bent or twisted out of shape, typically as a result of the effects of heat or dampness”
“matching” “a person or thing that resembles or corresponds to another”
[Figure: a transformation y = φ(x) mapping points x1, x2 in space X to points y1, y2 in space Y]
Registration – When is it used?
Art and Photography
Stitching photos together, panoramic images
Astronomy
Stitching photos together and fusion of different
wavelengths
Registration – When is it used?
Robotics
Finding and tracking objects in a scene (2-D)
Chemistry
Finding similar images of molecules
Registration – When is it used?
“Word Spotting”
Matching and aligning words in
old handwritten historical
documents
Enables searching in large
document collections without
the need for manual translation
F. Wahlberg et al. 2011
Registration – When is it used?
Medical Imaging
Study skin changes over time
Colonoscopy (1-D)
Pre-operative vs post-operative
Fusion of different modalities
CT-MRI
CT-Ultrasound
PET-CT
…
And much more …
Holmberg et al. 2008
Nain et al. 2002
Registration – An ill posed problem
An ill-posed problem: many degrees of freedom compared to the available data. Given a set of pixel intensities, find a vector (deformation) for each position; the number of unknowns is greater than the number of constraints.
Rotational symmetry of objects affects uniqueness of the solution.
Solution: Restrict the set of possible transformations φ
[Figure: a transformation y = φ(x) mapping points x1, x2 in space X to points y1, y2 in space Y]
Image registration flow chart
Registration – How to do it?
[Flow chart: choose starting parameters → transform the study image → evaluate the cost function against the reference → if not converged, choose a new set of parameters and repeat; if converged, stop]
Registration – How to do it?
1) Define the set of allowed transformations φ
2) Define a useful measure of similarity
3) Choose a practical optimization procedure
[Figure: a transformation y = φ(x) mapping points x1, x2 in space X to points y1, y2 in space Y]
GEOMETRIC
TRANSFORMATIONS
Geometric Transformations
Translation (2-D, 3-D, … N-D):
y = x + t

In homogeneous coordinates (2-D):

\[
\begin{pmatrix} y_1 \\ y_2 \\ 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & t_1 \\ 0 & 1 & t_2 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ 1 \end{pmatrix}
=
\begin{pmatrix} I & t \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ 1 \end{pmatrix}
\]
Geometric Transformations
Rigid transformation: y = Rx + t, where R is a rotation matrix (det R = +1)
Rigid transformation + mirroring: y = Rx + t, where R is orthogonal with det R = −1
Geometric Transformations
Affine transformations
y = Ax + t
Geometric Transformations
Projective (2-D):
\[
x' = \frac{h_{00}x + h_{01}y + h_{02}}{h_{20}x + h_{21}y + h_{22}},
\qquad
y' = \frac{h_{10}x + h_{11}y + h_{12}}{h_{20}x + h_{21}y + h_{22}}
\]

[Figure: similarity, affine, and projective transformations of a square]
Geometrical Transformations
Intensity Interpolation
Both position and signal value are interpolated during registration
Without proper resampling of the image values (color/intensity/…), the image might
look ugly and the registration may perform poorly
Nearest neighbor
Bi- (2-D) or trilinear (3-D) interpolation
Bicubic or spline based interpolation, Windowed sinc, …
In Matlab: imrotate, imresize, interp2, interp3, …
Intensity value interpolation enables sub-pixel precision registration
[Figure: the same image resampled with nearest neighbor, bilinear, and bicubic interpolation]
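As a sketch of how bilinear interpolation fills in values between pixel centers (a minimal Python/NumPy version, assuming a grayscale image and clamping coordinates at the borders):

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Sample image at continuous coordinates (x, y) = (col, row)
    using bilinear interpolation. Coordinates are clipped to the
    valid range, mimicking border replication."""
    h, w = img.shape
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bot = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bot

img = np.array([[0.0, 10.0],
                [20.0, 30.0]])
print(bilinear_sample(img, 0.5, 0.5))  # average of all four pixels: 15.0
```

This per-sample form makes the sub-pixel precision mentioned above concrete: the sampled value varies smoothly as (x, y) moves between pixel centers.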
Registration – How to do it?
1) Define the set of allowed transformations φ
2) Define a useful measure of similarity
3) Choose a practical optimization procedure
[Figure: a transformation y = φ(x) mapping points x1, x2 in space X to points y1, y2 in space Y]
Different kinds of image fusion
Similar images
Same camera
Same modality
Same patient
Less similar images
Different brightness or contrast
Different lighting (e.g. day/night)
Different modalities (CT/MRI/Ultrasound)
Photo vs a drawing
SIMILARITY MEASURES
Image similarity – area based methods
Several options exist to measure (dis)similarity between the transformed source image φ(X) and the target Y: Sum of squared differences (dissimilarity)
Correlation (similarity)
Mutual information (similarity)
These are based on pointwise comparisons between pixels in φ(X) and Y
Compare pixel values in φ(X) and Y,
sweep and average/sum/aggregate
over the whole image φ(X) where it
overlaps with Y
Sum of Squared Differences
Pros:
Simple, intuitive and fast
Cons:
Signals must have exactly the same brightness and
contrast
Does not work for different modalities
\[
\mathrm{SSD}(X,Y) = \sum_i (x_i - y_i)^2
\]
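The SSD measure can be written in a few lines (a Python/NumPy sketch; note how even a constant brightness shift of one gray level changes the score):

```python
import numpy as np

def ssd(x, y):
    """Sum of squared differences between two equally sized images."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((x - y) ** 2)

a = np.array([[1, 2], [3, 4]])
print(ssd(a, a))      # identical images -> 0.0
print(ssd(a, a + 1))  # constant brightness shift of 1 -> 4.0
```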
The correlation coefficient
How can we find this template?
Linear intensity changes are common in
practice. SSD is not appropriate here.
Normalized Cross-Correlation can be used.
The correlation coefficient
The correlation coefficient
Pros:
Corrects for intensity changes (common in practice)
Efficient to evaluate
Cons:
Does not work for (very) different modalities.
Flatness of the similarity-measure maxima.
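A minimal sketch of the normalized cross-correlation coefficient in Python/NumPy, illustrating the invariance to linear intensity changes claimed above:

```python
import numpy as np

def ncc(x, y):
    """Normalized cross-correlation coefficient: subtract the means,
    divide by the norms of the centered signals. Invariant to linear
    intensity changes y' = a*y + b with a > 0."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    x = x - x.mean()
    y = y - y.mean()
    return np.sum(x * y) / (np.linalg.norm(x) * np.linalg.norm(y))

a = np.array([1.0, 2.0, 3.0, 4.0])
print(ncc(a, 2 * a + 5))   # close to 1.0: linear intensity change
print(ncc(a, -a))          # close to -1.0: inverted contrast
```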
In Fourier domain:
Phase correlation for translation
Find a peak in the inverse Fourier transform of the normalized cross power spectrum of the two images:

\[
R = \frac{F(X)\,\overline{F(Y)}}{\left| F(X)\,\overline{F(Y)} \right|}
\]

The position of the peak in \(F^{-1}(R)\) indicates the displacement.
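A sketch of phase correlation with NumPy's FFT, assuming cyclic (wrap-around) shifts; the peak of the inverse transform gives the displacement:

```python
import numpy as np

def phase_correlation(x, y):
    """Estimate the cyclic translation between two images via the
    normalized cross power spectrum: the peak of its inverse FFT
    sits at the shift that maps the second image onto the first."""
    Fx = np.fft.fft2(x)
    Fy = np.fft.fft2(y)
    cross = Fx * np.conj(Fy)
    r = np.fft.ifft2(cross / (np.abs(cross) + 1e-15))
    return np.unravel_index(np.argmax(np.abs(r)), r.shape)

rng = np.random.default_rng(0)
img = rng.random((32, 32))
shifted = np.roll(img, shift=(3, 5), axis=(0, 1))
print(phase_correlation(shifted, img))  # recovers the (3, 5) shift
```

The small epsilon in the denominator guards against zero Fourier coefficients; everything else is the formula above verbatim.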
• When data come from different modalities, there is no
linear relationship between voxel intensities, and a simple
similarity measure such as SSD or correlation will not suffice.
Cost Function for Dissimilar Images
Histograms
X = 0 2 2        Y = 3 3 2
    1 4 5            3 6 5
    1 4 5            3 6 7

[Figure: the histograms Hx(k) and Hy(k) of X and Y over the intensities 0–7]
Joint Histograms
X = 0 2 2        Y = 3 3 2
    1 4 5            3 6 5
    1 4 5            3 6 7

Hxy(k,m), with k the intensity in Y (rows) and m the intensity in X (columns):

k\m  0  1  2  3  4  5  6  7
0    0  0  0  0  0  0  0  0
1    0  0  0  0  0  0  0  0
2    0  0  1  0  0  0  0  0
3    1  2  1  0  0  0  0  0
4    0  0  0  0  0  0  0  0
5    0  0  0  0  0  1  0  0
6    0  0  0  0  2  0  0  0
7    0  0  0  0  0  1  0  0
The Joint Histogram
The Joint Histogram of images Y and φ(X):
H(k,m; Y,φ(X)) = Number of pixel positions with intensity k in image Y and
intensity m in image φ(X)
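The joint histogram definition above can be computed directly (a Python/NumPy sketch using the 3×3 example images from the slides):

```python
import numpy as np

def joint_histogram(x, y, bins=8):
    """Count pixel positions with intensity k in Y (rows)
    and intensity m in X (columns)."""
    h = np.zeros((bins, bins), dtype=int)
    for k, m in zip(np.asarray(y).ravel(), np.asarray(x).ravel()):
        h[k, m] += 1
    return h

# The 3x3 example images from the slides.
X = np.array([[0, 2, 2], [1, 4, 5], [1, 4, 5]])
Y = np.array([[3, 3, 2], [3, 6, 5], [3, 6, 7]])
H = joint_histogram(X, Y)
print(H[3, 1])  # two positions have intensity 3 in Y and 1 in X
```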
Example: Fused MR and SPECT images
[Figure: joint intensity histogram of SPECT vs MR, before and after registration]
o Mutual Information measures image similarity:

\[
\mathrm{MI} = \sum_{k,m} H_{xy}(k,m)\,\log\!\left(\frac{H_{xy}(k,m)}{H_x(k)\,H_y(m)}\right)
\]

o Hxy(k,m) is a value in the (normalized) joint histogram
o Hx(k) and Hy(m) are values in the (normalized) ordinary histograms of image X and image Y, respectively
o Mutual information measures how much information two sources X and Y have in common
(the amount of uncertainty in Y that is reduced when X is known)
o Mutual information can be expressed in terms of entropies of distributions.
Mutual Information
[Figure: Venn diagram relating Entropy(X), Entropy(Y), the joint entropy Hxy, and MI(X,Y)]
• The MI registration criterion states that the images
are geometrically aligned when MI(A,B) is maximal.
• For example, let A be an MRI-scan and B a
SPECT-scan. If we know the MRI intensities, the
uncertainty of the SPECT intensities is minimal
when the scans are aligned.
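A sketch of mutual information computed from a normalized joint histogram (Python/NumPy; the bin count and the test images are arbitrary choices for illustration):

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """MI from the normalized joint histogram:
    MI = sum_{k,m} p(k,m) * log( p(k,m) / (p_x(m) * p_y(k)) )."""
    h, _, _ = np.histogram2d(np.ravel(y), np.ravel(x), bins=bins)
    pxy = h / h.sum()                     # joint distribution
    py = pxy.sum(axis=1, keepdims=True)   # marginal of Y (rows)
    px = pxy.sum(axis=0, keepdims=True)   # marginal of X (columns)
    nz = pxy > 0                          # 0 * log(0) terms contribute 0
    return np.sum(pxy[nz] * np.log(pxy[nz] / (py @ px)[nz]))

rng = np.random.default_rng(1)
a = rng.integers(0, 8, size=(64, 64))
b = rng.integers(0, 8, size=(64, 64))
print(mutual_information(a, a))  # high: identical images
print(mutual_information(a, b))  # near 0: independent images
```

An identical pair gives MI equal to the image's entropy; an independent pair gives MI near zero, matching the alignment criterion above.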
Mutual Information
[Figure: joint intensity histograms of SPECT vs MR, before registration (MI = 0.357) and after registration (MI = 0.446)]
Mutual information – an Example
Area based similarity measures
Appropriate in absence of distinctive details in
the images
Based on image intensities (not e.g., shape or
structure)
Require rather similar images (same or
statistically dependent)
Limited to small deformations (translation and
small rotations) to work reasonably well.
Registration – How to do it?
1) Define the set of allowed transformations φ
2) Define a useful measure of similarity
3) Choose a practical optimization procedure
[Figure: a transformation y = φ(x) mapping points x1, x2 in space X to points y1, y2 in space Y]
Functionals for Image similarity
A cost functional evaluates a particular transformation φ
The goal is to minimize the cost: A low cost means that
we have a good fit
We can also include an additional cost to penalize "bad",
inappropriate transformations
(Restricting the set of possible transformations is also a kind of penalty,
i.e. setting the cost to infinity for all other transformations)
Cost = image dissimilarity(φ(X),Y)
Cost = image dissimilarity(φ(X),Y) + deformation cost(φ)
Optimization
Non-convex functions have local minima
We are looking for the global one
[Figure: a non-convex cost landscape with a starting point, a global minimum and a local minimum; gradient descent slides into the nearest minimum, while simulated annealing takes random steps that should eventually become smaller and less random]
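The optimization step can be sketched as plain gradient descent (a 1-D toy example on a convex cost, so the minimum found is also the global one):

```python
def gradient_descent(grad, x0, step=0.1, iters=100):
    """Plain gradient descent: repeatedly move against the gradient.
    On a non-convex cost this converges to the nearest (possibly
    local) minimum, which is why initialization matters."""
    x = x0
    for _ in range(iters):
        x = x - step * grad(x)
    return x

# Minimize the convex cost f(x) = (x - 3)^2, whose gradient is 2*(x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 3))  # -> 3.0
```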
Area based methods – fitness landscapes
[Figure: fitness landscapes for an image and template, under SSD, a distance based on fuzzy set theory, and 1−NCC]
FEATURE BASED METHODS
Feature points based registration
Instead of matching all pixels with all pixels, it may be
beneficial (faster, more robust) to match salient
structures (points/lines) in the two images.
1. Feature detection – find salient points in the images
2. Feature description – to improve pairing of feature points
3. Feature matching – pairing up feature points
4. Transformation estimation – actually match the two
images (possibly with outlier rejection)
Interest Point Detectors
Feature points:
Distinctive, spread across the image, easy to detect.
Invariant under image degradation.
Surrounded by rich content.
Feature detectors:
Corner detectors
Edge detectors
SIFT – Scale-Invariant Feature Transform
Iterative Closest Points – ICP
1. Associate feature points in X with their closest
neighbors in Y
2. Estimate a transformation of respective points in X to
Y (e.g. an affine transformation)
3. Transform points in X, i.e. X = φ(X);
4. Iterate steps 1-3 until convergence
Requires good initialization to not get stuck in local minimum!
[Figure: ICP iterations – find pairs of closest feature points in X and Y, then transform X closer to Y using the matching pairs]
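The four ICP steps can be sketched for the simplest case, a pure translation (Python/NumPy; a full implementation would estimate e.g. an affine transformation in step 2 instead of a mean shift):

```python
import numpy as np

def icp_translation(X, Y, iters=20):
    """Minimal ICP restricted to pure translation: pair every point
    in X with its nearest neighbor in Y, then shift X by the mean
    difference of the matched pairs; repeat."""
    X = np.asarray(X, dtype=float).copy()
    Y = np.asarray(Y, dtype=float)
    for _ in range(iters):
        # step 1: closest point in Y for each point in X
        d = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
        pairs = Y[np.argmin(d, axis=1)]
        # steps 2-3: estimate the shift and transform X
        X += (pairs - X).mean(axis=0)
    return X

Y = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
X = Y + np.array([0.3, -0.2])        # translated copy of Y
print(np.allclose(icp_translation(X, Y), Y))  # -> True
```

With a larger initial offset the nearest-neighbor pairing can be wrong, which is exactly the local-minimum problem noted above.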
Registration – Least Squares Fit
Example: 2D Affine Transform , y = Ax + b
x = [x1, x2]T and y = [y1, y2]T
x is a point before the transformation
y is a point after the transformation
For several pairs of before/after points (x, y),
the problem of finding A and b becomes a
linear equation system.
For more than three pairs (x, y), an exact solution for
A and b is in general not guaranteed to exist. However,
we can seek a least squares solution.
X
Y
Registration – Least Squares Fit
Arrange all X-points in a matrix, where each
column represents a point in X, in homogeneous coordinates. Call it X.
Arrange all Y-points in a matrix, where each
column represents a point in Y. Call it Y.
Find a least squares fit using the pseudo-inverse:

\[
M = [\,A \mid b\,] = Y X^{+} = Y X^{T} (X X^{T})^{-1}
\]

Quadratic cost = sensitive to outliers
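The least squares fit of an affine transform from point pairs can be sketched with homogeneous coordinates and the pseudo-inverse (Python/NumPy; the 2 × n column-per-point layout is the convention from the slide):

```python
import numpy as np

def fit_affine(X, Y):
    """Least squares fit of y = A x + b from point pairs.
    X, Y: 2 x n matrices, one point per column."""
    n = X.shape[1]
    Xh = np.vstack([X, np.ones(n)])  # 3 x n homogeneous points
    M = Y @ np.linalg.pinv(Xh)       # 2 x 3 matrix [A | b]
    return M[:, :2], M[:, 2]

# Recover a known affine transform from four exact point pairs.
A_true = np.array([[1.0, 0.5], [0.0, 2.0]])
b_true = np.array([3.0, -1.0])
X = np.array([[0.0, 1.0, 0.0, 2.0],
              [0.0, 0.0, 1.0, 1.0]])
A, b = fit_affine(X, A_true @ X + b_true[:, None])
print(np.allclose(A, A_true), np.allclose(b, b_true))  # -> True True
```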
RANSAC
RANSAC – RANdom Sample Consensus
A robust estimation technique
Non-deterministic
Similarities with Hough- and Radon transforms
Robustness with respect to outliers
Paradigm: Hypothesize and Test
RANSAC
An example. Find the translation that transforms
red dots to yellow.
There is one outlier present
One corresponding pair is
needed (at least) to determine
the transformation
Outliers
RANSAC
In RANSAC we randomly select a pair and use it
to estimate a transformation
Hypothesis No match
RANSAC
And continue to select pairs until we find a good
transformation where many points are matching
(the outliers will of course not match)
Hypothesis
Match!
RANSAC – Algorithm
N, minimal number of data points needed to
estimate a model;
K, maximum number of allowed iterations;
T, threshold for data fitting (what is an OK match?);
D, number of data points needed to consider a fit
as good.
RANSAC – Algorithm
1. Draw a random sample of N data points;
2. Use these N data points to estimate a model M;
3. Test if this model fits D data points, within a threshold T;
4. If it does, you have a global fit. Exit.
5. If it does not, and if you have iterated fewer than K times, return to 1.
Optional final step: Use all inliers in a
LSQ fit to fine tune the final transformation.
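The algorithm above, specialized to a pure 2-D translation where N = 1 pair suffices to hypothesize a model (a Python sketch; the parameter values K, T, D are arbitrary choices for illustration):

```python
import random

def ransac_translation(pairs, K=100, T=0.5, D=3, seed=0):
    """RANSAC for a pure 2-D translation. pairs is a list of
    ((x1, y1), (x2, y2)) correspondences; a model is accepted once
    at least D pairs fit within threshold T."""
    rng = random.Random(seed)
    for _ in range(K):
        (x1, y1), (x2, y2) = rng.choice(pairs)   # 1. random sample
        tx, ty = x2 - x1, y2 - y1                # 2. hypothesize model
        inliers = [p for p in pairs              # 3. test the fit
                   if abs(p[1][0] - p[0][0] - tx) < T
                   and abs(p[1][1] - p[0][1] - ty) < T]
        if len(inliers) >= D:                    # 4. good fit found
            return (tx, ty), inliers
    return None, []                              # 5. gave up after K tries

# Three pairs related by translation (2, 1), plus one outlier.
pairs = [((0, 0), (2, 1)), ((1, 0), (3, 1)),
         ((0, 2), (2, 3)), ((5, 5), (0, 0))]
t, inliers = ransac_translation(pairs)
print(t, len(inliers))  # translation (2, 1) with 3 inliers
```

The optional final step from the slide would refit the translation as the mean offset over the returned inliers.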
RANSAC – Comments
Obviously, random sampling may be inefficient.
RANSAC is non-deterministic, you may not find the
global optimal fit within the K allowed iterations.
With larger model complexities, N, RANSAC may
need a lot of iterations.
RANSAC imposes problem-specific thresholds.
RANSAC estimates only one model for a particular
data set.
May perform poorly with more than 50% outliers.
FEATURE DETECTION AND
DESCRIPTION
SIFT – scale invariant feature transform
SIFT, detect interest points:
1) L(x,y,σ) = G(x,y, σ)*I(x,y), where G is a Gaussian
2) D(x,y,σ) = L(x,y,kσ) – L(x,y,σ)
3) Find max/min in D(x,y,σ)
4) Interpolate subpixel positions of max/min
Images from: Implementing the Scale Invariant Feature Transform (SIFT) Method,
by Meng and Tiddeman
SIFT, detect interest points (cont.):
5) Eliminate low contrast responses, |D(x,y,σ)| < 0.03
6) Eliminate edge responses (r = 10):
\[
H = \begin{pmatrix} D_{xx} & D_{xy} \\ D_{yx} & D_{yy} \end{pmatrix},
\qquad
\frac{\mathrm{Tr}(H)^2}{\mathrm{Det}(H)} < \frac{(r+1)^2}{r}
\]
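The edge-response test can be checked numerically (a small Python sketch of the inequality above; the sample curvature values are made up for illustration):

```python
def passes_edge_test(Dxx, Dyy, Dxy, r=10.0):
    """SIFT edge-response test: keep a keypoint only if the ratio of
    the Hessian's principal curvatures is moderate, i.e.
    Tr(H)^2 / Det(H) < (r + 1)^2 / r."""
    tr = Dxx + Dyy
    det = Dxx * Dyy - Dxy * Dxy
    if det <= 0:          # curvatures of opposite sign: reject
        return False
    return tr * tr / det < (r + 1) ** 2 / r

print(passes_edge_test(2.0, 2.0, 0.0))    # blob-like curvature: True
print(passes_edge_test(20.0, 0.5, 0.0))   # edge-like curvature: False
```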
SIFT – scale invariant feature transform
Now compute feature vectors (description)
Position a 4 x 4 grid
at the position of each interest point
scaled according to the detected scale
oriented according to the dominant gradient direction
SIFT – scale invariant feature transform
In each part of the grid, compute an 8 direction
histogram over the gradient directions
Compose this into a 4 x 4 x 8 = 128 value vector
This feature vector is invariant to scale, rotation
and intensity changes
SIFT – scale invariant feature transform
From: Programming Computer Vision with Python
SIFT features are invariant to scale, rotation and
intensity changes.
SIFT features are robust to noise and 3D
viewpoint changes.
Scale and position are given by the interest point
detection step.
Rotation invariance comes from choosing the
strongest gradient as a reference direction.
SIFT – scale invariant feature transform
Feature point pairing
Distance (e.g. Euclidean) between the descriptors
Pick the most similar feature point.
Weighting based on reliability can be used.
Decreased performance (multiple matches)
observed in cases of high illumination changes and
non-rigid deformations.
Note that invariance and distinctiveness are to some
extent contradictory properties, and have to be in
some balance.
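Descriptor pairing by Euclidean distance can be sketched as follows (Python/NumPy; the ratio-test threshold of 0.8 follows common practice, and the toy 2-D descriptors are made up, where real SIFT descriptors are 128-dimensional):

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.8):
    """Pair each descriptor in desc1 with its nearest neighbor in
    desc2 (Euclidean distance), keeping only matches that pass the
    ratio test: nearest / second-nearest < ratio. This suppresses
    the ambiguous multiple matches mentioned above."""
    matches = []
    for i, d in enumerate(desc1):
        dist = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dist)
        if dist[order[0]] < ratio * dist[order[1]]:
            matches.append((i, int(order[0])))
    return matches

desc1 = np.array([[1.0, 0.0], [0.0, 1.0]])
desc2 = np.array([[0.0, 0.9], [0.9, 0.1], [0.5, 0.5]])
print(match_descriptors(desc1, desc2))  # [(0, 1), (1, 0)]
```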
RANSAC applied
Correspondences based on feature descriptor alone
RANSAC filtered correspondences
Summary
Image registration is an optimization problem.
Define a similarity measure
Restrict transformation space
Choose an optimization method
We have considered parametric transformations.
Complexity increases if local deformations are allowed.
Area based registration
Feature points based registration
Feature detection
Feature description
Feature matching
Transformation estimation (ICP, LSQ, RANSAC)
Optimization is challenging
Chamfer matching – Lab2
Technique for finding the best fit of edge points
from two different images by minimizing a
generalized distance between them.
The algorithm is based on a distance transform
used to locate one-dimensional features (edges).
Good performance close to the correct position,
but poor elsewhere. (Can be seen as a variant of
ICP)
Chamfer matching – Lab2
Start from edge image
Compute DT from edges
Find locations providing minimal error
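The distance-transform step can be sketched with a classic two-pass chamfer algorithm (Python/NumPy, city-block weights; a real implementation would typically use e.g. 3-4 chamfer weights or an exact Euclidean DT). The matching error at a candidate position is then the sum of DT values under the transformed edge points.

```python
import numpy as np

def chamfer_dt(edges):
    """Two-pass chamfer distance transform with city-block weights:
    the distance from every pixel to the nearest edge pixel."""
    big = edges.size  # larger than any possible distance
    d = np.where(edges, 0, big).astype(float)
    h, w = d.shape
    for i in range(h):              # forward pass: top-left sweep
        for j in range(w):
            if i > 0:
                d[i, j] = min(d[i, j], d[i - 1, j] + 1)
            if j > 0:
                d[i, j] = min(d[i, j], d[i, j - 1] + 1)
    for i in range(h - 1, -1, -1):  # backward pass: bottom-right sweep
        for j in range(w - 1, -1, -1):
            if i < h - 1:
                d[i, j] = min(d[i, j], d[i + 1, j] + 1)
            if j < w - 1:
                d[i, j] = min(d[i, j], d[i, j + 1] + 1)
    return d

edges = np.zeros((5, 5), dtype=bool)
edges[2, 2] = True
print(chamfer_dt(edges)[0, 0])  # city-block distance to (2, 2): 4.0
```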