Performance of Optical Flow
Barron, Fleet and Beauchemin, IJCV 12:1, 1994
http://www.csd.uwo.ca/faculty/barron/
Performance of Optical Flow
• Evaluation of different optical flow techniques
– Accuracy, reliability, density of measurements
• A common set of synthetic and real sequences
• Several optical flow methods
– Differential
– Matching
– Energy-based
– Phase-based
Performance of Optical Flow
• Accurate and dense velocity measurement
• Accurate 2D motion field estimation is ill-posed
– Inherent differences between the 2D motion field and intensity variations
• Only qualitative information can be extracted
Optical Flow Process
• Three stages
– Prefiltering or smoothing with low-pass/band-pass filters in order to
• extract signal structure of interest
• enhance the signal-to-noise ratio
– Extraction of basic measurements
• Spatiotemporal derivatives
• Local correlation surface
– Integration of measurements to produce a 2D flow field
• Often involves assumptions about the smoothness of the underlying flow field
Differential Techniques
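• Based on first-order derivatives and image translation
$$I(\mathbf{x}, t) = I(\mathbf{x} - \mathbf{v}t, 0), \qquad \mathbf{v} = (u, v)^T$$
• Intensity is conserved
$$\frac{dI(\mathbf{x}, t)}{dt} = 0$$
$$\nabla I(\mathbf{x}, t) \cdot \mathbf{v} + I_t(\mathbf{x}, t) = 0, \qquad \nabla I(\mathbf{x}, t) = (I_x(\mathbf{x}, t), I_y(\mathbf{x}, t))^T$$
• Normal velocity
$$\mathbf{v}_n = s\,\mathbf{n}, \qquad s(\mathbf{x}, t) = \frac{-I_t(\mathbf{x}, t)}{\|\nabla I(\mathbf{x}, t)\|}, \qquad \mathbf{n}(\mathbf{x}, t) = \frac{\nabla I(\mathbf{x}, t)}{\|\nabla I(\mathbf{x}, t)\|}$$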
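The normal velocity defined by the gradient constraint can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `normal_velocity` and the guard `eps` against vanishing gradients are assumptions made for the example.

```python
import numpy as np

def normal_velocity(Ix, Iy, It, eps=1e-8):
    """Normal speed s and unit normal n = (nx, ny) from the gradient constraint.

    Ix, Iy, It are arrays of spatial and temporal derivatives.
    eps is a hypothetical guard: where the gradient magnitude is below eps,
    no normal velocity is defined and zeros are returned.
    """
    grad_norm = np.sqrt(Ix**2 + Iy**2)
    valid = grad_norm > eps
    safe = np.where(valid, grad_norm, 1.0)   # avoid division by zero
    s = np.where(valid, -It / safe, 0.0)     # s = -I_t / ||grad I||
    nx = np.where(valid, Ix / safe, 0.0)     # n = grad I / ||grad I||
    ny = np.where(valid, Iy / safe, 0.0)
    return s, nx, ny, valid

# Example: I(x, y, t) = x - 2t, a ramp moving with u = 2, gives Ix = 1, Iy = 0, It = -2
s, nx, ny, valid = normal_velocity(np.array([1.0]), np.array([0.0]), np.array([-2.0]))
```

Only the component of velocity along the gradient is recovered here, which is the aperture problem in its simplest form.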
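Differential Techniques
• Second-order differential
$$\begin{bmatrix} I_{xx}(\mathbf{x}, t) & I_{yx}(\mathbf{x}, t) \\ I_{xy}(\mathbf{x}, t) & I_{yy}(\mathbf{x}, t) \end{bmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} + \begin{pmatrix} I_{tx}(\mathbf{x}, t) \\ I_{ty}(\mathbf{x}, t) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
• Stronger restriction than first-order derivatives on the permissible motion field
• Can be used in isolation or together with the 1st-order constraint (over-determined system)
$$\nabla I(\mathbf{x}, t) \cdot \mathbf{v} + I_t(\mathbf{x}, t) = 0$$
• Velocity estimates from 2nd-order methods are often assumed to be sparser and less accurate than estimates from 1st-order methods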
Differential Techniques
• Additional constraints
– Fit the measurements in each neighborhood to a local model for 2D velocity
• Using least squares minimization or Hough transform
– Global smoothness
Differential Techniques
• I(x, t) must be differentiable
– Temporal smoothing at the sensors is needed to avoid aliasing
– Numerical differentiation must be done carefully
• If aliasing cannot be avoided in image acquisition
– Apply differential techniques in a coarse-to-fine manner
Horn and Schunck
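• Combine the gradient constraint with a global smoothness term, minimizing
$$\int_D (\nabla I \cdot \mathbf{v} + I_t)^2 + \lambda^2 \left( \|\nabla u\|_2^2 + \|\nabla v\|_2^2 \right) d\mathbf{x}$$
• Iterative solution, starting from $u^0 = v^0 = 0$, where $\bar{u}^k$ and $\bar{v}^k$ are neighborhood averages
$$u^{k+1} = \bar{u}^k - \frac{I_x (I_x \bar{u}^k + I_y \bar{v}^k + I_t)}{\lambda^2 + I_x^2 + I_y^2}$$
$$v^{k+1} = \bar{v}^k - \frac{I_y (I_x \bar{u}^k + I_y \bar{v}^k + I_t)}{\lambda^2 + I_x^2 + I_y^2}$$
• $\lambda = 0.5$ instead of 100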
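The Horn and Schunck iteration can be sketched as follows. This is a minimal illustration under stated assumptions: the 4-neighbour average with edge replication in `local_average` is a simple stand-in for the neighborhood average, and the iteration count is arbitrary. Only the smoothness weight of 0.5 comes from the slide.

```python
import numpy as np

def local_average(f):
    """4-neighbour average with edge replication (an assumed, simple averaging kernel)."""
    p = np.pad(f, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def horn_schunck(Ix, Iy, It, lam=0.5, n_iter=100):
    """Iterate the Horn-Schunck update from zero initial flow.

    lam is the smoothness weight (0.5 on the slide); n_iter is a hypothetical choice.
    """
    u = np.zeros_like(Ix, dtype=float)
    v = np.zeros_like(Ix, dtype=float)
    denom = lam**2 + Ix**2 + Iy**2
    for _ in range(n_iter):
        u_bar, v_bar = local_average(u), local_average(v)
        residual = (Ix * u_bar + Iy * v_bar + It) / denom
        u = u_bar - Ix * residual
        v = v_bar - Iy * residual
    return u, v

# Example: a horizontal ramp moving one pixel/frame to the right (Ix = 1, Iy = 0, It = -1)
Ix = np.ones((8, 8))
Iy = np.zeros((8, 8))
It = -np.ones((8, 8))
u, v = horn_schunck(Ix, Iy, It)
```

With a uniform gradient the update reduces to a scalar contraction toward u = 1, so the example converges quickly. On real images the smoothness term also fills in flow where the gradient is weak.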
Horn and Schunck
• Relatively crude numerical differentiation can be a source of error
• Spatiotemporal smoothing
– Gaussian prefilter with a standard deviation of 1.5 pixels in space and 1.5 frames in time
• 4-point central differences for differentiation
– mask $\frac{1}{12}(-1, 8, 0, -8, 1)$
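The 4-point central difference mask can be applied to a 1D signal as sketched below. The function name `central_diff_4pt` and the choice to leave the two border samples on each side as NaN are assumptions made for this example.

```python
import numpy as np

def central_diff_4pt(f):
    """Derivative estimate using the mask (1/12)(-1, 8, 0, -8, 1).

    For interior samples:
        f'(i) ~ (-f[i+2] + 8 f[i+1] - 8 f[i-1] + f[i-2]) / 12
    The two samples at each border are left as NaN (an assumed convention).
    """
    f = np.asarray(f, dtype=float)
    d = np.full_like(f, np.nan)
    d[2:-2] = (-f[4:] + 8.0 * f[3:-1] - 8.0 * f[1:-3] + f[:-4]) / 12.0
    return d

# Example: the mask is exact for low-order polynomials such as x^3
x = np.arange(10.0)
d = central_diff_4pt(x**3)
```

Applying the same function along each image axis and along time gives the three derivatives used by the differential methods.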
Lucas and Kanade
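• Weighted least squares
• Fixed velocity in a small neighborhood $\Omega$
• Minimizing
$$\sum_{\mathbf{x} \in \Omega} W^2(\mathbf{x}) \left[ \nabla I(\mathbf{x}, t) \cdot \mathbf{v} + I_t(\mathbf{x}, t) \right]^2$$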
Lucas and Kanade
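• When $A^T W^2 A$ is nonsingular,
$$\mathbf{v} = [A^T W^2 A]^{-1} A^T W^2 \mathbf{b}$$
where the rows of $A$ are the gradients $\nabla I(\mathbf{x}_i)$, $W$ is the diagonal matrix of weights $W(\mathbf{x}_i)$, and $\mathbf{b} = -(I_t(\mathbf{x}_1), \ldots, I_t(\mathbf{x}_n))^T$
Weighted least squares estimates of v from estimates of normal velocities
Confidence measure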
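The weighted least squares solve for one neighborhood can be sketched as below. The function name `lucas_kanade_solve` and the argument `w2` (the squared window weights, uniform by default) are assumptions for this illustration. The sketch assumes the 2x2 normal matrix is nonsingular.

```python
import numpy as np

def lucas_kanade_solve(Ix, Iy, It, w2=None):
    """Solve (A^T W^2 A) v = A^T W^2 b for a single neighborhood.

    Ix, Iy, It: derivative patches of equal shape.
    w2: squared window weights, same shape as the patches (uniform if None).
    Returns the velocity estimate and the 2x2 normal matrix G = A^T W^2 A.
    """
    A = np.column_stack([Ix.ravel(), Iy.ravel()])    # rows are gradients
    b = -It.ravel()
    w2 = np.ones(b.size) if w2 is None else np.asarray(w2, dtype=float).ravel()
    G = A.T @ (w2[:, None] * A)
    rhs = A.T @ (w2 * b)
    return np.linalg.solve(G, rhs), G

# Example: synthetic 5x5 derivatives consistent with a known velocity
rng = np.random.default_rng(0)
Ix = rng.normal(size=(5, 5))
Iy = rng.normal(size=(5, 5))
v_true = np.array([0.5, -0.25])
It = -(Ix * v_true[0] + Iy * v_true[1])
v_est, G = lucas_kanade_solve(Ix, Iy, It)
```

Returning G alongside the estimate is a design choice for the sketch: its eigenvalues are what the confidence test inspects.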
Lucas and Kanade
• Spatiotemporal smoothing
– Gaussian prefilter with a standard deviation of 1.5 pixels-frames
• 4-point central differences for differentiation
– mask $\frac{1}{12}(-1, 8, 0, -8, 1)$
• Spatial neighborhood 5x5 pixels
• Window function W(x)
– (0.0625, 0.25, 0.375, 0.25, 0.0625)
Lucas and Kanade
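• Identify the unreliable estimates by the eigenvalues of $A^T W^2 A$, $\lambda_1 \geq \lambda_2$, compared with a threshold $\tau$
– If $\lambda_1 \geq \lambda_2 \geq \tau$, compute the full velocity v
– If $\lambda_1 \geq \tau$ and $\lambda_2 < \tau$, compute normal velocity
• $\mathbf{v}_n = s\,\mathbf{n}$, with
$$s(\mathbf{x}, t) = \frac{-I_t(\mathbf{x}, t)}{\|\nabla I(\mathbf{x}, t)\|}, \qquad \mathbf{n}(\mathbf{x}, t) = \frac{\nabla I(\mathbf{x}, t)}{\|\nabla I(\mathbf{x}, t)\|}$$
• From LS minimization
– Otherwise, do not compute velocity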
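The eigenvalue test that separates full velocity, normal velocity and unreliable estimates can be sketched as below. The function name `classify_estimate` and the default threshold value are assumptions for illustration. The input is the 2x2 normal matrix of the weighted least squares system.

```python
import numpy as np

def classify_estimate(G, tau=1.0):
    """Classify a neighborhood from the eigenvalues of its 2x2 normal matrix G.

    tau is a hypothetical threshold. Returns "full" when both eigenvalues
    reach tau, "normal" when only the larger one does, and "none" otherwise.
    """
    lam_small, lam_large = np.linalg.eigvalsh(G)   # eigvalsh returns ascending order
    if lam_small >= tau:
        return "full"
    if lam_large >= tau:
        return "normal"
    return "none"

# Examples: textured patch, straight edge, and flat region
labels = [
    classify_estimate(np.diag([4.0, 2.0])),
    classify_estimate(np.diag([4.0, 0.1])),
    classify_estimate(np.diag([0.1, 0.01])),
]
```

A patch on a straight edge has gradients along a single direction, so one eigenvalue stays small and only the normal component is trusted.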
Nagel
• First to use second-order derivatives to measure optical flow
– Basic measurements and global smoothness
– Oriented smoothness constraint
– Attenuates the variation of the flow in the direction perpendicular to the gradient
Nagel
• Gauss-Seidel iterations
Nagel
Weight matrix
Nagel
• Spatiotemporal smoothing
• 4-point central differences for differentiation
• Velocity derivatives
– 1st order: 2-point central difference ½(1, 0, -1)
– 2nd order: cascades of 1st-order derivatives
Barron’s implementation
Uras, Girosi, Verri and Torre
• Local solution to the second-order constraint system
• Solved wherever the Hessian H is nonsingular
• 8x8 pixel regions
– For each region, select the 8 estimates that best satisfy the selection constraint
– Choose the estimate with the smallest condition number k(H) as the velocity for the entire region
Uras et al.
• Presmooth using a Gaussian
– Standard deviation of 3 pixels in space and 1.5 frames in time
• Derivatives of I and v
– 4-point central difference operators
• Confidence measurement
– They use k(H)
– Barron et al. found det(H) is more reliable
Barron’s implementation
Region-Based Methods
• Accurate numerical differentiation may be impractical because of noise, a small number of frames, or aliasing
• Region-based approaches
– Define v as the shift that yields the best fit between image regions at different times
– Best match maximizes a similarity measure
Region-Based Matching
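• Sum-of-squared difference (SSD)
$$SSD_{1,2}(\mathbf{x}; \mathbf{d}) = \sum_{j=-n}^{n} \sum_{i=-n}^{n} W(i, j) \left[ I_1(\mathbf{x} + (i, j)) - I_2(\mathbf{x} + \mathbf{d} + (i, j)) \right]^2$$
– W: discrete 2D window
– d = (dx, dy): integer values
• Cross-correlation, NCC…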
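Region-based matching with an SSD measure over integer displacements can be sketched as follows. The function names, the uniform default window and the search radius `max_d` are assumptions for the example. It is a brute-force illustration, not any of the evaluated implementations.

```python
import numpy as np

def ssd_surface(I1, I2, x, y, half=2, max_d=2, W=None):
    """SSD between the patch of I1 centred at (x, y) and shifted patches of I2.

    half: patch half-width (half=2 gives a 5x5 patch).
    max_d: integer displacements dx, dy are searched in [-max_d, max_d].
    W: optional window weights of shape (2*half+1, 2*half+1), uniform if None.
    Entry [j, i] of the result corresponds to dy = j - max_d, dx = i - max_d.
    """
    size = 2 * half + 1
    W = np.ones((size, size)) if W is None else np.asarray(W, dtype=float)
    patch1 = I1[y - half:y + half + 1, x - half:x + half + 1]
    shifts = range(-max_d, max_d + 1)
    ssd = np.empty((len(shifts), len(shifts)))
    for j, dy in enumerate(shifts):
        for i, dx in enumerate(shifts):
            patch2 = I2[y + dy - half:y + dy + half + 1,
                        x + dx - half:x + dx + half + 1]
            ssd[j, i] = np.sum(W * (patch1 - patch2) ** 2)
    return ssd

def best_integer_shift(ssd, max_d):
    """Integer displacement (dx, dy) at the minimum of the SSD surface."""
    j, i = np.unravel_index(np.argmin(ssd), ssd.shape)
    return int(i) - max_d, int(j) - max_d

# Example: the second frame is the first shifted 2 pixels right and 1 pixel down
rng = np.random.default_rng(1)
frame1 = rng.normal(size=(20, 20))
frame2 = np.roll(frame1, shift=(1, 2), axis=(0, 1))
surface = ssd_surface(frame1, frame2, x=10, y=10)
shift = best_integer_shift(surface, max_d=2)
```

The minimum lands on an integer displacement only, which is why the matching methods need a separate subpixel refinement step.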
Anandan
• Based on a Laplacian pyramid
– Allows the computation of large displacements between frames
– Helps enhance image structure (edges, ...)
• Coarse-to-fine SSD-based matching strategy
– Coarsest level: displacements are assumed to be 1 pixel/frame or less
– SSD minima found in a 3x3 search space using a 5x5 Gaussian window W(x)
– Subpixel displacements are computed by finding the minimum of a quadratic surface fit to the SSD values
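The subpixel step can be illustrated in one dimension: fit a parabola through the SSD values at the integer minimum and its two neighbours, then take the parabola's minimum. This is a simplified 1D sketch of the quadratic-surface idea, and the function name `parabolic_offset` is an assumption.

```python
def parabolic_offset(s_minus, s_zero, s_plus):
    """Offset of the parabola minimum relative to the middle sample.

    s_minus, s_zero, s_plus are SSD values at displacements -1, 0, +1
    around the integer minimum. Returns 0.0 when the three samples do
    not form a convex parabola (no well-defined minimum).
    """
    curvature = s_minus - 2.0 * s_zero + s_plus
    if curvature <= 0.0:
        return 0.0
    return 0.5 * (s_minus - s_plus) / curvature

# Example: an SSD profile that is exactly quadratic around a true shift of 0.3
true_shift = 0.3
samples = [(d - true_shift) ** 2 for d in (-1, 0, 1)]
offset = parabolic_offset(*samples)
```

Applying the same fit independently along x and y is a common simplification. A full quadratic surface also models the cross term between the two axes.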
Anandan
• Confidence measures are based on the principal curvatures of the SSD surface at the minimum
– S_min: SSD value at the minimum
– k_1 = 150, k_2 = 1, k_3 = 0
Anandan
• Additional smoothness constraint
• Minimize
• Gauss-Seidel iterations
– e_min, e_max: the directions of minimum and maximum curvature on the SSD surface at the minimum
– v_0: the displacement from the higher level
Anandan
• Matching and smoothing are performed at each level of the Laplacian pyramid
• Confidence measure
– Barron et al. tried c_min and c_max as suggested by Anandan, but found them unreliable
Singh
• Two-stage matching method
– First, SSD computed from 3 adjacent band-pass filtered images
• Converts SSD_0 into a probability distribution
• Averages out spurious SSD minima due to noise or periodic texture
Singh
• Subpixel velocity: mean of the distribution
– Averaged over the integer displacements d
• Coarse-to-fine strategy
• Confidence measures: eigenvalues of the inverse covariance matrix
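The idea of turning an SSD surface into a probability distribution and taking its mean as the subpixel velocity can be sketched as below. The exponential form with a free scale `k` is an assumption for this illustration, and the function name is hypothetical. The slide only states that the SSD is converted into a distribution and that the velocity is its mean.

```python
import numpy as np

def distribution_mean_velocity(ssd, max_d, k=1.0):
    """Mean displacement under a distribution derived from an SSD surface.

    ssd[j, i] is the SSD at dy = j - max_d, dx = i - max_d.
    k is an assumed scale that controls how sharply low SSD values are favoured.
    """
    shifts = np.arange(-max_d, max_d + 1)
    dx, dy = np.meshgrid(shifts, shifts)    # dx varies along columns, dy along rows
    R = np.exp(-k * (ssd - ssd.min()))      # subtracting the minimum avoids underflow
    R /= R.sum()
    return float((R * dx).sum()), float((R * dy).sum())

# Example: a quadratic SSD bowl centred on the displacement (1, 0)
shifts = np.arange(-3, 4)
gx, gy = np.meshgrid(shifts, shifts)
ssd = (gx - 1.0) ** 2 + gy ** 2
vx, vy = distribution_mean_velocity(ssd, max_d=3)
```

Because the estimate is a weighted mean rather than the location of a single minimum, isolated spurious minima carry little weight.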
Singh
• Step 1: compute the SSD for a wide range of integer displacements, N = 4
– (4N+1)x(4N+1) SSD surface to (2N+1)x(2N+1) subregions
• Step 2: propagate velocity using neighborhood constraints
Barron’s implementation
Gaussian function of distance, better results with w=2 than w=1
Singh
• Covariance matrix
• Final velocity
– S_c, v_c are derived from intensity data in step 1
– S_n, v_n are derived from neighborhood velocities in step 2
Barron’s implementation
Matrix inverse: replace singular values less than 0.1 by 0.1 to avoid singular systems
Singh
• Confidence measures
– Eigenvalues of the covariance matrix
– Their inverses serve as confidence measures
• Rejecting velocities where
Energy-Based Methods
• Based on the output energy of velocity-tuned filters
– Also called frequency-based methods owing to the design of velocity-tuned filters in the Fourier domain
• Fourier transform of a translating 2D pattern is
$$\hat{I}(\mathbf{k}, \omega) = \hat{I}_0(\mathbf{k})\, \delta(\omega + \mathbf{v}^T \mathbf{k})$$
– $\hat{I}_0(\mathbf{k})$: Fourier transform of I(x, 0); $\omega$: temporal frequency; $\mathbf{k} = (k_x, k_y)$: spatial frequency
– All non-zero power associated with a translating pattern lies on a plane through the origin in frequency space
• Certain energy-based methods are equivalent to correlation-based methods and to the gradient-based method of Lucas and Kanade
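The claim that the power of a translating pattern lies on a plane through the origin can be checked numerically. The sketch below uses one spatial dimension plus time, so the plane becomes the line ω + v k = 0. The sequence size, spatial frequency and speed are arbitrary choices for the example.

```python
import numpy as np

N = 32          # samples in space and frames in time (arbitrary)
v = 1.0         # speed in pixels/frame
k0 = 4          # spatial frequency in cycles per N pixels

# Translating sinusoid I(x, t) = cos(2*pi*k0*(x - v*t)/N), indexed as I[t, x]
t, x = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
I = np.cos(2.0 * np.pi * k0 * (x - v * t) / N)

# Power spectrum, with integer frequency indices for both axes
power = np.abs(np.fft.fft2(I)) ** 2
freqs = np.fft.fftfreq(N, d=1.0 / N)
omega, k = np.meshgrid(freqs, freqs, indexing="ij")

significant = power > 1e-6 * power.max()
on_line = np.isclose(omega + v * k, 0.0)
all_power_on_line = bool(np.all(on_line[significant]))
```

Velocity-tuned filters exploit this structure: a filter centred near the line responds strongly, and the pattern of responses across filters constrains v.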
Heeger
• Least-squares fit of spatiotemporal energy to a plane in frequency space
– Extract local energy using Gabor-energy filters, with 12 filters at each of several spatial scales, tuned to different spatial orientations and temporal frequencies
• Ideally, for a single translational motion, the responses of these filters are concentrated about a plane in frequency space
Heeger
• Expected response of a Gabor-energy filter tuned to frequency for translating white noise as a function of velocity
The standard deviations of Gaussian component of Gabor filter
Heeger
• The set of filters with the same orientation tuning:
• Sum of measured and predicted energies from filter j in the set of M_i:
• Least-squares estimate for (u,v): minimize
Heeger
• Two ways of minimizing
– Non-linear minimization using Newton’s method: unsatisfactory results
– Rarely converged if the measurement error was much over 10%
• Modified minimization
– Construct the distribution over a range of velocities
– The minimum of the distribution gives the subpixel velocity estimate
– An ad hoc method involving multi-resolution minima selection is used to compute subpixel minima
Phase-Based Techniques
• Velocity is defined in terms of the phase behaviour of band-pass filter outputs
• First developed by Fleet and Jepson
Waxman and Wu and Bergholm
• Apply spatiotemporal filters to binary edge maps to track edges in real time
• Convected activation profile A(x,t)
• Track level contours of A using differential methods
– Spatial gradient of A = 0 at edge locations
– 2nd-order approaches to estimate velocity
Edge map
Waxman and Wu and Bergholm
• Implementation
– Central Gaussian of the DOG had a standard deviation of 1.5 pixels-frames
– Activation profile
• Requires 7 frames
– Waxman et al. use multiple values of the standard deviation to choose the best velocity at an edge location
• For various values (1.0, 1.5, 2.0), choose the velocity that maximizes
Waxman and Wu and Bergholm
• Confidence measure: Hessian of A, H_A (Gaussian curvature of A)
• If the condition on H_A holds, compute full velocity
• Otherwise, compute normal velocity
Fleet and Jepson
• Define component velocity in terms of the instantaneous motion normal to level phase contours in the output of band-pass velocity-tuned filters
• Band-pass filters: to decompose the input signal according to scale, speed and orientation
Fleet and Jepson
• 2D velocity
• Phase derivatives
Fleet and Jepson
• Motivation
– The phase component of band-pass filter outputs is more stable than the amplitude component under small deviations from image translation
• Unstable phase
– Instabilities occur in neighborhoods about phase singularities
– Detected with a constraint on the instantaneous frequency of the filter output and its amplitude variation in space-time
– Also a signal-to-noise constraint
Fleet and Jepson
• Given component velocity estimates from different filter channels, a linear velocity model is fit to each local region
– Collect reliable velocity estimates from 5x5 neighborhoods
– Estimate the linear velocity model in a LS sense
• Additional constraints to ensure sufficient local information
– Condition number of the linear system < 10
– Residual LS error < 0.5
Experimental Technique
• Test sequences
– Real sequences
– Synthetic sequences
– With 2D motion field known
• Error metric
– Angular measures of error
Synthetic Image Sequence
• 2D motion fields and sequence properties can be controlled and tested in a methodical fashion
– Clean signals
• No occlusion, specularity, shadowing, transparency, etc.
• Optimistic bound on the expected errors with real image sequences
Sinusoidal Inputs
• Superposition of two sinusoidal plane-waves
$$\sin(\mathbf{k}_1 \cdot \mathbf{x} + \omega_1 t) + \sin(\mathbf{k}_2 \cdot \mathbf{x} + \omega_2 t)$$
• Two sinusoidal inputs
– Sinusoid 1: spatial wavelengths of 6 pixels, with
• Orientations of 54° and -27°
• Speeds of 1.63 and 1.02 pixels/frame
• Translates with velocity v = (1.585, 0.863)
– Sinusoid 2: another plaid pattern with wavelength of 16 pixels/cycle and velocity v = (1, 1)
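The plaid velocity quoted for the first sinusoidal input can be recovered from its two component gratings by intersecting their normal-velocity constraints v · n_i = s_i. The sketch below assumes that each quoted orientation is the direction of the grating's normal, and the function name is hypothetical. Under that assumption the numbers on the slide are mutually consistent.

```python
import numpy as np

def intersect_constraints(orientations_deg, speeds):
    """Solve v . n_i = s_i for two gratings with unit normals n_i.

    orientations_deg: directions of the grating normals in degrees (assumed convention).
    speeds: normal speeds s_i in pixels/frame.
    """
    theta = np.radians(orientations_deg)
    normals = np.column_stack([np.cos(theta), np.sin(theta)])
    return np.linalg.solve(normals, np.asarray(speeds, dtype=float))

# Component gratings of Sinusoid 1: orientations 54 and -27 degrees, speeds 1.63 and 1.02
v_plaid = intersect_constraints([54.0, -27.0], [1.63, 1.02])
```

Each grating alone determines only its normal speed. Two non-parallel gratings are the smallest case in which the full 2D velocity becomes recoverable.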
Sinusoid 1
Results
Translating Squares
• Translating squares (width of 40 pixels)
• Velocity
– Uniform velocity $\mathbf{v}_1 = (1, 1)$
– Sometimes $\mathbf{v}_2 = (\frac{4}{3}, \frac{4}{3})$
• Helps illustrate the aperture problem and the inherent spatial smoothing in the different techniques
Square 2
Results
Sinusoidal and Squares
• Sinusoidal inputs
– Dense in space
– Sparse in frequency space
• Squares
– Concentrated in space along the edges
– Richer frequency spectra
Sinusoid 1
Square 2
Barron et al.
3D Camera Motion and Planar Surface
• Textured planar surface
• Simulated translational camera motion
• Translating Tree
• Diverging Tree
3D Camera Motion and Planar Surface
(a) Surface texture
(b) Translating tree
– Camera moves normal to its line of sight, along the X-axis
– Velocity directions all parallel with the image x-axis
– Speeds of 1.73~2.26 pixels/frame
(c) Diverging tree
– Camera moves along its line of sight
– Focus of expansion is at the image center
– Speeds from 1.29 p/f on the left to 1.86 p/f on the right
David Fleet
Yosemite Sequence
• Motion
– Divergent motion in the upper-right
– Clouds translate to the right at 1 p/f
– Velocities in the lower-left ~ 4 p/f
• Difficult sequence
– Wide range of velocities
– Occluding edges between mountains and at the horizon
Lynn Quam
Real Image Sequences
SRI trees NASA sequence
Rotating Rubik Cube Hamburg Taxi
SRI Trees
• Challenging because
– Poor resolution
– Amount of occlusion
– Low contrast
– Velocities ~ 2 pixels/frame
NASA Sequence
• Primarily dilational
• Velocities < 1 pixel/frame
Rotating Rubik Cube
• The cube is rotating counterclockwise on a turntable
• Velocities on the table 1.2~1.4 p/f
• Velocities on the cube 0.2~0.5 p/f
Hamburg Taxi Sequence
• Four moving objects
• Speeds
– 1.0 p/f
– 3.0 p/f
– 3.0 p/f
– 0.3 p/f
http://i21www.ira.uka.de/image_sequences/
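Error Measurement
• Angular measure of error
$$\psi_E = \arccos(\hat{\mathbf{v}}_c \cdot \hat{\mathbf{v}}_e)$$
– $\psi_E$: angular error; $\hat{\mathbf{v}}_c$: correct velocity; $\hat{\mathbf{v}}_e$: estimate
• Velocity $\mathbf{v} = (u, v)$: displacement per time unit
• Velocity $\vec{\mathbf{v}} = (u, v, 1)$: space-time direction vector in units of (pixel, pixel, frame), normalized to $\hat{\mathbf{v}} = \frac{1}{\sqrt{u^2 + v^2 + 1}} (u, v, 1)^T$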
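The angular error between a correct and an estimated 2D velocity can be computed as sketched below, using the unit space-time direction vectors built from (u, v, 1). The function name `angular_error_deg` and the clipping of the cosine to guard against rounding are assumptions for the example.

```python
import numpy as np

def angular_error_deg(v_correct, v_estimate):
    """Angle in degrees between the space-time directions (u, v, 1) of two velocities."""
    c = np.append(np.asarray(v_correct, dtype=float), 1.0)
    e = np.append(np.asarray(v_estimate, dtype=float), 1.0)
    cos_angle = c @ e / (np.linalg.norm(c) * np.linalg.norm(e))
    # Clip so rounding cannot push the cosine outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

# Examples: a perfect estimate, and a zero estimate for a true velocity of (1, 0)
err_exact = angular_error_deg((1.0, 0.0), (1.0, 0.0))
err_zero = angular_error_deg((1.0, 0.0), (0.0, 0.0))
```

Appending the unit temporal component keeps the measure well defined even when one of the velocities is zero, which a relative vector-difference measure would not be.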
Error Measurement
• Angular measure of error
$$\psi_E = \arccos(\hat{\mathbf{v}}_c \cdot \hat{\mathbf{v}}_e)$$
• Advantage
– It handles large and very small speeds without the amplification inherent in a relative measure of vector differences
• Disadvantage
– It is biased: directional errors at small speeds do not give as large an angular error as similar directional errors at higher speeds
Error Measurement
Error Measurement
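• Complementary measure for normal velocity
– Linear relationship between normal velocity $\mathbf{v}_n = s\,\mathbf{n}$ and the correct 2D velocity $\mathbf{v}_c$
$$\mathbf{v}_c \cdot \mathbf{n} - s = 0$$
– All component velocities generated by a translating texture pattern should ideally lie on the plane normal to $\hat{\mathbf{v}}_c$
– Error: angle between the measured component velocity and the constraint plane
$$\psi_E = \arcsin(\hat{\mathbf{v}}_c \cdot \hat{\mathbf{v}}_n), \qquad \hat{\mathbf{v}}_n = \frac{1}{\sqrt{1 + s^2}} (\mathbf{n}, -s)^T$$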
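The error measure for normal (component) velocities can be sketched in the same style: it is the angle between the measured constraint, written as the unit vector (n, -s) / sqrt(1 + s^2), and the plane normal to the correct space-time direction. The function name and the clipping guard are assumptions for the example, and n is assumed to be a unit vector.

```python
import numpy as np

def normal_velocity_error_deg(v_correct, n, s):
    """Angle in degrees between a measured normal velocity s*n and the constraint plane.

    v_correct: correct 2D velocity (u, v).
    n: unit direction of the measured normal velocity.
    s: measured normal speed.
    """
    c = np.append(np.asarray(v_correct, dtype=float), 1.0)
    c_hat = c / np.linalg.norm(c)
    vn_hat = np.append(np.asarray(n, dtype=float), -s) / np.sqrt(1.0 + s**2)
    return float(np.degrees(np.arcsin(np.clip(c_hat @ vn_hat, -1.0, 1.0))))

# Examples: a consistent measurement (v . n = s) and a measurement reporting no motion
err_consistent = normal_velocity_error_deg((1.0, 1.0), (1.0, 0.0), 1.0)
err_static = normal_velocity_error_deg((1.0, 0.0), (1.0, 0.0), 0.0)
```

The error is zero exactly when the measured pair (n, s) satisfies the linear constraint for the correct velocity, whatever the direction of n.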
Error Measurement
• Many ways in which error behavior may be reported
– For synthetic sequences
• Extract subsets of estimates using confidence measures and then report their densities along with their mean errors and standard deviations
– For real image sequences
• Show computed flow fields and discuss qualitative properties
Experimental Results
• Synthetic image sequences, known velocity field
• Error statistics between estimates and ground truth
– Mean and standard deviation
• Density of measurements for subsets of the estimates extracted using confidence measures as thresholds
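The reporting scheme described here (threshold on a confidence measure, then report mean error, standard deviation and density) can be sketched as follows. The function name and the convention that a larger confidence value means a more reliable estimate are assumptions for the illustration.

```python
import numpy as np

def summarize_errors(errors_deg, confidence, threshold):
    """Mean error, standard deviation and density (%) of estimates above a threshold.

    errors_deg: angular error per estimate.
    confidence: confidence value per estimate (larger is assumed to mean more reliable).
    threshold: estimates with confidence below this value are discarded.
    """
    errors_deg = np.asarray(errors_deg, dtype=float)
    keep = np.asarray(confidence, dtype=float) >= threshold
    density = 100.0 * keep.mean()
    if not keep.any():
        return float("nan"), float("nan"), density
    kept = errors_deg[keep]
    return float(kept.mean()), float(kept.std()), float(density)

# Example: four estimates, two of which pass the confidence threshold
mean_err, std_err, density = summarize_errors(
    errors_deg=[1.0, 3.0, 10.0, 20.0],
    confidence=[5.0, 4.0, 1.0, 0.5],
    threshold=2.0,
)
```

Raising the threshold typically trades density for accuracy, which is why the two numbers are reported together.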
Sinusoid I
Sinusoid I
• Generally very good
• Relatively dense, homogeneous structure of the input
– Most flow estimates are not thresholded by the confidence measure
• No smoothness
• The modified method, with improved numerical differentiation, performed better
• Accuracy of the original H-S method approaches that of the modified method as the spatial wavelength is increased (Sinusoid 2, 0.97° ± 2.62°)
• Large standard deviations are not very significant as they are caused by directional errors near the image boundary
• Performance depends on λ: with λ = 100, results were noticeably worse. Here λ = 0.25
Sinusoid I
• Similar accuracy to that produced by modified Horn and Schunck algorithm, which shares the same numerical differentiation
Sinusoid I
• The results are also good
• More accurate results are obtained with Sinusoid 2, as better derivative estimation is possible (0.04° ± 0.23°)
• Results were sensitive to parameters: results were significantly worse with larger values of a
Sinusoid I
• Differential techniques work well on sinusoidal inputs; the matching techniques did not
• Accurate direction, but poor speed estimates
• Main problem: aliasing in the construction of the Laplacian pyramid. Although complete, the Laplacian pyramid produces band-pass channels (levels) that contain substantial aliasing when considered independently of one another
Sinusoid I
• Only when the different levels are combined does the aliasing cancel to provide accurate reconstruction
• With sinusoidal inputs and a coarse-to-fine control strategy on the Laplacian pyramid, aliasing causes major errors at coarse levels that are then propagated systematically to finer levels
Sinusoid I
• Same problems for Singh if implemented with a Laplacian pyramid
• Multiple local minima in the SSD surface with nearly periodic inputs
• The SSD surface is initially evaluated at a small number of integer displacements. The global minimum may fall midway between integer displacements, and other minima may be mistaken for the global minimum if they occur closer to an integer displacement
• The sampling problem occurs less frequently in natural images, which lack exact periodicity, but it will continue to occur unless finer sampling and interpolation are used
Sinusoid I
• Heeger’s technique
– Reasonable results can be expected when the input frequencies match those in the pass-band to which the filters are tuned
– Required assumption: the input has a flat amplitude spectrum (violated by the sinusoidal inputs here)
• Violation is most evident when the frequencies of the component sinusoids are not close to the filter tunings
• Sinusoid I: no results
• Good for others: sinusoids with orientations of 0° and 90°, speeds of 1 p/f, spatiotemporal wavelength of 4 pixels/cycle, errors (3.24° ± 0.05°) with density of 24.3%
Sinusoid I
• To obtain good results with this zero-crossing algorithm, one must choose the standard deviation of the activation kernel so that
– it is small enough to prevent interaction between adjacent edges and
– yet big enough to track each edge over time
• Zero-crossings must be localized to sub-pixel accuracy (not done by Waxman et al.) in order to obtain good qualitative results when the underlying motion is not an integer multiple of pixels
• Sinusoid 2 satisfies this: errors (0.04° ± 0.03°) with a density of 11.94%, reflecting the density of edge locations
Sinusoid I
• Spatiotemporal wavelength of the sinusoid closely matches those to which their filters are tuned. The results are very good
• With general inputs, when input signals have local power concentrated near the boundary of a filter’s amplitude spectra, slight errors appear, as a bias in the component estimates toward the velocity tuning of the filters
Translating Square 2
• Expect normal estimates along the edges and 2d velocities only at the corners
Square 2
Lack of discrimination by the algorithm between measurements of normal velocity v.s. 2d velocity
Translating Square 2
• Poor results for several methods
– Differential methods
• Do not have a way of segmenting the measurements into 2D flow, normal velocity, or unreliable estimates
Translating Square 2
• integrates measurements locally with a clear means of segmenting normal from 2d velocities based on the eigenvalues of the normal matrix
Translating Square 2
• Uses a confidence measure based on the spatial Hessian of the smoothed image sequence
• Higher density due to using a single estimate for each 8x8 region
• but this limits the spatial resolution of the flow field
Still a lack of discrimination by the algorithm between measurements of normal velocity vs. 2D velocity, even with the confidence measure
Translating Square 2
• Visually pleasing but somewhat inaccurate
• The common aperture problem with matching methods
• SSD minima found at integer displacements are extremely sensitive to small variations along the edges
• Even with good confidence at step 1, poor estimates will corrupt the result in step 2
Translating Squares
• Square 1 has integer speeds, Square 2 has subpixel motion
• Most techniques have similar performance on them
– Waxman et al.: performs poorly on Square 2 because the implementation lacks subpixel resolution
Square 2 Normal Velocity
• 2d
• Estimates from level 1 are more accurate than those from level 0
• Correct velocity coincides with the appropriate velocity range for level 1
Translating Squares
• Provide a clear way of examining the normal velocity estimate as distinct from the 2d velocity estimate
• Lucas and Kanade provide two sources of normal estimates explicitly
– Gradient constraint
– LS minimization
Square 2 Normal Velocity
• Density as two quantities
• 17.6%, 65.4%: the density of positions where one or more normal velocities are recovered
• 1.1, 4.2: the average number of velocities at a single point
Realistic Synthetic Data
• General behaviour of the techniques is similar to that on the synthetic sequences above
• Modified Horn and Schunck with presmoothing and improved numerical differentiation
– Large smoothness parameter yielded somewhat poorer results
– Still less accurate than Lucas-Kanade
• Differs in the method used to combine normal constraints
• Confidence measure based on eigenvalues of the normal equations $A^T W^2 A$ performs well
Translating tree
Diverging Tree
Quantization Error
• Gradient-based algorithms
– Initial implementation
• Quantized the Gaussian-smoothed sequence to 8 bits/pixel prior to gradient computation and LS minimization, giving noisy derivatives
• Velocity errors
– Grew 40%~50% for Lucas-Kanade
– Larger for Horn and Schunck (more sensitive to noise)
Translating Tree
• Horn and Schunck’s method of combining normal constraints (the global smoothness constraint) is significantly more sensitive to noise than the local least squares method of Lucas and Kanade
• 2nd-order technique
– Good results on Translating Tree (both accurate and dense)
– Poor on Diverging Tree and Yosemite
• The 1st-order constraint is valid for smooth deformations of the input
• 2nd-order constraints are based on the conservation of the intensity gradient, invalid for rotation, dilation and shear
• Aliasing in the Yosemite sequence makes accurate 2nd-order differentiation difficult
Nagel
• Produces good results
• Confidence measure is not entirely successful
• A large threshold gives more accurate but less dense results
• Diverging tree: 1.0 threshold -> poorer results
– 2nd-order derivatives of intensity and velocity are small in most cases, giving results similar to Horn and Schunck’s
Matching algorithms
• Both methods produce good results on Translating Tree (Singh’s > Anandan’s)
– Larger neighborhood support for Singh’s algorithm
• If 3x3 regions are used instead of 5x5 regions, errors increase to 2.13° ± 5.15° (stage 1) and 1.35° ± 1.68° (stage 2)
• Confidence measures
– Anandan’s, based on c_max and c_min, is not reliable
– Singh’s: inverse eigenvalues of the covariance matrix at stage 1 are useful, but inverse eigenvalues of the covariance matrix are inefficient
• Small changes in a threshold based on the largest eigenvalue dramatically change the density of the estimates
Matching Algorithms
• Poorer results on Diverging Tree
– Singh’s: about an order of magnitude worse, especially at step 1
• Some due to aliasing and confusion between normal and 2D velocities
• Most due to subpixel inaccuracy: errors at noninteger displacements are often two or three times larger than those at integer displacements
– Diverging Tree: a wide range of velocities
– Translating Tree: close to integer displacements
• Use coarser temporal sampling
• Coarse-to-fine approach
Heeger
• Results from different levels
– Level 1 of the pyramid for Translating Tree
• Input speeds coincide with its velocity range of 1.25~2.5 p/f
– Level 0 for Diverging Tree
• Most of its velocities were below 1.25 p/f
– All three levels for Yosemite
• Choose the velocity estimates from the level whose speed range was consistent with the true motion field
Diverging Tree
Normal/Component velocity results
• Yosemite
• 2D velocity
Synthetic Data
• Phase-based method (Fleet and Jepson) produced the most consistently accurate results
– Performs extremely well on Translating Tree and Diverging Tree
– Not significantly better on Yosemite
• Only 15 frames available, so the tuning frequency of the filters has to be increased to reduce the width of support
– Narrow bandwidths give greater sensitivity to aliasing and corruption at high frequencies as compared with the Gaussians used by differential techniques
– A significant amount of aliasing in certain regions of the image
Yosemite
• Fleet and Jepson
– As the phase stability threshold increases, the 2D velocity errors initially increase, but then decrease significantly
• Increasing number of component velocities available for 2D velocity computations, slightly increasing the robustness of the minimization
• Considerable improvement with a tighter constraint on the condition number in the LS system
Yosemite
• Most techniques perform relatively poorly
– Aliasing
– Occluding boundaries, especially at the horizon
• If the sky is excluded from the analysis, performance is better
• But the density does not change
Confidence Measures
• The importance of confidence measures
– All techniques produce velocity estimates with a wide range of accuracy
• Use confidence measures as thresholds to extract subsets of velocities that are reliable
– Perform well
– Useful to distinguish locations at which 2D velocity vs. normal velocity is measured
• Justify confidence measures
– Error behaviour
– Density of estimates
Real Image Data
• With natural image sequences, it is hard to see the difference between different techniques
– Errors of 10% or 20% are hard to discern at this resolution
– Other errors, like normal velocities mistaken for 2D velocities
• Main problem
Main Problem
• For methods that integrate normal constraints with global smoothness constraints
– The main problem is the lack of a confidence measure that allows one to distinguish a normal velocity estimate from a 2D velocity estimate
– Compare Horn and Schunck with the local explicit method
[Figure: Horn and Schunck flow fields for SRI Tree, NASA, Rubik cube, Hamburg Taxi]
[Figure: Lucas and Kanade flow fields for SRI Tree, NASA, Rubik cube, Hamburg Taxi]
Real Image Data
• Differential and phase-based algorithms work well
– Lucas-Kanade, Uras et al., Fleet and Jepson
– Uras et al.
• Sparser set of estimates, but the density is competitive
– Fleet and Jepson
• Extremely good at the ground plane toward the front of the SRI tree sequence compared with the above trees
[Figure: Nagel flow fields for SRI Tree, NASA, Rubik cube, Hamburg Taxi. Gaussian filter: 3 in space and 1.5 in time]
[Figure: Uras et al. flow fields for SRI Tree, NASA, Rubik cube, Hamburg Taxi]
[Figure: Anandan flow fields (no thresholding) for SRI Tree, NASA, Rubik cube, Hamburg Taxi]
[Figure: Singh flow fields (no thresholding) for SRI Tree, NASA, Rubik cube, Hamburg Taxi]
[Figure: Heeger flow fields for SRI Tree, NASA, Rubik cube, Hamburg Taxi]
Heeger
• Based on 3 levels of a Gaussian pyramid
• Choose the estimates with speeds that are consistent with their respective levels of the pyramid
• If consistent estimates exist at more than one level, choose the lowest level
[Figure: Waxman et al. flow fields for SRI Tree, NASA, Rubik cube, Hamburg Taxi. Gaussian with 1.5 space-time]
[Figure: Fleet and Jepson flow fields for SRI Tree, NASA, Rubik cube, Hamburg Taxi]
Summary
• Compare the performance of a number of optical flow techniques: density and accuracy
• 9 algorithms
– Differential methods
– Region-based matching
– Energy-based
– Phase-based
• Comparison between
– Different types of algorithms
– Different methods based on the same concept
Summary
• Both real and synthetic image sequences
– Not severely corrupted by spatial and temporal aliasing
• Comparison
– Most reliable:
• 1st-order, local differential method of Lucas and Kanade
• Local phase-based method of Fleet and Jepson
– 2nd-order differential method of Uras et al. also performs well
– Perform consistently well over all the image sequences
• With confidence measures at different stages
• Limitation: lack of reliable confidence measures
Differential Approaches
• Importance of numerical differentiation and spatiotemporal smoothing
– Some degree of spatiotemporal presmoothing to
• remove small amounts of temporal aliasing and
• improve the subsequent derivative estimates
• Had a marked effect on the quantitative accuracy
– Temporal smoothing is particularly useful
Differential Approaches
• Methods that combine local differential constraints to obtain 2D velocity estimates
– Local explicit methods (local fit to constant or linear models of v)
• Superior in both accuracy and computational efficiency
• More robust with respect to errors in gradient measurement caused by quantization noise (modified Horn and Schunck vs. Lucas and Kanade)
• Because of the existence of a confidence measure to distinguish estimates of normal velocity and 2D velocity
2nd order differential methods
• Produce accurate and relatively dense measurements of 2D velocity
• det(H) is a good confidence measure, more effective than the condition number k(H)
• Inconsistent
– Good on predominantly translational sequences
– Degrades quickly as the amount of higher-order geometric deformation in the input increases (compare Translating Tree and Diverging Tree)
Matching Techniques
• Generally poorer than good differential methods
– SSD-based matching: poor ability to estimate subpixel displacement
• Good for image translation and higher speeds
• Poor: small velocities with a dilational component
– Important to use a neighborhood smoothness constraint (Singh, Anandan)
– Confidence measures are not effective
Energy-based Techniques
• Not as reliable as others
– Nonlinear optimization in Heeger is extremely sensitive to initial conditions and does not produce reliable results
• Generally, difficult to use
Phase-Based Approaches
• Fleet and Jepson produced the most accurate results overall
• However
– Sensitive to temporal aliasing because of the frequency tuning of the filters
– Potential number of confidence measures
• Phase stability, SNR
• Better to combine them into a single measure that would facilitate the LS solution for 2D velocities
– High computational load
• A large number of filters
Conditions of Tests
• Temporal aliasing is not a severe problem and the intensity is differentiable
• Relatively simple image sequences
– Without occlusion, specularities, multiple motions, ...
– Performance measures should be taken as a lower bound on the expected accuracy under general conditions
– Most implementations use only one scale of filtering, multi-scale implementations