
Space-time interest points

Ivan Laptev and Tony Lindeberg

Computational Vision and Active Perception Laboratory (CVAP)
Dept of Numerical Analysis and Computer Science
KTH (Royal Institute of Technology)
SE-100 44 Stockholm, Sweden



General motivation

Spatio-temporal image data contains rich information about the external world.

Traditional methods for video analysis include

• optical flow estimation;

• tracking of features/models over time.


Events in video are often characterised by non-constant motion and non-constant appearance


Spatio-temporal data

Idea: detect points with high spatio-temporal variation of image values

Direct method for event detection


Why local features in time?

Non-constant motion in images may be an indication of

• physical interaction between objects in the world (ball bouncing off the ground, car crash, etc.);

• non-rigid motion, e.g. relative motion of body parts, gestures, etc.;

• occlusions/disocclusions in the field of view.

Goal:

• make a sparse and informative representation of complex motion patterns;

• obtain robustness w.r.t. missing data (occlusions) and outliers (dynamic, complex background).


Interest points in space

(Harris and Stephens 1988): image points with high variation of values in both image directions

High eigenvalues of the second-moment matrix integrated over the local neighbourhood

$\mu = g(\cdot;\sigma_i^2) * \begin{pmatrix} L_x^2 & L_x L_y \\ L_x L_y & L_y^2 \end{pmatrix}$

where $L_x$, $L_y$ are Gaussian derivatives

Select points with positive maxima of the corner function

$H = \det(\mu) - k\,\mathrm{trace}^2(\mu)$
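A minimal sketch of the spatial corner function, assuming SciPy's `gaussian_filter` for the Gaussian derivatives and the integration window. The name `harris_2d`, the integration-scale factor `s` and the default `k = 0.04` are illustrative assumptions, not values taken from the slides.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def harris_2d(image, sigma=1.5, s=2.0, k=0.04):
    """Corner function H = det(mu) - k * trace(mu)**2 at every pixel."""
    image = np.asarray(image, dtype=float)
    # Gaussian derivatives at the local scale sigma (axis 0 = y, axis 1 = x).
    Lx = gaussian_filter(image, sigma, order=(0, 1))
    Ly = gaussian_filter(image, sigma, order=(1, 0))
    # Integrate the derivative products with a Gaussian window of
    # variance s * sigma**2 (assumed integration scale).
    sigma_i = np.sqrt(s) * sigma
    mxx = gaussian_filter(Lx * Lx, sigma_i)
    mxy = gaussian_filter(Lx * Ly, sigma_i)
    myy = gaussian_filter(Ly * Ly, sigma_i)
    det = mxx * myy - mxy ** 2
    trace = mxx + myy
    return det - k * trace ** 2


if __name__ == "__main__":
    # Synthetic bright square: the strongest response should lie near a corner.
    img = np.zeros((64, 64))
    img[20:44, 20:44] = 1.0
    H = harris_2d(img)
    print(np.unravel_index(np.argmax(H), H.shape))
```

Interest points would then be taken as the positive local maxima of the returned array.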


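Interest points in space-time

High variation of image values in both space and time

Extend the Harris corner function into the 3D spatio-temporal domain; compute the second-moment matrix

$\mu = g(\cdot;\sigma_i^2,\tau_i^2) * \begin{pmatrix} L_x^2 & L_x L_y & L_x L_t \\ L_x L_y & L_y^2 & L_y L_t \\ L_x L_t & L_y L_t & L_t^2 \end{pmatrix}$

where $L_x$, $L_y$, $L_t$ are Gaussian derivatives in space-time obtained by spatio-temporal convolution of the image sequence $f$:

$L_\xi = \partial_\xi \left( g(\cdot;\sigma^2,\tau^2) * f \right), \quad \xi \in \{x, y, t\}$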



Interest points in space-time

Points with high space-time variations of image values correspond to the maxima of

$H = \det(\mu) - k\,\mathrm{trace}^3(\mu) = \lambda_1\lambda_2\lambda_3 - k(\lambda_1+\lambda_2+\lambda_3)^3$

where $\lambda_1, \lambda_2, \lambda_3$ are the eigenvalues of the second-moment matrix $\mu$.

Distinct scale parameters are used for the spatial scale $\sigma$ and the temporal scale $\tau$: the spatial and temporal extents of events are independent in general.

Convolution with Gaussian kernels violates the causality constraint of the temporal domain. Alternative (recursive) kernels can be used to address this problem (Koenderink 1988, Lindeberg & Fagerström 1996, Florack 1997).
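A minimal sketch of the space-time corner function, assuming a video stored as an array with axes (t, y, x) and SciPy's `gaussian_filter` for smoothing. The name `harris_3d`, the integration factor `s` and the default `k = 0.005` are assumptions made for illustration. The temporal smoothing here is an ordinary (non-causal) Gaussian, so it only suits offline processing.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def harris_3d(video, sigma=1.5, tau=1.5, s=2.0, k=0.005):
    """Space-time corner function H = det(mu) - k * trace(mu)**3."""
    video = np.asarray(video, dtype=float)      # axes: (t, y, x)
    scale = (tau, sigma, sigma)                 # temporal and spatial std
    # First-order Gaussian derivatives in space-time.
    Lt = gaussian_filter(video, scale, order=(1, 0, 0))
    Ly = gaussian_filter(video, scale, order=(0, 1, 0))
    Lx = gaussian_filter(video, scale, order=(0, 0, 1))
    # Integration window with variances s * sigma**2 and s * tau**2 (assumed).
    window = tuple(np.sqrt(s) * v for v in scale)

    def integrate(a):
        return gaussian_filter(a, window)

    mxx, myy, mtt = integrate(Lx * Lx), integrate(Ly * Ly), integrate(Lt * Lt)
    mxy, mxt, myt = integrate(Lx * Ly), integrate(Lx * Lt), integrate(Ly * Lt)
    # Determinant of the symmetric 3x3 matrix, expanded along the first row.
    det = (mxx * (myy * mtt - myt ** 2)
           - mxy * (mxy * mtt - mxt * myt)
           + mxt * (mxy * myt - mxt * myy))
    trace = mxx + myy + mtt
    return det - k * trace ** 3


if __name__ == "__main__":
    # Synthetic spatio-temporal "corner": a step in x, y and t at index 16.
    video = np.zeros((32, 32, 32))
    video[16:, 16:, 16:] = 1.0
    H = harris_3d(video)
    print(np.unravel_index(np.argmax(H), H.shape))
```

Using separate `sigma` and `tau` keeps the spatial and temporal scales independent, as the slide requires.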


Experiments with synthetic sequences

Figure panels: spatio-temporal "corner"; Collision I


Experiments with synthetic sequences

Collision II






Motivation for scale selection






Motivation for velocity adaptation

Figure panels: vx = -0.8; vx = 1.4; vx = 0.0


Spatio-temporal scale selection

Estimate the spatio-temporal extent of image structures

Local scale estimation has been investigated and applied previously in the spatial domain (Lindeberg IJCV’98; Chomat ECCV’00; Mikolajczyk and Schmid ICCV’01):

Here: Extend scale selection into the spatio-temporal domain; estimate spatial and temporal scale parameters

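Task: find normalisation parameters (a, b, c, d) of

$L_{xx,norm} = \sigma^{2a}\tau^{2b} L_{xx}, \quad L_{yy,norm} = \sigma^{2a}\tau^{2b} L_{yy}, \quad L_{tt,norm} = \sigma^{2c}\tau^{2d} L_{tt}$

such that the normalised derivatives obtain extrema at scales corresponding to the extents of image structures in space-time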


Spatio-temporal scale selection

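Analyse a spatio-temporal Gaussian blob with spatial variance $\sigma_0^2$ and temporal variance $\tau_0^2$

Extrema constraints: the normalised derivatives should assume extrema over scales at $\sigma^2 = \sigma_0^2$, $\tau^2 = \tau_0^2$

These give the parameter values a = 1, b = 1/4, c = 1/2, d = 3/4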


Spatio-temporal scale selection

The normalised spatio-temporal Laplacian operator

$\nabla^2_{norm} L = \sigma^2 \tau^{1/2} (L_{xx} + L_{yy}) + \sigma \tau^{3/2} L_{tt}$

assumes extremal values at positions and scales corresponding to the centres and the spatio-temporal extents of Gaussian blobs
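A small numerical check of this scale-selection property, assuming an ideal Gaussian blob with spatial variance `s0` and temporal variance `u0`, so that its scale-space representation is known in closed form (Gaussian smoothing adds the variances). The function name and the grid values are illustrative assumptions.

```python
import numpy as np


def normalised_laplacian_at_centre(s, u, s0, u0):
    """sigma^2 tau^(1/2) (Lxx + Lyy) + sigma tau^(3/2) Ltt at the blob centre.

    s = sigma^2 and u = tau^2 are the scale parameters; s0 and u0 are the
    spatial and temporal variances of the Gaussian blob.
    """
    S, U = s0 + s, u0 + u                       # variances after smoothing
    peak = 1.0 / np.sqrt((2.0 * np.pi) ** 3 * S ** 2 * U)   # L at the centre
    lxx_plus_lyy = -2.0 * peak / S              # second derivatives of a
    ltt = -peak / U                             # Gaussian at its centre
    return s * u ** 0.25 * lxx_plus_lyy + np.sqrt(s) * u ** 0.75 * ltt


if __name__ == "__main__":
    s0, u0 = 4.0, 2.0
    s_grid = np.arange(0.5, 10.01, 0.5)
    u_grid = np.arange(0.5, 10.01, 0.5)
    values = normalised_laplacian_at_centre(
        s_grid[:, None], u_grid[None, :], s0, u0)
    i, j = np.unravel_index(np.argmin(values), values.shape)
    # The extremum (a minimum for a bright blob) is expected at (s0, u0).
    print(s_grid[i], u_grid[j])
```

Both terms of the operator are extremal at the same scales, which is what the exponents a = 1, b = 1/4, c = 1/2, d = 3/4 are chosen to achieve.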


Velocity adaptation

Want to adapt point neighbourhoods to the direction of motion and obtain invariance w.r.t. first-order motion

Stationary pattern:

First-order motion is described by the Galilean transformation


and it follows


Velocity adaptation

expansion gives

However, this scheme needs an estimate of the velocity (vx, vy) in advance in order to adapt the smoothing filter kernel.

Iteratively estimate the velocity and adapt the filter kernel until the fixed-point condition is reached:


(Similar approach for affine shape adaptation in space, Lindeberg)
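A minimal sketch of how a Galilean transformation acts on the second-moment matrix and how a velocity estimate can be read back from it. The convention x' = x + vx t, y' = y + vy t, t' = t, the function names, and the assumption that the smoothing kernel is transformed along with the pattern are choices made here; the relation mu' = G^{-T} mu G^{-1} follows from the chain rule under that convention and may differ in sign or notation from the slide's lost equations.

```python
import numpy as np


def galilean(vx, vy):
    """Matrix G of the assumed Galilean transformation p' = G p, p = (x, y, t)."""
    return np.array([[1.0, 0.0, vx],
                     [0.0, 1.0, vy],
                     [0.0, 0.0, 1.0]])


def transform_second_moment(mu, vx, vy):
    """Second-moment matrix of the moving pattern: mu' = G^{-T} mu G^{-1}."""
    g_inv = np.linalg.inv(galilean(vx, vy))
    return g_inv.T @ mu @ g_inv


def estimate_velocity(mu):
    """Velocity that removes the space-time coupling terms mu_xt and mu_yt."""
    spatial = mu[:2, :2]            # [[mu_xx, mu_xy], [mu_xy, mu_yy]]
    coupling = mu[:2, 2]            # (mu_xt, mu_yt)
    return -np.linalg.solve(spatial, coupling)


if __name__ == "__main__":
    # Stationary pattern: no coupling between spatial and temporal derivatives.
    mu = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, 0.0, 0.3]])
    mu_moving = transform_second_moment(mu, 1.4, -0.8)
    print(estimate_velocity(mu_moving))
```

In an iterative scheme, the estimate would be used to re-shape the smoothing kernel, and the loop would stop once the estimated residual velocity is close to zero.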


Scale and velocity adaptation

Find interest points p = (x, y, t, σ², τ², vx, vy) that

• are maxima of the corner function H over (x, y, t);

• are maxima of the normalised Laplacian over (σ², τ²);

• satisfy the fixed-point condition.

1. Find interest points P for a set of sampled (σ², τ², vx, vy)
2. For each pi in P
3. select new scale (σ², τ²) at (x, y, t) that maximises the Laplacian in the local scale-neighbourhood
4. estimate velocity (vx, vy)
5. re-detect the interest point for the new scales and velocities
6. if (σ², τ², vx, vy) changed, repeat from 3
7. else i = i + 1

(Similar in spatial domain: Mikolajczyk and Schmid ICCV01, ECCV02)
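The control flow of steps 3 to 6 for a single point can be sketched as below. The three callables `select_scale`, `estimate_velocity` and `redetect` are hypothetical placeholders for the operations named in the steps; the tolerance and the iteration cap are assumptions added so that the loop always terminates.

```python
def adapt_point(xyt, params, select_scale, estimate_velocity, redetect,
                tol=1e-3, max_iter=50):
    """Iterate scale selection and velocity estimation for one interest point.

    xyt    -- (x, y, t) position of the point
    params -- (sigma2, tau2, vx, vy) current scales and velocity
    Returns (xyt, params, converged).
    """
    for _ in range(max_iter):
        sigma2, tau2 = select_scale(xyt, params)             # step 3
        vx, vy = estimate_velocity(xyt, (sigma2, tau2) + tuple(params[2:]))  # step 4
        new_params = (sigma2, tau2, vx, vy)
        xyt = redetect(xyt, new_params)                      # step 5
        change = max(abs(a - b) for a, b in zip(new_params, params))
        params = new_params
        if change <= tol:                                    # step 6: no change
            return xyt, params, True
    return xyt, params, False


if __name__ == "__main__":
    # Toy stand-ins that move the scales halfway towards (4, 2) per iteration.
    result = adapt_point(
        (10.0, 12.0, 5.0), (1.0, 1.0, 0.0, 0.0),
        select_scale=lambda p, q: ((q[0] + 4.0) / 2.0, (q[1] + 2.0) / 2.0),
        estimate_velocity=lambda p, q: (0.5, 0.0),
        redetect=lambda p, q: p,
    )
    print(result)
```

Passing the operations in as functions keeps the loop independent of how the Laplacian and the velocity are actually computed.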


Scale- and velocity-adapted interest points


Experiments

Figure panels: stationary camera; stabilised camera


Figure panels: no adaptation; scale adaptation; scale and velocity adaptation



Invariance with respect to size changes



Selection of temporal scales captures the temporal extents of events


Applications of interest points

(preliminary results)

Classify detected interest points using their spatio-temporal neighbourhoods

Represent video data by a set of classified interest points (features)

Align video sequences by matching spatio-temporal features

Recognise motion patterns using probability distribution of features derived from training sequences


Classification of events

When analysing periodic motion such as the gait pattern, the interest points with similar spatio-temporal structure are likely to correspond to the interesting events, while the others are more likely to be caused by noise.

Describe each interest point pi, i=1,...,n by the local responses of spatio-temporal Gaussian derivatives:

and normalise descriptors w.r.t. the covariance

Group similar points in the space of normalised descriptors using k-means clustering

Select significant clusters and represent each of them by the mean and the covariance matrix
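The normalisation, grouping and cluster-selection steps can be sketched as follows, assuming the descriptors are rows of a NumPy array. The whitening via a Cholesky factor, the farthest-point initialisation of k-means and all function names are illustrative choices, not details given on the slides.

```python
import numpy as np


def whiten(descriptors):
    """Normalise descriptors w.r.t. their covariance (unit covariance after)."""
    X = np.asarray(descriptors, dtype=float)
    Xc = X - X.mean(axis=0)
    L = np.linalg.cholesky(np.cov(Xc, rowvar=False))   # covariance = L L^T
    return np.linalg.solve(L, Xc.T).T


def kmeans(X, k, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialisation."""
    X = np.asarray(X, dtype=float)
    centres = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(d)])
    centres = np.array(centres)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centres[j] for j in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    return labels, centres


def significant_clusters(X, labels, n_keep):
    """Mean and covariance of the n_keep clusters with the most points."""
    X = np.asarray(X, dtype=float)
    keep = np.argsort(np.bincount(labels))[::-1][:n_keep]
    return [(X[labels == j].mean(axis=0), np.cov(X[labels == j], rowvar=False))
            for j in keep]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    jets = np.vstack([rng.normal(0.0, 0.3, (30, 3)),
                      rng.normal(10.0, 0.3, (30, 3))])
    labels, centres = kmeans(jets, 2)
    print(np.bincount(labels))
```

In the pipeline described above, `kmeans` would be run on the output of `whiten`; the demo clusters raw points only to keep the toy data easy to inspect.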


K-means clustering

For the gait pattern, four significant clusters (clusters with most points) correspond to distinct spatio-temporal events








Application I: Sequence matching

Represent the model sequence and the test sequence by a set of classified spatio-temporal points.

Find a valid transformation of a model that brings model features in correspondence with data features.

Problem: Find walking people and estimate their poses from image sequences

Match a model sequence with data sequences using spatio-temporal interest points

Note: the feature matching is defined in a 3D spatio-temporal window


Walking model

Represent the gait pattern using classified spatio-temporal points corresponding to one gait cycle

Define the state of the model X for the moment t0 by the position, the size, the phase and the velocity of a person:

Associate each phase with a silhouette of a person extracted from the original sequence


Sequence alignment

Given a data sequence with the current moment t0, detect and classify interest points in the time window of length tw: (t0, t0-tw)

Transform model features according to X and, for each model feature fm,i = (xm,i, ym,i, tm,i, σm,i, τm,i, cm,i), compute its distance di to the closest data feature fd,j of the same class, cd,j = cm,i

Define the "fit function" D of model configuration X as a sum of distances of all features weighted w.r.t. their "age" (t0-tm), such that recent features get more influence on the matching
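A minimal sketch of such a fit function, assuming each feature is a row (x, y, t, class), a Euclidean distance over (x, y, t) only, and an exponential age weight exp(-(t0 - tm) / xi). The distance, the weighting form, the parameter `xi` and the handling of classes without data features are all assumptions; the slide's own formulas are not preserved, and the scale components of the features are ignored here for brevity.

```python
import numpy as np


def fit_function(model, data, t0, xi=10.0):
    """Age-weighted sum of distances from model features to data features.

    model, data -- arrays with rows (x, y, t, class label); the model features
                   are assumed to be already transformed according to X
    t0          -- current moment
    xi          -- decay constant of the (assumed) exponential age weight
    """
    model = np.asarray(model, dtype=float)
    data = np.asarray(data, dtype=float)
    total = 0.0
    for x, y, t, cls in model:
        same_class = data[data[:, 3] == cls]
        if len(same_class) == 0:
            continue                    # assumption: skip unmatched classes
        diffs = same_class[:, :3] - (x, y, t)
        distance = np.sqrt((diffs ** 2).sum(axis=1)).min()
        weight = np.exp(-(t0 - t) / xi)  # recent features weigh more
        total += weight * distance
    return total


if __name__ == "__main__":
    data = np.array([[0.0, 0.0, 9.0, 1], [5.0, 5.0, 8.0, 2]])
    shifted = data + np.array([1.0, 0.0, 0.0, 0])
    print(fit_function(data, data, t0=10.0), fit_function(shifted, data, t0=10.0))
```

A perfectly aligned model gives D = 0, and a misaligned recent feature raises D more than an equally misaligned older one.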


Sequence alignment

Figure labels: data features; model features

At each moment t0 minimize D with respect to X using standard Gauss-Newton minimization method
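A generic Gauss-Newton iteration can be sketched as below, assuming the fit function is written as a sum of squared residuals r(X) and that the Jacobian is approximated by forward differences. The function name, step size `eps` and stopping rule are assumptions; the slides do not say how the derivatives were obtained.

```python
import numpy as np


def gauss_newton(residual, x0, n_iter=20, eps=1e-6, tol=1e-10):
    """Minimise sum(residual(x)**2) with Gauss-Newton steps.

    residual -- function mapping a parameter vector to a residual vector
    x0       -- initial parameter vector
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        r = np.asarray(residual(x), dtype=float)
        # Forward-difference approximation of the Jacobian dr/dx.
        J = np.empty((r.size, x.size))
        for j in range(x.size):
            dx = np.zeros_like(x)
            dx[j] = eps
            J[:, j] = (np.asarray(residual(x + dx), dtype=float) - r) / eps
        # Gauss-Newton step: least-squares solution of J step = -r.
        step = np.linalg.lstsq(J, -r, rcond=None)[0]
        x = x + step
        if np.linalg.norm(step) < tol:
            break
    return x


if __name__ == "__main__":
    A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    target = np.array([1.0, 2.0])
    print(gauss_newton(lambda x: A @ x - A @ target, [0.0, 0.0]))
```

In a tracking setting, the minimiser found at one moment t0 would be a natural starting point `x0` for the next.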


Application II: Action recognition

Figure panels: Walking; Exercise; Running; Cycling

1. Detect spatio-temporal velocity- and scale-adapted interest points and compute their jet descriptors

2. Cluster all the descriptors using k-means

3. Compute distributions of points over detected clusters for each sequence separately
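The histogram step and a threshold-based decision can be sketched as follows, assuming Euclidean nearest-mean labelling of descriptors and a chi-square histogram distance. Both choices, the threshold value and all names are assumptions; the slides only state that several histogram-distance measures were compared.

```python
import numpy as np


def label_histogram(descriptors, cluster_means):
    """Normalised histogram of nearest-cluster labels for one sequence."""
    descriptors = np.asarray(descriptors, dtype=float)
    cluster_means = np.asarray(cluster_means, dtype=float)
    d = ((descriptors[:, None, :] - cluster_means[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    hist = np.bincount(labels, minlength=len(cluster_means)).astype(float)
    return hist / hist.sum()


def chi2_distance(h1, h2, eps=1e-12):
    """Chi-square distance between two normalised histograms."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))


def classify(hist, model_hists, threshold):
    """Name of the closest model histogram, or None if it is too far away."""
    name, dist = min(((n, chi2_distance(hist, h)) for n, h in model_hists.items()),
                     key=lambda pair: pair[1])
    return name if dist <= threshold else None


if __name__ == "__main__":
    means = np.array([[0.0, 0.0], [10.0, 10.0]])
    jets = np.array([[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1], [9.8, 10.1]])
    models = {"walking": np.array([0.75, 0.25]), "running": np.array([0.2, 0.8])}
    print(classify(label_histogram(jets, means), models, threshold=0.1))
```

Returning `None` above the threshold gives a natural way to reject background sequences, and sweeping the threshold traces out an ROC curve.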


Model histograms (axis label: cluster id)

(related to Leung & Malik, IJCV01)







Test sequences



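1. Detect interest points and classify their jet responses w.r.t. the cluster means

2. Compute the distribution of cluster labels and classify the sequence as an action if the distance between this distribution and the model histogram of that action is below a decision threshold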


Confusion matrix (test classes: walking, exercise, running, cycling, background)



ROC curve corresponding to changes of the decision threshold when classifying 37 sequences using different histogram-distance measures

Axis labels: % correct; % false


Performance comparison

Velocity- and scale-adapted space-time interest points

Non-adapted space-time interest points

Spatial interest points


Back-projection of points

Test running

Test cycling

Test walking

Test exercise



Interest point detection

• Points with high variation of image values in space-time are detected

• Direct approach for event detection (no tracking needed)

• Invariant treatment of events at different spatial and temporal scales; invariance w.r.t. camera motion

Applications

• Classified space-time features provide a compact representation of video information

• Interpretation of scenes with complex, non-stationary


Future work: contrast and orientation invariant descriptors, large-scale action recognition experiments, integration of multi-local constraints, on-line implementation.