TRANSCRIPT
Sequential Learned Linear Predictors for Object Tracking
Tomas Svoboda, joint work with Karel Zimmermann, Jiří Matas, and David Hurych
Czech Technical University, Faculty of Electrical Engineering, Center for Machine Perception, Prague, Czech Republic
http://cmp.felk.cvut.cz/~svoboda
June 23, 2009
Object tracking
I Tracking - iterative estimation of an object pose in a sequence (e.g., 2D position of Basil's head).
I Trade-off among accuracy, speed, and robustness is explicitly taken into account.
video: Basil head
video: Cartoon eye
Tracking
[diagram: template J, current image, and the motion relating them]
Observation (image) I and template J
Tracking of a single point
[diagram: point X moving from the old position to the current position]
[Lucas-Kanade1981] - Iterative minimization
argmin_t ‖I(t) − J‖²   (align image I by motion t)
J ← I(t)   (optional template update)
[diagram: Lucas–Kanade iterative alignment]
video: KLT iterations
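The iterative minimization above can be sketched for the simplest case, a pure 2-D translation. This is a minimal Gauss–Newton, forward-additive variant; the function name, the integer-pixel warp via np.roll, and the convergence threshold are illustrative choices, not the KLT implementation.

```python
import numpy as np

def lucas_kanade_translation(image, template, t0=(0.0, 0.0), n_iters=20):
    """Estimate a pure 2-D translation t aligning `image` to `template`
    by iteratively minimizing ||I(t) - J||^2 (Lucas-Kanade style sketch).

    `image` and `template` are 2-D float arrays of the same shape;
    subpixel sampling and boundary handling are deliberately simplistic.
    """
    t = np.array(t0, dtype=float)
    for _ in range(n_iters):
        # Warp: shift the image by the current translation estimate
        # (integer shift via np.roll keeps the sketch dependency-free).
        shifted = np.roll(image,
                          shift=(-int(round(t[0])), -int(round(t[1]))),
                          axis=(0, 1))
        error = (shifted - template).ravel()
        # Image gradients approximate dI/dt for a translation warp.
        gy, gx = np.gradient(shifted)
        jac = np.stack([gy.ravel(), gx.ravel()], axis=1)  # N x 2
        # Gauss-Newton step: minimize ||jac @ dt + error||^2.
        dt, *_ = np.linalg.lstsq(jac, -error, rcond=None)
        t += dt
        if np.linalg.norm(dt) < 1e-3:
            break
    return t
```

On a smooth patch displaced by a few pixels, a handful of iterations recovers the translation; like the original method, it only converges when the initial estimate lies in the basin of attraction of the minimum.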
Regression - learn the mapping in advance
t = H(I − J)   (align image I by motion t)
J ← I(t)   (optional template update)
[diagram: Jurie–Dhome regression-based alignment]
I [Jurie-BMVC-2002] - learned motion and optional hard template update
I [Cootes-PAMI-2001] - learned regression during AAM iterations
I . . .
Learning alignment for one predictor
I ϕ( [reference patch] ) = (0, 0)ᵀ
I ϕ( [displaced patch] ) = (−25, 0)ᵀ
I ϕ( [displaced patch] ) = (25, −15)ᵀ
(the images inside ϕ(·) are displaced examples of the tracked patch; they are not reproduced in this transcript)
Generating training examples
tᵢ – motion,   Iᵢ = I(x₁, . . . , x_c) – observation
[figure: the template patch perturbed by displacements such as +2, −4, +4, +9, +7, −8 generates the training examples]
Training set: (I, T) with I = [I₁ − J, I₂ − J, . . . , I_d − J] and T = [t₁, t₂, . . . , t_d].
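The generation step can be sketched as follows. A patch is cut at randomly displaced positions around the reference corner and the displacement is stored as the target; the function name, the parameter names, and the sign convention for the targets are illustrative assumptions, not the paper's code.

```python
import numpy as np

def generate_training_set(frame, corner, size, n_examples=200, max_disp=10, rng=None):
    """Build the training pair (I, T) from one frame: sample random
    displacements within the predictor's range, cut the patch at the
    displaced location, and store (patch_i - template, t_i) as columns."""
    rng = np.random.default_rng(rng)
    y0, x0 = corner          # top-left corner of the reference patch
    h, w = size              # patch height and width
    template = frame[y0:y0 + h, x0:x0 + w].astype(float).ravel()
    I = np.empty((h * w, n_examples))   # observations I_i - J, one per column
    T = np.empty((2, n_examples))       # motions t_i, one per column
    for i in range(n_examples):
        # Displacements must stay inside the frame; callers should leave
        # a max_disp-wide margin around the patch.
        dy, dx = rng.integers(-max_disp, max_disp + 1, size=2)
        patch = frame[y0 + dy:y0 + dy + h, x0 + dx:x0 + dx + w].astype(float).ravel()
        I[:, i] = patch - template
        # Targets are the sampled displacements; whether the tracker applies
        # t or -t at run time depends on the chosen warp convention.
        T[:, i] = (dy, dx)
    return I, T, template
```

The columns of I and T are exactly the matrices used in the least-squares learning on the next slide.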
LS learning
ϕ* = argmin_ϕ Σₜ ‖ϕ(I(t ∘ X)) − t‖².
Minimizes the sum of squared errors over the whole training set. Leads to a matrix pseudoinverse computation.
Example for a linear mapping:
H* = argmin_H Σᵢ₌₁ᵈ ‖H(Iᵢ − J) − tᵢ‖²₂ = argmin_H ‖HI − T‖²_F
and after some derivation
H* = T Iᵀ(I Iᵀ)⁻¹ = T I⁺.
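The closed form H* = T I⁺ is a two-line computation. A minimal sketch, with an added ridge term (an assumption, not in the slide) so that I Iᵀ stays invertible for small training sets:

```python
import numpy as np

def learn_linear_predictor(I, T, ridge=1e-6):
    """Least-squares learning of the linear predictor:
    H* = argmin_H ||H I - T||_F^2 = T I^T (I I^T)^{-1} = T I^+ .
    The tiny ridge term regularizes I I^T (illustrative addition)."""
    d = I.shape[0]
    return T @ I.T @ np.linalg.inv(I @ I.T + ridge * np.eye(d))

def predict(H, observation, template):
    """One-shot motion estimate: t = H (I - J)."""
    return H @ (observation - template)
```

Prediction at run time is a single matrix-vector product, which is what makes the learned predictor so much cheaper than iterative minimization.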
Min-max learning
ϕ* = argmin_ϕ maxₜ ‖ϕ(I(t ∘ X)) − t‖∞.
Minimizes the worst case (the biggest estimation error) in the training set. Leads to linear programming.
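The linear-programming reduction can be sketched for a linear predictor: introduce a bound m on the worst-case L∞ error and minimize it subject to one pair of inequalities per example and motion component. This uses scipy.optimize.linprog and is only one way to pose the LP; the paper's exact formulation may differ.

```python
import numpy as np
from scipy.optimize import linprog

def learn_minmax_predictor(I, T):
    """Min-max learning of a linear predictor H, posed as a linear program:
    minimize m subject to |(H I_i - t_i)_j| <= m for every training
    example i and motion component j (worst-case L_inf training error)."""
    d, n = I.shape            # feature dimension, number of examples
    p = T.shape[0]            # motion dimension
    nvar = p * d + 1          # variables: vec(H) row by row, plus the bound m
    c = np.zeros(nvar)
    c[-1] = 1.0               # objective: minimize m
    A_ub, b_ub = [], []
    for i in range(n):
        for j in range(p):
            row = np.zeros(nvar)
            row[j * d:(j + 1) * d] = I[:, i]   #  H[j, :] . I_i
            row[-1] = -1.0                     #  ... - m <= t_ij
            neg = -row
            neg[-1] = -1.0                     # -H[j, :] . I_i - m <= -t_ij
            A_ub.append(row); b_ub.append(T[j, i])
            A_ub.append(neg); b_ub.append(-T[j, i])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(None, None)] * nvar)
    return res.x[:-1].reshape(p, d), res.x[-1]  # predictor H, worst-case error m
```

Compared with the LS solution, the LP optimizes the single biggest training error rather than the average, at the cost of a more expensive learning stage.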
Tracking of a single point by a sequence of predictors
Φ = (ϕ¹, ϕ², ϕ³)
[diagram: starting from the old position of point X, ϕ¹ predicts motion t₁ with uncertainty region o₁, ϕ² refines it with t₂ (region o₂), and ϕ³ with t₃ (region o₃), reaching the new position; each predictor covers its own range of motions]
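Applying the learned sequence at run time is a short loop: each predictor refines the position left by the previous one, so later predictors need a smaller range and can be cheaper. A sketch with integer-pixel patch extraction (the function name and calling convention are illustrative):

```python
import numpy as np

def track_with_sequence(frame, position, predictors, template, patch_size):
    """Apply a sequence Phi = (phi_1, ..., phi_k) of linear predictors:
    each phi_i(I) = H_i (I - J) refines the current position estimate."""
    y, x = position
    h, w = patch_size
    for H in predictors:
        # Cut the patch at the current estimate and predict the residual motion.
        patch = frame[y:y + h, x:x + w].astype(float).ravel()
        t = H @ (patch - template)
        y += int(round(t[0]))
        x += int(round(t[1]))
    return y, x
```

If the patch at the current estimate already matches the template, every predictor outputs (approximately) zero motion and the position is left unchanged.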
Learning of sequential predictor
I Learning - searching for the sequence with predefined range, accuracy, and minimal computational cost.
I [Zimmermann-PAMI-2009] - Dynamic programming estimates the optimal sequence of linear predictors.
I [Zimmermann-IVC-2009] - Branch & bound search allows for time-constrained learning (demo in MATLAB).
[Zimmermann-PAMI-2009] K. Zimmermann, J. Matas, T. Svoboda. Tracking by an Optimal Sequence of Linear Predictors. IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Computer Society, 2009, vol. 31, no. 4, pp. 677–692.
[Zimmermann-IVC-2009] K. Zimmermann, T. Svoboda, J. Matas. Anytime learning for the NoSLLiP tracker. Image and Vision Computing, Elsevier, accepted, available on-line.
Learning of the optimal sequence of linear predictors
[diagram: range, complexity, uncertainty region]
I Range: the set of admissible motions, r.
I Complexity: cardinality of the support set, c.
I Uncertainty region: the region within which all predictions lie, λ. Small red circles show acceptable uncertainty.
[3D plot over r – range, c – complexity, and λ – uncertainty area: achievable predictors ω vs. unachievable predictors, with the λ-minimal predictors highlighted]
Branch and Bound
[two plots of prediction error vs. graph depth, for predictor complexities c = 80, 320, 360, 600 (left) and c = 100, 320, 380, 600 (right)]
Don’t forget to show the live demo!
Tracking with one linear predictor.
Modeling motion by a number of linear predictors.
Motion blur, fast motion, views from acute angles, and other image distortions.
Support set selection
[plot: distribution of MSE over randomized support sets]
I Greedy LS selection (red) of an efficient support set.
I Much better than the 1%-quantile (green) achievable by randomized sampling.
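The greedy selection can be sketched as follows: start from an empty support set and repeatedly add the pixel whose inclusion gives the lowest least-squares training error of the restricted predictor. The naive O(d·c) refitting below is an assumption made for clarity, not the paper's optimized procedure.

```python
import numpy as np

def greedy_support_selection(I, T, c, ridge=1e-6):
    """Greedy LS selection of a support set of c pixels (rows of I).
    Each candidate is scored by the LS error of a predictor learned on
    the current set plus that candidate; the best one is kept."""
    d = I.shape[0]
    selected = []
    best_err = np.inf
    for _ in range(c):
        best_err, best_pix = np.inf, None
        for pix in range(d):
            if pix in selected:
                continue
            rows = selected + [pix]
            Isub = I[rows, :]
            # LS predictor restricted to the candidate support set.
            H = T @ Isub.T @ np.linalg.inv(Isub @ Isub.T + ridge * np.eye(len(rows)))
            err = np.linalg.norm(H @ Isub - T)
            if err < best_err:
                best_err, best_pix = err, pix
        selected.append(best_pix)
    return selected, best_err
```

Because each step refits the full predictor, informative pixels are picked early, which matches the slide's observation that greedy selection beats randomized sampling.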
Tracking of objects with variable appearance
I Variable appearance - the way the object appears in the camera changes due to illumination, non-rigid deformation, out-of-plane rotation, ...
Simultaneous learning of motion and appearance
[diagram: alignment t = ϕ(I; θ), appearance parameters θ = γ(J), template update J ← I(t)]
I Introduce feedback which encodes appearance in a low-dimensional space and adjusts the predictor.
I Appearance parameters are learned in an unsupervised way.
I Simultaneous learning of ϕ and γ ⇒ appearance is encoded in the low-dimensional space that is most suitable for motion estimation.
Learning appearance
[diagram: Jurie–Dhome, t = H(I − J), optional template update J ← I(t)]
Learning appearance
[diagram: Cootes–Edwards, t = H(I − J(θ)), template J(θ) built by PCA, appearance parameters θ updated from I(t)]
Learning appearance – our approach
[diagram, top: Cootes–Edwards, t = H(I − J(θ)) with PCA template J(θ); bottom: our algorithm, motion estimator t = ϕ(I; θ) and appearance encoder θ = γ(I(t))]
Learning the appearance encoder γ
I Current appearance encoded in low-dimensional parameters.
I γ( [patch under appearance 1] ) = θ₁,   γ( [patch under appearance 2] ) = θ₂
Learning the tracker ϕ(I; θ)
I ϕ( [patch] ; θ₁) = (0, 0)ᵀ
I ϕ( [patch] ; θ₁) = (−25, 0)ᵀ
I ϕ( [patch] ; θ₁) = (25, −15)ᵀ
I ϕ( [patch] ; θ₂) = (0, 0)ᵀ
I ϕ( [patch] ; θ₂) = (−25, 0)ᵀ
I ϕ( [patch] ; θ₂) = (25, −15)ᵀ
(the patches inside ϕ(·) are displaced examples under the two appearances; they are not reproduced in this transcript)
Simultaneous learning of ϕ and γ
I Learning = minimization of the least-squares error
(ϕ*, γ*) = argmin_{ϕ,γ} [ϕ([patch]; γ([patch])) − (0, 0)ᵀ]² + [ϕ([patch]; γ([patch])) − (−25, 0)ᵀ]² + [ϕ([patch]; γ([patch])) − (25, −15)ᵀ]² + [ϕ([patch]; γ([patch])) − (0, 0)ᵀ]² + [ϕ([patch]; γ([patch])) − (−25, 0)ᵀ]² + [ϕ([patch]; γ([patch])) − (25, −15)ᵀ]²
(one squared term per training example, over both appearances; patch images not reproduced)
Linear mapping
I γ(J): θ = GJ
I ϕ(I, θ): t = (H₀ + θ₁H₁ + · · · + θₙHₙ)I
I The criterion is a sum of squares of bilinear functions.
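Evaluating the two linear mappings is a direct transcription of the formulas above; the function names are illustrative.

```python
import numpy as np

def appearance_encode(G, template):
    """gamma(J): low-dimensional appearance parameters theta = G J."""
    return G @ template

def bilinear_predict(Hs, theta, observation):
    """phi(I, theta): t = (H_0 + theta_1 H_1 + ... + theta_n H_n) I.
    `Hs` is the list [H_0, H_1, ..., H_n]. With theta = G J substituted,
    each training residual is bilinear in (G, Hs)."""
    H = Hs[0] + sum(th * Hk for th, Hk in zip(theta, Hs[1:]))
    return H @ observation
```

Note that for fixed θ the prediction is linear in the H matrices, and for fixed H matrices the prediction is linear in θ (hence in G), which is exactly what the alternating minimization on the next slides exploits.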
Algorithm: iterative minimization of criterion e(ϕ, γ)
ϕ – motion (geometry) mapping, γ – appearance mapping
I Iterative minimization:
I initialization γ₀ = rand
I ϕ₁ = argmin_ϕ e(ϕ, γ₀)
I γ₁ = argmin_γ e(ϕ₁, γ)
I ϕ₂ = argmin_ϕ e(ϕ, γ₁)
I γ₂ = argmin_γ e(ϕ₂, γ)
I ϕ₃ = argmin_ϕ e(ϕ, γ₂)
I γ₃ = argmin_γ e(ϕ₃, γ)
I until convergence reached
[contour plot over (ϕ, γ), color encoding the criterion value e(ϕ, γ): the alternating iterates [ϕ₀, γ₀] → [ϕ₁, γ₀] → [ϕ₁, γ₁] → · · · → [ϕ₉, γ₉] descend toward the minimum]
[plots: MSE vs. iterations, decreasing until convergence]
I Global optimality for linear ϕ, γ experimentally shown.
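Because each residual is bilinear, both half-steps above are ordinary least-squares problems. A sketch of the alternation for the special case of a single scalar appearance parameter (the paper's model allows n parameters; restricting to n = 1, the ridge term, and all names here are simplifying assumptions):

```python
import numpy as np

def alternate_learn(Is, Js, Ts, n_iters=200, ridge=1e-8, seed=0):
    """Alternating least squares for the bilinear model
        theta_i = G J_i,   t_i ~ (H0 + theta_i * H1) I_i
    with scalar theta. Each half-step is an LS problem, so the criterion
    e(phi, gamma) decreases monotonically."""
    d, n = Is.shape
    dJ = Js.shape[0]
    G = np.random.default_rng(seed).standard_normal((1, dJ))  # gamma_0 = rand
    for _ in range(n_iters):
        theta = (G @ Js).ravel()                  # theta_i = G J_i
        # phi-step: t = [H0 H1] [I_i; theta_i I_i] is linear in (H0, H1).
        X = np.vstack([Is, theta * Is])
        H = Ts @ X.T @ np.linalg.inv(X @ X.T + ridge * np.eye(2 * d))
        H0, H1 = H[:, :d], H[:, d:]
        # gamma-step: the residual r_i = t_i - H0 I_i = (G J_i)(H1 I_i)
        # is linear in G; solve its normal equations.
        A = H1 @ Is                               # a_i = H1 I_i (columns)
        R = Ts - H0 @ Is
        lhs = (Js * (A * A).sum(axis=0)) @ Js.T + ridge * np.eye(dJ)
        rhs = (A * R).sum(axis=0)[None, :] @ Js.T
        G = rhs @ np.linalg.inv(lhs)
    return H0, H1, G
```

The recovered G and H1 are only identifiable up to a common scale (G·s pairs with H1/s), but the predicted motions are invariant to that ambiguity, which is all that tracking needs.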
Simultaneous learning of motion and appearance
Experiments - videos II
Conclusions
I Learnable and very efficient tracking of objects with variable appearance.
I Accuracy, speed, and robustness explicitly taken into account.
I Simultaneous learning of motion and appearance.
Limitations
I Small, thin objects intractable.
Data, papers, and various implementations freely available at http://cmp.felk.cvut.cz/demos/Tracking/linTrack/