
Tutorial on Particle Filters

assembled and extended by Longin Jan Latecki, Temple University, latecki@temple.edu

using slides from

Keith Copsey, Pattern and Information Processing Group, DERA Malvern;

D. Fox, J. Hightower, L. Liao, D. Schulz, and G. Borriello, Univ. of Washington, Seattle;

Honggang Zhang, Univ. of Maryland, College Park;

Miodrag Bolic, University of Ottawa, Canada;

Michael Pfeiffer, TU Graz, Austria

Outline

Introduction to particle filters

– Recursive Bayesian estimation

Bayesian Importance sampling

– Sequential Importance sampling (SIS)

– Sampling Importance resampling (SIR)

Improvements to SIR

– On-line Markov chain Monte Carlo

Basic Particle Filter algorithm

Example for robot localization

Conclusions

Particle Filters

Sequential Monte Carlo methods for on-line learning within a Bayesian framework.

Known as

– Particle filters

– Sequential sampling-importance resampling (SIR)

– Bootstrap filters

– Condensation trackers

– Interacting particle approximations

– Survival of the fittest

History

First attempts – simulations of growing polymers

– M. N. Rosenbluth and A. W. Rosenbluth, “Monte Carlo calculation of the average extension of molecular chains,” Journal of Chemical Physics, vol. 23, no. 2, pp. 356–359, 1956.

First application in signal processing – 1993

– N. J. Gordon, D. J. Salmond, and A. F. M. Smith, “Novel approach to nonlinear/non-Gaussian Bayesian state estimation,” IEE Proceedings-F, vol. 140, no. 2, pp. 107–113, 1993.

Books

– A. Doucet, N. de Freitas, and N. Gordon, Eds., Sequential Monte Carlo Methods in Practice, Springer, 2001.

– B. Ristic, S. Arulampalam, N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications, Artech House Publishers, 2004.

Tutorials

– M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking,” IEEE Transactions on Signal Processing, vol. 50, no. 2, pp. 174–188, 2002.

Problem Statement

Tracking the state of a system as it evolves over time

Sequentially arriving (noisy or ambiguous) observations

We want to know: Best possible estimate of the hidden variables

Solution: Sequential Update

Storing and processing all incoming measurements is inconvenient and may be impossible

Recursive filtering:

– Predict next state pdf from current estimate

– Update the prediction using sequentially arriving new measurements

Optimal Bayesian solution: recursively calculating exact posterior density

Particle filtering ideas

A particle filter is a technique for implementing a recursive Bayesian filter by Monte Carlo sampling

The idea: represent the posterior density by a set of random particles with associated weights.

Compute estimates based on these samples and weights

[Figure: weighted particles in the sample space approximating the posterior density]

Global Localization of Robot with Sonar

http://www.cs.washington.edu/ai/Mobile_Robotics/mcl/animations/global-floor.gif

Tools needed

$$p(x_t) = \int p(x_t \mid x_{t-1})\, p(x_{t-1})\, dx_{t-1}$$

$$p(x_t \mid z_t) = \frac{p(z_t \mid x_t)\, p(x_t)}{p(z_t)}$$

Recall “law of total probability” (or marginalization) and “Bayes’ rule”
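The two tools can be checked on a tiny discrete example. The sketch below is only an illustration: the two states ("open", "closed") and all probabilities are made-up assumptions, not values from the slides.

```python
# Hypothetical two-state system; every number below is an illustrative assumption.
prior_prev = {"open": 0.5, "closed": 0.5}          # p(x_{t-1})
transition = {                                      # p(x_t | x_{t-1}), keyed by x_{t-1}
    "open": {"open": 0.9, "closed": 0.1},
    "closed": {"open": 0.2, "closed": 0.8},
}
likelihood = {"open": 0.6, "closed": 0.3}           # p(z_t | x_t) for the observed z_t


def marginalize(prior_prev, transition):
    """Law of total probability: p(x_t) = sum over x_{t-1} of p(x_t | x_{t-1}) p(x_{t-1})."""
    states = list(prior_prev)
    return {x: sum(transition[xp][x] * prior_prev[xp] for xp in states) for x in states}


def bayes_rule(prior, likelihood):
    """Bayes' rule: p(x_t | z_t) = p(z_t | x_t) p(x_t) / p(z_t)."""
    unnormalized = {x: likelihood[x] * prior[x] for x in prior}
    evidence = sum(unnormalized.values())           # p(z_t)
    return {x: v / evidence for x, v in unnormalized.items()}


p_xt = marginalize(prior_prev, transition)
posterior = bayes_rule(p_xt, likelihood)
print(p_xt, posterior)
```

Marginalization plays the role of the prediction step and Bayes' rule the role of the update step in the recursive filter that follows.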

Recursive Bayesian estimation (I)

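Recursive filter:

– System model: $x_k = f_k(x_{k-1}, w_{k-1}) \leftrightarrow p(x_k \mid x_{k-1})$, where $w_{k-1}$ is the process noise

– Measurement model: $y_k = h_k(x_k, v_k) \leftrightarrow p(y_k \mid x_k)$, where $v_k$ is the measurement noise

– Information available: $D_k = (y_1, \ldots, y_k)$ and the prior $p(x_0)$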

Recursive Bayesian estimation (II)

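Seek $p(x_{k+i} \mid D_k)$:

– i = 0: filtering.

– i > 0: prediction.

– i < 0: smoothing.

Prediction:

$$p(x_k \mid D_{k-1}) = \int p(x_k \mid x_{k-1})\, p(x_{k-1} \mid D_{k-1})\, dx_{k-1}$$

– since:

$$p(x_k \mid D_{k-1}) = \int p(x_k, x_{k-1} \mid D_{k-1})\, dx_{k-1}$$

$$p(x_k, x_{k-1} \mid D_{k-1}) = p(x_k \mid x_{k-1}, D_{k-1})\, p(x_{k-1} \mid D_{k-1}) = p(x_k \mid x_{k-1})\, p(x_{k-1} \mid D_{k-1})$$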

Recursive Bayesian estimation (III)

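Update:

$$p(x_k \mid D_k) = \frac{p(y_k \mid x_k)\, p(x_k \mid D_{k-1})}{p(y_k \mid D_{k-1})}$$

where:

$$p(y_k \mid D_{k-1}) = \int p(y_k \mid x_k)\, p(x_k \mid D_{k-1})\, dx_k$$

– since:

$$p(y_k \mid D_{k-1}) = \int p(y_k, x_k \mid D_{k-1})\, dx_k$$

$$p(y_k, x_k \mid D_{k-1}) = p(y_k \mid x_k, D_{k-1})\, p(x_k \mid D_{k-1}) = p(y_k \mid x_k)\, p(x_k \mid D_{k-1})$$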

Bayes Filters (second pass)

System state dynamics: $x_t = f_t(x_{t-1}, w_t)$

Observation dynamics: $z_t = g_t(x_t, v_t)$

Estimating system state from noisy observations

We are interested in: Belief or posterior density

$$Bel(x_t) = p(x_t \mid z_1, \ldots, z_t)$$

$$p(x_t \mid z_{1:(t-1)}) = \int p(x_t \mid x_{t-1}, z_{1:(t-1)})\, p(x_{t-1} \mid z_{1:(t-1)})\, dx_{t-1}, \quad \text{where } z_{1:(t-1)} = z_1, \ldots, z_{t-1}$$

From above, constructing two steps of Bayes Filters

Predict:

$$p(x_t \mid z_{1:(t-1)}) = \int p(x_t \mid x_{t-1}, z_{1:(t-1)})\, p(x_{t-1} \mid z_{1:(t-1)})\, dx_{t-1}$$

Update:

$$p(x_t \mid z_t, z_{1:(t-1)}) = \frac{p(z_t \mid x_t, z_{1:(t-1)})\, p(x_t \mid z_{1:(t-1)})}{p(z_t \mid z_{1:(t-1)})}$$

In the update, replace $p(z_t \mid x_t, z_{1:(t-1)})$ with $p(z_t \mid x_t)$

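Assumptions: Markov Process

Predict: replace $p(x_t \mid x_{t-1}, z_{1:(t-1)})$ with $p(x_t \mid x_{t-1})$

Update:

$$p(x_t \mid z_t, z_{1:(t-1)}) = \frac{p(z_t \mid x_t, z_{1:(t-1)})\, p(x_t \mid z_{1:(t-1)})}{p(z_t \mid z_{1:(t-1)})}$$

which, with $p(z_t \mid x_t, z_{1:(t-1)})$ replaced by $p(z_t \mid x_t)$, becomes

$$p(x_t \mid z_t, z_{1:(t-1)}) \propto p(z_t \mid x_t)\, p(x_t \mid z_{1:(t-1)})$$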

Bayes Filter

$$p(x_t \mid z_{1:(t-1)}) = \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid z_{1:(t-1)})\, dx_{t-1}$$

How to use it? What else to know?

Motion Model: $p(x_t \mid x_{t-1})$

Perceptual Model: $p(z_t \mid x_t)$

Start from: $p(x_0 \mid z_0) \propto p(z_0 \mid x_0)\, p(x_0)$
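When the state space is a small grid, the predict and update steps of the Bayes filter can be computed exactly. The following is a minimal sketch under assumed models: a cyclic grid of 5 cells, a motion model that moves one cell to the right with some slip, and a sensor that reports the cell index with some error. None of these models or numbers come from the slides.

```python
def predict(belief, motion):
    """Predict: p(x_t | z_{1:t-1}) = sum over x' of p(x_t | x') p(x' | z_{1:t-1}).

    motion[d] is the assumed probability of moving d cells to the right (cyclic grid).
    """
    n = len(belief)
    predicted = [0.0] * n
    for x_prev, b in enumerate(belief):
        for d, p in motion.items():
            predicted[(x_prev + d) % n] += p * b
    return predicted


def update(belief, likelihood):
    """Update: p(x_t | z_{1:t}) is proportional to p(z_t | x_t) p(x_t | z_{1:t-1})."""
    unnormalized = [l * b for l, b in zip(likelihood, belief)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


n_cells = 5
motion = {0: 0.1, 1: 0.8, 2: 0.1}                  # assumed motion model p(x_t | x_{t-1})


def sensor_likelihood(z):
    """Assumed perceptual model p(z | x): the sensor usually reports the true cell."""
    return [0.8 if cell == z else 0.05 for cell in range(n_cells)]


belief = [1.0 / n_cells] * n_cells                 # start from a uniform prior p(x_0)
belief = update(belief, sensor_likelihood(0))      # first measurement z_0
for z in [1, 2]:                                   # later measurements
    belief = predict(belief, motion)
    belief = update(belief, sensor_likelihood(z))
print([round(b, 3) for b in belief])
```

A particle filter replaces the exact sums over grid cells with samples, which is what makes it usable when the state space is continuous or high-dimensional.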

Example 1

Step 0: initialization

$Bel^-(x_0)$ or $p(x_0)$

Step 1: updating

$Bel(x_0)$ or $p(x_0 \mid z_0) \propto p(z_0 \mid x_0)\, p(x_0)$

Example 1 (continue)

Step 2: predicting

$Bel^-(x_1)$ or $p(x_1 \mid z_0) = \int p(x_1 \mid x_0)\, p(x_0 \mid z_0)\, dx_0$

Step 3: updating

$Bel(x_1)$ or $p(x_1 \mid z_1) \propto p(z_1 \mid x_1)\, p(x_1 \mid z_0)$

Step 4: predicting

$Bel^-(x_2)$ or $p(x_2 \mid z_1) = \int p(x_2 \mid x_1)\, p(x_1 \mid z_1)\, dx_1$

Classical approximations

Analytical methods:

– Extended Kalman filter,

– Gaussian sums… (Alspach et al. 1971)

• Perform poorly in numerous cases of interest

Numerical methods:

– point-mass approximations,

– splines. (Bucy 1971, de Figueiredo 1974…)

• Very complex to implement, not flexible.

Perfect Monte Carlo simulation

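Recall that $x_{0:k} = (x_0, \ldots, x_k)$

Random samples $x_{0:k}^i$, $i = 1, \ldots, N$, are drawn from the posterior distribution.

Represent posterior distribution using a set of samples or particles:

$$p(x_{0:k} \mid D_k) \approx \frac{1}{N} \sum_{i=1}^{N} \delta(x_{0:k} - x_{0:k}^i)$$

Easy to approximate expectations of the form:

$$E(g(x_{0:k})) = \int g(x_{0:k})\, p(x_{0:k} \mid D_k)\, dx_{0:k}$$

– by:

$$E(g(x_{0:k})) \approx \frac{1}{N} \sum_{i=1}^{N} g(x_{0:k}^i)$$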

Random samples and the pdf (I)

Take p(x) = Gamma(4,1)

Generate some random samples

Plot histogram and basic approximation to pdf

[Figure: histogram and pdf approximation from 200 samples]

Random samples and the pdf (II)

[Figure: pdf approximations from 500 samples and from 1000 samples]

Random samples and the pdf (III)

[Figure: pdf approximations from 5000 samples and from 200000 samples]
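The experiment on these slides can be reproduced in a few lines. This is a sketch, not the code used to make the original plots: the bin range (0 to 20), the number of bins (40) and the random seed are assumptions chosen for illustration. It compares a normalized histogram of Gamma(4,1) samples with the exact pdf for growing sample sizes.

```python
import math
import random


def gamma_pdf(x, shape=4.0, scale=1.0):
    """Exact Gamma(shape, scale) density, used only as the reference curve."""
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)


def histogram_density(samples, lo, hi, bins):
    """Normalized histogram: count per bin divided by (number of samples * bin width)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in samples:
        if lo <= s < hi:
            counts[int((s - lo) / width)] += 1
    return [c / (len(samples) * width) for c in counts]


rng = random.Random(0)                             # fixed seed, an arbitrary choice
bins, lo, hi = 40, 0.0, 20.0
centers = [lo + (b + 0.5) * (hi - lo) / bins for b in range(bins)]
for n in (200, 5000, 200000):
    samples = [rng.gammavariate(4.0, 1.0) for _ in range(n)]
    density = histogram_density(samples, lo, hi, bins)
    err = max(abs(d - gamma_pdf(c)) for d, c in zip(density, centers))
    sample_mean = sum(samples) / n                 # Monte Carlo estimate of E[x] = 4
    print(n, "max histogram error:", round(err, 4), "sample mean:", round(sample_mean, 3))
```

The maximum gap between histogram and pdf shrinks as the number of samples grows, which is the point the three figure slides make visually.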

Importance Sampling

Unfortunately it is often not possible to sample directly from the posterior distribution, but we can use importance sampling.

Let p(x) be a pdf from which it is difficult to draw samples.

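Let $x^i \sim q(x)$, $i = 1, \ldots, N$, be samples that are easily generated from a proposal pdf $q$, which is called an importance density.

Then approximation to the density p is given by

$$p(x) \approx \sum_{i=1}^{N} w^i\, \delta(x - x^i)$$

where

$$w^i \propto \frac{p(x^i)}{q(x^i)}$$

is the normalized weight of the i-th particle.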
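As a small illustration of importance sampling, the sketch below estimates the mean of a Gamma(4,1) target using samples from an exponential proposal. Both choices are assumptions made for the example (the Gamma target is sampled directly elsewhere in these slides; here we pretend it cannot be).

```python
import math
import random


def target_pdf(x):
    """Target p(x): Gamma(4,1), treated here as hard to sample from."""
    return x ** 3 * math.exp(-x) / 6.0


def proposal_pdf(x, mean=4.0):
    """Proposal q(x): exponential density with the given mean (an assumed choice)."""
    return math.exp(-x / mean) / mean


rng = random.Random(1)
N = 50000
xs = [rng.expovariate(1.0 / 4.0) for _ in range(N)]      # x^i ~ q(x)
raw = [target_pdf(x) / proposal_pdf(x) for x in xs]      # unnormalized w^i = p(x^i) / q(x^i)
total = sum(raw)
weights = [w / total for w in raw]                        # normalized weights sum to 1
mean_estimate = sum(w * x for w, x in zip(weights, xs))   # approximates E_p[x] = 4
print(round(mean_estimate, 3))
```

The proposal has heavier tails than the target, so the ratio p/q stays bounded. A proposal with lighter tails than the target would give a few samples very large weights.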

Bayesian Importance Sampling

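By drawing samples $x_{0:k}^i$ from a known easy to sample proposal distribution $q(x_{0:k} \mid D_k)$ we obtain:

$$p(x_{0:k} \mid D_k) \approx \sum_{i=1}^{N} w_k^i\, \delta(x_{0:k} - x_{0:k}^i)$$

where

$$w_k^i \propto \frac{p(x_{0:k}^i \mid D_k)}{q(x_{0:k}^i \mid D_k)}$$

are normalized weights.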

Sequential Importance Sampling (I)

Factorizing the proposal distribution:

$$q(x_{0:k} \mid D_k) = q(x_0) \prod_{j=1}^{k} q(x_j \mid x_{0:j-1}, D_j)$$

and remembering that the state evolution is modeled as a Markov process

we obtain a recursive estimate of the importance weights:

$$w_k \propto w_{k-1}\, \frac{p(y_k \mid x_k)\, p(x_k \mid x_{k-1})}{q(x_k \mid x_{0:k-1}, D_k)}$$

Factorizing is obtained by recursively applying

$$q(x_{0:k} \mid D_k) = q(x_k \mid x_{0:k-1}, D_k)\, q(x_{0:k-1} \mid D_{k-1})$$

Sequential Importance Sampling (SIS) Particle Filter

SIS Particle Filter Algorithm

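$[\{x_k^i, w_k^i\}_{i=1}^{N}] = \mathrm{SIS}[\{x_{k-1}^i, w_{k-1}^i\}_{i=1}^{N}, z_k]$

for i = 1:N

    Draw a particle $x_k^i \sim q(x_k \mid x_{k-1}^i, z_k)$

    Assign a weight $w_k^i \propto w_{k-1}^i\, \dfrac{p(z_k \mid x_k^i)\, p(x_k^i \mid x_{k-1}^i)}{q(x_k^i \mid x_{k-1}^i, z_k)}$

end

(k is index over time and i is the particle index)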
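One SIS step can be sketched as follows. The model is an assumption chosen for illustration: a 1D random walk $x_k = x_{k-1} + \text{noise}$ with a direct noisy measurement $z_k = x_k + \text{noise}$, and a Gaussian proposal centred on the previous particle with a wider spread than the transition density. The function name `sis_step` and all parameter values are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def sis_step(particles, weights, z, rng, q_std=1.0, r_std=0.5, prop_std=1.5):
    """One SIS step for an assumed 1D random-walk model.

    q_std: process noise std, r_std: measurement noise std,
    prop_std: std of the Gaussian proposal q(x_k | x_{k-1}, z_k), which ignores z_k here.
    """
    new_particles, new_weights = [], []
    for x_prev, w_prev in zip(particles, weights):
        x = rng.gauss(x_prev, prop_std)              # draw x_k^i from the proposal
        likelihood = normal_pdf(z, x, r_std)         # p(z_k | x_k^i)
        transition = normal_pdf(x, x_prev, q_std)    # p(x_k^i | x_{k-1}^i)
        proposal = normal_pdf(x, x_prev, prop_std)   # q(x_k^i | x_{k-1}^i, z_k)
        new_particles.append(x)
        new_weights.append(w_prev * likelihood * transition / proposal)
    total = sum(new_weights)
    return new_particles, [w / total for w in new_weights]   # normalize the weights


rng = random.Random(0)
N = 500
particles = [rng.gauss(0.0, 1.0) for _ in range(N)]           # x_0^i ~ p(x_0) = N(0, 1)
weights = [1.0 / N] * N
particles, weights = sis_step(particles, weights, z=0.3, rng=rng)
estimate = sum(w * x for w, x in zip(weights, particles))     # weighted posterior mean
print(round(estimate, 3))
```

For this linear Gaussian model the exact posterior mean after one step is $0.3 \cdot 2 / 2.25 \approx 0.267$, which gives a reference value for the estimate. Plain normalization can underflow when all likelihoods are tiny; working with log-weights avoids that.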

Derivation of SIS weights (I)

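The main idea is factorizing:

$$p(x_{0:k}) = p(x_0) \prod_{j=1}^{k} p(x_j \mid x_{j-1}) \quad \text{and} \quad p(D_k \mid x_{0:k}) = \prod_{j=1}^{k} p(y_j \mid x_j)$$

$$q(x_{0:k} \mid D_k) = q(x_k \mid x_{0:k-1}, D_k)\, q(x_{0:k-1} \mid D_{k-1})$$

Our goal is to expand p and q in the time index k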

Derivation of SIS weights (II)

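$$
\begin{aligned}
p(x_{0:k} \mid D_k) &= \frac{p(D_k \mid x_{0:k})\, p(x_{0:k})}{p(D_k)} \\
&= \frac{p(z_k \mid D_{k-1}, x_{0:k})\, p(D_{k-1} \mid x_{0:k})\, p(x_k \mid x_{0:k-1})\, p(x_{0:k-1})}{p(z_k \mid D_{k-1})\, p(D_{k-1})} \\
&= \frac{p(z_k \mid D_{k-1}, x_{0:k})\, p(x_k \mid x_{0:k-1})\, p(D_{k-1} \mid x_{0:k-1})\, p(x_{0:k-1})}{p(z_k \mid D_{k-1})\, p(D_{k-1})} \\
&= \frac{p(z_k \mid D_{k-1}, x_{0:k})\, p(x_k \mid x_{0:k-1})\, p(x_{0:k-1} \mid D_{k-1})}{p(z_k \mid D_{k-1})} \\
&\propto p(z_k \mid D_{k-1}, x_{0:k})\, p(x_k \mid x_{0:k-1})\, p(x_{0:k-1} \mid D_{k-1})
\end{aligned}
$$

so each time step multiplies the previous posterior $p(x_{0:k-1} \mid D_{k-1})$ by the factor

$$p(z_k \mid D_{k-1}, x_{0:k})\, p(x_k \mid x_{0:k-1})$$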

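Derivation of SIS weights (III)

$$w_k^i \propto w_{k-1}^i\, \frac{p(z_k \mid D_{k-1}, x_{0:k}^i)\, p(x_k^i \mid x_{0:k-1}^i)}{q(x_k^i \mid x_{0:k-1}^i, D_k)}$$

and under Markov assumptions

$$w_k^i \propto w_{k-1}^i\, \frac{p(z_k \mid x_k^i)\, p(x_k^i \mid x_{k-1}^i)}{q(x_k^i \mid x_{k-1}^i, z_k)}$$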

SIS Particle Filter Foundation

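At each time step k

Random samples $x_k^i$ are drawn from the proposal distribution $q(x_k \mid x_{0:k-1}, D_k)$ for i = 1, …, N

They represent posterior distribution using a set of samples or particles

$$p(x_{0:k} \mid D_k) \approx \sum_{i=1}^{N} w_k^i\, \delta(x_{0:k} - x_{0:k}^i)$$

Since the weights are given by

$$w_k^i \propto \frac{p(x_{0:k}^i \mid D_k)}{q(x_{0:k}^i \mid D_k)}$$

and q factorizes as

$$q(x_{0:k} \mid D_k) = q(x_0) \prod_{j=1}^{k} q(x_j \mid x_{0:j-1}, D_j)$$

the weights can be updated recursively from one time step to the next.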

Sequential Importance Sampling (II)

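Choice of the proposal distribution:

$$q(x_k \mid x_{0:k-1}, D_k)$$

Choose proposal function to minimize variance of $w_k$ (Doucet et al. 1999):

$$q(x_k \mid x_{0:k-1}, D_k) = p(x_k \mid x_{0:k-1}, D_k)$$

Although common choice is the prior distribution:

$$q(x_k \mid x_{0:k-1}, D_k) = p(x_k \mid x_{k-1})$$

We obtain then

$$w_k^i \propto w_{k-1}^i\, \frac{p(z_k \mid x_k^i)\, p(x_k^i \mid x_{k-1}^i)}{q(x_k^i \mid x_{0:k-1}^i, D_k)} = w_{k-1}^i\, p(z_k \mid x_k^i)$$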

Sequential Importance Sampling (III)

Illustration of SIS:

[Figure: importance ratios $p(x_{0:k} \mid D_k) / q(x_{0:k} \mid D_k)$ at Time 1, Time 10 and Time 19]

Degeneracy problems:

– variance of importance ratios increases stochastically over time (Kong et al. 1994; Doucet et al. 1999).

– In most cases, after a few iterations all but one particle will have negligible weight

Sequential Importance Sampling (IV)

Illustration of degeneracy:

[Figure: panels at Time 1, Time 10 and Time 19]
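Degeneracy is easy to reproduce numerically. The sketch below runs SIS without resampling on an assumed 1D random-walk model (process noise std 1, measurement noise std 0.5, 100 particles, prior proposal) and records the largest normalized weight at each time step. The model, the numbers and the seed are illustrative assumptions; log-weights are used only to avoid numerical underflow.

```python
import math
import random

rng = random.Random(2)
N, steps = 100, 19
q_std, r_std = 1.0, 0.5                       # assumed process / measurement noise std
truth = 0.0
particles = [rng.gauss(0.0, 1.0) for _ in range(N)]
log_w = [0.0] * N                             # log of unnormalized weights, all equal at start
max_weight = {}

for k in range(1, steps + 1):
    truth += rng.gauss(0.0, q_std)            # simulate the hidden state
    z = truth + rng.gauss(0.0, r_std)         # simulate the measurement
    for i in range(N):
        particles[i] += rng.gauss(0.0, q_std)                     # sample from the prior proposal
        log_w[i] += -0.5 * ((z - particles[i]) / r_std) ** 2      # w_k = w_{k-1} * p(z_k | x_k), up to a constant
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]    # subtract the max before exponentiating
    total = sum(w)
    w = [v / total for v in w]                # normalized weights
    max_weight[k] = max(w)

for k in (1, 10, 19):
    print("time", k, "largest normalized weight:", round(max_weight[k], 3))
```

Without a resampling step the largest weight drifts toward 1, meaning a single particle ends up carrying almost all of the probability mass.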

SIS - why variance increase

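Suppose we want to sample from the posterior

– choose a proposal density to be very close to the posterior density

• Then

$$E_q\!\left( \frac{p(x_{0:k} \mid D_k)}{q(x_{0:k} \mid D_k)} \right) = 1$$

• and

$$\mathrm{var}_q\!\left( \frac{p(x_{0:k} \mid D_k)}{q(x_{0:k} \mid D_k)} \right) = E_q\!\left[ \left( \frac{p(x_{0:k} \mid D_k)}{q(x_{0:k} \mid D_k)} - 1 \right)^{2} \right] \approx 0$$

So we expect the variance to be close to 0 to obtain reasonable estimates

– thus a variance increase has a harmful effect on accuracy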

Sampling-Importance Resampling

SIS suffers from degeneracy problems so we don’t want to do that!

Introduce a selection (resampling) step to eliminate samples with low importance ratios and multiply samples with high importance ratios.

Resampling maps the weighted random measure $\{x_{0:k}^i, \tilde{w}_k^i\}$ onto the equally weighted random measure $\{x_{0:k}^j, N^{-1}\}$

– by sampling with replacement from $\{x_{0:k}^i;\ i = 1, \ldots, N\}$ with probabilities $\{\tilde{w}_k^i;\ i = 1, \ldots, N\}$

Scheme generates $N_i$ children of particle $i$ such that $\sum_{i=1}^{N} N_i = N$ and satisfies:

$$E(N_i) = N \tilde{w}_k^i$$

$$\mathrm{var}(N_i) = N \tilde{w}_k^i (1 - \tilde{w}_k^i)$$
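Sampling with replacement according to the normalized weights (multinomial resampling) takes one call to the standard library. The sketch below uses four labelled particles with made-up weights and checks empirically that the average number of children of particle $i$ is close to $N \tilde{w}^i$. The function name `multinomial_resample` is hypothetical.

```python
import random
from collections import Counter


def multinomial_resample(particles, weights, rng):
    """Draw len(particles) indices with replacement, with probabilities given by weights."""
    n = len(particles)
    indices = rng.choices(range(n), weights=weights, k=n)
    return [particles[i] for i in indices], indices


rng = random.Random(3)
particles = ["a", "b", "c", "d"]               # illustrative particle labels
weights = [0.1, 0.2, 0.3, 0.4]                 # illustrative normalized weights
trials = 20000
child_counts = Counter()
for _ in range(trials):
    resampled, indices = multinomial_resample(particles, weights, rng)
    child_counts.update(indices)

mean_children = [child_counts[i] / trials for i in range(len(particles))]
print([round(m, 2) for m in mean_children])    # should be close to N * w_i
```

After resampling every particle carries weight $1/N$, so the information previously held in the weights is now held in the number of copies of each particle.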

Basic SIR Particle Filter - Schematic

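[Figure: flow chart of the basic SIR particle filter. Initialisation, then a loop of the importance sampling step (fed by each new measurement) producing $\{\tilde{x}_{0:k}^i, \tilde{w}_k(\tilde{x}_{0:k}^i)\}$, and the resampling step producing $\{x_{0:k}^i, N^{-1}\}$; the estimate $\hat{x}_{0:k}$ is extracted at each pass]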

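Basic SIR Particle Filter algorithm (I)

Initialisation

– For $i = 1, \ldots, N$ sample $x_0^i \sim p(x_0)$

– and set $k = 1$

Importance Sampling step

– For $i = 1, \ldots, N$ sample $\tilde{x}_k^i \sim q(x_k \mid x_{k-1}^i)$ and set $\tilde{x}_{0:k}^i = (x_{0:k-1}^i, \tilde{x}_k^i)$

– For $i = 1, \ldots, N$ compute the importance weights $w_k^i$

– Normalise the importance weights, $\tilde{w}_k^i = w_k^i \big/ \sum_{j=1}^{N} w_k^j$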

Basic SIR Particle Filter algorithm (II)

Resampling step

– Resample with replacement $N$ particles: $(x_{0:k}^i;\ i = 1, \ldots, N)$

– from the set: $(\tilde{x}_{0:k}^i;\ i = 1, \ldots, N)$

– according to the normalised importance weights, $\tilde{w}_k^i$

– proceed to the Importance Sampling step, as the next measurement arrives.
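The two steps can be put together into a small working filter. This is a minimal sketch, assuming a 1D random-walk state with a direct noisy measurement and the prior $p(x_k \mid x_{k-1})$ as proposal, so that the importance weight reduces to the likelihood. The function name `sir_step`, the noise levels and the seed are hypothetical choices, and only the current state (not the full path $x_{0:k}$) is stored.

```python
import math
import random


def sir_step(particles, z, rng, q_std=1.0, r_std=0.5):
    """One pass of importance sampling followed by resampling (assumed 1D model)."""
    # Importance sampling step: propose from the prior p(x_k | x_{k-1}).
    proposed = [x + rng.gauss(0.0, q_std) for x in particles]
    # With the prior as proposal the weight is the likelihood p(z_k | x_k), up to a constant.
    log_w = [-0.5 * ((z - x) / r_std) ** 2 for x in proposed]
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    w = [v / total for v in w]                                 # normalised weights
    estimate = sum(wi * xi for wi, xi in zip(w, proposed))     # weighted mean estimate
    # Resampling step: draw N particles with replacement according to the weights.
    resampled = rng.choices(proposed, weights=w, k=len(proposed))
    return resampled, estimate


rng = random.Random(4)
N = 500
particles = [rng.gauss(0.0, 1.0) for _ in range(N)]            # x_0^i ~ p(x_0)
truth, squared_errors = 0.0, []
for k in range(1, 51):
    truth += rng.gauss(0.0, 1.0)                               # simulated hidden state
    z = truth + rng.gauss(0.0, 0.5)                            # simulated measurement
    particles, estimate = sir_step(particles, z, rng)
    squared_errors.append((estimate - truth) ** 2)
rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
print("RMSE over 50 steps:", round(rmse, 3))
```

The estimate is extracted before resampling, since resampling adds Monte Carlo noise without adding information.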

Resampling

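[Figure: resampling maps the weighted particle set $\{x_k^{(m)}, w_k^{(m)}\}$ onto the equally weighted set $\{\tilde{x}_k^{(m)}, 1/M\}$]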

Generic SIR Particle Filter algorithm

M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A tutorial on particle filters …,” IEEE Trans. on Signal Processing, 50(2), 2002.

Improvements to SIR (I)

Variety of resampling schemes with varying performance in terms of the variance of the number of children per particle, $\mathrm{var}(N_i)$:

– Residual sampling (Liu & Chen, 1998).

– Systematic sampling (Carpenter et al., 1999).

– Mixture of SIS and SIR, only resample when necessary (Liu & Chen, 1995; Doucet et al., 1999).

Degeneracy may still be a problem:

– During resampling a sample with high importance weight may be duplicated many times.

– Samples may eventually collapse to a single point.
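Two of these ideas can be sketched together: systematic resampling, and resampling only when necessary. The slide does not say how "necessary" is decided; the sketch assumes the common effective-sample-size criterion $N_{\text{eff}} \approx 1 / \sum_i (\tilde{w}^i)^2$ with a threshold of half the particle count. The function names and the threshold are hypothetical choices.

```python
import random


def effective_sample_size(weights):
    """N_eff estimate 1 / sum(w_i^2) for normalized weights."""
    return 1.0 / sum(w * w for w in weights)


def systematic_resample(weights, rng):
    """Return N indices using one uniform draw and N evenly spaced pointers."""
    n = len(weights)
    u0 = rng.random() / n                      # single random offset in [0, 1/N)
    indices, i, cumulative = [], 0, weights[0]
    for j in range(n):
        u = u0 + j / n                         # j-th pointer
        while u > cumulative and i < n - 1:    # advance to the bin containing the pointer
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices


def resample_if_needed(particles, weights, rng, threshold_ratio=0.5):
    """Resample only when N_eff falls below threshold_ratio * N (an assumed rule)."""
    n = len(particles)
    if effective_sample_size(weights) >= threshold_ratio * n:
        return particles, weights              # weights still balanced: keep them
    indices = systematic_resample(weights, rng)
    return [particles[i] for i in indices], [1.0 / n] * n


rng = random.Random(7)
particles = [10.0, 20.0, 30.0, 40.0]           # illustrative particle values
skewed = [0.7, 0.1, 0.1, 0.1]                  # N_eff is about 1.92, below 2
indices = systematic_resample(skewed, rng)
new_particles, new_weights = resample_if_needed(particles, skewed, rng)
kept_particles, kept_weights = resample_if_needed(particles, [0.25] * 4, rng)
print(indices, new_particles, new_weights)
```

With systematic resampling the number of children of particle $i$ is always the floor or the ceiling of $N \tilde{w}^i$, so $\mathrm{var}(N_i)$ is smaller than under multinomial resampling.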

Improvements to SIR (II)

To alleviate numerical degeneracy problems, sample smoothing methods may be adopted.

– Roughening (Gordon et al., 1993).

• Adds an independent jitter to the resampled particles

– Prior boosting (Gordon et al., 1993).

• Increase the number of samples from the proposal distribution to M>N,

• but in the resampling stage only draw N particles.
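Roughening can be sketched in one function. The jitter scale used below, $\sigma = K \cdot E \cdot N^{-1/d}$ with $E$ the spread of the particles, $d$ the state dimension and $K$ a tuning constant, is the rule usually attributed to Gordon et al. (1993); treat the exact form and the value $K = 0.2$ as assumptions to be checked against that paper. The function name `roughen` is hypothetical.

```python
import random


def roughen(particles, rng, k_tune=0.2, dim=1):
    """Add independent Gaussian jitter to resampled 1D particles.

    sigma = k_tune * (max - min of the particles) * N ** (-1 / dim); assumed rule.
    """
    n = len(particles)
    spread = max(particles) - min(particles)
    sigma = k_tune * spread * n ** (-1.0 / dim)
    return [x + rng.gauss(0.0, sigma) for x in particles]


rng = random.Random(5)
resampled = [1.0] * 6 + [2.0] * 3 + [3.5]      # many duplicates after resampling
jittered = roughen(resampled, rng)
print(len(set(resampled)), "distinct values before,", len(set(jittered)), "after")
```

The jitter restores diversity among duplicated particles at the cost of slightly widening the represented distribution.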

Improvements to SIR (III)

Local Monte Carlo methods for alleviating degeneracy:

– Local linearisation - using an EKF (Doucet, 1999; Pitt & Shephard, 1999) or UKF (Doucet et al, 2000) to estimate the importance distribution.

– Rejection methods (Müller, 1991; Doucet, 1999; Pitt & Shephard, 1999).

– Auxiliary particle filters (Pitt & Shephard, 1999)

– Kernel smoothing (Gordon, 1994; Hürzeler & Künsch, 1998; Liu & West, 2000; Musso et al., 2000).

– MCMC methods (Müller, 1992; Gordon & Whitby, 1995; Berzuini et al., 1997; Gilks & Berzuini, 1998; Andrieu et al., 1999).

Improvements to SIR (IV)

Illustration of SIR with sample smoothing:

[Figure: panels at Time 1, Time 10 and Time 19]

Ingredients for SMC

Importance sampling function

– Gordon et al: $p(x_k \mid x_{k-1}^i)$

– Optimal: $p(x_k \mid x_{0:k-1}^i, D_k)$

– UKF: pdf from UKF at $x_{k-1}^i$

Redistribution scheme

– Gordon et al: SIR

– Liu & Chen: Residual

– Carpenter et al: Systematic

– Liu & Chen, Doucet et al: Resample when necessary

Careful initialisation procedure (for efficiency)

Particle filters

Also known as Sequential Monte Carlo Methods

Representing belief by sets of samples or particles

$$Bel(x_t) \approx S_t = \{ \langle x_t^i, w_t^i \rangle \mid i = 1, \ldots, n \}$$

$w_t^i$ are nonnegative weights called importance factors

Updating procedure is sequential importance sampling with re-sampling

Example 2: Particle Filter

Step 0: initialization

Each particle has the same weight

Step 1: updating weights. Weights are proportional to p(z|x)

Example 2: Particle Filter (continued)

Step 2: predicting.

Predict the new locations of particles.

Step 3: updating weights. Weights are proportional to p(z|x)

Step 4: predicting.

Predict the new locations of particles.

Particles are more concentrated in the region where the person is more likely to be

Compare Particle Filter with Bayes Filter with Known Distribution

[Figure: Example 1 (Bayes filter with known distribution) and Example 2 (particle filter) shown side by side for the predicting and updating steps]

Particle Filters

Sensor Information: Importance Sampling

$$Bel(x) \leftarrow \alpha\, p(z \mid x)\, Bel^-(x)$$

$$w \leftarrow \frac{\alpha\, p(z \mid x)\, Bel^-(x)}{Bel^-(x)} = \alpha\, p(z \mid x)$$

Robot Motion

$$Bel^-(x) \leftarrow \int p(x \mid u, x')\, Bel(x')\, dx'$$

Tracking in 1D: the blue trajectory is the target. The best of 10 particles is in red.

Matlab code: truex is a vector of 100 positions to be tracked.
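The Matlab code itself is not reproduced in this text. As a stand-in, here is a Python sketch of the same kind of experiment, not a port of the original: `true_x` plays the role of `truex` (100 positions), 10 particles are used, and at each step the particle with the largest weight is recorded as the "best" one. The random-walk target, the noise levels and the seed are all assumptions.

```python
import math
import random

rng = random.Random(6)
T, N = 100, 10                                 # 100 positions, 10 particles
q_std, r_std = 1.0, 1.0                        # assumed motion / measurement noise std

# Simulated target trajectory (stand-in for the vector truex).
true_x = [0.0]
for _ in range(T - 1):
    true_x.append(true_x[-1] + rng.gauss(0.0, q_std))

particles = [true_x[0] + rng.gauss(0.0, 1.0) for _ in range(N)]
best_track = []
for t in range(T):
    z = true_x[t] + rng.gauss(0.0, r_std)      # noisy observation of the target
    if t > 0:
        particles = [x + rng.gauss(0.0, q_std) for x in particles]   # predict
    log_w = [-0.5 * ((z - x) / r_std) ** 2 for x in particles]       # weight by p(z | x)
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    best_track.append(particles[w.index(max(w))])                     # best particle at time t
    particles = rng.choices(particles, weights=w, k=N)                # resample

mean_abs_err = sum(abs(b - x) for b, x in zip(best_track, true_x)) / T
print("mean absolute error of the best particle:", round(mean_abs_err, 3))
```

Plotting `true_x` and `best_track` against time gives the two curves described above.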

Application Examples

Robot localization

Robot mapping

Visual Tracking

– e.g. human motion (body parts)

Prediction of (financial) time series

– e.g. mapping gold price to stock price

Target recognition from single or multiple images

Guidance of missiles

Contour grouping

Nice video demos: http://www.cs.washington.edu/ai/Mobile_Robotics/mcl/

2nd Book Advert

Statistical Pattern Recognition, Andrew Webb, DERA. ISBN 0340741643, Paperback: 1999: £29.99, Butterworth Heinemann

Contents:

– Introduction to SPR, Estimation, Density estimation, Linear discriminant analysis, Nonlinear discriminant analysis - neural networks, Nonlinear discriminant analysis - statistical methods, Classification trees, Feature selection and extraction, Clustering, Additional topics, Measures of dissimilarity, Parameter estimation, Linear algebra, Data, Probability theory.

Homework

Implement all three particle filter algorithms

– SIS Particle Filter Algorithm (p. 27)

– Basic SIR Particle Filter algorithm (p. 39, 40)

– Generic SIR Particle Filter algorithm (p. 42)

and evaluate their performance on a problem of your choice.

Groups of two are allowed. Submit a report and ready-to-run Matlab code (with a script and the data). Present a report to the class.
