introduction to particle filters for data assimilationoksana/docs/statmos-ssda/dowd... ·...

28
Mike Dowd Dept of Mathematics & Statistics (and Dept of Oceanography) Dalhousie University, Halifax, Canada Introduction to Particle Filters for Data Assimilation STATMOS Summer School in Data Assimila5on, Boulder, Colorado, May 1825 2015

Upload: lamkhuong

Post on 30-Jan-2018

216 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Mike  Dowd  Dept of Mathematics & Statistics (and Dept of Oceanography)

Dalhousie University, Halifax, Canada    

Introduction to Particle Filters for Data Assimilation

STATMOS  Summer  School  in  Data  Assimila5on,  Boulder,  Colorado,  May  18-­‐25  2015      

Page 2: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Problem Statement    

•  Noisy  'me  series  observa'ons  on  the  system  state    

•  a  discrete  'me  stochas'c  dynamic  model  describing  the  'me  evolu'on  of  the  system  state  

         

y1:T = (y1,..., yT )

xt = θxt−1 + n t

•  We  want  to  es'mate  the  system  state  (and  some'mes  parameters)  

e.g.  Bivariate  AR(1)  process  

Page 3: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Noisy Observations of System State

0 50 100 150 200

-10

-50

510

time

state

Observations

Page 4: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Stochastic Dynamics governing System State

0 50 100 150 200

-10

-50

510

10 Realizations of System State

time

state

Page 5: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Estimation of System State by Particle Filters

0 50 100 150 200

-10

-50

510

time

state

Particle Filter Estimate of the State

Page 6: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Target: Evolution of the Probability Distribution Of the State through Time

state  

'me  

Page 7: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

�Hierarchical  Bayesian  Model    

p(x1:T ,θ | y1:T )∝ p(y1:T | x1:T ,θ ) ⋅ p(x1:T |θ ) ⋅ p(θ )

A General Framework for DA

Special  Cases  in  DA  

•  Parameter  es'ma'on  (no  model  error)  

•  Filtering  solu'on  for  state  *  

•  Smoothing  solu'on  for  state  

•  Joint  state  &  parameter  es'ma'on    

•  Inference  on  dynamic  model  structure  

target  (posterior)   measurement   dynamics   prior  

Page 8: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

•  The  general  goal  is  to  make  inferences  about  the  state  xt  and  some'mes  the  sta'c  parameters*  θ and  ϕ**.

•  We  use  explicit  priors  on  process  noise  &  observa'on  errors  (i.e.  parametric  distribu'ons  for  et  and  vt).  Assume  yt  condi'onally  independent  given  xt,  and  that  et  and vt are  independent.  

 •  For  DA,  we  want  to  do  space-­‐'me  problems:  space  is  incorporated  

in  xt

è Solu'ons  involve  sequen'al  Monte  Carlo  based  for  nonlinear  and  nonGaussian  state  space  models    

xt = d(xt−1,θ ,et )yt = h(xt ,φ,ν t )

� State  Space  Model:  xt ~ p(xt ,θ | xt−1)yt | xt ~ p(yt ,φ | xt )

or  

Data Assimilation: A Statistical Framework

*  Dynamic  parameters  are  part  of  the  state  

t =1,…,T

**  drop  φ  hereaTer,  and  defer  θ  

Page 9: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Problem Classes We  want  to  es'mate  the  system  state  at  'me  t,  or      given  the  observa'ons  up  to  'me  s,  or    

xty1:S = (y1, y2,.., yS )

data available

no data available

estimation time

A) Filtering (Nowcast)

Time

B) Forecasting

C) Smoothing (Hindcast)

s=t  èFiltering  

s>t  èPredic'on  

s<t  èSmoothing  

s  

s  

s  

Page 10: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

time

measurement

nowcast forecast

time = t-1 time = t

prediction

nowcast

p(xt−1,θ | y1:t−1)

xt = d(xt−1,θ ,et ) yt

p(xt ,θ | y1:t−1) p(xt ,θ | y1:t )

-  We’ll drop parameters θ … for now

Single stage transition of system from time t-1 to time t Sequential Estimation

Page 11: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Probabilistic Solution for Filtering

p(xt | y1:t ) ∝ p( yt | xt ) p(xt | xt−1) ⋅ p(xt−1 | y1:t−1)dxt−1∫

è General Filtering Solution:

do for t=1, …, T, given p(x0 ) and

A single stage transition of the system for time t-1 to t involves:

Prediction:

Measurement update:

p(xt | y1:t−1) = p(xt | xt−1) ⋅ p(xt−1 | y1:t−1) dxt−1∫

p(xt | y1:t ) =p(yt | xt ) ⋅ p(xt | y1:t−1)

p(y1:t )

y1:t = (y1, y2,.., yT )

Page 12: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Sampling Based Estimation

Let    {xt|t(i ),wt

(i )}, i = 1, ..., n be  a  random  sample*  from      

drawn  from  the  target  distribu'on  (the  filter  density)    

p(xt | y1:t )

where:   xt|t(i ) are  the  support  points  

wt(i )

are  the  normalized  weights  

The  filter  density  can  then  be  approximated  as  

p(xt | y1:t ) ≈ wt(i ) δ (xt|t − xt|t

(i ) )i=1

n

*referred  to  an  ensemble,  made  up  of  a  collec'on  of  par'cles  (which  are  sample  members)  

Page 13: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Sampling Based Estimation

−2 0 20

0.2

0.4

(a) unweighted ensemble, n=25

x

pro

babi

lity

−2 0 20

0.2

0.4

(b) weighted ensemble, n=25

x

pro

babi

lity

−2 0 20

0.2

0.4

(c) unweighted ensemble, n=250

x

pro

babi

lity

−2 0 20

0.2

0.4

(d) weighted ensemble, n=250

x

pro

babi

lity

Page 14: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Sequential Importance Sampling (SIS)

At time t-1 we have:

i {xt−1|t−1(i ) ,wt−1

(i ) }, draws from p(xt−1 | y1:t−1)i yt , an observation

If a candidate xt|t(i ) is drawn from a proposal q(xt|t )

then wt(i ) ∝ p(xt|t

(i ) | yt )q(xt|t

(i ) | yt )

or wt(i ) ∝wt−1

(i ) p(yt | xt|t(i ) ) ⋅ p(xt|t

(i ) | xt−1|t−1(i ) )

q(xt|t(i ) )

A draw from p(xt | y1:t ) is then available as {xt|t(i ),wt

(i )}see  Arulampalam  et  al.  2002    

Page 15: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

SIS Remarks

•  A  common  simplifica'on  is  to  let  the  proposal  be  the  the  prior  (the  predic've  density)                                                                    leading  to:  q(xt|t ) = p(xt | xt−1)

wt(i ) ∝wt−1

(i ) p(yt | xt−1(i ) )

•  The  main  prac'cal  problem  with  SIS  is  ‘weight  collapse”  wherein  aTer  a  few  'me  steps,  all  the  weights  become  concentrated  on  a  few  par'cles.  How  to  fix?  

•  SIS  solves  the  filtering  problems  for  the  nonlinear,  non-­‐Gaussian  state  space  model  via  an  algorithm  for  sequen'al  “online”  es'ma'on  of  the  system  state  relying  on  a  proposal  and  weight  update.      

Page 16: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Sequential Importance Resampling (SIR)

•  The major breakthrough in sequential Monte Carlo methods (Gordon et al. 1993) was to incorporate a resampling step to obviate weight collapse.

i Specifically, for each time t, {xt|t−1(i ) ,wt

(i )} is a sample from p(xt | yt−1)

i We can resample (with replacement) the xt|t(i ) with a probability

proportional to wt(i ) (i.e. a weighted bootstrap)

i This yields a modified sample {xt|t(i )}, wherein particles were both killed

and replicated, and the weights are now equal, i.e. wt(i ) = 1∀ i.

Remaining Issue: sample impoverishment - some particles replicated often in the unweighted sample . Less severe than weight collapse (but of course related).

{xt|t(i )}

Page 17: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Basic Particle Filter Algorithm Given:  (1)  dynamical  model,  (2)  observa'ons,  (3)  ini'al  condi'ons    

for  t  =  1  to  T  

(a) Prediction: {xt−1|t−1(i ) }→ {xt|t−1

(i ) } as xt|t−1

(i ) = d(xt−1|t−1(i ) ,et

(i ) ) for i = 1,...,n

(b) Measurement: {xt|t−1(i ) }→ {xt|t

(i )} as • wt

(i ) ∝ p(yt | xt|t−1(i ) ) for i = 1,...,n

• resample with replacement {xt|t−1(i ) } using weights wt

(i ) → yields {xt|t

(i )}

end  (for  t)  

Page 18: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Particle Filter Schematic

Page 19: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Interpretation: Particle History

     'me              

     state                      

Page 20: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

R-code for a Basic Particle Filter

!for (t in 2:T)!{!!xf=t(apply(xa,1,dynamicmodel)) !!yobs=matrix(y[t,],1,ncol(y))! !w=apply(matrix(xf[,1],np,1),1,dmvnorm,x=yobs,sigma=R) !!m=sample(1:np,size=np,prob=w,replace=T);!xa=xf[m,] !!}!

predic'on  step  from  t-­‐1  to  t  

extract  measurement  at  'me  t  

assign  weights    

weighted  resampling        

Page 21: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Element 1. Prediction Step xt−1|t−1

(i ){ }draw from p(xt−1 | y1:t−1)

xt|t−1(i ){ }

draw from p(xt | y1:t−1)filter density at time t-1

predictive density at time t

Remarks:    -­‐  computa'onally  expensive  to  generate  large  samples  for  

big  GFD  models    -­‐  hard  to  effec'vely  populate  a  large  dimension  state  space  

with  par'cles    

Role: acts as proposal distribution for particle filter

xt|t−1(i ) = d(xt−1|t−1

(i ) ,et(i ) ) for i = 1,...,n

Page 22: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Element 2. Measurement Update   xt|t−1

(i ){ }draw from p(xt | y1:t−1)

xt|t(i ){ }

draw from p(xt | y1:t )predictive density

at time t-1 filter density

at time t

uses p(xt|t | y1:t )∝ p(yt | xt|t−1) ⋅ p(xt|t−1)(Bayes’  Formula)  

i PF/SIR: weighted resample of xt|t−1(i ) ,wt

(i ){ }→ xt|t(i ){ }

�          Metropolis Hastings MCMC

� Ensemble Kalman filter (approximate)

� plus many others (auxiliary, unscented, ….) that modify proposal distributions

yt

Approaches:  

Page 23: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Sequential Metropolis-Hastings  Construct  (independence)  chain  to  yield  sample  from  target  

filter  distribu5on      .    

xt|t(i ){ }, a sample from p(xt | y1:t )

•  flexible,  configurable  and  efficient  (+  a  parallel  with  SIR)    •  can  alleviate  sample  impoverishment  (resample-­‐move)  

1. Generate candidate from predictive density: draw xt|t−1

* from predictive density p(xt | y1:t−1)

α = min 1, p(yt | xt|t−1* )

p(yt | xt|t(k ) )

⎛⎝⎜

⎞⎠⎟

2. Evaluate acceptance probability

for  k  =  1  to  K  

3. Let xt|t* enter the posterior sample {xt|t

(i )} with probability α ,otherwise retain xt|t

(k )

end  (for  k)   yields  

Page 24: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Limitations

Theory:  convergence  to  target  marginal  distribu'on                        holds  under  weak  assump'ons  (e.g.  Kunsch  2005)    Prac7cal  Problem:  The  standard  par'cle  filtering  (SIR)  when  used  in  DA  suffers  from  sample  impoverishment  (recall  path  degeneracy)      Typically,  the  predic'on  ensemble  far  away  from  the  observed  state  (due  to  small  ensemble  size,  and  high  dimensional  state  space).      è likelihood  is  poorly  represented  and  es'ma'on  compromised.    

 

What  modifica5ons  are  made  to  improve  par5cle  filters  ?  

 

p(xt | y1:t )

Page 25: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Add  “Ji;er”:  overdisperse  the  sample  of  predic've  density  (inflate  process  error)    Make  a  Gaussian  approxima5on,  or  transforma5on/anamorphosis:  Can  Use  Kalman  filter  upda'ng  equa'on  

Approximate  the  Likelihood  func5on:  change  its  func'onal  form,  or  inflate  or  alter  measurement  error.  

Error  Subspace:  confine  stochas'city  in  parameters  only.  Dimension  reduc'on.  

Use  Fixed  lag  smoother,  Batch  processing  incorporate  observa'ons  from  mul'ple  'mes  into  observa'on  update.      Clever  Proposal  Distribu5ons  and  look-­‐ahead  filters:  move  beyond  using  “prior”  (predic've  density)  as  proposal.    

Some Fixes

Page 26: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

What about Parameter Estimation?

xt=xtθt

⎝⎜⎜

⎠⎟⎟

1. State Augmentation: append parameters to the state

Can use particle filter machinery to solve (Kitagawa 1998). Also, see multiple iterative filtering (Ionides 2006)

2. Likelihood Methods: Use sample based likelihoods

[y1:T |θ ]= L(θ | y1:T ) = [yt | xt ,θ ][xt | y1:t−1,θ ]dxt∫t=1

T

≈ 1nt=1

T

∏ p(yt | xt|t−1(i )

i=1

n

∑ )

Page 27: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Summary

1.  State  space  model  is  sta's'cal  framework  for  considering  dynamic  models  and  measurements  

2.  Importance  sampling  provides  theore'cal  basis  for  par'cle  filters  

3.  Sequen'al  importance  resampling  (SIR)  yields  a  straighforward  and  workable  algorithm  for  the  filtering  problem    

4.  For  the  realis'c  problems  in  data  assimila'on,  par'cle  filters  must  be  modified  either  via  approxima'ons,  or  clever  use  of  proposal  distribu'ons.      

 

Page 28: Introduction to Particle Filters for Data Assimilationoksana/docs/STATMOS-SSDA/Dowd... · Introduction to Particle Filters for Data Assimilation ... filter density ... Can use particle

Some Key References Jazwinski  AH.  1970.  Stochas'c  Processes  and  Filtering  Theory.  Academic  Press:  New  York,  376pp.    Gordon  NJ,  Salmond  DJ,    Smith  AFM.  1993.  Novel  approach  to  nonlinear/non-­‐Gaussian  Bayesian  state  es'ma'on.  IEE  Proceedings-­‐F  140(2):107-­‐113.    Kitagawa  G.  1996.  Monte  Carlo  filter  and  smoother  for  non-­‐Gaussian  nonlinear  state  space  models.    Journal  of  Computa'onal  and  Graphical  Sta's'cs  5:  1-­‐25.    Kitagawa  G.  1998.  A  self-­‐organizing  state  space  model.  Journal  of  the  American  Sta's'cal  Associa'on.  93(443):1203-­‐1215.    Arulampalam  MS,  Maskell  S,  Gordon  N,  Clapp  T,  2002.  A  tutorial  on  par'cle  filters  for  online  nonlinear  non-­‐Gaussian  Bayesian  tracking.  IEEE  Transac'ons  on  Signal  Processing  50(2):174-­‐188.    Kunsch  HR.  2005.  Recursive  Monte  Carlo  filters:  algorithms  and  theore'cal  analysis.  The  Annals  of  Sta's'cs.  33(5):1983-­‐2021.    Godsill  SJ,  Doucet  A,  West  M.  2004.  Monte  Carlo  Smoothing  for  Nonlinear  Time  Series.  Journal  of  the  American  Sta's'cal  Associa'on.  99(465):156-­‐168.    Ionides  EL,  Breto  C,  King  AA.  2006.  Inference  for  nonlinear  dynamical  systems.  Proceedings  of  the  Na'onal  Academy  of  Sciences  103:  18438-­‐18443.