Data Assimilation Systems: Focus on EnKF diagnostics · 2016. 1. 25.

  • 1

    Data Assimilation Systems: Focus on EnKF diagnostics

    Former students (Shu-Chih Yang, Takemasa Miyoshi, Hong Li, Junjie Liu, Chris Danforth, Ji-Sun Kang, Matt Hoffman) and Eugenia Kalnay, UMCP

    Acknowledgements: UMD Chaos-Weather Group: Brian Hunt, Istvan Szunyogh, Ed Ott, Jim Yorke, Kayo Ide, and students. Also: Malaquías Peña, Matteo Corazza, Pablo Grunman, DJ Patil, Debra Baker, Steve Greybush, Tamara Singleton, Steve Penny, Elana Fertig.

  • 2

    Ensemble Kalman Filter: status and new ideas

    • EnKF and 4D-Var are in a friendly competition:
    • Jeff Whitaker's results: EnKF better than GSI (3D-Var)
    • Canada (Buehner): 4D-Var and EnKF are the same in the NH and EnKF is better in the SH. Hybrid best.
    • JMA (Miyoshi): at JMA, EnKF is faster than 4D-Var, better in the tropics and NH, worse in the SH due to model bias.
    • EnKF needs no adjoint model or priors; it adapts to changes in obs, and it can even estimate obs errors.
    • We "plagiarize" ideas and methods developed for 4D-Var and adapt them to the LETKF (Hunt et al., 2007)

  • 3

    Whitaker: Comparison of a T190, 64-member EnKF with the T382 operational GSI, same observations (JCSDA, 2009)

  • 4

    There are several types of EnKF

    1. Perturbed obs (e.g., Houtekamer and Mitchell)
    2. Square root filters (e.g., Whitaker and Hamill)

    Most filters get their speed from assimilating one observation at a time.

    The LETKF (Hunt 2005) assimilates all obs simultaneously and gets its speed from local processing of each grid point.

    Because it is a Transform Square root filter, the LETKF analysis ensemble is explicitly expressed as a linear combination of the forecast ensemble.

    This has a number of nice properties, so here we will focus on the LETKF.

  • 5

    Diagnostic tools that improve LETKF/EnKF

    We adapted ideas that were inspired by 4D-Var:
    – No-cost smoother (Kalnay et al., Tellus 2007)
    – "Outer loop", nonlinearities and long windows (Yang and Kalnay)
    – Accelerating the spin-up (Kalnay and Yang, 2008)
    – Forecast sensitivity to observations (Liu and Kalnay, QJ, 2008)
    – Analysis sensitivity to observations and cross-validation (Liu et al., QJ, 2009)
    – Coarse analysis resolution without degradation (Yang et al., QJ, 2009)
    – Low-dimensional model bias correction (Li et al., MWR, 2009)
    – Simultaneous estimation of optimal inflation and observation errors (Li et al., QJ, 2009)

  • 6

    Local Ensemble Transform Kalman Filter (Ott et al., 2004; Hunt et al., 2004, 2007)

    • Model independent (black box)
    • Obs. assimilated simultaneously at each grid point
    • 100% parallel: fast
    • No adjoint needed
    • 4D LETKF extension

    [Schematic: start with an initial ensemble; the Model produces ensemble forecasts; the Observation operator maps them to ensemble "observations"; these plus the Observations feed the LETKF, which returns ensemble analyses to the Model]

  • 7

    Perform data assimilation in a local volume, choosing observations

    The state estimate is updated at the central grid point (red dot)

    Localization based on observations

  • 8

    Perform data assimilation in a local volume, choosing observations

    The state estimate is updated at the central grid point (red dot)

    All observations (purple diamonds) within the local region are assimilated

    Localization based on observations

    The LETKF algorithm can be described in a single slide!

  • 9

    Local Ensemble Transform Kalman Filter (LETKF)

    Forecast step: $x_{n,k}^b = M_n(x_{n-1,k}^a)$

    Analysis step: construct
    $X^b = [x_1^b - \bar{x}^b \mid \ldots \mid x_K^b - \bar{x}^b]$;  $y_i^b = H(x_i^b)$;  $Y_n^b = [y_1^b - \bar{y}^b \mid \ldots \mid y_K^b - \bar{y}^b]$

    Locally: choose for each grid point the observations to be used, and compute the local analysis error covariance and perturbations in ensemble space:
    $\tilde{P}^a = [(K-1)I + Y^{bT} R^{-1} Y^b]^{-1}$;  $W^a = [(K-1)\tilde{P}^a]^{1/2}$

    Analysis mean in ensemble space: $\bar{w}^a = \tilde{P}^a Y^{bT} R^{-1}(y^o - \bar{y}^b)$, and add $\bar{w}^a$ to $W^a$ to get the analysis ensemble in ensemble space.

    Globally: the new ensemble analyses in model space are the columns of $X_n^a = X_n^b W^a + \bar{x}^b$. Gathering the grid point analyses forms the new global analyses. Note that the outputs of the LETKF are the analysis weights $\bar{w}^a$ and the perturbation weight matrices $W^a$. These weights multiply the ensemble forecasts.
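The analysis step above maps directly onto a few lines of linear algebra. The sketch below is a minimal single-region implementation in NumPy under simplifying assumptions (a linear observation operator supplied as a matrix `H`, explicit inversion of `R`, no localization weighting); the function names are illustrative, not taken from any LETKF codebase.

```python
import numpy as np

def sym_sqrt(A):
    """Symmetric matrix square root via eigendecomposition."""
    vals, vecs = np.linalg.eigh(A)
    return vecs @ np.diag(np.sqrt(np.maximum(vals, 0.0))) @ vecs.T

def letkf_analysis(Xb, yo, H, R):
    """One LETKF analysis step for a single local region (sketch).

    Xb : (n, K) forecast ensemble, one member per column
    yo : (p,)   observations in the local region
    H  : (p, n) linear observation operator
    R  : (p, p) observation-error covariance
    Returns the (n, K) analysis ensemble.
    """
    n, K = Xb.shape
    xb_mean = Xb.mean(axis=1, keepdims=True)
    Xb_pert = Xb - xb_mean                        # forecast perturbations X^b
    Yb = H @ Xb_pert                              # perturbations in obs space Y^b
    yb_mean = (H @ xb_mean).ravel()
    Rinv = np.linalg.inv(R)
    # P~a = [(K-1) I + Y^bT R^-1 Y^b]^-1, analysis covariance in ensemble space
    Pa = np.linalg.inv((K - 1) * np.eye(K) + Yb.T @ Rinv @ Yb)
    # mean analysis weights: w^a = P~a Y^bT R^-1 (y^o - ybar^b)
    wa = Pa @ Yb.T @ Rinv @ (yo - yb_mean)
    # perturbation weights: W^a = [(K-1) P~a]^{1/2}
    Wa = sym_sqrt((K - 1) * Pa)
    # analysis ensemble: a linear combination of the forecast ensemble
    return xb_mean + Xb_pert @ (Wa + wa[:, None])
```

A full LETKF would repeat this at every grid point, each with its own locally selected observations, and assemble the global analysis from the local ones.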

  • 10

    The 4D-LETKF produces an analysis in terms of weights of the ensemble forecast members at the analysis time tn, giving the trajectory that best fits all the observations in the assimilation window.

    Analysis time weights: a linear combination of trajectories is approximately a trajectory. If it is close to the truth at the end, it should be close to the truth throughout the trajectory (neglecting model errors).

  • 11

    No-cost LETKF smoother: apply at tn-1 the same weights found optimal at tn. It works for 3D- or 4D-LETKF.

    The no-cost smoother makes possible:
    – Outer loop (like in 4D-Var)
    – "Running in place" (faster spin-up)
    – Use of future data in reanalysis
    – Ability to use longer windows

  • 12

    No-cost LETKF smoother tested on a QG model: it works!

    "Smoother" reanalysis

    LETKF analysis at time n: $x_n^a = x_n^f + X_n^f w_n^a$

    Smoother analysis at time n-1: $\tilde{x}_{n-1}^a = x_{n-1}^f + X_{n-1}^f w_n^a$

    This very simple smoother allows us to go back and forth in time within an assimilation window: it allows assimilation of future data in reanalysis.
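In code, the no-cost smoother is nothing more than re-using the weight vector at an earlier time. A minimal sketch with illustrative names, assuming the mean weight vector from the LETKF at time n and the forecast ensemble valid at time n-1 are available:

```python
import numpy as np

def no_cost_smoother(Xf_prev, wa_n):
    """Smoothed mean at t_{n-1}: apply the weights found optimal at t_n
    to the forecast ensemble valid at t_{n-1} (no extra cost).

    Xf_prev : (n, K) forecast ensemble at t_{n-1}
    wa_n    : (K,)   mean analysis weights from the LETKF at t_n
    """
    x_mean = Xf_prev.mean(axis=1)
    X_pert = Xf_prev - x_mean[:, None]     # perturbations X^f_{n-1}
    # x~^a_{n-1} = x^f_{n-1} + X^f_{n-1} w^a_n
    return x_mean + X_pert @ wa_n
```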

  • 13

    Nonlinearities and "outer loop"

    • The main disadvantage of EnKF is that it cannot handle nonlinear (non-Gaussian) perturbations and therefore needs short assimilation windows.

    • It doesn't have the outer loop so important in 3D-Var and 4D-Var (DaSilva, pers. comm. 2006).

    Lorenz 3-variable model (Kalnay et al. 2007a, Tellus), RMS analysis error:

                        4D-Var   LETKF
    Window = 8 steps     0.31     0.30   (linear window)
    Window = 25 steps    0.53     0.66   (nonlinear window)

    Long windows + Pires et al. => 4D-Var clearly wins!

  • 14

    "Outer loop" in 4D-Var

  • 15

    Nonlinearities, "Outer Loop" and "Running in Place"

    Outer loop: similar to 4D-Var: use the final weights to correct only the mean initial analysis, keeping the initial perturbations. Repeat the analysis once or twice. It centers the ensemble on a more accurate nonlinear solution.

    Lorenz 3-variable model, RMS analysis error:

                        4D-Var   LETKF   LETKF + outer loop   LETKF + RIP
    Window = 8 steps     0.31     0.30         0.27               0.27
    Window = 25 steps    0.53     0.66         0.48               0.39

    "Running in place" smoothes both the analysis and the analysis error covariance and iterates a few times…

  • 16

    Estimation of forecast sensitivity to observations without adjoint in an ensemble Kalman filter

    Junjie Liu and Eugenia Kalnay, QJRMS October 2008

    Inspired by Langland and Baker (2004) and Zhu and Gelaro (2008)

    Ideal for diagnosing NCEP "5-day skill dropouts" because it remains valid for nonlinear perturbations

  • 17

    Motivation: Langland and Baker (2004)

    The adjoint method proposed by Langland and Baker (2004) and Zhu and Gelaro (2007) quantifies the reduction in forecast error from each individual observation source.

    The adjoint method detects the observations which make the forecast worse.

    The adjoint method requires an adjoint model, which is difficult to obtain.

    [Figure legend: AIRS shortwave 4.180 µm; AIRS shortwave 4.474 µm; AIRS longwave 14-13 µm; AMSU/A]

  • 18

    Schematic of the observation impact on the reduction of forecast error (adapted from Langland and Baker, 2004)

    The only difference between $e_{t|0}$ and $e_{t|-6}$ is the assimilation of observations at 00hr:
    $e_{t|0} = x_{t|0}^f - x_t^a$;  $e_{t|-6} = x_{t|-6}^f - x_t^a$

    Observation impact on the reduction of forecast error:
    $J = \frac{1}{2}\left(e_{t|0}^T e_{t|0} - e_{t|-6}^T e_{t|-6}\right)$

    [Schematic: forecasts started from the -6hr and 00hr analyses, with the OBS. assimilated at 00hr, verified against the analysis at time t]

  • 19

    The ensemble forecast sensitivity method

    Euclidian cost function: $J = \frac{1}{2}\left(e_{t|0}^T e_{t|0} - e_{t|-6}^T e_{t|-6}\right)$

    The sensitivity of the cost function with respect to the assimilated observations, with the observation increments $v_0 = y_0^o - h(x_{0|-6}^b)$:

    $\frac{\partial J}{\partial v_0} = \tilde{K}_0^T X_{t|-6}^{fT}\left[e_{t|-6} + X_{t|-6}^f \tilde{K}_0 v_0\right]$

    Cost function as a function of the obs. increments: $J = \left\langle v_0, \frac{\partial J}{\partial v_0} \right\rangle$

    With this formula we can predict the impact of observations on the forecasts!
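Once the two forecasts exist, the cost function itself is cheap to evaluate. A small sketch (hypothetical function name) of the Langland-Baker impact measure used above:

```python
import numpy as np

def observation_impact(xf_00, xf_m6, x_verif):
    """J = 1/2 (e_{t|0}^T e_{t|0} - e_{t|-6}^T e_{t|-6}).

    xf_00  : forecast started from the 00hr analysis (with the obs)
    xf_m6  : forecast started from the -6hr analysis (without them)
    x_verif: verifying analysis at time t
    Negative J means the 00hr observations reduced the forecast error.
    """
    e0 = xf_00 - x_verif      # e_{t|0}
    e6 = xf_m6 - x_verif      # e_{t|-6}
    return 0.5 * (e0 @ e0 - e6 @ e6)
```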

  • 20

    Test of the ability to detect a poor quality observation in the Lorenz 40-variable model

    Like the adjoint method, the ensemble sensitivity method can detect the poor quality observation (11th observation location).

    The ensemble sensitivity method has a stronger signal when the observation has a negative impact on the forecast.

    [Figure: observation impact from LB (red) and from the ensemble sensitivity method (green); larger random error and biased observation cases]

  • 21

    Test of the ability to detect a poor quality observation at different forecast lengths

    After 2 days the adjoint has the wrong sensitivity sign!

    The ensemble sensitivity method has a strong signal even after the forecast error has saturated!

    [Figure: 2-day, 5-day and 20-day forecasts; larger random error and biased observation cases]

  • 22

    How can we possibly detect bad observations even after all skill is lost? (Liu and Kalnay, 2009)

    After 20 days there is no forecast skill, but the ensemble sensitivity still detects the wrong observation.

    The ensemble sensitivity is based on the assumption that the analysis weights can be used in the forecasts. This is accurate even after forecast error has saturated (triangles).

    As a result we can identify a bad observation even after forecast skill is lost.

    [Figure, 20 days: mean square error of the -6hr weighted forecasts (diamonds), MSE of the 0hr ensemble mean (circles), and the mean square difference between the ensemble mean and the weighted forecasts (triangles), which is the error made by using the -6hr weights in the forecasts]

  • 23

    Analysis sensitivity to observations and exact cross-validation in an EnKF

    Junjie Liu, E. Kalnay, T. Miyoshi and C. Cardinali, QJRMS 2009

    Inspired by Cardinali et al., 2004

  • 24

    Observation quality control using cross-validation

    Experimental design: the observation error standard deviation at the 11th point is 4 times larger than at the others.

    * The difference between the predicted observation $y_i^{a(-i)}$ and the actual obs $y_i^o$ is larger when the ith observation has larger error (11th point).

    * Does not need much computational time.

    $\frac{1}{T}\sum_{t=1}^{T}\left(y_i^o - y_i^{a(-i)}\right)^2 = \frac{1}{T}\sum_{t=1}^{T}\frac{\left(y_{i,t}^o - y_{i,t}^a\right)^2}{\left(1 - S_{ii,t}^o\right)^2}, \quad i = 1,\ldots,m$

    [Figure: $y_i^o$ and $y_i^{a(-i)}$ plotted against grid points]
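The identity above means the leave-one-out residual can be computed from quantities the filter already has: the ordinary analysis residual and the self-sensitivity $S_{ii}$. A sketch with an illustrative name, assuming the residuals and self-sensitivities are stored over T analysis times:

```python
import numpy as np

def cv_score(yo, ya, S_ii):
    """Time-mean exact cross-validation score per observation location:
    mean_t (y_{i,t}^o - y_{i,t}^a)^2 / (1 - S_ii,t)^2.

    yo, ya : (T, m) observed and analysed values at m locations
    S_ii   : (T, m) self-sensitivities (diagonal of the influence matrix)
    A location with an inflated score flags a suspect observation.
    """
    return np.mean((yo - ya) ** 2 / (1.0 - S_ii) ** 2, axis=0)
```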

  • 25

    Observation impact based on self-sensitivity and the observation impact from the adjoint and ensemble sensitivity methods

    The adjoint and ensemble sensitivity method (Langland and Baker, 2004; Liu and Kalnay, 2008):
    $e_{t|-6} = x_{t|-6}^f - x_t^a$;  $e_{t|0} = x_{t|0}^f - x_t^a$;  $J = \left[(x_{t|0}^f - x_t^a)^2 - (x_{t|-6}^f - x_t^a)^2\right]$

    * The cost function reflects the impact of all the observations assimilated at 00hr on the forecast error difference (model space) at time t.
    * The cost function is rewritten as a function of the observations assimilated at 00hr.

    The observation impact based on self-sensitivity:
    $e_{t|0}^f = y_{i,t|0}^f - y_{i,t}^a$;  $e_{t|0}^{f(-i)} = y_{i,t|0}^{f(-i)} - y_{i,t}^a$;  $J_i = \left[(y_{i,t|0}^f - y_{i,t}^a)^2 - (y_{i,t|0}^{f(-i)} - y_{i,t}^a)^2\right]$

    * The cost function reflects the impact of the ith observation assimilated at 00hr on the forecast error difference (observation space) at time t.
    * There is no need to rewrite the cost function.

    [Schematic: forecasts from the -6hr analysis (total obs.) and the 00hr analysis ((total-1) obs.), verified against the analysis at time t]

  • 26

    Coarse analysis with interpolated weights, Yang et al. (2008)

    • In EnKF the analysis is a weighted average of the forecast ensemble
    • We performed experiments with a QG model, interpolating weights compared to analysis increments
    • Coarse grids of 11%, 4% and 2% interpolated analysis points
    • Weight fields vary on large scales: they interpolate very well

    [Figure: 1/(3x3) = 11% analysis grid]
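Because the weight fields are large-scale, plain linear interpolation of the K weights from the coarse analysis points is enough. A 1-D sketch with illustrative names, using NumPy's `np.interp` per ensemble member:

```python
import numpy as np

def interpolate_weights_1d(w_coarse, ratio):
    """Interpolate LETKF weight vectors from a coarse analysis grid to the
    full grid (1-D sketch).

    w_coarse : (Nc, K) one K-vector of weights per coarse analysis point
    ratio    : fine points per coarse interval
    Returns ((Nc-1)*ratio + 1, K) weights on the fine grid.
    """
    Nc, K = w_coarse.shape
    xc = np.arange(Nc) * ratio                   # coarse-point positions
    xf = np.arange((Nc - 1) * ratio + 1)         # fine-grid positions
    return np.stack([np.interp(xf, xc, w_coarse[:, k]) for k in range(K)], axis=1)

def apply_weights(x_mean, X_pert, W):
    """Analysis mean at each fine point: x_mean[i] + X_pert[i, :] . W[i, :]."""
    return x_mean + np.einsum('ik,ik->i', X_pert, W)
```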

  • 27

    Weight interpolation versus increment interpolation

    With increment interpolation, the analysis degrades quickly…
    With weight interpolation, there is almost no degradation!
    LETKF maintains balance and conservation properties.

  • 28

    Impact of coarse analysis on accuracy

    With increment interpolation, the analysis degrades.
    With weight interpolation, there is no degradation; the analysis is actually slightly better!

  • 29

    Model error: comparison of methods to correct model bias and inflation

    Hong Li, Chris Danforth, Takemasa Miyoshi, and Eugenia Kalnay, MWR (2009)

    Inspired by the work of Dick Dee, but with model errors estimated in model space, not in obs space

  • 30

    Model error: if we assume a perfect model in EnKF, we underestimate the analysis errors (Li, 2007)

    [Figure: analysis errors for an imperfect model (obs from the NCEP-NCAR Reanalysis, NNR) vs. a perfect SPEEDY model]

  • 31

    — Why is EnKF vulnerable to model errors ?

    In the theory of the extended Kalman filter, the forecast error is represented by the growth of errors in the initial conditions plus the model errors:
    $P_i^f = M_{i-1} P_{i-1}^a M_{i-1}^T + Q$

    However, in the ensemble Kalman filter, the error estimated by the ensemble spread can only represent the first type of error:
    $P_i^f \approx \frac{1}{K-1}\sum_{k=1}^{K}\left(x_k^f - \bar{x}^f\right)\left(x_k^f - \bar{x}^f\right)^T$

    The ensemble spread is 'blind' to model errors.

    [Schematic: with an imperfect model the ensemble members and their mean drift away from the truth, so part of the forecast error is not contained in the spread]
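The contrast on this slide is between the sample covariance the ensemble provides and the extended-KF forecast covariance $M P^a M^T + Q$, which carries an explicit model-error term. A sketch of the ensemble estimator, with no $Q$ anywhere in it:

```python
import numpy as np

def ensemble_covariance(Xf):
    """Sample forecast-error covariance
    P^f ~= 1/(K-1) * sum_k (x_k^f - xbar^f)(x_k^f - xbar^f)^T.

    Xf : (n, K) forecast ensemble, one member per column.
    Only the growth of initial-condition errors is represented: there is
    no additive model-error term Q, so the spread is 'blind' to model error.
    """
    K = Xf.shape[1]
    Xp = Xf - Xf.mean(axis=1, keepdims=True)
    return Xp @ Xp.T / (K - 1)
```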

  • 32

    We compared several methods to handle bias and random model errors.

    Low Dimensional Method to correct the bias (Danforth et al., 2007) combined with additive inflation

    [Figure: imperfect model vs. perfect model]

  • 33

    Simultaneous estimation of EnKF inflation and obs errors in the presence of model errors

    Hong Li, Miyoshi and Kalnay (QJ, 2009)

    Any data assimilation scheme requires accurate statistics for the observation and background errors (usually tuned or from gut feeling). EnKF needs inflation of the background error covariance: tuning is expensive. Wang and Bishop (2003) and Miyoshi (2005) proposed a technique to estimate the covariance inflation parameter online. It works well if obs errors are accurate. We introduce a method to simultaneously estimate obs errors and inflation.

    Inspired by Houtekamer et al. (2001) and Desroziers et al. (2005)

  • 34

    Diagnosis of observation error statistics

    Houtekamer et al. (2001), well-known statistical relationship (OMB*OMB):
    $\left\langle d_{o-b}\, d_{o-b}^T \right\rangle = H P^b H^T + R$

    Desroziers et al. (2005) introduced two new statistical relationships:
    OMA*OMB: $\left\langle d_{o-a}\, d_{o-b}^T \right\rangle = R$
    AMB*OMB: $\left\langle d_{a-b}\, d_{o-b}^T \right\rangle = H P^b H^T$

    These relationships are correct if the R and B statistics are correct and the errors are uncorrelated!

    With inflation $\Delta > 1$: $H P^b H^T \rightarrow H (\Delta P^b) H^T$
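These three relationships are sample statistics accumulated over many analysis cycles. A sketch with illustrative names of the corresponding estimators, using the identity $d_{o-b} = d_{o-a} + d_{a-b}$:

```python
import numpy as np

def innovation_diagnostics(d_oa, d_ab):
    """Sample versions of the diagnostics above, from N analysis cycles.

    d_oa : (N, p) obs-minus-analysis departures
    d_ab : (N, p) analysis-minus-background departures
    Returns (R_hat, HPbHT_hat, total_hat), where
      total_hat = <d_ob d_ob^T> -> H P^b H^T + R   (Houtekamer et al., 2001)
      R_hat     = <d_oa d_ob^T> -> R               (Desroziers et al., 2005)
      HPbHT_hat = <d_ab d_ob^T> -> H P^b H^T       (Desroziers et al., 2005)
    """
    N = d_oa.shape[0]
    d_ob = d_oa + d_ab                 # obs-minus-background departures
    R_hat = d_oa.T @ d_ob / N
    HPbHT_hat = d_ab.T @ d_ob / N
    total_hat = d_ob.T @ d_ob / N
    return R_hat, HPbHT_hat, total_hat
```

By construction the three estimates are consistent: the OMB*OMB statistic equals the sum of the other two.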

  • 35

    Diagnosis of observation error statistics

    Here we use a simple KF to estimate both $\Delta$ and $\sigma_o$ online.

    OMA*OMB: $(\tilde{\sigma}^o)^2 = d_{o-a}^T d_{o-b}/p = \frac{1}{p}\sum_{j=1}^{p}\left(y_j^o - y_j^a\right)\left(y_j^o - y_j^b\right)$

    OMB^2: $\tilde{\Delta}^o = \frac{d_{o-b}^T d_{o-b} - \mathrm{Tr}(R)}{\mathrm{Tr}(H P^b H^T)}$

    AMB*OMB: $\tilde{\Delta}^o = \frac{1}{\mathrm{Tr}(H P^b H^T)}\sum_{j=1}^{p}\left(y_j^a - y_j^b\right)\left(y_j^o - y_j^b\right)$

    Transposing, we get "observations" of $\Delta$ and $\sigma_o^2$.
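Written out, the two "observed" quantities on this slide are one-liners; a simple KF (or running mean) then smooths them in time. A sketch with hypothetical names, assuming the innovation statistics for one analysis time are available:

```python
import numpy as np

def obs_error_variance_hat(d_oa, d_ob):
    """'Observation' of sigma_o^2 from OMA*OMB: d_oa^T d_ob / p."""
    return float(d_oa @ d_ob) / d_ob.size

def inflation_hat(d_ob, R, HPbHT):
    """'Observation' of the inflation Delta from OMB^2:
    (d_ob^T d_ob - Tr R) / Tr(H P^b H^T)."""
    return (float(d_ob @ d_ob) - np.trace(R)) / np.trace(HPbHT)
```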

  • 36

    SPEEDY model: online estimated observational errors, with each variable's observation error variance started at 2 instead of the true value of 1.

    The original, wrongly specified R quickly converges to the correct value of R (in about 5-10 days).

  • 37

    Estimation of the inflation

    [Figure: estimated inflation $\Delta$, using a perfect R and estimating $\Delta$ adaptively, vs. using an initially wrong R and $\Delta$ but estimating them both adaptively]

    After R converges, the time-dependent inflation factors are quite similar.

  • 38

    Tests with LETKF with an imperfect L40 model: random errors added to the model

    Error amplitude   A: true σo²=1.0,       B: true σo²=1.0,   C: adaptive σo²,
    (random)          (tuned) constant Δ     adaptive Δ         adaptive Δ
    a                 Δ      RMSE            Δ      RMSE        Δ      RMSE    σo²
    4                 0.25   0.36            0.27   0.36        0.39   0.38    0.93
    20                0.45   0.47            0.41   0.47        0.38   0.48    1.02
    100               1.00   0.64            0.87   0.64        0.80   0.64    1.05

    The method works quite well even with very large random errors!

  • 39

    Tests with LETKF with an imperfect L40 model: biases added to the model

    Error amplitude   A: true σo²=1.0,       B: true σo²=1.0,   C: adaptive σo²,
    (bias)            (tuned) constant Δ     adaptive Δ         adaptive Δ
                      Δ      RMSE            Δ      RMSE        Δ      RMSE    σo²
    1                 0.35   0.40            0.31   0.42        0.35   0.41    0.96
    4                 1.00   0.59            0.78   0.61        0.77   0.61    1.01
    7                 1.50   0.68            1.11   0.71        0.81   0.80    1.36

    The method works well for low biases, but less well for large biases: model bias needs to be accounted for by a separate bias correction.

  • 40

    Summary

    • EnKF and 4D-Var give similar results in Canada and at JMA, except for model bias (Buehner et al., Miyoshi et al.)

    • EnKF is better than GSI with a half-resolution model and 64 members. Computationally competitive (Whitaker)

    • Many ideas to further improve EnKF were inspired by 4D-Var:
      – No-cost smoothing and "running in place"
      – A simple outer loop to deal with nonlinearities
      – Adjoint forecast sensitivity without an adjoint model
      – Analysis sensitivity and exact cross-validation
      – Coarse resolution analysis without degradation
      – Correction of model bias combined with additive inflation gives the best results
      – Can estimate simultaneously optimal inflation and obs. errors

  • 41

    Extra Slides on Low Dim Method

  • 42

    2.3 Low-dim method (Danforth et al., 2007: Estimating and correcting global weather model error. Mon. Wea. Rev., 2007)

    Bias removal schemes (Low Dimensional Method)

    • Generate a long time series of model forecast minus reanalysis from the training period: starting from the NNR analysis at t=0, run the model to t=6hr and estimate the error against NNR,
    $x_{6hr}^e = x_{6hr}^f - x_{6hr}^{NNR}$

    The 6-hr forecast error combines the error due to the error in the initial conditions with the model error:
    $x_{n+1}^f - x_{n+1}^t = \left[M(x_n^a) - M(x_n^t)\right] + \left[M(x_n^t) - x_{n+1}^t\right]$

    We collect a large number of estimated errors and estimate from them the time-mean model bias, the diurnal model error, and the state-dependent model error.

    [Schematic: model trajectory from t=0 to t=6hr vs. the NNR truth; the forecast error splits into a part due to the error in the IC and the model error]

  • 43

    Low-dimensional method

    Include bias, diurnal and state-dependent model errors:

    $\text{model error} = b + \sum_{l=1}^{2}\beta_{n,l}\, e_l + \sum_{m=1}^{10}\gamma_{n,m}\, f_m$

    Having a large number of estimated errors allows us to estimate the global model error beyond the bias.
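The decomposition above amounts to a time-mean bias plus a projection of the error anomalies onto a few fixed fields (found here by an EOF/SVD analysis of the training errors). A sketch under that assumption, with illustrative names; the slide's $e_l$ and $f_m$ would be the diurnal and state-dependent bases:

```python
import numpy as np

def low_dim_error_model(err_samples, n_modes=2):
    """Fit  model error ~= b + sum_l beta_l e_l  to training errors (sketch).

    err_samples : (T, n) one estimated 6-hr error field per training time
    Returns the time-mean bias b (n,) and the leading n_modes EOFs
    (n_modes, n) of the error anomalies, onto which new errors can be
    regressed to get the coefficients beta.
    """
    b = err_samples.mean(axis=0)                 # time-mean model bias
    anom = err_samples - b                       # error anomalies
    _, _, Vt = np.linalg.svd(anom, full_matrices=False)
    return b, Vt[:n_modes]                       # EOFs are the rows of Vt
```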

  • 44

    SPEEDY 6-hr model errors against NNR (diurnal cycle), 1987 Jan 1 to Feb 15

    • For temperature at lower levels, in addition to the time-independent bias, SPEEDY has diurnal cycle errors because it lacks diurnal radiation forcing.

    Error anomalies: $x_{6hr}^{e\prime} = x_{6hr}^e - \overline{x_{6hr}^e}$

    [Figure: leading EOFs for 925 mb TEMP, with principal components pc1 and pc2]