Data Assimilation Systems: Focus on EnKF diagnostics · 2016. 1. 25.

  • 1

    Data Assimilation Systems: Focus on EnKF diagnostics

    Former students (Shu-Chih Yang, Takemasa Miyoshi, Hong Li, Junjie Liu, Chris Danforth, Ji-Sun Kang, Matt Hoffman) and Eugenia Kalnay, UMCP

    Acknowledgements: UMD Chaos-Weather Group: Brian Hunt, Istvan Szunyogh, Ed Ott, Jim Yorke, Kayo Ide, and students. Also: Malaquías Peña, Matteo Corazza, Pablo Grunman, DJ Patil, Debra Baker, Steve Greybush, Tamara Singleton, Steve Penny, Elana Fertig.

  • 2

    Ensemble Kalman Filter: status and new ideas

    • EnKF and 4D-Var are in a friendly competition:
    • Jeff Whitaker's results: EnKF better than GSI (3D-Var)
    • Canada (Buehner): 4D-Var and EnKF are the same in the NH and EnKF is better in the SH. Hybrid best.
    • JMA (Miyoshi): at JMA, EnKF is faster than 4D-Var, better in the tropics and NH, worse in the SH due to model bias.
    • EnKF needs no adjoint model or priors; it adapts to changes in obs, and it can even estimate obs errors.
    • We "plagiarize" ideas and methods developed for 4D-Var and adapt them to the LETKF (Hunt et al., 2007)

  • 3

    Whitaker: Comparison of a T190, 64-member EnKF with the T382 operational GSI, same observations (JCSDA, 2009)

  • 4

    There are several types of EnKF

    1. Perturbed obs (e.g., Houtekamer and Mitchell)
    2. Square root filters (e.g., Whitaker and Hamill)

    Most filters get their speed from assimilating one observation at a time.

    The LETKF (Hunt 2005) assimilates all obs simultaneously and gets its speed from local processing of each grid point.

    Because it is a Transform Square root filter, the LETKF analysis ensemble is explicitly expressed as a linear combination of the forecast ensemble.

    This has a number of nice properties, so here we will focus on the LETKF.

  • 5

    Diagnostic tools that improve LETKF/EnKF

    We adapted ideas that were inspired by 4D-Var:
    – No-cost smoother (Kalnay et al., Tellus 2007)
    – "Outer loop", nonlinearities and long windows (Yang and Kalnay)
    – Accelerating the spin-up (Kalnay and Yang, 2008)
    – Forecast sensitivity to observations (Liu and Kalnay, QJ, 2008)
    – Analysis sensitivity to observations and cross-validation (Liu et al., QJ, 2009)
    – Coarse analysis resolution without degradation (Yang et al., QJ, 2009)
    – Low-dimensional model bias correction (Li et al., MWR, 2009)
    – Simultaneous estimation of optimal inflation and observation errors (Li et al., QJ, 2009)

  • 6

    Local Ensemble Transform Kalman Filter (Ott et al., 2004; Hunt et al., 2004, 2007)

    • Model independent (black box)
    • Obs. assimilated simultaneously at each grid point
    • 100% parallel: fast
    • No adjoint needed
    • 4D LETKF extension

    [Schematic: start with an initial ensemble; the Model produces ensemble forecasts; the Observation operator maps them to ensemble "observations"; these plus the Observations feed the LETKF, which returns ensemble analyses to the Model]

  • 7

    Perform data assimilation in a local volume, choosing observations

    The state estimate is updated at the central grid point (red dot)

    Localization based on observations

  • 8

    Perform data assimilation in a local volume, choosing observations

    The state estimate is updated at the central grid point (red dot)

    All observations (purple diamonds) within the local region are assimilated

    Localization based on observations

    The LETKF algorithm can be described in a single slide!

  • 9

    Local Ensemble Transform Kalman Filter (LETKF)

    Forecast step: $x_{n,k}^b = M_n(x_{n-1,k}^a)$

    Analysis step: construct
    $X^b = [x_1^b - \bar{x}^b \mid \ldots \mid x_K^b - \bar{x}^b]$;  $y_i^b = H(x_i^b)$;  $Y_n^b = [y_1^b - \bar{y}^b \mid \ldots \mid y_K^b - \bar{y}^b]$

    Locally: choose for each grid point the observations to be used, and compute the local analysis error covariance and perturbations in ensemble space:
    $\tilde{P}^a = [(K-1)I + Y^{bT} R^{-1} Y^b]^{-1}$;  $W^a = [(K-1)\tilde{P}^a]^{1/2}$

    Analysis mean in ensemble space: $\bar{w}^a = \tilde{P}^a Y^{bT} R^{-1}(y^o - \bar{y}^b)$, and add $\bar{w}^a$ to $W^a$ to get the analysis ensemble in ensemble space.

    Globally: the new ensemble analyses in model space are the columns of $X_n^a = X_n^b W^a + \bar{x}^b$. Gathering the grid point analyses forms the new global analyses. Note that the outputs of the LETKF are the analysis weights $\bar{w}^a$ and the perturbation weight matrices $W^a$. These weights multiply the ensemble forecasts.
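The analysis step above maps directly onto a few lines of linear algebra. The sketch below is a minimal single-region implementation in NumPy under simplifying assumptions (a linear observation operator supplied as a matrix `H`, explicit inversion of `R`, no localization weighting); the function names are illustrative, not taken from any LETKF codebase.

```python
import numpy as np

def sym_sqrt(A):
    """Symmetric matrix square root via eigendecomposition."""
    vals, vecs = np.linalg.eigh(A)
    return vecs @ np.diag(np.sqrt(np.maximum(vals, 0.0))) @ vecs.T

def letkf_analysis(Xb, yo, H, R):
    """One LETKF analysis step for a single local region (sketch).

    Xb : (n, K) forecast ensemble, one member per column
    yo : (p,)   observations in the local region
    H  : (p, n) linear observation operator
    R  : (p, p) observation-error covariance
    Returns the (n, K) analysis ensemble.
    """
    n, K = Xb.shape
    xb_mean = Xb.mean(axis=1, keepdims=True)
    Xb_pert = Xb - xb_mean                        # forecast perturbations X^b
    Yb = H @ Xb_pert                              # perturbations in obs space Y^b
    yb_mean = (H @ xb_mean).ravel()
    Rinv = np.linalg.inv(R)
    # P~a = [(K-1) I + Y^bT R^-1 Y^b]^-1, analysis covariance in ensemble space
    Pa = np.linalg.inv((K - 1) * np.eye(K) + Yb.T @ Rinv @ Yb)
    # mean analysis weights: w^a = P~a Y^bT R^-1 (y^o - ybar^b)
    wa = Pa @ Yb.T @ Rinv @ (yo - yb_mean)
    # perturbation weights: W^a = [(K-1) P~a]^{1/2}
    Wa = sym_sqrt((K - 1) * Pa)
    # analysis ensemble: a linear combination of the forecast ensemble
    return xb_mean + Xb_pert @ (Wa + wa[:, None])
```

A full LETKF would repeat this at every grid point, each with its own locally selected observations, and assemble the global analysis from the local ones.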

  • 10

    The 4D-LETKF produces an analysis in terms of weights of the ensemble forecast members at the analysis time tn, giving the trajectory that best fits all the observations in the assimilation window.

    Analysis time weights: a linear combination of trajectories is approximately a trajectory. If it is close to the truth at the end, it should be close to the truth throughout the trajectory (neglecting model errors).

  • 11

    No-cost LETKF smoother: apply at tn-1 the same weights found optimal at tn. It works for 3D- or 4D-LETKF.

    The no-cost smoother makes possible:
    – Outer loop (like in 4D-Var)
    – "Running in place" (faster spin-up)
    – Use of future data in reanalysis
    – Ability to use longer windows

  • 12

    No-cost LETKF smoother tested on a QG model: it works!

    "Smoother" reanalysis

    LETKF analysis at time n: $x_n^a = x_n^f + X_n^f w_n^a$

    Smoother analysis at time n-1: $\tilde{x}_{n-1}^a = x_{n-1}^f + X_{n-1}^f w_n^a$

    This very simple smoother allows us to go back and forth in time within an assimilation window: it allows assimilation of future data in reanalysis.
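In code, the no-cost smoother is nothing more than re-using the weight vector at an earlier time. A minimal sketch with illustrative names, assuming the mean weight vector from the LETKF at time n and the forecast ensemble valid at time n-1 are available:

```python
import numpy as np

def no_cost_smoother(Xf_prev, wa_n):
    """Smoothed mean at t_{n-1}: apply the weights found optimal at t_n
    to the forecast ensemble valid at t_{n-1} (no extra cost).

    Xf_prev : (n, K) forecast ensemble at t_{n-1}
    wa_n    : (K,)   mean analysis weights from the LETKF at t_n
    """
    x_mean = Xf_prev.mean(axis=1)
    X_pert = Xf_prev - x_mean[:, None]     # perturbations X^f_{n-1}
    # x~^a_{n-1} = x^f_{n-1} + X^f_{n-1} w^a_n
    return x_mean + X_pert @ wa_n
```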

  • 13

    Nonlinearities and "outer loop"

    • The main disadvantage of EnKF is that it cannot handle nonlinear (non-Gaussian) perturbations and therefore needs short assimilation windows.

    • It doesn't have the outer loop so important in 3D-Var and 4D-Var (DaSilva, pers. comm. 2006).

    Lorenz 3-variable model (Kalnay et al. 2007a, Tellus), RMS analysis error:

                        4D-Var   LETKF
    Window = 8 steps     0.31     0.30   (linear window)
    Window = 25 steps    0.53     0.66   (nonlinear window)

    Long windows + Pires et al. => 4D-Var clearly wins!

  • 14

    "Outer loop" in 4D-Var

  • 15

    Nonlinearities, "Outer Loop" and "Running in Place"

    Outer loop: similar to 4D-Var: use the final weights to correct only the mean initial analysis, keeping the initial perturbations. Repeat the analysis once or twice. It centers the ensemble on a more accurate nonlinear solution.

    Lorenz 3-variable model, RMS analysis error:

                        4D-Var   LETKF   LETKF + outer loop   LETKF + RIP
    Window = 8 steps     0.31     0.30         0.27               0.27
    Window = 25 steps    0.53     0.66         0.48               0.39

    "Running in place" smoothes both the analysis and the analysis error covariance and iterates a few times…

  • 16

    Estimation of forecast sensitivity to observations without adjoint in an ensemble Kalman filter

    Junjie Liu and Eugenia Kalnay, QJRMS October 2008

    Inspired by Langland and Baker (2004) and Zhu and Gelaro (2008)

    Ideal for diagnosing NCEP "5-day skill dropouts" because it remains valid for nonlinear perturbations

  • 17

    Motivation: Langland and Baker (2004)

    The adjoint method proposed by Langland and Baker (2004) and Zhu and Gelaro (2007) quantifies the reduction in forecast error from each individual observation source.

    The adjoint method detects the observations which make the forecast worse.

    The adjoint method requires an adjoint model, which is difficult to obtain.

    [Figure legend: AIRS shortwave 4.180 µm; AIRS shortwave 4.474 µm; AIRS longwave 14-13 µm; AMSU/A]

  • 18

    Schematic of the observation impact on the reduction of forecast error (adapted from Langland and Baker, 2004)

    The only difference between $e_{t|0}$ and $e_{t|-6}$ is the assimilation of observations at 00hr:
    $e_{t|0} = x_{t|0}^f - x_t^a$;  $e_{t|-6} = x_{t|-6}^f - x_t^a$

    Observation impact on the reduction of forecast error:
    $J = \frac{1}{2}\left(e_{t|0}^T e_{t|0} - e_{t|-6}^T e_{t|-6}\right)$

    [Schematic: forecasts started from the -6hr and 00hr analyses, with the OBS. assimilated at 00hr, verified against the analysis at time t]

  • 19

    The ensemble forecast sensitivity method

    Euclidian cost function: $J = \frac{1}{2}\left(e_{t|0}^T e_{t|0} - e_{t|-6}^T e_{t|-6}\right)$

    The sensitivity of the cost function with respect to the assimilated observations, with the observation increments $v_0 = y_0^o - h(x_{0|-6}^b)$:

    $\frac{\partial J}{\partial v_0} = \tilde{K}_0^T X_{t|-6}^{fT}\left[e_{t|-6} + X_{t|-6}^f \tilde{K}_0 v_0\right]$

    Cost function as a function of the obs. increments: $J = \left\langle v_0, \frac{\partial J}{\partial v_0} \right\rangle$

    With this formula we can predict the impact of observations on the forecasts!
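Once the two forecasts exist, the cost function itself is cheap to evaluate. A small sketch (hypothetical function name) of the Langland-Baker impact measure used above:

```python
import numpy as np

def observation_impact(xf_00, xf_m6, x_verif):
    """J = 1/2 (e_{t|0}^T e_{t|0} - e_{t|-6}^T e_{t|-6}).

    xf_00  : forecast started from the 00hr analysis (with the obs)
    xf_m6  : forecast started from the -6hr analysis (without them)
    x_verif: verifying analysis at time t
    Negative J means the 00hr observations reduced the forecast error.
    """
    e0 = xf_00 - x_verif      # e_{t|0}
    e6 = xf_m6 - x_verif      # e_{t|-6}
    return 0.5 * (e0 @ e0 - e6 @ e6)
```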

  • 20

    Test of the ability to detect a poor quality observation in the Lorenz 40-variable model

    Like the adjoint method, the ensemble sensitivity method can detect the poor quality observation (11th observation location).

    The ensemble sensitivity method has a stronger signal when the observation has a negative impact on the forecast.

    [Figure: observation impact from LB (red) and from the ensemble sensitivity method (green); larger random error and biased observation cases]

  • 21

    Test of the ability to detect a poor quality observation at different forecast lengths

    After 2 days the adjoint has the wrong sensitivity sign!

    The ensemble sensitivity method has a strong signal even after the forecast error has saturated!

    [Figure: 2-day, 5-day and 20-day forecasts; larger random error and biased observation cases]

  • 22

    How can we possibly detect bad observations even after all skill is lost? (Liu and Kalnay, 2009)

    After 20 days there is no forecast skill, but the ensemble sensitivity still detects the wrong observation.

    The ensemble sensitivity is based on the assumption that the analysis weights can be used in the forecasts. This is accurate even after forecast error has saturated (triangles).

    As a result we can identify a bad observation even after forecast skill is lost.

    [Figure, 20 days: mean square error of the -6hr weighted forecasts (diamonds), MSE of the 0hr ensemble mean (circles), and the mean square difference between the ensemble mean and the weighted forecasts (triangles), which is the error made by using the -6hr weights in the forecasts]

  • 23

    Analysis sensitivity to observations and exact cross-validation in an EnKF

    Junjie Liu, E. Kalnay, T. Miyoshi and C. Cardinali, QJRMS 2009

    Inspired by Cardinali et al., 2004

  • 24

    Observation quality control using cross-validation

    Experimental design: the observation error standard deviation at the 11th point is 4 times larger than at the others.

    * The difference between the predicted observation $y_i^{a(-i)}$ and the actual obs $y_i^o$ is larger when the ith observation has larger error (11th point).

    * Does not need much computational time.

    $\frac{1}{T}\sum_{t=1}^{T}\left(y_i^o - y_i^{a(-i)}\right)^2 = \frac{1}{T}\sum_{t=1}^{T}\frac{\left(y_{i,t}^o - y_{i,t}^a\right)^2}{\left(1 - S_{ii,t}^o\right)^2}, \quad i = 1,\ldots,m$

    [Figure: $y_i^o$ and $y_i^{a(-i)}$ plotted against grid points]
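The identity above means the leave-one-out residual can be computed from quantities the filter already has: the ordinary analysis residual and the self-sensitivity $S_{ii}$. A sketch with an illustrative name, assuming the residuals and self-sensitivities are stored over T analysis times:

```python
import numpy as np

def cv_score(yo, ya, S_ii):
    """Time-mean exact cross-validation score per observation location:
    mean_t (y_{i,t}^o - y_{i,t}^a)^2 / (1 - S_ii,t)^2.

    yo, ya : (T, m) observed and analysed values at m locations
    S_ii   : (T, m) self-sensitivities (diagonal of the influence matrix)
    A location with an inflated score flags a suspect observation.
    """
    return np.mean((yo - ya) ** 2 / (1.0 - S_ii) ** 2, axis=0)
```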

  • 25

    Observation impact based on self-sensitivity and the observation impact from the adjoint and ensemble sensitivity methods

    The adjoint and ensemble sensitivity method (Langland and Baker, 2004; Liu and Kalnay, 2008):
    $e_{t|-6} = x_{t|-6}^f - x_t^a$;  $e_{t|0} = x_{t|0}^f - x_t^a$;  $J = \left[(x_{t|0}^f - x_t^a)^2 - (x_{t|-6}^f - x_t^a)^2\right]$

    * The cost function reflects the impact of all the observations assimilated at 00hr on the forecast error difference (model space) at time t.
    * The cost function is rewritten as a function of the observations assimilated at 00hr.

    The observation impact based on self-sensitivity:
    $e_{t|0}^f = y_{i,t|0}^f - y_{i,t}^a$;  $e_{t|0}^{f(-i)} = y_{i,t|0}^{f(-i)} - y_{i,t}^a$;  $J_i = \left[(y_{i,t|0}^f - y_{i,t}^a)^2 - (y_{i,t|0}^{f(-i)} - y_{i,t}^a)^2\right]$

    * The cost function reflects the impact of the ith observation assimilated at 00hr on the forecast error difference (observation space) at time t.
    * There is no need to rewrite the cost function.

    [Schematic: forecasts from the -6hr analysis (total obs.) and the 00hr analysis ((total-1) obs.), verified against the analysis at time t]

  • 26

    Coarse analysis with interpolated weights, Yang et al. (2008)

    • In EnKF the analysis is a weighted average of the forecast ensemble
    • We performed experiments with a QG model, interpolating weights compared to analysis increments
    • Coarse grids of 11%, 4% and 2% interpolated analysis points
    • Weight fields vary on large scales: they interpolate very well

    [Figure: 1/(3x3) = 11% analysis grid]
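Because the weight fields are large-scale, plain linear interpolation of the K weights from the coarse analysis points is enough. A 1-D sketch with illustrative names, using NumPy's `np.interp` per ensemble member:

```python
import numpy as np

def interpolate_weights_1d(w_coarse, ratio):
    """Interpolate LETKF weight vectors from a coarse analysis grid to the
    full grid (1-D sketch).

    w_coarse : (Nc, K) one K-vector of weights per coarse analysis point
    ratio    : fine points per coarse interval
    Returns ((Nc-1)*ratio + 1, K) weights on the fine grid.
    """
    Nc, K = w_coarse.shape
    xc = np.arange(Nc) * ratio                   # coarse-point positions
    xf = np.arange((Nc - 1) * ratio + 1)         # fine-grid positions
    return np.stack([np.interp(xf, xc, w_coarse[:, k]) for k in range(K)], axis=1)

def apply_weights(x_mean, X_pert, W):
    """Analysis mean at each fine point: x_mean[i] + X_pert[i, :] . W[i, :]."""
    return x_mean + np.einsum('ik,ik->i', X_pert, W)
```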

  • 27

    Weight interpolation versus increment interpolation

    With increment interpolation, the analysis degrades quickly…
    With weight interpolation, there is almost no degradation!
    LETKF maintains balance and conservation properties.

  • 28

    Impact of coarse analysis on accuracy

    With increment interpolation, the analysis degrades.
    With weight interpolation, there is no degradation; the analysis is actually slightly better!

  • 29

    Model error: comparison of methods to correct model bias and inflation

    Hong Li, Chris Danforth, Takemasa Miyoshi, and Eugenia Kalnay, MWR (2009)

    Inspired by the work of Dick Dee, but with model errors estimated in model space, not in obs space

  • 30

    Model error: if we assume a perfect model in EnKF, we underestimate the analysis errors (Li, 2007)

    [Figure: analysis errors for an imperfect model (obs from the NCEP-NCAR Reanalysis, NNR) vs. a perfect SPEEDY model]

  • 31

    — Why is EnKF vulnerable to model errors ?

    In the theory of the extended Kalman filter, the forecast error is represented by the growth of errors in the initial conditions plus the model errors:
    $P_i^f = M_{i-1} P_{i-1}^a M_{i-1}^T + Q$

    However, in the ensemble Kalman filter, the error estimated by the ensemble spread can only represent the first type of error:
    $P_i^f \approx \frac{1}{K-1}\sum_{k=1}^{K}\left(x_k^f - \bar{x}^f\right)\left(x_k^f - \bar{x}^f\right)^T$

    The ensemble spread is 'blind' to model errors.

    [Schematic: with an imperfect model the ensemble members and their mean drift away from the truth, so part of the forecast error is not contained in the spread]
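The contrast on this slide is between the sample covariance the ensemble provides and the extended-KF forecast covariance $M P^a M^T + Q$, which carries an explicit model-error term. A sketch of the ensemble estimator, with no $Q$ anywhere in it:

```python
import numpy as np

def ensemble_covariance(Xf):
    """Sample forecast-error covariance
    P^f ~= 1/(K-1) * sum_k (x_k^f - xbar^f)(x_k^f - xbar^f)^T.

    Xf : (n, K) forecast ensemble, one member per column.
    Only the growth of initial-condition errors is represented: there is
    no additive model-error term Q, so the spread is 'blind' to model error.
    """
    K = Xf.shape[1]
    Xp = Xf - Xf.mean(axis=1, keepdims=True)
    return Xp @ Xp.T / (K - 1)
```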

  • 32

    We compared several methods to handle bias and random model errors.

    Low Dimensional Method to correct the bias (Danforth et al., 2007) combined with additive inflation

    [Figure: imperfect model vs. perfect model]

  • 33

    Simultaneous estimation of EnKF inflation and obs errors in the presence of model errors

    Hong Li, Miyoshi and Kalnay (QJ, 2009)

    Any data assimilation scheme requires accurate statistics for the observation and background errors (usually tuned or from gut feeling). EnKF needs inflation of the background error covariance: tuning is expensive. Wang and Bishop (2003) and Miyoshi (2005) proposed a technique to estimate the covariance inflation parameter online. It works well if obs errors are accurate. We introduce a method to simultaneously estimate obs errors and inflation.

    Inspired by Houtekamer et al. (2001) and Desroziers et al. (2005)

  • 34

    Diagnosis of observation error statistics

    Houtekamer et al. (2001), well-known statistical relationship (OMB*OMB):
    $\left\langle d_{o-b}\, d_{o-b}^T \right\rangle = H P^b H^T + R$

    Desroziers et al. (2005) introduced two new statistical relationships:
    OMA*OMB: $\left\langle d_{o-a}\, d_{o-b}^T \right\rangle = R$
    AMB*OMB: $\left\langle d_{a-b}\, d_{o-b}^T \right\rangle = H P^b H^T$

    These relationships are correct if the R and B statistics are correct and the errors are uncorrelated!

    With inflation $\Delta > 1$: $H P^b H^T \rightarrow H (\Delta P^b) H^T$
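These three relationships are sample statistics accumulated over many analysis cycles. A sketch with illustrative names of the corresponding estimators, using the identity $d_{o-b} = d_{o-a} + d_{a-b}$:

```python
import numpy as np

def innovation_diagnostics(d_oa, d_ab):
    """Sample versions of the diagnostics above, from N analysis cycles.

    d_oa : (N, p) obs-minus-analysis departures
    d_ab : (N, p) analysis-minus-background departures
    Returns (R_hat, HPbHT_hat, total_hat), where
      total_hat = <d_ob d_ob^T> -> H P^b H^T + R   (Houtekamer et al., 2001)
      R_hat     = <d_oa d_ob^T> -> R               (Desroziers et al., 2005)
      HPbHT_hat = <d_ab d_ob^T> -> H P^b H^T       (Desroziers et al., 2005)
    """
    N = d_oa.shape[0]
    d_ob = d_oa + d_ab                 # obs-minus-background departures
    R_hat = d_oa.T @ d_ob / N
    HPbHT_hat = d_ab.T @ d_ob / N
    total_hat = d_ob.T @ d_ob / N
    return R_hat, HPbHT_hat, total_hat
```

By construction the three estimates are consistent: the OMB*OMB statistic equals the sum of the other two.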

  • 35

    Diagnosis of observation error statistics

    Here we use a simple KF to estimate both $\Delta$ and $\sigma_o$ online.

    OMA*OMB: $(\tilde{\sigma}^o)^2 = d_{o-a}^T d_{o-b}/p = \frac{1}{p}\sum_{j=1}^{p}\left(y_j^o - y_j^a\right)\left(y_j^o - y_j^b\right)$

    OMB^2: $\tilde{\Delta}^o = \frac{d_{o-b}^T d_{o-b} - \mathrm{Tr}(R)}{\mathrm{Tr}(H P^b H^T)}$

    AMB*OMB: $\tilde{\Delta}^o = \frac{1}{\mathrm{Tr}(H P^b H^T)}\sum_{j=1}^{p}\left(y_j^a - y_j^b\right)\left(y_j^o - y_j^b\right)$

    Transposing, we get "observations" of $\Delta$ and $\sigma_o^2$.
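Written out, the two "observed" quantities on this slide are one-liners; a simple KF (or running mean) then smooths them in time. A sketch with hypothetical names, assuming the innovation statistics for one analysis time are available:

```python
import numpy as np

def obs_error_variance_hat(d_oa, d_ob):
    """'Observation' of sigma_o^2 from OMA*OMB: d_oa^T d_ob / p."""
    return float(d_oa @ d_ob) / d_ob.size

def inflation_hat(d_ob, R, HPbHT):
    """'Observation' of the inflation Delta from OMB^2:
    (d_ob^T d_ob - Tr R) / Tr(H P^b H^T)."""
    return (float(d_ob @ d_ob) - np.trace(R)) / np.trace(HPbHT)
```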

  • 36

    SPEEDY model: online estimated observational errors, with each variable's observation error variance started at 2 instead of the true value of 1.

    The original, wrongly specified R quickly converges to the correct value of R (in about 5-10 days).

  • 37

    Estimation of the inflation

    [Figure: estimated inflation $\Delta$, using a perfect R and estimating $\Delta$ adaptively, vs. using an initially wrong R and $\Delta$ but estimating them both adaptively]

    After R converges, the time-dependent inflation factors are quite similar.

  • 38

    Tests with LETKF with an imperfect L40 model: random errors added to the model

    Error amplitude   A: true σo²=1.0,       B: true σo²=1.0,   C: adaptive σo²,
    (random)          (tuned) constant Δ     adaptive Δ         adaptive Δ
    a                 Δ      RMSE            Δ      RMSE        Δ      RMSE    σo²
    4                 0.25   0.36            0.27   0.36        0.39   0.38    0.93
    20                0.45   0.47            0.41   0.47        0.38   0.48    1.02
    100               1.00   0.64            0.87   0.64        0.80   0.64    1.05

    The method works quite well even with very large random errors!

  • 39

    Tests with LETKF with an imperfect L40 model: biases added to the model

    Error amplitude   A: true σo²=1.0,       B: true σo²=1.0,   C: adaptive σo²,
    (bias)            (tuned) constant Δ     adaptive Δ         adaptive Δ
                      Δ      RMSE            Δ      RMSE        Δ      RMSE    σo²
    1                 0.35   0.40            0.31   0.42        0.35   0.41    0.96
    4                 1.00   0.59            0.78   0.61        0.77   0.61    1.01
    7                 1.50   0.68            1.11   0.71        0.81   0.80    1.36

    The method works well for low biases, but less well for large biases: model bias needs to be accounted for by a separate bias correction.

  • 40

    Summary

    • EnKF and 4D-Var give similar results in Canada and at JMA, except for model bias (Buehner et al., Miyoshi et al.)

    • EnKF is better than GSI with a half-resolution model and 64 members. Computationally competitive (Whitaker)

    • Many ideas to further improve EnKF were inspired by 4D-Var:
      – No-cost smoothing and "running in place"
      – A simple outer loop to deal with nonlinearities
      – Adjoint forecast sensitivity without an adjoint model
      – Analysis sensitivity and exact cross-validation
      – Coarse resolution analysis without degradation
      – Correction of model bias combined with additive inflation gives the best results
      – Can estimate simultaneously optimal inflation and obs. errors

  • 41

    Extra Slides on Low Dim Method

  • 42

    2.3 Low-dim method (Danforth et al., 2007: Estimating and correcting global weather model error. Mon. Wea. Rev., 2007)

    Bias removal schemes (Low Dimensional Method)

    • Generate a long time series of model forecast minus reanalysis from the training period: starting from the NNR analysis at t=0, run the model to t=6hr and estimate the error against NNR,
    $x_{6hr}^e = x_{6hr}^f - x_{6hr}^{NNR}$

    The 6-hr forecast error combines the error due to the error in the initial conditions with the model error:
    $x_{n+1}^f - x_{n+1}^t = \left[M(x_n^a) - M(x_n^t)\right] + \left[M(x_n^t) - x_{n+1}^t\right]$

    We collect a large number of estimated errors and estimate from them the time-mean model bias, the diurnal model error, and the state-dependent model error.

    [Schematic: model trajectory from t=0 to t=6hr vs. the NNR truth; the forecast error splits into a part due to the error in the IC and the model error]

  • 43

    Low-dimensional method

    Include bias, diurnal and state-dependent model errors:

    $\text{model error} = b + \sum_{l=1}^{2}\beta_{n,l}\, e_l + \sum_{m=1}^{10}\gamma_{n,m}\, f_m$

    Having a large number of estimated errors allows us to estimate the global model error beyond the bias.
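The decomposition above amounts to a time-mean bias plus a projection of the error anomalies onto a few fixed fields (found here by an EOF/SVD analysis of the training errors). A sketch under that assumption, with illustrative names; the slide's $e_l$ and $f_m$ would be the diurnal and state-dependent bases:

```python
import numpy as np

def low_dim_error_model(err_samples, n_modes=2):
    """Fit  model error ~= b + sum_l beta_l e_l  to training errors (sketch).

    err_samples : (T, n) one estimated 6-hr error field per training time
    Returns the time-mean bias b (n,) and the leading n_modes EOFs
    (n_modes, n) of the error anomalies, onto which new errors can be
    regressed to get the coefficients beta.
    """
    b = err_samples.mean(axis=0)                 # time-mean model bias
    anom = err_samples - b                       # error anomalies
    _, _, Vt = np.linalg.svd(anom, full_matrices=False)
    return b, Vt[:n_modes]                       # EOFs are the rows of Vt
```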

  • 44

    SPEEDY 6-hr model errors against NNR (diurnal cycle), 1987 Jan 1 to Feb 15

    • For temperature at lower levels, in addition to the time-independent bias, SPEEDY has diurnal cycle errors because it lacks diurnal radiation forcing.

    Error anomalies: $x_{6hr}^{e\prime} = x_{6hr}^e - \overline{x_{6hr}^e}$

    [Figure: leading EOFs for 925 mb TEMP, with principal components pc1 and pc2]