Data Assimilation Systems: Focus on EnKF diagnostics · 2016. 1. 25.
TRANSCRIPT
-
1
Data Assimilation Systems: Focus on EnKF diagnostics
Former students (Shu-Chih Yang, Takemasa Miyoshi, Hong Li, Junjie Liu, Chris Danforth, Ji-Sun Kang, Matt Hoffman)
and Eugenia Kalnay, UMCP
Acknowledgements: UMD Chaos-Weather Group: Brian Hunt, Istvan Szunyogh, Ed Ott, Jim Yorke, Kayo Ide, and students. Also: Malaquías Peña, Matteo Corazza, Pablo Grunman, DJ Patil, Debra Baker, Steve Greybush, Tamara Singleton, Steve Penny, Elana Fertig.
-
2
Ensemble Kalman Filter: status and new ideas
• EnKF and 4D-Var are in a friendly competition:
• Jeff Whitaker results: EnKF better than GSI (3D-Var)
• Canada (Buehner): 4D-Var & EnKF the same in the NH, and EnKF is better in the SH. Hybrid best.
• JMA (Miyoshi): at JMA, EnKF faster than 4D-Var, better in the tropics and NH, worse in the SH due to model bias.
• EnKF needs no adjoint model and no priors; it adapts to changes in obs; it can even estimate obs errors.
• We “plagiarize” ideas and methods developed for 4D-Var and adapt them to the LETKF (Hunt et al., 2007)
-
3
Whitaker: Comparison of a T190, 64-member EnKF with the T382 operational GSI, same observations (JCSDA, 2009)
-
4
There are several types of EnKF
1. Perturbed obs (e.g., Houtekamer and Mitchell)
2. Square root filters (e.g., Whitaker and Hamill)
Most filters get their speed from assimilating one observation at a time.
The LETKF (Hunt 2005) assimilates all obs simultaneously and gets its speed from local processing of each grid point.
Because it is a Transform Square root filter, the LETKF analysis ensemble is explicitly expressed as a linear combination of the forecast ensemble.
This has a number of nice properties, so here we will focus on the LETKF.
-
5
Diagnostic tools that improve LETKF/EnKF
We adapted ideas that were inspired by 4D-Var:
• No-cost smoother (Kalnay et al., Tellus 2007)
• “Outer loop”, nonlinearities and long windows (Yang and Kalnay)
• Accelerating the spin-up (Kalnay and Yang, 2008)
• Forecast sensitivity to observations (Liu and Kalnay, QJ, 2008)
• Analysis sensitivity to observations and cross-validation (Liu et al., QJ, 2009)
• Coarse analysis resolution without degradation (Yang et al., QJ, 2009)
• Low-dimensional model bias correction (Li et al., MWR, 2009)
• Simultaneous estimation of optimal inflation and observation errors (Li et al., QJ, 2009)
-
6
Local Ensemble Transform Kalman Filter(Ott et al, 2004, Hunt et al, 2004, 2007)
• Model independent (black box)
• Obs. assimilated simultaneously at each grid point
• 100% parallel: fast
• No adjoint needed
• 4D LETKF extension
[Diagram: starting from an initial ensemble, the Model produces the ensemble forecasts; the observation operator maps them to ensemble “observations”, which the LETKF combines with the Observations to produce the ensemble analyses that initialize the next forecast cycle.]
-
7
Perform data assimilation in a local volume, choosing observations
The state estimate is updated at the central grid point (red dot)
Localization based on observations
-
8
Perform data assimilation in a local volume, choosing observations
The state estimate is updated at the central grid point (red dot)
All observations (purple diamonds)within the local region are assimilated
Localization based on observations
The LETKF algorithm can be described in a single slide!
-
9
Local Ensemble Transform Kalman Filter (LETKF)
Forecast step: $x_{n,k}^b = M_n(x_{n-1,k}^a)$
Analysis step: construct
$X^b = [x_1^b - \bar{x}^b \,|\, \dots \,|\, x_K^b - \bar{x}^b]$;  $y_i^b = H(x_i^b)$;  $Y^b = [y_1^b - \bar{y}^b \,|\, \dots \,|\, y_K^b - \bar{y}^b]$
Locally: Choose for each grid point the observations to be used, and compute the local analysis error covariance and perturbations in ensemble space:
$\tilde{P}^a = [(K-1)I + Y^{bT} R^{-1} Y^b]^{-1}$;  $W^a = [(K-1)\tilde{P}^a]^{1/2}$
Analysis mean in ensemble space: $\bar{w}^a = \tilde{P}^a Y^{bT} R^{-1}(y^o - \bar{y}^b)$, and add $\bar{w}^a$ to each column of $W^a$ to get the analysis ensemble in ensemble space.
Globally: $X_n^a = X_n^b W^a + \bar{x}^b$
The new ensemble analyses in model space are the columns of $X_n^a$. Gathering the grid point analyses forms the new global analyses. Note that the output of the LETKF are analysis weights $\bar{w}^a$ and perturbation analysis matrices of weights $W^a$. These weights multiply the ensemble forecasts.
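The analysis step on this slide can be sketched in a few lines of NumPy. This is an illustrative toy (dense linear algebra, a linear observation operator, no localization), not the operational code; variable names follow the slide's notation:

```python
import numpy as np

def letkf_analysis(Xb, yo, H, R):
    """One local LETKF analysis step (after Hunt et al., 2007); toy sketch.

    Xb : (n, K) forecast ensemble in model space
    yo : (p,)   local observations
    H  : (p, n) linear observation operator (a nonlinear H(x) works the same way)
    R  : (p, p) observation error covariance
    Returns the (n, K) analysis ensemble.
    """
    K = Xb.shape[1]
    xb_mean = Xb.mean(axis=1)
    Xb_pert = Xb - xb_mean[:, None]          # forecast perturbations X^b
    Yb = H @ Xb_pert                         # obs-space perturbations Y^b
    yb_mean = H @ xb_mean
    Rinv = np.linalg.inv(R)
    # analysis error covariance in (K-dimensional) ensemble space
    Pa_tilde = np.linalg.inv((K - 1) * np.eye(K) + Yb.T @ Rinv @ Yb)
    # mean weights and symmetric square-root weight matrix
    wa = Pa_tilde @ Yb.T @ Rinv @ (yo - yb_mean)
    evals, evecs = np.linalg.eigh((K - 1) * Pa_tilde)
    Wa = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
    # back to model space: columns of Xa are the analysis ensemble
    return xb_mean[:, None] + Xb_pert @ (Wa + wa[:, None])
```

Because the background perturbations sum to zero, $W^a$ preserves the ensemble mean, so the analysis mean is exactly $\bar{x}^b + X^b \bar{w}^a$; with very accurate observations it fits them closely within the ensemble span.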
-
10
The 4D-LETKF produces an analysis in terms of weights of the ensemble forecast members at the analysis time tn, giving the trajectory that best fits all the observations in the assimilation window.
A linear combination of trajectories is approximately a trajectory. If it is close to the truth at the end of the window, it should be close to the truth throughout the trajectory (neglecting model errors).
-
11
No-cost LETKF smoother: apply at tn-1 the same weights found optimal at tn. It works for 3D- or 4D-LETKF.
The no-cost smoother makes possible:
• Outer loop (like in 4D-Var)
• “Running in place” (faster spin-up)
• Use of future data in reanalysis
• Ability to use longer windows
-
12
No-cost LETKF smoother tested on a QG model: It works!
[Figure: “Smoother” reanalysis]
LETKF analysis at time n: $x_n^a = x_n^f + X_n^f \bar{w}_n^a$
Smoother analysis at time n-1: $\tilde{x}_{n-1}^a = x_{n-1}^f + X_{n-1}^f \bar{w}_n^a$
This very simple smoother allows us to go backand forth in time within an assimilation window:it allows assimilation of future data in reanalysis
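The smoother equation above is literally one line of code: reuse the mean weights from tn at tn-1. The sketch below also demonstrates why it is consistent: under linear dynamics $M$, propagating the smoothed state reproduces the analysis at tn (the "linear combination of trajectories is a trajectory" argument). Names and shapes are illustrative:

```python
import numpy as np

def no_cost_smoother(xf_prev_mean, Xf_prev_pert, wa_mean):
    """No-cost smoother: apply at t_{n-1} the mean weights found optimal at t_n.

    xf_prev_mean : (n,)   forecast ensemble mean at t_{n-1}
    Xf_prev_pert : (n, K) forecast perturbation matrix at t_{n-1}
    wa_mean      : (K,)   mean analysis weights computed at t_n
    """
    return xf_prev_mean + Xf_prev_pert @ wa_mean
```

For a linear model the smoothed state at tn-1 evolves exactly into the LETKF analysis at tn, because the weights commute with the dynamics applied to mean and perturbations.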
-
13
Nonlinearities and “outer loop”
• The main disadvantage of EnKF is that it cannot handle nonlinear (non-Gaussian) perturbations and therefore needs short assimilation windows.
• It doesn’t have the outer loop so important in 3D-Var and 4D-Var (DaSilva, pers. comm. 2006)
Lorenz 3-variable model (Kalnay et al. 2007a Tellus), RMS analysis error:
                   4D-Var   LETKF
Window = 8 steps   0.31     0.30   (linear window)
Window = 25 steps  0.53     0.66   (nonlinear window)
Long windows + Pires et al. => 4D-Var clearly wins!
-
14
“Outer loop” in 4D-Var
-
15
Nonlinearities, “Outer Loop” and “Running in Place”
Outer loop: similar to 4D-Var: use the final weights to correct only the mean initial analysis, keeping the initial perturbations. Repeat the analysis once or twice. It centers the ensemble on a more accurate nonlinear solution.
Lorenz 3-variable model, RMS analysis error:
                   4D-Var   LETKF   LETKF + outer loop   LETKF + RIP
Window = 8 steps   0.31     0.30    0.27                 0.27
Window = 25 steps  0.53     0.66    0.48                 0.39
“Running in place” smoothes both the analysis and the analysis error covariance and iterates a few times…
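The outer-loop recipe above (correct only the mean initial condition with the final weights, keep the perturbations, rerun the nonlinear forecasts) can be sketched generically. The `step` and `weights_from_obs` callables are placeholders I introduce for illustration; any model and any weight-producing analysis could be plugged in:

```python
import numpy as np

def outer_loop(step, weights_from_obs, x0_mean, X0_pert, n_iter=2):
    """EnKF 'outer loop' sketch: iterate the nonlinear window forecast,
    correcting only the mean initial condition via the no-cost smoother.

    step(X)              : nonlinear model over the window, applied column-wise
    weights_from_obs(Xf) : mean analysis weights from the window's observations
    """
    for _ in range(n_iter):
        Xf = step(x0_mean[:, None] + X0_pert)   # ensemble forecast over the window
        wa = weights_from_obs(Xf)               # weights valid at the window end
        x0_mean = x0_mean + X0_pert @ wa        # correct the mean IC only
    return x0_mean
```

With a linear model and weights chosen by least squares against a verifying state, one iteration already centers the mean on the best fit within the perturbation subspace.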
-
16
Estimation of forecast sensitivity to observations without adjoint in an ensemble Kalman filter
Junjie Liu and Eugenia Kalnay, QJRMS October 2008
Inspired by Langland and Baker (2004) and Zhu and Gelaro (2008)
Ideal for diagnosing NCEP “5-day skill dropouts” because it remains valid for nonlinear perturbations
-
17
Motivation: Langland and Baker (2004)
The adjoint method proposed by Langland and Baker (2004) and Zhu and Gelaro (2007) quantifies the reduction in forecast error for each individual observation source.
The adjoint method detects the observations which make the forecast worse.
The adjoint method requires an adjoint model, which is difficult to obtain.
[Figure: observation impact by observing system, including AIRS shortwave 4.180 µm, AIRS shortwave 4.474 µm, AIRS longwave 14-13 µm, and AMSU/A channels]
-
18
Schematic of the observation impact on the reduction offorecast error
$e_{t|0} = x_{t|0}^f - x_t^a$;  $e_{t|-6} = x_{t|-6}^f - x_t^a$
The only difference between $e_{t|0}$ and $e_{t|-6}$ is the assimilation of observations at 00 hr.
Observation impact on the reduction of forecast error:
$J = \frac{1}{2}\left(e_{t|0}^T e_{t|0} - e_{t|-6}^T e_{t|-6}\right)$
(Adapted from Langland and Baker, 2004)
[Schematic: forecasts started from the -6 hr and 00 hr analyses, with errors $e_{t|-6}$ and $e_{t|0}$ verified against the analysis at time t; observations are assimilated at 00 hr.]
-
19
The ensemble forecast sensitivity method
Euclidean cost function: $J = \frac{1}{2}\left(e_{t|0}^T e_{t|0} - e_{t|-6}^T e_{t|-6}\right)$
The sensitivity of the cost function with respect to the assimilated observations:
$\frac{\partial J}{\partial v_0} = \tilde{K}_0^T X_{t|-6}^{fT}\left[e_{t|-6} + X_{t|-6}^f \tilde{K}_0 v_0\right]$, with $v_0 = y_0^o - h(x_{0|-6}^b)$
Cost function as a function of the obs increments: $J = \left\langle v_0, \frac{\partial J}{\partial v_0} \right\rangle$
With this formula we can predict the impact of observations on the forecasts!
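A minimal sketch of this prediction, under the assumption that the gain $\tilde{K}_0$ maps obs-space increments to ensemble weights so that $X^f_{t|-6}\tilde{K}_0 v_0$ is the forecast change due to assimilating $v_0$. The array shapes and the name `Ktilde` are illustrative, not from the talk; for clarity the impact is computed exactly from the two error vectors rather than through the gradient formula:

```python
import numpy as np

def predicted_obs_impact(Ktilde, Xf, e_m6, v0):
    """Predicted change in forecast error from assimilating the increments v0,
    without an adjoint model (sketch of the ensemble sensitivity idea).

    Ktilde : (K, p) gain from obs increments to ensemble weights (illustrative)
    Xf     : (n, K) forecast perturbations valid at time t from the -6 hr analysis
    e_m6   : (n,)   error at time t of the forecast started at -6 hr
    v0     : (p,)   observation increments y^o - h(x^b)
    """
    delta = Xf @ (Ktilde @ v0)   # forecast change implied by the weights
    e_0 = e_m6 + delta           # error of the forecast from the 00 hr analysis
    return 0.5 * (e_0 @ e_0 - e_m6 @ e_m6)
```

A negative value means the observations reduced the forecast error. Since $J$ is quadratic in $v_0$, this equals $\langle v_0, \tilde{K}_0^T X^{fT}(e_{t|-6} + \frac{1}{2}\delta)\rangle$ exactly.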
-
20
Test of the ability to detect a poor quality observation in the Lorenz 40-variable model
Like the adjoint method, the ensemble sensitivity method can detect the poor quality observation (11th observation location).
The ensemble sensitivity method has a stronger signal when the observation has a negative impact on the forecast.
Observation impact from LB (red) and from the ensemble sensitivity method (green)
[Panels: larger random error case and biased observation case]
-
21
Test of the ability to detect a poor quality observation for different forecast lengths
After 2 days the adjoint has the wrong sensitivity sign!
The ensemble sensitivity method has a strong signal even after forecast error has saturated!
[Panels at 2 days, 5 days and 20 days: larger random error case and biased observation case]
-
22
How can we possibly detect bad observations even after all skill is lost??? (Liu and Kalnay, 2009)
After 20 days there is no forecast skill, but the ensemble sensitivity still detects the wrong observation.
The ensemble sensitivity is based on the assumption that the analysis weights can be used in the forecasts. This is accurate even after forecast error has saturated (triangles).
As a result we can identify a bad observation even after forecast skill is lost.
[Figure at 20 days: Mean Square Error of the -6 hr weighted forecasts (diamonds), MSE of the 0 hr ensemble mean (circles), and the MS difference between the ensemble mean and the weighted forecasts (triangles), i.e., the error made by using the -6 hr weights in the forecasts.]
-
23
Analysis sensitivity to observations and exact Cross-Validation in an EnKF
Junjie Liu, E. Kalnay, T. Miyoshi and C. Cardinali, QJRMS 2009
Inspired by Cardinali et al., 2004
-
24
Observation quality control using cross-validation
Experimental design: the observation error standard deviation at the 11th point is 4 times larger than at the others.
* The difference between the predicted observation $y_i^{a(-i)}$ (the analysis computed without the ith observation) and the actual observation $y_i^o$ is larger when the ith observation has larger error (11th point).
* It does not need much computational time, because the cross-validated residuals follow from the self-sensitivities $S_{ii}$:
$\frac{1}{T}\sum_{t=1}^{T}\left(y_i^o - y_i^{a(-i)}\right)^2 = \frac{1}{T}\sum_{t=1}^{T}\frac{\left(y_{i,t}^o - y_{i,t}^a\right)^2}{\left(1 - S_{ii,t}\right)^2}, \quad i = 1,\dots,m$
[Figure: cross-validation statistic plotted against grid points, peaking at the 11th point]
-
25
Observation impact based on self-sensitivity & the observationimpact from adjoint and ensemble sensitivity method
[Schematic: forecasts verified against the analysis at time t, with errors $e_{t|-6} = x_{t|-6}^f - x_t^a$ and $e_{t|0} = x_{t|0}^f - x_t^a$ (Langland and Baker, 2004; Liu and Kalnay, 2008); the self-sensitivity approach instead compares forecasts made from the analysis using the Total Obs. with forecasts using (Total-1) Obs. at 00 hr.]
The adjoint and ensemble sensitivity method:
* The cost function reflects the impact of all theobservations assimilated at 00hr on the forecasterror difference (model space) at time t.
* The cost function is rewritten as a function of the observations assimilated at 00 hr.
* The cost function reflects the impact of the ithobservation assimilated at 00hr on the forecast errordifference (observation space) at time t.
* There is no need to rewrite the cost function.
The observation impact based on self-sensitivity:
$J_i = \left[\left(y_{i,t|0}^f - y_{i,t}^a\right)^2 - \left(y_{i,t|0}^{f(-i)} - y_{i,t}^a\right)^2\right]$
$J = \left[\left(x_{t|0}^f - x_t^a\right)^2 - \left(x_{t|-6}^f - x_t^a\right)^2\right]$
with $e_{t|0}^{f(-i)} = y_{i,t|0}^{f(-i)} - y_{i,t}^a$ and $e_{t|0}^f = y_{i,t|0}^f - y_{i,t}^a$
-
26
Coarse analysis with interpolated weights (Yang et al., 2008)
• In EnKF the analysis is a weighted average of the forecast ensemble
• We performed experiments with a QG model interpolating weights, compared to interpolating analysis increments
• Coarse grids of 11%, 4% and 2% of the analysis points
• Weight fields vary on large scales: they interpolate very well
1/(3x3) = 11% analysis grid
-
27
Weight interpolation versus Increment interpolation
With increment interpolation, the analysis degrades quickly…
With weight interpolation, there is almost no degradation!
The LETKF maintains balance and conservation properties.
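The weight-interpolation idea is simple to express in code: compute the K mean-weight values only at coarse points, interpolate each weight component as a smooth field, and form the analysis at every full-resolution point with its own interpolated weight vector. A minimal 1-D sketch (linear interpolation via `np.interp`; names are illustrative):

```python
import numpy as np

def analysis_from_coarse_weights(xf_mean, Xf_pert, grid, coarse_idx, W_coarse):
    """Full-resolution analysis mean from weights computed on a coarse grid.

    xf_mean    : (n,)   forecast mean on the full grid
    Xf_pert    : (n, K) forecast perturbations on the full grid
    grid       : (n,)   coordinates of the full grid
    coarse_idx : indices of the coarse analysis points
    W_coarse   : (len(coarse_idx), K) mean weights at those points
    """
    n, K = Xf_pert.shape
    W_full = np.empty((n, K))
    for k in range(K):   # interpolate each weight component as a smooth field
        W_full[:, k] = np.interp(grid, grid[coarse_idx], W_coarse[:, k])
    # local analysis: xa(s) = xf(s) + Xf(s) . w(s)
    return xf_mean + np.einsum('nk,nk->n', Xf_pert, W_full)
```

When the weight field is spatially constant this reduces exactly to the full-grid analysis with that weight vector, which is the limiting case the sanity check below uses.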
-
28
Impact of coarse analysis on accuracy
With increment interpolation, the analysis degrades.
With weight interpolation, there is no degradation; the analysis is actually slightly better!
-
29
Model error: comparison of methods to correct model bias and inflation
Hong Li, Chris Danforth, Takemasa Miyoshi, and Eugenia Kalnay, MWR (2009)
Inspired by the work of Dick Dee, but with model errors estimated in model space, not in obs space
-
30
Model error: If we assume a perfect model in EnKF, we underestimate the analysis errors (Li, 2007)
[Figure: analysis errors for the imperfect model (obs from the NCEP-NCAR Reanalysis, NNR) versus the perfect SPEEDY model]
-
31
— Why is EnKF vulnerable to model errors?
In the theory of the Extended Kalman filter, the forecast error is represented by the growth of the errors in the initial conditions plus the model errors:
$P_i^f = M P_{i-1}^a M^T + Q$
However, in the ensemble Kalman filter, the error estimated by the ensemble spread can only represent the first type of error:
$P_i^f \approx \frac{1}{K-1}\sum_{i=1}^{K}\left(x_i^f - \bar{x}^f\right)\left(x_i^f - \bar{x}^f\right)^T$
The ensemble spread is ‘blind’ to model errors.
[Schematic: with an imperfect model, the ensemble members and the ensemble mean drift away from the truth together, so the true forecast error exceeds the ensemble spread.]
-
32
We compared several methods to handle bias and random model errors:
the Low Dimensional Method to correct the bias (Danforth et al., 2007) combined with additive inflation.
[Figure: results for the imperfect model versus the perfect model]
-
33
Simultaneous estimation of EnKF inflation and obs errors in the presence of model errors
Hong Li, Miyoshi and Kalnay (QJ, 2009)
Any data assimilation scheme requires accurate statistics for the observation and background errors (usually tuned or from gut feeling).
EnKF needs inflation of the background error covariance: tuning is expensive.
Wang and Bishop (2003) and Miyoshi (2005) proposed a technique to estimate the covariance inflation parameter online. It works well if the obs errors are accurate.
We introduce a method to simultaneously estimate obs errors and inflation.
Inspired by Houtekamer et al. (2001) and Desroziers et al. (2005)
-
34
Diagnosis of observation error statistics
Houtekamer et al. (2001), well-known statistical relationship (OMB*OMB):
$\langle d_{o-b} d_{o-b}^T \rangle = H P^b H^T + R$
Desroziers et al. (2005) introduced two new statistical relationships:
OMA*OMB: $\langle d_{o-a} d_{o-b}^T \rangle = R$
AMB*OMB: $\langle d_{a-b} d_{o-b}^T \rangle = H P^b H^T$
These relationships are correct if the R and B statistics are correct and the errors are uncorrelated!
With inflation: $H P^b H^T \rightarrow \Delta\, H P^b H^T$, with $\Delta > 1$
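The three innovation relations can be verified with a small Monte Carlo experiment: simulate many assimilation cycles with known B and R and a gain that uses the true statistics, and check that the sample innovation covariances converge to the predicted matrices. Everything below is a toy setup of my own construction (identity H, diagonal B):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
B = np.diag([2.0, 1.0, 0.5])       # true background error covariance
R = 0.4 * np.eye(n)                # true observation error covariance
H = np.eye(n)                      # direct observation of every variable
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # gain using the TRUE statistics

T = 50_000                         # simulated assimilation cycles
Eb = rng.multivariate_normal(np.zeros(n), B, size=T)   # background errors
Eo = rng.multivariate_normal(np.zeros(n), R, size=T)   # observation errors
D_ob = Eo - Eb @ H.T               # obs-minus-background innovations, one per row
D_ab = D_ob @ (H @ K).T            # analysis-minus-background, in obs space
D_oa = D_ob - D_ab                 # obs-minus-analysis

cov = lambda X, Y: X.T @ Y / T
# expected: cov(D_ob, D_ob) -> H B H^T + R ;  cov(D_oa, D_ob) -> R ;
#           cov(D_ab, D_ob) -> H B H^T
```

If the gain were built from wrong statistics, the OMA*OMB and AMB*OMB relations would fail, which is exactly what makes them usable as diagnostics.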
-
35
Diagnosis of observation error statistics
Here we use a simple KF to estimate both $\Delta$ and $\sigma_o$ online.
OMA*OMB: $(\tilde{\sigma}^o)^2 = d_{o-a}^T d_{o-b}\,/\,p = \frac{1}{p}\sum_{j=1}^{p}\left(y_j^o - y_j^a\right)\left(y_j^o - y_j^b\right)$
OMB$^2$: $\tilde{\Delta}^o = \frac{d_{o-b}^T d_{o-b} - \mathrm{Tr}(R)}{\mathrm{Tr}(H P^b H^T)}$
AMB*OMB: $\tilde{\Delta}^o = \sum_{j=1}^{p}\left(y_j^a - y_j^b\right)\left(y_j^o - y_j^b\right)\,/\,\mathrm{Tr}(H P^b H^T)$
Transposing, we get “observations” of $\Delta$ and $\sigma_o^2$.
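These batch "observations" of $\sigma_o^2$ and $\Delta$ are one-liners given the innovation vectors. The sketch below assumes a scalar obs error variance ($R = \sigma_o^2 I$) and an uninflated ensemble estimate of $H P^b H^T$; the function and variable names are illustrative, not from the paper:

```python
import numpy as np

def observe_sigma2_and_delta(d_ob, d_oa, d_ab, tr_R, tr_HPbHT):
    """One-batch 'observations' of the obs error variance and the inflation,
    from p simultaneous innovations (sketch; assumes R = sigma_o^2 I).

    d_ob, d_oa, d_ab : (p,) obs-minus-background, obs-minus-analysis,
                        analysis-minus-background increments in obs space
    tr_R             : trace of the specified R
    tr_HPbHT         : trace of the uninflated ensemble estimate of H P^b H^T
    """
    p = d_ob.size
    sigma_o2 = (d_oa @ d_ob) / p                         # OMA*OMB
    delta_from_omb2 = (d_ob @ d_ob - tr_R) / tr_HPbHT    # OMB^2
    delta_from_amb = (d_ab @ d_ob) / tr_HPbHT            # AMB*OMB
    return sigma_o2, delta_from_omb2, delta_from_amb
```

In the talk's method these noisy per-cycle values are then fed to a simple scalar KF to produce smooth online estimates.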
-
36
SPEEDY model: online estimated observational errors, with each variable's observation error initially specified as 2 instead of the correct value of 1.
The originally wrongly specified R quickly converges to the correct value of R (in about 5-10 days).
-
37
Estimation of the inflation $\Delta$
Estimated inflation: using a perfect R and estimating $\Delta$ adaptively, versus using an initially wrong R but estimating both adaptively.
After R converges, the time dependent inflation factors are quite similar.
-
38
Tests with LETKF with an imperfect L40 model: random errors added to the model

Error amplitude | A: true σo²=1.0,    | B: true σo²=1.0, | C: adaptive σo²,
(random)        | (tuned) constant Δ  | adaptive Δ       | adaptive Δ
    a           |   Δ      RMSE       |   Δ      RMSE    |   Δ      RMSE    σo²
    4           |  0.25    0.36       |  0.27    0.36    |  0.39    0.38    0.93
   20           |  0.45    0.47       |  0.41    0.47    |  0.38    0.48    1.02
  100           |  1.00    0.64       |  0.87    0.64    |  0.80    0.64    1.05

The method works quite well even with very large random errors!
-
39
Tests with LETKF with an imperfect L40 model: biases added to the model
The method works well for low biases, but less well for large biases: model bias needs to be accounted for by a separate bias correction.

Error amplitude | A: true σo²=1.0,    | B: true σo²=1.0, | C: adaptive σo²,
(bias)          | (tuned) constant Δ  | adaptive Δ       | adaptive Δ
    β           |   Δ      RMSE       |   Δ      RMSE    |   Δ      RMSE    σo²
    1           |  0.35    0.40       |  0.31    0.42    |  0.35    0.41    0.96
    4           |  1.00    0.59       |  0.78    0.61    |  0.77    0.61    1.01
    7           |  1.50    0.68       |  1.11    0.71    |  0.81    0.80    1.36
-
40
Summary
• EnKF and 4D-Var give similar results in Canada and at JMA, except for model bias (Buehner et al., Miyoshi et al.)
• EnKF is better than GSI with a half-resolution model and 64 members. Computationally competitive (Whitaker)
• Many ideas to further improve EnKF were inspired by 4D-Var:
– No-cost smoothing and “running in place”
– A simple outer loop to deal with nonlinearities
– Forecast sensitivity to observations without an adjoint model
– Analysis sensitivity and exact cross-validation
– Coarse resolution analysis without degradation
– Correction of model bias combined with additive inflation gives the best results
– Can estimate simultaneously optimal inflation and obs. errors
-
41
Extra Slides on Low Dim Method
-
42
Bias removal schemes (Low Dimensional Method)
2.3 Low-dim method (Danforth et al., 2007: Estimating and correcting global weather model error. Mon. Wea. Rev., 2007)
• Generate a long time series of model forecast minus reanalysis from the training period: start the model from the NNR state at t = 0, run it to t = 6 hr, and compare the forecast $x^f$ with the verifying NNR state (taken as truth), giving the estimated error $x_{6hr}^e = x^f - x_{6hr}^{NNR}$.
We collect a large number of estimated errors and estimate from them the bias, etc.:
$x_{n+1}^f - x_{n+1}^a = \left[M(x_n^a) - M(x_n^t)\right] + b + \sum_{l=1}^{L}\alpha_{n,l}\, e_l + \sum_{m=1}^{M}\beta_{n,m}\, f_m$
where the first bracket is the forecast error due to the error in the initial conditions, $b$ is the time-mean model bias, the $e_l$ terms represent the diurnal model error, and the $f_m$ terms the state-dependent model error.
-
43
Low-dimensional method
Include bias, diurnal and state-dependent model errors:
$\text{model error} = b + \sum_{l=1}^{2}\alpha_{n,l}\, e_l + \sum_{m=1}^{10}\beta_{n,m}\, f_m$
Having a large number of estimated errors allows us to estimate the global model error beyond the bias.
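Fitting this low-dimensional error model from a training set of 6-hr errors amounts to a time mean (the bias $b$) plus an EOF analysis of the error anomalies (an SVD). A minimal sketch of that idea, with illustrative names and a synthetic training set in the test:

```python
import numpy as np

def fit_low_dim_error_model(E, n_eof=2):
    """Estimate the bias b and the leading error EOFs e_l from a training set
    of 6-hr forecast errors (sketch of the Danforth et al., 2007 idea).

    E : (T, n) matrix of forecast-minus-reanalysis errors over the training period
    Returns the bias b (n,) and the EOF patterns as rows of an (n_eof, n) array.
    """
    b = E.mean(axis=0)                    # time-mean model bias
    A = E - b                             # error anomalies
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return b, Vt[:n_eof]                  # leading EOFs (e.g., diurnal errors)

def correct_forecast(xf, b, eofs, alphas):
    """Subtract the bias and the EOF-projected error estimate from a forecast."""
    return xf - b - alphas @ eofs
```

With a synthetic error series built as bias plus an oscillating pattern, the fit recovers the bias and the pattern (up to the usual EOF sign ambiguity).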
-
44
SPEEDY 6 hr model errors against NNR (diurnal cycle), 1987 Jan 1 - Feb 15
• For temperature at lower levels, in addition to the time-independent bias, SPEEDY has diurnal cycle errors because it lacks diurnal radiation forcing.
[Figure: leading EOFs of the 925 mb temperature error anomalies, with principal components pc1 and pc2]
Error anomalies: $x_{6hr}^{e\,\prime} = x_{6hr}^e - \overline{x_{6hr}^e}$