The General Data Assimilation Problem
Ronald Errico
Goddard Earth Sciences Technology and Research Center at Morgan State University
and
Global Modeling and Assimilation Office at NASA Goddard Space Flight Center
Outline
1. Definition of the general DA problem
2. DA as an application of Bayes’s Theorem
3. Interpretation of the background
4. The “observation operator” or “forward model”
5. The dynamical balance problem
6. Recommendations
The General Problem
Produce an estimate of the state of some system (e.g., the ocean or atmosphere) consistent with:
1. observations,
2. known physical relationships,
3. prior information,
accounting for statistics of possible errors in each.
The Character of Information
A. Observations are:
   1. always imperfect, sometimes with gross errors
   2. often indirect, or highly processed
   3. usually inadequate to completely define what we want
B. Physics is useful to:
   1. provide constraints
   2. relate what is observed to what is analyzed
   3. interpolate or extrapolate information
C. Background (prior) information is required to:
   1. account for previously analyzed observations
   2. provide additional “knowns” to determine all “unknowns”
   3. fill observational voids
“The most general way of describing information:”
As a probability density function (PDF or pdf)
Bayes’s Theorem (1763)
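For reference, the theorem in its standard form, writing x for the state to be estimated and y for the observations (these symbols are assumed here, chosen to match the y=H(x) notation used for the observation operator):

```latex
p(x \mid y) = \frac{p(y \mid x)\, p(x)}{p(y)}
```

In DA terms, p(x) plays the role of the prior (background) pdf, p(y | x) describes the observations given the state, and p(x | y) is the analysis pdf. The denominator p(y) only normalizes the result.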
PDFs of the Information in our DA problem
Interpretation of the background
1. The background is an estimate of the state to be analyzed prior to consideration of any new observations.
2. It is defined in the same state space as the analysis and thus presents a complete description of the state to be analyzed.
3. The best background is generally provided by a temporal extrapolation of the most recent past analysis to the new analysis time using a good, physically based forecast model (e.g., NWP).
4. This background can be considered an estimate valid at the analysis time based on all observations considered during the recent past and based on our physical understanding of the system analyzed.
5. For the above reason, the background is generally a better estimate of the state to be analyzed than is provided by many new observations.
6. Mathematically, the background is the same as an observation, but with an observation operator that is the identity operator; i.e., it is just another piece of information to be considered.
7. Due to the dynamics of the forecast model, background errors are generally correlated in space and time.
The observation operator H
1. Generally, we do not observe where or what we want to analyze.
2. We thus need to relate the observation to what we want to analyze, using some quantitative relationship that may be either statistically or physically formulated.
3. This relationship is called an “observation operator” or “forward model”, y=H(x).
4. Two examples of H are spatial interpolation and radiative transfer algorithms.
5. In general, these operators are imperfect: even if x=truth, H(x) would not yield the true y.
6. The difference y(true)-H(x(true)) is termed the representativeness error.
7. It is best considered as the error in the formulation of the observation operator, e.g., in the interpolation and radiative transfer algorithms.
8. Analysis at lower resolutions will tend to yield greater spatial interpolation errors and thus greater representativeness errors.
9. The representativeness error is often larger than the instrument error.
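A minimal sketch of points 4 to 8, assuming a one-dimensional grid, a made-up "true" field sin(x), and an observation located at x = 1.3 (all three are illustrative assumptions, not taken from the slides). The operator H here is plain linear interpolation; even when it is given the true gridded state, it does not reproduce the true observed value, and the mismatch grows on the coarser grid.

```python
import math

def interp_operator(grid_x, grid_vals, obs_x):
    """H(x): linearly interpolate gridded state values to the observation location."""
    for i in range(len(grid_x) - 1):
        x0, x1 = grid_x[i], grid_x[i + 1]
        if x0 <= obs_x <= x1:
            w = (obs_x - x0) / (x1 - x0)
            return (1.0 - w) * grid_vals[i] + w * grid_vals[i + 1]
    raise ValueError("observation lies outside the analysis grid")

def true_field(x):
    """Hypothetical continuous truth, used only for this illustration."""
    return math.sin(x)

def representativeness_error(dx, obs_x, length=3.0):
    """y(true) - H(x(true)) for a grid of spacing dx covering [0, length]."""
    n = int(round(length / dx)) + 1
    grid_x = [i * dx for i in range(n)]
    x_true_on_grid = [true_field(x) for x in grid_x]  # the true state, sampled on the grid
    y_true = true_field(obs_x)                        # what a perfect instrument would see
    return y_true - interp_operator(grid_x, x_true_on_grid, obs_x)

err_coarse = representativeness_error(dx=1.0, obs_x=1.3)
err_fine = representativeness_error(dx=0.25, obs_x=1.3)
print(f"coarse grid: {err_coarse:.4f}   fine grid: {err_fine:.4f}")
```

No instrument error enters this calculation; the whole difference comes from the formulation of H.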
A Bayesian Example
[Figure: panels labeled “Prior pdf”, “Observ. & Model pdfs”, and “Analysis pdf”, plotted against axes T1 and T2; further labels R, q, and T. Source: Errico et al., QJRMS 2000.]
Gaussian PDFs
Solution to the analysis problem for Gaussian errors
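A commonly used way to write this solution is sketched below. The symbols are assumptions in standard DA notation rather than ones recovered from the slide: x_b is the background, B the background error covariance, and R the observation error covariance (instrument plus representativeness error).

```latex
p(x \mid y) \propto \exp\left[-J(x)\right], \qquad
J(x) = \tfrac{1}{2}\,(x - x_b)^{\mathrm{T}} B^{-1} (x - x_b)
     + \tfrac{1}{2}\,\bigl(y - H(x)\bigr)^{\mathrm{T}} R^{-1} \bigl(y - H(x)\bigr)
```

Under unbiased Gaussian errors, the most probable state is the x that minimizes J(x).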
Solution for linear H
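For a linear operator the Gaussian solution has the familiar closed form x_a = x_b + K (y - H x_b) with K = B H^T (H B H^T + R)^-1. The sketch below works the scalar case; the function name, the symbols (x_b, B, R, K are standard notation assumed here), and the numbers are illustrative only.

```python
def scalar_analysis(xb, var_b, y, var_o, h=1.0):
    """Combine a background xb and an observation y = h*x + error (scalar case).

    var_b and var_o are the background and observation error variances.
    """
    gain = var_b * h / (h * var_b * h + var_o)  # K = B H^T (H B H^T + R)^-1
    xa = xb + gain * (y - h * xb)               # analysis = background + weighted innovation
    var_a = (1.0 - gain * h) * var_b            # analysis error variance
    return xa, var_a

# Background of 280.0 (error variance 1.0) and an observation of 282.0 (error variance 4.0).
xa, var_a = scalar_analysis(xb=280.0, var_b=1.0, y=282.0, var_o=4.0)
print(xa, var_a)
```

With these numbers the gain is 0.2, so the analysis moves one fifth of the way from the background toward the observation, and its error variance is smaller than that of either input. Setting h=1.0 treats the observation exactly like a second background, which is the sense in which the background is "just another piece of information."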
Reasonableness of the unbiased Gaussian assumption
1. Real errors are likely biased and non Gaussian to some degree.
2. If observations having gross errors can be identified and eliminated through quality control, errors for the remaining observations may be more approximately Gaussian.
3. If observational error biases can be estimated well, they can be eliminated by adding a correction to all observation values.
4. Many observations are a result of an averaging process. If this effectively averages errors, the net error will tend to be Gaussian according to the central limit theorem of statistics.
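Point 4 can be checked numerically. The sketch below assumes individual errors drawn from a uniform distribution (clearly non-Gaussian) and compares the kurtosis of single errors with that of 16-member averages; a Gaussian has kurtosis 3 and a uniform distribution 1.8. The distribution, the sample sizes, and the seed are arbitrary choices for illustration.

```python
import random

def kurtosis(samples):
    """Fourth central moment divided by the squared variance (3 for a Gaussian)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / (m2 * m2)

rng = random.Random(0)   # fixed seed so the run is repeatable
n_obs = 20000            # number of simulated observations
n_avg = 16               # raw errors averaged into each observation

raw = [rng.uniform(-1.0, 1.0) for _ in range(n_obs)]
averaged = [
    sum(rng.uniform(-1.0, 1.0) for _ in range(n_avg)) / n_avg
    for _ in range(n_obs)
]

k_raw = kurtosis(raw)
k_avg = kurtosis(averaged)
print(f"kurtosis of single errors:   {k_raw:.2f}")
print(f"kurtosis of averaged errors: {k_avg:.2f}")
```

The averaging only helps if the individual errors are reasonably independent; a bias shared by all of them survives the average, which is why point 3 treats bias separately.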
Implications of the Bayesian Approach
1. Unless the underlying distributions are simple, the problem is computationally intractable for large problems.
2. We see how the different information should be optimally combined.
3. We see what statistical knowledge is required as input.
4. Results may depend on shapes of distributions, not only their means and variances.
5. We see that selection of a “best” analysis can be somewhat ambiguous.
6. Multi-modality of the PDF can occur, particularly due to model non-linearity.
7. Any analysis has associated error statistics.
8. While an explicit Bayesian approach may be impractical, the Bayesian implications of other techniques should be considered.
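Points 5 and 6 can be illustrated with a brute-force Bayesian analysis of a single variable. The sketch assumes a Gaussian background, a Gaussian observation error, and a hypothetical nonlinear operator H(x) = x^2; all numbers are invented for the example. Because x = +1 and x = -1 both explain the observation, the analysis pdf has two modes, and its mean falls between them where the probability is very low.

```python
import math

def posterior_on_grid(xs, xb, sigma_b, y, sigma_o, h):
    """Evaluate prior times likelihood on a regular grid and normalize to unit area."""
    unnorm = [
        math.exp(-0.5 * ((x - xb) / sigma_b) ** 2
                 - 0.5 * ((y - h(x)) / sigma_o) ** 2)
        for x in xs
    ]
    dx = xs[1] - xs[0]
    total = sum(unnorm) * dx
    return [u / total for u in unnorm]

dx = 0.01
xs = [-3.0 + dx * i for i in range(601)]  # grid covering [-3, 3]
post = posterior_on_grid(xs, xb=0.5, sigma_b=1.0, y=1.0, sigma_o=0.2,
                         h=lambda x: x * x)

# Candidate "best" analyses: the posterior mean and the highest mode.
mean = sum(x * p for x, p in zip(xs, post)) * dx
best_index = max(range(len(xs)), key=post.__getitem__)
best_mode = xs[best_index]

# Local maxima of the analysis pdf (strict comparison with both neighbours).
modes = [xs[i] for i in range(1, len(xs) - 1)
         if post[i] > post[i - 1] and post[i] > post[i + 1]]

print("modes:", [round(m, 2) for m in modes])
print("highest mode:", round(best_mode, 2), " mean:", round(mean, 2))
```

This grid evaluation is only feasible because the state has one dimension, which is the practical side of point 1.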
Daley 1992
Consideration of Balance: Example of Geostrophic Adjustment
[Figure: geostrophic adjustment example, panel labeled “0 hour”.]
Consideration of primitive equations
dg/dt (t=0) = 0
Nonlinear Normal Mode Initialization
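The effect of the condition dg/dt (t=0) = 0 can be seen in a toy model of a single fast mode. The sketch assumes the mode obeys dg/dt = -i*omega*g + F with a constant forcing F standing in for the nonlinear terms; that simplification, the value of F, and the function names are all assumptions made for illustration. The 3.7 h period is borrowed from the external-mode example cited later in the slides.

```python
import cmath
import math

def fast_mode(g0, forcing, omega, t):
    """Exact solution of dg/dt = -i*omega*g + forcing for constant forcing."""
    g_bal = forcing / (1j * omega)  # the value of g for which dg/dt = 0
    return g_bal + (g0 - g_bal) * cmath.exp(-1j * omega * t)

period_h = 3.7
omega = 2.0 * math.pi / period_h           # radians per hour
forcing = 1.0 + 0.0j                       # hypothetical constant forcing, arbitrary units
times = [0.05 * k for k in range(149)]     # two periods, in hours

g_bal = forcing / (1j * omega)

def swing(g0):
    """Largest departure of g from its initial value over the time window."""
    return max(abs(fast_mode(g0, forcing, omega, t) - g0) for t in times)

swing_linear = swing(0.0 + 0.0j)  # linear initialization: g(t=0) = 0
swing_nonlinear = swing(g_bal)    # nonlinear condition: dg/dt(t=0) = 0

print(f"g(0) = 0      -> oscillation swing {swing_linear:.3f}")
print(f"dg/dt(0) = 0  -> oscillation swing {swing_nonlinear:.3f}")
```

Setting the fast mode to zero still leaves it oscillating, because the forcing immediately excites it; starting it at the value that zeroes its tendency removes the oscillation in this toy setting. In a real model the forcing depends on the evolving state, so the condition is applied iteratively and the balance holds only approximately.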
Why does balance matter in data assimilation?
1. Large initial imbalances will tend to create less accurate backgrounds
2. Balance can be exploited to relate u, v, T, ps (esp. in extra-tropics)
3. Errors in balanced initial conditions will tend to create balanced background errors, so the error statistics should reflect that; i.e., background errors of u, v, T, ps tend to be correlated, esp. in the extra-tropics.
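The simplest relation of the kind point 2 refers to is geostrophic balance, which ties the horizontal wind to the pressure gradient. The sketch below evaluates it for one location; the air density, the latitude, and the pressure gradient are illustrative assumptions, and operational schemes use more general balance relations than this.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s

def coriolis(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude)."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))

def geostrophic_wind(dpdx, dpdy, lat_deg, rho=1.2):
    """Geostrophic wind (u, v) in m/s from pressure gradients in Pa/m.

    u_g = -(1 / (rho * f)) * dp/dy,  v_g = (1 / (rho * f)) * dp/dx
    rho is an assumed near-surface air density in kg/m^3.
    """
    f = coriolis(lat_deg)
    return -dpdy / (rho * f), dpdx / (rho * f)

# Pressure falling by 1 hPa per 100 km toward the north, at 45 degrees N.
u_g, v_g = geostrophic_wind(dpdx=0.0, dpdy=-1.0e-3, lat_deg=45.0)
print(f"u_g = {u_g:.2f} m/s, v_g = {v_g:.2f} m/s")
```

Because f tends to zero toward the equator, the relation loses its usefulness in the tropics, consistent with the "esp. in extra-tropics" qualifier above.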
Implications of geostrophic adjustment
1. If wind, temperature, and pressure are not considered properly together, unrealistic gravity waves will be propagated and information will not be retained.
2. The problem is generally aggravated at high altitudes where the mass density is low. It is also very scale-dependent, especially vertically.
3. The fundamental balance is nonlinear, which is not straightforward to implement in a linear analysis scheme.
4. Many ways to mitigate this problem have been developed, each with its own advantages and disadvantages and computational issues.
5. Theoretical aspects of the problem are best described in the context of the balancing technique called “nonlinear normal mode initialization.”
6. Beware of claims that balancing is a solved problem!
Character of the DA Problem
1. A well-developed body of theory exists.
   Control theory, inverse modeling, Bayesian analysis.
   It is fundamental and foundational.
2. This theory is currently insufficient.
   The computational demand can be overwhelming.
   The required input statistics are not well known.
3. Gross approximations or unsupported assumptions may be required.
   Although “wrong,” they can be useful.
   Sometimes they create confusion.
4. Many techniques are available.
   Most are similar in a very general sense.
   Results are affected by details.
Basics
1. Fundamentals are foundational.
2. Statistical theory is critical.
3. Quality control is critical.
4. Consideration of covariances is critical.
5. Consideration of dynamic balance is critical.
6. Model error is not negligible.
7. Much model physics is not linear.
8. Model error is probably not white noise.
9. Experience counts!
10. Data assimilation is as much art as science.
Some References
Tarantola, A., 1987: Inverse problem theory: Methods for data fitting and model parameter estimation. Elsevier Science B. V. (See chapter 1 in the 1st edition, which is now out of print.)
Baker, N., 2000: Observation adjoint sensitivity and the adaptive targeting problem. Thesis, Naval Postgraduate School. (Very good explanation of how data assimilation utilizes observations.)
Daley, R., 1991: Atmospheric Data Analysis. Cambridge University Press, 420 pp. (Somewhat dated, but good and accurate presentation of many basics.)
Ghil, M., K. Ide, A. Bennett, P. Courtier, M. Kimoto, M. Nagata, M. Saiki, M. Sato, Eds., 1997: Data assimilation in meteorology and oceanography: Theory and practice. Meteorological Society of Japan, 386 pp. (Several short tutorial-like papers on various aspects of data assimilation.)
Linear Normal-Mode Initialization
Temperton and Williamson 1979
Daley 1991
Structures of two normal modes
g(t=0) = 0
NNMI
Errico 1997
Harmonic Dial for External m=4 Mode, Period = 3.7 h
[Figure panels: Without NNMI; With NNMI]
Errico 1997