INTELLIGENT DATA REDUCTION ALGORITHMS FOR REAL-TIME DATA ASSIMILATION Xiang Li, Rahul Ramachandran, Sara Graves ITSC/University of Alabama in Huntsville Bradley Zavodsky ESSC/University of Alabama in Huntsville Steven Lazarus, Mike Splitt, Mike Lueken Florida Institute of Technology May 5, 2009


Page 1

INTELLIGENT DATA REDUCTION ALGORITHMS FOR REAL-TIME DATA ASSIMILATION

Xiang Li, Rahul Ramachandran, Sara Graves
ITSC/University of Alabama in Huntsville

Bradley Zavodsky
ESSC/University of Alabama in Huntsville

Steven Lazarus, Mike Splitt, Mike Lueken
Florida Institute of Technology

May 5, 2009

Page 2

Data Reduction

• It is common practice to remove or combine some high spatial and temporal resolution observations to reduce data volume in the data assimilation (DA) process, because of:

  – High computational resources required for large-volume data sets (exponential increase with data volume)

  – Data redundancy in large-volume, high-resolution observations
      – Local spatial correlation of satellite data
      – Observation data resolution exceeds assimilation grid resolution

  – Reducing data redundancy may improve analysis quality (Purser et al., 2000)

Page 3

Computational Resources Required for Data Assimilation

[Figure: computational resources (little to lot) plotted against analysis technique, data volume (little to lot), and horizontal resolution (80 km to 1 km). Analysis techniques shown: Successive Corrections, Statistical Interpolation, 3D-Var, 4D-Var.]

Page 4

Need for ‘new’ Data Reduction Techniques

• Current data thinning approaches
  – Sub-sampling
  – Random sampling
  – Super-obbing (sub-sampling with averaging)

• Limitations
  – All data points are treated equally
  – Observations may differ in their information content and in their contribution to analysis performance

• Intelligent data thinning algorithms
  – Reduce the number of data points required for an analysis
  – Maintain the fidelity of the analysis (keep the most important data points)
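The three conventional thinning approaches listed on this slide can be sketched in a few lines. This is a minimal illustration on hypothetical 1-D `(position, value)` pairs; the function names and the sample data are assumptions for illustration, not code from the presentation.

```python
import random

def subsample(obs, step, offset=0):
    """Keep every `step`-th observation, starting at index `offset`."""
    return obs[offset::step]

def random_sample(obs, count, seed=0):
    """Keep `count` observations chosen uniformly at random (input order preserved)."""
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(obs)), count))
    return [obs[i] for i in keep]

def superob(obs, step):
    """Replace each block of `step` consecutive observations with its mean."""
    out = []
    for start in range(0, len(obs), step):
        block = obs[start:start + step]
        x_mean = sum(x for x, _ in block) / len(block)
        v_mean = sum(v for _, v in block) / len(block)
        out.append((x_mean, v_mean))
    return out

# Hypothetical 1-D observations: (position, value) pairs.
obs = [(float(i), float(i * i)) for i in range(9)]
thinned_sub = subsample(obs, 3)
thinned_rand = random_sample(obs, 3)
thinned_super = superob(obs, 3)
```

None of the three functions looks at the observed values when deciding what to keep, which is the limitation the slide points out.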

Page 5

Example

High-volume data from satellite platforms (e.g., infrared-based SST, scatterometer winds) carry redundant observations and are computationally expensive to assimilate.

Analyses derived from simple subsampling of the data can be inconsistent and are not optimally efficient.

[Figure: analyses using the same data subsampling interval, but shifted.]

Simple subsampling strategies are susceptible to missing a ‘significant’ data sample.
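The sensitivity to a shifted sampling grid can be reproduced with a toy case. The sketch below assumes a hypothetical 1-D field with one narrow feature; the field, the interval of 5, and the two offsets are illustrative choices, not values from the slides.

```python
import math

def field(x):
    """Hypothetical smooth background plus one narrow warm feature at x = 22."""
    return 10.0 + 5.0 * math.exp(-((x - 22.0) / 1.5) ** 2)

positions = list(range(40))
values = [field(x) for x in positions]

def subsample_max(step, offset):
    """Largest value retained when keeping every `step`-th sample from `offset`."""
    return max(values[offset::step])

# Same subsampling interval, two different starting offsets.
peak_kept = subsample_max(5, 2)    # grid 2, 7, ..., 22, ... lands on the feature
peak_missed = subsample_max(5, 0)  # grid 0, 5, ..., 20, 25, ... straddles it
```

With the offset of 2 the thinned set contains the feature's peak; with the offset of 0 it retains only the feature's weak flanks, so two analyses built from the "same" thinning would disagree.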

Page 6

Intelligent data thinning algorithms

• Objective: retain in the thinned data set the samples that have high information content and a large impact on the analysis.

• Assumption: samples with high local variance contain high information content.

• Approach: use a synthetic test to determine and validate the optimal thinning strategy, then apply it to real satellite observations.
  – Synthetic data test: truncated Gaussian
  – Real data experiment: Atmospheric Infrared Sounder (AIRS) profiles
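The local-variance assumption can be made concrete with a small sketch. The window size, the clipping at the edges, and the sample field are assumptions chosen for illustration; the slides do not specify how the variance window is defined.

```python
def local_variance(values, half_width=1):
    """Variance of each sample's neighborhood (window clipped at the edges)."""
    n = len(values)
    out = []
    for i in range(n):
        window = values[max(0, i - half_width):min(n, i + half_width + 1)]
        mean = sum(window) / len(window)
        out.append(sum((v - mean) ** 2 for v in window) / len(window))
    return out

# Hypothetical 1-D field: flat, then a sharp step, then flat again.
values = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
lv = local_variance(values)

# Rank samples from highest to lowest local variance.
ranked = sorted(range(len(values)), key=lambda i: lv[i], reverse=True)
```

The two samples on either side of the step rank highest, while samples in the flat regions have zero local variance.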

Page 7

Synthetic Data Test: Truncated Gaussian

• Explicitly defined truth and background fields
• Direct thinning method
  – 35 observations sampled to find the 5 observations yielding the best analysis (1D variational approach)
  – About 325,000 unique spatial combinations (C(35, 5) = 324,632)
• First guess: base of the Gaussian function
• Observations: created by adding white noise to the truth

[Figure legend: first guess, truth, analysis, optimal observation locations]
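The direct thinning search can be sketched as a brute-force loop over all observation subsets. To stay short and fast, this sketch uses 15 candidate observations and subsets of 3 rather than the slide's 35 and 5, and it substitutes a simple optimal-interpolation update with a Gaussian background covariance for the 1D variational analysis. The truncated-Gaussian definition, the length scale, and the error variances are all assumed values, not those used by the authors.

```python
import itertools
import math
import random

def solve(a, b):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

# Hypothetical setup (not from the slides): 15 grid points, choose the best 3.
xs = [float(i) for i in range(-7, 8)]
truth = [max(0.0, math.exp(-0.5 * (x / 2.0) ** 2) - 0.2) for x in xs]
background = [0.0] * len(xs)                      # first guess: base of the Gaussian
rng = random.Random(1)
obs = [t + rng.gauss(0.0, 0.05) for t in truth]   # truth plus white noise

LENGTH, VAR_B, VAR_O = 2.0, 1.0, 0.05 ** 2        # assumed length scale and variances

def cov(xi, xj):
    """Assumed Gaussian background error covariance."""
    return VAR_B * math.exp(-0.5 * ((xi - xj) / LENGTH) ** 2)

def analysis_mse(combo):
    """Analyze using only the observations in `combo`; return MSE against truth."""
    a = [[cov(xs[i], xs[j]) + (VAR_O if i == j else 0.0) for j in combo] for i in combo]
    innovations = [obs[i] - background[i] for i in combo]
    weights = solve(a, innovations)
    err = 0.0
    for g, x in enumerate(xs):
        value = background[g] + sum(cov(x, xs[i]) * w for i, w in zip(combo, weights))
        err += (value - truth[g]) ** 2
    return err / len(xs)

# Exhaustive search over every 3-observation subset.
combos = list(itertools.combinations(range(len(xs)), 3))
best_mse, best_combo = min((analysis_mse(c), c) for c in combos)
```

The search space grows quickly: this toy case has 455 subsets, while choosing 5 of 35 observations gives `math.comb(35, 5)` = 324,632, which is why exhaustive search is practical only for a synthetic test.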

Page 8

Synthetic Data Test: Truncated Gaussian (cont’d)

• Optimal observation configuration retains data at the:
  – peak
  – gradient
  – anchor points (where the gradient changes most sharply)

• Dependent on key elements of the analysis itself:
  – length scale (L)
  – quality of background and observations

Lesson Learned:

Thinned data samples should combine homogeneous points, gradient points, and anchor points for optimal performance, and a dynamic length scale should be applied to each thinned data set.

Page 9

Intelligent Data Reduction Algorithms

• Earlier versions of intelligent data thinning algorithms (IDT, DADT, mDADT)

• Density-Balanced Data Thinning (DBDT)
  – Three metrics are calculated for each data sample, and the samples are placed in one priority queue per metric:
      – Thermal Front Parameter (TFP): high values indicate a rapid change of the temperature gradient and mark ‘anchor’ samples
      – Local Variance (LV): high values indicate gradient regions
      – Homogeneity: low values indicate homogeneous regions
  – Data selected from the three metrics: the user determines the portion of samples taken from each metric
  – Radius of impact (R): used to keep the thinned data set spatially uniform; the distance between any two selected samples must be larger than R
  – Data selection process: select the top qualified samples from the priority queues, starting with the TFP queue, followed by the LV queue and the homogeneity queue

• DBDT performs best among these thinning algorithms
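The queue-based selection described on this slide can be sketched as follows. This is a simplified reading of the slide, not the authors' implementation: the metric values are assumed to be precomputed, `quotas` stands in for the user-chosen portions, and the sample data, key names, and function name are hypothetical.

```python
import heapq
import math

def dbdt_select(samples, quotas, radius):
    """Pick samples queue by queue (TFP, then LV, then homogeneity).

    samples: list of dicts with keys x, y, tfp, lv, hom (metrics precomputed).
    quotas:  number of samples to take from each metric's queue.
    radius:  a candidate is kept only if it lies farther than `radius`
             from every sample already selected.
    """
    selected = []

    def far_enough(i):
        return all(
            math.hypot(samples[i]["x"] - samples[j]["x"],
                       samples[i]["y"] - samples[j]["y"]) > radius
            for j in selected
        )

    # sign = -1 pops the highest values first; sign = +1 pops the lowest first.
    for key, sign in (("tfp", -1.0), ("lv", -1.0), ("hom", 1.0)):
        queue = [(sign * s[key], i) for i, s in enumerate(samples)]
        heapq.heapify(queue)
        picked = 0
        while queue and picked < quotas[key]:
            _, i = heapq.heappop(queue)
            if i not in selected and far_enough(i):
                selected.append(i)
                picked += 1
    return selected

# Hypothetical samples along a line, with made-up metric values.
tfp = [0.0, 0.0, 9.0, 0.0, 0.0, 0.0, 0.0, 8.0, 0.0, 0.0]
lv = [0.0, 1.0, 4.0, 8.0, 9.0, 8.0, 4.0, 1.0, 0.0, 0.0]
hom = [0.0, 1.0, 4.0, 8.0, 9.0, 8.0, 4.0, 1.0, 0.2, 0.1]
samples = [{"x": float(i), "y": 0.0, "tfp": tfp[i], "lv": lv[i], "hom": hom[i]}
           for i in range(10)]

selected = dbdt_select(samples, {"tfp": 2, "lv": 1, "hom": 2}, radius=1.5)
```

In this toy case the two anchor samples are taken first, then the strongest gradient sample that clears the radius test, then the two most homogeneous samples, so the thinned set covers all three kinds of region while staying spread out.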

Page 10

AIRS & ADAS: Our Real-World Testing Ground

• Atmospheric Infrared Sounder (AIRS)
  – NASA hyperspectral sounder
  – generates temperature and moisture profiles with ≈ 50-km resolution at nadir
  – each profile contains a pressure level above which quality data are found

• ARPS Data Assimilation System (ADAS)
  – version 5.2.5; Bratseth scheme
  – background comes from a short-term Weather Research and Forecasting (WRF) model forecast
  – error covariances:
      – background: standard short-term forecast errors cited in ADAS
      – observation: from the Tobin et al. (2006)* AIRS validation study
  – dynamic length scale (L) calculated from the average distance between nearest-neighbor observations

*D. C. Tobin, H. E. Revercomb, R. O. Knuteson, B. M. Lesht, L. L. Strow, S. E. Hannon, W. F. Feltz, L. A. Moy, E. J. Fetzer, and T. S. Cress, “ARM site atmospheric state best estimates for AIRS temperature and water vapor retrieval validation,” J. Geophys. Res., D09S14, pp. 1-18, 2006.
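The average nearest-neighbor distance behind the dynamic length scale can be computed as below. This sketch assumes planar coordinates in kilometers and a hypothetical regular grid of observation locations; a real implementation would use great-circle distances, and the slides do not say how L is derived from the average beyond being calculated from it.

```python
import math

def mean_nearest_neighbor_distance(points):
    """Average, over all points, of the distance to each point's closest neighbor."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)

# Hypothetical observation locations (km): a 50-km grid, then every other point.
full = [(50.0 * i, 50.0 * j) for i in range(6) for j in range(6)]
thinned = [p for p in full if p[0] % 100.0 == 0.0 and p[1] % 100.0 == 0.0]

spacing_full = mean_nearest_neighbor_distance(full)
spacing_thinned = mean_nearest_neighbor_distance(thinned)
```

Thinning increases the average spacing, so a length scale tied to it grows for the thinned data sets.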

Page 11

Thinning Strategies (11% of full)

• Subsample: takes the profile with the most retrieved levels within a 3x3 box

• Random: searches the observations and ensures that retained observations are separated by a user-defined distance
  – 10 permutations performed to create an ensemble

• DBDT: thins on 2-D pressure levels, then recombines the levels to form the 3-D structure
  – Thinning uses equivalent potential temperature (θe) to account for both temperature and moisture profiles
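To show why θe reflects both temperature and moisture, here is a common simplified textbook approximation. The slides do not state which θe formula was used, so the formula, the constants, and the function names below are assumptions, and more accurate formulations exist.

```python
import math

P0_HPA = 1000.0   # reference pressure (hPa)
KAPPA = 0.2854    # R_d / c_p for dry air (approximate)
LV = 2.5e6        # latent heat of vaporization (J/kg), approximate
CP = 1004.0       # specific heat of dry air at constant pressure (J/(kg K))

def potential_temperature(temp_k, pressure_hpa):
    """Dry potential temperature theta (K)."""
    return temp_k * (P0_HPA / pressure_hpa) ** KAPPA

def equivalent_potential_temperature(temp_k, pressure_hpa, mixing_ratio):
    """Simplified theta_e (K); mixing_ratio is water vapor in kg/kg."""
    theta = potential_temperature(temp_k, pressure_hpa)
    return theta * math.exp(LV * mixing_ratio / (CP * temp_k))

# Same temperature and pressure, dry versus moist air (hypothetical values).
theta_e_dry = equivalent_potential_temperature(280.0, 700.0, 0.0)
theta_e_moist = equivalent_potential_temperature(280.0, 700.0, 0.005)
```

Because θe rises with both temperature and water vapor content, gradients in either field show up in a single scalar that the thinning metrics can use.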

Page 12

Case Study Day: 12 March 2005

• 700 hPa temperature gradient in observations and background over the Midwest and northern Gulf of Mexico
• Observations and background show similar patterns

[Figure panels: 700 hPa AIRS temperature observations; 700 hPa WRF forecast temperatures (background)]

Page 13

700 hPa Temperature Analysis Comparison

[Figure panels: Subsample, Random, DBDT]

• Overall analysis increments are ±1.5 °C over the AIRS swath
• Largest differences between analyses are in the upper Midwest and over southern Canada

Page 14

Quantitative Results (Full vs. Thinned)

                     Full   Subsample   Random   DBDT
# Obs                 793          99      100     87
Analysis time (s)     244          56       56    106
L (km)                 80         146      147    152
θe MSE                N/A        0.60     0.56   0.36

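• Computation times are about 57-77% shorter for the thinned data sets (56-106 s versus 244 s for the full data set)
• MSEs compare the analysis from the full data set with the analysis from each thinned data set
• DBDT gives the best analysis with the fewest observations:
  – has a longer computation time than the other thinned sets (its thinning algorithm is more rigorous)
  – lowers the MSE by about 35-40% relative to the other thinning methods (0.36 versus 0.56-0.60), with roughly 1/10 of the observations in the full data set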
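The ratios quoted on this slide can be recomputed directly from the table values. The dictionary layout and variable names below are just one hypothetical way to hold those numbers.

```python
# Values copied from the results table (Full vs. thinned data sets).
results = {
    "Full":      {"n_obs": 793, "time_s": 244, "mse": None},
    "Subsample": {"n_obs": 99,  "time_s": 56,  "mse": 0.60},
    "Random":    {"n_obs": 100, "time_s": 56,  "mse": 0.56},
    "DBDT":      {"n_obs": 87,  "time_s": 106, "mse": 0.36},
}

full = results["Full"]
summary = {}
for name, row in results.items():
    if name == "Full":
        continue
    summary[name] = {
        # Fraction of the full data set's observations that was retained.
        "obs_fraction": row["n_obs"] / full["n_obs"],
        # Fraction of the full data set's analysis time that was saved.
        "time_saved": 1.0 - row["time_s"] / full["time_s"],
    }

# Relative MSE reduction of DBDT against the other two thinning strategies.
mse_cut_vs_subsample = 1.0 - results["DBDT"]["mse"] / results["Subsample"]["mse"]
mse_cut_vs_random = 1.0 - results["DBDT"]["mse"] / results["Random"]["mse"]
```

With these numbers, DBDT keeps about 11% of the observations and saves about 57% of the analysis time, while the subsample and random sets keep about 12.5% and save about 77%; the DBDT MSE is about 40% lower than the subsample MSE and about 36% lower than the random MSE.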

Page 15

Conclusions

• Intelligent data thinning strategies are important both for eliminating redundant observations, which may hinder convergence of DA schemes, and for reducing computation times
• Synthetic data tests have shown that observations must be retained in gradient, anchor, and homogeneous regions, and that results depend on key elements of the analysis system
• Analyses of AIRS thermodynamic profiles using different thinning strategies show DBDT to be the superior thinning technique

Page 16

Future Work

• Manuscript in review with Weather and Forecasting (AMS)
• Testing forecasts spawned from the various thinned analyses to see whether the superior DBDT analysis produces the best forecasts
• Demonstration of algorithm capabilities with respect to real-time data dissemination
• Use of the gradient-detecting portion of the algorithm to locate cloud edges for radiance assimilation

Page 17

Thank you for your attention. Are there any questions?