Kostas Kalpakis Associate Professor Computer Science and Electrical Engineering Department University of Maryland Baltimore County April 5, 2011 Joint work with Shiming Yang and Yaacov Yesha Improving HYSPLIT Forecasts with Data Assimilation* *Supported in part by an IBM grant


Page 1: Kostas Kalpakis Associate Professor Computer Science and Electrical Engineering Department University of Maryland Baltimore County April 5, 2011 Joint

Kostas Kalpakis

Associate Professor

Computer Science and Electrical Engineering Department

University of Maryland Baltimore County

April 5, 2011

Joint work with Shiming Yang and Yaacov Yesha

Improving HYSPLIT Forecasts with Data Assimilation*

*Supported in part by an IBM grant.

Page 2: Outline

- Introduction
  - Motivation and Goal
  - Data Assimilation
- Our approach
  - State-Space Models
  - The NOAA HYSPLIT Model
  - The LETKF Algorithm
- Experiments and Evaluation
  - CAPTEX
  - California wildfires, August 2009
- Summary

Page 3: Motivation

- High-volume real-time sensor data streams for monitoring and forecasting applications are becoming ubiquitous
- The gap between predictions and real-time observations needs to be bridged
- Demands for environmental monitoring and hazard prediction are pressing
- Measurements from the thousands of sensors that underlie IBM’s “Smarter Planet” initiatives need to be incorporated into models of various geophysical processes

Page 4: Goals

Our goals are to
- incorporate a data assimilation capability into HYSPLIT
  - HYSPLIT is used extensively as a routine for many data products
- utilize in-situ and remotely sensed observations for improved forecasts
- apply the system to wildfire smoke prediction and monitoring
- develop an efficient data assimilation system using the SPADE framework of InfoSphere Streams for distributed high-performance platforms

Page 5: Data assimilation

Data assimilation is a set of techniques that
- incorporate real-world observations into the model analysis and forecast cycle
- help reduce model error growth (small corrections and short-range forecasts)
- improve the estimate of the model initial conditions for the next forecast cycle

Page 6: The state-space model

Model a system by

$$x_{t+1} = M_t(x_t) + u_t$$
$$y_{t+1} = H_{t+1}(x_{t+1}) + v_{t+1}$$

where
- $x_t$ is the state of the system at time $t$
- $y_t$ is the observation of the system at time $t$
- $M_t$ is the model operator
- $H_t$ is the observation operator
- $u_t$ and $v_t$ are the model and observation noise random processes, typically white noise with covariances $Q_t$ and $R_t$
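As an illustration of the state-space model on this slide, the following minimal sketch generates a state trajectory and its noisy observations. It assumes the operators are passed in as plain Python callables and that both noise processes are zero-mean Gaussian; the function name `simulate` and all argument names are hypothetical and not part of the presentation.

```python
import numpy as np

def simulate(M, H, x0, Q, R, steps, seed=0):
    """Generate states x_1..x_steps and observations y_1..y_steps.

    M, H : callables standing in for the model and observation operators
    Q, R : covariance matrices of the model noise u and observation noise v
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    states, observations = [], []
    for _ in range(steps):
        # Model step: next state = model operator applied to state + model noise
        u = rng.multivariate_normal(np.zeros(len(x)), Q)
        x = M(x) + u
        # Observation step: observation operator applied to state + observation noise
        y_clean = H(x)
        v = rng.multivariate_normal(np.zeros(len(y_clean)), R)
        states.append(x)
        observations.append(y_clean + v)
    return np.array(states), np.array(observations)
```

With zero noise covariances the sketch reduces to repeatedly applying the model operator, which is a convenient sanity check before adding noise.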

Page 7: Data assimilation in state-space

- Data assimilation becomes an estimation problem
  - Find a maximum likelihood estimate of the trajectory of the system states given a set of observations
- The problem reduces to minimizing the cost function

$$J(x_t) = \left\| R_{t+1}^{-1/2} \left( y^{o}_{t+1} - H_{t+1}(M_t(x_t)) \right) \right\|^2$$

  where $y^{o}_{t+1}$ is the observation at time $t+1$
- Kalman filters, a recursive method, can minimize this cost function efficiently for low-dimensional state spaces with linear model and observation operators and Gaussian noise processes
- Otherwise, the problem is often computationally difficult

Page 8: Data assimilation via Kalman filters

Graphical view of data assimilation using Kalman filters

- Prediction step: $x^{b}_{t+1} = M_t(x^{a}_t)$
- Correction step: $x^{a}_{t+1} = x^{b}_{t+1} + K_{t+1} \left( y_{t+1} - H_{t+1} x^{b}_{t+1} \right)$, where $K_{t+1}$ is the Kalman gain

[Figure: background state, analysis state, and observation plotted against time]
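The prediction and correction steps on this slide can be sketched for the linear-Gaussian case as follows. This is a minimal sketch, assuming the model and observation operators are matrices `M` and `H`. The covariance propagation and the gain formula K = P_b H^T (H P_b H^T + R)^{-1} are the textbook Kalman filter equations and are not taken from the slide; the function name `kalman_step` is illustrative.

```python
import numpy as np

def kalman_step(x_a, P_a, y, M, H, Q, R):
    """One prediction/correction cycle of a linear Kalman filter.

    x_a, P_a : previous analysis state and its error covariance
    y        : new observation
    M, H     : model and observation operators (matrices, assumed linear)
    Q, R     : model and observation noise covariances
    """
    # Prediction step: propagate the analysis to get the background state
    x_b = M @ x_a
    P_b = M @ P_a @ M.T + Q
    # Kalman gain K = P_b H^T S^{-1}, with S = H P_b H^T + R (symmetric),
    # computed with a linear solve instead of an explicit inverse
    S = H @ P_b @ H.T + R
    K = np.linalg.solve(S, H @ P_b).T
    # Correction step: pull the background state toward the observation
    x_new = x_b + K @ (y - H @ x_b)
    P_new = (np.eye(len(x_b)) - K @ H) @ P_b
    return x_new, P_new
```

For a scalar state with unit background variance and unit observation variance the gain is 0.5, so the analysis lands halfway between the background and the observation.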

Page 9: The NOAA HYSPLIT Model

HYSPLIT (Hybrid Single Particle Lagrangian Integrated Trajectory Model)
- A modeling system that computes air parcel trajectories and the dispersion and deposition of pollutants
- Computes particle dispersion with the puff model or the particle model
- Needs meteorology data and emission source information
- Has been validated using ground truth observations*
- Used as a routine for various data products
  - Air Quality Index (AQI)
  - Smoke Forecast System (SFS)

*R.R. Draxler, J.L. Heffter, and G.D. Rolph. DATEM: Data Archive of Tracer Experiments and Meteorology. August 2001. http://www.arl.noaa.gov/DATEM.php, last checked Jul. 2010.

Page 10: System design

[Figure: system design]

Page 11: Data assimilation for HYSPLIT

- Utilize HYSPLIT as a model operator in a state-space model and assimilate observations into HYSPLIT
- First, we need to carefully define the system state, so that we can extract it, modify it, and restart HYSPLIT
- Second, since the model operator is non-linear and the system state is very large, standard extended Kalman filters are an expensive option for data assimilation
  - We use the LETKF algorithm, an ensemble transform Kalman filter

Page 12: Data assimilation for HYSPLIT

Use
- the mass of the particles in HYSPLIT as the system state
- the grid concentrations as the default observation operator
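To make the choice of state and observation operator on this slide concrete, the sketch below maps particle masses to grid concentrations by summing the mass in each cell and dividing by the cell volume. It is only an illustration under simplifying assumptions (a regular grid starting at the origin, uniform cell size, no puff spreading) and is not HYSPLIT's own concentration routine; the function name `grid_concentrations` and its parameters are hypothetical.

```python
import numpy as np

def grid_concentrations(masses, positions, grid_shape, cell_size, cell_volume):
    """Map particle masses (the state) to grid concentrations (the observation).

    masses      : (n,) particle masses
    positions   : (n, d) particle coordinates, same length unit as cell_size
    grid_shape  : tuple of d cell counts
    """
    masses = np.asarray(masses, dtype=float)
    positions = np.asarray(positions, dtype=float)
    conc = np.zeros(grid_shape)
    # Index of the grid cell that contains each particle
    idx = np.floor(positions / cell_size).astype(int)
    # Ignore particles that have left the grid
    inside = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    # Accumulate mass per cell; np.add.at handles repeated cell indices
    np.add.at(conc, tuple(idx[inside].T), masses[inside])
    return conc / cell_volume
```

For fixed particle positions this map is linear in the masses, which is compatible with the linear observation operators that the LETKF slide lists.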

Page 13: LETKF Algorithm

LETKF (Local Ensemble Transform Kalman Filter)*
- Nonlinear model operators, linear observation operators
- Gaussian state and observational noise processes
- Reduces implementation costs since it does not need adjoints
- Performs the analysis locally in the ensemble space, which is typically of low dimension (< 100)
  - avoids inverses of large matrices
- It is embarrassingly parallel
- We have implemented LETKF in C with MPI, and in IBM InfoSphere Streams

*Brian Hunt, Eric Kostelich and Istvan Szunyogh, “Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter”, Physica D 230, pp 112-126, 2007.

Page 14: The LETKF Algorithm

Global steps: maintain an ensemble of K system states
- Forward system state:
- Analysis: construct the background analysis ensemble, the background observation ensemble, and their means and covariance matrices

Local steps: for each grid point, choose the local observations and background system state. Then calculate:
- Analysis error covariance:
- Perturbation:
- Analysis ensemble in ensemble space:
- Analysis ensemble in state space:
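The formulas for the steps on this slide did not survive in the transcript. As a stand-in, the sketch below implements the analysis step in the form given in the Hunt, Kostelich and Szunyogh paper cited on the LETKF slide, applied once to all observations. Localization (the per-grid-point selection of observations) is omitted, so this is a plain ensemble transform Kalman filter rather than the authors' implementation. The function name `etkf_analysis`, the argument names, and the inflation parameter `rho` are assumptions made for illustration.

```python
import numpy as np

def etkf_analysis(Xb, Yb, y_obs, R, rho=1.0):
    """Ensemble transform Kalman filter analysis (no localization).

    Xb    : (n, k) background ensemble, one member per column
    Yb    : (m, k) background ensemble mapped into observation space
    y_obs : (m,) observations
    R     : (m, m) observation error covariance (symmetric)
    rho   : multiplicative covariance inflation factor (1.0 = none)
    """
    k = Xb.shape[1]
    xb_mean = Xb.mean(axis=1, keepdims=True)
    yb_mean = Yb.mean(axis=1, keepdims=True)
    Xp = Xb - xb_mean              # background perturbations in state space
    Yp = Yb - yb_mean              # background perturbations in observation space
    C = np.linalg.solve(R, Yp).T   # C = Yp^T R^{-1}, shape (k, m)
    # Analysis error covariance in the k-dimensional ensemble space:
    # Pa = [(k - 1) I / rho + C Yp]^{-1}, via a symmetric eigendecomposition
    evals, evecs = np.linalg.eigh((k - 1) / rho * np.eye(k) + C @ Yp)
    Pa = evecs @ np.diag(1.0 / evals) @ evecs.T
    # Perturbation weights: symmetric square root Wa = [(k - 1) Pa]^{1/2}
    Wa = evecs @ np.diag(np.sqrt((k - 1) / evals)) @ evecs.T
    # Mean weights, then the analysis ensemble in ensemble space (w_mean + Wa)
    w_mean = Pa @ C @ (np.asarray(y_obs, dtype=float).reshape(-1, 1) - yb_mean)
    # Analysis ensemble in state space
    return xb_mean + Xp @ (w_mean + Wa)
```

Only k-by-k matrices are decomposed, which matches the slide's point that the analysis is done in the low-dimensional ensemble space and avoids inverting large matrices.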

Page 15: Implementation using IBM InfoSphere Streams

- InfoSphere Streams is a system developed by IBM for very fast processing of large, fast data streams. It supports
  - parallel and high-performance stream processing
  - continuous ingestion and analysis
  - scaling over a range of hardware capabilities
  - flexibility with respect to changing user objectives, available data, and computing resource availability
  - the bursty nature of real-time observations of rapidly evolving physical phenomena
- Uses SPADE to describe the stream operators

Page 16: SPADE Implementation Flowchart

[Figure: SPADE implementation flowchart]

Page 17: Experiments and evaluation

- Experimentally evaluate our approach using the controlled releases of tracers available in the DATEM datasets
- Demonstrate our approach using in-situ and remotely sensed real data from a California fire in August 2009
  - Observations and emission rates are taken from EPA AQS and GBBEP, and from MODIS AOD when available

Page 18: Evaluation metrics

- We use HYSPLIT’s statmain to compute evaluation metrics for a HYSPLIT forecast with respect to the ground truth
- We report on the following metrics
  - The Normalized Mean Squared Error (NMSE)

$$\mathrm{NMSE} = \frac{1}{N \, \bar{M} \, \bar{P}} \sum_{i=1}^{N} (P_i - M_i)^2$$

    where $P_i$ and $M_i$ are the predicted and measured concentrations, and $\bar{P}$ and $\bar{M}$ are their means
  - The model rank, an overall measure of model quality (larger values are better; the maximum value is 4)
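The NMSE metric on this slide can be computed directly from paired predicted and measured values. The following is a minimal sketch and not the statmain program itself; the function name `nmse` is illustrative, and it assumes both means are non-zero.

```python
import numpy as np

def nmse(predicted, measured):
    """Normalized mean squared error of predictions P against measurements M.

    NMSE = mean((P - M)^2) / (mean(P) * mean(M)); lower values are better.
    """
    p = np.asarray(predicted, dtype=float)
    m = np.asarray(measured, dtype=float)
    return np.mean((p - m) ** 2) / (p.mean() * m.mean())
```

A perfect forecast gives an NMSE of zero. Because the error is normalized by the product of the two means, the metric is dimensionless and can be compared across releases with different concentration levels.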

Page 19: CAPTEX

CAPTEX (Cross-Appalachian Tracer Experiment)
- Time: 2100 UTC Sep 18 to 2100 UTC Oct 29, 1983
- Area: U.S. and Canada
- 6 releases (3 hr duration each) of a special tracer (PFT)
- Emission sources and rates are those in DATEM
- Use DATEM CAPTEX observations as the ground truth
  - Observations at 84 stations every 3 hrs for 48 hrs after each release
- Run 160 iterations, each iteration simulating a 3 hr time period

Page 20: CAPTEX

[Figure: forecasts with data assimilation; panels: after 3 hr, after 6 hr, after 9 hr, after 12 hr]

Page 21: CAPTEX

[Figure: CAPTEX with and w/o data assimilation]

Page 22: CAPTEX

[Figure: CAPTEX with and w/o data assimilation]

Page 23: Modified CAPTEX

To assess whether our approach improves the forecasts given inaccurate emission rates, we do the following
- Use the CAPTEX concentrations as ground truth
- Run HYSPLIT with a modified emission rate for CAPTEX in two modes (with and w/o data assimilation)
  - For the 2nd release, which begins at 1700 UTC 25 Sep. 1983, use an emission rate of 33.5 kg/h instead of the 67 kg/h given in DATEM
- Compare with unmodified CAPTEX emissions w/o data assimilation

Page 24: Modified CAPTEX

[Figure: Modified CAPTEX]

Page 25: California wildfire, August 2009

- Experiments to forecast particulate matter (PM2.5) concentrations from a wildfire in California in August 2009
- Data used
  - Ground observations from EPA’s Air Quality System (AQS) (hourly obs)
  - Satellite observations from
    - Terra/Aqua MODIS Aerosol Optical Depth (AOD) (daily obs)
    - Geostationary Operational Environmental Satellite (GOES) East/West AOD (hourly obs)
  - Emission rates from GBBEP (GOES-E/W Biomass Burning Emission Product) (hourly obs)
- Data for SO2, NOx, CO, CO2, and relative humidity are also available from these data sources but are not used

Page 26: California wildfire, August 2009

Experiment using AQS observations and GBBEP emission rates
- Time: 2100 UTC Aug 9 to 2100 UTC Aug 20, 2009
- Area: California and Nevada
- Use hourly AQS data as ground truth observations
- Use GBBEP hourly PM2.5 emissions from 2019 source points
  - Emission rates range from 200 g/hr to 10 kg/hr
- Each iteration simulates a 1 hr period

Page 27: California wildfire, August 2009

[Figure: AQS+GBBEP]

Page 28: California wildfire, August 2009

[Figure: AQS+GBBEP]

Page 29: California wildfire, August 2009

[Figure: AQS+GBBEP]

Page 30: Summary

Our data assimilation system:
- demonstrates improvement on statistical metrics, e.g. an average 16.0% improvement in NMSE on DATEM/CAPTEX
- uses a state-of-the-art prediction model and assimilation algorithm
- shows that LETKF offers good algorithmic efficiency
- can easily utilize other models and multiple data sources
  - uses data sources from ground sites and satellites for pollutant concentrations and emission rates
- can be extended to other domains, e.g. volcanic ash

Demo website: http://bluegrit.cs.umbc.edu/~shiming1/demo/

Page 31: Acknowledgments

We would like to thank
- IBM for its generous support, and the InfoSphere Streams team for its indispensable help
- Drs. Ben Kyger and Roland Draxler for providing the HYSPLIT model and answering many of our questions
- Dr. Milt Halem for his encouragement and support, and the Multicore Computing Center at UMBC for providing the computing environment
- Dr. Hai Zhang of the UMBC Atmospheric Lidar Group, for his help on MODIS AOD
- NASA for the MODIS data, NOAA for the GOES, GBBEP, and DATEM data, and EPA for the AQS data

Page 32: Thank you.