Data Integration: Assessing the Value and Significance of New Observations and Products

John Williams, NCAR; Haig Iskenderian, MIT LL. NASA Applied Sciences Weather Program Review, Boulder, CO, November 19, 2008

Page 1: Data Integration:  Assessing the Value and Significance of New Observations and Products

Data Integration: Assessing the Value and Significance of New Observations and Products

John Williams, NCAR

Haig Iskenderian, MIT LL

NASA Applied Sciences Weather Program Review

Boulder, CO

November 19, 2008

Page 2: Data Integration:  Assessing the Value and Significance of New Observations and Products

Data Integration

• Goals

– Integrate NASA-funded research into NextGen 4-D data cube for SAS products and decision support

– Evaluate potential of new data to contribute to NextGen product skill, in context of other data sources

– Provide feedback on temporal/spatial scales and operationally significant scenarios where new data may contribute

• Approaches

– Perform physically-informed transformations and forecast system integration, e.g., into fuzzy logic algorithm

– Use nonlinear statistical analysis to evaluate new data importance in conjunction with other predictor fields

– Implement, evaluate and tune the system

Page 3: Data Integration:  Assessing the Value and Significance of New Observations and Products

Example of Forecast System Integration:

SATCAST integration into CoSPA

Page 4: Data Integration:  Assessing the Value and Significance of New Observations and Products

CoSPA 0-2 hour Forecasts

[Diagram labels]

Numerical Forecast Models

Weather Radar: NEXRAD, TDWR

Surface Weather: LLWAS, ASOS

Lightning

Canadian

Satellite

CoSPA Weather Product Generator

CoSPA Situation Display

Air Traffic Managers

Airline Dispatch

Decision Support Tools

Page 5: Data Integration:  Assessing the Value and Significance of New Observations and Products

Overview of Heuristic Forecast

Feature Extraction

Satellite

Mosaic Radar Products

Forecast Engine

Weather Analysis Products

Forecasts

Error Statistics

Surface Obs

RUC

Interest Images

Page 6: Data Integration:  Assessing the Value and Significance of New Observations and Products

Generation of Interest Images

• Interest Images:

– Are VIL-like (0-255) images that have a high impact upon evolution and pattern of future VIL

– Result from combining individual predictor fields using expert meteorological knowledge and image processing for feature extraction
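As a rough illustration of what a 0-255 "VIL-like" interest image could look like in code, the sketch below maps one predictor field onto the 0-255 range with a piecewise-linear interest function. The names `to_interest` and `interest_image`, the breakpoints `low` and `high`, and the sample field are all hypothetical; the slides do not say how CoSPA scales its predictor fields.

```python
def to_interest(value, low, high):
    """Map a predictor value to a 0-255 interest value (hypothetical scaling).

    Values at or below `low` give 0, values at or above `high` give 255,
    and values in between are scaled linearly.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    frac = (value - low) / (high - low)
    frac = min(1.0, max(0.0, frac))  # clamp to the 0-1 range
    return int(round(255 * frac))


def interest_image(field, low, high):
    """Apply to_interest to every pixel of a 2-D field (a list of rows)."""
    return [[to_interest(v, low, high) for v in row] for row in field]


# Made-up predictor field; the units and breakpoints are assumptions.
field = [[0.0, 2.0], [10.0, 20.0]]
image = interest_image(field, low=0.0, high=10.0)
```

In practice the expert knowledge mentioned on this slide would go into the shape of the interest function and into how several fields are combined; a linear ramp is only the simplest possible choice.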

Page 7: Data Integration:  Assessing the Value and Significance of New Observations and Products

Creating Interest Images: Convective Initiation

Forecast Engine

Lower Tropospheric Winds/Speed

Regional CI Weights

Orientation and elongation of elliptical kernel prescribed by winds

Cumulus

CI Interest

Locations prescribed by CI Scores

Stability Mask

Number CI Indicators & Visible

Unfavorable for CI

Favorable for CI

Stages: Predictor Fields, Image Processing, Feature Extraction
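The slide states that the orientation and elongation of the elliptical kernel are prescribed by the winds. A minimal sketch of that idea follows, assuming a binary kernel whose major axis points along the wind vector `(u, v)` and whose elongation grows linearly with wind speed. The function name, the `stretch` parameter and the linear speed-to-elongation rule are illustrative assumptions, not the actual CoSPA formulation.

```python
import math


def elliptical_kernel(u, v, minor=1.0, stretch=0.5, half_size=3):
    """Binary elliptical kernel aligned with the wind vector (u, v).

    Assumed rule: the semi-major axis is minor * (1 + stretch * speed),
    so calm winds give a circular kernel of radius `minor` (in pixels).
    """
    speed = math.hypot(u, v)
    major = minor * (1.0 + stretch * speed)
    # Unit vector along the wind; fall back to the x axis for calm winds.
    if speed > 0:
        ax, ay = u / speed, v / speed
    else:
        ax, ay = 1.0, 0.0
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            along = x * ax + y * ay    # offset component along the wind
            across = -x * ay + y * ax  # offset component across the wind
            inside = (along / major) ** 2 + (across / minor) ** 2 <= 1.0
            row.append(1 if inside else 0)
        kernel.append(row)
    return kernel


calm = elliptical_kernel(0.0, 0.0)
along_x = elliptical_kernel(2.0, 0.0)  # elongated along x
along_y = elliptical_kernel(0.0, 2.0)  # elongated along y
```

Spreading the CI scores with such a kernel would smear interest along the wind direction; how the kernel is then applied and weighted regionally is not specified here.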

Page 8: Data Integration:  Assessing the Value and Significance of New Observations and Products

Feature Extraction: Weather Classification

Line

Stratiform

Large Airmass

Small Airmass

Embedded

Page 9: Data Integration:  Assessing the Value and Significance of New Observations and Products

Overview of Heuristic Forecast

Feature Extraction

Satellite

Mosaic Radar Products

Forecast Engine

Weather Analysis Products

Forecasts

Error Statistics

Surface Obs

RUC

Interest Images

Page 10: Data Integration:  Assessing the Value and Significance of New Observations and Products

Forecast Engine: Combine Interest Images

Interest images: VIL, Long-term Trend, Short-term Trend, Satellite Interest, RADAR Boundary, . . .

Weather Type Image

Combined Forecast Image

$$P(t, \text{pixel}, \text{wxtype}) = \frac{\sum (\text{weight} \times \text{Pixel Value})}{\sum \text{weight}}$$
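The combination step on this slide is a weighted average of interest-image pixel values, with weights that depend on lead time and weather type. It can be sketched as follows. The function name, the layout of the weight table and every number in it are made up for illustration.

```python
def combine_interest(pixel_values, weights):
    """Weighted average of interest-image values at a single pixel.

    Both arguments are dicts keyed by interest-image name. The weights are
    assumed to have been looked up already for the pixel's weather type
    and the forecast lead time.
    """
    total_weight = sum(weights[name] for name in pixel_values)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    weighted_sum = sum(weights[name] * pixel_values[name] for name in pixel_values)
    return weighted_sum / total_weight


# Hypothetical weight table: WEIGHTS[wxtype][lead_minutes][image_name]
WEIGHTS = {
    "Line": {
        30: {"VIL": 0.5, "Short-term Trend": 0.25, "Satellite Interest": 0.25},
    },
}

# Made-up 0-255 interest values at one pixel classified as "Line".
pixel = {"VIL": 120, "Short-term Trend": 80, "Satellite Interest": 40}
forecast = combine_interest(pixel, WEIGHTS["Line"][30])
```

Dividing by the sum of the weights keeps the result on the same 0-255 scale as the inputs even when the weights for a given lead time and weather type do not add up to one.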

Page 11: Data Integration:  Assessing the Value and Significance of New Observations and Products

Example of VIL Interest Evolution

[Figure: Weight (axis ticks 0 to 0.7) versus Forecast Time (axis ticks 15 to 120 in steps of 15) for VIL Pixel Value = 100, one curve per weather type: Line, Large Air, Small Air, Stratiform, No Type]

Page 12: Data Integration:  Assessing the Value and Significance of New Observations and Products

Summary of Heuristic Approach and Limitations

• Individual interest images are each 0-255 VIL-like images resulting from a combination of predictor fields and feature extraction

• Forecast is a weighted average of all interest images dependent on lead time and WxType, with weights determined heuristically

– Combines static set of interest images into 0-2 hour forecasts

– Storm evolution is embedded in the weights, dependent on WxType

• Limitations:

– The process of integrating a candidate predictor is a manual, time-intensive process

– The utility of the predictor or an interest image to the forecast is known only qualitatively

– There may be other predictor fields and interest images that would be helpful that are not being currently used

– Interest image weights and evolution functions may not be optimal

– An objective method could help address these issues

Page 13: Data Integration:  Assessing the Value and Significance of New Observations and Products

Automated Data Importance Evaluation:

Random Forests

Page 14: Data Integration:  Assessing the Value and Significance of New Observations and Products

Random Forest (RF)

• A non-linear statistical analysis technique

• Produces a collection of decision trees using a “training set” of predictor variables (e.g., observation and model data features) and associated “truth” (e.g., future storm intensity) values

– each decision tree’s forecast logic is based on a random subset of data and predictor variables, making it largely independent of the others

– during training, random forests produce estimates of predictor importance
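To make the idea of importance estimates produced during training more concrete, here is a deliberately simplified, standard-library-only sketch. It is not a full random forest: each "tree" is a single split (a decision stump) trained on a bootstrap sample with a random subset of predictors, and importance is the average decrease in Gini impurity credited to each predictor. The data are synthetic, and nothing here reflects the RF implementation or importance measure actually used in the study.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump(rows, labels, features):
    """Return (feature, threshold, gain) for the split with the largest impurity decrease."""
    parent = gini(labels)
    best = (None, None, 0.0)
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if parent - child > best[2]:
                best = (f, threshold, parent - child)
    return best


def stump_ensemble_importance(rows, labels, n_trees=30, n_try=2, seed=0):
    """Average impurity decrease credited to each predictor across the ensemble."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    importance = [0.0] * n_features
    for _tree in range(n_trees):
        # Bootstrap sample: draw training instances with replacement.
        idx = [rng.randrange(len(rows)) for _i in range(len(rows))]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        # Random subset of predictors made available to this tree.
        features = rng.sample(range(n_features), n_try)
        f, _threshold, gain = best_stump(sample_rows, sample_labels, features)
        if f is not None:
            importance[f] += gain / n_trees
    return importance


# Synthetic data: predictor 0 determines the truth, predictors 1 and 2 are noise.
data_rng = random.Random(1)
rows = [[data_rng.random() for _f in range(3)] for _i in range(100)]
labels = [1 if row[0] > 0.5 else 0 for row in rows]
importance = stump_ensemble_importance(rows, labels)
ranks = sorted(range(3), key=lambda f: -importance[f])  # most important first
```

A real random forest grows much deeper trees and is normally taken from an established library, but the same ingredients (bootstrap samples, random predictor subsets, importance accumulated while training) are what allow candidate predictors to be ranked against each other.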

Page 15: Data Integration:  Assessing the Value and Significance of New Observations and Products

Example: CoSPA combiner development (focus on 1 hour VIP level prediction)

• Analyzed data collected in summer 2007

– Radar, satellite, RUC model, METAR, MIT-LL feature fields, storm climatology and satellite-based land use fields

– Transformations

  • distances to VIP thresholds; channel differences

  • disc min, max, mean, coverage over 5, 10, 20, 40 and 80-km radii

– Used motion vectors to “pull back” +1 hr VIP truth data to align with analysis time data fields

• For each problem, randomly selected balanced sets of “true” and “false” pixels from dataset and trained RF

– VIP 3 (operationally significant convection)

– initiation at varying distances from existing convection

• Plotted ranks of each predictor (low rank is good) for various scenarios
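The disc transformations listed above (min, max, mean and coverage within a given radius) could be computed along the following lines. This is a sketch under assumptions: the radius is given in pixels (converting 5 to 80 km into pixels depends on the grid spacing, which the slides do not give), and "coverage" is taken to mean the fraction of in-disc pixels at or above a threshold. The function name and the sample field are hypothetical.

```python
def disc_stats(field, row, col, radius_px, threshold):
    """Min, max, mean and fractional coverage of `field` within a disc.

    `field` is a 2-D list of rows and (row, col) must lie inside it.
    Coverage is the fraction of in-disc pixels whose value is at least
    `threshold` (an assumed definition).
    """
    values = []
    for r in range(max(0, row - radius_px), min(len(field), row + radius_px + 1)):
        for c in range(max(0, col - radius_px), min(len(field[0]), col + radius_px + 1)):
            if (r - row) ** 2 + (c - col) ** 2 <= radius_px ** 2:
                values.append(field[r][c])
    covered = sum(1 for v in values if v >= threshold)
    return {
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
        "coverage": covered / len(values),
    }


# Made-up 5x5 field with a single peak in the middle.
field = [
    [0, 0, 0, 0, 0],
    [0, 10, 20, 10, 0],
    [0, 20, 40, 20, 0],
    [0, 10, 20, 10, 0],
    [0, 0, 0, 0, 0],
]
stats = disc_stats(field, row=2, col=2, radius_px=1, threshold=30)
```

Repeating this at several radii gives the random forest the same field at several spatial scales, which is what lets the importance ranks say something about the scales at which a predictor contributes.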

Page 16: Data Integration:  Assessing the Value and Significance of New Observations and Products

VIL8bit 06/19/2007 23:30

Page 17: Data Integration:  Assessing the Value and Significance of New Observations and Products

VIL8bit_40kmMax 06/19/2007 23:00

Page 18: Data Integration:  Assessing the Value and Significance of New Observations and Products

Example fields

VIL8bit_40kmPctCov 06/19/2007 23:30

Page 19: Data Integration:  Assessing the Value and Significance of New Observations and Products

VIL8bit_distVIPLevel6+ 06/19/2007 23:30

Page 20: Data Integration:  Assessing the Value and Significance of New Observations and Products

Importance summary for VIP 3 (var. WxType)

[Figure: Importance Rank of each predictor, ordered from more important to less important; label: MITLL WxType]

Page 21: Data Integration:  Assessing the Value and Significance of New Observations and Products

Importance summary for init 20 km from existing storm

[Figure: Importance Rank of each predictor, ordered from more important to less important; label: MITLL WxType]

Page 22: Data Integration:  Assessing the Value and Significance of New Observations and Products

Importance summary for init 80 km from existing storm

[Figure: Importance Rank of each predictor, ordered from more important to less important; label: MITLL WxType]

Page 23: Data Integration:  Assessing the Value and Significance of New Observations and Products

RF Empirical Model Performance: VIP 3

[Figure: Calibration (x-axis: Random Forest votes for VIP >= 3; y-axis: Fract. Instances with VIP >= 3) and ROC Curve (blue)]

RF empirical model provides a probabilistic forecast performance benchmark
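The calibration and ROC diagnostics named on this slide can be computed from the fraction of trees voting for VIP >= 3 and the observed truth. The sketch below uses made-up votes and truth values; the bin count and thresholds are arbitrary choices, and the function names are hypothetical.

```python
def calibration(votes, truth, n_bins=5):
    """Observed event fraction within each bin of vote fraction (0 to 1).

    Returns None for bins that contain no instances.
    """
    counts = [0] * n_bins
    events = [0] * n_bins
    for v, y in zip(votes, truth):
        b = min(int(v * n_bins), n_bins - 1)  # a vote of 1.0 goes in the last bin
        counts[b] += 1
        events[b] += y
    return [events[b] / counts[b] if counts[b] else None for b in range(n_bins)]


def roc_points(votes, truth, thresholds):
    """(false alarm rate, hit rate) for each vote-fraction threshold."""
    positives = sum(truth)
    negatives = len(truth) - positives
    points = []
    for t in thresholds:
        hits = sum(1 for v, y in zip(votes, truth) if v >= t and y == 1)
        false_alarms = sum(1 for v, y in zip(votes, truth) if v >= t and y == 0)
        points.append((false_alarms / negatives, hits / positives))
    return points


# Made-up vote fractions and truth (1 means VIP >= 3 was observed).
votes = [0.1, 0.1, 0.3, 0.3, 0.5, 0.5, 0.7, 0.7, 0.9, 0.9]
truth = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1]
cal = calibration(votes, truth)
roc = roc_points(votes, truth, thresholds=[0.2, 0.6])
```

A well-calibrated model has observed fractions that rise with the vote fraction, and a skillful one has ROC points well above the diagonal; together these give the probabilistic benchmark against which a heuristic forecast can be compared.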

Page 24: Data Integration:  Assessing the Value and Significance of New Observations and Products

Summary and Conclusions

• Developing satellite-based weather products may be only the first step of their integration into an operational forecast system

• Integration into an existing forecast system may require physically-informed transformations and heuristics

• An RF statistical analysis can help evaluate new candidate predictors in the context of others

– Relative importance

– Feedback on scales of contribution

– Also supplies an empirical model benchmark

• Successful operational implementation may require additional funding beyond initial R&D