DESCRIPTION
Application of Forecast Verification Science to Operational River Forecasting in the National Weather Service. Julie Demargne, James Brown, Yuqiong Liu and D-J Seo. UCAR. NROW, November 4-5, 2009.
TRANSCRIPT
Application of Forecast Verification Science to Operational River Forecasting in the National Weather Service
Julie Demargne, James Brown, Yuqiong Liu and D-J Seo
NROW, November 4-5, 2009
UCAR
Approach to river forecasting
[Figure: schematic of the river forecasting approach. A Sacramento-type soil moisture accounting model (upper and lower zones with tension and free water storages; infiltration, percolation, evapotranspiration, surface runoff, direct runoff, interflow, baseflow, subsurface outflow) links observations, models, and input forecasts to forecast products, with forecasters and users in the loop.]
Where is the Verification?
???
In the past
• Limited verification of hydrologic forecasts
• How good are the forecasts for application X?
Where is the Verification?
!!!
Now:
Verification Experts
Verification Systems
Verification Products
Papers
Hydrologic forecasting: a multi-scale problem
National → major river system → river basin with river forecast points → forecast group → headwater basin with radar rainfall grid → high-resolution flash flood basins
Hydrologic forecasts must be verified consistently across all spatial scales and resolutions.
Hydrologic forecasting: a multi-scale problem
Seamless probabilistic water forecasts are required for all lead times and all users; so is verification information.
[Figure: benefits versus forecast lead time and forecast uncertainty, from minutes and hours (protection of life and property, flood mitigation and navigation) through days and weeks (hydropower, recreation, reservoir control) to months, seasons, and years (agriculture, health, commerce, ecosystem, environment, state/local planning).]
Need for hydrologic forecast verification
• In 2006, the NRC recommended that the NWS expand verification of its uncertainty products and make the information easily available to all users in near real time
Users decide whether to take action via risk-based decision making
Users must be educated on how to interpret forecast and verification information
River forecast verification service
http://www.nws.noaa.gov/oh/rfcdev/docs/NWS-Hydrologic-Forecast-Verification-Team_Final-report_Sep09.pdf.pdf
http://www.nws.noaa.gov/oh/rfcdev/docs/Final_Verification_Report.pdf
River forecast verification service
• To help us answer:
How good are the forecasts for application X?
What are the strengths and weaknesses of the forecasts?
What are the sources of error and uncertainty in the forecasts?
How are new science and technology improving the forecasts and the verifying observations?
What should be done to improve the forecasts?
Do forecasts help users in their decision making?
River forecast verification service
[Figure: the river forecasting system schematic (hydrologic model, observations, models, input forecasts, forecast products) extended with verification systems and verification products serving both forecasters and users.]
River forecast verification service
• Verification Service within the Community Hydrologic Prediction System (CHPS) to:
Compute metrics
Display data & metrics
Disseminate data & metrics
Provide real-time access to metrics
Analyze uncertainty and error in forecasts
Track performance
Verification challenges
• Verification is useful if the information generated leads to decisions about the forecast/system being verified
Verification needs to be user oriented
• No single verification measure provides complete information about the quality of a forecast product
Several verification metrics and products are needed
• To facilitate communication of forecast quality, common verification practices and products are needed from weather and climate forecasts to water forecasts
Collaborations between the meteorology and hydrology communities are needed (e.g., Thorpex-Hydro, HEPEX)
Verification challenges: two classes of verification
• Diagnostic verification: to diagnose and improve model performance
Done off-line with archived forecasts or hindcasts to analyze forecast quality relative to different conditions/processes
• Real-time verification: to help forecasters and users make decisions in real time
Done in real time (before the verifying observation occurs) using information from historical analogs and/or past forecasts and verifying observations under similar conditions
Diagnostic verification products
• Key verification metrics for 4 levels of information for single-valued and probabilistic forecasts
1. Observations-forecasts comparisons (scatter plots, box plots, time series plots)
2. Summary verification (e.g. MAE/Mean CRPS, skill score)
3. More detailed verification (e.g. measures of reliability, resolution, discrimination, correlation, results for specific conditions)
4. Sophisticated verification (e.g. for specific events with ROC)
To be evaluated by forecasters and forecast users
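As a concrete illustration of the level 2 summary metrics named above (MAE/mean CRPS and a skill score), here is a minimal sketch in Python. It is not the operational IVP/EVS code; the function names are hypothetical, and the CRPS uses the standard ensemble identity CRPS = E|X - y| - 0.5 E|X - X'|.

```python
import numpy as np

def mean_crps(ensembles, obs):
    """Mean continuous ranked probability score over all forecast cases.

    Uses the identity CRPS = E|X - y| - 0.5 * E|X - X'|, where X and X'
    are independent draws from the ensemble and y is the observation.
    """
    ensembles = np.asarray(ensembles, dtype=float)  # shape (cases, members)
    obs = np.asarray(obs, dtype=float)              # shape (cases,)
    # Mean absolute difference between members and the observation
    term1 = np.mean(np.abs(ensembles - obs[:, None]), axis=1)
    # Mean absolute difference between all member pairs
    term2 = 0.5 * np.mean(
        np.abs(ensembles[:, :, None] - ensembles[:, None, :]), axis=(1, 2))
    return float(np.mean(term1 - term2))

def skill_score(score, reference_score):
    """Generic skill score: 1 - score/reference (1 = perfect, 0 = no skill)."""
    return 1.0 - score / reference_score
```

The reference score would typically come from climatology or persistence forecasts, as discussed on the reference-forecast science issue later in the talk.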
Diagnostic verification products
• Examples for level 1: scatter plot, box-and-whiskers plot
[Figure: scatter plot of forecast value versus observed value, with a user-defined threshold marked.]
Diagnostic verification products
• Examples for level 1: box-and-whiskers plot
[Figure: box-and-whiskers plot of forecast error (forecast - observed) [mm] versus observed daily total precipitation [mm] for the American River in California, 24-hr precipitation ensembles (lead day 1). Each box summarizes the errors for one forecast (min., 10%, 20%, median, 80%, 90%, max.); a zero-error line separates low bias from high bias, and "blown" forecasts stand out.]
Diagnostic verification products
• Examples for level 2: skill score maps by month
[Figure: skill score maps for January, April, and October; the smaller the score, the better.]
Diagnostic verification products
• Examples for level 3: more detailed plots
[Figure: scores plotted for performance under different conditions and for different months.]
Diagnostic verification products
• Examples for level 4: event-specific plots
[Figure: ROC plot of Probability of Detection (POD) versus Probability of False Detection (POFD) for the event "> 85th percentile of the observed distribution", with perfect discrimination marked; and a reliability diagram of observed frequency versus predicted probability, with perfect reliability on the diagonal.]
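The POD/POFD pair behind an ROC point can be counted directly from a 2x2 contingency table; this is an illustrative stand-in (hypothetical names), not NWS software. Sweeping the probability threshold from 0 to 1 traces the full ROC curve.

```python
import numpy as np

def pod_pofd(event_probs, event_occurred, prob_threshold=0.5):
    """One ROC point: issue a 'yes' when forecast probability >= threshold.

    POD  = hits / (hits + misses)                 -- fraction of events detected
    POFD = false alarms / (false alarms + correct negatives)
    """
    p = np.asarray(event_probs, dtype=float)
    o = np.asarray(event_occurred, dtype=bool)
    yes = p >= prob_threshold
    hits = np.sum(yes & o)
    misses = np.sum(~yes & o)
    false_alarms = np.sum(yes & ~o)
    correct_neg = np.sum(~yes & ~o)
    pod = float(hits / (hits + misses))
    pofd = float(false_alarms / (false_alarms + correct_neg))
    return pod, pofd
```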
Diagnostic verification products
• Examples for level 4: user-friendly spread-bias plot
[Figure: spread-bias plot. 60% of the time, the observation should fall in the window covering the middle 60% of the ensemble (i.e., median ±30%); the plotted "hit rate" (e.g., 90%) is compared against the perfect diagonal, with departures indicating an "underspread" ensemble.]
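The spread-bias "hit rate" amounts to a containment check: how often does the observation land inside the central window of the ensemble? A minimal sketch, assuming an equal-weight ensemble and empirical quantiles (hypothetical function name):

```python
import numpy as np

def window_hit_rate(ensembles, obs, coverage=0.6):
    """Fraction of cases where the observation falls inside the central
    `coverage` window of the ensemble (e.g., median +/- 30% for 0.6).
    For a reliable ensemble this fraction should match `coverage` itself.
    """
    ensembles = np.asarray(ensembles, dtype=float)  # shape (cases, members)
    obs = np.asarray(obs, dtype=float)              # shape (cases,)
    lo_q = 0.5 - coverage / 2.0                     # e.g. 0.2
    hi_q = 0.5 + coverage / 2.0                     # e.g. 0.8
    lo = np.quantile(ensembles, lo_q, axis=1)
    hi = np.quantile(ensembles, hi_q, axis=1)
    return float(np.mean((obs >= lo) & (obs <= hi)))
```

Evaluating this for several window sizes gives the points of a spread-bias plot like the one above.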
Diagnostic verification analyses
• Analyze any new forecast process with verification
• Use different temporal aggregations
Analyze verification statistics as a function of lead time
If performance is similar across lead times, data can be pooled
• Perform spatial aggregation carefully
Analyze results for each basin and plot results on spatial maps
Use normalized metrics (e.g. skill scores)
Aggregate verification results across basins with similar hydrologic processes (e.g. by response time)
• Report verification scores with sample size
In the future, confidence intervals
Diagnostic verification analyses
• Evaluate forecast performance under different conditions
w/ time conditioning: by month, by season
w/ atmospheric/hydrologic conditioning: – low/high probability threshold
– absolute thresholds (e.g., PoP, Flood Stage)
Check that sample size is not too small
• Analyze sources of uncertainty and error
Verify forcing input forecasts and output forecasts
For extreme events, verify both stage and flow
Sensitivity analysis to be set up at all RFCs:
1) what is the optimal QPF horizon for hydrologic forecasts?
2) do run-time modifications made on the fly improve forecasts?
Diagnostic verification software
• Interactive Verification Program (IVP) developed at OHD:
verifies single-valued forecasts at given locations/areas
Diagnostic verification software
• Ensemble Verification System (EVS) developed at OHD:
verifies ensemble forecasts at given locations/areas
Dissemination of diagnostic verification
• Example: WR water supply website
http://www.nwrfc.noaa.gov/westernwater/
Data Visualization
Error: MAE, RMSE; conditional on lead time, year
Skill: skill relative to climatology; conditional
Categorical: FAR, POD, contingency table (based on climatology or user definable)
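The categorical metrics listed above (FAR, POD) come straight from the cells of the yes/no contingency table; a minimal sketch with hypothetical names, not the website's implementation:

```python
def categorical_metrics(forecast_yes, observed_yes):
    """POD and FAR from a 2x2 contingency table of yes/no forecasts.

    POD = hits / (hits + misses)         -- fraction of events detected
    FAR = false alarms / (hits + false alarms)
    """
    hits = misses = false_alarms = 0
    for f, o in zip(forecast_yes, observed_yes):
        if f and o:
            hits += 1
        elif not f and o:
            misses += 1
        elif f and not o:
            false_alarms += 1
    pod = hits / (hits + misses) if hits + misses else float('nan')
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else float('nan')
    return pod, far
```

The yes/no events could be defined against climatology or a user-definable threshold, as the slide notes.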
Dissemination of diagnostic verification
• Example: OHRFC bubble plot online
http://www.erh.noaa.gov/ohrfc/bubbles.php
Real-time verification
• How good could the 'live' forecast be?
[Figure: hydrograph comparing the 'live' forecast with observations to date.]
Real-time verification
• Select analogs from a pre-defined set of historical events and compare with the 'live' forecast
[Figure: the live forecast and observations alongside analog forecasts and analog observations (Analogs 1-3); conclusion: "Live forecast for flood is likely to be too high."]
Real-time verification
• Adjust the 'live' forecast based on info from the historical analogs
[Figure: the live forecast compared with what happened; the live forecast was too high.]
Real-time verification
• Example for ensemble forecasts
[Figure: temperature (°F) versus forecast lead day for the live forecast (L), analog observations, and analog forecasts (H) selected with μH = μL ± 1.0°C; conclusion: "Day 1 forecast is probably too high."]
Real-time verification
• Build analog query prototype using multiple criteria
Seeking analogs for precipitation: “Give me past forecasts for the 10 largest events relative to hurricanes for this basin.”
Seeking analogs for temperature: “Give me all past forecasts with lead time 12 hours whose ensemble mean was within 5% of the live ensemble mean.”
Seeking analogs for flow: “Give me all past forecasts with lead times of 12-48 hours whose probability of flooding was >=0.95, where the basin-averaged soil-moisture was > x and the immediately prior observed flow exceeded y at the forecast issue time”.
Requires forecasters’ input!
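The third query above (flow analogs) can be sketched as a multi-criteria filter over an archive of past forecasts. The record schema and threshold values here are hypothetical stand-ins for whatever the prototype actually stores:

```python
def select_flow_analogs(archive, flood_prob_min=0.95,
                        soil_moisture_min=0.30, prior_flow_min=500.0):
    """Return past forecasts matching the multi-criteria flow query:
    lead times of 12-48 hours, probability of flooding >= flood_prob_min,
    basin-averaged soil moisture > soil_moisture_min, and observed flow
    at issue time > prior_flow_min.

    Each archive record is assumed to be a dict with keys
    'lead_hours', 'flood_prob', 'soil_moisture', and 'prior_flow'.
    """
    return [f for f in archive
            if 12 <= f['lead_hours'] <= 48
            and f['flood_prob'] >= flood_prob_min
            and f['soil_moisture'] > soil_moisture_min
            and f['prior_flow'] > prior_flow_min]
```

The precipitation and temperature queries would be further filters of the same shape, which is why forecasters' input on the criteria matters.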
Outstanding science issues
• Define meaningful reference forecasts for skill scores
• Separate timing error and amplitude error in forecasts
• Verify rare events and specify sampling uncertainty in metrics
• Analyze sources of uncertainty and error in forecasts
• Consistently verify forecasts on multiple space and time scales
• Verify multivariate forecasts (issued at multiple locations and for multiple time steps) by accounting for statistical dependencies
• Account for observational error (measurement and representativeness errors) and rating curve error
• Account for non-stationarity (e.g., climate change)
Verification service development
[Figure: collaboration map linking OHD, OCWWS, and NCEP with forecasters, users, academia, forecast agencies, and the private sector: OHD-NCEP Thorpex-Hydro project; HEPEX Verification Test Bed (CMC, Hydro-Quebec, ECMWF); OHD-Deltares collaboration for CHPS enhancements; COMET-OHD-OCWWS collaboration on training.]
Looking ahead
• 2012: Info on quality of forecast service available online
real-time and diagnostic verification implemented in CHPS
RFC verification standard products available online along with forecasts
• 2015: Leveraging grid-based verification tools
Extra slide
Diagnostic verification products
• Key verification metrics from NWS Verification Team report