
Page 1

Seasonal Forecasting Using the Climate Predictability Tool

Validation & Verification in CPT

Nachiketa Acharya [email protected]

Big Thanks to Dr. Simon Mason

Page 2

Validation vs Verification

• “Validation” vs “verification”: we validate a model, but we verify forecasts.

• In CPT, “validation” relates to the assessment of a model for deterministic (“best guess”) cross-validated and retroactive predictions; “verification” relates to the assessment of probabilistic forecasts.

Page 3

Cross-validation

Leave-one-out cross-validation:

1971: predict 1971; all remaining years form the training period
1972: predict 1972; all remaining years form the training period
1973: predict 1973; all remaining years form the training period
1974: predict 1974; all remaining years form the training period
1975: predict 1975; all remaining years form the training period
… repeat to 2010.

Leave-k-out cross-validation (here a five-year block centred on the predicted year is withheld, truncated at the ends of the record):

1971: predict 1971; omit 1972–1973; remaining years form the training period
1972: omit 1971; predict 1972; omit 1973–1974; remaining years form the training period
1973: omit 1971–1972; predict 1973; omit 1974–1975; remaining years form the training period
1974: omit 1972–1973; predict 1974; omit 1975–1976; remaining years form the training period
1975: omit 1973–1974; predict 1975; omit 1976–1977; remaining years form the training period
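The leave-one-out scheme above can be sketched in a few lines of code. This is only an illustration (CPT performs the cross-validation internally): a simple linear regression stands in for whatever model CPT has been set up with, and the predictor/predictand series are synthetic.

```python
import numpy as np

def leave_one_out_hindcasts(x, y):
    """For each year, withhold that year, refit a simple linear regression on
    the remaining years, and predict the withheld year.  For leave-k-out, the
    `keep` mask would instead exclude a window of k years centred on year i."""
    n = len(y)
    hindcasts = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                      # training period: all other years
        slope, intercept = np.polyfit(x[keep], y[keep], 1)
        hindcasts[i] = slope * x[i] + intercept       # cross-validated prediction for year i
    return hindcasts

# Hypothetical example: a single predictor index and a predictand for 1971-2010.
rng = np.random.default_rng(1)
x = rng.normal(size=40)                               # e.g. an SST index
y = 0.6 * x + rng.normal(scale=0.8, size=40)          # e.g. a seasonal rainfall anomaly
cv_hindcasts = leave_one_out_hindcasts(x, y)
```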

Page 4

[Figure: a 24-year data period (1982–2005) evaluated in a cross-validation (leave-one-out) manner]

Page 5

Retroactive forecasting

Given data for 1951-2000, it is possible to calculate a retroactive set of probabilistic forecasts. CPT will use an initial training period to cross-validate a model and make predictions for the subsequent year(s), then update the training period and predict additional years, repeating until all possible years have been predicted.

1981: training period 1951–1980; predict 1981; omit 1982 onwards
1982: training period 1951–1981; predict 1982; omit 1983 onwards
1983: training period 1951–1982; predict 1983; omit 1984 onwards
1984: training period 1951–1983; predict 1984; omit 1985 onwards
1985: training period 1951–1984; predict 1985
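A minimal sketch of this expanding-window (retroactive) scheme, again using a simple linear regression as a hypothetical stand-in for the CPT model. CPT itself can also update the training period in blocks of more than one year.

```python
import numpy as np

def retroactive_hindcasts(x, y, n_initial=30):
    """Expanding-window hindcasts: train on the first n_initial years,
    predict the next year, add that year to the training period, repeat."""
    forecasts, verifying_obs = [], []
    for t in range(n_initial, len(y)):
        slope, intercept = np.polyfit(x[:t], y[:t], 1)   # training period: years 0..t-1
        forecasts.append(slope * x[t] + intercept)       # retroactive forecast for year t
        verifying_obs.append(y[t])
    return np.array(forecasts), np.array(verifying_obs)

# With data for 1951-2000 and n_initial=30 (1951-1980), this yields
# retroactive forecasts for 1981-2000, as in the table above.
```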

Page 6

Forecasts and observations

                  Discrete                                     Continuous
Deterministic     "It will rain tomorrow."                     "There will be 10 mm of rain tomorrow."
Probabilistic     "There is a 50% chance of rain tomorrow."    "There is a p% chance of more than k mm of rain tomorrow."


Page 8

Continuous measures compare the best-guess forecasts with the observed values without regard to the categories: forecasts in mm or °C are compared against observations in mm or °C.

Tools ~ Validation ~ Cross-validated ~ Performance measures

Page 9

Pearson’s correlation

Pearson’s correlation measures association (are increases and decreases in the forecasts associated with increases and decreases in the observations?).

It does not measure accuracy.

When squared, it tells us how much of the variance of the observations is correctly forecast.

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}

Correlation measures the strength of the linear relationship between two variables.
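For reference, a direct implementation of the formula above; this is illustrative only, since CPT reports the measure itself.

```python
import numpy as np

def pearson_r(forecasts, obs):
    """Pearson product-moment correlation between forecasts and observations."""
    f = forecasts - forecasts.mean()
    o = obs - obs.mean()
    return (f * o).sum() / np.sqrt((f ** 2).sum() * (o ** 2).sum())

# r squared gives the fraction of the variance of the observations
# that is linearly accounted for by the forecasts.
```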

Page 10

Correlation between two variables: the Pearson product-moment correlation

• Correlation is a systematic relationship between x and y: when one goes up, the other tends to go up also, or may tend to go down.
• It requires corresponding pairs of cases of x and y.
• “Perfect” positive correlation is +1; “perfect” negative correlation is –1; no correlation (x and y completely unrelated) is 0. Correlation can be anywhere between –1 and +1.
• A relationship between x and y may or may not be causal; if not, x and y may both be under the control of some third variable.
• Correlation can be estimated visually by looking at a scatterplot of dots on an x vs. y graph.

Page 11

[Scatterplot of Y against X illustrating correlation = 0.8]

Page 12

[Scatterplot of Y against X illustrating correlation = 0.55]

Page 13

[Scatterplot of Y against X illustrating correlation = 0 despite a strong nonlinear relationship]

There is a strong nonlinear relationship, but the Pearson correlation only detects linear relationships.

Page 14

[Scatterplot of Y against X illustrating correlation = 0.87, due to one outlier in the upper right]

If domination by one case is not desired, one can use the Spearman rank correlation (the correlation among ranks instead of actual values).

Page 15

Spearman’s correlation

Numerator: 6 times the sum of the squared differences between the forecast ranks and the observation ranks.

Denominator: n(n² – 1).

When squared, how much of the variance of the ranks of the observations can we correctly forecast?

Huh?

Spearman’s correlation does not have as obvious an interpretation as Pearson’s, but it is much less sensitive to extremes.

r_s = 1 - \frac{6 \sum_{i=1}^{n} \left( r_{x_i} - r_{y_i} \right)^2}{n (n^2 - 1)}

where r_{x_i} and r_{y_i} are the ranks of the i-th forecast and observation.

Page 16

Spearman rank correlation

Rank correlation is the Pearson correlation between the ranks of X vs. the ranks of Y, treating the ranks as numbers. Rank correlation measures the strength of the monotonic relationship between two variables. It defuses outliers by not honoring the original intervals between adjacent ranks: adjacent ranks simply differ by 1.

A simpler formula for the rank correlation for small samples, where D_i is the difference in rank for case i:

\text{Spearman cor} = 1 - \frac{6 \sum_{i=1}^{N} D_i^2}{N (N^2 - 1)}

If the ranks are identical for all cases, all D_i are zero and the correlation is 1. An example of the use of this formula is given on the next slide.

Page 17

Spearman rank correlation

Rank correlation is simply the correlation between the ranks of X vs. the ranks of Y, treating ranks as numbers. When there are outliers, or when the X and/or Y data are very much non-normal, the Spearman rank correlation should be computed in addition to the standard correlation.

Example of conversion to ranks for X or for Y:

Original numbers:    2   9   189   3   21   7
Corresponding ranks: 6   3    1    5    2   4   (equivalently 1 4 6 2 5 3 if ranked in ascending order)

Note in the above example that the difference between 189 and 21 is treated as the same as that between 9 and 7.
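A small sketch tying the two slides together: ranks computed as above, then the rank-difference formula applied. The no-ties assumption is mine; with ties one would instead compute the Pearson correlation of the ranks directly.

```python
import numpy as np

def spearman_r(forecasts, obs):
    """Spearman rank correlation via the rank-difference formula
    1 - 6*sum(D**2) / (N*(N**2 - 1)); assumes there are no ties."""
    rank_f = forecasts.argsort().argsort() + 1   # ranks 1..N, ascending
    rank_o = obs.argsort().argsort() + 1
    d = rank_f - rank_o                          # rank difference D for each case
    n = len(forecasts)
    return 1 - 6 * (d ** 2).sum() / (n * (n ** 2 - 1))

# Equivalently (and more robustly with ties): the Pearson correlation
# computed on the two sets of ranks.
```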

Page 18

2AFC (Kendall’s tau)

Denominator: total number of pairs.

Numerator: difference in the numbers of concordant and discordant pairs.

Kendall’s correlation measures discrimination (do the forecasts increase and decrease as the observations increase and decrease?). It can be transformed to the probability that the forecasts successfully distinguish the wetter (or hotter) of two observations.

\tau = \frac{n_c - n_d}{\tfrac{1}{2}\, n (n - 1)}
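An illustrative implementation of Kendall’s tau and its 2AFC transformation, (tau + 1)/2, assuming no tied pairs; the pair-counting loop is a naive sketch rather than CPT’s algorithm.

```python
from itertools import combinations

def kendall_tau_2afc(forecasts, obs):
    """Kendall's tau = (concordant - discordant pairs) / total pairs, plus the
    2AFC score (tau + 1)/2: the probability that the forecasts correctly pick
    the wetter/warmer of two observations.  Tied pairs are simply ignored."""
    nc = nd = 0
    for (f1, o1), (f2, o2) in combinations(zip(forecasts, obs), 2):
        s = (f1 - f2) * (o1 - o2)
        if s > 0:
            nc += 1          # concordant pair
        elif s < 0:
            nd += 1          # discordant pair
    n = len(forecasts)
    n_pairs = n * (n - 1) // 2
    tau = (nc - nd) / n_pairs
    return tau, 0.5 * (tau + 1.0)
```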

Page 19

Error measures compare the best-guess forecasts with the observed values without regard to the categories. They compare forecasts in mm or °C against observations in mm or °C.

Page 20

Biases

Mean bias:
• always close to zero for cross-validated forecasts;
• slightly negative if the predictand data are positively skewed;
• for retroactive forecasts, indicates the ability to forecast shifts in climate.

Variance or amplitude bias:
• typically very small if skill is low, because the forecasts stay close to the mean.

If there is no mean or variance bias, the RMSE of the forecasts will exceed that of climatology whenever the correlation is less than 0.5.
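A hedged sketch of these unconditional error measures. The variance bias is expressed here as a simple ratio of sample variances, which may differ from CPT’s exact output convention.

```python
import numpy as np

def bias_and_error_summary(forecasts, obs):
    """Unconditional biases and errors of best-guess forecasts."""
    mean_bias = forecasts.mean() - obs.mean()
    variance_bias = forecasts.var(ddof=1) / obs.var(ddof=1)   # < 1: forecasts too damped
    rmse = np.sqrt(((forecasts - obs) ** 2).mean())
    mae = np.abs(forecasts - obs).mean()
    return mean_bias, variance_bias, rmse, mae
```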

Page 21

Root-mean-square skill score (RMSSS) for continuous deterministic forecasts

RMSSS is defined as

RMSSS = 1 - \frac{RMSE_f}{RMSE_s}

(often expressed as a percentage), where RMSE_f is the root-mean-square error of the forecasts and RMSE_s is the root-mean-square error of the standard used as the no-skill baseline. Both persistence and climatology can be used as the baseline. Persistence, for a given parameter, is the persisted anomaly from the period immediately prior to the long-range forecast (LRF) period being verified; for seasonal forecasts, for example, persistence is the seasonal anomaly from the season prior to the season being verified. Climatology is equivalent to persisting an anomaly of zero.
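A minimal sketch of the RMSSS as defined above, with the baseline forecasts passed in explicitly (climatology or persistence).

```python
import numpy as np

def rmsss(forecasts, obs, baseline):
    """Root-mean-square skill score: 1 - RMSE_f / RMSE_s, where the baseline
    is a no-skill standard such as climatology or persistence."""
    rmse_f = np.sqrt(((forecasts - obs) ** 2).mean())
    rmse_s = np.sqrt(((baseline - obs) ** 2).mean())
    return 1.0 - rmse_f / rmse_s

# Climatology baseline (persisting an anomaly of zero):
#   rmsss(forecasts, obs, np.full_like(obs, obs.mean()))
```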

Page 22

Forecasts and observations

                  Discrete                                     Continuous
Deterministic     "It will rain tomorrow."                     "There will be 10 mm of rain tomorrow."
Probabilistic     "There is a 50% chance of rain tomorrow."    "There is a p% chance of more than k mm of rain tomorrow."

Page 23

Categorical measures assess the skill of the deterministic forecasts with the observations treated as categories. Some compare forecasts in mm or °C with observations as categories; others compare categories with categories.

Page 24

Hit scores convert the forecasts to categories and then compare these with the observed categories. But note that the category containing the best guess is not necessarily the most likely!

Page 25

Hit scores

The contingency tables are based on cross-validated definitions of the categories and so may not perfectly match implied scores from the graph.

Some hits can be expected even with useless forecasts (e.g., guessing, or always forecasting the same outcome).

Tools ~ Contingency Tables ~ Cross-validated
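An illustrative sketch of a hit score for tercile categories. Note the simplification: the category boundaries here come from a single reference sample, whereas CPT defines them in cross-validated fashion, as noted above.

```python
import numpy as np

def tercile_categories(values, reference):
    """Assign below-/near-/above-normal categories (0, 1, 2) from the
    terciles of a reference sample."""
    lower, upper = np.percentile(reference, [100 / 3, 200 / 3])
    return np.digitize(values, [lower, upper])

def hit_score(forecasts, obs, reference):
    """Percentage of forecasts falling in the observed category.
    With three equiprobable categories, guessing scores about 33%."""
    f_cat = tercile_categories(forecasts, reference)
    o_cat = tercile_categories(obs, reference)
    return 100.0 * np.mean(f_cat == o_cat)
```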

Page 26

Page 27

Page 28

Measures of discrimination: can the forecasts successfully distinguish different outcomes? The observations are categories, but the forecasts are continuous (except where indicated).

Page 29

ROC diagrams

ROC areas: do we issue a higher probability when the category occurs?

Graph bottom left: when the probabilities are high, does the category occur?

Graph top right: when the probabilities are low, does the category not occur?

Retroactive forecasts of MAM 1986 – 2010 Thailand rainfall using February Pacific SSTs
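The ROC area itself can be computed without drawing the diagram, using the two-sample (Mann–Whitney) interpretation of the area: the probability that the issued probability is higher when the category occurs than when it does not. A naive sketch, not CPT’s algorithm:

```python
import numpy as np

def roc_area(probs, occurred):
    """ROC area for one category: probability that the forecast probability is
    higher when the category occurs than when it does not (ties count one half)."""
    p_yes = probs[occurred]        # probabilities issued in years when the category occurred
    p_no = probs[~occurred]        # probabilities issued in the other years
    wins = sum((p > p_no).sum() + 0.5 * (p == p_no).sum() for p in p_yes)
    return wins / (p_yes.size * p_no.size)

# probs: numpy array of forecast probabilities for, e.g., the above-normal category;
# occurred: boolean numpy array, True in years when that category was observed.
```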

Page 30

Relative Operating Characteristics

Page 31

Continuous scores

Correlations

Pearson’s: % variance

Spearman’s: % variance of ranks

Kendall’s: 2AFC – probability of successfully identifying warmer / wetter observation

Errors

Mean bias: unconditional error

Variance bias: underestimation of variability

RMSE: correlation, mean and variance bias

MAE: average error

Page 32

Categorical scores

Hits

Hit score: % correct

Hit skill: % correct adjusted for guessing

LEPS: adjusts for near-misses

Gerrity: adjusts for near-misses

Discrimination

2AFC: probability of successfully identifying warmer / wetter category

ROC: probability of successfully identifying observation in current category

Page 33

Significance testing

Tools ~ Validation ~ Cross-validated ~ Bootstrap

Page 34

Probabilistic Forecasts

Why do we issue forecasts probabilistically?

• We cannot be certain what is going to happen

• The probabilities try to give an indication of how confident we are that the specified outcome will occur.

Page 35

Verification of probabilistic forecasts

Attributes Diagrams: graphs showing reliability, resolution and sharpness
ROC Diagrams: graphs showing discrimination
Scores: a table of scores for probabilistic forecasts
Skill Maps: maps of scores for probabilistic forecasts
Tendency Diagram: graphs showing unconditional biases
Ranked Hits Diagram: graphs showing the frequencies of observed categories having the highest probability
Weather Roulette: graphs showing estimates of forecast value

Page 36

What makes a “good” probabilistic forecast?

Reliability: the event occurs as frequently as implied by the forecast probabilities.

Sharpness: the forecast probabilities frequently differ considerably from climatology.

Resolution: the outcome differs when the forecast differs.

Discrimination: the forecasts differ when the outcome differs.

Page 37

Attributes diagrams

The histograms show the sharpness.

The vertical and horizontal lines show the observed climatology and indicate the forecast bias.

The diagonal lines show reliability and “skill”.

The coloured line shows the reliability and resolution of the forecasts.

The dashed line shows a smoothed fit.

Page 38

Probabilistic scores

Scores per category

Brier score: mean squared error in probability (assuming that the probability should be 100% if the category occurs and 0% if it does not occur)

Brier skill score: % improvement over Brier score using climatology forecasts (often pessimistic because of strict requirement for reliability)

ROC area: probability of successfully discriminating the category (i.e., how frequently the forecast probability for that category is higher when it occurs than when it does not occur)

Resolution slope: % increase in frequency for each 1% increase in forecast probability
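For concreteness, a sketch of the Brier score and Brier skill score for one category, with constant one-third climatological probabilities assumed as the reference.

```python
import numpy as np

def brier_score(probs, occurred):
    """Brier score for one category: mean squared error of the forecast probability,
    treating the outcome as 1 when the category occurs and 0 otherwise."""
    return np.mean((probs - occurred.astype(float)) ** 2)

def brier_skill_score(probs, occurred, clim_prob=1.0 / 3.0):
    """Percentage improvement over constant climatological probabilities
    (one third per category for tercile forecasts)."""
    bs_forecast = brier_score(probs, occurred)
    bs_climatology = brier_score(np.full_like(probs, clim_prob), occurred)
    return 100.0 * (1.0 - bs_forecast / bs_climatology)
```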

Page 39

Probabilistic scores

Overall scores

Ranked prob score: mean squared error in cumulative probabilities

RPSS: % improvement over RPS using climatology forecasts (often pessimistic because of strict requirement for reliability)

2AFC score: probability of successfully discriminating the wetter or warmer category

Resolution slope: % increase in frequency for each 1% increase in forecast probability

Effective interest: % return given fair odds

Linear prob score: average probability on the category that occurs

Hit score (rank n): how often the category with the nth highest probability occurs

Page 40

Verification of Probabilistic Categorical Forecasts: The Ranked Probability Skill Score (RPSS)

Epstein (1969), J. Appl. Meteor.

RPSS measures cumulative squared error between categorical forecast probabilities and the observed categorical probabilities relative to a reference (or standard baseline) forecast. The observed categorical probabilities are 100% in the observed category, and 0% in all other categories.

RPS = \sum_{cat=1}^{Ncat} \left( Pcum_{F}(cat) - Pcum_{O}(cat) \right)^2

where Ncat = 3 for tercile forecasts. The “cum” implies that the summation is done for cat 1, then cat 1 and 2, then cat 1 and 2 and 3.

Page 41

RPS = \sum_{cat=1}^{Ncat} \left( Pcum_{F}(cat) - Pcum_{O}(cat) \right)^2

The higher the RPS, the poorer the forecast. RPS = 0 means that a probability of 100% was given to the category that was observed. The RPSS compares the RPS for the forecast with the RPS for a reference forecast, such as one that gives climatological probabilities.

RPSS = 1 - \frac{RPS_{forecast}}{RPS_{reference}}

RPSS > 0 when RPS for actual forecast is smaller than RPS for the reference forecast.
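A small sketch of the RPS and RPSS as defined above, assuming tercile forecasts and climatological reference probabilities of one third per category.

```python
import numpy as np

def rps(forecast_probs, obs_cat):
    """Ranked probability score for one forecast: squared differences between
    cumulative forecast and cumulative observed (0/1) probabilities, summed over categories."""
    obs_probs = np.zeros(len(forecast_probs))
    obs_probs[obs_cat] = 1.0                      # 100% in the observed category
    return np.sum((np.cumsum(forecast_probs) - np.cumsum(obs_probs)) ** 2)

def rpss(forecast_probs, obs_cats, clim=(1 / 3, 1 / 3, 1 / 3)):
    """RPSS = 1 - (mean RPS of the forecasts) / (mean RPS of climatology forecasts)."""
    rps_f = np.mean([rps(np.asarray(p), o) for p, o in zip(forecast_probs, obs_cats)])
    rps_c = np.mean([rps(np.asarray(clim), o) for o in obs_cats])
    return 1.0 - rps_f / rps_c

# Example: forecast (0.5, 0.3, 0.2) with the below-normal category (0) observed:
#   rps([0.5, 0.3, 0.2], 0) -> (0.5-1)**2 + (0.8-1)**2 + (1-1)**2 = 0.29
```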

Page 42

What is “skill”?
