Michael K. Tippett, indico.ictp.it/event/a09161/session/17/contribution/10/material/0/0.pdf
Pitfalls of cross validation
Michael K. Tippett
International Research Institute for Climate and Society, The Earth Institute, Columbia University
Statistical Methods in Seasonal Prediction, ICTP, Aug 2-13, 2010
Main ideas
- Cross-validation is useful for estimating the performance of a model on independent data.
- Few assumptions.
- Computationally expensive.
- Can be misused.
Outline
- Cross-validation and linear regression
  - Computational method
  - Bias
- Pitfalls
  - Not including model/predictor selection in the cross-validation.
  - Not leaving out enough data so that the training and validation samples are independent.
Cross-validation is expensive.
Cross-validation of linear regression
Cross-validated error can be computed without the computational cost of cross-validation:

y(i) - y_cv(i) = e(i) / (1 - v_ii)

where
- y_cv(i) is the prediction from the regression with the i-th sample left out of the computation of the regression coefficients,
- e(i) = y(i) - ŷ(i) is the in-sample error,
- v_ii is the i-th diagonal element of the "hat" matrix V = X (X^T X)^{-1} X^T.
(Cook & Weisberg, Residuals and Influence in Regression, 1980)
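The identity is easy to verify numerically. A minimal Python/numpy sketch (toy data, not from the talk) compares the shortcut against a brute-force leave-one-out loop:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

# Design matrix with intercept; hat matrix V = X (X^T X)^{-1} X^T
X = np.column_stack([np.ones(n), x])
V = X @ np.linalg.solve(X.T @ X, X.T)
e = y - V @ y                        # in-sample residuals e(i)
loo_shortcut = e / (1 - np.diag(V))  # y(i) - y_cv(i) via the identity

# Brute-force leave-one-out errors for comparison
loo_direct = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    loo_direct[i] = y[i] - X[i] @ beta

assert np.allclose(loo_shortcut, loo_direct)
```

One regression on the full data set replaces n refits, which is the point of the formula.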
Cross-validation for linear regression
What is the correlation of a climatological forecast?
What is the cross-validated correlation of a climatological forecast?

- X = column of n ones.
- X^T X = n, (X^T X)^{-1} = 1/n.
- V = X (X^T X)^{-1} X^T = the n × n matrix with all entries 1/n, so v_ii = 1/n.
- e(i) = y(i) - ȳ.

y_cv(i) = y(i) - e(i) / (1 - v_ii)
        = y(i) - (y(i) - ȳ) / (1 - 1/n)
        = ȳ n/(n - 1) - y(i)/(n - 1)

What is the correlation between y and y_cv? Why?
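The answer can be checked directly. Since the leave-one-out climatology is y_cv(i) = (n ȳ - y(i))/(n - 1), a decreasing affine function of y(i), its correlation with y is exactly -1. A small numpy confirmation (toy data):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
y = rng.normal(size=n)

# Leave-one-out climatology: the mean of the other n - 1 values
y_cv = np.array([np.delete(y, i).mean() for i in range(n)])

# Matches the closed form (n*ybar - y(i)) / (n - 1)
assert np.allclose(y_cv, (n * y.mean() - y) / (n - 1))

# Decreasing affine in y(i), hence perfectly anticorrelated with y
assert np.isclose(np.corrcoef(y, y_cv)[0, 1], -1.0)
```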
Cross-validation for linear regression
More intuitively:
- Leaving out a wet year gives a drier mean.
- Leaving out a dry year gives a wetter mean.
Negative correlation.
Problem solved if somehow the mean did not change. How?
Cross-validation for linear regression

Some benefit to leaving out more years and predicting the middle year (n = 30).

[Figure: cross-validated correlation of the climatology forecast vs. number of years left out (1-15); y-axis from -1 to 0.]
Problem with trend?
Cross-validation for linear regression

Some benefit to leaving out more years and predicting the middle year, but the error variance increases (n = 30).

[Figure: cross-validated 1 - normalized error variance vs. number of years left out (1-15); y-axis from -0.1 to 0.]
Bias of cross validation

n = 30, y = ax + b.

[Figure: cross-validated correlation vs. population correlation (0-1); y-axis from -0.4 to 1.]
Bias of cross validation

n = 30, y = ax + b.

[Figure: cross-validated 1 - normalized error variance vs. population correlation (0-1); y-axis from -0.2 to 1.2.]
Pitfalls of cross validation
- Performing an initial analysis using the entire data set to identify the most informative features.
- Using cross-validation to assess several models, and only stating the results for the model with the best results.
- Allowing some of the training data to be (essentially) included in the test set.

(From Wikipedia.)
Example: Philippines

Problem: predict gridded April-June precipitation over the Philippines from preceding (January-March) SST.

[Figure: maps of the SST anomaly, JFM 1971 (100E-140W, 20S-20N) and the rainfall anomalies, AMJ 1971 (115E-130E, 5N-20N).]
Example: Philippines
Simple regression model: either climatology or ENSO as predictors.

Use AIC to choose.

[Figure: map of the AIC-selected model (CL or ENSO) at each gridpoint, 115E-130E, 5N-20N.]
Example: Philippines
- Some skill (cross-validated).
- Why the negative correlation?

[Figure: maps of cross-validated correlation (-1 to 1) and 1 - error var/clim var (0 to 0.5) over the Philippines.]
Pitfall!
- Showed the model selected.
- Presented the cross-validated skill of that model.

What's wrong?

The entire dataset was used to select the predictors.

Solution?
Pitfall!
Solution: include predictor selection in cross-validation.
[Figure: maps of the min-AIC-selected model and the max-AIC-selected model (CL or ENSO) at each gridpoint, 115E-130E, 5N-20N.]
Pitfall!
Negative impact on correlation in places with skill?
[Figure: map of the change in correlation (scale -0.5 to 0.5) over the Philippines.]
Why is the impact positive in some areas?
Pitfall!
Negative impact on the normalized error.

[Figure: map of the change in 1 - error var/clim var (scale -0.2 to 0.2) over the Philippines.]
Pitfall!
Points with little skill are most affected.
[Figure: scatter plots of skill without vs. with cross-validated predictor selection, for correlation and for 1 - error var/clim var.]

Idea: conservative models don't go too far wrong...
What if the model had used many PCs?
A more nefarious (but real) example
- Observe that in a 40-member ensemble of GCM predictions some members have more skill than others.
- Pick the members with skill exceeding some threshold.
- Perform PCA and retain the PCs with skill exceeding some threshold as your predictors.
- Estimate skill using cross-validation.
Sounds harmless, maybe even clever.
What is the problem?
What is the impact?
A more nefarious (but real) example
Cross-validated forecasts show good skill.
[Figure: time series of observations and cross-validated forecasts, 1980-2010; correlation = 0.84.]
What is the real skill?
A more nefarious (but real) example
Apply this procedure 1000 times to random numbers
[Figure: histogram of the cross-validated correlations from the 1000 trials; mean correlation = 0.8.]
Never underestimate the power of a screening procedure.
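The experiment is easy to reproduce in miniature. The Python sketch below (all numbers illustrative; a top-k screen replaces the skill threshold so the selection is never empty) screens pure-noise "ensemble members" against the full record and then cross-validates only the final regression:

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_members, n_keep, trials = 30, 40, 5, 100

def loo_corr(x, y):
    """Correlation of leave-one-out linear-regression forecasts with y."""
    m = len(y)
    yhat = np.empty(m)
    for i in range(m):
        tr = np.arange(m) != i
        a, b = np.polyfit(x[tr], y[tr], 1)
        yhat[i] = a * x[i] + b
    return np.corrcoef(y, yhat)[0, 1]

cv_skill = np.empty(trials)
for t in range(trials):
    y = rng.normal(size=n)
    members = rng.normal(size=(n_members, n))    # noise posing as GCM members
    r = np.array([np.corrcoef(m, y)[0, 1] for m in members])
    best = members[np.argsort(r)[-n_keep:]]      # screening uses the FULL record
    cv_skill[t] = loo_corr(best.mean(axis=0), y)

print(round(cv_skill.mean(), 2))  # strongly positive, despite pure noise
```

Because the screening step already saw every year, the subsequent cross-validation cannot undo the selection bias.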
Cross-validation and Predictor selection: Bad
```r
predictor_selection(x, y)            # selection done once, on the FULL data set
ypred = y + NA
for (ii in 1:N) {
  out = (ii - 1):(ii + 1)            # leave out 3 consecutive years
  training = setdiff(1:N, out)
  xcv = x[training]
  ycv = y[training]
  model.cv = lm(ycv ~ xcv)
  ypred[ii] = predict(model.cv, list(xcv = x[ii]))
}
```
Cross-validation and Predictor selection: Good
```r
ypred = y + NA
for (ii in 1:N) {
  out = (ii - 1):(ii + 1)
  training = setdiff(1:N, out)
  xcv = x[training]
  ycv = y[training]
  predictor_selection(xcv, ycv)      # selection repeated inside the CV loop
  model.cv = lm(ycv ~ xcv)
  ypred[ii] = predict(model.cv, list(xcv = x[ii]))
}
```
Summary
- Predictor selection needs to be included in the cross-validation.
- Impact varies.
Example: PCA and regression
We asked: is there any benefit to predicting the PCs of y rather than y itself?

Compared regression at each gridpoint to regression between patterns:
- Compute PCs of SST.
- Compute PCs of rainfall.
- Skill from cross-validated regression between the PCs.

What is the problem?

Cannot do PCA of "future" observations. The PCA of y needs to go in the CV loop.
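What "PCA of y inside the CV loop" looks like in practice can be sketched in Python/numpy (synthetic data; three PCs and a single scalar predictor are arbitrary choices): the climatology, the EOFs, and the PC regressions are all recomputed from the training years in every fold.

```python
import numpy as np

rng = np.random.default_rng(3)
n_years, n_grid, n_pc = 30, 50, 3

x = rng.normal(size=n_years)   # scalar predictor, e.g. an SST index
Y = np.outer(x, rng.normal(size=n_grid)) + rng.normal(size=(n_years, n_grid))

ypred = np.empty_like(Y)
for i in range(n_years):
    train = np.arange(n_years) != i
    Yt = Y[train]
    mean = Yt.mean(axis=0)               # climatology from training years only
    U, s, Vt = np.linalg.svd(Yt - mean, full_matrices=False)
    pcs = U[:, :n_pc] * s[:n_pc]         # PC time series of the training years
    eofs = Vt[:n_pc]                     # spatial patterns (EOFs)
    # Regress each PC on the predictor, training data only
    coef = np.array([np.polyfit(x[train], pcs[:, k], 1) for k in range(n_pc)])
    pc_fcst = coef[:, 0] * x[i] + coef[:, 1]   # predicted PCs, left-out year
    ypred[i] = mean + pc_fcst @ eofs           # reconstruct the gridded forecast

assert ypred.shape == Y.shape
```

The left-out year never touches the EOF computation, so no "future" observations leak into the patterns.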
Example: Philippines
Y PCA outside CV vs. Y PCA inside CV

[Figure: scatter plots of pattern vs. pointwise correlation, with the PCA of Y outside the CV loop (left) and inside it (right).]
Example: Philippines
Y PCA outside CV vs. Y PCA inside CV

[Figure: scatter plots of pattern vs. pointwise 1 - error var/clim var, with the PCA of Y outside the CV loop (left) and inside it (right).]
Similar problem.
- Do CCA.
- Find patterns and time series.
- Use the time series in a regression.
- Check skill using cross-validation.
What is the problem?
What is a solution?
CPT
Climate prediction tool.
- 3 consecutive years are left out.
- CCA is applied to the remaining years.
- The CCA depends on the number of PCs retained.
- The middle year of the left-out years is forecast.
- Repeat until all years are forecast.
- The cross-validated forecast depends on the number of PCs retained.
- Select the number of PCs that optimizes the cross-validated skill.
- This skill is the forecast skill.
What is the problem? Solution?
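The usual remedy is a second, outer layer of cross-validation: choose the number of PCs within each outer training set, and score only on years that played no part in that choice. A schematic Python sketch (leave-one-out in both layers; ordinary regression on the leading predictor columns stands in for the CCA):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30
x = rng.normal(size=(n, 6))        # 6 candidate predictor columns ("PCs")
y = x[:, 0] + rng.normal(size=n)   # only the first one carries signal

def loo_error(xm, ym, k):
    """Leave-one-out squared error of regression on the first k columns."""
    m = len(ym)
    err = 0.0
    for i in range(m):
        tr = np.arange(m) != i
        X = np.column_stack([np.ones(m - 1), xm[tr, :k]])
        beta = np.linalg.lstsq(X, ym[tr], rcond=None)[0]
        err += (ym[i] - np.concatenate(([1.0], xm[i, :k])) @ beta) ** 2
    return err

# Outer loop: every year is forecast with a k chosen WITHOUT that year
ypred = np.empty(n)
for i in range(n):
    tr = np.arange(n) != i
    k = min(range(1, 7), key=lambda kk: loo_error(x[tr], y[tr], kk))
    X = np.column_stack([np.ones(n - 1), x[tr, :k]])
    beta = np.linalg.lstsq(X, y[tr], rcond=None)[0]
    ypred[i] = np.concatenate(([1.0], x[i, :k])) @ beta

skill = np.corrcoef(y, ypred)[0, 1]   # an honest skill estimate
```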
Three data sets

I Data set 1 = estimate the parameters of the model.
I Data set 2 = select the predictor/model.
I Data set 3 = estimate the skill of the model.

Why are two data sets not enough?

“Example”
Suppose many models are compared, all with the same real skill except for sampling differences: s ± δ.
Picking the model with the largest estimated skill selects s + δ, which is larger than the real skill s.
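The inflation from picking the best of many noisy skill estimates is easy to simulate. In this sketch (illustrative numbers, not from the lecture) every model has the same real skill s = 0.3, yet the skill of the selected model is systematically higher:

```python
import numpy as np

rng = np.random.default_rng(0)

n_models = 50      # candidate models compared on the same data set
true_skill = 0.3   # every model has the same real skill s
noise_sd = 0.1     # sampling uncertainty (delta) in each skill estimate
n_trials = 2000

best = np.empty(n_trials)
for t in range(n_trials):
    # Estimated skills: s plus independent sampling noise.
    estimates = true_skill + noise_sd * rng.normal(size=n_models)
    # Selecting the best-looking model picks s + delta, not s.
    best[t] = estimates.max()

print(f"real skill:          {true_skill:.2f}")
print(f"mean selected skill: {best.mean():.2f}")
```

This is why the skill of the selected model must be estimated on a third data set that played no role in the selection.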
SST prediction

I Predict monthly SST anomalies from monthly SST anomalies six months before.
I 1982-2009: 28 years.
I PCA of monthly SST anomalies.
I Cross-validation to pick the number of PCs.
I Look at the correlation of those cross-validated forecasts.

What are the problems?
SST prediction

I Leave-one-month-out cross-validation.
I Checked truncations up to 75 PCs.
I Lowest cross-validated error with 66 PCs.

What’s wrong with this picture?
[Figure: map of cross-validated forecast correlation over the tropical Pacific (120E-80W, 30S-30N), color scale -1 to 1; mean correlation = 0.64]
SST prediction

I Leave-one-year-out cross-validation.
I Checked truncations up to 75 PCs.
I Lowest cross-validated error with 6 PCs.
[Figure: map of cross-validated forecast correlation over the tropical Pacific (120E-80W, 30S-30N), color scale -1 to 1; mean correlation = 0.42]
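The 0.64 vs. 0.42 contrast illustrates how serial correlation breaks leave-one-out validation: when only one month is left out, its highly correlated neighbors remain in the training set. A self-contained sketch (synthetic AR(1) data, not the SST example) where the real skill is exactly zero shows the same effect:

```python
import numpy as np

def ar1(n, phi, rng):
    """Generate an AR(1) series: x[t] = phi * x[t-1] + noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

def cv_corr(X, y, half_width):
    """Cross-validated correlation, leaving out a window of +/- half_width
    time steps around each forecast point (half_width=0 is leave-one-out)."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.abs(np.arange(n) - i) > half_width
        A = np.c_[np.ones(keep.sum()), X[keep]]
        beta = np.linalg.lstsq(A, y[keep], rcond=None)[0]
        preds[i] = np.r_[1.0, X[i]] @ beta
    return np.corrcoef(preds, y)[0, 1]

# Predictand and 20 predictors are *independent* AR(1) processes,
# so any apparent skill is pure artifact.
loo, block = [], []
for seed in range(5):
    rng = np.random.default_rng(seed)
    n, p, phi = 240, 20, 0.9
    X = np.column_stack([ar1(n, phi, rng) for _ in range(p)])
    y = ar1(n, phi, rng)
    loo.append(cv_corr(X, y, half_width=0))     # leave one month out
    block.append(cv_corr(X, y, half_width=24))  # leave +/- 2 years out

print(f"leave-one-out corr:   {np.mean(loo):.2f}")   # spuriously positive
print(f"leave-block-out corr: {np.mean(block):.2f}") # near zero
```

Leaving out a block wide enough for the autocorrelation to decay makes the validation sample effectively independent of the training sample, and the spurious skill disappears.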
Summary
I There is an efficient method to compute leave-one-out cross-validation for linear regression.
I Cross-validation has some biases: cross-validated climatology forecasts have negative correlation.
I Include model/predictor selection inside the cross-validation.
I The left-out data must be independent of the training data.
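The efficient computation referred to is presumably the standard PRESS identity: for least-squares regression, the leave-one-out residual equals the in-sample residual divided by (1 - h_ii), where h_ii is a diagonal element of the hat matrix, so no refitting is needed. A sketch verifying it against brute force:

```python
import numpy as np

def loo_residuals(X, y):
    """Leave-one-out residuals for OLS without refitting n times:
    e_loo[i] = e[i] / (1 - h[i]), h = diagonal of the hat matrix."""
    A = np.c_[np.ones(len(y)), X]      # design matrix with intercept
    H = A @ np.linalg.pinv(A)          # hat matrix A (A'A)^-1 A'
    e = y - H @ y                      # in-sample residuals
    return e / (1.0 - np.diag(H))

# Check against explicit leave-one-out refits.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=40)

brute = np.empty(40)
for i in range(40):
    keep = np.arange(40) != i
    A = np.c_[np.ones(39), X[keep]]
    beta = np.linalg.lstsq(A, y[keep], rcond=None)[0]
    brute[i] = y[i] - np.r_[1.0, X[i]] @ beta

fast = loo_residuals(X, y)
print(np.allclose(fast, brute))  # True
```

One pass over the data replaces n separate regressions, which is what makes leave-one-out affordable for linear models despite cross-validation being expensive in general.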