Why do we seek help from “big data” in nowcasting of precipitation?
Data description, Introduction, Motivation
Research presentation
Radar Data Used here: 17 years of quasi-continuous records. Reflectivity at a resolution 4x4 km2, every 15min.
0.00
0.05
0.10
0.15
Prob
. Pre
cipi
tatio
n [Z
> 1
5dBZ
]
PLUS NSEP Reanalysis Data every 3h
Description of the CAPS SSEF ensembles. All ensemble members, except one, assimilate radar data.
Physics options used in the ensemble: MP-Thompson (Thompson et al.,2008), Ferrier (Ferrier et al.,2002),WSM6 (Hong and Lim,2006), WDM6 (Lim and Hong, 2010), Morrison (Morrison et al.,2009), M-Y (Milbrandt and Yau,2005); PBL-MYJ (Janji 1994),YSU (Hong et al.,2006),QNSE (Sukoriansky et al.,2005), MYNN (Nakanishi and Niino, 2006), ACM2 (Pleim,2007);LSM-Noah (Tewari et al.,2004), RUC (Benjamin et al.,2004); SW -Dudhia(Dudhia,1989), RRTMG (Iacono et al.,2008), Goddard (Chou and Suarez,1999); lsls LW-RRTM (Mlawer et al.,1997),RRTMG (Iacono et al.,2008).
The NWP model data Year Forecast
lengthIC/LBC/PHYS MP only PB L only Miscellaneous
2008 30h 8 members;MP : Thompson,Ferrier, WSM6;
SW : Dudhia, Goddard;PBL : MYJ, YSU.
;2.2VWRA-FRW--CN :MP : Thompson,
SW -G oddard, PBL - MYJ;all : LW -R RTM;
LSM -Noah.2009 30h --
2010 30h
2011 36h
2012 36h
2013 48h
WRF-ARW V3.0.1.1;CN :MP: Thompson,
SW -G oddard, PBL -MYJ,
all : LW -R RTM.LSM -Noah;
SW -G oddard;
WRF-ARW V3.2.1;CN :MP -Th ompson,
all : LW -R RTM,PBL -MYJ, LSM -Noah;
Rad ar-D A cycl ed member(CC).
SW -G oddard;
WRF-ARW V3.1.1;CN :MP: Thompson,,
all : LW -R RTM,PBL -MYJ, LSM -Noah;
2 I C-o nly memb ers .
SW -G oddard;
WRF-ARW V3.3.1;CN :MP -Th ompson,
all : LW -R RTM,PBL -MYJ, LSM -Noah;
1 SK EB memb er.
SW -RR TMG;
WRF-ARW V3.4.1;CN :MP -Th ompson,
all : LW -R RTMG,PBL -MYJ, LSM -Noah;
1 m embe r M P coupled to radiation.
LSM : Noah, RUC.
8 members;MP : Thompson,Ferrier, WSM6;
SW : Dudhia, Goddard;PBL : MYJ, YSU;
LSM : Noah.
3 members;MP : WDM6,
Ferrier, WSM6, Morr ison;PBL : MYJ;
LSM : Noah, RUC.
9 members;MP : Thompson, WDM6,Ferrier, WSM6, Morrison;
PBL : MYJ, YSU,QNSE, MYNN;
LSM : Noah.
2 members;MP : Thompson;
PBL : MYNN, QNSE;
LSM : Noah, RUC.
17 members;MP : Thompson, WDM6,
Ferrier (+) , WSM6,
PBL : MYJ, YSU,QNSE, MYNN, ACM2;
Morri son, M -Y;
LSM : Noah.
10 members,MP : Thompson-v31,
Ferrier ( 2), WSM 6 ( 5),
PBL : MYJ;Morri son, M -Y , WDM6;
LSM : Noah.
10 members;MP : Thompson,
PBL : MYJ (4) , YSUQNSE, MYNN,
ACM2 ( 3), YSU-Thom pson;
LSM : Noah.
3 m embers,MP : Morrison,
PBL : MYJ;M-Y , WD M6;
LSM : Noah.
4 m embers;
MP : Thompson;
PBL : ACM2, YSU,QNSE, MYNN;
LSM : Noah, RUC.
13 members;MP : Thompson, WDM6,
PBL : MYJ, YSU,QNSE, MYNN, ACM2;
Morrison, M -Y;
LSM : Noah.
6 m embers,MP : WDM6, NSS L, Morrison, M -Y , WSM 6;
PBL : MYJ;LSM : Noah.
10 members;
MP : Thompson;
PBL : YSU, ACM2,QNSE, MYNN;
-
Ming Xue
Mesoscale nowcasting in the past (1 to 500 km) (forecasting for 0 to 6h)
In the beginning there were analogues: “BIG DATA” was in a forecaster’s memory (accumulated experience for
situations analogous to the present) and the processing algorithms were
conceptual models.
Conceptual models: a complex problem reduced to a system with few proxy variables.
Best example in my memory:
Hector Grandoso forecasting occurrence of hailstorms in Mendoza
Then computers became more and more powerful: NWP became the future.
Models became better, numerical methods improved, more and more physics
was added…
Then, deterministic chaos cast a shadow, model errors became an issue,
physical parameterizations are always poor approximations to reality,
nature is constantly perturbed at the smaller scales and runs away from
model predictions….
Ensemble forecast accounted for uncertainties, dada assimilation would
correct the drift the NWP shortcomings.
Forecast improved continuously over time (at least at 500mb)!
AND ALL THE WHILE, MESOSCALE PRECIPITATION STUBBORNLY REFUSED TO BE WELL PREDICTED QUANTITATIVELY
WHAT IS THE PRESENT SITUATION?
Surcel, M , et al, 2015: A study on the scale-dependence of the predictability of precipitation patterns, J. Atmos. Sci., 72, 216-235.
Madalina Surcel
Methodology in a nutshel: 1- take 1-h accumulations of precipitation patterns, form radar and model forecasts 2- decompose by scale, λ 3- compare these patterns scale by scale (band-pass) 4- from this comparison determine λ0, the scale at which predicability is totally lost as a function of lead time:
That is, at λ0 forecast and verification are totally decorrelated.
Surcel, M , et al, 2015: A study on the scale-dependence of the predictability of precipitation patterns, J. Atmos. Sci., 72, 216-235.
Madalina Surcel
PHVR�њ
0 5 10 15 20 25Lead time [h]
10
100
1000
h 0 [km
] PHVR�ћ
PHVR�ќ
Methodology in a nutshel: 1- take 1-h accumulations of precipitation patterns, form radar and model forecasts 2- decompose by scale, λ 3- compare these patterns scale by scale (band-pass) 4- from this comparison determine λ0, the scale at which predicability is totally lost as a function of lead time:
That is, at λ0 forecast and verification are totally decorrelated.
Surcel, M , et al, 2015: A study on the scale-dependence of the predictability of precipitation patterns, J. Atmos. Sci., 72, 216-235.
Madalina Surcel
NWP wthout Radar Data Assim. veryfied by observations PHVR�њ
0 5 10 15 20 25Lead time [h]
10
100
1000
h 0 [km
] PHVR�ћ
PHVR�ќ
Methodology in a nutshel: 1- take 1-h accumulations of precipitation patterns, form radar and model forecasts 2- decompose by scale, λ 3- compare these patterns scale by scale (band-pass) 4- from this comparison determine λ0, the scale at which predicability is totally lost as a function of lead time:
That is, at λ0 forecast and verification are totally decorrelated.
Surcel, M , et al, 2015: A study on the scale-dependence of the predictability of precipitation patterns, J. Atmos. Sci., 72, 216-235.
Madalina Surcel
NWP wthout Radar Data Assim. veryfied by observations
NWP with Radar Data Assim. veryfied by observations
PHVR�њ
0 5 10 15 20 25Lead time [h]
10
100
1000
h 0 [km
] PHVR�ћ
PHVR�ќ
Methodology in a nutshel: 1- take 1-h accumulations of precipitation patterns, form radar and model forecasts 2- decompose by scale, λ 3- compare these patterns scale by scale (band-pass) 4- from this comparison determine λ0, the scale at which predicability is totally lost as a function of lead time:
That is, at λ0 forecast and verification are totally decorrelated.
Surcel, M , et al, 2015: A study on the scale-dependence of the predictability of precipitation patterns, J. Atmos. Sci., 72, 216-235.
Madalina Surcel
NWP wthout Radar Data Assim. veryfied by observations
NWP with Radar Data Assim. veryfied by NWP (intrinsic)
NWP with Radar Data Assim. veryfied by observations
PHVR�њ
0 5 10 15 20 25Lead time [h]
10
100
1000
h 0 [km
] PHVR�ћ
PHVR�ќ
Methodology in a nutshel: 1- take 1-h accumulations of precipitation patterns, form radar and model forecasts 2- decompose by scale, λ 3- compare these patterns scale by scale (band-pass) 4- from this comparison determine λ0, the scale at which predicability is totally lost as a function of lead time:
That is, at λ0 forecast and verification are totally decorrelated.
Surcel, M , et al, 2015: A study on the scale-dependence of the predictability of precipitation patterns, J. Atmos. Sci., 72, 216-235.
Madalina Surcel
NWP wthout Radar Data Assim. veryfied by observations
NWP with Radar Data Assim. veryfied by NWP (intrinsic)
NWP with Radar Data Assim. veryfied by observations
PHVR�њ
0 5 10 15 20 25Lead time [h]
10
100
1000
h 0 [km
] PHVR�ћ
PHVR�ќ
Methodology in a nutshel: 1- take 1-h accumulations of precipitation patterns, form radar and model forecasts 2- decompose by scale, λ 3- compare these patterns scale by scale (band-pass) 4- from this comparison determine λ0, the scale at which predicability is totally lost as a function of lead time:
That is, at λ0 forecast and verification are totally decorrelated.
Observations contribution from errors in observations
Limits to Mesoscale Predictability
Summary of model validation with radar data: With no radar data assimilation scales smaller than ~300km are not predictable
Effect of Radar Data Assimilation lasts forever, but the improvement in skill (small) lasts up to ~4h lead time (roughly as the spin-up time)
Madalina Surcel
Limits to Mesoscale Predictability
Limit to intrinsic model predicability When an ensemble member is used as truth there is some
predictability for 12h at λ>200km
Summary of model validation with radar data: With no radar data assimilation scales smaller than ~300km are not predictable
Effect of Radar Data Assimilation lasts forever, but the improvement in skill (small) lasts up to ~4h lead time (roughly as the spin-up time)
Madalina Surcel
Limits to Mesoscale Predictability
Limit to intrinsic model predicability When an ensemble member is used as truth there is some
predictability for 12h at λ>200km
Summary of model validation with radar data: With no radar data assimilation scales smaller than ~300km are not predictable
Effect of Radar Data Assimilation lasts forever, but the improvement in skill (small) lasts up to ~4h lead time (roughly as the spin-up time)
Madalina Surcel
FOR PRECIPITATION, BEYOND THESE SHORT LIMITS AND SMALLER SCALES THERE CAN ONLY BE PROBABILISTIC FORECASTING / NOWCASTING.
Can we improve short term forecasting of precipitation with the available long term radar records and model outputs?
Isztar Zawadzki, McGill University, Presenting work of:
Aitor Atencia Ruiz de Gopegui Bernat Puigdomènech Treserras
And many thanks to M.K. Yau and F. Fabry for comments and support
BACK TO ANALOGUES (but now with data instead of subjective experience)
__________________________
Probabilistic Nowcasting of Precipitation Using “Analogues” (Similar Patterns);
Preliminary Comparison with NWP.
For every radar map similar patterns were searched in 17 years of radar archives. Similarity is defined by a decision-tree comprising the following criteria:
1. Spatial cross-correlation between Rainfall patterns
2. Location to account for Geographical factors
3. Temporal correlation to select for similarity of Motion & Evolution of patterns
4. Time of the day and year to account for the Diurnal and Annual cycles
Analogue selection criteria
5- Synoptic situation to account for influence of Large scale forcing
There are three main factors needed for rainfall occurrence: instability, moisture and forcing.
The variables more appropriate were found to be:
temperature, Τ at 50 kPa, specific humidity at 70 kPa and
pressure vertical velocity, ω at 85 kPa.
Analogue selection criteria
5- Synoptic situation to account for influence of Large scale forcing
There are three main factors needed for rainfall occurrence: instability, moisture and forcing.
The variables more appropriate were found to be:
temperature, Τ at 50 kPa, specific humidity at 70 kPa and
pressure vertical velocity, ω at 85 kPa.
Analogue selection criteria
Why these 7 parameters? Because this is the conceptual model framework to which we are used and we understand, and black-box approach like SOM did
not work for precipitation patters. Surely it can be refined.
For all criteria the degree of similarity between the observation and the analogue candidate is determined by the desired ensemble number of “analogues” available in the records.
The number of “analogues” can be adjusted so that the ensemble is not over nor under-dispersive. The latter can be evaluated by the nowcast of the previous time.
Thus, the procedure can be made adaptable to the situation.
Analogue selection criteria
10.0
17.5
25.0
32.5
40.0
47.5
55.0
62.5
70.0
Ref
lect
ivity
[dBZ
]
-100 -95 -90 -85 -80 -75Longitude
30
35
40
45
50
Latit
ude
OBSERVATION
10.0
17.5
25.0
32.5
40.0
47.5
55.0
62.5
70.0
Ref
lect
ivity
[dBZ
]
-100 -95 -90 -85 -80 -75Longitude
30
35
40
45
50
Latit
ude
CANDIDATE
Example of Analogues
Example of Analogues
From these 26 patterns we determine a map of probability of precipitation occurrence
Example of 22 May 2013
Comparison of Probability of PrecipitationMaps of probability of precipitation derived from 26 members ensembles.
Green contours are radar verification
Analogues ensemble OU model ensemble
-100 -95 -90 -85 -80 -75Longitude
30
35
40
45
50
Latit
ude
0.0
0.2
0.4
0.6
0.8
1.0
Prob
. Ana
logs
-100 -95 -90 -85 -80 -75Longitude
30
35
40
45
50
Latit
ude
0.0
0.2
0.4
0.6
0.8
1.0
Prob
. NW
P
-100 -95 -90 -85 -80 -75Longitude
30
35
40
45
50La
titud
e
0.0
0.2
0.4
0.6
0.8
1.0
Prob
. NW
P
-100 -95 -90 -85 -80 -75Longitude
30
35
40
45
50
Latit
ude
0.0
0.2
0.4
0.6
0.8
1.0
Prob
. Ana
logs
Example of 10 May 2013
and they are either 1 or 0 depending whether an event occurred or not.280
• ROC area score: This score is the area under the ROC curve. ROC stands for Relative281
Operating Characteristic and it is a plot of the Probability of Detection (POD) versus282
the False Alarm Rate (FAR). These two scores are contingency-table scores that are283
based on dichotomous forecasts (Dobryshman 1972). A dichotomous forecast indicates284
whether an event has occurred or not at each point of the grid and it is specified by285
the exceedance of a certain threshold. In this study 15 dBZ (0.3 mm) is the selected286
threshold. The four combinations of forecasts (yes or no) and observations (yes or no)287
are hit, miss, false alarm and correct negative. The two scores used to define the ROC288
curve are computed as:289
POD =hit
hit + miss(3)
290
FAR =false alarm
false alarm + correct negatives(4)
These indices are computed by using a set of increasing probability thresholds (for291
example, 0.05, 0.15, 0.25, etc.) to make the yes/no decision for the ensemble forecast.292
The ROC area score’s range takes values from 0 to 1. 1 stands for a perfect forecast293
and values lower than 0.5 indicate no forecast skill.294
• The spread of an ensemble of members at a particular forecast time step is measured295
as:296
Spread =
vuut 1
Np
(Nm
� 1)·
NpX
i=1
NmX
j=1
(fij
� f̂i
)2 (5)
where Np
is the total number of points of the domain, Nm
is the number of ensemble297
members, fij
stand for i pixel of the jth ensemble member (Fj
) and f̂i
is the mean of298
the ensemble members at a given pixel i.299
12
Answers the question: What is the ability of the forecast to discriminate between precipitation events and non-events?
Range: 0 to 1. 0.5 = no skill. Perfect score: 1
ROC Area = POD d(FAR)∫Relative Operating Characteristic, ROC:
Skill Comparison (probability)
0 2 4 6 8 10Lead-time [h]
0.4
0.6
0.8
1.0
ROC
Area
No Skill
Model Analogues
(For 26 events from 31st April to 12th June and 26 members ensembles)
and they are either 1 or 0 depending whether an event occurred or not.280
• ROC area score: This score is the area under the ROC curve. ROC stands for Relative281
Operating Characteristic and it is a plot of the Probability of Detection (POD) versus282
the False Alarm Rate (FAR). These two scores are contingency-table scores that are283
based on dichotomous forecasts (Dobryshman 1972). A dichotomous forecast indicates284
whether an event has occurred or not at each point of the grid and it is specified by285
the exceedance of a certain threshold. In this study 15 dBZ (0.3 mm) is the selected286
threshold. The four combinations of forecasts (yes or no) and observations (yes or no)287
are hit, miss, false alarm and correct negative. The two scores used to define the ROC288
curve are computed as:289
POD =hit
hit + miss(3)
290
FAR =false alarm
false alarm + correct negatives(4)
These indices are computed by using a set of increasing probability thresholds (for291
example, 0.05, 0.15, 0.25, etc.) to make the yes/no decision for the ensemble forecast.292
The ROC area score’s range takes values from 0 to 1. 1 stands for a perfect forecast293
and values lower than 0.5 indicate no forecast skill.294
• The spread of an ensemble of members at a particular forecast time step is measured295
as:296
Spread =
vuut 1
Np
(Nm
� 1)·
NpX
i=1
NmX
j=1
(fij
� f̂i
)2 (5)
where Np
is the total number of points of the domain, Nm
is the number of ensemble297
members, fij
stand for i pixel of the jth ensemble member (Fj
) and f̂i
is the mean of298
the ensemble members at a given pixel i.299
12
Answers the question: What is the ability of the forecast to discriminate between precipitation events and non-events?
Range: 0 to 1. 0.5 = no skill. Perfect score: 1
ROC Area = POD d(FAR)∫Relative Operating Characteristic, ROC:
Skill Comparison (probability)
Even though the patterns were different, the probability map have a reasonable performance with both techniques.
0 2 4 6 8 10Lead-time [h]
0.4
0.6
0.8
1.0
ROC
Area
No Skill
Model Analogues
(For 26 events from 31st April to 12th June and 26 members ensembles)
and they are either 1 or 0 depending whether an event occurred or not.280
• ROC area score: This score is the area under the ROC curve. ROC stands for Relative281
Operating Characteristic and it is a plot of the Probability of Detection (POD) versus282
the False Alarm Rate (FAR). These two scores are contingency-table scores that are283
based on dichotomous forecasts (Dobryshman 1972). A dichotomous forecast indicates284
whether an event has occurred or not at each point of the grid and it is specified by285
the exceedance of a certain threshold. In this study 15 dBZ (0.3 mm) is the selected286
threshold. The four combinations of forecasts (yes or no) and observations (yes or no)287
are hit, miss, false alarm and correct negative. The two scores used to define the ROC288
curve are computed as:289
POD =hit
hit + miss(3)
290
FAR =false alarm
false alarm + correct negatives(4)
These indices are computed by using a set of increasing probability thresholds (for291
example, 0.05, 0.15, 0.25, etc.) to make the yes/no decision for the ensemble forecast.292
The ROC area score’s range takes values from 0 to 1. 1 stands for a perfect forecast293
and values lower than 0.5 indicate no forecast skill.294
• The spread of an ensemble of members at a particular forecast time step is measured295
as:296
Spread =
vuut 1
Np
(Nm
� 1)·
NpX
i=1
NmX
j=1
(fij
� f̂i
)2 (5)
where Np
is the total number of points of the domain, Nm
is the number of ensemble297
members, fij
stand for i pixel of the jth ensemble member (Fj
) and f̂i
is the mean of298
the ensemble members at a given pixel i.299
12
Answers the question: What is the ability of the forecast to discriminate between precipitation events and non-events?
Range: 0 to 1. 0.5 = no skill. Perfect score: 1
ROC Area = POD d(FAR)∫Relative Operating Characteristic, ROC:
Skill Comparison (probability)
Even though the patterns were different, the probability map have a reasonable performance with both techniques.
0 2 4 6 8 10Lead-time [h]
0.4
0.6
0.8
1.0
ROC
Area
No Skill
Model Analogues
The differences are statistically significative.
(For 26 events from 31st April to 12th June and 26 members ensembles)
Answers the question: What is the magnitude of the probability forecast errors?
Range: 0 to 1 Perfect score: 0
Brier Score = 1N
1Nm
f jj=1
Nm∑⎛
⎝⎜⎞
⎠⎟− oi
⎡
⎣⎢⎢
⎤
⎦⎥⎥
2
i=1
N
∑
Skill Comparison (probability)
indicates points in space (N); indicates members (Nm) is forecast value: 0, miss; 1, hit is observation: 0, miss; 1, hit
ijfoi
0 2 4 6 8 10Lead-time [h]
0.00
0.02
0.04
0.06
0.08
0.10
Brie
r Model Analogues
(For 26 events from 31st April to 12th June and 26 members ensembles)
Answers the question: What is the magnitude of the probability forecast errors?
Range: 0 to 1 Perfect score: 0
Brier Score = 1N
1Nm
f jj=1
Nm∑⎛
⎝⎜⎞
⎠⎟− oi
⎡
⎣⎢⎢
⎤
⎦⎥⎥
2
i=1
N
∑
Skill Comparison (probability)
indicates points in space (N); indicates members (Nm) is forecast value: 0, miss; 1, hit is observation: 0, miss; 1, hit
ijfoi
0 2 4 6 8 10Lead-time [h]
0.00
0.02
0.04
0.06
0.08
0.10
Brie
r Model Analogues
The differences are statistically significative.
(For 26 events from 31st April to 12th June and 26 members ensembles)
0 2 4 6 8 10Lead-time [h]
0.0
0.5
1.0
1.5
2.0
2.5
RM
SE [S
kill]
Model Analog
0
5
10
15
20
25
Even
t
Skill Comparison (RMS)
The differences are statistically significative.
(For 26 events from 31st April to 12th June and 26 members ensembles)
RMSE = 1N
(Roi=1
N
∑ − f )2
indicates point (N); is observation is average of members
iR0f
0 2 4 6 8 10Lead-time [h]
0.0
0.5
1.0
1.5
2.0
2.5
RM
SE [S
kill]
Model Analog
0
5
10
15
20
25
Even
t
Skill Comparison (RMS)
The differences are statistically significative.
(For 26 events from 31st April to 12th June and 26 members ensembles)
Analogues have more information on the probability and intensity of precipitation.
RMSE = 1N
(Roi=1
N
∑ − f )2
indicates point (N); is observation is average of members
iR0f
Do analogues ensembles forecasts cover observations? Skill-Spread Ratio
(For 26 events from 31st April to 12th June and 26 members ensembles)
0 2 4 6 8 10Lead-time [h]
0.2
1.0
10
RM
SE -
Skill
SD o
f Ens
embl
e Sp
read
Analogues
Over-dispersive
Under-dispersiveSpread = 1
N(Nm −1)( f j − f )2
j=1
Nm
∑i=1
N
∑
indicates point (N); indicates members (Nm) is average of members
ijf
Do analogues ensembles forecasts cover observations? Skill-Spread Ratio
The small under-dispersivity is perhaps due to the limited number of ensemble members
(For 26 events from 31st April to 12th June and 26 members ensembles)
0 2 4 6 8 10Lead-time [h]
0.2
1.0
10
RM
SE -
Skill
SD o
f Ens
embl
e Sp
read
Analogues
Over-dispersive
Under-dispersiveSpread = 1
N(Nm −1)( f j − f )2
j=1
Nm
∑i=1
N
∑
indicates point (N); indicates members (Nm) is average of members
ijf
Do model ensembles forecasts cover observations? Skill-Spread Ratio
(For 26 events from 31st April to 12th June. For 26 members ensembles,
and for 15 most dispersive members)
0 2 4 6 8 10Lead-time [h]
0.2
1.0
10
OU model RM
SE -
Skill
SD o
f Ens
embl
e Sp
read
Over-dispersive
Under-dispersive
1526Spread = 1
N(Nm −1)( f j − f )2
j=1
Nm
∑i=1
N
∑
indicates point (N); indicates members (Nm) is average of members
ijf
Do model ensembles forecasts cover observations? Skill-Spread Ratio
(For 26 events from 31st April to 12th June. For 26 members ensembles,
and for 15 most dispersive members)
And it is not bias. It is hard to get model ensembles sufficiently dispersive.
0 2 4 6 8 10Lead-time [h]
0.2
1.0
10
OU model RM
SE -
Skill
SD o
f Ens
embl
e Sp
read
Over-dispersive
Under-dispersive
1526Spread = 1
N(Nm −1)( f j − f )2
j=1
Nm
∑i=1
N
∑
indicates point (N); indicates members (Nm) is average of members
ijf
NEXT True analogues: close states on an attractor.
The Lorenz attractor
As a 3-D trajectory with 2-D projections
o
The X-Y projection of a sparse attractor
101
102
103
104
105
106
107
108
Den
sity
[# p
oint
s]
-20 -10 0 10 20X (t)
-20
-10
0
10
20
Y (t)
Projection on the X,Y plane)
LABasin
The X-Y projection of the full attractor: high density of
trajectories (points)
Dimension (fraction of phase space occupied by the attractor)
Correlation Dimension: 1-take a circle of radius r and count the total nº of points within the circle for all
positions in the domain, Cr 2- Repeat for all r and plot log Cr vs log r The slope is the CD0 2 4 6 8
log(r)
18
20
22
24
log(
Cr)
XY Projection
Slope=Corr. Dim
=1.215
dxdt
=σ (y − x); dydt
= x(ρ − z)− y; dzdt
= xy − βz σ = 10; ρ = 28 ; β = 83
The trajectory of the LA attractor is obtained by solving three differential
equations.
If the time of each point of the trajectory is recorded the equations are no
longer needed. The attractor is “DATA”
Starting from the red point we can obtain the future trajectories (red) of all
the analogues within the point. This is the analogues ensemble forecast.
Forecasting with the attractor
-20 -10 0 10 20X(t)
0
10
20
30
40
50
Z(t)
The trajectory of the LA attractor is obtained by solving three differential
equations.
If the time of each point of the trajectory is recorded the equations are no
longer needed. The attractor is “DATA”
Starting from the red point we can obtain the future trajectories (red) of all
the analogues within the point. This is the analogues ensemble forecast.
Forecasting with the attractor
-20 -10 0 10 20X(t)
0
10
20
30
40
50
Z(t)
An ensemble forecast of x at time tf is made by choosing a set of close values on the
attractor around x(t0), y(t0), z(t0), the ANALOGUES, and then following the evolution of
the ensemble average of x as well as their spread in time until the forecast time tf.
The trajectory of the LA attractor is obtained by solving three differential
equations.
If the time of each point of the trajectory is recorded the equations are no
longer needed. The attractor is “DATA”
Starting from the red point we can obtain the future trajectories (red) of all
the analogues within the point. This is the analogues ensemble forecast.
If the equations for a system are known it may be more practical to get a solution
every time a forecast is needed than have the huge table for all the solutions.
But, for the atmosphere we ignore the exact equations. Moreover, we cannot
compute all the solutions from the equations we have.
Forecasting with the attractor
-20 -10 0 10 20X(t)
0
10
20
30
40
50
Z(t)
An ensemble forecast of x at time tf is made by choosing a set of close values on the
attractor around x(t0), y(t0), z(t0), the ANALOGUES, and then following the evolution of
the ensemble average of x as well as their spread in time until the forecast time tf.
What happens if we only have a reduced dimension?
101
102
103
104
105
106
107
108
Den
sity
[# p
oint
s]
-20 -10 0 10 20X (t)
-20
-10
0
10
20
Y (t)
Projection on the X,Y plane)
LABasin
0Climatology:
RMS of the LA
0.1 1.0Time [s]
33 Analogues on the 3-D attractor
10 1000.01
0.10
1.0
10
100 0
x-R
MS
of th
e An
alog
ues
Ense
mbl
e Sp
read
The value of the forecast by analogues-ensemble is measured by
the distance of the curve to the RMS of climatology.
What happens if we only have a reduced dimension?
101
102
103
104
105
106
107
108
Den
sity
[# p
oint
s]
-20 -10 0 10 20X (t)
-20
-10
0
10
20
Y (t)
Projection on the X,Y plane)
LABasin
0Climatology:
RMS of the LA
0.1 1.0Time [s]
33 Analogues on the 3-D attractor
10 1000.01
0.10
1.0
10
100 0
x-R
MS
of th
e An
alog
ues
Ense
mbl
e Sp
read
Even if the dimension of the attractor is reduced (from 3 to 2
in this case) the information may have practical value.
141 analogues on the 2_D attractor(Reduc. dim. anal.)
The value of the forecast by analogues-ensemble is measured by
the distance of the curve to the RMS of climatology.
What happens if we only have a reduced dimension?
101
102
103
104
105
106
107
108
Den
sity
[# p
oint
s]
-20 -10 0 10 20X (t)
-20
-10
0
10
20
Y (t)
Projection on the X,Y plane)
LABasin
0Climatology:
RMS of the LA
The analogues ensemble forecast from the reduced dimension attractor
contains a greater amount of information than climatology.
0.1 1.0Time [s]
33 Analogues on the 3-D attractor
10 1000.01
0.10
1.0
10
100 0
x-R
MS
of th
e An
alog
ues
Ense
mbl
e Sp
read
Even if the dimension of the attractor is reduced (from 3 to 2
in this case) the information may have practical value.
141 analogues on the 2_D attractor(Reduc. dim. anal.)
The value of the forecast by analogues-ensemble is measured by
the distance of the curve to the RMS of climatology.
Must we know the exact variables of the phase space?
0
We can taylor the attractor to fit our needs. Say, we want to
forecast the mean of x,y,z, the eccentricity of x,z, [z2/(x2+z2)]½, and
the euclidian distance between x and y. From the original attractor
we can construct an attractor in these coordinates:
0.001 0.01 0.1 1 10 10010-8
10-6
10-4
10-2
100
0.001 0.01 0.1 1 10 100Radius
10-8
10-6
10-4
10-2
100
Cr
Cor. Dim: 2.08
The attractor conserves all its characteristic.
WE CAN USE PROXY VARIABLES TO DEFINE THE PHASE SPACE
101
102
103
104
105
106
107
Den
sity
[# p
oint
s]
5 10 15 20 25 30d(x,y)
-5
0
5
10
15
20
25
Mea
n (x
,y,z
)
Is this low order system relevant for Nowcasting of such a high order system as precipitation?
Consider the attractor as the climatology of the
system represented as the joint probability of
the phase-space variables, p(x|y|(z). The
number of variables can be reduced, or use
proxy variables, and still the ensemble of
analogues will have predictive value. 101
102
103
104
105
106
107
108
Den
sity
[# p
oint
s]
-20 -10 0 10 20X (t)
-20
-10
0
10
20
Y (t)
Projection on the X,Y plane)
LABasin
Is this low order system relevant for Nowcasting of such a high order system as precipitation?
Consider the attractor as the climatology of the
system represented as the joint probability of
the phase-space variables, p(x|y|(z). The
number of variables can be reduced, or use
proxy variables, and still the ensemble of
analogues will have predictive value. 101
102
103
104
105
106
107
108
Den
sity
[# p
oint
s]
-20 -10 0 10 20X (t)
-20
-10
0
10
20
Y (t)
Projection on the X,Y plane)
LABasin
OUR GOAL IS TO EXPLORE THE USE OF RADAR DATA, TOGETHER
WIH OTHER DATA, TO CONSTRUCT A CLIMATOLOGY OF THE JOINT
PROBABILITY OF RELEVANT VARIABLES, WHICH WE WILL CALL
“RAIN ATTRACTOR” AND STUDY ITS NOWCASTING POTENTIAL
Is this low order system relevant for Nowcasting of such a high order system as precipitation?
Consider the attractor as the climatology of the
system represented as the joint probability of
the phase-space variables, p(x|y|(z). The
number of variables can be reduced, or use
proxy variables, and still the ensemble of
analogues will have predictive value. 101
102
103
104
105
106
107
108
Den
sity
[# p
oint
s]
-20 -10 0 10 20X (t)
-20
-10
0
10
20
Y (t)
Projection on the X,Y plane)
LABasin
OUR GOAL IS TO EXPLORE THE USE OF RADAR DATA, TOGETHER
WIH OTHER DATA, TO CONSTRUCT A CLIMATOLOGY OF THE JOINT
PROBABILITY OF RELEVANT VARIABLES, WHICH WE WILL CALL
“RAIN ATTRACTOR” AND STUDY ITS NOWCASTING POTENTIAL
What are “RELEVANT VARIABLES”?
Here we go back to conceptual models and Hector Grandoso.
A first attempt at a “Rain Attractor”
From 17 years of continental composites of precipitation patterns we
can construct a 5-D (sparce) RAIN ATTRACTOR. The phase space we
use is comprised of one thermodynamic variable and 4 statistical
properties of the pattern:
70-50 kPa thickness (every 3h, from reanalysis),
Area of precipitation,
its Eccentricity,
Marginal Mean reflectivity,
Decorrelation time.
Here we will use all variables or a smaller dimension subset of them.
105 1062.4
2.5
2.6
2.7
1 10Decorrelation time [h]
0.0 0.2 0.4 0.6 0.8 1.0Eccentricity
20 25 30Marginal mean [dBZ]
1
2
7
18
50
132
Area [km2]2.4
2.5
2.6
2.7
2.4
2.5
2.6
2.7
(50-
70) k
Pa th
ickn
ess
[km
]
2.4
2.5
2.6
2.7
Nº o
f poi
nts
(50-
70) k
Pa th
ickn
ess
[km
]
(50-
70) k
Pa th
ickn
ess
[km
]
(50-
70) k
Pa th
ickn
ess
[km
]
Density Projections of a 5-D Rain Attractor
-1.5 -1.0 -0.5 0.0 0.5 1.0log(r)
-10
-8
-6
-4
-2
0
log(
Cr)
Phase space:Area - M.M. - Ecc. - Dec Time - Thickness
corr. dimension = 4.81
The other Projections of the Rain Attractor
For this 4-D attractor:
-2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0log(r)
-10
-8
-6
-4
-2
0
log(
Cr)
corr. dimension = 3.74
Phase space:Area - M.M. - Ecc. - Dec Time
2 7 19 51 136 364Nº of points
1
20 25 30Marginal Mean [dBZ]
105
106
Area
[km
2 ]
20 25 30Marginal Mean [dBZ]
1
10D
ecor
rela
tion
time
[h]
0.0 0.2 0.4 0.6 0.8 1.0Eccentricty
105
106
Area
[km
2 ]
0.0 0.2 0.4 0.6 0.8 1.0Eccentricty
1
10
Dec
orre
latio
n tim
e [h
]
105 106
Area [km2]
1
10
Dec
orre
latio
n tim
e [h
]
20 25 30Marginal Mean [dBZ]
0.0
0.2
0.4
0.6
0.8
1.0
Ecce
ntric
ty
Trajectories on the Rain Attractor
1
2
7
19
50
136
363
Den
sity
20 25 30Marginal Mean [dBZ]
105
106
Area
[km
2 ] 1
1
2
6
18
47
125
331
Den
sity
105 106
Area [km2]
1
10
Dec
orre
latio
n tim
e [h
]
1
26/10/1995 - 28/02/2014 374286 files scale 1 - 4x4km - point 1 - Z average t=0 t=100 days
Forecast and Ensemble Spread on the Rain Attractor
How does it work? Say, at the present time we have only marginal mean (MM). 1-Select in the 1-D attractor 250 cases with the closest MM
We see that there is a skill better than climatology for ~2 days.
ForecastEnsemble spread
Marginal mean,M.M
10-2
Lead-Time [days]
0.0
0.2
0.4
0.6
0.8
1.0
Nor
mal
ized
SD
Spr
ead
of M
argi
nal M
ean
10 100Lead-Time [days]
22
25M
argi
nal M
ean
[dBZ
]
250 analogues26
24
23
27
0.1 10.1 1 10 100
climatology of the season
2-Make the ensemble forecast
3-Measure the ensemble spread
Forecast and Ensemble Spread on the Rain Attractor
1-Select in this 4-D attractor 250 cases with the closest values of these 4 parameters 2-Make the average ensemble forecast 3-Measure the ensemble spread
ForecastEnsemble spread
10-2
Lead-Time [days]
0.0
0.2
0.4
0.6
0.8
1.0
Nor
mal
ized
SD
Spr
ead
of M
argi
nal M
ean
10 100Lead-Time [days]
22
25M
argi
nal M
ean
[dBZ
]
250 analogues26
24
23
27
0.1 10.1 1 10 100
climatology of the season
Marginal mean,M.MM.M + Area + Ecc. + Dec-Time
Now add to MM the Area, Eccentricity and Decorrelation Time of the pattern.
We see that the skill is much better than climatology for ~10 days.
Forecast and Ensemble Spread on the Rain Attractor
1-Select in this 3-D attractor 250 cases with the closest values of the 3 parameters 2-Make the average ensemble forecast 3-Measure the ensemble spread
ForecastEnsemble spread
10-2
Lead-Time [days]
0.0
0.2
0.4
0.6
0.8
1.0
Nor
mal
ized
SD
Spr
ead
of M
argi
nal M
ean
10 100Lead-Time [days]
22
25M
argi
nal M
ean
[dBZ
]
250 analogues26
24
23
27
0.1 10.1 1 10 100
climatology of the season
Marginal mean,M.MM.M + Area + Ecc. + Dec-Time
Thickness + M.M. + Dec-Time
Now we take the 50-70 kPa Thickness, MM, and Decorrelation Time.
We see that the skill is much better than climatology for ~30 days and longer.
Comparison of Performance
30/04/2013 00:00 - 26 analogues
0 2 4 6 8 10Lead-Time [h]
18
20
22
24
26
28
Mar
gina
l Mea
n [d
BZ]
Area: 4.07e5 [km2]Marginal Mean: 24.69 [dBZ]Decorrelation time: 2.74 [h]Eccentricity: 0.86
Starting time ((2h): 00:00 [UTC] Similar pattern analogues4D attractor analogues
Verification
0 2 4 6 8 10Lead-Time [h]
105
5x105
Area
[km
2 ]
Area: 4.07e5 [km2]Marginal Mean: 24.69 [dBZ]Decorrelation time: 2.74 [h]Eccentricity: 0.86
Starting time ((2h): 00:00 [UTC] Similar pattern analogs4D attractor analogs
Verification
Not surprisingly, attractor analogues are better at predicting Area and
Marginal Mean that similar patterns. They were tailored for this purpose!
And the average of 24 days:24 days x 26 analogues
0 2 4 6 8 10Lead-Time [h]
20
22
24
26
28
Mar
gina
l Mea
n [d
BZ]
Similar pattern analogues 4D attractor analogues
Verification
0 2 4 6 8 10Lead-Time [h]
105
106
Area
[km
2 ]
Similar pattern analogues 4D attractor analogues
Verification
0 2 4 6 8 10Lead-Time [h]
20
22
24
26
28M
argi
nal M
ean
[dBZ
]
Similar pattern analogues 3D attractor analogues
Verification
0 2 4 6 8 10Lead-Time [h]
105
106
Area
[km
2 ]
Similar pattern analogues 3D attractor analogues
Verification
Comparison of Performance
This attractor approach is now moving to Switzerland where it became part
of an SNSF AMBIZIONE project (Urs German, Loris Foresti).
Combination with satellite data, surface data, orography;
Search for the smallest most effective phase space (principal components?)
State dependence
…..
A PhD position available
This was a first exploration of the “Rain Attractor”. It looks promising.
NEXT:
THANK YOU