evaluating conditional density estimation networks as a … · 2015. 6. 26. · evaluating...

50
Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application to precipitation in Newfoundland by c Biswas, Rasel B. Sc.(Hons), M. Sc. A report submitted to the School of Graduate Studies in partial fulfillment of the requirements for the degree of Master of Science. Department of Scientific Computing Memorial University of Newfoundland 21st June 2015 St. John’s Newfoundland

Upload: others

Post on 22-Jan-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Evaluating conditional density estimation networks as a probabilisticdownscaling tool: Application to precipitation in Newfoundland

by

c⃝ Biswas, RaselB. Sc.(Hons), M. Sc.

A report submitted to theSchool of Graduate Studiesin partial fulfillment of the

requirements for the degree ofMaster of Science.

Department of Scientific ComputingMemorial University of Newfoundland

21st June 2015

St. John’s Newfoundland

Page 2: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Abstract

The objective of this research is to better quantify the distribution of extreme pre-

cipitation within Newfoundland, for the current climate conditions. The province of

Newfoundland is interested in this information as guidance for climate adaptation

and longterm infrastructure planning purposes. Extreme analyses are commonly lim-

ited by short periods of observation (often only 30 years or less), resulting in large

uncertainty regarding rare events. These limitations can be addressed by increasing

the period of observation, or partially addressed using synthetic time series generated

by stochastic weather generators. These weather generators attempt to replicate

relevant statistics of the observed climate, using various statistical modelling tools.

Here, a probabilistic neural network-based downscaling method is used to predict the

probability distribution of precipitation at a study site conditional upon the synoptic

state of the atmosphere; that is, features of a given day’s large-scale atmospheric

state are used to estimate the likelihood of all possible precipitation amounts for that

day. My results show that CDEN-based probabilistic downscaling generally agrees

better with observations than uncorrected reanalysis-based precipitation estimates.

However, it gives significantly higher estimates for extreme events at low frequency

(e.g., 100 year return periods). This approach is demonstrated for St. John’s airport.

Comparison with nearby stations, Windsor Lake, suggests that these higher estimates

may be reasonable; however, further work and a longer observational record is needed

ii

Page 3: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

to determine whether these estimates represent added value to the raw observational

record or a shortcoming of our CDEN-based methodology.

iii

Page 4: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Contents

Abstract ii

List of Tables vi

List of Figures vii

1 Introduction 1

2 Data 7

2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.2 Atmospheric Reanalyses . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.3 Station data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

3 Methodology 10

3.1 Neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3.2 Multilayer perceptron (MLP) . . . . . . . . . . . . . . . . . . . . . . 11

3.3 The Conditional Density Estimation Network (CDEN) . . . . . . . . 13

3.4 Predictor selection method . . . . . . . . . . . . . . . . . . . . . . . . 15

3.5 Probability distributions for daily Precipitation . . . . . . . . . . . . 18

3.6 Cross-Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

iv

Page 5: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

4 Application of CDEN to Precipitation Downscaling 22

4.1 Predictor Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

4.2 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 25

4.3 Summary and Conclusion . . . . . . . . . . . . . . . . . . . . . . . . 33

v

Page 6: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

List of Tables

4.1 Cross-validated model performance statistics for CDEN, NCEP down-

scaling models with root mean square error (RMSE), correlation (r),

Hit-rate (HIT), False alarm ratio (FAR), True scale statistic (TSS),

Bias ratio (BIAS). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

vi

Page 7: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

List of Figures

3.1 Potential predictors from NCEP reanalysis at locations used in the

current study. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

3.2 Station data and randomly created Precipitation data via Bernoulli-

gamma distribution. . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

3.3 Cross Validation (CV) method. . . . . . . . . . . . . . . . . . . . . . 21

4.1 Red locations and blue locations represents selected predictor locations

and potential predictor locations respectively for precipitation at the

St. John’s airport station. . . . . . . . . . . . . . . . . . . . . . . . . 24

4.2 Compared total amount of seasonal precipitation between the station

data and the stochastic weather generator using CDEN Model. The

blue line represents the total amount of seasonal precipitation for the

station data and box plot represents the total amount of seasonal pre-

cipitation for the stochastic weather generator. . . . . . . . . . . . . . 27

4.3 Quantile-Quantile plot at the St. John’s international airport. . . . . 28

4.4 Compared between (a) the precipitation data for St. John’s airport

and (b) the stochastic weather generator using CDEN Model.The line

inside each box represents the median, boxes extend from the 25th to

75th percentiles, and outliers are shown as circles. . . . . . . . . . . . 32

vii

Page 8: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 1

Introduction

The response of precipitation to climate change has been the subject of numer-

ous studies, including annual and seasonal analyses on both global and local scales

(Houghton et al., 1996). Past studies suggest that mean global precipitation has

increased throughout the 20th century; however, trends show considerable spatial

variability (Houghton et al., 1996). For example, in northern Europe, positive trends

have been identified (Forland et al., 1996; Schonwiese et al., 1997), while in parts

of the south trends are often negative (Schonwiese et al., 1997). This variability is

partially explained by the complexity of precipitation responses to changing climate,

which may include adjustments in the frequency of events, the intensity of events,

or both (Trenberth et al., 2003; Allen & Ingram, 2002); furthermore, these responses

may vary across the seasons (Finnis et al., 2007).

Impacts of climate change on intense, but relatively rare, periods of precipitation

are especially of interest. These extreme precipitation events arise from complex in-

teractions between temperature, moisture, winds and orography; individual extreme

events typically influence small areas, making it difficult to predict the response of

extremes to climate change. The biggest obstacle in this area of research is a lack of

1

Page 9: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 1. Introduction 2

high quality data with which to constrain results. At the end of the last century, a

significant positive trend in the frequency of extreme rainfall (greater than 50mm per

day) has been identified in the US (Karl et al., 1995, 1998). In Australia, a signific-

ant increase in the 90th and 95th percentiles of daily precipitation was identified by

Suppiah & Hennessy (1995) and Suppiah & Hennessy (1998). Hennessy & Suppiah

(1999) and Plummer et al. (1999) further showed increases in the 99th percentile.

A study by Groisman et al. (1999) provides a useful synopsis of precipitation responses

to climate change, both on global and regional scales. Over the past century in the

US, Australia, and Norway, Groisman et al. (1999) found that total annual precip-

itation had increased, as had the mean intensity of individual precipitation events,

i.e., increasing intensity was contributing to an increase in precipitation amounts. In

order to describe these changes, they compared fitted probability distributions under

warmer and cooler climates; results suggested that parameters associated with the

spread of a distribution (e.g., scale parameters) were most affected by warming, while

those associated with the shape of a distribution were less affected. This translates

to an increase in the likelihood of high precipitation amounts. These changes are in-

dependent of total precipitation; however, some locations (e.g., Siberia) experienced

this increase in intensity without an associated increase in total precipitation, sug-

gesting a compensating decrease in the frequency of events. Generally, however, the

projected increases in total accumulated precipitation were found to be more robust

than changes in extreme (intense) precipitation (Groisman et al., 1999).

A further problem in predicting climate responses to climate change involves geo-

graphic scale. While the general circulation models (GCMs) typically used for climate

change analyses perform reasonably well over large spatial scales (e.g., 100-1000km),

they do not adequately represent the small (regional or local) scales of greatest in-

terest in climate adaptation planning. This issue compounds problems associated

Page 10: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 1. Introduction 3

with spatial variability of precipitation, as small-scale factors such as local topo-

graphy, coastline orientation, and slope aspect (among many others) can exert large

influences on local-scale precipitation, but cannot be captured by GCMs. The heav-

iest (most intense) precipitation events are also typically associated with processes

acting on spatial scales much smaller than GCM resolutions; consequently, heavy

precipitation is reduced as the impact of small-scale, intense events, such as convect-

ive storms, are spread over a very large GCM grid cell. Despite these limitations,

large-scale GCM output can be used to infer local-scale impacts, assuming that a

location’s precipitation is adequately connected to large-scale forcing. The process of

predicting local-scale information from large-scale data is referred to as ‘downscaling’,

and is accomplished through a wide range of techniques (e.g., Benestad et al., 2008).

Broadly, these fall into two categories:

i. Dynamic downscaling, in which higher resolution regional climate models are

nested within low resolution GCM output, and

ii. Statistical downscaling, which capitalize on long observational records to identify

cross-scale statistical relationships.

There is no single, universally accepted approach to precipitation downscaling, des-

pite a number of comparative studies (e.g., Wigley et al., 1990; Wilby et al., 1998)

and dedicated dynamical downscaling efforts (e.g., Christensen et al., 2007; Goodess,

2011; Gutowski et al., 2010). Commonly used statistical methods include transfer

functions (e.g., regression) (Wigley et al., 1990; Wilby et al., 1998) and weather typ-

ing schemes (e.g., Vrac et al., 2007). Perfect prognosis methods (or, model output

statistics) have been adapted for downscaling projections (Rummukainen, 1997), as

have artificial neural networks (ANNs).

In order for statistical downscaling to be effective, the GCMs being downscaled

Page 11: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 1. Introduction 4

must simulate required large-scale variables realistically. When a direct relationship

between large-scale variables is simulated by a model, and local-scale observations

exist, statistical downscaling essentially corrects for model errors (see, e.g., Lenderink

et al., 2007). Most downscaling methods are deterministic, that is, they provide a

single ‘best-guess’ of precipitation given large-scale conditions. However, the same

data used to identify and train downscaling models can be used to develop stochastic

weather generators. These are statistical models that generate synthetic time series

of the local scale variable while reproducing statistical properties of the observations

(e.g., Wilks, 1998; Vrac et al., 2007). Incorporating stochastic weather generators

into downscaling schemes has the potential to improve representation of extreme

events (which are often the primary concern in adaptation planning); deterministic

approaches frequently underestimate these extremes as they strive to predict the most

likely outcome, which is often closer to the long-term mean value (e.g., Fowler et al.,

2007).

Multivariate linear regression (MLR) forms the basis of the most commonly used

statistical downscaling approaches. However, the interest in nonlinear regression

methods, including ANNs, is increasing because of their potential to accommod-

ate complex, nonlinear and time-varying input-output mapping. Relative to MLR,

ANN-based regression is extremely flexible and capable of identifying nonlinear rela-

tionships between input (predictors) and output (predictands), given enough training

data.

If no nonlinear relationship exists, ANNs give similar results to MLR-based methods,

but with additional computational costs, and using a regression methodology that

is much more difficult to interpret (Schoof & Pryor, 2001). However, when applied

to precipitation, ANNs are often able to account for heavy rainfall events missed by

linear regression approaches (Weichert & Burger, 1998). Cannon & McKendry (2002)

Page 12: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 1. Introduction 5

found that an ensemble ANN downscaling model was capable of predicting changes

in stream flows using only large-scale atmospheric conditions as model input. Other

nonlinear downscaling methods have been used, to varying effect. These include (but

are not limited to) dynamic neural networks, which accommodate time-varying input-

output relationships (Gautam & Holz, 2000), and Gaussian Processes (e.g., Cai et al.,

2006). A common thread in these analyses is that nonlinear approaches typically of-

fer better performance relative to MLR, at the cost of increased complexity. This is

particularly true in locations where environmental processes are complex and highly

nonlinear (Hsieh, 2009).

The multilayer perceptron (MLP) is the most common ANN used in statistical down-

scaling, and essentially acts as a nonlinear alternative to MLR. MLP can deal with

unspecified interactions between multiple covariates to provide a deterministic predic-

tion. An alternative approach uses a conditional density estimation network (CDEN),

which is a probabilistic extension of the MLP (Gardner & Dorling, 1998; Hsieh &

Tang, 1998; Dawson & Wilby, 2001). CDENs have been successfully applied to vari-

ous prediction problems, including air pollutant concentration estimations (Dorling

et al., 2003) and hydroclimatology (Haylock et al., 2006; Cannon, 2010). Rather

than predicting the most likely outcome, a CDEN predicts probability distribution

parameters, conditional upon the state of predictor variables. As such, it provides

a more detailed prediction, allowing the likelihood of all outcomes to be quantified.

When used in precipitation downscaling, its output can be considered as probabilistic

downscaling.

The current study applies the CDEN method to precipitation in St. John’s, New-

foundland. The purpose is to assess the method’s suitability for climate projection

analyses in the province of Newfoundland and Labrador. Here, a CDEN-based model

is used to produce a conditional precipitation distribution for a modern study period.

Page 13: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 1. Introduction 6

Drawing randomly from these distributions produces an extended precipitation time

series. This can be considered a conditional stochastic weather generator. The skill

of the CDEN is assessed against available observations, and the approach is then

used to estimate the magnitude of extreme events at our study location. The meth-

ods potential value in climate projection is assessed on the basis of this analysis.

Subsequent chapters describe this method in detail. In the current study, extreme

value statistics are based on annual maxima (a ‘block maxima’ approach, rather than

a peaks-over-threshold approach). This was chosen to better emulate the methods

used by Environment Canada for their official intensity/duration/frequency (IDF)

analyses.

Page 14: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 2

Data

2.1 Overview

All analyses are based on freely available data sets from Environment Canada and

the U.S. National Centers for Environmental Prediction (NCEP). Predictor variables

(i.e., inputs to downscaling algorithms) were taken from the NCEP/NCAR Reanalysis

(Kalnay et al., 1996), while predictand (output) variables are from the Adjusted and

Homogenized Canadian Climate Data set (AHCCD; Vincent, 1998).

2.2 Atmospheric Reanalyses

The need to monitor climate and weather arises for a variety of reasons, from short-

term planning and hazard preparedness to long-term infrastructure planning (e.g.,

using intensity-duration-frequency curves for precipitation). Climate observing sta-

tions are our primary tool in this effort, providing detailed descriptions of climate

at fixed locations. However, maintaining these observing sites is expensive, and cov-

erage is largely limited to populated land areas. Satellite observations can help fill

7

Page 15: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 2. Data 8

gaps in the surface observing network, but provide less detailed coverage (varying

spatial/temporal coverage), and often provide only indirect measurement of many

variables of interest (Climate Change Science Program, 2008).

Reanalyses were originally proposed as a means of addressing the shortcomings of

climate station and satellite-based observations. This approach was inspired by the

use of atmospheric analyses in meteorology, i.e., the merging of current observations

and prior model forecasts into a best-guess of the atmospheric state at a given point

in time. In operational meteorology, analyses provide a starting point for numerical

weather prediction (NWP) models. The approach used to generate analyses evolves

continually as data assimilation and modelling techniques improve, complicating ef-

forts to interpret long-term climate using incompatible analyses. Reanalyses address

this consistency issue, creating a continuous record using a single NWP model and

data assimilation methodology. The result is a gridded data set of all “analysed” vari-

ables with global coverage over the reanalysis period (usually ∼1950s to present). In

essence, reanalyses estimate data at locations away from climate stations or satellite

observations using a combination of all available observations at other locations and

model output; effectively, it provides an intelligent interpolation informed by NWP

physics. Variables that are not directly analysed can be provided by subsequent iter-

ations of the NWP model (referred to as reanalysis forecast variables) (Kalnay, 2003).

Reanalysis data has proven consistent with the trends produced from the other ob-

servational datasets with some regional exceptions, especially since satellite measure-

ments originated in the late 1970s (Climate Change Science Program, 2008). Unfor-

tunately, reanalysis treatments of precipitation are often of poor quality relative to

other variables (e.g., temperature or sea level pressure). The primary factors that are

responsible for this inconsistency in the trends are limitations in the models and the

methods used to integrate the datasets (Bromwich & Cullather, 1998). Nevertheless,

Page 16: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 2. Data 9

reanalysis data helps to identify and explore atmospheric features associated with the

weather and climate variability. Reanalysis can assist in these efforts by providing

large-scale context for events such as droughts or floods, allowing us to move beyond

studies at fixed locations.

2.3 Station data

Station data used in the current study were taken from the Adjusted and Homogenized

Canadian Climate Data archive (Vincent, 1998; Vincent & Gullett, 1999; Vincent et

al., 2002; Mekis & Vincent, 2011). This was used in place of raw (unadjusted) station

data, as the influence of factors such as station relocation, changes in observational

practices, and instrument replacement have been removed in the AHCCD. We have

used daily precipitation values; total precipitation accumulated in a 24 hour period

(mm). It is calculated as the addition of the station’s adjusted daily rain gauge

and snow observations (converted to water equivalent). AHCCD development is a

continuous process, and adjusted precipitation is updated every year, making it easy

to update or extend our analysis with new data as it becomes available.

Page 17: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3

Methodology

3.1 Neural networks

Inspired by the mechanics of biological learning, artificial neural networks (ANNs) are

a family of computational algorithms commonly used for classification and regression

problems. Capable of identifying nonlinear relationships between various paramet-

ers, they are well suited to statistical modelling and prediction in the environmental

sciences. Although building neural networks (“training”) can be an involved process,

and resulting models are often very complex, many easy-to-use software packages are

now available that simplify the process for users with limited formal statistical train-

ing.

In the most common approach to training, a neural network is iteratively shown

paired inputs (predictors) and desired outputs (predictands). This is referred to as

supervised training, as the network is guided towards a known outcome (i.e., the

predictand); it is distinct from unsupervised training, which excludes information

regarding outputs (i.e., no prior information regarding outcomes). In the course of

supervised training, a network adapts to minimize differences between its predictions

10

Page 18: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 11

and supplied outputs. Essentially, the network assumes a relationship exists between

supplied inputs and outputs, and attempts to capture it. Supervised NNs are com-

monly used as nonlinear analogs of traditional multivariate linear regression.

The goal of training is to optimize the NN’s parameters, which are a series of weights

(w) and biases (b). The number of parameters and structure of the final network are

set by a user prior to training; a description of how weights and biases are structured

is provided in Section 3.2.

3.2 Multilayer perceptron (MLP)

Each iteration of the training process adjusts the weights by a small amount. Suppose

the training process is in its m th iteration; when shown the (m + 1) th set of paired

predictors and predictands, the weights may be updated in the following way:

wm+1i j = wm

i j + ∆wmi j, (3.1)

where ∆wmi j is the updated weights. The multilayer perceptron (MLP) is a common

neural network model, belonging to the class of supervised networks. Using historical

data, the MLP network is capable of effectively mapping output data to input data

(i.e., they may be used for classification or regression). The MLP consists of layers of

‘nodes’; the number of layers and number of nodes per layer are set by the user prior

to training. All MLPs include an input layer (which accepts predictors as input), an

output layer (producing desired predictands), and an arbitrary number of ‘hidden’

layers, so named because the value of nodes in these layers is not typically examined

or relevant to the user.

Mathematically, a trained MLP can be described as follows. Let

Page 19: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 12

F : X → H

be a simple mathematical model which represents the first layer of MLP mappings,

where X represents the predictor set (inputs), H represents the first layer of MLP

nodes, and F represents the activation function or transfer function connecting the

nodes input layer to the hidden layer. The transfer function F can be linear or nonlin-

ear. The hyperbolic tangent function and unipolar sigmoid functions are commonly

used nonlinear transfer functions. Let xi (i = 1, 2, ....., I) ϵX be a given set (i.e.,

input set) that will produce a value for hj ϵH determined by set weights (wj,i) and

biases (bj) connecting it to input values; here, j gives the number of nodes in this

first hidden layer.

The value of the jth node (hj), is then calculated as follows:

hj = F

(∑i

xi wj,i + bj

)(3.2)

That is, hj is a function of all inputs xi. For an MLP with only a single hidden layer,

the values of H then determine the values of the output layer O:

g : H → O.

Suppose ok ϵO and let gk be the transfer function, which is often chosen to constrain

output within a specified (e.g., physically meaningful) range. Now, hj is the output

from the first input layer, which is multiplied by the interconnected weights wk,j and

bk is the bias for the output layer. Then

ok = gk

(∑j

hj wk,j + bk

)(3.3)

Again, the outputs are a function of all nodes in the hidden layer. For MLPs with

multiple hidden layers, nodes in subsequent layers are determined by the value of

Page 20: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 13

nodes in the previous layer. If the multilayer perceptron is too complex (i.e., with

many hidden layers), this may lead to over-fitting by incorporating the noise, and not

just the desired signal, into predictions. When compared to training data, over-fitted

MLP output will show extremely high agreement with actual outcomes; however, they

will show poor performance when used to predict novel data points not used in the

training period. For this reason, MLP performance must always be evaluated using

‘testing’ data, independent of training data (Hsieh, 2009).

3.3 The Conditional Density Estimation Network

(CDEN)

The CDEN is a probabilistic extension of the classical MLP. Rather than return-

ing the single, deterministic value provided by the standard MLP, a CDEN returns

probability distribution parameters, conditional upon input variables. Training of the

network proceeds in a similar fashion to the traditional MLP, with some adjustment

to the standard cost functions. The current study uses the algorithm described in

Cannon (2008). The goal is to identify the NN’s parameters such that the result-

ing conditional distribution maximizes the probability of the training dataset, rather

than minimizing the error of NN predictions. The CDEN assesses the likelihood of

actual outcomes against a predicted probability density function; as the probability

of the true outcomes increases, the cost function is optimized. A description of this

optimization approach follows.

Suppose x is a continuous sample of data with known probability distribution, f(x, θi),

where θi, i = 1, 2, 3, ...., n are the unknown constant NN weights and biases (hereafter

referred to as NN’s parameters, distinct from statistical distribution parameters) to

Page 21: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 14

be estimated from the sampling data. The likelihood function L is a map

L : Θ → R

which is given by

L(x | θi) = f(x |θi); θiϵΘi, i = 1, 2, 3, ..., n

where Θ is the parameter space. If x1, x2, ..., xn are independent observations then

the likelihood function can be written as

L = L(x1, x2, ...., xn|θ1, θ2, ....., θn) = Πni=1 f(xi|θ1, θ2, ....., θn). (3.4)

Taking the natural logarithm of both sides, then

InL =n∑

i=1

In f(xi|θ1, θ2, ....., θn). (3.5)

Now, the maximum likelihood estimates of θ1, θ2, ....., θn are obtained by maximizing

L or InL. By using the definition of critical point, we can write

d InL

dθj= 0; j = 1, 2, ...., n (3.6)

The solution of unknown parameters θj (j = 1, 2, 3, ........, n) is found by solving the

above system of equations. When using the likelihood function, f is set to a known

probability distribution. In the current study, the Bernoulli-gamma distribution has

been used, with three distribution parameters.

L = Πni=1 p(yn | pn, αn, βn) (3.7)

where

p(yn |xn) = p(yn | pn, αn, βn)

Page 22: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 15

and pn = f(xn), αn = f(xn), βn = f(xn) are CDEN outputs; that is, probability

distribution parameters vary with the input vectors. Minimizing the negative log of

the likelihood function is equivalent to maximizing the likelihood function. Hence,

the objective function is

J = −n∑

n=1

In p(yn|w, xn)

Since xn and yn are known from the given data, the unknown w is optimized to

minimize J .

3.4 Predictor selection method

When building statistical models, it is very important to select a set of appropriate

input feature variables. The objective is to find the smallest set of features that en-

sures satisfactory predictive performance. With sufficient time and computing power,

this can be accomplished by training multiple prediction/downscaling models using

every possible combination of predictors, then using the single model that gives the

best performance. However, this is not feasible if

i. very large numbers of potentially useful predictors are available, and/or

ii. training takes significant computing resources.

In these situations, an alternate (and faster) prediction method may be used to test

all predictors first. Decision trees are often used for this purpose. These are com-

putationally efficient tools that can be trained quickly, but have limited predictive

power relative to systems such as neural networks; for example, decision trees cannot

extrapolate outside the range of output data supplied during training (SAS Institute

Page 23: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 16

Inc., 2011). The process partitions the entire output data set into multiple disjointed

subsets, and then fits single responses into each individual region of the tree. The

tree is built by finding the input variable that best splits the output data into two

groups, and then, after separating the data, this process is applied to each sub-group

separately. The process is continued until the subgroups either reach a minimum

size or until no improvement can be made. The earlier and/or more often a given

input variable is selected as the basis for division, the more likely it is to be a useful

predictor. Here, Recursive Partitioning and Regression Trees (RPART) have been

used to screen candidate predictors. These included functions of day of year, sea level

pressure (psl850), relative humidity (rh850), specific humidity (sp850), temperature

(t850) for 850mb, vertical velocity (v500), and geopotential height (z500) fields for

500mb from NCEP reanalysis for the fifteen nearest points to St. John’s. Loca-

tions are shown in Figure 3.1. The nearest and furthest point is 19km and 476.59km

respectively from St. John’s airport.

Page 24: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 17

Latitude

-58 -56 -54 -52 -50 -48 -46

4446

4850

5254

56Potential Predictor Locations

Potential Predictor Locations

Longitude

Figure 3.1: Potential predictors from NCEP reanalysis at locations used in the current

study.

Page 25: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 18

3.5 Probability distributions for daily Precipita-

tion

In order to apply the CDEN methodology to precipitation downscaling, a suitable

probability distribution for precipitation at the target location must be selected.

A common choice for daily precipitation totals is the mixed Bernoulli-gamma dis-

tribution, in which a probability of zero precipitation is combined with a gamma

distribution for non-zero precipitation probabilities (e.g., Williams, 1998; Haylock et

al., 2006; Cawley et al., 2007; Cannon, 2008). The current study also adopts the

Bernoulli-gamma distribution for St. John’s, having confirmed the distribution is

suitable for this region. Figure 3.2 compares a histogram of St. John’s daily precipit-

ation to a histogram of randomly sampled data from a Bernoulli-gamma distribution

fitted to this same St. John’s data.

The Bernoulli-gamma distribution has three specified parameters; a probability of

precipitation occurring (p), a shape parameter (α), and a scale parameter (β). Its

probability density function is

f(y; p, α, β) =p ( y

β)α−1 exp(−y

β)

βΓα, if y > 0

and

f(y; p, α, β) = 1− p, if y = 0

where y denotes the amount of precipitation. The mean and variance of the distri-

bution are given respectively as

µ = pα

β

and

s = pα

(1 + (1− p)α

β2

).

Page 26: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 19

Station Precipitation data

Amount of Precipitation(mm)

Frequency

0 20 40 60 80 120

02000

4000

6000

8000

10000

Randomly created Precipitation data

Amount of Precipitation(mm)

Frequency

0 20 40 60 80 120

02000

4000

6000

8000

10000

Figure 3.2: Station data and randomly created Precipitation data via Bernoulli-

gamma distribution.

The current study uses a probabilistic downscaling approach, in which the paramet-

ers of a Bernoulli-gamma distribution are conditional upon daily large-scale forcing;

that is, all three Bernoulli-gamma parameters are allowed to vary in response to daily

conditions, reflecting changes in the likelihood that a given amount of precipitation

will occur. For example, a properly trained CDEN should increase the likelihood of

zero precipitation during synoptic conditions that promote clear skies, and increase

the likelihood of heavy precipitation when conditions are extremely warm, humid,

Page 27: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 20

and highly unstable. Daily CDEN-predicted distributions can then be used as a con-

ditional weather generator, informed by daily large-scale variability; the distribution

can be randomly sampled multiple times to identify a reasonable precipitation range

for that day. This approach provides a greater depth of precipitation information

than single-value deterministic downscaling, and better reflects daily variability than

a weather generator based on a static daily precipitation distribution. So, the extreme

events are identified as the maximum precipitation of each year. These extreme events

are extracted from precipitation data and stochastic weather generator sample data.

3.6 Cross-Validation

When using any powerful empirical prediction technique, we must be careful to avoid

over-fitting. This can be accomplished by excluding some data from training, al-

lowing testing of the trained prediction algorithm against independent data. This is

convenient if very long time series are available, but is difficult if only limited data is

available. In this situation, cross-validation is often used to produce longer data sets

for evaluation (e.g., Hsieh, 2009). Figure 3.3 illustrates how cross validation works.

The process splits the data into a pre-selected number of segments. The data has

been distributed equally among the segments; one segment is called the test data and

the remaining segments are called the training data. An estimation of each segment

is then produced by a prediction model trained with the remaining segments. In this

way, the full time series of data is estimated by a series of independent models.

Page 28: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 3. Methodology 21

u u uu uu u| |Test

| |Train

u u u u u u uu u u u u u u| |Test

| |Train

| |Train

u u u u u u u| |Test

| |Train

| |Train

...

u u u u u u u| |Test

| |Train

Figure 3.3: Cross Validation (CV) method.

Page 29: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4

Application of Conditional Density

Estimation Networks to

Precipitation Downscaling in

Newfoundland

4.1 Predictor Selection

The RPART predictor selection method used here is described in Chapter 3; here,

it has been used to select between the following daily mean fields obtained from

the NCEP reanalysis: sea level pressure (psl850), relative humidity (rh850), spe-

cific humidity (sp850), and temperatures (t850) recorded at the 850mb pressure level

(∼ 1km above sea level), and vertical velocity (v500) and geopotential height (z500)

at the 500mb pressure level (∼ 5.5km above sea level). Data from the fifteen NCEP

grid points nearest the St. John’s airport were included; these all lie within 450km

of the airport. As the expected daily precipitation varies with the annual seasonal

22

Page 30: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 23

cycle, sine/cosine representations of the seasonal cycle (as a function of day of year)

were added as additional candidate predictors. The following additional predict-

ors (variable and location given) were identified as relevant in an RPART analysis,

and including in CDEN training: temperature at (−47.5◦, 47.5◦), sea level pressure

at (−55◦, 45◦) and (−55◦, 50◦), vertical velocity at five locations (−52.5◦, 47.5◦),

(−50◦, 47.5◦), (−55◦, 47.5◦), (−55◦, 45◦) and (−57.5◦, 45◦), and relative humidity

at three different locations (−52.5◦, 47.5◦), (−55◦, 47.5◦), and (−55◦, 45◦). In Fig-

ure 4.1, the blue points demonstrate the locations of the potential predictor paramet-

ers and the red points demonstrate the locations of the effective predictor parameters.

Page 31: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 24

Latitude

-58 -56 -54 -52 -50 -48 -46

4446

4850

5254

56

Potential Predictor Locations

Potential Predictor LocationsEffective Predictor Locations

Longitude

Figure 4.1: Red locations and blue locations represents selected predictor locations

and potential predictor locations respectively for precipitation at the St. John’s

airport station.

Page 32: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 25

4.2 Results and Discussion

The thirteen candidate predictors were applied in the CDEN training process. Daily

data covering January 1st, 1976 through December 31st, 2011 were used. A six-fold

cross-validation approach was adopted, in which the full 1976-2011 daily time series

was separated into six equal segments. Daily precipitation for each segment was

then simulated using a CDEN trained with the remaining five segments, ensuring

the simulations were based on independent data. These simulations consist of daily

Bernoulli-gamma probability distribution parameters, conditional upon the state of

the thirteen predictor values: scale and shape parameters, along with the conditional

probability of non-zero precipitation for the day.

In order to evaluate the performance of the CDEN, a ‘best-guess’ time series of daily

precipitation was evaluated by comparing

i. the median of the daily CDEN-generated Bernoulli-gamma distribution to

ii. observed daily precipitation totals.

Table 4.1 summarizes results as a series of relevant skill scores (Wilks, 2006). As a

comparison point, the same scores are presented for predictions for St. John’s from

the raw NCEP-NCAR reanalysis daily precipitation field. The reanalysis values give

a benchmark for performance at our study location.

Table 4.1 shows the deterministic and categorical performance of the CDEN-derived

‘best-guess’ time series for St. John’s; this represents a deterministic prediction by the

trained CDEN model. Results show that the root-mean square error (RMSE) of the

deterministic CDEN prediction is not significantly different from NCEP, consistent

with results generated in other studies using a coupled ANN-analog downscaling

model (Cannon, 2008). The correlation coefficient for both methods is statistically

Page 33: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 26

Statistic CDEN NCEP

RMSE(mm) 7.57 7.97

r 0.62 .62

HIT 0.90 0.63

FAR 0.13 0.23

TSS 0.77 0.40

BIAS 1.03 0.82

Table 4.1: Cross-validated model performance statistics for CDEN, NCEP downscal-

ing models with root mean square error (RMSE), correlation (r), Hit-rate (HIT),

False alarm ratio (FAR), True scale statistic (TSS), Bias ratio (BIAS).

significant, but there is no difference between them. The remaining skill measures in

Table 4.1 are based on categorical predictions; specifically, whether precipitation does

or does not occur. Here, precipitation was considered to have occurred if a prediction

(from the deterministic application of the CDEN or NCEP) gave a daily precipitation

of 0.5mm or more. Four skill measures are considered:

i. the hit rate (HIT) gives the fraction of correctly forecasted events, with HIT =

1 being a perfect score;

ii. the false alarm ratio (FAR) gives the fraction of incorrect ‘yes’ forecasts (i.e.,

predicting an event that does not occur), with a perfect score of FAR = 0;

iii. the true skill statistic (TSS) is the difference between HIT and FAR, with 1 being

a perfect score; and

iv. BIAS (B), calculated as the ratio between the number of times an event is pre-

dicted relative to the number of times an event occurs, with B = 1 being the

Page 34: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 27

ideal score (Wilks, 2006).

200

400

600

800

1000

DJF seasonal total precipitation

Time

Tota

l am

ount

of s

easo

nal p

reci

pita

tion

(mm

)

1976 1980 1984 1988 1992 1996 2000 2004 2008

Weather generators data Station data

(a)

200

400

600

800

1000

MAM seasonal total precipitation

TimeTo

tal a

mou

nt o

f sea

sona

l pre

cipi

tatio

n (m

m)

1976 1980 1984 1988 1992 1996 2000 2004 2008

Weather generators data Station data

(b)

200

400

600

800

1000

JJA seasonal total precipitation

Time

Tota

l am

ount

of s

easo

nal p

reci

pita

tion

(mm

)

1976 1980 1984 1988 1992 1996 2000 2004 2008

Weather generators data Station data

(c)

200

400

600

800

1000

SON seasonal total precipitation

Time

Tota

l am

ount

of s

easo

nal p

reci

pita

tion

(mm

)

1976 1980 1984 1988 1992 1996 2000 2004 2008

Weather generators data Station data

(d)

Figure 4.2: Compared total amount of seasonal precipitation between the station data

and the stochastic weather generator using CDEN Model. The blue line represents

the total amount of seasonal precipitation for the station data and box plot represents

the total amount of seasonal precipitation for the stochastic weather generator.

Page 35: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 28

Here, the CDEN predictions give significantly higher skills than NCEP predictions,

with higher HIT, lower FAR, and better TSS. The CDEN model is a little over-biased,

while NCEP is significantly under-biased (predicting too few events). Thus, we have

found the CDEN model to be more effective than the NCEP in predicting whether

precipitation will or will not occur, even if RMSE and correlation does not show signi-

ficant differences. Figure 4.2 compares total seasonal precipitation from observations

(blue line) to the 30-member ensemble generated with stochastic generator. Results

show reasonable agreement, with seasonal totals usually remaining within the 95%

confidence interval predicted by the model. There are several instances when obser-

vations and the model diverge significantly (e.g., Winter, 1998), although these are

relatively rare. Along with results in Table 4.1, this plot emphasizes that the CDEN

provides a reasonable, though imperfect, fit with observations.

In an attempt to further evaluate the strength of the CDEN approach to simulate

Figure 4.3: Quantile-Quantile plot at the St. John’s international airport.

Page 36: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 29

realistic precipitation fields, a quantile-quantile plot was produced (Figure 4.3). Here,

the probabilistic potential of the CDEN approach is explored. Instead of drawing a

single ‘best-guess’ from the daily CDEN-predicted precipitation distributions, 30 ran-

dom samples were taken, providing a greater sense of each day’s likely precipitation.

The result is a CDEN-derived stochastic weather generator sample thirty times the

size of the original observed time series. If the CDEN is well trained and works well,

this stochastic sample should give a better sense of extreme precipitation events than

a short observation time series. If the observation period is long enough, the two

should give similar results.

Figure 4.3 compares quantiles of observed St. John’s precipitation over the train-

ing period (horizontal axis) to quantiles from the CDEN sample (vertical axis). A

reference line in this plot shows what perfect agreement would look like. From Fig-

ure 4.3, it is clear that there is a strong relationship between the observed test data

and the simulated precipitation data for events lower than 70mm. This is a signific-

ant amount of precipitation, exceeded roughly once every 1.33 years in the observed

data. Above 70mm, the quantile-quantile plot suggests the CDEN overpredicts pre-

cipitation amounts, with several large outliers. These outliers, or extreme events, are

important events for climatology, and are used as design criteria in engineering in-

frastructure. Results suggest that either the CDEN stochastic sample overestimates

these large events, or that the observational data underestimates them. Figure 4.4(a)

shows the daily precipitation data of our observation site. In these figure, the x-axis

represents the precipitation interval and the y-axis represents the amount of precipit-

ation; that is, we have split observed precipitation into discrete categories (‘intervals’;

x-axis), then show the range of precipitation in that interval as a box and whisker

plot (y-axis). This figure is presented to highlight aspects of extreme precipitation

at this location; notably, the observational record shows that daily precipitation in

Page 37: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 30

excess of 100mm occurs only four times in the period examined. These extreme

outliers may have a strong impact on the training of the CDEN model, in either a

negative or positive manner. Ideally, these rare events would provide information on

conditions that promote extreme precipitation. However, with so few large events,

the CDEN may not be able to accurately identify predictor/extreme precipitation

relationships, leaving the CDEN unable to simulate these events realistically. This

may be the case here, as the CDEN developed for this paper appears to over-estimate

the magnitude of extreme precipitation relative to observations (Figure 4.3). Fig-

ure 4.4(b) shows the same information as Figure 4.4(a), but this time for a synthetic

time series (CDEN output). Comparing these gives more insight into the CDEN’s

extreme overestimates. The larger stochastic sampling is thirty times the size of the

original time series, allowing a closer examination of the CDEN’s interpretation of

rare events. Results again suggest the CDEN output overestimates the magnitude

of large precipitation events. Comparing results in the high precipitation categories

(>100mm) in Figure 4.4(a) and Figure 4.4(b), the CDEN tends to adopt much higher

values, even reaching 216.7mm in the most extreme case. However, the CDEN ac-

curately captures the relative frequency of extreme events, producing 131 events with

100mm or higher. Given that the stochastic CDEN sample is thirty times the size

of the observation time series, this is a close match to observations (four events in

36 years, compared to 131 in 1080 simulated years). Respectively, these give a mean

frequency of 0.11 and 0.12 extreme events per year, or return periods of 9 and 8.2

years.

The apparent tendency for the CDEN to ‘overforecast’ the magnitude of extreme

events raises concerns that this statistical model is unsuitable for extreme precipit-

ation analysis in our study region. Alternatively, results could be interpreted as an

indication that very high precipitation events are more likely to occur here than the

Page 38: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 31

raw observational record would suggest.

A recent extreme precipitation analysis conducted using a second data source suggests

this may be the case, and that much larger precipitation events are possible than have

been recorded at St. John’s airport. Specifically, a rain gauge located 1.6km from St.

John’s airport (Windsor Lake) recorded three events that are greater than 100mm

since 1998; respectively, these were 104.9mm, 149.6mm, and 180mm (CBCL Ltd.,

2014). This later amount far exceeds the maximum (144.8mm) recorded at the air-

port, and suggests the CDEN’s larger estimates are warranted (e.g., reasonably likely

to occur). It is important to note that although the CDEN does simulate a handful of

events larger than the maximum 180mm Windsor Lake measurement, they are very

rare: in our 1080-year simulation, only 8 were recorded, giving a return period of 135

years.

Page 39: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 32

(a)

(b)

Figure 4.4: Compared between (a) the precipitation data for St. John’s airport and

(b) the stochastic weather generator using CDEN Model.The line inside each box

represents the median, boxes extend from the 25th to 75th percentiles, and outliers

are shown as circles.

Page 40: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 33

4.3 Summary and Conclusion

Results indicate that the CDEN methodology, as applied in the current study, can

offer some useful insight into local precipitation distributions and extreme events.

However, they also suggest the method should be used with caution. As a precipit-

ation downscaling method, the CDEN appears to work reasonably well, although it

retains errors in magnitude comparable to NCEP/NCAR reanalysis estimates. The

CDEN does, however, strongly outperform the reanalysis when predicting whether

precipitation will or will not occur, with much better hit rates, bias, and false alarm

rates. CDEN skill shown here is comparable to skill reported in similar studies of

other locations (e.g., British Columbia; Cannon, 2008).

Although the reasonable performance as a deterministic downscaling technique con-

firms that our CDEN application is capable of capturing relationships between large-

scale forcing and local scale precipitation (Table 4.1), the method is potentially more

valuable as a tool for exploring the broader precipitation probability distribution and

extreme precipitation events. To explore this potential, the CDEN was used as a

conditional weather generator, generating a large synthetic precipitation data set by

randomly sampling CDEN-based daily conditional precipitation distributions. In this

case, each day in our 36-year study period was sampled 30 times, giving 1080 equival-

ent years of synthetic data. A comparison between the 36-station record suggests that

the CDEN overestimates extreme precipitation values (Figure 4.3, 4.4). However,

our ability to evaluate the CDEN is limited by the available observational record;

with only 36 years examined, it is possible that this record is under-representing ex-

treme events. This is supported by other nearby precipitation records, which include

events much larger than those observed at St. John’s airport over the same period

of record (e.g., Windsor Lake, with a maximum of 180mm in 24 hrs). Keeping this

Page 41: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Chapter 4. Application of CDEN to Precipitation Downscaling 34

in mind, CDEN results suggest that design criteria like the 100-year return period

should be increased relative to values based on observations alone. At present, how-

ever, it would be premature to base design criteria on CDEN output alone.

The novelty of the work is mostly in our application in a new geographic context,

and reframing of the CDEN framework as ‘probabilistic’ downscaling. Computing

challenges included

i. predictor screening/selection for St. John’s,

ii. adaptation of the CDEN algorithm to the current context (including scripting

and testing a new cost function in an existing CDEN package in R),

iii. application of a cross-validation methodology.

Further work on predictor selection and analyses based on longer data sets may be

able to improve the skill of the CDEN predictions; this work is necessary for this

technique to be truly useful for operational decision making. If improved CDEN skills

are achieved, this tool may prove useful in projecting the impact of climate change

on extreme precipitation in Newfoundland. By then applying a CDEN trained with

observations and reanalyses to 20th and 21st general circulation model output, it

would be possible to explore shifts in precipitation distributions in detail, even when

only relatively short model runs (≤ 100 years) are available. This is being considered

for future work.

Page 42: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

Bibliography

2008 Reanalysis and Attribution: Understanding How and Why Recent Climate

Change Has Varied and Changed : Findings and Summary of the U.s. Climate

Change Science Program Synthesis and Assessment Product 1.3. Climate Change

Science Program.

Allen, M. R. & Ingram, W. J. 2002 Constraints on future changes in climate

and the hydrological cycle. Nature, 419 224-232.

Benestad, R. E., Hanssen-Bauer, I. & Chen, D. 2008 Empirical-statistical

downscaling. World Scientific Publishing Company.

Cai, S., William W., H. & Alex J., C. 2006 A comparison of bayesian and con-

ditional density models in probabilistic ozone forecasting. Hydrological Processes,

16, 1137-1150.

Barrow, E., Maxwell, B., Gachon, P. & Canada, C. E. 2004 Cli-

mate Variability and Change in Canada: Past, Present and Future. Environment

Canada.

Beniston, M., Stephenson, D. B., Christensen, O. B., Ferro, C. A. T.,

Frei, C., Goyette, S., Halsnaes, K., Holt, T., Jylh, K. & Koffi, B.

35

Page 43: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

BIBLIOGRAPHY 36

2007 Future extreme events in european climate: an exploration of regional climate

model projections. Climatic Change, 81 (S1), 71-95.

Bromwich, D. H. & Cullather, R. I. 1998 The atmospheric hydrologic cycle

over the Arctic basin from ECMWF re-analyses. Proc. of the WCRP First Int.

Conf. on Reanalyses Silver Spring, MD, WCRP, in press.

Cannon, A. J. & McKendry, I. 2002 A graphical sensitivity analysis for statist-

ical climate models: Application to indian monsoon rainfall prediction by artificial

neural networks and multiple linear regression models. International Journal of

Climatology, 22, 1687-1708.

Cannon, A. J. 2008 Probabilistic multisite precipitation downscaling by an expan-

ded bernoulli-gamma density network. Journal of Hydrometeorology, 9, 1284-1300.

Cannon, A. J. 2010 A flexible nonlinear modelling framework for non-stationary

generalized extreme value analysis in hydro-climatology. Hydrological Processes, 24,

673-685.

Cawley, G. C., Janacek, G. J., Haylock, M. R. & Dorling, S. R.

2007 Predictive uncertainty in environmental modelling. Neural Networks, 20 (4),

537–549.

CBCL Ltd. 2014 Rennies River Catchment Stormwater Management Plan. CBCL

Project No. 123097 .

Chen, D. & Hellstrom, C. 1999 The influence of the north atlantic oscillation

on the regional temperature variability in sweden: spatial and temporal variations.

Tellus, 51A, 505-516.

Page 44: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

BIBLIOGRAPHY 37

Christopher, J. W., Sanabria, A, Corney, S, Grose, M, Holz, M,

Bennett, J, Cechet, R & Bindoff, N.L. 2010 Modelling Extreme Events in

a Changing Climate using Regional Dynamically- Downscaled Climate Projections.

IEMSs 2010 International Congress on Environmental Modelling and Software,

Ottawa.

Dawson, C. & Wilby, R. 2001 Hydrological modelling using artificial neural

networks. Progress in Physical Geography, 25, 80-108.

Christensen, J. H., Carter, T. R., Rummukainen, M. & Amanatidis,

G. 2007 Evaluating the performance and utility of regional climate models: The

PRUDENCE project. Climate Change, 81, 1-6.

Dickinson, R.E., Errico, R.M., Giorgi, F.& Bates, G.T. 1989 A regional

climate model for western United States. Climatic Change, 15, 383-422.

Dorling, S., Foxall, R., Mandic, D. & Cawley, G. 2003 Maximum

likelihood cost functions for neural network models of air quality data. Atmospheric

Environment, 37(24), 3435-3443.

Finnis, J., Holland, M. M., Serreze, M. C. & Cassano, J. J. 2007 Response

of Northern Hemisphere extratropical cyclone activity and associated precipita-

tion to climate change, as represented by the Community Climate System Model.

Journal Geophysics Ressources, 112 G04S42, doi:10.1029/2006JG000286.

Folland, C. & Karl, T. 2001 Observed climate variability and change. in:

Houghton, j.t. et al., editors. climate change 2001: the scientific basis. Cambridge

University Pres, , 99-181.

Page 45: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

BIBLIOGRAPHY 38

Forland, E.J., Engelen , A., Ashcroft, J., Dahlstrom, B., Demaree,

G., Frich, P., Hanssen-Bauer, I., Heino, R., Jonsson, T., Mietus,

M., Muller-Westermeier, G., Palsdottir, T., Tuomenvirta, H. &

Vedin, H. 1996 Change in normal precipitation in the north atlantic region (2nd

edn). DNMI Report 7-96 Klima.

Fowler, H. J., Blenkinsop, S.,& Tebaldi, C. 2007 Linking climate change

modelling to impact studies: Recent advances in downscaling techniques for hydro-

logical modelling. International Journal Climatology, 27, 1547-1578.

Frei, C., Schar, C., Luthi, D. & Davies, H. C. 1998 Heavy precipitation

processes in a warmer climate. grl, 25, 1431-1434.

Frei, C., Schll, R., Fukutome, S., Schmidli, J. & Vidale, P. L. 2006

Future change of precipitation extremes in Europe: Inter-comparison of scenarios

from regional climate models. Journal of Geophysical Research, 111.

Gardner, M. W. & Dorling, S. 1998 Artificial neural networks (the multilayer

perceptron)a review of applications in the atmospheric sciences. Atmospheric En-

vironment, 32, 2627–2636.

Gautam, D. & Holz, K. 2000 Neural network based system identification

approach for the modelling of water resources and environmental systems. Artificial

intelligence methods in civil engineering applications, 87-100.

Giorgi, F. 1990 Simulation of regional climate using a limited area model nested

in a general circulation model. Journal of Climate, 3, 941-963.

Goodess, C. M. 2011 An inter-comparison of statistical downscaling methods for

Europe and European regionsAssessing their performance with respect to extreme

Page 46: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

BIBLIOGRAPHY 39

weather events and the implications for climate change applications. Technical re-

port, Univ. of East Anglia, Norwich, U. K., in press.

Groisman, P., Karl, T., Easterling, D., Knight, R. W., Jamason, P.,

Hennessy, R. S., C.M., P., Wibig, J., Fortuniak, K., Razuvaev, V.,

Forland, E. & Zhai, P. 1999 Changes in the probability of heavy precipitation:

Important indicators of climatic change. Climate Change, 42, 243-283.

Gutowski, W.J., Arritt, R.W., Kawazoe, S., Flory, D.M., Takle,

E.S., Biner, S., Caya, D., Jones, R.G., Laprise, R., Leung, R.L.,

Mearns, L.O., Moufouma-Okia, W., Nunes, A.M.B.,Qian, Y , Roads,

J.O., Sloan, L.C. & Snyder, M.A. 2010 Regional Extreme Monthly Precip-

itation Simulated by NARCCAP RCMs. Journal of Hydrometeor, 11, 1373-1379.

doi: http://dx.doi.org/10.1175/2010JHM1297.1

Haylock, M. R., Cawley, G. C., Harpham, C., Wilby, R. L. & Goodess,

C. M. 2006 Downscaling heavy precipitation over the united kingdom: a compar-

ison of dynamical and statistical methods and their future scenarios. International

Journal of Climatology, 26, 13971415.

Hellstrom, C., Chen, D., Achberger, C. & Risnen, J. 2001 Comparison of

climate change scenarios for sweden based on statistical and dynamical downscaling

of monthly precipitation. Climate Research, 19, 45-55.

Hennessy, K. J. & Suppiah, R. 1999 Australian rainfall changes, 19101955.

Australian Meteorological Magazine, 48, 1-13.

Hostetler, S.W., Bartlein, P.J., Clark, P.U., Small, E.E. & So-

lomon, A.M. 2000 Simulated influence of Lake Agassiz on the climate of Central

North America 11,000 years ago. Nature, 405(6784), 334–337.

Page 47: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

BIBLIOGRAPHY 40

Houghton, JT., Meira Filho, L.,Callander, B., Harris, N., Kat- ten-

berg, A. & Maskel, l. K. e. 1996 IPCC climate change. the IPCC second

assessment report. Cambridge University Press: New York .

Hsieh, W. W. & Tang, B. 1998 Applying neural network models to prediction

and data analysis in meteorology and oceanography. Bulletin of the American Met-

eorological Society, 79, 1855-1870.

Hsieh, W. W. 2009 Machine Learning Methods in the En-

vironmental Sciences. Cambridge University Press, 1st edi-

tion.http://dx.doi.org/10.1017/CBO9780511627217.

Iwashima, T.& Yamamoto, R. 1993 A statistical analysis of the extreme events:

long-term trend of heavy daily precipitation. Journal of the Meteorological Society

of Japan, 71, 637-640.

Kaas, E. & Frich, P. 1995 Diurnal temperature range and cloud cover in the nordic

countries: observed trends and estimates for the future. Atmospheric Research, 37,

211-228.

Kalnay, E., Kanamitsu, M., Kistler, R., Collins, W., Deaven, D.,

Gandin, L., & Joseph, D. 1996 The NCEP/NCAR 40-year reanalysis project.

Bulletin of the American meteorological Society, 77(3), 437-471.

Kalnay, E 2003 Atmospheric Modeling, Data Assimilation and Predictability. Cam-

bridge university.

Karl, T. R., Knight, R. W. & Plummer, N. 1995 Trends in high frequency

climate variability in the twentieth century, 217-220.

Page 48: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

BIBLIOGRAPHY 41

Karl, T. R. & Knight, R. W. 1998 Secular trends of precipitation amount fre-

quency and intensity in the United States. Bulletin of the American Meteorological

Society, 79, 231-241.

Kaas, E. & Frich, P. 1995 Diurnal temperature range and cloud cover in the nor-

dic countries: observed trends and estimates for the future. Atmospheric Research,

37, 211-228.

Lenderink, G., Buishand, A. ,& van Deursen, W. 2007 Estimates of future

discharges of the river Rhine using two scenario methodologies: Direct versus delta

approach. Hydrology and Earth System Sciences, 11(3), 1145-1159.

Linderson, M., Achberger, J. & Chen, D. L. 2004 Statistical downscal-

ing and scenario construction of precipitation in scania, southern sweden. Nordic

Hydrology, 35(3), 261-278.

Mekis, & Vincent, L.A. 2011 An overview of the second generation adjusted

daily precipitation dataset for trend analysis in Canada. Atmosphere-Ocean, 49(2),

163-177.

Plummer, N., Salinger, M. J., Nicholls, N., Suppiah, R., Hennessy,

K. J., Leighton, R. M., Trewin, B., Page, C. M. & Lough, J. M.

1999 Changes in climate extremes over the australian region and new zealand dur-

ing the twentieth century. In Weather and Climate Extremes, 183-202. Springer

Netherlands.

Rummukainen, M. 1997 Methods of statistical downscaling of GCM simulations.

SMHI Rapporter. Meteorologi och Klimatologi (Sweden). no. 80.

Page 49: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

BIBLIOGRAPHY 42

SAS Institute Inc. 2011 SAS(R) Enterprise Miner 7.1 Extension Nodes: De-

velopers Guide. SAS Cary, NC: SAS Institute Inc.

Schonwiese, C. & Rapp, J 1996 Climate trend atlas of Europe based on obser-

vations 1891- 1990. Kluwer Academic Publishers: Dordrecht.

Schoof, J. & Pryor, S. 2001 Downscaling temperature and precipitation: a

comparison of regression-based methods and artificial neural networks. Interna-

tional Journal Climatology, 21, 773–790.

Suppiah, R. & Hennessy, K. J. 1995 Trends in the intensity and frequency of

heavy rainfall in tropical Australia and links with the southern oscillation. World

Meteorological Organization-Publications-WMO TD, 897-901.

Suppiah, R. & Hennessy, K. J. 1998 Trends in total rainfall, heavy rainfall

events and number of dry events in Australia, 1910-1990. International Journal of

Climatology, 18, 1141-1164.

Trenberth, K.E., Karl, T. R. & Knight, R. W. 2003 Modern global climate

change. Science, 302, 1719-1723

Vincent, L. A. 1998 A technique for the identification of inhomogeneities in Ca-

nadian temperature series. Journal of Climate, 11, 1094-1104.

Vincent, L. A. & Gullett, D. W. 1999 Canadian historical and homogen-

eous temperature datasets for climate change analyses. International Journal of

Climatology, 19, 1375-1388.

Vincent, L. A., Zhang, X., Bonsal, B.R. & Hogg, W. D. 2002 Homogen-

ization of daily temperatures over Canada. Journal of Climate, 15, 1322-1334.

Page 50: Evaluating conditional density estimation networks as a … · 2015. 6. 26. · Evaluating conditional density estimation networks as a probabilistic downscaling tool: Application

BIBLIOGRAPHY 43

Vincent Lucie, A. & Mekis, V.A. 2006 Changes in daily and extreme temper-

ature and precipitation indices for canada over the twentieth century. Atmosphere-

Ocean, 44 (2), 177-193.

Vrac, M., & Naveau, P. 2007 Stochastic downscaling of precipitation: From

dry events to heavy rainfalls. Water resources research, 43(7).

Weichert, A.& Burger, G. 1998 Linear versus nonlinear techniques in down-

scaling. Climate Research, 8, 83-93.

Wigley, T. M. L., Jones, P. D., Briffa, K. R. & Smith, G. 1990 Obtaining

subgrid scale information from coarse resolution general circulation model output.

Journal of Geophysical Research: Atmospheres (1984-2012), 95(D2), 1943-1953.

Wilby, R. L., Wigley, T. M. L., Conway, D., Jones, P. D., Hewitson,

B. C., Main, J.& Wilks,D. S. 1998 Statistical downscaling of general circula-

tion model output: A comparison of methods. Water resources research, 34(11),

2995-3008.

Wilks, D. S. 1998 Multisite generalization of a daily stochastic precipitation gen-

eration model. Journal of Hydrology, 210, 178–191.

Wilks, D. S. 2006 Statistical Methods in the Atmospheric Sciences. Academic

Press, 627.

Williams, P. M. 1998 Modelling seasonality and trends in daily rainfall data.

Prameters, 16(12), 15.