

www.elsevier.com/locate/advwatres

Advances in Water Resources 30 (2007) 1190–1204

Uncertainty analysis for estimating flood frequencies for ungauged catchments using rainfall-runoff models

David A. Jones, Alison L. Kay *

Centre for Ecology and Hydrology, Maclean Building, Crowmarsh Gifford, Wallingford, Oxfordshire OX10 8BB, UK

Received 23 May 2006; received in revised form 20 October 2006; accepted 26 October 2006
Available online 4 January 2007

Abstract

Continuous simulation of flows for ungauged catchments is a methodology being developed for estimating flood frequencies where no flood records exist. This involves driving a rainfall-runoff model with either simulated or observed rainfalls, using values of the rainfall-runoff model parameters derived using a generalisation procedure based on analysing sets of parameter values for calibrated catchments. This paper examines the uncertainty associated with such generalised parameters, and carries this through to estimate the uncertainty of the generalised flood frequency curves for ungauged catchments. The approach used distinguishes two sources of uncertainty: the uncertainty in the parameters calibrated for individual catchments, and the uncertainty with which parameters for an ungauged catchment can be estimated based on calibrated parameters for other catchments and descriptors of those catchments. The uncertainty associated with estimates for ungauged catchments can then be reduced compared with a more direct approach, firstly by allowing one of the components of uncertainty to be omitted, and secondly by allowing the introduction of weighting schemes which reduce the effect of catchments where calibration uncertainty is high.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Components of uncertainty; Estimating uncertainty; Flood frequency; Continuous simulation; Generalisation

1. Introduction

When estimates of flood frequency are required for sites for which there are no systematic records of past river flows, two generally applicable classes of approach are potentially available. They are both based on making use of information from a large set of gauged sites for which substantial flow records are available and they both make use of the assumptions, for catchments in the same general climate regime, that flood frequencies for similar catchments will be similar and that variations in the flood frequencies for different catchments can be explained to a reasonable extent by differences in the physical attributes of the catchments, quantified by various catchment descriptors.

Approaches in the first class work directly with the statistical distributions of extreme events for the collection of gauged sites: often the analysis relates to the distribution of annual maximum peak flows. Relations are sought between these statistical distributions and the catchment descriptors and these relationships are then used to derive estimates for the statistical distribution of extreme events for a target catchment based on its known catchment descriptors. Approaches in the second class take the standard methods of flood frequency analysis, usually applied to long records of gauged flows, and apply them to synthetic records of flow which are generated within the methodology. These synthetic records of flow are created as time series (at a fine time-resolution) by driving a rainfall-runoff model with either stochastically generated rainfall series or with long records of observed catchment rainfalls. This requires that any parameters of the rainfall-runoff model can be specified without needing to make use of observed flow records for the target site. Relations are sought between the parameters of the rainfall-runoff model and the catchment descriptors. Where rainfall data are generated from a stochastic model, the parameters of this model would also need to be related to a possibly different set of catchment descriptors.

0309-1708/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.advwatres.2006.10.009

* Corresponding author. Tel.: +44 1491 838800; fax: +44 1491 692424.
E-mail addresses: [email protected] (D.A. Jones), [email protected] (A.L. Kay).

In this paper, to avoid the use of the words "estimate" and "estimation" for several different sets of quantities and procedures, the term "generalisation" will be used for the task of inferring values for ungauged catchments on the basis of catchment properties and data-records at a collection of gauged sites. The values inferred will be called the generalised values. Typical generalisation procedures include regression, local-averaging or local-regression. These latter techniques might be applied to a subset of the available catchments chosen to be most similar to a given target ungauged catchment, with similarity being measured using the catchment properties. Section 2 describes the generalisation approaches and rainfall-runoff models used in this study.

While the present paper is principally concerned with generalisation for the parameters of rainfall-runoff models, similar methodologies are potentially applicable more widely, in particular to generalisation of the parameters of statistical distributions. Examples of work on flood-related distributions are provided by Madsen and Rosbjerg [13], Reis et al. [21] and Tasker and Stedinger [23]. Such problems have a common feature that takes the analysis of uncertainty for these generalisation procedures outside the scope of ordinary regression problems. This key common feature is that, while the target of the generalisation procedure is a notional true value of a certain quantity for a new catchment, only estimates of that quantity are available for the catchments used within the generalisation procedure. For example, in the present application, the limited number of years of data for a catchment leads to the estimates of parameter values, obtained by calibration of a rainfall-runoff model, being less accurate than they might have been if more years of record were available. The "true" parameter values for a catchment can reasonably be defined by imagining that the length of record increases and that the same calibration procedure is used: then the estimates notionally converge to a set of "best" or "true" values for the catchment. When generalisation procedures are applied for the parameters of statistical distributions, the values available for gauged catchments are estimates of these parameters obtained from limited records.

If ordinary approaches to uncertainty analysis of the generalised values for ungauged catchments were applied, such as those within standard regression analyses, these would provide an assessment of uncertainty that is not immediately relevant to the generalisation task. Specifically, what is wanted is an assessment of uncertainty which measures how well the generalised value is expected to match the true value of the target quantity for the new catchment, whereas an ordinary approach would effectively measure the match between the generalised value and an estimate of the target quantity that might have been obtained had a data-record, of typical length, been available for the new catchment. This paper describes how the required assessment can be obtained via a detailed analysis of uncertainty.

It is clear that the sizes of the errors when the generalised values estimate a true value for a given catchment will be smaller than when they estimate a value subject to random error but centred on the true value for the catchment. Hence it can be claimed that a detailed analysis of uncertainty, coupled with an identification of the correct errors to be considered, leads to a reduction of uncertainty compared to a less detailed analysis. The key point is the "identification of the correct errors to be considered", which is only possible within a detailed analysis of uncertainty. In the present context, the types of errors considered are: (i) the generalisation error, which is the difference between the true value for a catchment and the value predicted by the generalisation procedure; (ii) the calibration error, which is the difference between the value calibrated for a given catchment (using limited observed data) and the true value for the catchment. Two papers [10,11], associated with this one, make use of this identification in the context of creating improved generalisation estimates.

The uncertainties associated with flood frequency curves estimated for ungauged catchments have been the subject of a number of previous papers. For example, Lamb and Kay [12] used a forerunner of the present approach which did not identify explicitly the calibration uncertainty and which treated errors in estimated model parameters as independent. Wagener and Wheater [26] apply a different approach to treating calibration uncertainty, using their measure of parameter "identifiability" to construct weights for weighted regression, but they appear not to use the separation of the regression error into components to reduce the uncertainty that needs to be added to regression estimates. McIntyre et al. [14] discuss a form of weighted site-similarity, with weights different from those employed here, and introduce a further simple approach to assessing the uncertainty of model parameters for ungauged catchments. For this, a group of catchments similar to a target catchment is selected but, instead of forming generalised estimates of the model parameters, the sets of calibrated parameters for the selected catchments are taken directly as sets of possible parameters for the target catchment, with the variability within the sets being an indication of uncertainty. Such an approach treats all parameters jointly and implicitly allows for any interdependence between them: once again it does not use an identification of calibration uncertainty to allow a reduction in the variability within the parameters from the similar catchments.

Blazkova and Beven [1] take a different approach to using continuous simulation for flood frequency estimation since they rely neither on transferring information about model parameters from gauged catchments, nor indeed on any formal calibrations of rainfall-runoff models. Instead they use regional estimates of flood frequencies and flow duration characteristics for a target site and use a quasi-Bayesian procedure to attach weights to randomly selected parameter sets, uniformly distributed over wide ranges, where the weights reflect how well the flood frequency and flow characteristics from continuous simulation agree with the regional estimates. The weights attached to the flood frequency curves are then used to express the uncertainty with which flood frequencies can be estimated by the procedure, where the range of return periods considered is much wider than those used for the regional estimates. Engeland and Gottschalk [6] apply a Bayesian framework to fit a conceptual hydrological model to several catchments simultaneously, where the same hydrological parameters are used for all catchments but where different biases in the flow simulations, and different uncertainty properties for the simulation errors, are allowed for each catchment.

This paper presents methods, developed and applied alongside the generalisation methods of Kay et al. [10,11], to estimate uncertainty bounds for generalised flood frequency curves. An additional feature of these methods was to allow the use of calibration uncertainty within weighted versions of each generalisation approach, which improved the generalisation itself. The application of these weighting methods is described in [10,11]. This paper concentrates on the estimation of uncertainty bounds, and presents a comparison of these bounds for the two rainfall-runoff models and for two generalisation approaches: site-similarity and weighted univariate regression.

2. Background

In the UK, methods for estimating flood frequency at ungauged sites, based upon the direct approach of generalising statistical distributions, are relatively well developed, including those of the Flood Estimation Handbook [8]. In contrast, continuous simulation methods using rainfall-runoff models for ungauged catchments are still in the development stage for the UK. The latter approach is seen as having several potential advantages compared to the former, including easier application to problems involving flooding at several sites or over a large region, where such studies might need to include hydrodynamic modelling, and the more natural application of climate change scenarios.

Recent research [10,11] developed and compared different approaches to the generalisation of two simple, conceptual rainfall-runoff models for the UK: the Probability Distributed Model (PDM) [15,16] and the Time–Area Topographic Extension (TATE) model [2,3]. Both model formulations were parameter-sparse, to improve the prospects of spatial generalisation by limiting parameter interdependence. Each takes, as input, time-series of catchment average rainfall and potential evaporation (PE) data, and outputs a time-series of river flows at the catchment outlet. Concurrent flow data are required for model calibration. An additional input required by the TATE model is information on the distance–area distribution for the catchment (calculated from digital terrain data).

The research was based on a set of 119 catchments spread across Great Britain (Fig. 1). Of these, 46 catchments had hourly rainfall and flow data with the remaining 73 having daily data. The catchments ranged in size from about 1 km² to 1200 km², with a mean of around 250 km². For each site the required rainfall and PE data were derived, as described by Kay et al. [11], as well as a set of 24 catchment properties covering aspects of topography, lakes and reservoirs, rainfall, soils, geology and drainage networks.

In order to derive estimated parameters for each catchment a special automated calibration procedure was developed. The rainfall-runoff models for all catchments were calibrated using a 1-h time-step for the calculations and, where necessary, daily rainfalls were applied on an hourly basis using a selected profile procedure. Calibration was based on a random search procedure where values for each model parameter were selected sequentially, with a different criterion of fit being used for each parameter.

Two distinct approaches to generalisation were developed and tested: one based on site-similarity [11] and the other based on regression [10]. Site-similarity involves the definition of a pooling group for a target catchment, comprising a number of calibrated catchments most similar, in terms of some set of catchment properties, to the target catchment. The parameter estimate for the target site is then formed from the calibrated parameter values of the pooling group, possibly using distance and/or uncertainty weighting. In contrast, regression makes use of the whole set of calibrated catchments. Different variations on regression were tested: weighted univariate regression, with weights based on estimates of calibration uncertainty, and sequential regression, where parameters are generalised one after the other in an attempt to include the effects of parameter interdependence. Precise details of these approaches will not be given here.
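Since the precise pooling rules are not given here, the following is only a minimal sketch of how a site-similarity estimate with distance and uncertainty weighting might look. The function name `pooled_estimate`, the standardised Euclidean distance, the group size `k` and the particular weighting formula are all assumptions made for illustration; they are not the scheme of [11].

```python
import numpy as np

def pooled_estimate(target_props, props, calibrated, calib_var, k=10):
    """Illustrative site-similarity estimate for one model parameter.

    target_props : (d,) catchment properties of the ungauged target
    props        : (n, d) properties of the calibrated catchments
    calibrated   : (n,) calibrated parameter values
    calib_var    : (n,) assumed calibration-error variances
    k            : pooling-group size (hypothetical default)
    """
    target_props = np.asarray(target_props, dtype=float)
    props = np.asarray(props, dtype=float)
    calibrated = np.asarray(calibrated, dtype=float)
    calib_var = np.asarray(calib_var, dtype=float)

    # Standardise each property so that no single descriptor dominates.
    scale = props.std(axis=0)
    scale[scale == 0] = 1.0
    dist = np.linalg.norm((props - target_props) / scale, axis=1)

    # Pooling group: the k calibrated catchments closest to the target.
    group = np.argsort(dist)[:k]

    # Assumed weighting: nearer and better-calibrated catchments count more.
    weights = 1.0 / ((1.0 + dist[group]) * calib_var[group])
    weights /= weights.sum()
    return float(np.sum(weights * calibrated[group])), group
```

With equal calibration variances the weights depend on distance alone; giving a catchment a large `calib_var` reduces its influence, which is the intended effect of uncertainty weighting.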

A comparison of the approaches [10] suggested that, overall, using site-similarity with the PDM provided potentially the best option for generalisation, although the performance using regression was not substantially worse. Using univariate regression with the TATE model also performed well overall, and was potentially the second best option. Site-similarity did not perform well for the TATE model. There was no clear advantage to the use of sequential regression over univariate regression for either model.

3. Modelling of uncertainty

3.1. Introduction

Fig. 1. Locations of the study catchment outlets (hourly – triangles, daily – circles), with catchment boundaries.

The work reported here was developed within research, outlined above, for which the main target was to provide a methodology for estimating the flood frequency curve at sites for which there is not sufficient information to allow the direct calibration of a rainfall-runoff model relevant for that site. In order to apply the continuous-simulation principle, estimates for the parameters of the catchment model have to be derived from the parameter values calibrated for other catchments, by relating these parameters to catchment properties. The main aim of the research was to provide a mechanism whereby this generalisation step can be undertaken, with the eventual outcome for a given ungauged catchment being an estimated flood frequency curve. An important but secondary aim of the research was to provide a means of assessing the uncertainty arising from the generalisation procedure. The present paper outlines the approach used for analysing uncertainty. This analysis of uncertainty has two major benefits:

(i) it allows the development of generalisation procedures that take account of the different sources of uncertainty, where some of these uncertainties may differ between catchments, and which, at a theoretical level, are better than procedures which do not;

(ii) it allows the amount of uncertainty attributed to the final generalisation results to be reduced compared to a direct approach not based on an analysis of uncertainty.

To clarify this last point, some reduction in the width of an uncertainty-band placed about the flood frequency curve (or about values for individual model parameters) can always be achieved because the analysis of uncertainty is able to omit an unnecessary part of the uncertainty that would otherwise be included in a more empirical approach. In contrast, point (i) relates to improvements gained by giving less weight to catchments where calibration uncertainty is greatest: if these uncertainties are broadly similar across catchments, no improvement is gained. In the present research, averaging across catchments, uncertainty bands for individual model parameters of the PDM were between 4% and 10% narrower compared with analyses ignoring uncertainty if the generalisation method did not include uncertainty weighting, and between 4% and 12% narrower if uncertainty weighting was included. For the TATE model the uncertainty bands were between 11% and 15% narrower, not using uncertainty weighting, and 12% to 34% narrower if uncertainty weighting was used. These comparisons relate to using the site-similarity approach to generalisation.

3.2. Treating model parameters separately

The discussion here treats a single parameter of the catchment model at a time and forms the main basis of the methodology for handling uncertainty. While the uncertainties of all the model parameters are, in fact, considered jointly (a multivariate analysis), the main types of generalisation methods considered are constructed by dealing with each parameter separately (i.e. using univariate techniques). Section 3.4 outlines how the univariate models extend to the multivariate case for the simple generalisation methods. However, it has not been possible to develop an overall analysis on a fully multivariate basis, the pragmatic approach of Section 4.5 being used instead.

Consider a fixed set of values for catchment properties and consider a number of "very similar" catchments having these catchment properties (that is, they would be indistinguishable according to hydrological catchment descriptors). If the catchment model were calibrated separately to these catchments, the mean value of the selected parameters across all of these "very similar" catchments is defined as the "true" generalisation parameter value $\mu$. The "true" parameter value for a given catchment is

$$T = \mu + \eta, \tag{1}$$

where $T$ represents the model parameters that would be calibrated for a given site if there were an infinitely long data series available. The random error term $\eta$ differs between instances of catchments having the same properties and represents variations in the "true" model parameters for catchments that would be judged to be very similar. The random error term $\eta$ is assumed to have a mean of zero, given that $\mu$ represents the mean value across all similar catchments.

The parameter value obtained by calibrating the catchment model for a single catchment is denoted by $Y$, where

$$Y = T + \varepsilon = \mu + \eta + \varepsilon. \tag{2}$$

Here $\varepsilon$ represents the calibration error for the catchment. This random variable is assumed to have a zero mean, but its distribution will typically have a spread related to the length of record available for calibration and other catchment-specific factors.

In the following, it is assumed that $n$ catchments are available for which the catchment model has been calibrated by a well-defined approach, and for which sets of catchment properties are available. Thus the data to be used for generalisation consist of the calibrated parameter values for sites $i = 1, \ldots, n$, which can be put into a similar representation to (2):

$$Y_i = T_i + \varepsilon_i = \mu_i + \eta_i + \varepsilon_i, \tag{3}$$

where the following notation is used:

$\mu_i$: mean value of the parameter across catchments "very similar" to catchment $i$;

$\eta_i$: the generalisation error for catchment $i$;

$T_i$ ($= \mu_i + \eta_i$): the parameter value that would be calibrated at catchment $i$, given an infinitely long calibration data set;

$\varepsilon_i$: calibration error at catchment $i$ due to having a limited data set for calibration.

The problem of generalisation typically involves a catchment for which no calibration data are available: values for such a catchment will be indicated with a subscript $*$. The task of the generalisation procedure is to calculate an estimate for $T_*$, where

$$T_* = \mu_* + \eta_*. \tag{4}$$

Here $\eta_*$ is the unknown generalisation error for the target catchment, which is assumed to have the same statistical characteristics as the $\eta_i$ for the calibration catchments. The size of the generalisation error is therefore characterised by the variance of $\eta_*$. The analysis of uncertainty to be undertaken allows this variance to be estimated.

In contrast, a direct empirical treatment of uncertainty considers the target quantity

$$Y_* = T_* + \varepsilon_* = \mu_* + \eta_* + \varepsilon_*, \tag{5}$$

instead of $T_*$, where $\varepsilon_*$ represents a notional calibration error for the target catchment. The direct approach thus provides an estimate for the variance of $(\eta_* + \varepsilon_*)$, which is not what is wanted. Specifically, under the above assumptions, the direct approach would base its assessment of uncertainty on

$$s^2 = n^{-1} \sum_{i=1}^{n} (Y_i - \hat{\mu}_i)^2 = n^{-1} \sum_{i=1}^{n} (Y_i - \mu_i)^2 = n^{-1} \sum_{i=1}^{n} (\eta_i + \varepsilon_i)^2. \tag{6}$$

This provides an estimate of the variance of $(\eta_* + \varepsilon_*)$ in cases where the extent of the availability of calibration data can be treated as statistically equivalent for different catchments, and where this includes data that are notionally available for the target catchment (but are not). A direct analysis essentially assesses the generalisation estimate on its ability to predict the notional value of $Y_*$. For the more complete analysis of uncertainty used here, the quantities $Y_*$ and $\varepsilon_*$ are not involved at all. However, the discussion here has assumed that the generalised value, $\mu_*$, can be estimated without error, which is clearly not the case. The uncertainty associated with this generalisation error can readily be taken into account in the analysis of uncertainty: this is outlined later.
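The difference between the two targets can be illustrated numerically. The sketch below simulates the model of Eq. (3) with independent normal errors and compares the direct quantity of Eq. (6) with a simple moment-based estimate of the generalisation-error variance. The normal distributions, the numerical values, the assumption that each catchment's calibration variance is known, and the subtraction estimator itself are all illustrative assumptions, not the estimator used by the authors.

```python
import numpy as np

def compare_variances(n=5000, sigma_eta=0.30, seed=0):
    """Simulate Y_i = mu_i + eta_i + eps_i and compare two variance estimates."""
    rng = np.random.default_rng(seed)

    # Assumed per-catchment calibration standard deviations (treated as known).
    sigma_eps = rng.uniform(0.05, 0.25, size=n)

    mu = rng.normal(1.0, 0.5, size=n)        # "true" generalised values
    eta = rng.normal(0.0, sigma_eta, size=n)  # generalisation errors
    eps = rng.normal(0.0, sigma_eps)          # calibration errors
    y = mu + eta + eps                        # calibrated values, Eq. (3)

    # Direct approach, Eq. (6): estimates Var(eta + eps).
    s2_direct = float(np.mean((y - mu) ** 2))

    # Assumed moment estimator: remove the average calibration variance.
    var_eta_hat = s2_direct - float(np.mean(sigma_eps ** 2))
    return s2_direct, var_eta_hat

s2_direct, var_eta_hat = compare_variances()
print("direct variance:", s2_direct)
print("generalisation variance:", var_eta_hat)
print("band-width ratio:", (var_eta_hat / s2_direct) ** 0.5)
```

The last line expresses the narrowing as a ratio of standard deviations, which assumes that band widths scale with the standard deviation of the error being described.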

3.3. Some assumptions in the uncertainty model

The model structure outlined in (3) needs to be extended by making some assumptions about the statistical properties of the sets of error components $\{\varepsilon_i\}$, $\{\eta_i\}$ and $\eta_*$. Given the way these error components have been defined, the assumption follows that they all have a mean value of zero. A major assumption is that the typical sizes of the generalisation errors $\{\eta_i\}$, as measured by the variance, do not vary in any predictable way in terms of any set of known catchment properties: the variance of the generalisation error is constant. In practice the generalisation procedures are applied to transformed versions of the parameters of rainfall-runoff models (for example, by constructing a generalisation rule for the logarithm of a parameter and then transforming back with the exponential function to derive the final generalised value). Such transformations are chosen as a matter of judgement on the combination of several criteria. The assumption of constant variance has been checked for the transformed parameters on the basis of scatter plots involving the calibrated values, the generalised values once derived, and the individual catchment properties.
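The log-transform example mentioned above can be sketched as follows. This is a minimal illustration using ordinary (unweighted) least squares on a linear rule in the catchment properties; the function names `fit_log_rule` and `generalise` are hypothetical, and the weighted regression actually used in [10] is not reproduced here.

```python
import numpy as np

def fit_log_rule(props, calibrated):
    """Fit log(parameter) = b0 + b . props by ordinary least squares."""
    props = np.asarray(props, dtype=float)
    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(len(props)), props])
    coef, *_ = np.linalg.lstsq(X, np.log(calibrated), rcond=None)
    return coef

def generalise(coef, target_props):
    """Apply the rule on the log scale, then back-transform with exp."""
    log_value = coef[0] + np.dot(coef[1:], target_props)
    return float(np.exp(log_value))
```

Working on the log scale keeps the back-transformed value positive, and the constant-variance assumption is then checked for the transformed parameter rather than the raw one.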

Another set of assumptions made is that the error components are uncorrelated between the different types of component and are uncorrelated for errors of the same type at different catchments. These are partly justified by the conceptual basis of the error components in the above model. It can be further argued that, if there were any correlation in the generalisation errors with a given set of catchment properties as the basis of the generalisation, it could be used to create an improved generalisation rule either using the same set or an extended set of catchment properties. Thus the usual checks on the behaviour of generalisation rules (checking that there is no benefit from obvious modifications to the rule) serve as a partial check on this assumption. The assumption of no correlation for the calibration errors $\{\varepsilon_i\}$ is more problematic. It can be argued that neighbouring catchments will be affected by the same weather events within their calibration periods, and thus the calibration errors might be expected to be correlated on this basis. The likely extent of this correlation is unknown: adjacent catchments might be sufficiently different for them to be sensitive to different aspects of rainfall patterns and thus the calibration errors might be quite unrelated. For the present study, the catchments being considered are widely dispersed geographically, although there are some pairs of catchments that are rather close together (Fig. 1). The results from the present study suggest that the calibration errors make only a small contribution to the uncertainty with which the generalised values predict model parameters for ungauged catchments. Thus the assumption of uncorrelated calibration errors may be relatively unimportant.
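As a worked step, the zero-mean and no-correlation assumptions imply that the variance addressed by a direct approach splits into two parts. The symbols $\sigma_\eta^2$ (constant generalisation-error variance) and $\sigma_{\varepsilon,i}^2$ (calibration-error variance at catchment $i$) are notation introduced here for illustration only.

```latex
\begin{align*}
\operatorname{Var}(Y_i - \mu_i)
  &= \operatorname{E}\big[(\eta_i + \varepsilon_i)^2\big] \\
  &= \operatorname{E}[\eta_i^2] + 2\operatorname{E}[\eta_i \varepsilon_i]
     + \operatorname{E}[\varepsilon_i^2] \\
  &= \sigma_\eta^2 + \sigma_{\varepsilon,i}^2 .
\end{align*}
```

The cross term vanishes because the two error types are uncorrelated and have zero means. Only the first part, $\sigma_\eta^2$, describes how far the true value for a catchment lies from its "true" generalised value.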

3.4. Treating several model parameters

When all the parameters of a rainfall-runoff model are considered together, the same uncertainty structure as used in Section 3.2 can be used except that now all the quantities involved are vectors. Thus, if a catchment model has $P$ parameters, then the quantity $Y$ in the earlier section is a (column) vector with elements $\{Y(p),\ p = 1, \ldots, P\}$, where $Y(p)$ is the calibrated value of the $p$th parameter. Similarly $\mu$ is now a vector with elements $\{\mu(p),\ p = 1, \ldots, P\}$, where $\mu(p)$ is the "true" generalised value for the $p$th parameter.

The error-components $\eta$ and $\varepsilon$ also become vectors. Section 3.3 described the assumptions used in the case of univariate modelling. For the multivariate case, errors of the same type and for the same catchment but for different model parameters are allowed to be cross-correlated. All other cross-correlations are assumed to be zero. It is clearly important to include the possible cross-correlation of the generalisation errors for different parameters within any assessment of the uncertainty of quantities derived from rainfall-runoff models using generalised parameters.
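As a compact restatement of these multivariate assumptions, they can be written as below. This is only a notational sketch: the Kronecker delta $\delta_{ij}$ is introduced here for convenience, while $\Sigma_\eta$ and $\Sigma_{i,\varepsilon}$ are the covariance matrices of the generalisation errors and of the calibration errors for catchment $i$ that appear in Section 4.5.

$$
\operatorname{cov}\{\eta_i(p), \eta_j(q)\} = \delta_{ij}\,\Sigma_\eta(p,q), \qquad
\operatorname{cov}\{\varepsilon_i(p), \varepsilon_j(q)\} = \delta_{ij}\,\Sigma_{i,\varepsilon}(p,q), \qquad
\operatorname{cov}\{\eta_i(p), \varepsilon_j(q)\} = 0 .
$$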

The regression and site-similarity approaches to generalisation treat each model parameter separately and, in practice, different collections of catchment properties are used for each parameter. In such cases the generalisation rules are effectively univariate, but an analysis of the generalisation uncertainty treating all parameters jointly is still needed. Section 4 describes the overall approach that has been implemented for site-similarity and regression generalisation, and this includes (Section 4.5) a pragmatic approach to dealing with the multivariate uncertainty of the generalised model parameters. Section 3.5 outlines how the multivariate model for uncertainty might be used to construct more fully-multivariate generalisation rules, and indicates why this approach has not been adopted for implementation.

3.5. Fully-multivariate generalisation approaches

The generalisation methods implemented here (site-similarity and regression) are applied by performing separate generalisation analyses for each parameter, and these methods can therefore be considered as multiple-univariate in nature. The important features of these methods are:

• separate choices are made for each parameter of the catchment properties that are used within the procedure;

• the generalised values for a given parameter are not affected by the values calibrated for other parameters.

In principle, improved generalisation procedures are available if fully-multivariate statistical modelling methodologies are applied. However, it can be strongly argued that any such improvement is largely illusory since reliance would need to be placed on assumptions which cannot be justified. This is discussed here.

In the case that the univariate regressions for model parameters all estimate regression-parameters from exactly the same set of catchment properties, the estimates obtained from separate univariate regression and multivariate regression are identical, given that the cross-parameter covariances of the multivariate parameters are assumed unknown. In addition, the estimate of the covariance matrix of the errors would be identical to the estimate obtained from an analysis of the residuals from the separate univariate regressions. Given these results, the main situation in which the multivariate regression approach can provide improved model performance is when the separate univariate regression models do not all contain exactly the same set of catchment properties. Such models are readily fitted using separate regressions. For multivariate regression this situation is represented by imposing constraints on the elements of the regression-coefficient matrix; specifically that certain elements are known to be zero.

Pollock [18, Chapter 13] and Zellner [28, Chapter 8] discuss multivariate regression models with constrained coefficients. The work by both of these authors originated in an econometric context, where the terminology "seemingly unrelated regressions" is used, emphasizing that there need be no common explanatory variables across the regression models used for the different quantities being treated. In a hydrological context, Tung et al. [25] and Yeh et al. [27] have implemented these methods when generalising unit hydrograph parameters.

The improvement supposedly gained from multivariate regression arises from the assumption that particular ("true"/population) regression coefficients are zero: the importance of the exact truth of these assumptions within the methodology is unknown. Of course this is exactly the same assumption being made when a choice of regression-variables is being made in univariate regression and certain variables omitted. However, there is a radically different stature being given to the assumption. In univariate regression, model-assessment procedures effectively allow the conclusion that not much is lost if certain coefficients are set to zero, given considerations about the extra estimation errors introduced by including unnecessary variables. In multivariate regression, the assumption that a regression coefficient is zero is used in a complicated way to change the estimates of all the other regression coefficients, extending to all the dependent variables. The resulting improved estimates depend strongly on the assumption that certain regression coefficients are zero.

While it may be possible to argue on physical grounds that certain parameters of rainfall-runoff models should be more highly related to certain geographically derived variables, or that the relationships with particular variables are likely to be positively or negatively dependent, the assumption of zero regression coefficients is more problematic. Strictly, the consideration needs to take account of the other catchment properties that are to be included in the multivariate model. It seems unwise to base an estimation procedure on such an uncertain assumption.

4. Estimation of uncertainty components

4.1. Analysis of uncertainty

The analysis of uncertainty involves the variances of the two different types of errors: the generalisation error $\eta$ and the calibration error $\varepsilon$. While the variance of the generalisation error can reasonably be assumed constant (subject to verification once the model has been fitted), the variances of the calibration error will change from catchment to catchment due to differences in record length and in the ability of the rainfall-runoff model to simulate different flow regimes. The quantities involved in the analysis of uncertainty are therefore:

$\sigma^2_\eta$, the variance of the generalisation error, and

$\sigma^2_{i,\varepsilon}$, the variance of the calibration error for catchment $i$.

The analysis proceeds in three stages which are outlined in the following sections. First, the variances $\sigma^2_{i,\varepsilon}$ are estimated separately for each catchment (Section 4.2) and these values are then treated as fixed. Secondly, the generalisation variance, $\sigma^2_\eta$, is estimated by an iterative procedure (Section 4.3) which involves constructing generalisation estimates on the basis of assuming that the variance components are all known and then deriving a revised value of $\sigma^2_\eta$ from a comparison of these with the calibrated values of the parameters. Finally, the uncertainty of the generalised values for a given target catchment can be evaluated (Sections 4.4 and 4.5). This involves augmenting the basic generalisation uncertainty, represented by $\sigma^2_\eta$, with a contribution from the uncertainty arising in using the "sample" generalisation rule instead of the unknown "true" generalisation rule.

The theory outlined in Sections 4.2 and 4.3 treats each individual parameter of a rainfall-runoff model separately. In contrast, the assessment of the uncertainty of the flood frequency curve derived from spatial generalisation requires that the uncertainty of the complete set of parameters should be assessed jointly. This is undertaken by estimating the covariance of the estimation errors for pairs of parameters, as outlined in Sections 4.4 and 4.5.


The estimation of uncertainty components is based on the usual types of assumptions about the error components: the generalisation errors $\eta_i$ and calibration errors $\varepsilon_j$ are assumed to be uncorrelated between the two types of errors and across all catchments. While it is possible to extend the methodology to include cases where the target catchment for the generalisation step is one of the catchments used as the basis of the generalisation, this is not dealt with here as it is not one of the major concerns of the research. However, this extended methodology offers the potential for improving single-catchment calibrations of rainfall-runoff models by transferring information via a generalisation rule. Any improvement from this type of approach is likely to be small unless the record-length available for single-catchment calibration is short.

4.2. Estimation of the calibration variances

Here, the variances of the calibration errors, $\{\sigma^2_{i,\varepsilon}\}$, have been estimated as an extension of the procedure for calibrating the rainfall-runoff models, by applying a jackknifing methodology. Jackknifing is an established statistical procedure that provides a way of correcting the bias of an estimate of a quantity that is of direct interest, and for estimating the sampling variance of the estimate. The jackknife methodology can be traced back to initial ideas for bias correction by Quenouille [19,20] and, for variance estimation, by Tukey [24]. In textbooks, jackknifing is often discussed in association with a related technique called bootstrapping [4,5,22]. While the use of bootstrapping is feasible in the present context, jackknifing was chosen because it involves a fixed number of repetitions of the calibration task, rather than an arbitrary (larger) number for bootstrapping. Further, the work in recoding existing computer programs is rather less extensive.

In the usual statistical theory, the jackknife estimate of the sampling variance of an estimate can be formulated as follows. It is assumed that the basic estimate is a function of $N$ items, denoted by $\{X_1, \ldots, X_N\}$ and it is assumed that the $N$ items are statistically independent. The basic estimate is assumed to be defined in a consistent way as the number of items available changes. Let the basic estimate obtained from the $N$ items be denoted by $Z_N$, and define the estimate that would be produced from the $(N-1)$ items, when item $i$ is omitted from the full list of items, to be $Z_{N-1,i}$. There are $N$ such estimates with one item deleted: the sample mean of these is defined to be

$$\bar{Z}_N = N^{-1} \sum_{i=1}^{N} Z_{N-1,i}. \tag{7}$$

Then the jackknife variance estimate, defined so as to estimate $\operatorname{var}(Z_N)$, is

$$v = \frac{N-1}{N} \sum_{i=1}^{N} \left( Z_{N-1,i} - \bar{Z}_N \right)^2. \tag{8}$$

This estimate is based on the variation between the leave-one-out estimates $\{Z_{N-1,i}\}$ but it contains an adjustment factor that accounts for the non-independence of these estimates. In the case where several quantities are being estimated simultaneously, as for the parameters of the rainfall-runoff models, the jackknife variance estimation procedure provides an estimate of the covariance matrix of the set of estimated values as

$$V = \frac{N-1}{N} \sum_{i=1}^{N} \left( Z_{N-1,i} - \bar{Z}_N \right) \left( Z_{N-1,i} - \bar{Z}_N \right)^{\mathrm{T}}, \tag{9}$$

where the basic estimate $Z_N$ and the leave-one-out estimates $\{Z_{N-1,i}\}$ (and hence their mean $\bar{Z}_N$) are now vectors.
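The jackknife covariance estimate in (9) can be illustrated with a minimal sketch. The function names `jackknife_covariance` and `sample_mean` are hypothetical and are introduced here only for illustration; the estimator is assumed to return a list of numbers, and the items are assumed to be independent, as in the theory above.

```python
from typing import Callable, List, Sequence


def jackknife_covariance(
    items: Sequence[float],
    estimator: Callable[[List[float]], List[float]],
) -> List[List[float]]:
    """Jackknife covariance matrix of a vector-valued estimate, as in Eq. (9)."""
    data = list(items)
    n = len(data)
    if n < 2:
        raise ValueError("at least two items are needed")
    # Leave-one-out estimates Z_{N-1,i}.
    loo = [list(estimator(data[:i] + data[i + 1:])) for i in range(n)]
    dim = len(loo[0])
    # Mean of the leave-one-out estimates, Eq. (7).
    mean = [sum(z[k] for z in loo) / n for k in range(dim)]
    factor = (n - 1) / n
    # Sum of outer products of deviations, scaled by (N - 1) / N.
    return [
        [
            factor * sum((z[a] - mean[a]) * (z[b] - mean[b]) for z in loo)
            for b in range(dim)
        ]
        for a in range(dim)
    ]


def sample_mean(xs: List[float]) -> List[float]:
    """Illustrative one-component estimator: the sample mean."""
    return [sum(xs) / len(xs)]


if __name__ == "__main__":
    print(jackknife_covariance([1.0, 2.0, 3.0, 4.0, 5.0], sample_mean))
```

For the sample mean, the jackknife variance coincides with the usual estimate (the unbiased sample variance divided by $N$), which gives a convenient check on any implementation.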

Use of jackknife procedures to assess the uncertainty of the calibrated estimates of rainfall-runoff model parameters appears not to have been tried before. The approach implemented includes a procedural element intended to overcome problems associated with serial dependence in the time-series being used for the calibration of rainfall-runoff models. This element is related to the idea of Moran [17], in which serial correlation is overcome in the estimation of the sampling variance of a mean value by first forming mean values for non-overlapping sub-intervals which are each long enough for these sub-interval means to be effectively uncorrelated. In the present context the "$N$ items" on which the jackknife procedure is based are identified with the modelling-error information contained within each of $N$ calendar years. At the leave-one-out stage, the standard rainfall-runoff-model calibration procedure is applied by treating as missing all of the error contributions to objective functions that arise from a given calendar year. Thus it is assumed that the overall effect on the calibration procedure of the whole set of modelling errors in separate years will be effectively independent across the years. Notionally the assumption of independence being made relates to the errors in modelling flow values, not to the flow values themselves. The jackknife variance estimation procedure for estimating the uncertainty of the calibrated values of the parameters of the rainfall-runoff model is as follows:

(i) Apply the calibration procedure to the full data set to create the vector of values of fitted model parameters that are carried over into the generalisation procedure: this vector is effectively the basic estimate $Z_N$ above.

(ii) For each of $N$ calendar years covering the data period for which flow information is available, apply the calibration procedure to the data set ignoring contributions to the objective functions used for calibration that arise from a given calendar year. Each such leave-one-year-out calibration creates a vector of parameter values corresponding to $Z_{N-1,i}$ above.

(iii) Use formula (9) above to calculate the estimate $V$ for the covariance matrix of the calibration errors for a vector of rainfall-runoff model parameters.


In practice, it is most convenient to implement the leave-one-year-out calibration step by setting to "missing" all of the values of observed flows for a given calendar year. Calibration procedures for rainfall-runoff models are typically constructed to cope with periods of missing data in observed flow records, so that little extra programming is involved. Note that the leave-one-year-out procedure involves ignoring the observed flow values within a given year, but the observed rainfall and evaporation data for that year would continue to be used to create the full set of modelled flow values, and their effect would be carried over into subsequent years.
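The leave-one-year-out step can be sketched as follows. This is a toy illustration and not the rainfall-runoff models or calibration procedure used in the study: the single linear store in `simulate_store`, the one-parameter least-squares `calibrate` and the fixed `decay` value are all assumptions made so that the example is self-contained. The point it shows is that observed flows for the omitted year are set to missing (`None`), while rainfall for that year still drives the model state.

```python
from typing import List, Optional, Sequence


def simulate_store(rain: Sequence[float], decay: float = 0.5) -> List[float]:
    """Toy linear store driven by the full rainfall record (an assumption)."""
    store, states = 0.0, []
    for r in rain:
        store = decay * store + r
        states.append(store)
    return states


def calibrate(rain: Sequence[float], obs: Sequence[Optional[float]]) -> float:
    """Least-squares fit of k in flow = k * store, skipping missing flows."""
    states = simulate_store(rain)
    pairs = [(o, s) for o, s in zip(obs, states) if o is not None]
    return sum(o * s for o, s in pairs) / sum(s * s for _, s in pairs)


def leave_one_year_out_variance(
    rain: Sequence[float], obs: Sequence[float], years: Sequence[int]
) -> float:
    """Jackknife variance (Eq. 8) of the calibrated parameter, one year per item."""
    labels = sorted(set(years))
    n = len(labels)
    loo = []
    for omitted in labels:
        # Mask only the observed flows; rainfall for the year is still used.
        masked = [None if y == omitted else o for o, y in zip(obs, years)]
        loo.append(calibrate(rain, masked))
    mean = sum(loo) / n
    return (n - 1) / n * sum((z - mean) ** 2 for z in loo)


if __name__ == "__main__":
    rain = [1.0, 0.0, 2.0, 1.0, 0.0, 3.0, 1.0, 0.0, 2.0, 2.0, 0.0, 1.0]
    years = [2001] * 4 + [2002] * 4 + [2003] * 4
    obs = [2.0 * s for s in simulate_store(rain)]
    obs[5] += 1.0  # perturb one observed flow so that the years disagree
    print(calibrate(rain, obs), leave_one_year_out_variance(rain, obs, years))
```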

The correlations derived from the estimated covariance matrix provide a useful indication of the interaction between model parameters. In Section 4.5, the estimated covariance matrices of the calibration errors are included in the calculation of the overall generalisation uncertainty.

4.3. Estimation of the generalisation variance

The recent research considers two major types of spatial generalisation procedure: regression and site-similarity (Section 2). Although these are distinct procedures, with different conceptual bases, they are sufficiently similar that the same approach to the estimation of the generalisation variance can be adopted. In particular, the two generalisation procedures have the common structural feature that the estimated value at a given target site can be represented as a linear combination of the parameter values obtained for the calibration catchments:

$$\hat{\mu}_* = \sum_{j=1}^{n} w_{*j} Y_j. \tag{10}$$

For site-similarity, the weights $w_{*j}$ are determined by the rule for defining neighbours to the target site and by the weighting within these neighbours: many of these weights will be zero. For regression, the weights are derived from the estimates of the regression coefficients (which are themselves linear combinations of the calibrated values $Y_j$) and from the catchment descriptors for the target catchment. Some simple theory based on the uncertainty model leads to the conclusion that the "best" weights depend upon variance components $\sigma^2_\eta$ and $\{\sigma^2_{i,\varepsilon}\}$, varying as the inverse of the uncertainty-weighting terms, $\Lambda_i$, where

$$\Lambda_i = 1 + \frac{\sigma^2_{i,\varepsilon}}{\sigma^2_\eta}. \tag{11}$$

The terms given in (11) are directly appropriate for use in weighted least squares regression. However, weighting schemes for the site-similarity procedure would either not use such uncertainty-weighting terms or be based both on these and on the distance of contributing catchments to the target catchment, in which case the formal justification for employing the uncertainty-weighting terms within the scheme no longer applies. However the effect of these weighting terms is to give less weight to those catchments with higher calibration variance $\sigma^2_{i,\varepsilon}$, which seems reasonable.

Section 4.2 outlined a method for estimating $\{\sigma^2_{i,\varepsilon}\}$, and these estimates are now treated as fixed, leaving the problem of estimating $\sigma^2_\eta$. This problem is solved by the approach of using iteratively re-weighted estimates. In the approach an initial guess, $\lambda^{(0)}$, for the value of

$$\lambda = \frac{1}{\sigma^2_\eta}, \tag{12}$$

is supplied: the value $\lambda^{(0)} = 0$ is commonly taken for convenience. This allows an initial set of values of $\{\Lambda_i\}$ to be constructed, and hence corresponding values for $\{w_{ij}\}$ can be calculated. Here $w_{ij}$ is the weight used on the value for catchment $j$ when constructing the generalised value for a catchment having the same catchment properties as catchment $i$. To align with the usual theory adopted in the case of regression, the approach used here allows the calibrated value for catchment $i$ to be used to construct the generalised value for a catchment having the same properties as catchment $i$. The set of generalised values, $\hat{Y}_i$, is then used to construct an estimate of $\sigma^2_\eta$ on the following indirect basis.

The approach here is based on calculating the weighted sum of squares of residuals

$$S^2 = \sum_{i=1}^{n} \Lambda_i^{-1} \{ Y_i - \hat{Y}_i \}^2. \tag{13}$$

The expected value of $S^2$ is related to $\sigma^2_\eta$ in a slightly complicated way. First, note that

$$Y_i = \mu_i + \eta_i + \varepsilon_i = \mu_i + \omega_i, \tag{14}$$

where $\omega_i$ is a random variable with variance

$$\sigma^2_\eta + \sigma^2_{i,\varepsilon} = \Lambda_i \sigma^2_\eta. \tag{15}$$

Then

$$Y_i - \hat{Y}_i = \mu_i + \omega_i - \sum_{j=1}^{n} w_{ij} (\mu_j + \omega_j) = \omega_i - \sum_{j=1}^{n} w_{ij} \omega_j. \tag{16}$$

The last step holds exactly in the case of regression-based generalisation procedures, according to the theory for that case, and it follows in other cases in an approximate sense from the assumption that the generalisation procedure is sufficiently flexible to accommodate all the smooth variation of the "true" model parameters in relation to the catchment properties: specifically the condition

$$\mu_i - \sum_{j=1}^{n} w_{ij} \mu_j = 0$$

is required. Hence

$$Y_i - \hat{Y}_i = \sum_{j=1}^{n} u_{ij} \omega_j, \tag{17}$$

where $\{u_{ij}\}$ is another set of weights (from (16), $u_{ij} = \delta_{ij} - w_{ij}$, with $\delta_{ij} = 1$ if $i = j$ and 0 otherwise). Then

$$E(S^2) = E\left( \sum_{i=1}^{n} \Lambda_i^{-1} \left\{ \sum_{j=1}^{n} u_{ij} \omega_j \right\}^2 \right) = \sum_{i=1}^{n} \Lambda_i^{-1} \left\{ \sum_{j=1}^{n} u_{ij}^2 E(\omega_j^2) \right\} = \sum_{i=1}^{n} \Lambda_i^{-1} \left\{ \sum_{j=1}^{n} u_{ij}^2 \Lambda_j \right\} \sigma^2_\eta.$$

It then follows that the quantity $s^2$, defined by

$$s^2 = \frac{S^2}{\sum_{i=1}^{n} \Lambda_i^{-1} \left\{ \sum_{j=1}^{n} u_{ij}^2 \Lambda_j \right\}}, \tag{18}$$

provides an unbiased estimate for $\sigma^2_\eta$, at least according to the assumption that the values $\{\Lambda_i\}$ being used are correct. The value for $s^2$ is accepted as an improved estimate of $\sigma^2_\eta$, and hence a new value for $\lambda$ is calculated as

$$\lambda^{(1)} = \frac{1}{s^2}. \tag{19}$$

This allows new sets of values of $\{\Lambda_i\}$ and $\{w_{ij}\}$ to be found and then new "re-weighted" generalisation estimates $\hat{Y}_i$ are calculated. Eventually a new estimate $s^2$ for $\sigma^2_\eta$ is obtained. This procedure is repeated iteratively until convergence, then the final value for $s^2$ is used to provide the estimate for $\sigma^2_\eta$ that is carried forward into other calculations.

It may be noted that the expression

$$n^{*} = \sum_{i=1}^{n} \Lambda_i^{-1} \left\{ \sum_{j=1}^{n} u_{ij}^2 \Lambda_j \right\} \tag{20}$$

can be simplified in some cases. For example, if the generalisation procedure is based on weighted least-squares regression, this quantity can be shown to be equal to $(n - p - 1)$, where $p$ is the number of catchment properties used in the regression relation (not counting the constant term in this set).
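The iterative re-weighting of (11) to (20) can be sketched for the simplest possible generalisation rule. As an assumption made purely for illustration, the rule here is a weighted mean with weights proportional to the inverse uncertainty-weighting terms (a weighted least-squares regression with a constant term only, so that $p = 0$); the function names and the convergence tolerance are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def weighted_mean_weights(lam_terms: Sequence[float]) -> List[float]:
    """Weights proportional to the inverse of the terms in Eq. (11)."""
    inverse = [1.0 / term for term in lam_terms]
    total = sum(inverse)
    return [value / total for value in inverse]


def unbiasing_factor(lam_terms: Sequence[float], weights: Sequence[float]) -> float:
    """n* of Eq. (20), with u_ij = delta_ij - w_j for the weighted mean."""
    n = len(lam_terms)
    n_star = 0.0
    for i in range(n):
        inner = 0.0
        for j in range(n):
            u_ij = (1.0 if i == j else 0.0) - weights[j]
            inner += u_ij * u_ij * lam_terms[j]
        n_star += inner / lam_terms[i]
    return n_star


def estimate_generalisation_variance(
    y: Sequence[float],
    cal_var: Sequence[float],
    tol: float = 1e-12,
    max_iter: int = 500,
) -> float:
    """Iteratively re-weighted estimate of the generalisation variance."""
    lam = 0.0  # initial guess lambda^(0) = 0, so every weighting term is 1
    previous = None
    for _ in range(max_iter):
        lam_terms = [1.0 + lam * v for v in cal_var]              # Eq. (11)
        weights = weighted_mean_weights(lam_terms)
        y_hat = sum(w * value for w, value in zip(weights, y))
        s_squared = sum(
            (value - y_hat) ** 2 / term for value, term in zip(y, lam_terms)
        )                                                          # Eq. (13)
        estimate = s_squared / unbiasing_factor(lam_terms, weights)  # Eq. (18)
        if estimate <= 0.0:
            return 0.0  # all residuals are zero: the estimate collapses to zero
        if previous is not None and abs(estimate - previous) <= tol * previous:
            return estimate
        previous = estimate
        lam = 1.0 / estimate                                       # Eq. (19)
    return previous


if __name__ == "__main__":
    print(estimate_generalisation_variance([1.0, 2.0, 3.0, 4.0, 5.0], [0.5] * 5))
```

For this weighted-mean rule the unbiasing factor reduces to $n - 1$, in line with the $(n - p - 1)$ result quoted above with $p = 0$. With zero calibration variances the procedure returns the ordinary sample variance after one re-weighting.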

The iterative approach to estimating $\sigma^2_\eta$ that is used here is similar in principle to the methods suggested for other applications [13,21,23], but there can be large differences in the complexity of the underlying statistical model. In fact, the model here is relatively simple, with errors at different catchments treated as being uncorrelated. The simple estimation step, given by (18), means that estimates of $\sigma^2_\eta$ cannot become negative. Other implementations can sometimes produce negative estimates in cases where the sizes of the residuals are smaller than would be expected compared with the estimated standard deviations of the calibration errors (sampling errors) alone. Thus, special consideration needs to be given to this possibility. In this application the problem has not arisen but, if it were to do so, the estimates of $\sigma^2_\eta$ used here would converge to zero. Another difference between the formulation used here and in other works [13,21,23] is that these may sometimes assume that the underlying generalisation procedure is a supposedly optimal one: for example that Generalised Least Squares estimation is being used where this is justified. The work here needed to consider cases where non-optimal estimates are used. In particular, when the site-similarity method is used, weights are often assigned on the basis of distances between catchments in a catchment-property space.

4.4. Estimation of the variance of errors in generalised parameter values

As outlined in Section 3.2, the relevant uncertainty associated with the generalised estimate of a parameter of the rainfall-runoff model is that which treats the quantity being estimated as the true parameter value for the target catchment. Specifically, this is defined to be the value that would be obtained by the chosen calibration procedure if there were an unlimited amount of data available for direct calibration of the rainfall-runoff model.

For a target catchment, denoted by the subscript $*$, the generalised estimate is constructed as

$$\hat{\mu}_* = \sum_{j=1}^{n} w_{*j} Y_j, \tag{21}$$

where the weights $\{w_{*j}\}$ are determined by the generalisation procedure being applied. Where these weights incorporate "uncertainty weighting", they are calculated using the final value of $\sigma^2_\eta$ from the iterative procedure. From this stage on, the weights are treated as fixed.

The quantity being estimated is $T_*$, where

$$T_* = \mu_* + \eta_*. \tag{22}$$

Thus the error in the estimate is

$$T_* - \hat{\mu}_* = T_* - \sum_{j=1}^{n} w_{*j} Y_j = (\mu_* + \eta_*) - \sum_{j=1}^{n} w_{*j} (\mu_j + \omega_j) = \eta_* - \sum_{j=1}^{n} w_{*j} \omega_j. \tag{23}$$

As in Section 4.3, this last step holds exactly in the case of regression-based generalisation procedures, according to the theory for that case, and it holds for site-similarity procedures if they are sufficiently flexible to accommodate all the smooth variation of the "true" model parameters in relation to the catchment properties.

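In the ungauged catchment case, the target catchment is not any of those used to provide information for the generalisation procedure. This means that $\eta_*$ is uncorrelated with all of the combined errors in the set $\{\omega_j\}$, which are themselves uncorrelated within the set. Therefore the variance of the estimation error is

$$\operatorname{var}(T_* - \hat{\mu}_*) = \operatorname{var}\left( \eta_* - \sum_{j=1}^{n} w_{*j} \omega_j \right) = \operatorname{var}(\eta_*) + \sum_{j=1}^{n} w_{*j}^2 \operatorname{var}(\omega_j). \tag{24}$$

Then, since $\operatorname{var}(\omega_j) = \sigma^2_\eta + \sigma^2_{j,\varepsilon} = \sigma^2_\eta \Lambda_j$,

$$\operatorname{var}(T_* - \hat{\mu}_*) = \sigma^2_\eta \left\{ 1 + \sum_{j=1}^{n} w_{*j}^2 \Lambda_j \right\}. \tag{25}$$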
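Expression (25) is simple to evaluate once the weights and variance components are fixed. The following is a minimal sketch; the function name `generalised_error_variance` and its argument names are hypothetical.

```python
from typing import Sequence


def generalised_error_variance(
    weights: Sequence[float], cal_var: Sequence[float], gen_var: float
) -> float:
    """Variance of the error in a generalised estimate, following Eq. (25).

    weights: weights applied to the calibrated values for the target catchment.
    cal_var: calibration variances of the contributing catchments.
    gen_var: generalisation variance (must be positive).
    """
    if gen_var <= 0.0:
        raise ValueError("the generalisation variance must be positive")
    weighted = sum(
        w * w * (1.0 + v / gen_var)  # squared weight times the term in Eq. (11)
        for w, v in zip(weights, cal_var)
    )
    return gen_var * (1.0 + weighted)


if __name__ == "__main__":
    # Two equally weighted donor catchments with different calibration variances.
    print(generalised_error_variance([0.5, 0.5], [1.0, 3.0], 2.0))
```

With equal weights $1/n$ and negligible calibration variances the result is the generalisation variance multiplied by $(1 + 1/n)$, so the contribution from using the "sample" generalisation rule shrinks as more catchments contribute.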


4.5. Estimation of the covariances of errors in generalised parameter values

Sections 4.3 and 4.4 outlined how the uncertainty of generalised parameter values has been treated, dealing with each parameter of the rainfall-runoff model individually. No equivalent for the iterative procedures used for estimating the generalisation uncertainty has been found that works on a fully multivariate basis across several model parameters simultaneously. This relates to the discussion in Section 3.5.

Instead, a pragmatic approach has been taken to estimating the covariance matrix of the overall generalisation errors for the rainfall-runoff model parameters. The outcome of this approach is such that variances estimated for individual model parameters are unchanged from those obtained by the methods outlined in Sections 4.3 and 4.4. The covariances are treated in a manner chosen primarily on the basis of guaranteeing that the estimates of the covariance matrices being generated are feasible covariance matrices, so that the step of generating randomised sets of model parameters can proceed.

The method for estimating the covariance matrix $\Sigma_\eta$ of the generalisation errors in the model parameters is as follows. First the iterative re-weighting procedure, described in Section 4.3, is implemented for each parameter separately, yielding a set of values $\{\sigma^2_\eta(p);\ p = 1, \ldots, P\}$ for the $P$ parameters. The set of weighting values for each parameter is then available, given by

$$\Lambda_i(p) = 1 + \frac{\sigma^2_{i,\varepsilon}(p)}{\sigma^2_\eta(p)}, \tag{26}$$

where $\sigma^2_{i,\varepsilon}(p)$ is the estimated calibration variance for catchment $i$ and parameter $p$. The weighted sum of squares of residuals that is used in the iterative procedure is replaced by a weighted sum of cross-products of residuals defined to be

$$S(p,q) = \sum_{i=1}^{n} \frac{(Y_i(p) - \hat{Y}_i(p))}{\sqrt{\Lambda_i(p)}} \, \frac{(Y_i(q) - \hat{Y}_i(q))}{\sqrt{\Lambda_i(q)}}, \tag{27}$$

where $Y_i(p)$, $\hat{Y}_i(p)$ are the calibrated and generalised values of the $p$th parameter. Note that, when $p = q$, this expression is identical to the weighted sum of squares of residuals in (13). Finally, the estimate for the covariance matrix $\Sigma_\eta$ is defined to have elements given by

$$\Sigma_\eta(p,q) = \frac{S(p,q)}{\sqrt{n^{*}(p)\, n^{*}(q)}}, \tag{28}$$

where $n^{*}(p)$ is the "unbiasing factor" for the $p$th parameter as defined in (20). Note that this pragmatic estimate of the covariance matrix of the generalisation errors does not make any use of estimates of the covariances of the calibration errors which are potentially available from the jackknife procedure (Section 4.2).
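The cross-product estimate in (27) and (28) can be sketched as below. The function and argument names are hypothetical; the residuals (calibrated minus generalised values), the weighting terms from (26) and the unbiasing factors from (20) are assumed to have been computed already for each parameter.

```python
import math
from typing import Sequence


def cross_product_sum(
    res_p: Sequence[float],
    res_q: Sequence[float],
    lam_p: Sequence[float],
    lam_q: Sequence[float],
) -> float:
    """Weighted sum of cross-products of residuals, S(p, q) in Eq. (27)."""
    return sum(
        (rp / math.sqrt(lp)) * (rq / math.sqrt(lq))
        for rp, rq, lp, lq in zip(res_p, res_q, lam_p, lam_q)
    )


def generalisation_covariance(
    res_p: Sequence[float],
    res_q: Sequence[float],
    lam_p: Sequence[float],
    lam_q: Sequence[float],
    n_star_p: float,
    n_star_q: float,
) -> float:
    """Element (p, q) of the generalisation covariance matrix, Eq. (28)."""
    s_pq = cross_product_sum(res_p, res_q, lam_p, lam_q)
    return s_pq / math.sqrt(n_star_p * n_star_q)


if __name__ == "__main__":
    res_p, lam_p = [1.0, -2.0, 1.0], [1.0, 2.0, 4.0]
    res_q, lam_q = [0.5, -0.5, 0.0], [1.0, 1.0, 2.0]
    print(generalisation_covariance(res_p, res_q, lam_p, lam_q, 2.0, 2.0))
```

Because each element is an inner product of two scaled residual vectors, rescaled symmetrically by the unbiasing factors, the resulting matrix is positive semi-definite by construction, which is the feasibility property sought in the text.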

The covariances of the overall generalisation error can then be computed as follows. From (23) the generalisation errors of the $p$th and $q$th parameters for the same catchment are

$$T_*(p) - \hat{\mu}_*(p) = \eta_*(p) - \sum_{j=1}^{n} w_{*j}(p)\, \omega_j(p), \tag{29}$$

and

$$T_*(q) - \hat{\mu}_*(q) = \eta_*(q) - \sum_{j=1}^{n} w_{*j}(q)\, \omega_j(q). \tag{30}$$

Here $\{w_{*j}(p)\}$ and $\{w_{*j}(q)\}$ are the different sets of weights applied to the calibrated values of the $p$th and $q$th parameter of the rainfall-runoff model. It then follows that

$$
\begin{aligned}
\operatorname{cov}\bigl(T_*(p) - \hat{\mu}_*(p),\ T_*(q) - \hat{\mu}_*(q)\bigr)
&= \operatorname{cov}\bigl(\eta_*(p), \eta_*(q)\bigr) + \sum_{j=1}^{n} w_{*j}(p)\, w_{*j}(q) \operatorname{cov}\bigl(\omega_j(p), \omega_j(q)\bigr) \\
&= \Sigma_\eta(p,q) + \sum_{j=1}^{n} w_{*j}(p)\, w_{*j}(q) \bigl\{ \Sigma_\eta(p,q) + \Sigma_{j,\varepsilon}(p,q) \bigr\},
\end{aligned}
\tag{31}
$$

where $\Sigma_{j,\varepsilon}(p,q)$ is the covariance of the calibration error for parameters $p$ and $q$ and catchment $j$: this is estimated using (9). The covariances between generalised parameters for different catchments can be found by the same type of approach.
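Expression (31) can be evaluated element by element, as in this minimal sketch. The function name `overall_error_covariance` and its argument names are hypothetical; the inputs are assumed to be the two sets of target-catchment weights, the (p, q) element of the generalisation covariance matrix, and the (p, q) calibration covariances of the contributing catchments.

```python
from typing import Sequence


def overall_error_covariance(
    w_p: Sequence[float],
    w_q: Sequence[float],
    gen_cov_pq: float,
    cal_cov_pq: Sequence[float],
) -> float:
    """Covariance of the overall generalisation errors for parameters p and q.

    Follows Eq. (31): the generalisation covariance for the target catchment
    plus weighted contributions from each contributing catchment j.
    """
    contribution = sum(
        wp * wq * (gen_cov_pq + cal)
        for wp, wq, cal in zip(w_p, w_q, cal_cov_pq)
    )
    return gen_cov_pq + contribution


if __name__ == "__main__":
    # With p = q this reduces to the variance in Eq. (25).
    print(overall_error_covariance([0.5, 0.5], [0.5, 0.5], 2.0, [1.0, 3.0]))
```

When the two parameters draw on disjoint sets of contributing catchments, every product of weights is zero and only the generalisation covariance for the target catchment remains.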

5. Uncertainty where generalised estimates are used

As discussed earlier, generalised estimates of model parameters would typically be used to supply information to a subsequent procedure which estimates the flood frequency curve for an ungauged catchment. Sophisticated analyses may involve the use of rainfall-runoff models for several catchments. The most straightforward way of assessing the uncertainty in the final results from such analyses is to complement the results obtained for the "best estimates" of the model parameters with equivalent results obtained for sets of model parameters close to the best estimates but within a range determined by the uncertainty associated with the generalised estimates. If randomised sets of model parameters are generated to have the covariance structure indicated by the analyses of uncertainty outlined in Section 4.5, then the corresponding sets of flood frequency curves will directly represent the uncertainty arising from generalisation of the model parameters.

The use of multivariate normal distributions to represent uncertainty is most convenient. A detailed analysis of distributions representing uncertainty should logically follow an intensive study of the calibration results for individual catchments, comparing these with the generalised values of model parameters and seeking explanations for any large discrepancies, with the possibility of recalibration using amended procedures. Unfortunately, there have not been resources for this within the recent research, although extensive sets of scatter plots have been assessed which do not contradict the assumption of multivariate normality (after selected transformations of the model parameters). The use of multivariate normal distributions should be adequate because the main requirement is for a simple indication of the extent of uncertainty and no great reliance is placed on the shape of the corresponding distributions. However, several of the parameters of the rainfall-runoff models being used are subject to natural or imposed constraints: for example, certain model parameters cannot be negative. These cases have been treated by truncating any parameter values that are generated from the multivariate normal distribution so as to lie within the required range.

The generation of multivariate normal random variates is a well-understood topic, so that no details of this are given here. However, two candidate approaches for actual implementation arise. In the first of these, the set of parameters required across a number of catchments is decided, then the covariance matrix of the overall generalisation errors is determined by expressions such as those in Section 4.5 and random values are generated corresponding to this covariance matrix. This approach seems best suited to cases where only a few catchments are involved as target catchments. An alternative approach may be more suited to cases where many target catchments have to be treated simultaneously, since it avoids dealing with a covariance matrix with large dimensions. This alternative involves the representation of the overall generalisation errors in terms of their basic components. For example, it is possible to obtain the required randomised versions of overall generalisation error, simultaneously for all parameters and catchments, by generating random versions for the component errors $\{\eta_*(p)\}$ and $\{\omega_j(p)\}$ and then combining them using (29).
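The first approach can be sketched with a standard Cholesky factorisation. This is one common way of generating multivariate normal variates and is not necessarily the implementation used in the study. The function names, the fixed seed and the interpretation of "truncating" as clipping each generated value to its permitted range are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple


def cholesky(cov: Sequence[Sequence[float]]) -> List[List[float]]:
    """Lower-triangular factor L with L L^T = cov (cov must be positive definite)."""
    n = len(cov)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                diagonal = cov[i][i] - partial
                if diagonal <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                lower[i][j] = math.sqrt(diagonal)
            else:
                lower[i][j] = (cov[i][j] - partial) / lower[j][j]
    return lower


def sample_parameter_sets(
    best: Sequence[float],
    cov: Sequence[Sequence[float]],
    bounds: Sequence[Tuple[float, float]],
    n_sets: int,
    seed: int = 0,
) -> List[List[float]]:
    """Randomised parameter sets around the best estimates, clipped to bounds."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    dim = len(best)
    sets = []
    for _ in range(n_sets):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        draw = []
        for i in range(dim):
            value = best[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
            low, high = bounds[i]
            draw.append(min(max(value, low), high))  # clip to the allowed range
        sets.append(draw)
    return sets


if __name__ == "__main__":
    cov = [[4.0, 2.0], [2.0, 3.0]]
    bounds = [(0.0, float("inf")), (float("-inf"), float("inf"))]
    print(sample_parameter_sets([1.0, 5.0], cov, bounds, 3))
```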

6. Uncertainty of estimated flood frequency curves

6.1. Derivation of uncertainty bounds

Uncertainty bounds were produced around each of the (univariate regression and site-similarity) generalised flood frequency curves for each catchment, by using the methods described above to generate a large number of 'generalised parameter sets with uncertainty' (where the generation of each set allows for the cross-parameter dependence in the uncertainty of the generalised estimate). The rainfall-runoff model is then run with each of these parameter sets, to produce time-series of flows from which flood frequency curves are derived, and the set of flood frequency curves produced is used to estimate bounds. For the analysis here, the flood frequency curves are calculated using the observed record of catchment rainfall for each catchment.

A collection of 1000 parameter sets was used. The uncertainty bounds were estimated at each plotted return period by ranking the 1000 estimated peak flows at the given return period (lowest to highest) and selecting ranked points appropriate to the bounds required. For illustrative purposes, the 90, 95 and 99% bounds were calculated and the corresponding points at each return period were joined up, to produce continuous bounds on the flood frequency curve. Fig. 2 shows examples of uncertainty bounds on the generalised flood frequency curves, for two catchments.
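The ranking step can be sketched as follows for a single return period. The exact rank rule is not stated above, so the one used here is an assumption: the same number of values is discarded from each tail and the extreme remaining values are taken as the bounds. The function name `ranked_bounds` is hypothetical.

```python
from typing import Sequence, Tuple


def ranked_bounds(peaks: Sequence[float], level: float) -> Tuple[float, float]:
    """Lower and upper bounds at one return period from ranked peak flows.

    level is the nominal coverage, for example 0.90 for the 90% bounds.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    ordered = sorted(peaks)
    n = len(ordered)
    # Number of values discarded from each tail (assumed symmetric rule).
    k = int(round(n * (1.0 - level) / 2.0))
    if 2 * k >= n:
        raise ValueError("too few values for the requested level")
    return ordered[k], ordered[n - 1 - k]


if __name__ == "__main__":
    flows = list(range(1, 1001))  # stand-in for 1000 simulated peak flows
    for level in (0.90, 0.95, 0.99):
        print(level, ranked_bounds(flows, level))
```

Applying the function at each plotted return period and joining the resulting points gives continuous bounds of the kind described above.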

It is clear that the simulations outlined above consider only part of the uncertainty in the estimated flood frequency curves. The simulations reported here treat the estimates of the covariance matrices of the generalisation errors and of the calibration errors as if they provided exact values of these quantities: a more complete assessment of uncertainty should include the effect of this additional uncertainty. However the results do give an assessment of the uncertainty arising from using generalised values for the parameters of a rainfall-runoff model for an ungauged catchment. A more complete assessment might be envisaged, potentially based on a Bayesian formulation of uncertainties. However, an overall procedure for estimating flood frequencies using continuous simulation would contain a number of components and there would typically be uncertainties associated with each of these. For example, if a stochastic rainfall model were used, there would be uncertainties arising from estimating the parameters of that model. The use of a limited-length observed rainfall record, as here, means that a complete assessment of uncertainty should include some contribution related to this limited representation of rainfall. Note also that the simulations here do not include any potential effect from the "modelling error" which might be included to represent the fact that rainfall-runoff models are unable to exactly reproduce observed river flows. Given all of the above, the uncertainty bands reported should not be interpreted as true confidence intervals. They do indicate the large amount of uncertainty resulting from not being able to use rainfall-runoff model parameters calibrated for the particular target catchment, and they can be used to compare the uncertainties associated with different generalisation procedures and with different rainfall-runoff models.

6.2. Analysis of uncertainty bounds

As for the generalised flood frequency curves themselves, the uncertainty bounds vary by catchment, generalisation method and rainfall-runoff model. Table 1 summarises the performance of the bounds according to whether the flood frequency curve from observed flows lies completely within each of the bounds (‘within’), lies outside an outer (99%) bound somewhere (‘outside’), or otherwise (that is, lies outside a 90% or 95% bound somewhere, but not outside either 99% bound). As found in the comparison in terms of generalisation performance (Section 2 and [10]), this again suggests that PDM site-similarity performs best, with the highest number of catchments classified as ‘within’ and the lowest number classified as ‘outside’. However, a closer look at the bounds suggests that those from PDM site-similarity are also the widest for a large number of catchments (about 60%), so it is not surprising that the


[Figure 2: eight panels for catchments 43005 and 90003, with columns for the PDM and TATE models and rows for univariate regression and site-similarity generalisation. Horizontal axis: return period [years], 0.1 to 100 on a logarithmic scale. Vertical axis: peak flow [m3 s-1].]

Fig. 2. Examples of uncertainty bounds (90% – long-dashed, 95% – short-dashed, 99% – dotted) and the median curve (dot-dashed) derived from uncertainty in the generalised parameters. Also shown are the peaks-over-threshold points and flood frequency curves (squares and solid line) using the generalised parameters.

1202 D.A. Jones, A.L. Kay / Advances in Water Resources 30 (2007) 1190–1204


Table 1
Summary of performance of uncertainty bounds, giving the number of catchments for which the observed flood frequency lies: (a) totally within the uncertainty bounds, (b) outside an outer bound somewhere, or (c) otherwise

| Classification | PDM, univariate regression | PDM, site-similarity | TATE, univariate regression | TATE, site-similarity |
|---|---|---|---|---|
| Within all bounds (90%, 95% and 99%) | 86 | 103 | 77 | 80 |
| Outside an outer (99%) bound somewhere | 9 | 3 | 17 | 17 |
| Otherwise | 24 | 13 | 25 | 22 |

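Table 2
Mean relative widths of uncertainty bounds

| Return period | Bounds | PDM, univariate regression | PDM, site-similarity | TATE, univariate regression | TATE, site-similarity |
|---|---|---|---|---|---|
| 10 years | 99% upper | 2.05 | 2.12 | 1.80 | 1.77 |
| 10 years | 99% lower | 2.22 | 2.24 | 3.12 | 2.41 |
| 10 years | 95% upper | 1.71 | 1.75 | 1.58 | 1.57 |
| 10 years | 95% lower | 1.74 | 1.80 | 1.95 | 1.85 |
| 10 years | 90% upper | 1.56 | 1.60 | 1.49 | 1.47 |
| 10 years | 90% lower | 1.58 | 1.62 | 1.68 | 1.64 |
| 50 years | 99% upper | 2.17 | 2.23 | 1.81 | 1.75 |
| 50 years | 99% lower | 2.26 | 2.28 | 3.27 | 2.52 |
| 50 years | 95% upper | 1.79 | 1.83 | 1.61 | 1.55 |
| 50 years | 95% lower | 1.78 | 1.83 | 2.01 | 1.93 |
| 50 years | 90% upper | 1.62 | 1.66 | 1.50 | 1.46 |
| 50 years | 90% lower | 1.61 | 1.65 | 1.73 | 1.69 |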


observed flood frequency curve is more likely to lie completely within the bounds. These impressions are studied in more detail in the analysis that follows.
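The three-way classification used in Table 1 can be sketched as follows. This is an illustrative reading of the rule rather than the authors' implementation: it assumes the observed curve and every bound are evaluated at the same return periods, and the function name `classify` and the layout of `bounds` are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Bounds = Dict[float, Tuple[Sequence[float], Sequence[float]]]


def classify(observed: Sequence[float], bounds: Bounds) -> str:
    """Classify an observed flood frequency curve against 90%, 95% and 99% bounds.

    bounds maps a level (0.90, 0.95, 0.99) to a (lower, upper) pair of curves,
    each sampled at the same return periods as `observed` (an assumption).
    """

    def inside(level: float) -> bool:
        lower, upper = bounds[level]
        return all(lo <= q <= up for q, lo, up in zip(observed, lower, upper))

    if not inside(0.99):
        # Outside an outer (99%) bound at one or more return periods.
        return "outside"
    if inside(0.90) and inside(0.95):
        # Completely within every set of bounds.
        return "within"
    # Inside the 99% bounds but outside a 90% or 95% bound somewhere.
    return "otherwise"
```

Counting the labels over all catchments, separately for each model and generalisation method, would give a table with the layout of Table 1.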

Table 2 shows some results based on the widths of the uncertainty bounds, and these can also be used to summarise the more general properties of the statistical distributions reflecting the uncertainty of the generalised estimates of the flood frequency curves. For a given percentage-level, say 95%, and a given return period, say 10 years, a relative width for the upper bound is defined as the ratio of the flood-value at the upper 95% bound to the flood-value at the 50% point for the same return period. This value is calculated for each catchment and averaged across catchments. Similarly, a relative width for the lower bound is defined as the ratio of the flood-value at the 50% point to the flood-value at the lower 95% bound for the same return period. Again these values are calculated for each catchment, and averaged. When defined in this way the relative widths have values greater than one and can be interpreted as factors to multiply and divide by to create upper and lower bounds about a central estimate. These are summary measures only and, for actual applications, it will be better to use the catchment-specific bounds derived separately for each target catchment.
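The definitions above translate directly into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the input values in the usage note are made up, and the averaging is taken to be a simple arithmetic mean across catchments.

```python
from typing import Iterable, Tuple


def relative_widths(lower: float, median: float, upper: float) -> Tuple[float, float]:
    """Return (upper relative width, lower relative width) for one catchment.

    upper width = flood value at upper bound / flood value at 50% point
    lower width = flood value at 50% point / flood value at lower bound
    """
    return upper / median, median / lower


def mean_relative_widths(
    catchments: Iterable[Tuple[float, float, float]]
) -> Tuple[float, float]:
    """Average the relative widths over (lower, median, upper) triples."""
    widths = [relative_widths(lo, med, up) for lo, med, up in catchments]
    n = len(widths)
    return sum(w[0] for w in widths) / n, sum(w[1] for w in widths) / n


def bounds_from_widths(
    central: float, upper_width: float, lower_width: float
) -> Tuple[float, float]:
    """Multiply and divide a central estimate to get (lower, upper) bounds."""
    return central / lower_width, central * upper_width
```

For example, `bounds_from_widths(100.0, 2.0, 4.0)` returns `(25.0, 200.0)`: the central estimate is divided by the lower relative width and multiplied by the upper one.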

The results in Table 2 show that the upper uncertainty bounds derived for the PDM model are generally wider, in a relative sense, than those for the TATE, while the opposite is true for the lower uncertainty bounds. For the PDM, the uncertainty bounds associated with site-similarity generalisation are slightly wider than those for generalisation by regression, while the opposite is true for the TATE.

A comparison between the relative widths of the upper and lower uncertainty bounds indicates rather different behaviour for the distributions describing the uncertainty for the PDM and TATE models. The plots in Fig. 2, and those for all other catchments considered, show that these distributions are skewed towards the right when considered straightforwardly in terms of flow. Table 2 shows that, for the PDM model, the distributions are close to symmetric when judged in a logarithmic sense (the relative widths for the upper and lower bounds are reasonably similar), while those for the TATE model are skewed to the left. The difference between the models here may well be related to the different role played by the parameters for the two models: in particular the PDM model has the parameter fc which acts multiplicatively on the rainfall before other modelling-elements apply, whereas the closest corresponding parameter within TATE acts once rainfall has been divided into different types of runoff. A general summary of the uncertainty bounds for the two models is that the upper uncertainty bounds are wider for the PDM than for the TATE, but the lower uncertainty bounds are wider for the TATE than the PDM.
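The logarithmic comparison can be made concrete with the 99% values for the 10-year return period in Table 2. The sketch below uses a simple asymmetry measure, the log of the upper relative width minus the log of the lower relative width; this measure is an illustrative choice, not one defined in the paper. Because the tabulated widths are means of ratios across catchments, the result is only an approximate indication of the skewness for any individual catchment.

```python
import math

# Mean relative widths from Table 2 (10-year return period, 99% bounds),
# stored as (upper, lower) for each model and generalisation method.
TABLE2_10YR_99 = {
    ("PDM", "univariate regression"): (2.05, 2.22),
    ("PDM", "site-similarity"): (2.12, 2.24),
    ("TATE", "univariate regression"): (1.80, 3.12),
    ("TATE", "site-similarity"): (1.77, 2.41),
}


def log_asymmetry(upper_width: float, lower_width: float) -> float:
    """Near zero: roughly symmetric in log flow; negative: longer lower tail."""
    return math.log(upper_width) - math.log(lower_width)


asymmetry = {
    key: log_asymmetry(upper, lower)
    for key, (upper, lower) in TABLE2_10YR_99.items()
}

for (model, method), value in asymmetry.items():
    print(f"{model:4s} {method:22s} {value:+.2f}")
```

Under this measure the PDM values lie close to zero, while the TATE values are clearly negative, which matches the description of the TATE distributions as skewed to the left in a logarithmic sense.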

7. Conclusions

This paper has illustrated that it is possible, with the datasets available in the UK, to estimate the parameters of simplified rainfall-runoff models for ungauged catchments and, not only to provide an uncertainty analysis for these estimates, but to carry this through to provide uncertainty bounds for flood frequency curves derived via continuous simulation of river-flows. The use of a statistical analysis which explicitly takes account of calibration uncertainty has been found to be useful, since this allows the uncertainty bounds associated with estimates to be reduced in width in a well-justified way. It also allowed [10,11] the formulation of improved generalisation procedures which vary the weight given to calibrated parameter values according to how well those values are determined by the records used for calibration. The results given briefly at the end of Section 3.1 relate to these two points.

Section 6 discussed the uncertainty in estimates of flood frequency curves, resulting from use of generalised parameters. Some differences were noted in the characteristics of the uncertainty bands derived for the two rainfall-runoff models and there were also differences in which of the two generalisation methods appeared to be best for each of the two models. No firm reasons are known for these differences. Further experience with the approach may provide elucidation.

A description has been given of a complete approach to dealing with the uncertainty associated with using generalised estimates of the parameters of rainfall-runoff models for ungauged catchments. However, there are many points still open to question. Among these is the use of the jackknife procedure for estimating the calibration uncertainty. While alternative procedures are available (for example, [7,9]), not all of these would be suitable for the type of calibration procedure that has been adopted, which calibrates individual model parameters in sequence using different selections for the objective function. Further studies of the rainfall-runoff models used here, but allowing more parameters to enter the calibration, are required to assess the way in which the extra freedom then available to achieve improved calibrations counterbalances with the difficulty of finding reasonable generalisation rules for more parameters and with the possibly increased uncertainty of the flood frequency curves derived from these.

Acknowledgements

This study was supported by the UK Department for Environment, Food and Rural Affairs (Defra), project FD2106, and the Scottish Executive.

References

[1] Blazkova S, Beven K. Flood frequency estimation by continuous simulation for a catchment treated as ungauged (with uncertainty). Water Resour Res 2002;38(8):14. doi:10.1029/2001WR000500. 1-14.

[2] Calver A. The time–area formulation revisited. P I Civil Eng-Water 1993;101:31–6.

[3] Calver A. Development and experience of the ‘TATE’ rainfall-runoff model. P I Civil Eng-Water 1996;118:168–76.

[4] Davison AC, Hinkley DV. Bootstrap methods and their application. Cambridge: Cambridge University Press; 1997.

[5] Efron B, Tibshirani RJ. An introduction to the Bootstrap. New York: Chapman and Hall; 1993.

[6] Engeland K, Gottschalk L. Bayesian estimation of parameters in a regional hydrological model. Hydrol Earth Syst Sc 2002;6(5):883–98.

[7] Gupta HV, Sorooshian S, Yapo PO. Toward improved calibration of hydrologic models: multiple and noncommensurable measures of information. Water Resour Res 1998;34(4):751–63.

[8] Institute of Hydrology. Flood estimation handbook (5 volumes). Wallingford: Institute of Hydrology; 1999.

[9] Jones DA. Statistical analysis of empirical models fitted by optimisation. Biometrika 1983;70:67–88.

[10] Kay AL, Jones DA, Crooks SM, Calver A, Reynard NS. A comparison of three approaches to spatial generalisation of rainfall-runoff models. Hydrol Process 2006;20(18):3953–73.

[11] Kay AL, Jones DA, Crooks SM, Kjeldsen TR, Fung CF. An investigation of site-similarity approaches to generalisation of a rainfall-runoff model. Hydrol Earth Syst Sc 2007; 11, in press.

[12] Lamb R, Kay AL. Confidence intervals for a spatially generalized, continuous simulation flood frequency model for Great Britain. Water Resour Res 2004;40:W07501. doi:10.1029/2003WR002428.

[13] Madsen H, Rosbjerg D. Generalized least squares and empirical Bayes estimation in regional partial duration series index-flood modeling. Water Resour Res 1997;33(4):777–81.

[14] McIntyre N, Lee H, Wheater H, Young A, Wagener T. Ensemble predictions of runoff in ungauged catchments. Water Resour Res 2005;41:W12434. doi:10.1029/2005WR004289.

[15] Moore RJ. The probability-distributed principle and runoff production at point and basin scales. Hydrolog Sci J 1985;30(2):273–97.

[16] Moore RJ. Real-time flood forecasting systems: perspectives and prospects. In: Casale R, Margottini C, editors. Floods and landslides: integrated risk assessment. Berlin: Springer; 1999. p. 147–89.

[17] Moran PAP. The estimation of standard errors on Monte Carlo simulation experiments. Biometrika 1975;62:1–4.

[18] Pollock DSG. The algebra of econometrics. Chichester: Wiley; 1979.

[19] Quenouille MH. Problems in plane sampling. Ann Math Stat 1949;20:355–75.

[20] Quenouille MH. Notes on bias estimation. Biometrika 1956;43:353–60.

[21] Reis DS, Stedinger JR, Martins ES. Bayesian generalized least squares regression with application to log Pearson type 3 regional skew estimation. Water Resour Res 2005;41:W10419. doi:10.1029/2004WR003445.

[22] Shao J, Tu D. The Jackknife and Bootstrap. New York: Springer; 1995.

[23] Tasker GD, Stedinger JR. An operational GLS model for hydrologic regression. J Hydrol 1989;111(1-4):361–75.

[24] Tukey JW. Bias and confidence in not-quite large samples (abstract). Ann Math Stat 1958;29:614.

[25] Tung YK, Yeh KC, Yang JC. Regionalization of unit hydrograph parameters: 1. Comparison of regression analysis techniques. Stoch Hydrol Hydraul 1997;11:145–71.

[26] Wagener T, Wheater HS. Parameter estimation and regionalization for continuous rainfall-runoff models including uncertainty. J Hydrol 2006;320:132–54.

[27] Yeh KC, Yang JC, Tung YK. Regionalization of unit hydrograph parameters: 2. Uncertainty analysis. Stoch Hydrol Hydraul 1997;11:173–92.

[28] Zellner A. An introduction to Bayesian inference in econometrics. New York: Wiley; 1971.