Phase-space analysis of daily streamflow: characterization and prediction

Q. Liu (a), S. Islam (a,*), I. Rodriguez-Iturbe (b) & Y. Le (a)

(a) The Cincinnati Earth System Science Program, Department of Civil and Environmental Engineering, University of Cincinnati, PO Box 210071, Cincinnati, OH 45221-0071, USA

(b) Department of Civil Engineering, Texas A&M University, College Station TX 77843, USA

(Received 7 May 1996; revised 17 January 1997; accepted 3 February 1997)

This paper describes a methodology, based on dynamical systems theory, to model and predict streamflow at the daily scale. The model is constructed by developing a multidimensional phase-space map from observed streamflow signals. Predictions are made by examining trajectories on the reconstructed phase space. Prediction accuracy is used as a diagnostic tool to characterize the nature, which ranges from low-order deterministic to stochastic, of streamflow signals. To demonstrate the utility of this diagnostic tool, the proposed method is first applied to a time series with known characteristics. The paper shows that the proposed phase-space model can be used to make a tentative distinction between a noisy signal and a deterministic chaotic signal.

The proposed phase-space model is then applied to daily streamflow records for 28 selected stations from the Continental United States covering basin areas between 31 and 35 079 km². Based on the analyses of these 28 streamflow time series and 13 artificially generated signals with known characteristics, no direct relationship between the nature of underlying streamflow characteristics and basin area has been found. In addition, there does not appear to be any physical threshold (in terms of basin area, average flow rate and yield) that controls the change in streamflow dynamics at the daily scale. These results suggest that the daily streamflow signals span a wide dynamical range between deterministic chaos and periodic signal contaminated with additive noise. © 1998 Elsevier Science Limited. All rights reserved

1 INTRODUCTION

The classical approaches for analyzing hydrologic signals (e.g. streamflow, rainfall), whether they are produced by a deterministic or stochastic process, are based on: (i) exploring the observable to detect patterns; (ii) constructing an explanatory model from first principles; and (iii) measuring data to initialize, calibrate and validate the model. In characterizing streamflow, one could argue that basic equations (e.g. for rainfall-runoff transformations, overland flow, hydraulic routing) are well known and can be derived for idealized conditions. However, these idealized conditions are far from being physically realistic, especially from the viewpoint of space–time heterogeneity. Even if we accept that model equations can be formulated for idealized conditions, correct specification of initial and boundary conditions would require measurements of state variables in a four-dimensional volume. However, measurements are usually taken only at discrete locations and times. The inherent spatial and temporal variability in streamflow makes the basic equations only an approximation whose value in operational hydrology is conditional on appropriate calibration through numerous tuning parameters.

An alternative approach is to construct a streamflow model directly from the available data. A key assumption behind constructing such a model is that even if the exact mathematical description of a dynamical system is not known, the state space can be reconstructed from a single variable time series [1]. The state space is defined as the multidimensional space whose axes consist of variables of a dynamical system. For example, for a three-variable model, the state space will be three dimensional and each of the three axes will be represented by a model variable. When the state space is reconstructed from a time series, rather than with actual model variables, it is customary to call this state space a phase space. We will use a time-delay embedding (defined in Section 2) to reconstruct the phase space from the observed streamflow signals. In principle, the phase space contains the knowledge about the internal dynamics of the system and thus can be used as a predictive tool. The basic idea here is that since the embedding map preserves the underlying dynamic structure, the future can be predicted from the behavior of the past. As shown by Takens [2], the phase space retains essential properties of the original state space including the dimensionality of the underlying system. Now, if one can reconstruct the deterministic rules underlying the data in a phase space, then one can attempt to predict the future states from the history of the data embedded in the phase space. Several recent studies have successfully used phase-space-based models for chaotic signal characterization [3–5], prediction [6], noise reduction [7] and lake level prediction [8]. In addition, as we will see, a phase-space-based model can also be used to make short-term predictions and provide a tentative distinction between low-dimensional determinism and noise [9,10].

Advances in Water Resources 21 (1998) 463–475. © 1998 Elsevier Science Limited. All rights reserved. Printed in Great Britain. 0309-1708/98/$19.00 + 0.00. PII: S0309-1708(97)00013-4

*Corresponding author.

Hydrologists have long maintained that large basins are smoother in their streamflow response behavior than small basins. This assumption has not been really substantiated from a quantitative, data-based point of view, although arguments based on the smoothing effect resulting from larger storage property constitute a reasonable basis for its acceptance. Such a smoothing effect of large basins is frequently translated into the assertion that because of their inherent larger degree of linearity, their response (e.g. runoff) is easier (compared to the smaller basins) to predict. It is commonly argued that as the time and spatial averaging increase, the rainfall–streamflow relationships may become more linear and hence the streamflow becomes more predictable. However, even if the above is true, it is not clear how much the predictability of streamflow will increase in terms of accuracy and prediction lead time. Recent studies have shown possible presence of chaos in streamflow [11,12]. If the underlying streamflow signal is chaotic, it is quite possible that its inherent predictability will be quite limited irrespective of the basin area.

In this study, we will describe an alternative model for streamflow prediction. This model will be used to investigate the characteristic signatures of streamflow signals (e.g. low-order determinism vs stochastic noise) at the daily scale. For example, does streamflow change dynamics (nonlinear to linear) with increasing basin area? What is the implication of the nature of streamflow characteristics on its predictability? We will use recent developments in nonlinear modeling, phase-space reconstruction from a time series and related diagnostic tools to address the above issues.

2 STREAMFLOW MODELING: A DYNAMICAL SYSTEM PERSPECTIVE

Due to the dramatic expansion of digital data acquisition and processing, it is now possible to develop predictive models for streamflow dynamics from a ‘theory-poor’ and ‘data-rich’ perspective. By theory-poor we mean that our approach does not require explicit formulation of governing partial differential equations. The idea of data intensive modeling is by no means new: an autoregressive model [13,14] is a good example. What is new is the emergence of a set of concepts and tools (such as phase-space reconstruction, neural network, etc.) that combine broad approximation abilities and few specific assumptions [15]. We will take this data-rich and theory-poor perspective to construct a predictive model directly from streamflow time series. Building this type of dynamical model from a time series involves two steps: (i) reconstruction of the phase space from data by time delay embedding; and (ii) development of a methodology for phase-space prediction.

2.1 Reconstruction of the phase space from data by time delay embedding

Let $X_0(t)$ be the time series of a dynamical variable from a potentially complex natural system (e.g. streamflow signal). As the $M$ variables $\{X_k(t)\}$ describing the system satisfy a set of first-order differential equations, successive differentiation in time reduces the problem to a single highly nonlinear differential equation of $M$th order for one of these variables. Thus, instead of $X_k(t)$, $k = 0, 1, \ldots, M-1$, we may use $X_0(t)$, the variable of the time series data, and its $(M-1)$ successive derivatives $X_0^{(k)}(t)$, $k = 1, \ldots, M-1$, to be the $M$ variables of the problem spanning the phase space of the system [1]. Therefore, in principle, sufficient information is given in a one-dimensional time series to construct a multidimensional phase space for studying the system dynamics.

A simple procedure, suggested originally by Ruelle [16], avoids the problem of calculating $X_0^{(k)}(t)$ from a time series of $X_0(t)$ and uses multiple time delays as a surrogate for successive derivatives. A point $\mathbf{X}(t)$ in an $M$-dimensional phase space is then defined as

$$\mathbf{X}(t) = [X_0(t),\; X_0(t+\tau),\; X_0(t+2\tau),\; \ldots,\; X_0(t+(M-1)\tau)]$$
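One way to see why delayed values can stand in for derivatives is the standard forward finite-difference approximation, sketched below. This is a textbook argument added for illustration, not a derivation taken from the paper. It shows that the first two derivatives of $X_0(t)$ are, to leading order in $\tau$, linear combinations of the delayed coordinates, so the delay vector carries the same information.

```latex
\begin{aligned}
X_0^{(1)}(t) &\approx \frac{X_0(t+\tau) - X_0(t)}{\tau}, \\
X_0^{(2)}(t) &\approx \frac{X_0(t+2\tau) - 2X_0(t+\tau) + X_0(t)}{\tau^{2}}.
\end{aligned}
```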

To construct a well-behaved phase space by time delay, a careful choice of $\tau$ is critical. A popular choice for this characteristic time scale is based on the autocorrelation function of the original time series. Here, the time delay $\tau$ is chosen such that the autocorrelation drops to $1/e$ [4]. As shown by Takens [17], the phase space retains essential properties of the original state space including the dimensionality. In addition, as we will see, a phase space can be used to make short-term predictions and to make a practical distinction between low-dimensional determinism and noise [9,10]. As an example, Fig. 1(a) shows the first 100 points from the so-called Henon map. For the chosen parameters, this is a chaotic map. This time series, in many ways, is indistinguishable from a white noise sequence [Fig. 1(b)]. A phase-space plot of the Henon map and a white noise sequence in Fig. 2, on the other hand, reveals remarkable structure in the chaotic Henon map, while the white noise sequence fills up the entire plane with no apparent structure.
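The reconstruction step can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' implementation. The function names (`autocorrelation`, `delay_from_autocorrelation`, `delay_embed`) and the `max_lag` cut-off are hypothetical choices made here, and the sample autocorrelation estimator is an assumption.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum((series[i] - mean) * (series[i + lag] - mean)
                     for i in range(n - lag))
    return covariance / variance


def delay_from_autocorrelation(series, max_lag=100):
    """Smallest lag at which the autocorrelation has dropped to 1/e or below.

    `max_lag` is an illustrative safeguard; it is returned if the
    threshold is never reached.
    """
    threshold = 1.0 / math.e
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) <= threshold:
            return lag
    return max_lag


def delay_embed(series, m, tau):
    """Delay vectors [x(t), x(t + tau), ..., x(t + (m - 1) * tau)]."""
    n_vectors = len(series) - (m - 1) * tau
    return [[series[t + k * tau] for k in range(m)] for t in range(n_vectors)]
```

For instance, `delay_embed([0, 1, 2, 3, 4, 5], m=3, tau=2)` returns `[[0, 2, 4], [1, 3, 5]]`. Plotting the pairs produced with `m=2` would give a two-dimensional phase-space map of the kind shown in Fig. 2.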

2.2 Develop a methodology for phase-space prediction

Once we have reconstructed the phase space, we can use some of its properties to develop a short-term prediction model. For example, if the underlying dynamics is deterministic, then the order in which the points in the phase space appear will also be deterministic. Thus, we may be able to define some functional relationship between the current state $\mathbf{X}(t)$ and the future state $\mathbf{X}(t+P)$, i.e. $\mathbf{X}(t+P) = F_P(\mathbf{X}(t))$. Now, we need to find a predictor $\hat{F}_P$ that approximates $F_P$. There are a variety of numerical techniques to approximate $F_P$ from scattered points in the phase space. This methodology can be illustrated by using Fig. 3, where part of a trajectory is shown in a two-dimensional phase space and the present state is denoted by an open circle. The solid circles indicate neighbors of the current state, and the arrowheads show movement of the neighbors through a local section of the phase space. By finding a suitable function (linear or nonlinear) that approximates how the neighbors move, a prediction from the current state can be made. This is known as local approximation, as opposed to a global approximation, which defines a functional relationship over the entire phase space.

Farmer and Sidorowich [6] introduced local linear models for phase-space forecasting. Smith [2] discussed the relationship between local linear and nonlinear models as well as between the local and global approaches. In general, local linear approximation has been shown to provide better prediction accuracy for a number of controlled datasets [18]. In this paper, local approximation methods will be used. One such method, popularly known as the nearest neighbor method, approximates unknown functions near the present state vector by using the nearest neighbor of the present state.

Fig. 1. (a) Time series of 100 points for the chaotic Henon map: $x_{t+1} = 1 - a x_t^2 + y_t$; $y_{t+1} = b x_t$, with $a = 1.4$ and $b = 0.30$. This time series is in many ways indistinguishable from random noise. (b) Time series of 100 points generated from a uniform distribution in the interval between 0 and 1.

Fig. 2. Two-dimensional phase space map for (a) a Henon map, and (b) a white noise sequence.

We now locate nearby $M$-dimensional points in the phase space and choose a minimal neighborhood with $K$ closest neighbors such that the predictee (the point from which the prediction is made) is contained within the smallest simplex. To enclose a point in an $M$-dimensional space, we require a simplex with a minimum of $M + 1$ points. Then, to obtain a prediction, we project the domain of the chosen nearest neighbors $T_p$ steps forward ($T_p$ is the prediction step) and compute $\hat{F}_p$ to get the predicted value. Since it becomes increasingly difficult to define an enclosing simplex for higher dimensional embedding spaces, we have extended the above idea to the nearest neighbors in a Euclidean sense. A minimum of $M + 1$ nearest neighbors are chosen based on the Euclidean distance between the neighbor and the predictee. Then, we project the domain of the chosen neighbors $T_p$ steps forward and estimate the predicted value. We have explored several estimation kernels, including arithmetic average, weighted average and weighted regression, to estimate the predicted value. It was found that the arithmetic average provides comparable prediction accuracy and requires no tuning parameters, and hence we have chosen the arithmetic average of projected neighboring points to obtain the predicted value in this study.
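The Euclidean nearest-neighbor variant with an arithmetic-average kernel can be sketched as follows. This is an illustrative reading of the description above, not the authors' code: the function name `nearest_neighbor_forecast`, the brute-force neighbor search and the default of exactly $M + 1$ neighbors are assumptions made here for the example.

```python
import math


def nearest_neighbor_forecast(train, query, tp, k=None, tau=1):
    """Forecast the value `tp` steps after the delay vector `query`.

    train : list of past observations (the training series)
    query : delay vector [x(t), x(t + tau), ..., x(t + (m - 1) * tau)]
    tp    : prediction lead time in steps
    k     : number of neighbors; defaults to m + 1 (an assumption)
    """
    m = len(query)
    k = m + 1 if k is None else k
    span = (m - 1) * tau

    # Pair every training delay vector with the value tp steps after it.
    candidates = []
    for t in range(len(train) - span - tp):
        vector = [train[t + j * tau] for j in range(m)]
        distance = math.dist(vector, query)  # Euclidean distance
        candidates.append((distance, train[t + span + tp]))

    # Keep the k closest vectors and average their projected futures.
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(future for _, future in nearest) / len(nearest)
```

As a quick check on a trivially periodic series, `nearest_neighbor_forecast([0.0, 1.0, 2.0, 3.0] * 10, [1.0, 2.0], tp=1)` returns `3.0`, because every exact neighbor of the state (1, 2) is followed by 3. A weighted average or weighted regression could replace the final line; the unweighted mean is used here because the text reports that it needs no tuning parameters.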

There are only two parameters to be chosen for this phase-space prediction model: embedding dimension $M$, and number of nearest neighbors $K$. In general, $M_{\min} > (2D + 1)$, where $D$ is the attractor dimension. An estimate of the attractor dimension may be obtained from the correlation dimension [4,5]. Prediction results are sensitive to the choice of $M$ [10]. We will look at the prediction accuracy (correlation between predicted and observed) as a function of embedding dimension to choose an optimum value of $M$ for our prediction algorithm. Since enclosing a point in an $M$-dimensional space requires a simplex with a minimum of $(M + 1)$ points, one has $K_{\min} > (M + 1)$.

Use of the phase space to develop a forecasting model may appear to be similar to an autoregressive model: a prediction is estimated based on time-lagged vectors. However, the crucial difference is that understanding phase-space geometry frames forecasting as recognizing and then representing underlying dynamical structures. For example, two neighboring points in a phase space may not be close to each other within the context of a time sequence. The traditional autoregressive (AR) model relies on time-lagged signals that are neighbors in a temporal sense, whereas a neighbor in a phase space is close in a dynamic sense. In addition, once the number of lags exceeds the minimum embedding dimension, the geometry of the underlying dynamics will not change. A global linear model, such as the AR model, must do this with a single hyperplane with no fundamental insight into the underlying geometric structure. Unlike traditional AR models, the proposed methodology also promises to make a tentative distinction between stochastic noise and low-dimensional chaos. A characteristic feature of chaotic dynamics is that the prediction accuracy exponentially decays as the prediction time increases. On the other hand, for a noisy system the prediction accuracy does not decay sharply with prediction lead time [9,10].

2.3 Distinction between deterministic chaos and stochastic noise

Below, we show how a phase-space-based forecasting model works by applying it to a known chaotic time series generated from the well-studied chaotic Henon map. Additionally, as an example of noisy dynamics, we study uncorrelated additive noise superimposed on a sine wave. Such uncorrelated noise can be thought of as measurement error superimposed on a hypothetical streamflow signal with a pronounced seasonal cycle. We have used a total of 5000 points for each time series; the first 4000 points are used as a training set while the other 1000 points are used to make predictions and estimate prediction accuracy as a function of prediction lead time.

Fig. 4 shows the prediction accuracy for the chosen chaotic and noisy time series. Here, the prediction accuracy is defined as the correlation between the observed and predicted values of a particular time series. The dotted line shows that the correlation does not decline for additive noise (here white noise is superimposed on a periodic signal) as one tries to forecast further into the future. In contrast, the solid line (for a time series generated from a chaotic Henon map) shows the declining signature characteristic of a chaotic sequence. For a detailed discussion on the Henon map including its stability and phase-space characteristics we refer to Ref. [19]. The correlation coefficient of the Henon map prediction drops abruptly from 0.95 for $T_p = 1$ to 0.16 for $T_p = 3$. Such a sharp drop in the prediction accuracy is a characteristic signature of a chaotic signal. If there is a periodicity in the signal which is less than the maximum prediction lead time, then the effects of the periodicity of the signal will show up in the prediction accuracy. To avoid such an influence, usually a difference time series is used [9,10]. On the other hand, the correlation coefficient for the noisy time series does not show such an exponential loss of information with prediction lead time. In the following section, we will explore the utility of this diagnostic tool to characterize the nature of daily streamflow.

Fig. 3. Schematic representation of the nearest-neighbor method for phase-space-based prediction. The present state $\mathbf{X}(t)$ and its unknown future value $\mathbf{X}(t + T)$ are denoted by open circles, while the black dots inside the circle represent the neighborhood of $\mathbf{X}(t)$ in the phase space. By finding a suitable function (linear or nonlinear) that approximates how neighbors move, a prediction of the current state is made [6,19].

Fig. 4. Prediction accuracy, defined as the correlation between the observed and predicted values of a particular time series, as a function of prediction lead time. The dotted line represents a sine wave with additive noise, while the solid line depicts the chaotic Henon map.
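The diagnostic described in this section can be reproduced in outline with the sketch below, which computes prediction accuracy as a function of lead time for a Henon series and for a noisy sine wave. It is a standard-library illustration under several assumptions that are not taken from the paper: the function names, the Gaussian noise model and its standard deviation, the sine period, the initial condition and discarded transient of the Henon map, and the default of $M + 1$ neighbors. The numerical values it produces are not expected to match Fig. 4 exactly.

```python
import heapq
import math
import random


def henon_series(n, a=1.4, b=0.3, discard=100):
    """x-component of the Henon map, dropping an initial transient."""
    x, y = 0.0, 0.0
    values = []
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:
            values.append(x)
    return values


def noisy_sine(n, period=50, noise_sd=0.2, seed=0):
    """Sine wave plus uncorrelated Gaussian noise (an assumed noise model)."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * t / period) + rng.gauss(0.0, noise_sd)
            for t in range(n)]


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def accuracy_by_lead_time(series, n_train, m, max_lead, tau=1, k=None):
    """Observed-vs-predicted correlation for lead times 1..max_lead."""
    k = m + 1 if k is None else k
    span = (m - 1) * tau

    def vector(t):
        return [series[t + j * tau] for j in range(m)]

    # Training vectors whose projected futures stay inside the training set.
    starts = range(n_train - span - max_lead)
    library = [vector(s) for s in starts]
    observed = {tp: [] for tp in range(1, max_lead + 1)}
    predicted = {tp: [] for tp in range(1, max_lead + 1)}
    for t in range(n_train, len(series) - span - max_lead):
        query = vector(t)
        nearest = heapq.nsmallest(
            k, starts, key=lambda s: math.dist(library[s], query))
        for tp in observed:
            observed[tp].append(series[t + span + tp])
            # Arithmetic average of the neighbors projected tp steps forward.
            predicted[tp].append(
                sum(series[s + span + tp] for s in nearest) / k)
    return {tp: correlation(observed[tp], predicted[tp]) for tp in observed}
```

A call such as `accuracy_by_lead_time(henon_series(5000), n_train=4000, m=2, max_lead=10)` mirrors the 4000/1000 split used in the text. Comparing its output with the same call on `noisy_sine(5000)` should show the qualitative contrast discussed above: a marked loss of accuracy with lead time for the chaotic series and little change for the noisy periodic one.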

3 PHASE-SPACE-BASED MODEL FOR STREAMFLOW PREDICTION

3.1 Analysis of daily streamflow from the southwestern United States

The dataset used in this study is described by Wallis et al. [20]. It consists of daily streamflow measurements from 1948 to 1988 for 1009 streamgages across the United States. All files are serially complete for 41 water years beginning in October 1948 and ending in September 1988. Missing data in the raw data records are estimated using simple prorating methods described in Wallis et al. [20].

First, eight stations are chosen from the southwestern United States covering three states: Arizona, California and New Mexico. Relevant information for the eight selected stations used in this study is summarized in Table 1. Here, basin yield is defined as the average flow rate per unit area. These stations are chosen to represent a wide range of basin areas from the same geographical region. Basin areas for the selected stations range between 47 and 20 442 km². Station IDs with the letter prefix ‘q’ indicate that there were no data gaps for the station in the raw USGS data file, while a prefix ‘x’ indicates that there were periods of missing data that were estimated by Wallis et al. [20].

Table 1. Characteristic attributes for streamgages from the southwestern United States

Number  Station   Area    Latitude  Longitude  Daily average  Coefficient of      Average yield
        identity  (km²)                        flow rate      variation of        (10⁻⁵ m day⁻¹)
                                               (m³ s⁻¹)       daily flows
1       x102818      47   36.78     118.26      0.36          1.37                 66.10
2       x112135    2465   36.86     118.97     42.86          1.69                150.20
3       q094710    3156   31.63     110.17      1.45          5.48                  3.97
4       q094305    4826   33.06     108.54      4.45          2.69                  7.97
5       q094975    7376   33.79     110.50     19.06          2.22                 22.32
6       x094420   10381   32.97     109.31      5.21          2.97                  4.33
7       q094985   11148   33.62     110.92     24.52          2.61                 19.00
8       q094485   20442   32.87     109.51     13.16          3.51                  5.56

Fig. 5. Daily time series of eight streamflow records from the southwestern United States described in Table 1, for 41 years (1948–1988). The vertical axis is flow rate (m³ s⁻¹).
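The definition of basin yield as average flow rate per unit area can be checked against Table 1 with a short calculation. The sketch below is illustrative only; the function name `basin_yield` is hypothetical, and the reading of the tabulated yields as multiples of 10⁻⁵ m per day is an inference from the arithmetic rather than a statement from the paper.

```python
SECONDS_PER_DAY = 86_400
SQUARE_METRES_PER_KM2 = 1.0e6


def basin_yield(flow_m3_per_s, area_km2):
    """Average yield as a depth of water per day (m/day)."""
    daily_volume_m3 = flow_m3_per_s * SECONDS_PER_DAY
    return daily_volume_m3 / (area_km2 * SQUARE_METRES_PER_KM2)


# (daily average flow rate in m^3/s, basin area in km^2) for three stations
# of Table 1: numbers 1, 2 and 8.
stations = {1: (0.36, 47), 2: (42.86, 2465), 8: (13.16, 20442)}

for number, (flow, area) in stations.items():
    # Scale by 1e5 so the result is comparable with the yield column.
    print(number, round(basin_yield(flow, area) * 1e5, 2))
```

The computed values agree with the tabulated yields (66.10, 150.20 and 5.56) to within the rounding of the published flow rates, which supports reading the yield column in units of 10⁻⁵ m per day.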

Fig. 5 shows the variations of daily streamflow values for the eight streamflows for 41 years. The smaller basins (1 and 2) seem to have a pronounced annual cycle, while the larger basins (3–8) do not show an appreciable annual cycle. This feature can be clearly seen in Fig. 6, which shows the streamflows for the smallest and the largest basins during the first 5 years (1948–1953). The autocorrelation functions for the selected eight basins are shown in Fig. 7. It appears that the two smallest basins (1 and 2) show apparent periodicity, while the other six do not show any clear periodic signature. This preliminary analysis suggests that streamflow records 1 and 2 are dominated by a seasonal cycle with added noise. On the other hand, a sharp decay in the autocorrelation function for the other six streamflow records suggests that their dynamics might be controlled either by random processes or by deterministic chaos. We will use phase-space model-based predictions to make a distinction between these two types of streamflow characteristics. Fig. 8(a) and (b) show three-dimensional phase-space maps for q10140 with two different values of lag time ($\tau$). If the dimension of the underlying attractor is greater than three, a phase-space map in three or fewer dimensions would appear as a cluster of points with no identifiable structure. It appears that the underlying dynamics for this time series (q10140) has a higher dimensional attractor, and consequently the underlying structure is hidden. However, a higher dimensional phase-space map, although difficult to visualize, is expected to show a structured pattern in the phase space.

As discussed in Section 2, the first step in developing a phase-space model for streamflow signals involves the determination of optimum embedding dimensions from the daily streamflow time series. This is done by plotting the correlation coefficient between the observed and predicted streamflows for $T_p = 1$ (1-day ahead prediction) as a function of embedding dimension. Fig. 9 shows the correlation coefficient for eight streamflows as a function of embedding dimension, $M$. We choose the optimum embedding dimension such that it produces the largest correlation coefficient for 1-day ahead prediction. For example, streamflow record 5 produces a maximum correlation coefficient of 0.85 for $M = 4$, and hence for this streamflow four is chosen as the optimum embedding dimension. Optimum embedding dimensions found were 2, 3, 3, 4, 4, 7, 2 and 4 for streamflow records 1–8, respectively. An estimate of the optimum embedding dimension provides an indication of the underlying complexity of the system. For example, in general, the larger the embedding dimension, the greater is the underlying complexity. There is no apparent trend between the optimal embedding dimension and basin area. With these embedding dimension estimates, we are now set to make predictions.
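The selection rule described above (pick the $M$ that maximizes the 1-step-ahead correlation) can be sketched as follows. This is a simplified illustration and not the authors' procedure in detail: the function names are hypothetical, the delay is fixed at one time step, the neighbor count defaults to $M + 1$, and the train/test split is passed in by the caller.

```python
import heapq
import math


def one_step_accuracy(series, n_train, m, k=None):
    """Correlation between observed and predicted 1-step-ahead values.

    Uses delay vectors with a delay of one step (an assumption) and the
    arithmetic average of the k nearest neighbors' next values.
    """
    k = m + 1 if k is None else k
    library = [series[s:s + m] for s in range(n_train - m)]
    observed, predicted = [], []
    for t in range(n_train, len(series) - m):
        query = series[t:t + m]
        nearest = heapq.nsmallest(
            k, range(len(library)),
            key=lambda s: math.dist(library[s], query))
        observed.append(series[t + m])
        predicted.append(sum(series[s + m] for s in nearest) / k)
    mean_o = sum(observed) / len(observed)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def optimum_embedding_dimension(series, n_train, dimensions):
    """Return (best M, {M: correlation}) over the candidate dimensions."""
    scores = {m: one_step_accuracy(series, n_train, m) for m in dimensions}
    best = max(scores, key=scores.get)
    return best, scores
```

Called as `optimum_embedding_dimension(flows, n_train, range(1, 11))` on a list of daily flows, the returned dictionary corresponds to one curve of the kind plotted in Fig. 9, and the first element is the dimension at its maximum.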

Fig. 10 shows the prediction accuracy for the selected stations as a function of the prediction lead time. For each of these streamflows we have made 1-day to 20-day ahead predictions. The two smallest basins show a very high degree of correlation between the observed and the predicted sequence. This persistence in prediction accuracy may be considered analogous to a periodic signal with additive noise, as seen in the illustrative example of Fig. 4. Prediction accuracy for the other six stations shows an exponential decay with the prediction lead time. Sugihara et al. [10] argued that such an exponential decline in prediction accuracy could arise from locally exponentially diverging trajectories and could be taken as an operational definition of chaos. A sharp decay of correlation between the observed and predicted streamflow records shown in Fig. 10 could thus suggest a possible presence of deterministic chaos. This serves as preliminary evidence that the daily streamflow time series analyzed here show a change in dynamics as we increase the basin area. They appear to show a tendency to go from noisy dynamics to chaotic dynamics for increasing basin areas. As we explain below, the influence of other factors such as climate, topography, vegetation and soil texture could complicate this apparent relationship between basin area and streamflow characteristics.

Fig. 6. Daily time series of the smallest and the largest basin for the first five years (1948–1953).

A direct implication of the results reported above is thatincreasing basin area does not necessarily imply increasedlinearity or enhanced predictability. This is somewhatcounterintuitive. One could argue that a larger basin

would spatially average small-scale fluctuations in forcingfunctions (e.g. rainfall) and basin attributes (e.g. spatialvariability in topography, soil texture). This averagingshould reduce the dimension of the underlying dynamicalsystem and consequently lead to increased streamflow pre-dictability. There does not appear to be any consistentreduction in the optimum embedding dimension as weincrease the basin area. Another feature to note for theseeight stations is that there appears to be a relationshipbetween the yield (average flow rate per unit area expressedas depth per day) and basin dynamics. For higher yield,basin dynamics appear to be more predictable, whereasfor lower yield it becomes more unpredictable. If onelooks at the geographical locations of these basins, basins1 and 2 are seen in the Sierra Nevada while the other sixbasins appear to be in the Gila and Salt River drainages. TheSierra Nevada area is dominated by winter storm frontscoming from the Pacific, and snow accumulation andsnowmelt play a strong role in the hydrology of streamflowrecords 1 and 2. The streamflow records 3–8, on theother hand, are affected by more variable winter storms,

Fig. 7. Autocorrelation function for the eight streamflow records from the southwestern United States as a function of lag (days).


by small-scale convective events, and by occasional intense, large-area summer monsoons. Hence, there tends to be less persistence in these streamflow signals. Based on these hydrometeorological explanations, one could argue that the shift in streamflow characteristics from noisy dynamics to low-dimensional determinism is significantly affected by the variability and timing of atmospheric processes in this region. This, however, complicates the notion of increased linearity or enhanced predictability of streamflow with increasing area. As the results and inferences presented above are based on the analysis of eight selected stations from one geographical region, further analysis with more streamgages from other regions is required before a generalized conclusion can be attempted.

3.2 Analysis of daily streamflow from the continental United States

In this section, we analyze daily streamflow data from 20 streamgages from across the continental United States. These streamgages are chosen from the data set compiled by Wallis et al.20. Relevant information for the 20 selected stations used in this study is summarized in Table 2. These stations were chosen randomly to represent a wide range of basin areas from different geographical regions within the continental United States. Basin areas for the selected stations range between 31 and 35 079 km². This is the widest range, in terms of basin area, we could find for unregulated streams. For these stations the average flow rate varies over three orders of magnitude, with a range between

Fig. 8. Three-dimensional phase space map for q10140 with (a) τ = 5 days and (b) τ = 10 days.


0.30 and 343 m³ s⁻¹. The yield varies between 1.64 and 244.8 × 10⁻⁵ m day⁻¹. For the streamflow time series analyzed, there does not appear to be any relationship between the basin area and the coefficient of variation or between the basin area and the yield. Fig. 11 shows the autocorrelation coefficient for these 20 streamflow time series. There does not appear to be any direct relationship between apparent periodicity in the autocorrelation function and the basin area.
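As a small worked illustration of the yield defined earlier (average flow rate per unit area, expressed as depth per day), the following sketch converts a flow rate and basin area into a yield. The function name `average_yield` is hypothetical and introduced here only for illustration; the input values are those listed for stations 'a' and 'd' in Table 2.

```python
SECONDS_PER_DAY = 86_400


def average_yield(flow_m3_per_s, area_km2):
    """Yield in m/day: daily flow volume (m^3) divided by basin area (m^2)."""
    area_m2 = area_km2 * 1.0e6  # 1 km^2 = 1e6 m^2
    return flow_m3_per_s * SECONDS_PER_DAY / area_m2


# Stations 'a' (q10730) and 'd' (x133295) from Table 2.
for name, flow, area in [("a", 0.56, 31), ("d", 2.21, 78)]:
    # Report in the table's unit of 10^-5 m/day.
    print(name, round(average_yield(flow, area) / 1e-5, 2))
```

Expressing the result in units of 10⁻⁵ m day⁻¹ allows a direct comparison with the last column of Table 2.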

Now we will use phase-space model based predictions to explore whether there is a physical threshold, in terms of basin area, at which the dynamics of streamflow change from linear noisy dynamics to chaotic dynamics for increasing basin areas. Fig. 12 shows the prediction accuracy for 20 streamflows as a function of embedding dimension. As before, we use this figure to choose the optimum embedding dimension such that it produces the largest correlation coefficient for 1-day ahead prediction. Fig. 13 shows the prediction accuracy for these 20 stations as a function of the prediction lead time. For each of these streamflows we have made 1-day to 20-day ahead predictions.
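The selection of the optimum embedding dimension described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a plain nearest-neighbour average as the local predictor, a unit delay, and k = 3 neighbours, and all function names (`embed`, `predict`, `correlation`, `best_embedding_dimension`) are hypothetical. The demonstration series is a logistic map used as a stand-in, not streamflow data.

```python
import math


def embed(series, m, tau=1):
    """Return (t, vector) pairs with vector = [x(t), x(t-tau), ..., x(t-(m-1)tau)]."""
    start = (m - 1) * tau
    return [(t, [series[t - j * tau] for j in range(m)])
            for t in range(start, len(series))]


def correlation(a, b):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def predict(series, n_train, m, tau=1, lead=1, k=3):
    """Forecast series[t + lead] for t >= n_train by averaging the futures
    of the k nearest delay vectors found in the training part."""
    vectors = embed(series, m, tau)
    library = [(t, v) for t, v in vectors if t + lead < n_train]
    observed, predicted = [], []
    for t, v in vectors:
        if t < n_train or t + lead >= len(series):
            continue
        nearest = sorted(library, key=lambda item: math.dist(item[1], v))[:k]
        predicted.append(sum(series[s + lead] for s, _ in nearest) / len(nearest))
        observed.append(series[t + lead])
    return observed, predicted


def best_embedding_dimension(series, n_train, dims=range(1, 6), tau=1):
    """Pick the dimension giving the largest 1-step-ahead correlation."""
    scores = {m: correlation(*predict(series, n_train, m, tau, lead=1))
              for m in dims}
    return max(scores, key=scores.get), scores


# Demonstration on a chaotic logistic-map series (stand-in data).
x = [0.3]
for _ in range(399):
    x.append(4.0 * x[-1] * (1.0 - x[-1]))
m_best, scores = best_embedding_dimension(x, n_train=300)
print(m_best, scores)
```

Calling `predict` with `lead` set to 1 through 20 and the chosen dimension would give a prediction accuracy curve of the kind plotted against lead time.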

Four basins (d, f, q and t) show persistently high correlation between the observed and the predicted sequence. This persistence in prediction accuracy is analogous to that of a periodic signal with additive noise, as seen in the illustrative example

Fig. 9. Prediction accuracy as a function of embedding dimension for the selected eight streamgages from the southwestern United States.

Fig. 10. Prediction accuracy as a function of prediction lead time for 1-day to 20-day ahead predictions for the eight streamgages from the southwestern United States.

Table 2. Characteristic attributes for streamgages selected from the continental United States

| Number | Station identity | Area (km²) | Latitude | Longitude | Daily average flow rate (m³ s⁻¹) | Coefficient of variation of daily flows | Average yield (10⁻⁵ m day⁻¹) |
|---|---|---|---|---|---|---|---|
| a | q10730 | 31 | 43.15 | −70.97 | 0.56 | 1.55 | 156.07 |
| b | q20990 | 39 | 36.04 | −79.95 | 0.50 | 3.24 | 110.76 |
| c | q54540 | 65 | 41.69 | −91.49 | 0.45 | 3.52 | 93.25 |
| d | x133295 | 78 | 45.34 | −117.29 | 2.21 | 1.25 | 244.80 |
| e | q73730 | 132 | 31.54 | −92.41 | 1.72 | 3.65 | 112.58 |
| f | x22670 | 153 | 27.96 | −81.50 | 1.33 | 0.61 | 75.11 |
| g | q53935 | 212 | 45.45 | −89.98 | 2.36 | 2.06 | 96.18 |
| h | q69115 | 287 | 38.61 | −95.64 | 1.58 | 5.70 | 47.56 |
| i | q54660 | 401 | 41.27 | −90.38 | 2.91 | 2.54 | 62.69 |
| j | q54640 | 13322 | 42.50 | −92.33 | 77.02 | 1.57 | 4.99 |
| k | q10140 | 14666 | 47.26 | −68.59 | 268.32 | 1.44 | 158.07 |
| l | q53405 | 16155 | 45.41 | −92.65 | 132.85 | 1.09 | 6.76 |
| m | q54645 | 16854 | 41.97 | −91.67 | 99.82 | 1.38 | 51.17 |
| n | x69020 | 17812 | 39.64 | −93.27 | 101.88 | 2.34 | 49.42 |
| o | q54650 | 20154 | 41.41 | −91.29 | 124.67 | 1.23 | 53.45 |
| p | q23205 | 20400 | 29.96 | −82.93 | 213.58 | 0.83 | 90.45 |
| q | q64855 | 21809 | 42.83 | −96.56 | 25.94 | 2.84 | 10.27 |
| r | q80805 | 22772 | 33.01 | −100.18 | 4.32 | 7.16 | 1.64 |
| s | q21310 | 22860 | 34.20 | −79.55 | 288 | 0.84 | 108.8 |
| t | q133170 | 35079 | 45.75 | −116.32 | 343.47 | 1.25 | 84.6 |


of Fig. 4. Prediction accuracy for seven stations (b, c, e, h, i, n and r) shows an exponential decay with the prediction lead time. This sharp decay of correlation between the observed and predicted streamflows suggests the existence of deterministic chaos. Prediction accuracy for the other nine stations falls somewhere between these two dynamical regimes. There does not appear to be any direct relationship between the area or yield and the prediction accuracy. For example, the largest basin (t) shows persistence in prediction accuracy similar to a periodic signal with additive noise, while the third largest basin (r) shows a dynamical behavior similar to deterministic chaos and the second largest basin (s) falls somewhere in the middle.
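One hedged way to put a number on "exponential decay with the prediction lead time" is to fit a straight line to the logarithm of the correlation coefficient against lead time. The sketch below is an illustrative diagnostic only and is not a step described in the paper; the function name `decay_rate` and the sample curves are assumptions.

```python
import math


def decay_rate(correlations):
    """Least-squares estimate of lam in r(T) ~ r0 * exp(-lam * T).

    correlations[i] is the correlation at lead time T = i + 1.
    Non-positive correlations are skipped because their log is undefined.
    """
    points = [(lead, math.log(r))
              for lead, r in enumerate(correlations, start=1) if r > 0]
    if len(points) < 2:
        raise ValueError("need at least two positive correlations")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return -sxy / sxx  # positive value means accuracy decays with lead time


# Hypothetical curves: rapid decay versus persistent accuracy.
fast = [0.9 * 0.5 ** lead for lead in range(1, 11)]
flat = [0.9] * 10
print(decay_rate(fast), decay_rate(flat))
```

A rate near zero corresponds to the persistent accuracy expected of a periodic signal with additive noise, while a clearly positive rate corresponds to the rapid loss of predictability associated above with chaotic dynamics.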

To explore the origin of such mixed characteristics, ranging from low-order determinism to stochastic noise, for different streamflows, we now focus on a set of synthetically generated time series with known dynamics. We have generated 13 time series with dynamics ranging from deterministic chaos, to a periodic signal, and to pure noise. Fig. 14 shows the prediction accuracy for these 13 time series as a function of prediction lead time. Here, the signal generated from deterministic chaos (a; generated from the Henon map) shows an exponential decay with increasing prediction lead time

Fig. 11. Similar to Fig. 7 but for 20 streamgages, ‘a’–‘t’ for increasing basin areas, chosen from across the continental United States.

Fig. 12. Similar to Fig. 9 but for 20 streamgages, ‘a’–‘t’ for increasing basin areas, chosen from across the continental United States.



Fig. 13. Similar to Fig. 10 but for 20 streamgages, ‘a’–‘t’ for increasing basin areas, chosen from across the continental United States.

Fig. 14. Prediction accuracy as a function of prediction lead time for 13 artificially generated time series with known dynamics. Details on the time series are given in the text.


while the other five deterministic chaotic signals with increasing levels of additive noise (b–f) also show rapid loss of predictability with increasing lead time. The signal generated from a pure sine wave (g) does not show, as expected, any loss of information with increasing prediction lead time. However, periodic signals contaminated with increasing levels of additive noise (h–l) mimic dynamics which fall in between deterministic chaos and a periodic signal. We also note that, as expected, the signal representative of pure noise (m) does not show any level of predictability.
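The three families of synthetic signals discussed above (Henon map, sine wave, pure noise, each optionally contaminated with additive noise) can be generated along the following lines. This is a sketch under assumptions: the text does not state the noise levels, the sine period, or how noise was scaled, so the Gaussian noise scaled by the signal's standard deviation, the period of 50, the standard Henon parameters a = 1.4 and b = 0.3, and all function names are illustrative choices.

```python
import math
import random


def henon_series(n, a=1.4, b=0.3, discard=100):
    """x-component of the Henon map x' = 1 - a*x^2 + y, y' = b*x."""
    x, y = 0.0, 0.0
    out = []
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x  # both updates use the old x
        if i >= discard:  # drop the initial transient
            out.append(x)
    return out


def sine_series(n, period=50.0):
    """Pure periodic signal."""
    return [math.sin(2.0 * math.pi * t / period) for t in range(n)]


def add_noise(series, level, seed=0):
    """Add Gaussian noise with standard deviation level * std(series)."""
    rng = random.Random(seed)
    mean = sum(series) / len(series)
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))
    return [v + rng.gauss(0.0, level * std) for v in series]


def white_noise(n, seed=0):
    """Pure Gaussian noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


# Hypothetical noise levels; the paper's actual levels are not given here.
levels = [0.1, 0.2, 0.3, 0.4, 0.5]
chaotic = [henon_series(1000)] + [add_noise(henon_series(1000), s) for s in levels]
periodic = [sine_series(1000)] + [add_noise(sine_series(1000), s) for s in levels]
noise = white_noise(1000)
print(len(chaotic) + len(periodic) + 1)
```

With five noise levels per family, this layout yields six chaotic signals, six periodic signals and one pure-noise signal, matching the count of 13 series labelled a–m.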

To test the robustness of our proposed prediction algorithm, we have analyzed two streamgages (‘b’: q20990 with 39 km² basin area, and ‘k’: q10140 with 14 666 km² basin area) for three different 10-year segments, 1948–1957, 1958–1967 and 1968–1977. For each segment, the first 5 years of the data are used for the training phase and the other 5 years are used for prediction. Fig. 15 shows the prediction accuracy vs prediction lead time for the three segments of the record examined. For each basin, the relationship between prediction accuracy and lead time is nearly the same for all three segments of the data. This consistency across segments indicates that the prediction results are robust and stable.
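The segment-wise training and prediction split described above can be sketched as follows. The record layout (a list of `(year, flow)` pairs) and the function name `split_segment` are assumptions made for illustration; the flows in the demonstration are placeholders, not observed data.

```python
def split_segment(records, start_year, segment_years=10, train_years=5):
    """Split one segment of (year, flow) records into training and test flows.

    The first `train_years` years are used for training and the remaining
    years of the segment for prediction.
    """
    split_year = start_year + train_years
    end_year = start_year + segment_years
    train = [q for year, q in records if start_year <= year < split_year]
    test = [q for year, q in records if split_year <= year < end_year]
    return train, test


# Placeholder record: one value per day, 365 days per year, 1948-1977.
records = [(year, float(day)) for year in range(1948, 1978) for day in range(365)]

for start in (1948, 1958, 1968):
    train, test = split_segment(records, start)
    print(start, len(train), len(test))
```

Each segment can then be passed to the prediction procedure separately, so that the accuracy curves for the three segments can be compared.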

4 SUMMARY AND CONCLUSIONS

This paper describes a methodology, based on dynamical systems theory, to model and predict streamflow. The model is constructed by developing a multidimensional phase-space map from observed streamflow time series. Predictions are made by examining trajectories on the reconstructed phase space. Prediction accuracy is used as a diagnostic tool to characterize the nature, random vs deterministic, of streamflow characteristics. To demonstrate the utility of this diagnostic tool, the proposed method is first applied to a time series with known dynamics. It has been shown that the proposed phase-space model can be used to make a tentative distinction between noisy and low-order deterministic chaotic streamflow signals.

The proposed phase-space model is then applied to daily streamflows for 28 selected stations from the continental United States covering basin areas between 31 and 35 079 km². Based on the analyses of these 28 streamflow time series and 13 artificially generated signals with known dynamics, no direct relationship between the nature of the underlying streamflow signal and basin area has been found. In other words, it appears that increasing the basin area does not necessarily imply increased linearity or enhanced predictability. In addition, there does not appear to be any physical threshold (in terms of basin area, average flow rate and yield) that controls the change in streamflow characteristics at the daily scale. The daily streamflow time series may span a wide dynamical range between deterministic chaos and a periodic signal contaminated with additive noise. Added noise strongly affects the nonlinear behavior of a deterministic system by decreasing the predictability and increasing the dimension of an existing attractor. We note, however, that in addition to basin area, the heterogeneous influence of other factors (e.g. topography and climate) could also play a role in dictating the predictability of streamflow. We hope future studies will attempt to quantify the relative importance of these factors for streamflow dynamics.

ACKNOWLEDGEMENTS

This research is supported, in part, by a grant from the

Fig. 15. Prediction accuracy as a function of prediction lead time for q10140: (·) 1948–1957; (○) 1958–1967; (×) 1968–1977, and q20990: (+) 1948–1957; (−) 1958–1967; (*) 1968–1977.


National Science Foundation (NSF EAR-9526628). Comments from three anonymous reviewers and the editor (Dr Mike Celia) are gratefully acknowledged.

REFERENCES

1. Packard, N.H., Crutchfield, J.P., Farmer, J.D. & Shaw, R.S. Geometry from a time series. Physical Review Letters, 1980, 45 712–716.

2. Smith, L. A., Does a meeting in Santa Fe imply chaos? In: Time Series Prediction: Forecasting the Future and Understanding the Past. Addison Wesley, Reading, MA, 1994.

3. Fraedrich, K. Estimating the dimensions of weather and climatic attractors. Journal of the Atmospheric Sciences, 1986, 43 331–344.

4. Islam, S., Bras, R.L. & Rodriguez-Iturbe, I. An explanation for low correlation dimension estimates for the atmosphere. Journal of Applied Meteorology, 1993, 32(2) 203–208.

5. Rodriguez-Iturbe, I., De Power, B.F., Sharifi, M.B. & Georgakakos, K.P. Chaos in rainfall. Water Resources Research, 1989, 25(7) 1667–1675.

6. Farmer, J.D. & Sidorowich, J.J. Exploiting chaos to predict future and reduce noise. Physical Review Letters, 1987, 59 845–848.

7. Schreiber, T. & Grassberger, P. A simple noise-reduction method for real data. Physics Letters A, 1991, 160 411–418.

8. Lall, U., Sangoyomi, T. & Abarbanel, H.D.I. Nonlinear dynamics of the Great Salt Lake: nonparametric short-term forecasting. Water Resources Research, 1996, 32(4) 975–986.

9. Sugihara, G. & May, R.M. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature, 1990, 344 734–741.

10. Sugihara, G., Grenfell, B. & May, R.M. Distinguishing error from chaos in ecological time series. Philosophical Transactions of the Royal Society of London B, 1990, 330 235–251.

11. Jayawardena, A.W. & Lai, F. Analysis and prediction of chaos in rainfall and streamflow time series. Journal of Hydrology, 1994, 153 23–52.

12. Yakowitz, S. & Karlsson, M. Nearest neighbor methods with application to rainfall/runoff prediction. In: Stochastic Hydrology, ed. J. B. Macneil and G. J. Humphries, pp. 149–160, D. Reidel, Hingham, MA, 1987.

13. Yule, G. On a method of investigating periodicity in a disturbed series with special reference to Wolfer’s sunspot numbers. Philosophical Transactions of the Royal Society of London A, 1927, 226 267–298.

14. Bras, R. and Rodriguez-Iturbe, I., Random Functions and Hydrology. Addison Wesley, Reading, MA, 1985.

15. Weigend, A. S. and Gershenfeld, N. A., The future of time series: learning and understanding. In: Time Series Prediction: Forecasting the Future and Understanding the Past, ed. A.S. Weigend and N.A. Gershenfeld. Addison Wesley, Reading, MA, 1994.

16. Ruelle, D., Chemical kinetics and differentiable dynamical systems. In: Nonlinear Phenomena in Chemical Dynamics. Springer, Berlin, 1981.

17. Takens, F., Detecting strange attractors in turbulence. In: Dynamical Systems and Turbulence, Warwick, 1980, ed. D. A. Rand and L.-S. Young. Lecture Notes in Mathematics, Vol. 898, pp. 366–381. Springer-Verlag, Berlin, 1981.

18. Sauer, T., Time series prediction by using delay coordinate embedding. In: Time Series Prediction: Forecasting the Future and Understanding the Past. Addison Wesley, Reading, MA, 1994.

19. Tsonis, A. A., Chaos, from Theory to Applications. Plenum, New York, 1992.

20. Wallis, J. R., Lettenmaier, D. P. and Wood, E., A daily hydro-climatological data set for the continental U.S. Research Report, IBM Research Division RC 16607, #7045, 1990.
