
Hydrological Sciences-Journal-des Sciences Hydrologiques, 46(5) October 2001, p. 781

Analysis of cross-correlated chaotic streamflows

AMIN ELSHORBAGY Kentucky Water Research Institute, University of Kentucky, Lexington, KY 40506-0107, USA e-mail: aelsh0@engr.uky.edu

U. S. PANU Civil Engineering Department, Lakehead University, Thunder Bay, Ontario P7B 5E1, Canada e-mail: umed.panu@lakeheadu.ca

S. P. SIMONOVIC Department of Civil and Environmental Engineering, University of Western Ontario, London, Ontario N6A 5B9, Canada e-mail: simonovic@uwo.ca

Abstract An attempt is made to explore the applicability of chaos analysis beyond the commonly reported analysis of a single chaotic time series. Two cross-correlated streamflows, the Little River and the Reed Creek, Virginia, USA, are analysed with regard to chaotic behaviour. Segments of missing data are assumed in one of the time series and estimated using the other, complete, time series. Linear regression and artificial neural network models are employed. Two experiments are conducted in the analysis: (a) fitting one global model and (b) fitting multiple local models. Each local model is fitted in the direct vicinity of the missing data. A nonlinear noise reduction method is used to reduce the noise in both time series and the two experiments are repeated. It is found that using multiple local models to estimate the missing data is superior to fitting one global model with regard to the mean squared error and the mean relative error of the estimated values. This result is attributed to the chaotic behaviour of the streamflows under consideration.

Key words chaos; missing data; noise reduction; neural network; regression; local approximation; global approximation

Analysis of chaotic streamflow series exhibiting cross-correlation (Analyse de séries de débits chaotiques présentant une corrélation croisée)

Résumé (translated) We explore the application of chaos theory outside the frequently used setting of a single chaotic time series. We analyse the streamflow series of two rivers, the Little River and the Reed Creek in Virginia, USA, which are cross-correlated. Segments of data are assumed to be missing from one of the two series and are estimated from the second, complete, series using linear regression and neural network models. Two approaches are followed in the analysis: (a) fitting one global model and (b) fitting multiple local models. Each local model corresponds to the immediate vicinity of the missing data. A nonlinear method is used to reduce the noise in both time series, and the two approaches are then repeated. The mean squared error and the mean relative error show that using several local models to estimate the missing data gives better results than fitting a single global model. This result is explained by the chaotic behaviour of the streamflow series considered.

Key words (translated from "Mots clefs") chaos; missing data; noise reduction; neural network; regression; local approximation; global approximation

Open for discussion until 1 April 2002


INTRODUCTION

Over the past decade, many case studies applying the principles of chaos to hydrological time series have been reported (e.g. Rodriguez-Iturbe et al., 1989; Jayawardena & Lai, 1994; Lall et al., 1996; Porporato & Ridolfi, 1997). Some of this work follows an exploratory approach to investigate the existence of chaos in hydrological time series (e.g. Sivakumar et al., 1999a). Other studies, such as Jayawardena & Lai (1994) and Porporato & Ridolfi (1997), have focused on the prediction of future values. The existence of noise in real time series and its effect on the prediction process was highlighted by Porporato & Ridolfi (1997) and further investigated by Sivakumar et al. (1999b) and Jayawardena & Gurung (2000).

The above-mentioned applications and other case studies focus mainly on the univariate case, where a single time series is under consideration. The growing belief in the applicability of chaos theory to water resources, along with indications that some hydrological time series might be chaotic rather than stochastic, raises important questions. In stochastic hydrology, multiple time series are considered using a multivariate time series analysis approach (Salas et al., 1985). Data generation, augmentation, and extension of short records (e.g. Hirsch, 1982) are always treated in the context of statistical and stochastic analysis, where statistical properties such as the mean and variance are parameters to be preserved. When hydrological data are shown to be chaotic (i.e. to exhibit the behaviour of nonlinear deterministic dynamics), the analysis has to be reformulated to fit the context of chaos analysis.

In this paper, two cross-correlated streamflow time series are analysed with regard to the existence of chaos. One of the time series, named the reference river, is used to estimate missing data of the other series, referred to as the target river. The main objective of the analysis described herein is to study the effect of the existence of nonlinear deterministic dynamics (chaotic behaviour) on the process of analysing cross-correlated hydrological time series for data in-filling purposes.

ANALYSIS OF CHAOTIC TIME SERIES

Reconstruction of the phase space

Usually, the first step in the process of chaos analysis is to reconstruct the dynamics in a phase space. The method of Takens (1981) is an appropriate method for reconstructing a phase space from a single time series. The dynamics of a scalar time series $\{x_1, x_2, \ldots, x_n\}$ is embedded in the m-dimensional phase space (m > d, where d is the dimension of the attractor). The phase space is defined by:

$$Y_t = \left[x_t,\; x_{t-\tau},\; x_{t-2\tau},\; \ldots,\; x_{t-(m-1)\tau}\right] \qquad (1)$$

where $\tau$ is the time delay that can be chosen with the help of the autocorrelation function or the mutual information content. Details have been reported by Jayawardena & Lai (1994) and others, and they are excluded from this paper to avoid duplication.
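As an illustration of equation (1), the following minimal Python sketch builds the delay vectors from a scalar series. The function name `delay_embed` and the toy series are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Return the delay vectors Y_t = [x_t, x_{t-tau}, ..., x_{t-(m-1)tau}] as rows."""
    x = np.asarray(x, dtype=float)
    start = (m - 1) * tau              # first index t that has a full history
    if start >= len(x):
        raise ValueError("series too short for this m and tau")
    # column k holds x_{t - k*tau} for t = start, ..., n-1
    cols = [x[start - k * tau: len(x) - k * tau] for k in range(m)]
    return np.column_stack(cols)

# Toy series 0, 1, ..., 9 embedded with m = 3 and tau = 2
series = np.arange(10.0)
Y = delay_embed(series, m=3, tau=2)
print(Y[0], Y[-1])
```

Each row of `Y` is one point of the reconstructed phase space; the first usable time index is (m - 1) times the delay, so the embedded set is shorter than the original series.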


Correlation dimension

To estimate the fractal dimension of a time series, the concept of correlation dimension is useful and often applied (Kantz & Schreiber, 1997). The method of correlation dimension is well explained in detail in Lall et al. (1996). For an m-dimensional phase space, the correlation integral C(r) given by Theiler (1986) is:

$$C(r) = \lim_{N \to \infty} \frac{2}{N(N-1)} \sum_{1 \le i < j \le N} H\left(r - \left|Y_i - Y_j\right|\right) \qquad (2)$$

where H is the Heaviside step function, with $H(u) = 1$ for $u > 0$ and $H(u) = 0$ for $u \le 0$; N is the number of points on the reconstructed attractor; and r is the radius of the sphere centred on $Y_i$ or $Y_j$. If the phenomenon is chaotic, for a large number of points, beyond a certain m the correlation integral follows the power law:

$$C(r) \approx \alpha\, r^{\nu} \quad (r \to 0) \qquad (3)$$

where $\alpha$ is a constant and $\nu$ is the correlation dimension. The slope of the log C(r) vs log r plot is given by:

$$\nu = \lim_{r \to 0} \frac{\log C(r)}{\log r} \qquad (4)$$

For a random process, $\nu$ varies linearly with increasing m, without reaching a saturation value, whereas for a deterministic process the value of $\nu$ saturates (levels off) after a certain m. The saturation value, d, is the fractal dimension of the attractor or the time series.
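The correlation sum and its log-log slope can be sketched in a few lines of Python. This is a finite-sample illustration only: the function names are hypothetical, the Euclidean norm is assumed for the distance between points, and the slope is obtained by a least-squares fit over a user-chosen range of radii rather than by taking the limit of small r.

```python
import numpy as np

def pairwise_distances(Y):
    """Euclidean distances |Y_i - Y_j| for all pairs i < j."""
    Y = np.asarray(Y, dtype=float)
    diff = Y[:, None, :] - Y[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(Y), k=1)]

def correlation_sum(dists, r):
    """Finite-N estimate of C(r): fraction of pairs closer than r."""
    return np.count_nonzero(dists < r) / len(dists)

def correlation_exponent(dists, radii):
    """Least-squares slope of log C(r) against log r."""
    radii = np.asarray(radii, dtype=float)
    c = np.array([correlation_sum(dists, r) for r in radii])
    keep = c > 0                       # log C(r) is undefined for empty spheres
    slope, _ = np.polyfit(np.log(radii[keep]), np.log(c[keep]), 1)
    return slope

# Sanity check on a set of known dimension: points along a straight line
t = np.linspace(0.0, 1.0, 1000)
line = np.column_stack([t, 2.0 * t])
dists = pairwise_distances(line)
nu = correlation_exponent(dists, np.geomspace(0.05, 0.3, 8))
print(round(nu, 2))
```

For points on a line the estimated exponent should be close to 1. In practice the exponent is computed for a range of embedding dimensions m, and saturation of the estimate with increasing m is what indicates deterministic behaviour.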

There are other invariants and tools, such as Lyapunov exponents, Kolmogorov entropy, and surrogate data methods, that can be used as auxiliary means of investigating the existence of chaotic behaviour in a time series. However, the correlation dimension is the key invariant; it is commonly used and is employed herein.

NOISE IN HYDROLOGICAL DATA

The concept of noise in hydrological data is not uniquely linked to the adoption of chaos principles. It has been acknowledged and handled in applications related to stochastic hydrology. Matalas (1967) employed the lag-one Markov process, which includes a random component that has zero mean and unit variance and is independent of the signal ($x_t$), for generating synthetic data. This component was sometimes called the error term or the independent stochastic component, and it has a zero autocorrelation coefficient. D'Astous & Hipel (1979) used the expression "white noise" when they introduced a stochastic intervention model. In general, the white noise component in stochastic models is considered to include any measurement error and/or components that are not captured by the model.


On the other hand, chaos analysis of time series depends on the computation of invariants such as the correlation dimension and Lyapunov exponents. The fact that time signals from natural phenomena, such as hydrological time series, are corrupted by noise makes such invariants difficult to estimate. Measured time signals will always contain some noise due to random influences and inaccuracies that can never be eliminated completely (Schouten et al., 1994). Hydrological time series are no exception.

Two types of noise, measurement noise and dynamical noise, may exist in chaotic time series. Measurement noise, which is sometimes called additive noise, refers to the corruption of observations by errors independent of the dynamics (Kantz & Schreiber, 1997). The dynamics satisfies:

$$X_{t+1} = F(X_t) \qquad (5)$$

but scalars are measured:

$$y_t = y(X_t) + \eta_t \qquad (6)$$

where $y(X)$ is a smooth function that maps points on the attractor to real numbers, and the $\{\eta_t\}$ are random numbers. The series $\{\eta_t\}$ is referred to as the measurement noise. In contrast, dynamical noise is a feedback process wherein the system is perturbed by a small random amount at each time step:

$$X_{t+1} = F(X_t + \eta_t) \qquad (7)$$

Details of the noise issue can be found in Elshorbagy et al. (2000b) and Kantz & Schreiber (1997). The problem of dynamical noise is much more complex than that of additive noise (Schouten et al., 1994), and removing it from the signal may not be justified, if it is possible at all. Only the removal or reduction of the additive (measurement) noise is addressed herein.
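The difference between the two noise types can be made concrete with a toy system. The sketch below is illustrative only: it uses the logistic map as a stand-in for the unknown dynamics F, takes the measurement function y(X) to be the identity, and clips the perturbed state so that it stays in the domain of the map. None of these choices comes from the paper.

```python
import numpy as np

def logistic(x, a=3.8):
    """Toy chaotic map used here in place of the unknown dynamics F."""
    return a * x * (1.0 - x)

def simulate(n, x0=0.3, noise_std=0.0, dynamical=False, seed=0):
    """Return (state, observation, noise) for measurement or dynamical noise."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(0.0, noise_std, size=n)
    x = np.empty(n)
    x[0] = x0
    for t in range(n - 1):
        # dynamical noise perturbs the state before the map is applied
        arg = x[t] + eta[t] if dynamical else x[t]
        arg = min(max(arg, 0.0), 1.0)      # keep the state inside [0, 1]
        x[t + 1] = logistic(arg)
    # measurement noise is added to the observation only, after the dynamics
    y = x.copy() if dynamical else x + eta
    return x, y, eta

clean, _, _ = simulate(200)
x_meas, y_meas, eta_meas = simulate(200, noise_std=0.01)
x_dyn, y_dyn, _ = simulate(200, noise_std=0.01, dynamical=True)
print(np.max(np.abs(x_meas - clean)), np.max(np.abs(x_dyn - clean)))
```

With measurement noise the underlying trajectory is unchanged and the noise can in principle be subtracted; with dynamical noise the trajectory itself departs from the noise-free one, which is why its removal is a much harder problem.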

Effect of noise

The computation of the chaotic invariants is based on the assumption that the data are noise-free. The existence of noise complicates the process of computing the invariants (Kostelich & Schreiber, 1993). When hydrologists started adopting the analysis of chaotic time series, their efforts were focused on proving the existence of chaotic behaviour in hydrological phenomena (e.g. rainfall and streamflow). Prediction then emerged as an important hydrological application, in which local nonlinear prediction models were applied to the raw data after the existence of chaos had been established (Jayawardena & Lai, 1994; Lall et al., 1996). Sivakumar et al. (1999b) reported that prediction accuracy has not been highly satisfactory.

The unsatisfactory accuracy of prediction obtained in Porporato & Ridolfi (1997), Sivakumar et al. (1999b), and Jayawardena & Gurung (2000) has been attributed to the existence of noise contaminating the signal. Surprisingly, there has been no reference to the possibility that the low accuracy of prediction could have been caused by the inefficiency of the selected prediction model. Only a single local prediction model was employed in each of the above-mentioned case studies. Consequently, the adopted noise reduction approach and the accompanying improvement in prediction accuracy have been taken for granted as an indication of the existence of noise. The noise-reduced signal has been treated in the reported studies as representing the historical data.

To remove, reduce, or separate the noise component from the main signal, one of the following two approaches has to be adopted. The first is to fit an empirical model to the original data and separate the component that does not obey the model (as in ARMA-type models); in this case any prediction using that model should be compared to the original data. The second applies when either the clean signal is known, so that the noise can simply be removed from the data, or the true underlying dynamics governing the system is known and can be perfectly modelled, so that the perfect model can be used to eliminate the noise. Following the second approach implies that any prediction process can be conducted using the clean signal. For hydrological signals (rainfall, streamflow, tree rings, etc.), the second approach cannot be adopted because neither the clean signal nor the underlying dynamics of the system is known. The first approach, on the other hand, might appeal to hydrologists (or to any analysts of real data). In stochastic hydrology, modellers adhere to the first approach by comparing predicted values to the observed ones, and also ensure that the residuals (noise) are independent and do not carry any structure that can be further modelled. In the reported applications of noise reduction in hydrology, the second approach has apparently been followed, raising doubts about both components: the clean (modelled) signal and the left-out (alleged noise) component.

Noise reduction

The issue of noise reduction in hydrological data and its reliability is discussed in Elshorbagy et al. (2000b) based on the results obtained by two noise reduction methods: Schreiber & Grassberger (1991) and Schreiber (1993). The noise reduction method reported by Grassberger et al. (1993) is employed herein because of its superior performance, as indicated by Jayawardena & Gurung (2000). This method, known as the locally projective nonlinear noise reduction method, is outlined briefly below.

Locally projective nonlinear noise reduction method

This method makes use of the hypothesis that the measured data are composed of the output of a low-dimensional dynamical system and of random or high-dimensional noise. The effect of noise is to spread the data off the low-dimensional manifold. The idea of the projective nonlinear noise reduction scheme is to identify the manifold and to project the data onto it (Hegger et al., 1999).

Suppose the dynamical system forms a q-dimensional manifold $V$ containing the trajectory. In the absence of noise, all embedding vectors $y_i$ would lie inside another manifold $\tilde{V}$ in the embedding space. For each $y_i$, there exists a correction $\Delta y_i$, with $\|\Delta y_i\|$ small, such that $y_i - \Delta y_i \in \tilde{V}$ and $\Delta y_i$ is orthogonal to $\tilde{V}$. So, vectors have to be over-embedded in m-dimensional spaces with m > q. This idea is realized through the TISEAN software (Hegger et al., 1999).
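The projection idea can be illustrated with a deliberately simplified Python sketch. It is not the TISEAN routine and omits several refinements of the published method: here each delay vector is projected onto the q leading principal directions of its k nearest neighbours, and the corrected copies of each observation are then averaged. The function names, the unit delay, the use of k nearest neighbours, and the sine-wave test signal are all assumptions made for this example.

```python
import numpy as np

def embed(x, m):
    """Unit-delay embedding: row t is [x_t, x_{t+1}, ..., x_{t+m-1}]."""
    return np.column_stack([x[k: len(x) - m + 1 + k] for k in range(m)])

def local_projection(x, m=5, q=2, k=30):
    """One pass of a simplified local projection onto q-dimensional subspaces."""
    x = np.asarray(x, dtype=float)
    Y = embed(x, m)
    n = len(Y)
    d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    cleaned = np.empty_like(Y)
    for i in range(n):
        nbrs = Y[np.argsort(d2[i])[:k]]        # k nearest neighbours, incl. self
        centre = nbrs.mean(axis=0)
        # principal directions of the neighbourhood from its SVD
        _, _, vt = np.linalg.svd(nbrs - centre, full_matrices=False)
        basis = vt[:q]                         # rows span the local subspace
        cleaned[i] = centre + (Y[i] - centre) @ basis.T @ basis
    # each x_t appears in up to m delay vectors; average the corrected copies
    total = np.zeros(len(x))
    count = np.zeros(len(x))
    for j in range(m):
        total[j: j + n] += cleaned[:, j]
        count[j: j + n] += 1
    return total / count

def rms(e):
    return float(np.sqrt(np.mean(e ** 2)))

rng = np.random.default_rng(1)
t = np.arange(800)
clean = np.sin(2.0 * np.pi * t / 40.0)
noisy = clean + 0.1 * rng.normal(size=t.size)
denoised = local_projection(noisy)
print(rms(noisy - clean), rms(denoised - clean))
```

In this synthetic setting the clean signal is known, so the reduction in error can be checked directly. For measured streamflows no such reference exists, which is the difficulty discussed in the preceding section.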


ESTIMATION OF MISSING DATA

The estimation of missing data is one of the important issues to be addressed in hydrology. Methodologies or techniques that help improve the accuracy of estimating missing observations are highly significant and useful for hydrological modelling, which requires complete data sets. The problem of estimating missing segments (consecutive observations) in streamflow records has been reported recently (Elshorbagy et al., 2000; Panu et al., 2000). In this paper, a complete set of streamflow data is used as a reference river and another concurrent set of complete streamflow data, which is cross-correlated with the reference river, is used as a target river. The data sets are divided into training (calibration) and testing (verification) sets. A few patches of consecutive observations are removed from the data set of the target river to test the estimation procedures.

Two techniques are used to estimate the missing data: a linear regression (LR) technique and the artificial neural networks (ANNs) as a representative tool of the nonlinear approximation methods. The feed-forward ANNs that employ the back-propagation (BP) training technique are selected for the study under consideration.

CASE STUDY

The daily streamflow data of the Little River (USGS # 03170000) and Reed Creek (USGS # 03167000), Virginia, USA, are used in this study. The cross-correlation coefficient between the two time series is found to be equal to 0.76. A sequence of 20 000 concurrent observations of the two rivers is selected for this study. The Little River and the Reed Creek are used as the reference and target rivers, respectively. Seven patches, each of ten consecutive observations, are considered missing from the target river. The whole data set is divided into seven sections and one patch is randomly selected from each section so that the missing data may be representative of the entire data range.

Fig. 1 Correlation dimension of the time series: (a) Little River and (b) Reed Creek. Horizontal axis: embedding dimension (m).

Correlation dimension

A lag time of ten (i.e. 10 days) is used in the analysis of the correlation dimension. The correlation integral (sum) C(r) and the correlation exponent ν are computed, as explained above, from the data set. Figure 1 shows that the correlation exponent (dimension) increases with the embedding dimension up to a certain point (m = 19) and saturates beyond that point. The saturation values of the correlation exponent (dimension) are approximately 1.19 and 1.07 for the Little River and the Reed Creek, respectively. The nearest integer above the correlation dimension value (d = 2) is the minimum dimension of the phase space that can embed the attractor. The value of m at the saturation point (m = 19) is supposed to provide a sufficient number of variables to describe the dynamics of the attractor (Kantz & Schreiber, 1997).

RESULTS AND ANALYSIS OF ESTIMATION OF MISSING DATA

Two experiments are conducted to test the ability of both local and global modelling approaches to handle chaotic time series. Two standard techniques, linear regression (LR) and artificial neural networks (ANNs), are used in the analysis. These two experiments are designed as follows:

(a) The training data are used to train one single model, which is verified using the testing data. The testing data are unseen during the training and assumed to be missing. It is also worth mentioning that the testing data are selected to represent seven sections that cover the entire data set. The single model is called the global model because it is trained to generalize over the various data sections.

(b) Seven different local models are trained to represent the different sections of the data. Each model is verified on a test data set that represents the section of the data within which the model is trained. The models are called local models because each model is trained to generalize over the data pattern of a specific data section.

The mean squared error (MSE) and the mean relative error (MRE) are calculated as measures of accuracy of estimating the missing data. Other experiments can be designed as well, but the two mentioned above are considered to be sufficient in this study to present the idea of the effect of local and global modelling on the analysis of chaotic time series. The actual and estimated missing data, using LR and ANN models, of the Reed Creek are shown in Figure 2. One can observe that using multiple local models (sub-models) can improve the accuracy of estimating the missing data.
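The two experiments can be sketched for the linear regression case as follows. This is a minimal illustration under stated assumptions: the MRE is taken here as the mean of |estimated - observed| / |observed|, the local model is trained on a hypothetical `window` of known observations on each side of the missing patch, and the data are synthetic (a reference series whose relation to the target changes halfway through), not the streamflow records used in the study.

```python
import numpy as np

def mse(observed, estimated):
    """Mean squared error of the estimated values."""
    observed, estimated = np.asarray(observed, float), np.asarray(estimated, float)
    return float(np.mean((observed - estimated) ** 2))

def mre(observed, estimated):
    """Mean relative error, assumed here to be mean(|est - obs| / |obs|)."""
    observed, estimated = np.asarray(observed, float), np.asarray(estimated, float)
    return float(np.mean(np.abs(estimated - observed) / np.abs(observed)))

def estimate_patch(ref, target, missing, window=None):
    """Estimate target[missing] from ref[missing] by linear regression.

    window=None fits one global model on all known data; otherwise a local
    model is fitted on `window` known observations either side of the patch.
    """
    known = np.ones(len(target), dtype=bool)
    known[missing] = False                     # the patch is treated as unseen
    if window is not None:
        near = np.zeros(len(target), dtype=bool)
        near[max(missing.start - window, 0): missing.stop + window] = True
        known &= near
    slope, intercept = np.polyfit(ref[known], target[known], 1)
    return slope * ref[missing] + intercept

# Synthetic example: the reference-target relation differs between sections
ref = np.linspace(1.0, 10.0, 200)
target = np.where(np.arange(200) < 100, 2.0 * ref, 3.0 * ref)
missing = slice(150, 160)

est_global = estimate_patch(ref, target, missing)
est_local = estimate_patch(ref, target, missing, window=20)
print(mse(target[missing], est_global), mse(target[missing], est_local))
```

In this contrived case the local model recovers the patch almost exactly because the relation is linear within the section, whereas the global model averages over both sections. The results for real streamflows depend on the data.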

Nonlinear noise reduction

The locally projective nonlinear noise reduction method (Grassberger et al., 1993) is applied in this study to reduce the noise of both target and reference rivers. Four different values of q are considered and the corresponding removed noise level is calculated. For each noise-reduced data set, the cross-correlation between the Little River and the Reed Creek is calculated and summarized in Table 1. The stopping criterion for removing noise is a disputed issue in the chaos-related literature. Porporato & Ridolfi (1997) kept on removing noise until the amount of noise removed was not worth the time and effort consumed to do so, whereas Sivakumar et al. (1999b) used the improvement in prediction accuracy as an indication of the correctness of the level of reduced noise. Both approaches have been critically reviewed by Elshorbagy et al. (submitted) and their deficiencies have been highlighted.

Fig. 2 Actual and estimated data of the Reed Creek raw data using (a) linear regression and (b) artificial neural networks. Horizontal axis: days. Series shown: actual, one global model (1-model), and local models (sub-models).

Table 1 Different levels of reduced noise and the corresponding cross-correlation values.

Projection dimension   Noise level reduced (%)        Cross-correlation
                       Little River    Reed Creek
19                     0.0             0.0            0.733
17                     2               1.9            0.733
15                     3.7             3.7            0.732
10                     9.1             9.5            0.720
5                      18.7            20.6           0.709

The removed noise is assessed in two ways; first, the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the noise component are investigated to ensure the randomness and the absence of internal structure. Second, the cross-correlation between the noise-reduced signals of the two time series is calculated to observe the effect of noise reduction on the cross-correlation structure of the two rivers. At q values of 10 and 5, the cross-correlation starts to deteriorate. Further, a slow decay of the ACF and sudden cut-off of the PACF are observed, which is an indication of possible existence of autoregressive-type structure. At q values of 17 and 15, the cross-correlation does not decrease and less significant autocorrelation exists. Therefore, a noise level of 3.7%, which is reduced at q value of 15, is considered in this paper (see Figs 3 and 4).

Fig. 3 Correlation structure of the 18.7% (q = 5) noise removed from the Little River data: (a) autocorrelation function and (b) partial autocorrelation function.

Fig. 4 Correlation structure of the 3.7% (q = 15) noise removed from the Little River data: (a) autocorrelation function and (b) partial autocorrelation function.
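The ACF and PACF check on a removed noise component can be sketched as below. The function names are hypothetical, the PACF is computed with the Durbin-Levinson recursion, and the approximate 95% bound of 1.96/sqrt(N) for white noise is a standard large-sample rule rather than something stated in the paper. The example series is a synthetic AR(1) process, used because it shows the pattern described above (slowly decaying ACF, PACF cutting off after lag one).

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

def pacf(x, max_lag):
    """Sample partial autocorrelation via the Durbin-Levinson recursion."""
    rho = acf(x, max_lag)
    out = np.zeros(max_lag + 1)
    out[0] = 1.0
    phi = np.array([rho[1]])                  # AR(1) coefficient
    out[1] = rho[1]
    for k in range(2, max_lag + 1):
        num = rho[k] - np.dot(phi, rho[k - 1:0:-1])
        den = 1.0 - np.dot(phi, rho[1:k])
        phi_kk = num / den
        phi = np.append(phi - phi_kk * phi[::-1], phi_kk)
        out[k] = phi_kk
    return out

def significant_lags(values, n):
    """Lags (>= 1) whose value exceeds the approximate 95% white-noise bound."""
    bound = 1.96 / np.sqrt(n)
    return [k for k in range(1, len(values)) if abs(values[k]) > bound]

# Synthetic residual with autoregressive structure: x_t = 0.7 x_{t-1} + e_t
rng = np.random.default_rng(0)
e = rng.normal(size=5000)
x = np.zeros(5000)
for t in range(1, 5000):
    x[t] = 0.7 * x[t - 1] + e[t]

r = acf(x, 15)
p = pacf(x, 15)
print(significant_lags(r, len(x)), significant_lags(p, len(x)))
```

A removed component that behaves like this synthetic series carries structure that could be modelled further, so treating it as pure noise would be questionable.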

The experiments conducted with the raw data are repeated with the 3.7% noise-reduced time series. The actual and estimated missing data, using LR and ANN models, of the Reed Creek are shown in Fig. 5. As with the raw data, one can observe that using multiple local models (sub-models) can improve the accuracy of estimating the missing data.

Fig. 5 Actual and estimated data of the Reed Creek noise-reduced data using (a) linear regression and (b) artificial neural networks. Series shown: actual, one global model (1-model), and local models (sub-models).

DISCUSSION

The MSE and the MRE of estimating the missing data resulting from different experiments are given in Table 2. It is evident that ANN models are superior to the LR models given that the experiment type is fixed (i.e. either one global model or local models are considered). The nonlinearity in the data, which is not captured by the linear regression models, is better modelled using a nonlinear technique such as the ANNs. More importantly, one can observe the superiority of the local models over the single global models irrespective of the adopted technique, LR or ANNs.


Table 2 The mean squared and the mean relative errors of the estimated values.

Experiment            Raw data           Noise-reduced data
                      MSE      MRE       MSE      MRE
LR (1-model)          32.30    0.55      33.49    0.54
LR (local models)     19.64    0.23      20.65    0.24
ANN (1-model)         26.79    0.31      29.78    0.30
ANN (local models)    15.29    0.17      15.00    0.18

The nonlinearity that is inferred by the superiority of the ANNs over LR is complemented by the improvement of modelling accuracy when local models are employed. It is believed that the superiority of nonlinear local models in estimating the missing data can be attributed to the existence of chaos in the two cross-correlated time series considered in this study. This result supports the argument provided by others that local models can simulate chaotic time series better than global ones (e.g. Porporato & Ridolfi, 1997).

The values presented in Table 2 also indicate that the process of nonlinear noise reduction has not helped in improving the accuracy of estimating the missing data. Neither the MSE nor the MRE suggests that the correlation between the reference and the target rivers improves after reducing the "alleged" noise. This result is consistent with the fact that the cross-correlation values, given in Table 1, do not increase with the increase of the removed noise level. The cross-correlation value starts to deteriorate when higher levels of "alleged" noise are removed, which raises serious doubts about the utility and validity of the currently used noise reduction concept in chaos-related applications in hydrology. The slight difference in the error values between the outputs of models using raw and noise-reduced data can be neglected. However, in some cases, a significant deterioration of the accuracy of estimating the missing data or predicting future values may occur. Such a result has been attributed by Elshorbagy et al. (submitted) to the possible removal of a significant part of the signal during the noise reduction process.

In a study of stochastic models of streamflows, Mujumdar & Nagesh Kumar (1990) showed that an autoregressive model, AR(1), is superior to other ARMA models that have more parameters, in terms of the MSE of forecasting future values. In their paper, the authors considered the results to be interestingly contrary to the common belief that models with a larger number of parameters give better forecasts. In light of the recently revealed chaotic behaviour of hydrological time series, the results of Mujumdar & Nagesh Kumar (1990) can be interpreted differently. Although both AR(1) and AR(10), for example, are global models, a forecast made using AR(1) uses only the value of the most recent observation. In the case of AR(10), a lengthier stretch of observations, 10 values, is utilized. That segment of observations may be longer than the window that constitutes the zone of influence for the future values to be forecast. This argument may explain why AR(1) resulted in a lower MSE than models that include more parameters. The comment of Mujumdar & Nagesh Kumar (1990) that "the simplest model is sufficient" may be reasonable if simple and more complicated models result in similar values of MSE. However, the deterioration that occurs when more parameters are included can be attributed to the existence of chaos.


CONCLUSION

A trial is made to model two cross-correlated rivers as chaotic time series. The existence of chaos in the daily flows of the Little River and the Reed Creek, Virginia, is investigated. Indication of chaotic behaviour is shown using the correlation analysis. Segments of missing data in the Reed Creek are estimated using linear regression (LR) and artificial neural network (ANN) models. Two experiments are conducted with both modelling techniques: (a) a single global model is fitted to estimate the missing data and (b) multiple local models are fitted in the direct vicinity of the missing data. The locally projective nonlinear noise reduction method is used to reduce the noise of the time series, and the experiments are repeated with the noise-reduced data. In general, ANN models show superiority over LR models in estimating the missing data. More significantly, local models are preferred over global models in terms of the MSE and MRE criteria, which is attributed to the chaotic behaviour of the time series. The noise reduction process is shown to be of little utility and significance to the accuracy of estimating the missing hydrological data. The role that the chaotic behaviour plays in this study highlights the importance of investigating the effects of chaoticity on other hydrological applications such as record extension, data generation, and multivariate analysis of chaotic data. When a time series is proved to be chaotic rather than stochastic, every application in stochastic hydrology should find its analogy in chaos analysis, which poses a research challenge for hydrologists.
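The two experiments summarized above can be illustrated with a small sketch that contrasts one global linear regression with a local regression fitted in the direct vicinity of a gap. This is an assumption-laden toy, not the models used in the study: the function names, the window size, and the synthetic series are hypothetical; the reference series is a logistic map used only as a stand-in; the target series is given a built-in change in its relationship to the reference so that the contrast is visible. The MRE is assumed here to be the mean of |estimate - observed| / observed.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def infill(reference, target, gap, window=None):
    """Estimate `target` over `gap` (a range of indices) from `reference`.

    window=None fits one global model on every index outside the gap;
    otherwise only `window` points on each side of the gap are used.
    """
    start, stop = gap.start, gap.stop
    if window is None:
        idx = [i for i in range(len(target)) if i < start or i >= stop]
    else:
        before = range(max(0, start - window), start)
        after = range(stop, min(len(target), stop + window))
        idx = list(before) + list(after)
    a, b = fit_line([reference[i] for i in idx], [target[i] for i in idx])
    return [a + b * reference[i] for i in gap]


def mse(est, obs):
    """Mean squared error."""
    return sum((e - o) ** 2 for e, o in zip(est, obs)) / len(obs)


def mre(est, obs):
    """Mean relative error (assumed definition: mean of |e - o| / o)."""
    return sum(abs(e - o) / o for e, o in zip(est, obs)) / len(obs)


def demo_series(n=200):
    """Hypothetical data: logistic-map reference, target with a regime shift."""
    x = [0.3]
    for _ in range(n - 1):
        x.append(3.9 * x[-1] * (1.0 - x[-1]))
    y = [(2.0 if t < n // 2 else 0.5) * v + 1.0 for t, v in enumerate(x)]
    return x, y


if __name__ == "__main__":
    ref, tgt = demo_series()
    gap = range(150, 160)                 # indices treated as missing
    observed = [tgt[i] for i in gap]
    for label, w in (("global", None), ("local", 20)):
        est = infill(ref, tgt, gap, window=w)
        print(label, mse(est, observed), mre(est, observed))
```

In this constructed case the local fit is favoured by design, because the gap and its neighbourhood lie entirely within one regime. With observed flows, the window size would have to be chosen from the data, and an ANN could replace `fit_line` without changing the global-versus-local structure of the experiment.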

Acknowledgments The authors wish to acknowledge the financial support given to this research by the Natural Sciences and Engineering Research Council (NSERC) of Canada through grant No. OGP-0004404. The daily streamflow data provided by Dr. Hirsch of the USGS are gratefully acknowledged.

REFERENCES

D'Astous, F. & Hipel, K. W. (1979) Analyzing environmental time series. J. Environ. Engng Div. ASCE 105(EE5), 979-992.

Elshorbagy, A., Panu, U. S. & Simonovic, S. P. (2000) Group-based estimation of missing hydrological data. I. Approach and general methodology. Hydrol. Sci. J. 45(6), 849-866.

Elshorbagy, A., Simonovic, S. P. & Panu, U. S. (submitted) Noise reduction in chaotic hydrologic time series: facts and doubts. J. Hydrol.

Grassberger, P., Hegger, R., Kantz, H., Schaffrath, C. & Schreiber, T. (1993) On noise reduction methods for chaotic data. Chaos 3, 127-141.

Hegger, R., Kantz, H. & Schreiber, T. (1999) Practical implementation of nonlinear time series methods: the TISEAN package. Chaos 9, 413-440.

Hirsch, R. M. (1982) A comparison of four streamflow record extension techniques. Wat. Resour. Res. 18(4), 1081-1088.

Jayawardena, A. W. & Gurung, A. B. (2000) Noise reduction and prediction of hydrometeorological time series: dynamical systems approach vs. stochastic approach. J. Hydrol. 228, 242-264.

Jayawardena, A. W. & Lai, F. (1994) Analysis and prediction of chaos in rainfall and streamflow time series. J. Hydrol. 153, 23-52.

Kantz, H. & Schreiber, T. (1997) Nonlinear Time Series Analysis. Cambridge University Press, Cambridge, UK.

Kostelich, E. J. & Schreiber, T. (1993) Noise reduction in chaotic time-series data: a survey of common methods. Phys. Rev. E 48(3), 1752-1763.

Lall, U., Sangoyomi, T. & Abarbanel, H. D. I. (1996) Nonlinear dynamics of the Great Salt Lake: nonparametric short-term forecasting. Wat. Resour. Res. 32(4), 975-985.

Matalas, N. C. (1967) Mathematical assessment of synthetic hydrology. Wat. Resour. Res. 3(4), 937-945.

Mujumdar, P. P. & Nagesh Kumar, D. (1990) Stochastic models of streamflow: some case studies. Hydrol. Sci. J. 35(4), 395-410.


Panu, U. S., Khalil, M. & Elshorbagy, A. (2000) Streamflow data infilling techniques based on concepts of groups and neural networks. Chapter 12 in: Artificial Neural Networks in Hydrology (ed. by R. S. Govindaraju & R. Rao), 235-258. Kluwer, Dordrecht, The Netherlands.

Porporato, A. & Ridolfi, L. (1997) Nonlinear analysis of river flow time sequences. Wat. Resour. Res. 33(6), 1353-1367.

Rodriguez-Iturbe, I., De Power, B. F., Sharifi, M. B. & Georgakakos, K. P. (1989) Chaos in rainfall. Wat. Resour. Res. 25(7), 1667-1675.

Salas, J. D., Delleur, J. W., Yevjevich, V. & Lane, W. L. (1980) Applied Modelling of Hydrologic Time Series. Water Resources Publications, Littleton, Colorado, USA.

Salas, J. D., Tabios, G. Q., III & Bartolini, P. (1985) Approaches to multivariate modelling of water resources time series. Wat. Resour. Bull. 21(4), 683-708.

Schouten, J. C. & van den Bleek, C. M. (1994) RRCHAOS: a menu driven software package for chaotic time series analysis. Reactor Research Foundation, Delft, The Netherlands.

Schouten, J. C., Takens, F. & van den Bleek, C. M. (1994) Estimation of the dimension of a noisy attractor. Phys. Rev. E 50(3), 1851-1861.

Schreiber, T. (1993) Extremely simple nonlinear noise-reduction method. Phys. Rev. E 47(4), 2401-2404.

Schreiber, T. & Grassberger, P. (1991) A simple noise-reduction method for real data. Phys. Lett. A 160(5), 411-418.

Sivakumar, B. (2000) Chaos theory in hydrology: important issues and interpretations. J. Hydrol. 227, 1-20.

Sivakumar, B., Liong, S., Liaw, C. & Phoon, K. (1999a) Singapore rainfall behavior: chaotic? J. Hydrol. Engng ASCE 4(1), 38-48.

Sivakumar, B., Phoon, K., Liong, S. & Liaw, C. (1999b) A systematic approach to noise reduction in chaotic hydrological time series. J. Hydrol. 219, 103-135.

Takens, F. (1981) Detecting strange attractors in turbulence. In: Dynamical Systems and Turbulence (ed. by D. A. Rand & L. S. Young), 366-381. Springer-Verlag, New York, USA.

Theiler, J. (1986) Spurious dimension from correlation algorithms applied to limited time-series data. Phys. Rev. A 34, 2427-2432.

Tsonis, A. A. & Elsner, J. B. (1988) The weather attractor over very short time scales. Nature 333, 545-547.

Received 12 December 2000; accepted 21 May 2001
