prediction of rainfall time series using modular

76
Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167 1 Prediction of Rainfall Time Series Using Modular Artificial Neural Networks Coupled 1 with Data Preprocessing Techniques 2 C. L. Wu 1,2 , K. W. Chau 1, *, and C. Fan 3 3 1 Dept. of Civil and Structural Engineering, Hong Kong Polytechnic University, 4 Hung Hom, Kowloon, Hong Kong, People’s Republic of China 5 6 2 Changjiang Institute of Survey, Planning, Design and Research, 7 Changjiang Water Resources Commission, 8 430010, Wuhan, HuBei, People’s Republic of China 9 10 3 Department of Civil Engineering, Ryerson University, 11 350 Victoria Street, Toronto, Ontario, Canada, M5B 2K3 12 13 *Email: [email protected] 14 15 ABSTRACT 16 This study is an attempt to seek a relatively optimal data-driven model for rainfall forecasting from three aspects: 17 model inputs, modeling methods, and data preprocessing techniques. Four rain data records from different 18 regions, namely two monthly and two daily series, are examined. A comparison of seven input techniques, 19 either linear or nonlinear, indicates that linear correlation analysis (LCA) is capable of identifying model inputs 20 reasonably. A proposed model, modular artificial neural network (MANN), is compared with three benchmark 21 models, viz. artificial neural network (ANN), K-nearest-neighbors (K-NN), and linear regression (LR). 22 Prediction is performed in the context of two modes including normal mode (viz., without data preprocessing) 23 and data preprocessing mode. Results from the normal mode indicate that MANN performs the best among all 24 four models, but the advantage of MANN over ANN is not significant in monthly rainfall series forecasting. 25 Under the data preprocessing mode, each of LR, K-NN and ANN is respectively coupled with three data 26 This is the Pre-Published Version.

Upload: others

Post on 04-Dec-2021

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

1

Prediction of Rainfall Time Series Using Modular Artificial Neural Networks Coupled 1 

with Data Preprocessing Techniques 2 

C. L. Wu1,2, K. W. Chau1,*, and C. Fan3 3 

1 Dept. of Civil and Structural Engineering, Hong Kong Polytechnic University, 4 

Hung Hom, Kowloon, Hong Kong, People’s Republic of China 5 

2 Changjiang Institute of Survey, Planning, Design and Research, 7 

Changjiang Water Resources Commission, 8 

430010, Wuhan, HuBei, People’s Republic of China 9 

10 

3 Department of Civil Engineering, Ryerson University, 11 

350 Victoria Street, Toronto, Ontario, Canada, M5B 2K3 12 

13 

*Email: [email protected] 14 

15 

ABSTRACT 16 

This study is an attempt to seek a relatively optimal data-driven model for rainfall forecasting from three aspects: 17 

model inputs, modeling methods, and data preprocessing techniques. Four rain data records from different 18 

regions, namely two monthly and two daily series, are examined. A comparison of seven input techniques, 19 

either linear or nonlinear, indicates that linear correlation analysis (LCA) is capable of identifying model inputs 20 

reasonably. A proposed model, modular artificial neural network (MANN), is compared with three benchmark 21 

models, viz. artificial neural network (ANN), K-nearest-neighbors (K-NN), and linear regression (LR). 22 

Prediction is performed in the context of two modes including normal mode (viz., without data preprocessing) 23 

and data preprocessing mode. Results from the normal mode indicate that MANN performs the best among all 24 

four models, but the advantage of MANN over ANN is not significant in monthly rainfall series forecasting. 25 

Under the data preprocessing mode, each of LR, K-NN and ANN is respectively coupled with three data 26 

This is the Pre-Published Version.

Page 2: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

2

preprocessing techniques including moving average (MA), principal component analysis (PCA), and singular 27 

spectrum analysis (SSA). Results indicate that the improvement of model performance generated by SSA is 28 

considerable whereas those of MA or PCA are slight. Moreover, when MANN is coupled with SSA, results 29 

show that advantages of MANN over other models are quite noticeable, particularly for daily rainfall 30 

forecasting. Therefore, the proposed optimal rainfall forecasting model can be derived from MANN coupled 31 

with SSA. 32 

KEYWORDS 33 

Rainfall prediction, Modular artificial neural network, Moving Average, Principal component analysis, Singular 34 

spectral analysis, Fuzzy C-Means clustering, K-nearest-neighbors 35 

36 

1. Introduction 37 

Accurate and timely rainfall forecasting is crucial for reservoir operation and flooding 38 

prevention because it can provide an extension of lead-time of the flow forecasting, larger 39 

than the response time of the watershed, in particular for small and medium-sized 40 

mountainous basins. 41 

Many studies have been conducted for the quantitative precipitation forecasting using 42 

diverse techniques including numerical weather prediction models and remote sensing 43 

observations (Yates et al., 2000; Ganguly and Bras, 2003; Sheng, et al., 2006; Doomede et al., 44 

2008), statistical models (Chu and He, 1995; Chan and Shi, 1999; DelSole and Shukla, 2002; 45 

Munot, 2007; Li and Zeng, 2008; Nayagam et al., 2008), chaos theory-based approach 46 

(Jayawardena and Lai, 1994), K-nearest-neighbor (K-NN) method (Toth et al, 2000), and soft 47 

computing methods including artificial neural network (ANN), support vectors regression 48 

(SVR) and fuzzy inference system (Venkatesan et al., 1997; Silverman and Dracup, 2000; 49 

Page 3: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

3

Toth et al., 2000; Pongracz et al., 2001; Sivapragasam et al., 2001; Brath et al., 2002; Lin and 50 

Chen, 2005; Chattopadhyay and Chattopadhyay, 2007; Guhathakurta, 2008; Lin et al., 2009). 51 

Venkatesan et al. (1997) employed ANN to predict all India summer monsoon rainfall with 52 

different meteorological parameters as model inputs. Toth et al. (2000) applied three data-53 

driven models, auto-regressive moving average, ANN and K-NN, to short-term rainfall 54 

predictions. Results showed that ANN performed the best in terms of the accuracy of runoff 55 

forecasting when the predicted rainfalls by the three models were used as inputs of a rainfall-56 

runoff model. Pongracz et al. (2001) applied fuzzy inference to monthly rainfall prediction. 57 

Chattopadhyay and Chattopadhyay (2007) constructed an ANN model to predict monsoon 58 

rainfall in India depending on the rainfall series alone. 59 

Recently, the concept of coupling different models has attracted more attention in 60 

hydrologic forecasting. They can be broadly categorized into ensemble models and modular 61 

(or hybrid) models. The basic idea behind ensemble models is to build several different or 62 

similar models for the same process and to integrate them together (Shamseldin et al., 1997; 63 

Shamseldin and O’Connor, 1999; Xiong et al., 2001; Abrahart and See, 2002; Kim et al, 64 

2006). For example, Xiong et al. (2001) used a Takagi-Sugeno-Kang fuzzy technique to 65 

couple several conceptual rainfall-runoff models. Coulibaly et al. (2005) employed an 66 

improved weighted-average method to coalesce forecasted daily reservoir inflows from K-67 

NN, conceptual model and ANN. Kim et al. (2006) investigated five ensemble methods for 68 

improving stream flow prediction. 69 

Physical processes in rainfall and/or runoff are generally composed of a number of 70 

sub-processes. Their accurate modeling by building of a single global model is sometimes not 71 

possible (Solomatine and Ostfeld, 2008). Modular models were therefore proposed where 72 

Page 4: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

4

sub-processes were first of all identified and then separate models (also called local or expert 73 

model) were established for each of them (Solomatine and Ostfeld, 2008). Different modular 74 

models were proposed depending on the soft or hard splitting of training data. Soft splitting 75 

means the dataset can be overlapped and the overall forecasting output is the weighted-76 

average of each local model (Zhang and Govindaraju, 2000; Shrestha and Solomatine, 2006; 77 

Wu et al., 2008). Zhang and Govindaraju (2000) examined the performance of modular 78 

networks in predicting monthly discharges based on the Bayesian concept. Wu et al. (2008) 79 

employed a distributed SVR for daily river stage prediction. On the contrary, there is no 80 

overlap of data in the hard splitting and the final forecasting output is explicitly from only 81 

one of local models (See and Openshaw, 2000; Hu et al., 2001; Solomatine and Xue, 2004; 82 

Sivapragasam and Liong, 2005; Jain and Srinivasulu, 2006; Wang et al., 2006; Corzo and 83 

Solomatine, 2007; Lin and Wu, 2009). Hu et al. (2001) developed a range-dependent network 84 

which employs a number of multilayer perceptron neural networks to model the river flow in 85 

different flow bands of magnitude (e.g. high, medium and low). Their results indicated that 86 

the range-dependent network performed better than the conventional global ANN. 87 

Solomatine and Xue (2004) used M5 model trees and neural networks in a flood-forecasting 88 

problem. Sivapragasam and Liong (2005) divided the flow range into three regions, and 89 

employed different SVR models to predict daily flows in high, medium and low regions. 90 

Wang et al. (2006) used a crisp modular ANN to make soft or crisp predictions for validation 91 

data where each local network was trained using the subsets achieved by either a threshold 92 

discharge value or a clustering of input spaces. Lin and Wu (2009) proposed a hybrid ANN 93 

model for event-based hourly rainfall prediction where self-organizing map networks are 94 

Page 5: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

5

used for data cluster analysis and multilayer conceptron networks are employed to serve each 95 

cluster to construct mapping between input and output. 96 

A hydrological time series can be actually regarded as an integration of stochastic (or 97 

random) and deterministic components (Salas et al., 1985). Once the stochastic (noise) 98 

component is appropriately eliminated, the deterministic component can then be easily 99 

modeled. For the purpose of cleaning hydrological series, many data preprocessing 100 

techniques, including Principal component analysis (PCA), wavelet analysis (WA), and 101 

singular spectrum analysis (SSA), have been employed in hydrology field by researchers 102 

(Sivapragasam et al., 2001; Marques et al., 2006; Hu et al., 2007; Partal and Kişi, 2007; 103 

Sivapragasam et al., 2007;Wu et al., 2009). Hu et al. (2007) employed PCA as an input data 104 

preprocessing tool to improve the prediction accuracy of the rainfall-runoff neural network 105 

models. The use of WA to improve rainfall forecasting was conducted by Partal and Kişi 106 

(2007). Their results indicated that WA was promising. SSA has also been recognized as an 107 

efficient preprocessing algorithm to avoid the effect of discontinuous or intermittent signals, 108 

coupled with neural networks (or similar approaches) for time series forecasting (Lisi et al., 109 

1995; Sivapragasam et al., 2001; Baratta et al., 2003). For example, Lisi et al. (1995) applied 110 

SSA to extract the significant components in their study on southern oscillation index time 111 

series and used ANN for prediction. They reconstructed the original series by summing up 112 

the first “p” significant components. Sivapragasam et al. (2001) proposed a hybrid model of 113 

support vector machine (SVM) and SSA for rainfall and runoff predictions. The hybrid 114 

model resulted in a considerable improvement in the model performance in comparison with 115 

the original SVM model. A comparison between WA and SSA in Wu et al. (2009) indicated 116 

that SSA performed better than WA. In addition, moving average (MA) is used for data 117 

Page 6: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

6

preprocessing to improve the performance of ANN by de Vos and Rientjes (2005). They 118 

argued that one of reasons on lagged predictions of ANN was due to the use of previous 119 

observed data as ANN inputs and suggested that an effective solution was to obtain new 120 

model inputs by MA over the original data series. 121 

In this paper, one of the main purposes is to develop a modular ANN (MANN) 122 

coupled with appropriate data-preprocessing techniques to improve the accuracy of rainfall 123 

forecasting. MANN consists of three local models which are associated with three subsets 124 

clustered by the fuzzy C-means (FCM) clustering method. To evaluate MANN, LR, K-NN 125 

and ANN are employed for comparison. ANN is first used to choose the best model inputs by 126 

seven candidate model inputs techniques. Once all forecasting models are established, three 127 

data-preprocessing methods (i.e., MA, PCA, and SSA) can be examined. To ensure wider 128 

application of the conclusions, four cases consisting of two monthly rainfall series and two 129 

daily rainfall series from India and China, are investigated. The remaining part is structured 130 

as follows. Methodology is detailed in Section 2 where case studies are first described, and 131 

then data-preprocessing techniques and forecasting models are introduced. Section 3 presents 132 

modeling methods and their applications to four rainfall series. The optimal model input 133 

method and the best data preprocessing can be identified. In Section 4, principal results are 134 

shown along with relevant discussions. The last section presents main conclusions. 135 

2. Methodology 136 

2.1 Study Area and Data 137 

Two daily mean rainfall series from Daning and Zhenshui river basins of China, and 138 

two monthly mean rainfall series from India and Zhongxian of China, are analyzed. 139 

Page 7: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

7

The Daning River, a first-order tributary of the Yangtze River, is located at the 140 

northeastern side of Chongqing city. The daily rainfall data from Jan. 1, 1988 to Dec. 31, 141 

2007 were measured at six raingauges located at the upstream of the study basin (Figure 1). 142 

The upstream part is controlled by Wuxi hydrology station, with a drainage area of around 2 143 

000 km2. The mean areal rainfall series is calculated by the Thiessen polygon method 144 

(hereafter the averaged rainfall series is referred to as Wuxi). 145 

The Zhenshui basin is located at the northern side of Guangdong Province and 146 

adjoined by Hunan Province and Jianxi Province. The basin belongs to a second-order 147 

tributary of the Pearl River and has an area of 7,554 km2. The daily rainfall time series of 148 

Zhenwan raingauge was collected between January 1, 1989 and December 31, 1998 149 

(hereafter the averaged rainfall series is referred to as Zhenwan). 150 

The all Indian average monthly rainfall is estimated from area-weighted observations 151 

at 306 land stations uniformly distributed over India. The data, with period spanning from 152 

January 1871 to December 2007, are available at the website http://www.tropmet.res.in run 153 

by the Indian Institute of Tropical Meteorology. 154 

The other monthly rainfall series is from Zhongxian raingauge which is located at 155 

Chongqing city, China. The catchment containing this raingauge belongs to a first-order 156 

tributary of the Yangtze River. The monthly rainfall data were collected from January 1956 157 

to December 2007. 158 

Figure 2 shows hyetographs of four rainfall series. A linear fit to each hyetograph is 159 

denoted by the dashed line. All series appear stationary at least in a weak sense since these 160 

linear fits are close to horizontal. 161 

Page 8: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

8

In this study, each of data series is partitioned into three parts as training set, cross-162 

validation set and testing set. The training set serves the model training and the testing set is 163 

used to evaluate the performances of models. The cross-validation set has dual functions: one 164 

is to implement an early stopping approach in order to avoid overfitting of the training data 165 

and another is to select some best predictions from a number of ANN’s runs. In the present 166 

study, 10 best predictions are selected from a total of 20 ANN’s runs. The same data partition 167 

is adopted for each series: the first half of the entire data as training set and the first half of 168 

the remaining data as cross-validation set and the other half as testing set. 169 

Table 1 presents pertinent information about watersheds and some descriptive 170 

statistics of the original data and three data subsets, including mean (µ), standard deviation 171 

(Sx), coefficient of variation (Cv), skewness coefficient (Cs), minimum (Xmin), and maximum 172 

(Xmax). As shown in Table 1, the training set cannot fully include the cross-validation or 173 

testing data. Owing to the weak extrapolation ability of ANN, all data are scaled to the 174 

interval [-0.9, 0.9] instead of [-1, 1] whilst hyperbolic tangent sigmoid functions are 175 

employed as transfer functions in hidden and output layers. 176 

2.2 Data preprocessing techniques 177 

(1) MA 178 

MA smoothes data by replacing each data point with the average of the k 179 

neighboring data points, where k may be termed the length of memory window. The method 180 

is based on the idea that any large irregular component at any point in time will exert a 181 

smaller effect if we average the point with its immediate neighbors (Newbold et al., 2003). 182 

The equally weighted MA is the most commonly-used, in which each value of the data 183 

carries the same weight in the smoothing process. There are three types of moving modes 184 

Page 9: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

9

including centering, backward and forward. In a forecasting scenario, only the backward 185 

mode is used since the other two modes may necessitate future observed values. For a time 186 

series 1 2, , , Nx x x , when the backward moving mode is adopted (Lee et al., 2000), the 187 

-k term unweighted moving average *ty is written as 188 

1*

0

k

t t iiy y k

(1) 189 

where , ,t k N . The choice of the window length k is by a trial and error procedure with 190 

a minimization of the loss of the objective function. 191 

(2) PCA 192 

PCA was first introduced by Pearson (1901) and developed independently by 193 

Hotelling (1933), and has now well entrenched as an important technique in data analysis. 194 

The central idea is to reduce the dimensionality of a data set consisting of a large number of 195 

interrelated variables, while retaining as much as possible of the variation present in the data 196 

set. The PCA approach uses all of the original variables to obtain a smaller set of principal 197 

components (PCs) which can be used to approximate the original variables. PCs are 198 

uncorrelated and are ordered so that the first few retain most of the variation present in the 199 

original set. 200 

Consider a data matrix X which has n rows (observations) and p column (variables). 201 

Let the covariance matrix of X be , where cov( ) ( )TE X X X . The linear transformed 202 

orthogonal matrix Z is presented as 203 

Z XA (2) 204 

Page 10: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

10

where Z is the PCs with elements ( ,i j ) of thi observation and thj principal component; A 205 

is a ( p p ) matrix with eigenvector elements of the covariance of X , and having 206 

T T A A AA I . 207 

Because matrix TX X is real and symmetric, it can be expressed as T TX X AΛA 208 

where Λ is a diagonal matrix whose nonnegative entries are the eigenvalues (λ , 1, ,i i p ) 209 

of TX X . The total variance of the data matrix X is represented as 210 

1

trace( ) trace( ) trace( )= λp

ii

TAΛA Λ (3) 211 

On the other hand, the covariance matrix of principal components Z is expressed as 212 

cov( ) ( ) ( )T T TE E Z Z Z A X XA Λ (4) 213 

1

trace( ) trace( )= λp

ii

Z Λ (5) 214 

Therefore, the total variance of the data matrix X is identical to the total variance after PCA 215 

transformation Z . 216 

The solution of PCA, using singular value decomposition (SVD) or determinants of 217 

the covariance matrix of X , can provide the eigenvectors A with their 218 

eigenvalues, λ , 1, ,i i p , representing the variance of each component after PCA 219 

transformation. If the eigenvalues are ordered by 1 2 3λ λ λ λ 0p , the first few PCs 220 

can capture most of the variance of the original data while the remaining PCs mainly 221 

represent the noise in the data. The percentage of total variance explained by the first thm 222 

PCs is 223 

1 1

λ λ 100%pm

i ii i

V

(6) 224 

Page 11: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

11

The higher is the selection of the total data variance, V , the better the properties of the data 225 

matrix are preserved. For the sake of the reduction of dimensionality, a small number of PCs 226 

are selected, though most of the data variance in selected components still remain. If the 227 

transformation is to prevent the collinearity of regression variables, the selected component 228 

number m  in Eq. (6) can be set for a higher total variance, such as 95% 99%V (Hsu et 229 

al., 2002). 230 

The original data matrix A can be reconstructed by a reverse operation of Eq. (2) as 231 

TX ZA (7) 232 

By choosing suitable m ( p ) PCs from Z and accompanying m  eigenvectors from A , the 233 

original data can be filtered. 234 

(3) SSA 235 

According to Golyandina et al. (2001), the basic SSA consists of two stages: 236 

decomposition and reconstruction. The decomposition stage involves two steps: embedding 237 

and SVD; the reconstruction stage also comprises two steps: grouping and diagonal 238 

averaging. Consider a real-valued time series 1 2, , , NF x x x of length ( 2)N . Assume 239 

that the series is a nonzero series, viz. there exists at least one i such that 0ix . Four steps 240 

are briefly presented as follows. 241 

1st step: embedding 242 

The embedding procedure maps the original time series to a sequence of multi-243 

dimensional lagged vectors. Let L be an integer (window length), 1 L N , and be the 244 

delayed time at a multiple of the sampling period. The embedding procedure forms 245 

( 1)n N L lagged vectors 2 ( 1), , , ,T

i i i i i Lx x x x X , where R Li X , and 246 

Page 12: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

12

1,2, ,i n . The ‘trajectory matrix’ of the time series is denoted by [ ]1 i n X X XX 247 

having lagged vectors as its columns. In other words, the trajectory matrix is 248 

1 2 3

1 2 3

1 2 2 2 3 2 2

1 ( 1) 2 ( 1) 3 ( 1)

n

n

n

L L L N

x x x xx x x xx x x x

x x x x

X

(8) 249 

If 1 , the matrix X is called Hankel matrix since it has equal elements on the 250 

‘diagonals’ where the sum of subscripts of row and column is equal to constant. If 1 , the 251 

equal elements in X are not definitely in the ‘diagonals’. 252 

2nd step: SVD 253 

Let TS XX , 1 2λ ,λ , ,λL denote the eigenvalues of S taken in the decreasing order 254 

of magnitude ( 1 2 3λ λ λ λ 0L ) and 1 2, , , LU U U denote the orthonormal system of 255 

the eigenvectors of the matrix S corresponding to these eigenvalues. If we denote 256 

λTi i i iV UX ( 1, ,i L ) (equivalent to the thi eigenvector of TX X ), then the SVD of the 257 

trajectory matrix X can be written as 258 

1 L X X X (9) 259 

where λ Ti i i i U VX . The matrices iX have rank 1; therefore they are elementary matrices. 260 

The collection ( λ , ,i i iU V ) is called the thi eigentriple of the SVD. Note that iU and iV are 261 

also the thi left and right singular vectors of X , respectively. 262 

3rd step: grouping 263 

The purpose of this step is to identify appropriately the trend component, oscillatory 264 

components with different periods, and structureless noises by grouping components. This 265 

Page 13: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

13

step can be skipped if one does not want to precisely extract hidden information by 266 

regrouping and filtering of components. 267 

The grouping procedure partitions the set of indices {1, , }L into m disjoint subsets 268 

1, , mI I , so that the elementary matrix in Eq. (9) is regrouped into m groups. Let 269 

1{ , , }pI i i . Then the resultant matrix IX corresponding to the group I is defined as 270 

1 pI i i X X X . These matrices are computed for 1, , mI I . By substituting into the 271 

expansion (9), one obtains the new expansion 272 

1 mII X X X (10) 273 

The procedure of choosing the sets 1, , mI I is called the eigentriple grouping. 274 

4th step: Diagonal averaging 275 

The last step in the basic SSA is the transformation of each resultant matrix of the 276 

grouped decomposition (10) into a new series of length N . The diagonal averaging is to find 277 

equal elements in the resultant matrix and then to generate a new element by averaging over 278 

them. The new element has the same position (or index) as the corresponding elements in the 279 

original series. As mentioned in the step 1, the concept of ‘diagonal’ is not true for 1 . 280 

Regardless of the value of being larger than or equal to 1, the principle of reconstruction is 281 

the same. For 1 , the diagonal averaging can be carried out by the formula recommended 282 

by Golyandina et al. (2001). Let Y be a ( L n ) matrix with elements ijy , 1 i L , 1 j n . 283 

Let * min( , )L L n , * max( , )n L n and ( 1)N n L . Let *ij ijy y if L n and *

ij jiy y

284 

otherwise. Diagonal averaging transfers matrix Y to a series 1 2{ , , , }Ny y y by the following 285 

equation 286 

Page 14: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

14

*

*

*

* *, - 1

1

* * *, - 1*

1

- 1* *

, - 1- 1

11

1

1

- 1

k

m k mm

L

k m k mm

N K

m k mm k K

y for k Lk

y y for L k KL

y for L k NN k

(11) 287 

Eq. (11) corresponds to the averaging of the matrix elements over the ‘diagonals’ 1i j k . 288 

The diagonal averaging, applied to a resultant matrix IkX , produces a N length series kF , 289 

and thus the original series F is decomposed into the sum of m series: 290 

1 mF F F (12) 291 

As mentioned above, these reconstructed components (RCs) can be associated with the trend, 292 

oscillations or noise of the original time series with proper choices of L and the sets of 293 

1, , mI I . Certainly, if the third step (namely, grouping) is skipped, F can be decomposed 294 

into L RCs. 295 

2.3 Forecasting models 296 

This section describes four candidate forecasting models. They are LR, K-NN, ANN 297 

and MANN. They are usually called data-driven models because they capture the mapping 298 

between input (e.g. antecedent rainfall) and output variables (forecasted rainfall) without 299 

directly considering the physical laws that underlie the mechanism of rainfall (or 300 

precipitation). These models are purely based on the information retrieved from the collected 301 

rainfall data. 302 

(1) Construction of input/output pairs 303 

Page 15: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

15

Let 1 2, , , Nx x x stand for a rainfall time series. It can be reconstructed into a series 304 

of delay vectors as  t 2 ( 1), , , ,t t t t mx x x x X , where t RmX , is the delay time as a 305 

multiple of the sampling period and m is the embedded dimension. Suppose that the rainfall 306 

( 1)t T mx at T -step lead is related to the vector tX , the available historical data may be 307 

summarized into a set of pairs as t ( 1), : 1, ,t T mx t n X , where n stands for  the number of 308 

pairs, and ( 1)n N m . 309 

The functional relationship between the input vector tX at time t and the predicted 310 

output ( 1)Ft T mx  

at time t T can be written as follows: 311 

( 1) ( )Ft T m t tx f e X (13) 312 

where te is a typical noise term, ( 1)Ft T mx

is the prediction of ( 1)t T mx , and ( )f  is the 313 

mapping function. The difference of various data-driven forecasting models used in the 314 

current study relies on the way of approximating ( )f once model inputs are attained with 315 

the appropriate selection of ( , m ). 316 

(2) LR 317 

The linear regression model herein is actually called stepwise linear regression (SLR) 318 

model because the forward stepwise regression is used to determine optimal input variables. 319 

The basic idea of SLR is to start with a function that contains the single best input variable 320 

and to subsequently add potential input variables to the function one at a time in an attempt to 321 

improve the model performance. The order of addition is determined by using the partial 322 

-F test values to select which variable should enter next. The high partial -F value is 323 

compared to a (selected or default) -F to-enter value. After a variable has been added, the 324 

Page 16: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

16

function is examined to see if any variable should be deleted. Interested readers are referred 325 

to Draper and Smith (1998) and McCuen (2005) for more details. 326 

(3) K-NN 327 

The prediction of ( 1)t T mx by the K-NN method is formulated as: 328 

( 1) ( 1)( , )

1Ft T m t T m

t S n

x xK

X

(14) 329 

where S( , n)X denotes the set of indices t of the K nearest neighbors to the feature vector 330 

(n)X . The meaning of “nearest neighbors” is generally interpreted in a Euclidean sense. 331 

Therefore, if i belongs to S( , n)X and j is not in S( , n)X , then according to Euclidean 332 

distance - -n i n jX X X X . Intuitively speaking, the forecast ( 1)Ft T mx in Eq. (14) is the 333 

sample average of output rainfall of the K nearest neighbors to (n)X . Obviously, a key task 334 

is to determine the parameter K in the K-NN method. 335 

(4) ANN 336 

The multilayer perceptron network is by far the most popular ANN paradigm, which 337 

usually uses the technique of error back propagation to train the network configuration. The 338 

architecture of the ANN consists of a number of hidden layers and a number of neurons in 339 

the input layer, hidden layers and output layer. ANNs with one hidden layer are commonly 340 

used in hydrologic modeling (Dawson and Wilby, 2001; de Vos and Rientjes, 2005) since 341 

these networks are considered to provide enough complexity to accurately simulate the 342 

nonlinear-properties of the hydrologic process. Based on Eq. (13), the ANN forecasting 343 

model is formulated as 344 

( 1) 0 ( 1)1 1

( , , , , ) ( )j

h mF outt T m t ji t i j

j i

x f w m h w w x

X (15) 345 

Page 17: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

17

where denotes transfer functions; jiw are the weights defining the link between the ith 346 

node of the input layer and the jth node of the hidden layer; j are biases associated to the 347 

jth node of the hidden layer; j

outw are the weights associated to the connection between the 348 

jth node of the hidden layer and the node of the output layer; and 0 is the bias at the output 349 

node. To apply Eq. (15) to rainfall predictions, appropriate training algorithm is required to 350 

optimize w and  . 351 

(5) MANN 352 

To construct MANN, the training data have to be divided into several clusters 353 

according to cluster analysis techniques, and then each single model is applied to each cluster. 354 

The FCM clustering technique is adopted in the present study (e.g., Bezdek, 1981, Wang et 355 

al., 2006). It is able to generate either soft or crisp clusters. ANN (or similar techniques) is 356 

unable to extrapolate beyond the range of the data used for training. Otherwise, poor 357 

forecasts or predictions can be expected when a new input data is outside the range of those 358 

used for training. Hard forecasting is, therefore, taken into consideration in this study. 359 

Figure 3 displays the schematic diagram of MANN where the training data is 360 

partitioned into three clusters which are based on an assumption that three magnitudes of 361 

rainfall (i.e., low, medium, and high) may be derived from different mechanisms. According 362 

to this flow chart, once input-output pairs are obtained, they are first split into three subsets 363 

by the FCM technique, and then each subset is approximated by a single ANN. The final 364 

output of the modular model results directly from the output of one of three local models. 365 

2.4 Implementation framework of rainfall forecasting 366 

Page 18: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

18

Figure 4 illustrates the implementation framework of rainfall forecasting where four 367 

prediction models can be conducted in two modes: without/with three data preprocessing 368 

methods (dashed box). These acronyms in the column of “methods for model inputs” 369 

represent seven methods to determine model inputs: LCA (linear correlation analysis, 370 

Sudheer et al., 2002), AMI (average mutual information, Fraser and Swinney, 1986), PMI 371 

(partial mutual information, May et al., 2008), FNN (false nearest neighbors, Kennel et al., 372 

1992), CI (correlation integral, Theiler, 1986), SLR, and MOGA (ANN based on multi-373 

objective genetic algorithm, Giustolisi and Simeone, 2006). 374 

2.5 Evaluation of model performances 375 

The Pearson’s correlation coefficient (r) or the coefficient of determination (R2 = r2), 376 

have been identified as inappropriate measures in hydrologic model evaluation by Legates 377 

and McCabe (1999). The coefficient of efficiency (CE) (Nash and Sutcliffe, 1970) is a good 378 

alternative to r or R2 as a “goodness-of-fit” or relative error measure in that it is sensitive to 379 

differences in the observed and forecasted means and variances. Legates and McCabe (1999) 380 

also suggested that a complete assessment of model performance should include at least one 381 

absolute error measure (e.g., root mean square error (RMSE)) as necessary supplement to a 382 

relative error measure. Besides, the Persistence Index (PI) (Kitanidis And Bras, 1980) was 383 

adopted here for the purpose of checking the prediction lag effect. Three measures are 384 

therefore used in this study. They are listed below. 385 

2 2

1 1

ˆCE 1- ( - ) / ( - )n n

i i ii i

y y y y

(16) 386 

2

1

1ˆRMSE ( - )

n

i iiy y

n (17) 387 

Page 19: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

19

2 2-

1 1

ˆPI 1- ( - ) / ( - )n n

i i i i li i

y y y y

(18) 388 

In these equations, n is the number of observations, ˆiy stands for the forecasted flow, 389 

iy represents the observed flow, y denotes the average observed flow, and i ly is the flow 390 

estimate from a so-call persistence model (or termed naïve model) that basically takes the last 391 

flow observation (at time i minus the lead time l ) as a prediction. CE and PI values of 1 392 

stands for perfect fits. A small value of PI may imply occurrence of lagged prediction. 393 

3. Applications of Models 394 

3.1 Determination of model inputs 395 

ANN, equipped with the Levernberg-Marquardt (L-M) training algorithm and 396 

hyperbolic tangent sigmoid transfer functions, is used as the benchmark model to examine 397 

aforementioned seven model input methods in terms of RMSE. Depending on the simplified 398 

algorithm from Yu et al. (2000) (downloaded at http://small.eie.polyu.edu.hk/), the four 399 

rainfall series are identified as non-chaotic since the correlation dimension does not display 400 

the property of convergence, in particular, for daily rainfall series. Results from remaining six 401 

methods are presented in Table 2. These results are based on one step lead prediction and let 402 

t+1X be the target value at one-step prediction horizon. It can be seen from RMSE that most 403 

of these methods tend to be mutually alternative because their RMSE are close. Owing to the 404 

convenience of operation, the LCA method is preferred in this study. Furthermore, Figure 5 405 

shows identification of effective inputs in Table 2 for the LCA method. Taking Wuxi and 406 

Zhenwan as examples, model inputs should take the previous 5-day and 7-day rainfalls for 407 

Page 20: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

20

them respectively because the partial auto-correlation function (PACF) value decays within 408 

the confidence band around these time lags. 409 

3.2 Identification of models 410 

The model identification is to determine the structure of a candidate model by using 411 

training data to optimize relevant parameters of model control once model inputs have been 412 

obtained. The LR model is built by the SLR technique. In terms of one step prediction 413 

(viz., 1T ), input variables can be found in Table 2. For example, the LR model for Wuxi 414 

can be expressed as 415 

1 1 2 4 7 110.421 -0.043 0.044 0.025 0.036 0.03Ft t t t t t tx x x x x x x (19) 416 

With respect to K-NN, the model identification consists in finding the optimal K if the 417 

m dimensional input vector is determined. Sugihara and Mary (1990) suggested that the 418 

value of K was taken as 1K m . On the other hand, the choice of K should ensure the 419 

reliability of the forecasting (Fraser and Swinney, 1986). The check of robustness of 420 

1K m in terms of RMSE is presented in Figure 6, where K is in the interval of [2, 40]. 421 

Adopting the value of K as 1m seems reasonable for the current study because the 422 

difference between its RMSE and the minimum RMSE is only 2.9% for Wuxi, 2.9% for 423 

Zhenwan, 2.6% for India, and 2.0% for Zhongxian, respectively. Consequently, the value of 424 

K is 6 for Wuxi ( 5m ), 8 for Zhenwan ( 7m ), 13 for India ( 12m ), and 14 for 425 

Zhongxian ( 13m ), respectively. 426 

Based on Eq. (14), the formula for one-step lead prediction in the context of K-NN 427 

can be defined as 428 

1 11

1i

Ft t

K

iKx x

(20) 429 

Page 21: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

21

where 1Xit

stands for an observed value associated with a neighbor of the current state. For a 430 

T step lead prediction, Eq. (20) becomes 431 

1

1i

Ft T t T

K

i

x xK

(21) 432 

The identification of ANN structure is to optimize the number of hidden nodes h in 433 

the hidden layer with the known model inputs and output. The optimal size h of the hidden 434 

layer is found by systematically increasing the number of hidden neurons from 1 to 10 until 435 

the network performance on the cross-validation set no longer improves significantly. Based 436 

on the L-M training algorithm and hyperbolic tangent transfer functions, the identified 437 

configurations of ANN are 5-5-1 for Wuxi, 7-4-1 for Zhenwan, 12-5-1 for India, and 13-3-1 438 

for Zhongxian, respectively. The same method is used to identify the structure of MANN, and 439 

the only difference is that the identification is repeated three times, with each time being for a 440 

local ANN. Consequently, MANN is obtained as 5-5/7/9-1 for Wuxi, 7-4/8/4-1 for Zhenwan, 441 

12-3/2/5-1 for India, and 13-1/1/1-1 for Zhongxian, respectively. 442 

It is worthwhile to notice that the standardization/normalization of the training data is 443 

very crucial in the improvement of the model performance. Two methods can be found in the 444 

literature (Dawson and Wilby, 2001; Cannas et al., 2002; Rajurkar et al, 2002; Campolo et al., 445 

2003; Wang et al., 2006). The standardization (also termed rescaling in some papers) method, 446 

as adopted above for model input determination, is to rescale the training data to [-1, 1], [0, 1] 447 

or even more narrow interval depending on what kinds of transfer functions are employed in 448 

ANN. The normalization method is to rescale the training data to a Gaussian function with a 449 

mean of 0 and unit standard deviation, which is by subtracting the mean and dividing by the 450 

standard deviation. When the normalization approach is adopted, ANN uses the linear 451 

function (e.g. purelin) instead of the hyperbolic tangent sigmoid transfer function in the 452 

Page 22: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

22

output layer. In addition, some studies have indicated that considerations of statistical 453 

principles may improve ANN model performance (e.g. Cheng and Titterington, 1994). For 454 

example, the training data was recommended to be normally distributed (Fortin et al., 1997). 455 

Sudheer et al. (2002) suggested that the issue of stationarity should be considered in the ANN 456 

development because the ANN cannot account for trends and heteroscedasticity in the data. 457 

Their results showed that data transformation to reduce the skewness of data was capable of 458 

significantly improving the model performance. For the purpose of obtaining better model 459 

performance, four data-transformed schemes are examined: 460 

Standardizing the raw data (referred to as Std_raw); 461 

Normalizing the raw data (referred to as Norm_raw); 462 

Standardizing the n-th root transformed data (referred to as Std_nth_root); 463 

Normalizing the n-th root transformed data (referred to as Norm_nth_root). 464 

Table 3 compares the ANN model performance of the four schemes in terms of 465 

RMSE and CE. The Norm_raw scheme is, on the whole, slightly more effective than the 466 

Std_raw method. It can also be seen that the effect of the n-th root scheme (3 is taken after 467 

trial and error) on the improvement of the performance is basically negligible. Therefore, the 468 

Norm_raw scheme is adopted for the later rainfall prediction in the present study. 469 

3.3 Rainfall data preprocessing 470 

(1) MA 471 

The MA operation entails the window length k in Eq. (1) to smooth the raw rainfall 472 

data. An appropriate k can be found by systematically increasing k from 1 to 10. The 473 

smoothed data is then used to feed into each forecasting model. The targeted value of k 474 

corresponds to the optimal model performance in terms of RMSE. 475 

Page 23: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

23

(2) PCA 476 

PCA is employed in two ways: one for reduction of the dimensionality or preventing 477 

collinearity (depending on Eq. (2)); second for noise reduction by choosing leading 478 

components (contributing most of the variance of the original rainfall data) to reconstruct 479 

rainfall series (depending on Eq. (7)). The percentage V of total variance (see Eq. (6)) is set 480 

at three horizons, 85%, 90%, and 95% for principal component selection. 481 

(3) SSA 482 

This approach of filtering a time series to retain desired modes of variability is based 483 

on the idea that the predictability of a system can be improved by forecasting the important 484 

oscillations in time series taken from the system. The general procedure is to filter the 485 

original record first and then to build the forecasting model based on the filtered series. To 486 

filter the raw rainfall series, the series needs to be decomposed into components with the aid 487 

of SSA. The decomposition by SSA requires identifying the parameter pair ( , L ). The value 488 

of an appropriate L should be able to clearly resolve different oscillations hidden in the 489 

original signal. However, the present study does not require accurately resolving the raw 490 

rainfall signal into trends, oscillations, and noises. A rough resolution can be adequate for the 491 

separation of signals and noises where some leading eigenvalues should be identified. 492 

To select L , a small interval of [3, 10] is examined in the present study. Figure 7 493 

shows the relation between singular spectrum (namely, a set of singular values) and singular 494 

number L for Wuxi, Zhenwan, India, and Zhongxian. It can be observed that the curve of 495 

singular values in each case except for Wuxi tends to level off with the increase of L . 496 

Generally, extraction of high-frequency oscillations becomes more difficult with the increase 497 

of singular number L (or mode). L is selected empirically by following the criterion that the 498 

Page 24: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

24

singular spectrum can be distinguished markedly under that L . According to this criterion, 499 

L is set the value of 7 for India and Zhenwan, 6 for Zhongxian. For Wuxi, since all values in 500 

the interval satisfy the criterion, in order to reduce computational load in later filtering 501 

operation, L is set a small value of 5. The singular spectrum associated with the selected L is 502 

highlighted by the dotted solid line in Figure 7. 503 

As regards , Figure 8 presents the results of sensitivity analysis of singular spectrum 504 

on the lag time using SSA with the determined L . For daily rainfall series, the singular 505 

spectrum can be distinguished only when 1 . In contrast, the singular spectrum is 506 

insensitive to in the case of monthly rainfall series. The final parameter pair ( , L ) in SSA 507 

are set as (1, 5) for Wuxi, (1, 7) for Zhenwan, (1, 7) for India, (1, 6) for Zhongxian, 508 

respectively. 509 

3.4 Filtering of RCs 510 

The subsequent task is to reconstruct a new rainfall series as model inputs by finding 511 

contributing RCs so as to improve the predictability of the rainfall series. There is no 512 

practical guide on how to identify a contributing or noncontributing component to the 513 

improvement of accuracy of prediction. Two proposed filtering methods, supervised and 514 

unsupervised, are herein examined. 515 

(1) Supervised filtering (denoted by SSA1) 516 

Figure 9 depicts cross-correlation function (CCF) between RCs and the original 517 

Zhenwan rainfall series. The last plot in this figure presents the average of CCFs from all 7 518 

RCs. The average indicates an overall correlation between input and output at various lags 519 

(also termed prediction horizons). The plot of average CCF shows that the best correlation is 520 

positive and occurs at lag 1. Among all 7 RCs, RC1 exhibits the best positive correlation with 521 

Page 25: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

25

the original rainfall series. The CCF values for other RCs change alternatively between 522 

positive and negative with the increase of the lag. From the perspective of linear correlation, 523 

the positive or negative CCF value may indicate that the RC makes a positive or negative 524 

contribution to the output of model when the RC is used as the input of model. With the 525 

assumption, deleting RCs, which have negative correlations with the model output if the 526 

average CCF is positive, may improve the performance of the forecast model. This is the 527 

basic idea behind the supervised method. 528 

The procedure of the supervised method coupled with ANN is depicted in Figure 10. 529 

The aim is to find the optimal p ( L ) RCs from all L RCs for each prediction horizon. The 530 

procedure can be summarized into three steps: SSA decomposition, correlation coefficients 531 

sorting, and reconstructed components filtering. Operation in each step is bounded by the 532 

dashed box. It is worth noting that the filtering method is based on assumption that 533 

combination of components with the same sign in CCF ( or -) can strengthen the correlation 534 

with the model output. 535 

(2) Unsupervised filtering (denoted by SSA2) 536 

There are some drawbacks on the supervised method. The salient one is that this 537 

method relies on linear correlation analysis, which disregards the existence of nonlinearity in 538 

meteorological processes. Also, random combinations among all RCs are not taken into 539 

account. To overcome these drawbacks, an unsupervised filtering method (also termed 540 

enumeration) is recommended where all input combinations are examined. There are 541 

2L combinations for L RCs. The unsupervised method may be computationally intensive if 542 

L is large. 543 

Page 26: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

26

4. Results and Discussions 544 

This section presents predictions using various models under two types of modes, 545 

namely “normal” and “data preprocessing”. The “data preprocessing” mode is separately 546 

described by MA, PCA, and SSA. To extend one-step-ahead prediction to multi-step-ahead 547 

prediction, a direct multi-step prediction method (by directly having the multi-step-ahead 548 

prediction as output, also termed static prediction method) is adopted in this study to perform 549 

two- and three-step-ahead predictions. 550 

4.1 Forecasting with normal mode 551 

Table 4 shows results of three prediction horizons by applying five models including 552 

naïve model to each case study. The naïve model is used as the benchmark in which the 553 

forecasted value is directly equal to the last observed value (namely, no change). The naïve 554 

model presents the poorest forecasting which can be explained by the fact that it is unlikely to 555 

capture any dependence relation. From the perspective of rainfall series, the monthly rainfall 556 

can be better predicted than the daily rainfall. Generally, a daily rainfall series, in particular 557 

in a semi-humid and semi-dry or dry region, tends to be intermittent and discontinuous due to 558 

a large number of no rain periods (dry periods). Two global modeling methods, LR and ANN, 559 

mainly capture the zero-zero (or similar extreme low-intensity) rainfall patterns in daily 560 

rainfall series because the type of pattern is overwhelmingly dominant in the daily rainfall 561 

series. As a consequence, poor performance indices in terms of RMSE, CE, and PI can be 562 

observed (depicted in Table 4 for Wuxi and Zhenwan). Nevertheless, Table 4 also shows that 563 

MANN performs the best in each case study. MANN adopts three local ANN models, one for 564 

each cluster generated by FCM, which can better capture the mapping relation than using a 565 

single global ANN. It can be noticed that MANN is more effective for daily rainfall series 566 

Page 27: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

27

than monthly rainfall data, which can be because daily rainfall data is more irregular (or non-567 

periodic) than monthly rainfall series. The use of K-NN for daily rainfall forecasting is even 568 

worse than LR although it employs a local prediction approach. Apart from the issue of the 569 

selection of K , the performance of K-NN is also influenced by the similarity of input-output 570 

patterns. The smooth monthly rainfall series easily construct similar patterns so that they are 571 

well predicted by K-NN. It is worth noting that negative values occasionally appear in the 572 

forecasts of ANN or MANN whereas this situation does not happen in the K-NN method. 573 

Take Wuxi and India data as representative examples, Figure 11 shows the scatter 574 

plots and hyetographs of the results at one-day-ahead prediction of ANN and MANN using 575 

the rainfall data of Wuxi, where the hyetograph is plotted in a selected range for better visual 576 

inspection. ANN seriously underestimates a number of moderate- and high-intensity rainfalls. 577 

The low values of CE and PI demonstrate that time shift between the forecasted and observed 578 

rainfall may occur, which is further verified by the hyetograph. MANN improves noticeably 579 

the accuracy of forecasting in terms of CE and PI. As shown by the scatter plots, the 580 

medium-intensity rainfall can be simulated better by MANN although high-intensity rainfalls 581 

(or peak values) are still underestimated. Figure 12 shows scatter plots and hyetographs of 582 

results at one-day-ahead prediction of ANN and MANN using the rainfall data of India. It 583 

can be seen from hyetograph graphs that both ANN and MANN reproduce well the 584 

corresponding observed rainfall data, which is further revealed by the scatter plots with a low 585 

dispersion around the exact fit line. 586 

Figure 13 shows the analysis of the lag effect between forecasted and observed 587 

rainfall series. The value of CCF at zero lag corresponds to the actual performance (i.e. 588 

correlation coefficient) of the model. A target lag is associated with the maximum value of 589 

Page 28: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

28

CCF, and is an expression for the mean lag for the forecast. It can be seen from Figure 16 590 

that ANN makes fairly obvious lagged predictions for daily rainfall series, and the lag effect 591 

can be overcome by MANN. There are 1, 2, and 3 days lag for Wuxi, which are respectively 592 

associated with one-, two-, and three-day-ahead forecasting. In contrast, there is no lag effect 593 

in monthly rainfall predictions of ANN or MANN. 594 

4.2 Forecasting with MA 595 

Table 5 presents forecasted results of ANN with the “backward” MA (hereafter 596 

referred to as ANN-MA) using the Wuxi rainfall data. The performance indices 597 

corresponding to 1k are associated with the normal ANN. Results at each prediction 598 

horizon seem to be insensitive to the window length k in view of slight differences among 599 

each performance index for k from 1 to 10. Considering the fact that ANNs tend to generate 600 

unstable outputs, the influence of MA on the performance of ANN is negligible. Small values 601 

of PI also imply that MA cannot eliminate the lagged forecast from ANN. 602 

4.3 Forecasting with PCA 603 

As mentioned previously, PCA is used in two ways: one (denoted by PCA1) for 604 

reduction of dimensionality (also termed principal component regression) and the other one 605 

(denoted by PCA2) for noise reduction. Results from PCA1 are presented in Table 6. The 606 

scenario of V 100% stands for forecasting using models with the normal mode. Results 607 

show that PCA1 cannot improve the model performances in terms of RMSE, CE, and PI, 608 

which means that the reduction of dimensionality is unnecessary for the present case studies. 609 

Actually, the original inputs are characterized by a low dimension. Table 7 describes the 610 

results from PCA2. According to results from LR and K-NN (because results from ANN tend 611 

to be unstable), a marginal improvement in the model performances can be observed for the 612 

Page 29: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

29

Wuxi watershed whereas the model performances deteriorate for the India watershed with the 613 

decrease of the value of V . 614 

4.4 Forecasting with SSA 615 

Following the procedure in Figure 10, the supervised filtering (SSA1) using ANN for 616 

RCs of Wuxi and India is illustrated in Figure 14. The RMSE associated with the maximum 617 

number of p (for instance 5p for Wuxi) represents the performance of ANN with the 618 

normal mode. The optimal p corresponds to the minimum RMSE, which can be found by 619 

systematically deleting RCs one at a time. Consequently, numbers of chosen optimal p RCs 620 

in three forecasting horizons are 3, 2, and 1 for Wuxi, and 1, 3, and 5 for India, respectively. 621 

However, the unsupervised filtering method (SSA2) is based on enumeration of combinations 622 

of all RCs. Selection of the optimal p RCs cannot be presented in a graphical form. 623 

Table 8 shows selected p at various prediction horizons using LR, K-NN, and ANN 624 

in conjunction with SSA1 and SSA2. A large amount of information can be extracted from 625 

this Table. First of all, a considerable improvement in the model performance is achieved by 626 

each forecasting model in conjunction with SSA1 or SSA2, compared with results in Table 4. 627 

From the perspective of rainfall series, the accuracy of daily rainfall prediction is improved 628 

significantly in comparison to that in the normal mode. Secondly, as expected, results from 629 

SSA2 are superior to or at least equivalent to those from SSA1 since the former examines 630 

each combination of RCs in search of the optimal p . SSA2 is therefore considered as an 631 

efficient and effective method if the number L of RCs is small. For the present four cases, 632 

SSA2 method is appropriate due to the small number of RCs. Once L is large, say 40 or 50, 633 

SSA1 may be a good alternative where a relative optimal forecasting can be guaranteed. 634 

Page 30: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

30

Additionally, it should be noted that the optimal p are different at three forecast horizons. 635 

Finally, Table 8 also shows that, among the three models, ANN performs the best with SSA1 636 

or SSA2, which is consistent with results in the normal mode. 637 

In the normal mode, MANN has been proved to be superior to ANN, in particular, for 638 

daily rainfall forecasting. As an attempt to improve the accuracy of rainfall forecasting, 639 

MANN is also coupled with SSA2. Table 9 demonstrates results in terms of RMSE, CE, and 640 

PI using MANN compared with those of ANN. Good accuracies of forecasting are made by 641 

both MANN and ANN. It can be seen from values of PI that the prediction lag effect is 642 

completely eliminated. The model performance does not deteriorate markedly with the 643 

increase of the forecasting lead. Results also show that MANN still maintains a salient 644 

superiority over ANN in the SSA2 mode for both daily and month rainfall series. 645 

One-step lead estimates of MANN and ANN with the help of SSA2 are shown in 646 

Figure 15 (Wuxi) and Figure 16 (India) in the form of hyetographs and scatter plots (the 647 

former is plotted in a selected range for better visual inspection). Compared with Figure 11, 648 

each scatter plot in Figure 15 is closer to the exact line, which means that the daily rainfall 649 

process is fitted appropriately. Nevertheless, some peak values still remain mismatched 650 

although MANN shows a better ability to capture the peak value than ANN. Regarding the 651 

monthly rainfall series, the scatter plots with perfect match of the diagonal indicates that the 652 

rainfall process is perfectly reproduced. The representative hyetograph shows that the peak 653 

times and peak values are also accurately predicted. 654 

Figure 17 presents the correlation analysis between observed and forecasted rainfall 655 

from ANN and MANN using the Wuxi and India series, respectively. Compared with Figure 656 

Page 31: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

31

13, the lagged prediction of ANN is completely eliminated by SSA2 since the maximum 657 

CCF occurs at zero lag. The larger the CCF at zero lag is, the better the model performance is. 658 

4.5 Discussions 659 

Some discussions regarding forecasting models and the effects of the SSA technique 660 

are made in the following. 661 

(1) About the investigation of effects of SSA 662 

Figure 18 shows that a large number of zeros and near zeros occur in the original 663 

Wuxi rainfall which makes the series discontinuous. Using the intermittent series is difficult 664 

to reconstruct similar input patterns for a forecasting model. Thus, depending on those 665 

reconstructed input patterns, data-driven models based on pattern training, for example, 666 

ANNs, tend to be unfeasible. In contrast, rainfall series preprocessed by SSA becomes 667 

smoother where most of zeros are replaced by nonzero values. New input vectors from the 668 

reconstructed rainfall series are characterized by better repetition of patterns so that they are 669 

easier reproduced. 670 

To investigate the influence of SSA on the ANN’s performance, correlation analyses 671 

between inputs and output of ANN and ANN-SSA2 are compared using the Wuxi data and 672 

are depicted in Figure 19. As the input and output series in ANN are both the raw rainfall 673 

series, the cross-correlation analysis is equivalent to the autocorrelation analysis of the raw 674 

rainfall series. At all three prediction horizons, cross-correlation coefficients (CCs) between 675 

reconstructed inputs by SSA2 and the raw rainfall data are improved significantly at most 676 

lags except for the lag of 3 at one-step lead. It should be recalled that model inputs are the 677 

previous five rainfall data for Wuxi. The “starting point” in Figure 19 represents the first 678 

previous rainfall of the five inputs, and the remaining inputs consist of four points after the 679 

Page 32: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

32

starting point. It can be observed that the CCF value between each new model input and 680 

output are far larger than that between the raw model input and output (seen at the same lag). 681 

Therefore, the improvement of a model’s performance by the SSA technique may be owing 682 

to the enhancement of the mapping relation of model input and output by deleting noises 683 

hidden in the raw signals. 684 

(2) About parameter L in SSA 685 

The parameter L in SSA has a significant impact on the performance of a forecasting 686 

model since the optimal p RCs may be different with the change of L when using the same 687 

forecasting model at the same prediction horizon. The selection of L in this study is based on 688 

the interval of [3, 10] in conjunction with an empirical criterion (namely, a particular L is 689 

selected only if the singular spectrum can be distinguished markedly under that L ). To check 690 

the robustness of the empirical method, each L in [3, 10] is examined by the LR model with 691 

SSA2 using the Wuxi and Zhenwan data and presented in Table 10. As mentioned previously, 692 

the target L for Wuxi and Zhenwan are 5 and 7, respectively. The RMSE associated with 693 

them at each prediction horizon is highlighted in bold (shown in Table 10). In terms of Wuxi, 694 

the difference between the target RMSE and the minimum RMSE at the same prediction 695 

horizon is only 9.2% for one-step prediction, 1.4% for two-step prediction, 0.0% for three-696 

step prediction, respectively. Regarding Zhenwan, the three values are respectively 5.2%, 697 

7.5%, and 0.0%. These changes are slight and cannot influence the conclusions drawn 698 

previously. Therefore, the empirical method for the present rainfall data should be 699 

appropriate. 700 

Page 33: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

33

5. Conclusion 701 

This study suggests the use of modular artificial neural network (MANN) coupled 702 

with data preprocessing techniques for improving four rainfall predictions from India and 703 

China consisting of two monthly and two daily series. To reasonably evaluate MANN’s 704 

performance, three models, LR, K-NN and ANN, are used for the purpose of comparison. In 705 

the process of model development, model inputs and data preprocessing techniques are 706 

carefully analyzed and discussed. The following conclusions are reached based on this study: 707 

1. LCA is regarded as an effective and efficient method among all seven input 708 

techniques due to its simplicity of computation and comparable capability of forecasting. 709 

2. In the normal mode (without data preprocessing), MANN distinguishes from the 710 

other three models for both monthly and daily rainfall series forecasting. Whilst all four 711 

models reasonably forecast two monthly rainfall series, only MANN is able to simulate each 712 

daily rainfall series without obvious lag effect. 713 

3. In the data preprocessing mode, the effect of MA is negligible for the improvement 714 

of each forecasting model. 715 

4. PCA as a data preprocessing technique is discussed in two forms, i.e. PCA1 for the 716 

purpose of dimension reduction, and PCA2 for the purpose of noise reduction. Results show 717 

that PCA1 cannot improve model’s performance and PCA2 marginally improve model’s 718 

performance. 719 

5. Two filtering method, i.e. supervised (SSA1) and unsupervised (SSA2), are 720 

examined for SSA when coupled with forecasting models. It can be found that each model 721 

achieves considerable improvement in performance with the aid of SSA1 or SSA2. In terms 722 

of forecasting models, MANN still outperforms all other models. 723 

Page 34: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

34

6. As far as two filtering methods are concerned, SSA2 tends to be better if the 724 

number of raw RCs is small. Otherwise, SSA1 is a good alternative. 725 

7. A further discussion reveals that the essence of SSA in improving model 726 

performance is to strengthen the mapping relation of model input and output by deleting 727 

noises in the raw signal. 728 

8. There is still considerable room for improving forecasting of peak values although 729 

MANN coupled with SSA has made perfect overall predictions for daily rainfall series. 730 

731 

Nomenclature 732 

ACF Auto Correlation Function 733 

AMI Average Mutual Information 734 

ANN Artificial Neural Networks 735 

CC Cross-correlation Coefficient 736 

CCF Cross Correlation Function 737 

CE Coefficient of Efficiency 738 

CI Correlation Integral 739 

FCM Fuzzy C-means 740 

FNN False Nearest Neighbor 741 

K-NN K-Nearest-Neighbors 742 

LCA Linear Correlation Analysis 743 

L-M Levenberg-Marquart 744 

LR Linear Regression 745 

MA Moving Average 746 

Page 35: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

35

MANN Modular Artificial Neural Networks 747 

MOGA ANN based on Multi-objective Genetic Algorithm 748 

PACF Partial Auto Correlation Function 749 

PC Principal Component 750 

PCA Principal Component Analysis 751 

PMI Partial Mutual Information 752 

PI Persistence Index 753 

RC Reconstructed Component 754 

RMSE Root Mean Square Error 755 

SLR Stepwise Linear Regression 756 

SSA Singular Spectrum Analysis 757 

SVD Singular Value Decomposition 758 

SVM Support Vector Machine 759 

SVR Support Vectors Regression 760 

WA Wavelet Analysis 761 

762 

References: 763 

Abrahart, R.J. and See, L.M. (2002), Multi-model data fusion for river flow forecasting: an 764 

evaluation of six alternative methods based on two contrasting catchments. Hydrology and 765 

Earth System Sciences, 6(4), 655-670. 766 

Baratta et al., Baratta, D., Cicioni, G., Masulli, F. and Studer, L. (2003), Application of an 767 

ensemble technique based on singular spectrum analysis to daily rainfall forecasting. Neural 768 

Networks, 16, 375-387. 769 

Page 36: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

36

Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum 770 

Press, New York. 771 

Brath, A., Montanari, A., and Toth, E. (2002), Neural networks and non-parametric methods 772 

for improving realtime flood forecasting through conceptual hydrological models. Hydrology 773 

and Earth System Sciences, 6(4), 627-640. 774 

Campolo, M., Andreussi, P. and Soldati, A. (2003), Artificial neural network approach to 775 

flood forecasting in the river Arno. Hydrological Sciences Journal, 48(3), 381-398. 776 

Cannas, B., Fanni, A., Pintus, M., and Sechi, G. M. (2002), Neural Network Models to 777 

Forecast Hydrological Risk. IEEE IJCNN, 2002. 778 

Chan, J.C.L., and Shi, J.E. (1999), Prediction of the summer monsoon rainfall over South 779 

China. International Journal of Climatology, 19 (11), 1255-1265. 780 

Chattopadhyay,S., and Chattopadhyay, G. (2007), Identification of the best hidden layer size 781 

for three-layered neural net in predicting monsoon rainfall in India. Journal of 782 

Hydroinformatics, 10(2), 181-188. 783 

Cheng, B., and Titterington, D. M. (1994), Titterington Neural networks: A review from a 784 

statistical perspective. Statistical Science, 9(1), 2-54. 785 

Chu, P.S., and He, Y.X. (1994), Long-Range Prediction of Hawaiian Winter Rainfall Using 786 

Canonical Correlation-Analysis). International Journal of Climatology, 14(6), 659-669. 787 

Corzo, G., and Solomatine, D. (2007), Baseflow separation techniques for modular artificial 788 

neural network modelling in flow forecasting. Hydrological Sciences–Journal–des Sciences 789 

Hydrologiques, 52(3), 491-507. 790 

Page 37: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

37

Coulibaly, P., Haché, M., Fortin, V., and Bobée, B. (2005), Improving daily reservoir inflow 791 

forecasts with model combination. Journal of Hydrologic Engineering, 10(2), 91-99. 792 

Dawson, C.W., and Wilby, R.L. (2001), Hydrological Modeling Using Artificial Neural 793 

Networks. Progress in Physical Geography, 25(1), 80-108. 794 

DelSole, T., and Shukla, J. (2002), Linear prediction of Indian monsoon rainfall. Journal of 795 

Climate, 15 (24), 3645-3658. 796 

de Vos, N.J. and Rientjes, T.H.M. (2005), Constraints of artificial neural networks for rainfall 797 

-runoff modeling: trade-offs in hydrological state representation and model evaluation. 798 

Hydrology and Earth System Sciences, 9, 111-126. 799 

Diomede, T., Davolio, S., Marsigli, C., Miglietta, M. M., Moscatello, A., Papetti, P., 800 

Paccagnella, T., Buzzi, A., and Malguzzi, P. (2008), Discharge prediction based on multi-801 

model precipitation forecasts. Meteorology and Atmospheric Physics, 101 (3-4), 245-265. 802 

Draper, N. R. and Smith, H. (1998), Applied regression analysis, 3rd ed. New York: Wiley. 803 

Fortin, V., Ouarda, T.B.M.J., and Bobe´e, B., (1997), Comment on ‘The use of artificial 804 

neural networks for the prediction of water quality parameters’ by H.R. Maier and G.C. 805 

Dandy. Water Resources Research, 33 (10), 2423-2242. 806 

Fraser, A.M. and Swinney, H.L. (1986), Independent coordinates for strange attractors from 807 

mutual information, Physical Review A, 33(2), 1134-1140. 808 

Ganguly, A.R., and Bras, R.L. (2003), Distributed quantitative precipitation forecasting 809 

(DQPF) using information from radar and numerical weather prediction models. Journal of 810 

Hydrometeorology, 4 (6), 1168-1180. 811 

Page 38: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

38

Giustolisi, O., and Savic, D. A. (2006), A symbolic data-driven technique based on 812 

evolutionary polynomial regression, Journal of Hydroinformatics, 8(3), 207-222. 813 

Golyandina, N. Nekrutkin,V., and Zhigljavsky, A. (2001), Analysis of Time Series Structure: 814 

SSA and Related Techniques, Chapman & Hall/CRC. 815 

Guhathakurta, P. (2008), Long lead monsoon rainfall prediction for meteorological sub-816 

divisions of India using deterministic artificial neural network model. Meteorology and 817 

Atmospheric Physics, 101 (1-2), 93-108. 818 

Guhathakurta, P., Rajeevan, M., and Thapliyal, V. (1999), Long range forecasting Indian 819 

summer monsoon rainfall by a hybrid Principal Component neural network model. 820 

Meteorology and atmospheric physics, 71, (3-4), 255-266. 821 

Hsu, K.L., Gupta, H.V., Gao, X.G., Sorooshian, S., and Imam, B. (2002), Self-organizing 822 

linear output map (SOLO): An artificial neural network suitable for hydrologic modeling and 823 

analysis. Water Resources Research, 38(12), 1302,doi:10.1029/2001WR000795. 824 

Hu, T.S., Lam, K.C., Ng, S.T., (2001), River flow time series prediction with a range-825 

dependent neural network. Hydrological Science Journal, 46 (5), 729-745. 826 

Hotelling, H., (1933), Analysis of a complex of statistical variables into principal components. 827 

Journal of Educational Psychology, 24, 417-441. 828 

Hu, T.S.,Wu, F.Y., and Zhang, X. (2007), Rainfall-runoff modeling using principal 829 

component analysis and neural network. Nordic Hydrology, 38(3), 235-248. 830 

Jain, A., and Srinivasulu, S. (2006), Integrated approach to model decomposed flow 831 

hydrograph using artificial neural network and conceptual techniques. Journal of Hydrology, 832 

317, 291-306. 833 

Page 39: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

39

Jayawardena, A.W., and Lai, F. (1994), Analysis and prediction of chaos in rainfall and 834 

stream flow time series. Journal of hydrology, 153, 23-52. 835 

Kennel, M. B., Brown, R., and Abarbanel, H. D. I. (1992), Determining embedding 836 

dimension for phase space reconstruction using geometrical construction. Physical Review A., 837 

45(6), 3403-3411. 838 

Kim, T., Heo, J. H. and Jeong, C. S. (2006), Multireservoir system optimization in the Han 839 

River basin using multi-objective genetic algorithms. Hydrological Processes, 20 (9), 2057-840 

2075. 841 

Kitanidis, P. K. and Bras, R. L. (1980), Real-time forecasting with a conceptual hydrologic 842 

model, 2, applications and results, Water Resources Research, 16 (6), 1034–1044. 843 

Lee, C.F., Lee, J.C. and Lee, A.C., (2000), Statistics for business and financial economics 844 

(2nd version), World Scientific, Singapore. 845 

Legates, D. R., and McCabe, Jr, G. J. (1999), Evaluating the use of goodness-of-fit measures 846 

in hydrologic and hydroclimatic model validation, Water Resources. Research, 35(1), 233- 847 

241. 848 

Li, F., and Zeng, Q.C. (2008), Statistical prediction of East Asian summer monsoon rainfall 849 

based on SST and sea ice concentration. Journal of the meteorological society of Japan, 86 850 

(1), 237-243. 851 

Lin, G.F., and Chen, L.H. (2005), Application of artificial neural network to typhoon rainfall 852 

forecasting. Hydrological Processes, 19 (9), 1825-1837. 853 

Lin, G.F., Chen, G.R., Wu, M.C., and Chou, Y.C. (2009), Effective forecasting of hourly 854 

typhoon rainfall using support vector machines. Water Resources Research, 45, W08440. 855 

Page 40: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

40

doi:10.1029/2009WR007911. 856 

Lin, G.F., and Wu, M.C. (2009), A hybrid neural network model for typhoon-rainfall 857 

forecasting. Journal of Hydrology, 375 (3-4), 450-458. 858 

Lisi, F., Nicolis, and Sandri, M. (1995), Combining singular-spectrum analysis and neural 859 

networks for time series forecasting. Neural Processing Letters, 2(4), 6-10. 860 

Marques, C.A.F., Ferreira, J., Rocha, A., Castanheira, J., Gonçalves, P., Vaz. N., and Dias, 861 

J.M. (2006), Singular spectral analysis and forecasting of hydrological time series. Physics 862 

and Chemistry of the Earth, 31, 1172-1179. 863 

May, R.J., Maier, H.R., Dandy, G.C., and Fernando, T.M.K. (2008), Non-linear variable 864 

selection for artificial neural networks using partial mutual information. Environmental 865 

Modeling & Software, 23, 1312-1328. 866 

McCuen, R. H. (2005), Hydrologic analysis and design (3rd ed.), Upper Saddle River, NJ: 867 

Pearson/Prentice Hall. 868 

Munot, A. A., and Kumar, K.K. (2007), Long range prediction of Indian summer monsoon 869 

rainfall. Journal of Earth System Science, 116 (1), 73-79. 870 

Nash, J. E. and Sutcliffe, J. V. (1970), River flow forecasting through conceptual models part 871 

I — A discussion of principles. Journal of Hydrology, 10 (3), 282-290. 872 

Nayagam, L.R., Janardanan, R., and Mohan, H.S.R. (2008), An empirical model for the 873 

seasonal prediction of southwest monsoon rainfall over Kerala, a meteorological subdivision 874 

of India. International Journal of Climatology, 28 (6), 823-831. 875 

Newbold, P., Carlson, W. L., and Thorne, B.M. (2003), Statistics for business and economics 876 

Page 41: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

41

(fifth version), Prentice Hall, Upper Saddle River, N.J. 877 

Partal, T. and Kişi, Ӧ. (2007), Wavelet and Neuro-fuzzy conjunction model for precipitation 878 

forecasting. Journal of Hydrology, 342 (2), 199-212. 879 

Pearson, K., (1901), On lines and planes of closest fit to systems of points in space. 880 

Philosophical Magazine, 2, 559-572. 881 

Pongracz, R., Bartholy, J., and Bogardi, I. (2001), Fuzzy rule-based prediction of monthly 882 

precipitation. Physics and Chemistry of the Earth Part B-Hydrology Oceans and Atmosphere, 883 

26 (9), 663-667. 884 

Rajurkar, M.P., Kothyari, U.C., and Chaube, U.C. (2002), Artificial neural networks for daily 885 

rainfall-runoff modeling. Hydrological Sciences-Journal, 47(6), 865-876. 886 

Salas, J.D., Delleur, J.W., Yevjevich, V., and Lane, W.L. (eds) (1985), Applied Modeling of 887 

Hydrologic Time Series. Water Resources Publications: Littleton, Colorado. 888 

See, L., and Openshaw, S. (2000), A hybrid multi-model approach to river level forecasting. 889 

Hydrology Science Journal, 45(4), 523-536. 890 

Shamseldin, A.Y. and O’Connor, K.M. (1999), A real-time combination method for the 891 

outputs of different rainfall-runoff models. Hydrological Sciences Journal, 44(6), 895-912. 892 

Shamseldin, A.Y., O'Connor, K.M. and Liang, G.C., (1997), Methods for combining the 893 

outputs of different rainfall runoff models, Journal of Hydrology, 197, 203–229. 894 

Sheng, C., Gao, S., and Xue, M. (2006), Short-range prediction of a heavy precipitation event 895 

by assimilating Chinese CINRAD-SA radar reflectivity data using complex cloud analysis. 896 

Meteorology and Atmospheric Physics, 94 (1-4), 167-183. 897 

Page 42: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

42

Shrestha, D.L., and Solomatine, D.P. (2006), Machine learning approaches for estimation of 898 

prediction interval for the model output. Neural Networks, 19(2), 225-235. 899 

Silverman, D., and Dracup, J.A. (2000), Artificial neural networks and long-range 900 

precipitation in California. Journal of Applied Meteorology 31 (1), 57-66. 901 

Sivapragasam, C., and Liong, S. Y. (2005), Flow categorization model for improving 902 

forecasting. Nordic Hydrology 36 (1), 37-48. 903 

Sivapragasam, C., Liong, S.Y. and Pasha, M.F.K. (2001), Rainfall and runoff forecasting with 904 

SSA-SVM approach. Journal of Hydroinformatics, 3(7), 141-152. 905 

Sivapragasam, C., Vincent and, P., and Vasudevan, G. (2007), Genetic programming model 906 

for forecast of short and noisy Data. Hydrological Processes, 21, 266-272. 907 

Solomatine, D. P., and Ostfeld, A. (2008), Data-driven modelling: Some past experiences and 908 

new approaches. Journal of Hydroinformatics, 10(1), 3-22. 909 

Solomatine, D. P. and Xue, Y. I. (2004), M5 model trees and neural networks: application to 910 

flood forecasting in the upper reach of the Huai River in China. Journal of Hydrological 911 

Engineering, 9(6), 491-501. 912 

Sudheer, K. P., Gosain, A. K., and Ramasastri, K. S. (2002), A data-driven algorithm for 913 

constructing artificial neural network rainfall-runoff models. Hydrological Processes, 16, 914 

1325-1330. 915 

Sugihara, G and May, R.M. (1990), Nonlinear forecasting as a way of distinguishing chaos 916 

from measurement error in time series. Nature, 344, 734- 741. 917 

Sudheer, K. P., Gosain, A. K., and Ramasastri, K. S. (2002), A data-driven algorithm for 918 

Page 43: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

43

constructing artificial neural network rainfall-runoff models. Hydrological Processes, 16, 919 

1325-1330. 920 

Theiler, J., (1986), Spurious dimension from correlation algorithms applied to limited time-921 

series data. Physical Review A, 34 (3), 2427-2432. 922 

Toth, E, Brath, A., and Montanari, A. (2000), Comparison of short-term rainfall prediction 923 

models for real-time flood forecasting. Journal of Hydrology, 239, 132-147. 924 

Venkatesan, C., Raskar, S.D., Tambe, S.S., Kulkarni, B.D., and Keshavamurty, R.N. (1997), 925 

Prediction of all India summer monsoon rainfall using error-back-propagation neural 926 

networks. Meteorology and Atmospheric Physics, 62 (3-4), 225-240. 927 

Wang,W., Van Gelder, P.H.A.J.M., Vrijling, J.K. and Ma, J. (2006), Forecasting Daily 928 

Streamflow Using Hybrid ANN Models. Journal of Hydrology, 324, 383-399. 929 

Wu, C.L., Chau, K.W., and Li, Y.S., (2009), Methods to improve neural network performance 930 

in daily flows prediction. Journal of Hydrology, 372(1-4), 80-93. 931 

Wu, C.L., Chau, K.W. and Li, Y.S. (2008), River stage prediction based on a distributed 932 

support vector regression. Journal of Hydrology, 358, 96-111. 933 

Xiong, L.H., Shamseldin, A.Y., and O’Connor, K.M. (2001), A non-linear combination of 934 

forecasts of rainfall-runoff models by the first-order TS fussy system for forecast of rainfall-935 

runoff model. Journal of Hydrology, 245, 196-217. 936 

Yates, D.N., Warner, T.T., and Leavesley, G.H. (2000), Prediction of a flash flood in complex 937 

terrain. Part II: A comparison of flood discharge simulations using rainfall input from radar, a 938 

dynamic model, and an automated algorithmic system. Journal of Applied Meteorology, 39 939 

(6), 815-825. 940 

Page 44: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

44

Yu, D.L., Gomm, J.B., and Williams, D. (2000), Neural model input selection for a MIMO 941 

chemical process. Engineering Applications of Artificial Intelligence, 13(1), 15-23. 942 

Zhang, B., and Govindaraju, R. S. (2000), Prediction of watershed runoff using Bayesian 943 

concepts and modular neural networks. Water Resources Research, 36(3), 753-762. 944 

945 

Page 45: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

45

Figure Captions 946 

Figure1. Location of Daning river basin (Map of Chongqing in the left panel, and Daning 947 

watershed in the dashed box) 948 

Figure 2. Rainfall series of (a) Wuxi, (b) India, (c) Zhenwan, and (d) Zhongxian 949 

Figure 3. Flow chart of hard forecasting using a modular model 950 

Figure 4. Implementation framework of forecast models with/without data preprocessing 951 

Figure 5. Plots of ACF and PACF of the rainfall series with the 95% confidence bounds (the 952 

dashed lines), (a) and (c) for Wuxi, and (b) and (d) for Zhenwan 953 

Figure 6. Check of robustness of K in KNN method for (a) Wuxi, (b) India, (c) Zhenwan, 954 

and (d) Zhongxian. 955 

Figure 7. Singular Spectrum as a function of lag using various window lengths L for (a) Wuxi, 956 

(b) India, (c) Zhenwan, and (d) Zhongxian 957 

Figure 8. Sensitivity analysis of singular Spectrum on varied for (a) Wuxi, (b) India, (c) 958 

Zhenwan, and (d) Zhongxian 959 

Figure 9. Plots of CCF between each RC and the raw rainfall data for Zhenwan 960 

Figure 10. Supervised procedure for a forecast model with SSA 961 

Figure 3. Scatter plots and hyetographs of one-step-ahead forecast using ANN and MANN 962 

for Wuxi ((a) and (c) from ANN, and (b) and (d) from MANN) 963 

Figure 4. Scatter plots and hyetographs of one-step-ahead forecast using ANN and MANN 964 

for India ((a) and (c) from ANN, and (b) and (d) from MANN) 965 

Figure 5. CCFs at three forecast horizons for various lags in time of observed and forecasted 966 

rainfall of ANN and MANN: Wuxi (left column) and India (right column) 967 

Figure 6. Performance of ANN with SSA1 in terms of RMSE as a function of p ( L ) 968 

Page 46: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

46

selected RCs at various prediction horizons: (a) Wuxi and (b) India 969 

Figure 7. Scatter plots and hyetographs of one-step-ahead forecast using ANN and MANN 970 

with SSA2 for Wuxi ((a) and (c) from ANN, and (b) and (d) from MANN) 971 

Figure 8. Scatter plots and hyetographs of one-step-ahead forecast using ANN and MANN 972 

with SSA2 for India ((a) and (c) from ANN, and (b) and (d) from MANN) 973 

Figure 17. CCFs at three forecast horizons using ANN and MANN with Wuxi and India ((a) 974 

and (c) for Wuxi and (b) and (d) for India) 975 

Figure 18. Hyetographs (detail; from 900th to 950th) of Wuxi: (1) the raw rainfall series and (2) 976 

reconstructed rainfall series for one-step prediction of ANN. 977 

Figure 19. CCF between inputs and output for ANN and ANN-SSA2 at three prediction 978 

horizons using the Wuxi data 979 

980 

Page 47: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

47

Table captions 981 

Table 1. Pertinent information for four watersheds and the rainfall data 982 

Table 2. Comparison of methods to determine mode inputs using ANN model 983 

Table 3. Performance comparison of ANN with different data-transformed methods 984 

Table 4. Model performances at three forecasting horizons under normal mode 985 

Table 5. Model performances of ANN-MA using the Wuxi data 986 

Table 6. Multiple-step predictions for Wuxi and India series using LR, K-NN, and ANN with 987 

PCA1 988 

Table 7. Multiple-step predictions for Wuxi and India series using LR, K-NN, and ANN 989 

with PCA2 990 

Table 8. Optimal p RCs for model inputs at various forecasting horizons 991 

Table 9. Model performances at three forecasting horizons using MANN and ANN with the 992 

SSA2 993 

Table 10. RMSE of the LR model coupled with SSA2 using various L 994 

995 

Page 48: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

48

0 15 30 45 60 75 km

Chongqing

Gaolou

Jianlou

Changan

XujiabaXining

Wuxi

Wanzhou

Yunyan

KaixianFengjie

Wushan

YangtzeRi

ver

Gaolou

Jianlou

Changan

XujiabaXining

Wuxi

Wushantown or cityhydrology stationrain gauge

996 

Figure 1. 997 

998 

Page 49: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

49

999 

 1000 

1000 2000 3000 4000 5000 6000 70000

50

100

150

200

Time (1/1/1988-31/12/2007)

Rai

nfal

l (m

m/d

ay)

500 1000 1500 2000 2500 3000 35000

50

100

150

200

Time (1/1/1989-31/12/1998)

Rai

nfal

l (m

m/d

ay)

200 400 600 800 1000 1200 1400 16000

1000

2000

3000

4000

Time (1/1871-12/2007)

Rai

nfal

l (m

m/m

onth

)

100 200 300 400 500 6000

200

400

600

Time (1/1956-12/2007)

Rai

nfal

l (m

m/m

onth

)

Wuxi

Linear fitting

Zhenwan

Linear fitting

India

Linear fitting

Zhongxian

Linear fitting

(d)(c)

(b)(a)

1001 

Figure 2. 1002 

1003 

Page 50: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

50

1004 

Figure 3. 1005 

1006 

Final output

( 1)Ft T mx

Raw input series

x

FCM clustering

Sub-model 1

Determination of model

inputs

Sub-model 2

Sub-model 3

Page 51: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

51

1007 

1008 

Figure 4. 1009 

Raw input series

x

LCA

AMI

PMI

FNN

CI

SLR

MOGA

Methods for model inputs

MA

PCA

SSA

MA

PCA

SSA

LR

K-NN

ANN

MANN

MA

PCA

SSA

MA

PCA

SSA

Forecasting models

Method for data preprocessing

( 1)Ft T mx

Model output

( 1)Ft T mx

( 1)Ft T mx

( 1)Ft T mx

Page 52: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

52

1010 

1011 

0 5 10 15 20-0.2

0.2

0.6

1

AC

F

Wuxi

0 5 10 15 20-0.2

0.2

0.6

1

AC

F

Zhenwan

0 5 10 15 20-0.2

0.2

0.6

1

Lag (day)

PA

CF

0 5 10 15 20-0.2

0.2

0.6

1

Lag (day)

PA

CF

(c) (d)

(a) (b)

1012 Figure 5. 1013 

1014 

Page 53: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

53

1015 

10 20 30 4010

11

12

13

K

Val

i-RM

SE

Wuxi

10 20 30 40200

220

240

260

280

300

K

Val

i-RM

SE

India

10 20 30 4010

11

12

13

14

15

K

Val

i-RM

SE

Zhenwan

10 20 30 4052

54

56

58

K

Val

i-RM

SE

Zhongxian(d)(c)

(b)(a)

K=m+1minimum

K=m+1minimum

minimum

minimum

K=m+1

K=m+1

1016 

Figure 6. 1017 

1018 

Page 54: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

54

1 3 5 7 96

6.5

7

7.5ln

( s

ingu

lar

valu

e )

Mode

Wuxi

1 3 5 7 96.2

6.4

6.6

6.8

7

7.2

ln (

sin

gula

r va

lue

)

Mode

Zhenwan

1 3 5 7 99

10

11

12

ln (

sin

gula

r va

lue

)

Mode

India

1 3 5 7 97

7.5

8

8.5

9

ln (

sin

gula

r va

lue

)

Mode

Zhongxian

(a) (b)

(c) (d)

L=10

L=3 L=3

L=10

L=3

L=10

L=3

L=10

1019 

Figure 7. 1020 

1021 

Page 55: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

55

1022 

1023 

1024 

1 2 3 4 56.3

6.8

7.3

ln (

sin

gula

r va

lue

)

singular number, L

Wuxi

=1=3=5=7=10

1 2 3 4 5 6 7

6.3

6.7

7.1

ln (

sin

gula

r va

lue

)

singular number, L

Zhenwan

=1=3

=5=7=10

1 2 3 4 5 6 79

10

11

12

ln (

sin

gula

r va

lue

)

singular number, L

India

=1=3=5=7=10

1 2 3 4 5 67

8

9

ln (

sin

gula

r va

lue

)

singular number, L

Zhongxian

=1=3=5=7=10

(d)(c)

(b)(a)

1025 

Figure 8. 1026 

Page 56: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

56

1027 

0 2 4 6 8 10 120

0.5

1RC1

CC

F

0 2 4 6 8 10 12-1

0

1 RC2

CC

F

0 2 4 6 8 10 12-1

0

1 RC3

CC

F

0 2 4 6 8 10 12-1

0

1 RC4

CC

F

0 2 4 6 8 10 12-1

0

1RC5

CC

F

0 2 4 6 8 10 12-0.5

0

0.5RC6

CC

F

0 2 4 6 8 10 12-0.5

0

0.5RC7

Lag (day)

CC

F

0 2 4 6 8 10 120

0.5

1Average over all RCs

Lag (day)

CC

F

1028 Figure 9. 1029 

1030 

Page 57: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

57

1031 

1032 

Figure 10. 1033 

1034 

Choose (τ, L) for singular spectrum analysis on raw rainfall data

SSA Decomposition

Decompose raw rainfall data into L reconstructed components (RCs)

Obtain correlation coefficient matrix by correlation analysis between RC and raw rainfall data at varied lags (or termed prediction horizons) from 1 to T

Compute average correlation coefficient by averaging over L correlation coefficients at each same lag, as shown in the last plot of Figure 9.

For a specified lag (or prediction horizon), all L values of CCFs at this lag are sorted in a ascending (or ascending) order if the average CCF is negative (or positive) when the pruning algorithm for the filter of RCs is adopted

ANN is repeatedly trained and tested by new time series which is obtained by pruning operation on sorted RCs where the number of p varies from L at the beginning to 1 at the end. The optimal p is associated with the best performance of ANN.

Correlation Coefficients Sort

Reconstructed Components Filter

Page 58: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

58

1035 

0 50 100 1500

50

100

150

Observed

Fore

cast

ed

ANN

900 910 920 930 940 9500

50

100

150

Time (day)

Rai

nfal

l (m

m)

0 50 100 1500

50

100

150

Observed

Fore

cast

ed

MANN

900 910 920 930 940 9500

50

100

150

Time (day)

Rai

nfal

l (m

m)

observedforecasted

observedforecasted

RMSE: 8.2CE: 0.50PI: 0.60

RMSE: 10.6CE: 0.17PI: 0.32

(b)

(d)(c)

(a)

1036 Figure 11. 1037 

1038 

Page 59: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

59

1039 

1040 

1041 

0 500 1000 1500 2000 2500 30000

1000

2000

3000

Observed

Fore

cast

ed

ANN

150 160 170 180 190 2000

1000

2000

3000

4000

Time (month)

Rai

nfal

l (m

m)

0 500 1000 1500 2000 2500 30000

1000

2000

3000

Observed

Fore

cast

ed

MANN

150 160 170 180 190 2000

1000

2000

3000

4000

Time (month)

Rai

nfal

l (m

m)

observedforecasted

observedforecasted

(a)

(c) (d)

(b)

RMSE: 245.2CE: 0.93PI: 0.86

RMSE: 243.3CE: 0.93PI: 0.86

1042 Figure 12. 1043 

1044 

Page 60: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

60

1045 

-10 -5 0 5 100

0.5

1

Lag (day)

CC

F

ANN (Wuxi)

-10 -5 0 5 100

0.5

1

Lag (day)

CC

F

MANN (Wuxi)

-10 -5 0 5 10-1

-0.5

0

0.5

1

Lag (month)

CC

F

ANN (India)

-10 -5 0 5 10-1

-0.5

0

0.5

1

Lag (month)

CC

F

MANN (India)

One-day

Two-day

Three-day

One-day

Two-day

Three-day

One-month

Two-month

Three-month

One-month

Two-month

Three-month

(b)

(d)(c)

(a)

1046 Figure 13. 1047 

1048 

Page 61: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

61

1049 

1050 

1051 

5 4 3 2 14

5

6

7

8

9

10

11

12

p

RM

SE

(m

m)

Wuxi

one steptwo stepthree step

7 6 5 4 3 2 1160

180

200

220

240

260

280

300

p

RM

SE

(m

m)

India

one steptwo stepthree step

(a) (b)

1052 Figure 14. 1053 

1054 

Page 62: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

62

0 50 100 1500

50

100

150

Observed

Fore

cast

edANN-SSA2

900 910 920 930 940 9500

50

100

150

Time (day)

Rai

nfal

l (m

m)

0 50 100 1500

50

100

150

Observed

Fore

cast

ed

MANN-SSA2

900 910 920 930 940 9500

50

100

150

Time (day)

Rai

nfal

l (m

m)

observedforecasted

observedforecasted

(a)

(c) (d)

(b)

RMSE: 4.43CE: 0.84PI: 0.87

RMSE: 3.63CE: 0.90PI: 0.92

1055 Figure 15. 1056 

1057 

Page 63: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

63

1058 

1059 

 1060 

0 500 1000 1500 2000 2500 30000

1000

2000

3000

Observed

Fore

cast

ed

ANN-SSA2

150 160 170 180 190 2000

1000

2000

3000

4000

Time (month)

Rai

nfal

l (m

m)

0 500 1000 1500 2000 2500 30000

1000

2000

3000

Observed

Fore

cast

ed

ANN-SSA2

150 160 170 180 190 2000

1000

2000

3000

4000

Time (month)

Rai

nfal

l (m

m)

observedforecasted

observedforecasted

RMSE: 144.2CE: 0.98PI: 0.95

RMSE: 164.7CE: 0.98PI: 0.95

(b)

(d)(c)

(a)

1061 Figure 16. 1062 

1063 

Page 64: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

64

1064 

1065 

1066 

-10 -5 0 5 10-0.5

0

0.5

1

Lag (day)

CC

F

ANN-SSA2(Wuxi)

-10 -5 0 5 10-0.5

0

0.5

1

Lag (day)

CC

F

MANN-SSA2(Wuxi)

-10 -5 0 5 10-1

-0.5

0

0.5

1

Lag (month)

CC

F

ANN-SSA2(India)

-10 -5 0 5 10-1

-0.5

0

0.5

1

Lag (month)

CC

F

MANN-SSA2(India)

One-dayTwo-dayThree-day

One-dayTwo-dayThree-day

One-monthTwo-monthThree-month

One-monthTwo-monthThree-month

(a)

(c) (d)

(b)

1067 Figure 17. 1068 

1069 

Page 65: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

65

1070 

1071 

900 905 910 915 920 925 930 935 940 945 9500

20

40

60

Time (day)

Rai

nfal

l (m

m)

Original Wuxi rainfall series

900 905 910 915 920 925 930 935 940 945 950

0

20

40

60

Time (day)

Rai

nfal

l (m

m)

Reconstructed Wuxi rainfall series

(a)

(b)

1072 

Figure 18. 1073 

1074 

Page 66: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

66

1075 

0 5 10 150

0.2

0.4

0.6

0.8

1

Lag (day)

CC

F

One-step lead

0 5 10 150

0.2

0.4

0.6

0.8

1

Lag (day)

Two-step lead

0 5 10 150

0.2

0.4

0.6

0.8

1

Lag (day)

Three-step lead

Input series generated by SSA2 Raw input series

starting point

starting point

starting point

1076 

Figure 19. 1077 

1078 

Page 67: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

67

Table 11. Pertinent information for four watersheds and the rainfall data 1079 

Watershed and datasets

Statistical parameters Watershed area and data period

µ Sx Cv Cs Xmin Xmax (mm) (mm) (mm) (mm)

Wuxi Original data 3.67 10.15 0.36 5.68 0.00 154 Area: Training 3.81 10.94 0.35 6.27 0.00 147 2 000 km2 Cross-validation 3.42 8.87 0.39 4.96 0.00 102 Data period: Testing 4.03 11.60 0.35 5.46 0.00 154 Jan., 1988- Dec., 2007 Zhenwan Original data 4.3 11.0 0.39 4.94 0.0 159 Area: Training 4.3 11.2 0.38 5.60 0.0 159 7554 km2 Cross-validation 4.7 11.2 0.42 4.22 0.0 125 Data period: Testing 4.0 10.9 0.37 4.97 0.0 133 Jan., 1989- Dec., 1998 India Original data 906.7 951.6 1.0 0.9 3.0 3460 Area:

Training 904.8 955.7 0.9 0.9 3.0 3393 all India

Cross-validation 918.2 969.5 0.9 1.0 8.0 3460 Data period:

Testing 898.9 927.4 1.0 0.9 16.0 3232 Jan., 1871- Dec., 2007 Zhongxian Original data 96.2 79.2 1.2 1.2 0.0 599 Area: Training 97.2 77.5 1.3 0.9 0.0 429 Cross-validation 98.6 86.8 1.1 1.9 0.0 599 Data period: Testing 91.8 74.9 1.2 0.8 0.0 306 Jan., 1956- Dec., 2007

1080 

1081 

Page 68: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

68

Table 12. Comparison of methods to determine mode inputs using ANN model 1082 

Watershed Methods m Effective inputsa Identified ANN

RMSE

Wuxi

LCA 1 20 The last 5 (5-5-1) 10.74

AMI 1 12 Except for Xt-10,t-9 (10-3-1) 10.91

PMIb 1 12 Xt,t-1,t-3,t-5,t-7,t-10,t-4 (7-8-1) 10.85

FNN 1 20 The last 14 (14-3-1) 11.02

CI 4 20 Nil

SLR 1 12 Xt-11,t-7,t-4,t-2,t-1,t (6-3-1) 10.94

MOGA 1 12 Xt, t-1 (2-6-1) 10.55

Zhenwan

LCA 1 20 The last 7 (7-4-1) 11.03

AMI 1 12 Except for Xt-11,t-10,t-9,t-8,t-2 (7-5-1) 10.95

PMI 1 12 Xt-4,t,t-1,t-3,t-11,t-5,t-10,t-6,t-9 (10-4-1) 10.98

FNN 1 20 Last 14 (14-3-1) 11.08

CI 3 20 Nil

SLR 1 12 Xt-11,t-7,t-6,t-3,t-1,t (6-3-1) 11.01

MOGA 1 12 Xt,t-4,t-7,t-9,t-11 (5-8-1) 10.43

India

LCA 1 20 the last 12 (12-5-1) 256.22

AMI 1 12 the last 12 (12-5-1) 256.22

PMI 1 12 Xt-11,t-10,t-5,t (4-5-1) 275.06

FNN 1 20 the last 5 (5-9-1) 286.04

CI 4 20 nil

SLR 1 12 except for Xt-4 (11-9-1) 258.13

MOGA 1 12 Xt-11,t-9,t-7,t-5,t-4,t-1,t (7-1-1) 277.57

Zhongxian

LCA 1 20 the last 13 (13-3-1) 51.70

AMI 1 12 Xt-11,t-10,t-6,t-5,t-4,t (6-5-1) 54.67

PMI 1 12 Xt-11,t,t-9,t-7,t-7 (5-9-1) 55.39

FNN 1 20 the last 4 (4-7-1) 59.78

CI 3 20 nil

SLR 1 12 Xt-11,t-7,t-6,t-5,t-3,t (6-6-1) 55.47

MOGA 1 12 Xt-11,t-10,t-6,t-3,t (5-2-1) 53.93

Note:a for the convenience of writing down effective inputs, “Xt, t-1”stands for Xt, Xt-1; b effective inputs 1083 from PMI are in descending order of priority. 1084 

1085 

Page 69: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

69

Table 13. Performance comparison of ANN with different data-transformed methods 1086 

Watershed Data

Transformation RMSE CE

1 2 3 a 1 2 3 Wuxi Std_raw 10.77 11.54 11.62b 0.14 0.01 0.00 Norm_raw 10.57 11.49 11.59 0.17 0.02 0.00 Std_nth_root 11.00 12.02 12.10 0.10 -0.07 -0.09 Norm_nth_root 11.15 12.01 12.09 0.08 -0.07 -0.09 Zhenwan Std_raw 11.03 11.11 11.16 0.03 0.02 0.01 Norm_raw 10.72 11.06 11.14 0.09 0.03 0.02 Std_nth_root 11.25 11.68 11.75 -0.01 -0.09 -0.10 Norm_nth_root 11.34 11.70 11.74 -0.02 -0.09 -0.09 Wuxi Std_raw 256.22 250.51 249.46 0.92 0.93 0.93 Norm_raw 251.74 246.48 250.99 0.93 0.93 0.93

Std_nth_root 259.81 253.42 256.43 0.92 0.93 0.92

Norm_nth_root 252.75 251.95 259.00 0.93 0.93 0.92

Zhongxian Std_raw 54.26 54.23 53.91 0.48 0.48 0.48 Norm_raw 52.91 53.10 52.78 0.50 0.50 0.51 Std_nth_root 52.15 53.44 53.17 0.52 0.49 0.50 Norm_nth_root 52.27 53.37 54.30 0.51 0.49 0.48

a Numbers of “1, 2, and 3” denote one-, two-, and three-day-ahead forecasting; 1087 b Result is average over 10 best runs from total 20 runs; 1088 

1089 

Page 70: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

70

Table 14. Model performances at three forecasting horizons under the normal mode 1090 

Watershed Model RMSE CE PI

1* 2* 3* 1 2 3 1 2 3

WuXi

Naïve 12.2 16.0 16.5 0.05 -0.61 -0.72 0.00 0.00 0.00

LR 10.9 11.9 12.0 0.12 -0.05 -0.07 0.28 0.41 0.43

K-NN 11.8 12.4 12.6 -0.03 -0.14 -0.18 0.16 0.36 0.38

ANN 10.6 11.5 11.6 0.17 0.02 0.00 0.32 0.45 0.47

MANN 8.2 9.2 9.0 0.50 0.38 0.40 0.60 0.65 0.68

Zhenwan

Naïve 12.4 12.2 13.6 -0.53 -0.49 -0.84 0.00 0.00 0.00

LR 11.1 11.3 11.4 0.02 -0.02 -0.03 0.39 0.42 0.45

K-NN 12.7 12.7 12.8 -0.27 -0.28 -0.30 0.21 0.28 0.31

ANN 10.7 11.1 11.1 0.09 0.03 0.02 0.43 0.45 0.48

MANN 7.9 9.6 9.9 0.50 0.27 0.23 0.69 0.59 0.59

India

Naïve 643.1 1084.5 1399.2 0.52 -0.37 -1.28 0.00 0.00 0.00

LR 286.8 301.6 302.7 0.90 0.89 0.89 0.80 0.92 0.95

K-NN 246.6 257.3 251.2 0.93 0.92 0.93 0.85 0.94 0.97

ANN 245.2 245.9 247.2 0.93 0.93 0.93 0.86 0.95 0.97

MANN 243.3 241.8 244.4 0.93 0.93 0.93 0.86 0.95 0.97

Zhongxian

Naïve 75.7 91.9 109.2 -0.03 -0.51 -1.13 0.00 0.00 -0.02

LR 56.1 57.7 58.4 0.44 0.41 0.39 0.46 0.60 0.71

K-NN 55.0 56.0 57.2 0.46 0.44 0.42 0.48 0.63 0.72

ANN 52.5 54.4 54.3 0.51 0.48 0.48 0.52 0.65 0.75

MANN 50.3 50.2 53.2 0.55 0.55 0.50 0.56 0.70 0.76 * The number of “1, 2, and 3” denote one-, two-, and three-step-ahead forecasts 1091 

1092 

Page 71: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

71

1093 

Table 15. Model performances of ANN-MA using the Wuxi data 1094 

Prediction horizons

Performance index

Window length (k) for MA 1 2 3 4 5 6 7 8 9 10

One-step RMSE 10.60 10.70 10.63 10.72 10.77 10.70 10.83 10.83 10.72 10.68 CE 0.17 0.15 0.16 0.15 0.14 0.15 0.13 0.13 0.15 0.15 PI 0.32 0.31 0.32 0.31 0.30 0.31 0.29 0.29 0.31 0.31 Two-step RMSE 11.50 11.51 11.51 11.50 11.53 11.56 11.55 11.47 11.45 11.45 CE 0.02 0.02 0.02 0.02 0.01 0.01 0.01 0.02 0.03 0.03 PI 0.45 0.44 0.44 0.45 0.44 0.44 0.44 0.45 0.45 0.45 Three-step RMSE 11.60 11.58 11.58 11.55 11.61 11.58 11.56 11.51 11.52 11.52 CE 0.00 0.00 0.00 0.01 0.00 0.00 0.01 0.02 0.01 0.01 PI 0.47 0.47 0.47 0.48 0.47 0.47 0.48 0.48 0.48 0.48

1095 

Page 72: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

72

Table 6. Multiple-step predictions for Wuxi and India series using LR, K-NN, and ANN 1096 

with PCA1 1097 

Watershed Performance

index V (%)1

LR K-NN ANN 1* 2* 3* 1 2 3 1 2 3

Wuxi RMSE 85 11.0 11.9 12.0 12.1 12.5 12.7 10.6 11.5 11.6 90 11.0 11.9 12.0 12.1 12.5 12.7 10.7 11.5 11.6 95 10.9 11.9 12.0 11.8 12.4 12.6 10.6 11.5 11.6 100 10.9 11.9 12.0 11.8 12.4 12.6 10.6 11.5 11.6 CE 85 0.12 -0.05 -0.07 -0.04 -0.15 -0.19 0.16 0.01 0.00 90 0.12 -0.05 -0.07 -0.04 -0.15 -0.19 0.16 0.02 0.00 95 0.12 -0.05 -0.07 -0.03 -0.14 -0.18 0.16 0.02 0.00 100 0.12 -0.05 -0.07 -0.03 -0.14 -0.18 0.17 0.02 0.00 PI 85 0.27 0.41 0.43 0.12 0.34 0.36 0.32 0.44 0.47 90 0.27 0.41 0.43 0.12 0.34 0.36 0.31 0.45 0.47 95 0.28 0.41 0.43 0.16 0.36 0.38 0.32 0.45 0.47 100 0.28 0.41 0.43 0.16 0.36 0.38 0.32 0.45 0.47 India RMSE 85 410.9 320.6 457.7 275.9 262.7 281.1 250.4 256.1 247.0 90 294.9 307.8 311.1 260.3 256.2 276.8 248.1 252.3 249.0 95 291.6 305.0 304.6 255.3 256.5 265.0 247.7 254.2 251.4 100 286.8 301.6 302.7 246.6 257.3 251.2 245.2 245.9 247.2 CE 85 0.81 0.89 0.78 0.91 0.91 0.90 0.93 0.92 0.93 90 0.90 0.89 0.89 0.92 0.91 0.91 0.93 0.93 0.93 95 0.90 0.89 0.89 0.92 0.91 0.91 0.93 0.92 0.93 100 0.90 0.89 0.89 0.93 0.92 0.93 0.92 0.92 0.93 PI 85 0.61 0.92 0.90 0.81 0.93 0.96 0.85 0.94 0.97 90 0.79 0.92 0.95 0.83 0.94 0.96 0.85 0.95 0.97 95 0.80 0.92 0.95 0.84 0.94 0.96 0.85 0.95 0.97 100 0.80 0.92 0.95 0.85 0.94 0.97 0.84 0.94 0.97

Note: * “1, 2, and 3” denote one-, two-, and three-step-ahead forecasts; 1 “V” stands for the percentage of 1098 total variance. 1099 

1100 

Page 73: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

73

Table 16. Multiple-step predictions for Wuxi and India series using LR, K-NN, and 1101 

ANN with PCA2 1102 

Watershed Performance

index V (%)1

LR K-NN ANN 1* 2* 3* 1 2 3 1 2 3

Wuxi RMSE 85 10.8 11.6 11.6 11.5 12.2 12.7 10.6 11.5 11.6 90 10.8 11.6 11.6 11.5 12.2 12.7 10.6 11.5 11.6 95 10.9 11.9 12.0 11.8 12.4 12.6 10.6 11.5 11.6 100 10.9 11.9 12.0 11.8 12.4 12.6 10.6 11.5 11.6 CE 85 0.14 0.01 0.00 0.02 -0.11 -0.19 0.16 0.02 0.00 90 0.14 0.01 0.00 0.02 -0.11 -0.19 0.16 0.02 0.00 95 0.12 -0.05 -0.07 -0.03 -0.14 -0.18 0.17 0.02 0.00 100 0.12 -0.05 -0.07 -0.03 -0.14 -0.18 0.16 0.02 0.00 PI 85 0.30 0.44 0.47 0.20 0.37 0.37 0.32 0.44 0.47 90 0.30 0.44 0.47 0.20 0.37 0.37 0.32 0.45 0.47 95 0.28 0.41 0.43 0.16 0.36 0.38 0.32 0.45 0.47 100 0.28 0.41 0.43 0.16 0.36 0.38 0.32 0.45 0.47 India RMSE 85 482.9 413.0 800.6 254.7 248.9 253.1 241.7 247.2 249.8 90 352.8 357.8 600.6 253.1 249.4 250.4 245.0 247.9 247.8 95 326.5 331.9 376.1 247.8 246.1 246.1 243.5 249.0 244.8 100 286.8 301.6 302.7 246.6 257.3 251.2 247.6 252.3 247.1 CE 85 0.73 0.80 -2.74 0.92 0.93 0.93 0.93 0.93 0.93 90 0.86 0.85 0.58 0.93 0.93 0.93 0.93 0.93 0.93 95 0.88 0.87 0.84 0.93 0.93 0.93 0.93 0.93 0.93 100 0.90 0.89 0.89 0.93 0.92 0.93 0.93 0.93 0.93 PI 85 0.44 0.86 -1.75 0.84 0.95 0.97 0.86 0.95 0.97 90 0.70 0.89 0.81 0.85 0.95 0.97 0.86 0.95 0.97 95 0.74 0.91 0.93 0.85 0.95 0.97 0.86 0.95 0.97 100 0.80 0.92 0.95 0.85 0.94 0.97 0.85 0.95 0.97

1103 

1104 

Page 74: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

74

Table 17. Optimal p RCs for model inputs at various forecasting horizons 1105 

Watershed Model Prediction horizons

Supervised filter (SSA1) Unsupervised filter (SSA2)

Optimal p RCs RMSE Optimal p RCs RMSE

Wuxi LR 1 1* 2* 6.01 1 2 6.01 2 1 5 7.73 1 5 7.73 3 1 8.40 1 8.40 K-NN 1 1 8.02 2 3 7.17 2 1 8.41 2 4 8.03 3 1 9.99 2 9.69 ANN 1 1 2 3 4.43 1 2 3 4.43 2 1 5 5.82 1 5 5.57 3 1 6.42 1 6.25 Zhenwan LR 1 1 2 3 7.19 1 2 3 7.19 2 1 7 2 7.99 1 2 7 7.99 3 1 5 8.81 1 5 8.81 K-NN 1 1 2 3 4 5 9.84 3 6 7.64 2 1 9.72 3 6 8.95 3 1 10.33 2 5 10.24 ANN 1 1 2 3 4 5.55 5 6 7 5.02 2 1 7 2 5.84 3 7 5.51 3 1 5 6.58 3 7 5.56 India LR 1 2 1 3 4 185.95 1 2 3 4 185.95 2 1 2 237.85 1 2 237.85 3 3 4 2 7 1 299.14 1 2 6 287.16 K-NN 1 2 1 3 4 5 236.90 1 2 5 6 236.39 2 1 2 247.44 1 2 5 242.65 3 3 4 2 7 249.43 1 2 5 6 243.86 ANN 1 2 166.58 1 7 164.70 2 1 2 7 173.57 1 2 7 166.30 3 3 4 2 7 1 235.59 1 5 7 172.56 Zhongxian LR 1 1 2 40.06 1 2 40.06 2 1 2 6 41.87 1 6 39.44 3 3 6 2 4 1 58.29 1 5 41.53 K-NN 1 1 2 51.78 1 2 51.78 2 1 2 53.86 1 2 3 53.32 3 3 6 2 54.15 2 3 53.16 ANN 1 1 2 3 38.39 1 5 6 34.09 2 1 44.49 1 5 6 39.45 3 3 6 46.14 1 5 34.94

Note:* the numbers of “1, 2” stand for RC1 and RC2, and these numbers in the SSA1 column is in a 1106 descending order of CCFs shown in Figure 6.12. 1107 

1108 

Page 75: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

75

Table 18. Model performances at three forecasting horizons using MANN and ANN with the 1109 

SSA2 1110 

Watershed Model RMSE CE PI

1 2 3 1 2 3 1 2 3

WuXi ANN 4.43 5.57 6.25 0.84 0.78 0.70 0.87 0.88 0.84 MANN 3.63 4.32 3.93 0.90 0.86 0.89 0.92 0.92 0.94 Zhenwan ANN 5.02 5.51 5.56 0.81 0.75 0.71 0.88 0.85 0.84 MANN 3.18 3.20 3.31 0.92 0.92 0.91 0.95 0.95 0.95 India ANN 164.7 166.3 172.6 0.97 0.97 0.97 0.95 0.98 0.99 MANN 144.2 145.1 157.4 0.98 0.98 0.97 0.95 0.98 0.99 Zhongxian ANN 34.09 39.45 34.94 0.84 0.71 0.83 0.84 0.80 0.92 MANN 28.58 32.24 32.69 0.86 0.82 0.82 0.86 0.88 0.91

1111 

1112 

Page 76: Prediction of Rainfall Time Series Using Modular

Journal of Hydrology, Vol. 389, No. 1-2, 2010, pp. 146-167

76

1113 

Table 19. RMSE of the LR model coupled with SSA2 using various L 1114 

Watershed Prediction horizons

L in SSA 3 4 5 6 7 8 9 10

Wuxi 1 6.13 5.94 6.01 6.41 5.83 5.81 5.61 5.51 2 7.79 7.62 7.73 8.14 7.71 7.75 7.62 7.66 3 11.76 8.61 8.40 9.04 9.23 8.72 8.56 8.62 Zhenwan 1 7.74 7.49 7.31 7.29 7.19 6.99 7.08 6.84 2 10.27 8.42 9.04 8.19 7.99 7.67 7.43 7.61 3 11.28 10.66 10.06 9.33 8.81 8.91 9.42 9.28

1115 

1116