
Space time modelling of precipitation using a hidden Markov model and censored Gaussian distributions

Pierre Ailliot, Universite de Brest, France

Craig Thompson, National Institute of Water and Atmospheric Research, New Zealand

Peter Thomson, Statistics Research Associates Ltd, New Zealand

15 October 2008

Abstract

A new hidden Markov model (HMM) for the space-time evolution of daily rainfall is developed which models precipitation within hidden regional weather types by censored power-transformed Gaussian distributions. The latter provide flexible and interpretable multivariate models for the mixed discrete-continuous variables that describe both precipitation, when it occurs, and no precipitation. Parameter estimation is performed using a Monte Carlo EM algorithm whose use and performance are discussed using simulation studies. The model is fitted to rainfall data from a small network of stations in New Zealand encompassing a diverse range of orographic effects. The results obtained show that the marginal distributions and spatial structure of the data are well described by the fitted model, which provides a better description of the spatial structure of precipitation than a standard HMM commonly used in the literature. However, the fitted model, like the standard HMM, cannot fully reproduce the local dynamics and underestimates the lag-one autocorrelations.

Keywords: Space-time model; precipitation; hidden Markov model; censored Gaussian distribution; Monte Carlo EM algorithm.


1 Introduction

This paper develops a new space-time model for daily precipitation over localised spatial scales. Such models form an important part of stochastic weather generators (see [22], [29], [25], for example) where they are used to simulate rainfall for the purposes of hydrological design or as inputs into environmental and ecosystem models. A particular impetus for this study was the need to generate realistic daily rainfall sequences over small networks of rainfall stations in, or near, hydroelectric power generation catchments in South Island, New Zealand. The study forms part of a larger and ongoing research project whose overall aim is to better understand and predict periods of low inflows to hydroelectric reservoirs and the consequent risk of a shortfall in hydroelectric supply. However, daily rainfall models have many other risk forecasting applications. For example, they can be used as stochastic rainfall generators to provide realistic inputs to rainfall-runoff and crop growth models.

A variety of stochastic models have been proposed in the literature, and among them the weather type models play an important role. The basic idea of these models is to introduce an extra variable to describe the meteorological regime (weather type), and to assume that this variable explains most of the space-time structure of the data. In [28] and [26], a variable representing the local weather type is introduced at each location ("local weather type" model), but in this paper we will only consider models in which the weather type is common to all locations ("regional weather type" model). This variable cannot be observed directly, and a first approach consists of using synoptic condition patterns as a basis for classifying each day into a few weather types (see [5], [30], [15], [2] and references therein).

A natural alternative consists of introducing the weather type as a hidden state variable, and hidden Markov models (HMMs) have been proposed in this framework (see [32], [16], [17], [4], [18]). A major problem in the first class of models is that the weather types are defined a priori and, although interpretable, will not necessarily provide good descriptions of the space-time stochastic properties of rainfall. With HMMs, the fitted states are optimally selected to capture the stochastic properties of rainfall. However, they remain statistical artifacts estimated as part of the fitting process and may not always have simple interpretations in terms of weather types. Furthermore, the existence of the hidden variable complicates the statistical inference, leading to the use of simple models to describe the sequence of weather types and also the distribution of precipitation within weather types. As a consequence, these models may be too simple to reproduce all the complexity of the space-time structure of observed precipitation.

The simplest HMM is obtained by assuming that precipitation is conditionally independent in space given the regional weather type or, in other terms, that all the spatial dependence is induced by the weather type (see [32], [16], [4]). This assumption seems unrealistic, in particular when the network of rainfall stations is dense and when the topography of the area is diverse, and this leads to an underestimation of spatial correlation. This is illustrated below on a small network of stations in New Zealand. As a consequence, more sophisticated spatial models are needed to describe the precipitation within the weather types.


Figure 1: A small network of seven rainfall stations on the east coast of South Island, New Zealand, located at Winchmore (1), Highbank (2), Lake Coleridge (3), Lincoln (4), Christchurch (5), Rangiora (6) and Balmoral (7).

Besides the existence of the hidden weather type, another major difficulty with building such spatial structure is the mixed nature of rainfall distributions, which have a point mass at zero, corresponding to the days with no rainfall, and a continuous density for days with positive precipitation. In order to deal with this mixture of components, several authors have proposed using a two-stage approach. A binary process describing the occurrence of rainfall is first modelled, before introducing a model for the positive amounts when rainfall occurs (see [8], [3]). In this context, several models have been proposed to describe the spatial dependence of precipitation occurrence within weather types ([17], [18]), but difficulties arise when positive rainfall amounts are included within this structure.

In this paper, we propose modelling temporal dependence using a regional weather type model and, conditional on weather type, modelling the spatial dependence of rainfall occurrence and amount using censored, power-transformed Gaussian distributions. Various authors have already proposed using transformed Gaussian variables to describe the distribution of rainfall (see for example [2], [24], [1]), but, as far as we know, never in conjunction with an HMM. The data considered in this paper are introduced in Section 2 and the model in Section 3. Parameter estimation and fitting procedures are given in Section 4, results of fitting the model to the data are given in Section 5 and conclusions are given in Section 6.

2 Data

We consider a small network of $K = 7$ rainfall stations located in South Island, New Zealand, where these stations have been numbered as shown in Figure 1. For each location $k$ ($k = 1, \ldots, K$), $z_k$ will denote the 2-dimensional spatial coordinates of the station and $Y_t(k)$ the accumulated rainfall over day $t$. In this paper, a stochastic model for the multivariate time series $Y_t = (Y_t(1), \ldots, Y_t(K))'$ is proposed.


Location   1     2     3     4     5     6     7
1          1
2          0.91  1
3          0.69  0.75  1
4          0.83  0.76  0.54  1
5          0.79  0.75  0.5   0.92  1
6          0.73  0.79  0.51  0.77  0.87  1
7          0.6   0.63  0.5   0.61  0.61  0.64  1

Figure 2: Spatial correlation matrix of rainfall amounts $Y_t(k)$ at the 7 locations given in Figure 1. Rows and columns index location (lower triangle shown; colour scale from 0 to 1).

The data set consists of 26 years of daily rainfall and we focus on the month of April. Further, we assume that the 26 months of April daily rainfall are 26 independent realizations of a common stochastic process. This assumption is not unusual for meteorological processes. However, it does not take into account low frequency variation such as the El Niño Southern Oscillation (ENSO) and the Interdecadal Pacific Oscillation (IPO). The extension of the model to include seasonal and inter-annual variability will be the subject of future research.

The climate of South Island is dominated by the effects of the Southern Alps on the prevailing westerly flow. There are steep rainfall gradients, ranging from more than 8 m per year on average in the Southern Alps, to less than 0.5 m per year at some places near the east coast. The area considered in this paper is located between the Southern Alps and the east coast, and much of the precipitation comes from easterly or southerly weather systems. The distance between the locations ranges between 15 km and 120 km. The topography of the area is relatively diverse, with some stations located near the coast and others in the foothills of the Southern Alps. The range of topographical effects in this region produces local effects and precipitation fields with complex spatial structures.

An illustration of the network's spatial complexity is given in Figure 2, which shows the empirical correlation matrix of the multivariate marginal distribution of $\{Y_t\}$. It can be seen that the pairwise correlations between locations 1 and 2, locations 4 and 5, and locations 5 and 6 are high. These locations are close together, and on a flat coastal area that generally experiences similar rainfall conditions. On the other hand, locations 3 and 7, which are inland and at higher elevation, have the least correlation with the other locations.

3 Model description

Meteorologists usually describe the synoptic atmospheric patterns by using meteorological regimes, and local rainfall patterns are strongly related to these regimes. In this context, different authors have proposed using HMMs to describe the evolution of the multivariate precipitation process $\{Y_t\}$ (see for example [32], [16], [17], [4], [18]). More precisely, let $S_t$ denote the weather type at time $t$. We assume that this process cannot be observed directly (it is hidden or unobserved) and that it has finite state space with $S_t$ taking on values $1, \ldots, M$. Then $\{S_t, Y_t\}$ is assumed to follow an HMM, so that the two conditional independence properties

$$p(s_t \mid s_1^{t-1}, y_1^{t-1}) = p(s_t \mid s_{t-1}) \tag{1}$$

$$p(y_t \mid y_1^{t-1}, s_1^{t}) = p(y_t \mid s_t) \tag{2}$$

hold where the $p(\cdot)$ denote conditional probabilities (densities with point mass at zero in the case of (2)), and $y_1^t$ denotes the sequence $\{y_1, \ldots, y_t\}$ of values of $\{Y_t\}$ from day 1 to day $t$, with $s_1^t$ defined similarly.

In [3], [16], [17], and [18], the hidden Markov chain $\{S_t\}$ is assumed to be non-homogeneous with a transition probability matrix that depends on meteorological variables that are known a priori. In this way the model can be used to relate local rainfall to broad scale atmospheric circulation patterns. Here our focus is on risk forecasting and the stochastic generation of rainfall in situations where such broad scale information is typically unavailable. Given this context, we shall assume that $\{S_t\}$ is a time-homogeneous Markov chain that is ergodic and irreducible. We further assume that the emission probabilities $p(y_t \mid s_t)$ are time-homogeneous with stochastic properties that vary across the network of rainfall stations. Nevertheless, the proposed model can readily be extended to the non-homogeneous and non-stationary case if desired.

Assumptions (1) and (2) characterize the dynamics of the precipitation. They imply that $\{S_t\}$ is a Markov chain, whose evolution is independent of the previous observations, and that the successive observations are conditionally independent given the weather types. In particular, it is assumed that all the dynamics of precipitation are captured by the regional weather type $\{S_t\}$.

The hidden Markov chain $\{S_t\}$ can be parameterised either by its transition probability matrix alone, or by its transition probability matrix together with its initial distribution. Since we consider daily rainfall in a given month (April) over consecutive years, we adopt the latter, more general parameterisation, which also brings modest computational advantages. In particular, the marginal state probabilities at the beginning of April, assumed to be common over years, do not have to be the same as the steady-state stationary distribution derived from the transition probability matrix. Support for this assumption is given in Section 5.
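The parameterisation of the hidden chain by an initial distribution and a transition matrix can be illustrated with a short simulation. The following is only a minimal sketch: the function name `simulate_weather_types`, the two-state values of `pi` and `P`, and the 30-day block length are illustrative assumptions, not quantities taken from the fitted model.

```python
import random

def simulate_weather_types(pi, P, n_years, days_per_year, seed=None):
    """Simulate the hidden chain year by year (hypothetical helper).

    pi[s]   : probability of state s on the first day of each yearly block
    P[s][u] : probability of moving from state s to state u
    Returns n_years lists, each holding days_per_year states (0-based labels).
    """
    rng = random.Random(seed)
    states = list(range(len(pi)))
    years = []
    for _ in range(n_years):
        # Each block restarts from pi, independently of the previous year.
        s = rng.choices(states, weights=pi)[0]
        block = [s]
        for _ in range(days_per_year - 1):
            s = rng.choices(states, weights=P[s])[0]
            block.append(s)
        years.append(block)
    return years

# Illustrative two-state example: 26 Aprils of 30 days each.
pi = [0.7, 0.3]
P = [[0.8, 0.2],
     [0.4, 0.6]]
sample = simulate_weather_types(pi, P, n_years=26, days_per_year=30, seed=1)
```

Restarting every block from `pi` rather than from the stationary distribution of `P` mirrors the choice made in the text of treating the start-of-April state probabilities as free parameters.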

The emission probabilities $p(y_t \mid s_t)$, describing the spatial dependence of precipitation within the weather type $s_t$, also need to be suitably parameterised. A simple model is

$$p(y_t \mid s_t) = \prod_{k=1}^{K} p(y_t(k) \mid s_t) \tag{3}$$

which implies that the rainfall amounts at different locations are conditionally independent given the regional weather type or, in other words, that all the spatial structure is captured by the regional weather type. However, it is often the case that this simple model cannot explain the spatial dependence structure observed in precipitation, and that additional correlation between nearby stations is created by local effects (see [32], [16], [3], [17]). We will see in Section 5 that this is also the case for the network considered here.

In order to allow for additional spatial dependence within the weather types, a two-stage approach is generally used in order to handle the mixed discrete-continuous nature of the data. In the first step, only the binary process describing the occurrence of rainfall is considered, and various models have been successfully proposed to describe the spatial structure of this process within weather type (see [17], [18]). The next step models the positive amounts conditionally on weather type and rainfall occurrence. In [8], the amounts are introduced a posteriori, once the weather types have been identified by fitting an HMM to the occurrence process, but this approach is not entirely satisfactory since the precipitation amounts do not play a role in the definition of weather type. In [3], the amounts are assumed to be conditionally independent in space given weather type and rainfall occurrence, but again this assumption seems restrictive.

The approach adopted here is to build multivariate distributions for mixed discrete-continuous variables by censoring a multivariate Gaussian distribution. More precisely, we assume that if $S_t = s$ then

$$Y_t(k) = \max(X_t(k), 0), \qquad X_t(k) = \begin{cases} W_t(k) & (W_t(k) \le 0) \\ W_t(k)^{\beta^{(s)}(k)} & (W_t(k) > 0) \end{cases} \tag{4}$$

where $W_t = (W_t(1), \ldots, W_t(K))'$ satisfies

$$W_t = m^{(s)} + H^{(s)} Z_t \qquad (s = 1, \ldots, M)$$

and $Z_t$ is a sequence of independent and identically distributed Gaussian vectors, each with zero mean and identity covariance matrix. Here $m^{(s)}$ is a $K$-dimensional mean vector and the $K \times K$ covariance matrix $\Sigma^{(s)} = H^{(s)} (H^{(s)})'$ is assumed to be positive definite for each $s$. Thus (4) transforms a multivariate Gaussian vector to a random vector with a multivariate mixed discrete-continuous distribution. Note, in particular, that this definition ensures a 1-1 mapping between the partially observed $X_t(k)$ and the unobserved $W_t(k)$ when $\beta^{(s)}(k) \neq 0$. Various authors have used transformations such as these to describe rainfall distributions (see [2], [24], [1] and references therein); however, this appears to be the first use of such a transformation in conjunction with an HMM.
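The censoring and power transformation in (4) can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the helper names are hypothetical, the exponents are taken to be positive, and the factor `H` is passed as a plain list of rows.

```python
import random

def censor_power_transform(w, beta):
    """Map a latent vector w to rainfall y as in (4): y = w**beta if w > 0, else 0."""
    return [wk ** bk if wk > 0 else 0.0 for wk, bk in zip(w, beta)]

def latent_from_rainfall(y, beta):
    """Invert (4) at wet sites: w = y**(1/beta). Dry sites stay unknown (None)."""
    return [yk ** (1.0 / bk) if yk > 0 else None for yk, bk in zip(y, beta)]

def simulate_rainfall(m, H, beta, rng):
    """Draw W = m + H Z with Z standard Gaussian, then censor and transform."""
    K = len(m)
    z = [rng.gauss(0.0, 1.0) for _ in range(K)]
    w = [m[i] + sum(H[i][j] * z[j] for j in range(K)) for i in range(K)]
    return censor_power_transform(w, beta)

# Illustrative values only: two sites, independent latent components.
rng = random.Random(0)
y = simulate_rainfall(m=[-1.0, 0.5], H=[[1.0, 0.0], [0.0, 1.0]],
                      beta=[1.5, 1.5], rng=rng)
```

At a dry site only the event $W_t(k) \le 0$ is known, which is why `latent_from_rainfall` returns `None` there; these are the components that the E-step of Section 4 has to integrate over.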

The number of parameters introduced by the model is $M^2 - 1 + MK(K+1)/2 + 2MK$. Depending on the amount of data available and on the size of the network, this number may be too large, and various reduced models will need to be considered. We could assume, as in [2] and [24], that the exponents $\beta^{(s)}(k)$ are spatially homogeneous so that they have the same values, $\beta^{(s)}$ say, at each location $k$. In [2] it is additionally assumed that the exponents are the same across weather types with $\beta^{(s)}(k) = 5/3$. Reduced models for the $\beta^{(s)}(k)$ such as these were tried on our data, but were unable to successfully describe the marginal distribution of the process. This experience, and other considerations, led us to fully specify the $MK$ exponents $\beta^{(s)}(k)$, the $MK$ components of the means $m^{(s)}$ and the $M^2 - 1$ probabilities specifying $\{S_t\}$.

However, reduced models are necessary to describe the $K \times K$ covariance matrices $\Sigma^{(s)}$ ($s = 1, \ldots, M$) since, otherwise, the number of parameters involved is quadratic in the number of locations $K$. In Section 5, we consider the following reduced models for these covariance matrices.

C0: Here the $\Sigma^{(s)}$ are assumed to be diagonal with typical diagonal elements

$$\Sigma^{(s)}(i, i) = (\sigma_i^{(s)})^2$$

and $\sigma_i^{(s)} > 0$. This implies that precipitation is conditionally independent in space, which is probably too restrictive, particularly if the network is dense or topographically diverse. The number of parameters is $M^2 - 1 + 3MK$.

C1: For this model the conditional correlation between stations at $z_i$ and $z_j$ is assumed to depend on the distance $d(z_i, z_j)$ between these locations through a spatially homogeneous parameter $\lambda^{(s)} > 0$ so that $\Sigma^{(s)}$ has typical element

$$\Sigma^{(s)}(i, j) = \sigma_i^{(s)} \sigma_j^{(s)} \exp(-\lambda^{(s)} d(z_i, z_j))$$

with $\sigma_i^{(s)} > 0$. Note that this model includes the previous model as a limit when all $\lambda^{(s)}$ are large ($\lambda^{(s)} \uparrow \infty$). Again, this assumption can be restrictive and it is clear that we cannot reproduce matrices with block structure, like the one shown in Figure 2, with such a simple model. Other variables which summarize the topography, such as elevation differences, the direction between the different locations, or the distance to the coast would also need to be included. However, it seems difficult to summarize the topography of the region considered here with just a few variables. The number of parameters in this model is $M^2 - 1 + M(3K + 1)$.

C2: Here, like model C1, the conditional correlation between stations depends on distance between locations, but through the more general model

$$\Sigma^{(s)}(i, j) = \sigma_i^{(s)} \sigma_j^{(s)} \, \kappa(\lambda_i^{(s)}, \lambda_j^{(s)}) \exp\left(-\kappa(\lambda_i^{(s)}, \lambda_j^{(s)}) \sqrt{\lambda_i^{(s)} \lambda_j^{(s)}} \, d(z_i, z_j)\right)$$

where $\kappa^2(x, y) = 2\sqrt{x^2 y^2}/(x^2 + y^2)$ and $\sigma_i^{(s)} > 0$, $\lambda_i^{(s)} > 0$. The dimensionless factor $\kappa^2(x, y)$ (the ratio of the geometric and arithmetic means of $x^2$, $y^2$) is bounded above by unity, and the fact that $\Sigma^{(s)}$ is positive definite follows from [21], Theorem 1, using the exponential correlation function with diagonal kernel matrix. This model includes the previous models as special cases and the parameter $\lambda_k^{(s)}$ describes how strongly the station at location $k$ is correlated with the other locations. For example, a high value of $\lambda_k^{(s)}$ relative to the other $\lambda_j^{(s)}$ ($j \neq k$) implies that the correlation between location $k$ and other locations quickly decreases with distance, whereas a low value indicates a slowly decreasing correlation. For this model the number of parameters is $M^2 - 1 + 4MK$.

Finally, all these models are nested in the full model, denoted C∗, where the only requirement is that the $\Sigma^{(s)}$ ($s = 1, \ldots, M$) are positive definite.
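The reduced covariance models and their parameter counts can be written down directly from the formulae above. The sketch below is illustrative: the function names are hypothetical, `dist` is assumed to be a symmetric matrix of inter-station distances supplied by the user, and the example uses $M = 4$ and $K = 7$ only because those values appear elsewhere in the paper.

```python
import math

def cov_c1(sigma, lam, dist):
    """Model C1: sigma_i * sigma_j * exp(-lam * d_ij), one decay rate per state."""
    K = len(sigma)
    return [[sigma[i] * sigma[j] * math.exp(-lam * dist[i][j])
             for j in range(K)] for i in range(K)]

def kappa(x, y):
    """Square root of kappa^2(x, y) = 2 x y / (x^2 + y^2) for x, y > 0."""
    return math.sqrt(2.0 * x * y / (x * x + y * y))

def cov_c2(sigma, lam, dist):
    """Model C2: site-specific decay rates lam[i] combined through kappa."""
    K = len(sigma)
    out = []
    for i in range(K):
        row = []
        for j in range(K):
            k = kappa(lam[i], lam[j])
            decay = k * math.sqrt(lam[i] * lam[j]) * dist[i][j]
            row.append(sigma[i] * sigma[j] * k * math.exp(-decay))
        out.append(row)
    return out

def n_params(model, M, K):
    """Parameter counts quoted in the text for each covariance model."""
    per_state = {"C0": 3 * K, "C1": 3 * K + 1, "C2": 4 * K,
                 "C*": K * (K + 1) // 2 + 2 * K}
    return M * M - 1 + M * per_state[model]

counts = {name: n_params(name, M=4, K=7) for name in ("C0", "C1", "C2", "C*")}
```

With $M = 4$ and $K = 7$ the counts are 99, 103, 127 and 183 for C0, C1, C2 and C∗ respectively. Setting all `lam[i]` equal in `cov_c2` gives $\kappa = 1$ and recovers `cov_c1`, which is one way to see the nesting described in the text.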


4 Monte Carlo EM algorithm

The structure of the model is summarized in Table 1. The weather type $S_t$ is a finite state-space Markov chain which controls the parameters of the conditionally Gaussian vector $W_t$. Then, the observed rainfall $Y_t$ is linked to this conditionally Gaussian variable via power-transformation and truncation. Because of this truncation, the process $W_t$ is only partially observed, and the model can be seen as a non-linear state-space model with two layers of hidden variables, one discrete and one continuous.

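$$\begin{array}{ccccccccc}
\cdots & \rightarrow & S_{t-1} & \rightarrow & S_t & \rightarrow & S_{t+1} & \rightarrow & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & & W_{t-1} & & W_t & & W_{t+1} & & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & & Y_{t-1} & & Y_t & & Y_{t+1} & & \cdots
\end{array}$$

Table 1: Directed graph summarizing conditional independence assumptions of the HMM with censored Gaussian distributions.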

Recently there has been a surge of interest in the theoretical and computational aspects of statistical inference procedures for non-linear state-space models (see [6] and [10] for recent reviews). Maximum likelihood and Bayesian estimation approaches are commonly considered using a variety of computational algorithms. We have chosen to fit the model proposed in Section 3 by maximum likelihood with a focus on procedures with reasonable computational complexity and cost, and which are readily implemented and understood in practice.

4.1 Likelihood function

To account for seasonality, the daily rainfall measurements at the 7 stations in the rainfall network were blocked into months with the data for any particular month (April in our case) concatenated into one data sequence $y_1^T$ of $T$ daily vector observations. Let $B$ index the days that mark the beginning of each annual block of data for the month concerned, and assume that these blocks constitute independent annual realisations of that month's daily rainfall. Then, apart from a constant, the complete log-likelihood, based on the distribution of the hidden weather types $S_t$ and the partially observed $X_t(k)$, is given by

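$$\begin{aligned}
\log p(x_1^T, s_1^T; \theta) = {} & \sum_{t=1}^{T} \log P_{s_{t-1} s_t} + \sum_{t \in B} \left( \log \pi_{s_t} - \log P_{s_{t-1} s_t} \right) \\
& - \frac{1}{2} \sum_{t=1}^{T} \log \det \Sigma^{(s_t)} - \frac{1}{2} \sum_{t=1}^{T} (w_t(\beta^{(s_t)}) - m^{(s_t)})' (\Sigma^{(s_t)})^{-1} (w_t(\beta^{(s_t)}) - m^{(s_t)}) \\
& + \sum_{t=1}^{T} \sum_{k \,:\, x_t(k) > 0} \left( \frac{1 - \beta^{(s_t)}(k)}{\beta^{(s_t)}(k)} \log y_t(k) - \log \beta^{(s_t)}(k) \right)
\end{aligned}$$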


where $P_{s_0 s_1}$ is defined to be $\pi_{s_1}$, and the $w_t(\beta^{(s_t)})$ are functions of $x_t$, $s_t$ given by the 1-1 mapping (4). Here $\theta = (\theta^{(0)}, \theta^{(1)}, \ldots, \theta^{(M)})$ denotes the set of unknown parameters, with $\theta^{(0)}$ comprising the initial probabilities $\pi_s = P(S_t = s)$ for $t \in B$, and transition probabilities $P_{su} = P(S_t = u \mid S_{t-1} = s)$ for $t \notin B$, that describe the dynamics of the Markov chain $\{S_t\}$, and $\theta^{(s)}$ ($s = 1, \ldots, M$) comprising the set of parameters $m^{(s)}$, $\Sigma^{(s)}$, $\beta^{(s)}$ that describe the emission probabilities $p(y_t \mid S_t = s)$. The maximum likelihood estimate $\hat{\theta}$ is the value of $\theta$ that maximises the (incomplete) likelihood of the observations $y_1^T$ formed by integrating the complete likelihood over the missing variables.
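The last sum in the complete log-likelihood is the log-Jacobian of the mapping (4) at the wet sites. As a brief sketch, assuming $\beta^{(s)}(k) > 0$ and writing $\beta$ for $\beta^{(s)}(k)$, $y$ for $y_t(k) > 0$ and $w = y^{1/\beta}$, the change of variables from $w$ to $y$ gives

$$\left| \frac{dw}{dy} \right| = \frac{1}{\beta} \, y^{1/\beta - 1}, \qquad \log \left| \frac{dw}{dy} \right| = \left( \frac{1}{\beta} - 1 \right) \log y - \log \beta = \frac{1 - \beta}{\beta} \log y - \log \beta ,$$

which matches the summand term by term. Dry sites contribute no Jacobian because there $X_t(k) = W_t(k)$.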

In this paper, the Expectation-Maximisation (EM) algorithm due to [9] is used to compute $\hat{\theta}$. This recursive algorithm computes successive approximations $\theta_n$ of $\hat{\theta}$ by cycling through the following steps.

E-step: Compute $Q(\theta \mid \theta_n) = E(\log p(X_1^T, S_1^T; \theta) \mid y_1^T; \theta_n)$ as a function of $\theta$.

M-step: Determine the updated parameter estimate $\theta_{n+1} = \arg\max_{\theta} Q(\theta \mid \theta_n)$.

Under certain general conditions it can be shown that the sequence of estimates $\theta_n$ yields monotonically increasing values of the incomplete likelihood, and converges to the maximum likelihood estimate $\hat{\theta}$ (see [31]). Thus the EM algorithm provides an alternative method of maximising the incomplete log-likelihood which is commonly used in models with hidden or latent variables such as the model proposed here. The EM algorithm directly utilises the hidden structure and, as a consequence, is often more robust in practice to the choice of initial starting values than direct maximum likelihood methods. Its computational efficiency is enhanced if the E and M steps are readily evaluated. These observations and design objectives underpin the implementation of the EM algorithm that has been adopted and are discussed more fully in the sections that follow.

4.2 E-step

The conditional probabilities involved in the computation of $Q(\theta \mid \theta_n)$ are typically calculated using the so-called forward-backward recursions (see [19], [6] for example). However, as is often the case for non-linear state space models, intractable integrals appear in these recursions and numerical approximations are required. In this context, various Monte Carlo methods have been proposed in the literature including use of the Gibbs sampler (see [23], [6] for details). In this subsection, the performance of this generic and easily implementable procedure is compared with more specific sampling techniques which take advantage of the conditional independence structure of our model.

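To determine $Q(\theta \mid \theta_n)$ as a function of $\theta$ we need to compute the quantities

$$\gamma_t(s) = p(S_t = s \mid y_1^T; \theta_n), \qquad \gamma_t(s, u) = p(S_{t-1} = s, S_t = u \mid y_1^T; \theta_n) \tag{5}$$

$$E(W_t^- \mid S_t = s, y_t; \theta_n), \qquad E(W_t^- (W_t^-)' \mid S_t = s, y_t; \theta_n) \tag{6}$$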

for $s, u = 1, \ldots, M$ and $t = 1, \ldots, T$, excluding the $\gamma_t(s, u)$ when $t \in B$. Here $W_t^-$ denotes the (possibly empty) vector containing the non-positive $W_t(k)$, and the classification probabilities (5) are computed recursively using the forward-backward algorithm (see [19] for example). An important byproduct of this key algorithm is the evaluation of the incomplete likelihood (see [11] for example). The use of this algorithm requires the computation of emission probabilities $p(y_t \mid s_t; \theta_n)$. In general, analytic expressions for these and the conditional expectations (6) are not available and numerical procedures are required.
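Once the emission probabilities are available, the classification probabilities (5) and the incomplete log-likelihood follow from a scaled forward-backward pass. The sketch below is a generic textbook version adapted to independent annual blocks, not the authors' implementation: the function name, the 0-based time index and the `block_starts` set (which is assumed to contain day 0) are illustrative choices.

```python
import math

def forward_backward(emis, pi, P, block_starts):
    """Scaled forward-backward recursions with a restart at each block.

    emis[t][s]   : emission probability p(y_t | S_t = s), assumed precomputed
    pi, P        : initial distribution and transition matrix
    block_starts : set of 0-based days that open an annual block (must hold 0)
    Returns (gamma, xi, loglik); xi[t] is None on block-start days.
    """
    T, M = len(emis), len(pi)
    alpha, scale, loglik = [], [], 0.0
    for t in range(T):
        if t in block_starts:
            pred = list(pi)  # new block: start from the initial distribution
        else:
            pred = [sum(alpha[t - 1][s] * P[s][u] for s in range(M))
                    for u in range(M)]
        a = [pred[u] * emis[t][u] for u in range(M)]
        c = sum(a)
        alpha.append([x / c for x in a])
        scale.append(c)
        loglik += math.log(c)  # byproduct: incomplete log-likelihood
    beta = [[1.0] * M for _ in range(T)]
    for t in range(T - 2, -1, -1):
        if (t + 1) in block_starts:
            continue  # the next block carries no information about S_t
        beta[t] = [sum(P[s][u] * emis[t + 1][u] * beta[t + 1][u]
                       for u in range(M)) / scale[t + 1] for s in range(M)]
    gamma = [[alpha[t][s] * beta[t][s] for s in range(M)] for t in range(T)]
    xi = [None] * T
    for t in range(T):
        if t in block_starts:
            continue
        xi[t] = [[alpha[t - 1][s] * P[s][u] * emis[t][u] * beta[t][u] / scale[t]
                  for u in range(M)] for s in range(M)]
    return gamma, xi, loglik
```

Here `gamma[t][s]` plays the role of $\gamma_t(s)$ and `xi[t][s][u]` that of $\gamma_t(s, u)$. Scaling each forward step by its normalising constant avoids numerical underflow and yields the log-likelihood as the running sum of the logged constants.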

Consider for example the emission probability $p(y \mid s; \theta)$ where the first $d$ elements of $y$ are zero (dry locations) and the last $K - d$ elements are positive (wet locations). Then

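$$p(y \mid s; \theta) = \left( \prod_{k=d+1}^{K} \frac{y(k)^{1/\beta^{(s)}(k) - 1}}{\beta^{(s)}(k)} \right) \int_{-\infty}^{0} \cdots \int_{-\infty}^{0} \phi(w; m^{(s)}, \Sigma^{(s)}) \, dw(1) \cdots dw(d) \tag{7}$$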

where $w(k) = y(k)^{1/\beta^{(s)}(k)}$ for $k = d + 1, \ldots, K$, and $\phi(\cdot\,; m, \Sigma)$ denotes the multivariate Gaussian density with mean $m$ and covariance $\Sigma$. Note that (7) can also be written as a product of the density corresponding to the $K - d$ wet locations, and an integral over the density corresponding to the $d$ dry locations conditional on the observations at the wet locations. Values of $p(y \mid s; \theta)$ for other values of $y$ can always be expressed in this form using a suitable permutation of the indices of $y$, and the conditional expectations (6) involve similar multidimensional Gaussian integrals.
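The product form mentioned above can be made explicit with the standard conditional Gaussian formulae. The subscripts "dry" and "wet" and the symbols $\mu_{\mathrm{dry}|\mathrm{wet}}$, $\Sigma_{\mathrm{dry}|\mathrm{wet}}$ are notation introduced here for illustration. Partitioning $w$, $m^{(s)}$ and $\Sigma^{(s)}$ into the first $d$ (dry) and last $K - d$ (wet) components,

$$\phi(w; m^{(s)}, \Sigma^{(s)}) = \phi(w_{\mathrm{wet}}; m_{\mathrm{wet}}, \Sigma_{\mathrm{wet,wet}}) \, \phi(w_{\mathrm{dry}}; \mu_{\mathrm{dry}|\mathrm{wet}}, \Sigma_{\mathrm{dry}|\mathrm{wet}})$$

with

$$\mu_{\mathrm{dry}|\mathrm{wet}} = m_{\mathrm{dry}} + \Sigma_{\mathrm{dry,wet}} \Sigma_{\mathrm{wet,wet}}^{-1} (w_{\mathrm{wet}} - m_{\mathrm{wet}}), \qquad \Sigma_{\mathrm{dry}|\mathrm{wet}} = \Sigma_{\mathrm{dry,dry}} - \Sigma_{\mathrm{dry,wet}} \Sigma_{\mathrm{wet,wet}}^{-1} \Sigma_{\mathrm{wet,dry}} ,$$

so that the $d$-dimensional integral in (7) reduces to the wet-site density multiplied by the conditional probability that every dry component is non-positive, $P(W_{\mathrm{dry}} \le 0 \mid W_{\mathrm{wet}} = w_{\mathrm{wet}})$.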

Various Monte Carlo methods have been proposed in the literature to approximate such integrals (see, for example, [27], [13], [14] and references therein). In this paper, two methods are considered and compared. The first, Method A, uses a simple application of acceptance-rejection sampling to approximate the probabilities (5) and expectations (6). In the case of (7), $N$ independent samples from the conditional Gaussian distribution of $W(1), \ldots, W(d)$ given $W(k) = y(k)^{1/\beta^{(s)}(k)}$ ($k = d + 1, \ldots, K$) are generated in the usual way from the multivariate Gaussian distribution of $W(1), \ldots, W(K)$ with mean $m^{(s)}$ and covariance matrix $\Sigma^{(s)}$. Then (7) is approximated by:

$$\left( \prod_{k=d+1}^{K} \frac{y(k)^{1/\beta^{(s)}(k) - 1}}{\beta^{(s)}(k)} \right) \phi(\tilde{w}; \tilde{m}^{(s)}, \tilde{\Sigma}^{(s)}) \, \frac{I_N}{N}$$

where $I_N$ denotes the number of samples where all elements of $W(1), \ldots, W(d)$ are negative, $\tilde{m}^{(s)}$, $\tilde{\Sigma}^{(s)}$ are the mean and covariance matrix of $W(d + 1), \ldots, W(K)$, and $\tilde{w}$ is the vector of values $y(k)^{1/\beta^{(s)}(k)}$ ($k = d + 1, \ldots, K$). The expectations in (6) are estimated similarly as the sample mean and cross-product respectively of these $I_N$ vector values.
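A simplified reading of Method A for the emission probability (7) can be sketched with NumPy. This is an illustration under stated assumptions, not the authors' Matlab code: the function names are hypothetical, dry and wet sites are located from the zeros of `y` rather than by permuting indices, and only the probability (not the expectations (6)) is returned.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of N(mean, cov) at x, written out with NumPy only."""
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, x - mean)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + log_det + z @ z)

def emission_prob_method_a(y, m, sigma, beta, n_samples, rng):
    """Acceptance-rejection estimate of p(y | s) for one weather type."""
    y, m = np.asarray(y, float), np.asarray(m, float)
    sigma, beta = np.asarray(sigma, float), np.asarray(beta, float)
    wet = y > 0
    dry = ~wet
    dens = 1.0
    if wet.any():
        # Jacobian of (4) times the Gaussian density at the wet sites.
        w_wet = y[wet] ** (1.0 / beta[wet])
        log_jac = np.sum((1.0 / beta[wet] - 1.0) * np.log(y[wet])
                         - np.log(beta[wet]))
        s_ww = sigma[np.ix_(wet, wet)]
        dens = np.exp(log_jac + gaussian_logpdf(w_wet, m[wet], s_ww))
    if not dry.any():
        return dens  # nothing is censored, so no sampling is needed
    if wet.any():
        # Conditional law of the dry components given the wet ones.
        s_dw = sigma[np.ix_(dry, wet)]
        gain = np.linalg.solve(s_ww, s_dw.T).T
        cond_mean = m[dry] + gain @ (w_wet - m[wet])
        cond_cov = sigma[np.ix_(dry, dry)] - gain @ s_dw.T
    else:
        cond_mean, cond_cov = m[dry], sigma[np.ix_(dry, dry)]
    draws = rng.multivariate_normal(cond_mean, cond_cov, size=n_samples)
    accepted = np.all(draws <= 0.0, axis=1)  # I_N / N in the text
    return dens * accepted.mean()

# Illustrative call: site 1 dry, site 2 wet, independent unit-variance latents.
rng = np.random.default_rng(0)
p_hat = emission_prob_method_a([0.0, 1.0], [0.0, 0.0], np.eye(2),
                               [1.0, 1.0], n_samples=20000, rng=rng)
```

In this toy case the exact value is $\tfrac{1}{2}\phi(1; 0, 1)$, about 0.121, so the estimate can be checked directly. When the acceptance probability is very small few draws are accepted, which is the regime in which the text reports that Method B becomes preferable.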

The second approach, Method B, uses the more sophisticated procedures proposed in [13] which transform the original integration region to the unit hypercube before applying a simple Monte Carlo method. Numerical experiments indicate that the method proposed in [13] is more efficient than the crude acceptance-rejection method when the probability of truncation is small (high rejection rate). On the other hand, the transformation to the unit hypercube adds extra computational cost which makes the acceptance-rejection method more efficient when the probability of truncation is not too small.

The performances of these two methods for our particular application were compared and benchmarked against a Gibbs sampler (Method C) in terms of accuracy and computational efficiency. The particular form of the Gibbs sampler used is similar to that described in [6], Section 6.3.1.1, and sampled sequentially from the conditional distributions

$$p(S_t \mid s_1^{t-1}, s_{t+1}^{T}, w_1^T, y_1^T; \theta_n) = p(S_t \mid s_{t-1}, s_{t+1}, w_t; \theta_n) \tag{8}$$

$$p(w_t^- \mid s_1^T, w_1^{t-1}, w_{t+1}^{T}, y_1^T; \theta_n) = p(w_t^- \mid s_t, y_t; \theta_n) \tag{9}$$

for $t = 1, \ldots, T$ with corrections made at the endpoints of each annual block of data. Then (8) is simulated in the usual way for an HMM with finite state-space (see [6]), and (9) is simulated using the basic acceptance-rejection method described above.
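The state update (8) amounts to sampling from weights proportional to the product of the incoming transition probability, the Gaussian density of $w_t$ under each state, and the outgoing transition probability. The sketch below illustrates this; the function names are hypothetical, and the endpoint handling (using $\pi$ at a block start and dropping the outgoing factor at a block end) is one plausible form of the corrections mentioned in the text, assumed here rather than taken from the paper.

```python
import random

def state_conditional(dens, P, prev=None, nxt=None, pi=None):
    """Probabilities of S_t given its neighbours and w_t, as in (8).

    dens[s] : Gaussian density of w_t under state s (assumed precomputed)
    prev    : state on day t-1, or None on the first day of a block (uses pi)
    nxt     : state on day t+1, or None on the last day of a block
    """
    weights = []
    for s in range(len(dens)):
        w = dens[s] * (P[prev][s] if prev is not None else pi[s])
        if nxt is not None:
            w *= P[s][nxt]
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def draw_state(probs, rng):
    """Sample a state label from the conditional probabilities."""
    return rng.choices(range(len(probs)), weights=probs)[0]

# Illustrative two-state call in the interior of a block.
P = [[0.9, 0.1], [0.2, 0.8]]
probs = state_conditional([1.0, 1.0], P, prev=0, nxt=0)
state = draw_state(probs, random.Random(0))
```

Because each draw conditions on the current neighbouring states, successive sweeps are correlated, which is consistent with the need for a burn-in period discussed in the comparison of methods.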

The comparison of the three methods was based on calculating representative components of $Q(\theta \mid \theta_n)$ where $\theta_n$ was chosen to be the maximum likelihood estimate of $\theta$ for the C2 model with 4 regimes discussed in more detail in Section 5. Attention was restricted to location 1 (Winchmore) and the driest state ($S_t = 1$). The components of $Q(\theta \mid \theta_n)$ considered were

$$U = \frac{1}{T} \sum_{t=1}^{T} \gamma_t(1), \qquad V = m_{n+1}^{(s)}(1), \qquad Z = C_{n+1}^{(s)}(1, 1)$$

where $m_{n+1}^{(s)}$, $C_{n+1}^{(s)}$ are given by (10), (12) respectively and the means and standard deviations of these quantities were computed from 100 independent applications of each of the three methods. Results are reported in Table 2 for three choices of $N$ together with average computational times. All calculations were undertaken using Matlab on a PC with 1 GB RAM and a 1.66 GHz CPU.

N       Method  t_c       x̄_U     s_U     x̄_V      s_V     x̄_Z      s_Z
10^2    A       0.0570    0.4271  0.0146  -4.2194  0.2412  40.8743  3.9212
10^2    B       0.2842    0.4207  0.0018  -4.1886  0.0851  40.5693  2.2315
10^2    C       3.3811    0.3562  0.0318  -4.0335  0.2237  39.5712  1.7641
10^3    A       0.0786    0.4171  0.0051  -4.1802  0.0961  40.2633  1.4732
10^3    B       1.1242    0.4245  0.0005  -4.1933  0.0257  40.5206  0.6424
10^3    C       32.8370   0.4281  0.0113  -4.2105  0.0465  40.5812  0.4856
10^4    A       0.3839    0.4265  0.0015  -4.1957  0.0247  40.5155  0.3821
10^4    B       12.5239   0.4266  0.0002  -4.1932  0.0083  40.4533  0.2217
10^4    C       326.8721  0.4351  0.0039  -4.2421  0.0153  40.6601  0.1348

Table 2: Sample means (x̄_U, x̄_V, x̄_Z) and standard deviations (s_U, s_V, s_Z) of the statistics U, V, Z obtained from 100 independent applications of Methods A, B, and C. Here N is the Monte Carlo sample size used by each method and t_c denotes average computational time in seconds.

Of the three methods, the Gibbs sampler was generally the least efficient in terms of both speed and accuracy and, for given $N$, Method B was almost always the most accurate and Method A easily the fastest. The Monte Carlo standard deviations decreased roughly in proportion to $1/\sqrt{N}$ for all methods as expected, computational time was roughly proportional to $N$ and, for Methods A and B, significantly dominated by the time spent calculating the Monte Carlo simulations rather than the forward-backward recursions. For Method C, no explicit allowance was made for a burn-in period, which may explain the bias observed for small values of $N$. Inclusion of a burn-in period would reduce this bias at the expense of increased computational time.

A more systematic comparison of Methods A and B shows that they give similar results in terms of accuracy achieved for the same computational cost.
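The $1/\sqrt{N}$ behaviour can be checked against the standard deviations $s_U$ reported in Table 2. The short sketch below does this; the helper name `tenfold_ratios` is hypothetical and the numbers are copied from the table.

```python
import math

# Standard deviations s_U copied from Table 2 for N = 10^2, 10^3, 10^4.
s_U = {
    "A": [0.0146, 0.0051, 0.0015],
    "B": [0.0018, 0.0005, 0.0002],
    "C": [0.0318, 0.0113, 0.0039],
}

def tenfold_ratios(values):
    """Ratio of each entry to the next, i.e. the gain from a tenfold larger N."""
    return [a / b for a, b in zip(values, values[1:])]

target = math.sqrt(10.0)  # expected ratio under exact 1/sqrt(N) scaling
for method, sds in s_U.items():
    ratios = ", ".join(f"{r:.2f}" for r in tenfold_ratios(sds))
    print(f"Method {method}: {ratios} (target {target:.2f})")
```

The six ratios fall between about 2.5 and 3.6, scattered around $\sqrt{10} \approx 3.16$, which is in line with the stated scaling given that each standard deviation is itself estimated from only 100 runs and reported to four decimal places.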

In general, the relative efficiencies of the three methods will depend on the number of truncated components and their degrees of truncation. This means, in particular, that the results reported in Table 2 will vary according to the number of locations $K$, the percentage of dry days in the data, and the nature of the region's rainfall encapsulated by the parameter values $\theta_n$.

4.3 M-step

For any given choice of θn, the function Q(θ|θn) can be decomposed as

$$Q(\theta \mid \theta_n) = Q_S(\theta^{(0)} \mid \theta_n) + \sum_{s=1}^{M} Q_{Y|S}(\theta^{(s)} \mid \theta_n)$$

where

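$$Q_S(\theta^{(0)} \mid \theta_n) = \sum_{t=1}^{T} \sum_{s=1}^{M} \sum_{u=1}^{M} \gamma_t(s, u) \log P_{su} + \sum_{t \in B} \sum_{s=1}^{M} \left( \gamma_t(s) \log \pi_s - \sum_{u=1}^{M} \gamma_t(s, u) \log P_{su} \right)$$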

and

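$$\begin{aligned}
Q_{Y|S}(\theta^{(s)} \mid \theta_n) = {} & -\frac{1}{2} (\log \det \Sigma^{(s)}) \sum_{t=1}^{T} \gamma_t(s) \\
& - \frac{1}{2} \sum_{t=1}^{T} \gamma_t(s) \, E\left[ (W_t(\beta^{(s)}) - m^{(s)})' (\Sigma^{(s)})^{-1} (W_t(\beta^{(s)}) - m^{(s)}) \mid y_t, S_t = s; \theta_n \right] \\
& + \sum_{t=1}^{T} \sum_{k \,:\, y_t(k) > 0} \gamma_t(s) \left( \frac{1 - \beta^{(s)}(k)}{\beta^{(s)}(k)} \log y_t(k) - \log \beta^{(s)}(k) \right)
\end{aligned}$$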

for $s = 1, \ldots, M$. Here $W_t(\beta^{(s)})$ has typical element $y_t(k)^{1/\beta^{(s)}(k)}$ when $y_t(k) > 0$ and $W_t(k)$ when $y_t(k) \le 0$. As a consequence, optimising $Q(\theta \mid \theta_n)$ with respect to $\theta$ involves $M + 1$ separate optimisations. Although there are well-known analytic expressions for the value of $\theta^{(0)}$ that maximises $Q_S(\theta^{(0)} \mid \theta_n)$ (see [11] for example), this is not always the case for $Q_{Y|S}(\theta^{(s)} \mid \theta_n)$ and numerical optimisation procedures are usually required.
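The analytic maximiser of $Q_S$ can be sketched as follows. The terms in $\gamma_t(s, u)$ with $t \in B$ cancel in $Q_S$, leaving $\sum_{t \notin B} \sum_{s,u} \gamma_t(s, u) \log P_{su} + \sum_{t \in B} \sum_{s} \gamma_t(s) \log \pi_s$. Assuming the usual Lagrange-multiplier argument for the constraints $\sum_s \pi_s = 1$ and $\sum_u P_{su} = 1$, and writing $|B|$ for the number of annual blocks and $\pi_{s,n+1}$, $P_{su,n+1}$ for the updated values (notation introduced here), the updates take the form

$$\pi_{s,n+1} = \frac{1}{|B|} \sum_{t \in B} \gamma_t(s), \qquad P_{su,n+1} = \frac{\sum_{t \notin B} \gamma_t(s, u)}{\sum_{t \notin B} \sum_{v=1}^{M} \gamma_t(s, v)} .$$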

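In order to simplify the optimisation of $Q_{Y|S}(\theta^{(s)} \mid \theta_n)$ and minimise computational time, the following two-stage procedure has been used. In the first stage $m_{n+1}^{(s)}$, $\Sigma_{n+1}^{(s)}$ are determined as the values of $m^{(s)}$, $\Sigma^{(s)}$ that maximise $Q_{Y|S}(m^{(s)}, \Sigma^{(s)}, \beta_n^{(s)} \mid \theta_n)$. Thus

$$m_{n+1}^{(s)} = \frac{1}{\sum_{t=1}^{T} \gamma_t(s)} \sum_{t=1}^{T} \gamma_t(s) \, E(W_t(\beta_n^{(s)}) \mid y_t, S_t = s; \theta_n) \tag{10}$$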

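and $\Sigma_{n+1}^{(s)}$ is the value of $\Sigma$ that minimises

$$\log \det \Sigma + \mathrm{trace}(\Sigma^{-1} C_{n+1}^{(s)}) \tag{11}$$

where

$$C_{n+1}^{(s)} = \frac{1}{\sum_{t=1}^{T} \gamma_t(s)} \sum_{t=1}^{T} \gamma_t(s) \, E(W_t(\beta_n^{(s)}) W_t(\beta_n^{(s)})' \mid y_t, S_t = s; \theta_n) - m_{n+1}^{(s)} (m_{n+1}^{(s)})' \tag{12}$$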

with m(s)n+1, C

(s)n+1 computed using (6). Explicit formulae are available for Σ

(s)n+1 in certain

special cases; Σ(s)n+1 = C

(s)n+1 for the C∗ model and Σ

(s)n+1 is a diagonal matrix with the

same diagonal entries as C(s)n+1 for the C0 model. However, in general, standard numerical

procedures will be needed to minimise (11). Nevertheless, the simplicity of (11) means

that, in practice, the values of Σ(s)n+1 will typically be determined very quickly and with

minimal computational cost. In the more computationally intensive second stage, theestimates m

(s)n+1, Σ

(s)n+1 are used to determine an updated estimate β

(s)n+1 of β(s). This is

determined as the value of β(s) maximising QY |S(m(s)n+1, Σ

(s)n+1, β

(s)|θn) where, once again,

a standard numerical optimisation method can be used to efficiently determine β(s)n+1. If

greater accuracy is required, these estimates could be further refined by iterating theprocedure until convergence. However, this is likely to lead to modest gains in accuracyat the expense of significantly increased computational cost.
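The first stage reduces to weighted moment calculations once the conditional moments of Wt are available. The sketch below illustrates (10) and (12) for a single regime s, together with the two explicit covariance updates (C∗ and C0). It assumes the conditional means and second moments have already been approximated in the E-step; the function and argument names are hypothetical.

```python
def first_stage_update(weights, cond_mean, cond_second, diagonal=False):
    """Equations (10) and (12), then Sigma for the C* or C0 special cases.

    weights[t]:     gamma_t(s) for one regime s
    cond_mean[t]:   E(W_t | y_t, S_t = s), a length-K list (assumed given)
    cond_second[t]: E(W_t W_t' | y_t, S_t = s), a K x K nested list (assumed given)
    diagonal:       False -> Sigma = C (C* model); True -> diag(C) (C0 model)
    """
    K = len(cond_mean[0])
    total = sum(weights)
    # Equation (10): weighted average of the conditional means.
    m = [sum(w * mu[i] for w, mu in zip(weights, cond_mean)) / total
         for i in range(K)]
    # Equation (12): weighted second moment minus the outer product m m'.
    C = [[sum(w * S[i][j] for w, S in zip(weights, cond_second)) / total
          - m[i] * m[j] for j in range(K)] for i in range(K)]
    if diagonal:
        Sigma = [[C[i][j] if i == j else 0.0 for j in range(K)]
                 for i in range(K)]
    else:
        Sigma = [row[:] for row in C]
    return m, C, Sigma
```

For covariance structures other than these two special cases, the returned C would instead be passed to a numerical minimiser of (11).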

The main advantage of this two-stage procedure is that the complexity of the initial optimisation problem has been significantly reduced. Instead of using a numerical optimisation procedure to solve a high dimensional problem, a number of lower dimensional problems are solved, which significantly improves computational time and efficiency. When no stochastic approximation is used in the E-step, it is easy to check that this procedure ensures that Q(θn+1|θn) ≥ Q(θn|θn) and the corresponding values of the incomplete likelihood are monotonically non-decreasing. Such an algorithm is called a Generalised EM (GEM) algorithm, and its convergence properties are well known (see [31] and [20]).

4.4 Computational issues

The time required to compute one iteration of the Monte Carlo EM algorithm mainly depends on N(n), the size of the sample simulated in the E-step of iteration n of the EM algorithm. When N(n) is small the E-step is performed quickly, but with less accurate approximations of the integrals concerned. On the other hand, when N(n) is large the computational cost is higher, but the approximations are more accurate.

A suitable strategy for the choice of N(n) would be to let it increase with n. When n is small, small values of N(n) allow fast computation and more efficient exploration of the likelihood surface. Closer to the maximum, large values of N(n) provide more accurate approximations of the incomplete log likelihood and the actual EM updates, thus ensuring convergence. Various procedures have been proposed in the literature for the choice of N(n) (see [7], [12] and [6] for example) including both deterministic (e.g. N(n) = n²) and adaptive, data driven, schemes where, for example, the sample size could depend on the number of dry locations, or could switch from Method A to Method B when the truncation probability is small.

We adopted a simple deterministic scheme with N(n) = 100 for the first 50 MCEM iterations, N(n) = 500 for iterations 51 to 100, and then N(n) = n² for the remaining iterations. Method A of Section 4.2 was used to evaluate the E-step since the numerical results given in Section 4.2 suggest that Method A or Method B give similar results.

Figure 3: Results from 200 iterations of the MCEM algorithm for the C2 model with 4 regimes. The plots show successive estimates of the incomplete log-likelihood log L (top left), m(1)(1) (top right), Σ(1)(1, 1) (bottom left) and β(1)(1) (bottom right), by iteration number n. The sample size N(n) for iteration n was 100 for n ≤ 50, 500 for 50 < n ≤ 100, and n² for n > 100.
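The deterministic schedule for N(n) is simple enough to state as a function. This is a small sketch of the piecewise rule described in the text; the function name is illustrative.

```python
def mcem_sample_size(n):
    """Monte Carlo sample size N(n) for MCEM iteration n (n = 1, 2, ...)."""
    if n <= 50:
        return 100      # cheap, noisy E-steps while exploring the surface
    if n <= 100:
        return 500      # intermediate accuracy
    return n * n        # N(n) = n^2 close to the maximum


# Sample sizes at the two step changes and at the final iteration.
schedule = [mcem_sample_size(n) for n in (50, 51, 100, 101, 200)]
```

The jumps from 100 to 500 and from 500 to 10201 occur at n = 51 and n = 101, which is where Figure 3 shows level shifts in the estimates.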

The convergence of this algorithm is illustrated in Figure 3 for the C2 model with 4 regimes. Note that the Monte Carlo approximations to the incomplete log-likelihood fluctuate about an average level which is strictly increasing in accordance with standard results for the EM algorithm. As expected, these fluctuations decrease as N(n) increases and the quality of the Monte Carlo approximations improves. The step changes in N(n) at n = 50 and n = 100 have apparently led to level shifts in the incomplete log-likelihood and the estimated parameters. These are typically small changes on a percentage basis except, perhaps, those for the power transformation parameter β(1)(1) whose precision seems more sensitive to the quality of the Monte Carlo approximation. The latter presumably reflects the importance of these parameters for specifying the shape of the state-dependent rainfall distributions. In terms of computational cost, the total CPU time for the 200 iterations is about 140 minutes using Matlab on a PC with 1 GB RAM and a 1.66 GHz CPU. In practice, the algorithm is initialised using a variety of parameter estimates chosen within a set of physically reasonable values.

A suitable computational strategy for fitting the model would be to initially explore the incomplete log-likelihood using the C∗ model since it is the encompassing model and has an analytical M-step, minimising computational cost. The estimates from this model can then be used as consistent initial estimates for the various reduced models listed in Section 3. Finally, standard errors of the various estimates could be obtained by numerically evaluating the Hessian and corresponding information matrix from values of the incomplete log-likelihood in the neighbourhood of the maximum likelihood estimate.
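The standard error calculation mentioned above can be sketched with central finite differences. The code below is a generic illustration rather than the procedure used for the fitted models: it assumes a callable `neg_log_lik` that returns minus the incomplete log-likelihood with negligible numerical noise, which would need care when that likelihood is itself a Monte Carlo approximation. All names and the step size `h` are hypothetical.

```python
import math


def numerical_hessian(f, x, h=1e-4):
    """Central finite-difference Hessian of f at the point x."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                z = list(x)
                z[i] += di * h
                z[j] += dj * h
                return f(z)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


def invert(A):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(A)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def standard_errors(neg_log_lik, theta_hat, h=1e-4):
    """Square roots of the diagonal of the inverse observed information."""
    information = numerical_hessian(neg_log_lik, theta_hat, h)
    covariance = invert(information)
    return [math.sqrt(covariance[i][i]) for i in range(len(theta_hat))]
```

For a quadratic negative log-likelihood the finite differences are exact up to rounding, which gives a convenient check of the implementation.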


5 Results

The full model and the different reduced models introduced in Section 3 have been fitted to the data for M = 1, ..., 5 using the methodology described in Section 4. In order to select the best model, we used the Akaike Information Criterion (AIC) and the Bayes Information Criterion (BIC), defined respectively as

$$\mathrm{AIC} = -2\log L + 2k(\theta), \qquad \mathrm{BIC} = -2\log L + k(\theta)\log T$$

where L is the likelihood of the data (the incomplete likelihood) and k(θ) is the number of parameters.
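As a small illustration of these definitions, the two criteria can be computed directly from a maximised log-likelihood. The function and argument names below are illustrative only.

```python
import math


def aic(log_lik, n_params):
    """Akaike Information Criterion: -2 log L + 2 k."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayes Information Criterion: -2 log L + k log T."""
    return -2.0 * log_lik + n_params * math.log(n_obs)
```

For the same fitted model the two criteria differ by k(θ)(log T − 2), so BIC penalises each additional parameter more heavily than AIC whenever T exceeds e² ≈ 7.4 observations.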

The results are given in Table 3 where, for comparison, we also show the results of fitting the HMM model proposed in [4]. For this model, (3) holds with

$$p(y_t(k)|s_t) = \begin{cases} 1 - p^{(s_t)}_k & (y_t(k) = 0) \\ p^{(s_t)}_k\,\gamma\big(y_t(k);\alpha^{(s_t)}_k,\beta^{(s_t)}_k\big) & (y_t(k) > 0) \end{cases}$$

where γ(y; α, β) denotes a gamma density with parameters α, β, and $0 \le p^{(s)}_k \le 1$, $\alpha^{(s)}_k > 0$, $\beta^{(s)}_k > 0$ are unknown parameters which depend on both the regime s and the location k. In other words, precipitation is assumed to be conditionally independent in space given the current weather type, with $p^{(s)}_k$ giving the probability of rainfall occurrence at each location. When precipitation occurs, the amount follows a Gamma distribution. This model is denoted Cγ.
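To make the Cγ emission distribution concrete, the following sketch evaluates its log-density for one day of data given the parameters of the current regime. It assumes that α is the shape and β the scale of the gamma density, which is one common convention and is not confirmed by the text; the function names are illustrative.

```python
import math


def gamma_logpdf(y, shape, scale):
    """Log of the gamma density with the given shape and scale (y > 0)."""
    return ((shape - 1.0) * math.log(y) - y / scale
            - math.lgamma(shape) - shape * math.log(scale))


def c_gamma_loglik(y, p, alpha, beta):
    """log p(y_t | s_t) under conditional independence across locations.

    y[k]:     rainfall at location k on day t
    p[k]:     probability of rainfall occurrence in the current regime
    alpha[k]: gamma shape (assumed parameterisation)
    beta[k]:  gamma scale (assumed parameterisation)
    """
    total = 0.0
    for yk, pk, ak, bk in zip(y, p, alpha, beta):
        if yk > 0:
            total += math.log(pk) + gamma_logpdf(yk, ak, bk)
        else:
            total += math.log(1.0 - pk)
    return total
```

Because the contributions are simply summed over locations, any spatial dependence in this model can only arise through the shared weather type.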

          AIC                                  BIC
Model     M=1    M=2    M=3    M=4    M=5      M=1    M=2    M=3    M=4    M=5
Cγ        17404  14317  13436  13213  13144    17502  14523  13760  13663  13731
C0        17403  14445  13639  13398  13289    17501  14651  13963  13849  13875
C1        13092  12770  12697  12616  12623    13196  12985  13035  13085  13233
C2        12957  12750  12599  12525  12506    13089  13023  13022  13108  13257
C∗        12904  12643  12640  12674  12611    13101  13046  13259  13519  13690

Table 3: AIC and BIC values for the C0, Cγ, C1, C2 and C∗ models and M = 1, ..., 5

The Cγ and C0 models have the same number of parameters, and so Table 3 shows that the AIC and BIC values for the Cγ model are generally lower, and the log-likelihood values generally higher, than those corresponding to the C0 model. Since the two models are essentially the same apart from different distributional specifications for positive precipitation, this difference indicates that the censored transformed Gaussian does not provide quite as good a fit as the gamma distribution. Both AIC and BIC clearly favour the three models with non-diagonal covariance matrices over the C0 and Cγ models, and this partly justifies the introduction of spatial dependence within weather types. Furthermore, introducing spatial structure leads to selecting models with fewer weather types which suggests that the spatial structure of precipitation is an important aspect of weather type. Finally, AIC selects the model C2 with five regimes as the best model (all models gave larger AIC values for M > 5) whereas BIC selects the model C1 with only two regimes.
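The selections quoted above can be checked mechanically from the values in Table 3. The short sketch below stores the table as dictionaries and returns the model and number of regimes with the smallest criterion value; the helper name is illustrative.

```python
aic_table = {
    "Cγ": [17404, 14317, 13436, 13213, 13144],
    "C0": [17403, 14445, 13639, 13398, 13289],
    "C1": [13092, 12770, 12697, 12616, 12623],
    "C2": [12957, 12750, 12599, 12525, 12506],
    "C∗": [12904, 12643, 12640, 12674, 12611],
}
bic_table = {
    "Cγ": [17502, 14523, 13760, 13663, 13731],
    "C0": [17501, 14651, 13963, 13849, 13875],
    "C1": [13196, 12985, 13035, 13085, 13233],
    "C2": [13089, 13023, 13022, 13108, 13257],
    "C∗": [13101, 13046, 13259, 13519, 13690],
}


def best_model(table):
    """Return (model, M) with the smallest criterion value in the table."""
    value, name, m = min(
        (value, name, m)
        for name, values in table.items()
        for m, value in enumerate(values, start=1)
    )
    return name, m


best_by_aic = best_model(aic_table)   # ("C2", 5)
best_by_bic = best_model(bic_table)   # ("C1", 2)
```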


To make a final selection, we checked the physical realism of the different models (including other spatial models comparable in complexity to C2) and their ability to generate realistic precipitation. When M > 4, for example, the fitted states included two very similar states or had states with very low probability of occurrence. On the other hand, when M < 4 the states were distinct and interpretable, but could not reproduce the observed dynamics as well as 4 state models (underestimation of the dry and wet durations in particular). This led us to restrict attention to M = 4 and select model C2, which then has the best AIC value and close to the best BIC value, although several models gave very similar results.

The emission probabilities corresponding to this model are described in Figure 4. Within each weather type the distribution of the precipitation at each location is described by three parameters; the mean and variance of the underlying Gaussian variable and the exponent used in the power transformation. Instead of these three parameters and in order to facilitate interpretation, we have plotted the probability of rainfall occurrence and the mean and standard deviation of the positive rainfall amounts. Similarly, the contemporaneous spatial correlations of precipitation within each weather type are also plotted in Figure 4 rather than the covariance matrices Σ(s). These quantities were all estimated using Monte Carlo methods.
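The Monte Carlo calculation of these interpretable summaries can be sketched for a single location as follows. The code assumes, consistently with the definition of Wt in Section 4.3, that rainfall equals the Gaussian variable raised to the power β when that variable is positive and is zero otherwise. The parameter values passed in are placeholders, not fitted values, and the function name is hypothetical.

```python
import random
import statistics


def rainfall_summaries(m, sigma, beta, n_draws=100_000, seed=0):
    """Monte Carlo estimates for one location within one weather type.

    Returns (probability of rain, mean of positive amounts,
             standard deviation of positive amounts).
    """
    rng = random.Random(seed)
    positive = []
    for _ in range(n_draws):
        w = rng.gauss(m, sigma)          # underlying Gaussian variable
        if w > 0:
            positive.append(w ** beta)   # power transform of the wet part
    p_rain = len(positive) / n_draws
    return p_rain, statistics.fmean(positive), statistics.stdev(positive)
```

The within-type spatial correlations would require joint draws from the multivariate Gaussian with covariance Σ(s); the marginal version shown here only needs the corresponding mean and diagonal entry.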

The first regime corresponds mainly to dry conditions, with low probabilities of rainfall occurrence (between 3% and 13%), but the means and the standard deviations of the positive amounts are relatively large, in particular at locations 2, 6 and 7. This indicates that even if the probabilities of rainfall occurrence are low, the amounts can be substantial when rainfall occurs. The spatial correlations of (wet and dry) precipitation show evidence of clustering with locations 3 and 4 generally being less correlated with the remaining locations. Within these two clusters there is moderate and more uniform correlation.

The second regime is a very light rain or drizzle regime with relatively low probability of rainfall occurrence (between 20% and 31%) and low amounts (the mean of the positive amount lies between 0.5 mm and 1.5 mm). Consistent with its localised and scattered nature, the second regime has spatial correlations that are generally lower than the other regimes.

The third regime corresponds mainly to wet conditions and moderate rainfall, with probabilities of rainfall occurrence ranging from about 60% near the coast up to 98% in the foothills. Similar regional differences exist for the amounts, with heavier precipitation near the mountains. The spatial correlations are also generally low with the city of Christchurch (location 5) having the least correlation with the remaining stations.

Finally, the fourth regime corresponds to wet conditions and heavy rainfall. The precipitation is relatively homogeneous in space, except the northern locations 3, 6 and 7 which have lower precipitation. Here the spatial correlations are more uniform and stronger, with all stations generally experiencing similar weather conditions.

Figure 4: Distribution of precipitation for the fitted C2 model in the different weather types St = s (rows). The columns give the probability of rainfall occurrence, the mean and standard deviation of the positive rainfall amounts (mm), and the correlation matrix corr(Yt|St = s). The distribution of the fitted C2 model was obtained by simulation.

The transition matrix describing the evolution of the hidden Markov chain, its stationary distribution, the initial probabilities πs and mean durations in the different regimes, are given in Table 4. The most likely regional weather type is St = 1 (dry conditions), and the mean duration of sojourns in this regime equals 2.73 days. The remaining states have lower mean durations which are similar and vary between 1.65 and 1.73 days. Some of the transition probabilities are very low. For example, the probability of going from state 4 to state 3 is approximately 0, and most of the time the fourth regime will be followed by the second regime. Also, the probabilities of going directly from the drier regimes 1 and 2 to wet regime 4 are low, and most of the time the Markov chain will transit from the drier to wetter regimes through regime 3. As expected, the estimated initial probabilities of being in the respective states at the beginning of April are different from the stationary distribution. This reflects the seasonal rainfall patterns with March rainfall having different dynamics to that of April. In order to further investigate the meteorological realism of this model, the distribution of other meteorological variables (such as pressure, temperature, etc) in the different regimes could be investigated.

          St
St−1      1      2      3      4      Initial   Stationary   Ds
1         0.63   0.21   0.11   0.05   0.21      0.38         2.73
2         0.39   0.42   0.15   0.03   0.64      0.34         1.73
3         0.00   0.34   0.40   0.26   0.15      0.16         1.67
4         0.06   0.54   0.00   0.40   0.00      0.12         1.65

Table 4: Estimated transition probabilities of {St} for the C2 model together with the initial probabilities πs, the stationary distribution and the mean state durations Ds.
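The stationary distribution and mean durations follow directly from the transition matrix, which gives a useful consistency check on Table 4. The sketch below uses the published (rounded) transition probabilities, renormalises each row to sum to one, finds the stationary distribution by repeated multiplication, and uses the geometric sojourn-time formula 1/(1 − Pss). The function names are illustrative, and small discrepancies from the tabulated values are expected because of rounding.

```python
P = [
    [0.63, 0.21, 0.11, 0.05],
    [0.39, 0.42, 0.15, 0.03],
    [0.00, 0.34, 0.40, 0.26],
    [0.06, 0.54, 0.00, 0.40],
]
# Published entries are rounded, so renormalise each row to sum to one.
P = [[p / sum(row) for p in row] for row in P]


def stationary_distribution(P, n_iter=1000):
    """Approximate the stationary distribution by power iteration."""
    M = len(P)
    pi = [1.0 / M] * M
    for _ in range(n_iter):
        pi = [sum(pi[s] * P[s][u] for s in range(M)) for u in range(M)]
    return pi


def mean_durations(P):
    """Expected sojourn time in each state of a Markov chain."""
    return [1.0 / (1.0 - P[s][s]) for s in range(len(P))]


stationary = stationary_distribution(P)
durations = mean_durations(P)
```

With these inputs the results agree with the stationary distribution and mean durations of Table 4 to within the rounding of the published transition probabilities.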

To further validate the model we have checked its ability to simulate realistic precipitation. For that we have generated artificial time series from the model, and have compared statistics corresponding to the artificial sequences with those from the original data. We first looked at the marginal distributions of precipitation, both sample and fitted, at each location. Typical results are shown in Figure 5 which gives the Quantile-Quantile (QQ) plot and log-survivor function (the logarithm of one minus the distribution function) of positive precipitation for location 1 (Winchmore). To assist visual comparison, 95% prediction intervals about the theoretical log-survivor function have been superimposed where these quantities have been computed using Monte Carlo methods and the fitted C2 model. For each precipitation value, the pointwise limits of the intervals were calculated as the 2.5th and 97.5th percentiles of the corresponding values of 1000 log-survivor functions calculated from 1000 independently generated sequences of 26 months of April simulated using the fitted model. Figure 5 shows that the model has successfully reproduced the marginal distributions, with similar results at the other locations. Although out-of-sample validation would ensure that this result is reproducible, it is neither surprising nor uncommon given the mixture distributions inherent in hidden Markov modelling.
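The construction of the pointwise prediction intervals can be sketched as follows. The code computes an empirical log-survivor function for each simulated sample and then takes percentiles across samples at every grid value. Exponential draws stand in for simulations from the fitted model, 200 replicates are used instead of 1000 to keep the example fast, and the nearest-rank percentile rule is an assumption, since the text does not specify one.

```python
import math
import random


def log_survivor(sample, grid):
    """Empirical log(1 - F(x)) on a grid; -inf where no value exceeds x."""
    n = len(sample)
    out = []
    for x in grid:
        tail = sum(1 for v in sample if v > x) / n
        out.append(math.log(tail) if tail > 0 else float("-inf"))
    return out


def percentile(sorted_values, q):
    """Nearest-rank percentile of pre-sorted values, q in [0, 100]."""
    rank = math.ceil(q / 100.0 * len(sorted_values))
    return sorted_values[min(len(sorted_values) - 1, max(0, rank - 1))]


def pointwise_band(simulated_samples, grid, lower=2.5, upper=97.5):
    """Pointwise (lower, upper) limits of the simulated log-survivor curves."""
    curves = [log_survivor(s, grid) for s in simulated_samples]
    band = []
    for j in range(len(grid)):
        column = sorted(curve[j] for curve in curves)
        band.append((percentile(column, lower), percentile(column, upper)))
    return band


# Stand-in for model simulations: unit exponential "rainfall amounts".
rng = random.Random(1)
simulated = [[rng.expovariate(1.0) for _ in range(300)] for _ in range(200)]
grid = [0.5, 1.0, 2.0]
band = pointwise_band(simulated, grid)
```

The sample log-survivor function of the observed data would then be overlaid on the band, as in the right panel of Figure 5.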

Figure 5: Distributions of positive precipitation (mm) at location 1 (Winchmore). The left panel shows the QQ plot of the distribution for the fitted C2 model against the sample distribution. The right panel shows the log-survivor function and 95% prediction intervals (grey) for the fitted C2 model, with the sample log-survivor function (black) superimposed. The distribution of the fitted C2 model was obtained by simulation.

To assess the model’s ability to capture the contemporaneous spatial dependence structure of precipitation, the pairwise correlations of rainfall amounts Yt were estimated at each pair of locations by the C2 and Cγ models with M = 4 and compared to the sample correlations given by the data. Recall that the Cγ model assumes that precipitation is conditionally independent in space given weather type, whereas the C2 model has spatial structure that is a function of distance between locations. Figure 6 shows that the C2 model successfully describes the spatial structure of rainfall amounts by comparison to the Cγ model which significantly underestimates the pairwise correlations.

Figure 6: Pairwise spatial correlations of rainfall amounts (left panels) and occurrence (right panels) for the fitted C2 model (upper panels) and fitted Cγ model (lower panels) plotted against the sample correlations. The correlations for the fitted models were obtained by simulation.

The correlations of rainfall occurrence (1 when Yt(k) > 0, 0 otherwise) at each pair of locations were also estimated by the C2 and Cγ models and compared to those given by the data. Figure 6 shows that the C2 model improves the description of the joint occurrence probabilities, particularly for the three location pairs with highest sample correlations, but not as dramatically as for the amounts. These results collectively indicate that a common weather type alone cannot capture all the spatial dependence structure of precipitation for the network under consideration, and justify the introduction of spatial structure within the weather types.
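Both sets of pairwise correlations can be computed with the same routine, applied either to the amounts or to the occurrence indicators. The sketch below assumes the data (observed or simulated) are stored as one row per day and one column per location; the function names are illustrative.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(rain, occurrence=False):
    """Correlations for every pair of locations j < k.

    rain[t][k]: rainfall on day t at location k
    occurrence: if True, correlate the indicators 1{rain > 0} instead
    """
    K = len(rain[0])
    columns = [
        [(1.0 if row[k] > 0 else 0.0) if occurrence else row[k] for row in rain]
        for k in range(K)
    ]
    return {(j, k): pearson(columns[j], columns[k])
            for j in range(K) for k in range(j + 1, K)}
```

Applying the function to the data and to long simulated sequences from each fitted model gives the points plotted in Figure 6.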

The model assumes that the dynamics of precipitation are inherited solely from the common weather type St. To check this assumption, Figure 7 plots the lag-one autocorrelations of the rainfall amounts and rainfall occurrence at each location, together with the log-survivor functions of the dry and wet durations at a typical location (Winchmore) between the coast and the foothills.

Figure 7: Lag-one autocorrelations of rainfall amounts (top left) and occurrence (top right) at each location, and log-survivor functions of the dry (bottom left) and wet (bottom right) durations at location 1 (Winchmore). Sample values are plotted (black) together with the values and 95% prediction intervals for the fitted C2 model (grey) obtained by simulation.

The lag-one autocorrelation is a simple descriptive measure of temporal dependence commonly used in practice, particularly in the case of stationary or near stationary data, as is the case here. Similarly, the lengths of wet and dry spells provide an alternative and equally important view of temporal persistence. In all cases, 95% prediction intervals of these variables about the corresponding theoretical values have been superimposed where these quantities were computed using Monte Carlo simulations from the fitted C2 model in the same way as before. Although the log-survivor functions for Winchmore appear to show reasonable agreement between the fitted model and the data in terms of the lengths of the sequences of wet and dry days, the lag-one autocorrelations indicate that the model consistently underestimates the persistence of rainfall amounts and occurrence at each location, in some cases significantly. This finding is also evident with the lag-one cross-correlations shown in Figure 8.
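The two persistence diagnostics used here are straightforward to compute for a single daily series. The sketch below gives one common estimator of the lag-one autocorrelation and extracts the lengths of consecutive dry and wet runs. It treats the series as one continuous sequence, so handling of breaks between successive Aprils is left to the caller; the function names are illustrative.

```python
def lag_one_autocorrelation(x):
    """Lag-one sample autocorrelation of a series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def spell_lengths(x):
    """Return (dry_spells, wet_spells) as lists of run lengths in days."""
    dry, wet = [], []
    run_is_wet, length = None, 0
    for v in x:
        is_wet = v > 0
        if is_wet == run_is_wet:
            length += 1
        else:
            if length:
                (wet if run_is_wet else dry).append(length)
            run_is_wet, length = is_wet, 1
    if length:
        # Close the final run of the series.
        (wet if run_is_wet else dry).append(length)
    return dry, wet
```

Passing the occurrence indicators rather than the amounts to `lag_one_autocorrelation` gives the autocorrelation of rainfall occurrence.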

Figure 8: Lag-one cross-correlation matrices of rainfall amounts. The left panel shows the sample values determined from the data and the right panel the values for the fitted C2 model obtained by simulation.

6 Conclusions

A new model for the space-time evolution of daily rainfall has been introduced which combines two approaches that have already been proposed elsewhere in the literature. A hidden Markov model has been used to model regional weather types that drive the temporal dependence of the rainfall process. The spatial dependence of rainfall within weather type has been modelled by censored power-transformed Gaussian distributions. The latter provide flexible and interpretable multivariate models for the mixed discrete-continuous variables that describe both precipitation, when it occurs, and no precipitation.

Using a Monte Carlo EM algorithm, the model was fitted to daily rainfall data from a small network of stations in an area of New Zealand with complex topography and, as a consequence, a diversity of orographic effects. The results obtained show that the marginal distributions of the rainfall data and its spatial structure are well-described by the fitted model. It is also shown that the model provides a better description of the spatial structure of precipitation than more standard HMMs such as [4].

However, despite its better spatial dependence structure, the model cannot fully reproduce the local dynamics and underestimates the lag-one autocorrelations of both rainfall amounts and occurrence. Its performance in this regard is little different from the standard HMM proposed in [4] which is to be expected since both models share the same temporal structure based on a hidden regional weather type. By contrast, the local weather type HMM proposed in [26] does appear to accurately capture the persistence present in rainfall occurrence, but not rainfall amounts where the results are similar to the other models. The local weather state model uses separate HMMs at each location with spatial dependence built using copulas. However, despite better dynamic properties, this model requires many parameters and is unable to adequately capture the spatial dependence structure of rainfall amounts.

This suggests including local weather states within regional weather states (a hierarchical model) to better describe the physical generation and propagation of rainfall across this orographically diverse region. The aim would be to marry the better dynamics of the local weather state model [26] with the better spatial dependence properties of the regional weather type model proposed in this paper. Alternatively, one could consider building in further statistical dependence by replacing the driving Gaussian process {Wt} by a low-order vector autoregression. These and other related modelling issues remain to be investigated.

Acknowledgments

The authors are grateful to an associate editor and two anonymous referees for their constructive comments and suggestions which led to significant improvements in the paper. The first author acknowledges the support of the Hidden Markov Models and Complex Systems research programme sponsored by the New Zealand Institute of Mathematics and its Applications. The second and third authors acknowledge financial support provided by the New Zealand Foundation for Research, Science and Technology through contract C01X0302. The assistance of James Sturman (National Institute of Water and Atmospheric Research) who prepared the map in Figure 1 is also gratefully acknowledged.

References

[1] D.J. Allcroft and C.A. Glasbey. A latent Gaussian Markov random field model for spatio-temporal rainfall disaggregation. Applied Statistics, 52:487–498, 2003.

[2] A. Bardossy and E. Plate. Space-time model for daily rainfall using atmospheric circulation patterns. Water Resources Research, 28(5):1247–1260, 1992.

[3] E. Bellone. Non homogeneous hidden Markov models for downscaling synoptic atmospheric patterns to precipitation amount. PhD thesis, University of Washington, 2000.

[4] E. Bellone, J.P. Hughes, and P. Guttorp. A hidden Markov model for downscaling synoptic atmospheric patterns to precipitation amounts. Climate Research, 15:1–12, 2000.

[5] I. Bogardi, I. Matyasovsky, A. Bardossy, and L. Duckstein. Application of a space-time stochastic model for daily precipitation using atmospheric circulation patterns. Journal of Geophysical Research, 98(D9):16653–16667, 1993.

[6] O. Cappe, E. Moulines, and T. Ryden. Inference in hidden Markov models. Springer-Verlag, New York, 2005.

[7] G. Celeux, D. Chauveau, and J. Diebolt. On stochastic versions of the EM algorithm. Tech. Rep. RR-2514, INRIA, 1995.

[8] S.P. Charles, B.C. Bates, and J.P. Hughes. Downscaling of daily multisite precipitation. Journal of Geophysical Research, 104(D24):31657–31669, 1999.


[9] A.P. Dempster, N.M. Laird, and D.B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, B, 39:1–38, 1977.

[10] A. Doucet, N. De Freitas, and N. Gordon. Sequential Monte Carlo Methods in Practice. Springer-Verlag, New York, 2001.

[11] Y. Ephraim and N. Merhav. Hidden Markov processes. IEEE Transactions on Information Theory, 48:1518–1569, 2002.

[12] G. Fort and E. Moulines. Convergence of the Monte Carlo expectation maximization for curved exponential families. Annals of Statistics, 31(4):1220–1259, 2003.

[13] A. Genz. Comparison of methods for the computation of multivariate normal probabilities. Computing Science and Statistics, 25:400–405, 1993.

[14] V. Hajivassiliou, D. McFadden, and P. Ruud. Simulation of multivariate normal rectangle probabilities and their derivatives: Theoretical and computational results. Journal of Econometrics, 72(1-2):85–134, 1996.

[15] L.E. Hay, G.J.Jr. McCabe, D.M. Wolock, and M.A. Ayers. Simulation of precipitation by weather-type analysis. Water Resources Research, 27:493–501, 1991.

[16] J.P. Hughes and P. Guttorp. A class of stochastic models for relating synoptic atmospheric patterns to local hydrologic phenomenon. Water Resources Research, 30:1535–1546, 1994.

[17] J.P. Hughes, P. Guttorp, and S.P. Charles. A non-homogeneous hidden Markov model for precipitation occurrence. Applied Statistics, 48(1):15–30, 1999.

[18] S. Kirshner. Modeling of multivariate time series using hidden Markov models. PhD thesis, University of California, 2005.

[19] I.L. MacDonald and W. Zucchini. Hidden Markov and Other Models for Discrete-Valued Time Series. Chapman & Hall/CRC, London, 1997.

[20] X.L. Meng and D.B. Rubin. Maximum likelihood via the ECM algorithm: A general framework. Biometrika, 1993.

[21] C.J. Paciorek and M.J. Schervish. Spatial modelling using a new class of nonstationary covariance functions. Environmetrics, 2006.

[22] C.W. Richardson. Stochastic simulation of daily precipitation, temperature and solar radiation. Water Resources Research, 1981.

[23] C.P. Robert and G. Casella. Monte Carlo Statistical Methods. Springer-Verlag, New York, second edition, 2004.

[24] B. Sanso and L. Guenni. A non-stationary multi-site model for rainfall. Journal of the American Statistical Association, 95(452):1089–1100, 2000.


[25] R. Srikanthan and T.A. McMahon. Stochastic generation of annual, monthly and daily climate data: A review. Hydrology and Earth System Sciences, 5:653–670, 2001.

[26] C. Thompson, P. Thomson, and X. Zheng. Fitting a multisite rainfall model to New Zealand data. Journal of Hydrology, 340:25–39, 2007.

[27] Y.L. Tong. The Multivariate Normal Distribution. Springer-Verlag, New York, 1990.

[28] D.S. Wilks. Multisite generalization of a daily stochastic precipitation model. Journal of Hydrology, 210:178–191, 1998.

[29] D.S. Wilks and R.L. Wilby. The weather generation game: a review of stochastic weather models. Progress in Physical Geography, 23:329–357, 1999.

[30] L.L. Wilson, D.P. Lettenmaier, and E. Skyllingstad. A hierarchical stochastic model of large-scale atmospheric circulation patterns and multiple station daily precipitation. Journal of Geophysical Research, 97(D3):2791–2809, 1992.

[31] C.F.J. Wu. On the convergence properties of the EM algorithm. Annals of Statistics, 11:95–103, 1983.

[32] W. Zucchini and P. Guttorp. A hidden Markov model for space-time precipitation. Water Resources Research, 27:1917–1923, 1991.
