
MATRIX ANALYTIC METHODS WITH MARKOV DECISION PROCESSES FOR

HYDROLOGICAL APPLICATIONS

Aiden James Fisher

December 2014

Thesis submitted for the degree of

Doctor of Philosophy

in the School of Mathematical Sciences at

The University of Adelaide


For Elham


Acknowledgments

I would like to acknowledge the help and support of various people for all the years I was working on this thesis.

My supervisors David Green and Andrew Metcalfe, for all the work they did mentoring me throughout the process of researching, developing and writing this thesis; the countless hours they have spent reading my work and giving me valuable feedback; for supporting me to go to conferences; giving me paid work to support myself; and, most importantly, agreeing to take me on as a PhD candidate in the first place.

The School of Mathematical Sciences for giving me a scholarship and support for many years. I would especially like to thank all the office staff, Di, Jan, Stephanie, Angela, Cheryl, Christine, Renee, Aleksandra, Sarah, Rachael and Hannah, for all their help with organising offices, workspaces, computers, casual work, payments, the postgraduate workshop, conference travel, or just for having a coffee and a chat. You all do a great job and don't get the recognition you all deserve.

All of my fellow postgraduates. In my time as a PhD student I have shared offices with over 20 postgraduates, but my favourite will always be the "classic line" of G23: Rongmin Lu, Brian Webby, and Jessica Kasza. I would also like to give special thanks to Nick Crouch, Kelli Francis-Staite, Alexander Hanysz, Ross McNaughtan, Susana Soto Rojo, my brotege Stephen Wade, and my guide to postgraduate life Jason Whyte, for being my friends all these years. Brian Webby once again for helping me with, and co-authoring, my first research paper and for being there for me while I settled into postgraduate life.

Kunle Akande for his insights and advice about the Essex Transfer Scheme that helped in the writing of the paper he co-authored with me.

To my loving parents, Sue and Greg, who supported me emotionally and financially for most of the early years I was a PhD candidate. Also my brothers and sister, Damon, Dominic, and Shannon, for their support.

Most importantly I would like to thank my darling wife Elham for her love and support while writing this thesis. Without her encouragement I would never have finished, and it owes its existence to her.


Contents

Preamble

1 Introduction

2 Discrete Time Markov Chains
  2.1 Introduction
  2.2 Markov Chain Terminology
    2.2.1 Reducibility of the Markov Chain
    2.2.2 Passage and Recurrence in the Markov Chain
    2.2.3 Types of States in a Markov Chain
  2.3 Introductory Markov Theory
    2.3.1 Markov Probability Distributions
    2.3.2 Stationary Distribution
    2.3.3 Ergodic Chains
    2.3.4 Hitting Probabilities, Passage Times, and Absorbing Chains

3 Discrete Time Markov Chains with Hidden States
  3.1 Discrete Phase-Type Distribution
    3.1.1 Moments of the Discrete Phase-Type Distribution
    3.1.2 Properties of the Discrete Phase-Type Family
  3.2 Matrix Analytic Methods
    3.2.1 General MAM Models
  3.3 Hidden Markov Model
    3.3.1 Relation to Matrix Analytic Methods
    3.3.2 Fitting Hidden Markov Models: The Baum-Welch Algorithm
  3.4 Adaptation of the Baum-Welch Algorithm
    3.4.1 Fitting Discrete-Time Batch Markovian Arrival Processes
    3.4.2 Fitting Discrete-Time Phase-Type Distributions

4 Markov Decision Processes
  4.1 Introduction
  4.2 Value Iteration
    4.2.1 Discounted Value Iteration
  4.3 Policy Iteration
  4.4 Stochastic Solution Methods
    4.4.1 Simulated Annealing
    4.4.2 Genetic Algorithms
  4.5 Partially Observable Markov Decision Processes

5 Literature Review
  5.1 Moran Dam Model
    5.1.1 Gould Probability Matrix
  5.2 Linear Programming Models
    5.2.1 Stochastic Linear Programming
  5.3 Stochastic Dynamic Programming - Single Reservoir Models
    5.3.1 Stochastic Dynamic Programming
    5.3.2 Seasonal Stochastic Dynamic Programming Models
    5.3.3 Serially Correlated and Forward Forecast Stochastic Dynamic Programming Models
  5.4 Stochastic Dynamic Programming - Multi-Reservoir Models
  5.5 Hidden Markov Models

6 Synthesis
  6.1 Optimal Control of Multi-Reservoir Systems with Time-Dependent Markov Decision Processes
  6.2 Managing River Flow in Arid Regions with Matrix Analytic Methods
  6.3 Modelling of Hydrological Persistence for Hidden State Markov Decision Processes
  6.4 First-Passage Time Criteria for the Operation of Reservoirs

7 Conclusion
  7.1 Further Work

A Supplementary Material for Chapter 3
  A.1 Supplementary Material for Section 3.3
    A.1.1 Demonstration of Equation (3.7)
    A.1.2 Demonstration of Equation (3.9)

Bibliography

Abstract

In general, a physical process is modelled in terms of how its state evolves over time. The main challenge of modelling is to describe this evolution without unnecessary computation or unrealistic simplifying assumptions. Markov chains have found widespread application in many fields of analytic study, from engineering to biology to linguistics. One of their most notable hydrological applications has been modelling the storage of reservoirs, as described in Moran's influential monograph (Moran, 1955). One of the fundamental properties of Markov chains is that the future evolution depends only on the present state, and not on any of the previous states. This is known as the "memoryless" property, or the Markov property.

In a Markov chain model the states representing the physical process are discrete, but time can be modelled as either discrete or continuous. In this thesis, time is modelled in discrete units because this is consistent with the well-established theory of Markov decision processes. The discrete states need not be a practical limitation, because continuous state variables, in this case the storage in a reservoir, can be discretised to a reasonable approximation.

There have been many advances in Markov chain modelling techniques in other fields, most notably in telecommunications with the development of matrix analytic methods. Matrix analytic methods exploit the structure of certain types of Markov chains in order to calculate properties of the models more efficiently. This thesis examines how these methods can be applied to hydrological applications, with the goal of providing a framework in which more precise modelling can be achieved without extending computational times. There are many unique challenges due to the seasonal nature of hydrology, as well as the tendency for persistence of hydrological conditions. This thesis explores some of these problems in four papers.

The first paper looks at the issues surrounding hydrological persistence and its incorporation into Markov decision processes, using the Southern Oscillation Index as a proxy.

The second paper looks at modelling spate flows with matrix analytic methods in the Cooper Creek, an ephemeral river located in South Australia.

The third paper looks at a way of modelling hydrological persistence with underlying hidden states, in contrast to assumed dependence on the Southern Oscillation Index.


The final paper looks at multi-objective optimisation using first-passage time distributions, with an application to a two-reservoir system in South East England. The Pareto front of Pareto optimal policies is shown.


Publications

Fisher, A. J., Green, D. A., and Metcalfe, A. V. (2012). "Modelling of hydrological persistence for hidden state Markov decision processes." Annals of Operations Research, 199: 215-224.

Fisher, A. J., Green, D. A., Metcalfe, A. V., and Akande, K. (2014). "First-passage time criteria for the operation of reservoirs." Journal of Hydrology, 519, Part B: 1836-1847.


Preamble

This thesis has been submitted to the University of Adelaide for the degree of Doctor of Philosophy. According to the University's Specification for Thesis, a Doctoral thesis may comprise,

a combination of conventional written narrative presented as typescript and publications that have been published and/or submitted for publication and/or text in manuscripts,

and this thesis takes this form.

The thesis has been divided into seven chapters:

The first chapter is a very brief introduction to the history of water management, the problems currently facing the world's water supplies, and the future in a changing climate.

The second chapter gives a brief introduction to discrete-time Markov processes, or Markov chains. This chapter is only intended as a cursory view of the topic. A more general and in-depth discussion can be found in any textbook on the topic, for example Norris (1997). A reader familiar with this material could skip this chapter.

The third chapter introduces several more advanced Markov models, including the phase-type distribution, Markov arrival processes, and hidden Markov models.

The fourth chapter discusses solution techniques for Markov decision processes.

The fifth chapter comprises a literature review of mathematical modelling of reservoirs and methods for determining optimal policies using Markov decision processes.

The sixth chapter comprises four published papers that form the main component of the thesis, along with an outline of each paper. The published papers are presented in the format in which they were printed. All the papers have been scaled to the largest size possible that will allow the pages to fit within the University's specifications for a thesis.

The final chapter outlines potential future research following on from this thesis.



Chapter 1

Introduction

The human need to manage water resources can be traced back at least as far as the advent of agriculture around 9,000 BC. Archaeological records show that as early as 5,000 years ago naturally forming reservoirs in volcanic craters were being used on the Arabian peninsula for the irrigation of crops. Some time between 5,000 and 4,000 BC, canals were constructed in Mesopotamia, complete with weirs and sluice gates, not only for agricultural irrigation but also transportation (Tamburrino, 2010). The oldest known dam constructed by humans was either the Jawa Dam in Jordan, circa 3,000 BC (Fahlbusch, 2009), or the Sadd-el-Kafara Dam (Dam of the Pagans) in Egypt, circa 2,650 BC (Mays, 2008). However, Fahlbusch (2009) contends that, given the size and scale of these dams, there would likely have been smaller precursor dams, citing 7,000-year-old earthworks in Jordan that serve to hold back water pressure but do not prevent seepage. Some of the early water management systems showed signs of being extensive and well planned. For example, the Nimrud Dam, built on the Tigris River 180 km upstream of Baghdad, diverted water over 100 km via the Nahrawan Canal to supplement the flows of the Diyala River (Mays, 2008).

The development of concrete by the Romans led to some of the most complex water systems built to date. The famous public fountains and baths marked a change in water use, from a necessity for human survival to a luxury item (Mays, 2010a). Dams and aqueducts allowed water to be sourced and stored hundreds of miles from cities and allowed for unprecedented growth in the Roman cities. Some of these dams, such as the Cornalvo Dam in the Badajoz province of Spain, are still in use today.

Modern super-dam construction began with the construction of the Hoover Dam in Black Canyon on the border of the states of Arizona and Nevada in the United States of America. At 221.4 meters high, 379 meters long and with a volume of 2,480,000 m3, Hoover Dam was at the time of its completion the largest concrete structure in the world. The current largest dam in the world is the recently completed Three Gorges Dam, which impounds the Yangtze River; at 181 meters high and 2,335 meters long, it has five times the volume of Hoover Dam.

More important than the size of the dam itself is the water body (or reservoir) it holds back. Kariba Dam, on the border between Zambia and Zimbabwe, impounds the Zambezi River forming Lake Kariba, the largest man-made reservoir in the world by capacity at 180 km3. Lake Mead, formed by Hoover Dam impounding the Colorado River, was the largest man-made reservoir at construction (now 20th) with a maximum capacity of 35.2 km3 (in normal operation it should never exceed 32.2 km3). However, current storage is down to just 12.6 km3 (USBR, 2014). Whilst this is in part caused by recent droughts, the long-term future of the reservoir, along with many others around the world, is under question. Barnett and Pierce (2008) examined a series of studies that found run-off into the Colorado River is expected to decrease by 10-30% over the next 30-50 years, due to changes in the catchment basin and potential decreases in future rainfall due to climate change. They predicted that, based on projections of inflow then available, there was a 10% chance that Lake Mead would reach dead storage by 2013 and a 50% chance it would reach dead storage by 2021. Longer-term predictions are even more bleak. Even if no climate change occurs, there is a 90% chance of reaching dead storage by 2055; if climate change results in a loss of inflow of 20%, dead storage would be reached by 2055 with almost certainty. The impact of the reservoir reaching dead storage would be immense. Lake Mead supplies 90% of the water used by the Southern Nevada Water Authority, and the power generated at the dam serves over 1.3 million people.

In less developed countries the problem is more dire. The United Nations Children's Fund estimates that 400 million children do not have access to safe drinking water (UNICEF, 2005), and that approximately 1.1 billion people in total are without potable water. The United Nations Educational, Scientific and Cultural Organization issues a report every three years on the status and future of the world's freshwater, with particular concern for the disparity in access to freshwater as a fundamental human right (UNESCO, 2003; UNESCO, 2006; UNESCO, 2009; UNESCO, 2012). Whilst the reports have found that access to freshwater is increasing, the supply of freshwater is decreasing.

The increase in demand comes not only from population increase, but from economic development. The two largest users of water per capita are the US and Australia, which use 575 and 493 litres of water per person per day respectively for entirely domestic purposes (UNESCO, 2003). If the entire human population were to use water at the rate of Australians, 3.3% of all annual freshwater runoff would be used in domiciles alone. Based on Department of the Environment and Heritage estimates, in order for the entire world to consume water at the rate of Australians not only domestically, but in agriculture, mining, and industry, 22.5% of all the world's annual water runoff would need to be captured and delivered. Postel et al. (1996) examined what they termed accessible runoff. Of the 40,700 km3 of renewable water that runs off into rivers globally on average each year, 7,774 km3/annum is in regions too remote to be accessible and 20,426 km3/annum is short-term flood water that will go uncaptured due to physical restrictions on storage sizes, leaving just 12,500 km3/annum that is geographically and temporally accessible. Using this definition of accessible water, 73% of all accessible runoff would be required for every person on Earth to enjoy the same standard of living as an Australian. Postel et al. (1996) found that humans have already appropriated 54% of all the accessible renewable freshwater runoff. The problem of obtaining accessible renewable freshwater is exacerbated by the distribution of people living on Earth. Table 1.1 shows the distribution of human population and freshwater runoff by continent. The most stark feature of the table is that the human population is centred in regions that have comparatively lower freshwater runoff, with the Americas and the Oceanic region being the only notable exceptions.

Table 1.1: Re-creation of the table "Share of global runoff and population by continent" shown in Postel et al. (1996).

Region                       River runoff   Share of global    Share of global
                             (km3/annum)    river runoff (%)   population (%)
Europe                            3,240            8.0              13.0
Asia                             14,550           35.8              60.5
Africa                            4,320           10.6              12.5
North and Central America         6,200           15.2               8.0
South America                    10,420           25.6               5.5
Australia and Oceania             1,970            4.8               0.5
Total                            40,700          100.0             100.0

With increasing demand, decreasing supply, failing water storages, and a population that is distributed unevenly in relation to the distribution of freshwater, improved and sustainable management of water resources is not only desirable, it is essential. Mays (2010b) cites at least seven civilisations, including the Mohenjo Daro of Pakistan, the Akkadian Empire, and the Maya, whose demise was the result of, or instigated through, the loss of their water supply or the mismanagement of the supply that existed. In 1999, Ismail Serageldin, Vice-President for Environmentally and Socially Sustainable Development of the World Bank, predicted: "The wars of the next century will be about water."

Whilst larger water storage and delivery facilities would go some way to alleviating the problems ahead, these are both expensive and only feasible in areas where the landscape permits reservoir construction. One of the most cost-effective methods is improving the management techniques of the water management facilities already in place, and also the methods used in determining the appropriation and distribution of the resource.

This thesis aims to make a contribution to improving the management techniques of reservoirs by introducing new mathematical tools to the problem of finding operating policies that preserve the long-term amenity of the storage. The earliest reservoir management model, the Moran Dam model (see Section 5.1), was developed in the 1950s and used Markov chains as the basis of its construction. Whilst the Markov chain remains one of the most important modelling tools to date, various expansions of the idea have made it a more robust tool. Most notably, matrix analytic methods were developed in the field of telecommunication systems modelling as a means of introducing more general distribution times within the Markov chain framework. Matrix analytic methods are applied here to traditional reservoir modelling and optimisation techniques to develop models and operating policies that are far more general in structure and more descriptive of the system.


This thesis shows that, by using the special phase structure of matrix analytic methods, more information can be included in long-term planning models. This in turn allows the demands on the resources to be met whilst simultaneously allowing the manager of these resources to more accurately predict the likelihood of future shortfalls. Finally, it proposes a method by which the managers of the resource can find a strategy to minimise the probability of the system reaching dead storage and obtain the maximum economic benefit from the resource through a compromise policy of operation.


Chapter 2

Discrete Time Markov Chains

Introduced by Markov (1906), the Markov chain forms the basis of the modelling techniques used in this thesis, and some of its important properties are presented in this chapter. A reader who is already familiar with Markov chains may skip this chapter.

2.1 Introduction

Consider a countable set of states, or state space, X = {0, 1, 2, . . . }, which is either finite or infinite, where the states are numbered sequentially. If there is a finite number of states, there exists a final state denoted N. A discrete time process {Jk} on this state space which describes the state of the process at time k ∈ Z+ (Z+ being the non-negative integers) is called a Markov chain if the following property holds:

Pr[Jk+1 = j|Jk = i, Jk−1, . . . , J0] = Pr[Jk+1 = j|Jk = i] , i, j ∈ X.

This Markovian property for a discrete time process is that the probability of transitioning from a state Jk = i at time k to another state Jk+1 = j at time k + 1 depends only on the state of the process at time k and not on any previous state. This is known as the "memoryless" property of the Markov chain, as the conditional probability remains unchanged by the process before time k. The only restrictions on these probabilities are that they are true probabilities,

0 ≤ Pr[Jk+1 = j|Jk = i] ≤ 1 ,

and that,

∑_{j∈X} Pr[Jk+1 = j|Jk = i] = 1 .

A Markov chain is termed time-homogeneous if,

Pr[Js+k+1 = j|Js+k = i] = Pr[Jk+1 = j|Jk = i] , for all s > 0 .


If a Markov chain is time-homogeneous it is convenient to denote

Pr[Jk+1 = j|Jk = i] = pij .

Markov chains are often visualised using state diagrams. An example of a finite state Markov chain with four states (N = 3) is given in Figure 2.1.

[Figure 2.1: State diagram of an N = 3 Markov chain, with the transition probabilities pij marked on the arcs between states 0, 1, 2, and 3.]

As these diagrams can become quite complex, this is not always the most convenient way of describing a Markov chain. In this thesis, state diagrams will generally appear without the transition probabilities marked, and sometimes the "self"-transition from i to i will be left out for clarity.

All of these time-homogeneous one-step transition probabilities can be expressed in the form of a stochastic matrix whose i row and j column element is the transition probability pij. For example, the chain in Figure 2.1 has the one-step transition probability matrix,

P =
[ p00 p01 p02 p03 ]
[ p10 p11 p12 p13 ]
[ p20 p21 p22 p23 ]
[ p30 p31 p32 p33 ] .

A stochastic matrix is any matrix whose rows each form a probability distribution; such matrices can be used to describe more than just Markov chains, the only restrictions being that all the elements of the matrix are greater than or equal to zero and that the rows sum to one. Sometimes the matrix may have additional indexing, so in order to make this clearer the i row and j column element of the matrix P may be denoted [P]ij in addition to the pij notation already introduced. If the counting of the state space starts at 1 instead of 0, then likewise the numbering of the rows and columns will start at 1 instead of 0.


Due to time-homogeneity, the two-step transition probability is,

Pr[Jk+2 = j|Jk = i] = ∑_{n=0}^{N} Pr[Jk+2 = j|Jk+1 = n] Pr[Jk+1 = n|Jk = i]
                   = ∑_{n=0}^{N} Pr[Jk+1 = j|Jk = n] Pr[Jk+1 = n|Jk = i]
                   = ∑_{n=0}^{N} p_in p_nj ,

which is the i row and j column element of the matrix P^2, denoted p_ij^(2). By induction, the n-step transition probability,

Pr[Jk+n = j|Jk = i] ,

is given by the i row and j column element of the matrix P^n, denoted p_ij^(n).
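As a concrete illustration of this result, the n-step probabilities can be computed directly as matrix powers. The following is a minimal sketch in Python (assuming numpy; the transition probabilities are illustrative and not from this thesis):

```python
import numpy as np

# Illustrative transition matrix for a four-state (N = 3) chain such as the
# one in Figure 2.1; the probabilities here are arbitrary, not from the thesis.
P = np.array([[0.50, 0.20, 0.20, 0.10],
              [0.10, 0.60, 0.20, 0.10],
              [0.30, 0.30, 0.20, 0.20],
              [0.25, 0.25, 0.25, 0.25]])

P2 = P @ P                              # two-step probabilities p_ij^(2) = [P^2]_ij
P10 = np.linalg.matrix_power(P, 10)     # n-step probabilities p_ij^(n) = [P^n]_ij

# Every P^n is itself a stochastic matrix: its rows still sum to one.
assert np.allclose(P10.sum(axis=1), 1.0)
```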

2.2 Markov Chain Terminology

Markov chains can be categorised into different types based on the properties of the chain, and for certain types results can be obtained more readily.

2.2.1 Reducibility of the Markov Chain

Reducibility of the Markov chain deals with whether some states may be reached from other states, or whether the chain is joined at all. If states become disjoint through branching, it may be possible to reduce the state space as the chain evolves through time.

Definition 1 The state j is said to be accessible from i, denoted i → j, if there exists some integer k ≥ 0 such that p_ij^(k) > 0. If j is not accessible from i, this is denoted i ↛ j.

Definition 2 Two states i and j are said to communicate, denoted i ↔ j, if both i → j and j → i.

Definition 3 A Markov chain is said to be irreducible if i ↔ j for all i, j ∈ X.

2.2.2 Passage and Recurrence in the Markov Chain

As a Markov chain is typically used to replicate a real-world system, the time between states and whether a state is reached again are usually of interest.


Definition 4 The hitting time from state i to a state j which is accessible from i is the random variable defined as

H_ij = inf{n ≥ 0 : Jk = i, Jk+n = j} .

If i ↛ j, H_ij = ∞.

Definition 5 Let f_ij^(k) be the probability that the process takes exactly k steps to go from state i to state j for the first time, that is,

f_ij^(k) = Pr[H_ij = k] .

The hitting probability is the probability of ever going from state i to state j, that is,

f_ij = ∑_{k=0}^{∞} f_ij^(k) .

If i ↛ j, f_ij = 0.

Definition 6 The expected hitting time is the expected number of steps for the process to go from state i to state j, that is,

u_ij = E[H_ij] = ∑_{k=0}^{∞} k f_ij^(k) .

Definition 7 Let f_i^(k) be the probability that there are exactly k steps between leaving state i and returning to state i, that is,

f_i^(k) = Pr[H_ii = k > 0] .

The return probability is the probability of ever returning to state i, that is,

f_i = ∑_{k=1}^{∞} f_i^(k) .

Definition 8 The first passage time of state i from initial distribution π0 is the random variable defined as

K_i = inf{n ≥ 1 : Jn = i | π0} .

2.2.3 Types of States in a Markov Chain

The easy identification of the type of Markov chain being dealt with allows its properties to be more readily deduced. Certain types of Markov chain states have special properties which give information about the chain as a whole.


Definition 9 A state i is said to be recurrent if the probability of ever returning is one, i.e.,

f_i = ∑_{k=1}^{∞} f_i^(k) = 1 .

Definition 10 A recurrent state i is said to be positive-recurrent if it returns in finite time, i.e.,

m_i = ∑_{k=1}^{∞} k f_i^(k) < ∞ ,

where m_i is called the mean recurrence time. It is said to be null-recurrent if m_i → ∞.

Definition 11 A state i is said to be transient if there exists a non-zero probability of never returning, i.e.,

f_i = ∑_{k=1}^{∞} f_i^(k) < 1 .

Definition 12 A transient state i is said to be ephemeral if once the process has left state i it never returns, i.e.,

f_i = ∑_{k=1}^{∞} f_i^(k) = 0 .

Definition 13 A state i is said to be periodic if for k ≥ 1,

d = gcd{k : p_ii^(k) > 0} > 1 .

The state i is said to have periodicity d. A state i is said to be aperiodic if d = 1.

Definition 14 A state i is said to be ergodic if it is both aperiodic and positive-recurrent.

Definition 15 A state i is said to be absorbing if p_ii = 1.

2.3 Introductory Markov Theory

2.3.1 Markov Probability Distributions

A Markov chain can be used to describe the evolution of a distribution on X over time.

Define π^(0), the distribution on X at time k = 0 (the initial distribution), as a vector with ith element,

π_i^(0) = Pr[J0 = i] .

The distribution on X at time k = 1, π^(1), is then given by a vector with jth element,

π_j^(1) = ∑_{i∈X} Pr[J1 = j|J0 = i] Pr[J0 = i]
        = ∑_{i∈X} Pr[J1 = j|J0 = i] π_i^(0) .

Due to the Markov property, the distribution on X at time k + 1, π^(k+1), is given by a vector with jth element,

π_j^(k+1) = ∑_{i∈X} Pr[Jk+1 = j|Jk = i] π_i^(k) .

If the Markov chain is time-homogeneous the distributions are given by,

π^(n+1) = π^(n) P = π^(n−1) P^2 = · · · = π^(0) P^(n+1) .
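As a minimal sketch of this recursion (illustrative values, not from the thesis), an initial distribution is propagated forward by repeated right-multiplication by P:

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.2, 0.8]])        # an illustrative two-state chain
pi_0 = np.array([1.0, 0.0])       # the process starts in state 0 with certainty

# pi^(n) = pi^(0) P^n: the distribution is a row vector multiplied on the left.
pi_n = pi_0 @ np.linalg.matrix_power(P, 20)
```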

2.3.2 Stationary Distribution

Of particular interest is the invariant or stationary distribution of the Markov chain, which represents a distribution from which the process will never move. The stationary distribution for a Markov chain, if it exists, is given by the global balance equation,

π_j = ∑_{i∈X} π_i Pr[Jk+1 = j|Jk = i] , (2.1)

where π_i is the stationary probability of state i, and

∑_i π_i = 1 .

In the case where the Markov chain is time-homogeneous, Equation (2.1) can be expressed as,

π_j = ∑_i π_i p_ij .

In the case where the state space is finite, the stationary distribution always exists, and a row vector π of length N + 1 can be defined whose ith element is the stationary probability for state i, π_i.


In the case where the Markov chain is finite and time-homogeneous, the vector π can be expressed, equivalently to Equation (2.1), by the matrix equation,

π = πP ,

subject to

π1 = 1 ,

where 1 is a column vector of ones of the appropriate length. In this case, finding the stationary probability vector corresponds to finding the left eigenvector of P corresponding to the eigenvalue equal to 1.

Example: A Rainy Day

The weather can be in one of two states: dry or raining (Gabriel and Neumann, 1962). Assume that it is observed regularly, with a fixed time-step between each observation (e.g., once every 15 minutes). If at the current observation it is dry, it can be raining by the next observation, one time-step later, with probability 0 < α < 1. If the observation is that it is currently raining, it will have stopped raining by the next observation with probability 0 < β < 1. This process can be modelled as a Markov process with probability transition matrix,

P =
[ 1−α   α  ]
[  β   1−β ] .

The stationary distribution π is found by solving π(I2 − P) = 0. This problem can either be transposed to (I2 − P^t)π^t = 0^t and solved using row operations, or it can be solved using column operations. Column operations reduce (I2 − P) to,

[   1    0 ]
[ −β/α   0 ] .

This gives the eigenvector solution π = s(β/α, 1) for all s ∈ R. However, as the probabilities need to be normalised such that π0 + π1 = 1, the final solution is found to be,

π = ( β/(α+β) , α/(α+β) ) .

Therefore, if the process goes on long enough, β/(α + β) will be the proportion of the observations at which it is dry, and α/(α + β) will be the proportion of the observations at which it is raining.
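The same answer can be recovered numerically as the left eigenvector of P for the eigenvalue 1, normalised to sum to one. A sketch assuming numpy, with illustrative values of α and β that are not from the thesis:

```python
import numpy as np

alpha, beta = 0.2, 0.6                   # illustrative values only
P = np.array([[1 - alpha, alpha],
              [beta, 1 - beta]])

# Left eigenvectors of P are right eigenvectors of P transposed.
eigvals, eigvecs = np.linalg.eig(P.T)
v = eigvecs[:, np.argmin(np.abs(eigvals - 1.0))].real
pi = v / v.sum()                         # normalise so that pi_0 + pi_1 = 1

# Agrees with the closed form (beta/(alpha+beta), alpha/(alpha+beta)).
assert np.allclose(pi, [beta / (alpha + beta), alpha / (alpha + beta)])
```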

Example: A Simple Reservoir Model

A simple reservoir model can be constructed using Markov chains by letting i ∈ X represent the discretised storage levels of the reservoir, with i = 0 representing the storage being empty and i = 8 representing the storage being full. Let the probability that the storage increases by one unit of storage in one time-step be 0.3, the probability that it decreases by one unit be 0.2, and the probability that it is unchanged be 0.5, unless the reservoir is full or empty. If the reservoir is empty then the probability it remains empty is 0.2 + 0.5 = 0.7. Likewise, if the reservoir is full the probability that it remains full is 0.5 + 0.3 = 0.8. Therefore the transition probabilities are,

p_ij = 0.2  if j = i − 1, i ≠ 0
p_ij = 0.5  if j = i, i ≠ 0, 8
p_ij = 0.3  if j = i + 1, i ≠ 8
p_ij = 0.7  for i = j = 0
p_ij = 0.8  for i = j = 8
p_ij = 0    otherwise .

The reservoir can then be described by the probability transition matrix,

P =
[ 0.7 0.3  0   0   0   0   0   0   0  ]
[ 0.2 0.5 0.3  0   0   0   0   0   0  ]
[  0  0.2 0.5 0.3  0   0   0   0   0  ]
[  0   0  0.2 0.5 0.3  0   0   0   0  ]
[  0   0   0  0.2 0.5 0.3  0   0   0  ]
[  0   0   0   0  0.2 0.5 0.3  0   0  ]
[  0   0   0   0   0  0.2 0.5 0.3  0  ]
[  0   0   0   0   0   0  0.2 0.5 0.3 ]
[  0   0   0   0   0   0   0  0.2 0.8 ] .

The stationary distribution is once again found by solving (I9 − P^t)π^t = 0^t. This problem can be expressed in augmented matrix form as,

[  0.3 −0.2   0    0    0    0    0    0    0  | 0 ]
[ −0.3  0.5 −0.2   0    0    0    0    0    0  | 0 ]
[   0  −0.3  0.5 −0.2   0    0    0    0    0  | 0 ]
[   0    0  −0.3  0.5 −0.2   0    0    0    0  | 0 ]
[   0    0    0  −0.3  0.5 −0.2   0    0    0  | 0 ]
[   0    0    0    0  −0.3  0.5 −0.2   0    0  | 0 ]
[   0    0    0    0    0  −0.3  0.5 −0.2   0  | 0 ]
[   0    0    0    0    0    0  −0.3  0.5 −0.2 | 0 ]
[   0    0    0    0    0    0    0  −0.3  0.2 | 0 ] .

Due to linear dependence, a unique solution will not be found. Instead the last row is replaced with the condition that ∑_i π_i = 1, yielding,

[  0.3 −0.2   0    0    0    0    0    0    0  | 0 ]
[ −0.3  0.5 −0.2   0    0    0    0    0    0  | 0 ]
[   0  −0.3  0.5 −0.2   0    0    0    0    0  | 0 ]
[   0    0  −0.3  0.5 −0.2   0    0    0    0  | 0 ]
[   0    0    0  −0.3  0.5 −0.2   0    0    0  | 0 ]
[   0    0    0    0  −0.3  0.5 −0.2   0    0  | 0 ]
[   0    0    0    0    0  −0.3  0.5 −0.2   0  | 0 ]
[   0    0    0    0    0    0  −0.3  0.5 −0.2 | 0 ]
[   1    1    1    1    1    1    1    1    1  | 1 ] .

Row reducing this new augmented matrix gives the solution,

π = (0.01335, 0.02003, 0.03005, 0.04507, 0.06760, 0.10140, 0.15210, 0.22816, 0.34224) .

This tells us that over long-term operation the reservoir will be full 34.2% of the time and will be empty 1.3% of the time.
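This row-reduction is straightforward to reproduce numerically: form I − P^t, overwrite the last equation with the normalisation ∑ πi = 1, and solve the resulting linear system. A sketch assuming numpy (not code from the thesis):

```python
import numpy as np

# Tridiagonal transition matrix of the nine-state reservoir model above.
P = np.diag([0.5] * 9) + np.diag([0.3] * 8, 1) + np.diag([0.2] * 8, -1)
P[0, 0], P[8, 8] = 0.7, 0.8

A = np.eye(9) - P.T            # the equations (I - P^t) pi^t = 0^t
b = np.zeros(9)
A[-1, :] = 1.0                 # replace the last equation with sum(pi) = 1
b[-1] = 1.0

pi = np.linalg.solve(A, b)     # pi[8] ~ 0.342 and pi[0] ~ 0.013, as quoted above
```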

Example: M/M/1 Queue with Discrete Time-Steps

The M/M/1 queue is a single server queue with infinite holding capacity (M indicates that the arrival and service distributions are Markovian; queues with general, G, arrivals and departures are discussed later) (Latouche and Ramaswami, 1999). This model is similar to the reservoir model, except that it is not finite. Whilst an infinite storage reservoir is physically impossible, due to the ease with which analytic results can be obtained the Moran Dam Model in Section 5.1 makes use of such a construction, and if the probability of reaching full capacity is small it can be a good approximation.

The probability of one customer arriving in one time-step is α > 0, the probability that one customer is served (given one exists) in one time-step is β > 0, and the probability that there is no change in the number of customers is 1 − α − β ≥ 0. To make notation more intuitive, denote the state with i customers as i ∈ X, including 0 ∈ X, a state from which there are no customers to serve but in which customers arrive with probability α, so that p00 = 1 − α. A system of equations can be written using Equation (2.1) as,

π_i = απ_{i−1} + (1 − α − β)π_i + βπ_{i+1}  for i ≠ 0 ,

π_0 = (1 − α)π_0 + βπ_1 ,

subject to ∑_i π_i = 1.

Consider the boundary condition,

π_0 = (1 − α)π_0 + βπ_1
βπ_1 = απ_0
π_1 = (α/β) π_0 .


Likewise,

π_1 = απ_0 + (1 − α − β)π_1 + βπ_2
βπ_2 = (α + β)π_1 − απ_0
π_2 = ((α + β)/β) π_1 − (α/β) π_0
    = ((α^2 + αβ)/β^2) π_0 − (α/β) π_0
    = (α/β)^2 π_0 .

By induction it can be shown that,

π_i = (α/β)^i π_0 .

Using the fact that the stationary probabilities must sum to one,

π_0 [ ∑_{i=0}^{∞} (α/β)^i ] = 1 ,

it can be determined whether this system has a stationary distribution. If α < β,

∑_{i=0}^{∞} (α/β)^i = 1/(1 − α/β) = β/(β − α) ,

otherwise the sum diverges. Hence,

π_i = ((β − α)/β) (α/β)^i , if and only if α < β,

otherwise the stationary distribution does not exist.
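A quick numerical check of this geometric form (with illustrative α and β that are not from the thesis): when α < β the probabilities decay fast enough that a modest truncation already carries essentially all of the mass.

```python
import numpy as np

alpha, beta = 0.2, 0.5                           # illustrative, with alpha < beta
i = np.arange(50)
pi = (beta - alpha) / beta * (alpha / beta) ** i # stationary probabilities pi_i

# The truncated stationary probabilities sum to one up to a negligible tail.
assert abs(pi.sum() - 1.0) < 1e-12
```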

2.3.3 Ergodic Chains

Consider an irreducible time-homogeneous Markov chain with a state i that is ergodic.

As the chain is irreducible, for any other state j ≠ i ∈ X there exist r and s such that p_ji^(r) > 0 and p_ij^(s) > 0. As i is positive recurrent there exists a k such that p_ii^(k) > 0. As j → i → j represents at least one path by which the process can go from j to j in m = r + k + s time-steps, then

p_jj^(m=r+k+s) ≥ p_ji^(r) p_ii^(k) p_ij^(s) > 0 .

It follows then that the first-return time for state j, H_jj, must satisfy

H_jj ≤ H_ii + r + s ,


as j → i → j represents at least one path by which the process may return to j having left it, and as H_jj is the number of steps on the shortest path, it may be no longer than this. Consider the expected value,

m_j = E[H_jj] ≤ E[H_ii + r + s] = m_i + r + s < ∞ ,

and so all j ≠ i ∈ X are positive-recurrent if i ∈ X is positive recurrent and X is irreducible.

For any aperiodic state it can be shown that,

lim_{n→∞} p_ij^(n) → 1/m_j = π_j ,

where π_j is the stationary distribution. The proof of this is quite involved, requiring the coupling of two Markov chains; it can be found in Norris (1997, Section 1.8).

For any ergodic chain a matrix S can be written such that

lim_{n→∞} P^n → S = 1π ,

a matrix of rank 1 which has every row equal to the stationary distribution. It is sometimes useful to write,

P^n = S + T(n) ,

where

lim_{n→∞} [T(n)]_ij → 0 for all i, j ∈ X.
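This limit is easy to observe numerically: for an ergodic chain, every row of P^n approaches the stationary distribution π, so T(n) = P^n − S shrinks to zero. A sketch (illustrative chain, not from the thesis):

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.2, 0.8]])              # an ergodic (irreducible, aperiodic) chain
S = np.linalg.matrix_power(P, 100)      # numerically approximates lim P^n = 1 pi

# Both rows of S agree, each approximating the stationary distribution pi.
assert np.allclose(S[0], S[1])
```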

The limit of a periodic chain does not exist in the same sense. Instead, consider partitioning the state space of an irreducible Markov chain in which every state has period d as,

X = X_0 ∪ X_1 ∪ · · · ∪ X_{d−1} ,

where,

• p_ij^(r) > 0 only if i ∈ X_0 and j ∈ X_r, and

• p_ij^(nd) > 0 for some sufficiently large n when i, j ∈ X_0.

It can be shown for all i ∈ X_0 and j ∈ X_r that,

lim_{n→∞} p_ij^(dn+r) → d/m_j ,

where r = 0, 1, . . . , d − 1. This shows that the periodic chain exhibits long-term stable behaviour independent of its starting state, even if its limit does not exist in the normal sense, only being defined every d steps.


2.3.4 Hitting Probabilities, Passage Times, and Absorbing Chains

The probability of hitting a state j in a finite number of steps, given that the process starts in state i, is denoted f_ij and is given by Definition 5. It can also be defined as the probability that there exists at least one non-negative integer n such that J_{k+n} = j given that J_k = i, or expressed mathematically as

f_ij = Pr[H_ij < ∞] = Pr[inf{n ≥ 0 : Jk = i, Jk+n = j} < ∞] .

More generally, define the random variable H_i^A, the number of steps taken to reach a collection of states A ⊂ X from state i, as

H_i^A = inf{n ≥ 0 : Jk = i, Jk+n ∈ A} .

Similarly, the probability f_i^A of hitting a collection of states A ⊂ X from i ∉ A is given by,

f_i^A = Pr[H_i^A < ∞] = Pr[inf{n ≥ 0 : Jk = i, Jk+n ∈ A} < ∞] .

If i ∈ A then f_i^A = 1 by definition. If i ∉ A, then n ≥ 1 and the process will have to step to another state ℓ at time k + 1, that is,

Pr[inf{n ≥ 0 : Jk = i, Jk+1 = ℓ, Jk+n ∈ A} < ∞]
= Pr[Jk+1 = ℓ|Jk = i] Pr[inf{n ≥ 1 : Jk+1 = ℓ, Jk+n ∈ A} < ∞] .

From this result it follows that

f_i^A = Pr[inf{n ≥ 0 : Jk = i, Jk+n ∈ A} < ∞]
      = ∑_{ℓ∈X} Pr[inf{n ≥ 0 : Jk = i, Jk+1 = ℓ, Jk+n ∈ A} < ∞]
      = ∑_{ℓ∈X} Pr[Jk+1 = ℓ|Jk = i] Pr[inf{n ≥ 1 : Jk+1 = ℓ, Jk+n ∈ A} < ∞]  for i ∉ A.

If the Markov chain is time-homogeneous then

f_i^A = ∑_{ℓ∈X} p_iℓ f_ℓ^A  for i ∉ A.

The expected number of time-steps to reach a collection of states A ⊂ X from state i, starting at time 0, is defined as,

u_i^A = E[inf{n : J0 = i, Jn ∈ A}] .


If i ∈ A, then u_i^A = 0, otherwise u_i^A ≥ 1. In the latter case the process has to go through one more transition, to a state ℓ at time k = 1, that is,

E[inf{n : J0 = i, J1 = ℓ, Jn ∈ A}] = 1 + E[inf{n : J1 = ℓ, Jn ∈ A}] .

From this result it follows that

u_i^A = E[inf{n : J0 = i, Jn ∈ A}]
      = ∑_{ℓ∈X} Pr[J1 = ℓ|J0 = i] (1 + E[inf{n : J1 = ℓ, Jn ∈ A}])
      = 1 + ∑_{ℓ∈X} Pr[J1 = ℓ|J0 = i] E[inf{n : J1 = ℓ, Jn ∈ A}]  for i ∉ A.

If the Markov chain is time-homogeneous then

u_i^A = 1 + ∑_{ℓ∈X} p_iℓ u_ℓ^A  for i ∉ A.

Both the hitting probability and the expected number of steps for a time-homogeneous Markov chain can be written as sets of linear equations:

f_i^A = 1  for all i ∈ A,
f_i^A = ∑_{ℓ∈X} p_iℓ f_ℓ^A  for all i ∉ A, (2.2)

and

u_i^A = 0  for all i ∈ A,
u_i^A = 1 + ∑_{ℓ∈X} p_iℓ u_ℓ^A  for all i ∉ A. (2.3)

Define f^A to be the vector whose ith element is the probability f_i^A. By construction it is a solution to the system of linear equations given in Equation (2.2).

It is straightforward to show that f^A is the minimal non-negative solution to Equation (2.2). Suppose x is a non-negative solution to Equation (2.2); then for i ∈ A the ith element of x, x_i, equals 1, and for i ∉ A, x_i ≥ 0 and satisfies,

x_i = ∑_{ℓ1∈X} p_iℓ1 x_ℓ1
    = ∑_{ℓ1∈A} p_iℓ1 + ∑_{ℓ1∉A} p_iℓ1 x_ℓ1 . (2.4)


Substituting Equation (2.4) into itself yields,

x_i = ∑_{ℓ1∈A} p_iℓ1 + ∑_{ℓ1∉A} p_iℓ1 [ ∑_{ℓ2∈A} p_ℓ1ℓ2 + ∑_{ℓ2∉A} p_ℓ1ℓ2 x_ℓ2 ]
    = ∑_{ℓ1∈A} p_iℓ1 + ∑_{ℓ1∉A} ∑_{ℓ2∈A} p_iℓ1 p_ℓ1ℓ2 + ∑_{ℓ1∉A} ∑_{ℓ2∉A} p_iℓ1 p_ℓ1ℓ2 x_ℓ2 .

The first term is the probability of going from i into A in one time-step. This can be expressed as,

∑_{ℓ1∈A} p_iℓ1 = Pr[J1 ∈ A|J0 = i ∉ A] = Pr[H_i^A = 1] .

Similarly, the second term can be expressed as,

∑_{ℓ1∉A} ∑_{ℓ2∈A} p_iℓ1 p_ℓ1ℓ2 = Pr[J2 ∈ A|J0 = i ∉ A, J1 ∉ A] = Pr[H_i^A = 2] .

By induction, repeated substitution yields,

x_i = Pr[H_i^A = 1] + · · · + Pr[H_i^A = n] + ∑_{ℓ1∉A} · · · ∑_{ℓn∉A} p_iℓ1 · · · p_ℓn−1ℓn x_ℓn . (2.5)

As x_i ≥ 0 for all i ∈ X, all terms in Equation (2.5) are non-negative and,

x_i ≥ lim_{n→∞} Pr[H_i^A ≤ n] = Pr[H_i^A < ∞] = f_i^A .

Therefore f^A is the minimal non-negative solution of Equation (2.2).

Similarly, let u^A be a vector with ith element u_i^A. By construction it is a solution to the system of linear equations given in Equation (2.3).

Again it can be shown that u^A is the minimal non-negative solution to Equation (2.3). Suppose y is a non-negative solution to Equation (2.3); then for i ∈ A the ith element of y, y_i, equals 0, and for i ∉ A, y_i ≥ 1 and satisfies,

y_i = 1 + ∑_{ℓ1∈X} p_iℓ1 y_ℓ1
    = 1 + ∑_{ℓ1∈A} p_iℓ1 y_ℓ1 + ∑_{ℓ1∉A} p_iℓ1 y_ℓ1
    = 1 + ∑_{ℓ1∉A} p_iℓ1 y_ℓ1 . (2.6)

Substituting Equation (2.6) into itself yields,

y_i = 1 + ∑_{ℓ1∉A} p_iℓ1 ( 1 + ∑_{ℓ2∉A} p_ℓ1ℓ2 y_ℓ2 )
    = 1 + ∑_{ℓ1∉A} p_iℓ1 + ∑_{ℓ1∉A} ∑_{ℓ2∉A} p_iℓ1 p_ℓ1ℓ2 y_ℓ2 .


The first term is the probability that the process does not start in A given that the starting state is not in A, that is,

Pr[J0 ∉ A|J0 = i ∉ A] = 1 ,

or alternatively,

Pr[inf{n : Jn ∈ A, J0 = i ∉ A} > 0] = Pr[H_i^A ≥ 1] .

The second term is the probability that the process still has not entered a state in A at time-step k = 1,

∑_{ℓ1∉A} p_iℓ1 = Pr[H_i^A ≥ 2] .

By induction it can be shown that,

y_i = Pr[H_i^A ≥ 1] + · · · + Pr[H_i^A ≥ n] + ∑_{ℓ1∉A} · · · ∑_{ℓn∉A} p_iℓ1 · · · p_ℓn−1ℓn y_ℓn .

As all terms are non-negative,

y_i ≥ lim_{n→∞} ∑_{k=1}^{n} Pr[H_i^A ≥ k] = ∑_{k=1}^{∞} Pr[H_i^A ≥ k] = ∑_{k=1}^{∞} k Pr[H_i^A = k] = u_i^A .

Therefore the vector u^A is the minimal non-negative solution for Equation (2.3).

Example: A Simple Reservoir Model

Revisiting the simple reservoir model from Section 2.3.2, the system will now be considered a failure if it goes empty. To model this new system, if the reservoir goes empty it will remain there, signalling the failure of the system. The transition probabilities are now,

p_ij = 0.2  if j = i − 1, i ≠ 0
p_ij = 0.5  if j = i, i ≠ 0, 8
p_ij = 0.3  if j = i + 1, i ≠ 0, 8
p_ij = 1    for i = j = 0
p_ij = 0.8  for i = j = 8
p_ij = 0    otherwise .

The transition probability matrix is then,

P =
[  1   0   0   0   0   0   0   0   0  ]
[ 0.2 0.5 0.3  0   0   0   0   0   0  ]
[  0  0.2 0.5 0.3  0   0   0   0   0  ]
[  0   0  0.2 0.5 0.3  0   0   0   0  ]
[  0   0   0  0.2 0.5 0.3  0   0   0  ]
[  0   0   0   0  0.2 0.5 0.3  0   0  ]
[  0   0   0   0   0  0.2 0.5 0.3  0  ]
[  0   0   0   0   0   0  0.2 0.5 0.3 ]
[  0   0   0   0   0   0   0  0.2 0.8 ] ,


which has been partitioned into four submatrices: the self-transition of the absorbing state (top-left); a matrix of zeros (top-right); the transitions into the absorbing state (bottom-left); and the transitions between the transient states (bottom-right). The partitions are labelled as follows,

P =
[ 1  0 ]
[ A  T ] ,

so that T is an 8 × 8 matrix of transient transition probabilities, A is an 8 × 1 matrix of absorbing probabilities, 0 is a 1 × 8 matrix of zeros, and 1 is the self-transition of the absorbing state, the reservoir being empty. It can be shown, for any matrix partitioned in this way, that

P^n =
[ 1                     0   ]
[ ∑_{i=0}^{n−1} T^i A   T^n ] ,

and that in the limit,

lim_{n→∞} P^n =
[ 1                0 ]
[ (I − T)^{−1} A   0 ] .

Let u be the vector whose ith element gives the expected number of steps from state i ∉ A to one of the absorbing states in A. Therefore

u_i = 1 + ∑_j p_ij u_j ,

or in vector form,

u = 1 + Tu
u − Tu = 1
(I − T)u = 1
u = (I − T)^{−1} 1 .

For the reservoir model the results are,

u = (246.29, 407.15, 511.05, 576.99, 617.62, 641.37, 653.87, 658.87)^t .

Hence, if for example the reservoir is initially full, it is expected to last 658.87 time-steps until it goes empty.
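A sketch of this computation (assuming numpy; not code from the thesis), where T is the transient block of the partitioned matrix above:

```python
import numpy as np

# Transient block T over states 1..8 of the failure model (state 0 is absorbing).
T = np.diag([0.5] * 8) + np.diag([0.3] * 7, 1) + np.diag([0.2] * 7, -1)
T[7, 7] = 0.8    # a full reservoir stays full with probability 0.8

# u = (I - T)^{-1} 1, solved as a linear system rather than by explicit inversion.
u = np.linalg.solve(np.eye(8) - T, np.ones(8))
# u[-1] should reproduce the 658.87 steps quoted for an initially full reservoir.
```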

To find the probability that the reservoir goes full before it goes empty from any state i = 1, . . . , 7, consider a vectorised version of Equation (2.2),

f = Pf
f − Pf = 0
(I9 − P)f = 0 ,


where f is a column vector whose ith element is the probability of going from state i to state 8 (full) without reaching state 0 (empty). However, this equation only holds for i ≠ 0, 8. To fix this, only the rows i = 1, . . . , 7 will be included in the augmented matrix. In place of the other two rows, the equations f_8 = 1 (from the definition) and f_0 = 0 (to represent the inaccessibility of the full state from the absorbing empty state) will be added. This yields the augmented matrix equation,

[   1    0    0    0    0    0    0    0    0  | 0 ]
[ −0.2  0.5 −0.3   0    0    0    0    0    0  | 0 ]
[   0  −0.2  0.5 −0.3   0    0    0    0    0  | 0 ]
[   0    0  −0.2  0.5 −0.3   0    0    0    0  | 0 ]
[   0    0    0  −0.2  0.5 −0.3   0    0    0  | 0 ]
[   0    0    0    0  −0.2  0.5 −0.3   0    0  | 0 ]
[   0    0    0    0    0  −0.2  0.5 −0.3   0  | 0 ]
[   0    0    0    0    0    0  −0.2  0.5 −0.3 | 0 ]
[   0    0    0    0    0    0    0    0    1  | 1 ] .

Row reducing now yields,

f = (0, 0.34687, 0.57811, 0.73228, 0.83505, 0.90357, 0.94925, 0.97970, 1)^t ,

which gives the probability of the reservoir becoming full before it goes empty, starting in each state of the system.
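The same values can be reproduced by assembling and solving the boundary-value system directly; a sketch assuming numpy (not code from the thesis):

```python
import numpy as np

# Transition matrix of the failure model: state 0 absorbing, state 8 reflecting.
P = np.diag([0.5] * 9) + np.diag([0.3] * 8, 1) + np.diag([0.2] * 8, -1)
P[0, :] = 0.0
P[0, 0] = 1.0
P[8, 8] = 0.8

A = np.eye(9) - P                # interior equations (I - P) f = 0 for i = 1..7
b = np.zeros(9)
A[0, :] = 0.0; A[0, 0] = 1.0     # row 0 replaced by the equation f_0 = 0
A[8, :] = 0.0; A[8, 8] = 1.0     # row 8 replaced by the equation f_8 = 1
b[8] = 1.0

f = np.linalg.solve(A, b)        # f[1] ~ 0.34687, ..., f[7] ~ 0.97970
```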

Example: The Gambler’s Ruin

This model is similar to the M/M/1 server queue; however, now the state 0 is absorbing. The state i ∈ X represents the amount of money a gambler has. Each time, he bets one monetary unit and wins one additional unit with probability α > 0, loses the bet unit with probability β > 0, or pushes (neither wins nor loses and has his bet returned) with probability 1 − α − β ≥ 0. Once the gambler runs out of money he is no longer able to place a bet and is in ruin (state i = 0).

If the gambler starts the game with C amount of money, how long until he goes broke?

There are two equations describing the amount of time until the gambler goes broke:

u_C = 1 + αu_{C+1} + βu_{C−1} + (1 − α − β)u_C ,

and the boundary condition,

u_0 = 0 ⇒ u_1 = 1 + αu_2 + (1 − α − β)u_1 .


First consider the homogeneous equation,

αu_{C+1} − (α + β)u_C + βu_{C−1} = 0 .

Try the solution u_C = m^C:

αm^{C+1} − (α + β)m^C + βm^{C−1} = 0
αm^2 − (α + β)m + β = 0
⇒ m = [(α + β) ± √((α + β)^2 − 4αβ)] / (2α) = β/α or 1 .

So the general solution for the homogeneous case is,

u_C = A_h + B_h (β/α)^C .

Applying the boundary condition gives,

u_C = A_h (1 − (β/α)^C) .

Try u_C = A_p C + B_p as a particular solution for the non-homogeneous case. The boundary condition gives B_p = 0, and as a result,

−1 = αA_p(C + 1) − (α + β)A_p C + βA_p(C − 1)
   = A_p(α − β)
⇒ A_p = −1/(α − β) = 1/(β − α) .

This results in the general solution,

u_C = A_h (1 − (β/α)^C) + C/(β − α) . (2.7)

Now consider the case where α = β. In this case the first term of Equation (2.7) will be 0 and the second term undefined, in which case the minimal non-negative solution will be u_C = ∞.

In the case where α > β, the first term is always non-negative for A_h ≥ 0; however, the second term is negative for C > 0. In the limit as C → ∞, the first term tends to A_h and the second term tends to −∞, so there is no finite value of A_h for which the expression will always be non-negative. As a result the minimal non-negative solution will be u_C = ∞.

In the case where α < β, the first term will always be non-negative for all A_h ≤ 0 and the second term is non-negative for all C. Therefore a non-negative solution exists for all C ∈ Z. The minimal non-negative solution will occur when A_h = 0, and so the solution is,

u_C = C/(β − α) , for α < β.

Assume that the gambler starts with C > 0. What is the probability that he will eventually have C′ > C, given that β > α?

There are three equations that describe the situation,

f_C^{C′} = αf_{C+1}^{C′} + (1 − α − β)f_C^{C′} + βf_{C−1}^{C′} ,

f_0^{C′} = 0 ,

and

f_{C′}^{C′} = 1 .

The first equation is identical to the homogeneous equation above, so,

f^{C′}_C = A + B (β/α)^C .

Applying the boundary condition that f^{C′}_0 = 0 gives,

f^{C′}_C = A (1 − (β/α)^C)

and applying the other boundary condition gives that,

A (1 − (β/α)^{C′}) = 1

⇒ A = 1 / (1 − (β/α)^{C′}) .

As a result,

f^{C′}_C = ((β/α)^C − 1) / ((β/α)^{C′} − 1) .
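Both closed-form results are easy to check numerically. The following Python sketch wraps them as functions; the parameter values in the usage lines are assumed purely for illustration:

def expected_time_to_ruin(C, alpha, beta):
    # u_C = C / (beta - alpha), valid for alpha < beta
    assert alpha < beta
    return C / (beta - alpha)

def prob_reach_target(C, C_target, alpha, beta):
    # f^{C'}_C = ((beta/alpha)^C - 1) / ((beta/alpha)^{C'} - 1)
    r = beta / alpha
    return (r**C - 1.0) / (r**C_target - 1.0)

print(expected_time_to_ruin(10, 0.3, 0.5))    # 50.0 bets on average
print(prob_reach_target(10, 20, 0.3, 0.5))    # ~0.006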


Chapter 3

Discrete Time Markov Chains with Hidden States

This chapter presents more advanced Markov chain constructions: the discrete phase-type distribution, a probability distribution defined by an absorbing Markov chain; matrix analytic methods, a way of generalising Markov chains so that they have a memory process greater than one time-step; and the hidden Markov model, a Markov chain model with similarities to matrix analytic methods. As part of the discussion on the hidden Markov model, the Baum-Welch algorithm for fitting hidden Markov models to data is discussed. A novel method for fitting discrete phase-type distributions and discrete batch Markovian arrival processes to data using an adaptation of the Baum-Welch algorithm is then presented.

3.1 Discrete Phase-Type Distribution

A non-negative discrete random variable can be modelled by the number of steps until absorption in some Markov chain with a specified initial distribution. This is a very general construction known as a discrete phase-type distribution.

[Figure 3.1: State diagram of the discrete phase-type distribution, with absorbing state 0 (p_00 = 1) and transient states 1, 2, . . . , m.]

Consider a Markov chain {J_k}_{k∈Z^+} on a state space X = {0, 1, 2, . . . , m}, where state 0 is an absorbing state and states 1, . . . , m, for m ∈ Z^+, are transient states. The Markov chain can be represented by the state diagram given in Figure 3.1 and described using the following probability transition matrix

P = ( 1   0 )
    ( t   T ) ,

where the m × m matrix T has entries [T]_ij ≥ 0 and t_i ≥ 0 for i, j ∈ X\{0}, such that t + T1 = 1. The number of steps to absorption K is distributed by the discrete phase-type distribution, DPH(τ, T), where τ is an initial probability distribution such that τ_i = Pr[J_0 = i | i ≠ 0], τ_0 = Pr[J_0 = 0], and τ_0 + τ1 = 1. The k-step transition matrix,

P^k = (    1       0  )
      ( 1 − T^k 1  T^k )

yields,

Pr[K ≤ k] = Pr[J_k = 0] = 1 − τT^k 1 , for all k ∈ Z^+.   (3.1)

By differencing Equation (3.1), it can be shown that,

Pr[K = k] = Pr[J_k = 0, J_{k−1} ≠ 0] = τT^{k−1}t .
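These formulas translate directly into code. A minimal Python (NumPy) sketch follows, with an assumed two-phase example (the values of τ and T are illustrative only):

import numpy as np

tau = np.array([0.6, 0.4])          # initial distribution over transient states
T = np.array([[0.5, 0.2],
              [0.1, 0.6]])          # transient-to-transient transitions
t = 1.0 - T.sum(axis=1)             # absorption probabilities, t + T1 = 1

def dph_pmf(k):
    # Pr[K = k] = tau T^{k-1} t, for k >= 1
    return tau @ np.linalg.matrix_power(T, k - 1) @ t

def dph_cdf(k):
    # Pr[K <= k] = 1 - tau T^k 1
    return 1.0 - tau @ np.linalg.matrix_power(T, k) @ np.ones(2)

print(sum(dph_pmf(k) for k in range(1, 200)))   # ~1, since tau_0 = 0 here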

3.1.1 Moments of the Discrete Phase-Type Distribution

The probability generating function of the discrete phase-type distribution is,

G(z) = Σ_{k≥0} z^k Pr[K = k]
     = τ_0 + Σ_{k≥1} z^k τT^{k−1}t
     = τ_0 + z Σ_{k≥1} τ(zT)^{k−1}t
     = τ_0 + zτ(I_m − zT)^{−1}t .

Differentiating with respect to z, n times, and setting z = 1 yields the factorial moments,

E[K(K − 1) . . . (K − n + 1)] = n! τ(I_m − T)^{−n} T^{n−1} 1 .


3.1.2 Properties of the Discrete Phase-Type Family

The family of discrete phase-type distributions has several properties which allow the DPH family to lend itself easily to many applications. The sum of any two independent DPH distributed random variables is itself DPH distributed (generalising the fact that the sum of independent geometric random variables has a negative binomial, and hence DPH, distribution), and any convex mixture of two DPH distributed random variables is also DPH distributed. In addition, any discrete distribution on the non-negative integers can be arbitrarily well approximated by a discrete phase-type distribution.

There is one important operation that needs to be defined, and that is the Kronecker product ⊗. For two matrices A and B the Kronecker product is defined as,

A ⊗ B = ( a_11 B   a_12 B   . . . )
        ( a_21 B   a_22 B   . . . )
        (    ⋮        ⋮      ⋱   ) .

Closure Properties

The following theorems give the closure properties of the DPH family. They are stated without proof, as the proofs can be found in Latouche and Ramaswami (1999, Section 2.6).

Theorem 1 Let X ∼ DPH(τ, T) and Y ∼ DPH(σ, S) be two independent random variables. Then the distribution of Z = X + Y is given by DPH(α, A), where

α = ( τ   τ_0 σ )

and

A = ( T   t · σ )
    ( 0     S   ) .

Theorem 2 Let X ∼ DPH(τ, T) and Y ∼ DPH(σ, S) be two independent random variables. Then the distribution of the mixture Z, which takes the value of X with probability p and the value of Y with probability 1 − p, 0 ≤ p ≤ 1, is given by DPH(α, A), where

α = ( pτ   (1 − p)σ )

and

A = ( T   0 )
    ( 0   S ) .

Theorem 3 Let X ∼ DPH(τ, T) and Y ∼ DPH(σ, S) be two independent random variables. Then the distribution of Z = min(X, Y) is given by DPH(α, A), where

α = τ ⊗ σ

and

A = T ⊗ S .

Theorem 4 The family of discrete phase-type distributions is dense in the set of all discrete probability distributions supported on the natural numbers (N).
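Theorem 1 is easily verified numerically. The sketch below (Python/NumPy, with two assumed geometric examples) builds (α, A) as in the theorem and compares the resulting pmf against the direct convolution of the two distributions:

import numpy as np

tau, T = np.array([1.0]), np.array([[0.7]])       # geometric with p = 0.3
sigma, S = np.array([1.0]), np.array([[0.4]])     # geometric with p = 0.6
t, s = 1 - T.sum(1), 1 - S.sum(1)
tau0 = 1 - tau.sum()                              # mass at zero, here 0

alpha = np.concatenate([tau, tau0 * sigma])       # Theorem 1
A = np.block([[T, np.outer(t, sigma)],
              [np.zeros((1, 1)), S]])
a = 1 - A.sum(1)

def pmf(v, M, m, k):
    # Pr[K = k] = v M^{k-1} m
    return v @ np.linalg.matrix_power(M, k - 1) @ m

for k in (2, 3, 6):
    conv = sum(pmf(tau, T, t, i) * pmf(sigma, S, s, k - i) for i in range(1, k))
    print(k, pmf(alpha, A, a, k), conv)           # the two columns agree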


3.2 Matrix Analytic Methods

Matrix analytic methods (MAM) were first formalised by Neuts (1981) as an algorithmic approach to solving many stochastic modelling problems. The approach generalises the basic Markov chain model by considering the states of the chain in hierarchical blocks, usually referred to as levels, with the states within a level called the phases of the process. The simplest MAM model is one where each level has only one phase; in this case it is the standard Markov chain model.

3.2.1 General MAM Models

The general matrix analytic method model is a doubly stochastic process {(N_k, J_{N_k}) : k ∈ Z^+}, where N_k is the level of the process at time k, and J_{N_k} is the phase of the process at time k, which can depend on the level N_k at time k. Associated with each level N_k is a collection of states known as phases. The time between changes of the level N_k has some type of DPH distribution, and changes of phase within a level correspond to phase changes within that DPH distribution; hence the phase process J_k is N_k-dependent.

These models can be represented in the form of probability matrices with a block structure, by allowing the rows and columns of blocks to represent the level of the system and the rows and columns within each block the phases of the Markov chain associated with each level.

Discrete Phase-type Renewal Process

A simple example of a matrix analytic model is the phase-type renewal process. A renewal process is a process in which the times between events are independent and identically distributed. The simplest example of a renewal process is the geometric process, the counting of the number of successes in a sequence of Bernoulli trials; for example, counting the number of times heads shows up in successive coin flips. Let p be the probability of a success, then the geometric process is described by the infinite Markov chain with probability transition matrix,

P = ( 1−p   p    0    0   · · · )
    (  0   1−p   p    0   · · · )
    (  0    0   1−p   p   · · · )
    (  ⋮             ⋱    ⋱    ) .

In a phase-type renewal process, N_k counts the number of times a phase-type variable, DPH(τ, T), enters the absorbing state, and J_k represents the phase of a phase-type distribution at time k, on X\{0}. This process is represented by the infinite probability transition matrix,

P = ( A_0  A_1   0    0   · · · )
    (  0   A_0  A_1   0   · · · )
    (  0    0   A_0  A_1  · · · )
    (  ⋮              ⋱   ⋱    ) ,   (3.2)

where A_0 = T and A_1 = tτ, an m × m matrix. Immediately after each renewal, the discrete phase-type distribution recommences with distribution τ.

This matrix contains sub-matrices which describe the DPH distribution for the times between renewals. A renewal will occur when there is a transition in the sub-matrix tτ; these will be referred to as marked transitions. When a marked transition occurs, N_k will go from i ∈ N to i + 1. The transitions within the sub-matrix T that do not correspond to a renewal will be referred to as unmarked transitions. This construction allows the renewal times to have a distribution other than the geometric distribution. Note that a geometric distribution is the simplest form of a DPH distribution, DPH(1, 1 − p).

Discrete Markovian Arrival Process

In a more general setting, the marked transitions do not have to correspond to renewal points, and are applicable when arrivals are not renewals but just special transitions. This can be modelled using the discrete Markovian arrival process, where N_k counts the number of arrivals and J_k tracks the state of an underlying Markov chain. The discrete time Markov chain is described by a finite state probability transition matrix A, which can be split into two matrices A_0 and A_1 such that A = A_0 + A_1. The matrix A_1 contains the transitions in the Markov chain that correspond to an arrival, and the matrix A_0 contains the other transitions of the chain. The only restriction on the form of the matrices A_0 and A_1 is that 0 ≤ [A_n]_ij ≤ 1 for n ∈ {0, 1}, such that (A_0 + A_1)1 = 1. The matrix A_1 is not in general rank 1 or of the form tτ, except in the phase-type renewal case; however its rows give the starting distribution of the phase-type distribution that describes the next inter-arrival time. Here the times between arrivals will not necessarily be independent or identically distributed, but will still have some sort of phase-type distribution from the family of phase-type distributions described by the matrix A_0 and the state of the underlying Markov chain in which the last arrival occurred. The arrival process can be expressed in the same form as the infinite probability transition matrix shown in Equation (3.2), except A_1 is no longer restricted to be of rank 1.

Discrete Batch Markovian Arrival Process

The discrete Markovian arrival process (DMAP) is a special case of the discrete batch Markovian arrival process (DBMAP), in which the stochastic process N_k is allowed to increase by any finite positive integer. The process J_k is described by the finite state probability transition matrix A = Σ_{ℓ∈Z^+} A_ℓ, where the A_{ℓ>0} are the transitions marking an arrival of size ℓ > 0 and A_0 is the same as before. The special case where max{ℓ} = 1 is the Markovian arrival process. The restrictions on the elements of the A_ℓ are that 0 ≤ [A_ℓ]_ij ≤ 1, such that A1 = 1. The arrival process can be expressed in terms of the infinite probability transition matrix,

P = ( A_0  A_1  A_2  A_3  A_4  · · · )
    (  0   A_0  A_1  A_2  A_3  · · · )
    (  0    0   A_0  A_1  A_2  · · · )
    (  ⋮                  ⋱    ⋱    ) .   (3.3)

Discrete Quasi-Birth and Death Process

Given that the level N_k can increase as discussed, it can also be allowed to decrease. A discrete quasi-birth and death process (QBD) is another special case of a matrix analytic model where N_k is only allowed to move, skip-free, up or down one level. That is, any transition will cause N_k to increase by one, decrease by one, or remain unchanged. The process {J_{N_k}} is described by the probability transition matrix D_1 = B_1 + A_0 + A_1 when N_k > 0 and D_0 = B_0 + A_1 when N_k = 0, with the restrictions that 0 ≤ [A_n]_ij, [B_n]_ij ≤ 1 for n ∈ {0, 1}, such that D_0 1 = D_1 1 = 1. The homogeneous quasi-birth and death process can be expressed in terms of the infinite block tri-diagonal probability transition matrix,

P = ( B_0  A_1   0    0   · · · )
    ( B_1  A_0  A_1   0   · · · )
    (  0   B_1  A_0  A_1  · · · )
    (  ⋮              ⋱   ⋱    ) .

Latouche and Ramaswami (1999, Chap. 13) showed that single server queues with a general inter-arrival time distribution and a Markovian distributed server, GI/M/1 queues, or a Markovian inter-arrival time distribution and a generally distributed server, M/G/1 queues, can be encompassed in the QBD structure. All of the previous structures can be allowed to have level dependence and can therefore be described as inhomogeneous. For example, the inhomogeneous quasi-birth and death process,

P = ( B_0    A^0_1    0      0     · · · )
    ( B^1_1  A^1_0   A^1_1   0     · · · )
    (  0     B^2_1   A^2_0  A^2_1  · · · )
    (  ⋮                     ⋱     ⋱    ) ,

where the superscript denotes the level.


3.3 Hidden Markov Model

A hidden Markov model is a doubly stochastic process {(Y_k, J_k) : k ∈ Z^+}, where Y_k is an observable process whose distribution is dependent on the state of some underlying Markov process J_k. Only the values of Y_k are observed, and these are referred to as emissions or observations in the HMM literature. The states of J_k are hidden or unobserved and represent the state of the underlying Markov chain at time k.

A hidden Markov model can be described by five quantities:

• X the state space of the hidden Markov process Jk,

• O the set of observations for the observable process Yk,

• P the Markov probability transition matrix P : X → X,

• Q the probability transition matrix Q : X → O,

• π the initial distribution of J0 on X.

In this thesis X and O will both be discrete and finite, as will time k, although it should be noted that they do not have to be. For a more formal and general discussion of hidden Markov models see Cappé et al. (2005).

Definitions, Notation, and Assumptions

{J_k : k ∈ Z^+} is a hidden Markov process on the states X = {1, . . . , N}, where N is the number of elements in X. {Y_k : k ∈ Z^+} is an observed process on states O = {1, . . . , M}, where M is the number of possible observations.

Let Y_0, Y_1, . . . , Y_K, where Y_k ∈ O, be the sequence of observations from time k = 0 up to time k = K, and let J_0, J_1, . . . , J_K, where J_k ∈ X, be a sequence of possible hidden states for the observations.

Note: As J_k is unobserved it is by no means unique for each Y_k, but some sequences J_0, J_1, . . . , J_K are more likely than others.

To prevent odious notation, let,

Y = {Y0, Y1, . . . , YK}

be an observed sequence and assume that,

J = {J0, J1, . . . , JK}

is the hidden sequence corresponding to Y. The partial sequences of observations will be denoted by,

\vec{Y}_k = {Y_0, Y_1, . . . , Y_k}

and

\vec{Y}^k = {Y_k, Y_{k+1}, . . . , Y_K} .

Likewise the partial sequences of hidden states will be denoted by,

\vec{J}_k = {J_0, J_1, . . . , J_k}

and

\vec{J}^k = {J_k, J_{k+1}, . . . , J_K} .

There are two independence assumptions for hidden Markov processes. It is assumed that the kth hidden random variable, given the (k − 1)th hidden random variable, is independent of all other previous hidden random variables,

Pr(J_k | \vec{J}_{k−1}) = Pr(J_k | J_{k−1}) ,

and hence the process is Markovian. For the observations, it is assumed that the kth observation, given the kth hidden random variable, is independent of all other observations and hidden random variables, i.e.,

Pr(Y_k | J, \vec{Y}_{k−1}, \vec{Y}^{k+1}) = Pr(Y_k | J_k) .

It is also assumed that the hidden Markov process is time-homogeneous, such that the N × N stochastic matrix P describes the transitions of the process {J_k : k ∈ Z^+}, which has elements,

[P]_ij = Pr[J_k = j | J_{k−1} = i] = p_ij , ∀i, j ∈ X, ∀k ∈ Z^+.

Let π be the initial distribution across X, a vector of length N with elements,

π_i = Pr[J_0 = i] , ∀i ∈ X .

It is also assumed that the observation probabilities are time-homogeneous for each state. As a result an N × M stochastic probability matrix Q describes the probability of each observation in O given a state in X, which has elements,

[Q]_ij = Pr[Y_k = j | J_k = i] = q_ij , ∀i ∈ X, ∀j ∈ O, ∀k ∈ Z^+.

Example

For example, consider a Markov chain with two hidden states, clement and inclement: X = {clement, inclement}; and three possible observations, sunny, overcast, and rain: O = {sunny, overcast, rain}. The transition probabilities between the states are given in Table 3.1 and the observation probabilities are given in Table 3.2.

Table 3.1: Transition probability matrix P for the transitions between hidden states.

              clement   inclement
clement         0.8        0.2
inclement       0.4        0.6

Table 3.2: Observation probability matrix Q for the observations given the hidden states.

              sunny   overcast   rain
clement        0.70     0.25     0.05
inclement      0.15     0.20     0.65

The process can be represented in diagrammatic form such as in Figure 3.2, with the coloured lines representing the observations from each state and the black lines the transitions between the hidden states.

An HMM also requires a starting distribution for the hidden process; in this example let π = (0.3, 0.7).

[Figure 3.2: A diagram of a hidden Markov model.]

An example of a sample path of the sequence {Y_k} (simulated from this model) is a run of 20 observations, each one of sunny, overcast, or rain.

An observer sees the sequence of sunny, overcast, and rain observations, not the underlying hidden states, represented in Figure 3.2 by the colouring. In general, at time k the observations Y_0, Y_1, . . . , Y_k ∈ O are known; however the hidden states that produced these observations, J_0, J_1, . . . , J_k ∈ X, are unknown.
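Such a sample path can be generated in a few lines of Python (NumPy). The sketch below uses the P, Q and π of this example; the seed is arbitrary:

import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.8, 0.2], [0.4, 0.6]])     # clement = 0, inclement = 1
Q = np.array([[0.70, 0.25, 0.05],          # sunny, overcast, rain
              [0.15, 0.20, 0.65]])
pi = np.array([0.3, 0.7])
labels = np.array(["sunny", "overcast", "rain"])

j = rng.choice(2, p=pi)                    # initial hidden state
path = []
for _ in range(20):
    path.append(labels[rng.choice(3, p=Q[j])])   # emit an observation
    j = rng.choice(2, p=P[j])                    # hidden transition
print(", ".join(path))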

3.3.1 Relation to Matrix Analytic Methods

Hidden Markov models (HMM), in the fully discrete case, can be viewed as a special case of the matrix analytic models described in Section 3.2, although they were developed independently. A hidden Markov model is a more structured batch Markovian arrival process if the observations are Y_k ∈ Z^+ and the process J_k is stationary. This gives a matrix analytic model that has a reduced number of parameters, due to the phases being homogeneous in both level and arrival size. The counting process N_k, which counts the number of arrivals, is then described by the probability transition matrix shown in Equation (3.3), where,

A_n = ( p_11 q_1n   . . .   p_1N q_1n )
      (    ⋮          ⋱        ⋮     )
      ( p_N1 q_Nn   . . .   p_NN q_Nn ) .

3.3.2 Fitting Hidden Markov Models: The Baum-Welch Algorithm

Rabiner (1989) considers the problem of fitting hidden Markov models as a three part problem. Given the clarity with which that author explained the topic, the same general structure will be followed here.

Given Y and model parameters P, Q, π, calculate Pr[Y | P, Q, π]

Denote by q_{J_k Y_k} the probability that the kth observation of Y was Y_k ∈ O given that the kth hidden state of J was J_k ∈ X. Given P, Q and π, the probability that Y was observed given that the sequence J occurred is

Pr[Y | J, P, Q, π] = ∏_{k=0}^{K} Pr[Y_k | J_k, P, Q, π] = q_{J_0 Y_0} q_{J_1 Y_1} . . . q_{J_K Y_K} ,

by the assumption of independence of observations. The probability that the sequence J occurred given P, Q, and π is

Pr[J | P, Q, π] = Pr[J_0 | P, Q, π] ∏_{k=1}^{K} Pr[J_k | J_{k−1}, P, Q, π] = π_{J_0} p_{J_0 J_1} . . . p_{J_{K−1} J_K} .

By conditional probability, the probability that both Y and J occurred given P, Q, and π is then,

Pr[Y, J | P, Q, π] = Pr[Y | J, P, Q, π] Pr[J | P, Q, π] .   (3.4)

Summing Equation (3.4) over all possible sequences J will yield Pr[Y | P, Q, π]. However, there are N^{K+1} possible sequences for J, and as a result such a calculation would be impractical.

Leonard Baum developed two procedures for calculating Pr[Y | P, Q, π] without the need for extensive computation (Baum and Eagon 1967; Baum and Sell 1968), known as the Forward Procedure and the Backward Procedure.


Forward Procedure: Define,

α_k(i) = Pr[\vec{Y}_k, J_k = i | P, Q, π] ,

the probability of being in hidden state i ∈ X at time k and observing the sequence \vec{Y}_k, given P, Q, and π.

The probability that the hidden process is in state i at time 0 is π_i, therefore the probability that observation Y_0 occurred given that the first state was i is given by

α_0(i) = π_i q_{i Y_0} .

The probability that the hidden state is j at time k, given the observations \vec{Y}_{k−1}, is given by the probability that the hidden process is in state i at time k − 1, given the observations \vec{Y}_{k−1}, multiplied by the probability of a transition from state i to state j, summed over all the possible states i ∈ X; expressed mathematically as,

Pr[\vec{Y}_{k−1}, J_k = j | P, Q, π] = Σ_{i=1}^{N} α_{k−1}(i) p_ij .

Therefore the probability that \vec{Y}_k is observed and the process is in the hidden state j at time k is given by

α_k(j) = [ Σ_{i=1}^{N} α_{k−1}(i) p_ij ] q_{j Y_k} ,

for all j = 1, . . . , N and all k = 1, . . . , K, for some final time K given by the time of the last observation.

The desired result can be found by summing over all i ∈ X at the final time K,

Pr[Y | P, Q, π] = Σ_{i=1}^{N} α_K(i) .
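The forward procedure translates directly into a short Python (NumPy) sketch; Y is assumed to be a sequence of observation indices into the columns of Q:

import numpy as np

def forward(Y, P, Q, pi):
    # alpha[k, i] = Pr[Y_0, ..., Y_k, J_k = i | P, Q, pi]
    K, N = len(Y), len(pi)
    alpha = np.zeros((K, N))
    alpha[0] = pi * Q[:, Y[0]]                       # alpha_0(i) = pi_i q_{i Y_0}
    for k in range(1, K):
        alpha[k] = (alpha[k - 1] @ P) * Q[:, Y[k]]   # the recursion above
    return alpha

# Pr[Y | P, Q, pi] = forward(Y, P, Q, pi)[-1].sum()

In practice each α_k is rescaled to sum to one as it is computed, to avoid numerical underflow on long sequences; the logarithms of the scale factors then accumulate to the log-likelihood.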

Backward Procedure: Define,

β_k(i) = Pr[\vec{Y}^{k+1} | J_k = i, P, Q, π] ,

the probability of observing the sequence \vec{Y}^{k+1} given the hidden process is in the hidden state i ∈ X at time k, and P, Q, and π.

Setting,

β_K(j) = 1 , for all j = 1, . . . , N ,

then for all i = 1, . . . , N,

β_{k−1}(i) = Σ_{j=1}^{N} p_ij q_{j Y_k} β_k(j) ,

for all k = K, K − 1, . . . , 1.


The desired result can be found by,

Pr[Y | P, Q, π] = Σ_{i=1}^{N} π_i q_{i Y_0} β_0(i) .

Note: It can also be written that,

Pr[Y | P, Q, π] = Σ_{i=1}^{N} Pr[Y, J_k = i | P, Q, π]
                = Σ_{i=1}^{N} Pr[\vec{Y}_k, J_k = i | P, Q, π] Pr[\vec{Y}^{k+1} | J_k = i, P, Q, π]
                = Σ_{i=1}^{N} α_k(i) β_k(i) ,

for all k = 0, 1, . . . , K.
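The backward procedure admits an equally short sketch, and together with forward() above provides the consistency check just noted:

import numpy as np

def backward(Y, P, Q, pi):
    # beta[k, i] = Pr[Y_{k+1}, ..., Y_K | J_k = i, P, Q, pi]
    K, N = len(Y), len(pi)
    beta = np.ones((K, N))                            # beta_K(j) = 1
    for k in range(K - 1, 0, -1):
        beta[k - 1] = P @ (Q[:, Y[k]] * beta[k])      # the recursion above
    return beta

# For every k, (forward(Y, P, Q, pi)[k] * backward(Y, P, Q, pi)[k]).sum()
# gives the same value, Pr[Y | P, Q, pi].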

Given Y and model parameters P, Q, π find the "most likely" J

Finding the "most likely" state J_k at time k given Y, P, Q, and π is relatively straightforward,

Pr[J_k = i | Y, P, Q, π] = Pr[Y, J_k = i | P, Q, π] / Pr[Y | P, Q, π]
                         = Pr[\vec{Y}_k, J_k = i | P, Q, π] Pr[\vec{Y}^{k+1} | J_k = i, P, Q, π] / Pr[Y | P, Q, π]
                         = α_k(i) β_k(i) / Σ_{i=1}^{N} α_k(i) β_k(i) .   (3.5)

The most likely state J_k at time k is then,

argmax_{i∈X} { α_k(i) β_k(i) / Σ_{i=1}^{N} α_k(i) β_k(i) } .

This however will not yield the most likely sequence J. Consider, for example, if the most likely state at time k − 1 was i and the most likely state at time k was j, but the probability p_ij = 0; then such a sequence could not occur.

Viterbi (1967) instead found the "most likely" sequence J based on the observations Y, given P, Q, and π.


Define the highest probability of a path to state i at time k, based on the observations Y_0, Y_1, . . . , Y_k given P, Q, and π, as,

γ_k(i) = max_{\vec{J}_{k−1}} { Pr[\vec{Y}_k, \vec{J}_{k−1}, J_k = i | P, Q, π] } .   (3.6)

The probability of starting in state i with observation Y_0 is given by

γ_0(i) = π_i q_{i Y_0} .

It can be shown (Appendix A.1.1) that an iterative equation for γ_k(j) is

γ_k(j) = [ max_i {γ_{k−1}(i) p_ij} ] q_{j Y_k} .   (3.7)

Let ψ_k(j) be the hidden state i ∈ X at time k − 1 that maximises the probability of being in hidden state j ∈ X at time k, given the observations \vec{Y}_{k−1},

ψ_k(j) = argmax_{i∈X} {γ_{k−1}(i) p_ij} .

The Viterbi state, J*_{k−1}, is the state of J at time k − 1 in the most likely sequence. The Viterbi state at the final observation time K is,

J*_K = argmax_{i∈X} {γ_K(i)}

and the most likely sequence can be found recursively for all k = K, K − 1, . . . , 1 by,

J*_{k−1} = ψ_k(J*_k) .

This is the most likely, or Viterbi, sequence, given Y, P, Q, and π.
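A sketch of the Viterbi recursion in Python (NumPy) follows. It works in log space, which is an implementation choice to avoid numerical underflow rather than part of the derivation (zero probabilities become -inf):

import numpy as np

def viterbi(Y, P, Q, pi):
    K, N = len(Y), len(pi)
    logP, logQ = np.log(P), np.log(Q)
    gamma = np.log(pi) + logQ[:, Y[0]]        # gamma_0(i)
    psi = np.zeros((K, N), dtype=int)
    for k in range(1, K):
        cand = gamma[:, None] + logP          # cand[i, j] = gamma_{k-1}(i) + log p_ij
        psi[k] = cand.argmax(axis=0)          # psi_k(j)
        gamma = cand.max(axis=0) + logQ[:, Y[k]]
    J = [int(gamma.argmax())]                 # J*_K
    for k in range(K - 1, 0, -1):
        J.append(int(psi[k][J[-1]]))          # J*_{k-1} = psi_k(J*_k)
    return J[::-1]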

Find P, Q, π to maximise Pr[Y | P, Q, π]

The most important question is: given a set of observations, can estimates for P, Q and π be found that best describe the data, given that P, Q and π are non-unique? The answer is that whilst they can, these estimates are expressed in terms of the parameters P, Q and π themselves.

The probability that a transition from state i to state j occurs at time k + 1 is denoted,

ξ_k(i, j) = Pr[J_k = i, J_{k+1} = j | Y, P, Q, π].   (3.8)

It can be shown (Appendix A.1.2) that Equation (3.8) can be written as,

ξ_k(i, j) = α_k(i) p_ij q_{j Y_{k+1}} β_{k+1}(j) / Σ_{i=1}^{N} Σ_{j=1}^{N} α_k(i) p_ij q_{j Y_{k+1}} β_{k+1}(j) .   (3.9)


The probability that the process is in state i at time k was shown earlier in Equation (3.5) to be,

η_k(i) = α_k(i) β_k(i) / Σ_{i=1}^{N} α_k(i) β_k(i) .

The sum of ξ_k(i, j) over all k = 0, . . . , K − 1 is the expected number of transitions from state i to state j given Y, P, Q, and π. Similarly, the sum of η_k(i) over all k = 0, . . . , K − 1 is the expected number of transitions from state i given Y, P, Q, and π.

Given the data Y and parameters P, Q, and π, the best estimates for the parameters are given by π̂, P̂, and Q̂ respectively,

π̂_i = Pr[J_0 = i | Y, P, Q, π] = η_0(i) ,

p̂_ij = Σ_{k=0}^{K−1} Pr[J_k = i, J_{k+1} = j | Y, P, Q, π] / Σ_{k=0}^{K−1} Pr[J_k = i | Y, P, Q, π] = Σ_{k=0}^{K−1} ξ_k(i, j) / Σ_{k=0}^{K−1} η_k(i) ,

q̂_ij = Σ_{k=0}^{K} Pr[J_k = i, Y_k = j | Y, P, Q, π] / Σ_{k=0}^{K} Pr[J_k = i | Y, P, Q, π] = Σ_{k=0}^{K} η_k(i) δ(Y_k − j) / Σ_{k=0}^{K} η_k(i) ,

where δ(n) is the Kronecker delta given by,

δ(n) := 1 if n = 0, and 0 if n ≠ 0.

The Baum-Welch Algorithm

The Baum-Welch algorithm, proposed by Lloyd Welch (Welch, 2003) and developed over a number of years by Leonard Baum (Baum and Petrie 1966, Baum and Eagon 1967, Baum and Sell 1968, Baum et al. 1970, and Baum 1972), provides a means for finding a locally optimal solution to,

argmax_{P,Q,π} { Pr[Y | P, Q, π] } .

The algorithm works iteratively by updating estimates for P, Q, and π. The νth estimates for P, Q, and π will be denoted P^ν, Q^ν, and π^ν respectively. The first step is the expectation step. Using the current estimates P^ν, Q^ν, and π^ν, calculate the expected value of the likelihood function with respect to the conditional distribution of J given the data and the parameter estimates, and set,

B(P, Q, π | P^ν, Q^ν, π^ν) = E_{J|Y,P^ν,Q^ν,π^ν} [ Pr[Y | P, Q, π] ]
                           = E_{J|Y,P^ν,Q^ν,π^ν} [ Σ_{J∈X^{K+1}} Pr[Y, J | P, Q, π] ] .

Whilst this equation looks unwieldy, it is simply an expectation over the possible hidden sequences J. The next step is the maximisation step. The problem is to find new estimates,

P^{ν+1}, Q^{ν+1}, π^{ν+1} = argmax_{P,Q,π} { B(P, Q, π | P^ν, Q^ν, π^ν) } ,

where these new estimates are used to find the expectation once again, and then the process is repeated. Baum showed that Pr[Y | P^{ν+1}, Q^{ν+1}, π^{ν+1}] ≥ Pr[Y | P^ν, Q^ν, π^ν] and that the algorithm converges to a locally optimal solution. Pseudo-code of the Baum-Welch algorithm for fitting hidden Markov models to data is shown in Algorithm 3.1.

Algorithm 3.1: Pseudo-code for the Baum-Welch algorithm for fitting HMMs.

The Baum-Welch Algorithm:

1. Set ν = 0 with initial estimates of P^0, Q^0, π^0.

2. Calculate:

   ξ^ν_k(i, j) = Pr[J_k = i, J_{k+1} = j | Y, P^ν, Q^ν, π^ν]

   η^ν_k(i) = Pr[J_k = i | Y, P^ν, Q^ν, π^ν]

3. Update the parameters to:

   π^{ν+1}_i = η^ν_0(i)

   p^{ν+1}_ij = Σ_{k=0}^{K−1} ξ^ν_k(i, j) / Σ_{k=0}^{K−1} η^ν_k(i)

   q^{ν+1}_ij = Σ_{k=0}^{K} η^ν_k(i) δ(Y_k − j) / Σ_{k=0}^{K} η^ν_k(i)

   ν ← ν + 1

4. Repeat steps 2 and 3 until either ν reaches a desired value or the change in the likelihood function falls below a set threshold.
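Using the forward() and backward() sketches above, one iteration of Algorithm 3.1 can be written as follows (a sketch only; in practice the likelihood would be monitored across repeated calls):

import numpy as np

def baum_welch_step(Y, P, Q, pi):
    K, N = len(Y), len(pi)
    alpha, beta = forward(Y, P, Q, pi), backward(Y, P, Q, pi)
    eta = alpha * beta
    eta /= eta.sum(axis=1, keepdims=True)            # eta_k(i), Equation (3.5)
    xi = np.zeros((K - 1, N, N))
    for k in range(K - 1):                           # xi_k(i, j), Equation (3.9)
        x = alpha[k][:, None] * P * (Q[:, Y[k + 1]] * beta[k + 1])[None, :]
        xi[k] = x / x.sum()
    pi_new = eta[0]
    P_new = xi.sum(axis=0) / eta[:-1].sum(axis=0)[:, None]
    Q_new = np.zeros_like(Q)
    for j in range(Q.shape[1]):                      # delta(Y_k - j) selects the terms
        Q_new[:, j] = eta[np.array(Y) == j].sum(axis=0)
    Q_new /= eta.sum(axis=0)[:, None]
    return P_new, Q_new, pi_new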

Notes on fitting the data with the Baum-Welch algorithm

1. The likelihood is only locally optimal. Different starting parameters may yield different estimates after the algorithm converges. These different estimates may have different likelihoods.


2. The estimates are non-unique. Different estimated parameters may have the same likelihood.

3. If an initial parameter p^0_ij, q^0_ij, or π^0_i is set to zero, the parameter will remain zero in every subsequent estimate.

4. The number of states N must be chosen in advance; it is usually best to choose a number less than the maximal observation M = max{Y_k}.

3.4 Adaptation of the Baum-Welch Algorithm for the Fitting of Matrix Analytic Models

The Baum-Welch algorithm is readily adaptable to the fitting of discrete-time matrix analytic models. Despite this, a literature search turns up no published papers where this has been done. However, continuous-time examples do exist using the more general expectation-maximisation (EM) algorithm, of which the Baum-Welch algorithm was an early example (Dempster et al., 1977). Breuer (2002) and Klemm et al. (2003) apparently independently arrived at the same method for fitting continuous-time BMAPs using the EM algorithm. Asmussen et al. (1996) found a method for fitting continuous-time phase-type distributions, also using the EM algorithm, and Olsson (1996) discussed EM fitting for continuous-time phase-type distributions with censored data. The free software EMpht (Olsson, 1998) will fit continuous-time phase-type distributions to both censored and uncensored data. In this section, methods for fitting discrete-time BMAPs and PH distributions using adapted Baum-Welch algorithms are presented for completeness.

3.4.1 Fitting Discrete-Time Batch Markovian Arrival Processes

Let Y = {Y_0, Y_1, . . . , Y_K} be a set of K + 1 observed arrivals (or observations), each of batch size Y_k ∈ Z^+, and let M = max{Y_k} be the maximal batch size from the observations. It is important to note that no arrival at time k is also an observation, that is, an arrival of batch size Y_k = 0. For an initial distribution π and DBMAP process A_0, A_1, . . . , A_M, the probability of observing such a sequence is,

Pr[Y | π, A_0, A_1, . . . , A_M] = πA_{Y_0} A_{Y_1} . . . A_{Y_K} 1 .

It is assumed that the process is stationary, so that

Pr[Y | π, A_0, A_1, . . . , A_M] = Pr[Y | A_0, A_1, . . . , A_M]

and Σ_{l=0}^{M} πA_l = π Σ_{l=0}^{M} A_l = π.


The forward probabilities are then defined as,

α_0(j) = Σ_i π_i [A_{Y_0}]_{i,j} , ∀j ∈ X ,

α_k(j) = Σ_i α_{k−1}(i) [A_{Y_k}]_{i,j} , ∀j ∈ X, k = 1, . . . , K .

Similarly the backward probabilities are defined as,

β_K(i) = 1 , ∀i ∈ X ,

β_{k−1}(i) = Σ_j [A_{Y_k}]_{i,j} β_k(j) , ∀i ∈ X, k = K, . . . , 1 .

Similarly,

ξ_k(i, j) = α_k(i) [A_{Y_{k+1}}]_{i,j} β_{k+1}(j) / Σ_{i=1}^{N} Σ_{j=1}^{N} α_k(i) [A_{Y_{k+1}}]_{i,j} β_{k+1}(j)   (3.10)

η_k(i) = α_k(i) β_k(i) / Σ_{i=1}^{N} α_k(i) β_k(i) .   (3.11)

The fitting procedure then follows the same format as the Baum-Welch algorithm, as shown in Algorithm 3.2. The notes about fitting with the Baum-Welch algorithm for HMMs are also relevant to the fitting of DBMAPs.

Algorithm 3.2: Pseudo-code for a modified Baum-Welch algorithm for fitting DBMAPs.

Modified Baum-Welch Algorithm for discrete batch Markovian arrival processes:

1. Set ν = 0 with initial estimates A^0_0, A^0_1, . . . , A^0_M and the corresponding stationary distribution π^ν of Σ_{l=0}^{M} A^ν_l.

2. Use A^ν_0, A^ν_1, . . . , A^ν_M to evaluate Equation (3.10) and Equation (3.11) for all i, j = 1, . . . , N and k = 0, . . . , K − 1.

3. Update the parameters using:

   [A^{ν+1}_l]_ij = Σ_{k=0}^{K−1} ξ^ν_k(i, j) δ(Y_{k+1} − l) / Σ_{k=0}^{K−1} η^ν_k(i) .

4. Repeat steps 2 and 3 until either ν reaches a desired value or the change in the likelihood function falls below a set threshold.


3.4.2 Fitting Discrete-Time Phase-Type Distributions

Let Y = {Y_1, Y_2, . . . , Y_M} be a set of M observed times until absorption of a discrete time phase-type Markov chain, DPH(τ, T), where the times Y_l ∈ Z^+. The probability of observing the random variable Y_l given τ and T is,

Pr[Y_l | τ, T] = τT^{Y_l−1}t .

Using the previous notation for the phase-type distribution given in Section 3.1, the forward and backward probabilities can be defined in terms of the transient states 1, . . . , m.

The forward probabilities are,

α^l_0(j) = τ_j , ∀j ∈ X\{0} ,

α^l_k(j) = Σ_i α^l_{k−1}(i) T_{i,j} , ∀j ∈ X\{0}, k = 1, . . . , Y_l − 1 .

The backward probabilities (steps backwards from absorption) are,

β^l_{Y_l}(i) = t_i , ∀i ∈ X\{0} ,

β^l_k(i) = Σ_j T_{i,j} β^l_{k+1}(j) , ∀i ∈ X\{0}, k = Y_l − 1, . . . , 1 .

Having obtained the forward and backward probabilities,

Pr[Y_l | τ, T] = Σ_{i=1}^{m} α^l_{Y_l−1}(i) t_i = Σ_{i=1}^{m} τ_i β^l_1(i) .

The probability of transitioning from state i ∈ X\{0} to state j ∈ X\{0} at time k, and the probability of being in state i ∈ X\{0} at time k, are given by

ξ^l_k(i, j) = α^l_k(i) T_{i,j} β^l_{k+1}(j) / Σ_{i=1}^{m} Σ_{j=1}^{m} α^l_k(i) T_{i,j} β^l_{k+1}(j) , ∀i, j ∈ X\{0} ,

η^l_k(i) = α^l_k(i) β^l_k(i) / Σ_{i=1}^{m} α^l_k(i) β^l_k(i) , ∀i ∈ X\{0} .

The probability of starting in state i, given τ, T, and Y_l, is given by,

ϖ^l(i) = τ_i β^l_1(i) / Σ_{i=1}^{m} τ_i β^l_1(i) , ∀i ∈ X\{0} .


The probability of being absorbed at time Y_l from state i, given τ, T, and that absorption has not occurred up until time Y_l − 1, is given by,

ω^l(i) = α^l_{Y_l−1}(i) t_i / Σ_{i=1}^{m} α^l_{Y_l−1}(i) t_i , ∀i ∈ X\{0}.

The total expected number of transitions from i to j given τ, T, and Y is

ξ(i, j) = Σ_{l=1}^{M} Σ_{k=1}^{Y_l−1} ξ^l_k(i, j) .   (3.12)

The total expected number of times the process is in i given τ, T, and Y is

η(i) = Σ_{l=1}^{M} Σ_{k=1}^{Y_l} η^l_k(i) .   (3.13)

The total expected number of times the process begins in i given τ, T, and Y is

ϖ(i) = Σ_{l=1}^{M} ϖ^l(i) .   (3.14)

The total expected number of times the process is absorbed from i given τ, T, and Y is

ω(i) = Σ_{l=1}^{M} ω^l(i) .   (3.15)

The Baum-Welch algorithm is then modified for the discrete phase-type distribution as shown in Algorithm 3.3. The notes about fitting with the Baum-Welch algorithm for HMMs are also relevant to the fitting of discrete phase-type distributions.

Algorithm 3.3: Pseudo-code for a modified Baum-Welch algorithm for fitting DPHs.

Modified Baum-Welch Algorithm for discrete phase-type distributions:

1. Set ν = 0 with initial estimates τ^0, T^0, t^0.

2. Use τ^ν, T^ν, t^ν to evaluate Equations (3.12), (3.13), (3.14), and (3.15) for all i, j = 1, . . . , m, and k = 1, . . . , Y_l.

3. Update the parameters to:

   τ^{ν+1}_i = ϖ(i)/M

   [T^{ν+1}]_ij = ξ(i, j)/η(i)

   t^{ν+1}_i = ω(i)/η(i)

4. Repeat steps 2 and 3 until either ν reaches a desired value or the change in the likelihood function falls below a set threshold.


Chapter 4

Markov Decision Processes

This chapter will present the Markov decision process, which uses many of the Markov chain constructions seen to this point and will be the main focus of the original research presented in Chapter 6. The Markov decision process is based on two concepts: rewards for making transitions in a Markov chain, and decisions which can change the transition probabilities and/or the rewards. Fundamental to Markov decision processes is that the system of interest be modelled by a Markov chain. In Chapter 5, previous applications of Markov decision processes to modelling and optimising releases from reservoirs will be discussed, and these will be further extended in Chapter 6.

4.1 Introduction

A decision process M is a stochastic process {J_k : k ∈ Z^+} with a reward associated with being in each state of the process and a set of decisions that affects how the process evolves. The decision process is defined by four quantities, M = (X, D, P, R):

• X a countable state space representing the states of the system,

• D a compact set of decisions,

• P a transition function that maps state/decision pairs to a probability distribution over X, Pr[J_{k+1} | J_k, d ∈ D],

• R a reward function R : X ×D → R.

A Markov decision process (MDP) is one where the transitions between states are Markovian. That is, the states in X are the states of a Markov chain.

A probability transition between the states i, j ∈ X for a given decision d ∈ D is written p_ij(d) = Pr[J_{k+1} = j | J_k = i, d], which is the ith row and jth column element of a probability transition matrix P(d). Likewise, a reward, R(d), is defined for a decision d, where r_j(d) is the reward associated with ending in state j under decision d.

A Bellman equation (Bellman, 1957) describing the value v_k(i) of being in state i at time k is given by

v_k(i) = Σ_j p_ij(d) [r_j(d) + v_{k+1}(j)]
       = λ_i(d) + Σ_j p_ij(d) v_{k+1}(j) ,

where λ_i(d) = Σ_j p_ij(d) r_j(d).

An alternative formulation of an MDP can have R(d) being dependent on the state before and after the decision d is taken. That is, a matrix giving a reward associated with the transition from one state of X to another state. Here r_ij(d) would denote the reward associated with the transition from state i to state j in X under decision d. Consequently λ_i(d) = Σ_j p_ij(d) r_ij(d); otherwise there is no practical difference between the two formulations. Both are used in this thesis, where appropriate.

The expected monetary value (EMV) until some future time K of being in state i ∈ X at time k is denoted v_k(i). A backwards recursive Bellman equation is used to determine the maximum EMV, v*_k(i), at time k given the maximum value at time k + 1 of all states of X, by choosing the decision d ∈ D such that,

v*_k(i) = max_{d∈D} { λ_i(d) + Σ_j p_ij(d) v*_{k+1}(j) } .   (4.1)

A policy, denoted u, is a set of decisions to be made at each state. That is, u is a vector of length |X|, the number of states in X, where the ith element of u is u_i ∈ D.

4.2 Value Iteration

Howard (1960, chap. 3) proposes a method for finding an optimal set of decisions that will create a policy to give the maximal long term reward. Let n be the number of time steps until the end time of the process. Then Equation (4.1) is rewritten as,

v*_n(i) = max_{d∈D} { λ_i(d) + Σ_j p_ij(d) v*_{n−1}(j) } .   (4.2)

By setting a final value v_0(i) for all states i ∈ X, solving Equation (4.2) will give a policy u_n at each time step n that maximises the future expected EMV.

The final time can be allowed to tend to infinity, in which case Bellman's equation is solved recursively from an arbitrary final value, v_0(i), at some unspecified future time, working backwards until a steady state solution is found. Bellman (1957) showed that this will converge, although not necessarily in a finite number of steps. The convergence on an optimal policy, u_n → u*, will only occur as n → ∞. In practice the algorithm stops when u_{n+1} = u_n, as the process has become stationary and further iterations will not result in a different policy. Different initial values for v_0(i) are used to check that this policy is globally optimal, as there may be locally optimal solutions, although in practice this is rare.
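A value iteration sketch in Python (NumPy) follows; P and lam are assumed to be lists indexed by decision d, holding the matrix P(d) and the expected-reward vector with elements λ_i(d):

import numpy as np

def value_iteration(P, lam, n_steps, v0=None):
    D, N = len(P), P[0].shape[0]
    v = np.zeros(N) if v0 is None else v0.copy()
    policy = np.zeros(N, dtype=int)
    for _ in range(n_steps):
        # candidate values for every decision, Equation (4.2)
        cand = np.array([lam[d] + P[d] @ v for d in range(D)])
        policy = cand.argmax(axis=0)   # u_n, the maximising decision per state
        v = cand.max(axis=0)
    return v, policy

Without discounting, the values themselves grow linearly in n, so in practice the loop is run until the returned policy stops changing, as described above, rather than until v converges.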

4.2.1 Discounted Value Iteration

Howard (1960, chap. 7) also considers that future values of a system will have a lower utility now than the current value, and proposes that future values be discounted by a factor of β, 0 ≤ β ≤ 1. This results in a geometric utility function where the value of one dollar n time steps into the future is worth β^n dollars now. It has a Bellman equation of the form,

v*_n(i) = max_{d∈D} { λ_i(d) + β Σ_j p_ij(d) v*_{n−1}(j) } .   (4.3)

Equation (4.3) is identical to Equation (4.2) when β = 1. When β = 0 the decision is made to maximise the current reward only, with no value placed on future rewards. Equation (4.3) has stronger convergence properties than (4.2) and will converge more quickly for smaller β.

4.3 Policy Iteration

Howard (1960, chap. 4) also proposes an alternative method for solving Markov decision processes. This method converges on an optimal policy u* in a finite number of iterations, so long as the number of states of X is a finite number m. Policy iteration is performed in two steps, a value-determination procedure followed by a policy-improvement procedure.

Value-Determination Procedure

Consider Equation (4.2); if the system is operating with policy u at all times n then,

v_n(i) = λ_i + Σ_j p_ij v_{n−1}(j) .   (4.4)

If P is the probability transition matrix for the system operating under this policy, the vector v_n is the vector of state values at time n, and the vector λ has elements λ_i, then Equation (4.4) can be rewritten in matrix form as,

v_{n+1} = λ + Pv_n .   (4.5)

The z-transform of Equation (4.5) is,

V^z = z/(1 − z) (I_m − zP)^{−1} λ + (I_m − zP)^{−1} v_0 ,   (4.6)

where V^z is the z-transform of v_n.

If P is ergodic then P^n can be expressed as S + T(n), where S is the long-term stable behaviour of the matrix P, in which every row is equal to π = πP with π1 = 1, and T(n) is the transient behaviour, all the entries of which tend to zero as n becomes very large. As a result,

(I_m − zP)^{−1} = 1/(1 − z) S + T(z) ,   (4.7)

where T(z) is the z-transform of T(n). Substituting (4.7) into (4.6) yields,

V^z = z/(1 − z)^2 Sλ + z/(1 − z) T(z)λ + 1/(1 − z) Sv_0 + T(z)v_0 ,

and taking the inverse z-transform yields,

v_n = nSλ + Σ_{k=1}^{n} T(k)λ + Sv_0 + T(n)v_0 .

As n → ∞, T(n)v_0 → 0 and Σ_{k=1}^{n} T(k)λ + Sv_0 → v, a vector of constants associated with the value of each state. As the rows of S are π, define g = Sλ, where the elements g_i = g = Σ_j π_j λ_j. This yields the asymptotic form,

v_n = ng + v ,

or for each state i ∈ X,

v_n(i) = ng + v(i) .   (4.8)

Substituting Equation (4.8) into Equation (4.4) yields the equation describing the value increase with each time step for very large n,

ng + v(i) = λ_i + Σ_j p_ij ((n − 1)g + v(j))

⇒ g + v(i) = λ_i + Σ_j p_ij v(j) .   (4.9)

It is worth noting that the v(i), ∀i ∈ X, are relative values (substitute r_j = ar′_j + b into Equation (4.9) to check). Therefore, fixing v(N) = 0, Equation (4.9) gives a series of linear equations that can be solved. This will determine the relative values for each state given a fixed policy when n is large.


Policy-Improvement Procedure

Consider Equation (4.1), which states,

v*_n(i) = max_{d∈D} { λ_i(d) + Σ_j p_ij(d) v*_{n−1}(j) } .

As only the long-term stable solution is of interest, consider the asymptotic values,

ng + v(i) = max_{d∈D} { λ_i(d) + Σ_j p_ij(d) [(n − 1)g + v(j)] } ,

g + v(i) = max_{d∈D} { λ_i(d) + Σ_j p_ij(d) v(j) } .

As g is a constant, the policy that maximises v(u) is the policy that maximises,

λ_i(d) + Σ_j p_ij(d) v(j)   (4.10)

for all i ∈ X.

Policy iteration is performed by alternately solving Equation (4.9) and maximising Equation (4.10), as shown in Algorithm 4.1.

Algorithm 4.1: Pseudo-code for the Policy Iteration procedure.

Policy Iteration:

1. Set ν = 0 with initial estimate u^0.

2. For u^ν solve Equation (4.9) with v(N) = 0 to find v^{ν+1}.

3. Using v^{ν+1} maximise (4.10) to find u^{ν+1}.

4. If u^{ν+1} = u^ν stop, otherwise set ν ← ν + 1 and repeat steps 2 and 3.
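A Python (NumPy) sketch of Algorithm 4.1 is given below. As in the value iteration sketch, P and lam are lists indexed by decision; the pinning v(N) = 0 is implemented by solving for the gain g in place of the last relative value:

import numpy as np

def policy_iteration(P, lam, u0):
    D, N = len(P), P[0].shape[0]
    u = u0.copy()
    while True:
        Pu = np.array([P[u[i]][i] for i in range(N)])    # rows under policy u
        lu = np.array([lam[u[i]][i] for i in range(N)])
        # Value determination: g + v(i) = lam_i + sum_j p_ij v(j), with v(N-1) = 0
        A = np.eye(N) - Pu
        A[:, -1] = 1.0                # last unknown becomes g instead of v(N-1)
        x = np.linalg.solve(A, lu)    # x = (v(0), ..., v(N-2), g)
        v = np.append(x[:-1], 0.0)
        # Policy improvement, Equation (4.10)
        cand = np.array([lam[d] + P[d] @ v for d in range(D)])
        u_new = cand.argmax(axis=0)
        if np.array_equal(u_new, u):
            return u, v, x[-1]        # optimal policy, relative values, gain g
        u = u_new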

Puterman and Brumelle (1978, 1979) showed that policy iteration is equivalent to the Newton-Kantorovitch root-finding procedure, such that there exists an optimal policy u* and that u^ν converges to u*. For problems where X has finite cardinality and D is a discrete finite set (such as will be considered in this thesis), policy iteration converges O(N^2).


4.4 Stochastic Solution Methods

In situations where the number of states, |X|, and the number of decisions, |D|, are very large, it is not always practical to solve using policy iteration or value iteration. In recent years a number of stochastic solution methods have been proposed to find the solution to Markov decision processes without having to calculate every state/decision pair at each iteration step. Instead, policies are generated randomly in a systematic way to find policies that are optimal for the reward function. These methods have also proven popular when solving optimisation problems with more than one reward function, such as in multi-objective optimisation. Two widely used methods are simulated annealing and genetic algorithms.

For a fixed policy u there is a matrix P(u) whose elements are given by p_ij(u_i), where u_i ∈ D is the decision made in state i given by policy u. Define π(u) as the stationary (row) vector of the matrix P(u).

Now define a column vector λ(u) which has as its ith element the expected reward λ_i(u_i) for the decision u_i ∈ D given by policy u.

The expected monetary value of the system operating under a policy u is then given by,

f(u) = π(u)λ(u) .   (4.11)

Stochastic solution methods focus on finding the policy which will maximise Equation (4.11). These are adaptations of well established solution methods; however it should be noted that there is no one ideal way of implementing these methods, and they are subject to modification on a case-by-case basis.

4.4.1 Simulated Annealing

Simulated annealing is based on the idea of annealing in metallurgy, where metal is heated and then cooled in a controlled way to achieve a desired crystal structure. To mimic this, a temperature T is introduced which controls the process.

Starting with an initial policy u^0, a number of the elements of the vector (usually a number conditioned on T) are mutated at random. This is done by taking a chosen element u_i ∈ D and selecting another decision u*_i ∈ D to replace it in the vector. For example,

u^0 = [u_1, u_2, . . . , u_{c−1}, u_c, u_{c+1}, . . . , u_n] → u^1 = [u_1, u*_2, . . . , u_{c−1}, u*_c, u_{c+1}, . . . , u_n] .

If the new policy performs better, then it always replaces the old policy. However, if it performs worse, it replaces the old policy with a probability dependent on T. After a predetermined number of iterations M, the temperature is lowered. This process is then repeated a predetermined number of times N. All the while the best policy, defined as the one that gives the highest EMV, is recorded and is used to restart the mutation process each time. The best policy found through this process is used as the maximum. A summary of the simulated annealing procedure is shown in Algorithm 4.2.

Algorithm 4.2: Pseudo-code for the Simulated Annealing procedure applied to MDPs.

Simulated Annealing:

1. Set ν = 0 with initial policy u^0, temperature T, and 0 < α < 1.

2. Begin search at temperature T:

   (a) Probabilistically mutate u^ν conditional on T to create u′.

   (b) Calculate g = min{1, exp[−(f(u^ν) − f(u′))/T]}.

   (c) Set u^{ν+1} ← u′ with probability g, otherwise set u^{ν+1} ← u^ν.

   (d) Set ν ← ν + 1.

   (e) Repeat Steps (a)–(d) M times, while keeping note of u* = argmax {f(u^0), . . . , f(u^M)}.

3. Lower the temperature, T ← αT, then set ν = 0 and u^0 ← u*.

4. Repeat Steps 2 & 3 N times.
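A sketch of Algorithm 4.2 in Python (NumPy) follows; f is the EMV function of Equation (4.11), and mutating a single element per step is one simple choice among many:

import numpy as np

def simulated_annealing(f, n_states, decisions, T, alpha, M, N, rng):
    u = rng.choice(decisions, size=n_states)         # initial policy u^0
    best, best_val = u.copy(), f(u)
    for _ in range(N):                               # N temperature levels
        for _ in range(M):                           # M searches at temperature T
            u_prop = u.copy()
            i = rng.integers(n_states)               # mutate one element at random
            u_prop[i] = rng.choice(decisions)
            g = min(1.0, np.exp(-(f(u) - f(u_prop)) / T))
            if rng.random() < g:                     # accept with probability g
                u = u_prop
            if f(u) > best_val:                      # keep note of the best policy
                best, best_val = u.copy(), f(u)
        T *= alpha                                   # lower the temperature
        u = best.copy()                              # restart from the best policy
    return best, best_val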

4.4.2 Genetic Algorithms

Genetic algorithms are based on the idea of Mendelian inheritance in evolutionary theory. Here a population Q^0 of N policies is generated. The policies are then ranked by how well they perform on the EMV measure; this is called their "fitness". These rankings are used to determine "breeding" pairs. Breeding pairs are selected randomly, with the higher ranked policies favoured, usually by a weighting method. When a breeding pair is selected, a point along the vectors is chosen at random (via a probability p_c) for a crossover. Each of the "offspring" vectors will inherit one portion of the vector from each of their "parent" policy vectors (see Figure 4.1), making two new vectors.

Parent generation:

[u^1_1, u^1_2, . . . , u^1_c, u^1_{c+1}, . . . , u^1_n]   [u^2_1, u^2_2, . . . , u^2_c, u^2_{c+1}, . . . , u^2_n]

Offspring generation:

[u^1_1, u^1_2, . . . , u^1_c, u^2_{c+1}, . . . , u^2_n]   [u^2_1, u^2_2, . . . , u^2_c, u^1_{c+1}, . . . , u^1_n]

Figure 4.1: The crossover in two policy vectors from the parents to the offspring.

After a population Q^1 of N new policies is generated, random elements (chosen typically with a small probability p_m) of the new policy vectors are mutated in the same way as in the simulated annealing example. This process is repeated for M generations, at the end of which the "fittest" policy found via this procedure is used as the maximum. A summary of a genetic algorithm for MDPs is shown in Algorithm 4.3.

Algorithm 4.3: Pseudo-code for a Genetic Algorithm applied to MDPs.

Genetic Algorithms:

1. Generate a population of N policies Q^0 = {u_1, . . . , u_N} and set 0 < p_c, p_m < 1 and ν = 0.

2. Evaluate the EMV of the population f(Q^ν) = {f(u_1), . . . , f(u_N)}.

3. Breeding pairs are chosen from Q^ν based on their value in f(Q^ν), with preference given to better performers.

4. For a breeding pair a point is chosen along the vector with probability p_c, and the two vectors are cut at this point, swapped and recombined, to create two new policies.

5. Repeat Steps 3 and 4 until a new population Q^{ν+1} of N policies is created.

6. Mutate individual elements of the policies in Q^{ν+1} with probability p_m.

7. Set ν ← ν + 1 and repeat Steps 2–6 M times.

4.5 Partially Observable Markov Decision Processes

Partially observable Markov decision processes (POMDPs) are based on similar ideas to hidden Markov models (Section 3.3), in that only a sequence of observations {Y_k} is observed, not the process {J_k}. Formulations can vary, but Littman (2009) defines POMDPs as follows:

• X, the state space of the Markov chain,

• D, the set of decisions,

• Y, the set of observations,

• P, the probability transition matrix dependent on the decision,

• R, the reward function dependent on the state of the Markov chain and the decision,

• Q, the observation function which relates the states and the decisions to the observations,

• b, a vector of probabilities describing the belief of the current state in X, such that Σ_i b_i = 1.

The constructions X, P, D and R are defined in the same way as before. As Q is some probability function that maps X, D and R to [0, 1], the likelihood of being in the unobserved states X based on the observations Y is tracked with the belief vector b. The ith element b_i of b is the probability of the process being in state i, and Σ_i b_i = 1. What the observations Y are depends on the function Q; they could be rewards, subsets of X, or another indicator variable entirely. Given that rewards are a function of state and action, Q is a map of X and D to [0, 1], and once again can be represented as a set of matrices that vary with d ∈ D. The function Q is used to update b based on the observations Y as they occur.

Dealing with POMDPs

The probability of observing Y_k, based on the belief about the state J_{k−1}, b^{k−1}, and the decision d ∈ D, is

p_y = Pr[Y_k | d, b^{k−1}] = Σ_i b^{k−1}_i Σ_j Pr[J_k = j | d, J_{k−1} = i] Pr[Y_k | d, J_k = j]
                           = Σ_i b^{k−1}_i Σ_j p_{i,j}(d) q_{j Y_k}(d) .

The belief vector is updated at time k based on the decision at time k − 1 and the observation Y_k at time k,

b^k_j = Pr[J_k = j | Y_k, d, b^{k−1}] = Σ_i b^{k−1}_i p_{i,j}(d) q_{j Y_k}(d) / Σ_i b^{k−1}_i Σ_j p_{i,j}(d) q_{j Y_k}(d) .

If the observations are not the rewards, then using the belief vector b^k at time k, the expected reward received for making decision d is,

λ(d, b^k) = Σ_i b^k_i Σ_j p_{i,j}(d) r_ij(d) .

Using the expected reward, a Bellman equation can be written that, when solved iteratively, will give a policy that maximises the expected return until the end time,

v*_{k−1}(b^{k−1}) = max_{d∈D} { λ(d, b^{k−1}) + Σ_y p_y v*_k(b^k) } .

Similar to Section 4.2.1, a discount factor 0 ≤ β ≤ 1 can be applied:

v*_{k−1}(b^{k−1}) = max_{d∈D} { λ(d, b^{k−1}) + β Σ_y p_y v*_k(b^k) } .


It should be noted that, unlike standard MDPs, in many cases POMDPs cannot be solved to find stable solutions without the use of simulations.


Chapter 5

Literature Review

5.1 Moran Dam Model

First proposed by Moran (1954), the Moran Dam Model is a simple, discrete-time stochastic model for a reservoir of maximal capacity C. The available storage Z_{k+1} of the reservoir at time k + 1 is given by the recursive relation,

Z_{k+1} = min[Z_k + Y_{k+1}, C] − min[Z_k + Y_{k+1}, D] ,

where Y_{k+1} is the net inflow (inflow minus evaporation) in the time step (k, k + 1] and D ∈ (0, C) is the target amount of water released from the reservoir during one time step.

Typically it is assumed that Y_1, Y_2, . . . are independently and identically distributed (iid), as in Moran (1959). Another assumption implicit in the model is that the water is released from the dam after it flows in; as a result the storage will never be above C − D at the end of a time step. Both of these assumptions are valid for the modelling of a reservoir created by a dam which is used to hold back an annual inflow of water that arrives in a relatively short period of time, before being slowly released during the year. If the main inflows are close to one year apart, the assumption of independence between inflows should be valid.

The Moran Dam Model under these assumptions is a Markov chain, and is analytically identical to the M/G/1 or G/M/1 server queues (Stadje, 1993). In Section 3.2.1 it was discussed that M/G/1 and G/M/1 server queues can be represented in the MAM framework as quasi-birth and death processes, so it can be inferred that the Moran Dam Model is also a quasi-birth and death process.

If Z_k and Y_k take only discrete values, Z_k = 0, 1, . . . , C − D and Y_k ∈ Z^+, then the Moran Dam Model is a finite state Markov chain, with the C − D + 1 states of the chain representing levels of storage in the reservoir.

The probability of inflow Y_{k+1} = i is denoted p_i for all i = 0, 1, . . . , and the probability of the storage being Z_k = i at time k is denoted π^k_i. Consider a release policy with C − D > D (this is done for convenience; an alternative set of equations can be written for C − D ≤ D); the probabilities π^{k+1}_i are then:

π^{k+1}_0 = π^k_0(p_0 + · · · + p_D) + π^k_1(p_0 + · · · + p_{D−1}) + · · · + π^k_D p_0
π^{k+1}_1 = π^k_0 p_{D+1} + π^k_1 p_D + · · · + π^k_{D+1} p_0
⋮
π^{k+1}_{C−2D} = π^k_0 p_{C−D} + π^k_1 p_{C−D−1} + · · · + π^k_{C−D} p_0
⋮
π^{k+1}_{C−D−1} = π^k_0 p_{C−1} + π^k_1 p_{C−2} + · · · + π^k_{C−D} p_{D−1}
π^{k+1}_{C−D} = π^k_0(p_C + . . . ) + π^k_1(p_{C−1} + . . . ) + · · · + π^k_{C−D}(p_D + . . . ) .   (5.1)

As k → ∞, π^k_i → π_i, the stationary probability of being in state i at the end of each time-step. By setting π^k_i = π^{k+1}_i = π_i for all i = 0, 1, . . . , C − D in the set of linear equations (5.1), and using the fact that Σ π_i = 1, the system of linear equations can be solved to find the stationary probability distribution of the water storage in the reservoir.

Using the Moran model it is possible to study two other stochastic processes: $\{R_k\}$, the sequence of actual releases, and $\{W_k\}$, the sequence of overflows, defined by
\[
R_{k+1} = \min[Z_k + Y_{k+1}, D] \quad \text{and} \quad W_{k+1} = \max[Z_k + Y_{k+1} - C, 0].
\]

Define the process $U_{k+1} = Z_k + Y_{k+1}$; this would model a Moran dam of infinite capacity and no release. Let $\Pr(U_k = i) = \rho^k_i$ and let the stationary probability distribution be $\lim_{k \to \infty} \rho^k_i = \rho_i$. It can be shown that
\begin{align*}
\rho_0 &= \pi_0 p_0\\
\rho_1 &= \pi_0 p_1 + \pi_1 p_0\\
&\;\;\vdots\\
\rho_{C-D} &= \pi_0 p_{C-D} + \cdots + \pi_{C-D} p_0\\
\rho_{C-D+s} &= \pi_0 p_{C-D+s} + \cdots + \pi_{C-D} p_s, \quad s > 0.
\end{align*}

Then the stationary probability distribution of the process $\{R_k\}$ is
\begin{align*}
\Pr[R_\infty = 0] &= \rho_0\\
\Pr[R_\infty = 1] &= \rho_1\\
&\;\;\vdots\\
\Pr[R_\infty = D-1] &= \rho_{D-1}\\
\Pr[R_\infty = D] &= \rho_D + \rho_{D+1} + \cdots = 1 - \rho_0 - \rho_1 - \cdots - \rho_{D-1}.
\end{align*}

The stationary probability distribution of the process $\{W_k\}$ is
\begin{align*}
\Pr[W_\infty = 0] &= \rho_0 + \rho_1 + \cdots + \rho_C\\
\Pr[W_\infty = s] &= \rho_{C+s}, \quad s > 0.
\end{align*}
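The entire calculation is straightforward to carry out numerically. Below is a minimal Python sketch, assuming a given discrete inflow pmf; the function name and the least-squares solve of $\pi P = \pi$, $\pi \mathbf{1} = 1$ are illustrative choices rather than prescriptions from the thesis.

```python
import numpy as np

def moran_distributions(p, C, D):
    """Stationary storage, release and overflow distributions of the
    Moran model for inflow pmf p, capacity C and target release D."""
    p = np.asarray(p, dtype=float)
    n = C - D + 1                        # storage states 0, ..., C-D
    P = np.zeros((n, n))
    for i in range(n):
        for y, py in enumerate(p):
            z = min(i + y, C) - min(i + y, D)   # next storage state
            P[i, z] += py
    # stationary pi: pi P = pi together with sum(pi) = 1
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    pi, *_ = np.linalg.lstsq(A, np.append(np.zeros(n), 1.0), rcond=None)
    # rho: stationary law of U = Z + Y (infinite capacity, no release)
    rho = np.zeros(n + len(p) - 1)
    for i in range(n):
        rho[i:i + len(p)] += pi[i] * p
    release = np.append(rho[:D], rho[D:].sum())        # Pr[R = 0..D]
    spill = np.append(rho[:C + 1].sum(), rho[C + 1:])  # Pr[W = 0, 1, ...]
    return pi, release, spill
```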

Reznicek and Cheng (1991) commented that it is the ease with which these stationary probability distributions can be calculated that makes the Moran Dam Model desirable. However, if the assumptions are not valid, the theory cannot be applied in practice.

5.1.1 Gould Probability Matrix

First proposed by Gould (1961), the Gould probability matrix uses a modified Moran model to determine the capacity $C$ at which a reservoir should be built by setting an acceptable probability of failure. Described by McMahon and Mein (1978), the method is as follows (a sketch of the construction is given after the list):

1. Determine the monthly releases from the reservoir $D_k$ for $k = 1, \ldots, 12$.

2. Using the historical inflows to the reservoir, find all the monthly inflows $Y_k$. Let $N$ denote the total number of years for which the monthly inflows are known.

3. Determine the monthly evaporation $E_Z$ for the storage level $Z \in [0, C]$.

4. For all initial levels $Z_0$, calculate $Z_1 = Z_0 + Y_1 - D_1$. That is, for each initial level $Z_0$ of the reservoir, find what level $Z_1$ the reservoir would be at after one month.

5. Calculate iteratively $Z_{k+1} = \min[Z_k + Y_{k+1}, C] - \min[Z_k + Y_{k+1}, D_k]$ until $Z_{12}$ is obtained.

6. If $Z_0 = i$ and $Z_{12} = j$, add 1 to the $i$th row and $j$th column element of a table, and perform this calculation for all rows $i$. That is, calculate what the reservoir level would be at the end of the year for the reservoir starting the year at each initial level $i$.

7. Repeat Steps 4 to 6 for all $N$ years in the data set and divide each row of the table by $N$ to form a matrix $P$.

8. $P$ is now the annual transition matrix for the storage of the reservoir with releases $D_k$ for each month $k = 1, \ldots, 12$ during the year.
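For illustration, a minimal Python sketch of Steps 4 to 8 is given below; it ignores evaporation (Step 3) for brevity, and all names and the discretisation choice are illustrative assumptions.

```python
import numpy as np

def gould_matrix(inflows, releases, C, levels=20):
    """Annual storage transition matrix P (Steps 4 to 8).

    inflows  : (N, 12) array of historical monthly inflows, one row per year.
    releases : length-12 sequence of monthly target releases D_k.
    C        : capacity, discretised into `levels` + 1 storage states,
               with state 0 the dead storage. Evaporation (Step 3) is
               ignored here for brevity.
    """
    n = levels + 1
    step = C / levels
    counts = np.zeros((n, n))
    for year in inflows:                 # Step 7: every year of record
        for i in range(n):               # Step 4: every initial level
            z = i * step
            for k in range(12):          # Step 5: iterate through the year
                z = min(z + year[k], C) - min(z + year[k], releases[k])
            counts[i, int(round(z / step))] += 1   # Step 6: tally the table
    return counts / len(inflows)         # Step 7: divide each row by N
```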

This new model improves upon the Moran model in that it allows for seasonal variation in the inflows, as well as allowing the releases and inflows to be spaced over the year. It still assumes that the annual inflows are independent and identically distributed. The capacity $C$ for which the dam should be constructed is found as follows (a sketch of the search loop is given after the list):

1. Partition a reservoir into $C$ levels and a level zero (0) representing the dead storage of the reservoir.

2. Calculate the probability transition matrix $P$ as described above.

3. Whilst calculating $P$, count the number of times the storage reaches zero during the year (not just at the end). Create the column vector $f$, the elements of which are the probability that the storage reaches zero, given it is at level $i$ at the start of the year.

4. Calculate the steady state vector $\pi = \pi P$, where $\pi \mathbf{1} = 1$.

5. Calculate the overall risk $\pi f$ of the reservoir reaching dead storage.

6. If the overall risk is greater than the acceptable probability of failure, increase the maximum capacity $C$ and repeat Steps 2 to 5 until the risk is below the desired value.
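A minimal Python sketch of this search (Steps 4 to 6) follows. It assumes a user-supplied function that returns $P$ and $f$ for a given capacity, built as in the previous sketch with an added within-year failure count; all names are illustrative.

```python
import numpy as np

def overall_risk(P, f):
    """Steps 4 and 5: steady state pi of P (pi P = pi, pi 1 = 1) and
    the overall risk pi @ f of reaching dead storage."""
    n = len(P)
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    pi, *_ = np.linalg.lstsq(A, np.append(np.zeros(n), 1.0), rcond=None)
    return pi @ f

def design_capacity(build, acceptable, C, dC):
    """Step 6: `build` maps a capacity C to the pair (P, f); grow C
    until the overall risk falls below the acceptable failure level."""
    while overall_risk(*build(C)) > acceptable:
        C += dC
    return C
```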

Srikanthan and McMahon (1985a) considered the question: given that $P$ is the annual probability transition matrix calculated using the inflows and releases on a monthly basis, what should be used as the starting month? Analysis showed that, by starting the annual time step at different months, the calculated capacity (Gould storage) could vary by as much as 20%. They concluded that the starting month should be the one with the minimum mean monthly inflow, as this was the month that produced the largest Gould storage and would therefore provide the largest guard against failure. Srikanthan and McMahon (1985b) considered the problem of adding temporal, or serial, correlation to the inflows, as the Gould matrix model is unrealistic when dealing with such scenarios. They concluded that a correction factor had to be applied to the Gould storage if the correlation coefficient of the lag-1 inflows $|r| \ge 0.2$, and that if the correction factor was greater than 1.5 it indicated that the Gould matrix method was unsuitable for such applications. Otieno and Ndiritu (1997) determined that, for annually serially correlated inflows, the month with the lowest annual serial correlation coefficient should be used as the starting month.

The Gould matrix method has also been used to determine reservoir operation policy when the capacity and releases are fixed. Price (2008) described how a modified Gould matrix was used to determine when water should be transferred via a pipeline to supplement the water supply in neighbouring counties in the South East of England. The counties Essex and Suffolk have two reservoirs which can be supplemented by buying and transferring water from a river called the Ely Ouse, which is located in a third county to the north. A modified Gould probability transition matrix was used to model the total capacity of the two reservoirs, treating them as a single storage. For each month and each level of storage, the probability that the system would be full at the start of May was calculated (similar to the probability that the reservoir reaches dead storage) and, if it was less than 0.9, water would be transferred from the Ely Ouse. This produced a control curve on which to operate the transfer scheme.


5.2 Linear Programming Models

A linear programming (LP) model is a model in which both the objective function to optimise and the inequality constraints on the optimisation are linear in the variables. A linear programming model is expressed in canonical form as:

\[
\min_{x} Z = C^T x,
\]
subject to
\[
Ax \le b, \qquad x_i \ge 0 \text{ for all } i = 1, \ldots, n,
\]
where $C$ is an $n \times 1$ vector of coefficients of the linear objective function, $x$ is an $n \times 1$ vector of the variables, $A$ is an $m \times n$ matrix of coefficients of the linear constraints and $b$ is an $m \times 1$ vector representing the right hand side of the linear constraints.
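Such problems can be solved with off-the-shelf solvers. For illustration, a small hypothetical instance of the canonical form using scipy's linprog is shown below; all numbers are arbitrary.

```python
import numpy as np
from scipy.optimize import linprog

# Toy instance: minimise Z = C^T x subject to A x <= b and x >= 0.
# The constraint x1 + x2 >= 4 is written as -x1 - x2 <= -4.
C = np.array([2.0, 3.0])
A = np.array([[-1.0, -1.0],
              [3.0, 1.0]])
b = np.array([-4.0, 15.0])

res = linprog(C, A_ub=A, b_ub=b, bounds=(0, None))
print(res.x, res.fun)   # optimal variables and objective value
```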

Whilst LP models are deterministic, they have been used in the planning of reservoirs, such as finding the minimal capacity required to meet expected inflows; Yeh (1985) has a comprehensive list of such applications. LP models have found wide use in conjunction with dynamic programming (DP) models, such as in Webby et al. (2009) and Vedula and Nagesh Kumar (1996) amongst many others. Dynamic programming models were introduced by Bellman (1957) and have been used to model the non-linear and dynamic components of reservoirs.

5.2.1 Stochastic Linear Programming

Proposed by Loucks (1968), stochastic linear programming models combine the efficient calculations of linear programming with the stochastic nature of the system being modelled. Consider the point probability $X_{\nu,i,d,k}$ of having initial storage volume $\nu$ at time $k$, net inflow $i$ and release $d$ during the time $[k, k+1)$. Let $R_{\nu,i,d,k}$ be the reward for such states. Correlated serial flows can be modelled using a Markov chain. Denote the inflow probabilities $P_{i,j,k} = \Pr[Y_{k+1} = j \mid Y_k = i]$, where $Y_{k+1}$ is the net inflow in the time period $[k, k+1)$. The problem is then expressed in canonical form as:

\[
\max_{X_{\nu,i,d,k}} \sum_{\nu,i,d,k} R_{\nu,i,d,k} X_{\nu,i,d,k},
\]


subject to
\[
\sum_{\nu,i,d} X_{\nu,i,d,k} = 1 \quad \text{for all } k,
\]
\[
\sum_{e=\nu+j-C}^{\nu+j} X_{\nu,j,e,k+1} = \sum_{i} \sum_{\nu} \sum_{d=\nu+i-C}^{\nu+i} X_{\nu,i,d,k} P_{i,j,k} \quad \text{for all } \nu, j, k,
\]
\[
0 < X_{\nu,i,d,k} < 1 \quad \text{for all } \nu, i, d, k,
\]

where $e$ and $Z_{k+1} = \nu$ are the release and storage volume at time $k+1$, respectively.

This method can be used to calculate probability-based operating policies using
\[
\Pr[d \mid \nu, i, k] = \frac{X_{\nu,i,d,k}}{\sum_d X_{\nu,i,d,k}}.
\]
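As a small illustration, assuming the optimal $X_{\nu,i,d,k}$ values are held in a hypothetical 4-dimensional array, the operating policy is just a normalisation over the decision axis:

```python
import numpy as np

def release_policy(X):
    """Conditional release policy Pr[d | nu, i, k] from a hypothetical
    4-d array X[nu, i, d, k] of optimal point probabilities; assumes
    every (nu, i, k) combination carries positive probability mass."""
    return X / X.sum(axis=2, keepdims=True)
```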

The major drawback of this method is that the number of constraints can easily reach into the thousands (Yeh, 1985).

5.3 Stochastic Dynamic Programming - Single Reservoir Models

Stochastic dynamic programming (SDP), in the hydrological literature (Reznicek and Cheng, 1991), refers to the value iteration method of solving Markov decision processes as presented in Section 4.2. This modelling technique allows for the solution of both non-linear and stochastic elements for reservoir management. Most of the time these models can be solved using the policy iteration technique (Section 4.3) as well.

5.3.1 Stochastic Dynamic Programming

If $v_k(Z_k)$ is the expected value at time $k$ of the reservoir having storage $Z_k$, the problem is to solve
\[
v_k(Z_k) = \max_{d_k \in D} \left\{ \sum_{Y_{k+1}} \Pr[Y_{k+1}] \left[ R(d_k, Y_{k+1}, Z_k) + v_{k+1}(Z_{k+1}) \right] \right\},
\]
subject to
\[
0 \le Z_{k+1} = Z_k + Y_{k+1} + \Delta(d_k) \le C,
\]


where $Y_{k+1}$ is the inflow from $(k, k+1]$, $\Delta(d_k)$ is the net effect of the decision $d_k$ decided at time $k$ (typically negative unless the reservoir is a pump-storage), and $C$ is the capacity of the reservoir.

Alternatively, some authors let
\[
\lambda(d_k, Z_k) = \mathrm{E}_{Y_{k+1}}\!\left[ R(d_k, Y_{k+1}, Z_k) \right]
\]

and then solve
\[
v_k(Z_k) = \max_{d_k \in D} \left\{ \lambda(d_k, Z_k) + \sum_{Y_{k+1}} \Pr[Y_{k+1}] \, v_{k+1}(Z_{k+1}) \right\},
\]
subject to
\[
0 \le Z_{k+1} = Z_k + Y_{k+1} + \Delta(d_k) \le C.
\]

Both of these problems can be solved using the formulation presented in Chapter 4. The basic model for SDP with a single reservoir system is as follows:

• $X$ represents the level of storage in the reservoir,

• $D$ represents the set of decisions that can be applied to the system (typically releases),

• $P$ gives the probability of the net change in storage (inflow minus release minus evaporation) after one time step,

• and $R$ represents the rewards for the storage/decision pairs.

Here the states of the Markov chain $X$ will represent discretised partitions of the reservoir storage, so $Z_k \in X = \{0, 1, 2, \ldots, C-1, C\}$.

The probability $p_{ij}(d)$ shall denote the time-homogeneous probability of going from $Z_k = i \in X$ to $Z_{k+1} = j \in X$ under release decision $d_k \in D$.

When $0 < Z_{k+1} < C$, the probability of the change in storage is given by
\[
\Pr[Z_{k+1} \mid Z_k] = \Pr[Z_k + Y_{k+1} + \Delta(d_k) \mid Z_k] = \Pr[Y_{k+1}].
\]
As a result the probability $p_{i,j}(d_k)$ for $0 < j < C$ is given by
\[
p_{i,j}(d_k) = \Pr[Y_{k+1} = j - i - \Delta(d_k)].
\]

When $Z_{k+1} \ge C$ then
\[
p_{i,C}(d_k) = \sum_{Z_{k+1} \ge C} \Pr[Z_{k+1} \mid Z_k] = \sum_{Y_{k+1} \ge C - i - \Delta(d_k)} \Pr[Y_{k+1}].
\]


Similarly, when $Z_{k+1} \le 0$,
\[
p_{i,0}(d_k) = \sum_{Z_{k+1} \le 0} \Pr[Z_{k+1} \mid Z_k] = \sum_{Y_{k+1} \le -i - \Delta(d_k)} \Pr[Y_{k+1}].
\]

Further constraints are typically applied using $R$ and $D$. States that are possible but undesirable are given a large negative reward or penalty. States that should never be entered are excluded by making the decisions that would lead to them unavailable.

Once an optimal policy $u^*$ has been found, the matrix $P(u^*)$ describing the transition probabilities of the system operated under this optimal policy can be constructed by selecting the rows of $P$ that correspond to the decision being implemented. The steady state vector for operating under this policy can be found by solving $\pi P(u^*) = \pi$ and $\pi \mathbf{1} = 1$. If the column vector $r(u^*)$ is constructed by putting $R(u^*_i)$, the reward for being in state $i$ under the optimal decision policy, as the $i$th element, the expected gain $g$ for the system can be found by $g = \pi r(u^*)$.
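For illustration, a minimal Python sketch of these two calculations is given below: the construction of $p_{ij}(d)$ with the boundary mass absorbed into states 0 and $C$, and the evaluation of the expected gain under a fixed policy. All names are illustrative assumptions.

```python
import numpy as np

def transition_matrix(inflow_pmf, C, delta):
    """p_ij(d) for a decision with integer net effect delta = Delta(d);
    probability mass beyond the boundaries is absorbed into states 0
    (empty) and C (full), as in the two boundary sums above."""
    P = np.zeros((C + 1, C + 1))
    for i in range(C + 1):
        for y, py in enumerate(inflow_pmf):
            j = min(max(i + y + delta, 0), C)
            P[i, j] += py
    return P

def expected_gain(P_by_d, r_by_d, policy):
    """Gain g = pi r(u*): select the row of P (and the reward) matching
    each state's decision, then solve pi P(u*) = pi with pi 1 = 1."""
    n = len(policy)
    P = np.array([P_by_d[policy[i]][i] for i in range(n)])
    r = np.array([r_by_d[policy[i]][i] for i in range(n)])
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    pi, *_ = np.linalg.lstsq(A, np.append(np.zeros(n), 1.0), rcond=None)
    return pi @ r
```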

[Figure: expected gain (\$1,000; vertical axis, 350 to 450) plotted against the number of storage states (horizontal axis, 5 to 20).]

Figure 5.1: Reconstruction of the results of Goulter and Tai (1985) on the effect of storage space discretisation on the expected gain.

Goulter and Tai (1985) looked at the effect of the discretisation of the storage on the optimal policy. They found that a smaller number of storage states produced an optimal solution with a lower expected gain (see Figure 5.1) and that the stationary distribution $\pi$ would be skewed, with more probability mass at the higher storage states. This artificial skewing was found to be greatly reduced when at least nine storage levels were used.


5.3.2 Seasonal Stochastic Dynamic Programming Models

Butcher (1971) suggests a method for modelling reservoirs which have a seasonal component to them using SDP. In their proposed model, a decision process is expanded to
\[
M = (X, D, P_0, \ldots, P_{s-1}, R_0, \ldots, R_{s-1}),
\]
where each $P_l$ is the probability transition function at time $l = (k \bmod s)$ for a transition over the time interval $(k-1, k]$ and $R_l$ is the reward function at time $l = (k \bmod s)$, where $s$ is the number of seasons in the cycle.

As the matrices are now time dependent, Equation (4.1) becomes
\[
v^*_k(i) = \max_d \left\{ \lambda_{l,i}(d) + \sum_j p_{l,ij}(d) \, v^*_{k+1}(j) \right\}. \tag{5.2}
\]

As the probability transition matrices and rewards are now dependent on time, this is also sometimes referred to as a time-dependent Markov decision process.

When solving Equation (5.2) by backwards induction, in general $u_{n+1} \neq u_n$, as $n$ and $n+1$ are in different seasons. The new stopping rule becomes: stop iterating when $u_{n+m} = u_{n+m+s}$ for all $m = 0, 1, \ldots, s-1$, that is, when the policy is the same for the equivalent season in two successive years.

Matrix analytic methods give us a structured framework that allows us to see that for every seasonal MDP there exists an equivalent cross-seasonal MDP. By letting

\[
P^* = \begin{pmatrix}
0 & P_1 & 0 & \cdots & 0 & 0\\
0 & 0 & P_2 & \ddots & 0 & 0\\
\vdots & \vdots & \ddots & \ddots & \ddots & \vdots\\
0 & 0 & \ddots & \ddots & P_{s-2} & 0\\
0 & 0 & 0 & \ddots & 0 & P_{s-1}\\
P_0 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix}
\]
and a seasonal construction for $R^*$; by expanding the state space $X^* = [X_0, X_1, \ldots, X_{s-1}]$, such that $X_k$ is a collection of states in $X$ for season $k$ of the seasonal MDP,
\[
M = (X, D, P_0, \ldots, P_{s-1}, R_0, \ldots, R_{s-1}) \equiv (X^*, D, P^*, R^*).
\]

This Markov chain is periodic with period $s$ and as a result is not ergodic; therefore the policy iteration method cannot be used to solve the seasonal problem.
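For illustration, the block matrix $P^*$ can be assembled mechanically from the seasonal matrices; a minimal Python sketch (names illustrative) is:

```python
import numpy as np

def cross_seasonal(P_seasons):
    """Assemble P* from the seasonal matrices [P_0, ..., P_{s-1}]:
    block row l carries P_{l+1} in the superdiagonal block, and P_0
    wraps around in the bottom-left corner."""
    s, n = len(P_seasons), P_seasons[0].shape[0]
    P_star = np.zeros((s * n, s * n))
    for l in range(s - 1):
        P_star[l * n:(l + 1) * n, (l + 1) * n:(l + 2) * n] = P_seasons[l + 1]
    P_star[(s - 1) * n:, :n] = P_seasons[0]
    return P_star
```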

5.3.3 Serially Correlated and Forward Forecast Stochastic Dynamic Programming Models

As described in Yeh (1985), a serially correlated SDP can be written as follows:


\[
v_k(Z_k, Y_k) = \max_{d_k \in D} \left\{ \lambda(d_k, Z_k) + \sum_{Y_{k+1}} \Pr[Y_{k+1} \mid Y_k] \, v_{k+1}(Z_{k+1}, Y_{k+1}) \right\}, \tag{5.3}
\]
subject to
\[
0 \le Z_{k+1} = Z_k + Y_{k+1} + \Delta(d_k) \le C,
\]
where $\lambda(d_k, Z_k)$ is the reward for the release and storage, and $v_k(Z_k, Y_k)$ is the expected return of the system at time $k$ with $Z_k$ units of storage and a net inflow of $Y_k$ over the time period $(k-1, k)$. This equation can be solved using value iteration from a time horizon that is $n$ time steps away, where $v_0(Z_0, Y_0)$ is the value of having $Z_0$ units of storage at the end of the system's life and $Y_0$ units of inflow in the preceding time interval, by substituting $k = n$, $k+1 = n-1$, and $k-1 = n+1$ into Equation (5.3).

Loucks and Falkson (1970) carried out a comparison of a reservoir operation model with serially correlated inflows using stochastic linear, dynamic and policy iteration methods to find the optimal operating policy. All three achieved the same optimal policy, but the stochastic dynamic programming approach was the fastest.

Stedinger et al. (1984) made an alteration to Equation (5.3) to produce,

\[
v_k(Z_k, Y_k) = \max_{d_k \in D} \left\{ \lambda(d_k, Z_k) + \sum_{Y_{k+1}} \Pr[Y_{k+1} \mid Y_k, \mathbf{x}_k] \, v_{k+1}(Z_{k+1}, Y_{k+1}) \right\}, \tag{5.4}
\]
subject to
\[
0 \le Z_{k+1} = Z_k + Y_{k+1} + \Delta(d_k) \le C,
\]

where $\mathbf{x}_k$ is all other information available at time $k$. This is a forward forecasting model into which any other hydrological or meteorological information can be incorporated, and Stedinger et al. (1984) found that in simulations this method outperformed normal serially correlated inflow models. One major drawback of this method is that it cannot be used to determine the steady-state operating policies that are needed to manage a resource in the long term; however, it has found widespread use in short-term planning. Røtting and Gjelsvik (1992), for example, discuss the operation of Norway's 38 connected hydroelectric-dam system, where forward forecasting of two weeks is done using simulations to determine the best short-term operating policy within the constraints of a longer-term management plan.

Both of these methods have achieved an increase in operating performance by expanding the state space from the number of capacity states $C+1$ (the usable capacity $C$ plus a dead storage level 0) to $(C+1) \times \max\{Y_k\}$, assuming that $Y_k$ has a finite maximal value that can be known in advance.

Turgeon (2005) proposed a method of incorporating further lagged correlation without expanding the state space too greatly. First consider the $p$-lag auto-correlation of inflows,
\[
Y_k = \phi_{0,k} + \phi_{1,k} Y_{k-1} + \phi_{2,k} Y_{k-2} + \cdots + \phi_{p,k} Y_{k-p} + \varepsilon_k,
\]


where $\varepsilon_k$ is the error term. The term $H_k$ is defined as
\[
H_k = \phi_{1,k} Y_{k-1} + \phi_{2,k} Y_{k-2} + \cdots + \phi_{p,k} Y_{k-p},
\]
so that
\[
Y_k = \phi_{0,k} + H_k + \varepsilon_k.
\]
Define
\[
\bar{H}_{k+1} = \phi_{2,k+1} Y_{k-1} + \cdots + \phi_{p,k+1} Y_{k-p+1},
\]
so that
\[
H_{k+1} = \phi_{1,k+1} Y_k + \bar{H}_{k+1}.
\]

It can then be shown that
\[
\mathrm{E}(\bar{H}_{k+1} \mid H_k) = \bar{m}_{k+1} + \rho_{k,k+1} \frac{\bar{\nu}_{k+1}}{\nu_k} (H_k - m_k),
\]
where:
\begin{align*}
m_k &= \mathrm{E}(H_k), & \bar{m}_{k+1} &= \mathrm{E}(\bar{H}_{k+1}),\\
\nu_k &= \mathrm{Var}(H_k), & \bar{\nu}_{k+1} &= \mathrm{Var}(\bar{H}_{k+1}),\\
\rho_{k,k+1} &= \mathrm{Cor}(H_k, \bar{H}_{k+1}).
\end{align*}

In the formulation given in Turgeon (2005), Equation (5.4) becomes
\[
v_k(Z_k, H_k) = \max_{d_k \in D} \left\{ \lambda(d_k, Z_k) + \sum_{Y_{k+1}} \Pr[Y_{k+1} \mid Y_k, H_k] \, v_{k+1}(Z_{k+1}, H_{k+1}) \right\},
\]
where $H_{k+1}$ is given by
\[
H_{k+1} = \phi_{1,k+1} Y_k + \bar{m}_{k+1} + \rho_{k,k+1} \frac{\bar{\nu}_{k+1}}{\nu_k} (H_k - m_k).
\]

The state space is now expanded by a factor of $\max\{H_{k+1}\}$, which is of the same order as $\max\{Y_{k+1}\}$.
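As a small illustration, the update of the composite hidden variable is a one-line computation once the moments have been estimated; a hedged Python sketch (all argument names illustrative) is:

```python
def next_H(phi1, Y_k, m_bar, rho, nu_bar, nu, H_k, m):
    """Turgeon's hidden-variable update: H_{k+1} = phi1 * Y_k plus the
    conditional expectation of the remaining lags given H_k. All
    moments are assumed to have been estimated beforehand."""
    return phi1 * Y_k + m_bar + rho * (nu_bar / nu) * (H_k - m)
```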

5.4 Stochastic Dynamic Programming - Multi-Reservoir Models

Extending stochastic dynamic programming to a system of more than one reservoir would initially appear straightforward. Let the system consist of $N$ reservoirs labelled $n = 1, \ldots, N$, where typically they will be labelled from the most upstream down. Denote each of the reservoir properties at time $k$ using $Z^n_k$ for storage, $Y^n_{k+1}$ for future stochastic inflow, $d^n_k$ for decision, and $C^n$ for capacity, for each $n \in \{1, \ldots, N\}$.

Define:


• $\mathbf{Z}_k = (Z^1_k, \ldots, Z^N_k)$, the $N$-tuple of all the storages at time $k$,

• $\mathbf{Y}_k = (Y^1_k, \ldots, Y^N_k)$, the $N$-tuple of all the inflows at time $k$,

• $\mathbf{d}_k = (d^1_k, \ldots, d^N_k)$, the $N$-tuple of all the decisions at time $k$,

• $V_k(\mathbf{Z}_k) = \sum_n v(Z^n_k)$, the expected future reward at time $k$ for all $N$ storages,

• $\Lambda_k(\mathbf{d}_k, \mathbf{Z}_k) = \sum_n \lambda(d^n_k, Z^n_k)$, the total expected reward at time $k$ for the decision.

Also define $\delta(d^n_k)$ as the release at reservoir $n$ due to decision $d^n_k$, and $\delta(d^{\vec{n}}_k)$ as the releases from all the reservoirs directly upstream of $n$, i.e., only reservoirs whose release will first flow into $n$ before any other.

Optimising the system as a whole is then just a matter of recursively solving
\[
V_k(\mathbf{Z}_k) = \max_{\mathbf{d}_k} \left\{ \Lambda(\mathbf{d}_k, \mathbf{Z}_k) + \sum_{\mathbf{Y}_{k+1}} \Pr[\mathbf{Y}_{k+1}] \, V_{k+1}(\mathbf{Z}_{k+1}) \right\},
\]
subject to
\[
0 \le Z^n_{k+1} = Z^n_k + Y^n_{k+1} + \delta(d^{\vec{n}}_k) - \delta(d^n_k) \le C^n.
\]

Whilst this may seem straightforward, this problem is exceptionally large. The new state space is of size $C^1 \times C^2 \times \cdots \times C^N$, and with seasons it will be $s$ times larger again. The state space becomes larger again if any lag-correlation is included. Similarly, the number of decisions will be the Cartesian product of the numbers of decisions that can be enacted on each reservoir, and the inflow distribution has gone from a one-dimensional distribution that is to be summed over to an $N$-dimensional distribution where all possible discretised $N$-tuples are to be summed over, again the Cartesian product of all the maximal inflows to all $N$ storages. Opinions vary on the number of storages that can be handled with this direct method: Turgeon (1981) puts it at four storages, whereas Lamontagne and Stedinger (2013) says three. Labadie (2004) gives an overview of methods for dealing with multi-reservoir modelling.

Turgeon (1980), Turgeon (1981), Archibald et al. (1997), and Archibald et al. (2006) have developed an effective method for dealing with multi-reservoir systems: aggregation and decomposition. In this method reservoirs are optimised one at a time to determine their target release, with the other reservoirs being aggregated to form composite reservoirs. Archibald et al. (2006) suggests a 4-group method, with the target reservoir as a single state, all reservoirs upstream composited into a single state, all reservoirs parallel to the target composited as a single state, and all reservoirs downstream composited as a single state. After the target release policy is determined for the target reservoir, a new target reservoir is selected, and the previous target is composited into the appropriate composite reservoir. This process is repeated until it converges on a policy that is feasible. These aggregated methods are efficient in that they consider no more than 4 storages at one time; however, they have the drawback of finding a policy that has an expected monetary value slightly below the maximum (Archibald et al., 1997).


5.5 Hidden Markov Models

Early modelling of rainfall with Markov chains used basic observed state models, for example Gabriel and Neumann (1962) and Stern and Coe (1984). Here the probability of rain is conditional on whether or not it had rained on the previous one, two, or three days, depending on the number of states included in the model. Whilst models with an increased state space are useful, they have an associated risk of over-fitting and therefore of not having any predictive power, as they only explain the observed data. Another drawback of this type of modelling is that each additional time-step of lag requires a doubling of the state space. Hidden Markov models are advantageous in that they have a similar structure without prescribing any physical meaning to the states, thus allowing for a more general structure.

Hidden Markov models appear to have been introduced to the hydrological literature by Zucchini and Guttorp (1991), then further expanded in MacDonald and Zucchini (1997), and have found wide-spread application in the hydrological literature, for example Thyer and Kuczera (2003a, 2003b), Lambert et al. (2003), and Akintug and Rasmussen (2005). However, these have typically focussed on a continuous observation distribution, as rainfall is normally treated as a continuous variable. Akintug and Rasmussen (2005) give a fairly typical example of this type of modelling, letting $Y_k$ be the observed rainfall modelled by
\[
Y_k = \mu_{J_k} + \sigma_{J_k} \varepsilon, \qquad \varepsilon \sim N(0, 1),
\]

where $J_k$ is an underlying hidden Markov process on a state space $X$, where each state $i \in X$ has its own mean rainfall, $\mu_i$, and standard deviation, $\sigma_i$. These models are not restricted to the normal distribution. Other distributions, such as the log-normal distribution (Lambert et al., 2003), can be used, where each state of the Markov chain defines its own rainfall distribution dependent on that state.
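For illustration, a minimal Python sketch that simulates rainfall from such a model is given below; the function name, initialisation and parameters are illustrative assumptions.

```python
import numpy as np

def simulate_rainfall(P, mu, sigma, steps, seed=0):
    """Simulate Y_k = mu_{J_k} + sigma_{J_k} * eps, with J_k a Markov
    chain stepping according to the rows of the transition matrix P."""
    rng = np.random.default_rng(seed)
    n = len(mu)
    j = rng.integers(n)                    # arbitrary initial state
    Y = np.empty(steps)
    for k in range(steps):
        j = rng.choice(n, p=P[j])          # hidden Markov step
        Y[k] = mu[j] + sigma[j] * rng.standard_normal()
    return Y
```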

Hidden Markov models are often used to model climatic states and their effects on rainfall, such as the El-Nino/La-Nina phenomena, which dominates the climatic conditions in the southern Pacific region and influences the rainfall on eastern Australia, as shown extensively in Whiting (2003, 2004, and 2005).

However, as previously mentioned, all these models assume that the rainfall or river flows are continuous variables; to be incorporated into a stochastic dynamic programming model, such as those laid out in Sections 5.3 and 5.4, they require discretised observations to apply to the hidden Markov chain, such as those shown in Section 3.3.


Chapter 6

Synthesis

Having completed a review of the historical and state-of-the-art hydrological literature in Chapter 5, this thesis will demonstrate how some of the innovative Markov modelling techniques discussed in Chapter 3 can be combined with the Markov decision processes outlined in Chapter 4 to great benefit.

The first aspect that will be explored is whether the phase-structure makes a noticeable difference in the solution to Markov decision processes and, if so, how much. The inclusion of phases into the stochastic dynamic programming models shown in Sections 5.3 and 5.4 should improve fitting and predictive power. However, as previously mentioned, as with all fitting procedures, they can lead to over-fitting, which results in a model which describes too much of the already realised data. To prevent over-fitting, initially the number of phases is limited to a number of physically measurable states. The Southern Oscillation Index (SOI), which measures the normalised difference in air-pressure between Tahiti and Darwin, is used as a phase description for the purpose of phase-type construction. This inclusion of phases is demonstrated to be beneficial, and the results are shown in the paper Fisher et al. (2008) in Section 6.1.

Having established that phases have a beneficial effect when incorporated into a model, the thesis then investigates the benefits that matrix analytic models provide to hydrology and water resource management. Hydrology is already a well established field with many techniques and models of its own; however, the models from the matrix analytic literature can extend the field further. For example, Cigizoglu et al. (2002) proposed a complicated model for rivers with spate flows which would be unsuitable for inclusion in a Markov decision process, and a Markov chain based model would be of great benefit. Matrix analytic methods provide such constructions; in this case a Markov modulated arrival process as described in Latouche and Ramaswami (1999) is appropriate, and tools already exist for its fitting (Ryden, 2000). The results from these considerations are shown in the paper Fisher et al. (2010) in Section 6.2.

The inclusion of hidden phases into the models appears essential if the full computational capabilities of matrix analytic methods are to be realised in hydrology. However,


what effects does having hidden phases have on the modelling and implementation of a Markov decision process? Whilst hidden Markov models have been used in hydrology before (see Section 5.5), they have been based on continuous variables and have not been used in conjunction with MDPs. The thesis shows directly how a system operates under a model with hidden information and how it compares to an equivalent known phase model. The results of this research are shown in the paper Fisher et al. (2012) in Section 6.3.

The final question that has been explored in this thesis is: can the first-passage time distribution be used as an operating criterion and, if so, how would this impact the operating policies found using this criterion? The motivation for this question comes from Young and McColl (2005), who advocate that water resources be managed with the objective of maintaining the long-term viability of the storage system first, and then proportioning the releases based on the economic values placed on the water after basic human needs have been met. The objective criterion is then to maximise the number of steps until any undesirable state is reached, or to minimise the number of steps until a target state is reached. This obviously has an impact on the economic output of the system, as the operating policy will not be chosen to maximise expected monetary value, which has been the focus of all previous research shown in Sections 5.3 and 5.4. The results of this research are shown in the paper Fisher et al. (2014) in Section 6.4.

The results as indicated above are now presented as a collection of papers as published in the literature.


6.1 Optimal Control of Multi-Reservoir Systems with Time-Dependent Markov Decision Processes

Paper highlights:

• A seasonal (time-dependent) Markov decision process conditioned on the Southern Oscillation Index is applied to the Wivenhoe Reservoir, a flood-control and water-storage reservoir in Southern Queensland.

• The method is extended to Wivenhoe and Somerset as a 2-reservoir system.

The paper Optimal Control of Multi-Reservoir Systems with Time-Dependent Markov Decision Processes (Fisher et al., 2008) establishes the use of a phase-structure in a Markov decision process where transition probabilities change with time. Recall from Section 3.2.1 that a matrix analytic model describes a doubly stochastic process $\{(N_k, J_k) : k \in \mathbb{Z}^+\}$.

The first model is a single reservoir storage with stochastic inflows from a river along with a water recycling transfer. Here the levels of the reservoir storage are represented by the level process $N_k$ of the MAM model. The transition probabilities are conditioned on an underlying hydrological state, the Southern Oscillation Index (SOI), which is used as a proxy measure of the El-Nino/La-Nina phenomenon that affects the prevailing weather conditions on the Eastern seaboard of Australia. The SOI is included as a phase process within each level of the reservoir model.

The second model incorporates the upstream reservoir, Somerset, which flows into the lower reservoir. Here an approximation using a triply stochastic system $\{(N^2_k, N^1_k, J_k) : k \in \mathbb{Z}^+\}$ is used to model the system, where $N^2_k$ is dependent on $N^1_k$, the level of the upstream dam, and on $J_k$, the independent hydrological state determined by the SOI.

The matrix analytic model leads to better performance due to its ability to incorporate the additional information of the SOI in the structure.


NOTE:

This publication is included on pages 73-82 in the print copy of the thesis held in the University of Adelaide Library.

Fisher, A.J., Green, D.A., Metcalfe, A.V. & Webby, R.B. (2008) Optimal control of multi-reservoir systems with time-dependent Markov decision processes. In: Water Down Under.


6.2 Managing River Flow in Arid Regions with Matrix Analytic Methods

Paper highlights:

• Matrix analytic methods used for modelling arid flow in Cooper Creek, conditioning on both hidden states and known states.

• Discussion of phase-type distribution in the context of Markov decision processes.

The paper Managing River Flow in Arid Regions with Matrix Analytic Methods (Fisher et al., 2010) introduces further concepts of matrix analytic methods to the hydrological literature.

In the first part of the paper a matrix analytic model for an ephemeral river flow is considered, with the levels ($N_k$) of the process representing the current flow in the river. The river had long sojourn times between spates of flow arrivals, which are modelled using a discrete phase-type distribution. The flows are then modelled as a batch arrival process using historical count data conditioned on the previous arrival. The spates have two phases, described by the phase process $J_k$, which determines whether the last flow was an increase or a decrease on the previous flow. This phase process induces a correlation structure on the size of the flows.

In the second part of the paper, a model is used to demonstrate the phase structure, where counting processes were used to fit the model parameters that give a direct physical meaning to the phases. The use of general fitting with expectation-maximisation algorithms is further developed and shown in Section 3.4.


Fisher, A.J., Green, D.A. & Metcalfe, A.V. (2010) Managing river flow in arid regions with matrix analytic methods. Journal of Hydrology, v. 382(1-4), pp. 128-137

NOTE:

This publication is included on pages 85-94 in the print copy of the thesis held in the University of Adelaide Library.

It is also available online to authorised users at:

http://doi.org/10.1016/j.jhydrol.2009.12.023


6.3 Modelling of Hydrological Persistence for Hidden State Markov Decision Processes

Paper highlights:

• Comparison of a decision process based on a hidden Markov model with a decision process based on the climatic indicator given by the Southern Oscillation Index, for the Wivenhoe Reservoir in Southern Queensland.

The paper Modelling of Hydrological Persistence for Hidden State Markov Decision Processes (Fisher et al., 2012) is a comparative study of known versus hidden phases in Markov decision processes, in particular where it relates to hydrological applications. The use of state variables is common in hydrological applications, but in some situations such a variable is not always available.

In this paper, a hidden state Markov decision process (HSMDP) is compared to a model constructed using an assumed dependence on the Southern Oscillation Index. The two models are compared in the context of decision processes for releases from the Wivenhoe and Somerset system of reservoirs in Southern Queensland.

The paper finds an almost identical result for the HSMDP as for the one informed by the SOI information; however, in one instance the HSMDP was surprisingly less conservative. In a practical application caution would be needed in implementing planning based primarily on hidden phase information, as the hidden phase can only be estimated as a probability, and if the policy is less conservative it could lead to more failures of the system.


NOTE:

This publication is included on pages 97-106 in the print copy of the thesis held in the University of Adelaide Library.

It is also available online to authorised users at:

http://doi.org/10.1007/s10479-011-0992-2

Fisher, A., Green, D. & Metcalfe, A. (2012) Modelling of hydrological persistence for hidden state Markov decision processes. Annals of Operations Research, v. 199(1), pp. 215-224


6.4 First-Passage Time Criteria for the Operation of Reservoirs

Paper highlights:

• The Abberton and Hanningfield two-reservoir system is studied using seasonal Markov decision processes.

• First-passage time criteria are introduced into Markov decision processes for hydrological applications.

• Pareto optimisation is used to compare first-passage time criteria to expected monetary value.

The paper First-Passage Time Criteria for the Operation of Reservoirs (Fisher et al., 2014) takes the work of Jianyong and Huang (2001) and looks at it in both a matrix analytic methods and a hydrological setting. Jianyong and Huang (2001) proposed using the first-passage time until a system reaches failure as a criterion for optimising its operating policy. However, they did not appear to notice the link between the first-passage time of a Markov chain and the standard Bellman equation used in calculating the reward.

Consider
\[
v_k(Z_k) = \max_{d_k \in D} \left\{ \lambda(d_k, Z_k) + \sum_{Y_{k+1}} \Pr[Y_{k+1}] \, v_{k+1}(Z_{k+1}) \right\},
\]

where, if $v_k(Z_k) = 0$ for all $k$ for the states in which $Z_k$ is a failure of the reservoir system, and $\lambda(d_k, Z_k) = 1$, a system of equations that calculates the first-passage times for the system is obtained. This can be used iteratively to find a policy which maximises the first-passage times. These first-passage times can also be calculated with the phase-type distribution, as is discussed in the paper.
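For a fixed policy, the first-passage times need no iteration at all; a minimal Python sketch (names illustrative) that solves the resulting linear system $v = \mathbf{1} + Pv$ on the non-failure states is:

```python
import numpy as np

def expected_first_passage(P, failure):
    """Expected number of steps to first reach a failure state under a
    fixed policy: v = 1 + P v on the non-failure states, v = 0 on the
    failure states (the recursion above with lambda = 1 and no max)."""
    ok = ~np.asarray(failure)
    Q = P[np.ix_(ok, ok)]               # sub-matrix avoiding failure states
    v = np.zeros(len(P))
    v[ok] = np.linalg.solve(np.eye(Q.shape[0]) - Q, np.ones(Q.shape[0]))
    return v
```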

Preliminary research focussed on maintaining the reservoir amenity outside failure, either by overflow or by the reservoir reaching dead storage. This, however, made for predictable results. Instead, applications as part of multi-objective programming were considered for the paper.

In multi-objective programming there are competing reward structures to be considered. Typically policy decisions that are optimal for one reward structure will not be optimal for another, so "compromise" policies will be needed. Pareto optimality is instead the criterion used to determine whether a policy should be considered, which means that policies that are somewhat sub-optimal for each criterion individually will need to be found. This has typically been done using genetic algorithms, following on from Huang and Guo (2009), or simulated annealing (Teegavarapu and Simonovic, 2002). However, in the paper a simpler method based on the Metropolis-Hastings algorithm is used instead. Using the Metropolis-Hastings algorithm causes the sampling to take place where the decision space is densest, and as a result it is seen that perturbations do not have a large effect on the rewards, with only minimal losses.


Fisher, A.J., Green, D.A., Metcalfe, A.V. & Akande, K. (2014) First-passage time criteria for the operation of reservoirs. Journal of Hydrology, v. 519(Part B), pp. 1836-1847

NOTE:

This publication is included on pages 109-120 in the print copy of the thesis held in the University of Adelaide Library.

It is also available online to authorised users at:

http://doi.org/10.1016/j.jhydrol.2014.09.061


Chapter 7

Conclusion

In Chapter 6 five questions were proposed:

1. What benefits can matrix analytic models provide to hydrology and water resource management?

2. Does the phase-structure make a noticeable difference in the solution to Markov decision processes and, if so, how much?

3. What effects does having hidden phases have on the modelling and implementation of a Markov decision process?

4. How would a system operating under a model with hidden information perform, and would it be severely detrimental compared to an equivalent known phase model?

5. Can the first-passage time distribution itself be an operating criterion and, if so, how would this impact the operating policies found using this criterion?

The four published papers (Fisher et al. 2008, 2010, 2012, and 2014) addressed each of these questions.

Benefits of Matrix Analytic Methods

Matrix analytic methods provide more generalised Markov chain modelling techniques that improve upon the standard single state per level chain, with multiple phases per level incorporating additional information into the modelling process: for example, lag-one correlations (Fisher et al., 2010), climatic states (Fisher et al., 2010), or structural states that had no physical meaning but allowed for different flow probabilities (Fisher et al., 2012). Fisher et al. (2012) dealt with this question in the most detail, as it was the first paper to propose that hydrology could be done in a matrix analytic methods framework. The paper put forward three ideas in which matrix analytic methods could benefit hydrology.


The first was the use of the phase-type distribution to measure inter-arrival times. The phase-type distribution is advantageous in this respect as the discrete-time form can be made to fit very general distributions and is easily incorporated into a Markov decision process if desired. Its main disadvantage is that it can often require a large number of phases to fit data with any degree of accuracy. For example, Faddy (1998) required 397 phases for a sufficiently good fit of the inter-arrival times of eruptions of the Old Faithful geyser in Yellowstone Park, something that can be satisfactorily fitted with a mixture of two normal distributions for a total of five parameters (two means, two standard deviations and the mixture probability). However, many of the parameters in Faddy's model were restricted to either zero or to being equal, leaving just four parameters to estimate, meaning that it was smaller in that respect even though it was larger in analytic expression and computer storage terms.

The second proposed idea was the use of phases with a direct physical interpretation to model a lag-one correlation of whether the flow had increased or decreased previously. The incorporation of this information improved the fit of the model, whilst keeping a Markov chain structure, but at the expense of doubling the state space. The results of fitting hidden phases shown in Akintug and Rasmussen (2005) imply that a three-phase model is sufficiently good for hydrological applications, although this is possibly context dependent and warrants further investigation. Fitting models also presents a challenge, as there are no readily available packages to do this, although Section 3.4 proposes methods that are adaptable enough to do this fitting.

The final idea presented in Fisher et al. (2010) was the recasting of the seasonal models proposed by Butcher (1971) as a periodic Markov chain. The computational advantage of this conceptualisation was dealt with in Fisher et al. (2014), where algorithms exploiting this structure were presented.

Incorporation of Known and Hidden Phases into Markov Decision Processes

Questions two through to four relate to the incorporation of phases into Markov decision processes and were dealt with primarily in Fisher et al. (2008) and Fisher et al. (2012). The first question regarding the inclusion of known phases in the Markov decision process model gave the intuitive result that the model performed better with more information. In the same way, the Stedinger et al. (1984) forward-forecasting model discussed in Section 5.3 gives better management decisions by including meteorological predictions. However, the model in Fisher et al. (2008) has the benefit of a smaller state space expansion and, due to the El Nino/La Nina's long-term persistence, allows for longer time-steps in the model.

McMahon (2008) demonstrates that hidden phases included in a Markov decision process can potentially lead to an overall suboptimal operation of the system and that the current phase needs to be estimated in order for it to be implemented. In Fisher et al. (2012) it is shown that, by using hidden Markov models, the current hidden phase can be estimated with the Viterbi algorithm and implemented with only a 0.2% loss in expected monetary value in that particular example.


Incorporation of First-Passage Time Distributions into Markov Decision Processes

The final question deals with the use of first-passage time as an optimisation criterion. As previously mentioned in Chapter 6, the motivation comes from the fact that every Markov chain's first-passage times between states can be used either to minimise the expected number of time-steps to reach a target state (as done in Fisher et al. 2014) or to maximise the number of time-steps to reach an undesirable state. As mentioned in Fisher et al. (2014), when used as the only optimisation criterion, it performs best when used to keep the system within an operating window between two states. However, when used as part of a multi-objective optimisation in conjunction with expected monetary value, it can be used to find a policy that is a compromise between the two using a Pareto optimisation, with only a minimal difference below the optimal value found with either criterion in isolation.

7.1 Further Work

In all the applications considered in this thesis, the fundamental property of interest after establishing an appropriate model is the stationary or invariant distribution of the Markov chain that describes the system operating under a fixed policy. From this distribution, the long-term reward for the system can be calculated for a single or multi-objective optimisation. Calculating this distribution as efficiently as possible leads to the ability to construct more realistic models. In Fisher et al. (2014) (Section 6.4) this was addressed with methods for reducing the calculation time by one order of magnitude for monthly seasonal models. Extending this, another efficient method may have its beginnings somewhere in the earliest parts of the literature. The Moran Dam Model was designed primarily to find the capacity a reservoir would need to have in order to hold back an annual inundation and then release some fixed amount. The model's elegance lies in the fact that it is analytically similar to simple Markov models, most noticeably those of M/G/1 and G/M/1 type; this allowed the stationary distribution to be quickly calculated with just the knowledge of the inflow distribution. The Moran Dam Model uses a fixed release for all states; however, the system of equations can be amended to incorporate different release decisions for each state. Even with this alteration the Moran Dam Model fits into the matrix analytic framework, as it is still a skip-free process and can be represented as a quasi birth and death process.

Expanding the Moran Dam Model for multi-reservoir systems can be achieved by increasing the state space. The most natural way of doing this is by embedding one reservoir into the next downstream dam using a Kronecker product (this was done in both Sections 6.1 and 6.4). Without any state aggregation, this results in a matrix for a multi-reservoir system that is dimensionally very large. Goulter and Tai (1985) suggest that somewhere between 15 and 20 states is needed to describe a reservoir without compromising the effectiveness of the model. This means that the Markov chain for modelling $n$ reservoirs would need about $20^n$ states. In addition, any auto-correlation that needs to be included will further increase the state space by a factor equal to the dimension associated with the flow.

However, expressed as a QBD, the Moran Dam Model has a tri-diagonal block structure. The expanded embedded model also has a tri-diagonal block structure, with the actual transition matrix being very sparse. It is here that matrix analytic methods can be used to implement the policy iteration method of Section 4.3 or the stochastic methods of Section 4.4. Neuts (1981) and Latouche and Ramaswami (1999) provide a number of algorithms for calculating stationary distributions for matrix analytic models. Exploiting these well-defined structures to calculate the stationary distributions should allow for specialised Markov decision process algorithms that can go beyond the currently accepted four reservoir limitation of MDPs (as discussed in Section 5.4). Given that the Moran Dam Model can be put into the matrix analytic framework, this is a promising approach into the future.

Another critical issue in modelling hydrological systems is seasonality, which results in the Markov chain being non-ergodic, as it demonstrates periodic behaviour. Strictly speaking a periodic Markov chain has no limiting distribution; however, a stationary regime can be found, which has allowed policy iteration to be used to solve seasonal Markov decision processes in the literature. It is this seasonality that prevents the easy application of many established algorithms that rely on the ergodic property. For example, Piantadosi et al. (2010) only considered non-seasonal behaviour, whereas in Fisher et al. (2014) the algorithms used actually exploited the cyclic structure of the seasonal model in order to ease the calculations. Finding enhanced methods for exploiting both the cyclic pattern and the embedded tri-diagonal block structures in order to further ease the calculation of the stationary distribution is a task for the future.

The inclusion of more elaborate costings that include, for example, hydroelectricity generation, spot-costs, and renewable energy storage will also present new challenges, especially in the framework of multi-reservoir systems.


Appendix A

Supplementary Material for Chapter 3

A.1 Supplementary Material for Section 3.3

A.1.1 Demonstration of Equation (3.7)

Let $J_{k-1} = i$ for $i \in X$. Equation (3.6) can be rewritten as
\begin{align*}
\gamma_k(j) &= \max_{\vec{J}_{k-2},\, i} \left\{ \Pr[\vec{J}_{k-2}, J_{k-1} = i, J_k = j, \vec{Y}_k \mid P, Q, \boldsymbol{\pi}] \right\}\\
&= \max_{\vec{J}_{k-2},\, i} \left\{ \Pr[J_k = j, Y_k \mid \vec{J}_{k-2}, J_{k-1} = i, \vec{Y}_{k-1}, P, Q, \boldsymbol{\pi}] \right.\\
&\qquad\qquad \left. \times \Pr[\vec{J}_{k-2}, J_{k-1} = i, \vec{Y}_{k-1} \mid P, Q, \boldsymbol{\pi}] \right\}\\
&= \max_{\vec{J}_{k-2},\, i} \left\{ \Pr[Y_k \mid J_k = j, \vec{J}_{k-2}, J_{k-1} = i, \vec{Y}_{k-1}, P, Q, \boldsymbol{\pi}] \right.\\
&\qquad\qquad \times \Pr[J_k = j \mid \vec{J}_{k-2}, J_{k-1} = i, \vec{Y}_{k-1}, P, Q, \boldsymbol{\pi}]\\
&\qquad\qquad \left. \times \Pr[\vec{J}_{k-2}, J_{k-1} = i, \vec{Y}_{k-1} \mid P, Q, \boldsymbol{\pi}] \right\}.
\end{align*}


Now consider that the process $J_k$ is Markovian and that $Y_k$ is dependent only on $J_k$:

\begin{align*}
\gamma_k(j) &= \max_{\vec{J}_{k-2},\, i} \left\{ \Pr[Y_k \mid J_k = j, P, Q, \boldsymbol{\pi}] \Pr[J_k = j \mid J_{k-1} = i, P, Q, \boldsymbol{\pi}] \right.\\
&\qquad\qquad \left. \times \Pr[\vec{J}_{k-2}, J_{k-1} = i, \vec{Y}_{k-1} \mid P, Q, \boldsymbol{\pi}] \right\}\\
&= \max_{\vec{J}_{k-2},\, i} \left\{ q_{j Y_k} \, p_{ij} \Pr[\vec{J}_{k-2}, J_{k-1} = i, \vec{Y}_{k-1} \mid P, Q, \boldsymbol{\pi}] \right\}\\
&= \max_i \left\{ q_{j Y_k} \, p_{ij} \, \gamma_{k-1}(i) \right\}\\
&= \left[ \max_i \left\{ p_{ij} \, \gamma_{k-1}(i) \right\} \right] q_{j Y_k}.
\end{align*}

Hence Equation (3.7) is obtained.
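The recursion just derived is the core of the Viterbi algorithm. For illustration, a minimal Python sketch in log space (names illustrative; assumes strictly positive probabilities) is:

```python
import numpy as np

def viterbi_path(P, Q, pi0, Y):
    """Most likely hidden state path for observations Y, using the
    recursion gamma_k(j) = [max_i p_ij gamma_{k-1}(i)] q_{j,Y_k},
    computed in log space to avoid numerical underflow."""
    logP, logQ = np.log(P), np.log(Q)
    gamma = np.log(pi0) + logQ[:, Y[0]]
    back = []
    for y in Y[1:]:
        scores = gamma[:, None] + logP       # scores[i, j]
        back.append(scores.argmax(axis=0))   # best predecessor of each j
        gamma = scores.max(axis=0) + logQ[:, y]
    path = [int(gamma.argmax())]
    for b in reversed(back):                 # backtrack the optimal path
        path.append(int(b[path[-1]]))
    return path[::-1]
```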

A.1.2 Demonstration of Equation (3.9)

First rewrite Equation (3.8) as,

\begin{align*}
\xi_k(i, j) &= \Pr[J_k = i, J_{k+1} = j \mid Y, P, Q, \boldsymbol{\pi}]\\
&= \frac{\Pr[J_k = i, J_{k+1} = j, Y \mid P, Q, \boldsymbol{\pi}]}{\Pr[Y \mid P, Q, \boldsymbol{\pi}]}.
\end{align*}

Concentrating on the numerator,

\begin{align*}
&\Pr[J_k = i, J_{k+1} = j, Y \mid P, Q, \boldsymbol{\pi}]\\
&\quad= \Pr[J_k = i, \vec{Y}_k \mid P, Q, \boldsymbol{\pi}] \times \Pr[J_{k+1} = j, \vec{Y}^{k+1} \mid J_k = i, P, Q, \boldsymbol{\pi}]\\
&\quad= \Pr[J_k = i, \vec{Y}_k \mid P, Q, \boldsymbol{\pi}] \times \Pr[J_{k+1} = j \mid J_k = i, P, Q, \boldsymbol{\pi}] \times \Pr[\vec{Y}^{k+1} \mid J_{k+1} = j, P, Q, \boldsymbol{\pi}]\\
&\quad= \Pr[J_k = i, \vec{Y}_k \mid P, Q, \boldsymbol{\pi}] \times \Pr[J_{k+1} = j \mid J_k = i, P, Q, \boldsymbol{\pi}]\\
&\qquad\quad \times \Pr[Y_{k+1} \mid J_{k+1} = j, P, Q, \boldsymbol{\pi}] \times \Pr[\vec{Y}^{k+2} \mid J_{k+1} = j, P, Q, \boldsymbol{\pi}]\\
&\quad= \alpha_k(i) \, p_{ij} \, q_{j, Y_{k+1}} \, \beta_{k+1}(j).
\end{align*}


Likewise for the denominator,

\begin{align*}
\Pr[Y \mid P, Q, \boldsymbol{\pi}] &= \sum_{i=1}^{N} \sum_{j=1}^{N} \Pr[J_k = i, J_{k+1} = j, Y \mid P, Q, \boldsymbol{\pi}]\\
&= \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_k(i) \, p_{ij} \, q_{j, Y_{k+1}} \, \beta_{k+1}(j).
\end{align*}

This gives the result,
\[
\xi_k(i, j) = \frac{\Pr[J_k = i, J_{k+1} = j, Y \mid P, Q, \boldsymbol{\pi}]}{\Pr[Y \mid P, Q, \boldsymbol{\pi}]} = \frac{\alpha_k(i) \, p_{ij} \, q_{j, Y_{k+1}} \, \beta_{k+1}(j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_k(i) \, p_{ij} \, q_{j, Y_{k+1}} \, \beta_{k+1}(j)}.
\]

Hence Equation (3.9) is obtained.
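For illustration, Equation (3.9) can be evaluated directly once the forward and backward variables have been computed; a minimal Python sketch (names illustrative) is:

```python
import numpy as np

def xi_all(alpha, beta, P, Q, Y):
    """xi_k(i, j) for every k via Equation (3.9), assuming the forward
    variables alpha[k, i] and backward variables beta[k, j] have
    already been computed."""
    K = len(Y)
    out = np.empty((K - 1,) + P.shape)
    for k in range(K - 1):
        num = alpha[k][:, None] * P * Q[:, Y[k + 1]][None, :] * beta[k + 1][None, :]
        out[k] = num / num.sum()    # normalise by Pr[Y | P, Q, pi]
    return out
```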


List of Figures

2.1 State diagram of an $N = 3$ Markov chain.

3.1 State diagram of the discrete phase-type distribution.

3.2 A diagram of a hidden Markov model.

4.1 The crossover in two policy vectors from the parents to the offspring.

5.1 Reconstruction of the results of Goulter and Tai (1985) on the effect of storage space discretisation on the expected gain.


List of Tables

1.1 Re-creation of the table "Share of global runoff and population by continent" shown in Postel et al. (1996).

3.1 Transition probability matrix $P$ for the transitions between hidden states.

3.2 Observation probability matrix $Q$ for the observations given the hidden states.


Bibliography

Akintug, B. and Rasmussen, P. F. (2005). “A Markov switching model for annual hydrologic time series.” Water Resources Research, 41(9): W09424. URL http://dx.doi.org/10.1029/2004WR003605.

Aksoy, H. (2003). “Markov chain-based modelling techniques for stochastic generation of daily intermittent streamflows.” Advances in Water Resources, 26. URL http://dx.doi.org/10.1016/S0309-1708(03)00031-9.

Aksoy, H. and Bayazit, M. (2000). “A model for daily flows of intermittent streams.” Hydrological Processes, 14: 1725–1744. URL http://dx.doi.org/10.1002/1099-1085(200007)14:10<1725::AID-HYP108>3.0.CO;2-L.

Andersen, A. T. and Nielsen, B. F. (1998). “A Markovian approach for modeling packet traffic with long range dependence.” Selected Areas in Communications, IEEE Journal on, 16(5): 719–732. URL http://www2.imm.dtu.dk/pubdb/p.php?2951.

Archibald, T. W., McKinnon, K. I. M., and Thomas, L. (2006). “Modeling the operation of multireservoir systems using decomposition and stochastic dynamic programming.” Naval Research Logistics, 53: 217–225. URL http://dx.doi.org/10.1002/nav.20134.

Archibald, T. W., McKinnon, K. I. M., and Thomas, L. C. (1997). “An aggregated stochastic dynamic programming model of multireservoir systems.” Water Resources Research, 33(2): 333–340. URL http://dx.doi.org/10.1029/96WR02859.

Asmussen, S., Nerman, O., and Olsson, M. (1996). “Fitting phase-type distributions via the EM algorithm.” Scandinavian Journal of Statistics, 23(4): 419–441. URL http://www.jstor.org/stable/4616418.

Barnett, T. P. and Pierce, D. W. (2008). “When will Lake Mead go dry?” Water Resources Research, 44(3): W03201. URL http://dx.doi.org/10.1029/2007WR006704.

Baum, L. E. (1972). “An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes.” Inequalities, 3: 1–8.

Baum, L. E. and Eagon, J. A. (1967). “An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology.” Bulletin of the American Mathematical Society, 73: 360–363.


Baum, L. E. and Petrie, T. (1966). “Statistical inference for probabilistic functions of finite state Markov chains.” The Annals of Mathematical Statistics, 37(6): 1554–1563. URL http://www.jstor.org/pss/2238772.

Baum, L. E., Petrie, T., Soules, G., and Weiss, N. (1970). “A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains.” The Annals of Mathematical Statistics, 41(1): 164–171. URL http://www.jstor.org/stable/2239727.

Baum, L. E. and Sell, G. R. (1968). “Growth transformations for functions on manifolds.” Pacific Journal of Mathematics, 27(2): 211–227.

Bayraktar, E. and Ludkovski, M. (2010). “Inventory management with partially observed nonstationary demand.” Annals of Operations Research, 176: 7–39. URL http://dx.doi.org/10.1007/s10479-009-0513-8.

Bellman, R. E. (1957). Dynamic Programming. Princeton University Press, Princeton, NJ.

Borkar, V. S. (1991). “A remark on control of partially observed Markov chains.” Annals of Operations Research, 29: 429–438. URL http://dx.doi.org/10.1007/BF02283609.

Boyan, J. A. and Littman, M. L. (2000). “Exact solutions to time-dependent MDPs.” In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, “Advances in Neural Information Processing Systems 13, Papers from Neural Information Processing Systems (NIPS),” NIPS, pages 1026–1032. MIT Press, Denver, CO, USA. URL http://papers.nips.cc/paper/1811-exact-solutions-to-time-dependent-mdps.pdf.

Breuer, L. (2002). “An EM algorithm for batch Markovian arrival processes and its comparison to a simpler estimation procedure.” Annals of Operations Research, 112(1-4): 123–138. URL http://dx.doi.org/10.1023/A:1020981005544.

Brown, B. W. (1965). “On the iterative method of dynamic programming on a finite space discrete time Markov process.” The Annals of Mathematical Statistics, 36(4): 1279–1285. URL http://www.jstor.org/stable/2238128.

Butcher, W. S. (1971). “Stochastic dynamic programming for optimum reservoir operation.” JAWRA Journal of the American Water Resources Association, 7(1): 115–123. URL http://dx.doi.org/10.1111/j.1752-1688.1971.tb01683.x.

Cao, X.-R. and Guo, X. (2007). “Partially observed Markov decision processes with reward information: basic ideas and models.” IEEE Transactions on Automatic Control, 52: 677. URL http://dx.doi.org/10.1109/TAC.2007.894520.

Cappe, O., Moulines, E., and Ryden, T. (2005). Inference in Hidden Markov Models. Springer Series in Statistics. Springer, New York and London.

Chiew, F., Stewardson, M., and McMahon, T. (1993). “Comparison of six rainfall-runoff modelling approaches.” Journal of Hydrology, 147(1–4): 1–36. URL http://dx.doi.org/10.1016/0022-1694(93)90073-I.


Cigizoglu, H. K., Adamson, P. T., and Metcalfe, A. V. (2002). “Bivariate Stochastic Modelling of Ephemeral Streamflow.” Hydrological Processes, 16: 1451–1465. URL http://dx.doi.org/10.1002/hyp.355.

Cohen, S. and Elliott, R. (2010). “Backward stochastic difference equations with finite states.” In N. P. A. Kohatsu-Higa and S.-J. Sheu, editors, “Stochastic Analysis with Financial Applications, Hong Kong 2009,” pages 33–43. Birkhauser. URL http://dx.doi.org/10.1007/978-3-0348-0097-6_3.

Cote, P., Haguma, D., Leconte, R., and Krau, S. (2011). “Stochastic optimisation of Hydro-Quebec hydropower installations: a statistical comparison between SDP and SSDP methods.” Canadian Journal of Civil Engineering, 38(12): 1427–1434. URL http://www.nrcresearchpress.com/doi/abs/10.1139/l11-101.

David, I., Friedman, L., and Sinuany-Stern, Z. (1999). “A simple suboptimal algorithm for system maintenance under partial observability.” Annals of Operations Research, 91: 25–40. URL http://dx.doi.org/10.1023/A%3A1018949723461.

Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). “Maximum likelihood from incomplete data via the EM algorithm.” Journal of the Royal Statistical Society. Series B (Methodological), 39(1): 1–38. URL http://www.jstor.org/stable/2984875.

Denault, M., Simonato, J.-G., and Stentoft, L. (2013). “A simulation-and-regression approach for stochastic dynamic programs with endogenous state variables.” Computers and Operations Research, 40(11): 2760–2769. URL http://dx.doi.org/10.1016/j.cor.2013.04.008.

Edirisinghe, N. C. P., Patterson, E. J., and Saadouli, N. (2000). “Capacity planning model for a multipurpose water reservoir with target-priority operation.” Annals of Operations Research, 100: 273–303. URL http://dx.doi.org/10.1023/A%3A1019200623139.

Faddy, M. J. (1998). “On inferring the number of phases in a Coxian phase-type distribution.” Communications in Statistics - Stochastic Models, 14(1 & 2): 407–417.

Fahlbusch, H. (2009). “Early dams.” Proceedings of the ICE - Engineering History and Heritage, 162(1): 13–18.

Fisher, A. J., Green, D. A., and Metcalfe, A. V. (2010). “Managing river flows in arid regions with matrix analytic methods.” Journal of Hydrology, 382: 128–137. URL http://dx.doi.org/10.1016/j.jhydrol.2009.12.023.

Fisher, A. J., Green, D. A., and Metcalfe, A. V. (2012). “Modelling of hydrological persistence for hidden state Markov decision processes.” Annals of Operations Research, 199: 215–224. URL http://dx.doi.org/10.1007/s10479-011-0992-2.

Fisher, A. J., Green, D. A., Metcalfe, A. V., and Akande, K. (2014). “First-passage time criteria for the operation of reservoirs.” Journal of Hydrology, 519, Part B: 1836–1847. URL http://dx.doi.org/10.1016/j.jhydrol.2014.09.061.


Fisher, A. J., Green, D. A., Metcalfe, A. V., and Webby, R. B. (2008). “Optimal control of multi-reservoir systems with time-dependent Markov decision processes.” In “Proceedings from Water Down Under 2008,” pages 2610–2619. URL http://search.informit.com.au/documentSummary;dn=594217807418287;res=IELENG.

Fletcher, S. and Ponnambalam, K. (1998). “A constrained state formulation for the stochastic control of multireservoir systems.” Water Resources Research, 34(2): 257–270. URL http://dx.doi.org/10.1029/97WR01900.

Foufoula-Georgiou, E. and Lettenmaier, D. P. (1987). “A Markov renewal model for rainfall occurrences.” Water Resources Research, 23(5): 875–884. URL http://dx.doi.org/10.1029/WR023i005p00875.

Gabriel, K. R. and Neumann, J. (1962). “A Markov chain model for daily rainfall occurrence at Tel Aviv.” Quarterly Journal of the Royal Meteorological Society, 88(375): 90–95. URL http://dx.doi.org/10.1002/qj.49708837511.

Gould, B. W. (1961). “Statistical methods for estimating the design capacity of dams.” Journal of the Institution of Engineers, Australia, 33: 405–416.

Goulter, I. C. and Tai, F.-K. (1985). “Practical implication in the use of stochastic dynamic programming for reservoir operation.” JAWRA Journal of the American Water Resources Association, 21(1): 65–74. URL http://dx.doi.org/10.1111/j.1752-1688.1985.tb05352.x.

Green, D. A. and Metcalfe, A. V. (2005). “Reliability of supply between production lines.” Stochastic Models, 21: 449–464. URL http://dx.doi.org/10.1081/STM-200056229.

Hastings, W. K. (1970). “Monte Carlo sampling methods using Markov chains and their applications.” Biometrika, 57(1): 97–109. URL http://www.jstor.org/stable/2334940.

Himmelmann, L. (2010). “HMM - Hidden Markov Models.” URL http://cran.r-project.org/web/packages/HMM/index.html.

Howard, R. A. (1960). Dynamic Programming and Markov Processes. MIT Press, Boston.

Huang, W., Yuan, L., and Lee, C. (2002). “Linking genetic algorithms with stochastic dynamic programming to the long-term operation of multireservoir system.” Water Resources Research, 38(12): WR001122.

Huang, Y. and Guo, X. (2009). “Optimal risk probability for first passage models in semi-Markov decision processes.” Journal of Mathematical Analysis and Applications, 359(1): 404–420. URL http://dx.doi.org/10.1016/j.jmaa.2009.05.058.

Hughes, J. P. and Guttorp, P. (1999). “A non-homogeneous hidden Markov model for precipitation occurrence.” Applied Statistics, 48: 15–30.


Jianyong, L. and Huang, S. (2001). “Markov decision processes with distribution function criterion of first-passage time.” Applied Mathematics and Optimization, 43: 187–201. URL http://dx.doi.org/10.1007/s00245-001-0007-9.

Kelman, J., Stedinger, J. R., Cooper, L. A., Hsu, E., and Yuan, S.-Q. (1990). “Sampling stochastic dynamic programming applied to reservoir operation.” Water Resources Research, 26: 447–454. URL http://dx.doi.org/10.1029/WR026i003p00447.

Klemm, A., Lindemann, C., and Lohmann, M. (2003). “Modeling IP traffic using the batch Markovian arrival process.” Performance Evaluation, 54: 149–173.

Knights, D. (2004). “Water reform, drought, fish and the community: Tarong's approach to water supply issues and risks.” Paper presented to ESAA Power Station Chemistry Conference, Pokolbin, New South Wales, Australia.

Labadie, J. W. (2004). “Optimal operation of multireservoir systems: state-of-the-art review.” Journal of Water Resources Planning & Management, 130(2): 93–111. URL http://dx.doi.org/10.1061/(ASCE)0733-9496(2004)130:2(93).

Lambert, M. F., Whiting, J. P., and Metcalfe, A. V. (2003). “A non-parametric hidden Markov model for climate state identification.” Hydrology and Earth System Sciences, 7(5): 652–667. URL https://hal.archives-ouvertes.fr/hal-00304912.

Lamontagne, J. R. and Stedinger, J. R. (2013). “HRF Final Research Report: Executive Summary.” URL http://www.hydrofoundation.org/pdf/thesis/Lamontagne.HRF%20Final.10-20-13.pdf.

Latouche, G. and Ramaswami, V. (1999). Introduction to Matrix Analytic Methods in Stochastic Modelling. ASA-SIAM, Philadelphia, PA and Alexandria, VA.

Latouche, G. and Remiche, M. (2003). “A MAP-based Poisson cluster model for Web traffic.” Performance Evaluation, 49(1–4): 359–370.

Ledoux, J. (2005). “Recursive filters for partially observable finite Markov chains.” Journal of Applied Probability, 42: 684–697.

Littman, M. L. (2009). “A tutorial on partially observable Markov decision processes.” Journal of Mathematical Psychology, 53(3): 119–125. URL http://dx.doi.org/10.1016/j.jmp.2009.01.005. Special Issue: Dynamic Decision Making.

Lopez, J. A., Ponnambalam, K., and Quintana, V. H. (2007). “Generation and transmission expansion under risk using stochastic programming.” IEEE Transactions on Power Systems, 22(3): 1369–1378. URL http://dx.doi.org/10.1109/TPWRS.2007.901741.

Loucks, D. P. (1968). “Computer models for reservoir regulation.” Journal of the Sanitary Engineering Division, Proceedings of the American Society of Civil Engineers, 94(SA4): 657–669.


Loucks, D. P. and Falkson, L. M. (1970). “A comparison of some dynamic, linear and policy iteration methods for reservoir operation.” JAWRA Journal of the American Water Resources Association, 6(3): 384–400. URL http://dx.doi.org/10.1111/j.1752-1688.1970.tb00489.x.

Lovejoy, W. S. (1991). “A survey of algorithmic methods for partially observed Markov decision processes.” Annals of Operations Research, 28: 47–66.

MacDonald, I. L. and Zucchini, W. (1997). Hidden Markov and Other Models for Discrete-Valued Time Series. Chapman & Hall, London.

Mahootchi, M., Ponnambalam, K., and Tizhoosh, H. R. (2010). “Comparison of risk-based optimization models for reservoir management.” Canadian Journal of Civil Engineering, 37(46): 122–124.

Markov, A. (1906). “Rasprostranenie zakona bol'shih chisel na velichiny, zavisyaschie drug ot druga [Extension of the law of large numbers to quantities depending on one another].” Izvestiya Fiziko-matematicheskogo obschestva pri Kazanskom universitete, 2-year series, 15: 135–156.

Mays, L. W. (2008). “A very brief history of hydraulic technology during antiquity.” Environmental Fluid Mechanics, 8: 471–484.

Mays, L. W. (2010a). “A brief history of Roman water technology.” In L. Mays, editor, “Ancient Water Technologies,” pages 115–137. Springer Netherlands. URL http://dx.doi.org/10.1007/978-90-481-8632-7_7.

Mays, L. W. (2010b). “Lessons from the Ancients on water resources sustainability.” In L. Mays, editor, “Ancient Water Technologies,” pages 217–239. Springer Netherlands. URL http://dx.doi.org/10.1007/978-90-481-8632-7_11.

McMahon, J. J. (2008). Time-Dependence in Markovian Decision Processes. Ph.D. thesis, University of Adelaide.

McMahon, T. A. and Mein, R. G. (1978). “Chapter 4: Probability Matrix Methods.” In “Reservoir Capacity and Yield,” volume 9 of Developments in Water Science, pages 71–106. Elsevier. URL http://dx.doi.org/10.1016/S0167-5648(08)70959-6.

Momtahen, S. and Dariane, A. (2007). “Direct search approaches using genetic algorithms for optimization of water reservoir operating policies.” Journal of Water Resources Planning and Management, pages 202–209. URL http://dx.doi.org/10.1061/(ASCE)0733-9496(2007)133:3(202).

Moran, P. A. P. (1954). “A probability theory of dams and storage systems.” Australian Journal of Applied Sciences, 5: 116–125.

Moran, P. A. P. (1955). “A probability theory of dams and storage systems: modifications of the release rules.” Australian Journal of Applied Sciences, 6: 117–130.

Moran, P. A. P. (1959). The Theory of Storage. Methuen & Co Ltd, London.


Neuts, M. F. (1981). Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach. Johns Hopkins University Press, Baltimore.

Neuts, M. F. (1995). Algorithmic Probability. Chapman & Hall, London.

Nielsen, B. F. (2000). “Modelling long-range dependent and heavy-tailed phenomena by matrix-analytic methods.” In G. Latouche and P. Taylor, editors, “Advances in Algorithmic Methods for Stochastic Models,” pages 265–278. Notable Publications Inc. URL http://www2.imm.dtu.dk/pubdb/p.php?2960.

Norris, J. R. (1997). Markov Chains. Cambridge University Press, New York.

Olsson, M. (1996). “Estimation of phase-type distributions from censored data.” Scandinavian Journal of Statistics, 23(4): 443–460. URL http://www.jstor.org/stable/4616419.

Olsson, M. (1998). “The EMpht-programme.” URL home.imf.au.dk/asmus/pspapers.html. Computer Program. Accessed: 26-July-2014.

Otieno, F. A. O. and Ndiritu, J. G. (1997). “The effect of serial correlation on reservoir capacity using the modified Gould's probability matrix method (MGPM).” Water SA, 23(1): 63–70. URL http://www.wrc.org.za/Knowledge%20Hub%20Documents/Water%20SA%20Journals/Manuscripts/1997/01/WaterSA_1997_01_1007.PDF.

Piantadosi, J., Howlett, P., Bean, N., and Beecham, S. (2010). “Modelling systems of reservoirs using structured Markov chains.” Proceedings of the ICE - Water Management, 163(8): 407–416.

Piantadosi, J., Metcalfe, A. V., and Howlett, P. (2008). “Stochastic dynamic programming (SDP) with a conditional value-at-risk (CVaR) criterion for management of storm water.” Journal of Hydrology, 348: 320–329. URL http://dx.doi.org/10.1016/j.jhydrol.2007.10.007.

Postel, S. L., Daily, G. C., and Ehrlich, P. R. (1996). “Appropriation of renewable fresh water.” Science, 271(5250): 785–788. URL http://www.jstor.org/stable/2889886.

Price, L. (2008). “Reservoir control curve development for Environmental Impact Assessment: an example from the Abberton Scheme.” In “BHS 10th National Hydrology Symposium, Exeter, 2008,” pages 67–74.

Puterman, M. L. and Brumelle, S. L. (1978). “The analytic theory of policy iteration.” In “Dynamic Programming and Its Applications,” pages 91–113. Academic Press.

Puterman, M. L. and Brumelle, S. L. (1979). “On the convergence of policy iteration in stationary dynamic programming.” Mathematics of Operations Research, 4(1): 60–69. URL http://www.jstor.org/stable/3689239.

R Development Core Team (2011). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org. ISBN 3-900051-07-0.


Rabiner, L. R. (1989). “A tutorial on hidden Markov models and selected applications in speech recognition.” Proceedings of the IEEE, 77: 257–286. URL http://www.cs.cornell.edu/Courses/cs4758/2012sp/materials/hmm_paper_rabiner.pdf.

Reddy, M. and Nagesh Kumar, D. (2006). “Optimal reservoir operation using multi-objective evolutionary algorithm.” Water Resources Management, 20(6): 861–878. URL http://dx.doi.org/10.1007/s11269-005-9011-1.

Reznicek, K. and Cheng, T. (1991). “Stochastic modelling of reservoir operations.” European Journal of Operational Research, 50(3): 235–248. URL http://dx.doi.org/10.1016/0377-2217(91)90257-V.

Røtting, T. A. and Gjelsvik, A. (1992). “Stochastic dual dynamic programming for seasonal scheduling in the Norwegian power system.” IEEE Transactions on Power Systems, 7(1): 273–279. URL http://dx.doi.org/10.1109/59.141714.

Ryden, T. (2000). “Statistical estimation for Markov-modulated Poisson processes and Markovian arrival processes.” Advances in Algorithmic Methods for Stochastic Models.

Srikanthan, R., Harrold, T., Sharma, A., and McMahon, T. (2005). “Comparison of two approaches for generation of daily rainfall data.” Stochastic Environmental Research and Risk Assessment, 19: 215–226. URL http://dx.doi.org/10.1007/s00477-004-0226-0.

Srikanthan, R. and McMahon, T. (1980). “Stochastic generation of monthly flows for ephemeral streams.” Journal of Hydrology, 47: 19–48. URL http://dx.doi.org/10.1016/0022-1694(80)90045-1.

Srikanthan, R. and McMahon, T. (1985a). “Gould's probability matrix method 1. The starting month problem.” Journal of Hydrology, 77: 125–133. URL http://dx.doi.org/10.1016/0022-1694(85)90201-X.

Srikanthan, R. and McMahon, T. (1985b). “Gould's probability matrix method 2. The annual autocorrelation problem.” Journal of Hydrology, 77: 135–139. URL http://dx.doi.org/10.1016/0022-1694(85)90202-1.

Stadje, W. (1993). “A new look at the Moran dam.” Journal of Applied Probability, 30(2): 489–495. URL http://www.jstor.org/stable/3214860.

Stanford, D., Horn, W., and Latouche, G. (2006). “Tri-layered processes with boundary assistance for service resources.” Stochastic Models, 22(3): 361–282.

Stedinger, J. R., Sule, B. F., and Loucks, D. P. (1984). “Stochastic dynamic programming models for reservoir operation optimization.” Water Resources Research, 20(11): 1499–1505. URL http://dx.doi.org/10.1029/WR020i011p01499.

Stern, R. D. and Coe, R. (1984). “A Model Fitting Analysis of Daily Rainfall Data.” Journal of the Royal Statistical Society. Series A (General), 147(1): 1–34. URL http://www.jstor.org/stable/2981736.


Tamburrino, A. (2010). “Water technology in Ancient Mesopotamia.” In L. Mays, editor, “Ancient Water Technologies,” pages 29–51. Springer Netherlands. URL http://dx.doi.org/10.1007/978-90-481-8632-7_2.

Teegavarapu, R. S. and Simonovic, S. P. (2002). “Optimal operation of reservoir systems using simulated annealing.” Water Resources Management, 16(5): 401–428. URL http://dx.doi.org/10.1023/A%3A1021993222371.

Tejada-Guibert, J. A., Johnson, S. A., and Stedinger, J. R. (1993). “Comparison of two approaches for implementing multireservoir operating policies derived using stochastic dynamic programming.” Water Resources Research, 29(12): 3969–3980.

Thyer, M. and Kuczera, G. (2003a). “A hidden Markov model for modelling long-term persistence in multi-site rainfall time series 1. Model calibration using a Bayesian approach.” Journal of Hydrology, 275: 12–26. URL http://dx.doi.org/10.1016/S0022-1694(02)00412-2.

Thyer, M. and Kuczera, G. (2003b). “A hidden Markov model for modelling long-term persistence in multi-site rainfall time series 2. Real data analysis.” Journal of Hydrology, 275: 27–48. URL http://dx.doi.org/10.1016/S0022-1694(02)00411-0.

Turgeon, A. (1980). “Optimal operation of multireservoir power systems with stochastic inflows.” Water Resources Research, 16(2): 275–283. URL http://dx.doi.org/10.1029/WR016i002p00275.

Turgeon, A. (1981). “A decomposition method for the long-term scheduling of reservoirs in series.” Water Resources Research, 17(6): 1565–1570. URL http://dx.doi.org/10.1029/WR017i006p01565.

Turgeon, A. (2005). “Solving a stochastic reservoir management problem with multilag autocorrelated inflows.” Water Resources Research, 41(12): W12414.1–W12414.9. URL http://dx.doi.org/10.1029/2004WR003846.

United Nations Children's Fund (2005). Childhood Under Threat. The State of the World's Children. UNICEF, New York. URL http://www.unicef.org/publications/files/SOWC_2005_%28English%29.pdf.

United Nations Educational, Scientific and Cultural Organization (2003). Water: a shared responsibility. The United Nations World Water Development Report. UNESCO / Berghan Books, Paris / New York. URL http://unesdoc.unesco.org/images/0012/001297/129726e.pdf.

United Nations Educational, Scientific and Cultural Organization (2006). Water for People. Water for Life. The United Nations World Water Development Report 2. UNESCO / Berghan Books, Paris / New York. URL http://unesdoc.unesco.org/images/0014/001454/145405E.pdf.

United Nations Educational, Scientific and Cultural Organization (2009). Water in a Changing World. The United Nations World Water Development Report 3. UNESCO / Berghan Books, Paris / New York. URL http://unesdoc.unesco.org/images/0018/001819/181993e.pdf.


United Nations Educational, Scientific and Cultural Organization (2012). Managing Water under Uncertainty and Risk. The United Nations World Water Development Report 4. UNESCO / Berghan Books, Paris / New York. URL http://unesdoc.unesco.org/images/0021/002156/215644e.pdf.

United States Bureau of Reclamation (2014). “Lower Colorado river operations schedule.” URL http://www.usbr.gov/lc/region/g4000/hourly/rivops.html. Accessed: 26-July-2014.

Vedula, S. and Nagesh Kumar, D. (1996). “An integrated model for optimal reservoir operation for irrigation of multiple crops.” Water Resources Research, 32(4): 1101–1108. URL http://dx.doi.org/10.1029/95WR03110.

Viterbi, A. J. (1967). “Error bounds for convolutional codes and an asymptotically optimum decoding algorithm.” Information Theory, IEEE Transactions on, 13(2): 260–269.

Webby, R., Adamson, P., Boland, J., Howlett, P., Metcalfe, A., and Piantadosi, J. (2006). “The Mekong - applications of Value-at-Risk (VaR) and Conditional Value at Risk (CVaR) simulation to the benefits, costs and consequences of water resources development in a large river basin.” Ecological Modelling, 201: 89–96.

Webby, R. B., Green, D. A., and Metcalfe, A. V. (2009). “Modelling water blending - sensitivity of optimal policies.” Environmental Modeling and Assessment, 14(6): 749–757. URL http://dx.doi.org/10.1007/s10666-008-9166-2.

Welch, L. R. (2003). “Hidden Markov models and the Baum-Welch algorithm.” IEEE Information Theory Society Newsletter. URL http://www-bcf.usc.edu/~lototsky/MATH508/Baum-Welch.pdf.

Whiting, J. P. (2005). Identification and Modelling of Hydrological Persistence with Hidden Markov Models. Ph.D. thesis, University of Adelaide. URL http://hdl.handle.net/2440/37842.

Whiting, J. P., Lambert, M. F., and Metcalfe, A. V. (2003). “Modelling persistence in annual Australian point rainfall.” Hydrology and Earth System Sciences, 7(2): 197–211. URL http://dx.doi.org/10.5194/hess-7-197-2003.

Whiting, J. P., Lambert, M. F., Metcalfe, A. V., Adamson, P. T., Franks, S. W., and Kuczera, G. (2004). “Relationships between the El Niño southern oscillation and spate flows in southern Africa and Australia.” Hydrology and Earth System Sciences, 8(6): 1118–1128. URL http://dx.doi.org/10.5194/hess-8-1118-2004.

Yamout, G., Hatfield, K., and Romeijn, H. E. (2007). “Comparison of new conditional value-at-risk management models for optimal allocation of uncertain water supplies.” Water Resources Research, 43: W07430. URL http://dx.doi.org/10.1029/2006WR005210.

Yeh, W. W.-G. (1985). “Reservoir management and operations models: a state-of-the-art review.” Water Resources Research, 21(12): 1797–1818. URL http://dx.doi.org/10.1029/WR021i012p01797.


Young, M. and McColl, J. (2005). “Defining tradable water entitlements and allocations: A robust system.” Canadian Water Resources Journal, 30(1): 65–72.

Zucchini, W. and Guttorp, P. (1991). “A Hidden Markov Model for Space-Time Precipitation.” Water Resources Research, 27(8): 1917–1923. URL http://dx.doi.org/10.1029/91WR01403.