Read Me at Last After Markov Chain (Summary)


OR / STAT 645: Stochastic Processes

Lecture 1: Probability Review, Exponential Distribution (Given: 8/31/2006)

Lecture by John Shortle, partially transcribed by James LaBelle, based on the class textbook: Ross, S. M., 2003, Introduction to Probability Models, 8th ed., Academic Press.

Why You Need to Know about Stochastic Processes

- Life is stochastic
  - Commute to work
  - Wait in line for lunch
  - Even deterministic things are stochastic (e.g., Metro buses)
- Stochastic problems are directly relevant to your life
  - Why do bad things happen in groups?
  - How do I increase the page rank of my web site in Google?
  - How should stock options be priced?
  - When should I replace my aging car?
  - Why do I have to wait so long for a bus at Dulles airport? And why do the buses get clumped up in groups?

Probability Review

Notation

$f(x)$ is the Probability Density Function (PDF).

$F(x)$ is the Cumulative Distribution Function (CDF): $F(x) = \int_{-\infty}^{x} f(u)\,du$.

$F^c(x) = 1 - F(x)$ is the Complement of the CDF (or CCDF).

Relationships

$f(x) = \frac{d}{dx}F(x) = -\frac{d}{dx}F^c(x)$

Exponential Distribution

$f(x) = \lambda e^{-\lambda x}$, $x \ge 0$

$F(x) = 1 - e^{-\lambda x}$, $x \ge 0$

$F^c(x) = e^{-\lambda x}$, $x \ge 0$

Memorize These Formulas!

Gamma Distribution

PDF: $f(x) = \dfrac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}$, $x > 0$, where $\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx$.

Note: $\Gamma(\alpha) = (\alpha - 1)!$ when $\alpha$ is a positive integer.

CDF: When $\alpha$ is a positive integer,

$F(x) = 1 - e^{-x/\beta} \sum_{j=0}^{\alpha-1} \dfrac{(x/\beta)^j}{j!}$, $x > 0$.


    We will derive this property later.

Other properties
1. When $\alpha$ is a positive integer, a gamma random variable (RV) is equivalent to the sum of $\alpha$ independent exponential RVs with mean $\beta$.
2. When $\alpha = 1$, a gamma RV is an exponential RV with mean $\beta$.

Check: $f(x) = \dfrac{x^{1-1} e^{-x/\beta}}{\Gamma(1)\,\beta} = \dfrac{e^{-x/\beta}}{\beta} = \lambda e^{-\lambda x}$, where $\lambda = 1/\beta$.
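As a sanity check, the integer-shape CDF above can be compared numerically against a library gamma CDF and against property 1 (a sum of exponentials). A minimal sketch using numpy/scipy; the shape and scale values are arbitrary illustrations:

```python
import numpy as np
from math import exp, factorial
from scipy import stats

alpha, beta, x = 4, 2.0, 5.0

# Closed-form CDF for integer alpha: 1 - exp(-x/beta) * sum_{j<alpha} (x/beta)^j / j!
erlang_cdf = 1 - exp(-x / beta) * sum((x / beta) ** j / factorial(j) for j in range(alpha))

# Library gamma CDF with the same shape/scale parameterization
lib_cdf = stats.gamma.cdf(x, a=alpha, scale=beta)

# Property 1: empirical CDF of a sum of alpha independent exponentials with mean beta
rng = np.random.default_rng(0)
sums = rng.exponential(scale=beta, size=(100_000, alpha)).sum(axis=1)
mc_cdf = (sums <= x).mean()

print(erlang_cdf, lib_cdf, mc_cdf)  # all approximately equal
```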

Mean of a Random Variable

(1) Typical method

Discrete case: $E[X] = \sum_i x_i p_i$, where $p_i = P(X = x_i)$.

Continuous case: $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$.

(2) When $X$ is a non-negative random variable,

$E[X] = \int_0^{\infty} F^c(x)\,dx$

Proof: Suppose $X$ is discrete with $p_i = P(X = x_i)$. Plot $F^c(x)$ as a function of $x$: it is a step function that drops by $p_i$ at each $x_i$. Calculate the area under the curve two ways:

First way: $\int_0^{\infty} F^c(x)\,dx$.

Second way: Add up the areas of the horizontal rectangles. This gives $\sum_{i=1}^{\infty} x_i p_i$, which is $E[X]$.
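The tail-integral formula is easy to check numerically, here for an exponential RV (a minimal sketch; the rate is an arbitrary illustration):

```python
import numpy as np
from scipy import integrate

lam = 0.5

# E[X] = integral over [0, infinity) of the CCDF exp(-lam * x)
tail_integral, _ = integrate.quad(lambda x: np.exp(-lam * x), 0, np.inf)
print(tail_integral, 1 / lam)  # both 2.0
```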

(3) Using the Moment Generating Function

The moment generating function of a random variable $X$ is $\phi(t) = E[e^{tX}]$:



$\phi(t) = \begin{cases} \sum_k e^{tk}\,p_k & \text{discrete case} \\ \int e^{tx} f(x)\,dx & \text{continuous case} \end{cases}$

Then $E[X^n] = \phi^{(n)}(t)\big|_{t=0}$.

For example, $E[X] = \phi'(t)\big|_{t=0}$ and $E[X^2] = \phi''(t)\big|_{t=0}$.

Note: The moment generating function is closely related to the Laplace transform:

$f^*(s) = E[e^{-sX}] = \int_0^{\infty} e^{-sx} f(x)\,dx$

Example: Exponential Distribution

$\phi(t) = E[e^{tX}] = \int_0^{\infty} e^{tx}\,\lambda e^{-\lambda x}\,dx = \lambda \int_0^{\infty} e^{-(\lambda - t)x}\,dx = \left.\frac{\lambda e^{-(\lambda - t)x}}{-(\lambda - t)}\right|_{x=0}^{\infty} = \frac{\lambda}{\lambda - t}$ (for $t < \lambda$).

Therefore,

$E[X] = \phi'(t)\big|_{t=0} = \left.\dfrac{\lambda}{(\lambda - t)^2}\right|_{t=0} = \dfrac{1}{\lambda}$

$E[X^2] = \phi''(t)\big|_{t=0} = \left.\dfrac{2\lambda}{(\lambda - t)^3}\right|_{t=0} = \dfrac{2}{\lambda^2}$
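The two moments can be checked symbolically; a minimal sketch with sympy, starting from the MGF just derived:

```python
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)

phi = lam / (lam - t)                # MGF of exp(lam), valid for t < lam

EX  = sp.diff(phi, t, 1).subs(t, 0)  # first moment: 1/lam
EX2 = sp.diff(phi, t, 2).subs(t, 0)  # second moment: 2/lam**2
var = sp.simplify(EX2 - EX**2)       # variance: 1/lam**2 (used in the next section)
print(EX, EX2, var)
```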

Variance of a Random Variable

Variance: $\text{var}[X] = E\big[(X - E[X])^2\big] = E[X^2] - (E[X])^2$, where $E[X^2] = \int_{-\infty}^{\infty} x^2 f(x)\,dx$.

Standard deviation (std. dev.) $= \sqrt{\text{var}[X]}$

Coefficient of variation (CV) $= \dfrac{\text{std. dev.}}{E[X]}$

Example: Exponential Random Variable

$\text{var}[X] = E[X^2] - (E[X])^2 = \dfrac{2}{\lambda^2} - \dfrac{1}{\lambda^2} = \dfrac{1}{\lambda^2}$

std. dev. $= \dfrac{1}{\lambda}$

CV $= \dfrac{1/\lambda}{1/\lambda} = 1$


Memoryless Property

Def. 1. A random variable $X$ has the memoryless property if:

$P(X > t + s \mid X > s) = P(X > t)$

Intuition: Suppose $X$ represents the time that you wait for a bus. Given that you have already been waiting $s$ time units ($X > s$), the probability that you wait an additional $t$ units, $P(X > t + s \mid X > s)$, is the same as the probability of waiting $t$ units in the first place, $P(X > t)$.

We now formulate an alternate definition. If $X$ has the memoryless property, then

$P(X > t) = P(X > t + s \mid X > s) = \dfrac{P(X > t + s \text{ and } X > s)}{P(X > s)} = \dfrac{P(X > t + s)}{P(X > s)}$

Def. 2. A random variable $X$ has the memoryless property if: $P(X > t + s) = P(X > t)\,P(X > s)$.

The exponential distribution is the only continuous distribution that has the memoryless property.

Check that the exponential has this property: $P(X > t + s) = e^{-\lambda(t+s)} = e^{-\lambda t}\,e^{-\lambda s} = P(X > t)\,P(X > s)$.
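The memoryless property is also easy to see in simulation; a minimal sketch (the rate and the values of $s$ and $t$ are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, s, t = 0.5, 2.0, 3.0
x = rng.exponential(scale=1 / lam, size=1_000_000)

p_uncond = (x > t).mean()                  # P(X > t)
p_cond = (x[x > s] > s + t).mean()         # P(X > t + s | X > s)
print(p_uncond, p_cond, np.exp(-lam * t))  # all approximately equal
```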

Useful Properties of the Exponential Distribution

Suppose that

$X_1 \sim \exp(\lambda_1)$ (time until event 1 happens)
$X_2 \sim \exp(\lambda_2)$ (time until event 2 happens)
...
$X_n \sim \exp(\lambda_n)$ (time until event $n$ happens)

and all $X_i$ are independent.

1. First occurrence among events

What is the probability that $X_1 < X_2$?

$P(X_1 < X_2) = \int_0^{\infty} \int_{x_1}^{\infty} \lambda_1 e^{-\lambda_1 x_1}\,\lambda_2 e^{-\lambda_2 x_2}\,dx_2\,dx_1 = \int_0^{\infty} \lambda_1 e^{-\lambda_1 x_1}\,e^{-\lambda_2 x_1}\,dx_1 = \dfrac{\lambda_1}{\lambda_1 + \lambda_2}$

The double integration is over the region $x_2 > x_1$ in the $(x_1, x_2)$ plane. [Figure: that region shaded.] The second-to-last equality uses the known CCDF for the exponential distribution.

For the opposite relationship,

$P(X_2 < X_1) = 1 - P(X_1 < X_2) = 1 - \dfrac{\lambda_1}{\lambda_1 + \lambda_2} = \dfrac{\lambda_2}{\lambda_1 + \lambda_2}$,

as expected from symmetry.

More generally,

$P(X_i = \min(X_1, \ldots, X_n)) = \dfrac{\lambda_i}{\sum_j \lambda_j}$

To derive the general result from the 2-variable case, build up inductively.

3-variable example: $P(X_1 = \min(X_1, X_2, X_3)) = P(X_1 < \min(X_2, X_3)) = \dfrac{\lambda_1}{\lambda_1 + \lambda_2 + \lambda_3}$
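A Monte Carlo check of this "race" probability (a minimal sketch; the rates are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(2)
lam = np.array([1.0, 2.0, 3.0])                      # rates lambda_1..lambda_3
x = rng.exponential(scale=1 / lam, size=(1_000_000, 3))

winner = x.argmin(axis=1)                            # which event happens first
for i in range(3):
    print((winner == i).mean(), lam[i] / lam.sum())  # simulated vs lambda_i / sum
```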

2. Distribution of time of first event (minimum)

$P(\min(X_1, X_2) > x) = P(X_1 > x,\ X_2 > x) = P(X_1 > x)\,P(X_2 > x) = F_1^c(x)\,F_2^c(x)$

For exponential RVs,

$P(\min(X_1, X_2) > x) = F_1^c(x)\,F_2^c(x) = e^{-\lambda_1 x}\,e^{-\lambda_2 x} = e^{-(\lambda_1 + \lambda_2)x}$

This is the CCDF of an exponential RV with rate $\lambda_1 + \lambda_2$; therefore

$\min(X_1, X_2) \sim \exp(\lambda_1 + \lambda_2)$.

More generally,

$\min(X_1, X_2, \ldots, X_n) \sim \exp(\lambda_1 + \cdots + \lambda_n)$.
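The same simulation idea checks the distribution of the minimum (a minimal sketch; the empirical CCDF is compared against $e^{-(\lambda_1+\lambda_2+\lambda_3)x}$ at one point):

```python
import numpy as np

rng = np.random.default_rng(3)
lam = np.array([1.0, 2.0, 3.0])
m = rng.exponential(scale=1 / lam, size=(1_000_000, 3)).min(axis=1)

rate = lam.sum()                          # claimed rate of the minimum
print(m.mean(), 1 / rate)                 # sample mean vs 1/6
x = 0.2
print((m > x).mean(), np.exp(-rate * x))  # empirical CCDF vs exp(-rate * x)
```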


Key intuition: Think of exponential RVs as times until something happens, and the $\lambda$'s as rates.

3. Independence property (stated without proof). The time of the first occurrence of an event is independent of the ordering of the events. That is, $\min(X_1, \ldots, X_n)$ is independent of the event that $X_1 < X_2 < \cdots < X_n$ (or any other ordering).

4. Distribution of time of last event (maximum)

For exponential RVs,

$P(\max(X_1, X_2) > x) = 1 - F_1(x)\,F_2(x) = 1 - (1 - e^{-\lambda_1 x})(1 - e^{-\lambda_2 x}) = e^{-\lambda_1 x} + e^{-\lambda_2 x} - e^{-(\lambda_1 + \lambda_2)x}$

Note: This could also have been derived from Venn diagram principles:

$P(\max(X_1, X_2) > x) = P(X_1 > x) + P(X_2 > x) - P(X_1 > x,\ X_2 > x)$

    5. Sum of exponentials (with same rate) is a gamma (stated earlier)

Example (Prob. 5.28)

Consider $n$ components with independent lifetimes. Component $i$ functions for an exponential time with rate $\lambda_i$. All components are initially in use and remain so until they fail.

a. Find the probability that component 1 is the second component to fail.
b. Find the expected time of failure of the second component.

Possible orderings for component 1 to be the second component to fail:

a. 2 fails first, then 1, then some other component fails.
b. 3 fails first, then 1, then some other component fails.
c. ...
d. $n$ fails first, then 1, then some other component fails.


$P(\text{2 fails first, then 1, then another}) = P(\text{2 fails before all others}) \cdot P(\text{1 fails before all except 2}) = \dfrac{\lambda_2}{\sum_{i=1}^{n} \lambda_i} \cdot \dfrac{\lambda_1}{\sum_{i \ne 2} \lambda_i}$

Likewise, the probability of event (b) is: $\dfrac{\lambda_3}{\sum_{i=1}^{n} \lambda_i} \cdot \dfrac{\lambda_1}{\sum_{i \ne 3} \lambda_i}$.

Events (a), (b), ..., (d) are mutually exclusive, therefore P(component 1 is second to fail) is the sum of all the above probabilities:

$P(\text{component 1 is second to fail}) = \sum_{k=2}^{n} \dfrac{\lambda_k}{\sum_{i=1}^{n} \lambda_i} \cdot \dfrac{\lambda_1}{\sum_{i \ne k} \lambda_i}$

(b)

The expected time of the first failure is $\dfrac{1}{\sum_{i=1}^{n} \lambda_i}$.

The probability that the first failure is type $k$ is: $\dfrac{\lambda_k}{\sum_{i=1}^{n} \lambda_i}$.

The expected time from the first failure to the second failure, given the first failure is type $k$, is: $\dfrac{1}{\sum_{i \ne k} \lambda_i}$. (By memorylessness, the remaining lifetimes after the first failure are still exponential with their original rates.)

Thus, the total expected time until the second failure is

$\dfrac{1}{\sum_{i=1}^{n} \lambda_i} + \sum_{k=1}^{n} \dfrac{\lambda_k}{\sum_{i=1}^{n} \lambda_i} \cdot \dfrac{1}{\sum_{i \ne k} \lambda_i}$
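Both formulas are easy to evaluate and to check by simulation; a minimal sketch with made-up rates:

```python
import numpy as np

rng = np.random.default_rng(4)
lam = np.array([1.0, 0.5, 2.0, 1.5])   # illustrative rates lambda_1..lambda_4
n, total = len(lam), lam.sum()

# (a) P(component 1 is second to fail)
p1_second = sum(lam[k] / total * lam[0] / (total - lam[k]) for k in range(1, n))

# (b) expected time of the second failure
et_second = 1 / total + sum(lam[k] / total / (total - lam[k]) for k in range(n))

# Monte Carlo check
x = rng.exponential(scale=1 / lam, size=(500_000, n))
order = np.argsort(x, axis=1)                      # failure order per replication
print(p1_second, (order[:, 1] == 0).mean())        # formula vs simulation
print(et_second, np.sort(x, axis=1)[:, 1].mean())  # formula vs simulation
```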

Computing Expectations by Conditioning

Basic idea: Compute the expectation or variance of a (complicated) random variable by conditioning on another random variable.

In stochastic processes, it is often useful to condition on the first event.

Use the formulas

$E(X) = E(E(X \mid Y))$

$V(X) = V(E(X \mid Y)) + E(V(X \mid Y))$


Example

The probability of an accident on I-66 during my morning commute is 0.1.

If there is an accident, commute time $\sim N(50, 6^2)$.
If there is no accident, commute time $\sim N(30, 4^2)$.

What is the average time to get to work? What is the variance of the time to get to work?

Average time to get to work (easy): $0.1 \cdot 50 + 0.9 \cdot 30 = 32$.

But let's work it out carefully in the language of conditional expectation:

$X$ = time to get to work
$Y$ = accident or no accident

$E(X \mid Y) = 50$ if accident, $30$ if no accident.

Note: $E(X \mid Y)$ is a random variable (call it $Z$). In other words,

$Z = E(X \mid Y) = 50$ w.p. 0.1, $= 30$ w.p. 0.9.

Finally,

$E(X) = E(E(X \mid Y)) = E(Z) = 0.1 \cdot 50 + 0.9 \cdot 30 = 32$.

To compute $V(X)$, first evaluate $V(E(X \mid Y))$. We already know $E(X \mid Y)$ (which we called $Z$).

$E(Z^2) = 0.1 \cdot 2500 + 0.9 \cdot 900 = 1060$

$V(E(X \mid Y)) = V(Z) = E(Z^2) - (E(Z))^2 = 1060 - 32^2 = 36$

Now, evaluate $E(V(X \mid Y))$:

$V(X \mid Y) = 36$ w.p. 0.1, $= 16$ w.p. 0.9.

Note: $V(X \mid Y)$ is a random variable.

$E(V(X \mid Y)) = 0.1 \cdot 36 + 0.9 \cdot 16 = 18$

In summary, $V(X) = V(E(X \mid Y)) + E(V(X \mid Y)) = 36 + 18 = 54$. (Note: the variance is bigger than the variances of either conditioned normal variable.)
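A minimal simulation sketch confirming both numbers:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000_000
accident = rng.random(n) < 0.1
x = np.where(accident,
             rng.normal(50, 6, n),   # commute time given an accident
             rng.normal(30, 4, n))   # commute time given no accident

print(x.mean(), x.var())  # approximately 32 and 54
```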

OR / STAT 645: Stochastic Models

Lecture 2: The Poisson Process (Given: 9/7/2006)

The Poisson Distribution

Def. A Poisson random variable with mean $A$ has probability mass function:

$P(X = i) = e^{-A}\dfrac{A^i}{i!}$, where $i = 0, 1, 2, \ldots$

Note: $e^{-A}$ is a normalization constant.

The mean of the distribution is $A$. The variance of the distribution is $A$.

Note: For an exponential RV, the mean and std. dev. are equal. Here, the mean and variance are equal.

Historical Background

Ladislaus Bortkiewicz. Born 1868 in St. Petersburg, Russia, into Russian nobility. He was a military man and an instructor teaching artillery and mathematics. After being awarded a doctorate, he led a career in statistics and actuarial science. Some have argued that the Poisson distribution should be named the von Bortkiewicz distribution.

Bortkiewicz observed that events with a low frequency in a large population follow a Poisson distribution, even when the probabilities of the events vary. The classical example is the following data set (Bortkewicz L von. Das Gesetz der Kleinen Zahlen. Leipzig: Teubner; 1898):

- 14 (out of 16 total) Prussian army corps units observed over 20 years (1875-1894).
- A count of men killed by a horse kick, each year, for each unit (280 data points).
- Total deaths = 196.
- Average deaths per unit per year = 196 / 280 = 0.70.

Assume the number of deaths (for one unit in one year) is a Poisson RV with mean 0.70. Then the predicted and actual distributions are as follows:

Deaths | Theoretical # of Units | Observed # of Units
0      | 139.04                 | 144
1      | 97.33                  | 91
2      | 34.07                  | 32
3      | 7.95                   | 11
4      | 1.39                   | 2
5+     | 0.22                   | 0
Total  | 280                    | 280

Some sources give an alternate account of the data:


- 10 Prussian army corps units observed over 20 years (1875-1894).
- Total deaths = 122.
- Average deaths per unit per year = 122 / 200 = 0.61.

Deaths | Theoretical # of Units | Observed # of Units
0      | 108.67                 | 109
1      | 66.29                  | 65
2      | 20.22                  | 22
3      | 4.11                   | 3
4      | 0.63                   | 1
5      | 0.08                   | 0
6      | 0.01                   | 0
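The theoretical columns in both tables are just $N \cdot P(X = k)$ for a Poisson pmf; a minimal scipy sketch reproduces them:

```python
from scipy import stats

for n_units, mean in [(280, 0.70), (200, 0.61)]:
    expected = [round(n_units * stats.poisson.pmf(k, mean), 2) for k in range(5)]
    tail = round(n_units * stats.poisson.sf(4, mean), 2)  # 5 or more deaths
    print(n_units, expected, tail)
```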

Another Example

During World War II, the Germans attacked London with V-2 flying bombs. It was observed that the impacts of the bombs tended to be grouped in clusters, rather than showing a random distribution.

A possible explanation was that (a) specific areas were targeted and (b) the precision of the bombs was very high. However, the bombs were launched from across Europe, so this explanation seemed implausible.

The following data were taken:
- 144 square kilometers of south London were divided into 576 squares of 1/4 square kilometer each. A count was made of the number of bombs in each square.
- Total bombs observed: 537.
- Average bombs per square: 537 / 576 = 0.932.

Assume the number of bombs in a square is a Poisson RV with mean 0.932. Then the predicted and actual distributions are as follows:

Bombs per Square | Theoretical # of Squares | Observed # of Squares
0      | 226.74 | 229
1      | 211.39 | 211
2      | 98.54  | 93
3      | 30.62  | 35
4      | 7.14   | 7
5+     | 1.57   | 1
Total  | 576.00 | 576

Conclusion: When rare events are randomly distributed, there tend to be gaps in which no events occur and then periods in which events appear in clusters. Mentally, we tend to forget about the gaps and focus on the unusual occurrence of multiple rare events in the same space, giving an inflated illusion of rare-event clustering. It would actually be quite unusual to see rare events evenly distributed throughout time or space in a grid-like fashion.

    Poisson Convergence

    Why does the Poisson distribution work so well?


Roughly speaking, one way to think of a Poisson RV is as the sum of a large number of independent rare events (not necessarily identical). We motivate this with an example:

Let $X_i = 1$ if person $i$ enters Giant between 12:05 and 12:10 pm on 9/6/05, and $X_i = 0$ otherwise.

Let $X = \sum_{i=1}^{N} X_i$, where the summation is over all the people in Fairfax County.

Check conditions:
- Large number of events? Yes.
- Independent events? Mostly.
  - Counter-example: Many customers arrive in a short time period. Subsequent customers see a full parking lot and decide not to enter.
  - Counter-example: One car comes with multiple people.
- Rare events? Yes.
- Identical probabilities of events? No.

For the moment, we assume all events are identical, and we relax the assumption later. Specifically, we suppose $P(X_i = 1) = A/N$ for all $i$. Note: $E[X] = A$.

Based on these assumptions, we have a binomial distribution:

$P\left(\sum_{i=1}^{N} X_i = k\right) = \binom{N}{k}\left(\frac{A}{N}\right)^k \left(1 - \frac{A}{N}\right)^{N-k} = \frac{N!}{(N-k)!\,k!}\left(\frac{A}{N}\right)^k \left(1 - \frac{A}{N}\right)^{N-k}$

$= \frac{N(N-1)\cdots(N-k+1)}{N^k} \cdot \frac{A^k}{k!} \cdot \frac{(1 - A/N)^N}{(1 - A/N)^k} \to 1 \cdot \frac{A^k}{k!} \cdot e^{-A} = e^{-A}\frac{A^k}{k!}$

... a Poisson random variable!

In other words, Bin($N$, $p$) is approximately Poisson($Np$) under the previous assumptions.
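A minimal numeric sketch of this approximation:

```python
from scipy import stats

N, A = 10_000, 3.0
for k in range(6):
    # binomial pmf vs Poisson pmf: nearly identical for large N, small p = A/N
    print(k, stats.binom.pmf(k, N, A / N), stats.poisson.pmf(k, A))
```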

Now, we eliminate the identically distributed assumption:

Theorem. Let $X_{n,m}$ ($1 \le m \le n$) be a sequence of RVs where, for each $n$:

- The $X_{n,m}$ are independent.
- $X_{n,m} = 1$ w.p. $p_{n,m}$, and $0$ otherwise (we are counting events).
- $p_{n,1} + \cdots + p_{n,n} \to A \in (0, \infty)$ (collectively, the events are rare, since $n$ is large).
- $\max_{1 \le m \le n} p_{n,m} \to 0$ (all events are rare; no one event hogs the probability).

Then $X_{n,1} + \cdots + X_{n,n} \to \text{Poisson}(A)$.


Note: This looks similar to the Central Limit Theorem. However, in the CLT, condition 3 is replaced with $p_{n,1} + \cdots + p_{n,n} \approx nA$ (in other words, the means of the random variables $X_{n,m}$ are approximately constant in $n$, so the mean of the sum grows linearly in $n$). Here, the means of the random variables are shrinking in $n$, so the mean of the sum stays roughly constant.

Example

400 students are in a calculus class. Let $X$ be the number of students who have a birthday on the day of the final. What is the probability that there are 2 or more birthdays on the final?

Let $X_i = 1$ if student $i$ has a birthday on the final, and $0$ otherwise.

$P(X_i = 1) = 1/365$

$X = \sum_{i=1}^{400} X_i$ is approximately Poisson with mean $A = 400/365$.

Then, $P(X \ge 2) = 1 - P(X < 2) = 1 - e^{-A} - A e^{-A}$, where $A = 400/365$. (Answer: 0.2995)
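A minimal sketch computing the Poisson approximation and, for comparison, the exact binomial answer:

```python
import math
from scipy import stats

A = 400 / 365
poisson_approx = 1 - math.exp(-A) - A * math.exp(-A)  # 1 - P(0) - P(1)
exact_binomial = stats.binom.sf(1, 400, 1 / 365)      # P(X >= 2) exactly
print(poisson_approx, exact_binomial)                 # 0.2995 vs nearly the same
```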

Preliminary Definitions

Def. A stochastic process is a collection of random variables (RVs) indexed by time: $\{X(t), t \in T\}$.

- If $T$ is a continuous set, the process is a continuous-time stochastic process (e.g., the Poisson process).
- If $T$ is countable, the process is a discrete-time stochastic process (e.g., a Markov chain).

Def. A counting process is a stochastic process $\{N(t); t \ge 0\}$ such that

- $N(t) \in \{0, 1, 2, \ldots\}$ (that is, $N(t)$ is a non-negative integer).
- If $s < t$ then $N(s) \le N(t)$ (that is, $N(t)$ is non-decreasing in $t$).
- For $s < t$, $N(t) - N(s)$ is the number of events occurring in the time interval $(s, t]$.

Interpretation: $N(t)$ is the number of events that have occurred by time $t$.

Def. A counting process has independent increments if the numbers of events in disjoint (non-overlapping) intervals are independent.

Def. A counting process has stationary increments if the distribution of the number of events in an interval depends on the length of the interval, but not on the starting point of the interval. That is, $P(N(s+t) - N(s) = n)$ does not depend on $s$. Intuitively, the interval can be slid around without changing its stochastic nature.

Def. A function $f(\cdot)$ is $o(h)$ if $\lim_{h \to 0} \frac{f(h)}{h} = 0$. That is, $f(\cdot)$ goes to zero faster than $h$ goes to zero.

Example: Which functions are $o(h)$?

$f(x) = x^2$: yes
$f(x) = 0.01x$: no
$f(x) = x^{1.5}$: yes
$f(x) = x^2 + x$: no


Definitions of the Poisson Process

Definition 1: A Poisson process is a counting process $\{N(t); t \ge 0\}$ with rate $\lambda > 0$, if:

1. $N(0) = 0$.
2. The process has independent increments.
3. The number of events in any interval of length $t$ is a Poisson RV with mean $\lambda t$. That is, for all $s, t \ge 0$ and $n = 0, 1, 2, \ldots$,

$P(N(s+t) - N(s) = n) = e^{-\lambda t}\dfrac{(\lambda t)^n}{n!}$

Example: Consider people entering a McDonald's over a short period of time, say 20 minutes.

Q: How do you verify these conditions?
A: Condition 1 holds. Condition 2 may hold if people do not come in batches. It is hard to verify condition 3 without collecting data.

[Note: Cinlar (1975), Introduction to Stochastic Processes, gives a similar definition, without assuming independent increments. Assumption 3 is changed to: the number of events on any finite union of disjoint intervals is a Poisson RV with mean $\lambda b$, where $b$ is the length of the union.]

Is it possible to use the physics of the situation to derive a Poisson process, similar to the rare-event law given previously?

Definition 2: A Poisson process is a counting process $\{N(t); t \ge 0\}$ with rate $\lambda > 0$, if:

1. $N(0) = 0$.
2. The process has stationary increments.
3. The process has independent increments.
4. $P(N(h) = 1) = \lambda h + o(h)$. (# of events approximately proportional to the length of the interval)
5. $P(N(h) \ge 2) = o(h)$. (can't have 2 or more events at the same time: "orderliness")

This is a more fundamental, qualitative definition of the Poisson process.

Theorem: Definitions 1 and 2 are equivalent.


Q: Can these conditions be verified for the McDonald's example?
A: Stationarity over small intervals is OK; independent increments are not valid if external events occur.

[Note: Cinlar (1975), Introduction to Stochastic Processes, gives a similar definition, without assumptions 4 and 5, instead assuming that the process has only unit jumps. Assumption 4 can actually be eliminated: in Cinlar (1975), Lemma 1.8 derives (4) from (1), (2), (3), (5), and the fact that the process is a counting process.]

Eliminating individual assumptions yields variations on the Poisson process:
- Eliminate Assumption 2: non-stationary Poisson process.
- Eliminate Assumption 3: mixture of Poisson processes (choose $\lambda$ randomly, then run a Poisson process).
- Eliminate Assumption 5: compound Poisson process.

Def. 2 Implies Def. 1

Assume a Poisson process under definition 2. Consider a time horizon $[0, T]$ divided up into $n$ bins (where $n$ is large):

[Figure: interval $[0, T]$ divided into $n$ bins.]

- Suppose on average $\lambda T$ events arrive in the time period $[0, T]$.
- By orderliness (Property 5), there is (loosely speaking) at most 1 event in each bin.
- By Property 4 and stationarity (Property 2), $P(\text{1 event in a given bin}) \approx \dfrac{\lambda T}{n}$.
- By independent increments (Property 3), the numbers in each bin are independent.

Therefore, the total number of events is approximately a binomial distribution bin($n$, $p = \lambda T/n$). By the previous discussion on Poisson convergence, the total number of events in the interval is approximately Poisson with mean $np = \lambda T$.

Additional Poisson Properties

Let $T_1, T_2, \ldots, T_n$ be the inter-event times for a Poisson process ($T_n$ is the time between events $n-1$ and $n$). Let $S_1, S_2, \ldots, S_n$ be the times of each event (ordered in time). Then

$T_N = S_N - S_{N-1}$, with $T_1 = S_1 - S_0 = S_1$ and $S_N = \sum_{i=1}^{N} T_i$,

and the following are equivalent:

$\sum_{i=1}^{N} T_i \le t \iff S_N \le t \iff N(t) \ge N$


Inter-event Times

First, we derive the distribution of the time $T_1$ of the first event. $P(T_1 > t)$ is the probability that no events occur in $[0, t]$. The number of events in $[0, t]$ is a Poisson RV with mean $\lambda t$. So,

$P(T_1 > t) = P(N(t) = 0) = e^{-\lambda t}\dfrac{(\lambda t)^0}{0!} = e^{-\lambda t}$.

This is the CCDF of an exponential random variable. So, $T_1 \sim \exp(\lambda)$.

Now, we derive the distribution of the second inter-event time $T_2$. First, we condition on the time of the first event:

$P(T_2 > t \mid T_1 = s)$
$= P(N(s+t) - N(s) = 0 \mid T_1 = s)$  [0 events in $(s, s+t]$, 1 event in $[0, s]$]
$= P(N(s+t) - N(s) = 0)$  [by independent increments]
$= P(N(t) - N(0) = 0)$  [by stationary increments]
$= P(N(t) = 0)$  [since $N(0) = 0$]
$= e^{-\lambda t}$

Since $P(T_2 > t \mid T_1 = s) = e^{-\lambda t}$ does not depend on $s$, $P(T_2 > t) = P(T_2 > t \mid T_1 = s) = e^{-\lambda t}$. So, $T_2 \sim \exp(\lambda)$ and $T_2$ is independent of $T_1$.

We can continue with the same logic for $T_3, T_4, \ldots$

Definition 3: A Poisson process with rate $\lambda$ is a counting process such that the times between events are i.i.d. with distribution $\exp(\lambda)$.
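Definition 3 is also the standard recipe for simulating a Poisson process: generate i.i.d. exponential inter-event times and take cumulative sums. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
lam, t_max = 2.0, 100.0

# i.i.d. exp(lam) inter-event times; cumulative sums give the event times S_n
gaps = rng.exponential(scale=1 / lam, size=int(3 * lam * t_max))
times = np.cumsum(gaps)
times = times[times <= t_max]

print(len(times), lam * t_max)  # N(t_max) has mean lam * t_max
```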

Conditional Distribution of Event Times

Given one event in $[0, t]$, what is the distribution of $T_1$?

$P(T_1 \le s \mid N(t) = 1) = \dfrac{P(T_1 \le s,\; N(t) = 1)}{P(N(t) = 1)} = \dfrac{P(\text{1 event in } [0, s],\ \text{0 events in } (s, t])}{P(N(t) = 1)}$

$= \dfrac{P(\text{1 event in } [0, s]) \cdot P(\text{0 events in } (s, t])}{P(N(t) = 1)}$  (by independent increments)

$= \dfrac{\frac{\lambda s}{1!} e^{-\lambda s} \cdot e^{-\lambda(t-s)}}{\frac{\lambda t}{1!} e^{-\lambda t}} = \dfrac{s}{t}$.

So, $P(T_1 \le s \mid N(t) = 1) = \dfrac{s}{t}$.

This is the CDF of a uniform distribution on $[0, t]$. Thus, given one event in $[0, t]$, its location is uniformly distributed on $[0, t]$.


The general result (not proven here) is:

Theorem 5.2: Given $n$ events in $[0, t]$ (i.e., $N(t) = n$), the un-ordered event times $S_1, S_2, \ldots, S_n$ are distributed as i.i.d. uniform random variables on $[0, t]$.

"Un-ordered" means that the event times $S_1, S_2, \ldots, S_n$ are not listed in their order of occurrence (that is, we do not require $S_1 < S_2 < \cdots < S_n$).

OR / STAT 645: Stochastic Processes

Lecture 3: The Poisson Process: Further Properties, Generalizations, and Applications (Given: 9/14/2006)

Splitting a Poisson Process

Problem set-up / assumptions:
- Let $N(t)$ be a Poisson process with rate $\lambda$.
- Each event is labeled:
  - Type-I with probability $p$,
  - Type-II with probability $1 - p$.
- Assignments of event types are i.i.d.
- Split Poisson process:
  - Let $N_I(t)$ be the number of Type-I events by time $t$.
  - Let $N_{II}(t)$ be the number of Type-II events by time $t$.

Proposition (5.2, p. 296). $N_I(t)$ and $N_{II}(t)$ are independent Poisson processes with rates $\lambda p$ and $\lambda(1-p)$, respectively.

Proof.
- $N_I(0) = 0$.
- $N_I(t)$ has stationary and independent increments.
- $P(N_I(h) \ge 2) \le P(N(h) \ge 2) = o(h)$.
- $P(N_I(h) = 1) = P(N_I(h) = 1 \mid N(h) = 1)\,P(N(h) = 1) + P(N_I(h) = 1 \mid N(h) \ge 2)\,P(N(h) \ge 2)$
  $= p\,(\lambda h + o(h)) + P(N_I(h) = 1 \mid N(h) \ge 2)\,o(h) = p\lambda h + o(h)$.

Proposition (5.3, p. 303): Same assumptions as above, except:

- An event at time $t$ is a type-$i$ event with probability $p_i(t)$, $i = 1, 2, \ldots, n$, where $\sum_{i=1}^{n} p_i(t) = 1$ (independent of all else). Note: the splitting probability may depend on time.
- Let $N_i(t)$ be the number of type-$i$ events by time $t$.

Then the $N_i(t)$ are independent Poisson random variables, with $E[N_i(t)] = \lambda \int_0^t p_i(s)\,ds$.

Note: the split processes are not technically Poisson processes. Why?

Corollary. If the splitting probabilities have no time dependence, then the split processes are independent Poisson processes with rate $\lambda p_i$ (or mean $\lambda p_i t$).

Example

Calls to a central office arrive according to a Poisson process with rate $\lambda = 20$ per minute.


The probability that an arriving call is a voice call is 80%; the probability of a data call is 20%, independent of all else.

What is the probability that 100 or more voice calls and 50 or more data calls arrive in a 5-minute period?

- Voice calls have a Poisson distribution with mean $20 \cdot 0.8 \cdot 5 = 80$.
- Data calls have a Poisson distribution with mean $20 \cdot 0.2 \cdot 5 = 20$.
- The two random variables are independent.

The answer is

$\left(1 - \sum_{i=0}^{99} e^{-80}\frac{80^i}{i!}\right)\left(1 - \sum_{i=0}^{49} e^{-20}\frac{20^i}{i!}\right)$.

What is the probability that there are more data calls than voice calls in a 5-minute period?

$\sum_{i=0}^{\infty} e^{-80}\frac{80^i}{i!} \sum_{j=i+1}^{\infty} e^{-20}\frac{20^j}{j!}$
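Both quantities are straightforward to compute; a minimal scipy sketch (the second sum is truncated far out in the tail):

```python
from scipy import stats

voice = stats.poisson(80)   # mean 20 * 0.8 * 5
data = stats.poisson(20)    # mean 20 * 0.2 * 5

# P(voice >= 100) * P(data >= 50), using independence
p_both = voice.sf(99) * data.sf(49)

# P(data > voice) = sum_i P(voice = i) * P(data > i)
p_more_data = sum(voice.pmf(i) * data.sf(i) for i in range(400))
print(p_both, p_more_data)
```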

Example: Minimizing # of Encounters (Optional)

Assumptions:
- Cars enter the highway according to a Poisson process with rate $\lambda$.
- The velocity of each car is constant, but chosen according to distribution $G$.
- Cars pass each other with no loss of time.

Q: What speed should you travel to minimize the number of encounters?

Solution
Consider a section of highway with length $d$ and the following variables:

            Time Enter Highway | Time on Highway | Velocity
You         0                  | $t = d/v$       | $v$
Other car   $s$                | $T = d/V$       | $V$

The decision variable is $v$ (or equivalently $t$).

An encounter with this car occurs if:
- $s < 0$ and $T + s > t$ (you pass the car), or
- $s > 0$ and $T + s < t$ (the car passes you).

We classify all other cars into those involving an encounter with you and those not. A car arriving at time $s$ is involved in an encounter with probability $p(s)$:

$p(s) = \begin{cases} F^c(t - s) & \text{if } s < 0 \\ F(t - s) & \text{if } s > 0 \end{cases}$

where $F(t) = P(T \le t) = P(d/V \le t) = P(V \ge d/t) = G^c(d/t)$ is the CDF of the time spent by cars on this section of highway. (Note: $F(t - s) = 0$ when $t - s < 0$.)


Think of other cars arriving as a Poisson process, and classify cars by whether or not they have an encounter with you. By Poisson splitting, the number of cars (over all time) involved in an encounter with you is a Poisson random variable with mean:

$\lambda \int_{-\infty}^{\infty} p(s)\,ds = \lambda \int_{-\infty}^{0} F^c(t - s)\,ds + \lambda \int_0^{\infty} F(t - s)\,ds$

(Note: we start counting time, for the Poisson splitting, at $-\infty$, rather than at 0.)

$= \lambda \int_t^{\infty} F^c(s)\,ds + \lambda \int_{-\infty}^{t} F(s)\,ds$  (change of variables)

To minimize this mean, take the derivative with respect to $t$ and set it equal to 0:

$-F^c(t) + F(t) = 0$

This implies that $F(t) = F^c(t)$. In other words, $t$ is the median of the travel times on the road. Equivalently, you should travel at the median velocity of all cars on the road.

Application: M/G/∞ Queue

Notation:
- M: Markovian or memoryless arrival process (i.e., a Poisson process).
- G: General service time (not necessarily exponential).
- ∞: Infinite number of servers.

Let
- $X(t)$ be the number of customers who have completed service by time $t$,
- $Y(t)$ be the number of customers who are being served at time $t$,
- $N(t)$ be the total number of customers who have arrived by time $t$.

Then $N(t) = X(t) + Y(t)$.

Splitting the arrival process:
- Fix a reference time $T$. Consider the process of customers arriving prior to time $T$ (i.e., assume $t \le T$). Note: the notation is slightly different than the book, p. 304.
- A customer arriving at $t \le T$ is
  - Type-I if service is completed before $T$; this occurs with probability $G(T - t)$.
  - Type-II if the customer is still in service at $T$; this occurs with probability $G^c(T - t)$.

Since arrival times and service times are all independent, the type assignments are independent. Therefore, we can apply Proposition 5.3:

- $X(T)$ is a Poisson random variable with mean $\lambda \int_0^T G(T - t)\,dt = \lambda \int_0^T G(t)\,dt$.
- $Y(T)$ is a Poisson random variable with mean $\lambda \int_0^T G^c(T - t)\,dt = \lambda \int_0^T G^c(t)\,dt$.
- $X(T)$ and $Y(T)$ are independent.


What happens when $T \to \infty$?

- $G(t) \to 1$ for large $t$. Therefore, $X(T)$ is approximately a Poisson random variable with mean $\lambda T$.
- $Y(T)$ is a Poisson random variable with mean $\lambda \int_0^T G^c(t)\,dt \to \lambda E[G]$ (why does the last equality hold?)

Summary: The number of customers in service in an M/G/∞ queue in steady state is a Poisson random variable with mean $\lambda E[G]$.

Note: If $\rho \equiv \lambda/\mu$ and $1/\mu = E[G]$, then the steady-state number in service is $\sim \text{Poisson}(\rho)$.

Example

Suppose insurance claims arrive according to a Poisson process with rate 5 per day. (Q: What types of insurance claims can be modeled this way? Hurricane claims? Auto-accident claims?) Suppose the time it takes to process an insurance claim is uniformly distributed on [1 day, 7 days]. What is the probability that there are no insurance claims being processed at a given moment?

Solution
The process can be modeled as an M/G/∞ queue. Assumptions made:
- Service times are independent.
- There are a large number of agents, so that effectively the number of servers is infinite (i.e., no claim ever waits for service).
- The system is in steady state.

Under these assumptions, the number $X$ of customers in service is a Poisson random variable with mean $\lambda E[G] = 5 \cdot 4 = 20$. Thus, $P(X = 0) = e^{-20}$.

Combining Poisson Processes

If $N_I(t)$ and $N_{II}(t)$ are independent Poisson processes with rates $\lambda_I$ and $\lambda_{II}$, respectively, and if $N(t)$ counts the number of events in both processes, then $N(t)$ is a Poisson process with rate $\lambda_I + \lambda_{II}$.

Why? Inter-event times in $N(t)$ are the minimum of inter-event times in $N_I(t)$ and $N_{II}(t)$. Hence, inter-event times in $N(t)$ are exponential with rate $\lambda_I + \lambda_{II}$ (using properties of the exponential distribution). Hence, $N(t)$ is a Poisson process with rate $\lambda_I + \lambda_{II}$.


Non-Homogeneous Poisson Process (NHPP)

Properties:
1. $N(0) = 0$.
2. $N(t)$ has independent increments.
3. $P[N(t+h) - N(t) = 1] = \lambda(t)h + o(h)$.
4. $P[N(t+h) - N(t) \ge 2] = o(h)$.

Notes:
- This is like a Poisson process, without the stationarity assumption.
- In property 3, if we had just a constant $\lambda$, then we would have a regular Poisson process (stationarity is implied by properties 3 and 4).

A process with the above properties is a NHPP with intensity (or rate) function $\lambda(t)$.

Def. The mean value function (for a NHPP) is

$m(t) = \int_0^t \lambda(u)\,du$

Note: If $\lambda(t) = \lambda$, then $m(t) = \lambda t$.

Key Property
For a NHPP, $N(t+s) - N(s)$ (the number of events between $s$ and $s+t$) is a Poisson random variable with mean $m(s+t) - m(s)$.

Proof (p. 316)

Divide the interval $[s, s+t]$ into $n$ bins. Let $N_i$ be the number of events in interval $i$.

- Index $i$ corresponds to the interval $\left(s + \frac{(i-1)t}{n},\, s + \frac{it}{n}\right]$.
- The bin width is $t/n$.

[Figure: interval from $s$ to $s+t$ divided into $n$ bins; $N(t+s) - N(s)$ counts events across the bins.]

Using the assumed properties:
- $P(N_i \ge 2) \approx 0$. (Property 4)


- $P(N_i = 1) \approx \lambda\!\left(s + \frac{it}{n}\right)\frac{t}{n}$. (Property 3)
- The $N_i$ are independent. (Property 2)

Then $N(s+t) - N(s) = \sum_{i=1}^{n} N_i$. For $n$ large, $N(s+t) - N(s)$ is the sum of a large number of independent, rare events. Thus, $N(s+t) - N(s)$ is approximately a Poisson random variable with mean:

$E[N(s+t) - N(s)] = E\left[\sum_{i=1}^{n} N_i\right] = \sum_{i=1}^{n} E[N_i]$

Now, $E[N_i] \approx P(N_i = 1) \approx \lambda\!\left(s + \frac{it}{n}\right)\frac{t}{n}$, so

$\sum_{i=1}^{n} E[N_i] \approx \sum_{i=1}^{n} \lambda\!\left(s + \frac{it}{n}\right)\frac{t}{n} \to \int_s^{s+t} \lambda(u)\,du = m(s+t) - m(s)$

Graphically, this is a Riemann sum: rectangles of height $\lambda(s + it/n)$ and width $t/n$ approximating the area under $\lambda(\cdot)$. [Figure omitted.]

Example

Consider a NHPP with rate

$\lambda(t) = 10$ for $0 \le t \le 0.5$, and $\lambda(t) = 20$ for $0.5 < t \le 1$.

Then

$m(t) = 10t$ for $0 \le t \le 0.5$, and $m(t) = 20t - 5$ for $0.5 < t \le 1$.


(Recall the compound Poisson process $X(t) = \sum_{i=1}^{N(t)} Y_i$, where $N(t)$ is a Poisson process with rate $\lambda$ and the $Y_i$ are i.i.d.)

$V[X(t)] = V\big[E[X(t) \mid N(t)]\big] + E\big[V[X(t) \mid N(t)]\big]$

Now,

$V[X(t) \mid N(t) = n] = V\left[\sum_{i=1}^{n} Y_i\right] = n\,V[Y_i]$

So,

$V\big[E[X(t) \mid N(t)]\big] + E\big[V[X(t) \mid N(t)]\big] = V\big[N(t)\,E[Y_i]\big] + E\big[N(t)\,V[Y_i]\big] = \lambda t\,(E[Y_i])^2 + \lambda t\,V[Y_i] = \lambda t\,(E[Y_i])^2 + \lambda t\,\big(E[Y_i^2] - (E[Y_i])^2\big)$

$V[X(t)] = \lambda t\,E[Y_i^2]$

Example (similar to 5.26)

People call Ticketmaster according to a Poisson process with rate $\lambda = 2$ per minute. The number of tickets ordered per call is 1, 2, 3, or 4 with probabilities 1/6, 1/3, 1/3, and 1/6, respectively.

What is the probability that at least 240 tickets are sold in the next 50 minutes?

Let $N(t)$ be the number of calls by time $t$. Let $Y_i$ be the number of tickets sold for call $i$. Let $X(t)$ be the number of tickets sold by time $t$.

Then $X(t)$ is a compound Poisson process, $X(t) = \sum_{i=1}^{N(t)} Y_i$, with:

$E(Y_i) = \dfrac{1 \cdot 1 + 2 \cdot 2 + 3 \cdot 2 + 4 \cdot 1}{6} = \dfrac{15}{6} = \dfrac{5}{2}$

$E(Y_i^2) = \dfrac{1^2 \cdot 1 + 2^2 \cdot 2 + 3^2 \cdot 2 + 4^2 \cdot 1}{6} = \dfrac{43}{6}$

$E(X(t)) = \lambda t\,E(Y_i) = (2)(50)(5/2) = 250$

$V(X(t)) = \lambda t\,E(Y_i^2) = (2)(50)(43/6) = \dfrac{2150}{3}$

Since $N(t)$ is relatively large, $X(t)$ is approximately a normal random variable. Thus,

$P(X(50) \ge 240) = P\left(\dfrac{X(50) - 250}{\sqrt{2150/3}} \ge \dfrac{240 - 250}{\sqrt{2150/3}}\right) \approx 1 - \Phi(-0.3735) = \Phi(0.3735) \approx 0.6456$
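A minimal simulation sketch of this compound Poisson probability alongside the normal approximation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
lam, t = 2.0, 50.0
tickets, probs = [1, 2, 3, 4], [1/6, 1/3, 1/3, 1/6]

# Simulate X(50) = sum of ticket counts over a Poisson number of calls
n_calls = rng.poisson(lam * t, size=50_000)
x = np.array([rng.choice(tickets, size=n, p=probs).sum() for n in n_calls])

normal_approx = stats.norm.sf((240 - 250) / np.sqrt(2150 / 3))
print((x >= 240).mean(), normal_approx)   # both approximately 0.65
```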

OR / STAT 645: Stochastic Processes

Lecture 4: Markov Chains, Discrete & Continuous Time (Given: 9/21/2006)

Discrete-Time Markov Chain (DTMC)

Let $X_n$ ($n = 0, 1, 2, \ldots$) be a stochastic process taking on a finite or countable number of values (generally, assume that $X_n \in \{0, 1, 2, \ldots\}$).

$X_n$ is a DTMC if it has the Markov property: given the present, the future is independent of the past:

$P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_1 = i_1, X_0 = i_0) = P(X_{n+1} = j \mid X_n = i)$

In this class, we assume that $X_n$ is stationary. That is, $P(X_{n+1} = j \mid X_n = i)$ does not depend on $n$; we write $P(X_{n+1} = j \mid X_n = i) \equiv p_{ij}$. The DTMC is said to have stationary transition probabilities.

Transition probabilities must satisfy $\sum_j p_{ij} = 1$.

We often write the transition probabilities as a matrix $P$. Q: Do columns or rows sum to 1? (Rows, by the constraint above.)

Continuous-Time Markov Chain (CTMC)

Let $X(t)$ ($t \ge 0$) be a stochastic process taking on a finite or countable number of values (generally, assume that $X(t) \in \{0, 1, 2, \ldots\}$).

$X(t)$ is a CTMC if it has the Markov property: given the present, the future is independent of the past:

$P(X(t+s) = j \mid X(s) = i,\ X(u) = x(u) \text{ for } 0 \le u < s) = P(X(t+s) = j \mid X(s) = i)$

In this class, we assume that $X(t)$ is stationary. That is, $P(X(t+s) = j \mid X(s) = i)$ does not depend on $s$, only on $t$. The CTMC is said to have stationary transition probabilities.

Distribution of Time in a State

Let $T_i$ be the time spent in state $i$ (before a transition). Suppose
- the MC enters state $i$ at time 0,
- the MC remains in state $i$ through time $s$.

What is the probability that the MC remains in state $i$ for at least an additional $t$ time units?

$P(T_i > s + t \mid T_i > s)$


$= P(T_i > s + t \mid X(s) = i)$  (by the Markov property)
$= P(T_i > t \mid X(0) = i)$  (by stationarity)
$= P(T_i > t)$

Thus, $T_i$ has the memoryless property, so $T_i \sim \exp(v_i)$.

CTMC: Alternate Definition

This gives an alternate definition for a CTMC. $X(t)$ is a CTMC if:

1. The amount of time spent in state $i$ (before a transition) is exponentially distributed with rate $v_i$: $T_i \sim \exp(v_i)$.
2. When the process leaves state $i$, it enters state $j$ w.p. $p_{ij}$.
3. All transitions and times are independent (in particular, the transition probability out of a state is independent of the time spent in the state).

Summary: the process moves from state to state according to a DTMC, and the time spent in each state is exponentially distributed.

- The transition probabilities $p_{ij}$ define the embedded DTMC.
- As before, $\sum_j p_{ij} = 1$.
- But now, we require that $p_{ii} = 0$ (otherwise, the time spent in state $i$ is not exponential).

Def. The instantaneous transition rate from state $i$ to $j$ is $q_{ij} \equiv v_i p_{ij}$, where $v_i$ is the instantaneous transition rate out of state $i$.

Note:

$\sum_j q_{ij} = v_i \sum_j p_{ij} = v_i$

$p_{ij} = \dfrac{q_{ij}}{\sum_j q_{ij}} = \dfrac{q_{ij}}{v_i}$

Thus, you can specify a CTMC with either $\{p_{ij}, v_i\}$ or $\{q_{ij}\}$.

Example

A company has 4 machines.
- The time until each machine breaks is exponentially distributed with mean 6 days.
- The repair time of each machine is exponentially distributed with mean 2 days.
- There is only one repair person.
- All random variables are independent.


Let $X(t)$ be the number of working machines at time $t$.

The transition rates out of each state are (why?):

$v_0 = 1/2$
$v_1 = 1/6 + 1/2 = 2/3$
$v_2 = 2/6 + 1/2 = 5/6$
$v_3 = 3/6 + 1/2 = 1$
$v_4 = 4/6 = 2/3$

The transition probabilities for the embedded DTMC are (why?):

P =
[ 0    1    0    0    0   ]
[ 1/4  0    3/4  0    0   ]
[ 0    2/5  0    3/5  0   ]
[ 0    0    1/2  0    1/2 ]
[ 0    0    0    1    0   ]

Or, define the Markov chain using the transition rates (off-diagonal entries $q_{ij}$):

Q =
[ .    1/2  0    0    0   ]
[ 1/6  .    1/2  0    0   ]
[ 0    2/6  .    1/2  0   ]
[ 0    0    3/6  .    1/2 ]
[ 0    0    0    4/6  .   ]

(In a moment, we will define the rate matrix Q with non-zero elements on the diagonal.)

Note: It is often easier to construct Q first and then construct P and the $v_i$.
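That workflow is easy to mechanize; a minimal numpy sketch that builds the $v_i$ and the embedded $P$ from the off-diagonal rates above:

```python
import numpy as np

# Off-diagonal rates q_ij for the 4-machine example (states 0..4 working machines)
Q = np.array([
    [0,   1/2, 0,   0,   0  ],
    [1/6, 0,   1/2, 0,   0  ],
    [0,   2/6, 0,   1/2, 0  ],
    [0,   0,   3/6, 0,   1/2],
    [0,   0,   0,   4/6, 0  ],
])

v = Q.sum(axis=1)     # v_i = sum_j q_ij, the rate out of each state
P = Q / v[:, None]    # embedded DTMC: p_ij = q_ij / v_i
print(v)              # [1/2, 2/3, 5/6, 1, 2/3]
print(P)              # matches the matrix above
```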

DTMC: n-Step Transition Probabilities

Def. n-step transition probability. Let $P^n_{ij}$ be the probability that the system is in state $j$ in $n$ steps, given the system is in state $i$ now:

$P^n_{ij} = P(X_{n+k} = j \mid X_k = i)$.

By stationarity,

$P^n_{ij} = P(X_n = j \mid X_0 = i)$

Note: $P^1_{ij} = p_{ij}$ (using our original notation).

Chapman-Kolmogorov equations:

$P^{n+m}_{ij} = P(X_{n+m} = j \mid X_0 = i)$

We must be in one of the possible states at time $n$:

$= \sum_k P(X_{n+m} = j,\ X_n = k \mid X_0 = i)$


Apply Bayes' rule (easier to see if we ignore $X_0 = i$):

$= \sum_k P(X_{n+m} = j \mid X_n = k,\ X_0 = i)\,P(X_n = k \mid X_0 = i)$

By the Markov property:

$= \sum_k P(X_{n+m} = j \mid X_n = k)\,P(X_n = k \mid X_0 = i)$

Thus,

(*) $P^{n+m}_{ij} = \sum_k P^m_{kj}\,P^n_{ik}$.

If $P^{(i)}$ is the matrix of $i$-step transition probabilities, then (*) is matrix multiplication:

$P^{(n+m)} = P^{(n)} P^{(m)}$

Also, $P^{(1)} = P$, so $P^{(n)} = P^{(1)} P^{(1)} \cdots P^{(1)} = P^n$. In other words, the $n$-step transition probabilities are the elements of the matrix obtained by raising $P$ to the $n$th power.

Example

[Figure: three-state chain on {0, 1, 2}: state 0 goes to 1 w.p. 1; state 1 goes to 0 w.p. 0.3 and to 2 w.p. 0.7; state 2 goes to 1 w.p. 1.]

P =
[ 0    1    0   ]
[ 0.3  0    0.7 ]
[ 0    1    0   ]

What is $P^2_{01}$? It should be 0.
What is $P^2_{02}$? It should be 0.7.

Check:

P^2 =
[ 0    1    0   ] [ 0    1    0   ]   [ 0.3  0    0.7 ]
[ 0.3  0    0.7 ] [ 0.3  0    0.7 ] = [ 0    1    0   ]
[ 0    1    0   ] [ 0    1    0   ]   [ 0.3  0    0.7 ]
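A one-liner confirms this with a matrix power:

```python
import numpy as np

P = np.array([[0.0, 1.0, 0.0],
              [0.3, 0.0, 0.7],
              [0.0, 1.0, 0.0]])

P2 = np.linalg.matrix_power(P, 2)
print(P2[0, 1], P2[0, 2])   # 0.0 and 0.7, as argued above
```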

CTMC: t-Time Transition Probabilities

Def. t-time transition probability. Let $P_{ij}(t)$ be the probability that the system is in state $j$ in $t$ time units, given the system is in state $i$ now:

$P_{ij}(t) = P(X(t+s) = j \mid X(s) = i) = P(X(t) = j \mid X(0) = i)$  (by stationarity)

Lemma 6.2

1. $\lim_{h \to 0} \dfrac{1 - P_{ii}(h)}{h} = v_i$  (the rate at which the process leaves $i$)


Proof. $P_{ii}(h) \ge P(\text{0 transitions in time } h) = P(T_i > h) = e^{-v_i h}$. Thus,

$\dfrac{1 - P_{ii}(h)}{h} \le \dfrac{1 - \left[1 - v_i h + \frac{(v_i h)^2}{2!} - \cdots\right]}{h} = v_i + \dfrac{o(h)}{h}$

2. $\lim_{h \to 0} \dfrac{P_{ij}(h)}{h} = q_{ij} = v_i p_{ij}$  (the rate at which the process goes from $i$ to $j$)

Proof. $P_{ij}(h) \approx P(\text{transition before time } h \text{ and transition is to state } j) = [1 - e^{-v_i h}]\,p_{ij} \approx [1 - (1 - h v_i)]\,p_{ij} = h v_i p_{ij}$

Lemma 6.3

$P_{ij}(t+s) = P(X(t+s) = j \mid X(0) = i) = \sum_k P(X(t+s) = j,\ X(t) = k \mid X(0) = i)$

Apply Bayes' rule (easier to see if we ignore $X(0) = i$):

$= \sum_k P(X(t+s) = j \mid X(t) = k,\ X(0) = i)\,P(X(t) = k \mid X(0) = i)$

By the Markov property:

$= \sum_k P(X(t+s) = j \mid X(t) = k)\,P(X(t) = k \mid X(0) = i)$

Thus,

(*) $P_{ij}(t+s) = \sum_k P_{ik}(t)\,P_{kj}(s)$.

Forward Chapman-Kolmogorov Equations

Basic idea: Apply (*) with a small time step $h$ (and use the rates from Lemma 6.2):

$P_{ij}(t+h) = \sum_k P_{ik}(t)\,P_{kj}(h)$

$P_{ij}(t+h) - P_{ij}(t) = \sum_{k \ne j} P_{ik}(t)\,P_{kj}(h) - [1 - P_{jj}(h)]\,P_{ij}(t)$

So,

$P'_{ij}(t) = \lim_{h \to 0} \dfrac{P_{ij}(t+h) - P_{ij}(t)}{h} = \lim_{h \to 0} \dfrac{\sum_{k \ne j} P_{ik}(t)\,P_{kj}(h) - [1 - P_{jj}(h)]\,P_{ij}(t)}{h}$

$P'_{ij}(t) = \sum_{k \ne j} q_{kj}\,P_{ik}(t) - v_j\,P_{ij}(t)$

Now, let us define $q_{jj} \equiv -v_j$. Then the previous expression becomes

$P'_{ij}(t) = \sum_k P_{ik}(t)\,q_{kj}$

This is just the matrix multiplication $P'(t) = P(t)\,Q$ with


$P(t) = \begin{pmatrix} P_{00}(t) & P_{01}(t) & \cdots \\ P_{10}(t) & P_{11}(t) & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$,  $Q = \begin{pmatrix} -v_0 & q_{01} & \cdots \\ q_{10} & -v_1 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$.

Thus, we usually define the transition rate matrix $Q$ with the negative diagonal elements as described.

The solution to the differential equation is

$P(t) = e^{Qt} = I + Qt + \dfrac{(Qt)^2}{2!} + \dfrac{(Qt)^3}{3!} + \cdots$

(This is the matrix analog of solving $x'(t) = ax \Rightarrow x(t) = Ce^{at}$.)

This solution is valid provided the $v_i$ are bounded. In particular, it works when the number of states is finite.
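For a finite chain, the matrix exponential is one library call; a minimal sketch for the 4-machine example (diagonal entries are $q_{jj} = -v_j$):

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([
    [-1/2,  1/2,  0,    0,    0   ],
    [ 1/6, -2/3,  1/2,  0,    0   ],
    [ 0,    2/6, -5/6,  1/2,  0   ],
    [ 0,    0,    3/6, -1,    1/2 ],
    [ 0,    0,    0,    4/6, -2/3 ],
])

P10 = expm(Q * 10.0)      # P(t) = e^{Qt} at t = 10 days
print(P10[4])             # state distribution at t = 10, starting from 4 working
print(P10.sum(axis=1))    # each row sums to 1
```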

OR / STAT 645: Stochastic Processes

Lecture 5: Markov Chains, Discrete & Continuous Time (Given: 9/28/2006)

Classifications of States

Def. A path is a sequence of states, where each transition has a positive probability of occurring.

Def. State $j$ is reachable from state $i$ (or $i \to j$) if there is a path from $i$ to $j$; equivalently, $P^n_{ij} > 0$ for some $n \ge 0$.

Def. States $i$ and $j$ communicate ($i \leftrightarrow j$) if $i$ is reachable from $j$ and $j$ is reachable from $i$. (Note: a state $i$ always communicates with itself.)

Def. A MC is irreducible if all states are in the same communication class.

Def. State $i$ is an absorbing state if $p_{ii} = 1$.

Def. A set of states $S$ is a closed set if no state outside of $S$ is reachable from any state in $S$ (like an absorbing state, but with multiple states).

Def. State $i$ is a transient state if there exists a state $j$ such that $j$ is reachable from $i$ but $i$ is not reachable from $j$.

Def. A state that is not transient is recurrent. There are two types of recurrent states:
1. Positive recurrent, if the expected time to return to the state is finite.
2. Null recurrent (less common), if the expected time to return to the state is infinite (this requires an infinite number of states).

Def. A state $i$ is periodic with period $k > 1$ if $k$ is the smallest number such that all paths leading from state $i$ back to state $i$ have a multiple of $k$ transitions.

Def. A state is aperiodic if it has period $k = 1$.

Def. A state is ergodic if it is positive recurrent and aperiodic.


Examples

[Figures: three small chains.]
- A two-state chain 0 ↔ 1: period = 2.
- A three-state cycle 0 → 1 → 2 → 0: period = 3.
- A three-state chain with return paths of two different lengths from state 1: period = 1 ($P^1_{11} = 0$, $P^2_{11} = 0$, $P^3_{11} > 0$, $P^4_{11} > 0$, and gcd(3, 4) = 1).

Communication Classes

Properties of communication:
1. $i \leftrightarrow i$ (reflexivity)
2. $i \leftrightarrow j \Rightarrow j \leftrightarrow i$ (symmetry)
3. $i \leftrightarrow j$ and $j \leftrightarrow k \Rightarrow i \leftrightarrow k$ (transitivity)

These three properties partition the set of states into communication classes. The classes are disjoint, and every state is contained in exactly one class. Each class contains states that communicate with each other. If there is only one class, the MC is irreducible.

Example
Gambler's Ruin: You win $1 with probability $p$ and lose $1 with probability $1-p$. You stop when you reach $0 or $N. For example, for $N = 4$:

[Figure: chain on {0, 1, 2, 3, 4}; from states 1-3, up w.p. $p$ and down w.p. $1-p$; states 0 and 4 are absorbing.]

Communication classes are:
- {0}: recurrent
- {1, 2, 3}: transient
- {4}: recurrent

Example

[Figure: a four-state chain on {0, 1, 2, 3}.]


Communication classes are:
- {0, 1}: transient
- {2, 3}: recurrent

Transient and Recurrent Classes

Let $I_n = 1$ if $X_n = i$, and $0$ if $X_n \ne i$. Then $\sum_{n=0}^{\infty} I_n$ is the total number of visits to state $i$. The expected number of visits to state $i$ (given the MC starts in state $i$) is:

$E\left[\sum_{n=0}^{\infty} I_n \,\middle|\, X_0 = i\right] = \sum_{n=0}^{\infty} E[I_n \mid X_0 = i] = \sum_{n=0}^{\infty} P(X_n = i \mid X_0 = i) = \sum_{n=0}^{\infty} P^n_{ii}$

Therefore, the state is recurrent if $\sum_{n=0}^{\infty} P^n_{ii} = \infty$ and transient if $\sum_{n=0}^{\infty} P^n_{ii} < \infty$.

Technical note: Switching the expectation and the infinite sum is allowed by the monotone convergence theorem (e.g., Durrett, Probability Theory and Examples, p. 14): if $Y_j \ge 0$ and $Y_j \uparrow Y$, then $E(Y_j) \to E(Y)$. The proof is as follows. (For notational simplicity, assume all random variables are conditioned on $X_0 = i$.)

Let $Y_j = \sum_{n=0}^{j} I_n$ and $Y = \sum_{n=0}^{\infty} I_n$. Then $Y_j \ge 0$ and $Y_j \uparrow Y$, so the MCT can be used. The switching of the expectation and infinite sum is proved by:

$\sum_{n=0}^{\infty} E[I_n] = \lim_{j \to \infty} \sum_{n=0}^{j} E[I_n] = \lim_{j \to \infty} E\left[\sum_{n=0}^{j} I_n\right] = \lim_{j \to \infty} E(Y_j) = E(Y) = E\left[\sum_{n=0}^{\infty} I_n\right]$

Random Walk

With probability $p$, we move up 1 step; with probability $1-p$, we move down 1 step:

[Figure: chain on the integers; from each state, up w.p. $p$ and down w.p. $1-p$.]

Is this chain recurrent or transient?

The probability of returning to state 0 in $2n$ steps is:

$P^{2n}_{00} = \binom{2n}{n} p^n (1-p)^n = \dfrac{(2n)!}{n!\,n!}\,p^n (1-p)^n$

Use Stirling's approximation: $n! \approx n^{n+1/2} e^{-n} \sqrt{2\pi}$.


    ( )

    2 1/ 2 22

    00 21/ 2

    (2 ) 2(1 )

    2

    n nn n n

    n n

    n eP p p

    n e

    +

    + =

    2 1/ 22(1 )

    2

    nn n

    p pn

    +

    =

    4 (1 )n n np p

    n

    =

    State 0 is transient if 2001

    n

    n

    P

    =

    < or1

    4 (1 )n n n

    n

    p p

    n

    =

    < .

    If 1/ 2p= , then 2001 1

    1n

    n n

    Pn

    = =

    = = , so state 0 is recurrent.

    If 1/ 2p , then 2001 1 1

    nn n

    n n n

    aP a

    n

    = = =

    = < < , where 1a < , so state 0 is transient.

Note: A 2-dimensional symmetric random walk is recurrent. However, a 3-dimensional (or higher) symmetric random walk is transient.

    Limiting Probabilities (DTMC)

Theorem. For an irreducible, ergodic MC, π_j = lim_{n→∞} P_ij^n exists and is independent of the starting state i. The π_j are then the unique solution of π_j = ∑_i π_i P_ij and ∑_j π_j = 1.

Proof. Using the law of total probability:

P(X_{n+1} = j) = ∑_i P(X_{n+1} = j | X_n = i) P(X_n = i).

Taking limits of both sides as n → ∞:

lim_{n→∞} P(X_{n+1} = j) = ∑_i P(X_{n+1} = j | X_n = i) lim_{n→∞} P(X_n = i),

so π_j = ∑_i P_ij π_i.

In matrix form, this theorem can be stated: π = πP.

Two interpretations for π_i:

1. The probability of being in state i a long time into the future (large n).
2. The long-run fraction of time in state i.

If the MC is irreducible and ergodic, then interpretations 1 and 2 are equivalent. Otherwise, π_i is still the solution to π = πP, but only interpretation 2 is valid.


Example 1

P = | 0  1 |
    | 1  0 |

[π_0  π_1] = [π_0  π_1] P gives π_0 = π_1, and with π_0 + π_1 = 1:

π_0 = π_1 = 0.5

The chain is irreducible and positive recurrent, but not aperiodic. Thus, interpretation 1 is not valid. In particular, P_00^{2n} = 1 and P_00^{2n+1} = 0 for integer n.

Example 2: Planes arrive at Dulles airport.

Three types: Heavy (H), Large (L), Small (S). Assume the sequence of airplanes follows a MC with transition matrix:

         H    L    S
    H | 0.2  0.8  0   |
P = L | 0.3  0.3  0.4 |
    S | 0    0.7  0.3 |

Solving π = πP:

π_H = 0.2 π_H + 0.3 π_L + 0 π_S
π_L = 0.8 π_H + 0.3 π_L + 0.7 π_S
π_S = 0 π_H + 0.4 π_L + 0.3 π_S
π_H + π_L + π_S = 1

One equation is redundant; eliminate the most complicated equation: π_L = 0.8 π_H + 0.3 π_L + 0.7 π_S.


From the remaining equations:

π_H = (3/8) π_L
π_S = (4/7) π_L

So (3/8 + 1 + 4/7) π_L = 1, giving

π_H = 0.193
π_L = 0.514
π_S = 0.294
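As a numeric check of this example (a sketch; state order H, L, S follows the matrix above), we can solve π = πP by replacing one redundant balance equation with the normalization condition:

```python
import numpy as np

# Transition matrix of the airplane example (rows/columns ordered H, L, S)
P = np.array([[0.2, 0.8, 0.0],
              [0.3, 0.3, 0.4],
              [0.0, 0.7, 0.3]])

# pi P = pi  <=>  (P^T - I) pi^T = 0; replace one equation by sum(pi) = 1
A = P.T - np.eye(3)
A[-1, :] = 1.0
b = np.array([0.0, 0.0, 1.0])
pi = np.linalg.solve(A, b)
print(pi)   # approx [0.193, 0.514, 0.294]
```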

Example 3: Google

The following Markov chain is motivated by the Google search engine. Consider the following MC:

- States are web pages.
- Randomly choose a new page from the available links (w.p. 1/n, where n is the number of links on the current page).
- Page rank is determined by π_j, the overall fraction of visits to page j.

Note: Page rank is boosted by
- many links to the site;
- having the pages which link to the site have a high page rank themselves.

Some issues:
- Web pages with no links (absorbing states)
- Web pages with circular links (absorbing communication class)

Solution: At each site,
- with probability p, choose a random web page from all web pages;
- with probability 1 − p, choose a random web page from the existing links.
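A minimal sketch of this scheme on a hypothetical 4-page web graph (the link structure and the value of p here are made up for illustration):

```python
import numpy as np

# Hypothetical link structure: links[i] = pages that page i links to
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
N, p = 4, 0.15          # p = probability of jumping to a uniformly random page

# Modified chain: random page w.p. p, random existing link w.p. 1 - p
P = np.zeros((N, N))
for i, outs in links.items():
    P[i, :] = p / N
    for j in outs:
        P[i, j] += (1 - p) / len(outs)

pi = np.full(N, 1.0 / N)
for _ in range(200):     # power iteration converges to pi = pi P
    pi = pi @ P
print(pi)                # page ranks = long-run fractions of visits
```

Pages with more incoming links (pages 0 and 2 in this toy graph) end up with the larger values of π.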

    Limiting Probabilities (CTMC)

Let P_j = lim_{t→∞} P_ij(t) (assume no dependence on i).

Using the Chapman-Kolmogorov forward equations (recall, we defined q_jj = −v_j):

P′_ij(t) = ∑_k P_ik(t) q_kj    (in matrix form: P′(t) = P(t) Q)

lim_{t→∞} P′_ij(t) = ∑_k lim_{t→∞} P_ik(t) q_kj

Now, assuming that the limit exists, P′_ij(t) must go to zero, since probabilities are bounded by 0 and 1. Also, P_ik(t) → P_k (assuming the limit does not depend on the initial state i). Therefore:


0 = ∑_k P_k q_kj

In matrix notation, this is 0 = P Q, where P = [P_0  P_1  P_2  ⋯] and

    | −v_0   q_01   q_02   ⋯ |
Q = | q_10   −v_1   q_12   ⋯ |
    | q_20   q_21   −v_2   ⋯ |
    |  ⋮      ⋮      ⋮        |

Remarks: We have assumed that the limiting probabilities P_i exist (and do not depend on the initial condition). A sufficient condition for this is: the MC is positive recurrent and irreducible (note: we don't need aperiodicity here, as we did in the DTMC case).

Interpretation of this equation: writing out 0 = P Q for column j,

0 = ∑_{k≠j} P_k q_kj − P_j v_j,   i.e.,   P_j v_j = ∑_{k≠j} P_k q_kj.

The left-hand side is the rate of transitions out of state j. The right-hand side is the rate of transitions into state j.

    Example

- 3 machines, time to failure ~ exp(1)
- 2 service workers, time to repair ~ exp(8)

Take the state to be the number of working machines (so, e.g., in state 0 both workers are repairing, for a total repair rate of 16). Then 0 = [P_0 P_1 P_2 P_3] Q, with

    | −16   16     0    0 |
Q = |   1  −17    16    0 |
    |   0    2   −10    8 |
    |   0    0     3   −3 |

Column by column:

−16 P_0 + P_1 = 0             →  P_1 = 16 P_0
16 P_0 − 17 P_1 + 2 P_2 = 0   →  P_2 = 8 P_1 = 128 P_0
8 P_2 − 3 P_3 = 0             →  P_3 = (8/3) P_2 = (1024/3) P_0


Normalizing:

P_0 + P_1 + P_2 + P_3 = P_0 (1 + 16 + 128 + 1024/3) = 1

P_0 = 3/1459 = 0.00206
P_1 = 48/1459 = 0.03290
P_2 = 384/1459 = 0.26319
P_3 = 1024/1459 = 0.70185
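A numeric check of this machine-repair example (a sketch; I read the states as the number of working machines, matching the Q above):

```python
import numpy as np

# Rate matrix Q of the machine-repair example (state = number working)
Q = np.array([[-16.0,  16.0,   0.0,  0.0],
              [  1.0, -17.0,  16.0,  0.0],
              [  0.0,   2.0, -10.0,  8.0],
              [  0.0,   0.0,   3.0, -3.0]])

# 0 = P Q  <=>  Q^T P^T = 0; replace one equation by sum(P) = 1
A = Q.T.copy()
A[-1, :] = 1.0
b = np.array([0.0, 0.0, 0.0, 1.0])
P = np.linalg.solve(A, b)
print(P)   # approx [0.00206, 0.0329, 0.2632, 0.7019] = [3, 48, 384, 1024]/1459
```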

    Example: M/M/1 Queue

[Rate diagram: states 0, 1, 2, 3, 4, …; arrivals move up at rate λ, services move down at rate μ.]

    | −λ      λ        0        0      ⋯ |
Q = |  μ   −(λ+μ)      λ        0      ⋯ |
    |  0      μ     −(λ+μ)      λ      ⋯ |
    |  ⋮                                  |

Writing out 0 = [P_0 P_1 P_2 ⋯] Q column by column (together with ∑_i P_i = 1):

−λ P_0 + μ P_1 = 0              →  P_1 = (λ/μ) P_0
λ P_0 − (λ+μ) P_1 + μ P_2 = 0   →  P_2 = (λ/μ)² P_0
λ P_1 − (λ+μ) P_2 + μ P_3 = 0   →  P_3 = (λ/μ)³ P_0
⋯

Generalizing this we get:

P_i = (λ/μ)^i P_0


Applying ∑_{i=0}^∞ P_i = 1:

P_0 ∑_{i=0}^∞ (λ/μ)^i = 1  →  P_0 = 1 − λ/μ   (for λ < μ)


    OR / STAT 645: Stochastic Processes

    Lecture 6: Markov Chains, Applications, Branching Processes

    Limiting Probabilities (Review)

DTMC:

π = πP and ∑_i π_i = 1.

Sufficient conditions for the limiting probabilities and a unique solution to exist: irreducible and ergodic.

CTMC:

0 = PQ and ∑_i P_i = 1.

Sufficient conditions for the limiting probabilities and a unique solution to exist: irreducible and positive recurrent.

Since (under the given assumptions) the solution is unique, if you can guess π_i or P_i and then verify the above equations, then the π_i or P_i are the limiting probabilities.

    CTMC Example: Tandem Queue

[Diagram: customers arrive at rate λ to Queue #1 (service rate μ_1); upon completing service at Queue #1 they move to Queue #2 (service rate μ_2), then depart.]

Assumptions:
- Exponential inter-arrival times
- Exponential service times
- All times independent

To model this as a CTMC, choose a 2-dimensional state space: X(t) = (a, b), where
- a is the number at queue 1 (including any customer in service);
- b is the number at queue 2 (including any customer in service).

For a Markov chain like this, it is hard to write out the transition matrix Q, because the state space is 2-dimensional. Instead, we write out the rate balance equations for each state.

Let P_{a,b} be the limiting probability of being in state (a, b). The rate balance equations are:

1. Node (a, b), where a, b ≥ 1:

(λ + μ_1 + μ_2) P_{a,b} = λ P_{a−1,b} + μ_1 P_{a+1,b−1} + μ_2 P_{a,b+1}

2. Node (0, b), where b ≥ 1:

(λ + μ_2) P_{0,b} = μ_1 P_{1,b−1} + μ_2 P_{0,b+1}

3. Node (a, 0), where a ≥ 1:


(λ + μ_1) P_{a,0} = λ P_{a−1,0} + μ_2 P_{a,1}

4. Node (0, 0):

λ P_{0,0} = μ_2 P_{0,1}

These equations are based on the following rate diagram.

[Rate diagram: grid of states (a, b), with the number in Queue 1 on the horizontal axis and the number in Queue 2 on the vertical axis; from each state, an arrival arrow at rate λ (a increases by 1), a Queue-1 completion arrow at rate μ_1 (a decreases by 1, b increases by 1), and a Queue-2 completion arrow at rate μ_2 (b decreases by 1). Equations 1-4 correspond to interior, left-edge, bottom-edge, and corner nodes.]

Now, we guess the form of P_{a,b} and show that it satisfies all of the equations above.

Clearly, the first queue operates as an M/M/1 queue. Recall, the limiting probabilities for an M/M/1 queue are:

P_n = (1 − λ/μ) (λ/μ)^n

Conjecture that the second queue operates as an independent M/M/1 queue:

Queue 1: arrival rate = λ, service rate = μ_1
Queue 2: arrival rate = λ, service rate = μ_2

Thus, the joint distribution is:

P_{a,b} = (1 − λ/μ_1)(λ/μ_1)^a (1 − λ/μ_2)(λ/μ_2)^b

We can regard the terms that do not depend on a or b as normalizing constants:

P_{a,b} = C (λ/μ_1)^a (λ/μ_2)^b

    Check that this solves the above equations.

Equation (1): (λ + μ_1 + μ_2) P_{a,b} = λ P_{a−1,b} + μ_1 P_{a+1,b−1} + μ_2 P_{a,b+1}



Plugging in:

(λ + μ_1 + μ_2) C (λ/μ_1)^a (λ/μ_2)^b
   = λ C (λ/μ_1)^{a−1} (λ/μ_2)^b + μ_1 C (λ/μ_1)^{a+1} (λ/μ_2)^{b−1} + μ_2 C (λ/μ_1)^a (λ/μ_2)^{b+1}

Dividing by C (λ/μ_1)^a (λ/μ_2)^b:

λ + μ_1 + μ_2 = λ (μ_1/λ) + μ_1 (λ/μ_1)(μ_2/λ) + μ_2 (λ/μ_2)

λ + μ_1 + μ_2 = μ_1 + μ_2 + λ ✓

    Also need to check for other equations, but we omit that here.

Summary: The steady-state distribution for the number in each queue is as if the two queues are independent M/M/1 queues. But the second queue is not really independent of the first…
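The product-form guess can also be spot-checked numerically (a sketch with arbitrary rates λ = 1, μ_1 = 2, μ_2 = 3; it verifies the interior balance equation (1) at a few states):

```python
lam, mu1, mu2 = 1.0, 2.0, 3.0      # arbitrary rates with lam < mu1, mu2

def P(a, b):
    """Product-form guess: two independent M/M/1 queues."""
    return (1 - lam/mu1) * (lam/mu1)**a * (1 - lam/mu2) * (lam/mu2)**b

for a, b in [(1, 1), (2, 3), (5, 2)]:
    lhs = (lam + mu1 + mu2) * P(a, b)
    rhs = lam*P(a-1, b) + mu1*P(a+1, b-1) + mu2*P(a, b+1)
    print(abs(lhs - rhs) < 1e-12)   # True: equation (1) holds
```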

    DTMC Example: Family Genetics

Consider left- and right-handed people. The book What to Expect the Toddler Years provides the following probabilities of having left-handed children based on the handedness of the parents:

Parents   Prob(Left-handed Child)
LL        a = 0.50
LR        b = 0.17
RR        c = 0.02

Using this information, what is the fraction p of left-handed people implied by this data?

Let X_n be the handedness of the first-born of the nth generation (X_n ∈ {L, R}).

A left-handed child:
- Marries a left-handed spouse with probability p
  o Has a left-handed kid with probability a
  o Has a right-handed kid with probability 1 − a
- Marries a right-handed spouse with probability 1 − p
  o Has a left-handed kid with probability b
  o Has a right-handed kid with probability 1 − b

A right-handed child:
- Marries a left-handed spouse with probability p
  o Has a left-handed kid with probability b
  o Has a right-handed kid with probability 1 − b
- Marries a right-handed spouse with probability 1 − p
  o Has a left-handed kid with probability c
  o Has a right-handed kid with probability 1 − c

    Thus, the transition matrix


       L                     R
L | pa + (1−p)b      1 − [pa + (1−p)b] |
R | pb + (1−p)c      1 − [pb + (1−p)c] |

Now, solve π = πP:

π_L = π_L [pa + (1−p)b] + π_R [pb + (1−p)c],

where π_L represents the probability of being left-handed for large n, which by definition is p. Also, we have π_R = 1 − π_L. So:

p = p [pa + (1−p)b] + (1−p) [pb + (1−p)c]
p = p² a + 2p(1−p) b + (1−p)² c
0 = (a − 2b + c) p² + (2b − 2c − 1) p + c
0 = 0.18 p² − 0.70 p + 0.02

p = [ 0.70 ± √(0.49 − 4(0.02)(0.18)) ] / 0.36

p = 3.86 or 0.03

Choose the value of p that is a probability, so p = 0.03.
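The same roots can be obtained numerically (a sketch using numpy):

```python
import numpy as np

a, b, c = 0.50, 0.17, 0.02
# 0 = (a - 2b + c) p^2 + (2b - 2c - 1) p + c
print(np.roots([a - 2*b + c, 2*b - 2*c - 1, c]))   # approx [3.86, 0.03]
```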

The actual percentage of left-handed people is about 10%. What explains the incorrect value of p?

- It is not really a Markov chain: the next state may also depend on the handedness of grandparents, great-grandparents, etc.
- There are other factors influencing the next state.
- The probability that two left-handed parents have a left-handed child may not equal 50% (the value presented in the book was approximate).

    Branching Process

Consider a population.

- Each individual produces j new offspring each period with probability p_j, j ≥ 0.
- Assume that p_j < 1 for all j (i.e., the problem is not deterministic).
- Let X_n be the size of the population at period n.

The process is usually modeled as a DTMC with sample path X_0, X_1, X_2, …


[State diagram: states 0, 1, 2, 3, 4, …]

How many communication classes?

- State 0 is absorbing.
- All other states are transient (assuming p_0 > 0), since one can get to 0 from any state, but one cannot get back from 0. In other words, it is possible that all individuals fail to produce any offspring during the same time period.
- Nevertheless, it is possible that the MC has an infinite positive drift to the right (in other words, every state is transient, but you don't have to end up in state 0).

Let μ be the average number of offspring per individual. That is, μ = ∑_{j=0}^∞ j p_j.

- If μ ≤ 1, the system will always end up in state 0 (the population dies out).
- If μ > 1, the system may end up in state 0 or may grow to infinity.

    Fundamental question: What is the probability the population survives indefinitely?

Let Z_i be the number of offspring of the ith individual from generation n − 1. Then,

X_n = ∑_{i=1}^{X_{n−1}} Z_i

E[X_n] = ∑_{k=0}^∞ E[X_n | X_{n−1} = k] P(X_{n−1} = k)
       = ∑_{k=0}^∞ E[ ∑_{i=1}^k Z_i ] P(X_{n−1} = k)
       = ∑_{k=0}^∞ kμ P(X_{n−1} = k)
       = μ E[X_{n−1}]

Thus (if we start with one individual):

E[X_0] = 1
E[X_1] = μ
E[X_2] = μ²
⋮
E[X_n] = μ^n
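A quick simulation illustrating E[X_n] = μ^n (a sketch; the offspring distribution p_0 = 0.3, p_1 = 0.3, p_2 = 0.4 from the example below gives μ = 1.1):

```python
import random

def mean_generation_sizes(n_gens, runs=20000):
    """Average population size per generation, starting from one individual."""
    totals = [0.0] * (n_gens + 1)
    for _ in range(runs):
        x = 1
        totals[0] += x
        for g in range(1, n_gens + 1):
            # each of the x individuals has 0, 1, or 2 offspring
            x = sum(random.choices([0, 1, 2], [0.3, 0.3, 0.4])[0]
                    for _ in range(x))
            totals[g] += x
    return [t / runs for t in totals]

print(mean_generation_sizes(5))   # approx [1, 1.1, 1.21, 1.331, ...] = 1.1**n
```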


Let π_0 be the probability that the population dies out. Condition on X_1:

π_0 = ∑_{j=0}^∞ P(population dies out | X_1 = j) P(X_1 = j)

(*)  π_0 = ∑_{j=0}^∞ π_0^j p_j

When μ > 1, it can be shown that π_0 is the smallest positive number satisfying (*).

Example
Suppose:

p_0 = 0.3
p_1 = 0.3
p_2 = 0.4

From this data, μ = 1.1.

π_0 = 0.3 + 0.3 π_0 + 0.4 π_0²
0 = 0.3 − 0.7 π_0 + 0.4 π_0²

π_0 = [ 0.7 ± √(0.49 − 0.48) ] / 0.8 = (0.7 − 0.1)/0.8 = 3/4   (taking the smaller root)
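π_0 can also be found by iterating the fixed-point equation (*) (a sketch; starting the iteration at 0 converges to the smallest nonnegative root):

```python
def extinction_prob(p, iters=200):
    """Iterate pi <- sum_j p_j * pi^j, starting from pi = 0."""
    pi = 0.0
    for _ in range(iters):
        pi = sum(pj * pi**j for j, pj in enumerate(p))
    return pi

print(extinction_prob([0.3, 0.3, 0.4]))   # 0.75 = 3/4
```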


    OR 645: Stochastic Models II

    Lecture 7: Markov Chains, Birth/Death Processes, Reversible Chains

Birth-Death Process

DTMC Birth-Death Process: Let X_n be a DTMC whose only possible transitions are up (+1) and down (−1), like a random walk. Births are represented by +1, deaths by −1. Births and deaths occur one at a time.

CTMC Birth-Death Process: Let X(t) be a CTMC representing the population size at time t (p. 353). With i people present, births (arrivals) occur at rate λ_i and deaths (departures) occur at rate μ_i. CTMC characteristics: (a) time in each state, (b) embedded DTMC.

Probability matrix of the embedded DTMC:

    |       0             1              0             0       ⋯ |
P = | μ_1/(λ_1+μ_1)       0        λ_1/(λ_1+μ_1)       0       ⋯ |
    |       0       μ_2/(λ_2+μ_2)        0       λ_2/(λ_2+μ_2) ⋯ |
    |       ⋮                                                     |

Transition rate matrix:

    | −λ_0      λ_0         0         0     ⋯ |
Q = |  μ_1   −(λ_1+μ_1)    λ_1        0     ⋯ |
    |   0       μ_2     −(λ_2+μ_2)   λ_2    ⋯ |
    |   ⋮                                      |

[Rate diagram: states 0, 1, 2, 3, 4, …; up-arrows at rates λ_i, down-arrows at rates μ_i.]

For an M/M/3 queue:

λ_0 = λ_1 = λ_2 = ⋯ = λ
μ_1 = μ, μ_2 = 2μ, μ_3 = μ_4 = μ_5 = ⋯ = 3μ

Expected Time to State n: Let T_i be the time to first get to i+1, starting at i. Condition on the 1st step. Let

I_i = 1 if the 1st step is i → i+1; I_i = 0 if the 1st step is i → i−1.

If we move up (e.g., 2 → 3):

E[T_i | I_i = 1] = 1/(λ_i + μ_i),

which occurs with probability

P[I_i = 1] = λ_i / (λ_i + μ_i).

If we move down first (e.g., 2 → 1 → … → 3):


E[T_i | I_i = 0] = 1/(λ_i + μ_i) + E[T_{i−1}] + E[T_i],

which occurs with probability

P[I_i = 0] = μ_i / (λ_i + μ_i).

Unconditionally, E[T_i] = E[ E[T_i | I_i] ]:

E[T_i] = [λ_i/(λ_i+μ_i)] · 1/(λ_i+μ_i) + [μ_i/(λ_i+μ_i)] · [ 1/(λ_i+μ_i) + E[T_{i−1}] + E[T_i] ]

Solving for E[T_i]:

E[T_i] = 1/λ_i + (μ_i/λ_i) E[T_{i−1}],

where the initial condition is E[T_0] = 1/λ_0.

Example (HW 6.20): There are two machines, one of which is used as a spare. A working machine will function for an exponential time with rate λ and will then fail. Upon failure, it is immediately replaced by the other machine if that one is in working order, and it goes to the repair facility. The repair facility consists of a single person who takes an exponential time with rate μ to repair a failed machine. At the repair facility, the newly failed machine enters service if the repairperson is free. If the repairperson is busy, it waits until the other machine is fixed; at that time, the newly repaired machine is put in service and repair begins on the other one. Starting with both machines working, find the expected value and variance of the time until both are in the repair facility. In the long run, what proportion of time is there a working machine?

[Rate diagram: states 0, 1, 2 = number of machines in repair; failures move right at rate λ, repairs move left at rate μ.]

Here the state is the number in repair, with λ_0 = λ_1 = λ and μ_1 = μ, so:

E[T_0] = 1/λ

E[T_1] = 1/λ + (μ/λ) E[T_0] = 1/λ + μ/λ²

E[time until both machines are in repair] = E[T_0] + E[T_1] = 2/λ + μ/λ²
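A sketch of the E[T_i] recursion applied to this example (λ = μ = 1 chosen arbitrarily for the check):

```python
def expected_passage_times(lams, mus):
    """E[T_i] = 1/lam_i + (mu_i/lam_i) E[T_{i-1}], with E[T_0] = 1/lam_0."""
    ET = [1.0 / lams[0]]
    for i in range(1, len(lams)):
        ET.append(1.0 / lams[i] + mus[i] / lams[i] * ET[i - 1])
    return ET

lam, mu = 1.0, 1.0
ET = expected_passage_times([lam, lam], [0.0, mu])   # mus[0] is unused
print(ET, sum(ET))   # [1/lam, 1/lam + mu/lam**2]; total = 2/lam + mu/lam**2 = 3
```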


Variance in Time to State n:

By the conditional variance formula V[X] = V[ E[X|Y] ] + E[ V[X|Y] ]:

V[T_i] = V[ E[T_i | I_i] ] + E[ V[T_i | I_i] ]

where

E[T_i | I_i = 1] = 1/(λ_i+μ_i)
E[T_i | I_i = 0] = 1/(λ_i+μ_i) + E[T_{i−1}] + E[T_i]

with P[I_i = 1] = λ_i/(λ_i+μ_i) = p and P[I_i = 0] = μ_i/(λ_i+μ_i) = q = 1 − p.

Now note that, apart from the constant 1/(λ_i+μ_i), E[T_i | I_i] is a Bernoulli-type variable taking the value A = E[T_{i−1}] + E[T_i] with probability q and the value 0 with probability p. For such a variable, V[X] = E[X²] − E[X]² = A²q − A²q² = A²pq. Also note that the constant term 1/(λ_i+μ_i) contributes 0 to the variance. So:

V[ E[T_i | I_i] ] = ( E[T_{i−1}] + E[T_i] )² pq

For the second term,

V[T_i | I_i = 1] = 1/(λ_i+μ_i)²   (the variance of the exponential first-step time)
V[T_i | I_i = 0] = 1/(λ_i+μ_i)² + V[T_{i−1}] + V[T_i]   (first step, plus time back to state i, plus time from i to i+1)

so

E[ V[T_i | I_i] ] = 1/(λ_i+μ_i)² + q ( V[T_{i−1}] + V[T_i] )

Combining,

V[T_i] = ( E[T_{i−1}] + E[T_i] )² pq + 1/(λ_i+μ_i)² + q ( V[T_{i−1}] + V[T_i] ),

and the initial condition is V[T_0] = 1/λ_0².
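The variance recursion can be implemented the same way (a sketch; V[T_i] is isolated by moving the q·V[T_i] term to the left side and dividing by 1 − q = p):

```python
def passage_time_moments(lams, mus):
    """E[T_i] and V[T_i] for a birth-death chain via the recursions above."""
    ET = [1.0 / lams[0]]
    VT = [1.0 / lams[0] ** 2]
    for i in range(1, len(lams)):
        p = lams[i] / (lams[i] + mus[i])      # P(first step is up)
        q = 1.0 - p
        ET.append(1.0 / lams[i] + mus[i] / lams[i] * ET[i - 1])
        A = ET[i - 1] + ET[i]
        # V = A^2 p q + 1/(lam+mu)^2 + q (V[i-1] + V)  =>  solve for V
        VT.append((A**2 * p * q + (lams[i] + mus[i]) ** -2
                   + q * VT[i - 1]) / p)
    return ET, VT

print(passage_time_moments([1.0, 1.0], [0.0, 1.0]))   # V[T_1] = 6 when lam = mu = 1
```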

Note: In a regular Markov chain, we had P_ij = P[X_{n+1} = j | X_n = i]. Using Bayes' formula, define the reversed-chain transition probabilities:

Q_ij = P[X_n = j | X_{n+1} = i] = P[X_{n+1} = i | X_n = j] P[X_n = j] / P[X_{n+1} = i]

(Note: Don't confuse this with the CTMC's q_ij.) Assume that this is an ergodic, irreducible Markov chain, and let n → ∞. Then:


a. Q_ij = π_j P_ji / π_i. This is always true.

b. A chain is time reversible if Q_ij = P_ij. This is only true for a time-reversible MC.

Given the following sequence:

2 1 2 3 1 2 2 1 3 1 2 1 3 2 1 3

What if we had to estimate the occurrence of the 1 → 2 pattern? P_12 ≈ 3/6 = 1/2. Reading the same sequence in reverse, we get Q_12 ≈ 4/6 = 2/3. In other words, NOT EVERY CHAIN is time reversible. A Markov chain is time reversible if the number of transitions from i → j equals the number of transitions going from j → i. Expressing

Q_ij = π_j P_ji / π_i   as   π_i P_ij = π_j P_ji,

we can see that the transition rate i → j must equal the transition rate j → i.
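These counts can be reproduced directly (a sketch; it tallies the fraction of transitions out of state 1 that go to state 2, for the sequence and for its reverse):

```python
def frac_1_to_2(seq):
    """Fraction of transitions leaving state 1 that go to state 2."""
    moves_from_1 = [b for a, b in zip(seq, seq[1:]) if a == 1]
    return sum(b == 2 for b in moves_from_1) / len(moves_from_1)

seq = [2, 1, 2, 3, 1, 2, 2, 1, 3, 1, 2, 1, 3, 2, 1, 3]
print(frac_1_to_2(seq))         # 3/6 = 0.5, estimating P_12
print(frac_1_to_2(seq[::-1]))   # 4/6 = 0.667, estimating Q_12
```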

Example: The following chain is not time reversible; we can go 1 → 3 but we cannot go 3 → 1.

[Diagram: states 1, 2, 3 with transitions in one direction only around a cycle.]

Example: Consider a chain on the states 0, 1, 2 with a sample path such as:

0 1 2 1 0 1 0 1 2 1 0

Along any such path, the number of i → i+1 transitions is always within one of the number of i+1 → i transitions, so the forward and reverse transition rates across each link must be equal in the long run.

Conclusion: a birth-death process is time reversible.

Theorem: If you can find π_i that satisfy π_i P_ij = π_j P_ji and ∑_i π_i = 1, then the chain is time reversible and the π_i are the limiting probabilities. In other words, if you can guess a solution to π_i P_ij = π_j P_ji, then the Markov chain is time reversible.

Example: Given the following random walk:

[Diagram: states 0, 1, 2, 3; up-probability p and down-probability q at the interior states, with an up-probability of 1 at state 0.]

Is this a time-reversible Markov chain? YES. The typical approach (global balance) is:

1. π_0 = q π_1
2. (p + q) π_1 = π_0 + q π_2
3. (p + q) π_2 = p π_1 + q π_3

Time-reversible (guess) approach: let i = 0 and j = 1, and use π_i P_ij = π_j P_ji for each adjacent pair. We get:

1. π_0 (1) = π_1 q


2. π_1 p = π_2 q
3. π_2 p = π_3 q

Cut Method: Using a midterm exam problem to illustrate. Given the following Markov chain:

[Diagram: states 0, 1, 2, 3; upward probabilities 3/10, 2/10, 1/10 (for 0 → 1, 1 → 2, 2 → 3) and downward probabilities 1/2, 1, 1 (for 1 → 0, 2 → 1, 3 → 2), as used in the balance equations below.]

Cut across both paths between two neighboring states. The rate at which you cross from 1 → 2 must equal the rate of crossings going 2 → 1. The chain is reversible by inspection; this balance has to be true for every possible cut. Key point: π_i P_ij = π_j P_ji and ∑_i π_i = 1.

Example: Consider the midterm problem represented in the above figure.

Go from 0 → 1 and 1 → 0. Using π_i P_ij = π_j P_ji:

(3/10) π_0 = (1/2) π_1   →   π_1 = (3/5) π_0

Go from 1 → 2 and 2 → 1:

(2/10) π_1 = (1) π_2   →   π_2 = (2/10)(3/5) π_0 = (3/25) π_0

Go from 2 → 3 and 3 → 2:

(1/10) π_2 = (1) π_3   →   π_3 = (1/10)(3/25) π_0 = (3/250) π_0

Normalizing, π_0 + π_1 + π_2 + π_3 = π_0 (1 + 3/5 + 3/25 + 3/250) = 1, so

(π_0, π_1, π_2, π_3) = (250/433, 150/433, 30/433, 3/433)

Birth-death processes are time reversible, but a time-reversible process need not be a birth-death process (i.e., there are other Markov chains that are reversible besides birth-death chains).

Example: Consider the M/M/3 queue represented in the following figure.

[Rate diagram: states 0, 1, 2, 3, 4, 5, …; up-rate λ; down-rates μ, 2μ, 3μ, 3μ, 3μ, ….]

Assume: arrival rate = λ and service rate = μ for each individual server.

Go from 0 → 1 and 1 → 0. Using detailed balance, P_i q_ij = P_j q_ji:

λ P_0 = μ P_1   →   P_1 = (λ/μ) P_0


λ P_1 = 2μ P_2   →   P_2 = (λ/2μ) P_1 = [λ²/(2μ²)] P_0

λ P_2 = 3μ P_3   →   P_3 = (λ/3μ) P_2 = [λ³/(3! μ³)] P_0

and for any n > 3,

P_n = [λ^n / (3^{n−3} 3! μ^n)] P_0

Proposition 6.8 (p. 381): A time-reversible chain with limiting probabilities P_j, j ∈ S, that is truncated to the set A ⊆ S and remains irreducible is also time reversible and has limiting probabilities P_j^A given by

P_j^A = P_j / ∑_{i∈A} P_i,   j ∈ A

[Diagram: states 0, 1, 2, 3, 4, 5; states 4 and 5 are eliminated to create the truncated Markov chain on A = {0, 1, 2, 3}.]

This is a renormalization: having thrown away some states (see the figure above), the probabilities of the new truncated Markov chain again satisfy ∑_i P_i^A = 1, while the individual probabilities keep the same relationship relative to each other. The renormalization constant is ∑_{i∈A} P_i.


Example: M/M/3/3 (truncate the previous example to A = {0, 1, 2, 3})

P_3^A = [λ³/(3! μ³)] P_0 / [ ∑_{j=0}^3 (1/j!) (λ/μ)^j P_0 ]

where the renormalization constant is ∑_{j=0}^3 (1/j!) (λ/μ)^j P_0.
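A sketch computing these truncated probabilities (λ = 2, μ = 1 chosen arbitrarily; the last entry, P_3^A, is the fraction of time all three servers are busy):

```python
import math

def mmcc_probs(lam, mu, c=3):
    """Truncated (M/M/c/c) limiting probabilities:
    P_j^A proportional to (lam/mu)^j / j!, for j = 0..c."""
    r = lam / mu
    w = [r**j / math.factorial(j) for j in range(c + 1)]
    s = sum(w)
    return [x / s for x in w]

probs = mmcc_probs(2.0, 1.0)
print(probs, probs[-1])   # last entry = P_3^A, the blocking probability
```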

    In general, the Blocking Probability for M/M/C/C Queue is