A Review of the Discrete Time Markov Chains on a Finite State Space
Pejman Mahboubi
December 12, 2011
Conditional Probability
I A man observes a race between three horses a, b, and c. He feels that a and b have the same chance of winning, but that c is twice as likely to win as a. Suppose the man learns that horse b is not going to run. What is the probability of winning for a and c? I.e., we are looking for P(a|b^c) and P(c|b^c).
I The original probability space is Ω = {a, b, c}, with P(a) = P(b) = 1/4 and P(c) = 1/2. After learning that b won't run, we want to change the probability space.
I Our new sample space is Ω′ = {a, c}. We don't want to divide P(b) equally between P(a) and P(c).
I Instead we rescale P(a) and P(c): we want α s.t.

αP(a) + αP(c) = 1

I Then α = 1/P(b^c). The new probability P(·|b^c) is then defined by P(a|b^c) = P(a)/P(b^c).
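The rescaling above is easy to check numerically. A minimal sketch (the variable names are mine, not from the slides):

```python
# Conditioning by rescaling, for the horse-race example:
# P(a) = P(b) = 1/4, P(c) = 1/2, and we condition on b not running.
P = {"a": 0.25, "b": 0.25, "c": 0.5}

# Probability of the conditioning event b^c = {a, c}
P_b_comp = P["a"] + P["c"]

# Rescale the surviving outcomes by alpha = 1 / P(b^c)
P_cond = {w: p / P_b_comp for w, p in P.items() if w != "b"}

print(P_cond["a"], P_cond["c"])
```

The rescaled probabilities come out to 1/3 for a and 2/3 for c, so c remains twice as likely to win as a, exactly as intended.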
I Sometimes, by removing an event like b (i.e., conditioning on b^c), parts of the sets a or c are removed too, and we are left with a ∩ b^c and c ∩ b^c. Therefore we should rescale the probabilities of these two sets.
I This leads us to the general conditioning formula

P(A|B) = P(A ∩ B) / P(B).

I The expectation with respect to P(·|B) is called the conditional expectation and is denoted by E[·|B].
I If the distribution pX(·) of a discrete r.v. X with range {xn}∞n=1 is given, then E[X|B] = ∑∞n=1 xn pX|B(xn), where

pX|B(xn) = P(X = xn|B)
Conditional independence
I Coins a and b show heads with probabilities pa and pb. We choose one coin at random and toss it twice. What is Ω?

Ω = {a, b} × {H, T}^2

I To define P on Ω, we assume that given the coin, the first and the second toss are independent, i.e.,

P(a, T, H) = P(T, H|a)P(a) = pa(1 − pa) × 1/2
I Under P, are the events A = {first toss is H} and B = {second toss is T} independent?
I The answer is no. (Check it for yourself!)
I But, by definition, given the coin, A and B are independent.
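Both claims can be confirmed by a direct computation. The head-probabilities below are illustrative values of my choosing, not from the slides:

```python
pa, pb = 0.9, 0.1   # illustrative head-probabilities (my choice)

# Unconditionally: average over the coin choice (prob 1/2 each).
P_A  = 0.5 * (pa + pb)                         # P(first toss is H)
P_B  = 0.5 * ((1 - pa) + (1 - pb))             # P(second toss is T)
P_AB = 0.5 * (pa*(1 - pa) + pb*(1 - pb))       # P(first H and second T)
print(P_AB == P_A * P_B)   # False: A and B are NOT independent

# Conditionally on coin a: the tosses are independent by construction.
P_A_a, P_B_a = pa, 1 - pa
P_AB_a = pa * (1 - pa)
print(P_AB_a == P_A_a * P_B_a)   # True: conditional independence
```

Intuitively, the first toss carries information about which coin was chosen, and the coin's identity in turn predicts the second toss; fixing the coin removes that common cause.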
Simple Random Walk (SRW)
I Let {Xn}n≥1 be iid Bernoulli: P(Xn = 1) = 1/2 = P(Xn = −1)
I Define S0 = X0 and Sn = Sn−1 + Xn = X0 + · · ·+ Xn
I X0 is a fixed integer, and specifies the starting point.
I This is not a MC on a FINITE space; the state space is Z.
I If X0 = 0, then P(S8 = 8) = P(X1 = 1, · · · , X8 = 1) = 1/2^8.
I Clearly, S6 is not independent of S4, since S6 = S4 + X5 + X6.
I Intuitively, given S4, S6 is independent of S2.
I Mathematically, S6 and S2 are indep. w.r.t. P(·|S4 = i) for any i ∈ Z:

P(S6 = 2, S2 = 0|S4 = i) = P(S6 = 2|S4 = i) P(S2 = 0|S4 = i)

Let us prove this for i = 2.
I Proof.

P(S6 = 2, S2 = 0|S4 = 2)
= [P(S6 = 2, S2 = 0, S4 = 2) / P(S2 = 0, S4 = 2)] · P(S2 = 0|S4 = 2)

and

P(S6 = 2, S2 = 0, S4 = 2) / P(S2 = 0, S4 = 2)
= P(S6 − S2 = 2, S4 − S2 = 2, S2 = 0) / P(S4 − S2 = 2, S2 = 0)

Notice that {S6 − S2 = 2, S4 − S2 = 2} = {X5 + X6 = 0, X3 + X4 = 2}. Therefore the set on the left is expressible in terms of the r.v.'s X3, · · · , X6, and hence is independent of S2. Therefore the last fraction is equal to:

= P(S6 − S2 = 2, S4 − S2 = 2) / P(S4 − S2 = 2)
= P(S6 − S4 = 0, S4 − S2 = 2) / P(S4 − S2 = 2)

Similarly, S6 − S4 = 0 and S4 − S2 = 2 are independent, so this

= P(S6 − S4 = 0) = P(S6 = 2|S4 = 2),

which is what we wanted to show.
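The identity can also be verified by brute-force enumeration of all 2^6 equally likely sign sequences, a sketch in exact rational arithmetic:

```python
from itertools import product
from fractions import Fraction

# All 2^6 equally likely sequences (X1, ..., X6), with X0 = 0.
paths = list(product([-1, 1], repeat=6))

def prob(event):
    # Each path has probability 1/2^6 under the fair simple random walk.
    return Fraction(sum(1 for xs in paths if event(xs)), len(paths))

S = lambda xs, n: sum(xs[:n])   # S_n = X1 + ... + Xn (X0 = 0)

p_S4 = prob(lambda xs: S(xs, 4) == 2)
p_joint = prob(lambda xs: S(xs, 6) == 2 and S(xs, 2) == 0 and S(xs, 4) == 2) / p_S4
p_S6_given = prob(lambda xs: S(xs, 6) == 2 and S(xs, 4) == 2) / p_S4
p_S2_given = prob(lambda xs: S(xs, 2) == 0 and S(xs, 4) == 2) / p_S4

print(p_joint, p_S6_given * p_S2_given)   # both 1/4
```

Both sides come out to exactly 1/4, confirming that S6 and S2 are conditionally independent given S4 = 2.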
[Figure: two sample paths of Sn for n = 1, · · · , 8, each with a marked point. Top: X1 = −1, X2 = 1, X3 = 1, X4 = 1, X5 = 1, X6 = −1. Bottom: X1 = 1, X2 = −1, X3 = 1, X4 = 1, X5 = 1, X6 = −1.]
I The blue bullet indicates the “conditioning”, i.e., we restrict the sample space to all paths passing through this blue bullet.
I This obviously changes the probability law P on the space.
I Instead of re-writing Ω, we let P be concentrated on the restricted area.
I So far we have

P(S6 = 2, S2 = 0, S4 = 2) / P(S2 = 0, S4 = 2)
= P(S6 − S2 = 2, S4 − S2 = 2) / P(S4 − S2 = 2)
= P(S4 = 2, S2 = 2) / P(S2 = 2),

where the last equality holds because the increments are iid, so (S4 − S2, S6 − S2) has the same law as (S2, S4) started from 0.
Markov Chains (MC)-Transition probability matrix
I Let state space S be a finite set, and P(Xn ∈ S) = 1 ∀n ≥ 0.
I Define p_{i,j} = P(Xn+1 = j|Xn = i); note the time-homogeneity.

P = ( p11 · · · p1n
      ...       ...
      pn1 · · · pnn ),

with rows and columns indexed by S = {1, . . . , n}.

I Each row of the TPM totals 1:

p11 + p12 + · · · + p1n = P(Xn+1 = 1|Xn = 1) + P(Xn+1 = 2|Xn = 1) + · · · + P(Xn+1 = n|Xn = 1)
= P(Xn+1 ∈ S|Xn = 1) = 1

I Elements of the TPM are nonnegative numbers: p_{i,j} ≥ 0 for all i, j.
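These two defining properties are easy to check mechanically. A small sketch (the helper name `is_tpm` is mine):

```python
def is_tpm(P, tol=1e-12):
    """Check the two defining properties of a transition probability
    matrix: nonnegative entries, and each row summing to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1) < tol
        for row in P
    )

P_good = [[0.6, 0.4], [0.7, 0.3]]   # a valid 2-state TPM
P_bad  = [[0.6, 0.5], [0.7, 0.3]]   # invalid: first row sums to 1.1

print(is_tpm(P_good), is_tpm(P_bad))   # True False
```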
Markov Property(MP)
I The transition matrix P is not sufficient for describing a Markov chain. We also need the Markov property:

P(Xn+1 = j|Xn = i, Xn−1 = in−1, · · · , X0 = i0) = P(Xn+1 = j|Xn = i) = pij

I The MP says: the probabilities pij apply whenever state i is visited, no matter what happened in the past, and no matter how state i was reached. OR
I Given the present state, the future is independent of the past.
The distribution after one step
I Assume S = {1, 2}, and we are given

P = ( α      1 − α
      1 − β      β )

I Assume X0 = 1. What is the distribution of X1?
I By the distribution of X1 we mean P(X1 = 1) and P(X1 = 2): P(X1 = 1|X0 = 1) = α, P(X1 = 2|X0 = 1) = 1 − α.
I Let φ1 denote the distribution of X1: φ1 = (α, 1 − α) ∈ R^2_+.
I When S = {1, 2}, the dist. of Xn is a vector in R^2_+.
I Let φn denote this vector, i.e., φn has two coordinates:

φn = (φn(1), φn(2)),

where φn(i) = P(Xn = i).
I If we start from 1, i.e., X0 = 1, i.e., φ0 = (1, 0), then P(X1 = 1) = P(X1 = 1|X0 = 1) = p11, and P(X1 = 2) = P(X1 = 2|X0 = 1) = p12.
I If φ0 = (1, 0), then φ1 = (p11, p12)
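In code, φ1 is just the row-vector–matrix product φ0 P. The values of α and β below are placeholders of my choosing:

```python
import numpy as np

alpha, beta = 0.9, 0.8          # illustrative entries (my choice)
P = np.array([[alpha, 1 - alpha],
              [1 - beta, beta]])

phi0 = np.array([1.0, 0.0])     # start in state 1, i.e. X0 = 1
phi1 = phi0 @ P                 # distribution of X1

print(phi1)                     # (p11, p12) = (alpha, 1 - alpha)
```

Starting from φ0 = (1, 0) simply reads off the first row of P, matching the bullet above.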
Initial Distribution
I A and B are 2 cities. Assume that every year 40% of the population of A moves to B, while 70% of the population of B moves to A. What is the population after 1 year, if A and B initially have .6 and 2.4 mil. population? The vector of initial fractions, (.2, .8), is denoted by φ0 ∈ R^2_+.
I Let {Xn}n≥1 denote the location of one individual: Xn ∈ {A, B}. Assume initially all the population resides in A. What is the distribution of the population after one year?
I With this assumption, φ0 = (1, 0).
[Transition diagram: A stays in A with prob .6 and moves to B with prob .4; B moves to A with prob .7 and stays in B with prob .3]

P = ( .6  .4
      .7  .3 )
φ0 = (P(X0 ∈ A),P(X0 ∈ B)) = (1, 0)
φ1 = (.6, .4) = φ0 P
I Then, if the initial dist. is φ0 = (1, 0),

φ1 = (P(X1 = 1|X0 = 1), P(X1 = 2|X0 = 1)) = (p11, p12)
   = (1, 0) ( .6  .4
              .7  .3 ) = φ0 P.

I Similarly, if the initial dist. is φ0 = (0, 1), then

φ1 = (P(X1 = 1|X0 = 2), P(X1 = 2|X0 = 2)) = (p21, p22)
   = (0, 1) ( .6  .4
              .7  .3 ) = φ0 P.

I If φ0 = (.2, .8), then

φ1 = φ0 ( .6  .4
          .7  .3 ) = (.68, .32)
I What is the distribution of the population after 10 years, when φ0 = (.2, .8)?
I We are looking for the dist. of X10, given we started from φ0:

φ10 = φ0 P^10 = (.2, .8) ( 0.636364  0.363636
                           0.636364  0.363636 ) = (0.636364, 0.363636)

I Therefore φ0 P^n gives the distribution of Xn.
I p^n_ij denotes the ij element of P^n.
I Do you think that the population is reaching an equilibrium?
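The ten-step distribution can be computed directly with a matrix power. Note that the rows of P^10 are nearly identical, which is exactly the equilibrium phenomenon the last question hints at: the limiting row is the stationary distribution (7/11, 4/11).

```python
import numpy as np

P = np.array([[0.6, 0.4],
              [0.7, 0.3]])
phi0 = np.array([0.2, 0.8])

P10 = np.linalg.matrix_power(P, 10)   # 10-step transition matrix
phi10 = phi0 @ P10                    # distribution of X10

print(P10)     # both rows are (0.636364, 0.363636) to 6 decimals
print(phi10)   # (0.636364, 0.363636), i.e. approximately (7/11, 4/11)
```

Because the second eigenvalue of P is −0.1, the deviation from equilibrium shrinks by a factor of 10 each year, so after 10 steps it is of order 1e-10.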
Probability of the paths
I We choose a person at random. What is the probability that this person is from A, stays in A for 3 years, and then moves to B?
I {X0 = X1 = X2 = A, X3 = B} is a path with probability:
P(X0 = X1 = X2 = A, X3 = B)
= P(X3 = B|X0 = X1 = X2 = A) P(X0 = X1 = X2 = A)
= p12 P(X2 = A|X0 = X1 = A) P(X0 = X1 = A)
= p12 p11 P(X1 = A|X0 = A) P(X0 = A)
= p12 p11 p11 P(X0 = A) = .4 × .6 × .6 × .2.
I If all the factors are nonzero, then the path has positive probability.
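A path probability is just the product of the initial weight and the one-step probabilities along the path. A sketch (the helper `path_prob` is my naming):

```python
# Transition probabilities and initial distribution from the two-city example.
p = {("A", "A"): 0.6, ("A", "B"): 0.4,
     ("B", "A"): 0.7, ("B", "B"): 0.3}
phi0 = {"A": 0.2, "B": 0.8}

def path_prob(path):
    """Multiply the initial probability by the transition
    probabilities along consecutive states of the path."""
    pr = phi0[path[0]]
    for i in range(len(path) - 1):
        pr *= p[(path[i], path[i + 1])]
    return pr

print(path_prob(["A", "A", "A", "B"]))   # 0.2 * 0.6 * 0.6 * 0.4
```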
Probability of the paths
I We choose an individual from city A. What is the probability that, over the next 4 years, this person moves every 2 years?
I We want to find P(X1 = A, X2 = B, X3 = B, X4 = A|X0 = A)

= P(X4 = A|X3 = X2 = B, X1 = X0 = A) × P(X3 = X2 = B, X1 = A|X0 = A)
= p21 P(X3 = B|X2 = B, X1 = X0 = A) × P(X2 = B, X1 = A|X0 = A)
= p21 p22 P(X2 = B|X1 = X0 = A) × P(X1 = A|X0 = A)
= p21 p22 p12 p11 = (.7)(.3)(.4)(.6)

I What happens if we choose a person at random, and try to find the probability that he moves every 2 years? I.e.,

P(X0 = X1, X1 ≠ X2, X2 = X3, X3 ≠ X4)
n-step probability, Chapman–Kolmogorov

I Nasim lives in A; what is the probability that she will be in B in the 10th year? P(X10 = B|X0 = A):

φ10 = (1, 0) P^10 = (1, 0) ( 0.636364  0.363636
                             0.636364  0.363636 ) = (0.636364, 0.363636)

I We have to find all the paths from X0 = A to X10 = B, and add their probabilities.
I We can put all the paths in two categories:
I paths which pass through A at time 8: X8 = A,
I paths which pass through B at time 8: X8 = B.
P(X10 = B|X0 = A)
= ∑_{i∈S} P(X10 = B|X8 = i, X0 = A) P(X8 = i|X0 = A)
= ∑_{i∈S} P(X10 = B|X8 = i) P(X8 = i|X0 = A)

I Let Pi(·) denote P(·|X0 = i) (don't mistake it with p^n_ij); then

p^10_{A,B} = ∑_{i∈S} p^8_{A,i} p^2_{i,B},  i ∈ {A, B}

In general, if x, y ∈ S are two states, then

p^{m+n}_{x,y} = ∑_{z∈S} p^m_{x,z} p^n_{z,y}
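The Chapman–Kolmogorov relation is exactly the statement that matrix powers multiply, P^{m+n} = P^m P^n, which is easy to verify numerically on the two-city matrix:

```python
import numpy as np

P = np.array([[0.6, 0.4],
              [0.7, 0.3]])

m, n = 8, 2
lhs = np.linalg.matrix_power(P, m + n)                           # P^(m+n)
rhs = np.linalg.matrix_power(P, m) @ np.linalg.matrix_power(P, n)

# Entrywise this is: p^{m+n}_{x,y} = sum_z p^m_{x,z} * p^n_{z,y}
print(np.allclose(lhs, rhs))   # True
```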
I Similarly we have

Pi(Xn = j) = ∑_{k∈S} Pi(Xn−1 = k) p_{k,j}

I By repeating,

Pi(Xn = j) = ∑_{k∈S} Pi(Xn−1 = k) p_{k,j} = ∑_{k∈S} ∑_{l∈S} Pi(Xn−2 = l) p_{l,k} p_{k,j}

I If we continue we will get

Pi(Xn = j) =: p^n_ij = ∑_{i1,··· ,i_{n−1}∈S} p_{i i1} p_{i1 i2} · · · p_{i_{n−1} j}

I The right-hand side is the element ij of the matrix P^n.
Review
Consider a game in which two players A and B take turns tossing a coin, and the player who first comes up with heads wins the game. Supposing that player A starts off, let us consider the following problems:

1. What is the probability that A wins the game?

2. How many tosses are required on average for the game to end?
Ω = { ω1 = H, ω2 = TH, ω3 = TTH, · · · , ωn = TT · · · TH, · · · , ω∞ = TTT · · · },

and P on Ω is defined by P(ωn) = 1/2^n.
A wins when ω1, ω3, ω5, · · · happens and B wins otherwise.
I Let WA and WB denote the events that A and B win, respectively, e.g., WA = {ω1, ω3, · · · }.

P(WA) = 1/2 + (1/2)^3 + · · · ,  P(WA) + P(WB) = 1

P(WB) = (1/2)^2 + (1/2)^4 + · · · = (1/2) P(WA),

hence P(WA) = 2/3 and P(WB) = 1/3.

I Let N denote the length of the game. N : Ω → N is defined by

N(ωn) = n,  P(N = n) = P(ωn) = (1/2)^n

I Therefore, EN = ∑∞n=1 n P(N = n) = ∑∞n=1 n (1/2)^n.
I We know how to compute the last sum: EN = 2.
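Both answers can be checked from the series directly; a truncated sum gets arbitrarily close, since the tails decay geometrically:

```python
# P(A wins) = sum over odd n of (1/2)^n ;  EN = sum over n of n * (1/2)^n
N_TRUNC = 200   # truncation level; the remaining tail is negligible

p_A_wins = sum(0.5**n for n in range(1, N_TRUNC, 2))   # odd n only
expected_N = sum(n * 0.5**n for n in range(1, N_TRUNC))

print(p_A_wins)     # approximately 2/3
print(expected_N)   # approximately 2
```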
Markov Chain
I We can model this game as a Markov chain with 3 states {S, T, H}, as follows:

[Transition diagram: from S, move to T or H each with prob 1/2; from T, move to T or H each with prob 1/2; H is absorbing (prob 1).]

P = ( 0  .5  .5
      0  .5  .5
      0   0   1 )

I ES N = ES[N|X1 = T] P(X1 = T) + ES[N|X1 = H] P(X1 = H)
       = 1 + (ET N)(1/2) + (EH N)(1/2) = 1 + (ET N)(1/2)
I ET N = ET[N|X1 = T](1/2) + ET[N|X1 = H](1/2) = 1 + (ET N)(1/2) ⇒ ET N = 2
I ES N = 2.
Probability of path, and n-step probability
I Assume we start from S. What is the probability of TTTH?
I X1 = T, X2 = T, X3 = T, X4 = H is a path. We want P(X1 = T, X2 = T, X3 = T, X4 = H|X0 = S)

= P(X1 = T, X2 = T, X3 = T, X4 = H, X0 = S) / P(X0 = S)

= P(X4 = H|X3 = T, X2 = T, X1 = T, X0 = S) × P(X1 = T, X2 = T, X3 = T, X0 = S) / P(X0 = S)

= P(X4 = H|X3 = T) P(X1 = T, X2 = T, X3 = T|X0 = S)

= P(X1 = H|X0 = T) P(X1 = T, X2 = T, X3 = T|X0 = S)   (time-homogeneity)

= pT,H P(X1 = T, X2 = T, X3 = T|X0 = S)

= · · · = pS,T pT,T pT,T pT,H = (.5)^4
Markov Chain
I Let {Xn}∞n≥0 be a sequence of random variables taking values in the finite set

S = {1, · · · , N}

I Let 0 < m < n. {Xn}n≥0 is called a Markov Chain if

P(Xn = in|X0 = i0, · · · , Xm = im) = P(Xn = in|Xm = im)   (1)

I Equation (1) is called the Markov Property (MP).
I (1) states that if k < l < m, then Xk and Xm are independent with respect to P(·|Xl).
Ω = {H, T}^N
I The Cartesian product A × B of the sets A and B:

A × B = {(x, y) : x ∈ A, y ∈ B}

I If A = {H, T}, then A^2 = {(H, H), (H, T), (T, H), (T, T)}.
I A^2 is the set of all sequences of length 2 comprised of elements of A.
I A^n is the set of all sequences of length n, comprised of elements of A.

TTTHTHHTHHHHTTTT ∈ {H, T}^16

I By A^N, where N is the set of natural numbers, we mean the set of all infinite sequences comprised of elements of A.
I If the players in the game above continue forever, then Ω = {H, T}^N.
I If ω ∈ Ω, then ω is an infinite sequence

ω = ω1, ω2, ω3, · · ·

I We define a probability P on Ω by P(ω) = q^{n−1} p, where p is the probability of heads, q = 1 − p, and

n = inf{k : ωk = H}

Define a random variable T on Ω by

T(ω) = inf{k : ωk = H}

I We can translate the previous problem to this space as follows:
I A wins = {T is odd}. Average time of the game = ET.
I P(A wins) = P(T is odd).
Constructing iid Bernoulli (Kolmogorov's Consistency Theorem)

I Let Ω = {−1, 1}^N, i.e., if ω ∈ Ω, then ω is a sequence of −1s and 1s, like ω = 1, 1, −1, 1, −1, −1, −1, · · · .
I Let A1 denote the set of all ω's starting with 1. Define P(A1) = 1/2.
I Let A2 denote the set of all ω's starting with 1, 1. Define P(A2) = (1/2)^2.
I In general, let An denote the set of all ω starting with a1, · · · , an, where ai ∈ {−1, 1}. Define P(An) = (1/2)^n.
I We claim that this specifies P(A) for “any” A ⊂ Ω.
I Define Xn : Ω → R by Xn(ω) = ωn.
I You can check that {Xn}∞n=1 are iid and Bernoulli.
I Carry out this procedure to produce 2 iid Bernoulli r.v.'s.
I How can we construct infinitely many iid Gaussian random variables?