15 October 2012, Eurecom, Sophia-Antipolis, Thrasyvoulos Spyropoulos / [email protected]
Discrete Time Markov Chains
Thrasyvoulos Spyropoulos / [email protected] Eurecom, Sophia-Antipolis
(Discrete-Time) Markov Process
X(n): the state/value of a process at the n-th period (time slot)
  X(n) is random and takes values in a finite or countable set

The sequence X(1), X(2), …, X(n) could be
  a) the values of a stock each day
  b) the web page a user is currently browsing
  c) the Access Point (AP) a moving user is currently associated with

We would like a probabilistic model for X(1), X(2), …, X(n)
  a) X(i) are independent => not so realistic (for the above examples)
  b) X(i+1) depends on X(i) => not a bad model to start with
Markov Property:

$$P\{X(n+1) = j \mid X(n) = i, X(n-1) = i_{n-1}, \ldots, X(1) = i_1\} = P\{X(n+1) = j \mid X(n) = i\} = p_{ij}$$
Transition Matrix and Properties
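Assume state space: 0, 1, 2, …
p_ij = P{S(n+1) = j | S(n) = i}

Stationary (time-homogeneous): P{S(n+1) = j | S(n) = i} = P{S(1) = j | S(0) = i} for any n

The p_ij define a transition matrix

$$P = \begin{pmatrix} p_{00} & p_{01} & p_{02} \\ p_{10} & p_{11} & p_{12} \\ p_{20} & p_{21} & p_{22} \end{pmatrix}, \qquad \sum_j p_{ij} = 1 \quad \forall i$$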
Graphical Representation of Markov Chains

Simple weather model
  X(n): weather after n days

Simple channel error model
  Prob of error = 0.1
  Case 1: q = 0.1 (uncorrelated)
  Case 2: q = 0.9 (burst of errors)

[Figure: two-state chain with states "good" and "bad"; edge labels p, 1-p, q, 1-q]

What is the transition matrix P in both cases?
Markov Chain Models (cont’d)

A simple handover model
  cell -> state
  need to find the transition probabilities: they depend on road structure, user profile, statistics

[Figure: a user on the phone moving through a cellular network, and the corresponding Markov chain with one state per cell and a transition probability on each edge]
2-step transition probabilities
So far we know the probabilities for the next step: S(n) -> S(n+1)
Probabilities for S(1) -> S(n)?

$$P\{S(2)=j \mid S(0)=i\} = \sum_k P\{S(2)=j \mid S(1)=k, S(0)=i\}\, P\{S(1)=k \mid S(0)=i\} = \sum_k p_{ik}\, p_{kj}$$

If P(2) = {p_ij(2)} denotes the 2-step transition probability matrix, then P(2) = P • P
p_ij(2): multiply row i with column j
n-step transition probabilities
Generalizing for n steps:

$$P\{S(n)=j \mid S(0)=i\} = \sum_k p_{ik}^{(m)}\, p_{kj}^{(l)}, \qquad (m + l = n)$$

Chapman-Kolmogorov equations
  P(n) = P • P … P = P^n (n-step transition probabilities)
  P(n) = P(m) • P(l)
  e.g. P(5) = P(4) • P(1) = P(2) • P(3) = P(2) • P • P(2)
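The n-step relation can be checked numerically. The following is a minimal sketch in plain Python; the two-state matrix `P` and the helper names `mat_mul` and `mat_pow` are illustrative assumptions, not taken from the slides.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(p, n):
    """n-step transition matrix P(n) = P^n, by repeated multiplication."""
    size = len(p)
    result = [[float(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


# Illustrative two-state chain (an assumption for this sketch).
P = [[0.8, 0.2],
     [0.6, 0.4]]

P5 = mat_pow(P, 5)
P2_P3 = mat_mul(mat_pow(P, 2), mat_pow(P, 3))  # m = 2, l = 3, m + l = 5

# Largest entry-wise difference between P(5) and P(2) * P(3).
max_gap = max(abs(P5[i][j] - P2_P3[i][j]) for i in range(2) for j in range(2))
print(P5)
print(max_gap)
```

Any split m + l = n should give the same matrix up to floating-point rounding, which is what `max_gap` measures.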
An example: Pre-Fetching Web Pages
Your iPhone browser can pre-fetch pages to improve speed
The current web page being browsed has 3 links. Studies of click statistics have shown that a user clicks
  link 1 next with prob 0.8
  link 2 with prob 0.1
  link 3 with prob 0.1
It seems reasonable that the browser pre-fetches link 1.

Assume now a tiny subset of the Web with the following transition matrix (rows: current page, columns: next page). Start at page A.

            A     B     C     D
  page A   0.2   0.2   0.5   0.1
  page B   0.1   0.2   0.1   0.6
  page C   0.1   0.4   0.2   0.3
  page D   0.2   0.1   0.2   0.5

Q: Which 2 web pages should the browser pre-fetch?

S(0) = [1, 0, 0, 0]
S(1) = [0.2, 0.2, 0.5, 0.1]
S(2) = [0.13, 0.29, 0.24, 0.34]
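The distributions S(1) and S(2) can be reproduced by multiplying the row vector S(0) by the transition matrix. This is a small sketch; the matrix is written here with rows and columns in the order A, B, C, D, which is my reading of the slide (it does reproduce the S(1) and S(2) values quoted), and the helper name `step` is hypothetical.

```python
PAGES = ["A", "B", "C", "D"]

# Transition matrix, rows = current page, columns = next page (order A, B, C, D).
# Assumption: this row/column order is inferred from the S(1) and S(2) values.
P = [[0.2, 0.2, 0.5, 0.1],
     [0.1, 0.2, 0.1, 0.6],
     [0.1, 0.4, 0.2, 0.3],
     [0.2, 0.1, 0.2, 0.5]]


def step(dist, p):
    """One step of the chain: returns the row vector dist * p."""
    n = len(p)
    return [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]


s0 = [1.0, 0.0, 0.0, 0.0]  # start at page A
s1 = step(s0, P)
s2 = step(s1, P)

for name, dist in (("S(1)", s1), ("S(2)", s2)):
    print(name, dict(zip(PAGES, (round(x, 2) for x in dist))))
```

Sorting the pages by their entries in S(1) or S(2) gives the pre-fetch candidates one or two clicks ahead.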
Limiting distribution
What happens to p_ij(n) as n goes to infinity? What is lim P(n) as n -> ∞?
Q: Does it always converge? (we’ll see this later)

If the limit exists, then for any initial state i

$$\lim_{n \to \infty} p_{ij}^{(n)} = \pi_j$$

π = {π0, π1, …, πm} is called the stationary distribution if π • P = π and Σ_i π_i = 1
The above equation can be used to find π
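Getting the Stationary Distribution: Examples

The weather system

$$P = \begin{pmatrix} 0.8 & 0.2 \\ 0.6 & 0.4 \end{pmatrix}, \qquad (\pi_{sunny}, \pi_{rainy}) = (3/4,\ 1/4)$$

Data rates supported for an outdoor user, e.g. (128 Kbps, 320 Kbps, 1 Mbps)

$$P = \begin{pmatrix} 0.4 & 0.4 & 0.2 \\ 0.2 & 0.6 & 0.2 \\ 0.3 & 0.2 & 0.5 \end{pmatrix}, \qquad (\pi_{128}, \pi_{320}, \pi_{1024}) = (2/7,\ 3/7,\ 2/7)$$

What about this one?

$$P = \begin{pmatrix} 0.4 & 0.6 & 0 \\ 0.3 & 0.7 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$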
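One simple way to obtain π numerically is to apply P repeatedly to any starting distribution until it stops changing. The sketch below assumes the weather matrix is [[0.8, 0.2], [0.6, 0.4]] and the data-rate matrix is [[0.4, 0.4, 0.2], [0.2, 0.6, 0.2], [0.3, 0.2, 0.5]] (my reading of the slide); the function name `stationary` and the iteration count are hypothetical choices.

```python
def stationary(p, iterations=500):
    """Approximate pi by iterating dist <- dist * P from the uniform distribution."""
    n = len(p)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]
    return dist


P_weather = [[0.8, 0.2],
             [0.6, 0.4]]

P_rates = [[0.4, 0.4, 0.2],
           [0.2, 0.6, 0.2],
           [0.3, 0.2, 0.5]]

pi_weather = stationary(P_weather)  # compare with (3/4, 1/4)
pi_rates = stationary(P_rates)      # compare with (2/7, 3/7, 2/7)
print(pi_weather)
print(pi_rates)
```

This iteration is only reliable when the limit exists and does not depend on the starting point; for the third matrix above, the result depends on where the chain starts.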
A Definition Prelude (Part I)
(Reachability) State j is reachable from state i if p_ij(n) > 0 for some n; denote i → j

(Communication) States i and j communicate (i <-> j) if i → j and j → i

(Recurrence/Transience) Denote f_i the probability to ever return to i, starting from i
  Transient) state i is transient if f_i < 1
    number of returns is geometric (f_i): why?
    $\sum_{n=0}^{\infty} p_{ii}^{(n)} < \infty$ (finite number of returns)
  Recurrent) state i is recurrent if f_i = 1
    $\sum_{n=0}^{\infty} p_{ii}^{(n)} = \infty$ (infinite number of returns)
    Positive recurrent) expected time between visits to i is finite
    Null recurrent) expected time between visits to i is infinite (strange!?)

(Periodicity) State i is periodic if there exists d > 1 such that p_ii(n) = 0 whenever n ≠ kd; the largest d with this property is called the period

$$P = \begin{pmatrix} 0.4 & 0.6 & 0 \\ 0.3 & 0.7 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
A Definition Prelude (Part II)
Definition: A subset of states S such that i <-> j for all i, j in S is called a class

Lemma 1: recurrence is a class property: if i is recurrent and i <-> j, then j is recurrent
  Proof (by contradiction): if j is transient => the chain can never return to j after some time
  => it cannot be at i either (since there is always a chance to go from i to j)

Lemma 2: transience is a class property
  Similar argument
  Periodicity and positive/null-recurrence are also class properties

(Irreducible) A Markov chain is irreducible if it has only 1 class
  A finite MC which is irreducible is always positive recurrent

$$P = \begin{pmatrix} 0.4 & 0.6 & 0 \\ 0.3 & 0.7 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Theorem: If a DTMC is aperiodic, irreducible and positive recurrent (“ergodic”), then it has a stationary distribution π = {π1, …, πN}
A Motivating Example: Google PageRank Algorithm

Search for “Network Modeling”
  1000s of pages (courses, companies, etc.) contain these keywords
  Goal: make the page I’m interested in appear in at least the top 10 pages shown

Q: How should pages be ranked?
A1: A page is important if many pages link to it

Q: What is wrong with this metric?
A: - a link from Yahoo is more important than a link from /~spyropou
   - can easily “fool” this! (create many dummy pages pointing to mine)

Q: How to fix this?
A: Weigh a link from page x to page p by the number of links into x

Q: Can this system be fooled?
A: Yes! Create 1000 dummy pages pointing to each other and to mine
Google’s Solution
A page p has high rank (is important) if the pages pointing to it also have high rank

Q: How is this different from before?
A: All the dummy pages pointing to each other would still have low rank if there is no “external” link from a high-rank page

Q: Is a link from a page with 1000 other links as important as a link from a page with 5 other links?
A: No! If page i has a link to j and k total outgoing links, then P_ij = 1/k

Q: Where does this lead? How do we calculate the rank?
A: rank of page j = Σ_i { rank of page i * prob. of link i → j }
   i.e. r_j = Σ_i r_i P_ij

(basis of) PageRank Algorithm: rank of pages = stationary prob. of an MC with transitions P_ij
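The rank equation r_j = Σ_i r_i P_ij can be iterated directly. Below is a minimal sketch on a hypothetical three-page web: the link structure in `links` is an assumption made for illustration, and the sketch uses the plain equation with P_ij = 1/k, with no extra heuristics.

```python
# Hypothetical link structure: page -> pages it links to.
links = {
    "A": ["M", "N"],
    "M": ["N"],
    "N": ["A"],
}

pages = sorted(links)
rank = {page: 1.0 / len(pages) for page in pages}  # start from uniform ranks

for _ in range(200):
    new_rank = {page: 0.0 for page in pages}
    for src, outgoing in links.items():
        share = rank[src] / len(outgoing)  # P_ij = 1/k for each of the k links
        for dst in outgoing:
            new_rank[dst] += share         # r_j = sum_i r_i * P_ij
    rank = new_rank

print({page: round(value, 4) for page, value in rank.items()})
```

For this particular graph the iteration settles at ranks 2/5, 1/5 and 2/5 for A, M and N; the repeated update is the same as looking for the stationary distribution of the chain with transitions P_ij.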
Page Rank Algorithm Pitfalls
Q: What about this tiny WWW?
A: πA = πN = 2/5, πM = 1/5

Q: But what about these two?
A: These two chains are reducible and do not have a stationary distr.

Google’s heuristic: introduce additional arrows to avoid such loops and dead ends (not necessarily optimal!)

Q: How does Google solve the huge system of eq. (millions of pages)?
A More Advanced Example: The Aloha Protocol
[Network] m nodes on a common wire (or wireless channel)
  Time is slotted
[New Packets] Each node transmits a new packet with probability p (p < 1/m)
[Success] If exactly 1 packet is transmitted
[Collision] If k > 1 (re-)transmissions during a slot
  Every collided message is stored, to be resent later
[Retransmission] Each collided (backlogged) message is retransmitted with prob q

[Figure: time slots with new packets (prob p) and old packets (prob q); slots are marked √ or X]
Does Aloha work??
Slotted Aloha: A Markov Chain Model
Q: What should we take as the state of the chain X_n?
A: The number of backlogged messages (at slot n)

Transition probabilities from state 0 (no backlogged msg)

$$\begin{aligned}
P_{00} &= (1-p)^m + mp(1-p)^{m-1} && \text{(0 or 1 node transmits)} \\
P_{01} &= 0 && \text{(not possible)} \\
P_{0k} &= \binom{m}{k} p^k (1-p)^{m-k} && \text{if } 1 < k \le m \text{ (any k of the m nodes transmit)} \\
P_{0k} &= 0 && \text{if } k > m
\end{aligned}$$
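Slotted Aloha: Transitions From State k

Assume we are now at state k (i.e. k messages backlogged)

$$\begin{aligned}
P_{k,k-1} &= (1-p)^m\, kq(1-q)^{k-1} && \text{(0 new, 1 old) (better)} \\
P_{k,k} &= (1-p)^m (1-q)^k + mp(1-p)^{m-1}(1-q)^k + (1-p)^m \left[1 - (1-q)^k - kq(1-q)^{k-1}\right] && \text{(0 new, 0 old) + (1 new, 0 old) + (0 new, } \ge 2 \text{ old)} \\
P_{k,k+1} &= mp(1-p)^{m-1}\left(1 - (1-q)^k\right) && \text{(1 new) (>0 old) (worse)} \\
P_{k,k+r} &= \binom{m}{r} p^r (1-p)^{m-r}, \quad 1 < r \le m && \text{(r new, any old) (worse)} \\
P_{k,k+r} &= 0, \quad r > m
\end{aligned}$$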
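As a sanity check on the transition probabilities from state k, the sketch below builds one row of the chain and verifies that it sums to 1. The function name `aloha_row` is hypothetical, and the parameter values (m = 10, p = 0.05, q = 0.005) are the ones the slides use in the stability discussion.

```python
from math import comb


def aloha_row(k, m, p, q):
    """Return {next_state: probability} for the backlog chain at state k."""
    no_new = (1 - p) ** m                    # 0 new packets
    one_new = m * p * (1 - p) ** (m - 1)     # exactly 1 new packet
    no_old = (1 - q) ** k                    # 0 retransmissions
    one_old = k * q * (1 - q) ** (k - 1) if k > 0 else 0.0  # exactly 1 retransmission

    row = {}
    if k > 0:
        row[k - 1] = no_new * one_old        # (0 new, 1 old): backlog shrinks
    row[k] = (no_new * no_old                # (0 new, 0 old)
              + one_new * no_old             # (1 new, 0 old): success
              + no_new * (1 - no_old - one_old))  # (0 new, >= 2 old): collision
    row[k + 1] = one_new * (1 - no_old)      # (1 new, > 0 old): collision
    for r in range(2, m + 1):
        row[k + r] = comb(m, r) * p ** r * (1 - p) ** (m - r)  # r new, any old
    return row


rows = {k: aloha_row(k, m=10, p=0.05, q=0.005) for k in (0, 10, 100, 1000)}
for k, row in rows.items():
    p_back = row.get(k - 1, 0.0)
    p_fwd = sum(prob for state, prob in row.items() if state > k)
    print(k, round(p_back, 4), round(p_fwd, 4), round(sum(row.values()), 12))
```

Printing the backward and forward probabilities for growing k gives a feel for how the chance of reducing the backlog behaves at large backlogs.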
Slotted Aloha: The Complete Markov Chain

[Figure: the complete chain over backlog states 0, 1, 2, …, k-1, k, k+1, k+2, …, k+m, with transitions such as p00, p01, p0m, p10, p12, p21, pk,k-1, pk-1,k, pk-m,k, pk,k+2, pk,k+m]

How can we tell if “Aloha works”?
  Assume 10 nodes and transmission probability p = 0.05
  Load = 0.5 Capacity
  Intuition: (necessary) q should be small (re-Tx), to ensure retransmissions don’t overload the medium
  Let’s assume q = 0.005 (10 times smaller than p)

Q: Is Aloha stable?
  If the backlog becomes infinite => delay goes to infinity!
Slotted Aloha: Stability
When at state k:
  Pback(k): reduce backlog
  Pfwd(k): increase backlog

[Figure: the complete chain, with Pback(k) and Pfwd(k) marked at state k]

$$\lim_{k \to \infty} P_{back}(k) = \lim_{k \to \infty} (1-p)^m\, kq(1-q)^{k-1} = 0$$

$$\lim_{k \to \infty} P_{fwd}(k) = 1 - (1-p)^m$$

(MHB, Ch.10) (Why?)

So???
Slotted Aloha: Stability Conclusion
For large enough k => states k+1, k+2, … are transient

Markov chain is transient => Aloha protocol is unstable!

Q: Would Aloha work if we make q really (REALLY) small?
A: No! Intuition: let E[N] be the expected number of transmissions at state k
   If E[N] ≥ 1 then the situation either stays the same or gets worse
   But E[N] = mp + kq: what happens as k → ∞?
Improving Aloha: q = f(backlog)
Q: How can we fix the problem and make the chain ergodic (and the system stable)?
A1: E[N] = mp + kq < 1 => q < (1-mp)/k, i.e. q = f(backlog)
A2: or be more aggressive, geometric backoff: q = a/k^n, a < (1-mp)
A3: or even exponential: q = β^(-k), β > 1
    Exponential backoff is the basis behind Ethernet

Q: Why should q not be too small?
A: Retransmission delay goes to infinity
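The condition E[N] = mp + kq < 1 can be tabulated for a fixed q and for a backlog-dependent q. This is a rough sketch: m = 10, p = 0.05 and q = 0.005 come from the earlier Aloha slides, while the constant a = 0.4 (any value below 1 - mp = 0.5) and the function names are assumptions for illustration.

```python
def expected_tx(k, m, p, q):
    """E[N] = m*p + k*q: expected number of transmissions in a slot at backlog k."""
    return m * p + k * q


def backlog_q(k, m, p, a=0.4):
    """Backlog-dependent retransmission probability q = a/k, with a < 1 - m*p."""
    assert a < 1 - m * p
    return a / k


m, p, fixed_q = 10, 0.05, 0.005

# Fixed q: E[N] grows linearly with the backlog and eventually exceeds 1.
fixed = {k: expected_tx(k, m, p, fixed_q) for k in (10, 50, 200, 1000)}

# q = a/k: E[N] = m*p + a stays below 1 for every backlog k >= 1.
adaptive = {k: expected_tx(k, m, p, backlog_q(k, m, p)) for k in (10, 50, 200, 1000)}

print(fixed)
print(adaptive)
```

With a fixed q the expected number of transmissions crosses 1 once the backlog is large enough, whereas scaling q down with k keeps it bounded, at the price of longer retransmission delays.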
Ergodicity: Time Vs. Ensemble Average
Time Averages
  N_i(t) = number of times in state i by time t
  $p_i = \lim_{t \to \infty} \frac{N_i(t)}{t}$ (percentage of time in state i)

Ensemble Averages
  m_ij: expected time to reach j (for the 1st time), starting from i
  m_ii: expected time between successive visits to i
  $\pi_i = \lim_{n \to \infty} p_{ki}^{(n)}$ (prob. of being at state i after many steps)

Theorem: for an ergodic DTMC

$$p_i = \pi_i = \frac{1}{m_{ii}}$$

(proof based on Renewal Theory)
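The equality of time and ensemble averages can be illustrated by simulating one long sample path and counting visits. The sketch below assumes the weather chain [[0.8, 0.2], [0.6, 0.4]] with stationary distribution (3/4, 1/4) from the earlier example; the function name, seed and path length are arbitrary choices.

```python
import random

# Weather chain (assumed reading of the earlier example): state 0 = sunny, 1 = rainy.
P = [[0.8, 0.2],
     [0.6, 0.4]]


def time_average(p, steps, start=0, seed=1):
    """Fraction of time spent in each state along one simulated sample path."""
    rng = random.Random(seed)
    counts = [0] * len(p)
    state = start
    for _ in range(steps):
        counts[state] += 1
        state = rng.choices(range(len(p)), weights=p[state])[0]
    return [count / steps for count in counts]


from_sunny = time_average(P, 100_000, start=0)
from_rainy = time_average(P, 100_000, start=1, seed=2)
print(from_sunny, from_rainy)
```

Both paths should spend roughly three quarters of the time in the sunny state regardless of the starting state, which also corresponds to a mean time between sunny days of about 4/3.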
(Global) Balance Equations
π_i • p_ij: rate of transitions from state i to state j
  π_i = percentage of time in state i
  p_ij = percentage of these times the chain moves next to j

Q: What is Σ_j π_j p_ji?
A: The rate into state i

[Figure: three-state chain (states 0, 1, 2 with π0, π1, π2; transitions p00, p01, p10, p12, p21), annotated "rate into 1: π2 p21" and "rate out of 1: π1 p12 + π1 p10"]

From the stationary equation: π_i = Σ_j π_j p_ji, so π_i is the rate into i
But also π_i = Σ_j π_i p_ij (why?)

Theorem: Σ_j π_j p_ji = Σ_j π_i p_ij (rate in = rate out)
  Q: Why is this reasonable?
  A: The chain cannot make a transition from i without a transition into i before it (the difference in number is at most 1)
  The same holds for any subset of states S: rate into S = rate out of S
Local Balance Equations and Time-Reversibility

Assume there exist π_i such that π_i p_ij = π_j p_ji (for all i and j) and Σ_i π_i = 1

Then:
  π_i is the stationary distribution for the chain
  The above equations are called local balance equations
  The Markov chain is called time-reversible

Solving for the Stationary Distribution of a DTMC

  Stationary Eq.    π_i = Σ_j π_j p_ji                     (not always easy)
  Global Balance    Σ_{j≠i} π_i p_ij = Σ_{j≠i} π_j p_ji     (a bit easier)
  Local Balance     π_i p_ij = π_j p_ji                     (easiest! try first!)
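Whether a given chain satisfies local balance can be checked entry by entry. The sketch below tests the weather and data-rate examples from earlier, using the matrices and stationary distributions as I read them from those slides; the function name and tolerance are hypothetical.

```python
def satisfies_local_balance(pi, p, tol=1e-12):
    """True if pi_i * p_ij == pi_j * p_ji (within tol) for all pairs (i, j)."""
    n = len(p)
    return all(abs(pi[i] * p[i][j] - pi[j] * p[j][i]) <= tol
               for i in range(n) for j in range(n))


# Weather chain with pi = (3/4, 1/4).
weather_ok = satisfies_local_balance([3 / 4, 1 / 4],
                                     [[0.8, 0.2],
                                      [0.6, 0.4]])

# Data-rate chain with pi = (2/7, 3/7, 2/7).
rates_ok = satisfies_local_balance([2 / 7, 3 / 7, 2 / 7],
                                   [[0.4, 0.4, 0.2],
                                    [0.2, 0.6, 0.2],
                                    [0.3, 0.2, 0.5]])

print(weather_ok, rates_ok)
```

The two-state chain passes, while the three-state chain has a valid stationary distribution that does not satisfy local balance, so for that chain one has to fall back on the global balance or stationary equations.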
Absorbing Markov Chains
A Very Simple Maze Example
A mouse is trapped in the above maze with 3 rooms and 1 exit
When inside a room with x doors, it chooses any of them with equal probability (1/x)
Q: How long will it take it on average to exit the maze, if it starts at room i?
Q: How long if it starts from a random room?
[Figure: maze with rooms 1, 2, 3 and an exit]
First Step Analysis
Def: T_i = expected time to leave the maze, starting from room i

T2 = 1/3*1 + 1/3*(1+T1) + 1/3*(1+T3) = 1 + 1/3*(T1+T3)
T1 = 1 + T2
T3 = 1 + T2

Substituting T1 and T3 into the first equation gives T2 = 1 + 2/3*(1+T2), so
T2 = 5, T3 = 6, T1 = 6

Q: Could you have guessed it directly?
A: The number of times room 2 is visited before exiting is geometric(1/3):
   on average, the wrong door will be taken twice (each time costing two steps) and the 3rd time the mouse exits, i.e. T2 = 2*2 + 1 = 5
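The first-step equations can also be solved numerically by iterating them until they stop changing. In this sketch the door layout in `doors` is inferred from the equations (room 2 connects to rooms 1 and 3 and to the exit), and the uniform choice of starting room in the last line is an assumption about what "a random room" means.

```python
# Doors of each room, inferred from the first-step equations (an assumption
# about the figure): rooms 1 and 3 lead only to room 2; room 2 has 3 doors.
doors = {1: [2], 2: [1, 3, "exit"], 3: [2]}

# Iterate T_i <- 1 + (1/x) * sum of T over the x doors, with T_exit = 0.
T = {room: 0.0 for room in doors}
for _ in range(2000):
    T = {room: 1 + sum(T[nxt] for nxt in nbrs if nxt != "exit") / len(nbrs)
         for room, nbrs in doors.items()}

# Expected exit time from a uniformly random starting room.
average_T = sum(T.values()) / len(T)
print(T, average_T)
```

The iteration converges to T1 = 6, T2 = 5, T3 = 6, so a uniformly random start gives 17/3 steps on average.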
“Hot Potato” Routing
A packet must be routed towards the destination over the above network
“Hot Potato Routing” works as follows: when a router receives a packet, it picks any of its outgoing links randomly (including the incoming link) and sends the packet immediately.
Q: How long does it take to deliver the packet?
[Figure: the network of routers, with the destination marked]
“Hot Potato” Routing (2)
First Step Analysis: We can still apply it!
  But it’s a bit more complicated: 9x9 system of linear equations
  Not easy to guess the solution either!
We’ll try to model this with a Markov chain
An Absorbing Markov Chain
9 transient states: 1-9
1 absorbing state: A

Q: Is this chain irreducible?
A: No!

Q: Hot Potato Routing delay <=> expected time to absorption?

[Figure: Markov chain with transient states 1-9 drawn as a 3x3 grid plus the absorbing state A; edges carry probabilities 1, 1/2, 1/3 or 1/4]
Hot Potato Routing: An Absorbing Markov Chain
We can define the transition matrix P (10x10)

Q: What is P(n) as n → ∞?
A: Every row converges to [0, 0, …, 1]

Q: How can we get E[T_iA]? (expected time to absorption starting from i)

Q: How about $\sum_n n \cdot P_{iA}^{(n)}$?
A: No, the sum goes to infinity!
Absorbing Markov Chain Theory
The transition matrix can be written in canonical form
  Transient states written first, followed by absorbing ones

$$P = \begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}$$

Calculate P(n) using the canonical form

$$P^n = \begin{pmatrix} Q^n & (*) \\ 0 & I \end{pmatrix}$$

Q: Q^n as n → ∞?
A: It goes to O

Q: Where does the (*) part of the matrix converge to if there is only one absorbing state?
A: To a vector of all 1s
Fundamental Matrix
Theorem: The matrix (I - Q) has an inverse
  N = (I - Q)^(-1) is called the fundamental matrix
  N = I + Q + Q^2 + …
  n_ik: the expected number of times the chain is in state k, starting from state i, before being absorbed
Proof:
Time to Absorption (using the fundamental matrix)
Theorem: Let T_i be the expected number of steps before the chain is absorbed, given that the chain starts in state i, and let T be the column vector whose i-th entry is T_i.
  Then T = Nc, where c is a column vector all of whose entries are 1

Proof: Σ_k n_ik adds all entries in the i-th row of N
  = expected number of times in any of the transient states for a given starting state i
  = the expected time required before being absorbed
  So T_i = Σ_k n_ik.
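As a small worked instance of T = Nc, the sketch below computes the fundamental matrix exactly for the three-room maze from the first-step analysis, treating the exit as the absorbing state. The matrix Q is my reading of that maze (rooms 1 and 3 lead only to room 2; room 2 leads to room 1, room 3 or the exit with probability 1/3 each), and the `inverse` helper is a plain Gauss-Jordan routine written for this example.

```python
from fractions import Fraction as F


def inverse(a):
    """Gauss-Jordan inverse of a square matrix of Fractions."""
    n = len(a)
    aug = [list(row) + [F(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Transient-to-transient part of the maze chain (rooms 1, 2, 3).
Q = [[F(0), F(1), F(0)],
     [F(1, 3), F(0), F(1, 3)],
     [F(0), F(1), F(0)]]

I_minus_Q = [[F(int(i == j)) - Q[i][j] for j in range(3)] for i in range(3)]
N = inverse(I_minus_Q)          # fundamental matrix
T = [sum(row) for row in N]     # T = N c, with c the all-ones vector
print(N)
print(T)
```

The row sums reproduce the first-step results (6, 5, 6), and the middle column shows that room 2 is visited three times on average from any starting room.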
Absorption Probabilities
Theorem: Let b_ij be the probability that an absorbing chain will be absorbed in (absorbing) state j, if it starts in (transient) state i.
  Let B be the (t-by-r) matrix with entries b_ij.
  Then B = NR, with R as in the canonical form.
Proof:
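To see B = NR with more than one absorbing state, here is a sketch on a hypothetical example that is not in the slides: a fair random walk on 0, 1, 2, 3 where states 0 and 3 are absorbing and states 1 and 2 are transient. The 2x2 inverse is written out by hand to keep the example short.

```python
from fractions import Fraction as F

half = F(1, 2)

# Hypothetical fair random walk: transient states (1, 2), absorbing states (0, 3).
Q = [[F(0), half],      # from 1: to 2 with prob 1/2
     [half, F(0)]]      # from 2: to 1 with prob 1/2
R = [[half, F(0)],      # from 1: absorbed at 0 with prob 1/2
     [F(0), half]]      # from 2: absorbed at 3 with prob 1/2

# N = (I - Q)^(-1), using the closed-form inverse of a 2x2 matrix.
a, b = 1 - Q[0][0], -Q[0][1]
c, d = -Q[1][0], 1 - Q[1][1]
det = a * d - b * c
N = [[d / det, -b / det],
     [-c / det, a / det]]

# B = N R: absorption probabilities, rows = start state, columns = absorbing state.
B = [[sum(N[i][k] * R[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]
print(B)
```

Starting next to an absorbing state, the walk ends there with probability 2/3 and at the far end with probability 1/3; each row of B sums to 1 because absorption is certain.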
Back to Hot Potato Routing

Use Matlab to get the matrices

Matrix N =
  3.2174  2.6957  2.3478  6.6522  4.5652  4.0000  3.8261  3.2174  2.6087
  1.3478  3.9130  2.9565  4.0435  4.3043  4.0000  2.5217  2.3478  2.1739
  1.1739  2.9565  3.4783  3.5217  3.6522  4.0000  2.2609  2.1739  2.0870
  2.2174  2.6957  2.3478  6.6522  4.5652  4.0000  3.8261  3.2174  2.6087
  1.5217  2.8696  2.4348  4.5652  4.9565  4.0000  2.7826  2.5217  2.2609
  1.0000  2.0000  2.0000  3.0000  3.0000  4.0000  2.0000  2.0000  2.0000
  1.9130  2.5217  2.2609  5.7391  4.1739  4.0000  4.8696  3.9130  2.9565
  1.6087  2.3478  2.1739  4.8261  3.7826  4.0000  3.9130  4.6087  3.3043
  1.3043  2.1739  2.0870  3.9130  3.3913  4.0000  2.9565  3.3043  3.6522

Vector T =
  33.1304
  27.6087
  25.3043
  32.1304
  27.9130
  21.0000
  32.3478
  30.5652
  26.7826
Example: ARQ and End-to-End Retransmission

A wireless path consisting of H hops (links)
  link success probability p
  A packet is (re-)transmitted up to M times on each link
  If it fails, it gets retransmitted from the source (end-to-end)
Q: How many transmissions until end-to-end success?