inference in hmm tutorial #6 © ilan gronau. based on original slides of ydo wexler & dan geiger
Post on 21-Dec-2015
Inference in HMM: Tutorial #6
© Ilan Gronau.
Based on original slides of Ydo Wexler & Dan Geiger
2
Hidden Markov Models - HMM
[Figure: a chain of hidden states H1, H2, …, Hi, …, HL-1, HL; each hidden state Hi emits one item of observed data Xi, giving X1, X2, …, Xi, …, XL-1, XL.]
3
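Hidden Markov Models - HMM: C-G Islands Example
Hidden states Hi: C-G / Regular
Observed data Xi: {A,C,G,T}
[Figure: two copies of the nucleotides A, C, G, T joined by "change" transitions, drawn above the chain H1, …, HL emitting X1, …, XL.]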
4
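Hidden Markov Models - HMM: Coin-Tossing Example
Hidden states Hi: Fair/Loaded
Observed data Xi: Head/Tail
Start probabilities: Fair 1/2, Loaded 1/2
Transition probabilities: Fair→Fair 0.9, Fair→Loaded 0.1, Loaded→Loaded 0.9, Loaded→Fair 0.1
Emission probabilities: Fair emits H 1/2, T 1/2; Loaded emits H 3/4, T 1/4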
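As a minimal sketch, the coin-tossing model on this slide can be written down as plain Python dictionaries. The names STATES, START, TRANS, EMIT and is_distribution are illustrative choices, not part of the original slides; the numbers are the ones given in the diagram.

```python
# Coin-tossing HMM parameters taken from the slide.
# All identifiers below are hypothetical names chosen for this sketch.
STATES = ("Fair", "Loaded")

START = {"Fair": 0.5, "Loaded": 0.5}

TRANS = {
    "Fair": {"Fair": 0.9, "Loaded": 0.1},
    "Loaded": {"Fair": 0.1, "Loaded": 0.9},
}

EMIT = {
    "Fair": {"H": 0.5, "T": 0.5},
    "Loaded": {"H": 0.75, "T": 0.25},
}


def is_distribution(probs, tol=1e-12):
    """Return True if the values are non-negative and sum to 1."""
    values = list(probs.values())
    return all(v >= 0 for v in values) and abs(sum(values) - 1.0) < tol


# Sanity check: the start vector and every row must be a distribution.
model_ok = (
    is_distribution(START)
    and all(is_distribution(TRANS[s]) for s in STATES)
    and all(is_distribution(EMIT[s]) for s in STATES)
)
print("model is consistent:", model_ok)
```

Nested dictionaries keep the lookups close to the slide notation: TRANS[S][S’] plays the role of Ptrans[S→S’] and EMIT[S][x] the role of Pemit[S→x].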
5
Hidden Markov Models - HMM: Interesting inference queries
• Most probable state for a certain position:
MAX_S {Pr[ Si=S | X1,…,XL ]}
• Most probable path through states:
MAX_Š {Pr[ S1,…,SL=Š | X1,…,XL ]}
• Likelihood of evidence:
Pr[ X1,…,XL | HMM ]
6
Coin-Tossing Example: Question
Query: what are the probabilities of the Fair and Loaded coins given the sequence of tosses {X1,…,XL}?
1. Compute the posterior belief in Si (for a specific i) given the evidence {X1,…,XL}, for each possible value of Si:
Pr[ Si=Loaded | X1,…,XL ] and Pr[ Si=Fair | X1,…,XL ]
2. Do this for every Si without repeating the first task L times.
7
Decomposing the computation
Pr[ X1,…,XL , Si=S ] = Pr[ X1,…,Xi , Si=S ] * Pr[ Xi+1,…,XL | X1,…,Xi , Si=S ] =
= Pr[ X1,…,Xi , Si=S ] * Pr[ Xi+1,…,XL | Si=S ] = {Markov property}
= fi(S) * bi(S)
Recall: Pr[ Si=S | X1,…,XL ] = Pr[ X1,…,XL , Si=S ] / Pr[ X1,…,XL ]
where Pr[ X1,…,XL ] = Σ_S’ (Pr[ X1,…,XL , Si=S’ ])
8
The forward algorithm
The task: Compute fi(S) = Pr[ X1,…,Xi , Si=S ] for i=1,…,L (consider the evidence up to time slot i)
f1(S) = Pr[ X1 , S1=S ] = Pr[ S1=S ] * Pr[ X1 | S1=S ] {Base step}
f2(S) = Σ_S’ (Pr[ X1 X2 , S1=S’ , S2=S ]) = {2nd step}
= Σ_S’ (Pr[ X1 , S1=S’ ] * Pr[ S2=S | X1 , S1=S’ ] * Pr[ X2 | X1 , S1=S’ , S2=S ]) =
= Σ_S’ (Pr[ X1 , S1=S’ ] * Pr[ S2=S | S1=S’ ] * Pr[ X2 | S2=S ]) = {direct dependency}
= Σ_S’ (f1(S’) * Ptrans[S’→S] * Pemit[S→X2])
fi(S) = Σ_S’ (fi-1(S’) * Ptrans[S’→S] * Pemit[S→Xi]) {ith step}
Here Ptrans[S’→S] is the transition probability and Pemit[S→Xi] is the emission probability.
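The recursion above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions: it uses the coin-tossing parameters from this tutorial, and the names START, TRANS, EMIT and forward are hypothetical, chosen only for this example.

```python
# Coin-tossing HMM parameters (from the tutorial's diagram).
STATES = ("Fair", "Loaded")
START = {"Fair": 0.5, "Loaded": 0.5}
TRANS = {"Fair": {"Fair": 0.9, "Loaded": 0.1},
         "Loaded": {"Fair": 0.1, "Loaded": 0.9}}
EMIT = {"Fair": {"H": 0.5, "T": 0.5},
        "Loaded": {"H": 0.75, "T": 0.25}}


def forward(tosses):
    """Return [f_1, ..., f_L] with f_i[S] = Pr[X_1..X_i, S_i = S]."""
    # Base step: f_1(S) = Pr[S_1 = S] * Pemit[S -> X_1]
    f = [{s: START[s] * EMIT[s][tosses[0]] for s in STATES}]
    # i-th step: f_i(S) = sum_S' f_{i-1}(S') * Ptrans[S' -> S] * Pemit[S -> X_i]
    for x in tosses[1:]:
        prev = f[-1]
        f.append({
            s: sum(prev[p] * TRANS[p][s] for p in STATES) * EMIT[s][x]
            for s in STATES
        })
    return f


f = forward("HHT")
for i, f_i in enumerate(f, start=1):
    print(f"f{i}:", f_i)
```

Each step only reads the previous column, so the whole table costs on the order of L * (number of states)^2 multiplications.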
9
The backward algorithm
The task: Compute bi(S) = Pr[ Xi+1,…,XL | Si=S ] for i=1,…,L (consider the evidence after time slot i)
bL-1(S) = Pr[ XL | SL-1=S ] = Σ_S’ (Pr[ XL , SL=S’ | SL-1=S ]) = {Base step}
= Σ_S’ (Pr[ SL=S’ | SL-1=S ] * Pr[ XL | SL-1=S , SL=S’ ]) =
= Σ_S’ (Ptrans[S→S’] * Pemit[S’→XL]) {direct dependency}
bi(S) = Σ_S’ (Ptrans[S→S’] * Pemit[S’→Xi+1] * bi+1(S’)) {ith step}
With the convention bL(S) = 1 (no evidence remains after time slot L), the base step is the ith step for i = L-1.
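The backward recursion can be sketched the same way. The code below is an illustration, not the authors' implementation: it assumes the coin-tossing parameters of this tutorial and the convention b_L(S) = 1, and the names START, TRANS, EMIT and backward are hypothetical.

```python
# Coin-tossing HMM parameters (from the tutorial's diagram).
STATES = ("Fair", "Loaded")
START = {"Fair": 0.5, "Loaded": 0.5}
TRANS = {"Fair": {"Fair": 0.9, "Loaded": 0.1},
         "Loaded": {"Fair": 0.1, "Loaded": 0.9}}
EMIT = {"Fair": {"H": 0.5, "T": 0.5},
        "Loaded": {"H": 0.75, "T": 0.25}}


def backward(tosses):
    """Return [b_1, ..., b_L] with b_i[S] = Pr[X_{i+1}..X_L | S_i = S]."""
    # Assumed convention: b_L(S) = 1, since no evidence follows slot L.
    b = [{s: 1.0 for s in STATES}]
    # Walk from the last observation back to the second one:
    # b_i(S) = sum_S' Ptrans[S -> S'] * Pemit[S' -> X_{i+1}] * b_{i+1}(S')
    for x in reversed(tosses[1:]):
        nxt = b[0]
        b.insert(0, {
            s: sum(TRANS[s][n] * EMIT[n][x] * nxt[n] for n in STATES)
            for s in STATES
        })
    return b


b = backward("HHT")
for i, b_i in enumerate(b, start=1):
    print(f"b{i}:", b_i)
```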
10
The combined answer
1. Compute the posterior belief in Si (for a specific i) given the evidence {X1,…,XL}, for each possible value of Si.
Answer: Run the forward and backward algorithms to obtain fi(S) and bi(S), then normalize:
Pr[ Si=S | X1,…,XL ] = fi(S) * bi(S) / Σ_S’ (fi(S’) * bi(S’))
2. Do this for every Si without repeating the first task L times.
Answer: Run the forward algorithm up to fL(S) and the backward algorithm down to b1(S)
(intermediate values are saved on the way).
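Putting the two passes together, the posterior for every position can be obtained from one forward and one backward sweep. The sketch below assumes the coin-tossing parameters of this tutorial and the convention b_L(S) = 1; the function names forward, backward and posteriors are hypothetical.

```python
# Coin-tossing HMM parameters (from the tutorial's diagram).
STATES = ("Fair", "Loaded")
START = {"Fair": 0.5, "Loaded": 0.5}
TRANS = {"Fair": {"Fair": 0.9, "Loaded": 0.1},
         "Loaded": {"Fair": 0.1, "Loaded": 0.9}}
EMIT = {"Fair": {"H": 0.5, "T": 0.5},
        "Loaded": {"H": 0.75, "T": 0.25}}


def forward(tosses):
    """f_i[S] = Pr[X_1..X_i, S_i = S] for i = 1..L."""
    f = [{s: START[s] * EMIT[s][tosses[0]] for s in STATES}]
    for x in tosses[1:]:
        prev = f[-1]
        f.append({s: sum(prev[p] * TRANS[p][s] for p in STATES) * EMIT[s][x]
                  for s in STATES})
    return f


def backward(tosses):
    """b_i[S] = Pr[X_{i+1}..X_L | S_i = S] for i = 1..L, with b_L = 1."""
    b = [{s: 1.0 for s in STATES}]
    for x in reversed(tosses[1:]):
        nxt = b[0]
        b.insert(0, {s: sum(TRANS[s][n] * EMIT[n][x] * nxt[n] for n in STATES)
                     for s in STATES})
    return b


def posteriors(tosses):
    """Return a list whose (i-1)-th entry maps S to Pr[S_i = S | X_1..X_L]."""
    result = []
    for f_i, b_i in zip(forward(tosses), backward(tosses)):
        joint = {s: f_i[s] * b_i[s] for s in STATES}  # Pr[X_1..X_L, S_i = S]
        total = sum(joint.values())                   # Pr[X_1..X_L]
        result.append({s: joint[s] / total for s in STATES})
    return result


for i, post in enumerate(posteriors("HHT"), start=1):
    print(f"position {i}:", post)
```

Both tables are computed once and reused for every i, which is the point of the second answer above.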
11
Likelihood of evidence
Likelihood of evidence: Pr[ X1,…,XL | HMM ] = Pr[ X1,…,XL ] = ?
Pr[ X1,…,XL ] = Σ_S (Pr[ X1,…,XL , Si=S ]) = Σ_S (fi(S) * bi(S))
You should get the same value no matter which i you choose.
12
Coin-Tossing HMM: Numeric example
Outcome of 3 tosses: Head, Head, Tail
Forward – 1st step:
Pr[ X1=H , S1=Loaded ] = Pr[ S1=Loaded ] * Pr[ Loaded→H ] = 0.5 * 0.75 = 0.375
Pr[ X1=H , S1=Fair ] = Pr[ S1=Fair ] * Pr[ Fair→H ] = 0.5 * 0.5 = 0.25
Recall: fi(S) = Pr[ X1,…,Xi , Si=S ] = Σ_S’ (fi-1(S’) * Ptrans[S’→S] * Pemit[S→Xi])
13
Coin-Tossing HMM: Forward algorithm
Outcome of 3 tosses: Head, Head, Tail
Forward – 1st step:
Pr[ X1=H , S1=Loaded ] = Pr[ S1=Loaded ] * Pr[ Loaded→H ] = 0.5 * 0.75 = 0.375
Pr[ X1=H , S1=Fair ] = Pr[ S1=Fair ] * Pr[ Fair→H ] = 0.5 * 0.5 = 0.25
Forward – 2nd step:
Pr[ X1X2=HH , S2=Loaded ] =
Pr[ X1=H , S1=Loaded ] * Pr[ Loaded→Loaded ] * Pr[ Loaded→H ] +
Pr[ X1=H , S1=Fair ] * Pr[ Fair→Loaded ] * Pr[ Loaded→H ] =
0.375 * 0.9 * 0.75 + 0.25 * 0.1 * 0.75 = 0.271875
Pr[ X1X2=HH , S2=Fair ] =
Pr[ X1=H , S1=Loaded ] * Pr[ Loaded→Fair ] * Pr[ Fair→H ] +
Pr[ X1=H , S1=Fair ] * Pr[ Fair→Fair ] * Pr[ Fair→H ] =
0.375 * 0.1 * 0.5 + 0.25 * 0.9 * 0.5 = 0.13125
Recall: fi(S) = Pr[ X1,…,Xi , Si=S ] = Σ_S’ (fi-1(S’) * Ptrans[S’→S] * Pemit[S→Xi])
14
Coin-Tossing HMM: Forward algorithm
Outcome of 3 tosses: Head, Head, Tail
Forward – 1st step:
Pr[ X1=H , S1=Loaded ] = Pr[ S1=Loaded ] * Pr[ Loaded→H ] = 0.5 * 0.75 = 0.375
Pr[ X1=H , S1=Fair ] = Pr[ S1=Fair ] * Pr[ Fair→H ] = 0.5 * 0.5 = 0.25
Forward – 2nd step:
Pr[ X1X2=HH , S2=Loaded ] = 0.271875
Pr[ X1X2=HH , S2=Fair ] = 0.13125
Forward – 3rd step:
Pr[ X1X2X3=HHT , S3=Loaded ] =
Pr[ X1X2=HH , S2=Loaded ] * Pr[ Loaded→Loaded ] * Pr[ Loaded→T ] +
Pr[ X1X2=HH , S2=Fair ] * Pr[ Fair→Loaded ] * Pr[ Loaded→T ] =
0.271875 * 0.9 * 0.25 + 0.13125 * 0.1 * 0.25 ≈ 0.06445
Pr[ X1X2X3=HHT , S3=Fair ] =
Pr[ X1X2=HH , S2=Loaded ] * Pr[ Loaded→Fair ] * Pr[ Fair→T ] +
Pr[ X1X2=HH , S2=Fair ] * Pr[ Fair→Fair ] * Pr[ Fair→T ] =
0.271875 * 0.1 * 0.5 + 0.13125 * 0.9 * 0.5 ≈ 0.07265
Recall: fi(S) = Pr[ X1,…,Xi , Si=S ] = Σ_S’ (fi-1(S’) * Ptrans[S’→S] * Pemit[S→Xi])
15
Coin-Tossing HMM: Backward algorithm
Outcome of 3 tosses: Head, Head, Tail
Backward – 1st step:
Pr[ X3=T | S2=Loaded ] = Pr[ Loaded→Loaded ] * Pr[ Loaded→T ] +
Pr[ Loaded→Fair ] * Pr[ Fair→T ] = 0.9 * 0.25 + 0.1 * 0.5 = 0.275
Pr[ X3=T | S2=Fair ] = Pr[ Fair→Loaded ] * Pr[ Loaded→T ] +
Pr[ Fair→Fair ] * Pr[ Fair→T ] = 0.1 * 0.25 + 0.9 * 0.5 = 0.475
Backward – 2nd step:
Pr[ X2X3=HT | S1=Loaded ] =
Pr[ Loaded→Loaded ] * Pr[ Loaded→H ] * Pr[ X3=T | S2=Loaded ] +
Pr[ Loaded→Fair ] * Pr[ Fair→H ] * Pr[ X3=T | S2=Fair ] =
0.9 * 0.75 * 0.275 + 0.1 * 0.5 * 0.475 ≈ 0.209
Pr[ X2X3=HT | S1=Fair ] =
Pr[ Fair→Loaded ] * Pr[ Loaded→H ] * Pr[ X3=T | S2=Loaded ] +
Pr[ Fair→Fair ] * Pr[ Fair→H ] * Pr[ X3=T | S2=Fair ] =
0.1 * 0.75 * 0.275 + 0.9 * 0.5 * 0.475 ≈ 0.234
Recall: bi(S) = Pr[ Xi+1,…,XL | Si=S ] = Σ_S’ (Ptrans[S→S’] * Pemit[S’→Xi+1] * bi+1(S’))
16
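Coin-Tossing HMM: Likelihood of evidence
Outcome of 3 tosses: Head, Head, Tail
Recall: likelihood = Pr[ X1,…,XL ] = Σ_S (fi(S) * bi(S)), for any i
Likelihood:
• i=3: 0.06445 * 1 + 0.07265 * 1 ≈ 0.1371
• i=2: 0.271875 * 0.275 + 0.13125 * 0.475 ≈ 0.1371
• i=1: 0.375 * 0.209 + 0.25 * 0.234 ≈ 0.1371
(Some factors shown are rounded; with unrounded factors all three sums equal 0.137109375.)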
Forward:
          i=1      i=2        i=3
fi(L)     0.375    0.271875   0.06445
fi(F)     0.25     0.13125    0.07265
Backward:
          i=1      i=2        i=3
bi(L)     0.209    0.275      1
bi(F)     0.234    0.475      1
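The claim that every position gives the same likelihood can be checked directly from the two tables. The short sketch below is only an illustration: the dictionaries F and B hold the forward and backward values for Loaded and Fair, entered here in unrounded form (for example 0.064453125 rather than 0.06445), and the names F, B and likelihood_at are hypothetical.

```python
STATES = ("Loaded", "Fair")

# Forward values f_1, f_2, f_3 per state (unrounded).
F = {"Loaded": [0.375, 0.271875, 0.064453125],
     "Fair": [0.25, 0.13125, 0.07265625]}

# Backward values b_1, b_2, b_3 per state (unrounded).
B = {"Loaded": [0.209375, 0.275, 1.0],
     "Fair": [0.234375, 0.475, 1.0]}


def likelihood_at(i):
    """Sum f_i(S) * b_i(S) over both states; i is 1-based."""
    return sum(F[s][i - 1] * B[s][i - 1] for s in STATES)


likelihoods = [likelihood_at(i) for i in (1, 2, 3)]
print(likelihoods)
```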
17
Most Probable Path
• Likelihood of evidence:
Pr[ X1…XL ] = Σ_Š (Pr[ X1…XL , S1…SL=Š ])
• We wish to compute:
Pr*[ X1…XL ] = MAX_Š {Pr[ X1…XL , S1…SL=Š ]}
and the most probable path leading to this value:
S1*,…,SL* = ARGMAX_Š {Pr[ X1…XL , S1…SL=Š ]}
18
Most Probable Path: Revisiting likelihood calculation
Pr[ X1,X2,X3 ] = Σ_S’ Σ_S’’ Σ_S’’’ (Pr[ X1X2X3 , S1S2S3 = S’S’’S’’’ ]) =
= Σ_S’ (Pr[S1=S’]*Pr[S’→X1]*Σ_S’’ (Pr[S’→S’’]*Pr[S’’→X2]*Σ_S’’’ (Pr[S’’→S’’’]*Pr[S’’’→X3]))) =
= Σ_S’ (Pr[S1=S’]*Pr[S’→X1]*Σ_S’’ (Pr[S’→S’’]*Pr[S’’→X2]*b2(S’’))) =
= Σ_S’ (Pr[S1=S’]*Pr[S’→X1]*b1(S’))
19
Most Probable Path
Pr*[ X1,X2,X3 ] = MAX_S’,S’’,S’’’ {Pr[ X1X2X3 , S1S2S3 = S’S’’S’’’ ]} =
= MAX_S’ {Pr[S1=S’]*Pr[S’→X1]*MAX_S’’ {Pr[S’→S’’]*Pr[S’’→X2]*MAX_S’’’ {Pr[S’’→S’’’]*Pr[S’’’→X3]}}} =
= MAX_S’ {Pr[S1=S’]*Pr[S’→X1]*MAX_S’’ {Pr[S’→S’’]*Pr[S’’→X2]*v2(S’’)}} =
= MAX_S’ {Pr[S1=S’]*Pr[S’→X1]*v1(S’)}
Most probable path: S1* S2* S3* (the values of S’, S’’, S’’’ that attain the maxima)
20
Most Probable Path: Viterbi’s algorithm
Classical Dynamic Programming
Backward phase: Calculate values vi(S) = Pr*[ Xi+1,…,XL | Si=S ]
• Base: vL(S) = 1
• Step: vi(S) = MAX_S’ {Pr[S→S’]*Pr[S’→Xi+1]*vi+1(S’)}
  πi+1(S) = ARGMAX_S’ {Pr[S→S’]*Pr[S’→Xi+1]*vi+1(S’)} (the value of Si+1 which maximizes the probability, given Si=S)
Forward phase: Trace path of maximum probability
• Base: π1 = S1* = ARGMAX_S’ {Pr[S’]*Pr[S’→X1]*v1(S’)}
• Step: Si+1* = πi+1(Si*)
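The two phases described on this slide can be sketched as follows. This is an illustrative implementation in the slide's backward-then-forward formulation, assuming the coin-tossing parameters of this tutorial; the names viterbi, v and pointers are hypothetical, and pointers[i] stores what the slide calls π for the (i+1)-th position when i is counted from zero.

```python
# Coin-tossing HMM parameters (from the tutorial's diagram).
STATES = ("Fair", "Loaded")
START = {"Fair": 0.5, "Loaded": 0.5}
TRANS = {"Fair": {"Fair": 0.9, "Loaded": 0.1},
         "Loaded": {"Fair": 0.1, "Loaded": 0.9}}
EMIT = {"Fair": {"H": 0.5, "T": 0.5},
        "Loaded": {"H": 0.75, "T": 0.25}}


def viterbi(tosses):
    """Return (Pr*[X_1..X_L], most probable state path) for the tosses."""
    n = len(tosses)
    v = [None] * n         # v[i][S]: best probability of the suffix after slot i
    pointers = [None] * n  # pointers[i][S]: best state at slot i given S at slot i-1

    # Backward phase. Base: v_L(S) = 1.
    v[n - 1] = {s: 1.0 for s in STATES}
    for i in range(n - 2, -1, -1):
        x = tosses[i + 1]
        v[i], pointers[i + 1] = {}, {}
        for s in STATES:
            scores = {nxt: TRANS[s][nxt] * EMIT[nxt][x] * v[i + 1][nxt]
                      for nxt in STATES}
            best = max(scores, key=scores.get)
            pointers[i + 1][s] = best
            v[i][s] = scores[best]

    # Forward phase. Base: pick the best first state, then follow pointers.
    first_scores = {s: START[s] * EMIT[s][tosses[0]] * v[0][s] for s in STATES}
    first = max(first_scores, key=first_scores.get)
    path = [first]
    for i in range(1, n):
        path.append(pointers[i][path[-1]])
    return first_scores[first], path


best_prob, best_path = viterbi("HHT")
print(best_prob, best_path)
```

Ties are broken by the order of STATES here; the slides do not specify a tie-breaking rule, so that choice is an assumption of the sketch.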
21
Most Probable Path: Coin-Tossing Example
Reminder: hidden states Fair/Loaded, observations Head/Tail; start 1/2, 1/2; stay with the same coin 0.9, switch 0.1; Fair emits H 1/2, T 1/2; Loaded emits H 3/4, T 1/4.
Outcome of 3 tosses: Head, Head, Tail
What is the most probable series of coins?
22
Most Probable Path: Coin-Tossing Example
Observations: X1=H, X2=H, X3=T
S1 , S2 , S3     Pr[ X1,X2,X3 , S1,S2,S3 ]
F , F , F        (0.5)^3*0.5*(0.9)^2 = 0.050625
F , F , L        (0.5)^2*0.25*0.5*0.9*0.1 = 0.0028125
F , L , F        0.5*0.75*0.5*0.5*0.1*0.1 = 0.0009375
F , L , L        0.5*0.75*0.25*0.5*0.1*0.9 ≈ 0.00422
L , F , F        0.75*0.5*0.5*0.5*0.1*0.9 = 0.0084375
L , F , L        0.75*0.5*0.25*0.5*0.1*0.1 ≈ 0.000468
L , L , F        0.75*0.75*0.5*0.5*0.9*0.1 ≈ 0.01265
L , L , L        0.75*0.75*0.25*0.5*0.9*0.9 ≈ 0.0569   (max)
Pr[ X1,X2,X3 , S1,S2,S3 ] = Pr[ S1,S2,S3 | X1,X2,X3 ] * Pr[ X1,X2,X3 ]
Since Pr[ X1,X2,X3 ] does not depend on the path, the path that maximizes the joint probability also maximizes the posterior.
Enumerating all paths is exponential in the length of the observation.
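The table above can be reproduced by enumerating all 2^3 paths. The sketch below is a brute-force illustration only, assuming the coin-tossing parameters of this tutorial; the names joint and table are hypothetical.

```python
from itertools import product

# Coin-tossing HMM parameters (from the tutorial's diagram).
STATES = ("Fair", "Loaded")
START = {"Fair": 0.5, "Loaded": 0.5}
TRANS = {"Fair": {"Fair": 0.9, "Loaded": 0.1},
         "Loaded": {"Fair": 0.1, "Loaded": 0.9}}
EMIT = {"Fair": {"H": 0.5, "T": 0.5},
        "Loaded": {"H": 0.75, "T": 0.25}}


def joint(path, tosses):
    """Pr[X_1..X_L, S_1..S_L = path] for one complete state path."""
    p = START[path[0]] * EMIT[path[0]][tosses[0]]
    for prev, cur, x in zip(path, path[1:], tosses[1:]):
        p *= TRANS[prev][cur] * EMIT[cur][x]
    return p


tosses = "HHT"
# One entry per path: len(STATES) ** len(tosses) entries in total.
table = {path: joint(path, tosses)
         for path in product(STATES, repeat=len(tosses))}

best_path = max(table, key=table.get)
for path, prob in table.items():
    print(path, prob)
print("most probable path:", best_path)
print("sum over all paths:", sum(table.values()))
```

The sum over all rows is the likelihood of the evidence, which gives a cross-check against the forward and backward computations; the number of rows doubles with every extra toss, which is why this approach does not scale.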
23
Most Probable Path: Coin-Tossing Example
Observations: X1=H, X2=H, X3=T
Backward phase: Calculate values vi(S) = Pr*[ Xi+1,…,XL | Si=S ]
• Base: v3(L) = v3(F) = 1
• Step: vi(S) = MAX_S’ {Pr[S→S’]*Pr[S’→Xi+1]*vi+1(S’)}
• v2(L) = MAX {Pr[L→L]*Pr[L→T]*v3(L) , Pr[L→F]*Pr[F→T]*v3(F)} =
= MAX {0.9 * 0.25 , 0.1 * 0.5} = 0.225
π3(L) = L
• v2(F) = MAX {Pr[F→L]*Pr[L→T]*v3(L) , Pr[F→F]*Pr[F→T]*v3(F)} =
= MAX {0.1 * 0.25 , 0.9 * 0.5} = 0.45
π3(F) = F
24
Most Probable Path: Coin-Tossing Example
Observations: X1=H, X2=H, X3=T
Backward phase: Calculate values vi(S) = Pr*[ Xi+1,…,XL | Si=S ]
• Step: vi(S) = MAX_S’ {Pr[S→S’]*Pr[S’→Xi+1]*vi+1(S’)}
• v2(L) = 0.225    π3(L) = L
• v2(F) = 0.45    π3(F) = F
• v1(L) = MAX {Pr[L→L]*Pr[L→H]*v2(L) , Pr[L→F]*Pr[F→H]*v2(F)} =
= MAX {0.9*0.75*0.225 , 0.1*0.5*0.45} = 0.151875
π2(L) = L
• v1(F) = MAX {Pr[F→L]*Pr[L→H]*v2(L) , Pr[F→F]*Pr[F→H]*v2(F)} =
= MAX {0.1*0.75*0.225 , 0.9*0.5*0.45} = 0.2025
π2(F) = F
25
Most Probable Path: Coin-Tossing Example
Observations: X1=H, X2=H, X3=T
Backward phase: Calculate values vi(S) = Pr*[ Xi+1,…,XL | Si=S ]
• v2(L) = 0.225    π3(L) = L
• v2(F) = 0.45    π3(F) = F
• v1(L) = 0.151875    π2(L) = L
• v1(F) = 0.2025    π2(F) = F
• Pr*[HHT] = MAX {Pr[L]*Pr[L→H]*v1(L) , Pr[F]*Pr[F→H]*v1(F)} =
= MAX {0.5*0.75*0.151875 , 0.5*0.5*0.2025} =
= MAX {0.056953125 , 0.050625} = 0.056953125
Forward phase: Trace maximum-pointers
• S1* = L
• S2* = π2(S1*) = L
• S3* = π3(S2*) = L
26
Hidden Markov Models - HMM: Interesting inference queries
• Most probable state for a certain position: √
MAX_S {Pr[ Si=S | X1,…,XL ]}
• Most probable path through states: √
MAX_Š {Pr[ S1,…,SL=Š | X1,…,XL ]}
• Likelihood of evidence: √
Pr[ X1,…,XL | HMM ]