
Hidden Markov Models
11-711: Algorithms for NLP
Fall 2017

Table of Contents

1 Notations
2 Hidden Markov Model
3 Computing the Likelihood: Forward-Pass Algorithm
4 Finding the Hidden Sequence: Viterbi Algorithm
5 Estimating Parameters: Baum-Welch Algorithm


Notations

• R: the set of real numbers
• Cartesian products:
  ◦ R^{D1×D2}: the set of D1 × D2 matrices with real entries
• Vectors:
  ◦ Lower case: a, b, c, ...
  ◦ Row major: a ∈ R^D means a ∈ R^{1×D}
• Matrices:
  ◦ Upper case: A, B, C, ...
  ◦ Row major


Hidden Markov Model


Hidden Markov Model: Notations

Name                  Notation                   Meaning/Property
State space           Q = {q1, q2, ..., qN}      Set of N states
Observation space     V = {w1, w2, ..., wV}      Set of V observation symbols
State sequence        S = (s1, s2, ..., sT)      Sequence of T steps, si ∈ Q
Observation sequence  O = (o1, o2, ..., oT)      Sequence of T steps, oi ∈ V
Transition probs      A ∈ R^{N×N}                Each row is a valid distribution
Emission probs        B ∈ R^{N×V}                Each row is a valid distribution

Valid (probability) distribution, e.g. for A:
• A(i, j) ≥ 0
• ∑_{j=1}^{N} A(i, j) = 1
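The table above maps directly onto arrays. A minimal sketch of the notation in NumPy — the two states, three symbols, and all probability values below are made-up illustrations, not parameters from the recitation:

```python
import numpy as np

# State space Q (N = 2) and observation space V (V = 3); names are illustrative.
Q = ["HOT", "COLD"]
V = ["1", "2", "3"]

# Transition probs A: N x N, row-major; row i is the distribution over next states.
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])

# Emission probs B: N x V; row i is the distribution over symbols emitted in state i.
B = np.array([[0.2, 0.4, 0.4],
              [0.5, 0.4, 0.1]])

# Check the "valid distribution" property for every row of A and B.
assert (A >= 0).all() and (B >= 0).all()
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```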


Hidden Markov Model: Assumptions

The model is a chain s0 → s1 → s2 → ... → sT: each transition st−1 → st has probability A(st−1, st), and each state st emits the observation ot with probability B(st, ot).

• Markov assumption

  P(s0, s1, s2, ..., sT) = ∏_{t=1}^{T} P(st | s<t) = ∏_{t=1}^{T} P(st | st−1) := ∏_{t=1}^{T} A(st−1, st)

• Output-independence assumption

  P(ot | o<t, s≤t) = P(ot | st) := B(st, ot)
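Together the two assumptions give a generative story: follow the chain using A, emit using B at every step. A small sampler under these assumptions (NumPy; state 0 is assumed to play the role of the designated start state s0, and all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters; row 0 is the start state s0 (never returned to).
A = np.array([[0.0, 0.7, 0.3],
              [0.0, 0.6, 0.4],
              [0.0, 0.5, 0.5]])
B = np.array([[0.5, 0.5],
              [0.8, 0.2],
              [0.3, 0.7]])

def sample(A, B, T):
    """Sample (s1..sT, o1..oT) using the Markov and output-independence assumptions."""
    states, obs = [], []
    s = 0  # start in s0
    for _ in range(T):
        s = rng.choice(A.shape[0], p=A[s])          # s_t ~ A(s_{t-1}, .)
        states.append(s)
        obs.append(rng.choice(B.shape[1], p=B[s]))  # o_t ~ B(s_t, .)
    return states, obs

states, obs = sample(A, B, T=5)
```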


More Notations

• Prior: P(A, B)
  ◦ Before seeing anything, what do you believe A and B look like?
• Likelihood: P(O | A, B)
  ◦ Given A and B, how likely are you to observe O?
• Posterior: P(A, B | O)
  ◦ Having observed O, what do you believe A and B look like?
• Bayes' rule

  P(A, B | O) = P(O | A, B) · P(A, B) / P(O)

  posterior = likelihood · prior / data distribution
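Bayes' rule can be sanity-checked on a toy discrete example; here two hypothetical parameter settings H1 and H2 stand in for candidate (A, B) pairs, with made-up numbers:

```python
# Two hypotheses under a uniform prior, and one observation O.
prior = {"H1": 0.5, "H2": 0.5}
likelihood = {"H1": 0.8, "H2": 0.2}   # P(O | H)

# Data distribution (evidence): P(O) = sum_H P(O | H) P(H)
evidence = sum(likelihood[h] * prior[h] for h in prior)

# Posterior: P(H | O) = likelihood * prior / evidence
posterior = {h: likelihood[h] * prior[h] / evidence for h in prior}

# The posterior is itself a valid distribution over hypotheses.
assert abs(sum(posterior.values()) - 1.0) < 1e-12
```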


3 Questions with HMMs

• Setting:
  ◦ The prior P(A, B) is uniform.
  ◦ P(O) is unknown, but O will be observed.
• Question 1: compute the likelihood P(O | A, B)
• Question 2 (only for HMMs): find the most likely hidden sequence S

  S* = argmax_S P(S | O, A, B)

• Question 3: find the posterior-maximizing parameters

  A*, B* = argmax_{A,B} P(A, B | O)


Question

Compute the likelihood P(o1, o2, ..., oT | A, B).

Observation

O depends on the hidden sequence S: in the chain s0 → s1 → ... → sT, each observation ot is emitted from the hidden state st with probability B(st, ot).

Compute P (O|A, B) for a particular S

S = (s1, s2, ..., sT )P (O|S, A, B) = P (o1|s1)P (o2|s2) · · ·P (oT |sT )

=T∏

i=1P (oi|si)

Hidden Markov Models Fall 2017 14 / 32

Page 29: Hidden Markov Modelstbergkir/11711fa17/recitation4_slides.pdf · 2017. 9. 22. · Hidden Markov Model: Notations Name Notation Meaning/Property State space Q = {q 1,q 2,...,q N} Set

What does Bayes say?

P(O | A, B) = ∑_S P(O | S, A, B) · P(S)
            = ∑_{s1,...,sT} ∏_{i=1}^{T} P(oi | si) · ∏_{i=1}^{T} P(si | si−1)
            = ∑_{s1,...,sT} ∏_{i=1}^{T} B(si, oi) · ∏_{i=1}^{T} A(si−1, si)

But there are Θ(N^T) possible sequences S.
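For tiny N and T the sum can be written out literally, which also gives a useful correctness oracle for the forward pass later. A sketch (NumPy; the start transitions A(s0, ·) are assumed to be given as a vector pi, and the 2-state, 2-symbol tables are made up):

```python
import numpy as np
from itertools import product

# Made-up example: N = 2 hidden states, 2 observation symbols.
pi = np.array([0.6, 0.4])            # pi[s] = A(s0, s): transition out of the start state
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
obs = [0, 1, 0]                      # o1, o2, o3

def brute_force_likelihood(pi, A, B, obs):
    """P(O | A, B) by summing over all N**T hidden sequences — Theta(N**T) terms."""
    N, T = A.shape[0], len(obs)
    total = 0.0
    for seq in product(range(N), repeat=T):
        p = pi[seq[0]] * B[seq[0], obs[0]]
        for t in range(1, T):
            p *= A[seq[t - 1], seq[t]] * B[seq[t], obs[t]]
        total += p
    return total

p = brute_force_likelihood(pi, A, B, obs)
assert 0.0 < p < 1.0
```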


Solution: Dynamic Programming

• Compute harder instances from easier instances
• Cache the easier instances

Solution: Dynamic Programming

• T = 1

  P(o1 | A, B) = ∑_{s1} B(s1, o1) · A(s0, s1)

• T = 2

  P(o1, o2 | A, B) = ∑_{s2} ∑_{s1} B(s1, o1) B(s2, o2) · A(s0, s1) A(s1, s2)
                   = ∑_{s2} B(s2, o2) ( ∑_{s1} B(s1, o1) · A(s0, s1) A(s1, s2) )


Solution: Dynamic Programming

• T = 2

  P(o1, o2 | A, B) = ∑_{s2} ∑_{s1} B(s1, o1) B(s2, o2) · A(s0, s1) A(s1, s2)

• T = 3

  P(o1, o2, o3 | A, B) = ∑_{s3} ∑_{s2} ∑_{s1} B(s1, o1) B(s2, o2) B(s3, o3) · A(s0, s1) A(s1, s2) A(s2, s3)
                       = ∑_{s3} B(s3, o3) ∑_{s2} ∑_{s1} B(s1, o1) B(s2, o2) · A(s0, s1) A(s1, s2) A(s2, s3)

• We can cache the inner quantities:

  f[t, s] := P(o1, o2, ..., ot, st = s | A, B)


Solution: Dynamic Programming

• Caching values:

  f[t, s] := P(o1, o2, ..., ot, st = s | A, B)

• How to compute f[t, s]?

  f[t+1, s] = P(o1, o2, ..., ot, ot+1, st+1 = s | A, B)
            = ∑_{s'} P(o1, o2, ..., ot, st = s' | A, B) · P(st+1 = s | st = s') P(ot+1 | st+1 = s)
            = ∑_{s'} f[t, s'] · A(s', s) B(s, ot+1)


Solution: Dynamic Programming

• How to compute f[t, s]?

  f[t+1, s] = ∑_{s'} f[t, s'] · A(s', s) B(s, ot+1)

• What can we do with f[t, s]? Since f[t, s] already includes the emission of ot, marginalizing out the final state gives the likelihood:

  P(o1, o2, ..., ot | A, B) = ∑_s P(o1, o2, ..., ot, st = s | A, B) = ∑_s f[t, s]


Put Together: Forward-Pass Algorithm

(1) Initialize: for each hidden state s:

  f[1, s] = P(o1, s1 = s | A, B) ← B(s, o1) · A(s0, s)

(2) For t = 2 to T: for each hidden state s:

  f[t, s] ← ∑_{s'} f[t−1, s'] · A(s', s) B(s, ot)

(3) Finally:

  P(o1, o2, ..., oT | A, B) ← ∑_s f[T, s]

Complexity: O(T · N^2)
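The three steps above translate into a few lines of NumPy. A sketch, again assuming the start transitions A(s0, ·) are supplied as a vector pi (all concrete numbers are illustrative); the result is checked against brute-force enumeration over all N^T sequences:

```python
import numpy as np
from itertools import product

def forward_likelihood(pi, A, B, obs):
    """Forward pass in O(T * N^2); pi[s] = A(s0, s)."""
    f = pi * B[:, obs[0]]          # (1) f[1, s] = B(s, o1) * A(s0, s)
    for o in obs[1:]:
        f = (f @ A) * B[:, o]      # (2) f[t, s] = sum_s' f[t-1, s'] A(s', s) B(s, o_t)
    return f.sum()                 # (3) P(O | A, B) = sum_s f[T, s]

# Illustrative parameters.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
obs = [0, 1, 0]

# Brute-force oracle: sum over all N**T hidden sequences.
brute = sum(
    pi[s[0]] * B[s[0], obs[0]]
    * np.prod([A[s[t - 1], s[t]] * B[s[t], obs[t]] for t in range(1, len(obs))])
    for s in product(range(2), repeat=len(obs))
)
assert np.isclose(forward_likelihood(pi, A, B, obs), brute)
```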


Question

Given O, find the most likely hidden sequence S:

S* = argmax_S P(S | O, A, B)

Familiar Observation

For S = (s1, s2, ..., sT):

P(O | S, A, B) = P(o1 | s1) P(o2 | s2) · · · P(oT | sT) = ∏_{i=1}^{T} P(oi | si)

But there are Θ(N^T) possible sequences S.


Solution: Dynamic Programming

Replace the sums with max:

• T = 1

  max_{s1} P(o1, s1 | A, B) = max_{s1} P(o1 | s1) · P(s1 | s0)
                            = max_{s1} B(s1, o1) · A(s0, s1)

• T = 2

  max_{s1,s2} P(o1, o2, s1, s2 | A, B) = max_{s2} max_{s1} B(s1, o1) B(s2, o2) · A(s0, s1) A(s1, s2)
                                       = max_{s2} B(s2, o2) ( max_{s1} B(s1, o1) · A(s0, s1) A(s1, s2) )


Solution: Dynamic Programming

• T = 2

  max_{s1,s2} P(o1, o2, s1, s2 | A, B) = max_{s2} max_{s1} B(s1, o1) B(s2, o2) · A(s0, s1) A(s1, s2)

• T = 3

  max_{s1,s2,s3} P(o1, o2, o3, s1, s2, s3 | A, B)
    = max_{s3} max_{s2} max_{s1} B(s1, o1) B(s2, o2) B(s3, o3) · A(s0, s1) A(s1, s2) A(s2, s3)
    = max_{s3} B(s3, o3) max_{s2} max_{s1} B(s1, o1) B(s2, o2) · A(s0, s1) A(s1, s2) A(s2, s3)

• We can cache the inner quantities:

  g[t, s] := max_{s1,s2,...,st−1} P(o1, o2, ..., ot, st = s | A, B)


Solution: Dynamic Programming

• Caching values:

  g[t, s] := max_{s1,...,s_{t-1}} P(o1, o2, ..., ot, st = s | A, B)

• How to compute g[t, s]? The same trick, one more step:

  g[t+1, s] = max_{s1,...,st} P(o1, ..., ot, o_{t+1}, s_{t+1} = s | A, B)
            = max_{s'} [ max_{s1,...,s_{t-1}} P(o1, ..., ot, st = s' | A, B) ] · P(s_{t+1} = s | st = s') · P(o_{t+1} | s_{t+1} = s)
            = max_{s'} g[t, s'] · A(s', s) B(s, o_{t+1})

Hidden Markov Models Fall 2017 27 / 32
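The recurrence can be sketched as a table-filling loop. A minimal sketch, assuming states are 0..N-1 and a vector pi stands in for the start-state row A(s0, ·) (all names are illustrative, not from the slides):

```python
def fill_g(pi, A, B, obs):
    """Fill the DP table; row t-1 holds g[t, s] from the slides.

    pi[s] stands in for A(s0, s); A[s'][s] and B[s][o] are as in the slides.
    """
    N = len(pi)
    g = [[pi[s] * B[s][obs[0]] for s in range(N)]]              # g[1, s]
    for t in range(1, len(obs)):
        # g[t+1, s] = max_{s'} g[t, s'] * A(s', s) * B(s, o_{t+1})
        g.append([max(g[-1][sp] * A[sp][s] for sp in range(N)) * B[s][obs[t]]
                  for s in range(N)])
    return g
```

Each new row only reads the previous row, so the table can be filled left to right in a single pass over the observations.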


Solution: Dynamic Programming

• How to compute g[t, s]?

  g[t+1, s] = max_{s'} g[t, s'] · A(s', s) B(s, o_{t+1})

• What can we do with g[t, s]? Take one more max:

  max_{s1,...,st} P(o1, o2, ..., ot, s1, ..., st | A, B) = max_s max_{s1,...,s_{t-1}} P(o1, ..., ot, st = s | A, B)
                                                        = max_s g[t, s]

  (g[t, s] already contains the emission term B(s, ot), so no extra factor is needed.)

Hidden Markov Models Fall 2017 28 / 32


Put Together: almost the Viterbi Algorithm

(1) Initialize: for each hidden state s:

  g[1, s] ← B(s, o1) · A(s0, s)

(2) For t = 2 to T: for each hidden state s:

  g[t, s] ← max_{s'} g[t-1, s'] · A(s', s) B(s, ot)

(3) Finally:

  max_{s1,...,sT} P(o1, o2, ..., oT, s1, ..., sT | A, B) ← max_s g[T, s]

But we want the argmax, not the max!!!

Hidden Markov Models Fall 2017 29 / 32


+ traceback: Viterbi Algorithm

(1) Initialize: for each hidden state s:

  g[1, s] ← B(s, o1) · A(s0, s)

(2) For t = 2 to T: for each hidden state s:

  g[t, s] ← max_{s'} g[t-1, s'] · A(s', s) B(s, ot)
  h[t, s] ← argmax_{s'} g[t-1, s'] · A(s', s) B(s, ot)

(3) Traceback: start from s*_T = argmax_s g[T, s], then follow s*_{t-1} = h[t, s*_t] to recover s*_T, s*_{T-1}, ..., s*_1.

Complexity: O(T · N^2)

Hidden Markov Models Fall 2017 30 / 32
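Steps (1)–(3), including the h backpointer table and the traceback, can be sketched as follows (a toy representation: pi stands in for A(s0, ·) and states are 0..N-1; these names are illustrative, not from the slides):

```python
def viterbi(pi, A, B, obs):
    """Most probable hidden state sequence and its joint probability.

    pi[s] ~ A(s0, s); A[s'][s] transitions; B[s][o] emissions. O(T * N^2).
    """
    N, T = len(pi), len(obs)
    g = [[pi[s] * B[s][obs[0]] for s in range(N)]]   # (1) g[1, s]
    h = [[0] * N]                                    # backpointers; h[0] unused
    for t in range(1, T):                            # (2) recurrence
        row, back = [], []
        for s in range(N):
            sp = max(range(N), key=lambda x: g[t - 1][x] * A[x][s])
            row.append(g[t - 1][sp] * A[sp][s] * B[s][obs[t]])
            back.append(sp)
        g.append(row)
        h.append(back)
    # (3) traceback: s*_T = argmax_s g[T, s], then s*_{t-1} = h[t, s*_t]
    s = max(range(N), key=lambda x: g[T - 1][x])
    path = [s]
    for t in range(T - 1, 0, -1):
        s = h[t][s]
        path.append(s)
    return list(reversed(path)), max(g[T - 1])
```

Storing the argmax alongside each max is what upgrades the "almost Viterbi" max-only pass to the full algorithm, at no extra asymptotic cost.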


Outline

1 Notations

2 Hidden Markov Model

3 Computing the Likelihood: Forward-Pass Algorithm

4 Finding the Hidden Sequence: Viterbi Algorithm

5 Estimating Parameters: Baum-Welch Algorithm

Hidden Markov Models Fall 2017 31 / 32

Question

Find the parameters that maximize the posterior:

  A*, B* = argmax_{A,B} P(A, B | O)

Hidden Markov Models Fall 2017 32 / 32