Fast Inference and Learning in Large-State-Space HMMs

Sajid M. Siddiqi and Andrew W. Moore, The Auton Lab, Carnegie Mellon University (www.autonlab.org)

Page 1: Fast Inference and Learning in Large-State-Space HMMs

Siddiqi and Moore, www.autonlab.org

Fast Inference and Learning in Large-State-Space HMMs

Sajid M. Siddiqi, Andrew W. Moore

The Auton Lab, Carnegie Mellon University

Page 3: Fast Inference and Learning in Large-State-Space HMMs

Sajid Siddiqi: Happy

Sajid Siddiqi: Discontented

Page 5: Fast Inference and Learning in Large-State-Space HMMs

Hidden Markov Models

[Figure: chain of hidden states q0, q1, q2, q3, q4]

Each of these probability tables is identical:

i   P(qt+1=s1|qt=si)   P(qt+1=s2|qt=si)   …   P(qt+1=sj|qt=si)   …   P(qt+1=sN|qt=si)
1   a11                a12                …   a1j                …   a1N
2   a21                a22                …   a2j                …   a2N
3   a31                a32                …   a3j                …   a3N
:   :                  :                  :   :                  :   :
i   ai1                ai2                …   aij                …   aiN
N   aN1                aN2                …   aNj                …   aNN

Notation: $a_{ij} = P(q_{t+1} = s_j \mid q_t = s_i)$

Page 7: Fast Inference and Learning in Large-State-Space HMMs

Observation Model

[Figure: hidden states q0, q1, q2, q3, q4, each emitting an observation O0, O1, O2, O3, O4]

i   P(Ot=1|qt=si)   P(Ot=2|qt=si)   …   P(Ot=k|qt=si)   …   P(Ot=M|qt=si)
1   b1(1)           b1(2)           …   b1(k)           …   b1(M)
2   b2(1)           b2(2)           …   b2(k)           …   b2(M)
3   b3(1)           b3(2)           …   b3(k)           …   b3(M)
:   :               :               :   :               :   :
i   bi(1)           bi(2)           …   bi(k)           …   bi(M)
:   :               :               :   :               :   :
N   bN(1)           bN(2)           …   bN(k)           …   bN(M)

Notation: $b_i(k) = P(O_t = k \mid q_t = s_i)$

Page 13: Fast Inference and Learning in Large-State-Space HMMs

Some Famous HMM Tasks

Question 1: State Estimation

What is $P(q_T = s_i \mid O_1 O_2 \dots O_T)$?

Question 2: Most Probable Path

Given O1O2…OT, what is the most probable path that I took?

Example: Woke up at 8.35, got on bus at 9.46, sat in lecture 10.05-11.22…

Page 16: Fast Inference and Learning in Large-State-Space HMMs

Some Famous HMM Tasks

Question 1: State Estimation

What is $P(q_T = s_i \mid O_1 O_2 \dots O_T)$?

Question 2: Most Probable Path

Given O1O2…OT, what is the most probable path that I took?

Question 3: Learning HMMs

Given O1O2…OT, what is the maximum likelihood HMM that could have produced this string of observations?

[Figure: three-state HMM with states labelled Eat, Bus and Walk; transition probabilities aAA, aAB, aBA, aBB, aBC, aCB, aCC; observations Ot-1, Ot, Ot+1 with emission probabilities bA(Ot-1), bB(Ot), bC(Ot+1)]

Page 18: Fast Inference and Learning in Large-State-Space HMMs

Basic Operations in HMMs

For an observation sequence O = O1…OT, the three basic HMM operations are:

Problem                                             Algorithm          Complexity+
Evaluation: calculating P(O|λ)                      Forward-Backward   O(TN^2)
Inference: computing Q* = argmax_Q P(O,Q|λ)         Viterbi Decoding   O(TN^2)
Learning: computing λ* = argmax_λ P(O|λ)            Baum-Welch (EM)    O(TN^2)

+ T = # timesteps, N = # states

This talk: a simple approach to reducing the complexity in N

Page 19: Fast Inference and Learning in Large-State-Space HMMs

Reducing Quadratic N penalty

Why does it matter?

• Quadratic HMM algorithms hinder HMM computations when N is large
• Several promising applications for efficient large-state-space HMM algorithms in
  • biological sequence analysis
  • speech recognition
  • real-time HMM systems such as for activity monitoring

Page 23: Fast Inference and Learning in Large-State-Space HMMs

Idea One: Sparse Transition Matrix

• Only K << N non-zero next-state probabilities

0     0     0.3   0     0.7
0.5   0     0     0.5   0
0     0.25  0     0     0.75
0     0.7   0     0.3   0
0.6   0     0.4   0     0

Only O(TNK)

• But can get very badly confused by “impossible transitions”
• Cannot learn the sparse structure (once chosen, it cannot change)

Page 24: Fast Inference and Learning in Large-State-Space HMMs

Dense-Mostly-Constant Transitions

• K non-constant probabilities per row
• DMC HMMs comprise a richer and more expressive class of models than sparse HMMs

A DMC transition matrix with K=2:

0.15  0.15  0.30  0.15  0.25
0.46  0.01  0.01  0.51  0.01
0.05  0.25  0.05  0.05  0.6
0.04  0.7   0.04  0.18  0.04
0.4   0.1   0.3   0.1   0.1

Page 25: Fast Inference and Learning in Large-State-Space HMMs

Dense-Mostly-Constant Transitions

• The transition model for state i now comprises:
  • NCi = { j : si→sj is a non-constant transition probability }
  • ci = the transition probability for si to all states not in NCi
  • aij = the non-constant transition probability for si→sj, j ∈ NCi

0.15  0.15  0.30  0.15  0.25
0.46  0.01  0.01  0.51  0.01
0.05  0.25  0.05  0.05  0.6
0.04  0.7   0.04  0.18  0.04
0.4   0.1   0.3   0.1   0.1

Example (row 3): NC3 = {2,5}, c3 = 0.05, a32 = 0.25, a35 = 0.6
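To make the representation concrete, here is a minimal Python sketch that stores one row as its constant ci plus a small map of non-constant entries. The function names (`to_dmc`, `dmc_lookup`) and the 0-based indexing are assumptions made for this example, not something taken from the talk. The matrix is the K=2 example from the slide, and the sketch assumes the K largest entries of a row are the non-constant ones.

```python
from typing import Dict, List, Tuple

# K=2 DMC transition matrix from the slide (each row sums to 1).
A = [
    [0.15, 0.15, 0.30, 0.15, 0.25],
    [0.46, 0.01, 0.01, 0.51, 0.01],
    [0.05, 0.25, 0.05, 0.05, 0.60],
    [0.04, 0.70, 0.04, 0.18, 0.04],
    [0.40, 0.10, 0.30, 0.10, 0.10],
]


def to_dmc(row: List[float], K: int) -> Tuple[float, Dict[int, float]]:
    """Split a dense row into (c_i, {j: a_ij for j in NC_i}).

    Assumption: the K largest entries are the non-constant ones and all
    remaining entries share one constant value, as in the slide's example.
    """
    order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
    nc = {j: row[j] for j in order[:K]}
    c = row[order[K]]  # any remaining entry; all are equal by assumption
    return c, nc


def dmc_lookup(c: float, nc: Dict[int, float], j: int) -> float:
    """Return a_ij from the compact form: stored value if j is in NC_i, else c_i."""
    return nc.get(j, c)


# State 3 on the slide is row index 2 here; NC3 = {2,5} becomes {1, 4}.
c3, nc3 = to_dmc(A[2], K=2)
```

Each row then needs K+1 numbers instead of N, which is where the reduced parameter count of a DMC HMM comes from.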

Page 31: Fast Inference and Learning in Large-State-Space HMMs

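HMM Filtering

$P(q_t = s_i \mid O_1, O_2, \dots, O_t) = \dfrac{\alpha_t(i)}{\sum_{j=1}^{N} \alpha_t(j)}$

where

$\alpha_t(i) = P(O_1, O_2, \dots, O_t,\ q_t = s_i)$

$\alpha_{t+1}(j) = \sum_{i} \alpha_t(i)\, a_{ij}\, b_j(O_{t+1})$

[Table: the values α_t(1), α_t(2), α_t(3), …, α_t(N) are filled in row by row for t = 1, 2, …, 9]

• Cost O(TN^2)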
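The recursion above can be sketched in a few lines of plain Python. This is an illustrative baseline only: the initial distribution `pi`, the discrete emission table `B[i][k]` and the function name `forward_filter` are assumptions for the example, since the slides do not fix them. The sketch renormalizes alpha at every step, which leaves the filtered distribution unchanged and avoids underflow on long sequences.

```python
from typing import List


def forward_filter(pi: List[float], A: List[List[float]],
                   B: List[List[float]], obs: List[int]) -> List[List[float]]:
    """Return out[t][i] = P(q_t = s_i | O_1..O_t) using the dense O(T N^2) recursion."""
    N = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]
    out = []
    for t, o in enumerate(obs):
        if t > 0:
            # alpha_{t+1}(j) = b_j(O_{t+1}) * sum_i alpha_t(i) a_ij : O(N) per j
            alpha = [B[j][o] * sum(alpha[i] * A[i][j] for i in range(N))
                     for j in range(N)]
        total = sum(alpha)
        alpha = [a / total for a in alpha]  # normalize: alpha_t(i) / sum_j alpha_t(j)
        out.append(alpha)
    return out


# Tiny two-state example with made-up numbers.
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.8, 0.2], [0.3, 0.7]]
filtered = forward_filter(pi, A, B, obs=[0, 1])
```

The inner sum over i for every j is the N^2 factor that the rest of the talk sets out to reduce.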

Page 33: Fast Inference and Learning in Large-State-Space HMMs

Fast Evaluation in DMC HMMs

Under the DMC assumption, the update of α_t(j) splits into two terms:

• one term is O(N), but common to all j per timestep t
• the other term is O(K) for each α_t(j)

This yields O(TNK) complexity for the evaluation problem.
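One way to obtain such a split, written here as a sketch rather than as the authors' exact formulation, is to rewrite the inner sum of the forward recursion using ci and NCi:

$\sum_i \alpha_t(i)\, a_{ij} = \sum_i \alpha_t(i)\, c_i + \sum_{i:\, j \in NC_i} \alpha_t(i)\,(a_{ij} - c_i)$

The first sum does not depend on j, so it is computed once per timestep in O(N). The second touches only the N×K stored non-constant entries. The Python below checks this against the dense step on the K=2 matrix from the slide; the names `dense_step` and `dmc_step` and the test vectors are assumptions for the example.

```python
from typing import Dict, List

# Compact form of the slide's K=2 matrix (0-based state indexes).
c = [0.15, 0.01, 0.05, 0.04, 0.10]
nc: List[Dict[int, float]] = [
    {2: 0.30, 4: 0.25},
    {0: 0.46, 3: 0.51},
    {1: 0.25, 4: 0.60},
    {1: 0.70, 3: 0.18},
    {0: 0.40, 2: 0.30},
]
N = len(c)
A = [[nc[i].get(j, c[i]) for j in range(N)] for i in range(N)]  # dense copy for checking


def dense_step(alpha: List[float], A: List[List[float]], b: List[float]) -> List[float]:
    """Reference update: O(N^2) per timestep."""
    n = len(alpha)
    return [b[j] * sum(alpha[i] * A[i][j] for i in range(n)) for j in range(n)]


def dmc_step(alpha: List[float], c: List[float],
             nc: List[Dict[int, float]], b: List[float]) -> List[float]:
    """DMC update: one shared O(N) sum plus N*K corrections in total."""
    n = len(alpha)
    shared = sum(alpha[i] * c[i] for i in range(n))  # same for every j
    acc = [shared] * n
    for i in range(n):
        for j, a_ij in nc[i].items():
            acc[j] += alpha[i] * (a_ij - c[i])  # replace c_i by a_ij where j is in NC_i
    return [b[j] * acc[j] for j in range(n)]


alpha = [0.10, 0.20, 0.30, 0.25, 0.15]  # made-up alpha_t
b = [0.5, 0.1, 0.2, 0.9, 0.3]           # made-up b_j(O_{t+1})
slow = dense_step(alpha, A, b)
fast = dmc_step(alpha, c, nc, b)
```

Note that a single column j may appear in more than K of the NCi sets, so the O(K) cost per α_t(j) holds on average over j; the total per timestep is O(NK) either way.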

Page 34: Fast Inference and Learning in Large-State-Space HMMs

Siddiqi and Moore, www.autonlab.org

The Viterbi algorithm uses dynamic programming to calculate the globally optimal state sequence Qg=maxQP(Q,O|).

Fast Inference in DMC HMMs

Define t(i) as

The variables can be computed in O(TN2) time, with the O(N) inductive step:

Under the DMC assumption, this step can be carried out in O(K) time:

O(N), but common to all j per timestep t

O(K) for each t(j)
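A possible form of the O(K) step, offered as a sketch under an explicit assumption rather than as the talk's own equation: if every non-constant entry satisfies aij ≥ ci (true for the example matrix, and natural when NCi holds the K largest entries of the row), then

$\max_i \delta_t(i)\, a_{ij} = \max\Big( \max_i \delta_t(i)\, c_i,\ \max_{i:\, j \in NC_i} \delta_t(i)\, a_{ij} \Big)$

The first maximum is shared by all j. The Python below compares this against the dense maximization; the function names and test vectors are hypothetical.

```python
from typing import Dict, List, Tuple

# Compact form of the slide's K=2 matrix (0-based state indexes).
c = [0.15, 0.01, 0.05, 0.04, 0.10]
nc: List[Dict[int, float]] = [
    {2: 0.30, 4: 0.25},
    {0: 0.46, 3: 0.51},
    {1: 0.25, 4: 0.60},
    {1: 0.70, 3: 0.18},
    {0: 0.40, 2: 0.30},
]
N = len(c)
A = [[nc[i].get(j, c[i]) for j in range(N)] for i in range(N)]  # dense copy for checking


def dense_viterbi_step(delta: List[float], A: List[List[float]],
                       b: List[float]) -> List[float]:
    """Reference step: O(N) maximization for each of the N target states."""
    n = len(delta)
    return [b[j] * max(delta[i] * A[i][j] for i in range(n)) for j in range(n)]


def dmc_viterbi_step(delta: List[float], c: List[float], nc: List[Dict[int, float]],
                     b: List[float]) -> Tuple[List[float], List[int]]:
    """DMC step; assumes a_ij >= c_i for every stored non-constant entry."""
    n = len(delta)
    best_i = max(range(n), key=lambda i: delta[i] * c[i])  # O(N), shared by all j
    best = [delta[best_i] * c[best_i]] * n
    back = [best_i] * n                                    # backpointers for the path
    for i in range(n):
        for j, a_ij in nc[i].items():                      # N*K candidates in total
            cand = delta[i] * a_ij
            if cand > best[j]:
                best[j], back[j] = cand, i
    return [b[j] * best[j] for j in range(n)], back


delta = [0.10, 0.20, 0.30, 0.25, 0.15]  # made-up delta_t
b = [0.5, 0.1, 0.2, 0.9, 0.3]           # made-up b_j(O_{t+1})
slow = dense_viterbi_step(delta, A, b)
fast, back = dmc_viterbi_step(delta, c, nc, b)
```

If some non-constant entry were smaller than its row constant, the shared maximum could overestimate the true value, so the assumption matters.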

Page 37: Fast Inference and Learning in Large-State-Space HMMs

Learning a DMC HMM

• Idea One:
  • Ask user to tell us the DMC structure
  • Learn the parameters using EM
• Simple
• But in general, don’t know the DMC structure

Page 40: Fast Inference and Learning in Large-State-Space HMMs

Learning a DMC HMM

• Idea Two: Use EM to learn the DMC structure too
  1. Guess DMC structure
  2. Find expected transition counts and observation parameters, given current model and observations
  3. Find maximum likelihood DMC model given counts
  4. Goto 2

DMC structure can (and does) change!

In fact, just start with an all-constant transition model
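Step 3 can be illustrated for a single row of expected counts. The sketch below assumes the non-constant set is chosen as the K largest counts; for that fixed set, maximizing the expected log-likelihood under the row-sum constraint gives the normalized count for each non-constant entry and an equal share of the leftover mass for the constant. The function name `ml_dmc_row` and the counts are hypothetical.

```python
from typing import Dict, List, Tuple


def ml_dmc_row(S_row: List[float], K: int) -> Tuple[float, Dict[int, float]]:
    """Fit one DMC row (c_i, {j: a_ij}) to expected transition counts S_i1..S_iN.

    Assumption: NC_i is taken to be the K largest counts in the row.
    """
    N = len(S_row)
    total = sum(S_row)
    order = sorted(range(N), key=lambda j: S_row[j], reverse=True)
    # Non-constant entries keep their ordinary normalized-count estimate.
    nc = {j: S_row[j] / total for j in order[:K]}
    # The remaining probability mass is shared equally by the other N-K states.
    rest = sum(S_row[j] for j in order[K:])
    c = rest / ((N - K) * total)
    return c, nc


# Made-up expected counts for one state of a 5-state model.
c_i, nc_i = ml_dmc_row([1.0, 6.0, 1.5, 0.5, 11.0], K=2)
```

Because the top-K set is recomputed from fresh counts on every iteration, the structure is free to change between EM iterations, which matches the remark on the slide.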

Page 41: Fast Inference and Learning in Large-State-Space HMMs

Learning a DMC HMM

2. Find expected transition counts and observation parameters, given current model and observations

Page 45: Fast Inference and Learning in Large-State-Space HMMs

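We want a new estimate of $a_{ij}^{new} = P(q_{t+1} = s_j \mid q_t = s_i)$:

$a_{ij}^{new} = \dfrac{\text{Expected \# transitions } i \to j \mid \lambda^{old}, O_1, O_2, \dots, O_T}{\sum_{k=1}^{N} \text{Expected \# transitions } i \to k \mid \lambda^{old}, O_1, O_2, \dots, O_T}$

$= \dfrac{\sum_{t=1}^{T} P(q_{t+1} = s_j,\ q_t = s_i \mid \lambda^{old}, O_1, O_2, \dots, O_T)}{\sum_{k=1}^{N} \sum_{t=1}^{T} P(q_{t+1} = s_k,\ q_t = s_i \mid \lambda^{old}, O_1, O_2, \dots, O_T)}$

$= \dfrac{S_{ij}}{\sum_{k=1}^{N} S_{ik}}$

where

$S_{ij} = \sum_{t=1}^{T} P(q_{t+1} = s_j,\ q_t = s_i,\ O_1, \dots, O_T \mid \lambda^{old}) = a_{ij} \sum_{t=1}^{T} \alpha_t(i)\, \beta_{t+1}(j)\, b_j(O_{t+1})$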

Page 48: Fast Inference and Learning in Large-State-Space HMMs

We want $a_{ij}^{new} = S_{ij} \big/ \sum_{k=1}^{N} S_{ik}$, where $S_{ij} = a_{ij} \sum_{t=1}^{T} \alpha_t(i)\, \beta_{t+1}(j)\, b_j(O_{t+1})$

[Figure: α and β drawn as two T × N matrices, each annotated "Can get this in O(TN) time"]

Page 49: Fast Inference and Learning in Large-State-Space HMMs

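We want $a_{ij}^{new} = S_{ij} \big/ \sum_{k=1}^{N} S_{ik}$, where $S_{ij} = \sum_{t=1}^{T} \alpha_t(i)\, \beta_t(j)$

Here $\beta_t(j)$ is shorthand for $a_{ij}\, \beta_{t+1}(j)\, b_j(O_{t+1})$.

[Figure: α and β drawn as two T × N matrices, each annotated "Can get this in O(TN) time"]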
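For reference, the dense computation of S and the re-estimated transitions can be sketched in plain Python. Everything here is an illustrative baseline, not the fast method: the helper names, the initial distribution `pi`, and the tiny model are assumptions, and the probabilities are left unscaled, which is only safe for short sequences. The sum runs over the T-1 actual transitions, since β_{t+1} does not exist past the last timestep.

```python
from typing import List

Matrix = List[List[float]]


def forward(pi: List[float], A: Matrix, like: Matrix) -> Matrix:
    """alpha[t][i], unscaled. like[t][j] stands for b_j(O_t)."""
    T, N = len(like), len(pi)
    alpha = [[pi[i] * like[0][i] for i in range(N)]]
    for t in range(1, T):
        alpha.append([like[t][j] * sum(alpha[t - 1][i] * A[i][j] for i in range(N))
                      for j in range(N)])
    return alpha


def backward(A: Matrix, like: Matrix) -> Matrix:
    """beta[t][i], unscaled, with beta[T-1][i] = 1."""
    T, N = len(like), len(A)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * like[t + 1][j] * beta[t + 1][j] for j in range(N))
                   for i in range(N)]
    return beta


def transition_counts(A: Matrix, alpha: Matrix, beta: Matrix, like: Matrix) -> Matrix:
    """S[i][j] = a_ij * sum_t alpha_t(i) * beta_{t+1}(j) * b_j(O_{t+1}): O(T N^2)."""
    T, N = len(alpha), len(A)
    # shifted[t][j] = beta_{t+1}(j) * b_j(O_{t+1}); a_ij is factored out of the sum.
    shifted = [[beta[t + 1][j] * like[t + 1][j] for j in range(N)] for t in range(T - 1)]
    return [[A[i][j] * sum(alpha[t][i] * shifted[t][j] for t in range(T - 1))
             for j in range(N)] for i in range(N)]


def reestimate(S: Matrix) -> Matrix:
    """a_ij_new = S_ij / sum_k S_ik."""
    return [[s / sum(row) for s in row] for row in S]


# Tiny two-state example with made-up numbers.
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.8, 0.2], [0.3, 0.7]]
obs = [0, 1, 1, 0]
like = [[B[j][o] for j in range(2)] for o in obs]
alpha, beta = forward(pi, A, like), backward(A, like)
S = transition_counts(A, alpha, beta, like)
A_new = reestimate(S)
p_obs = sum(alpha[-1])  # P(O | model)
```

Each S[i][j] is a dot product between a column of alpha and a column of the shifted beta terms, so filling all N^2 entries costs O(TN^2).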

Page 58: Fast Inference and Learning in Large-State-Space HMMs

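We want $a_{ij}^{new} = S_{ij} \big/ \sum_{k=1}^{N} S_{ik}$, where $S_{ij} = \sum_{t=1}^{T} \alpha_t(i)\, \beta_t(j)$

[Figure: α and β as T × N matrices and S as an N × N matrix; S24 is the dot product of column 2 of α and column 4 of β]

Dot Product of Columns: $S = \alpha^{T} \beta$ costs O(TN^2)

Speedups:

• Strassen
• Approximate by DMC
• Approximate randomized $A^{T}B$
• Sparse structure fine
• Fixed DMC is fine
• Speedup without approximation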

Page 59: Fast Inference and Learning in Large-State-Space HMMs

We want $a_{ij}^{new} = S_{ij} \big/ \sum_{k=1}^{N} S_{ik}$, where $S_{ij} = \sum_{t=1}^{T} \alpha_t(i)\, \beta_t(j)$

[Figure: α and β as T × N matrices and S as an N × N matrix, with entry S24 highlighted]

• Insight One: only need the top K entries in each row of S
• Insight Two: Values in rows of α and β are often very skewed

Page 65: Fast Inference and Learning in Large-State-Space HMMs

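[Figure: α and β as T × N matrices, with the index sets α-biggies(i) and β-biggies(j) marked]

For i = 1..N, store indexes of the R largest values in the i’th column of α: α-biggies(i)

For j = 1..N, store indexes of the R largest values in the j’th column of β: β-biggies(j)

R << T

Takes O(TN) time to do all indexes

$S_{ij} = \sum_{t=1}^{T} \alpha_t(i)\, \beta_t(j)$

$= \sum_{t \in \alpha\text{-biggies}(i) \cup \beta\text{-biggies}(j)} \alpha_t(i)\, \beta_t(j) + \sum_{t \notin \alpha\text{-biggies}(i) \cup \beta\text{-biggies}(j)} \alpha_t(i)\, \beta_t(j)$

$\le \sum_{t \in \alpha\text{-biggies}(i) \cup \beta\text{-biggies}(j)} \alpha_t(i)\, \beta_t(j) + \sum_{t \notin \alpha\text{-biggies}(i) \cup \beta\text{-biggies}(j)} \alpha^{(R)}(i)\, \beta_t(j)$

Annotations on the bound:

• $\alpha^{(R)}(i)$ is the R’th largest value in the i’th column of α: O(1) time to obtain
• O(1) time to obtain (precached for all j in time O(TN))
• O(R) computation

There’s an important detail I’m omitting here to do with prescaling the rows of α and β.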
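The bounding idea can be sketched for a single pair of columns. This is an interpretation under stated assumptions, not the authors' code: the split is taken over the union of the two index sets, all values are non-negative, and the names (`biggies`, `bound_dot`) and the numbers are made up for the example. The lower bound simply drops the tail; the upper bound replaces every tail value of alpha by the R'th largest value in its column.

```python
import heapq
from typing import List, Set, Tuple


def biggies(col: List[float], R: int) -> Set[int]:
    """Indexes of the R largest values in one column."""
    return set(heapq.nlargest(R, range(len(col)), key=col.__getitem__))


def bound_dot(alpha_col: List[float], beta_col: List[float],
              a_big: Set[int], b_big: Set[int],
              alpha_R: float, beta_total: float) -> Tuple[float, float]:
    """Lower and upper bounds on sum_t alpha_t(i) * beta_t(j) in O(R) time.

    alpha_R    : R'th largest value of alpha_col (precomputed per column)
    beta_total : sum of beta_col over all t (precomputed per column)
    """
    union = a_big | b_big                                   # at most 2R timesteps
    head = sum(alpha_col[t] * beta_col[t] for t in union)   # exact part
    beta_tail = beta_total - sum(beta_col[t] for t in union)
    # Outside the union, alpha_t(i) <= alpha_R, so the tail is at most alpha_R * beta_tail.
    return head, head + alpha_R * beta_tail


# Made-up skewed columns with T = 6 and R = 2.
alpha_col = [0.90, 0.01, 0.02, 0.50, 0.03, 0.01]
beta_col = [0.02, 0.70, 0.01, 0.40, 0.05, 0.02]
R = 2
a_big, b_big = biggies(alpha_col, R), biggies(beta_col, R)
alpha_R = min(alpha_col[t] for t in a_big)
lower, upper = bound_dot(alpha_col, beta_col, a_big, b_big, alpha_R, sum(beta_col))
exact = sum(a * b for a, b in zip(alpha_col, beta_col))
```

The more skewed the columns are, the closer the two bounds sit, which is what allows most entries of a row of S to be ruled out without an exact dot product.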

Page 70: Fast Inference and Learning in Large-State-Space HMMs

Computing the i’th row of S…

[Figure: the N × N matrix S with row i highlighted, entries Sij for j = 1, 2, 3, …, N]

• In O(NR) time, we can put upper and lower bounds on Sij for j = 1, 2, …, N
• Only need exact values of Sij for the K largest values within the row
• Ignore j’s that can’t be the best
• Be exact for the rest: O(N) time each.

If there’s enough pruning, total time is O(TN+RN^2)

Page 71: Fast Inference and Learning in Large-State-Space HMMs

Evaluation and Inference Speedup

[Figure: speedup plot. Dataset: synthetic data with T=2000 time steps]

Page 72: Fast Inference and Learning in Large-State-Space HMMs

Parameter Learning Speedup

[Figure: speedup plot. Dataset: synthetic data with T=2000 time steps]

Page 73: Fast Inference and Learning in Large-State-Space HMMs

Performance Experiments

• DMC-friendly dataset:
  • 2-D Gaussian 20-state DMC HMM with K=5 (20,000 train, 5,000 test)
• Anti-DMC dataset:
  • 2-D Gaussian 20-state regular HMM with steadily varying, well-distributed transition probabilities (20,000 train, 5,000 test)
• Motionlogger dataset:
  • Accelerometer data from two sensors worn over several days (10,000 train, 4,720 test)
• Regular and DMC HMMs:
  • 20 states
• Small HMM:
  • 5-state regular HMM
• Uniform HMM:
  • 20-state HMM with uniform transition probabilities

Page 74: Fast Inference and Learning in Large-State-Space HMMs

Learning Curves for DMC-friendly data

Page 81: Fast Inference and Learning in Large-State-Space HMMs

Learning Curves for Anti-DMC data

Page 88: Fast Inference and Learning in Large-State-Space HMMs

Learning Curves for Motionlogger data

Page 95: Fast Inference and Learning in Large-State-Space HMMs

Tradeoffs between N and K

• We vary N and K while keeping the number of transition parameters (N×K) constant
• Increasing N and decreasing K allows more states for modeling data features but fewer parameters per state for temporal structure

Page 96: Fast Inference and Learning in Large-State-Space HMMs

Tradeoffs between N and K

• Average test-set log-likelihoods at convergence
• Datasets:
  • A: DMC-friendly
  • B: Anti-DMC
  • C: Motionlogger
• Each dataset has a different optimal N-vs-K tradeoff

Page 97: Fast Inference and Learning in Large-State-Space HMMs

Regularization with DMC HMMs

• # of transition parameters in regular 100-state HMM: 10,000
• # of transition parameters in DMC 100-state HMM with K=5: 500

Page 98: Fast Inference and Learning in Large-State-Space HMMs

Conclusions

• DMC HMMs are an important class of models that allow parameterized complexity-vs-efficiency tradeoffs in large state spaces
• The speedup can be several orders of magnitude
• Even for non-DMC domains, DMC HMMs yield higher scores than baseline models
• The DMC HMM model can be applied to arbitrary state spaces and observation densities

Page 99: Fast Inference and Learning in Large-State-Space HMMs

Related Work

• Felzenszwalb et al. (2003): fast HMM algorithms when transition probabilities can be expressed as distances in an underlying parameter space
• Murphy and Paskin (2002): fast inference in hierarchical HMMs cast as DBNs
• Salakhutdinov et al. (2003): combined EM and conjugate gradient for faster HMM learning when missing information amount is high
• Beam Search: widely used heuristic in word recognition for speech systems

Page 101: Fast Inference and Learning in Large-State-Space HMMs

Future Work

• Investigate DMC HMMs as regularization mechanism
• Eliminate R parameter using an automatic backoff evaluation approach
• Devise ways to automatically set K parameter, have per-row K parameters

The End
