
A PLSA-based Language Model for Conversational Telephone Speech

David Mrva and Philip C. Woodland

2004/12/08 邱炫盛

Outline

• Language Model
• PLSA Model
• Experimental Results
• Conclusion

Language Model

• The task of a language model is to calculate the probability $P(w_i \mid h_i)$ of a word $w_i$ given its history $h_i$.

• n-gram model
  – Range of dependencies is limited to n words
  – Information outside that window is ignored

$$P(w_i \mid h_i) \approx P(w_i \mid w_{i-n+1}, \ldots, w_{i-1})$$
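As an illustration of the limited n-gram history, here is a minimal sketch of a bigram model (n = 2) estimated by relative frequency. The function name `train_bigram` and the toy sentence are assumptions made for this example; no smoothing or back-off is applied, which a real model would need.

```python
from collections import Counter


def train_bigram(tokens):
    """Return a function prob(word, prev) with maximum-likelihood bigram estimates."""
    history_counts = Counter(tokens[:-1])                 # count(prev) as a history
    pair_counts = Counter(zip(tokens[:-1], tokens[1:]))   # count(prev, word)

    def prob(word, prev):
        if history_counts[prev] == 0:
            return 0.0  # unseen history; a real model would smooth or back off
        return pair_counts[(prev, word)] / history_counts[prev]

    return prob


tokens = "the cat sat on the mat".split()
p = train_bigram(tokens)
print(p("cat", "the"))  # "the" is a history twice, followed by "cat" once: 0.5
```

Only the previous word influences the estimate here, so anything said earlier in the conversation has no effect on the prediction.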

Language Model (cont.)

• Topic-based language model
  – Latent Semantic Analysis
  – Topic-based language model
  – PLSA-based language model

PLSA Model

• PLSA is a general machine learning technique for modeling the co-occurrences of events.

• Co-occurrence of words and documents

• Hidden variable = aspect

• PLSA in this paper is a mixture of unigram distributions.

PLSA Model (cont.)

Graphical Model Representation

• Without a hidden variable: d → w, with P(d) and P(w|d)
• With the hidden variable t: d → t → w, with P(d), P(t|d) and P(w|t)

PLSA Model (cont.)

[Figure: document d_i is linked to the aspects z_1, z_2, ..., z_k with weights P(z_1|d_i), P(z_2|d_i), ..., P(z_k|d_i); each aspect is linked to the words w_1, w_2, w_3, ..., w_j with probabilities P(w_j|z_1), P(w_j|z_2), ..., P(w_j|z_k).]

PLSA Model (cont.)

$$p(w_j \mid d_i) = p(z_1 \mid d_i)\,p(w_j \mid z_1) + p(z_2 \mid d_i)\,p(w_j \mid z_2) + \cdots + p(z_K \mid d_i)\,p(w_j \mid z_K) = \sum_{k=1}^{K} p(z_k \mid d_i)\,p(w_j \mid z_k)$$

$$\log L = \log \prod_{i=1}^{N} \prod_{j=1}^{M} p(w_j \mid d_i)^{n(w_j, d_i)} = \sum_{i=1}^{N} \sum_{j=1}^{M} n(w_j, d_i) \log \sum_{k=1}^{K} p(z_k \mid d_i)\,p(w_j \mid z_k)$$

n(w_j, d_i): number of occurrences of word w_j in document d_i

M: number of words in vocabulary

N: number of documents in training collection

K: number of aspects or topics
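The log-likelihood of the training collection under the mixture model can be evaluated directly from the count matrix. The sketch below assumes dense nested lists and hypothetical names (`counts`, `p_z_given_d`, `p_w_given_z`); the toy numbers are made up for illustration only.

```python
import math


def plsa_log_likelihood(counts, p_z_given_d, p_w_given_z):
    """Sum over documents i and words j of n(w_j, d_i) * log p(w_j | d_i).

    counts[i][j]      = n(w_j, d_i)
    p_z_given_d[i][k] = p(z_k | d_i)
    p_w_given_z[k][j] = p(w_j | z_k)
    """
    n_topics = len(p_w_given_z)
    total = 0.0
    for i, doc_counts in enumerate(counts):
        for j, n in enumerate(doc_counts):
            if n == 0:
                continue  # words that do not occur contribute nothing
            # p(w_j | d_i) as a mixture of the K aspect unigram distributions
            p_w_d = sum(p_z_given_d[i][k] * p_w_given_z[k][j] for k in range(n_topics))
            total += n * math.log(p_w_d)
    return total


# Toy example: N = 2 documents, M = 2 words, K = 2 aspects.
counts = [[2, 0], [0, 1]]
p_z_given_d = [[1.0, 0.0], [0.0, 1.0]]
p_w_given_z = [[0.5, 0.5], [0.25, 0.75]]
log_l = plsa_log_likelihood(counts, p_z_given_d, p_w_given_z)
print(log_l)
```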

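PLSA Model (cont.)

Step-E

$$p(w_j \mid d_i) = \sum_{k=1}^{K} p(w_j \mid z_k)\,p(z_k \mid d_i)$$

$$p(w_j, z_k \mid d_i) = p(z_k \mid w_j, d_i)\,p(w_j \mid d_i) \;\Rightarrow\; \log p(w_j \mid d_i) = \log p(w_j, z_k \mid d_i) - \log p(z_k \mid w_j, d_i)$$

Taking the expectation over z under the current posterior $\hat{p}(z_k \mid w_j, d_i)$, with $\tilde{p}$ denoting the new parameters:

$$E_{z \mid w_j, d_i}\left[\log \tilde{p}(w_j \mid d_i)\right] = E_{z \mid w_j, d_i}\left[\log \tilde{p}(w_j, z_k \mid d_i)\right] - E_{z \mid w_j, d_i}\left[\log \tilde{p}(z_k \mid w_j, d_i)\right]$$

$$\log \tilde{p}(w_j \mid d_i) = \sum_{k=1}^{K} \hat{p}(z_k \mid w_j, d_i) \log \tilde{p}(w_j, z_k \mid d_i) - \sum_{k=1}^{K} \hat{p}(z_k \mid w_j, d_i) \log \tilde{p}(z_k \mid w_j, d_i) = \tilde{Q} + \tilde{H}$$

where $\tilde{Q} = \sum_{k} \hat{p}(z_k \mid w_j, d_i) \log \tilde{p}(w_j, z_k \mid d_i)$ and $\tilde{H} = -\sum_{k} \hat{p}(z_k \mid w_j, d_i) \log \tilde{p}(z_k \mid w_j, d_i)$.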

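PLSA Model (cont.)

Let $\hat{p}$ denote the current estimates and $\tilde{p}$ the new ones, with $\tilde{Q} = \sum_{k} \hat{p}(z_k \mid w_j, d_i) \log \tilde{p}(w_j, z_k \mid d_i)$ and $\tilde{H} = -\sum_{k} \hat{p}(z_k \mid w_j, d_i) \log \tilde{p}(z_k \mid w_j, d_i)$; $\hat{Q}$ and $\hat{H}$ are the same quantities evaluated at the current estimates. Using $\log x \le x - 1$:

$$\tilde{H} - \hat{H} = -\sum_{k=1}^{K} \hat{p}(z_k \mid w_j, d_i) \log \tilde{p}(z_k \mid w_j, d_i) + \sum_{k=1}^{K} \hat{p}(z_k \mid w_j, d_i) \log \hat{p}(z_k \mid w_j, d_i)$$

$$= -\sum_{k=1}^{K} \hat{p}(z_k \mid w_j, d_i) \log \frac{\tilde{p}(z_k \mid w_j, d_i)}{\hat{p}(z_k \mid w_j, d_i)} \ge \sum_{k=1}^{K} \hat{p}(z_k \mid w_j, d_i) \left(1 - \frac{\tilde{p}(z_k \mid w_j, d_i)}{\hat{p}(z_k \mid w_j, d_i)}\right) = \sum_{k=1}^{K} \hat{p}(z_k \mid w_j, d_i) - \sum_{k=1}^{K} \tilde{p}(z_k \mid w_j, d_i) = 0$$

Therefore $\tilde{H} - \hat{H} \ge 0$, and

$$\log \tilde{p}(w_j \mid d_i) - \log \hat{p}(w_j \mid d_i) = (\tilde{Q} - \hat{Q}) + (\tilde{H} - \hat{H}) \ge \tilde{Q} - \hat{Q}$$

so new parameters that do not decrease Q do not decrease the likelihood.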

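Maximizing $\log \tilde{p}(w_j \mid d_i)$ is achieved by maximizing

$$\max \sum_{k=1}^{K} \hat{p}(z_k \mid w_j, d_i) \log \tilde{p}(w_j, z_k \mid d_i) = \max \sum_{k=1}^{K} \hat{p}(z_k \mid w_j, d_i) \log \left[\tilde{p}(w_j \mid z_k)\,\tilde{p}(z_k \mid d_i)\right]$$

(w_j and d_i are conditionally independent given z_k), where the posterior is computed from the current estimates:

$$\hat{p}(z_k \mid w_j, d_i) = \frac{\hat{p}(w_j, z_k \mid d_i)}{\hat{p}(w_j \mid d_i)} = \frac{\hat{p}(w_j \mid z_k)\,\hat{p}(z_k \mid d_i)}{\sum_{k'=1}^{K} \hat{p}(w_j \mid z_{k'})\,\hat{p}(z_{k'} \mid d_i)}$$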
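The E-step quantity is the posterior of each aspect given a word and a document, computed from the current parameter estimates. A minimal sketch, assuming nested lists and a hypothetical function name `e_step_posterior`:

```python
def e_step_posterior(j, i, p_w_given_z, p_z_given_d):
    """Return [p(z_k | w_j, d_i) for k in 0..K-1] from the current estimates.

    p_w_given_z[k][j] = p(w_j | z_k)
    p_z_given_d[i][k] = p(z_k | d_i)
    """
    n_topics = len(p_w_given_z)
    # Joint p(w_j, z_k | d_i) = p(w_j | z_k) * p(z_k | d_i), by conditional independence.
    joint = [p_w_given_z[k][j] * p_z_given_d[i][k] for k in range(n_topics)]
    norm = sum(joint)  # equals p(w_j | d_i)
    if norm == 0.0:
        raise ValueError("word has zero probability in this document")
    return [x / norm for x in joint]


# Toy example with K = 2 aspects and M = 2 words.
p_w_given_z = [[0.8, 0.2], [0.2, 0.8]]
p_z_given_d = [[0.5, 0.5]]
posterior = e_step_posterior(0, 0, p_w_given_z, p_z_given_d)
print(posterior)
```

With equal aspect weights, the posterior simply follows how strongly each aspect prefers the word.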

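PLSA Model (cont.)

Step-M

$$E[L_C] = \sum_{i=1}^{N} \sum_{j=1}^{M} n(w_j, d_i) \sum_{k=1}^{K} \hat{p}(z_k \mid w_j, d_i) \log \left[p(w_j \mid z_k)\,p(z_k \mid d_i)\right]$$

subject to $\sum_{j} p(w_j \mid z_k) = 1$ and $\sum_{k} p(z_k \mid d_i) = 1$. With Lagrange multipliers $\tau_k$ and $\rho_i$:

$$\Lambda = E[L_C] + \sum_{k} \tau_k \left(1 - \sum_{j} p(w_j \mid z_k)\right) + \sum_{i} \rho_i \left(1 - \sum_{k} p(z_k \mid d_i)\right)$$

The terms that depend on each parameter set are

$$\Lambda_{P(w_j \mid z_k)} = \sum_{i=1}^{N} \sum_{j=1}^{M} n(w_j, d_i)\,\hat{p}(z_k \mid w_j, d_i) \log p(w_j \mid z_k) + \tau_k \left(1 - \sum_{j} p(w_j \mid z_k)\right)$$

$$\Lambda_{P(z_k \mid d_i)} = \sum_{j=1}^{M} \sum_{k=1}^{K} n(w_j, d_i)\,\hat{p}(z_k \mid w_j, d_i) \log p(z_k \mid d_i) + \rho_i \left(1 - \sum_{k} p(z_k \mid d_i)\right)$$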

PLSA Model (cont.)

Differentiating with respect to each parameter and setting the result to zero gives:

$$P(w_j \mid z_k) = \frac{\sum_{i=1}^{N} n(w_j, d_i)\,\hat{p}(z_k \mid w_j, d_i)}{\sum_{j'=1}^{M} \sum_{i=1}^{N} n(w_{j'}, d_i)\,\hat{p}(z_k \mid w_{j'}, d_i)}$$

$$P(z_k \mid d_i) = \frac{\sum_{j=1}^{M} n(w_j, d_i)\,\hat{p}(z_k \mid w_j, d_i)}{\sum_{k'=1}^{K} \sum_{j=1}^{M} n(w_j, d_i)\,\hat{p}(z_{k'} \mid w_j, d_i)} = \frac{\sum_{j=1}^{M} n(w_j, d_i)\,\hat{p}(z_k \mid w_j, d_i)}{\sum_{j=1}^{M} n(w_j, d_i)} = \frac{\sum_{j=1}^{M} n(w_j, d_i)\,\hat{p}(z_k \mid w_j, d_i)}{n(d_i)}$$
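Putting the E-step and M-step together, one EM iteration accumulates expected counts n(w_j, d_i) p(z_k | w_j, d_i) and renormalizes them. The following is a minimal sketch on dense nested lists, not the authors' implementation; the function name `em_iteration` and the toy data are assumptions.

```python
def em_iteration(counts, p_w_given_z, p_z_given_d):
    """One EM iteration for PLSA.

    counts[i][j]      = n(w_j, d_i)
    p_w_given_z[k][j] = p(w_j | z_k)
    p_z_given_d[i][k] = p(z_k | d_i)
    Returns the re-estimated (p_w_given_z, p_z_given_d).
    """
    n_docs, n_words, n_topics = len(counts), len(counts[0]), len(p_w_given_z)
    wz_num = [[0.0] * n_words for _ in range(n_topics)]
    zd_num = [[0.0] * n_topics for _ in range(n_docs)]

    for i in range(n_docs):
        for j in range(n_words):
            n = counts[i][j]
            if n == 0:
                continue
            # E-step: posterior p(z_k | w_j, d_i) from the current estimates.
            joint = [p_w_given_z[k][j] * p_z_given_d[i][k] for k in range(n_topics)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # word impossible under the current model; skip it
            for k in range(n_topics):
                expected = n * joint[k] / norm
                wz_num[k][j] += expected   # numerator of P(w_j | z_k)
                zd_num[i][k] += expected   # numerator of P(z_k | d_i)

    # M-step: normalize the expected counts; keep the old row if it got no mass.
    new_wz = []
    for k in range(n_topics):
        total = sum(wz_num[k])
        new_wz.append([x / total for x in wz_num[k]] if total > 0 else list(p_w_given_z[k]))
    new_zd = []
    for i in range(n_docs):
        total = sum(zd_num[i])
        new_zd.append([x / total for x in zd_num[i]] if total > 0 else list(p_z_given_d[i]))
    return new_wz, new_zd


counts = [[4, 0], [0, 4]]
p_w_given_z = [[0.6, 0.4], [0.4, 0.6]]
p_z_given_d = [[0.5, 0.5], [0.5, 0.5]]
new_wz, new_zd = em_iteration(counts, p_w_given_z, p_z_given_d)
print(new_wz, new_zd)
```

In practice the iteration is repeated many times from a random initialization, and sparse count storage would replace the dense lists used here.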

PLSA Model (cont.)

$$p(z_k \mid h_i) = \frac{1}{i+1} \cdot \frac{p(w_i \mid z_k)\,p(z_k \mid h_{i-1})}{\sum_{q=1}^{K} p(w_i \mid z_q)\,p(z_q \mid h_{i-1})} + \frac{i}{i+1}\,p(z_k \mid h_{i-1})$$

initialized with

$$p(z_k \mid h_1) = p(z_k) = \frac{\sum_{w,d} n(w, d)\,p(z_k \mid d)}{\sum_{w,d} n(w, d)}$$

Use PLSA in a language model:
• P(z_k|d_i) are used as mixture weights when calculating the word probability.
• The history h_i is used instead of d_i to re-estimate these weights on the test set.
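The re-estimation of the mixture weights on the test set can be sketched as an online update: the posterior over aspects for the newest history word is blended into the running weights with weight 1/(i+1). This assumes the update has the usual online-EM form, with `i` the 1-based position of the word; the function names are hypothetical.

```python
def plsa_word_prob(word_probs, weights):
    """p(w | h) = sum_k p(w | z_k) * p(z_k | h)."""
    return sum(pw * pz for pw, pz in zip(word_probs, weights))


def update_topic_weights(weights, word_probs, i):
    """Blend the aspect posterior of the i-th history word into the weights.

    weights[k]    = p(z_k | h_{i-1}), the weights before seeing word i
    word_probs[k] = p(w_i | z_k)
    """
    joint = [pw * pz for pw, pz in zip(word_probs, weights)]
    norm = sum(joint)
    if norm == 0.0:
        return list(weights)  # word impossible under every aspect; leave weights alone
    return [
        (1.0 / (i + 1)) * (jt / norm) + (i / (i + 1)) * pz
        for jt, pz in zip(joint, weights)
    ]


prior = [0.5, 0.5]          # stands in for the prior p(z_k)
word_probs = [0.9, 0.1]     # p(w_1 | z_k) for a word strongly tied to aspect 0
updated = update_topic_weights(prior, word_probs, i=1)
print(updated, plsa_word_prob(word_probs, updated))
```

The two blending coefficients sum to one, so the updated weights remain a probability distribution.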

PLSA Model (cont.)

Because of recognition errors in the history, and because not enough information about the topic of the document is available to the model, the re-estimation is modified:

$$p(z_k \mid h_i) = \frac{1}{i+b} \cdot \frac{\left(p(w_i \mid z_k)\,p(z_k \mid h_{i-1})\right)^{cs(i)}}{\sum_{q=1}^{K} \left(p(w_i \mid z_q)\,p(z_q \mid h_{i-1})\right)^{cs(i)}} + \frac{i-1+b}{i+b}\,p(z_k \mid h_{i-1})$$

$$p(w_i \mid h_i) = \sum_{k=1}^{K} p(w_i \mid z_k)\,p(z_k \mid h_i)$$

cs(i): the confidence score of the i-th word
b: the weight of the prior topic distribution
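A sketch of the modified update, under the assumption that the confidence score cs(i) acts as an exponent on the joint term and that b replaces the constant 1 in the blending weights (so b = 1 and cs(i) = 1 give back the unmodified update). The function name and the toy numbers are hypothetical.

```python
def update_topic_weights_cs(weights, word_probs, i, b, cs):
    """Confidence-weighted update of p(z_k | h) with prior weight b.

    weights[k]    = p(z_k | h_{i-1})
    word_probs[k] = p(w_i | z_k)
    cs            = confidence score of the i-th word (assumed to lie in [0, 1])
    """
    powered = [(pw * pz) ** cs for pw, pz in zip(word_probs, weights)]
    norm = sum(powered)
    if norm == 0.0:
        return list(weights)
    return [
        (1.0 / (i + b)) * (x / norm) + ((i - 1 + b) / (i + b)) * pz
        for x, pz in zip(powered, weights)
    ]


prior = [0.5, 0.5]
word_probs = [0.9, 0.1]
confident = update_topic_weights_cs(prior, word_probs, i=1, b=10, cs=1.0)
ignored = update_topic_weights_cs(prior, word_probs, i=1, b=10, cs=0.0)
print(confident, ignored)
```

A larger b keeps the weights closer to the prior topic distribution early in the document, and a low confidence score flattens the contribution of a word that was probably misrecognized.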

PLSA Model (cont.)

• PLSA accounts for the whole document history of a word, irrespective of the document length.
• PLSA has no means of representing word order, because it is a mixture of unigram distributions.

Combine n-gram with PLSA:

$$P(w_i \mid h_i) = \frac{P_{n\text{-gram}}(w_i \mid h_i)\,P_{PLSA}(w_i \mid h_i)}{P_{unigram}(w_i)}$$

When PLSA is used in decoding, a Viterbi-based decoder is not suitable. Two-pass decoder:
• First pass:
  – n-gram, output a confidence score
• Second pass:
  – PLSA, rescoring the lattices
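The combination of the n-gram and PLSA probabilities, divided by the unigram probability, can be sketched as below. The final renormalization over the vocabulary is an assumption added here so that the result sums to one; the slide does not show it. The function name and the toy distributions are hypothetical.

```python
def combine_ngram_plsa(p_ngram, p_plsa, p_unigram):
    """Combine per-word probabilities for one position.

    Each argument maps word -> probability given the current history
    (p_unigram has no history). Returns a renormalized distribution.
    """
    scores = {}
    for word, p_n in p_ngram.items():
        p_u = p_unigram.get(word, 0.0)
        if p_u == 0.0:
            continue  # cannot divide by a zero unigram probability
        scores[word] = p_n * p_plsa.get(word, 0.0) / p_u
    total = sum(scores.values())
    if total == 0.0:
        return dict(p_ngram)  # fall back to the n-gram model alone
    # Assumption: renormalize so the combined scores form a distribution.
    return {word: s / total for word, s in scores.items()}


p_ngram = {"a": 0.5, "b": 0.5}
p_plsa = {"a": 0.8, "b": 0.2}
p_unigram = {"a": 0.5, "b": 0.5}
combined = combine_ngram_plsa(p_ngram, p_plsa, p_unigram)
print(combined)
```

Dividing by the unigram probability avoids counting the word's overall frequency twice, since both component models already include it.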

PLSA Model (cont.)

• During re-scoring, the PLSA history comprises all segments in a document except the current segment.

• The PLSA history is fixed for all words in a given segment.

• The "history" is therefore referred to as the "context" (ctx). It contains both past and future words.
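Building the context for a segment amounts to concatenating every other segment of the same document. A minimal sketch, assuming a document is a list of segments and each segment is a list of words; the function name `plsa_context` is hypothetical.

```python
def plsa_context(segments, current):
    """Return the words of all segments except segments[current], in order."""
    return [
        word
        for index, segment in enumerate(segments)
        if index != current
        for word in segment
    ]


document = [["hello", "there"], ["how", "are", "you"], ["fine", "thanks"]]
ctx = plsa_context(document, current=1)
print(ctx)  # ['hello', 'there', 'fine', 'thanks']
```

Because the context excludes the segment being scored, it stays the same for every word in that segment and includes words that come later in the document.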

Experimental Results

Two Test Sets
• NIST’s Hub5 speech-to-text evaluation 2002 (eval02)
  – Switchboard I and II
  – 62k words, 19k from Switchboard I
• NIST’s Rich Transcription Spring 2003 CTS speech-to-text evaluation (eval03)
  – Switchboard II phase 5 and Fisher
  – 74k words, 36k from Fisher


Experimental Results (cont.)

• The perplexity reduction is greater if PLSA’s training text relates to the test set.

• PP of (ref.ctx, 10) < PP of (rec.ctx, 10)

• b=10 is the best value

• Use of confidence scores makes the PLSA model less sensitive to b
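PP in the bullets above is perplexity. As a reminder of how it is obtained from per-word model probabilities, here is a minimal sketch; the function name and the toy probabilities are made up and are not results from the paper.

```python
import math


def perplexity(word_probs):
    """Perplexity = exp of the negative mean log-probability of the test words."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


# A model that assigns probability 1/4 to every test word has perplexity 4.
pp = perplexity([0.25, 0.25, 0.25, 0.25])
print(pp)
```

Lower perplexity means the model assigns higher probability to the test text.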


Experimental Results (cont.)

• Baseline: n-gram trained on 20M words of Fisher transcripts. Increased to 500 classes
• PLSA: 750 aspects, 100 EM iterations
• eval03 is separated into eval03dev and eval03tst
  – Interpolation weights of the word and class-based n-grams were set to minimize perplexity.
  – A slight improvement when side-based documents were used.

Experimental Results (cont.)

• b=100 is the best value
  – The PLSA model needs much more data to estimate the topic of Fisher than of SwbI

• Having a long context is very important.


Conclusion

• PLSA with the suggested modifications in a language model reduces perplexity.

• Future work:
  – Re-score lattices to calculate WERs
  – Combine semantics-oriented model with syntax-based language model