Letter to Phoneme Alignment Using Graphical Models

N. Bolandzadeh, R. Rabbany
Dept of Computing Science, University of Alberta


Post on 18-Jan-2016


Page 1:

Letter to Phoneme Alignment

Using Graphical Models

N. Bolandzadeh, R. Rabbany

Dept of Computing Science, University of Alberta

Page 2:

Text to Speech Problem

Conversion of Text to Speech: TTS

◦ Automated Telecom Services
◦ E-mail by Phone
◦ Banking Systems
◦ Handicapped People

Page 3:

Pronunciation

Pronunciation of words
◦ Dictionary words
◦ Non-dictionary words

Phonetic analysis
◦ Dictionary lookup? Language is alive: new words are added, and proper nouns are often missing
◦ Machine learning: higher accuracy, but L2P alignment is needed

Page 4:

Problem

Letter to Phoneme Alignment

◦ Letter: c a k e

◦ Phoneme: k ei k

Related applications of L2P: Automatic Speech Recognition & Spelling Correction

Page 5:

It's not Trivial! Why?

No Consistency
◦ City / s /
◦ Cake / k /
◦ Kid / k /

No Transparency
◦ K i d (3) / k i d / (3)
◦ S i x (3) / s i k s / (4)
◦ Q u e u e (5) / k j u: / (3)
◦ A x e (3) / a k s / (3)
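The "No Transparency" examples can be checked mechanically. The sketch below is only an illustration: the phoneme lists are copied from the slide, and the name `length_gap` is hypothetical. It compares the number of letters with the number of phonemes for each word.

```python
# Examples copied from the slide. Phonemes are explicit lists because a
# phoneme such as "u:" is written with more than one character.
EXAMPLES = {
    "kid": ["k", "i", "d"],
    "six": ["s", "i", "k", "s"],
    "queue": ["k", "j", "u:"],
    "axe": ["a", "k", "s"],
}


def length_gap(word, phonemes):
    """Number of letters minus number of phonemes (hypothetical helper)."""
    return len(word) - len(phonemes)


for word, phonemes in EXAMPLES.items():
    print(word, len(word), len(phonemes), length_gap(word, phonemes))
```

Note that "axe" has a gap of zero even though its mapping is not one-to-one (x is pronounced as two phonemes, and the final e as none), so equal lengths alone do not make the alignment trivial.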

Page 6:

Framework

Pronunciation dictionary entries (word, phonemes):

Brick brIk
Brightening br2tHIN
British brItIS
Bronx brQNks
Bugle bjugP
Buoy b4

Aligned entries (letters, phonemes):

b|r|i|ck| b|r|I|k|
b|r|ig|ht|en|i|ng| b|r|2|t|H|I|N|
b|r|i|t|i|sh| b|r|I|t|I|S|
b|r|o|n|x| b|r|Q|N|ks|
b|u|g|le| b|ju|g|P|
bu|oy| b|4|
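The aligned notation on this slide can be read into paired chunks with a few lines of code. This is a minimal sketch, assuming each chunk is terminated by `|` as in the slide; the function name `parse_aligned` is hypothetical and not taken from the authors' system.

```python
def parse_aligned(letters, phonemes):
    """Split two '|'-terminated strings into (letter chunk, phoneme chunk) pairs."""
    letter_chunks = [chunk for chunk in letters.split("|") if chunk]
    phoneme_chunks = [chunk for chunk in phonemes.split("|") if chunk]
    if len(letter_chunks) != len(phoneme_chunks):
        raise ValueError("letter and phoneme sides have different chunk counts")
    return list(zip(letter_chunks, phoneme_chunks))


pairs = parse_aligned("b|r|i|ck|", "b|r|I|k|")
print(pairs)
```

Each pair is one sub-alignment, so the word "brick" is covered by four sub-alignments, the last of which maps the two letters "ck" to the single phoneme "k".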

Page 7:

Evaluation

No aligned dictionary

Unsupervised learning

Previously, the aligner was tied to a generator

Evaluation on the percentage of correctly predicted phonemes and words
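The two evaluation measures can be sketched as follows. The slide does not say how phonemes are matched when a prediction and its reference differ in length, so the position-by-position comparison below is an assumption, and the names `word_accuracy` and `phoneme_accuracy` are hypothetical.

```python
def word_accuracy(predicted, reference):
    """Fraction of words whose whole phoneme sequence is predicted correctly."""
    correct = sum(pred == ref for pred, ref in zip(predicted, reference))
    return correct / len(reference)


def phoneme_accuracy(predicted, reference):
    """Fraction of reference phonemes matched at the same position (assumed rule)."""
    correct = 0
    total = 0
    for pred, ref in zip(predicted, reference):
        total += len(ref)
        correct += sum(p == r for p, r in zip(pred, ref))
    return correct / total


# Toy data: the second prediction drops the final phoneme of "six".
predicted = [["k", "ei", "k"], ["s", "i", "k"]]
reference = [["k", "ei", "k"], ["s", "i", "k", "s"]]

print(word_accuracy(predicted, reference))
print(phoneme_accuracy(predicted, reference))
```

A single wrong phoneme makes the whole word count as wrong, which is why word-level scores are typically much lower than phoneme-level scores.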

Page 8:

Model of our problem

$L = l_1 l_2 \ldots l_n, \quad P = p_1 p_2 \ldots p_m$

$A_{best} = \arg\max_{A} P(A \mid L, P)$

$A = a_1 a_2 \ldots a_k$

$a_i : L_i \leftrightarrow P_i, \quad |L_i|, |P_i| \le 2$

Letters: B | r | i | t | i | sh |
Phonemes: B | r | I | t | I | S |
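To get a feel for the search space behind the argmax, the candidate alignments of a single word can be enumerated. This is a sketch under assumptions: every chunk holds one or two symbols on each side (no empty chunks), each phoneme is written as one character, and `enumerate_alignments` is a hypothetical name.

```python
def enumerate_alignments(letters, phonemes, max_len=2):
    """Yield every alignment as a list of (letter chunk, phoneme chunk) pairs."""
    if not letters and not phonemes:
        yield []
        return
    for n_letters in range(1, max_len + 1):
        for n_phonemes in range(1, max_len + 1):
            if n_letters > len(letters) or n_phonemes > len(phonemes):
                continue
            head = (letters[:n_letters], phonemes[:n_phonemes])
            rest_letters = letters[n_letters:]
            rest_phonemes = phonemes[n_phonemes:]
            for rest in enumerate_alignments(rest_letters, rest_phonemes, max_len):
                yield [head] + rest


candidates = list(enumerate_alignments("british", "brItIS"))
print(len(candidates))
```

The alignment shown on the slide is one of these candidates; the model's job is to assign it the highest probability.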

Page 9:

Static Model, Structure

Independent sub-alignments

$P(A) = \prod_{i=1}^{k} P(a_i \mid L_i, P_i)$

[Figure: graphical model with sub-alignment nodes a1, a2, ..., ak, each linked to its letter nodes (l1 l2, l3 l4, ..., ln-1 ln) and phoneme nodes (p1 p2, p3 p4, ..., pm-1 pm)]
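Because the sub-alignments are independent, the probability of a whole alignment is a product of per-chunk terms. The sketch below scores two candidate alignments of "brick" this way. The probability table is invented purely for illustration and does not come from the authors' experiments; `alignment_probability` is a hypothetical name.

```python
import math

# Invented toy parameters: probability of each (letter chunk, phoneme chunk) pair.
TOY_PROB = {
    ("b", "b"): 0.9,
    ("r", "r"): 0.9,
    ("i", "I"): 0.8,
    ("ck", "k"): 0.7,
    ("ic", "I"): 0.1,
    ("k", "k"): 0.5,
}


def alignment_probability(alignment, prob):
    """P(A) as the product of independent sub-alignment probabilities."""
    return math.prod(prob[pair] for pair in alignment)


slide_alignment = [("b", "b"), ("r", "r"), ("i", "I"), ("ck", "k")]
other_alignment = [("b", "b"), ("r", "r"), ("ic", "I"), ("k", "k")]

print(alignment_probability(slide_alignment, TOY_PROB))
print(alignment_probability(other_alignment, TOY_PROB))
```

With these toy numbers the slide's alignment scores higher than the alternative, which is the comparison the argmax performs over all candidates.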

Page 10:

Static Model, Learning

EM
◦ Initialize Parameters
◦ Expectation Step: Parameters → Alignments
◦ Maximization Step: Alignments → Parameters
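The loop above can be sketched as a hard (Viterbi-style) variant of EM, which may differ from the version the authors used. Assumptions in this sketch: chunks hold one or two symbols, each phoneme is one character, parameters start uniform, and all function names are hypothetical.

```python
import math
from collections import Counter

MAX_LEN = 2  # assumed chunk-size limit on both sides


def candidate_pairs(letters, phonemes):
    """Yield every (letter chunk, phoneme chunk) pair that could be used."""
    for i in range(len(letters)):
        for j in range(len(phonemes)):
            for di in range(1, MAX_LEN + 1):
                for dj in range(1, MAX_LEN + 1):
                    if i + di <= len(letters) and j + dj <= len(phonemes):
                        yield letters[i:i + di], phonemes[j:j + dj]


def viterbi_align(letters, phonemes, prob):
    """Best alignment under the current parameters, or None if none exists."""
    n, m = len(letters), len(phonemes)
    # best[(i, j)] = (log-probability, previous state) for letters[:i], phonemes[:j]
    best = {(0, 0): (0.0, None)}
    for i in range(n + 1):
        for j in range(m + 1):
            if (i, j) not in best:
                continue
            score = best[(i, j)][0]
            for di in range(1, MAX_LEN + 1):
                for dj in range(1, MAX_LEN + 1):
                    ni, nj = i + di, j + dj
                    if ni > n or nj > m:
                        continue
                    p = prob.get((letters[i:ni], phonemes[j:nj]), 0.0)
                    if p <= 0.0:
                        continue
                    cand = score + math.log(p)
                    if (ni, nj) not in best or cand > best[(ni, nj)][0]:
                        best[(ni, nj)] = (cand, (i, j))
    if (n, m) not in best:
        return None
    pairs, state = [], (n, m)
    while best[state][1] is not None:
        prev = best[state][1]
        pairs.append((letters[prev[0]:state[0]], phonemes[prev[1]:state[1]]))
        state = prev
    return pairs[::-1]


def hard_em(dictionary, iterations=5):
    """Alternate between aligning with the parameters and re-estimating them."""
    # Initialization: uniform probability over every candidate pair.
    pairs = {pair for word, phon in dictionary for pair in candidate_pairs(word, phon)}
    prob = {pair: 1.0 / len(pairs) for pair in pairs}
    alignments = []
    for _ in range(iterations):
        # Expectation step (hard): parameters -> one best alignment per entry.
        alignments = [viterbi_align(word, phon, prob) for word, phon in dictionary]
        # Maximization step: alignments -> parameters, by relative frequency.
        counts = Counter(pair for a in alignments if a for pair in a)
        total = sum(counts.values())
        prob = {pair: count / total for pair, count in counts.items()}
    return prob, alignments


toy_dictionary = [("brick", "brIk"), ("bronx", "brQNks"), ("buoy", "b4")]
prob, alignments = hard_em(toy_dictionary)
for alignment in alignments:
    print(alignment)
```

With uniform starting parameters the product favours alignments that use fewer chunks, so this toy version is meant only to show the parameters-to-alignments-to-parameters cycle, not to reproduce the alignments or the accuracy reported in the talk.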

Page 11:

Result of Static Model

Method         Letters   Words
Static Model   81.34%    43.5%

Page 12:

Dynamic Model

Sequence of data

Unrolled model for T=3 slices

[Figure: the model unrolled over three slices, with sub-alignment nodes a1, a2, ..., ak, each shown with its letter nodes (l1 l2, l3 l4, l5 l6) and phoneme nodes (p1 p2, p3 p4, p5 p6)]

Page 13:

Questions