
A formant-trajectory model and its usage in comparing coarticulatory effects in dysarthric and normal speech

Xiaochuan Niu and Jan P. H. van Santen

Center for Spoken Language Understanding, OGI School of Science and Engineering at Oregon Health & Science University, USA

MAVEBA 2003 Florence, Italy December 10-12, 2003

What is Dysarthria?

• Group of speech disorders
  – Weakness / incoordination of speech muscles
  – Result of damage to the brain or nerves

• Results in unintelligible speech

Long Term Project Goal

• Long term goal: Speech transformation
  – Device that works in real time
  – Not:
    • Amplifier, spectral filter
  – But:
    • Correct for dynamic articulatory problems
    • Based on a dynamic model of coarticulation

• Today’s talk:
  – Test (very simple) model of vowel dynamics

Observation: Vowel Formants

• Median Formants in Vowel Centers

[pics]

Framework

• Formant trajectories
  – (Linear or non-linear) interpolation
  – between vowel targets

• Three mechanisms for vowel triangle data:
  1. More coarticulation (interpolation too smooth)
  2. More random variability
  3. Incorrect targets

Mechanism 1: Coarticulation

• Average formants of any given vowel …
  – … more strongly dependent on …
  – … the average of the virtual formants …
  – … of the surrounding consonants

Mechanism 2: Random Variability

• Average formants of any given vowel …
  – … result of broad distributions that are …
  – … skewed by the boundaries of vowel space
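The skewing effect named on this slide can be illustrated with a small simulation. This is a minimal sketch and not part of the talk: the target value, the standard deviations, and the vowel-space bounds are hypothetical, and hard clipping is only a crude stand-in for whatever actually limits the vowel space. It shows that when a target sits near a boundary, a broad distribution ends up with a mean pulled toward the interior, while a narrow one barely moves.

```python
import numpy as np


def mean_after_clipping(target_hz, sd_hz, low_hz, high_hz, n=100_000, seed=0):
    """Mean of Gaussian samples around a target after clipping to [low_hz, high_hz].

    All arguments are hypothetical illustration values; clipping stands in
    for the boundaries of the vowel space.
    """
    rng = np.random.default_rng(seed)
    samples = rng.normal(target_hz, sd_hz, size=n)
    return float(np.clip(samples, low_hz, high_hz).mean())


# Hypothetical F1 target close to the upper edge of the F1 range.
narrow_mean = mean_after_clipping(750.0, 20.0, 250.0, 800.0)
broad_mean = mean_after_clipping(750.0, 150.0, 250.0, 800.0)
print(f"narrow: {narrow_mean:.1f} Hz, broad: {broad_mean:.1f} Hz")
```

Under these assumptions the broad distribution's mean falls noticeably below the 750 Hz target, which mimics a shrunken vowel triangle without any change in the target itself.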

Mechanism 3: Incorrect Targets

• Average formants of any given vowel …
  – … result of a tendency …
  – … to move articulators in the wrong direction

Linear Coarticulation Model

$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$

• $F(t \mid p, v, n)$: observed formant vector (3x1)
  – t: Time
  – p: Preceding consonant
  – v: Vowel
  – n: Next consonant
• $A_{pt}$, $B_{nt}$: weight matrices (3x3); $I$: identity matrix (3x3)
• $F_p$, $F_n$, $F_v$: target formants (3x1)

Based on earlier work by Broad, Oehman, Lindblom, Schouten, Pols, Stevens, …
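The model equation can be evaluated directly once targets and weights are given. The sketch below is an illustration only: the function name, the formant values in Hz, and the weight values are hypothetical and do not come from the talk.

```python
import numpy as np


def predicted_formants(A_pt, B_nt, F_p, F_n, F_v):
    """Evaluate F(t|p,v,n) = A_pt F_p + B_nt F_n + (I - A_pt - B_nt) F_v.

    A_pt and B_nt are 3x3 weight matrices; F_p, F_n, F_v are length-3
    target formant vectors (F1, F2, F3).
    """
    identity = np.eye(3)
    return A_pt @ F_p + B_nt @ F_n + (identity - A_pt - B_nt) @ F_v


# Hypothetical targets in Hz and hypothetical weights, for illustration only.
F_p = np.array([300.0, 1800.0, 2600.0])  # preceding consonant target
F_n = np.array([350.0, 1200.0, 2400.0])  # next consonant target
F_v = np.array([700.0, 1100.0, 2500.0])  # vowel target
A_pt = 0.2 * np.eye(3)
B_nt = 0.1 * np.eye(3)

F_obs = predicted_formants(A_pt, B_nt, F_p, F_n, F_v)
print(F_obs)
```

With zero weights the prediction equals the vowel target; larger weights pull the predicted formants toward the consonant targets, which is how the model expresses stronger coarticulation.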

How to use for transformation?

$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$

implies

$$F_v \overset{\text{est}}{=} (I - A_{pt} - B_{nt})^{-1} \left( F(t \mid p, v, n) - A_{pt} F_p - B_{nt} F_n \right)$$

• $F(t \mid p, v, n)$: observed
• $p$, $n$: partial consonant recognition
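The inversion on this slide can be sketched numerically as follows. The function name and all numeric values are hypothetical; the sketch assumes the consonant targets and weights are already known and that $I - A_{pt} - B_{nt}$ is invertible. It solves the linear system rather than forming the inverse explicitly, which gives the same result.

```python
import numpy as np


def estimate_vowel_target(F_obs, A_pt, B_nt, F_p, F_n):
    """Estimate F_v = (I - A_pt - B_nt)^(-1) (F_obs - A_pt F_p - B_nt F_n).

    Raises numpy.linalg.LinAlgError if (I - A_pt - B_nt) is singular.
    """
    M = np.eye(3) - A_pt - B_nt
    return np.linalg.solve(M, F_obs - A_pt @ F_p - B_nt @ F_n)


# Hypothetical values in Hz, for illustration only.
F_p = np.array([300.0, 1800.0, 2600.0])
F_n = np.array([350.0, 1200.0, 2400.0])
A_pt = 0.2 * np.eye(3)
B_nt = 0.1 * np.eye(3)
F_obs = np.array([585.0, 1250.0, 2510.0])  # coarticulated observation

F_v_est = estimate_vowel_target(F_obs, A_pt, B_nt, F_p, F_n)
print(F_v_est)
```

As the weights grow, $I - A_{pt} - B_{nt}$ approaches a singular matrix and the estimate becomes more sensitive to errors in the observed formants.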

Application I

$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$

• Model $F(t \mid p, v, n)$ at vowel midpoints
• Each <pvn> token may have different values of $A_{pt}$ and $B_{nt}$
• No assumptions about dependency of weights on time
• But: assume synchronicity for formant changes:

$$A_{pt} = \begin{bmatrix} a_{pt} & 0 & 0 \\ 0 & a_{pt} & 0 \\ 0 & 0 & a_{pt} \end{bmatrix}, \qquad B_{nt} = \begin{bmatrix} b_{nt} & 0 & 0 \\ 0 & b_{nt} & 0 \\ 0 & 0 & b_{nt} \end{bmatrix}$$
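Under the synchronicity assumption each weight matrix is a scalar times the identity, so the model for one token reduces to $F - F_v = a\,(F_p - F_v) + b\,(F_n - F_v)$: three equations in two unknowns. The sketch below fits the two scalars for a single token by ordinary least squares. It assumes the targets are already known, which the talk does not state, and the function name and numbers are hypothetical; it is not necessarily the estimation procedure the authors used.

```python
import numpy as np


def fit_token_weights(F_obs, F_p, F_n, F_v):
    """Least-squares fit of scalar weights (a, b) for one <pvn> token.

    Solves F_obs - F_v = a (F_p - F_v) + b (F_n - F_v) over the three formants.
    """
    X = np.column_stack([F_p - F_v, F_n - F_v])  # shape (3, 2)
    y = F_obs - F_v
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(coef[0]), float(coef[1])


# Hypothetical values in Hz, for illustration only.
F_p = np.array([300.0, 1800.0, 2600.0])
F_n = np.array([350.0, 1200.0, 2400.0])
F_v = np.array([700.0, 1100.0, 2500.0])
F_obs = np.array([585.0, 1250.0, 2510.0])

a_hat, b_hat = fit_token_weights(F_obs, F_p, F_n, F_v)
print(a_hat, b_hat)
```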

Application I: Targets [jj/ll]

Application I: Targets [00/09]

Application I: Weights [jj/ll]

Application I: Weights [00/09]

Application II

$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$

• Model $F(t \mid p, v, n)$ at vowel midpoints
• $A_{pt}$ and $B_{nt}$ same for all <pvn> tokens
• Assumptions are made about dependency of weights on time
• But: no synchronicity for formant changes:

$$A_{pt} = \begin{bmatrix} a_{pt} & 0 & 0 \\ 0 & a'_{pt} & 0 \\ 0 & 0 & a''_{pt} \end{bmatrix}, \qquad B_{nt} = \begin{bmatrix} b_{nt} & 0 & 0 \\ 0 & b'_{nt} & 0 \\ 0 & 0 & b''_{nt} \end{bmatrix}$$
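With diagonal weight matrices that are shared across tokens but differ per formant, each formant can be fitted separately by pooling all tokens. The sketch below does this with ordinary least squares on synthetic, noise-free data. The targets are assumed known, the data and the "true" weights are made up for the demonstration, and the function name is hypothetical; the talk does not describe the authors' actual fitting procedure.

```python
import numpy as np


def fit_shared_diagonal_weights(F_obs, F_p, F_n, F_v):
    """Fit per-formant weights shared by all tokens.

    All inputs have shape (n_tokens, 3). For each formant k, solves
    F_obs[:, k] - F_v[:, k] = a[k] (F_p[:, k] - F_v[:, k]) + b[k] (F_n[:, k] - F_v[:, k])
    in the least-squares sense. Returns (a, b), each of shape (3,).
    """
    a = np.empty(3)
    b = np.empty(3)
    for k in range(3):
        X = np.column_stack([F_p[:, k] - F_v[:, k], F_n[:, k] - F_v[:, k]])
        y = F_obs[:, k] - F_v[:, k]
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        a[k], b[k] = coef
    return a, b


# Synthetic, noise-free tokens (values are arbitrary, not real formant data).
rng = np.random.default_rng(0)
n_tokens = 20
F_p = rng.uniform(200.0, 3000.0, size=(n_tokens, 3))
F_n = rng.uniform(200.0, 3000.0, size=(n_tokens, 3))
F_v = rng.uniform(200.0, 3000.0, size=(n_tokens, 3))
a_true = np.array([0.30, 0.20, 0.10])
b_true = np.array([0.10, 0.15, 0.05])
F_obs = a_true * F_p + b_true * F_n + (1.0 - a_true - b_true) * F_v

a_hat, b_hat = fit_shared_diagonal_weights(F_obs, F_p, F_n, F_v)
print(a_hat, b_hat)
```

Because the three diagonal entries are fitted independently, the formants are free to change at different rates, which is what dropping the synchronicity assumption allows.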

Application I: Targets [00/09]

Application II: Weights [00/09]

Conclusions

• Proposed linear model of vowel dynamics
  – To be used for formant “correction”

• When used as analytic instrument
  – Gave meaningful results

• Strikingly “normal” target values
  – Without any normalizing bias in the estimation process

• Clear evidence for enhanced coarticulation
