“Connecting the dots”: How do articulatory processes “map” onto acoustic processes?

Stevens and House (1955): Model assumes

No coupling with the nasal cavity

No coupling with the trachea & pulmonary system

Stevens and House (1955): Model parameters

Distance of major constriction from glottis (d0)

Radius of major constriction (r0)

Area (A) and length (l) of lip constriction; A/l = conductivity index

Figure 1.

Comparing model to real vocal tract

Stevens and House (1955)

Figure 2.

Key Goal of Study: Evaluate the effect of systematically changing each of these three “vocal tract” parameters on F1-F3 frequency

Figure 3. Formant frequency (kHz) of F1, F2, and F3 as a function of point of constriction, d0 (cm from glottis).

Point of constriction

A/l

NOTE: Single intersection between F1 & F2 in most cases

Figure 5.


A/l

Figure 5.


A/l

Figure 7.

General Observations

∆ d0 = ∆ Vfront & Vback

↑ d0 = ↓ Vfront = ↑ F2

↑ d0 = ↑ Vback = ↓ F1

General Observations

↓ r0 = ↓ F1

↑ r0 = ↑ F1

When d0 ↑ (anterior)

↓ r0 = ↓ Vfront = ↑ F2

↑ lip rounding = ↓ A/l = ↓ F1 & F2
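
The direction of these effects can be illustrated with a Helmholtz resonator approximation. This is a rough sketch, not the Stevens and House model itself: it treats a cavity of volume V behind a short constriction of area A and length l as a single resonator with frequency f = (c / 2π) · sqrt(A / (l · V)). The function name, the example dimensions, and the speed of sound (35,000 cm/s) are assumptions chosen only for illustration.

```python
import math

SPEED_OF_SOUND_CM_S = 35_000.0  # assumed value for warm, moist air


def helmholtz_frequency(area_cm2, length_cm, volume_cm3, c=SPEED_OF_SOUND_CM_S):
    """Resonance (Hz) of a cavity with a neck of the given area and length."""
    conductivity = area_cm2 / length_cm  # the A/l index
    return (c / (2.0 * math.pi)) * math.sqrt(conductivity / volume_cm3)


# Hypothetical dimensions, for illustration only.
baseline = helmholtz_frequency(area_cm2=0.5, length_cm=1.0, volume_cm3=50.0)
bigger_cavity = helmholtz_frequency(area_cm2=0.5, length_cm=1.0, volume_cm3=80.0)
more_rounding = helmholtz_frequency(area_cm2=0.25, length_cm=1.5, volume_cm3=50.0)

print(f"baseline:      {baseline:.0f} Hz")
print(f"larger volume: {bigger_cavity:.0f} Hz")
print(f"smaller A/l:   {more_rounding:.0f} Hz")
```

In this approximation, a larger cavity volume or a smaller A/l both lower the resonance, which matches the direction of the observations above.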

Formant Patterns for the “Noncentral” (i.e., omitting /ʌ/ and /ɝ/) Monophthongal Vowels of American English (based on Peterson & Barney averages)

Formant Data for Men: “Standard” F1-F2 Plot (annotated with the direction of change, - to +, in r0 and d0)

Peterson & Barney Averages (for men only) Plotted on an Acoustic Vowel Diagram

“normalizing” formant values
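
The slide does not say which normalization procedure is meant. As one illustration, the sketch below uses Lobanov-style z-score normalization, a common speaker-normalization method in which each formant is expressed relative to that speaker's own mean and standard deviation. The function name and the formant values are hypothetical; the second speaker is simply the first scaled by 1.2 to mimic a shorter vocal tract.

```python
from statistics import mean, pstdev


def lobanov_normalize(tokens):
    """Z-score each formant across one speaker's vowel tokens.

    tokens: list of (F1_hz, F2_hz) tuples from a single speaker.
    Returns a list of (z_F1, z_F2) tuples.
    """
    columns = list(zip(*tokens))  # one tuple of values per formant
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]  # population SD; an assumption
    return [
        tuple((value - m) / sd for value, m, sd in zip(token, means, sds))
        for token in tokens
    ]


# Hypothetical formant values (Hz), for illustration only.
speaker_a = [(300.0, 2300.0), (700.0, 1100.0), (300.0, 900.0)]
speaker_b = [(f1 * 1.2, f2 * 1.2) for f1, f2 in speaker_a]

norm_a = lobanov_normalize(speaker_a)
norm_b = lobanov_normalize(speaker_b)
```

Because z-scores are unaffected by a uniform scaling of the raw frequencies, the two speakers' vowels land on the same normalized coordinates even though their formants in Hz differ.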

Clinical Example

Acoustic variables related to the perception of vowel quality

F1 and F2

Other formants (e.g., F3)

Fundamental frequency (F0)

Duration

Spectral dynamics (i.e., formant change over time)

How helpful are F1 & F2?

Data Source                  Human Listeners   Pattern Classifier
Peterson & Barney (1952)     94.4%             74.9%
Hillenbrand et al. (1995)    95.2%             68.2%

From Hillenbrand & Gayvert (1993)

How does adding more variables improve pattern classifier success?

F1, F2 + F3: 80-85%

F1, F2 + F0: 80-85%

F1, F2 + F3 + F0: 89-90%
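
To make the idea of feeding different variable sets to a pattern classifier concrete, here is a minimal sketch using a 1-nearest-neighbor rule. It is not necessarily the classifier used in the cited studies, and the tokens are made-up values constructed so that one talker group's vowel "X" overlaps the other group's vowel "Y" in F1-F2. The resulting scores do not reproduce any of the percentages above. Distances are computed on raw Hz, which is a simplification.

```python
import math


def nearest_neighbor_label(train, features, token):
    """Return the label of the training token closest to `token`.

    train: list of (label, {feature_name: value_hz}) pairs.
    features: names of the features included in the distance.
    """
    def distance(a, b):
        return math.sqrt(sum((a[name] - b[name]) ** 2 for name in features))

    best_label, _ = min(train, key=lambda item: distance(item[1], token))
    return best_label


def accuracy(train, test, features):
    """Proportion of test tokens whose predicted label matches the true one."""
    hits = sum(
        nearest_neighbor_label(train, features, tok) == label
        for label, tok in test
    )
    return hits / len(test)


# Hypothetical tokens (Hz): low-F0 and high-F0 talkers producing vowels X and Y.
train = [
    ("X", {"F1": 500, "F2": 1500, "F0": 120}),
    ("X", {"F1": 600, "F2": 1800, "F0": 220}),
    ("Y", {"F1": 605, "F2": 1795, "F0": 120}),
    ("Y", {"F1": 720, "F2": 2160, "F0": 220}),
]
test = [
    ("X", {"F1": 510, "F2": 1510, "F0": 115}),
    ("X", {"F1": 610, "F2": 1790, "F0": 210}),
    ("Y", {"F1": 600, "F2": 1800, "F0": 125}),
    ("Y", {"F1": 715, "F2": 2150, "F0": 225}),
]

acc_f1_f2 = accuracy(train, test, ["F1", "F2"])
acc_with_f0 = accuracy(train, test, ["F1", "F2", "F0"])
print(acc_f1_f2, acc_with_f0)
```

On this toy data the two overlapping tokens are misclassified from F1 and F2 alone and recovered once F0 is included, which is the kind of effect the percentages above summarize for real data.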

How about Duration?

Nearby vowels have different durations

American English Vowels Have Different Typical Durations

/i/ > /ɪ/

/u/ > /ʊ/

/æ/ > /ɛ/

/ɑ/ > /ʌ/

/ɔ/ > /ɑ/

Do Listeners Use Duration in Vowel Identification?

RESULTS

Original Duration: 96.0%

Neutral Duration: 94.1%

Short Duration: 91.4%

Long Duration: 90.9%

CONCLUSIONS

1. Duration has a measurable but fairly small overall effect on vowel perception.

2. Vowel Shortening (-2 SDs): ~5% drop in overall intelligibility

3. Vowel Lengthening (+2 SDs): ~5% drop in overall intelligibility

4. Vowels Most Affected: /ɑ/ - /ɔ/ - /ʌ/; /æ/ - /ɛ/

5. Vowels Not Affected: /i/ - /ɪ/; /u/ - /ʊ/
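
The duration conditions can be sketched as follows. The slides give only the ±2 SD offsets; this sketch assumes that "neutral" is the mean of the measured durations and that "short" and "long" are 2 SDs below and above that mean. The function name and the durations are hypothetical.

```python
from statistics import mean, stdev


def duration_conditions(durations_ms, n_sd=2.0):
    """Return neutral, short, and long target durations in milliseconds."""
    m = mean(durations_ms)
    sd = stdev(durations_ms)  # sample standard deviation
    return {
        "neutral": m,              # assumed: the mean duration
        "short": m - n_sd * sd,    # shortened by n_sd standard deviations
        "long": m + n_sd * sd,     # lengthened by n_sd standard deviations
    }


# Hypothetical vowel durations (ms), for illustration only.
conditions = duration_conditions([200.0, 240.0, 260.0, 300.0])
for name, value in conditions.items():
    print(f"{name}: {value:.1f} ms")
```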

What about Duration?

Some examples

What about formant variation?

Notice that some vowels, especially /æ/ and /ɪ/, show a fair amount of change in formant frequencies throughout the course of the vowel. Is it possible that these formant movements are perceptually significant?

Naturally spoken /hæd/

Synthesized, preserving original formant contours

Synthesized with flattened formants


Conclusion: Spectral change patterns do matter.
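
The "flattened formants" manipulation can be sketched as replacing each time-varying formant track with a constant before resynthesis. Which frame supplies the constant is an assumption here (the middle frame by default); the slides do not specify it, and the function name and track values are hypothetical.

```python
def flatten_track(track_hz, steady_index=None):
    """Replace a formant track with a constant taken from one frame.

    track_hz: formant frequencies (Hz), one value per analysis frame.
    steady_index: frame whose value is held constant; defaults to the
    middle frame (an assumption, not taken from the slides).
    """
    if steady_index is None:
        steady_index = len(track_hz) // 2
    return [track_hz[steady_index]] * len(track_hz)


# Hypothetical F2 contour (Hz) that falls over the course of the vowel.
original_f2 = [2000.0, 1950.0, 1900.0, 1800.0, 1700.0]
flat_f2 = flatten_track(original_f2)
print(flat_f2)
```

A synthesizer driven by `original_f2` preserves the formant movement, while one driven by `flat_f2` removes it and leaves the track length unchanged.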


What do we conclude?

Sinewave Speech Demonstration

Sinewave speech examples (from HINT sentence intelligibility test):
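
For readers unfamiliar with the technique, sinewave speech replaces each formant with a single sinusoid that follows the formant's frequency over time. The sketch below is a simplified illustration, not the procedure used to make the examples on this slide: it holds each frequency constant within a frame, gives every sinusoid equal amplitude, and uses hypothetical formant tracks, frame duration, and sample rate.

```python
import math


def sinewave_speech(formant_tracks_hz, frame_dur_s=0.01, sample_rate=16000):
    """Sum one frequency-modulated sinusoid per formant track.

    formant_tracks_hz: list of tracks; each track holds one frequency (Hz)
    per analysis frame, and all tracks have the same length.
    Returns a list of float samples in the range [-1, 1].
    """
    n_frames = len(formant_tracks_hz[0])
    samples_per_frame = int(round(frame_dur_s * sample_rate))
    phases = [0.0] * len(formant_tracks_hz)
    out = []
    for n in range(n_frames * samples_per_frame):
        frame = n // samples_per_frame
        value = 0.0
        for k, track in enumerate(formant_tracks_hz):
            # Accumulate phase so it stays continuous across frame changes.
            phases[k] += 2.0 * math.pi * track[frame] / sample_rate
            value += math.sin(phases[k])
        out.append(value / len(formant_tracks_hz))
    return out


# Hypothetical F1, F2, and F3 tracks (Hz), five 10-ms frames each.
tracks = [
    [500.0, 550.0, 600.0, 650.0, 700.0],
    [1500.0, 1550.0, 1600.0, 1650.0, 1700.0],
    [2500.0, 2500.0, 2500.0, 2500.0, 2500.0],
]
samples = sinewave_speech(tracks)
print(len(samples))
```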

Selected issues that are not resolved

What do listeners use?

Specific formants vs. spectrum envelope

What is the “planning space” used by speakers?

Articulatory, acoustic, or auditory
