pitch tracking + prosody

Pitch Tracking + Prosody January 19, 2012

Upload: zorion

Post on 05-Jan-2016



Page 1: Pitch Tracking + Prosody

Pitch Tracking + Prosody

January 19, 2012

Page 2: Pitch Tracking + Prosody

Homework!

• For Tuesday: introductory course project report

• Background information on your consultant and the language they speak.

• For Thursday: Digital Signal Processing exercises!

Page 3: Pitch Tracking + Prosody

A Typology

• F0 is generally used in three different ways in language:

1. Tone languages (Chinese, Navajo, Igbo)

• Lexically determined tone on every syllable

• “Syllable-based” tone languages

2. Accentual languages (Japanese, Swedish)

• The location of an accent in a particular word is lexically marked.

• “Word-based” tone languages

3. Stress languages (English, Russian)

• It’s complicated.

Page 4: Pitch Tracking + Prosody

Mandarin Tone

ma1: mother

ma2: hemp

ma3: horse

ma4: to scold

• Mandarin (Chinese) is a classic example of a tone language.

Page 5: Pitch Tracking + Prosody

How to Transcribe Tone

• Tones are defined by the pattern they make through a speaker’s frequency range.

• The frequency range is usually assumed to encompass five levels (1-5).

• (although this can vary, depending on the language)

[Figure: five-level pitch scale, from 1 (lowest F0) to 5 (highest F0)]

Page 6: Pitch Tracking + Prosody

• In Mandarin, tones span a frequency range of 1-5

• Each tone is denoted by its (numerical) path through the frequency range

• Each syllable can also be labeled with a tone number (e.g., ma1, ma2, ma3, ma4)

[Figure: pitch contours of Mandarin tones 1, 2, 3, and 4]
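The labeling scheme on this slide can be sketched in Python. Only the value 55 for tone 1 appears in these slides; the contours used here for tones 2 to 4 (35, 214, 51) are the conventional textbook values and should be read as an assumption. The names `MANDARIN_TONES` and `tone_path` are made up for the illustration.

```python
# Conventional five-level contours for the Mandarin tones.
# Only tone 1 (55) is stated in these slides; the others are assumed textbook values.
MANDARIN_TONES = {1: "55", 2: "35", 3: "214", 4: "51"}


def tone_path(syllable: str) -> list[int]:
    """Turn a label such as 'ma3' into its path through pitch levels 1-5."""
    tone_number = int(syllable[-1])  # the trailing digit is the tone number
    return [int(level) for level in MANDARIN_TONES[tone_number]]


for label in ("ma1", "ma2", "ma3", "ma4"):
    print(label, tone_path(label))
```

Each digit in a contour is one of the five levels, so "ma3" comes out as a low-falling-then-rising path.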

Page 7: Pitch Tracking + Prosody

How to Transcribe Tone

• Tone is relative

• i.e., not absolute

• Each speaker has a unique frequency range. For example:

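[Figure: the five-level scale (1 = lowest F0, 5 = highest F0) mapped onto the frequency ranges of two speakers, labeled Female and Male; frequency labels: 100 Hz, 150 Hz, 200 Hz, 350 Hz]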
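A minimal sketch of what "relative, not absolute" means in practice: the same F0 value lands on different tone levels depending on the speaker's range. The two ranges below are hypothetical and chosen only for illustration, and the even spacing of levels in log frequency is an assumption, not something the slides specify.

```python
import math


def tone_level(f0_hz: float, lowest_hz: float, highest_hz: float) -> int:
    """Map an F0 value onto levels 1 (lowest) to 5 (highest) of one speaker's range."""
    # Assumption: the five levels are evenly spaced in log frequency.
    position = math.log(f0_hz / lowest_hz) / math.log(highest_hz / lowest_hz)
    position = min(max(position, 0.0), 1.0)  # clamp values outside the range
    return 1 + round(position * 4)


# Hypothetical speaker ranges (lowest F0, highest F0), for illustration only.
speaker_a = (100.0, 200.0)
speaker_b = (150.0, 350.0)

print(tone_level(200.0, *speaker_a))  # level 5: the top of speaker A's range
print(tone_level(200.0, *speaker_b))  # level 2: fairly low for speaker B
```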

Page 8: Pitch Tracking + Prosody

General Relativity

• In ordinary conversation, for European languages (Fant, 1956):

• Men have an average F0 of 120 Hz

• A range of 50-250 Hz

• Women have an average F0 of 220 Hz

• A range of 120-480 Hz

• Children have an average F0 of 330 Hz

• In a normal utterance, the F0 range is usually one octave.

• i.e., highest F0 = 2 * lowest F0
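The octave relation can be checked with a short calculation. This is a sketch; the function names are hypothetical. An octave is a doubling of frequency, so the size of a range in octaves is the base-2 logarithm of highest F0 over lowest F0, and one octave equals 12 semitones.

```python
import math


def range_in_octaves(lowest_hz: float, highest_hz: float) -> float:
    """Size of an F0 range in octaves (1 octave = a doubling of frequency)."""
    return math.log2(highest_hz / lowest_hz)


def range_in_semitones(lowest_hz: float, highest_hz: float) -> float:
    """Size of an F0 range in semitones (12 semitones per octave)."""
    return 12 * range_in_octaves(lowest_hz, highest_hz)


print(range_in_octaves(100.0, 200.0))  # a single utterance spanning one octave
print(range_in_octaves(50.0, 250.0))   # the overall male range quoted above
print(range_in_octaves(120.0, 480.0))  # the overall female range quoted above
```

The overall ranges come out at about two octaves or a little more, which is wider than the one-octave span of a typical single utterance.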

Page 9: Pitch Tracking + Prosody

Relativity, in Reality

• The same tones may be denoted by completely different frequencies, depending on the speaker.

• Tone is an abstract linguistic unit.

[Figure: pitch tracks of "ma", tone 1 (55), produced by a female speaker and a male speaker]

Page 10: Pitch Tracking + Prosody

Accent Languages

• In accent languages, there is only one pitch accent associated with each word.

• The pitch accent is realized on only one syllable in the word.

• The other syllables in the word cannot bear an accent.

• Accent is lexically determined, so there can be minimal pairs.

• Japanese is a pitch accent language…

• for some, but not all, words

• for some, but not all, dialects

Page 11: Pitch Tracking + Prosody

Japanese

• Japanese words have one High accent

• it attaches to one “mora” in the word

• A mora = a vowel, or a consonant following a vowel, within a syllable.

• For example:

• [ni] ‘two’ has one mora.

• [san] ‘three’ has two morae.

• The first mora, if not accented, has a Low F0.

• Morae following the accent have Low F0.

It’s actually slightly more complicated than this; for more info, see: http://sp.cis.iwate-u.ac.jp/sp/lesson/j/doc/accent.html

Page 12: Pitch Tracking + Prosody

Japanese Examples

• asa ‘morning’ H-L

• asa ‘hemp’ L-H

Page 13: Pitch Tracking + Prosody

• “chopsticks” H-L-L

• “bridge” L-H-L

• “edge” L-H-H
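The H/L patterns in these examples can be generated from the rules on the Japanese slide. This is only a sketch of the simplified rules: the slides say that the first mora is Low if unaccented and that morae after the accent are Low. Treating every other mora as High, and allowing an unaccented phrase (`accent=None`), are assumptions added here to cover the L-H-H pattern. The function name `pitch_pattern` is hypothetical.

```python
from typing import Optional


def pitch_pattern(n_morae: int, accent: Optional[int]) -> str:
    """Return an H/L pattern such as 'L-H-L' for a phrase of n_morae morae.

    accent is the 1-based position of the accented mora, or None for a
    phrase with no accent (an assumption beyond the slides).
    """
    tones = []
    for position in range(1, n_morae + 1):
        if accent is not None and position > accent:
            tones.append("L")  # morae following the accent are Low
        elif position == 1 and accent != 1:
            tones.append("L")  # the first mora, if not accented, is Low
        else:
            tones.append("H")  # assumed: every remaining mora is High
    return "-".join(tones)


print(pitch_pattern(2, 1))     # asa 'morning'
print(pitch_pattern(2, 2))     # asa 'hemp'
print(pitch_pattern(3, 1))     # "chopsticks"
print(pitch_pattern(3, 2))     # "bridge"
print(pitch_pattern(3, None))  # "edge"
```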

Page 14: Pitch Tracking + Prosody

Stress Languages

• Stress is a suprasegmental property that applies to whole syllables.

• It is defined by more than just differences in F0.

• Stressed syllables are higher in pitch (usually)

• Stressed syllables are longer (usually)

• Stressed syllables are louder (usually)

• Stressed syllables reflect more phonetic effort.

• More aspiration, less coarticulation in stressed syllables.

• Vowels often reduce to schwa in unstressed syllables.

• The combination of these factors gives stressed syllables more prominence than unstressed syllables.

Page 15: Pitch Tracking + Prosody

Stress: Pitch

[Figure: pitch tracks labeled (N) and (V)]

Complicating factor: pitch tends to drift downwards at the end of utterances

Page 16: Pitch Tracking + Prosody

Intonation

• Languages superimpose pitch contours on top of word-based stress or tone distinctions.

• This is called intonation.

• It turns out that English:

• has word-based stress

• and phrase-based pitch accents (intonation)

• The pitch accents are pragmatically specified, rather than lexically specified.

• = they change according to discourse context.

Page 17: Pitch Tracking + Prosody

English Intonation

• We’ll analyze English intonation with a framework called ToBI

• Tones and Break Indices

• Note: intonational patterns vary across dialects

• The patterns and examples presented today might not match up with your own intonational system

• Also: this framework has only been applied to a few (primarily western) languages

• Check out the following:

• http://www.ling.ohio-state.edu/~tobi/

• Course in Phonetics, pp. 99-107

• Mary Beckman’s notes

Page 18: Pitch Tracking + Prosody

Levels of Prominence

• In English, pitch accents align with stressed syllables.

• Example: “exploitation”

                ex    ploi    ta    tion
vowel           X     X       X     X
full vowel      X     X       X
stress          X             X
pitch accent                  X

• Normally, the accent falls on the last stressed syllable.

Page 19: Pitch Tracking + Prosody

Pitch Accent Types

• In English, pitch accents can be either high or low

• H* or L*

• Examples:

High (H*)         Low (L*)
Yes.              Yes?
Magnification.    Magnification?

• As with tones in tone languages, “high” and “low” pitch accents are defined relative to a speaker’s pitch range.

• My pitch range: H* = 155 Hz L* = 100 Hz

• Mary Beckman: H* = 260 Hz L* = 130 Hz
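A small sketch of why accent labels only make sense relative to the speaker: the same F0 value can be closest to H* for one speaker and to L* for another. The two reference pairs are the values quoted above; the function name `nearest_accent` and the choice to compare distances in log frequency are assumptions made for the illustration.

```python
import math


def nearest_accent(f0_hz: float, h_star_hz: float, l_star_hz: float) -> str:
    """Label an F0 value with whichever of the speaker's two targets is closer."""
    # Assumption: distance is measured in log frequency (i.e. as a ratio).
    distance_to_high = abs(math.log(f0_hz / h_star_hz))
    distance_to_low = abs(math.log(f0_hz / l_star_hz))
    return "H*" if distance_to_high < distance_to_low else "L*"


# Reference values quoted on this slide.
lecturer = {"h_star_hz": 155.0, "l_star_hz": 100.0}
beckman = {"h_star_hz": 260.0, "l_star_hz": 130.0}

print(nearest_accent(140.0, **lecturer))  # H*
print(nearest_accent(140.0, **beckman))   # L*
```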

Page 20: Pitch Tracking + Prosody

Whole Utterances

• The same pitch pattern can apply to an entire sentence:

H*: Manny came with Anna.

L*: Manny came with Anna?

H*: Marianna made the marmalade.

L*: Marianna made the marmalade?

Page 21: Pitch Tracking + Prosody

Information

• Note that there’s a tendency to accent new information in the discourse.

• 4 different patterns for 4 different contexts:

H*

H*: Manny came with Anna.

H*

H*: Manny came with Anna.

L*

L*: Manny came with Anna?

L*

L*: Manny came with Anna?

Page 22: Pitch Tracking + Prosody

Pitch Tracking

• H* is usually associated with a peak in F0.

• L* is usually associated with a valley (trough) in F0.

• Pitch tracking can help with the identification of pitch peaks and valleys.

• Note: it’s easier to analyze utterances with lots of sonorants.

• Check out both productions of “Manny came with Anna” in Praat.

• Note that there is more to the intonation contour than just pitch peaks and valleys

• The H* is followed by a falling pitch pattern

• The L* is followed by a rising pitch pattern
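To give a rough idea of what a pitch tracker does, here is a bare-bones autocorrelation sketch in plain Python. It is a stand-in for illustration only, not a description of what Praat does internally. The function name `estimate_f0`, the 75-400 Hz search range, and the synthetic test signal are all assumptions. The idea: a periodic signal looks most like itself when shifted by exactly one period, so the lag with the highest autocorrelation gives the period, and F0 is the sample rate divided by that lag.

```python
import math


def estimate_f0(samples, sample_rate, f0_min=75.0, f0_max=400.0):
    """Estimate the F0 of one voiced frame from its autocorrelation peak."""
    shortest_lag = int(sample_rate / f0_max)  # shortest period considered
    longest_lag = int(sample_rate / f0_min)   # longest period considered
    best_lag, best_score = None, 0.0
    for lag in range(shortest_lag, longest_lag + 1):
        # Similarity between the frame and a copy of itself shifted by `lag`.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # No positive peak (e.g. silence) means no F0 estimate.
    return sample_rate / best_lag if best_lag else None


# Synthetic "voiced" frame: a 200 Hz sine wave, 50 ms long, sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
print(estimate_f0(frame, sample_rate))
```

Running this over successive short frames of a recording would give an F0 contour in which peaks and valleys can be looked for. Real speech is much messier than a sine wave, which is one reason utterances with lots of sonorants are easier to track.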

Page 23: Pitch Tracking + Prosody

Tone Types

• There are two types of tones at play:

1. Pitch Accents

• associated with a stressed syllable

• may be either High (H) or Low (L)

• marked with a *

2. Boundary Tones

• appear at the end of a phrase

• not associated with a particular syllable

• may be either High (H) or Low (L)

• marked with a %

Page 24: Pitch Tracking + Prosody

Tone Transcription

L* H%

Page 25: Pitch Tracking + Prosody

Phrases

• Intonation organizes utterances into phrases

• “chunks”

• Boundary tones mark the end of intonational phrases

• Intonational phrases are the largest phrases

• In the transcription of intonation, phrase boundaries are marked with Break Indices

• Hence, ToBI: Tones and Break Indices

• Break Indices are denoted by numbers

• 1 = break between words

• 4 = break between intonational phrases

Page 26: Pitch Tracking + Prosody

Break Index Transcription

Tones: L* H%

Breaks: 1 1 1 4
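The two tiers above can be held in a small data structure, one entry per word. This is a sketch under assumptions: the sentence and the placement of L* on its last word are hypothetical (the slide text does not say which word carries the accent), the names `Word` and `check` are made up, and only the labels introduced in these slides (H*, L*, H%, L%, break indices 1 and 4) are accepted. The full ToBI system has more labels than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Word:
    text: str
    break_index: int            # break after this word: 1 or 4 in these slides
    tone: Optional[str] = None  # e.g. "L*", "H%", or "L* H%"


def check(words):
    """Return a list of problems found in a tones-and-breaks transcription."""
    problems = []
    for word in words:
        if word.break_index not in (1, 4):
            problems.append(f"{word.text}: unexpected break index {word.break_index}")
        for label in (word.tone or "").split():
            if label not in ("H*", "L*", "H%", "L%"):
                problems.append(f"{word.text}: unknown tone label {label}")
            # Boundary tones mark the end of an intonational phrase (break 4).
            if label.endswith("%") and word.break_index != 4:
                problems.append(f"{word.text}: boundary tone without a phrase break")
    return problems


# Hypothetical alignment of "Tones: L* H%" and "Breaks: 1 1 1 4" with a sentence.
utterance = [
    Word("Manny", 1),
    Word("came", 1),
    Word("with", 1),
    Word("Anna", 4, "L* H%"),
]
print(check(utterance))
```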