Chapter 12
Speech Perception
Animals use sound to communicate in many ways
Bird calls
Whale calls
Baboon shrieks
Vervet calls
Grasshoppers rubbing their legs
These kinds of communication differ from language in the structure of the signals.
Speech perception is a broad category
Understanding what is said (linguistic information)
Understanding “paralinguistic information”
Speaker’s identity
Speaker’s affective state
Speech processing ≠ linguistic processing.
Vocal tract
Includes larynx, throat, tongue, teeth, and lips.
Vocal cords = vocal folds
Male vocal cords are about 60% larger than female vocal cords in humans.
The size of the vocal cords is not the sole cue to the sex of a speaker. Children’s voices can be discriminated by sex.
Physical disturbances in air ≠ phonemes
Many different sounds are lumped together in every single phoneme.
Another case of separating the physical from the psychological.
Humans normally speak at about 12 phonemes per second.
Humans can comprehend speech at up to about 50 phonemes per second.
Voice spectrogram changes with age.
Spectrograms can be taken of all sorts of sounds.
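A spectrogram shows how the energy at each frequency changes over time. The sketch below is one minimal way to compute one, not the method used for the figures in this chapter: it cuts a signal into short overlapping frames and takes the magnitude of a discrete Fourier transform of each frame. The function names (`spectrogram`, `peak_frequency`), the sample rate, the frame sizes, and the synthetic two-tone test signal are all assumptions chosen for illustration.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of magnitudes (one per frequency bin) for each time frame."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Taper the frame with a Hann window to reduce spectral leakage.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Plain discrete Fourier transform, keeping non-negative frequencies only.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


def peak_frequency(magnitudes, sample_rate, frame_size):
    """Frequency in Hz of the strongest bin in one frame."""
    k = max(range(len(magnitudes)), key=lambda i: magnitudes[i])
    return k * sample_rate / frame_size


# Hypothetical test sound: a 500 Hz tone followed by a 1500 Hz tone.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
low = [math.sin(2 * math.pi * 500 * n / SAMPLE_RATE) for n in range(512)]
high = [math.sin(2 * math.pi * 1500 * n / SAMPLE_RATE) for n in range(512)]
spec = spectrogram(low + high, frame_size=FRAME_SIZE, hop=32)

print(peak_frequency(spec[0], SAMPLE_RATE, FRAME_SIZE))
print(peak_frequency(spec[-1], SAMPLE_RATE, FRAME_SIZE))
```

The first frame peaks at 500 Hz and the last at 1500 Hz, so the change in the sound over time shows up as a change in which frequency bin carries the most energy. Real speech spectrograms use longer recordings and a fast Fourier transform, but the idea is the same.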
Neural analysis of speech sounds
One phoneme can have distinct sound spectrograms. Distinct sound spectrograms can be metamers for a phoneme.
Brain mechanisms of speech perception
http://www.molbio.princeton.edu/courses/mb427/2000/projects/0008/messedupbrainmain.html
Brain mechanisms of speech perception
Single-cell recordings in monkeys show that neurons are sensitive to:
1. Time elapsing between lip movements and the start of sound production
2. Acoustic context of sound
3. Rate of sound frequency changes
Human studies
Human studies have been based on neuroimaging (fMRI and PET).
A1 is not a linguistic center, merely an auditory center. It does not respond preferentially to speech over other sounds.
Speech processing is a grab bag of kinds of processing, e.g. linguistic, emotional, and speaker identity.
Wernicke’s aphasia
Subjects can hear sounds.
Subjects lose the ability to comprehend speech, though they can produce (clearly disturbed) speech themselves.
Other brain regions involved in speech processing
Right temporal hemisphere is involved in emotion, speaker sex, and identity.
Phonagnosia
Right temporal hemisphere is less involved in linguistic analysis.
Right pre-frontal cortex and parts of the limbic system respond to emotion.
Other brain regions involved in speech processing
Both hemispheres are active in human vocalizations, such as laughing or humming.
Some motor areas for speech are active during speech perception.
A “what” and “where” pathway in speech processing?
One pathway is anterior (forward) and ventral (below).
The other pathway is posterior (backward) and dorsal (above).
Not clear what these pathways do.
Understanding speech: Aftereffects
Tilt aftereffect and motion aftereffect are due to “fatigue” of specific neurons.
Eimas & Corbit (1973) performed a linguistic version.
Take ambiguous phonemes, e.g. between /t/ and /d/.
Listen to /d/ over and over, then the ambiguity disappears: the ambiguous sound is heard as /t/.
Understanding speech: Context effects
In vision, surrounding objects affect interpretation of size, color, brightness. In other words, context influences perception.
In speech, context influences perception. We noted this earlier with /di/ and /du/.
Understanding speech: Context effects
Semantic context can influence perception.
Examples of song lyrics.
Speed of utterance influences phonetic interpretation.
A syllable may sound like /ba/ when preceding words are spoken slowly, but like /pa/ when preceding words are spoken quickly.
Cadence of a sentence can influence interpretation of the last word. (Ladefoged & Broadbent, 1957)
Understanding speech: Visual effects
McGurk effect
Movies of speakers influence syllables heard.
Vocal /ga/ + lip /ba/ = /da/
Vocal “tought” + lip “hole” = “towel”.
McGurk effect reduced with face inversion
Emotions of talking heads
Movie of facial emotion + voice with an emotion
When face and voice agree, most subjects correctly identify the emotion.
When face and voice conflict, facial expression provides the emotion.
McGurk effect + talking heads effect make sense, since they enable humans to function more reliably in noisy environments.
Infants 18-20 weeks old can match voice and face.
Humans can match movies of speakers with voices of speakers.
Monkeys and preferential looking
Ghazanfar & Logothetis (2003).
Showed monkeys two silent movies of monkeys vocalizing at the same time.
Played a vocalization that matched one of the silent movies.
All 20 monkeys looked at the monkey face that matched the sound.
More neuroimaging of speech perception
Subjects watched faces of silent speakers.
MT (aka V5) was active for motion processing.
A1 and additional language centers were also active.
Perceived sound boundaries in words are illusory.
“Mondegreens”
Pauses indicate times at which to switch speakers.
Disfluency: repetitions, false starts, and useless interjections.
They help by parsing the sentence, giving the listener time to process, and hinting at new information.
Language-based learning impairment: a specifically linguistic, rather than acoustic, impairment.
Fun illusion (nothing to do with class):
http://www.ritsumei.ac.jp/~akitaoka/index-e.html