Categorical perception of speech: Task variations in infants and adults
Bob McMurray, Jessica Maye, Andrea Lathrop, and Richard N. Aslin
And a big thanks to Julie Markant
Categorical Perception & Task Variations
Overview
Previous work
• Categorical perception and gradient sensitivity to subphonemic detail.
• Categorical perception in infants
Reassessing this with HTPP & AEM
• Infants show gradient sensitivity
• A new methodology
• Adult analogues
Categorical Perception & Gradiency
Categorical Perception:
Is subphonemic detail retained [and used] during speech perception?
For a long time…
NO!
Subphonemic variation is discarded in favor of a discrete label.
Non-categorical Perception
A number of psychophysical-type results showed listeners’ sensitivity to within-category detail.
4AIX Task: Pisoni & Lazarus (1974)
Speeded Response: Carney, Widin & Viemeister (1977)
Training: Samuel (1977); Pisoni, Aslin, Hennessy & Perey (1982)
Rating Task: Miller (1997); Massaro & Cohen (1983)
Word Recognition
These results did not reveal:
Whether on-line word recognition is sensitive to such detail.
Whether such sensitivity is useful during recognition.
Word Recognition
Mounting evidence that word-recognition is sensitive:
• Lahiri & Marslen-Wilson (1991): vowel nasalization
• Andruski, Blumstein & Burton (1994): VOT
• Gow & Gordon (1995): word segmentation
• Salverda, Dahan & McQueen (in press): embedded words and vowel length.
• Dahan, Magnuson, Tanenhaus & Hogan (2001): coarticulatory cues in vowel.
Gradient Sensitivity
McMurray, Tanenhaus & Aslin (2002)
• Eye-movements to objects after hearing items from 9-step VOT continuum.
• Systematic relationship between VOT and looks to the competitor.
[Figure: competitor fixations (looks to the competitor) as a function of VOT (ms, 0 to 40), plotted by response on either side of the category boundary.]
Gradient Sensitivity
This systematic, gradient relationship between lexical activation and acoustic detail would allow the system to take advantage of fine-grained regularities in the signal.
Gow, McMurray & Tanenhaus (Sat., 6:00 poster session)
• Anticipate upcoming material.
• Resolve ambiguity.
If fine-grained detail is useful, we might expect infants and children to:
• Show gradient sensitivity to variation
• Tune their sensitivity to learning environment
…BUT
Infant categorical perception
Early findings of categorical perception for infants (e.g. Eimas, Siqueland, Jusczyk & Vigorito) have never been refuted.
Most studies use:
• Habituation (many repetitions)
• Synthetic speech
• Single continuum
Perhaps a different method would be more sensitive?
Head-Turn Preference Procedure
Jusczyk & Aslin (1995)
Infants exposed to a chunk of language:
• Words in running speech.
• Stream of continuous speech (à la statistical learning).
• Word list.
After exposure, memory for exposed items (or abstractions) is assessed by comparing listening time to consistent items with inconsistent items.
How do we measure listening time?
After exposure, the center light blinks, bringing the infant’s attention to center.
When the infant looks at center, one of the side-lights blinks.
When the infant looks at the side-light, she hears a word: Beach… Beach… Beach…
The word keeps playing as long as she keeps looking.
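The listening-time measure described above can be sketched in code. This is only an illustration under stated assumptions: the slides do not say how a trial ends, so the look-away criterion (`max_lookaway_ms`), the sampling interval (`dt_ms`), and the function name are hypothetical.

```python
def listening_time(samples, dt_ms, max_lookaway_ms=2000):
    """Total time (ms) the infant spends looking at the side-light on one trial.

    samples: sequence of booleans, True while the infant looks at the
             side-light, one sample every dt_ms milliseconds.
    The trial is assumed to end once the infant has looked away
    continuously for max_lookaway_ms (an assumption, not from the slides).
    """
    looking_ms = 0
    away_ms = 0
    for is_looking in samples:
        if is_looking:
            looking_ms += dt_ms
            away_ms = 0  # a brief look-away is forgiven once she looks back
        else:
            away_ms += dt_ms
            if away_ms >= max_lookaway_ms:
                break  # trial over; later samples are ignored
    return looking_ms


# Made-up trial: 1 s looking, 0.5 s away, 1 s looking, then a long look-away.
trial = [True] * 10 + [False] * 5 + [True] * 10 + [False] * 25 + [True] * 5
trial_ms = listening_time(trial, dt_ms=100)
```

Longer values of `trial_ms` are then read as greater interest in the word played on that trial.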
Experiment 1: Gradiency in Infants
7.5-month-old infants exposed to either 4 b-words or 4 p-words:
B-words: Bomb, Bear, Bail, Beach
P-words: Palm, Pear, Pail, Peach
80 repetitions total.
Form a category of the exposed class of words.
Measure listening time on (B exposure / P exposure):
Bear / Pear (original word)
Pear / Bear (opposite)
Bear* / Pear* (VOT closer to boundary)
Experiment 1: Stimuli
Both were judged /b/ or /p/ at least 90% consistently by adult listeners.
B: 98.5%, B*: 97%
P: 99%, P*: 96%
Stimuli constructed by cross-splicing natural, recorded tokens of each end point.
B: M = 3.6 ms VOT; B*: M = 11.9 ms VOT
P: M = 40.7 ms VOT; P*: M = 30.2 ms VOT
Measuring gradient sensitivity
Looking time is an indication of interest.
After hearing all of those B-words, P sounds pretty interesting.
So: infants should look longer for pear than bear.
What about in between?
[Figure: predicted listening time for Bear, Bear*, and Pear under categorical and gradient accounts.]
Individual Differences
Novelty/Familiarity preference varies across infants and experiments.
We’re only interested in the middle stimuli (b*, p*).
Infants categorized as novelty or familiarity preferring by performance on the endpoints.
Exposure   Novelty   Familiarity
B          27        11
P          19        10
Within each group will we see evidence for gradiency?
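One way to picture the endpoint-based grouping is the sketch below. The slides do not give the exact criterion, so this assumes an infant counts as novelty-preferring when mean listening time to the opposite-category endpoint exceeds that to the exposed endpoint. The function name and the example values are hypothetical.

```python
from statistics import mean


def classify_preference(exposed_endpoint_ms, opposite_endpoint_ms):
    """Label one infant as 'novelty' or 'familiarity' preferring.

    exposed_endpoint_ms:  listening times (ms) on the exposed endpoint
                          (e.g. Bear after B exposure).
    opposite_endpoint_ms: listening times (ms) on the opposite endpoint
                          (e.g. Pear after B exposure).
    Assumed rule (not from the slides): longer mean listening to the
    opposite endpoint means novelty; ties fall to 'familiarity'.
    """
    if mean(opposite_endpoint_ms) > mean(exposed_endpoint_ms):
        return "novelty"
    return "familiarity"


# Made-up listening times for two illustrative infants.
infant_a = classify_preference([5000, 6000], [8000, 9000])
infant_b = classify_preference([9000, 8500], [6000, 7000])
```

Because only the endpoints enter the rule, the middle stimuli (b*, p*) stay free to test for gradiency within each group.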
Novelty Results
Novelty infants, trained on B
VOT: p=.001**
Linear trend: p=.001**
[Figure: listening time (ms; axis 4000 to 10000) for B, B*, and P; pairwise comparisons marked .004** and .14.]
Novelty Results
Novelty infants, trained on P
VOT: p=.001**
Linear trend: p=.001**
[Figure: listening time (ms; axis 4000 to 10000) for P, P*, and B; pairwise comparisons marked .1 and .001**.]
Familiarity Results
Familiarity infants showed similar effects.
B exposure: trend: p=.001; B vs. B*: p=.19; B* vs. P: p=.21
P exposure: trend: p=.009; P vs. P*: p=.057; P* vs. B: p=.096
[Figure: listening time (ms; axis 4000 to 10000) in two panels: trained on B (B, B*, P) and trained on P (P, P*, B).]
Experiment 1: Conclusions
• 7.5-month-old infants show gradient sensitivity to subphonemic detail.
• Individual differences in familiarity/novelty preferences. Why?
  • Length of exposure?
  • Individual factors?
• Limitations of paradigm may hinder further study:
  • More repeated measures
  • Better understanding of “task”
  • Wider age range
Anticipatory Eye-Movements: a new methodology
An ideal methodology would:
Yield an arbitrary identification response.
Yield a response to a single stimulus.
Yield many repeated measures.
Much like a forced-choice identification.
Anticipatory Eye-Movements (AEM):
Train infants to look left or right in response to a single auditory stimulus.
Anticipatory Eye-Movements
Visual stimulus moves under occluder.
Reemergence serves as “reinforcer”
Concurrent auditory stimulus predicts endpoint of occluded trajectory.
Subjects make anticipatory eye-movements to the expected location—before the stimulus appears.
Anticipatory Eye-Movements
After training on original stimuli, infants are tested on a mixture of:
• new, generalization stimuli (unreinforced). Examine category structure/similarity relative to trained stimuli.
• original, trained stimuli (reinforced). Maintain interest in experiment. Provide objective criterion for inclusion.
Experiment 2: Pitch and Duration
Goals:
Use AEM to assess auditory categorization.
Assess infants’ abilities to “normalize” for variations in pitch and duration…
or…
Are infants sensitive to acoustic detail during a lexical identification task...
Experiment 2: Pitch and Duration
Training:
“Teak” -> rightward trajectory.
“Lamb” -> leftward trajectory.
Test: Lamb & Teak with changes in:
Duration: 33% and 66% longer.
Pitch: 20% and 40% higher.
If infants ignore irrelevant variation in pitch or duration, performance should be good for generalization stimuli.
If infants’ lexical representations are sensitive to this variation, performance will degrade.
The Stimuli
[Video: training stimulus]
[Video: testing stimulus]
Results
Each trial is scored as:
correct: longer looking time to the correct side.
incorrect: longer looking time to the incorrect side.
Binary DV, similar to 2AFC.
On trained stimuli:
11 of 29 infants performed better than chance. This is a tough task for infants. Perhaps more training is needed.
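The trial scoring above can be sketched as follows. The slides do not say how "better than chance" was tested or how ties were handled, so the one-sided exact binomial test and the exclusion of tied trials are assumptions. The function names and example values are hypothetical.

```python
from math import comb


def score_trial(look_ms_left, look_ms_right, correct_side):
    """Score one AEM trial from looking time (ms) to each side.

    Returns True (correct) if looking was longer to the correct side,
    False (incorrect) if longer to the other side, and None when the two
    times are equal (assumed unscorable; the slides do not cover ties).
    """
    if look_ms_left == look_ms_right:
        return None
    longer_side = "left" if look_ms_left > look_ms_right else "right"
    return longer_side == correct_side


def p_above_chance(n_correct, n_trials, chance=0.5):
    """One-sided exact binomial tail: P(X >= n_correct) under chance."""
    return sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


# Made-up example: "teak" predicts a rightward trajectory.
teak_trial = score_trial(look_ms_left=300, look_ms_right=700, correct_side="right")
# Made-up infant: 9 correct out of 10 scorable trials.
p_value = p_above_chance(9, 10)
```

Treating each trial as a binary outcome is what makes the measure comparable to a two-alternative forced-choice response.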
Results
On generalization stimuli:
[Figure: proportion of correct trials (axis 0 to 0.9) for training stimuli, D1 / P1, and D2 / P2, plotted separately for duration and pitch.]
Duration: p=.002
Pitch: p>.1
Experiment 2: Conclusions
Infants’ developing lexical categories show graded sensitivity to variation in duration.
Possibly not to pitch.
Might be an effect of “task relevance”.
AEM yields:
• more repeated measurements
• better understood task: 2AFC
Could it yield a picture of the entire developmental time course? Is AEM applicable to a wider age range?
Treating undergraduates like babies
Adults generally won’t:
• Look at blinking lights…
• Suck on pacifiers…
• Kick their feet at mobiles…
Result: few infant methodologies allow direct analogues to adults.
They do make eye-movements…
…could AEM be adapted?
Extreme case: Adult perception.
Treating undergraduates like babies
Pilot study.
5 adults exposed to AEM stimuli.
Training: “Ba” -> left, “Pa” -> right.
Test: Ba – Pa (0-40 ms) VOT continuum.
Results
[Figure: % /p/ responses (axis 0 to 1) as a function of VOT (ms, 0 to 40) for 2AFC and AEM.]
Second group of subjects run in an explicit 2AFC.
Same category boundary.
Steeper slope: less sensitivity to VOT.
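The boundary and slope comparison between the two tasks can be made concrete by fitting an identification function to each task's data. The sketch below is one possible approach, not the analysis used in the study: it assumes a two-parameter logistic, fits it with a straight line on the logit scale, and uses synthetic proportions generated inside the example. `fit_identification` and every number are hypothetical.

```python
import math


def logistic(vot, boundary, slope):
    """Proportion of /p/ responses predicted by a two-parameter logistic."""
    return 1.0 / (1.0 + math.exp(-slope * (vot - boundary)))


def fit_identification(vots, prop_p, eps=1e-6):
    """Estimate (boundary, slope) by least squares on the logit scale.

    Proportions are clipped to [eps, 1 - eps] so the logit stays finite.
    """
    logits = []
    for p in prop_p:
        p = min(max(p, eps), 1.0 - eps)
        logits.append(math.log(p / (1.0 - p)))
    mean_x = sum(vots) / len(vots)
    mean_y = sum(logits) / len(logits)
    sxx = sum((x - mean_x) ** 2 for x in vots)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(vots, logits))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    boundary = -intercept / slope  # VOT at which p(/p/) = 0.5
    return boundary, slope


# Synthetic data only (NOT values from the study): two tasks with the same
# boundary but different slopes, on a 0-40 ms continuum in 5 ms steps.
vots = list(range(0, 45, 5))
shallow = [logistic(v, boundary=20.0, slope=0.2) for v in vots]
steep = [logistic(v, boundary=20.0, slope=0.5) for v in vots]

shallow_fit = fit_identification(vots, shallow)
steep_fit = fit_identification(vots, steep)
```

A steeper fitted slope means a more step-like identification function, which the slides read as less sensitivity to within-category VOT differences.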
Adult AEM: Conclusions
AEM paradigm can be used unchanged for adults.
Should work with older children as well.
Results show same category boundary as traditional 2AFC tasks, perhaps more sensitivity to fine-grained acoustic detail.
Potentially useful for speech categorization when categories are not:
• nameable
• pictureable
• immediately obvious
Conclusions
Like adults, 7.5-month-old infants show gradient sensitivity to subphonemic detail:
• VOT
• Duration
Perhaps not pitch (w.r.t. lexical categories).
Conclusions
Task makes the difference:
Moving to HTPP from habituation revealed subphonemic sensitivity.
Taking into account individual differences is crucial.
Moving to AEM yields
Better ability to examine tuning over time.
Ability to assess perception across lifespan with a single paradigm.
Natural Stimuli
Infants may show more sensitivity to natural speech.
Stimuli constructed from natural tokens of actual words with progressive cross-splicing.
[Figure labels: Palm, Bomb]
Experiment 1: Reprise
• High variance/individual differences: can’t predict novelty/familiarity.
• Only a single point to look at.
• Between-subject comparison.
• Difficult interaction to obtain.
[Figure: listening time for Bear, Bear*, and Pear, with curves for 6, 8, and 10 m/o.]
Difficult to examine how sensitivity might be tuned to environmental factors in head-turn-preference procedure.
Experiment 1: Reprise
AEM presents a potential solution:
1) Looking at whole continuum would yield more power.
2) Is AEM applicable to a wider age range?
[Figure: Bear to Pear continuum, with curves for 6, 8, and 10 m/o.]
The Stimuli
[Video: training stimulus]
Data analysis
Data coded by naive coders from video containing pupil & scene monitors.
[Figure: coding regions for gaze position: Left, Left-in, Left-out, Right, Right-in, Right-out, Center, Start, and Off.]
Left-out, Right-out, center & start treated as “neither”.
Left-in, Left treated as anticipation to left.
Right-in, Right treated as anticipation to right.
Eye-movements coded from maximal size of stimulus to first appearance (or end of trial).
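The coding rules above amount to a lookup from gaze region to anticipation direction. A minimal sketch follows, assuming frame-by-frame region codes and a fixed frame duration (`frame_ms`), neither of which is specified in the slides. The "Off" region has no stated rule, so it is treated here as "neither" by assumption. All names are hypothetical.

```python
# Region -> anticipation direction, following the rules stated on the slide.
ANTICIPATION = {
    "left": "left",
    "left-in": "left",
    "right": "right",
    "right-in": "right",
    "left-out": "neither",
    "right-out": "neither",
    "center": "neither",
    "start": "neither",
}


def classify_region(code):
    """Map one coded gaze region to 'left', 'right', or 'neither'.

    'Off' and any unrecognized code fall back to 'neither' (an assumption).
    """
    return ANTICIPATION.get(code.strip().lower(), "neither")


def looking_time_by_side(coded_frames, frame_ms):
    """Sum looking time (ms) per anticipation direction over one trial."""
    totals = {"left": 0, "right": 0, "neither": 0}
    for code in coded_frames:
        totals[classify_region(code)] += frame_ms
    return totals


# Made-up coded trial, one region label per video frame.
frames = ["Start", "Center", "Left-in", "Left", "Left", "Right-out", "Off"]
totals = looking_time_by_side(frames, frame_ms=33)
```

The per-side totals are what a correct/incorrect trial score would be computed from, using only frames inside the coding window described above.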