colleagues : allen braun, nih greg hickok, uc irvine jonathan simon, univ. maryland
DESCRIPTION
A brain’s-eye-view of speech perception. David Poeppel, Cognitive Neuroscience of Language Lab, Department of Linguistics and Department of Biology, Neuroscience and Cognitive Science Program, University of Maryland College Park. Colleagues: Allen Braun, NIH; Greg Hickok, UC Irvine
![Page 1: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/1.jpg)
Colleagues:
• Allen Braun, NIH
• Greg Hickok, UC Irvine
• Jonathan Simon, Univ. Maryland
A brain’s-eye-view of speech perception
David Poeppel
Cognitive Neuroscience of Language Lab
Department of Linguistics and Department of Biology
Neuroscience and Cognitive Science Program
University of Maryland College Park
Students:
• Anthony Boemio
• Maria Chait
• Huan Luo
• Virginie van Wassenhove
![Page 2: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/2.jpg)
“chair” “uncomfortable” “lunch” “soon”
encoding ?
representation ?
Is this a hard problem? Yes! If it could be solved straightforwardly (e.g. by machine), Mark Liberman would be in Tahiti having cold beers.
![Page 3: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/3.jpg)
Outline
(1) Fractionating the problem in space:
Towards a functional anatomy of speech perception
(2) Fractionating the problem in time:
Towards a functional physiology of speech perception
- A hypothesis about the quantization of time
- Psychophysical evidence for temporal integration
- Imaging evidence
![Page 4: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/4.jpg)
interface with lexical items, word recognition
![Page 5: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/5.jpg)
interface with lexical items, word recognition
hypothesis about storage: distinctive features
[-voice] [+voice] [+voice]
[+labial] [+high] [+labial]
[-round] [+round] [-round]
[….] [….] [….]
![Page 6: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/6.jpg)
interface with lexical items, word recognition
hypothesis about storage: distinctive features
[-voice] [+voice] [+voice]
[+labial] [+high] [+labial]
[-round] [+round] [-round]
[….] [….] [….]
production, articulation of speech
![Page 7: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/7.jpg)
interface with lexical items, word recognition
hypothesis about storage: distinctive features
[-voice] [+voice] [+voice]
[+labial] [+high] [+labial]
[-round] [+round] [-round]
[….] [….] [….]
hypothesis about production: distinctive features
[-voice] [+voice]
[+labial] [+high]
[….] [….]
production, articulation of speech
![Page 8: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/8.jpg)
analysis of auditory signal, spectro-temporal rep.: FEATURES
interface with lexical items, word recognition: FEATURES
production, articulation of speech: FEATURES
![Page 9: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/9.jpg)
interface with lexical items, word recognition
coordinate transform from acoustic to articulatory space
auditory-motor interface
auditory-lexical interface
analysis of auditory signal, spectro-temporal rep.: FEATURES
production, articulation of speech
Unifying concept: distinctive feature
![Page 10: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/10.jpg)
interface with lexical items, word recognition
coordinate transform from acoustic to articulatory space
analysis of auditory signal, spectro-temporal rep.: FEATURES
production, articulation of speech
![Page 11: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/11.jpg)
STG (bilateral): acoustic-phonetic speech codes
pMTG (left): sound-meaning interface
Area Spt (left): auditory-motor interface
pIFG/dPM (left): articulatory-based speech codes
Hickok & Poeppel (2000), Trends in Cognitive Sciences
Hickok & Poeppel (in press), Cognition
![Page 12: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/12.jpg)
MTG and IFG overlap when controlling for the overt/covert distinction across tasks
Hypothesized functions:
- lexical selection (MTG)
- lexical phonological code retrieval (MTG)
- post-lexical syllabification (IFG)
Shared neural correlates of word production and perception processes:
- Bilat mid/post STG
- L anterior STG
- L mid/post MTG
- L post IFG
Indefrey & Levelt, in press, Cognition
Meta-analysis of neuroimaging data, perception/production overlap
![Page 13: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/13.jpg)
Scott & Johnsrude 2003
![Page 14: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/14.jpg)
Possible Subregions of Inferior Frontal Gyrus
Burton (2001)
Auditory Studies
Burton et al. (2000), Demonet et al. (1992, 1994), Fiez et al. (1995), Zatorre et al. (1992, 1996)
Visual Studies
Sergent et al. (1992, 1993), Poldrack et al. (1999), Paulesu et al. (1993, 1996), Shaywitz et al. (1995)
![Page 15: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/15.jpg)
Auditory lexical decision versus FM/sweeps (a), CP/syllables (b), and rest (c)
(a)
(b)
(c)
z=+6 z=+9 z=+12
D. Poeppel et al. (in press)
![Page 16: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/16.jpg)
fMRI (yellow blobs) and MEG (red dots) recordings of speech perception show pronounced activation of both left and right temporal cortices
T. Roberts & D. Poeppel (in preparation)
![Page 17: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/17.jpg)
Binder et al. 2000
![Page 18: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/18.jpg)
STG (bilateral): acoustic-phonetic speech codes
pMTG (left): sound-meaning interface
Area Spt (left): auditory-motor interface
pIFG/dPM (left): articulatory-based speech codes
Hickok & Poeppel (2000), Trends in Cognitive Sciences
Hickok & Poeppel (in press), Cognition
![Page 19: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/19.jpg)
Outline
(1) Fractionating the problem in space:
Towards a functional anatomy of speech perception
(2) Fractionating the problem in time:
Towards a functional physiology of speech perception
- A hypothesis about the quantization of time
- Psychophysical evidence for temporal integration
- Imaging evidence
![Page 20: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/20.jpg)
![Page 21: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/21.jpg)
The local/global distinction in vision is intuitively clear
Chuck Close
![Page 22: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/22.jpg)
What information does the brain extract from speech signals?
![Page 23: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/23.jpg)
Acoustic and articulatory phonetic phenomena occur on different time scales
Phenomena at the scale of formant transitions, subsegmental cues: “short stuff” -- order of magnitude 20-50 ms
Phenomena at the scale of syllables (tonality and prosody): “long stuff” -- order of magnitude 150-250 ms
fine structure
envelope
![Page 24: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/24.jpg)
Does different granularity in time matter?
Segmental and subsegmental information
serial order in speech: fool/flu, carp/crap, bat/tab
Supra-segmental information
prosody: Sleep during lecture! Sleep during lecture?
![Page 25: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/25.jpg)
The local/global distinction can be conceptualized as a multi-resolution analysis in time
Further processing
Supra-segmental information
(time ~200ms)
Segmental information
(time ~20-50ms)
syllabicity metrics tone features, segments
Binding process
![Page 26: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/26.jpg)
Outline
(1) Fractionating the problem in space:
Towards a functional anatomy of speech perception
(2) Fractionating the problem in time:
Towards a functional physiology of speech perception
- A hypothesis about the quantization of time
- Psychophysical evidence for temporal integration
- Imaging evidence
![Page 27: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/27.jpg)
Temporal integration windows
Psychophysical and electrophysiological evidence suggests that perceptual information is integrated and analysed in temporal integration windows (v. Bekesy 1933; Stevens and Hall 1966; Näätänen 1992; Theunissen and Miller 1995; etc.).
The concept of a temporal integration window matters because it implies discontinuous processing of information in the time domain. The CNS, on this view, treats time not as a continuous variable but as a series of temporal windows, and extracts data from a given window.
arrow of time, physics
arrow of time, Central Nervous System
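The idea of windowed, discontinuous processing can be illustrated with a minimal sketch. It assumes, purely for illustration, that "integration" means averaging signal energy over consecutive non-overlapping windows; the function name, the 1 kHz sample rate and the two window lengths are hypothetical choices, not something the slides specify.

```python
def integrate_in_windows(samples, sample_rate_hz, window_ms):
    """Split a signal into consecutive, non-overlapping windows and
    return the mean energy of each window (one value per window)."""
    window_len = max(1, round(sample_rate_hz * window_ms / 1000))
    energies = []
    for start in range(0, len(samples) - window_len + 1, window_len):
        chunk = samples[start:start + window_len]
        energies.append(sum(x * x for x in chunk) / window_len)
    return energies


# Hypothetical input: 1 s of constant-amplitude signal sampled at 1 kHz.
signal = [1.0] * 1000
short_windows = integrate_in_windows(signal, 1000, 25)   # short windows
long_windows = integrate_in_windows(signal, 1000, 200)   # long windows
print(len(short_windows), len(long_windows))  # 40 5
```

The same second of input yields 40 values under the short window and 5 under the long one, which is the sense in which time is treated as a series of chunks rather than as a continuous variable.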
![Page 28: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/28.jpg)
Asymmetric sampling/quantization of the speech waveform
short temporal integration windows: 25 ms
long temporal integration windows: 200 ms
Example: “This paper is hard to publish”
![Page 29: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/29.jpg)
Two spectrograms of the same word illustrate how different analysis windows highlight different aspects of the sounds.
(a) High time, low frequency resolution: each glottal pulse is visible as a vertical striation
(b) Low time, high frequency resolution: each harmonic is visible as a horizontal stripe
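The trade-off between the two spectrograms follows from the analysis window length: a window of N samples spans N/fs seconds and separates frequencies about fs/N Hz apart. The sketch below works this out for two window lengths. The 16 kHz sample rate and the 5 ms and 50 ms windows are assumed values for illustration, not the settings used for the figure.

```python
def analysis_resolution(window_ms, sample_rate_hz):
    """Return (samples per window, time resolution in ms,
    frequency resolution in Hz) for a given analysis window."""
    n = round(sample_rate_hz * window_ms / 1000)
    time_res_ms = 1000 * n / sample_rate_hz
    freq_res_hz = sample_rate_hz / n
    return n, time_res_ms, freq_res_hz


for window_ms in (5, 50):  # assumed short and long analysis windows
    n, dt, df = analysis_resolution(window_ms, 16000)
    print(f"{window_ms} ms window: {n} samples, ~{dt:.0f} ms, ~{df:.0f} Hz")
```

For a voice with a fundamental near 100 Hz (again an assumed figure), glottal pulses arrive about every 10 ms and harmonics lie about 100 Hz apart. The 5 ms window is shorter than the pulse period but too coarse in frequency (about 200 Hz) to separate harmonics; the 50 ms window resolves harmonics (about 20 Hz) but smears individual pulses.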
![Page 30: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/30.jpg)
Hypothesis: Asymmetric Sampling in Time (AST)
Left temporal cortical areas preferentially extract information over 25 ms temporal integration windows.
Right hemisphere areas preferentially integrate over long, 150-250 ms integration windows.
By assumption, the auditory input signal has a neural representation that is bilaterally symmetric (e.g. at the level of core); beyond the initial representation, the signal is elaborated asymmetrically in the time domain.
Another way to conceptualize the AST proposal is to say that the sampling rate of non-primary auditory areas is different, with LH sampling at high frequencies (~40 Hz) and RH sampling at low frequencies (4-10 Hz).
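The link between window size and sampling rate can be made explicit with a small worked example. It assumes the simplest possible reading, in which one integration window corresponds to one cycle, so the rate is the reciprocal of the window duration; the function name is hypothetical.

```python
def window_to_rate_hz(window_ms):
    """Rate (Hz) of a process that completes one window every window_ms."""
    return 1000.0 / window_ms


for window_ms in (25, 150, 250):
    print(f"{window_ms} ms -> {window_to_rate_hz(window_ms):.1f} Hz")
# 25 ms -> 40.0 Hz
# 150 ms -> 6.7 Hz
# 250 ms -> 4.0 Hz
```

Under this reading, 25 ms maps onto the ~40 Hz figure and 250 ms onto 4 Hz. The 150-250 ms range maps onto roughly 4-7 Hz, so the quoted upper bound of 10 Hz would correspond to a window of about 100 ms.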
![Page 31: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/31.jpg)
a. Physiological lateralization
Symmetric representation of spectro-temporal receptive fields in primary auditory cortex
Temporally asymmetric elaboration of perceptual representations in non-primary cortex
[Figure: proportion of neuronal ensembles as a function of the size of temporal integration windows (ms) [associated oscillatory frequency (Hz)], from 25 ms [40 Hz] to 250 ms [4 Hz], for LH and RH]
b. Functional lateralization
LH: analyses requiring high temporal resolution, e.g. formant transitions
RH: analyses requiring high spectral resolution, e.g. intonation contours
![Page 32: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/32.jpg)
Asymmetric sampling in time (AST) characteristics
• AST is an example of functional segregation, a standard concept.
• AST is an example of multi-resolution analysis, a signal processing strategy common in other cortical domains (cf. visual areas MT and V4 which, among other differences, have phasic versus tonic firing properties, respectively).
• AST speaks to the “granularity” of perceptual representations: the model suggests that there exist basic perceptual representations that correspond to the different temporal windows (e.g. featural info is equally basic to the envelope of syllables, on this view).
• The AST model connects in plausible ways to the local versus global distinction: there are multiple representations of a given signal on different scales (cf. wavelets)
Global ==> ‘large-chunk’ analysis, e.g., syllabic level
Local ==> ‘small-chunk’ analysis, e.g., subsegmental level
![Page 33: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/33.jpg)
a. Physiological lateralization
Symmetric representation of spectro-temporal receptive fields in primary auditory cortex
Temporally asymmetric elaboration of perceptual representations in non-primary cortex
[Figure: proportion of neuronal ensembles as a function of the size of temporal integration windows (ms) [associated oscillatory frequency (Hz)], from 25 ms [40 Hz] to 250 ms [4 Hz], for LH and RH]
b. Functional lateralization
LH: analyses requiring high temporal resolution, e.g. formant transitions
RH: analyses requiring high spectral resolution, e.g. intonation contours
![Page 34: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/34.jpg)
Outline
(1) Fractionating the problem in space:
Towards a functional anatomy of speech perception
(2) Fractionating the problem in time:
Towards a functional physiology of speech perception
- A hypothesis about the quantization of time
• AST model
- Psychophysical evidence for temporal integration
- Imaging evidence
![Page 35: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/35.jpg)
Perception of FM sweeps
Huan Luo, Mike Gordon, Anthony Boemio, David Poeppel
![Page 36: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/36.jpg)
FM Sweep Example
[Figure: waveform (amplitude -0.2 to 0.2) and spectrogram (0-5000 Hz) over 0-0.08 s]
80 ms linear FM sweep, from 3 to 2 kHz
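A stimulus of this kind can be synthesized by integrating a linearly changing instantaneous frequency into a phase. The following is a minimal sketch, not the authors' stimulus code: the 44.1 kHz sample rate is an assumption, the 0.2 peak amplitude is read off the waveform axis, and the function names are hypothetical.

```python
import math


def linear_fm_sweep(f_start_hz, f_end_hz, duration_s, sample_rate_hz,
                    amplitude=0.2):
    """Samples of a tone whose frequency moves linearly from
    f_start_hz to f_end_hz over duration_s."""
    n = int(round(duration_s * sample_rate_hz))
    k = (f_end_hz - f_start_hz) / duration_s  # sweep rate in Hz per second
    samples = []
    for i in range(n):
        t = i / sample_rate_hz
        # Phase is the time integral of 2*pi*(f_start + k*t).
        phase = 2 * math.pi * (f_start_hz * t + 0.5 * k * t * t)
        samples.append(amplitude * math.sin(phase))
    return samples


def instantaneous_frequency(f_start_hz, f_end_hz, duration_s, t):
    """Frequency (Hz) of the linear sweep at time t (seconds)."""
    return f_start_hz + (f_end_hz - f_start_hz) * t / duration_s


# Downward sweep as on the slide: 3 kHz to 2 kHz in 80 ms.
sweep = linear_fm_sweep(3000, 2000, 0.080, 44100)
midpoint_hz = instantaneous_frequency(3000, 2000, 0.080, 0.040)
```

Swapping the start and end frequencies gives the corresponding upward sweep with the same duration and frequency span.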
![Page 37: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/37.jpg)
The rationale
• Important cues for speech perception: formant transitions in speech sounds (for example, F2 direction can distinguish /ba/ from /da/)
• Importance in tone languages
• Vertebrate auditory system is well equipped to analyze FM signals.
![Page 38: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/38.jpg)
Tone languages
• For example, Chinese, Thai…
• The direction of FM (of the fundamental frequency) is important in the language to make lexical distinctions.
• (Four tones in Chinese)
/Ma 1/, /Ma 2/ , /Ma 3/, /Ma 4/
![Page 39: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/39.jpg)
Questions
• How good are we at discriminating these signals? Determine the threshold of stimulus duration (corresponding to rate) for the detection of FM direction.
• Is there any performance difference between UP and DOWN detection?
• Will language experience affect the performance of such a basic perceptual ability?
![Page 40: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/40.jpg)
Stimuli
• Linearly frequency modulated
• Frequency range studied: 2-3 kHz (0.5 oct)
• Two directions (Up / Down)
• FM rate (frequency range/time) is changed by changing duration. For each frequency range, the frequency span is kept constant (slow / fast)
• Stimulus duration: from 5 ms (100 oct/sec) to 640 ms (0.8 oct/sec)
Tasks
• Detection and discrimination of UP versus DOWN
• 2AFC, 2IFC, 3IFC
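Because the frequency span is fixed and only duration varies, FM rate is simply span divided by duration. The sketch below reproduces the rate/duration pairing under the nominal 0.5 octave span quoted on the slide; the function names are hypothetical. As a side note, the exact span of 2-3 kHz is log2(1.5), about 0.585 octaves, so the quoted rates appear to use the rounded 0.5 value.

```python
import math


def octave_span(f_low_hz, f_high_hz):
    """Width of a frequency range in octaves."""
    return math.log2(f_high_hz / f_low_hz)


def fm_rate_oct_per_sec(span_oct, duration_ms):
    """FM rate when a fixed span is traversed in duration_ms."""
    return span_oct / (duration_ms / 1000.0)


NOMINAL_SPAN_OCT = 0.5  # value quoted on the slide
for duration_ms in (5, 10, 20, 40, 80, 160, 320, 640):
    rate = fm_rate_oct_per_sec(NOMINAL_SPAN_OCT, duration_ms)
    print(f"{duration_ms:>3} ms -> {rate:.1f} oct/sec")
```

Halving the duration doubles the rate, which is why the shortest stimuli are also the fastest sweeps.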
![Page 41: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/41.jpg)
Performance
[Figure: % correct (30-100%) as a function of FM rate (oct/sec) [stimulus duration (ms)], at 100[5], 50[10], 25[20], 16.7[30], 12.5[40], 10[50], 6.2[80], 3.1[160], 1.6[320] and 0.8[640], for Up and Down sweeps; three panels: 2-3 kHz, 1-1.5 kHz, 600-900 Hz]
English speakers
• 3 frequency ranges relevant to speech (approximately F1, F2, F3 ranges)
• single-interval 2-AFC
Two main findings:
• threshold for UP at 20 ms
• UP better than DOWN
Gordon & Poeppel (2001), JASA-ARLO
![Page 42: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/42.jpg)
2IFC
• To eliminate the possibility of a bias strategy that subjects can use
• To see whether the asymmetric performance of English subjects is due to their “Up preference bias”
Interval 1: UP; Interval 2: Down
Which interval (1 or 2) contains the sound with a given direction?
The two sounds have the same duration, so the only difference is direction.
![Page 43: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/43.jpg)
Results for Chinese Subjects
[Figure: Expt. 2 (Chinese subjects): percent correct (0-100%) as a function of duration (5, 10, 20, 30, 40, 50, 80, 160, 320 ms) for Up and Down]
No significant difference between Up and Down
Threshold for both UP and DOWN is about 20 ms
![Page 44: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/44.jpg)
Results for English Subjects
[Figure: Expt. 2 (English subjects): percent correct (0-100%) as a function of duration (5, 10, 20, 30, 40, 50, 80, 160, 320 ms) for up and down]
No difference now between UP and DOWN
Threshold for both at 20 ms
No difference between Chinese and English subjects now.
![Page 45: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/45.jpg)
3IFC
Standard: UP; Interval 1: UP; Interval 2: Down
Choose which interval contains the sound that differs from the other two (different quality rather than only direction)
![Page 46: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/46.jpg)
3IFC versus 2IFC
[Figure: Expt. 3 vs. Expt. 2 (English speakers): percent correct (0-100%) as a function of duration (5, 10, 20, 30, 40, 50, 80, 160, 320 ms) for up, down and 3-interval difference detection]
[Figure: Expt. 3 vs. Expt. 2 (Chinese speakers): percent correct (0-100%) as a function of duration (5, 10, 20, 30, 40, 50, 80, 160, 320 ms) for Up, Down and difference detection]
No difference between Chinese and English subjects
Threshold confirmed at 20 ms
![Page 47: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/47.jpg)
Conclusion
• Importance of 20 msec as the threshold for discrimination of FM sweeps
- corresponds to temporal order threshold determined by Hirsh 1959
- consistent with Schouten 1985, 1989 testing FM sweeps
- this basic threshold arguably reflects the shortest integration window that generates robust auditory percepts.
![Page 48: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/48.jpg)
Click trains
Anthony Boemio & David Poeppel
![Page 49: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/49.jpg)
Click Stimuli
![Page 50: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/50.jpg)
Psychophysics
![Page 51: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/51.jpg)
Auditory visual integration: the McGurk effect
Virginie van Wassenhove, Ken Grant, David Poeppel
![Page 52: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/52.jpg)
McGurk Effect
• Audiovisual (AV) token
• Visual (V) token
• Auditory (A) token
![Page 53: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/53.jpg)
Identification Task (3AFC) ApVk
[Figure: response rate (%), -40% to 100%, as a function of SOA (ms) from -467 (A lead) to +467 (A lag): -467, -400, -333, -267, -200, -133, -67, 0, 67, 133, 200, 267, 333, 400, 467; series: fusion rate, visually driven, auditorily driven, corrected fusion rate; annotations: true bimodal responses, TWI]
Response rate as a function of SOA (ms) in the ApVk McGurk pair.
Mean responses (N=21) and standard errors. Fusion rate (open red squares) and corrected fusion rate (filled red squares, dotted line) are /ta/ responses, visually driven responses (open green triangles) are /ka/, and auditorily driven responses (filled blue circles) are /pa/. A negative value in corrected fusion rate is interpreted as a visually dominated error response /ta/.
![Page 54: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/54.jpg)
Simultaneity Judgment Task (2AFC) ApVk vs. AtVt and AbVg vs. AdVd
[Figure: simultaneity rate (%), 0-100%, as a function of SOA (ms) from -467 (A lead) to +467 (A lag): -467, -400, -333, -267, -200, -133, -67, 0, 67, 133, 200, 267, 333, 400, 467; series: ApVk, AtVt, AbVg, AdVd]
Simultaneity judgment task. Simultaneity judgment as a function of SOA (ms) in both incongruent and congruent conditions (ApVk and AtVt N=21; AbVg and AdVd N=18). The congruent conditions (open symbols) are associated with a broader and higher simultaneity judgment profile than the incongruent conditions (filled symbols).
![Page 55: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/55.jpg)
Temporal Window of Integration (TWI) across Tasks and Bimodal Speech Stimuli
[Figure: TWI extents on an SOA axis from -200 ms (A lead) to +200 ms (A lag) for AdVd, AtVt, AbVg and ApVk in the S and ID tasks]

| Stimulus | Task | A Lead, Left Boundary (ms) | A Lag, Right Boundary (ms) | Plateau Center (ms) | Window Size (ms) |
|---|---|---|---|---|---|
| ApVk | ID | -25 | +136 | +56 | 161 |
| ApVk | S | -44 | +117 | +37 | 161 |
| AtVt | S | -80 | +125 | +23 | 205 |
| AbVg | ID | -34 | +174 | +70 | 208 |
| AbVg | S | -37 | +122 | +43 | 159 |
| AdVd | S | -74 | +131 | +29 | 205 |
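The last two columns of the table appear to follow from the boundaries: window size matches the right boundary minus the left, and plateau center matches their midpoint. The sketch below checks this against the tabulated values. The midpoint rule and the round-half-up convention are assumptions that happen to reproduce every row; the slides do not say how the values were derived.

```python
import math

# (stimulus, task, left boundary ms, right boundary ms,
#  reported plateau center ms, reported window size ms), from the table.
ROWS = [
    ("ApVk", "ID", -25, 136, 56, 161),
    ("ApVk", "S", -44, 117, 37, 161),
    ("AtVt", "S", -80, 125, 23, 205),
    ("AbVg", "ID", -34, 174, 70, 208),
    ("AbVg", "S", -37, 122, 43, 159),
    ("AdVd", "S", -74, 131, 29, 205),
]


def window_summary(left_ms, right_ms):
    """Return (window size, plateau center) for a pair of boundaries,
    with the center taken as the midpoint rounded half up (assumed)."""
    size = right_ms - left_ms
    center = math.floor((left_ms + right_ms) / 2 + 0.5)
    return size, center


for stimulus, task, left, right, center, size in ROWS:
    assert window_summary(left, right) == (size, center)
    print(f"{stimulus} {task}: size {size} ms, center +{center} ms")
```

Every center is positive, so each window is shifted toward audio lag: a given amount of auditory delay is tolerated better than the same amount of auditory lead.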
![Page 56: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/56.jpg)
Outline
(1) Fractionating the problem in space:
Towards a functional anatomy of speech perception
(2) Fractionating the problem in time:
Towards a functional physiology of speech perception
- A hypothesis about the quantization of time
• AST model
- Psychophysical evidence for temporal integration
• FM sweeps and click trains: 20-30 ms integration
• AV processing in McGurk: 200 ms integration
- Imaging evidence
![Page 57: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/57.jpg)
Binding of Temporal Quanta in Speech Processing
Maria Chait, Steven Greenberg, Takayuki Arai, David Poeppel
![Page 58: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/58.jpg)
Multi Resolution Analysis Hypothesis
“SYLLABLE”
Supra-segmental information (t.s ~300 ms)
(Sub)-segmental information (t.s ~30 ms)
syllabicity stress tone feature
Binding process
![Page 59: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/59.jpg)
Signal Processing:
[Diagram: Original → Filtering into frequency bands (labels shown: 0-265 Hz, 265-315 Hz, ..., 5045-6000 Hz) → Computing the envelope and fine structure of each band (E1, FS1; E2, FS2; ...; E14, FS14) → Low Pass Filter of each envelope (Low Pass E1 ... E14, 0-3 Hz) or high pass of each envelope (High Pass E1 ... E14, 22- Hz) → Multiply E by FS (E1×FS1, E2×FS2, ..., E14×FS14) → S_low (from the low-passed envelopes) and S_high (from the high-passed envelopes)]
![Page 60: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/60.jpg)
• 0-6 kHz
• 14 channels
• spaced in 1/3 octave steps along the cochlear frequency map
• Every two neighboring channels are separated by 50 Hz
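The processing chain on these slides (split into bands, separate each band into envelope and fine structure, filter the envelope, multiply it back onto the fine structure) can be sketched for a single band. This is an illustration under stated assumptions, not the authors' implementation: it uses an analytic-signal (Hilbert-style) decomposition and brick-wall filtering, a naive DFT to stay dependency-free where an FFT library would normally be used, and a synthetic 100 Hz carrier with 2 Hz and 30 Hz modulation at a 400 Hz sample rate. All names and test values are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (fine for short demos)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n)
                for k in range(n)) for m in range(n)]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * math.pi * m * k / n)
                for m in range(n)) / n for k in range(n)]


def envelope_and_fine_structure(band):
    """Analytic-signal split so that band[k] == envelope[k] * fine[k]."""
    n = len(band)
    spectrum = dft(band)
    for m in range(n):
        if 0 < m < n / 2:
            spectrum[m] *= 2   # keep positive frequencies, doubled
        elif m > n / 2:
            spectrum[m] = 0    # discard negative frequencies
    analytic = idft(spectrum)
    envelope = [abs(z) for z in analytic]
    fine = [math.cos(cmath.phase(z)) for z in analytic]
    return envelope, fine


def band_limit(x, sample_rate_hz, low_hz, high_hz):
    """Brick-wall filter keeping components with low_hz <= |f| <= high_hz."""
    n = len(x)
    spectrum = dft(x)
    for m in range(n):
        freq = min(m, n - m) * sample_rate_hz / n
        if not (low_hz <= freq <= high_hz):
            spectrum[m] = 0
    return [z.real for z in idft(spectrum)]


# Synthetic single band: 100 Hz carrier with slow and fast modulation.
fs, n = 400, 400
t = [k / fs for k in range(n)]
band = [(1 + 0.5 * math.cos(2 * math.pi * 2 * tk)
         + 0.3 * math.cos(2 * math.pi * 30 * tk))
        * math.cos(2 * math.pi * 100 * tk) for tk in t]

envelope, fine = envelope_and_fine_structure(band)
env_low = band_limit(envelope, fs, 0, 3)     # slow modulations only
env_high = band_limit(envelope, fs, 22, 40)  # fast modulations only
s_low = [e * f for e, f in zip(env_low, fine)]
s_high = [e * f for e, f in zip(env_high, fine)]
```

In the multi-band case the same steps would be applied to each of the 14 channels and the per-channel products combined to form S_low and S_high.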
![Page 61: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/61.jpg)
Envelope Extraction
Time
Amplitude
![Page 62: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/62.jpg)
Original Envelope
Low Passed Envelope
High Passed Envelope
![Page 63: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/63.jpg)
Original
High Passed Low Passed
![Page 64: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/64.jpg)
Evidence:
• Comodulation masking release
• Ahissar et al. (2001): phase locking in the auditory cortex to the envelope of sentence stimuli
• Shannon (1995)
• Drullman (1994):
Effect of low-pass filtering the envelope on speech reception:
* severe reduction at 0-2 Hz cutoff frequencies
* marginal contribution of frequencies above 16 Hz
Effect of high-pass filtering the envelope:
* reduction in speech intelligibility for cutoff frequencies above 64 Hz
* no reduction in sentence intelligibility when only frequencies below 4 Hz are reduced
![Page 65: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/65.jpg)
Experiment 1
Stimuli:
- 53 sentences from the IEEE corpus
- Nonsense syllables (CUNY): 8 blocks, 2 (voiced/voiceless) × 2 vowels (/a/, /i/) × 2 (CV/VC)
- 3 manipulations: 0-3 Hz low pass; 22-40 Hz band pass; 0-3 and 22-40 Hz
Each subject hears all 53 sentences but only one manipulation per sentence. A practice block of 26 sentences precedes the experiment.
Task:
- Sentences: subjects are asked to write down what they heard as precisely as they can
- Syllables: 7-alternative forced choice
Presented dichotically
![Page 66: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/66.jpg)
Results
[Figure: performance (0-100%) for the 0-3 Hz, 22-40 Hz and dichotic conditions; annotation: high-pass]
![Page 67: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/67.jpg)
Results
[Bar chart: y-axis 0% to 100%; conditions 0-3, 22-40, Dichotic; labels: high-pass, low-pass]
![Page 68: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/68.jpg)
Results
[Bar chart: y-axis 0% to 100%; conditions 0-3, 22-40, Dichotic; labels: high-pass, low-pass, high-pass plus low-pass?]
![Page 69: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/69.jpg)
Results
[Bar chart: y-axis 0% to 100%; conditions 0-3, 22-40, Dichotic; labels: high-pass, low-pass, high-pass plus low-pass?]
Result reflects the interaction between information carried on the short and long time scales.
![Page 70: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/70.jpg)
Outline
(1) Fractionating the problem in space:
Towards a functional anatomy of speech perception
(2) Fractionating the problem in time:
Towards a functional physiology of speech perception
- A hypothesis about the quantization of time
  • AST model
- Psychophysical evidence for temporal integration
  • FM sweeps and click trains: 20-30 ms integration
  • AV processing in McGurk: 200 ms integration
  • Interaction of temporal windows
- Imaging evidence
![Page 71: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/71.jpg)
fMRI study of temporal structure in concatenated FMs
Anthony Boemio, Allen Braun, Steven Fromm, David Poeppel
![Page 72: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/72.jpg)
Stimulus Properties
![Page 73: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/73.jpg)
Stimulus Properties
All 13 stimuli have nearly identical long-term spectra and RMS power over the entire 9-second stimulus duration. Stimuli differ only in segment duration, which was determined by drawing from a Gaussian distribution (previous panel) with means of 12, 25, 45, 85, 160, and 300 ms.
[Figure: spectrograms, amplitude vs. time, and PSDs for the FM, CNST, and TONE stimuli; time axis 0 to 1 s; frequency axis 100 Hz to 1E4 Hz; scale 1E-10 to 1]
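The segment-duration scheme described above can be sketched as follows. The slide gives only the Gaussian means and the 9-second total; the standard deviation (here 10% of the mean), the rejection of non-positive draws, and the truncation of the final segment are assumptions made purely for illustration.

```python
import random

MEANS_MS = [12, 25, 45, 85, 160, 300]  # mean segment durations from the slide
TOTAL_MS = 9000                        # 9-second stimulus duration from the slide

def draw_segments(mean_ms, total_ms=TOTAL_MS, rel_sd=0.1, rng=None):
    """Draw Gaussian segment durations until the stimulus is filled.

    rel_sd (standard deviation as a fraction of the mean) is a hypothetical
    parameter; the slide does not state the spread of the distribution.
    """
    rng = rng or random.Random(0)
    durations = []
    remaining = total_ms
    while remaining > 0:
        d = rng.gauss(mean_ms, rel_sd * mean_ms)
        if d <= 0:
            continue               # reject non-positive durations
        d = min(d, remaining)      # truncate the last segment to fit
        durations.append(d)
        remaining -= d
    return durations

segments_per_condition = {m: len(draw_segments(m)) for m in MEANS_MS}
```

Shorter mean durations yield many more segments, and therefore many more segment transitions, within the same 9 seconds.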
![Page 74: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/74.jpg)
fMRI
• Single-trial sparse acquisition paradigm (clustered volume acquisition)
• 1.5T GE Signa, echo-planar sequence
• 11.4 s TR (9 s signal, 2.4 s volume), TE 40 ms
• 24 reps/condition
• SPM 99 random-effects model, p < 0.05 corrected
![Page 75: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/75.jpg)
SPM 99 Cohort Analysis
FMs-CNST Categorical Contrasts (p < 0.05 corr.)
![Page 76: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/76.jpg)
Mean Supra-threshold Voxels vs. Segment Duration, Summed Over All Conditions/Auditory Areas/Hemispheres
[Plot: y-axis 0 to 200; x-axis segment duration (ms): 12, 25, 45, 85, 160, 300; error bars are SEM]
![Page 77: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/77.jpg)
![Page 78: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/78.jpg)
Hemodynamic response/stimulus model
Not all segment transitions are equal.
Including the segment transitions and the segments themselves, but assuming that transitions between long segments contribute more to the response than transitions between shorter ones, produces the observed activation vs. segment-duration relation (left).
[Figure annotations: acquisition; segment; segment transition; FM/TONE; CNST. Threshold set by categorical contrast to the CNST stimulus: anything below this level will be zero in the SPM. Only 1 second of stimuli is shown for clarity.]
![Page 79: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/79.jpg)
STS Only
[Plots: supra-threshold voxels vs. SOA (ms) at 12, 25, 45, 85, 160, 300; y-axes 0 to 200 and 0.00 to 0.75]

MTG/STS               P-Value
Type                  0.5994
Hemi                  0.3127
Rate                  <.0001
Type x Hemi           0.9396
Type x Rate           0.3772
Hemi x Rate           0.0034
Type x Hemi x Rate    0.4137

STG Only
[Plots: supra-threshold voxels vs. SOA (ms) at 12, 25, 45, 85, 160, 300; y-axes 0 to 450 and 0.00 to 0.75; legend: Left Hemi, Right Hemi]

STG                   P-Value
Type                  0.0933
Hemi                  0.7514
Rate                  <.0001
Type x Hemi           0.8152
Type x Rate           0.0578
Hemi x Rate           0.7211
Type x Hemi x Rate    0.3209
![Page 80: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/80.jpg)
MEG study of spectral responses to complex sounds
David Poeppel, Huan Luo, Dana Ritter, Anthony Boemio, Didier Depireux, Jonathan Simon
![Page 81: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/81.jpg)
Asymmetric sampling in time (AST) hypothesis predicts electrophysiological asymmetries in specific frequency bands, gamma (25-55 Hz) and theta (3-8 Hz) ...
... because the hypothesized temporal quantization is reflected as oscillatory activity.
[Figure: sensitivity of neuronal ensembles in LH and RH; x-axis: size of temporal integration windows (ms) [associated oscillatory frequency (Hz)]: 25 ms [40 Hz] and 250 ms [4 Hz]]
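The pairing of window sizes with oscillatory frequencies follows from treating one integration window as one oscillation period, so that frequency in Hz is 1000 divided by the window length in ms. A minimal sketch of that arithmetic, using the band limits quoted on the slide (the function names are illustrative only):

```python
GAMMA_HZ = (25.0, 55.0)  # gamma band limits as given on the slide
THETA_HZ = (3.0, 8.0)    # theta band limits as given on the slide

def window_to_frequency_hz(window_ms):
    """One integration window per cycle: f = 1000 / T(ms)."""
    return 1000.0 / window_ms

def band_name(freq_hz):
    """Report which of the two slide bands a frequency falls in."""
    if GAMMA_HZ[0] <= freq_hz <= GAMMA_HZ[1]:
        return "gamma"
    if THETA_HZ[0] <= freq_hz <= THETA_HZ[1]:
        return "theta"
    return "outside both bands"

for window_ms in (25, 250):
    freq = window_to_frequency_hz(window_ms)
    print(f"{window_ms} ms window -> {freq:g} Hz ({band_name(freq)})")
```

The 25 ms window maps to 40 Hz, inside the gamma band, and the 250 ms window maps to 4 Hz, inside the theta band.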
![Page 82: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/82.jpg)
![Page 83: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/83.jpg)
![Page 84: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/84.jpg)
![Page 85: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/85.jpg)
Flow chart
LH, RH → gamma band-pass filter → RMS → gamma for LH, gamma for RH
LH, RH → theta band-pass filter → RMS → theta for LH, theta for RH
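The flow chart's filter-then-RMS pipeline can be sketched under strong simplifying assumptions. Instead of a designed band-pass filter, this sketch uses an ideal brick-wall band computed with a naive DFT, and obtains the RMS of the band-limited signal from the in-band spectral power (Parseval's relation). The LH and RH signals are synthetic and hypothetical; only the band limits come from the slides.

```python
import cmath
import math

BANDS_HZ = {"gamma": (25.0, 55.0), "theta": (3.0, 8.0)}  # from the AST slide

def band_rms(signal, fs, lo, hi):
    """RMS of the signal restricted to [lo, hi] Hz (ideal brick-wall band)."""
    n = len(signal)
    power = 0.0
    for k in range(n):
        f = min(k, n - k) * fs / n  # bin frequency, folding negative frequencies
        if lo <= f <= hi:
            X = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
            power += abs(X) ** 2
    return math.sqrt(power) / n     # Parseval: RMS^2 = sum|X_k|^2 / n^2

FS = 200  # Hz, hypothetical sampling rate
N = 200   # 1 second of data

def synthetic_channel(gamma_amp, theta_amp):
    """Hypothetical channel: a 40 Hz component plus a 5 Hz component."""
    return [gamma_amp * math.sin(2 * math.pi * 40 * t / FS)
            + theta_amp * math.sin(2 * math.pi * 5 * t / FS)
            for t in range(N)]

channels = {"LH": synthetic_channel(1.0, 0.5), "RH": synthetic_channel(1.0, 1.0)}

rms = {(band, hemi): band_rms(sig, FS, lo, hi)
       for band, (lo, hi) in BANDS_HZ.items()
       for hemi, sig in channels.items()}
```

The result is one RMS value per band and hemisphere, matching the four outputs of the flow chart.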
![Page 86: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/86.jpg)
![Page 87: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/87.jpg)
Multi-taper spectral analysis
![Page 88: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/88.jpg)
Result
![Page 89: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/89.jpg)
Power ratio in specific frequency bands
• The difference is much greater in the theta band (low-frequency band), and RH activation in the theta band is greater than LH
Power ratio P(L)/(P(L)+P(R)) by filter type:
         Kaiser    Remez     Elliptic
Gamma    0.4769    0.4751    0.4733
Theta    0.3958    0.3965    0.4210
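To read the table: the ratio P(L)/(P(L)+P(R)) equals 0.5 when both hemispheres carry equal power and drops below 0.5 when RH power is larger. A minimal sketch that applies this to the slide's values (the helper names are illustrative; the numbers are copied from the table, one per filter design in slide order):

```python
def power_ratio(p_left, p_right):
    """Left-hemisphere share of total power: P(L) / (P(L) + P(R))."""
    return p_left / (p_left + p_right)

def right_to_left(ratio):
    """Convert a left-share ratio into the implied P(R)/P(L) ratio."""
    return (1.0 - ratio) / ratio

# Values from the slide, one per filter design, in the order shown there.
SLIDE_RATIOS = {
    "gamma": [0.4769, 0.4751, 0.4733],
    "theta": [0.3958, 0.3965, 0.4210],
}

for band, ratios in SLIDE_RATIOS.items():
    for r in ratios:
        # Distance below 0.5 measures the rightward asymmetry.
        print(band, r, round(0.5 - r, 4), round(right_to_left(r), 2))
```

Every value lies below 0.5, and for each filter design the theta ratio sits further from 0.5 than the gamma ratio, which is the pattern the bullet above describes.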
![Page 90: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/90.jpg)
Distribution of spectral responses
![Page 91: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/91.jpg)
Outline
(1) Fractionating the problem in space:
Towards a functional anatomy of speech perception
(2) Fractionating the problem in time:
Towards a functional physiology of speech perception
- A hypothesis about the quantization of time
  • AST model
- Psychophysical evidence for temporal integration
  • FM sweeps and click trains: 20-30 ms integration
  • AV processing in McGurk: 200 ms integration
  • Interaction of temporal windows
- Imaging evidence
  • fMRI: temporal sensitivity and lateralization
  • MEG spectral lateralization
![Page 92: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/92.jpg)
STG (bilateral): acoustic-phonetic speech codes
pMTG (left): sound-meaning interface
Area Spt (left): auditory-motor interface
pIFG/dPM (left): articulatory-based speech codes
Hickok & Poeppel (2000), Trends in Cognitive Sciences
Hickok & Poeppel (in press), Cognition
![Page 93: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/93.jpg)
Asymmetric sampling in time (AST) builds on anatomical symmetry but permits functional asymmetry
Symmetric representation of spectro-temporal receptive fields in primary auditory cortex
Temporally asymmetric elaboration of perceptual representations in non-primary cortex
a. Physiological lateralization
[Figure: proportion of neuronal ensembles in LH and RH; x-axis: size of temporal integration windows (ms) [associated oscillatory frequency (Hz)]: 25 ms [40 Hz] and 250 ms [4 Hz]]
b. Functional lateralization
LH: analyses requiring high temporal resolution, e.g. formant transitions
RH: analyses requiring high spectral resolution, e.g. intonation contours
![Page 94: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/94.jpg)
Conclusion
The input signal (e.g. speech) must interface with higher-order symbolic representations of different types (e.g. segmental representations relevant to lexical access and supra-segmental representations relevant to interpretation).
These higher-order representation categories appear to be lateralized (e.g. segmental phonology/LH, phrasal prosody/RH).
The timing-based asymmetry provides a possible cortical ‘logistical’ or ‘administrative’ device that helps create representations of the appropriate granularity.
If this is on the right track, the syllable is, at least for perception, as elementary a unit as the feature/segment. Both are basic.
![Page 95: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/95.jpg)
Analysis-by-synthesis I
Hypothesize-and-test models
[Flow chart labels]
Analysis
Synthesis
Peripheral auditory processing
Segmentation and labeling
Spectral representation
Lexical access code
Long-term memory: abstract lexical repr.
Recoding
Acoustic-phonetic manifestations of words
Contextual information
MATCHING PROCESS
BEST LEXICAL CANDIDATE
Where do the candidates for synthesis come from?
![Page 96: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/96.jpg)
Analysis-by-synthesis II
Analysis-by-synthesis model of lexical hypothesis generation and verification (adapted and extended from Klatt, 1979)
spectral analysis
analysis-by-synthesis verification;
“internal forward model”
speech waveform
segmental analysis
lexical search
synt./seman. analysis
peripheral and central ‘neurogram’
partial feature matrix
lexical hypotheses
predicted subsequent items
best-scoring lexical candidates
acceptable word string
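The hypothesize-and-verify loop named on this slide can be illustrated with a toy sketch. It is not Klatt's model or the speaker's: the lexicon, the feature labels, the use of the first segment as the search cue, and the overlap score are all hypothetical choices made only to show the control flow from a partial feature matrix to a best-scoring candidate.

```python
# Hypothetical toy lexicon: each word is a list of segments,
# and each segment is a set of (made-up) feature labels.
LEXICON = {
    "soon": [{"fricative", "voiceless"}, {"vowel", "high", "back"}, {"nasal", "voiced"}],
    "sun":  [{"fricative", "voiceless"}, {"vowel", "mid", "central"}, {"nasal", "voiced"}],
    "moon": [{"nasal", "voiced"}, {"vowel", "high", "back"}, {"nasal", "voiced"}],
}

def lexical_search(partial, lexicon):
    """Hypothesis generation: words whose first segment fits the first observed segment."""
    return [word for word, segments in lexicon.items()
            if len(segments) == len(partial) and partial[0] <= segments[0]]

def verify(partial, word, lexicon):
    """Synthesis and matching: predict the word's features, score overlap with the input."""
    predicted = lexicon[word]
    matched = sum(len(obs & pred) for obs, pred in zip(partial, predicted))
    observed = sum(len(obs) for obs in partial)
    return matched / observed

def best_candidate(partial, lexicon):
    """Return the best-scoring hypothesis and all candidate scores."""
    hypotheses = lexical_search(partial, lexicon)
    scores = {word: verify(partial, word, lexicon) for word in hypotheses}
    return max(scores, key=scores.get), scores

# Partial feature matrix: only some features of each segment were recovered.
partial_matrix = [{"fricative"}, {"vowel", "high"}, {"nasal"}]
winner, scores = best_candidate(partial_matrix, LEXICON)
```

Here the search step proposes "soon" and "sun" from the first segment alone, and the verification step prefers "soon" because its predicted vowel features match more of the input.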
![Page 97: Colleagues : Allen Braun, NIH Greg Hickok, UC Irvine Jonathan Simon, Univ. Maryland](https://reader036.vdocuments.net/reader036/viewer/2022062321/56812d14550346895d91f417/html5/thumbnails/97.jpg)
Analysis-by-synthesis III
spectral analysis
analysis-by-synthesis verification;
“internal forward model”
speech waveform
segmental analysis
lexical search
synt./seman. analysis
peripheral and central ‘neurogram’
partial feature matrix
lexical hypotheses
predicted subsequent items
best-scoring lexical candidates
acceptable word string
auditory cortex
pSTG? MTG? ITG?
frontal areas (articulatory codes): left IFG, premotor; temporo-parietal areas?