Cross-modal Prediction in Speech Perception
Carolina Sánchez, Agnès Alsius, James T. Enns & Salvador Soto-Faraco
Multisensory Research Group
Universitat Pompeu Fabra
Barcelona
Background
Auditory + visual performance
MSI enhancement
Visual + Auditory
Improve Speech Perception
Multisensory Integration
Background
• Prediction within one sensory modality
• Many levels of information processing
– Phonological prediction: “This morning I went to the library and borrowed a … book” (DeLong, 2005; Pickering, 2007)
– Visual prediction: Visual search (Enns, 2008; Dambacher, 2009)
– Sensorimotor prediction: forward model (Wolpert, 1997)
Predictive coding
Pickering, 2007
Hypothesis
• If prediction exists within a single modality, and if predictive coding models can account for prediction at a phonological level, then predictive coding could occur across different sensory modalities too.
Indirect evidence of cross-modal transfer in speech
van Wassenhove, 2005
time
ERPs
• Amplitude reduction
• Latency shortening
/pa/: high visual saliency
/ka/: low visual saliency
Our study
• Visual prediction
• Auditory prediction
• Visual-to-auditory cross-modal prediction
• Auditory-to-visual cross-modal prediction
Visual prediction
Task: AV Match vs. AV Mismatch
Conditions: with informative visual context vs. without informative context
[Figure: visual stream (V) and auditory stream (A); context fragment and target fragment; speech vs. non-speech]
Results
[Figure: reaction time (msec) for match and mismatch trials, with informative visual context vs. without informative context; * marker]
* With previous context, participants respond faster than without it.
VISUAL PREDICTION
Auditory prediction
Task: AV Match vs. AV Mismatch
Conditions: with informative auditory context vs. without informative context
[Figure: visual stream (V) and auditory stream (A); context fragment and target fragment; speech vs. non-speech]
Results
[Figure: reaction time (msec) for match and mismatch trials, with informative auditory context vs. without informative context; * marker]
* With previous context, participants respond faster than without it.
AUDITORY PREDICTION
Visual vs. Auditory
[Figure, two panels, each with a * marker.
Visual prediction: RTs (msec) for congruent and incongruent trials, with informative visual context vs. without informative context.
Auditory prediction: RTs (msec) for congruent and incongruent trials, with informative auditory context vs. without informative context.]
Conclusions
• Visual prediction
• Auditory prediction
Is this prediction cross-modal?
Predictability of Vision-to-Audition
Design of the experiment
[Figure: visual stream (V) and auditory stream (A) for each condition]
• Unimodal continued: Match
• Unimodal continued: Mismatch
• Discontinued: Match
• Discontinued: Mismatch
• Cross-modal continued: Mismatch
Predictability of Vision-to-Audition
Stimuli
[Figure: visual (V) and auditory (A) streams for the three mismatch conditions: unimodal continued, discontinued, cross-modal continued]
Results
[Figure: reaction time (msec) by condition (unimodal continued, discontinued, cross-modal continued); labels: Visual, Auditory; * marker]
Participants were faster in the cross-modal condition than in the completely incongruent one.
VISUAL-TO-AUDITORY PREDICTION
Predictability of Audition-to-Vision
Design of the experiment
[Figure: auditory stream (A) and visual stream (V) for each condition]
• Unimodal continued: Match
• Unimodal continued: Mismatch
• Discontinued: Match
• Discontinued: Mismatch
• Cross-modal continued: Mismatch
Results
[Figure: reaction time (msec) by condition (unimodal continued, discontinued, cross-modal continued); labels: Visual, Auditory]
We did not find any difference between the mismatch conditions.
NO AUDITORY-TO-VISUAL PREDICTION
Conclusions
• There is some kind of prediction from the visual to the auditory modality
• There is no prediction from the auditory to the visual modality
Does this prediction depend on the language?
Results (L1)
VISUAL-TO-AUDITORY PREDICTION IN NATIVE LANGUAGE
[Figure, two panels: Canadian participants with English sentences; Spanish participants with Spanish sentences. Each panel: reaction time (msec) by condition (unimodal continued, discontinued, cross-modal continued); labels: Visual, Auditory; * marker.]
Results (L1)
[Figure, two panels: Canadian participants with English sentences; Spanish participants with Spanish sentences. Each panel: reaction time (msec) by condition (unimodal continued, discontinued, cross-modal continued); labels: Visual, Auditory.]
No differences between the mismatch conditions
No prediction from the auditory to the visual modality in the native language
Conclusions
• There is some kind of prediction from the visual to the auditory modality in L1
• There is no prediction from the auditory to the visual modality in L1
What happens with an unknown language?
Unknown language: visual to auditory
Canadian participants with Spanish sentences
[Figure: reaction time (msec) by condition (unimodal continued, discontinued, cross-modal continued); labels: Visual, Auditory]
NO VISUAL-TO-AUDITORY PREDICTION IN OTHER LANGUAGE
Unknown language: auditory to visual
[Figure, two panels: Spanish participants with English sentences; Canadian participants with Spanish sentences. Each panel: reaction time (msec) by condition (unimodal continued, discontinued, cross-modal continued); labels: Visual, Auditory.]
No differences between the mismatch conditions
No prediction from the auditory to the visual modality in the other language
Conclusions
• No visual-to-auditory cross-modal prediction in an unknown language…
it seems that some knowledge of the articulatory phonetics of the language is required to benefit from predictive coding
• No auditory-to-visual cross-modal prediction
General Conclusions
• Unimodal prediction: from visual to visual modality, and from auditory to auditory
• L1: ASYMMETRY
– Cross-modal prediction from visual-to-auditory modality
– No cross-modal prediction from auditory-to-visual modality
• Unknown language: previous knowledge of the language is necessary to make the prediction
– No cross-modal prediction from visual-to-auditory modality
– No cross-modal prediction from auditory-to-visual modality
Thanks to…
- Agnès Alsius, Postdoc, Queen’s University
- Antonia Najas, MA / Research Assistant, Universitat Pompeu Fabra
- Phil Jaekl, Postdoc, Universitat Pompeu Fabra
- All the people of the Vision Lab, UBC, Vancouver
Thanks for your attention!!