How to integrate automatic speech recognition (ASR) into CALL applications
Helmer Strik
Department of Linguistics
Centre for Language and Speech Technology (CLST)
Radboud University Nijmegen, The Netherlands
LESLLA, Antwerpen, 24-11-2008
Overview
Introduction
ASR: automatic speech recognition
ASR-based tutoring
ASR-based CALL
ASR-based literacy training
Conclusions
Introduction
Students who receive 1-on-1 instruction perform as well as the top two percent of students who receive traditional classroom instruction [Bloom 1984]
A human tutor for every student is not feasible → computer tutors
For language learning: CALL
Many text-based CALL systems
Include speech → speech-based CALL system
Speech inside
Many applications with ‘speech’:
  Screen readers
  Reading pen
  Mobile phone: photo + OCR + TTS
Some also (useful) for CALL
Speech inside (cont’d)
Many applications with ‘speech’
  Screen readers, reading pen, etc.
Some also (useful) for CALL
However, usually the learner can
  only listen (TTS: text-to-speech), or
  also speak, but …
    no assessment, or
    the learner has to carry out the assessment, e.g. by comparing with examples
→ use ASR / speech technology
Is it feasible?
ASR: automatic speech recognition
What is ASR?
Speech to text conversion
Applications:
  Dictation
  Command and control
  Spoken dialogue systems (information)
  etc.
ASR is not flawless, and it will probably never be, esp. for non-native speech
Note: human listeners are not flawless either!
Speech Recognition
[Slide with examples labelled cgn2-s, vb, nn, mii]
ASR-based tutoring
ITS: Intelligent Tutoring Systems
Spoken dialogue system for learning
Subject matter: math, physics, etc.
Examples:
  ITSPOKE, Univ. of Pittsburgh, Litman et al. Topic: Physics
  SCoT, Stanford Univ., Peters et al. Topic (SCoT-DC): shipboard damage control
Communicate with speech; the subject matter doesn’t have to be speech
ASR-based CALL
The subject matter is speech (language)
Late 1990s:
  1998: STiLL, Marholmen (Sweden); first time the CALL and speech communities met
  1999: Special Issue of CALICO, ‘Tutors that Listen’, focusing on ASR (mainly ‘discrete ASR’)
ASR-based literacy training
What has been done?
Reading tutors (the learner reads, not the PC):
  Listen, CMU, Pittsburgh; Mostow et al. (1994)
  STAR system, UK; Russell et al. (1996)
  SPACE, KU Leuven; Van hamme, Duchateau, et al.
  … and many others
FtL: Foundations to Literacy, Boulder; Cole, Wise, et al.
ASR-based literacy training
Foundations to Literacy
Interactive Books
  Teach fluent reading & comprehension
Foundational Skills Tutors
  Teach underlying reading skills
  Phonics
ASR-based literacy training (cont’d)
What has been done?
Reading tutors:
  Listen, CMU, Pittsburgh; Mostow et al. (1994)
  STAR system, UK; Russell et al. (1996)
  SPACE, KU Leuven; Van hamme, Duchateau, et al.
  … and many others
FtL: Foundations to Literacy, Boulder; Cole, Wise, et al.
Mostly for children
And for adults?
  What is needed?
  What is possible, and what is not?
  …
ASR-based CALL
ASR is not flawless, and it will probably never be, esp. for non-native speech
Be aware of what is (not) possible with ASR technology
Problematic issues and possible solutions:
  Noise, esp. background speech → minimize, head-sets
  Disfluencies → minimize, improve automatic handling
  Non-native pronunciation
    Recognizing utterances → utterance verification
    Detect pronunciation errors → classifiers
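Utterance verification, as named above, can be pictured as an accept/reject decision on a confidence score. What follows is only a minimal sketch under stated assumptions, not the method of any system mentioned in these slides: the function name `verify_utterance`, the per-frame log-likelihood ratio between the expected utterance and a competing "filler" model, and the threshold value are all hypothetical choices made for illustration.

```python
def verify_utterance(target_loglik, filler_loglik, n_frames, threshold=0.0):
    """Accept or reject an utterance from a per-frame log-likelihood ratio.

    target_loglik: total log-likelihood of the audio given the expected
                   utterance (hypothetical input from a recognizer)
    filler_loglik: total log-likelihood given a competing 'anything else'
                   model (hypothetical input)
    n_frames:      number of acoustic frames, used for length normalization
    threshold:     assumed decision threshold; would need tuning on real data
    """
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    # Normalize by length so long and short utterances are comparable.
    llr_per_frame = (target_loglik - filler_loglik) / n_frames
    return llr_per_frame > threshold, llr_per_frame


accepted, score = verify_utterance(-450.0, -500.0, n_frames=100)
rejected, low_score = verify_utterance(-520.0, -500.0, n_frames=100)
```

Raising the threshold makes the system stricter (fewer wrong utterances accepted, more correct ones rejected); where to set it is a pedagogical choice as much as a technical one.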
ASR-based CALL
Our research:
Non-natives
  Assessment of oral proficiency
  Dutch-CAPT – pronunciation
    o ASR / UV – Utterance Verification
    o PED – Pronunciation Error Detection
  DISCO – pronunciation, morphology, syntax
  TST-AAP
People with speech disability
  for training & as communication aid (AAC)
  ASR for dysarthric speech
  EST: E-learning based Speech Therapy
ASR-based CALL
Project Dutch-CAPT (Computer Assisted Pronunciation Training)
ASR-based CALL (cont’d)
Project Dutch-CAPT (CAPT: Computer Assisted Pronunciation Training)
Exp. group: used the Dutch-CAPT system
2 control groups: didn’t use Dutch-CAPT
The reduction in the number of pronunciation errors was significantly larger for the exp. group
Training: 4 weeks x 1 session of 30 to 60 minutes
ASR-based CALL (cont’d)
ASR is not flawless, and it will probably never be, esp. for non-native speech
Be aware of what is (not) possible with ASR technology
Problematic issues and possible solutions:
  Noise, esp. background speech → minimize, head-sets
  Disfluencies → minimize, improve automatic handling
  Non-native pronunciation
    Recognizing utterances → utterance verification
    Detect pronunciation errors → classifiers
Mix of expertise needed: ASR technology, language acquisition, pedagogy, design, …
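The classifier-based detection of pronunciation errors listed above can be illustrated with a deliberately simple example. This is a sketch under assumptions, not the classifier used in Dutch-CAPT: it assumes a single hypothetical acoustic feature per realisation and a nearest-class-mean rule, and the names `fit_nearest_mean` and `is_pronunciation_error` are invented for illustration.

```python
from statistics import mean


def fit_nearest_mean(correct_values, error_values):
    """Return the class means of one (hypothetical) acoustic feature,
    estimated from labelled correct and erroneous realisations."""
    return mean(correct_values), mean(error_values)


def is_pronunciation_error(value, correct_mean, error_mean):
    """Flag a realisation as an error if its feature value lies closer
    to the error-class mean than to the correct-class mean."""
    return abs(value - error_mean) < abs(value - correct_mean)


# Toy, made-up training values for one sound contrast.
correct_mean, error_mean = fit_nearest_mean([0.8, 0.9, 1.0], [0.2, 0.3, 0.4])
flag_low = is_pronunciation_error(0.35, correct_mean, error_mean)
flag_high = is_pronunciation_error(0.85, correct_mean, error_mean)
```

A real detector would use several features and far more training data; the point here is only that each targeted error gets its own small, trainable decision rule.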
ASR-based literacy training
Demonstration project TST-AAP
Existing course
Add speech technology: detect whether words & sounds were pronounced (correctly)
ASR-based literacy training
Listening; PC: produces speech
  Text-To-Speech (TTS); quality good enough?
  Recorded speech, concatenation
Speaking; PC: recognizes speech
  Phonics (see FtL)
  PC: Recognize words, utterances: confidence measures (CMs) for utterance verification
  PC: Recognize sounds: CMs for Phon. Ver. (contrasts)
Reading (reading tutors)
  PC: Recognize words, utterances
  PC: Pointer in the text (‘track’ the reader)
  PC: Help when encountering problems
  PC: Change tempo → read faster
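The idea of a pointer that tracks the reader through the text can be sketched as an alignment between the words of the text and the words the recognizer has output so far. The example below is an assumption-laden illustration, not how any of the reading tutors named in these slides work: the function name `track_reader` is hypothetical, and the alignment uses Python's standard `difflib.SequenceMatcher` as a stand-in for a proper ASR alignment.

```python
from difflib import SequenceMatcher


def track_reader(text_words, recognized_words):
    """Return the index of the next word to be read in text_words,
    given the (possibly erroneous) words recognized so far."""
    expected = [w.lower() for w in text_words]
    heard = [w.lower() for w in recognized_words]
    matcher = SequenceMatcher(a=expected, b=heard, autojunk=False)
    position = 0
    for block in matcher.get_matching_blocks():
        # The final block is a zero-length sentinel; skip empty blocks.
        if block.size > 0:
            position = max(position, block.a + block.size)
    return position


text = "the cat sat on the mat".split()
pointer_clean = track_reader(text, ["the", "cat", "sat"])
pointer_noisy = track_reader(text, ["the", "cap", "sat", "on"])
```

Because the pointer follows the furthest matched word, a single misrecognized or misread word ("cap" for "cat") does not stall the tracker; whether it should instead trigger help is a design decision for the tutor.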
ASR-based CALL
Advantages of using speech (vs. writing)
  Self-explanation
  Extra information:
    Prosody (stress, accent)
    Emotions
    Confidence
Other useful techniques:
  VTH
Conclusions
ASR is not flawless
ASR-based tutoring is possible (restricted domain)
  general topics; ITS: ITSPOKE, SCoT
  CALL; many systems: non-natives, disabled, etc.
  Literacy training
    So far mainly for children
    And for adults?
Needed
  Mix of expertise: technology, language acquisition, pedagogy, design, …
  Improved ASR, speech technology
  Projects, funds
Questions?
Why are there so few ASR-based CALL / literacy applications for adults?
What are, in this context, important differences between children & adults?
What is needed?
  Listening; PC: produces speech
  Speaking; PC: recognizes speech
    Phonics
  Reading (reading tutors)
  What else?