Transcript
Page 1: Word-subword based keyword spotting with implications  in OOV detection

Word-subword based keyword spotting with implications in OOV detection

Jan “Honza” Černocký, Igor Szöke, Mirko Hannemann, Stefan Kombrink

Brno University of Technology, BUT Speech@FIT

44th Asilomar Conference on Signals, Systems and Computers, 8.11.2010

Page 2: Word-subword based keyword spotting with implications  in OOV detection

ASILOMAR SS & C Černocký, Szöke, Hannemann, Kombrink 8.11.2010 2/34

Agenda

• Word-based STD, OOV problem, subwords
• Experiments
• Sub-word units
• Hybrid word-subword system
• What can we do with OOVs
• Conclusion

Page 3: Word-subword based keyword spotting with implications  in OOV detection

Goal of STD and glossary of terms

Goal: detect keywords or key-phrases in input speech; for each detection, output:
• Identity
• Position
• Score

Glossary
• Large Vocabulary Continuous Speech Recognizer – LVCSR – system converting speech into text.
• Out-of-vocabulary – OOV – word which is not in the LVCSR vocabulary.
• Term – textual entry consisting of one or more words in sequence.
• Spoken Term Detection – STD – a way to search for a term in spoken data.
• Subword(s) – unit(s) that are parts of words (phones, syllables, automatically found units, etc.).

Page 4: Word-subword based keyword spotting with implications  in OOV detection

Word-based STD

• Thanks to the language model, word-based STD systems reach better accuracy than acoustic ones.

Page 5: Word-subword based keyword spotting with implications  in OOV detection

Implementation

• The term is searched for in the recognition lattice.
• This allows the posterior probability of a term to be estimated.
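
The lattice search above can be sketched with a forward-backward pass. This is only an illustration under stated assumptions: the toy lattice `ARCS`, its scores, and the function names are hypothetical, arc scores are taken as already combined acoustic and language-model likelihoods in the linear domain, and nodes are assumed to be numbered in topological order. The sketch sums over all matching paths; a real system would report each occurrence with its time stamps.

```python
from collections import defaultdict

# Hypothetical toy lattice: (start_node, end_node, word, likelihood).
# Nodes are numbered in topological order; 0 is the start, 3 is the end.
ARCS = [
    (0, 1, "THIS", 1.0),
    (1, 2, "PRIME", 0.6),
    (1, 2, "PRIM", 0.4),
    (2, 3, "MINISTER", 0.7),
    (2, 3, "MINSTER", 0.3),
]

def forward_backward(arcs, start, end):
    """Return forward and backward path masses for every lattice node."""
    forward, backward = defaultdict(float), defaultdict(float)
    forward[start] = 1.0
    backward[end] = 1.0
    # Forward pass: visiting arcs by increasing start node is valid
    # because node numbers follow the topological order.
    for s, e, _, p in sorted(arcs, key=lambda a: a[0]):
        forward[e] += forward[s] * p
    # Backward pass: visit arcs by decreasing end node.
    for s, e, _, p in sorted(arcs, key=lambda a: a[1], reverse=True):
        backward[s] += p * backward[e]
    return forward, backward

def term_posterior(arcs, start, end, term):
    """Posterior mass of all lattice paths containing the word sequence `term`."""
    forward, backward = forward_backward(arcs, start, end)
    outgoing = defaultdict(list)
    for s, e, w, p in arcs:
        outgoing[s].append((e, w, p))

    def chains(node, idx):
        # Mass of arc chains spelling term[idx:] from `node`, times the
        # backward mass of the node where the chain ends.
        if idx == len(term):
            return backward[node]
        return sum(p * chains(e, idx + 1)
                   for e, w, p in outgoing[node] if w == term[idx])

    start_nodes = {s for s, _, _, _ in arcs}
    mass = sum(forward[n] * chains(n, 0) for n in start_nodes)
    return mass / forward[end]

print(term_posterior(ARCS, 0, 3, ["PRIME", "MINISTER"]))  # approximately 0.42
```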

Page 6: Word-subword based keyword spotting with implications  in OOV detection

The OOV problem

REF: THIS IS AN EXAMPLE OF RECOGNIZER OUTPUT
REC: THIS IS AMEX APPLE OF RECOGNIZER OUTPUT

• One OOV causes several errors:
  • The OOV cannot be found (in the output of LVCSR).
  • The OOV impairs recognition of neighboring words.
• An OOV usually carries a lot of information (named entity).

We need to handle OOVs!
• Word accuracy.
• Spoken term detection accuracy.
• Practical (memory, CPU, index size, etc.).
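
To make the "one OOV causes several errors" point concrete, the REF/REC pair from this slide can be scored with a word-level edit distance. This is a generic sketch (the function name `word_errors` is illustrative, not from the talk); it counts substitutions, deletions and insertions with equal weight.

```python
def word_errors(ref, hyp):
    """Word-level Levenshtein distance between reference and hypothesis."""
    r, h = ref.split(), hyp.split()
    prev = list(range(len(h) + 1))          # distances for the empty reference
    for i, ref_word in enumerate(r, 1):
        cur = [i]                           # deleting i reference words
        for j, hyp_word in enumerate(h, 1):
            cur.append(min(
                prev[j] + 1,                          # deletion
                cur[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word)  # match or substitution
            ))
        prev = cur
    return prev[-1]

REF = "THIS IS AN EXAMPLE OF RECOGNIZER OUTPUT"
REC = "THIS IS AMEX APPLE OF RECOGNIZER OUTPUT"
print(word_errors(REF, REC))  # 2
```

Two word errors result here (AN -> AMEX, EXAMPLE -> APPLE), which is the effect on neighboring words that the slide describes.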

Page 7: Word-subword based keyword spotting with implications  in OOV detection

Answer to OOV problem – sub-word STD

• Subword recognizer is built (output is subword lattice).

• Term is converted from words to sequence of subwords.

• This sequence is searched in the subword lattice.

PRIME MINISTER -> *p-r-ay m* *m-ih-n ih-s t-axr*

Page 9: Word-subword based keyword spotting with implications  in OOV detection

Evaluation - TWV

• Defined by NIST for the NIST STD 2006 evaluation:
  • one number
  • higher is better
  • depends on normalization
• Requires a full STD system
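
For reference, the Term Weighted Value as defined in the NIST STD 2006 evaluation plan can be written as follows. The symbols are introduced here for illustration: T is the set of evaluated terms, θ the decision threshold, N_true(t) the number of true occurrences of term t, and T_speech the duration of the audio in seconds.

```latex
\[
\mathrm{TWV}(\theta) = 1 - \frac{1}{|T|} \sum_{t \in T}
  \Big[ P_{\mathrm{miss}}(t,\theta) + \beta \, P_{\mathrm{FA}}(t,\theta) \Big]
\]
\[
P_{\mathrm{miss}}(t,\theta) = 1 - \frac{N_{\mathrm{corr}}(t,\theta)}{N_{\mathrm{true}}(t)},
\qquad
P_{\mathrm{FA}}(t,\theta) = \frac{N_{\mathrm{FA}}(t,\theta)}{T_{\mathrm{speech}} - N_{\mathrm{true}}(t)},
\qquad
\beta = \frac{C}{V}\left(\mathrm{Pr}_{\mathrm{term}}^{-1} - 1\right) = 999.9
\]
```

A single global threshold θ is applied to all terms, which is why the value depends on how detection scores are normalized across terms.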

Page 10: Word-subword based keyword spotting with implications  in OOV detection

Normalization-independent evaluation - UBTWV

• UBTWV - Upper Bound Term Weighted Value
  • Finds the optimum threshold for each term
  • one number
  • higher is better
  • Independent of normalization
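
A minimal sketch of the per-term threshold search, assuming the NIST STD 2006 miss and false-alarm definitions with weight 999.9. The toy detections, the choice of candidate thresholds, and the function names are hypothetical; only the 3 h audio duration is taken from the data slide.

```python
BETA = 999.9  # NIST STD 2006 false-alarm weight

def term_value(detections, n_true, speech_seconds, threshold):
    """1 - (P_miss + BETA * P_FA) for one term at one threshold.

    detections: list of (score, is_correct) pairs for this term.
    """
    kept = [ok for score, ok in detections if score >= threshold]
    n_corr = sum(kept)
    n_fa = len(kept) - n_corr
    p_miss = 1.0 - n_corr / n_true
    p_fa = n_fa / (speech_seconds - n_true)
    return 1.0 - (p_miss + BETA * p_fa)

def ubtwv(terms, speech_seconds):
    """Average over terms of the best value reachable with a per-term threshold."""
    best_values = []
    for detections, n_true in terms:
        # Candidate thresholds: every observed score, plus +inf (reject everything).
        candidates = [score for score, _ in detections] + [float("inf")]
        best_values.append(max(
            term_value(detections, n_true, speech_seconds, th) for th in candidates
        ))
    return sum(best_values) / len(best_values)

# Hypothetical detections for two terms: (list of (score, is_correct), n_true).
TERMS = [
    ([(0.9, True), (0.5, False), (0.4, True)], 2),
    ([(0.8, False)], 1),
]
SPEECH_SECONDS = 3 * 3600  # 3 h of audio

print(ubtwv(TERMS, SPEECH_SECONDS))
```

Because each term gets its own threshold, rescaling one term's scores does not change the result, which is the sense in which the measure is independent of normalization.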

Page 11: Word-subword based keyword spotting with implications  in OOV detection

Data

• NIST STD 2006 evaluations.
• 3h of English telephone conversations.
• 373 terms, 1-4 words long, occurring 4737/196 times.

Page 12: Word-subword based keyword spotting with implications  in OOV detection

Recognizer I.

• LVCSR developed in the AMI/AMIDA project.
• State-of-the-art system including VTLN, MPE, posterior features, SAT, 3 passes.
• Acoustic models trained on 278h of speech.
• Language model trained on 977M word tokens (50k vocabulary).
• Dictionary pruned to generate OOVs -> WRDRED.
• Word accuracy – 69.04%.

Page 13: Word-subword based keyword spotting with implications  in OOV detection

Results

• Words
• Words converted to phones
• Phone recognizer

Phones too small => need longer units

Page 15: Word-subword based keyword spotting with implications  in OOV detection

Better subwords – phone multigrams

• Statistics of phone n-grams (up to length 6) are collected from training data (phone transcriptions of speech).
• Probabilities of all units are estimated.
• Training data are segmented by the most probable sequence of multigrams.
• Statistics are recomputed and rarely occurring units are deleted. This is repeated for several iterations.
• An n-gram language model is estimated on top of the multigram segmentation of the training data.
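
The segmentation step in the procedure above can be sketched as a Viterbi search over unit boundaries. This is an assumption-laden illustration: the unit inventory `UNIT_PROB` and its probabilities are made up, and the re-estimation and pruning iterations are omitted.

```python
import math

# Hypothetical multigram inventory with made-up unit probabilities.
UNIT_PROB = {
    ("p",): 0.05, ("r",): 0.05, ("ay",): 0.05, ("m",): 0.05,
    ("p", "r", "ay"): 0.02, ("ay", "m"): 0.01,
}

def segment(phones, unit_prob, max_len=6):
    """Most probable segmentation of a phone sequence into multigram units."""
    n = len(phones)
    # best[i] = (log-probability of the best segmentation of phones[:i],
    #            start index of the last unit in that segmentation)
    best = [(-math.inf, None)] * (n + 1)
    best[0] = (0.0, None)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            unit = tuple(phones[start:end])
            if unit in unit_prob and best[start][0] > -math.inf:
                score = best[start][0] + math.log(unit_prob[unit])
                if score > best[end][0]:
                    best[end] = (score, start)
    if best[n][0] == -math.inf:
        raise ValueError("no segmentation possible with this unit inventory")
    # Trace back the unit boundaries from the end of the sequence.
    units, end = [], n
    while end > 0:
        start = best[end][1]
        units.append("-".join(phones[start:end]))
        end = start
    return units[::-1]

print(segment("p r ay m".split(), UNIT_PROB))  # ['p-r-ay', 'm']
```

In a full training loop, unit counts would be re-collected from such segmentations, probabilities re-estimated, and rare units dropped before the next pass.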

Page 16: Word-subword based keyword spotting with implications  in OOV detection

Constrained multigrams

• nosil – sil is not part of a multigram unit.
• noxwrd – add word boundary information to the multigram unit.

Term (word representation): PRIME MINISTER
Term pronunciation: p r ay m m ih n ih s t axr
Term (subword representation): *p-r-ay m* *m-ih-n ih-s t-axr*

Page 17: Word-subword based keyword spotting with implications  in OOV detection

Results

• Subword search can process OOV terms.
• Subword search is not as accurate as word search for in-vocabulary terms.
• Subword search consumes more index space.

=> Need for combination of word and subword searches.

Page 19: Word-subword based keyword spotting with implications  in OOV detection

Parallel word-subword

… works, but requires maintaining and running two systems.

Page 20: Word-subword based keyword spotting with implications  in OOV detection

Hybrid word-subword

Page 21: Word-subword based keyword spotting with implications  in OOV detection

Implementation by composition of networks

Page 22: Word-subword based keyword spotting with implications  in OOV detection

Multigram dictionary for hybrid system

• For hybrid system, phone multigrams must not be trained on utterances.

• Phone multigrams are trained on a dictionary.
• Experimented with LVCSR vs. big vs. OOV dictionary.

Page 23: Word-subword based keyword spotting with implications  in OOV detection

Results – different configurations

• Pruning factors play role in the memory consumption, size of index, RT factor …

• “Reasonable system”
  • ~2.5x slower than word
  • ~2.5x bigger index than word
  • Matches the accuracy of word system for IV
  • OOVs found.

Page 25: Word-subword based keyword spotting with implications  in OOV detection

OOV detection by the hybrid system

Comparison of the subword confidence measure to a threshold => detection of OOVs
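
A rough sketch of the thresholding idea. The slides do not specify the confidence measure or the output format, so everything here is assumed for illustration: the hybrid decoder output is modeled as (label, is_subword, confidence) tokens, and grouping consecutive confident subword units into one OOV region is an added assumption.

```python
def detect_oovs(tokens, threshold):
    """Group consecutive subword tokens whose confidence reaches the threshold.

    tokens: list of (label, is_subword, confidence) from a hybrid decoder
    (hypothetical format). Returns a list of OOV regions, each a list of
    subword labels.
    """
    regions, current = [], []
    for label, is_subword, confidence in tokens:
        if is_subword and confidence >= threshold:
            current.append(label)
        elif current:
            # A word or a low-confidence subword closes the open region.
            regions.append(current)
            current = []
    if current:
        regions.append(current)
    return regions

# Hypothetical hybrid output for an utterance containing "Office Max".
TOKENS = [
    ("OFFICE", False, 0.90),
    ("m-ae", True, 0.80),
    ("k-s", True, 0.70),
    ("IS", False, 0.95),
    ("ax", True, 0.20),
]
print(detect_oovs(TOKENS, threshold=0.5))  # [['m-ae', 'k-s']]
```

Raising the threshold trades missed OOVs for fewer false alarms on in-vocabulary speech.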

Page 26: Word-subword based keyword spotting with implications  in OOV detection

OOV recovery

Use of phoneme to grapheme (P2G) to derive word-form of detected OOV

Page 27: Word-subword based keyword spotting with implications  in OOV detection

Alignment error model

• Some detected OOVs could be even converted back to in-vocabulary words !

• But the phone pronunciation in 1-best output is not ideal…

• … alignment error model
  • Parameters (probabilities of deletion, insertion, substitution) trained from data.
  • Can process dictionary and look up detected OOVs.
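
One way to picture such a model is a weighted edit distance whose costs are negative log-probabilities of the error types, used to look up a recognized phone string in a pronunciation dictionary. This is a simplified sketch, not the authors' model: the probabilities are made-up constants rather than trained values, they do not depend on the phones involved, and the small dictionary is hypothetical.

```python
import math

# Hypothetical error probabilities; the slides say these are trained from data.
P_SUB, P_DEL, P_INS = 0.05, 0.03, 0.03
P_MATCH = 1.0 - P_SUB - P_DEL - P_INS

def align_cost(recognized, canonical):
    """Negative log-probability of the best alignment of two phone sequences."""
    c_match, c_sub, c_del, c_ins = (
        -math.log(p) for p in (P_MATCH, P_SUB, P_DEL, P_INS)
    )
    # prev[j]: cost of producing no recognized phones from canonical[:j]
    # (all j canonical phones deleted).
    prev = [j * c_del for j in range(len(canonical) + 1)]
    for i, rec in enumerate(recognized, 1):
        cur = [i * c_ins]  # all recognized phones so far are insertions
        for j, can in enumerate(canonical, 1):
            cur.append(min(
                prev[j] + c_ins,                                   # extra recognized phone
                cur[j - 1] + c_del,                                # canonical phone dropped
                prev[j - 1] + (c_match if rec == can else c_sub),  # match or substitution
            ))
        prev = cur
    return prev[-1]

def lookup(recognized, dictionary):
    """Return (word, cost) of the dictionary entry closest to the phone string."""
    phones = recognized.split()
    return min(
        ((word, align_cost(phones, pron.split())) for word, pron in dictionary.items()),
        key=lambda item: item[1],
    )

# Hypothetical pronunciation dictionary using the phone symbols from the slides.
DICTIONARY = {
    "ALCOHOL": "ae l k ax hh aa l",
    "OFFICE": "aa f ax s",
    "INFORMATION": "ih n f er m ey sh en",
}
print(lookup("aa f ih s", DICTIONARY)[0])  # OFFICE
```

A detected OOV whose best cost is low enough could be mapped back to the in-vocabulary word; a high cost suggests a genuinely new word.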

Page 28: Word-subword based keyword spotting with implications  in OOV detection

Going more complex …

Can construct a WFST accounting for
• Sequences of in-vocabulary words
• In-vocabulary words + common pre- and suffixes
• OOVs
• And combinations …

m ey sh en -> INFORMATION
ae l k ax hh aa l ih z em (ALCOHOLISM) -> ALCOHOL / ISM
aa f ax s m ae k s (’Office Max’) -> OFFICE OOV1572

Page 29: Word-subword based keyword spotting with implications  in OOV detection

OOV clustering

• Alignment model allows for the evaluation of similarity

• Clustering possible
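
As a rough illustration of clustering by similarity, the sketch below groups detected OOVs with single-link clustering. It swaps in a plain phone-level edit distance for the alignment model's similarity score, and the phone strings and the `max_dist` threshold are hypothetical.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def cluster(oovs, max_dist=1):
    """Single-link clustering of phone strings.

    An OOV joins every cluster that has a member within max_dist edits;
    clusters linked through it are merged.
    """
    clusters = []
    for oov in oovs:
        phones = oov.split()
        linked = [
            c for c in clusters
            if any(edit_distance(phones, member.split()) <= max_dist for member in c)
        ]
        merged = []
        for c in linked:
            merged.extend(c)
            clusters.remove(c)
        merged.append(oov)
        clusters.append(merged)
    return clusters

# Hypothetical detected OOVs as phone strings.
OOVS = ["m ae k s", "m ae k", "t ow k y ow", "m ae k s"]
print(cluster(OOVS))
```

Clusters with several members point to re-occurring OOVs, which are natural candidates for a vocabulary update.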

Page 31: Word-subword based keyword spotting with implications  in OOV detection

Conclusion

• Subword system with constrained multigrams - very good STD performance and OOV tolerant system.
• Improved hybrid word-subword system tested from STD accuracy and real application point of view.
  • Hybrid system brings better accuracy/size ratio and is faster than the standalone system.
  • It works well in a real indexing & search engine.
• With a hybrid system, we can
  • Recover OOVs (simple P2G or more elaborate model)
  • Measure similarity of OOVs
  • Cluster them, find re-occurring ones, update vocabulary.

Page 32: Word-subword based keyword spotting with implications  in OOV detection

Reading and playing with

• Igor Szöke: Hybrid word-subword spoken term detection, Ph.D. thesis, Brno University of Technology, Oct 2010.

• Stefan Kombrink, Mirko Hannemann, Lukáš Burget, and Hynek Heřmanský: Recovery of Rare Words in Lecture Speech, in Proc. Text, Speech and Dialogue (TSD) 2010, Brno, 2010.

• Mirko Hannemann, Stefan Kombrink, Martin Karafiát, and Lukáš Burget: Similarity Scoring for Recognizing Repeated Out-of-Vocabulary Words, in Proc. Interspeech 2010, Makuhari, Japan, 2010.

• … ‘Publications’ section of http://speech.fit.vutbr.cz/

• http://www.superlectures.com/odyssey/

Page 33: Word-subword based keyword spotting with implications  in OOV detection

Thank you for your attention

