Situation Based Speech Recognition for Structuring Baseball Live Games
Atsushi SAKO, Tetsuya TAKIGUCHI and Yasuo ARIKI
To structure live baseball games and to improve speech recognition accuracy, using baseball-dependent knowledge
Models
[Figure: the generation chain sentence → phoneme → signal. Conventional method: acoustic model and language model (bi-gram). Proposed method: a situation is added to the chain, with a situation-dependent acoustic model, a situation prediction model and a situation-dependent language model.]
Formalization
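[Figure: an example sequence of situations (ball-strike counts 3B1S, 3B2S, 0B0S) aligned with the spoken words "Pitch", "Foul ball", "And", "Strikeout!" and "Next Batter". O, W and S label the observation, word and situation sequences.]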
Words and situations are estimated concurrently.
Simplifying assumptions:
• A situation depends only on the previous situation and on word co-occurrence.
• A word depends only on the present situation and the previous word.
O: sequence of observed feature vectors
W: sequence of words
S: sequence of situations
Model components: situation-dependent acoustic model, situation prediction model, situation-dependent bi-gram.
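To make the concurrent estimation of words and situations concrete, here is a minimal Python sketch of a Viterbi search over joint (word, situation) states. It is an illustration under stated assumptions, not the authors' decoder: every probability table, the three-word vocabulary, the start symbol `<s>` and all function names are hypothetical. For brevity the toy acoustic score ignores the situation and the toy word model ignores the previous word.

```python
import math
from itertools import product

WORDS = ["foul", "four", "strikeout"]
SITUATIONS = ["3B1S", "3B2S", "0B0S"]

# Hypothetical toy tables (illustration only, not taken from the poster).
NEXT_SITUATION = {("3B1S", "foul"): "3B2S",
                  ("3B1S", "four"): "0B0S",
                  ("3B2S", "strikeout"): "0B0S"}
WORD_PROB = {
    "3B1S": {"foul": 0.49, "four": 0.49, "strikeout": 0.02},
    "3B2S": {"foul": 0.20, "four": 0.20, "strikeout": 0.60},
    "0B0S": {"foul": 0.49, "four": 0.49, "strikeout": 0.02},
}

def log_p_situation(s, s_prev, w_prev):
    """Toy log P(s_i | s_{i-1}, w_{i-1}): the expected count gets 0.9, the others 0.05."""
    expected = NEXT_SITUATION.get((s_prev, w_prev), s_prev)
    return math.log(0.9 if s == expected else 0.05)

def log_p_word(w, w_prev, s):
    """Toy log P(w_i | w_{i-1}, s_i); this sketch ignores w_prev."""
    return math.log(WORD_PROB[s][w])

def frame(acoustic):
    """Toy log P(o_i | w_i, s_i); here the score depends on the word only."""
    return {(w, s): math.log(acoustic[w]) for w, s in product(WORDS, SITUATIONS)}

def joint_viterbi(frames, start):
    """Return the best sequence of (word, situation) pairs."""
    best = {start: 0.0}
    backpointers = []
    for scores in frames:
        new_best, pointers = {}, {}
        for w, s in product(WORDS, SITUATIONS):
            # Best predecessor for this joint state.
            new_best[(w, s)], pointers[(w, s)] = max(
                (prev + log_p_situation(s, ps, pw) + log_p_word(w, pw, s)
                 + scores[(w, s)], (pw, ps))
                for (pw, ps), prev in best.items()
            )
        best = new_best
        backpointers.append(pointers)
    state = max(best, key=best.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1][1:]  # drop the start state

# The first word is acoustically closer to "four" than to "foul".
frames = [frame({"foul": 0.4, "four": 0.6, "strikeout": 0.001}),
          frame({"foul": 0.01, "four": 0.01, "strikeout": 0.98})]
decoded = joint_viterbi(frames, start=("<s>", "3B1S"))
acoustic_only = [max(WORDS, key=lambda w: f[(w, "3B1S")]) for f in frames]
```

In this toy setting the acoustic scores alone prefer "four", while the joint search prefers "foul" because a strikeout is far more likely at 3B2S than at 0B0S.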
Formalization
O: sequence of observed feature vectors
W: sequence of words
Model components: acoustic model, language model (bi-gram).
Problems
An example of recognition error
Situation Dependent Language Model
Learned from training data, with a model for each situation (e.g. 1B1S, 1B2S).
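One simple way to learn a situation-dependent bi-gram from training data is relative-frequency counting per situation, sketched below. The corpus format, the start symbol `<s>`, the function names and the tiny example corpus are all assumptions made for illustration. The estimate is unsmoothed, whereas a practical language model would normally apply smoothing.

```python
from collections import Counter, defaultdict

def train_situation_bigrams(corpus):
    """corpus: iterable of (situation, [words]) pairs.

    Returns a function prob(w, w_prev, situation) giving the unsmoothed
    relative-frequency estimate of P(w | w_prev, situation).
    """
    counts = defaultdict(Counter)  # (situation, w_prev) -> Counter of next words
    for situation, words in corpus:
        for w_prev, w in zip(["<s>"] + words, words):
            counts[(situation, w_prev)][w] += 1

    def prob(w, w_prev, situation):
        followers = counts.get((situation, w_prev))
        if not followers:
            return 0.0  # unseen context
        return followers[w] / sum(followers.values())

    return prob

# Hypothetical training utterances labeled with the ball-strike count.
corpus = [
    ("1B2S", ["pitch", "strikeout"]),
    ("1B2S", ["pitch", "foul"]),
    ("1B1S", ["pitch", "strike"]),
    ("1B1S", ["pitch", "ball"]),
]
prob = train_situation_bigrams(corpus)
p_two_strikes = prob("strikeout", "pitch", "1B2S")
p_one_strike = prob("strikeout", "pitch", "1B1S")
```

With this toy corpus, "strikeout" after "pitch" receives probability mass only in the two-strike situation.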
Acoustic Model
Two models, one for normal emotion and one for excited emotion, adapted by MLLR+MAP.
[Figure: the word "Strikeout!" with P = High.]
Situation Prediction Model
[Figure: transitions among the counts 1B1S, 2B1S, 1B2S and 2B2S, labeled with the co-occurring words "Pitch" / "Strike" and "Straight" / "Ball".]
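A situation prediction model of this kind can be sketched as a count-based estimate of the next situation given the previous situation and the preceding words. The sketch below conditions on the ordered tuple of the last M words, which is one possible reading of "word co-occurrence". The data format, the window size M, the filler word "next" and the function names are hypothetical.

```python
from collections import Counter, defaultdict

def train_situation_predictor(sequences, M=2):
    """sequences: lists of (word, situation) pairs in spoken order.

    Returns prob(s, s_prev, context), an unsmoothed estimate of
    P(s_i | s_{i-1}, w_{i-1}, ..., w_{i-M}), where context is the ordered
    tuple of up to M preceding words.
    """
    counts = defaultdict(Counter)  # (s_prev, context) -> Counter of next situations
    for seq in sequences:
        for i in range(1, len(seq)):
            context = tuple(w for w, _ in seq[max(0, i - M):i])
            s_prev = seq[i - 1][1]
            counts[(s_prev, context)][seq[i][1]] += 1

    def prob(s, s_prev, context):
        nxt = counts.get((s_prev, tuple(context)))
        return nxt[s] / sum(nxt.values()) if nxt else 0.0

    return prob

# Hypothetical labeled commentary: each word carries the count in effect.
sequences = [
    [("pitch", "1B1S"), ("strike", "1B1S"), ("next", "1B2S")],
    [("straight", "1B1S"), ("ball", "1B1S"), ("next", "2B1S")],
]
predict = train_situation_predictor(sequences, M=2)
p_strike = predict("1B2S", "1B1S", ("pitch", "strike"))
p_ball = predict("2B1S", "1B1S", ("straight", "ball"))
p_wrong = predict("2B1S", "1B1S", ("pitch", "strike"))
```

In the toy data, "pitch" followed by "strike" moves 1B1S to 1B2S, and "straight" followed by "ball" moves it to 2B1S.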
Experimental Conditions
Experimental Results
Make the method work well in ambiguous situations. Describe situations in more detail, including events.
We proposed situation-based speech recognition. Ball-strike counts were used as the situation. The method worked well in obvious situations.
Keyword accuracy improved by 2.3 points (66.8% to 69.1%). Structuring correct rate improved by 6.1 points (67.2% to 73.3%). Exciting scene detection reached a 75.0% correct rate.
An example of recognition result
[Figure: log likelihood over time for the competing hypotheses "Four ball" and "Foul ball", followed by "Pitch and Strikeout!" and "Next Batter", with the counts 3B 1S, 3B 2S and 0B 0S.]
Research Purpose
Problems of Conventional Method
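Correct … foul ball, and strikeout on the next pitch
Mistake … four ball (base on balls), and strikeout on the next pitch
The mistaken hypothesis is inconsistent with the game: after a base on balls the next batter starts at 0B0S, so a strikeout cannot follow on the very next pitch.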
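The inconsistency in the mistaken hypothesis can be checked mechanically with a small ball-strike count tracker. This is a sketch of ordinary baseball counting rules, not code from the authors. The function names and event labels are hypothetical, and special cases such as balls put in play or a foul bunt with two strikes are ignored.

```python
def next_count(balls, strikes, event):
    """Return the (balls, strikes) count after one pitch.

    event is "ball", "strike" or "foul". A walk (4 balls) or a strikeout
    (3 strikes) ends the at-bat, so the next batter starts at (0, 0).
    """
    if event == "ball":
        balls += 1
    elif event == "strike":
        strikes += 1
    elif event == "foul":
        # A foul counts as a strike, but cannot be the third strike.
        strikes = min(strikes + 1, 2)
    else:
        raise ValueError(f"unknown event: {event}")
    if balls == 4 or strikes == 3:
        return (0, 0)
    return (balls, strikes)

def strikeout_possible_on_next_pitch(balls, strikes):
    """A strikeout on the next pitch requires two strikes already."""
    return strikes == 2

after_foul = next_count(3, 1, "foul")   # correct hypothesis: foul ball at 3B1S
after_walk = next_count(3, 1, "ball")   # mistaken hypothesis: fourth ball at 3B1S
```

After the foul the count is 3B2S and a strikeout can follow. After the walk the count resets to 0B0S and it cannot.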
Abstract
Recognizing live baseball commentary is difficult because the speech is fast, noisy, emotional and disfluent: the spontaneous speaking style produces rephrasing, repetition, mistakes and grammatical deviation. To address these problems, we have been studying a speech recognition method that incorporates task-dependent knowledge of the baseball game as well as the announcer's emotion in the commentary speech. In addition, in this paper we propose a situation prediction model based on word co-occurrence. With these models, speech recognition errors are effectively prevented. The method is formalized in the framework of probability theory and implemented in the conventional speech decoding (Viterbi) algorithm. Experimental results showed that the proposed approach improved structuring and segmentation accuracy as well as keyword accuracy.
Using word co-occurrence (not bag-of-words, BOW)
Learning Stochastic Models
Proposed Method
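\[ (\hat{W}, \hat{S}) = \operatorname*{arg\,max}_{(W,S)} P(W, S \mid O) = \operatorname*{arg\,max}_{(W,S)} \frac{P(O \mid W, S)\, P(W, S)}{P(O)} \]
\[ (\hat{W}, \hat{S}) = \operatorname*{arg\,max}_{(W,S)} P(O \mid W, S) \prod_i P(s_i \mid s_{i-1}, w_{i-1})\, P(w_i \mid w_{i-1}, s_i) \]
\[ (\hat{W}, \hat{S}) = \operatorname*{arg\,max}_{(W,S)} P(O \mid W, S) \prod_i P(s_i \mid s_{i-1}, w_{i-1}, \ldots, w_{i-M})\, P(w_i \mid w_{i-1}, s_i) \]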
Situation Based Speech Recognition for Structuring Baseball Live Games
Atsushi SAKO, Tetsuya TAKIGUCHI and Yasuo ARIKI
Department of Computer and Systems Engineering, Kobe University
Experiments
Prospect
Conclusion
Future Work
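\[ \hat{W} = \operatorname*{arg\,max}_{W} P(W \mid O) = \operatorname*{arg\,max}_{W} \frac{P(O \mid W)\, P(W)}{P(O)} \]
\[ \hat{W} = \operatorname*{arg\,max}_{W} P(O \mid W) \prod_i P(w_i \mid w_{i-1}) \]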
                     Conventional   Proposed
Keyword Acc.         66.8%          69.1%
Structuring Cor.     67.2%          73.3%
Exciting scene Cor.  -              75.0%
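The gains can be read off the table as absolute differences in percentage points. The short Python check below recomputes them from the tabulated figures. The variable names are illustrative only.

```python
# (conventional, proposed) rates in percent, copied from the results table.
results = {
    "Keyword Acc.": (66.8, 69.1),
    "Structuring Cor.": (67.2, 73.3),
}

# Absolute improvement in percentage points, rounded to one decimal place.
gains = {name: round(proposed - conventional, 1)
         for name, (conventional, proposed) in results.items()}
```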
Test set: radio commentary speech (7 Sep. 2003)
Learning corpus:
  HMM: 200 hours (baseline) + 3 hours (adaptation)
  Language model: 570K morphemes
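Situation prediction model: \( P(s_i \mid s_{i-1}, w_{i-1}, \ldots, w_{i-M}) \)
Situation-dependent acoustic model: \( P(O \mid W, S) \)
Situation-dependent language model (bi-gram): \( P(w_i \mid w_{i-1}, s_i) \)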
Correct … foul ball, and strikeout on the next pitch
Conventional … four ball, and strikeout on the next pitch
Proposed … foul ball, and strikeout on the next pitch