a study on speech recognition using dynamic time warping

23
A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING CS 525 : Project Presentation PALDEN LAMA and MOUNIKA NAMBURU

Upload: kareem

Post on 30-Jan-2016

37 views

Category:

Documents


0 download

DESCRIPTION

A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING. CS 525 : Project Presentation PALDEN LAMA and MOUNIKA NAMBURU. Goals. Learn how it works ! Focus: Pre-Processing Dynamic Time Warping/Dynamic Programming Verify using MATLAB Build a simple Voice to Text Converter application. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

CS 525 : Project Presentation

PALDEN LAMA and MOUNIKA NAMBURU

Page 2: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

GOALS

Learn how it works ! Focus:

Pre-Processing Dynamic Time Warping/Dynamic Programming

Verify using MATLAB Build a simple Voice to Text Converter

application.

Page 3: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

HOW DOES IT WORK?

Record Extracta voice Feature Vectors

Digitized Speech Signal(.wave

file)

Acoustic Preprocessin

g(DFT + MFCC)

Speech Recognizer(Dynamic

Time Warping)

Page 4: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING
Page 5: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

SPEECH SIGNAL

Voiced Excitation fundamental frequency (Speaker dependent)

Loudness signal amplitude Vocal tract shape spectral shaping

(most important to recognize words)

A time signal of vowel /a:/ (fs=11 kHz, length=100ms)

time

Page 6: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

ACOUSTIC PRE-PROCESSING

Log power spectrum of a vowel /a:/

Cepstrum of a vowel /a:/

Power spectrum of the vowel /a:/ after cepstral smoothing (liftering)

LIFTER

ING

IDFT

DFT

DFT

Page 7: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

MFCC (MEL FREQUENCY CEPSTRAL COEFFICIENTS)

The difference between the cepstrum and the mel-frequency cepstrum is that in the MFC, the frequency bands are equally spaced on the mel scale, which approximates the human auditory system's response more closely than the linearly-spaced frequency bands used in the normal cepstrum.

Coeff. Of power spectrum Mel Spectral Coeff. (FEATURE VECTOR)

Page 8: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

RECOGNIZER One word spoken contains dozens of feature

vectors. (preprocessing every 10 ms of signal)

Compute a ”distance” between this unknown sequence of vectors (unknown word) and known sequence of vectors (prototypes of words to recognize)

PROBLEM !! Unequal length of vector sequence

Page 9: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

DYNAMIC TIME WARPING : FIND OPTIMAL ASSIGNMENT PATH

Page 10: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

DYNAMIC TIME WARPING : FIND OPTIMAL ASSIGNMENT PATH

Page 11: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

DYNAMIC TIME WARPING : FIND OPTIMAL ASSIGNMENT PATH

Page 12: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

DTW : RECOGNIZING CONNECTED WORDS

Page 13: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

MATLAB FUNCTIONS

PRE-PROCESSING recordMelMatrix(3)

S = wavread(“speech.wav”) C = Melfiltermatrix(S, N, K) computeMelSpectrum( C,S);

DISPLAY FEATURES Featuredisp.m

WORD RECOGNITION dp_asym(vector1, vector2)

Page 14: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

RESULTShello hello1

Page 15: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

library

hello

Page 16: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

computerhello

Page 17: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

3.0304e+003

3.5820e+003

3.4499e+003

Page 18: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

Welcome home (male)

Welcome home (female)

Page 19: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

Welcome home Welcome back

Page 20: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

Welcome home Computer Science

Page 21: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

Welcome back Computer Science

Page 22: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

2.6418e+003

2.9468e+003

3.8109e+003

4.6701e+003

Page 23: A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING

THANKS ! ANY QUESTIONS?