Speaker Verification System using SVM

Jun-Won Suh

Intelligent Electronic Systems

Human and Systems Engineering

Department of Electrical and Computer Engineering



Research Progress: Jun-Won Suh

Outline: Summary of the Ph.D. Dissertation of Vincent Wan

• Speaker verification system

Extracting features

• Creating models of speakers

Generative models, discriminative models

Making generative models discriminative

• Developing speaker verification using SVMs

• My interests for improving our system


Speaker verification system

• Authenticate a person’s claimed identity

• Text-dependent and text-independent

The system models the sound of the client’s voice, which is determined by the physical characteristics of the client’s vocal tract.

A generic speaker verification system

• Feature extraction

• Enrolment

Creates a model of the client’s voice

• Pattern matching

• Decision theory


Extracting features

• Building models of speakers depends on frequency analysis of the speaker’s voice.

• Linear predictive coding (LPC)

LPC assumes that speech can be modelled as the output of a linear, time-varying filter excited by either periodic pulses or random noise.

The LPC coefficients are obtained by minimizing the mean squared error (MSE) between the actual and the predicted speech samples.

• Perceptual linear prediction (PLP)

PLP combines LPC analysis with psychophysical knowledge of the human auditory system.

Example: the human ear has a higher frequency resolution at low frequencies.
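As a rough illustration of how LPC coefficients can be obtained by minimizing the mean squared prediction error, the sketch below uses the autocorrelation method with the Levinson-Durbin recursion. This is one common way to solve the problem and is not necessarily the procedure used in the dissertation. The function names `autocorrelation` and `lpc` and the toy frame are assumptions made for the example.

```python
def autocorrelation(signal, max_lag):
    """Return r[0..max_lag], the autocorrelation of one analysis frame."""
    n = len(signal)
    return [sum(signal[t] * signal[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def lpc(signal, order):
    """Levinson-Durbin recursion (autocorrelation method).

    Returns (a, err), where the predictor is
        x_hat[t] = sum(a[k] * x[t - 1 - k] for k in range(order))
    and err is the remaining prediction-error energy.
    """
    r = autocorrelation(signal, order)
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:  # silent or perfectly predictable frame
            break
        # Reflection coefficient for this stage.
        acc = r[i + 1] - sum(a[k] * r[i - k] for k in range(i))
        k_i = acc / err
        # Update the lower-order coefficients, then append the new one.
        new_a = a[:]
        new_a[i] = k_i
        for k in range(i):
            new_a[k] = a[k] - k_i * a[i - 1 - k]
        a = new_a
        err *= (1.0 - k_i * k_i)
    return a, err


# Toy frame (an assumption for the example): a decaying exponential,
# in which each sample is half of the previous one.
frame = [0.5 ** t for t in range(50)]
coeffs, residual = lpc(frame, order=2)
```

For this toy frame a first-order predictor is already sufficient, so the first coefficient comes out close to 0.5 and the second close to zero.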


Creating models of speakers

• Generative models

Gaussian Mixture Model (GMM), Hidden Markov Model (HMM)

Models are probability density estimators that attempt to capture all of the fluctuations and variations of the data.

• Discriminative models

Polynomial classifiers, Support Vector Machines (SVM)

Models are optimized to minimize the error on a set of training samples.

Models draw the boundary between classes and ignore the fluctuations within each class.

• Making generative models discriminative

Generative models estimate the within-class probability densities and do not directly minimize a classification error.

Discriminative models achieve the highest performance in classification tasks.


Making generative models discriminative

• GMM-LR/SVM combination

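GMM likelihood ratio

$$\frac{P(X \mid M)}{P(X \mid \Omega)}$$

Bayes decision rule

$$S(X) = \log P(X \mid M) - \log P(X \mid \Omega)$$

Bengio proposed that, because the probability estimates are not perfect, a better version would be

$$S(X) = a \log P(X \mid M) + b \log P(X \mid \Omega) + c$$

Here $M$ denotes the client model and $\Omega$ the world model.

The input to the SVM is the two-dimensional vector made up of the log-likelihoods of the client and world models.

$$y = \begin{bmatrix} \log P(X \mid M) \\ \log P(X \mid \Omega) \end{bmatrix}$$

A limitation of these approaches arises from discriminating on a frame-by-frame basis.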
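The two scoring rules on this slide, the log-likelihood ratio and the linear combination attributed to Bengio, can be sketched as follows. This is only an illustration: the function names, the log-likelihood values and the weights `a`, `b`, `c` are made up for the example, and in the GMM-LR/SVM combination the weights would be learned by an SVM rather than set by hand.

```python
def bayes_score(ll_client, ll_world):
    """Log-likelihood ratio: log P(X|client) - log P(X|world)."""
    return ll_client - ll_world


def learned_score(ll_client, ll_world, a, b, c):
    """Linear decision function over the two log-likelihoods."""
    return a * ll_client + b * ll_world + c


def accept(score, threshold=0.0):
    """Accept the claimed identity when the score exceeds the threshold."""
    return score > threshold


# Hypothetical utterance-level log-likelihoods from the two models.
ll_client, ll_world = -120.0, -125.0

lr = bayes_score(ll_client, ll_world)

# With a = 1 and b = -1 the linear form is the Bayes rule shifted by c,
# so c plays the role of a (negated) decision threshold.
shifted = learned_score(ll_client, ll_world, a=1.0, b=-1.0, c=-2.0)
decision = accept(shifted)
```

Allowing `a` and `b` to differ from 1 and -1 is what lets a trained classifier compensate for imperfect probability estimates.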


Importance of kernels

• Early SVMs used polynomial and RBF kernels

The resulting optimization problems required computational resources that were unsustainable.

Clustering algorithms were employed to reduce the amount of training data, at a cost in accuracy.

Frame-level training inputs discard useful speaker classification information.

• SVMs using score-space kernels

Utterances of variable length can be classified at the sequence level.
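For reference, the two frame-level kernels named on this slide can be written in a few lines. The sketch below uses the standard textbook definitions; the parameter defaults (`degree`, `coef0`, `gamma`) and the toy frames are assumptions, not values taken from the dissertation. The helper `gram_matrix` shows why frame-level training is costly: n frames produce n × n kernel evaluations.

```python
import math


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return (dot + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(frames, kernel):
    """Kernel value for every pair of frames: n frames give n * n entries."""
    return [[kernel(u, v) for v in frames] for u in frames]


# Three made-up 2-D feature frames; real utterances contain far more.
frames = [[1.0, 2.0], [3.0, 4.0], [0.0, 1.0]]
K_poly = gram_matrix(frames, polynomial_kernel)
K_rbf = gram_matrix(frames, rbf_kernel)
```

Because the Gram matrix grows quadratically with the number of frames, a kernel that operates on whole sequences keeps the optimization problem much smaller.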


Classifying sequences using score-space kernels

• The score-space kernel enables SVMs to classify whole sequences.

• A variable-length sequence of input vectors is mapped explicitly onto a single point in a space of fixed dimension.

• The score-space is derived from the likelihood score.

• The likelihood ratio score-space

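$$\psi^{f}_{\hat{F}}(X) = \hat{F}\, f(\{p_k(X \mid M_k, \theta_k)\}), \qquad X = \{x_1, \ldots, x_N\}$$

$$f(\{p_k(X \mid M_k, \theta_k)\}) = \log \frac{P(X \mid M_1, \theta_1)}{P(X \mid M_2, \theta_2)}$$

$$\psi(X) = \nabla_{\theta} \log \frac{P(X \mid M_1, \theta_1)}{P(X \mid M_2, \theta_2)}$$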


Computing the score-space vectors

Define the global likelihood of a sequence $X = \{x_1, \ldots, x_{N_l}\}$


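• The fixed-length vectors of the likelihood ratio kernel can be expressed as

$$\psi(X) = \nabla_{\theta} \left[ \log P(X \mid M_1, \theta_1) - \log P(X \mid M_2, \theta_2) \right] = \begin{bmatrix} \nabla_{\theta_1} \log P(X \mid M_1, \theta_1) \\ -\nabla_{\theta_2} \log P(X \mid M_2, \theta_2) \end{bmatrix}$$

• The final likelihood ratio kernel is (equation not recoverable from the slide transcript)

• The dimensionality of the score-space is equal to the total number of parameters in the generative models. Hence, the SVM can classify complete utterance sequences.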
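To make the idea of a fixed-length score vector concrete, the sketch below maps sequences of different lengths to a four-dimensional vector by taking the gradient of a log-likelihood ratio with respect to the model parameters. It rests on strong simplifying assumptions that are not from the slides: each model is a single one-dimensional Gaussian (the dissertation works with GMMs), frames are treated as independent, and the kernel is a plain inner product with no normalisation. All names and parameter values are hypothetical.

```python
def gaussian_score(seq, mean, var):
    """Gradient of sum_t log N(x_t; mean, var) with respect to (mean, var)."""
    d_mean = sum((x - mean) / var for x in seq)
    d_var = sum(-0.5 / var + 0.5 * (x - mean) ** 2 / var ** 2 for x in seq)
    return [d_mean, d_var]


def lr_score_vector(seq, client, world):
    """Gradient of log P(X|client) - log P(X|world) over all parameters.

    The client block enters with a plus sign and the world block with a
    minus sign, so the vector length equals the total parameter count.
    """
    g_client = gaussian_score(seq, *client)
    g_world = gaussian_score(seq, *world)
    return g_client + [-g for g in g_world]


def linear_kernel(u, v):
    """Plain inner product between two score vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Hypothetical (mean, variance) parameters for the two models.
client_model = (1.0, 1.0)
world_model = (0.0, 2.0)

# Two utterances of different length map to vectors of the same size.
short_utt = [0.0, 1.0]
long_utt = [0.0, 1.0, 2.0, 1.0, 0.0]
psi_short = lr_score_vector(short_utt, client_model, world_model)
psi_long = lr_score_vector(long_utt, client_model, world_model)
similarity = linear_kernel(psi_short, psi_long)
```

Since both utterances end up as points in the same four-dimensional space, an SVM can compare them directly, whatever their original lengths.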


Experiment Results on PolyVar

• The data is noisy.

• The data has many more client tests than YOHO.


Conclusion

• Add the GMM-LR/SVM model to our verification system

• Add a score-space kernel to the SVM

The computational requirements of the Fisher and LR kernels need to be compared.


References

• V. Wan, Speaker Verification using Support Vector Machines, Ph.D. dissertation, University of Sheffield, June 2003

• V. Wan, Building Sequence Kernels for Speaker Verification and Speech Recognition, University of Sheffield

• S. Bengio and J. Mariéthoz, Learning the Decision Function for Speaker Verification, IDIAP, 2001