2.5.4.1 Basics of Neural Networks


Upload: noah-lowe

Post on 02-Jan-2016


DESCRIPTION

2.5.4.1 Basics of Neural Networks. 2.5.4.2 Neural Network Topologies. TDNN. 2.5.4.6 Neural Network Structures for Speech Recognition. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: 2.5.4.1 Basics of Neural Networks

1

2.5.4.1 Basics of Neural Networks

[Figure: a single neuron with inputs $X_0, X_1, X_2, \ldots, X_{N-1}$ and output $Y$.]

$$y = f\left(\sum_{i=0}^{N-1} W_i x_i\right)$$
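The neuron on this slide applies a function f to a weighted sum of its N inputs. A minimal Python sketch of that computation follows; the function names, the example weights, and both activation choices (a hard limiter and tanh) are illustrative assumptions, since the slide does not say which f is used.

```python
import math

def neuron_output(inputs, weights, activation=math.tanh):
    """Return f(sum_i W_i * x_i) for a single neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)

def step(v):
    """Hard-limiter activation (an assumed choice): 1 if v >= 0, else 0."""
    return 1 if v >= 0 else 0

# Hypothetical inputs x_0..x_2 and weights W_0..W_2.
x = [1.0, 0.5, -1.0]
w = [0.2, 0.4, 0.1]
y_step = neuron_output(x, w, activation=step)  # weighted sum is about 0.3
y_tanh = neuron_output(x, w)
```

Swapping the activation changes only f; the weighted sum is the same in both calls.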

Page 2: 2.5.4.1 Basics of Neural Networks

2

2.5.4.2 Neural Network Topologies

Page 3: 2.5.4.1 Basics of Neural Networks

3

2.5.4.2 Neural Network Topologies

Page 4: 2.5.4.1 Basics of Neural Networks

4

2.5.4.2 Neural Network Topologies

Page 5: 2.5.4.1 Basics of Neural Networks

5

TDNN

Page 6: 2.5.4.1 Basics of Neural Networks

6

2.5.4.6 Neural Network Structures for Speech Recognition

Page 7: 2.5.4.1 Basics of Neural Networks

7

2.5.4.6 Neural Network Structures for Speech Recognition

Page 8: 2.5.4.1 Basics of Neural Networks

8

3.1.1 Spectral Analysis Models

Page 9: 2.5.4.1 Basics of Neural Networks

9

3.1.1 Spectral Analysis Models

Page 10: 2.5.4.1 Basics of Neural Networks

10

3.2 THE BANK-OF-FILTERS FRONT-END PROCESSOR

Page 11: 2.5.4.1 Basics of Neural Networks

11

3.2 THE BANK-OF-FILTERS FRONT-END PROCESSOR

Page 12: 2.5.4.1 Basics of Neural Networks

12

3.2 THE BANK-OF-FILTERS FRONT-END PROCESSOR

Page 13: 2.5.4.1 Basics of Neural Networks

13

3.2 THE BANK-OF-FILTERS FRONT-END PROCESSOR

Page 14: 2.5.4.1 Basics of Neural Networks

14

3.2 THE BANK-OF-FILTERS FRONT-END PROCESSOR

Page 15: 2.5.4.1 Basics of Neural Networks

15

3.2.1 Types of Filter Bank Used for Speech Recognition

$$f_i = \frac{F_s}{N}\, i, \qquad 1 \le i \le Q$$

$$Q \le N/2$$

$$b_i \ge \frac{F_s}{N}$$
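A uniform filter bank places its Q center frequencies at integer multiples of F_s/N, with Q no larger than N/2 and each bandwidth at least F_s/N. The sketch below computes those values, reading F_s as the sampling rate and N as the number of uniformly spaced filters needed to span it; that reading, the function name, and the example numbers are assumptions rather than anything stated on the slide.

```python
def uniform_filter_bank(fs_hz, n, q):
    """Return (center frequencies f_1..f_Q, minimum bandwidth F_s/N)."""
    if not 1 <= q <= n / 2:
        raise ValueError("need 1 <= Q <= N/2")
    spacing = fs_hz / n                       # F_s / N
    centers = [spacing * i for i in range(1, q + 1)]
    return centers, spacing

# Hypothetical setup: 8 kHz sampling, N = 32, Q = 15 filters.
centers, min_bandwidth = uniform_filter_bank(fs_hz=8000.0, n=32, q=15)
```

With these numbers the filters sit 250 Hz apart, from 250 Hz up to 3750 Hz.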

Page 16: 2.5.4.1 Basics of Neural Networks

16

Nonuniform Filter Banks

$$b_1 = C$$

$$b_i = \alpha\, b_{i-1}, \qquad 2 \le i \le Q$$

$$f_i = f_1 + \sum_{j=1}^{i-1} b_j + \frac{b_i - b_1}{2}$$

Page 17: 2.5.4.1 Basics of Neural Networks

17

Nonuniform Filter Banks

Filter 1: $f_1 = 300$ Hz, $b_1 = 200$ Hz

Filter 2: $f_2 = 600$ Hz, $b_2 = 400$ Hz

Filter 3: $f_3 = 1200$ Hz, $b_3 = 800$ Hz

Filter 4: $f_4 = 2400$ Hz, $b_4 = 1600$ Hz
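The four filters listed here are what a logarithmically spaced bank gives when each bandwidth is a fixed multiple of the previous one and each center frequency is offset so that adjacent bands touch. A short sketch that reproduces the listed values, assuming a first bandwidth of 200 Hz, a first center frequency of 300 Hz, and a growth factor of 2 (the function and parameter names are illustrative):

```python
def nonuniform_filter_bank(f1_hz, c_hz, alpha, q):
    """Return [(f_i, b_i)] with b_1 = C, b_i = alpha * b_{i-1},
    and f_i = f_1 + sum_{j<i} b_j + (b_i - b_1) / 2."""
    bandwidths = [c_hz]
    for _ in range(1, q):
        bandwidths.append(alpha * bandwidths[-1])
    centers = []
    for i in range(q):  # i is zero-based, so bandwidths[:i] holds b_1..b_{i-1}
        centers.append(f1_hz + sum(bandwidths[:i])
                       + (bandwidths[i] - bandwidths[0]) / 2)
    return list(zip(centers, bandwidths))

bank = nonuniform_filter_bank(f1_hz=300.0, c_hz=200.0, alpha=2.0, q=4)
for number, (f_i, b_i) in enumerate(bank, start=1):
    print(f"Filter {number}: f = {f_i:.0f} Hz, b = {b_i:.0f} Hz")
```

A growth factor of 2 corresponds to octave-wide bands; other factors give finer spacings.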

Page 18: 2.5.4.1 Basics of Neural Networks

18

3.2.1 Types of Filter Bank Used for Speech Recognition

Page 19: 2.5.4.1 Basics of Neural Networks

19

3.2.1 Types of Filter Bank Used for Speech Recognition

Page 20: 2.5.4.1 Basics of Neural Networks

20

3.2.2 Implementations of Filter Banks

Instead of direct convolution, which is computationally expensive, we assume each bandpass filter impulse response to be represented by:

$$h_i(n) = w(n)\, e^{j\omega_i n}$$

where w(n) is a fixed lowpass filter
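Modulating one fixed lowpass filter w(n) by a complex exponential at frequency ω_i shifts its passband to ω_i, so every bandpass filter in the bank can share the same w(n). A minimal sketch of that idea follows, assuming a 64-tap Hamming window as the lowpass filter and two test tones; none of these specifics come from the slide.

```python
import cmath
import math

def hamming(length):
    """Hamming window, used here as an assumed lowpass prototype w(n)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def bandpass_from_lowpass(w, omega_i):
    """h_i(n) = w(n) * exp(j * omega_i * n)."""
    return [w_n * cmath.exp(1j * omega_i * n) for n, w_n in enumerate(w)]

def filter_last_output(h, s):
    """One point of the convolution: the filter output at the final sample of s."""
    last = len(s) - 1
    return sum(h[m] * s[last - m] for m in range(len(h)) if last - m >= 0)

w = hamming(64)
omega_i = 2 * math.pi * 0.125                    # band center, radians/sample
h_i = bandpass_from_lowpass(w, omega_i)

in_band = [math.cos(omega_i * n) for n in range(256)]
out_of_band = [math.cos(2 * math.pi * 0.375 * n) for n in range(256)]
gain_in = abs(filter_last_output(h_i, in_band))
gain_out = abs(filter_last_output(h_i, out_of_band))
```

The tone at the band center comes through far stronger than the tone at 0.375 cycles per sample, which is the bandpass behavior the representation is meant to give.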

Page 21: 2.5.4.1 Basics of Neural Networks

21

3.2.2 Implementations of Filter Banks

Page 22: 2.5.4.1 Basics of Neural Networks

22

3.2.2.1 Frequency Domain Interpretation of the Short-Time Fourier Transform

Page 23: 2.5.4.1 Basics of Neural Networks

23

3.2.2.1 Frequency Domain Interpretation of the Short-Time Fourier Transform

Page 24: 2.5.4.1 Basics of Neural Networks

24

3.2.2.1 Frequency Domain Interpretation of the Short-Time Fourier Transform

Page 25: 2.5.4.1 Basics of Neural Networks

25

3.2.2.1 Frequency Domain Interpretation of the Short-Time Fourier Transform

Page 26: 2.5.4.1 Basics of Neural Networks

26

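Linear Filter Interpretation of the STFT

[Figure: block diagram in which s(n) is modulated by a complex exponential at frequency $\omega_i$ to give $\tilde{s}(n)$, which is then filtered by w(n) to produce $S_n(e^{j\omega_i})$.]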

Page 27: 2.5.4.1 Basics of Neural Networks

27

3.2.2.4 FFT Implementation of a Uniform Filter Bank

Page 28: 2.5.4.1 Basics of Neural Networks

28

Direct implementation of an arbitrary filter bank

[Figure: block diagram in which the input s(n) feeds Q parallel filters $h_1(n), h_2(n), \ldots, h_Q(n)$, producing the outputs $X_1(n), X_2(n), \ldots, X_Q(n)$.]

Page 29: 2.5.4.1 Basics of Neural Networks

29

3.2.2.5 Nonuniform FIR Filter Bank Implementations

Page 30: 2.5.4.1 Basics of Neural Networks

30

3.2.2.7 Tree Structure Realizations of Nonuniform Filter Banks

Page 31: 2.5.4.1 Basics of Neural Networks

31

3.2.4 Practical Examples of Speech-Recognition Filter Banks

Page 32: 2.5.4.1 Basics of Neural Networks

32

3.2.4 Practical Examples of Speech-Recognition Filter Banks

Page 33: 2.5.4.1 Basics of Neural Networks

33

3.2.4 Practical Examples of Speech-Recognition Filter Banks

Page 34: 2.5.4.1 Basics of Neural Networks

34

3.2.4 Practical Examples of Speech-Recognition Filter Banks

Page 35: 2.5.4.1 Basics of Neural Networks

35

3.2.5 Generalizations of Filter-Bank Analyzer

Page 36: 2.5.4.1 Basics of Neural Networks

36

3.2.5 Generalizations of Filter-Bank Analyzer

Page 37: 2.5.4.1 Basics of Neural Networks

37

3.2.5 Generalizations of Filter-Bank Analyzer

Page 38: 2.5.4.1 Basics of Neural Networks

38

3.2.5 Generalizations of Filter-Bank Analyzer

Page 39: 2.5.4.1 Basics of Neural Networks

39

Page 40: 2.5.4.1 Basics of Neural Networks

40

Page 41: 2.5.4.1 Basics of Neural Networks

41

Page 42: 2.5.4.1 Basics of Neural Networks

42

Page 43: 2.5.4.1 Basics of Neural Networks

43

Page 44: 2.5.4.1 Basics of Neural Networks

44

Page 45: 2.5.4.1 Basics of Neural Networks

45

Page 46: 2.5.4.1 Basics of Neural Networks

46

Mel-cepstrum method

[Block diagram: time signal → framing → |FFT|^2 → Mel-scaling → Logarithm → IDCT → low-order coefficients → Cepstra; a differentiator then yields the Delta & Delta-Delta Cepstra.]
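The stages in this block diagram can be sketched end to end with NumPy. Everything specific below is an assumption for illustration: the Hamming window, the frame length and step, triangular mel filters built from the common 2595·log10(1 + f/700) mapping, and a DCT-II basis standing in for the IDCT box. The differentiator stage is left out.

```python
import numpy as np

def hz_to_mel(f_hz):
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs_hz):
    """Triangular filters evenly spaced on the mel scale, shape (n_filters, n_fft//2 + 1)."""
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(fs_hz / 2.0), n_filters + 2))
    bin_hz = np.arange(n_fft // 2 + 1) * fs_hz / n_fft
    bank = np.zeros((n_filters, bin_hz.size))
    for k in range(n_filters):
        lo, mid, hi = edges_hz[k], edges_hz[k + 1], edges_hz[k + 2]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        bank[k] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mel_cepstra(signal, fs_hz, frame_len=256, step=128, n_filters=20, n_ceps=12):
    """Framing -> |FFT|^2 -> mel scaling -> log -> cosine transform -> low-order coefficients."""
    window = np.hamming(frame_len)
    bank = mel_filterbank(n_filters, frame_len, fs_hz)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                    np.arange(n_filters) + 0.5) / n_filters)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, step):
        frame = signal[start:start + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        log_mel = np.log(bank @ power + 1e-10)   # small floor avoids log(0)
        rows.append(basis @ log_mel)             # keep only n_ceps coefficients
    return np.array(rows)

fs_hz = 8000
t = np.arange(fs_hz) / fs_hz
signal = np.sin(2 * np.pi * 440.0 * t)           # one second of a test tone
ceps = mel_cepstra(signal, fs_hz)
```

Each row of `ceps` is the cepstral vector for one frame; keeping only the first few coefficients is the "low-order coefficients" step.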

Page 47: 2.5.4.1 Basics of Neural Networks

47

Time-Frequency analysis

Short-term Fourier Transform

Standard way of frequency analysis: decompose the incoming signal into the constituent frequency components.

W(n): windowing function

N: frame length

p: step size
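The three quantities named on this slide (a window, a frame length N, and a step size p) are all a basic short-term Fourier transform needs. A minimal NumPy sketch, assuming a Hamming window and an 8 kHz test tone that are not part of the slide:

```python
import numpy as np

def stft(signal, window, step):
    """Rows are frames; columns are DFT bins 0..N/2 of each windowed frame."""
    n = len(window)
    starts = range(0, len(signal) - n + 1, step)
    return np.array([np.fft.rfft(signal[s:s + n] * window) for s in starts])

fs_hz = 8000
N = 256                      # frame length
p = 128                      # step size
w = np.hamming(N)            # windowing function (assumed shape)

t = np.arange(2048) / fs_hz
signal = np.sin(2 * np.pi * 1000.0 * t)
spectrum = stft(signal, w, p)

peak_bin = int(np.argmax(np.abs(spectrum[0])))
peak_hz = peak_bin * fs_hz / N
```

A smaller step gives more frames and finer time resolution; a longer frame gives finer frequency resolution at the cost of time resolution.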

Page 48: 2.5.4.1 Basics of Neural Networks

48

Critical band integration

Related to the masking phenomenon: the threshold of a sinusoid is elevated when its frequency is close to the center frequency of a narrow-band noise

Frequency components within a critical band are not resolved. The auditory system interprets the signals within a critical band as a whole

Page 49: 2.5.4.1 Basics of Neural Networks

49

Bark scale

Page 50: 2.5.4.1 Basics of Neural Networks

50

Feature orthogonalization

Spectral values in adjacent frequency channels are highly correlated

The correlation results in a Gaussian model with many parameters: all the elements of the covariance matrix have to be estimated

Decorrelation is useful to improve the parameter estimation.
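The saving from decorrelation can be made concrete by counting parameters. A Gaussian over D correlated features needs a full symmetric covariance matrix with D(D+1)/2 free entries, while decorrelated features need only the D diagonal variances. The sketch below uses a hypothetical feature dimension of 39; that number is an assumption, not a value from the slides.

```python
def gaussian_parameter_count(dim, diagonal):
    """Free parameters of one Gaussian: mean plus covariance."""
    mean_params = dim
    cov_params = dim if diagonal else dim * (dim + 1) // 2
    return mean_params + cov_params

dim = 39  # hypothetical feature dimension
full = gaussian_parameter_count(dim, diagonal=False)
diag = gaussian_parameter_count(dim, diagonal=True)
print(f"full covariance: {full} parameters, diagonal: {diag} parameters")
```

With the same amount of training data, the diagonal model has roughly ten times fewer values to estimate here.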

Page 51: 2.5.4.1 Basics of Neural Networks

51

Cepstrum

Computed as the inverse Fourier transform of the log magnitude of the Fourier transform of the signal

The log magnitude is real and symmetric -> the transform is equivalent to the Discrete Cosine Transform.

Approximately decorrelated
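The claim that the inverse transform reduces to a cosine transform can be checked numerically. In the sketch below, the random test frame and the small floor added before the logarithm are assumptions. The inverse FFT of the log magnitude has a vanishing imaginary part and matches a cosine-only sum, because the sine terms cancel for a real, symmetric sequence.

```python
import numpy as np

def real_cepstrum(frame):
    """Inverse FFT of the log magnitude of the FFT of the frame."""
    log_magnitude = np.log(np.abs(np.fft.fft(frame)) + 1e-12)
    return np.fft.ifft(log_magnitude)

rng = np.random.default_rng(0)
frame = rng.standard_normal(64)          # stand-in for one speech frame
cepstrum = real_cepstrum(frame)

# Same quantity written as a cosine-only sum over the log magnitude.
n = len(frame)
log_mag = np.log(np.abs(np.fft.fft(frame)) + 1e-12)
k = np.arange(n)
cosine_sum = np.array([np.sum(log_mag * np.cos(2 * np.pi * k * q / n)) / n
                       for q in range(n)])

max_imag = float(np.max(np.abs(cepstrum.imag)))
```

The cosine-only form is the sense in which the slide equates the inverse transform with a Discrete Cosine Transform.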

Page 52: 2.5.4.1 Basics of Neural Networks

52

Principal Component Analysis

Find an orthogonal basis such that the reconstruction error over the training set is minimized

This turns out to be equivalent to diagonalizing the sample autocovariance matrix

Complete decorrelation

Computes the principal dimensions of variability, but does not necessarily provide the optimal discrimination among classes

Page 53: 2.5.4.1 Basics of Neural Networks

53

Principal Component Analysis (PCA)

Mathematical procedure that transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables called principal components (PC)

Find an orthogonal basis such that the reconstruction error over the training set is minimized

This turns out to be equivalent to diagonalizing the sample autocovariance matrix

Complete decorrelation

Computes the principal dimensions of variability, but does not necessarily provide the optimal discrimination among classes

Page 54: 2.5.4.1 Basics of Neural Networks

54

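PCA (Cont.)

Algorithm

Input = $x_{N \times M}$ (N-dim vectors)

Covariance matrix:

$$\mathrm{Cov}(x) = \frac{1}{M} \sum_{i=1}^{M} (x_i - \bar{x})(x_i - \bar{x})^T$$

Eigen values and Eigen vectors: $\mathrm{EigVal}_i$, $\mathrm{EigVec}_i$, $i = 1 \ldots N$, with $\mathrm{EigVal}_1 \ge \mathrm{EigVal}_2 \ge \ldots$

Transform matrix (rows ordered by decreasing eigenvalue; keeping only the first R rows gives R-dimensional outputs):

$$F = \begin{bmatrix} \mathrm{EigVec}_1 \\ \mathrm{EigVec}_2 \\ \vdots \\ \mathrm{EigVec}_N \end{bmatrix}$$

Apply Transform:

$$y = F x$$

Output = $y_{R \times M}$ (R-dim vectors)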
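The algorithm on this slide maps directly onto a few NumPy calls. The sketch below takes an N × M input with samples in columns, normalizes the covariance by M, and keeps the R eigenvectors with the largest eigenvalues as the rows of F. The synthetic data and the choice R = 2 are assumptions for illustration.

```python
import numpy as np

def pca_transform(x, r):
    """x is N x M (one sample per column). Returns F (R x N) and y = F x (R x M)."""
    centered = x - x.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / x.shape[1]      # N x N covariance matrix
    eig_val, eig_vec = np.linalg.eigh(cov)        # eigh returns ascending order
    order = np.argsort(eig_val)[::-1]             # largest eigenvalue first
    f = eig_vec[:, order[:r]].T                   # rows are the top-R eigenvectors
    return f, f @ x

# Hypothetical correlated data: 3-dimensional vectors, 500 samples.
rng = np.random.default_rng(0)
mix = np.array([[2.0, 0.5, 0.0],
                [0.5, 1.0, 0.3],
                [0.0, 0.3, 0.2]])
x = mix @ rng.standard_normal((3, 500))

F, y = pca_transform(x, r=2)
cov_y = np.cov(y, bias=True)                      # covariance of the outputs
```

The off-diagonal entry of `cov_y` is zero up to rounding, which is the complete decorrelation the slides refer to.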

Page 55: 2.5.4.1 Basics of Neural Networks

55

PCA (Cont.)

PCA in speech recognition systems

Page 56: 2.5.4.1 Basics of Neural Networks

56

Linear Discriminant Analysis

Find an orthogonal basis such that the ratio of the between-class variance to the within-class variance is maximized

This also turns out to be a general eigenvalue-eigenvector problem

Complete decorrelation

Provides the optimal linear separability under quite restrictive assumptions
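One common way to set up the eigenvalue problem mentioned here is to build a within-class scatter matrix S_w and a between-class scatter matrix S_b, then take the eigenvectors of S_w⁻¹ S_b. The sketch below follows that formulation, which is an assumption since the slide gives no equations. The two synthetic classes differ only along the low-variance axis, the case where maximizing total variance alone would pick the wrong direction.

```python
import numpy as np

def lda_directions(x, labels):
    """x is M x N (one sample per row). Returns eigenvalues and directions, best first."""
    overall_mean = x.mean(axis=0)
    n_dim = x.shape[1]
    s_w = np.zeros((n_dim, n_dim))                # within-class scatter
    s_b = np.zeros((n_dim, n_dim))                # between-class scatter
    for c in np.unique(labels):
        x_c = x[labels == c]
        mean_c = x_c.mean(axis=0)
        centered = x_c - mean_c
        s_w += centered.T @ centered
        diff = (mean_c - overall_mean)[:, None]
        s_b += x_c.shape[0] * (diff @ diff.T)
    eig_val, eig_vec = np.linalg.eig(np.linalg.solve(s_w, s_b))
    order = np.argsort(eig_val.real)[::-1]
    return eig_val.real[order], eig_vec.real[:, order]

# Hypothetical data: large spread along axis 0, class means differ along axis 1.
rng = np.random.default_rng(1)
class_a = rng.standard_normal((200, 2)) * np.array([5.0, 0.5])
class_b = rng.standard_normal((200, 2)) * np.array([5.0, 0.5]) + np.array([0.0, 3.0])
x = np.vstack([class_a, class_b])
labels = np.array([0] * 200 + [1] * 200)

ratios, directions = lda_directions(x, labels)
best = directions[:, 0]
```

The leading direction points almost entirely along axis 1, where the classes separate, even though axis 0 carries most of the variance.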

Page 57: 2.5.4.1 Basics of Neural Networks

57

PCA vs. LDA

Page 58: 2.5.4.1 Basics of Neural Networks

58

Spectral smoothing

Formant information is crucial for recognition

Enhance and preserve the formant information:

Truncating the number of cepstral coefficients

Linear prediction: peak-hugging property

Page 59: 2.5.4.1 Basics of Neural Networks

59

Temporal processing

To capture the temporal features of the spectral envelope; to provide robustness:

Delta Feature: first and second order differences; regression

Cepstral Mean Subtraction: for normalizing for channel effects and adjusting for spectral slope
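Both operations on this slide are short array manipulations. The sketch below assumes a features-by-time layout of one frame per row, a regression window of ±2 frames, and the widely used regression form d_t = Σ k (c_{t+k} − c_{t−k}) / (2 Σ k²). The slide names the regression option without giving a formula, so that form is an assumption.

```python
import numpy as np

def cepstral_mean_subtraction(cepstra):
    """Subtract the per-coefficient mean over time (removes a constant channel offset)."""
    return cepstra - cepstra.mean(axis=0, keepdims=True)

def delta_regression(features, k_max=2):
    """Regression-based delta over +/- k_max frames, with edge frames repeated."""
    padded = np.pad(features, ((k_max, k_max), (0, 0)), mode="edge")
    n_frames = features.shape[0]
    numerator = np.zeros_like(features, dtype=float)
    for k in range(1, k_max + 1):
        ahead = padded[k_max + k:k_max + k + n_frames]
        behind = padded[k_max - k:k_max - k + n_frames]
        numerator += k * (ahead - behind)
    return numerator / (2.0 * sum(k * k for k in range(1, k_max + 1)))

# Hypothetical trajectories: two coefficients changing linearly over 10 frames.
t = np.arange(10.0)
cepstra = np.stack([3.0 * t + 10.0, -1.0 * t + 5.0], axis=1)

delta = delta_regression(cepstra)
delta_delta = delta_regression(delta)
channel_offset = np.array([7.0, -2.0])
normalized = cepstral_mean_subtraction(cepstra + channel_offset)
```

Away from the edges the deltas recover the slopes 3 and -1, and the normalized features are identical with or without the constant offset.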

Page 60: 2.5.4.1 Basics of Neural Networks

60

RASTA (RelAtive SpecTral Analysis)

Filtering of the temporal trajectories of some function of each of the spectral values; to provide more reliable spectral features

This is usually a bandpass filter, maintaining the linguistically important spectral envelope modulation (1-16 Hz)
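A bandpass filter applied along time to each spectral trajectory can be sketched in a few lines. The coefficients below follow a frequently quoted RASTA transfer function, H(z) = 0.1 (2 + z⁻¹ − z⁻³ − 2z⁻⁴) / (1 − 0.98 z⁻¹). Those coefficients, the pole value, and the 100 frames-per-second rate are assumptions, not values given on the slide.

```python
import math

def rasta_filter(trajectory, pole=0.98):
    """Filter one temporal trajectory with an assumed RASTA-style bandpass filter."""
    numerator = [0.2, 0.1, 0.0, -0.1, -0.2]     # taps sum to zero, so DC is blocked
    out = []
    prev_y = 0.0
    for n in range(len(trajectory)):
        fir = sum(b * trajectory[n - k]
                  for k, b in enumerate(numerator) if n - k >= 0)
        prev_y = fir + pole * prev_y            # single-pole recursive part
        out.append(prev_y)
    return out

frame_rate_hz = 100                              # assumed frames per second

# A constant offset (e.g. a fixed channel effect in the log domain) is removed.
constant = rasta_filter([5.0] * 2000)

# A 4 Hz modulation, inside the 1-16 Hz range, is largely preserved.
modulated = rasta_filter([math.sin(2 * math.pi * 4.0 * n / frame_rate_hz)
                          for n in range(2000)])
steady_peak = max(abs(v) for v in modulated[-100:])
```

Because the numerator taps sum to zero, anything constant over time is suppressed, while modulations at syllable-like rates pass through.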

Page 61: 2.5.4.1 Basics of Neural Networks

61

Page 62: 2.5.4.1 Basics of Neural Networks

62

RASTA-PLP

Page 63: 2.5.4.1 Basics of Neural Networks

63

Page 64: 2.5.4.1 Basics of Neural Networks

64