Tutorial on Neural Network Models for Speech and Image Processing

B. Yegnanarayana
Speech & Vision Laboratory
Dept. of Computer Science & Engineering
IIT Madras, Chennai-600036
[email protected]

WCCI 2002, Honolulu, Hawaii, USA, May 12, 2002


TRANSCRIPT

Page 1

Tutorial on Neural Network Models for Speech and Image Processing

B. Yegnanarayana
Speech & Vision Laboratory
Dept. of Computer Science & Engineering
IIT Madras, Chennai-600036

[email protected]

WCCI 2002, Honolulu, Hawaii, USA
May 12, 2002

Page 2

Need for New Models of Computing for Speech & Image Tasks

• Speech & Image processing tasks

• Issues in dealing with these tasks by human beings

• Issues in dealing with the tasks by machine

• Need for new models of computing in dealing with natural signals

• Need for effective (relevant) computing

• Role of Artificial Neural Networks (ANN)

Page 3

Organization of the Tutorial

Part I Feature extraction and classification problems with speech and image data

Part II Basics of ANN

Part III ANN models for feature extraction and classification

Part IV Applications in speech and image processing

Page 4

PART I

Feature Extraction and Classification Problems in Speech and Image

Page 5

Feature Extraction and Classification Problems in Speech and Image

• Distinction between natural and synthetic signals (unknown model vs known model generating the signal)

• Nature of speech and image data (non-repetitive data, but repetitive features)

• Need for feature extraction and classification

• Methods for feature extraction and models for classification

• Need for nonlinear approaches (methods and models)

Page 6

Speech vs Audio

• Audio (audible) signals (noise, music, speech and other signals)

• Categories of audio signals

– Audio signal vs non-signal (noise)

– Signal from speech production mechanism vs other audio signals

– Non-speech vs speech signals (like with natural language)

Page 7

Speech Production Mechanism

Page 8

Different types of sounds

Page 9

Categorization of sound units

Page 10

Nature of Speech Signal

• Digital speech: Sequence of samples or numbers

• Waveform for word “MASK” (Figure)

• Characteristics of speech signal
  – Excitation source characteristics
  – Vocal tract system characteristics

Page 11

Waveform for the word “mask”

Page 12

Source-System Model of Speech Production

(Block diagram: an impulse-train generator controlled by the pitch period and a random-noise generator feed a voiced/unvoiced switch; the selected excitation u(n), scaled by the gain G, drives a time-varying digital filter with the vocal tract parameters to produce the speech signal s(n))
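Since the slides only name the components, here is a minimal numpy/scipy sketch of this source-system model; the sampling rate, pitch period, gain and all-pole coefficients are illustrative assumptions, not values from the tutorial.

```python
import numpy as np
from scipy.signal import lfilter

fs = 8000                               # sampling rate (Hz), assumed
n = np.arange(int(0.5 * fs))            # 0.5 s of samples

def excitation(voiced, pitch_period=80):
    """Impulse train for voiced sounds, random noise for unvoiced."""
    if voiced:
        u = np.zeros(len(n))
        u[::pitch_period] = 1.0         # one impulse per pitch period
        return u
    return np.random.randn(len(n))      # random-noise generator

G = 0.5                                 # gain
a = [1.0, -1.3, 0.8]                    # illustrative all-pole (vocal tract) coefficients
u = excitation(voiced=True)
s = lfilter([G], a, u)                  # excitation passed through the digital filter
```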

Page 13

Features from Speech Signal (demo)

• Different components of speech (speech, source and system)

• Different speech sound units (Alphabet in Indian Languages)

• Different emotions

• Different speakers

Page 14

Speech Signal Processing Methods

• To extract source-system features and suprasegmental features

• Production-based features

• DSP-based features

• Perception-based features

Page 15

Models for Matching and Classification

• Dynamic Time Warping (DTW)

• Hidden Markov Models (HMM)

• Gaussian Mixture Models (GMM)
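As an illustration of the first matching model listed above, a minimal dynamic time warping sketch follows; the Euclidean local cost and symmetric step pattern are common choices assumed here, not prescribed by the tutorial.

```python
import numpy as np

def dtw_distance(x, y):
    """DTW distance between feature sequences x (Tx, d) and y (Ty, d)."""
    Tx, Ty = len(x), len(y)
    D = np.full((Tx + 1, Ty + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Tx + 1):
        for j in range(1, Ty + 1):
            cost = np.linalg.norm(x[i - 1] - y[j - 1])   # local distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[Tx, Ty]
```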

Page 16

Applications of Speech Processing

• Speech recognition

• Speaker recognition/verification

• Speech enhancement

• Speech compression

• Audio indexing and retrieval

Page 17

Limitations of Feature Extraction Methods and Classification Models

• Fixed frame analysis

• Variability in the implicit pattern

• Not pattern-based analysis

• Temporal nature of the patterns

Page 18

Need for New Approaches

• To deal with ambiguity and variability in the data for feature extraction

• To combine evidence from multiple sources (classifiers and knowledge sources)

Page 19

Images

• Digital image: Matrix of numbers
• Types of images
  – Line sketches, binary, gray level and color
  – Still images, video, multimedia

Page 20

Image Analysis

• Feature extraction
• Image segmentation: Gray level, color, texture
• Image classification

Page 21

Processing of Texture-like Images: 2-D Gabor Filter

A typical Gaussian filter with $\sigma = 30$; a typical Gabor filter with $\sigma = 30$, $\omega = 3.14$ and $\theta = 45^{\circ}$.

The 2-D Gabor filter is a Gaussian envelope modulated by a complex sinusoid:

$f(x, y, \sigma_x, \sigma_y, \omega, \theta) = \frac{1}{2\pi\sigma_x\sigma_y}\, \exp\!\left[-\frac{1}{2}\left(\left(\frac{x}{\sigma_x}\right)^{2} + \left(\frac{y}{\sigma_y}\right)^{2}\right)\right] \exp\!\left[\,j\omega\,(x\cos\theta + y\sin\theta)\right]$
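A small numpy sketch of such a kernel, using the parameter values quoted on the slide; the isotropic sigma and the grid size are simplifying assumptions.

```python
import numpy as np

def gabor_kernel_2d(sigma=30.0, omega=3.14, theta_deg=45.0, size=121):
    """Complex 2-D Gabor kernel: Gaussian envelope times a complex sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    theta = np.deg2rad(theta_deg)
    carrier = np.exp(1j * omega * (x * np.cos(theta) + y * np.sin(theta)))
    return envelope * carrier           # use real/imaginary parts or the magnitude
```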

Page 22

Limitations

• Feature extraction
• Matching
• Classification methods/models

Page 23

Need for New Approaches

• Feature extraction: PCA and nonlinear PCA

• Matching: Stereo images

• Smoothing: Using the knowledge of image and not noise

• Edge extraction and classification: Integration of global and local information or combining evidence

Page 24

PART II

Basics of ANN

Page 25

• Problem solving: Pattern recognition tasks by human and machine

• Pattern vs data

• Pattern processing vs data processing

• Architectural mismatch

• Need for new models of computing

Artificial Neural Networks

Page 26

Biological Neural Networks

• Structure and function: Neurons, interconnections, dynamics for learning and recall

• Features: Robustness, fault tolerance, flexibility, ability to deal with variety of data situations, collective computation

• Comparison with computers: Speed, processing, size and complexity, fault tolerance, control mechanism

• Parallel and Distributed Processing (PDP) models

Page 27

Basics of ANN

• ANN terminology: Processing unit (fig), interconnection, operation and update (input, weights, activation value, output function, output value)

• Models of neurons: MP neuron, perceptron and adaline

• Topology (fig)
• Basic learning laws (fig)

Page 28

Model of a Neuron

Page 29

Topology

Page 30

Basic Learning Laws

Page 31

Activation and Synaptic Dynamic Models

• General activation dynamics model:

  $\dot{x}_i(t) = -A_i x_i(t) + \big(B_i - C_i x_i(t)\big)\big(I_i + f_i(x_i(t))\big) - \big(E_i + D_i x_i(t)\big)\Big(J_i + \sum_{j \ne i} f_j(x_j(t))\, w_{ij}\Big)$

  (passive decay term, excitatory term, inhibitory term)

• Synaptic dynamics model:

  $\dot{w}_{ij}(t) = -w_{ij}(t) + s_i(t)\, s_j(t)$

  (passive decay term and correlation term)

• Stability and convergence

Page 32

Functional Units and Pattern Recognition Tasks

• Feedforward ANN
  – Pattern association
  – Pattern classification
  – Pattern mapping/classification

• Feedback ANN
  – Autoassociation
  – Pattern storage (LTM)
  – Pattern environment storage (LTM)

• Feedforward and Feedback (Competitive Learning) ANN
  – Pattern storage (STM)
  – Pattern clustering
  – Feature map

Page 33

Two Layer Feedforward Neural Network (FFNN)

Page 34

PR Tasks by FFNN

• Pattern association
  – Architecture: Two layers, linear processing, single set of weights
  – Learning: Hebb's (orthogonal) rule, Delta (linearly independent) rule
  – Recall: Direct
  – Limitation: Linear independence, number of patterns restricted to input dimensionality
  – To overcome: Nonlinear processing units, leads to a pattern classification problem

• Pattern classification
  – Architecture: Two layers, nonlinear processing units, geometrical interpretation
  – Learning: Perceptron learning
  – Recall: Direct
  – Limitation: Linearly separable functions, cannot handle hard problems
  – To overcome: More layers, leads to a hard learning problem

• Pattern mapping/classification
  – Architecture: Multilayer (hidden), nonlinear processing units, geometrical interpretation
  – Learning: Generalized delta rule (backpropagation)
  – Recall: Direct
  – Limitation: Slow learning, does not guarantee convergence
  – To overcome: More complex architecture

Page 35

Perceptron Network

• Perceptron classification problem
• Perceptron learning law (a sketch follows below)
• Perceptron convergence theorem
• Perceptron representation problem
• Multilayer perceptron
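A minimal sketch of the perceptron learning law (weights updated only on misclassified examples, with the bias folded into the weight vector); the data, labels and learning rate are placeholders.

```python
import numpy as np

def train_perceptron(X, y, eta=1.0, epochs=100):
    """X: (N, d) inputs, y: (N,) labels in {-1, +1}; returns weights incl. bias."""
    Xb = np.hstack([X, np.ones((len(X), 1))])     # append bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * np.dot(w, xi) <= 0:           # misclassified
                w += eta * yi * xi                # perceptron update
    return w
```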

Page 36

Geometric Interpretation of Perceptron Learning

Page 37

Generalized Delta Rule (Backpropagation Learning)

Output layer: $\Delta w_{kj} = \eta\, \delta_k^{o}\, s_j^{h}$, where $\delta_k^{o} = (b_k - s_k^{o})\, \dot{f}_k^{o}$

Hidden layer: $\Delta w_{ji} = \eta\, \delta_j^{h}\, a_i$, where $\delta_j^{h} = \dot{f}_j^{h} \sum_{k=1}^{K} \delta_k^{o}\, w_{kj}$
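A minimal numpy sketch of one update of the generalized delta rule above, for a single hidden layer with logistic units; the learning rate, layer sizes and squared-error criterion are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(a, b, W_h, W_o, eta=0.1):
    """a: input vector, b: desired output; returns updated (W_h, W_o)."""
    s_h = sigmoid(W_h @ a)                          # hidden-layer outputs
    s_o = sigmoid(W_o @ s_h)                        # output-layer outputs
    delta_o = (b - s_o) * s_o * (1 - s_o)           # delta_k = (b_k - s_k) f'
    delta_h = (W_o.T @ delta_o) * s_h * (1 - s_h)   # backpropagated error
    W_o = W_o + eta * np.outer(delta_o, s_h)        # dw_kj = eta * delta_k * s_j
    W_h = W_h + eta * np.outer(delta_h, a)          # dw_ji = eta * delta_j * a_i
    return W_h, W_o
```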

Page 38

Issues in Backpropagation Learning

• Description and features of error backpropagation

• Performance of backpropagation learning
• Refinements of backpropagation learning
• Interpretation of results of learning
• Generalization
• Tasks with backpropagation network
• Limitations of backpropagation learning
• Extensions to backpropagation

Page 39

PR Tasks by FBNN

• Autoassociation
  – Architecture: Single layer with feedback, linear processing units
  – Learning: Hebb (orthogonal inputs), Delta (linearly independent inputs)
  – Recall: Activation dynamics until stable states are reached
  – Limitation: No accretive behavior
  – To overcome: Nonlinear processing units, leads to a pattern storage problem

• Pattern Storage
  – Architecture: Feedback neural network, nonlinear processing units, states, Hopfield energy analysis
  – Learning: Not important
  – Recall: Activation dynamics until stable states are reached
  – Limitation: Hard problems, limited number of patterns, false minima
  – To overcome: Stochastic update, hidden units

• Pattern Environment Storage
  – Architecture: Boltzmann machine, nonlinear processing units, hidden units, stochastic update
  – Learning: Boltzmann learning law, simulated annealing
  – Recall: Activation dynamics, simulated annealing
  – Limitation: Slow learning
  – To overcome: Different architecture

Page 40

Hopfield Model

• Model
• Pattern storage condition:

  $\operatorname{sgn}\Big(\sum_{j} w_{ij}\, a_{kj}\Big) = a_{ki}$, where $w_{ij} = \frac{1}{N}\sum_{l=1}^{L} a_{li}\, a_{lj}$, for $i = 1, \ldots, N$ and $k = 1, \ldots, L$

• Capacity of Hopfield model: Number of patterns for a given probability of error

• Energy analysis:

  $V = -\frac{1}{2}\sum_{i}\sum_{j} w_{ij}\, s_i\, s_j$, with $\Delta V \le 0$ for every state update

• Continuous Hopfield model: output function $f(x) = \dfrac{1 - e^{-x}}{1 + e^{-x}}$
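A minimal sketch of Hopfield pattern storage and asynchronous recall consistent with the weight and energy expressions above; the patterns and iteration count are illustrative.

```python
import numpy as np

def store(patterns):
    """patterns: (L, N) array of +/-1 vectors; Hebbian weights, zero diagonal."""
    L, N = patterns.shape
    W = patterns.T @ patterns / N
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, s, iters=20):
    """Asynchronous updates: s_i <- sgn(sum_j w_ij s_j)."""
    s = s.copy()
    for _ in range(iters):
        for i in np.random.permutation(len(s)):
            s[i] = 1 if W[i] @ s >= 0 else -1
    return s

def energy(W, s):
    return -0.5 * s @ W @ s
```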

Page 41

State Transition Diagram

Page 42

Computation of Weights for Pattern Storage

Patterns to be stored (111) and (010).

Results in set of inequalities to be satisfied.

Page 43

Pattern Storage Tasks

• Hard problems: Conflicting requirements on a set of inequalities
• Hidden units: Problem of false minima
• Stochastic update

  Stochastic equilibrium (Boltzmann-Gibbs law): $P(s) = \frac{1}{Z}\, e^{-E(s)/T}$

Page 44

Simulated Annealing

Page 45

Boltzmann Machine

• Pattern environment storage
• Architecture: Visible units, hidden units, stochastic update, simulated annealing
• Boltzmann learning law:

  $\Delta w_{ij} = \frac{\eta}{T}\,\big(p_{ij}^{+} - p_{ij}^{-}\big)$

Page 46

Discussion on Boltzmann Learning

• Expression for Boltzmann learning
  – Significance of $p_{ij}^{+}$ and $p_{ij}^{-}$
  – Learning and unlearning
  – Local property
  – Choice of $\eta$ and initial weights

• Implementation of Boltzmann learning
  – Algorithm for learning a pattern environment
  – Algorithm for recall of a pattern
  – Implementation of simulated annealing
  – Annealing schedule

• Pattern recognition tasks by Boltzmann machine
  – Pattern completion
  – Pattern association
  – Recall from noisy or partial input

• Interpretation of Boltzmann learning
  – Markov property of simulated annealing
  – Clamped-free energy and full energy

• Variations of Boltzmann learning
  – Deterministic Boltzmann machine
  – Mean-field approximation

Page 47

Competitive Learning Neural Network (CLNN)

Output layer with on-center and off-surround connections

Input layer

Page 48

PR Tasks by CLNN

• Pattern storage (STM)
  – Architecture: Two layers (input and competitive), linear processing units
  – Learning: No learning in FF stage, fixed weights in FB layer
  – Recall: Not relevant
  – Limitation: STM, no application, theoretical interest
  – To overcome: Nonlinear output function in FB stage, learning in FF stage

• Pattern clustering (grouping)
  – Architecture: Two layers (input and competitive), nonlinear processing units in the competitive layer
  – Learning: Only in FF stage, competitive learning
  – Recall: Direct in FF stage, activation dynamics until stable state is reached in FB layer
  – Limitation: Fixed (rigid) grouping of patterns
  – To overcome: Train neighbourhood units in competition layer

• Feature map
  – Architecture: Self-organization network, two layers, nonlinear processing units, excitatory neighbourhood units
  – Learning: Weights leading to the neighbourhood units in the competitive layer
  – Recall: Apply input, determine winner
  – Limitation: Only visual features, not quantitative
  – To overcome: More complex architecture

Page 49

Learning Algorithms for PCA networks
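The slide does not fix a particular algorithm; as one standard example, this is a sketch of Oja's rule, a Hebbian learning law whose weight vector converges (for a suitably small eta and zero-mean data) to the first principal component.

```python
import numpy as np

def oja_first_component(X, eta=0.01, epochs=50):
    """X: (N, d) zero-mean data; returns a unit vector near the top principal component."""
    w = np.random.randn(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X:
            y = w @ x
            w += eta * y * (x - y * w)   # Hebbian term with Oja's weight decay
    return w / np.linalg.norm(w)
```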

Page 50

Self Organization Network

(a) Network structure (b) Neighborhood regions at different times in the output layer

Input layer

Output layer

Page 51

Illustration of SOM

Page 52

PART III

ANN Models for Feature Extraction and Classification

Page 53

Neural Network Architecture and Models for Feature Extraction

• Multilayer Feedforward Neural Network (MLFFNN)

• Autoassociative Neural Networks (AANN)

• Constraint Satisfaction Models (CSM)
• Self Organization Map (SOM)
• Time Delay Neural Networks (TDNN)
• Hidden Markov Models (HMM)

Page 54

Multilayer FFNN

• Nonlinear feature extraction followed by linearly separable classification problem

Page 55

• Complex decision hypersurfaces for classification

• Asymptotic approximation of a posteriori class probabilities

Multilayer FFNN

Page 56

• Radial Basis Function NN: Clustering followed by classification

(Figure: input vector a passes through basis functions $\phi_j(a)$, whose outputs are combined to produce the class labels $c_1, \ldots, c_N$)

Radial Basis Function

Page 57

• Architecture
• Nonlinear PCA
• Feature extraction
• Distribution capturing ability

Autoassociation Neural Network (AANN)

Page 58

Autoassociation Neural Network (AANN)

• Architecture

(Figure: input layer – dimension compression hidden layer – output layer)

Page 59

Distribution Capturing Ability of AANN

• Distribution of feature vector (fig)
• Illustration of distribution in 2D case (fig)
• Comparison with Gaussian Mixture Model (fig)

Page 60

Distribution of feature vector

Page 61

(a) Illustration of distribution in 2D case; (b, c) Comparison with Gaussian Mixture Model

Page 62

Feature Extraction by AANN

• Input and output to AANN: Sequence of signal samples (captures dominant 2nd order statistical features)

• Input and output to AANN: Sequence of Residual samples (captures higher order statistical features in the sample sequence)

Page 63

Constraint Satisfaction Model

• Purpose: To satisfy the given (weak) constraints as much as possible

• Structure: Feedback network with units (hypotheses), connections (constraints / knowledge)

• Goodness of fit function: Depends on the output of unit and connection weights

• Relaxation Strategies: Deterministic and Stochastic

Page 64

Application of CS Models

• Combining evidence
• Combining classifier outputs
• Solving optimization problems

Page 65

Self Organization Map (illustrations)

• Organization of 2D input to 1D feature mapping

• Organization of 16 Dimensional LPC vector to obtain phoneme map

• Organization of large document files
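A minimal training sketch for the first illustration above (2-D inputs organized onto a 1-D map); the map size, learning-rate and neighbourhood schedules are assumptions.

```python
import numpy as np

def train_som_1d(X, n_units=20, epochs=50, eta0=0.5, sigma0=3.0):
    """X: (N, 2) inputs; returns an (n_units, 2) codebook arranged along a 1-D map."""
    W = X[np.random.choice(len(X), n_units)]           # initialize from the data
    idx = np.arange(n_units)
    for t in range(epochs):
        eta = eta0 * (1 - t / epochs)                   # decaying learning rate
        sigma = max(sigma0 * (1 - t / epochs), 0.5)     # shrinking neighbourhood
        for x in X[np.random.permutation(len(X))]:
            winner = np.argmin(np.linalg.norm(W - x, axis=1))
            h = np.exp(-((idx - winner) ** 2) / (2 * sigma ** 2))
            W += eta * h[:, None] * (x - W)             # move winner and its neighbours
    return W
```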

Page 66

Time Delay Neural Networks for Temporal Pattern Recognition

Page 67

Stochastic Models for Temporal Pattern Recognition

• Maximum likelihood formulation: Determine the class w, given the observation symbol sequence y, using the criterion $\hat{w} = \arg\max_{w} P(y \mid w)$

• Markov Models

• Hidden Markov Models

Page 68

PART IV

Applications in Speech & Image Processing

Page 69

Applications in Speech and Image Processing

• Edge extraction in texture-like images

• Texture segmentation/classification by CS model

• Road detection from satellite images

• Speech recognition by CS model

• Speaker recognition by AANN model

Page 70

Problem of Edge Extraction in Texture-like Images

• Nature of texture-like images
• Problem of edge extraction
• Preprocessing (1-D) to derive partial evidence
• Combining evidence using CS model

Page 71

• Texture Edges are the locations where there is an abrupt change in texture properties

Problem of Edge Extraction

Image with 4 natural texture regions

Edge map showing micro edges

Edge map showing macro edges

Page 72

1-D processing using Gabor Filter and Difference Operator

• 1-D Gabor smoothing filter: Magnitude and Phase

  1-D Gabor filter: Gaussian modulated by a complex sinusoid

  $f(x, \sigma, \omega) = \frac{1}{\sqrt{2\pi}\,\sigma}\, \exp\!\left(-\frac{x^{2}}{2\sigma^{2}}\right) e^{j\omega x}$

  Even component: $f_c(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, \exp\!\left(-\frac{x^{2}}{2\sigma^{2}}\right) \cos(\omega x)$

  Odd component: $f_s(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, \exp\!\left(-\frac{x^{2}}{2\sigma^{2}}\right) \sin(\omega x)$

Page 73

1-D processing using Gabor filter and Difference operator (contd.)

• Differential operator for edge evidence – first derivative of the 1-D Gaussian function:

  $c(y) = -\frac{y}{\sqrt{2\pi}\,\sigma^{3}}\, \exp\!\left(-\frac{y^{2}}{2\sigma^{2}}\right)$

• Need for a set of Gabor filters

Page 74

Texture Edge Extraction using 1-D Gabor Magnitude and Phase

• Apply 1-D Gabor filter along each of the parallel lines of an image in one direction ( say, horizontal )

• Apply all Gabor filters of the filter bank in a similar way

• For each of the Gabor filtered output, partial edge information is extracted by applying the 1-D differential operator in the orthogonal direction ( say, vertical )

• The entire process is repeated in the orthogonal (vertical and horizontal) directions to obtain the partial edge evidence in the other direction

• The partial edge evidence is combined using a Constraint Satisfaction Neural Network Model
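A compact numpy/scipy sketch of the row-wise Gabor filtering and orthogonal differencing described above, for one filter and one direction; the filter parameters are illustrative and the CSNN combination step is omitted.

```python
import numpy as np
from scipy.ndimage import convolve1d

def gabor_1d(sigma, omega, half=30):
    x = np.arange(-half, half + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return g * np.exp(1j * omega * x)

def d_gaussian_1d(sigma, half=15):
    y = np.arange(-half, half + 1, dtype=float)
    return -y * np.exp(-y**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma**3)

def partial_edge_evidence(image, sigma_g=8.0, omega=0.3, sigma_d=2.0):
    """image: 2-D float array; Gabor magnitude along rows, derivative along columns."""
    h = gabor_1d(sigma_g, omega)
    resp = convolve1d(image, h.real, axis=1) + 1j * convolve1d(image, h.imag, axis=1)
    magnitude = np.abs(resp)                         # 1-D Gabor magnitude along each row
    d = d_gaussian_1d(sigma_d)
    return np.abs(convolve1d(magnitude, d, axis=0))  # edge evidence in the orthogonal direction
```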

Page 75

Texture Edge Extraction using a set of 1-D Gabor Filters

(Block diagram: input image → bank of 1-D Gabor filters → filtered images → post-processing using 1-D differential operator and thresholding → edge evidence → combining the edge evidence using the Constraint Satisfaction Neural Network model → edge map)

Page 76

Structure of 3-D CSNN Model

(Figure: a 3-D lattice of nodes of size I × J × K; +ve connections among the nodes across the layers for each pixel, and -ve connections from a set of neighboring nodes to each node in the same layer)

Combining Evidence using CSNN model

Page 77

Combining the Edge Evidence using Constraint Satisfaction Neural Network (CSNN) Model

• Neural network model contains nodes arranged in a 3-D lattice structure

• Each node corresponds to a pixel in the post-processed Gabor filter output

• Post processed output of a single 1-D Gabor filter is an input to one 2-D layer of nodes

• Different layers of nodes, each corresponding to a particular filter output, are stacked one upon the other to form the 3-D structure

• Each node represents a hypothesis
• Connection between two nodes represents a constraint

• Each node is connected to other nodes with inhibitory and excitatory connections

Page 78

Let $W_{i,j,k,\,i_1,j_1,k}$ represent the weight of the connection from node (i,j,k) to node (i1,j1,k) within each layer k, and let $W_{i,j,k,\,i,j,k_1}$ represent the constraint between the nodes in two different layers (k and k1) in the same column. The within-layer weights take piecewise values of magnitude 1/8 and 1/16, depending on the relative position (i - i1, j - j1) of the two nodes in the neighbourhood, and the across-layer weights are

  $W_{i,j,k,\,i,j,k_1} = \frac{1}{2(K-1)}$

• The node is connected to other nodes in the same column with excitatory connections

Combining Evidence using CSNN model (contd.)

Page 79

• Let $\psi_{i,j,k} \in \{0, 1\}$ denote the output of the node (i,j,k), and the set $\Psi = \{\psi_{i,j,k}\}$ the state of the network

• The state of the neural network model is initialized using:

  $\psi_{i,j,k}(0) = 1$ if the pixel has evidence of an edge pixel, and $0$ otherwise

• In the deterministic relaxation method, the state of the network is updated iteratively by changing the output of one node at a time

• The net input to each node is obtained using:

  $U_{i,j,k}(n) = \sum_{i_1, j_1} W_{i,j,k,\,i_1,j_1,k}\, \psi_{i_1,j_1,k} + \sum_{k_1} W_{i,j,k,\,i,j,k_1}\, \psi_{i,j,k_1} + I_{i,j,k}$

  where $U_{i,j,k}(n)$ is the net input to node (i,j,k) at the nth iteration, and $I_{i,j,k}$ is the external input given to the node (i,j,k)

• The state of the network is updated using:

  $\psi_{i,j,k}(n+1) = 1$ if $U_{i,j,k}(n) \ge \theta$, and $0$ otherwise,

  where $\theta$ is the threshold

Combining Evidence using CSNN model (contd.)
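For concreteness, a small numpy sketch of this deterministic relaxation on a 3-D lattice of binary hypothesis nodes; the weight values, threshold and 8-neighbourhood used here are simplified placeholders rather than the exact constraint weights of the previous slide.

```python
import numpy as np

def relax(psi, external, w_intra=-0.125, w_inter=0.5, theta=0.0, iters=10):
    """psi: (I, J, K) 0/1 state initialized from edge evidence; external: (I, J, K) inputs."""
    I, J, K = psi.shape
    for _ in range(iters):
        for i in range(I):
            for j in range(J):
                for k in range(K):
                    nb = psi[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2, k]
                    u = w_intra * (nb.sum() - psi[i, j, k])             # same-layer neighbours
                    u += w_inter * (psi[i, j, :].sum() - psi[i, j, k])  # same column, other layers
                    u += external[i, j, k]
                    psi[i, j, k] = 1 if u >= theta else 0               # one node updated at a time
    return psi
```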

Page 80

Comparison of Edge Extraction using Gabor Magnitude and Gabor Phase

(Figure panels: texture image, and edge maps obtained with the 2-D Gabor filter, the 1-D Gabor magnitude, and the 1-D Gabor phase)

Page 81

Texture Segmentation and Classification

• Image analysis (revisited)
• Problem of texture segmentation and classification
• Preprocessing using 2D Gabor filter to derive feature vector
• Combining the partial evidence using CS model

Page 82

CS Model for Texture Classification

• Supervised and unsupervised problem
• Modeling of image constraints
• Formulation of a posteriori probability CS model
• Hopfield neural network model and its energy function
• Deterministic and stochastic relaxation strategies

Page 83

CS Model for Texture Classification - Modeling of Image Constraints

• Feature formation process: Defined by the conditional probability of the feature vector $g_s$ of each pixel $s$ given the model parameters of each class $k$:

  $P(G_s = g_s \mid L_s = k) = \frac{1}{(2\pi\sigma_k^2)^{M/2}}\, \exp\!\left(-\frac{\lVert g_s - \mu_k \rVert^{2}}{2\sigma_k^{2}}\right)$

• Partition process: Defines the probability of the label of a pixel given the label of the pixels in its pth order neighborhood.

  $P(L_s \mid L_r,\ r \in N_p(s)) = \frac{1}{Z_p}\, \exp\!\Big(-\!\sum_{r \in N_p(s)} V(L_s, L_r)\Big)$, with $V(\cdot,\cdot)$ a pairwise potential over the pth-order neighbourhood

• Label competition process: Describes the conditional probability of assigning a new label to an already labeled pixel

  $P(L_s = k \mid L_s = l) = \frac{1}{Z_c}\, e^{-V_c(k,\, l)}$, with $V_c(\cdot,\cdot)$ the label-competition potential

Page 84

CS Model for Texture Classification - Modeling of Image Constraints (contd.)

• Formulation of the a posteriori probability:

  $P(L_s = k \mid G_s = g_s,\ L_r\ (r \in N_p(s)),\ L_s = l) = \frac{1}{Z}\, e^{-E(L_s = k \mid G_s = g_s,\ L_r,\ L_s = l)}$

  where the energy adds the contributions of the feature formation, partition and label competition processes,

  $E(L_s = k \mid \cdot) = \frac{\lVert g_s - \mu_k \rVert^{2}}{2\sigma_k^{2}} + \frac{M}{2}\ln(2\pi\sigma_k^{2}) + \sum_{r \in N_p(s)} V(L_s, L_r) + V_c(k, l)$

  and $Z = Z_p\, Z_c\, P(G_s = g_s)\, P(L_s = k)$

• Total energy of the system:

  $E_{\text{total}} = \sum_{s,k} E(L_s = k \mid G_s = g_s,\ L_r,\ L_s = l)$

Page 85

CS Model for Texture Classification

(Figure: a 3-D lattice of nodes of size I × J × K, one layer per texture class k; the nodes (i,j,1), ..., (i,j,k), ..., (i,j,K) in a column correspond to one pixel, with +ve connections among the nodes across the layers for each pixel and -ve connections from a set of neighboring nodes to each node in the same layer; E denotes the energy of a network state)

Page 86

Hopfield Neural Network and its Energy Function

$E_{\text{Hopfield}} = -\frac{1}{2}\sum_{i}\sum_{i_1} W_{i,i_1}\, O_i\, O_{i_1} - \sum_{i} B_i\, O_i$

For the 3-D lattice of the CS model:

$E_{\text{Hopfield}} = -\frac{1}{2}\sum_{i,j,k}\sum_{i_1,j_1,k_1} W_{i,j,k,\,i_1,j_1,k_1}\, O_{i,j,k}\, O_{i_1,j_1,k_1} - \sum_{i,j,k} B_{i,j,k}\, O_{i,j,k}$

(Figure: Hopfield network with unit outputs $o_1, \ldots, o_j, \ldots, o_N$ and biases $B_1, \ldots, B_j, \ldots, B_N$, mapped onto the I × J × K lattice)

Page 87

Results of Texture Classification - Natural Textures

(Figure panels: image with natural textures, initial classification, final classification)

Page 88

Results of Texture Classification - Remote Sensing Data

(Figure panels: Band-2 IRS image containing 4 texture classes, initial classification, final classification)

Page 89

Results of Texture Classification - Multispectral Data

(Figure panels: SIR-C/X-SAR image of the Lost City of Ubar; classification using multispectral information; classification using multispectral and textural information)

Page 90

Speech Recognition using CS Model

• Problem of recognition of SCV units (Table)
• Issues in classification of SCVs (Table)
• Representation of isolated utterance of SCV unit:
  – 60 ms before and 140 ms after vowel onset point
  – 240-dimensional feature vector consisting of weighted cepstral coefficients
• Block diagram of the recognition system for SCV units (Fig)
• CS network for classification of SCV units (Fig)

Page 91

Problem of Recognition of SCV Units

Page 92

Issues in Classification of SCVs

• Importance of SCVs
  – High frequency of occurrence: About 45%
• Main issues in classification of SCVs
  – Large number of SCV classes
  – Similarity among several SCV classes
• Model for classification of SCVs
  – Should have good discriminatory capability (artificial neural networks)
  – Should be able to handle a large number of classes (neural networks based on a modular approach)

Page 93

Block Diagram of Recognition System for SCV Units

Page 94

CS Network for Classification of SCV Units

(Figure: CS network with Vowel, MOA and POA feedback subnetworks; the external evidence or bias for a node in each subnetwork is computed using the output of the corresponding MLFFNN, e.g., MLFFNN1, MLFFNN5 and MLFFNN9)

Page 95

Classification Performance of CSM and other SCV Recognition Systems on Test Data of 80 SCV Classes

Performance (%) under decision criteria Case 1 to Case 4:

SCV Recognition System           Case 1   Case 2   Case 3   Case 4
HMM based system                  45.5     59.2     65.9     71.4
80-class MLFFNN                   45.3     59.7     66.9     72.2
MOA modular network               29.2     50.2     59.0     65.3
POA modular network               35.1     56.9     69.5     76.6
Vowel modular network             30.1     47.5     58.8     63.6
Combined evidence based system    51.6     63.5     70.7     74.5
Constraint Satisfaction model     65.6     75.0     80.2     82.6

Page 96

Speaker Verification using AANN Models and Vocal Tract System Features

• One AANN for each speaker
• Verification by identification
• AANN structure: 19L 38N 4N 38N 19L (see the sketch below)
• Feature: 19 weighted LPCC from 16th-order LPC for each frame of 27.5 ms, with frame shift of 13.75 ms
• Training: Pattern mode, 100 epochs, 1 min of data
• Testing: Model giving highest confidence for 10 sec of test data
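A sketch of the 19L 38N 4N 38N 19L autoassociative structure in tf.keras (an assumption; any framework would do); the tanh/linear activations mirror the N/L notation of the slide, while the optimizer and other training details are simplified stand-ins for the backpropagation training described in the tutorial.

```python
import tensorflow as tf

def build_aann():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(38, activation="tanh", input_shape=(19,)),  # 38N
        tf.keras.layers.Dense(4, activation="tanh"),                      # 4N compression layer
        tf.keras.layers.Dense(38, activation="tanh"),                     # 38N
        tf.keras.layers.Dense(19, activation="linear"),                   # 19L output = 19L input
    ])
    model.compile(optimizer="adam", loss="mse")   # trained to reconstruct its own input
    return model

# Training: model.fit(features, features, epochs=100) on one speaker's feature vectors.
# Testing: compute the reconstruction error of the test features under each speaker's
# model and select the model giving the highest confidence (lowest error).
```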

Page 97

Speaker Recognition using Source Features

• One model for each speaker
• Structure of AANN: 40L 48N 12N 48N 40L
• Training data: About 10 sec of data, 60 epochs
• Testing: Select model giving highest confidence for 2 sec of test data

Page 98

Other Applications

• Speech enhancement
• Speech compression
• Image compression
• Character recognition
• Stereo image matching

Page 99

Summary and Conclusions

• Speech and image processing: Natural tasks
• Significance of pattern processing
• Limitation of conventional computer architecture
• Need for new models or architectures for pattern processing tasks
• Basics of ANN
• Architecture of ANN for feature extraction and classification
• Potential of ANN for speech and image processing

Page 100

References

1. B.Yegnanarayana, “ Artificial Neural Networks”, Prentice-Hall of India, New Delhi, 1999

2. L. R. Rabiner and B. H. Juang, “Fundamentals of Speech Recognition”, Prentice-Hall, New Jersey, 1993

3. Alan C. Bovik, Handbook of Image and Video Processing, Academic Press, 2001

4. Xuedong Huang, Alex Acero and Hsiao-Wuen Hon, “Spoken Language Processing”, Prentice-Hall, New Jersey, 2001

5. P. P. Raghu, “Artificial Neural Network Models for Texture Analysis”, PhD Thesis, CSE Dept., IIT Madras, 1995

6. C. Chandra Sekhar, “Neural Network Models for Recognition of Stop Consonant Vowel (SCV) Segments in Continuous Speech”, PhD Thesis, CSE Dept., IIT Madras, 1996

7. P. Kiran Kumar, “Texture Edge Extraction using One Dimensional Processing”, MS Thesis, CSE Dept., 2001

8. S. P. Kishore, “Speaker Verification using Autoassociative Neural Network Models”, MS Thesis, CSE Dept., IIT Madras, 2000

9. B. Yegnanarayana, K. Sharath Reddy and S. P. Kishore, “Source and System Features for Speaker Recognition using AANN Models”, ICASSP, May 2001

10. S. P. Kishore, Suryakanth V. Gangashetty and B. Yegnanarayana, “Online Text Independent Speaker Verification System using Autoassociative Neural Network Models”, INNS-IEEE Int. Conf. Neural Networks, July 2001.

11. K. Sharat Reddy, “Source and System Features for Speaker Recognition”, MS Thesis, CSE Dept., IIT Madras, September 2001.

12. B. Yegnanarayana and S. P. Kishore, “Autoassociative Neural Networks: An Alternative to GMM for Pattern Recognition”, to appear in Neural Networks, 2002.