Introduction to Pattern Recognition
TRANSCRIPT
INTRODUCTION TO PATTERN RECOGNITION
Week 2-3
Outline: Pattern Recognition Systems, The Design Cycle, Learning and Adaptation, Classifier Based on Bayes Decision Theory
Pattern Recognition Systems
Pattern Recognition Systems: Sensing
Use of a transducer (camera or microphone).
The performance of a PR system depends on the bandwidth, resolution, sensitivity, and distortion of the transducer.
Segmentation and grouping: patterns should be well separated and should not overlap.
Pattern Recognition Systems: Feature extraction
Discriminative features; features invariant with respect to translation, rotation, and scale.
Classification: use the feature vector provided by the feature extractor to assign the object to a category.
Post-processing: exploit context, input-dependent information other than the target pattern itself, to improve performance.
The Design Cycle
Data Collection, Feature Choice, Model Choice, Training, Evaluation, Computational Complexity
The Design Cycle
Data collection: how do we know when we have collected an adequately large and representative set of examples for training and testing the system?
Feature choice: depends on the characteristics of the problem domain. Features should be simple to extract, invariant to irrelevant transformations, and insensitive to noise.
The Design Cycle
Model choice: if we are unsatisfied with the performance of our fish classifier, we may want to jump to another class of model.
Training: use data to determine the classifier. There are many different procedures for training classifiers and choosing models.
The Design Cycle
Evaluation: measure the error rate (or performance) and switch from one set of features to another.
Computational complexity: what is the trade-off between computational ease and performance? (How does an algorithm scale as a function of the number of features, patterns, or categories?)
Learning and Adaptation
Supervised learning: a teacher provides a category label or cost for each pattern in the training set (classification).
Unsupervised learning: the system forms clusters or "natural groupings" of the input patterns (clustering).
Classifier Based on Bayes Decision Theory
Classifier Based on Bayes Decision Theory
Bayes decision theory
The Gaussian probability density function
Minimum distance classifiers: Euclidean, Mahalanobis
Bayes Decision Theory
Bayes Decision Theory
Recall our example of classifying two fish as salmon or sea bass, and our agreement that any given fish is either a salmon or a sea bass; call this the state of nature of the fish.
Let us define a (probabilistic) variable ω that describes the state of nature:
ω = ω1 for sea bass
ω = ω2 for salmon
Let us assume this two-class case.
Bayes Decision Theory
The a priori or prior probability reflects our knowledge of how likely we expect a certain state of nature to be before we actually observe it.
In the fish example, it is the probability that we will see either a salmon or a sea bass next on the conveyor belt.
Bayes Decision Theory
Note: the prior may vary depending on the situation. If we get equal numbers of salmon and sea bass in a catch, then the priors are equal, or uniform.
Depending on the season, we may get more salmon than sea bass, for example.
Bayes Decision Theory
We write P(ω = ω1), or just P(ω1), for the prior that the next fish is a sea bass.
The priors must exhibit exclusivity and exhaustivity. For c states of nature, or classes:
P(ω1) + P(ω2) + … + P(ωc) = 1
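In practice the priors are often estimated from label counts in the training set. A minimal sketch, with hypothetical catch counts (the numbers below are illustrative assumptions, not values from the slides):

```python
# Estimate priors from label counts in a hypothetical training catch.
counts = {"sea bass": 620, "salmon": 380}
total = sum(counts.values())

# P(class) = count(class) / total observations
priors = {label: n / total for label, n in counts.items()}
print(priors)  # {'sea bass': 0.62, 'salmon': 0.38}

# Exclusivity and exhaustivity: the priors must sum to 1.
assert abs(sum(priors.values()) - 1.0) < 1e-12
```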
Bayes Decision Theory
A feature is an observable variable. A feature space is a set from which we can sample or observe values.
Examples of features: length, width, lightness, location of dorsal fin.
For simplicity, let us assume that our features are all continuous values.
Denote a scalar feature as x and a vector feature as x. For an l-dimensional feature space, x ∈ R^l.
Bayes Decision Theory
In a classification task, we are given a pattern and the task is to classify it into one out of c classes.
The number of classes, c, is assumed to be known a priori.
Each pattern is represented by a set of feature values x(i), i = 1, …, l, which make up the l-dimensional feature vector x = [x(1), x(2), …, x(l)]^T.
We assume that each pattern is represented uniquely by a single feature vector and that it can belong to only one class.
Bayes Decision Theory
Also, we let the number of possible classes be equal to c, that is, ω1, ω2, …, ωc.
According to Bayes decision theory, x is assigned to the class ωi if
P(ωi | x) > P(ωj | x) for all j ≠ i
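The decision rule above can be sketched numerically. The likelihood and prior values below are illustrative assumptions, not values from the slides:

```python
import numpy as np

# Hypothetical class-conditional likelihoods p(x | wi) and priors P(wi)
# for a single observed feature value x (illustrative numbers only).
likelihoods = np.array([0.6, 0.2])   # p(x | w1), p(x | w2)
priors = np.array([0.3, 0.7])        # P(w1), P(w2)

# Bayes rule: P(wi | x) = p(x | wi) P(wi) / p(x)
evidence = np.sum(likelihoods * priors)       # p(x), the normalizing term
posteriors = likelihoods * priors / evidence  # P(wi | x)

# Assign x to the class with the largest posterior.
predicted_class = int(np.argmax(posteriors)) + 1  # classes numbered 1..c
print(posteriors, predicted_class)
```

Note that the evidence p(x) is the same for every class, so comparing p(x | ωi) P(ωi) directly gives the same decision.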
The Gaussian Probability Density Function
The Gaussian Probability Density Function
The Gaussian pdf is extensively used in pattern recognition because of its mathematical tractability as well as because of the central limit theorem.
The Gaussian Probability Density Function
The multidimensional Gaussian pdf has the form
p(x) = (1 / ((2π)^(l/2) |S|^(1/2))) exp(−(1/2) (x − m)^T S^(−1) (x − m))
where m is the mean vector, S is the covariance matrix, |S| is the determinant of S, S^(−1) is the inverse of S, and l is the number of dimensions.
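The pdf formula above translates directly into code. A minimal NumPy sketch (the function name `gaussian_pdf` is ours):

```python
import numpy as np

def gaussian_pdf(x, m, S):
    """Multidimensional Gaussian pdf at x, with mean vector m and
    covariance matrix S; l is the number of dimensions."""
    l = m.shape[0]
    diff = x - m
    norm = 1.0 / ((2 * np.pi) ** (l / 2) * np.linalg.det(S) ** 0.5)
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(S) @ diff)

# Sanity check against the 1-D standard normal at x = 0: 1/sqrt(2*pi)
x = np.array([0.0]); m = np.array([0.0]); S = np.array([[1.0]])
print(gaussian_pdf(x, m, S))  # ≈ 0.3989
```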
The Gaussian Probability Density Function
Example 1: Compute the value of a Gaussian pdf, p(x), at a given point x, with given mean vector m and covariance matrix S.
Answers
The Gaussian Probability Density Function
Example 2: Consider a 2-class classification task in the 2-dimensional space, where the data in the two classes, ω1 and ω2, are distributed according to the Gaussian distributions N(m1, S1) and N(m2, S2), respectively.
Given m1, m2, S1, and S2, and assuming equal priors, classify x into ω1 or ω2.
Answers
x is assigned to the class whose pdf value at x is larger.
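Classifying by comparing the two Gaussian pdf values can be sketched as follows. Since the slide's numeric values are not reproduced here, the means, covariance, and test point below are assumed for illustration:

```python
import numpy as np

def gauss_pdf(x, m, S):
    """Multidimensional Gaussian pdf at x with mean m, covariance S."""
    l = len(m)
    d = x - m
    return np.exp(-0.5 * d @ np.linalg.inv(S) @ d) / np.sqrt(
        (2 * np.pi) ** l * np.linalg.det(S))

# Assumed 2-class, 2-D setup with a shared covariance and equal priors.
m1, m2 = np.array([0.0, 0.0]), np.array([3.0, 3.0])
S = 0.8 * np.eye(2)

x = np.array([1.0, 1.0])
p1, p2 = gauss_pdf(x, m1, S), gauss_pdf(x, m2, S)

# With equal priors, the larger class-conditional pdf wins.
print("assign x to class", 1 if p1 > p2 else 2)
```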
Mean Vector and Covariance Matrix
Mean Vector and Covariance Matrix
The first step in analyzing multivariate data is computing the mean vector and the variance-covariance matrix.
Consider the following sample data matrix.
Mean Vector and Covariance Matrix
Each row vector is one observation of the three variables (or components).
The mean vector consists of the means of each variable; the variance-covariance matrix consists of the variances of the variables along the main diagonal and the covariances between each pair of variables in the other matrix positions.
Mean Vector and Covariance Matrix
The formula for computing the covariance of the variables X and Y is
cov(X, Y) = (1 / (n − 1)) Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ)
with X̄ and Ȳ denoting the means of X and Y, respectively, and n denoting the number of observations for this example.
Mean Vector and Covariance Matrix
The results are the mean vector and the variance-covariance matrix.
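Computing the mean vector and variance-covariance matrix can be sketched with NumPy; the sample data matrix below is hypothetical, since the slide's values are not reproduced here:

```python
import numpy as np

# Hypothetical sample data matrix: each row is one observation
# of three variables.
X = np.array([[4.0, 2.0, 0.60],
              [4.2, 2.1, 0.59],
              [3.9, 2.0, 0.58],
              [4.3, 2.1, 0.62],
              [4.1, 2.2, 0.63]])

# Mean of each column (variable).
mean_vector = X.mean(axis=0)

# rowvar=False: columns are variables; np.cov uses the n-1
# denominator by default, matching the covariance formula above.
cov_matrix = np.cov(X, rowvar=False)
print(mean_vector)
print(cov_matrix)
```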
Mean Vector and Covariance Matrix
Example: given a mean vector m and a covariance matrix S, generate random numbers following the Gaussian distribution N(m, S).
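Generating Gaussian random numbers with a given mean and covariance might look like this; the values of m and S are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
m = np.array([0.0, 0.0])         # assumed mean vector
S = np.array([[1.0, 0.5],
              [0.5, 1.0]])       # assumed covariance matrix

# Draw 5000 samples from N(m, S).
samples = rng.multivariate_normal(m, S, size=5000)

# The sample statistics should approach m and S as the sample grows.
print(samples.mean(axis=0))
print(np.cov(samples, rowvar=False))
```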
Mean Vector and Covariance Matrix
[Scatter plots of Gaussian samples for various covariance matrices:
S = [1 0; 0 1], S = [0.2 0; 0 0.2], S = [2 0; 0 2],
S = [0.2 0; 0 2], S = [2 0; 0 0.2], S = [1 0.5; 0.5 1],
S = [0.3 0.5; 0.5 2], S = [0.3 −0.5; −0.5 2]]
Minimum Distance Classifiers
• Euclidean
• Mahalanobis
Minimum Distance Classifiers
Template matching can be expressed mathematically through a notion of distance.
Let x be the feature vector for the unknown input, and let m1, m2, …, mc be the means of the c classes.
The error in matching x against mk is given by the distance between x and mk.
Choose the class for which the error is a minimum.
This technique is called minimum distance classification.
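The minimum distance classification steps above can be sketched as follows (the class means are hypothetical):

```python
import numpy as np

def min_distance_classify(x, means):
    """Assign x to the class whose mean is nearest to it
    (Euclidean template matching)."""
    dists = [np.linalg.norm(x - m) for m in means]
    return int(np.argmin(dists)) + 1  # classes numbered 1..c

# Hypothetical class means m1, m2, m3.
means = [np.array([0.0, 0.0]),
         np.array([5.0, 0.0]),
         np.array([0.0, 5.0])]

print(min_distance_classify(np.array([4.0, 1.0]), means))  # nearest mean is m2
```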
Minimum Distance Classifiers
[Block diagram: the input x is fed to distance computations against each class mean m1, m2, …, mc; a minimum selector outputs the class.]
The Euclidean Distance Classifier
The minimum Euclidean distance classifier: given an unknown x, assign it to class ωi if
||x − mi|| < ||x − mj|| for all j ≠ i
where mi is the mean of class ωi and mj is the mean of class ωj.
The Euclidean Distance Classifier
It must be stated that the Euclidean classifier is often used because of its simplicity.
It assigns a pattern to the class whose mean is closest to it with respect to the Euclidean norm.
The Euclidean Distance Classifier
Example: Consider a 2-class classification task in the 3-dimensional space, where the two classes, ω1 and ω2, are modeled by Gaussian distributions with means m1 and m2, respectively. Given a point x, classify it according to the Euclidean distance classifier.
Answers
The point is assigned to the class with the nearer mean.
The Mahalanobis Distance Classifier
The minimum Mahalanobis distance classifier: given an unknown x, it is assigned to class ωi if
((x − mi)^T S^(−1) (x − mi))^(1/2) < ((x − mj)^T S^(−1) (x − mj))^(1/2) for all j ≠ i
where S is the common covariance matrix and S^(−1) is its inverse. The presence of the covariance matrix accounts for the shape of the Gaussians.
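A sketch of the Mahalanobis rule, with an assumed covariance matrix deliberately chosen so that the Mahalanobis and Euclidean classifiers disagree; all values are illustrative:

```python
import numpy as np

def mahalanobis(x, m, S_inv):
    """Mahalanobis distance between x and mean m, given inverse covariance."""
    d = x - m
    return np.sqrt(d @ S_inv @ d)

# Assumed 2-D setup: the shared covariance is very tight in the second
# direction, so offsets along it count heavily in the distance.
S = np.array([[1.0, 0.0],
              [0.0, 0.01]])
S_inv = np.linalg.inv(S)
m1, m2 = np.array([0.0, 0.0]), np.array([1.2, 0.3])
x = np.array([1.0, 0.0])

d1, d2 = mahalanobis(x, m1, S_inv), mahalanobis(x, m2, S_inv)
print(1 if d1 < d2 else 2)  # Mahalanobis assigns class 1
# The Euclidean rule, ignoring S, picks the geometrically closer mean m2.
print(1 if np.linalg.norm(x - m1) < np.linalg.norm(x - m2) else 2)
```

The disagreement shows what "accounts for the shape of the Gaussians" means: m2 is geometrically closer to x, but its offset lies along a low-variance direction, so it is far in the Mahalanobis sense.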
The Mahalanobis Distance Classifier
Example: Consider a 2-class classification task in the 3-dimensional space, where the two classes, ω1 and ω2, are modeled by Gaussian distributions with means m1 and m2, respectively, and a common covariance matrix S. Given a point x, classify it according to the Mahalanobis distance classifier.
Answers
The point is assigned to the class with the smaller Mahalanobis distance.