enn: extended nearest neighbor method for pattern recognition this lecture notes is based on the...

Post on 26-Dec-2015

221 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

ENN: Extended Nearest Neighbor Method for Pattern Recognition

This lecture notes is based on the following paper:B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE

Computational Intelligence Magazine, vol.10, no.3, pp.52 - 60, Aug. 2015

Prof. Haibo HeElectrical Engineering

University of Rhode Island, Kingston, RI 02881

Computational Intelligence and Self-Adaptive Systems (CISA) Laboratoryhttp://www.ele.uri.edu/faculty/he/

Email: he@ele.uri.edu

Extended Nearest Neighbor for Pattern Recognition

1. Limitations of K-Nearest Neighbors (KNN)

2. “Two-way communication”: Extended Nearest

Neighbors (ENN)

3. Experimental Analysis

4. Conclusion

Pattern Recognition

Parametric Classifier Class-wise density estimation, including

naive Bayes, mixture Gaussian, etc. Non-Parametric Classifier

Nearest Neighbors Neural Network Support Vector Machine

Nonparametric nature

Easy implementation

Powerfulness

Robustness

Consistency

Scale-Sensitive Problem: The class 1 samples dominate their near neighborhood with higher density (i.e., more concentrated distribution). The class 2 samples are distributed in regions with lower density (i.e., more spread out distribution).

Limitations of traditional KNN

Those class 2 samples which are close to the region of class 1 may be easily misclassified.

ENN: A New Approach

Define generalized class-wise statistic for each class:

Si denotes the samples in class i, and NNr(x, S) denotes the r-th nearest neighbor of x in S.

Ti measures the coherence of data from the same class. 0 ≤ Ti ≤ 1 with Ti = 1 when all the nearest neighbors of class i data are also from the same class i, and with Ti = 0 when all the nearest neighbors are from other classes.

Intra-class coherence:

Given an unknown sample Z to be classified, we iteratively assign it to class 1 and class 2, respectively, to obtain two new generalized class-wise statistics Ti

j, where j=1,2. Then, the sample Z is classified according to:

ENN Classification Rule: Maximum Gain of Intra-class Coherence.

For N-class classification:

To avoid the recalculation of generalized class-wise statistics in testing stage, an Equivalent Version of ENN is proposed:

The equivalent version has the same result as the original one, but avoids the recalculation of Ti

j

How this simple rule works better than KNN

The ENN method makes a prediction in a “two-way communication” style: it considers not only who are the nearest neighbors of the test sample, but also who consider the test sample as their nearest neighbors.

Experimental Results and Analysis

Sampling methods

Synthetic Data Set:A 3-dimensional Gaussian data with 3 classes:

Considering the following four models, their error rates are:

Model 2 Class 1 Class 2 Class 3

KNN ENN KNN ENN KNN ENNk = 3 32 31.9 39.3 34.4 31.4 30.5k = 5 31.2 29.7 40.5 33.7 28.6 26.7k = 7 28.5 28.3 40.8 33.6 25 24.3 Model 3 Class 1 Class 2 Class 3 KNN ENN KNN ENN KNN ENNk = 3 33.2 31 27 26.8 38.8 33.7k = 5 30.3 27.3 24 23.2 40.2 33.5k = 7 26.7 25.1 20.8 20.8 40.6 33

2 2 21 2 35, 20, 5

2 2 21 2 35, 5, 20

• MNIST Handwritten Digit Recognition

Sampling methods

Real-life Data Sets:

Data Examples

Sampling methods

Real-life Data Sets:

t-test shows that ENN can significantly improve the classification performance in 17 out of 20 datasets, in comparison with KNN.

• 20 data sets from UCI Machine Learning Repository

B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, vol.10, no.3, pp.52 - 60, Aug. 2015

ENN

Summary: Three versions of ENN

ENN.V1

B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, vol.10, no.3, pp.52 - 60, Aug. 2015

Summary: Three versions of ENN

B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, vol.10, no.3, pp.52 - 60, Aug. 2015

Summary: Three versions of ENN

ENN.V2

B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, vol.10, no.3, pp.52 - 60, Aug. 2015

Summary: Three versions of ENN

Online Resources

http://www.ele.uri.edu/faculty/he/research/ENN/ENN.html

Supplementary materials and Matlab source code implementation

available at:

1. A new ENN classification methodology based on the maximum gain of intra-

class coherence.

2. “Two-way communication”: ENN considers not only who are the nearest

neighbors of the test sample, but also who consider the test sample as their

nearest neighbors.

3. Important and useful for many other machine learning and data mining

problems, such as density estimation, clustering, regression, among others.

Conclusion

B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, vol.10, no.3, pp.52 - 60, Aug. 2015

top related