LETTER IMAGE RECOGNITION
1. Introduction.
• Objective:
  – design classifiers for letter image recognition.
  – consider both accuracy and the time taken to make the decision.
• 20,000 samples:
  – Starting set: images based on 20 different fonts (20x26 samples).
  – Data set: each letter was randomly distorted to produce our data set (the 20,000 samples).
  – We did not have the initial, noise-free set.
• 16 numerical features:
  – statistical moments and edge counts.
  – scaled to fit into a range of integer values from 0 through 15.
• We use the H (holdout), R (resubstitution) or L (leave-one-out) method to estimate the error of the classifier.
• Attribute information:
  – Capital letter: 26 values from A to Z
  – X-Box: horizontal position of box
  – Y-Box: vertical position of box
  – Width: width of box
  – High: height of box
  – Onpix: total number of on pixels
  – X-Bar: mean x of on pixels in box
  – Y-Bar: mean y of on pixels in box
  – X2bar: mean x variance
  – Y2bar: mean y variance
  – Xybar: mean x y correlation
  – X2ybr: mean of x * x * y
  – Xy2br: mean of x * y * y
  – X-Ege: mean edge count left to right
  – Xegvy: correlation of X-Ege with y
  – Y-Ege: mean edge count bottom to top
  – Yegvx: correlation of Y-Ege with x
2. Euclidean distance classifier.
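• The decision rule:

$$x \in \omega_i \Leftrightarrow \lVert x - \mu_i \rVert < \lVert x - \mu_j \rVert \quad \forall j \neq i$$

• Estimate the means for each category, where the sum runs over the $n_i$ training samples of class $\omega_i$:

$$\hat{\mu}_i = \frac{1}{n_i} \sum_{k=1}^{n_i} x_k$$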
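A minimal sketch of the nearest-mean rule, assuming samples are plain tuples of numbers. The function names `class_means` and `classify` and the 2-D toy data are illustrative only and do not come from the slides.

```python
import math
from collections import defaultdict

def class_means(samples, labels):
    """Estimate one mean vector per class (the sample mean of its members)."""
    sums = {}
    counts = defaultdict(int)
    for x, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(x)
        for d, value in enumerate(x):
            sums[label][d] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def classify(x, means):
    """Assign x to the class whose mean is nearest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(x, means[label]))

# Toy 2-D data (the real features are 16-D integers in 0..15).
train_x = [(1, 1), (3, 1), (12, 14), (14, 12)]
train_y = ["A", "A", "B", "B"]
means = class_means(train_x, train_y)
print(means)                    # {'A': [2.0, 1.0], 'B': [13.0, 13.0]}
print(classify((4, 3), means))  # A
```

Each decision costs only one distance computation per class, which is consistent with this being the fastest classifier in the comparison.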
• Estimate the error with the R method:

| Error (%) | NC (%) | Accuracy (%) | Average Decision Time (ms) |
|---|---|---|---|
| 42.175 | 0 | 57.825 | 0.676 |
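The three percentages in these result tables can be computed as sketched below. This assumes that NC is the share of samples for which no decision was taken (an interpretation of the abbreviation, not stated in the slides) and uses `None` to mark such samples. The function name `error_report` is hypothetical.

```python
def error_report(true_labels, predictions):
    """Return (error %, NC %, accuracy %); a prediction of None means no decision."""
    n = len(true_labels)
    not_classified = sum(1 for p in predictions if p is None)
    correct = sum(1 for t, p in zip(true_labels, predictions) if p == t)
    wrong = n - correct - not_classified
    return (100.0 * wrong / n, 100.0 * not_classified / n, 100.0 * correct / n)

truth = ["A", "B", "C", "D"]
preds = ["A", "B", "X", None]
print(error_report(truth, preds))  # (25.0, 25.0, 50.0)
```

With the R method the same samples are used for design and for testing, so `truth` and `preds` would both refer to the training set.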
3. Gaussian classifier.
• Assume a Gaussian distribution for each class.
• Estimate the mean and covariance matrix for each class with these estimators:

$$\hat{\mu}_i = \frac{1}{n_i} \sum_{k=1}^{n_i} x_k$$

$$\hat{C}_i = \frac{1}{n_i - 1} \sum_{k=1}^{n_i} (x_k - \hat{\mu}_i)(x_k - \hat{\mu}_i)^t$$
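• Decision rule:

$$x \in \omega_i \Leftrightarrow g_i(x) > g_j(x) \quad \forall j \neq i$$

• Where $g_i(x)$ are the discriminant functions, with $P_i$ the prior probability of class $\omega_i$:

$$g_i(x) = -x^t \Sigma_i^{-1} x + 2 \mu_i^t \Sigma_i^{-1} x - \mu_i^t \Sigma_i^{-1} \mu_i - \ln|\Sigma_i| + 2 \ln P_i$$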
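A short derivation sketch shows where this discriminant comes from. It assumes the standard $d$-dimensional Gaussian density (here $d = 16$) and a symmetric $\Sigma_i$. Doubling the log of the class-conditional density times the prior and dropping the class-independent constant gives the five terms of $g_i(x)$:

```latex
\begin{aligned}
\ln\!\big[p(x \mid \omega_i)\,P_i\big]
  &= -\tfrac{d}{2}\ln 2\pi - \tfrac{1}{2}\ln|\Sigma_i|
     - \tfrac{1}{2}(x-\mu_i)^t \Sigma_i^{-1} (x-\mu_i) + \ln P_i \\
g_i(x) &= 2\ln\!\big[p(x \mid \omega_i)\,P_i\big] + d\ln 2\pi \\
  &= -x^t \Sigma_i^{-1} x + 2\mu_i^t \Sigma_i^{-1} x - \mu_i^t \Sigma_i^{-1} \mu_i
     - \ln|\Sigma_i| + 2\ln P_i
\end{aligned}
```

In practice $\mu_i$ and $\Sigma_i$ are replaced by the estimates $\hat{\mu}_i$ and $\hat{C}_i$.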
• We can estimate the error of the classifier with the R method. The result:

| Error (%) | NC (%) | Accuracy (%) | Average Decision Time (ms) |
|---|---|---|---|
| 10.245 | 0 | 89.755 | 3.156 |
4. KNN classifier.
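• We use the KNN rule: for each test sample we find its K nearest neighbors, of which $K_i$ belong to class $\omega_i$:

$$K = \sum_i K_i$$

• The decision rule:

$$x \in \omega_i \Leftrightarrow K_i > K_j \quad \forall j \neq i$$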
• 1st approach:
  – compute the distance to all the training samples for each test sample.
  – not optimal in terms of decision time per sample.
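The exhaustive search of the first approach can be sketched as follows. The helper names `knn_brute_force` and `majority_vote` and the toy data are illustrative assumptions. Returning `None` on a tie corresponds to not taking a decision.

```python
import math
from collections import Counter

def knn_brute_force(x, train_x, train_y, k):
    """Return the labels of the k training samples nearest to x, closest first."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(x, train_x[i]))
    return [train_y[i] for i in order[:k]]

def majority_vote(neighbour_labels):
    """Return the winning label, or None when the top count is tied."""
    ranked = Counter(neighbour_labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: no decision is taken
    return ranked[0][0]

train_x = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
train_y = ["A", "A", "A", "B", "B", "B"]
neighbours = knn_brute_force((1, 1), train_x, train_y, k=3)
print(neighbours, majority_vote(neighbours))  # ['A', 'A', 'A'] A
```

Every query touches every training sample, which is why the decision time grows with the size of the training set.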
• 2nd approach:
  – The features are numbers from 0 to 15.
  – We can order the training samples by their distance to the origin.
  – Given a test sample, we measure its distance to the origin and look for its K nearest neighbors only among training samples with a similar distance.
  – Suppose that the samples are equally distributed in the 16-D space:
    • use more training samples for the furthest samples and fewer for the closest samples (a smaller window for close samples and a bigger window for far samples).
    • Optimum window: grows linearly from 50 to 1000.
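One way to read the second approach is sketched below. The names `build_index`, `window_size` and `knn_windowed` are hypothetical. The linear mapping from distance-to-origin to window size is an assumption about how "linearly from 50 to 1000" was realised, and `MAX_NORM` assumes 16 features bounded by 15.

```python
import bisect
import math

MAX_NORM = math.sqrt(16 * 15 ** 2)  # largest possible distance to the origin (60.0)

def build_index(train_x, train_y):
    """Sort the training set once by distance to the origin."""
    order = sorted(range(len(train_x)), key=lambda i: math.hypot(*train_x[i]))
    norms = [math.hypot(*train_x[i]) for i in order]
    return norms, [train_x[i] for i in order], [train_y[i] for i in order]

def window_size(norm, w_min=50, w_max=1000):
    """Window grows linearly with the distance to the origin (assumed mapping)."""
    return round(w_min + (w_max - w_min) * min(norm / MAX_NORM, 1.0))

def knn_windowed(x, index, k, w_min=50, w_max=1000):
    """Search the k nearest neighbours only among samples with a similar norm."""
    norms, xs, ys = index
    norm = math.hypot(*x)
    centre = bisect.bisect_left(norms, norm)       # position of x in the sorted norms
    half = window_size(norm, w_min, w_max) // 2
    lo, hi = max(0, centre - half), min(len(xs), centre + half)
    candidates = sorted(range(lo, hi), key=lambda i: math.dist(x, xs[i]))
    return [ys[i] for i in candidates[:k]]

# Toy 2-D data with a tiny window so the effect is visible.
train_x = [(0, 1), (1, 0), (5, 5), (6, 5), (14, 15), (15, 14)]
train_y = ["A", "A", "B", "B", "C", "C"]
index = build_index(train_x, train_y)
print(knn_windowed((5, 4), index, k=1, w_min=2, w_max=4))  # ['B']
```

The window trades accuracy for speed: a true nearest neighbor can lie outside the window, because two points with similar distance to the origin are not necessarily close to each other.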
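• 26 classes: ties can occur whether K is odd or even. Two options:
  – not to take a decision.
  – break the tie:
    • give more importance to the closer samples (K votes for the nearest neighbor, down to 1 vote for the Kth nearest neighbor).
    • Example with 4 nearest neighbors (listed closest first):
      – AABB, unweighted -> A: 1+1, B: 1+1 (tie)
      – AABB, weighted -> A: 4+3, B: 2+1 (A wins)
      – ABBA, weighted -> A: 4+1 = 5, B: 3+2 = 5 (still a tie)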
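The rank-weighted vote from the example can be sketched as below, assuming the neighbor at 0-based rank r casts K - r votes. The function name `weighted_vote` is illustrative. It reproduces the AABB and ABBA cases.

```python
from collections import defaultdict

def weighted_vote(neighbour_labels):
    """neighbour_labels is ordered closest first; rank r (0-based) casts k - r votes."""
    k = len(neighbour_labels)
    votes = defaultdict(int)
    for rank, label in enumerate(neighbour_labels):
        votes[label] += k - rank
    best = max(votes.values())
    winners = [label for label, v in votes.items() if v == best]
    # A remaining tie still yields no decision (None).
    return (winners[0] if len(winners) == 1 else None), dict(votes)

print(weighted_vote("AABB"))  # ('A', {'A': 7, 'B': 3})
print(weighted_vote("ABBA"))  # (None, {'A': 5, 'B': 5})
```

Weighting removes most ties but not all of them, which matches the small non-zero NC values reported for some tie-break runs.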
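Results for K=1 (W: window size; OW: optimum window):

| Window | Error (%) | NC (%) | Accuracy (%) | Average Decision Time (ms) |
|---|---|---|---|---|
| W=50 | 39.375 | 0 | 60.625 | 3.5115 |
| W=400 | 12.705 | 0 | 87.295 | 27.805 |
| W=1000 | 6.86 | 0 | 93.14 | 68.475 |
| OW | 11.095 | 0 | 88.905 | 38.009 |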
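Results for K=3 (the number is the window size; TB: tie break; OW: optimum window):

| Window | Error (%) | NC (%) | Accuracy (%) | Average Decision Time (ms) |
|---|---|---|---|---|
| 50 | 22.14 | 35.61 | 42.25 | 5.6166 |
| 600 | 7.775 | 6.795 | 85.43 | 49.284 |
| 1000 | 5.855 | 4.18 | 89.965 | 80.356 |
| 50, TB | 44.075 | 0 | 55.925 | 6.4683 |
| 600, TB | 11 | 0 | 89 | 54.595 |
| 1000, TB | 7.785 | 0 | 92.215 | 84.311 |
| TB, OW | 11.095 | 0 | 88.905 | 47.515 |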
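Results for K=4 (the number is the window size; TB: tie break; OW: optimum window):

| Window | Error (%) | NC (%) | Accuracy (%) | Average Decision Time (ms) |
|---|---|---|---|---|
| 50 | 29.705 | 30.41 | 39.885 | 7.113 |
| 600 | 8.685 | 8.23 | 83.085 | 53.822 |
| 1000 | 5.995 | 5.955 | 88.05 | 85.485 |
| 50, TB | 46.05 | 1.725 | 52.225 | 6.6406 |
| 600, TB | 10.97 | 1.275 | 87.755 | 51.378 |
| 1000, TB | 7.465 | 1.11 | 91.425 | 82.963 |
| TB, OW | 12.58 | 1.34 | 86.08 | 47.807 |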
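Results for K=5 (the number is the window size; TB: tie break; OW: optimum window):

| Window | Error (%) | NC (%) | Acc (%) | Decision Time (ms) |
|---|---|---|---|---|
| 50 | 34.12 | 27.48 | 38.4 | 7.5048 |
| 600 | 10.25 | 6.645 | 83.11 | 54.207 |
| 1000 | 7.19 | 4.365 | 88.45 | 84.513 |
| 50, TB | 49.2 | 1.585 | 49.22 | 7.5814 |
| 600, TB | 12.54 | 0.57 | 86.89 | 52.906 |
| 1000, TB | 8.58 | 0.405 | 91.02 | 85.685 |
| 3000, TB | 4.47 | 0.27 | 95.27 | 233.09 |
| TB, OW | 14.21 | 0.615 | 85.18 | 47.263 |
| TB, OW (50, 2000) | 8.96 | 0.43 | 90.61 | 84.588 |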
5. Neural Network classifier.
• Multilayer neural network trained with the backpropagation algorithm.
• We used the resilient backpropagation training algorithm, which is faster.
• Many parameters of the network and the training method were changed to find the optimum classifier:
  – Number of neurons in the hidden layer.
  – Number of hidden layers.
  – Functions in the layers: hyperbolic tangent, logistic, linear.
  – Learning rate.
  – Number of training samples.
  – Preprocessing of the input data: mean and SD normalization, principal component analysis (removing the components that contribute less than 2% of the total variation of the data set).
  – Training algorithms.
  – Target vectors: 0..1 (logistic), -1..1 (hyperbolic tangent), -0.9..0.9 (hyperbolic tangent), 0..10 (linear).
  – Performance functions.
• The network with the best performance was the following:
  – The hidden layer has 15 neurons, and both layers use the logistic function.
  – Rule of thumb: around 30 neurons, but the performance was better with 15 neurons.
  – The output layer has 26 neurons, one per class.
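• We did not preprocess the inputs:
  – no scaling (the data was already normalized).
  – no principal component analysis.
• The target vector $\vec{a}$ for an input pattern $\vec{p}$ was:

$$a_i = 0.9 \Leftrightarrow \vec{p} \in \omega_i$$

$$a_j = 0 \Leftrightarrow \vec{p} \notin \omega_j$$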
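The target encoding can be sketched as follows. It assumes the 26 output neurons are ordered A to Z, an ordering the slides do not state. `CLASSES` and `target_vector` are illustrative names.

```python
import string

CLASSES = string.ascii_uppercase  # 'A'..'Z', one output neuron per letter (assumed order)

def target_vector(letter, on=0.9, off=0.0):
    """Return the 26-element target: `on` for the true class, `off` elsewhere."""
    return [on if c == letter else off for c in CLASSES]

t = target_vector("C")
print(len(t), t[:4])  # 26 [0.0, 0.0, 0.9, 0.0]
```

A target of 0.9 rather than 1 keeps the desired output inside the open range of the logistic function, which only reaches 1 asymptotically.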
• 5000 training samples: randomly distributed according to their class.
  – the performance was not better using more samples.
• 1000 validation samples:
  – early stop of 50 (if the performance measured on the validation data has been worse for 50 iterations, we stop to avoid overfitting).
• 2000 iterations maximum.
• Learning rate: η=0.1
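The stopping logic can be sketched independently of any network library. The sketch assumes "early stop of 50" means stopping once the validation error has failed to improve on its best value for 50 iterations. `train_step` and `validation_error` are hypothetical callables standing in for one training iteration and one validation pass.

```python
def train_with_early_stopping(train_step, validation_error,
                              max_iterations=2000, patience=50):
    """Run train_step() until validation stops improving for `patience` iterations."""
    best_error = float("inf")
    best_iteration = 0
    for iteration in range(1, max_iterations + 1):
        train_step()
        error = validation_error()
        if error < best_error:
            best_error, best_iteration = error, iteration
        elif iteration - best_iteration >= patience:
            break  # no improvement for `patience` iterations
    return best_iteration, best_error

# Simulated validation curve: improves until iteration 100, then degrades.
state = {"it": 0}

def fake_step():
    state["it"] += 1

def fake_validation_error():
    return abs(state["it"] - 100) / 100 + 0.2

result = train_with_early_stopping(fake_step, fake_validation_error)
print(result)  # (100, 0.2)
```

A real training loop would also keep a copy of the weights from the best iteration and restore them after stopping.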
• We can compare this to the performance of another network:
  – Input preprocessing: scaling and principal components (11 out of 16).
  – Training data: 2000
  – Hidden neurons: 10
  – Hidden layer function: hyperbolic tangent
  – Learning rate: 0.2
• H method: test the performance of the classifier with 5000 new samples.
  – 2000 from the validation set and 3000 different ones.
• The decision rule was:

$$x \in \omega_i \Leftrightarrow a_i > a_j \quad \forall j \neq i$$
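In code, this rule picks the output neuron with the largest activation. The sketch assumes neurons are ordered A to Z. The name `decide` is illustrative, and Python's `max` resolves an exact tie by taking the first maximum, a case the strict inequality leaves undefined.

```python
import string

CLASSES = string.ascii_uppercase  # assumed neuron order: index 0 is 'A'

def decide(outputs):
    """Return the letter whose output neuron has the largest activation."""
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return CLASSES[best]

outputs = [0.05] * 26
outputs[17] = 0.81  # neuron 17 corresponds to 'R'
print(decide(outputs))  # R
```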
• The results:

| Error (%) | NC (%) | Accuracy (%) | Average Decision Time (ms) |
|---|---|---|---|
| 24.42 | 0 | 75.58 | 4.744 |

  – Probably better if we could train with the noise-free set.
6. Summary and conclusion.
• The best accuracy was achieved with the 5NN classifier.
• If we also consider the time taken to make the decision, the best classifier is the Gaussian.

| Classifier | Error (%) | NC (%) | Accuracy (%) | Time (ms) |
|---|---|---|---|---|
| Euclidean | 42.175 | 0 | 57.825 | 0.676 |
| Gaussian | 10.245 | 0 | 89.755 | 3.156 |
| N. Network | 24.42 | 0 | 75.58 | 4.744 |
| 1NN | 12.705 | 0 | 87.295 | 27.805 |
| 5NN | 4.465 | 0.27 | 95.265 | 233.09 |
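The two conclusions can be checked against the summary figures with a small selection helper. The 5 ms time budget is an arbitrary illustrative threshold, not a figure from the slides. Times are taken to be in milliseconds, as in the first results table, and `most_accurate` is a hypothetical name.

```python
RESULTS = {  # classifier: (error %, NC %, accuracy %, time ms), from the summary table
    "Euclidean": (42.175, 0.0, 57.825, 0.676),
    "Gaussian": (10.245, 0.0, 89.755, 3.156),
    "N. Network": (24.42, 0.0, 75.58, 4.744),
    "1NN": (12.705, 0.0, 87.295, 27.805),
    "5NN": (4.465, 0.27, 95.265, 233.09),
}

def most_accurate(results, time_budget_ms=float("inf")):
    """Return the most accurate classifier whose decision time fits the budget."""
    eligible = {name: r for name, r in results.items() if r[3] <= time_budget_ms}
    return max(eligible, key=lambda name: eligible[name][2])

print(most_accurate(RESULTS))                      # 5NN
print(most_accurate(RESULTS, time_budget_ms=5.0))  # Gaussian
```

The Euclidean classifier is faster still, but its accuracy is more than 30 points lower than the Gaussian's, so the Gaussian gives the better trade-off.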