8/3/2019 Jimbo Project-Handwriting Recognition Using an Arificial Neural Network
http://slidepdf.com/reader/full/jimbo-project-handwriting-recognition-using-an-arificial-neural-network 1/13
Jimbo Project: Handwriting recognition using
an Artificial Neural Network
Claudio Martella, p.num. 810807-P112
Martin Chlupac, p.num. 810928-P272
1st June 2005
Abstract
This work deals with recognition of isolated hand-written characters using
an artificial neural network. The characters are written on a regular sheet
of paper using a special pen that produces biometric signals which are then
analysed by a computer. In this document we describe our research and the
tests performed on several MLP and RBF network architectures. For each
of these, a solution is found and compared to the current one based on
K-means.
Introduction
Neural networks offer high generalisation ability and do not require deep
background knowledge or formalisation of the problem to be solved. For
these reasons, and considering the high dimensionality of the input space,
we tried this approach to determine whether a neural network could do better
than a distance-vector-based solution.
The pen contains two pairs of mechanical sensors that measure the hori-
zontal and vertical movements of the ballpoint nib, and a pressure sensor
placed at the top of the pen. The pen produces a total of three signals: two
correspond to the horizontal and vertical accelerations of the pen, and the
remaining one to the pressure sensor. These signals are processed by a
computer system.
Description
The signals produced by the pen are filtered, normalised and saved in text files.
Each signal is divided into 7 overlapping segments. Each segment is
analysed and a vector of eight features is extracted. These feature vectors are
then saved to SNNS pattern files in different sets.
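The division into overlapping segments can be sketched as follows. The 50% overlap ratio and the helper itself are assumptions; the report only states that the seven segments overlap.

```python
def split_overlapping(signal, n_segments=7):
    """Split a signal into n_segments windows with 50% overlap
    (the overlap ratio is an assumption, not stated in the report).
    With stride = seg_len / 2, seven windows cover the whole signal
    when seg_len = 2 * len(signal) / (n_segments + 1)."""
    n = len(signal)
    seg_len = 2 * n // (n_segments + 1)
    stride = seg_len // 2
    return [signal[i * stride : i * stride + seg_len]
            for i in range(n_segments)]
```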
The following picture shows the unfiltered and filtered data in three dimen-
sions.
The feature vector for one segment is described as follows:

(a, q, max, min, δ̄, δ_{1/4}, δ_{1/2}, δ_{3/4}).

• a ... approximated line coefficient a,
• q ... approximated line coefficient q,
• max ... maximum difference between original and approximated value,
• min ... minimum difference between original and approximated value,
• δ̄ ... average difference between original and approximated value,
• δ_{1/4} ... difference between original and approximated value at 1/4 of the segment,
• δ_{1/2} ... difference between original and approximated value at 1/2 of the segment,
• δ_{3/4} ... difference between original and approximated value at 3/4 of the segment.
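The eight features can be computed from a segment roughly as in the sketch below. The least-squares line fit is an assumption, since the report only speaks of an "approximated line" with coefficients a and q.

```python
import numpy as np

def segment_features(seg):
    """Extract the eight features of one segment: the coefficients of
    the approximated line (here a least-squares fit, seg ~ a*t + q)
    and statistics of the difference between the original and the
    approximated values."""
    seg = np.asarray(seg, dtype=float)
    t = np.arange(len(seg))
    a, q = np.polyfit(t, seg, 1)          # line coefficients a, q
    diff = seg - (a * t + q)              # original minus approximation
    n = len(seg)
    return [a, q, diff.max(), diff.min(), diff.mean(),
            diff[n // 4], diff[n // 2], diff[3 * n // 4]]
```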
The experiment involved 20 volunteers who wrote 100 characters each (all
digits repeated 10 times), which resulted in 2000 patterns. We divided the data
into two sets of equal size. We saw two possible ways of separating the data
into training and test sets: by dividing the people, taking all data from 10 of
the 20 persons, or by dividing the examples, taking 5 samples per digit from
every person.
We decided to test whether the differences between people matter more than
the differences between the characters written by the same person. We
achieved better results with the second split, using half of the samples from
every person.
Every pattern was composed of 168 real numbers, the sequence of the X, Y
and Z feature values from all the segments. The output pattern was a binary
string of zeros except for the n-th element, which was one if the pattern
represented the digit n.
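Such a pattern can be assembled as in this sketch; the per-signal vector of 7 segments × 8 features = 56 values, and the helper name, are assumptions.

```python
import numpy as np

def make_pattern(x_feats, y_feats, z_feats, digit):
    """Build one pattern: 168 inputs (three signals of 7 segments x 8
    features each) and a 10-element output that is all zeros except
    for a one at the position of the represented digit."""
    inputs = np.concatenate([x_feats, y_feats, z_feats])
    assert inputs.shape == (168,)
    target = np.zeros(10)
    target[digit] = 1.0
    return inputs, target
```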
Network Design
We tried different types of network, each of which tried to solve some of the
problems introduced previously. All these networks have 168 input nodes and
10 output nodes, but they differ in the number of hidden layers, hidden nodes,
and links between the input and hidden layers.
The first network connected the data of each segment (the 8 values in the
feature vector) to one node in the hidden layer (21 nodes). A second hidden
layer of 7 nodes represented the 7 segments and provided a correlation between
the X, Y and Z signals. This network yielded a successful recognition rate of
70% on the validation set.
The second network tried to correlate data between the different axes, so every
node in the hidden layer was connected to three input nodes describing the
same feature of the same segment on the three axes. For example, one node
could be connected to the first feature of X, Y and Z for the first segment.
This network had more hidden nodes (56) but no second layer. We did not
notice any substantial difference in the results.
The third network tried to correlate the feature values inside the same segment
with the same feature across all segments. This was done by dividing the
hidden layer into two sets:
• feature set: each node in this set is connected to eight input nodes represent-
ing all the features of one segment.
• segment set: each node in this set is connected to seven input nodes each of
them representing the same feature but in different segments.
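This sparse connectivity can be described with boolean masks, as in the sketch below; the index layout of the 168 inputs (signal × segment × feature) is an assumption.

```python
import numpy as np

SIGNALS, SEGMENTS, FEATURES = 3, 7, 8
# Index of each input node, laid out as signal x segment x feature
# (this layout is an assumption).
idx = np.arange(SIGNALS * SEGMENTS * FEATURES).reshape(
    SIGNALS, SEGMENTS, FEATURES)

# Feature-set nodes: one per (signal, segment), each connected to the
# eight features of that segment.
feature_mask = np.zeros((SIGNALS * SEGMENTS, 168), dtype=bool)
for s in range(SIGNALS):
    for g in range(SEGMENTS):
        feature_mask[s * SEGMENTS + g, idx[s, g, :]] = True

# Segment-set nodes: one per (signal, feature), each connected to the
# same feature in all seven segments.
segment_mask = np.zeros((SIGNALS * FEATURES, 168), dtype=bool)
for s in range(SIGNALS):
    for f in range(FEATURES):
        segment_mask[s * FEATURES + f, idx[s, :, f]] = True
```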
This network worked better, allowing us to successfully recognise 77% of the
patterns in the validation set.
The fourth network was a hybrid of the best features of the previous nets.
Because there was no correlation between the X, Y and Z axes in the third
network, we added two more sets of hidden nodes: the first connected the
feature sets and the second connected the segment sets. This network allowed
us to reach 81% successful recognitions. Analysing the results, we realised that
some patterns were systematically unclassified, with all the output values
close to zero. This hinted at a possible inability of the network to represent
the problem completely, and pushed us to a specific test, described in the next
section, and to the design of the next network.
The fifth network in fact had an increased number of hidden nodes (20 more),
divided into two groups fully connecting all the feature sets and all the
segment sets. This new capacity pushed the number of successful recognitions
to 85.1% on the validation set. This is the current state-of-the-art network of
the Jimbo Project.

We also tried an RBF approach with different network designs, but we could
not get any result better than 21%, which might be explained by the network
initialisation alone; no improvement was reached through learning.
Tests
As a first test we had to see whether our network had enough capacity to solve
the problem, so we overtrained it until we reached an SSE close to zero.
This was possible after 700 epochs of Backprop with momentum, showing us
that we had enough hidden nodes. We would like to emphasise that Standard
Backprop and RProp were not able to reach such a result.

We then trained our network with different learning rules and parameters, and
we got our best results with Backprop-Momentum, given:

η = 0.01, µ = 0.6, c = 0.1, d_max = 0.1, 700 epochs.
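The core of the Backprop-Momentum weight update with these parameters amounts to the following generic sketch of the rule (not the SNNS code itself; the flat-spot elimination term c and the d_max cap are omitted here).

```python
def momentum_step(weight, grad, velocity, eta=0.01, mu=0.6):
    """One backprop-with-momentum update: the new step is the
    negative gradient scaled by eta plus mu times the previous step."""
    velocity = mu * velocity - eta * grad
    return weight + velocity, velocity
```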
This is the error graph for our training and validation set:
Our results can be described by the output of one of the scripts used for analysis:
hora had 12 errors
krav had 3 errors
blaz had 7 errors
chme had 15 errors
bart had 13 errors
hyne had 3 errors
barta had 10 errors
fris had 8 errors
cerm had 4 errors
fise had 19 errors
kriz had 9 errors
cimp had 9 errors
holi had 3 errors
jako had 9 errors
krat had 9 errors
habe had 4 errors
bern had 4 errors
chlu had 6 errors
kost had 1 error
---[ Unsuccesfull recognitions stats:]---
0 not recognised 18 times
1 not recognised 4 times
2 not recognised 11 times
3 not recognised 8 times
4 not recognised 13 times
5 not recognised 5 times
6 not recognised 33 times
7 not recognised 19 times
8 not recognised 20 times
9 not recognised 18 times
We had 851 succesfull recognitions on 1000 patterns 85.100000
We had 68 uncertain right and 56 unclassified
The first part shows how many patterns were misclassified for every person,
while the second gives the same information per digit. At the end we show the
rate of successfully classified patterns. To decide whether a pattern was
uncertain right (the highest output in the right position) or unclassified (the
highest value in the wrong position), we checked whether that output was
lower than 0.5.

We also tried to reduce the input-space dimensionality by reducing the number
of segments from seven to three. This reduction allowed us to test a network
with an input layer of 72 nodes, but this network was not able to successfully
recognise more than 74% of the validation set, which stopped us from pursuing
this direction any further.
Conclusion
As can be seen from this image, which compares our best recognised volunteer
"Kost" (left) with the worst, "Fise" (right), the characters do not differ from
each other very much, yet the results show that discriminants exist. We might
find one in the Z axis, which represents the pressure. This also hints at where
to look for further improvements in the network and in the feature extraction.
Another hint is the number of unclassified patterns: if we could classify even
just 80% of these, we would get close to the 93% successful recognition
achieved by Marek Musil with an ad-hoc K-means based solution.

Considering the number of weights, increasing the number of patterns in the
training set might help as well (data we did not have access to). Considering
that both RBF and K-means use codebook vectors, we would have expected
the RBF approach to yield good results; we think its failure is caused by the
high dimensionality of the input space.
Acknowledgements
We would like to thank Marek Musil who provided the data and the filtering en-
gine. We would also like to say thanks to Jimbo Jones for the inspiration.
References
• Marek Musil 2004: Diplomova prace: Hybridni metody extrakce priznaku
z biometrickych signalu (Master thesis: Hybrid methods for feature extrac-
tion from biometric signals).
• Olle Gaellmo, Jim Holmstroem 2005: Handouts to the Artificial Neural
Networks Course.
• Andries P. Engelbrecht 2002: Computational Intelligence. An Introduction.
Appendix
The list of wrong patterns follows:
# <\bart_d0_07> 52
# <\bart_d0_08> 53
# <\bart_d1_06> 56
# <\bart_d3_06> 66
# <\bart_d3_07> 67
# <\bart_d3_10> 70
# <\bart_d4_06> 71
# <\bart_d6_10> 85
# <\bart_d7_07> 87
# <\bart_d7_08> 88
# <\bart_d7_09> 89
# <\bart_d7_10> 90
# <\bart_d8_06> 91
# <\barta_d2_08> 113
# <\barta_d3_06> 116
# <\barta_d3_08> 118
# <\barta_d6_07> 132
# <\barta_d6_08> 133
# <\barta_d6_09> 134
# <\barta_d6_10> 135
# <\barta_d7_06> 136
# <\barta_d9_07> 147
# <\barta_d9_09> 149
# <\bern_d6_09> 184
# <\bern_d6_10> 185
# <\bern_d7_06> 186
# <\bern_d8_08> 193
# <\blaz_d6_08> 233
# <\blaz_d7_07> 237
# <\blaz_d7_08> 238
# <\blaz_d8_06> 241
# <\blaz_d8_07> 242
# <\blaz_d9_06> 246
# <\blaz_d9_08> 248
# <\cimp_d0_07> 252
# <\cimp_d2_06> 261
# <\cimp_d2_09> 264
# <\cimp_d6_08> 283
# <\cimp_d6_09> 284
# <\cimp_d6_10> 285
# <\cimp_d7_06> 286
# <\cimp_d8_08> 293
# <\cimp_d9_08> 298
# <\cerm_d6_09> 334
# <\cerm_d7_06> 336
# <\cerm_d8_06> 341
# <\cerm_d9_06> 346
# <\fise_d0_07> 352
# <\fise_d0_08> 353
# <\fise_d1_08> 358
# <\fise_d1_10> 360
# <\fise_d2_06> 361
# <\fise_d2_07> 362
# <\fise_d2_08> 363
# <\fise_d2_09> 364
# <\fise_d2_10> 365
# <\fise_d3_06> 366
# <\fise_d4_09> 374
# <\fise_d6_10> 385
# <\fise_d7_06> 386
# <\fise_d7_09> 389
# <\fise_d8_06> 391
# <\fise_d8_08> 393
# <\fise_d9_06> 396
# <\fise_d9_07> 397
# <\fise_d9_08> 398
# <\fris_d0_06> 401
# <\fris_d4_10> 425
# <\fris_d6_07> 432
# <\fris_d6_09> 434
# <\fris_d7_08> 438
# <\fris_d8_08> 443
# <\fris_d8_10> 445
# <\fris_d9_10> 450
# <\habe_d0_10> 455
# <\habe_d6_08> 483
# <\habe_d6_10> 485
# <\habe_d8_10> 495
# <\holi_d0_09> 504
# <\holi_d5_08> 528
# <\holi_d9_08> 548
# <\hora_d0_06> 551
# <\hora_d0_08> 553
# <\hora_d1_06> 556
# <\hora_d4_08> 573
# <\hora_d4_09> 574
# <\hora_d4_10> 575
# <\hora_d6_07> 582
# <\hora_d6_08> 583
# <\hora_d8_09> 594
# <\hora_d8_10> 595
# <\hora_d9_08> 598
# <\hora_d9_09> 599
# <\hyne_d0_06> 601
# <\hyne_d5_08> 628
# <\hyne_d6_08> 633
# <\chlu_d0_10> 655
# <\chlu_d4_09> 674
# <\chlu_d6_06> 681
# <\chlu_d7_08> 688
# <\chlu_d8_10> 695
# <\chlu_d9_06> 696
# <\chme_d0_06> 701
# <\chme_d2_09> 714
# <\chme_d3_08> 718
# <\chme_d3_09> 719
# <\chme_d4_07> 722
# <\chme_d4_09> 724
# <\chme_d4_10> 725
# <\chme_d5_06> 726
# <\chme_d5_09> 729
# <\chme_d6_07> 732
# <\chme_d7_07> 737
# <\chme_d7_08> 738
# <\chme_d7_09> 739
# <\chme_d8_09> 744
# <\chme_d9_09> 749
# <\jako_d0_09> 754
# <\jako_d4_08> 773
# <\jako_d5_10> 780
# <\jako_d6_07> 782
# <\jako_d6_08> 783
# <\jako_d6_09> 784
# <\jako_d7_06> 786
# <\jako_d7_10> 790
# <\jako_d8_10> 795
# <\kost_d4_08> 823
# <\krat_d0_07> 852
# <\krat_d0_08> 853
# <\krat_d0_09> 854
# <\krat_d0_10> 855
# <\krat_d1_06> 856
# <\krat_d3_09> 869
# <\krat_d4_10> 875
# <\krat_d6_07> 882
# <\krat_d6_09> 884
# <\krav_d0_10> 905
# <\krav_d4_06> 921
# <\krav_d7_08> 938
# <\kriz_d0_06> 951
# <\kriz_d2_09> 964
# <\kriz_d7_06> 986
# <\kriz_d7_10> 990
# <\kriz_d8_08> 993
# <\kriz_d8_09> 994
# <\kriz_d8_10> 995
# <\kriz_d9_06> 996
# <\kriz_d9_09> 999