Handwritten Chinese Character Recognition by Metasynthetic Approach



Pergamon Pattern Recognition, Vol. 30, No. 8, pp. 1321-1328, 1997

© 1997 Pattern Recognition Society. Published by Elsevier Science Ltd. Printed in Great Britain. All rights reserved

0031-3203/97 $17.00+.00

PII: S0031-3203(96)00152-5

HANDWRITTEN CHINESE CHARACTER RECOGNITION BY METASYNTHETIC APPROACH

HONG-WEI HAO, XU-HONG XIAO and RU-WEI DAI*

Artificial Intelligence Laboratory, Institute of Automation, Chinese Academy of Sciences, P.O. Box 2728, Beijing 100080, People's Republic of China

(Received 16 July 1996; received for publication 15 October 1996)

Abstract--Inspired by the idea of metasynthesis, this paper proposes two integration approaches for handwritten Chinese character recognition. The first is Integration based on a Linear Model and the second is Network Integration based on Supervised Learning. Compared with previous integration approaches, the proposed methods acquire the parameters of the integrated systems automatically by supervised learning, which is very important for pattern recognition problems with a large number of classes. The experimental results show that the synthesized systems perform much better than any of the individual classifiers. © 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.

Metasynthesis

Handwritten Chinese character recognition (HCCR)

Integration based on a linear model (ILM)

Network integration based on supervised learning (NISL)

Human-machine integration

1. INTRODUCTION

In the early 1990s, the theory of metasynthesis was proposed in China for solving the problems of Open Complex Giant Systems (OCGS). (1) It contributed remarkably to Artificial Intelligence (AI) research in China. (2) The main point of the metasynthetic approach is that not only computer systems but also human intelligence are combined so that some complex problems can be solved. That is, metasynthesis emphasizes the crucial role of human beings in an intelligent system and pursues human-machine integration. It is well known that machine recognition of handwritten Chinese characters is a very difficult problem and is regarded as one of the ultimate goals of character recognition research; (3) we attempt to use the metasynthetic approach to solve this problem. As a first step, we try to synthesize the results of various classification methods or systems; this is similar to the idea of combining multiple classifiers. (4)

Unlike previous classifier integration methods, the motivation of the experiments is not only to improve the performance of the character recognition system but also to find effective ways to realize human-machine integration, so that the integration approaches no longer have to obtain model parameters by constant trial and error. This is very important for large-set pattern recognition problems such as Chinese character recognition, which remains unsolved because of the extremely large number of classes. In our case, the meaning of human-machine integration is twofold: one is human control of the system by presenting a model, supervised learning, etc.; the other is the assistance of computers to human beings, that is, complex computation and information processing are accomplished by computers under the supervision of human beings.

* Author to whom correspondence should be addressed.

Generally speaking, different classifiers complement each other to some extent. To synthesize the results of the individual classifiers, two integration models are proposed in this paper. One is Integration based on a Linear Model (ILM), which assumes that the relations among different classifiers are linear; the multiple outputs of a particular classifier support the final decision of the integration system to different extents. The coefficients of the linear model are acquired by supervised parameter estimation. The other is Network Integration based on Supervised Learning (NISL), which assumes that the relations among different classifiers are nonlinear. For each class of character, a multilayer perceptron network is designed and trained by supervised learning. That is, in the learning process, the class to which the input pattern belongs is determined by the teacher. For the recognition of patterns with widely varying forms, such as free handwritten Chinese characters, the teacher plays a key role.

In fact, the integrated results combine the merits of each individual system, so a better result can be achieved. In an experimental system in which three handwritten Chinese character recognition systems are integrated by the proposed approaches, the integrated systems perform far better than any of the individual classifiers.

Section 2 briefly introduces the database for this work. Section 3 describes three handwritten Chinese character recognition systems. Section 4 proposes the integration approach based on the linear model. Section 5 describes the implementation of NISL. Section 6 provides experimental results and discussion. The final section offers concluding remarks.


Fig. 1. Some excellent samples in the database.

2. DATABASE

According to National Standard GB2312-80, there are more than 6000 commonly used Chinese characters, divided into two levels. The first level contains the 3755 most frequently used characters, organized by pronunciation into 40 subsets (39 subsets contain 94 classes each and the last one contains 89 classes). The second level includes another 3008 less frequently used characters. To support research on handwritten Chinese character recognition and to evaluate the performance of various recognition systems, a standard sample library is necessary. Thus, in early 1988, a 4-M Sample Library (4MSL) was collected by the Institute of Automation, Chinese Academy of Sciences. (5) The library contains 3755 first-level and 295 second-level Chinese characters as well as 10 categories of numerals, a total of 4060 classes of characters. Each class contains 1000 samples written by different people, so the library contains 4,060,000 samples in total. They are divided into four categories according to the quality of writing:

1. Excellent. Well written and scanned with weak noises.

2. Good. Slightly distorted, with weak noises, a few connections and breaks of strokes.

3. Moderate. Distorted, with connections and breaks of strokes.

4. Poor. Heavily distorted.

Samples in all categories except the poor one were scanned and stored in two different formats. In one format, the characters of the same class are stored together in a file; in the other, samples from different classes, one sample per class, are stored together in a file, so that people can use them conveniently.

We select the first-level characters (3755 classes) as our research object. For each class, 150 samples are selected from the 4MSL; some of them are shown in Fig. 1.

3. INTRODUCTION TO THE THREE CLASSIFIERS

It is well known that feature selection plays a very important role in a pattern recognition problem. Since handwritten Chinese characters are usually heavily distorted, the selection of stable features that tolerate writing distortions is a key problem in designing the classifiers. In this section, the three classifiers we designed are briefly introduced.

3.1. Preprocessing

Preprocessing includes three procedures: smoothing, nonlinearly normalizing (6) [to a size of 64 × 64, see Fig. 2(a)] and thinning. (7) For the convenience of feature extraction, each thinned image is divided into 8 × 8 nonuniform rectangular zones as shown in Fig. 2(b). A rectangular zone may be

1. an area of the background;

2. one part of a stroke;

3. an intersection of strokes.

3.2. Architecture of the individual systems

Taking account of the large variation of handwriting, which is probably the greatest difficulty of handwritten character recognition, three groups of features that are insensitive to writing variations are selected and input into three different classifiers, respectively. These classifiers are in fact template-matching machines with similar architecture. Their function is to compare the features of the input character with those of possible candidate templates, and output the candidates in descending order (as shown in Fig. 3) with respect to similarity between input features and those of the reference templates.

Fig. 2. (a) Examples of normalized characters. (b) Thinned characters in (a) are divided into nonuniform zones.

Fig. 3. Architecture of individual classifier. [Diagram: features of input characters -> classifier -> candidate 1, candidate 2, ..., candidate n.]

Supposing $R_i = (r_1^i, r_2^i, \ldots, r_n^i)$ is the feature vector of the reference template of the $i$th class, and $S = (s_1, s_2, \ldots, s_n)$ is the feature vector of the input sample, then the distance between them is

$$d(S, R_i) = f_1(s_1, r_1^i, s_2, r_2^i, \ldots, s_n, r_n^i)$$

and the similarity between them is

$$s(S, R_i) = f_2(d(S, R_i)),$$

where $f_1$ and $f_2$ are known functions; $f_1$ must satisfy the condition $d(S, S) = 0$, and $f_2$ must be in inverse proportion to $d(\cdot)$. In our case, $s(\cdot)$ is normalized so that $s(\cdot) \le 1$.
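The template-matching scheme can be sketched in Python as follows. The paper only states that f1 and f2 are known functions, so the city-block distance used for f1 and the mapping 1/(1 + d) used for f2 are assumptions chosen for illustration, as are all function names and the toy templates.

```python
from typing import Dict, List, Sequence, Tuple


def distance(s: Sequence[float], r: Sequence[float]) -> float:
    """Assumed f1: city-block distance, which satisfies d(S, S) = 0."""
    return sum(abs(a - b) for a, b in zip(s, r))


def similarity(d: float) -> float:
    """Assumed f2: decreases as d grows and never exceeds 1 (reached at d = 0)."""
    return 1.0 / (1.0 + d)


def rank_candidates(
    s: Sequence[float], templates: Dict[str, Sequence[float]], n: int
) -> List[Tuple[str, float]]:
    """Return the n most similar classes in descending order of similarity."""
    scored = [(label, similarity(distance(s, r))) for label, r in templates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Hypothetical two-dimensional reference templates for three classes.
templates = {"A": [0.0, 1.0], "B": [1.0, 1.0], "C": [3.0, 3.0]}
ranked = rank_candidates([0.0, 1.0], templates, n=2)
```

Here `ranked` holds the two best candidates, best first, which is the form of output that the integration stages in Sections 4 and 5 consume.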

3.3. The classifier based on hierarchical peripheral structures (C-HPS)

The thinned image is scanned from left to right, top to bottom, right to left and bottom to top. The first black pixels that the scan lines meet form the outermost periphery, the second black pixels form the second outermost periphery, and so on. Figure 4(b) and (c) show the outermost and the second outermost peripheries of the character shown in Fig. 4(a), respectively.

Fig. 4. (a) Thinned image of a character. (b) The outermost periphery of the character. (c) The second outermost periphery of the character.

3.3.1. Features of a black pixel point. For each black pixel of the peripheries, the following features are extracted:

1. Direction contributivity $(p_1, p_2, p_3, p_4)$. The definition of $p_i$ $(i = 1, 2, 3, 4)$ is similar to that in reference (8): supposing $l_m$ $(m = 1, 2, \ldots, 8)$ represent the eight directional run-lengths (Fig. 5) from the pixel to the edge of the character stroke, then

$$p_i = \frac{l_i + l_{i+4}}{\sqrt{\sum_{j=1}^{4} (l_j + l_{j+4})^2}}. \tag{1}$$

2. Distance $d$ from the starting edge of the scan: the run-length from the scan starting point to the pixel.

So, for each black pixel point, five features $(p_1, p_2, p_3, p_4, d)$ are extracted.
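Equation (1) can be illustrated with a short Python sketch. The function name, the handling of the all-zero case and the sample run-lengths are assumptions made for this example; the pairing of direction i with direction i + 4 follows the formula itself.

```python
import math
from typing import List, Sequence


def direction_contributivity(run_lengths: Sequence[float]) -> List[float]:
    """Compute (p1, p2, p3, p4) from the eight run-lengths l1..l8 of equation (1).

    run_lengths[m - 1] holds l_m. Each l_i is paired with l_(i+4), so every
    p_i measures the stroke extent along one of four orientations.
    """
    if len(run_lengths) != 8:
        raise ValueError("expected eight directional run-lengths")
    pair_sums = [run_lengths[i] + run_lengths[i + 4] for i in range(4)]
    norm = math.sqrt(sum(v * v for v in pair_sums))
    if norm == 0.0:
        # Assumption: a pixel whose run-lengths are all zero gets a zero vector.
        return [0.0] * 4
    return [v / norm for v in pair_sums]


# Hypothetical run-lengths: l1 = 3, l5 = 1, l7 = 3, all others 0.
p = direction_contributivity([3, 0, 0, 0, 1, 0, 3, 0])
```

With these values the pair sums are (4, 0, 3, 0) and the norm is 5, so the vector has unit Euclidean length, as equation (1) guarantees whenever at least one run-length is nonzero.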

3.3.2. Hierarchical peripheral structure represented by statistical features. The segmenting lines shown in Fig. 2(b) divide the outermost and the second outermost periphery into nonuniform parts [shown in Fig. 4(b) and (c), respectively]. Each part is a substructure of the periphery. Supposing substructure $S$ consists of $n$ black pixels with features $(p_1^i, p_2^i, p_3^i, p_4^i, d^i)$ $(i = 1, 2, \ldots, n)$, the averages of the features of these black pixels are taken as the features of $S$, that is, the features of $S$ are

$$P_j = \frac{1}{n} \sum_{i=1}^{n} p_j^i, \quad j = 1, 2, 3, 4, \tag{2}$$

$$D = \frac{1}{n} \sum_{i=1}^{n} d^i. \tag{3}$$

The periphery of each scan direction is divided into eight parts, so the outermost and the second outermost periphery are each divided into 32 (8 × 4 = 32) substructures, 64 (32 × 2 = 64) substructures in total. Each substructure has five statistical features, so the periphery structure is represented by a feature vector with 320 (64 × 5 = 320) dimensions.

Fig. 5. Eight directions of run-length.
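The averaging of pixel features into substructure features, and the resulting 320-dimensional vector, can be sketched as follows. The function names, the treatment of an empty substructure and the sample pixel values are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple

# Features of one periphery pixel: (p1, p2, p3, p4, d).
PixelFeatures = Tuple[float, float, float, float, float]


def substructure_features(pixels: Sequence[PixelFeatures]) -> List[float]:
    """Average each of the five pixel features over one substructure."""
    n = len(pixels)
    if n == 0:
        # Assumption: a substructure without periphery pixels maps to zeros.
        return [0.0] * 5
    return [sum(px[c] for px in pixels) / n for c in range(5)]


def hps_feature_vector(substructures: Sequence[Sequence[PixelFeatures]]) -> List[float]:
    """Concatenate the five averaged features of every substructure."""
    vector: List[float] = []
    for pixels in substructures:
        vector.extend(substructure_features(pixels))
    return vector


# Hypothetical data: 64 substructures (2 peripheries x 4 scan directions x 8 parts),
# each containing the same two pixels.
sample = [(1.0, 0.0, 0.0, 0.0, 2.0), (0.0, 1.0, 0.0, 0.0, 4.0)]
vector = hps_feature_vector([sample] * 64)
```

The length of `vector` is 64 × 5 = 320, matching the dimension stated in the text.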

3.3.3. Discriminating rules. Supposing the substructure set of an input candidate is $s = \{s_1, s_2, \ldots, s_{64}\}$ and the substructure set of the reference template is $r = \{r_1, r_2, \ldots, r_{64}\}$, the distance between the input and the reference is defined as

$$d(s, r) = \sum_{i=1}^{64} \mathrm{disp}(s_i, r_i) \times \frac{|D^{s_i} - D^{r_i}| + 8}{8.0} \tag{4}$$

and

$$\mathrm{disp}(s_i, r_i) = \sum_{j=1}^{4} |P_j^{s_i} - P_j^{r_i}|, \tag{5}$$

where $P_j^{s_i}$ and $D^{s_i}$ denote the averaged direction contributivity and distance features of substructure $s_i$ (and likewise for $r_i$). The smaller the distance, the more likely the input candidate is the reference character.
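A minimal Python sketch of equations (4) and (5) is given below. It assumes each substructure is stored as a five-element tuple (four averaged direction features followed by the averaged distance feature); this layout, the function names and the sample values are illustrative choices rather than the authors' implementation, and the example uses two substructures instead of 64 to stay short.

```python
from typing import Sequence, Tuple

# One substructure: (P1, P2, P3, P4, D).
Substructure = Tuple[float, float, float, float, float]


def disp(s: Substructure, r: Substructure) -> float:
    """Equation (5): city-block difference of the four direction features."""
    return sum(abs(s[j] - r[j]) for j in range(4))


def hps_distance(s: Sequence[Substructure], r: Sequence[Substructure]) -> float:
    """Equation (4): direction difference weighted by the positional difference."""
    if len(s) != len(r):
        raise ValueError("input and reference need the same number of substructures")
    return sum(
        disp(si, ri) * (abs(si[4] - ri[4]) + 8.0) / 8.0 for si, ri in zip(s, r)
    )


# Hypothetical input and reference with two substructures each.
inp = [(1.0, 0.0, 0.0, 0.0, 4.0), (0.0, 1.0, 0.0, 0.0, 10.0)]
ref = [(0.5, 0.5, 0.0, 0.0, 4.0), (0.0, 0.5, 0.5, 0.0, 2.0)]
d_value = hps_distance(inp, ref)
```

In this example both substructures have disp = 1.0; the first has no positional difference (weight 1) and the second differs by 8 in D (weight 2), so a positional mismatch doubles the penalty of the same direction mismatch.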

3.4. The classifier based on four corner structures (C-FCS)

Divide the character into four parts, and scan along four diagonal directions as shown in Fig. 6. The arrows indicate the four scan areas and point in the scan direction at the corresponding corner. Figure 7 shows the peripheries formed by the first black pixels at which the scan lines intersect the thinned image of the character shown in Fig. 4(a).

The periphery of each corner is divided into six parts by equally spaced diagonal lines, and for each part (or substructure), features similar to those introduced in Section 3.3.2 are extracted. So the dimension of the feature vector is 120 (4 × 6 × 5 = 120). The classifier is then designed with discriminating rules similar to those in Section 3.3.3.

3.5. Classifier based on local stroke direction (C-LSD)

As can be seen from the above sections, C-HPS and C-FCS put emphasis on outer structures; as a complement, C-LSD puts outer and inner structures on the same level of importance. Here, each rectangular zone introduced in Section 3.1 is taken as a substructure, and four direction contributivity features are extracted for each black pixel; the features of a substructure are the averages of those of the black pixels located in the corresponding zone. So each character is divided into 64 (8 × 8 = 64) substructures, and the dimension of the input feature vector of the classifier is 256 (64 × 4 = 256). The discriminating rule of the classifier is defined as

$$d(s, r) = \sum_{i=1}^{64} \mathrm{disp}(s_i, r_i). \tag{6}$$

4. CLASSIFIER INTEGRATION BASED ON A LINEAR MODEL

Our linear model is mainly based on the following empirical knowledge:

1. The discriminating abilities of an ad hoc classifier for different classes are different.

2. The ranked outputs support the final decision to different extents. Supposing the outputs are ranked in descending order of similarity, the lower the rank number of an output (that is, the nearer it is to the top of the list), the stronger the support it provides. If its rank number is above a threshold, the output provides a negative supporting effect rather than a positive one.

4.1. Integration based on a linear model (ILM)

Let

1. $x$ denote an input character;

2. $C_m$ $(m = 1, 2, \ldots, M)$ represent classes of characters;

3. $e_i$ $(i = 1, 2, \ldots, I)$ denote classifiers;

4. $m_i(x, j, k)$ be the normalized similarity measure between input $x$ and the $k$th candidate (belonging to class $C_j$) output by the $i$th classifier.

Then the support effect $A_i(x, j, k, l)$ of the $k$th candidate (belonging to class $C_l$) of classifier $e_i$ for the decision $x \in C_j$ is

$$A_i(x, j, k, l) = \begin{cases} m_i(x, l, k) \times R_i(j, l) \times W_k, & k \le N_t, \\ 0, & k > N_t, \end{cases} \tag{7}$$

Fig. 6. Four diagonal scan directions and corresponding scan areas.

Fig. 7. Peripheries corresponding to the four diagonal scan directions.

where $R_i(j, l)$ denotes the possibility that classifier $e_i$ ascribes an input of class $C_j$ to class $C_l$; it reflects the discriminating ability of the classifier. The larger $R_i(j, l)$ is, the larger the support for the decision $x \in C_j$. $W_k$ is the support coefficient of the $k$th candidate: the larger the rank number $k$, the smaller $W_k$. We select $W_k$ as

$$W_k = 1.0 - \beta \times k,$$

where $\beta$ is a constant larger than zero but far smaller than 1, and $N_t = 1/\beta$.

Equation (7) shows that only the $N_t$ lowest-ranked output candidates are considered to support the decision, and that the supporting effect is directly related to the similarity measure, the discriminating ability of the classifier and the rank of the candidate. The final similarity $M(x, j)$ between the input $x$ and class $C_j$ is defined as the sum of the supporting effects provided by the $N_t$ outputs of the different classifiers, that is

$$M(x, j) = \sum_{i=1}^{I} \sum_{k=1}^{N_t} A_i(x, j, k, l). \tag{8}$$

The integrated system can be depicted by the model shown in Fig. 8.

In Fig. 8, each of the three individual classifiers outputs $N_t$ measures of similarity between the input and its ranked candidates, so the input of the integrated system is at the measurement level. The function of the integration transform is twofold: on the one hand, it calculates the final similarity $M(x, j)$ between the input character $x$ and the candidate classes according to equation (8); on the other hand, it compares the final similarities and outputs the final class label of the input, i.e.

$$\text{if } M(x, l) = \max_j M(x, j), \text{ then } x \in C_l,$$

and the system outputs the label $l$.
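The linear integration of equations (7) and (8) and the final decision rule can be sketched in Python. The data layout (one ranked list of (class, similarity) pairs per classifier and one dictionary of R values per classifier), the function names and all numbers are assumptions made for this example; pairs (j, l) that are not stored are treated as giving no support.

```python
from typing import Dict, List, Tuple

# Ranked output of one classifier: [(class_label, similarity), ...], best first,
# so list position 0 corresponds to rank k = 1.
Ranked = List[Tuple[str, float]]
# R[i][(j, l)]: possibility that classifier i ascribes an input of class j to class l.
Confusion = Dict[Tuple[str, str], float]


def ilm_scores(
    outputs: List[Ranked], R: List[Confusion], beta: float, classes: List[str]
) -> Dict[str, float]:
    """Final similarity M(x, j) of equation (8) for every class j."""
    n_t = int(round(1.0 / beta))  # N_t = 1 / beta
    scores = {j: 0.0 for j in classes}
    for i, ranked in enumerate(outputs):
        for k, (l, m) in enumerate(ranked[:n_t], start=1):
            w_k = 1.0 - beta * k  # support coefficient W_k
            for j in classes:
                # Equation (7); unstored (j, l) pairs contribute nothing.
                scores[j] += m * R[i].get((j, l), 0.0) * w_k
    return scores


def ilm_decide(scores: Dict[str, float]) -> str:
    """Output the label with the largest final similarity."""
    return max(scores, key=scores.get)


# Hypothetical example with two classifiers and two classes.
outputs = [[("A", 0.9), ("B", 0.6)], [("B", 0.8), ("A", 0.7)]]
R = [
    {("A", "A"): 1.0, ("A", "B"): 0.2, ("B", "B"): 0.5, ("B", "A"): 0.1},
    {("A", "A"): 0.6, ("A", "B"): 0.3, ("B", "B"): 1.0, ("B", "A"): 0.2},
]
scores = ilm_scores(outputs, R, beta=0.25, classes=["A", "B"])
label = ilm_decide(scores)
```

In this toy case the two classifiers disagree on the top candidate, and the weighted sum resolves the conflict in favour of the class supported by the more reliable classifier at the better rank.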

4.2. Estimation of Ri(j,l)

As mentioned above, $R_i(j, l)$ reflects the discriminating ability of the classifier $e_i$. On the other hand, the discriminating ability of a classifier can be demonstrated by its recognition rate and substitution rate. That is, the discriminating ability for characters in class $C_j$ can be reflected by the matrix in equation (9). (4)

$$R_j = \begin{bmatrix} \eta^j_{11} & \eta^j_{12} & \cdots & \eta^j_{1M} \\ \vdots & \vdots & & \vdots \\ \eta^j_{I1} & \eta^j_{I2} & \cdots & \eta^j_{IM} \end{bmatrix}. \tag{9}$$

Each row of the matrix corresponds to a classifier and each column corresponds to a class. Element $\eta^j_{il}$ represents

1. the recognition rate of classifier $e_i$ for class $C_j$ when $j = l$;

2. the substitution rate at which classifier $e_i$ assigns characters of class $C_j$ to class $C_l$ when $j \ne l$.

Obviously, these elements can be easily acquired by supervised parameter estimation.

The structure of a character affects the discriminating ability of the classifiers; specifically, none of the classifiers may achieve a high recognition rate for classes that closely resemble others and are difficult to recognize. We therefore take the normalized value of the elements of the matrix as the estimate of $R_i(j, l)$, i.e.

$$R_i(j, l) = \frac{\eta^j_{il}}{\eta_{\max}}, \tag{10}$$

where $\eta_{\max} = \max_i \eta^j_{ij}$. After this transformation, $\max_i R_i(j, j) = 1.0$ for any class $C_j$, i.e. the scales of $R_i(j, l)$ are the same for all classes. There is another problem, however: the matrix is too large, because there are 3755 classes of commonly used Chinese characters to be recognized. For a particular character class, the lower-ranked candidates (those near the top of the list) are usually the classes most similar to the input and are relatively stable, and only these few candidates support the integration decision; the higher-ranked candidates are quite dissimilar to the input and count against ascribing the input to a lower-ranked candidate class. The dimension of the matrix can therefore be greatly reduced. For example, only a few $R_i(j, l)$ with larger values are stored for each class $C_j$, and the others are forced to zero (no effect) or to negative values (negative effect).
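The supervised estimation of R_i(j, l) can be sketched as follows, assuming that labelled training samples have already been run through each classifier and tallied into counts. The count layout, the function name and the numbers are hypothetical; the pruning of small entries described in the text is not shown.

```python
from typing import Dict, List, Tuple


def estimate_R(
    confusions: List[Dict[Tuple[str, str], int]], j: str, classes: List[str]
) -> List[Dict[str, float]]:
    """Estimate R_i(j, l) for one true class j as in equation (10).

    confusions[i][(j, l)] counts the training samples of class j that
    classifier i assigned to class l.
    """
    etas = []
    for counts in confusions:
        total = sum(counts.get((j, l), 0) for l in classes)
        # Recognition rate (l == j) and substitution rates (l != j).
        etas.append(
            {l: (counts.get((j, l), 0) / total if total else 0.0) for l in classes}
        )
    eta_max = max(eta[j] for eta in etas)  # best recognition rate for class j
    if eta_max == 0.0:
        raise ValueError("no classifier recognizes class " + j)
    return [{l: eta[l] / eta_max for l in classes} for eta in etas]


# Hypothetical counts for class "A" from two classifiers, ten samples each.
confusions = [
    {("A", "A"): 8, ("A", "B"): 2},
    {("A", "A"): 6, ("A", "B"): 4},
]
R_for_A = estimate_R(confusions, "A", ["A", "B"])
```

After normalization the best classifier for class "A" has R(A, A) = 1.0, so classes that every classifier finds hard are put on the same scale as easy ones.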

5. NETWORK INTEGRATION BASED ON SUPERVISED LEARNING

Another integration approach, Network Integration based on Supervised Learning (NISL), which differs from that of Section 4, is introduced in this section. Instead of a linear model, NISL uses a nonlinear model and realizes the integration by neural networks and supervised learning.

Fig. 8. Integrated system model. [Diagram: feature sets 1 to 3 feed the individual classifiers, whose outputs enter the integration transform.]

5.1. Implementation of NISL

Assume there is an $n$-class problem and there are $m$ classifiers $C_i$ $(i = 1, 2, \ldots, m)$ whose outputs are $O_{ij}$ $(j = 1, 2, \ldots, n, n + 1)$, where $j \le n$ is a class label and $j = n + 1$ denotes rejection. The objective of integration is to obtain the final decision by synthesizing the results of the $m$ classifiers.

In fact, if the outputs of the individual classifiers are viewed as a group of enhanced features, then integration is just the same as classification. Both can be regarded as a mapping from feature space to class space. Therefore, all kinds of methods that have been used in classification can also be applied to integration. Since it is difficult to express the mapping by an explicit function, the multilayer perceptron (MLP) network, which provides an effective method for machine learning, is used to carry out the integration.

If we use $O_{ij}$ as the inputs of the MLP network and the labels as its outputs, and train the network by supervised learning, then the integration of multiple recognition systems by NISL can be conveniently realized.

5.2. NISL for Chinese character recognition

The major difficulties of NISL for Chinese character recognition are due to the large number of classes. If we realize NISL as in Section 5.1, the network will have 3755 × m input nodes and 3755 output nodes, which is obviously difficult and impractical. This is in fact the difficulty of applying neural networks to large-vocabulary recognition problems. On the state of the art of handwritten Chinese character recognition using neural networks, we quote some comments from Hildebrandt and Liu: "Comparing neural network methods with the best results that have been otherwise obtained shows clearly that neural network recognition methods are still in their infancy. Although neural networks have shown great promise in 'toy' recognition problems, much work is required before they can be applied to practical problem such as this." (9) Mori et al. (10) make the same comments on this problem. On the other hand, it has been proved theoretically that the multilayer perceptron, when trained as a classifier using backpropagation, approximates the Bayes optimal discriminant function. (11) That is, although the MLP approximates the optimal classifier in theory, it will not be optimal when dealing with difficult practical problems. The reason is that training the network becomes very difficult, because of the large amount of computation required, when the number of classes is very large. To make the application of neural networks to Chinese character recognition practical, a multiple-subnet scheme is proposed as follows:

1. A preclassifier is necessary to reduce the number of candidates.

2. Instead of a whole integration network as in Section 5.1, we use multiple subnets, one for each class.

3. The outputs of the individual classifiers for each candidate are transferred to the corresponding subnet by a nonlinear transformation.

4. For each subnet, there is an input layer with m nodes and an output layer with only one node. At least one hidden layer is needed. The number of hidden nodes varies according to different situations.

5. The nonlinear transformation is defined as follows: assume that the preclassifier provides $k$ candidates and that the individual classifiers rearrange them in descending order. Suppose that for a specific class, say class $j$, the rank number is $R(j)$; the transformation (denoted by $T$) is defined as

$$T(O_{ij}) = \begin{cases} 1/[R(j) + 1], & \text{if } R(j) \in [0, k - 1], \\ 0, & \text{otherwise}, \end{cases}$$

where $i \in [1, m]$, $j \in [1, k]$.

6. Each subnet is trained by supervised learning.

According to the procedure mentioned above, we get $n$ subnets, each having $m$ input nodes and one output node. The structure of each subnet is so simple that training them becomes surprisingly easy. The scheme is illustrated in Fig. 9.

Fig. 9. Illustration of NISL for Chinese character recognition. [Diagram: classifier outputs O_1 to O_m -> nonlinear transformation T -> subnets 1 to k -> class 1 to class k, with a teacher providing supervision.]

Table 1. Recognition rate of individual classifiers

Rec. rate (%)           C-HPS   C-FCS   C-LSD
Excellent sample set    83.4    80.1    78.9
Good sample set         75.3    74.3    68.2
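The nonlinear transformation T of Section 5.2 and the construction of a subnet's input can be sketched in Python. The sketch assumes that each classifier's reordering of the preclassifier candidates is available as a list of class labels, best first, so that a label's list index is its rank number R(j); the function names and the example rankings are hypothetical, and the training of the subnets themselves is not shown.

```python
from typing import List, Sequence


def transform(rank: int, k: int) -> float:
    """T(O_ij): 1 / (R(j) + 1) for rank numbers 0..k-1, otherwise 0."""
    return 1.0 / (rank + 1) if 0 <= rank <= k - 1 else 0.0


def subnet_input(
    class_label: str, rankings: Sequence[Sequence[str]], k: int
) -> List[float]:
    """Build the m-dimensional input vector of the subnet for class_label.

    rankings[i] is classifier i's ordering of the candidates, best first.
    A class missing from a ranking is given rank -1, which T maps to 0.
    """
    features = []
    for ranked in rankings:
        rank = ranked.index(class_label) if class_label in ranked else -1
        features.append(transform(rank, k))
    return features


# Hypothetical rankings from m = 3 classifiers over k = 3 candidates.
rankings = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"]]
input_for_A = subnet_input("A", rankings, k=3)
input_for_D = subnet_input("D", rankings, k=3)
```

Each subnet therefore sees only m numbers between 0 and 1, which is what keeps the individual networks small regardless of the total number of classes.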

6. EXPERIMENTAL RESULTS AND DISCUSSION

6.1. Comparison of the three classifiers

To compare the performance of the classifiers, an experiment was carried out. For 3755 classes of characters, 10 samples per class from the excellent sample set and 10 from the good sample set, a total of 75,100 characters, were tested using each of the classifiers as a preclassifier. The recognition rate of each classifier is shown in Table 1. The accumulated recognition rates are shown in Fig. 10. It can be seen that C-HPS and C-FCS are much better than C-LSD. There may be two reasons for this. One is that LSD features are so rigid that they are sensitive to writing variations, which are common in handwriting, while the templates of C-HPS and C-FCS are much more flexible. The other is that the features of HPS and FCS contain position information of substructures, so they carry more information than those of LSD.

6.2. Performance test of the integrated systems

To test the performance of our metasynthetic approaches, the above three classifiers are combined according to the proposed approaches. Ten samples per class from the excellent sample set and 10 from the good sample set were used as testing sets. In the NISL system, a preclassifier provides 100 candidates for further recognition, and only one hidden layer with 10 nodes was used for each subnet. Training the 3755 subnets took less than 10 h on a Pentium/90 PC. The results are shown in Table 2.

Fig. 10. Accumulated recognition rate of different classifiers. [Plot: accumulated recognition rate (60 to 100) against candidate order (1, 2, 5, 10) for C-HPS, C-FCS and C-LSD.]

Table 2. Comparison of individual classifiers and the metasynthetic systems

                        Individual classifiers     Integrated systems
Rec. rate (%)           C-HPS   C-FCS   C-LSD      ILM     NISL
Excellent sample set    83.4    80.1    78.9       89.9    90.55
Good sample set         75.3    74.3    68.2       81.7    --

"--" stands for "not tested".


The experimental results show that the performance of the integrated systems is conspicuously better than that of any individual classifier. The linear model appears to be nearly as effective as the nonlinear model, even though it is computationally simpler. It is nevertheless encouraging that neural networks can be used to realize an integration system for such a large-set pattern recognition problem.
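One way to quantify the gain reported in Table 2 is the relative reduction in error rate with respect to the best individual classifier. This measure is not used in the paper; it is added here only as a hedged illustration, computed from the Table 2 figures, and the function name `relative_error_reduction` is hypothetical.

```python
# Recognition rates (%) copied from Table 2; None marks "not tested".
table2 = {
    "excellent": {"C-HPS": 83.4, "C-FCS": 80.1, "C-LSD": 78.9,
                  "ILM": 89.9, "NISL": 90.55},
    "good": {"C-HPS": 75.3, "C-FCS": 74.3, "C-LSD": 68.2,
             "ILM": 81.7, "NISL": None},
}
INDIVIDUAL = ("C-HPS", "C-FCS", "C-LSD")
INTEGRATED = ("ILM", "NISL")


def relative_error_reduction(best_individual: float, integrated: float) -> float:
    """Percentage of the best individual classifier's errors that the
    integrated system removes."""
    base_error = 100.0 - best_individual
    return 100.0 * (integrated - best_individual) / base_error


reductions = {}
for sample_set, rates in table2.items():
    best = max(rates[name] for name in INDIVIDUAL)
    for system in INTEGRATED:
        if rates[system] is not None:
            reductions[(sample_set, system)] = round(
                relative_error_reduction(best, rates[system]), 1
            )
```

On these figures, ILM removes roughly 39% of the errors of C-HPS on the excellent sample set and roughly 26% on the good sample set, while NISL removes roughly 43% on the excellent sample set.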

7. CONCLUDING REMARKS

In this paper, three classifiers for handwritten Chinese character recognition are introduced. Then, from the point of view of the metasynthetic approach, two integration approaches are proposed. The main point of metasynthesis is that not only multiple classifiers but also human intelligence are combined. Because handwritten Chinese character recognition is so difficult that no simple scheme can solve it, we believe human-machine integration is a promising strategy. Of course, multiple classifier integration is just an initial step toward realizing the idea of metasynthesis for pattern recognition.

REFERENCES

1. Qian Xuesen, Yu Jingyuan and Dai Ruwei, A New Discipline of Science--The Study of Open Complex Giant System and Its Methodology, Chinese J. System Eng. Electronics (in English) 4(2), 2-12 (1993).

2. Ru-Wei Dai, J. Wang and J. Tian, Metasynthesis of Intelligent Systems. Zhejiang Science and Technology Press, Hangzhou, China (1995).

3. S. Mori, K. Yamamoto and M. Yasuda, Research on machine recognition of handprinted characters, IEEE Trans. Pattern Analysis Mach. Intell. PAMI-6(4), 386-405 (1984).

4. L. Xu, A. Krzyzak and C. Y. Suen, Method of combining multiple classifiers and their application to handwriting recognition, IEEE Trans. SMC 22(3), 418-435 (1992).

5. J.-W. Tai (R. W. Dai), Y.-J. Liu and L.-Q. Zhang, A new approach for feature extraction and feature selection of handwritten Chinese character recognition, From Pixels to Features III: Frontiers in Handwriting Recognition, pp. 479-489. Elsevier, Amsterdam (1992).

6. Seong-Whan Lee and Jeong-Seon Park, Nonlinear shape normalization methods for the recognition of large-set handwritten characters, Pattern Recognition 27(7), 895-902 (1994).

7. N. J. Naccache and R. Shinghal, SPTA: A proposed algorithm for thinning binary patterns, IEEE Trans. SMC 14(3), 409-418 (1984).

8. T. Akiyama and N. Hagita, Automated entry system for printed documents, Pattern Recognition 23(11), 1141-1154 (1990).

9. T. H. Hildebrandt and W. Liu, Optical recognition of handwritten Chinese characters: Advances since 1980, Pattern Recognition 26(2), 205-225 (1993).

10. S. Mori, C. Y. Suen and K. Yamamoto, Historical review of OCR research and development, Proc. IEEE 80(7), 1027-1058 (July 1992).

11. D. W. Ruck, S. K. Rogers, M. Kabrisky, M. E. Oxley and B. W. Suter, The multilayer perceptron as an approximation to a Bayes optimal discriminant function, IEEE Trans. Neural Networks 1(4), 296-298 (December 1990).

About the Author--HONG-WEI HAO was born in 1967 and received the B.Sc. degree in Computer Science from North China Institute of Technology in 1987. He is now a Ph.D. student at the Institute of Automation, Chinese Academy of Sciences. His research interests include character recognition, artificial neural networks and signal processing.

About the Author--XU-HONG XIAO was born in 1970 and received her B.Sc. degree in Electrical Engineering and Information Science from the University of Science and Technology of China in 1992. She is now a Ph.D. student at the Institute of Automation, Chinese Academy of Sciences. Her research interests are in character recognition, document analysis, signature verification, pattern recognition and image processing.

About the Author--RU-WEI DAI graduated from Beijing University in 1955, and worked at the Chinese Academy of Sciences. From 1980 to 1982, he was a visiting scholar at the School of Engineering, Purdue University. He has published more than 150 articles in China and abroad. He was elected a member of the Chinese Academy of Sciences in 1991. He is the Chief Editor of the Chinese Journal of Pattern Recognition and Artificial Intelligence. His research interests are in pattern recognition, Chinese character recognition and artificial intelligence.