
ELSEVIER Image and Vision Computing 14 (1996) 647-657

Invariant handwritten Chinese character recognition using fuzzy ring data

Din-Chang Tseng *, Hung-Pin Chiu, Jen-Chieh Cheng

Institute of Computer Science and Information Engineering, National Central University, Chung-Li, Taiwan 320, Republic of China

Received 12 December 1994; revised 21 November 1995

Abstract

An invariant handwritten Chinese character recognition system is proposed. Characters can be in arbitrary location, scale and orientation. Five invariant features are employed in this study. The first four features are used for preclassification to reduce matching time. The last feature, ring data, constructs ring-data vectors to characterize character samples and constructs weighted ring-data matrices to characterize characters to further reduce matching time. Fuzzy membership functions are defined based on these two characterizations to match characters. A character set is constructed from 200 handwritten Chinese characters, comprising several different samples of each character in arbitrary orientations. The performance of the proposed invariant features and fuzzy matching is verified through extensive experiments with the character set: (i) the performance of the proposed fuzzy matching is superior to that of two traditional statistical classifiers; (ii) the performance of the fuzzy ring-data vector is clearly superior to that of the fuzzy ring-data matrix, but the latter needs less matching time; (iii) the preclassification reduces the fuzzy matching time and improves the recognition rate; and (iv) the performance of the proposed invariant features is clearly superior to that of moment invariants.

Keywords: Handwritten Chinese character recognition; Invariant; Ring data; Fuzzy theory; Preclassification

1. Introduction

A system is proposed to recognize handwritten Chinese characters on hand-drawn Chinese maps. The characters may vary arbitrarily in location, scale and orientation.

A survey of the pattern recognition literature [1,2] reveals that a pattern recognition system must solve two problems: how input patterns should be represented, and how input patterns should be classified based on that representation. In general, pattern representations vary with the extracted features [1-25]. Local and global features are usually used as feature vectors, and then statistical recognition methods [2,7] or neural networks [3-6,8-10] are used to classify the feature vectors. When structural features such as strokes are used, a pattern is generally represented by a relational graph or a picture description language, and then relaxation matching [11-13] or a syntactic approach [2] is used to match the patterns. An invariant pattern recognition system [3-6,17-25] is a recognition system that recognizes patterns despite arbitrary variations in location, scale and orientation.

* Corresponding author. Email:[email protected]

0262-8856/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved. SSDI 0262-8856(95)01078-5

A large number of studies on character recognition have been reported [1,7-22]. However, only a few researchers have focused on recognizing handwritten characters invariantly [17-22], since the written variations are large and invariant features are hard to find. Previous invariant character systems were mainly concerned with recognizing symbols [22], numerals [19,20] and English characters [18,20-22]. Few researchers were concerned with invariantly recognizing handwritten Chinese characters. On the basis of a least-square-error function defined on the end points of a stroke, Liao and Huang [17] proposed an invariant matching algorithm to match the radicals of Chinese characters invariantly; however, the rotation invariance is limited to a small range.

Neural networks have been used to extract features and/or to classify patterns efficiently. Three classes of techniques for invariant recognition using neural networks have been proposed [23]. The first class is an exhaustive training scheme [18]. Different transformations of objects are presented to a neural network for training. The number of examples shown must be large enough that the network can correctly generalize to transformations other than those shown. This is the so-called problem of statistical significance [25]. Since the complexity of Chinese characters is high, the image must


be large enough to retain sufficient information for recognition. Larger images aggravate the implementation problems associated with this type of learning. Moreover, any desired invariance makes the training set much larger and correspondingly slows the learning process [25].

Fig. 1. Flowchart of the proposed invariant handwritten Chinese character recognition system.

In the second class, the network structures are designed such that their functions are invariant to certain transformations. These network models include Widrow's model [24], high-order neural networks [4,19] and the Neocognitron [20]. These models have their own limitations. The numbers of connections and neurons required in Widrow's model become prohibitively large for larger images. An inherent problem in the implementation of high-order neural networks is the combinatorial increase in the number of neuron weights with the order of the network; this combinatorial explosion worsens as the size of the input image increases. The network structure of the Neocognitron is very complex and is not invariant to rotation.

The third class uses input features that are invariant to the required transformations, and the role of the neural network is simply to act as a classifier for the selected invariant features [3-5,19,21-23]. The models of this class are generally practicable, because they combine the strength of conventional feature-extraction methods with the classifying power of neural networks [10]. Moment invariants and Zernike moments [19,21] are the more popular invariant features and have been extensively used in a wide range of pattern recognition applications. However, extraction of these invariant features is usually time-consuming. In this study, an adequate normalization process is first used to normalize the characters so that they are invariant to translation and scale

[19,21,22]. Then we extract simple rotation-invariant features to satisfy all the required invariances. Fuzzy-set approaches [26,27], instead of neural networks, are proposed to classify the invariant features, since the experimental character set is large. The performance of a neural network classifier on a small character set is presented in Ref. [28]. Besides, preclassification [1] is usually used to reduce the classification time for larger data sets, so we also analyze its advantages in this study.

In this paper, an invariant handwritten Chinese character recognition system is proposed, as shown in Fig. 1. A normalization method is first used to normalize characters so that they are invariant to translation and scaling. Five rotation-invariant features - the numbers of strokes, multi-fork points, total black pixels and connected components of characters, and ring data - are then extracted. A thinning process acquires the skeletons of the normalized characters, and three primitives are then extracted from the thinned characters: 1-fork points, corner points and multi-fork points. The first two features are derived from the primitives, and the other three features are extracted directly from the thinned characters. In the training stage, all reference character samples are clustered based on the first four invariant features. In the recognition stage, an unknown input sample is first preclassified to an appropriate cluster, and then matched with all reference character samples or characters in that cluster. Ring-data vectors and weighted ring-data matrices are constructed from ring data to characterize character samples and characters. Fuzzy membership functions are defined based on these two characterizations to compute the fuzzy similarity between each reference character sample or character and an input sample. A character set was constructed from 200 handwritten Chinese characters, comprising several different samples of each character in arbitrary orientations. Several experiments were carried out with the character set to evaluate the performance of the proposed method. Other experiments were performed to verify the power of the proposed method, including a comparison of the proposed ring-data features with moment invariants, and a comparison of the proposed fuzzy classifier with two traditional statistical classifiers: the nearest-neighbor and minimum-mean-distance classifiers [21].

This paper is organized as follows. In Section 2, normalization, thinning and primitive extraction are described. Section 3 introduces the five invariant features, along with the clustering and preclassification of characters based on the first four invariant features. The rotation-invariant ring-data vector, the weighted ring-data matrix, and their corresponding fuzzy-membership matching functions are described in Section 4. Two traditional statistical classifiers are also briefly described in that section. Experiments are presented in Section 5. Section 6 states conclusions and ideas for further work.


Fig. 2. An illustration of character primitives. '1' indicates a 1-fork point, '2' is a 2-fork point, '3' and '4' are multi-fork candidates, '@' is a corner candidate, and 'C' is a corner point.

2. Preprocessing

This section describes the preprocessing of Chinese characters for more efficient extraction of character features. Preprocessing includes normalization, thinning and primitive extraction. Moment normalization [21] is employed in this study, since it results in less distortion in rotated patterns. The character primitives - 1-fork points, corner points, and multi-fork points - are extracted using the CPE and corner-checking algorithms to construct the invariant features. These primitives are extracted from thinned characters that are obtained using Chen's parallel thinning algorithm [29].

In a binary image, '1' denotes a black pixel and '0' denotes a white pixel. A character is composed of black pixels. Considering a black pixel x in a thinned character, the fork degree D_f(x) of x is defined as

D_f(x) = Σ_{i=0}^{7} a_i,  (1)

where the a_i (i = 0, ..., 7) are the 8-neighbor pixels of x, and each a_i is '1' or '0'. A multi-fork point is defined as a pixel whose fork degree is greater than two. Thinning often splits a 4-fork point into two 3-fork points; the

problem can be eliminated by using the maximum-circle technique [30]. If S is an input character and T is its thinned equivalent, then for every fork point p in T, the radius of the largest circle centered at p and lying within S is calculated. These radii are used to group all fork points into clusters, and the locations of the fork points in each cluster are averaged to get a cluster center. All fork points in a cluster are replaced by the cluster center, which is taken as a new fork point. Fork points are accurately extracted this way, but the maximum-circle technique is time-consuming. In this paper, we not only propose a CPE (Character Primitive Extracting) algorithm to solve the fork-point-splitting problem, but also propose a corner-checking algorithm to find corner points.

The CPE algorithm is powerful and efficient; it consists of two passes. In pass 1 we pick up every 2-fork point and check whether it is a corner candidate with the corner-checking algorithm; we also pick up every multi-fork point and label it a multi-fork candidate. In pass 2, we group every cluster of corner candidates to get a corner point, and group every cluster of multi-fork candidates to get a multi-fork point, based on the structure of the character skeleton. The CPE algorithm is described below, and an example is illustrated in Fig. 2.

Algorithm (CPE): Character_Primitive_Extracting
Input: A thinned pattern.
Output: The pattern primitives.
Beginning of algorithm

Step 1: (Pass 1) Scan the pattern from left to right and top to bottom using a 3 x 3 mask to label each black pixel as a 1-fork point, a 2-fork point, or a multi-fork candidate. Once a 2-fork point is found, the corner-checking algorithm is used to determine whether it is a corner candidate.

Step 2: (Pass 2) Re-scan the pattern from left to right and top to bottom. If a cluster of connected corner candidates is found, these corner candidates are grouped together to get one corner point. As soon as a cluster of connected multi-fork candidates is found, these candidates are grouped to get one multi-fork point.

End of algorithm
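Pass 1 of the CPE algorithm can be sketched in a few lines (a minimal Python illustration of the fork-degree labelling; the function names and dictionary output are our own, and the corner-checking step is only marked by a comment):

```python
def fork_degree(img, x, y):
    """Fork degree D_f: the number of black 8-neighbour pixels of (x, y)."""
    h, w = len(img), len(img[0])
    deg = 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) == (0, 0):
                continue
            nx, ny = x + dx, y + dy
            if 0 <= nx < h and 0 <= ny < w and img[nx][ny] == 1:
                deg += 1
    return deg

def label_primitives(img):
    """Pass 1 of a CPE-style scan over a thinned binary pattern: label each
    black pixel as a 1-fork point, a 2-fork point, or a multi-fork candidate."""
    labels = {}
    for x, row in enumerate(img):
        for y, v in enumerate(row):
            if v != 1:
                continue
            d = fork_degree(img, x, y)
            if d <= 1:
                labels[(x, y)] = '1-fork'
            elif d == 2:
                labels[(x, y)] = '2-fork'  # corner-checking would run here
            else:
                labels[(x, y)] = 'multi-fork candidate'
    return labels

# A one-pixel-wide horizontal stroke: endpoints are 1-fork, interior is 2-fork.
stroke = [[0] * 5 for _ in range(5)]
for y in range(5):
    stroke[2][y] = 1
labels = label_primitives(stroke)
```

Pass 2 (grouping connected clusters of candidates into single corner and multi-fork points) would follow with the same row-by-row scan.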

Detecting corner points in character patterns is difficult. Freeman and Davis [31] proposed a sequential corner-finding algorithm to detect sharp corners in a chain-coded plane curve based on incremental curvature. Cheng and Hsu [32] proposed a parallel algorithm based on an extended (k-step) 3 x 3 mask to detect corners on digitized curves; the extended mask gathers sufficient information for corner detection. In this study, we propose an efficient and concise corner-checking algorithm to determine whether a 2-fork point is a corner candidate or


Fig. 3. The 24 direction codes of the 7 x 7 mask.

not. Assume that p is a 2-fork point; a 7 x 7 mask centered at p is used to evaluate the angle θ_p at p. As shown in Fig. 3, 24 different directions are considered; the 24 boundary points of the 7 x 7 mask, representing the 24 directions, are called direction codes. θ_p is set to |b - a| x 15°, where a and b are the direction codes of the two boundary points. If the angle θ_p is less than a given threshold θ_1 or greater than another threshold θ_2, then point p is taken to be a corner candidate. The corner-checking algorithm is stated below.

Algorithm: corner_checking
Input: A 2-fork point.
Output: Whether the 2-fork point is a corner candidate or not.
Beginning of algorithm

Step 1: Check the 2-fork point p using a 3 x 3 mask. If the three black pixels in the mask form a straight line, p is not a corner candidate; return to the CPE algorithm.

Step 2: Enlarge the 3 x 3 mask to a 7 x 7 mask. The angle θ_p at point p is calculated using the 24 direction codes. Point p is a corner candidate if θ_p is less than θ_1 or greater than θ_2.

End of algorithm

It should be noted that the corner-checking algorithm only finds corner candidates, not real corner points; the processing is therefore simple. In pass 2 of the CPE algorithm, the real corner points are extracted using a simple merging technique. Three main advantages of the CPE and corner-checking algorithms are:

(i) The algorithms scan a pattern row by row and do not track a chain-code sequence, so primitives can be determined as each point is scanned.

(ii) The algorithms can be performed in parallel because each scanned point is locally processed.

(iii) The corner-checking algorithm is simple and efficient. In computing corner angles, only the direction codes are needed, as opposed to tedious calculation of the inverse cosine function.
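The angle test in Step 2 can be sketched as follows (a hedged Python illustration: the 15° step per direction code follows the text, but the concrete thresholds theta1 and theta2 are illustrative assumptions, not values from the paper):

```python
def corner_angle(code_a, code_b):
    """theta_p = |b - a| x 15 degrees, from two direction codes (0..23) on
    the boundary of the 7x7 mask. Raw differences above 180 degrees describe
    the same geometric angle measured the other way round, which is why the
    test below uses two thresholds."""
    return abs(code_a - code_b) * 15

def is_corner_candidate(code_a, code_b, theta1=60, theta2=300):
    """A 2-fork point is a corner candidate when theta_p < theta1 or
    theta_p > theta2 (the threshold values here are illustrative)."""
    theta = corner_angle(code_a, code_b)
    return theta < theta1 or theta > theta2
```

For example, codes 0 and 2 describe a sharp 30° pair of branches (a candidate), while codes 0 and 6 describe a 90° pair (not a candidate under these thresholds).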

3. Invariant features and preclassification

In this paper, we combine a translation- and scale-invariant normalization process with rotation-invariant features to achieve invariant recognition. Five rotation-invariant features are used: the number of strokes, the number of multi-fork points, the number of total black pixels, the number of connected components, and ring data. Based on the first four invariant features, a preclassification is employed to reduce the matching time. Ring data are used in the matching process presented in the next section.

3.1. Invariant features

Among the five invariant features, the numbers of strokes, connected components and multi-fork points are reliable structural features of characters. The number of total black pixels is a stable global feature; and ring data are local features that reveal the pixel distribution among the various rings of a character pattern. The first four features are calculated as follows:

(i) The number of strokes is estimated by the formula [16]:

number of strokes ≈ [(number of 1-fork points) + (number of corner points) + (number of multi-fork points)]/2.

(ii) The number of multi-fork points is calculated directly by the CPE algorithm.

(iii) The numbers of connected components and total black pixels are calculated during a row-by-row scan of the pattern. When a black pixel is encountered, the number of total black pixels is increased by one. In addition, if the black pixel has not yet been flagged, its connected pixels are recursively traversed in order and flagged to indicate that they have been visited. Recursion terminates when all pixels of a connected component have been traversed, and the number of connected components is then increased by one. When all pixels of the pattern have been scanned, the numbers of connected components and total black pixels have been obtained.
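The joint count described in (iii) can be sketched as follows (a minimal Python version; an explicit stack replaces the recursion of the text to avoid deep call chains, and 8-connectivity is assumed):

```python
def count_components_and_pixels(img):
    """One row-by-row scan over a binary pattern: count total black pixels
    and 8-connected components, flagging visited pixels via flood fill."""
    h, w = len(img), len(img[0])
    visited = [[False] * w for _ in range(h)]
    pixels = 0
    components = 0
    for x in range(h):
        for y in range(w):
            if img[x][y] != 1:
                continue
            pixels += 1
            if visited[x][y]:
                continue
            components += 1
            stack = [(x, y)]            # explicit stack instead of recursion
            visited[x][y] = True
            while stack:
                cx, cy = stack.pop()
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < h and 0 <= ny < w
                                and img[nx][ny] == 1 and not visited[nx][ny]):
                            visited[nx][ny] = True
                            stack.append((nx, ny))
    return components, pixels
```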

An example is illustrated in Fig. 4. Assume that a normalized N x N binary pattern is defined as f(x, y), x = -N/2, ..., 0, ..., N/2; y = -N/2, ..., 0, ..., N/2.

The centroid of the normalized pattern is set to the origin for simplicity. Ring data(r) [33] is defined as the total number of black pixels whose distances (or radii) to the centroid of a character pattern are r, as shown in Fig. 5. In this definition, the ring width is one; if the ring width is greater than one, the ring data are defined as follows:

ring data(l) = Σ_{(x,y): S = l} f(x, y),  S = int[(√(x² + y²) - 1)/w] + 1,  l = 1, 2, ..., L,  (2)

where w is the ring width, L = int[N/2] is the largest radius, and int[ ] means taking the integer part of a real number. In our experiments, f(0, 0) is merged into ring data(1), and the pixels outside the radius L are forcibly merged into ring data(L) to include more information for recognition. These L ring data are regarded as an L-dimensional ring-data vector. The value of L varies with the ring width; we will experimentally examine the performance of different ring widths.

The character primitives
* The number of 1-fork points = 13.
* The number of corner points = 5.
* The number of multi-fork points = 8.
The character features
* The number of strokes = (13 + 5 + 8)/2 = 13 (the correct number of strokes is 12).
* The number of multi-fork points = 8.
* The number of total black pixels = 305.
* The number of connected components = 1.
Fig. 4. The character primitives and features of a character sample.

A ring look-up table is constructed based on Eq. (2) to speed up the calculation of ring-data vectors. Each entry in the table stores the ring number associated with a specific location in the normalized character pattern. It should be noted that a normalized pattern's centroid does not coincide exactly with its center; however, the difference in location is minor, so we treat the two as identical. In fact, similar results are obtained using either one to extract ring data.
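The ring-data computation of Eq. (2), together with the look-up table described above, can be sketched as follows (a hedged Python illustration; the formula for the number of rings at general ring widths is our assumption, and the centre pixel and out-of-range radii are merged as the text describes):

```python
import math

def ring_table(N, w=1):
    """Ring look-up table for an N x N normalized pattern (centre taken at
    index N//2): ring number S = int[(sqrt(x^2 + y^2) - 1)/w] + 1, with the
    centre pixel merged into ring 1 and radii beyond the last ring merged
    into ring L, as described in the text."""
    L = (N // 2 - 1) // w + 1        # number of rings (assumed form)
    c = N // 2
    table = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            r = math.hypot(i - c, j - c)
            s = 1 if r == 0 else int((r - 1) / w) + 1
            table[i][j] = min(max(s, 1), L)
    return table, L

def ring_data_vector(img, w=1):
    """Accumulate black pixels per ring using the look-up table; the result
    is rotation invariant because rotation preserves radii."""
    N = len(img)
    table, L = ring_table(N, w)
    vec = [0] * L
    for i in range(N):
        for j in range(N):
            if img[i][j] == 1:
                vec[table[i][j] - 1] += 1
    return vec
```

A pixel at distance 2 from the centre contributes to the same ring wherever it lies on that circle, which is the source of the rotation invariance.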

Fig. 5. Ring data.

3.2. Clustering and preclassification

So many written variations of handwritten Chinese characters exist that it is inefficient to match an input sample against all reference samples in a character set. To reduce the amount of matching, preclassification is indispensable. In the training stage, a clustering algorithm is employed to partition the reference character set into several clusters based on the first four features. In the recognition stage, an unknown sample is first assigned to one cluster based on the same four features; this step is known as 'preclassification'. The input sample is then matched with all character samples or characters in the assigned cluster.

The maximin-distance clustering algorithm [2] is used in the training stage. This algorithm is a heuristic procedure based on the Euclidean distance. The training features are normalized to zero mean and unit variance before clustering [21], to ensure that no subset of the features dominates the distance measure. The clustering algorithm is as follows:

Algorithm: the maximin-distance clustering algorithm
Input: Normalized feature vectors of all reference samples in the character set.
Output: A number of clusters composed of the reference samples.
Beginning of algorithm

Step 1: Generate the first and second cluster centers. (i) Arbitrarily select a sample X_1 to be the first cluster center Z_1. (ii) Take the sample furthest from Z_1 to be the second cluster center Z_2.

Step 2: Find a new cluster center based on the set of established cluster centers. (i) Assume that Z_1, Z_2, ..., and Z_n are the n established cluster centers. Compute the distances from all remaining samples to these cluster centers. For each remaining sample X, record the minimum distance to the cluster centers. Over all remaining samples, take the maximum of these minimum distances:

d_mm = Max_X [ Min_i d(X, Z_i) ],  (3)

where d( ) is the Euclidean distance. (ii) Let X_mm be the sample possessing the maximin distance, and let c be a constant, 0 ≤ c ≤ 1. The average distance between all cluster centers is

d_av = [2/(n(n - 1))] Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} d(Z_i, Z_j).  (4)

If d_mm > c d_av, then X_mm is set to be the new cluster center Z_{n+1}, and Step 2 is repeated; otherwise go to Step 3.

Step 3: All cluster centers have now been established. Assign each remaining sample to its nearest cluster by checking the distances to all cluster centers. To obtain a more representative center for each cluster, take the sample mean of each cluster.

End of algorithm

It is easy to extend the reference character set without changing the original cluster centers. We simply measure the distances from the new sample to all cluster centers. If the minimum distance is large enough, a new cluster center is generated; otherwise, the new sample is assigned to the cluster whose center is nearest. The constant c in the algorithm controls the average size of the clusters. The cluster reduction rate is evaluated in Section 5. In the recognition stage, preclassification selects the cluster whose center is nearest to the input sample; matching is then performed within that cluster.

4. Matching

Many human decisions are ill-defined due to uncertain information or environments. Fuzzy-set theory [26,27] was developed to deal with this problem of uncertainty. In this study, fuzzy membership functions are proposed based on the ring-data vector and the weighted ring-data matrix to match character samples or characters. An input sample is matched with all reference character samples in the assigned cluster using ring-data-vector membership functions, or matched with all reference characters in the assigned cluster using ring-data-matrix membership functions. The input sample is recognized as the one possessing the highest membership value.

4.1. Fuzzy ring-data vector

Let R = [r_1, r_2, ..., r_L] and U = [u_1, u_2, ..., u_L] be the ring-data vectors of reference and unknown samples, respectively; let R_j = [r_j1, r_j2, ..., r_jL] be the ring-data vector of the jth reference sample in a cluster. The membership function for the jth reference sample, v_j(U), 0 ≤ v_j(U) ≤ 1, measures the degree of similarity between U and R_j. Since the ring data along each dimension have different ranges, we re-scale the ring data to [0, 1] to compute the membership value of each dimension consistently. If ring datum r_k is re-scaled to s(r_k), then

s(r_k) = (r_k - r_lo)/(r_up - r_lo),  (5)

where r_up and r_lo are the upper and the lower bounds of all r_k's. The membership function for the jth reference sample R_j is now defined as

v_j(U) = (1/L) Σ_{k=1}^{L} f(|s(r_jk) - s(u_k)|, α),  (6)

where

f(d, α) = 1 - dα, if dα < 1; 0, otherwise.

The function f(|s(r_jk) - s(u_k)|, α) measures the degree of similarity between s(u_k) and s(r_jk), and the positive parameter α controls how fast the membership value decreases as the distance |s(r_jk) - s(u_k)| increases. If

v_mm(U) = Max_j v_j(U),  (7)

the most similar reference sample to the unknown input U is R_mm, and U is classified to the character to which R_mm belongs.
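The vector-matching step can be sketched as follows (a hedged Python illustration: the linear decay f(d, alpha) = max(0, 1 - d*alpha) is our reading of the garbled definition, and alpha = 2.0 is an illustrative value):

```python
def match_ring_vectors(U, references, r_lo, r_up, alpha=2.0):
    """Fuzzy ring-data-vector matching: re-scale every ring datum to [0, 1],
    average the per-ring similarities, and return the index of the reference
    sample with the highest membership value."""
    def s(r):                        # re-scaling to [0, 1]
        return (r - r_lo) / (r_up - r_lo)
    def f(d):                        # membership decay (assumed linear form)
        return max(0.0, 1.0 - d * alpha)
    L = len(U)
    values = [sum(f(abs(s(rk) - s(uk))) for rk, uk in zip(R, U)) / L
              for R in references]
    best = max(range(len(references)), key=lambda j: values[j])
    return best, values
```

An identical reference vector yields membership 1.0, and the value decays toward 0 as the re-scaled ring data diverge.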

4.2. Fuzzy ring-data matrix

To cope with the high written variation of handwritten Chinese characters and to further reduce the matching time, we here use a weighted ring-data matrix instead of a ring-data vector for matching. We first quantize a ring-data vector R to an (M + 1)-level vector Q = [q_1, q_2, ..., q_L], where

q_k = 0, if r_k ≤ t_k1;
q_k = n, if t_kn < r_k ≤ t_k(n+1);  (8)
q_k = M, if t_kM < r_k;

with n = 1, 2, ..., (M - 1), and t_k1, t_k2, ..., t_kM are the M threshold values for the kth ring. Tapered quantization performed on ring-data histograms is used to find the M threshold values for each ring; different rings have different threshold values. We here define the weighted ring-data matrix that characterizes the features of a certain character i as

W_i = [ p_10^i  p_11^i  ...  p_1M^i
        ...
        p_L0^i  p_L1^i  ...  p_LM^i ].  (9)

The row vector [p_k0^i, p_k1^i, ..., p_kM^i] represents the distribution (weighting) of the kth quantized ring data of character i, and is defined as

p_kn^i = (number of samples of character i with q_k = n) / (total number of samples of character i),  (10)

where q_k is the kth quantized ring data of a sample of


character i, k = 1, 2, ..., L, n = 0, 1, ..., M, and

Σ_{n=0}^{M} p_kn^i = 1.  (11)

The distribution of ring data of each ring is considered in Eq. (10) to effectively include each ring's variation information for recognition.
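Building the quantized vectors and the weighted matrix of Eqs. (8)-(10) can be sketched as follows (a minimal Python illustration; the helper names are ours):

```python
def quantize(R, thresholds):
    """Quantize a ring-data vector to M+1 levels per Eq. (8): q_k is the
    number of thresholds t_k1 < ... < t_kM that r_k exceeds."""
    return [sum(1 for t in tk if rk > t) for rk, tk in zip(R, thresholds)]

def weighted_matrix(samples, thresholds, M):
    """Weighted ring-data matrix per Eqs. (9)-(10): entry p[k][n] is the
    fraction of the character's samples whose kth quantized ring datum is n;
    each row therefore sums to 1, as Eq. (11) requires."""
    L = len(thresholds)
    p = [[0.0] * (M + 1) for _ in range(L)]
    for R in samples:
        for k, q in enumerate(quantize(R, thresholds)):
            p[k][q] += 1.0 / len(samples)
    return p
```

For instance, with one ring, thresholds [1, 3] (M = 2), and four samples whose ring datum is 0, 2, 5 and 5, the row becomes [0.25, 0.25, 0.5].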

Suppose that an unknown input sample U is quantized to V = [v_1, v_2, ..., v_L], as in Eq. (8), and that the reference character i is represented by a weighted ring-data matrix W_i, as in Eq. (9); then the degree of similarity between V and W_i is defined by a fuzzy membership function

F_i(V) = (1/L) Σ_{k=1}^{L} g_i(v_k),  (12)

where

g_i(v_k) = Σ_{n=0}^{M} [p_kn^i f(|n - v_k|, β)],

f(d, β) = 1 - dβ, if dβ < 1; 0, otherwise,  (13)

and

β = 1/(int[M/10] + 1).

In a weighted ring-data matrix, each ring is represented by a weighting vector instead of a single value, so the similarity to each quantization level and its corresponding weight are both considered in Eq. (13). The function f(|n - v_k|, β) is a membership function that measures the degree of similarity between v_k and n. The parameter β controls how fast the membership value decreases as the distance |n - v_k| increases. In the definition, β is reciprocal to M; that is, the more quantization levels there are, the slower the membership value decreases.

We show that the function values g_i(v_k) and F_i(V) all lie in the range [0, 1]. By the definition of the membership function, we know that

0 <= f(|n - v_k|, beta) <= 1.

From Eqs. (11)-(13) we get

0 <= g_i(v_k) <= sum_{n=0}^{M} p^i_{kn} = 1,

and

0 <= F_i(V) <= 1.

Let

F_mm = Max_i F_i(V),    (14)

then the unknown input sample U is classified as character mm.
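The matrix-matching rule of Eqs. (12)-(14) can be sketched in Python as follows. The names are ours, and averaging the per-ring memberships g_i(v_k) to obtain F_i(V) is our reconstruction of Eq. (12), consistent with the bound 0 <= F_i(V) <= 1 derived above.

```python
import numpy as np

def membership(v, W, M):
    """F_i(V) of Eq. (12): average over rings of g_i(v_k), where each
    g_i(v_k) sums the weights W[k, n] scaled by the triangular
    similarity f(|n - v_k|, beta) of Eq. (13)."""
    beta = 1.0 / (int(M / 10) + 1)        # beta is reciprocal to M
    levels = np.arange(M + 1)
    g = [np.sum(W[k] * np.maximum(0.0, 1.0 - np.abs(levels - v[k]) * beta))
         for k in range(W.shape[0])]
    return float(np.mean(g))              # lies in [0, 1] by Eq. (11)

def classify(v, matrices, M):
    """Eq. (14): assign the input to the character whose weighted
    ring-data matrix yields the largest membership value."""
    return max(matrices, key=lambda c: membership(v, matrices[c], M))
```

Here `matrices` maps each character label to its L x (M+1) weighted ring-data matrix; only one matrix per character is matched, which is what makes FWMM faster than sample-by-sample matching.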

4.3. Combination of fuzzy ring-data vector and matrix

We can measure the similarity between the input sample and each reference sample using a combination of the membership values of the fuzzy ring-data vector and matrix, based on Eqs. (6) and (12). We hope that the combination can provide more information for recognition. Suppose that an unknown input sample U is quantized to V, and that the reference sample R_j belongs to character i. Then the combined membership function is defined as

G_j(V) = c_v F_j(V) + c_m F_i(V),    (15)

where 0 <= c_v, c_m <= 1 and c_v + c_m = 1; F_j(V) is the ring-data-vector membership of Eq. (6) and F_i(V) is the weighted-matrix membership of Eq. (12). The unknown sample U is classified as the character of the reference sample that possesses the maximum combined membership value. The combination here is linear; a nonlinear form is also possible.

4.4. Traditional statistical classifiers

To be an acceptable substitute for traditional classifiers, the fuzzy-set approach must be superior to them. In this study, two popular statistical classifiers [21] were implemented for comparison with the proposed fuzzy approach. The first is the nearest-neighbor rule [21]. When an unknown sample U is to be classified, the nearest neighbor of U is selected from among all reference samples, and its label is assigned to U. Let R_j = [r_{j1}, r_{j2}, ..., r_{jL}] be the jth reference sample of character i; then the distance between U and R_j is defined as

d(U, R_j) = sum_{k=1}^{L} [ (u_k - r_{jk}) / sigma^i_k ]^2,    (16)

where sigma^i_k is the standard deviation of the kth ring data over the feature vectors of character i. The classification rule normalizes the distance by the standard deviation to balance the features. If

d(U, R_mm) = Min_j d(U, R_j),

then the unknown sample U is assigned to character mm.

The second classifier is a weighted minimum-mean-distance rule [21]. It characterizes each cluster by the means and standard deviations of the elements of its training feature vectors. Let M^i = [m^i_1, m^i_2, ..., m^i_L] be the mean feature vector of character i. The weighted distance between an unknown sample U and character i is measured by

d(U, M^i) = sum_{k=1}^{L} [ (u_k - m^i_k) / sigma^i_k ]^2.    (17)

Again, weighting by the standard deviation balances the effect of all L features. The unknown sample is assigned to character mm, for which the distance is minimum; that is,

d(U, M^mm) = Min_i d(U, M^i).
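The two statistical baselines share the same weighted distance form; they can be sketched in Python as follows. The squared, standard-deviation-normalized distance follows our reconstruction of Eqs. (16)-(17), and the names are ours.

```python
import numpy as np

def weighted_distance(u, center, sigma):
    """Shared form of Eqs. (16) and (17): each feature difference is
    divided by that feature's standard deviation before squaring."""
    return float(np.sum(((u - center) / sigma) ** 2))

def nearest_neighbor(u, references):
    """NNC: references is a list of (label, sample, sigma) triples; the
    label of the closest reference sample is assigned to u."""
    return min(references, key=lambda r: weighted_distance(u, r[1], r[2]))[0]

def min_mean_distance(u, characters):
    """MMDC: characters maps label -> (mean_vector, sigma); the label
    with the smallest weighted distance to its mean wins."""
    return min(characters, key=lambda c: weighted_distance(u, *characters[c]))
```

The contrast between the two matches the discussion in Section 5.2: NNC scans every reference sample, while MMDC keeps a single mean vector per character, trading accuracy for speed.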

5. Experiments

Several experiments were carried out to demonstrate the performance of the proposed invariant features and fuzzy matching. The effects of different ring widths were examined. The performances of two traditional classifiers were compared with that of the proposed fuzzy matching. The effect of the proposed preclassification plus fuzzy matching was investigated. The results of the proposed invariant features and of moment invariants were also compared.

We implemented the proposed approach in the C language running on a 486DX33 PC. The thresholds theta_1 and theta_2 used in the corner-checking algorithm were set to 105 and 255 degrees, respectively.

The experimental character set consists of 200 Chinese characters. Each character has ten different samples, including regular and rotated ones, for training (i.e. the reference character set), and six samples for testing (i.e. the unknown character set). The regular character samples are allowed to be rotated within -15 to +15 degrees, and the rotated samples are allowed to be rotated arbitrarily. Some of the experimental characters used in this study are shown in Fig. 6. After normalization, each character sample was digitized into a 97 x 97 dot pattern.

Table 1
Recognition rates of FRVM for different ring widths, w (preclassification is not applied here)

w    RRE (%)    RRO (%)    URE (%)    URO (%)
1    100        100        78.5       80.9
2    100        100        83.0       85.4
3    100        100        84.4       86.5
4    100        100        84.0       80.2
5    100        100        80.6       81.2
6    100        100        80.6       84.4
8    100        100        71.2       77.1

Several experiments were carried out to evaluate the performance of the proposed approach. In each experiment, the reference character set was re-used to evaluate the discrimination power, and the unknown character set was used to evaluate the generality of the proposed approach. The regular and rotated samples in the reference and unknown character sets were tested, respectively. We use RRE and RRO to denote the sets of regular and rotated samples in the reference character set, and URE and URO to denote the sets of regular and rotated samples in the unknown character set. In addition, we use FRVM, FWMM, NNC and MMDC to denote the fuzzy ring-data-vector matching, the fuzzy weighted-ring-data-matrix matching, the nearest-neighbor classifier and the minimum-mean-distance classifier, respectively. Assume that N_t is the number of tested samples in an experiment; several terms are first defined as follows:

(i) Preclassification success rate = N_s/N_t, where N_s is the number of samples correctly preclassified.

(ii) Preclassification reduction rate = sum_{j=1}^{K} (s_j)^2 / N_t^2, where K is the number of clusters and s_j is the number of samples in cluster j.

(iii) Recognition rate = N_c/N_t, where N_c is the number of samples recognized correctly.
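For instance, the reduction rate of term (ii) can be computed from the cluster sizes alone; a small Python sketch under the assumption that an input falls into cluster j with probability s_j/N (the function name is ours):

```python
def reduction_rate(cluster_sizes):
    """Expected fraction of the reference set that must be matched
    after preclassification: sum_j (s_j)^2 / N^2, with N = sum_j s_j."""
    n = sum(cluster_sizes)
    return sum(s * s for s in cluster_sizes) / float(n * n)
```

With K equal-sized clusters this reduces to 1/K; the unequal cluster sizes of Table 4 push the rate above that floor.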

5.1. Effects of different ring widths

In this experiment, we focused our attention on the effects of different ring widths to determine a width that gives a good recognition rate. Preclassification was not applied here. The results were obtained using fuzzy matching on the ring-data vectors. Table 1 shows the experimental results. We found that a ring width of 3 gives the best recognition rate, so w = 3 was adopted for the subsequent experiments.

5.2. Performance of different classifiers

In this experiment, we compared the performances of different classifiers; preclassification was not applied here.

Table 2
Recognition rates of FWMM for different numbers of quantization levels, M (preclassification is not applied here)

M     RRE (%)    RRO (%)    URE (%)    URO (%)
3     73.3       69.8       55.6       59.4
6     82.3       71.4       65.3       62.2
9     87.2       83.0       62.8       62.8
15    86.8       82.3       66.0       63.5
20    88.2       83.7       68.4       65.3
30    88.2       83.3       65.3       64.9

In such a case, the effect of FRVM is similar to that of NNC, since all reference samples are matched. The effect of FWMM is similar to that of MMDC, since only one representative is used for each character; the representatives are the weighted ring-data matrix and the mean feature vector, respectively.

The experimental results of FWMM for different values of M (the number of quantization levels) are given in Table 2. We found that M = 20 gives the best recognition rate, so M = 20 was adopted for the subsequent experiments.

The experimental results of FRVM, NNC, FWMM and MMDC are given in Table 3. The recognition rate of FRVM is superior to that of NNC, and the recognition rate of FWMM is superior to that of MMDC. In addition, the performances of FRVM and NNC are clearly superior to those of FWMM and MMDC. This means that a single representative is not sufficient to represent a character with high shape variation. However, FRVM and NNC are time-consuming, since all reference samples must be matched. In the following experiment, preclassification is applied to reduce the matching time; the performance of FRVM plus FWMM is also evaluated.

5.3. Performance of preclassification plus fuzzy matching

Table 4 gives the results of the preclassification for a variety of values of the constant c (which controls the average size of the clusters), evaluating the success rate, the reduction rate and the number of preclassification clusters. The results vary with the constant. When c is equal to 0.5, the success rate is the highest and the reduction rate is good enough. Thus c = 0.5 was adopted in the following recognition experiments.

The experimental results of FRVM and FWMM are shown in Table 5. Comparing the results of Table 5 with those of Table 3, we find that the preclassification improves the recognition rates of FWMM, and yields similar recognition rates for FRVM. This means that some samples that are irrelevant or ambiguous under the ring-data measurement were separated out by the preclassification using the other invariant features. The experiment shows that the preclassification is a valuable process.

Table 3
Recognition rates of different classifiers (preclassification is not applied here)

Classifier    RRE (%)    RRO (%)    URE (%)    URO (%)
FRVM          100        100        84.4       86.5
NNC           100        100        81.2       81.9
FWMM          88.2       83.7       68.4       65.3
MMDC          77.1       77.8       66.3       68.8

Table 4
Performance evaluation of clustering and preclassification (c is used for controlling the average size of clusters)

      Success rate (%)                    Number of    Reduction
c     RRE     RRO     URE     URO         clusters     rate (%)
0.2   100     100     57.6    47.9        103          1.53
0.25  99.7    99.3    73.3    60.4        78           1.93
0.3   97.9    99.1    93.1    88.5        47           3.21
0.4   96.9    98.6    99.0    96.2        22           7.01
0.5   97.6    99.3    99.0    96.5        11           13.54
0.6   93.8    98.3    97.6    95.8        7            19.78
0.7   93.4    94.4    100     97.2        4            35.80

Table 5
Recognition rates of the proposed preclassification plus fuzzy matching

Classifier    RRE (%)    RRO (%)    URE (%)    URO (%)
FRVM          97.2       97.2       85.6       89.6
FWMM          93.6       91.6       79.6       77.3

The experimental results of the combined fuzzy matching are shown in Table 6. Not only the ring-data vector of a reference sample, but also its weighted ring-data matrix, were used in the matching. We hoped that the combination could provide more information for recognition. The results show that the performance is improved, and is best when c_v is near c_m. The improvement on the unknown character set is more obvious than that on the reference character set.

5.4. Comparison with moment invariants

Moment invariants are popular invariant features which have been extensively used in a wide range of

Table 6
Recognition rates of the proposed preclassification and the combined fuzzy matching

c_v    c_m    RRE (%)    RRO (%)    URE (%)    URO (%)
0.0    1.0    93.6       91.6       79.6       77.3
0.2    0.8    99.6       98.6       86.0       85.6
0.4    0.6    99.3       98.3       90.2       89.6
0.5    0.5    98.9       97.9       89.5       91.4
0.6    0.4    98.6       97.9       89.5       91.4
0.8    0.2    98.2       97.6       87.0       91.0
1.0    0.0    97.2       97.2       85.6       89.6


Table 7
Recognition rates of moment invariants (MI) and ring data (RD) using traditional classifiers (preclassification is not applied here)

             RRE (%)    RRO (%)    URE (%)    URO (%)
MI + NNC     100        100        40.6       46.2
MI + MMDC    31.6       34.4       30.2       27.1
RD + NNC     100        100        81.2       81.9
RD + MMDC    77.1       77.8       66.3       68.8

pattern recognition problems. Hu [34] derived a set of seven moment invariants which have the desirable property of being invariant under pattern location, scale and orientation. The experimental results of the moment invariants using traditional statistical classifiers are given in Table 7. The results of the proposed invariant features using the same classifiers are also given in the table for comparison. Likewise, preclassification was not applied here. From the table, we can see that the recognition rates of the proposed features are clearly superior to those of the moment invariants.

6. Conclusions

In this paper, we have proposed a simple and efficient recognition system for handwritten Chinese characters regardless of their location, size and orientation. Normalization was first used to normalize the scale and location of the characters. After thinning, three primitives (1-fork, corner and multi-fork points) were extracted using the proposed CPE and corner-checking algorithms. The corner-checking algorithm only finds corner candidates; the CPE algorithm merges the connected corner candidates to extract the actual corner points while extracting the 1-fork and multi-fork points. The proposed algorithms are more concise and efficient than previous algorithms. Two invariant features, the number of strokes and the number of multi-fork points, were derived from the primitives. Three other invariant features (the number of total black pixels, the number of connected components and the ring data) were extracted from the thinned characters. With the aid of a ring look-up table, the ring data were calculated rapidly.

To reduce the matching time, preclassification was employed so that only the reference character samples or characters in a specific cluster were matched with the input sample. Fuzzy membership functions were defined based on the ring-data vector and the weighted ring-data matrix to match the input character samples or characters. After being matched with all reference character samples or characters in the assigned cluster, the input sample was classified as the character with the highest membership value.

The performances of the proposed invariant features and fuzzy matching were verified through extensive experiments:

(i) the performance of a ring width of 3 is better than that of the other widths;

(ii) the performance of the proposed fuzzy matching is superior to that of the traditional statistical classifiers;

(iii) the performance of the fuzzy ring-data vector is clearly superior to that of the fuzzy ring-data matrix, but the latter needs less matching time, and the matching time saved depends entirely on the numbers of samples per character;

(iv) the preclassification reduces the fuzzy matching time and improves the recognition rate;

(v) the combined fuzzy matching further improves the recognition rate; and

(vi) the performance of the proposed invariant features is clearly superior to that of moment invariants.

Recognizing handwritten Chinese characters is very difficult because of the high complexity and variability of such characters. When the requirement of rotation-invariant recognition is added, the problem becomes even more difficult. At the current stage, we focus our attention on the problem of rotation invariance, so the variability of the experimental character set is limited. When the variability is large, a nonlinear normalization [35] may be considered to obtain better preprocessing results. The effects of different normalization methods have been analyzed in detail in our earlier paper [28]. Several related comparisons of the ring-data feature were also presented there [28]. On the whole, based on the experimental results, we can conclude that the proposed approach offers a simple solution, in a limited domain, to the complex problem of invariantly recognizing handwritten Chinese characters.

In a number of Chinese land-record maps, the strokes of the written characters are entangled and overlapping. Special preprocessing techniques, invariant features and recognition techniques are needed to deal with these problems. In the future, we will study other kinds of invariant features as well as invariant recognition techniques to deal with such entangled and stroke-overlapped characters. On the other hand, parallelization of the matching and clustering computations can speed up the recognition. Neural-network-based recognition systems [3-6,18-23] provide several advantages over traditional methods, such as parallel processing, massive computation, fault tolerance and robustness. The recognition results based on the fuzzy min-max neural network [26] are presented in our earlier paper [28].

References

[1] T.H. Hildebrand and W. Liu, Optical recognition of handwritten Chinese characters: Advances since 1980, Pattern Recognition, 26 (2) (1993) 205-225.
[2] J.T. Tou and R.C. Gonzalez, Pattern Recognition Principles, Addison-Wesley, Reading, MA, 1974.
[3] G.N. Bebis and G.M. Papadourakis, Object recognition using invariant object boundary representations and neural network models, Pattern Recognition, 25 (1) (1992) 25-44.
[4] L. Spirkovska and M.B. Reid, Robust position, scale, and rotation invariant object recognition using higher-order neural networks, Pattern Recognition, 25 (9) (1992) 975-985.
[5] N.M. Nasrabadi and W. Li, Object recognition by a Hopfield neural network, IEEE Trans. Syst., Man, Cybern., 21 (6) (1991) 1523-1535.
[6] M. Fukumi, S. Omatu, F. Takeda and T. Kosaka, Rotation-invariant neural pattern recognition system with application to coin recognition, IEEE Trans. Neural Networks, 3 (2) (1992) 272-279.
[7] T.A. Mai and C.Y. Suen, A generalized knowledge-based system for the recognition of unconstrained handwritten numerals, IEEE Trans. Syst., Man, Cybern., 20 (4) (1990) 835-848.
[8] B. Hussain and M.R. Kabuka, A novel feature recognition neural network and its application to character recognition, IEEE Trans. Pattern Anal. Machine Intell., 16 (1) (1994) 98-106.
[9] Y. LeCun, B. Boser and L.D. Jackel, Handwritten digit recognition: application of neural network chips and automatic learning, IEEE Commun. (November 1989) 41-46.
[10] K.T. Blackwell, T.P. Vogl, S.D. Hyman, G.S. Barbour and D.L. Alkon, A new approach to handwritten character recognition, Pattern Recognition, 25 (6) (1992) 655-666.
[11] C.H. Leung, Y.Y. Cheung and Y.L. Wong, A knowledge-based stroke-matching method for Chinese character recognition, IEEE Trans. Syst., Man, Cybern., 17 (6) (1987) 993-1003.
[12] X. Huang, J. Gu and Y. Wu, A constrained approach to multifont Chinese character recognition, IEEE Trans. Pattern Anal. Machine Intell., 15 (8) (1993) 838-843.
[13] L.-H. Chen and J.-R. Lieh, Handwritten character recognition using a 2-layer random graph model by relaxation matching, Pattern Recognition, 23 (11) (1990) 1189-1205.
[14] B. Chen and H.-J. Lee, Recognition of handwritten Chinese characters via short line segments, Proc. Int. Computer Symposium, Hsinchu, Taiwan, December 17-19, 1990, pp. 117-122.
[15] M. Yasuda and H. Fujisawa, An improved correlation method for character recognition, Systems, Computers and Controls, 10 (2) (1979) 29-38.
[16] X. Ying and S. Chengjian, Recognition of restricted handwritten Chinese characters by structure similarity method, Pattern Recognition Lett., 11 (1990) 67-73.
[17] C.-W. Liao and J.-S. Huang, A transformation invariant matching algorithm for handwritten Chinese character recognition, Pattern Recognition, 23 (11) (1990) 1167-1188.
[18] D.E. Rumelhart, G.E. Hinton and R.J. Williams, Learning internal representations by error propagation, in Parallel Distributed Processing, MIT Press, Cambridge, MA, 1986, pp. 318-362.
[19] S.J. Perantonis and P.J.G. Lisboa, Translation, rotation, and scale invariant pattern recognition by high-order neural networks and moment classifiers, IEEE Trans. Neural Networks, 3 (2) (1992) 241-251.
[20] K. Fukushima, Character recognition with neural networks, Neurocomputing, 4 (1992) 221-233.
[21] A. Khotanzad and J.H. Lu, Classification of invariant image representations using a neural network, IEEE Trans. ASSP, 38 (6) (1990) 1028-1038.
[22] C. Yuceer and K. Oflazer, A rotation, scaling, and translation invariant pattern classification system, Pattern Recognition, 26 (5) (1993) 687-710.
[23] E. Barnard and D. Casasent, Invariance and neural nets, IEEE Trans. Neural Networks, 2 (5) (1991) 498-508.
[24] B. Widrow and R. Winter, Neural nets for adaptive filtering and adaptive pattern recognition, IEEE Computer (March 1988) 25-39.
[25] J.I. Minnix, E.S. McVey and R.M. Inigo, A multilayered self-organizing artificial neural network for invariant recognition, IEEE Trans. Knowledge and Data Engineering, 4 (2) (1992) 162-167.
[26] P.K. Simpson, Fuzzy min-max neural networks - Part 1: Classification, IEEE Trans. Neural Networks, 3 (5) (1992) 776-786.
[27] P.K. Simpson, Fuzzy min-max neural networks - Part 2: Clustering, IEEE Trans. Fuzzy Systems, 1 (1) (1993) 32-45.
[28] H.-P. Chiu and D.-C. Tseng, Invariant handwritten Chinese character recognition using fuzzy min-max neural networks, Pattern Recognition (revised).
[29] Y.S. Chen and W.H. Hsu, A modified fast parallel algorithm for thinning digital patterns, Pattern Recognition Lett., 7 (1988) 99-106.
[30] C.W. Liao and J.S. Huang, Stroke segmentation by Bernstein-Bezier curve fitting, Pattern Recognition, 23 (5) (1990) 475-484.
[31] H. Freeman and L.S. Davis, A corner-finding algorithm for chain-coded curves, IEEE Trans. Computers, 26 (March 1977) 297-303.
[32] F.H. Cheng and W.-H. Hsu, Parallel algorithm for corner finding on digital curves, Pattern Recognition Lett., 8 (July 1988) 47-53.
[33] K. Ueda and Y. Nakamura, Automatic verification of seal-impression pattern, Proc. 9th Int. Conf. on Pattern Recognition, Vol. 2, 1988, pp. 1019-1021.
[34] M. Hu, Visual pattern recognition by moment invariants, IRE Trans. Inform. Theory, 8 (February 1962) 179-187.
[35] H. Yamada, K. Yamamoto and T. Saito, A nonlinear normalization method for handprinted Kanji character recognition - line density equalization, Pattern Recognition, 23 (9) (1990) 1023-1029.