multi-stroke relaxation matching method for handwritten chinese character recognition

Pergamon Pattern Recognition, Vol. 31, No. 4, pp. 401-410, 1998

© 1998 Pattern Recognition Society. Published by Elsevier Science Ltd. Printed in Great Britain. All rights reserved

0031-3203/98 $19.00 + .00

PII: S0031-3203(97) 00053-8

MULTI-STROKE RELAXATION MATCHING METHOD FOR HANDWRITTEN CHINESE CHARACTER RECOGNITION

FANG-HSUAN CHENG

Department of Computer Science, Chung-Hua Polytechnic Institute, 30 Tung Shiang, Hsinchu, Taiwan 300, R.O.C

(Received 4 June 1996; in revised form 14 May 1997)

Abstract--Since Chinese characters can be represented by a set of basic line segments called sub-strokes, sub-strokes are often used as features to recognize handwritten Chinese characters. However, the number of sub-strokes extracted for a given character varies because of handwriting variations and variations in the stroke extraction process. These differences in the representation of the same character lower the recognition rate. A preliminary method to solve this problem is to merge several sub-strokes into a complete stroke. However, it is difficult to merge several sub-strokes into a correct and complete stroke because the Chinese character is not known in advance. In this paper, we propose a multi-stroke relaxation matching method to solve this problem. The proposed matching method can be divided into two parts: the multi-stroke relaxation process and the multi-stroke select-match-pair process. The multi-stroke relaxation process determines the optimal matching relations from the probability of each possible matching pair of sub-strokes and, by incorporating the merging steps into the relaxation process, allows one or more sub-strokes to match with one or more other sub-strokes. The multi-stroke select-match-pair process is used to determine the stroke matching relation between the input and reference characters.

Experiments are conducted to show the feasibility and correctness of the proposed algorithm. The experimental results show that the proposed algorithm can solve the matching problem of different numbers of sub-strokes caused by handwriting variations and the stroke extraction process. For 2000 daily-used Chinese characters, the actual recognition rate is 93.8% and the cumulative recognition rate of the first five candidates is 98.9%.

Sub-stroke; Multi-stroke relaxation process; Multi-stroke select-match-pair process; Affine transformation; Pairwise error; Cumulative recognition rate

1. INTRODUCTION

The demand for computer products to deal with documentation and data processing is constantly growing. Because of this trend, quick and efficient input techniques for Chinese characters are becoming increasingly important. Traditional methods such as keyboard input, radical component input, and encoding input are inconvenient and inefficient, and researchers in optical character recognition (OCR) (1,2) are now seeking to provide alternative input techniques. Currently, OCR systems have been developed for the recognition of characters including handwritten numerals, the English alphabet, and Japanese characters.

The approaches of OCR research can be classified into four categories: syntactic, statistical, structural and neural network. The syntactic approach (3) uses the extracted primitives to recognize characters via grammar parsing. However, this method suffers from unstable primitives, which severely degrade the performance. The statistical approach (4) applies a clustering algorithm to recognize characters. How to derive the probability distribution function of each character set is a challenging task in this approach. The structural approach (5) extracts structural features in a heuristic sense to recognize characters. In general, there is no mathematical principle in this approach. The problem is how to choose features and the relationships between them so that the description identifies each character clearly. The neural approach (6,7) adopts a neural network to learn and classify the characters for recognition. The performance of this approach depends strongly on the learning database. Because all the above approaches have their own advantages and disadvantages, it is recommended to combine several approaches when designing a practical character recognition system, especially a handwritten Chinese character recognition system.

A typical OCR system based on the structural approach consists of four parts: preprocessing; feature extraction; matching and recognition; and contextual post-processing. Preprocessing can be divided into three parts: segmentation, normalization (8,9) and thinning. (10-12) The methods of feature extraction can be divided into four categories: (13) orthogonal expansion, (14-16) stroke distribution, (17,18) background feature distribution, (19,20) and stroke analysis. (21-23) The matching methods include the correlation method, (17) shift similarity, (24) complex similarity, (25) the linear programming method, (26) the Hough transform, (27) and the relaxation method. (28,29) The accuracy of the overall system can be improved by adding contextual post-processing after the recognition of individual characters has been carried out. Various methods (30,31) have been discussed on this topic.

The recognition of handwritten Chinese characters is more difficult than that of other characters for the following reasons:

(1) The number of Chinese characters is larger than that of other character sets.

(2) The structure of Chinese characters is more complicated than that of others.

(3) There are many Chinese characters with similar appearances but different meanings.

(4) Different handwriting styles cause significant variance in the representation of a character.

The pre-classification method can reduce the difficulties associated with the large number of characters by classifying the set of Chinese characters into several small sets for recognition. (32) The complexity of a Chinese character can be simplified by dividing it into several components. (33) For the third problem, structural information can be used to separate the similar characters. Solution of the last problem requires the use of a good matching method.

A good matching method must work well under the following conditions:

1. Different numbers of features. 2. Translation. 3. Rotation. 4. Scaling. 5. Location variation.

The matching process of OCR systems must solve the above five problems. These problems are important in pattern recognition. Basically, research dealing with these problems can be divided into two categories: the extraction of invariant features such as moments (34) for matching, and the use of relational features such as angle or length (35,36) for matching.

We have previously completed research that partly solves the above problems of handwritten Chinese character recognition. (29) Initially, in 1993, our research used information about the neighborhood relationship among sub-strokes to recognize handwritten Chinese characters. (29) Later, in 1995, we developed a planar pattern matching method based on minimum feature relations to recognize handwritten Chinese characters. (37) This method applied the affine transformation to deal with the problems of translation, rotation and scale change. Both of these studies used sub-strokes as features and the relaxation method for matching. However, both belong to the category of one-to-one matching: one sub-stroke can match with only one other sub-stroke.

Since a Chinese character can be represented by a set of basic line segments called sub-strokes, sub-strokes are often used as features to recognize handwritten Chinese characters. Traditional methods merge several sub-strokes into a complete stroke before matching. If the sub-stroke merging process is successful, then the typical one-to-one matching method is sufficient for recognition. Traditional stroke merging methods use the orientation and location of strokes to judge whether several sub-strokes can be merged into one complete stroke. However, it is difficult to merge several sub-strokes into a correct and complete stroke before the matching step when the Chinese character is not known in advance. For example, the sub-strokes of the Chinese character in Fig. 1 obtained by the stroke extraction process are difficult to merge into correct and complete strokes representing the character.

The goal of this paper is to propose a multi-stroke relaxation matching method that recognizes handwritten Chinese characters without requiring the sub-strokes to be merged beforehand. The sub-stroke merging step is carried out during the relaxation process. Figure 2 shows that different writing styles produce different numbers of sub-strokes. For example, sub-strokes #1, #2 and #3 in Fig. 2(b) should match with sub-stroke #1 in Fig. 2(a). However, in our previous matching methods, (29) only one of the sub-strokes #1, #2, #3 in Fig. 2(b) would be chosen to match with sub-stroke #1 in Fig. 2(a). The proposed multi-stroke relaxation matching method allows one or more sub-strokes in the input character to match with one or more sub-strokes in the reference character. This means that sub-strokes #1, #2 and #3 in Fig. 2(b) will be merged into one stroke in the relaxation process and will match with sub-stroke #1 in Fig. 2(a).

Fig. 1. The same Chinese character but with different writing styles, (a) and (b).

Fig. 2. The same Chinese character but with different writing styles, (a) and (b), with numbered sub-strokes.

The rest of this paper is divided into four sections. Section 2 shows the process flow of the OCR system; Section 3 describes the proposed algorithm in detail; Section 4 lists the experimental results; and Section 5 gives our discussions and conclusions.

2. THE OCR SYSTEM

The process flow of the proposed OCR system is depicted in Fig. 3. Since a handwritten Chinese character can be represented by a set of sub-strokes, the character is first processed by a thinning and stroke extraction process to obtain a set of sub-strokes before entering the matching process. The major difference between the proposed and traditional OCR systems is that the stroke merging process is incorporated into the stroke matching process.

Figure 4(a) is an original handwritten Chinese character, Fig. 4(b) is the result of the thinning process, and Fig. 4(c) is the result after the feature extraction process. After preprocessing, the Chinese character can be represented by a set of sub-strokes. This set of sub-strokes is used as the recognition features during the matching process.

[Flowchart: Handwritten Chinese character -> Thinning Process -> Stroke Extraction Process -> Stroke Merging & Matching (with Prototype Characters) -> Output]

Fig. 3. Process flow of the proposed OCR system.

Fig. 4. Example of the preprocessing process. (a) is the original character; (b) is the result after thinning; (c) is the result after stroke extraction.

3. THE MULTI-STROKE RELAXATION MATCHING METHOD

In this section, we introduce the multi-stroke relaxation matching method, which uses the sub-strokes as features. The sub-strokes are not merged into complete strokes before matching. Instead, the merging step is incorporated into the relaxation process. The proposed method uses the concept of minimum feature relations to deal with the problems of translation, rotation and scale change. It tries to solve the merging problems during the relaxation process by multi-stroke matching.

The proposed matching method can be divided into two phases: the multi-stroke relaxation process and the multi-stroke select-match-pair process. The first phase uses a relaxation algorithm to compute the pair probability of two sub-strokes in order to decide the probability of each matching pair. During the relaxation process, the merging of these sub-strokes is decided. The second phase chooses the optimal matching pairs.

3.1. Multi-stroke relaxation process

The relaxation matching process will be discussed in this section. After the preprocessing process, the sub-strokes of the handwritten Chinese character will be extracted. For each sub-stroke, we will obtain the coordinates of its two extreme points, and a list of neighbor sub-strokes of these two extreme points. The two extreme points of a sub-stroke are surrounded by several neighbor sub-strokes as depicted in Fig. 5, where (x1, y1) and (x2, y2) are the coordinates of the two extreme points of a sub-stroke l. Before going into the details of the proposed algorithm, the following definitions are made.

Definition 3.1: sub-stroke of a character. A sub-stroke of a character is defined as a straight-line segment obtained from the thinned image of a character. The input character Li can be represented by a set of sub-strokes {lip}, that is Li = {lip}, p = 1, 2, ..., m, where m is the number of sub-strokes of the input character. Each sub-stroke lip can be represented by lip = {(xip1, yip1), (xip2, yip2), {nip1, ..., niph}}, where (xip1, yip1) and (xip2, yip2) are the coordinates of the two extreme points of sub-stroke lip, {nip1, ..., niph} is the set of neighbor sub-strokes of lip, and h is the number of neighbor sub-strokes. The reference character Lr can be represented by a set of sub-strokes {lrq}, that is Lr = {lrq}, q = 1, 2, ..., n, where n is the number of sub-strokes of the reference character. Each sub-stroke lrq can be represented by lrq = {(xrq1, yrq1), (xrq2, yrq2), {nrq1, ..., nrqk}}, where (xrq1, yrq1) and (xrq2, yrq2) are the coordinates of the two extreme points of sub-stroke lrq, {nrq1, ..., nrqk} is the set of neighbor sub-strokes of lrq, and k is the number of neighbor sub-strokes.

Definition 3.2: similar affine transformation. For a point p = (x, y), the similar affine transformation can be expressed by

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} r_1 \\ r_2 \end{bmatrix} + \begin{bmatrix} r_3 & r_4 \\ -r_4 & r_3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (1)$$

denoted as p' = R(p), where r1 and r2 are the translation and r3 and r4 are the combination of rotation and scale change. The similar affine transformation of a sub-stroke lip can be represented by li'p = R(lip).

From the viewpoint of our previous work, (39) any two matching point pairs can identify a similar affine transformation. Different matching point pairs may give rise to different affine transformations, and we cannot be sure that the identified transformation is the true one for all the other matching point pairs, so additional matching point pairs are required to verify it. In other words, we need at least three matching point pairs to compute the true parameters of the affine transformation. If more than two matching point pairs are used to calculate r1, r2, r3, r4 of the transformation, the result is an MSE solution. The computation of this MSE solution can be found in our previous research. (40) For example, if we want to match a triangle with its mirror image (three point pairs), there is no affine transformation of this form that makes them match exactly. However, an affine transformation can still be computed from the three matching point pairs of the triangle in the MSE sense. The smaller the MSE, the more correct the matching. If only two point pairs are used to compute the affine transformation, we can always find an affine transformation with zero MSE for those two point pairs. Applying this affine transformation, the two point pairs are matched exactly, and we would conclude that the triangle is correctly matched with its mirror image. This conclusion is wrong. Therefore, at least three possible matching point pairs are needed to identify a correct similar affine transformation R, and two possible matching line pairs can also identify a correct similar affine transformation R. If sub-stroke lip and its neighbor sub-stroke nip are matched with sub-stroke lrq and its neighbor sub-stroke nrq, then a correct similar affine transformation R can be identified. The word "correct" means the true or actual affine transformation in the least-MSE sense.
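The triangle example can be checked numerically. The following is a minimal sketch, not the computation from the cited previous research: it assumes the parameterization x' = r1 + r3·x + r4·y, y' = r2 − r4·x + r3·y (the sign convention for r4 is an assumption), and the function names `fit_similarity`, `apply_similarity` and `mse` are hypothetical.

```python
def fit_similarity(src, dst):
    """Least-squares estimate of (r1, r2, r3, r4) mapping src points to dst points.

    Assumed model: x' = r1 + r3*x + r4*y,  y' = r2 - r4*x + r3*y.
    """
    n = len(src)
    mx = sum(x for x, _ in src) / n
    my = sum(y for _, y in src) / n
    mu = sum(u for u, _ in dst) / n
    mv = sum(v for _, v in dst) / n
    s = a = b = 0.0
    for (x, y), (u, v) in zip(src, dst):
        xc, yc, uc, vc = x - mx, y - my, u - mu, v - mv
        s += xc * xc + yc * yc          # spread of the source points
        a += xc * uc + yc * vc          # numerator for r3
        b += yc * uc - xc * vc          # numerator for r4
    r3, r4 = a / s, b / s
    r1 = mu - r3 * mx - r4 * my
    r2 = mv + r4 * mx - r3 * my
    return r1, r2, r3, r4


def apply_similarity(params, point):
    """Apply p' = R(p) under the assumed parameterization."""
    r1, r2, r3, r4 = params
    x, y = point
    return (r1 + r3 * x + r4 * y, r2 - r4 * x + r3 * y)


def mse(params, src, dst):
    """Mean squared residual of the transformed src points against dst."""
    total = 0.0
    for p, (u, v) in zip(src, dst):
        px, py = apply_similarity(params, p)
        total += (u - px) ** 2 + (v - py) ** 2
    return total / len(src)


triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
mirror = [(0.0, 0.0), (-2.0, 0.0), (0.0, 1.0)]   # reflection about the y-axis

two = fit_similarity(triangle[:2], mirror[:2])
three = fit_similarity(triangle, mirror)
mse_two = mse(two, triangle[:2], mirror[:2])      # zero: two pairs always fit
mse_three = mse(three, triangle, mirror)          # clearly non-zero
```

With only two point pairs the residual vanishes even though the shapes are mirror images, while the third pair exposes the mismatch, which is the argument made in the text.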

Definition 3.3: pairwise error. Suppose sub-stroke lip and its neighbor sub-stroke nip of an input character are matched with sub-stroke lrq and its neighbor sub-stroke nrq of a reference character. Let lip = ((xip1, yip1), (xip2, yip2)), nip = ((xip3, yip3), (xip4, yip4)), lrq = ((xrq1, yrq1), (xrq2, yrq2)) and nrq = ((xrq3, yrq3), (xrq4, yrq4)), then the pairwise error ε between sub-stroke lip and lrq is defined as

$$\varepsilon = \frac{\sum_{k=1}^{4} \left[ \left( xr_{qk} - R(xi_{pk}) \right)^2 + \left( yr_{qk} - R(yi_{pk}) \right)^2 \right]}{\| lr_q \|} \qquad (2)$$

The pairwise error is normalized by the length of the sub-stroke of the reference character. By using the similar affine transformation, each lip matching with any lrq will be converted to li'p. The translation, rotation, and scaling problems can be solved at this step.

Fig. 5. Demonstration of the neighbors of a sub-stroke.
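As an illustration of Definition 3.3, the sketch below evaluates the pairwise error for four endpoint pairs, given already-estimated transformation parameters. It assumes the same parameterization x' = r1 + r3·x + r4·y, y' = r2 − r4·x + r3·y; the names `transform` and `pairwise_error` are hypothetical, and the first two reference points are taken to be the endpoints of lrq.

```python
import math


def transform(params, point):
    """Apply the assumed similar affine transformation R to one point."""
    r1, r2, r3, r4 = params
    x, y = point
    return (r1 + r3 * x + r4 * y, r2 - r4 * x + r3 * y)


def pairwise_error(params, input_points, ref_points):
    """Pairwise error of eq. (2).

    input_points: the four endpoints of lip and nip, in that order.
    ref_points:   the four endpoints of lrq and nrq, in that order.
    The squared residuals are summed and divided by the length of lrq.
    """
    squared = 0.0
    for p, (xr, yr) in zip(input_points, ref_points):
        xt, yt = transform(params, p)
        squared += (xr - xt) ** 2 + (yr - yt) ** 2
    (x1, y1), (x2, y2) = ref_points[0], ref_points[1]
    return squared / math.hypot(x2 - x1, y2 - y1)


identity = (0.0, 0.0, 1.0, 0.0)
ref = [(0.0, 0.0), (4.0, 0.0), (4.0, 0.0), (4.0, 3.0)]
same = list(ref)
shifted = [(0.0, 0.0), (4.0, 0.0), (4.0, 0.0), (4.0, 4.0)]  # last endpoint off by 1

e_same = pairwise_error(identity, same, ref)        # perfect match
e_shifted = pairwise_error(identity, shifted, ref)  # 1 squared unit / length 4
```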

3.1.1. Relaxation matching algorithm. The matching methodology of the relaxation process is described as follows.

(1) Initial probability: Let S^(r)(lip, lrq) denote the matching probability between lip and lrq at the rth iteration. At the beginning, all the elements of the probability matrix are initialized to 1, that is, S^(0)(lip, lrq) = 1 for all p and q. This means that each sub-stroke of the input character has the greatest probability of matching with any sub-stroke of the reference character at the beginning of the process.

(2) Compatibility function: In order to construct the matching probability between sub-strokes, a proper compatibility function C(lip, lrq) between input sub-stroke lip and reference sub-stroke lrq is defined as

$$C(li_p, lr_q) = \frac{1}{1 + \alpha\varepsilon + \beta\omega} \qquad (3)$$

where ε is the pairwise error after affine transformation defined in eq. (2) and α is its weighting factor; ω = hμ + kν is a measure of the extent of deviation of positions and β is its weighting factor, with

$$\mu = \frac{\max(d_1, d_2)}{\| li'_p \|}, \qquad \nu = \begin{cases} 0 & \text{if } ML \le \| lr_q \|, \\[4pt] \dfrac{ML - \| lr_q \|}{\| lr_q \|} & \text{if } ML > \| lr_q \|, \end{cases}$$

$$ML = \max(ld_1, ld_2, ld_3, ld_4).$$

d1 and d2 are the distances from the two extreme points of the transformed input sub-stroke li'p (the result of applying the affine transformation to lip) to the reference sub-stroke lrq, as depicted in Fig. 6. P and Q are the projection points of the two extreme points of li'p onto lrq. ld1 and ld2 are the distances from the two extreme points of lrq to P; ld3 and ld4 are the distances from the two extreme points of lrq to Q. μ is a measure of the orientation difference between the transformed input sub-stroke li'p and the reference sub-stroke lrq, and ν is a measure of the overlap between li'p and lrq. The value of the compatibility function C (C ∈ [0, 1]) is large when both μ and ν are small, which means that there is a good match between li'p and lrq. In our previous work, (37) the compatibility function was defined as C = 1/(1 + kε). In the present approach, we take the situation of Fig. 2 into consideration. In order to match sub-stroke #1 in Fig. 2(a) with sub-strokes #1, #2 and #3 in Fig. 2(b) at the same time, the matching probability between sub-stroke #1 in Fig. 2(a) and each of sub-strokes #1, #2 and #3 in Fig. 2(b) must be large.

Fig. 6. Definition of matching between two sub-strokes.

Figure 7 demonstrates the different situations of matching between two sub-strokes. In the case of Fig. 7(a), the orientation difference between li'p and lrq is zero, so μ is zero. This means that the transformed sub-stroke li'p is exactly matched with sub-stroke lrq in terms of orientation. The orientation difference in Fig. 7(b) is smaller than that in Fig. 7(c), so μ in Fig. 7(b) is smaller than that in Fig. 7(c). This means that the extent of matching in Fig. 7(b) is better than that in Fig. 7(c). In the case of Fig. 7(d), li'p is fully contained in lrq, so ν is 0. This means that the transformed sub-stroke li'p is exactly matched with sub-stroke lrq in terms of overlap. li'p is partially contained in lrq in Fig. 7(e) and is not contained in lrq in Fig. 7(f), so ν in Fig. 7(e) is smaller than that in Fig. 7(f). This means that the extent of matching in Fig. 7(e) is better than that in Fig. 7(f).
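The compatibility function of eq. (3) can be sketched as follows. This is an illustrative reading, not the author's implementation: d1 and d2 are taken as the distances from the endpoints of li'p to their projections P and Q on the line through lrq, the weights alpha, beta, h and k default to 1.0 (values chosen only for illustration), and the names `project_onto_line`, `dist` and `compatibility` are hypothetical.

```python
import math


def dist(p, q):
    """Euclidean distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def project_onto_line(point, a, b):
    """Orthogonal projection of point onto the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((point[0] - a[0]) * dx + (point[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return (a[0] + t * dx, a[1] + t * dy)


def compatibility(li_t, lrq, eps, alpha=1.0, beta=1.0, h=1.0, k=1.0):
    """C = 1 / (1 + alpha*eps + beta*omega), omega = h*mu + k*nu.

    li_t: transformed input sub-stroke li'p as a pair of endpoints.
    lrq:  reference sub-stroke as a pair of endpoints.
    eps:  pairwise error from eq. (2), computed elsewhere.
    """
    p = project_onto_line(li_t[0], lrq[0], lrq[1])
    q = project_onto_line(li_t[1], lrq[0], lrq[1])
    d1, d2 = dist(li_t[0], p), dist(li_t[1], q)
    mu = max(d1, d2) / dist(li_t[0], li_t[1])        # orientation deviation
    len_r = dist(lrq[0], lrq[1])
    ml = max(dist(lrq[0], p), dist(lrq[1], p),
             dist(lrq[0], q), dist(lrq[1], q))
    nu = 0.0 if ml <= len_r else (ml - len_r) / len_r  # overlap deviation
    return 1.0 / (1.0 + alpha * eps + beta * (h * mu + k * nu))


ref = ((0.0, 0.0), (10.0, 0.0))
c_contained = compatibility(((2.0, 0.0), (8.0, 0.0)), ref, eps=0.0)   # Fig. 7(d)-like
c_overhang = compatibility(((5.0, 0.0), (15.0, 0.0)), ref, eps=0.0)   # Fig. 7(e)-like
c_tilted = compatibility(((2.0, 0.0), (8.0, 3.0)), ref, eps=0.0)      # Fig. 7(b)-like
```

A short input sub-stroke lying inside a longer reference sub-stroke keeps ν = 0, which is what lets several input sub-strokes all score highly against the same reference sub-stroke.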

(3) Iteration scheme. The iteration process is defined as follows:

$$S^{(r)}(li_p, lr_q) = \frac{1}{h} \sum_{i=1}^{h} \max_{j = 1, \dots, k} \left[ S^{(r-1)}(ni_{pi}, nr_{qj}) \, C(ni_{pi}, nr_{qj}) \right]$$

where nipi is one of the neighbor sub-strokes of lip and h is the number of neighbor sub-strokes, nrqj is one of the neighbor sub-strokes of lrq and k is the number of neighbor sub-strokes, r is the iteration number.

After each relaxation iteration, the merging step will be processed. The merging algorithm is described in the next section.
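One relaxation iteration can be sketched as below. The data layout is an assumption made for illustration: `S` and `C` are m x n lists of lists indexed by sub-stroke number, and `nbr_in[p]` and `nbr_ref[q]` list the neighbor indices of lip and lrq. Pairs in which either sub-stroke has no neighbors receive zero support, following the treatment of R2 in the Fig. 8 example; the function name `relax_step` is hypothetical.

```python
def relax_step(S, C, nbr_in, nbr_ref):
    """Compute S^(r) from S^(r-1) using the neighbor-support rule.

    For each pair (p, q), every neighbor i of lip contributes the best
    product S[i][j] * C[i][j] over the neighbors j of lrq; the
    contributions are averaged over the h neighbors of lip.
    """
    m, n = len(S), len(S[0])
    new_S = [[0.0] * n for _ in range(m)]
    for p in range(m):
        for q in range(n):
            if not nbr_in[p] or not nbr_ref[q]:
                continue  # no neighbor support: probability stays 0
            total = 0.0
            for i in nbr_in[p]:
                total += max(S[i][j] * C[i][j] for j in nbr_ref[q])
            new_S[p][q] = total / len(nbr_in[p])
    return new_S


# Two input and three reference sub-strokes; reference stroke 2 is isolated.
C = [[0.9, 0.2, 0.5],
     [0.3, 0.8, 0.5]]
S0 = [[1.0] * 3 for _ in range(2)]     # initial probabilities are all 1
nbr_in = [[1], [0]]
nbr_ref = [[1], [0], []]
S1 = relax_step(S0, C, nbr_in, nbr_ref)
```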

Fig. 7. Demonstration of different situations of matching between two sub-strokes, panels (a) to (f).

3.1.2. Merging algorithm

Given: The probability matrix S^(r)(lip, lrq), p = 1, ..., m and q = 1, ..., n, and the merging pool M, where the sub-strokes in M are the sub-strokes that have the possibility of being merged.

Goal: To merge the sub-strokes.

Step 1: After one relaxation iteration:

for all S in the probability matrix, if S > TH then put mp = (lip, lrq) into M. M = {mpi | i = 1, ..., t}, where t is the number of matching pairs whose matching probability exceeds the threshold.

Step 2: For each pair of elements mpm, mpn in M, let mpm = (lip1, lrq1) and mpn = (lip2, lrq2).

if lip1 = lip2 and lrq1, lrq2 are connected then merge lrq1, lrq2 into lr'q and do the relaxation iteration again to get Sm = S^(r)(lip1, lr'q).

if Sm > S^(r)(lip1, lrq1) and Sm > S^(r)(lip2, lrq2) then lrq1 and lrq2 should be merged, else lrq1 and lrq2 should not be merged.

Step 3: If lrq1 = lrq2 and lip1, lip2 are connected then merge lip1, lip2 into li'p and do the relaxation iteration again to get Sm = S^(r)(li'p, lrq1).

if Sm > S^(r)(lip1, lrq1) and Sm > S^(r)(lip2, lrq2) then lip1 and lip2 should be merged, else lip1 and lip2 should not be merged.

Step 4: Continue with another relaxation iteration and repeat all the above steps.

The threshold in the above algorithm is used to control the number of merging candidates. If the threshold is too small, the number of sub-strokes that satisfy the merging conditions grows and more sub-strokes must be inspected to determine whether they should be merged into a stroke. However, the actual merging decision depends on the relaxation probability before and after merging, not on this threshold. Therefore, this threshold does not affect the recognition rate but does affect the matching speed. In our experiments, a moderate value such as 0.5 is adequate.
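The bookkeeping part of the merging algorithm (building the pool, listing merge candidates, and the acceptance test) can be sketched as follows. The re-run of the relaxation iteration on the merged sub-stroke is deliberately left out; its result is passed in as `s_merged`. The data layout (`conn_in` and `conn_ref` as dictionaries of connected sub-stroke indices), all function names and the numeric values in the example are hypothetical.

```python
def build_pool(S, th=0.5):
    """Step 1: collect all pairs (p, q) whose probability exceeds the threshold."""
    return [(p, q) for p, row in enumerate(S) for q, s in enumerate(row) if s > th]


def merge_candidates(pool, conn_in, conn_ref):
    """Steps 2 and 3: list pairs of pool entries that qualify for a trial merge.

    Returns tuples ("ref", p, q1, q2) when one input sub-stroke matches two
    connected reference sub-strokes, and ("in", q, p1, p2) when two connected
    input sub-strokes match the same reference sub-stroke.
    """
    found = []
    for a in range(len(pool)):
        for b in range(a + 1, len(pool)):
            (p1, q1), (p2, q2) = pool[a], pool[b]
            if p1 == p2 and q2 in conn_ref[q1]:
                found.append(("ref", p1, q1, q2))
            elif q1 == q2 and p2 in conn_in[p1]:
                found.append(("in", q1, p1, p2))
    return found


def accept_merge(s_merged, s_first, s_second):
    """Keep the merge only if it beats both original matching probabilities."""
    return s_merged > s_first and s_merged > s_second


# Illustrative values only: inputs 0 and 1 are connected and both match reference 0.
S = [[0.9, 0.2],
     [0.8, 0.1],
     [0.3, 0.7]]
conn_in = {0: {1}, 1: {0}, 2: set()}
conn_ref = {0: set(), 1: set()}
pool = build_pool(S, th=0.5)
candidates = merge_candidates(pool, conn_in, conn_ref)
```

Because acceptance depends only on the comparison in `accept_merge`, lowering `th` enlarges the candidate list and the work done, but not the set of merges that are finally kept.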

Figure 8 shows an example of the matching between two characters of the same category. In Fig. 8, (a) is the reference character, (b) is the input character and (c) is the probability matching matrix after the first iteration. Consider S(I2, R4) and S(I4, R4): both I2 and I4 match with R4, and I2 and I4 are connected. If we merge I2 and I4 and do the first iteration of relaxation again, we get another probability matrix SS, shown in Fig. 8(d). SS(I2', R4) in Fig. 8(d) is greater than S(I2, R4) and S(I4, R4) in Fig. 8(c), which means that I2 and I4 will be merged and replaced by I2' for the next iteration. In Fig. 8(a), sub-stroke #2 has no neighbors, which means that this sub-stroke cannot get support from its neighbors. Therefore, the matching probability of R2 with all I's is set to zero.

(c)
       R1     R2    R3     R4           R5
I1    0.70    0    0.98   0.93         0.93
I2    0.81    0    0.90   [illegible]  0.88
I3    0.73    0    0.93   0.91         0.91
I4    0.86    0    0.85   [illegible]  0.88
I5    0.75    0    0.93   0.90         0.93
I6    0.92    0    0.86   0.84         0.88

(d)
       R1     R2    R3     R4     R5
I1    0.70    0    0.98   0.93   0.93
I2'   0.73    0    0.95   0.99   0.89
I3    0.73    0    0.93   0.91   0.91
I5    0.75    0    0.93   0.90   0.93
I6    0.92    0    0.86   0.84   0.88

Fig. 8. Example of the matching between two characters of the same category: (a) reference character; (b) input character; (c) probability matrix after the first iteration; (d) probability matrix after merging I2 and I4.

During each iteration of the relaxation process, the merging algorithm will merge the sub-strokes and redo the relaxation process.

3.2. Multi-stroke select-match-pair process

After the stroke merging and matching process, a matching probability matrix will be constructed. Because the merging step has been processed in the relaxation step, the optimal matching pairs in the probability matrix are one-to-one mappings. This means that each column can only map to one row in the matching matrix. Then a simple stroke selecting algorithm can be developed as below.

3.2.1. Stroke selecting algorithm

Given: Li, Lr and the probability matrix S^(r)(lip, lrq).

Goal: To find the optimal stroke matching pairs.

Step 1: Select the greatest S in the probability matching matrix and clear all elements of its corresponding row and column.


Step 2: Repeat Step 1 until all the match pairs are selected.
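The two steps above can be sketched as a greedy loop. The stopping rule (stop when no positive entry remains) and the function name `select_pairs` are assumptions; the matrix is the one reported for Fig. 8(d), with rows I1, I2', I3, I5, I6 and columns R1 to R5.

```python
def select_pairs(S):
    """Greedy selection: repeatedly take the largest entry, then clear its row and column."""
    work = [row[:] for row in S]          # do not modify the caller's matrix
    pairs = []
    while True:
        best, bp, bq = 0.0, -1, -1
        for p, row in enumerate(work):
            for q, s in enumerate(row):
                if s > best:
                    best, bp, bq = s, p, q
        if bp < 0:                        # nothing positive left to select
            break
        pairs.append((bp, bq))
        for q in range(len(work[bp])):
            work[bp][q] = 0.0             # clear the selected row
        for row in work:
            row[bq] = 0.0                 # clear the selected column
    return pairs


rows = ["I1", "I2'", "I3", "I5", "I6"]
cols = ["R1", "R2", "R3", "R4", "R5"]
S = [[0.70, 0, 0.98, 0.93, 0.93],
     [0.73, 0, 0.95, 0.99, 0.89],
     [0.73, 0, 0.93, 0.91, 0.91],
     [0.75, 0, 0.93, 0.90, 0.93],
     [0.92, 0, 0.86, 0.84, 0.88]]
selected = [(rows[p], cols[q]) for p, q in select_pairs(S)]
```

On this matrix the loop selects (I2', R4), (I1, R3), (I5, R5) and (I6, R1) in that order and leaves I3 unmatched, since only the all-zero column R2 remains.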

For the example in Fig. 8, the optimal matching pairs are (I2', R4), (I1, R3), (I5, R5), and (I6, R1), where I2' is merged from I2 and I4. This means that both I2 and I4 match with R4. The selection process is described as follows:

       R1     R2    R3     R4      R5
I1    0.70    0    0.98   0.93    0.93
I2'   0.73    0    0.95   *0.99   0.89
I3    0.73    0    0.93   0.91    0.91
I5    0.75    0    0.93   0.90    0.93
I6    0.92    0    0.86   0.84    0.88

In the above matrix, the greatest element is (I2', R4). We select (I2', R4) and clear the row of I2' and the column of R4, obtaining

       R1     R2    R3      R4    R5
I1    0.70    0    *0.98    0    0.93
I2'   0       0    0        0    0
I3    0.73    0    0.93     0    0.91
I5    0.75    0    0.93     0    0.93
I6    0.92    0    0.86     0    0.88

From the above matrix, the greatest element is (I1, R3). We select (I1, R3) and clear the row of I1 and the column of R3, obtaining

       R1      R2    R3    R4    R5
I1    0        0     0     0    0
I2'   0        0     0     0    0
I3    0.73     0     0     0    0.91
I5    0.75     0     0     0    *0.93
I6    *0.92    0     0     0    0.88

In the above matrix, the greatest element is (I5, R5). We select (I5, R5) and clear the row of I5 and the column of R5. Repeating the above procedure, (I6, R1) is selected.

The above stroke selection algorithm is straightforward but not optimal. To select the optimal matching relations, the reader can apply an optimal algorithm such as the Hungarian algorithm. (29) However, the optimal method takes much more time to select the matching pairs.

3.2.2. Scoring method. After selecting the matching pairs, we compute the matching score to represent the similarity of matching. The scoring method is defined as follows:

$$Score = \frac{Tlength_i}{Tlength_r}, \qquad \text{where } Tlength_i = \sum_{k=1}^{n} Il_k, \quad Tlength_r = \sum_{k=1}^{n} Rl_k,$$

if ||lip|| ≤ ||lrq||, then Il = ||lipoj||, Rl = ||lrq||,

else Il = ||lrpoj||, Rl = ||lip||,

where n is the number of matching pairs, ||lipoj|| is the length of the projection of lip on lrq and ||lrpoj|| is the length of the projection of lrq on lip. For example, the matching score of Fig. 8(a) and (b) is 0.84, and the number of matching pairs is 4.
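The scoring rule can be sketched as follows. This is an illustrative reading under stated assumptions: the comparison between the two lengths is taken as "less than or equal", each sub-stroke is a pair of endpoints, and the names `length`, `projected_length` and `matching_score` as well as the example segments are hypothetical.

```python
import math


def length(seg):
    """Length of a sub-stroke given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def projected_length(seg, onto):
    """Length of the projection of seg onto the direction of onto."""
    (x1, y1), (x2, y2) = seg
    (a1, b1), (a2, b2) = onto
    dx, dy = a2 - a1, b2 - b1
    return abs((x2 - x1) * dx + (y2 - y1) * dy) / math.hypot(dx, dy)


def matching_score(pairs):
    """Score = sum of Il over all pairs divided by the sum of Rl.

    pairs: list of (lip, lrq) tuples for the selected matching pairs, with
    lip already transformed into the reference coordinate frame.
    """
    t_in = t_ref = 0.0
    for lip, lrq in pairs:
        if length(lip) <= length(lrq):
            il, rl = projected_length(lip, lrq), length(lrq)
        else:
            il, rl = projected_length(lrq, lip), length(lip)
        t_in += il
        t_ref += rl
    return t_in / t_ref


pairs = [
    (((0.0, 0.0), (3.0, 4.0)), ((0.0, 0.0), (10.0, 0.0))),  # input shorter: Il = 3, Rl = 10
    (((0.0, 0.0), (8.0, 0.0)), ((0.0, 0.0), (4.0, 0.0))),   # input longer:  Il = 4, Rl = 8
]
score = matching_score(pairs)
perfect = matching_score([(((0.0, 0.0), (5.0, 0.0)), ((0.0, 0.0), (5.0, 0.0)))])
```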

4. THE EXPERIMENTAL RESULTS

The experimental data (38) used in this paper are taken from the handwritten Chinese character database developed by the Computer and Communication Research Laboratories of the Industrial Technology Research Institute (ITRI) in Taiwan. ITRI took over 2 years to recruit thousands of people to write daily-used Chinese characters. Each person wrote thousands of different Chinese characters on paper. The database was then established by digitizing the characters on paper with an image scanner, without any normalization or calibration process, so the characters in the database exhibit translation, rotation and scale change. There are more than 5401 categories of Chinese characters in the database and each category has more than 200 different writing styles. The Chinese characters in the database were compressed by run address and each character was saved into a file depending on its size after compression.

Because the database is so large, only part of it is used in the experiments. Two experiments are carried out to verify the performance of the matching algorithm. In the first experiment, we performed two different sets of matching. Set A represents the matching between two characters of different categories, and set B represents the matching between two characters of the same category but different writing styles. Figure 9 shows the distributions of matching scores for these two sets. If the matching algorithm were perfect, there would be no overlap between the distributions of set A and set B, and a threshold could easily be chosen to decide whether the input and reference characters are the same. That is, if the matching score between the input and reference characters is greater than the threshold, the input character is recognized as the reference character with full confidence. In practice, there is always some overlap between the distributions of the two sets. The smaller the area of overlap, the more confident the recognition. In this experiment, we perform 2500 matchings for each of set A and set B between characters randomly chosen from the database.

Fig. 9. Distributions of matching scores (horizontal axis: SCORE) for two different data sets. A is the matching between characters of two different categories; B is the matching between characters of the same category but different writing styles.

In the second experiment, we choose the 2000 most frequently used categories of Chinese characters from the 5401 categories as the testing database. For each category, one writing style is randomly chosen from the 200 writing styles as a reference character, so we have a set of 2000 different Chinese characters as the reference database. For each category, nine writing styles are randomly chosen from the 200 writing styles as the input characters, so we have a set of 18,000 Chinese characters as the test database. In order to obtain the actual recognition rate, each tested input character is matched with the 2000 reference characters, and the character with the highest score is the recognition result. Figure 10 shows one of the matching results between the input and the reference characters.

Fig. 10. Matching results between the input and reference characters. The number below each reference character is the matching score (0.94, 0.92, 0.89, 0.89 and 0.83 in this example).

In order to verify the feasibility of the proposed matching algorithm, the first five candidates of the matching results are listed. Figure 11 shows the histogram of the first five candidates of matching results for 18,000 input characters (9 writing styles for 2000 different characters). In this figure, set 1 denotes the number of matching results of the first candidate (highest score), and set 2 denotes the number of matching results of the second candidate, and so on.

Figure 12 shows the cumulative recognition rates of the experimental results. From this figure, the actual recognition rate is 93.8%, and the cumulative recognition rate of the first five candidates is 98.9%. The performance of the proposed method is better than that of our previous work (26,27,29) in terms of both the recognition rate and the size of the tested database. However, the recognition speed is slower than that of our previous work, because the merging process raises the recognition rate but lowers the recognition speed. Increasing the speed of the merging process remains a challenging task for the future.

Fig. 11. Histogram of the first five candidates for the recognition of 18,000 handwritten Chinese characters (vertical axis: numbers, 0 to 20,000; horizontal axis: set 1 to set 5).

Fig. 12. The cumulative recognition rate of the first five candidates.
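The relation between the per-candidate histogram (Fig. 11) and the cumulative recognition rate (Fig. 12) can be sketched as follows. This is an illustrative computation under assumptions, not the author's evaluation code: the function name `cumulative_recognition_rates`, the dictionary layout of the scores, and the toy data are all hypothetical.

```python
def cumulative_recognition_rates(score_lists, true_labels, k=5):
    """Count at which rank the correct reference appears and accumulate rates.

    score_lists[i] maps each reference label to its matching score for input i.
    Returns (hits, rates): hits[r] is the number of inputs whose correct
    reference is the (r+1)-th candidate, and rates[r] is the fraction of
    inputs whose correct reference is among the first r+1 candidates.
    """
    hits = [0] * k
    for scores, truth in zip(score_lists, true_labels):
        # Candidates ordered by decreasing matching score, truncated to k.
        ranked = sorted(scores, key=scores.get, reverse=True)[:k]
        if truth in ranked:
            hits[ranked.index(truth)] += 1

    total = len(true_labels)
    rates, running = [], 0
    for h in hits:
        running += h
        rates.append(running / total)
    return hits, rates


# Toy data, for illustration only: four inputs matched against three references.
score_lists = [
    {"A": 0.9, "B": 0.8, "C": 0.1},
    {"A": 0.7, "B": 0.8, "C": 0.2},
    {"A": 0.2, "B": 0.3, "C": 0.9},
    {"A": 0.9, "B": 0.8, "C": 0.1},
]
true_labels = ["A", "A", "C", "C"]
hits, rates = cumulative_recognition_rates(score_lists, true_labels, k=2)
```

In this notation, `rates[0]` corresponds to what the text calls the actual recognition rate (first candidate only), and `rates[4]` with `k=5` corresponds to the cumulative rate of the first five candidates.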

5. DISCUSSIONS AND CONCLUSIONS

In this paper, a multi-stroke relaxation matching method is proposed based on a relaxation technique and the concept of minimum feature relations. Sub-strokes are used as features for matching, and the merging process is performed during the relaxation process. The experimental results show that the proposed method can solve the merging and matching problems of defective sub-strokes caused by the stroke extraction process and different writing styles. The actual recognition rate on 2000 daily-used Chinese characters is 93.8%, and the cumulative recognition rate of the first five candidates is 98.9%.

Fig. 13. Mis-recognized results between the input and reference characters. The number below each reference character is the matching score. (Scores shown in the figure: 0.92, 0.89, 0.83, 0.82, 0.76.)

In our experiments, there are some characters for which the reference character with the highest matching score is not the correct one. Figure 13 shows one example of a character that is not correctly recognized by our algorithm. The reasons for such results can be summarized as follows:

(1) Sub-strokes that have no neighbors influence the matching result.

(2) If the input character and the reference character are similar, one character may be some part of the other one.

Some important aspects of the proposed multi-stroke relaxation matching method are listed below:

(1) Sub-strokes are used as features for matching.

(2) A proper compatibility function is devised to raise the possibility of matching among sub-strokes.

(3) Merging steps are incorporated into the relaxation process.

(4) The problem of different numbers of sub-strokes caused by different writing styles and feature extraction schemes is solved.

(5) The multi-stroke matching method resembles the human recognition process.

Considerable research still needs to be done to improve this technique, including: finding more relations between connected sub-strokes to increase the possibility of merging sub-strokes; increasing the speed of the merging method; and improving the scoring method to raise the actual recognition rate.

Acknowledgement--We would like to express our great appreciation to the National Science Council for its full support of this research under Grant no. NSC85-2213-E-216-017.

REFERENCES

1. V. K. Govindan, Character recognition, Pattern Recognition 23, 671-683 (1990).

2. T. H. Hildebrandt and W. Liu, Optical recognition of handwritten Chinese character: advances since 1980, Pattern Recognition 26, 205-225 (1993).

3. F. Ali and T. Pavlidis, Syntactic recognition of handwritten numerals, IEEE Trans. Systems Man Cybernet. 7, 537-541 (1977).

4. T. Caesar, J. Gloger, A. Kaltenmeier and E. Mandler, Recognition of handwritten word images by statistical methods, Proc. 3rd Int. Workshop on Frontiers in Handwriting Recognition, pp. 409-414, Buffalo, New York (1993).

5. F. H. Cheng and W. H. Hsu, Research on Chinese OCR in Taiwan, Int. J. Pattern Recognition Artificial Intell. 5(1 & 2), 139-164 (1991).

6. Y. Y. Tang, L. T. Tu, T. Li, W. W. Lin, I. S. Shyu and C. Y. Suen, Chinese character recognition with stroke features and tree-structured neural network, Comput. Process. Chinese Oriental Languages 8, 17-36 (1994).

7. H. D. Chang, J. F. Wang and S. C. Kuo, A Bayesian neural network for separating similar complex handwritten Chinese characters, Pattern Recognition Lett. 15, 403-408 (1994).

8. H. Yamada, K. Yamamoto and T. Saito, A nonlinear normalization method for handprinted Kanji character recognition - line density equalization, Pattern Recognition 23, 1023-1029 (1990).


9. W. Guerfali and R. Plamondon, Normalizing and restoring on-line handwriting, Pattern Recognition 26, 419-431 (1993).

10. T. Y. Zhang and C. Y. Suen, A fast parallel algorithm for thinning digital patterns, Commun. ACM 27, 236-239 (1984).

11. Y. S. Chen and W. H. Hsu, A modified fast parallel algorithm for thinning digital patterns, Pattern Recognition Lett. 7, 99-106 (1988).

12. C. S. Chen and W. H. Tsai, A new fast one-pass thinning algorithm and its parallel hardware implementation, Pattern Recognition Lett. 11, 471-477 (1990).

13. S. Mori, K. Yamamoto and M. Yasuda, Research on machine recognition of handprinted characters, IEEE Trans. Pattern Analysis Mach. Intell. 6, 386-405 (1984).

14. K. Nakata, Y. Nakano and Y. Uchikura, Research of Chinese characters, Proc. Conf. Machine Perception of Patterns of Pictures, Teddington, pp. 45-52 (1972).

15. Y. X. Gu, Q. R. Wang and C. Y. Suen, Application of multilayer decision tree in computer recognition of Chinese characters, IEEE Trans. Pattern Analysis Mach. Intell. 5(1), 83-89 (1983).

16. K. Sakai, S. Hirai, T. Kawada, S. Amano and K. Mori, An optical Chinese character reader, Proc. 3rd Int. Joint Conf. Pattern Recognition, pp. 122-126 (1976).

17. M. Yasuda and H. Fujisawa, An improvement of correlation method for character recognition, Trans. IECE Japan J62-D, 217-224 (1979).

18. T. Saito, H. Yamada and K. Yamamoto, An analysis of handprinted Chinese characters by directional pattern matching approach, Trans. IECE Japan J65-D, 550-557 (1982).

19. H. A. Glucksman, Classification of mixed font alphabetics by character loci, Proc. 1st Annu. IEEE Comput. Conf., pp. 137-141 (1976).

20. Research on hand printed Chinese characters recognition, Nikkei Electron. 12-7, 148-167 (1981).

21. K. W. Gan and K. T. Lua, A new approach to stroke and feature point extraction in Chinese character recognition, Pattern Recognition Lett. 12, 381-387 (1991).

22. Hsi-jian Lee and Bin Chen, Recognition of handwritten Chinese characters via sub-line segments, Pattern Recognition 25(5), 543-552 (1992).

23. L. Y. Tseng and C. T. Chuang, An efficient knowledge-based stroke extraction method for multi-font Chinese characters, Pattern Recognition 25, 1445-1458 (1992).

24. H. Yamada, T. Saito and S. Mori, An improvement of correlation method-shift similarity, Trans. IECE Japan J64-D, 970-976 (1981).

25. Y. Kurosawa, K. Maeda, H. Asada and K. Sakai, Experiment on Chinese character recognition based on complex similarity, Proc. of IECE Annual Convention, pp. 71-79 (October 1981).

26. F. H. Cheng, W. H. Hsu and C. A. Chen, Fuzzy approach to solve the recognition problem of handwritten Chinese characters, Pattern Recognition 22, 133-141.

27. F. H. Cheng, W. H. Hsu and M. Y. Chen, Recognition of handwritten Chinese characters by modified Hough transform techniques, IEEE Trans. Pattern Analysis Mach. Intell. 11, 429-439 (1989).

28. K. Yamamoto, Recognition of hand printed Kanji characters by relaxation matching, Trans. IECE Japan J65-D, 1167-1174 (1982).

29. F. H. Cheng, W. H. Hsu and M. C. Kuo, Recognition of handprinted Chinese characters via stroke relaxation, Pattern Recognition 26(4), 579-593 (1993).

30. M. Shinya and M. Umeda, Evaluation of compound post-processing method in character recognition, Trans. IECE Japan J68-D(5), 1118-1124 (1985).

31. K. T. Lua and K. W. Gan, Recognizing Chinese characters through interactive activation and competition, Pattern Recognition 23, 1311-1321 (1990).

32. R. H. Cheng, C. W. Lee and Z. Chen, Preclassification of handwritten Chinese characters based on basic stroke substructures, Pattern Recognition Lett. 16, 1023-1032 (1995).

33. F. H. Cheng and W. H. Hsu, Radical extraction from handwritten Chinese characters by background thinning method, Trans. IEICE E71(1), 88-98 (1988).

34. M. K. Hu, Visual pattern recognition by moment invariants, IRE Trans. Inform. Theory 8, 179-187 (1962).

35. K. Price and R. Reddy, Matching segments of images, IEEE Trans. Pattern Analysis Mach. Intell. 1, 110-116 (1979).

36. J. C. Simon, A. Checroun and C. Roche, A method of comparing two patterns independent of possible transformations and small distortions, Pattern Recognition 4, 73-84 (1972).

37. Fang-Hsuan Cheng, Planar pattern matching algorithm and its applications to handwritten Chinese character recognition, Commun. COLIPS 5(1 & 2), 9-18 (1995).

38. L. T. Tu, Y. S. Lin, C. P. Yeh, I. S. Shyu, J. L. Wang, K. H. Joe and W. W. Lin, Recognition of handprinted Chinese characters by feature matching, Int. Conf. on Computer Processing of Chinese and Oriental Languages, pp. 15~157 (1991).

39. F. H. Cheng, Point pattern matching algorithm invariant to geometrical transformation and distortion, Pattern Recognition Lett. 17, 1429-1435 (1996).

40. Shih-Hsu Chang, Fang-Hsuan Cheng, Wen-Hsing Hsu and Guo-Zua Wu, Fast algorithm for point pattern matching: invariant to translations, rotations and scale changes, Pattern Recognition 30, 311-320 (1997).

About the Author--FANG-HSUAN CHENG was born in Hsinchu, Taiwan, on 13 June 1960. He graduated with a B.E. degree from the Department of Electrical Engineering, National Cheng-Kung University, Taiwan, in 1982. He received the M.E. and Doctor of Engineering degrees from the Institute of Electrical Engineering, National Tsing Hua University, Taiwan, in 1984 and 1988, respectively. He received the Dragon Totem Award from Acer Corporation in 1988. From 1988 to 1992, Dr Cheng was with Chung Shan Institute of Science and Technology as a senior specialist, where he was involved in signal processing, flight data analysis, parameter estimation and distributed database design. Dr Cheng has been an associate professor in the Department of Information and Computer Engineering at Chung Yuan Christian University in Taiwan since 1991. In the fall of 1992, Dr Cheng joined the Department of Computer Science at Chung Hua Polytechnic Institute, where he is currently an associate professor. Prof. Cheng is also a member of the editorial board of Communications of COLIPS, an international journal published in Singapore. His current research interests include signal processing, coding and information theory, color image processing, multimedia and Chinese OCR.
