
ELSEVIER Image and Vision Computing 14 (1996) 91-104

Handwritten Chinese characters recognition by greedy matching with geometric constraint

Ai-Jia Hsieh a,b, Kuo-Chin Fan a,*, Tzu-I Fan a

a Institute of Computer Science and Information Engineering, National Central University, Chungli, Taiwan 32054, ROC

b Computer & Communication Research Laboratories, Industrial Technology Research Institute, Hsinchu, Taiwan 31015, ROC

Received 6 June 1994; revised 20 April 1995

Abstract

In this paper, a greedy matching algorithm with geometric constraint for solving the polygonal arcs matching (boundary line segments matching) problem is proposed. Assume that the matched cost and the unmatched penalty of each line segment are given, and that any two matched pairs in the matching must preserve the geometric relation. Our goal is to find a matching such that (1) the sum of the costs of matched line segments and the penalties of unmatched line segments is minimum, and (2) the matching preserves the geometric relation. The proposed greedy matching algorithm consists of two main modules. The first is an optimal matching module, which utilizes the optimization matching method to find an optimal matching without the geometric constraint. The second is an evaluation and update module, which evaluates each matched pair, deletes the pair whose geometric relation has the maximum inconsistency, and then finds a new matching. The algorithm continues until a stable matching is found. To verify the validity of the proposed algorithm, we implement it to recognize handwritten Chinese postal characters. We selected 51 Chinese postal characters as the prototype characters, and the experiments are conducted on 2550 samples, with each category containing 50 variations. The recognition rate is 91.8%. Experimental results reveal the feasibility of our proposed method in recognizing handwritten Chinese postal characters.

Keywords: Contour tracing; Greedy matching algorithm; Bipartite weighted matching; Combinatorial optimization; Handwritten Chinese character recognition

1. Introduction

Intelligent input is a natural and efficient means of man-machine communication which may replace the role of the keyboard in the future. Abundant surveys on this topic have been presented [1-5]. Several systems which can recognize machine-printed, hand-printed and on-line handwritten Chinese characters have also been developed. However, handwritten Chinese character recognition is still an open problem because of several inherent difficulties: the large number of categories, the complexity of the characters, the similarity among different categories, and the wide variability among writers [6].

Methods of pattern recognition can be classified into three main approaches: statistical, structural and neural network. In statistical pattern recognition, each pattern is represented as a point in an n-dimensional feature space and then statistical decision theory, such as Bayes decision theory, is applied to classify each pattern. In structural pattern recognition, the pattern is represented by a string, a tree, or an attributed graph [7-9], and then a systematic matching method or parser is applied to recognize the pattern. The neural network approach has received a great deal of attention recently, because it has the advantages of a self-organization capability (learning) and high parallelism [10].

* Corresponding author. Email [email protected].

0262-8856/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved

SSDI 0262-8856(95)01043-2

The matching technique plays an important role, especially in structural pattern recognition. Several useful matching techniques have been proposed and implemented for character recognition, such as string matching, dynamic programming (DP) matching, relaxation matching, heuristic matching and combinatorial optimal matching [1]. String matching and DP matching are usually used to compare two sequences of features, but this constraint is not imposed on the other matching techniques. Among these, relaxation matching is the best-known matching technique, which was first proposed by Yamamoto and Rosenfeld [11] for the recognition of handwritten characters. Several improved relaxation-based methods for character recognition have since been proposed [6,12,13]. Chou and Tsai [14] proposed an iterative scheme for matching line segments of input and prototype characters. Cheng et al. [15] transformed the matching problem into a stroke assignment problem. Wakahara [16,17] also proposed this concept for stroke-order and stroke-number free on-line Kanji character recognition in an independent study.

Several features have been proposed for use in the matching process, including strokes [13], line segments of strokes [12,14], fixed-length line segments [16], line segments of a polygon [6,11], and so on. For handwritten Chinese characters, stroke extraction requires a more complicated algorithm than the extraction of line segments of a polygon. Therefore, we adopt the line segment of a polygon as the primitive for the matching process in this paper.

Pattern description and representation in structural pattern recognition include string, tree, and graph, etc. In practical applications, patterns may be distorted by noise or the feature extraction process, so it is possible that some features (e.g., line segments) in the input pattern may not appear in the prototype pattern, and vice versa. The matching problem could thus be classified as a string-to-string correction problem [18], a tree-to-tree correction problem [19], and graph/subgraph isomorphisms [20], etc. The correction problem is to determine the distance between two patterns as measured by the minimum cost of edit operations (insertion, deletion and substitution) needed to transform one pattern into another. The string-to-string and tree-to-tree correction problems have been solved by dynamic programming in polynomial time. However, subgraph isomorphism is an NP-complete problem, and graph isomorphism is still an open problem [21]. Because of this inherent intractability, graph representation, although a more reasonable representation, requires more computation.

From another view, the simplest way to represent a pattern is to regard the features as an unordered set. To match two unordered sets, Cheng et al. [15] transformed the matching problem into the well-known assignment problem. In their matching algorithm, they directly transform the matching problem into an assignment problem by adding some dummy features to the insufficient side. However, this approach still cannot handle features that are missing on both sides, i.e. it does not fully solve the problem in terms of the above three edit operations.

In our previous work [22], we extended Cheng's and Wakahara's concept to model the matching problem as a bipartite weighted matching with penalty problem (BWMPP) and a bipartite weighted matching with geometric constraint problem (BWMGCP). Since the matching goal of BWMPP is to find a matching such that the sum of the costs of matched features and the penalties of unmatched features is minimum, it is equivalent to finding an optimal solution based on the above three editing operations. Moreover, BWMGCP further restricts the optimal matching to satisfy the constraint of the geometric relation, so that a more reasonable result can be obtained when unordered features, instead of high-level structural representations (e.g. graphs), are used to represent a pattern. A greedy algorithm that applies the Hungarian method is proposed to obtain a local minimum matching. In each iteration of the greedy algorithm, the matched pair whose geometric relation with the other matched pairs has the maximum inconsistency is deleted, and then a new matching is found by applying the Hungarian method again. This strategy reinforces the preservation of relational consistency. In this way, we can always find a stable matching that preserves the geometric relation.

Fig. 1. Block diagram of overall recognition system (stages shown include: input pattern, preprocessing, polygonal approximation, size normalization, result).


In this paper, a greedy matching algorithm for matching the polygonal line segments of handwritten Chinese postal characters is proposed. An improvement in the complexity of the above greedy matching algorithm is also addressed.

A block diagram of the proposed recognition system is shown in Fig. 1. The input pattern is first preprocessed to enhance its quality. Contour tracing and polygonal approximation are then employed to trace the boundary of the pattern and to approximate the contour polygon by line segments, respectively. Next, the pattern is normalized to a standard size. Finally, the input character is matched with all prototypes by a greedy algorithm without preclassification. The prototype with the minimum distance is chosen as the recognition result. The line segments of the prototypes are created in a learning phase in which the prototypes are also processed from preprocessing to size normalization.

The rest of the paper is organized as follows. The descriptions of preprocessing, contour tracing and polygonal approximation are given in the next section. The modelling of the recognition problem as an optimization problem is then addressed. The proposed algorithm for finding a local minimum matching with a geometric constraint is then presented, along with the distance measure for line segment matching. Experimental results on handwritten Chinese postal character recognition are illustrated. Finally, concluding remarks are given.

2. Polygonal approximation of boundary

In this section, the processes of preprocessing, contour tracing and polygonal approximation in approximating the boundary of an input pattern are described. Before discussing the techniques, some terminology first has to be defined.

Definition 1 In a binary pattern, each element can be either a dark point or a white point. The 8-neighbour of point p is defined as the eight points adjacent to p, such as the points from n0 to n7 in Fig. 2. Points n1, n3, n5, and n7 are also referred to as the 4-neighbour of p.

Definition 2 A break-point is a dark point, the deletion of which would break the connectedness of the original pattern [23].

n0 n1 n2
n7 p  n3
n6 n5 n4

Fig. 2. 8-neighbour of p.

2.1. Preprocessing

A pattern may have a one-pixel width stroke which is generated by selecting an incorrect scanning threshold or writing by an illegible pen. Since the tracing of a boundary may incur the breaking of a contour due to the appearance of noise or a thin stroke with only a one-pixel width, some effort to enhance the quality of the input pattern is necessary. In this subsection, the technique to fill some dark points to avoid the generation of a break-point is presented. In this way, the contour tracing algorithm need not use the retracing technique [24] when a pattern has no break-point in it. A break-point can be easily detected by examining its crossing-number. The crossing-number of a point p is defined as [25]:

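$$X_H(p) = \sum_{i=1}^{4} b_i$$

where

$$b_i = \begin{cases} 1 & \text{if } n_{2i-1} \text{ is a white point and (either } n_{2i \bmod 8} \text{ or } n_{(2i+1) \bmod 8} \text{ is a dark point)} \\ 0 & \text{otherwise} \end{cases}$$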

If the crossing-number of p is not equal to 1, then p is a break-point. Shown in Fig. 3 are the crossing-numbers of some illustrative examples.
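The crossing-number formula can be illustrated with a minimal Python sketch. It assumes the neighbour numbering of Fig. 2 (n0 at the top-left, then clockwise) and encodes each neighbour as a boolean that is true for a dark point; the function name `crossing_number` and the list encoding are illustrative choices, not part of the original method description.

```python
def crossing_number(nb):
    """Crossing number X_H(p) computed from the 8-neighbourhood of p.

    nb[k] is True when neighbour n_k is dark. Indices follow Fig. 2:
    n0 is top-left and the numbering runs clockwise, so n1, n3, n5, n7
    are the 4-neighbours of p.
    """
    total = 0
    for i in range(1, 5):
        four_nb = nb[2 * i - 1]        # n_{2i-1}
        next1 = nb[(2 * i) % 8]        # n_{2i mod 8}
        next2 = nb[(2 * i + 1) % 8]    # n_{(2i+1) mod 8}
        if not four_nb and (next1 or next2):
            total += 1                 # b_i = 1
    return total


# p lies in the middle of a one-pixel-wide horizontal stroke (n3 and n7 dark).
horizontal = [False, False, False, True, False, False, False, True]
# p is the left end of a stroke (only n3 dark).
endpoint = [False, False, False, True, False, False, False, False]

print(crossing_number(horizontal))  # 2: deleting p would split the stroke
print(crossing_number(endpoint))    # 1: deleting p keeps the stroke connected
```

In this sketch the point in the middle of the thin stroke gets a crossing-number of 2 and is therefore reported as a break-point by the rule above, whereas the stroke end gets 1.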

To change a white point into a dark point to avoid the occurrence of a break-point, the candidate pixel to be chosen must be one of the 4-neighbours of the break-point once the break-point is detected. The chosen 4-neighbour, namely $n_i$, must satisfy the following conditions: (1) it is a white point; (2) one of its two successor points ($n_{(i-1) \bmod 8}$ and $n_{(i-2) \bmod 8}$) and one of its two predecessor points ($n_{(i+1) \bmod 8}$ and $n_{(i+2) \bmod 8}$) must be a dark point; or it must belong to one of the diagonal cases as shown in Fig. 3d, e. In diagonal cases, we always choose n1 as the pixel to be changed to avoid breaking the contour tracing. The above process is repeated until no more break-points can be found. After this preprocessing, we can easily verify that the boundary can be traced only once by utilizing a particular boundary tracing algorithm.

Fig. 4 gives an example illustrating the result generated after preprocessing. In this illustrative example, 16 points are added to avoid the occurrence of a break-point.

2.2. Contour tracing and polygonal approximation

Fig. 3. Five cases of the crossing number: (a) $X_H(p) = 1$; (b) $X_H(p) = 2$; (c) $X_H(p) = 3$; (d) $X_H(p) = 2$; (e) $X_H(p) = 2$.

Fig. 4. Illustration of preprocessing result. The raw pattern is marked by small dots and the added points after preprocessing are marked by large dots.

The boundary of a pattern is a set of edge-points. An edge-point is a dark point with at least one of its 4-neighbours being a white point. A dark point which is not an edge-point is called an interior point. In the following context, four edge-point types are defined:

1. A left edge-point, with its left neighbour n7 being white.

2. A right edge-point, with its right neighbour n3 being white.

3. A top edge-point, with its top neighbour n1 being white.

4. A bottom edge-point, with its bottom neighbour n5 being white.

Note that an edge-point can belong to more than one of the four types [23].

After preprocessing, the input pattern must be of at least two-pixels width, except for the endpoints of strokes. If we use a fixed tracing sequence to trace the contour of a pattern with each stroke being two pixels in width, some errors, such as contour breaking, may occur [26,31]. For example, if the tracing priority is the ordered set n7, n6, ..., n0, then the tracing path obtained will traverse from p1 to p2 as shown in Fig. 5. Shown in Fig. 5a is the desired path, but the path in Fig. 5b is not. However, changing the priority sequence will still trace the wrong path, which can be regarded as rotating the pattern in Fig. 5b.

Fig. 5. Illustration of contour tracing. (a) Correct path; (b) wrong path. -: part of the boundary; *: boundary point; +: interior point.

The basic idea of our tracing policy is always to select, as the next tracing pixel, the neighbouring pixel that shares the greatest number of edge-point types with the current point. For example, suppose an edge-point p is both a top and a left edge-point, and its two neighbours q1 and q2 are a left edge-point and a right edge-point, respectively. Since p and q1 share the property of being a left edge-point, q1 is selected as the next tracing pixel. If there is no neighbouring pixel with the same edge-point type, then the tracing process proceeds as follows. If the current point is a top edge-point, move to n1's 4-neighbour. If the current point is a bottom edge-point, move to n5's 4-neighbour. If the current point is a right edge-point, move to n7's 4-neighbour. If the current point is a left edge-point, move to n3's 4-neighbour. Otherwise, terminate the current contour tracing procedure. Fig. 6 illustrates a terminated case; though p has two unlabelled neighbouring points, the tracing procedure still has to be terminated. Fig. 7 illustrates a case where there is no candidate point with the same edge-point type as the current point p, but p is still movable.
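The selection rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the image is a list of rows with 1 for dark and 0 for white, pixels outside the image are treated as white, and the helper names `edge_types` and `choose_next` are hypothetical. The fallback moves via n1, n5, n7 and n3 are not covered.

```python
def edge_types(img, r, c):
    """Return the set of edge-point types of the dark pixel at (r, c)."""
    def white(rr, cc):
        inside = 0 <= rr < len(img) and 0 <= cc < len(img[0])
        return (not inside) or img[rr][cc] == 0   # assumption: outside = white

    types = set()
    if white(r, c - 1):
        types.add("left")      # n7 is white
    if white(r, c + 1):
        types.add("right")     # n3 is white
    if white(r - 1, c):
        types.add("top")       # n1 is white
    if white(r + 1, c):
        types.add("bottom")    # n5 is white
    return types


def choose_next(img, p, candidates):
    """Pick the candidate sharing the most edge-point types with p.

    Returns None when no candidate shares any type with p.
    """
    types_p = edge_types(img, *p)
    best, best_shared = None, 0
    for q in candidates:
        shared = len(types_p & edge_types(img, *q))
        if shared > best_shared:
            best, best_shared = q, shared
    return best


# p = (0, 0) is a top and left edge-point; q1 = (1, 0) is a left edge-point
# and q2 = (1, 1) is a right edge-point, as in the example in the text.
img = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
]
print(edge_types(img, 0, 0))                     # contains 'left' and 'top'
print(choose_next(img, (0, 0), [(1, 1), (1, 0)]))
```

Here q1 is selected because it shares the left edge-point type with p, while q2 shares none.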

Fig. 6. Illustration of a terminated case where point p cannot move again. This pattern has two contours represented by a thick and a thin line. -: part of the boundary; *: boundary point; +: interior point.

Fig. 7. Illustration of no candidate point with the same edge type in current point p.

The outer and interior contours of a pattern are defined as the outer and interior boundaries of the pattern. To fix the contour tracing direction, the first move is always to the first available point in the ordered set n7, n6, n5 and n4 of the starting point. Hence, the moving direction is always counter-clockwise. In Fig. 8, the first moves of outer contour C1 and interior contour C2 are to p1 and p3, respectively. Whether a contour is an outer or an interior contour can be easily determined from the edge-point type of its starting point. If the starting point is a top edge-point, then the contour considered is an outer contour; otherwise it is an interior contour. After all possible moves have been made, the ending point will be a neighbour of the starting point. Finally, the starting point is added to the current contour to complete the tracing process. In the tracing process, all traced points are labelled and never traversed again, i.e. each boundary point is traced only once. The complete contour tracing algorithm is given below.

Contour tracing algorithm
Input: Boundary points.
Output: Line segments of contours.

Step 1. Initially, every point is unlabelled.

Step 2. If all points are labelled, terminate this algorithm. Otherwise, scan the picture systematically, say row by row, to find the left-most unlabelled boundary point p, and label it. If p is a top edge-point, then this contour is an outer contour. Otherwise, it is an interior contour. Move to the first available point in the ordered set n7, n6, n5 and n4. If this move fails, go to Step 6.

Step 3. Find the unlabelled boundary points in p's 8-neighbour, say Q. If Q is empty, go to Step 6.

Step 4. For each $q_k \in Q$, let $S_k$ = {edge-point types of $q_k$} $\cap$ {edge-point types of p}, and let $q_n$ be a point with $|S_n| = \max_{0 \le k \le 7} |S_k|$.

Step 5. If $|S_n| \ne 0$, move to $q_n$ and label it, and go back to Step 3. Otherwise, if p is a top edge-point, set r = n1; if p is a bottom edge-point, set r = n5; if p is a left edge-point, set r = n3; if p is a right edge-point, set r = n7. Choose a point $q_k \in Q$ which is a 4-neighbour of r. If $q_k$ exists, then move to $q_k$ and label it, and go back to Step 3. Otherwise, go to Step 6.

Step 6. Add the start point of the contour as its end-point. If the number of points in this contour is smaller than a predefined threshold $d_1$, then this contour is regarded as a noise contour, and is thereby deleted. Otherwise, apply the contour approximation algorithm as described below to segment this contour, and go back to Step 2.

Fig. 8. First move in a contour. C1 moves to p1; C2 moves to p3.

Now let us describe the polygonal approximation algorithm for approximating a traced contour, which was proposed by Ramer [27]. Let $p_{i1}, p_{i2}, \ldots, p_{in}$ be the points of the ith contour. Contour segmentation adopts the splitting technique, successively subdividing a segment into two parts until a given criterion is satisfied. If a given curve is to be subdivided, the point furthest from the line segment $p_{i1}p_{in}$ becomes a new vertex and subdivides the curve into two subcurves [24,27]. In Ramer [27], for a closed polygon two oppositely located extremal points are suggested for selection as the initial two points. To unify the algorithm, we additionally check whether $p_{i1}$ is equal to $p_{in}$ in the distance computation. If $p_{i1} = p_{in}$, then the distance of $p_{ij}$ to $p_{i1}p_{in}$ is taken to be the distance of $p_{ij}$ to $p_{i1}$. Now, let us restate this technique as the following contour approximation (CA) algorithm.

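Contour approximation (CA) algorithm
Input: Points $p_{i1}, p_{i2}, \ldots, p_{in}$.
Output: End points of line segments of contours.

Step 1. If $n \le 2$, then return $p_{i1} \cup p_{in}$.

Step 2. For $1 < j < n$, let:

$$e_j = \begin{cases} \text{the distance of } p_{ij} \text{ to } p_{i1} & \text{if } p_{i1} = p_{in} \\ \text{the distance of } p_{ij} \text{ to the line segment } p_{i1}p_{in} & \text{otherwise} \end{cases}$$

Step 3. Let $e_k = \max_{1 < j < n} e_j$. If $e_k$ is less than a predefined threshold $d_2$, then return $p_{i1} \cup p_{in}$; otherwise, return $CA(p_{i1}, \ldots, p_{ik}) \cup p_{ik} \cup CA(p_{ik}, \ldots, p_{in})$.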
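The CA algorithm can be sketched recursively in Python. This is a minimal illustration, not the authors' implementation: points are assumed to be `(x, y)` tuples, the function names are hypothetical, and the distance to the chord is taken as the point-to-segment distance.

```python
import math


def dist_point_segment(p, a, b):
    """Distance from p to segment ab; equals |p - a| when a == b (closed contour)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection parameter so the foot point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def contour_approximation(points, d2):
    """Return the end points of the line segments approximating `points`."""
    n = len(points)
    if n <= 2:                                       # Step 1
        return [points[0], points[-1]]
    errors = [dist_point_segment(points[j], points[0], points[-1])
              for j in range(1, n - 1)]              # Step 2
    k = max(range(1, n - 1), key=lambda j: errors[j - 1])
    if errors[k - 1] < d2:                           # Step 3: close enough
        return [points[0], points[-1]]
    left = contour_approximation(points[:k + 1], d2)
    right = contour_approximation(points[k:], d2)
    return left[:-1] + right                         # keep p_ik only once


open_curve = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
closed_curve = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
print(contour_approximation(open_curve, 0.5))    # [(0, 0), (2, 0), (2, 2)]
print(contour_approximation(closed_curve, 0.5))  # all four corners, closed
```

The closed example exercises the special case $p_{i1} = p_{in}$: the first split is made at the point furthest from the starting point.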


Fig. 9. Line segment representation in different writing styles. (a) Connecting stroke; (b) disconnecting stroke.

After contour approximation, the line segments of the polygon are adopted as the primitives. In this paper, we do not care too much about the sequence of line segments in the polygon. However, each line segment is directional. For the outer contour, the starting and ending points of a line segment are recorded by the tracing direction, whereas the line segments in the interior contour are recorded by the opposite tracing direction. Hence, the direction of a specific segment in different writing styles will always be the same. Fig. 9 shows an example, illustrating that the direction of line segments with a connecting stroke or a disconnecting stroke is consistent.

3. Modelling of line segment matching problem

In this section, we describe various models of the line segment matching problem which include the bipartite weighted matching problem (BWMP), the bipartite weighted matching with penalty problem (BWMPP) and the bipartite weighted matching with geometric constraint problem (BWMGCP). BWMP is the most essential model for the pattern matching problem, and BWMPP and BWMGCP are two models extended from BWMP. BWMPP is an error-correction problem for matching two unordered feature sets. As for BWMGCP, it is a model for matching two unordered feature sets with a geometric constraint.

3.1. BWMP

Let $G = (V, U, E = V \times U)$ be a weighted complete bipartite graph, where $V = \{v_1, v_2, \ldots, v_m\}$ and $U = \{u_1, u_2, \ldots, u_n\}$. Let $w_{ij}$ be the weight of edge $(v_i, u_j)$. A set $M \subseteq E$ is a matching if no vertex is incident with more than one edge of M. If $|V| \le |U|$ and every vertex $v_i \in V$ is incident with an edge of M, then the matching is complete. If $|V| = |U|$ and every vertex $v_i \in V$ is incident with an edge of M, then the matching is perfect [28]. The assignment problem of assigning n jobs to n workers is a special case of the bipartite weighted matching problem when m = n.

This assignment problem (AP) can be formulated in linear programming standard form as follows:

$$\text{minimize} \quad \sum_{i=1}^{m} \sum_{j=1}^{m} w_{ij} x_{ij} \qquad (1)$$

$$\text{subject to} \quad \sum_{i=1}^{m} x_{ij} = 1, \quad \text{for all } 1 \le j \le m \qquad (2)$$

$$\sum_{j=1}^{m} x_{ij} = 1, \quad \text{for all } 1 \le i \le m \qquad (3)$$

$$x_{ij} \ge 0 \qquad (4)$$

In equations (1)-(4), $x_{ij} = 1$ if $v_i$ is matched to $u_j$. For the pattern matching problem, V and U can be regarded as the sets of features of the input and prototype characters, respectively. The weight $w_{ij}$ is the distance between the features $v_i$ and $u_j$. In this paper, the vertices in the bipartite graph are the line segments of polygons and the weights of edges are the distances between the line segments.
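For a small instance, the assignment problem can be solved by simply enumerating all permutations. The Python sketch below is meant only to make the objective of equations (1)-(4) concrete; it is a brute-force stand-in, not the Hungarian method, and the function name and example weights are illustrative assumptions.

```python
from itertools import permutations


def solve_assignment(w):
    """Brute-force optimum of the assignment problem for a square matrix w.

    Returns (cost, perm) where perm[i] is the column matched to row i.
    Only practical for very small matrices.
    """
    n = len(w)
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        cost = sum(w[i][perm[i]] for i in range(n))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


# Hypothetical distances between three input and three prototype segments.
w = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
print(solve_assignment(w))  # (5, (1, 0, 2))
```

In this example the cheapest perfect matching pairs row 0 with column 1, row 1 with column 0 and row 2 with column 2, for a total weight of 5.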

3.2. BWMPP

Since the result generated by polygonal approximation depends upon the threshold of segmentation or the shape of the stroke, it is possible that some line segments in the input character may not appear in the prototype character, and vice versa. An example is shown in Fig. 10. Therefore, BWMP cannot directly model this real world case. In Cheng et al. [15], some dummy features are added to the insufficient side, but this still cannot handle line segments that are missing on both sides. If we restrict the matching to be a complete or perfect matching, forcing a superfluous line segment into a matched pair will hinder the matching of the other line segments. If we do not require a complete or perfect matching, a question arises: which matching minimizes the sum of the weights of matched line segments? Evidently, this optimal matching is the empty set. Hence, unmatched line segments should be penalized. Thus, the matching goal becomes finding a matching such that the sum of the weights of matched line segments and the penalties of unmatched line segments is minimum.

Fig. 10. Line segments of (a) input and (b) prototype characters of the Chinese character 'Tai'.

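Let $s_i$ be the penalty of unmatched vertex $v_i$, and $t_j$ be the penalty of unmatched vertex $u_j$. We can modify equations (1)-(4) to the following equations:

$$\text{minimize} \quad \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} x_{ij} + \sum_{i=1}^{m} \Bigl(1 - \sum_{j=1}^{n} x_{ij}\Bigr) s_i + \sum_{j=1}^{n} \Bigl(1 - \sum_{i=1}^{m} x_{ij}\Bigr) t_j \qquad (5)$$

$$\text{subject to} \quad \sum_{i=1}^{m} x_{ij} = 0 \text{ or } 1, \quad \text{for all } 1 \le j \le n \qquad (6)$$

$$\sum_{j=1}^{n} x_{ij} = 0 \text{ or } 1, \quad \text{for all } 1 \le i \le m \qquad (7)$$

$$x_{ij} \in \{0, 1\} \qquad (8)$$

In equations (5)-(8), $x_{ij} = 1$ if $v_i$ is matched to $u_j$; $\sum_{j=1}^{n} x_{ij} = 0$ if $v_i$ is an unmatched vertex. Similarly, $\sum_{i=1}^{m} x_{ij} = 0$ if $u_j$ is an unmatched vertex.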

Note that this problem is equivalent to finding an optimal solution based on editing transformation [18,19], namely, insertion, deletion and substitution for two unordered sets. The matched pairs are regarded as the substitution transformation; the unmatched vertices in V and U are regarded as the deletion and insertion transformations, respectively.

We transform this problem into an assignment problem later. Though the concept of penalty is a good idea, it is not easy to define $w_{ij}$, $s_i$ and $t_j$ such that an ideal matching can be found. The discussion of the definitions of $w_{ij}$, $s_i$ and $t_j$ when applied to character recognition will be presented in a later section.
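To make the role of the penalties concrete, the following Python sketch evaluates the BWMPP objective (matched weights plus penalties of unmatched vertices) for a given matching. The function name `matching_cost` and the numeric values are illustrative assumptions.

```python
def matching_cost(matching, w, s, t):
    """BWMPP objective: matched weights plus penalties of unmatched vertices.

    matching is a list of (i, j) pairs meaning v_i is matched to u_j.
    """
    matched_v = {i for i, _ in matching}
    matched_u = {j for _, j in matching}
    cost = sum(w[i][j] for i, j in matching)
    cost += sum(s[i] for i in range(len(s)) if i not in matched_v)
    cost += sum(t[j] for j in range(len(t)) if j not in matched_u)
    return cost


# Hypothetical weights: v_0 fits u_0 well, v_1 fits nothing well.
w = [[1, 9],
     [9, 8]]
s = [3, 3]   # penalties for leaving v_i unmatched
t = [3, 3]   # penalties for leaving u_j unmatched

print(matching_cost([], w, s, t))                # 12: everything unmatched
print(matching_cost([(0, 0)], w, s, t))          # 7: best choice here
print(matching_cost([(0, 0), (1, 1)], w, s, t))  # 9: forced perfect matching
```

With these values, leaving the poorly fitting pair unmatched is cheaper than forcing a perfect matching, and the empty matching is no longer optimal because every unmatched vertex is penalized.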

3.3. BWMGCP

As stated in the previous subsection, the optimal matching depends upon the definitions of $w_{ij}$, $s_i$ and $t_j$. Due to the difficulty in defining $w_{ij}$, $s_i$ and $t_j$, we cannot guarantee the generation of an ideal matching from a human's point of view. Therefore, more constraints are needed in our matching problem such that the optimal matching can satisfy our requirements. For example, we restrict a matching to satisfy the geometric relations such that the matching will preserve some structural properties.

In this paper, the geometric constraint imposed only includes the preservation of the orientation of the line segment which connects the midpoints of two line segments. The orientation of a line segment is defined by the counter-clockwise angle spanned between the positive horizontal axis and the vector of the line segment from the starting point to the end point. A set $M \subseteq E$ is a restricted matching if M is a matching and, for each two matched pairs $(v_{i_1}, u_{j_1}), (v_{i_2}, u_{j_2}) \in M$, the difference between the orientation from $v_{i_1}$ to $v_{i_2}$ and the orientation from $u_{j_1}$ to $u_{j_2}$ is smaller than a predefined threshold.

Let $g(v_{i_1}, v_{i_2}, u_{j_1}, u_{j_2})$ be a function that tests whether the preservation of geometric relations holds or not. If it holds, g returns 0; otherwise it returns 1. Let $p_{i_1}$, $p_{i_2}$, $p_{j_1}$ and $p_{j_2}$ be the mid-points of line segments $v_{i_1}$, $v_{i_2}$, $u_{j_1}$ and $u_{j_2}$, respectively. Then

$$g(v_{i_1}, v_{i_2}, u_{j_1}, u_{j_2}) = \begin{cases} 0 & \text{if } \min(|\theta_1 - \theta_2|,\ 360 - |\theta_1 - \theta_2|) < \text{threshold} \\ 1 & \text{otherwise} \end{cases}$$

where $\theta_1$ and $\theta_2$ are the orientation from $p_{i_1}$ to $p_{i_2}$ and the orientation from $p_{j_1}$ to $p_{j_2}$, respectively.

Thus, we can add constraint (9) to our matching problem:

$$g(v_{i_1}, v_{i_2}, u_{j_1}, u_{j_2})\, x_{i_1 j_1} x_{i_2 j_2} = 0, \quad \text{for all } 1 \le i_1, i_2 \le m,\ 1 \le j_1, j_2 \le n \qquad (9)$$
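The test function g can be sketched in Python as below. The sketch assumes that mid-points are `(x, y)` tuples in a coordinate system whose y axis points upwards (so that angles are counter-clockwise), that angles are in degrees, and that the helper names are hypothetical.

```python
import math


def orientation(p, q):
    """Counter-clockwise angle in degrees, in [0, 360), of the vector p -> q."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])) % 360.0


def orientation_difference(theta1, theta2):
    """Smallest absolute difference between two angles given in degrees."""
    d = abs(theta1 - theta2)
    return min(d, 360.0 - d)


def g(mid_v1, mid_v2, mid_u1, mid_u2, threshold):
    """Return 0 if the geometric relation is preserved, otherwise 1."""
    theta1 = orientation(mid_v1, mid_v2)   # input side
    theta2 = orientation(mid_u1, mid_u2)   # prototype side
    return 0 if orientation_difference(theta1, theta2) < threshold else 1


print(orientation_difference(350.0, 10.0))                 # 20.0
print(g((0, 0), (1, 0), (0, 0), (1, 0.1), threshold=30))   # 0: nearly parallel
print(g((0, 0), (1, 0), (0, 0), (-1, 0), threshold=30))    # 1: opposite directions
```

The wrap-around term `360 - d` ensures that orientations such as 350 and 10 degrees are treated as being 20 degrees apart rather than 340.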

4. Greedy matching algorithm

The assignment problem (AP) has a very long history in the area of operations research [28,29]. The Hungarian method is one well-known primal-dual algorithm used in solving the assignment problem. The dual of this linear programming is as follows:

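$$\text{maximize} \quad \sum_{i=1}^{m} \alpha_i + \sum_{j=1}^{n} \beta_j \qquad (10)$$

$$\text{subject to} \quad w_{ij} - \alpha_i - \beta_j \ge 0, \quad \text{for all } 1 \le i \le m \text{ and } 1 \le j \le n \qquad (11)$$

where $\alpha_i$, $\beta_j$ are the dual variables of vertices $v_i$ and $u_j$, respectively. The admissible edges of the primal and the dual are those with $x_{ij} = 1$ and $w_{ij} = \alpha_i + \beta_j$, respectively.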

Recall the previous definitions, where $W = (w_{ij})$ is the weighted matrix, $S = (s_i)$ is the vector of penalties of V, $T = (t_j)$ is the vector of penalties of U, and $|V| = m$, $|U| = n$. The algorithm transforming the bipartite weighted matching with penalty problem to the AP is given as follows, which creates a new matrix from the weighted matrix and penalty vectors:

The reduction algorithm
Input: A weighted matrix W and two penalty vectors S and T.
Output: An augmented matrix W'.

Step 1. Create square matrix S' from vector S by setting the diagonal elements $s'_{ii} = s_i$ and all other elements to $\infty$. Create T' from T by the analogous procedure.

Step 2. Create W' from W, S' and T' by:

$$W' = (w'_{ij}) = \begin{pmatrix} W & S' \\ T' & 0 \end{pmatrix}$$

where 0 is a zero matrix.

From the definitions of W, S' and T', it follows immediately that the numbers of rows and columns of W' both equal m + n, i.e. W' is a square matrix. Suppose x' is an optimal solution obtained by applying the Hungarian method to W'. Let $x_{ij} = x'_{ij}$ for all $i \le m$ and $j \le n$. From the proof of Hsieh et al. [22], we know that x is an optimal solution of equations (5)-(8).

If we add geometric constraints to our matching problem, the optimal solution of equations (5)-(8) is not necessarily a feasible solution of equations (5)-(9). In the dual problem, an admissible edge possesses the property that $w'_{ij} = \alpha_i + \beta_j$. An inadmissible edge has $w'_{ij} > \alpha_i + \beta_j$. Thus, if we increase an entry in W' by some cost, then every old admissible edge except the increased one is still an admissible edge for the new W'. In our previous work [22], a greedy algorithm was proposed for BWMGCP. First, the Hungarian method is applied to obtain an optimal solution without any geometric constraint. Then, the matched pair whose geometric relation with the other matched pairs has the maximum inconsistency is deleted; and finally, a new matching is found by applying the Hungarian method again. The above procedure is repeated until a matching satisfying the geometric constraint is found.
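The reduction can be sketched in Python as follows. The construction of W' follows the block layout given above; the solver, however, is a brute-force permutation search used here only as a stand-in for the Hungarian method, and all function names and example values are illustrative assumptions.

```python
from itertools import permutations

INF = float("inf")


def reduce_to_assignment(w, s, t):
    """Build the (m+n) x (m+n) matrix W' = [[W, S'], [T', 0]]."""
    m, n = len(s), len(t)
    top = [list(w[i]) + [s[i] if k == i else INF for k in range(m)]
           for i in range(m)]
    bottom = [[t[j] if k == j else INF for k in range(n)] + [0] * m
              for j in range(n)]
    return top + bottom


def solve_with_penalty(w, s, t):
    """Solve BWMPP through W'; brute force replaces the Hungarian method."""
    m, n = len(s), len(t)
    wp = reduce_to_assignment(w, s, t)
    size = m + n
    best_cost, best_perm = INF, None
    for perm in permutations(range(size)):
        cost = sum(wp[r][perm[r]] for r in range(size))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    # Keep only x_ij with i <= m and j <= n, i.e. the real matched pairs.
    matching = [(i, best_perm[i]) for i in range(m) if best_perm[i] < n]
    return best_cost, matching


w = [[1, 9],
     [9, 8]]
s = [3, 3]
t = [3, 3]
print(solve_with_penalty(w, s, t))  # (7, [(0, 0)])
```

In the optimal assignment on W', row $v_1$ is assigned to its own penalty column and column $u_1$ receives its own penalty row, which corresponds to leaving both vertices unmatched at a total cost of 1 + 3 + 3 = 7.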

Conventionally, the Hungarian method starts with the empty matching, and then repeatedly increases the optimal matching by a new matched pair until the maximum cardinality matching is obtained. Since an admissible edge possesses the property that $w'_{ij} = \alpha_i + \beta_j$ and an admissible edge is deleted by setting its cost to infinity, the $\alpha_i$, $\beta_j$ and the set of admissible edges except for the deleted edge will not be affected when an admissible edge is deleted. Therefore, searching for a new matching from the updated matrix W' by the Hungarian method can be accomplished by applying only one stage of the Hungarian method (increasing the cardinality of the matching by 1) from the previous matching. The greedy algorithm is thus reorganized as follows:

The greedy algorithm
Input: A matrix W' and some constraints.
Output: A matching x.

Step 1. (Optimal matching module) Set k = 1 and $W^k = W'$. Apply the Hungarian method to solve $W^k$ and let $m^k$ be the current optimal solution.

Step 2. (Evaluation module) For each matched pair $(v_{i_1}, u_{j_1}) \in m^k$, $1 \le i_1 \le m$, $1 \le j_1 \le n$, and each $(v_{i_2}, u_{j_2}) \in m^k$ with $i_2 \ne i_1$ and $j_2 \ne j_1$, if (the length of line segment $v_{i_2}$ $\ge$ the average length of the line segments of the input character) and (the distance between the middle points of the two line segments $v_{i_1}$ and $v_{i_2}$ $> d_3$), then calculate their orientation difference. Average the orientation differences of each matched pair.

Step 3. (Update module) Choose a matched pair, say $(v_{i_1}, u_{j_1})$, with the maximum value from Step 2. If the maximum value is smaller than the preselected threshold $d_4$, go to Step 4. Set $w^k_{i_1 j_1} = \infty$ and delete the matched pair $(v_{i_1}, u_{j_1})$ in $m^k$. Set k = k + 1 and $W^k = W^{k-1}$. Apply one stage of the Hungarian method to solve $W^k$ from $m^{k-1}$ and let $m^k$ be the current optimal solution. Go to Step 2.

Step 4. Let $x_{ij} = m^k_{ij}$ for all $i \le m$ and $j \le n$, then terminate.

The greedy algorithm will always converge to a stable matching which satisfies the geometric constraint. In the worst case, the matching will converge to a matching with one matched pair or empty. Since the original matrix W’ has been updated and the updated entries are discarded, this algorithm will always find a local minimum matching for the original problem of equations (5)-(9).
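The loop of Steps 1 to 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `optimal_matching` is a hypothetical exhaustive search that stands in for the Hungarian method (so every iteration re-solves from scratch instead of applying one stage from the previous matching), and `inconsistency` is a hypothetical callback that returns the averaged orientation difference of one matched pair against the others.

```python
import math
from functools import lru_cache

INF = math.inf


def optimal_matching(w, s, t):
    """Minimise matched costs plus unmatched penalties by exhaustive search.

    w[i][j]: cost of matching input segment i with prototype segment j.
    s[i], t[j]: penalties for leaving a segment unmatched.
    Stand-in for the Hungarian method; only practical for a few segments.
    """
    m, n = len(s), len(t)

    @lru_cache(maxsize=None)
    def solve(i, used):
        # `used` is a bitmask of prototype segments that are already matched.
        if i == m:
            rest = sum(t[j] for j in range(n) if not ((used >> j) & 1))
            return rest, ()
        # Option 1: leave input segment i unmatched and pay its penalty.
        cost, pairs = solve(i + 1, used)
        best = (cost + s[i], pairs)
        # Option 2: match it with any free, non-deleted prototype segment.
        for j in range(n):
            if ((used >> j) & 1) or w[i][j] == INF:
                continue
            cost, pairs = solve(i + 1, used | (1 << j))
            if cost + w[i][j] < best[0]:
                best = (cost + w[i][j], ((i, j),) + pairs)
        return best

    return solve(0, 0)


def greedy_matching(w, s, t, inconsistency, d4):
    """Delete the most inconsistent pair until the matching is stable."""
    w = [row[:] for row in w]  # work on a copy of the cost matrix
    iterations = 0
    while True:
        iterations += 1
        cost, pairs = optimal_matching(w, s, t)
        if len(pairs) < 2:  # nothing left to compare against
            return pairs, cost, iterations
        scores = [inconsistency(p, [q for q in pairs if q != p]) for p in pairs]
        worst = max(range(len(pairs)), key=scores.__getitem__)
        if scores[worst] < d4:  # stable: geometric constraint satisfied
            return pairs, cost, iterations
        i, j = pairs[worst]
        w[i][j] = INF  # forbid the deleted pair in later iterations
```

Because a deleted pair can never return, the number of finite entries shrinks in every unstable iteration, which is why the loop terminates.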

The time complexity of the greedy algorithm is analysed as follows. Since one stage of the Hungarian method (adding one matched pair to the matching) takes $O(n^2)$ time, the time complexity of the Hungarian method for the assignment problem is $O(n^3)$. Therefore, Step 1 in the greedy algorithm can be solved in $O((n + m)^3)$ time.

The maximum cardinality of a matching is $\min(n, m)$. Step 2 evaluates every matched pair against the other matched pairs, which requires $O((\min(n, m))^2)$ time, and averaging the orientation differences requires $O(\min(n, m))$ time; in total Step 2 requires $O(\min(n^2, m^2))$ time. Step 3 chooses the matched pair with the maximum inconsistency, which requires $O(\min(n, m))$ time; deleting the chosen pair requires $O(1)$ time; and finding a new matching by applying one stage of the Hungarian method from the previous matching $m^{k-1}$ requires $O((n + m)^2)$ time. In total Step 3 requires $O((n + m)^2)$ time. Therefore, each iteration from Step 2 to Step 3 requires $O((n + m)^2)$ time. Since the greedy algorithm needs $O(nm)$ iterations in the worst case, the greedy algorithm for the bipartite weighted matching with geometric constraint problem requires $O(nm(n + m)^2)$ time.
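The bounds stated above combine as follows. This only restates the figures in the text; the symbols $T_{\text{Step 1}}$, $T_{\text{iteration}}$ and $T_{\text{total}}$ are introduced here for illustration.

```latex
\[
\begin{aligned}
T_{\text{Step 1}} &= O\big((n+m)^3\big),\\
T_{\text{iteration}} &= O\big(\min(n^2,m^2)\big) + O\big(\min(n,m)\big) + O(1) + O\big((n+m)^2\big)
                      = O\big((n+m)^2\big),\\
T_{\text{total}} &= O\big((n+m)^3\big) + O(nm)\cdot O\big((n+m)^2\big)
                  = O\big(nm\,(n+m)^2\big).
\end{aligned}
\]
```

The last equality uses $n + m \le nm$ for $n, m \ge 2$, so the cost of Step 1 is absorbed by the iteration term.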

5. Distance measure for line segment matching

Before defining $w_{ij}$, $s_i$ and $t_j$, we have to discuss the relations among them. The Hungarian method is a primal-dual algorithm. For the primal problem, if $w_{ij} > s_i + t_j$ (i.e. the distance between two line segments is larger than the sum of their penalties), then entry $x_{ij}$ will not equal 1. Thus, we can set $w_{ij} < s_i + t_j$ when the two line segments are a possible matched pair. Otherwise, we set $w_{ij} > s_i + t_j$.

To punish a line segment, its length is the only information we have at hand. The penalties of the line segments $v_i$ and $u_j$ are defined by:

$s_i = e_i \cdot k_1$

$t_j = e_j \cdot k_2$

where $e_i$, $e_j$ are the lengths of line segments $v_i$ and $u_j$, respectively, and $k_1$, $k_2$ are the weights.

Let $(x_{i1}, y_{i1})$, $(x_{i2}, y_{i2})$ and $(x_{i3}, y_{i3})$ be the starting, middle and end points of a line segment, respectively. The distance between line segments $v_i$ and $u_j$ is defined as follows:

$w_{ij} = w(v_i, u_j) = (e_i + e_j) \cdot k_3 \cdot [(c_1 + c_2 + c_3 + c_4) \cdot k_4 + c_5 \cdot k_5]$

where $k_3$, $k_4$, $k_5$ are the weights, $c_1$, $c_2$, $c_3$ are the Euclidean distances between the starting, middle and end points of $v_i$ and $u_j$, respectively, $c_4$ is the difference of the lengths of line segments $v_i$ and $u_j$, and $c_5$ is the difference of the orientations of line segments $v_i$ and $u_j$.
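The penalties and the distance defined above can be computed as in the following sketch. It rests on assumptions that the text does not fix: a segment is represented by its start and end points only, the middle point is taken as their average, and the orientation difference $c_5$ is measured in degrees between directed segments and folded into [0, 180]. The function names are hypothetical.

```python
import math


def length(seg):
    (x1, y1), (x3, y3) = seg
    return math.hypot(x3 - x1, y3 - y1)


def midpoint(seg):
    (x1, y1), (x3, y3) = seg
    return ((x1 + x3) / 2.0, (y1 + y3) / 2.0)


def point_distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def orientation_difference(a, b):
    """Angle between two directed segments in degrees (assumed convention)."""
    def angle(seg):
        (x1, y1), (x3, y3) = seg
        return math.degrees(math.atan2(y3 - y1, x3 - x1))
    diff = abs(angle(a) - angle(b)) % 360.0
    return min(diff, 360.0 - diff)


def penalty(seg, k):
    """Unmatched penalty s_i = e_i * k1 (or t_j = e_j * k2)."""
    return length(seg) * k


def distance(v, u, k3, k4, k5):
    """w_ij = (e_i + e_j) * k3 * [(c1 + c2 + c3 + c4) * k4 + c5 * k5]."""
    c1 = point_distance(v[0], u[0])                # starting points
    c2 = point_distance(midpoint(v), midpoint(u))  # middle points
    c3 = point_distance(v[1], u[1])                # end points
    c4 = abs(length(v) - length(u))                # length difference
    c5 = orientation_difference(v, u)              # orientation difference
    return (length(v) + length(u)) * k3 * ((c1 + c2 + c3 + c4) * k4 + c5 * k5)
```

For example, with the weights of trial 2 in Table 1 ($k_1 = k_2 = 7$, $k_3 = 4$, $k_4 = 1/100$, $k_5 = 1/80$), two parallel segments of length 10 that are 3 units apart give $w_{ij} = 20 \cdot 4 \cdot 0.09 = 7.2$, well below $s_i + t_j = 140$, so the pair remains a possible match.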

If the matched cost, on average, is larger than the unmatched penalty, then the number of matched pairs will decrease, i.e. the matching process is repressed. Similarly, if the matched cost, on average, is smaller than the unmatched penalty, then the number of matched pairs will increase, i.e. the matching process is encouraged. Since handwritten patterns always come with some distortions, the design concept of the distance function mentioned above is based on the normalization of penalties. The parameters are expected to be selected such that $(c_1 + c_2 + c_3 + c_4) \cdot k_4 \approx 1/2$ and $c_5 \cdot k_5 \approx 1/2$, so that the expression inside the brackets is approximately equal to 1. The term $e_i + e_j$ is twice the average of $e_i$ and $e_j$. Hence, if $k_1 = k_2$ we must set $k_3 < k_1$ to guarantee $w_{ij} < s_i + t_j$. Note that $w_{ij} < s_i + t_j$ means that $v_i$ and $u_j$ are a possible matched pair, but without guarantee.

To avoid impossible matchings, we use the following pruning method to delete the impossible entries. First, each line segment is projected onto the horizontal and vertical lines. Let $\hat{h}_k$ and $\hat{v}_k$ be the profiles in the horizontal and vertical lines, respectively. Let $H_i = \left(\sum_{k \le i} \hat{h}_k / \sum_k \hat{h}_k\right) \times 30$ and $V_i = \left(\sum_{k \le i} \hat{v}_k / \sum_k \hat{v}_k\right) \times 30$. If the number of line segments in the input and prototype characters is smaller than $d_5$, then we set the threshold $d$ to $d_6$; otherwise we set the threshold $d$ to $d_7$. In general, the shift of line segments in complicated characters is less sensitive than that in simple characters. Hence, $d_6 > d_7$. If $|H_{x_{i2}} - H_{x_{j2}}| > d$ or $|V_{y_{i2}} - V_{y_{j2}}| > d$, we set $w_{ij}$ to $\infty$, where $(x_{i2}, y_{i2})$ and $(x_{j2}, y_{j2})$ are the middle points of line segments $v_i$ and $u_j$, respectively.
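The pruning rule can be illustrated with the sketch below. Several details are assumptions rather than statements from the text: the projection profiles are taken as given lists indexed by integer pixel coordinate, each character has its own cumulative profiles, and the way the segment count is obtained for the threshold choice is left to the caller. All function names are hypothetical.

```python
import math


def cumulative_profile(profile, scale=30.0):
    """Normalised running sum of a projection profile, scaled to [0, scale]."""
    total = float(sum(profile))
    running, out = 0.0, []
    for value in profile:
        running += value
        out.append(running / total * scale)
    return out


def choose_threshold(segment_count, d5=30, d6=8, d7=5):
    """Looser threshold d6 for simple characters, tighter d7 otherwise."""
    return d6 if segment_count < d5 else d7


def prune(w, mids_in, mids_proto, h_in, v_in, h_proto, v_proto, d):
    """Set w[i][j] to infinity when the middle points are too far apart
    in the cumulative horizontal or vertical profile."""
    for i, (xi, yi) in enumerate(mids_in):
        for j, (xj, yj) in enumerate(mids_proto):
            too_far_x = abs(h_in[xi] - h_proto[xj]) > d
            too_far_y = abs(v_in[yi] - v_proto[yj]) > d
            if too_far_x or too_far_y:
                w[i][j] = math.inf
    return w
```

Comparing positions in the cumulative profile rather than in raw coordinates makes the test relative to where the ink mass lies, which is presumably why the profiles are normalised to a common scale of 30.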

6. Experimental results

6.1. Prototype characters and testing characters

To validate the proposed algorithm, experiments are conducted to recognize handwritten Chinese postal characters. In total, 51 Chinese postal characters are selected as the prototype characters; these are the names of the postal cities in Taiwan province and the names of postal districts in Taipei city. Both the prototype and the testing samples are selected from part of the Computer and Communication Laboratories handwritten Chinese character image database [30], which contains 5401 characters with each category containing more than 200 variations arranged according to the writing qualities. The writers of the above database include junior high school students, college students and engineers. A 300 dpi image scanner is used in scanning the collection sheets. In this paper, the first 51 samples of each category are selected as the prototype and testing data. Before the simulation, one set of the prototype samples is selected by a specific person according to its quality, such as the clearness and regularity of strokes. The remainder, 50 samples for each category, are chosen as the testing data. Figs. 11 and 12 show the prototype characters in raw images and line segments representation, respectively.

Fig. 11. Raw images of prototype characters.

Fig. 12. Line segments of prototype characters.

6.2. Recognition results

A block diagram of the recognition system is shown in Fig. 1. First, the line segments of the input and prototype characters are generated by preprocessing, contour tracing, polygonal approximation and size normalization. Here, the line segments of both input and prototype characters are normalized to a size of 64 x 64. Second, each input character is matched with all the prototypes by the greedy algorithm without preclassification. The prototype with the minimum cost after the greedy matching is chosen as the recognition result. Before the greedy matching, the costs (weights) of line segments between input and prototype characters are computed first, and then the penalty associated with each line segment is computed. The greedy algorithm is applied after the augmented matrix has been generated. By convention, the first iteration of the greedy matching is referred to as the Hungarian method because it is applied without any geometric constraint.

In the preprocessing and polygonal approximation, we set $d_1 = 9$ and $d_2 = 5$, i.e. a contour with a length smaller than 9 is regarded as noise, and a line segment needs to be split into two line segments in polygonal approximation if there exists a point such that the distance of the point to the line segment is larger than 5. For the parameters in the pruning method, we set $d_5 = 30$, $d_6 = 8$ and $d_7 = 5$ to tolerate the shift of a line segment. As to the other parameters $k_i$ and $d_i$, we use a trial and error method by evaluating the recognition rate when the first 10 samples for each category are tested. The trials are shown in Table 1. From trials 1-6, we find that trial 2 has the best recognition rate from the Hungarian method. Hence, we use the parameters of trial 2 to do trials 7-11. In trials 7-11, trial 8 attains the best recognition rate from the greedy algorithm. Therefore, the parameters in trials 2 and 8 are chosen to do the overall recognition.

Table 1
Trial and error for deciding parameters

Trial   k1 = k2   k3   k4      k5      Rate (%)
1       7         5    1/100   1/80    94.5
2       7         4    1/100   1/80    94.7
3       7         3    1/100   1/80    94.1
4       7         4    l/SO    l/SO    94.1
5       7         4    1/100   1/40    93.1
6       7         4    1/100   1/160   91.8

Trial   d3   d4   Rate (%)
7       16   25   94.9
8       16   20   95.3
9       16   15   95.1
10      8    20   95.1
11      24   20   95.1

Fig. 13. Matching results for the characters depicted in Fig. 10.

Table 2
Recognition result generated by the Hungarian method

Samples   Cumulative recognition rate (%)
          1st order   2nd order   3rd order   5th order   10th order
1-10      94.7        97.3        98.6        100         100
11-20     92.4        96.3        97.6        98.8        99.8
21-30     88.2        93.5        95.1        97.6        99.4
31-40     89.8        94.5        96.9        98.2        98.8
41-50     85.9        92.4        94.9        96.3        98.0
Total     90.2        94.8        96.7        98.2        99.2

Table 3
Recognition result generated by the greedy algorithm

Samples   1st order   2nd order   3rd order
1-10      95.3        98.4        98.6
11-20     93.1        97.3        97.6
21-30     89.8        94.9        95.1
31-40     91.2        95.7        96.9
41-50     89.6        94.1        94.9
Total     91.8        96.1        96.7

Table 4
Number of iterations in greedy algorithm

Samples   Same category   Different category
1-10      2.60            9.33
11-20     2.64            9.75
21-30     2.65            9.10
31-40     2.75            9.26
41-50     2.77            9.34
Average   2.68            9.36

Table 5
Ratio of convergence within the kth iteration in the greedy algorithm

k   Ratio (%)
1   38.8
2   61.7
3   77.3
4   85.9
5   90.1
6   94.0
7   95.7

Table 2 shows the recognition result of the Hungarian method. The recognition rate in the kth order is defined as the rate at which the correct category of the input character appears within rank k. The cumulative rates of the first three orders are already high (96.7% within the third order). Hence, we only use the first three candidates to perform greedy matching until a stable situation is reached. With this refinement, the first-order recognition rate improves by a net 1.61%, made up of 2.75% positive and 1.14% negative changes. Table 3 shows the recognition result after applying the greedy algorithm to the first three candidates for the matching with geometric constraint.
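The cumulative recognition rate in the kth order, as defined above, can be computed as in this short sketch. The function name and the data layout (one ranked candidate list per test sample, best candidate first) are assumptions made for illustration.

```python
def cumulative_rates(rankings, truths, orders=(1, 2, 3, 5, 10)):
    """Percentage of samples whose true category is within the top k candidates.

    rankings: one list of candidate categories per sample, sorted by cost.
    truths:   the true category of each sample.
    """
    rates = {}
    for k in orders:
        hits = sum(1 for ranked, truth in zip(rankings, truths)
                   if truth in ranked[:k])
        rates[k] = 100.0 * hits / len(truths)
    return rates
```

Since the top k candidates always contain the top k - 1, the rates are non-decreasing in k. Re-ranking only the first three candidates, as the greedy stage does, can change the 1st and 2nd order rates but leaves the 3rd order rate untouched.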

Fig. 13 shows the matching result for the input and prototype characters depicted in Fig. 10, where the matched pairs are connected by a solid line. In this example, the matching is stable in the first iteration. Line segments 17, e, g and p are unmatched. Fig. 14 shows the line segments of the polygonal representation of the input and prototype Chinese character ‘Pei’. The matching result generated by the Hungarian method is given in Fig. 15a, where g is the unmatched line segment.

Fig. 14. Line segments of (a) input and (b) prototype characters of the Chinese character ‘Pei’.

Fig. 15. Matching results depicted in Fig. 14: (a) iteration = 1, unstable, delete (14, o); (b) iteration = 2, unstable, delete (15, p); (c) iteration = 3, unstable, delete (9, o); (d) iteration = 4, stable.

While checking the geometric relation, we find that the matched pair (14, o) violates it and therefore has to be deleted. Fig. 15b shows the matching result after the matched pair (14, o) has been deleted. Then (15, p) and (9, o) are deleted in iterations 2 and 3, respectively. Finally, the stable result is obtained in iteration 4, where o is the unmatched line segment. This matching accords with a human’s point of view.

Table 4 shows the number of iterations required by the greedy algorithm. On average, fewer iterations are needed when the input character belongs to the same category as the prototype. The ratio of convergence within the kth iteration (i.e. the ratio of input characters whose matching with the prototype of the same category becomes stable within k iterations) is shown in Table 5.

Some of the testing samples with correct recognition and incorrect recognition are shown in Figs. 16 and 17, respectively. The recognition errors occurring in the experiment are mainly due to the following reasons:

1. One line segment is divided into more than two line segments in the polygonal approximation, such as line segments 13 and 17 in Fig. 10.

2. One line segment is split into two line segments due to stroke connection or low quality of the image, such as line segments h and d in Fig. 10, which may be merged to form a new segment if the first two strokes do not connect.

3. The pruning process for impossible matching may overshoot so that the matching of line segments may result in errors.

4. The characters are very similar in polygonal representation, such as the characters of the sequence 5 and 15 in the prototype characters.

5. The variation of the orientation of some specific line segments is very wide due to the writing styles of different writers.


Fig. 16. Some testing samples with correct recognition.

7. Conclusion

In this paper, a matching algorithm based on the extension of bipartite weighted matching for solving the polygonal arcs matching (boundary line segments matching) with a geometric constraint is proposed to recognize handwritten Chinese postal characters. After preprocessing, contour tracing, polygonal approximation and size normalization, the line segments of a contour polygon are obtained. The distance measure between the line segments of input and prototype characters is defined by the length, coordinates and orientation of a line segment. It is possible that some line segments in the input character may not appear in the prototype character, or vice versa. Thus, the Hungarian method cannot be applied directly. To overcome this drawback, we introduce the concept of unmatched line segments associated with a penalty. Thus, our matching goal becomes one of finding a matching such that the sum of the costs of matched line segments and the penalties of unmatched line segments is minimum. The matching result generated by the Hungarian method depends on the given costs (weights) and penalties. To obtain a good matching, the distances between line segments and the penalties of line segments are designed carefully. Moreover, a greedy algorithm based on the Hungarian method is proposed to restrict the optimal matching to one satisfying the constraint of a geometric relation. In each iteration of the greedy algorithm, the matched pair whose geometric relation to the other matched pairs has the maximum inconsistency is deleted, and then a new matching is found by applying the Hungarian method again. By repeating this procedure, we find a stable matching that preserves the geometric relation.

Fig. 17 panel labels: (1→51), (2→47), (3→13), (4→32), (5→37).

To verify the validity of the proposed approach, experiments were conducted to recognize handwritten Chinese postal characters. In our experiments, 51 Chinese postal characters were selected as the prototype characters. The recognition rate is about 91.8%, which is tested on 2550 samples. The second and third cumulative recognition rates are 96.1% and 96.7%, respectively. The experimental results reveal the feasibility of the new approach for handwritten Chinese postal character recognition.

In future, the following issues need to be studied to improve the performance:

1. Improve the polygonal approximation technique to obtain more stable features, such as fixed-length line segments.

2. Add more constraints into the matching problem, such as parallelism, intersection, etc., to obtain more reasonable matching results and improve the recognition rate.

3. Design the cost and penalty in a more sophisticated way such that the geometric relation can be really incorporated in the distance of line segments.

4. Restrict the iteration number for some candidates with a larger cost.

5. Propose a more efficient heuristic method.

6. Add an alternative prototype into the database for the wide variation of characters with different writing styles.

Fig. 17. Incorrect recognition. The (p → q) shows that p is the original pattern and q is the recognized pattern.

References

[1] H. Bunke and A. Sanfeliu (eds.), Syntactic and Structural Pattern Recognition: Theory and Applications, World Scientific, Singapore, 1990.

[2] V.K. Govindan and A.P. Shivaprasad, Character recognition - a review, Patt. Recogn., 23 (1990) 671.

[3] S. Impedovo, L. Ottaviano and S. Occhinegro, Optical character recognition - a survey, Int. J. Patt. Recogn. Artif. Intell., 5 (1991) 1.

[4] S. Mori, C.Y. Suen and K. Yamamoto, Historical review of OCR research and development, Proc. IEEE, 80 (1992) 1029.

[5] C.C. Tappert, C.Y. Suen and T. Wakahara, The state of the art in on-line handwriting recognition, IEEE Trans. PAMI, 12 (1990) 787.

[6] S.L. Xie and M. Suk, On machine recognition of hand-printed Chinese characters by feature relaxation, Patt. Recogn., 21 (1988) 1.

[7] K.P. Chan and Y.S. Cheung, Fuzzy-attribute graph with application to Chinese character recognition, IEEE Trans. Syst., Man, Cybern., 22 (1992) 175. (Erratum: IEEE Trans. Syst., Man, Cybern., 22 (1992) 402).

[8] L.H. Chen and J.R. Lieh, Handwritten character recognition using a 2-layer random graph model by relaxation matching, Patt. Recogn., 23 (1990) 1189.

[9] S.W. Lu, Y. Ren and C.Y. Suen, Hierarchical attributed graph representation and recognition of handwritten Chinese characters, Patt. Recogn., 23 (1991) 617.

[10] T.H. Hildebrandt and W. Liu, Optical recognition of handwritten Chinese characters: advances since 1980, Patt. Recogn., 26 (1993) 205.

[11] K. Yamamoto and A. Rosenfeld, Recognition of hand-printed Kanji characters by a relaxation method, Proc. 6th Int. Conf. Patt. Recogn. (1982) 395.

[12] L. Lam and C.Y. Suen, Structural classification and relaxation matching of totally unconstrained handwritten Zip-code numbers, Patt. Recogn., 21 (1988) 19.

[13] C.H. Leung, Y.S. Cheung and Y.L. Wong, A knowledge-based stroke-matching method for Chinese character recognition, IEEE Trans. Syst., Man, Cybern., 17 (1992) 993.

[14] S.L. Chou and W.H. Tsai, Recognizing handwritten Chinese characters by stroke-segment matching using an iteration scheme, Int. J. Patt. Recogn. Artif. Intell., 5 (1991) 175.

[15] F.H. Cheng, W.H. Hsu and C.A. Chen, Fuzzy approach to solve the recognition problem of handwritten Chinese characters, Patt. Recogn., 22 (1989) 133.

[16] T. Wakahara and M. Umeda, Stroke-number and stroke-order free on-line character recognition by selective stroke linkage method, Proc. 4th Int. Conf. Text Processing (1983) 157.

[17] T. Wakahara, On-line cursive script recognition using local affine transformation, Proc. 9th Int. Conf. Patt. Recogn. (1988) 1131.

[18] R.A. Wagner and M.J. Fischer, The string-to-string correction problem, J. ACM, 21 (1974) 168.

[19] S.M. Selkow, The tree-to-tree editing problem, Infor. Process. Lett., 6 (1977) 184.

[20] W.H. Tsai and K.S. Fu, Error-correcting isomorphisms of attributed relational graphs for pattern analysis, IEEE Trans. Syst., Man, Cybern., 9 (1979) 757.

[21] M. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, San Francisco, CA (1979).

[22] A.J. Hsieh, K.C. Fan and T.I. Fan, Bipartite weighted matching for on-line handwritten Chinese character recognition, Patt. Recogn., 28 (1995) 143.

[23] N.J. Naccache and R. Shinghal, SPTA: a proposed algorithm for thinning binary patterns, IEEE Trans. Syst., Man, Cybern., 14 (1984) 409.

[24] R.C. Gonzalez and P. Wintz, Digital Image Processing, 2nd ed, Addison-Wesley, Reading, MA (1987).

[25] C.J. Hilditch, Linear skeletons from square cupboards, in B. Meltzer and D. Michie (eds.), Machine Intelligence, Elsevier, New York (1969).

[26] I. Sobel, Neighborhood coding of binary images for fast contour following and general binary array processing, Comput. Graphics Image Process., 8 (1978) 127.

[27] U. Ramer, An iterative procedure for the polygonal approximation of plane curves, Comput. Graphics Image Process., 1 (1972) 244.

[28] C.H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall, Englewood Cliffs, NJ (1982).

[29] E.L. Lawler, Combinatorial Optimization: Networks and Matroids, Holt, Rinehart & Winston, New York (1976).

[30] L.T. Tu, Y.S. Lin, C.P. Yeh, I.S. Shyu, J.L. Wang, K.H. Joe and W.W. Lin, Recognition of handprinted Chinese characters by feature matching, Proc. Int. Conf. Comput. Process. of Chinese and Oriental Languages (1991) 154.

[31] T. Pavlidis, Algorithms for Graphics and Image Processing, Computer Science Press, Rockville, MD (1982).
