a prototype for brazilian bankcheck recognitionalekoe/papers/koerich_ijprai_1997.pdf · numeric...

A PROTOTYPE FOR BRAZILIAN BANKCHECK RECOGNITION

Luan L. Lee, Miguel G. Lizarraga, Natanael R. Gomes and Alessandro L. Koerich

Faculty of Electrical and Computer Engineering, State University of Campinas

P.O. 6101, 13083-970, Campinas, SP, Brazil

ABSTRACT

This paper describes a prototype for Brazilian bankcheck recognition. The description is divided into three topics:

bankcheck information extraction, digit amount recognition and signature verification. In bankcheck information

extraction, our algorithms provide signature and digit amount images free of background pattern and bankcheck printed

information. In digit amount recognition, we dealt with the digit amount segmentation and implementation of a complete

numeric character recognition system involving image processing, feature extraction and neural classification. In

signature verification, we designed and implemented a static signature verification system suitable for banking and

commercial applications. Our signature verification algorithm is capable of detecting both simple, random and skilled

forgeries. The proposed automatic bankcheck recognition prototype was intensively tested by real bankcheck data as well

as simulated data providing the following performance results: for skilled forgeries, 4.7% equal error rate; for random

forgeries, zero Type I error and 7.3% Type II error; for bankcheck numerals, 92.7% correct recognition rate.

Keywords: Bankcheck Recognition, Signature Verification, Numeric Character Recognition, Image

Processing.

1. INTRODUCTION

Aiming at automated banking services, automatic bankcheck information processing and recognition

have been active areas of research in the past few years [1-9]. In this paper we describe a prototype for

automatic Brazilian bankcheck recognition that recognizes or identifies both printed and filled

2

information on a bankcheck automatically. The choice of Brazilian bankchecks for testing the proposed

bankcheck recognition system is merely due to the convenience. The organization of this paper is as

follows. Section 2 is devoted to a brief description of Brazilian bankcheck characteristics. Section 3

presents the bankcheck information extraction process consisting of: bankcheck image acquisition,

bankcheck identification, printed pattern generation, background and printed pattern elimination, and

digit amount and signature images extraction. Section 4 describes with considerable details of the

proposed digit amount recognition process including digit amount segmentations, numeral image

preprocessing, feature extraction and numeral identification. Mathematical morphology was a main tool

for the segmentation, the image processing and the feature extraction. A sequential decision procedure

was implemented for classification including the use of Hopfield neural networks. For signature

verification we rely heavily on mathematical morphological techniques in signature image processing and

feature extraction (Section 5). The Euclidean distance measure and multilayer perceptron were used to

quantify dissimilarity between two handwritten signatures.

2. BRAZILIAN BANKCHECKS AND THEIR CHARACTERISTICS

Brazilian bankchecks present a complex layout with considerable variations in color, pictures and

stylistic characters or symbols, but all having standard size measured approximately 17.1cm by 7.5cm.

Fig. 1.a shows the 256-gray level image of a Brazilian bankcheck at 200 dpi spatial resolution. In terms

of bankcheck information, the layout can be partitioned into 9 regions as shown in Fig. 1.b. In the first

top row several strings of printed numerals and alphabets are found. Some of these strings are used to

identify the bank, the agency, the account number and the check number while the others are

information just for internal banking processing. On the most right hand side of the first row, an area is

reserved for the courtesy (or numeric) amount filling. In the second and the third rows of the check,

there are two baselines for the legal (or worded) amount filling. The fourth row is reserved for the

payee’s name which can be optional. On the right hand side of the fifth row, we find a reserved area for

the name of the city and the date of check filling. Specific fields are reserved for day, month and year

information. In general the financial institution’s name, the agency’s name and its address are printed on

left hand side of the check just below the fifth row. The area, on the right hand side just below the fifth

3

row is reserved for handwritten signatures, where a short printed line functioning as a guideline for

signature writing can be found. It is required that the customer’s name and his tax identification number

(CPF) be printed just below the signature guideline. Finally in the bottom row all identification numbers

printed in the first row of the check are encoded and printed in the CMC-7 font via a MICR process

(Magnetic Ink Character Recognition).

(a) (b)

(c)

Figure 1: (a) A Brazilian bankcheck image (b) The bankcheck information layout (c) A image of the printed information.

In terms of information characteristics, we classify patterns appeared on a bankcheck into four

categories: background, fixed printed information, variable printed information and filled

information . Fig. 2.a illustrates abstract division of these information categories. Background refers to

the check’s colored background pictures and drawings. Fixed information regards the financial

institution’s name and identification numbers, printed horizontal and vertical lines on the check.

Variable printed information consists of all information with respect to the agency and the customer’s

personal information on the check. Finally filled information represents the information introduced by

the bank’s customer, such as the courtesy amount, the legal amount, the date, the city’s name, the

payee’s name and his/her signature [10].

4

3. BANKCHECK INFORMATION EXTRACTION

This section is devoted to the description of our bankcheck information extraction algorithm consisting

basically of the following steps: bankcheck image information acquisition and information identification,

position adjustment, printed information image generation, background generation, background and

printed information elimination, morphological filtering and information extraction. Fig. 2.b illustrates

the proposed bankcheck information extraction process via a functional block diagram.

(a) (b)

Figure 2: (a) An abstraction of information division (b) The bankcheck information extraction process.

3.1 Image Acquisition and Information Identification

The data acquistion instruments used in our bankcheck recognition system are: a scanner and a CMC-7

font reader. The scanner captures a digitized image of a check, denoted by IM(x,y) while the CMC-7

5

reader seeks to interpret coded information identifying the bank name, agency, account number and

bankcheck number. Via a magnetic information retrieval means we are able to identify the printed

information with almost zero mis-recognition rate, and possibly achieve on-line recognition speed.

Under the 200 dpi and 256 gray-level resolution, a digitized bankcheck image requires approximately

833 Kbyte memory space size.

3.2 Position Adjustment

The digitized image IM(x,y) may be physically rotated and/or shifted with respect to the generated

bankcheck sample. To adjust the image position, we adapted the method proposed by Akiyamal and

Hagita [11] which is based the vertical and horizontal projection profiles. The position adjustment

algorithm is as follows. Step 1: Obtain the vertical profile of IM(x,y) and locate the bankcheck’s first

horizontal baseline as a reference for both rotation and vertical adjustment. Step 2: Rotate the image

IM(x,y) little by little until the sharpest horizontal reference base line is detected. Step 3: shift vertically

the image IM(x,y) until the reference baseline in IM(x,y) coincides with that in the generated image. Step

4: Shift horizontally the image IM(x,y) until the maximum match occurs between IM(x,y) and the

generated sample image.

3.3 Printed Information Image Generation:

To recover the filled information from a bankcheck, recently Yoshimura, et al. have suggested the

subtraction of the image of the unused bankcheck from the filled bankcheck image [12]. A main

drawback of this method is that, for each filled bankcheck, the recognition system has to maintain an

unused bankcheck image sample requiring, therefore, 833 Kbyte memory size for each bankcheck. In

addition, we have found that both thresholding and some sophisticated techniques proposed by Kamel,

etc. [13] are unable to provide satisfactory results, specially when the printed information presents a

similar or higher gray-scale level than that of the filled parts.

For each bankcheck, our information extraction unit generates a binary image IP(x,y) which

corresponds a bankcheck image containing only the printed information (both the fixed and variable

printed information), and uses it to eliminate undesirable or redundant patterns. As a result, to generate

a binary printed information pattern, we keep the following information in our database: location, font

and size (dimension) of each symbol, character and geometry object on a bankcheck [10]. We have

6

found that the average memory space needed for the printed information patterns is of approximately 1.5

Kbytes. Fig. 1.c shows a generated printed information image for the bankcheck image in Fig. 1.a.

3.4 Background and Printed Information Elimination

To eliminate the background pattern on a bankcheck, we subtract a stored background sample image

IT(x,y) from the position-adjusted image IM’(x,y) resulting in a image (IB(x,y)) without background.

Although a background image may occupy as much as 700 Kbyte memory space , such an approach is

perfectly plausible since all Brazilian bankchecks issued by the same bank possess a common

background pattern. As a consequence, relatively no additional memory space is required because only

one background image needed to be saved, however, can be used by every customers. To recover the

image IF(x,y) which contains only the filled information we subtract the generated image IP(x,y) of

printed information from the image IB(x,y).

3.5 Morphological Filtering

We are rarely able to remove completely background and printed information from a bankcheck image

obtaining an image containing only filled information. This problem is mainly due to the variations of

gray levels in the printed information and no perfect position matches between images IT(x,y) and

IM’(x,y). Parts of the background pattern and printed information that the system fails to remove is called

residual background noise generally characterized by isolated spots. To eliminate the residual

background noise we apply a morphological erosion operator using a 2 x 2 structuring element of 1’s.

Although the erosion operation can remove easily residual noise, at the same time, the operation

may degrade the filled information pattern. In fact, the printed information elimination procedure may

also distort the filled information pattern if two type of information physically overlap. Commonly

observable degradation in the filled information is a connected stroke or line segment being broken into

two or several parts. To solve this problem, we apply a dilation operator after erosion using a 3 x 3

structuring element of 1’s.

3.6 Information Extraction

In this work we are interested in recognizing two pieces of data in the filled information of a bankckeck:

the courtesy amounts and the customer’s signature. Since Brazilian bankchecks have a standard layout,

7

the two areas (the signature area and courtesy amount area) of our interest can be easily delimited. The

rectangles that measure 9.7cm by 2 cm and 4.5cm and 0.8cm were chosen to isolate signatures and

courtesy amounts.

4. DIGIT AMOUNT RECOGNITION

The recognition of handwritten characters has been intensively investigated and many recognition

methods proposed in the literature have been implemented, and have achieved very high recognition

rates (>99%) [14-17]. Recently some research papers have developed new techniques for numeric

amount recognition on bankchecks [2,4,5,9]. In this section we describe our subsystem of courtesy

amount recognition in the following functional sequence: numeric amount segmentation, image

processing, feature extraction and numeral classification.

4.1 Numeric Amount Segmentation

The procedure of the numeric amount segmentation isolates each connected component and detects

symbols other than numerals. Symbols like commas, periods and dash lines can be easily detected

because of their low heights relative to adjacent numerals in a courtesy amount. Another way to

determine the position of a comma is via contextual information if none of two numerals after a comma

are omitted. Detecting symbols like “$”, “#”, “(“ and “)” in a digit amount is by no mean trivial.

Although the neural classifier proposed in our digit recognition system is capable of detecting symbols

other than digits, we strongly believe that sure detection of these symbols should rely mainly on

information in the legal amount. We assumed that only digits, commas and periods were present in a

numeric amount. Our segmentation algorithm executes the following tasks: (1) separating disconnected

components; (2) enclosing each connected component with the smallest rectangle; (3) detecting the

comma and periods; (4) estimating the width of each enclosing rectangle and compare it to the average

width of enclosed boxes; (5) for each rectangle with unusually large width, applying morphological

convolution masks to determine the connected point and separate the connected elements.

8

4.2 Image Preprocessing

Those isolated numerals are still not suitable for recognition due to most of those images being

unconstrained handwritten numerals. Experimental investigation suggests further image processing. The

preprocessing of numeral images considered consists of three steps: skeletonization, dilation and scale

normalization.

Skeletonization: Our thinning algorithm finds the skeleton of a numeral image based on the following

principles: maintaining end points; avoiding excessive local erosion; preserving connections in images.

Fig. 3.b shows a skeleton of a numeral “4” obtained from a enclosed numeral shown in Fig. 3.a. Like

most thinning processes, the skeletonization creates spurious line segments, the so-called skeleton

parasitic branches (see Fig. 3.b).

Skeleton parasitic branches should be removed from a skeleton image in order to maintain

extracted features with high discriminating capability and the reliability. To eliminate skeleton parasitic

branches, we appeal to the mask-shifting technique. The technique consists of matching the skeleton

image to a convolution mask, and removing the central pixel of the matched region every time a match

occurs. Figures 3.c and 3.e show, respectively, the numeral “4” without spurious elements and the seven

spatial masks used. Experimental investigation demonstrates that long parasitic branches (those with

more than two pixels) rarely appear in the skeletons of handwritten characters. As a result, only two

iterations normally are needed to obtain a skeleton without parasitic branches. The algorithm proposed

here presents some desirable properties: low complexity, short processing time and minimum distortion.

(a) (b) (c) (d) (e)

Figure 3: (a) Handwritten numeral “4” (b) The skeleton with spurious a line segment (c) The skeleton without spurious

elements (d) The dilated image (e) Seven convolution masks used to eliminate skeleton parasitic branches.

9

Dilation: Skeletonized pattern images which present a high disparity in numbers of black and white pixels

are not suitable for neural classification. Our solution to this problem is to increase the width of strokes

by dilation. The dilation of an image X by a structuring element E can be defined as

( )X E x E Xx

⊕ = ∩ ≠ ∅

∧| [18]. In this work two structuring elements ([1 0] and [0 1]) are used to dilate

the skeleton one pixel to the right and one pixel to the left. Fig. 3.d shows the image of the dilated

numeral “4” from that in Figure 3.c.

Size normalization : The size normalization is necessary to avoid different numeral images presenting

different sizes caused by skeletonization and dilation. Variations in image size complicate considerably

the design task of classification algorithms, and compromise good classifier performance. Basically, the

size normalization is carried out by an AND operator between two consecutive lines or columns. In

other words, image size amplification or reduction are based on the following strategy: the operator

AND producing a white pixel (“1”) if all pixels involved in AND operation were white; otherwise, a

black pixel (“0”) will be generated [18]. Here the convention is a black pixel being denoted by bit “0”

while a white pixel by bit “1”. We conclude the image preprocessing stage with each numeral image

represented by a 19 x 21 binary matrix which is in a format prompt for feature extraction.

4.3 Feature Extraction

The set of features used in our bankcheck digit recognition system is divided into two groups. The first

group consists of features extracted from numeral images, while the second group by the proper

normalized numeral images. The use of two groups of features is due to the fact that the classification

procedure suggested in this work is executed in two sequential steps, namely preclassification and neural

classification, using, respectively, two groups of sets above described.

The first group of features in the preclassification stage is as the following (1) Number of central

holes; (2) Number of right holes; (3) Number of left holes; (4) Number of intersections with the principal

axis(PA); (5) Number of intersections with the secondary axis (SA); (6) Crossing sequences; (7)

Relative location of each hole in the image (up, down, left, right); (8) Relative location of each

intersection (top, bottom); (9) Individual digit distinctive features. Fig. 4.a shows the principal and

secondary axes of an image “6”. The principal axis is defined as a vertical line segment drawn through

the center of gravity. The secondary axis is a vertical line equidistant between the principal axis and the

10

very first left column in the image. In this example, both the principal and secondary axes intersect

numeral “6” twice. Fig. 4.b illustrates the concept of central, left and right holes via an image of numeral

“6”. Note that these features represent an image’s background characteristics. In Fig. 4.c we add the

horizontal and vertical crossing sequences located on right hand side and bottom of the image of

numeral “6”, respectively. Each element of these sequences measures the number of blocks of black

pixels in its corresponding row line or column line. For example, a useful feature used to distinguish

numeral “2” from numeral “8” is a distance measure between the crossing sequences of an unknown

numeral image and the reference crossing sequences of each one of the two numeral classes (“2” and

“8”). An unknown numeral image is classified into the numeral class which offers the smallest distance

measure. We adopt the linear distance measure for crossing sequences and make no attempt to

determine how useful this distance measure is, although intuition suggests that many other distance

measure functions would also discriminate numerals with variable degree of discrimination. Fig. 4.d

illustrates an example of how distinctive features is used to characterize numerals “3” and “7”.

(a) (b) (c) (d)

Figure 4: (a) The principal and secondary axes (b) Holes position (c) Crossing sequences (d) Distinctive features of

numerals “3” and “7”

Via distinctive features, Figures 5.a, b, c, and e also show, respectively, how to (a) identify

numeral “6”, (b) differentiate numerals “2”, “3” and “5”; (c) distinguish numerals “1” and “4” and (d)

differentiate numerals “4” and “7 by verifying the existence of black pixels in the neighborhood of the

“arrows” between intersection points between the numeral and its principal axis (PA) or its secondary

axis (SA). The distinctive features used to differentiate numerals “1” and “7” shown in Fig. 5.d are: the

number of black pixels in each two and relative position of intersections. In Fig. 5.f we circled the

distinctive features used to discriminate numeral “9” from numeral “4”. It is highly probable that the

11

many of these features or their similar versions were suggested and widely used by many other

researchers; however, the design and implementation of our image processing and feature extraction

algorithms, and the use of these features in our recognition procedures are entirely our own

contributions. In fact, good recognition results can be achieved, if we are able to combine intelligently

good features with efficient decision rules.

(a) (b) (c) (d) (e) (f)

Figure 5: Distinctive features. (a) Numerals “6” (b) Numerals “2”, “3” and “5” (c) Numerals “1” and “4” (e) Numerals “4”

and “7” (e) Numerals “1” and “7” (f) Numerals “9” and “4”.

4.4 Numeral Classification

As mentioned before, the classification procedure is divided into two stages: preclassification and neural

classification. A sequential approach for the recognition of unknown numerals was adopted for the

preclassification stage. Fig. 6 shows explicitly the recognition algorithm used in the preclassification via

a tree diagram. An unknown numeral will be recognized if it successfully passes through a series of tests

along one of the paths in the tree diagram and ends up in a numeral class.

Unrecognizable numerals in the preclassification stage thus undergo the second classification test,

the neural classification, which involves four specialized Hopfield neural networks. We trained each

network with digits with different styles. Fig. 7 shows the neural classifier in a block diagram. We also

divided the neural classification into two steps (Block 1 and Block 2), both involving Hopfield networks

with different complexity and purposes. The first step, Block 1, has the goals of recognizing most of

numerals and forming groups of numerals having similar characteristics. In other words, an unknown

image is classified into one of the three groups, denoted by {1, 4, 7}, {2, 3, 8, 9} and {0, 5, 6},

respectively. For example, group {1, 4, 7} contains numeral classes “1”, “4” and “7”. We group numeral

classes in such a way because we believe that they have similar characteristics. For example, numerals

“1” and “7” are frequently misclassified if the initial stroke presents an unusual length or inclination. In

this case In order to distinguish “1” from “7”, we need a better classifier which is capable of exploring

highly their minor differences and use them as discriminating features.

12

true

41 1 1 7 4 7 2 3 5H H

One Intersectionwith Secondary

Axis

Number of RightHoles (N R)

Number of Intersectionswith Principal Axis (N I)

DistinctiveFeatures

Distinctive Features



Number of Left Holes (N L )

One Intersectionwith Principal

Axis

Two Intersectionswith

Principal Axis

Bottom

Top

0 6 4 9 2 8

AllExtensionTrue

False

H

Number of CentralHoles (N C )

Central HolePosit ion

DistinctiveFeatures

N C =0

DistinctiveFeatures

CrossingSequence

True

HHopfield

Neural Nets

HH

INPUT DIGIT IMAGE

True

false

false

others

false

N I ≠ 3

N I = 3

N R = 2

N R = 1

N L = 1

N L = 2

N C ≠ 1 2an d

H

N C = 2

N C = 1

Figure 6: The preclassification algorithm for numeral images.

Fig. 7. The numerical neural classification

In Block 2 three Hopfield networks are used, each one trained by the training patterns of one of

the three numeral groups, (1,4,7), (2, 3, 8, 9) and (0, 5, 6). Note that, when an image is introduced into

a Hopfield network, after many successive iterations, the network eventually ends up in a stable state in

which no more change occurs in the output image. In most cases, an output image is not identical to any

13

template pattern. As a result, at the output of each network a decision rule is performed to map the

output image to one of the numeral classes involved. Here we introduce some notations. Let matrices

G=[g(i,j)] 20x19 and Fk=[f k(i,j)] 20x19 denote the output image produced by the Hopfield network and a

template image of numeral “k”, respectively. Therefore, g(i,j) stands for the image gray-level in the pixel

with the spatial coordinate (i,j). The decision rule which classifies an unknown numeral image into a

group of numerals consists of the following steps: (1) Compute distance D(G,Fk) between the images G

and Fk; (2) Find k which provides the smallest D(G,Fk); (3) Finally classify image G into a class to which

the numeral k belongs. The distance measure D(G,Fk) is computed as follows.

1. Scan image G and Fk line by line.

2. In line i (i=1,2, .., 20) and for each black pixel of image g(i,j)∈G find a black pixel f i mk ( , ) ∈ F such that |j-m| is

smallest. Mathematically, we have d i j j mgm m K f i mk

( , ) min | |{ : , ( , ) }

= −≤ ≤ =1 1

.

3. For each unused black pixels f k in line i we compute d i j j mfm m K g i mk

( , ) min | |{ : , ( , ) }

= −≤ ≤ =1 1

.

4. Set d i jfk( , ) = 0 and d i jg ( , ) = 0 for white pixels.

5. The distance measure ( )D g fk, then is obtained byD g f d i j d i jk g fji

k( , ) [ ( , ) ( , )}= +∑∑

Note that conceptually the distance measure ( )D g fk, shows how two numeral images differs and

the degree of distortion a numeral suffers with respect to its template pattern.

5. STATIC SIGNATURE VERIFICATION

As for the static signature verification methods, there are many published works, most of which are

reviewed in [19,20]. In this section we described our static signature verification system which is capable

of detecting efficiently both simple/random and statically skilled forgeries. A forgery is considered

simple when the forger makes no attempt to simulate or trace a genuine signature; A forgery is regarded

as random when the forger uses his/her own signature for testing; finally, in skilled forgery, the forgery

tries and practices imitating as closely as possible the static and dynamic information of the genuine

signature [19]. When a forger focus his attention only on signature’s static information in the skilled

14

forgery, we regard the imitation as a statically skilled forgery [21]. The discussion about our signature

verification method is divided into three topics: signature image preprocessing, feature extraction

and signature verification.

5.1. Signature Image Preprocessing

Human signatures can be viewed as special handwriting because the contextual information may not be

fully interpreted, and frequently the handwriting is highly stylistic. It has been observed that American

signatures are relatively more stylistic than European signatures [22]. The static signature verification

method proposed in this work does not resort to signature contextual information. Instead, a static

signature is regarded as a drawn pattern consisting of interlaced strokes and line segments in a bi-

dimensional binary image. In our bankckeck recognition system, the signature verification unit receives

images directly from the bankcheck information extraction unit described in Section 3. Although the

inputted signature images are free of the background noise, they still need some processing before

features can be extracted. Basically two major operations are carried out in this preprocessing stage:

signature enclosing and size normalization. We do not need the horizontal alignment of the signature

because a horizontal guideline for signature handwriting is provided both on Brazilian bankchecks and

bank account registration forms.

Signature enclosing: The signature enclosing operation search the smallest box that covers the significant

parts of a signature image. The significant parts of a signature can be determined from two histograms,

H xx( ) and H yy( ) , that measure frequencies of black pixels in the horizontal direction (x) and the vertical

direction (y), respectively. Let (xini , xend ) and (yini , yend ) denote, respectively, the initial and the end

coordinates of the enclosing box in horizontal (x) and vertical (y) axes. They are found as follows.

x x z x H z Tini x= ∀ < <max{ : , ( ) } x x z x H z Tend x= ∀ > <min{ : , ( ) }

y y z x H z Tini y= ∀ < <max{ : , ( ) } y y z y H z Tend y= ∀ > <min{ : , ( ) }

In general, parts of a signature image eliminated from enclosing are located on the outskirts of the

signature, and vary largely in size and shape. Thus, the features extracted from the enclosed signature

15

image are expected possessing a higher degree of consistency than those from unbounded images. In this

work, we adopt T= 10 pixels as the decision threshold for the significant parts of signature images.

Size normalization: The signature size normalization operation consists of finding a suitable spatial

resolution for the signature image and standardizing its size. It has been observed that some features

extracted from an image with a very high spatial resolution are sensitive to noise introduced by the

variation in inputting instruments or handwriting. On the other hand, when a low resolution is chosen,

although all genuine signatures just look like the same, it is extremely hard to distinguish a genuine

signature from its skilled imitation. Therefore, based on this observation we adopted a normalized

signature having its length (X_f) fixed at 256 pixels and its width (Y_f) linearly proportional to X_f; i. e.,

Y fY sig

X sig_

_

_= 256

where X_sig and Y_sig denote the length and height of the enclosing box, respectively.

5.2 Feature Extraction

Two kinds of information in signature images were extracted for the implementation of the set of

features used in this work: signature stroke orientation and envelope orientation of a dilated signature.

To extract these signature features we resort to the mathematical morphology. The mathematical

morphology can be considered as a tool for extracting image components useful for the representation

and description of object geometrically [18].

Signature stroke orientation: The first feature set is with respect to the stroke orientation in a signature

image. Before extracting such information, first of all, we implement a set of 32 five-by-five different

structuring elements (SEs) as shown in Fig. 8. Each of the 32 SEs represents a short line segment with a

different degree of inclination. Next, for each SE we apply the morphological hit-or-miss transform to

the enclosed and then size normalized signature image. The morphological hit-or-miss transform is a

basic tool for shape detection [18]. Here, the goal is to identify the degree of inclination of signature

strokes and count the frequency of matches of each one of the 32 SEs with the signature image.

16

Figure 8: The 32 structuring elements representing short line segment with different degree of inclination.

Fig. 9.a compares the numbers of matches for each of 32 structuring elements in a genuine

signature and that in a simple forgery. Note that there is considerable discrepancy between two

frequencies of matches for the most of 32 SEs what demonstrates good discriminating capability of the

feature set.

Figure 9: (a) Frequencies of matches for the 32 structuring elements between a genuine signature and a simple forgery (b)

Frequencies of matches for the envelope orientation features between a genuine signature and a simple forgery

Envelope orientation of dilated signatures: The second feature set, similar to the first one, provides the

frequency of orientation of the contour line (envelope) in a signature image and its dilated versions. In

this work, only six different directions are considered in each images. To extract these features, seven 3

x 3 structuring elements are used, namely SE3.1, SE3.2, .., SE3.7 as shown in Figure 10. The

structuring element SE3.1 is for dilating a signature image as well as its successive dilated versions while

the others, each one representing distinct orientation in the envelopes. Note that symbol “x” in each SE

means that the pixel can be either black or white. Starting with a signature image we repeat the dilation

process n-1 times always over the last dilated image, totaling n different images (one original signature

image and n-1 dilated versions). For each of these n images, we count the frequency of each one of the

six directions occurred in the envelope. Thus, the second feature set used is a 6n-dimensiona vector.

17

Each component of this vector shows the frequency of occurrences of a determined direction in one

image envelope. Fig. 9.b compares the feature vector obtained from a genuine signature to that from a

simple forgery with the original signature images dilated twice. Therefore, each feature vector has 18

elements. Note that there are considerable discrepancies between two feature vectors, what

demonstrates good discriminating capability of the proposed feature set.

Figure 10: Structuring Elements for signature image dilation and envelope orientation matching.

5.3 Decision Rules

For signature verification, two kinds of classifiers were considered and tested: Euclidean distance

classifiers and multilayer perceptron. Let T t t tp= ( , ,....., ),1 2 and X x x xR p= ( , ,....., ),1 2 represent the being tested

signature feature vector and the arithmetic mean feature vector computed from the genuine reference

set, respectively.

Euclidean distance classifier: The decision rule for signature verification is based on the evaluation of

the Euclidean distance measure between feature vectors T and XR with respect to the decision threshold

D. The decision threshold D is the smallest real value so that the classifier correctly verifies all genuine

training signatures.

Multilayer perceptron : The feedforward multilayer neural network was trained by the backpropagation

algorithm over both genuine and forgery training sets. It can be shown that the network can be

considered as an approximation to a Bayes optimal discriminant function [23].

6. EXPERIMENTAL RESULTS AND CONCLUSION

For numeral recognition, two sets of data were constructed. The first set consisted of 240 images of

numerical characters used for the Hopfield net training. Those 240 images were divided into four groups

18

of 60 images, and each group was used for training one of the four Hopfield networks. We limited 60

training images for each net due to the fact that the number of spurious states would increase drastically

if the net is forced to save a certain amount of image patterns beyond its storage capacity [24]. In other

words, the number of training samples should not excessively exceed 15% of the total number of

neurons (380 = 20 x 19) in the Hopfield net. Otherwise, the net would easily converge to one of the

spurious states which do not correspond with any one of the true patterns. It is worth mentioning that

for the preclassification stage no training sample was indeed required.

To test the performance of our numeral recognition algorithm, we used the second numeral data

set consisting of 606 numerals extracted from 121 Brazilian bankchecks provided by a population of 12

subjects. Among the 606 numerals, 560 of them were correctly recognized that is equivalent to 92,4%

correct classification rate. Among 560 recognized numerals, 490 numerals and 70 numerals were

recognized in the preclassification stage and the neural classification stage, respectively. We found that,

among those 121 tested courtesy amounts, 88 of them were correctly recognized completely that

corresponds 72.7%. recognition rate. From a visual inspection over 46 misclassified numerals, we

observed that some of 46 numerals were considerably deformed and others, although visually

recognizable, present unusual characteristics introduced by personal stylistic writing or careless writing

resulting in ambiguity. Fig. 11 shows some of these numeral samples. Unrecognizable numeral images in

the pre-classification then were tested by the Hopefield nets resulting in 70 correctly classified numerals

out of 103 tested.

Fig.11: Numerals with considerable deformation

Recently, Lethelier et al. have claimed their numeral recognition algorithm achieving 87% correct

numeral recognition rate by testing 10.000 French bankcheck images [5]. Independently Congedo et al.

have claimed their recognition system achieving the misclassification rate as low as 0.08% at the expense

of allowing the rejection rate becoming as high as 49% [25]. Based on these classifications results, we

conclude that our courtesy amount recognition method has provided a satisfactory result for bankcheck

recognition applications. Up to now we have not considered the “rejection state” in our recognition

machine because our initial intention was forcing our system to maximize the recognition rate. In the

19

near future we intend to develop a “best” decision rule which takes into consideration of other

information and requirement, such as worded amount information, bank’s interest and priority, etc.

For static signature verification, three sets of experiments were performed. In the first set of

experiments, a database of 550 genuine signatures contributed equally from 5 subjects, 100 simple

forgeries and 300 skilled forgeries all written on white paper, was compiled. The skilled forgeries were

provided by graduate students and faculty members in the EE department. For each subject’s genuine

set, 10 signatures were selected for the classifier training, and remaining 100 genuine signatures and

both random and skilled forgeries were used for the classifier testing. Let E1, E2 and EQ denote,

respectively, the Type I error rate (false rejection) at zero Type II error (false acceptance), the Type II

Error rate at zero Type I error, and the equal error rate, respectively. Table 1 shows both individual and

global performances of the Euclidean classifier tested separately by random forgeries (R) and skilled

forgeries (S). Note that, on the average , 3% equal error rate was achieved by testing only random

forgeries. In the case of skilled forgeries, the verification system provided 14.3%. equal error rate. From

Table 1 we conclude that our verification system (the 62-feature set and the Euclidean classifier) is

suitable for detecting random forgeries (EQ=3%) and for point-of-sale application (E2=5.8%) [21].

Table 1: Individual and global performance (%) of the Euclidean classifier

Subject 1 Subject 2 Subject 3 Subject 4 Subject 5 Average

R S R S R S R S R S R S

E1 14 65 11 45 57 57 26 48 14 68 24.4 56.6

E2 3 45 6 45 10 50 4 37 6 40 5.8 43.4

EQ 1 16 4 10 5 19 3 13 2 15 3.0 14.6

In the second set of signature verification experiments, 94 genuine signatures extracted from real

bankchecks contributed by 6 subjects were used for the Euclidean classifier training. For each subject,

only 3 genuine signatures were used to compute the average feature vector while each subject’s all

genuine signatures were used to determine threshold value D. Our goal was determining the Type II

error rate (False acceptance) at zero Type I error. Table 2.a shows the classifier performance from

testing two forgery sets; Set 1: 100 random forgeries written on white paper and contributed by 10

subjects; Set 2: 94 bankcheck genuine signatures used as random forgeries - naturally excluding each

subject’s own genuine signatures. The second set of experiments was organized in this way because our

20

primary objective was to attend the following requirement: Customers’ satisfaction. In other words, we

expect that our system is capable of verifying all genuine signatures and detecting a considerable number

of random forgeries. Such an approach was motivated by the statistical investigation revealing that 99%

of signature forgeries in Brazilian bankchecks consists of random/simple forgeries. In addition, bank

clerks only verify signatures in bankchecks if the check value is high (> US$ 500) although, by law, the

signature on a bankcheck should be verified manually if the check value is merely as much as US$ 3.4 or

more, approximately.

Table 2: Tests on bankcheck signatures (a) Performance of Euclidean classifier (b) Performance of the multilayer

perceptron.

(a) (b)

Random Forgery Set 1 Set 2 Skilled Forgery

E2 6.7 % 7.3 % EQ 4.7 %

In the third set of experiments we investigated a new signature verification approach which

employed an one-hidden layer perceptron classifier and used the signature stroke orientation feature

vector defined in section 5.2. A database of 300 genuine signatures acquired from 3 subjects and 180

skilled forgeries were equally divided for the classifier training and testing. The testing results reveal that

the neural classifier provides a performance of 4.7% equal error rate on the average. We have found that

our neural classifier using the stroke orientation information performs considerably better than most off-

line signature verification methods reported in the literature; 14% and 10% error rates in detecting

skilled forgery were reported by Yoshimura et al [12] and Qi [25], respectively.

This paper has focused on the design and implementation of a prototype for Brazilian bankcheck

recognition. The prototype has three main units that perform the following three processes, respectively:

bankcheck information extraction, digit amount recognition and static signature verification. A novel

aspect concerning to the bankcheck information extraction is the use of generated images to eliminate

printed patterns. An immediate consequence of such a measure is memory space saving since no blank

bankcheck sample image is needed for background and filled pattern elimination. We tested intensively

our numeral recognition and static signature verification algorithms by both real bankcheck data and

simulated data. Although both numeric and signature images extracted from bankchecks were

considerably degraded and noisy, surprisingly our system provide good performances: for skilled forgery

21

4.7% equal error rate; for random forgery zero Type I error and 7.3% Type II error; for bankchecks

numerals, 92.7% correct recognition rate. Although we have used Brazilian bankchecks throughout this

work, after analyzing some traveler’s checks and American checks, we believe that our bankcheck

recognition method can be easily adapted for recognizing bankchecks other than Brazilian bankchecks.

REFERENCES

1. L. Lam, C. Y. Suen, D. Guillevic, N. W. Strathy, M. Cheriet, K. Liu and J. N. Said, “Automatic

processing of information on cheques,” Proc. IEEE Int. Conf. Syst, Man, Cybernetics, Canada, 1995.

2. A. Agarwal, L. Granowetter, K. Hussein and A. Gupta, “Detection of courtesy amount block on

bank checks”, Proc. Third Int. Conf. Document Analysis and Recognition, 1995, pp.748-751.

3. T. M. Ha and H. Bunke, “Model-based analysis and understanding of check forms,” Int. J. Pattern

Recog. and Artificial Intell 8, 5 (1994) 1053-1080.

4. J-P Dodel and R. Shinghal, “Symbol/Neural Recogntion of Cursive Amounts on Bank Cheques,”

Proc. Third Int. Conf. Document Analysis and Recognition, 1995, pp 15-18.

5. E. Lethelier, M. Leroux, M. Gilloux, “An automatic reading system for handwritten Numeral

Amounts on French Checks,” Proc. Third Int. Conf. Document Analysis and Recognition, 1995, pp

92-97.

6. F. Chin and F. Wu, “A microprocessor-based optical character recognition check reader,” Proc.

Third Int. Conf. Document Analysis and Recognition, 1995, pp 982-985.

7. M. Cheriet, J. N. Said and C. Y. Suen, “A formal model for document processing of business

forms,” Proc. Third Int. Conf. Document Analysis and Recognition, 1995, pp 210-213.

22

8. D. Guillevic and C. Y. Suen, “Cursive script recognition applied to the processing of bank cheques,”

Proc. Third Int. Conf. Document Analysis and Recognition, 1995, pp 216-223.

9. G. Dimauro, M.R. Grattagliano, S. Impedovo, G. Pirlo, “A system for bankcheck processing,” Proc.

Third Int. Conf. Document Analysis and Recognition, 1993.

10. A. L. Koerich, “Automatic Processing of Bankcheck information”, MS Dissertation, School of

Electrical Engineering, Universidade Estadual de Campinas. (In Portuguese).

11. A. Akiyama and N. Hagita, “Automated entry system for printed documents”, Pattern Recognition,

23 (1990) 1141-1154.

12. O. Yoshimura and M. Yoshimura, “Off-line verification of Japanese Signature after elimination of

background patterns”, Int. J. Pattern Recog. and Artif. Intell. 8, 3 (1994) 693-708.

13. M. Kamel and A. Zhao, “Extraction of binary character/graphics images from grayscale document

images”, Graphic Models and Image Processing, 55, 3 (1993) 203-217.

14. J. Mantas, “An overview of character recognition methodologies”, Pattern Recognition 6, 19

(1986), 6425-430.

15. M. Shridhar and A. Badreldin, “High accuracy character recognition algorithm using Fourier and

Topological descriptors”, Pattern Recognition 17, 5 (1984), 515 - 524.

16. K. Tsirikolias and B. G. Mertzios, “Statistical pattern recognition using efficient two-dimensional

moments with applications to character recognition”, Pattern Recognition 26, 6 (1991), 877-882.

17. G. Wang and J. Wang, “A new hierarchical approach for recognition of unconstrained handwritten

numerals”, IEEE Trans. Cons. Electronic 40, 3 (1994), 428-436.

23

18. R. C. Gonzales and R. Woods, Digital Image Processing, Addison Wesley, 1993.

19. R. Plamondom and G. Lorette, “Automatic signature verification and writer identification - The state

of the art”, Pattern Recognition 22, 2 (1989) 107-131.

20. F. Leclerc and R. Plamondom, “Automatic signature verification and writer identification - The state

of the art - 1989-1993,” Int. J. Pattern Recog. and Artificial Intell. 8, 3 (1994) 643-659.

21. L. L. Lee, T. Berger and E. Aviczer, “Reliable on-line human signature verification systems for

point-of-sale application”, IEEE Trans. Pattern Analysis & Machine Intelligence 18, 6 (1996) 643-

647.

22. H. Cardot, M. Revenu, B. Victorri and M. J. Revillet, “A static signature verification system based

on a cooperating neural networks architecture”, Int. J. Pattern Recog. and Artificial Intell. 8, 3

(1994) 679-792.

23. D. W. Ruck, S. K. Rogers, M. Kabrisky, M. E. Oxley and B. W. Suter, “The multilayer perceptron

as an approximation to a Bayes optimal discriminant function”, IEEE Trans. Neural Networks 1, 4

(1990) 296-300.

24. Hertz, A. Krogh and R. Palmer, Introduction To The Theory of Neural Computation, Addison

Wesley, 1991.

25. Y. Qi and B. R. Hunt, “Signature verification using global and grid features”, Pattern Recognition,

27, 12 (1994) 1621-1629.

26. G. Congedo, G. Dimauro, S. Impedovo, G. Pirlo, “A Structuring method with local refining for

handwritten character recognition,” Proc. Third Int. Conf. Document Analysis and Recognition,

1995, pp 853-856.

a prototype for brazilian bankcheck recognitionalekoe/papers/koerich_ijprai_1997.pdf · numeric...

Documents