
Recognition of Printed Bangla Document from Textual Image Using Multi-Layer Perceptron (MLP) Neural Network

Md. Musfique Anwar, Nasrin Sultana Shume, P. K. M. Moniruzzaman and Md. Al-Amin Bhuiyan

Dept. of Computer Science & Engineering, Jahangirnagar University, Bangladesh


Abstract

This paper focuses on the segmentation of printed Bangla characters for efficient character recognition. Segmentation is an important step in the character recognition process because it allows the system to classify characters more accurately and quickly. The system takes the scanned image file of a printed document as its input. A structural feature extraction method is used to extract the features: each individual Bangla character is converted to an $M \times N$ feature matrix. A Multi-Layer Perceptron (MLP) neural network trained with the back-propagation algorithm takes the feature matrices as its input patterns and learns to classify the characters. The system has been tested with several printed documents, and the success rate in all cases is over 90%.

Keywords:

Character segmentation, Character recognition, Feature extraction, Multi-Layer Perceptron (MLP).

1. Introduction

Optical character recognition [1] is one of the attractive fields of image processing [2]. A character recognition technique associates a symbolic identity with the image of a character. A lot of research on Bangla character recognition has been done over the last few years. In the modern approach, adaptive tools are applied to pattern recognition systems. The Artificial Neural Network (ANN) is the most popular adaptive tool used for character recognition [3]. Most applications use a feed-forward ANN together with one of the numerous variants of the classical backpropagation algorithm or other training algorithms. The scope of this research is not only individual character recognition: it attempts to retrieve a complete paragraph from its optical image created by a scanner. In this paper we propose a way to recognize a printed Bangla document from its textual image, using a multilayer perceptron with the backpropagation algorithm for individual character recognition.

2. Bangla Character Set

The character is the fundamental unit for writing and reading a language. Character recognition is the process of classifying an input character according to predefined character classes. Every language has its own character set; the Bangla character set has 50 characters (11 vowels and 39 consonants), 10 digits, punctuation marks and other symbols.

Bangla letters are formed in two-dimensional space, mostly from horizontal, vertical and arc strokes [4]. The Bangla characters are classified into two categories as follows:

i) Shorborno: 'Shorborno' are like the vowels of the English language. There are eleven 'Shorborno' characters. The first six characters have a full matra, the 7th has a half matra and the last four have no matra.

ii) Banjonborno: 'Banjonborno' are like the consonants. There are 39 'Banjonborno' among the Bangla letters. Here we are concerned only with the characters.

Bangla script consists of moderately complex patterns. Each word in Bangla script is composed of several characters joined by a horizontal line (called 'Matra' or head-line) at the top. The concept of upper-case and lower-case characters (as in English) is absent here. There are many composite characters, called "Jukto barna", as shown in Fig. 1. There are about 253 compound characters composed of 2, 3 or 4 consonants (i.e. Banjonborno) [5]. Some other types of characters used in the Bangla dictionary, called suffix-prefix characters, are shown in Fig. 2.

Fig. 1 Some mainstream Bangla characters used for image recognition: (a) Shorborno, (b) Banjonborno, (c) Bangla numerals, (d) a few Bangla composite characters.

Fig. 2 Suffix-prefix determiner characters

(IJCSIS) International Journal of Computer Science and Information Security,

Vol. 8, No. 1, April 2010

http://sites.google.com/site/ijcsis/ ISSN 1947-5500



3. The System Overview

The main phase of the character recognition system is the segmentation of text into characters, so that the computer can classify the characters within a paragraph as a human would identify them. The overall method of the implemented system is illustrated in Fig. 3.

Fig. 3 Overall diagram of the recognition system

3.1 Data Acquisition

The input images are acquired from documents containing printed Bangla text, using a scanner as the input device. Each scanned image is stored as an image file (.JPEG). Pre-processing is required to convert the raw image data into a usable format [6], because a scanned image is not always in a suitable form. The image is then passed on for boundary detection.

3.2 Boundary Detection

We scan the image from the upper left and from the bottom right to find the processing area of the printed text document. Each scan is halted as soon as it meets a single black pixel.
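The boundary search can be illustrated with a small sketch. It assumes the scanned page has already been binarized into a `boolean[][]` in which `true` marks a black pixel; the class and method names are hypothetical, and for simplicity the sketch visits every pixel instead of halting two separate scans at the first black pixel.

```java
final class BoundaryDetector {
    /**
     * Returns {top, left, bottom, right} (all inclusive) of the smallest rectangle
     * that contains every black pixel, or null when the image is blank.
     * Assumption: image[row][col] == true means a black pixel.
     */
    static int[] boundingBox(boolean[][] image) {
        int top = Integer.MAX_VALUE, left = Integer.MAX_VALUE, bottom = -1, right = -1;
        for (int r = 0; r < image.length; r++) {
            for (int c = 0; c < image[r].length; c++) {
                if (image[r][c]) {
                    top = Math.min(top, r);
                    left = Math.min(left, c);
                    bottom = Math.max(bottom, r);
                    right = Math.max(right, c);
                }
            }
        }
        return bottom < 0 ? null : new int[] {top, left, bottom, right};
    }
}
```

Later stages would then only need to look at the rows and columns inside the returned rectangle.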

3.3 Segmentation

In this phase the text is partitioned into its elementary entities, i.e. characters. First the system detects the region of each text line of the paragraph. The text lines are then segmented into words, and the words are divided into characters.

3.3.1 Text Line Detection

Text line detection is performed by scanning the image horizontally, row by row, and recording the number of black pixels in each row. The line boundaries can then be detected from this array of per-row pixel counts. In our experiment we found that the number of black pixels in a blank row of the image varies from 0 to 10, whereas the number of pixels in a row where text is present is much larger. In general, more than two blank rows are present between two text lines. In this way we detect the boundaries of a text line.

The upper boundary of a line is the first row where more black pixels are found. After finding the upper boundary, the system continues scanning until it reaches a row whose next row has no black pixels; that row is the lower boundary of the text line. There are about 8 to 10 blank rows between two text lines.
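A minimal sketch of this row-by-row scan is given below. It is an illustration under assumptions, not the implemented system: the page is taken to be a `boolean[][]` with `true` for black pixels, the names `TextLineDetector`, `rowProfile` and `detectLines` are hypothetical, and the `blankThreshold` parameter stands in for the observation that a blank row may still contain up to about 10 stray black pixels.

```java
import java.util.ArrayList;
import java.util.List;

final class TextLineDetector {
    /** Counts the black pixels in each row (the horizontal projection profile). */
    static int[] rowProfile(boolean[][] image) {
        int[] counts = new int[image.length];
        for (int r = 0; r < image.length; r++) {
            for (boolean px : image[r]) {
                if (px) counts[r]++;
            }
        }
        return counts;
    }

    /**
     * Groups consecutive rows whose black-pixel count exceeds blankThreshold
     * into text lines. Each entry is {firstRow, lastRow}, both inclusive.
     */
    static List<int[]> detectLines(boolean[][] image, int blankThreshold) {
        int[] counts = rowProfile(image);
        List<int[]> lines = new ArrayList<>();
        int start = -1; // -1 means "currently between text lines"
        for (int r = 0; r < counts.length; r++) {
            boolean isText = counts[r] > blankThreshold;
            if (isText && start < 0) start = r;
            if (!isText && start >= 0) {
                lines.add(new int[] {start, r - 1});
                start = -1;
            }
        }
        if (start >= 0) lines.add(new int[] {start, counts.length - 1});
        return lines;
    }
}
```

With the figures reported above, a call such as `detectLines(page, 10)` would treat rows holding 10 or fewer black pixels as blank.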

3.3.2 Word Segmentation

Normally, there is no spacing between the characters of a Bangla word because of the Matra ( ⎯⎯ ). We therefore first detect the Matra of a text line. The Matra is the row where the number of black pixels is the maximum [1, 7]. After detecting a line, the system scans the image vertically from the upper boundary of the line and counts the number of black pixels in each column. The start position of a word is the first column where black pixels are found. The system continues scanning until it reaches a column whose next column has no black pixels; that column is the end position of the word. There are about 4 to 6 blank columns between two words.
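The two steps of this subsection, locating the Matra and splitting a text line into words, might be sketched as follows. The names `WordSegmenter`, `matraRow` and `columnRuns` are hypothetical, the image is assumed to be a rectangular `boolean[][]` with `true` for black pixels, and any blank column is treated as a word boundary, which relies on the Matra keeping the columns inside a word connected.

```java
import java.util.ArrayList;
import java.util.List;

final class WordSegmenter {
    /** Row in [top, bottom] with the most black pixels, taken to be the Matra. */
    static int matraRow(boolean[][] image, int top, int bottom) {
        int best = top, bestCount = -1;
        for (int r = top; r <= bottom; r++) {
            int count = 0;
            for (boolean px : image[r]) {
                if (px) count++;
            }
            if (count > bestCount) { // strict comparison: the first maximum wins
                bestCount = count;
                best = r;
            }
        }
        return best;
    }

    /**
     * Maximal runs of columns that contain a black pixel somewhere in rows
     * [top, bottom]. Each entry is {firstCol, lastCol}, both inclusive.
     */
    static List<int[]> columnRuns(boolean[][] image, int top, int bottom) {
        List<int[]> runs = new ArrayList<>();
        int width = image[top].length;
        int start = -1;
        for (int c = 0; c < width; c++) {
            boolean ink = false;
            for (int r = top; r <= bottom && !ink; r++) ink = image[r][c];
            if (ink && start < 0) start = c;
            if (!ink && start >= 0) {
                runs.add(new int[] {start, c - 1});
                start = -1;
            }
        }
        if (start >= 0) runs.add(new int[] {start, width - 1});
        return runs;
    }
}
```

A stricter variant could require a minimum gap of about 4 columns before closing a word, in line with the spacing reported above.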

3.3.3 Character Segmentation

To separate the characters in a word, the system scans vertically, below the Matra, from the start position of the word, which is also the start position of its first character. After finding the start position of a character, it continues scanning until it reaches a column whose next column has no black pixels; that column is the end position of the character. Consecutive characters in a word are separated by 2 to 3 blank columns, as shown in Fig. 4.

Fig. 4 Character separation from below the Matra
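The idea in Fig. 4 can be sketched in a few lines: once the rows at and above the Matra are ignored, neighbouring characters are no longer joined and can be split at blank columns. The sketch assumes a `boolean[][]` image with `true` for black pixels and a Matra that is a single row thick; `CharacterSegmenter` and `split` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class CharacterSegmenter {
    /**
     * Splits the word occupying columns [left, right] into characters by looking
     * only at rows strictly below matraRow and down to bottom (inclusive).
     * Each entry is {firstCol, lastCol}, both inclusive.
     */
    static List<int[]> split(boolean[][] image, int matraRow, int bottom, int left, int right) {
        List<int[]> chars = new ArrayList<>();
        int start = -1;
        for (int c = left; c <= right; c++) {
            boolean ink = false;
            for (int r = matraRow + 1; r <= bottom && !ink; r++) ink = image[r][c];
            if (ink && start < 0) start = c;
            if (!ink && start >= 0) {
                chars.add(new int[] {start, c - 1});
                start = -1;
            }
        }
        if (start >= 0) chars.add(new int[] {start, right});
        return chars;
    }
}
```

Characters with strokes only above the Matra, and composite characters whose parts touch below it, would need extra handling that this sketch does not attempt.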

3.4 Feature Extraction

Feature extraction is essential to effective character recognition and eases the classification task. The maximum height and width of Bangla characters (excluding compound characters) in the SutonnyMJ font at font size 10 is $7 \times 6$, and the maximum is $12 \times 9$ for compound characters. After determining the start and end positions of a character, the region of that character is converted to a $7 \times 6$ matrix, or to a $12 \times 9$ matrix for compound characters, containing 0s and 1s, where 1 represents the presence of a character component and 0 represents its absence.

The boundaries of the characters are not all of equal size, i.e. the extracted matrices are not of equal size. If a matrix is smaller or larger than our standard size in both height and width, we scale it; if its height matches but its width is smaller, we add 0s to fill the matrix up to the standard size. The resulting character matrix is the input to the recognition stage and is fed to the neural network.
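The size normalization described above could look roughly like the following. Two details are assumptions made only for illustration, since the text does not specify them: scaling uses nearest-neighbour sampling, and zero padding is added on the right. `FeatureExtractor` and its methods are hypothetical names, and glyphs are assumed to be rectangular 0/1 matrices.

```java
final class FeatureExtractor {
    /** Nearest-neighbour rescale of a 0/1 glyph to rows x cols (assumed method). */
    static int[][] scale(int[][] glyph, int rows, int cols) {
        int[][] out = new int[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                out[r][c] = glyph[r * glyph.length / rows][c * glyph[0].length / cols];
            }
        }
        return out;
    }

    /** Pads with 0s on the right when the height already matches; otherwise rescales. */
    static int[][] normalize(int[][] glyph, int rows, int cols) {
        if (glyph.length == rows && glyph[0].length <= cols) {
            int[][] out = new int[rows][cols]; // Java initializes the cells to 0
            for (int r = 0; r < rows; r++) {
                System.arraycopy(glyph[r], 0, out[r], 0, glyph[r].length);
            }
            return out;
        }
        return scale(glyph, rows, cols);
    }

    /** Flattens the matrix row by row into the input vector of the network. */
    static double[] flatten(int[][] matrix) {
        double[] v = new double[matrix.length * matrix[0].length];
        int k = 0;
        for (int[] row : matrix) {
            for (int cell : row) v[k++] = cell;
        }
        return v;
    }
}
```

For a simple character, `flatten(normalize(glyph, 7, 6))` yields 42 values, one per input neuron; the compound-character size of 12 rows by 9 columns yields 108.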

3.5 Recognition Engine and Classifier

In a back-propagation neural network, the learning algorithm has two phases. First, a training input pattern (a Bangla character) is presented to the network input layer. The network then propagates the input pattern from layer to layer until the output pattern is generated by the output layer. If this pattern is different from the desired output, an error is calculated and then propagated backwards through the network from the output layer to the input layer. The weights are modified as the error is propagated.

A back-propagation neural network is determined by the connections between neurons, the activation function used by the neurons, and the learning algorithm that specifies the procedure for adjusting weights. The network architecture for the back-propagation neural network is shown in Fig. 5.

Fig. 5 Back-propagation neural network topology

A neuron determines its output by computing the net weighted input:

$$ X = \sum_{i=1}^{n} x_i w_i - \theta \tag{1} $$

where $n$ is the number of inputs and $\theta$ is the threshold applied to the neuron. Next, this input value is passed through the sigmoid activation function:

$$ Y^{sigmoid} = \frac{1}{1 + e^{-X}} \tag{2} $$
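As a small illustration of Eqs. (1) and (2), the output of a single neuron can be computed as below. The class and method names are hypothetical and the numbers in the usage note are made up for the example.

```java
final class SigmoidNeuron {
    /** Net weighted input of Eq. (1): the sum of x_i * w_i minus the threshold theta. */
    static double netInput(double[] x, double[] w, double theta) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * w[i];
        }
        return sum - theta;
    }

    /** Sigmoid activation of Eq. (2). */
    static double sigmoid(double netInput) {
        return 1.0 / (1.0 + Math.exp(-netInput));
    }
}
```

For inputs (1, 0, 1), weights (0.5, 0.3, -0.2) and threshold 0.3, the net input is 0.5 - 0.2 - 0.3 = 0, so the neuron outputs sigmoid(0) = 0.5.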

To derive the back-propagation learning law, let us consider the three-layer network shown in Fig. 5. The indices $i$, $j$, $k$ here refer to neurons in the input, hidden and output layers, respectively. The symbol $w_{ij}$ denotes the weight for the connection between neuron $i$ in the input layer and neuron $j$ in the hidden layer, and the symbol $w_{jk}$ the weight between neuron $j$ in the hidden layer and neuron $k$ in the output layer.

To propagate error signals, we start at the output layer and work backward to the hidden layer. The error signal at the output of neuron $k$ at iteration $t$ is defined by:

$$ e_k(t) = Y_{d,k}(t) - Y_{a,k}(t) \tag{3} $$

where $t = 1, 2, 3, \ldots$, $Y_{d,k}(t)$ is the desired output of neuron $k$ at iteration $t$ and $Y_{a,k}(t)$ is its actual output.

Neuron $k$, which is located in the output layer, is supplied with a desired output of its own. Hence we may use a straightforward procedure to update the weight $w_{jk}$:

$$ w_{jk}(t+1) = w_{jk}(t) + \Delta w_{jk}(t) \tag{4} $$

where $\Delta w_{jk}(t)$ is the weight correction, given by:

$$ \Delta w_{jk}(t) = \alpha \times y_j(t) \times \delta_k(t) \tag{5} $$

where $\alpha$ is the learning rate, $y_j(t)$ is the output of hidden neuron $j$ and $\delta_k(t)$ is the error gradient at neuron $k$ in the output layer at iteration $t$.

In order to calculate the weight correction for the hidden layer, we can apply the same equation as for the output layer:

$$ w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t) \tag{6} $$

where $\Delta w_{ij}(t)$ is the weight correction, given by:

$$ \Delta w_{ij}(t) = \alpha \times x_i(t) \times \delta_j(t) \tag{7} $$

where $\delta_j(t)$ represents the error gradient at neuron $j$ in the hidden layer:







$$ \delta_j(t) = y_j(t) \times [1 - y_j(t)] \times \sum_{k=1}^{l} \delta_k(t)\, w_{jk}(t) \tag{8} $$

where $l$ is the number of neurons in the output layer and

$$ y_j(t) = \frac{1}{1 + e^{-X_j(t)}} \tag{9} $$

$$ X_j(t) = \sum_{i=1}^{n} x_i(t) \times w_{ij}(t) - \theta_j \tag{10} $$

where $n$ is the number of neurons in the input layer.
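The update rules of Eqs. (3) to (10) can be put together in one training iteration, sketched below. This is a from-scratch illustration, not the neuralj implementation used in the experiments. Two points are assumptions that the text does not state: the output-layer gradient is taken as $\delta_k = y_k (1 - y_k)\, e_k$, the usual form for a sigmoid output neuron, and the thresholds are updated like weights on a constant input of -1. All names are hypothetical.

```java
import java.util.Random;

final class BackPropSketch {
    final double[][] wIH;   // wIH[i][j]: weight from input i to hidden j
    final double[][] wHO;   // wHO[j][k]: weight from hidden j to output k
    final double[] thetaH;  // hidden-layer thresholds
    final double[] thetaO;  // output-layer thresholds
    final double alpha;     // learning rate

    BackPropSketch(int nIn, int nHid, int nOut, double alpha, long seed) {
        Random rnd = new Random(seed);
        wIH = new double[nIn][nHid];
        wHO = new double[nHid][nOut];
        thetaH = new double[nHid];
        thetaO = new double[nOut];
        this.alpha = alpha;
        for (double[] row : wIH) for (int j = 0; j < nHid; j++) row[j] = rnd.nextDouble() - 0.5;
        for (double[] row : wHO) for (int k = 0; k < nOut; k++) row[k] = rnd.nextDouble() - 0.5;
    }

    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    double[] hidden(double[] x) {
        double[] y = new double[thetaH.length];
        for (int j = 0; j < y.length; j++) {
            double net = -thetaH[j];                              // Eq. (10)
            for (int i = 0; i < x.length; i++) net += x[i] * wIH[i][j];
            y[j] = sigmoid(net);                                  // Eq. (9)
        }
        return y;
    }

    double[] output(double[] yHid) {
        double[] y = new double[thetaO.length];
        for (int k = 0; k < y.length; k++) {
            double net = -thetaO[k];
            for (int j = 0; j < yHid.length; j++) net += yHid[j] * wHO[j][k];
            y[k] = sigmoid(net);
        }
        return y;
    }

    /** One training iteration; returns the sum of squared errors before the update. */
    double train(double[] x, double[] desired) {
        double[] yHid = hidden(x);
        double[] yOut = output(yHid);
        double[] deltaO = new double[yOut.length];
        double sse = 0.0;
        for (int k = 0; k < yOut.length; k++) {
            double e = desired[k] - yOut[k];                      // Eq. (3)
            deltaO[k] = yOut[k] * (1 - yOut[k]) * e;              // assumed output gradient
            sse += e * e;
        }
        double[] deltaH = new double[yHid.length];
        for (int j = 0; j < yHid.length; j++) {                   // Eq. (8), old weights
            double sum = 0.0;
            for (int k = 0; k < deltaO.length; k++) sum += deltaO[k] * wHO[j][k];
            deltaH[j] = yHid[j] * (1 - yHid[j]) * sum;
        }
        for (int j = 0; j < yHid.length; j++)
            for (int k = 0; k < deltaO.length; k++)
                wHO[j][k] += alpha * yHid[j] * deltaO[k];         // Eqs. (4)-(5)
        for (int k = 0; k < deltaO.length; k++) thetaO[k] -= alpha * deltaO[k];
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < deltaH.length; j++)
                wIH[i][j] += alpha * x[i] * deltaH[j];            // Eqs. (6)-(7)
        for (int j = 0; j < deltaH.length; j++) thetaH[j] -= alpha * deltaH[j];
        return sse;
    }
}
```

Calling `train` repeatedly on the same pattern should make the returned error shrink, which is a quick sanity check that the signs in the update rules are consistent.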

In our work, the backpropagation neural network for character matrices of size $7 \times 6$ consists of 42 neurons in the input layer, 30 neurons in the hidden layer and one neuron in the output layer. For character matrices of size $12 \times 9$, the network consists of 108 neurons in the input layer, 80 neurons in the hidden layer and one neuron in the output layer.

The system recognizes a character if the output of the network is very close to the output defined for one of the characters, within a certain acceptable tolerance. If the output is far from all the possible outputs, the system cannot identify the character. This process continues until the end of the text document. The entire operation of the system is summarized in the flow-chart shown in Fig. 6.

Fig. 6 Flow-chart of the recognition system
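The tolerance test on the single output neuron can be sketched as a nearest-code lookup. The target codes used here are purely hypothetical, since the actual output value assigned to each character is not given in the text; `ToleranceClassifier` and `classify` are likewise illustrative names.

```java
final class ToleranceClassifier {
    /**
     * Returns the index of the target code closest to the network output, or -1
     * when even the closest code differs from the output by more than tolerance.
     */
    static int classify(double output, double[] targetCodes, double tolerance) {
        int best = -1;
        double bestDiff = Double.MAX_VALUE;
        for (int i = 0; i < targetCodes.length; i++) {
            double diff = Math.abs(output - targetCodes[i]);
            if (diff < bestDiff) {
                bestDiff = diff;
                best = i;
            }
        }
        return bestDiff <= tolerance ? best : -1;
    }
}
```

A return value of -1 corresponds to the "character is unrecognized" branch of the flow-chart.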

4. Experimental Results

We used the bswing1_0_beta package for Bangla text output and the neuralj-0.0.4 package to implement the backpropagation neural network in Java. The number of neurons in the hidden layer is set to about (3/4) of the number of neurons in the input layer (30 of 42 and 80 of 108). We use the 'PatternSet' class, which represents a set of patterns. The function 'addPattern(Pattern pattern)' is used to add the required patterns for all Bangla characters. The pattern for a Bangla character looks like:

    pattern_set.addPattern(new Pattern(
        "0;0;1;0;1;0;0;0;0; 0;0;1;0;0;1;0;0;0; " +
        "0;0;1;0;0;0;1;0;0; 0;0;1;0;0;0;0;0;0; " +
        "0;0;0;0;0;0;0;0;0; 0;0;0;0;0;0;0;0;0; " +
        "0;0;0;0;0;0;0;0;0; 0;0;0;0;0;0;0;0;0; " +
        "0;0;0;0;0;0;0;0;0; 0;0;0;0;0;0;0;0;0; " +
        "0;0;0;0;0;0;0;0;0; 0;0;0;0;0;0;0;0;0;",
        matrix_output_Str));

where "0;0;1;0;1; ... 0;0;0;" is the input vector and 'matrix_output_Str' is the output vector. We set the values of the following fields of the 'BackPropagation' class:

Field             Value
desired_error     0.001
maximum_epochs    1000000000

Then the training of the backpropagation neural network starts. After training, the system scans a Bangla paragraph image, tries to recognize its characters and displays the correctly recognized characters as output. Fig. 7 shows a snapshot of the implemented method. Results for different types of sentences are given in Table 1.

[Fig. 6 flow-chart, box labels: Start; Input the image of the paragraph which will be recognized; Detect the boundary of the printed text document to perform the segmentation of characters; Set index = 0, maximum_epochs = 1000000000; Select the character matrix of size 7 × 6 or 12 × 9 (for compound character); Input the matrix to ANN; Calculate OutputVector and error; Decision "error ≤ 0.001" (Yes/No); Add the character to output list; Print "the character is unrecognized"; index = index + 1; Decision "Whole document recognized?" (Yes/No); Stop.]







Fig. 7 Sample output of the proposed system

Table 1: Success rate for experimental results

Total no. of characters    Correctly recognized characters    Success rate (%)
165                        162                                98.18
288                        275                                95.49
356                        337                                94.66
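The success rates in Table 1 follow directly from the two character counts, as the short sketch below shows. The class and method names are hypothetical; only the counts come from the table.

```java
import java.util.Locale;

final class SuccessRate {
    /** Percentage of correctly recognized characters. */
    static double percent(int correct, int total) {
        return 100.0 * correct / total;
    }

    /** The same percentage, formatted to two decimals as in Table 1. */
    static String formatted(int correct, int total) {
        return String.format(Locale.ROOT, "%.2f", percent(correct, total));
    }

    public static void main(String[] args) {
        int[][] rows = {{165, 162}, {288, 275}, {356, 337}}; // {total, correct}
        for (int[] row : rows) {
            System.out.println(row[0] + " " + row[1] + " " + formatted(row[1], row[0]));
        }
    }
}
```

Running `main` reproduces the three rows of the table: 98.18, 95.49 and 94.66.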

[Bar chart: total no. of characters, correctly recognized characters and success rate (%) for the three test documents; vertical axis from 0 to 400.]

Fig. 8 Success Rate Graph of experimental results

5. Conclusion

In this paper we proposed a recognition system that emphasizes the segmentation phase. The proposed system successfully separates Bangla letters and digits from a printed document and recognizes the segmented characters using a backpropagation neural network. The system sometimes fails to recognize composite characters, so its performance can be improved by extending the segmentation process to deal with composite characters. In future, the proposed recognition system may be further improved using a spell-checker.

References

[1] M. E. Hoque, M. J. H. Siddiqi, S. M. Kamruzzaman and M. S. Chowdhury, "Efficient Method of Size Independent Printed Bangla Paragraph Recognition Using ANN and Efficient Heuristics", Proceedings of International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, pp. 755-758 (2003).

[2] Rafael C. Gonzalez, Digital Image Processing, 2nd Edition, Pearson Education publisher, New York, 2002.

[3] S. M. M. Rahman, S. M. Rahman and M. A. Rashid, "Kohonen Neural Network in Character Recognition Applications", Proceedings of NCCIS, pp. 106-110 (1997).

[4] M. R. Bashar, M. A. F. M. R. Hasan and M. F. Khan, "Bangla Off-Line Handwritten Size Independent Character Recognition Using Artificial Neural Networks Based on Windowing Technique", Proceedings of International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, pp. 351-354 (2003).

[5] M. F. Zibran, A. Tanvir, R. Shammi and Md. Abdus Sattar, "Computer Representation Of Bangla Characters And Sorting Of Bangla Words", Proceedings of International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, pp. 191-195 (2002).

[6] T. M. Ha and H. Bunke, "Off-line Handwritten Numeral Recognition by Perturbation Method", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 5, pp. 535-539 (May 1997).

[7] M. A. Sattar, K. Mahmud, H. Arafat and A. F. M. Noor-Uz-Zaman, "Segmenting Bangla Text for Optical Recognition", Proceedings of International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, pp. 283-286 (2003).

Md. Musfique Anwar completed his B.Sc (Engg.) in Computer Science and Engineering from the Dept. of CSE, Jahangirnagar University, Bangladesh in 2006. He is now a Lecturer in the Dept. of CSE, Jahangirnagar University, Savar, Dhaka, Bangladesh. His research interests include Artificial Intelligence, Neural Networks, Image Processing, Pattern Recognition and Software Engineering.

Nasrin Sultana Shume completed her B.Sc (Engg.) in Computer Science and Engineering from the Dept. of CSE, Jahangirnagar University, Bangladesh in 2006. She is now a Lecturer in the Dept. of CSE, Green University of Bangladesh, Mirpur, Dhaka, Bangladesh. Her research interests include Artificial Intelligence, Neural Networks, Image Processing, Pattern Recognition and Databases.

P. K. M. Moniruzzaman received his B.Sc (Hons) in Electronics and Computer Science and his M.S. in Computer Science and Engineering from the Dept. of CSE, Jahangirnagar University, Bangladesh. He completed his post-graduate project on Image Processing under the supervision of Dr. Md. Al-Amin Bhuiyan. He is now working as a Database Administrator in a commercial bank in Dhaka, Bangladesh. His main research interests include Natural Language Processing, Artificial Intelligence and Data Mining.

Md. Al-Amin Bhuiyan received his B.Sc (Hons) and M.Sc. in Applied Physics and Electronics from the University of Dhaka, Dhaka, Bangladesh in 1987 and 1988, respectively. He received the Dr. Eng. degree in Electrical Engineering from Osaka City University, Japan, in 2001. He completed postdoctoral research in Intelligent Systems at the National Informatics Institute, Japan. He is now a Professor in the Dept. of CSE, Jahangirnagar University, Savar, Dhaka, Bangladesh. His main research interests include Image Face Recognition, Cognitive Science, Image Processing, Computer Graphics, Pattern Recognition, Neural Networks, Human-machine Interface, Artificial Intelligence and Robotics.



