
Recognition of Printed Bangla Document from Textual Image Using Multi-Layer Perceptron (MLP) Neural Network

Md. Musfique Anwar, Nasrin Sultana Shume, P. K. M. Moniruzzaman and Md. Al-Amin Bhuiyan

Dept. of Computer Science & Engineering, Jahangirnagar University, Bangladesh

Email: musfique.anwar@gmail.com, shume_sultana@yahoo.com, get2monir@gmail.com, [email protected]

Abstract

This paper focuses on the segmentation of printed Bangla characters for efficient recognition. Character segmentation is an important step in character recognition because it allows the system to classify the characters more accurately and quickly. The system takes the scanned image file of the printed document as its input. A structural feature extraction method is used, in which each individual Bangla character is converted to an M × N feature matrix. A Multi-Layer Perceptron (MLP) neural network with the back-propagation algorithm takes the feature matrix as input, is trained with the set of input patterns, and learns to classify the characters. The system has been tested with several printed documents, and the success rate is over 90% in all cases.

Keywords: Character segmentation, Character recognition, Feature extraction, Multi-Layer Perceptron (MLP).

1. Introduction

Optical character recognition [1] is one of the attractive fields of image processing [2]. A character recognition technique associates a symbolic identity with the image of a character. A considerable amount of research on Bangla character recognition has been carried out over the last few years. In the modern approach, adaptive tools are applied to pattern recognition systems. The Artificial Neural Network (ANN) is the most popular adaptive tool used for character recognition [3]. Most applications use a feed-forward ANN trained with one of the numerous variants of the classical backpropagation algorithm or with other training algorithms. The scope of this research is not limited to individual character recognition: it attempts to retrieve a complete paragraph from its optical image created by a scanner. In this paper we propose a way to recognize a printed Bangla document from a textual image, using a multilayer perceptron with the backpropagation algorithm for individual character recognition.

2. Bangla Character Set

A character is the fundamental unit for writing and reading a language. Character recognition is the process of classifying an input character according to a predefined character class. Each language has its own character set; the Bangla character set has 49 characters, 10 digits, punctuation marks and other symbols. Bangla letters are formed in two-dimensional space, mostly from horizontal, vertical and arc strokes [4]. The Bangla characters are classified into two categories as follows:

i) Shorborno: the 'Shorborno' correspond to the vowels of the English language. There are eleven 'Shorborno' characters. The first six letters have a full matra, the 7th has a half matra and the last four have no matra.

ii) Banjonborno: the 'Banjonborno' correspond to the consonants. There are 39 'Banjonborno' in the Bangla alphabet. Here we are concerned only with the characters.

Bangla script consists of moderately complex patterns. Each word in Bangla script is composed of several characters joined by a horizontal line (called 'Matra' or head-line) at the top. The concept of upper and lower case (as in English) is absent here. There are many composite characters, called "Jukto barna", as shown in Fig. 1. There are about 253 compound characters composed of 2, 3, or 4 consonants (i.e. Banjonborno) [5]. There are some other types of characters used in the Bangla dictionary, called suffix-prefix characters, as shown in Fig. 2.

Fig. 1 Some Bangla mainstream characters used for image recognition: (a) Shorborno, (b) Banjonborno, (c) Bangla numerals, (d) a few Bangla composite characters.

Fig. 2 Suffix-prefix determiner characters

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 1, April 2010, pp. 254-259, http://sites.google.com/site/ijcsis/, ISSN 1947-5500


3. The System Overview

The main phase of a character recognition system is the segmentation of text into characters, so that the computer is able to classify the characters within a paragraph as a human can identify them. The overall method of the implemented system is illustrated in Fig. 3.

Fig. 3 Overall diagram of the recognition system

3.1 Data Acquisition

The input images are acquired from documents containing printed Bangla text, using a scanner as the input device. The scanned images are then stored as image files (.JPEG). Pre-processing is required to convert the raw image data into a usable format [6], because the scanned image is not always in a suitable form. The image is then passed on for boundary detection.

3.2 Boundary Detection

We scan from the upper left and from the bottom right of the image to find the processing area of the printed text document. Each scan halts as soon as it encounters a black pixel.
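As an illustration, the processing area can be found with a bounding-box search. The following is a minimal Java sketch, not the authors' implementation: it assumes the page has already been binarized into an int[][] in which 1 marks a black pixel, it scans the whole image instead of halting early, and the class and method names are hypothetical.

```java
final class BoundaryDetector {
    /**
     * Returns {top, left, bottom, right} (inclusive) of the smallest rectangle
     * that contains every black pixel (value 1), or null for a blank image.
     */
    static int[] findTextArea(int[][] img) {
        int top = -1, bottom = -1, left = Integer.MAX_VALUE, right = -1;
        for (int r = 0; r < img.length; r++) {
            for (int c = 0; c < img[r].length; c++) {
                if (img[r][c] == 1) {
                    if (top < 0) top = r;          // first row holding a black pixel
                    bottom = r;                    // last such row seen so far
                    left = Math.min(left, c);
                    right = Math.max(right, c);
                }
            }
        }
        return top < 0 ? null : new int[] {top, left, bottom, right};
    }
}
```

All later segmentation steps can then be restricted to the returned rectangle.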

3.3 Segmentation

In this phase the text is partitioned into its elementary entities, i.e. characters. First the system detects the region of each text line of the paragraph. Then the text lines are segmented into words, and the words are divided into characters.

3.3.1 Text Line Detection

Text line detection is performed by scanning the image row by row and storing the number of black pixels of each row in an array. The line boundaries are then detected from this array by comparing the pixel counts of successive rows. In our experiment, the number of black pixels in a blank row of the image varied from 0 to 10, whereas a row that contains text has a much larger count. As a general rule, two text lines are separated by more than two blank rows. In this way we detect the boundary of a text line.

The upper boundary of a line is the first row in which the black-pixel count rises above the blank-row level. After finding the upper boundary, the system continues scanning until it reaches a row whose next row has no black pixels; that row is the lower boundary of the text line. There are about 8 to 10 blank rows between two text lines.
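The row-wise scan described above can be sketched in Java as follows. This is a hedged illustration under stated assumptions: the image is an int[][] with 1 for black pixels, a row is treated as blank when it holds at most blankMax black pixels (the experiment above suggests a value around 10 for noisy scans), and the names LineDetector, detectLines and blankMax are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class LineDetector {
    /**
     * Returns one {upperRow, lowerRow} pair (inclusive) per text line.
     * A row counts as blank when it holds at most blankMax black pixels.
     */
    static List<int[]> detectLines(int[][] img, int blankMax) {
        int[] rowCount = new int[img.length];
        for (int r = 0; r < img.length; r++) {
            for (int v : img[r]) rowCount[r] += v;       // black pixels are 1
        }
        List<int[]> lines = new ArrayList<>();
        int upper = -1;                                  // -1 means "between lines"
        for (int r = 0; r < img.length; r++) {
            boolean text = rowCount[r] > blankMax;
            if (text && upper < 0) upper = r;            // upper boundary found
            boolean nextBlank = r + 1 == img.length || rowCount[r + 1] <= blankMax;
            if (text && nextBlank) {                     // lower boundary found
                lines.add(new int[] {upper, r});
                upper = -1;
            }
        }
        return lines;
    }
}
```

Counting pixels per row once and reusing the array keeps the scan linear in the image size.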

3.3.2 Word Segmentation

Normally, there is no spacing between the characters of a Bangla word because of the Matra ( ⎯⎯ ). We therefore have to detect the Matra of a text line first. The Matra line is the row in which the number of black pixels is the maximum [1, 7]. After detecting a line, the system scans the image column by column, starting from the upper boundary of the line, and counts the number of black pixels in each column. The start position of a word is the first column in which black pixels are found. The system continues scanning until it reaches a column whose next column has no black pixels; that column is the end position of the word. There are about 4 to 6 blank columns between two words.
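The column scan for words can be sketched in the same style. This is an illustrative Java sketch, assuming a binarized int[][] image (1 = black) and a text line already located between rows top and bottom; because the Matra joins the characters of a word, any blank column is treated as a word gap. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class WordSegmenter {
    /** Number of black pixels in column c between rows top..bottom (inclusive). */
    static int columnCount(int[][] img, int top, int bottom, int c) {
        int n = 0;
        for (int r = top; r <= bottom; r++) n += img[r][c];
        return n;
    }

    /** Returns one {startCol, endCol} pair (inclusive) per word in the line. */
    static List<int[]> segmentWords(int[][] img, int top, int bottom) {
        List<int[]> words = new ArrayList<>();
        int width = img[top].length;
        int start = -1;                                  // -1 means "between words"
        for (int c = 0; c < width; c++) {
            boolean ink = columnCount(img, top, bottom, c) > 0;
            if (ink && start < 0) start = c;             // start position of a word
            boolean nextBlank = c + 1 == width
                    || columnCount(img, top, bottom, c + 1) == 0;
            if (ink && nextBlank) {                      // end position of the word
                words.add(new int[] {start, c});
                start = -1;
            }
        }
        return words;
    }
}
```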

3.3.3 Character Segmentation

To separate the characters of a word, the system scans vertically from the start position of the word, which is also the start position of its first character. After finding the start position of a character, it continues scanning until it reaches a column whose next column has no black pixels; that column is the end position of the character. Consecutive characters in a word are separated by 2 to 3 blank columns, as shown in Fig. 4.

Fig. 4 Character separation from below the Matra
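A possible Java sketch of character separation below the Matra is given here. It is only an illustration: it assumes a binarized int[][] image (1 = black), a word box given by top, bottom, left and right, and a Matra that is a single pixel row thick (a thicker head-line would need several rows skipped). All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class CharacterSegmenter {
    /** Row with the most black pixels inside the word box; taken as the Matra. */
    static int matraRow(int[][] img, int top, int bottom, int left, int right) {
        int best = top, bestCount = -1;
        for (int r = top; r <= bottom; r++) {
            int n = 0;
            for (int c = left; c <= right; c++) n += img[r][c];
            if (n > bestCount) { bestCount = n; best = r; }
        }
        return best;
    }

    /** Splits a word into {startCol, endCol} pairs using only rows below the Matra. */
    static List<int[]> segmentCharacters(int[][] img, int top, int bottom,
                                         int left, int right) {
        int below = matraRow(img, top, bottom, left, right) + 1;
        List<int[]> chars = new ArrayList<>();
        int start = -1;
        for (int c = left; c <= right; c++) {
            boolean ink = hasInk(img, below, bottom, c);
            if (ink && start < 0) start = c;             // start of a character
            boolean nextBlank = c == right || !hasInk(img, below, bottom, c + 1);
            if (ink && nextBlank) {                      // end of the character
                chars.add(new int[] {start, c});
                start = -1;
            }
        }
        return chars;
    }

    /** True if column c has a black pixel between rows from..to (inclusive). */
    private static boolean hasInk(int[][] img, int from, int to, int c) {
        for (int r = from; r <= to; r++) {
            if (img[r][c] == 1) return true;
        }
        return false;
    }
}
```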

3.4 Feature Extraction

Feature extraction is essential for effective character recognition and eases the classification task. The maximum height and width of a Bangla character in the SutonnyMJ font at font size 10 is 7 × 6 for simple characters and 12 × 9 for compound characters. After determining the start and end positions of a character, the region of that character is converted to a 7 × 6 matrix, or to a 12 × 9 matrix for compound characters, containing 0s and 1s, where 1 represents the presence of a character component and 0 its absence.

The boundaries of the characters are not all of equal size, i.e., the extracted matrices are not of equal size. If the height and width of a matrix are smaller or greater than our standard size, we scale the matrix; if the height is equal but the width is smaller, we add 0s to fill up the matrix to our standard size. The character matrix is the input to the recognition stage, where it is fed to the neural network.
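The zero-filling step and the conversion of a character matrix into a network input vector can be sketched as follows. This Java sketch is an assumption-laden illustration: it covers only the padding case (a region larger than the target is simply cropped here, whereas the text above rescales it), it flattens the matrix row by row, and the names FeatureMatrix, padTo and flatten are hypothetical.

```java
final class FeatureMatrix {
    /** Copies a character region into a rows x cols matrix, zero-filling unused cells. */
    static int[][] padTo(int[][] region, int rows, int cols) {
        int[][] out = new int[rows][cols];               // Java initializes cells to 0
        for (int r = 0; r < Math.min(rows, region.length); r++) {
            for (int c = 0; c < Math.min(cols, region[r].length); c++) {
                out[r][c] = region[r][c];
            }
        }
        return out;
    }

    /** Flattens the matrix row by row into the input vector of the network. */
    static double[] flatten(int[][] m) {
        double[] v = new double[m.length * m[0].length];
        int i = 0;
        for (int[] row : m) {
            for (int cell : row) v[i++] = cell;
        }
        return v;
    }
}
```

A 7 × 6 matrix yields 42 values and a 12 × 9 matrix yields 108 values, one per input neuron.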

3.5 Recognition Engine and Classifier

In a back-propagation neural network, the learning algorithm has two phases. First, a training input pattern (a Bangla character) is presented to the network input layer. The network then propagates the input pattern from layer to layer until the output pattern is generated by the output layer. If this pattern is different from the desired output, an error is calculated and then propagated backwards through the network from the output layer to the input layer. The weights are modified as the error is propagated.

A back-propagation neural network is determined by the connections between neurons, the activation function used by the neurons, and the learning algorithm that specifies the procedure for adjusting weights. The network architecture for the back-propagation neural network is shown in Fig. 5.

Fig. 5 Back-propagation neural network topology

A neuron determines its output by computing the net weighted input:

$$X = \sum_{i=1}^{n} x_i w_i - \theta \tag{1}$$

where n is the number of inputs and θ is the threshold applied to the neuron. Next, this input value is passed through the sigmoid activation function:

$$Y^{sigmoid} = \frac{1}{1 + e^{-X}} \tag{2}$$
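Equations (1) and (2) translate directly into code. The following Java sketch is only an illustration; the class and method names are hypothetical and are not taken from the neural network package used in this work.

```java
final class Neuron {
    /** Equation (1): weighted sum of the inputs minus the threshold theta. */
    static double netInput(double[] x, double[] w, double theta) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) sum += x[i] * w[i];
        return sum - theta;
    }

    /** Equation (2): sigmoid activation of the net input. */
    static double sigmoid(double netInput) {
        return 1.0 / (1.0 + Math.exp(-netInput));
    }
}
```

For example, inputs {1, 1} with weights {2, 3} and threshold 5 give a net input of 0, for which the sigmoid returns 0.5.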

To derive the back-propagation learning law, let us consider the three-layer network shown in Fig. 5. The indices i, j and k refer to neurons in the input, hidden and output layers, respectively. The symbol $w_{ij}$ denotes the weight of the connection between neuron i in the input layer and neuron j in the hidden layer, and the symbol $w_{jk}$ denotes the weight between neuron j in the hidden layer and neuron k in the output layer.

To propagate error signals, we start at the output layer and work backward to the hidden layer. The error signal at the output of neuron k at iteration t is defined by:

$$e_k(t) = Y_{d,k}(t) - Y_{a,k}(t) \tag{3}$$

where t = 1, 2, 3, ..., $Y_{d,k}(t)$ is the desired output of neuron k at iteration t, and $Y_{a,k}(t)$ is its actual output.

Neuron k, which is located in the output layer, is supplied with a desired output of its own. Hence we may use a straightforward procedure to update the weight $w_{jk}$:

$$w_{jk}(t+1) = w_{jk}(t) + \Delta w_{jk}(t) \tag{4}$$

where $\Delta w_{jk}(t)$ is the weight correction, given by:

$$\Delta w_{jk}(t) = \alpha \times y_j(t) \times \delta_k(t) \tag{5}$$

where α is the learning rate, $y_j(t)$ is the output of neuron j in the hidden layer, and $\delta_k(t)$ is the error gradient at neuron k in the output layer at iteration t.
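The text does not give an explicit expression for the output-layer error gradient. For the sigmoid activation of Eq. (2), the standard form, stated here as an assumption about what was used rather than as a confirmed detail of this system, is the derivative of the activation multiplied by the error of Eq. (3):

```latex
\delta_k(t) = Y_{a,k}(t)\,\bigl[1 - Y_{a,k}(t)\bigr] \times e_k(t)
```

The factor $Y_{a,k}(t)[1 - Y_{a,k}(t)]$ is the derivative of the sigmoid with respect to its net input, which is why the gradient vanishes when the output saturates near 0 or 1.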

To calculate the weight correction for the hidden layer, we can apply the same equation as for the output layer:

$$w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t) \tag{6}$$

where $\Delta w_{ij}(t)$ is the weight correction, given by:

$$\Delta w_{ij}(t) = \alpha \times x_i(t) \times \delta_j(t) \tag{7}$$

where $\delta_j(t)$ represents the error gradient at neuron j in the hidden layer:


$$\delta_j(t) = y_j(t)\,[1 - y_j(t)] \times \sum_{k=1}^{l} \delta_k(t)\, w_{jk}(t) \tag{8}$$

where l is the number of neurons in the output layer and

$$y_j(t) = \frac{1}{1 + e^{-X_j(t)}} \tag{9}$$

$$X_j(t) = \sum_{i=1}^{n} x_i(t) \times w_{ij}(t) - \theta_j \tag{10}$$

where n is the number of neurons in the input layer.
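To make the update rules concrete, one training iteration for a network with a single output neuron can be sketched in Java. This is a hedged sketch, not the implementation used in this work: the output-layer gradient is assumed to be the usual sigmoid form (actual × (1 − actual) × error), the thresholds are left unchanged for brevity, and every name in the block is hypothetical.

```java
final class BackpropStep {
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /**
     * One training iteration for a network with a single output neuron.
     * wIH[i][j] is the weight from input i to hidden j, wHO[j] from hidden j
     * to the output. Updates wIH and wHO in place and returns the error of
     * Eq. (3) measured before the update. Thresholds are not adjusted here.
     */
    static double train(double[] x, double desired, double[][] wIH, double[] thetaH,
                        double[] wHO, double thetaO, double alpha) {
        int n = x.length, m = wHO.length;
        double[] y = new double[m];
        for (int j = 0; j < m; j++) {                    // hidden layer, Eqs. (9)-(10)
            double net = -thetaH[j];
            for (int i = 0; i < n; i++) net += x[i] * wIH[i][j];
            y[j] = sigmoid(net);
        }
        double netO = -thetaO;                           // output neuron, Eqs. (1)-(2)
        for (int j = 0; j < m; j++) netO += y[j] * wHO[j];
        double actual = sigmoid(netO);

        double e = desired - actual;                     // Eq. (3)
        double deltaK = actual * (1.0 - actual) * e;     // assumed sigmoid gradient

        for (int j = 0; j < m; j++) {
            // Eq. (8) with l = 1, using the weight before it is corrected
            double deltaJ = y[j] * (1.0 - y[j]) * deltaK * wHO[j];
            wHO[j] += alpha * y[j] * deltaK;             // Eqs. (4)-(5)
            for (int i = 0; i < n; i++) {
                wIH[i][j] += alpha * x[i] * deltaJ;      // Eqs. (6)-(7)
            }
        }
        return e;
    }
}
```

Calling train repeatedly on the same pattern should shrink the returned error; training would stop once it falls below the desired error or the epoch limit is reached.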

In our work, we use a backpropagation neural network consisting of 42 neurons in the input layer, 30 neurons in the hidden layer and one neuron in the output layer for a character matrix of size 7 × 6. For a character matrix of size 12 × 9, the network consists of 108 input neurons, 80 neurons in the hidden layer and one neuron in the output layer.

The system recognizes a character if the output of the network is very close to the output value of one of the characters, within a certain acceptable tolerance. If the output is far from all the possible outputs, the system cannot identify the character. This process continues until the end of the text document. The entire operation of the system is summarized in the flow-chart shown in Fig. 6.
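The acceptance test on the single network output can be sketched as a nearest-target lookup. The Java sketch below is illustrative only: how each character is mapped to a target output value is not specified in the text, so the targets array, the tolerance parameter and the class name are all hypothetical.

```java
final class OutputDecoder {
    /**
     * Returns the index of the target value closest to the network output,
     * or -1 when even the closest target differs by more than the tolerance.
     */
    static int decode(double output, double[] targets, double tolerance) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < targets.length; i++) {
            double d = Math.abs(output - targets[i]);
            if (d < bestDist) { bestDist = d; best = i; }
        }
        return bestDist <= tolerance ? best : -1;        // -1: unrecognized character
    }
}
```

A return value of -1 corresponds to the case in which the system cannot identify the character.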

 

Fig. 6 Flow-chart of the recognition system

4. Experimental Results

We used the bswing1_0_beta package for Bangla text output and the neuralj-0.0.4 package to implement the backpropagation neural network in Java. The number of neurons in the hidden layer is always set to approximately (3/4)th of the number of neurons in the input layer. We use the 'PatternSet' class, which represents a set of patterns. The function 'addPattern(Pattern pattern)' is used to add the required patterns for all Bangla characters. The pattern for a Bangla character looks like:

    pattern_set.addPattern(new Pattern(
        "0;0;1;0;1;0;0;0;0;" +
        "0;0;1;0;0;1;0;0;0;" +
        "0;0;1;0;0;0;1;0;0;" +
        "0;0;1;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;",
        matrix_output_Str));

where 0;0;1;0;1 ... 0;0;0; is the input vector (one row of the 12 × 9 character matrix per line) and 'matrix_output_Str' is the output vector. We set the values of the following fields of the 'BackPropagation' class as:

Field               Value
'desired_error'     0.001
'maximum_epochs'    1000000000

Then the training of the backpropagation neural network starts. After the training, the system scans the Bangla paragraph image, tries to recognize its characters, and displays the correctly recognized characters as output. Fig. 7 shows a snapshot of the implemented method. Results for different types of sentences are given in Table 1.
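The semicolon-separated input vector shown in the pattern above can be produced from a character matrix with a small helper. This Java sketch is an assumption: it reproduces the textual format printed in this section (each cell followed by a semicolon, rows concatenated) and does not call the neuralj package; the class and method names are hypothetical.

```java
final class PatternString {
    /** Joins the matrix cells row by row as "v;v;...;" with a trailing semicolon. */
    static String encode(int[][] matrix) {
        StringBuilder sb = new StringBuilder();
        for (int[] row : matrix) {
            for (int cell : row) sb.append(cell).append(';');
        }
        return sb.toString();
    }
}
```

The resulting string would then be passed as the first argument when a pattern is created.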

[Fig. 6 flow-chart, in outline: input the image of the paragraph to be recognized; detect the boundary of the printed text document and segment it into characters; select the character matrix of size 7 × 6, or 12 × 9 for a compound character; set index = 0 and maximum_epochs = 1000000000; input the matrix to the ANN and calculate the output vector and error; if error ≤ 0.001, add the character to the output list, otherwise print "the character is unrecognized"; set index = index + 1 and repeat until the whole document is recognized.]


Fig. 7 Sample output of the proposed system

Table 1: Success rate for experimental results

Total no. of characters    Correctly recognized characters    Success rate (%)
165                        162                                98.18
288                        275                                95.49
356                        337                                94.66
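The success rates in Table 1 are the ratio of correctly recognized characters to the total number of characters, expressed as a percentage. A one-method Java sketch (the class and method names are hypothetical) makes the arithmetic explicit:

```java
final class SuccessRate {
    /** Percentage of correctly recognized characters. */
    static double percent(int correct, int total) {
        return 100.0 * correct / total;
    }
}
```

For instance, 162 correct characters out of 165 give 100 × 162 / 165, which rounds to the 98.18% reported in the first row.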

Fig. 8 Success rate graph of experimental results (total no. of characters, correctly recognized characters and success rate (%) for each test document)

5. Conclusion

In this paper, we proposed a recognition system with an emphasis on the segmentation phase. The proposed system is capable of successfully separating Bangla letters and digits from a printed document. It recognizes the segmented characters using a backpropagation neural network. The system sometimes fails to recognize composite characters, so the segmentation process should be improved to deal with them. In future, the proposed recognition system may be further improved by using a spell-checker.

References

[1] M. E. Hoque, M. J. H. Siddiqi, S. M. Kamruzzaman and M. S. Chowdhury, "Efficient Method of Size Independent Printed Bangla Paragraph Recognition Using ANN and Efficient Heuristics", Proceedings of International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, pp. 755-758 (2003).

[2] Rafael C. Gonzalez, Digital Image Processing, 2nd Edition, Pearson Education, New York, 2002.

[3] S. M. M. Rahman, S. M. Rahman and M. A. Rashid, "Kohonen Neural Network in Character Recognition Applications", Proceedings of NCCIS, pp. 106-110 (1997).

[4] M. R. Bashar, M. A. F. M. R. Hasan and M. F. Khan, "Bangla Off-Line Handwritten Size Independent Character Recognition Using Artificial Neural Networks Based on Windowing Technique", Proceedings of International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, pp. 351-354 (2003).


[5] M. F. Zibran, A. Tanvir, R. Shammi and Md. Abdus Sattar, "Computer Representation of Bangla Characters and Sorting of Bangla Words", Proceedings of International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, pp. 191-195 (2002).

[6] T. M. Ha and H. Bunke, "Off-line Handwritten Numerical Recognition by Perturbation Method", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 5, pp. 535-539 (May 1997).

[7] M. A. Sattar, K. Mahmud, H. Arafat and A. F. M. Noor-Uz-Zaman, "Segmenting Bangla Text for Optical Recognition", Proceedings of International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, pp. 283-286 (2003).

Md. Musfique Anwar completed his B.Sc (Engg.) in Computer Science and Engineering from the Dept. of CSE, Jahangirnagar University, Bangladesh in 2006. He is now a Lecturer in the Dept. of CSE, Jahangirnagar University, Savar, Dhaka, Bangladesh. His research interests include Artificial Intelligence, Neural Networks, Image Processing, Pattern Recognition, Software Engineering and so on.

Nasrin Sultana Shume completed her B.Sc (Engg.) in Computer Science and Engineering from the Dept. of CSE, Jahangirnagar University, Bangladesh in 2006. She is now a Lecturer in the Dept. of CSE, Green University of Bangladesh, Mirpur, Dhaka, Bangladesh. Her research interests include Artificial Intelligence, Neural Networks, Image Processing, Pattern Recognition, Database and so on.

P. K. M. Moniruzzaman received his B.Sc (Hons) in Electronics and Computer Science and M.S. in Computer Science and Engineering from the Dept. of CSE, Jahangirnagar University, Bangladesh. He successfully completed his post-graduate project on Image Processing under the supervision of Dr. Md. Al-Amin Bhuiyan. He is now working as a Database Administrator in a renowned commercial bank in Dhaka, Bangladesh. His main research interests include Natural Language Processing, Artificial Intelligence, Data Mining and so on.

Md. Al-Amin Bhuiyan received his B.Sc (Hons) and M.Sc. in Applied Physics and Electronics from the University of Dhaka, Dhaka, Bangladesh in 1987 and 1988, respectively. He received the Dr. Eng. degree in Electrical Engineering from Osaka City University, Japan, in 2001. He completed his postdoctoral research in Intelligent Systems at the National Informatics Institute, Japan. He is now a Professor in the Dept. of CSE, Jahangirnagar University, Savar, Dhaka, Bangladesh. His main research interests include Image Face Recognition, Cognitive Science, Image Processing, Computer Graphics, Pattern Recognition, Neural Networks, Human-machine Interface, Artificial Intelligence, Robotics and so on.
