text detection and recognition

Text Detection and Character Recognition from Images. Badruz Nasrin Bin Basri, 1051101534. Supervisor: Mohd Haris Lye Abdullah

Upload: badruz-basri

Post on 02-Dec-2014


DESCRIPTION

Text Detection and Recognition from Images using Matlab, Final year Project Presentations

TRANSCRIPT

Page 1: Text Detection and Recognition


Text Detection and Character Recognition from Images

Badruz Nasrin Bin Basri

1051101534  Supervisor : Mohd Haris Lye Abdullah

Page 2: Text Detection and Recognition

Contents

1. Introduction

2. Literature review

3. Method used

4. Experiment and result

5. Future works

Page 3: Text Detection and Recognition

Aims and objectives

The aim of this project is to detect, extract and recognize text from images, particularly car license plates.

Segmentation - Separate the text region into its individual characters.

Recognition - Recognize each character in the detected text region using a suitable algorithm.

Page 4: Text Detection and Recognition

Motivation

Text detection and recognition have many relevant applications in automatic indexing and information retrieval, such as document indexing, content-based image retrieval, and car license plate recognition, which in turn open up the possibility of more advanced systems.

Page 5: Text Detection and Recognition

Problem statement

1. To segment the image into individual characters, we need to find a characteristic that can serve as the boundary between characters.

2. To classify, we need to choose the best template to compare with each segmented character and to determine how the template will be compared with it.

Page 6: Text Detection and Recognition

Literature review

Segmentation

1. Kluever, in research on recognizing text in the PayPal HIP [1], proposed using vertical projection to segment the characters in images. Kluever gave several reasons for choosing vertical projection:
• The PayPal HIP image is computer generated, so only minor preprocessing is needed.
• There is an obvious separation between characters.
• The character size in the PayPal HIP is fixed.

2. The study on Malaysian license plate recognition by Othman [3] proposed a model specifically for detecting and recognizing the text on Malaysian license plates. For segmentation it proposed the connected components method; however, the method can be used only on single-row license plates, and the study covered only single-row plates. The same method was also used by Ganapathy and Lui [4].

Page 7: Text Detection and Recognition

Literature review

Recognition

1. Kluever, in research on recognizing text in the PayPal HIP [1], and Ho C. H. et al., in research on License Plate Recognition (LPR) [2], used template matching to recognize the characters in an image. Kluever divided the matching classifier into four types: Pixel Counting, Horizontal Projection, Vertical Projection and Template Correlation. Kluever's experiments showed that the best classifiers are Vertical Projection and Template Correlation, both of which yielded 100% accuracy.

2. The fixed font in the PayPal HIP image makes it very easy to distinguish between characters using template matching. No other study reviewed here reported 100% accuracy.

Page 8: Text Detection and Recognition

Literature review

Recognition

The study on Malaysian license plate recognition by Othman [3], Ganapathy and Lui [4], M. Fukumi et al. [6] and Anas J.A. Husain et al. [7] used neural networks to recognize the text. Compared with template matching, a neural network consumes more time. It also needs training before it can be used, and it achieves high accuracy only if the sample image closely resembles the training images. Andrew Vogt and Joe G. Bared [5] summarized the disadvantages of neural networks as follows:

1. Minimizing overfitting in neural networks requires a great deal of computational effort.

2. The individual relations between the input variables and the output variables are not developed by engineering judgment, so the model tends to be a black box or input/output table without an analytical basis, and the sample size has to be large to reach a high accuracy level.

Page 9: Text Detection and Recognition

Literature review

Recognition

Jared Hopkins and Tim Anderson [9] used the Fourier Descriptor to recognize text in images. In most studies, the Fourier descriptor has been used to recognize more complex shapes, such as logo classification by Folkers and Samet [10] and Sinhala script by Rohana, Ruvan and Kevin [11].

No research on LPR using the Fourier descriptor (FD) was found; hence, this research also tests the use of FD to recognize text on Malaysian license plates.

Wisam Al Faqheri and Syamsiah Mashohor, in A Real-Time Malaysian Automatic License Plate Recognition (M-ALPR) using Hybrid Fuzzy [12], used a hybrid fuzzy method to recognize the license number. Whereas almost all previous work on license plate detection relied on a single method such as template matching or a neural network, Wisam and Syamsiah proposed combining more than one method, chosen according to the type of license plate.

Page 10: Text Detection and Recognition

Approach - Flowchart to recognize the characters in image

[Flowchart. Stages: Preprocess, Segmentation, Recognition. Nodes: Start, Check Template.mat, Make template, Noise Removal, Erode, Profiler, Check Profile, Select, Segment, segment2, Classify, Corrector, End. Branch labels: Need more erosion, Single row plate, Double row plate.]

Page 11: Text Detection and Recognition

Make template

To create template.mat for use in classification:

……

36 images of characters, size = 42 X 24

Matrix size 24 X 42 X 36, saved as template.mat
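As a rough illustration of the template-building step, the sketch below stacks 36 character images into one 3-D array and saves it. It is only a sketch under assumptions: the slide gives both 42 X 24 and 24 X 42 X 36, and the code assumes 42 rows by 24 columns; the random placeholder images, the variable names and the output path are hypothetical.

```matlab
% Sketch: stack 36 character images into one 3-D array and save it.
numChars = 36;                 % assumed character set: 26 letters + 10 digits
nRows = 42; nCols = 24;        % assumed orientation: 42 rows by 24 columns
templates = false(nRows, nCols, numChars);
for k = 1:numChars
    % Placeholder image; a real script would load and binarize
    % the k-th character image here instead.
    img = rand(nRows, nCols) > 0.5;
    templates(:, :, k) = img;
end
outFile = fullfile(tempdir, 'template.mat');   % hypothetical output location
save(outFile, 'templates');
```

Each template is then available as templates(:,:,j), which is the indexing used by the recognition snippets on the later slides.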

Page 12: Text Detection and Recognition

Preprocess

[Figure: preprocessing pipeline with the stages Raw image, Noise filter, Resizing, Bounding, Complementing, Binarize, Preprocessed image]

Page 13: Text Detection and Recognition

Segmentation – Vertical Projections

Vertical projection analysis

[Figure: Preprocessed Image, Vertical Projection, Segmented Image]
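The idea behind vertical projection segmentation can be sketched as follows: sum the 'on' pixels in each column and cut the image wherever the projection drops to zero. This is a minimal sketch on a toy image, assuming a binary image with white ('on') characters; the variable names are illustrative and not taken from the project script.

```matlab
% Toy binary image: two "characters" separated by one empty column.
bw = [1 1 0 1 1 1;
      1 0 0 0 1 0;
      1 1 0 1 1 1];
vp = sum(bw, 1);                 % vertical projection: 'on' pixels per column
isText = vp > 0;                 % columns that contain part of a character
edges = diff([0 isText 0]);      % +1 where a run starts, -1 just after it ends
startCols = find(edges == 1);
endCols = find(edges == -1) - 1;
segments = cell(1, numel(startCols));
for k = 1:numel(startCols)
    segments{k} = bw(:, startCols(k):endCols(k));   % one cropped character
end
```

Because the cut points depend on completely empty columns, characters that touch or overlap horizontally end up merged into one segment.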

Page 14: Text Detection and Recognition


Segmentation – Vertical Projections 

Weaknesses

Image that failed to be segmented by vertical projection

Page 15: Text Detection and Recognition

Segmentation – Connected Components

Character segmentation involves the following steps:

1. Scan the image from left to right to find an ‘on’ pixel.

2. When an ‘on’ pixel is found, extract all ‘on’ pixels connected to it and segment them as one character.

3. Repeat the process until the right end of the image is reached.
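The scan-and-extract procedure can be sketched with a simple flood fill in base Matlab. This is an illustrative sketch, not the project's script: it assumes 8-connectivity and a toy image, and all variable names are made up for the example. The Image Processing Toolbox function bwlabel performs the same labelling directly.

```matlab
bw = logical([1 1 0 0 1;
              1 0 0 1 1;
              0 0 0 0 1]);
[nRows, nCols] = size(bw);
labels = zeros(nRows, nCols);    % 0 = background or not yet visited
nComp = 0;
for c = 1:nCols                  % scan left to right
    for r = 1:nRows
        if bw(r, c) && labels(r, c) == 0
            nComp = nComp + 1;   % a new character starts here
            labels(r, c) = nComp;
            stack = [r c];
            while ~isempty(stack)
                p = stack(end, :);
                stack(end, :) = [];
                for dr = -1:1    % visit the 8 neighbours of pixel p
                    for dc = -1:1
                        rr = p(1) + dr; cc = p(2) + dc;
                        if rr >= 1 && rr <= nRows && cc >= 1 && cc <= nCols ...
                                && bw(rr, cc) && labels(rr, cc) == 0
                            labels(rr, cc) = nComp;
                            stack(end + 1, :) = [rr cc];
                        end
                    end
                end
            end
        end
    end
end
```

After the loop, labels == k selects the pixels of the k-th character in left-to-right order of first appearance.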

Page 16: Text Detection and Recognition

Recognition - Pixel Counting

                        Source image   Template image 1   Template image 2   Template image 3
Number of ‘on’ pixels   649            624                652                639
allCorrs(j)                            0.97520            0.99702            0.99007

tempSum = sum(sum(templates(:,:,j)));
inSum = sum(sum(chars(:,:,i)));
allCorrs(j) = abs(tempSum - inSum);
allCorrs(j) = 1 - (allCorrs(j)/1008);   % 1008 = 42 x 24 pixels per template
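As a quick check of the arithmetic, the pixel-count score can be recomputed from the 'on'-pixel counts listed on the slide. The sketch below uses only those four counts; the vectorized form and the variable names are illustrative.

```matlab
inSum = 649;                      % 'on' pixels in the source character (from the slide)
tempSums = [624 652 639];         % 'on' pixels in the three templates (from the slide)
nPixels = 42 * 24;                % 1008 pixels per 42 x 24 template
allCorrs = 1 - abs(tempSums - inSum) / nPixels;
[~, best] = max(allCorrs);        % template whose pixel count is closest
fprintf('%.5f ', allCorrs);
fprintf('\n');
```

The scores agree with the slide's values to within rounding in the last digit. Since the score depends only on how many pixels are 'on' and not on where they are, different characters with similar stroke area score almost identically.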

Page 17: Text Detection and Recognition

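Corr2

$$r = \frac{\sum_m \sum_n (i_{mn} - \bar{i})(j_{mn} - \bar{j})}{\sqrt{\left(\sum_m \sum_n (i_{mn} - \bar{i})^2\right)\left(\sum_m \sum_n (j_{mn} - \bar{j})^2\right)}}$$

where $\bar{i}$ is the mean of the input matrix i and $\bar{j}$ is the mean of the input matrix j. r lies between -1 and 1. For matching, 1 means i and j are exactly the same, while 0 means i and j are not alike at all.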
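To make the correlation coefficient concrete, the sketch below computes the same 2-D correlation that corr2 (Image Processing Toolbox) returns, using only base Matlab. The anonymous function name corr2sketch and the toy matrix are assumptions for illustration.

```matlab
% 2-D correlation coefficient written out with base functions.
corr2sketch = @(a, b) sum(sum((a - mean(a(:))) .* (b - mean(b(:))))) / ...
    sqrt(sum(sum((a - mean(a(:))).^2)) * sum(sum((b - mean(b(:))).^2)));
A = [1 0 1;
     0 1 0;
     1 0 1];
rSame = corr2sketch(A, A);          % identical matrices
rInverse = corr2sketch(A, 1 - A);   % complemented (inverted) image
```

Identical inputs give a coefficient of 1 and the complemented image gives -1, so negative values are possible when a candidate is closer to the inverse of a template than to the template itself.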

Page 18: Text Detection and Recognition

Recognition - Vertical projections

[Figure: source image, template images and their vertical projections]

               Template image 1   Template image 2   Template image 3
allCorrs(j)    0.90077            0.90721            0.63654

tempVP = sum(templates(:,:,j));
inVP = sum(chars(:,:,i));
allCorrs(j) = corr2(tempVP, inVP);

Page 19: Text Detection and Recognition

Recognition - Horizontal projections

[Figure: source image, template images and their horizontal projections]

               Template image 1   Template image 2   Template image 3
allCorrs(j)    0.90077            0.73379            0.45380

tempHP = sum(templates(:,:,j)');
inHP = sum(chars(:,:,i)');
allCorrs(j) = corr2(tempHP, inHP);
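The two projection profiles differ only in the direction of the sum. The small sketch below, on a made-up 4 x 3 matrix, shows what each expression returns and an equivalent way of writing the horizontal projection without transposing the image.

```matlab
A = [1 1 0;
     1 0 0;
     1 1 1;
     0 0 1];
vp = sum(A);          % vertical projection: one value per column (1 x 3)
hp = sum(A');         % horizontal projection as written on the slide (1 x 4)
hp2 = sum(A, 2)';     % same profile using the dimension argument of sum
```

Passing the dimension explicitly (sum(A, 1) or sum(A, 2)) is slightly safer, because sum without a dimension argument adds along the row when the input happens to have a single row.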

Page 20: Text Detection and Recognition

Recognition - Template Correlations

[Figure: source image and template images]

               Template image 1   Template image 2   Template image 3
allCorrs(j)    0.82011            0.57395            0.43850

temp = templates(:,:,j);
in = chars(:,:,i);
allCorrs(j) = corr2(temp, in);
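Putting the pieces together, classification amounts to scoring the input against every template and keeping the best score. The sketch below is a minimal illustration with three hypothetical 4 x 3 templates and hypothetical labels; the correlation is written out in base Matlab so that it runs without the toolbox function corr2.

```matlab
% Hypothetical data: three small templates and an input that is
% template 2 with one extra 'on' pixel.
templates = zeros(4, 3, 3);
templates(:, :, 1) = [1 1 1; 1 0 1; 1 0 1; 1 1 1];
templates(:, :, 2) = [0 1 0; 0 1 0; 0 1 0; 0 1 0];
templates(:, :, 3) = [1 1 1; 0 0 1; 0 1 0; 1 1 1];
templateLabels = 'O1Z';               % hypothetical label of each template
inChar = templates(:, :, 2);
inChar(1, 1) = 1;                     % simulated noise pixel

nT = size(templates, 3);
allCorrs = zeros(1, nT);
b = inChar - mean(inChar(:));
for j = 1:nT
    t = templates(:, :, j);
    a = t - mean(t(:));
    % Same quantity as corr2(t, inChar).
    allCorrs(j) = sum(a(:) .* b(:)) / sqrt(sum(a(:).^2) * sum(b(:).^2));
end
[bestScore, bestIdx] = max(allCorrs);
recognized = templateLabels(bestIdx);
```

The same loop works for the pixel-count and projection scores: only the line that fills allCorrs(j) changes.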

Page 21: Text Detection and Recognition

Recognition – Fourier Descriptor

The following are the detailed steps for extracting and comparing the Fourier Descriptor (FD):

1. Edging

$$\tilde{U} = \begin{pmatrix} x_0 & y_0 \\ x_1 & y_1 \\ \vdots & \vdots \\ x_n & y_n \end{pmatrix}$$

Page 22: Text Detection and Recognition

Recognition – Fourier Descriptor

2. Extracting FD – A 1-D Discrete Fourier Transform (DFT) is applied to the complex vector to obtain the frequency-domain representation of the boundary, using the following equation:

$$\tilde{F}_\mu = FFT[\tilde{U}] = \sum_{k=0}^{N-1} \tilde{U}_k \exp\left(-i \frac{2\pi}{N} k \mu\right)$$
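A minimal sketch of the extraction step: the boundary coordinates are combined into one complex number per point and passed to fft. The unit-square boundary is a hypothetical example shape, and combining the coordinates as x + iy is the usual convention, assumed here.

```matlab
% Boundary of a unit square traced counter-clockwise (hypothetical shape).
x = [0 1 1 0];
y = [0 0 1 1];
U = x + 1i * y;      % complex boundary vector, one entry per boundary point
F = fft(U);          % F(mu+1) = sum over k of U(k+1) * exp(-2*pi*1i*k*mu/N)
Urec = ifft(F);      % the inverse transform recovers the boundary
```

The first coefficient F(1) is N times the centroid of the boundary points, which is why it carries the position of the shape.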

Page 23: Text Detection and Recognition


Recognition – Fourier Descriptor

3. Normalize FD :

Translation invariant

Scale invariant

Rotation invariant
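The normalization formulas themselves are not preserved in this transcript. The sketch below shows one common convention for obtaining the three invariances, offered only as an assumption and not necessarily the formulas used in the project: drop the first coefficient (translation), divide by the magnitude of the second coefficient (scale), and keep only magnitudes (rotation and starting point). The name normalizeFD and the test shape are hypothetical.

```matlab
% One common normalization convention (an assumption, see the text above).
normalizeFD = @(F) abs([0, F(2:end)]) / abs(F(2));

U = [0, 2, 2 + 1i, 1i];                      % rectangle boundary (hypothetical)
U2 = 3 * exp(1i * pi / 5) * U + (4 - 2i);    % same shape scaled, rotated, shifted
fd1 = normalizeFD(fft(U));
fd2 = normalizeFD(fft(U2));
```

Both descriptors come out equal, because shifting the shape changes only the first coefficient, scaling multiplies every coefficient by the same factor, and rotation changes only the phases. Keeping magnitudes alone also discards phase information, so some distinct shapes become harder to tell apart.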

Page 24: Text Detection and Recognition

Recognition – Fourier Descriptor

4. Resize FD – Because the FD holds information about every ‘on’ pixel, its size equals the number of ‘on’ pixels. To make it comparable with other FDs, it needs to be resized to a predefined number of descriptors. Figure 3.9 shows the shape reconstructed using different numbers of descriptors. To resize the FD to n descriptors, the spectrum is centred with the Matlab function fftshift and the high-frequency descriptors are discarded, leaving only the n lowest-frequency descriptors.

Images of ‘E’ reconstructed from (a) n = 4 (b) n = 8 (c) n = 10 (d) n = 15 (e) n = 25 (f) n = 30 (g) n = 278
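The resizing step can be sketched as follows, assuming the descriptors kept are the n lowest-frequency ones around the centre of the shifted spectrum. The ellipse boundary, the choice n = 4 and the variable names are illustrative assumptions.

```matlab
N = 16;
k = 0:N-1;
U = cos(2*pi*k/N) + 1i * 0.5 * sin(2*pi*k/N);   % ellipse boundary (hypothetical)
Fs = fftshift(fft(U));        % zero frequency moved to index N/2 + 1
n = 4;                        % number of descriptors to keep (assumed even)
c = N/2 + 1;                  % index of the zero-frequency term after the shift
keep = (c - n/2):(c + n/2 - 1);
Fn = Fs(keep);                % resized FD with n entries

% Reconstruct the boundary from the n kept descriptors only.
Fpad = zeros(1, N);
Fpad(keep) = Fn;
Urec = ifft(ifftshift(Fpad));
```

An ellipse contains only the two lowest non-zero frequencies, so four descriptors already reconstruct it exactly. A character such as 'E' needs more descriptors before its corners and strokes appear, which is what the sequence of reconstructions illustrates.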

Page 25: Text Detection and Recognition

Recognition – Fourier Descriptor

5. Compare FD – CompareFD is a measure of the difference between two input FDs. Higher values of the difference mean the two FDs are further apart in shape. The extracted FD, I, can be compared with the template FDs, T, using the following algorithm:

CompareFD(I, T)

D ← ø

for each template T_k in T

do diff ← -1

if length(T_k) = length(I)

do diff ← sum(difference between I and T_k)

D_k ← diff

return k such that D_k is the smallest non-negative value
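A runnable sketch of the comparison is given below. The exact difference measure is not recoverable from the slide, so the sum of absolute differences used here is an assumption, as are the example descriptors and variable names.

```matlab
I = [0 1.0 0.2 0.1];                                   % hypothetical input FD
T = {[0 1.0 0.9 0.4], [0 1.0 0.25 0.1], [0 1.0 0.2]};  % hypothetical template FDs
D = -ones(1, numel(T));                  % -1 marks "not comparable"
for k = 1:numel(T)
    if numel(T{k}) == numel(I)
        D(k) = sum(abs(I - T{k}));       % assumed measure: sum of absolute differences
    end
end
valid = find(D >= 0);                    % ignore templates of a different length
[~, pos] = min(D(valid));
best = valid(pos);                       % index of the closest template
```

Templates whose length differs from the input keep the value -1 and are skipped, which is why every FD is first resized to the same number of descriptors.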

Page 26: Text Detection and Recognition

Recognition – Heuristic Filter

Context Approach

After considering all the data in the database, it was concluded that:

• The first two characters are always letters.
• The third character can be a letter or a number.
• All following characters are always numbers.
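These positional rules can be applied as a mask on the classifier scores before the best match is picked. The sketch below assumes the 36 templates are ordered A to Z followed by 0 to 9 and uses made-up scores in which '8' slightly beats 'B'; all names and numbers are hypothetical.

```matlab
templateLabels = ['A':'Z', '0':'9'];       % assumed order of the 36 templates
allCorrs = zeros(1, 36);
allCorrs(templateLabels == 'B') = 0.80;    % hypothetical scores
allCorrs(templateLabels == '8') = 0.85;
isLetter = templateLabels >= 'A' & templateLabels <= 'Z';

position = 1;                              % first character on the plate
if position <= 2
    allowed = isLetter;                    % first two characters: letters only
elseif position == 3
    allowed = true(1, 36);                 % third character: letter or number
else
    allowed = ~isLetter;                   % remaining characters: numbers only
end
scores = allCorrs;
scores(~allowed) = -Inf;                   % rule out classes the context forbids
[~, idx] = max(scores);
recognized = templateLabels(idx);
```

Without the mask the higher-scoring '8' would win; with it, the first position resolves to 'B'. The third position gains nothing from this filter, since both classes are allowed there.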

Page 27: Text Detection and Recognition


Recognition – Heuristic Filter

Euler number - Euler number is equal to the number of connected elements (always equal to one) minus the number of holes.
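For a single character, the Euler number is therefore 1 for shapes without holes, 0 for one hole and -1 for two holes. The sketch below computes it in base Matlab by counting 2 x 2 pixel patterns (the bit-quad method, 8-connectivity); the toy shapes are hypothetical. The Image Processing Toolbox function bweuler returns the same quantity.

```matlab
% Toy shapes: solid block (no hole), ring (one hole), two stacked rings (two holes).
shapes = {ones(3), ...
          [1 1 1; 1 0 1; 1 1 1], ...
          [1 1 1; 1 0 1; 1 1 1; 1 0 1; 1 1 1]};
eulerNums = zeros(1, numel(shapes));
for s = 1:numel(shapes)
    bw = shapes{s};
    P = zeros(size(bw) + 2);             % pad with a background border
    P(2:end-1, 2:end-1) = bw;
    a = P(1:end-1, 1:end-1); b = P(1:end-1, 2:end);   % the four corners of
    c = P(2:end, 1:end-1);   d = P(2:end, 2:end);     % every 2 x 2 window
    q = a + b + c + d;
    Q1 = sum(q(:) == 1);                 % windows with exactly one 'on' pixel
    Q3 = sum(q(:) == 3);                 % windows with exactly three 'on' pixels
    QD = sum(q(:) == 2 & a(:) == d(:));  % windows with two diagonal 'on' pixels
    eulerNums(s) = (Q1 - Q3 - 2 * QD) / 4;
end
```

As a filter, the Euler number separates groups such as {B, 8} from {A, D, O, 0} and from hole-free characters, but it cannot distinguish members within one group.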

Page 28: Text Detection and Recognition

Recognition – Heuristic Filter

Notice that for ‘H’, the x line passes through 2 white lines and the y line passes through 2 white lines; for ‘W’, x passes through 3 white lines and y passes through 2 white lines, while ‘M’ is the opposite.
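Counting how many white strokes a scan line crosses reduces to counting runs of 'on' pixels along one row or column. The sketch below shows this on a crude 5 x 5 'H'; the bitmap and the positions of the scan lines are assumptions, since the slide's figure is not available, so the counts are not meant to reproduce the numbers quoted above.

```matlab
% Crude 5 x 5 'H' (hypothetical bitmap).
H = [1 0 0 0 1;
     1 0 0 0 1;
     1 1 1 1 1;
     1 0 0 0 1;
     1 0 0 0 1];
% Number of separate runs of 'on' pixels along a vector.
countRuns = @(v) sum(diff([0, v(:)', 0]) == 1);
rowCross = countRuns(H(2, :));   % horizontal scan line through row 2
colCross = countRuns(H(:, 3));   % vertical scan line through column 3
```

The count depends on where the scan line is placed, so a filter of this kind needs fixed line positions relative to the character box.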

Page 29: Text Detection and Recognition

Experiment 1: Comparison between Different Segmentation Method and Different Templates Matching Classifier

1. A template of size 42 X 24 was created using images of 36 characters.

2. To conduct the experiment, all 125 images were renamed with their ground truth and saved in a folder.

3. A Matlab script, included in the appendices, was created to load all the images in the folder together with their names and then perform preprocessing, segmentation and recognition on all of them.

4. The results of the segmentation and recognition, as well as the time needed to recognize a number plate, were then recorded and calculated using the following equations. The results are also analyzed automatically by the Matlab script.

Some of the images used in the experiment

Page 30: Text Detection and Recognition

Experiment 1: Comparison between Different Segmentation Method and Different Templates Matching Classifier

5. Segmentation accuracy was calculated using the formula:

$$A_s = \frac{N_{cs}}{N_c} \times 100\%$$

where $A_s$ is the segmentation accuracy, $N_{cs}$ is the number of correctly segmented characters and $N_c$ is the number of characters in the sample.

6. Classification accuracy was calculated using the formula:

$$A_r = \frac{N_{cr}}{N_{cs}} \times 100\%$$

where $A_r$ is the recognition accuracy, $N_{cr}$ is the number of correctly recognized characters and $N_{cs}$ is the number of characters that had been correctly segmented.

7. Average recognition time was calculated using the formula:

$$T_{avg} = \frac{T_{total}}{N_{img}}$$

where $T_{avg}$ is the average recognition time, $T_{total}$ is the total running time to recognize all sample images and $N_{img}$ is the number of sample images.
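As a worked example of the accuracy calculation, the sketch below recomputes the segmentation accuracy of vertical projection from the counts reported in the results (76 correct out of 125). The helper name accuracyPct is hypothetical.

```matlab
accuracyPct = @(nCorrect, nTotal) 100 * nCorrect / nTotal;   % accuracy in percent
nCorrectSeg = 76;     % correctly segmented by vertical projection (from the results)
nTotal = 125;         % sample size (from the results)
segAccuracy = accuracyPct(nCorrectSeg, nTotal);
fprintf('%.2f%%\n', segAccuracy);
```

The same helper applies to recognition accuracy, with the number of correctly segmented characters as the denominator, so segmentation failures do not count against the classifier.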

Page 31: Text Detection and Recognition

Experiment 1: Comparison between Different Segmentation Method and Different Templates Matching Classifier

8. The experiment was then repeated four times using connected components as the segmentation method and each of the following as the recognition classifier:
• Pixel Count
• Vertical Projection
• Horizontal Projection
• Templates Correlation

Page 32: Text Detection and Recognition

Result

Comparison on segmentation by Vertical Projection and Connected Components

Segmentation Method      Vertical Projection   Connected Component
Sample size              125                   125
Correctly segmented      76                    125
Segmentation accuracy    60.80%                100%

Comparison on Template Matching using different classifier

Method                     Pixel Count   Vertical Projection   Horizontal Projection   Templates Correlation
Average recognition time   0.3839        0.4109                0.4115                  0.5112
Recognition accuracy       1.56%         25.25%                37.42%                  76.45%

Page 33: Text Detection and Recognition

Experiment 2: Comparison between Template Correlation and Fourier Descriptor

1. The experiment was repeated two times using connected components as the segmentation method and each of the following as the recognition classifier:
• Templates Correlation
• Fourier Descriptor

Page 34: Text Detection and Recognition

Result

Comparison on Recognition by Templates Correlation and Fourier Descriptor

Method                     Templates Correlation   Fourier Descriptor
Average recognition time   0.4704                  6.0175
Recognition accuracy       78.35%                  52.32%

Page 35: Text Detection and Recognition

Experiment 3: Improvement on LPR Using Context Approach

1. The experiment was repeated with context introduced into the algorithm.

Result

Comparison on recognition by Templates Correlation after context was introduced

Method                     Templates Correlation without Context Approach   Templates Correlation with Context Approach
Average recognition time   0.4704                                           0.4419
Recognition accuracy       78.35%                                           90.08%

Page 36: Text Detection and Recognition

Experiment 4: Improvement on LPR Using Hybrid method

1. The experiment was repeated with the hybrid method introduced into the algorithm.

Result

Comparison on recognition by Templates Correlation after the hybrid method was introduced

Method                     Templates Correlation without Context Approach   Fourier Descriptor with Context Approach   Hybrid Method
Average recognition time   0.4704                                           0.4419                                     0.8494
Recognition accuracy       78.35%                                           90.08%                                     98.46%

Page 37: Text Detection and Recognition

Discussion

[Figure: original image, preprocessed image, segmented image, and preprocessed image after erosion]

Image that failed to be segmented using connected components

Page 38: Text Detection and Recognition

Why did the heuristic filters fail?

Image that failed to be recognized due to a change in Euler number

Page 39: Text Detection and Recognition

Why did the Fourier Descriptors fail?

Image that failed to be recognized due to an unsmoothed image

Image that failed to be recognized due to rotation invariance

Page 40: Text Detection and Recognition

Conclusion

The objective of this paper, to segment and recognize characters in images, has been achieved. Even though the segmentation accuracy in the experiment is 100%, the result in real applications may be lower because of the limited set of pictures used in the experiment. Nevertheless, the experiment shows that connected components was the best of the segmentation methods tested.

After several experiments to find the method that recognizes the characters with the highest accuracy in a reasonable amount of time, the best approach is to use template correlation as the main recognition method, with the Fourier Descriptor and several heuristic approaches as filters. In the experiments, this method reached a recognition accuracy of 98.46%.

Page 41: Text Detection and Recognition

References

[1] Kurt Alfred Kluever. (2008) Digital Media Library: RIT Scholars. [Online]. https://ritdml.rit.edu/bitstream/handle/1850/7813/KKlueverTechPaper05-20-2008.pdf

[2] C. H. Ho, S. B. Koay, M. H. Lee, M. Moghavvemi, and M. Tamjis, "License Plate Recognition (Software)," Universiti Malaya, No Date.

[3] Othman Khalifa, Sheroz Khan, and Rafiqul Islam, "Malaysian Vehicle License Plate Recognition," The International Arab Journal of Information Technology, pp. 359-364, 2007.

[4] Velappa Ganapathy and Wen Lik Dennis Lui, "A Malaysian Vehicle License Plate Localization and Recognition System," Monash University Malaysia.

[5] Office of Safety and Traffic, Operations Research and Development, USA. (2010, February) Literature Review: Artificial Neural Network. [Online]. http://www.tfhrc.gov/safety/98133/ch02/body_ch02_05.html

[6] M. Fukumi, Y. Takeuchi, and M. Khalid, "Neural Network Based Threshold Determination for Malaysia License Plate Character Recognition," Universiti Teknologi Malaysia, No Date.

Page 42: Text Detection and Recognition

[7] K. Saleh Ali Al-Omari, Putra Sumari, A. Sadik Al-Taweel, and J.A. Anas Hussain, "Digital Recognition using Neural Network," Journal of Computer Science, vol. 5, no. 6, pp. 427-434, 2009.

[8] Velappa Ganapathy and Leong Liew Kok, "Handwritten Character Recognition Using Multiscale Neural Network Training Technique," World Academy of Science, Engineering and Technology, vol. 39, 2008.

[9] Jared Hopkins and Tim Anderson, "A Fourier Descriptor Based Character Recognition Engine Implemented under the Gamera Open-Source Document Processing Framework," No Date.

[10] Andre Folkers and Hanan Samet, "Content-based Image Retrieval Using Fourier Descriptors on a Logo Database," in Proc. of the 16th Int. Conf. on Pattern Recognition, vol. III, Quebec City, Canada, 2002, pp. 521-524.

[11] Rajapakse K Rohana, Ruvan A Weerasinghe, and Kevin E Seneviratne, "A Neural Network Based Character Recognition System For Sinhala Script," Department of Statistics and Computer Science, University of Colombo, Colombo, No Date.

[12] Wisam Al Faqheri and Syamsiah Mashohor, "A Real-Time Malaysian Automatic License Plate Recognition (M-ALPR) using Hybrid Fuzzy," IJCSNS International Journal of Computer Science and Network Security, vol. 9, no. 2, pp. 333-340, February 2009.
