PROGRAMMING CLUB PROJECT SUMMER 2013

Upload: melanie-patterson

Post on 18-Dec-2015


PROGRAMMING CLUB PROJECT

SUMMER 2013

OPTICAL CHARACTER RECOGNITION

Team Members:
1. Abhishek Srivastava
2. Rishav Choudhary
3. Shivam Sahu

Mentored by Karan Singh

INTRODUCTION

An OCR system converts the text in images into editable formats.

OCR has evolved over the years and has found numerous applications in various fields.

Digitization of printed texts in libraries is one example of an application of OCR.

TRAINING DATA PREPARATION

We took images of the characters a-z, A-Z, 0-9, ( )

– Inclination of images ranges from to

10 different fonts and their italicized forms were taken.

15 extra images for normal orientation were taken.

Hence 230 images for each of 65 characters.

TOTAL 65 x 275 = 17875 ELEMENTS IN THE TRAINING SET

CHARACTER EXTRACTION

IMAGE PRE-PROCESSING

Read the image, convert it to grayscale and adjust its contrast.

Convert it to a binary image and apply Connected Component Analysis to its inverse.

Find the bounding box of each component and the average box area, then remove the components whose area is less than a quarter of the average area.

Example Output after noise removal
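The pre-processing steps above can be sketched roughly as follows. This is only an illustration in plain Python, not the team's code: it assumes the image is already a grayscale grid of 0-255 values with contrast adjusted, and the fixed threshold of 128, the 8-connectivity and all function names are assumptions.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Inverse binary image: 1 for ink (dark pixels), 0 for background.
    The threshold value is an assumption."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def connected_components(binary):
    """Find 8-connected ink regions; return boxes as (top, left, bottom, right)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def box_area(box):
    top, left, bottom, right = box
    return (bottom - top + 1) * (right - left + 1)

def remove_noise(boxes):
    """Drop boxes whose area is below a quarter of the average box area."""
    if not boxes:
        return []
    average = sum(box_area(b) for b in boxes) / len(boxes)
    return [b for b in boxes if box_area(b) >= average / 4]
```

Calling `remove_noise(connected_components(binarize(gray)))` on a page would then leave only the character-sized boxes.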

IMAGE PROCESSING (Cont.)

Sort the boxes by the location of their upper-left corner, first line-wise and then, within every line, column-wise.

Find out whether each letter is the end of a word or a sentence and save that information.

For every component, extract the box containing it and resize it to 32 x 32 (same aspect ratio, with extra blank rows/columns added on both sides).
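One possible way to do the sorting and the end-of-word detection is sketched below. The slides do not say how lines are grouped or how a word gap is detected, so the rules here are assumptions: a box joins a line when its vertical centre falls inside that line, and a gap wider than `gap_factor` times the average box width ends a word. The sketch also treats the end of a text line as the "sentence" break.

```python
def sort_reading_order(boxes):
    """Group boxes (top, left, bottom, right) into text lines,
    then sort each line from left to right."""
    lines = []
    for box in sorted(boxes, key=lambda b: b[0]):  # top to bottom
        top, _, bottom, _ = box
        centre = (top + bottom) / 2
        for line in lines:
            # Same line if the box centre lies within the line's vertical extent.
            if line["top"] <= centre <= line["bottom"]:
                line["boxes"].append(box)
                line["top"] = min(line["top"], top)
                line["bottom"] = max(line["bottom"], bottom)
                break
        else:
            lines.append({"top": top, "bottom": bottom, "boxes": [box]})
    return [sorted(line["boxes"], key=lambda b: b[1]) for line in lines]

def mark_breaks(lines, gap_factor=0.5):
    """Return (box, flag) pairs; flag is "" (inside a word),
    " " (end of word) or a newline (end of line)."""
    result = []
    for line in lines:
        average_width = sum(b[3] - b[1] + 1 for b in line) / len(line)
        for i, box in enumerate(line):
            if i == len(line) - 1:
                flag = "\n"
            else:
                gap = line[i + 1][1] - box[3] - 1  # blank columns to the next box
                flag = " " if gap > gap_factor * average_width else ""
            result.append((box, flag))
    return result
```

The flags saved here are what a later output step can use to put spaces and line breaks back into the recognised text.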

32 x 32 BINARY AND ITS INVERSE
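A minimal sketch of the 32 x 32 normalisation, assuming nearest-neighbour scaling and centred padding (the slides only say that the aspect ratio is kept and blank rows/columns are added on both sides). Function names are illustrative.

```python
def resize_with_padding(glyph, size=32):
    """Scale a binary glyph (list of rows) so its longer side equals `size`,
    keep the aspect ratio, and centre it on a blank size x size canvas."""
    h, w = len(glyph), len(glyph[0])
    scale = size / max(h, w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    # Nearest-neighbour resampling (an assumption; any interpolation would do).
    scaled = [
        [glyph[min(h - 1, int(r / scale))][min(w - 1, int(c / scale))]
         for c in range(new_w)]
        for r in range(new_h)
    ]
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas = [[0] * size for _ in range(size)]
    for r in range(new_h):
        for c in range(new_w):
            canvas[top + r][left + c] = scaled[r][c]
    return canvas

def invert(binary):
    """Swap ink and background in a 0/1 image."""
    return [[1 - px for px in row] for row in binary]
```

A tall glyph therefore gets blank columns on its left and right, and a wide glyph gets blank rows above and below.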

CLASSIFICATION

CHARACTER RECOGNITION FROM IMAGES

ALGORITHMS TRIED
◦ Decision Tree using ID3
◦ K-Nearest Neighbour
◦ Regularised Logistic Regression
◦ Neural Network

FINAL DECISION
◦ Neural Network, because of its better training and testing accuracy

Decision Tree using ID3

Training accuracy of 79% on the CAPITAL LETTER DATASET.

Accuracy dropped to 26% when all of a-z, A-Z and 0-9 were included in the dataset.

Regularised Logistic Regression

Training accuracy of 99.7% on a dataset of 275 x 65 training examples.

Testing accuracy ~80% (because of the use of i without its dot).

Neural Network

Used with an input layer of 1024 elements, one hidden layer of 200 elements and an output layer of 65 elements.

Used with the inverse of the binary image.

Training set accuracy of 99.83% on a dataset of 275 x 65 characters.
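To make the layer sizes concrete, here is a rough forward-pass sketch of a 1024-200-65 network. Only the layer sizes come from the slides. The sigmoid activation, the bias terms and the random initial weights are assumptions, and training is not shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """Small random weights; each row is one unit: a bias followed by n_in weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward_layer(weights, inputs):
    """Activation of every unit in one layer."""
    return [
        sigmoid(row[0] + sum(w * x for w, x in zip(row[1:], inputs)))
        for row in weights
    ]

def forward(network, pixels):
    """Feed a 1024-element pixel vector through all layers in turn."""
    activations = pixels
    for layer in network:
        activations = forward_layer(layer, activations)
    return activations

rng = random.Random(0)
# 1024 inputs -> 200 hidden units -> 65 output units (one per character class)
network = [init_layer(1024, 200, rng), init_layer(200, 65, rng)]
scores = forward(network, [0] * 1024)  # 65 scores, one per class
```

With trained weights in place of the random ones, the largest of the 65 scores would indicate the recognised character.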

TEST DATA PREPARATION

Take every box (32 x 32) and reshape it into a row vector of 1 x 1024.

Hence make a test matrix of m x 1024 elements, one row per extracted box.
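The reshaping step might look like this in plain Python. Row-major flattening is an assumption: the slides do not give the pixel order, and whatever order is used has to match the one used for the training set.

```python
def flatten(box):
    """Turn a 32 x 32 grid into one row of 1024 values (row-major order)."""
    return [px for row in box for px in row]

def build_test_matrix(boxes):
    """Stack m flattened boxes into an m x 1024 matrix (a list of rows)."""
    return [flatten(box) for box in boxes]
```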

PREDICTION

Pass each row vector through the already trained Neural Network and predict the index of the maximum in the output layer.

Cross-check with the aspect ratio to minimise possible errors such as the similarity of ‘o’ and ‘O’.
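A sketch of the two prediction steps follows. The argmax part is as described on the slide. The cross-check rule is hypothetical, since the slides do not say how the aspect ratio was used: here a label is swapped for its look-alike when the look-alike's expected width/height ratio is closer to the measured one. The `confusable` and `expected_ratio` tables are supplied by the caller and are illustrative.

```python
def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def predict(outputs, labels):
    """Map the strongest output unit to its character label."""
    return labels[argmax(outputs)]

def cross_check(label, box_ratio, confusable, expected_ratio):
    """Hypothetical rule: prefer the look-alike character when its expected
    width/height ratio is closer to the ratio of the original box."""
    other = confusable.get(label)
    if other is None:
        return label
    if abs(expected_ratio[other] - box_ratio) < abs(expected_ratio[label] - box_ratio):
        return other
    return label

# Illustrative values only; real ratios would be measured from the training fonts.
labels = ["o", "O", "c"]
confusable = {"o": "O", "O": "o"}
expected_ratio = {"o": 0.95, "O": 0.80}
guess = predict([0.2, 0.7, 0.1], labels)
final = cross_check(guess, 0.93, confusable, expected_ratio)
```

Note that the ratio has to be taken from the original box, before it is padded to 32 x 32, because the padded image is always square.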

OUTPUT

Output the predicted data, together with the newline/new word index created earlier, to a txt file using a loop and key.
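A small sketch of the output step, assuming the saved index is a per-character flag that is empty inside a word, a space at the end of a word and a newline at the end of a line. The slides do not describe the actual format of the index or the "key", so this layout is an assumption.

```python
def assemble_text(chars, flags):
    """Join predicted characters, adding the saved break flag after each one."""
    return "".join(ch + flag for ch, flag in zip(chars, flags))

def write_output(path, chars, flags):
    """Write the assembled text to a txt file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(assemble_text(chars, flags))
```

For example, the characters of "hithere" with a space flag after the second letter and a newline flag after the last one come out as "hi there" followed by a line break.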

Sample Input

Corresponding output

ue headmaSter a wizened owl l1ke man screamed With whose permiSsion did you enter the building7 Kindly go out Or I shall Send for the police WiS was received with howling Jeering and hooting And follow1ng it tableS and benches were overturned and broken and window panes were smaShed MoSt of the Board School boyS merged with the crowd A hew howeve1 stood apart uey were ErSt invited to come out but when they Showed reluctance they were dragged out SwaminathanS part In all thiS waS by no means negligIble It waS he who Shouted Ne will spit on the polIce (though it waS drowned in the din) when the headmaster mentioned the police he mention of the police had Sent his blood boiling What brazenness what shameleSSneSS to talk of Police the nefarious agentS of the LancaShire thumb cutters When the pandemonIum started he was behind no one In deStroying the School Surniture With tremendous Joy he diScovered that there were many glaSS paneS untouched yet His craving to break them could not be fully SatisEed in his own School He ran round collecting ink bottleS and hung them one by one at every pane that caught hiS eye When the Board School boys were dragged out he felt that he could not do much in that line most of the boys being as big as himself On the 8aSh of a br1ght idea he wriggled through the crowd and looked for the Infant StandardS uere he found little children huddled together and shIverIng with fright He charged Into thIS crowd with such ferocity that the children Scattered about Stumbling and

RESULT

We got approximately 10% errors on a page of a scanned document.

Most of them were due to the presence of stuck (touching) characters.

There were a few common confusions, such as those between ‘i’ and ‘1’ and between ‘r’ and ‘t’, and some between the upper and lower case of the same letter.

The above-mentioned errors were reduced to a great extent by the aspect ratio analysis.
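The slides do not say how the error percentage was measured. One common way to put a number on it, shown here only as an assumption, is the character error rate: the edit distance between the recognised text and the correct text, divided by the length of the correct text.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def character_error_rate(recognised, reference):
    """Edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(recognised, reference) / len(reference)
```

As a small check, `character_error_rate("l1ke", "like")` is 0.25, since one of the four characters is wrong.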

POSSIBLE SOLUTIONS AND IMPROVEMENT

We have written code that successfully splits stuck characters on the basis of a confidence measure derived from the k-NN distance calculation. It is computationally very expensive, though.

We have also written rough code that can segment the sentences in cases of larger skew.

If we include the punctuation marks in the training data and classify without filtering out the dots, we can increase the accuracy.
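The team's splitting code is not shown on the slides, so the following is only a guess at the general idea. It tries every cut position in a stuck box, scores each half by its distance to the nearest training example (1-NN) and keeps the cut whose weaker half is most confident. The confidence formula, the `to_vector` helper and `min_width` are all hypothetical.

```python
def nearest_distance(vector, training_vectors):
    """Squared Euclidean distance to the closest training vector (1-NN)."""
    return min(
        sum((a - b) ** 2 for a, b in zip(vector, t)) for t in training_vectors
    )

def confidence(vector, training_vectors):
    """Hypothetical confidence: 1 for an exact match, falling towards 0."""
    return 1.0 / (1.0 + nearest_distance(vector, training_vectors))

def best_split(columns, training_vectors, to_vector, min_width=2):
    """Try each cut between pixel columns; return (cut, score) for the best one.
    `to_vector` turns a run of columns into a fixed-length feature vector,
    for instance by resizing it to 32 x 32 and flattening it."""
    best = None
    for cut in range(min_width, len(columns) - min_width + 1):
        left, right = columns[:cut], columns[cut:]
        score = min(confidence(to_vector(left), training_vectors),
                    confidence(to_vector(right), training_vectors))
        if best is None or score > best[1]:
            best = (cut, score)
    return best
```

Every candidate cut needs two full nearest-neighbour searches over the training set, which fits the remark that the approach is computationally very expensive.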

LIMITATIONS

The punctuation marks were not included in the training set, so they cannot be classified correctly.

Stuck characters cannot be read.

The code won’t give correct line-wise output in case of very large skew.

Thanks!!!!!!!!!