
Page 1

Farsi Handwritten Word Recognition Using Continuous Hidden Markov Models and Structural Features

M. M. Haji
CSE Department

Shiraz University

January 2005

Page 2

Outline

- Introduction
- Preprocessing
  - Text Segmentation
  - Document Image Binarization
  - Skew and Slant Correction
  - Skeletonization
- Structural Feature Extraction
- Multi-CHMM Recognition
- Conclusion and Discussion

Page 3

Introduction

- One of the most challenging problems in Artificial Intelligence.
- Words are rather complex patterns, with much variability in handwriting style.
- The performance of handwriting recognition systems is still far from human performance, in terms of both accuracy and speed.

Page 4

Introduction

Previous research:

- Dehghan et al. (2001). "Handwritten Farsi (Arabic) Word Recognition: A Holistic Approach Using Discrete HMM", Pattern Recognition, vol. 34, pp. 1057-1065.
- Dehghan et al. (2001). "Unconstrained Farsi Handwritten Word Recognition Using Fuzzy Vector Quantization and Hidden Markov Models", Pattern Recognition Letters, vol. 22, pp. 209-214.

A maximum recognition rate of 65% for a 198-word lexicon!

Page 5

Methodology

- Holistic strategies
- Analytical strategies
  - Implicit segmentation
  - Explicit segmentation

Page 6

Holistic Strategies

- Recognition is performed on a representation of the whole word.
- No attempt is made to segment a word into its individual characters.
- It is still necessary to segment the text lines into words.
- Intra-word space is sometimes greater than inter-word space!

Page 7

Holistic Strategies

- A lexicon is used: a list of the allowed interpretations of the input word image.
- The error rate increases with the lexicon size.
- Successful for postal address recognition and bank check reading, where the lexicon is limited and small.

Page 8

Analytical Strategies

Explicit segmentation:
- Single letters are isolated and then recognized separately, usually by neural networks.
- Successful for English machine-printed text.
- Arabic/Farsi text, whether machine-printed or handwritten, is cursive.
- Cursiveness and character overlapping are the main challenges.

Page 9

Analytical Strategies

Implicit segmentation:
- The text (line or word) image is converted into a sequence of small units.
- Recognition is performed at this intermediate level rather than at the word or character level, usually by a Hidden Markov Model (HMM).
- Each unit may be a part of a letter, so a number of successive units can belong to a single letter.

Page 10

Text Segmentation

Page 11

Text Segmentation

- Detecting text regions in an image (removing non-text components).
- Applications in document image analysis and understanding, image compression, and content-based image retrieval.
- Document image binarization and skew correction algorithms usually require a predominant text area in order to obtain an accurate estimate of the text characteristics.
- Numerous methods have been proposed (an extensive literature), but there is no general method for detecting arbitrary text strings.
- In its most general form, detection must be insensitive to noise, background model, and lighting conditions, and invariant to text language, color, size, font, and orientation, even within the same image!

Page 12

Text Segmentation

- We believe that a text segmentation algorithm should have adaptation and learning capability.
- A learner usually needs much time and training data to achieve satisfactory results, which restricts its practicality.
- A simple procedure was developed for generating training data from manually segmented images.
- A Naive Bayes Classifier (NBC) was utilized, which is fast in both the training and the application phase.
- Surprisingly, excellent results were obtained by this simple classifier!

Page 13

Text Segmentation

- DCT-18 features (a_1, ..., a_18).
- 10,000 training instances.
- Naive Bayes classification (MAP decision rule):

v_{MAP} = \arg\max_{v_j \in V} P(v_j \mid a_1, a_2, \ldots, a_n)
        = \arg\max_{v_j \in V} \frac{P(a_1, a_2, \ldots, a_n \mid v_j) \, P(v_j)}{P(a_1, a_2, \ldots, a_n)}
        = \arg\max_{v_j \in V} P(a_1, a_2, \ldots, a_n \mid v_j) \, P(v_j)

with the Naive Bayes (conditional independence) assumption

P(a_1, a_2, \ldots, a_n \mid v_j) = \prod_i P(a_i \mid v_j)

Page 14

Text Segmentation

Naive Bayes classification:

v_{NB} = \arg\max_{v_j \in V} P(v_j) \prod_i P(a_i \mid v_j)

Equal priors are used: P(Text) = P(Non-text) = 0.5. A block is therefore labeled as text when

P(\text{Text}) \, P(a_1 \mid \text{Text}) P(a_2 \mid \text{Text}) \cdots P(a_{18} \mid \text{Text}) \ge P(\text{Non-text}) \, P(a_1 \mid \text{Non-text}) P(a_2 \mid \text{Non-text}) \cdots P(a_{18} \mid \text{Non-text})
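To make this concrete, here is a minimal sketch of a two-class Naive Bayes text/non-text classifier over 18-dimensional DCT feature vectors. The Gaussian form of the class-conditional densities P(a_i | v_j), the class labels, and the small variance floor are illustrative assumptions; the slides do not state how these probabilities were estimated.

```python
import numpy as np

class GaussianNaiveBayes:
    """Two-class (Text / Non-text) Naive Bayes over DCT feature vectors.

    Each P(a_i | v_j) is modeled as a 1-D Gaussian; this is an assumption,
    the original work does not state the form of the conditional densities.
    """

    def fit(self, X, y):
        # X: (n_samples, 18) DCT features, y: 0 = Non-text, 1 = Text
        self.classes = np.unique(y)
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.var = np.array([X[y == c].var(axis=0) + 1e-6 for c in self.classes])
        # Equal priors, as on the slide: P(Text) = P(Non-text) = 0.5
        self.log_prior = np.log(np.full(len(self.classes), 0.5))
        return self

    def predict(self, X):
        # log P(v_j) + sum_i log P(a_i | v_j), evaluated for every class
        log_lik = -0.5 * (np.log(2 * np.pi * self.var)[None, :, :]
                          + (X[:, None, :] - self.mu[None, :, :]) ** 2 / self.var[None, :, :])
        scores = self.log_prior[None, :] + log_lik.sum(axis=2)
        return self.classes[np.argmax(scores, axis=1)]

# Usage with synthetic data standing in for the 10,000 labeled blocks:
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (500, 18)), rng.normal(1, 1, (500, 18))])
    y = np.array([0] * 500 + [1] * 500)
    clf = GaussianNaiveBayes().fit(X, y)
    print((clf.predict(X) == y).mean())  # training accuracy of the sketch
```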

Page 15

Binarization

Page 16

Binarization

- Converting gray-scale images into two-level (binary) images.
- Many vision algorithms and operators handle only two-level images.
- Applied in the early steps of a vision algorithm.
- Amounts to selecting a proper threshold surface.
- Challenging for images with poor contrast, strong noise, and variable modality in their histograms.
- Global vs. local (adaptive) algorithms.
- General vs. special-purpose algorithms.

Page 17

Binarization

Four different algorithms for document image binarization were compared and contrasted:

- Otsu, N. (Jan. 1979). "A Threshold Selection Method from Gray Level Histograms", IEEE Trans. on Systems, Man and Cybernetics, vol. 9, pp. 62-66. [global, general-purpose]
- Niblack, W. (1989). An Introduction to Digital Image Processing, Prentice Hall, Englewood Cliffs, pp. 115-116. [local, general-purpose]
- Wu, V. and Manmatha, R. (Jan. 1998). "Document Image Clean-Up and Binarization", Proceedings of the SPIE Conference on Document Recognition. [local, special-purpose]
- Liu, Y. and Srihari, S. N. (May 1997). "Document Image Binarization Based on Texture Features", IEEE Trans. on PAMI, vol. 19(5), pp. 540-544. [global, special-purpose]
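As a concrete reference for the global, general-purpose end of this spectrum, the following is a minimal NumPy sketch of Otsu's method, which picks the threshold that maximizes the between-class variance of the gray-level histogram. It is an illustration, not the implementation evaluated in this work.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the Otsu threshold of an 8-bit gray-scale image.

    Picks the threshold t that maximizes the between-class variance
    w0(t) * w1(t) * (mu0(t) - mu1(t))^2 of the two resulting classes.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                      # class-0 probability w0(t)
    mu = np.cumsum(prob * np.arange(256))        # cumulative mean
    mu_total = mu[-1]
    # Between-class variance for every candidate threshold (guard 0/0 cases).
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b2 = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b2 = np.nan_to_num(sigma_b2)
    return int(np.argmax(sigma_b2))

def binarize(gray):
    t = otsu_threshold(gray)
    return (gray > t).astype(np.uint8)           # 1 = bright background, 0 = dark ink
```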

Page 18

Binarization

[Figure: binarization results on a sample document image and its histogram; panels: Input, Histogram, Otsu, Niblack, Wu and Manmatha, Liu and Srihari]

Page 19

Binarization

Quality improvement by preprocessing and postprocessing:

- Preprocessing: Taylor, M. J. and Dance, C. R. (Sep. 1998). "Enhancement of Document Images from Cameras", Proceedings of the SPIE Conference on Document Recognition, pp. 230-241.
- Postprocessing: Trier, D. and Taxt, T. (March 1995). "Evaluation of Binarization Methods for Document Images", IEEE Trans. on PAMI, vol. 17(3), pp. 312-315.

[Diagram: Input → Unsharp Masking → Binarization → Output (super-resolution)]

Page 20

Skew Correction

Page 21

Skew Correction

- Skew is the angle by which text lines deviate from the x-axis.
- Page decomposition techniques require properly aligned images as input.
- Three types: global skew, multiple skew, non-uniform skew.
- "Skew correction" is applied by a rotation after "skew detection".

Page 22

Skew Correction

Categories based on the underlying techniques:

- Projection Profile
- Correlation
- Hough Transform
- Mathematical Morphology
- Fourier Transform
- Artificial Neural Networks
- Nearest-Neighbor Clustering

Page 23

Skew Correction

The projection profile at the global skew angle of the document has narrow peaks and deep valleys.

Page 24

Skew Correction

Projection profile technique:

\text{globalSkewAngle} = \arg\max_{\theta_{\min} \le \theta \le \theta_{\max}} f\big(\text{horizontalProjectionProfile}(\text{rotate}(I, \theta))\big)

with the goodness measure

f = SD = \sum_i \big(h(i) - h(i-1)\big)^2
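A minimal sketch of this search is shown below. It assumes a 0/1 binary image with ink pixels set to 1, uses scipy.ndimage.rotate as the rotation helper, and scans a fixed angle grid; all of these are illustrative choices, and the thesis itself replaces the explicit rotation with pixel sums along parallel slanted lines and a faster search (next slide).

```python
import numpy as np
from scipy.ndimage import rotate  # any image rotation routine would do

def profile_goodness(profile):
    # SD goodness measure: sum of squared differences of adjacent histogram bins.
    return float(np.sum(np.diff(profile.astype(float)) ** 2))

def detect_global_skew(binary_img, theta_min=-15.0, theta_max=15.0, step=0.5):
    """Return the angle (degrees) whose rotation maximizes the goodness of the
    horizontal projection profile, i.e. the rotation that aligns the text lines."""
    best_angle, best_score = 0.0, -np.inf
    for theta in np.arange(theta_min, theta_max + step, step):
        rotated = rotate(binary_img, theta, reshape=False, order=0)
        profile = rotated.sum(axis=1)          # horizontal projection profile (row sums)
        score = profile_goodness(profile)
        if score > best_score:
            best_angle, best_score = theta, score
    return best_angle

def correct_skew(binary_img):
    # Apply the rotation found by the search above.
    return rotate(binary_img, detect_global_skew(binary_img), reshape=False, order=0)
```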

Page 25

Skew Correction

- Limiting the range of skew angles.
- Binary search for finding the maximizer of the goodness function.
- Computing the sum of pixels along parallel lines at an angle, instead of rotating the image by that angle.
- Reducing the size of the input image, as long as the structure of the text lines is preserved (MIN/MAX downsampling).
- Local skew correction, after line segmentation, by robust line fitting.

Page 26

Slant Correction

[Figure: examples of uniform and non-uniform slant]

Page 27

Slant Correction

- Slant is the deviation of the average near-vertical strokes from the vertical direction.
- It occurs in both handwritten and machine-printed text.
- Slant is non-informative. [Example word image: اراک (Arak)]
- The average slant angle is estimated first, and then a shear transformation in the horizontal direction is applied to the word (or line) image to correct its slant.

Page 28

Slant Correction

- The most effective methods are based on the analysis of vertical projection profiles (histograms) at various angles.
- Identical to the projection-profile-based methods for skew correction, except that the histograms are computed in the vertical rather than the horizontal direction, and a shear transformation is used instead of rotation.
- Accurate results for handwritten words with uniform slant; robust to noise.

Page 29

Slant Correction

Page 30

Slant Correction

Projection profile technique:

\text{slantAngle} = \arg\max_{\theta_{\min} \le \theta \le \theta_{\max}} f\big(\text{verticalProjectionProfile}(\text{horizontalShear}(I, \theta))\big)

with the same goodness measure

f = SD = \sum_i \big(h(i) - h(i-1)\big)^2
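The analogous slant search can be sketched with a horizontal shear in place of the rotation. The shear helper, the angle range, and the nearest-neighbor resampling below are illustrative assumptions.

```python
import numpy as np

def horizontal_shear(binary_img, theta_deg):
    """Shear a 0/1 binary image horizontally by theta degrees (nearest-neighbor)."""
    h, w = binary_img.shape
    t = np.tan(np.radians(theta_deg))
    out = np.zeros_like(binary_img)
    rows, cols = np.nonzero(binary_img)
    new_cols = np.round(cols + (h - 1 - rows) * t).astype(int)   # shear about the bottom row
    ok = (new_cols >= 0) & (new_cols < w)
    out[rows[ok], new_cols[ok]] = 1
    return out

def detect_slant(binary_img, theta_min=-45.0, theta_max=45.0, step=1.0):
    """Angle whose shear gives the 'peakiest' vertical projection profile (column sums)."""
    best_angle, best_score = 0.0, -np.inf
    for theta in np.arange(theta_min, theta_max + step, step):
        profile = horizontal_shear(binary_img, theta).sum(axis=0)
        score = float(np.sum(np.diff(profile.astype(float)) ** 2))   # SD goodness measure
        if score > best_score:
            best_angle, best_score = theta, score
    return best_angle
```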

Page 31

Slant Correction

Postprocessing: smoothing jagged edges, using 3x3 masks such as (p denotes the pixel under consideration):

1 1 1      1 0 0
1 p 0      1 p 0
1 1 1      1 0 0

[Figure: a part of a slanted word, after slant correction, and after smoothing]

Page 32

Skeletonization

Page 33

Skeletonization

- Skeletonization, or the medial axis transform (MAT), of a shape has been one of the most studied problems in image processing and machine vision.
- A skeletonization (thinning) algorithm transforms a shape into arcs and curves of unit thickness, which are called the skeleton.
- An ideal skeleton has the following properties: it retains the basic structural properties of the original shape, and it is well-centered, well-connected, precisely reconstructable, and robust.

Page 34

Skeletonization

- Simplifies classification: it diminishes the variability and distortion among instances of one class and reduces the amount of data to be handled.
- Proven effective in pattern recognition problems: character recognition, fingerprint recognition, chromosome recognition, ...
- Provides compact representations and supports structural analysis of objects.

Page 35

Skeletonization

Five different skeletonization algorithms were compared and contrasted, with the main focus on preserving text characteristics:

- Naccache, N. J. and Shinghal, R. (1984). "SPTA: A Proposed Algorithm for Thinning Digital Pictures", IEEE Trans. on Systems, Man and Cybernetics, vol. SMC-14(3), pp. 409-418.
- Zhang, T. Y. and Suen, C. Y. (1984). "A Fast Parallel Algorithm for Thinning Digital Patterns", Comm. ACM, vol. 27(3), pp. 236-239.
- Ji, L. and Piper, J. (1992). "Fast Homotopy-Preserving Skeletons Using Mathematical Morphology", IEEE Trans. on PAMI, vol. 14(6), pp. 653-664.
- Sajjadi, M. R. (Oct. 1996). "Skeletonization of Persian Characters", M.Sc. Thesis, Computer Science and Engineering Department, Shiraz University, Iran.
- Huang, L., Wan, G. and Liu, C. (2003). "An Improved Parallel Thinning Algorithm", Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR 2003), pp. 780-783.

Page 36

Skeletonization

[Figure: skeletonization results on a sample word; panels: Input, Homotopy-Preserving, Zhang-Suen, SPTA, DTSA, Huang et al.]

Page 37

Skeletonization

[Figure: skeletonization results on a second sample word; panels: Input, Homotopy-Preserving, Zhang-Suen, SPTA, DTSA, Huang et al.]

Page 38

Skeletonization

[Figure: robustness to border noise; panels: Input, SPTA, DTSA, Huang et al.]

Page 39

Skeletonization

Postprocessing: Removing spurious branches

Page 40

Skeletonization

Modification: removing 4-connectivity while preserving 8-connectivity of the pattern, using masks such as (p denotes the pixel under consideration, x is "don't care"):

0 1 x      x 1 x
1 p 0      1 p 1
x 0 0      x 0 x

Page 41

Structural Feature Extraction

The connectivity number Cn classifies skeleton pixels:

- Cn = 0: dot (isolated point)
- Cn = 1: end-point
- Cn = 2: ordinary (connecting) point
- Cn = 3: branch-point
- Cn = 4: cross-point
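A minimal sketch of computing such a connectivity number on a 0/1 skeleton image follows. It uses a crossing-number formulation over the 8-neighborhood, which is one common way to realize Cn; the exact definition used in the thesis may differ.

```python
import numpy as np

# 8-neighborhood in circular order, starting from the pixel to the right.
_NEIGHBORS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def connectivity_number(skel, r, c):
    """Crossing-number style Cn of skeleton pixel (r, c): 0 dot, 1 end-point,
    2 ordinary point, 3 branch-point, 4 cross-point."""
    if skel[r, c] == 0:
        return 0
    ring = [int(skel[r + dr, c + dc]) for dr, dc in _NEIGHBORS]
    # Count 0 -> 1 transitions while walking once around the 8-neighborhood.
    return sum((ring[i] == 0) and (ring[(i + 1) % 8] == 1) for i in range(8))

def feature_points(skel):
    """Return lists of end-points and branch/cross-points of a 0/1 skeleton image."""
    ends, branches = [], []
    for r in range(1, skel.shape[0] - 1):
        for c in range(1, skel.shape[1] - 1):
            cn = connectivity_number(skel, r, c)
            if cn == 1:
                ends.append((r, c))
            elif cn >= 3:
                branches.append((r, c))
    return ends, branches
```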

Page 42

Structural Feature Extraction

- Structural features can tolerate much variation, but they are not robust to noise and are hard to extract.
- A 1D HMM needs a 1D observation sequence, so the 2D word image must be converted into a 1D signal.
- Speech recognition and online handwriting recognition work on 1D signals; offline handwriting recognition works on a 2D signal.

Page 43

Structural Feature Extraction

- The word skeleton is converted into a graph.
- The edges of the graph are traced in a canonical order.

[Figure: an example word graph with its edges numbered 1-7 in tracing order; end-points and a branch-point are marked]

Page 44

Structural Feature Extraction

Loop extraction:
- Loops are important, distinctive features.
- Extracting them makes the number of strokes smaller: easier modeling and lower computational cost.
- Different types of loops: simple loop, multi-link loop, double loop.
- A DFS algorithm was written to find complex loops in the word graph (see the sketch below).
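The sketch below illustrates the idea of loop detection by depth-first search on an undirected word graph stored as an adjacency list. The graph representation and function names are assumptions for illustration, and the special handling of multi-link and double loops is not reproduced.

```python
def find_loops(adj):
    """Return simple cycles of an undirected graph given as {node: set(neighbors)}.

    Depth-first search: whenever we reach an already-visited vertex that is not
    the immediate parent, the path from that vertex to the current one is a loop.
    """
    loops, visited = [], set()

    def dfs(node, parent, path):
        visited.add(node)
        path.append(node)
        for nxt in adj[node]:
            if nxt == parent:
                continue
            if nxt in path:                      # back-edge closes a loop
                loops.append(path[path.index(nxt):] + [nxt])
            elif nxt not in visited:
                dfs(nxt, node, path)
        path.pop()

    for start in adj:
        if start not in visited:
            dfs(start, None, [])
    return loops

# Example: a triangle with a tail (nodes 0-1-2 form a simple loop).
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(find_loops(example))   # e.g. [[0, 1, 2, 0]]
```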

Page 45

Structural Feature Extraction

[Figure: examples of simple loops, multi-link loops, and double loops in Farsi letters and letter combinations such as ص, ط, م, و and ه]

Page 46

Structural Feature Extraction

Each edge is transformed into a 10-dimensional feature vector:

- f1: normalized length
- f2: curvature
- f3: slope
- f4: connection type
- f5: endpoint distance
- f6: number of segments
- f7-f10: curved features

The features are independent of the baseline location and invariant to scaling, translation, and rotation.

Page 47

Structural Feature Extraction

1: [0.68, 1.00, 6, 0 , 0.05, 1, 0.0, 0.0, 0.7, 0.0]

2: [0.11, 1.01, 6, 1 , 0.23, 1, 0.0, 0.0, 0.0, 0.0]

3: [2.00, 3.00, 8, 10, 0.00, 0, 0.0, 0.0, 0.0, 0.0]

...

Page 48

Hidden Markov Models

- Signal modeling: deterministic vs. stochastic.
- Stochastic modeling characterizes the signal as a parametric random process.
- The HMM is a widely used statistical (stochastic) model, and the most widely used technique in modern ASR systems.
- Speech and handwritten text are similar: symbols with ambiguous boundaries, and symbols with variations in appearance.
- Rather than modeling the whole pattern as a single feature vector, the HMM explores the relationship between consecutive segments.

Page 49

Hidden Markov Models

- Nondeterministic finite state machines with probabilistic state transitions.
- Each state is associated with a random function.
- The state sequence is unknown (hidden); only some probabilistic function of the state sequence can be observed.

[Figure: a three-state example with states Sunny, Cloudy and Rainy and labeled transition probabilities (0.2, 0.3, 0.8, 0.6, 0.4)]

Page 50

Hidden Markov Models

- N: the number of states of the model
- S = {s1, s2, ..., sN}: the set of states
- Π = {πi = P(si at t = 1)}: the initial state probabilities
- A = {aij = P(sj at t+1 | si at t)}: the state transition probabilities
- M: the number of observation symbols
- V = {v1, v2, ..., vM}: the set of possible observation symbols
- B = {bi(vk) = P(vk at t | si at t)}: the symbol emission probabilities
- Ot: the observed symbol at time t
- T: the length of the observation sequence
- λ = (A, B, Π): the compact notation for the HMM

Page 51

Left-to-Right HMMs

[Figure: a 5-state left-to-right HMM, and a 5-state left-to-right HMM with a maximum relative forward jump of 2]
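To make the topology concrete, the sketch below builds the transition matrix of a left-to-right HMM whose states may stay put or jump forward by at most a given number of states. The uniform initialization of the allowed transitions is an assumption used only to keep the example short.

```python
import numpy as np

def left_to_right_transitions(n_states=5, max_jump=2):
    """Uniformly initialized transition matrix of a left-to-right HMM.

    State i may go to states i, i+1, ..., i+max_jump (clipped to the last state);
    all other transitions have probability zero, so the model never moves backwards.
    """
    A = np.zeros((n_states, n_states))
    for i in range(n_states):
        reachable = range(i, min(i + max_jump, n_states - 1) + 1)
        for j in reachable:
            A[i, j] = 1.0 / len(reachable)
    return A

print(left_to_right_transitions())   # 5x5 banded upper-triangular matrix
```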

Page 52

Hidden Markov Models

The three fundamental problems:

1. Given a model λ = (A, B, Π), how do we compute P(O | λ), the probability of occurrence of the observation sequence O = O1, O2, ..., OT? (The Forward-Backward Algorithm)
2. Given the observation sequence O and a model λ, how do we choose a state sequence S = s1, s2, ..., sT so that P(O, S | λ) is maximized, i.e. a state sequence that best explains the observations? (The Viterbi Algorithm)
3. Given the observation sequence O, how do we adjust the model parameters λ = (A, B, Π) so that P(O | λ) or P(O, S | λ) is maximized, i.e. a model that best explains the observed data? (The Baum-Welch Algorithm, The Segmental K-Means Algorithm)
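Because the recognizer scores every word model with the Viterbi algorithm (problem 2), a minimal log-space sketch is given below. It works from a precomputed matrix of log emission probabilities, so it applies to both discrete and continuous (Gaussian-mixture) emissions; the variable names are illustrative.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B):
    """Most likely state path and its log probability.

    log_pi: (N,) log initial state probabilities
    log_A:  (N, N) log transition probabilities
    log_B:  (T, N) log emission probabilities, log_B[t, i] = log b_i(O_t)
    """
    T, N = log_B.shape
    delta = np.empty((T, N))            # best log score ending in state i at time t
    psi = np.zeros((T, N), dtype=int)   # argmax backpointers
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A      # (N, N): from-state x to-state
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(N)] + log_B[t]
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1], float(np.max(delta[-1]))
```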

Page 53

Hidden Markov Models

Discrete HMM (DHMM):
- Discrete observation sequences over V = {v1, v2, ..., vM}.
- A codebook is obtained by Vector Quantization (VQ). What codebook size?
- Distortion: information loss due to the quantization error!

Continuous HMM (CHMM):
- Overcomes the distortion problem.
- Requires more parameters → more memory.
- Needs more deliberate initialization techniques: training may diverge with randomly selected initial parameters!

Page 54

Hidden Markov Models

Multivariate Gaussian mixture emission density:

b_i(o_t) = \sum_{m=1}^{M} c_{im} \, \mathcal{N}(o_t; \mu_{im}, \Sigma_{im})
         = \sum_{m=1}^{M} \frac{c_{im}}{(2\pi)^{K/2} |\Sigma_{im}|^{1/2}} \exp\!\Big(-\tfrac{1}{2}(o_t - \mu_{im})^{T} \Sigma_{im}^{-1} (o_t - \mu_{im})\Big)

- c_im: the m-th mixture gain coefficient in state i
- μ_im: the mean of the m-th mixture in state i
- Σ_im: the covariance matrix of the m-th mixture in state i
- M: the number of mixtures used
- K: the dimensionality of the observation space
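A minimal NumPy sketch of evaluating this emission density for one state follows. Diagonal covariance matrices are assumed purely to keep the example short; the formula above allows full covariances, and log probabilities are returned for numerical stability.

```python
import numpy as np

def gmm_log_emission(o_t, c, mu, var):
    """log b_i(o_t) for one state with a diagonal-covariance Gaussian mixture.

    o_t: (K,) observation vector        c:   (M,) mixture gains, summing to 1
    mu:  (M, K) component means         var: (M, K) diagonal covariance entries
    """
    K = o_t.shape[0]
    # log N(o_t; mu_m, diag(var_m)) for every component m
    log_norm = -0.5 * (K * np.log(2 * np.pi) + np.sum(np.log(var), axis=1))
    log_exp = -0.5 * np.sum((o_t - mu) ** 2 / var, axis=1)
    # log-sum-exp over components: log sum_m c_m N(o_t; mu_m, var_m)
    comp = np.log(c) + log_norm + log_exp
    m = comp.max()
    return float(m + np.log(np.exp(comp - m).sum()))
```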

Page 55

The Block Diagram of the Recognition System

[Block diagram: Input Word Image → Normalization → Feature Extraction → Observation Sequence → the likelihood of the observation sequence is evaluated by the Viterbi algorithm against all models, P(O | λ1), P(O | λ2), ..., P(O | λn) → Ranked Word List]
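The final scoring step can be sketched as follows, assuming one trained CHMM per lexicon word and reusing the viterbi and gmm_log_emission helpers sketched earlier in this section; the WordModel container is a hypothetical structure introduced only for this illustration.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class WordModel:
    word: str
    log_pi: np.ndarray    # (N,) log initial probabilities
    log_A: np.ndarray     # (N, N) log transition probabilities
    c: np.ndarray         # (N, M) mixture gains per state
    mu: np.ndarray        # (N, M, K) component means
    var: np.ndarray       # (N, M, K) diagonal covariances

def rank_words(observations, models):
    """Return lexicon words sorted by Viterbi log-likelihood of the observation sequence."""
    ranked = []
    for wm in models:
        # Per-state log emission matrix log_B[t, i] = log b_i(O_t)
        log_B = np.array([[gmm_log_emission(o, wm.c[i], wm.mu[i], wm.var[i])
                           for i in range(wm.log_A.shape[0])]
                          for o in observations])
        _, score = viterbi(wm.log_pi, wm.log_A, log_B)
        ranked.append((wm.word, score))
    return sorted(ranked, key=lambda x: x[1], reverse=True)
```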

Page 56

The Class Diagram of the Experimental Recognition System

[Class diagram: WordClassifier is specialized by HMMWordClassifier and NNWordClassifier (with MLPWordClassifier); HMMWordClassifier is specialized by CHMMWordClassifier and DHMMWordClassifier; DHMMWordClassifier holds one CodeBook and 2..* DHMMWordModel objects; CHMMWordClassifier holds 2..* CHMMWordModel objects; FeatureExtractor is specialized by FixedSizeFeatureExt., StructuralFeatureExt. and FourierFeatureExt.; the classifiers hold FeatureExtractor references (-fe, -ffe)]

Page 57

An Overview of the Complete System

[Flowchart: Input Image → Text Segmentation → Global Skew Correction → Line Extraction → Local Skew Correction → Slant Correction → Binarization → Word Segmentation → Denoising and Smoothing → Height Normalization → Skeletonization → Feature Extraction → Multi-CHMM Recognition → Output Text]

Notes: two-stage skew correction; postponed binarization.

Page 58

Training Data

- The recognition system was trained and evaluated on a dataset of 100 city names of Iran.
- A pattern recognition problem with 100 classes was considered.
- Most samples in the dataset were automatically generated by a Java program that draws the input string with different fonts, sizes and orientations on an output image.
- The dataset contains 150 samples for each word.

Page 59

Training Data

Page 60

Training Data

Page 61

Experimental Results

1-best recognized

Page 62

Experimental Results

1-best recognized

Page 63

Experimental Results

1-best recognized

Page 64

Experimental Results

1-best recognized

Page 65

Experimental Results

1-best recognized

Page 66

Experimental Results

3-best recognized:

1. دامغان (Damghan)  2. زنجان (Zanjan)  3. اصفهان (Esfahan)

Page 67

Experimental Results

4-best recognized:

1. قشم (Qeshm)  2. قم (Qom)  3. مرند (Marand)  4. مشهد (Mashhad)

Page 68

Experimental Results

Not N-best recognized, for N ≤ 20

Page 69

Experimental Results

Not N-best recognized, for N ≤ 20

Page 70

Experimental Results

Not N-best recognized, for N ≤ 20

Page 71

Conclusion

- The first work to use CHMMs with structural features to recognize Farsi handwritten words.
- A complete offline recognition system for Farsi handwritten words.
- A new machine learning approach, based on the NBC, for text segmentation.
- A comparison of different algorithms for binarization, skew and slant correction, and skeletonization.
- Excellent generalization performance: a maximum recognition rate of 82% on our dataset with a 100-word lexicon.

Page 72

Thanks for your attention

Please feel free to ask any questions.

[email protected]