Degraded Document Image Enhancing in Spatial Domain using Adaptive Contrasting and Thresholding

Dr. P V Ramaraju, Professor, Department of ECE, SRKR Engineering College, Bhimavaram, India.

G. Nagaraju, Asst. Professor, Department of ECE, SRKR Engineering College, Bhimavaram, India.

V. Rajasekhar, M.Tech Student, Department of ECE, SRKR Engineering College, Bhimavaram, India.

Abstract: This paper presents a new adaptive approach for the binarization and enhancement of degraded document images. The proposed method can deal with degradations caused by shadows, non-uniform illumination, low contrast, ink bleed-through, smear and stain. The technique follows several distinct steps. An adaptive contrast map is first constructed for the input degraded document image. The contrast map is then binarized using a local threshold that is estimated from the intensities of the detected text stroke edge pixels within a local window. Post-processing is then applied to improve the binarization quality. The proposed method is simple, robust, and involves minimal parameter tuning.

Keywords: adaptive contrast map, document image enhancing, adaptive thresholding, degraded document image binarization, pixel classification.

I. INTRODUCTION

Robust binarization makes it possible to correctly extract sketched line drawings or text from the background. Many algorithms have been implemented for the binarization of images. Thresholding is a fast and sufficiently accurate approach for segmenting an image into a monochrome result. This paper describes a modified logical thresholding method for the binarization of seriously degraded, very poor quality gray-scale document images.

In general there are two types of image thresholding techniques: global and local. In global thresholding, a gray-level image is converted into a binary image using a single intensity value, the global threshold, which is fixed over the whole image domain. In local thresholding, the threshold value can vary from one pixel location to the next. Global thresholding thus converts an input image I to a binary image G as follows: G(i, j) = 1 for I(i, j) ≥ T and G(i, j) = 0 for I(i, j) < T, where T is the threshold, G(i, j) = 1 denotes foreground and G(i, j) = 0 denotes background.

For a local threshold, T is a function over the image domain, i.e., T = T(x, y). If, in addition, the algorithm adapts the threshold value or surface to the image intensity values, it is called a dynamic or adaptive threshold.
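The difference between a fixed T and a threshold surface T(x, y) can be sketched in a few lines of NumPy. This is only an illustration of the two definitions above, not the method proposed in this paper: the function names are hypothetical, and using the plain neighborhood mean as T(x, y) is an assumption chosen for simplicity.

```python
import numpy as np

def global_threshold(image, t):
    """G(i, j) = 1 where I(i, j) >= T, else 0, with a single T for the whole image."""
    return (image >= t).astype(np.uint8)

def local_mean_threshold(image, window=3):
    """Compare each pixel with T(x, y), here the mean of its window x window neighborhood.

    Assumption: the window is odd and the image border is handled by edge replication.
    """
    pad = window // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    t_surface = windows.mean(axis=(-2, -1))  # one threshold value per pixel
    return (image >= t_surface).astype(np.uint8)
```

With the global rule every pixel is tested against the same number, so a shadow that darkens one side of the page shifts pixels across the threshold. With the local rule the threshold moves with the neighborhood, which is the property the adaptive methods discussed below rely on.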

In a general setting, thresholding can be expressed as a test against a function T of the form [1]: T = T[x, y, h, I], where I is the input image and h denotes some local property of the point (x, y), for example the average gray level of a neighborhood centered on (x, y).

Threshold selection depends on the information available in the gray-level histogram of the image. Under a simple image formation model, an image function I(x, y) can be expressed as the product of a reflectance function and an illumination function. If the illumination component of the image is uniform, the gray-level histogram is clearly bimodal, because the gray levels of object pixels differ significantly from those of the background: one mode is populated by object pixels and the other by background pixels.

Objects can then be easily partitioned by placing a single global threshold at the neck or valley of the histogram. In reality, however, bimodal histograms do not occur very often. Consequently, a fixed threshold level based on the gray-level histogram will often fail to separate objects from the background. In this scenario we turn our attention to an adaptive local threshold surface, where the threshold value changes over the image domain to fit the spatially changing background and lighting conditions.

INTERNATIONAL CONFERENCE ON CIVIL AND MECHANICAL ENGINEERING, ICCME-2014

INTERNATIONAL ASSOCIATION OF ENGINEERING & TECHNOLOGY FOR SKILL DEVELOPMENT www.iaetsd.in 17

ISBN:378-26-138420-0223

Over the years many threshold selection methods have been proposed. Otsu suggested a global image thresholding technique in which the optimal global threshold value is found by maximizing the between-class variance with an exhaustive search [2]. Sahoo et al. [3] claim that Otsu’s method is suitable for real-world applications with regard to uniformity and shape measures. Although Otsu’s method is one of the most popular methods for global thresholding, it does not work well for many real-world images in which uneven or poor illumination causes a significant overlap in the gray-level histogram between the pixel intensity values of the objects and the background.
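The exhaustive search in Otsu's method can be sketched as follows. This is a minimal illustration for 8-bit images, not code from [2]; the function name and the convention that pixels with value >= t form the second class are assumptions of this sketch.

```python
import numpy as np

def otsu_threshold(image):
    """Return the threshold t in 1..255 that maximizes the between-class variance.

    Assumption: `image` is an unsigned 8-bit array; class 0 is {0..t-1}, class 1 is {t..255}.
    """
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):                  # exhaustive search over all candidates
        w0 = prob[:t].sum()                  # probability mass of class 0
        w1 = 1.0 - w0                        # probability mass of class 1
        if w0 == 0.0 or w1 == 0.0:
            continue                         # one class is empty, variance undefined
        mu0 = (levels[:t] * prob[:t]).sum() / w0
        mu1 = (levels[t:] * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t
```

A binary image is then obtained as `image >= otsu_threshold(image)`. Because the search uses only the histogram, two images with the same histogram but different spatial illumination get the same threshold, which is the weakness noted above.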

As many degraded documents do not have a clear bimodal pattern, global thresholding [4]–[7] is usually not a suitable approach for degraded document binarization. Adaptive thresholding [8]–[14], which estimates a local threshold for each document image pixel, is often a better approach for dealing with the different variations within degraded document images. For example, the early window-based adaptive thresholding techniques [12], [13] estimate the local threshold using the mean and the standard deviation of the image pixels within a local neighborhood window.
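As an illustration of such window-based techniques, the commonly cited forms of the Niblack and Sauvola thresholds are T = m + k·s and T = m·(1 + k·(s/R − 1)), where m and s are the local mean and standard deviation. The sketch below assumes those forms; the default values of `window`, `k` and `r` are typical choices from the literature rather than values stated in this paper, and the function names are hypothetical.

```python
import numpy as np

def _local_mean_std(image, window):
    """Mean and standard deviation of each pixel's window x window neighborhood."""
    pad = window // 2
    padded = np.pad(image.astype(np.float64), pad, mode="reflect")
    win = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    return win.mean(axis=(-2, -1)), win.std(axis=(-2, -1))

def niblack_threshold(image, window=15, k=-0.2):
    """Niblack-style threshold surface T = m + k * s (k < 0 for dark text)."""
    m, s = _local_mean_std(image, window)
    return m + k * s

def sauvola_threshold(image, window=15, k=0.5, r=128.0):
    """Sauvola-style threshold surface T = m * (1 + k * (s / r - 1))."""
    m, s = _local_mean_std(image, window)
    return m * (1.0 + k * (s / r - 1.0))
```

For dark text on a light background, text pixels would be taken as `image <= niblack_threshold(image)` or `image <= sauvola_threshold(image)`. The image must be larger than half the window in each dimension for the reflect padding used here.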

The local image contrast and the local image gradient are very useful features for segmenting text from the document background, because document text usually has a certain image contrast to the neighboring document background.

The image gradient is defined as follows:

$$G(x, y) = f_{\max}(x, y) - f_{\min}(x, y) \tag{1}$$

The local contrast is defined as follows:

$$D(x, y) = \frac{f_{\max}(x, y) - f_{\min}(x, y)}{f_{\max}(x, y) + f_{\min}(x, y) + \varepsilon} \tag{2}$$

where f_max(x, y) and f_min(x, y) denote the maximum and minimum intensities within a local neighborhood window of the pixel (x, y), and ε is a positive but infinitely small number that is added in case the local maximum is equal to 0.
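Equations 1 and 2 can be computed with a sliding-window maximum and minimum. The sketch below is one possible NumPy formulation; the function names, the edge-replicating border handling and the value of `eps` are assumptions, and the 3 x 3 default follows the window size the paper reports for its contrast map.

```python
import numpy as np

def local_max_min(image, window=3):
    """Per-pixel maximum and minimum over a window x window neighborhood."""
    pad = window // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    win = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    return win.max(axis=(-2, -1)), win.min(axis=(-2, -1))

def local_gradient(image, window=3):
    """Equation 1: G = f_max - f_min."""
    f_max, f_min = local_max_min(image, window)
    return f_max - f_min

def local_contrast(image, window=3, eps=1e-8):
    """Equation 2: D = (f_max - f_min) / (f_max + f_min + eps)."""
    f_max, f_min = local_max_min(image, window)
    return (f_max - f_min) / (f_max + f_min + eps)
```

A step from 50 to 150 and a step from 150 to 250 have the same gradient (100), but their contrasts are about 0.5 and 0.25 respectively: the normalization in Equation 2 compensates for background brightness, at the cost of weakening edges in bright regions.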

II. PROPOSED METHOD

This section describes the proposed document image binarization technique. Given a degraded document image, an adaptive contrast map is first constructed. The text is then segmented based on a local threshold that is estimated from the detected text stroke edge pixels. Post-processing is then applied to improve the document binarization quality.

A. Contrast Image Construction

The image contrast in Equation 2 has one typical limitation: it may not handle document images with bright text properly. A weak contrast is calculated for the stroke edges of bright text, because the denominator in Equation 2 is large while the numerator is small. To overcome this over-normalization problem, we combine the local image contrast with the local image gradient and derive an adaptive local image contrast as follows:

$$D_a(x, y) = \alpha D(x, y) + (1 - \alpha)\left(f_{\max}(x, y) - f_{\min}(x, y)\right) \tag{3}$$

where D(x, y) denotes the local contrast in Equation 2 and (f_max(x, y) − f_min(x, y)) refers to the local image gradient, normalized to [0, 1]. The local window size is set to 3 empirically. α is the weight between local contrast and local gradient, and it is controlled by the statistical information of the document image. Ideally, the image contrast is assigned a high weight (i.e., a large α) when the document image has significant intensity variation, so that the binarization depends more on the local image contrast, which captures the intensity variation well and hence produces good results. Otherwise, the local image gradient is assigned a high weight.

We model the mapping from document image intensity variation to α by a power function as follows:

$$\alpha = \left(\frac{\mathrm{Std}}{128}\right)^{\gamma} \tag{4}$$

where Std denotes the standard deviation of the document image intensity and γ is a pre-defined parameter. The power function has the nice property that it increases monotonically and smoothly from 0 to 1, and its shape can be easily controlled through γ. γ can be selected from [0, ∞), and the power function becomes a linear function when γ = 1. Since Std/128 lies between 0 and 1, a large γ drives α toward 0 and a small γ drives it toward 1. Therefore, the local image gradient plays the major role in Equation 3 when γ is large, and the local image contrast plays the major role when γ is small. The setting of parameter γ is discussed in the Parameter Selection subsection.
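Equations 2 to 4 can be combined into a single routine. The following is a sketch under stated assumptions, not the authors' implementation: the input is taken to be an 8-bit grayscale image, the gradient is normalized to [0, 1] by dividing by 255 (the paper does not say how the normalization is done), and the function name, border handling and `eps` are hypothetical choices.

```python
import numpy as np

def adaptive_contrast(image, gamma=1.0, window=3, eps=1e-8):
    """Return (D_a, alpha) following Equations 2-4 for an 8-bit grayscale image."""
    img = image.astype(np.float64)
    pad = window // 2
    padded = np.pad(img, pad, mode="edge")
    win = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    f_max = win.max(axis=(-2, -1))
    f_min = win.min(axis=(-2, -1))

    contrast = (f_max - f_min) / (f_max + f_min + eps)   # Equation 2
    gradient = (f_max - f_min) / 255.0                   # assumed normalization to [0, 1]
    alpha = (img.std() / 128.0) ** gamma                 # Equation 4, global image statistic
    d_a = alpha * contrast + (1.0 - alpha) * gradient    # Equation 3
    return d_a, alpha
```

Note that α is a single number per image, computed from the global standard deviation, while the contrast and gradient terms vary per pixel.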

Fig. 1


Fig. 2

Fig. 3. Contrast images constructed using (a) local image gradient, (b) local image contrast [15], and (c) our proposed method for the original sample document images which are shown in Fig. 1 and 2, respectively.

Fig. 3 shows the contrast maps of the sample document images in Fig. 1 and 2, created using the local image gradient, the local image contrast [15], and the proposed method in Equation 3, respectively.

B. Local Threshold Estimation

The text can then be extracted from the

document background pixels once the high contrast

stroke edge pixels are detected properly. Two

characteristics can be observed from different kinds

of document images [15]: First, the text pixels are

close to the detected text stroke edge pixels. Second,

there is a distinct intensity difference between the

high contrast stroke edge pixels and the surrounding

background pixels. The document image text can

thus be extracted based on the detected text stroke

edge pixels as follows

$$R(x, y) = \begin{cases} 1, & N_e \ge N_{\min} \ \text{and} \ I(x, y) \le E_{mean} + E_{std}/2 \\ 0, & \text{otherwise} \end{cases} \tag{5}$$

where E_mean and E_std are the mean and the standard deviation of the image intensity of the detected high contrast image pixels (within the original document image) within the neighborhood window, which can be evaluated as follows:

$$E_{mean} = \frac{\sum_{neighbor} I(x, y) \, (1 - E(x, y))}{N_e} \tag{6}$$

$$E_{std} = \sqrt{\frac{\sum_{neighbor} \left( (I(x, y) - E_{mean}) \, (1 - E(x, y)) \right)^2}{2}} \tag{7}$$

The size of the neighborhood window W can

be set based on the stroke width of the document

image under study.
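Equations 5 to 7 can be sketched as below. Several points are assumptions of this sketch rather than statements from the paper: N_e is read as the number of detected stroke edge pixels inside the window, N_min as a minimum count, and E(x, y) as an indicator that is 0 at detected edge pixels, so that (1 − E) selects them. Equation 7 as printed divides by 2; the `divide_by_count` flag instead divides by N_e, which gives a conventional standard deviation. The function name and default values are hypothetical.

```python
import numpy as np

def binarize_from_edges(image, edge_mask, window=9, n_min=9, divide_by_count=False):
    """Label a pixel as text (1) when Equation 5 holds, else background (0).

    edge_mask is True at detected high contrast stroke edge pixels.
    """
    img = image.astype(np.float64)
    sel = edge_mask.astype(np.float64)            # plays the role of (1 - E(x, y))
    pad = window // 2
    view = np.lib.stride_tricks.sliding_window_view
    img_w = view(np.pad(img, pad), (window, window))   # zero padding adds no edge pixels
    sel_w = view(np.pad(sel, pad), (window, window))

    n_e = sel_w.sum(axis=(-2, -1))                                     # N_e per pixel
    e_mean = (img_w * sel_w).sum(axis=(-2, -1)) / np.maximum(n_e, 1)   # Equation 6
    dev = (img_w - e_mean[..., None, None]) * sel_w
    denom = np.maximum(n_e, 1) if divide_by_count else 2.0
    e_std = np.sqrt((dev ** 2).sum(axis=(-2, -1)) / denom)             # Equation 7

    text = (n_e >= n_min) & (img <= e_mean + e_std / 2.0)              # Equation 5
    return text.astype(np.uint8)
```

Pixels whose window contains fewer than `n_min` edge pixels are always labeled background, which is how the first observation above (text pixels lie close to stroke edge pixels) enters the decision. With the printed denominator of 2, E_std grows with the number of edge pixels in the window, so the threshold becomes permissive near dense edges; this is one reason the alternative normalization is offered here.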

C. Post-Processing

Once the initial binarization result is derived from Equation 5 as described in the previous subsections, it can be further improved by incorporating certain domain knowledge, as described in Algorithm 1. First, isolated foreground pixels that do not connect with other foreground pixels are filtered out to make the edge pixel set more precise. Second, the two neighboring pixels that lie on symmetric sides of a text stroke edge pixel should belong to different classes (i.e., one to the document background and the other to the foreground text). If both pixels of a pair belong to the same class, one of them is therefore relabeled to the other class. Finally, single-pixel artifacts along the text stroke boundaries are filtered out using several logical operators, as described in [16].

Algorithm 1 Post-Processing Procedure

Require: The input grayscale document image ‘I’, the initial binary result ‘B’ and the corresponding binary text stroke edge image ‘Edge’

Ensure: The final binary result ‘Bf’

1: Obtain the connected components of the stroke edge pixels in ‘Edge’.

2: Remove those pixels that do not connect with other pixels.

3: (Removing isolated pixels requires checking connectivity.)

4: for each remaining edge pixel (i, j) do

5:     Get its neighborhood pairs: (i − 1, j) & (i + 1, j); (i, j − 1) & (i, j + 1)

6:     if the pixels in a pair belong to the same class (both text or both background) then

7:         Classify the two pixels as foreground and background based on their pixel values.

8:     end if

9: end for

10: Remove single-pixel artifacts [16] along the text stroke boundaries after the document thresholding.

11: Store the new binary result in ‘Bf’.
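Steps 1 to 9 of Algorithm 1 can be sketched as follows. The sketch makes several assumptions that the algorithm text leaves open: an edge pixel counts as isolated when none of its 8 neighbors is an edge pixel, "based on pixel values" is read as assigning the darker pixel of a pair to text, pairs with equal intensity are left unchanged, and text is encoded as 1. Step 10 (the logical operators of [16]) is not implemented. All function names are hypothetical.

```python
import numpy as np

def remove_isolated(mask):
    """Steps 1-3: drop pixels that have no 8-connected neighbor in the mask."""
    m = mask.astype(bool)
    h, w = m.shape
    p = np.pad(m, 1)                              # pad with False
    neighbors = np.zeros((h, w), dtype=np.int64)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di or dj:
                neighbors += p[1 + di:1 + di + h, 1 + dj:1 + dj + w]
    return m & (neighbors > 0)

def fix_edge_pairs(image, binary, edge):
    """Steps 4-9: opposite neighbors of an edge pixel should differ in class."""
    out = binary.copy()
    h, w = out.shape
    inside = lambda q: 0 <= q[0] < h and 0 <= q[1] < w
    for i, j in zip(*np.nonzero(edge)):
        for a, b in (((i - 1, j), (i + 1, j)), ((i, j - 1), (i, j + 1))):
            if not (inside(a) and inside(b)):
                continue
            if out[a] != out[b] or image[a] == image[b]:
                continue                          # already different, or no basis to choose
            # Assumption: the darker pixel of the pair is text (1).
            dark, bright = (a, b) if image[a] < image[b] else (b, a)
            out[dark], out[bright] = 1, 0
    return out

def post_process(image, binary, edge):
    """Apply steps 1-9 and return the refined binary result."""
    return fix_edge_pairs(image, binary, remove_isolated(edge))
```

The pair loop visits edge pixels one at a time, so it is slow on large images; it is written this way to mirror the structure of Algorithm 1 rather than for speed.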


D. Parameter Selection

In the first experiment, we apply different values of γ to obtain different power functions and test their performance. When γ is small, α is close to 1 and the local image contrast D dominates the adaptive image contrast Da in Equation 3. When γ is large, on the other hand, Da is mainly influenced by the local image gradient. At the same time, the variation of α across different document images increases as γ approaches 1. Under such circumstances, the power function becomes more sensitive to the global image intensity variation, and appropriate weights can be assigned to images with different characteristics.

The proposed method can thus assign a more suitable α to different images when γ is closer to 1. Parameter γ should therefore be set around 1, where the adaptability of the proposed technique is maximized and better, more robust binarization results can be derived from different kinds of degraded document images.
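The effect of γ on the spread of α can be checked numerically with Equation 4. The standard deviations below are hypothetical values chosen only for illustration; they are not measurements from the paper's test images.

```python
import numpy as np

def alpha(std, gamma):
    """Equation 4: alpha = (Std / 128) ** gamma."""
    return (np.asarray(std, dtype=np.float64) / 128.0) ** gamma

def alpha_spread(stds, gamma):
    """Difference between the largest and smallest alpha over a set of images."""
    a = alpha(stds, gamma)
    return float(a.max() - a.min())

if __name__ == "__main__":
    stds = [16.0, 32.0, 64.0]        # hypothetical image standard deviations
    for gamma in (0.1, 1.0, 10.0):
        print(gamma, np.round(alpha(stds, gamma), 4), round(alpha_spread(stds, gamma), 4))
```

For these three values the spread of α is largest at γ = 1: with γ = 0.1 all weights are pushed toward 1, and with γ = 10 they all collapse toward 0, so in both cases images with different statistics receive nearly the same weight.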

III. RESULTS

This section presents results of the proposed document image binarization technique. As described above, an adaptive contrast map is first constructed for a given degraded document image. The text is then segmented based on the local threshold that is estimated from the detected text stroke edge pixels. Post-processing is then applied to improve the document binarization quality.

Example 1

Fig. 4. Input degraded document image with ink bleed-through effect.

Fig. 5. Contrast image constructed using the proposed adaptive local contrast map.

Fig. 6. Binarized image produced by the proposed local thresholding and post-processing.

Example 2

Fig. 7. Input degraded document image with ink bleed-through effect.

Fig. 8. Contrast image constructed using the proposed adaptive local contrast map.

Fig. 9. Binarized image produced by the proposed local thresholding and post-processing.

IV. COMPARISON OF DIFFERENT

BINARIZATION METHODS

In this experiment, we quantitatively compare our proposed method with Otsu’s method (OTSU) [2], Sauvola’s method (SAUV) [12], Niblack’s method (NIBL) [13], Bernsen’s method (BERN) [8], Gatos et al.’s method (GATO) [17], and the LMM [15] and BE [16] methods. The test images are drawn from the same series of document images, which suffer from several common document degradations such as smear, smudge, bleed-through and low contrast.

Example 1

Fig. 10. Binarization results of the sample document image in Fig.

1(b) produced by different methods. (a) OTSU [2]. (b) SAUV [12].

(c) NIBL [13]. (d) BERN [8]. (e) GATO [17]. (f) LMM [15]. (g)

BE [16]. (h) Proposed.

Example 2

Fig. 11. Binarization results of the sample document image (PR

06) in DIBCO 2011 dataset produced by different methods. (a)

Input Image. (b) OTSU [2]. (c) SAUV [12]. (d) NIBL [13]. (e)

BERN [8]. (f) GATO [17]. (g) LMM [15]. (h) BE [16]. (i) LELO

[18]. (j) SNUS. (k) HOWE [19]. (l) Proposed.

V. CONCLUSION

This paper presents a simple and robust method for enhancing degraded document images. The proposed binarization is tolerant to different types of document degradation such as non-uniform illumination, ink bleed-through and document smear. It is based on local thresholding combined with adaptive contrast mapping. The proposed method has been tested on various noise-affected document images and binarizes them effectively. However, we observed that the performance on the Bickley diary dataset needs to be improved; we will explore this in future work.

REFERENCES

[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Pearson Prentice Hall, 2005.

[2] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-9, no. 1, pp. 62–66, 1979.

[3] P.K. Sahoo, S. Soltani, A.K.C. Wong, and Y.

Chen, “A survey of thresholding techniques,”

Computer Vision Graphics Image Processing, Vol.

41, 1988, pp. 233 – 260.

[4] A. Brink, “Thresholding of digital images using

two-dimensional entropies,” Pattern Recognit., vol.

25, no. 8, pp. 803–808, 1992.

[5] J. Kittler and J. Illingworth, “On threshold

selection using clustering criteria,” IEEE Trans. Syst.,

Man, Cybern., vol. 15, no. 5, pp. 652–655, Sep.–Oct.

1985.

[6] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Trans. Syst., Man, Cybern., vol. 9, no. 1, pp. 62–66, Jan. 1979.

[7] N. Papamarkos and B. Gatos, “A new approach

for multithreshold selection,” Comput. Vis. Graph.

Image Process., vol. 56, no. 5, pp. 357–370, 1994.

[8] J. Bernsen, “Dynamic thresholding of gray-level

images,” in Proc. Int. Conf. Pattern Recognit., Oct.

1986, pp. 1251–1255.

[9] L. Eikvil, T. Taxt, and K. Moen, “A fast adaptive

method for binarization of document images,” in

Proc. Int. Conf. Document Anal. Recognit., Sep.

1991, pp. 435–443.

[10] I.-K. Kim, D.-W. Jung, and R.-H. Park,

“Document image binarization based on topographic

analysis using a water flow model,” Pattern

Recognit., vol. 35, no. 1, pp. 265–277, 2002.

[11] J. Parker, C. Jennings, and A. Salkauskas,

“Thresholding using an illumination model,” in Proc.

Int. Conf. Doc. Anal. Recognit., Oct. 1993, pp. 270–

273.


[12] J. Sauvola and M. Pietikainen, “Adaptive

document image binarization,” Pattern Recognit.,

vol. 33, no. 2, pp. 225–236, 2000.

[13] W. Niblack, An Introduction to Digital Image

Processing. Englewood Cliffs, NJ: Prentice-Hall,

1986.

[14] J.-D. Yang, Y.-S. Chen, and W.-H. Hsu,

“Adaptive thresholding algorithm and its hardware

implementation,” Pattern Recognit. Lett., vol. 15, no.

2, pp. 141–150, 1994.

[15] B. Su, S. Lu, and C. L. Tan, “Binarization of

historical handwritten document images using local

maximum and minimum filter,” in Proc. Int.

Workshop Document Anal. Syst., Jun. 2010, pp. 159–

166.

[16] S. Lu, B. Su, and C. L. Tan, “Document image

binarization using background estimation and stroke

edges,” Int. J. Document Anal. Recognit., vol. 13, no.

4, pp. 303–314, Dec. 2010.

[17] B. Gatos, I. Pratikakis, and S. Perantonis,

“Adaptive degraded document image binarization,”

Pattern Recognit., vol. 39, no. 3, pp. 317–327, 2006.

[18] T. Lelore and F. Bouchara, “Super resolved

binarization of text based on the fair algorithm,” in

Proc. Int. Conf. Document Anal. Recognit., Sep.

2011, pp. 839–843.

[19] N. Howe, “A Laplacian energy for document

binarization,” in Proc. Int. Conf. Doc. Anal.

Recognit., Sep. 2011, pp. 6–10.

Dr. P. V. RamaRaju is working as a Professor at the Department of Electronics and Communication Engineering, S.R.K.R. Engineering College, AP, India. His research interests include Biomedical-Signal Processing, Signal Processing, VLSI Design and Microwave Anechoic Chambers Design. He is the author of several research studies published in national and international journals and conference proceedings.

G. Nagaraju is presently working as Assistant Professor at the Department of Electronics and Communication Engineering, S.R.K.R. Engineering College, Bhimavaram, AP. He received the B.Tech degree from S.R.K.R. Engineering College, Bhimavaram in 2002, and the M.Tech degree with specialization in Computer Electronics from Govt. College of Engg., Pune University in 2004. His current research interests include Image Processing, Digital Security Systems, Biomedical-Signal Processing, Signal Processing, and VLSI Design.

V. Rajasekhar received the B.Tech degree in Electronics and Communication Engineering from Sri Vasavi Engineering College, Tadepalligudem, A.P., India, in 2011. He is currently pursuing the M.Tech degree in Communication Systems at S.R.K.R. Engineering College, Bhimavaram.
