
QUANTIFYING NOISE EFFECTS IN BILEVEL DOCUMENT IMAGES

by

Craig D. McGillivary

A thesis

submitted in partial fulfillment

of the requirements for the degree of

Master of Science in Electrical Engineering

Boise State University

October, 2007



© 2007 Craig D. McGillivary

ALL RIGHTS RESERVED



The thesis presented by Craig D. McGillivary entitled Quantifying Noise Effects in Bilevel Document Images is hereby approved.

Elisa H. Barney Smith Date

Advisor

Tim Andersen Date

Committee Member

Jim Browning Date

Committee Member

John R. Pelton Date

Dean of the Graduate College



ACKNOWLEDGEMENTS

I would like to thank Dr. Barney Smith, who encouraged me to get a graduate degree and

then poked and prodded me until I completed it. I would not have succeeded without her

support and mentorship.

I would also like to thank my friends and family. They helped me to overcome the

stress and frustrations that came up from time to time as I worked towards my goals. I

would especially like to thank my coworkers Joetta Anderson, Chris Hale, Darrin Reed, and

Jim Steele, who acted as sounding boards for ideas, reviewed and edited my writing, and

provided friendship in this journey. I am honored by and grateful to my committee members

Dr. Tim Andersen and Dr. Jim Browning, who provided time and energy to review this

thesis.

This material is based upon work supported by the National Science Foundation

under Grant No. CCR-0238285. Any opinions, findings, and conclusions or

recommendations expressed in this material are those of the author(s) and do not

necessarily reflect the views of the National Science Foundation.



ABSTRACT

The effect of binarization via global thresholding on additive Gaussian noise in high

contrast images is explored. A measure of noise in bilevel images called noise spread is

developed with the use of a degradation model that applies to many image degradations

included in desktop scanning. When high contrast images are binarized, noise is

concentrated on the edges of objects in the image. Noise spread is the breadth of the

domain in which pixels are affected by noise after binarization. It depends on both the

noise level and the gradients of the image prior to thresholding.

There is a strong linear relationship between noise spread and the expected Hamming

distance between an image with noise added and the same image without noise added. It

is also known that if two images of an object are synthetically generated with

independent random phases and zero noise, there is a small Hamming distance between

them. Experiments on circles and on a 5 were run to determine the combined effect of

random independent phase and noise spread on the expected Hamming distance. The results

show that the two factors are not additive and that the phase effects become less significant

when the noise spread increases. The degree to which this is true depends on the shape of the object

being scanned.

In addition to experiments on Hamming distance, experiments were run to determine

the geometric precision of images with noise. This includes experiments relating noise

spread to the localizability of straight edges at several different orientations. The

localizability of an edge is defined by the ability to determine the orientation and



position of an edge segment. The quality of edge measurements is quantified by the angle

between the measured edge and the true edge and by the distance between the measured

edge segment and the midpoint of the true edge segment. Surprisingly, the distance

measurements for edges at certain orientations actually become more precise when the noise spread

increases, but this improvement is offset by less precision in the measurements of edge

orientation. For most edge orientations the precision of both distance measurements and

orientation measurements decreases when noise spread increases. Experiments relating

noise spread to the localizability of circles were also conducted. These experiments

reveal that the positional error in circle measurements has a Rayleigh distribution, while

the radius measurements have a normal distribution, and that circle localizability

decreases as noise spread increases.

Noise spread provides a strong theoretical foundation for future research. Since

random effects play a critical role in optical character recognition (OCR) and in pattern

recognition generally, it is important to understand and quantify them. Future research

will focus on relating noise spread to human preference, on finding novel techniques for

measuring noise spread directly from binary images, and on developing filters and other

techniques which will make OCR systems less susceptible to noise. Noise spread may

also have unforeseen applications in problems other than document research.



TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
LIST OF SYMBOLS, TERMS & ABBREVIATIONS
1. INTRODUCTION
2. TECHNICAL BACKGROUND
    2.1. GENERATING SYNTHETIC SCANNED IMAGES
        2.1.1. Basic Scanner Model
        2.1.2. Scanned Straight Edges
        2.1.3. Scanned Strokes
        2.1.4. Scanned Circles
        2.1.5. Scanned Characters
    2.2. EDGE FINDING TECHNIQUES
    2.3. CIRCLE FITTING TECHNIQUES
    2.4. EFFECT OF SAMPLING
3. NOISE SPREAD THEORY
    3.1. NOISE SPREAD FOR STRAIGHT ISOLATED EDGES
    3.2. EXTENDING NOISE SPREAD TO GENERAL SHAPES
    3.3. RELATIONSHIP BETWEEN NOISE SPREAD AND HAMMING DISTANCE
    3.4. NOISE SPREAD WITH VARYING NOISE LEVELS
4. HAMMING DISTANCE BETWEEN SCANNED OBJECTS
    4.1. HAMMING DISTANCE BETWEEN SCANNED EDGES
    4.2. HAMMING DISTANCE OF SCANNED CIRCLES
    4.3. HAMMING DISTANCE OF SCANNED CHARACTERS
5. GEOMETRIC MEASUREMENTS OF BILEVEL SCANS
    5.1. LOCALIZABILITY OF SCANNED EDGES
        5.1.1. Comparing Various Approaches to Finding Edges
        5.1.2. Effect of Threshold
        5.1.3. Effect of PSF Width
        5.1.4. Effect of Edge Orientation
    5.2. LOCALIZABILITY OF SCANNED CIRCLES
6. CONCLUSIONS AND FUTURE WORK
REFERENCES
APPENDIX A: Iso Curves for Cauchy and Gaussian PSFs
APPENDIX B: Code for Calculating
APPENDIX C: Code for Calculating Ri



LIST OF FIGURES

Figure 1.1 The ideal edge and the noisy edge have the same phase and orientation. The Hamming distance between the two images is the number of pixels that are different between the two images.

Figure 2.1 This scanner model is used to determine the value of the pixel f[i,j] centered on each sensor element.

Figure 2.2 Gaussian ESFs are shown with w1=1 and w2=2. When the image is blurred and thresholded the position of the edge is shifted from the dotted step function to the solid edge function. This shift is called δ_c and is by convention positive when the edge is shifted to the left.

Figure 2.3 When a stroke characterized by two parallel edges is scanned the resulting stroke thickness changes. Interference between the parallel edges causes the grayscale value of pixels to be less than that predicted by the ESF. As a result the stroke thickness will be less than that predicted by δ_c.

Figure 2.4 The iso curves for a Cauchy PSF are significantly different from the δ_c curves even when the stroke width is 15.

Figure 2.5 When a circle is scanned its radius changes. As with scanned strokes, if the threshold is too high the circle will disappear when it is scanned.

Figure 2.6 Algebraic fitting can sometimes result in a poor fit. The center of the algebraic fit is (10.24, 20.98) and the radius is 4.83. The center of the geometric fit is (10.10, 7.92) and the radius is 11.77.

Figure 2.7 Spirographs are a useful tool for studying the phase effects of straight edges. This spirograph was created using N=20 and an edge with a 35 degree angle of inclination. The marks on the circle represent the locations of sample points relative to the edge. The lines on the interior of the circle connect sample points that are on adjacent columns of the image. The location of the edge can be represented by a point on the circle.

Figure 2.8 (a) The variance of the distance between the measured and actual edges is determined for edges that are 20 pixels long. (b) The variance in the angular error is determined for noiseless edges that are 20 pixels long.

Figure 3.1 Edges with varying amounts of noise spread. While the standard deviation of the noise in the first three images is the same, the noise spread is different. The picture on the far left shows an extreme amount of noise.

Figure 3.2 (a) Edge after blurring with a generic PSF of width w. When no noise is added, the thresholding produces the edge shift δ_c. (b) Edge with noise. The uncertain boundary, shown in gray, is the noise spread region. The effects of sampling are not shown.

Figure 3.3 The threshold probability (THP) function is shown for a Gaussian PSF with w=1, Θ=0.7 and σ_noise=0.1. Noise spread in this case is about 0.72. The piecewise approximation of threshold probability is inaccurate at the tails of the THP function.

Figure 3.4 The threshold probability function given in Equation 3.1 (solid) is compared to the approximation in Equation 3.8 (dashed) with the parameters w=1, Θ=0.7 and σ_noise=0.1. The actual threshold probability is a little lower on the tails.

Figure 4.1 The relationship between the expected Hamming distance and the noise spread is shown for (a) 13 different PSF widths, (b) 7 different thresholds and (c) 8 different angles of inclination. The PSF width, threshold and orientation of the edge do not affect the relationship between Hamming distance and noise spread as long as the orientation is not degenerate, that is, one with a slope that can be represented by an irreducible fraction of small integers, such as 0 degrees.

Figure 4.2 Hamming distance versus noise spread for circles of varying radii. (a) Both images have the same phase shift. (b) Images have random independent phases. (c) Noise spread is plotted against the Hamming distance for both in-phase and independent phase together using the average of the results from each radius. The effect of phase is less significant when the noise spread is higher.

Figure 4.3 (a) Results of in-phase experiments. (b) Results of independent phase experiments show a strong relationship between noise spread and expected Hamming distance.

Figure 4.4 The effect of using an independent phase becomes less significant as noise increases.

Figure 5.1 An example of a straight edge scanned with w=3, Θ=.3 and NS=.6. The solid line shows the position of the original edge. The dotted line is the theoretical position of the scanned edge which results from shifting by δ_c. The center point is the midpoint of the theoretical edge segment.

Figure 5.2 (a) In the variance of angle measurements, perpendicular fitting and standard fitting outperform the other methods, and LoG is particularly poor. (b) The variance of distance measurements clearly shows that standard and perpendicular fitting both perform markedly better than the other methods.

Figure 5.3 (a) The perpendicular and standard fitting both have low perpendicular bias. However, this is not true of the other methods. (b) There is a small angular bias in the standard fitting compared to the perpendicular fitting. LoG has a very large angular bias.

Figure 5.4 (a) Variance of angular error with a fixed sampling grid. (b) Variance of distance measurements for fixed sampling grid.

Figure 5.5 (a) Bias in angle measurement with a fixed sampling grid. (b) Bias in edge position measurement with a fixed sampling grid.

Figure 5.6 (a) Variance of angular error with a random sampling grid. (b) Variance of distance measurements for random sampling grid.

Figure 5.7 (a) Bias in angle measurement. (b) Bias in edge position measurement.

Figure 5.8 (a) Variance of angular error with a random sampling grid and several different values of w. (b) Variance of distance measurements for a random sampling grid and several different values of w.

Figure 5.9 (a) Bias in the orientation measurements of edges with fixed sampling grid and several different values of w. (b) Bias of distance measurements for a fixed sampling grid and several different values of w.

Figure 5.10 (a) Variance of angular error with a random sampling grid and several different values of w. (b) Variance of distance measurements for a random sampling grid and several different values of w.

Figure 5.11 (a) Bias in the orientation measurements of edges with random sampling grid and several different values of w. (b) Bias of distance measurements for a random sampling grid and several different values of w.

Figure 5.12 (a) Bias in the orientation measurements of edges with a fixed sampling grid and several different edge orientations. (b) Bias in the distance measurements of edges with a fixed sampling grid and several different edge orientations.

Figure 5.13 (a) Variance in the orientation measurements of edges with a fixed sampling grid and several different edge orientations. (b) Variance of the distance measurements of edges with a fixed sampling grid and several different edge orientations.

Figure 5.14 (a) Variance of angular error with a random sampling grid and several different edge orientations. (b) Variance of distance measurements for random sampling grid and several different edge orientations.

Figure 5.15 (a) Bias in the orientation measurements of edges with random sampling grid and several different edge orientations. (b) Bias of distance measurements for random sampling grid and several different edge orientations.

Figure 5.16 (a) Variance of angular error with a fixed sampling grid and degenerate edge orientations. (b) Variance of distance measurements for a fixed sampling grid and degenerate edge orientations.

Figure 5.17 (a) Bias of angular error with a fixed sampling grid and degenerate edge orientations. (b) Bias of distance measurements for a fixed sampling grid and degenerate edge orientations.

Figure 5.18 (a) Variance of angular error with a random sampling grid and degenerate edge orientations. (b) Variance of distance measurements for a random sampling grid and degenerate edge orientations.

Figure 5.19 (a) Bias in the orientation measurements of edges with random sampling grid and degenerate edge orientations. (b) Bias of distance measurements for random sampling grid and degenerate edge orientations.

Figure 5.20 Distance and angular error for 0 degree edge without noise.

Figure 5.21 Distance and angular error for 20 degree edge without noise.

Figure 5.22 Distance and angular error for 0 degree edge with NS=.3.

Figure 5.23 Distance and angular error for 20 degree edge with NS=.3.

Figure 5.24 (a) The radius measurements from Experiment 1 have a normal distribution. (b) The distances between the measured circle centers and the actual circle centers from Experiment 1 have a Rayleigh distribution.

Figure 5.25 (a) The Rayleigh parameter of the positional error for several different values of w. (b) Variance of radius measurements for several different values of w.

Figure 5.26 (a) The Rayleigh parameter of the positional error for several different values of Θ. (b) Variance of radius measurements for several different values of Θ.


LIST OF TABLES

Table 2.1 Simple operators used for edge detection.

Table 3.1 Functions for calculating blurred noise.

Table 4.1 Edge experiment parameters.

Table 5.1 Parameters for threshold experiments.

Table 5.2 Parameters for PSF width experiments.

Table 5.3 Parameters for edge orientation experiments.

Table 5.4 Circle experiment parameters.



LIST OF SYMBOLS, TERMS & ABBREVIATIONS

OCR         Optical Character Recognition
PSF         Point Spread Function
ESF         Edge Spread Function
LSF         Line Spread Function
CSF         Circle Spread Function
o(x,y)      Bilevel continuous input image
s(x,y)      Intensity of pixels at (x, y) before noise
s(x)        Intensity of pixels at position x before noise
s[i,j]      Sampled unquantized output of blurring convolution
f[i,j]      Final binary scanner output
n[i,j]      Gaussian noise added in degradation model
σ_noise     Standard deviation of Gaussian noise
w           General width parameter for PSF
Θ           Binarization or threshold level
            Distance between samples
Θ_max       Binarization or threshold level which causes stroke to disappear
δ_c         Distance that an edge shifts when it is scanned
            Thickness of a scanned stroke before scanning
scanned     Thickness of a scanned stroke after scanning
            Amount that the thickness of a scanned stroke increases by
Rf          Scanned circle radius
Ri          Original circle radius
α           Cutoff used for noise spread
Z_α         Z value for alpha cutoff for noise spread
(x)         Density of pixels
v(x,y)      Noise on source image
qs(x,y)     Variance of noise on source image
qb(x,y)     Variance of noise after blurring
            Angle of inclination for edge



1. INTRODUCTION

High contrast images, which occur frequently in document images, are often digitized

into binary images. These binary images are then analyzed by optical character

recognition (OCR) systems which convert the text images into ASCII characters. OCR

systems depend on many features measured from the text images. They use large sets of

images that have already been classified and ideally are representative of the characters in

document images. Features are then measured from these training sets and those

measurements are used to label unclassified characters. One way of generating these

training sets is to generate large numbers of synthetic character images whose labels are

known and then to use these images to train the OCR system. To do this effectively it is

best if the synthetic characters are as similar as possible to real scanned characters. It is

not enough that individual characters be similar to characters in real document images;

the statistical properties of large numbers of characters need to match those of real

characters. This means that a good theoretical basis for the nondeterministic effects in the

generation of scanned characters is required.

Part of the nondeterministic aspects of generated characters is the random position or

phase of the sampling grid relative to a scanned object. When the sampling grid is shifted

relative to a continuous character, a large number of different bitmaps can result. It is

possible to determine the number and frequency of each of these bitmaps using modulo

grid diagrams [1]. Modulo grid diagrams are formed by performing a modulo one

operation with respect to both the horizontal and vertical coordinates of an object's



boundary. The effects of random phase can be incorporated into OCR by generating

training sets with random phases.
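The modulo one operation described above can be illustrated with a minimal sketch. It assumes the object boundary is available as a list of (x, y) points measured in units of the sampling pitch; the function name `modulo_grid_points` and the example circle are hypothetical and are not taken from [1].

```python
import numpy as np

def modulo_grid_points(boundary_xy, pitch=1.0):
    """Fold boundary coordinates into a single sampling cell.

    boundary_xy: array-like of shape (N, 2) holding (x, y) boundary points.
    pitch: distance between samples (assumed equal in x and y).
    Returns an (N, 2) array with both coordinates reduced modulo one.
    """
    pts = np.asarray(boundary_xy, dtype=float) / pitch
    return np.mod(pts, 1.0)

# Hypothetical example: the boundary of a circle with a radius of 2.5 pixels.
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
circle = np.column_stack((10.3 + 2.5 * np.cos(t), 7.8 + 2.5 * np.sin(t)))
folded = modulo_grid_points(circle)  # all points now lie in the unit cell
```

Plotting `folded` would show the boundary curves wrapped into the unit square. In this picture each region enclosed by the wrapped curves corresponds to a set of grid phases that produce the same bitmap.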

Unfortunately random phase is not the only nondeterministic aspect of scanned

characters. Another source of randomness is the noise that is in the document image prior

to digitization and the noise that is added during scanning. When the image is binarized,

the noise becomes concentrated on the boundaries of objects. In order for training sets to

have the same statistical properties as real scanned characters, it is necessary for the

training sets to have the appropriate amount of noise. Before this problem can be

addressed the amount of noise on the edges of binary images must be quantified and

understood theoretically.

The amount of noise in binary images is not only dependent on the amount of noise

added to the image, but on the shape of the object being scanned and on the scanner

model parameters. This research focuses on quantifying the noise in bilevel images and

on relating that amount of noise to the parameters of a commonly used degradation

model. The quantity that was developed is called noise spread. Noise spread is the size of

the domain over which pixels in the binary image are affected by noise. While the

research in this thesis focuses on document images, the concept of noise spread could be

applied to other situations where an image is binarized.

Noise spread provides a theoretical basis for understanding and measuring noise in

document images. It is critical for developing methods to mitigate the effects of noise in

binary images. Noise spread allows bilevel images to be created with different

degradation parameters but the same amount of noise. Filters can be tested to see if the

negative effects of noise can be suppressed. Measurement techniques can also be



developed to measure noise directly from document images. The noise in a binary

document image can be measured, and the training sets that are used to design an OCR

system could have the same levels of noise as the documents that the OCR system is

being applied to. Alternatively the noise measurement could be included into a general

OCR system as an additional feature.

Noise in binary images is equivalent to errors in general binary signals. In

information theory the Hamming distance between two binary signals of the same length

is the number of bits that are different between the two signals. The amount of noise in a

bilevel image can reasonably be measured by the Hamming distance between the image

with noise and the same image without noise. Figure 1.1 shows the Hamming distance for

images of a straight edge with and without noise. Theory about the relationship between

Hamming distance and noise spread will be introduced in Section 3.3, and experiments to

verify this theory are described in Chapter 4. Chapter 4 also includes experiments on the

relationship between the Hamming distance and noise spread when the phases of the two

objects are independent.
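As a small illustration of the Hamming distance used here, the following sketch counts the differing pixels between two bilevel images of the same size. The function name and the two tiny example images are hypothetical.

```python
import numpy as np

def hamming_distance(img_a, img_b):
    """Number of pixels that differ between two bilevel images."""
    a = np.asarray(img_a, dtype=bool)
    b = np.asarray(img_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return int(np.count_nonzero(a != b))

# Hypothetical 3x4 images of an edge: 0 is white, 1 is black.
ideal = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 1, 1, 1]])
noisy = np.array([[0, 1, 1, 1],
                  [0, 0, 0, 1],
                  [0, 1, 1, 1]])
print(hamming_distance(ideal, noisy))  # prints 2
```

Both differing pixels lie next to the boundary between black and white, which is where noise is expected to appear after thresholding.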

Figure 1.1: The ideal edge and the noisy edge have the same phase and orientation. The Hamming distance between the two images is the number of pixels that are different between the two images. (Panels: Ideal Edge, Noisy Edge, Hamming Distance.)

Since objects with noise are likely harder to precisely locate in an image, several

localization experiments were conducted to verify that noise spread accurately quantifies

the noise in thresholded images. The experiments were designed to verify that noise

spread accounts for all the effects of different document degradation parameters on the

localizability of circles and straight edges. There are applications where the ability to

precisely locate edges in bilevel images is important. For instance Hok Sum Yam [2]

developed a method for finding the degradation parameters from bilevel document

images. The method depended on precise edge measurements in order to determine how

much the corners of characters were eroded by scanning.

Chapter 2 discusses the technical background. That chapter includes a discussion of

how synthetic scanned images are created, a review of edge and circle localization

techniques, and a discussion of the effect of sampling. Chapter 3 introduces noise spread

in great detail and shows that it is related to Hamming distance. Chapter 4 provides an

experimental relationship between Hamming distance and noise spread that reinforces

and expands upon the relationship theorized in Chapter 3. Experiments in Chapter 5

provide evidence that all of the effects of the document degradation parameters on the

localizability of scanned edges and circles are determined by the noise spread.



2. TECHNICAL BACKGROUND

Before the theory behind noise in binarized images can be presented, it is necessary to

discuss the basic degradation model upon which it is based. It is also necessary to review

literature on the precise geometric measurements of straight edges and circles in images.

Determining the effect of noise on localizability is important to prove that noise spread is

a good quantitative measure of edge noise in bilevel images. The effects of phase are also

important because phase affects both localizability and Hamming distance between

objects.

The degradation model describes the acquisition of a binary scanned image as a

multistage process whose steps include: convolving with a point spread function (PSF),

sampling, adding noise, and thresholding. Section 2.1 will discuss the document

degradation model in more depth and show how the model can be applied to scanned

edges, circles, and strokes. A method for determining the gradients of the simulated grey

level images is also discussed.

Edge finding is a very basic computer vision task and has been widely studied. The

experiments in this thesis are designed to determine the localizability of scanned edges

under different amounts of noise. Section 2.2 provides a review of the literature on edge

finding techniques and discusses in detail the edge finding techniques that were used in

this thesis.

There are several techniques for localizing circles. All the techniques that were

considered use least squares, but some techniques work better than others. Section 2.3



discusses these circle localization techniques and describes the Gauss-Newton algorithm

used to solve the nonlinear least squares problem.

The effects of sampling images have been studied extensively. Section 2.4 reviews

the literature on sampling effects, discusses the effect of sampling and edge orientation

on the localizability of scanned edges, and also discusses tools like the modulo-grid

diagram which provide a means for predicting the possible bitmaps that result from

different shapes due to phase effects alone.

2.1. Generating Synthetic Scanned Images

This thesis uses a degradation model based on a model proposed by Baird [3]. In that

model an ideal continuous bilevel image is convolved with a point spread function (PSF)

and sampled. Then Gaussian noise is added to the image to represent noise added

during scanning and noise that would have been originally present on the paper image.

Finally, the image is binarized at a certain threshold level as shown in Figure 2.1.
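The four steps of this model (blur, sample, add noise, threshold) can be sketched for a straight edge as follows. This is only an illustrative sketch under stated assumptions: the PSF is taken to be Gaussian with standard deviation w, so the blurred edge is a normal CDF of the signed distance to the edge, and a pixel is taken to be black when its noisy value is at or above the threshold. The function name `scan_edge` and all parameter values are hypothetical.

```python
import math
import numpy as np

def scan_edge(n=32, w=1.0, theta=0.5, sigma_noise=0.05, angle_deg=20.0, rng=None):
    """Simulate a bilevel scan of a straight edge through the image centre.

    Assumes a Gaussian PSF of width w. Returns (s, f): the blurred,
    sampled gray-level image and the thresholded bilevel image.
    """
    rng = np.random.default_rng() if rng is None else rng
    phi = math.radians(angle_deg)
    ii, jj = np.mgrid[0:n, 0:n].astype(float)
    c = (n - 1) / 2.0
    # Signed distance from each sample point to the edge.
    d = (jj - c) * math.cos(phi) + (ii - c) * math.sin(phi)
    erf = np.vectorize(math.erf)
    s = 0.5 * (1.0 + erf(d / (w * math.sqrt(2.0))))      # blur and sample
    a = s + rng.normal(0.0, sigma_noise, size=s.shape)    # add Gaussian noise
    f = (a >= theta).astype(np.uint8)                     # threshold (assumed convention)
    return s, f

rng = np.random.default_rng(0)
s, noisy = scan_edge(sigma_noise=0.05, rng=rng)
_, clean = scan_edge(sigma_noise=0.0, rng=rng)
flipped = noisy != clean  # pixels whose value was changed by the noise
```

Any pixels marked in `flipped` sit where the gray level `s` is close to the threshold, that is, along the edge. This is the concentration of noise on object boundaries that the rest of the thesis quantifies.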

This section begins with a detailed mathematical description of the scanner model.

Then in subsequent subsections this model is applied to straight edges, scanned strokes,

circles, and general scanned characters. In order to understand how noise affects scanned

bilevel images, we will need to determine the intensity gradients of the image before

thresholding. These gradients help determine how noisy a bilevel image will be. A

technique for obtaining these gradients is discussed for each of the different types of

scanned objects.



2.1.1. Basic Scanner Model

The basic scanner model describes the sampling of the spatially continuous image of

blackness or absorptance, o(x,y), where absorptance is one minus the reflectance. The

values of o(x,y) can be either 0 (white) or 1 (black). The image is digitized by a sensor

array in the scanner. A PSF is used to model the fact that for each point on a physical

paper image different amounts of light are reflected to each sensor. The PSF is the 2-D

equivalent to the impulse response of a scanner. This equivalence means that convolution

can be used to predict the amount of reflected light each sensor detects. If the image is

sampled at points (x_j, y_i) on a rectangular grid, then the image is given by

\[ s[i,j] = \iint \mathrm{PSF}(x_j - u,\; y_i - v)\, o(u,v)\, du\, dv . \tag{2.1} \]

This equation assumes that the scanner is spatially invariant over the field of view, which

is valid for small regions. In order to model the noise that would exist on the original

image and the noise that is added during scanning, Gaussian noise n[i,j] is added to the

image

$$a[i,j] = s[i,j] + n[i,j]. \tag{2.2}$$

The noise is added to every sensor independently and has a mean of zero and a standard

deviation of $\sigma_{noise}$. Other types of additive noise could be used as well.

To produce a bilevel image the intensity is quantized using a thresholding operation

[ ][ ]

[ ]

(2.23)

and

c 2 . (2.24)

The lower bound comes from the fact that scanned cannot be negative, and the upper

bound comes from Equation 2.20, Equation 2.13 and the fact that the ESF is always

positive. These upper and lower bounds can be used with the bisection method of root

finding to find .

Values ofw and which result in certain /2 values can be represented by iso curves

which are equipotential curves in w and on which /2 has the same values. Figure 2.4

shows these iso curves for a Cauchy PSF on the same plot as the c curves. As can be

seen the values of/2 are significantly smaller than c even when is 15. Appendix A

shows the iso curves that result from both Gaussian and Cauchy PSFs and for several

different values of.. Code for calculating is included in Appendix B.

For a Gaussian PSF the difference between /2 and c becomes insignificant as gets

larger. Ifc is used to estimate the size of the scanned stroke after scanning, then it is

necessary to determine whether the edges are close enough to have interference. As a rule

of thumb, a Gaussian ESF is approximately either 1 or 0 at 3w from an edge. This means

that if the size of the stroke after scanning predicted by c is greater than 3w, no

interference occurs. This rule of thumb can be summarized by the following inequality



which if satisfied means that no interference occurs

w c 3 . (2.25)

As with straight edges it is very important to determine the magnitude of the gradient

of s(x) at the location of the thresholded edges. The derivative of s(x) is given by

( )w

w

x

w

x-

xs

+

=

2

2LSF

2

2LSF

. (2.26)

Since the slope on the rising edge is positive

w

w

w

s

scannedscanned

scanned

++

+

=

2LSF

2LSF

2. (2.27)

Figure 2.4 The iso curves for a Cauchy PSF are significantly different from the $\delta_c$ curves even when the stroke width is 15.

The gradient can also be expressed as a function of

w

ww

s scanned

+

+=

2LSF

2

2LSF

2

. (2.28)

2.1.4. Scanned Circles

Circles are among the simplest geometric shapes. As a consequence, when

experiments are done on the effects of noise on bilevel images, it is useful to apply them

to circles. A circle spread function CSF can be defined to describe the intensity of pixels

as a function of the distance from the center of the circle. When the circle is scanned, its

size changes. The scanned circle radius $R_f$ can be found from the original circle radius $R_i$
given the scanner parameters. Likewise, when circles are generated for
experiments, $R_i$ sometimes needs to be obtained from $R_f$. As with edges and scanned strokes the

gradient of the scanned circle can be determined for both Gaussian and Cauchy PSFs.

Figure 2.5 shows the cross section of a circle scanned with a Cauchy PSF.

The intensity of a pixel as a function of the distance from the center of the circle can

be obtained using

$$\mathrm{CSF}(r) = \int_{-R_i}^{R_i} \int_{-\sqrt{R_i^2 - x^2}}^{\sqrt{R_i^2 - x^2}} \mathrm{PSF}(x - r,\, y;\, w)\, dy\, dx. \tag{2.29}$$

For a Gaussian PSF the equation becomes

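$$\mathrm{CSF}_{Gaussian}(r) = \frac{1}{\sqrt{2\pi}\, w} \int_{-R_i}^{R_i} \exp\!\left(-\frac{(x - r)^2}{2w^2}\right) \mathrm{erf}\!\left(\frac{\sqrt{R_i^2 - x^2}}{\sqrt{2}\, w}\right) dx. \tag{2.30}$$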

For a Cauchy PSF the equation becomes


$$\mathrm{CSF}_{Cauchy}(r) = \frac{w}{\pi} \int_{-R_i}^{R_i} \frac{\sqrt{R_i^2 - x^2}}{\left((x - r)^2 + w^2\right) \sqrt{r^2 - 2rx + R_i^2 + w^2}}\, dx. \tag{2.31}$$

The integrals have to be evaluated numerically. Special care must be taken when

numerically solving for the Gaussian CSF. The value of the integrand is near zero for a

large part of the domain over which it is integrated. This causes large errors when certain

numerical algorithms are used. If

$$r - 5w > -R_i, \tag{2.32}$$

then the following integral should be used to calculate the Gaussian CSF

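$$\mathrm{CSF}_{Gaussian}(r) = \frac{1}{\sqrt{2\pi}\, w} \int_{r - 5w}^{R_i} \exp\!\left(-\frac{(x - r)^2}{2w^2}\right) \mathrm{erf}\!\left(\frac{\sqrt{R_i^2 - x^2}}{\sqrt{2}\, w}\right) dx. \tag{2.33}$$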

The value of $R_f$ depends on $R_i$, $\Theta$, and w. As with stroke thickness, $R_f$ is defined
implicitly by

$$\mathrm{CSF}(R_f;\, w, R_i) = \Theta. \tag{2.34}$$

Figure 2.5 When a circle is scanned its radius changes. As with scanned strokes if the threshold is too high the circle will disappear when it is scanned.

Numerical methods are necessary to find the value of $R_f$. The CSF is a monotonically
decreasing function because of the restrictions that were placed on the PSF. As with
scanned strokes there is a $\Theta_{max}$ above which $R_f$ will be zero. To find $\Theta_{max}$ we use

$$\mathrm{CSF}(0) = \Theta_{max}. \tag{2.35}$$

The easiest way to find CSF(0) is to use polar coordinates

$$\mathrm{CSF}(0) = 2\pi \int_{0}^{R_i} \mathrm{PSF}(r;\, w)\, r\, dr. \tag{2.36}$$

For a Gaussian PSF this simplifies to

$$\mathrm{CSF}_{Gaussian}(0) = 1 - \exp\!\left(-\frac{R_i^2}{2w^2}\right). \tag{2.37}$$

For a Cauchy PSF it is

$$\mathrm{CSF}_{Cauchy}(0) = 1 - \frac{w}{\sqrt{w^2 + R_i^2}}. \tag{2.38}$$

Once it is confirmed that $\Theta$ is not greater than $\Theta_{max}$ the value of $R_f$ can be found. There is
a lower bound on $R_f$ since it cannot be negative. To find an upper bound on $R_f$, a value
must be found which causes the CSF to be less than $\Theta$. The upper bound is first chosen to
be two times $R_i$. If this does not result in a CSF value less than $\Theta$, then $2R_i$ becomes the
new lower bound, and the upper bound is chosen to be four times $R_i$. The assumption for
the upper bound is doubled until it results in a CSF value less than $\Theta$. Once the upper
bound is determined, a bisection method can be used to find $R_f$. It is also possible to
determine $R_i$ if $R_f$, w and $\Theta$ are known. $R_i$ is greater than zero and $R_f$ increases
monotonically as $R_i$ increases. The algorithm for finding $R_i$ is essentially the same as the
method for finding $R_f$. The algorithm for finding $R_i$ is included in Appendix C.

The magnitude of the gradient of scanned circles is important for determining the

effects of noise. If the gradient is calculated by evaluating the CSF at two points, the

calculation is prone to error. The numerical integration has some noise which is

magnified by this technique. Instead it is better to take the derivative analytically. The

gradient is given by

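$$\frac{\partial\, \mathrm{CSF}(r)}{\partial r} = \int_{-R_i}^{R_i} \int_{-\sqrt{R_i^2 - x^2}}^{\sqrt{R_i^2 - x^2}} \frac{\partial}{\partial r}\, \mathrm{PSF}(x - r,\, y)\, dy\, dx. \tag{2.39}$$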

For a Cauchy PSF this simplifies to

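$$\frac{\partial\, \mathrm{CSF}_{Cauchy}(r)}{\partial r} = \frac{w}{\pi} \int_{-R_i}^{R_i} \frac{(x - r)\sqrt{R_i^2 - x^2}\,\left(3\left((x - r)^2 + w^2\right) + 2\left(R_i^2 - x^2\right)\right)}{\left((x - r)^2 + w^2\right)^2 \left(r^2 - 2rx + R_i^2 + w^2\right)^{3/2}}\, dx. \tag{2.40}$$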

For a Gaussian PSF the gradient is given by

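$$\frac{\partial\, \mathrm{CSF}_{Gaussian}(r)}{\partial r} = \frac{1}{\sqrt{2\pi}\, w^3} \int_{-R_i}^{R_i} (x - r) \exp\!\left(-\frac{(x - r)^2}{2w^2}\right) \mathrm{erf}\!\left(\frac{\sqrt{R_i^2 - x^2}}{\sqrt{2}\, w}\right) dx. \tag{2.41}$$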

In both cases the gradient is found by numerical integration. For a Gaussian PSF the same

problem exists with the integrand being near zero over a large part of the domain over

which it is integrated. The solution for finding the CSF can be applied in exactly the same

way to obtain the gradient.
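The analytic gradient for a Gaussian PSF can be sketched together with the CSF itself, including the shortened lower integration limit. This is a sketch under assumptions: it uses the Gaussian PSF exp(-(x² + y²)/(2w²)) / (2πw²), Simpson's rule with the substitution x = R_i sin(t), and hypothetical helper names. The finite difference in the usage note is only a sanity check on the analytic form, not a recommended way to compute the gradient.

```python
import math


def _simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number of intervals n."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def _gaussian_integral(r, w, r_i, weight):
    """Integrate weight(x) * exp(-(x-r)^2 / (2 w^2)) * erf(...) over x."""
    # Skip the part of the domain where the integrand is negligible.
    lower = min(max(-r_i, r - 5.0 * w), r_i)

    def f(t):                                 # substitution x = r_i * sin(t)
        x = r_i * math.sin(t)
        half_chord = r_i * math.cos(t)        # sqrt(r_i**2 - x**2)
        return (weight(x) * math.exp(-((x - r) ** 2) / (2.0 * w * w))
                * math.erf(half_chord / (math.sqrt(2.0) * w)) * half_chord)

    return _simpson(f, math.asin(lower / r_i), math.pi / 2.0)


def csf_gaussian(r, w, r_i):
    """Circle spread function for a Gaussian PSF."""
    return _gaussian_integral(r, w, r_i, lambda x: 1.0) / (math.sqrt(2.0 * math.pi) * w)


def csf_gaussian_gradient(r, w, r_i):
    """Analytic derivative of the Gaussian CSF with respect to r."""
    return (_gaussian_integral(r, w, r_i, lambda x: x - r)
            / (math.sqrt(2.0 * math.pi) * w ** 3))
```

Evaluating `csf_gaussian_gradient(5.0, 1.0, 5.0)` and comparing it with a central difference of `csf_gaussian` around r = 5 gives matching negative slopes, consistent with the CSF decreasing away from the center.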

2.1.5. Scanned Characters

The shape of scanned characters is too complicated to use many of the analytical

methods in the previous section. Instead the value of the scanned image is determined by



using the discrete convolution of a sampled character image and a sampled PSF. There is

some error associated with this method, but it is reduced by using sampled images and

PSFs with resolutions larger than the resolutions of the final images. The scale factor is
defined as the simulated resolution divided by the resolution of the final image. A PSF is
generated which is also sampled at the same scale factor as the character image. Because

each pixel in the sampled PSF represents an area smaller than the pixels in the final

image the PSF kernel is

$$\mathrm{PSFKernel}[i,j] = \frac{\mathrm{PSF}(x_j,\, y_i)}{scalefactor^2}. \tag{2.42}$$

Since the PSF kernel must be finite in size, the PSF is effectively truncated.

There are several advantages of using a Gaussian PSF over a Cauchy PSF in terms of

accurately simulating the continuous convolution. A Gaussian PSF can be safely
truncated at four times w with very little error. However, for a Cauchy PSF this is
a problem because it is a heavy-tailed distribution. To achieve the same accuracy the
Cauchy PSF would have to be truncated at about 3000 times w. Another advantage of

the Gaussian PSF is that it is separable. This means that the convolution can be calculated

by taking the one dimensional PSF, convolving it with each row, and then convolving it

with each column. While a Gaussian PSF provides several advantages over the Cauchy

PSF, the experiments in this thesis involve isolated characters. The white background

makes it possible for this situation to be simulated even for a Cauchy PSF as long as the

convolution kernel is a little more than twice the size of the original character. This is

because the truncated part of the Cauchy PSF would always be over white background.
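The kernel scaling and the separability of the Gaussian PSF can be illustrated with a short sketch. It assumes the Gaussian PSF exp(-(x² + y²)/(2w²)) / (2πw²) with w measured in final-image pixels and a truncation at four times w; the names `psf_kernel` and `separable_taps` are hypothetical.

```python
import math


def gaussian_psf(x, y, w):
    """Continuous 2-D Gaussian PSF with width parameter w (unit volume)."""
    return math.exp(-(x * x + y * y) / (2.0 * w * w)) / (2.0 * math.pi * w * w)


def psf_kernel(w, scale_factor, truncation=4.0):
    """PSF samples divided by the scale factor squared (Equation 2.42)."""
    radius = math.ceil(truncation * w * scale_factor)   # in high-resolution pixels
    coords = [k / scale_factor for k in range(-radius, radius + 1)]
    return [[gaussian_psf(x, y, w) / scale_factor ** 2 for x in coords]
            for y in coords]


def separable_taps(w, scale_factor, truncation=4.0):
    """1-D taps whose outer product reproduces the 2-D Gaussian kernel."""
    radius = math.ceil(truncation * w * scale_factor)
    norm = math.sqrt(2.0 * math.pi) * w * scale_factor
    return [math.exp(-((k / scale_factor) ** 2) / (2.0 * w * w)) / norm
            for k in range(-radius, radius + 1)]


kernel = psf_kernel(w=1.0, scale_factor=8)
taps = separable_taps(w=1.0, scale_factor=8)
kernel_sum = sum(sum(row) for row in kernel)   # close to 1 despite truncation
```

Dividing by the scale factor squared keeps the kernel sum close to one, so the blurred image stays in the range 0 to 1 regardless of the simulated resolution. Convolving rows and then columns with `taps` gives the same result as a 2-D convolution with `kernel` at a much lower cost.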

After the high resolution images are convolved with the high resolution truncated

PSFs, the images are then down sampled. The location of the final sampling grid does not



necessarily coincide with the high resolution sampling grid. In order to have random

continuous phase shifts and non-integer factor values it is necessary to interpolate the

values of pixels. To do this bilinear interpolation can be used because of its simplicity

and because the errors associated with it are not significant.

It is also necessary to determine the gradients of the scanned characters. While the

gradients could be measured from the high resolution grey level image, this is not the

method that was used. The derivative is a linear operation which means that the

derivative of the PSF can be taken and then the gradient of the image can be determined

by convolving the original character image with the resulting kernel. The derivative with

respect to x of the Cauchy PSF is

$$\frac{\partial}{\partial x}\, \mathrm{PSF}_{Cauchy}(x, y) = -\frac{3wx}{2\pi \left(x^2 + y^2 + w^2\right)^{5/2}}. \tag{2.43}$$

For the Gaussian PSF the derivative is

$$\frac{\partial}{\partial x}\, \mathrm{PSF}_{Gaussian}(x, y) = -\frac{x}{2\pi w^4} \exp\!\left(-\frac{x^2 + y^2}{2w^2}\right). \tag{2.44}$$

When these functions are used to create convolution kernels, the functions have to be

divided by the scale factor squared. The kernels for the derivatives with respect to y can

be obtained by transposition. The Gaussian kernel is separable which can be used to

speed up computations.

2.2. Edge Finding Techniques

This thesis focuses on the effect of additive Gaussian noise on scanned images. One

component is to explore the ability to accurately locate edges in scanned document

images. Finding lines in an image is critically important in the fields of image processing



and computer vision, and there is a substantial amount of work that has been done on the

topic. A significant amount of attention has gone to developing operators, which bring

out the edges in an image. This is usually followed by techniques that use the Hough

transform to find the location of the line [6]. There has also been study of accurately

locating edges and lines in bilevel rasterized images [8].

One approach to edge detection is to convolve the image with an edge detector and

then to threshold the image and locate the edge using a Hough transform [6]. The Hough

transform works by mapping points to the set of lines that pass through those points.

Edges can be represented by two parameters such as angle and distance from the origin.

These two parameters form a parameter space which can be divided into discrete bins.

The Hough transform is performed by looping through every edge point in the image and

then incrementing the value in every bin that contains parameters to an edge that runs

through the point. After this is done for every edge point the true edge can be determined

by finding the bin with the largest value.
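The voting procedure can be sketched as below. The parameterization rho = x cos(theta) + y sin(theta), the bin sizes, and the function name `hough_line` are assumptions of this sketch; the thesis only states that an angle and a distance from the origin can be used.

```python
import math
from collections import Counter


def hough_line(points, rho_step=0.25, theta_steps=180):
    """Vote in (theta, rho) space and return the parameters of the fullest bin."""
    votes = Counter()
    for x, y in points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Every line through (x, y) gets one vote in its parameter bin.
            votes[(t, round(rho / rho_step))] += 1
    (t, rho_bin), count = votes.most_common(1)[0]
    return math.pi * t / theta_steps, rho_bin * rho_step, count


# Edge points on the vertical line x = 5.
edge_points = [(5.0, float(y)) for y in range(40)]
theta, rho, count = hough_line(edge_points)
```

For these points all 40 votes fall in the bin with theta = 0 and rho = 5, so that bin identifies the edge. Finer bins give better precision but spread the votes of a noisy edge over more bins.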

There are a variety of operators that can be used for edge detection. One such

operator is the Sobel operator. The Sobel operator is a combination of two operators

which estimate the two components of the image gradient $G_x$ and $G_y$. If A is the original
image, $G_x$ is given by

$$G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} * A. \tag{2.45}$$

$G_y$ is calculated using an operator that is simply the transpose of the one used to calculate
$G_x$. The magnitude of the gradient can then be estimated as

$$G = \sqrt{G_x^2 + G_y^2}. \tag{2.46}$$

Once the gradient image is determined, it is thresholded to find the edge points, and then

the Hough transform is used to find the edge. The maximums of the Hough transform

correspond to the parameters of the edge. The Prewitt and Roberts operators work in a

way that is similar to that of the Sobel operator. Table 2.1 shows the Sobel, Prewitt, and

Roberts operators.

Table 2.1: Simple operators used for edge detection.

Sobel Operators            Prewitt Operators          Roberts Operators
-1  0  1    -1 -2 -1       -1  0  1    -1 -1 -1        0 -1    -1  0
-2  0  2     0  0  0       -1  0  1     0  0  0        1  0     0  1
-1  0  1     1  2  1       -1  0  1     1  1  1
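As an illustration of how the Sobel masks from Table 2.1 are applied, the sketch below computes the gradient magnitude of Equation 2.46 at the interior pixels of a small image. The masks are applied without flipping (correlation), which changes only the sign of the components and not the magnitude; the helper names and the test image are hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_mask(image, mask, row, col):
    """Weighted sum of the 3x3 neighborhood centered on (row, col)."""
    return sum(mask[i][j] * image[row + i - 1][col + j - 1]
               for i in range(3) for j in range(3))


def gradient_magnitude(image):
    """Gradient magnitude at every interior pixel; border pixels stay at zero."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = apply_mask(image, SOBEL_X, r, c)
            gy = apply_mask(image, SOBEL_Y, r, c)
            out[r][c] = math.hypot(gx, gy)
    return out


# A vertical step edge between columns 2 and 3.
step = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
magnitude = gradient_magnitude(step)
```

On this step edge the magnitude is 4 in the two columns adjacent to the edge and 0 elsewhere, so thresholding the magnitude image marks a two-pixel-wide band of edge points.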

In addition to the simple Sobel, Prewitt, and Roberts operators more complicated

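operators can be used. One such operator is the Laplacian of Gaussian (LoG) operator.
This operator is given by

$$h(r) = -\left(\frac{r^2 - \sigma^2}{\sigma^4}\right) \exp\!\left(-\frac{r^2}{2\sigma^2}\right). \tag{2.47}$$

This operator is the second derivative of a Gaussian function with a width parameter of $\sigma$.
The operator is circularly symmetrical. Numerically it is represented by at least a five by
five kernel. One approximation of the LoG kernel is given by

$$\mathrm{LoG} = \begin{bmatrix} 0 & 0 & -1 & 0 & 0 \\ 0 & -1 & -2 & -1 & 0 \\ -1 & -2 & 16 & -2 & -1 \\ 0 & -1 & -2 & -1 & 0 \\ 0 & 0 & -1 & 0 & 0 \end{bmatrix}. \tag{2.48}$$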

Some of the most important work on finding edges in grey level images was done by

Canny [7]. The theoretical basis for the edge detection mask developed by Canny



depends on being able to separate the image into noise and signal components. However,

when an image is subjected to a nonlinearity such as thresholding, the noise and signal

components cannot be separated in this way. The Canny operator begins by smoothing

the image with a Gaussian. Then the gradients of the image are determined. Two

thresholds are used to determine which pixels are edge pixels. The first threshold is set
higher than the other, and any pixel whose gradient exceeds this threshold is labeled
as an edge pixel. Then pixels that are adjacent to an edge pixel are also labeled edge
pixels if their gradient exceeds the second threshold. The Canny operator was
implemented in this thesis using MATLAB's built-in edge detection function.

Because using operators such as the Canny operator has no strong theoretical basis in

bilevel images, we can use a more basic approach. This approach involves selecting data

points between each pair of adjacent black and white pixels. Then a line can be fitted to

these points based on the least squared distance. The least squares fitting can either use

the squared vertical distance of points to the edge or use the squared perpendicular

distances. Gordon and Seering [8] analyzed the accuracy of least squares at finding the

location of edges. They use an assumption that the vertical distance between points on a

digitized line and its corresponding continuous line vary independently of one another.

Using this assumption they determined the estimation error of edges. The case in which

the points do not vary independently of one another will be explored in more detail in

Section 2.4.
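The two least squares variants can be sketched side by side. The closed-form perpendicular fit below uses the principal axis of the point scatter, which is a standard way to minimize perpendicular distances but is not necessarily how the thesis implements it; the function names are hypothetical, and the slope and intercept form assumes the edge is not vertical.

```python
import math


def _centered_sums(points):
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return mx, my, sxx, syy, sxy


def fit_vertical(points):
    """Minimize squared vertical distances; returns (slope, intercept)."""
    mx, my, sxx, _, sxy = _centered_sums(points)
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_perpendicular(points):
    """Minimize squared perpendicular distances; returns (slope, intercept)."""
    mx, my, sxx, syy, sxy = _centered_sums(points)
    # Direction of the principal axis of the scatter matrix.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    slope = math.tan(angle)
    return slope, my - slope * mx


on_line = [(float(x), 2.0 * x + 1.0) for x in range(5)]
scattered = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.9), (4.0, 4.1)]
```

Both fits recover slope 2 and intercept 1 for `on_line`. For `scattered` they give slightly different slopes because the vertical fit attributes all of the error to the y coordinate.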

The least squares approach and the operator based approaches are explored

extensively in Section 5.1.1. In that section experiments are conducted to determine



which of the methods work best for bilevel straight edges. The effectiveness of

perpendicular vs. vertical least squares will also be analyzed.

2.3. Circle Fitting Techniques

In order to understand the effects of noise on 2-D objects, it is necessary to explore

the effect of noise on the ability to precisely determine the position and radius of scanned

circles. To do this, data points were selected between adjacent pairs of black and white

pixels, then a circle was fit to these data points. Several classical methods of doing this

fitting are discussed in [9]. This section includes a discussion of these methods.

The simplest method for fitting a circle to data points is called Algebraic circle fitting.

The equation of a circle can be given implicitly by

$$F(\mathbf{x}) = a\,\mathbf{x}^T\mathbf{x} + \mathbf{b}^T\mathbf{x} + c = 0, \tag{2.49}$$

where the coefficients a, b and c are such that a is not zero and b is a two element

column vector. If the values of each data point are plugged into this equation, the result is

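$$B\mathbf{u} = \boldsymbol{\varepsilon}, \tag{2.50}$$

where $\boldsymbol{\varepsilon}$ is the error vector which is to be minimized, B is a matrix

$$B = \begin{bmatrix} x_{11}^2 + x_{12}^2 & x_{11} & x_{12} & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x_{m1}^2 + x_{m2}^2 & x_{m1} & x_{m2} & 1 \end{bmatrix} \tag{2.51}$$

and u=[a,b1,b2,c]. Since both sides of Equation 2.49 can be multiplied by a constant, a
constraint can be applied to u that it must be a unit vector. The squared Euclidean norm
of $\boldsymbol{\varepsilon}$ can be minimized using Lagrange multipliers. The constraint that u is a unit vector is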

applied to create the following equation


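$$\frac{\partial \|\boldsymbol{\varepsilon}\|^2}{\partial \mathbf{u}} = \lambda\, \frac{\partial \|\mathbf{u}\|^2}{\partial \mathbf{u}}. \tag{2.52}$$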

The left side of the equation becomes

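$$\frac{\partial \|\boldsymbol{\varepsilon}\|^2}{\partial \mathbf{u}} = \frac{\partial\, (B\mathbf{u})^T (B\mathbf{u})}{\partial \mathbf{u}} = \frac{\partial\, \mathbf{u}^T B^T B \mathbf{u}}{\partial \mathbf{u}} = 2 B^T B \mathbf{u}. \tag{2.53}$$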

The right side also simplifies giving

$$2 B^T B \mathbf{u} = 2 \lambda \mathbf{u}. \tag{2.54}$$

The Lagrange multipliers are also the eigenvalues of $B^T B$. Substitution gives

$$\|\boldsymbol{\varepsilon}\|^2 = \mathbf{u}^T B^T B \mathbf{u} = \lambda\, \mathbf{u}^T \mathbf{u} = \lambda. \tag{2.55}$$

This means that the squared Euclidean norm of $\boldsymbol{\varepsilon}$ is minimized by using the value of u
associated with the smallest eigenvalue of $B^T B$. Equivalently u is the right singular vector
associated with the smallest singular value of B. The center can be obtained from u using

$$\mathbf{z} = \left(-\frac{b_1}{2a},\; -\frac{b_2}{2a}\right). \tag{2.56}$$

The radius is obtained by using

$$r = \sqrt{\frac{\|\mathbf{b}\|^2}{4a^2} - \frac{c}{a}}. \tag{2.57}$$

The problem with Algebraic circle fitting is that minimizing the Euclidean norm of $\boldsymbol{\varepsilon}$ does

not necessarily result in the best fitting circle. It is especially poor when fitting a circle to

an arc of data points.

An alternative to the Algebraic method is Geometric circle fitting. Geometric circle

fitting is a nonlinear least squares procedure which minimizes the sum of the squared

distances of points to the nearest point on the circle. If the center point of the circle is z


and the radius is r, then the distance of a pixel to the circle is

$$d_i^2 = \left(\|\mathbf{z} - \mathbf{x}_i\| - r\right)^2. \tag{2.58}$$

If $\mathbf{u} = [z_1, z_2, r]^T$ defines the circle, then u needs to be selected to minimize

$$\sum_{i=1}^{m} d_i(\mathbf{u})^2. \tag{2.59}$$

A method called Gauss-Newton is used to minimize this expression. The method starts

out with a decent guess of the best value of u. If d(u) is a column vector of the functions
$d_i(\mathbf{u})$, then the idea is to find the change h in u which will minimize d(u) in the least
squares sense. To do this d(u+h) is approximated using a Taylor series expansion

$$\mathbf{d}(\mathbf{u} + \mathbf{h}) \approx \mathbf{d}(\mathbf{u}) + J(\mathbf{u})\,\mathbf{h}, \tag{2.60}$$

where J(u) is the Jacobian matrix. In this case the Jacobian is given by

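$$J(\mathbf{u}) = \begin{bmatrix} \dfrac{u_1 - x_{11}}{\sqrt{(u_1 - x_{11})^2 + (u_2 - x_{12})^2}} & \dfrac{u_2 - x_{12}}{\sqrt{(u_1 - x_{11})^2 + (u_2 - x_{12})^2}} & -1 \\ \vdots & \vdots & \vdots \\ \dfrac{u_1 - x_{m1}}{\sqrt{(u_1 - x_{m1})^2 + (u_2 - x_{m2})^2}} & \dfrac{u_2 - x_{m2}}{\sqrt{(u_1 - x_{m1})^2 + (u_2 - x_{m2})^2}} & -1 \end{bmatrix}. \tag{2.61}$$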

The change in u that is required is found by solving the linear least squares problem

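$$\mathbf{d}(\mathbf{u}) + J(\mathbf{u})\,\mathbf{h} \approx 0. \tag{2.62}$$

The value of h is

$$\mathbf{h} = -\left(J(\mathbf{u})^T J(\mathbf{u})\right)^{-1} J(\mathbf{u})^T\, \mathbf{d}(\mathbf{u}). \tag{2.63}$$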

With every iteration of the algorithm, h is used to update u and a closer approximate

solution of the nonlinear least squares problem is found. This method produces a much

better fit. As was stated earlier, this method requires an initial guess of the value of u.

One way to obtain this is to use the Algebraic circle fitting. Another way is to find the



mean of all the data points and make this the center of the circle. The radius can be

estimated by taking the distance of each point to this center and taking the mean of those

distances. Figure 2.6 compares the results of Algebraic and Geometric least squares for a

certain set of points. The points were chosen experimentally to show the weakness of the

Algebraic technique. The Geometric technique always produces better results because the

error that it attempts to minimize is more sensible.
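The Gauss-Newton iteration for Geometric circle fitting can be sketched as follows. The sketch starts from the centroid and mean distance described above and solves the 3x3 normal equations directly at each step; the fixed iteration count, the small elimination routine, and the function names are assumptions of this illustration.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= factor * m[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][k] * x[k] for k in range(r + 1, 3))) / m[r][r]
    return x


def fit_circle_geometric(points, iterations=20):
    """Gauss-Newton on d_i(u) = ||z - x_i|| - r with u = (z1, z2, r)."""
    n = len(points)
    z1 = sum(p[0] for p in points) / n            # initial center: centroid
    z2 = sum(p[1] for p in points) / n
    r = sum(math.hypot(p[0] - z1, p[1] - z2) for p in points) / n
    for _ in range(iterations):
        jtj = [[0.0] * 3 for _ in range(3)]
        jtd = [0.0] * 3
        for px, py in points:
            dist = math.hypot(z1 - px, z2 - py)
            row = [(z1 - px) / dist, (z2 - py) / dist, -1.0]   # one row of J(u)
            d = dist - r
            for i in range(3):
                jtd[i] += row[i] * d
                for j in range(3):
                    jtj[i][j] += row[i] * row[j]
        h = solve3(jtj, [-v for v in jtd])        # h = -(J^T J)^-1 J^T d
        z1, z2, r = z1 + h[0], z2 + h[1], r + h[2]
    return z1, z2, r


angles = [math.radians(a) for a in (0, 30, 70, 120, 200, 260, 310)]
circle_points = [(3.0 + 5.0 * math.cos(a), -2.0 + 5.0 * math.sin(a)) for a in angles]
center_x, center_y, radius = fit_circle_geometric(circle_points)
```

For points that lie exactly on a circle the iteration recovers the center and radius to machine precision within a few steps. A practical implementation would stop when the update h becomes small instead of running a fixed number of iterations.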

Figure 2.6 Algebraic Fitting can sometimes result in a poor fit. The center of the Algebraic fit is (10.24,20.98) and the radius is 4.83. The center of the Geometric fit is (10.10,7.92) and the radius is 11.77.

2.4. Effect of Sampling

Sampling of continuous bilevel images can produce several undesirable effects. Since
the position of the sampling grid relative to the image is random, there are variations that
occur in the resulting bitmaps. Even without noise there is an unavoidable Hamming
Distance between different scans of an image, even from the same scanner. The
geometric precision of edge and circle measurements is limited by the sampling
resolution. In addition the sampling grid for images is anisotropic. This means that edges
at certain orientations are measured with less precision than edges at other orientations.
One of the goals of this thesis is to explore how the random effects of noise and random
phase shifts affect document images.

A review of the literature shows several tools for analyzing the effect of phase on

scanned images. Dorst and Smeulders [10] gave an expression for determining the set of

continuous line segments which could generate a certain chaincode string. This

expression could also be used to find the worst case positional accuracy of an edge

segment. Dorst and Duin [11] introduced the concept of spirographs and used it to

calculate the average and worst case positional accuracy of edges. Havelock [12] used

modulo grids to analyze the positional accuracy of various shapes. Sarkar [1] expanded

upon Havelock's work by using modulo grids to calculate the number and frequency of

bitmaps that an object would produce.

Spirographs can be used to describe the way in which a continuous edge is sampled.

Any edge can be flipped on the reflection lines x=y, y=0 and x=0. Because of this the
effect of sampling any straight edge can be determined by studying those straight edges
with a slope in the range (0, 1). A spirograph consists of a circle with N points on it
which divide it into N arcs as seen in Figure 2.7.

Each consecutive point is placed the same constant clockwise distance around the

circle from the previous point. The sampling grid for an edge can be represented by

making the distance between each consecutive point equal to the slope of the edge. The

random location of the sampling grid with respect to the edge can be represented by

randomly placing the edge as a point on the spirograph; N is then the number of columns
in the edge image. If the edge is shifted up vertically, it is moved clockwise around the

circle. If it crosses a sample point on the spirograph, the bitmap of the edge will change,

and the number of segments formed around the spirograph is the number of bitmaps a

certain edge can have.
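The construction of a spirograph can be illustrated with exact rational arithmetic. The sketch assumes the circle has a circumference of 1, so that a vertical shift of one pixel corresponds to one full revolution; the function name `spirograph_arcs` is hypothetical.

```python
from fractions import Fraction


def spirograph_arcs(slope, n_points):
    """Arc lengths of a spirograph on a circle of circumference 1."""
    # Point k sits at k * slope around the circle; coincident points merge.
    positions = sorted({(k * slope) % 1 for k in range(n_points)})
    arcs = [b - a for a, b in zip(positions, positions[1:])]
    arcs.append(1 - positions[-1] + positions[0])   # wrap around the circle
    return arcs


half = spirograph_arcs(Fraction(1, 2), 20)      # two arcs of length 1/2
third = spirograph_arcs(Fraction(1, 3), 10)     # three arcs
general = spirograph_arcs(Fraction(3, 7), 5)    # five distinct points, five arcs
```

The number of arcs is the number of distinct bitmaps the edge can produce as its phase varies, and each arc length is the fraction of phases that produce the corresponding bitmap. For slopes with a small denominator, such as 1/2, many points coincide and only a few bitmaps are possible.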

One special case is when the slope of an edge can be represented by the irreducible

fraction p/q and when q

the sampling grid. The position of the edge relative to the sampling grid is a random

number. So the variance of the perpendicular distance between the measured and actual

positions of a noise free edge is

$$Var = \frac{1}{12\left(p^2 + q^2\right)}. \tag{2.65}$$

In order for the length of an edge segment with slope m to be L the number of
columns N must be determined by

$$N = \mathrm{round}\!\left(\frac{L}{\sqrt{m^2 + 1}}\right). \tag{2.66}$$

If a spirograph is defined with the first parameter being the distance between successive
points and the second by N, then the spirograph for this edge is

$$\mathrm{SPIRO}\!\left(m,\; \mathrm{round}\!\left(\frac{L}{\sqrt{m^2 + 1}}\right)\right). \tag{2.67}$$

The precision of edge measurements can be determined by the combination of two

parameters. The distance parameter is the perpendicular distance of the edge from the

midpoint of the continuous edge segment where the distance is positive if the measured

edge is above the continuous edge and negative otherwise. The angular error is the

difference between the angle of inclination associated with the theoretical edge and the

angle of inclination associated with the measured edge. The variance in the distance for

an edge can be shown to be

$$Var(Distance) = \sum_i \left( \frac{d_i^3}{12\left(m^2 + 1\right)} + d_i\, pe_i^2 \right), \tag{2.68}$$

where $d_i$ is the length of the ith arc on the spirograph and $pe_i$ is the perpendicular distance
between the actual and measured edge when the phase is chosen to be the midpoint of the
ith arc. The variance of the angular error between the measured and theoretical edge can
be shown to be

$$Var(AngularError) = \sum_i d_i\, ae_i^2, \tag{2.69}$$

where $ae_i$ is the angular error of the measured edge when the phase is chosen to be on the
ith arc. Figure 2.8 shows the variance of the distance and angular error as a function of

slope. The slopes of 0, 1/2 and 1 have large distance variance, but the angular errors for

these slopes are zero. The greatest angular errors occur for edges with slopes close to but

not equal to 0, 1/2 and 1.

In addition to the effects of sampling on the geometric measurements, sampling also

affects the Hamming distance between two scans of the same object. If the two scans had

the same phase and there were no noise, then the two scans would have a Hamming

distance of zero. However, because different phases result in different bitmap

configurations, there is some Hamming distance even when two objects are aligned to
minimize the difference. The Hamming distance that will occur depends on the shape of
the object being scanned. Modulo grids can be used to determine the expected Hamming
distance between scans with independent random phases and no noise. However, this
approach probably would not be more efficient than large experiments that generate scans
and then find the minimum Hamming distance. Certain shapes like circles are known to
have high Hamming distances because the locales in the modulo grid are small.
For this same reason these shapes have been analyzed for their use in image registration
[13],[14].

Figure 2.8: (a) The variance of the distance between the measured and actual edges is determined for edges that are 20 pixels long. (b) The variance in the angular error is determined for noiseless edges that are 20 pixels long.

Neither modulo grids nor spirographs can predict the effect of combining sampling

and noise on geometric measurements. The phase of simulated scans in an experiment

can be fixed in order to isolate the effects of noise. Then further experiments can explore

the combined effects of noise and random phase. These noise effects are explored in

detail in Chapters 4 and 5.



3. NOISE SPREAD THEORY

For grey level images noise is usually described by the standard deviation $\sigma_{noise}$ of the
additive noise. However the amount of noise present in a bilevel scanned image is not
dependent purely on the level of noise added prior to thresholding. This can be seen
clearly by looking at Figure 3.1. The first three images all have the same amount of
additive noise. However, the noise spread (NS) increases from left to right. One of the

central points of this thesis is to derive this quantity and show that it is a good

representation of the amount of noise in a bilevel image. This makes it possible to

generate synthetic bilevel images with specific amounts of noise.

3.1. Noise Spread for straight isolated edges

The basic idea behind noise spread is that when an image is thresholded the noise is
concentrated on the edges of the objects in the image. The noise spread for a given edge
is the size of the domain in which pixels are affected by additive noise. Typically this
domain, called the noise spread region, is less than a pixel thick. Its size is still relevant
because if it is larger then it is more likely that an edge pixel will be in this region. Noise
spread is dependent in part on the shape of the object being scanned. Initially noise
spread is derived for isolated edges. Isolated edges are among the simplest shapes upon
which to do experiments and can be represented in one dimension as step functions.
Section 2.1.2 discussed how straight edges are affected by scanning, but that section was
focused on the deterministic effects of scanning. Nondeterministic effects such as
additive noise must be discussed in the context of probability.

Figure 3.1 Edges with varying amounts of noise spread. While the standard deviation of the noise in the first three images is the same, the noise spread is different. The picture on the far right shows an extreme amount of noise.
Panel parameters, left to right:
(1) w=0.64, $\Theta$=0.5, $\sigma_{noise}$=0.05, NS=0.2
(2) w=1.27, $\Theta$=0.5, $\sigma_{noise}$=0.05, NS=0.4
(3) w=1.9, $\Theta$=0.5, $\sigma_{noise}$=0.05, NS=0.6
(4) w=3.16, $\Theta$=0.5, $\sigma_{noise}$=0.1, NS=2.0

Figure 3.2 shows how an edge is affected by scanning. Figure 3.2(a) shows what

happens when noise is disregarded. As was discussed in Section 2.1.2 the edge shifts by

$\delta_c$. However, as shown in Figure 3.2(b), when noise is added there is a region in which
the value of pixels after thresholding is uncertain. This region is called the noise spread
region. The size of this region is called the noise spread (NS), and as illustrated in Figure
3.1, it is a good quantitative measure of how noisy a bilevel image is. To precisely define
NS it is necessary to define the probability that a pixel at a certain distance from the edge
will be above the threshold. This threshold probability (THP) depends on the cumulative
distribution function (CDF) of the noise and is

$$\mathrm{THP}(x) = \mathrm{CDF}_{Gauss}\!\left(\frac{\mathrm{ESF}(x/w) - \Theta}{\sigma_{noise}}\right). \tag{3.1}$$

Figure 3.2: (a) Edge after blurring with a generic PSF of width w. When no noise is added, the thresholding produces the edge shift $\delta_c$. (b) Edge with noise. The uncertain boundary, shown in gray, is the noise spread region. The effects of sampling are not shown.

The noisy edge will be above the threshold with probability near 0 on one side of the NS
region and with a probability of near 1 on the other side of the NS region. The THP can

then be represented with a piecewise approximation

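$$\mathrm{THP}(x) \approx \begin{cases} 0, & x + \delta_c < -\dfrac{NS}{2} \\[1ex] \dfrac{x + \delta_c}{NS} + \dfrac{1}{2}, & -\dfrac{NS}{2} \le x + \delta_c \le \dfrac{NS}{2} \\[1ex] 1, & x + \delta_c > \dfrac{NS}{2} \end{cases} \tag{3.2}$$

where

$$NS = \left( \left. \frac{\partial\, \mathrm{THP}}{\partial x} \right|_{x = -\delta_c} \right)^{-1}. \tag{3.3}$$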

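The derivative of THP is a function of the noise's probability density function (PDF)

$$\frac{\partial\, \mathrm{THP}}{\partial x} = \frac{1}{w\, \sigma_{noise}}\, \mathrm{LSF}\!\left(\frac{x}{w}\right) \mathrm{PDF}_{Gauss}\!\left(\frac{\mathrm{ESF}(x/w) - \Theta}{\sigma_{noise}}\right). \tag{3.4}$$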

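Evaluating at $x = -\delta_c$, which is where ESF(x/w) = $\Theta$, gives

$$\left. \frac{\partial\, \mathrm{THP}}{\partial x} \right|_{x = -\delta_c} = \frac{\mathrm{LSF}\!\left(\mathrm{ESF}^{-1}(\Theta)\right)}{\sqrt{2\pi}\, \sigma_{noise}\, w}. \tag{3.5}$$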

Substituting this back into Equation 3.3 gives an estimate of the noise spread

$$NS = \frac{\sqrt{2\pi}\, \sigma_{noise}\, w}{\mathrm{LSF}\!\left(\mathrm{ESF}^{-1}(\Theta)\right)}. \tag{3.6}$$

Figure 3.3 shows the piecewise approximation of the threshold probability curve.

While the piecewise approximation is illustrative it is also fairly crude. The original

definition of the noise spread was the size of the domain in which the values of pixels are
uncertain. This level of uncertainty can be quantified by defining the noise spread as the
breadth of the domain over which the threshold probability is in the range ($\varepsilon$, $1-\varepsilon$). The
arbitrary cutoff $\varepsilon$ is used to determine the boundaries of the noise spread region. The more
accurate approximation of the THP starts by linearizing the ESF at $x = -\delta_c$

$$\mathrm{ESF}\!\left(\frac{x}{w}\right) \approx \Theta + \frac{1}{w}\, \mathrm{LSF}\!\left(\mathrm{ESF}^{-1}(\Theta)\right) (x + \delta_c). \tag{3.7}$$

Substituting this into Equation 3.1 gives

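$$\mathrm{THP}(x) \approx \mathrm{CDF}_{Gauss}\!\left(\frac{\mathrm{LSF}\!\left(\mathrm{ESF}^{-1}(\Theta)\right) (x + \delta_c)}{w\, \sigma_{noise}}\right). \tag{3.8}$$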

Figure 3.3 The threshold probability (THP) function is shown for a Gaussian PSF with w=1, $\Theta$=0.7 and $\sigma_{noise}$=0.1. Noise spread in this case is about 0.72. The piecewise approximation of threshold probability is inaccurate at the tails of the THP function.

A parameter Z can be defined such that

$$1 - \varepsilon = \mathrm{CDF}_{Gauss}(Z). \tag{3.9}$$

Since the Gaussian CDF is odd symmetric, the noise spread region will be centered on $-\delta_c$
so

$$\mathrm{THP}\!\left(-\delta_c + \frac{NS}{2}\right) = 1 - \varepsilon. \tag{3.10}$$

This can be evaluated by using Equation 3.8 which gives

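$$1 - \varepsilon = \mathrm{CDF}_{Gauss}\!\left(\frac{\mathrm{LSF}\!\left(\mathrm{ESF}^{-1}(\Theta)\right) NS}{2\, w\, \sigma_{noise}}\right). \tag{3.11}$$

NS is then solved for by using Equations 3.9 and 3.11, which produces

$$NS = \frac{2 Z\, \sigma_{noise}\, w}{\mathrm{LSF}\!\left(\mathrm{ESF}^{-1}(\Theta)\right)}. \tag{3.12}$$

This definition is identical to the one in Equation 3.6 if

$$Z = \frac{\sqrt{2\pi}}{2} = 1.253. \tag{3.13}$$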

In order to maintain consistency the cutoff defined in Equation 3.13 will be used. The

resulting value of $\varepsilon$ is 0.105 which is a reasonable level of uncertainty. This cutoff means
that the noise spread is the breadth of the domain over which the threshold probability is
in the range (0.105, 0.895). The edge images in Figure 3.1 show that noise is very
noticeable in images with NS values as low as 0.2. In most cases NS will be less than the
extreme example on the far right in Figure 3.1. At some point the added noise is extreme
enough that even pixels away from the edge have an uncertain value. When this occurs,
the approximation in Equation 3.8 is no longer valid. Where this occurs depends on the
PSF used and the degradation parameters. Generally the approximation is better when $\Theta$
is close to 0.5 and when noise levels are small. Figure 3.4 illustrates the approximation in
Equation 3.8. The parameters used in Figure 3.4 are extreme; for most sets of degradation

parameters it is very hard to distinguish the results from Equations 3.1 and 3.8.
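The comparison between the exact and linearized threshold probability can be sketched as follows. Two assumptions are made here that are not stated in this section: the PSF is Gaussian (ESF = standard normal CDF), and Equation 3.1 has the form THP(x) = CDF_Gauss((ESF(x/w) − Θ)/σ_noise). The function names are hypothetical.

```python
from statistics import NormalDist

N = NormalDist()  # assumption: Gaussian PSF, ESF = N.cdf and LSF = N.pdf

def thp_exact(x, w, theta, sigma_noise):
    # Assumed form of Equation 3.1: probability that blurred edge plus noise exceeds Theta.
    return N.cdf((N.cdf(x / w) - theta) / sigma_noise)

def thp_linear(x, w, theta, sigma_noise):
    # Equation 3.8: the ESF is linearized at x = -delta_c.
    z0 = N.inv_cdf(theta)   # ESF^{-1}(Theta)
    delta_c = -w * z0       # edge displacement, so the edge sits at x = -delta_c
    return N.cdf(N.pdf(z0) * (x + delta_c) / (w * sigma_noise))

w, theta, sigma_noise = 1.0, 0.7, 0.1
edge = w * N.inv_cdf(theta)  # x = -delta_c
for offset in (-0.5, 0.0, 0.5):
    x = edge + offset
    print(offset, thp_exact(x, w, theta, sigma_noise), thp_linear(x, w, theta, sigma_noise))
```

With these parameters the two curves agree at the edge and the exact curve falls slightly below the linearized one on either side, which is consistent with the behavior described for Figure 3.4.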

3.2. Extending Noise Spread to general shapes

Noise spread was introduced for straight edges in Section 3.1, but it is possible to extend the noise spread theory to arbitrary shapes. In Section 3.1 the noise spread region was defined to be anywhere the THP is between ε and 1−ε. This applies directly to general shapes. However, with the exceptions of isolated straight edges, scanned strokes, and circles, the noise spread region will not have the same thickness along the boundary of a general object. For this reason the noise spread must be defined for any point on the contour C defined by s(x,y)=Θ. To do this, NS can be defined as the thickness of the noise spread region along the direction defined by the gradient at any point on the contour C.

Figure 3.4 The threshold probability function given in Equation 3.1 (solid) is compared to the approximation in Equation 3.8 (dashed) with the parameters w=1, Θ=0.7 and σ_noise=0.1. The actual threshold probability is a little lower on the tails.



NS of an entire object can then be defined as the mean value of the noise spread on the contour C. If the mean reciprocal of the magnitude of the gradient of s(x,y) along C is estimated, then this estimate can be used to find the noise spread of the object. To do this a linearization procedure is required.

If (x0,y0) is a point on the contour s(x,y)=Θ, and the notation sx and sy is used to denote the partial derivatives with respect to x and y, then the approximation

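$$ s\left(x_0 + u\,\frac{s_x(x_0,y_0)}{\left|\nabla s(x_0,y_0)\right|},\; y_0 + u\,\frac{s_y(x_0,y_0)}{\left|\nabla s(x_0,y_0)\right|}\right) \approx \Theta + u\left|\nabla s(x_0,y_0)\right| \tag{3.14} $$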

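can be made, where u is the distance from (x0,y0) measured along the gradient direction. This relationship can be used to find the value of s(x,y) at any point near the contour s(x,y)=Θ. Following the reasoning that gave Equation 3.1, the THP can be represented as a function of u and the magnitude of the gradient

$$ \mathrm{THP}(u) \approx \mathrm{CDF}_{Gauss}\left(\frac{u\left|\nabla s\right|}{\sigma_{noise}}\right). \tag{3.15} $$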

The noise spread at any point on the contour can be expressed as a function of the

gradient at that point

$$ NS(x_0,y_0) = \frac{2Z\,\sigma_{noise}}{\left|\nabla s(x_0,y_0)\right|}. \tag{3.16} $$
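The step from Equation 3.15 to Equation 3.16 mirrors the straight-edge derivation in Section 3.1. A brief worked version is given below; it assumes, as in Equation 3.10, that the noise spread region is centered on the contour, so its boundary lies at u = NS/2. The last line takes the straight-edge case to be s(x,y) = ESF(x/w).

```latex
\begin{align*}
\mathrm{CDF}_{Gauss}\!\left(\frac{NS(x_0,y_0)}{2}\,
  \frac{\left|\nabla s(x_0,y_0)\right|}{\sigma_{noise}}\right) &= 1 - \varepsilon
  && \text{(Eq. 3.15 at } u = NS/2\text{)} \\
\frac{NS(x_0,y_0)}{2}\,\frac{\left|\nabla s(x_0,y_0)\right|}{\sigma_{noise}} &= Z
  && \text{(Eq. 3.9)} \\
NS(x_0,y_0) &= \frac{2Z\,\sigma_{noise}}{\left|\nabla s(x_0,y_0)\right|} \\
\left|\nabla s\right| = \frac{1}{w}\,\mathrm{LSF}\!\left(\mathrm{ESF}^{-1}(\Theta)\right)
  \;&\Rightarrow\;
  NS = \frac{2Z\,\sigma_{noise}\,w}{\mathrm{LSF}\!\left(\mathrm{ESF}^{-1}(\Theta)\right)}
  && \text{(straight edge, Eq. 3.12)}
\end{align*}
```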

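The noise spread of the object is just the line integral of the local noise spread with respect to arc length, divided by the total arc length L

$$ NS = \frac{1}{L}\int_C NS(x,y)\,dl = \frac{2Z\,\sigma_{noise}}{L}\int_C \frac{1}{\left|\nabla s(x,y)\right|}\,dl. \tag{3.17} $$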

In the case of straight edges, scanned strokes, and circles the gradient is constant. For

other shapes the line integral has to be estimated numerically.
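One simple way to estimate the line integral numerically is a Riemann sum over sampled contour points. The sketch below is only an illustration under stated assumptions: the gradient magnitudes and the arc-length segment associated with each sample are taken as already known, and the function and argument names are hypothetical. How the contour is sampled is left open.

```python
from math import sqrt, pi

def object_noise_spread(grad_mags, seg_lengths, sigma_noise):
    """Riemann-sum estimate of Equation 3.17 (hypothetical helper).

    grad_mags:   |grad s| sampled at points along the contour C
    seg_lengths: arc length dl associated with each sample
    """
    z = sqrt(2 * pi) / 2                                   # Equation 3.13
    total_length = sum(seg_lengths)                        # L
    integral = sum(dl / g for g, dl in zip(grad_mags, seg_lengths))
    return 2 * z * sigma_noise * integral / total_length

# Constant gradient (straight-edge case): the estimate reduces to Equation 3.16.
flat = object_noise_spread([0.5] * 4, [0.25] * 4, sigma_noise=0.1)
# Gradient that varies along the contour.
mixed = object_noise_spread([0.5, 0.25], [1.0, 1.0], sigma_noise=0.1)
print(flat, mixed)
```

Because the reciprocal of the gradient is averaged, low-gradient parts of the contour contribute more noise spread than high-gradient parts of the same length.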



Noise spread has been generalized to apply to arbitrary shapes. It is a very powerful

measure of the edge noise in binary document images. It makes it possible to compare

noise levels of different objects scanned with different scanner parameters. In Section 3.3

its relationship with the Hamming distance of a scanned object and its noiseless template

is explored.

3.3. Relationship between Noise Spread and Hamming Distance

The real benefit of determining the noise spread of a scanned object is that it provides

an effective measure of how noisy an object is. Since Hamming distance provides a

metric of how different two scanned objects are, it is very useful for analyzing the noise

in bilevel images. The Hamming distance between a template and a scanned character is

determined by the combination of the phase effects and of the noise. The phase effects

were described earlier. If the phase effects are removed by forcing the template and the

scanned object to have the same phase, then the effects of noise alone can be analyzed.

When this is done it is possible to relate the expected Hamming distance H to the noise

spread.

To see how this is true it is best to start with the case of isolated scanned edges. The

probability of error (PE) is defined to relate the expected Hamming distance to the noise

spread. The probability of error is related to the THP and is the probability of a pixel

having a different value because of noise than it would without noise. Formally it is

defined by

tol)

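    % bisection step: test the midpoint of the current bracket
    R=(Rmin+Rmax)/2;
    [CSF]=CauchyCSF(Rf,alpha,R);
    % keep the half of the bracket in which CSF crosses theta
    if CSF-theta>0
        Rmax=R;
    else
        Rmin=R;
    end
end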
