mcgillivary thesis
TRANSCRIPT
-
8/8/2019 McGillivary Thesis
1/118
QUANTIFYING NOISE EFFECTS IN BILEVEL DOCUMENT IMAGES
by
Craig D. McGillivary
A thesis
submitted in partial fulfillment
of the requirements for the degree of
Master of Science in Electrical Engineering
Boise State University
October, 2007
© 2007 Craig D. McGillivary
ALL RIGHTS RESERVED
The thesis presented by Craig D. McGillivary entitled Quantifying Noise Effects in Bilevel Document Images is hereby approved
Elisa H. Barney Smith Date
Advisor
Tim Andersen Date
Committee Member
Jim Browning Date
Committee Member
John R. Pelton Date
Dean of the Graduate College
ACKNOWLEDGEMENTS
I would like to thank Dr. Barney Smith, who encouraged me to get a graduate degree and
then poked and prodded me until I completed it. I would not have succeeded without her
support and mentorship.
I would also like to thank my friends and family. They helped me to overcome the
stress and frustrations that came up from time to time as I worked towards my goals. I
would especially like to thank my coworkers Joetta Anderson, Chris Hale, Darrin Reed, and
Jim Steele, who acted as sounding boards for ideas, reviewed and edited my writing, and
provided friendship in this journey. I am honored and grateful to my committee members
Dr. Tim Andersen and Dr. Jim Browning, who provided time and energy to review this
thesis.
This material is based upon work supported by the National Science Foundation
under Grant No. CCR-0238285. Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the author(s) and do not
necessarily reflect the views of the National Science Foundation.
ABSTRACT
The effect of binarization via global thresholding on additive Gaussian noise in high
contrast images is explored. A measure of noise in bilevel images called noise spread is
developed with the use of a degradation model that applies to many image degradations
included in desktop scanning. When high contrast images are binarized, noise is
concentrated on the edges of objects in the image. Noise spread is the breadth of the
domain in which pixels are affected by noise after binarization. It depends on both the
noise level and the gradients of the image prior to thresholding.
There is a strong linear relationship between noise spread and the expected Hamming
distance between an image with noise added and the same image without noise added. It
is also known that if two images of an object are synthetically generated with
independent random phases and zero noise, there is a small Hamming distance between
them. Experiments on circles and on the character 5 were run to determine the combined
effect of random independent phase and noise spread on the expected Hamming distance.
The experiments show that the two factors are not additive and that the phase effects
become less significant when the noise spread increases. The degree to which this is true
depends on the shape of the object being scanned.
In addition to experiments on Hamming distance, experiments were run to determine
the geometric precision of images with noise. This includes experiments relating noise
spread to the localizability of straight edges at several different orientations. The
localizability of an edge is defined by the ability to determine the orientation and
position of an edge segment. The quality of edge measurements is quantified by the angle
between the measured edge and the true edge and by the distance between the measured
edge segment and the midpoint of the true edge segment. Surprisingly, the distance
measurements for edges at certain orientations actually become more precise when the
noise spread increases, but this variation is offset by less precision in the measurements
of edge orientation. For most edge orientations the precision of both distance measurements
and orientation measurements decreases when noise spread increases. Experiments relating
noise spread to the localizability of circles were also conducted. These experiments
reveal that the positional error in circle measurements has a Rayleigh distribution, while
the radius measurements have a normal distribution, and that circle localizability
decreases as noise spread increases.
Noise spread provides a strong theoretical foundation for future research. Since
random effects play a critical role in optical character recognition (OCR) and in pattern
recognition generally, it is important to understand and quantify them. Future research
will focus on relating noise spread to human preference, on finding novel techniques for
measuring noise spread directly from binary images, and on developing filters and other
techniques which will make OCR systems less susceptible to noise. Noise spread may
also have unforeseen applications in problems other than document research.
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF SYMBOLS, TERMS & ABBREVIATIONS
1. INTRODUCTION
2. TECHNICAL BACKGROUND
2.1. GENERATING SYNTHETIC SCANNED IMAGES
2.1.1. Basic Scanner Model
2.1.2. Scanned Straight Edges
2.1.3. Scanned Strokes
2.1.4. Scanned Circles
2.1.5. Scanned Characters
2.2. EDGE FINDING TECHNIQUES
2.3. CIRCLE FITTING TECHNIQUES
2.4. EFFECT OF SAMPLING
3. NOISE SPREAD THEORY
3.1. NOISE SPREAD FOR STRAIGHT ISOLATED EDGES
3.2. EXTENDING NOISE SPREAD TO GENERAL SHAPES
3.3. RELATIONSHIP BETWEEN NOISE SPREAD AND HAMMING DISTANCE
3.4. NOISE SPREAD WITH VARYING NOISE LEVELS
4. HAMMING DISTANCE BETWEEN SCANNED OBJECTS
4.1. HAMMING DISTANCE BETWEEN SCANNED EDGES
4.2. HAMMING DISTANCE OF SCANNED CIRCLES
4.3. HAMMING DISTANCE OF SCANNED CHARACTERS
5. GEOMETRIC MEASUREMENTS OF BILEVEL SCANS
5.1. LOCALIZABILITY OF SCANNED EDGES
5.1.1. Comparing Various Approaches to Finding Edges
5.1.2. Effect of Threshold
5.1.3. Effect of PSF Width
5.1.4. Effect of Edge Orientation
5.2. LOCALIZABILITY OF SCANNED CIRCLES
6. CONCLUSIONS AND FUTURE WORK
REFERENCES
APPENDIX A
Iso Curves for Cauchy and Gaussian PSFs
APPENDIX B
Code for Calculating δ_τ
APPENDIX C
Code for Calculating Ri
LIST OF FIGURES
Figure 1.1 The ideal edge and the noisy edge have the same phase and orientation. The Hamming distance between the two images is the number of pixels that are different between the two images.
Figure 2.1 This scanner model is used to determine the value of the pixel f[i,j] centered on each sensor element.
Figure 2.2 Gaussian ESFs are shown with w1=1 and w2=2. When the image is blurred and thresholded the position of the edge is shifted from the dotted step function to the solid edge function. This shift is called δ_c and is by convention positive when the edge is shifted to the left.
Figure 2.3 When a stroke characterized by two parallel edges is scanned the resulting stroke thickness changes. Interference between the parallel edges causes the grayscale value of pixels to be less than that predicted by the ESF. As a result the stroke thickness will be less than that predicted by δ_c.
Figure 2.4 The iso curves for a Cauchy PSF are significantly different from the δ_c curves even when the stroke width is 15.
Figure 2.5 When a circle is scanned its radius changes. As with scanned strokes, if the threshold is too high the circle will disappear when it is scanned.
Figure 2.6 Algebraic fitting can sometimes result in a poor fit. The center of the algebraic fit is (10.24, 20.98) and the radius is 4.83. The center of the geometric fit is (10.10, 7.92) and the radius is 11.77.
Figure 2.7 Spirographs are a useful tool for studying the phase effects of straight edges. This spirograph was created using N=20 and an edge with a 35 degree angle of inclination. The marks on the circle represent the locations of sample points relative to the edge. The lines on the interior of the circle connect sample points that are on adjacent columns of the image. The location of the edge can be represented by a point on the circle.
Figure 2.8 (a) The variance of the distance between the measured and actual edges is determined for edges that are 20 pixels long. (b) The variance in the angular error is determined for noiseless edges that are 20 pixels long.
Figure 3.1 Edges with varying amounts of noise spread. While the standard deviation of the noise in the first three images is the same, the noise spread is different. The picture on the far left shows an extreme amount of noise.
Figure 3.2 (a) Edge after blurring with a generic PSF of width w. When no noise is added, the thresholding produces the edge shift δ_c. (b) Edge with noise. The uncertain boundary, shown in gray, is the noise spread region. The effects of sampling are not shown.
Figure 3.3 The threshold probability (THP) function is shown for a Gaussian PSF with w=1, Θ=0.7 and σ_noise=0.1. Noise spread in this case is about .72. The piecewise approximation of threshold probability is inaccurate at the tails of the THP function.
Figure 3.4 The threshold probability function given in Equation 3.1 (solid) is compared to the approximation in Equation 3.8 (dashed) with the parameters w=1, Θ=0.7 and σ_noise=0.1. The actual threshold probability is a little lower on the tails.
Figure 4.1 The relationship between the expected Hamming distance and the noise spread is shown for (a) 13 different PSF widths, (b) 7 different thresholds and (c) 8 different angles of inclination. The PSF width, threshold and orientation of the edge don't affect the relationship between Hamming distance and noise spread as long as it is not a degenerate orientation with a slope that can be represented by an irreducible fraction of small integers, such as 0 degrees.
Figure 4.2 Hamming distance versus noise spread for circles of varying radii. (a) Both images have the same phase shift. (b) Images have random independent phases. (c) Noise spread is plotted against the Hamming distance for both in-phase and independent phase together using the average of the results from each radius. The effect of phase is less significant when the noise spread is higher.
Figure 4.3 (a) Results of in-phase experiments. (b) Results of independent phase experiments show a strong relationship between noise spread and expected Hamming distance.
Figure 4.4 The effect of using an independent phase becomes less significant as noise increases.
Figure 5.1 An example of a straight edge scanned with w=3, Θ=.3 and NS=.6. The solid line shows the position of the original edge. The dotted line is the theoretical position of the scanned edge which results from shifting by δ_c. The center point is the midpoint of the theoretical edge segment.
Figure 5.2 (a) In the variance of angle measurements, perpendicular fitting and standard fitting outperform the other methods, and LoG is particularly poor. (b) The variance of distance measurements clearly shows that standard and perpendicular fitting both perform markedly better than the other methods.
Figure 5.3 (a) The perpendicular and standard fitting both have low perpendicular bias. However, this is not true of the other methods. (b) There is a small angular bias in the standard fitting compared to the perpendicular fitting. LoG has a very large angular bias.
Figure 5.4 (a) Variance of angular error with a fixed sampling grid. (b) Variance of distance measurements for fixed sampling grid.
Figure 5.5 (a) Bias in angle measurement with a fixed sampling grid. (b) Bias in edge position measurement with a fixed sampling grid.
Figure 5.6 (a) Variance of angular error with a random sampling grid. (b) Variance of distance measurements for random sampling grid.
Figure 5.7 (a) Bias in angle measurement. (b) Bias in edge position measurement.
Figure 5.8 (a) Variance of angular error with a random sampling grid and several different values of w. (b) Variance of distance measurements for a random sampling grid and several different values of w.
Figure 5.9 (a) Bias in the orientation measurements of edges with fixed sampling grid and several different values of w. (b) Bias of distance measurements for a fixed sampling grid and several different values of w.
Figure 5.10 (a) Variance of angular error with a random sampling grid and several different values of w. (b) Variance of distance measurements for a random sampling grid and several different values of w.
Figure 5.11 (a) Bias in the orientation measurements of edges with random sampling grid and several different values of w. (b) Bias of distance measurements for a random sampling grid and several different values of w.
Figure 5.12 (a) Bias in the orientation measurements of edges with a fixed sampling grid and several different edge orientations. (b) Bias in the distance measurements of edges with a fixed sampling grid and several different edge orientations.
Figure 5.13 (a) Variance in the orientation measurements of edges with a fixed sampling grid and several different edge orientations. (b) Variance of the distance measurements of edges with a fixed sampling grid and several different edge orientations.
Figure 5.14 (a) Variance of angular error with a random sampling grid and several different edge orientations. (b) Variance of distance measurements for random sampling grid and several different edge orientations.
Figure 5.15 (a) Bias in the orientation measurements of edges with random sampling grid and several different edge orientations. (b) Bias of distance measurements for random sampling grid and several different edge orientations.
Figure 5.16 (a) Variance of angular error with a fixed sampling grid and degenerate edge orientations. (b) Variance of distance measurements for a fixed sampling grid and degenerate edge orientations.
Figure 5.17 (a) Bias of angular error with a fixed sampling grid and degenerate edge orientations. (b) Bias of distance measurements for a fixed sampling grid and degenerate edge orientations.
Figure 5.18 (a) Variance of angular error with a random sampling grid and degenerate edge orientations. (b) Variance of distance measurements for a random sampling grid and degenerate edge orientations.
Figure 5.19 (a) Bias in the orientation measurements of edges with random sampling grid and degenerate edge orientations. (b) Bias of distance measurements for random sampling grid and degenerate edge orientations.
Figure 5.20 Distance and Angular Error for 0 degree edge without noise.
Figure 5.21 Distance and Angular Error for 20 degree edge without noise.
Figure 5.22 Distance and Angular Error for 0 degree edge with NS=.3.
Figure 5.23 Distance and Angular Error for 20 degree edge with NS=.3.
Figure 5.24 (a) The radius measurements from Experiment 1 have a normal distribution. (b) The distances between the measured circle centers and the actual circle centers from Experiment 1 have a Rayleigh distribution.
Figure 5.25 (a) The Rayleigh parameter of the positional error for several different values of w. (b) Variance of radius measurements for several different values of w.
Figure 5.26 (a) The Rayleigh parameter of the positional error for several different values of Θ. (b) Variance of radius measurements for several different values of Θ.
LIST OF TABLES
Table 2.1 Simple operators used for edge detection.
Table 3.1 Functions for calculating blurred noise.
Table 4.1 Edge experiment parameters.
Table 5.1 Parameters for threshold experiments.
Table 5.2 Parameters for PSF width experiments.
Table 5.3 Parameters for edge orientation experiments.
Table 5.4 Circle experiment parameters.
LIST OF SYMBOLS, TERMS & ABBREVIATIONS
OCR Optical Character Recognition
PSF Point Spread Function
ESF Edge Spread Function
LSF Line Spread Function
CSF Circle Spread Function
o(x,y) bilevel continuous input image
s(x,y) intensity of pixels at (x, y) before noise
s(x) intensity of pixels at position x before noise
s[i,j] sampled unquantized output of blurring convolution
f[i,j] final binary scanner output
n[i,j] Gaussian noise added in degradation model
σ_noise standard deviation of Gaussian noise
w general width parameter for PSF
Θ binarization or threshold level
Distance between samples
Θ_max binarization or threshold level which causes stroke to disappear
δ_c distance that an edge shifts when it is scanned
τ Thickness of a scanned stroke before scanning
τ_scanned Thickness of a scanned stroke after scanning
δ_τ Amount that the thickness of a scanned stroke increases by
Rf Scanned circle radius
Ri Original circle radius
α Cutoff used for noise spread
Z_α Z value for the α cutoff for noise spread
(x) Density of pixels
v(x,y) Noise on source image
qs(x,y) Variance of noise on source image
qb(x,y) Variance of noise after blurring
Angle of inclination for edge
1. INTRODUCTION
High contrast images, which occur frequently in document images, are often digitized
into binary images. These binary images are then analyzed by optical character
recognition (OCR) systems which convert the text images into ASCII characters. OCR
systems depend on many features measured from the text images. They use large sets of
images that have already been classified and ideally are representative of the characters in
document images. Features are then measured from these training sets and those
measurements are used to label unclassified characters. One way of generating these
training sets is to generate large numbers of synthetic character images whose labels are
known and then to use these images to train the OCR system. To do this effectively it is
best if the synthetic characters are as similar as possible to real scanned characters. It is
not enough that individual characters be similar to characters in real document images;
the statistical properties of large numbers of characters need to match those of real
characters. This means that a good theoretical basis for the nondeterministic effects in the
generation of scanned characters is required.
One of the nondeterministic aspects of generated characters is the random position or
phase of the sampling grid relative to a scanned object. When the sampling grid is shifted
relative to a continuous character, a large number of different bitmaps can result. It is
possible to determine the number and frequency of each of these bitmaps using modulo
grid diagrams [1]. Modulo grid diagrams are formed by performing a modulo one
operation with respect to both the horizontal and vertical coordinates of an object's
boundary. The effects of random phase can be incorporated into OCR by generating
training sets with random phases.
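The modulo one operation can be illustrated with a minimal sketch. It assumes a unit pixel pitch and uses a circle as the object; the function name `modulo_grid_points` and its parameters are hypothetical and are not taken from the thesis or from [1].

```python
import math

def modulo_grid_points(radius, center=(0.0, 0.0), n_points=360):
    """Fold points on a circle's boundary into the unit square.

    Assumption: the sampling grid has a pitch of one pixel, so the
    modulo one operation is applied directly to x and y.
    """
    cx, cy = center
    points = []
    for k in range(n_points):
        angle = 2.0 * math.pi * k / n_points
        x = cx + radius * math.cos(angle)
        y = cy + radius * math.sin(angle)
        # Modulo one operation on both the horizontal and vertical coordinate.
        points.append((x % 1.0, y % 1.0))
    return points

pts = modulo_grid_points(radius=2.5)
print(pts[0])  # first boundary point folded into the unit square
```

Plotting the folded points draws the boundary wrapped into a single grid cell, which is the raw material for a modulo grid diagram.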
Unfortunately random phase is not the only nondeterministic aspect of scanned
characters. Another source of randomness is the noise that is in the document image prior
to digitization and the noise that is added during scanning. When the image is binarized,
the noise becomes concentrated on the boundaries of objects. In order for training sets to
have the same statistical properties as real scanned characters, it is necessary for the
training sets to have the appropriate amount of noise. Before this problem can be
addressed the amount of noise on the edges of binary images must be quantified and
understood theoretically.
The amount of noise in binary images is not only dependent on the amount of noise
added to the image, but on the shape of the object being scanned and on the scanner
model parameters. This research focuses on quantifying the noise in bilevel images and
on relating that amount of noise to the parameters of a commonly used degradation
model. The quantity that was developed is called noise spread. Noise spread is the size of
the domain over which pixels in the binary image are affected by noise. While the
research in this thesis focuses on document images, the concept of noise spread could be
applied to other situations where an image is binarized.
Noise spread provides a theoretical basis for understanding and measuring noise in
document images. It is critical for developing methods to mitigate the effects of noise in
binary images. Noise spread allows bilevel images to be created with different
degradation parameters but the same amount of noise. Filters can be tested to see if the
negative effects of noise can be suppressed. Measurement techniques can also be
developed to measure noise directly from document images. The noise in a binary
document image can be measured, and the training sets that are used to design an OCR
system could have the same levels of noise as the documents that the OCR system is
being applied to. Alternatively the noise measurement could be included into a general
OCR system as an additional feature.
Noise in binary images is equivalent to errors in general binary signals. In
information theory the Hamming distance between two binary signals of the same length
is the number of bits that are different between the two signals. The amount of noise in a
bilevel image can reasonably be measured by the Hamming distance between the image
with noise and the same image without noise. Figure 1.1 shows the Hamming distance for
images of a straight edge with and without noise. Theory about the relationship between
Hamming distance and noise spread will be introduced in Section 3.3, and experiments to
verify this theory are described in Chapter 4. Chapter 4 also includes experiments on the
relationship between the Hamming distance and noise spread when the phases of the two
objects are independent.
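As a small illustration of this definition, the Hamming distance between two bilevel images can be computed by counting differing pixels. The sketch below assumes images stored as nested lists of 0 and 1; the function name and the toy images are illustrative only.

```python
def hamming_distance(image_a, image_b):
    """Count the pixels that differ between two equally sized binary images."""
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have the same dimensions")
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(image_a, image_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )

# Toy vertical edge (1 = black) and a copy with two pixels flipped near the edge.
ideal = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [1, 1, 0, 0]]
noisy = [[1, 1, 0, 0],
         [1, 0, 0, 0],
         [1, 1, 1, 0]]
print(hamming_distance(ideal, noisy))  # prints 2
```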
Figure 1.1: The ideal edge and the noisy edge have the same phase and orientation. The Hamming
distance between the two images is the number of pixels that are different between the two
images. (Panels: Ideal Edge, Noisy Edge, Hamming Distance.)
Since objects with noise are likely harder to precisely locate in an image, several
localization experiments were conducted to verify that noise spread accurately quantifies
the noise in thresholded images. The experiments were designed to verify that noise
spread accounts for all the effects of different document degradation parameters on the
localizability of circles and straight edges. There are applications where the ability to
precisely locate edges in bilevel images is important. For instance Hok Sum Yam [2]
developed a method for finding the degradation parameters from bilevel document
images. The method depended on precise edge measurements in order to determine how
much the corners of characters were eroded by scanning.
Chapter 2 discusses the technical background. That chapter includes a discussion of
how synthetic scanned images are created, a review of edge and circle localization
techniques, and a discussion of the effect of sampling. Chapter 3 introduces noise spread
in great detail and shows that it is related to Hamming distance. Chapter 4 provides an
experimental relationship between Hamming distance and noise spread that reinforces
and expands upon the relationship theorized in Chapter 3. Experiments in Chapter 5
provide evidence that all of the effects of the document degradation parameters on the
localizability of scanned edges and circles are determined by the noise spread.
2. TECHNICAL BACKGROUND
Before the theory behind noise in binarized images can be presented, it is necessary to
discuss the basic degradation model upon which it is based. It is also necessary to review
literature on the precise geometric measurements of straight edges and circles in images.
Determining the effect of noise on localizability is important to prove that noise spread is
a good quantitative measure of edge noise in bilevel images. The effects of phase are also
important because phase affects both localizability and Hamming distance between
objects.
The degradation model describes the acquisition of a binary scanned image as a
multistage process whose steps include: convolving with a point spread function (PSF),
sampling, adding noise, and thresholding. Section 2.1 will discuss the document
degradation model in more depth and show how the model can be applied to scanned
edges, circles, and strokes. A method for determining the gradients of the simulated grey
level images is also discussed.
Edge finding is a very basic computer vision task and has been widely studied. The
experiments in this thesis are designed to determine the localizability of scanned edges
under different amounts of noise. Section 2.2 provides a review of the literature on edge
finding techniques and discusses in detail the edge finding techniques that were used in
this thesis.
There are several techniques for localizing circles. All the techniques that were
considered use least squares, but some techniques work better than others. Section 2.3
discusses these circle localization techniques and describes the Gauss-Newton algorithm
used to solve the nonlinear least squares problem.
The effects of sampling images have been studied extensively. Section 2.4 reviews
the literature on sampling effects, discusses the effect of sampling and edge orientation
on the localizability of scanned edges, and also discusses tools like the modulo-grid
diagram which provide a means for predicting the possible bitmaps that result from
different shapes due to phase effects alone.
2.1. Generating Synthetic Scanned Images
This thesis uses a degradation model based on a model proposed by Baird [3]. In that
model an ideal continuous bilevel image is convolved with a point spread
function (PSF) and sampled. Then Gaussian noise is added to the image to represent noise added
during scanning and noise that would have been originally present on the paper image.
Finally, the image is binarized at a certain threshold level as shown in Figure 2.1.
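The four stages (blur, sample, add noise, threshold) can be sketched for the simplest object, a vertical straight edge. This is only an illustration under stated assumptions: the PSF is taken to be Gaussian with w as its standard deviation, so the blurred edge is a normal CDF; pixels are sampled at integer positions; and a pixel is set to black when its value is at least the threshold. The function name `scan_vertical_edge` and its parameters are hypothetical.

```python
import random
from statistics import NormalDist

def scan_vertical_edge(rows, cols, edge_x, w, sigma_noise, theta, seed=0):
    """Simulate a bilevel scan of a vertical edge that is black (1) left of edge_x."""
    rng = random.Random(seed)
    esf = NormalDist().cdf  # assumed Gaussian edge spread function
    image = []
    for _ in range(rows):
        row = []
        for j in range(cols):
            s = esf((edge_x - j) / w)            # blurred value sampled at x_j = j
            a = s + rng.gauss(0.0, sigma_noise)  # independent additive Gaussian noise
            row.append(1 if a >= theta else 0)   # global threshold
        image.append(row)
    return image

for row in scan_vertical_edge(rows=4, cols=10, edge_x=4.5, w=1.0,
                              sigma_noise=0.1, theta=0.5):
    print("".join(str(p) for p in row))
```

With `sigma_noise=0.0` every row is identical; with noise, pixels far from the edge are rarely affected and most of the variation appears near it, which matches the behavior described above.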
This section begins with a detailed mathematical description of the scanner model.
Then in subsequent subsections this model is applied to straight edges, scanned strokes,
circles, and general scanned characters. In order to understand how noise affects scanned
bilevel images, we will need to determine the intensity gradients of the image before
thresholding. These gradients help determine how noisy a bilevel image will be. A
technique for obtaining these gradients is discussed for each of the different types of
scanned objects.
2.1.1. Basic Scanner Model
The basic scanner model describes the sampling of the spatially continuous image of
blackness or absorptance, o(x,y), where absorptance is one minus the reflectance. The
values of o(x,y) can be either 0 (white) or 1 (black). The image is digitized by a sensor
array in the scanner. A PSF is used to model the fact that for each point on a physical
paper image different amounts of light are reflected to each sensor. The PSF is the 2-D
equivalent to the impulse response of a scanner. This equivalence means that convolution
can be used to predict the amount of reflected light each sensor detects. If the image is
sampled at points x_j, y_i on a rectangular grid, then the image is given by
\[ s[i,j] = \iint \mathrm{PSF}(x_j - u,\; y_i - v)\, o(u,v)\, du\, dv . \tag{2.1} \]
This equation assumes that the scanner is spatially invariant over the field of view, which
is valid for small regions. In order to model the noise that would exist on the original
image and the noise that is added during scanning, Gaussian noise n[i,j] is added to the
image
[ ] [ ] [ ]jinjisjia ,,, += . (2.2)
The noise is added to every sensor independently and has a mean of zero and a standard
deviation ofnoise. Other types of additive noise could be used as well.
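The blur, noise and threshold steps described above can be illustrated with a minimal one-dimensional sketch. It assumes a Gaussian PSF, for which the blurred profile of a straight edge is a normal CDF, and it assumes that values at or above the threshold become black. The function names and parameter values are illustrative only and are not taken from the thesis code.

```python
import math
import random

def gaussian_esf(x):
    """Edge spread function of a unit-width Gaussian PSF (a normal CDF)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def scan_edge(num_pixels, edge_pos, w, sigma_noise, threshold, rng):
    """Simulate one scan line across a white-to-black edge.

    Pixel centres sit at 0, 1, ..., num_pixels - 1. The ideal edge is at
    edge_pos; absorptance is 0 to its left and 1 to its right.
    """
    bits = []
    for j in range(num_pixels):
        s = gaussian_esf((j - edge_pos) / w)      # blur and sample (Eq. 2.1 in 1-D)
        a = s + rng.gauss(0.0, sigma_noise)       # additive Gaussian noise (Eq. 2.2)
        bits.append(1 if a >= threshold else 0)   # global threshold (assumed >= is black)
    return bits

rng = random.Random(0)
clean = scan_edge(20, 9.5, w=1.0, sigma_noise=0.0, threshold=0.5, rng=rng)
noisy = scan_edge(20, 9.5, w=1.0, sigma_noise=0.1, threshold=0.5, rng=rng)
```

With no noise the bitmap is a clean step. With noise, only pixels close to the edge are likely to flip, which is the behaviour that noise spread is later used to quantify.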
To produce a bilevel image the intensity is quantized using a thresholding operation
[ ][ ]
[ ]
(2.23)
and
c 2 . (2.24)
The lower bound comes from the fact that scanned cannot be negative, and the upper
bound comes from Equation 2.20, Equation 2.13 and the fact that the ESF is always
positive. These upper and lower bounds can be used with the bisection method of root
finding to find .
Values ofw and which result in certain /2 values can be represented by iso curves
which are equipotential curves in w and on which /2 has the same values. Figure 2.4
shows these iso curves for a Cauchy PSF on the same plot as the c curves. As can be
seen the values of/2 are significantly smaller than c even when is 15. Appendix A
shows the iso curves that result from both Gaussian and Cauchy PSFs and for several
different values of.. Code for calculating is included in Appendix B.
For a Gaussian PSF the difference between /2 and c becomes insignificant as gets
larger. Ifc is used to estimate the size of the scanned stroke after scanning, then it is
necessary to determine whether the edges are close enough to have interference. As a rule
of thumb, a Gaussian ESF is approximately either 1 or 0 at 3w from an edge. This means
that if the size of the stroke after scanning predicted by c is greater than 3w, no
interference occurs. This rule of thumb can be summarized by the following inequality
which if satisfied means that no interference occurs
w c 3 . (2.25)
As with straight edges it is very important to determine the magnitude of the gradient
of s(x) at the location of the thresholded edges. The derivative of s(x) is given by
( )w
w
x
w
x-
xs
+
=
2
2LSF
2
2LSF
. (2.26)
Since the slope on the rising edge is positive
w
w
w
s
scannedscanned
scanned
++
+
=
2LSF
2LSF
2. (2.27)
Figure 2.4 The iso curves for a Cauchy PSF are significantly different from the δ_c curves even when the stroke width is 15.
The gradient can also be expressed as a function of
w
ww
s scanned
+
+=
2LSF
2
2LSF
2
. (2.28)
2.1.4. Scanned Circles
Circles are among the simplest geometric shapes. As a consequence, when
experiments are done on the effects of noise on bilevel images, it is useful to apply them
to circles. A circle spread function (CSF) can be defined to describe the intensity of pixels
as a function of the distance from the center of the circle. When the circle is scanned, its
size changes. The scanned circle radius R_f can be found from the original circle radius R_i
given the scanner parameters. Likewise, when circles are generated for experiments,
R_i sometimes needs to be obtained from R_f. As with edges and scanned strokes the
gradient of the scanned circle can be determined for both Gaussian and Cauchy PSFs.
Figure 2.5 shows the cross section of a circle scanned with a Cauchy PSF.
The intensity of a pixel as a function of the distance from the center of the circle can
be obtained using
\[ \mathrm{CSF}(r) = \int_{-R_i}^{R_i} \int_{-\sqrt{R_i^2 - x^2}}^{\sqrt{R_i^2 - x^2}} \mathrm{PSF}(x - r,\, y;\, w)\, dy\, dx . \tag{2.29} \]
For a Gaussian PSF the equation becomes
\[ \mathrm{CSF}_{\mathrm{Gaussian}}(r) = \frac{1}{\sqrt{2\pi}\, w} \int_{-R_i}^{R_i} \exp\!\left( -\frac{(x - r)^2}{2 w^2} \right) \operatorname{erf}\!\left( \frac{\sqrt{R_i^2 - x^2}}{\sqrt{2}\, w} \right) dx . \tag{2.30} \]
For a Cauchy PSF the equation becomes
\[ \mathrm{CSF}_{\mathrm{Cauchy}}(r) = \frac{w}{\pi} \int_{-R_i}^{R_i} \frac{\sqrt{R_i^2 - x^2}}{\left( (x - r)^2 + w^2 \right) \sqrt{r^2 - 2 r x + R_i^2 + w^2}}\, dx . \tag{2.31} \]
The integrals have to be evaluated numerically. Special care must be taken when
numerically solving for the Gaussian CSF. The value of the integrand is near zero for a
large part of the domain over which it is integrated. This causes large errors when certain
numerical algorithms are used. If
\[ r - 5w > -R_i , \tag{2.32} \]
then the following integral should be used to calculate the Gaussian CSF
\[ \mathrm{CSF}_{\mathrm{Gaussian}}(r) = \frac{1}{\sqrt{2\pi}\, w} \int_{r - 5w}^{R_i} \exp\!\left( -\frac{(x - r)^2}{2 w^2} \right) \operatorname{erf}\!\left( \frac{\sqrt{R_i^2 - x^2}}{\sqrt{2}\, w} \right) dx . \tag{2.33} \]
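As a rough illustration of the numerical evaluation, the sketch below integrates the Gaussian CSF with composite Simpson's rule. It is not the thesis code. The integration rule, the number of subintervals and the clipping of the upper limit at r + 5w (in addition to the lower limit of Equation 2.33) are assumptions made for this example.

```python
import math

def csf_gaussian(r, R_i, w, n=2000):
    """Gaussian circle spread function (Eq. 2.30) by composite Simpson's rule.

    n is the number of subintervals and must be even.
    """
    # Keep only the part of [-R_i, R_i] where the exponential is non-negligible.
    lo = max(-R_i, r - 5.0 * w)
    hi = min(R_i, r + 5.0 * w)   # assumption: clip the upper limit as well
    if hi <= lo:
        return 0.0

    def integrand(x):
        half_chord = math.sqrt(max(R_i * R_i - x * x, 0.0))
        return (math.exp(-(x - r) ** 2 / (2.0 * w * w))
                * math.erf(half_chord / (math.sqrt(2.0) * w)))

    step = (hi - lo) / n
    total = integrand(lo) + integrand(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(lo + k * step)
    return total * step / 3.0 / (math.sqrt(2.0 * math.pi) * w)
```

A convenient sanity check is the centre value, which for a Gaussian PSF has the closed form 1 - exp(-R_i^2 / (2 w^2)).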
The value of R_f depends on R_i, Θ, and w. As with stroke thickness, R_f is defined
implicitly by
\[ \mathrm{CSF}(R_f;\, w, R_i) = \Theta . \tag{2.34} \]
Figure 2.5 When a circle is scanned its radius changes. As with scanned strokes if the threshold is too
high the circle will disappear when it is scanned.
Numerical methods are necessary to find the value of R_f. The CSF is a monotonically
decreasing function because of the restrictions that were placed on the PSF. As with
scanned strokes there is a Θ_max above which R_f will be zero. To find Θ_max we use
\[ \mathrm{CSF}(0) = \Theta_{max} . \tag{2.35} \]
The easiest way to find CSF(0) is to use polar coordinates
\[ \mathrm{CSF}(0) = 2\pi \int_{0}^{R_i} \mathrm{PSF}(r;\, w)\, r\, dr . \tag{2.36} \]
For a Gaussian PSF this simplifies to
\[ \mathrm{CSF}_{\mathrm{Gaussian}}(0) = 1 - \exp\!\left( -\frac{R_i^2}{2 w^2} \right) . \tag{2.37} \]
For a Cauchy PSF it is
\[ \mathrm{CSF}_{\mathrm{Cauchy}}(0) = 1 - \frac{w}{\sqrt{w^2 + R_i^2}} . \tag{2.38} \]
Once it is confirmed that Θ is not greater than Θ_max the value of R_f can be found. There is
a lower bound on R_f since it cannot be negative. To find an upper bound on R_f, a value
must be found which causes the CSF to be less than Θ. The upper bound is first chosen to
be two times R_i. If this does not result in a CSF value less than Θ, then 2R_i becomes the
new lower bound, and the upper bound is chosen to be four times R_i. The assumption for
the upper bound is doubled until it results in a CSF value less than Θ. Once the upper
bound is determined, a bisection method can be used to find R_f. It is also possible to
determine R_i if R_f, w and Θ are known. R_i is greater than zero and R_f increases
monotonically as R_i increases. The algorithm for finding R_i is essentially the same as the
method for finding R_f. The algorithm for finding R_i is included in Appendix C.
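The bracketing and bisection procedure for R_f can be sketched as follows for a Gaussian PSF. This is an illustrative reimplementation rather than the code from the appendices. The Simpson integration, the tolerance and the function names are assumptions.

```python
import math

def csf_gaussian(r, R_i, w, n=1000):
    """Gaussian CSF (Eq. 2.30) by composite Simpson's rule on a clipped domain."""
    lo, hi = max(-R_i, r - 5.0 * w), min(R_i, r + 5.0 * w)
    if hi <= lo:
        return 0.0

    def f(x):
        h = math.sqrt(max(R_i * R_i - x * x, 0.0))
        return math.exp(-(x - r) ** 2 / (2.0 * w * w)) * math.erf(h / (math.sqrt(2.0) * w))

    step = (hi - lo) / n
    total = f(lo) + f(hi) + sum((4 if k % 2 else 2) * f(lo + k * step) for k in range(1, n))
    return total * step / 3.0 / (math.sqrt(2.0 * math.pi) * w)

def scanned_radius(R_i, w, theta, tol=1e-6):
    """Find R_f such that CSF(R_f) = theta (Eq. 2.34)."""
    theta_max = 1.0 - math.exp(-R_i ** 2 / (2.0 * w ** 2))   # Eq. 2.37
    if theta >= theta_max:
        return 0.0                        # the circle disappears
    lo, hi = 0.0, 2.0 * R_i
    while csf_gaussian(hi, R_i, w) >= theta:
        lo, hi = hi, 2.0 * hi             # double the upper bound
    while hi - lo > tol:                  # bisection on a decreasing function
        mid = 0.5 * (lo + hi)
        if csf_gaussian(mid, R_i, w) > theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `scanned_radius(10.0, 1.0, 0.5)` returns a radius slightly below 10, since a convex shape shrinks a little at a threshold of 0.5.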
The magnitude of the gradient of scanned circles is important for determining the
effects of noise. If the gradient is calculated by evaluating the CSF at two points, the
calculation is prone to error. The numerical integration has some noise which is
magnified by this technique. Instead it is better to take the derivative analytically. The
gradient is given by
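\[ \frac{\partial}{\partial r} \mathrm{CSF}(r) = \int_{-R_i}^{R_i} \int_{-\sqrt{R_i^2 - x^2}}^{\sqrt{R_i^2 - x^2}} \frac{\partial}{\partial r} \mathrm{PSF}(x - r,\, y)\, dy\, dx . \tag{2.39} \]
For a Cauchy PSF this simplifies to
\[ \frac{\partial}{\partial r} \mathrm{CSF}_{\mathrm{Cauchy}}(r) = \frac{w}{\pi} \int_{-R_i}^{R_i} \frac{(x - r) \sqrt{R_i^2 - x^2}\, \left( 3 (x - r)^2 + 2 R_i^2 - 2 x^2 + 3 w^2 \right)}{\left( (x - r)^2 + w^2 \right)^2 \left( r^2 - 2 r x + R_i^2 + w^2 \right)^{3/2}}\, dx . \tag{2.40} \]
For a Gaussian PSF the gradient is given by
\[ \frac{\partial}{\partial r} \mathrm{CSF}_{\mathrm{Gaussian}}(r) = \frac{1}{\sqrt{2\pi}\, w^3} \int_{-R_i}^{R_i} (x - r) \exp\!\left( -\frac{(x - r)^2}{2 w^2} \right) \operatorname{erf}\!\left( \frac{\sqrt{R_i^2 - x^2}}{\sqrt{2}\, w} \right) dx . \tag{2.41} \]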
In both cases the gradient is found by numerical integration. For a Gaussian PSF the same
problem exists with the integrand being near zero over a large part of the domain over
which it is integrated. The solution for finding the CSF can be applied in exactly the same
way to obtain the gradient.
2.1.5. Scanned Characters
The shape of scanned characters is too complicated to use many of the analytical
methods in the previous section. Instead the value of the scanned image is determined by
using the discrete convolution of a sampled character image and a sampled PSF. There is
some error associated with this method, but it is reduced by using sampled images and
PSFs with resolutions larger than the resolutions of the final images. The scale factor is
defined as the simulated resolution divided by the resolution of the final image. A PSF is
generated which is also sampled at the same scale factor as the character image. Because
each pixel in the sampled PSF represents an area smaller than the pixels in the final
image the PSF kernel is
\[ \mathrm{PSFKernel}[i,j] = \frac{\mathrm{PSF}(x_j, y_i)}{(\text{scale factor})^2} . \tag{2.42} \]
Since the PSF kernel must be finite in size, the PSF is effectively truncated.
There are several advantages of using a Gaussian PSF over a Cauchy PSF in terms of
accurately simulating the continuous convolution. A Gaussian PSF can be safely
truncated at four times w with very little error. However, for a Cauchy PSF this is
a problem because it is a heavy tailed distribution. To achieve the same accuracy the
Cauchy PSF would have to be truncated at about 3000 times w. Another advantage of
the Gaussian PSF is that it is separable. This means that the convolution can be calculated
by taking the one dimensional PSF, convolving it with each row, and then convolving it
with each column. While a Gaussian PSF provides several advantages over the Cauchy
PSF, the experiments in this thesis involve isolated characters. The white background
makes it possible for this situation to be simulated even for a Cauchy PSF as long as the
convolution kernel is a little more than twice the size of the original character. This is
because the truncated part of the Cauchy PSF would always be over white background.
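The truncation and separability of the Gaussian kernel can be sketched as below. This is a minimal illustration under stated assumptions: images are lists of rows at the simulated resolution, w is measured in final-image pixels, the area outside the image is treated as white (zero absorptance), and all function names are hypothetical.

```python
import math

def gaussian_kernel_1d(w, scale_factor, trunc=4.0):
    """1-D Gaussian PSF sampled scale_factor times per final pixel.

    The kernel is truncated at trunc * w on each side and divided by
    scale_factor, the 1-D counterpart of dividing by scale_factor**2 in Eq. 2.42.
    """
    half = int(math.ceil(trunc * w * scale_factor))
    norm = math.sqrt(2.0 * math.pi) * w * scale_factor
    return [math.exp(-(k / scale_factor) ** 2 / (2.0 * w * w)) / norm
            for k in range(-half, half + 1)]

def convolve_1d(signal, kernel):
    """Same-size 1-D convolution with zero padding."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, kv in enumerate(kernel):
            j = i + half - k
            if 0 <= j < n:
                acc += kv * signal[j]
        out.append(acc)
    return out

def convolve_separable(image, kernel):
    """Convolve every row, then every column, with the same 1-D kernel."""
    rows = [convolve_1d(row, kernel) for row in image]
    cols = [convolve_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

The sum of the truncated kernel stays very close to one, which is the sense in which truncating a Gaussian at four times w introduces little error.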
After the high resolution images are convolved with the high resolution truncated
PSFs, the images are then down sampled. The location of the final sampling grid does not
necessarily coincide with the high resolution sampling grid. In order to have random
continuous phase shifts and non-integer factor values it is necessary to interpolate the
values of pixels. To do this bilinear interpolation can be used because of its simplicity
and because the errors associated with it are not significant.
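A down-sampling step with a continuous phase shift and a non-integer factor might look like the following sketch. The coordinate convention (x is the column, y is the row, both in high resolution pixels) and the function names are assumptions made for illustration.

```python
import math

def bilinear(image, x, y):
    """Bilinearly interpolate image (a list of rows) at column x and row y."""
    x0 = min(max(int(math.floor(x)), 0), len(image[0]) - 2)
    y0 = min(max(int(math.floor(y)), 0), len(image) - 2)
    fx, fy = x - x0, y - y0
    top = (1.0 - fx) * image[y0][x0] + fx * image[y0][x0 + 1]
    bottom = (1.0 - fx) * image[y0 + 1][x0] + fx * image[y0 + 1][x0 + 1]
    return (1.0 - fy) * top + fy * bottom

def downsample(image, factor, phase_x, phase_y, out_w, out_h):
    """Sample the high resolution image on a coarser grid.

    Output pixel (row, col) is read at (phase_y + row * factor,
    phase_x + col * factor) in high resolution coordinates.
    """
    return [[bilinear(image, phase_x + col * factor, phase_y + row * factor)
             for col in range(out_w)]
            for row in range(out_h)]
```

Because bilinear interpolation reproduces any function that is linear in x and y exactly, a ramp image makes a simple check of the indexing.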
It is also necessary to determine the gradients of the scanned characters. While the
gradients could be measured from the high resolution grey level image, this is not the
method that was used. The derivative is a linear operation which means that the
derivative of the PSF can be taken and then the gradient of the image can be determined
by convolving the original character image with the resulting kernel. The derivative with
respect to x of the Cauchy PSF is
\[ \frac{\partial}{\partial x} \mathrm{PSF}_{\mathrm{Cauchy}}(x, y) = -\frac{3 w x}{2\pi \left( x^2 + y^2 + w^2 \right)^{5/2}} . \tag{2.43} \]
For the Gaussian PSF the derivative is
\[ \frac{\partial}{\partial x} \mathrm{PSF}_{\mathrm{Gaussian}}(x, y) = -\frac{x}{2\pi w^4} \exp\!\left( -\frac{x^2 + y^2}{2 w^2} \right) . \tag{2.44} \]
When these functions are used to create convolution kernels, the functions have to be
divided by the scale factor squared. The kernels for the derivatives with respect to y can
be obtained by transposition. The Gaussian kernel is separable which can be used to
speed up computations.
2.2. Edge Finding Techniques
This thesis focuses on the effect of additive Gaussian noise on scanned images. One
component is to explore the ability to accurately locate edges in scanned document
images. Finding lines in an image is critically important in the fields of image processing
and computer vision, and there is a substantial amount of work that has been done on the
topic. A significant amount of attention has gone to developing operators, which bring
out the edges in an image. This is usually followed by techniques that use the Hough
transform to find the location of the line [6]. There has also been study of accurately
locating edges and lines in bilevel rasterized images [8].
One approach to edge detection is to convolve the image with an edge detector and
then to threshold the image and locate the edge using a Hough transform [6]. The Hough
transform works by mapping points to the set of lines that pass through those points.
Edges can be represented by two parameters such as angle and distance from the origin.
These two parameters form a parameter space which can be divided into discrete bins.
The Hough transform is performed by looping through every edge point in the image and
then incrementing the value in every bin that contains parameters to an edge that runs
through the point. After this is done for every edge point the true edge can be determined
by finding the bin with the largest value.
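The voting procedure can be sketched in a few lines. The sketch assumes the normal form x cos(θ) + y sin(θ) = ρ for a line, with θ the angle of the normal, and it uses hypothetical bin sizes. It is meant only to illustrate the accumulator logic.

```python
import math

def hough_line(points, num_angles=180, rho_step=1.0):
    """Return (theta, rho) for the accumulator bin with the most votes.

    points is a list of (x, y) edge points. theta is the angle of the line
    normal in radians and rho is the signed distance from the origin.
    """
    max_rho = max(math.hypot(x, y) for x, y in points)
    offset = int(math.ceil(max_rho / rho_step))
    num_rho = 2 * offset + 1
    acc = [[0] * num_rho for _ in range(num_angles)]
    for x, y in points:
        for t in range(num_angles):
            theta = math.pi * t / num_angles
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[t][int(round(rho / rho_step)) + offset] += 1   # one vote per bin
    best_t, best_r = max(
        ((t, r) for t in range(num_angles) for r in range(num_rho)),
        key=lambda tr: acc[tr[0]][tr[1]],
    )
    return math.pi * best_t / num_angles, (best_r - offset) * rho_step
```

The recovered parameters are only as precise as the bin sizes. Several neighbouring bins can tie, which is one reason the least squares approach discussed later is attractive for precise edge location.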
There are a variety of operators that can be used for edge detection. One such
operator is the Sobel operator. The Sobel operator is a combination of two operators
which estimate the two components of the image gradient G_x and G_y. If A is the original
image, G_x is given by
\[ G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} * A . \tag{2.45} \]
G_y is calculated using an operator that is simply the transpose of the one used to calculate
G_x. The magnitude of the gradient can then be estimated as
\[ G = \sqrt{G_x^2 + G_y^2} . \tag{2.46} \]
Once the gradient image is determined, it is thresholded to find the edge points, and then
the Hough transform is used to find the edge. The maximums of the Hough transform
correspond to the parameters of the edge. The Prewitt and Roberts operators work in a
way that is similar to that of the Sobel operator. Table 2.1 shows the Sobel, Prewitt, and
Roberts operators.
Table 2.1: Simple operators used for edge detection.
Sobel Operators            Prewitt Operators          Roberts Operators
-1  0  1    -1 -2 -1       -1  0  1    -1 -1 -1        0 -1    -1  0
-2  0  2     0  0  0       -1  0  1     0  0  0        1  0     0  1
-1  0  1     1  2  1       -1  0  1     1  1  1
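The use of the Sobel operators from Table 2.1 can be sketched as follows. The operators are applied here as correlations over the interior pixels only. This differs from a true convolution by a sign, which does not affect the gradient magnitude. The border handling and function names are assumptions.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # transpose of SOBEL_X

def apply_operator(image, op):
    """Correlate a 3x3 operator with the interior of image (a list of rows)."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            out[r][c] = sum(op[i][j] * image[r + i - 1][c + j - 1]
                            for i in range(3) for j in range(3))
    return out

def gradient_magnitude(image):
    """Estimate the gradient magnitude as sqrt(Gx^2 + Gy^2) (Eq. 2.46)."""
    gx = apply_operator(image, SOBEL_X)
    gy = apply_operator(image, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]
```

For a bilevel vertical edge the response is nonzero only in the two columns that straddle the edge.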
In addition to the simple Sobel, Prewitt, and Roberts operators more complicated
operators can be used. One such operator is the Laplacian of Gaussian (LoG) operator.
This operator is given by
\[ h(r) = -\frac{r^2 - \sigma^2}{\sigma^4} \exp\!\left( -\frac{r^2}{2 \sigma^2} \right) . \tag{2.47} \]
This operator is the second derivative of a Gaussian function with a width parameter of σ.
The operator is circularly symmetrical. Numerically it is represented by at least a five by
five kernel. One approximation of the LoG kernel is given by
\[ \mathrm{LoG} = \begin{bmatrix} 0 & 0 & -1 & 0 & 0 \\ 0 & -1 & -2 & -1 & 0 \\ -1 & -2 & 16 & -2 & -1 \\ 0 & -1 & -2 & -1 & 0 \\ 0 & 0 & -1 & 0 & 0 \end{bmatrix} . \tag{2.48} \]
Some of the most important work on finding edges in grey level images was done by
Canny [7]. The theoretical basis for the edge detection mask developed by Canny
depends on being able to separate the image into noise and signal components. However,
when an image is subjected to a nonlinearity such as thresholding, the noise and signal
components cannot be separated in this way. The Canny operator begins by smoothing
the image with a Gaussian. Then the gradients of the image are determined. Two
thresholds are used to determine which pixels are edge pixels. The first threshold is set
higher than the other, and any pixel whose gradient exceeds this threshold is labeled
as an edge pixel. Then pixels that are adjacent to an edge pixel are also labeled edge
pixels if their gradient exceeds the second threshold. The Canny operator was
implemented in this thesis using Matlab's built-in edge detection function.
Because using operators such as the Canny operator has no strong theoretical basis in
bilevel images, we can use a more basic approach. This approach involves selecting data
points between each pair of adjacent black and white pixels. Then a line can be fitted to
these points based on the least squared distance. The least squares fitting can either use
the squared vertical distance of points to the edge or use the squared perpendicular
distances. Gordon and Seering [8] analyzed the accuracy of least squares at finding the
location of edges. They use an assumption that the vertical distance between points on a
digitized line and its corresponding continuous line vary independently of one another.
Using this assumption they determined the estimation error of edges. The case in which
the points do not vary independently of one another will be explored in more detail in
Section 2.4.
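The two least squares variants can be contrasted with a short sketch. The vertical fit uses the usual normal equations. The perpendicular fit takes the principal axis of the point scatter, which is one standard way to minimise perpendicular distances and is assumed here rather than taken from the thesis. Function names are illustrative.

```python
import math

def _moments(points):
    """Centroid and centred second moments of a list of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return mx, my, sxx, syy, sxy

def fit_line_vertical(points):
    """Minimise squared vertical distances; returns (slope, intercept)."""
    mx, my, sxx, _, sxy = _moments(points)
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_line_perpendicular(points):
    """Minimise squared perpendicular distances.

    Returns (angle, centroid): the line passes through the centroid and
    makes the given angle (radians) with the x axis.
    """
    mx, my, sxx, syy, sxy = _moments(points)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)   # principal axis direction
    return angle, (mx, my)
```

On noise free collinear points both fits agree. They differ once the points scatter about the line, and the difference grows with the slope.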
The least squares approach and the operator based approaches are explored
extensively in Section 5.1.1. In that section experiments are conducted to determine
which of the methods work best for bilevel straight edges. The effectiveness of
perpendicular vs. vertical least squares will also be analyzed.
2.3. Circle Fitting Techniques
In order to understand the effects of noise on 2-D objects, it is necessary to explore
the effect of noise on the ability to precisely determine the position and radius of scanned
circles. To do this, data points were selected between adjacent pairs of black and white
pixels, then a circle was fit to these data points. Several classical methods of doing this
fitting are discussed in [9]. This section includes a discussion of these methods.
The simplest method for fitting a circle to data points is called Algebraic circle fitting.
The equation of a circle can be given implicitly by
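\[ F(\mathbf{x}) = a\, \mathbf{x}^T \mathbf{x} + \mathbf{b}^T \mathbf{x} + c = 0 , \tag{2.49} \]
where the coefficients a, b and c are such that a is not zero and b is a two element
column vector. If the values of each data point are plugged into this equation, the result is
\[ \mathbf{B} \mathbf{u} = \boldsymbol{\varepsilon} , \tag{2.50} \]
where ε is the error vector which is to be minimized, B is a matrix
\[ \mathbf{B} = \begin{bmatrix} x_{11}^2 + x_{12}^2 & x_{11} & x_{12} & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x_{m1}^2 + x_{m2}^2 & x_{m1} & x_{m2} & 1 \end{bmatrix} \tag{2.51} \]
and u=[a,b1,b2,c]^T. Since both sides of Equation 2.49 can be multiplied by a constant, a
constraint can be applied to u that it must be a unit vector. The squared Euclidean norm
of ε can be minimized using Lagrange multipliers. The constraint that u is a unit vector is
applied to create the following equation
\[ \frac{\partial \|\boldsymbol{\varepsilon}\|^2}{\partial \mathbf{u}} = \lambda \frac{\partial \|\mathbf{u}\|^2}{\partial \mathbf{u}} . \tag{2.52} \]
The left side of the equation becomes
\[ \frac{\partial \|\boldsymbol{\varepsilon}\|^2}{\partial \mathbf{u}} = \frac{\partial \left( (\mathbf{B}\mathbf{u})^T (\mathbf{B}\mathbf{u}) \right)}{\partial \mathbf{u}} = \frac{\partial \left( \mathbf{u}^T \mathbf{B}^T \mathbf{B} \mathbf{u} \right)}{\partial \mathbf{u}} = 2 \mathbf{B}^T \mathbf{B} \mathbf{u} . \tag{2.53} \]
The right side also simplifies giving
\[ 2 \mathbf{B}^T \mathbf{B} \mathbf{u} = 2 \lambda \mathbf{u} . \tag{2.54} \]
The Lagrange multipliers are also the eigenvalues of B^T B. Substitution gives
\[ \|\boldsymbol{\varepsilon}\|^2 = \mathbf{u}^T \mathbf{B}^T \mathbf{B} \mathbf{u} = \lambda \mathbf{u}^T \mathbf{u} = \lambda . \tag{2.55} \]
This means that the squared Euclidean norm of ε is minimized by using the value of u
associated with the smallest eigenvalue of B^T B. Equivalently u is the right singular vector
associated with the smallest singular value of B. The center can be obtained from u using
\[ \mathbf{z} = \left( -\frac{b_1}{2a},\; -\frac{b_2}{2a} \right) . \tag{2.56} \]
The radius is obtained by using
\[ r = \sqrt{ \frac{\|\mathbf{b}\|^2}{4 a^2} - \frac{c}{a} } . \tag{2.57} \]
The problem with Algebraic circle fitting is that minimizing the Euclidean norm of ε does
not necessarily result in the best fitting circle. It is especially poor when fitting a circle to
an arc of data points.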
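A dependency-free sketch of the Algebraic fit is given below. Instead of calling an eigenvalue or SVD routine, it finds the eigenvector of B^T B with the smallest eigenvalue by inverse iteration with a small diagonal shift. The shift size, the iteration count and the helper names are choices made for this illustration and are not the method used in the thesis.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def algebraic_circle_fit(points, iterations=50):
    """Fit a circle by minimising ||B u|| subject to ||u|| = 1."""
    B = [[x * x + y * y, x, y, 1.0] for x, y in points]             # Eq. 2.51
    BtB = [[sum(row[i] * row[j] for row in B) for j in range(4)] for i in range(4)]
    shift = 1e-9 * sum(BtB[i][i] for i in range(4))   # keeps the matrix invertible
    for i in range(4):
        BtB[i][i] += shift
    u = [1.0, 1.0, 1.0, 1.0]
    for _ in range(iterations):                       # inverse iteration
        u = solve(BtB, u)
        norm = math.sqrt(sum(v * v for v in u))
        u = [v / norm for v in u]
    a, b1, b2, c = u
    center = (-b1 / (2.0 * a), -b2 / (2.0 * a))                     # Eq. 2.56
    radius = math.sqrt((b1 * b1 + b2 * b2) / (4.0 * a * a) - c / a) # Eq. 2.57
    return center, radius
```

Adding a multiple of the identity does not change the eigenvectors, so the shift only guards against a singular matrix when the points lie exactly on a circle.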
An alternative to the Algebraic method is Geometric circle fitting. Geometric circle
fitting is a nonlinear least squares procedure which minimizes the sum of the squared
distances of points to the nearest point on the circle. If the center point of the circle is z
and the radius is r, then the distance of a pixel to the circle is
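\[ d_i^2 = \left( \|\mathbf{z} - \mathbf{x}_i\| - r \right)^2 . \tag{2.58} \]
If u=[z1,z2,r]^T defines the circle, then u needs to be selected to minimize
\[ \sum_{i=1}^{m} d_i(\mathbf{u})^2 . \tag{2.59} \]
A method called Gauss-Newton is used to minimize this expression. The method starts
out with a decent guess of the best value of u. If d(u) is a column vector of the functions
d_i(u), then the idea is to find the change h in u which will minimize d(u) in the least
squares sense. To do this d(u+h) is approximated using a Taylor series expansion
\[ \mathbf{d}(\mathbf{u} + \mathbf{h}) \approx \mathbf{d}(\mathbf{u}) + \mathbf{J}(\mathbf{u})\, \mathbf{h} , \tag{2.60} \]
where J(u) is the Jacobian matrix. In this case the Jacobian is given by
\[ \mathbf{J}(\mathbf{u}) = \begin{bmatrix} \dfrac{u_1 - x_{11}}{\|\mathbf{z} - \mathbf{x}_1\|} & \dfrac{u_2 - x_{12}}{\|\mathbf{z} - \mathbf{x}_1\|} & -1 \\ \vdots & \vdots & \vdots \\ \dfrac{u_1 - x_{m1}}{\|\mathbf{z} - \mathbf{x}_m\|} & \dfrac{u_2 - x_{m2}}{\|\mathbf{z} - \mathbf{x}_m\|} & -1 \end{bmatrix} . \tag{2.61} \]
The change in u that is required is found by solving the linear least squares problem
\[ \mathbf{d}(\mathbf{u}) + \mathbf{J}(\mathbf{u})\, \mathbf{h} \approx \mathbf{0} . \tag{2.62} \]
The value of h is
\[ \mathbf{h} = -\left( \mathbf{J}(\mathbf{u})^T \mathbf{J}(\mathbf{u}) \right)^{-1} \mathbf{J}(\mathbf{u})^T \mathbf{d}(\mathbf{u}) . \tag{2.63} \]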
With every iteration of the algorithm, h is used to update u and a closer approximate
solution of the nonlinear least squares problem is found. This method produces a much
better fit. As was stated earlier, this method requires an initial guess of the value of u.
One way to obtain this is to use the Algebraic circle fitting. Another way is to find the
mean of all the data points and make this the center of the circle. The radius can be
estimated by taking the distance of each point to this center and taking the mean of those
distances. Figure 2.6 compares the results of Algebraic and Geometric least squares for a
certain set of points. The points were chosen experimentally to show the weakness of the
Algebraic technique. The Geometric technique always produces better results because the
error that it attempts to minimize is more sensible.
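The Gauss-Newton iteration for the Geometric fit can be sketched as follows, starting from the centroid and mean distance guess described above. The fixed iteration count and the helper names are assumptions. A practical implementation would normally add a convergence test and a guard against a data point coinciding with the current center.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def geometric_circle_fit(points, iterations=50):
    """Minimise the sum of squared point-to-circle distances by Gauss-Newton."""
    n = len(points)
    z1 = sum(x for x, _ in points) / n                 # initial center: centroid
    z2 = sum(y for _, y in points) / n
    r = sum(math.hypot(z1 - x, z2 - y) for x, y in points) / n
    for _ in range(iterations):
        JtJ = [[0.0] * 3 for _ in range(3)]
        Jtd = [0.0] * 3
        for x, y in points:
            dist = math.hypot(z1 - x, z2 - y)
            row = [(z1 - x) / dist, (z2 - y) / dist, -1.0]   # one row of Eq. 2.61
            d = dist - r                                      # signed distance
            for i in range(3):
                Jtd[i] += row[i] * d
                for j in range(3):
                    JtJ[i][j] += row[i] * row[j]
        h = solve(JtJ, [-v for v in Jtd])                     # Eq. 2.63
        z1, z2, r = z1 + h[0], z2 + h[1], r + h[2]
    return (z1, z2), r
```

When the points cover only part of the circle, the centroid is a biased starting point, and the iteration moves the center toward the true value.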
Figure 2.6 Algebraic Fitting can sometimes result in a poor fit. The center of the Algebraic fit is
(10.24,20.98) and the radius is 4.83. The center of the Geometric fit is (10.10,7.92) and the
radius is 11.77.
2.4. Effect of Sampling
Sampling of continuous bilevel images can produce several undesirable effects. Since
the position of the sampling grid relative to the image is random, there are variations that
occur in the resulting bitmaps. Even without noise there is an unavoidable Hamming
Distance between different scans of an image, even from the same scanner. The
geometric precision of edge and circle measurements is limited by the sampling
resolution. In addition the sampling grid for images is anisotropic. This means that edges
at certain orientations are measured with less precision than edges at other orientations.
One of the goals of this thesis is to explore how the random effects of noise and random
phase shifts affect document images.
A review of the literature shows several tools for analyzing the effect of phase on
scanned images. Dorst and Smeulders [10] gave an expression for determining the set of
continuous line segments which could generate a certain chaincode string. This
expression could also be used to find the worst case positional accuracy of an edge
segment. Dorst and Duin [11] introduced the concept of spirographs and used it to
calculate the average and worst case positional accuracy of edges. Havelock [12] used
modulo grids to analyze the positional accuracy of various shapes. Sarkar [1] expanded
upon Havelock's work by using modulo grids to calculate the number and frequency of
bitmaps that an object would produce.
Spirographs can be used to describe the way in which a continuous edge is sampled.
Any edge can be flipped on the reflection lines x=y, y=0 and x=0. Because of this the
effect of sampling any straight edge can be determined by studying those straight edges
with a slope in the range (0, 1). A spirograph consists of a circle with N points on it
which divide it into N arcs as seen in Figure 2.7.
Each consecutive point is placed the same constant clockwise distance around the
circle from the previous point. The sampling grid for an edge can be represented by
making the distance between each consecutive point equal to the slope of the edge. The
random location of the sampling grid with respect to the edge can be represented by
randomly placing the edge as a point on the spirograph; N is then the number of columns
in the edge image. If the edge is shifted up vertically, it is moved clockwise around the
circle. If it crosses a sample point on the spirograph, the bitmap of the edge will change,
and the number of segments formed around the spirograph is the number of bitmaps a
certain edge can have.
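The construction can be sketched with exact rational arithmetic. The sketch assumes the circle has unit circumference, so that one full turn corresponds to a vertical shift of one pixel. That normalisation and the function name are assumptions made for illustration.

```python
from fractions import Fraction

def spirograph_arcs(slope, num_columns):
    """Arc lengths of a spirograph with unit circumference.

    Point k is placed at (k * slope) mod 1 for k = 0 .. num_columns - 1.
    Coincident points are merged, so the number of arcs returned is the
    number of distinct bitmaps the edge can produce.
    """
    slope = Fraction(slope)
    marks = sorted({(k * slope) % 1 for k in range(num_columns)})
    arcs = [b - a for a, b in zip(marks, marks[1:])]
    arcs.append(1 - marks[-1] + marks[0])   # arc that wraps past zero
    return arcs

arcs = spirograph_arcs(Fraction(3, 7), 20)
```

For a slope of 3/7 and 20 columns the points fall on only seven distinct positions, so the edge can produce seven bitmaps. Each arc length is the probability of the corresponding bitmap under a uniformly random phase.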
One special case is when the slope of an edge can be represented by the irreducible
fraction p/q and when q
the sampling grid. The position of the edge relative to the sampling grid is a random
number. So the variance of the perpendicular distance between the measured and actual
positions of a noise free edge is
\[ \mathrm{Var} = \frac{1}{12 \left( p^2 + q^2 \right)} . \tag{2.65} \]
In order for the length of an edge segment with slope m to be L the number of
columns N must be determined by
\[ N = \mathrm{round}\!\left( \frac{L}{\sqrt{m^2 + 1}} \right) . \tag{2.66} \]
If a spirograph is defined with the first parameter being the distance between successive
points and the second being N, then the spirograph for this edge is
\[ \mathrm{SPIRO}\!\left( m,\; \mathrm{round}\!\left( \frac{L}{\sqrt{m^2 + 1}} \right) \right) . \tag{2.67} \]
The precision of edge measurements can be determined by the combination of two
parameters. The distance parameter is the perpendicular distance of the edge from the
midpoint of the continuous edge segment where the distance is positive if the measured
edge is above the continuous edge and negative otherwise. The angular error is the
difference between the angle of inclination associated with the theoretical edge and the
angle of inclination associated with the measured edge. The variance in the distance for
an edge can be shown to be
\[ \mathrm{Var}(\mathit{Distance}) = \sum_i \left( \frac{d_i^3}{12 \left( m^2 + 1 \right)} + d_i\, pe_i^2 \right) , \tag{2.68} \]
where d_i is the length of the ith arc on the spirograph and pe_i is the perpendicular distance
between the actual and measured edge when the phase is chosen to be the midpoint of the
ith arc. The variance of the angular error between the measured and theoretical edge can
be shown to be
\[ \mathrm{Var}(\mathit{AngularError}) = \sum_i d_i\, ae_i^2 , \tag{2.69} \]
where ae_i is the angular error of the measured edge when the phase is chosen to be on the
ith arc. Figure 2.8 shows the variance of the distance and angular error as a function of
slope. The slopes of 0, 1/2 and 1 have large distance variance, but the angular errors for
these slopes are zero. The greatest angular errors occur for edges with slopes close to but
not equal to 0, 1/2 and 1.
Figure 2.8: (a) The variance of the distance between the measured and actual edges is determined for
edges that are 20 pixels long. (b) The variance in the angular error is determined for
noiseless edges that are 20 pixels long.
In addition to the effects of sampling on the geometric measurements, sampling also
affects the Hamming distance between two scans of the same object. If the two scans had
the same phase and there were no noise, then the two scans would have a Hamming
distance of zero. However, because different phases result in different bitmap
configurations, there is some Hamming distance even when two objects are aligned to
minimize the difference. The Hamming distance that will occur depends on the shape of
the object being scanned. Modulo grids can be used to determine the expected Hamming
distance between scans with independent random phases and no noise. However, this
approach probably would not be more efficient than large experiments that generate scans
and then find the minimum Hamming distance. Certain shapes like circles are known to
have high Hamming distances because the sizes of the locales in the modulo grid are small.
For this same reason these shapes have been analyzed for their use in image registration
[13],[14].
Neither modulo grids nor spirographs can predict the effect of combining sampling
and noise on geometric measurements. The phase of simulated scans in an experiment
can be fixed in order to isolate the effects of noise. Then further experiments can explore
the combined effects of noise and random phase. These noise effects are explored in
detail in Chapters 4 and 5.
3. NOISE SPREAD THEORY
For grey level images noise is usually described by the standard deviation σ_noise of the
additive noise. However the amount of noise present in a bilevel scanned image is not
dependent purely on the level of noise added prior to thresholding. This can be seen
clearly by looking at Figure 3.1. The first three images all have the same amount of
additive noise. However, the noise spread (NS) increases from left to right. One of the
central points of this thesis is to derive this quantity and show that it is a good
representation of the amount of noise in a bilevel image. This makes it possible to
generate synthetic bilevel images with specific amounts of noise.
Figure 3.1 Edges with varying amounts of noise spread. While the standard deviation of the noise in
the first three images is the same, the noise spread is different. The picture on the far right shows
an extreme amount of noise. Panel parameters, from left to right:
w=0.64, Θ=0.5, σ_noise=0.05, NS=0.2
w=1.27, Θ=0.5, σ_noise=0.05, NS=0.4
w=1.9, Θ=0.5, σ_noise=0.05, NS=0.6
w=3.16, Θ=0.5, σ_noise=0.1, NS=2.0
3.1. Noise Spread for straight isolated edges
The basic idea behind noise spread is that when an image is thresholded the noise is
concentrated on the edges of the objects in the image. The noise spread for a given edge
is the size of the domain in which pixels are affected by additive noise. Typically this
domain, called the noise spread region, is less than a pixel thick. Its size is still relevant
because if it is larger then it is more likely that an edge pixel will be in this region. Noise
spread is dependent in part on the shape of the object being scanned. Initially noise
spread is derived for isolated edges. Isolated edges are among the simplest shapes upon
which to do experiments and can be represented in one dimension as step functions.
Section 2.1.2 discussed how straight edges are affected by scanning, but that section was
focused on the deterministic effects of scanning. Nondeterministic effects such as
additive noise must be discussed in the context of probability.
Figure 3.2: (a) Edge after blurring with a generic PSF of width w. When no noise is added, the
thresholding produces the edge shift δ_c. (b) Edge with noise. The uncertain boundary, shown
in gray, is the noise spread region. The effects of sampling are not shown.
Figure 3.2 shows how an edge is affected by scanning. Figure 3.2(a) shows what
happens when noise is disregarded. As was discussed in Section 2.1.2 the edge shifts by
δ_c. However, as shown in Figure 3.2(b), when noise is added there is a region in which
the value of pixels after thresholding is uncertain. This region is called the noise spread
region. The size of this region is called the noise spread (NS), and as illustrated in Figure
3.1, it is a good quantitative measure of how noisy a bilevel image is. To precisely define
NS it is necessary to define the probability that a pixel at a certain distance from the edge
will be above the threshold. This threshold probability (THP) depends on the cumulative
distribution function (CDF) of the noise and is
\[ \mathrm{THP}(x) = \mathrm{CDF}_{\mathrm{Gauss}}\!\left( \frac{\mathrm{ESF}(x/w) - \Theta}{\sigma_{noise}} \right) . \tag{3.1} \]
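The noisy edge will be above the threshold with probability near 0 on one side of the NS
region and with a probability of near 1 on the other side of the NS region. The THP can
then be represented with a piecewise approximation
\[ \mathrm{THP}(x) \approx \begin{cases} 0, & x < -\delta_c - \frac{NS}{2} \\ \dfrac{x + \delta_c}{NS} + \dfrac{1}{2}, & -\delta_c - \frac{NS}{2} \le x \le -\delta_c + \frac{NS}{2} \\ 1, & x > -\delta_c + \frac{NS}{2} \end{cases} , \tag{3.2} \]
where
\[ NS = \left( \left. \frac{\partial\, \mathrm{THP}}{\partial x} \right|_{x = -\delta_c} \right)^{-1} . \tag{3.3} \]
The derivative of THP is a function of the noise's probability density function (PDF)
\[ \frac{\partial\, \mathrm{THP}}{\partial x} = \frac{1}{w\, \sigma_{noise}}\, \mathrm{LSF}\!\left( \frac{x}{w} \right) \mathrm{PDF}_{\mathrm{Gauss}}\!\left( \frac{\mathrm{ESF}(x/w) - \Theta}{\sigma_{noise}} \right) . \tag{3.4} \]
Evaluating at x=-δ_c, which is where ESF(x/w)=Θ, gives
\[ \left. \frac{\partial\, \mathrm{THP}}{\partial x} \right|_{x = -\delta_c} = \frac{\mathrm{LSF}\!\left( \mathrm{ESF}^{-1}(\Theta) \right)}{\sqrt{2\pi}\, \sigma_{noise}\, w} . \tag{3.5} \]
Substituting this back into Equation 3.3 gives an estimate of the noise spread
\[ NS = \frac{\sqrt{2\pi}\, \sigma_{noise}\, w}{\mathrm{LSF}\!\left( \mathrm{ESF}^{-1}(\Theta) \right)} . \tag{3.6} \]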
Figure 3.3 The threshold probability (THP) function is shown for a Gaussian PSF with w=1, Θ=0.7
and σ_noise=0.1. Noise spread in this case is about 0.72. The piecewise approximation of
threshold probability is inaccurate at the tails of the THP function.
Figure 3.3 shows the piecewise approximation of the threshold probability curve.
While the piecewise approximation is illustrative it is also fairly crude. The original
definition of the noise spread was the size of the domain in which the values of pixels are
uncertain. This level of uncertainty can be quantified by defining the noise spread as the
breadth of the domain over which the threshold probability is in the range (ε, 1-ε). The
arbitrary cutoff ε is used to determine the boundaries of the noise spread region. The more
accurate approximation of the THP starts by linearizing the ESF at x=-δ_c
\[ \mathrm{ESF}\!\left( \frac{x}{w} \right) \approx \Theta + \mathrm{LSF}\!\left( \mathrm{ESF}^{-1}(\Theta) \right) \frac{x + \delta_c}{w} . \tag{3.7} \]
Substituting this into Equation 3.1 gives
\[ \mathrm{THP}(x) \approx \mathrm{CDF}_{\mathrm{Gauss}}\!\left( \frac{\mathrm{LSF}\!\left( \mathrm{ESF}^{-1}(\Theta) \right) (x + \delta_c)}{\sigma_{noise}\, w} \right) . \tag{3.8} \]
A parameter Z can be defined such that
\[ \mathrm{CDF}_{\mathrm{Gauss}}^{-1}(1 - \varepsilon) = Z . \tag{3.9} \]
Since the Gaussian CDF is odd symmetric, the noise spread region will be centered on the
shifted edge at x=-δ_c so
\[ 1 - \varepsilon = \mathrm{THP}\!\left( \frac{NS}{2} - \delta_c \right) . \tag{3.10} \]
This can be evaluated by using Equation 3.8 which gives
\[ 1 - \varepsilon = \mathrm{CDF}_{\mathrm{Gauss}}\!\left( \frac{\mathrm{LSF}\!\left( \mathrm{ESF}^{-1}(\Theta) \right) NS}{2\, \sigma_{noise}\, w} \right) . \tag{3.11} \]
NS is then solved for by using Equations 3.9 and 3.11, which produces
\[ NS = \frac{2 Z\, \sigma_{noise}\, w}{\mathrm{LSF}\!\left( \mathrm{ESF}^{-1}(\Theta) \right)} . \tag{3.12} \]
This definition is identical to the one in Equation 3.6 if
\[ Z = \frac{\sqrt{2\pi}}{2} = 1.253 . \tag{3.13} \]
In order to maintain consistency the cutoff defined in Equation 3.13 will be used. The
resulting value of ε is 0.105 which is a reasonable level of uncertainty. This cutoff means
that the noise spread is the breadth of the domain over which the threshold probability is
in the range (0.105, 0.895). The edge images in Figure 3.1 show that noise is very
noticeable in images with NS values as low as 0.2. In most cases NS will be less than the
extreme example on the far right in Figure 3.1. At some point the added noise is extreme
enough that even pixels away from the edge have an uncertain value. When this occurs,
the approximation in Equation 3.8 is no longer valid. Where this occurs depends on the
PSF used and the degradation parameters. Generally the approximation is better when Θ
is close to 0.5 and when noise levels are small. Figure 3.4 illustrates the approximation in
-
8/8/2019 McGillivary Thesis
55/118
40
Equation 3.8. The parameters used in Figure 3.4 are extreme; for most sets of degradation
parameters it is very hard to distinguish the results from Equations 3.1 and 3.8.
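The cutoff and the quality of the linearization can be checked numerically. The sketch below is only an illustration: it assumes a Gaussian PSF (ESF equal to the standard normal CDF) and assumes that Equation 3.1, which is not reproduced in this section, has the form THP(x) = CDF_Gauss((ESF(x/w) − Θ)/σ_noise). The helper names `thp_exact`, `thp_linear` and `crossing` are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal: its CDF plays the ESF, its pdf the LSF

Z = math.sqrt(2 * math.pi) / 2   # Equation 3.13
EPS = PHI.cdf(-Z)                # cutoff implied by Z

def thp_exact(x, w, theta, sigma_noise):
    """Assumed form of Equation 3.1 for a Gaussian PSF."""
    return PHI.cdf((PHI.cdf(x / w) - theta) / sigma_noise)

def thp_linear(x, w, theta, sigma_noise):
    """Linearized THP of Equation 3.8."""
    delta_c = -w * PHI.inv_cdf(theta)        # ESF(x/w) = theta at x = -delta_c
    slope = PHI.pdf(PHI.inv_cdf(theta)) / w  # LSF(ESF^{-1}(theta)) / w
    return PHI.cdf(slope * (x + delta_c) / sigma_noise)

def crossing(prob, w, theta, sigma_noise):
    """x where the exact THP equals prob.

    Closed form for this PSF; requires theta + sigma_noise * inv_cdf(prob)
    to lie strictly between 0 and 1.
    """
    return w * PHI.inv_cdf(theta + sigma_noise * PHI.inv_cdf(prob))

w, theta, sigma_noise = 1.0, 0.7, 0.1
ns_exact = (crossing(1 - EPS, w, theta, sigma_noise)
            - crossing(EPS, w, theta, sigma_noise))
ns_linear = 2 * Z * sigma_noise * w / PHI.pdf(PHI.inv_cdf(theta))  # Eq. 3.12
print(f"Z = {Z:.3f}, epsilon = {EPS:.3f}")
print(f"NS exact = {ns_exact:.3f}, NS from Eq. 3.12 = {ns_linear:.3f}")
```

Under these assumptions the two noise spread values differ by a few percent for the parameters of Figure 3.4, and the gap shrinks as σ_noise decreases or Θ approaches 0.5.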
3.2. Extending Noise Spread to general shapes
Noise spread was introduced for straight edges in Section 3.1, but it is possible to
extend the noise spread theory to arbitrary shapes. In Section 3.1 the noise spread region
was defined to be anywhere the THP is between $\epsilon$ and $1-\epsilon$. This applies directly to
general shapes. However, with the exceptions of isolated straight edges, scanned strokes,
and circles, the noise spread region will not have the same thickness along the boundary
of a general object. For this reason the noise spread must be defined for any point on the
contour C defined by $s(x,y)=\Theta$. To do this, NS can be defined as the thickness of the noise
spread region along the direction defined by the gradient at any point on the contour C.
Figure 3.4 The threshold probability function given in Equation 3.1 (solid) is compared to the
approximation in Equation 3.8 (dashed) with the parameters $w=1$, $\Theta=0.7$ and $\sigma_{noise}=0.1$.
The actual threshold probability is a little lower on the tails.
NS of an entire object can then be defined as the mean value of the noise spread on the
contour C. If the mean reciprocal of the magnitude of the gradient of $s(x,y)$ along C is
estimated, then this estimate can be used to find the noise spread of the object. To do this
a linearization procedure is required.
If $(x_0,y_0)$ is a point on the contour $s(x,y)=\Theta$, and the notation $s_x$ and $s_y$ is used to
denote the partial derivatives with respect to $x$ and $y$, then the approximation
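$$ s\left(x_0 + u\,\frac{s_x(x_0,y_0)}{\left|\nabla s(x_0,y_0)\right|},\; y_0 + u\,\frac{s_y(x_0,y_0)}{\left|\nabla s(x_0,y_0)\right|}\right) \approx \Theta + u\left|\nabla s(x_0,y_0)\right| \tag{3.14} $$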
can be made, where $u$ is the displacement from $(x_0,y_0)$ along the gradient direction. This
relationship can be used to find the value of $s(x,y)$ at any point near the
contour $s(x,y)=\Theta$. Following the reasoning that gave Equation 3.1, the THP can be
represented as a function of $u$ and the magnitude of the gradient
$$ \mathrm{THP}(u) = \mathrm{CDF}_{Gauss}\left(\frac{u\left|\nabla s\right|}{\sigma_{noise}}\right). \tag{3.15} $$
The noise spread at any point on the contour can be expressed as a function of the
gradient at that point
$$ NS(x_0,y_0) = \frac{2Z\,\sigma_{noise}}{\left|\nabla s(x_0,y_0)\right|}. \tag{3.16} $$
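The step from Equation 3.15 to Equation 3.16 is not written out. One way to fill it in, assuming the same cutoff as in Section 3.1 (so that $\mathrm{CDF}_{Gauss}(-Z)=\epsilon$) and that the linearized noise spread region is symmetric about $u=0$, is sketched here:

```latex
\begin{align*}
\mathrm{THP}\!\left(-\tfrac{NS}{2}\right) = \epsilon
&\;\Longrightarrow\;
\mathrm{CDF}_{Gauss}\!\left(-\frac{NS\,\left|\nabla s(x_0,y_0)\right|}{2\,\sigma_{noise}}\right)
= \mathrm{CDF}_{Gauss}(-Z) \\
&\;\Longrightarrow\;
\frac{NS\,\left|\nabla s(x_0,y_0)\right|}{2\,\sigma_{noise}} = Z \\
&\;\Longrightarrow\;
NS(x_0,y_0) = \frac{2Z\,\sigma_{noise}}{\left|\nabla s(x_0,y_0)\right|}.
\end{align*}
```

The second line uses the fact that the Gaussian CDF is strictly increasing, so equal CDF values imply equal arguments.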
The noise spread of the object is just the line integral with respect to arc length divided
by the total arc length $L$
$$ NS = \frac{1}{L}\oint_C NS(x,y)\,dl = \frac{2Z\,\sigma_{noise}}{L}\oint_C \frac{1}{\left|\nabla s(x,y)\right|}\,dl. \tag{3.17} $$
In the case of straight edges, scanned strokes, and circles the gradient is constant. For
other shapes the line integral has to be estimated numerically.
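A minimal sketch of such a numerical estimate is given below. It is not the procedure used in this thesis: the contour is sampled at ordered points, and the line integral in Equation 3.17 is approximated with the trapezoid rule on each segment. The surface s(x,y) = 1 − (x/a)² − (y/b)² is a hypothetical stand-in chosen because its contours and gradient are known in closed form, and the function names are illustrative.

```python
import math

Z = math.sqrt(2 * math.pi) / 2  # cutoff parameter from Equation 3.13

def object_noise_spread(points, grad_mags, sigma_noise):
    """Estimate Equation 3.17 on a closed contour sampled at `points`.

    points    -- (x, y) samples ordered along the contour C
    grad_mags -- |grad s| evaluated at each sample
    """
    total_length = 0.0
    weighted_sum = 0.0
    n = len(points)
    for i in range(n):
        j = (i + 1) % n  # wrap around to close the contour
        dl = math.dist(points[i], points[j])
        # trapezoid rule for 1/|grad s| over this segment
        inv_grad = 0.5 * (1.0 / grad_mags[i] + 1.0 / grad_mags[j])
        weighted_sum += inv_grad * dl
        total_length += dl
    return 2 * Z * sigma_noise * weighted_sum / total_length

def ellipse_contour(a, b, theta, n=2000):
    """Contour s = theta of the hypothetical surface s = 1 - (x/a)^2 - (y/b)^2."""
    scale = math.sqrt(1.0 - theta)
    points, grad_mags = [], []
    for k in range(n):
        t = 2 * math.pi * k / n
        x, y = a * scale * math.cos(t), b * scale * math.sin(t)
        points.append((x, y))
        grad_mags.append(2 * math.hypot(x / a**2, y / b**2))
    return points, grad_mags

sigma_noise = 0.1
circle = object_noise_spread(*ellipse_contour(1.0, 1.0, 0.75), sigma_noise)
ellipse = object_noise_spread(*ellipse_contour(2.0, 1.0, 0.75), sigma_noise)
print(f"circle NS = {circle:.4f}, ellipse NS = {ellipse:.4f}")
```

For the circle the gradient magnitude is constant, so the estimate reduces to the pointwise value of Equation 3.16. For the ellipse the gradient varies along the contour and the result falls between the smallest and largest pointwise noise spread.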
Noise spread has now been generalized to apply to arbitrary shapes. It is a powerful
measure of the edge noise in binary document images because it makes it possible to compare
noise levels of different objects scanned with different scanner parameters. In Section 3.3
its relationship with the Hamming distance between a scanned object and its noiseless template
is explored.
3.3. Relationship between Noise Spread and Hamming Distance
The real benefit of determining the noise spread of a scanned object is that it provides
an effective measure of how noisy an object is. Since Hamming distance provides a
metric of how different two scanned objects are, it is very useful for analyzing the noise
in bilevel images. The Hamming distance between a template and a scanned character is
determined by the combination of the phase effects and of the noise. The phase effects
were described earlier. If the phase effects are removed by forcing the template and the
scanned object to have the same phase, then the effects of noise alone can be analyzed.
When this is done, it is possible to relate the expected Hamming distance $H$ to the noise
spread.
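To make the setup concrete, the sketch below counts differing pixels between a noiseless template and a noisy scan of the same edge sampled on the same grid, so that phase effects are excluded. It is an illustration under stated assumptions rather than the experiment of this thesis: a Gaussian PSF, pixel centres on an integer grid centred on the edge, and pixels above the threshold coded as 1. The names `hamming_distance` and `scanned_edge` are hypothetical.

```python
import random
from statistics import NormalDist

def hamming_distance(image_a, image_b):
    """Count pixels that differ between two equally sized bilevel images."""
    return sum(
        pa != pb
        for row_a, row_b in zip(image_a, image_b)
        for pa, pb in zip(row_a, row_b)
    )

def scanned_edge(rows, cols, w, theta, sigma_noise, rng=None):
    """Binarize a blurred vertical edge, optionally with additive Gaussian noise.

    Assumes a Gaussian PSF (ESF = standard normal CDF). Passing rng=None
    produces the noiseless template.
    """
    esf = NormalDist().cdf
    image = []
    for _ in range(rows):
        row = []
        for c in range(cols):
            x = c - (cols - 1) / 2  # same sampling phase for template and scan
            value = esf(x / w)
            if rng is not None:
                value += rng.gauss(0.0, sigma_noise)
            row.append(1 if value > theta else 0)  # global threshold
        image.append(row)
    return image

template = scanned_edge(50, 21, w=2.0, theta=0.5, sigma_noise=0.0)
noisy = scanned_edge(50, 21, w=2.0, theta=0.5, sigma_noise=0.1,
                     rng=random.Random(0))
print("Hamming distance:", hamming_distance(template, noisy))
```

Because the template and the noisy scan share the same grid, every differing pixel is attributable to the added noise, and most of them are expected to lie close to the edge.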
To see why this is true, it is best to start with the case of isolated scanned edges. The
probability of error (PE) is defined to relate the expected Hamming distance to the noise
spread. The probability of error is related to the THP and is the probability that noise
gives a pixel a different value than it would have without noise. Formally it is
defined by
tol)
    R=(Rmin+Rmax)/2;              % midpoint of the current bracket
    [CSF]=CauchyCSF(Rf,alpha,R);
    if CSF-theta>0
        Rmax=R;
    else
        Rmin=R;
    end
end