microarray gridding - shodhgangashodhganga.inflibnet.ac.in/bitstream/10603/15870/13/13... · 2015....


Chapter 4

Microarray Gridding

4.1 Introduction

The introduction of microarray technology in 1995 opened new avenues for

investigating gene expression (Schena et al., 1995) and introduced new information

problems (Fenstemacher, 2005; MacMullen et al., 2005). Researchers have developed

several microarray image processing methods and modeling techniques that are

specific to DNA microarray analysis (Quackenbush, 2001) with the objective of drawing

biologically meaningful conclusions (Bajcsy et al., 2004; Baldi et al., 2001; Golub et

al., 1999; Moore, 2000). However, the analysis of DNA microarray data consists of

several processing steps (Goryachuv et al., 2001) that can significantly deteriorate the

quality of gene expression information, and hence lower our confidence in any

derived research result. Thus, understanding microarray image processing steps

(Bajcsy, 2006) becomes critical for performing optimal microarray data analysis and

deriving biologically meaningful conclusions. Microarray image analysis consists of

several steps, of which the first critical step is referred to as addressing or gridding

(Brandle et al., 2003). This is the process of identifying the areas within an image that each contain a single spot, which represents a gene, and of identifying which subgrid, and which row and column within that subgrid, the spot belongs to. Although this process may seem relatively straightforward, it is complicated because image quality suffers from noise (dust on the slide), artifacts (inner holes and scratches) and uneven background, while some spots are poorly contrasted and ill-defined. In addition, spots vary in size and position due to noise introduced during the sample preparation and hybridization processes. Thus there may be rotations, misalignments and local deformations of the ideal rectangular grid (Bajcsy, 2005).

* Some parts of the material in this chapter have appeared in, or have been communicated to, the following research papers:

1. Skew Correction and Noise Reduction for Automatic Gridding of Microarray Images, International Journal of Computer Science and Information Security, Vol. 8(4), pp: 326-334, 2010.

2. Automatic Technique for Gridding of Skewed and Noisy Microarray Images, Journal of Computational Intelligence in Bioinformatics, Vol. 3(2), pp: 185-198, 2010.

3. Automatic Gridding of Noisy Microarray Images Based on Coefficient of Variation, International Journal of Computer Science Issues (communicated).

4.2 Proposed Techniques for Gridding

We propose three techniques for automatic gridding of skewed and noisy microarray images. In the first method, the microarray image is skew corrected, noise is removed using adaptive thresholds computed on various segments, the spatial topology of the spots is detected, gridding is performed and finally the grid is refined.

In the second method, the microarray image is skew corrected, noise is removed using adaptive thresholds computed on various segments, the spatial topology of the spots is detected and bounding boxes are drawn around the spots.

In the third method, the projection profiles of the binarized image are obtained and noise is removed using morphological operations. Unduly non-uniform spacing between grid lines in noisy microarray images is corrected using the coefficient of variation (CV) of the successive differences in the grid line locations.

Figure 4.1 shows the block diagram which describes the salient stages of the proposed

methods for automatic gridding of noisy microarray images.


Figure 4.1: Stages of automatic gridding of noisy microarray images

4.2.1 Skew Detection and Correction

This section describes the first stage of microarray gridding: skew detection and correction. The approach is based on the corner positions of the subgrid.

Skew Detection

The first step is to convert the RGB image to a gray scale image. Figure 4.2 shows the parameters topx, topy, leftx, lefty, xmid and ymid that are required to find the skew angle. The gray scale image is scanned row-wise, and the first pixel encountered is assigned the coordinate address (topx, topy). The image is then scanned column-wise, and the first pixel encountered is assigned the coordinate address (leftx, lefty). xmid is the middle column and ymid is the middle row.

If topx < xmid and lefty > ymid, the skew is clockwise (Figure 4.2).

If topx > xmid and lefty < ymid, the skew is anticlockwise (Figure 4.3).

Figure 4.2: Parameters for clockwise skew detection

Figure 4.3: Parameters for anticlockwise skew detection

The clockwise skew angle is given by

$$\Phi = \arctan\left(\frac{topx - leftx}{lefty - topy}\right)$$

The anticlockwise skew angle is given by

$$\Phi = \arctan\left(\frac{lefty - topy}{topx - leftx}\right)$$
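The direction test and the two angle formulas can be sketched in Python as follows. This is only an illustration: the function name `skew_angle` is hypothetical, and `math.atan2` is used in place of a plain arctangent of the ratio (an assumption made here to avoid a division by zero; for positive denominators it gives the same angle).

```python
import math

def skew_angle(topx, topy, leftx, lefty, xmid, ymid):
    """Return (direction, angle in degrees) from the topmost and leftmost pixels.

    Coordinates follow the text: x is the column index, y is the row index.
    """
    if topx < xmid and lefty > ymid:
        # Clockwise skew: angle between the left edge and the vertical.
        phi = math.atan2(topx - leftx, lefty - topy)
        return "clockwise", math.degrees(phi)
    if topx > xmid and lefty < ymid:
        # Anticlockwise skew: angle between the top edge and the horizontal.
        phi = math.atan2(lefty - topy, topx - leftx)
        return "anticlockwise", math.degrees(phi)
    # Neither condition holds: treat the image as not skewed (assumption).
    return "none", 0.0
```

For example, `skew_angle(10, 0, 0, 100, 50, 50)` reports a clockwise skew of about 5.71 degrees, since the left edge moves 10 columns over 100 rows.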

Skew Correction

The new coordinates xx and yy are computed as follows.

Skew correction for clockwise tilt:

A rotation about (leftx, lefty) by Φ in the anticlockwise direction is required.

$$xx = leftx + (x - leftx)\cos\Phi - (y - lefty)\sin\Phi$$

$$yy = lefty + (x - leftx)\sin\Phi + (y - lefty)\cos\Phi$$

Skew correction for anticlockwise tilt:

$$xx = topx + (x - topx)\cos\Phi + (y - topy)\sin\Phi$$

$$yy = topy + (y - topy)\cos\Phi + (topx - x)\sin\Phi$$

where x varies from 1 to the number of columns and y varies from 1 to the number of rows.

The minimum values minxx and minyy are then computed and translated to (0,0). The coordinates of the skew corrected image are

$$xx1 = xx - minxx$$

$$yy1 = yy - minyy$$
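A minimal Python sketch of the coordinate mapping is given below, assuming the rotation is taken about the pivot point named in the text, (leftx, lefty) for clockwise tilt and (topx, topy) for anticlockwise tilt. The function name `correct_skew` and its argument layout are hypothetical. It maps pixel coordinates only; a practical implementation would also need to resample intensities, which the text does not describe.

```python
import math

def correct_skew(points, pivot, phi, clockwise=True):
    """Rotate (x, y) points about `pivot` by `phi` radians, then shift to (0, 0).

    clockwise=True uses the formulas for clockwise tilt (pivot = (leftx, lefty));
    clockwise=False uses those for anticlockwise tilt (pivot = (topx, topy)).
    """
    px, py = pivot
    c, s = math.cos(phi), math.sin(phi)
    rotated = []
    for x, y in points:
        if clockwise:
            xx = px + (x - px) * c - (y - py) * s
            yy = py + (x - px) * s + (y - py) * c
        else:
            xx = px + (x - px) * c + (y - py) * s
            yy = py + (y - py) * c + (px - x) * s
        rotated.append((xx, yy))
    # Translate so that the smallest coordinates become (0, 0).
    minxx = min(p[0] for p in rotated)
    minyy = min(p[1] for p in rotated)
    return [(xx - minxx, yy - minyy) for xx, yy in rotated]
```

With `phi = 0` the function reduces to the translation step alone, which is a convenient sanity check.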

Figures 4.4 and 4.5 show the clockwise skewed image and the skew corrected image (Image ID: 62919).

Figure 4.4: Clockwise Skewed Image ID: 62919

Figure 4.5: Skew Corrected Image ID: 62919


Table 4.1: Estimated angle (Φ) and execution time (τs) of the proposed skew correction technique

Image ID       Estimated angle in degrees (Φ)   Execution time in seconds (τs)
1c7b060rex2    2.0651                           13.14
1c4bo64rex2    2.5323                           12.12
s62919         4.8253                           15.15
40031          3.7653                           13.54

4.2.2 Adaptive Threshold Computation and Filtering

Threshold computation and filtering are used to remove insignificant spots so that the automatic gridding procedure becomes easier. Filtering is performed in two steps, which are described in section 3.2.2.1. Gridding is performed after filtering.

4.2.3 Automatic Gridding Process

In this section, a novel approach for automatic gridding of skewed and noisy

microarray images is presented.

4.2.3.1 Spatial Topology Method

Automatic gridding is performed in three steps, which are described in the subsections below.

Determination of position of grid lines

For each connected component in the filtered image, rmin, rmax, cmin and cmax are determined as shown in Figure 4.6. The rmin values are collected into a sorted array, and likewise the rmax, cmin and cmax values. The array of successive differences of the sorted rmin array, called diff_rmin, is then computed, and diff_rmax, diff_cmin and diff_cmax are computed in the same way. Key portions of rmin, rmax, diff_rmin and diff_rmax are shown below. All computations are done on image ID 62919.


rmin:

rmax:

Figure 4.6: Computation of rmin, rmax, cmin & cmax of a spot
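The extents rmin, rmax, cmin and cmax of each spot can be obtained with a connected component pass over the filtered binary image. The sketch below is a plain breadth-first labelling in Python; the function name `component_extents`, the use of 4-connectivity and the 0-based indices are assumptions made for illustration, since the text does not specify them.

```python
from collections import deque

def component_extents(binary):
    """Return (rmin, rmax, cmin, cmax) for each 4-connected foreground component.

    `binary` is a list of rows containing 0 (background) or 1 (foreground).
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    extents = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Start a new component and grow it with a breadth-first search.
            rmin = rmax = r
            cmin = cmax = c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                rmin, rmax = min(rmin, y), max(rmax, y)
                cmin, cmax = min(cmin, x), max(cmax, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            extents.append((rmin, rmax, cmin, cmax))
    return extents
```

The rmin, rmax, cmin and cmax arrays used in the following steps are then the four columns of the returned list.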

The steps below describe determination of horizontal grid lines.

1) The rmin array is sorted in ascending order.

sorted_rmin:

2) The differences between successive rmin values in the sorted rmin array are calculated.

diff_rmin:

3) A sudden change in the differences between successive rmin values indicates the end of the previous row of spots and the beginning of the next row of spots.

4) Observe the sudden change from 0 to 15 at position 3 in the diff_rmin array. The third element of the sorted rmin array is 9. Hence examination of diff_rmin suggests a grid line at row 9. The successive values of grid_rmin are obtained in the same way.

grid_rmin:

grid_rmax is determined similarly. The sorted_rmax, diff_rmax and grid_rmax values are shown below.

sorted_rmax:

diff_rmax:

grid_rmax:


Finally, positions of horizontal gridlines are determined by finding average of rows

suggested by grid_rmin and grid_rmax contents. Thus horizontal gridlines are placed

at rows 9, 25 [(25+25)/2], 41 [(38+43)/2], 57 [(55+60)/2]…etc.

In a similar manner, vertical gridlines are positioned using sorted_cmin, diff_cmin, grid_cmin, sorted_cmax, diff_cmax and grid_cmax.
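The sort, difference and averaging steps can be sketched in Python as below. The text only speaks of a "sudden change" in the differences, so the `jump` tolerance is a hypothetical parameter, and the function names are illustrative. Following the worked example, the candidate row is the element at the position where the jump occurs. The input values in the usage note are made up and are not the arrays for image 62919.

```python
def grid_candidates(values, jump):
    """Return the sorted values at which the successive difference exceeds `jump`."""
    ordered = sorted(values)
    diffs = [b - a for a, b in zip(ordered, ordered[1:])]
    # A large difference marks the end of one row (or column) of spots.
    return [ordered[i] for i, d in enumerate(diffs) if d > jump]

def average_gridlines(grid_min, grid_max):
    """Average paired candidates from the min and max arrays (no rounding)."""
    return [(a + b) / 2 for a, b in zip(grid_min, grid_max)]
```

For instance, `grid_candidates([9, 9, 8, 24, 25, 24, 40], jump=5)` gives `[9, 25]`: the sorted array is 8, 9, 9, 24, 24, 25, 40, and the differences jump by 15 after the third and sixth elements. Rounding of the averaged positions is left to the caller.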

Grid Refinement Algorithm

The algorithm described in the previous section draws grid lines correctly as long as a spot exists on each row and each column of the filtered image. However, there may be images in which no spots are present in several consecutive rows or columns. In these images, the spacing between gridlines is irregular. Figures 4.7 and 4.8 show sparse gridding in the horizontal and vertical directions. In such cases the refinement algorithm can be used to draw the additional (missing) grid lines. The grid refinement process checks whether all the gridlines have been drawn. If the difference between the positions of successive rows (i, i+1) is greater than the average of the previous row spacings (avgrowspace), the algorithm draws horizontal lines at every successive avgrowspace, beginning from the previously drawn horizontal line, until row i+1 or the end of the rows is reached. A similar procedure is repeated for vertical lines.
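The refinement step can be sketched as follows. The text compares each gap with the average of the previous spacings; the `factor` tolerance used here (fill only when the gap exceeds 1.5 times the average) is an assumption added so that ordinary small variations do not trigger extra lines, and the function name is hypothetical.

```python
def refine_gridlines(lines, factor=1.5):
    """Insert missing grid lines wherever a gap is much larger than the average spacing."""
    lines = sorted(lines)
    refined = lines[:1]
    spacings = []
    for nxt in lines[1:]:
        if spacings:
            avg = sum(spacings) / len(spacings)
            # Step forward by the average spacing until the next line is close.
            while nxt - refined[-1] > factor * avg:
                refined.append(refined[-1] + avg)
        spacings.append(nxt - refined[-1])
        refined.append(nxt)
    return refined
```

With lines at 9, 25, 41, 89 and 105, the average spacing before the gap is 16, so lines are added at 57 and 73. The same function applies to column positions for vertical lines.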

Figure 4.9 shows both the horizontal (Figure 4.7) and vertical (Figure 4.8) gridlines placed before the refinement process. Figure 4.10 shows the gridding result obtained after refinement; there are more grid lines than in Figure 4.9. Figures 4.11 and 4.12 show gridding done by the projection profile and standard deviation methods, which produce fewer and non-uniformly spaced grid lines.

The results are summarized in Table 4.2, which compares the proposed method with the projection profile and standard deviation algorithms. The comparison was performed on 10 microarray images. The proposed method recovers the expected number of rows and columns on 9 of the 10 images, with an error of 2.22% on the remaining image (34143), whereas the standard deviation and projection profile methods each achieve zero error on only one image. The expected numbers of rows and columns are inferred from the number of connected components across each row and column.


Table 4.2: Performance comparison of the proposed spatial topology method with other approaches

Method               Image ID      Expected rows   Expected columns   Rows obtained   Columns obtained   Total error (%)
Standard deviation   62919         29              30                 27              27                 8.474576
Standard deviation   22593         17              15                 21              15                 12.5
Standard deviation   37993         29              29                 27              29                 3.448276
Standard deviation   34212         20              21                 21              21                 2.439024
Standard deviation   34217         18              23                 18              23                 0
Standard deviation   34143         22              23                 22              21                 4.444444
Standard deviation   34134         23              23                 23              22                 2.173913
Standard deviation   52694         28              29                 23              28                 10.52632
Standard deviation   57852         27              29                 25              28                 5.357143
Standard deviation   66357         28              29                 26              29                 3.508772
Projection profile   62919         29              30                 27              29                 5.084746
Projection profile   22593         17              15                 20              15                 9.375
Projection profile   37993         29              29                 26              26                 10.34483
Projection profile   34212         20              21                 20              21                 0
Projection profile   34217         18              23                 21              24                 9.756098
Projection profile   34143         22              23                 24              23                 4.444444
Projection profile   34134         23              23                 23              21                 4.347826
Projection profile   52694         28              29                 26              29                 3.508772
Projection profile   57852         27              29                 25              29                 3.571429
Projection profile   66357         28              29                 27              29                 1.754386
Proposed method      62919(SMD)    29              30                 29              30                 0
Proposed method      22593(SMD)    17              15                 17              15                 0
Proposed method      37993(UNC)    29              29                 29              29                 0
Proposed method      34212(UNC)    20              21                 20              21                 0
Proposed method      34217(UNC)    18              23                 18              23                 0
Proposed method      34143(UNC)    22              23                 23              23                 2.222222
Proposed method      34134(UNC)    23              23                 23              23                 0
Proposed method      52694(SMD)    28              29                 28              29                 0
Proposed method      57852(SMD)    27              29                 27              29                 0
Proposed method      66357(SMD)    28              29                 28              29                 0
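The text does not state how the total error in Table 4.2 is computed. The tabulated values are consistent with the sum of the absolute row and column miscounts divided by the expected total, expressed as a percentage. The sketch below encodes that inferred formula; both the formula and the function name `total_error` are assumptions drawn from the table entries, not a definition given by the source.

```python
def total_error(expected_rows, expected_cols, rows_obtained, cols_obtained):
    """Percentage error inferred from Table 4.2 (an assumption, not stated in the text)."""
    missed = abs(expected_rows - rows_obtained) + abs(expected_cols - cols_obtained)
    return 100.0 * missed / (expected_rows + expected_cols)
```

For image 62919 with the standard deviation method, `total_error(29, 30, 27, 27)` gives 5/59 of 100, about 8.4746, which matches the first table entry.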


Figure 4.7: Filtered image with sparse horizontal grid lines

Figure 4.8: Filtered image with sparse vertical grid lines

Figure 4.9: Gridding of noisy microarray image before refinement process


Figure 4.10: Gridding of noisy microarray image after refinement process

Figure 4.11: Gridding of noisy microarray image by projection profile method

Figure 4.12: Gridding of noisy microarray image by standard deviation method


4.2.3.2 Bounding Box Method

In this approach, the adaptive threshold computation and filtering procedure discussed in section 3.2.2.1 is used first, followed by the spatial topology computation discussed in section 4.2.3.1. Instead of drawing grid lines as in section 4.2.3.1, a rectangular grid (bounding box) is constructed around each spot, as shown in Figure 4.13.

Figure 4.13: Coordinate system of the spot for rectangular grid

Construction of rectangular grid

The consolidated rmin, rmax, cmin and cmax values are used to build the grid structure. Figure 4.13 describes the coordinate system for the grid structure using these values. Point A, with coordinates (grid_rmin, grid_cmin), is the top left corner; B (grid_rmin, grid_cmax) is the top right corner; C (grid_rmax, grid_cmax) is the bottom right corner; and D (grid_rmax, grid_cmin) is the bottom left corner of the rectangular grid. The major advantage of the proposed method is that the next stage of microarray image analysis, segmentation, can be performed easily and with minimal errors.
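The four corners follow directly from the consolidated extents. A small Python sketch, with a hypothetical function name and (row, column) ordering as in the text:

```python
def box_corners(grid_rmin, grid_rmax, grid_cmin, grid_cmax):
    """Return the corners A, B, C, D of a spot's bounding box as (row, column) pairs."""
    return {
        "A": (grid_rmin, grid_cmin),  # top left
        "B": (grid_rmin, grid_cmax),  # top right
        "C": (grid_rmax, grid_cmax),  # bottom right
        "D": (grid_rmax, grid_cmin),  # bottom left
    }
```

Because each box isolates one spot, a later segmentation stage could operate on the pixels between rows grid_rmin and grid_rmax and columns grid_cmin and grid_cmax without further search.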

Figure 4.14 shows one subgrid of a noisy microarray image. As discussed in section 3.2.2.1, an adaptive threshold is used to perform filtering. Figure 4.15 shows the filtered image of Figure 4.14; most of the contaminated (insignificant, noisy) pixels have been removed. Figure 4.16 shows another noisy microarray image and Figure 4.17 shows the corresponding image filtered using the proposed approach. Figures 4.18, 4.19 and 4.20 show the gridding of noisy microarray images using the proposed Bounding Box method.


The results are summarized in Table 4.3, which compares the proposed Bounding Box method with the projection profile and standard deviation methods. The comparison was performed on 10 microarray images. The proposed method has the lowest error on every image: its error ranges from 1.48% to 7.84%, compared with 6.65% to 15.02% for the projection profile method and 10.82% to 16.82% for the standard deviation method. The actual number of connected components is inferred from the number of connected components across each row and column.

Figures 4.21 and 4.22 show gridding done by the projection profile and standard deviation methods; both produce non-uniformly spaced grid lines.

Figure 4.14: Subgrid of noisy microarray image, Image ID: 32919

Figure 4.15: Filtered image using adaptive threshold, Image ID: 32919


Figure 4.16: Subgrid of noisy microarray image, Image ID: 35964

Figure 4.17: Filtered image using adaptive threshold, Image ID: 35964


Figure 4.18: Gridding of noisy microarray image using proposed Bounding Box method,

Image ID: 38052


Figure 4.19: Gridding of noisy microarray image using our method, Image ID: 75186

Figure 4.20: Gridding of noisy microarray image using our method, Image ID: 37010

Figure 4.21: Gridding of noisy microarray image by projection profile method


Figure 4.22: Gridding of noisy microarray image by standard deviation method

Table 4.3: Performance comparison of proposed Bounding Box method with other methods

Method               Image ID      Actual number of connected components   Number of components obtained   Error (%)
Standard deviation   62919(SMD)    871                                     750                             13.89
Standard deviation   22593(SMD)    255                                     215                             15.68
Standard deviation   37993(UNC)    841                                     750                             10.82
Standard deviation   34212(UNC)    420                                     350                             16.66
Standard deviation   34217(UNC)    414                                     348                             15.94
Standard deviation   34143(UNC)    506                                     430                             15.01
Standard deviation   34134(UNC)    529                                     440                             16.82
Standard deviation   52694(SMD)    812                                     680                             16.25
Standard deviation   57852(SMD)    783                                     660                             15.70
Standard deviation   66357(SMD)    812                                     678                             16.50
Projection profile   62919(SMD)    871                                     800                             8.15
Projection profile   22593(SMD)    255                                     225                             11.76
Projection profile   37993(UNC)    841                                     785                             6.65
Projection profile   34212(UNC)    420                                     375                             10.71
Projection profile   34217(UNC)    414                                     368                             11.11
Projection profile   34143(UNC)    506                                     450                             11.06
Projection profile   34134(UNC)    529                                     480                             9.26
Projection profile   52694(SMD)    812                                     740                             8.86
Projection profile   57852(SMD)    783                                     700                             10.60
Projection profile   66357(SMD)    812                                     690                             15.02
Proposed             62919(SMD)    871                                     831                             4.5
Proposed             22593(SMD)    255                                     235                             7.84
Proposed             37993(UNC)    841                                     810                             3.68
Proposed             34212(UNC)    420                                     390                             7.14
Proposed             34217(UNC)    414                                     389                             6.036
Proposed             34143(UNC)    506                                     480                             5.1383
Proposed             34134(UNC)    529                                     500                             5.4820
Proposed             52694(SMD)    812                                     800                             1.4778
Proposed             57852(SMD)    783                                     750                             4.21455
Proposed             66357(SMD)    812                                     790                             2.70


4.2.3.3 Coefficient of Variation method

Automatic gridding of noisy microarray images is performed in 5 steps which are

described in the subsections below. Figure 4.23 shows the block diagram which

describes the salient stages of the proposed Coefficient of variation method for

automatic gridding of noisy microarray images.

Figure 4.23: Stages of automatic gridding of noisy Microarray images

4.2.3.3.1 Basic method of vertical and Horizontal Projection Profiles.

Let the size of the given gray scale image matrix be m x n, where m is the number of rows and n is the number of columns. Let F(i, j) be the intensity of the noise free gray scale image at row i and column j. Then the vertical projection profile P is calculated using

$$P(j) = \sum_{i=1}^{m} F(i, j), \quad j = 1, 2, \ldots, n \qquad (1)$$

A typical vertical profile for a 4 x 4 subarray is shown in Figure 4.24. The locations of the regional minima (valley points) of the profile give the column positions of the vertical grid lines (Figure 4.24).
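Equation (1) and the valley search can be sketched in Python as below. The function names are hypothetical, indices are 0-based, and the valley rule used here (a column lower than its left neighbour and no higher than its right neighbour, with image borders excluded) is one simple reading of "regional minima" chosen for illustration.

```python
def vertical_profile(image):
    """P(j): sum of the intensities in column j, as in Eq. (1)."""
    return [sum(column) for column in zip(*image)]

def valley_columns(profile):
    """Interior columns where the profile reaches a local minimum."""
    return [j for j in range(1, len(profile) - 1)
            if profile[j - 1] > profile[j] <= profile[j + 1]]
```

For a two-row image with columns summing to 1, 7, 0, 5 and 1, the only interior valley is at column index 2, which is where a vertical grid line would be placed.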


Due to the presence of noise in the subarray, a valley (region between adjacent peaks)

of the vertical profile may contain more than one local minimum. This causes

ambiguity in determining the location of the vertical grid line in that valley region.

Therefore the vertical profile signal should be processed further to uniquely determine

the position of vertical grid lines. This disadvantage is overcome by our proposed

method where projection profiles are obtained after binarizing the subarray image.

4.2.3.3.2 Coefficient of Variation method

In the proposed method, the projection profiles are first transformed into the

corresponding 2D binary images and then further processed to achieve efficient

automatic gridding.

Binarization

Binarization converts the given gray scale image into its equivalent two-level (black and white) image. Let F be the gray scale image matrix of size m x n and T be the threshold level. The spots of the subarray have higher pixel intensity than the background region, so the spots can be segmented by binarizing the subarray image with a suitable threshold. The binary map B of F is then given by

$$B(i, j) = \begin{cases} 1, & F(i, j) > T \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$

Figure 4.24: 4x4 subarray and its vertical profile


In the resulting binary image, 1's represent the pixels whose intensities are greater than the threshold level and 0's represent the remaining pixels. Thus, 1's represent the foreground and 0's the background. Now consider the vertical projection profile after binarization. A valley region (the interval between adjacent foreground columns) of a noise free subarray consists entirely of 0's. Therefore the corresponding vertical profile (the sum of the column values) has zero magnitude in the valley region, as shown in Figure 4.25. Thus the spot regions and the valley regions can be clearly separated.

Vertical Projection Profile as an image

In our method, the projection profiles obtained after binarization are converted into corresponding images as described below. After binarization, the vertical projection profile is given by

$$V(j) = \sum_{i=1}^{m} B(i, j), \quad j = 1, 2, \ldots, n \qquad (3)$$
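A short Python sketch of binarization followed by the profile of Eq. (3) is given below. The function names are hypothetical, and the comparison uses a strict "greater than" test, following the statement that 1's mark pixels whose intensities exceed the threshold.

```python
def binarize(image, threshold):
    """Binary map: 1 where the intensity exceeds the threshold, 0 elsewhere."""
    return [[1 if value > threshold else 0 for value in row] for row in image]

def binary_vertical_profile(binary):
    """V(j): number of foreground pixels in column j, as in Eq. (3)."""
    return [sum(column) for column in zip(*binary)]
```

Columns with V(j) = 0 form the valley regions, so in a noise free subarray the spot columns and the gaps between them separate cleanly.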

Figure 4.26(a) shows the histogram of the foreground pixels in columns. Observe the

pyramid like structure in Figure 4.26(a).

Figure 4.25: 4x4 subarray and its vertical profile after binarization

From Pyramids Extended to Bounding Boxes

The pyramids are enclosed in vertically extended bounding boxes as shown in Figure 4.26(b). The width of each extended bounding box is equal to the base width of the corresponding pyramid. The boxes are extended from the bottom of the image all the way to the top for easy manipulation. The heights of the boxes are immaterial, whereas the widths of the boxes and the horizontal spacings between adjacent boxes determine the vertical grid lines. In general, almost equally spaced vertical boxes represent the spot columns of the subarray. The black bars of Figure 4.26(b) separate the successive spot columns of the subarray, so the vertical bisectors of the black bars are the vertical gridlines. However, because of noise, distortion and the mixed distribution of foreground and background intensities, some pre-processing is required to remove noise and other irregularities before correct results can be obtained.

Figure 4.26 (a) Histogram of subarray in Figure 4.25 (b) Extended Bounding Box image

Drawing vertical grid lines from extended bounding box image

The algorithm given here is proposed for vertical grid line determination in case of

noisy microarray images.

Algorithm

1. Select a suitable gray threshold level.

2. Binarize the subarray using this threshold.

3. Get the vertical profile.

4. Convert the profile into its equivalent binary profile image.

5. Find the bounding boxes of all the profile pyramids by vertically extending them.

6. Draw vertical bisectors for the black bars of the extended bounding box image to

get the vertical grid lines

Horizontal gridlines are obtained similarly using the horizontal profile.
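The six steps can be sketched compactly in Python. Instead of building the profile image and its extended bounding boxes explicitly, the sketch works on the column occupancy directly: occupied columns correspond to the white bars and runs of empty columns to the black bars. Treating only the black bars that lie between two white bars as grid line locations is an assumption made here, as is the function name.

```python
def vertical_gridlines(image, threshold):
    """Return the bisectors of the empty-column runs that separate spot columns."""
    n = len(image[0])
    # A column is occupied if any of its pixels exceeds the threshold.
    occupied = [any(row[j] > threshold for row in image) for j in range(n)]
    lines = []
    j = 0
    seen_spot_column = False
    while j < n:
        if occupied[j]:
            seen_spot_column = True
            j += 1
            continue
        start = j
        while j < n and not occupied[j]:
            j += 1
        # Keep only runs bounded by spot columns on both sides.
        if seen_spot_column and j < n:
            lines.append((start + j - 1) / 2)
    return lines
```

Applying the same function to the transposed image would give the horizontal grid lines.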

Gridding in noisy Microarray images

Noise in the image produces spurious, missing and misplaced ridges and valleys in the projection profiles, which in turn generate spurious, missing and misplaced grid lines. The effect of subarray irregularities on gridding can be controlled by proper selection of the threshold level for binarization. A high threshold level suppresses many pixels from the foreground, which results in false valleys in the profile. Figure 4.27(a) shows a subarray binarized using a high threshold; there are few pixels in the foreground. Figure 4.27(b) is the vertical profile of the array in Figure 4.27(a), Figure 4.27(c) is the extended bounding box image and Figure 4.27(d) shows the resulting ill-spaced vertical grid lines.

Figure 4.27: Effect of high threshold Binarization (a) Binarized subarray

(b) Vertical Profile image (c) Extended box profile (d) Incorrect vertical gridding

A low binarization threshold retains many background pixels in the foreground. Adjacent spot columns therefore merge and the bases of the pyramids touch each other. This results in merged white bars and missing black bars in the box profile image, as shown in Figure 4.28. Here, the first two pyramids touch each other, so the connected components labelling technique used in our method generates a single white bar for these two spot columns. The valley between them has also disappeared in Figures 4.28(b) and 4.28(c). This results in incorrect gridding, as depicted in Figure 4.28(d).

Figure 4.28: Effect of low threshold binarization (a) Binarized subarray (b) Vertical profile image (c) Extended box profile (d) Incorrect vertical gridding

Both over-thresholding and under-thresholding result in irregular spacing of gridlines (Figures 4.27(d) and 4.28(d)). The correct threshold is the level that generates almost equally spaced grid lines. The proposed procedure for adjusting grid locations when the grid lines are unduly non-uniform is as follows.


Let x(i) denote the horizontal coordinate of the i-th grid line for i = 1, 2, ..., k, where k is the total number of vertical gridlines, as shown in Figure 4.29. Let d(j) represent the separation between successive gridlines for j = 1, 2, ..., (k-1), where

$$d(j) = x(j+1) - x(j) \qquad (4)$$

For an ideal subarray, the separations d(j) are all equal, and the coefficient of variation (the ratio of the standard deviation to the mean) of the d(j) is zero. In a practical subarray, the d(j) are very nearly equal and the coefficient of variation (CV) is small. The CV of the d(j) is thus a measure of the dispersion among their values. When the binarization threshold level varies, the d(j) vary and hence the resulting CV also varies. The best threshold is the value that yields the minimum CV, which gives the most uniform gridline distribution. Therefore, in the proposed method, the binarization threshold level is varied in steps from a low value to a high value and the CV is calculated for each threshold value. The threshold value that results in the minimum CV is chosen for binarization, and the resulting gridlines give the best gridding for that subarray. The algorithm is given below.

Figure 4.29: Vertical gridlines x(1), x(2), ..., x(k) with their marked separations d(1), d(2), ..., d(k-1)


Algorithm: Determination of the correct threshold level for binarization and the best gridding

Normalize the gray levels of the image to the range 0 to 1.

Choose a suitable lower threshold value TL (1%), a suitable upper threshold value TH (85%) and an appropriate increment value ΔT (5%).

1. Set T to its initial value, T = TL.

2. Set the iteration count ic = 1.

3. Binarize the subarray and obtain the vertical grid lines x(i) for this T using the vertical grid line algorithm given earlier.

4. Determine the d(j) as given by Eq. (4) for this set of grid lines.

5. Calculate the coefficient of variation CV(ic) for this set of d(j).

6. Increment T as T = T + ΔT.

7. If T > TH, go to step 9.

8. Otherwise, set ic = ic + 1 and go back to step 3.

9. Determine the minimum among the CV values, find the iteration count at which it occurs and get the corresponding value of T. For this T, obtain the x(i), the locations of the vertical grid lines, using the vertical grid line algorithm given earlier.

The horizontal gridlines are similarly obtained using the horizontal profile image.
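The threshold sweep above can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used in this work. `find_vertical_gridlines` is a hypothetical stand-in for the gridline step: it places one line at the centre of each run of columns that contain no foreground pixel. The synthetic image and the rule that at least three gridlines are required are likewise assumptions added for the example.

```python
from statistics import mean, pstdev


def find_vertical_gridlines(image, T):
    """Hypothetical stand-in for the gridline step (an assumption).

    Binarize at T, treat columns with no pixel above T as gaps, and
    place one gridline at the centre of each run of gap columns.
    """
    n_cols = len(image[0])
    is_gap = [all(row[c] <= T for row in image) for c in range(n_cols)]
    lines, start = [], None
    for c, gap in enumerate(is_gap + [False]):  # sentinel closes a final run
        if gap and start is None:
            start = c
        elif not gap and start is not None:
            lines.append((start + c - 1) / 2)
            start = None
    return lines


def best_threshold(image, TL=0.10, TH=0.85, dT=0.05):
    """Sweep T from TL to TH in steps of dT; return (T, CV, gridlines) at minimum CV."""
    best = None
    n_steps = int(round((TH - TL) / dT)) + 1
    for ic in range(n_steps):
        T = round(TL + ic * dT, 10)  # avoids accumulating floating-point error
        x = find_vertical_gridlines(image, T)
        if len(x) < 3:
            # Assumed guard: a single spacing always has CV = 0, so skip it.
            continue
        d = [b - a for a, b in zip(x, x[1:])]  # Eq. (4)
        cv = pstdev(d) / mean(d)
        if best is None or cv < best[1]:
            best = (T, cv, x)
    return best


def make_row(dust=False):
    """Synthetic row: four 3-pixel spots (0.9) separated by 3-pixel gaps."""
    row = [0.0] * 27
    for start in (3, 9, 15, 21):
        for c in range(start, start + 3):
            row[c] = 0.9
    if dust:
        row[7] = 0.18  # faint dust pixel inside a gap
    return row


image = [make_row(), make_row(dust=True), make_row()]
T, score, x = best_threshold(image)
print(T, score, x)
```

At thresholds below the dust intensity the affected gap splits in two and the spacings become irregular. Once T exceeds it, the gridlines are evenly spaced and the CV drops to zero, so that threshold is selected. The loop iterates over a precomputed number of steps rather than repeatedly adding ΔT, which is equivalent to steps 6 to 8 but avoids floating-point drift.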

4.2.3.3.3 Experimental Results

In this section the performance of the proposed approach is evaluated on noisy microarray images from the Stanford Microarray Database (SMD). The algorithm was executed on a Pentium Centrino Duo processor with 2 GB RAM. For subarray (18, 18) of image id 37010, the CV values obtained for each threshold T are shown in Table 4.4. Here, TL = 0.1, TH = 0.85 and ΔT = 0.05. From Table 4.4, we see that the minimum CV is 0.030, occurring at iteration 4, and the corresponding threshold value (T) is 0.25. Figure 4.30(a) shows the gridding of the subarray image (37010) using the proposed method, and Figures 4.30(b) and 4.30(c) show the gridding of the same image with the standard deviation and projection profile methods. All grid lines in Figure 4.30(a) are approximately equally spaced and none is drawn over a spot, whereas in Figures 4.30(b) and 4.30(c) some grid lines run through the spots.


Table 4.4: Determination of grid lines

Iteration count (ic)   Threshold level (T)   Coefficient of variation (CV)   Remarks
1                      0.10                  0.314                           T = TL
2                      0.15                  0.033
3                      0.20                  0.037
4                      0.25                  0.030                           CV minimum
5                      0.30                  0.033
6                      0.35                  0.040
7                      0.40                  0.045
8                      0.45                  0.051
9                      0.50                  0.250
10                     0.55                  0.252
11                     0.60                  0.252
12                     0.65                  0.338
13                     0.70                  0.338
14                     0.75                  0.336
15                     0.80                  0.343
16                     0.85                  0.515                           T = TH
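As a quick check of the selection step, the values of Table 4.4 can be searched for the minimum CV with a few lines of Python. The (T, CV) pairs are transcribed from the table, and the variable names are illustrative only.

```python
# (T, CV) pairs transcribed from Table 4.4, in iteration order.
table = [
    (0.10, 0.314), (0.15, 0.033), (0.20, 0.037), (0.25, 0.030),
    (0.30, 0.033), (0.35, 0.040), (0.40, 0.045), (0.45, 0.051),
    (0.50, 0.250), (0.55, 0.252), (0.60, 0.252), (0.65, 0.338),
    (0.70, 0.338), (0.75, 0.336), (0.80, 0.343), (0.85, 0.515),
]

# Iteration counts start at 1, as in the algorithm; pick the row with the smallest CV.
ic_best, (T_best, cv_best) = min(enumerate(table, start=1),
                                 key=lambda item: item[1][1])
print(ic_best, T_best, cv_best)
```

The search selects iteration 4 with T = 0.25 and CV = 0.030, the row marked "CV minimum" in the table.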


Figure 4.30 (a): Gridding of noisy microarray image using the coefficient of variation method

Figure 4.30 (b): Gridding of noisy microarray image by the standard deviation method

Figure 4.30 (c): Gridding of noisy microarray image by the projection profile method


4.3 Conclusion

In this chapter, three novel methods for automatic gridding of noisy, skewed microarray images are presented. In the first method, a spatial topology technique was used to automatically grid skewed, noisy microarray images. The results of our experiments on skewed, noisy microarray images from SMD and UNC are encouraging. The skew correction algorithm depends on determining the coordinates of just two positions in the image. Noise removal is done through adaptive thresholding, which makes the process effective. Finally, the gridding is performed based on the spatial topology of the spots. In the second method, a rectangular grid (bounding box) was used on each spot to automatically grid noisy, skewed microarray images. Noise removal is performed through adaptive thresholding, and the entire process is robust in the presence of noise, skew, artifacts and weakly expressed spots. Finally, the gridding is performed by drawing a bounding box around each spot in the grid. In the third method, the projection profiles of the binarized image are obtained and noise is removed using morphological operations. Unduly non-uniform distances between grid lines in noisy microarray images are corrected using the coefficient of variation (CV) of the successive differences in the gridline locations.

In summary, the stages of the proposed methods, when executed in sequence, are effective and computationally simple.