a new line symmetry distance based automatic
Post on 06-Apr-2018
223 Views
Preview:
TRANSCRIPT
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 1/16
A New Line Symmetry Distance Based AutomaticClustering Technique: Application to ImageSegmentation
Sriparna Saha,1 Ujjwal Maulik 2
1 Image Processing and Modeling, Interdisciplinary Center for Scientific Computing (IWR),University of Heidelberg, Heidelberg, Germany
2 Department of Theoretical Bioinformatics, DKFZ (Deutsches Krebsforschungszentrum,German Cancer Research Center), Heidelberg, Germany
Received 3 March 2009; accepted 16 May 2010
ABSTRACT: In this article, at first an automatic clustering technique
using the concept of line symmetry property is developed. The pro-
posed real-coded variable string length genetic clustering technique
(VGALS clustering) is able to evolve the number of clusters present in
the data set automatically. Here assignment of points to different
clusters is done based on the line symmetry based distance rather
than the Euclidean distance. The cluster centers are encoded in the
chromosomes, whose value may vary. A newly developed line sym-
metry based cluster validity index, LineSym-index, is used as a mea-
sure of ‘‘goodness’’ of the corresponding partitioning. This validity
index is able to correctly indicate the presence of clusters of different
sizes as long as they are line symmetrical. A Kd-tree based data
structure is used to reduce the complexity of computing the line sym-metry distance. The proposed technique is then applied to automati-
cally segment different images. At first, the superiority of the pro-
posed method to automatically segment the image data sets over
Fuzzy C-means clustering technique, well-known mean-shift based
method and GAPS clustering with Sym-index based method, are
demonstrated for three remote sensing satellite images. Thereafter it
is applied on several simulated T1-weighted, T2-weighted, and pro-
ton density normal and MS lesion magnetic resonance brain images.
The proposed method is able to detect most of the regions well. Su-
periority of the proposed method over Fuzzy C-means and Expecta-
tion Maximization clustering algorithms are demonstrated quantita-
tively. The automatic segmentation obtained by VGALS clustering
technique is also compared with the available ground truth informa-
tion. VVC 2011 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 21, 86–100,
2011; Published online in Wiley Online Library (wileyonlinelibrary.com). DOI10.1002/ima.20243
Key words: unsupervised classification; cluster validity index;
symmetry; line symmetry based distance; Principal component
analysis; Kd tree; magnetic resonance image
I. INTRODUCTION
Remote sensing satellite images have significant applications in dif-
ferent areas like climate studies, assessment of forest resources,
examining marine environments, etc. For remote sensing applica-
tions, classification is an important task where the pixels in the
images are classified into homogeneous regions, each of which cor-
responds to some particular landcover type. The problem of pixel
classification is often posed as clustering (Saha and Bandyopad-
hyay, 2010) in the intensity space (Maulik and Bandyopadhyay,
2003). In the unsupervised pixel classification framework, various
clustering algorithms like Fuzzy C-means (Cannon et al., 1986),
and statistical methods have been used for the purpose of satellite
image segmentation. Recently application of genetic algorithms
(GAs) in the field of pixel classification has obtained significant
attention of the researchers (Maulik and Bandyopadhyay, 2003;
Saha and Bandyopadhyay, 2008). A context-sensitive clustering
technique for unsupervised image segmentation based on graph-cut
initialization and expectation-maximization algorithm is developed
by Tyagi et al., (2008). An unsupervised hyperspectral image classi-
fication technique based on fuzzy-clustering algorithms that spa-
tially exploit membership relations is proposed by Bilgin et al.,
(2008). A multiresolution remote sensing image clustering tech-
nique is proposed by Wemmert et al., (2009) which uses informa-
tion contained in both spatial resolutions. An agglomerative hier-
archical clustering method for large multispectral images, whichuses both spectral and spatial information for the aggregation deci-
sion, is proposed by Marcal and Castro, (2005). A new spectral-spa-
tial classification scheme for hyperspectral images which combines
the results of a pixel wise support vector machine classification and
the segmentation map obtained by partitional clustering using ma-
jority voting is proposed by Tarabalka et al., (2009). Recently, a
multiobjective fuzzy genetic clustering technique for pixel classifi-
cation has been proposed by Bandyopadhyay et al., (2007). By Cha-
mundeeswari et al., (2007), an unsupervised classification algorithmCorrespondence to: Sriparna Saha; e-mail: sriparna.saha@gmail.com
' 2011 Wiley Periodicals, Inc.
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 2/16
using Maximum a posteriori (MAP) segmentation for SAR images
is presented. Some more clustering techniques for satellite image
segmentation can be found elsewhere (Hilbert, 1977; Kauth et al.,
1977; Ghassemian and Landgrebe, 1988; Chen and Landgrebe,
1989; Sayood, 1992; Friedl et al., 2002; Pugh and Waxman, 2006;
Chamundeeswari et al., 2007; Guo et al., 2009; Gandhi et al.,
accepted; Jaffar et al., accepted).
Fully automatic brain tissue classification from magnetic reso-
nance images (MRI) is also an important research issue. The accu-
rate segmentation of MR images into different tissue classes, likegray matter (GM), white matter (WM), and cerebrospinal fluid
(CSF), is an important task. Additionally, regional volume calcula-
tions sometimes also bring even more useful diagnostic information
in diseases like Alzheimer disease, in movement disorders such as
Parkinson or Parkinson related syndrome, in white matter metabolic
or inflammatory disease, in congenital brain malformations or peri-
natal brain damage, or in post traumatic syndrome. The automatic
segmentation of brain MR images, however, remains a difficult
problem. Clustering approaches have been widely used for segmen-
tation of MR brain images. The use of neural networks, evolution-
ary computation and/or fuzzy clustering techniques for MR image
segmentation has been investigated in (Suckling et al., 1999; Bhan-
darkar and Zhang, 1999; Saha and Bandyopadhyay, 2007, 2009,
accepted).
To cluster a data set, some similarity or dissimilarity criteria has
to be defined. A new type of nonmetric distance, based on point
symmetry, (d ps), is proposed in (Bandyopadhyay and Saha, 2007).
For reducing the complexity of point symmetry distance computa-
tion, Kd-tree based data structure is used. From the geometrical
symmetry viewpoint, point symmetry and line symmetry are two
widely discussed issues. Inspired by this, a line symmetry based dis-
tance was proposed in (Saha and Bandyopadhyay, 2007). But the
proposed distance had several drawbacks. The major shortcoming
of the old line symmetry based distance was that its application is
limited to two-dimensional data sets only. A new line symmetry
based technique using principal component analysis is developed in
(Saha and Bandyopadhyay, accepted) that removes the limitationsof (Saha and Bandyopadhyay, 2007). This new line symmetry based
distance is more general and is applicable for any dimensional data
sets. In (Saha and Bandyopadhyay, accepted), a genetic clustering
technique is also developed based on this line symmetry distance.
But this technique a priori assumes the number of clusters present
in a data set. In this article, we have developed a line symmetry
based automatic clustering technique. The motivation of this article
is to develop a line symmetry-based automatic genetic clustering
technique for image segmentation.
In this article, a variable string length genetic line symmetry
(VGALS-clustering) based clustering technique is proposed which
is then used to automatically segment the remote sensing satellite
images. Centers of the clusters are encoded in the chromosome
whose values vary over a certain range. Here assignment of pointsto different clusters is done based on a newly proposed line symme-
try based distance rather than the Euclidean distance. A new cluster
validity index based on the newly developed line-symmetry based
distance, LineSym-index, is proposed here and thereafter it is uti-
lized for computing the fitness of the chromosomes. The effective-
ness of the proposed line symmetry-based automatic clustering
technique (VGALS-clustering) is shown for automatically partition-
ing Indian remote sensing (IRS) satellite images of the parts of the
city of Kolkata and SPOT image of the part of the city of Kolkata.
Comparisons are made with those obtained by Fuzzy C-means clus-
tering technique, popular Mean-shift based segmentation technique
(Comaniciu and Meer, 2002) and GAPS clustering with Sym-index
based method (Bandyopadhyay and Saha, 2007). The segmentation
results are compared qualitatively and quantitatively.
The effectiveness of the proposed algorithm is thereafter
shown in segmenting the MR images of the normal brain and MR
brain images with multiple sclerosis lesions. The segmentation
results are then compared with the available ground truth informa-
tion. For the purpose of comparison, the well-known Fuzzy C-means algorithm (Bezdek, 1973) and the Expectation Maximiza-
tion (EM) (Jain et al., 1999) clustering algorithm are also exe-
cuted, firstly with the number of clusters automatically deter-
mined by the VGALS clustering and then with the actual number
of clusters present in the images. The segmentation results are
compared with that provided by VGALS clustering algorithm
quantitatively.
In a part of the experiment, fuzzy variable string length genetic
algorithm (Fuzzy-VGA) (Maulik and Bandyopadhyay, 2003) which
uses the Euclidean distance for computing the membership values
of points to different clusters is also executed on the MR brain
images to automatically segment it. The results are also compared
with those obtained by VGALS clustering technique. compared
qualitatively and quantitatively.
II. SYMMETRY BASED DISTANCES
In this section, at first the existing point symmetry based distance is
described in brief. Then a new definition of the newly developed
line symmetry based distance is proposed.
A. Existing Point Symmetry Based Distance. In this section,
a new PS distance (Bandyopadhyay and Saha, 2007), d ps( x,c), asso-
ciated with point x with respect to a center c is described. Let a
point be x. The symmetrical (reflected) point of x with respect to a
particular centre c is 2 3 c 2 x. Let us denote this by x*. Let knear
unique nearest neighbors of x* be at Euclidean distances of d is, i 5
1,2,. . .knear . Then
d psð x; cÞ ¼ d symð x; cÞ3d eð x; cÞ; ð1Þ
¼
Pknear i¼1 d i
knear 3d eð x; cÞ; ð2Þ
where d e( x,c) is the Euclidean distance between the point x and c.
The complexity of computing d ps( x,c) is of order n, where n is
the total number of data points. For all the n points and K clusters,
the complexity becomes of order n2 K . Thus, to reduce this, the Kd-
tree based nearest neighbor search ANNlib (Mount and Arya, 2005)
(Approximate Nearest Neighbor), which is a library written in
C11 (obtained from http://www.cs.umd.edu/ $mount/ANN) isused in (Bandyopadhyay and Saha, 2007). ANNlib is used to find
d i, i 5 1 to knear , in Eq. (2) efficiently.
B. Newly Developed Line Symmetry Based Distance
(Saha and Bandyopadhyay, accepted). Given a particular
data set, we first find the first principal axis of this data set using
Principal Component Analysis (Jolliffe, 1986). Let the eigen vec-
tor of the co-variance matrix of the data set with highest eigen
value be [eg1 eg2 eg3 eg4 . . . egd ], where d is the dimension of the
Vol. 21, 86–100 (2011) 87
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 3/16
original data. Then the first principal axis of the data set is
given by:
ð x1 À c1Þ
eg1
¼ð x2 À c2Þ
eg2
¼ . . . ¼ð xd À cd Þ
egd
where the center of the data set is c 5 {c1, c2, . . ., cd }.
The obtained principal axis is treated as the symmetrical line of
the relevant cluster, i.e., if the data set is indeed symmetrical then it
should also be symmetric with respect to the first principal axis of the dataset identified by the principal component analysis (PCA).
This symmetrical line is used to measure the amount of line symme-
try of a particular point in that cluster. To measure the amount of
line symmetry of a point ( x) with respect to a particular line i, d ls( x,
i), the following steps are followed.
1. For a particular data point x, calculate the projected point pi
on the relevant symmetrical line i.
2. Find d sym( x, pi) as:
:d symð x; piÞ ¼
Pknear i¼1 d i
knear ð3Þ
where knear unique nearest neighbors of x* 5 2 3 pi 2 x are at
Euclidean distances of d is, i 5 1,2,. . .knear . ANN library
(Mount and Arya, 2005) utilizing Kd-tree based nearest neigh-
bor search is used to reduce the complexity of computing these
d is (as described in Section II-C). Then the amount of line sym-
metry of a particular point x with respect to the symmetrical line
of cluster i, is calculated as:
:d lsð x; iÞ ¼ d symð x; piÞ3d eð x; cÞ ð4Þ
where c is the centroid of the particular cluster i and d e ( x,c) is
the Euclidean distance between the point x and c.
It can be seen from Eq. (3) that knear cannot be chosen equal to 1,
since if x* exists in the data set then d sym( x, pi) 5 0 and hence there
will be no impact of the Euclidean distance in the definition of d ls( x,
i). On the contrary, large values of knear may not be suitable because
it may underestimate the amount of symmetry of a point with respect
to the first principal axis. Here knear is chosen equal to 2. It may be
noted that the proper value of knear largely depends on the distribu-
tion of the data set. A fixed value of knear may have many draw-
backs. For instance, for very large clusters (with too many points), 2
neighbors may not be enough as it is very likely that a few neighbors
would have a distance close to zero. On the other hand, clusters with
too few points are more likely to be scattered, and the distance of the
two neighbors may be too large. Thus, a proper choice of knear is an
important issue that needs to be addressed in the future.
Note that every point symmetric cluster is also line symmetricwith respect to its central axis. Thus, the proposed distance is able to
detect line symmetric as well as point symmetric clusters. The previ-
ous point symmetry based distance was only able to measure the
amount of point symmetry of a point with respect to a cluster center.
But the proposed line symmetry based distance measures total line
symmetry with respect to the symmetrical line of a cluster.
The above line symmetry based distance is realized more clearly
from Figure 1. Here x is a particular data set. c is the center of a par-
ticular cluster. The first principal axis of the given cluster is denoted
by Line l. This line is treated as the symmetrical line of the particu-
lar cluster. The projected point of x on this symmetrical line is
denoted by p. Let the reflected point of x with respect to p be
denoted by x0. The first two nearest neighbors (here knear is chosen
equal to 2) of x0 are at Euclidean distances of d 1 and d 2, respec-
tively. Then the total amount of symmetry of x with respect to the
projected point p on line Line l is calculated as follows:
d symð x; pÞ ¼ d 1þd 22
. Therefore, the total line symmetry based distance
of the point x with respect to the symmetrical line of cluster l is cal-
culated as d ls( x,l) 5 d sym( x, p) 3 d e( x, c) where d e( x, c) is the Eu-
clidean distance between the point x and the cluster center c.
It is evident that the symmetrical distance computation is very
time consuming because it involves the computation of the nearest
neighbors. Computation of d ls( x,i) is of complexity O( N ). Hence for
N points and K clusters, the complexity of computing the line sym-
metry based distance between all points to different clusters is
O( N 2 K ). To reduce the computational complexity, an approximate
nearest neighbor search using the Kd-tree approach is adopted in
this article.
C. Kd-tree Based Nearest Neighbor Computation. A K -
dimensional tree, or Kd-tree is a space-partitioning data structure
for organizing points in a K-dimensional space. A Kd-tree uses
only those splitting planes those are perpendicular to one of the
coordinate axes. Approximate Nearest Neighbor (ANN) is a library
written in C11 (Mount and Arya, 2005), which supports data
structures and algorithms for both exact and approximate nearest
neighbor searching in arbitrarily high dimensions. In this article
ANN library utilizing Kd-tree for nearest neighbor search is used to
find d is, where i 5 1,. . . , knear , in Eq. (3) efficiently. Thus, it
requires the construction of a Kd-tree consisting of N points in the
data set, where N is the size of the data set. The construction of Kd-
tree requires O( N log N ) time and O( N ) space (Anderberg, 2000).
(Friedman et al., 1977) reported O(log N ) expected time for findingthe nearest neighbor using Kd-tree.
III. VGALS-CLUSTERING: VARIABLE STRING LENGTH
GENETIC LINE SYMMETRY DISTANCE BASED
CLUSTERING TECHNIQUE
In this section, the use of variable string length genetic algorithm
using the newly developed line symmetry based distance (VGALS-
clustering) is proposed for automatically evolving the number of
clusters present in a data set. Here we have considered the best
Figure 1. Example of line symmetry based distance [color figure
can be viewed in the online issue, which is available at
wileyonlinelibrary.com.].
88 Vol. 21, 86–100 (2011)
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 4/16
partition to be the one that corresponds to the maximum value of
the proposed LineSym-index which is defined later. Both the num-
ber of clusters as well as the appropriate partitioning of the data are
evolved simultaneously using the search capability of genetic algo-
rithms. Since the number of clusters is considered to be variable,
the string lengths of different chromosomes in the same population
are allowed to vary. As a consequence, the crossover and mutation
operators are suitably modified to tackle the concept of variable
length chromosomes. The technique is described below in detail.
A. String Representation and Population Initialization. In
VGALS-clustering, the chromosomes are made up of real numbers
which represent the coordinates of the centers of the partitions. If
chromosome i encodes the centers of M i clusters in N dimensional
space then its length li is taken to be N * M i. Each center is consid-
ered to be indivisible. Each string i in the population initially enco-
des the centers of a number, M i, of clusters, such that M i 5
(rand()mod M *) 1 2. Here, rand() is a function returning an integer,
and M * is a soft estimate of the upper bound of the number of clus-
ters. The number of clusters will therefore range from two to M * 1
1. The M i centers encoded in a chromosome are randomly selected
distinct points from the data set. Thereafter five iterations of the K -
means algorithm is executed with the set of centers encoded in each
chromosome. The resultant centers are used to replace the centersin the corresponding chromosomes. This makes the centers sepa-
rated initially.
Example. Let M * 5 10. Let the random number M i be equal to 4
for chromosome i. Then this chromosome will encode the centers
of 4 clusters. Let the four cluster centers (four randomly chosen
points from the data set) be (10.0, 5.0) (20.4, 13.2) (15.8, 2.9) (22.7,
17.7). Thus the chromosome may look like (20.4, 13.2) (15.8, 2.9)
(10.0, 5.0) (22.7, 17.7).
B. Fitness Computation. This is composed of two steps. First,
assignment of n points to different clusters are done by using the
newly developed line symmetry based distance, d ls. Next, a new
cluster validity index based on the newly developed line symmetrydistance, LineSym-index, is computed and used as a measure of the
fitness of the chromosome.
(1) Computing the membership values. Let a particular chromo-
some encode centers of K number of clusters. At first, the first princi-
pal axis of each cluster is determined using principal component
analysis (Jolliffe, 1986). Then the assignment of each point x j , j 5
1,2, . . . N , to K different clusters are done in the following way. Find
the cluster center nearest to x j in the line symmetrical sense. That is,
we find the cluster center k that is nearest to the input pattern x j using
the minimum-value criterion: k 5 argmini51,. . . K d ls( x j,i) where the
line symmetry based distance d ls( x j,i) is computed by Eq. (4). If the
corresponding d sym( x j , k ) [as defined in Eq. (3)] is smaller than a pre-
specified parameter y, then assign that particular point x j to k th clus-
ter. Otherwise assignment is done based on the minimum Euclideandistance criterion as normally used in (Bandyopadhyay and Maulik,
2002) or the K -means algorithm, i.e., assign x j to k th cluster where k
5 argmini51, . . . K d e( x j , ci). Here, ci denotes the center of the ith
cluster and d e( x j , ci) denotes the Euclidean distance between the point
x j and the cluster center ci. The reason for doing such an assignment
is as follows: in the intermediate stages of the algorithm, when the
centers are not yet properly evolved, then the minimum d ls value
for a point is expected to be quite large, since the point might not
be symmetric with respect to any cluster. In such cases, using
Euclidean distance for cluster assignment appears to be intuitively
more appropriate.
The value of y is kept equal to the maximum nearest neighbor
distance among all the points in the data set as described in (Ban-
dyopadhyay and Saha, 2007, 2008). It is to be noted that if a point
is indeed symmetric with respect to the principal axis of some clus-
ter center then the symmetrical distance computed in the above way
will be small, and can be bounded as follows. Let d NN max be the
maximum nearest neighbor distance in the data set. That is
d max NN ¼ maxi¼1;... N d NN ð xiÞ; ð5Þ
where d NN ( xi) is the nearest neighbor distance of xi. Assuming that
x* lies within the data space, it may be noted that
d 1 d max
NN
2and d 2
3d max NN
2; ð6Þ
resulted in, d 1þd 22
d max NN : Ideally, a point x is exactly symmetrical
with respect to the principal axis of some cluster if d 15 0. However
considering the uncertainty of the location of a point as the sphere
of radius d NN max around x, we have kept the threshold y equals to
d NN max. Thus the computation of y is automatic and does not require
user intervention.After the assignments are done, the cluster centers encoded in
the chromosome are replaced by the mean points of the respective
clusters. This is referred to as the K -means like update center
operation.
(2) Fitness calculation: The fitness of a chromosome is com-
puted using a newly defined cluster validity index, LineSym-index
Note that this index is inspired by the point symmetry based cluster
validity index Sym-index (Saha and Bandyopadhyay, 2008; Ban-
dyopadhyay and Saha, 2008). Let K cluster centres be denoted by ci
where 1 i K and ni denotes the number of points present in the
i th cluster. Then LineSym-index is defined as follows:
LineSymð K Þ ¼1
K 3
1
E K
3 D K
; ð7Þ
where K is the number of clusters present in that chromosome. Here,
E K ¼P K
i¼1
Ei such that Ei ¼Pni
j ¼1 d lsð xi j ; iÞ and D K ¼ max K
i; j ¼1 ci À c j :
D K is the maximum Euclidean distance between any two cluster
centers among all centers. Here x j i denotes the j th point of the ith
cluster. d ls( x j i, i) is computed by Eq. (4) where i denotes the symmet-
rical line (first principal axis) of the ith cluster.
The objective is to maximize the LineSym-index in order to
obtain the actual number of clusters and to achieve proper cluster-
ing. As formulated in Eq. (7), LineSym is a composition of three
factors, these are 1/ K , 1=E K and D K . The first factor increases as K
decreases; as LineSym needs to be maximized for optimal cluster-ing, it will prefer to decrease the value of K . The second factor is
the within cluster total line symmetrical distance. For clusters which
have good symmetrical structure, Ei value is less. This, in turn, indi-
cates that formation of more number of clusters, which are symmet-
rical in shape, would be encouraged. Finally the third factor, D K ,
measuring the maximum separation between a pair of clusters,
increases with the value of K . As these three factors are comple-
mentary in nature, so they are expected to compete and balance
each other critically for determining the proper partitioning.
Vol. 21, 86–100 (2011) 89
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 5/16
The fitness function for chromosome j is defined as LineSym j ,
i.e., the LineSym -index computed for partitioning encoded in that
chromosome. The objective of the GA is to maximize this fitness
function.
C. Selection. Conventional proportional selection is applied on
the population of chromosomes. Here, a chromosome receives a
number of copies that is proportional to its fitness in the population.
We have used roulette wheel strategy for implementing the propor-
tional selection scheme.
D. Crossover. For the purpose of crossover, the cluster centers
are considered to be indivisible, i.e., the crossover points can only
lie in between two cluster centers. The crossover operation, applied
stochastically with probability of crossover (lc), must ensure that
information exchange takes place in such a way that both the off-
spring encode the centers of at least two clusters. For this, the oper-
ator is defined as follows (Maulik and Bandyopadhyay, 2003): Let
parent chromosomes P1 and P2 encode M 1 and M 2 cluster centers,
respectively. s1, the crossover point in P1, is generated as s1
5rand() mod M 1. Let s2 be the crossover point in P2, and it may
vary in between [LB(s2),UB(s2)], where LB(s2) and UB(s2) indicate
the lower and upper bounds of the range of s2, respectively. LB(s2)
and UB(s2) are given by LB (s2) 5 min[2,max[0,2 2 ( M 1 2 s1)]]and UB (s2) 5 [ M 22 max[0,22 s1]]. Therefore s2 is given by
s2 ¼ LBðs2Þ þ randðÞmodðUBðs2Þ À LBðs2ÞÞ
if ðUBðs2Þ ! LBðs2ÞÞ;
¼ 0 otherwise:
It can be verified by some simple calculations that if the crossover
points s1 and s2 are chosen according to the above rules, then none
of the offsprings generated would have less than two clusters.
E. Mutation. Mutation is applied on each chromosome with prob-
ability lm. Mutation is of three types.
1. Each cluster center encoded in a chromosome is mutated
with probability lm in the following way. The cluster center
is replaced with a random variable drawn from a Laplacian
distribution, pðeÞ / eÀjeÀljd , where the scaling factor d sets the
magnitude of perturbation. Here l is the value at the position
which is to be perturbed. The scaling factor d is chosen equal
to 1.0. The old value at the position is replaced with the
newly generated value.
2. One randomly generated cluster center is removed from the
chromosome, i.e., the total number of clusters encoded in the
chromosome is decreased by 1.
3. The total number of clusters encoded in the chromosome is
increased by 1. One randomly chosen point from the data set
is encoded as the new cluster center.
Any one of the above mentioned types of mutation is applied
randomly on a particular chromosome if it is selected for mutation.
F. Termination. In this article, we have executed the algorithm
for a fixed number of generations. Moreover, the elitist model of
GAs has been used, where the best string seen so far is stored in a
location within the population. The best string of the last generation
provides the solution to the clustering problem.
G. Complexity Analysis of VGALS Clustering
Technique. Below we have analyzed the time complexity of the
proposed VGALS clustering technique. Here N : total number of
points in the data set, M * : Maximum possible number of clusters
and d : dimension of the data.
As discussed above Kd-tree data structure has been used in
order to find the nearest neighbor of a particular point. The
construction of Kd-tree requires O( N log N ) time and O( N )
space (Anderberg, 2000). Initialization of GA needs O(Popsize 3 stringlength) time
where Popsize and stringlength indicate the population size
and the length of each chromosome in the GA, respectively.
Note that stringlength is O( M * 3 d ) where d is the dimension
of the data set and M * is the soft estimate of the upper bound
of the number of clusters.
Fitness Computation is composed of three steps.
1. In order to find membership values of each point to all clus-
ter centers minimum line symmetrical distance of that point
with respect to all clusters have to be calculated. At first
Principal Component Analysis (PCA) is applied to detect the
first Principal axis. This will take O( N 3 d 2) time. There-
after this first Principal axis is used as the symmetrical line
of the respective cluster. In order to determine line symmetrybased distance, Kd-tree based nearest neighbor search is
used. If the points are roughly uniformly distributed, then
the expected case complexity is O(cd 1 log N ), where c is a
constant depending on dimension and the point distribution.
This is O(log N ) if the dimension d is a constant (Bentley
et al., 1980). (Friedman et al., 1977) also reported O(log N )
expected time for finding the nearest neighbor. So in order
to find minimal symmetrical distance of a particular point,
O( M *log N ) time is needed. Thus total complexity of com-
puting membership values of N points to M * clusters is
O( M * N log N ).
2. For updating the centers total complexity is O( M *).
3. Total complexity for computing the fitness values is O( N
3 M *).So the fitness evaluation has total complexity5 O(Popsize 3
M * N log N ).
Selection step of the GA requires O(Popsize3 stringlength) time.
Mutation and Crossover require O(Popsize 3 stringlength)
time each.
Thus summing up the above complexities, and considering string-
length ( N , total time complexity becomes O( M * N log N 3 Popsize)
per generation. For maximum Maxgen number of generations total
complexity becomes O( M * N log N 3 Popsize 3 Maxgen).
H. Space Complexity Analysis. The major space requirement
of VGALS clustering is due to its population. Thus, the total space
complexity of VGALS clustering is O(Popsize 3 Stringlength), i.e.,
O(Popsize 3 d 3 M *). Also for each population we have to keep a
membership matrix of size N 3 M *. Thus total space compexity
will be O(Popsize3 N 3 M *) as N ) d .
IV EXPERIMENTAL RESULTS
A. Results on Satellite Images. In this section at first, the ex-
perimental results obtained after application of the above mentioned
90 Vol. 21, 86–100 (2011)
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 6/16
VGALS-clustering technique for segmenting two remote sensing
satellite images of the parts of the cities of Kolkata are provided.
The two satellite images are of sizes 512 3 512, i.e., the size of the
data set to be clustered in all the images is 262,144. For these multi-spectral satellite images, the feature vector is composed of the in-
tensity values at different bands of the image. The parameters of the
proposed algorithm are as follows: population size520, number of
generations520, probability of crossover 50.8 and probability of
mutation50.2. For the purpose of comparison, popular Fuzzy C-
means (FCM) (Bezdek, 1981) clustering algorithm, a recent method
of satellite image segmentation proposed in (Bandyopadhyay and
Saha, 2007) (GAPS clustering with Sym-index based method),
mean-shift based segmentation technique (Comaniciu and Meer,
2002) are also executed on real-life images. The results are com-
pared both qualitatively and quantitatively.
B. IRS Image of Kolkata. The data used here was acquired from
Indian Remote Sensing Satellite (IRS-1A) using the LISS-II sensor
that has a resolution of 36.25 3 36.25 h. The image is contained in
four spectral bands namely, blue band of wavelength 0.45–0.52 lm,
green band of wavelength 0.52–0.59 lm, red band of wavelength
0.62 – 0.68 lm, and near infra red band of wavelength 0.77 – 0.86
lm. Thus, the feature vector of each image pixel is composed of
four intensity values at different bands. The distribution of the pix-
els in the first three feature space of this image is shown in Figure
2. It can be easily seen from the Figure 2 that the entire data can be
partitioned into several different shaped clusters where symmetry
does exist.
Figure 3 shows the Kolkata image in the near infra red band.
Some characteristic regions in the image are the river Hooghly cut-
ting across the middle of the image, several fisheries observed
Figure 3. IRS image of Kolkata in the near infra red band with histo-
gram equalization.
Figure 2. Data distribution of IRS image of Kolkata in the first three
feature space.
Figure 4. Clustered IRS image of Kolkata using VGALS-clustering
technique.
Figure 5. Clustered IRS image of Kolkata using FCM clustering.
Vol. 21, 86–100 (2011) 91
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 7/16
toward the lower-right portion, a township, SaltLake, to the upper-
left hand side of the fisheries. This township is bounded on the top
by a canal. Two parallel lines observed towards the upper right-
hand side of the image correspond to the airstrips in the Dumdum
airport. Other than these, there are several water bodies, roads, etc.
in the image. From our ground knowledge, we know that the image
has four clusters (Maulik and Bandyopadhyay, 2003) and these four
clusters correspond to the classes turbid water, pond water, concrete
and open space.
The VGALS-clustering technique automatically provides four
clusters for this image data (Fig. 4). It may be noted that the water
class has been differentiated into turbid water (the Hooghly) and
pond water (fisheries, etc.) because of a difference in their spectral
properties. The canal bounding SaltLake from the upper portion has
also been correctly classified as pond water. Figure 5 shows the
Kolkata image partitioned in four clusters using FCM algorithm
(Bezdek, 1973). But the segmentation result is unsatisfactory from
the human visualization judgement. As can be seen, the river
Hooghly as well as the city region has been incorrectly classified as
belonging to the same class. Therefore we can conclude that
although some regions, viz., fisheries, canal bounding SaltLake,
parts of the airstrip, etc., have been correctly identified, a significant
amount of confusion is evident in the FCM clustering result. GAPS
clustering technique with Sym index based method automatically
provides K 5 4 number of clusters from this data set. The corre-
sponding partitioning is shown in Figure 7. The automatically seg-
mented image after application of the mean-shift based segmenta-
tion technique (Comaniciu and Meer, 2002) is shown in Figure 6.
To validate the segmentation result obtained by VGALS-cluster-
ing quantitatively, here two well-known Euclidean distance based
cluster validity indices, namely I index (Maulik and Bandyopad-
hyay, 2002) and XB-index (Xie and Beni, 1991) values are also
computed. These are provided in Table I. Smaller values of XB-
index and larger values of I index correspond to good clustering.
The values again show that the segmentation provided by VGALS-
clustering is much better than the existing other methods.
C. SPOT Image of Kolkata. The French satellites SPOT (Sys-
tems Probataire d’Observation de la Terre) (Richards, 1993),
launched in 1986 and 1990, carry two imaging devices that consist
of a linear array of charge coupled device (CCD) detectors. Two
imaging modes are possible, the multispectral and panchromatic
modes. The 512 3 512 SPOT image of a part of the city of Kolkata
is available in three bands in the multispectral mode. These bands
are:
Band 1 — green band of wavelength 0.50 – 0.59 lm
Band 2 — red band of wavelength 0.61 – 0.68 lm
Band 3 — near infra red band of wavelength 0.79 – 0.89 lm.
Thus, here feature vector of each image pixel composed of three
intensity values at different bands. The distribution of the pixels in
the feature space of this image is shown in Figure 9. It can be easily
seen from the Figure 9 that the entire data can be partitioned intoseveral hyperspherical clusters.
Figure 7. Clustered IRS image of Kolkata using GAPS clustering
technique with Sym-index based method.
Figure 6. Clustered IRS image of Kolkata using mean-shift based
clustering technique.
Table 1. I index and XB-index values of the segmented Mumbai and Kolkata satellite images provided by VGALS-clustering, FCM-
clustering, and method proposed in (Bandyopadhyay and Saha, 2007)
Index
Kolkata IRS Mumbai IRS SPOT Kolkata
VGALS FCM GAPS with Sym VGALS FCM GAPS with Sym VGALS FCM GAPS with Sym
I index 24.24 5.71 18.27 214.78 23.06 180.45 26.58 5.71 20.67
XB index 1.75 23.67 2.23 2.12 4.67 2.91 2.28 12.23 3.05
92 Vol. 21, 86–100 (2011)
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 8/16
Some important landcovers of Kolkata are present in the image.
Most of these can be identified, from a knowledge about the area,
more easily in the near infra-red band of the input image (Fig. 8).
These are the following: The prominent black stretch across the fig-
ure is the river Hooghly. Portions of a bridge (referred to as the sec-
ond bridge), which was under construction when the picture was
taken, protrude into the Hooghly near its bend around the center of
the image. There are two distinct black, elongated patches below
the river, on the left side of the image. These are water bodies, the
one to the left being Garden Reach lake and the one to the right
being Khidirpore dockyard. Just to the right of these water bodies,
there is a very thin line, starting from the right bank of the river,
and going to the bottom edge of the picture. This is a canal called
the Talis nala. Above the Talis nala, on the right side of the picture,there is a triangular patch, the race course. On the top, right hand
side of the image, there is a thin line, stretching from the top edge,
and ending on the middle, left edge. This is the Beleghata canal
with a road by its side. There are several roads on the right side of
the image, near the middle and top portions. These are not very
obvious from the images. A bridge cuts the river near the top of the
image. This is referred to as the first bridge.
The proposed VGALS clustering method automatically provides
K 5 7 as the optimal number of clusters for this image data set (cor-
responding partitioning is shown in Figure 10). As identified in (Pal
et al., 2001) the above satellite image has seven classes namely, tur-
bid water, concrete, pure water, vegetation, habitation, open space
and roads (including bridges). The partitioning provided by the
Figure 9. Data distribution of SPOT image of Kolkata in the feature
space.
Figure 8. SPOT image of Kolkata in the near infra red band with
histogram equalization.
Figure 10. Clustered SPOT image of Kolkata using VGALS cluster-
ing technique.
Figure 11. Clustered SPOT image of Kolkata using FCM clustering
technique.
Vol. 21, 86–100 (2011) 93
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 9/16
VGALS clustering technique separates almost all the regions well.
The Talis nala has been identified properly by the proposed method
(shown in Figure 10). The bridge is also correctly identified by the
proposed algorithm. This again shows that the proposed VGALS is
able to detect clusters of widely varying sizes. The segmentation
result obtained by Fuzzy C-means algorithm on this image for K 5
7 (actual number of clusters present in this image data set) is shown
in Figure 11. It can be seen from Figure 11 that FCM algorithm is
not able to detect the bridge. The automatically segmented image
obtained after application of mean-shift based segmentation tech-
nique (Comaniciu and Meer, 2002) is shown in Figure 12. Here
also the bridge has not been detected. The method proposed in
(Saha and Bandyopadhyay, 2008) (GAPS along with Sym-index)
provides K 5 6 clusters from this data set. The corresponding parti-
tioning is shown in Figure 13. This technique is also able to detect
the bridge correctly.
In order to validate the segmentation result obtained by
VGALS-clustering quantitatively, here two well-known Euclidean
distance based cluster validity indices, namely I index (Maulik and
Bandyopadhyay, 2002) and XB-index (Xie and Beni, 1991) valuesare also computed. These are provided in Table I. Smaller values of
XB-index and larger values of I index correspond to good cluster-
ing. The values again show that the segmentation provided by
VGALS-clustering is much better than the existing clustering
techniques.
To validate the results, 932 pixel positions were manually
selected from seven different land cover types which were labelled
accordingly. The confusion matrix of the partitioning obtained by
VGALS clustering technique for this data set is shown in Table II.
The class by class classification accuracies are 87, 83, 82, 83, 80,
81, and 89, respectively. The overall accuracy is 87.
D. Results on MR Brain Images. The MR images of the brain
chosen for the experiments are available in three bands: T 1 -weighted, proton density ( pd)-weighted and T 2 -weighted. The nor-
mal brain images are obtained from Brainweb database
(BrainWeb). The images correspond to the 1 mm slice thickness,
3% noise (calculated relative to the brightest tissue) and with 20%
intensity nonuniformity. The image of size 217 3 181 is available
in 181 different z planes. The proposed clustering algorithm is exe-
cuted on seven of these z planes. The parameters of the VGALS
Figure 12. Clustered SPOT image of Kolkata using mean-shift
based clustering technique.
Figure 13. Clustered SPOT image of Kolkata using GAPS cluster-
ing technique with Sym-index based method.
Table II. Confusion matrix of the partitioning obtained by VGALS cluster-
ing technique for numeric SPOT Kolkata image. Here following notations
are used: TW: turbid water, C: concrete, PW: pure water, V: vegetation,
H: habitation, OS: open space and R: roads
Ground Truth (Percent)
Class TW C PW V H OS R
TW 87 0 13 0 0 0 0
C 0 83 0 0 0 0 1
PW 13 0 82 0 0 0 5
V 0 0 0 83 15 3 0
H 0 12 0 10 80 10 1
OS 0 0 0 7 4 81 4
R 0 5 5 0 1 6 89
Table III. Minkowski Scores (MS) Obtained by FCM, EM and VGALS
clustering algorithms on simulated MR volumes for normal brain projected
on different z planes. Here # AC, # OC denotes, respectively, the actual
number of clusters and the automatically obtained number of clusters (after
application of VGALS).
z Plane No. #AC
MS for AC
#OC
MS for OC
FCM EM FCM EM VGALS
1 6 1.077 1.052 11 0.80 1.17 0.77
2 6 0.76 0.78 9 0.65 0.83 0.62
3 6 0.57 0.76 8 0.62 0.64 0.59
36 9 0.90 0.98 9 0.90 0.98 0.84
72 10 0.75 0.74 10 0.75 0.74 0.70
108 9 0.79 0.589 10 0.81 0.68 0.701
144 9 0.82 0.72 11 0.49 0.60 0.32
94 Vol. 21, 86–100 (2011)
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 10/16
algorithm are as follows: population size520, total number of gen-
erations515, probability of crossover (lc)50.9, probability of
mutation (lm)50.02. Number of clusters, K , is varied from 2 to 20.
For the normal MR brain image, the ground truth information isavailable to us. There are a total of 10 classes present in the images.
But the number of classes varies along the different z planes. Ten
classes are Background, CSF, Grey Matter, White Matter, Fat, Mus-
cle/Skin, Skin, Skull, Glial Matter, and Connective. Table III shows
the actual number of clusters and the number of clusters automati-
cally determined by the proposed VGALS clustering technique (af-
ter application on the above mentioned brain images projected on
different z planes). To measure the segmentation solution quantita-
tively, we have also calculated Minkowski Score (MS) (Ben-Hur
and Guyon, 2003). This is a measure of the quality of a solution
given the true clustering. Let T be the ‘‘true’’ solution and S the so-
lution we wish to measure. Denote by n11 the number of pairs of
elements that are in the same cluster in both S and T. Denote by n01
the number of pairs that are in the same cluster only in S and not inT, and by n10 the number of pairs that are in the same cluster in T
and not in S. Minkowski Score (MS) is then defined as:
MSðT ; SÞ ¼ ffiffiffiffiffiffiffiffiffiffiffiffi
n01þn10
n11þn10
q . For MS, the optimum score is 0, with lower
scores being ‘‘better.’’ The MS scores obtained by VGALS cluster-
ing corresponding to the 7 brain images are also reported in Table
III. For the purpose of comparison, we have executed Fuzzy C-means (FCM) (Bezdek, 1973) and Expectation Maximization (EM)
(Jain et al., 1999) algorithms on the above mentioned brain datasets
with two different values of K . In the first case, K is kept equal to the
actual number of clusters that present in that particular plane. Next, it
is set equal to that automatically determined by VGALS algorithm.
The MS scores obtained by both the comparing algorithms are also
reported in Table III for all the seven images. Results show that the
MS scores corresponding to the partitionings provided by the
VGALS clustering, in general, is the minimum among all the parti-
tions. This implies the superior performance of VGALS to automati-
cally detect the proper partitioning from MR normal brain images.
Figures 14a, 15a, 16a, 17a, and 18a show the original MR normal
brain images in T1 band projected on z1, z36, z72, z108, z144
planes, respectively. Figures 14b, 15b, 16b, 17b, and 18b show,respectively, the corresponding automatically segmented images
obtained after application of VGALS clustering algorithm.
Figure 14. (a) Original T1-weighted MR image of the normal brain in z1 plane. (b) Segmentation obtained by VGALS clustering technique.
Figure 15. (a) Original T1-weighted MR image of the normal brain in z36 plane. (b) Segmentation obtained by VGALS clustering technique.
Vol. 21, 86–100 (2011) 95
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 11/16
Figure 17. (a) Original T1-weighted MR image of the normal brain in z108 plane. (b) Segmentation obtained by VGALS clustering technique.
Figure 18. (a) Original T1-weighted MR image of the normal brain in z144 plane. (b) Segmentation obtained by VGALS clustering technique.
Figure 16. (a) Original T1-weighted MR image of the normal brain in z72 plane. (b) Segmentation obtained by VGALS clustering technique.
96 Vol. 21, 86–100 (2011)
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 12/16
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 13/16
Figure 20. (a) Original T1-weighted MRI image of the brain with multiple sclerosis lesions in z36 plane. (b) Segmentation obtained by VGALS
clustering technique.
Figure 21. (a) Original T1-weighted MRI image of the brain with multiple sclerosis lesions in z72 plane. (b) Segmentation obtained by VGALS
clustering technique.
Figure 22. (a) Original T1-weighted MRI image of the brain with multiple sclerosis lesions in z108 plane. (b) Segmentation obtained by VGALS
clustering technique.
98 Vol. 21, 86–100 (2011)
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 14/16
lesions are also provided in Table V. Results show that the proposed
VGALS clustering technique is more effective than the Fuzzy-
VGA. This establish the fact that the line symmetry based distanceis more effective in segmenting the MR brain images than the exist-
ing Euclidean distance.
V. DISCUSSION AND CONCLUSION
In this article, a new variable string length genetic algorithm based
clustering technique is developed which utilizes a recently developed
line symmetry based distance for assignment of points to different
clusters. A new cluster validity index based on the line symmetry
based distance is also developed here and thereafter it is utilized for
computing the fitness of the proposed genetic clustering technique.
The proposed clustering technique can automatically determine the
appropriate partitioning and the appropriate number of partitions
from a given data set having line symmetrical clusters. The effective-
ness of the proposed technique is shown in detecting the proper parti-
tioning from two remote sensing satellite images of the parts of the
cities of Kolkata. Results are compared with those obtained by the
Fuzzy C-means clustering technique, Mean-shift based method and
GAPS-clustering with Sym-index based method (Bandyopadhyay
and Saha, 2007). Thereafter, the effectiveness of the proposed algo-
rithm is shown in segmenting several MR brain images. The segmen-
tation results are then compared with the available ground truth infor-
mation. For the purpose of comparison, the well-known Fuzzy C-means and EM algorithms are also executed on these images. Experi-
mental results show that VGALS clustering is not only able to auto-
matically segment the MR brain images into different tissue classes
but the corresponding segmentation results are also the best.
Note that present work does not use any spatial information
while segmenting the images. But incorporation of spatial informa-
tion surely improves the quality of the results. Thus some new
methods of incorporating the spatial information have to be
invented in the future. Future work also includes the development
of some fuzzy genetic clustering technique based on the line sym-
metry based distance. Developing some new form of symmetry,
like plane symmetry etc. is also another important future research
work. Authors are currently working in this direction.
REFERENCES
Mark de Berg, Marc van Kreveld, Mark Overmars, and Otfried Schwarz-
kopf. Computational Geometry: Algorithms and Applications. Springer-
Verlag, Berlin, Germany, 2nd edition, 2000.
S. Bandyopadhyay and U. Maulik, Genetic clustering for automatic evolution
of clusters and application to image classification, Pattern Recogn (2002),
1197–1208.
S. Bandyopadhyay and S. Saha, GAPS: A clustering method using a new
point symmetry based distance measure, Pattern Recogn 40 (2007), 3451.
S. Bandyopadhyay and S. Saha, A point symmetry based clustering technique
for automatic evolution of clusters, IEEE Trans Knowl Data Eng 20 (2008),
1–17.
S. Bandyopadhyay, U. Maulik, and A. Mukhopadhyay, Multiobjectivegenetic clustering for pixel classification in remote sensing imagery, IEEE
Trans Geosci Remote Sens 45 (2007), 1506–1511.
A. Ben-Hur and I. Guyon, Detecting Stable Clusters using Principal Compo-
nent Analysis in Methods in Molecular Biology, M. Brownstein and A.
Kohodursky, (Editors), Humana Press, 2003.
J.L. Bentley, B.W. Weide, and A.C. Yao, Optimal expected-time algorithms
for closest point problems, ACM Trans Math Software 6 (1980), 563–580.
J.C. Bezdek, Fuzzy mathematics in pattern classification, Ph.D. dissertation,
Cornell University, Ithaca, NY, 1973.
Figure 23. (a) Original T1-weighted MRI image of the brain with multiple sclerosis lesions in z144 plane. (b) Segmentation obtained by VGALS
clustering technique.
Table V. The automatically obtained cluster (OC) number and the
corresponding Minkowski Scores (MS) after application of VGA And
VGALS clustering algorithms on simulated MR volumes for brain with
multiple sclerosis lesions projected on first 10 z planes
z Plane No. AC
VGA VGALS
OC MS OC MS
1 6 2 1.21 11 0.77
2 6 2 1.20 8 0.77
3 6 2 1.19 11 0.78
4 6 5 0.69 9 0.77
5 6 2 1.184 9 0.75
6 6 2 1.18 8 0.71
7 6 2 1.17 9 0.78
8 6 2 1.16 10 0.73
9 6 2 1.16 11 0.84
10 9 2 1.17 11 0.83
Vol. 21, 86–100 (2011) 99
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 15/16
J.C. Bezdek, Pattern recognition with fuzzy objective function algorithms,
Plenum, New York, 1981.
S.M. Bhandarkar and H. Zhang, Image segmentation using evolutionary
computation, IEEE Trans Evol Comp 3 (1999), 1–21.
G. Bilgin, S. Erturk, and T. Yildirim, Unsupervised classification of hyper-
spectral-image data using fuzzy approaches that spatially exploit member-
ship relations, IEEE Geosci Remote Sens Lett 5 (2008), 673–677.
BrainWeb: Simulated brain database. Available at: http://www.bic.mni.
mcgill.ca/brainweb; http://www.bic.mni.mcgill.ca/brainweb/.
C.A. Cocosco, V. Kollokian, R.K.-S. Kwan, A.C. Evans: ‘‘BrainWeb:Online Interface to a 3D MRI Simulated Brain Database’’ NeuroImage,
vol. 5, no. 4, part 2/4, S425, 1997 – Proceedings of 3-rd International Confer-
ence on Functional Mapping of the Human Brain, Copenhagen, May, 1997.
R.L. Cannon, R. Dave, J.C. Bezdek, and M. Trivedi, Segmentation of a the-
matic mapper image using fuzzy c-means clustering algorithm, IEEE Trans
Geosci Remote Sens 24 (1986), 400–408.
V.V. Chamundeeswari, D. Singh, and K. Singh, Unsupervised land cover
classification of SAR images by contour tracing, Proc. of IEEE International
Geoscience and Remote Sensing Symposium (IGARSS, 2007), Barcelona,
Spain, July, 23–28, 2007, pp. 547–550.
C.-C.T. Chen and D.A. Landgrebe, A spectral design system for the HIRIS/
MODIS era, IEEE Trans Geosci Remote Sens 27 (1989), 681–686.
D. Comaniciu and P. Meer, Mean shift: A robust approach toward feature
space analysis, IEEE Trans Pattern Anal Machine Intell 24 (2002), 603–619.
M.A. Friedl, D.K. McIver, J.C.F. Hodges, X.Y. Zhang, D. Muchoney, A.H.
Strahler, C.E. Woodcock, S. Gopal, A. Schneider, A. Cooper, A. Baccini, F.
Gao, C. Schaaf, Global land cover mapping from MODIS: Algorithms and
early results, Remote Sens Environ 83 (2002), 287–302.
J.H. Friedman, J.L. Bently, and R.A. Finkel, An algorithm for finding best
matches in logarithmic expected time, ACM Trans Math Software 3 (1977),
209–226.
V. Gandhi, J.M. Kang, S. Shekhar, J. Ju, E.D. Kolaczyk, and S. Gopal, Con-
text inclusive function evaluation: a case study with EM-based multi-scale
multi-granular image classification. Knowl. Inf. Syst. 21, 2 (Oct. 2009),
231–247. DOI5http://dx.doi.org/10.1007/s10115-009-0208-0
H. Ghassemian and P.A. Landgrebe, Object oriented feature extraction
method for image data compaction, IEEE Control Syst Mag 8 (1988), 42–48.
D. Guo, H. Xiong, V. Atluri, and N.R. Adam, Object discovery in high-resolution remote sensing images: a semantic perspective, Knowl Inf Syst
19 (2009), 211–233. DOI=http://dx.doi.org/10.1007/s10115-008-0160-4.
E.E. Hilbert, Cluster compression algorithm-a joint clustering data compres-
sion concept, JPL Publication, NASA, Technical Report, 1977
M.A. Jaffar, A. Hussain, and A.M. Mirza, Fuzzy entropy based optimization
of clusters for the segmentation of lungs in CT scanned images. Knowl. Inf.
Syst. 24, 1 (Jul. 2010), 91-111. DOI=http://dx.doi.org/10.1007/s10115-009-
0225-z.
A.K. Jain, M.N. Murthy, and P.J. Flynn, Data clustering: A review, ACM
Comput Rev 31, 3 (Sep. 1999), 264–323. DOI=http://doi.acm.org/10.1145/
331499.331504.
I. Jolliffe, Principal component analysis, Springer Series in Statistics, Eng-
land, 1986.
R. Kauth, A. Pentland, and G. Thomas, BLOB: An unsupervised clustering
approach to spatial preprocessing of MSS imagery, Proceedings of 11th Int.
Symp. Remote Sensing of the Environment, Ann Arbor, Mich, 1977, 1309–
1317.
A. Marcal and L. Castro, Hierarchical clustering of multispectral images
using combined spectral and spatial criteria, IEEE Geosci Remote Sens Lett
2 (2005), 59–63.
U. Maulik and S. Bandyopadhyay, Performance evaluation of some cluster-
ing algorithms and validity indices, IEEE Trans Pattern Anal Machine Intell
24 (2002), 1650–1654.
U. Maulik and S. Bandyopadhyay, Fuzzy partitioning using a real-coded
variable-length genetic algorithm for pixel classification, IEEE Trans Geosci
Remote Sens 41 (2003), 1075–1081.
D.M. Mount and S. Arya, ANN: A library for approximate nearest neighbor
searching, 2005, Available at:http://www.cs.umd.edu/ $mount/ANN.
S.K. Pal, S. Bandyopadhyay, and C.A. Murthy, Genetic classifiers for
remotely sensed images: Comparison with standard methods, Int J RemoteSens 22 (2001), 2545–2569.
M.L. Pugh and A.M. Waxman, Classification of spectrally-similar land
cover using multi-spectral neural image fusion and the fuzzy ARTMAP neu-
ral classifier, Proc. of IEEE International Geoscience and Remote Sensing
Symposium (IGARSS, 2006), Denver, Colorado, July 31, 2006–Aug. 4,
2006, pp. 1808–1811.
J.A. Richards, Remote sensing digital image analysis: An introduction,
Springer-Verlag, New York, 1993.
S. Saha and S. Bandyopadhyay, MRI brain image segmentation by fuzzy
symmetry based genetic clustering technique, Proceedings of the 2007 IEEE
Congress on Evolutionary Computation (CEC’07), Singapore, 2007a, pp.
4417–4424.
S. Saha and S. Bandyopadhyay, A genetic clustering technique using a new
line symmetry based distance measure, Proceedings of Fifth InternationalConference on Advanced Computing and Communications (ADCOM’07),
IEEE Computer Society, Guwahati, India, 2007b, pp. 365–370.
S. Saha and S. Bandyopadhyay, Application of a new symmetry based clus-
ter validity index for satellite image segmentation, IEEE Geosci Remote
Sens Lett 5 (2008), 166–170.
S. Saha and S. Bandyopadhyay, MR brain image segmentation using a
multi-seed based automatic clustering technique, Fundam Inform 97 (2009),
199–214.
S. Saha and S. Bandyopadhyay, A new multiobjective clustering technique
based on the concepts of stability and symmetry, Knowl Inf Syst 23 (2010),
1–27.
S. Saha and S. Bandyopadhyay, Automatic MR brain image segmentation
using a multiseed based multiobjective clustering approach, Applied Intelli-
gence. DOI: 10.1007/s10489-010-0231-6.
S. Saha and S. Bandyopadhyay, On principle axis based line symmetry clus-
tering techniques, Memetic Computing. DOI: 10.1007/s12293-010-0049-0.
K. Sayood, Data compression in remote sensing applications, IEEE Geosci
Remote Sens Newslett 84 (1992), 7–15.
J. Suckling, T. Sigmundsson, K. Greenwood, and E. Bullmore, A modified
fuzzy clustering algorithm for operator independent brain tissue classifica-
tion of dual echo MR images, Magn Reson Imaging 17 (1999), 1065–1076.
Y. Tarabalka, J.A. Benediktsson, and J. Chanussot, Spectralspatial classifi-
cation of hyperspectral imagery based on partitional clustering techniques,
IEEE Trans Geosci Remote Sensi 47 (2009), 2973–2987.
M. Tyagi, F. Bovolo, A. Mehra, S. Chaudhuri, and L. Bruzzone, A context-
sensitive clustering technique based on graph-cut initialization and expecta-
tion-maximization algorithm, IEEE Geosci Remote Sens Lett 5 (2008), 21– 25.
C. Wemmert, A. Puissant, G. Forestier, and P. Gancarski, Multiresolution
remote sensing image clustering, IEEE Geosci Remote Sens Lett 6 (2009),
533–537.
X.L. Xie and G. Beni, A validity measure for fuzzy clustering, IEEE Trans
Pattern Anal Machine Intell 13 (1991), 841–847.
100 Vol. 21, 86–100 (2011)
8/2/2019 A New Line Symmetry Distance Based Automatic
http://slidepdf.com/reader/full/a-new-line-symmetry-distance-based-automatic 16/16
Copyright of International Journal of Imaging Systems & Technology is the property of John Wiley & Sons,
Inc. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright
holder's express written permission. However, users may print, download, or email articles for individual use.
top related