8/3/2019 Model Order Selection for Multiple Cooperative Swarms Clustering
http://slidepdf.com/reader/full/model-order-selection-for-multiple-cooperative-swarms-clustering 1/15
Model order selection for multiple cooperative swarms clustering
using stability analysis
Abbas Ahmadi a,⇑, Fakhri Karray b, Mohamed S. Kamel b
a Industrial Engineering Department, Amirkabir University of Technology, Tehran, Iran
b Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada
ARTICLE INFO
Article history:
Available online 28 October 2010
Keywords:
Model order selection
Data clustering
Particle swarm optimization
Cooperative swarms
Swarm intelligence
ABSTRACT
Extracting different clusters of a given data is an appealing topic in swarm intelligence
applications. This paper introduces two main data clustering approaches based on particle
swarm optimization, namely single swarm and multiple cooperative swarms clustering. A
stability analysis is next introduced to determine the model order of the underlying data
using multiple cooperative swarms clustering. The proposed approach is assessed using
different data sets and its performance is compared with that of k-means, k-harmonic
means, fuzzy c-means and single swarm clustering techniques. The obtained results indicate that the proposed approach fairly outperforms the other clustering approaches in terms of different cluster validity measures.
© 2010 Elsevier Inc. All rights reserved.
1. Introduction
Recognizing subgroups of the given data is of interest in data clustering. A vast number of clustering techniques have
been developed to deal with unlabelled data based on different assumptions about the distribution, shape and size of the
data. Most of the clustering techniques require a priori knowledge about the number of clusters [5,25], whereas some other
approaches are capable of extracting such information [16].
Swarm intelligence approaches such as particle swarm optimization (PSO), biologically inspired by the social behavior of flocking birds [15], have been applied in clustering applications [1,3,7,8,19,22–24]. The goal of PSO-based clustering techniques is usually to find cluster centers. Most recent swarm clustering techniques use a single swarm approach to reach a final clustering solution [8,18,19]. Multiple swarms clustering has been recently proposed [3]. A multiple swarms clustering approach is useful for dealing with high dimensional data, as it uses a divide and conquer strategy. In other words, it distributes the search space among multiple swarms, each of which explores its associated division while cooperating with the others. The novelty of this paper is to apply a stability analysis for determining the number of clusters in the underlying data
using multiple cooperative swarms [4].

This paper is organized as follows. First, an introduction to cluster analysis is given. Particle swarm optimization and particle swarm clustering approaches are next explained. Then, model order selection using stability analysis is described. Finally, experiments using eight different data sets and concluding remarks are provided.
2. Cluster analysis
Organizing a set of unlabeled data points Y into several clusters using some similarity measure is known as clustering [9]. The notion of similarity between samples is usually represented using their corresponding distance. Each cluster C_k contains
0020-0255/$ - see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.ins.2010.10.010
⇑ Corresponding author. Tel.: +98 21 222 39403.
E-mail address: [email protected] (A. Ahmadi).
Information Sciences 182 (2012) 169–183
a set of similar data points, given by C_k = {y_j^k}_{j=1..n_k}, where y_j^k denotes data point j in cluster k, n_k indicates the number of its associated data points and K is the number of clusters. Let us assume the solution of a clustering algorithm A_K(Y) for the given data points Y of size N is presented by T := A_K(Y), which is a vector of labels T = {t_i}_{i=1..N}, where t_i denotes the obtained label for data point i and t_i ∈ L := {1, . . . , K}.
The main approaches for grouping data are hierarchical and partitional clustering. The hierarchical clustering approach
generates a hierarchy of clusters known as a dendrogram. The dendrogram can be broken at different levels to yield different clusterings of the data [13]. To build the dendrogram, agglomerative and divisive approaches are used.
In an agglomerative approach, each data point is initially considered as a cluster. Then, the two closest clusters merge together and produce a new cluster. Merging close clusters continues until all points form a single cluster. Different notions of closeness exist, namely single link, complete link and average link.
In contrast to the agglomerative approach, the divisive approach begins with a single cluster containing all data points. It
then splits the data points into two separate clusters. This procedure continues until each cluster includes a single data point [13].
In the partitional approach to data clustering, the aim is to partition a given data set into a pre-specified number of clus-
ters. Various partitional clustering algorithms are available. The most famous one is the k-means algorithm. The k-means
(KM) procedure initiates with k arbitrary random points as cluster centers. The algorithm assigns each data point to the nearest center. New centers are then computed based on the associated data points of each cluster. This procedure is repeated
until no improvement is obtained after a certain number of iterations.
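The k-means loop just described can be sketched in a few lines of Python (an illustrative sketch; the function name, seed handling and fixed iteration count are our own choices rather than anything specified in the text):

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means sketch: start from k of the data points as centers,
    assign each point to its nearest center, recompute centers, repeat."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # index of the nearest center (squared Euclidean distance)
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # recompute each center as the mean of its assigned points
        centers = [tuple(sum(col) / len(C) for col in zip(*C)) if C else centers[j]
                   for j, C in enumerate(clusters)]
    return centers, clusters
```

Note that, exactly as the text observes, the result depends on the random initial centers; a poor draw can leave the algorithm in a local optimum.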
Unlike k-means, the k-harmonic means (KHM) algorithm, introduced by Zhang and Hsu [25], does not rely on the initial
solution. It utilizes the harmonic average of distances from each data point to the centers. As compared to the k-means algorithm, it improves the quality of clustering results in certain cases [25].

Another extension to the k-means algorithm was suggested by Bezdek using fuzzy set theory [5]. This algorithm is known as fuzzy c-means (FCM) clustering, in which every data point is associated with each cluster with some degree of membership.
Another class of partitional clustering is probabilistic clustering approaches, such as particle swarm-based clustering,
which are developed using probability theory [14].
In particle swarm clustering, the intention is to find the centers of clusters such that an objective function is optimized.
The objective function can be defined in terms of cluster validity measures. These measures are used to evaluate the quality
of clustering techniques [11]. There are numerous measures, which mainly tend to minimize the intra-cluster distance and/or maximize the inter-cluster distance. Some widely used quality measures of clustering techniques are described next.
2.1. Compactness measure
The compactness measure, also known as the within-cluster distance, indicates how compact the clusters are [9]. This measure is denoted by F_c(m_1, . . . , m_K) or simply by F_c(M), where M = (m_1, . . . , m_K). Having K clusters, the compactness measure is defined as

$$F_c(M) = \frac{1}{K} \sum_{k=1}^{K} \frac{1}{n_k} \sum_{j=1}^{n_k} d\left(m_k, y_j^k\right), \qquad (1)$$

where m_k denotes the center of cluster k and d(·) stands for the Euclidean distance. Clustering techniques tend to minimize this measure.
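As a concrete reading of Eq. (1), a small Python sketch (clusters given as lists of coordinate tuples; all names are illustrative, not from the paper):

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def compactness(centers, clusters):
    """Eq. (1): average over the K clusters of the mean distance of each
    cluster's points to that cluster's center."""
    K = len(centers)
    return sum(sum(euclidean(m_k, y) for y in C_k) / len(C_k)
               for m_k, C_k in zip(centers, clusters)) / K
```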
2.2. Separation measure
This measure, also known as the between-cluster distance, evaluates the separation of the clusters [9]. It is given by

$$F_s(M) = \frac{1}{K(K-1)} \sum_{j=1}^{K} \sum_{k=j+1}^{K} d\left(m_j, m_k\right). \qquad (2)$$

It is desirable to maximize this measure, or equivalently to minimize −F_s(M).
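Eq. (2) can likewise be sketched directly (illustrative names; centers are coordinate tuples):

```python
import math

def euclidean(p, q):
    """Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def separation(centers):
    """Eq. (2): sum of pairwise distances between cluster centers,
    normalized by K(K-1) as in the text."""
    K = len(centers)
    total = sum(euclidean(centers[j], centers[k])
                for j in range(K) for k in range(j + 1, K))
    return total / (K * (K - 1))
```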
2.3. Turi’s validity index
Turi's validity index [21] is defined as

$$F_{\mathrm{Turi}}(M) = \left(c \times \mathcal{N}(2, 1) + 1\right) \times \frac{\mathrm{intra}}{\mathrm{inter}}, \qquad (3)$$

where c is a user-specified parameter, equal to unity in this paper, and N(·) is a Gaussian distribution with μ = 2 and σ = 1. The intra term denotes the within-cluster distance provided in Eq. (1). Also, the inter term is the minimum Euclidean distance between the cluster centers, computed by

$$\mathrm{inter} = \min\{d(m_k, m_q)\}, \quad k = 1, 2, \ldots, K-1, \; q = k+1, \ldots, K. \qquad (4)$$
The aim of the different clustering approaches is to minimize Turi’s index.
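A direct transcription of Eq. (3) might look as follows (a sketch; sampling the Gaussian term with `random.gauss` is one plausible reading of N(2,1) as a random draw, and the function name is our own):

```python
import random

def turi_index(intra, inter, c=1.0):
    """Eq. (3): (c * N(2,1) + 1) * intra / inter, where N(2,1) is a draw
    from a Gaussian with mean 2 and standard deviation 1."""
    return (c * random.gauss(2, 1) + 1) * intra / inter
```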
2.4. Dunn’s index
Let us define a(C_k, C_q) and b(C_k) as

$$a(C_k, C_q) = \min_{x \in C_k,\, z \in C_q} d(x, z), \qquad b(C_k) = \max_{x, z \in C_k} d(x, z). \qquad (5)$$

Now, Dunn's index [10] can be computed as

$$F_{\mathrm{Dunn}}(M) = \min_{1 \le k \le K} \left\{ \min_{k+1 \le q \le K} \left( \frac{a(C_k, C_q)}{\max_{1 \le \tilde{k} \le K} b(C_{\tilde{k}})} \right) \right\}. \qquad (6)$$
Clustering techniques are required to maximize Dunn’s index.
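Eqs. (5) and (6) admit a brute-force sketch for small data sets (illustrative; clusters are lists of coordinate tuples):

```python
import math

def euclidean(p, q):
    """Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def dunn_index(clusters):
    """Eqs. (5)-(6): smallest single-link distance between any two
    clusters divided by the largest cluster diameter."""
    K = len(clusters)
    diameter = max(euclidean(x, z) for C in clusters for x in C for z in C)
    link = min(euclidean(x, z)
               for k in range(K) for q in range(k + 1, K)
               for x in clusters[k] for z in clusters[q])
    return link / diameter
```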
2.5. S_Dbw index
Let the average scattering of the clusters be considered as a measure of compactness, expressed by

$$\mathrm{Scatt} = K^{-1} \sum_{k=1}^{K} \frac{\left\| \sigma(C_k) \right\|}{\left\| \sigma(Y) \right\|}, \qquad (7)$$

where σ(·) stands for the variance of the data and ‖·‖ indicates the Euclidean norm. Then, the separation measure is given by

$$\mathrm{Sep} = \frac{1}{K(K-1)} \sum_{k=1}^{K} \sum_{\substack{q=1 \\ q \ne k}}^{K} \frac{D(z_{k,q})}{\max\{D(m_k), D(m_q)\}}, \qquad (8)$$

where z_{k,q} is the middle point of the line segment defined by cluster centers m_k and m_q. Also, D(m_k) denotes a density function around point m_k, which is estimated by D(m_k) = Σ_{j=1}^{n_k} f(m_k, y_j^k), and

$$f\left(m_k, y_j^k\right) = \begin{cases} 1 & \text{if } d\left(m_k, y_j^k\right) < \tilde{\sigma}, \\ 0 & \text{otherwise}, \end{cases} \qquad (9)$$

where $\tilde{\sigma} = K^{-1} \sqrt{\sum_{k=1}^{K} \left\| \sigma(C_k) \right\|}$. Finally, the S_Dbw index [11,12] is defined as

$$F_{S\_Dbw}(M) = \mathrm{Scatt} + \mathrm{Sep}. \qquad (10)$$
Minimizing this index is of interest when trying to cluster a set of data into several groups, since both the scattering and the separation terms decrease as cluster quality improves.
3. Particle swarm clustering
Particle swarm optimization (PSO) is a search algorithm introduced for dealing with optimization problems [15]. The PSO procedure commences with an initial swarm of particles in an n-dimensional space and evolves through a number of iterations to find an optimal solution given a predefined objective function F. Each particle i is distinguished from others by its position and velocity vectors, denoted by x_i and v_i, respectively. To choose a new velocity, each particle considers three components: its previous velocity, a personal best position and a global best position. The personal best and global best positions, denoted by x_i^p and x*, respectively, keep track of the best solutions obtained so far by the associated particle and the swarm. Thus, the new velocity and position are updated as

$$v_i(t+1) = w \, v_i(t) + c_1 r_1 \left( x_i^p(t) - x_i(t) \right) + c_2 r_2 \left( x^*(t) - x_i(t) \right), \qquad (11)$$

$$x_i(t+1) = x_i(t) + v_i(t+1), \qquad (12)$$

where w indicates the impact of the previous history of velocities on the current velocity, c_1 and c_2 are the cognitive and social components, respectively, and r_1 and r_2 are generated randomly using a uniform distribution on the interval [0, 1]. If minimizing the objective function is of interest, the personal best position of particle i at iteration t can be provided by

$$x_i^p(t+1) = \begin{cases} x_i^p(t) & \text{if } F\left(x_i(t+1)\right) \ge F\left(x_i^p(t)\right), \\ x_i(t+1) & \text{otherwise}. \end{cases} \qquad (13)$$
Moreover, the global best position is updated as

$$x^*(t+1) = \arg\min_{x_i^p(t)} F\left(x_i^p(t)\right), \quad i = 1, 2, \ldots, n. \qquad (14)$$
The maximum number of iterations, the number of iterations with no improvement, and a minimum objective function value are common strategies for terminating the PSO procedure [2]. The first strategy is adopted hereafter in this paper.
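One velocity/position update of Eqs. (11) and (12) can be sketched as follows (illustrative function and parameter names; the inertia value used here is arbitrary):

```python
import random

def pso_step(x, v, p_best, g_best, w=0.7, c1=1.49, c2=1.49, rng=random):
    """Eqs. (11)-(12): one velocity/position update for a single particle;
    x, v, p_best and g_best are coordinate lists of equal length."""
    r1, r2 = rng.random(), rng.random()
    v_new = [w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
             for xi, vi, pb, gb in zip(x, v, p_best, g_best)]
    x_new = [xi + vn for xi, vn in zip(x, v_new)]
    return x_new, v_new
```

When the particle already sits at both its personal and global best, the cognitive and social terms vanish and only the inertia term w·v remains.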
3.1. Single swarm clustering
In this approach, the position of particle i is expressed by x_i = (m_1, . . . , m_K)_i or simply by x_i = (M)_i, where M = (m_1, . . . , m_K) and m_k denotes the center of cluster k. In other words, each particle contains a representative for the centers of all clusters. The representation of particle position x_i for the three-cluster case (K = 3) is illustrated in Fig. 1.
To model the clustering problem as an optimization problem, it is required to formulate an objective function. The cluster validity measures described in Section 2 can be considered as the objective function. By considering F(m_1, . . . , m_K), or F(M), as the required objective function, the PSO algorithm can explore the search space to find the cluster centers.
When the dimensionality of the data increases and the number of clusters is large, a single swarm is no longer able to traverse the entire search space effectively. Instead, multiple cooperative particle swarms can be considered to determine the clusters' centers [3].
3.2. Multiple cooperative swarms clustering
The multiple cooperative swarms clustering approach assumes that the number of swarms is equal to the number of clusters and that the particles of each swarm are candidates for the corresponding cluster's center. The procedure of multiple cooperative swarms clustering comprises two main phases: distributing the search task among multiple swarms and building up a cooperation mechanism between the swarms. A more detailed description of the proposed distribution and cooperation strategies is given next.
3.2.1. Distribution strategy
The core idea is to divide the search space into different divisions s_k, k ∈ [1, . . . , K]. Each division s_k is denoted by its center z_k and width R_k; i.e., s_k = f(z_k, R_k). To distribute the search space into different divisions, a super-swarm is used. The super-swarm, which is a population of particles, aims to find the centers of the divisions z_k, k ∈ [1, . . . , K]. Each particle of the super-swarm is defined as (z_1, . . . , z_K), where z_k denotes the center of division k. By repeating the single swarm clustering procedure using one of the mentioned cluster validity measures as the objective function, the centers of the different divisions are obtained. Then, the widths of the divisions are computed by

$$R_k = \alpha \, \lambda_k^{\max}, \quad k \in [1, \ldots, K], \qquad (15)$$

where α is a positive constant selected experimentally and λ_k^max is the square root of the biggest eigenvalue of the data points belonging to division k [3].
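For 2-D data, Eq. (15) can be sketched as follows, under the assumption (ours, not stated explicitly in the text) that the eigenvalue in question is that of the covariance matrix of the division's points; the closed-form 2×2 eigenvalue keeps the sketch dependency-free:

```python
import math

def division_width(points, alpha=0.5):
    """Eq. (15) sketch for 2-D data: R_k = alpha * lambda_k_max, where
    lambda_k_max is the square root of the largest eigenvalue of the
    covariance matrix of the division's points; alpha is the
    experimentally chosen constant."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # eigenvalues of a symmetric 2x2 matrix in closed form
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    lam_max = tr / 2 + math.sqrt(max(tr * tr / 4 - det, 0.0))
    return alpha * math.sqrt(lam_max)
```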
Fig. 1. Representation of particle position in single swarm clustering.
3.2.2. Cooperation strategy
After distributing the search space, each division is assigned to a swarm. That is, the number of swarms is equal to the
number of divisions, or clusters, and particles of each swarm are candidates for the corresponding cluster’s center. In this
stage, there is information exchange between swarms and each swarm knows the global best, the best cluster center, of
the other swarms obtained so far. Therefore, there is a cooperative search scheme where each swarm explores its related
division to find the best solution for the associated cluster center while interacting with other swarms. The schematic representation of the multiple cooperative swarms is depicted in Fig. 2.
In the multiple cooperative swarms clustering approach, the particles of each swarm are required to optimize the following problem:

$$\min \; F(M_i) \quad \text{s.t.} \;\; m_i^1 \in s_1, \; \ldots, \; m_i^K \in s_K, \quad i \in [1, \ldots, n], \qquad (16)$$
where F(·) denotes one of the cluster validity measures introduced in Section 2.
The search procedure using multiple swarms is performed in a parallel scheme. First, n different solutions for the cluster centers are obtained using Eq. (16). The best solution is called the new candidate for the cluster centers, denoted by M′ = (m′_1, . . . , m′_K). To update the cluster centers, the following rule is applied:

$$M^{(\mathrm{new})} = \begin{cases} M' & \text{if } F(M') \le F\left(M^{(\mathrm{old})}\right), \\ M^{(\mathrm{old})} & \text{otherwise}, \end{cases} \qquad (17)$$

where M = (m_1, . . . , m_K). In other words, if the objective value of the new candidate for the cluster centers (M′) is smaller than that of the former candidate (M^(old)), the new solution is accepted; otherwise, it is rejected. The overall algorithm of multiple swarms clustering is provided in Algorithm 1.
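The acceptance rule of Eq. (17) is simple enough to state directly in code (a minimal sketch; `update_centers` and the callable objective `F` are illustrative names, not from the paper):

```python
def update_centers(M_old, M_new, F):
    """Eq. (17): accept the candidate centers M_new only if they do not
    worsen the objective F; otherwise keep M_old."""
    return M_new if F(M_new) <= F(M_old) else M_old
```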
The PSO-based clustering approaches assume that the number of clusters is known in advance. In this paper, the notion of
stability analysis is used to extract the number of clusters for the underlying data.
4. Model order selection using stability approach
Determining the number of clusters in data clustering is known as a model order selection problem. There exist two main
stages in model order selection. First, a clustering algorithm should be chosen. Then, the model order needs to be extracted,
given a set of data [16].
Fig. 2. Schematic representation of multiple swarms. First, the cooperation between multiple swarms initiates and each swarm investigates its associated
division (a). When the particles of each swarm converge (b), the final solution for cluster centers is revealed.
Algorithm 1: Multiple cooperative swarms clustering

Stage 1: Distribute the search space into K different divisions s_1, . . . , s_K
– Obtain the centers of the divisions z_1, . . . , z_K
– Obtain the widths of the divisions R_1, . . . , R_K

Stage 2: Cooperate till convergence
1. Explore the divisions by
– 1.1. Computing the new positions and velocities of all particles of all swarms
– 1.2. Determining the fitness value of all particles using the associated cluster validity measure
– 1.3. Choosing a solution that minimizes the optimization problem provided in Eq. (16) and denoting it as the new candidate for the cluster centers (m′_1, . . . , m′_K)
2. Update the cluster centers
– 2.1. If the objective value of the new candidate for the cluster centers (m′_1, . . . , m′_K) is smaller than that of the previous iteration, accept the new solution; otherwise, reject it
– 2.2. If the termination criterion is achieved, stop; otherwise, continue this stage
Most of the clustering approaches assume that the model order is known in advance. Here, we employ stability analysis to
obtain the number of clusters when using the multiple cooperative swarms to cluster the underlying data. A description of
stability analysis is provided before explaining the core algorithm.
The stability concept is used to evaluate the robustness of a clustering algorithm. In other words, the stability measure indicates how well the results of the clustering algorithm are reproducible on other data drawn from the same source. Some examples of stable and unstable clustering are shown in Fig. 3, where the aim is to cluster the presented data into two groups.
As can be seen in Fig. 3, the data points shown in Fig. 3(a) provide a stable clustering solution in the sense that the same clustering results are obtained by repeating a clustering algorithm several times. However, the data points illustrated in Fig. 3(b)
and (c) do not yield stable clustering solutions when two clusters are of interest. That is, different results are generated by
running the clustering algorithm a number of times. Each line in Fig. 3 presents a possible clustering solution for the corre-
sponding data. The reason for getting unstable clustering solutions in these cases is the inappropriate number of clusters. In
other words, stable results are obtained for these data sets by choosing a suitable number of clusters. The proper numbers of clusters for these data sets are three and four, respectively.
As a result, one of the issues that affects the stability of the solutions produced by a clustering algorithm is the model
order. For example, by assuming a large number of clusters the algorithm generates random groups of data influenced by
the changes observed in different samples. On the other hand, by choosing a very small number of clusters, the algorithm
may lump separate structures together and return unstable clusters [16]. As a result, one can utilize the stability measure for estimating the model order of unlabeled data [4].
The multiple cooperative swarms clustering approach requires a priori knowledge of the model order. In order to enable this approach to estimate the number of clusters, the stability approach is taken into consideration. This paper uses the stability method introduced by Lange et al. [16] for the following reasons:
it requires no information about the data,
it can be applied to any clustering algorithm,
it returns the correct model order using the notion of maximal stability.
The required procedure for model order selection using stability analysis is provided in Algorithm 2.
A more precise schematic description of this algorithm is depicted in Fig. 4. The goal is to obtain the true cluster centers, denoted by (m_1, . . . , m_K), for the given data Y. First, the underlying data is randomly divided into two halves Y_1 and Y_2. The multiple cooperative swarms approach is used to cluster these two halves, and the obtained solutions are denoted by T_1 and T_2, respectively. Next, a classifier φ(Y_1) is trained using the first half of the data and its associated labels (Y_1, T_1).
Fig. 3. Examples of stable and unstable clustering when two clusters are desired: (a) stable, (b) unstable, (c) unstable.
Algorithm 2: Model order selection using stability analysis

for k ∈ [2 . . . K] do
  for r ∈ [1 . . . r_max] do
    – Randomly split the given data Y into two halves Y_1, Y_2
    – Cluster Y_1 and Y_2 independently using an appropriate clustering approach; i.e., T_1 := A_k(Y_1), T_2 := A_k(Y_2)
    – Use (Y_1, T_1) to train a classifier φ(Y_1) and compute T′_2 = φ(Y_2)
    – Calculate the distance of the two solutions T_2 and T′_2 for Y_2; i.e., d_r = d(T_2, T′_2)
    – Again cluster Y_1 and Y_2 by assigning random labels to the points
    – Extend the random clustering as above, and obtain the distance of the solutions; i.e., d_r^n
  end for
  – Compute the stability stab(k) = mean_r(d)
  – Compute the stability of random clusterings stab_rand(k) = mean_r(d^n)
  – Compute s(k) = stab(k) / stab_rand(k)
end for
– Select the model order k* such that k* = arg min_k {s(k)}
Now, the trained classifier can be used to determine the labels for Y_2, denoted by T′_2. Consequently, there exist two different labelings of Y_2. The more similar the labels are, the more stable the results are. The similarity of the obtained solutions can be stated in terms of their associated distance. Accordingly, if the distance is low, the obtained results are said to be stable. As the algorithm reveals, the explained procedure is repeated several times (r_max) to ensure that the reported results are not generated at random. Furthermore, we repeat the whole procedure for different values of K to extract a correct model order.
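The loop of Algorithm 2 for a single model order k can be sketched as follows. The helpers `cluster` and `classify` are hypothetical placeholders for A_k and the trained classifier φ, and the permutation matching of the distance (Section 4.2) is omitted for brevity:

```python
import random

def stability_score(Y, cluster, classify, k, r_max=20, rng=None):
    """Algorithm 2 sketch for one model order k. `cluster(points, k)` must
    return a label list; `classify(train_pts, train_lbls, test_pts)` must
    return predicted labels for test_pts. Returns stab(k) / stab_rand(k)."""
    rng = rng or random.Random(0)

    def disagreement(a, b):
        return sum(x != y for x, y in zip(a, b)) / len(a)

    d, d_rand = [], []
    for _ in range(r_max):
        Y = Y[:]
        rng.shuffle(Y)
        half = len(Y) // 2
        Y1, Y2 = Y[:half], Y[half:]
        # stability: compare the clustering of Y2 with the labels a
        # classifier trained on (Y1, T1) predicts for Y2
        T1, T2 = cluster(Y1, k), cluster(Y2, k)
        d.append(disagreement(T2, classify(Y1, T1, Y2)))
        # baseline: the same computation with random labelings
        R1 = [rng.randrange(k) for _ in Y1]
        R2 = [rng.randrange(k) for _ in Y2]
        d_rand.append(disagreement(R2, classify(Y1, R1, Y2)))
    return (sum(d) / r_max) / (sum(d_rand) / r_max)
```

A perfectly reproducible clustering yields stab(k) = 0 and hence s(k) = 0, which is why the smallest s(k) indicates the best model order.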
Next, the most important aspects of the model order selection algorithm are explained.
4.1. Classifier φ(Y)
A set of labeled data is required for training a classifier φ. The data set Y_1 and its clustering solution from algorithm A_k, i.e., T_1 := A_k(Y_1), can be used to establish a classifier. There is a vast range of classifiers that can be used for classification. In this paper, the k-nearest neighbor (KNN) classifier was chosen, as it requires no assumption about the distribution of the data.
Fig. 4. The schematic description of the model order selection algorithm.
4.2. Distance of solutions provided by clustering and classifier for the same data
Having a set of training data, the classifier can be tested using the test data Y_2. Its solution is given by T′_2 = φ(Y_2). However, there exists another solution for the same data obtained from the multiple cooperative swarms clustering technique, i.e., T_2 := A_k(Y_2). The distance of these two solutions is calculated by

$$d\left(T_2, T'_2\right) = \min_{\omega \in \rho_k} \sum_{i=1}^{N} \vartheta\left\{ \omega(t_{2i}) \ne t'_{2i} \right\}, \qquad (18)$$

where

$$\vartheta\left\{ t_{2i} \ne t'_{2i} \right\} = \begin{cases} 1 & \text{if } t_{2i} \ne t'_{2i}, \\ 0 & \text{otherwise}. \end{cases} \qquad (19)$$

Also, ρ_k contains all permutations of the k labels, and ω is the optimal permutation, which produces the maximum agreement between the two solutions [16].
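Because ρ_k is small for moderate k, this distance can be evaluated by brute force (an illustrative sketch that returns the raw count of disagreements; the function name is our own):

```python
from itertools import permutations

def label_distance(T, T_prime, k):
    """Eqs. (18)-(19): the number of disagreements between two labelings,
    minimized over all permutations of the k cluster labels (feasible
    only for small k, since there are k! permutations)."""
    def mismatches(perm):
        # perm[t] relabels label t of the first solution
        return sum(perm[t] != tp for t, tp in zip(T, T_prime))
    return min(mismatches(perm) for perm in permutations(range(k)))
```

The permutation step matters: two clusterings that merely swap label names (e.g. [0,0,1,1] vs. [1,1,0,0]) are identical partitions and receive distance zero.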
Fig. 5. The effect of the random clustering on the selection of the model order: (a) stab(k) excluding random clustering; (b) s(k) including random clustering, for k = 2, . . . , 7.
Table 1
Data sets selected from UCI machine learning repository.
Data set Classes Samples Dimensionality
Iris 3 150 4
Wine 3 178 13
Teaching assistant evaluation (TAE) 3 151 5
Breast cancer 2 569 30
Zoo 7 101 17
Glass identification 7 214 9
Diabetes 2 768 8
Fig. 6. Comparing the performance of the multiple cooperative swarms clustering with k-means and single swarm clustering in terms of Turi's index. Panels: 1. Speech data; 2. Iris data; 3. Wine data; 4. Teaching assistant evaluation data; 5. Breast cancer data; 6. Zoo data; 7. Glass data; 8. Diabetes data. Each panel plots the Turi index against iterations for k-means, single swarm and multiple swarms clustering.
4.3. Random clustering
The stability measure depends on the number of classes or clusters. For instance, an accuracy rate of 50% for binary classification is more or less the same as that of a random guess. However, the same rate for k = 10 is much better than a random predictor. In other words, if a clustering approach yields the same accuracy for model orders k_1 and k_2, where k_1 < k_2, the clustering solution for k_2 is more reliable than the other solution. Hence, the primary stability measure obtained for a certain value k, stab(k) in Algorithm 2, should be normalized by the stability rate of a random clustering, stab_rand(k) [16]. Therefore, the final stability measure for the model order k is obtained as follows:

$$s(k) = \frac{\mathrm{stab}(k)}{\mathrm{stab}_{\mathrm{rand}}(k)}. \qquad (20)$$
The effect of random clustering is studied on the Zoo data set, provided in Section 5, when determining the model order of the data using the k-means algorithm. The stability measure for different numbers of clusters, with and without using random clustering, is shown in Fig. 5.

As depicted in Fig. 5, the model order of the Zoo data using k-means clustering is recognized as 2 without considering random clustering, while it becomes 6, which is close to the true model order, by normalizing the primary stability measure by the stability of the random clustering.
4.4. Appropriate clustering approach
For a given data set, the algorithm does not provide the same result over multiple runs. Moreover, the estimated model order is highly dependent on the type of clustering approach used in this algorithm (see Algorithm 2), and there is no specific emphasis in the work of Lange et al. [16] on the type of clustering algorithm that should be used. The k-means and k-harmonic means algorithms are sensitive either to the initial conditions or to the type of data. In other words, they cannot capture the true underlying patterns of the data, and consequently the estimated model order is not robust. However, PSO-based clustering methods such as single swarm or multiple cooperative swarms clustering do not rely on initial conditions; they are search schemes that can explore the search space more effectively and may escape from local optima. Moreover, as described earlier, multiple cooperative swarms clustering is more likely to reach the optimal solution as compared with single swarm clustering, and it can provide more stable and robust solutions.
Therefore, the multiple cooperative swarms approach distributes the search space among multiple swarms and enables
cooperation between swarms, leading to an effective search strategy. Accordingly, we propose to use multiple cooperative
swarms clustering in stability analysis-based approach to find the model order of the given data.
5. Experimental results
The performance of the proposed approach is evaluated and compared with other approaches such as single swarm clustering, k-means and k-harmonic means clustering using eight different data sets, seven of which are selected from the UCI machine learning repository [6]; the last one is a speech data set taken from the standard TIMIT corpus [17]. The names of the data sets chosen from the UCI machine learning repository and their associated numbers of classes, samples and dimensions are provided in Table 1.

The speech data include four phonemes, /aa/, /ae/, /ay/ and /el/, from the TIMIT corpus. A total of 800 samples from these classes was selected, and twelve mel-frequency cepstral coefficients [20] were considered as speech features.
Table 2
Average and standard deviation of different measures for speech data.
Method Turi’s index Dunn’s index S_Dbw
K-means 0.8328 [0.8167] 0.0789 [0.0142] 3.3093 [0.327]
K-harmonic means 3.54e−05 [2.62e−05] 0.0769 [0.0001] 3.3242 [0.0001]
Single swarm −1.4539 [0.8788] 0.1098 [0.014] 1.5531 [0.0372]
Cooperative swarms −1.6345 [1.0694] 0.1008 [0.0153] 1.583 [0.0388]
Table 3
Average and standard deviation of different measures for iris data.
Method Turi’s index Dunn’s index S_Dbw
K-means 0.4942 [0.3227] 0.1008 [0.0138] 3.0714 [0.2383]
K-harmonic means 0.82e−05 [0.95e−05] 0.0921 [0.0214] 3.0993 [0.0001]
Single swarm −0.8802 [0.4415] 0.3979 [0.0001] 1.4902 [0.0148]
Cooperative swarms −0.89 [1.0164] 0.3979 [0.0001] 1.48 [0.008]
The performance of the multiple cooperative swarms clustering approach is compared with the k-means and single swarm
clustering techniques in terms of Turi's validity index over 80 iterations (Fig. 6). The results are obtained by repeating the
algorithms over 30 independent runs. For these experiments, the parameters are set to w = 1.2 (decreasing gradually
[2]), c1 = 1.49, c2 = 1.49 and n = 30 particles (for all swarms). Also, the number of clusters is set equal to the number of
classes.
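The role of these parameters can be illustrated with a minimal PSO sketch. This is not the authors' implementation; the sphere objective, linear inertia decay and velocity clamp below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


def sphere(x):
    # Toy objective: sum of squares, minimized at the origin.
    return np.sum(x ** 2, axis=-1)


def pso(objective, dim=2, n=30, iters=80, w0=1.2, w1=0.4, c1=1.49, c2=1.49):
    x = rng.uniform(-5, 5, (n, dim))          # particle positions
    v = np.zeros((n, dim))                    # particle velocities
    pbest, pbest_f = x.copy(), objective(x)   # personal bests
    g = pbest[np.argmin(pbest_f)]             # global best
    for t in range(iters):
        w = w0 - (w0 - w1) * t / iters        # gradually decreasing inertia
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        v = np.clip(v, -2, 2)                 # velocity clamp for stability
        x = x + v
        f = objective(x)
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        g = pbest[np.argmin(pbest_f)]
    return g, pbest_f.min()


best_x, best_f = pso(sphere)
print(best_f)  # typically very close to 0 after 80 iterations
```

The cognitive term (c1) pulls a particle toward its own best position, the social term (c2) toward the swarm's best, and the decaying inertia w shifts the search from exploration to exploitation.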
Table 4
Average and standard deviation of different measures for wine data.
Method Turi's index Dunn's index S_Dbw
K-means 0.2101 [0.3565] 0.016 [0.006] 3.1239 [0.4139]
K-harmonic means 2.83e−07 [2.82e−07] 190.2 [320.75] 2.1401 [0.0149]
Single swarm −0.3669 [0.4735] 0.1122 [0.0213] 1.3843 [0.0026]
Cooperative swarms −0.7832 [0.8564] 0.0848 [0.009] 1.3829 [0.0044]
Table 5
Average and standard deviation of different measures for TAE data.
Method Turi's index Dunn's index S_Dbw
K-means 0.6329 [0.7866] 0.0802 [0.0306] 3.2321 [0.5205]
K-harmonic means 1.36e−06 [1.23e−06] 0.123 [0.0001] 2.7483 [0.0001]
Single swarm −0.5675 [0.6525] 0.1887 [0.0001] 1.4679 [0.0052]
Cooperative swarms −0.7661 [0.7196] 0.1887 [0.0001] 1.4672 [0.004]
Table 6
Average and standard deviation of different measures for breast cancer data.
Method Turi's index Dunn's index S_Dbw
K-means 0.1711 [0.1996] 0.0173 [0.0001] 2.1768 [0.0001]
K-harmonic means 0.88e−08 [0.55e−08] 7.0664 [38.519] 1.8574 [0.0203]
Single swarm −0.62 [0.7997] 217.59 [79.079] 1.7454 [0.079]
Cooperative swarms −0.6632 [0.654] 245.4857 [53.384] 1.7169 [0.0925]
Table 7
Average and standard deviation of different measures for zoo data.
Method Turi's index Dunn's index S_Dbw
K-means 0.8513 [1.0624] 0.2228 [0.0581] 2.5181 [0.2848]
K-harmonic means 1.239 [1.5692] 0.3168 [0.0938] 2.3048 [0.1174]
Single swarm −5.5567 [3.6787] 0.5427 [0.0165] 2.0528 [0.0142]
Cooperative swarms −6.385 [4.6226] 0.5207 [0.0407] 2.0767 [0.025]
Table 8
Average and standard deviation of different measures for glass identification data.
Method Turi's index Dunn's index S_Dbw
K-means 0.7572 [0.9624] 0.0286 [0.001] 2.599 [0.2571]
K-harmonic means 0.89e−05 [1.01e−05] 0.0455 [0.0012] 2.0941 [0.0981]
Single swarm −4.214 [3.0376] 0.1877 [0.0363] 2.6797 [0.3372]
Cooperative swarms −6.0543 [4.5113] 0.225 [0.1034] 2.484 [0.1911]
Table 9
Average and standard deviation of different measures for diabetes data.
Method Turi's index Dunn's index S_Dbw
K-means 0.243 [0.3398] 0.0137 [0.0001] 2.297 [0.0001]
K-harmonic means 1.88e−07 [1.9e−07] 153.68 [398.42] 2.0191 [0.353]
Single swarm −0.2203 [0.2621] 1298.1 [0.0001] 1.5202 [0.027]
Cooperative swarms −0.3053 [0.3036] 1298.1 [0.0001] 1.5119 [0.0043]
As illustrated in Fig. 6, multiple cooperative swarms clustering provides better results than both the k-means and single swarm clustering approaches, in terms of Turi's index, for the majority of the data sets.
In Tables 2–9, multiple cooperative swarms clustering is compared with the other clustering approaches using different
cluster validity measures over 30 independent runs. The results presented for the different data sets are average and
standard deviation ([σ]) values.
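For reference, Dunn's index (one of the validity measures reported in these tables) is the ratio of the smallest inter-cluster distance to the largest intra-cluster diameter, so larger values indicate better-separated, more compact clusters. A minimal sketch of a common variant follows; it is not necessarily the exact formulation used in the paper.

```python
import numpy as np


def dunn_index(X, labels):
    """Minimum inter-cluster distance divided by maximum cluster diameter."""
    clusters = [X[labels == c] for c in np.unique(labels)]
    # Largest intra-cluster diameter (max pairwise distance within a cluster).
    diam = max(
        np.linalg.norm(c[:, None] - c[None, :], axis=-1).max() for c in clusters
    )
    # Smallest pairwise distance between points of different clusters.
    sep = min(
        np.linalg.norm(a[:, None] - b[None, :], axis=-1).min()
        for i, a in enumerate(clusters)
        for b in clusters[i + 1:]
    )
    return sep / diam


# Two well-separated blobs give a large index value.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(10, 0.1, (5, 2))])
labels = np.array([0] * 5 + [1] * 5)
print(dunn_index(X, labels))
```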
As observed in Tables 2–9, multiple cooperative swarms clustering provides better results in terms of the different cluster
validity measures for most of the data sets. This is because it is capable of handling multiple-objective problems, in
[Figure panels omitted: plots of the stability measure s(k) versus model order k (k = 2–7) for each method (K-means, K-harmonic means, fuzzy c-means, single swarm, multiple swarms) on the speech, iris, wine and TAE data sets.]
Fig. 7. Stability measure as a function of model order: speech, iris, wine and TAE data sets.
contrast to k-means and k-harmonic means clustering, and it distributes the search space among multiple swarms, solving the problem more effectively.
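This division of labor can be sketched as a cooperative PSO in which each swarm optimizes one centroid and each candidate is evaluated in the context of the other swarms' best centroids. The sketch below is a simplified illustration, not the paper's exact algorithm: the fitness is assumed to be the within-cluster sum of squared errors, and the data are a synthetic two-blob set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated 2-D blobs centered near 0 and 5.
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])


def sse(centroids):
    """Within-cluster SSE for the combined centroid set (assumed fitness)."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
    return np.sum(d.min(axis=1) ** 2)


def cooperative_swarms(k=2, n=30, iters=60, w=0.72, c1=1.49, c2=1.49):
    # One swarm per cluster: swarm j's particles are candidates for centroid j.
    pos = rng.uniform(X.min(), X.max(), (k, n, 2))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.full((k, n), np.inf)
    gbest = pos[:, 0].copy()                      # current combined solution
    for _ in range(iters):
        for j in range(k):                        # update each swarm in turn
            for i in range(n):
                trial = gbest.copy()
                trial[j] = pos[j, i]              # evaluate particle in context
                f = sse(trial)
                if f < pbest_f[j, i]:
                    pbest_f[j, i], pbest[j, i] = f, pos[j, i]
            gbest[j] = pbest[j, np.argmin(pbest_f[j])]
            r1, r2 = rng.random((n, 2)), rng.random((n, 2))
            vel[j] = np.clip(
                w * vel[j] + c1 * r1 * (pbest[j] - pos[j]) + c2 * r2 * (gbest[j] - pos[j]),
                -1, 1,
            )
            pos[j] = pos[j] + vel[j]
    return gbest


centroids = cooperative_swarms()
print(np.sort(centroids[:, 0]))  # the two centroids settle near the blob centers
```

Because each swarm searches only one centroid's coordinates while sharing a common fitness, the overall search space is partitioned among the swarms rather than explored by every particle at once.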
Now, the stability-based approach for model order selection in multiple cooperative swarms clustering is studied. The
PSO parameters are kept the same as before, with r_max = 30 and k = 25 for the k-NN classifier. The stability measures
of different model orders for the multiple cooperative swarms and the other clustering approaches on the different data sets
are presented in Figs. 7 and 8. The results for the speech, iris, wine and teaching assistant evaluation (TAE) data sets are provided in
Fig. 7, and those for the last four data sets (breast cancer, zoo, glass identification and diabetes) are shown in Fig. 8. In these
[Figure panels omitted: plots of the stability measure s(k) versus model order k (k = 2–7) for each method (K-means, K-harmonic means, fuzzy c-means, single swarm, multiple swarms) on the breast cancer, zoo, glass identification and diabetes data sets.]
Fig. 8. Stability measure as a function of model order: breast cancer, zoo, glass identification and diabetes data sets.
figures, k and s(k) denote the model order and the stability measure for the given model order k, respectively. The corresponding
curves for the single swarm and multiple swarms clustering approaches are obtained using Turi's validity index.
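The stability computation behind such curves can be sketched as follows. This is a simplified variant of the prediction-strength scheme of Lange et al. [16], using plain k-means in place of swarm clustering and a 1-NN classifier in place of the 25-NN classifier; all details are illustrative assumptions.

```python
import numpy as np
from itertools import permutations
from scipy.cluster.vq import kmeans2


def disagreement(a, b, k):
    """Minimum label disagreement over all cluster-label permutations."""
    return min(np.mean(np.array(p)[a] != b) for p in permutations(range(k)))


def stability(X, k, splits=10, rng=np.random.default_rng(0)):
    s = 0.0
    for _ in range(splits):
        idx = rng.permutation(len(X))
        A, B = X[idx[: len(X) // 2]], X[idx[len(X) // 2:]]
        _, la = kmeans2(A, k, minit="++", seed=1)   # cluster each half
        _, lb = kmeans2(B, k, minit="++", seed=1)
        # Transfer A's clustering to B with a 1-NN classifier, then compare
        # with B's own clustering up to a relabeling of the clusters.
        nearest = np.linalg.norm(B[:, None] - A[None, :], axis=-1).argmin(axis=1)
        s += disagreement(la[nearest], lb, k)
    return s / splits


# Two well-separated blobs: k = 2 should be markedly more stable (lower s).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (60, 2)), rng.normal(5, 0.3, (60, 2))])
print({k: round(stability(X, k), 3) for k in (2, 3, 4)})
```

The model order whose repeated clusterings agree most across random splits (the minimum of this curve) is then selected, which is exactly what the s(k) plots visualize.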
According to Figs. 7 and 8, the proposed approach using multiple cooperative swarms clustering is able to identify the
correct model order for most of the data sets. Moreover, the best model order for each data set can be obtained as provided
in Table 10. For any clustering approach, the model order minimizing the stability measure is taken as the best model
order (k*), i.e.
k* = arg min_k {s(k)}.   (21)
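In code, Eq. (21) is a one-liner over the per-order stability values (the numbers below are made up for illustration):

```python
# Hypothetical stability measures s(k) for model orders k = 2..7.
s = {2: 0.58, 3: 0.47, 4: 0.35, 5: 0.52, 6: 0.60, 7: 0.63}

k_star = min(s, key=s.get)  # Eq. (21): the k minimizing s(k)
print(k_star)  # -> 4
```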
As presented in Table 10, the k-means, k-harmonic means and fuzzy c-means clustering approaches do not converge to the
true model order under the stability-based approach for most of the data sets. The performance of single swarm clustering
is somewhat better than that of k-means, k-harmonic means and fuzzy c-means clustering because it does not depend
on initial conditions and can escape local optima. Moreover, the multiple cooperative swarms approach
using Turi's index provides the true model order for the majority of the data sets. As a result, Turi's validity index is
appropriate for model order selection using the proposed clustering approach. Its performance based on Dunn's index
and the S_Dbw index is also considerable compared to the other clustering approaches. Consequently, using the introduced
stability-based approach, the proposed multiple cooperative swarms can provide better estimates of the model order, as well
as more stable clustering results, than the other clustering techniques.
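Turi's validity index, used for the swarm-based curves above, is commonly formulated as V = y × intra/inter, where intra is the average squared distance of points to their centroids, inter is the minimum squared distance between centroids, and y = c·N(2, 1) + 1 penalizes very small cluster counts. A sketch under that common formulation (the constant c = 1 and the data are illustrative choices, not the paper's settings):

```python
import numpy as np


def turi_index(X, centroids, labels, c=1.0, rng=np.random.default_rng(0)):
    """Turi's validity index as commonly formulated; smaller is better."""
    # intra: mean squared distance of each point to its own centroid.
    intra = np.mean(np.sum((X - centroids[labels]) ** 2, axis=1))
    # inter: minimum squared distance between any two distinct centroids.
    d = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=-1)
    inter = d[np.triu_indices(len(centroids), k=1)].min() ** 2
    y = c * rng.normal(2, 1) + 1    # random penalty term from Turi's definition
    return y * intra / inter


# Toy example: two tight, well-separated clusters give a small index value.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.2, (30, 2)), rng.normal(6, 0.2, (30, 2))])
labels = np.array([0] * 30 + [1] * 30)
centroids = np.vstack([X[:30].mean(0), X[30:].mean(0)])
print(turi_index(X, centroids, labels))
```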
6. Conclusion
A new bio-inspired multiple cooperative swarms algorithm was described to deal with the clustering problem. A stability
analysis-based approach was introduced to estimate the model order for multiple cooperative swarms clustering. We proposed
using multiple cooperative swarms clustering to find the model order of the data because of its robustness and stable
solutions. The performance of the proposed approach has been evaluated using eight different data sets. The experiments
indicate that the proposed approach produces better results than the k-means, k-harmonic means, fuzzy c-means
and single swarm clustering approaches. In the future, we will investigate other similarity measures, as Euclidean distance
works well only when a data set contains compact or isolated clusters. Furthermore, we will study other stability measures,
since the measure used here imposes a considerable computational burden in discovering a suitable model order.
References
[1] A. Abraham, C. Grosan, V. Ramos (Eds.), Swarm Intelligence in Data Mining, Springer, 2006.
[2] A. Abraham, H. Guo, H. Liu, Swarm Intelligence: Foundations, Perspectives and Applications, Studies in Computational Intelligence, Springer-Verlag, Germany, 2006.
[3] A. Ahmadi, F. Karray, M. Kamel, Multiple cooperating swarms for data clustering, in: IEEE Swarm Intelligence Symposium, 2007, pp. 206–212.
[4] A. Ahmadi, F. Karray, M. Kamel, Model order selection for multiple cooperative swarms clustering using stability analysis, in: IEEE Congress on
Evolutionary Computation within IEEE World Congress on Computational Intelligence, Hong Kong, 2008, pp. 3387–3394.
[5] J. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.
[6] C. Blake, C. Merz, UCI Repository of Machine Learning Databases. <http://www.ics.uci.edu/mlearn/MLRepository.html> , 1998.
[7] C. Chen, F. Ye, Particle swarm optimization algorithm and its application to clustering analysis, in: IEEE International Conference on Networking,
Sensing and Control, 2004, pp. 789–794.
[8] X. Cui, J. Gao, T.E. Potok, A flocking based algorithm for document clustering analysis, Journal of Systems Architecture 52 (8-9) (2006) 505–515.
[9] R. Duda, P. Hart, D. Stork, Pattern Classification, John Wiley and Sons, 2000.
[10] J. Dunn, A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, Journal of Cybernetics 3 (1973) 32–57.
[11] M. Halkidi, Y. Batistakis, M. Vazirgiannis, On clustering validation techniques, Intelligent Information Systems 17 (2–3) (2001) 107–145.
[12] M. Halkidi, M. Vazirgiannis, Clustering validity assessment: finding the optimal partitioning of a data set, in: International Conference on Data Mining,
2001, pp. 187–194.
[13] A. Jain, M. Murty, P. Flynn, Data clustering: a review, ACM Computing Surveys 31 (3) (1999) 264–323.
[14] M. Kazemian, Y. Ramezani, C. Lucas, B. Moshiri, Swarm Clustering Based on Flowers Pollination by Artificial Bees, Swarm Intelligence in Data Mining, Springer, 2006, pp. 191–202.
[15] J. Kennedy, R. Eberhart, Particle swarm optimization, in: IEEE International Conference on Neural Networks, vol. 4, 1995, pp. 1942–1948.
Table 10
The best model order (k*) for data sets.
Data set | Real model order | KM | KHM | FCM | Single swarm (Turi, Dunn, S_Dbw) | Multiple swarms (Turi, Dunn, S_Dbw)
Speech 4 2 2 2 7 2 2 4 2 2
Iris 3 2 3 2 7 2 2 3 4 2
Wine 3 2 7 2 4 4 2 3 5 3
TAE 3 2 2 2 3 2 2 4 2 2
Breast cancer 2 2 3 2 2 2 2 2 2 2
Zoo 7 6 2 4 4 2 2 6 2 2
Glass 7 2 3 2 4 2 2 7 2 2
Diabetes 2 2 3 4 2 2 2 2 2 2
[16] T. Lange, V. Roth, M. Braun, J. Buhmann, Stability-based validation of clustering solutions, Neural Computing 16 (2004) 1299–1323.
[17] National Institute of Standards and Technology, TIMIT Acoustic-Phonetic Continuous Speech Corpus, Speech Disc 1-1.1, NTIS Order No. PB91-505065, 1990.
[18] M. Omran, A. Engelbrecht, A. Salman, Particle swarm optimization method for image clustering, International Journal of Pattern Recognition and
Artificial Intelligence 19 (3) (2005) 297–321.
[19] M. Omran, A. Salman, A. Engelbrecht, Dynamic clustering using particle swarm optimization with application in image segmentation, Pattern Analysis
and Applications 6 (2006) 332–344.
[20] M. Seltzer, Sphinx III signal processing front end specification, Technical Report, CMU Speech Group, 1999.
[21] R. Turi, Clustering-based colour image segmentation, Ph.D. Thesis, Monash University, Australia, 2001.
[22] D. van der Merwe, A. Engelbrecht, Data clustering using particle swarm optimization, in: Proceedings of the IEEE Congress on Evolutionary Computation, vol. 1, 2003, pp. 215–220.
[23] X. Xiao, E. Dow, R. Eberhart, Z. Miled, R. Oppelt, Gene clustering using self-organizing maps and particle swarm optimization, in: IEEE Proceedings of the International Parallel Processing Symposium, 2003, p. 10.
[24] F. Ye, C. Chen, Alternative kpso-clustering algorithm, Tamkang Journal of Science and Engineering 8 (2) (2005) 165–174.
[25] B. Zhang, M. Hsu, K -harmonic means: a data clustering algorithm, Technical Report, Hewlett-Packard Labs, HPL-1999-124.