CHAPTER 6
DEVELOPMENT OF PARTICLE SWARM
OPTIMIZATION BASED ALGORITHM FOR GRAPH
PARTITIONING
6.1 Introduction
From the literature review, it is evident that the min-cut k-partitioning problem is a fundamental partitioning problem and is also NP-hard. Most of the existing partitioning algorithms are heuristic in nature and try to find a reasonably good solution. These algorithms fall into the move-based category, in which a solution is generated iteratively from an initial solution by applying moves to the current solution. Most frequently, these move-based approaches are combined with stochastic algorithms.
In this chapter, we have developed the Multilevel Recursive Discrete Particle Swarm Optimization (MRDPSO) technique, which integrates a new DPSO-based refinement approach with an efficient matching-based coarsening scheme for solving the graph partitioning problem (GPP).
6.2 Discrete Particle Swarm Optimization
The original PSO algorithm optimizes problems in which the elements of the solution space are continuous. Since many real-life applications require optimization over discrete-valued spaces, Kennedy et al. [163] developed discrete PSO (DPSO) to solve discrete optimization problems. The PSO algorithm regulates the trajectories of the population's particles through a problem space using information about the previous best performance of each particle and its neighbors. In PSO, trajectories represent changes in positions along some number of dimensions, whereas in DPSO particles operate on a discrete search space and trajectories represent variations in the probability that a coordinate takes a particular value from the allowed discrete values. In this method, each particle specifies a potential solution composed of k elements. The fitness function is used to evaluate the quality of the solution. Each particle is considered as a position in k-dimensional space and each element of a particle is restricted to '0' or '1', where '1' represents 'included' and '0' represents 'not included'. Each element can flip from 0 to 1 and vice versa. Additionally, each particle has a k-dimensional velocity restricted to the range [-Vmax, Vmax]. Velocities are defined as probabilities that a bit will be in one state or the other.
At the initial stage of the algorithm, the particles and their velocity vectors are generated randomly. Then, over a certain number of iterations, the algorithm aims to find optimal or near-optimal solutions using the predefined fitness function. At each iteration, the velocity vector is updated using the best positions pbest and nbest, and then the position of each particle is updated using its velocity vector. pbest and nbest are k-dimensional vectors composed of '0's and '1's which operate as the memory of the algorithm. pbest is the best position the particle itself has visited since the initial step of the algorithm, and nbest is the best position visited by the particle and its neighbors. Depending on the size of the neighborhoods, two distinct PSO algorithms can be developed. If the entire swarm is considered the neighborhood of a particle, then nbest is called the global best (gbest), whereas if a smaller neighborhood is defined for each particle, then nbest is called the local best (lbest). The star and ring neighborhood topologies are used by gbest and lbest, respectively. gbest-based PSO converges faster than lbest-based PSO due to its larger particle connectivity, but lbest-based PSO is less vulnerable to being trapped in local minima.
The velocity and position of a particle are updated using the following equations:

v_mn(t+1) = v_mn(t) + c1 r1 (pbest_mn(t) - x_mn(t)) + c2 r2 (nbest_mn(t) - x_mn(t))   (6.1)

x_mn(t+1) = 1 if r < Sig(v_mn(t+1)), and 0 otherwise   (6.2)

The sigmoid function is given by the relation

Sig(v_mn(t)) = 1 / (1 + e^(-v_mn(t)))   (6.3)

where
x_mn(t) - nth element of the mth particle in the tth iteration of the algorithm,
v_mn(t) - nth element of the velocity vector of the mth particle in the tth iteration of the algorithm,
c1 and c2 - positive acceleration constants that control the influence of pbest and nbest on the search process,
r1 and r2 ∈ [0, 1] - random values sampled from a uniform distribution, and
r ∈ [0, 1] - a random number.
The stopping criterion for DPSO can be a maximum number of iterations, finding an acceptable solution, or no further improvement over a number of iterations.
The Discrete Particle Swarm Optimization algorithm is:

Create and initialize a k-dimensional swarm with N particles
repeat
    for each particle i = 1, 2, ..., N do
        if f(x_i) > f(pbest_i) then    // f represents the fitness function
            pbest_i = x_i;
        end if
        if f(pbest_i) > f(nbest_i) then
            nbest_i = pbest_i;
        end if
    end for
    for each particle i = 1, 2, ..., N do
        update the velocity vector using equation (6.1)
        update the position vector using equation (6.2)
    end for
until the stopping criterion is satisfied.
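As an illustration only (not the implementation used later in this chapter), the gbest variant of the DPSO loop above can be sketched in Python. The fitness is minimized here, to match the min-cut objective of the following sections, and all parameter defaults are arbitrary choices:

```python
import math
import random

def dpso(fitness, dim, n_particles=30, iters=100, c1=0.5, c2=0.5, vmax=4.0):
    """Minimize `fitness` over binary vectors of length `dim` (gbest DPSO)."""
    pos = [[random.randint(0, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[random.uniform(-vmax, vmax) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # velocity update, cf. equation (6.1)
                v = (vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-vmax, min(vmax, v))      # clamp to [-Vmax, Vmax]
                sig = 1.0 / (1.0 + math.exp(-vel[i][d]))  # probability that the bit is 1
                pos[i][d] = 1 if random.random() < sig else 0  # cf. equation (6.2)
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f
```

For example, `dpso(sum, dim=8)` minimizes the number of one-bits and converges toward the all-zero vector.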
6.3 DPSO based Mathematical Model of Graph Partitioning
Problem
Let G = (V, E) be a weighted undirected graph with vertex set V and edge set E, let w(u, v) be the weight of an edge (u, v) ∈ E, and let k be a natural number greater than 1. The graph partitioning problem (GPP) is to partition the vertex set V into k blocks V1, V2, ..., Vk such that V1 ∪ V2 ∪ ... ∪ Vk = V and Vi ∩ Vj = ∅ for all i ≠ j.
If all blocks have the same weight, then the partition is balanced.
For balanced partitioning, |Vi| ≤ Lmax = (1 + ε)⌈|V|/k⌉ for all i ∈ {1, 2, ..., k}.
If ε = 0, then the partitioning is perfectly balanced.
If |Vi| > Lmax, then the block Vi is overloaded.
The objective of the graph partitioning problem is to minimize the cut of the partition. The cut between two disjoint subsets A and B of V is

cut(A, B) = Σ_{u ∈ A, v ∈ B} w(u, v)   (6.4)

Selection of Discrete PSO for GPP
We have developed a technique based on the discrete particle swarm optimization (DPSO) algorithm to explore good-quality approximate solutions of the min-cut partitioning problem for a graph G = (V, E) with p vertices and q edges. In this problem, each particle of the swarm is considered a partitioning vector, and the solution space is p-dimensional. The cut value cut(P) is the fitness function to be minimized.
If the size of the swarm is N, then the position of the nth particle is represented by a p-dimensional vector X_n = (x_n1, x_n2, ..., x_np), n ∈ {1, 2, ..., N}, where each element x_nd represents the dth dimension of the position vector X_n and is restricted to the values zero and one. The velocity of this particle is denoted by another p-dimensional vector V_n = (v_n1, v_n2, ..., v_np), in which each element indicates the probability of the corresponding bit taking the value one. The best position formerly visited by the nth particle is represented by the vector P_n = (p_n1, p_n2, ..., p_np). The best position formerly visited by the swarm is represented by the vector P_g = (p_g1, p_g2, ..., p_gp), where g is the index of the best particle in the swarm.
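The fitness in equation (6.4) is straightforward to evaluate from an edge list; a minimal sketch, assuming edges are stored as (u, v, weight) triples and the partitioning vector maps each vertex to its block:

```python
def cut_value(edges, part):
    """Cut of a partition: total weight of edges whose endpoints
    lie in different blocks (equation 6.4).
    `edges` is a list of (u, v, w) triples; `part[u]` is the block of u."""
    return sum(w for u, v, w in edges if part[u] != part[v])
```

For a path 0-1-2-3 with weights 2, 3, 1 and the bisection {0, 1} | {2, 3}, only the middle edge is cut, so the cut value is 3.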
Let k be the iteration number; then the velocity and position updates used in our DPSO are defined by

V_n^(k+1) = w · V_n^k + R1[0, c1] * (P_n^k - X_n^k) + R2[0, c2] * (P_g^k - X_n^k)   (6.5)

x_nd^(k+1) = 1 if r < Sig(v_nd^(k+1)), and 0 otherwise   (6.6)

The sigmoid function is

Sig(v_nd^k) = 1 / (1 + e^(-v_nd^k))   (6.7)

where
R1 and R2 - functions following a uniform distribution which return vectors whose components are randomly selected from the given range,
'*' - pointwise multiplication of vectors,
c1 and c2 - positive constants called the cognitive and social parameters, respectively,
r ∈ [0, 1] - a random number selected from a uniform distribution.
The standard discrete PSO alone is not an effective approach, since the space of feasible solutions of the min-cut partitioning problem is excessively large, particularly when the number of vertices is in the thousands. Hence, we have combined the multilevel partitioning method with discrete PSO and developed the multilevel recursive discrete particle swarm optimization algorithm (MRDPSO) for min-cut partitioning. The MRDPSO algorithm is applied during the refinement phase of the multilevel method; it improves the quality of the level-m graph Gm using a boundary refinement policy for the partitioning PGm. In addition, discrete PSO is combined with local search optimization: the hybrid PSO local search applies the Fiduccia-Mattheyses algorithm [55] to each particle to reach a local optimum and maintain the balance constraint of the partitioning. This hybrid approach permits the search process to escape from local minima, while simultaneously reinforcing efficiency and achieving a noteworthy speedup for balanced partitioning.
6.4 Core Sort Heavy Edge Matching
Sorted heavy edge matching (SHEM) sorts vertices of the graph in ascending
order using their degrees to decide the order to visit for matching. We have
matched the vertex u with vertex v, if there is an edge of maximum weight from
vertex u to an unmatched vertex v. Highly connected groups of vertices are
identified and collapsed together in SHEM, this method is discussed in detail in
Section 2.5.1. A core number of the vertex is a maximum order of core which
contains that vertex [164]. To find core number, the vertices of the graph are to be
arranged in ascending degrees, then for each vertex v identified the vertices
adjacent to v whose degree is greater than v. Degree of all such vertices is
reduced by 1. The process continued until all vertices get core number. The
concept of graph core for coarsening power law graphs is introduced in [165].
An algorithm for finding core numbers:

Input: Graph G = (V, E)
Output: Table representing the core number of each vertex
begin
    compute the degrees of all vertices;
    arrange the vertices in ascending order of their degrees;
    for each v ∈ V in this order do
        core[v] = degree[v];
        for each u ∈ Neighbors(v) do
            if degree[u] > degree[v] then
                degree[u] = degree[u] - 1;
                reorder V accordingly;
            end if
        end for
    end for
    return the table with the core number of each vertex;
end
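The peeling procedure above can be sketched directly in Python. This is an illustrative, unoptimized version (the re-sort inside the loop mirrors the "reorder V accordingly" step rather than the bucket structure a production implementation would use); the adjacency representation is an assumption:

```python
def core_numbers(adj):
    """Core number of every vertex via min-degree peeling.
    `adj` maps each vertex to the set of its neighbours."""
    deg = {v: len(adj[v]) for v in adj}
    order = sorted(deg, key=deg.get)      # vertices in ascending order of degree
    core = {}
    while order:
        v = order.pop(0)                  # vertex of minimum remaining degree
        core[v] = deg[v]
        for u in adj[v]:
            # only neighbours of strictly larger degree lose a degree unit
            if u not in core and deg[u] > deg[v]:
                deg[u] -= 1
        order.sort(key=deg.get)           # re-sort after the degree updates
    return core
```

For a triangle with one pendant vertex attached, the triangle vertices receive core number 2 and the pendant vertex core number 1.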
The core sort heavy edge matching algorithm (CSHEM) combines SHEM with the concept of core sorting. To decide the order in which vertices are visited for matching, CSHEM first sorts the vertices of the graph in descending order of their core numbers. During the matching process, if a tie occurs between edge weights, the vertex with the highest core number is selected.
An algorithm for Core Sort Heavy Edge Matching:
Input: Graph = ( , )Output: Coarsened graph = ( , )Procedure coarsening = ( , )←←←
while ≠ ∅ do
Randomly select a vertex ∈Select the ( , ) ∈ of maximal weight
Remove , from
Add the new vertex = { , } ∈For each edge ( , ) in E do
if ( , ) ∈ then( , ) ← ( , ) + ( , )Remove ( , ) from
end if
end for
end while
return = ( , )end procedure
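The matching step at the heart of this coarsening pass can be sketched as follows. This is a simplified one-level heavy edge matching without the core-number ordering or tie-breaking of CSHEM; the weight-dictionary representation is an assumption:

```python
import random

def heavy_edge_matching(vertices, w):
    """One matching pass: visit vertices in random order and pair each
    unmatched vertex with its unmatched neighbour along the heaviest
    incident edge. `w[(u, v)]` holds edge weights with u < v."""
    adj = {v: {} for v in vertices}
    for (u, v), wt in w.items():
        adj[u][v] = wt
        adj[v][u] = wt
    matched, matching = set(), []
    order = list(vertices)
    random.shuffle(order)
    for u in order:
        if u in matched:
            continue
        candidates = [v for v in adj[u] if v not in matched]
        if candidates:
            v = max(candidates, key=lambda x: adj[u][x])  # heaviest incident edge
            matched.update((u, v))
            matching.append((u, v))
    return matching
```

Each matched pair would then be collapsed into a single coarse vertex, with parallel edges merged by summing their weights, as in the pseudocode above.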
6.5 Greedy Graph Growing Partition (GGGP)
The graph growing partition algorithm (GGP), introduced by Karypis [17], iteratively grows a set A consisting of half of the vertices, such that W(A) = W(V)/2. During the algorithm, the vertices of the graph are divided into three sets A, B, and C, where B contains the vertices adjacent to the vertices of A (the border of A) and C contains the remaining vertices. A is initialized with a randomly selected vertex of the graph, and at each iteration the vertices adjacent to A are added to A, so that A = A ∪ B. The process ends when A contains a set of vertices representing half the total weight of the graph. The partition generated by GGP depends on a proper choice of the initial vertex. The GGP algorithm has been improved using the concept of the cut gain [18] of a vertex, yielding the greedy graph growing partition algorithm (GGGP). In GGGP, the vertices of the graph are likewise divided into three sets A, B, and C, as in the GGP algorithm. A is initialized with a randomly selected vertex from V, and the sets B and C are then initialized accordingly. To grow the set A, the vertex of maximal gain in B (say u) is selected and added to A. After that, each vertex in C adjacent to u is moved to B and its gain is calculated. Similarly, the gain of each vertex in B adjacent to u is recalculated, and the next iteration starts. The process continues until the weight of A reaches half of the total weight; the algorithm ends when W(A) = W(V)/2. The GGGP algorithm generates good results for any choice of the initial vertex.
An algorithm for greedy graph growing partition:

Input: Graph G = (V, E)
Output: Set A such that W(A) = W(V)/2
procedure GGGP(G = (V, E))
    randomly select a vertex v ∈ V
    A ← {v}
    C ← V - {v}
    B ← {u ∈ C such that (u, v) ∈ E}
    compute the gains of the vertices in B
    while W(A) < W(V)/2 do
        select the vertex u ∈ B of maximal gain
        move u from B to A
        for each edge (u, x) in E do
            if x ∈ B then
                update the gain of x
            else if x ∉ A then
                add x to B
                compute the gain of x
            end if
        end for
    end while
    return A
end procedure
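The growing loop above can be sketched in Python. This is an illustrative version under simplifying assumptions: a dictionary-of-dictionaries adjacency with edge weights, explicit vertex weights, and the gain of a border vertex v taken as (edge weight from v into A) minus (edge weight from v to the outside), so that moving a positive-gain vertex reduces the cut:

```python
def gggp(adj, weight, start):
    """Grow block A from `start`, repeatedly moving the border vertex of
    maximal gain, until A holds half of the total vertex weight.
    `adj[u][v]` is the edge weight, `weight[u]` the vertex weight."""
    total = sum(weight.values())
    A = {start}
    wA = weight[start]

    def init_gain(v):
        # gain = 2 * (weight into A) - (total incident weight)
        into = sum(w for x, w in adj[v].items() if x in A)
        return 2 * into - sum(adj[v].values())

    gain = {v: init_gain(v) for v in adj[start]}   # border set B with gains
    while gain and wA < total / 2:
        u = max(gain, key=gain.get)                # border vertex of maximal gain
        del gain[u]
        A.add(u)
        wA += weight[u]
        for v, wuv in adj[u].items():
            if v in A:
                continue
            if v in gain:                          # already on the border: update
                gain[v] += 2 * wuv
            else:                                  # moved from C to the border
                gain[v] = init_gain(v)
    return A
```

On a 4-cycle with unit weights, growing from vertex 0 stops as soon as A holds two vertices, producing a bisection with cut 2.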
6.6 Multilevel Recursive Discrete Particle Swarm Optimization
(MRDPSO) Algorithm
To develop an algorithm that achieves superior-quality approximate solutions of the min-cut partitioning problem for graphs, we have used a hybrid approach in which the multilevel partitioning method is combined with DPSO.
Our multilevel recursive discrete particle swarm optimization algorithm (MRDPSO) works in three steps. The first step is an initial partitioning phase, in which MRDPSO initializes the population on the smallest graph. In the second step, i.e., during the refinement phase, it successively projects all the particles back to the next-level finer graph. In the third step, it recursively partitions the bisected graph into k parts.
For the initial partitioning of the given graph G = (V, E), we have used the GGGP approach. An effective matching-based coarsening scheme is applied during the coarsening phase: the CSHEM algorithm is used on the original graph and SHEM is applied to the coarsened graphs. For graphs with a small number of vertices, the use of core numbers yields the same matching as SHEM. Hence, for graphs with fewer than fifteen vertices, SHEM is applied directly to the original graph.
The position vector, velocity vector, and personal best vector of the nth particle on the level-m graph Gm = (Vm, Em) are X_n^m, V_n^m, and P_n^m, respectively. MRDPSO initializes the population on the smallest graph in the initial partitioning phase and successively projects all the particles X_n^m, V_n^m, P_n^m back to the next-level finer graph.
In the next stage, internal and external weights are determined for each particle. The internal weight of a vertex x for the nth particle, denoted ID_n(x), is the sum of the weights of the edges incident on x that lie within x's block, and the external weight, denoted ED_n(x), is the sum of the weights of the edges incident on x that cross to another block:

ID_n(x) = Σ_{(x,y) ∈ E, y in the same block as x} w(x, y)   (6.8)

ED_n(x) = Σ_{(x,y) ∈ E, y in a different block} w(x, y)   (6.9)
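Equations (6.8) and (6.9) can be sketched as a single pass over the adjacency structure; the dictionary representations of the graph and the partitioning vector are assumptions for illustration:

```python
def id_ed(adj, part):
    """Internal weight ID(v) and external weight ED(v) of every vertex
    for a partitioning vector `part` (equations 6.8 and 6.9).
    `adj[u][v]` is the weight of edge (u, v)."""
    ID, ED = {}, {}
    for v in adj:
        ID[v] = sum(w for u, w in adj[v].items() if part[u] == part[v])
        ED[v] = sum(w for u, w in adj[v].items() if part[u] != part[v])
    return ID, ED
```

The boundary vertices stored in the hash table are then exactly those with ED(v) > 0.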
The boundary vertices of each particle, i.e., those with positive external weight, are stored in a boundary hash table. The initialization phase starts at time t = 0, in which the internal weights, external weights and boundary hash table are computed. The structure of MRDPSO consists of a nested loop. The stopping criterion is decided by the outer loop: MRDPSO is run for at most a maximum number of cycles. The velocity of each particle V_n is adjusted by the boundary refinement policy in the inner loop of MRDPSO. A maximum value Vmax is enforced on the velocity: if a velocity component exceeds this threshold, it is set to Vmax. Vertices are moved between the partitions and the particle's position X_n is updated; this moving may break the balance constraint ε. To maintain the balance constraint of X_n, we prefer to move the vertex with the highest gain from the larger block. During these moves, it is very important to inspect and compare boundary vertices using a data structure that permits the gains of vertices to be stored, retrieved, and updated quickly. Here, we have used a last-in first-out scheme to achieve an efficient MRDPSO. The internal and external weights of the particle play an important role in computing the gains and boundary vertices for an easy implementation of MRDPSO. At each iteration, the weights of all the neighboring vertices of the moved vertex are updated to maintain consistency of the internal and external weights. The boundary hash table also changes with respect to changes in the partitioning X_n.
In the third step, we apply a recursive algorithm for k-partitioning of the bisected graph generated in the first two steps of MRDPSO.
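A single vertex move with the bookkeeping described above can be sketched as follows. This is an illustrative fragment for a bisection: it flips the vertex to the other block and refreshes the boundary set for the vertex and its neighbours (a vertex is on the boundary exactly when its external weight is positive); the set-based boundary structure stands in for the boundary hash table:

```python
def move_vertex(adj, part, boundary, v):
    """Flip v to the other block of a bisection and refresh the boundary
    set for v and its neighbours. `adj[u][x]` is the edge weight and
    `part[u]` in {0, 1} the block of u."""
    part[v] = 1 - part[v]
    for x in [v] + list(adj[v]):
        # recompute the external weight ED(x) under the new partitioning
        ed = sum(w for u, w in adj[x].items() if part[u] != part[x])
        if ed > 0:
            boundary.add(x)       # x now touches the cut
        else:
            boundary.discard(x)   # x became internal to its block
```

The gain of a candidate move is ED(v) - ID(v): a positive value means the move reduces the cut, which is why the refinement policy prefers high-gain boundary vertices of the larger block.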
6.7 Result and Discussion
To assess the performance of the developed multilevel recursive discrete particle swarm optimization (MRDPSO) algorithm, we carried out experiments on ordinary graphs as well as hypergraphs. For the former, we have used the Walshaw Graph Partitioning Test Bench [82], as characterized in Table 6.1.
Graph         |V|      |E|       Max deg  Min deg  Avg deg
add20         2395     7462      123      1        6.23
data          2851     15093     17       3        10.59
3elt          4720     13722     9        3        5.81
uk            4824     6837      3        1        2.83
add32         4960     9462      31       1        3.82
bcsstk33      8738     291583    140      19       66.76
whitaker3     9800     28989     8        3        5.92
crack         10240    30380     9        3        5.93
wing-nodal    10937    75488     28       5        13.80
fe_4elt2      11143    32818     12       3        5.89
vibrobox      12328    165250    120      8        26.8
bcsstk29      13992    302748    70       4        43.27
4elt          15606    45878     10       3        5.88
fe_sphere     16386    49152     6        4        5.99
cti           16840    48232     6        3        5.73
memplus       17758    54196     573      1        6.10
cs4           22499    43858     4        2        3.90
bcsstk30      28924    1007284   218      3        69.65
bcsstk31      35588    572914    188      1        32.19
fe_pwt        36519    144794    15       0        7.93
bcsstk32      44609    985046    215      1        44.16
fe_body       45097    163734    28       0        7.26
t60k          60005    89440     3        2        2.98
wing          62032    121544    4        2        2.57
brack2        62631    366559    32       3        11.71
finan512      74752    261120    54       2        6.99
fe_tooth      78136    452591    39       3        11.58
fe_rotor      99617    662431    125      5        13.30
598a          110971   741934    26       5        13.37
fe_ocean      143437   409593    6        1        5.71

Table 6.1: Walshaw Graph Partitioning Test Benchmark
For hypergraphs, the ISPD98 benchmark suite [166] is used. The details of these hypergraphs are given in Table 6.2.
Hypergraphs Vertices Hyperedges
ibm01 12752 14111
ibm02 19601 19584
ibm03 23136 27401
ibm04 27507 31970
ibm05 29347 28446
ibm06 32498 34826
ibm07 45926 48117
ibm08 51309 50513
ibm09 53395 60902
ibm10 69429 75196
ibm11 70558 81454
ibm12 71076 77240
ibm13 84199 99666
ibm14 147605 152772
ibm15 161570 186608
ibm16 183484 190048
ibm17 185495 189581
ibm18 210613 201920
Table 6.2: ISPD98 Benchmark Suite
6.7.1 Performance Evaluation for Ordinary Graphs
We have compared the partitions obtained by our MRDPSO algorithm, within a short computing time limited to three minutes, with those of the state-of-the-art graph partitioning packages METIS [167] and CHACO [168] and with the multilevel iterated tabu search (MITS) [169]. Additionally, we have compared our results with the best partitions ever reported in the partitioning archive.
The benchmark graphs and their characteristics are listed in Table 6.1. We have used the multilevel p-METIS algorithm for METIS, and for CHACO the multilevel KL algorithm with recursive bisection is chosen. For MITS, the parameter values are 0.1, 0.01|V| to 0.02|V|, and a coarsening threshold of 100. Since METIS and CHACO do not permit randomized repetitions of the algorithm, we chose to run MRDPSO only once. Furthermore, the optimal set of parameter values chosen for MRDPSO is w = 1, c1 = c2 = 0.5, Vmax = 4, N = 30, and 20 cycles. The cutoff limit varies with the size of the graphs, from one second for graphs with up to 4000 vertices to three minutes for the largest graphs.
In the results, METIS, CHACO, tabu search and our approach are labeled p-METIS, CHACO, MITS and MRDPSO, respectively. Cut values for the partitions of each graph for different values of k are given in Tables 6.3, 6.4, 6.5 and 6.6. If a partition is perfectly balanced, then the balance constraint has value 1. For situations in which the partition is not perfectly balanced, the balance constraint value is given in parentheses along with the cut value. The number of times each algorithm produces the best partition over the benchmark graphs is given in the last row of each table. The CSHEM coarsening used in MRDPSO greatly reduces the size of the graph, which helps GGGP generate a better balanced partition, while the application of DPSO in the most complex and time-consuming refinement stage helps reduce the cut value even within a small cutoff limit. Hence, from the results, we can observe that our MRDPSO approach performs much better in terms of cut value than the other three approaches.
From the results, we can also observe that for k ≥ 32 a few partitions produced by p-METIS, MITS and our approach are not perfectly balanced, but in our approach this imbalance occurs far less often than in the other two.
Graph         k = 2                                  k = 4
              p-METIS  CHACO   MITS    MRDPSO        p-METIS  CHACO   MITS    MRDPSO
add20         729      742     708     522           1292     1329    1224    3
data          218      199     189     183           480      433     427     357
3elt          108      103     90      90            231      234     214     185
uk            23       36      23      15            67       69      47      33
add32         21       11      11      11            42       56      40      37
bcsstk33      10205    10172   10171   10168         23131    23723   22492   20989
whitaker3     135      131     127     119           406      425     385     365
crack         187      225     184     179           382      445     371     343
wing-nodal    1820     1823    1797    1782          4000     4022    3715    3439
fe_4elt2      130      144     130     125           359      402     423     314
vibrobox      12427    11367   11184   9897          21471    21774   19811   18589
bcsstk29      2843     3140    2852    2847          8826     9202    8572    8371
4elt          154      158     139     127           406      433     390     299
fe_sphere     440      424     386     375           872      852     774     697
cti           334      372     366     354           1113     1117    1039    837
memplus       6337     7549    5696    5391          10559    11535   9982    9672
cs4           414      517     377     352           1154     1166    987     991
bcsstk30      6458     6563    9812    6298          17685    17106   22436   15931
bcsstk31      3638     3391    2820    2520          8770     9199    8751    6954
fe_pwt        366      362     360     340           738      911     1249    753
bcsstk32      5672     6137    6936    4421          12205    15704   9864    8674
fe_body       311      1036    271     273           957      1415    728     487
t60k          100      91      86      68            255      235     226     219
wing          950      901     861     657           2086     1982    1770    1599
brack2        738      976     731     719           3250     3462    3291    2878
finan512      162      162     162     162           324      325     405     313
fe_tooth      4297     4642    3827    3803          8577     8430    7460    6334
fe_rotor      2190     2151    2122    1998          8564     8215    7765    6799
598a          2504     2465    2402    2297          8533     8975    8159    8092
fe_ocean      505      499     468     439           2039     2110    2850    1785
Total         3        2       3       27            1        0       1       28

Table 6.3: Comparison of MRDPSO with p-METIS, CHACO, and MITS for k = 2, 4
Graph         k = 8                                  k = 16
              p-METIS  CHACO   MITS    MRDPSO        p-METIS  CHACO   MITS    MRDPSO
add20         1907     1867    1750    1537          2504     2297    2120    1995
data          842      783     679     621           1370     1360    1167    1095
3elt          388      389     352     337           665      660     585     549
uk            101      119     113     76            189      211     163     118
add32         81       115     74      59            128      174     143     99
bcsstk33      40070    39070   34568   34681         59791    61890   55539   52727
whitaker3     719      765     672     639           1237     1218    1120    1103
crack         773      777     720     611           1255     1253    1157    1159
wing-nodal    6070     6147    5481    5387          9290     9273    8405    8189
fe_4elt2      654      718     621     625           1152     1135    1039    976
vibrobox      28177    33362   24840   23897         37441    43064   34392   30949
bcsstk29      16555    18158   17014   12887         28151    28029   26055   21356
4elt          635      688     615     497           1056     1083    1005    842
fe_sphere     1330     1302    1243    1099          2030     2037    1855    1942
cti           2110     2102    1838    1763          3181     3083    3033    2908
memplus       13110    14265   12640   12341         14942    16433   14097   12344
cs4           1746     1844    1529    1387          2538     2552    2293    1967
bcsstk30      36357    37406   36373   33741         77293    81069   79265   75333
bcsstk31      16012    15551   15262   15685         27180    28557   25787   22899
fe_pwt        1620     1670    1531    1389          2933     3200    2857    2789
bcsstk32      23601    25719   24435   19987         43371    47829   39902   35742
fe_body       1348     2277    1293    981           2181     2947    2076    2098
t60k          561      524     522     389           998      977     937     765
wing          3205     3174    2686    2476          4666     4671    4188    3747
brack2        7844     8026    7644    7432          12655    13404   12240   10995
finan512      810      648     729     689           1377     1296    1458    1087
fe_tooth      13653    13484   12083   10985         19346    20887   18336   15435
fe_rotor      15712    15244   13558   11037         23863    23936   21687   18767
598a          17276    17530   16270   13887         28922    29674   26565   26189
fe_ocean      4516     5309    4272    3995          9613     9690    8397    6758
Total         0        1       2       27            0        0       2       28

Table 6.4: Comparison of MRDPSO with p-METIS, CHACO, and MITS for k = 8, 16
Graph         k = 32
              p-METIS        CHACO    MITS           MRDPSO
add20         NAN            2684     2524 (1.03)    2243
data          2060 (1.01)    2143     1933 (1.01)    1732 (1.01)
3elt          1093           1106     1053           972
uk            316 (1.01)     343      296 (1.01)     221
add32         288 (1.01)     303      266 (1.01)     198 (1.01)
bcsstk33      86008          84613    90438          70535
whitaker3     1891           1895     1758           1572
crack         1890           1962     1741           1632
wing-nodal    13237          13258    12238          10989
fe_4elt2      1787           1796     1688           1672
vibrobox      46112          51006    47048 (1.01)   37847
bcsstk29      41190          42935    38346          32417
4elt          1769           1766     1631           1689
fe_sphere     2913           2920     2701           2731
cti           4605           4532     4479           3989
memplus       17303          17936    NAN            12811
cs4           3579           3588     3137           2639
bcsstk30      131405         128694   117414         108331
bcsstk31      42645          45354    40029          38256
fe_pwt        6029           6036     6596           5138
bcsstk32      70020          73377    64138          58003
fe_body       3424           4194     3290           2491
t60k          1613           1594     1539           1549
wing          6700           6843     6067           5831
brack2        19786          20172    18411          16511
finan512      2592           2592     2592           2592
fe_tooth      29215          29849    26110          22887
fe_rotor      36225          36367    32746          30999
598a          44760          45780    40980          35767
fe_ocean      14613          15059    13358          11789
Total         1              1        4              27

Table 6.5: Comparison of MRDPSO with p-METIS, CHACO, and MITS for k = 32
Graph         k = 64
              p-METIS        CHACO    MITS           MRDPSO
add20         3433 (1.07)    3349     3219 (1.03)    2789 (1.01)
data          3116 (1.03)    3145     2924 (1.07)    2645
3elt          1710           1722     1606 (1.03)    1487
uk            495 (1.02)     540      496 (1.03)     395
add32         626 (1.02)     730      571 (1.01)     442 (1.01)
bcsstk33      116203 (1.01)  115530   131639 (1.05)  116317
whitaker3     2796 (1.01)    2811     2628           2231
crack         2847 (1.01)    2904     2628 (1.01)    2139
wing-nodal    17899 (1.01)   17783    16258 (1.01)   13669 (1.01)
fe_4elt2      2765 (1.01)    2781     2590           2591
vibrobox      53764 (1.01)   58392    54503          44665
bcsstk29      62891 (1.01)   63576    59548 (1.01)   54489 (1.01)
4elt          2953           2921     2676 (1.01)    2517
fe_sphere     4191           4151     3776           3789
cti           6461           6334     6181           5441
memplus       19140 (1.01)   18978    NAN            15899
cs4           4791           4817     4286           3972
bcsstk30      191691         191446   175845 (1.02)  168436
bcsstk31      66526          68375    61154          62003
fe_pwt        9310           9231     8487           8054
bcsstk32      106733         108855   96197          88673
fe_body       5843           6326     5097           4613
t60k          2484           2506     2345           1854
wing          9405           9308     8123           7389
brack2        28872          29223    27130          26442
finan512      10842          11962    11077          9782
fe_tooth      40162          40306    35988          32327
fe_rotor      53623          52947    48206          42489
598a          64307          65094    57303          53092
fe_ocean      23317          22692    21212          18985
Total         0              0        3              27

Table 6.6: Comparison of MRDPSO with p-METIS, CHACO, and MITS for k = 64
As will be noticed from the second experiment, some graphs require more computational time, and hence more iterations, to establish a balanced partition. CHACO, in contrast, generates perfectly balanced partitions for all values of k due to its use of recursive bisection.
We have also compared the performance of the developed algorithm with the best balanced partitions reported in the Graph Partitioning archive [82]. Most of these results are generated by the algorithm developed by Schulz et al. [170], which combines an evolutionary approach with the JOSTLE multilevel method. This approach requires a large running time (around one week on a normal machine for larger graphs), since each run consists of almost 50,000 calls. Other best results are reported by the approaches used in [171, 172].
For the second experiment, we increased the cutoff limits, ranging from one minute for the smallest graph to one hour for the largest graph, and ran the MRDPSO algorithm ten times for each value of k to evaluate the change in cut value. Tables 6.7, 6.8 and 6.9 give the cut values of the best balanced partitions ever reported in the partitioning archive, the cut values generated by MRDPSO, and the standard deviations over the ten executions of our algorithm. From the results, it can be observed that the imbalance is completely removed due to the increase in cutoff time and number of executions, but at the same time the cut values remain the same in most of the cases even though the time limit is increased. Hence, the use of DPSO in partitioning yields an optimal solution with less time and fewer executions. In many cases, the cut values generated by MRDPSO are even better than the best reported partitions.
Graph         k = 2                      k = 4
              BEST     MRDPSO   SD      BEST     MRDPSO   SD
add20 596 522 52.3 1154 1089 46
data 189 183 4.24 382 357 17.7
3elt 90 90 0 201 185 11.3
uk 19 15 2.83 41 33 5.66
add32 11 11 0 34 37 2.12
bcsstk33 10171 10168 2.12 21717 20989 515
whitaker3 127 119 5.66 388 365 11.3
crack 184 179 3.54 366 343 16.3
wing-nodal 1707 1782 53 3575 3439 96.2
fe_4elt2 130 125 3.54 349 314 24.7
vibrobox 10343 9897 315 18976 18589 274
bcsstk29 2843 2847 2.83 8035 8371 238
4elt 139 127 8.49 326 299 19.1
fe_sphere 386 375 7.78 768 697 50.2
cti 334 354 14.1 954 837 82.7
memplus 5513 5391 86.3 9448 9672 158
cs4 369 352 12 932 991 29
bcsstk30 6394 6298 67.9 16651 15931 509
bcsstk31 2762 2520 171 7351 6954 281
fe_pwt 340 340 0 705 753 21.2
bcsstk32 4667 4421 174 9314 8674 453
fe_body 262 273 7.78 599 487 79.2
t60k 79 68 7.78 209 219 7.07
wing 789 657 93.3 1623 1599 17
brack2 731 719 8.49 3084 2878 146
finan512 162 162 0 324 313 7.78
fe_tooth 3816 3803 9.19 6889 6334 392
fe_rotor 2098 1998 70.7 7222 6799 299
598a 2398 2297 71.4 8001 8092 64.3
fe_ocean 464 439 17.7 1882 1785 68.6
Total 4 26 5 25
Table 6.7: Comparison of MRDPSO with BEST reported cut for k = 2, 4
Graph         k = 8                      k = 16
              BEST     MRDPSO   SD      BEST     MRDPSO   SD
add20 1686 1537 105 2047 1995 36.77
data 668 621 33.2 1127 1095 22.63
3elt 345 337 5.66 573 549 16.97
uk 84 76 5.66 146 118 19.8
add32 67 59 5.66 118 99 13.44
bcsstk33 34437 34681 173 54680 52727 1381
whitaker3 656 639 12 1088 1103 10.61
crack 679 611 48.1 1088 1159 50.2
wing-nodal 5435 5387 33.9 8334 8189 102.5
fe_4elt2 607 625 12.7 1007 976 21.92
vibrobox 24484 23897 415 31892 30949 666.8
bcsstk29 23986 12887 848 21958 21356 425.7
4elt 545 497 33.9 934 842 65.05
fe_sphere 1156 1099 40.3 1714 1942 121.6
cti 1788 1763 17.7 2793 2908 81.32
memplus 11712 12341 445 12895 12344 389.6
cs4 1440 1387 37.5 2075 1967 76.37
bcsstk30 34846 33741 781 70440 75333 3460
bcsstk31 13285 15685 1697 23869 22899 685.9
fe_pwt 1447 1389 41 2830 2789 28.99
bcsstk32 20070 19987 58.7 36250 35742 359.2
fe_body 1033 981 36.8 1736 2098 256
t60k 456 389 47.4 813 765 33.94
wing 2504 2476 19.8 3876 3747 91.22
brack2 7140 7432 206 11644 10995 458.9
finan512 648 689 29 1296 1087 147.8
fe_tooth 11418 10985 306 17355 15435 1358
fe_rotor 12841 11037 1276 20391 18767 1148
598a 15922 13887 1439 25792 26189 280.7
fe_ocean 4188 3995 136 7713 6758 675.3
Total 6 24 6 24
Table 6.8: Comparison of MRDPSO with BEST reported cut for k = 8, 16
Graph         k = 32                     k = 64
              BEST     MRDPSO   SD      BEST     MRDPSO   SD
add20 2362 2243 84.1 2959 2684 120.208
data 1799 1699 47.4 2839 2645 137.179
3elt 960 972 8.49 1532 1487 31.8198
uk 254 221 23.3 408 395 9.19239
add32 213 191 10.6 485 419 30.4056
bcsstk33 77414 70535 4864 107247 116317 213.46
whitaker3 1668 1572 67.9 2491 2231 183.848
crack 1679 1632 33.2 2541 2139 284.257
wing-nodal 11768 10989 551 15775 13443 1489.17
fe_4elt2 1614 1672 41 2478 2591 37.4767
vibrobox 39477 37847 1153 46653 44665 1405.73
bcsstk29 34968 32417 1804 55521 54376 729.734
4elt 1551 1689 43.8 2566 2517 34.6482
fe_sphere 2490 2731 170 3547 3789 100.409
cti 4049 3989 42.4 5630 5441 133.643
memplus 13953 12811 808 16223 15899 229.103
cs4 2928 2639 204 4027 3972 38.8909
bcsstk30 113443 108331 3615 171279 168436 2010.3
bcsstk31 37158 38256 776 57402 62003 3253.4
fe_pwt 5575 5138 309 8202 8054 104.652
bcsstk32 60038 58003 1439 90895 88673 1571.19
fe_body 2846 2491 251 4801 4613 132.936
t60k 1323 1549 123 2077 1854 157.685
wing 5594 5831 168 7625 7389 166.877
brack2 17387 16511 619 25808 26442 448.306
finan512 2592 2592 0 10560 9782 550.129
fe_tooth 24885 22887 1413 34240 32327 1352.7
fe_rotor 31141 30999 100 45687 42489 2261.33
598a 38581 35767 1990 56097 53092 2124.86
fe_ocean 12684 11789 633 20069 18985 766.504
Total 6 24 3 27
Table 6.9: Comparison of MRDPSO with BEST reported cut for k = 32, 64
Figures 6.1 and 6.2 show the improvement in the partitioning cut obtained by MRDPSO relative to the best cut reported in the partitioning archive for k = 2, 4, 8 and for k = 16, 32, 64, respectively. Points lying below 1.0 in the curves indicate that MRDPSO performs better than the other approaches. It can be observed that our approach shows an improvement for 86%, 83%, 80%, 80%, 80% and 90% of the graphs for k = 2, 4, 8, 16, 32 and 64, respectively.
The results show that overall performance of the developed MRDPSO algorithm
is noteworthy for generating balanced partition in significantly less time.
Fig 6.1: Relative improvement in the partitioning cut for k = 2, 4, 8
Fig 6.2: Relative improvement in the partitioning cut for k = 16, 32, 64
6.7.2 Performance Evaluation for Hypergraphs
We have used 18 hypergraphs from the ISPD98 benchmark suite, with vertex counts ranging from 12752 to 210613 and hyperedge counts ranging from 14111 to 201920, for the performance evaluation of our algorithm. The characteristics of these hypergraphs are listed in Table 6.2. We compared the results produced by the MRDPSO algorithm with those derived by hMETIS recursive bisection (hMETIS-RB) [173] and hMETIS k-way partitioning (hMETIS-kway) [174]. We have used the CSHEM algorithm during the coarsening phase, the GGGP algorithm, which consistently detects smaller edge cuts than the other algorithms, during the initial
partitioning phase, and boundary KL is used for refinement before MRDPSO is
applied. To guarantee the statistical significance of the results, we run the algorithm
fifteen times. Furthermore, the optimal set of parameter values chosen for
MRDPSO is = 1, = = 0.5, = 4, = 30, = 20, and the balance
constraint is ε = 1.0.
Tables 6.10, 6.11 and 6.12 give the number of hyperedges cut during
partitioning by the MRDPSO algorithm, hMETIS recursive bisection (hMETIS – RB) and
hMETIS k – way partitioning (hMETIS – k way) for k = 8, 16 and 32 partitions,
respectively. The last row gives the total time, in minutes, required by each method
over all eighteen hypergraphs. From the results, it can be observed
that MRDPSO achieves better cut values in considerably less computing time. This is
because the DPSO refinement heuristic does an excellent job of optimizing the
objective function as it is applied successively to the finer graphs during
uncoarsening. Furthermore, the results indicate that MRDPSO offers the
additional benefit of producing high quality partitionings while enforcing tight
balancing constraints.
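The uncoarsening behaviour described above, projecting the partition from each coarse graph onto the next finer graph and refining it there, can be sketched as follows. The data layout and function names are hypothetical; this is a sketch of the generic multilevel scheme, not the thesis implementation:

```python
def uncoarsen_and_refine(levels, coarse_partition, refine):
    """Project a partition up through the multilevel hierarchy, refining at
    each level. `levels` is ordered from coarsest to finest; each entry maps
    a finer vertex to the coarse vertex it was contracted into.
    (Hypothetical layout; a sketch of the generic multilevel scheme.)"""
    partition = coarse_partition
    for coarse_map in levels:
        # Each finer vertex inherits the part of its coarse representative,
        partition = {v: partition[c] for v, c in coarse_map.items()}
        # then the refinement heuristic (e.g. boundary KL or DPSO) is applied.
        partition = refine(partition)
    return partition

# Toy hierarchy: fine vertices 0,1 were contracted into coarse vertex 0; 2,3 into 1.
fine_map = {0: 0, 1: 0, 2: 1, 3: 1}
print(uncoarsen_and_refine([fine_map], {0: "A", 1: "B"}, lambda p: p))
# {0: 'A', 1: 'A', 2: 'B', 3: 'B'}
```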
Figures 6.3 and 6.4 show the improvement in the partitioning cut obtained by
MRDPSO relative to hMETIS recursive bisection (hMETIS – RB) and hMETIS k –
way partitioning (hMETIS – k way) for k = 8, 16 and 32, respectively. Points below
1.0 indicate that MRDPSO performs better than hMETIS – RB and hMETIS – k way.
The results illustrate that MRDPSO produces partitions whose cut is far better
than the cut obtained by the other two methods.
Hypergraph k = 8
hMETIS – RB hMETIS– k way MRDPSO
ibm01 760 795 382
ibm02 1720 1790 930
ibm03 2503 2553 1363
ibm04 2857 2902 2323
ibm05 4548 4464 2592
ibm06 2452 2397 938
ibm07 3454 3422 2417
ibm08 3696 3544 3893
ibm09 2756 2680 1349
ibm10 4301 4263 2580
ibm11 3592 3713 2873
ibm12 5913 6183 4139
ibm13 3042 2744 1825
ibm14 5501 5244 4125
ibm15 6816 6855 4089
ibm16 6871 6737 5496
ibm17 9341 9420 7753
ibm18 5310 5540 4981
Run - Time 364.53 175.86 113.35
Table 6.10: Comparison of MRDPSO with hMETIS – RB and hMETIS– k way for k = 8
Hypergraph k = 16
hMETIS – RB hMETIS– k way MRDPSO
ibm01 1258 1283 632
ibm02 3150 3210 1732
ibm03 3256 3317 1963
ibm04 3989 3896 3187
ibm05 5465 5612 3943
ibm06 3356 3241 1231
ibm07 4804 4764 3663
ibm08 4916 4718 5187
ibm09 3902 3968 2189
ibm10 6190 6209 4376
ibm11 5260 5371 4063
ibm12 8540 8569 6872
ibm13 5522 5329 2953
ibm14 8362 8293 6761
ibm15 8691 9201 5991
ibm16 10230 10250 8331
ibm17 15088 15206 11445
ibm18 8860 9025 7931
Run - Time 431.902 237.125 192.571
Table 6.11: Comparison of MRDPSO with hMETIS – RB and hMETIS– k way for k = 16
Hypergraph k = 32
hMETIS – RB hMETIS– k way MRDPSO
ibm01 1723 1702 936
ibm02 4412 4380 2342
ibm03 4064 4120 2987
ibm04 5094 5050 4185
ibm05 6211 5948 3341
ibm06 4343 4231 1642
ibm07 6300 6212 4899
ibm08 6489 6154 6503
ibm09 5502 5490 2986
ibm10 8659 8612 6138
ibm11 7514 7534 6841
ibm12 11014 11392 9322
ibm13 7541 7610 3941
ibm14 12681 12838 10532
ibm15 13342 13853 9643
ibm16 15589 15335 11189
ibm17 20175 19812 14767
ibm18 13410 13102 12890
Run - Time 505.42 326.20 274.459
Table 6.12: Comparison of MRDPSO with hMETIS – RB and hMETIS– k way for k = 32
Fig 6.3: Improvement in the partitioning cut obtained by MRDPSO relative to hMETIS – RB
for k = 8, 16, 32
Fig 6.4: Improvement in the partitioning cut obtained by MRDPSO relative to hMETIS – k wayfor k = 8, 16, 32
Table 6.13 gives the improvement percentage over hMETIS – RB and hMETIS – k way
corresponding to the results obtained by the MRDPSO algorithm, for k = 8, 16 and
32 partitions. The average improvement over hMETIS – RB is 32.14%, 29.41% and
28.76% for k = 8, 16 and 32, respectively, while the average improvement over
hMETIS – k way is 31.94%, 29.59% and 28.28% for k = 8, 16 and 32, respectively.
Hypergraph  Improvement % over hMETIS – RB (k = 8, 16, 32)  Improvement % over hMETIS – k way (k = 8, 16, 32)
ibm01 49.73% 49.76% 45.67% 51.94% 50.74% 45.00%
ibm02 45.93% 45.01% 46.91% 48.04% 46.04% 46.52%
ibm03 45.54% 39.71% 26.50% 46.61% 40.82% 27.5%
ibm04 18.69% 20.10% 17.84% 19.95% 18.19% 17.12%
ibm05 43.00% 27.85% 46.20% 41.93% 29.73% 43.82%
ibm06 61.74% 63.31% 62.19% 60.86% 62.01% 61.19%
ibm07 30.02% 23.75% 22.23% 29.36% 23.11% 21.13%
ibm08 -5.33% -5.51% -0.21% -9.84% -9.94% -5.67%
ibm09 51.05% 43.90% 45.72% 49.66% 44.83% 45.61%
ibm10 40.01% 29.30% 29.11% 39.47% 29.52% 28.72%
ibm11 20.01% 22.75% 8.95% 22.62% 24.35% 9.19%
ibm12 30.00% 19.53% 15.36% 33.05% 19.80% 18.17%
ibm13 40.00% 46.52% 47.73% 33.49% 44.58% 48.21%
ibm14 25.01% 19.14% 16.94% 21.33% 18.47% 17.96%
ibm15 40.00% 31.06% 27.72% 40.35% 34.88% 30.39%
ibm16 20.01% 18.56% 28.22% 18.42% 18.72% 27.03%
ibm17 17.00% 24.14% 26.80% 17.70% 24.73% 25.46%
ibm18 6.19% 10.48% 3.87% 10.09% 12.12% 1.61%
Average 32.14% 29.41% 28.76% 31.94% 29.59% 28.28%
Table 6.13: Improvement percentage over hMETIS – RB and hMETIS – k way corresponding to the
results obtained by the MRDPSO algorithm, for k = 8, 16 and 32 partitions.
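The improvement percentages in Table 6.13 follow directly from the cut values in Tables 6.10 to 6.12 as the relative reduction in cut. A minimal check for ibm01 at k = 8 (small rounding differences aside):

```python
def improvement_pct(baseline_cut, mrdpso_cut):
    """Percentage reduction in hyperedge cut achieved by MRDPSO over a baseline."""
    return 100.0 * (baseline_cut - mrdpso_cut) / baseline_cut

# ibm01, k = 8 (cuts from Table 6.10): hMETIS-RB 760, hMETIS-k way 795, MRDPSO 382
print(round(improvement_pct(760, 382), 2))  # 49.74 (table reports 49.73%)
print(round(improvement_pct(795, 382), 2))  # 51.95 (table reports 51.94%)
```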
Swarm intelligence based Ant Colony Optimization (ACO) has been used for graph
bisection [20]. We used the same set of hypergraphs to compare our MRDPSO based
approach with the bipartitioning results obtained by ACO. The quality measures used
for comparison are min – cut and average – cut, obtained over fifteen runs with a
different random seed for each. Table 6.14 shows the min – cut and average – cut
obtained by ACO and MRDPSO, their relative performance, and the improvement
percentage relative to ACO.
Table 6.14: Min – cut and average – cut obtained by ACO and MRDPSO, relative performance and improvement percentage
Hypergraph  ACO (Min-Cut, Avg-Cut)  MRDPSO (Min-Cut, Avg-Cut)  Relative Performance (Min-Cut, Avg-Cut)  Improvement % (Min-Cut, Avg-Cut)
ibm01 67 95 19 41 0.28358 0.43316 71.64% 56.68%
ibm02 171 440 174 300 1.01754 0.68213 -1.75% 31.79%
ibm03 623 851 544 968 0.87319 1.13648 12.68% 13.65%
ibm04 323 504 276 471 0.85449 0.93458 14.55% 6.54%
ibm05 1322 1583 640 902 0.48411 0.56987 51.58% 43.01%
ibm06 352 997 188 923 0.53409 0.92619 46.59% 7.38%
ibm07 375 464 269 345 0.71733 0.74296 28.26% 25.70%
ibm08 761 1100 824 1104 1.08279 1.00362 -8.27% -0.36%
ibm09 364 449 345 368 0.9478 0.8207 5.22% 17.93%
ibm10 629 1060 580 820 0.9221 0.77393 7.79% 22.61%
ibm11 388 676 375 478 0.96649 0.70759 3.35% 29.24%
ibm12 1064 1715 1071 1464 1.00658 0.85365 -0.65% 14.63%
ibm13 682 1366 548 905 0.80352 0.6625 19.64% 33.75%
ibm14 1095 1759 1075 1524 0.98174 0.86627 1.82% 13.37%
ibm15 1291 3349 1079 3392 0.83579 1.01295 16.42% -1.30%
ibm16 1012 1448 960 1534 0.94862 1.05939 5.13% -5.94%
ibm17 1317 1775 1258 1779 0.9552 1.00175 4.47% -0.17%
ibm18 1412 1592 1383 1873 0.97946 1.17617 2.05% 17.62%
Average 0.84413 0.85355 15.58% 14.65%
Figure 6.5 represents the improvement percentage achieved by MRDPSO in Min – cut
and Average – cut relative to ACO.
Our approach shows an improvement varying from -8.27% to 71.64%, with an average
of 15.58%, in Min – Cut, and from -5.94% to 56.68%, with an average of 14.65%, in
Avg – Cut, relative to the bisections obtained by the ACO based algorithm.
The complete evaluation on all 18 hypergraphs using the ACO based algorithm took
two hours, whereas MRDPSO takes one hour and eighteen minutes.
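The relative-performance and improvement columns in the table above are simple ratios of the reported cuts; reproducing the ibm01 min-cut entries as a check:

```python
def relative_performance(mrdpso_cut, aco_cut):
    """Ratio of MRDPSO cut to ACO cut; below 1.0 means MRDPSO is better."""
    return mrdpso_cut / aco_cut

def improvement_pct(mrdpso_cut, aco_cut):
    """Percentage improvement of MRDPSO over ACO."""
    return 100.0 * (1.0 - mrdpso_cut / aco_cut)

# ibm01 min-cut values from the table: ACO 67, MRDPSO 19
print(round(relative_performance(19, 67), 5))  # 0.28358
print(round(improvement_pct(19, 67), 2))       # 71.64
```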
Fig 6.5: Improvement percentage in Min – cut and Average – cut for bipartitioning.
6.10 Conclusion
In this chapter, we have presented MRDPSO, a multilevel recursive discrete PSO
approach for balanced graph k – partitioning. The developed algorithm follows
the fundamental concept of the multilevel graph partitioning method and integrates
a powerful DPSO refinement procedure. We comprehensively evaluated the
performance of the algorithm on a collection of graphs from the Graph Partitioning Archive
for ordinary graphs as well as hypergraphs, for different values of k, namely
k = 2, 4, 8, 16, 32, 64.
From the results and comparisons, we observe that the overall performance of the
developed MRDPSO algorithm is remarkable, generating balanced partitions in
significantly less computing time than METIS, CHACO and Tabu search.
The MRDPSO k – way partitioning approach substantially outperforms
hMETIS – RB and hMETIS – k way, both in minimizing the hyperedge cut and in
minimizing the computation time. Furthermore, our experimental results show
that the MRDPSO algorithm is more competent than the ACO based algorithm for
bipartitioning.