An Investigation on FPGA Placement Using Mixed Genetic Algorithm with Simulated Annealing
Meng Yang
Napier University
Edinburgh, UK
Overview
• Placement problem definition
• Symmetrical FPGA general architecture
• Proposed algorithm
• Experimental results
• Conclusions
FPGA Placement Definition
• Constraints
  • Some fixed I/O pads
  • Architecture
• Problem definition
  Given a netlist, find exact locations for the FPGA logic blocks that satisfy the constraints and minimize the wire length required for routing
FPGA General Architecture
[Figure: symmetrical FPGA architecture showing CLBs, IOBs and switch blocks; CLB detail with inputs (In), a 4-input LUT, a D flip-flop with Clock, and output (Out)]
Genotype
• The chromosome structure is (L1, L2, L3, ……, LN)
• Chromosome length, N, depends on the size of the FPGA, K
• The location of a CLB is calculated as P = (x-1)×K + (y-1)
Position:   0   1   2   3   …  13  14  15
Block:     -1   5   2   9   …   6  -1  10
[Figure: 4×4 array of CLB locations (1,1) to (4,4) with the numbered blocks of the chromosome placed on it]
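The index formula on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and reading -1 as "no block at this location" is an assumption based on the chromosome shown above.

```python
EMPTY = -1  # assumed meaning of -1 in the slide's chromosome: unoccupied location


def position(x, y, k):
    """Chromosome index P of grid location (x, y), both 1-based, on a K x K array."""
    return (x - 1) * k + (y - 1)


def location(p, k):
    """Inverse of position(): recover the 1-based (x, y) from index p."""
    return p // k + 1, p % k + 1


def block_locations(chromosome, k):
    """Map every placed block number to its (x, y) location."""
    return {block: location(p, k)
            for p, block in enumerate(chromosome) if block != EMPTY}


# 3 x 3 example: the first chromosome shown on the crossover slide
chrom = [6, -1, 1, 4, 5, 2, -1, 3, -1]
placed = block_locations(chrom, 3)
print(placed[5])  # (2, 2)
```

Storing the block number at each location index keeps the chromosome length fixed at K×K regardless of how many blocks the netlist contains.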
Fitness Function
[Equation: fitness function, with the following terms labelled]
• Compensation factor
• Bounding box for horizontal span
• Bounding box for vertical span
• maxcost: the worst cost for placement
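A form consistent with the four labelled terms, offered here as an assumption and not as the slide's own equation, subtracts the weighted bounding-box cost from the worst-case cost so that better placements score higher. The symbols q(n), bb_x(n), bb_y(n) and N_nets are illustrative names for the compensation factor, the horizontal span, the vertical span and the number of nets.

```latex
\text{Fitness} = \text{maxcost} - \sum_{n=1}^{N_{\text{nets}}} q(n)\,\bigl[\, bb_x(n) + bb_y(n) \,\bigr]
```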
Half-perimeter Wire Length Model
• Net with 6 terminals: bounding box = 6 = 4 (horizontal distance) + 2 (vertical distance)
• Net with 6 terminals: bounding box = 5 = 3 (horizontal distance) + 2 (vertical distance)
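The half-perimeter measure is the horizontal span plus the vertical span of the smallest box enclosing all terminals of a net. A minimal sketch follows; the function name and the terminal coordinates are invented for illustration and chosen only so that the spans match the two examples above.

```python
def half_perimeter(terminals):
    """Half-perimeter of the bounding box enclosing a net's terminals.

    terminals: iterable of (x, y) grid coordinates.
    """
    pts = list(terminals)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


# Hypothetical 6-terminal nets (coordinates are not taken from the slides)
net_a = [(1, 1), (2, 3), (3, 2), (4, 1), (5, 3), (5, 2)]  # spans 4 and 2
net_b = [(1, 1), (2, 3), (3, 2), (4, 1), (4, 3), (2, 2)]  # spans 3 and 2
print(half_perimeter(net_a))  # 6
print(half_perimeter(net_b))  # 5
```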
Overview of GASA
begin
    initialize_population();
    while (generation < MAX_GENS) do
        evaluate_population_fitness();
        reproduce_population(Preserve);
        for i = 1 to POP_SIZE/2 do
            crossover(Pcrossover);
        for j = 1 to NUM_GENES do
            mutate(Pmutation);
        for i = 1 to POP_SIZE do
            local_improvement(Plocal);
        elitism();
    end while
    select_the_best_one();
    T = set_temperature();
    R = set_block_movement_range();
    /* following algorithm is pseudo-code of SA */
    while (Exit_criterion() == FALSE) do
        while (inner_criterion() == FALSE) do
            Pnew = generate_movement(R, Pold);
            ΔC = C(Pnew) - C(Pold);
            RANDOM = generate_number();
            if (RANDOM < exp(-ΔC/T))
                Pold = Pnew;
        end while
    end while
end algorithm
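The inner SA loop of the pseudo-code can be sketched in Python as below. This is an illustration under assumptions: `cost` and `propose` are hypothetical callables standing in for C() and generate_movement(), the inner criterion is taken to be a fixed number of moves, and the toy problem at the end is not an FPGA placement.

```python
import math
import random


def accept(delta_c, temperature, rng):
    """Acceptance test of the pseudo-code: RANDOM < exp(-delta_c / T)."""
    if delta_c <= 0:
        # exp(-delta_c / T) >= 1 here, so a uniform number in [0, 1) always passes;
        # returning early also avoids overflow in math.exp for large improvements.
        return True
    return rng.random() < math.exp(-delta_c / temperature)


def anneal_at_temperature(placement, cost, propose, temperature, moves, rng):
    """Try `moves` proposals at a fixed temperature.

    Returns the final placement and the fraction of proposals accepted.
    """
    accepted = 0
    current_cost = cost(placement)
    for _ in range(moves):
        candidate = propose(placement, rng)
        delta = cost(candidate) - current_cost
        if accept(delta, temperature, rng):
            placement = candidate
            current_cost += delta
            accepted += 1
    return placement, accepted / moves


# Toy usage: walk an integer towards 0, with cost |x|
rng = random.Random(0)
final, alpha = anneal_at_temperature(
    10, abs, lambda x, r: x + r.choice((-1, 1)), 1.0, 200, rng)
```

The accepted fraction returned here is the quantity a cooling schedule can use to choose the next temperature.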
Selection
• Individuals are selected according to their fitness value
• The fitness values of the population are sorted in increasing order
• A small number of individuals with the highest fitness values in the current generation are kept intact and remain in the population
• W individuals are selected simultaneously
• The selection procedure is random, but a fitter individual is more likely to be selected
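One common way to realize "random, but fitter is more likely" is fitness-proportional (roulette-wheel) sampling combined with preserving the top few individuals. The slides do not name the exact scheme, so the sketch below is an assumption; the function name, the `n_elite` parameter and the requirement that fitness values be positive are all illustrative.

```python
import random


def select(population, fitness, w, n_elite, rng):
    """Return (elite, picks): the n_elite fittest individuals kept intact,
    plus w individuals drawn at random with probability proportional to fitness.

    Assumes every fitness value is positive.
    """
    ranked = sorted(population, key=fitness)        # increasing fitness order
    elite = ranked[len(ranked) - n_elite:]          # fittest individuals survive unchanged
    weights = [fitness(ind) for ind in ranked]
    picks = rng.choices(ranked, weights=weights, k=w)
    return elite, picks


# Toy usage: individuals are numbers and fitness is the number itself
pop = [1, 2, 3, 4]
elite, picks = select(pop, float, 3, 1, random.Random(1))
```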
Crossover Process
Position:   0   1   2   3   4   5   6   7   8
Block:      6  -1   1   4   5   2  -1   3  -1

Position:   0   1   2   3   4   5   6   7   8
Block:     -1   5   2   3  -1   6  -1   4   1

Position:   0   1   2   3   4   5   6   7   8
Block:      1  -1   6   4   5   2  -1   3  -1
[Figure: each of the three chromosomes drawn as a placement of blocks 1 to 6 on a 3×3 array of locations (1,1) to (3,3)]
Local optimization (SA) stage
• Once GA has done the global search in the first stage, SA takes over from GA to do the local search.
• The takeover is static. If the GA gains no improvement for 5 generations, or the number of generations exceeds the maximum number of generations, SA starts to work on a single individual instead of the entire population.
• As the takeover process is static, the initial temperature T in the second stage of our algorithm is set to 1 degree, a value chosen from the experimental results.
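The static takeover rule can be written as a small predicate. This is a sketch under assumptions: the function and parameter names are hypothetical, "improvement" is read as an increase of the best fitness seen so far, and the generation limit uses `>=` so that it agrees with a loop of the form `while generation < MAX_GENS`.

```python
def should_hand_over(best_fitness_history, generation, max_gens, patience=5):
    """Decide whether GA should stop and SA should take over.

    best_fitness_history: best fitness of each completed generation (higher is better).
    """
    if generation >= max_gens:
        return True
    if len(best_fitness_history) <= patience:
        return False  # not enough generations yet to judge a stall
    recent_best = max(best_fitness_history[-patience:])
    earlier_best = max(best_fitness_history[:-patience])
    return recent_best <= earlier_best  # no gain in the last `patience` generations


print(should_hand_over([1, 2, 3, 3, 3, 3, 3, 3], 8, 100))  # True
print(should_hand_over([1, 2, 3, 4, 5, 6, 7], 7, 100))     # False
```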
Local optimization (SA) stage (Cont.)
• The new temperature is computed as Tnew = β Told
• β depends on α
• α is the fraction of attempted movements between two swapped blocks that have been accepted
• Movement is only between two nearby blocks

α                      β
0.15 < α < 0.3         0.95
0.05 <= α <= 0.15      0.8
α < 0.05               0.6
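The temperature update in the table translates directly into a lookup. The sketch below is illustrative: the function names are hypothetical, and because the table gives no β for α of 0.3 or more, the code raises an error there unless the caller supplies a value through the hypothetical `high_alpha_beta` parameter.

```python
from typing import Optional


def beta(alpha: float, high_alpha_beta: Optional[float] = None) -> float:
    """Cooling factor for an acceptance fraction alpha, following the table."""
    if 0.15 < alpha < 0.3:
        return 0.95
    if 0.05 <= alpha <= 0.15:
        return 0.8
    if alpha < 0.05:
        return 0.6
    # alpha >= 0.3: not covered by the table, so no value is invented here
    if high_alpha_beta is None:
        raise ValueError("no beta given in the table for alpha >= 0.3")
    return high_alpha_beta


def next_temperature(t_old: float, alpha: float,
                     high_alpha_beta: Optional[float] = None) -> float:
    """Tnew = beta * Told."""
    return beta(alpha, high_alpha_beta) * t_old


print(next_temperature(1.0, 0.1))  # 0.8
```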
Comparison Flow
[Figure: flow chart with the following steps]
• Benchmarks
• Logic optimization and technology mapping to 4-input Look-Up Tables (LUTs)
• Pack flip-flops and LUTs into basic logic elements
• Placement (VPlace) or Placement (GASA)
• Routing (VRouter)
• Channel density
Comparison to GA

Name        GA                          GASA
            CPU (s)    No. of Tracks    CPU (s)    No. of Tracks
9symml        25.74          5            22.86          5
alu2          91.76          6            74.27          6
apex7         38.39          5            38.11          5
e64          163.70          8           155.21          8
example2     107.57          5            95.23          5
k2           461.59         10           364.77          9
term1         28.06          5            26.35          5
too-lrg       82.51          7            74.37          7
vda          179.17          8           148.33          8
Total       1178.49         59           999.50         58
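Using the two CPU totals in the table, the overall time saving of GASA relative to GA works out to roughly 15%:

```latex
\frac{1178.49 - 999.50}{1178.49} = \frac{178.99}{1178.49} \approx 0.152
```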
Comparison to VPR

Benchmarks    VPlace [5] Cost    GASA Cost
9symml              690              693
alu2               1670             1678
apex7               785              785
e64                2853             2849
example2           1348             1345
k2                 5874             5873
term1               700              700
too-lrg            1750             1748
vda                3067             3067
Total             18737            18738
Conclusions
• FPGA placement using GASA is presented.
• The experimental results show that the proposed algorithm is effective in improving the quality of placement for the tested MCNC benchmarks.
• The proposed GASA needs less CPU time than GA in all cases without degrading the final routing stage, i.e. it needs the same number of routing channel tracks, or fewer, for all benchmarks.
• The results also show that GASA and VPlace are highly comparable placement tools.
Thank you for your attention