Evaluation of Logic by Software Using BDDs
Andrew Mihal
EE219B Spring 2000
5/16/2000
Outline
  Problem Statement
  Simple Approach
  Table-based BDD Approach
  Branch-based BDD Approach
  BDD Visualization
  Clumping Algorithms
Problem Statement
  Given: a multilevel combinational network in BLIF format
  Generate: a software function that evaluates the network
    void eval_network(int PI[], int PO[]);
Simple Approach
  Perform a topological sort on the nodes
  Output each node as a C assignment using boolean operators in SOP form
  [Figure: example network drawn from PIs to POs, with nodes grouped into Rank 0 through Rank 3]
    void eval_network(int PI[], int PO[])
    {
        int I[6];
        I[0] = PI[1] & PI[2];
        I[1] = ( & ) | ( & );
        ...
        PO[0] = I[4];
        PO[1] = I[5];
        return;
    }
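
As a minimal sketch of what a complete generated function might look like, the code below evaluates a hypothetical network with three primary inputs and two primary outputs. The node functions are invented for illustration and are not taken from the slides. Each internal node becomes one assignment, and the assignments appear in topological order so that every operand is computed before it is used.

```c
/* Hypothetical network: 3 primary inputs, 2 primary outputs, 3 internal nodes.
 * The node functions are illustrative only. All signals hold 0 or 1. */
void eval_network(int PI[], int PO[])
{
    int I[3];

    /* Rank 1: nodes that depend only on primary inputs */
    I[0] = PI[0] & PI[1];
    I[1] = (PI[1] & !PI[2]) | (!PI[1] & PI[2]);   /* SOP form of XOR */

    /* Rank 2: node that depends on rank-1 nodes */
    I[2] = (I[0] & I[1]) | (!I[0] & PI[2]);

    /* Primary outputs are copies of internal nodes */
    PO[0] = I[1];
    PO[1] = I[2];
}
```

Because the body is straight-line code with no branches, the compiler is free to reorder and combine the assignments.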
Simple Approach
  Pros:
    Very simple to program
    No control flow
    Potential for compiler optimizations
  Cons:
    Θ(n) in time and space
    Not sophisticated
Benchmarks
  Input BLIF files (mcnc91): 1 to 3500 nodes
    9symml apex7 comp i5 sct
    C1355 b1 cordic i6 small
    C17 b9 count i7 t
    C1908 c8 cu i8 t481
    C2670 cc dalu i9 tcon
    C3540 cht decod k2 term1
    C432 cm138a des lal too_large
    C499 cm150a example2 majority ttt2
    C5315 cm151a f51m mux unreg
    C6288 cm152a frg1 my_adder vda
    C7552 cm162a frg2 pair x1
    C880 cm163a i1 parity x2
    cm42a i10 pcle x3 x4
    alu2 cm82a i2 pcler8 z4ml
    alu4 cm85a i3 pm1 rot
    apex6 cmb i4
Benchmarks
  gcc -O3 -pg
  Pentium-class machine
  Each network tested with 100,000 random input vectors (deterministic)
  Measure average time spent in each eval_network call
  Scripts used to run tests and gather statistics
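
The measurement loop could be sketched roughly as follows. This is not the author's harness: the slides profile with `-pg`, whereas this sketch substitutes the standard `clock()` function, and the stand-in `eval_network`, the names `NUM_PI`, `NUM_PO`, and `average_usec_per_eval`, and the checksum are all assumptions made for illustration. A fixed seed makes the pseudo-random vectors repeatable from run to run.

```c
#include <stdlib.h>
#include <time.h>

#define NUM_PI 3   /* hypothetical network size */
#define NUM_PO 1

/* Stand-in for a generated function; hypothetical. */
void eval_network(int PI[], int PO[])
{
    PO[0] = (PI[0] & PI[1]) | PI[2];
}

/* Returns the average time per eval_network call in microseconds,
 * or -1.0 on error. The checksum keeps the results observable. */
double average_usec_per_eval(unsigned seed, long num_vectors, unsigned *checksum)
{
    int *vectors;
    int PO[NUM_PO];
    unsigned sum = 0;
    clock_t start, stop;
    long v;

    if (num_vectors <= 0)
        return -1.0;
    vectors = malloc((size_t)num_vectors * NUM_PI * sizeof *vectors);
    if (vectors == NULL)
        return -1.0;

    /* Same seed, same vectors: the test is random but deterministic. */
    srand(seed);
    for (v = 0; v < num_vectors * NUM_PI; v++)
        vectors[v] = (rand() >> 8) & 1;

    /* Time only the evaluation loop, not the vector generation. */
    start = clock();
    for (v = 0; v < num_vectors; v++) {
        eval_network(&vectors[v * NUM_PI], PO);
        sum = sum * 31u + (unsigned)PO[0];
    }
    stop = clock();

    free(vectors);
    if (checksum != NULL)
        *checksum = sum;
    return 1e6 * (double)(stop - start) / CLOCKS_PER_SEC / (double)num_vectors;
}
```

A call such as `average_usec_per_eval(1u, 100000, &sum)` matches the vector count on the slide. For small networks the resolution of `clock()` may be too coarse, so repeating the loop several times is a reasonable refinement.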
Simple Approach Performance
  [Plot: usec / eval (0 to 300) versus Nodes (1 to 10000, log scale); series: flat, with power trend line]
Table-based BDD Approach
  Instead of statements of the form:
    I[1] = (PI[0] & PI[1]) | I[0];
  use BDDs:
    I[1] = eval_bdd(bdd_1, PI, I);
  Assume we have an efficient eval_bdd function
  Statements are still in topological order
  bdd_1 is a constant hardcoded table
  BDD variable ordering with sift
Table-based Approach
  Building a network node into a table:
    I1 = (PI[0] · PI[1]) + I[0]
  [Figure: BDD for I1 with decision nodes I[0], PI[0], PI[1] and terminal 1]
    const bddn bdd_1 = {
        {INT, I[0], 1, POS, 3, POS},
        {INT, PI[0], 3, NEG, 2, POS},
        {INT, PI[1], 3, NEG, 3, POS},
        {CONSTANT_1}
    };
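
The slides do not show `eval_bdd` itself, so the following is only a sketch of how such a table walker might work. The `bdd_node` layout is an assumption inferred from the example table: each entry is read as (kind, variable, else index, else polarity, then index, then polarity), entry 0 is the root, and a `NEG` edge complements the result. The kind tags `PRIMARY_INPUT` and `INTERNAL` are hypothetical names that distinguish `PI[]` from `I[]` lookups.

```c
enum { POS = 0, NEG = 1 };                        /* edge polarity */
enum { PRIMARY_INPUT, INTERNAL, CONSTANT_1 };     /* node kinds (assumed) */

typedef struct {
    int kind;        /* PRIMARY_INPUT, INTERNAL, or CONSTANT_1 */
    int var;         /* index into PI[] or I[] */
    int else_next;   /* table index followed when the variable is 0 */
    int else_pol;    /* POS or NEG */
    int then_next;   /* table index followed when the variable is 1 */
    int then_pol;    /* POS or NEG */
} bdd_node;

/* Table for I1 = (PI[0] & PI[1]) | I[0], mirroring the slide's example. */
static const bdd_node bdd_1[] = {
    { INTERNAL,      0, 1, POS, 3, POS },
    { PRIMARY_INPUT, 0, 3, NEG, 2, POS },
    { PRIMARY_INPUT, 1, 3, NEG, 3, POS },
    { CONSTANT_1,    0, 0, POS, 0, POS },
};

/* Walk from the root to the constant-1 terminal, toggling the result
 * each time a complemented (NEG) edge is taken. */
int eval_bdd(const bdd_node *bdd, const int PI[], const int I[])
{
    int n = 0;
    int value = 1;

    while (bdd[n].kind != CONSTANT_1) {
        int bit = (bdd[n].kind == PRIMARY_INPUT) ? PI[bdd[n].var]
                                                 : I[bdd[n].var];
        int pol = bit ? bdd[n].then_pol : bdd[n].else_pol;
        n = bit ? bdd[n].then_next : bdd[n].else_next;
        value ^= pol;   /* NEG is 1, so a complemented edge flips value */
    }
    return value;
}
```

Every step of the loop indexes the table several times, which illustrates the table-indexing cost this approach pays on top of the function call.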
Table-based Approach
  Pros:
    BDD may be more efficient than SOP form
    Data hardcoded into program
    All we need to write is the eval_bdd function
  Cons:
    Compiler doesn't optimize hardcoded data
    eval_bdd function is inefficient
      Function call overhead
      BDD data table indexing
Table-based Performance
  [Plot: usec / eval (0 to 1200) versus Nodes (1 to 10000, log scale); series: flat and table_nc, each with a power trend line]
Branch-based Approach
  Get rid of tables and eval_bdd function calls
  Replace eval_bdd statements with inline code
  Still use topological sort
Branch-based Approach
    void eval_network(int PI[], int PO[])
    {
        int I[6];
        int complement;
        ...
    NODE_1_START:
        complement = 1;
    NODE_1_0:
        if (I[0]) goto NODE_1_3; else goto NODE_1_1;
    NODE_1_1:
        if (PI[0]) goto NODE_1_2; else { complement ^= 1; goto NODE_1_3; }
    NODE_1_2:
        if (PI[1]) goto NODE_1_3; else { complement ^= 1; goto NODE_1_3; }
    NODE_1_3:
        I[1] = complement;
        ...
  [Figure: BDD for I1 with decision nodes I[0], PI[0], PI[1] and terminal 1]
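
One way the synthesis step might turn a BDD into this kind of inline code is sketched below. The emitter, its name `emit_branch_node`, and the `bdd_node` layout are assumptions for illustration, not the author's tool. The sketch also assumes that entry 0 is the root, that the constant-1 terminal is the last table entry (so execution falls through to the next network node), and that network node `id` writes `I[id]`.

```c
#include <stdio.h>

enum { POS = 0, NEG = 1 };
enum { PRIMARY_INPUT, INTERNAL, CONSTANT_1 };

typedef struct {
    int kind, var;
    int else_next, else_pol;
    int then_next, then_pol;
} bdd_node;   /* assumed layout, one entry per BDD node */

/* Emit one outgoing edge: toggle the flag on a complemented edge, then jump. */
static void emit_edge(FILE *out, int id, int next, int pol)
{
    if (pol == NEG)
        fprintf(out, "{ complement ^= 1; goto NODE_%d_%d; }", id, next);
    else
        fprintf(out, "goto NODE_%d_%d;", id, next);
}

/* Emit labelled if/goto code for network node `id` from its BDD table. */
void emit_branch_node(FILE *out, int id, const bdd_node *bdd, int count)
{
    fprintf(out, "NODE_%d_START: complement = 1;\n", id);
    for (int n = 0; n < count; n++) {
        if (bdd[n].kind == CONSTANT_1) {
            /* Terminal: store the accumulated value */
            fprintf(out, "NODE_%d_%d: I[%d] = complement;\n", id, n, id);
            continue;
        }
        fprintf(out, "NODE_%d_%d: if (%s[%d]) ", id, n,
                bdd[n].kind == PRIMARY_INPUT ? "PI" : "I", bdd[n].var);
        emit_edge(out, id, bdd[n].then_next, bdd[n].then_pol);
        fprintf(out, " else ");
        emit_edge(out, id, bdd[n].else_next, bdd[n].else_pol);
        fprintf(out, "\n");
    }
}
```

Calling `emit_branch_node(stdout, 1, table, 4)` on a four-entry table for I1 prints one labelled line per BDD node in the same shape as the code on this slide.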
Branch-based Approach
  Pros:
    No table lookups
    No function calls
    goto compiles straight to a simple jump
  Cons:
    Performance?
Branch-based Performance
  [Plot: usec / eval (0 to 1200) versus Nodes (1 to 10000, log scale); series: flat, table_nc, and branch_nc, each with a power trend line]
Branch-based Performance
  [Plot: usec / eval (0 to 40) versus Nodes (1 to 10000, log scale); series: flat, table_nc, and branch_nc; power trend lines for flat and branch_nc]
BDD Visualization
  Instead of emitting BDDs as tables or branch structures, produce a graph
  Uses DOT, a graph drawing tool from AT&T
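
A graph emitter for DOT could look roughly like the sketch below. The `bdd_node` layout and the function name `emit_dot` are assumptions, as are the drawing conventions chosen here: else edges are dashed, then edges are solid, and complemented edges get an open-dot arrowhead.

```c
#include <stdio.h>

enum { POS = 0, NEG = 1 };
enum { PRIMARY_INPUT, INTERNAL, CONSTANT_1 };

typedef struct {
    int kind, var;
    int else_next, else_pol;
    int then_next, then_pol;
} bdd_node;   /* assumed layout, one entry per BDD node */

/* Write the BDD as a DOT digraph named `name`. */
void emit_dot(FILE *out, const char *name, const bdd_node *bdd, int count)
{
    fprintf(out, "digraph %s {\n", name);
    for (int n = 0; n < count; n++) {
        if (bdd[n].kind == CONSTANT_1) {
            /* Terminal node drawn as a box */
            fprintf(out, "  n%d [label=\"1\", shape=box];\n", n);
            continue;
        }
        /* Decision node labelled with the variable it tests */
        fprintf(out, "  n%d [label=\"%s[%d]\"];\n", n,
                bdd[n].kind == PRIMARY_INPUT ? "PI" : "I", bdd[n].var);
        /* Else edge: dashed; then edge: solid */
        fprintf(out, "  n%d -> n%d [style=dashed%s];\n", n, bdd[n].else_next,
                bdd[n].else_pol == NEG ? ", arrowhead=odot" : "");
        fprintf(out, "  n%d -> n%d [style=solid%s];\n", n, bdd[n].then_next,
                bdd[n].then_pol == NEG ? ", arrowhead=odot" : "");
    }
    fprintf(out, "}\n");
}
```

If the output is saved as `bdd_1.dot`, a command such as `dot -Tpng bdd_1.dot -o bdd_1.png` renders it.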
Clumping Algorithms
  Can we improve performance by making BDDs larger?
  Clumping: collapse a node into its fanouts, removing it from the network
    Before: F = A X + B, X = C D
    After:  F = A C D + B
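
The collapse in this example can be checked exhaustively. The sketch below writes the network before and after clumping as two C functions (the names `f_before` and `f_after` are made up for illustration); both must agree on all 16 input combinations.

```c
/* Before clumping: X is a separate node feeding F. */
int f_before(int A, int B, int C, int D)
{
    int X = C & D;
    return (A & X) | B;
}

/* After clumping: X has been collapsed into its fanout F. */
int f_after(int A, int B, int C, int D)
{
    return (A & C & D) | B;
}
```

The collapsed form has one fewer node to evaluate, but the remaining node now depends on four inputs instead of three.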
Clumping Algorithms
  Two different heuristics
  Input clumping
    Tries to make all BDDs have about N inputs
    Greedy algorithm
  Size clumping
    Tries to make all BDDs have about N nodes
    Greedy algorithm
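
The slides give no details of either heuristic, so the following is only one plausible reading of greedy input clumping, with every name and data structure assumed. Signals (primary inputs first, then nodes in topological order) are numbered from 0, and each node stores its inputs as a bitmask. A node is collapsed into all of its fanouts whenever none of the merged nodes would end up with more than `max_inputs` inputs.

```c
typedef struct {
    unsigned support;   /* bit s is set if this node reads signal s */
    int is_output;      /* primary outputs are never collapsed */
    int removed;        /* set once the node has been collapsed away */
} clump_node;

static int count_bits(unsigned x)
{
    int c = 0;
    while (x) { x &= x - 1; c++; }
    return c;
}

/* nodes[] is indexed by signal number (at most 32 signals); entries below
 * first_node are primary inputs and are ignored. Returns the number of
 * nodes collapsed. */
int clump_by_inputs(clump_node nodes[], int first_node, int num_signals,
                    int max_inputs)
{
    int collapsed = 0;

    for (int x = first_node; x < num_signals; x++) {
        unsigned bit = 1u << x;
        int ok = !nodes[x].is_output;
        int has_fanout = 0;

        /* Pass 1: would every fanout stay within the input limit? */
        for (int f = first_node; ok && f < num_signals; f++) {
            if (nodes[f].removed || !(nodes[f].support & bit))
                continue;
            has_fanout = 1;
            if (count_bits((nodes[f].support & ~bit) | nodes[x].support)
                    > max_inputs)
                ok = 0;
        }
        if (!ok || !has_fanout)
            continue;

        /* Pass 2: replace x by its own inputs in every fanout. */
        for (int f = first_node; f < num_signals; f++) {
            if (!nodes[f].removed && (nodes[f].support & bit))
                nodes[f].support = (nodes[f].support & ~bit) | nodes[x].support;
        }
        nodes[x].removed = 1;
        collapsed++;
    }
    return collapsed;
}
```

For the F and X example, with A, B, C, D as signals 0 to 3, X as signal 4, and F as signal 5, a limit of 4 collapses X into F, while a limit of 3 leaves the network unchanged. A size-based variant would compare BDD node counts instead of input counts, which requires actually building the merged BDD.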
Performance with Clumping
  [Plot: usec / eval (0 to 300) versus Nodes (1 to 10000, log scale); series: branch_nc, branch_i5, branch_i10, branch_i15, and branch_i20, each with a power trend line]
Performance with Clumping
  [Plot: usec / eval (0 to 30) versus Nodes (0 to 140); series: branch_nc, branch_i5, branch_i10, branch_i15, and branch_i20, each with a power trend line]
Performance with Clumping
  [Plot: usec / eval (0 to 50) versus Nodes (1 to 1000, log scale); series: flat and branch_i20, each with a power trend line]
Clumping Issues
  Number of nodes decreases, but BDD size increases
  Average number of BDD nodes we evaluate stays the same?
  Synthesis and compile time very long and very memory intensive when using clumping
  Flat method synthesizes and compiles quickly, and scales to larger networks