Synthesis of Speed Independent Circuits Based on Decomposition
Tomohiro Yoneda, National Institute of Informatics / Tokyo Institute of Technology
Hiroomi Onda, Tokyo Institute of Technology
Chris Myers, University of Utah
2004/4/21 Async2004

Background
High-level synthesis plays an important role in bringing asynchronous design into wide use
Major approach to high-level synthesis
  • Prepare basic cells that correspond to specification language constructs
  • Translate specifications into basic cell networks in a syntax-directed manner, with local optimizations
  • Very efficient
  • Global optimization may be difficult

Challenge
Our approach to high-level synthesis
  • Translate the high-level spec into a low-level spec (time Petri nets)
  • Use a timed logic synthesis technique
  • Global optimization becomes possible through logic optimization and timing information
  • The cost of synthesis is very high

How to reduce the cost
Translation technique to low-level spec
  • Guarantees that the low-level spec has CSC by adding sufficiently many state variables
  • Idea: [Yoneda, Myers 2003]
  • Developing a Balsa compiler
Efficient logic synthesis technique
  • Decomposes the low-level spec with respect to each output
  • Synthesizes each sub-circuit from each sub-spec
  • Goal of this work
In this paper, speed independent circuit synthesis is discussed

Decomposition based synthesis
Input: STG
  • 1-safe
  • output semi-modular
  • with CSC (Complete State Coding)
  • several more restrictions
Output: reduced STG for each output
  • a gC or atomic-gate implementation is synthesizable
Feature
  • Only the state graphs of the reduced STGs are necessary
  • It is not necessary to explore the reachable states of the original STG

Key issue - input set determination
[Figure: gC gates for the outputs req2, ack1, and csc, with input signals among ack1, ack2, req1, and csc; a second build of the slide adds the label "Reduction"]

Related works
Synthesizing each output separately
  • T.A. Chu, Synthesis of Self-Timed VLSI Circuits from Graph-theoretic Specification, PhD thesis, MIT, 1987
      No idea for input set determination
  • R. Puri, J. Gu, A Modular Partitioning Approach for Asynchronous Circuit Synthesis, IEEE TCAD, 1995
      Input set determination is performed based on the state graph of the original STG
      Input signals are kept, if hiding them does not increase the number of CSC conflicts
  • W. Vogler, R. Wollowski, Decomposition in Asynchronous Circuit Design, Tech Report, Univ. Augsburg, 2002
      STG reduction technique (net contraction) is formalized
      No general idea for input set determination

Our approach
Step 1: Select possible trigger signals as the initial input set
Step 2: Contract the original STG by deleting all signals except the output and those in the current input set
Step 3: If the reduced STG has CSC, done
Step 4: Otherwise, choose appropriate signals and add them to the input set
Step 5: Go to Step 2
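The five steps above can be read as a refinement loop. The following is a minimal Python sketch of that loop, not the authors' implementation. The three callbacks (`contract`, `csc_conflicts`, `choose_signals`) are hypothetical placeholders for the contraction, the CSC check, and the signal selection; their names and signatures are assumptions made for illustration only.

```python
from typing import Callable, Iterable, Set

def determine_input_set(
    output: str,
    trigger_signals: Iterable[str],
    contract: Callable[[Set[str]], object],
    csc_conflicts: Callable[[object], list],
    choose_signals: Callable[[object, list], Set[str]],
) -> Set[str]:
    """Grow the input set of `output` until the reduced STG has CSC."""
    input_set = set(trigger_signals)                  # Step 1
    while True:
        # Step 2: keep only the output and the current input set.
        reduced = contract(input_set | {output})
        conflicts = csc_conflicts(reduced)
        if not conflicts:                             # Step 3
            return input_set
        # Step 4: ask for signals that resolve the conflicts.
        extra = choose_signals(reduced, conflicts) - input_set
        if not extra:
            raise RuntimeError("no signal resolves the remaining CSC conflicts")
        input_set |= extra                            # Step 5: back to Step 2

# Toy stand-ins (hypothetical): the conflict disappears once b is kept.
toy_contract = lambda kept: frozenset(kept)
toy_conflicts = lambda reduced: [] if "b" in reduced else ["1R vs 10"]
toy_choose = lambda reduced, conflicts: {"b"}

result = determine_input_set("x", ["a"], toy_contract, toy_conflicts, toy_choose)
print(sorted(result))
```

The loop terminates either when the reduced STG has CSC or when the selection step cannot propose a new signal; the second exit is an assumption added here so that the sketch cannot loop forever.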
Contraction
[Figure: contraction of the original STG into the reduced STG, keeping the possible trigger signals]
Contraction is a bisimilar translation (e.g., the one by W. Vogler, R. Wollowski)

Issues to be discussed
If the reduced STG has CSC, is a correct speed independent circuit synthesized from it?
How can appropriate signals be chosen without the state graph of the original STG?
How large is the overhead (performance degradation of the synthesized circuit)?

An example
[Figure: state graph of the example STG, with state codes written in the order (a b c x) (for example 110R, 111R, 011F) and the state sets ES(x+) and ES(x-) marked]

If a is deleted
[Figure: the same state graph over (a b c x), with $C_D(ES(x+))$ and $C_D(ES(x-))$ marked]
$C_D(S)$: the set S extended by deleting the signals in D (states that differ from a state of S only in the deleted signals are added)

Irrelevant input set
A set D of signals is an irrelevant input set for an output x, if
  • $D \subseteq (In \cup Out) \setminus \{x\}$
  • $C_D(ES(x+)) \setminus UR = ES(x+)$
  • $C_D(ES(x-)) \setminus UR = ES(x-)$
where
  In: input signal set of the original STG
  Out: output signal set of the original STG
  UR: unreachable state set of the original STG
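The definition can be checked mechanically once the reachable codes are known. The sketch below is an illustration under assumptions, not part of the presented tool. The reachable codes and the sets ES(x+) and ES(x-) are read off the example figure as far as it can be recovered, and treating a, b, c as inputs and x as the only output is an assumption. `c_d` and `is_irrelevant` are hypothetical helper names.

```python
from itertools import product
from typing import Iterable, Set

SIGNALS = ("a", "b", "c", "x")  # code order used in the example

def c_d(states: Iterable[str], deleted: Set[str]) -> Set[str]:
    """C_D(S): every code that agrees with some state of S on all non-deleted signals."""
    free = [i for i, s in enumerate(SIGNALS) if s in deleted]
    extended = set()
    for code in states:
        for bits in product("01", repeat=len(free)):
            chars = list(code)
            for pos, bit in zip(free, bits):
                chars[pos] = bit
            extended.add("".join(chars))
    return extended

def is_irrelevant(deleted, output, inputs, outputs, reachable, es_plus, es_minus):
    """Check the three conditions of the definition for the signal set `deleted`."""
    if not deleted <= (inputs | outputs) - {output}:
        return False
    # Subtracting UR is the same as intersecting with the reachable codes.
    return (c_d(es_plus, deleted) & reachable == es_plus
            and c_d(es_minus, deleted) & reachable == es_minus)

# Assumed reading of the example (a b c x); R/F states are stored by their code.
REACHABLE = {"0000", "0100", "0110", "1100", "1110", "1101",
             "1111", "0111", "0010", "1000", "1010"}
ES_PLUS = {"1100", "1110"}   # 110R, 111R
ES_MINUS = {"0111"}          # 011F
INPUTS, OUTPUTS = {"a", "b", "c"}, {"x"}

a_ok = is_irrelevant({"a"}, "x", INPUTS, OUTPUTS, REACHABLE, ES_PLUS, ES_MINUS)
c_ok = is_irrelevant({"c"}, "x", INPUTS, OUTPUTS, REACHABLE, ES_PLUS, ES_MINUS)
print(a_ok, c_ok)
```

Under these assumptions, deleting c only adds the unreachable code 0101 to ES(x-), whereas deleting a pulls reachable codes such as 0100 and 1111 into the extended sets.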
If a is deleted
$C_D(ES(x+)) \setminus UR \neq ES(x+)$
$C_D(ES(x-)) \setminus UR \neq ES(x-)$
{a} is not an irrelevant input set
If a signal set that is not an irrelevant input set is deleted, the reduced STG does not have CSC
[Figure: the same state graph over (a b c x), with $C_D(ES(x+))$ and $C_D(ES(x-))$ marked]

If c is deleted
$C_D(ES(x+)) \setminus UR = ES(x+)$
$C_D(ES(x-)) \setminus UR = ES(x-)$
{c} is an irrelevant input set
If an irrelevant input set (including no possible trigger signals) is deleted, a correct circuit is obtained from the reduced STG
[Figure: the same state graph over (a b c x), with $C_D(ES(x+))$ and $C_D(ES(x-))$ marked; the additional state 0101 is shown]

Theorem 1
For an STG G that has CSC and is output semi-modular, if a reduced STG G' obtained from G by deleting some signal set V (including no possible trigger signals) has CSC, then a correct circuit is obtained from G'
Outline of the argument:
  • If V is not an irrelevant input set, G' cannot have CSC
  • Hence V must be an irrelevant input set
  • Hence a correct circuit is obtained from G'

Contraction with initial input set
[Figure: contraction of the original STG into the reduced STG, using the possible trigger signals as the initial input set]

Checking CSC
Constructing the state graph of the reduced STG
[Figure: the reduced STG (transitions a+/1, a-/1, a+/2, a-/2, x+, x-) and its state graph with the states 00, 1R, 11, 0F, 00, 10; the states 1R and 10 are marked as a CSC conflict]
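A CSC conflict is a pair of states that share the same binary code but differ in which output signals are excited. The following Python sketch shows one way to detect such pairs in an explicit state graph. The dictionary layout and the state names s0 to s5 are assumptions for illustration, and the example graph is a reading of the reduced state graph in the figure, with 1R stored as code 10 with x excited and 0F as code 01 with x excited.

```python
from collections import defaultdict
from itertools import combinations

def csc_conflicts(states):
    """Return (state1, state2, code) for states with equal codes but different excited outputs.

    `states` maps a state name to (binary code, set of excited output signals).
    """
    by_code = defaultdict(list)
    for name, (code, excited_outputs) in states.items():
        by_code[code].append((name, frozenset(excited_outputs)))
    conflicts = []
    for code, group in by_code.items():
        for (n1, e1), (n2, e2) in combinations(group, 2):
            if e1 != e2:
                conflicts.append((n1, n2, code))
    return conflicts

# Assumed reduced state graph over (a x).
reduced = {
    "s0": ("00", set()),    # initial state
    "s1": ("10", {"x"}),    # 1R, reached by a+/1
    "s2": ("11", set()),    # reached by x+
    "s3": ("01", {"x"}),    # 0F, reached by a-/1
    "s4": ("00", set()),    # reached by x-
    "s5": ("10", set()),    # reached by a+/2
}
found = csc_conflicts(reduced)
print(found)
```

The two states with code 00 are not reported, because neither of them excites the output; only the pair with code 10 is a conflict.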
Guided Simulation
[Figure: state graph of the original STG over (a b c x), shown next to the state graph of the reduced STG; an abstracted trace in the reduced graph corresponds to an original trace made of interface and noninterface transitions]
The original trace can be obtained by simulating the original STG, without requiring the state graph of the original STG

Generating original trace
abstracted trace: a+ b+
[Figure: original STG with noninterface transitions t1, t2, t3 and interface transitions a+, b+; the abstracted trace is expanded into an original trace that interleaves noninterface transitions (t2, t3) with a+ and b+]
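Guided simulation can be pictured as a token game on the original net that fires noninterface transitions freely but fires interface transitions only in the order given by the abstracted trace. The sketch below is one possible realization using depth-first search with backtracking, not the algorithm of the presented tool. The net, its place names p0 to p5, and the function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Net = Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]]  # transition -> (preset, postset)

def fire(marking: FrozenSet[str], net: Net, t: str) -> FrozenSet[str]:
    """Fire t in a 1-safe net: remove the preset tokens, add the postset tokens."""
    pre, post = net[t]
    return (marking - pre) | post

def enabled(marking: FrozenSet[str], net: Net) -> List[str]:
    return sorted(t for t, (pre, _) in net.items() if pre <= marking)

def original_trace(net: Net, initial: Set[str], interface: Set[str],
                   abstracted: List[str]) -> Optional[List[str]]:
    """Find a firing sequence whose projection onto `interface` equals `abstracted`."""
    seen = set()  # (marking, position) pairs that were already explored

    def search(marking, idx, trace):
        if idx == len(abstracted):
            return trace
        if (marking, idx) in seen:
            return None
        seen.add((marking, idx))
        for t in enabled(marking, net):
            if t in interface:
                if t != abstracted[idx]:
                    continue  # an interface transition must match the abstracted trace
                result = search(fire(marking, net, t), idx + 1, trace + [t])
            else:
                result = search(fire(marking, net, t), idx, trace + [t])
            if result is not None:
                return result
        return None  # dead end: backtrack

    return search(frozenset(initial), 0, [])

# Hypothetical net: t1 conflicts with t2 and leads to a dead end.
net = {
    "t1": (frozenset({"p0"}), frozenset({"p5"})),
    "t2": (frozenset({"p0"}), frozenset({"p1"})),
    "a+": (frozenset({"p1"}), frozenset({"p2"})),
    "t3": (frozenset({"p2"}), frozenset({"p3"})),
    "b+": (frozenset({"p3"}), frozenset({"p4"})),
}
trace = original_trace(net, {"p0"}, {"a+", "b+"}, ["a+", "b+"])
print(trace)
```

In this toy net the search first tries t1, reaches a dead end, and backtracks to t2, which illustrates why conflicting transitions can make the simulation nondeterministic.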
Analysis of original trace
[Figure: original trace over (a b c x): 0000, b+, 0100, c+, 0110, a+/1, 111R, x+, 1111, a-/1, 011F, x-, 0110, b-, 0010, c-, 0000, a+/2, 1000; interface and noninterface signals are distinguished]
Find a noninterface signal that certainly changes an odd number of times here

Analysis of original trace
[Figure: the original trace b+, c+, a+/1, x+, a-/1, x-, b-, c-, a+/2 over (a b c x), with interface and noninterface signals distinguished]
Resolve this CSC conflict
Add b to the input set

Analysis of original trace
[Figure: the original trace b+, c+, a+/1, x+, a-/1, x-, b-, c-, a+/2 over (a b c x), with interface and noninterface signals distinguished]
Select a noninterface signal that certainly changes an odd number of times here
c also seems to satisfy this condition
But c does not actually satisfy it (c is concurrent)
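The parity part of this analysis is easy to sketch: count how often each noninterface signal changes in the trace segment between the two conflicting states. The Python fragment below does only that. It does not check the word "certainly", which depends on the causality structure of the STG, so it reports c as well as b. The trace is the one read from the figure; the function names and the choice of segment boundaries are assumptions.

```python
from collections import Counter
from typing import Iterable, List, Set

def signal_of(transition: str) -> str:
    """Strip the instance index and the direction: 'a+/1' -> 'a', 'b-' -> 'b'."""
    return transition.split("/")[0][:-1]

def odd_changing_signals(segment: Iterable[str], interface: Set[str]) -> List[str]:
    """Noninterface signals that change an odd number of times in the segment."""
    counts = Counter(signal_of(t) for t in segment)
    return sorted(s for s, n in counts.items()
                  if s not in interface and n % 2 == 1)

# Original trace as read from the figure; a and x are the interface signals.
trace = ["b+", "c+", "a+/1", "x+", "a-/1", "x-", "b-", "c-", "a+/2"]
# Assumed segment between the conflicting states: after a+/1 up to and including a+/2.
segment = trace[3:9]
candidates = odd_changing_signals(segment, {"a", "x"})
print(candidates)
```

Both b and c pass the parity test on this single trace, which matches the remark that c only seems to satisfy the condition; ruling c out requires the causality checks, which this sketch does not perform.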
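Formalization (init)
[Figure: original trace split into the segments f0 and f1, ending in the interface transitions e1 and e2, with the two CSC conflict states marked]
w is odd-confined by f1 if:
  1. w changes an odd number of times in f1
  2. if w changes in f0, then $w_{e1} \Rightarrow e_1$
  3. $e_1 \Rightarrow w_{s2}$
  4. $w_{e2} \Rightarrow e_2$
where
  $w_{e1}$: last transition of w before $e_1$
  $w_{s2}$: first transition of w after $e_1$
  $w_{e2}$: last transition of w before $e_2$
"$\Rightarrow$" represents the causality relation obtained from the structure of the STG

Analysis of original trace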
[Figure: the original trace b+, c+, a+/1, x+, a-/1, x-, b-, c-, a+/2 over (a b c x), split into the segments f0 and f1, with interface and noninterface signals distinguished]
b is odd-confined by f1
c is not odd-confined by f1

Theorem 2
If w (and ui) satisfy the following conditions, adding w (and ui) to the input set resolves the CSC conflict in f1 (sufficient condition):
  • w is odd-confined by f1
  • w does not change in l
  • If w changes before the first interface signal, then for each odd i, ui is odd-confined by hi with the causality relation shown in the figure
[Figure: trace with the segments f0, f1, l, h1, h3, the transitions w+, w-, u+, and the interface transitions; state codes such as 110R, 1100, 1101, and 1001 are shown]
For one CSC conflict, there exist many candidate sets of signals
↓
  • Analyze every CSC conflict
  • Set up the covering problem and solve it
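The covering step can be illustrated as follows: each CSC conflict comes with several candidate signal sets, and the goal is a small set of signals that contains at least one candidate set for every conflict. The slides do not say how the covering problem is solved, so the greedy heuristic below is only an assumed stand-in. The conflict names and the signals d and e are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

def greedy_cover(candidates: Dict[str, List[FrozenSet[str]]]) -> Set[str]:
    """Pick signals so that every conflict has one candidate set fully chosen."""
    chosen: Set[str] = set()

    def resolved(conflict: str, signals: Set[str]) -> bool:
        return any(c <= signals for c in candidates[conflict])

    pending = {k for k in candidates if not resolved(k, chosen)}
    while pending:
        # Candidate sets of the conflicts that are still unresolved, in a fixed order.
        options = sorted({c for k in pending for c in candidates[k]}, key=sorted)
        if not options:
            raise ValueError("some conflict has no candidate signal set")

        def score(option: FrozenSet[str]):
            merged = chosen | option
            gain = sum(resolved(k, merged) for k in pending)  # conflicts resolved
            cost = len(option - chosen)                       # new signals needed
            return (gain / cost, -cost)

        best = max(options, key=score)
        chosen |= best
        pending = {k for k in pending if not resolved(k, chosen)}
    return chosen

# Hypothetical candidate sets for three conflicts.
example = {
    "conflict1": [frozenset({"b"}), frozenset({"c", "d"})],
    "conflict2": [frozenset({"b"}), frozenset({"e"})],
    "conflict3": [frozenset({"e"})],
}
cover = greedy_cover(example)
print(sorted(cover))
```

A greedy choice does not guarantee a minimum cover; an exact solver could be substituted without changing how the candidate sets are produced.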
Drawback
For an STG with conflicting transitions, backtracking may be needed
  • Finding the noninterface transitions that actually fired is no longer deterministic once conflicting transitions are deleted
  • If many conflicting transitions exist, backtracking can be costly
Approaches that seem practical:
  • Keep all conflicting transitions, even if they are not related to backtracking, or
  • Manually specify some of the necessary conflicting transitions
Our compiler from a high-level language can automatically specify those conflicting transitions

Experimental results
Experiments
  • Implementation of the proposed method in C
  • Pentium 2.8 GHz, 4 GB memory
  • Final logic synthesis tool: petrify -gc -eqn
Benchmarks
  1. Instruction cache controller of TITAC2
     • generated from a high-level spec by our compiler
     • large, but simple → input signal sets are small
     • compiler decisions are used for specifying conflicting transitions
  2. Controllers of various filters
     • manually designed
     • medium, but complicated → input signal sets are large
     • all conflicting transitions are kept
  3. Async Benchmarks
     • small and simple
     • all conflicting transitions are kept

Experimental results
CPU times, memory usage
  • Benchmark 1: significantly reduced
  • Benchmark 2: reduced
Quality of synthesized circuits
  • Area (number of transistors): almost no overhead

Experimental results
For Async Benchmarks
  • Quality: no overhead (exactly the same area)
  • Cost: advantageous only for the largest specs
[Figure: CPU times (sec) of Petrify and the proposed method for benchmarks 1 to 25]

Conclusion
New algorithm to find input signal sets for the decomposition based synthesis method
  • the state graph of the original STG is not necessary
  • can handle larger circuits
Logic synthesis tool: NUTAS
  • Linux binary is downloadable from http://research.nii.ac.jp/~yoneda
Future work
  • extend the algorithms to support timed circuit synthesis
  • finish the compiler development for high-level synthesis and integrate both