Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Igor Lemberski
Baltic International Academy, Riga, Latvia
e-mail: [email protected]

Petr Fišer
Czech Technical University in Prague, Faculty of Information Technology
e-mail: [email protected]

EUROMICRO DSD 2010, Lille

Outline
Asynchronous circuits model used
Motivation & proposed method
Experimental results
Conclusions

Asynchronous Circuits Model Used
Unbounded delay model
  Gate and wire delays are not limited
  The circuit is able to recognize the moment when input states have changed
Dual-rail encoding
  Positive and negative values of each signal are provided
  f(0) = 1, f(1) = 0: logic 1
  f(0) = 0, f(1) = 1: logic 0
  f(0) = 0, f(1) = 0: space state (spacer)
  f(0) = 1, f(1) = 1: not allowed
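
The encoding table above can be sketched in a few lines of Python. This is only an illustration: the names `DualRail`, `decode` and `encode` are hypothetical and do not come from the slides; the rail order (f(0), f(1)) follows the table.

```python
from enum import Enum


class DualRail(Enum):
    """Symbolic states of one dual-rail signal (names are illustrative)."""
    SPACER = "spacer"
    ZERO = "logic 0"
    ONE = "logic 1"


# Rail pairs are written as (f(0), f(1)), in the order used on the slide.
_DECODE = {
    (1, 0): DualRail.ONE,
    (0, 1): DualRail.ZERO,
    (0, 0): DualRail.SPACER,
}
_ENCODE = {state: rails for rails, state in _DECODE.items()}


def decode(rail0, rail1):
    """Map a rail pair to its symbolic state; (1, 1) is rejected."""
    try:
        return _DECODE[(rail0, rail1)]
    except KeyError:
        raise ValueError("rail pair (1, 1) is not allowed in dual-rail encoding")


def encode(state):
    """Map a symbolic state back to its rail pair."""
    return _ENCODE[state]
```

For example, `decode(0, 0)` yields the spacer, and `encode(DualRail.ONE)` yields the pair `(1, 0)`.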

Four-Phase Discipline
Inputs in space state (00)
Inputs in working state (10, 01)
Outputs in space state (00)
Outputs in working state (10, 01)

Seitz’s Constraints
Strong constraints
  Each output changes its state only when all inputs have changed their state
In contrast to weak constraints
  Some outputs are permitted to change their state when some inputs have changed their state

Seitz’s Strong Constraints
Pros
  Regularity
  Extra completion detection logic not needed
  Circuit delay is based on actual gate delays
  No additional synchronization chains
Cons
  Rather high area and delay
Implementation styles
  DIMS (Delay-Insensitive Minterm Synthesis)
  NCL (Null Convention Logic)
  Direct Logic

DIMS (Delay-Insensitive Minterm Synthesis)
2-level implementation
  2^n n-input C-elements + n-input OR
Function implemented as sum-of-minterms
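
A minimal behavioural sketch of such a node, assuming the usual DIMS arrangement: one C-element per minterm, with the on-set minterms ORed into the positive output rail and the off-set minterms into the negative rail. The class names `CElement` and `DimsNode` and the rail order (positive, negative) are assumptions made for this illustration; timing and transistor-level detail are ignored.

```python
from itertools import product


class CElement:
    """Muller C-element: output rises when all inputs are 1,
    falls when all inputs are 0, and otherwise holds its value."""

    def __init__(self):
        self.out = 0

    def update(self, inputs):
        if all(inputs):
            self.out = 1
        elif not any(inputs):
            self.out = 0
        return self.out


class DimsNode:
    """Dual-rail node for an n-input Boolean function `func`,
    built from 2**n C-elements (one per minterm)."""

    def __init__(self, n, func):
        self.minterms = list(product((0, 1), repeat=n))
        self.cells = [CElement() for _ in self.minterms]
        self.on_set = [bool(func(*m)) for m in self.minterms]

    def update(self, rails):
        """rails: one (positive, negative) pair per input variable.
        Returns the (positive, negative) pair of the output."""
        pos = neg = 0
        for cell, minterm, is_on in zip(self.cells, self.minterms, self.on_set):
            # Select the rail that is high when the variable equals the minterm bit.
            picked = [p if bit else q for (p, q), bit in zip(rails, minterm)]
            fired = cell.update(picked)
            if is_on:
                pos |= fired
            else:
                neg |= fired
        return pos, neg
```

With `DimsNode(2, lambda a, b: a & b)`, the output stays in the spacer state while only one input is valid and returns to the spacer only after both inputs have returned to it, which is the behaviour required by the strong constraints.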

NCL (Null Convention Logic)
Library of 27 special gates
Based on threshold functions
Any function up to 4 inputs can be implemented
  … but in dual-rail, 4 inputs = 2 variables only
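
The threshold behaviour can be sketched as follows. This is a simplified model of an unweighted m-of-n threshold gate with hysteresis; the class name `ThresholdGate` is hypothetical, and the weighted gates of the actual NCL library are not modelled.

```python
class ThresholdGate:
    """Unweighted m-of-n threshold gate with hysteresis (simplified model).

    The output is set when at least m inputs are 1, reset when all
    inputs are 0, and held otherwise.
    """

    def __init__(self, m, n):
        if not 1 <= m <= n:
            raise ValueError("threshold m must satisfy 1 <= m <= n")
        self.m = m
        self.n = n
        self.out = 0

    def update(self, inputs):
        if len(inputs) != self.n:
            raise ValueError("expected %d inputs" % self.n)
        high = sum(inputs)
        if high >= self.m:
            self.out = 1
        elif high == 0:
            self.out = 0
        return self.out


def dual_rail_variables(gate_inputs):
    """Each dual-rail variable occupies two wires, so a gate with
    `gate_inputs` wires sees only gate_inputs // 2 variables."""
    return gate_inputs // 2
```

In this model a 2-of-2 gate behaves like a C-element, and `dual_rail_variables(4)` gives 2, matching the remark that 4 inputs cover only 2 variables.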

Direct Logic
Two-level C-OR DIMS logic implemented as a single gate
Both positive and complemented outputs are provided
Different delays for each input

Comparison

DIMS and Direct logic
Inputs   DIMS Trans.   DIMS Delay   Direct logic Trans.   Direct logic Delay
2        24            8.2          22                    8.2
3        64            15.1         34                    12.3
4        160           21.3         54                    19.4
5        384           N/A          90                    N/A
6        896           N/A          158                   N/A

NCL
2-input gate   Trans.   Delay
AND, OR        21       5.8
XOR            24       8.6

Multi-Level Dual-Rail Network
Positive and complemented values of each signal provided
Each node implemented as DIMS, NCL, or Direct logic

Motivation & Proposed Method
State-of-the-art
  Nodes are implemented as simple gates (NAND, XOR)
  4x 2-input gate = 22*4 = 88 transistors in Direct logic

Motivation & Proposed Method
Proposed
  Nodes are implemented as complex gates
  1x 2-input gate + 1x 3-input gate = 22 + 34 = 56 transistors
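
The two transistor counts can be reproduced from the per-gate figures in the Comparison table. The sketch below is illustrative only; the helper name `network_transistors` is hypothetical, and a network is described simply by the list of its node input counts.

```python
# Transistors per gate, by number of inputs (values from the Comparison table).
DIRECT_LOGIC_TRANSISTORS = {2: 22, 3: 34, 4: 54, 5: 90, 6: 158}
DIMS_TRANSISTORS = {2: 24, 3: 64, 4: 160, 5: 384, 6: 896}


def network_transistors(node_inputs, table):
    """Sum the per-gate transistor counts for a list of node input counts."""
    return sum(table[k] for k in node_inputs)


simple = network_transistors([2, 2, 2, 2], DIRECT_LOGIC_TRANSISTORS)  # four 2-input gates
complex_ = network_transistors([2, 3], DIRECT_LOGIC_TRANSISTORS)      # one 2-input + one 3-input gate
saving = 1 - complex_ / simple                                        # relative area saving
```

For this small example the complex-gate version needs 56 instead of 88 transistors in Direct logic, a saving of roughly 36%. With the DIMS figures the same two networks cost 96 and 88 transistors, so the gain is much smaller there.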

Motivation & Proposed Method
State-of-the-art
  Nodes are implemented as simple gates (NAND, XOR)
Proposed
  Nodes are implemented as complex gates, i.e. gates of a given number of inputs and any function
  Can be implemented both in DIMS and Direct logic
  Like FPGA LUTs
  Tools for synchronous synthesis can be used
    FPGA mapping

Where’s the Problem?
Facts:
Increase of the number of node inputs will:
  Decrease the number of nodes
  Decrease the number of levels
  Increase the node size
  Increase the node delay
Question:
  Where is the trade-off?

Experimental Setup
228 circuits processed (MCNC, ISCAS)
Optimized by ABC choice script
1. Mapped into k-input NANDs (ABC map command): state-of-the-art (k-NAND)
2. Mapped into k-LUTs (ABC fpga command): complex gates (k-CG)
3. Mapped into MCNC standard cells (ABC map command): something in-between (SC)
k = 2…6
Implemented as DIMS, Direct logic, and NCL

Results – DIMS – Area
[Bar chart: transistor count (axis 0 to 16.0M) for SC, 2-CG to 6-CG, and 2-NAND to 6-NAND]

Results – DIMS – Area
[Bar chart: "Best in" share (axis 0% to 70%) for SC, 2-CG to 6-CG, and 2-NAND to 6-NAND]

Results – DIMS – Delay
[Bar chart: delay (axis 0 to 25.0k) for SC, 2-CG to 4-CG, and 2-NAND to 4-NAND]

Results – DIMS – Delay
[Bar chart: "Best in" share (axis 0% to 60%) for SC, 2-CG to 4-CG, and 2-NAND to 4-NAND]

Discussion - DIMS
Implementation using arbitrary 2-input gates is the best one, both in area and delay
  No big surprise. Complexity (and delay) of DIMS grows exponentially with the number of gate inputs
  Results are consistent: the more node inputs, the higher area and delay
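
The exponential growth is visible directly in the DIMS transistor counts of the Comparison table. The closed form below is only an empirical fit that happens to match the five tabulated values; it is an observation made for this illustration, not a formula given in the slides, and the name `dims_fit` is hypothetical.

```python
# DIMS transistors per gate, by number of inputs (values from the Comparison table).
DIMS_TRANSISTORS = {2: 24, 3: 64, 4: 160, 5: 384, 6: 896}


def dims_fit(n):
    """Empirical fit to the tabulated DIMS counts: 2**n * (2*n + 2).
    An assumption of this sketch, not a formula from the presentation."""
    return 2 ** n * (2 * n + 2)


# Growth factor when one more input is added to a DIMS gate.
growth = {
    n + 1: DIMS_TRANSISTORS[n + 1] / DIMS_TRANSISTORS[n]
    for n in range(2, 6)
}
```

Each extra input multiplies the gate size by a factor between about 2.3 and 2.7, so merging simple gates into wider DIMS nodes quickly costs more than it saves.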

Results – Direct Logic – Area
[Bar chart: transistor count (axis 0 to 3.0M) for SC, NCL, 2-CG to 6-CG, and 2-NAND to 6-NAND]

Results – Direct Logic – Area
[Bar chart: "Best in" share (axis 0% to 50%) for SC, NCL, 2-CG to 6-CG, and 2-NAND to 6-NAND]

Results – Direct Logic – Delay
[Bar chart: delay (axis 0 to 20.0k) for SC, NCL, 2-CG to 4-CG, and 2-NAND to 4-NAND]

Results – Direct Logic – Delay
[Bar chart: "Best in" share (axis 0% to 70%) for SC, NCL, 2-CG to 4-CG, and 2-NAND to 4-NAND]

Discussion - Direct Logic
Implementation using 3-input complex gates is the best one, both in area and delay
  This is a good result confirming our theory
  Results are consistent - no coincidence
State-of-the-art 2-NAND implementation is extremely inefficient:
  21% area improvement
  19% delay improvement
3-CG implementation is even better than NCL
  10% area improvement
  19% delay improvement

Conclusions
Efficient implementation of asynchronous logic operating under strong constraints proposed
Tools (& methods) for synchronous synthesis are used for asynchronous synthesis
3-input complex nodes implemented using Direct logic
Extensive experiments confirmed the theory
Approx. 20% area and delay improvement vs. all state-of-the-art methods