Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Page 1: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Igor Lemberski
Baltic International Academy, Riga, Latvia
e-mail: [email protected]

Petr Fišer
Czech Technical University in Prague, Faculty of Information Technology
e-mail: [email protected]

Page 2: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

EUROMICRO DSD 2010, Lille

Outline

Asynchronous circuits model used
Motivation & proposed method
Experimental results
Conclusions

Page 3: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Asynchronous Circuits Model Used

Unbounded delay model
  Gate and wire delays are not limited
  The circuit is able to recognize the moment when input states have changed
Dual-rail encoding
  Positive and negative values of each signal are provided
  f(0) = 1, f(1) = 0: logic 1
  f(0) = 0, f(1) = 1: logic 0
  f(0) = 0, f(1) = 0: space state (spacer)
  f(0) = 1, f(1) = 1: not allowed
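The dual-rail code words listed above can be captured in a few lines. This is only an illustrative sketch: the names `Rails`, `SPACER`, `encode`, and `decode` are hypothetical and do not come from the slides, and a rail pair is assumed to be stored in the order (f(0), f(1)) used on the slide.

```python
from typing import Optional, Tuple

# A dual-rail signal as the pair (f(0), f(1)), in the slide's notation.
Rails = Tuple[int, int]

SPACER: Rails = (0, 0)  # space state: no data present


def encode(bit: int) -> Rails:
    """Map a logic value to its dual-rail code word."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (1, 0) if bit == 1 else (0, 1)


def decode(rails: Rails) -> Optional[int]:
    """Return 1 or 0 for a working state, None for the spacer."""
    if rails == (1, 0):
        return 1
    if rails == (0, 1):
        return 0
    if rails == (0, 0):
        return None
    raise ValueError("(1, 1) is not an allowed dual-rail state")
```

A receiver can tell that new data has arrived simply by seeing `decode` change from `None` to a value, which is how the circuit recognizes input changes without a clock.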

Page 4: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Four-Phase Discipline

Inputs in space state (00)
Inputs in working state (10, 01)
Outputs in space state (00)
Outputs in working state (10, 01)

Page 5: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Seitz’s Constraints

Strong constraints
  Each output changes its state only when all inputs have changed their state
In contrast to weak constraints
  Some outputs are permitted to change their state when some inputs have changed their state

Page 7: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Seitz’s Strong Constraints

Pros
  Regularity
  Extra completion detection logic not needed
  Circuit delay is based on actual gate delays
  No additional synchronization chains
Cons
  Rather high area and delay

DIMS (Delay-Insensitive Minterm Synthesis)
NCL (Null Convention Logic)
Direct Logic

Page 8: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

DIMS (Delay-Insensitive Minterm Synthesis)

2-level implementation
  2^n n-input C-elements + n-input OR
Function implemented as sum-of-minterms
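A behavioural sketch may help show why a DIMS node obeys the strong constraints. It assumes the usual Muller C-element behaviour (output rises when all inputs are 1, falls when all are 0, otherwise holds) and stores rail pairs as (f(0), f(1)), so (1, 0) is logic 1. The class names `CElement` and `DimsNode` are hypothetical, and the two output ORs are modelled as plain bitwise ORs over the on-set and off-set minterms.

```python
from itertools import product


class CElement:
    """Muller C-element: follows its inputs when they all agree, else holds."""

    def __init__(self):
        self.out = 0

    def update(self, inputs):
        if all(inputs):
            self.out = 1
        elif not any(inputs):
            self.out = 0
        return self.out


class DimsNode:
    """Dual-rail node built from one C-element per minterm (2**n in total)."""

    def __init__(self, n, func):
        self.minterms = list(product((0, 1), repeat=n))
        self.cells = {m: CElement() for m in self.minterms}
        # Minterms for which the function evaluates to 1 drive the "true" rail.
        self.on_set = {m for m in self.minterms if func(*m)}

    def update(self, rails):
        """rails: one (f(0), f(1)) pair per input; returns the output pair."""
        true_rail = false_rail = 0
        for m, cell in self.cells.items():
            # For each input pick the rail that is high when the input
            # equals the corresponding bit of this minterm.
            picked = [r[0] if bit == 1 else r[1] for r, bit in zip(rails, m)]
            fired = cell.update(picked)
            if m in self.on_set:
                true_rail |= fired
            else:
                false_rail |= fired
        return (true_rail, false_rail)


# Four-phase walk-through for a 2-input AND node.
node = DimsNode(2, lambda a, b: a & b)
step_spacer = node.update([(0, 0), (0, 0)])   # all inputs in space state
step_partial = node.update([(1, 0), (0, 0)])  # only the first input is valid
step_valid = node.update([(1, 0), (1, 0)])    # both inputs valid (1 AND 1)
step_leaving = node.update([(0, 0), (1, 0)])  # first input back to spacer
step_reset = node.update([(0, 0), (0, 0)])    # all inputs back to spacer
```

In this walk-through the output stays in the space state until every input is valid, and stays in the working state until every input has returned to the spacer, which is the strong-constraint behaviour described earlier.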

Page 9: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

NCL (Null Convention Logic)

Library of 27 special gates
Based on threshold functions
Any function up to 4 inputs can be implemented
  … but in dual-rail, 4 inputs = 2 variables only
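As a rough illustration of a gate "based on threshold functions", the sketch below models an unweighted m-of-n threshold gate with hysteresis, which is how NCL gates are commonly described: the output is set once at least m inputs are 1 and is cleared only when all inputs are 0. The class name `ThresholdGate` is hypothetical, and the weighted-input gates of the NCL library are not covered here.

```python
class ThresholdGate:
    """Unweighted m-of-n threshold gate with hysteresis (behavioural model)."""

    def __init__(self, m, n):
        if not 1 <= m <= n:
            raise ValueError("threshold must satisfy 1 <= m <= n")
        self.m = m
        self.n = n
        self.out = 0

    def update(self, inputs):
        if len(inputs) != self.n:
            raise ValueError("wrong number of inputs")
        high = sum(inputs)
        if high >= self.m:
            self.out = 1      # threshold reached: set the output
        elif high == 0:
            self.out = 0      # all inputs low: reset the output
        return self.out       # otherwise hold the previous value


th22 = ThresholdGate(2, 2)  # behaves like a 2-input C-element
th12 = ThresholdGate(1, 2)  # behaves like a 2-input OR
```

With four gate inputs available, a dual-rail node spends two inputs per variable, which is why the slide notes that a 4-input gate covers only two variables.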

Page 10: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Direct Logic

Two-level C-OR DIMS logic implemented as a single gate
Both positive and complemented outputs are provided
Different delays for each input

Page 11: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Comparison

Inputs  DIMS Trans.  DIMS Delay  Direct logic Trans.  Direct logic Delay
2       24           8.2         22                   8.2
3       64           15.1        34                   12.3
4       160          21.3        54                   19.4
5       384          N/A         90                   N/A
6       896          N/A         158                  N/A

NCL

2-input gate  Trans.  Delay
AND, OR       21      5.8
XOR           24      8.6
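The transistor counts in the comparison happen to fit simple closed forms, which makes the growth rates easier to see. The two formulas below are curve fits made for this note, not expressions given in the slides, and they are only checked against the five tabulated points (2 to 6 inputs). The names `dims_fit` and `direct_fit` are hypothetical.

```python
# Transistor counts copied from the comparison table (inputs -> transistors).
DIMS = {2: 24, 3: 64, 4: 160, 5: 384, 6: 896}
DIRECT = {2: 22, 3: 34, 4: 54, 5: 90, 6: 158}


def dims_fit(n):
    """Assumed fit: 2**n minterm cells, each costing 2n + 2 transistors."""
    return 2 ** n * (2 * n + 2)


def direct_fit(n):
    """Assumed fit: an exponential term plus a linear term in n."""
    return 2 ** (n + 1) + 4 * n + 6


# Both fits reproduce every tabulated value.
fits_table = all(
    dims_fit(n) == DIMS[n] and direct_fit(n) == DIRECT[n] for n in DIMS
)

# Ratio of DIMS to Direct logic cost for each node size.
ratios = {n: DIMS[n] / DIRECT[n] for n in DIMS}
```

If the fits are taken at face value, DIMS grows roughly as n times 2^n while Direct logic grows roughly as 2^(n+1), so the gap between the two widens as nodes get more inputs.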

Page 12: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Multi-Level Dual-Rail Network

Positive and complemented values of each signal provided
Each node implemented as DIMS, NCL, or Direct logic

Page 13: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Motivation & Proposed Method

State-of-the-art
  Nodes are implemented as simple gates (NAND, XOR)

4x 2-input gate = 22*4 = 88 transistors in Direct logic

Page 14: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Motivation & Proposed Method

Proposed
  Nodes are implemented as complex gates

1x 2-input gate + 1x 3-input gate = 22 + 34 = 56 transistors
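The two transistor totals can be reproduced by summing per-node costs from the Direct logic column of the comparison table. This is a small sketch of that arithmetic; the helper name `network_area` is hypothetical, and a network is described simply by the list of its node input counts.

```python
# Direct logic transistor counts per node, by number of inputs
# (copied from the comparison table).
DIRECT_LOGIC_TRANSISTORS = {2: 22, 3: 34, 4: 54, 5: 90, 6: 158}


def network_area(node_inputs):
    """Total transistors of a network given each node's input count."""
    return sum(DIRECT_LOGIC_TRANSISTORS[k] for k in node_inputs)


simple_gates = network_area([2, 2, 2, 2])  # four 2-input gates: 88
complex_gates = network_area([2, 3])       # one 2-input + one 3-input: 56
saving = 1 - complex_gates / simple_gates  # about 0.36, i.e. roughly 36%
```

For this example, mapping to complex gates saves about 36% of the transistors compared with the four simple gates.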

Page 15: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Motivation & Proposed Method

State-of-the-art
  Nodes are implemented as simple gates (NAND, XOR)
Proposed
  Nodes are implemented as complex gates, i.e. gates of a given number of inputs and any function
  Can be implemented both in DIMS and Direct logic
  Like FPGA LUTs
  Tools for synchronous synthesis can be used
    FPGA mapping

Page 16: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Where’s the Problem?

Facts:
Increasing the number of node inputs will:
  Decrease the number of nodes
  Decrease the number of levels
  Increase the node size
  Increase the node delay

Question:
Where is the trade-off?

Page 17: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Experimental Setup

228 circuits processed (MCNC, ISCAS)
Optimized by ABC choice script
  1. Mapped into k-input NANDs (ABC map command): state-of-the-art (k-NAND)
  2. Mapped into k-LUTs (ABC fpga command): complex gates (k-CG)
  3. Mapped into MCNC standard cells (ABC map): something in-between (SC)
k = 2…6
Implemented as DIMS, Direct logic, and NCL

Page 18: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Results - DIMS - Area

[Bar chart: total transistors (axis 0 to 16.0M) for 2-NAND to 6-NAND, 2-CG to 6-CG, and SC]

Page 19: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Results - DIMS - Area

[Bar chart: "Best in" percentage (axis 0% to 70%) for 2-NAND to 6-NAND, 2-CG to 6-CG, and SC]

Page 20: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Results - DIMS - Delay

[Bar chart: delay (axis 0 to 25.0k) for 2-NAND to 4-NAND, 2-CG to 4-CG, and SC]

Page 21: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Results - DIMS - Delay

[Bar chart: "Best in" percentage (axis 0% to 60%) for 2-NAND to 4-NAND, 2-CG to 4-CG, and SC]

Page 22: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Discussion - DIMS

Implementation using arbitrary 2-input gates is the best one, both in area and delay
  No big surprise: complexity (and delay) of DIMS grows exponentially with the number of gate inputs
Results are consistent: the more node inputs, the higher the area and delay

Page 23: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Results - Direct Logic - Area

[Bar chart: total transistors (axis 0 to 3.0M) for 2-NAND to 6-NAND, 2-CG to 6-CG, NCL, and SC]

Page 24: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Results - Direct Logic - Area

[Bar chart: "Best in" percentage (axis 0% to 50%) for 2-NAND to 6-NAND, 2-CG to 6-CG, NCL, and SC]

Page 25: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Results - Direct Logic - Delay

[Bar chart: delay (axis 0 to 20.0k) for 2-NAND to 4-NAND, 2-CG to 4-CG, NCL, and SC]

Page 26: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Results - Direct Logic - Delay

[Bar chart: "Best in" percentage (axis 0% to 70%) for 2-NAND to 4-NAND, 2-CG to 4-CG, NCL, and SC]

Page 27: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Discussion - Direct Logic

Implementation using 3-input complex gates is the best one, both in area and delay
  This is a good result confirming our theory
  Results are consistent, no coincidence
State-of-the-art 2-NAND implementation is extremely inefficient:
  21% area improvement
  19% delay improvement
3-CG implementation is even better than NCL
  10% area improvement
  19% delay improvement

Page 28: Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

Conclusions

Efficient implementation of asynchronous logic operating under strong constraints proposed
Tools (& methods) for synchronous synthesis are used for asynchronous synthesis
3-input complex nodes implemented using Direct logic
Extensive experiments confirmed the theory
Approx. 20% area and delay improvement vs. all state-of-the-art methods