

Page 1: 1 Synthesizing Datapath Circuits for FPGAs With Emphasis on Area Minimization Andy Ye, David Lewis, Jonathan Rose Department of Electrical and Computer

1

Synthesizing Datapath Circuits for FPGAs With Emphasis on

Area Minimization

Andy Ye, David Lewis, Jonathan Rose

Department of Electrical and Computer Engineering, University of Toronto

{yeandy, lewis, jayar}@eecg.utoronto.ca


2

Motivation: Datapath Regularity

• Larger FPGAs

– Larger applications on FPGAs

– More datapath logic in larger applications

– Datapath logic is highly regular

• Utilize regularity to improve logic density


3

Utilizing Datapath Regularity

• A new datapath-oriented FPGA

• New CAD tools supporting the new FPGA

– Synthesis

– Packing

– Placement

– Routing

• This talk focuses on synthesis


4

Background: Datapath-oriented FPGA

• Architected to utilize datapath regularity

• Architectural features

– Capture regularity using special logic blocks

– Increase logic density by coarse grain routing


5

Background: FPGA Overview

[Figure: FPGA overview. Logic clusters (L) and switch boxes (S) connected by routing channels that contain coarse grain routing tracks and fine grain routing tracks.]


6

Background: Logic Cluster

[Figure: a logic cluster built from Subcluster 1 to Subcluster 4 and a local routing network. Inset "A Subcluster" shows a group of BLEs. Inset "A Basic Logic Element (BLE)" shows a LUT, a DFF, and a MUX.]


7

Background: FPGA Overview

[Figure: FPGA overview. Logic clusters (L) and switch boxes (S) connected by routing channels that contain coarse grain routing tracks and fine grain routing tracks.]


8

Background: Coarse Grain Routing Tracks

[Figure: a logic cluster of four sub-clusters next to a switch box, with blocks labeled M on both the coarse grain routing and the fine grain routing.]


9

Datapath Synthesis

• Synthesis

– The first step in a fully automated CAD flow

– Transforms high level descriptions into logic

• Conventional synthesis (flat synthesis)

– Minimizes area and delay metrics

– Destroys datapath regularity

• Datapath synthesis

– Preserves datapath regularity

– Supports downstream CAD tools


10

Datapath Representation

• Datapath circuits are represented by netlists of datapath components (VHDL or Verilog)

• Datapath component library

– Multiplexers

– Adders/subtracters

– Shifters

– Comparators

– Registers

• Each component consists of identical bit-slices
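
As an illustration only, a datapath component could be modeled as a list of identical bit-slices, as in the sketch below. The names `BitSlice`, `DatapathComponent`, and `make_component` are hypothetical and do not come from the talk; the sketch only captures the idea that a component of width w is w copies of one bit-slice.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BitSlice:
    """One bit-wide slice of a datapath component (hypothetical model)."""
    kind: str   # e.g. "mux2", "full_adder", "register"
    index: int  # bit position inside the component


@dataclass
class DatapathComponent:
    name: str
    kind: str
    slices: list = field(default_factory=list)

    def is_regular(self) -> bool:
        # Regular here means every bit-slice has the component's kind.
        return all(s.kind == self.kind for s in self.slices)


def make_component(name: str, kind: str, width: int) -> DatapathComponent:
    # A component of the given width is `width` copies of the same bit-slice.
    return DatapathComponent(name, kind, [BitSlice(kind, i) for i in range(width)])


adder = make_component("add0", "full_adder", 32)
print(len(adder.slices), adder.is_regular())
```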


11

Hard Boundary Hierarchical Synthesis

• Optimize within the boundaries of bit-slices

• Keep identical bit-slices identical

• Optimized 15 datapath circuits from the Pico-java processor using Synopsys [sun]

– Good regularity

– Bad area: 38% area inflation

• The FPGA architecture aims to increase logic density

– Need a better synthesis tool


12

Causes of Area Inflation

• Examined circuits to determine the causes

• Constraint of preserving bit-slice boundaries

– Common sub-expressions exist across bit-slices

– Harder to discover in datapath synthesis

• Constraint of preserving datapath regularity

– Identical bit-slices have different external connections

– Some bit-slices have more optimization opportunities

– Optimization opportunities are missed if all bit-slices must be kept identical


13

Enhanced Module Compaction

[Figure: synthesis flow. Boxes are listed below in the order they appear; the legend entry is "Manual Operation".]

Netlist of Datapath Components

Word-level Optimization

Module Compaction

Bit-slice Netlist I/O Optimization

Flat Synthesis & Optimization Within Bit-slice Boundaries

Netlist of Synthesized Bit-slices


14

Word-level Optimization

• Currently done manually; will be automated

• Optimizes across bit-slice boundaries

• Uses the functionality of each datapath component to create optimization opportunities

• Two optimizations are performed

– Multiplexer tree collapsing

– Operation reordering

• More in the future


15

Multiplexer Tree Collapsing

• Datapath circuits contain multiplexers in a tree topology

• Collapses several multiplexers in a multiplexer tree into a single multiplexer

• Collapsing operation creates common sub-expressions

• Extracts common expressions out of multiple bit-slices to save area
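
The collapsing idea can be sketched as follows, under stated assumptions. The tree shape (two cascaded 2:1 multiplexers per bit) and the names `tree_word`, `select_logic`, and `collapsed_word` are hypothetical and chosen for illustration; they are not taken from the talk. The sketch checks that a per-bit multiplexer tree equals a single wider multiplexer whose select code is computed once and shared by every bit-slice.

```python
from itertools import product


def mux2(sel, x, y):
    """2:1 multiplexer: returns x when sel is 0, y when sel is 1."""
    return y if sel else x


def tree_word(s1, s2, a, b, c):
    # Original form: each bit-slice holds two cascaded 2:1 muxes.
    return [mux2(s2, mux2(s1, ai, bi), ci) for ai, bi, ci in zip(a, b, c)]


def select_logic(s1, s2):
    # Shared select logic, computed once per word rather than once per bit.
    # Encodes which input wins: 0 -> a, 1 -> b, 2 -> c.
    if s2:
        return 2
    return 1 if s1 else 0


def collapsed_word(s1, s2, a, b, c):
    sel = select_logic(s1, s2)  # common sub-expression pulled out of the slices
    return [(ai, bi, ci)[sel] for ai, bi, ci in zip(a, b, c)]


def equivalent(width=2):
    # Exhaustive check over all select values and all input words.
    words = list(product([0, 1], repeat=width))
    return all(
        tree_word(s1, s2, a, b, c) == collapsed_word(s1, s2, a, b, c)
        for s1, s2 in product([0, 1], repeat=2)
        for a in words for b in words for c in words
    )


print(equivalent())  # prints True
```

In this sketch the select decoding plays the role of the logic that becomes common to all bit-slices after collapsing, so it is built once instead of once per bit.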


16

An Example

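[Figure: multiplexer tree collapsing example, before and after. Labels: mux1, mux2, select signals S1 and S2, signals A and R, flip-flops (FF), and rl (random logic).]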


17

Operation Reordering

• Transforms result selection into operand selection

• Accepts the transformation if it results in a smaller area
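
A minimal sketch of the transformation, assuming the operation is an addition and the selection is a 2:1 multiplexer. The function names and the per-bit cost weights in `cost` are hypothetical; they only illustrate the accept-if-smaller test and are not figures from the talk.

```python
from itertools import product


def mux(s, x, y):
    """Returns x when s is 0, y when s is 1."""
    return y if s else x


def result_selection(s, a, b, c, d):
    # Two adders, then one mux picks one of the two sums.
    return mux(s, a + b, c + d)


def operand_selection(s, a, b, c, d):
    # Two muxes pick the operands, then a single adder.
    return mux(s, a, c) + mux(s, b, d)


def cost(adders, muxes, adder_cost=2, mux_cost=1):
    # Hypothetical per-bit cost weights, used only to illustrate the accept test.
    return adders * adder_cost + muxes * mux_cost


values = range(4)
same = all(
    result_selection(s, a, b, c, d) == operand_selection(s, a, b, c, d)
    for s in (0, 1)
    for a, b, c, d in product(values, repeat=4)
)
accept = cost(adders=1, muxes=2) < cost(adders=2, muxes=1)
print(same, accept)  # prints True True
```

With these assumed weights, trading one adder for one extra multiplexer lowers the cost, so the reordered form would be accepted; with different weights the original form could win.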


18

An Example

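[Figure: operation reordering example. Before: two adders take a, b and c, d, and a mux controlled by s selects the result e. After: muxes controlled by s select the operands (a or c, b or d), and a single adder produces e. The bit-slice views show sum and carry logic with signals a0, b0, c0, d0, cin0, cout0, s0, and e0.]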


19

Module Compaction

• Merges bit-slices into larger bit-slices

• Based on connectivity between datapath components

• Larger bit-slices have more optimization opportunities for flat synthesis

• Avoids merging based on carry chains

• Similar to the algorithm proposed by Koch
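
The merging rule can be sketched with a union-find pass over the connections between bit-slices. This is a simplified illustration, not the algorithm from the talk or from Koch: the function `compact`, the net tuples, and the assumed connectivity (each mux bit feeds the adder bit of the same index, and adder bits are linked by carries) are all hypothetical.

```python
def compact(slices, nets):
    """Merge bit-slices joined by non-carry nets; returns the merged groups."""
    parent = {s: s for s in slices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, dst, kind in nets:
        if kind == "carry":
            continue  # never merge along a carry chain
        parent[find(src)] = find(dst)

    groups = {}
    for s in slices:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


# Assumed example: four mux bit-slices and five adder bit-slices.
slices = [f"mux{i}" for i in range(4)] + [f"FA{i}" for i in range(5)]
nets = [(f"mux{i}", f"FA{i}", "data") for i in range(4)]
nets += [(f"FA{i}", f"FA{i + 1}", "carry") for i in range(4)]

merged = compact(slices, nets)
print(merged)
```

Skipping carry nets keeps the merged slices one bit wide in this sketch; following them would fuse every adder bit into a single group and leave no repeated bit-slice.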


20

An Example

[Figure: module compaction example with bit-slices mux0 to mux3 and FA0 to FA4.]


21

Bit-slice I/O Optimization

• Controlled by the granularity of bit-slice I/O optimization, m

• Breaks datapath components into m-bit wide chunks

• m bit-slices are kept identical to each other

• Allows some bit-slices in a datapath component to be optimized more than others


22

Bit-slice I/O Optimization

• Converts bit-slice I/O signals into internal signals if all m bit-slices meet an optimization criterion

• More optimization opportunities for flat synthesis

• Four types of I/O optimizations

– Constant absorption

– Feedback absorption

– Duplicated input absorption

– Unused output absorption
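
As a rough sketch of how the granularity m gates one of these optimizations, the code below checks constant absorption per m-bit chunk. It assumes the criterion is that every bit-slice in the chunk sees the same constant on the port; that criterion, the data layout, and the names `chunks` and `absorbable_constants` are assumptions made for illustration, not details given in the talk.

```python
def chunks(width, m):
    """Split bit positions 0..width-1 into consecutive m-bit chunks."""
    return [list(range(i, min(i + m, width))) for i in range(0, width, m)]


def absorbable_constants(port_values, m):
    """port_values[i] is 0, 1, or None (not a constant) for bit-slice i.

    Returns {chunk_start: constant} for every chunk whose bit-slices all
    see the same constant on this port (assumed criterion).
    """
    result = {}
    for chunk in chunks(len(port_values), m):
        seen = {port_values[i] for i in chunk}
        if len(seen) == 1 and None not in seen:
            result[chunk[0]] = seen.pop()
    return result


# Hypothetical 8-bit port: bits 0-3 tied to 0, bit 4 tied to 1, the rest are signals.
port = [0, 0, 0, 0, 1, None, None, None]
print(absorbable_constants(port, m=4))
print(absorbable_constants(port, m=1))
print(absorbable_constants(port, m=8))
```

In this example a smaller m lets more bit-slices absorb their constants, while m = 8 absorbs none, which matches the trade-off between regularity and area that the granularity controls.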


23

Experimental Results

• Fifteen benchmark circuits

– From the Pico-java processor

– Synthesized into 4-LUTs and DFFs

• Experiments

– Area

– Regularity

– Area against m (the granularity of bit-slice I/O optimization)


24

Area

• m (granularity of bit-slice I/O optimization) = 4

• Compare datapath synthesis with flat synthesis


25

Post-synthesis Area (LUT Count)

Circuit        Flat Synthesis Area   Datapath Synthesis Area   Area Inflation
icu_dpath      3120                  3235                      3.7%
ex_dpath       2530                  2553                      0.91%
multmod_dp     1558                  1634                      4.9%
ucode_dat      1243                  1304                      4.9%
imdr_dpath     1182                  1219                      3.1%
dcu_dpath      960                   966                       0.63%
mantissa_dp    846                   878                       3.8%
incmod_dp      779                   865                       11%
smu_dpath      490                   493                       0.61%
exponent_dp    477                   501                       5.0%
pipe_dpath     443                   471                       6.3%
prils_dp       377                   388                       2.9%
rsadd_dp       346                   305                       -12%
code_seq_dp    218                   223                       2.3%
ucode_reg      78                    82                        5.1%
Total Area     14647                 15117                     3.2%
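
The inflation figures are consistent with computing (datapath area - flat area) / flat area on the LUT counts. The formula is inferred from the numbers rather than stated in the talk, and the names `AREAS` and `inflation` are hypothetical. The sketch recomputes the per-circuit and total values:

```python
# (flat synthesis LUT count, datapath synthesis LUT count) per circuit,
# copied from the post-synthesis area results.
AREAS = {
    "icu_dpath": (3120, 3235),
    "ex_dpath": (2530, 2553),
    "multmod_dp": (1558, 1634),
    "ucode_dat": (1243, 1304),
    "imdr_dpath": (1182, 1219),
    "dcu_dpath": (960, 966),
    "mantissa_dp": (846, 878),
    "incmod_dp": (779, 865),
    "smu_dpath": (490, 493),
    "exponent_dp": (477, 501),
    "pipe_dpath": (443, 471),
    "prils_dp": (377, 388),
    "rsadd_dp": (346, 305),
    "code_seq_dp": (218, 223),
    "ucode_reg": (78, 82),
}


def inflation(flat, datapath):
    """Area inflation of datapath synthesis over flat synthesis, in percent."""
    return 100.0 * (datapath - flat) / flat


for name, (flat, dp) in AREAS.items():
    print(f"{name:12s} {inflation(flat, dp):7.2f}%")

total_flat = sum(flat for flat, _ in AREAS.values())
total_dp = sum(dp for _, dp in AREAS.values())
print(f"total {total_flat} {total_dp} {inflation(total_flat, total_dp):.1f}%")
```

The only circuit with negative inflation is rsadd_dp, where datapath synthesis produced fewer LUTs than flat synthesis.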


26

Regularity

• m (granularity of bit-slice I/O optimization) = 4

• Two-terminal connections captured by

– 4-bit wide buses

– 4-bit wide control groups


27

Regularity

[Figure: a 4-bit wide bus and a 4-bit wide control group, drawn between groups of bit-slices labeled S1 to S4.]


28

Regularity Results

Circuit        Two-Terminal Connections   4-bit Wide Buses   4-bit Wide Control Groups
dcu_dpath      2232                       49%                43%
ex_dpath       6547                       52%                39%
icu_dpath      8047                       47%                36%
imdr_dpath     3100                       50%                36%
pipe_dpath     1049                       48%                42%
smu_dpath      1167                       48%                25%
ucode_data     3143                       52%                41%
ucode_reg      194                        72%                21%
code_seq_dp    799                        58%                18%
exponent_dp    1362                       32%                23%
incmod_dp      2013                       42%                33%
mantissa_dp    2533                       47%                36%
multmod_dp     3380                       39%                25%
prils_dp       864                        41%                32%
rsadd_dp       722                        52%                27%
Total          37152                      48%                35%

• 94% of LUTs remain in regular datapath components


29

Granularity (m) Vs. Area

• Higher m (the granularity of bit-slice I/O optimization)

– Keeps more bit-slices identical

– Preserves more regularity

– Higher area cost


30

Granularity Vs. Area Inflation

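[Chart: area inflation in % (y-axis, 0 to 8) against granularity m (x-axis: 1, 4, 8, 12, 16, 20, 24, 28, 32). Data points are not recoverable from the transcript.]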


31

Conclusion

• Presented a datapath-oriented FPGA architecture

• Presented an enhanced module compaction algorithm

• Empirically demonstrated the area efficiency of the algorithm

– 3%-8% area inflation

• Good regularity

– 48% of two-terminal connections are in 4-bit wide buses

– 35% of two-terminal connections are in 4-bit wide control groups