Energy Efficient and High Speed On-Chip Ternary Bus

Chunjie Duan, Mitsubishi Electric Research Labs, Cambridge, MA, USA

Sunil P. Khatri, Texas A&M University, College Station, TX, USA


03/13/2008

Motivation

• Trends in VLSI design
  – Shrinking feature size: Deep SubMicron (DSM) and Very Deep SubMicron (VDSM) processes
  – Scaling down supply voltage
  – Increasing die size (e.g. SoC, NoC, CMP)
• Impacts
  – Benefits: smaller gate delay (high-speed logic), lower switching power per gate, high complexity (more than a billion gates)
  – Drawbacks: increasing power consumption, higher leakage current (standby power), reduced noise margin, increasing interconnect delay
• Interconnect delay >> gate delay
• Global interconnect becomes the performance bottleneck


On-chip Bus Interconnects

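• The impact of DSM / VDSM on wire geometry:
  – Width and pitch shrink: W↓, P↓
  – Length and thickness grow: L↑, T↑
• Thickness is increased to avoid a quadratic increase in the resistance of the wire as its width shrinks: $R = \rho L / (W T)$
• Inter-wire capacitance $C_I$ is much greater than substrate capacitance $C_L$, so crosstalk becomes dominant
  – $\lambda = C_I / C_L > 10$ for metal 4 in a … µm CMOS process

[Figure: wire cross-sections in an earlier process and in a DSM process, labeled with width W, thickness T, pitch P, substrate capacitance CL and inter-wire capacitance CI]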
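The wire resistance relation R = ρL/(WT) can be checked numerically. The following is a minimal sketch: the function name `wire_resistance` and the dimensions are illustrative assumptions rather than values from the slides, and the resistivity is roughly that of copper.

```python
def wire_resistance(rho: float, length: float, width: float, thickness: float) -> float:
    """Resistance of a rectangular wire, R = rho * L / (W * T), in ohms."""
    return rho * length / (width * thickness)


# Hypothetical dimensions in metres; RHO is approximately the resistivity of copper.
RHO = 1.7e-8
base = wire_resistance(RHO, length=1e-3, width=0.4e-6, thickness=0.4e-6)

# Shrinking only the width by 2x doubles R ...
narrow = wire_resistance(RHO, length=1e-3, width=0.2e-6, thickness=0.4e-6)
# ... while shrinking width and thickness together quadruples it.
narrow_thin = wire_resistance(RHO, length=1e-3, width=0.2e-6, thickness=0.2e-6)

print(narrow / base, narrow_thin / base)
```

Keeping the thickness fixed while the width scales is what limits the growth to linear, which is the motivation for the tall, narrow wires of DSM processes.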


Ternary Bus and Mapping

• Advantage of a ternary bus
  – Low voltage step: Vdd/2 instead of Vdd
• We propose a bit-to-bit binary-ternary mapping scheme
  – Each binary bit is mapped directly to a line on the ternary bus.
  – A binary 0 is mapped to the middle value on the ternary bus, i.e. 0b → 0t.
  – A binary 1 is mapped to either the high or the low value on the ternary bus, i.e. 1b → +t or 1b → -t.
• Disadvantage: lower bit density (1 bit/line vs. 1.58 bit/line for a true ternary bus)
• Advantages: direct mapping and flexible polarity
  – Direct mapping avoids true ternary-to-binary conversion, which is very slow and complex.
  – Flexible polarity results in low crosstalk, e.g. the ternary vectors +0+, -0-, +0- and -0+ all represent the same binary value 101.
• Each ternary value is represented by the polarity Pj and the magnitude Dj

Ternary driver truth table

Dj  Pj  Tj  Vout
0   X   0   V0
1   0   -   V-
1   1   +   V+
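The bit-to-bit mapping and the driver truth table can be sketched in a few lines. This is an illustration only: the function names `drive`, `encode` and `decode` are hypothetical, and the polarity string passed to `encode` stands in for whatever the real encoder would choose.

```python
import math


def drive(d: int, p: int) -> str:
    """Ternary driver truth table: (Dj, Pj) -> Tj. Pj is a don't-care when Dj is 0."""
    if d == 0:
        return "0"
    return "+" if p == 1 else "-"


def encode(bits: str, polarities: str) -> str:
    """Map each binary bit to one ternary line using a given polarity per line."""
    return "".join(drive(int(b), int(p)) for b, p in zip(bits, polarities))


def decode(ternary: str) -> str:
    """Recover the binary word: the middle level is 0, either extreme is 1."""
    return "".join("0" if t == "0" else "1" for t in ternary)


# All four polarity choices for the two 1-bits decode to the same binary word.
for vector in ("+0+", "-0-", "+0-", "-0+"):
    print(vector, "->", decode(vector))

# Bit density of a true ternary line, for comparison with 1 bit/line here.
print(math.log2(3))
```

Because decoding only looks at the magnitude, the encoder is free to pick polarities purely to reduce crosstalk.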


Crosstalk in a Multi-valued Bus

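• Define the effective crosstalk on line j as

$$X_{\mathrm{eff},j} = \operatorname{abs}\left(2\delta_j - \delta_{j-1} - \delta_{j+1}\right)$$

  – where $\delta_j = \Delta V_j / V_{step}$ is the normalized voltage change on line j, $V_{step} = V_{dd} / (NOL - 1)$, and NOL is the number of logic levels
• Delay can be approximated as

$$\tau_j = k\,C_L\,V_{step}\left(|\delta_j| + \lambda\,X_{\mathrm{eff},j}\right)$$

  – $\tau_j \approx k\,\lambda\,C_L\,V_{step}\,X_{\mathrm{eff},j}$ for $\lambda \gg 1$
• Energy consumption is

$$E_{total} = \sum_{j=1}^{n}\left(|\delta_j| + \lambda\,X_{\mathrm{eff},j}\right) C_L\,V_{step}^{2}$$

  – when $\lambda \gg 1$, $E_{total} \approx \lambda\,C_L\,V_{step}^{2}\sum_{j=1}^{n} X_{\mathrm{eff},j}$
• For a ternary bus, Vstep = Vdd/2, and we know
  – max(Xeff,j) = 8
  – min(Xeff,j) = 0
• Bus speed/power is highly data pattern dependent!

Table 1. Examples of Total Crosstalk

Vt-1  Vt   Xeff
000   +++  0
000   0++  1
000   0+-  5
+0+   0+0  4
+0+   0-0  0
-+0   +-0  6
+-+   -+-  8
+++   ---  0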
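The effective crosstalk can be evaluated mechanically for any pair of ternary vectors. The sketch below assumes the reading X_eff,j = abs(2δ_j − δ_j−1 − δ_j+1), with levels -, 0, + taken as -1, 0, +1 (so δ is already normalized to Vstep) and with an edge line seeing only its single neighbor; the name `x_eff` is hypothetical. Under this reading the middle line reproduces the Table 1 entries checked below, except the row 000 → 0+-, which evaluates to 3 rather than the listed 5 and is therefore left out.

```python
from typing import List

LEVEL = {"-": -1, "0": 0, "+": 1}


def x_eff(prev: str, curr: str) -> List[int]:
    """Per-line effective crosstalk for the transition prev -> curr."""
    delta = [LEVEL[c] - LEVEL[p] for p, c in zip(prev, curr)]
    n = len(delta)
    result = []
    for j in range(n):
        x = 0
        if j > 0:            # contribution of the left neighbor
            x += delta[j] - delta[j - 1]
        if j < n - 1:        # contribution of the right neighbor
            x += delta[j] - delta[j + 1]
        result.append(abs(x))
    return result


# Middle-line values for some of the Table 1 transitions.
for prev, curr in [("000", "+++"), ("000", "0++"), ("+0+", "0+0"),
                   ("-+0", "+-0"), ("+-+", "-+-"), ("+++", "---")]:
    print(prev, "->", curr, x_eff(prev, curr)[1])
```

The two extremes are visible directly: lines moving together give 0, while a full-swing line flanked by opposite full swings gives 8.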


A Low Power, High Speed 4X Ternary Bus

• Using direct bit-to-bit mapping
• Coding rules:
  – Rule #1: A direct - ↔ + transition is prohibited.
  – Rule #2: A 1b → 0b transition is mapped as -t → 0t or +t → 0t, depending only on the current polarity of the 1b.
  – Rule #3: For a 0b → 1b transition on bj, if bj-1 is transitioning, Pj is coded so that both lines transition in the same direction.
  – Rule #4: For a 0b → 1b transition on bj, if bj-1 is not transitioning and bj+1 is transitioning from 1 to 0, Pj is coded so that the jth and (j+1)th lines transition in the same direction.
  – Rule #5: For a 0b → 1b transition on bj, if there is no transition on either neighbor, Pj is coded so that Pj = Pj-1 or Pj = Pj+1, with Pj = Pj-1 having the higher priority.
• The 1st rule guarantees max(Xeff,j) = 4, and therefore a 2X speed-up over a conventional binary bus
• The other rules are designed to lower the probability of high Xeff,j values occurring on the bus
• Identical encoder/decoder logic for each bit

An example of 4X ternary sequences (Xeff is listed per line for the transition from the previous vector)

Binary    Ternary   Xeff
11110111  ++-000-+
00110101  00--0+0+  01100121
11100011  ++-000-+  01220111
01010100  0+0+0+00  10112122
10101110  -0-0-+-0  00001021
01110001  0+-+000-  01212200
00000011  000000--  13431121
00011110  000+++-0  00110121
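The bound that Rule #1 places on the worst-case crosstalk can be confirmed by brute force. The sketch below enumerates every transition of a three-line ternary bus and tracks the largest effective crosstalk seen by the middle line, assuming X_eff,j = abs(2δ_j − δ_j−1 − δ_j+1) with levels -1, 0, +1; the names `middle_x_eff` and `worst_case` are hypothetical. It checks Rule #1 only and does not implement the encoder.

```python
from itertools import product

LEVELS = (-1, 0, 1)


def middle_x_eff(prev, curr) -> int:
    """Effective crosstalk on the middle of three lines."""
    d = [c - p for p, c in zip(prev, curr)]
    return abs(2 * d[1] - d[0] - d[2])


def worst_case(rule1: bool) -> int:
    """Largest middle-line crosstalk over all three-line transitions.

    With rule1=True, transitions containing a direct - <-> + swing
    (a level change of magnitude 2) on any line are excluded.
    """
    worst = 0
    for prev in product(LEVELS, repeat=3):
        for curr in product(LEVELS, repeat=3):
            if rule1 and any(abs(c - p) == 2 for p, c in zip(prev, curr)):
                continue
            worst = max(worst, middle_x_eff(prev, curr))
    return worst


print(worst_case(rule1=False), worst_case(rule1=True))
```

With every line limited to a single Vstep change, the worst case halves, which is where the 2X speed-up comes from when delay is dominated by the crosstalk term.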


An Even Faster 3X Ternary Bus

• Partition the bus into 5-bit groups
• Insert a shield wire between groups
• Apply the same rules as for the 4X bus
• It can be proven that such a configuration guarantees max(Xeff) = 3
  – Additional 33% speed-up over the 4X ternary bus
• At the cost of 20% additional wires
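The 33% and 20% figures follow from simple ratios. The sketch below assumes that delay scales with the worst-case Xeff (the λ >> 1 regime) and that one shield wire is charged to each 5-bit group; the helper names `speedup` and `shield_overhead` are hypothetical.

```python
def speedup(old_max_xeff: int, new_max_xeff: int) -> float:
    """Fractional speed gain when worst-case delay scales with max(Xeff)."""
    return old_max_xeff / new_max_xeff - 1.0


def shield_overhead(group_size: int) -> float:
    """Extra wires per signal wire with one shield per group."""
    return 1.0 / group_size


print(f"3X over 4X: {speedup(4, 3):.0%} faster")
print(f"shield overhead for 5-bit groups: {shield_overhead(5):.0%}")
```

The same `speedup` helper gives 100% (that is, 2X) for going from a worst case of 8 to a worst case of 4.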

[Figure: 4X bus encoder and driver circuit (lines j-1 to j+1, continuing to j-2, … and j+2, …) and 3X bus encoder and driver circuit (a 5-bit group, lines j to j+4). Each bit Bj feeds an encoder (Enc) whose outputs Dj and Pj control a ternary driver producing Tj.]


Circuit Implementations

• Encoder implemented based on the 5 rules
• Decoder is extremely simple (implemented with two 2-input gates)
• Ternary driver and receiver can be implemented in current mode or voltage mode
  – Current mode is more power hungry (static current)
  – Voltage mode requires a low-impedance Vdd/2 supply
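A behavioral view of a voltage-mode receiver followed by the decoder might look like the sketch below. It is an assumption-laden illustration, not the circuit from the slides: the two thresholds are placed at 3/4 and 1/4 of Vdd purely for the example, and the names `receive` and `decode_bit` are hypothetical.

```python
def receive(v: float, vdd: float = 1.0) -> str:
    """Classify a line voltage into a ternary symbol using two reference levels."""
    v_hi = 0.75 * vdd  # hypothetical upper reference, midway between Vdd/2 and Vdd
    v_lo = 0.25 * vdd  # hypothetical lower reference, midway between 0 and Vdd/2
    if v > v_hi:
        return "+"
    if v < v_lo:
        return "-"
    return "0"


def decode_bit(symbol: str) -> int:
    """Binary value of a ternary symbol: only the middle level means 0."""
    return 0 if symbol == "0" else 1


for volts in (0.0, 0.5, 1.0):
    symbol = receive(volts)
    print(volts, symbol, decode_bit(symbol))
```

The decoder stays trivial because the bit is 1 whenever the line is away from the middle level, regardless of polarity.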

[Figure: (A) current mode: ENC, I-driver and I-receiver, with reference currents Iref and 2Iref and outputs out1, out2 and dout; (B) voltage mode: ENC, V-driver and V-receiver with shared V-ref (Vref1, Vref2). In both, the bus wire is modeled with R, CL and CI, with connections to the neighboring lines Dj-1 and Dj+1.]


Experimental Results

Crosstalk distribution and normalized energy consumption comparison (coded ternary vs. half-swing binary)

Bus size  Bus  0X     1X     2X     3X     4X    EF (x10^4)  Saving (%)
5         B    52821  81837  46056  20289  3792  25.0        34.5
5         T    74712  99228  28101  2754   0     16.3
8         B    16924  26509  14432  6123   1540  7.99        28.2
8         T    21792  31373  11104  1259   0     5.73
16        B    15541  25637  15437  7264   1641  8.49        27.2
16        T    19843  31302  12685  1690   2     6.17
32        B    14852  25109  15949  7771   1823  8.76        27.5
32        T    18976  31285  13550  1691   2     6.35

B: half-swing binary bus; T: coded ternary bus

• The power saving comes from the redistribution of the Xeff

– More transitions are pushed towards lower Xeff

• The average power saving is ~27%
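The EF column can be recomputed from the crosstalk distribution if EF is taken to be the sum of Xeff over all counted transitions, that is, the count in each class weighted by its class number. That weighting is an inference that happens to reproduce the table to within about 1%; it is not stated on the slide, and the names below are hypothetical.

```python
# Counts per crosstalk class 0X..4X, copied from the table (B: binary, T: ternary).
COUNTS = {
    (5, "B"): [52821, 81837, 46056, 20289, 3792],
    (5, "T"): [74712, 99228, 28101, 2754, 0],
    (8, "B"): [16924, 26509, 14432, 6123, 1540],
    (8, "T"): [21792, 31373, 11104, 1259, 0],
    (16, "B"): [15541, 25637, 15437, 7264, 1641],
    (16, "T"): [19843, 31302, 12685, 1690, 2],
    (32, "B"): [14852, 25109, 15949, 7771, 1823],
    (32, "T"): [18976, 31285, 13550, 1691, 2],
}


def energy_factor(counts) -> int:
    """Sum of Xeff over all transitions: class number times class count."""
    return sum(x * c for x, c in enumerate(counts))


def saving_percent(size: int) -> float:
    """Energy saving of the coded ternary bus relative to the binary bus."""
    binary = energy_factor(COUNTS[(size, "B")])
    ternary = energy_factor(COUNTS[(size, "T")])
    return 100.0 * (binary - ternary) / binary


for size in (5, 8, 16, 32):
    ef_b = energy_factor(COUNTS[(size, "B")]) / 1e4
    ef_t = energy_factor(COUNTS[(size, "T")]) / 1e4
    print(size, round(ef_b, 2), round(ef_t, 2), round(saving_percent(size), 1))
```

The coded bus has more transitions in the 0X and 1X classes and almost none in 3X and 4X, which is the redistribution the saving comes from.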

4X: ternary bus using 4X code; HB: half-swing binary bus; RP: ternary bus with random polarity; TT: true ternary bus


Experimental Results

• The proposed 4X and 3X busses are advantageous over other bus coding schemes.
• EF: normalized total energy
• PDP: power-delay product

Bus performance comparison

Bus type        4XT   3XT   SB    HB    RP     TT
EF (x10^4)      6.13  6.67  19.7  8.38  12.1   7.55
Delay           4x    3x    4x    4x    8x     8x
PDP (x10^5)     2.45  2.00  7.88  3.35  9.68   6.04
Pwr saving (%)  68.9  66.1  0     57.5  38.6   61.7
PDP gain (%)    68.9  74.6  0     57.5  -22.8  23.4
Bus area        1     1.2   1.97  1     1      0.68

4XT: ternary bus using 4X code; 3XT: ternary bus with 3X code; SB: binary bus with shielding; HB: half-swing binary bus; RP: ternary bus with random polarity; TT: true ternary bus
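The derived rows of the comparison can be recomputed from the EF and Delay rows. The sketch assumes that PDP is EF times the delay multiplier and that both percentages are measured against the shielded binary bus (SB); these relations are inferred because they reproduce the table, and the function names are hypothetical.

```python
# (EF in units of 10^4, delay multiplier) per bus type, copied from the table.
BUS = {
    "4XT": (6.13, 4), "3XT": (6.67, 3), "SB": (19.7, 4),
    "HB": (8.38, 4), "RP": (12.1, 8), "TT": (7.55, 8),
}
REFERENCE = "SB"


def pdp(name: str) -> float:
    """Power-delay product in units of 10^5."""
    ef, delay = BUS[name]
    return ef * delay / 10.0


def gain_percent(value: float, reference: float) -> float:
    """Relative improvement of value over reference, in percent."""
    return 100.0 * (1.0 - value / reference)


def power_saving(name: str) -> float:
    return gain_percent(BUS[name][0], BUS[REFERENCE][0])


def pdp_gain(name: str) -> float:
    return gain_percent(pdp(name), pdp(REFERENCE))


for name in BUS:
    print(name, round(pdp(name), 2), round(power_saving(name), 1), round(pdp_gain(name), 1))
```

The negative PDP gain for RP shows up directly: its lower energy does not make up for a delay that is twice that of the reference.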


Experimental Results

Eye diagrams for uncoded and coded busses (10 mm)


Summary

• Crosstalk classification was extended to multi-valued buses
• We proposed a direct bit-to-bit binary-ternary mapping scheme which results in a simple CODEC design.
• We proposed a 4X coding scheme that allows us to double the speed of a conventional ternary bus and save energy.
• We proposed a coding scheme (3X coding) to attain an additional 33% speed gain at the cost of 20% area overhead.
• We designed and implemented the CODEC and ternary driver/receiver.
• Our experimental results show significant power saving (27%) and speed gain (2X or more) over other schemes