after tech. mapping. 7. circuit level design buffer chain delay analysis of buffer chaindelay...


Post on 18-Dec-2015


Page 1: After Tech. Mapping. 7. Circuit Level Design Buffer Chain Delay analysis of buffer chainDelay analysis considering parasitic capacitance,C p Ck,Pk: stage

After Tech. Mapping

[Figure: bar chart of Power (mW) and Ratio (axis 0 to 80) versus Fanin and Height h, for K1 = 3, k2 = 3. Series: SIS+LEVEL MAP, SIS+OURS+LEVEL MAP, Improvement Ratio.]


7. Circuit Level Design

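Buffer Chain

• Delay analysis of buffer chain

• Delay analysis considering parasitic capacitance, Cp

[Figure: chain of n buffer stages between the input and the load. Stage sizes are 1, β, ..., β^(i-2), β^(i-1), ..., β^(n-1); the stage input capacitances are Cin, ..., β^(i-1) Cin, β^i Cin, ..., and the load is CL = β^n Cin.]

\[
\begin{aligned}
(W/L)_k &= \beta\,(W/L)_{k-1}, \qquad \beta^{n} = C_L / C_{in} \\
T_d &= \sum_{k=1}^{n} t_k = n\,\beta\,t_0 \\
n &= \frac{\ln(C_L/C_{in})}{\ln \beta} \\
T_d &= t_0\,\beta\,\frac{\ln(C_L/C_{in})}{\ln \beta} \\
\frac{\partial T_d}{\partial \beta} &= 0 \;\Rightarrow\; \beta_{optimum} = e \approx 2.72, \qquad n_{optimum} = \ln(C_L/C_{in})
\end{aligned}
\]

Here β is the size ratio between successive stages and t0 is the delay of a stage driving an identical stage.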

) : (typical 10~21

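\[
\begin{aligned}
C_k &= \beta^{k} C_{in} + \beta^{k-1} C_p \\
P_k &= C_k V_{dd}^{2} f = V_{dd}^{2} f\,(\beta^{k} C_{in} + \beta^{k-1} C_p) \\
P_T &= \sum_{k=1}^{n} P_k = V_{dd}^{2} f\,(\beta C_{in} + C_p) \sum_{k=1}^{n} \beta^{k-1} \\
Eff &= \frac{P_n}{P_T} = \frac{\beta^{n-1}(\beta - 1)}{\beta^{n} - 1} = \frac{\beta^{n} - \beta^{n-1}}{\beta^{n} - 1}
\end{aligned}
\]

Ck, Pk: total capacitance and power at the output of the stage-k buffer

PT: power consumption of the buffer chain

Pn: power consumption of the load capacitance CL

Eff: power efficiency, Pn/PT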
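The delay result for the buffer chain can be checked numerically. The sketch below is a minimal illustration, assuming the standard tapered-chain model in which each stage is β times larger than the previous one and the chain delay is n·β·t0. The function name and the example values are hypothetical, and n is rounded to an integer because a real chain has a whole number of stages.

```python
import math

def size_buffer_chain(c_load, c_in, t0):
    """Pick the stage count and taper ratio of a buffer chain.

    Assumes the tapered-chain model: beta**n = c_load / c_in and
    total delay = n * beta * t0. All names here are illustrative.
    """
    ratio = c_load / c_in
    # n_optimum = ln(CL/Cin), rounded to a whole number of stages (at least 1)
    n = max(1, round(math.log(ratio)))
    # Re-derive beta so that beta**n still equals CL/Cin exactly
    beta = ratio ** (1.0 / n)
    delay = n * beta * t0
    return n, beta, delay

# Hypothetical example: the load is 1000 times the input capacitance
n, beta, delay = size_buffer_chain(c_load=1000.0, c_in=1.0, t0=1.0)
print(n, round(beta, 3), round(delay, 2))
```

With an integer stage count the taper ratio lands close to, but not exactly at, e; the delay curve is flat around the optimum, so the penalty is small.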

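Slew Rate

• Determining rise/fall time

[Figure: input waveform Vin with period T, rise time tr and fall time tf. The short-circuit current Ishort flows between t1 and t3, while Vin is between Vtn and Vdd + Vtp, and reaches Imax at t2; Imean is its average value.]

\[
\begin{aligned}
I_{mean} &= \frac{2}{T}\left[\int_{t_1}^{t_2} I_{short}(t)\,dt + \int_{t_2}^{t_3} I_{short}(t)\,dt\right]
         = \frac{4}{T}\int_{t_1}^{t_2} I_{short}(t)\,dt \\
         &= \frac{4}{T}\int_{t_1}^{t_2} \frac{\beta}{2}\,\bigl(V_{in}(t) - V_t\bigr)^{2}\,dt,
         \quad \text{where } V_{tn} = -V_{tp} = V_t,\; \beta_n = \beta_p = \beta \\
P_{SC} &= I_{mean} V_{dd} = \frac{\beta}{12}\,(V_{dd} - 2V_t)^{3}\,\tau f,
         \quad \text{where } t_r = t_f = \tau
\end{aligned}
\]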
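As a rough check of the short-circuit power expression, the sketch below evaluates the closed form P_SC = (β/12)(Vdd - 2Vt)^3 τ f and compares it with a direct numerical integration of the saturation current over the rising edge. It assumes a symmetric inverter (Vtn = -Vtp = Vt, equal β), a linear input ramp, and no load; the function names and the device values are hypothetical.

```python
def short_circuit_power(beta, vdd, vt, tau, freq):
    """Closed-form short-circuit power of an unloaded symmetric inverter."""
    if vdd <= 2.0 * vt:
        return 0.0  # both devices are never on at the same time
    return beta / 12.0 * (vdd - 2.0 * vt) ** 3 * tau * freq

def short_circuit_power_numeric(beta, vdd, vt, tau, freq, steps=10000):
    """Same quantity by midpoint-rule integration of I(t) from t1 to t2."""
    if vdd <= 2.0 * vt:
        return 0.0
    t1 = vt / vdd * tau      # input crosses Vt
    t2 = tau / 2.0           # input reaches Vdd/2, where the current peaks
    dt = (t2 - t1) / steps
    charge = 0.0
    for i in range(steps):
        t = t1 + (i + 0.5) * dt
        v_in = vdd * t / tau             # linear input ramp
        charge += 0.5 * beta * (v_in - vt) ** 2 * dt
    i_mean = 4.0 * freq * charge         # (4/T) * integral, with f = 1/T
    return vdd * i_mean

# Hypothetical device values: beta in A/V^2, 1 ns edges, 50 ns cycle
args = dict(beta=100e-6, vdd=3.3, vt=0.7, tau=1e-9, freq=20e6)
print(short_circuit_power(**args), short_circuit_power_numeric(**args))
```

The cubic dependence on (Vdd - 2Vt) is why the short-circuit component vanishes once the supply drops below the sum of the two threshold voltages.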

Slew Rate (Cont'd)

• Power consumption of short-circuit current in an oscillation circuit

[Figure: inverter-based oscillation circuits supplied from Vdd, with input Vi and output Vo.]

Pass Transistor Logic

• Reducing Area/Power

– Macro cells (a large part of the chip area): XOR/XNOR/MUX (primitives) → pass transistor logic

– No charge/discharge scheme → appropriate for low-power logic

• Pass Tr. logic family

– CPL (Complementary Pass Transistor Logic)

– DPL (Dual Pass Transistor Logic)

– SRPL (Swing Restored Pass Transistor Logic)

• CPL

– Basic scheme

– Inverter buffering

[Figure: CPL basic scheme and inverter-buffered CPL with a p-MOS latch; inputs A, B and their complements, supply Vdd.]

Pass Transistor Logic (Cont'd)

• DPL

– Pass Tr. network + dual p-MOS

– Enables rail-to-rail swing

– Characteristics

• Increased input capacitance (delay)

• Increased driving ability, because two ON-paths exist

• Equals CPL in input loading capacitance

• SRPL

– Pass Tr. network + cross-coupled inverter

– Restores the logic level

– Inverter size must not be too big

[Figure: DPL and SRPL gate schematics with inputs A, B and their complements; the SRPL circuit is built around an n-MOS CPL network.]

Dynamic Logic

• Using precharge/evaluation scheme

• Family

– Domino logic

– NORA (NO RAce) logic

• Characteristics

– Decreased input loading capacitance

– Power consumption in the precharge clock

– Increased useless switching during the precharge period

• Basic architecture of Domino logic

[Figure: Domino gate with precharge transistor P1 and evaluation transistor N1 driven by clk, an N-logic block with inputs A and B, input capacitance Cin and load CL; timing diagram of clk, A and B over the precharge and evaluation phases.]

Input Pin Ordering

• Reorder the equivalent inputs to a transistor based on critical path delays and power consumption

• N-input primitive CMOS logic

– Symmetrical at the function level

– Antisymmetrical at the transistor level

• capacitance of output stage

• body effect

• Scheme

– The signal that has many transitions must be far from the output

– If it is hard to estimate the switching frequency, determine the pin ordering by considering the path and the path-delay balance from the primary inputs to the transistor inputs.

• Example of N-input CMOS logic

[Figure: series transistor stack with inputs A, B, C, D, internal node capacitances C1, C2, C3 and load CL.]

Experiment with a TI gate array: for a 4-input NAND gate in TI's BiCMOS gate array library (with a load of 13 inverters), the delay varies by 20% and the power dissipation by 10% between a good and a bad ordering.

INPUT PIN Reordering

[Figure: 4-input NAND gate with inputs A, B, C, D. The p-MOS transistors MPA, MPB, MPC, MPD connect the output to VDD; the n-MOS transistors MNA, MNB, MNC, MND form the series stack, with internal node capacitances CB, CC, CD and load CL. Panels (a) to (d) show the four input cases.]

Simulation result (tcycle = 50 ns, tf/tr = 1 ns): 38.4 uW when A is the critical input, 47.2 uW when D is the critical input.

Sensitization

• Example

[Figure: gate network with inputs X1, X2, X3 and output Y.]

\[
\begin{aligned}
Y &= (X_1 X_2) + X_3 \\
\frac{\partial Y}{\partial X_1} &= f[X_1{=}1] \oplus f[X_1{=}0] = (X_2 + X_3) \oplus X_3 = X_2\,\overline{X_3}
\end{aligned}
\]

• Definition

– sensitization: an input signal that forces an output transition event

– sensitization vector: the values of the other inputs when one signal is sensitized

\[
\begin{aligned}
\frac{\partial Y}{\partial X_i} &= f[X_i{=}1] \oplus f[X_i{=}0] \\
f[X_i{=}1] &= f(X_1, \ldots, X_{i-1}, 1, X_{i+1}, \ldots, X_n) \\
f[X_i{=}0] &= f(X_1, \ldots, X_{i-1}, 0, X_{i+1}, \ldots, X_n)
\end{aligned}
\]
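The sensitization vectors of an input can be enumerated by brute force: an assignment of the other inputs belongs to the set exactly when flipping the chosen input flips the output, which is the Boolean difference f[Xi=1] XOR f[Xi=0]. The sketch below is a small illustration; the helper name is hypothetical, and the example function Y = X1·X2 + X3 is the reading of this slide's example assumed here.

```python
from itertools import product

def sensitization_vectors(f, i, n):
    """List the assignments of the other n-1 inputs that sensitize input i.

    f takes n arguments in {0, 1}; i is the zero-based index of the input
    under test. An assignment is kept when f[Xi=1] XOR f[Xi=0] is 1.
    """
    vectors = []
    for rest in product((0, 1), repeat=n - 1):
        with_zero = rest[:i] + (0,) + rest[i:]
        with_one = rest[:i] + (1,) + rest[i:]
        if f(*with_zero) != f(*with_one):
            vectors.append(rest)
    return vectors

def y(x1, x2, x3):
    # Assumed example function: Y = X1.X2 + X3
    return (x1 & x2) | x3

# X1 is sensitized only when X2 = 1 and X3 = 0
print(sensitization_vectors(y, 0, 3))  # [(1, 0)]
```

Exhaustive enumeration is exponential in the number of inputs, so it only suits small gates; it is meant to make the definition concrete, not to replace a symbolic method.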

Sensitization (Cont'd)

• Considering sensitization in combinational logic: removes unnecessary transitions in the combinational logic

• Considering sensitization in sequential logic: also reduces the power consumption in the flip-flops

[Figure: combinational logic blocks with inputs X1 ... Xn, an enable signal E and output Y, shown for the combinational case and for the sequential case with D flip-flops clocked by clk.]

TTL-Compatible

• TTL-level signal → CMOS input

• Characteristic curve of CMOS inverter

[Figure: CMOS inverter (Vdd = 3.3 V) driven by a TTL input Vin, and its transfer curve Vo versus Vi with marks at VIL = 0.8 V, 1.4 V, VIH = 2.0 V and Vdd = 3.3 V. The drain currents IDTTL1 and IDTTL2 are indicated, with Ileak = avg(Id1, Id2).]

\[
P_{TTL} = N_{TTL}\, V_{dd}\, (I_{DTTL1} + I_{DTTL2})
\]

where N_TTL: number of TTL-compatible input pads

TTL Compatible (Cont'd)

• CMOS output signal → TTL input

– Because of the sink current IOL, the CMOS output generates a large amount of heat

– Increased chip operating temperature

– Power consumption of the whole system

[Figure: CMOS output pad driving a TTL input pad across the chip boundaries, with output-low voltage VOL and sink current IOL.]

INPUT PIN Reordering

◈ To reduce the power dissipation, one should place the input with low transition density near the ground end.

(a) If MNA turns off, only CL needs to be charged.

(b) If MND turns off, all of CL, CB, CC and CD need to be charged.

(c) If the critical input is rising and placed near the output node, the initial charges of CB, CC and CD are zero, and the delay of discharging CL is less than in (d).

(d) If the critical input is rising and placed near the ground end, the charges of CB, CC and CD must discharge before the charge of CL discharges to zero.

Low-Power Booth Multiplier Design

Sungkyunkwan University, School of Electrical and Computer Engineering

Jin-Hyuk Kim, Jun-Sung Lee, Jun-Dong Cho

Modified Booth Multiplier

Recoded digit   Operation on X
 0              Add 0 to the partial product
+1              Add X to the partial product
+2              Shift X left one position and add it to the partial product
-1              Add the two's complement of X to the partial product
-2              Take the two's complement of X and shift left one position

Y2i+1   Y2i   Y2i-1   Recoded digit   Operation on X
0       0     0        0               0X
0       0     1       +1              +1X
0       1     0       +1              +1X
0       1     1       +2              +2X
1       0     0       -2              -2X
1       0     1       -1              -1X
1       1     0       -1              -1X
1       1     1        0               0X

• Multibit recoding halves the number of partial products, which enables high-speed multiplication.

• Multiplicand: X, multiplier: Y

Recoded digit = Y2i-1 + Y2i - 2 Y2i+1 (Y-1 = 0)

< Generation and operation of recoded digit >

Modified Booth Multiplier - Example

• Example

            10010101 = X  (-107)
            01101001 = Y  (+105)

Partial products (with sign extension)   Operation   Bits recoded
1111111110010101                         +1          010
00000011010110                           -2          100
000001101011                             -1          101
1100101010                               +2          011

1101010000011101 = P  (-11235)
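The recoding rule and the example above can be reproduced in a few lines. The sketch below is a minimal illustration of radix-4 (modified Booth) recoding using recoded digit = Y2i-1 + Y2i - 2·Y2i+1 with Y-1 = 0; the function names are hypothetical, and the bit width is assumed to be even.

```python
def booth_radix4_digits(y, bits):
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    Returns the digits least-significant first. `bits` must be even.
    """
    y_bits = [(y >> k) & 1 for k in range(bits)]  # also valid for negative y
    digits = []
    prev = 0  # Y(-1) = 0
    for i in range(0, bits, 2):
        digits.append(prev + y_bits[i] - 2 * y_bits[i + 1])
        prev = y_bits[i + 1]
    return digits

def booth_multiply(x, y, bits):
    """Sum the partial products digit * X * 4**i selected by the recoding."""
    return sum(d * x * 4 ** i
               for i, d in enumerate(booth_radix4_digits(y, bits)))

digits = booth_radix4_digits(105, 8)
product_value = booth_multiply(-107, 105, 8)
print(digits)                                 # [1, -2, -1, 2]
print(product_value)                          # -11235
print(format(product_value & 0xFFFF, "016b")) # 1101010000011101
```

Only four partial products are summed for an 8-bit multiplier, which is the halving that the recoding is meant to achieve.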

Wallace Tree - 4:2 Compressor

[Figure (a): 8 x 8 partial-product array (X7 ... X0, Y7 ... Y0) reduced by a 1st stage and a 2nd stage of compressors to two summands to be added. Legend: zero, bit jumping level, partial product, bit generated by compressor.]

[Figure (b): block diagram. X3, X2, X1, X0 and X7, X6, X5, X4 each enter a 4*8 partial product generator together with Y. Each generator feeds 8 4-2 compressors (1st stage, block A and block B), followed by 11 4-2 compressors (2nd stage, block C) and a 16-bit adder producing P15 ... P0.]

Multipliers - Area

• 16-bit Multiplier Area

Multiplier type   Area (mm2)   Gate count
Array             4.2          2,378
Wallace           8.1          2,544
Modified Booth    8.5          3,375

Multiplier - Power

• Average Power Dissipation (16-bit)

Multiplier type   Power (mW)   Logic transitions
Array             43.5         7,224
Wallace           32.0         3,793
Modified Booth    41.3         3,993

Multiplier - Delay

• Worst-Case Delay (16-bit)

Multiplier type   Delay (ns)   Gate delays
Array             92.6         50
Wallace           54.1         35
Modified Booth    45.4         32

Instruction Level Power Analysis

• Estimate the power dissipation of instruction sequences and of a program

• Eb: base cost of individual instructions

Es: circuit state change effects

• EM: the overall energy cost of a program

Bi: the base cost of a type i instruction

Ni: the number of type i instructions

Oi,j: the cost incurred when a type i instruction is followed by a type j instruction

Ni,j: the number of times a type i instruction is immediately followed by a type j instruction

\[
E_b = \sum_i B_i N_i, \qquad E_s = \sum_{i,j} O_{i,j} N_{i,j}, \qquad E_M = E_b + E_s
\]
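The energy model can be evaluated directly from an instruction trace. The sketch below is a minimal illustration of E_M = E_b + E_s; the instruction names and all cost values are hypothetical placeholders, and it assumes the circuit-state cost is symmetric (O_i,j = O_j,i), which the slide does not state.

```python
from collections import Counter

def program_energy(program, base_cost, overhead):
    """Return (E_b, E_s, E_M) for a sequence of instruction names.

    base_cost[i] is B_i; overhead[(i, j)] is O_i,j keyed by the sorted
    pair, i.e. the cost is assumed symmetric in i and j.
    """
    n = Counter(program)                           # N_i
    n_pairs = Counter(zip(program, program[1:]))   # N_i,j for adjacent pairs
    e_b = sum(base_cost[i] * count for i, count in n.items())
    e_s = sum(overhead[tuple(sorted(pair))] * count
              for pair, count in n_pairs.items())
    return e_b, e_s, e_b + e_s

# Hypothetical costs in pJ, chosen only to make the arithmetic easy to follow
base_cost = {"LOAD": 1.5, "ADD": 1.0, "SHIFT": 0.25}
overhead = {("ADD", "LOAD"): 1.0, ("ADD", "SHIFT"): 0.5, ("ADD", "ADD"): 0.25}

program = ["LOAD", "ADD", "SHIFT", "ADD", "ADD"]
print(program_energy(program, base_cost, overhead))  # (4.75, 2.25, 7.0)
```

Because E_s depends on adjacency, reordering the same instructions changes the total even though E_b stays fixed.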

Instruction ordering

• Develop a technique of operand swapping

• Recoding weight: the operation cost required by an operand

• Wtotal: total recoding weight of the input operand

Wi: weight of individual recoded digit i in the Booth multiplier

Wb: base weight of an instruction

Winter: inter-operation weight of instructions

\[
W_{total} = \sum_i W_i, \qquad W_i = W_b + W_{inter}
\]

• Therefore, if an operand has the lower Wtotal, put it in the second input (multiplier).

RESULT

Instruction      Base cost   Circuit state effects [pJ] when switched
name             [pJ]        LOAD    ADD     2's complement   SHIFT
LOAD             1.46        0.18    1.20    1.08             0.73
ADD              0.86                0.31    0.49             0.61
2's complement   0.77                        0.27             0.34
SHIFT            0.29                                         0.15

< 4 by 4 multiplier >

Instruction      Base cost   Circuit state effects [pJ] when switched
name             [pJ]        LOAD    ADD     2's complement   SHIFT
LOAD             3.25        0.40    2.67    2.38             1.63
ADD              1.91                0.58    1.11             1.44
2's complement   1.72                        0.55             0.78
SHIFT            0.65                                         0.38

< 8 by 8 multiplier >

Instruction      Base cost   Circuit state effects [pJ] when switched
name             [pJ]        LOAD    ADD     2's complement   SHIFT
LOAD             4.81        0.59    3.96    3.57             2.40
ADD              2.83                1.02    1.63             2.12
2's complement   2.55                        1.00             1.14
SHIFT            0.96                                         0.78

< 12 by 12 multiplier >

Conclusion

[Figure: two bar charts. One has an axis from 0 to 12 over 4bit, 8bit, 12bit and average; the other has an axis from 0 to 35 over 4bit, 8bit and 12bit. Labels: Power [pJ] versus bits, with circuit state effects considered and not considered, and % of instances with circuit state effects. Annotations: 4.0% reduction, 12.0% reduction, 9.0% reduction.]


8. Layout Level Design

Device Scaling by a Factor of S

• Constant-scaled wire increases coupling capacitance by S and wire resistance by S

• Supply voltage by 1/S, threshold voltage by 1/S, current drive by 1/S

• Gate capacitance by 1/S, gate delay by 1/S

• Global interconnection delay, RC load + parasitics by S

• Interconnect delay: 50-70% of clock cycle

• Area: 1/S^2

• Power dissipation by 1/S - 1/S^2 (P = nCVdd^2f, where nC is the sum of capacitance times #transitions)

• SIA (Semiconductor Industry Association): in 2007, physical limitation: 0.1 µm, 20 billion transistors, 10 square centimeters, 12 or 16 inch wafer

Delay Variations at Low-Voltage

• At high supply voltage, the delay increases with temperature (mobility decreases with temperature), while at very low supply voltages the delay decreases with temperature (VT decreases with temperature).

• At low supply voltages, the delay ratio between large and minimum transistor widths W increases by a factor of several.

• Clock trees are delay-balanced by wire snaking in order to avoid clock skew. At low supply voltages, slight VT variations can significantly modify this delay balancing.

Quarter Micron Challenge

• Computers/peripherals (SOC): 1996 ($50 billion), 1999 ($70 billion)

• Wiring dominates delay: wire R comparable to gate driver R; wire-to-wire coupling C > C to ground

• Push beyond 0.07 micron

• Quest for area (past), speed-speed (now), power-power-power (future)

• Accelerated increases of clock frequencies

• Signal integrity-based tools

• Design styles (chip + packages)

• System-level design (system partitioning)

• Synthesis with multiple constraints (power, area, timing)

• Partitioning/MCM

• Increasing speed limits complicate clock and power distribution

• Design bounded by wires, vias, via resistance, coupling

• Reverse scaling: adding area/spacing as needed: widening and thickening of wires, metal shielding and noise avoidance (adding metal)

CLOCK POWER CONSUMPTION

• Clock power consumption is as large as the logic power. Because the clock signal carries the heaviest load and switches at high frequency, clock distribution is a major source of power dissipation.

• In a microprocessor, 18% of the total power is consumed by clocking.

• Clock distribution is designed as a hierarchical clock tree, according to the decomposition principle.


Power Consumption per block in typical microprocessor


Crosstalk

Solution for Clock Skew

• Dynamic effects on skew: capacitance coupling

• Supply voltage deviation (clock driver and receiver voltage difference)

• Capacitance deviation by circuit operation

• Global and local temperature

• Layout issues: clocks routed first

• Must be aware of all sources of delay

• Increased spacing

• Wider wires

• Insert buffers

• Specialized clocks need net matching

• Two approaches: single driver, H-tree driver

• Gated clocks: local clocks that are conditionally enabled so that the registers are clocked only during the write cycles. The clock is partitioned into different blocks, and each block is clocked with its own clock.

• Gating the clocks to infrequently used blocks does not provide an acceptable level of power savings.

• Divide the basic clock frequency to provide the lowest clock frequency needed to different parts of the circuit.

• Clock distribution: large clock buffers waste power. Use smaller clock buffers with a well-balanced clock tree.


PowerPC Clocking Scheme


CLOCK DRIVERS IN THE DEC ALPHA 21164

DRIVER for PADS or LARGE CAPACITANCES

Off-chip power (drivers and pads) is increasing and is very difficult to reduce, because pad and driver sizes cannot be decreased with new technologies.


Layout-Driven Resynthesis for Lower Power

Low Power Process

• Dynamic Power Dissipation

[Figure: CMOS inverter (Vin, Vo, Vdd) with overlap capacitances Covp, Covn and drain junction capacitances Cdjp, Cdjn; drain region of width W and length D with junction capacitances Cjb and Cjsw.]

\[
\begin{aligned}
P_d &= C_L V_{dd}^{2} f \\
I_{ds} &= \frac{\beta}{2}\,(V_{gs} - V_t)^{2} \\
C_{gate} &= \sum_{i=1}^{n} (C_{ox} W L)_i \\
C_{in} &= \sum_{j=1}^{m} (C_{gate})_j \\
C_{ov} &= C_{GD0}\, W \\
C_{dj} &= C_j A_D + C_{jsw} P_D \\
A_D &= W D, \qquad P_D = 2\,(W + D)
\end{aligned}
\]
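The drain junction capacitance and the dynamic power can be worked through numerically. The sketch below assumes the relations C_dj = C_j·A_D + C_jsw·P_D with A_D = W·D and P_D = 2(W + D), and P_d = C_L·Vdd^2·f, as read from this slide; the function names and the process values are hypothetical.

```python
def drain_junction_capacitance(w, d, c_j, c_jsw):
    """C_dj = C_j * A_D + C_jsw * P_D for a rectangular drain of size W x D.

    c_j is the bottom capacitance per unit area, c_jsw the sidewall
    capacitance per unit length.
    """
    area = w * d                # A_D = W * D
    perimeter = 2.0 * (w + d)   # P_D = 2 (W + D)
    return c_j * area + c_jsw * perimeter

def dynamic_power(c_load, vdd, freq):
    """P_d = C_L * Vdd^2 * f."""
    return c_load * vdd ** 2 * freq

# Hypothetical values: 2 um x 3 um drain, C_j in F/um^2, C_jsw in F/um
c_dj = drain_junction_capacitance(w=2.0, d=3.0, c_j=0.5e-15, c_jsw=0.25e-15)
print(c_dj)
print(dynamic_power(c_load=c_dj, vdd=3.3, freq=50e6))
```

Shrinking the drain length D reduces both the area and the perimeter term, which is one way a process or layout choice lowers the switched capacitance.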

Crosstalk

• In deep-submicron layouts, some of the nets connecting modules can be so long that their resistance is comparable to the resistance of the driver.

• Each net in a mixed analog/digital circuit is classified according to its crosstalk sensitivity:

– 1. Noisy = high impedance signal that can disturb other signals, e.g., clock signals.

– 2. High-Sensitivity = high impedance analog nets; the most noise-sensitive nets, such as the input nets to operational amplifiers.

– 3. Mid-Sensitivity = low/medium impedance analog nets.

– 4. Low-Sensitivity = digital nets that directly affect the analog part in some cells, such as control signals.

– 5. Non-Sensitivity = the most noise-insensitive nets, such as pure digital nets.

• The crosstalk between two interconnection wires also depends on the frequencies (i.e., signal activities) of the signals traveling on the wires. Recent deep-submicron designs require crosstalk-free channel routing.

Power Measure in Layout

• The average dynamic power consumed by a CMOS gate is given below, where C_l is the load capacitance at the output node, V_dd is the supply voltage, T_cycle is the global clock period, N is the number of transitions of the gate output per clock cycle, C_g is the load capacitance due to the input capacitance of the fanout gates, and C_w is the load capacitance due to the interconnection tree formed between the driver and its fanout gates.

• Pav = 0.5 (Vdd^2 / Tcycle) Cl N = 0.5 (Vdd^2 / Tcycle) (Cg + Cw) N

• Logic synthesis for low power attempts to minimize SUMi Cgi Ni

• Physical design for low power tries to minimize SUMi Cwi Ni. Here Cwi consists of Cxi + Csi, where Cxi is the capacitance of net i due to crosstalk and Csi is the substrate capacitance of net i. For low-power layout applications, power dissipation due to crosstalk is minimized by ensuring that wires carrying high-activity signals are placed sufficiently far from the other wires. Similarly, power dissipation due to substrate capacitance is proportional to the wirelength and its signal activity.
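The layout-level power measure can be sketched in code. The example below assumes the reading Pav = 0.5 (Vdd^2 / Tcycle)(Cg + Cw) N and the physical-design objective SUMi Cwi Ni with Cwi = Cxi + Csi; the function names, the net list and all numeric values are hypothetical.

```python
def average_dynamic_power(vdd, t_cycle, c_gate, c_wire, n_transitions):
    """Pav = 0.5 * (Vdd^2 / Tcycle) * (Cg + Cw) * N for one gate output."""
    return 0.5 * vdd ** 2 / t_cycle * (c_gate + c_wire) * n_transitions

def wiring_cost(nets):
    """SUM_i Cw_i * N_i, where each net is (c_crosstalk, c_substrate, activity)."""
    return sum((c_x + c_s) * n for c_x, c_s, n in nets)

# Hypothetical gate: 3.3 V supply, 20 ns cycle, 20 fF gate load, 30 fF wire load
print(average_dynamic_power(vdd=3.3, t_cycle=20e-9,
                            c_gate=20e-15, c_wire=30e-15, n_transitions=0.4))

# Hypothetical nets: (crosstalk C, substrate C, transitions per cycle)
nets = [(5e-15, 25e-15, 0.8), (2e-15, 40e-15, 0.1)]
print(wiring_cost(nets))
```

Weighting each net's capacitance by its activity is what makes a short route for a busy net more valuable than a short route for a quiet one.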

Low-Power Layout Design Using Dual Supply Voltages

Sungkyunkwan University, School of Electrical and Computer Engineering

Jin-Hyuk Kim, Jun-Sung Lee, Jun-Dong Cho

Contents

• Research objectives

• Research background

• Clustered Voltage Scaling structure

• Row by Row Power Supply structure

• Mix-And-Match Power Supply structure

• Level Converter structure

• Mix-And-Match Power Supply design flow

• Experimental results

• Conclusion

Research Objectives and Background

• Propose a dual-voltage layout technique that reduces the power consumption of combinational circuits

• When dual-voltage cells are used, reduce the wiring and the number of tracks, which increase when cells of the same voltage are placed in one cell row

• Implement a Level Converter circuit that uses the minimum number of transistors

• Apply Clustered Voltage Scaling [Usami, '95], which uses dual voltages while maintaining device performance

• The proposed Mix-And-Match Power Supply layout structure improves on the existing Row by Row Power Supply [Usami, '97] layout structure and reduces power and area

Clustered Voltage Scaling

• Generates a low-power netlist

[Figure: netlist of flip-flops (F/F), gates G1 to G11 and level converters LC1, LC2, with a legend marking VDDL cells, VDDH cells and Level Converters. Each gate is annotated with its slack, Slack(Si) = Ri - Ai: S1, S3, S4, S5, S6, S9 > 0 and S2, S7, S8, S10, S11 < 0.]

Row by Row Power Supply Structure

[Figure: module of standard-cell rows with VDDH, VDDL and VSS rails; VDDH cells and VDDL cells are placed in separate standard-cell rows.]

Mix-And-Match Power Supply Structure

[Figure: module of standard-cell rows in which each row has VDDH, VDDL and VSS rails; VDDH cells and VDDL cells are mixed within the same standard-cell row.]

Structure Comparison

[Figure: supply rails of a module for the three structures: Conventional circuit (VDDH), RRPS (VDDH and VDDL), MAMPS (VDDH and VDDL).]

Level Converter Structure

• Number of transistors: 6 → 4

• Effective in terms of power and area

[Figure: conventional and proposed level converter circuits with input IN and output OUT, supplied from VDDL and VDDH. Labels: VSS / VDDL with Vth = 1.5 V, VSS / VDDH with Vth = 2.0 V.]

Mix-And-Match Power Supply Design Flow

[Figure: design-flow chart. Boxes: Single voltage netlist; Multiple voltage scaling; Netlist with multiple supply voltage; Physical placement; Assign supply voltage to each cell; Routing; Synthesis timing, power and area. Tools noted: OPUS, Aquarius XO, PowerMill.]

Experimental Results

[Figure: two bar charts comparing Conventional circuit, RRPS and MAMPS, with the reference at 100%. Total power (%): annotations 47% and 2%. Total area (%): annotations 15% and 10%.]

Conclusion

• Compared with the single-voltage circuit, power is reduced by 49.4% at the cost of a 5.6% area overhead

• Compared with the existing RRPS structure, area is reduced by 10% and power by 2%

• Compared with the existing Level Converter, the proposed Level Converter reduces area by 30% and power by 35%


9. CAD tools

Low Power Design Tools

• Transistor Level Tools (5-10% of silicon)

– SPICE, PowerMill (Epic), ADM (Avanti/Anagram), Lsim Power Analyst (Mentor)

• Logic Level Tools (10-15%)

– DesignPower and PowerGate (Synopsys), WattWatcher/Gate (Sente), PowerSim (System Sciences), POET (Viewlogic), and QuickPower (Mentor)

• Architectural (RTL) Level Tools (20-25%)

– WattWatcher/Architect (Sente): 20-25% accuracy

• Behavioral (spreadsheet) Level Tools (50-100%)

– Active area of academic research


Commercial synthesis systems

Research synthesis systems

A - Architectural synthesis.

L - Logic synthesis.

Low-Power CAD sites

• Alternative System Concepts, Inc.: 7X power reduction through optimization; contact http://www.ee.princeton.edu and Jake Karrfalt at [email protected] or (603) 437-2234. Reduction of glitch and clock power; modeling and optimization of interconnect power; power optimization for data-dominated designs with limited control flow.

• Mentor Graphics QuickPower: hierarchical determination of the overall benefit of exchanging blocks for lower power; powering down or disabling blocks when not in use by gated clock; choose candidates for power-down; calculate the effect of the power-down logic. http://www.mentorg.com

• Synopsys's Power Compiler http://www.synopsys.com/products/power/power_ds

• Sente's WattWatcher/Architect (first commercial tool operating at the architecture level, 20-25% accuracy). http://www.powereda.com

• Behavioral tools: Hyper-LP (optimization), Explore (estimation) by J. Rabaey


Design Power(Synopsys)

• DesignPower(TM) provides a single, integrated environment for power analysis in multiple phases of the design process:

– Early, quick feedback at the HDL or gate level through probabilistic analysis.

– Improved accuracy through simulation-based analysis for gate level and library exploration.

• DesignPower estimates switching, internal cell and leakage power. It accepts user-defined probabilities, simulation toggle data or a combination of both as input. DesignPower propagates switching information through sequential devices, including flip-flops and latches.

• It supports sequential, hierarchical, gated-clock, and multiple-clock designs. For simulation toggle data, it links directly to Verilog and VHDL simulators, including Synopsys' VSS.
