
Gate.1 ISCA Tutorial: Low Power Design ©MJI&VN, PSU, 2000

Tutorial Outline

Introduction and motivation                           8:30 - 8:45
Sources of power in CMOS designs                      8:45 - 9:05
Power analysis tools and techniques                   9:05 - 9:30
Gate & functional unit design issues & techniques     9:30 - 10:30
BREAK                                                 10:30 - 10:50
Architectural level issues and techniques             10:50 - 12:15
LUNCH                                                 12:15 - 1:30
Low power memory system design                        1:30 - 2:30
Software level issues and techniques                  2:30 - 3:30
BREAK                                                 3:30 - 3:50
Software level issues and techniques, con't           3:50 - 4:30
Future challenges                                     4:30 - 4:45


Design Levels

Abstraction Level      Analysis Capacity  Analysis Accuracy  Analysis Speed  Analysis Resources  Energy Savings
Application            Most               Worst              Fastest         Least               Most
Behavioral
Architectural (RTL)
Logic (Gate)
Transistor (Circuit)   Least              Best               Slowest         Most                Least


Basic Principles of Low Power Design

• Reduce switching (supply) voltage
  » quadratic effect -> dramatic savings
  » negative effect on performance
• Reduce capacitance
• Reduce switching frequency
  » switching activity
  » clock rate
• Reduce glitching
• Reduce short circuit currents (slope engineering)
• Reduce leakage currents

$P = C_L V_{DD}^2 f_{0\to1} + t_{sc} V_{DD} I_{peak} f_{0\to1} + V_{DD} I_{leakage}$
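
As a rough illustration of the power equation on this slide, the Python sketch below evaluates its three terms (dynamic, short circuit, leakage) separately. The function name, parameter names, and numeric values are assumptions made for this example; they are not figures from the tutorial.

```python
def cmos_power(c_load, v_dd, f_switch, t_sc, i_peak, i_leak):
    """Return (dynamic, short_circuit, leakage) power in watts.

    Mirrors P = C_L*V_DD^2*f + t_sc*V_DD*I_peak*f + V_DD*I_leakage,
    where f is the 0->1 switching frequency.
    """
    dynamic = c_load * v_dd ** 2 * f_switch
    short_circuit = t_sc * v_dd * i_peak * f_switch
    leakage = v_dd * i_leak
    return dynamic, short_circuit, leakage


# Hypothetical operating point, chosen only to show the trend.
params = dict(c_load=1e-12, f_switch=100e6, t_sc=50e-12, i_peak=1e-3, i_leak=1e-9)
dyn_high, _, _ = cmos_power(v_dd=2.5, **params)
dyn_low, _, _ = cmos_power(v_dd=1.25, **params)

# Ratio of the dynamic term at full and at half supply voltage.
dynamic_ratio = dyn_high / dyn_low
print(dynamic_ratio)
```

Because the dynamic term depends on the square of the supply voltage, halving V_DD in this model cuts that term by a factor of four, which is the "quadratic effect" the slide refers to.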


Low Energy Gates:Transistor Sizing

• Use the smallest transistors that satisfy the delay constraints
  » slack time: difference between the required time and the arrival time of a signal at a gate output
    – Positive slack: size down
    – Negative slack: size up
• Make gates that toggle more frequently smaller
• Size for slope engineering to reduce short circuit currents
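
The slack rule on this slide can be sketched in a few lines of Python. The function name and the "keep" result for zero slack are assumptions added for illustration; a real sizing tool would iterate over the whole netlist and re-run timing after each change.

```python
def sizing_action(required_time, arrival_time):
    """Suggest a sizing direction from the slack at a gate output.

    slack = required time - arrival time (same time unit for both).
    """
    slack = required_time - arrival_time
    if slack > 0:
        return "size down"   # signal arrives early: trade speed for energy
    if slack < 0:
        return "size up"     # signal arrives late: delay constraint violated
    return "keep"            # assumption: zero slack leaves the gate unchanged


print(sizing_action(required_time=5.0, arrival_time=3.5))
```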


Low Energy Gates:Transistor Pin Ordering

• Logically equivalent inputs may not have identical energy/delay characteristics

[Figure: gate schematic with inputs A and B, output Out, and capacitances Ci and Cout]

• To conserve energy (and improve speed), connect inputs so that most active input is nearest output

• Need to know signal statistics


Low Energy Gates:Dynamic Gate Pin Ordering

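• Dynamic gates exhibit higher switching activity (and add to clock load) but are fast

[Figure: two dynamic gate implementations selected by SelA, SelB, SelC, one using inputs A, B, C and one using complemented inputs !A, !B, !C; annotation: "If A, B, and C have low signal probability"]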


Low Energy Gates:Gate Restructuring

• Logically equivalent CMOS gates may not have identical energy/delay characteristics


Low Energy Gate Networks: Balanced Delay Paths

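Equalize lengths of timing paths through logic

[Figure: two networks of blocks F1, F2, F3 with numeric annotations on their signals (0, 1, 2 in the first network; 0 and 1 in the second)]

• Reduce glitching by balancing the delay path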


Low Energy Gate Networks: Network Restructuring

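• Consider logic topology alternatives

[Figure: chain and tree implementations of F from inputs A, B, C, D, each input with signal probability 0.5.
 Chain: (A, B) -> W, (W, C) -> X, (X, D) -> F; node activities W = 3/16, X = 7/64, F = 15/256.
 Tree: (A, B) -> Y, (C, D) -> Z, (Y, Z) -> F; node activities Y = 3/16, Z = 3/16, F = 15/256.]

Chain implementation has a lower overall switching activity than the tree implementation

Ignores glitching effects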
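
The node activities quoted for the chain and tree networks can be reproduced with a short Python sketch. It assumes two-input AND gates with independent inputs and takes the activity of a node as P(0) x P(1); both are assumptions of this example, chosen because they reproduce the fractions on the slide. Function and variable names are illustrative.

```python
from fractions import Fraction


def and2(p_a, p_b):
    """Probability that a 2-input AND output is 1 (independent inputs)."""
    return p_a * p_b


def activity(p_one):
    """0->1 transition probability of a node with signal probability p_one."""
    return (1 - p_one) * p_one


half = Fraction(1, 2)

# Chain: W = A.B, X = W.C, F = X.D
w = and2(half, half)
x = and2(w, half)
f_chain = and2(x, half)
chain = [activity(n) for n in (w, x, f_chain)]

# Tree: Y = A.B, Z = C.D, F = Y.Z
y = and2(half, half)
z = and2(half, half)
f_tree = and2(y, z)
tree = [activity(n) for n in (y, z, f_tree)]

print(chain, sum(chain))
print(tree, sum(tree))
```

Summing the per-node activities gives the total for each topology, which is the sense in which the chain has lower overall switching activity; glitching is not modeled.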


Network Restructuring, con’t

• Logically equivalent gate networks may not have identical energy/delay characteristics

Technology mapping

[Figure: alternative mappings of F = ABCD compared on delay, area, and energy]


Low Energy Gate Networks: Network Input Ordering

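• Input ordering

[Figure: two orderings of a two-gate chain with input signal probabilities A = 0.5, B = 0.2, C = 0.1.
 Ordering 1: (A, B) -> X, (X, C) -> F; activity at X = (1 - 0.5 x 0.2) x (0.5 x 0.2) = 0.09
 Ordering 2: (B, C) -> X, (X, A) -> F; activity at X = (1 - 0.2 x 0.1) x (0.2 x 0.1) = 0.0196]

Beneficial to postpone the introduction of signals with a high transition rate (signals with signal probability close to 0.5)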
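
The two activity values on this slide follow from the same (1 - p) x p expression applied to the intermediate node X. The Python sketch below evaluates it for every choice of the first input pair, assuming AND gates with independent inputs (an assumption consistent with the formulas shown). Names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Signal probabilities from the slide.
probs = {"A": Fraction(1, 2), "B": Fraction(1, 5), "C": Fraction(1, 10)}


def x_activity(first, second):
    """Activity at X when inputs `first` and `second` feed the first gate."""
    p_x = probs[first] * probs[second]   # AND of independent inputs
    return (1 - p_x) * p_x


# Evaluate every choice of the input pair that enters the network first.
candidates = {pair: x_activity(*pair) for pair in combinations(sorted(probs), 2)}
best_pair = min(candidates, key=candidates.get)

for pair, act in candidates.items():
    print(pair, float(act))
print("lowest activity at X:", best_pair)
```

The activity at F is the same for every ordering, so only X distinguishes them; the pair that excludes the input with probability 0.5 gives the lowest value.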


Dual Supply Voltages

• Use two VDD's (e.g., 2.5V and 1.5V)
  » use the higher supply for gates on the critical path
  » use the lower supply for gates off the critical path
• Reduces energy without a performance loss
• Cons
  » slight area penalty
  » increased design time
  » need level converters to interconnect gates on different supplies (to avoid static currents)
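
A back-of-the-envelope Python sketch of the dual-supply saving, assuming switching energy scales with the square of the supply voltage and ignoring level-converter overhead, leakage, and short circuit currents. The function names and the 60% figure are hypothetical.

```python
def dynamic_energy_ratio(v_low, v_high):
    """Switching energy of a gate at v_low relative to the same gate at v_high."""
    return (v_low / v_high) ** 2


def network_energy_ratio(fraction_on_low, v_low, v_high):
    """Relative switching energy when a fraction of the switched capacitance
    (0..1) is moved to the lower supply. Level converters are not modeled."""
    per_gate = dynamic_energy_ratio(v_low, v_high)
    return (1.0 - fraction_on_low) + fraction_on_low * per_gate


# Supplies from the slide; the 60% off-critical-path share is hypothetical.
print(dynamic_energy_ratio(1.5, 2.5))
print(network_energy_ratio(0.6, 1.5, 2.5))
```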


Dual Threshold Voltages

• Use two VT's (e.g., 0.6V and 0.3V for VDD = 2.5V)
  » use the lower threshold for gates on the critical path
  » use the higher threshold for gates off the critical path
• Improves performance without an increase in power
• Cons
  » increased fabrication complexity
  » increased design time
  » beware of increased leakage in low VT portion of the circuit - could end up with increased power!


Functional Unit Energy Optimization

• Key processor core functional units
  » latches and (pipeline) registers
  » ALUs - adders, multipliers, barrel shifters
  » control logic (FSMs)
  » interconnect
  » multi-ported register file
• On-chip memories (ROMs, caches, SRAMs, eDRAMs)
• MMU, TLB
• Clock generation and distribution
• Off-chip interconnect (pads)


Flipflops and Pipeline Registers

• Consume a lot of energy because they are clocked every cycle
  » Clock energy (Ec)
    – energy dissipated when the ff is clocked with stable data
  » Data energy (Ed)
    – energy dissipated when the ff is clocked and the data has changed so that the ff changes state
  » Typically the data rate (fd) is much lower than the clock rate (fc)
• Also impacts clock energy since a large portion of clock energy is used to drive the sequential elements
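
One way to read the Ec/Ed definitions is as a per-cycle expectation weighted by the data rate ratio fd/fc. The Python sketch below encodes that reading; the weighting formula, function name, and numeric values are assumptions of this example rather than a model stated in the tutorial.

```python
def ff_energy_per_cycle(e_clock, e_data, data_rate_ratio):
    """Expected flipflop energy per clock cycle.

    e_clock: energy when clocked with stable data (Ec)
    e_data:  energy when clocked and the state changes (Ed)
    data_rate_ratio: fd/fc, the fraction of cycles in which the data changes
    """
    if not 0.0 <= data_rate_ratio <= 1.0:
        raise ValueError("fd/fc must lie in [0, 1]")
    return (1.0 - data_rate_ratio) * e_clock + data_rate_ratio * e_data


# Hypothetical energies in arbitrary units.
print(ff_energy_per_cycle(e_clock=1.0, e_data=3.0, data_rate_ratio=0.25))
```

When fd is much lower than fc, the result stays close to Ec, which is why the clocking overhead dominates in this reading.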


Power Consumption in Latches

[Figure: chart of % power (0 to 100) split between Data and Clock versus latch data AF (0 to 0.5), with a latch schematic (D, Q, CLK, CLKB)]

From Tiwari, 1998


Some Typical CMOS FFs

[Figure: schematics of four flipflops with D, Q, and CLK terminals: Static TG FF, Dynamic C2MOS FF, Dyn Precharged TSPC FF, Dyn Non-Precharged TSPC FF]


FF Power Comparison

[Figure: relative power consumption (0 to 30) versus latch data AF (0.05 to 0.45) for TGFF, C2MOS, PTSPC, and NPTSPC]

From Svenson, 1996


Energy Efficient Flipflops

[Figure: schematics of the StrongArm SA110 FF and the Power PC 603 FF (terminals D, Q, CLK, CLKB, VDD, GND); annotations: "16 transistor, CLK & CLKB, 4 clock loads each" and "20 transistor, CLK, 3 clock loads"]


EDP of Some Low Power FFs

[Figure: bar chart of EDPtot (fJ), 0 to 80, showing High, Low, and Average values for HLFF, SDFF, PowerPC, mC2MOS, SA110 FF, and K6 ETL]

From Stojanovic, 1998


Self-Gating FF

• When ff input is equal to its output, suppress internal clocking to conserve energy
  » gating function is derived within the FF

[Figure: self-gating FF schematic with D, Q, CLK, and internally derived clock phases Φ; annotation: "Strict rules on when D can change wrt CLK"]


Power of Self-Gated FF

[Figure: power dissipation versus data switching rate fd/fc for SG FF and Reg FF]

From Reyes, 1996


Double Edge Triggered FF

[Figure: double edge triggered FF schematic with D, Q, and CLK/CLKB controlled paths]

Loads data at both rising and falling clock edges


DETFF Pros and Cons

• Advantages
  » Clock frequency can be halved to achieve the same computational throughput: Pd = 0.84Ps
  » Also get a 2X energy savings in the clock network
• Disadvantages
  » About 15% larger in transistor count
  » Maximum operating frequency less
  » Strict requirements on clock skew
  » Requires a strict 50% duty cycle
  » Larger clock load


Adders (Subtractors)

synchronous word parallel adders
  ripple carry adders (RCA)
  carry prop min adders
    signed-digit adders
    fast carry prop adders
      Manchester carry chain
      carry select
      carry lookahead
      conditional sum
      carry skip
    residue adders

Complexity annotations in the figure: T = O(n), A = O(n); T = O(1), A = O(n); T = O(log n), A = O(n log n); T = O(n), A = O(n); T = O(n**1/2), A = O(n)


PDP of Different Adders

[Figure: PDP (0 to 100) at 8, 16, 32, 48, and 64 bits for RCA, MCCA, CSkA, VSkA, CSlA, CLA, BKA, and ELMA]

From Nagendra, 1996


Brent-Kung (CLA) Adder

[Figure: 16-bit Brent-Kung parallel prefix computation; inputs (g0, p0) through (g15, p15), outputs c1 through c16; annotations: T = log2 n, T = log2 n - 1, A = 2 log2 n, A = n/2]


BK and ELM Adder Optimization

[Figure: EDP (pJ), 0 to 20, versus number of bits (16, 32, 64) for BK Classic, BK Hybrid, ELM Classic, and ELM Hybrid]


Parallel Multipliers

• Form partial product array in parallel and add it in parallel
  » can use multiplier recoding to reduce the height of the partial product array by half
  » recoding may cost more energy than it saves!
  » use delay balancing to reduce glitching
• Array multipliers (regularity)
• Pipelined multipliers (higher throughput, longer latency, less glitching but adds to clock load)


Parallel Multiplier Structure

[Figure: parallel multiplier structure; Q ('ier) and D ('icand) feed multiple forming circuits (inputs D and 0), followed by a partial product array reduction tree and a fast CPA that produces P (product)]

muxes + tree reduction (log n) + CPA


PP Array Reduction Process

[Figure: 'icand and 'ier form a partial product array, which (4,2) counters reduce to a reduced partial product array that goes to the CPA]


(4,2) Counters

• Built out of (3,2) counters (FA's)
• Tiles with neighboring (4,2) counters
• Can use delay balancing in cell design and interconnect to reduce glitching

[Figure: tiled (4,2) counters, drawn as six (3,2) blocks]
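
A bit-level Python sketch of one common way to build a (4,2) counter from two (3,2) counters (full adders). The wiring shown is a standard construction and is assumed here; it is not taken from the slide's schematic. Function names are illustrative.

```python
from itertools import product


def full_adder(a, b, c):
    """(3,2) counter: three input bits -> (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def counter_4_2(x1, x2, x3, x4, cin):
    """(4,2) counter from two full adders.

    cout depends only on x1..x3, not on cin, so horizontally tiled
    cells do not form a ripple chain.
    """
    s1, cout = full_adder(x1, x2, x3)
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout


def is_arithmetically_correct():
    """Check x1+x2+x3+x4+cin == sum + 2*(carry + cout) for all 32 inputs."""
    for bits in product((0, 1), repeat=5):
        total, carry, cout = counter_4_2(*bits)
        if sum(bits) != total + 2 * (carry + cout):
            return False
    return True


print(is_arithmetically_correct())
```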


PP Array Reduction Tree Structure

[Figure: PP array reduction tree; the multiplicand and the multiple selection signals ('ier) drive multiple generators, whose outputs pass through successive levels of (4,2) counter slices to the CPA]


Glitch Reduction by Pipelining

• Glitches are dependent on the logic depth of the circuit
• Nodes logically deeper are more prone to glitching
  » arrival times of the gate inputs are more spread due to delay imbalances
  » usually affected by more primary input switching
• Reduce depth by adding pipeline registers


Pipelined Parallel Multiplier

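[Figure: pipelined parallel multiplier; Q ('ier) and D ('icand) feed multiple forming circuits (inputs D and 0), followed by a partial product array reduction tree and a fast CPA that produces P (product), with clk-driven pipeline registers]

helps to reduce glitching but adds to the clock load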


CSA Array Multiplier

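[Figure: 4x4 CSA array multiplier with cells M00 through M33, multiplier bits q0 to q3, multiplicand bits d0 to d3, product bits p0 to p7, and 0 fed to the unused sum and carry inputs; each CSA cell takes qj, di, sum input, and carry in, and produces sum output and carry out]

Longest delay path: n + n - 1 = 2n - 1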


Multiplier Cell Structure

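[Figure: multiplier cell built around a full adder with inputs Ai, Bj, sum input, and carry in, and output carry out; delay elements labeled 1D and 2D]

add delay elements to minimize glitching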


Pipelined CSA Array Multiplier

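[Figure: pipelined 4x4 CSA array multiplier with cells M00 through M33 plus additional cells M41, M42, M43, M52, M53, M63; multiplier bits q0 to q3, multiplicand bits d0 to d3, product bits p0 to p7, 0 fed to the unused inputs, and clk-driven pipeline registers]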


Barrel Shifters

[Figure: two bar charts comparing log PT, Array PT, log static, and log dynamic barrel shifters: average power (mW), 0 to 10, and delay (ns), 0 to 12]

Influence of architecture: Logarithmic, Array
and gate types: Pass Transistor, Dynamic/Static Mux

From Acken, 1996


Control Unit Design

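[Figure: FSM block diagram with Inputs and Outputs on the combinational logic and state FFs in the feedback path; example state graph with states 00, 01, 11 and edges labeled 0,1/1, 1/X, 1/X, 0/0, 0/0]

n! different possible encodings (n states)

State Encoding: one of the most important factors determining area, speed, and energy of the resulting control logic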


Energy State Encoding Heuristic

• Area driven -> try to reduce the distance in Boolean n-space between related states
• Energy driven -> try to minimize number of bit transitions in the state register
  » fewer transitions in state register
  » fewer transitions propagated to combinational logic

[Figure: state graph with states 11, 00, 01 and edge labels 0.1, 0.1, 0.1, 0.4, 0.3; each label is the probability that the transition will occur (sum of all edges equals unity)]
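
The energy-driven heuristic can be sketched in Python as minimizing the expected number of state-register bit flips per transition, i.e. the sum over edges of edge probability times the Hamming distance between the two state codes. The state names, the assignment of the five probabilities to particular edges, and both candidate encodings below are hypothetical, since the transcript does not preserve which edge carries which label.

```python
def hamming(a, b):
    """Number of differing bits between two integer state codes."""
    return bin(a ^ b).count("1")


def expected_transitions(edge_probs, encoding):
    """Expected state-register bit flips per FSM transition."""
    return sum(p * hamming(encoding[src], encoding[dst])
               for (src, dst), p in edge_probs.items())


# Hypothetical 3-state graph; the probabilities sum to 1 as on the slide.
edge_probs = {
    ("S0", "S1"): 0.4,
    ("S1", "S0"): 0.3,
    ("S1", "S2"): 0.1,
    ("S2", "S0"): 0.1,
    ("S0", "S0"): 0.1,   # self-loop: never flips a bit
}

# Two candidate encodings using the codes 00, 01, 11.
enc_a = {"S0": 0b00, "S1": 0b01, "S2": 0b11}
enc_b = {"S0": 0b00, "S1": 0b11, "S2": 0b01}

cost_a = expected_transitions(edge_probs, enc_a)
cost_b = expected_transitions(edge_probs, enc_b)
print(cost_a, cost_b)
```

In this made-up graph, placing the two states joined by the high-probability edges at Hamming distance 1 gives the lower expected transition count.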


Caveat

• Lowest E[M] may not be lowest in energy → it could require more gates and/or signal transitions in the combinational logic

• Experiments show that the area and energy dissipation of a state machine are correlated when the state encoding is varied


State Encoding Effects

[Figure: plot of Power (500 to 750) versus Area (3300 to 4100) for different state encodings]

From Yeap, 1997


Practical Considerations

• Balance area-energy by forced encoding of only a subset of states that span the high probability edges
  » leave assignment of remaining states to the logic synthesis system for area optimization
  » fortunately, in practice, most state machines have this characteristic
• Unlike area encoding, energy encoding requires knowledge of probabilities of state transitions and input signals


A Low Power Processor Core

Example


M•CORE Architecture

[Figure: M•CORE block diagram. Register files: GP reg file (32 bit x 16), Alt reg file (32 bit x 16), Control reg file (32 bit x 13). Datapath: X port, Y port, Immed; Scale; Barrel shift, FF1; ALU, priority encode, 0 detect; Sign ext. Instruction side: Instr pipeline, Instr decoder, Branch adder, PC increment. Buses: Address bus, Data bus, Writeback bus, H/W acc bus]


M•CORE Power Distribution

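[Figure: two pie charts; values are paired with labels in legend order.
 Chart 1: Datapath 36%, Clock 36%, Control 28%
 Chart 2: Reg File 42%, Addr/Data Bus 14%, Inst Reg 9%, Barrel Shifter 8%, X MUX 7%, Y MUX 6%, Addr Gen 5%, Other 9%]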


Key References

Hossain, Low power design using double edge triggered flipflop, IEEE Trans. on VLSI Systems, 2(2):261-265, 1994.
Motorola, M•CORE Architecture microRISC Engine, MCORE 1/D, www.mot.com/SPS/MCORE/info_documentation.htm
Mutsunori, Low power design method using multiple supply voltages, SLPED, 1997.
Rabaey, Digital Integrated Circuits, Prentice-Hall, 1996.
Reyes, Low Power FF Circuit and Method Thereof, Patent No 5,498,988, 1996.
Roy, Power analysis and design at the system level, Low Power Design in Deep Submicron Electronics, Nebel and Mermet, Ed., Kluwer, 1997.
Sakuta, Delay balanced multipliers for low power, SLPE, 1995.
Scott, Designing the Low-Power M•CORE Architecture, Proc. Inter. Symp. Computer Architecture Power Driven Microarchitecture Workshop, June 1998.
Stojanovic, A unified approach in the analysis of latches and FFs for low power systems, ISLPED, 1998.
Tiwari, Reducing power in high-performance microprocessors, DAC, 1998.
Yeap, CPU controller optimization for HDL logic synthesis, CICC, 1997.
Yeap, Practical Low Power Digital VLSI Design, KAP, 1998.