january 24th, 2008 nec electronics shuichi kunie s.kunie@necel · marco3 clock gating flip flop...

20
1 Low power architecture and design techniques for mobile handset LSI Medity™ M2 NEC Electronics Shuichi Kunie [email protected] January 24th, 2008 ASP-DAC 2008

Upload: lytruc

Post on 18-Jul-2018

230 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

1

Low power architecture and design techniques for mobile handset LSI Medity™ M2

NEC Electronics Shuichi [email protected]

January 24th, 2008

ASP-DAC 2008

Page 2: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

2

Outline

Medity™ M2 overviewUse case analysis and low power issues

heavy workloadlight workloadstandby mode

Medity™ M2 low power techniquesHierarchical clock gatingAutomatic frequency controlMultiple Vt transistors and on-chip power switchBack-bias control (UltimateLowPower™)

Power measurement ResultsSummary and conclusions

Page 3: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

3

Medity™ M2 Overview

M2 is second-generation LSI for mobile handset which integrates DBB and Application processor on a chip

(130nm)

DBB LSIDBB LSI

HSDPA Acc.HSDPA Acc.

(90nm)

2G DBBLSI

2G DBBLSI

Logic size

*DBB:Digital BaseBand

ApplicationApplication

SolutionSolution

(65nm)M1M1MP211

M2M2

Page 4: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

4

M1 and M2 comparisonsM2 aims to be same power consumption as M1, although there are big differences between M1 and M2

128 chords

D1 Enc, Dec(Dedicated macro)

WVGA854×480

166MHz(AXI)

500MHz(ARM11)

GSM/GPRS

HSDPA(3.6Mbps)

W-CDMA

M2

Functions

Application

DB

B

QVGA+ Dec(DSP)

QVGA+345×240

125MHz(AHB)

250MHz(ARM9)

W-CDMA

M1

Frequency

Communication new

new

5 timesLCD

new3D sound

newMotion codecH.264

1.3 timesSystem Bus(3D graphics, image)

2 timesCPU/DSP

M2/M1Function

Page 5: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

5

156

0.3

85

34

0

20

40

60

80

100

Heavy workload(TV phone)

Light workload (music player)

Standby

Rela

tive p

ow

er

consum

pti

on

Dynamic

Leak

Power consumptions in three use casesHeavy workloadLight workloadStandby mode

These power values are alreadyapplied to M2 low power techniques

Typ, 1.2V@26C

Page 6: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

6

Medity M2 Low power techniques

(a) Multiple Vt transistors and on-chip power switch

(b) PMU

Hierarchical clock gating

Automatic frequency control

Logic

Clock

Leakage

Dynamicpower

Leakagepower

LPLP--Tech1Tech1

LPLP--Tech2Tech2

LPLP--Tech3a, 3bTech3a, 3b

LPLP--Tech4Tech4Back-bias control

(UltimateLowPower™)

M2 low power techniques are covered for reducing each element of the power

Page 7: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

7

Reduce the dynamic power of unnecessary macros and clock tree at the clock starting point

Active monitor makes clock control condition

Hierarchical clock gating

LPLP--Tech1Tech1

FlipFlopFlipFlop

GG

FlipFlopFlipFlop

FlipFlopFlipFlop

FlipFlopFlipFlop

enGG

CTS

ActivemonitorActive

monitorMarco3

Clock gating

FlipFlopFlipFlop

GG

FlipFlopFlipFlop

FlipFlopFlipFlop

FlipFlopFlipFlop

enGG

CTS

ActivemonitorActive

monitorMacro2

Clock gating

GG

GG

GG

Monitoring macro state for auto clock control

1. Clock gating at the clock tree start point

●●●

Multifrequencygenerator

Multifrequencygenerator

3. Register-level dynamic clock gating

clock controlclock control

2. Sub module level clock gating

Macro1INTC/Timer

ActivemonitorActive

monitor

GGGGGGGG

Page 8: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

8

Automatic frequency controlReduce dynamic power of a unstoppable clock macro and system bus

Active monitor makes clock control condition for multi frequency generator

Macro1Macro1

Macro2Macro2

Macro3Macro3

time

ActiveMonitorActive

Monitor

ActiveMonitorActive

Monitor

ActiveMonitorActive

Monitor

reg1reg1

reg2reg2

reg3reg3

0

1

1

Normal Clock down Normal

Clock down when Macro2 and Macro3 is in idle.

Macro2_clkreq

Macro3_clkreq

Macro1_clkreq

FrequencyChange!

Macro1_clk

Macro2_clk

Macro3_clk

Multi frequency generator

Clock stop

Clock stop

LPLP--Tech2Tech2

Page 9: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

9

Multiple Vt Transistors & on-chip power switchAchieve target speed (Low-Vt)Reduce the area (Low-Vt)Reduce the leakage power (High-Vt & on-chip power switch)Reduce the external materials (on-chip power switch)

High-Vt: Slow speed, low leakMid-Vt: Middle speed and leakLow-Vt: High speed, high leak

H.264 macro51% Low-Vt

3D graphics61% Low-Vt

ARM/DSP100% Low-VtDBB

>99% High-Vt

ARM37% Mid-Vt63% High-Vt

DSP100% Mid-Vt

11.9% area reductionLow-Vt 4.5%→51%

12.4% area reductionLow-Vt 14%→61%

Misc Apl. macro

LPLP--Tech3aTech3aLPLP--Tech3bTech3b

Low-Vton-chip power switch

High & Mid-Vt ApplicationApplicationDigitalDigitalBasebandBaseband

Page 10: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

10

Power-on switching sequence

Power-on switching sequential is divided 4Preventing the rush current for IR Drop

LPLP--Tech3aTech3a

time3ns 150 225ns 350ns

1.2V

# of SW on

Voltage

Current

30 3500 1000 10000

VDD IR drop: max 8mV

typ

typ

max

max

time

⊿95mA⊿80mA

⊿70mA108mA

Rush current: max 100mA

1st SW 2nd SW 3rd SW 4th SW

Page 11: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

11

Power Management Unit :PMUCould migrate to various power mode sequence flexibly and smoothly by PMU SW

LPLP--Tech3bTech3b

PMUPC

RAM

PowerCLK Div.

PLL

INTC

ARM1176~500MHz

DSP~500MHz

3Dgraphic

AXI+AHB

APB

H.264

DMA

CAM

USB,UART...

Power, PLL, Clock, and Reset control

Transfer Interrupt to wake up

Timer

Isolation cell Isolation cell Isolation cell Isolation cell

Isolation cell control Isolation cell

Page 12: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

12

UltimateLowPower™ (1)Reduces the leakage power using back-biasImprove the leakage yield to manage FAST side samples to be around TYP side samples, no effect to SLOW side samples

LPLP--Tech4Tech4

Transistor characteristics

Devicecharacteristics

w/o Back-bias Control

FASTSLOW TYP

SLOW TYP

w/ Back-bias Control

Speed

Leak current w/o Back-

bias Control

Leak current with Back-bias

control

Speed is not

changed

Shift the FAST side characteristics with back-bias control

FAST

Transistors On Chip Variation

Leak current

Page 13: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

13

=Vdd to Vdd+1V

Back-bias for Nmos Tr.

Clock

Back-bias for Pmos Tr.

=0 to -1V

Positive variable Regulator

Negative variable Regulator

Control Signals

Application ARM/DSP

Monitor

UltimateLowPower™ (2)

3G3G--DBBDBB

ApplicationApplicationDigitalDigitalBasebandBaseband

biasbias

DSP DSP 100% Low100% Low--VtVt

MonitorBack-bias regulator

Apply to application ARM/DSP macrosUse 100% optimized Low-Vt which more sensitive for leakage current with respect to back-bias voltageMulti-Vt in same region is not suited for UltimateLowPower

LPLP--Tech4Tech4

ARM1176ARM1176100% Low100% Low--VtVt

Page 14: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

14

0

5

10

15

20

25

30

35

40

0 1 2 3 4 5

Back-bias level

Leak

age

curr

ent (

AR

M+D

SP

) [m

A

F/FF/SS/FS/ST/T

Leakage current with back-bias

The Leakage current of Fast sample (F/F) is 70% reducedthe characteristics of Slow (S/S) samples are not changed

Target delay with back-bias

70% reduced

Typ, 1.2V@26C

LPLP--Tech4Tech4

no back-bias

Page 15: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

15

Outline

Medity™ M2 overviewUse case analysis and low power issues

heavy workloadlight workloadstandby mode

Medity™ M2 low power techniquesHierarchical clock gatingAutomatic frequency controlMultiple Vt transistors and on-chip power switchBack-bias control (UltimateLowPower™)

Power measurement ResultsSummary and conclusions

Page 16: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

16

Power consumptions in light workload80% power reduction is achievedMusic player (AAC) with WVGA displaying

0

20

40

60

80

100

(A) (B) (C) (D)

Rela

tive p

ow

er

consum

pti

on 100

70

3020

Typ, 1.2V@26C

80% reduction

OFFOFFOFF

ONOFFOFF

ONONOFF

ONONON

(1) Hierarchical clock gating

(2) Automatic frequency control(3)Power switch control(4) Back-Bias control

LPLP--Tech1Tech1

LPLP--Tech2Tech2

LPLP--Tech3a,bTech3a,bLPLP--Tech4Tech4

Page 17: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

17

Power Consumption resultsLow power consumption is achieved in heavy and light workload

66.5mAVideo dec. (H.264, D1, 30fps)

22.3mAVideo dec. (H.264, QVGA, 30fps)

5.2mAVGA still image displaying 23.8mAAudio dec. (Enhanced AAC+, 48Kbps)

typ, 1.2V@26C (w/o I/O)

Application

LPLP--Tech1Tech1LPLP--Tech2Tech2LPLP--Tech3a, 3bTech3a, 3bLPLP--Tech4Tech4

Page 18: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

18

M2 die Photo

ApplicationApplicationDigitalDigitalBasebandBaseband

Logic 15Mgate, SRAM 12Mbit8.52x8.52mm@65nm

FPBGA PKG529pin14x14mm□

Page 19: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

19

M2 Products Photo

NTT DoCoMoN905i

NTT DoCoMoN905iμ

Page 20: January 24th, 2008 NEC Electronics Shuichi Kunie s.kunie@necel · Marco3 Clock gating Flip Flop Flip Flop GG Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop Flip Flop en GG CTS

20

Summary and Conclusions

Achieve 80% power reduction in light workloadHierarchical clock gating and automatic frequency control by HWMuti-Vt transistor and on-chip power switch which is controlled by PMUBack-bias by UltimateLowPower

Expectation for EDA toolsMaturity and stabilityReducing performance variationEnhancing automatic optimization

To realize truly ultimated low power, Study the elements of power consumption in use casesApply the knowledge to the design from front-end to back-end phase