ECE 692 2009 1 Power Control for Chip Multiprocessors Xue Li Oct 27, 2009


ECE 692 2009 1

Power Control for Chip Multiprocessors

Xue Li

Oct 27, 2009


Outline

• Two ways to control the power of chip multiprocessors
  – MPC control with online model estimation
  – Simple closed-loop control with risk evaluation


Temperature-Constrained Power Control for Chip Multiprocessors with Online Model Estimation

Yefu Wang, Kai Ma, Xiaorui Wang


Introduction

• Power and temperature are the major constraints on further throughput improvement of chip multiprocessors (CMPs)
  – The peak power consumption of a CMP should be controlled to enable higher computing densities.
  – The temperature of a CMP should be kept below a threshold to prevent thermal failures.
  – The performance delivered per watt needs to be maximized.


State of the Art

• Power control for CMP
  – Open-loop search or optimization [Isci’06], [Teodorescu’08], etc.
    • Highly dependent on the accuracy of the system model
  – Heuristics [Isci’06], [Meng’08], etc.
    • No theoretical guarantee of control accuracy or stability
  – Chip-wide DVFS (Dynamic Voltage and Frequency Scaling) [McGowen’06], [Floyd’07], etc.
    • Suboptimal in performance

• Dynamic thermal management
  – Heuristics or feedback control theory [Brooks’01], [Skadron’03], etc.
    • Power and temperature are controlled separately


Challenges and Solutions

• Multiple cores may need to be manipulated simultaneously to control both power and temperature.
  – Solution: Multi-Input-Multi-Output (MIMO) control

• Optimal control algorithms need to be designed for power shifting among different cores.
  – Solution: Model predictive control (MPC) theory

• Different cores may be coupled together.
  – Solution: Specific design constraints

• Workload is unpredictable at design time.
  – Solution: Online parameter estimation

• Control accuracy and system stability are critical.
  – Solution: Theoretically guaranteed control performance and stability


Temperature-Constrained Power Control

• MIMO control loop invoked periodically
  – The power monitor sends the chip-level power consumption to the controller
  – The controller reads the temperature and performance metrics of each core
  – The controller computes new DVFS levels based on MPC theory
  – The new per-core DVFS levels are sent to the cores
  – The online model estimator updates the power model


Steps of Model Predictive Control

• System modeling
  – Power model

• Controller design
  – MPC controller design
  – Constraints:
    • Frequency range
    • Power budget
    • Temperature
    • Other design requirements

• System stability analysis


System Modeling: Power Model (1)

• Power consumption of one core

$$p_i(k) = a_i f_i(k) + c_i, \qquad p_i(k+1) = p_i(k) + a_i\,\Delta f_i(k)$$

• In vector form for all cores

$$\mathbf{p}(k+1) = \mathbf{p}(k) + \mathbf{A}\,\Delta\mathbf{f}(k)$$

• A: estimated system parameters
  – Initial value can be defined by system identification
  – May change for different workloads
  – Can be updated by online estimation


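System Modeling: Power Model (2)

• Total power consumption of the CMP, where cp(k) is the total power consumption of the chip

$$cp(k+1) = cp(k) + \begin{bmatrix} a_1 & \cdots & a_N \end{bmatrix} \begin{bmatrix} \Delta f_1(k) \\ \vdots \\ \Delta f_N(k) \end{bmatrix}$$

• Power model validation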
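A minimal sketch of the one-step chip power prediction from the model above, cp(k+1) = cp(k) plus the sum of a_i times the frequency change of core i. The function name, the gain values, and the units (W/GHz and GHz) are assumptions made for illustration; they do not come from the slides.

```python
def predict_chip_power(cp_now, gains, delta_freqs):
    """One-step prediction: cp(k+1) = cp(k) + sum_i a_i * delta_f_i(k)."""
    if len(gains) != len(delta_freqs):
        raise ValueError("need exactly one gain a_i per core")
    return cp_now + sum(a * df for a, df in zip(gains, delta_freqs))


# Hypothetical 4-core example: gains in W/GHz, frequency changes in GHz.
gains = [10.0, 10.0, 12.0, 8.0]
delta_freqs = [-0.2, 0.0, 0.1, -0.1]
predicted = predict_chip_power(100.0, gains, delta_freqs)
print(round(predicted, 2))
```

Because the model is a difference equation, only the gains a_i matter for the prediction; the constants c_i cancel out.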


Controller Design: MPC Controller

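• Control objective: minimize the cost function

$$V(k) = \sum_{i=1}^{P} \left\| cp(k+i|k) - ref(k+i|k) \right\|^2_{Q(i)} + \sum_{i=0}^{M-1} \left\| \Delta\mathbf{f}(k+i|k) + \mathbf{f}(k+i|k) - \mathbf{F}_{max} \right\|^2_{R(i)}$$

  – First term: control accuracy; cp(k+i|k) comes from model prediction
  – Second term: performance optimization

• Reference trajectory, where cp(k) is the feedback measured from the power meter

$$ref(k+i|k) = P_s - e^{-\frac{T}{T_{ref}} i}\,\bigl(P_s - cp(k)\bigr)$$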
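The reference trajectory can be sketched in a few lines: starting from the measured power cp(k), it approaches the set point P_s exponentially over the prediction horizon. The parameter names and the numeric values below are illustrative assumptions, not values from the slides.

```python
import math


def reference_trajectory(cp_now, set_point, horizon, period, t_ref):
    """ref(k+i|k) = P_s - exp(-(T / T_ref) * i) * (P_s - cp(k)) for i = 1..P."""
    return [
        set_point - math.exp(-(period / t_ref) * i) * (set_point - cp_now)
        for i in range(1, horizon + 1)
    ]


# Hypothetical case: measured power 120 W, budget 100 W, horizon P = 4.
traj = reference_trajectory(cp_now=120.0, set_point=100.0,
                            horizon=4, period=1.0, t_ref=2.0)
print([round(r, 2) for r in traj])
```

A smaller T_ref makes the trajectory reach the set point faster, which asks the controller for more aggressive frequency changes.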


Controller Design: Constraints (1)

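• Physical frequency range

$$F_{min,j} \le f_j(k+1) \le F_{max,j} \quad (1 \le j \le N)$$

• Power budget for each core

$$cp(k) \le P_s$$

• Other design requirements, such as coupled cores that must share a frequency

$$f_i(k+1) = f_j(k+1) \quad (1 \le i, j \le N)$$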


Controller Design: Constraints (2)

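• Model between temperature and frequency
  – Temperature and power

$$\mathbf{t}(k+1) = \mathbf{A}_t\,\mathbf{t}(k) + \mathbf{B}_t\,\mathbf{p}(k)$$

  – Power and frequency

$$p_i(k) = a_i f_i(k) + c_i$$

  – Combined

$$\Delta\mathbf{t}(k) = \mathbf{A}_t\,\Delta\mathbf{t}(k-1) + \mathbf{B}\,\Delta\mathbf{f}(k-1)$$

• Temperature constraint

$$t_i(k+1) \le T_i$$

( 1) ik s iB f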


Controller Design: Stability Analysis

• Stability
  – The system converges to the desired bounds from any initial condition

• Unknown system gain
  – The actual system parameter is A' = G A, where A is the estimated parameter used to design the controller
  – For a uniform workload, G = g I
  – The bigger the stable range of g, the better the system adaptability.

• The system is proved to be stable over a wide range
  – Uniform workload
    • 0 < g ≤ 8.83
  – Different workloads
    • 0 < g1 ≤ 15.7
    • 0 < g2 ≤ 17.6

• The controller works as long as the real parameter of the system is no more than 8.83 times the value used to design it.
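To illustrate why the range of the gain ratio g matters, here is a toy scalar loop, not the MPC controller from the slides. It assumes a one-step controller that inverts the design gain a while the plant actually has gain g·a, so the tracking error follows e(k+1) = (1 − g) e(k). The stability bound of this toy loop is 0 < g < 2; the bounds quoted on the slide belong to the MPC design and are not reproduced here.

```python
def simulate_error(g, steps=30, initial_error=10.0):
    """Toy loop: the controller assumes gain a, the plant has gain g * a.

    With a one-step inverting controller the tracking error obeys
    e(k+1) = (1 - g) * e(k).
    """
    error = initial_error
    history = [error]
    for _ in range(steps):
        error = (1.0 - g) * error
        history.append(error)
    return history


converges = abs(simulate_error(1.5)[-1]) < 1e-3  # |1 - g| = 0.5, error shrinks
diverges = abs(simulate_error(2.5)[-1]) > 1e3    # |1 - g| = 1.5, error grows
print(converges, diverges)
```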


Online Model Estimation

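• A recursive least squares (RLS) estimator updates the model periodically
  – The RLS estimator records cp(k) and f(k).
  – The estimator calculates f_e(k) and A_e(k).
  – The estimator updates A in the system model with the a_i from A_e(k).

$$cp(k) = \mathbf{A}_e\,\mathbf{f}_e(k), \qquad \mathbf{A}_e(k) = \begin{bmatrix} a_1 & \cdots & a_N & c \end{bmatrix}, \qquad \mathbf{f}_e(k) = \begin{bmatrix} \mathbf{f}(k)^T & 1 \end{bmatrix}^T$$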
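The estimator can be sketched with the textbook RLS recursion applied to cp(k) = A_e f_e(k). This is a generic implementation under stated assumptions, not the authors' code: the class name, the forgetting factor, the initial covariance, and the synthetic parameters a = [10, 12], c = 5 are all made up for the example.

```python
class RLSEstimator:
    """Recursive least squares for cp(k) = A_e . f_e(k), f_e(k) = [f(k), 1]."""

    def __init__(self, n_cores, forgetting=1.0, init_cov=1000.0):
        n = n_cores + 1  # N gains a_i plus the constant c
        self.theta = [0.0] * n  # current estimate of [a_1, ..., a_N, c]
        self.cov = [[init_cov if r == c else 0.0 for c in range(n)]
                    for r in range(n)]
        self.forgetting = forgetting

    def update(self, freqs, chip_power):
        x = list(freqs) + [1.0]  # extended frequency vector f_e(k)
        n = len(x)
        # px = P x; since P stays symmetric, x^T P is the same vector.
        px = [sum(self.cov[r][c] * x[c] for c in range(n)) for r in range(n)]
        denom = self.forgetting + sum(x[r] * px[r] for r in range(n))
        gain = [v / denom for v in px]
        error = chip_power - sum(t * xi for t, xi in zip(self.theta, x))
        self.theta = [t + g * error for t, g in zip(self.theta, gain)]
        # Covariance update: P <- (P - K x^T P) / lambda
        self.cov = [[(self.cov[r][c] - gain[r] * px[c]) / self.forgetting
                     for c in range(n)] for r in range(n)]
        return self.theta


# Synthetic check with made-up parameters a = [10, 12], c = 5.
estimator = RLSEstimator(n_cores=2)
for k in range(100):
    f = [1.0 + 0.5 * (k % 5), 1.0 + 0.5 * ((3 * k) % 7)]
    estimator.update(f, 10.0 * f[0] + 12.0 * f[1] + 5.0)
print([round(v, 2) for v in estimator.theta])
```

A forgetting factor below 1 would weight recent samples more heavily, which is one common way to track workload changes; whether the original work uses one is not stated on the slide.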


System Implementation

Figure labels: power lines (current signal), current probe (1 mV/A), USB interface

                 Physical Testbed      Simulation
CPU              Intel Xeon X5365      Alpha 21264-like
Cores            4                     4, 8, 16
Power monitor    Digital multimeter    Wattch
Temp sensor      Coretemp driver
Controller       Software
Workload         SPEC CPU 2006


Experimentation

• Baselines

• Empirical results
  – Control accuracy
  – Application performance
  – Temperature constraints
  – Online model estimator

• Simulation results
  – Control accuracy
  – Application performance


Baselines

• Priority
  – Per-core DVFS
  – Heuristic based
    • Power > budget: DVFS level decreases by 1
    • Power < budget: DVFS level increases by 1

• Improved Priority
  – Priority with a safety margin

• MaxBIPS
  – Per-core DVFS
  – Predictive: uses a typical workload to build a static table offline
  – Exhaustive search over the combinations of DVFS levels for all cores
  – Workload sensitive
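One control period of the Priority heuristic can be sketched as follows. The slide only states that the DVFS level moves by one step depending on whether power is above or below the budget; which core is stepped is an assumption here (the lowest-priority core is throttled first, the highest-priority core is sped up first), as are the function name and the integer level encoding. A non-zero `margin` mimics the Improved Priority variant.

```python
def priority_step(levels, measured_power, budget, max_level, margin=0.0):
    """Return new per-core DVFS levels; levels[0] is the highest-priority core.

    Level 0 is the slowest setting and max_level the fastest (assumed encoding).
    """
    new_levels = list(levels)
    target = budget - margin  # margin > 0 gives the "improved" variant
    if measured_power > target:
        # Over budget: throttle the lowest-priority core that can still go down.
        for core in reversed(range(len(new_levels))):
            if new_levels[core] > 0:
                new_levels[core] -= 1
                break
    elif measured_power < target:
        # Under budget: speed up the highest-priority core that can still go up.
        for core in range(len(new_levels)):
            if new_levels[core] < max_level:
                new_levels[core] += 1
                break
    return new_levels


print(priority_step([3, 3, 3, 3], measured_power=110, budget=100, max_level=3))
print(priority_step([3, 2, 1, 0], measured_power=90, budget=100, max_level=3))
```

Because each period moves a single level by one step, a loop like this tends to hover around the budget rather than settle on it.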


Baselines: MaxBIPS

• Define two N*M matrices: Power and BIPS
  – N: number of cores
  – M: number of power modes

• Fill in the matrices with actual and predicted values
  – Power: cubic scaling, e.g. 17.15 = 20 × 0.95^3, where 20 is the actual value
  – BIPS: linear scaling

• Find the combination of power modes across cores that achieves the best BIPS under the power budget

Mode     Power savings   Performance degradation
Mode1    None            None
Mode2    15%             5%

Power matrix   Core1   Core2
Mode1          20      16
Mode2          17      14

BIPS matrix    Core1   Core2
Mode1          80      60
Mode2          69      52

Core1 / Core2    BIPS         Power
Mode1 / Mode1    80+60=140    20+16=36
Mode1 / Mode2    80+52=132    20+14=34
Mode2 / Mode1    69+60=129    17+16=33
Mode2 / Mode2    69+52=121    17+14=31

If the power budget is 32, the last combination is selected.
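The selection step can be sketched as an exhaustive search over all mode combinations, using the two-core, two-mode numbers from the slide. The function name and the `[mode][core]` table layout are assumptions for this example.

```python
from itertools import product

# Values from the slide, indexed as TABLE[mode][core].
POWER = [[20, 16], [17, 14]]
BIPS = [[80, 60], [69, 52]]


def max_bips(power, bips, budget):
    """Return (modes, total_bips, total_power) with the best BIPS under budget."""
    n_modes, n_cores = len(power), len(power[0])
    best = None
    for modes in product(range(n_modes), repeat=n_cores):
        total_power = sum(power[m][c] for c, m in enumerate(modes))
        total_bips = sum(bips[m][c] for c, m in enumerate(modes))
        if total_power <= budget and (best is None or total_bips > best[1]):
            best = (modes, total_bips, total_power)
    return best  # None when no combination fits the budget


choice = max_bips(POWER, BIPS, budget=32)
print(choice)  # ((1, 1), 121, 31): both cores in Mode2
```

The loop visits M^N combinations, so the search space grows exponentially with the number of cores.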


Empirical Results: Control Accuracy (1)

• Comparison of steady-state errors
  – Steady-state error: deviation from the power budget at different power levels.
  – MPC follows the set point well.


Empirical Results: Control Accuracy (2)

• MPC vs. MaxBIPS / Priority / Improved Priority
  – Figure annotations: much lower than the set point; fits well; oscillates around the set point; exceeds the budget at times


Empirical Results: Application Performance

• Comparison of SPEC performance among MPC, MaxBIPS, and Improved Priority under different power budgets.

– MPC achieves better performance because it can precisely reach the set-point power.

– The average improvement of MPC is 9.69% over MaxBIPS and 8.95% over Improved Priority.


Empirical Results: Temperature Constraints

• Emulate a thermal emergency by lowering the temperature constraint
  – Figure (a) shows that the core temperatures are quickly brought down to the lowered bound.
  – Figure (b) shows that the temperature constraint works effectively by reducing power consumption.


Empirical Results: Online Model Estimator

• MPC vs. MPC with estimator
  – Workload may change significantly at run time.
  – The estimator can correct system parameters dynamically.
  – MPC without the estimator suffers large oscillations.


Simulation Results: Control Accuracy

• Simulation with more cores (4, 8, 16)
  – Average power and standard deviation of the different control methods.
    • MPC precisely converges to the budget.
  – MaxBIPS is absent for 16 cores because its static prediction table grows exponentially with the number of cores.


Simulation Results: Application Performance

• SPEC benchmark performance comparison for different numbers of cores (set point = 95%, 85%)


Conclusion

• A temperature-constrained chip-level power controller
  – Designed based on MPC theory
  – Accurately controls power consumption
  – Keeps the temperatures of the cores below the constraint
  – An online model estimator periodically updates the system model

• Compared with state-of-the-art work
  – More accurate power control
  – Better application performance


Multi-Optimization Power Management for Chip Multiprocessors

Ke Meng, Russ Joseph, Robert P. Dick (Northwestern University)

Li Shang (University of Colorado)


Introduction

• Power is still a first-class design constraint in the CMP era.
  – Higher transistor density
  – Higher leakage power
  – Figure labels: high power density, thermal issues

• Power is still a precious computing resource
  – When power is limited, maximizing the chip-wide performance requires global and local coordination.


System Framework

Select power optimizations and allowable power modes

Collect data from sensors and counters; calculate power/performance.

Analyze, search and tune

Soft-limit budget


Optimization Pool (1)

• DVFS
  – Simple models
    • Frequency: linear in voltage
    • Power: cubic in voltage
    • Performance: roughly linear in frequency
  – High efficiency
    • Cubic relationship between frequency and power
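A short sketch of these simple scaling rules: relative power follows the cube of the frequency ratio, while relative performance is taken as the ratio itself. The function name and the chosen ratios are illustrative assumptions, and the linear performance rule is only the rough approximation named above.

```python
def dvfs_scaling(freq_ratio):
    """Relative power and performance at a given fraction of peak frequency."""
    return {
        "power": freq_ratio ** 3,   # cubic in frequency (voltage scales with it)
        "performance": freq_ratio,  # roughly linear in frequency
    }


levels = {ratio: dvfs_scaling(ratio) for ratio in (0.85, 0.90, 0.95, 1.00)}
for ratio, rel in levels.items():
    print(f"{ratio:.2f}: power x{rel['power']:.3f}, perf x{rel['performance']:.2f}")
```

At 95% frequency the model gives roughly 14% lower power for a 5% performance loss, which is where the efficiency of DVFS comes from.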


Optimization Pool (2)

• Cache resizing
  – Large leakage: big savings
  – Workload variety: unused private capacity


Models and Experimentations

• Models– Dynamic voltage / frequency scaling (DVFS)

– Cache resizing

– Unified analytic models

– Risk evaluation

– Search algorithms

• Experimentation– Configuration

– Model validation

– Model evaluation

– Power violation


Analytic Models: DVFS

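• DVFS modeling
  – CPI stack counters count computing stalls and L2 miss stalls
    • Computing stalls (CPU_i): change with frequency
    • L2 miss stalls (Mem_i): constant regardless of frequency
  – Performance model: Perf_j in power mode j is predicted from Perf_i, the frequency ratio between modes i and j, and the components CPU_i and Mem_i measured in mode i; CPU_j + Mem_j denotes the corresponding sum in mode j.
  – Power: cubic in frequency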


Analytic Models: Cache Resizing

• Cache resizing modeling
  – Non-stall cycles: BaseCPI
  – Stall cycles due to cache misses: MissCPI
  – Performance model: Perf_j is predicted from Perf_i using the total CPI, MissCPI + BaseCPI, of cache configurations i and j.
  – Power: average leakage power of a cache way times the number of active ways


Analytic Models: Unification

• Unified analytic models with DVFS and cache resizing
  – Performance
    • Weak interaction among multiple optimizations allows independent speed-ups
    • OptSpeedup_k is the ratio Perf_j / Perf_i attributed to optimization k
  – Power
    • DVFS has a strong influence: Power_i is scaled by (Freq_j / Freq_i)^3
    • Cache resizing contributes additively through the OptPower_k terms


Analytic Models: Risk Evaluation

• Why evaluate risk?
  – Some optimizations are more sensitive to program phase changes.
  – Mispredictions can cause severe performance loss and power violations.

• How to evaluate risk?
  – DVFS: assume zero risk.
  – Cache resizing: threshold on the variation of cache activity.


Analytic Models: Search Algorithms (1)

• Brute-force search
  – Traverses all possible power modes
  – Always finds the best combination
  – Slow when the search space is large

• Greedy search
  – Takes the best step currently available
    • Current best step: the power mode with the maximal delta power/performance ratio
  – Fast
  – Can get stuck in local minima; results show this happens rarely
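A minimal sketch of the greedy search, under assumptions: every core starts in its fastest mode, and each step moves one core to its next slower mode, choosing the core with the largest ratio of power saved to performance lost. The function name, the `[mode][core]` layout, and the stopping rule are illustrative; the numbers reuse the two-core power and BIPS tables from the MaxBIPS slide earlier in this deck.

```python
def greedy_search(power, perf, budget):
    """Greedy mode selection; tables are indexed [mode][core], mode 0 fastest.

    Returns (modes, total_perf, total_power), or None if the budget cannot
    be met even with every core in its slowest mode.
    """
    n_modes, n_cores = len(power), len(power[0])
    modes = [0] * n_cores

    def total(table):
        return sum(table[m][c] for c, m in enumerate(modes))

    while total(power) > budget:
        best_core, best_ratio = None, -1.0
        for c in range(n_cores):
            m = modes[c]
            if m + 1 >= n_modes:
                continue  # this core is already in its slowest mode
            saved = power[m][c] - power[m + 1][c]
            lost = perf[m][c] - perf[m + 1][c]
            ratio = saved / lost if lost > 0 else float("inf")
            if ratio > best_ratio:
                best_core, best_ratio = c, ratio
        if best_core is None:
            return None
        modes[best_core] += 1
    return tuple(modes), total(perf), total(power)


POWER = [[20, 16], [17, 14]]
BIPS = [[80, 60], [69, 52]]
print(greedy_search(POWER, BIPS, budget=32))  # ((1, 1), 121, 31)
print(greedy_search(POWER, BIPS, budget=34))  # ((1, 0), 129, 33)
```

With a budget of 34 the sketch settles on 129 BIPS, whereas enumerating all four combinations finds 132 BIPS at a power of 34. This is a small instance of the greedy search stopping at a local optimum.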


Experiment: Configuration

Processor setup
Cores           4 Alpha 21264-like cores
L1 I/D cache    64 KB, 2-way, private, 64 B blocks
L2 cache        2 MB, 8-way, private, 128 B blocks
Tech node       65 nm
DVFS range      85%, 90%, 95%, 100% of 3 GHz

Group      Workloads                      Stability
Group A    equake, swim, sixtrack, gcc    Moderate
Group B    applu, gap, facerec, vortex    Moderate
Group C    mesa, eon, lucas, wupwise      Stable
Group D    art, mcf, parser, vpr          Unstable


Experiment: Model Validation

• Cache CPI model validation


Experiment: Model Evaluation (1)

• Modeling-greedy vs. modeling-global / trial-and-error
  – Trial-and-error (DVFS + cache resizing):
    • Starts a trial stage when entering a stable phase
    • Only works with workloads that have stable phases (Group C)
  – Analytical modeling (DVFS + cache resizing):
    • 8% performance loss vs. 35% power saving
    • Greedy search works extremely well


Experiment: Model Evaluation (2)

• Modeling with risk management vs. MaxBIPS
  – Simple (DVFS + cache resizing): analytical modeling without risk evaluation.
  – With risk evaluation: results are either better or almost unchanged.
  – MaxBIPS (DVFS only): not always the worst.
  – Managing multiple optimizations is difficult.
  – Even with risk evaluation, errors can be made before the risk is identified.


Conclusion

• The power problem is critical in CMPs.

• CMP power management must coordinate global and local power usage.

• Analytical modeling is preferable to trial-and-error.

• Risk evaluation is necessary to avoid frequent prediction errors.


Comparison

                           1st paper           2nd paper
Controlling                Predictive based    Heuristic based
Power budget               Hard limit          Soft limit
Temperature management     Yes                 No
L2 cache involvement       No                  Yes
Hardware implementation    Yes                 No


Critiques

• First paper
  – The temperature constraint seems much higher than under normal working conditions.
  – The explanation of a term in the temperature constraints is not very clear.

• Second paper
  – Modeling accuracy is low.
  – No absolute guarantee on power consumption.
  – Too many arbitrary assumptions.

is


Thank you