Page 1: EECS 388: Embedded Systems - ITTC

EECS 388: Embedded Systems

12. Power and Energy

Heechul Yun

Page 2: EECS 388: Embedded Systems - ITTC

Agenda

• Background

• How to measure?

• How to save energy/power?

Page 3: EECS 388: Embedded Systems - ITTC

H. Sutter, “The Free Lunch Is Over”, Dr. Dobb's Journal, 2005 (updated in 2009)

Page 4: EECS 388: Embedded Systems - ITTC

Power Consumption (Server)

• Memory consumes significant power

– E.g., Intel Haswell-ULT: 15 W; 2 x 4 GB DDR3 DRAM: 10 W

Figure source: Luiz André Barroso and Urs Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Morgan & Claypool, 2009

Page 5: EECS 388: Embedded Systems - ITTC

Power Consumption (Smart Phone)

• Audio playback with backlight off on a smartphone

Page 6: EECS 388: Embedded Systems - ITTC

DVFS and DPM

• Dynamic Voltage/Frequency Scaling (DVFS)

– Power ~ f V^2

– Reduce frequency & voltage

• Dynamic Power Management (DPM)

– Multiple power states

• CPU C-states (standby, sleep, deep sleep, …)

• DDR3 power states (standby, powerdown, self-refresh, …)

• Goal: Making a “Good” Tradeoff

– Minimize performance hit, maximize power reduction

Page 7: EECS 388: Embedded Systems - ITTC

Background

- f: clock frequency

- V: voltage

$$\text{Power} = P_{\text{dynamic}} + P_{\text{static}} = \frac{1}{2} C f V^2 + P_{\text{static}}$$

Page 8: EECS 388: Embedded Systems - ITTC

Background

$$\text{Power} \sim f V^2, \qquad \text{Time} \sim \frac{1}{f}, \qquad \text{Energy} = \text{Power} \times \text{Time}$$

• Frequency doesn’t matter. Is that right?

(Let’s ignore $P_{\text{static}}$ for now.)

Page 9: EECS 388: Embedded Systems - ITTC

Background

$$\text{Power} \sim f V^2, \qquad \text{Time} \sim \frac{1}{f}, \qquad \text{Energy} = \text{Power} \times \text{Time}$$

• If you reduce frequency, you can also reduce voltage

$$f \sim V$$

Page 10: EECS 388: Embedded Systems - ITTC

Background

$$\text{Power} \sim f^3, \qquad \text{Time} \sim \frac{1}{f}, \qquad \text{Energy} = \text{Power} \times \text{Time}$$

• Is reducing frequency always good?

Page 11: EECS 388: Embedded Systems - ITTC

Background

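$$\text{Energy} = \underbrace{P_{\text{dynamic}} \times \text{Time}}_{\sim f^2} + \underbrace{P_{\text{static}} \times \text{Time}}_{\sim 1/f}$$

• Is reducing frequency always good?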
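The question on this slide can be explored numerically. The sketch below assumes dynamic power grows as f^3 (voltage scaled with frequency), static power is constant, and run time is inversely proportional to f. The constants `k_dyn`, `p_static`, and `work` are made-up, normalized values chosen only for illustration.

```python
def energy(f, k_dyn=1.0, p_static=0.5, work=1.0):
    """Energy to finish `work` cycles at normalized frequency f.

    Assumed model: P_dynamic = k_dyn * f^3, P_static constant, Time = work / f.
    """
    run_time = work / f
    p_dynamic = k_dyn * f ** 3
    return (p_dynamic + p_static) * run_time


# Candidate normalized frequencies 0.2, 0.3, ..., 1.0 (hypothetical range).
freqs = [round(0.1 * i, 1) for i in range(2, 11)]
best = min(freqs, key=energy)

for f in freqs:
    marker = "  <- minimum" if f == best else ""
    print(f"f = {f:.1f}  energy = {energy(f):.3f}{marker}")
```

Under these assumptions the dynamic term falls as f^2 while the static term grows as 1/f, so the total is U-shaped: lowering the frequency helps only down to a point, after which the longer run time makes static energy dominate.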

Page 12: EECS 388: Embedded Systems - ITTC

PowerTop

Page 13: EECS 388: Embedded Systems - ITTC

Intel’s Recent Processors

• RAPL (Running Average Power Limit)

Source: http://web.eece.maine.edu/~vweaver/projects/rapl/

Source: http://software.intel.com/en-us/articles/intel-power-governor
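On Linux, RAPL energy counters are typically exposed through the powercap sysfs interface. The sketch below is one way to turn two counter readings into an average power figure. It assumes the package domain is at `/sys/class/powercap/intel-rapl:0` (this varies by machine) and that the caller has permission to read `energy_uj`, which often requires root. The function names are made up for this example.

```python
import time
from pathlib import Path

# Assumed location of the package-0 RAPL domain; check your own system.
RAPL_DOMAIN = Path("/sys/class/powercap/intel-rapl:0")


def read_counter(name, domain=RAPL_DOMAIN):
    """Read an integer sysfs attribute such as energy_uj (microjoules)."""
    return int((domain / name).read_text())


def average_power_w(start_uj, end_uj, seconds, max_range_uj):
    """Average power in watts between two energy readings.

    The counter wraps around at roughly max_range_uj, so a negative
    difference is corrected by adding the counter range.
    """
    delta_uj = end_uj - start_uj
    if delta_uj < 0:
        delta_uj += max_range_uj
    return delta_uj / 1e6 / seconds


def measure_package_power(interval_s=1.0, domain=RAPL_DOMAIN):
    """Sample the energy counter twice and return average watts."""
    max_range = read_counter("max_energy_range_uj", domain)
    start = read_counter("energy_uj", domain)
    time.sleep(interval_s)
    end = read_counter("energy_uj", domain)
    return average_power_w(start, end, interval_s, max_range)
```

Calling `measure_package_power(1.0)` on a machine with RAPL support returns the average package power over one second; on other machines the sysfs files will simply be missing.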

Page 14: EECS 388: Embedded Systems - ITTC

Platform level monitoring

Odroid-XU-E board

Processor: Exynos 5 Octa

Source: http://hardkernel.com/main/products/prdt_info.php?g_code=G137463363079

Page 15: EECS 388: Embedded Systems - ITTC

External Measurement

Source: http://www.hardkernel.com/main/products/prdt_info.php?g_code=G137361754360

Source: http://www.rakuten.com/prod/p3-kill-a-watt-ps-10-10-outlets-power-strip-receptacle-10/220012603.html?listingId=284206025&scid=pla_google_3KingsAudio&adid=18172&gclid=CIvs97jTq70CFa5DMgodcEkAHg

http://www.amazon.com/P3-International-P4460-Electricity-Monitor/dp/B000RGF29Q/ref=sr_1_3?ie=UTF8&qid=1395680823&sr=8-3&keywords=power+meter

Page 16: EECS 388: Embedded Systems - ITTC

How to save Power/Energy?

• Techniques for perf/energy tradeoffs

– DVFS

– Turbo boost

– Power gating

– Core heterogeneity

• Considerations

– Sensitive to time (performance)

– Sensitive to energy consumption

Page 17: EECS 388: Embedded Systems - ITTC

A Measurement Study

• An Analysis of Power Consumption in a Smartphone, USENIX ATC’10

Impact

Page 18: EECS 388: Embedded Systems - ITTC

A Smartphone

• (very old) 2.5G GPRS phone

– Battery: 1200 mAh, 3.7 V Li-ion (4.4 Wh)

Page 19: EECS 388: Embedded Systems - ITTC

What to Know?

• Where does the energy go?

– Detailed component-level power breakdown

– On various usage scenarios

• How to save energy?

– The efficacy of DVFS (dynamic voltage-frequency scaling) schemes

Page 20: EECS 388: Embedded Systems - ITTC

Methodology

• Hardware

– A development board, configured to measure individual component (CPU, memory, Modem, …) power consumption

– Using a DAQ (data acquisition) system

• Read the paper. You can find very detailed descriptions

• Software

– On Android 1.5, using a set of micro-benchmarks as well as real applications

Page 21: EECS 388: Embedded Systems - ITTC

Idle

• System is awake, but no applications are active

• CPU and RAM are not top power consumers

Page 22: EECS 388: Embedded Systems - ITTC

Audio Playback

• Backlight off

• Comparable to idle state

Page 23: EECS 388: Embedded Systems - ITTC

Video Playback

• Backlight is a dominant factor

Page 24: EECS 388: Embedded Systems - ITTC

Backlight

• User controllable (~255 levels)

Page 25: EECS 388: Embedded Systems - ITTC

CPU and Memory

• 100 MHz (low perf) vs. 400 MHz (max perf)

• equake’s power consumption increases significantly

• mcf’s power consumption doesn’t increase much

Page 26: EECS 388: Embedded Systems - ITTC

Internal Flash and SD Card

• Benchmark: flash read/write (dd)

• Why are they (internal and SD) different?

Page 27: EECS 388: Embedded Systems - ITTC

Findings

• Where does the energy go?

– GSM, display, backlight

– Not CPU and DRAM

• Is DVFS useful?

– Reduce power but not necessarily energy

– Only memory bound applications get energy savings

Page 28: EECS 388: Embedded Systems - ITTC

Two Additional Smartphones

Page 29: EECS 388: Embedded Systems - ITTC

Quiz

• On which phone would you want to use DVFS?

Page 30: EECS 388: Embedded Systems - ITTC

Is DVFS useful?

• Yes: Nexus One, Freerunner (weak)

• No: G1

Further reading: E. Le Sueur and G. Heiser, “Dynamic voltage and frequency scaling: the laws of diminishing returns,” HotPower’10

Page 31: EECS 388: Embedded Systems - ITTC

Challenge: How To Configure?

• Too many possible configurations

– low or high freq?

– More cores or fewer cores?

– Little core vs. big core?

• Platform variation

– A policy that works well on one platform does not necessarily work well on another

Page 33: EECS 388: Embedded Systems - ITTC

Energy Saving Strategies

• Model-based approach

– Offline: build an energy/performance model

– Online: compute an “optimal” assignment

• Heuristic approach

– Race to idle

– Never idle

– Adaptive control
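The first two heuristics can be compared with a toy model. The sketch below assumes a single job of `work` normalized cycles with a deadline, active power of the form `k * f**3 + p_static`, and a small idle power. All constants are made up for illustration, and neither function is taken from a real power manager.

```python
def active_power(f, p_static, k=1.0):
    """Assumed active power model: cubic dynamic term plus static term."""
    return k * f ** 3 + p_static


def race_to_idle(work, deadline, p_static, f_max=1.0, p_idle=0.05):
    """Run at full speed, then sit in a low-power idle state until the deadline."""
    busy = work / f_max
    return active_power(f_max, p_static) * busy + p_idle * (deadline - busy)


def never_idle(work, deadline, p_static):
    """Run at the lowest frequency that still finishes exactly at the deadline."""
    f = work / deadline
    return active_power(f, p_static) * deadline


for p_static in (0.3, 2.0):
    r = race_to_idle(0.5, 1.0, p_static)
    n = never_idle(0.5, 1.0, p_static)
    winner = "never idle" if n < r else "race to idle"
    print(f"p_static={p_static}: race={r:.3f} never={n:.3f} -> {winner}")
```

In this toy model the better heuristic flips with the static power: when static power is small, stretching the job wins; when it is large, finishing early and idling wins. That is one reason a fixed heuristic may not carry over between platforms.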

Page 34: EECS 388: Embedded Systems - ITTC

System-wide Energy Optimization for Multiple DVS Components and Real-time Tasks

Heechul Yun, Po-Liang Wu, Anshu Arya, Tarek Abdelzaher, Cheolgi Kim, and Lui Sha

University of Illinois at Urbana-Champaign

Euromicro Conference on Real-Time Systems (ECRTS), 2010

Page 35: EECS 388: Embedded Systems - ITTC

CPU-only DVFS

• “DVFS is increasingly ineffective” [Le Sueur, HotPower’10]

– Increased importance of static power

– Small voltage margin for DVFS to be effective

– Reduced freq. → increased runtime → often increased energy

- f: clock frequency

- V: voltage

- k: constant

$$P = P_{\text{dynamic}} + P_{\text{static}} = k f V^2 + P_{\text{static}}$$

Page 36: EECS 388: Embedded Systems - ITTC

CPU-only DVFS

[Figure: Energy (mJ) vs. CPU frequency fc (MHz, 40 to 400) for a task with cache stall ratio = 0%. Annotations: “Valid range (~200 MHz)” and “Not effective, but…”]

Page 37: EECS 388: Embedded Systems - ITTC

Motivation

Memxfer5b: memory benchmark program

CPU (MHz)   Mem (MHz)   Time (s)   Energy (mJ)
200         100         3.46       1690
100         100         3.55       1182

• Half of CPU clock

• Energy saved 30%

• Exec. time increased only 3%

Page 38: EECS 388: Embedded Systems - ITTC

Motivation

Dhrystone: CPU benchmark program

CPU (MHz)   Mem (MHz)   Time (s)   Energy (mJ)
200         100         4.26       2364
200         50          4.28       2106

• Half of Mem clock

• Energy saved about 11%

• Exec. time increased only about 0.5%
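The savings quoted on the two Motivation slides can be rechecked directly from the measured time and energy values. The short sketch below recomputes the relative changes; only the helper name `pct_change` is introduced here.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# (time in s, energy in mJ) before and after the frequency change,
# copied from the two measurement tables.
runs = {
    "Memxfer5b, CPU 200 -> 100 MHz": ((3.46, 1690), (3.55, 1182)),
    "Dhrystone, Mem 100 -> 50 MHz": ((4.26, 2364), (4.28, 2106)),
}

for name, ((t0, e0), (t1, e1)) in runs.items():
    print(f"{name}: time {pct_change(t0, t1):+.2f}%, "
          f"energy {pct_change(e0, e1):+.1f}%")
```

In both cases the component that is not the bottleneck is slowed down, so energy drops noticeably while execution time barely changes.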

Page 39: EECS 388: Embedded Systems - ITTC

Task Model

• Task = Computation + Memory fetch

[Figure: power vs. time plots of a task, showing computation and memory fetch (cache stall) phases]

Page 40: EECS 388: Embedded Systems - ITTC

Task Model (2)

C : computation

M : off-chip memory fetch (cache-stall cycles)

[Figure: power vs. time plots of the C and M blocks, shown for the baseline, for a lower MEM frequency, and for a lower CPU frequency]

Page 41: EECS 388: Embedded Systems - ITTC

Task Model (3)

• Execution time of a task

– C : CPU cycles of a given task (excluding memory stalls)

– M : memory cycles of a given task (memory stall cycles)

– fc : CPU clock frequency

– fm : Memory clock frequency

$$e = \frac{C}{f_c} + \frac{M}{f_m}$$
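As a quick illustration of the execution time model (CPU cycles divided by CPU frequency plus memory stall cycles divided by memory frequency), the sketch below evaluates it for two hypothetical tasks. The cycle counts are invented for this example and are not measurements from the paper.

```python
def exec_time_s(c_cycles, m_cycles, fc_hz, fm_hz):
    """Execution time e = C / fc + M / fm, in seconds."""
    return c_cycles / fc_hz + m_cycles / fm_hz


# Hypothetical memory-bound task: few CPU cycles, many memory stall cycles.
mem_bound = (100e6, 300e6)
# Hypothetical CPU-bound task: many CPU cycles, few memory stall cycles.
cpu_bound = (800e6, 10e6)

for name, (c, m) in (("memory-bound", mem_bound), ("CPU-bound", cpu_bound)):
    base = exec_time_s(c, m, 200e6, 100e6)
    slow_cpu = exec_time_s(c, m, 100e6, 100e6)
    slow_mem = exec_time_s(c, m, 200e6, 50e6)
    print(f"{name}: base {base:.2f}s, half CPU clock {slow_cpu:.2f}s, "
          f"half MEM clock {slow_mem:.2f}s")
```

The model makes the intuition concrete: halving the CPU clock costs little for the memory-bound task, and halving the memory clock costs little for the CPU-bound one.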

Page 42: EECS 388: Embedded Systems - ITTC

Power Model

• Power of a component (e.g., CPU)

– k : capacitance constant

– f : frequency of the component

– V : supply voltage

– R : leakage power

$$P = k f V^2 + R$$

Different k for different modes:

– k_active : active mode capacitance

– k_standby : standby mode capacitance

Page 43: EECS 388: Embedded Systems - ITTC

Energy Model

[Figure: power vs. time over one period P. Within the execution time e, the pure exec block (CPU active; bus and memory standby) consumes Ecpu and the MEM fetch block (CPU standby; bus and memory active) consumes Emem. In the remaining idle block (CPU, bus, and memory idle) the system consumes Eidle. Dynamic power sits on top of the system static power in every block.]

• System wide energy model

– Considers CPU, bus, and memory power consumption

– Considers active, standby and idle modes

– Other components are assumed to be static (included in R)

Page 44: EECS 388: Embedded Systems - ITTC

Energy Equation and Validation

$$E = \frac{C}{f_c}\left(k_{ca} V^2 f_c + k_{ms}^{*} V^2 f_m + R\right) + \frac{M}{f_m}\left(k_{cs} V^2 f_c + k_{ma}^{*} V^2 f_m + R\right) + (I + R)(P - e)$$

Obtained coefficients in the energy equation:

            Capacitance (nF)                  Power (mW)
Kca      Kcs      Kma*     Kms*          I        R
0.505    0.224    0.540    0.210         6.570    67.434

• Validated on an ARM926EJ-S based platform via regression analysis

Heechul Yun, Po-Liang Wu, Anshu Arya, Tarek Abdelzaher, Cheolgi Kim, and Lui Sha. “System-wide Energy Optimization for Multiple DVS Components and Real-time Tasks,” ECRTS, 2010
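The sketch below evaluates the energy model for a single job using the coefficients from the table. It rests on assumptions that are not stated on the slide: the equation is read as a computation block, a memory fetch block, and an idle block at power I + R; a single supply voltage `v` (default 1.0 V, a placeholder) applies to all dynamic terms; and frequencies are in MHz so that nF x V^2 x MHz gives mW. Treat it as one plausible reading rather than the paper's exact formulation.

```python
# Coefficients from the slide's table.
K_CA, K_CS, K_MA, K_MS = 0.505, 0.224, 0.540, 0.210  # capacitance, nF
I_MW, R_MW = 6.570, 67.434                           # power, mW


def job_energy_mj(c_cycles, m_cycles, period_s, fc_mhz, fm_mhz, v=1.0):
    """Energy (mJ) of one job over one period, under the assumed model."""
    t_comp = c_cycles / (fc_mhz * 1e6)   # pure execution block, seconds
    t_mem = m_cycles / (fm_mhz * 1e6)    # memory fetch block, seconds
    e = t_comp + t_mem
    if e > period_s:
        raise ValueError("job does not finish within its period")
    # CPU active, bus/memory standby (assumption: same v for both terms).
    p_comp = K_CA * v ** 2 * fc_mhz + K_MS * v ** 2 * fm_mhz + R_MW
    # CPU standby, bus/memory active.
    p_mem = K_CS * v ** 2 * fc_mhz + K_MA * v ** 2 * fm_mhz + R_MW
    # Everything idle: idle power plus system static power (assumption).
    p_idle = I_MW + R_MW
    return p_comp * t_comp + p_mem * t_mem + p_idle * (period_s - e)


# Hypothetical job: 50M CPU cycles, 20M memory stall cycles, 1 s period.
for fc, fm in ((200, 100), (100, 100), (200, 50)):
    print(fc, fm, round(job_energy_mj(50e6, 20e6, 1.0, fc, fm), 2))
```

Because the voltage is held fixed here, the example shows only the effect of frequency on block durations and standby power; adding a frequency-dependent voltage would be needed to see the full DVFS tradeoff.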

Page 45: EECS 388: Embedded Systems - ITTC

Static MultiDVFS Problem

• Given a set of periodic real-time tasks (T1, …,Tn), where each task invocation requires up to Ci CPU cycles and up to Mi memory cycles at worst.

• Find the energy optimal static frequencies for multiple DVFS capable components (CPU and memory)

Page 46: EECS 388: Embedded Systems - ITTC

Problem Formulation

Minimize

$$\sum_{i=1}^{n} \frac{H}{P_i}\left(E_{comp,i} + E_{mem,i}\right) + E_{idle}$$

Subject to

$$\sum_{i=1}^{n} \frac{e_i}{P_i} \le 1$$

where

H : hyper period

e_i : execution time of task i

E_comp,i : computation block energy of task i

E_mem,i : cache stall block energy of task i

E_idle : idle block energy
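With only a handful of discrete frequency settings, this problem can be solved by exhaustive search. The sketch below tries every (CPU, memory) setting, discards those whose utilization exceeds 1, and keeps the one with the lowest energy per hyper period. The operating points, the voltage values, and the reading of the energy terms (computation block, memory block, idle block at power I + R, one voltage `v` for all dynamic terms) are assumptions made for illustration, not the paper's algorithm.

```python
from itertools import product
from math import lcm

K_CA, K_CS, K_MA, K_MS = 0.505, 0.224, 0.540, 0.210  # nF (slide coefficients)
I_MW, R_MW = 6.570, 67.434                           # mW (slide coefficients)

# Hypothetical operating points: (CPU MHz, volts) and memory MHz.
CPU_POINTS = [(100, 1.0), (150, 1.1), (200, 1.2)]
MEM_FREQS = [50, 100]


def evaluate(tasks, fc, v, fm):
    """Return (utilization, energy in mJ per hyper period).

    tasks: list of (C_i cycles, M_i cycles, period P_i in ms).
    """
    hyper_ms = lcm(*(p for _, _, p in tasks))
    p_comp = K_CA * v ** 2 * fc + K_MS * v ** 2 * fm + R_MW
    p_mem = K_CS * v ** 2 * fc + K_MA * v ** 2 * fm + R_MW
    util = energy = busy_s = 0.0
    for c, m, period_ms in tasks:
        t_comp, t_mem = c / (fc * 1e6), m / (fm * 1e6)
        jobs = hyper_ms // period_ms          # H / P_i
        util += (t_comp + t_mem) / (period_ms / 1000)
        energy += jobs * (p_comp * t_comp + p_mem * t_mem)
        busy_s += jobs * (t_comp + t_mem)
    energy += (I_MW + R_MW) * (hyper_ms / 1000 - busy_s)  # idle block
    return util, energy


def best_static_setting(tasks):
    """Return (energy, fc, fm) of the cheapest feasible setting, or None."""
    feasible = []
    for (fc, v), fm in product(CPU_POINTS, MEM_FREQS):
        util, energy = evaluate(tasks, fc, v, fm)
        if util <= 1.0:                       # sum of e_i / P_i <= 1
            feasible.append((energy, fc, fm))
    return min(feasible) if feasible else None


print(best_static_setting([(20e6, 10e6, 500), (10e6, 20e6, 1000)]))
```

Exhaustive search is practical only because the number of settings is tiny; the point of the example is the structure of the objective and the utilization constraint.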

Page 47: EECS 388: Embedded Systems - ITTC

Energy vs. Utilization

[Figure: normalized average power consumption vs. utilization (0.1 to 0.9) for a task set with cache stall ratio (MH/(CH+MH)) of 0.3. Series: MAX, CPU-only, and Static MultiDVFS]

Page 48: EECS 388: Embedded Systems - ITTC

Summary

• Memory-aware time/energy model

– Consider CPU and memory frequencies/voltages

– Validated on a real hardware platform

• MultiDVFS

– Joint optimization of CPU and memory frequencies/voltages

– Minimize energy consumption of periodic real-time tasks

Page 49: EECS 388: Embedded Systems - ITTC

Recap: First Attempt

• 1000 samples (minus the first sample. Why?)

CFS (nice=0)

Mean 23.8

Max 47.9

99pct 47.4

Min 20.7

Median 20.9

Stdev. 7.7

Why?

Page 50: EECS 388: Embedded Systems - ITTC

Recap: DVFS

• Dynamic voltage and frequency scaling (DVFS)

• Lower frequency/voltage saves power

• Vary clock speed depending on the load

• Cause timing variations

• Disabling DVFS

# echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
# echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
# echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
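The same setting can be applied to every core without listing each CPU by hand. The sketch below writes the governor name to each `scaling_governor` file it finds. It assumes the standard Linux cpufreq sysfs layout and root privileges; the function name and the `cpu_root` parameter are introduced only for this example.

```python
from pathlib import Path


def set_governor(governor="performance", cpu_root="/sys/devices/system/cpu"):
    """Write `governor` to every cpuN/cpufreq/scaling_governor under cpu_root.

    Returns the list of files that were written.
    """
    paths = sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"))
    for path in paths:
        # Mirrors `echo performance > .../scaling_governor`.
        path.write_text(governor + "\n")
    return [str(p) for p in paths]
```

Run as root, `set_governor("performance")` pins every core to its highest frequency, which removes DVFS as a source of timing variation at the cost of higher power.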

Page 51: EECS 388: Embedded Systems - ITTC

Recap: Energy Saving Strategies

• Model-based approach

– Offline: build an energy/performance model

– Online: compute an “optimal” assignment

• Heuristic approach

– Race to idle

– Never idle

– Adaptive control

Page 52: EECS 388: Embedded Systems - ITTC

POET: A Portable Approach to Minimizing Energy Under Soft Real-time Constraints

Connor Imes, David H. K. Kim, Martina Maggio, and Henry Hoffmann

University of Chicago and Lund University

IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2015

Page 53: EECS 388: Embedded Systems - ITTC

Systems

Page 54: EECS 388: Embedded Systems - ITTC

Configurations

• Per-application/per-platform, off-line profiling

Page 55: EECS 388: Embedded Systems - ITTC

Platform Variation

Connor Imes, David H. K. Kim, Martina Maggio, and Henry Hoffmann, “POET: A Portable Approach to Minimizing Energy Under Soft Real-time Constraints,” RTAS, 2015

Page 56: EECS 388: Embedded Systems - ITTC

POET Approach

• Control theory based

– (1) observe error (2) compute control (3) apply control

Connor Imes, David H. K. Kim, Martina Maggio, and Henry Hoffmann, “POET: A Portable Approach to Minimizing Energy Under Soft Real-time Constraints,” RTAS, 2015

Page 57: EECS 388: Embedded Systems - ITTC

Controller

• Goal: meet the speed target

• Observe error

• Compute control signal
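The observe/compute loop can be sketched with a simple integral controller. This is a generic textbook form, not necessarily POET's exact control law: the error between the target and measured job rates is scaled by an assumed base rate and accumulated into a speedup request, which is then clamped to what the platform can deliver. All parameter names and limits are made up for illustration.

```python
def next_speedup(prev_speedup, target_rate, measured_rate, base_rate,
                 lo=1.0, hi=4.0):
    """One controller step: observe the error, compute a new speedup request.

    base_rate is the job rate of the slowest configuration (speedup 1.0).
    """
    error = target_rate - measured_rate
    speedup = prev_speedup + error / base_rate
    return max(lo, min(hi, speedup))  # clamp to the achievable range


# Toy closed loop: assume the application runs at base_rate * speedup.
base_rate, target = 10.0, 25.0
speedup = 1.0
for step in range(4):
    measured = base_rate * speedup
    speedup = next_speedup(speedup, target, measured, base_rate)
    print(step, measured, speedup)
```

In this idealized loop the request settles as soon as the measured rate equals the target; on real hardware, measurement noise and model error make the request keep adjusting from job to job.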

Page 58: EECS 388: Embedded Systems - ITTC

Optimizer

• Given

– C configurations,

– measured speed s(t),

– time window tau

• Goal: minimize energy

– Subject to

• Meeting performance (#of jobs in a given time window tau)

• Sum of time spent on each setting = tau
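A small version of this optimization can be written as a search over pairs of configurations. The sketch below assumes each configuration has a known speed (jobs per second) and power (watts), and it requires the work done in the window to equal the target exactly. For every pair that brackets the target speed it splits the window tau between the two and keeps the cheapest split. The configuration table is hypothetical, and this is an illustrative formulation rather than POET's implementation.

```python
def min_energy_schedule(configs, target_speed, tau):
    """Pick times for two configurations that meet target_speed over tau.

    configs: list of (name, speed, power).
    Returns (energy, [(name, seconds), (name, seconds)]) or None if no pair
    of configurations can bracket the target speed.
    """
    best = None
    for lo_name, s_lo, p_lo in configs:
        for hi_name, s_hi, p_hi in configs:
            if not (s_lo <= target_speed <= s_hi):
                continue
            if s_hi == s_lo:
                t_hi = tau  # both speeds equal the target
            else:
                # Solve s_lo * t_lo + s_hi * t_hi = target * tau, t_lo + t_hi = tau.
                t_hi = tau * (target_speed - s_lo) / (s_hi - s_lo)
            t_lo = tau - t_hi
            energy = p_lo * t_lo + p_hi * t_hi
            if best is None or energy < best[0]:
                best = (energy, [(lo_name, t_lo), (hi_name, t_hi)])
    return best


# Hypothetical configurations: (name, jobs per second, watts).
CONFIGS = [("idle", 0.0, 1.0), ("low", 1.0, 2.0),
           ("mid", 2.0, 3.5), ("high", 3.0, 6.0)]

print(min_energy_schedule(CONFIGS, target_speed=1.5, tau=10.0))
```

Including an idle configuration in the table lets the same search express a race-to-idle schedule (run fast, then idle), so the heuristic falls out as one candidate among the pairs rather than a separate policy.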

Page 59: EECS 388: Embedded Systems - ITTC

Example Usage

• Apply to periodic tasks

– One control per job

• Heartbeat API

– Measure rate

Connor Imes, David H. K. Kim, Martina Maggio, and Henry Hoffmann, “POET: A Portable Approach to Minimizing Energy Under Soft Real-time Constraints,” RTAS, 2015

Page 60: EECS 388: Embedded Systems - ITTC

Results

Connor Imes, David H. K. Kim, Martina Maggio, and Henry Hoffmann, “POET: A Portable Approach to Minimizing Energy Under Soft Real-time Constraints,” RTAS, 2015

Page 61: EECS 388: Embedded Systems - ITTC

Summary

• Power/energy/speed relationship

– Model vs. practice

• Control options

– DVFS

– On/off

– Core heterogeneity

• Management approaches

– Model based

– Heuristic based

– Control theory based