Power Modelling and Characterisation for Data Centers Kristoffer Robin Stokke, PhD FLIR UAS


Power Modelling and Characterisation for Data Centers

Kristoffer Robin Stokke, PhD

FLIR UAS

Outline

• Introduction to datacenters in smart grid
• Energy consumption and modelling for Data Centers
• Power Saving Techniques
• Case Study: Power Modelling for the Tegra K1 & X1

30.08.2018 3

Energy Consumption of Data Centers

• 1.1 % to 1.5 % of total (world) annual electricity consumption (2010)
• Economic
– Energy costs ($) are increasing
• Environmental
– Greenhouse gas emissions
• Increasing demand for processing power & services
– TODO: Motivate with Cisco forecast if possible

[1] Info-Tech, “Top 10 energy-saving tips for a greener data center,” Info-Tech Research Group, London, ON, Canada, Apr. 2010.

[2] Dayarathna, Miyuru, Yonggang Wen, and Rui Fan. "Data center energy consumption modeling: A survey." IEEE Communications Surveys & Tutorials 18.1 (2016): 732-794.

[3] Koomey, Jonathan. "Growth in data center electricity use 2005 to 2010." A report by Analytical Press, completed at the request of The New York Times 9 (2011).

Figure: Example data center energy consumption breakdown [1]

What is Power and Energy?

• Power is the rate of energy consumption over time
• $P = \frac{\text{work (energy)}}{\text{time}}$, measured in watts = joules / second
• Which means that energy $E$ [Wh]..
• $E = \int_{t_1}^{t_2} P(t)\,dt$ -> area under curve
• Intuition: With a battery of 10 Wh (watt-hours)
• You can draw 10 W for one hour
• Or 5 W for two hours..

[Figure: power plotted over Time [s], with GPU kernel (program) launches and the time markers $t_0$ and $t_1$ annotated]
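The battery intuition above can be checked numerically. The following is a minimal sketch, assuming the power trace is available as two plain Python lists of timestamps and power samples; the function names are illustrative and do not come from the slides. It approximates the area under the power curve with the trapezoidal rule and converts joules to watt-hours (1 Wh = 3600 J).

```python
def energy_joules(times_s, power_w):
    """Trapezoidal estimate of E = integral of P(t) dt, in joules."""
    if len(times_s) != len(power_w):
        raise ValueError("need one power sample per timestamp")
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        # Area of the trapezoid between two consecutive samples.
        energy += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return energy


def joules_to_watt_hours(energy_j):
    """One watt-hour is 3600 joules (1 W drawn for 3600 s)."""
    return energy_j / 3600.0


# 10 W for one hour and 5 W for two hours drain the same 10 Wh battery.
one_hour = joules_to_watt_hours(energy_joules([0.0, 3600.0], [10.0, 10.0]))
two_hours = joules_to_watt_hours(energy_joules([0.0, 7200.0], [5.0, 5.0]))
```

With a real measurement the two lists would hold many samples, and the sampling rate then bounds how well short power spikes are captured.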

Smart Energy Systems and Data Centers

[1] Saad, Walid, et al. "Game-theoretic methods for the smart grid: An overview of microgrid systems, demand-side management, and smart grid communications." IEEE Signal Processing Magazine 29.5 (2012): 86-105.

[2] https://www.123rf.com/photo_52377624_stock-vector-renewable-energy-sources-vector-infographics-solar-wind-tidal-hydroelectric-geothermal-power-biofuel.html

[Figure: smart energy system with energy sources (hydroelectric, fossil / nuclear, solar, wind, geothermal), energy storage, several data centers, and the power distribution and metering infrastructure]

Solar Photovoltaic Panels

[Figure: PV panels, regulator / inverter, limited energy density storage, data center, consumers]

• Transient source of energy
• Limited energy storage, use now
• Predicting solar power usage
• Forecasting model
• Outputs irradiance $I$ [W/m²]
• PV power model
• Actual power output
• Challenge: Predicting availability
• Scheduling decisions
• Strategic planning for workloads
• Maintenance, off-line work

Figure: Solar power from a PV farm in Jutland, Denmark (2006), plotted over every day and time of day.

Wan, Can, et al. "Photovoltaic and solar power forecasting for smart grid energy management." CSEE Journal of Power and Energy Systems 1.4 (2015): 38-46.

Bacher, Peder, Henrik Madsen, and Henrik Aalborg Nielsen. "Online short-term solar power forecasting." Solar Energy 83.10 (2009): 1772-1783.

Solar Photovoltaic Panels (Cont.)

• Forecasting dependencies, ex:
• Solar irradiance $I$ [W/m²]
• PV cell temperatures $t_0$
• Cloud cover, humidity, wind
• PV dependencies, ex:
• Panel area $S$ [m²]
• Regulator efficiency $\alpha$, reflectivity..
• Prediction of solar irradiance
• Statistical, time series
• Neural networks
• Numerical Weather Prediction (NWP)

Forecasting Model -> output irradiance [W/m²] -> PV Power Model

Output PV power: $P_R = \alpha S I \, [1 - 0.05\,(t_0 - 25)]$
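The PV power model on this slide can be evaluated directly once a forecast irradiance is available. The sketch below is illustrative only: the function and parameter names are assumptions, and the temperature coefficient 0.05 and the 25 degree reference are taken as printed on the slide, so they should be checked against the cited paper before being used for real estimates.

```python
def pv_power_w(irradiance_w_m2, area_m2, efficiency, cell_temp_c,
               temp_coeff=0.05, ref_temp_c=25.0):
    """P_R = alpha * S * I * [1 - temp_coeff * (t0 - ref_temp)].

    temp_coeff and ref_temp_c default to the values printed on the slide.
    """
    derating = 1.0 - temp_coeff * (cell_temp_c - ref_temp_c)
    return efficiency * area_m2 * irradiance_w_m2 * derating


# Hypothetical panel: 10 m^2, efficiency 0.15, forecast irradiance 1000 W/m^2.
at_reference = pv_power_w(1000.0, 10.0, 0.15, 25.0)  # no temperature derating
warmer_cells = pv_power_w(1000.0, 10.0, 0.15, 27.0)  # cells 2 degrees warmer
```

Keeping the coefficient as a parameter makes it easy to swap in a value fitted to a specific installation.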

Renewable Energy in General

• Availability changes over time, with little energy storage
– Solar, wind, hydropower (wave)
• Hydropower (dam), vehicle-to-grid
– More latent storage of energy
• Prediction of availability involves..
– A prediction of weather (wind, temperature, humidity, irradiance)
– ..which is fed to a model mapping environment to power
• Alternatively, just a statistical model (no actual physics)
• Challenging weather conditions (climate change?)
– Norway’s summer 2018 marked by exceptionally long dry periods and warm weather.
– Meanwhile, it’s the wettest summer in Iceland.
– Opportunity for smart grid to use our resources smarter?

Figure: May 2018 marked the warmest weather ever measured in Oslo. Picture from the meteorological institute at Blindern. Source: Aftenposten.

Data Center Components

• Mostly power from grid
• Diesel, solar, wind, hydrogen..
• Cooling
• CRAC (Computer Room A/C)
• CRAH (Computer Room Air Handler)
• Redundancy (UPS)
• Racks (IT equipment)
• Processors
• Storage
• Network
• Mainboards
• Air cooling through vents
• Lighting

Dayarathna, Miyuru, Yonggang Wen, and Rui Fan. "Data center energy consumption modeling: A survey." IEEE Communications Surveys & Tutorials 18.1 (2016): 732-794.

Simula’s Data Center

• Thermal picture + count of different hardware
• Touch/Explain heterogeneity
• Touch/Explain difficulties measuring all of it

Data Centers: Overview

[Figure: layered view of a data center server. Physical machine: processors (CPU / GPU), storage (HDD / SSD), network / interconnect, power conversion, host OS. Virtual machine: virtual hardware (disk / CPU / NIC), guest OS, VM monitoring, service applications (Spotify, Dropbox, browsing). Users reach the services over the internet.]

• Users access datacenter services over the internet
• Requests handled by running applications
• Applications usually run in virtual environments
• Virtual CPU, GPU, disk, etc.
• Sand-box

Challenges for Data Centers in Smart Energy Systems

• Worldwide energy consumption, and the (climate) bill, are large and increasing
• Large cooling and processing energy consumption
• Macroscopic view: smart energy systems can help with these issues
• Buy and use cleaner / cheaper energy
• Partitioning / migrating tasks / virtual machines to other centers
• Delaying work until cleaner / cheaper energy is available
• Microscopic view: can we utilise the resources in a data center more efficiently?
• More energy-efficient processing and cooling
• Mandates understanding of energy consumption (models)
• Predicting energy availability (when / how much ± uncertainty)
• Understanding how the datacenter components consume energy
• Identify tradeoffs between performance and power usage

Managing Energy Consumption in Data Centers

Stages: Feature extraction, Model Construction, Prediction / Validation, Model Usage

• Real / simulated system
• Measure component power
• Identify important consumers
• Cannot measure all subcomponents (!)
• Need models for research (!)
• Build models for
• Power, user behaviour, weather
• Formal abstraction of system
• Power model directs optimisation approaches
• Scheduling, DVFS, task placement etc
• Validation, robustness, correctness
• Later, predicting

Heterogeneity

Measuring Power

• PDU
• Individual socket readout

J. Smith, A. Khajeh-Hosseini, J. Ward, and I. Sommerville, “Cloudmonitor: Profiling power usage,” in Proc. IEEE 5th CLOUD Comput., Jun. 2012, pp. 947–948.

https://www.rackmountmart.com

http://rsta.royalsocietypublishing.org/content/372/2018/20130278

• IBM POWER7 server

• Automated System for Temperature and Energy Reporting

• Integrated Lights-Out

• Remote server power monitoring

Power and Synchronisation

[1] http://mlab.no/blog/2015/08/a-peek-in-the-lab-tegra-k1-power-and-voltage-measurements/

[2] Rice, Andrew, and Simon Hay. "Decomposing power measurements for mobile devices." Pervasive Computing and Communications (PerCom), 2010 IEEE International Conference on. IEEE, 2010.

• Very few authors consider synchronisation
• The problem:
• «You» are the machine to be measured
• «Logging» is done externally
• There is latency between «your» events and the actual measurements of the effects of those events (timestamps)
• Causes
• Uneven time synchronisation between «you» and the «logger»
• Internal latencies in the measuring device
• Electrical capacitance smooths out current signature

[Figure: «You», «Logger», «Measurement»]
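One common way to compensate for the latency described above is to estimate the delay between the machine's own event trace and the externally logged power trace by cross-correlation. The slides do not say which method the author used, so the following is only a sketch under stated assumptions: both traces are sampled at the same rate, the logger can only lag behind the events (non-negative shift), and `estimate_lag` is a hypothetical helper name.

```python
def estimate_lag(expected, measured, max_lag):
    """Return the delay (in samples) of `measured` relative to `expected`
    that maximises their mean-removed cross-correlation."""
    mean_e = sum(expected) / len(expected)
    mean_m = sum(measured) / len(measured)
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = 0.0
        for i, e in enumerate(expected):
            j = i + lag
            if j < len(measured):
                score += (e - mean_e) * (measured[j] - mean_m)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic example: a load burst that the logger sees three samples late.
events = [0.0] * 5 + [1.0] * 5 + [0.0] * 10
logged = [0.0] * 8 + [1.0] * 5 + [0.0] * 7
lag = estimate_lag(events, logged, max_lag=6)
```

The estimated lag can then be subtracted from the logger timestamps before power samples are attributed to events. Capacitive smoothing of the current signature is not corrected by a pure shift.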

Rate-Based Power Models

• Power is correlated with utilisation levels (events per second) and summed
– E.g. rate at which instructions are executed, or rate of cache misses
– Limited by availability of measurement
• Challenge: what are good hardware activity predictors?
– Exercise in learning how hardware works

Xiao, Y. et. al., 2010. A System-Level Model for Runtime Power Estimation on Mobile Devices.

Dong, M. and Zhong, L., 2011. Self-Constructive High-Rate System Energy Modeling for Battery-Powered Mobile Systems.

S. Li et al., “The MCPAT framework for multicore and manycore architectures: Simultaneously modeling power, area, and timing,” ACM Trans. Archit. Code Optim., vol. 10, no. 1, pp. 5:1–5:29, Apr. 2013.

[1] T. Li and L. K. John, “Run-time modeling and estimation of operating system power consumption,” in Proc. ACM SIGMETRICS Int. Conf. Meas. Model. Comput. Syst., 2003, pp. 160–171.

$P_{tot} = \beta_0 + \sum_{i=1}^{N_p} \beta_i \rho_i$

$\beta_0$: constant base power
$\beta_i$: cost (W per event per second)
$\rho_i$: events per second

Figure: Example breakdown for OS functions. [1]

State-Based Power Models

• Abstracting hardware into set of states $S$
• Ex. CPU core: Off, Idle, Active
• Each state has a constant power draw $P_s$
• Often accompanied by transition costs
• Transitions also cost time

$E_{comp,S} = \sum_{s \in S} P_s T_s + \sum_{(u,v) \in S} C_{u,v}\, n_{u,v}$

$E_{comp,S}$: energy of component S
$T_s$: total time spent in state $s$
$C_{u,v}$: energy cost of transition from state u->v
$n_{u,v}$: number of transitions from u->v

[Figure: CPU core state diagram with the states Off, Idle and Active]

$P_{off} = 0$ W, $P_{act} = 600$ mW, $P_{idle} = 15$ mW, $C_{act \to idle} = 2$ nWh
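A state-based model of this kind is straightforward to evaluate once state residency times and transition counts are known. The sketch below uses the example CPU core numbers from the slide; the residency times, the transition count and all identifiers are hypothetical values chosen for illustration.

```python
NWH_TO_J = 3600e-9  # 1 nWh = 1e-9 Wh = 3.6e-6 J

# State powers from the slide, in watts.
STATE_POWER_W = {"off": 0.0, "idle": 0.015, "active": 0.600}
# Transition cost from the slide (2 nWh), converted to joules.
TRANSITION_COST_J = {("active", "idle"): 2 * NWH_TO_J}


def state_based_energy_j(state_power_w, time_in_state_s,
                         transition_cost_j, transition_count):
    """E = sum_s P_s * T_s + sum_(u,v) C_uv * n_uv, in joules."""
    residency = sum(state_power_w[s] * t for s, t in time_in_state_s.items())
    transitions = sum(transition_cost_j[uv] * n
                      for uv, n in transition_count.items())
    return residency + transitions


# Hypothetical trace: 10 s active, 50 s idle, five active->idle transitions.
energy_j = state_based_energy_j(
    STATE_POWER_W,
    {"active": 10.0, "idle": 50.0},
    TRANSITION_COST_J,
    {("active", "idle"): 5},
)
```

In this example the residency terms dominate; transition costs matter mainly when transitions are very frequent.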

Summary of Power Model Types

Regression is the «de facto» method to estimate model coefficients!
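As an illustration of how regression yields the coefficients of a rate-based model $P_{tot} = \beta_0 + \sum_i \beta_i \rho_i$, the sketch below fits them by ordinary least squares using the normal equations. It is written in plain Python so that it is self-contained; in practice a numerical library would normally be used. The function name and the sample data are assumptions made for this example, and the rates are expressed in scaled units to keep the system well conditioned.

```python
def fit_rate_model(rates, power_w):
    """Least-squares fit of [beta_0, beta_1, ..., beta_Np].

    rates:   one list of N_p event rates per measurement
    power_w: measured power for each measurement
    """
    rows = [[1.0] + list(r) for r in rates]  # leading 1 models base power
    n = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    a = [[sum(row[i] * row[j] for row in rows) for j in range(n)]
         for i in range(n)]
    b = [sum(row[i] * y for row, y in zip(rows, power_w)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("predictors are not linearly independent")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    beta = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (b[i] - tail) / a[i][i]
    return beta


# Synthetic measurements generated from beta = [0.5, 0.2, 1.5].
sample_rates = [[1.0, 0.1], [2.0, 0.1], [1.0, 0.3], [3.0, 0.2]]
sample_power = [0.85, 1.05, 1.15, 1.40]
beta = fit_rate_model(sample_rates, sample_power)
```

The fit only works when the measurements vary the predictors independently of each other, which is why a diverse set of benchmarks matters for model training.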

Central / Graphical Processing Units

Figure: Generic CPU multicore architecture.

Typical predictors $\rho_i$ (access rates):
• Instructions
• Floating point, integer, branch
• Cache, L1+L2(+L3)
• Cache references
• Cache misses
• Texture (GPU)
• RAM (GPU-integrated)
• Dedicated counters
• Clock frequency
• Usually synchronous
• Some predictors indirectly «measure» off-chip energy consumption!

Rate-based model: $P_{tot} = \beta_0 + \sum_{i=1}^{N_p} \beta_i \rho_i$

DRAM Power : Typical Rate-Based Models

$P_{ddr} = P_{static} + \alpha_{read} u_{read} + \alpha_{write} u_{write}$

$P_{ddr} = D S R \sigma + E_{rw} \rho_{rw} + D E_{ap} f_{ap}$

Annotated terms: static power ($P_{static}$); read / write throughput; read/write energy per bit ($E_{rw}$); energy needed to activate / precharge a row ($E_{ap}$)

Other predictors
• Refresh cycles
• Idle & active power
• What about clock frequency?

J. Lin, H. Zheng, Z. Zhu, H. David, and Z. Zhang, “Thermal modeling and management of DRAM memory systems,” SIGARCH Comput. Archit. News, vol. 35, no. 2, pp. 312–322, Jun. 2007.

N. Vijaykrishnan, M. Kandemir, M. J. Irwin, H. S. Kim, and W. Ye, “Energy-driven integrated hardware-software optimizations using simplepower,” SIGARCH Comput. Archit. News, vol. 28, no. 2, pp. 95–106, May 2000.

Rotating Disks

Power of rotating disk (annotated terms: angular velocity, radius, platters)

Read energy ∝ 𝐿3 (logical block number)

Rate-based model (active, seek, idle), kind of state-based (annotated terms: requests N, seeks N, time in idle I)

Figure: Rotating HDD states and transitions.

Solid State Drives

Figure: SSD SLC (Single-Level Cell) state diagram.
Figure: SSD MLC (Multi-Level Cell) state diagram.

• Transistor-based
• SLC and MLC
• Same types as found in embedded (eMMCs)
• Rate-based model predictors
• Transition costs
• Program <-> Read
• Program <-> Write
• Utilisation
• Programming sector
• Reading sector
• Erasing sector

Network Switches (Ethernet)

[Figure: network graph of switches (∈ V) and nodes N joined by links (∈ E)]

State-based model (!) for $P_{net}$ (annotated terms: cost of active link (u,v) [W], cost of active switch (u) [W], link / switch active?)

Rate-based model for per-bit-energy in switch network (annotated terms: bit-processing energy in stage i, bit-processing energy in output of stage i, bit-processing energy in access network)

Routers / Network Interfaces

Figure: Router / switch architecture (control plane, data plane, environmental plane)

• Conventional (ethernet)
• $P_{eth} = T_{idle} P_{idle} + (T_{active} P_{idle}) \rho$
• Router
• $P_{router} = P_{base} + E_{pkt} R_{pkt} + E_{sf} R_{byte}$
• $P_{base} = P_{ctrl} + P_{env} + P_{data}$
• $E_{pkt}$: processing energy per packet
• $E_{sf}$: per-byte store and forward energy

Power Distribution Loss

$P_{ups/pdu} = P_{idle} + \alpha P_{delivered}$

$P_{delivered}$: power delivered by UPS or PDU to consumers
$\alpha$: inefficiency constant

[Figure: Power Delivery]

According to this, the only way to reduce loss in a UPS or PDU is to deliver less power.
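The linear loss model can be made concrete with a few lines of code. This is a sketch with made-up numbers: the idle loss of 50 W and the inefficiency constant of 0.05 are hypothetical, not values from the slides.

```python
def distribution_loss_w(delivered_w, idle_w, alpha):
    """Loss in a UPS or PDU: P = P_idle + alpha * P_delivered."""
    return idle_w + alpha * delivered_w


# Hypothetical unit: 50 W idle loss, inefficiency constant 0.05.
loss_full = distribution_loss_w(1000.0, idle_w=50.0, alpha=0.05)
loss_half = distribution_loss_w(500.0, idle_w=50.0, alpha=0.05)
```

Halving the delivered power lowers the loss but does not halve it, because the idle term is paid regardless of load.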

An Example of Cooling Failure

• Picture of processor temperature

Cooling Systems - Inefficiency

$P_{ac} = \frac{P_{tot}}{CoP(T_{sup})}$

$P_{tot}$: computing power
$P_{ac}$: Computer Room A/C (CRAC) power

$CoP = \frac{Q}{W}$

$Q$: heat removed [W]
$W$: work required [W]

[Figure: plots over the supplied temperature $T_{sup}$ (in example: HP Labs CRAC units)]

The higher the supplied temperature, the worse the energy performance of the A/C unit

Tang, Qinghui, Sandeep Kumar S. Gupta, and Georgios Varsamopoulos. "Energy-efficient thermal-aware task scheduling for homogeneous high-performance computing data centers: A cyber-physical approach." IEEE Transactions on Parallel and Distributed Systems 19.11 (2008): 1458-1472.

Minimising Contribution to Heat Buildup

• n chassis
• Each chassis contributes to heating up inlet temperatures
• Goal
• Reschedule VMs on machines
• Minimise total contribution to inlet temperature
• Model for each chassis’ contribution to inlet temperature

Annotated terms of the model: inlet temperature; supplied temperature $t_{sup}$; $D$ [degrees / watt] (unknown); $a$: active (100 %) processor power dissipation; $b$: chassis idle power dissipation; $c$: task distribution vector

Hybrid Cooling

• Hybrid cooling (CRAH)
• Chiller active
• Free cooling
• Challenge
• Dynamically select approach
• Few cooling transitions
• Dynamically position VMs

Figure: Cooling power breakdown (model) [1]

[1] S. Ghosh, S. Chandrasekaran, and B. Chapman, “Statistical modeling of power/energy of scientific kernels on a multi-gpu system,” in Proc. IGCC, Jun. 2013, pp. 1–6.

Dynamic Shutdown of Servers

• Core idea
• Turn off physical machines
• Eliminate static / dynamic power
• VMs migrated between PMs
• Threshold value controls decision
• Can impair QoS and violate SLA
• Question:
• Is it a good idea to impose a lot of processing on a single PM?
• (contention, voltage scaling, locking / other resource usage, disk etc)

[Figure: CPU Util [%] with an upper threshold and a lower threshold]
• Above the upper threshold: too high utilisation. VMs will get migrated to other PMs.
• Below the lower threshold: too low utilisation. About to lose all its VMs and get powered off.

Beloglazov, Anton, and Rajkumar Buyya. "Adaptive threshold-based approach for energy-efficient consolidation of virtual machines in cloud data centers." MGC@ Middleware. 2010.
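The threshold rule on this slide can be sketched as a small decision function. This is a simplified illustration with fixed thresholds, not the adaptive scheme from the cited paper; the threshold values and the action names are hypothetical.

```python
def consolidation_action(cpu_util_percent, lower=30.0, upper=80.0):
    """Decide what to do with a physical machine (PM) based on CPU utilisation."""
    if cpu_util_percent > upper:
        # Overloaded: move some VMs to other PMs to protect QoS.
        return "migrate_some_vms_away"
    if cpu_util_percent < lower:
        # Underloaded: move all VMs away so the PM can be powered off.
        return "migrate_all_vms_and_power_off"
    return "keep"


actions = {util: consolidation_action(util) for util in (10.0, 55.0, 95.0)}
```

A real policy would also have to pick target machines and account for the cost of each migration, which is where the question on the slide about overloading a single PM comes in.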

VM Provisioning in Perspective

• Re-allocate VMs to..
• ..turn off physical machines.
• ..minimise traffic overhead, and avoid hotspots.
• ..minimise contribution to heat buildup.
• Predict future workload
• Validation methodology (usually simulations)
• PlanetLab
• CloudSim
• GreenCloud

Kliazovich, Dzmitry, Pascal Bouvry, and Samee Ullah Khan. "DENS: data center energy-efficient network-aware scheduling." Cluster computing 16.1 (2013): 65-75.

Liu, Haikun, et al. "Performance and energy modeling for live migration of virtual machines." Proceedings of the 20th international symposium on High performance distributed computing. ACM, 2011.

Maximise traffic overhead

Maximise contribution to heat

Prevent nodes from shutting off

They actually don’t work together at all

High-Precision Power Modelling for the Tegra K1 & X1 SoCs

Figure: Jetson TX2 / TX1 blade server. System-on-Module (SoMs) mounted in a 1U server.

http://connecttech.com/product/jetson-tx2-tx1-array-server/

Tegra K1/X1: Heterogeneous Multicore 28 nm SoC

• Tegra family of mobile Systems-on-Chip (SoC), < 12 W power usage
• (Tegra 2, 3, 4..)
• Tegra K1 & Tegra X1
• Programmable GPU (CUDA)
• Power management capabilities

                          Tegra K1              Tegra X1
CPU (High Performance)    4 x ARM Cortex-A15    4 x ARM Cortex-A57
CPU (Low Power)           1 x ARM Cortex-A15    4 x ARM Cortex-A53
GPU                       192-Core Kepler       256-Core Maxwell
Memory                    2 GB (Jetson-TK1)     4 GB (Jetson-TX1)

CMOS Devices : Static and Dynamic Power

$P_{rail} = P_{stat} + P_{dyn}$

$P_{stat} = V_{rail} I_{leak}$, $P_{dyn} = \alpha C V_{rail}^2 f$

$I_{leak}$: transistor leakage
$\alpha C$: capacitive load per cycle
$f$: cycles per second

• Power on a rail can be described using the standard CMOS equations
• Rail voltage $V_{rail}$
• Increases with clock frequency
• Total power
• ..is the sum of power of all rails

Figure: Tegra K1 SoC power distribution. (FYI: billions of CMOS transistors)
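The standard CMOS equations on this slide translate directly into code. The sketch below is illustrative: the function names are assumptions, and the leakage current, switched capacitance and rail values are made up rather than measured Tegra figures.

```python
def rail_power_w(v_rail, i_leak_a, switched_capacitance_f, freq_hz):
    """P_rail = V * I_leak + (alpha * C) * V^2 * f.

    switched_capacitance_f stands for the product alpha * C,
    the capacitive load switched per clock cycle.
    """
    p_stat = v_rail * i_leak_a
    p_dyn = switched_capacitance_f * v_rail ** 2 * freq_hz
    return p_stat + p_dyn


def total_power_w(rails):
    """Total power is the sum of the power of all rails."""
    return sum(rail_power_w(**rail) for rail in rails)


# Two hypothetical rails (all values invented for the example).
rails = [
    {"v_rail": 1.0, "i_leak_a": 0.1, "switched_capacitance_f": 1e-9, "freq_hz": 1e9},
    {"v_rail": 0.8, "i_leak_a": 0.05, "switched_capacitance_f": 5e-10, "freq_hz": 5e8},
]
total = total_power_w(rails)
```

Because the rail voltage usually has to rise with clock frequency, the dynamic term grows faster than linearly with frequency in practice.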

Clock Frequency and Rail Voltage

• Clock frequency, rail voltage and power usage are deeply coupled
• For certain clocks...
• ...increasing clock frequency increases voltage, and vice versa
• From previous slide: power ∝ $V^2$

$P_{stat} = V_{rail} I_{leak}$, $P_{dyn} = \alpha C V_{rail}^2 f$

[Figures: Measured Average Power; Measured GPU Rail Voltage]

Building High-Precision Power Models

• Main innovation
– Express switching activity in terms of measurable hardware activity
– Consider voltages on all rails
– Consider core- and rail-gating
• What constitutes good hardware activity predictors?
– $\rho_{R,i}$ can be cache misses, cache writebacks, instructions, cycles..
– Should cover all hardware activity on a rail

$P_{rail} = V_{rail} I_{leak} + \sum_{i=1}^{N_R} C_{R,i}\,\rho_{R,i}\,V_R^2$

$N_R$: number of utilisation predictors on rail R
$C_{R,i}$: capacitive load per event per second
$\rho_{R,i}$: hardware utilisation predictor (events per second)
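To show how this rail model combines measured voltage with hardware event rates, here is a minimal sketch. The predictor names, capacitance coefficients and rates are hypothetical; on real hardware the coefficients would come from regression over benchmark measurements.

```python
def predictor_rail_power_w(v_rail, i_leak_a, capacitance, rates):
    """P_rail = V * I_leak + sum_i C_i * rho_i * V^2.

    capacitance: capacitive load per predictor
    rates:       measured events per second per predictor
    """
    dynamic = sum(capacitance[name] * rate for name, rate in rates.items())
    return v_rail * i_leak_a + dynamic * v_rail ** 2


# Hypothetical coefficients and one sample of hardware counters.
capacitance = {"cycles": 1e-10, "l2_misses": 2e-9}
rates = {"cycles": 1e9, "l2_misses": 1e7}
power = predictor_rail_power_w(0.9, 0.2, capacitance, rates)
```

Keeping the voltage explicit lets the same coefficients be reused across frequency and voltage settings instead of fitting a separate model per operating point.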

Model Predictors – Overview
* utilisation units in [events per second]

[Figure: block diagram of the SoC showing the LP core and four HP cores (each with 32K L1 data and instruction caches), L2 caches, the GPU with its L1 and L2 caches, the (external) memory controller and DDR3 RAM, annotated with the model predictors.]

Predictors shown in the figure: rail voltage, rail gating, core gating, clock cycles (per second), active clock cycles, L1 – L2 cache traffic, L2 – RAM cache traffic, cache reads, cache r/w, global instructions, and instructions by type:
• Integer
• Single-precision floating point
• Double-precision floating point
• Conversion
• Control
• Misc

Model Training Methodology

[Figure: matrix of benchmarks against the components under explicit stress. Component columns, grouped under CPU and GPU: RAM, L1-L2, L2-RAM, INT, FPU, NEON, L2, L1, INT, F32, F64, CNV, MISC.]
• CPU benchmarks: Idle CPU, L1 wb, L1 refill, Mmul-int, Mmul-f32, Mmul-neon
• GPU benchmarks: L2 read, L1 read, L1 write, RAM, Integer, Single-precision, Double-precision, Conversion, Misc

For each hardware config
  For each mem-frequency
    For each [C/G]PU-frequency
      • run (benchmark)
      • Sample voltages
      • Sample predictors
      • Sample power
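The nested sweep above could be driven by a small harness like the following. This is a sketch only: `run_and_sample` is a hypothetical callback that would configure the platform, run one benchmark and return the sampled voltages, predictors and power. The slides do not describe the actual tooling.

```python
from itertools import product


def collect_training_samples(benchmarks, configs, mem_freqs, pu_freqs,
                             run_and_sample):
    """Run every benchmark at every operating point and record the samples."""
    samples = []
    for bench, config, mem_f, pu_f in product(benchmarks, configs,
                                              mem_freqs, pu_freqs):
        voltages, predictors, power_w = run_and_sample(config, mem_f, pu_f, bench)
        samples.append({
            "benchmark": bench,
            "config": config,
            "mem_freq": mem_f,
            "pu_freq": pu_f,
            "voltages": voltages,
            "predictors": predictors,
            "power_w": power_w,
        })
    return samples


def fake_run_and_sample(config, mem_f, pu_f, bench):
    """Stand-in for real measurements, used only to exercise the sweep."""
    return {"cpu": 0.9}, {"cycles": float(pu_f)}, 1.0


demo = collect_training_samples(
    ["idle", "mmul-int"], ["1 core", "4 cores"], [204, 924],
    [500, 1000, 2000], fake_run_and_sample,
)
```

The number of runs is the product of all list lengths, which is why the full sweep takes a long time.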

• Advantages
– Ensures diversity in model predictors (hardware access rates)
• Very helpful for regression
– Triggers changes in voltages across the platform
– Helps vary hardware utilisation and model predictors
• Disadvantages
– Takes a long time to run
• X CPU configurations
• Y GPU configurations

Model Coefficient Comparison

[Figure panels: Leakage Currents; CPU Dynamic Power; CPU Instruction Power; GPU Dynamic Power]

Leakage in CMOS: Subthreshold leakage (Tegra X1)

• Subthreshold leakage
• Transistor off-state leakage
• The smaller gate width W..
• The more significant it is
• Current flows from source to drain..
• Temperature-dependent ($V_0$)

[Figure: transistor with source, drain and gate]

Annotated terms of the leakage equation: thermal voltage given by temp T; transistor supply voltage; transistor threshold voltage; number of transistors * gate width

Figure: Tegra X1 power versus average SoC temperature and GPU voltages. Decent indication of temperature-dependent leakage.

Leakage in CMOS: Gate-Oxide Leakage (Tegra X1)

• Gate-Oxide leakage
• Transistor off-state leakage
• The smaller gate width W..
• The more significant it is
• Quantum tunnelling
• Current flows through dielectric layer
• S-D channel -> gate

[Figure: transistor with source, drain and gate]

Annotated terms of the leakage equation: transistor supply voltage; oxide thickness

Figure: Power usage over voltages and different average SoC temperature ranges.

A Subthreshold Leakage Model for the X1

Thermal voltage coefficient $\alpha$               $3.01 \cdot 10^{-3}$
$N_{trans} \cdot K_{width}$ (Core rail)           11.20
$N_{trans} \cdot K_{width}$ (GPU rail)            22.31
$N_{trans} \cdot K_{width}$ (Per-CPU core)        5.44

Figure: Estimated and measured GPU (idle) power usage.
Figure: Example leakage over temperature and voltage ranges.

• Modelled using non-linear least squares solver in Python
• Validated using dedicated power measurement sensors on board

Video Processing Filters

• Drone processing live video stream
– Debarreling
– Frame rotation
– Motion vector search
– Compression (DCT)
– Quantisation
– Entropy encoding
• Goal: divide workloads between cores
– To achieve high energy efficiency

[Figure: filter pipeline with the labels frame stream, 60 FPS, «Shaky video», debarrel filter, rotation filter]

Performance Per Watt (PPW)

Workloads: SIFT, BLAS, face recognition and tracking

[Figure: System-on-Chip with Processor A (CPU) and Processor B (GPU, DSP, ...)]

• Tegra 2
• Tegra 250
• Tegra 3
• Samsung S4
• Samsung Note II
• Nexus 7
• OMAP 3530
• Reported
• Increased performance
• Increased performance-per-watt
• Performance Per Watt (PPW)
• E.g. frames per second per watt
• Common methodology:
• Measure performance and power
• Test duration: until done (!)

Example: DCT

[Figure: a DCT frame of 8x8 macroblocks split between the ARM Cortex A15 (100 % CPU) and the NVIDIA GK20A (100 % GPU), with 30 % to GPU marked; PPW plot with the point "Highest PPW (or is it?)"]

• Process 80 DCT frames
• 1920x1080
• Can offload macroblocks
• CPU GPU
• Fixed
• Number of CPU cores
• Frequencies
• Test runs until 80 frames processed
• Power measured for this duration

Example: DCT

[Figure: Detailed power component breakdown, including base power]

• 10 % offloading seems better (PPW) because:
• At 10 % offloading, benchmarks finish earlier -> less idle (unavoidable) energy
• 100 % GPU offloading is now best
• Less instruction energy
• Always run energy benchmarks for the same duration
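The effect of the test duration on the comparison can be reproduced with invented numbers. In the sketch below, all powers and runtimes are hypothetical; the point is only that measuring "until done" omits the idle energy that a faster configuration still spends while waiting for the end of a fixed time window.

```python
def energy_until_done_j(active_power_w, runtime_s):
    """Energy when the measurement stops as soon as the work is finished."""
    return active_power_w * runtime_s


def energy_fixed_window_j(active_power_w, runtime_s, idle_power_w, window_s):
    """Energy over a fixed window: active phase plus idle time until the end."""
    if runtime_s > window_s:
        raise ValueError("window must cover the whole run")
    return active_power_w * runtime_s + idle_power_w * (window_s - runtime_s)


# Hypothetical configurations processing the same 80 frames.
IDLE_W, WINDOW_S = 4.0, 10.0
offload_10 = {"active_power_w": 6.0, "runtime_s": 8.0}    # finishes early
offload_100 = {"active_power_w": 5.5, "runtime_s": 10.0}  # slower, lower power

until_done = {
    "10 %": energy_until_done_j(**offload_10),
    "100 %": energy_until_done_j(**offload_100),
}
fixed_window = {
    "10 %": energy_fixed_window_j(idle_power_w=IDLE_W, window_s=WINDOW_S, **offload_10),
    "100 %": energy_fixed_window_j(idle_power_w=IDLE_W, window_s=WINDOW_S, **offload_100),
}
```

With these numbers the ranking flips once both configurations are charged for the same duration, which mirrors the recommendation to always run energy benchmarks for the same duration.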

Single Filter: DCT Offloading

• Offloading 10 % to CPU
• Reduced GPU clock power
• Reduced GPU voltage
• Increased CPU clock and instruction power
• 5 % energy saving
• (only)

[Figure annotation: Lower frequencies]

Multiple Filter: DCT Offloading

• CPU busy
• Feature search
• Huffman encoding
• No saving possible
• CPU frequency: 1.8 GHz
• CPU voltage: 0.94 V (0.82 V in idle)
• All instructions & cycles cost more!

$P_{cpu} \propto V_{cpu}^2$
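The two voltages quoted on this slide give a feel for how much more each instruction and cycle costs. The sketch below applies the proportionality $P_{cpu} \propto V_{cpu}^2$ at an unchanged event rate; the function name is illustrative.

```python
def dynamic_power_ratio(v_new, v_old):
    """Ratio of dynamic power at two rail voltages for the same activity."""
    return (v_new / v_old) ** 2


# 0.94 V when busy at 1.8 GHz versus 0.82 V in idle (values from the slide).
ratio = dynamic_power_ratio(0.94, 0.82)
```

The ratio is about 1.31, so the same switching activity costs roughly 31 % more dynamic power at the higher voltage.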

The Right Core for the Right Job

• Better to select a single core for a job
• Offloading beneficial in very constrained and specific cases
• Filters on the right process full-HD video at 25 FPS
• Implementations have an affinity with different cores
• Tightly coupled with performance
• Better performance; fewer cycles

(25 FPS QoS requirement)