Power Awareness through Selective Dynamically Optimized Traces Roni Rosner, Yoav Almog, Micha Moffie, Naftali Schwartz and Avi Mendelson – Intel Labs, Haifa, Israel Presenter: Ioana Burcea


Agenda

Motivation for PARROT = Power-Aware aRchitecture Running Optimized Traces
PARROT Concept and Architecture
Performance and Energy Results
Discussion
– What makes PARROT a power-aware architecture?
– What is new about this paper? / What are the contributions of this paper?

Motivation

We pay more energy per task
– Poor scaling of performance with power consumption

PARROT tries to change the balance
– Filtering Techniques to Improve Trace-Cache Efficiency (PACT 2001)
– Selecting Long Atomic Traces for High Coverage (ICS 2003)
– Specialized Dynamic Optimizations for High-Performance Energy-Efficient Microarchitecture (CGO 2004)

PARROT Concepts – The Big Picture

Based on the well-known cold/hot (10/90) paradigm

PARROT Principles
– Reuse: trace-cache centric
– Dynamic optimizations: more performance with less energy
– Focus: invest where it pays
– Pipeline decoupling: hybrid front-end, cold and hot execution pipelines
– Transparency: immune to s/w compatibility

Traces and Trace Selection

Decoded atomic traces
– Complex retirement & recovery in case of misprediction
– More aggressive optimizations

Trace Selection – deterministic criteria
– Capacity limitation: 64 uops
– Complete basic blocks
– Terminating CTI (control-transfer instructions): indirect jumps, software exceptions, backward taken branches
– Return instructions: procedure inlining
– Trace join
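
The deterministic selection criteria above can be sketched in a few lines. This is only an illustration under stated assumptions, not the PARROT hardware algorithm: the `BasicBlock` type, the CTI kind strings, and `select_traces` are hypothetical names, and the return-inlining and trace-join rules are left out. A block larger than the capacity simply gets a trace of its own here, which is a simplification.

```python
from dataclasses import dataclass
from typing import List

MAX_UOPS = 64  # capacity limit stated on the slide

# CTI kinds that always close a trace, per the slide's list.
# The string labels themselves are hypothetical.
TERMINATING_CTI = {"indirect_jump", "software_exception", "backward_taken_branch"}


@dataclass
class BasicBlock:
    uops: int  # number of decoded uops in the block
    cti: str   # kind of control-transfer instruction that ends the block


def select_traces(blocks: List[BasicBlock]) -> List[List[BasicBlock]]:
    """Split a dynamic stream of basic blocks into traces."""
    traces: List[List[BasicBlock]] = []
    current: List[BasicBlock] = []
    used = 0
    for bb in blocks:
        # Only complete basic blocks are admitted, so close the current
        # trace first if this block would overflow the capacity.
        if current and used + bb.uops > MAX_UOPS:
            traces.append(current)
            current, used = [], 0
        current.append(bb)
        used += bb.uops
        # A terminating CTI ends the trace right after its block.
        if bb.cti in TERMINATING_CTI:
            traces.append(current)
            current, used = [], 0
    if current:
        traces.append(current)
    return traces


stream = [
    BasicBlock(30, "forward_branch"),
    BasicBlock(30, "forward_branch"),
    BasicBlock(10, "forward_branch"),
    BasicBlock(5, "backward_taken_branch"),
    BasicBlock(8, "indirect_jump"),
]
trace_sizes = [sum(bb.uops for bb in t) for t in select_traces(stream)]
```

Because the rules depend only on the instruction stream, the same path through the code always produces the same trace, which is the point of calling the criteria deterministic.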

Microarchitecture

Split-execution vs. unified-execution
– Foreground phase: fetch-to-execution pipeline
– Background phase (post-processing): trace selection and optimization

Microarchitecture (cont’d)

• Two predictors (GHR = Global History Register)
  • Branch predictor
  • Trace predictor
• Deterministic trace build scheme
• Filtering mechanisms:
  • The hot filter selects frequent traces from those executed on the cold pipeline
  • The blazing filter selects the hottest traces for optimization
• Dynamic optimizations
  • Generic and core-specific optimizations
  • Gradually applied (?)
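
One way to picture the two filtering stages is as a pair of execution counters with thresholds. The sketch below is an assumption-laden illustration rather than the paper's mechanism: the class `TraceFilters`, both threshold values, and the unbounded Python sets standing in for finite hardware tables are all hypothetical.

```python
from collections import Counter

HOT_THRESHOLD = 8       # hypothetical value, not taken from the slides
BLAZING_THRESHOLD = 64  # hypothetical value, not taken from the slides


class TraceFilters:
    """Two-stage filter: cold -> hot (trace cache) -> blazing (optimizer)."""

    def __init__(self, hot_threshold=HOT_THRESHOLD,
                 blazing_threshold=BLAZING_THRESHOLD):
        self.hot_threshold = hot_threshold
        self.blazing_threshold = blazing_threshold
        self.cold_counts = Counter()  # executions seen on the cold pipeline
        self.hot_counts = Counter()   # executions served from the trace cache
        self.trace_cache = set()      # traces promoted by the hot filter
        self.optimized = set()        # traces promoted by the blazing filter

    def execute(self, trace_id) -> str:
        """Record one execution and return which pipeline served it."""
        if trace_id in self.trace_cache:
            self.hot_counts[trace_id] += 1
            # Blazing filter: only the hottest traces reach the optimizer.
            if self.hot_counts[trace_id] >= self.blazing_threshold:
                self.optimized.add(trace_id)
            return "hot"
        self.cold_counts[trace_id] += 1
        # Hot filter: frequent cold traces are inserted into the trace cache.
        if self.cold_counts[trace_id] >= self.hot_threshold:
            self.trace_cache.add(trace_id)
        return "cold"


filters = TraceFilters(hot_threshold=2, blazing_threshold=3)
served = [filters.execute("A") for _ in range(5)]
```

With these small thresholds, trace "A" runs cold twice, is then served from the trace cache, and is handed to the optimizer after its third hot execution. The staging reflects the "invest where it pays" principle: optimization effort is spent only on traces that have already proven to be frequent.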

Simulation framework

An “in-house” proprietary performance and power simulator

Optimizations applied as different passes
– Optimization delay for one trace: ~100 cycles

Energy simulation
– Power consumption matrix for each operation on each hardware unit
– Leakage: assumed uniform in space (over the processor core and L2 cache) and in time, modeling a high temperature
   LE = PMAX * (0.05 * M + 0.4 * K) * CYC
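
The leakage formula is simple enough to evaluate directly. The sketch below just transcribes it; the slide does not define M and K or give units, so they are passed through as opaque parameters, and the function name and sample values are hypothetical.

```python
def leakage_energy(p_max: float, m: float, k: float, cycles: int) -> float:
    """LE = PMAX * (0.05 * M + 0.4 * K) * CYC, as written on the slide.

    m and k are the slide's M and K; their meaning is not stated there.
    """
    return p_max * (0.05 * m + 0.4 * k) * cycles


# Illustrative values only, chosen to make the arithmetic easy to follow.
example = leakage_energy(p_max=1.0, m=1.0, k=1.0, cycles=100)
```

Since the expression is linear in CYC, any configuration that finishes the same work in fewer cycles also pays proportionally less leakage energy under this model.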

Configuration Space

Experimental Evaluation

Metrics
– IPC
– Total energy
– Cubic-MIPS-per-WATT (CMPW): a measure of the design tradeoffs between power and performance

Benchmarks
– SpecInt2000
– SpecFP2000
– Office
– Multimedia
– DotNet
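
The slide only names the CMPW metric. Reading "cubic MIPS per watt" in the usual way, as MIPS cubed divided by watts, a small worked example shows how it weighs performance against power. The function names and the sample numbers are hypothetical.

```python
def cmpw(mips: float, watts: float) -> float:
    """Cubic-MIPS-per-Watt, assuming the usual reading MIPS**3 / W."""
    return mips ** 3 / watts


def relative_cmpw(speedup: float, power_ratio: float) -> float:
    """CMPW of a new design relative to a baseline.

    speedup     = new performance / baseline performance
    power_ratio = new power / baseline power
    """
    return speedup ** 3 / power_ratio


# A design that is 10% faster but draws 20% more power still improves CMPW,
# because performance enters cubed: 1.1**3 / 1.2 is roughly 1.11.
gain = relative_cmpw(1.10, 1.20)
```

Under this reading the metric favors performance heavily, so a design can spend some extra power and still score better as long as the speedup is large enough.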

Performance and Power Awareness

Extreme Microarchitectural Alternatives

Hot Code Predictability

Trace-cache Fetch Coverage

Optimizer Capabilities

Energy Breakdown

Their Conclusions…

Our Conclusions

What makes PARROT a power-aware architecture?

What is new about this paper? / What are the contributions of this paper?
– rePlay (?)