
AMNESIAC: Amnesic Automatic Computer
Trading Computation for Communication for Energy Efficiency

Ismail Akturk and Ulya R. Karpuzcu
University of Minnesota, Twin Cities

{aktur002, ukarpuzc}@umn.edu


Motivation

[Figure: energy of on-chip communication normalized to computation, by technology node: 1.55x at 40nm, 5.77x at 10nm.]

Keckler, S. W., Dally, W. J., Khailany, B., Garland, M., and Glasco, D. “GPUs and the Future of Parallel Computing”, IEEE Micro 31, 5 (2011).


(Re)computing becomes cheaper than memorizing and retrieving data


Amnesic* Automatic Computer

• Idea: Replace energy-hungry load with sequence of instructions

*amnesia: noun, a partial or total loss of memory (origin late 18th cent.: from Greek amnēsia ‘forgetfulness’)

• Scope
  • How to determine which data values to recompute
  • How to identify instructions to recompute data values
  • How to protect architectural state during recomputation
  • How to orchestrate recomputation at runtime


How to determine which data values to recompute

[Figure: three copies of the memory hierarchy (core, L1, L2, memory), one per load (load A, load B, load C), with values A, B, and C. Each load is annotated with a comparison:]

E_ld,A ? E_rc,A    E_ld,B ? E_rc,B    E_ld,C ? E_rc,C
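The per-load comparison above can be read as a simple decision rule. The sketch below is a minimal illustration, assuming E_ld is the energy of servicing the load from wherever the value resides and E_rc is the energy of recomputing it. The per-level energy numbers, the placement of A, B, and C, and the function name are hypothetical placeholders, not values from the talk.

```python
# Hypothetical per-access energy costs (arbitrary units), for illustration only.
LOAD_ENERGY = {"L1": 1.0, "L2": 5.0, "memory": 50.0}

def should_recompute(level: str, recompute_energy: float) -> bool:
    """Recompute only if it is estimated to cost less energy than the load."""
    return recompute_energy < LOAD_ENERGY[level]

# Assumed placement: A in L1, B in L2, C in memory, each with an
# assumed recomputation cost of 8 units.
decisions = {
    name: should_recompute(level, recompute_energy=8.0)
    for name, level in [("A", "L1"), ("B", "L2"), ("C", "memory")]
}
print(decisions)
```

With these made-up numbers, only the load that would go all the way to memory is worth replacing; the same slice is not worth running when the value is close to the core.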


Recomputation Slice (RSlice)


Overview

• Amnesic Compiler
  • Identifies independent RSlices
  • Annotates RSlices: RCMP, REC, RTN

• Amnesic Scheduler
  • Decides when to recompute
  • Can use various policies
    • Compiler
    • First-Level Cache Miss - FLC
    • Last-Level Cache Miss - LLC

• Amnesic Microarchitecture
  • Prevents corruption of architectural state
  • Makes RSlice inputs available at the time of recomputation
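The scheduler policies named in the overview could be sketched as follows. This is a minimal sketch under an assumed reading of the policy names: Compiler always recomputes where the compiler annotated an RSlice, FLC recomputes only when the load would miss in the first-level cache, and LLC only when it would also miss in the last-level cache. The `Policy` enum and `decide` function are hypothetical names introduced for illustration.

```python
from enum import Enum

class Policy(Enum):
    COMPILER = "compiler"  # assumed: always follow the compiler's annotation
    FLC = "flc"            # assumed: recompute on a first-level cache miss
    LLC = "llc"            # assumed: recompute on a last-level cache miss

def decide(policy: Policy, hits_flc: bool, hits_llc: bool) -> str:
    """Return 'recompute' or 'load' for a load that has an annotated RSlice."""
    if policy is Policy.COMPILER:
        return "recompute"
    if policy is Policy.FLC:
        return "load" if hits_flc else "recompute"
    # Policy.LLC: a load that hits in either cache level is serviced normally.
    return "load" if (hits_flc or hits_llc) else "recompute"

# A load that misses in the first-level cache but hits in the last-level cache.
for p in Policy:
    print(p.name, decide(p, hits_flc=False, hits_llc=True))
```

Under this reading, the policies differ only in how expensive a load must be before the scheduler prefers the RSlice.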

Amnesic Microarchitecture

[Figure: microarchitecture block diagram, built up over four slides. Labeled components: SFile (source and destination registers); Renamer (RegisterFile[#] -> SFile[#]); Hist (leaf address, non-recomputable inputs); IBuff (RSlice instructions).]
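As a toy illustration of how redirecting registers can keep recomputation from corrupting architectural state, the sketch below renames every register an RSlice touches into a scratch file, in the spirit of the RegisterFile[#] -> SFile[#] mapping in the figure. The instruction format, the `hist` dictionary standing in for recorded non-recomputable inputs, and all function names are assumptions for illustration; this is not the hardware design from the talk.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def run_rslice(ibuff, hist):
    """Execute buffered RSlice instructions using scratch storage only.

    ibuff: list of (op, dst, src1, src2) tuples with register names like "r1".
    hist:  recorded values for inputs the slice cannot recompute (assumed role).
    """
    renamer = {}  # architectural register name -> scratch-file index
    sfile = []    # scratch storage used only during recomputation

    def slot(reg):
        # Allocate a scratch entry on first use, seeded from hist if recorded.
        if reg not in renamer:
            renamer[reg] = len(sfile)
            sfile.append(hist.get(reg))
        return renamer[reg]

    result = None
    for op, dst, src1, src2 in ibuff:
        a, b = sfile[slot(src1)], sfile[slot(src2)]
        result = OPS[op](a, b)
        sfile[slot(dst)] = result  # written to scratch, not the register file
    return result

register_file = {"r1": 100, "r2": 200, "r3": 300}  # architectural state
hist = {"r1": 3, "r2": 4}                          # recorded slice inputs
ibuff = [("mul", "r3", "r1", "r2"),                # r3 = 3 * 4
         ("add", "r3", "r3", "r1")]                # r3 = 12 + 3
value = run_rslice(ibuff, hist)
print(value, register_file)
```

The architectural `register_file` is never passed to `run_rslice`, so the recomputed value is produced without disturbing it.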

Putting It All Together: Default Execution

[Figure: the microarchitecture diagram (IBuff, Hist, Renamer, SFile) extended with Memory or RegisterFile and Fetch Logic; steps 0 and 1 are marked.]

Putting It All Together: During Recomputation

[Figure: the same diagram with Execution Units added; steps 0 through 8 are marked progressively over five slides.]

Evaluation Setup

• Benchmarks: SPEC2006, NAS, Parsec, Rodinia
  • 33 benchmarks evaluated, 11 of them show > 10% EDP gain
• Amnesic compiler pass mimicked by binary generator Pintool
• Microarchitecture and scheduler implemented in Snipersim

Energy-Delay Product

[Figure: EDP results per benchmark, built up over five slides. Benchmarks on the x-axis: mcf, sx, cg, is, ca, fs, fe, rt, bp, bfs, sr.]

Up to 87% (24.92% on average) EDP gain obtained.
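For reference, energy-delay product is energy multiplied by execution time. The sketch below is illustrative arithmetic only, assuming a "gain" means the relative reduction against the baseline; the input numbers are made up and the function names are hypothetical.

```python
def edp(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product: lower is better."""
    return energy_joules * delay_seconds

def edp_gain_percent(base: float, improved: float) -> float:
    """Relative EDP reduction against the baseline, in percent."""
    return 100.0 * (base - improved) / base

# Made-up example: 20% less energy at the cost of 5% more execution time.
base = edp(10.0, 2.0)
amnesic = edp(8.0, 2.1)
gain = edp_gain_percent(base, amnesic)
print(round(gain, 2))
```

Because EDP multiplies the two quantities, an energy saving still counts as a net gain only if the extra instructions do not slow execution down by a comparable factor.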

Instruction Mix

Benchmark   % increase in instruction count   % decrease in load count
mcf          4.47                              6.19
sx           4.55                              6.68
cg           3.97                              2.11
is          17.97                             49.99
ca           7.38                              7.95
fs           1.83                              3.08
fe           3.55                              1.75
rt           1.97                              6.08
bp          31.89                             55.55
bfs          1.20                             60.93
sr          20.02                             23.33

• 49.99% reduction in loads, at the expense of 17.97% more instructions (benchmark is)
• Translates into 74.68% reduction in (load) energy
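The Instruction Mix numbers above can be summarized with a few lines of arithmetic. This is only a convenience sketch over the table values; the variable names are hypothetical, and because the two percentages have different bases (all instructions versus loads only), the direct comparison is a rough view rather than an energy estimate.

```python
# (benchmark, % increase in instruction count, % decrease in load count),
# copied from the Instruction Mix table.
MIX = [
    ("mcf", 4.47, 6.19), ("sx", 4.55, 6.68), ("cg", 3.97, 2.11),
    ("is", 17.97, 49.99), ("ca", 7.38, 7.95), ("fs", 1.83, 3.08),
    ("fe", 3.55, 1.75), ("rt", 1.97, 6.08), ("bp", 31.89, 55.55),
    ("bfs", 1.20, 60.93), ("sr", 20.02, 23.33),
]

mean_inst = sum(inst for _, inst, _ in MIX) / len(MIX)
mean_load = sum(load for _, _, load in MIX) / len(MIX)

# Benchmarks where loads drop by a larger percentage than instructions grow.
favorable = [name for name, inst, load in MIX if load > inst]

print(f"mean instruction increase: {mean_inst:.2f}%")
print(f"mean load decrease: {mean_load:.2f}%")
print(favorable)
```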

RSlice Length

[Figure: RSlice length; panels: sx, bfs.]

The majority of RSlices (78.32%) have fewer than 10 instructions.

Summary

• Amnesic execution can minimize power and performance overhead of
  • Data retrieval
  • Data communication
• Amnesic execution makes the workload more compute-intensive
  • Better use of classic architectures
• Up to 87% improvement in energy-delay product (avg. 24.92%)


Questions and Comments

Please send your questions and comments to

[email protected]
