

Page 1: Snatch: Opportunistically Reassigning Power Allocation ...iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_micro16_2.pdf · Snatch: Opportunistically Reassigning Power Allocation between

Snatch: Opportunistically Reassigning Power Allocation between Processor and Memory in 3D Stacks

Dimitrios Skarlatos, Renji Thomas, Aditya Agrawal, Shibin Qin, Robert Pilawa, Ulya Karpuzcu, Radu Teodorescu, Nam Sung Kim, and Josep Torrellas

UIUC, OSU, UMN, NVIDIA


Motivation: Cost of Power/Ground Pins in 3D Stacks

• Size & cost of packages is proportional to # of pins

• 3D Stacks: Disjoint Power/Ground pins for Processor and Memory

• Each dimensioned for the worst case

[Figure: cross-section of a 3D stack. Labels: PCB substrate, BGA pins, C4 bumps, Processor die, microbumps, TSVs, DRAM0 die, DRAM1 die, Processor VR, Memory VR.]

Motivation: Underutilization of Power Budget

• High Processor or Memory Power phases


Contribution: Snatch

• Dynamically and opportunistically divert power between processor and memory

• On-chip voltage regulator connects the two Power Delivery Networks

• Processor or Memory can consume more power for the same # of pins

[Figure: Conventional stack. Labels: Processor die, Mem0 die, Mem1 die, TSVs, Proc PDN, Mem PDN, Processor VR, Memory VR.]

[Figure: Snatch stack. Same labels as the conventional stack, plus an On-chip VR.]

Impact Compared to Conventional 3D Stacks

• For same # of power/ground pins:

• Application can consume more power

• Up to 23% application speedup

• For the same maximum power in Processor and Memory

• Fewer pins, about 30% package cost reduction


Snatch Outline

• Implementation

• Operation

• Case 1:

• Same Max Power in Processor and Memory, reduced # of pins

• Case 2:

• Same # of pins, improved performance

• Evaluation


Conventional Implementation

[Figure: cross-section of a conventional 3D stack. Labels: PCB substrate, BGA pins, C4 bumps, Processor die, microbumps, TSVs, DRAM0 die, DRAM1 die. Processor VR: 12V input, 0.8-0.95V output, 5.5W. Memory VR: 12V input, 1.1V output, 4.5W.]


Snatch Implementation

• Small 2W on-chip bidirectional VR on Proc die

• Bulk of work from off-chip VRs

[Figure: cross-section of the Snatch stack. Labels: PCB substrate, BGA pins, C4 bumps, Processor die, microbumps, TSVs, DRAM0 die, DRAM1 die, single on-chip VR. Processor VR: 12V input, 0.8-0.95V output, 5.5W. Memory VR: 12V input, 1.1V output, 4.5W.]


Snatch: Dynamic Power Reassignment

• Up/Down convert power Snatched

[Figure: cross-section of the Snatch stack. A single on-chip VR (2W) sits between the 0.8-0.95V and 1.1V supplies. Labels: PCB substrate, BGA pins, C4 bumps, Processor die, microbumps, TSVs, DRAM0 die, DRAM1 die, Processor VR, Memory VR.]


Snatch: Top Down

• Small 2W on-chip bidirectional VR on Proc die

[Figure: top-down view next to the cross-section. Processor VR (0.8-0.95V, 5.5W) feeds the Processor through its PDN; Memory VR (1.1V, 4.5W) feeds the Memory through its PDN; the on-chip VR sits between the two PDNs. Cross-section labels: PCB substrate, BGA pins, C4 bumps, Processor die, microbumps, TSVs, DRAM0 die, DRAM1 die, single on-chip VR.]


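Snatching Memory Power

• On processor-intensive phase

• Snatch Memory Power -> TurboBoost Processor

[Figure: top-down view. Processor VR supplies 5.5W and Memory VR supplies 4.5W; the on-chip VR moves 2W from the Memory PDN to the Processor PDN, so the Processor gets 7.5W and the Memory 2.5W.]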

Snatching Processor Power

• On memory-intensive phase

• Snatch Processor Power -> TurboBoost Memory

[Figure: top-down view. Processor VR supplies 5.5W and Memory VR supplies 4.5W; the on-chip VR moves 2W from the Processor PDN to the Memory PDN, so the Processor gets 3.5W and the Memory 6.5W.]
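The two snatching directions above come down to bookkeeping on the two power budgets. The following is a minimal sketch of that arithmetic, not the authors' implementation: the function name `snatch` and the constant `VR_LIMIT_W` are hypothetical, and the on-chip VR is assumed to be lossless. The 5.5W, 4.5W, and 2W figures are taken from the slides.

```python
VR_LIMIT_W = 2.0  # on-chip VR capacity, as stated on the slides


def snatch(proc_w, mem_w, amount_w, to_processor):
    """Return (processor, memory) power after moving power across the on-chip VR.

    Assumption: the VR is ideal (lossless). A real regulator would lose
    some of the transferred power in conversion.
    """
    if not 0.0 <= amount_w <= VR_LIMIT_W:
        raise ValueError("amount must be between 0 and the on-chip VR capacity")
    if to_processor:
        # Processor-intensive phase: memory gives up power.
        return proc_w + amount_w, mem_w - amount_w
    # Memory-intensive phase: processor gives up power.
    return proc_w - amount_w, mem_w + amount_w


print(snatch(5.5, 4.5, 2.0, to_processor=True))   # (7.5, 2.5)
print(snatch(5.5, 4.5, 2.0, to_processor=False))  # (3.5, 6.5)
```

The sum of the two values stays at 10W in both directions; only the split changes.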


Snatching Decisions

• Processor or Memory Intensive Phase?

• How much Power is available?

• How much Power can we Snatch?

[Figure: top-down view. Processor VR supplies 5.5W and Memory VR supplies 4.5W; the amount moved by the on-chip VR is marked "?W".]


Conservative Snatching Algorithm

• Keep track of past power values of 10µs epochs

• Average for activity detection

• MAX for power availability

• Avoid hysteresis

[Figure: top-down view. Processor VR supplies 5.5W and Memory VR supplies 4.5W; the amount moved by the on-chip VR is marked "?W".]
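The first three bullets above can be sketched as follows. This is an illustration under stated assumptions, not the algorithm from the paper: the class name `SnatchController`, the history length `WINDOW`, and the threshold `BUSY_FRACTION` are hypothetical, and the sketch does not model the last bullet. Only the 10µs epoch, the 2W VR limit, and the 5.5W/4.5W budgets come from the slides.

```python
from collections import deque

EPOCH_US = 10        # epoch length, from the slides
VR_LIMIT_W = 2.0     # on-chip VR capacity, from the slides
WINDOW = 8           # assumed number of past epochs to keep (not in the slides)
BUSY_FRACTION = 0.9  # assumed "intensive" threshold (not in the slides)


class SnatchController:
    def __init__(self, proc_budget_w=5.5, mem_budget_w=4.5):
        self.budget = {"proc": proc_budget_w, "mem": mem_budget_w}
        self.history = {side: deque(maxlen=WINDOW) for side in self.budget}

    def record_epoch(self, proc_w, mem_w):
        """Store the power measured in the epoch that just ended."""
        self.history["proc"].append(proc_w)
        self.history["mem"].append(mem_w)

    def decide(self):
        """Return (receiver, watts), or (None, 0.0) if no transfer looks safe."""
        if len(self.history["proc"]) < WINDOW:
            return None, 0.0  # assumed: wait for a full history before acting
        busy, spare = {}, {}
        for side, hist in self.history.items():
            # Average of past epochs detects an intensive phase.
            busy[side] = sum(hist) / len(hist) >= BUSY_FRACTION * self.budget[side]
            # MAX of past epochs gives a conservative estimate of unused power.
            spare[side] = max(0.0, self.budget[side] - max(hist))
        if busy["proc"] and not busy["mem"]:
            return "proc", min(spare["mem"], VR_LIMIT_W)
        if busy["mem"] and not busy["proc"]:
            return "mem", min(spare["proc"], VR_LIMIT_W)
        return None, 0.0
```

Using MAX rather than the average for availability means the donor side keeps enough power to repeat its worst recent epoch, which is what makes the estimate conservative.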


Conventional Power Provisioning

• Processor provisioned for 7.5W

• Memory provisioned for 6.5W

• Total = Processor + Memory = 14W

[Figure: top-down view. Processor VR and PDN sized for 7.5W; Memory VR and PDN sized for 6.5W.]


Snatch: Provisioning 3D Stacks Just Right

• Processor provisioned for 7.5W - 2W = 5.5W

• Memory provisioned for 6.5W - 2W = 4.5W

• Total = Processor + Memory = 14W - 4W = 10W

Reduce Total Provisioning from 14W to 10W, approx same performance

30% Reduction in Package Power/Ground Pins

[Figure: top-down view. Processor VR sized for 5.5W instead of 7.5W; Memory VR sized for 4.5W instead of 6.5W; the on-chip VR snatches up to 2W, so the Processor can draw 5.5W ± 2W and the Memory 4.5W ± 2W.]
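The provisioning arithmetic on this slide can be checked with a few lines. This is only a sketch: the function name `provisioning` is hypothetical, and it computes the reduction in provisioned power, not the pin count. The 30% pin figure is the slides' own number; it is roughly in line with the result if pin count scales with provisioned power, which is an assumption here.

```python
def provisioning(proc_peak_w, mem_peak_w, snatch_w):
    """Return (conventional total, Snatch total, fractional reduction)."""
    conventional = proc_peak_w + mem_peak_w
    # With Snatch, each side is provisioned for its peak minus what it can snatch.
    with_snatch = (proc_peak_w - snatch_w) + (mem_peak_w - snatch_w)
    reduction = (conventional - with_snatch) / conventional
    return conventional, with_snatch, reduction


conv, snatched, red = provisioning(7.5, 6.5, 2.0)
print(f"{conv}W -> {snatched}W ({red:.1%} less provisioned power)")
# 14.0W -> 10.0W (28.6% less provisioned power)
```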


Conventional Power Provisioning

• Processor & Memory provisioned for 5.5W & 4.5W

[Figure: top-down view. Processor VR and PDN sized for 5.5W; Memory VR and PDN sized for 4.5W.]


Snatch: Boost Performance with Same # of Pins

• Processor & Memory provisioned for 5.5W & 4.5W

• Snatch power and boost performance

• Same # of pins as conventional

Higher performance for the same package cost

IR-drop and EM characteristics remain the same

[Figure: top-down view. Processor VR supplies 5.5W and Memory VR supplies 4.5W; the on-chip VR snatches up to 2W, so the Processor can draw 5.5W ± 2W and the Memory 4.5W ± 2W.]


Evaluation Methodology

• Case 2: Same # of pins, improved performance

• Processor: 22nm LP 8-core w/ SESC + McPAT

• Memory: 4GB 2-layer WideIO2 w/ DRAMSim2

• Benchmarks: SPLASH-2, NAS, and SPEC


Performance

Configurations:

• Baseline: P = {5.5W, 1.2GHz}, M = {4.5W, 400MHz}

• Turbo Boost: P = {5.5W, 1.2-1.5GHz}, M = {4.5W, 400-900MHz}; DVFS within power budget

• Snatch: P = {5.5W, 1.2-1.5GHz}, M = {4.5W, 400-900MHz}; DVFS and Snatch up to 2W

[Figure: bar chart of speedup (axis 0.4 to 1.3) for Baseline, Turbo Boost, and Snatch, shown for Average(Splash + NAS) and Average(SPEC). Annotations: 25%, 8%, 10%.]

• Splash and NAS benchmarks: Snatch boosts performance on average by 25% against Baseline and 8% against Turbo Boost

• SPEC benchmarks: Snatch boosts performance on average by 10% against Baseline; negligible gains against Turbo Boost


Snatching Activity Overview

[Figure: bar chart of % of total time snatching (axis 0 to 90) per application, split into M->P and P->M. Applications: Barnes, BT, CG, Cholesky, FFT, FMM, FT, IS, LU, LU (NAS), MG, Radiosity, Radix, Raytrace, SP, W-Nsquared, W-Spatial, Avg; mcf, milc, lbm, bzip2, Avg.]

Applications Snatch on average 30% of the time for Splash+NAS and 9.4% for SPEC

More On the Paper

• Design and Implementation:

• On-chip Voltage Regulator

• Snatch Algorithm

• Additional Evaluation:

• Snatch Algorithm

• Power Delivery Network

• Pin Reliability

• 3D Stack Thermals


Summary

• Snatch: An opportunistic power reassignment design for 3D Stacked architectures

• Small on-chip bidirectional VR

• Processor - Memory phase detection and power availability estimation

• Up to 23% application speedup

• Alternatively, about 30% package cost reduction

Image source: http://www.hutui6.com/data/out/178/67609941-snatch-wallpapers.jpg