Exploration of Memory and Cluster Modes in Directory-Based Many-Core CMPs. Subodha Charles and Prabhat Mishra, University of Florida, USA; Chetan Arvind Patil and Umit Y. Ogras, Arizona State University, USA. This work was partially supported by the National Science Foundation (NSF) grants CNS-1526687 and CNS-1526562.


TRANSCRIPT

Page 1

Exploration of Memory and Cluster

Modes in Directory-Based

Many-Core CMPs

Subodha Charles and Prabhat Mishra

University of Florida, USA

Chetan Arvind Patil and Umit Y. Ogras

Arizona State University, USA

This work was partially supported by the National Science

Foundation (NSF) grants CNS-1526687 and CNS-1526562

Page 2

Outline

Introduction

Existing NoC Exploration Methods

Accurate Modeling and Exploration

❖ Motivation

❖ Modeling of Directory–Memory Traffic

❖ Exploration of Memory and Cluster Modes

Experimental Results

Conclusion

Page 3

Increased Complexity of SoC Design

Page 4

Increased Complexity of SoC Design

Page 5

NoCs are Critical for Performance

Early interconnect designs were buses and point-to-point links.

These do not scale!

Solution: NoC

Page 6

Architecture of a Many-Core CMP

Page 7

Outline

Introduction

Existing NoC Exploration Methods

Accurate Modeling and Exploration

❖ Motivation

❖ Modeling of Directory–Memory Traffic

❖ Exploration of Memory and Cluster Modes

Experimental Results

Conclusion

Page 8

Traffic Optimization on NoC

Minimum number of MCs: Eitschberger et al., MCC ‘13

Optimum MC Placement: Xu et al., CODES+ISSS ‘13

Dynamic Workload Data Mapping: Awasthi et al., PACT ‘10

Page 9

Optimum MC Placement


Column 0/7 Column 2/5 Diamond

Optimum Slash

Xu et al.

CODES+ISSS ‘13

Page 10

Outline

Introduction

Existing NoC Exploration Methods

Accurate Modeling and Exploration

❖ Motivation

❖ Modeling of Directory–Memory Traffic

❖ Exploration of Memory and Cluster Modes

Experimental Results

Conclusion

Page 11

KNL: 2nd Generation Xeon-Phi

38 tiles: 36 active, 2 recovery

Each tile: 2 VPUs, out-of-order execution, 4 threads per core

4 separate NoCs

Page 12

Traffic Model of gem5 Simulator

Life cycle of a memory request:

(1) Request forwarded to Directory Controller after miss in private cache

(2) Data retrieved from memory

(3) MC forwards data to the requestor
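As a rough illustration of the three steps above, the sketch below counts network hops for one request on a 2D mesh. It assumes XY routing, so the hop count is the Manhattan distance, and it assumes that memory is reached at the directory's own tile, so step (2) adds no network hops. The function names and tile coordinates are hypothetical and are not taken from gem5.

```python
def hops(a, b):
    """Hop count between tiles a and b on a 2D mesh under XY routing."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def default_request_hops(requestor, directory):
    """Hops for one request when memory sits at the directory's tile (assumption)."""
    to_directory = hops(requestor, directory)   # (1) miss forwarded to the directory
    memory_access = 0                           # (2) data retrieved, no network hops
    to_requestor = hops(directory, requestor)   # (3) data forwarded to the requestor
    return to_directory + memory_access + to_requestor


print(default_request_hops((0, 0), (2, 3)))  # prints 10
```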


Page 13

A Memory Controller at Each Tile?

Is this a realistic assumption?

Number of MCs < number of tiles, because of:

Packaging constraints

High I/O pin cost

Page 14

Intel Xeon-Phi 7210

Page 15

Hotspots Introduced by MCs

Page 16

Key Idea

The interactions between cores, directory controllers, and memory controllers should be accurately modeled to enable exploration of NoC optimizations.

Page 17

Outline

Introduction

Existing NoC Exploration Methods

Accurate Modeling and Exploration

❖ Motivation

❖ Modeling of Directory–Memory Traffic

❖ Exploration of Memory and Cluster Modes

Experimental Results

Conclusion

Page 18

Modified Traffic Model

Life cycle of a memory request:

(1) Request forwarded to Directory Controller after miss in private cache

(2) Forward request to MC

(3) Data retrieved from memory

(4) MC forwards data to the requestor
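The effect of the added step can be sketched with a small hop-count calculation. This is a minimal illustration, assuming a 2D mesh with XY routing and an MC that sits on a different tile from the directory. The tile coordinates and function names are hypothetical.

```python
def hops(a, b):
    """Hop count between tiles a and b on a 2D mesh under XY routing."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def modified_request_hops(requestor, directory, mc):
    """Hops for one request when the MC is a separate tile from the directory."""
    to_directory = hops(requestor, directory)   # (1) miss forwarded to the directory
    to_mc = hops(directory, mc)                 # (2) directory forwards request to MC
    memory_access = 0                           # (3) data retrieved at the MC
    to_requestor = hops(mc, requestor)          # (4) MC forwards data to the requestor
    return to_directory + to_mc + memory_access + to_requestor


# Hypothetical placement: requestor, directory, and MC on three different tiles.
print(modified_request_hops((0, 0), (2, 3), (5, 0)))  # prints 16
```

With the same requestor and directory but memory reached at the directory's tile, the round trip would take 10 hops, so in this example the directory-to-MC leg and the longer return path add 6 hops.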


Page 19

Modified Traffic Model

The inclusion of the new step (2) has a significant impact:

Introduces hotspots

Gives realistic estimates of power and performance

Enables exploration of MC placement

Enables exploration of cluster and memory modes
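One way to see how directory-to-MC traffic can create hotspots is to count how often each mesh link is used. The sketch below is only an illustration under stated assumptions: a 4x4 mesh, two MCs in opposite corners, XY routing, and every tile's directory sending one request to its nearest MC. None of these choices come from the slides.

```python
from collections import Counter


def xy_route(src, dst):
    """Directed links (tile, next_tile) on an XY route: X direction first, then Y."""
    x, y = src
    links = []
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def link_load(size, mcs):
    """Count link usage when each tile sends one request to its nearest MC."""
    load = Counter()
    for x in range(size):
        for y in range(size):
            mc = min(mcs, key=lambda m: abs(m[0] - x) + abs(m[1] - y))
            load.update(xy_route((x, y), mc))
    return load


load = link_load(4, [(0, 0), (3, 3)])
busiest_link, count = load.most_common(1)[0]
print(busiest_link, count)  # prints ((0, 1), (0, 0)) 6
```

In this toy setup the link entering the MC at (0, 0) carries 6 of the 28 link traversals, while many links far from an MC carry none.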

Page 20

Modified Traffic Model

Page 21

Outline

Introduction

Existing NoC Exploration Methods

Accurate Modeling and Exploration

❖ Motivation

❖ Modeling of Directory–Memory Traffic

❖ Exploration of Memory and Cluster Modes

Experimental Results

Conclusion

Page 22

Cluster Modes in KNL

All-to-all Mode

A request from a core can be forwarded to any directory controller. The memory request can be forwarded to any MC as well.

Quadrant Mode

The tiles are divided into four virtual quadrants. A request from a core can be forwarded to any directory controller, but the memory request must be sent to an MC in the same quadrant as the directory.
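The difference between the two cluster modes can be sketched as a rule for which MCs a directory may forward to. This is a simplified illustration, not the KNL address-mapping logic: the mesh size, the corner MC positions, and the function names are all assumptions.

```python
def quadrant(tile, width, height):
    """Index 0..3 of the virtual quadrant that contains tile (x, y)."""
    x, y = tile
    return (1 if x >= width // 2 else 0) + (2 if y >= height // 2 else 0)


def candidate_mcs(directory, mcs, mode, width, height):
    """MCs that the given directory tile is allowed to forward a request to."""
    if mode == "all-to-all":
        return list(mcs)  # any MC may serve the request
    if mode == "quadrant":
        q = quadrant(directory, width, height)
        return [m for m in mcs if quadrant(m, width, height) == q]
    raise ValueError(f"unknown cluster mode: {mode}")


# Hypothetical 6x6 mesh with one MC in each corner.
mcs = [(0, 0), (5, 0), (0, 5), (5, 5)]
print(candidate_mcs((4, 1), mcs, "all-to-all", 6, 6))  # all four MCs
print(candidate_mcs((4, 1), mcs, "quadrant", 6, 6))    # prints [(5, 0)]
```

Restricting the choice to the directory's quadrant shortens the directory-to-MC leg, since the MC is never on the far side of the mesh.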


Page 23

Memory Modes in KNL

Flat Mode

DDR and MCDRAM in the same address space

Cache Mode

MCDRAM acting as last-level cache
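The two memory modes differ in how a request finds its data, which the following sketch tries to capture. It is a simplification under assumptions that do not come from the slides: in flat mode a hypothetical address range selects MCDRAM, and in cache mode MCDRAM is modeled as a direct-mapped cache with 64-byte lines in front of DDR.

```python
LINE_BYTES = 64  # assumed cache-line size


def flat_target(addr, mcdram_base, mcdram_size):
    """Flat mode: the address alone decides which memory serves the request."""
    in_mcdram = mcdram_base <= addr < mcdram_base + mcdram_size
    return "MCDRAM" if in_mcdram else "DDR"


class McdramCache:
    """Cache mode: MCDRAM as a direct-mapped cache in front of DDR (simplified)."""

    def __init__(self, num_lines):
        self.tags = [None] * num_lines

    def access(self, addr):
        """Return the memories visited, in order, to serve this address."""
        line = addr // LINE_BYTES
        index = line % len(self.tags)
        if self.tags[index] == line:
            return ["MCDRAM"]          # hit: served by MCDRAM
        self.tags[index] = line        # miss: fill the line from DDR
        return ["MCDRAM", "DDR"]       # lookup in MCDRAM, then fetch from DDR


print(flat_target(0x1800, mcdram_base=0x1000, mcdram_size=0x1000))  # prints MCDRAM
cache = McdramCache(num_lines=4)
print(cache.access(0))  # prints ['MCDRAM', 'DDR']
print(cache.access(0))  # prints ['MCDRAM']
```

A miss in cache mode visits two memories, which corresponds to one more network step than a flat-mode access.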


Page 24

Traffic Flow – Memory and Cluster Modes

Flat, All-to-all Mode

Cache, All-to-all Mode

Flat, Quadrant Mode

Page 25

Outline

Introduction

Existing NoC Exploration Methods

Accurate Modeling and Exploration

❖ Motivation

❖ Modeling of Directory–Memory Traffic

❖ Exploration of Memory and Cluster Modes

Experimental Results

Conclusion

Page 26

Experimental Setup

Architecture simulator: gem5

NoC model: Garnet2.0

A CMP similar to the Xeon Phi 7210 was modeled in gem5.

Our implementation was added to the cache coherence traffic transitions.

gem5 output statistics were fed into the McPAT simulator to extract power results.

Page 27

Network Traffic Analysis

The default gem5 model gives highly optimistic results.

The two modified models, KNL (all-to-all) and KNL (quadrant), give comparable results.

KNL (quadrant) gives better performance because it has high affinity between directory and memory controllers.

Page 28

Memory Controller Placement

Exploration of memory controller placement under the modified model.

Compared with the work of Xu et al.: their “Optimal” placement is no longer the optimal placement.

The default gem5 model again gives highly optimistic results.

Page 29

Memory and Cluster Mode Exploration

Compared to All-to-all Flat mode, All-to-all Cache mode gives the highest benefit: 18.62% less execution time on average.

Observations are in agreement with results obtained from the Xeon Phi 7210 hardware platform.
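For readers who think in speedup factors, the reported reduction in execution time can be converted with a one-line calculation. The helper name below is hypothetical, and the conversion assumes the percentage is measured against the All-to-all Flat baseline.

```python
def speedup_from_time_reduction(reduction_percent):
    """Convert 'X% less execution time' into a speedup factor over the baseline."""
    remaining_fraction = 1.0 - reduction_percent / 100.0
    return 1.0 / remaining_fraction


print(round(speedup_from_time_reduction(18.62), 3))  # prints 1.229
```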

Page 30

Conclusion

Page 31

Thank you!

Questions?