non-uniform cache architecture

Non-Uniform Cache Architecture Prof. Hsien-Hsin S. Lee School of Electrical and Computer Engineering Georgia Tech Guest lecture for ECE4100/6100 for Prof. Yalamanchili

Page 1: Non-Uniform Cache Architecture

Non-Uniform Cache Architecture

Prof. Hsien-Hsin S. Lee
School of Electrical and Computer Engineering
Georgia Tech

Guest lecture for ECE4100/6100 for Prof. Yalamanchili

Page 2: Non-Uniform Cache Architecture

Non-Uniform Cache Architecture

• Proposed by UT-Austin at ASPLOS 2002
• Facts
  – Large shared on-die L2
  – Wire delay dominates on-die cache access

L2 access latency by capacity and technology:
  – 1MB, 180nm, 1999: 3 cycles
  – 4MB, 90nm, 2004: 11 cycles
  – 16MB, 50nm, 2010: 24 cycles

Page 3: Non-Uniform Cache Architecture

Multi-banked L2 cache

2MB @ 130nm, bank = 128KB
Access latency = 11 cycles
  – Bank access time = 3 cycles
  – Interconnect delay = 8 cycles

Page 4: Non-Uniform Cache Architecture

Multi-banked L2 cache

16MB @ 50nm, bank = 64KB
Access latency = 47 cycles
  – Bank access time = 3 cycles
  – Interconnect delay = 44 cycles
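The two multi-banked slides split the access latency into bank access time plus interconnect delay. A minimal Python sketch of that arithmetic follows; it uses only the numbers from the slides, and the function name `total_latency` is illustrative rather than anything from the lecture.

```python
def total_latency(bank_access_cycles, interconnect_cycles):
    """Access latency = bank access time + interconnect (wire) delay."""
    return bank_access_cycles + interconnect_cycles

# (bank access cycles, interconnect cycles), taken from the slides.
configs = {
    "2MB @ 130nm (128KB banks)": (3, 8),
    "16MB @ 50nm (64KB banks)": (3, 44),
}

for name, (bank, wire) in configs.items():
    total = total_latency(bank, wire)
    # Share of the access spent on wires rather than inside the bank.
    print(f"{name}: {total} cycles, {wire / total:.0%} on the interconnect")
```

The bank access time stays at 3 cycles in both cases, so the growth from 11 to 47 cycles comes entirely from the interconnect.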

Page 5: Non-Uniform Cache Architecture

Static NUCA-1

• Uses a private channel per bank
• Each bank has its own distinct access latency
• Data location is decided statically from the address
• Average access latency = 34.2 cycles
• Wire overhead = 20.9% (an issue)

[Figure: S-NUCA-1 bank organization. Labels: tag array, data bus, address bus, bank, sub-bank, predecoder, sense amplifier, wordline driver and decoder]
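Static placement means the address alone fixes the bank, and therefore the latency, of every line. The sketch below illustrates the idea under stated assumptions: the bank count, line size, and linear latency model are hypothetical values chosen for the example, since the slide does not give them.

```python
NUM_BANKS = 32    # assumption: the slide does not state the bank count
LINE_BYTES = 64   # assumption: cache line size

def bank_of(address):
    """Static mapping: low-order bits of the line address select the bank."""
    return (address // LINE_BYTES) % NUM_BANKS

def bank_latency(bank, base_cycles=3, cycles_per_step=1):
    """Hypothetical latency model: higher-numbered banks sit farther away."""
    return base_cycles + bank * cycles_per_step

for addr in (0x0000, 0x0040, 0x07C0):
    b = bank_of(addr)
    print(f"address {addr:#06x} -> bank {b}, {bank_latency(b)} cycles")
```

Because the mapping never changes, a frequently used line that happens to map to a far bank pays the long latency on every access.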

Page 6: Non-Uniform Cache Architecture

Static NUCA-2

• Uses a 2D switched network to alleviate wire area overhead
• Average access latency = 24.2 cycles
• Wire overhead = 5.9%

[Figure: S-NUCA-2 bank organization. Labels: bank, data bus, switch, tag array, wordline driver and decoder, predecoder]
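To put the two static designs side by side, the short Python sketch below computes the change in average latency and wire overhead. It uses only the figures quoted on the Static NUCA-1 and Static NUCA-2 slides; the helper name `relative_reduction` is illustrative.

```python
# Figures quoted on the Static NUCA-1 and Static NUCA-2 slides.
s_nuca_1 = {"avg_latency_cycles": 34.2, "wire_overhead_pct": 20.9}
s_nuca_2 = {"avg_latency_cycles": 24.2, "wire_overhead_pct": 5.9}

def relative_reduction(before, after):
    """Fractional reduction going from `before` to `after`."""
    return (before - after) / before

latency_cut = relative_reduction(
    s_nuca_1["avg_latency_cycles"], s_nuca_2["avg_latency_cycles"]
)
wire_cut_points = s_nuca_1["wire_overhead_pct"] - s_nuca_2["wire_overhead_pct"]

print(f"Average latency reduced by {latency_cut:.1%}")
print(f"Wire overhead reduced by {wire_cut_points:.1f} percentage points")
```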

Page 7: Non-Uniform Cache Architecture

Dynamic NUCA

• Data can migrate dynamically
• Frequently used cache lines move closer to the CPU
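One way to picture migration is a policy that promotes a line by one bank on each hit. The Python sketch below is a toy model under stated assumptions: the one-step promotion, the insertion of new lines into the farthest bank, and the `BankSet` class itself are illustrative choices, not details given on the slide.

```python
class BankSet:
    """Toy bank set holding one line per bank; index 0 is closest to the CPU."""

    def __init__(self, num_banks=4):
        self.banks = [None] * num_banks

    def access(self, tag):
        """Return the bank index where `tag` was found, or None on a miss."""
        if tag in self.banks:
            i = self.banks.index(tag)
            if i > 0:
                # Hit: swap with the next-closer bank (assumed promotion policy).
                self.banks[i - 1], self.banks[i] = self.banks[i], self.banks[i - 1]
            return i
        # Miss: place the line in the farthest bank, evicting its occupant.
        self.banks[-1] = tag
        return None

bank_set = BankSet()
for _ in range(5):
    print(bank_set.access("A"), bank_set.banks)
```

Each repeated hit on `"A"` finds it one bank closer, so hot lines drift toward the CPU while rarely used lines are pushed outward.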

Page 8: Non-Uniform Cache Architecture

Dynamic NUCA

• Simple Mapping
• All 4 ways of a bank set need to be searched
• Farther bank sets have longer access latency

[Figure: banks arranged as 8 bank sets by 4 ways (way 0 to way 3); labels mark one set and one bank]
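Under simple mapping, the address picks a bank set and the line may then sit in any of that set's four ways. A minimal Python sketch of the lookup follows; the 8 bank sets and 4 ways come from the slide, while the line size, the choice of address bits, and the one-bank-per-way layout are assumptions made for the example.

```python
NUM_BANK_SETS = 8   # from the slide
WAYS = 4            # from the slide
LINE_BYTES = 64     # assumption: cache line size

def bank_set_of(address):
    """Select the bank set from low-order bits of the line address (assumed)."""
    return (address // LINE_BYTES) % NUM_BANK_SETS

def banks_to_search(address):
    """Return (way, bank_set) coordinates of every bank that may hold the line."""
    bank_set = bank_set_of(address)
    return [(way, bank_set) for way in range(WAYS)]

print(banks_to_search(0x0240))
```

Every lookup touches four banks in one bank set, which is why the search cost and the distance to that bank set both matter for latency.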

Page 9: Non-Uniform Cache Architecture

Dynamic NUCA

• Fair Mapping
• Average access time is equal across all bank sets

[Figure: banks arranged as 8 bank sets by 4 ways (way 0 to way 3); labels mark one set and one bank]

Page 10: Non-Uniform Cache Architecture

Dynamic NUCA

• Shared Mapping
• The closest banks are shared by the farther banks

[Figure: banks arranged as 8 bank sets by 4 ways (way 0 to way 3); label marks one bank]