18-447 computer architecture lecture 21: main memoryece447/s15/lib/exe/fetch... · 18-447 computer...
TRANSCRIPT
![Page 1: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/1.jpg)
18-447 Computer Architecture
Lecture 21: Main Memory
Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 3/23/2015
![Page 2: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/2.jpg)
Assignment Reminders n Lab 6: Due April 3
q C-level simulation of caches and branch prediction
n HW 5: Due March 29 q Will be out later today
n Midterm II: TBD
n The course will move quickly in the last 1.5 months q Please manage your time well q Get help from the TAs during office hours and recitation
sessions q The key is learning the material very well
2
![Page 3: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/3.jpg)
Upcoming Seminar on Flash Memory (March 25)
n March 25, Wednesday, CIC Panther Hollow Room, 4-5pm n Yixin Luo, PhD Student, CMU n Data Retention in MLC NAND Flash Memory:
Characterization, Optimization and Recovery
n Yu Cai, Yixin Luo, Erich F. Haratsch, Ken Mai, and Onur Mutlu, "Data Retention in MLC NAND Flash Memory: Characterization, Optimization and Recovery" Proceedings of the 21st International Symposium on High-Performance Computer Architecture (HPCA), Bay Area, CA, February 2015. [Slides (pptx) (pdf)] Best paper session.
3
![Page 4: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/4.jpg)
Computer Architecture Seminars n Seminars relevant to many topics covered in 447
q Caching q DRAM q Multi-core systems q …
n List of past and upcoming seminars are here: q https://www.ece.cmu.edu/~calcm/doku.php?
id=seminars:seminars n You can subscribe to receive Computer Architecture related
event announcements here: q https://sos.ece.cmu.edu/mailman/listinfo/calcm-list
4
![Page 5: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/5.jpg)
Midterm I Statistics: Average n Out of 100:
n MEAN 48.69 n MEDIAN 47.94 n STDEV 12.06
n MAX 76.18 n MIN 27.06
5
![Page 6: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/6.jpg)
Midterm I Grade Distribution (Percentage)
6
![Page 7: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/7.jpg)
Midterm I Grade Distribution (Absolute)
7
![Page 8: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/8.jpg)
Grade Breakdowns per Question n http://www.ece.cmu.edu/~ece447/s15/lib/exe/fetch.php?
media=midterm_distribution.pdf
8
![Page 9: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/9.jpg)
Going Forward n What really matters is learning
q And using the knowledge, skills, and ability to process information in the future
q Focus less on grades, and put more weight into understanding
n Midterm I is only 12% of your entire course grade q Worth less than 2 labs + extra credit
n There are still Midterm II, Final, 3 Labs and 3 Homeworks n There are many extra credit opportunities (great for
learning by exploring your creativity)
9
![Page 10: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/10.jpg)
Lab 3 Extra Credit Recognitions n 4.00 bmperez (Brandon Perez) n 3.75 junhanz (Junhan Zhou) n 3.75 zzhao1 (Zhipeng Zhao) n 2.50 terencea (Terence An) n 2.25 rohitban (Rohit Banerjee)
10
![Page 11: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/11.jpg)
Where We Are in Lecture Schedule n The memory hierarchy n Caches, caches, more caches n Virtualizing the memory hierarchy: Virtual Memory n Main memory: DRAM n Main memory control, scheduling n Memory latency tolerance techniques n Non-volatile memory
n Multiprocessors n Coherence and consistency n Interconnection networks n Multi-core issues
11
![Page 12: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/12.jpg)
Main Memory
![Page 13: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/13.jpg)
Required Reading (for the Next Few Lectures) n Onur Mutlu, Justin Meza, and Lavanya Subramanian,
"The Main Memory System: Challenges and Opportunities" Invited Article in Communications of the Korean Institute of Information Scientists and Engineers (KIISE), 2015.
13
![Page 14: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/14.jpg)
Required Readings on DRAM n DRAM Organization and Operation Basics
q Sections 1 and 2 of: Lee et al., “Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture,” HPCA 2013.
http://users.ece.cmu.edu/~omutlu/pub/tldram_hpca13.pdf
q Sections 1 and 2 of Kim et al., “A Case for Subarray-Level Parallelism (SALP) in DRAM,” ISCA 2012.
http://users.ece.cmu.edu/~omutlu/pub/salp-dram_isca12.pdf
n DRAM Refresh Basics q Sections 1 and 2 of Liu et al., “RAIDR: Retention-Aware
Intelligent DRAM Refresh,” ISCA 2012. http://users.ece.cmu.edu/~omutlu/pub/raidr-dram-refresh_isca12.pdf
14
![Page 15: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/15.jpg)
Why Is Memory So Important? (Especially Today)
![Page 16: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/16.jpg)
The Main Memory System
n Main memory is a critical component of all computing systems: server, mobile, embedded, desktop, sensor
n Main memory system must scale (in size, technology, efficiency, cost, and management algorithms) to maintain performance growth and technology scaling benefits
16
Processor and caches
Main Memory Storage (SSD/HDD)
![Page 17: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/17.jpg)
Memory System: A Shared Resource View
17
Storage
![Page 18: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/18.jpg)
State of the Main Memory System n Recent technology, architecture, and application trends
q lead to new requirements q exacerbate old requirements
n DRAM and memory controllers, as we know them today, are (will be) unlikely to satisfy all requirements
n Some emerging non-volatile memory technologies (e.g., PCM) enable new opportunities: memory+storage merging
n We need to rethink/reinvent the main memory system q to fix DRAM issues and enable emerging technologies q to satisfy all requirements
18
![Page 19: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/19.jpg)
Major Trends Affecting Main Memory (I) n Need for main memory capacity, bandwidth, QoS increasing
n Main memory energy/power is a key system design concern
n DRAM technology scaling is ending
19
![Page 20: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/20.jpg)
Demand for Memory Capacity n More cores è More concurrency è Larger working set
n Modern applications are (increasingly) data-intensive
n Many applications/virtual machines (will) share main memory
q Cloud computing/servers: Consolidation to improve efficiency q GP-GPUs: Many threads from multiple parallel applications q Mobile: Interactive + non-interactive consolidation q …
20
IBM Power7: 8 cores Intel SCC: 48 cores AMD Barcelona: 4 cores
![Page 21: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/21.jpg)
Example: The Memory Capacity Gap
n Memory capacity per core expected to drop by 30% every two years n Trends worse for memory bandwidth per core!
21
Core count doubling ~ every 2 years DRAM DIMM capacity doubling ~ every 3 years
The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been
![Page 22: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/22.jpg)
Major Trends Affecting Main Memory (II) n Need for main memory capacity, bandwidth, QoS increasing
q Multi-core: increasing number of cores/agents q Data-intensive applications: increasing demand/hunger for data q Consolidation: Cloud computing, GPUs, mobile, heterogeneity
n Main memory energy/power is a key system design concern
n DRAM technology scaling is ending
22
![Page 23: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/23.jpg)
Major Trends Affecting Main Memory (III) n Need for main memory capacity, bandwidth, QoS increasing
n Main memory energy/power is a key system design concern
q IBM servers: ~50% energy spent in off-chip memory hierarchy [Lefurgy, IEEE Computer 2003]
q DRAM consumes power when idle and needs periodic refresh
n DRAM technology scaling is ending
23
![Page 24: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/24.jpg)
Major Trends Affecting Main Memory (IV) n Need for main memory capacity, bandwidth, QoS increasing
n Main memory energy/power is a key system design concern
n DRAM technology scaling is ending
q ITRS projects DRAM will not scale easily below X nm q Scaling has provided many benefits:
n higher capacity, higher density, lower cost, lower energy
24
![Page 25: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/25.jpg)
The DRAM Scaling Problem n DRAM stores charge in a capacitor (charge-based memory)
q Capacitor must be large enough for reliable sensing q Access transistor should be large enough for low leakage and high
retention time q Scaling beyond 40-35nm (2013) is challenging [ITRS, 2009]
n DRAM capacity, cost, and energy/power hard to scale
25
![Page 26: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/26.jpg)
Row of Cells Row Row Row Row
Wordline
VLOW VHIGH Vic2m Row
Vic2m Row Aggressor Row
Repeatedly opening and closing a row enough 2mes within a refresh interval induces disturbance errors in adjacent rows in most real DRAM chips you can buy today
Opened Closed
26
Evidence of the DRAM Scaling Problem
Kim+, “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” ISCA 2014.
![Page 27: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/27.jpg)
86% (37/43)
83% (45/54)
88% (28/32)
A company B company C company
Up to 1.0×107 �errors
Up to 2.7×106�errors
Up to 3.3×105 �errors
27 Kim+, “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” ISCA 2014.
Most DRAM Modules Are At Risk
![Page 28: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/28.jpg)
DRAM Module x86 CPU
Y
X
loop: mov (X), %eax mov (Y), %ebx clflush (X) clflush (Y) mfence jmp loop
![Page 29: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/29.jpg)
DRAM Module x86 CPU
loop: mov (X), %eax mov (Y), %ebx clflush (X) clflush (Y) mfence jmp loop
Y
X
![Page 30: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/30.jpg)
DRAM Module x86 CPU
loop: mov (X), %eax mov (Y), %ebx clflush (X) clflush (Y) mfence jmp loop
Y
X
![Page 31: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/31.jpg)
DRAM Module x86 CPU
loop: mov (X), %eax mov (Y), %ebx clflush (X) clflush (Y) mfence jmp loop
Y
X
![Page 32: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/32.jpg)
• A real reliability & security issue • In a more controlled environment, we can induce as many as ten million disturbance errors
CPU Architecture Errors Access-‐Rate
Intel Haswell (2013) 22.9K 12.3M/sec
Intel Ivy Bridge (2012) 20.7K 11.7M/sec
Intel Sandy Bridge (2011) 16.1K 11.6M/sec
AMD Piledriver (2012) 59 6.1M/sec
32 Kim+, “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” ISCA 2014.
Observed Errors in Real Systems
![Page 33: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/33.jpg)
33 All modules from 2012–2013 are vulnerable
First Appearance
Errors vs. Vintage
![Page 34: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/34.jpg)
Security Implications (I)
34
http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html
http://users.ece.cmu.edu/~omutlu/pub/dram-row-hammer_isca14.pdf
![Page 35: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/35.jpg)
Security Implications (II) n “Rowhammer” is a problem with some recent DRAM devices in
which repeatedly accessing a row of memory can cause bit flips in adjacent rows.
n We tested a selection of laptops and found that a subset of them exhibited the problem.
n We built two working privilege escalation exploits that use this effect.
n One exploit uses rowhammer-induced bit flips to gain kernel privileges on x86-64 Linux when run as an unprivileged userland process.
n When run on a machine vulnerable to the rowhammer problem, the process was able to induce bit flips in page table entries (PTEs).
n It was able to use this to gain write access to its own page table, and hence gain read-write access to all of physical memory.
35
![Page 36: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/36.jpg)
Recap: The DRAM Scaling Problem
36
![Page 37: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/37.jpg)
An Orthogonal Issue: Memory Interference
Main Memory
37
Core Core
Core Core
Cores’ interfere with each other when accessing shared main memory Uncontrolled interference leads to many problems (QoS, performance)
![Page 38: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/38.jpg)
Major Trends Affecting Main Memory n Need for main memory capacity, bandwidth, QoS increasing
n Main memory energy/power is a key system design concern
n DRAM technology scaling is ending
38
![Page 39: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/39.jpg)
How Can We Fix the Memory Problem & Design (Memory) Systems of the Future?
![Page 40: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/40.jpg)
Look Backward to Look Forward n We first need to understand the principles of:
q Memory and DRAM q Memory controllers q Techniques for reducing and tolerating memory latency q Potential memory technologies that can compete with DRAM
n This is what we will cover in the next few lectures
40
![Page 41: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/41.jpg)
Main Memory
![Page 42: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/42.jpg)
Main Memory in the System
42
CORE 1
L2 CA
CH
E 0
SHA
RED
L3 CA
CH
E
DR
AM
INTER
FAC
E
CORE 0
CORE 2 CORE 3 L2 C
AC
HE 1
L2 CA
CH
E 2
L2 CA
CH
E 3
DR
AM
BA
NK
S
DRAM MEMORY CONTROLLER
![Page 43: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/43.jpg)
The Memory Chip/System Abstraction
43
![Page 44: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/44.jpg)
Review: Memory Bank Organization n Read access sequence:
1. Decode row address & drive word-lines
2. Selected bits drive bit-lines • Entire row read
3. Amplify row data 4. Decode column
address & select subset of row
• Send to output 5. Precharge bit-lines • For next access
44
![Page 45: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/45.jpg)
Review: SRAM (Static Random Access Memory)
45
bit-cell array
2n row x 2m-col
(n≈m to minimize overall latency)
sense amp and mux 2m diff pairs
2n n
m
1
row select
bitli
ne
_bitl
ine
n+m
Read Sequence 1. address decode 2. drive row select 3. selected bit-cells drive bitlines (entire row is read together)
4. diff. sensing and col. select (data is ready) 5. precharge all bitlines (for next read or write)
Access latency dominated by steps 2 and 3 Cycling time dominated by steps 2, 3 and 5
- step 2 proportional to 2m
- step 3 and 5 proportional to 2n
![Page 46: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/46.jpg)
Review: DRAM (Dynamic Random Access Memory)
46
row enable _b
itlin
e
bit-cell array
2n row x 2m-col
(n≈m to minimize overall latency)
sense amp and mux 2m
2n n
m
1
RAS
CAS A DRAM die comprises of multiple such arrays
Bits stored as charges on node capacitance (non-restorative)
- bit cell loses charge when read - bit cell loses charge over time
Read Sequence 1~3 same as SRAM 4. a “flip-flopping” sense amp
amplifies and regenerates the bitline, data bit is mux’ed out
5. precharge all bitlines
Refresh: A DRAM controller must periodically read all rows within the allowed refresh time (10s of ms) such that charge is restored in cells
![Page 47: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/47.jpg)
Review: DRAM vs. SRAM n DRAM
q Slower access (capacitor) q Higher density (1T 1C cell) q Lower cost q Requires refresh (power, performance, circuitry) q Manufacturing requires putting capacitor and logic together
n SRAM q Faster access (no capacitor) q Lower density (6T cell) q Higher cost q No need for refresh q Manufacturing compatible with logic process (no capacitor)
47
![Page 48: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/48.jpg)
Some Fundamental Concepts (I) n Physical address space
q Maximum size of main memory: total number of uniquely identifiable locations
n Physical addressability q Minimum size of data in memory can be addressed q Byte-addressable, word-addressable, 64-bit-addressable q Microarchitectural addressability depends on the abstraction
level of the implementation
n Alignment q Does the hardware support unaligned access transparently to
software?
n Interleaving 48
![Page 49: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/49.jpg)
Some Fundamental Concepts (II) n Interleaving (banking)
q Problem: a single monolithic memory array takes long to access and does not enable multiple accesses in parallel
q Goal: Reduce the latency of memory array access and enable multiple accesses in parallel
q Idea: Divide the array into multiple banks that can be accessed independently (in the same cycle or in consecutive cycles) n Each bank is smaller than the entire memory storage n Accesses to different banks can be overlapped
q A Key Issue: How do you map data to different banks? (i.e., how do you interleave data across banks?)
49
![Page 50: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/50.jpg)
Interleaving
50
![Page 51: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/51.jpg)
Interleaving Options
51
![Page 52: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/52.jpg)
Some Questions/Concepts n Remember CRAY-1 with 16 banks
q 11 cycle bank latency q Consecutive words in memory in consecutive banks (word
interleaving) q 1 access can be started (and finished) per cycle
n Can banks be operated fully in parallel? q Multiple accesses started per cycle?
n What is the cost of this? q We have seen it earlier
n Modern superscalar processors have L1 data caches with multiple, fully-independent banks; DRAM banks share buses
52
![Page 53: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/53.jpg)
The Bank Abstraction
53
![Page 54: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/54.jpg)
54
Rank
![Page 55: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/55.jpg)
The DRAM Subsystem
![Page 56: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/56.jpg)
DRAM Subsystem Organization
n Channel n DIMM n Rank n Chip n Bank n Row/Column n Cell
56
![Page 57: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/57.jpg)
Page Mode DRAM n A DRAM bank is a 2D array of cells: rows x columns n A “DRAM row” is also called a “DRAM page” n “Sense amplifiers” also called “row buffer”
n Each address is a <row,column> pair n Access to a “closed row”
q Activate command opens row (placed into row buffer) q Read/write command reads/writes column in the row buffer q Precharge command closes the row and prepares the bank for
next access
n Access to an “open row” q No need for an activate command
57
![Page 58: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/58.jpg)
The DRAM Bank Structure
58
![Page 59: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/59.jpg)
DRAM Bank Operation
59
Row Buffer
(Row 0, Column 0)
Row
dec
oder
Column mux
Row address 0
Column address 0
Data
Row 0 Empty
(Row 0, Column 1)
Column address 1
(Row 0, Column 85)
Column address 85
(Row 1, Column 0)
HIT HIT
Row address 1
Row 1
Column address 0
CONFLICT !
Columns
Row
s
Access Address:
![Page 60: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/60.jpg)
The DRAM Chip n Consists of multiple banks (8 is a common number today) n Banks share command/address/data buses n The chip itself has a narrow interface (4-16 bits per read)
n Changing the number of banks, size of the interface (pins), whether or not command/address/data buses are shared has significant impact on DRAM system cost
60
![Page 61: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/61.jpg)
128M x 8-bit DRAM Chip
61
![Page 62: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/62.jpg)
DRAM Rank and Module n Rank: Multiple chips operated together to form a wide
interface n All chips comprising a rank are controlled at the same time
q Respond to a single command q Share address and command buses, but provide different data
n A DRAM module consists of one or more ranks q E.g., DIMM (dual inline memory module) q This is what you plug into your motherboard
n If we have chips with 8-bit interface, to read 8 bytes in a single access, use 8 chips in a DIMM
62
![Page 63: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/63.jpg)
A 64-bit Wide DIMM (One Rank)
63
DRAMChip
DRAMChip
DRAMChip
DRAMChip
DRAMChip
DRAMChip
DRAMChip
DRAMChip
Command Data
![Page 64: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/64.jpg)
A 64-bit Wide DIMM (One Rank) n Advantages:
q Acts like a high-capacity DRAM chip with a wide interface
q Flexibility: memory controller does not need to deal with individual chips
n Disadvantages: q Granularity:
Accesses cannot be smaller than the interface width
64
![Page 65: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/65.jpg)
Multiple DIMMs
65
n Advantages: q Enables even
higher capacity
n Disadvantages: q Interconnect
complexity and energy consumption can be high
à Scalability is limited by this
![Page 66: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/66.jpg)
DRAM Channels
n 2 Independent Channels: 2 Memory Controllers (Above) n 2 Dependent/Lockstep Channels: 1 Memory Controller with
wide interface (Not Shown above)
66
![Page 67: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/67.jpg)
Generalized Memory Structure
67
![Page 68: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/68.jpg)
Generalized Memory Structure
68
![Page 69: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/69.jpg)
The DRAM Subsystem The Top Down View
![Page 70: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/70.jpg)
DRAM Subsystem Organization
n Channel n DIMM n Rank n Chip n Bank n Row/Column n Cell
70
![Page 71: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/71.jpg)
The DRAM subsystem
Memory channel Memory channel
DIMM (Dual in-‐line memory module)
Processor
“Channel”
![Page 72: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/72.jpg)
Breaking down a DIMM
DIMM (Dual in-‐line memory module)
Side view
Front of DIMM Back of DIMM
![Page 73: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/73.jpg)
Breaking down a DIMM
DIMM (Dual in-‐line memory module)
Side view
Front of DIMM Back of DIMM
Rank 0: collecEon of 8 chips Rank 1
![Page 74: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/74.jpg)
Rank
Rank 0 (Front) Rank 1 (Back)
Data <0:63> CS <0:1> Addr/Cmd
<0:63> <0:63>
Memory channel
![Page 75: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/75.jpg)
Breaking down a Rank
Rank 0
<0:63>
Chip 0
Chip 1
Chip 7 . . .
<0:7>
<8:15>
<56:63>
Data <0:63>
![Page 76: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/76.jpg)
Breaking down a Chip
Chip 0
<0:7>
Bank 0
<0:7>
<0:7>
<0:7>
...
<0:7>
![Page 77: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/77.jpg)
Breaking down a Bank
Bank 0
<0:7>
row 0
row 16k-‐1
... 2kB
1B
1B (column)
1B
Row-‐buffer
1B
... <0:7>
![Page 78: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/78.jpg)
DRAM Subsystem Organization
n Channel n DIMM n Rank n Chip n Bank n Row/Column n Cell
78
![Page 79: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/79.jpg)
Example: Transferring a cache block
0xFFFF…F
0x00
0x40
...
64B cache block
Physical memory space
Channel 0
DIMM 0
Rank 0 Mappe
d to
![Page 80: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/80.jpg)
Example: Transferring a cache block
0xFFFF…F
0x00
0x40
...
64B cache block
Physical memory space
Rank 0 Chip 0 Chip 1 Chip 7
<0:7>
<8:15>
<56:63>
Data <0:63>
. . .
![Page 81: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/81.jpg)
Example: Transferring a cache block
0xFFFF…F
0x00
0x40
...
64B cache block
Physical memory space
Rank 0 Chip 0 Chip 1 Chip 7
<0:7>
<8:15>
<56:63>
Data <0:63>
Row 0 Col 0
. . .
![Page 82: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/82.jpg)
Example: Transferring a cache block
0xFFFF…F
0x00
0x40
...
64B cache block
Physical memory space
Rank 0 Chip 0 Chip 1 Chip 7
<0:7>
<8:15>
<56:63>
Data <0:63>
8B
Row 0 Col 0
. . .
8B
![Page 83: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/83.jpg)
Example: Transferring a cache block
0xFFFF…F
0x00
0x40
...
64B cache block
Physical memory space
Rank 0 Chip 0 Chip 1 Chip 7
<0:7>
<8:15>
<56:63>
Data <0:63>
8B
Row 0 Col 1
. . .
![Page 84: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/84.jpg)
Example: Transferring a cache block
0xFFFF…F
0x00
0x40
...
64B cache block
Physical memory space
Rank 0 Chip 0 Chip 1 Chip 7
<0:7>
<8:15>
<56:63>
Data <0:63>
8B
8B
Row 0 Col 1
. . .
8B
![Page 85: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/85.jpg)
Example: Transferring a cache block
0xFFFF…F
0x00
0x40
...
64B cache block
Physical memory space
Rank 0 Chip 0 Chip 1 Chip 7
<0:7>
<8:15>
<56:63>
Data <0:63>
8B
8B
Row 0 Col 1
A 64B cache block takes 8 I/O cycles to transfer.
During the process, 8 columns are read sequenUally.
. . .
![Page 86: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/86.jpg)
Latency Components: Basic DRAM Operation
n CPU → controller transfer time n Controller latency
q Queuing & scheduling delay at the controller q Access converted to basic commands
n Controller → DRAM transfer time n DRAM bank latency
q Simple CAS (column address strobe) if row is “open” OR q RAS (row address strobe) + CAS if array precharged OR q PRE + RAS + CAS (worst case)
n DRAM → Controller transfer time q Bus latency (BL)
n Controller to CPU transfer time
86
![Page 87: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/87.jpg)
Multiple Banks (Interleaving) and Channels n Multiple banks
q Enable concurrent DRAM accesses q Bits in address determine which bank an address resides in
n Multiple independent channels serve the same purpose q But they are even better because they have separate data buses q Increased bus bandwidth
n Enabling more concurrency requires reducing q Bank conflicts q Channel conflicts
n How to select/randomize bank/channel indices in address? q Lower order bits have more entropy q Randomizing hash functions (XOR of different address bits)
87
![Page 88: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/88.jpg)
How Multiple Banks/Channels Help
88
![Page 89: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/89.jpg)
Multiple Channels n Advantages
q Increased bandwidth q Multiple concurrent accesses (if independent channels)
n Disadvantages q Higher cost than a single channel
n More board wires n More pins (if on-chip memory controller)
89
![Page 90: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/90.jpg)
Address Mapping (Single Channel) n Single-channel system with 8-byte memory bus
q 2GB memory, 8 banks, 16K rows & 2K columns per bank
n Row interleaving q Consecutive rows of memory in consecutive banks
q Accesses to consecutive cache blocks serviced in a pipelined manner
n Cache block interleaving n Consecutive cache block addresses in consecutive banks n 64 byte cache blocks
n Accesses to consecutive cache blocks can be serviced in parallel
90
Column (11 bits) Bank (3 bits) Row (14 bits) Byte in bus (3 bits)
Low Col. High Column Row (14 bits) Byte in bus (3 bits) Bank (3 bits) 3 bits 8 bits
![Page 91: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/91.jpg)
Bank Mapping Randomization n DRAM controller can randomize the address mapping to
banks so that bank conflicts are less likely
91
Column (11 bits) 3 bits Byte in bus (3 bits)
XOR
Bank index (3 bits)
![Page 92: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/92.jpg)
Address Mapping (Multiple Channels)
n Where are consecutive cache blocks?
92
Column (11 bits) Bank (3 bits) Row (14 bits) Byte in bus (3 bits) C
Column (11 bits) Bank (3 bits) Row (14 bits) Byte in bus (3 bits) C
Column (11 bits) Bank (3 bits) Row (14 bits) Byte in bus (3 bits) C
Column (11 bits) Bank (3 bits) Row (14 bits) Byte in bus (3 bits) C
Low Col. High Column Row (14 bits) Byte in bus (3 bits) Bank (3 bits) 3 bits 8 bits
C
Low Col. High Column Row (14 bits) Byte in bus (3 bits) Bank (3 bits) 3 bits 8 bits
C
Low Col. High Column Row (14 bits) Byte in bus (3 bits) Bank (3 bits) 3 bits 8 bits
C
Low Col. High Column Row (14 bits) Byte in bus (3 bits) Bank (3 bits) 3 bits 8 bits
C
Low Col. High Column Row (14 bits) Byte in bus (3 bits) Bank (3 bits) 3 bits 8 bits
C
![Page 93: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/93.jpg)
Interaction with VirtualàPhysical Mapping n Operating System influences where an address maps to in
DRAM
n Operating system can influence which bank/channel/rank a virtual page is mapped to.
n It can perform page coloring to q Minimize bank conflicts q Minimize inter-application interference [Muralidhara+ MICRO’11]
93
Column (11 bits) Bank (3 bits) Row (14 bits) Byte in bus (3 bits)
Page offset (12 bits) Physical Frame number (19 bits)
Page offset (12 bits) Virtual Page number (52 bits) VA
PA PA
![Page 94: 18-447 Computer Architecture Lecture 21: Main Memoryece447/s15/lib/exe/fetch... · 18-447 Computer Architecture Lecture 21: Main Memory Prof. Onur Mutlu Carnegie Mellon University](https://reader033.vdocuments.net/reader033/viewer/2022053015/5f165a49c9c23400764f7313/html5/thumbnails/94.jpg)
More on Reducing Bank Conflicts n Read Sections 1 through 4 of:
q Kim et al., “A Case for Exploiting Subarray-Level Parallelism in DRAM,” ISCA 2012.
94