CS 505: Computer Structures: Memory and Disk I/O

DESCRIPTION

CS 505: Computer Structures, Memory and Disk I/O. Thu D. Nguyen, Spring 2005, Computer Science, Rutgers University. Main Memory Background. Performance of Main Memory. Latency: cache miss penalty. Access time: time between request and word arrival. Cycle time: time between requests.
CS 505: Thu D. Nguyen, Rutgers University, Spring 2003
CS 505: Computer Structures
Memory and Disk I/O
Thu D. Nguyen
Spring 2005
Computer Science
Rutgers University
Main Memory Background
• Performance of Main Memory:
  – Latency: Cache Miss Penalty
    » Access Time: time between request and word arrives
    » Cycle Time: time between requests
  – Bandwidth: I/O & Large Block Miss Penalty (L2)
• Main Memory is DRAM: Dynamic Random Access Memory
  – Dynamic since needs to be refreshed periodically (8 ms)
  – Addresses divided into 2 halves (Memory as a 2D matrix):
    » RAS or Row Access Strobe
    » CAS or Column Access Strobe
• Cache uses SRAM: Static Random Access Memory
  – No refresh (6 transistors/bit vs. 1 transistor)
  – Size: DRAM/SRAM 4-8
  – Cost/Cycle time: SRAM/DRAM 8-16
DRAM logical organization (4 Mbit)
[Figure: column decoder, sense amps & I/O, 2,048 x 2,048 memory array, 11 address lines A0…A10, D and Q data pins; inset shows a word line and storage cell.]
4 Key DRAM Timing Parameters
• tRAC: minimum time from RAS line falling to the valid data output.
– Quoted as the speed of a DRAM when you buy it
– A typical 4Mb DRAM tRAC = 60 ns
• tRC: minimum time from the start of one row access to the start of the next.
– tRC = 110 ns for a 4Mbit DRAM with a tRAC of 60 ns
• tCAC: minimum time from CAS line falling to valid data output.
– 15 ns for a 4Mbit DRAM with a tRAC of 60 ns
• tPC: minimum time from the start of one column access to the start of the next.
– 35 ns for a 4Mbit DRAM with a tRAC of 60 ns
DRAM Performance
• A 60 ns (tRAC) DRAM can:
  – perform a row access only every 110 ns (tRC)
  – perform a column access (tCAC) in 15 ns, but time between column accesses is at least 35 ns (tPC)
    » In practice, external address delays and turning around buses make it 40 to 50 ns
• These times do not include the time to drive the addresses off the microprocessor nor the memory controller overhead!
DRAM History
• DRAMs: capacity +60%/yr, cost –30%/yr
  – 2.5X cells/area, 1.5X die size in 3 years
• ‘98 DRAM fab line costs $2B
  – DRAM only: density, leakage v. speed
• Rely on increasing no. of computers & memory per computer (60% market)
  – SIMM or DIMM is replaceable unit => computers use any generation DRAM
• Commodity, second-source industry => high volume, low profit, conservative
  – Little organization innovation in 20 years
• Order of importance: 1) Cost/bit 2) Capacity
  – First RAMBUS: 10X BW, +30% cost => little impact
More esoteric Storage Technologies?
• Tunneling Magnetic Junction RAM (TMJ-RAM):
  – Speed of SRAM, density of DRAM, non-volatile (no refresh)
  – New field called “Spintronics”: combination of quantum spin and electronics
  – Same technology used in high-density disk drives
• MEMS storage devices:
  – Large magnetic “sled” floating on top of lots of little read/write heads
  – Micromechanical actuators move the sled back and forth over the heads
MEMS-based Storage
• Magnetic “sled” floats on array of read/write heads
  – Approx 250 Gbit/in²
  – Data rates: IBM: 250 MB/s with 1000 heads; CMU: 3.1 MB/s with 400 heads
• Electrostatic actuators move media around to align it with heads
  – Sweep sled ±50 µm in < 0.5 ms
• Capacity estimated to be 1-10 GB in 10 cm²
See Ganger et al.: http://www.lcs.ece.cmu.edu/research/MEMS
Main Memory Performance
• Simple:
  – CPU, Cache, Bus, Memory same width (32 or 64 bits)
• Wide:
  – CPU/Mux 1 word; Mux/Cache, Bus, Memory N words (Alpha: 64 bits & 256 bits; UltraSPARC 512)
• Interleaved:
  – CPU, Cache, Bus 1 word; Memory N Modules (4 Modules); example is word interleaved
Main Memory Performance
• Timing model (word size is 32 bits)
  – 1 to send address
  – 6 access time, 1 to send data
  – Cache Block is 4 words
• Simple M.P. = 4 x (1+6+1) = 32
• Wide M.P. = 1 + 6 + 1 = 8
• Interleaved M.P. = 1 + 6 + 4x1 = 11
How Many Banks?
• Number of banks ≥ number of clock cycles to access a word in a bank
  – otherwise will return to original bank before it can have next word ready
• Increasing DRAM size => fewer chips => harder to have banks
DRAMs per PC over Time

Number of DRAM chips needed for the minimum memory size, by DRAM generation:

| Minimum Memory Size | ‘86 1 Mb | ‘89 4 Mb | ‘92 16 Mb | ‘96 64 Mb | ‘99 256 Mb | ‘02 1 Gb |
|---|---|---|---|---|---|---|
| 4 MB | 32 | 8 | | | | |
| 8 MB | | 16 | 4 | | | |
| 16 MB | | | 8 | 2 | | |
| 32 MB | | | | 4 | 1 | |
| 64 MB | | | | 8 | 2 | |
| 128 MB | | | | | 4 | 1 |
| 256 MB | | | | | 8 | 2 |
Avoiding Bank Conflicts
• Lots of banks
  int x[256][512];
  for (j = 0; j < 512; j = j+1)
    for (i = 0; i < 256; i = i+1)
      x[i][j] = 2 * x[i][j];
• Even with 128 banks, since 512 is multiple of 128, conflict on word accesses
• SW: loop interchange or declaring array not power of 2 (“array padding”)
• HW: Prime number of banks
  – bank number = address mod number of banks
  – address within bank = address / number of words in bank
  – modulo & divide per memory access with prime no. banks?
  – address within bank = address mod number of words in bank
  – bank number? easy if 2^N words per bank
Fast Memory Systems: DRAM specific
• Multiple CAS accesses: several names (page mode)
  – Extended Data Out (EDO): 30% faster in page mode
• New DRAMs to address gap; what will they cost, will they survive?
  – RAMBUS: startup company; reinvent DRAM interface
    » Each chip a module vs. slice of memory
    » Short bus between CPU and chips
    » Does own refresh
    » Variable amount of data returned
    » 1 byte / 2 ns (500 MB/s per chip)
  – Synchronous DRAM: 2 banks on chip, a clock signal to DRAM, transfer synchronous to system clock (66 - 150 MHz)
• Niche memory or main memory?
  – e.g., Video RAM for frame buffers, DRAM + fast serial output
Potential DRAM Crossroads?
• After 20 years of 4X every 3 years, running into wall? (64 Mb - 1 Gb)
• How can keep $1B fab lines full if buy fewer DRAMs per computer?
• Cost/bit –30%/yr if stop 4X/3 yr?
• What will happen to $40B/yr DRAM industry?
Main Memory Summary
• Wider Memory
• Interleaved Memory: for sequential or independent accesses
• Avoiding bank conflicts: SW & HW
• DRAM-specific optimizations: page mode & specialty DRAM
• DRAM future less rosy?
Virtual Memory: TB (TLB)
[Figure: three cache organizations.
(1) Conventional organization: CPU sends VA to TB; TB sends PA to $; $ sends PA to MEM.
(2) Virtually addressed cache, translate only on miss: CPU sends VA to $; on a miss, VA goes to TB, which sends PA to MEM; suffers the synonym problem.
(3) Overlap $ access with VA translation: CPU sends VA to $ and TB in parallel; requires $ index to remain invariant across translation; PA tags compared after translation (or VA tags with an L2 $).]
2. Fast hits by Avoiding Address Translation
• Send virtual address to cache? Called Virtually Addressed Cache or just Virtual Cache vs. Physical Cache
  – Every time process is switched logically must flush the cache; otherwise get false hits
    » Cost is time to flush + “compulsory” misses from empty cache
  – Dealing with aliases (sometimes called synonyms): two different virtual addresses map to same physical address
  – I/O must interact with cache, so need virtual address
• Solution to aliases
  – One possible solution in Wang et al.’s paper
• Solution to cache flush
  – Add process identifier tag that identifies process as well as address within process: can’t get a hit if wrong process
2. Fast Cache Hits by Avoiding Translation: Process ID impact
• Black is uniprocess
• Light Gray is multiprocess when flush cache
• Dark Gray is multiprocess when use Process ID tag
• Y axis: Miss Rates up to 20%
• X axis: Cache size from 2 KB to 1024 KB
2. Fast Cache Hits by Avoiding Translation: Index with Physical Portion of Address
• If index is physical part of address, can start tag access in parallel with translation so that can compare to physical tag
• Limits cache to page size: what if want bigger caches and use the same trick?
  – Higher associativity one solution

  Page Address | Page Offset
  Address Tag | Index | Block Offset
Alpha 21064
• Separate Instr & Data TLB & Caches
• TLBs fully associative
• TLB updates in SW (“Priv Arch Libr”)
• Caches 8KB direct mapped, write thru
• Critical 8 bytes first
• Prefetch instr. stream buffer
• 2 MB L2 cache, direct mapped, WB (off-chip)
• 256 bit path to main memory, 4 x 64-bit modules
• Victim Buffer: to give read priority over write
• 4 entry write buffer between D$ & L2$

[Figure: 21064 block diagram with separate instruction and data paths, stream buffer, write buffer, and victim buffer.]
Alpha Memory Performance: Miss Rates of SPEC92

[Figure: log-scale miss rates (0.0001 to 1) for AlphaSort, Eqntott, Ora, Alvinn, and Spice; curves for I$ (8K), D$ (8K), and L2 (2M).]
Alpha CPI Components

[Figure: CPI (0 to 5) for AlphaSort, Espresso, Sc, Mdljsp2, Ear, Alvinn, and Mdljp2, broken into L2, I$, D$, I Stall, and Other components.]
• Instruction stall: branch mispredict (green); Data cache (blue); Instruction cache (yellow); L2$ (pink)
• Other: compute + reg conflicts, structural conflicts
Pitfall: Predicting Cache Performance from Different Prog. (ISA, compiler, ...)
• 4KB Data cache miss rate 8%,12%, or 28%?
• 1KB Instr cache miss rate 0%,3%,or 10%?
• Alpha vs. MIPS for 8KB Data $:17% vs. 10%
• Why 2X Alpha v. MIPS?
[Figure: miss rate (0% to 35%) vs. cache size (1 KB to 128 KB) for D$ and I$ on tomcatv, gcc, and espresso.]
Pitfall: Simulating Too Small an Address Trace

[Figure: cumulative average memory access time (1 to 4.5) vs. instructions executed (0 to 12 billion).]
I$ = 4 KB, B=16B; D$ = 4 KB, B=16B; L2 = 512 KB, B=128B; MP = 12, 200
Main Memory Summary
• Wider Memory
• Interleaved Memory: for sequential or independent accesses
• Avoiding bank conflicts: SW & HW
• DRAM-specific optimizations: page mode & specialty DRAM
• DRAM future less rosy?
Outline
• Disk Basics
• Disk History
• Disk options in 2000
• Disk fallacies and performance
• Tapes
• RAID
Disk Device Terminology
• Several platters, with information recorded magnetically on both surfaces (usually)
• Actuator moves head (end of arm, 1/surface) over track (“seek”), select surface, wait for sector to rotate under head, then read or write
  – “Cylinder”: all tracks under heads
• Bits recorded in tracks, which in turn divided into sectors (e.g., 512 Bytes)

[Figure: platter, outer track, inner track, sector, actuator, head, arm.]
Photo of Disk Head, Arm, Actuator
[Photo: actuator, arm, head, spindle, and 12 platters.]
Disk Device Performance
[Figure: platter, arm, actuator, head, sector, inner track, outer track, controller, spindle.]
• Disk Latency = Seek Time + Rotation Time + Transfer Time + Controller Overhead
• Seek Time? depends on no. of tracks the arm moves and seek speed of disk
• Rotation Time? depends on speed disk rotates, how far sector is from head
• Transfer Time? depends on data rate (bandwidth) of disk (bit density), size of request
Disk Device Performance
• Average distance of sector from head?
  – 1/2 the time of a rotation
    » 7200 Revolutions Per Minute = 120 Rev/sec
    » 1 revolution = 1/120 sec = 8.33 milliseconds
    » 1/2 rotation (revolution) = 4.16 ms
• Average no. of tracks to move arm?
  – Sum all possible seek distances from all possible tracks / # possible
    » Assumes average seek distance is random
  – Disk industry standard benchmark
Data Rate: Inner vs. Outer Tracks
• To keep things simple, originally kept same number of sectors per track
  – Since outer track longer, lower bits per inch
• Competition decided to keep BPI the same for all tracks (“constant bit density”)
  – More capacity per disk
  – More sectors per track towards edge
  – Since disk spins at constant speed, outer tracks have faster data rate
• Bandwidth of outer track 1.7X inner track!
Devices: Magnetic Disks
[Figure: sector, track, cylinder, head, platter.]
• Purpose:
  – Long-term, nonvolatile storage
  – Large, inexpensive, slow level in the storage hierarchy
• Characteristics:
  – Seek Time (~8 ms avg)
• Transfer rate
  – 10-30 MByte/sec
  – Blocks
• Capacity
  – Gigabytes
  – Quadruples every 3 years (aerodynamics)

7200 RPM = 120 RPS => 8.3 ms per rev; avg rot. latency = 4.2 ms
128 sectors per track => ~0.065 ms per sector
1 KB per sector => ~16 MB/s

Response time = Queue + Controller + Seek + Rot + Xfer (the last four terms are the service time)
Historical Perspective
• 1956 IBM Ramac — early 1970s Winchester
  – Developed for mainframe computers, proprietary interfaces
  – Steady shrink in form factor: 27 in. to 14 in.
• 1970s developments
  – 5.25 inch floppy disk formfactor (microcode into mainframe)
  – early emergence of industry standard disk interfaces
    » ST506, SASI, SMD, ESDI
• Early 1980s
  – PCs and first generation workstations
• Mid 1980s
  – Client/server computing
  – Centralized storage on file server
    » accelerates disk downsizing: 8 inch to 5.25 inch
  – Mass market disk drives become a reality
    » industry standards: SCSI, IPI, IDE
    » 5.25 inch drives for standalone PCs; end of proprietary interfaces
Disk History
[Figure: drive photos with data density (Mbit/sq. in.) and capacity of unit shown (MBytes):
1973: 1.7 Mbit/sq. in., 140 MBytes
1979: 7.7 Mbit/sq. in., 2,300 MBytes]
source: New York Times, 2/23/98, page C3, “Makers of disk drives crowd even more data into even smaller spaces”
Historical Perspective
• Late 1980s/Early 1990s:
  – Laptops, notebooks, (palmtops)
  – 3.5 inch, 2.5 inch, (1.8 inch formfactors)
  – Formfactor plus capacity drives market, not so much performance
    » Recently bandwidth improving at 40%/year
  – Challenged by DRAM, flash RAM in PCMCIA cards
    » still expensive, Intel promises but doesn’t deliver
    » unattractive MBytes per cubic inch
  – Optical disk fails on performance but finds niche (CD ROM)
Disk History
1989: 63 Mbit/sq. in., 60,000 MBytes
1997: 1450 Mbit/sq. in., 2300 MBytes
1997: 3090 Mbit/sq. in., 8100 MBytes
source: New York Times, 2/23/98, page C3, “Makers of disk drives crowd even more data into even smaller spaces”
1 inch disk drive!
• 2000 IBM MicroDrive:
  – 1.7” x 1.4” x 0.2”
  – 1 GB, 3600 RPM, 5 MB/s, 15 ms seek
  – Digital camera, PalmPC?
• 2006 MicroDrive?
  – 9 GB, 50 MB/s!
    » Assuming it finds a niche in a successful product
    » Assuming past trends continue
Disk Performance Model /Trends
• Capacity: +100%/year (2X / 1.0 yrs)
• Transfer rate (BW): +40%/year (2X / 2.0 yrs)
• Rotation + Seek time: –8%/year (1/2 in 10 yrs)
• MB/$: > 100%/year (2X / <1.5 yrs)
  – Fewer chips + areal density
State of the Art: Ultrastar 72ZX
• 73.4 GB, 3.5 inch disk
• 2¢/MB
• 10,000 RPM; 3 ms = 1/2 rotation
• 11 platters, 22 surfaces
• 15,110 cylinders
• 7 Gbit/sq. in. areal density
• 17 watts (idle)
• 0.1 ms controller time
• 5.3 ms avg. seek
• 50 to 29 MB/s (internal)
source: www.ibm.com; www.pricewatch.com; 2/14/00

Latency = Queuing Time + Controller time + Seek Time + Rotation Time + Size / Bandwidth
(the first four terms are per access; the last is per byte)

[Figure: sector, track, cylinder, head, platter, arm, track buffer.]
Disk Performance Example
• Calculate time to read 1 sector (512 B) for UltraStar 72 using advertised performance; sector is on outer track
• Disk latency = average seek time + average rotational delay + transfer time + controller overhead
• = 5.3 ms + 0.5 * 1/(10000 RPM) + 0.5 KB / (50 MB/s) + 0.15 ms
• = 5.3 ms + 0.5 / (10000 RPM / (60,000 ms/min)) + 0.5 KB / (50 KB/ms) + 0.15 ms
• = 5.3 + 3.0 + 0.01 + 0.15 ms = 8.46 ms
Areal Density
• Bits recorded along a track
  – Metric is Bits Per Inch (BPI)
• Number of tracks per surface
  – Metric is Tracks Per Inch (TPI)
• Care about bit density per unit area
  – Metric is Bits Per Square Inch
  – Called Areal Density
  – Areal Density = BPI x TPI
Areal Density

| Year | Areal Density (Mbit/sq. in.) |
|---|---|
| 1973 | 1.7 |
| 1979 | 7.7 |
| 1989 | 63 |
| 1997 | 3090 |
| 2000 | 17,100 |

[Figure: areal density (log scale, 1 to 100,000) vs. year, 1970-2000.]
– Areal Density = BPI x TPI
– Change in slope from 30%/yr to 60%/yr about 1991
Disk Characteristics in 2000

| | Seagate Cheetah ST173404LC (Ultra160 SCSI) | IBM Travelstar 32GH DJSA-232 (ATA-4) | IBM 1GB Microdrive DSCM-11000 |
|---|---|---|---|
| Disk diameter (inches) | 3.5 | 2.5 | 1.0 |
| Formatted data capacity (GB) | 73.4 | 32.0 | 1.0 |
| Cylinders | 14,100 | 21,664 | 7,167 |
| Disks | 12 | 4 | 1 |
| Recording surfaces (heads) | 24 | 8 | 2 |
| Bytes per sector | 512 to 4096 | 512 | 512 |
| Avg. sectors per track (512 byte) | ~424 | ~360 | ~140 |
| Max. areal density (Gbit/sq. in.) | 6.0 | 14.0 | 15.2 |
Disk Characteristics in 2000 (continued)

| | Seagate Cheetah ST173404LC (Ultra160 SCSI) | IBM Travelstar 32GH DJSA-232 (ATA-4) | IBM 1GB Microdrive DSCM-11000 |
|---|---|---|---|
| Rotation speed (RPM) | 10033 | 5411 | 3600 |
| Avg. seek ms (read/write) | 5.6/6.2 | 12.0 | 12.0 |
| Minimum seek ms (read/write) | 0.6/0.9 | 2.5 | 1.0 |
| Max. seek ms | 14.0/15.0 | 23.0 | 19.0 |
| Data transfer rate (MB/second) | 27 to 40 | 11 to 21 | 2.6 to 4.2 |
| Link speed to buffer (MB/s) | 160 | 67 | 13 |
| Power idle/operating (Watts) | 16.4 / 23.5 | 2.0 / 2.6 | 0.5 / 0.8 |
Disk Characteristics in 2000 (continued)

| | Seagate Cheetah ST173404LC (Ultra160 SCSI) | IBM Travelstar 32GH DJSA-232 (ATA-4) | IBM 1GB Microdrive DSCM-11000 |
|---|---|---|---|
| Buffer size in MB | 4.0 | 2.0 | 0.125 |
| Size: height x width x depth (inches) | 1.6 x 4.0 x 5.8 | 0.5 x 2.7 x 3.9 | 0.2 x 1.4 x 1.7 |
| Weight (pounds) | 2.00 | 0.34 | 0.035 |
| Rated MTTF in powered-on hours | 1,200,000 | (300,000?) | (20K/5 yr life?) |
| % of POH per month | 100% | 45% | 20% |
| % of POH seeking, reading, writing | 90% | 20% | 20% |
Disk Characteristics in 2000 (continued)

| | Seagate Cheetah ST173404LC (Ultra160 SCSI) | IBM Travelstar 32GH DJSA-232 (ATA-4) | IBM 1GB Microdrive DSCM-11000 |
|---|---|---|---|
| Load/Unload cycles (disk powered on/off) | 250 per year | 300,000 | 300,000 |
| Nonrecoverable read errors per bits read | <1 per 10^15 | <1 per 10^13 | <1 per 10^13 |
| Seek errors | <1 per 10^7 | not available | not available |
| Shock tolerance: Operating, Not operating | 10 G, 175 G | 150 G, 700 G | 175 G, 1500 G |
| Vibration tolerance: Operating, Not operating (sine swept, 0 to peak) | 5-400 Hz @ 0.5G, 22-400 Hz @ 2.0G | 5-500 Hz @ 1.0G, 2.5-500 Hz @ 5.0G | 5-500 Hz @ 1G, 10-500 Hz @ 5G |
Technology Trends

Disk capacity now doubles every 12 months; before 1990, every 36 months.
• Today: Processing Power Doubles Every 18 months
• Today: Memory Size Doubles Every 18-24 months (4X/3yr)
• Today: Disk Capacity Doubles Every 12-18 months
• Disk Positioning Rate (Seek + Rotate) Doubles Every Ten Years!

The I/O GAP
Fallacy: Use Data Sheet “Average Seek” Time
• Manufacturers needed standard for fair comparison (“benchmark”)
  – Calculate all seeks from all tracks, divide by number of seeks => “average”
• Real average would be based on how data is laid out on disk and where real applications seek, then measure performance
  – Usually, tend to seek to tracks nearby, not to random track
• Rule of Thumb: observed average seek time is typically about 1/4 to 1/3 of quoted seek time (i.e., 3X-4X faster)
  – UltraStar 72 avg. seek: 5.3 ms => 1.7 ms
Fallacy: Use Data Sheet Transfer Rate
• Manufacturers quote the speed of the data rate off the surface of the disk
• Sectors contain an error detection and correction field (can be 20% of sector size) plus sector number as well as data
• There are gaps between sectors on track
• Rule of Thumb: disks deliver about 3/4 of internal media rate (1.3X slower) for data
• For example, UltraStar 72 quotes 50 to 29 MB/s internal media rate
  => Expect 37 to 22 MB/s user data rate
Disk Performance Example
• Calculate time to read 1 sector for UltraStar 72 again, this time using 1/3 the quoted seek time and 3/4 of the internal outer-track bandwidth
• Disk latency = average seek time + average rotational delay + transfer time + controller overhead
• = (5.3 ms / 3) + 0.5 * 1/(10000 RPM) + 0.5 KB / (0.75 * 50 MB/s) + 0.15 ms
• = 1.77 ms + 0.5 / (10000 RPM / (60,000 ms/min)) + 0.5 KB / (37.5 KB/ms) + 0.15 ms
• = 1.77 + 3.0 + 0.01 + 0.15 ms ≈ 4.93 ms
Future Disk Size and Performance
• Continued advance in capacity (60%/yr) and bandwidth (40%/yr)
• Slow improvement in seek, rotation (8%/yr)
• Time to read whole disk:

| Year | Sequentially | Randomly (1 sector/seek) |
|---|---|---|
| 1990 | 4 minutes | 6 hours |
| 2000 | 12 minutes | 1 week(!) |

• 3.5” form factor make sense in 5-7 yrs?
SCSI: Small Computer System Interface
• Clock rate: 5 MHz / 10 (fast) / 20 (ultra) - 80 MHz (Ultra3)
• Width: n = 8 bits / 16 bits (wide); up to n - 1 devices can communicate on a bus or “string”
• Devices can be slave (“target”) or master (“initiator”)
• SCSI protocol: a series of “phases”, during which specific actions are taken by the controller and the SCSI disks
  – Bus Free: No device is currently accessing the bus
  – Arbitration: When the SCSI bus goes free, multiple devices may request (arbitrate for) the bus; fixed priority by address
  – Selection: informs the target that it will participate (Reselection if disconnected)
  – Command: the initiator reads the SCSI command bytes from host memory and sends them to the target
  – Data Transfer: data in or out, initiator: target
  – Message Phase: message in or out, initiator: target (identify, save/restore data pointer, disconnect, command complete)
  – Status Phase: target, just before command complete
Use Arrays of Small Disks?
• Conventional approach: 4 disk designs (14”, 10”, 5.25”, 3.5”), spanning low end to high end
• Disk array approach: 1 disk design (3.5”)
• Katz and Patterson asked in 1987: can smaller disks be used to close the gap in performance between disks and CPUs?
Replace Small Number of Large Disks with Large Number of Small Disks! (1988 Disks)
| Metric    | IBM 3390K  | IBM 3.5” 0061 | x70 (array of 70 3.5” disks) |
|-----------|------------|---------------|------------------------------|
| Capacity  | 20 GBytes  | 320 MBytes    | 23 GBytes                    |
| Volume    | 97 cu. ft. | 0.1 cu. ft.   | 11 cu. ft. (9X smaller)      |
| Power     | 3 KW       | 11 W          | 1 KW (3X lower)              |
| Data Rate | 15 MB/s    | 1.5 MB/s      | 120 MB/s (8X higher)         |
| I/O Rate  | 600 I/Os/s | 55 I/Os/s     | 3900 I/Os/s (6X higher)      |
| MTTF      | 250 KHrs   | 50 KHrs       | ??? Hrs                      |
| Cost      | $250K      | $2K           | $150K                        |

Disk arrays have potential for large data and I/O rates, high MB per cu. ft., and high MB per KW, but what about reliability?
Array Reliability
• Reliability of N disks = reliability of 1 disk ÷ N
  – 50,000 hours ÷ 70 disks ≈ 700 hours
  – Disk system MTTF drops from about 6 years to about 1 month!
• Arrays (without redundancy) are too unreliable to be useful!
• Hot spares support reconstruction in parallel with access: very high media availability can be achieved
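The slide's arithmetic, written out (the standard MTTF-scaling model, assuming independent, exponentially distributed disk failures):

```python
def array_mttf_hours(disk_mttf_hours, n_disks):
    """MTTF of N disks with no redundancy = MTTF of one disk / N."""
    return disk_mttf_hours / n_disks

mttf = array_mttf_hours(50_000, 70)       # ~714 hours, about one month
single = 50_000 / (24 * 365)              # one disk: ~5.7 years
```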
Redundant Arrays of (Inexpensive) Disks
• Files are “striped” across multiple disks
• Redundancy yields high data availability
  – Availability: service is still provided to the user, even if some components have failed
• Disks will still fail
• Contents are reconstructed from data redundantly stored in the array
  – Capacity penalty to store redundant info
  – Bandwidth penalty to update redundant info
Redundant Arrays of Inexpensive Disks
RAID 1: Disk Mirroring/Shadowing
• Each disk is fully duplicated onto its “mirror”: very high availability can be achieved
• Bandwidth sacrifice on write: logical write = two physical writes
• Reads may be optimized (either copy can serve a read)
• Most expensive solution: 100% capacity overhead
• (RAID 2 is not interesting, so we skip it)

[Figure: mirrored disk pairs, each pair forming a recovery group]
Redundant Array of Inexpensive Disks RAID 3:
Parity Disk
[Figure: a logical record (100100111100110110010011) is striped into physical records across four data disks, with a fifth parity disk P.]

• P contains the sum of the other disks per stripe, mod 2 (“parity”)
• If a disk fails, subtract P from the sum of the other disks to find the missing information
RAID 3
• Sum computed across the recovery group to protect against hard disk failures; stored on the P disk
• Logically a single high-capacity, high-transfer-rate disk: good for large transfers
• Wider arrays reduce the capacity cost of parity, but decrease availability
• 33% capacity cost for parity in this configuration
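The mod-2 sum and recovery described above are just XOR across the stripe. A minimal sketch (the function names and byte values are mine):

```python
from functools import reduce

def parity(blocks):
    """Mod-2 sum per byte position across the data blocks of a stripe."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def reconstruct(surviving_blocks, p):
    """A lost block is the XOR of the parity with all surviving blocks."""
    return parity(surviving_blocks + [p])

data = [b'\x93', b'\xcd', b'\xa3', b'\xcd']    # four striped data blocks
p = parity(data)
# Disk 1 fails: its contents come back from the other disks plus P.
recovered = reconstruct([data[0], data[2], data[3]], p)
```

Because XOR is its own inverse, "subtract P from the sum of the other disks" and "XOR everything together" are the same operation.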
Inspiration for RAID 4
• RAID 3 relies on the parity disk to discover errors on a read
• But every sector has an error-detection field
• Rely on the sector’s error-detection field, not the parity disk, to catch errors on a read
• This allows independent reads to different disks simultaneously
Redundant Arrays of Inexpensive Disks RAID 4:
High I/O Rate Parity

Insides of 5 disks (each row is one stripe; logical disk addresses increase left to right, then down the disk columns; parity is kept on a dedicated disk):

D0   D1   D2   D3   P
D4   D5   D6   D7   P
D8   D9   D10  D11  P
D12  D13  D14  D15  P
D16  D17  D18  D19  P
D20  D21  D22  D23  P
...

Example: small read of D0 & D5; large write of D12-D15.
Inspiration for RAID 5
• RAID 4 works well for small reads
• Small writes (write to one disk):
  – Option 1: read the other data disks, create the new sum, and write it to the parity disk
  – Option 2: since P holds the old sum, compare old data to new data and add the difference to P
• Small writes are limited by the parity disk: writes to D0 and D5 both also write to the P disk

D0   D1   D2   D3   P
D4   D5   D6   D7   P
Redundant Arrays of Inexpensive Disks RAID 5: High I/O Rate Interleaved Parity

Independent writes are possible because of interleaved parity.
D0   D1   D2   D3   P
D4   D5   D6   P    D7
D8   D9   P    D10  D11
D12  P    D13  D14  D15
P    D16  D17  D18  D19
D20  D21  D22  D23  P
...

(Logical disk addresses increase left to right, then down the disk columns; the parity block rotates one disk to the left each stripe.)

Example: write to D0, D5 uses disks 0, 1, 3, 4
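The rotating layout can be captured by a small mapping from logical block number to (stripe, data disk, parity disk). The function and the rotation convention are mine, chosen to match the figure (parity starts on the rightmost disk and moves left one disk per stripe):

```python
def raid5_map(block, ndisks=5):
    """Map a logical block to (stripe, data_disk, parity_disk) for the
    rotated-parity layout shown in the figure."""
    stripe, offset = divmod(block, ndisks - 1)       # ndisks-1 data blocks/stripe
    pdisk = (ndisks - 1) - (stripe % ndisks)         # parity rotates left
    disk = offset if offset < pdisk else offset + 1  # data skips the parity slot
    return stripe, disk, pdisk

# The slide's example: D0 lives on disk 0 with parity on disk 4, and D5 lives
# on disk 1 with parity on disk 3 -- four distinct disks, so the two small
# writes can proceed independently.
```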
Problems of Disk Arrays: Small Writes
RAID-5 Small Write Algorithm: 1 logical write = 2 physical reads + 2 physical writes

To replace old data D0 with new data D0' in the stripe D0 D1 D2 D3 P:
1. Read old data D0
2. Read old parity P
3. Write new data D0'
4. Write new parity P' = (D0 XOR D0') XOR P

The resulting stripe is D0' D1 D2 D3 P'.
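The incremental parity update can be sketched directly; the helper names and byte values are mine. The sanity check confirms that updating parity with the XOR difference matches recomputing it from the whole stripe:

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def small_write_parity(old_data, old_parity, new_data):
    """New parity for a RAID-5 small write: P' = (D0 XOR D0') XOR P.
    The two reads (old data, old parity) happen before calling this."""
    return xor(xor(old_data, new_data), old_parity)

# Sanity check: incremental update == full recomputation over the stripe.
d = [b'\x11', b'\x22', b'\x33', b'\x44']
p = xor(xor(xor(d[0], d[1]), d[2]), d[3])
new_d0 = b'\x99'
p_new = small_write_parity(d[0], p, new_d0)
p_full = xor(xor(xor(new_d0, d[1]), d[2]), d[3])
assert p_new == p_full
```

The incremental form is what makes the small write cost 2 reads + 2 writes instead of reading every data disk in the stripe.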
System Availability: Orthogonal RAIDs
[Figure: one array controller fans out to several string controllers, each managing a string of disks; recovery groups are laid out orthogonally to the strings.]

• Data Recovery Group: unit of data redundancy
• Redundant Support Components: fans, power supplies, controller, cables
• End-to-End Data Integrity: internal parity-protected data paths
System-Level Availability
Goal: no single points of failure.

[Figure: a fully dual-redundant system: two hosts, duplicated I/O controllers, duplicated array controllers, and dual paths down to the recovery groups of disks.]

With duplicated paths, higher performance can be obtained when there are no failures.
Summary: Redundant Arrays of Disks (RAID)

Techniques:
• Disk Mirroring/Shadowing (RAID 1)
  – Each disk is fully duplicated onto its “shadow”
  – Logical write = two physical writes
  – 100% capacity overhead
• Parity Data Bandwidth Array (RAID 3)
  – Parity computed horizontally across the stripe
  – Logically a single high-data-bandwidth disk
• High I/O Rate Parity Array (RAID 5)
  – Interleaved parity blocks
  – Independent reads and writes
  – Logical write = 2 reads + 2 writes
• Parity + Reed-Solomon codes