
1

Digital Systems Architecture
EECE 343-01 / EECE 292-02

Dependability and RAID

Dr. William H. Robinson
April 23, 2004

http://eecs.vanderbilt.edu/courses/eece343/

2

Topics

"KILLS – BUGS – DEAD!"
– TV commercial for RAID bug spray

• Administrative stuff
  – Turn in Reading Assignment #8
  – Homework Assignment #4 solutions posted
• More on magnetic storage devices
• Dependability defined
• RAID
• Queuing theory

3

Pentium 4 Implementation

• North Bridge
  – Connects CPU to memory & video
• South Bridge
  – Connects North Bridge to everything else

4

AMD's Integrated Memory Controller

• http://www.pcstats.com/articleview.cfm?articleid=1469&page=5
• Reduces the time it takes the processor to access memory
  – Data need only travel between the processor and the physical memory
• Memory traffic need no longer run between the processor and the Northbridge
  – The Northbridge traditionally houses the memory controller; removing this bottleneck improves overall performance


5

Motivation: Who Cares About I/O?

• CPU performance: grows 60% per year
• I/O system performance limited by mechanical delays (disk I/O)
  – < 10% per year (I/Os per second or MB per second)
• Amdahl's Law: system speed-up limited by the slowest part!
  – 10% I/O & 10x faster CPU ⇒ 5x performance (lose 50%)
  – 10% I/O & 100x faster CPU ⇒ 10x performance (lose 90%)
• I/O bottleneck:
  – Diminishing fraction of time in CPU
  – Diminishing value of faster CPUs

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.
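The speed-up numbers above follow directly from Amdahl's Law; a minimal Python sketch (not from the original slides, added for illustration) reproduces them:

def speedup(io_fraction, cpu_speedup):
    # Amdahl's Law: only the CPU portion of the workload is accelerated
    return 1.0 / (io_fraction + (1.0 - io_fraction) / cpu_speedup)

print(speedup(0.10, 10))    # ~5.3x: 10% I/O limits a 10x CPU to ~5x overall
print(speedup(0.10, 100))   # ~9.2x: 10% I/O limits a 100x CPU to ~10x overall

The slide's 5x and 10x are the rounded values of these results.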

6

Big Picture: Who Cares About CPUs?

• Why is it still important to keep CPUs busy rather than I/O devices ("CPU time"), now that CPUs are not costly?
  – Moore's Law leads both to large, fast CPUs and to very small, cheap CPUs
  – 2001 hypothesis: a 600 MHz PC is fast enough for office tools?
  – PC sales slowdown since PCs are fast enough, unless you run games or new apps?
• People care more about storing and communicating information than about calculating
  – "Information Technology" vs. "Computer Science"
  – 1960s and 1980s: Computing Revolution
  – 1990s and 2000s: Information Age

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

7

I/O Systems

[Figure: a typical I/O system. A processor with its cache connects over a memory–I/O bus to main memory and to several I/O controllers; the controllers attach disks, graphics, and a network, and signal the processor with interrupts.]

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.

8

Storage Technology Drivers

• Driven by the prevailing computing paradigm
  – 1950s: migration from batch to on-line processing
  – 1990s: migration to ubiquitous computing
    • computers in phones, books, cars, video cameras, …
    • nationwide fiber-optic network with wireless tails
• Effects on storage industry
  – Embedded storage
    • smaller, cheaper, more reliable, lower power
  – Data utilities
    • high-capacity, hierarchically managed storage

Adapted from John Kubiatowicz’s CS 252 lecture notes. Copyright © 2003 UCB.


9

Disk Device Performance

• Disk Latency = Seek Time + Rotation Time + Transfer Time + Controller Overhead
• Seek Time? Depends on the number of tracks the arm has to move and the seek speed of the disk
• Rotation Time? Depends on how fast the disk rotates and how far the sector is from the head
• Transfer Time? Depends on the data rate (bandwidth) of the disk (bit density) and the size of the request

[Figure: disk anatomy — platters on a spindle, inner and outer tracks divided into sectors, a read/write head on an arm moved by an actuator, all managed by the controller.]

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

10

Disk Device Performance

• Average distance of a sector from the head?
  – 1/2 the time of a rotation
  – 10,000 revolutions per minute ⇒ 166.67 rev/sec
  – 1 revolution = 1/166.67 sec ⇒ 6.00 milliseconds
  – 1/2 rotation ⇒ 3.00 ms
• Average number of tracks the arm moves?
  – Sum all possible seek distances from all possible tracks / # possible
    • Assumes seek targets are random
  – Disk industry standard benchmark

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.
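A quick Python sketch of both averages (illustrative, not from the slides): the mean rotational delay at 10,000 RPM, and the brute-force average seek distance over all track pairs, which approaches one third of the full stroke:

RPM = 10000
rev_time_ms = 60000.0 / RPM          # 6.00 ms per revolution
print(rev_time_ms / 2)               # 3.00 ms average rotational latency

N = 1000                             # number of tracks (assumed for illustration)
# average |from - to| over all (from, to) track pairs; tends to N/3 for large N
avg_seek = sum(abs(a - b) for a in range(N) for b in range(N)) / N**2
print(avg_seek)                      # ~333 tracks, i.e. about N/3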

11

Data Rate: Inner vs. Outer Tracks

• To keep things simple, disks originally kept the same number of sectors per track
  – Since the outer track is longer, it had lower bits per inch (BPI)
• Competition ⇒ disks moved to keeping BPI roughly the same for all tracks ("constant bit density")
  ⇒ More capacity per disk
  ⇒ More sectors per track toward the edge
  ⇒ Since the disk spins at constant speed, outer tracks have a faster data rate
• Bandwidth of the outer track is 1.7x the inner track!
  – Inner track has the highest density and the outer track the lowest, so density is not really constant
  – Outer track is 2.1x the length of the inner track, but holds 1.7x the bits

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

12

Devices: Magnetic Disks

[Figure: platters, heads, sectors, tracks, and cylinders.]

• Purpose:
  – Long-term, nonvolatile storage
  – Large, inexpensive, slow level in the storage hierarchy
• Characteristics:
  – Seek time (~8 ms avg): positional latency plus rotational latency
  – Transfer rate: 10–40 MByte/sec, in blocks
  – Capacity: gigabytes; quadruples every 2 years (aerodynamics)
• Example: 7200 RPM = 120 RPS ⇒ ~8 ms per revolution ⇒ avg rotational latency ≈ 4 ms
  – 128 sectors per track ⇒ ~0.0625 ms per sector; 1 KB per sector ⇒ ~16 MB/s
• Response time = Queue + Controller + Seek + Rotation + Transfer
  – Everything after Queue is the service time

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.
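A small Python sketch checking the 7200 RPM example above (the 8 ms seek is an assumed advertised average, and controller overhead is ignored):

RPM = 7200
rev_ms = 60000.0 / RPM                   # ~8.33 ms per revolution
rot_ms = rev_ms / 2                      # ~4.2 ms average rotational latency
sectors_per_track = 128
sector_ms = rev_ms / sectors_per_track   # ~0.065 ms to pass over one sector
mb_per_s = 1 / sector_ms                 # 1 KB per sector: KB/ms == MB/s, ~15.4
seek_ms = 8.0                            # assumed average seek time
service_ms = seek_ms + rot_ms + sector_ms
print(rev_ms, rot_ms, sector_ms, mb_per_s, service_ms)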


13

Disk Performance Model / Trends

• Capacity: +100%/year (2x / 1.0 yrs)
• Transfer rate (BW): +40%/year (2x / 2.0 yrs)
• Rotation + seek time: –8%/year (1/2 in 10 yrs)
• MB/$: >100%/year (2x / <1.5 yrs)
  – Fewer chips + areal density

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.

14

Areal Density

• Bits recorded along a track
  – Metric is bits per inch (BPI)
• Number of tracks per surface
  – Metric is tracks per inch (TPI)
• Disk designs brag about bit density per unit area
  – Metric is bits per square inch, called areal density
  – Areal Density = BPI x TPI

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

15

Areal Density

Year    Areal Density (Mbits/in²)
1973    1.7
1979    7.7
1989    63
1997    3090
2000    17100

[Figure: areal density vs. year, 1970–2000, on a log scale.]

• Areal Density = BPI x TPI
• Growth slope changed from ~30%/yr to ~60%/yr around 1991

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.
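The change in slope is easy to verify from the table; a short Python sketch (added here for illustration) computes the compound annual growth rate between successive entries:

density = {1973: 1.7, 1979: 7.7, 1989: 63, 1997: 3090, 2000: 17100}
years = sorted(density)
for y0, y1 in zip(years, years[1:]):
    # compound annual growth rate between successive table entries
    cagr = (density[y1] / density[y0]) ** (1.0 / (y1 - y0)) - 1
    print(f"{y0}-{y1}: {cagr:.0%}/yr")

This prints roughly 29%/yr and 23%/yr for the intervals before 1989, then ~63%/yr and ~77%/yr after, matching the stated change of slope.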

16

Definitions

Examples of why precise definitions are so important for reliability:

• Is a programming mistake a fault, error, or failure?
  – Are we talking about the time it was designed or the time the program is run?
  – If the running program doesn't exercise the mistake, is it still a fault/error/failure?
• If an alpha particle hits a DRAM memory cell, is it a fault/error/failure if it doesn't change the value?
  – Is it a fault/error/failure if the memory doesn't access the changed bit?
  – Did a fault/error/failure still occur if the memory had error correction and delivered the corrected value to the CPU?

Adapted from David Patterson's CS 252 lecture notes. Copyright © 2001 UCB.


17

IFIP Standard Terminology

• Computer system dependability: the quality of delivered service such that reliance can be placed on that service
• Service is the observed actual behavior as perceived by other system(s) interacting with this system's users
• Each module has an ideal specified behavior, where the service specification is an agreed description of the expected behavior
• A system failure occurs when the actual behavior deviates from the specified behavior
• A failure occurs because of an error, a defect in a module
• The cause of an error is a fault
• When a fault occurs it creates a latent error, which becomes effective when it is activated
• When the error actually affects the delivered service, a failure occurs (the time from error to failure is the error latency)

Adapted from David Patterson's CS 252 lecture notes. Copyright © 2001 UCB.

18

Fault vs. (Latent) Error vs. Failure

• A fault creates one or more latent errors
• Properties of errors:
  – A latent error becomes effective once activated
  – An error may cycle between its latent and effective states
  – An effective error often propagates from one component to another, thereby creating new errors
• An effective error is either a formerly latent error in that component, or one that propagated from another error
• A component failure occurs when the error affects the delivered service
• These properties are recursive and apply to any component in the system
• An error is the manifestation in the system of a fault; a failure is the manifestation on the service of an error

Adapted from David Patterson's CS 252 lecture notes. Copyright © 2001 UCB.

19

Fault vs. (Latent) Error vs. Failure

• An error is the manifestation in the system of a fault; a failure is the manifestation on the service of an error
• Is a programming mistake a fault, error, or failure?
  – Are we talking about the time it was designed or the time the program is run?
  – If the running program doesn't exercise the mistake, is it still a fault/error/failure?
• A programming mistake is a fault
• The consequence is an error (or latent error) in the software
• Upon activation, the error becomes effective
• When this effective error produces erroneous data which affect the delivered service, a failure occurs

Adapted from David Patterson's CS 252 lecture notes. Copyright © 2001 UCB.

20

Fault vs. (Latent) Error vs. Failure

• An error is the manifestation in the system of a fault; a failure is the manifestation on the service of an error
• If an alpha particle hits a DRAM memory cell, is it a fault/error/failure if it doesn't change the value?
  – Is it a fault/error/failure if the memory doesn't access the changed bit?
  – Did a fault/error/failure still occur if the memory had error correction and delivered the corrected value to the CPU?
• An alpha particle hitting a DRAM can be a fault
• If it changes the memory, it creates an error
• The error remains latent until the affected memory word is read
• If the affected word's error affects the delivered service, a failure occurs

Adapted from David Patterson's CS 252 lecture notes. Copyright © 2001 UCB.


21

Fault Tolerance vs. Disaster Tolerance

• Fault tolerance (or more properly, error tolerance): mask local faults (prevent errors from becoming failures)
  – RAID disks
  – Uninterruptible power supplies
• Disaster tolerance: masks site errors (prevent site errors from causing service failures)
  – Protects against fire, flood, sabotage, …
  – Redundant system and service at a remote site
  – Use design diversity

From Jim Gray's talk at UC Berkeley on fault tolerance, 11/9/00
Adapted from David Patterson's CS 252 lecture notes. Copyright © 2001 UCB.

22

Use Arrays of Small Disks?

Katz and Patterson asked in 1987: can smaller disks be used to close the gap in performance between disks and CPUs?

[Figure: the conventional approach uses 4 disk designs (14", 10", 5.25", 3.5") spanning low end to high end; a disk array uses a single 3.5" disk design.]

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

23

Advantages of Small-Formfactor Disk Drives

• Low cost/MB
• High MB/volume
• High MB/watt
• Low cost/actuator

Cost and environmental efficiencies

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

24

Replace a Small Number of Large Disks with a Large Number of Small Disks! (1988 Disks)

              IBM 3390K     IBM 3.5" 0061   x70 (array of 70)
Capacity      20 GBytes     320 MBytes      23 GBytes
Volume        97 cu. ft.    0.1 cu. ft.     11 cu. ft.   (9x less)
Power         3 KW          11 W            1 KW         (3x less)
Data Rate     15 MB/s       1.5 MB/s        120 MB/s     (8x more)
I/O Rate      600 I/Os/s    55 I/Os/s       3900 I/Os/s  (6x more)
MTTF          250 KHrs      50 KHrs         ??? Hrs
Cost          $250K         $2K             $150K

Disk arrays have the potential for large data and I/O rates, high MB per cu. ft., and high MB per KW, but what about reliability?

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.


25

Array Reliability

• Reliability of N disks = reliability of 1 disk ÷ N
  – 50,000 hours ÷ 70 disks = ~700 hours
  – Disk system MTTF drops from 6 years to 1 month!
• Arrays (without redundancy) are too unreliable to be useful!
• Hot spares support reconstruction in parallel with access: very high media availability can be achieved

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

26

Redundant Arrays of (Inexpensive) Disks

• Files are "striped" across multiple disks
• Redundancy yields high data availability
  – Availability: service still provided to the user, even if some components fail
• Disks will still fail
• Contents are reconstructed from data redundantly stored in the array
  ⇒ Capacity penalty to store redundant info
  ⇒ Bandwidth penalty to update redundant info

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

27

RAID Levels 0 Through 5

[Figure: the RAID level 0 through 5 organizations; backup and parity drives are shaded.]

28

Redundant Arrays of Inexpensive Disks
RAID 1: Disk Mirroring/Shadowing

[Figure: each data disk paired with its mirror, forming a recovery group.]

• Each disk is fully duplicated onto its "mirror"
  – Very high availability can be achieved
• Bandwidth sacrifice on write:
  – Logical write = two physical writes
• Reads may be optimized
• Most expensive solution: 100% capacity overhead

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.


29

Redundant Arrays of Inexpensive Disks
RAID 2: Hamming Code

• Split each byte into a pair of 4-bit nibbles
  – Add a Hamming code to each
  – Parity bits are at positions 1, 2, 4
• Provides single-bit error correction
• Drawbacks
  – Requires all drives to be rotationally synchronized
  – Requires a large number of drives to reduce overhead
  – Controller complexity to calculate the checksum
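The slide doesn't give a concrete code layout, but a standard Hamming(7,4) code with parity at positions 1, 2, and 4 looks like this Python sketch (function name and byte value are illustrative):

def hamming74_encode(nibble):
    # Encode a 4-bit nibble into a 7-bit Hamming codeword (even parity).
    # Data bits occupy positions 3, 5, 6, 7; parity bits positions 1, 2, 4.
    d = [(nibble >> i) & 1 for i in (3, 2, 1, 0)]
    code = {3: d[0], 5: d[1], 6: d[2], 7: d[3]}
    code[1] = code[3] ^ code[5] ^ code[7]   # checks positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # checks positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # checks positions 4, 5, 6, 7
    return [code[p] for p in range(1, 8)]

byte = 0xA7
for nib in (byte >> 4, byte & 0xF):         # split the byte into two nibbles
    print(bin(nib), hamming74_encode(nib))

A single flipped bit can then be located by recomputing the three checks: the positions of the failing checks sum to the index of the corrupted bit, which is why a RAID 2 controller can correct single-bit errors.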

30

Redundant Arrays of Inexpensive Disks
RAID 3: Parity Disk

[Figure: a logical record striped as physical records across four data disks, plus a fifth parity disk P.]

• P contains the sum of the other disks per stripe, mod 2 ("parity")
• If a disk fails, "subtract" P from the sum of the other disks to find the missing information

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.
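Because mod-2 addition is XOR, reconstruction is just XOR-ing the surviving disks with P. A minimal Python sketch (a hypothetical 4-disk stripe, using the bit patterns from the figure):

stripe = [0b10010011, 0b11001101, 0b10010011, 0b11001101]  # four data disks
P = 0
for block in stripe:
    P ^= block                      # parity = XOR (mod-2 sum) of the stripe

lost = 2                            # pretend disk 2 fails
recovered = P
for i, block in enumerate(stripe):
    if i != lost:
        recovered ^= block          # XOR of P and survivors = missing block
assert recovered == stripe[lost]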

31

RAID 3

• The sum is computed across the recovery group to protect against hard disk failures, and stored in the P disk
• Logically, a single high-capacity, high-transfer-rate disk: good for large transfers
• Wider arrays reduce capacity costs, but decrease availability
• 33% capacity cost for parity in this configuration

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

32

Inspiration for RAID 4

• RAID 3 relies on the parity disk to discover errors on a read
• But every sector has an error detection field
• Rely on the error detection field to catch errors on read, not on the parity disk
• This allows independent reads from different disks simultaneously

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.


33

Redundant Arrays of Inexpensive Disks
RAID 4: High I/O Rate Parity

Insides of 5 disks (disk columns; logical disk addresses increase down each column; each row of data blocks plus P is a stripe):

D0   D1   D2   D3   P
D4   D5   D6   D7   P
D8   D9   D10  D11  P
D12  D13  D14  D15  P
D16  D17  D18  D19  P
D20  D21  D22  D23  P
...

Example: small read of D0 & D5; large write of D12–D15

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

34

Inspiration for RAID 5

• RAID 4 works well for small reads
• Small writes (write to one disk):
  – Option 1: read the other data disks, create the new sum, and write it to the parity disk
  – Option 2: since P has the old sum, compare old data to new data and add the difference to P
• Small writes are limited by the parity disk: writes to D0 and D5 both also write to the P disk

D0   D1   D2   D3   P
D4   D5   D6   D7   P

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

35

Redundant Arrays of Inexpensive Disks
RAID 5: High I/O Rate Interleaved Parity

Independent writes are possible because of interleaved parity (disk columns; logical disk addresses increase down each column):

D0   D1   D2   D3   P
D4   D5   D6   P    D7
D8   D9   P    D10  D11
D12  P    D13  D14  D15
P    D16  D17  D18  D19
D20  D21  D22  D23  P
...

Example: writes to D0 and D5 use disks 0, 1, 3, and 4

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

36

Problems of Disk Arrays: Small Writes

RAID-5 small write algorithm: 1 logical write = 2 physical reads + 2 physical writes

To write new data D0' over old data D0 in a stripe (D0 D1 D2 D3 P):
1. Read old data D0
2. Read old parity P
3. Write new data D0'
4. Write new parity P' = D0 XOR D0' XOR P

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.
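A minimal Python sketch of the four-step update (illustrative values; XOR stands in for the mod-2 "subtract old, add new"):

D0, D1, D2, D3 = 0b1001, 0b0110, 0b1111, 0b0001   # hypothetical stripe
P = D0 ^ D1 ^ D2 ^ D3                             # old parity

D0_new = 0b0101
P_new = D0 ^ D0_new ^ P        # 2 reads (D0, P) + 2 writes (D0', P')
assert P_new == D0_new ^ D1 ^ D2 ^ D3   # same as recomputing from scratch

The point of the identity is that the other data disks (D1–D3) never have to be read, which is what keeps small writes to two reads and two writes.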


37

Berkeley History: RAID-I

• RAID-I (1989)
  – Consisted of a Sun 4/280 workstation with 128 MB of DRAM, four dual-string SCSI controllers, 28 5.25-inch SCSI disks, and specialized disk-striping software
• Today RAID is a $19 billion industry; 80% of non-PC disks are sold in RAIDs

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

38

RAID Techniques Summary: goal was performance; popularity is due to reliability of storage

• Disk mirroring/shadowing (RAID 1)
  – Each disk is fully duplicated onto its "shadow"
  – Logical write = two physical writes
  – 100% capacity overhead
• Parity data bandwidth array (RAID 3)
  – Parity computed horizontally
  – Logically a single high-data-bandwidth disk
• High I/O rate parity array (RAID 5)
  – Interleaved parity blocks
  – Independent reads and writes
  – Logical write = 2 reads + 2 writes

Adapted from David Culler's CS 252 lecture notes. Copyright © 2002 UCB.

39

Introduction to Queueing Theory

[Figure: a "black box" queueing system, with arrivals entering and departures leaving.]

• Queueing theory applies to long-term, steady-state behavior ⇒ arrival rate = departure rate
• Little's Law: mean number of tasks in system = arrival rate x mean response time
  – Observed by many; Little was the first to prove it
  – Simple interpretation: you should see the same number of tasks in the queue when entering as when leaving
• Applies to any system in equilibrium, as long as nothing in the black box is creating or destroying tasks

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.

40

A Little Queuing Theory: Use of Random Distributions

[Figure: queue and server between processor, I/O controller, and device.]

• Server spends a variable amount of time with customers
  – Weighted mean: m1 = (f1 x T1 + f2 x T2 + ... + fn x Tn)/F = Σ p(T) x T
  – Variance: σ² = (f1 x T1² + f2 x T2² + ... + fn x Tn²)/F − m1² = Σ p(T) x T² − m1²
  – Squared coefficient of variance: C = σ²/m1²
    • Unitless measure (avoids confusions like 100 ms² vs. 0.1 s²)
• Exponential distribution, C = 1: most tasks short relative to the average, a few long ones; 90% < 2.3 x average, 63% < average
• Hypoexponential distribution, C < 1: most tasks close to the average; C = 0.5 ⇒ 90% < 2.0 x average, only 57% < average
• Hyperexponential distribution, C > 1: tasks further from the average; C = 2.0 ⇒ 90% < 2.8 x average, 69% < average

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.
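A short Python sketch computing the three statistics above from a service-time histogram (the histogram values are hypothetical, added for illustration):

# f[i] customers observed taking T[i] seconds each
f = [10, 20, 50, 15, 5]
T = [0.01, 0.02, 0.05, 0.10, 0.50]
F = sum(f)
m1 = sum(fi * Ti for fi, Ti in zip(f, T)) / F            # weighted mean
var = sum(fi * Ti**2 for fi, Ti in zip(f, T)) / F - m1**2
C = var / m1**2                                          # squared coeff. of variance
print(m1, var, C)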


41

A Little Queuing Theory: Variable Service Time

• Disk response times have C ≈ 1.5 (the majority of seeks are shorter than average)
• Yet we usually pick C = 1.0 for simplicity
  – Memoryless, exponential distribution
  – Many complex systems are well described by a memoryless distribution!
• Another useful value is the average time one must wait for the server to complete its current task: m1(z)
  – Called the "average residual wait time"
  – Not just 1/2 x m1, because that doesn't capture the variance
  – Can derive: m1(z) = 1/2 x m1 x (1 + C)
  – No variance ⇒ C = 0 ⇒ m1(z) = 1/2 x m1
  – Exponential ⇒ C = 1 ⇒ m1(z) = m1

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.

42

A Little Queuing Theory: Average Wait Time

• Calculating the average wait time in queue, Tq:
  – All customers in line must complete; average time: m1 ≡ Tser = 1/µ
  – If something is at the server, it takes m1(z) on average to complete
  – Chance the server is busy: u = λ/µ; average delay is u x m1(z)

  Tq = u x m1(z) + Lq x Tser
  Tq = u x m1(z) + λ x Tq x Tser     (Little's Law: Lq = λ x Tq)
  Tq = u x m1(z) + u x Tq            (definition of utilization: u = λ x Tser)
  Tq x (1 − u) = m1(z) x u
  Tq = m1(z) x u/(1 − u) = Tser x {1/2 x (1 + C)} x u/(1 − u)

Notation:
  λ       average number of arriving customers/second
  Tser    average time to service a customer
  u       server utilization (0..1): u = λ x Tser
  Tq      average time/customer in queue
  Lq      average length of queue: Lq = λ x Tq
  m1(z)   average residual wait time = Tser x {1/2 x (1 + C)}

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.

43

A Little Queuing Theory: M/G/1 and M/M/1

• Assumptions so far:
  – System in equilibrium
  – Times between two successive arrivals in line are random
  – Server can start on the next customer immediately after the prior one finishes
  – No limit to the queue: works first-in, first-out
  – All customers in line must complete; each averages Tser
• Describes "memoryless" or Markovian request arrival (M for C = 1, exponentially random), General service distribution (no restrictions), 1 server: the M/G/1 queue

  M/G/1: Tq = Tser x u x (1 + C) / (2 x (1 − u))

• When service times have C = 1, this reduces to the M/M/1 queue:

  M/M/1: Tq = Tser x u / (1 − u)

Notation:
  Tser    average time to service a customer
  u       server utilization (0..1): u = λ x Tser
  Tq      average time/customer in queue

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.
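These formulas are easy to wrap in a helper; a minimal Python sketch (the function name and return layout are my own, not from the slides) that the next two examples can be checked against:

def mg1(lam, t_ser, C=1.0):
    # M/G/1 queue: lam = arrival rate, t_ser = mean service time,
    # C = squared coefficient of variance (C=1 gives M/M/1, C=0 deterministic).
    u = lam * t_ser                               # server utilization, must be < 1
    t_q = t_ser * u * (1 + C) / (2 * (1 - u))     # mean time waiting in queue
    t_sys = t_q + t_ser                           # response time = queue + service
    return u, t_q, t_sys, lam * t_q, lam * t_sys  # Little's Law gives Lq, Lsys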

44

A Little Queuing Theory: An Example

• A processor sends 10 x 8 KB disk I/Os per second; requests & service are exponentially distributed; average disk service time = 20 ms
• On average, how utilized is the disk?
  – What is the number of requests in the queue?
  – What is the average time spent in the queue?
  – What is the average response time for a disk request?
• Solution:
  λ     average number of arriving customers/second = 10
  Tser  average time to service a customer = 20 ms (0.02 s)
  u     server utilization: u = λ x Tser = 10/s x 0.02 s = 0.2
  Tq    average time/customer in queue = Tser x u/(1 − u)
        = 20 x 0.2/(1 − 0.2) = 20 x 0.25 = 5 ms (0.005 s)
  Tsys  average time/customer in system: Tsys = Tq + Tser = 25 ms
  Lq    average length of queue: Lq = λ x Tq = 10/s x 0.005 s = 0.05 requests
  Lsys  average # tasks in system: Lsys = λ x Tsys = 10/s x 0.025 s = 0.25

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.
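Running the mg1 helper sketched after slide 43 on this example reproduces the slide's numbers:

u, t_q, t_sys, l_q, l_sys = mg1(lam=10, t_ser=0.020, C=1)
print(u, t_q, t_sys, l_q, l_sys)   # 0.2, 0.005 s, 0.025 s, 0.05, 0.25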


45

A Little Queuing Theory: Yet Another Example

• Processor accesses memory over a "network"
• DRAM service properties
  – Speed = 9 cycles + 2 cycles/word
  – With 8-word cache lines: Tser = 25 cycles, µ = 1/25 = 0.04 ops/cycle
  – Deterministic service time! (C = 0)
• Processor behavior
  – CPI = 1, 40% memory instructions, 7.5% cache misses
  – Rate: λ = 1 inst/cycle x 0.4 x 0.075 = 0.03 ops/cycle
• Solution:
  λ     average number of arriving customers/cycle = 0.03
  Tser  average time to service a customer = 25 cycles
  u     server utilization: u = λ x Tser = 0.03 x 25 = 0.75
  Tq    average time/customer in queue = Tser x u x (1 + C)/(2 x (1 − u))
        = (25 x 0.75 x 1/2)/(1 − 0.75) = 37.5 cycles
  Tsys  average time/customer in system: Tsys = Tq + Tser = 62.5 cycles
  Lq    average length of queue: Lq = λ x Tq = 0.03 x 37.5 = 1.125 requests
  Lsys  average # tasks in system: Lsys = λ x Tsys = 0.03 x 62.5 = 1.875

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.
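The same helper with deterministic service (C = 0) checks this example too; times here are in cycles rather than seconds:

u, t_q, t_sys, l_q, l_sys = mg1(lam=0.03, t_ser=25, C=0)
print(u, t_q, t_sys, l_q, l_sys)   # 0.75, 37.5, 62.5, 1.125, 1.875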

46

Summary

• Disk industry growing rapidly; improving:
  – Bandwidth: 40%/yr
  – Areal density: 60%/yr; $/MB even faster?
• Disk latency = queue + controller + seek + rotate + transfer
• The advertised average seek time benchmark is much greater than the average seek time in practice

Adapted from John Kubiatowicz's CS 252 lecture notes. Copyright © 2003 UCB.

47

Things to Do

• Homework Assignment #4 posted with solutions
• Project #3 due on Monday