Anshul Kumar, CSE IITD
CSL718: Main Memory
CPU-Cache-Main Memory Performance
9th Mar, 2006
Slide 2: A Simple Model
$t_{av} = t_c + p_m \cdot t_{c.miss}$
where
$t_{av}$ = average memory access time as seen by CPU
$t_c$ = cache access time
$p_m$ = miss probability (consider only read misses, if write penalties are hidden by buffers)
$t_{c.miss}$ = cache miss penalty
[Figure: CPU, Cache, and Memory blocks]
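The simple model can be evaluated directly. The sketch below is only an illustration: the function name and the numbers (1-cycle cache, 5% miss probability, 20-cycle miss penalty) are assumptions, not values from the lecture.

```python
def average_access_time(t_c, p_m, t_c_miss):
    """Simple model: t_av = t_c + p_m * t_c_miss (all times in the same unit)."""
    if not 0.0 <= p_m <= 1.0:
        raise ValueError("p_m must be a probability between 0 and 1")
    return t_c + p_m * t_c_miss


# Hypothetical example values, chosen only for illustration.
t_av = average_access_time(t_c=1, p_m=0.05, t_c_miss=20)
print(f"t_av = {t_av:.2f} cycles")
```

With these assumed numbers the miss term contributes as much as the cache access itself, which is why the following slides look closely at what determines the miss penalty.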
Slide 3: Cache miss penalty
Depends on
• Various cache policies
– Read policy
– Load policy
– Write policy
– Write buffers etc.
• Main memory organization
– Interleaving
– Page mode
Slide 4: Read Policies
Sequential Simple: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 2)$
Concurrent Simple: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 1)$
Sequential Forward: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 1)$
Concurrent Forward: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot T$
[Figure: cache and memory timing diagrams for the four policies, built from segments of 1 cycle and T cycles]
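The four read-policy formulas differ only in the number of extra cycles added to T on a miss. A minimal sketch that tabulates them follows; the dictionary name, the function name and the sample values of `p_m` and `T` are assumptions made for illustration.

```python
# Extra cycles added to the memory time T on a miss, per the four formulas.
EXTRA_MISS_CYCLES = {
    "sequential simple": 2,
    "concurrent simple": 1,
    "sequential forward": 1,
    "concurrent forward": 0,
}


def effective_access_time(policy, p_m, T):
    """T_eff = (1 - p_m) * 1 + p_m * (T + extra cycles for the policy)."""
    return (1 - p_m) * 1 + p_m * (T + EXTRA_MISS_CYCLES[policy])


# Hypothetical parameters: 10% miss probability, T = 10 cycles.
for name in EXTRA_MISS_CYCLES:
    print(f"{name:20s} T_eff = {effective_access_time(name, 0.1, 10):.2f}")
```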
Slide 5: Load policies
Example: 4 AU block (AU 0, 1, 2, 3), cache miss on AU 1
• Block Load
• Load Forward
• Fetch Bypass (wrap-around load)
[Figure: block of AUs 0 1 2 3 illustrating each load policy]
Slide 6: Analyzing Write Policies: CPU time
Policy                  Read hit   Read miss   Write hit   Write miss
Hit: WB, Miss: WB       1          Tb + i      1           1
Hit: WB, Miss: WTWA     1          Tb + i      1           1
Hit: WB, Miss: WTNWA    1          Tb + i      1           1
Hit: WT, Miss: WB       1          Tb + i      1           1
Hit: WT, Miss: WTWA     1          Tb + i      1           1
Hit: WT, Miss: WTNWA    1          Tb + i      1           1
i depends on read policy
Slide 7: Analyzing Write Policies: Bus time
Policy                  Read hit   Read miss    Write hit   Write miss
Hit: WB, Miss: WB       0          Tb(2-Pc)     0           Tb(2-Pc)
Hit: WB, Miss: WTWA     0          Tb(2-Pc)     0           Tb(2-Pc)+Tw
Hit: WB, Miss: WTNWA    0          Tb(2-Pc)     0           Tw
Hit: WT, Miss: WB       0          Tb(2-Pc)     Tw          Tb(2-Pc)
Hit: WT, Miss: WTWA     0          Tb           Tw          Tb+Tw
Hit: WT, Miss: WTNWA    0          Tb           Tw          Tw
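One way to use the bus-time table is to weight its four columns by how often each case occurs. The sketch below encodes the six rows as given and computes an expected bus time per memory reference. The weighting is an assumption introduced here, not part of the slides: reads and writes are taken to share one miss probability `p_m`, and `f_write` is a hypothetical fraction of references that are writes. `T_b`, `T_w` and `P_c` are simply passed in as numbers.

```python
def bus_time_table(T_b, T_w, P_c):
    """Bus-time rows as (read hit, read miss, write hit, write miss)."""
    line = T_b * (2 - P_c)  # the Tb(2-Pc) entry of the table
    return {
        ("WB", "WB"): (0, line, 0, line),
        ("WB", "WTWA"): (0, line, 0, line + T_w),
        ("WB", "WTNWA"): (0, line, 0, T_w),
        ("WT", "WB"): (0, line, T_w, line),
        ("WT", "WTWA"): (0, T_b, T_w, T_b + T_w),
        ("WT", "WTNWA"): (0, T_b, T_w, T_w),
    }


def expected_bus_time(row, f_write, p_m):
    """Weight one table row by assumed write fraction and miss probability."""
    read_hit, read_miss, write_hit, write_miss = row
    read = (1 - p_m) * read_hit + p_m * read_miss
    write = (1 - p_m) * write_hit + p_m * write_miss
    return (1 - f_write) * read + f_write * write


# Hypothetical parameters, for illustration only.
table = bus_time_table(T_b=8, T_w=2, P_c=0.5)
for (hit, miss), row in table.items():
    print(f"Hit:{hit:3s} Miss:{miss:6s} {expected_bus_time(row, 0.25, 0.1):.3f}")
```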
Slide 8: Interleaving with Fast Page Mode
$T_{line\,access} = T_a + T_c \left( \lceil L/m \rceil - 1 \right) + T_{bus} \left( L - \lceil L/m \rceil \right)$
Slide 9: A Refined Model
$t_{av} = t_c + p_m \cdot (t_{c.miss} + t_{interference} + t_{w\text{-}interference} + t_{IO\text{-}interference})$
where
$t_{interference}$ = interference among line transfers
$t_{w\text{-}interference}$ = interference between word writes and line transfers
$t_{IO\text{-}interference}$ = interference between I/O and line transfers
Slide 10: Interference among line transfers
What happens when another miss occurs in the $t_{busy} = t_{m.miss} - t_{c.miss}$ interval?
$t_{interference}$ = additional delay due to this
= expected number of misses during $t_{busy}$ × delay per miss
= $(\lambda \cdot t_{busy} \cdot p_m) \cdot (t_{busy} / 2)$
where $\lambda$ = memory request rate of processor
[Figure: timeline of $t_c$, $t_{c.miss}$ and $t_{m.miss}$, marking the CPU blocked, CPU executing, and memory busy intervals]
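The interference estimate is a product of two simple terms, which the following sketch spells out. The function and parameter names are hypothetical, `lam` stands for the processor's memory request rate, and the numeric values are assumed for illustration.

```python
def line_transfer_interference(lam, p_m, t_m_miss, t_c_miss):
    """t_interference = (lam * t_busy * p_m) * (t_busy / 2)."""
    t_busy = t_m_miss - t_c_miss             # memory still busy after CPU resumes
    expected_misses = lam * t_busy * p_m     # misses expected during t_busy
    delay_per_miss = t_busy / 2              # average remaining busy time
    return expected_misses * delay_per_miss


# Hypothetical values: 0.5 requests/cycle, 4% misses, t_busy = 30 - 20 = 10.
print(line_transfer_interference(lam=0.5, p_m=0.04, t_m_miss=30, t_c_miss=20))
```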
Slide 11: Interference: I/Os and writes
delay = probability that memory is busy when request arrives × average waiting period
What happens when memory is found to be busy serving one request and some other requests are waiting?
[Figure: timeline of request arrivals against memory-busy periods, showing requests served and requests waiting]
Slide 12: I/O Interference
$t_{IO\text{-}interference}$ = delay due to I/O contention
= probability that memory is occupied with I/O × average time taken to complete ongoing I/O
= (probability that memory is occupied with I/O) $\cdot \, (t_{service} + t_{IO\text{-}wait}) / 2$
$t_{service}$ = time to service (block read/write time)
$t_{IO\text{-}wait}$ = waiting time
= 0, if CPU has a higher priority
≠ 0, otherwise (estimate using queuing model)
Slide 13: Write Interference Delay
$t_{w\text{-}interference}$ = probability that a write through is occupying the memory when a read miss occurs × average time taken to complete ongoing write
Slide 14: Memory performance using queuing model
[Figure: arrival of requests (from processor/cache) → requests queued for service → servicing of requests (by memory)]
Statistical behaviour of arrivals?
Statistical behaviour of service?
Model nomenclature: arrival / service / number of servers
M / G / 1     G: General
M / M / 1     M: Poisson/Exponential
M / D / 1     D: Constant
MB / D / 1    MB: Binomial
Slide 15: Modeling memory requests
$T$: interval (memory cycle time)
$\tau$: processor cycle
prob of a request in one cycle = $p$
prob of no request in one cycle = $1 - p$
prob of no request in $T/\tau$ cycles = $(1 - p)^{T/\tau}$
prob of at least one request in $T/\tau$ cycles = $1 - (1 - p)^{T/\tau}$
prob of k requests in n (= $T/\tau$) cycles = $\binom{n}{k} p^k (1 - p)^{n-k}$ (Binomial distribution)
expected no. of requests in n cycles = $n p$
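The binomial request model can be checked numerically. This is a small sketch using only the standard library; the function name and the values n = 8 cycles and p = 0.25 are assumptions chosen for illustration.

```python
from math import comb


def prob_k_requests(k, n, p):
    """Binomial probability of exactly k requests in n processor cycles."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 8, 0.25  # hypothetical: 8 processor cycles per memory cycle
pmf = [prob_k_requests(k, n, p) for k in range(n + 1)]
print("P(no request)        =", pmf[0])          # equals (1 - p)**n
print("P(at least one)      =", 1 - pmf[0])
print("expected no. requests =", sum(k * q for k, q in enumerate(pmf)))  # n * p
```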
Slide 16: Poisson Approximation
If processor cycles are small (i.e., processor cycle $\tau \to 0$, $p \to 0$, $n \to \infty$, $n p \to \lambda T$),
Binomial distribution → Poisson distribution, request rate = $\lambda$
prob of k requests in interval T = $\frac{(\lambda T)^k \, e^{-\lambda T}}{k!}$
expected no. of requests in interval T = $\lambda T$
Interval between two consecutive requests has an exponential distribution: prob(inter-arrival interval ≤ t) = $1 - e^{-\lambda t}$, that is, prob(inter-arrival interval > t) = $e^{-\lambda t}$
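To see the approximation at work, one can hold the expected number of requests per interval fixed and let the number of cycles n grow. The sketch below compares the two distributions; the function names, the mean of 2 requests per interval and the cut-off `k_max` are illustrative assumptions.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, mean):
    """Poisson probability of k requests when the expected number is `mean`."""
    return mean**k * exp(-mean) / factorial(k)


def max_gap(n, mean, k_max=10):
    """Largest pointwise difference between Binomial(n, mean/n) and Poisson(mean)."""
    p = mean / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mean))
               for k in range(k_max + 1))


for n in (10, 100, 1000):
    print(n, max_gap(n, mean=2.0))
```

The gap shrinks as n grows, which is the sense in which the binomial model tends to the Poisson model when processor cycles become small.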
Slide 17: Modeling Service
• Each request is served in constant time
e.g. cache write through requests, cache block transfer requests
or
• Service time has an exponential distribution
e.g. I/O requests with varying block sizes where small blocks are more common than large blocks
Slide 18: M / G / 1 Model
Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$
Average queue length = $Q = \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$
where
$\rho$ = occupancy of server = $\lambda / \mu$
$\mu$ = average service rate
$c^2 = \mu^2 \sigma_t^2$, with $\sigma_t^2$ = variance of service time
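The M/G/1 expressions are easy to evaluate for different service-time variability. In this sketch `lam` is the request rate, `mu` the service rate and `c2` the squared coefficient of variation of the service time; the function name and the sample rates are assumptions for illustration.

```python
def mg1_wait_and_queue(lam, mu, c2):
    """Return (T_w, Q) for an M/G/1 queue.

    Q   = rho^2 (1 + c^2) / (2 (1 - rho))
    T_w = Q / lam
    """
    if lam <= 0 or mu <= 0:
        raise ValueError("rates must be positive")
    rho = lam / mu
    if rho >= 1:
        raise ValueError("occupancy rho must be below 1 for a steady state")
    q = rho**2 * (1 + c2) / (2 * (1 - rho))
    return q / lam, q


# Hypothetical rates: lam = 0.5, mu = 1.0 (rho = 0.5).
print("exponential service (c2 = 1):", mg1_wait_and_queue(0.5, 1.0, 1.0))
print("constant service    (c2 = 0):", mg1_wait_and_queue(0.5, 1.0, 0.0))
```

With the same occupancy, constant service time halves both the waiting time and the queue length relative to exponential service.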
Slide 19: Special cases: M/M/1, M/D/1
M/M/1: $c = 1$
Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2}{1 - \rho}$
Average queue length = $Q = \frac{\rho^2}{1 - \rho}$
M/D/1: $c = 0$
Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2}{2 (1 - \rho)}$
Average queue length = $Q = \frac{\rho^2}{2 (1 - \rho)}$
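Slide 20: M/D/1 with low server occupancy
Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2}{2 (1 - \rho)}$
Average queue length = $Q = \frac{\rho^2}{2 (1 - \rho)}$
when $\rho$ is small, $T_w \approx \frac{1}{\lambda} \cdot \frac{\rho^2}{2} = \frac{1}{2} \, \rho \cdot \frac{1}{\mu}$
Compare this with $t_{interference} = \frac{1}{2} \, \lambda \, p_m \, t_{busy}^2$, which has the same form when line-transfer requests arrive at rate $\lambda p_m$ and the service time is $1/\mu = t_{busy}$, so that $\rho = \lambda \, p_m \, t_{busy}$.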
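The comparison can be made concrete by evaluating both expressions for the same parameters. The mapping used here is an interpretation stated as an assumption: misses arrive at rate `lam * p_m` and each occupies memory for a constant `t_busy`. Function names and numbers are hypothetical.

```python
def md1_wait(lam, p_m, t_busy):
    """M/D/1 waiting time with arrival rate lam * p_m and service time t_busy."""
    rho = lam * p_m * t_busy
    if rho >= 1:
        raise ValueError("occupancy must be below 1")
    return rho * t_busy / (2 * (1 - rho))


def simple_interference(lam, p_m, t_busy):
    """Low-occupancy estimate: (1/2) * lam * p_m * t_busy^2."""
    return 0.5 * lam * p_m * t_busy**2


# Hypothetical values giving rho = 0.2 * 0.05 * 10 = 0.1.
exact = md1_wait(0.2, 0.05, 10)
approx = simple_interference(0.2, 0.05, 10)
print(exact, approx, exact / approx)  # the ratio is 1 / (1 - rho)
```

The two estimates differ by the factor 1/(1 - ρ), so they agree closely only while the occupancy stays small.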
Slide 21: Designing buffer to hold the queue
How to design a buffer so that buffer overflow or stalling due to buffer full is within a certain limit?
For the M/M/1 model,
prob(queue size > buffer size BF) = $\rho^{BF+1}$
Choose BF so that this probability is below a desired value.
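Choosing BF amounts to finding the smallest size for which $\rho^{BF+1}$ drops to the target. A minimal sketch, assuming the M/M/1 expression above, with a hypothetical function name and example values:

```python
def min_buffer_size(rho, max_overflow_prob):
    """Smallest BF such that rho**(BF + 1) <= max_overflow_prob (M/M/1)."""
    if not 0 < rho < 1:
        raise ValueError("rho must lie strictly between 0 and 1")
    if not 0 < max_overflow_prob < 1:
        raise ValueError("target probability must lie strictly between 0 and 1")
    bf = 0
    while rho ** (bf + 1) > max_overflow_prob:
        bf += 1
    return bf


# Hypothetical design point: occupancy 0.5, overflow probability at most 1%.
bf = min_buffer_size(0.5, 0.01)
print(bf, 0.5 ** (bf + 1))
```

A plain loop is used instead of a logarithm so that the returned size always satisfies the inequality exactly as tested.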
Slide 22: Open and Closed Queues
[Figure: arrival of requests (from processor/cache) → requests queued for service → servicing of requests (by memory)]
• Processor is not blocked by queuing delays and request rate remains unaffected – Open queue
• Processor is blocked due to queuing delays and request rate drops – Closed queue
Slide 23: Open and Closed Queues
[Figure: arrival of requests (from processor/cache) → requests queued for service → servicing of requests (by memory)]
                   Queued for service       In service
Time               $T_w$                     $1/\mu$
Number (open)      $Q = \lambda T_w$         $\rho = \lambda / \mu$
Number (closed)    $Q_a$                     $\rho_a$
occupancy (open queue) = occupancy (closed queue) + waiting (closed queue): $\rho = \rho_a + Q_a$
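Slide 24: M/D/1 Closed Queue
Reduced request rate = $\lambda_a$
Reduced occupancy = $\rho_a = \lambda_a / \mu$
Requests being served = $\rho_a$
Requests waiting = $\frac{\rho_a^2}{2 (1 - \rho_a)}$
Since open-queue occupancy = closed-queue occupancy + closed-queue waiting,
$\rho = \rho_a + \frac{\rho_a^2}{2 (1 - \rho_a)}$
$\rho_a^2 - 2 (1 + \rho) \rho_a + 2 \rho = 0$
$\rho_a = (1 + \rho) - \sqrt{(1 + \rho)^2 - 2 \rho} = (1 + \rho) - \sqrt{1 + \rho^2}$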
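The closed-queue occupancy follows from solving a quadratic, which the sketch below evaluates and then verifies against the balance "open occupancy = closed occupancy + closed waiting". It assumes the M/D/1 waiting expression $\rho_a^2 / (2(1 - \rho_a))$; the function names and the sample occupancies are illustrative.

```python
from math import sqrt


def closed_queue_occupancy(rho):
    """Root below 1 of rho_a^2 - 2 (1 + rho) rho_a + 2 rho = 0."""
    return (1 + rho) - sqrt(1 + rho**2)


def md1_requests_waiting(rho_a):
    """Average number of waiting requests in an M/D/1 queue at occupancy rho_a."""
    return rho_a**2 / (2 * (1 - rho_a))


for rho in (0.2, 0.5, 0.9):  # hypothetical open-queue occupancies
    rho_a = closed_queue_occupancy(rho)
    total = rho_a + md1_requests_waiting(rho_a)
    print(f"rho = {rho:.1f}  rho_a = {rho_a:.4f}  rho_a + Q_a = {total:.4f}")
```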
Slide 25: Deriving queue length, wait time
Let $t_i$ = time during which request i is being served
$r_i$ = no. of arrivals during $t_i$
$n_i$ = queue length at the end of $t_i$, including item in service
Assume occupancy of server = $\rho = \lambda / \mu < 1$ ⇒ process reaches a steady state
Expected values:
$E(t_i) = E(t) = T = 1/\mu$
$E(r_i) = E(r) = \lambda E(t) = \lambda / \mu = \rho$
$E(n_i) = E(n) = N$
Slide 26: Relating $n_{i+1}$ to $n_i$
$n_{i+1} = n_i$ + arrivals – departures
two cases need to be considered:
i) $n_i \neq 0$
ii) $n_i = 0$
[Figure: queue holding $C_i$, $C_{i+1}$, $C_{i+2}$, $C_{i+3}$, with queue length $n_i$]
Slide 27: When $n_i \neq 0$
$C_{i+1}$ arrived before $C_i$ left
$n_{i+1} = n_i + r_{i+1} - 1$
[Figure: timeline with $C_i$ served during $t_i$ and $C_{i+1}$ served during $t_{i+1}$; $C_i$ leaves, then $C_{i+1}$ leaves]
Slide 28: When $n_i = 0$
$C_{i+1}$ arrived after $C_i$ left
$n_{i+1} = n_i + 1 + r_{i+1} - 1 = n_i + r_{i+1}$
[Figure: timeline with $C_i$ served during $t_i$; $C_i$ leaves, $C_{i+1}$ arrives later, is served during $t_{i+1}$, then leaves]
Slide 29: Combining the two cases
$n_{i+1} = n_i + r_{i+1} - 1 + \delta_i$
where $\delta_i = 0$ when $n_i \neq 0$, and $\delta_i = 1$ when $n_i = 0$
note that $n_i \delta_i = 0$ and $\delta_i^2 = \delta_i$
$E(n_{i+1}) = E(n_i) + E(r_{i+1}) - 1 + E(\delta_i)$
in steady state, $E(n) = E(n) + E(r) - 1 + E(\delta)$
that is, $E(\delta) = 1 - E(r) = 1 - \rho$, so prob($n \neq 0$) = $\rho$
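Slide 30: Combining the two cases
$n_{i+1} = n_i + r_{i+1} - 1 + \delta_i$
$n_{i+1}^2 = n_i^2 + (r_{i+1} - 1)^2 + \delta_i^2 + 2 n_i (r_{i+1} - 1) + 2 (r_{i+1} - 1) \delta_i + 2 n_i \delta_i$
Using $n_i \delta_i = 0$ and $\delta_i^2 = \delta_i$:
$n_{i+1}^2 = n_i^2 + (r_{i+1} - 1)^2 + \delta_i + 2 n_i (r_{i+1} - 1) + 2 (r_{i+1} - 1) \delta_i$
$E(n_{i+1}^2) = E(n_i^2) + E[(r_{i+1} - 1)^2] + E(\delta_i) + 2 E[n_i (r_{i+1} - 1)] + 2 E[(r_{i+1} - 1) \delta_i]$
In steady state $E(n_{i+1}^2) = E(n_i^2)$, so
$0 = E[(r - 1)^2] + E(\delta) + 2 E[n (r - 1)] + 2 E[(r - 1) \delta]$
Since $r_{i+1}$ is independent of $n_i$ and $\delta_i$, and $E(r) = \rho$, $E(\delta) = 1 - \rho$:
$0 = E(r^2) - 2 \rho + 1 + (1 - \rho) + 2 E(n) (\rho - 1) + 2 (\rho - 1)(1 - \rho)$
$2 E(n) (1 - \rho) = E(r^2) - 2 \rho^2 + \rho$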
Slide 31: continued
$2 E(n) (1 - \rho) = E(r^2) - 2 \rho^2 + \rho$
This is valid for G/G/1
$N = E(n) = \frac{E(r^2) - 2 \rho^2 + \rho}{2 (1 - \rho)} = \rho + \frac{E(r^2) - \rho}{2 (1 - \rho)}$
Slide 32: Consider Poisson arrival
$P(r_i) = \frac{(\lambda t_i)^{r_i} \, e^{-\lambda t_i}}{r_i!}$
mean $E(r_i) = \lambda t_i$
variance $\sigma_{r_i}^2 = \lambda t_i$
$\sigma_{r_i}^2 = E(r_i^2) - |E(r_i)|^2$
$E(r_i^2) = \sigma_{r_i}^2 + |E(r_i)|^2 = \lambda t_i + \lambda^2 t_i^2$
Take expectation over i:
$E(r^2) = \lambda E(t) + \lambda^2 E(t^2)$
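Slide 33: continued
mean $E(t) = 1/\mu$
variance $\sigma_t^2$
$E(t^2) = \sigma_t^2 + [E(t)]^2 = \sigma_t^2 + 1/\mu^2$
Recall $E(r^2) = \lambda E(t) + \lambda^2 E(t^2)$
Therefore, $E(r^2) = \lambda/\mu + \lambda^2 (\sigma_t^2 + 1/\mu^2) = \rho + \lambda^2 \sigma_t^2 + \rho^2$
$N = E(n) = \rho + \frac{E(r^2) - \rho}{2 (1 - \rho)} = \rho + \frac{\rho^2 + \lambda^2 \sigma_t^2}{2 (1 - \rho)} = \rho + \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$
where $c^2 = \mu^2 \sigma_t^2$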
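The chain of substitutions can be followed numerically: compute $E(r^2)$ from the service-time mean and variance, then $N$ from $E(r^2)$. The sketch below does this for two familiar cases; the function name and parameter values are assumptions for illustration, and Poisson arrivals at rate `lam` are assumed as on the slides.

```python
def mean_requests_in_system(lam, mean_t, var_t):
    """N = rho + (E(r^2) - rho) / (2 (1 - rho)) with Poisson arrivals.

    E(r^2) = lam * E(t) + lam^2 * E(t^2), where E(t^2) = var_t + mean_t^2.
    """
    rho = lam * mean_t
    if rho >= 1:
        raise ValueError("occupancy must be below 1")
    e_r2 = lam * mean_t + lam**2 * (var_t + mean_t**2)
    return rho + (e_r2 - rho) / (2 * (1 - rho))


lam, mean_t = 0.5, 1.0  # hypothetical: rho = 0.5
# Exponential service: variance = mean^2, so c^2 = 1 and N = rho / (1 - rho).
print(mean_requests_in_system(lam, mean_t, var_t=mean_t**2))
# Constant service: variance = 0, so c^2 = 0.
print(mean_requests_in_system(lam, mean_t, var_t=0.0))
```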
Slide 34: Direct Derivation for M/M/1
$P(n; t)$ = prob that there are n requests in the system at time t (in queue + in service)
$P(n; t + \Delta t) = P(n; t)(1 - \lambda \Delta t - \mu \Delta t) + P(n-1; t) \, \lambda \Delta t + P(n+1; t) \, \mu \Delta t$
$P(0; t + \Delta t) = P(0; t)(1 - \lambda \Delta t) + P(1; t) \, \mu \Delta t$
Prob of more than one event in $\Delta t$ is neglected ($\Delta t^2$ term)
Slide 35: Direct Derivation for M/M/1
$dP(n; t)/dt = P(n; t)(-\lambda - \mu) + P(n-1; t) \, \lambda + P(n+1; t) \, \mu$
$dP(0; t)/dt = P(0; t)(-\lambda) + P(1; t) \, \mu$
In steady state, we can drop "; t" and the derivatives tend to 0:
$0 = P(n)(-\lambda - \mu) + P(n-1) \, \lambda + P(n+1) \, \mu$
$0 = P(0)(-\lambda) + P(1) \, \mu$
⇒ $\lambda P(n) - \mu P(n+1) = \lambda P(n-1) - \mu P(n)$
$\lambda P(0) - \mu P(1) = 0$
⇒ $\lambda P(n-1) - \mu P(n) = 0$
⇒ $P(n) = \rho P(n-1)$
Slide 36: Direct Derivation for M/M/1
$P(n) = \rho P(n-1)$
$P(n) = \rho^n P(0)$
$\sum_{i=0}^{\infty} P(i) = P(0) \sum_{i=0}^{\infty} \rho^i = P(0) \cdot \frac{1}{1 - \rho} = 1$
$P(0) = 1 - \rho$ and $P(n) = \rho^n (1 - \rho)$
$E(n) = \sum_{i=0}^{\infty} i \, P(i) = (1 - \rho) \sum_{i=0}^{\infty} i \, \rho^i = (1 - \rho) \cdot \frac{\rho}{(1 - \rho)^2} = \frac{\rho}{1 - \rho}$
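The geometric steady-state distribution can be checked by summing a long truncated series. This sketch is only a numerical sanity check; the function name, the occupancy 0.5 and the truncation point 200 are assumptions.

```python
def mm1_distribution(rho, n_max):
    """P(n) = (1 - rho) * rho**n for n = 0 .. n_max (truncated M/M/1 distribution)."""
    return [(1 - rho) * rho**n for n in range(n_max + 1)]


rho = 0.5  # hypothetical occupancy
probs = mm1_distribution(rho, 200)
total = sum(probs)                                 # close to 1
mean = sum(n * p for n, p in enumerate(probs))     # close to rho / (1 - rho)
print(total, mean, rho / (1 - rho))

# Each term also satisfies the balance relation P(n) = rho * P(n - 1).
print(all(abs(probs[n] - rho * probs[n - 1]) < 1e-15 for n in range(1, 50)))
```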
Slide 37: Direct Derivation for M/M/1
Prob(items in queue + server > k) = $\sum_{i=k+1}^{\infty} P(i) = (1 - \rho) \sum_{i=k+1}^{\infty} \rho^i = (1 - \rho) \cdot \frac{\rho^{k+1}}{1 - \rho} = \rho^{k+1}$