![Page 1: Caching II Andreas Klappenecker CPSC321 Computer Architecture](https://reader038.vdocuments.net/reader038/viewer/2022110323/56649d6a5503460f94a48bfd/html5/thumbnails/1.jpg)
Caching II
Andreas Klappenecker
CPSC321 Computer Architecture
Verilog Questions & Answers
Verilog Q & A
How is the xor instruction encoded?
R-format instruction, function field 0x26. See [PH] page A-59.
What is the purpose of Idealmem.v?
It models the memory.
dmeminit.v initializes data memory.
imeminit.v initializes instruction memory.
Verilog Q&A
How do I specify delays?
`define DEL 10
begin
a <= #(`DEL) b;
c <= #(`DEL) d;
end
A delay can precede the whole statement (#d a = b;) or sit inside the assignment, before the right-hand side (a = #d b;).
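Delays
module iab;
  integer i, j;
  initial begin
    i = 3;
    j = 4;
    begin
      #1 i = #1 j;   // blocking: wait 1, sample j, wait 1 more, then assign i
      #1 j = #1 i;   // starts only after the previous statement has completed
    end
  end
endmodule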
Simulation starts:
@time 0: i=3, j=4
Simulation continues until the first delay #1 and waits until time 1.
@time 1: j is sampled
@time 2: assign 4 to i
continue with the next statement
@time 3: i is sampled
@time 4: assign 4 to j
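Delays
module ianb;
  integer i, j;
  initial begin
    i = 3;
    j = 4;
    begin
      i <= #1 j;     // non-blocking: sample j now, update i one time unit later
      j <= #1 i;     // executes at the same time step, so it samples the old i
    end
  end
endmodule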
@time 0: i=3, j=4
Both non-blocking assignments finish at time 0
[intra-assignment delays do not delay the execution of the statement]:
sample j and schedule its value to be assigned to i at time 1
sample i and schedule its value to be assigned to j at time 1
@time 1: i = 4, j = 3
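The two traces can be reproduced with a small Python model. This is only a sketch of the sample-then-assign timing described above, not a Verilog simulator; the function names `run_blocking` and `run_nonblocking` are made up for this illustration.

```python
# Toy model of the iab and ianb modules; it mimics only their timing.

def run_blocking():
    """Model `#1 i = #1 j; #1 j = #1 i;` (blocking assignments)."""
    t, v = 0, {"i": 3, "j": 4}
    log = []
    for dst, src in (("i", "j"), ("j", "i")):
        t += 1             # leading #1: wait before the statement executes
        sampled = v[src]   # the right-hand side is sampled now
        t += 1             # intra-assignment #1: wait before the update
        v[dst] = sampled   # only now may the next statement start
        log.append((t, dict(v)))
    return log

def run_nonblocking():
    """Model `i <= #1 j; j <= #1 i;` (non-blocking assignments)."""
    t, v = 0, {"i": 3, "j": 4}
    pending = []
    for dst, src in (("i", "j"), ("j", "i")):
        # Both statements execute at time 0: sample now, update at t + 1.
        pending.append((t + 1, dst, v[src]))
    for when, dst, value in sorted(pending):
        t = when
        v[dst] = value
    return t, v

print(run_blocking())     # state after each blocking assignment
print(run_nonblocking())  # final time and state for the non-blocking case
```

In this model the blocking version ends with both variables equal to 4 at time 4, while the non-blocking version swaps the two values at time 1.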
Delays
Hint: Using unit delays simplifies debugging
It allows you to find out which signal depends on which
Do not hard-code the delay as #1; rather use
`define foo_del 1 // change later
a <= #(`foo_del) b;
Clock
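module m555 (CLK);
  // STime: start time; Toff, Ton: delays between toggles; Tcc: full cycle time
  parameter STime = 0, Ton = 50, Toff = 50, Tcc = Ton + Toff;
  output CLK;
  reg CLK;
  initial begin
    #STime CLK = 0;     // initialize the clock to 0 at time STime
  end
  always begin
    #Toff CLK = ~CLK;   // toggle after Toff time units
    #Ton CLK = ~CLK;    // toggle again after Ton time units
  end
endmodule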
Project
For jal and jr, the datapath of the book is not enough.
You need more control signals for ALUop, so there is no point in sticking to the way it is done in the book.
Report
Include a table explaining your control signals.
Caching
Memory
Users want large and fast memories.
SRAM is too expensive for main memory.
DRAM is too slow for many purposes.
Compromise: build a memory hierarchy.
[Figure: levels in the memory hierarchy, from the CPU through Level 1, Level 2, ..., Level n. Labels: increasing distance from the CPU in access time; size of the memory at each level.]
Locality
Temporal locality: a referenced item will be referenced again soon.
Spatial locality: nearby data will be referenced soon.
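Direct Mapped Cache
Mapping: address modulo the number of blocks in the cache, x -> x mod B
[Figure: a cache with 3-bit block indices and a memory with addresses 00001, 00101, 01001, 01101, 10001, 10101, 11001, 11101; each memory address maps to the cache block named by its low-order three bits.]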
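The mapping x -> x mod B can be checked with a few lines of Python. This is a minimal sketch that assumes B = 8 blocks, as the 3-bit indices in the figure suggest; `cache_index` is a name chosen for this illustration.

```python
def cache_index(block_address: int, num_blocks: int) -> int:
    """Direct-mapped placement: x -> x mod B."""
    return block_address % num_blocks

NUM_BLOCKS = 8  # assumption: 3-bit cache indices, as in the figure

addresses = [0b00001, 0b00101, 0b01001, 0b01101,
             0b10001, 0b10101, 0b11001, 0b11101]

# Map each 5-bit memory address to its 3-bit cache index.
mapping = {format(a, "05b"): format(cache_index(a, NUM_BLOCKS), "03b")
           for a in addresses}

for address, index in mapping.items():
    print(address, "->", index)
```

Because B is a power of two here, x mod B is simply the low-order three bits of the address, so the eight addresses share only two cache blocks (001 and 101).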
Direct Mapped Cache
Cache with 1024 = 2^10 words.
The tag from the cache is compared against the upper portion of the address.
If tag = upper 20 bits and the valid bit is set, then we have a cache hit; otherwise it is a cache miss.
What kind of locality are we taking advantage of?
[Figure: address (showing bit positions): tag = bits 31-12 (20 bits), index = bits 11-2 (10 bits), byte offset = bits 1-0. The cache has entries 0 to 1023, each with a valid bit, a 20-bit tag, and 32 bits of data. Outputs: Hit and Data.]
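The hit test described above can be sketched in Python. The bit positions follow the figure (20-bit tag, 10-bit index, 2-bit byte offset); the class and function names are made up for this illustration, and the data array is omitted because only hits and misses are modelled.

```python
def split_address(addr: int):
    """Split a 32-bit address into (tag, index, byte offset)."""
    byte_offset = addr & 0x3       # bits 1-0
    index = (addr >> 2) & 0x3FF    # bits 11-2, selects one of 1024 entries
    tag = (addr >> 12) & 0xFFFFF   # bits 31-12
    return tag, index, byte_offset

class DirectMappedCache:
    """Tracks only valid bits and tags, which is enough to decide hit or miss."""

    def __init__(self, entries: int = 1024):
        self.valid = [False] * entries
        self.tags = [0] * entries

    def access(self, addr: int) -> bool:
        tag, index, _ = split_address(addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            # Miss: install the new tag, replacing whatever was in the entry.
            self.valid[index] = True
            self.tags[index] = tag
        return hit

cache = DirectMappedCache()
results = [cache.access(a) for a in (0x1000, 0x1000, 0x2000, 0x1000)]
print(results)
```

The addresses 0x1000 and 0x2000 have the same index but different tags, so in this sketch they evict each other. With one word per entry, only a repeated access to the same word can hit, which is temporal locality.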
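Direct Mapped Cache
Taking advantage of spatial locality:
[Figure: address (showing bit positions): tag = bits 31-16 (16 bits), index = bits 15-4 (12 bits), block offset = bits 3-2 (2 bits), byte offset = bits 1-0. The cache has 4K entries, each with a valid bit, a 16-bit tag, and 128 bits of data (four 32-bit words). A mux selects one 32-bit word using the block offset. Outputs: Hit and Data.]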
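A short Python sketch shows why four-word blocks help on sequential accesses. The field widths follow the figure; the function names and the access trace are assumptions made for this illustration.

```python
def split_block_address(addr: int):
    """Split a 32-bit address into (tag, index, block offset, byte offset)."""
    byte_offset = addr & 0x3          # bits 1-0
    block_offset = (addr >> 2) & 0x3  # bits 3-2, selects one of four words
    index = (addr >> 4) & 0xFFF       # bits 15-4, selects one of 4K entries
    tag = (addr >> 16) & 0xFFFF       # bits 31-16
    return tag, index, block_offset, byte_offset

def count_misses(addresses) -> int:
    """Count misses in an initially empty cache with 4-word blocks."""
    tags = {}  # index -> tag of the block currently stored there
    misses = 0
    for addr in addresses:
        tag, index, _, _ = split_block_address(addr)
        if tags.get(index) != tag:
            misses += 1        # the whole 4-word block is fetched on a miss
            tags[index] = tag
    return misses

# Hypothetical trace: 16 consecutive words (byte addresses 0, 4, ..., 60).
sequential_words = [4 * i for i in range(16)]
print(count_misses(sequential_words))
```

Each miss brings in the three neighbouring words as well, so in this model only the first access to each block misses.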
Cache Hits and Misses
Read hits: this is what we want!
Read misses: stall the CPU, fetch block from memory, deliver to cache, restart.
Write hits:
  can replace data in cache and memory (write-through)
  write the data only into the cache (write-back the cache later)
Write misses: read the entire block into the cache, then write the word.
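The two write-hit policies differ in how often main memory is written. The following Python sketch counts memory writes under simplifying assumptions: every write is a hit, and under write-back each modified block is written to memory exactly once, later. The function name `memory_writes` is made up for this illustration.

```python
def memory_writes(write_indices, policy: str) -> int:
    """Count main-memory writes caused by a sequence of write hits."""
    if policy == "write-through":
        # Every write hit updates the cache and memory.
        return len(write_indices)
    if policy == "write-back":
        # Writes go to the cache only; each modified (dirty) block
        # is written back to memory once.
        dirty_blocks = set(write_indices)
        return len(dirty_blocks)
    raise ValueError("unknown policy: " + policy)

writes = [3, 3, 3, 7]  # hypothetical cache indices that are written
print(memory_writes(writes, "write-through"))
print(memory_writes(writes, "write-back"))
```

Repeated writes to the same block are what write-back saves on; a program that writes each block only once sees no difference in this model.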
What Block Size?
A larger block size reduces cache misses.
But the cache miss penalty increases with the block size.
We need to balance these two constraints.
Next time:
How can we measure cache performance?
How can we improve cache performance?
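The block-size trade-off can be explored with a small Python model. It is a sketch under stated assumptions: a direct-mapped cache of 64 words, and a hypothetical miss penalty of 10 cycles plus 1 cycle per word in the block. The numbers and the name `simulate` are illustrative, not measurements.

```python
def simulate(word_addresses, block_words, cache_words=64,
             latency=10, cycles_per_word=1):
    """Return (misses, total stall cycles) for a direct-mapped cache."""
    num_blocks = cache_words // block_words
    stored = [None] * num_blocks   # block number held in each cache entry
    misses = 0
    for w in word_addresses:
        block = w // block_words
        index = block % num_blocks
        if stored[index] != block:
            misses += 1
            stored[index] = block
    # Assumed penalty model: fixed latency plus transfer time per word.
    penalty = latency + cycles_per_word * block_words
    return misses, misses * penalty

sequential = list(range(256))          # consecutive words
strided = list(range(0, 4096, 16))     # every 16th word, 256 accesses

for block_words in (1, 4, 16):
    print(block_words,
          simulate(sequential, block_words),
          simulate(strided, block_words))
```

For the sequential trace, larger blocks cut the miss count and the total stall time. For the strided trace, every access still misses, so larger blocks only make each miss more expensive.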