Cache Basics (Section 1.7, 5.1)
• A cache is a small, fast memory located close to the CPU that holds the most recently accessed code or data
• A block is a fixed-size collection of data containing the requested word, and is retrieved from memory
• Temporal locality tells us that we are likely to need this word again in the near future
• Spatial locality tells us that the other data in the block may be needed soon.
Cache Basics (Cont’d)
• The time required for the cache miss depends on the latency and bandwidth of the memory
• Latency determines the time to retrieve the first word of the block
• Bandwidth determines the time to retrieve the rest of the block
• Hit (or miss) rate is the fraction of cache accesses that result in a hit (or a miss)
• Example on page 42
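The latency/bandwidth split above can be sketched as follows (a hypothetical model, not from the slides: latency is assumed to be cycles to the first word, bandwidth words per cycle):

```python
def miss_penalty(latency_cycles, block_words, words_per_cycle):
    # Latency determines the time to get the first word of the block;
    # bandwidth determines the time to transfer the remaining words.
    transfer = (block_words - 1) / words_per_cycle
    return latency_cycles + transfer

# e.g. 10-cycle latency, 8-word block, 2 words/cycle:
print(miss_penalty(10, 8, 2))  # 13.5
```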
Performance of Cache
Memory stall cycles = Number of misses x Miss penalty
                    = IC x (Misses/Instruction) x Miss penalty
                    = IC x (Memory references/Instruction) x Miss rate x Miss penalty
CPU Execution Time = (CPU clock cycles + Memory stall cycles) x Clock cycle time
Example on page 43
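The two formulas above can be sketched directly (the numeric values in the usage are hypothetical, chosen only for illustration):

```python
def memory_stall_cycles(ic, mem_refs_per_instr, miss_rate, miss_penalty):
    # Memory stall cycles = IC x (memory references per instruction)
    #                          x miss rate x miss penalty
    return ic * mem_refs_per_instr * miss_rate * miss_penalty

def cpu_time(cpu_clock_cycles, stall_cycles, clock_cycle_time):
    # CPU execution time = (CPU clock cycles + memory stall cycles) x cycle time
    return (cpu_clock_cycles + stall_cycles) * clock_cycle_time

# Hypothetical workload: 1M instructions, 1.5 refs/instr, 2% miss rate,
# 50-cycle miss penalty, 1 ns clock cycle.
stalls = memory_stall_cycles(1_000_000, 1.5, 0.02, 50)
print(stalls)  # 1500000.0
print(cpu_time(2_000_000, stalls, 1e-9))
```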
When Can a Block Be Placed in a Cache? (Figure 5.2)
• Direct Mapped: each block has only one place in the cache.
(Block address) mod (no.of blocks in cache)
• Fully Associative: a block can be placed anywhere in the cache
• Set Associative: a block can be placed in a restricted set of places in the cache.
A set is a group of blocks.
If a set holds n blocks, the placement is n-way set-associative.
A set is chosen as
(Block address) mod (no. of sets in cache)
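The two mod formulas above amount to (a minimal sketch; the example addresses are arbitrary):

```python
def direct_mapped_block(block_address, num_blocks):
    # Direct mapped: (block address) mod (number of blocks in cache)
    return block_address % num_blocks

def set_index(block_address, num_sets):
    # Set associative: (block address) mod (number of sets in cache)
    return block_address % num_sets

print(direct_mapped_block(12, 8))  # 4
print(set_index(12, 4))            # 0
```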
How Is a Block Found If It Is in the Cache?
• Each block has an address tag and an index that give the block address (Fig 5.3)
• A block offset points to the desired data within the block
• The index field selects the set
• The tag field is compared to determine a hit
• Increasing associativity means increasing the tag field and decreasing the index field
• Fully associative caches have no index field
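The tag/index/offset split described above can be sketched with bit operations (a generic sketch; the field widths in the usage line are assumed, not taken from a particular cache):

```python
def split_address(addr, offset_bits, index_bits):
    # Low bits: block offset (locates the word within the block);
    # middle bits: index (selects the set);
    # remaining high bits: tag (compared to determine a hit).
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

# Assuming a 5-bit offset and 8-bit index:
print(split_address(4660, 5, 8))  # (0, 145, 20)
```

Note how a fully associative cache is the case index_bits = 0: the whole block address becomes the tag.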
Which Block Should Be Replaced?
• On a cache miss, an existing block may need to be replaced to make room for the new block
• In a direct-mapped cache, there is a fixed place for each block, so the choice is simple
• In fully associative or set-associative caches, three strategies exist for replacement
1. Random
2. Least-recently used (LRU)
3. FIFO
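LRU, the second strategy above, can be sketched for a single n-way set (a toy model tracking block addresses only, no data):

```python
from collections import OrderedDict

class LRUSet:
    # One n-way set; OrderedDict keeps blocks ordered from least to
    # most recently used.
    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()

    def access(self, block_addr):
        if block_addr in self.blocks:          # hit: mark most recently used
            self.blocks.move_to_end(block_addr)
            return True
        if len(self.blocks) >= self.ways:      # miss in a full set:
            self.blocks.popitem(last=False)    # evict least recently used
        self.blocks[block_addr] = None
        return False

s = LRUSet(2)
print([s.access(a) for a in [1, 2, 1, 3, 2]])  # [False, False, True, False, False]
```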
What Happens on a Write?
Options when writing to cache:
• Write through: write to the cache and to the memory
– Next lower level has the most current copy
• Write back: write to the cache only
– Write occurs at the speed of the cache
– A dirty bit specifies whether the block has been modified
Options on a write miss:
• Write allocate: the block is loaded on a write miss
• No-write allocate: the block is not loaded into the cache
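The policy combinations above can be sketched with dicts standing in for the cache and the next level (a hypothetical model; dirty-bit eviction is not shown — under write back, memory is updated only when a dirty block is evicted):

```python
def handle_write(cache, memory, addr, data, write_through, write_allocate):
    # cache and memory are plain dicts keyed by block address (assumption).
    hit = addr in cache
    if hit or write_allocate:
        cache[addr] = data    # write allocate loads the block on a miss
    if write_through:
        memory[addr] = data   # next lower level keeps the current copy
    return hit

cache, memory = {}, {}
handle_write(cache, memory, 0, "a", write_through=True, write_allocate=False)
print(0 in cache, memory.get(0))  # False a  -- the write goes around the cache
```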
Alpha AXP 21064 Data Cache (Figure 5.5)
• 8192-byte data cache
• 32-byte blocks (5-bit offset, 8-bit index)
• Direct-mapped
• Write through with a 4-block write buffer
• No-write allocate: write around the cache on a miss
• 34-bit address: 21-bit tag, 8-bit index, 5-bit offset
• Write buffers use merging
• Separate instruction and data caches
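The field widths above follow from the cache geometry: 8192 bytes / 32-byte blocks = 256 blocks, hence an 8-bit index; a 32-byte block needs a 5-bit offset; the 34-bit address leaves 34 - 8 - 5 = 21 tag bits. A quick check:

```python
import math

cache_bytes, block_bytes, addr_bits = 8192, 32, 34
offset_bits = int(math.log2(block_bytes))                 # 5
index_bits = int(math.log2(cache_bytes // block_bytes))   # 8 (256 blocks)
tag_bits = addr_bits - index_bits - offset_bits           # 21
print(offset_bits, index_bits, tag_bits)  # 5 8 21
```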
Cache Performance
• Average memory access time = Hit time + Miss rate x Miss penalty
• CPU Time = (CPU execution CCs + Mem. Stall CCs) x CC time
Examples on pages 384-389
CC: Clock Cycle
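The average memory access time formula above, as a one-line sketch (the numbers in the usage are hypothetical):

```python
def amat(hit_time, miss_rate, miss_penalty):
    # Average memory access time = hit time + miss rate x miss penalty
    return hit_time + miss_rate * miss_penalty

# e.g. 1-cycle hit time, 5% miss rate, 40-cycle miss penalty:
print(amat(1, 0.05, 40))  # 3.0
```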