COMP25212: System Architecture

• Lecturers
– Alasdair Rawsthorne ([email protected])
– Daniel Goodman ([email protected])

• Lectures: 22 (two per week)
– Wed 1200 Simon Basement B.41
– Fri 1600 Alan Turing G.107

• Laboratories: 5 x 2 hour sessions, starting NEXT week (Thu 1100 or Fri 1300)

COMP25212 Lecture 1

COMP25212: System Architecture

• Recommended textbook (IW)
– D.A. Patterson & J.L. Hennessy, “Computer Organization and Design: The Hardware/Software Interface”, Morgan Kaufmann, now in 4th Edition

Aims of the Course

• To introduce architectural techniques which are used in modern processors and systems

• To understand how the specification of systems affects their performance and suitability for particular applications

• To understand how to design such systems


COMP25212: Course Overview

• Architectural techniques – making processors go faster
– Caches
– Pipelines
– Multi-Threading
– Multi-Core

• How to make processors more flexible
– Virtualization

• The Architecture of permanent storage

Motivation for Performance

• There is always a demand for increased computational performance

• Current ‘microprocessors’ are several thousand times faster than when they were first introduced 30 years ago.

• But still lots of things they can’t do due to lack of speed – e.g. HD video synthesis, realistic game physics


Single Core Performance

[Figure not recovered; surviving label: Core i7]

Architecture & the Future

• Speed improvements due to technology have slowed since around 2004/5
– Physical production limits
– Power Dissipation
– Device Variability

• Architecture will need to play a larger role in future performance gains – particularly parallelism (multi-core?)

Architecture & Technology

• A lot of performance improvements over 30 years have been due to technology

• Mainly due to smaller faster circuits

• But it isn’t that simple e.g.
– CPU speed increased 1000x
– Main memory speed < 10x

• Need to tailor architecture to exploit technology – changes with time

Processor Cache Memory

• A very important technique in processors since about mid 1980s

• Purpose is to overcome the speed imbalance between fast internal processor circuitry (e.g. ALU & registers) & main memory

• No modern processor could perform anywhere near current speeds without caches


Understand Caches: Prerequisites

• Processor is a CPU connected to memory

• CPU fetches a sequence of instructions from memory and executes them

• Each memory location has a unique address – a binary value

• Some instructions are loads or stores which read or write values from/to memory addresses
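The prerequisites above can be sketched as a toy machine. This is an illustration only: the three instruction names (`load`, `addi`, `store`), the tuple encoding, and the `run` helper are invented here and do not correspond to any real instruction set. It counts how many memory accesses a short program causes, separating instruction fetches from data accesses.

```python
# Toy model (hypothetical encoding): each instruction is a tuple
# (operation, register, operand). Memory is a list indexed by address.

def run(program, data):
    """Execute `program` and count every memory access it causes."""
    registers = {"r0": 0, "r1": 0}
    fetches = 0          # instruction fetches from memory
    data_accesses = 0    # loads and stores
    pc = 0               # program counter: address of the next instruction
    while pc < len(program):
        op, reg, operand = program[pc]   # every instruction must be fetched
        fetches += 1
        pc += 1
        if op == "load":                 # read the value at address `operand`
            registers[reg] = data[operand]
            data_accesses += 1
        elif op == "store":              # write a register to address `operand`
            data[operand] = registers[reg]
            data_accesses += 1
        elif op == "addi":               # register-only work, no data access
            registers[reg] += operand
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return fetches, data_accesses


program = [
    ("load", "r0", 0),    # r0 = data[0]
    ("addi", "r0", 5),    # r0 = r0 + 5
    ("store", "r0", 1),   # data[1] = r0
]
data = [10, 0]
fetches, data_accesses = run(program, data)
print(fetches, data_accesses, data)
```

Even in this tiny program every instruction costs one fetch, and two of the three instructions also touch data memory.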


What is a Cache?

• Cache: “A secret hiding place”

• General principle

– If something is far away and/or takes a long time to access, keep a local copy

– Usually limited fast local space

• Not just processors

– Web pages kept on local disc
– Virtual Memory is a form of caching
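The general principle (keep a local copy in a limited amount of fast space) can be sketched in a few lines. Everything named here is an assumption made for illustration: `slow_fetch` stands in for whatever is far away, `LocalCopyCache` is a hypothetical class, and evicting the oldest entry when space runs out is just one simple policy, not one the slides prescribe.

```python
import time


def slow_fetch(key):
    """Stand-in for a distant resource (hypothetical), e.g. a remote web page."""
    time.sleep(0.001)  # pretend the access is slow
    return f"contents of {key}"


class LocalCopyCache:
    """Keeps up to `capacity` local copies of values fetched from far away."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}      # dicts keep insertion order, oldest entry first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:            # local copy available: fast path
            self.hits += 1
            return self.store[key]
        self.misses += 1                 # not held locally: go the long way
        value = slow_fetch(key)
        if len(self.store) >= self.capacity:
            oldest = next(iter(self.store))
            del self.store[oldest]       # assumed policy: drop the oldest copy
        self.store[key] = value
        return value


cache = LocalCopyCache(capacity=2)
for key in ["a", "b", "a", "c", "a"]:
    cache.get(key)
print(cache.hits, cache.misses)
```

With only two slots, the second request for "a" is served locally, but the third is not, because "a" was evicted to make room for "c". Limited space is what makes the choice of what to keep matter.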


What is a Processor Cache?

• Small amount of very fast memory used as temporary store for frequently used memory locations (both instructions and data)

• Relies on locality:

– during any short period of time, a program uses only a small subset (working set) of its instructions and data.
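One rough way to see locality is to count the distinct addresses a program touches in each short stretch of its execution. The sketch below assumes a made-up address trace (a loop that repeatedly reads four 4-byte array elements starting at a hypothetical address `0x1000`); `working_set_sizes` is an illustrative helper, not a standard tool.

```python
def working_set_sizes(trace, window):
    """Count distinct addresses touched in each consecutive window of the trace."""
    return [
        len(set(trace[i:i + window]))
        for i in range(0, len(trace), window)
    ]


# Hypothetical trace: 100 accesses cycling over 4 array elements.
trace = [0x1000 + 4 * (i % 4) for i in range(100)]

sizes = working_set_sizes(trace, window=20)
print(sizes)
print(len(trace), "accesses touch", len(set(trace)), "distinct addresses")
```

In this trace 100 accesses reuse the same 4 addresses, so a very small fast memory could serve nearly all of them.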


Processor Cache Memory

• 2 GHz processor with 32k L1 data cache and 32k L1 instruction cache plus 256k on-chip L2 cache (L2 cache is 4 way set associative)

• What does it mean?

• Why is it there?

• What is ‘good’?


Why is Cache Needed?

• Modern processor speed > 1GHz

• > 1 instruction / nsec (10^-9 sec)

• Every instruction needs to be fetched from main memory.

• Many instructions (1 in 3?) also access main memory to read or write data.

• But RAM memory access time typically >50 nsec! (67 x too slow!)
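One plausible reading of the "67 x" figure is the following arithmetic. It uses the slide's own numbers, taken at their lower bounds: 1 instruction per nsec, one fetch per instruction, one data access for every third instruction, and 50 nsec per RAM access. The variable names are chosen here for illustration.

```python
instructions_per_ns = 1.0            # "> 1 instruction / nsec", taken as 1
fetches_per_instruction = 1.0        # every instruction is fetched
data_accesses_per_instruction = 1 / 3   # "1 in 3?" also read or write data
ram_access_ns = 50.0                 # "typically > 50 nsec", taken as 50

# Memory accesses the processor wants to issue every nanosecond.
accesses_per_ns = instructions_per_ns * (
    fetches_per_instruction + data_accesses_per_instruction
)

# Each access occupies RAM for 50 ns, so RAM falls short by this factor.
shortfall = accesses_per_ns * ram_access_ns
print(round(shortfall))
```

That is about 1.33 accesses demanded per nsec against one access per 50 nsec supplied, a shortfall of roughly 67 times.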


Facts about memory speeds

• Circuit capacitance is the thing that makes things slow (needs charging)

• Bigger things have bigger capacitance

• So large memories are slow

• Dynamic memories (storing data on capacitance) are slower than static memories (bistable circuits)


Interconnection Speeds

• External wires also have significant capacitance.

• Driving signals between chips needs special high power interface circuits.

• Things within a VLSI ‘chip’ are fast – anything ‘off chip’ is slow.

• Put everything on a single chip? Maybe one day! Manufacturing limitations


Basic Level 1 (L1) Cache

[Diagram not recovered; surviving labels: CPU, Registers, L1 Cache, On-chip, RAM Memory]

• Compiler makes best use of registers – they are the fastest.

• Anything not in registers – must go (logically) to memory.

• But is data (a copy!) in cache?

Cache Requirements

• Main memory is big e.g. potentially 2^32 bytes (4G bytes) if 32 bit byte address (or 2^48 if 48 bit – x64)

• Cache is small (to be fast), e.g. 32k bytes, can only hold a very small fraction (2^-17 or 2^-33) of all possible data.

• But cache must be able to store and retrieve any one of 2^32 (or 2^48) addresses/data (in practice would not hold single bytes, but in principle ….)

• Special structures needed – is not simple memory indexed by address.
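The sizes above can be checked with exact arithmetic. The sketch assumes "32k bytes" means 32 x 1024 bytes and that memory is byte addressed; `fraction_held` is a name chosen here for illustration.

```python
from fractions import Fraction

cache_bytes = 32 * 1024      # 32k bytes = 2^15 bytes (assumed binary "k")

fraction_held = {}
for address_bits in (32, 48):
    memory_bytes = 2 ** address_bits
    fraction_held[address_bits] = Fraction(cache_bytes, memory_bytes)

for address_bits, fraction in fraction_held.items():
    # The denominator is a power of two; recover its exponent.
    exponent = fraction.denominator.bit_length() - 1
    print(f"{address_bits}-bit addresses: cache holds 2^-{exponent} "
          f"of memory ({float(fraction):.1e})")
```

So a 32k byte cache covers one part in 2^17 (about 131 thousand) of a 32 bit address space, and one part in 2^33 of a 48 bit one. This is why the cache has to record which addresses it currently holds rather than being indexed by the full address.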
