![Page 1: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/1.jpg)
CISC 662 Graduate Computer Architecture
Lecture 3 - ISA
Michela Taufer
Powerpoint Lecture Notes from John Hennessy and David Patterson’s: Computer Architecture, 4th edition
----Additional teaching material from:
Jelena Mirkovic (U Del) and John Kubiatowicz (UC Berkeley)
![Page 2: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/2.jpg)
Memory Addressing
![Page 3: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/3.jpg)
Alignment Restrictions
• Computer systems place restrictions on allowable addresses for some objects
• Access to an object of size s bytes at byte address A is aligned if A mod s = 0
• Why do machines have alignment restrictions?
– Hardware to access memory is simpler
– Programs with aligned accesses run faster
– A misaligned memory access will take multiple aligned memory references
![Page 4: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/4.jpg)
Alignment Restrictions con’t
• A 32-bit processor requires a 4-byte integer to reside at a memory address that is evenly divisible by 4
• An aligned 4-byte int has an address that is a multiple of 4, e.g., 0x2000 or 0x2004 -> the value can be read or written with a single memory operation
• An unaligned 4-byte int has an address that is not a multiple of 4, e.g., 0x2001 -> the object may be split across two 4-byte blocks and therefore read or written with two memory operations
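The alignment rule (A mod s = 0) and the cost of a misaligned access can be sketched in a few lines of C. The function names `is_aligned` and `word_accesses` are hypothetical, and the sketch assumes a simple memory that is only ever accessed in aligned blocks of `word` bytes; real hardware may behave differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An access of `size` bytes at address `addr` is aligned
   if addr mod size == 0 (the rule from the slide). */
bool is_aligned(uintptr_t addr, size_t size) {
    return size != 0 && addr % size == 0;
}

/* Number of aligned block references needed to cover `size` bytes
   starting at `addr`, assuming memory is accessed only in aligned
   blocks of `word` bytes (an assumed, simplified memory model). */
size_t word_accesses(uintptr_t addr, size_t size, size_t word) {
    if (size == 0 || word == 0) {
        return 0;
    }
    uintptr_t first = addr / word;              /* block holding the first byte */
    uintptr_t last = (addr + size - 1) / word;  /* block holding the last byte  */
    return (size_t)(last - first + 1);
}
```

Under this model a 4-byte access at 0x2000 touches one block, while the same access at 0x2001 touches two, which matches the slide's example.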
![Page 5: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/5.jpg)
Addressing Modes
• Addressing mode = how architectures specify the address of an object they will access
• Addressing modes may:
– Reduce instruction counts
– Add to the complexity of building a computer
– Increase the average CPI
• Figure B.6 lists all the addressing modes in recent computers
• Some examples in the next slide
![Page 6: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/6.jpg)
Addressing Modes con’t
• Register
– Add R4,R3    R4 <- R4 + R3
– When a value is in a register
• Immediate
– Add R4, #3    R4 <- R4 + 3
– For constants
• Displacement
– Add R4, 100(R1)    R4 <- R4 + M[100+R1]
– Accessing local variables
• Register indirect
– Add R4,(R1)    R4 <- R4 + M[R1]
– Accessing using a pointer or a computed address
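The four modes differ only in where the second operand of the Add comes from. A minimal sketch of that in C follows, assuming a hypothetical toy machine (`Machine`, eight registers, a small word-addressed memory). None of these names come from the lecture, and bounds checking is left out for brevity.

```c
#include <stdint.h>

#define NUM_REGS 8
#define MEM_WORDS 256

/* Hypothetical toy machine: eight registers and a word-addressed memory.
   No bounds checks are done; indices are assumed to be in range. */
typedef struct {
    int32_t reg[NUM_REGS];
    int32_t mem[MEM_WORDS];
} Machine;

/* Register:          Add Rd, Rs        Rd <- Rd + Rs */
void add_register(Machine *m, int rd, int rs) {
    m->reg[rd] += m->reg[rs];
}

/* Immediate:         Add Rd, #imm      Rd <- Rd + imm */
void add_immediate(Machine *m, int rd, int32_t imm) {
    m->reg[rd] += imm;
}

/* Displacement:      Add Rd, disp(Rb)  Rd <- Rd + M[disp + Rb] */
void add_displacement(Machine *m, int rd, int32_t disp, int rb) {
    m->reg[rd] += m->mem[disp + m->reg[rb]];
}

/* Register indirect: Add Rd, (Rb)      Rd <- Rd + M[Rb] */
void add_indirect(Machine *m, int rd, int rb) {
    m->reg[rd] += m->mem[m->reg[rb]];
}
```

Register indirect is the displacement mode with a displacement of zero, which is one reason a small set of modes can cover most accesses.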
![Page 7: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/7.jpg)
Operands and Operations
![Page 8: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/8.jpg)
Operands and Operations
Instruction layout: opcode | result | operand 1 | … | operand n
opcode:
• which operation (ADD, MULT …)
• type of operands (INT, FP)
result and operands:
• operand location (memory or register)
• type (INT, FP)
Examples:
ADD R1, R3, R4
ADD F1, F2, F3
SUB R1, R2, R3
FADD R1, R2, R3
Operands and operations are encoded in instructions
![Page 9: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/9.jpg)
Frequency of Data Access
• Frequency of access to different data types helps in deciding which types are more important to support efficiently
![Page 10: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/10.jpg)
Operations in the Instruction Set
Arithmetic: Add, multiply, subtract, divide
Logical: And, or
Control: branch, jump, procedure call and return
System: OS call, virtual memory management
FP operations: add, multiply, subtract, divide
Decimal: add, multiply, convert
String: move, compare, search
Graphics: pixel and vertex operations
![Page 11: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/11.jpg)
Frequency of Instructions
• The most widely executed instructions are the simple operations of an ISA
![Page 12: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/12.jpg)
Control Flow Instructions (CFI)
• Conditional branches
• Jumps
• Procedure calls
• Procedure returns
![Page 13: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/13.jpg)
Frequency of CFI
• Each event is different and may use different instructions and have different behaviors
![Page 14: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/14.jpg)
How To Specify Branch Condition?
Condition code
• ALU operation sets special bits, so the condition comes for free
• Constrains instruction ordering
Condition register
• Write 0 (false) or 1 (true) into a register after comparison
• Supports only BZ and BNZ instructions
Compare and branch
• Compare operands (BLT, BGT, BEQ …) and branch
• The instruction may take a long time to complete
![Page 15: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/15.jpg)
Procedure Invocation Options
Return address and some state must be saved
Caller saving:
• Calling procedure saves registers that it will need upon return
• Must be used for globally accessed variables
Callee saving:
• Called procedure saves registers that it will overwrite
![Page 16: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/16.jpg)
Encoding The Instruction Set
Design decisions affect the size of the instruction:
• Size of the compiled program
• Ease of decoding
![Page 17: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/17.jpg)
Encoding The opcode Field
Depends on whether every operation can be combined with every addressing mode
• If it can, a separate address specifier is needed for each operand
• If it can’t, the opcode can signify the addressing mode
![Page 18: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/18.jpg)
Instruction Set Design Trade-offs
More registers are better for compiler optimization
More addressing modes bring faster operation
More registers and addressing modes make instructions longer
Shorter instructions and instructions with similar CPI are better for pipelining
![Page 19: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/19.jpg)
![Page 20: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/20.jpg)
Instruction Formats
Variable
operation and number of operands | addressing mode and address 1 | ... | addressing mode and address n
• Works best if there are many operations and addressing modes
• All addressing modes with all instructions
• As few bits as possible to encode the program
• Decoding might be complicated
![Page 21: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/21.jpg)
Instruction Formats
Fixed
operation and addressing mode | address 1 | address 2 | address 3
• Works best if there is a small number of operations and addressing modes
• Larger programs
• Always the same number of bits to encode instructions
• Easy decoding
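To see why a fixed format decodes easily, here is a minimal sketch in C. It assumes a hypothetical 32-bit layout with an 8-bit opcode followed by three 8-bit address fields. The field widths and the names `Instr`, `encode` and `decode` are illustrative choices, not a format given in the lecture.

```c
#include <stdint.h>

/* Hypothetical fixed 32-bit format:
   [31:24] opcode | [23:16] address 1 | [15:8] address 2 | [7:0] address 3 */
typedef struct {
    uint8_t opcode;
    uint8_t addr1;
    uint8_t addr2;
    uint8_t addr3;
} Instr;

/* Pack the four fields into one 32-bit instruction word. */
uint32_t encode(Instr in) {
    return ((uint32_t)in.opcode << 24) |
           ((uint32_t)in.addr1 << 16) |
           ((uint32_t)in.addr2 << 8) |
           (uint32_t)in.addr3;
}

/* Every field sits at a fixed bit position, so decoding is
   a handful of shifts with no dependence on the opcode. */
Instr decode(uint32_t word) {
    Instr in;
    in.opcode = (uint8_t)(word >> 24);
    in.addr1 = (uint8_t)(word >> 16);
    in.addr2 = (uint8_t)(word >> 8);
    in.addr3 = (uint8_t)word;
    return in;
}
```

In a variable format the decoder would first have to read the operation and operand count before it knows where the next field starts. That dependence is what makes variable-format decoding more complicated.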
![Page 22: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/22.jpg)
Instruction Formats
Hybrid
operation | addressing mode 1 | address 1
operation | address spec 1 | address spec 2 | address 1
operation | address spec 1 | address 1 | address 2
![Page 23: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/23.jpg)
CISC vs. RISC
Complex Instruction Set Computer (CISC)
• Instructions are highly specialized
• Support for a variety of instructions, addressing modes, etc.
• Different CPI and instruction size
Reduced Instruction Set Computer (RISC)
• Short, simple instructions, support for a few addressing modes
• More complex instructions must be programmed
• Same low CPI
![Page 24: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/24.jpg)
Reduced Code Size
Important for embedded applications
Design hybrid version of instruction set with both 16-bit and 32-bit instructions
• 16-bit instructions are simpler, support fewer operations and addressing modes
Compressed code
• Instruction cache contains full instructions
• Memory contains compressed instructions
• On cache miss, instruction is fetched and decompressed
![Page 25: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/25.jpg)
Role of Compilers
![Page 26: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/26.jpg)
Role of Compilers
Compiler generates object code in machine language from a high-level language such as C
Instruction set is the compiler’s target
In addition to generating the code, the compiler optimizes the code to make it:
• Shorter – 25% to 90%
• Faster
• Susceptible to pipelining
![Page 27: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/27.jpg)
Compilation
Compiler makes two to four passes through the code
• In each pass it performs one of the optimizations
• The optimizations are optional and may be skipped to achieve faster compilation
• Passes are sequential
– If the compiler could go back and repeat steps it might discover better optimizations, but this would increase time and complexity
Compiler design goals:
• Correctness
• Speed of compilation
![Page 28: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/28.jpg)
Compilation
Front end per language
High-level optimizations
Global optimizer
Code generator
![Page 29: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/29.jpg)
![Page 30: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/30.jpg)
Front End
Transforms high-level language into common intermediate representation
When a new language becomes popular, only the front end needs to be rewritten
![Page 31: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/31.jpg)
High-Level Optimizations
Transform the code to take advantage of parallelism and increase speed of execution:
• Loop unrolling – expand the body of the loop to encompass several iterations, thus reducing the number of conditional branches
• Procedure inlining – eliminates context switch
• Prefetch insertion – prefetch array references in loops

Loop unrolling example (original loop ⇒ loop unrolled by a factor of two):

    for (i = 0; i < 100; i++) {
        g ();
    }

⇒

    for (i = 0; i < 100; i+=2) {
        g ();
        g ();
    }
![Page 32: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/32.jpg)
Global Optimizations
Global and local optimizations
• Global common subexpression elimination – locates several expressions that compute the same value and replaces the second with a temporary variable
• Local optimization is done only within a basic block
• Copy propagation: if A=X, replace all later references to A with X
Register allocation
• Allocate most accessed variables to registers
• Since the number of registers is limited, must find a strategy that does not result in too many transfers between memory and the registers
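Common subexpression elimination and copy propagation are easiest to see as a before/after pair. The sketch below is hand-written C that mimics what an optimizer might produce. The functions `before` and `after` and the variable names are hypothetical, and a real compiler performs these rewrites on its intermediate representation rather than on source code.

```c
/* Before optimization: a + b is computed twice, and t is a plain copy of x. */
int before(int a, int b, int c) {
    int x = (a + b) * c;
    int t = x;             /* copy: t = x                         */
    int y = (a + b) - c;   /* same subexpression a + b, recomputed */
    return t + y;
}

/* After common subexpression elimination and copy propagation:
   a + b is computed once into a temporary, and the later reference
   to t is replaced by x, which makes the copy unnecessary. */
int after(int a, int b, int c) {
    int tmp = a + b;       /* shared temporary for a + b */
    int x = tmp * c;
    int y = tmp - c;
    return x + y;          /* t replaced by x */
}
```

Both functions return the same value for any inputs that do not overflow; the second does one addition fewer and needs one fewer variable to place in a register.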
![Page 33: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/33.jpg)
Code Generator
Takes advantage of design features of a specific architecture
• Reorder instructions to improve pipeline performance
• Replace multiplication with addition and shifts
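Replacing a multiplication with additions and shifts can be illustrated as follows. This is a minimal sketch with hypothetical function names, using unsigned 32-bit arithmetic so that overflow wraps the same way as an ordinary multiply; whether the rewrite pays off depends on the target machine.

```c
#include <stdint.h>

/* Multiply by the constant 10 using x*10 = x*8 + x*2. */
uint32_t times10(uint32_t x) {
    return (x << 3) + (x << 1);
}

/* General shift-and-add multiply: for every set bit i of k,
   add x shifted left by i. The result wraps modulo 2^32. */
uint32_t shift_add_mul(uint32_t x, uint32_t k) {
    uint32_t result = 0;
    while (k != 0) {
        if (k & 1u) {
            result += x;   /* this bit of k contributes x << i */
        }
        x <<= 1;           /* advance to the next power of two */
        k >>= 1;
    }
    return result;
}
```

A code generator would normally apply this only when the multiplier is a compile-time constant with few set bits, as in `times10`.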
![Page 34: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/34.jpg)
Which Variables → Registers
Program data allocation
• Stack
– Local scalar variables and activation records for procedures
– Best for register allocation
• Global area
– Global variables and constants
– Should be allocated to registers if accessed frequently
• Heap
– Dynamic objects accessed with pointers
– Should not be allocated to registers
Aliased variables should also not be allocated to registers
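A short C sketch shows why aliased variables are a poor fit for registers. The function `read_after_store` and its parameter names are hypothetical. When two pointers may refer to the same location, a value cached in a register can go stale after a store through the other pointer.

```c
/* `value` and `alias` may or may not point to the same int.
   The compiler cannot tell, so it may not keep *value in a
   register across the store through `alias`. */
int read_after_store(int *value, int *alias) {
    int before = *value;   /* first read of *value                 */
    *alias = before + 1;   /* store that may or may not hit *value */
    return *value;         /* must be reloaded from memory         */
}
```

Called as `read_after_store(&v, &v)` with `v` equal to 5, the function returns 6. With two distinct variables it returns the unchanged first value, so reusing a register copy of `*value` would be wrong in the aliased case.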
![Page 35: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/35.jpg)
How Can Architecture Help?
Provide regularity
• Operations, data types and addressing modes should be orthogonal
Provide primitives not solutions
• Special features that match kernels or high-level languages are often unusable
Simplify trade-offs among alternatives
• Compilers strive to generate efficient code
• Specify benefits and costs of each alternative
Make use of everything that is known at compile time
![Page 36: CISC 662 Graduate Computer Architecture Lecture 3 - ISAAlignment Restrictions con’t •A 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible](https://reader035.vdocuments.net/reader035/viewer/2022062917/5ed141c6cd86a73bbf4f385b/html5/thumbnails/36.jpg)
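Next Weeks …

| Week | Date | Topics | Reading assigned | Quiz |
|------|--------|------------------------------------|------------------|------|
| 1 | Sep 4 | Lec01 - Introduction | Chap 1; App B | |
| 2 | Sep 9 | Lec02 – Performance and ISAs | | Q1 |
| 2 | Sep 11 | Lec03 – ISAs and Role of Compilers | App A1-A6 | |
| 3 | Sep 16 | Lec04 - MIPS Overview | | |
| 3 | Sep 18 | Lec05 – Pipeline | | Q2 |
| 4 | Sep 23 | Lec06 - Hazards | | |
| 4 | Sep 25 | Lec07 – Multi-cycles | App A.7; Chap 2 | |
| | Sep 29 | Homework 1 due | | |
| 5 | Sep 30 | Homework review | | Q3 |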