Computer Architecture and Organization (CS-507)
Lecture 6
Instruction Set Architectures & Addressing Modes
Muhammad Zeeshan Haider Ali, Lecturer
ISP. [email protected]
https://zeeshanaliatisp.wordpress.com/
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
registers, memory, stack, accumulator
How many explicit operands are there?
0, 1, 2, or 3
How is the operand location specified?
register, immediate, indirect, . . .
What type & size of operands are supported?
byte, int, float, double, string, vector. . .
What operations are supported?
add, sub, mul, move, compare . . .
Accumulator Architecture
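• Instruction set:
add A, sub A, mult A, div A, . . .
load A, store A
• Example: A*B - (A+C*B)
(accumulator contents after each instruction)
load B      acc = B
mul C       acc = B*C
add A       acc = A+B*C
store D     acc = A+B*C (copied to D)
load A      acc = A
mul B       acc = A*B
sub D       acc = result
ALU operation: acc = acc +,-,*,/ mem[A]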
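The accumulator example can be traced with a small simulator. This is a minimal sketch, not part of the lecture: the helper name `run_accumulator`, the tuple encoding of instructions, and the sample values for A, B and C are all assumptions made for illustration.

```python
def run_accumulator(program, mem):
    """Execute (opcode, address) pairs on a one-register accumulator machine."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = mem[addr]
        elif op == "store":
            mem[addr] = acc
        elif op == "add":
            acc = acc + mem[addr]      # acc = acc + mem[A]
        elif op == "sub":
            acc = acc - mem[addr]
        elif op == "mul":
            acc = acc * mem[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return acc

# A*B - (A+C*B), using the instruction order from the slide
program = [("load", "B"), ("mul", "C"), ("add", "A"), ("store", "D"),
           ("load", "A"), ("mul", "B"), ("sub", "D")]
mem = {"A": 2, "B": 3, "C": 4}         # sample values, chosen arbitrarily
result = run_accumulator(program, mem)
```

Every ALU instruction reads one operand from memory, which is where the high memory traffic of this style comes from.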
Accumulator Architecture
• Pros
– Very low hardware requirements
– Easy to design and understand
• Cons
– Accumulator becomes the bottleneck
– Little ability for parallelism or pipelining
– High memory traffic
Stack Architectures
Instruction set:
add, sub, mult, div, . . .
push A, pop A
Example: A*B - (A+C*B)
(stack contents after each instruction, top of stack first)
push A      A
push B      B, A
mul         A*B
push A      A, A*B
push C      C, A, A*B
push B      B, C, A, A*B
mul         B*C, A, A*B
add         A+B*C, A*B
sub         result
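The stack example can be checked the same way. The sketch below is illustrative only: `run_stack` is a hypothetical helper, the sample values are arbitrary, and it assumes that `sub` computes (item below the top) minus (top of stack), which is what the trace of A*B - (A+C*B) requires.

```python
def run_stack(program, mem):
    """Execute a zero-address program; ALU ops consume the top two stack items."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":
            stack.append(mem[args[0]])
        elif op == "pop":
            mem[args[0]] = stack.pop()
        else:
            right = stack.pop()        # top of stack
            left = stack.pop()         # item below the top
            if op == "add":
                stack.append(left + right)
            elif op == "sub":
                stack.append(left - right)
            elif op == "mul":
                stack.append(left * right)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return stack

mem = {"A": 2, "B": 3, "C": 4}         # sample values, chosen arbitrarily
program = ["push A", "push B", "mul", "push A", "push C", "push B",
           "mul", "add", "sub"]
final_stack = run_stack(program, mem)
```

Note that the ALU instructions carry no operand names at all, which is the source of the good code density.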
Stacks: Pros and Cons
Pros
Good code density (implicit top of stack)
Low hardware requirements
Easy to write a simple compiler for stack architectures
Cons
Stack becomes the bottleneck
Little ability for parallelism or pipelining
Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
Difficult to write an optimizing compiler for stack architectures
Memory – Memory Architecture
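• Instruction set:
(3 operands) add A, B, C   sub A, B, C   mul A, B, C
(2 operands) add A, B   sub A, B   mul A, B
• Example: A*B - (A+C*B)
– 3 operands (first operand is the destination)
mul D, A, B
mul E, C, B
add E, A, E
sub E, D, E
– 2 operands (first operand is both a source and the destination)
mov D, A
mul D, B
mov E, C
mul E, B
add E, A
sub D, E      (D = A*B - (A+C*B); result is left in D)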
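To see why memory traffic is high in this style, the 3-operand sequence can be run on a toy model that counts data accesses. This is a sketch under assumptions: `run_mem_mem` is a hypothetical helper, the first operand is taken to be the destination, and the sample values are arbitrary.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def run_mem_mem(program, mem):
    """Execute 3-operand memory-memory instructions; return data accesses made."""
    accesses = 0
    for op, dst, src1, src2 in program:
        mem[dst] = OPS[op](mem[src1], mem[src2])
        accesses += 3                  # two operand reads plus one result write
    return accesses

mem = {"A": 2, "B": 3, "C": 4}         # sample values, chosen arbitrarily
program = [("mul", "D", "A", "B"),     # D = A*B
           ("mul", "E", "C", "B"),     # E = C*B
           ("add", "E", "A", "E"),     # E = A + C*B
           ("sub", "E", "D", "E")]     # E = A*B - (A + C*B)
accesses = run_mem_mem(program, mem)
```

Four instructions are enough, but each one touches memory three times, so the program makes twelve data accesses in this model.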
Memory - Memory Architecture
• Pros
– Requires fewer instructions (especially if 3 operands)
– Easy to write compilers for (especially if 3 operands)
• Cons
– Very high memory traffic (especially if 3 operands)
– Variable number of clocks per instruction
– With two operands, more data movements are required
Register – Memory Architectures
• Instruction set:
add R1, A   sub R1, A   mul R1, B
load R1, A   store R1, A
• Example: A*B - (A+C*B)
load R1, C
mul R1, B      /* C*B */
add R1, A      /* A + C*B */
store R1, D
load R2, A
mul R2, B      /* A*B */
sub R2, D      /* A*B - (A + C*B) */
ALU operation: R1 = R1 +,-,*,/ mem[B]
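A register-memory machine can be sketched in the same style. The code below is illustrative only: `run_reg_mem` is a hypothetical helper, it assumes the semantics R = R op mem[X], and it computes A + C*B first and parks it in D so that the final `sub` leaves A*B - (A + C*B) in the register.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def run_reg_mem(program, mem):
    """Execute (opcode, register, address) triples; ALU ops are R = R op mem[X]."""
    regs = {}
    for op, reg, addr in program:
        if op == "load":
            regs[reg] = mem[addr]
        elif op == "store":
            mem[addr] = regs[reg]
        else:
            regs[reg] = OPS[op](regs[reg], mem[addr])
    return regs

mem = {"A": 2, "B": 3, "C": 4}         # sample values, chosen arbitrarily
program = [("load", "R1", "C"),
           ("mul", "R1", "B"),         # C*B
           ("add", "R1", "A"),         # A + C*B
           ("store", "R1", "D"),
           ("load", "R2", "A"),
           ("mul", "R2", "B"),         # A*B
           ("sub", "R2", "D")]         # A*B - (A + C*B)
regs = run_reg_mem(program, mem)
```

Because the register operand is always the left-hand side and the destination, the order in which the two halves are computed matters for a non-commutative operation such as subtraction.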
Register – Memory Architectures
• Pros
– Some data can be accessed without loading first
– Instruction format easy to encode
– Good code density
• Cons
– Operands are not equivalent (poor orthogonality)
– Variable number of clocks per instruction
– May limit number of registers
Load – Store Architectures
• Instruction set:
add R1, R2, R3   sub R1, R2, R3   mul R1, R2, R3
load R1, &A   store R1, &A   move R1, R2
• Example: A*B - (A+C*B)
load R1, &A
load R2, &B
load R3, &C
mul R7, R3, R2     /* C*B */
add R8, R7, R1     /* A + C*B */
mul R9, R1, R2     /* A*B */
sub R10, R9, R8    /* A*B - (A+C*B) */
ALU operation: R3 = R1 +,-,*,/ R2
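The load-store sequence can be simulated with a toy register file. This is a sketch, not the lecture's code: `run_load_store` is a hypothetical helper, the destination register is assumed to be the first operand (as in the example's comments), and the sample values are arbitrary.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def run_load_store(program, mem):
    """Only load/store touch memory; ALU ops are rd = rs1 op rs2 on registers."""
    regs = {}
    mem_accesses = 0
    for instr in program:
        op = instr[0]
        if op == "load":
            _, rd, addr = instr
            regs[rd] = mem[addr]
            mem_accesses += 1
        elif op == "store":
            _, rs, addr = instr
            mem[addr] = regs[rs]
            mem_accesses += 1
        elif op == "move":
            _, rd, rs = instr
            regs[rd] = regs[rs]
        else:
            _, rd, rs1, rs2 = instr
            regs[rd] = OPS[op](regs[rs1], regs[rs2])
    return regs, mem_accesses

mem = {"A": 2, "B": 3, "C": 4}         # sample values, chosen arbitrarily
program = [("load", "R1", "A"), ("load", "R2", "B"), ("load", "R3", "C"),
           ("mul", "R7", "R3", "R2"),      # C*B
           ("add", "R8", "R7", "R1"),      # A + C*B
           ("mul", "R9", "R1", "R2"),      # A*B
           ("sub", "R10", "R9", "R8")]     # A*B - (A + C*B)
regs, mem_accesses = run_load_store(program, mem)
```

The instruction count is higher than in the memory-memory version, but each of A, B and C is read from memory only once.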
Load – Store Architectures
• Pros
– Simple, fixed length instruction encodings
– Instructions take similar number of cycles
– Relatively easy to pipeline and make superscalar
• Cons
– Higher instruction count
– Not all instructions need three operands
– Dependent on good compiler
Code Sequence for C = A + B for Four Instruction Sets
Stack       Accumulator   Register (register-memory)   Register (load-store)
Push A      Load A        Load R1, A                   Load R1, A
Push B      Add B         Add R1, B                    Load R2, B
Add         Store C       Store C, R1                  Add R3, R1, R2
Pop C                                                  Store C, R3
ALU operation per style: accumulator: acc = acc + mem[C]; register-memory: R1 = R1 + mem[C]; load-store: R3 = R1 + R2
Addressing Modes
Immediate
Direct
Indirect
Register
Register Indirect
Displacement (Indexed)
Stack
Immediate Addressing
Operand is part of instruction
Operand = address field
e.g. ADD 5
Add 5 to contents of accumulator
5 is operand
No memory reference to fetch data
Fast
Limited range
Immediate Addressing Diagram
Instruction: [ Opcode | Operand ]
Direct Addressing
• Address field contains address of operand
• Effective address (EA) = address field (A)
• e.g. ADD A
• Add contents of cell A to accumulator
• Look in memory at address A for operand
• Single memory reference to access data
• No additional calculations to work out effective address
• Limited address space
Direct Addressing Diagram
Instruction: [ Opcode | Address A ]
Memory: the cell at address A holds the operand
Indirect Addressing (1)
Memory cell pointed to by address field contains the address of (pointer to) the operand
EA = (A)
Look in A, find address (A) and look there for operand
e.g. ADD (A)
Add contents of cell pointed to by contents of A to accumulator
Indirect Addressing (2)
Large address space
2^n where n = word length
May be nested, multilevel, cascaded
e.g. EA = (((A)))
Draw the diagram yourself
Multiple memory accesses to find operand
Hence slower
Indirect Addressing Diagram
Instruction: [ Opcode | Address A ]
Memory: the cell at address A holds a pointer to the operand; the cell at that address holds the operand
Register Addressing (1)
Operand is held in register named in address field
EA =R
Limited number of registers
Very small address field needed
Shorter instructions
Faster instruction fetch
Register Addressing (2)
No memory access
Very fast execution
Very limited address space
Multiple registers help performance
Requires good assembly programming or compiler writing
N.B. C programming: register int a;
c.f. direct addressing
Register Addressing Diagram
Instruction: [ Opcode | Register Address R ]
Registers: register R holds the operand
Register Indirect Addressing
C.f. indirect addressing
EA = (R)
Operand is in memory cell pointed to by contents of register R
Large address space (2^n)
One fewer memory access than indirect addressing
Register Indirect Addressing Diagram
Instruction: [ Opcode | Register Address R ]
Registers: register R holds a pointer to the operand
Memory: the cell at address (R) holds the operand
Displacement Addressing
EA = A + (R)
Address field holds two values
A = base value
R = register that holds displacement
or vice versa
Displacement Addressing Diagram
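Instruction: [ Opcode | Register R | Address A ]
Registers: the contents of R are added (+) to A to form the pointer to the operand
Memory: the cell at address A + (R) holds the operand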
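The modes covered so far differ only in how the operand is located and in how many memory reads that costs. The sketch below makes this concrete. It is an illustration under assumptions: `fetch_operand`, the mode names, and the sample memory and register contents are hypothetical, and memory is modelled as a dictionary from address to value.

```python
def fetch_operand(mode, field, mem, regs):
    """Return (operand, data memory reads) for one address field."""
    if mode == "immediate":            # operand = address field
        return field, 0
    if mode == "direct":               # EA = A
        return mem[field], 1
    if mode == "indirect":             # EA = (A)
        return mem[mem[field]], 2
    if mode == "register":             # EA = R
        return regs[field], 0
    if mode == "register_indirect":    # EA = (R)
        return mem[regs[field]], 1
    if mode == "displacement":         # EA = A + (R); field is the pair (A, R)
        a, r = field
        return mem[a + regs[r]], 1
    raise ValueError(f"unknown mode: {mode}")

mem = {100: 7, 104: 9, 200: 100}       # sample contents, chosen arbitrarily
regs = {"R1": 100, "R2": 4}

results = {
    "immediate": fetch_operand("immediate", 5, mem, regs),
    "direct": fetch_operand("direct", 100, mem, regs),
    "indirect": fetch_operand("indirect", 200, mem, regs),
    "register": fetch_operand("register", "R2", mem, regs),
    "register_indirect": fetch_operand("register_indirect", "R1", mem, regs),
    "displacement": fetch_operand("displacement", (100, "R2"), mem, regs),
}
```

Indirect addressing needs two memory reads where register indirect needs one, which matches the "one fewer memory access" remark.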
Relative Addressing
A version of displacement addressing
R = Program counter, PC
EA = A + (PC)
i.e. get the operand A cells away from the current location pointed to by PC
c.f. locality of reference & cache usage
Base-Register Addressing
A holds displacement
R holds pointer to base address
R may be explicit or implicit
e.g. segment registers in 80x86
Indexed Addressing
A = base
R = displacement
EA = A + (R)
Good for accessing arrays
EA = A + (R)
R++
Combinations
Postindex
EA = (A) + (R)
Preindex
EA = (A+(R))
(Draw the diagrams)
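The difference between the indexed variants is the order of the memory lookup and the addition. A minimal sketch, with hypothetical function names and arbitrary sample contents:

```python
def autoindex_ea(a, r, regs):
    """EA = A + (R), then R++ (steps through consecutive array elements)."""
    ea = a + regs[r]
    regs[r] += 1
    return ea

def postindex_ea(a, r, mem, regs):
    """EA = (A) + (R): fetch the pointer stored at A, then add the index."""
    return mem[a] + regs[r]

def preindex_ea(a, r, mem, regs):
    """EA = (A + (R)): add the index to A, then fetch the pointer stored there."""
    return mem[a + regs[r]]

mem = {50: 400, 53: 900}               # sample contents, chosen arbitrarily
regs = {"R": 3}
post = postindex_ea(50, "R", mem, regs)
pre = preindex_ea(50, "R", mem, regs)

walk = {"R": 0}
array_eas = [autoindex_ea(1000, "R", walk) for _ in range(3)]
```

With these values, postindexing reads the pointer at address 50 and then adds 3, while preindexing reads the pointer at address 53.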
Stack Addressing
Operand is (implicitly) on top of stack
e.g. ADD
Pop top two items from stack and add
Types of Addressing Modes (VAX)
Addressing Mode        Example                  Action
1. Register direct     Add R4, R3               R4 <- R4 + R3
2. Immediate           Add R4, #3               R4 <- R4 + 3
3. Displacement        Add R4, 100(R1)          R4 <- R4 + M[100 + R1]
4. Register indirect   Add R4, (R1)             R4 <- R4 + M[R1]
5. Indexed             Add R4, (R1 + R2)        R4 <- R4 + M[R1 + R2]
6. Direct              Add R4, (1000)           R4 <- R4 + M[1000]
7. Memory Indirect     Add R4, @(R3)            R4 <- R4 + M[M[R3]]
8. Autoincrement       Add R4, (R2)+            R4 <- R4 + M[R2]; R2 <- R2 + d
9. Autodecrement       Add R4, (R2)-            R4 <- R4 + M[R2]; R2 <- R2 - d
10. Scaled             Add R4, 100(R2)[R3]      R4 <- R4 + M[100 + R2 + R3*d]
Studies by [Clark and Emer] indicate that modes 1-4 account for 93% of all operands on the VAX.
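Modes 8 to 10 are the ones that involve the operand size d. The sketch below models them directly from the register-transfer actions listed for those modes. It is illustrative only: the function names are hypothetical, d = 4 is an assumed operand size, and the update order for autodecrement follows the actions as listed in this lecture (use the pointer, then decrement).

```python
def add_autoincrement(regs, mem, dst, ptr, d):
    """Add dst, (ptr)+ : use the pointer, then advance it by d."""
    regs[dst] += mem[regs[ptr]]
    regs[ptr] += d

def add_autodecrement(regs, mem, dst, ptr, d):
    """Add dst, (ptr)- : use the pointer, then move it back by d."""
    regs[dst] += mem[regs[ptr]]
    regs[ptr] -= d

def add_scaled(regs, mem, dst, disp, base, index, d):
    """Add dst, disp(base)[index] : the index register is scaled by d."""
    regs[dst] += mem[disp + regs[base] + regs[index] * d]

mem = {1000: 10, 1004: 20, 1008: 30}   # three 4-byte elements, sample values
d = 4                                  # assumed operand size in bytes

# Sum the array by stepping R2 through it with autoincrement.
regs = {"R2": 1000, "R4": 0}
for _ in range(3):
    add_autoincrement(regs, mem, "R4", "R2", d)

# Fetch element 2 with scaled addressing: 100 + 900 + 2*4 = 1008.
regs_scaled = {"R2": 900, "R3": 2, "R4": 0}
add_scaled(regs_scaled, mem, "R4", 100, "R2", "R3", d)
```

Autoincrement and scaled addressing both suit array traversal: the first moves a pointer, the second keeps the base fixed and varies an element index.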
CISC
Pros
Complex instructions operate directly on main memory.
Programmer is no longer required to issue explicit LOAD and STORE operations, as they are now handled by hardware.
Compiler has less work to do to translate statements in a high-level language to assembly language.
Cons
Microcode became more difficult to test and debug as systems became more complex, requiring numerous patches to fix bugs.
Programmers avoided the more complex instructions in favor of simpler instructions that accomplished the same result.
The use of memory operands caused structural hazards that prevented concurrent (pipelined) execution of instructions.
RISC
Reduced Instruction Set Computer
RISC is a CPU design that recognizes only a limited number of instructions
Simple instructions
Instructions are executed quickly
Executes a series of simple instructions instead of one complex instruction
Instructions are executed within one clock cycle
Incorporates a large number of general registers for arithmetic operations to avoid storing variables on a stack in memory
Only the load and store instructions operate directly on memory
Pipelining = speed
RISC
Pros
Reduced instruction set.
Less complex, simple instructions.
Hardwired control unit and machine instructions.
Few addressing schemes for memory operands, with only two basic instructions, LOAD and STORE.
Many symmetric registers which are organised into a register file.
Cons
Greater burden on the software.
VLIW
A typical VLIW (very long instruction word) machine has instruction words hundreds of bits in length.
Multiple functional units are used concurrently in a VLIW processor.
All functional units share the use of a common large register file.
VLIW - Advantages
Compiler prepares fixed packets of multiple operations that give the full "plan of execution"
dependencies are determined by compiler and used to schedule according to function unit latencies
function units are assigned by compiler and correspond to the position within the instruction packet ("slotting")
compiler produces fully-scheduled, hazard-free code => hardware doesn't have to "rediscover" dependencies or schedule
VLIW - Disadvantages
Compatibility across implementations is a major problem
VLIW code won't run properly with a different number of function units or different latencies
Unscheduled events (e.g., cache miss) stall entire processor
Code density is another problem: low slot utilization
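Slot utilization can be made concrete by counting how many slots of each instruction packet hold a real operation rather than a no-op. This is a sketch under assumptions: the 4-slot machine, the packet layout and the schedule for A*B - (A+C*B) are hypothetical and only meant to illustrate the idea; `None` stands for an empty (no-op) slot.

```python
def slot_utilization(packets):
    """Fraction of instruction slots that hold a real operation."""
    total = sum(len(packet) for packet in packets)
    used = sum(1 for packet in packets for op in packet if op is not None)
    return used / total

# Hypothetical 4-wide schedule for A*B - (A+C*B); dependencies force
# the add to wait for C*B and the sub to wait for both results.
packets = [
    ("load A", "load B", "load C", None),
    ("mul C*B", "mul A*B", None, None),
    ("add A+C*B", None, None, None),
    ("sub", None, None, None),
]
utilization = slot_utilization(packets)
```

Here only 7 of 16 slots do useful work, yet all 16 occupy space in the program, which is the code density problem.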
Comparison