Edgar Gabriel
COSC 6385
Computer Architecture
- Exercises
Edgar Gabriel
Fall 2007
COSC 6385 – Computer Architecture
Edgar Gabriel
Hardware based speculation
• Branch prediction reduces the stalls caused directly by branches
• With dynamic branch prediction alone, instructions after a
branch can be issued, but they cannot be executed until the
branch outcome is known
• Speculative execution extends the concept of dynamic
scheduling
– Speculates on the outcome of the branch
– Executes the following instructions
• Requires the ability to undo instructions in case the
prediction was wrong.

Hardware based speculation (II)
• Extending Tomasulo’s algorithm to support speculation:
– Separate the step of bypassing results among instructions from the completion of the instruction
– Add another step
• Issue
• Execute
• Write result
• Commit
– Instructions execute out-of-order but commit in-order
– Additional set of hardware buffers to hold the results of instructions which have not yet been committed: Reorder buffer (ROB)
Reorder Buffers
• Hold the results of instructions between the time an
instruction finishes and the time the instruction is
being committed
• Acts as additional reservation stations
– ROB can be the source of operands of other instructions
• Each ROB entry contains four fields
– Instruction type: branch/store/ALU operation
– Destination: Register number or memory address where
result should be written
– Value: result of the instruction, held until it commits
– Ready: instruction completed execution?
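The four fields above map naturally onto a small record type. The following is a minimal Python sketch, not the hardware design: the names `ROBEntry`, `ReorderBuffer` and `allocate` are hypothetical, and loads are lumped together with ALU operations for simplicity.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class ROBEntry:
    instr_type: str                 # "branch", "store" or "alu" (assumption: loads count as "alu")
    destination: Optional[str]      # register name or memory address; None for a branch
    value: Optional[float] = None   # result, meaningful only once ready is True
    ready: bool = False             # has the instruction completed execution?


class ReorderBuffer:
    """Fixed-size FIFO of ROB entries; the oldest instruction sits at the head."""

    def __init__(self, size: int):
        self.size = size
        self.entries = deque()

    def is_full(self) -> bool:
        return len(self.entries) >= self.size

    def allocate(self, instr_type: str, destination: Optional[str]) -> ROBEntry:
        # Issue must stall when no ROB entry is available.
        if self.is_full():
            raise RuntimeError("no free ROB entry: issue must stall")
        entry = ROBEntry(instr_type, destination)
        self.entries.append(entry)
        return entry


rob = ReorderBuffer(size=6)
load = rob.allocate("alu", "F6")
load.value, load.ready = 1.5, True   # result arrives; entry now waits for commit
```

Because entries are appended at issue and only ever removed from the head, the queue order is the program order, which is what makes in-order commit possible.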

Four steps of execution (I)
• Issue:
– Get instruction from instruction queue
– Issue instruction if a reservation station is free and a ROB entry is available
• Execute:
– If operands available, execute
– New: for a store instruction, this step only calculates the effective address
• Write result:
– Write result to CDB
– Any reservation station or ROB entry waiting for this result is updated
– Register file not modified at this point
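The write-result step can be sketched as a broadcast on the CDB. This is an illustrative Python model under simplifying assumptions (one result per call, ROB values kept in a plain dictionary); `Station`, `broadcast` and `can_execute` are hypothetical names, not part of the lecture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Station:
    op: str
    vj: Optional[float] = None   # operand values, once known
    vk: Optional[float] = None
    qj: Optional[int] = None     # ROB entry numbers still being waited on
    qk: Optional[int] = None
    dest: Optional[int] = None   # ROB entry that will receive this station's result


def broadcast(rob_values, stations, rob_tag, value):
    """Write-result step: place (rob_tag, value) on the CDB."""
    rob_values[rob_tag] = value            # the ROB entry latches the result
    for rs in stations:                    # every waiting station picks it up
        if rs.qj == rob_tag:
            rs.vj, rs.qj = value, None
        if rs.qk == rob_tag:
            rs.vk, rs.qk = value, None
    # The architectural register file is deliberately left untouched here.


def can_execute(rs):
    return rs.qj is None and rs.qk is None


# A multiply waiting on ROB entry #2, as MUL.D waits for the load of F2.
mult1 = Station("MUL.D", vk=4.0, qj=2, dest=3)
rob_values = {}
broadcast(rob_values, [mult1], rob_tag=2, value=2.5)
```

After the broadcast the station holds both operand values and may start executing in a later cycle.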
Four steps of execution (II)
• Commit:
– Normal case (prediction was correct):
• instruction reaches head of ROB
• Update register file
• Remove entry from ROB
– Store operation:
• Instruction reaches head of ROB
• Update of memory location
– Incorrect prediction:
• When a branch instruction reaches head of ROB and the hardware indicates that the prediction was wrong, ROB is flushed and execution restarted.
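The three commit cases can be written down as one function that only ever looks at the head of the ROB. This is a hedged sketch: the dictionary layout of an entry and the name `commit` are assumptions made for illustration, and redirecting the fetch unit after a flush is not modelled.

```python
from collections import deque


def commit(rob, regs, memory):
    """Retire at most one instruction from the head of the ROB.

    Each entry is a dict with the keys type, dest, value, ready and,
    for branches, mispredicted. Returns a short status string.
    """
    if not rob or not rob[0]["ready"]:
        return "wait"                       # head not finished: nothing may commit
    head = rob.popleft()
    if head["type"] == "branch":
        if head.get("mispredicted"):
            rob.clear()                     # discard all younger, speculative work
            return "flush"
        return "branch ok"
    if head["type"] == "store":
        memory[head["dest"]] = head["value"]   # memory is updated only at commit
        return "store"
    regs[head["dest"]] = head["value"]      # ALU operation or load
    return "register"


regs, memory = {}, {}
rob = deque([
    {"type": "alu", "dest": "F0", "value": 1.0, "ready": True},
    {"type": "branch", "dest": None, "value": None, "ready": True, "mispredicted": True},
    {"type": "alu", "dest": "F2", "value": 9.0, "ready": True},   # speculative result
])
results = [commit(rob, regs, memory) for _ in range(3)]
```

The speculative write to F2 never reaches the register file, because the mispredicted branch ahead of it empties the ROB first.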

The same example as for scoreboarding
L.D F6, 34(R2)
L.D F2, 45(R3)
MUL.D F0, F2, F4
SUB.D F8, F6, F2
DIV.D F10, F0, F6
ADD.D F6, F8, F2
Following slides are based on a lecture by Jelena Mirkovic,
University of Delaware
http://www.cis.udel.edu/~sunshine/courses/F04/CIS662/class12.pdf
Assumptions:
ADD and SUB take 2 clock cycles
MUL takes 10 clock cycles
DIV takes 40 clock cycles
2 Load/Store, 3 ADD and 2 MULT reservation stations
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓
L.D F2, 45(R3)
MUL.D F0, F2, F4
SUB.D F8, F6, F2
DIV.D F10, F0, F6
ADD.D F6, F8, F2
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  Yes   Load  Regs[R2]          -                 -   -   #1    34
Load2  No
Add1   No
Add2   No
Add3   No
Mult1  No
Mult2  No
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  -    -    -    #1   -    -    -    ...  -
Busy      no   no   no   yes  no   no   no   ...  no
Time=1 Issue first load

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      Yes   L.D F6, 34(R2)      Issue               F6
2
3
4
5
6
Time=1 Issue first load
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓
L.D F2, 45(R3)      ✓
MUL.D F0, F2, F4
SUB.D F8, F6, F2
DIV.D F10, F0, F6
ADD.D F6, F8, F2
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  Yes   Load  Regs[R2]          -                 -   -   #1    +34
Load2  Yes   Load  Regs[R3]          -                 -   -   #2    45
Add1   No
Add2   No
Add3   No
Mult1  No
Mult2  No
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  -    #2   -    #1   -    -    -    ...  -
Busy      no   yes  no   yes  no   no   no   ...  no
Time=2 First load executes, second load issues

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      Yes   L.D F6, 34(R2)      Execute             F6
2      Yes   L.D F2, 45(R3)      Issue               F2
3
4
5
6
Time=2
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓
L.D F2, 45(R3)      ✓      ✓
MUL.D F0, F2, F4    ✓
SUB.D F8, F6, F2
DIV.D F10, F0, F6
ADD.D F6, F8, F2
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  Yes   Load  -                 -                 -   -   #1    Regs[R2]+34
Load2  Yes   Load  Regs[R3]          -                 -   -   #2    +45
Add1   No
Add2   No
Add3   No
Mult1  Yes   Mult  -                 Regs[F4]          #2  -   #3    -
Mult2  No
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  #3   #2   -    #1   -    -    -    ...  -
Busy      yes  yes  no   yes  no   no   no   ...  no
Time=3 First load executes, second load executes, MUL is issued

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      Yes   L.D F6, 34(R2)      Execute             F6
2      Yes   L.D F2, 45(R3)      Execute             F2
3      Yes   MUL.D F0, F2, F4    Issue               F0
4
5
6
Time=3
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓
L.D F2, 45(R3)      ✓      ✓
MUL.D F0, F2, F4    ✓
SUB.D F8, F6, F2    ✓
DIV.D F10, F0, F6
ADD.D F6, F8, F2
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  Yes   Load  -                 -                 -   -   #2    Regs[R3]+45
Add1   Yes   Sub   Mem[34+Regs[R2]]  -                 -   #2  #4    -
Add2   No
Add3   No
Mult1  Yes   Mult  -                 Regs[F4]          #2  -   #3    -
Mult2  No
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  #3   #2   -    #1   #4   -    -    ...  -
Busy      yes  yes  no   yes  yes  no   no   ...  no
Time=4 First load writes result, second load executes, MUL stalled, SUB issued

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      Yes   L.D F6, 34(R2)      Write result        F6           Mem[34+Regs[R2]]
2      Yes   L.D F2, 45(R3)      Execute             F2
3      Yes   MUL.D F0, F2, F4    Stalled in issue    F0
4      Yes   SUB.D F8, F6, F2    Issue               F8
5
6
Time=4
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓             ✓
L.D F2, 45(R3)      ✓      ✓        ✓
MUL.D F0, F2, F4    ✓
SUB.D F8, F6, F2    ✓
DIV.D F10, F0, F6   ✓
ADD.D F6, F8, F2
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  No
Add1   Yes   Sub   Mem[34+Regs[R2]]  Mem[45+Regs[R3]]  -   -   #4    -
Add2   No
Add3   No
Mult1  Yes   Mult  Mem[45+Regs[R3]]  Regs[F4]          -   -   #3    -
Mult2  Yes   Div   -                 Mem[34+Regs[R2]]  #3  -   #5    -
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  #3   #2   -    -    #4   #5   -    ...  -
Busy      yes  yes  no   no   yes  yes  no   ...  no
Time=5 First load commits, second load writes result, MUL and SUB stalled, DIV issued

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      No    L.D F6, 34(R2)      Commit              F6           Mem[34+Regs[R2]]
2      Yes   L.D F2, 45(R3)      Write result        F2           Mem[45+Regs[R3]]
3      Yes   MUL.D F0, F2, F4    Stalled in issue    F0
4      Yes   SUB.D F8, F6, F2    Stalled in issue    F8
5      Yes   DIV.D F10, F0, F6   Issue               F10
6
Time=5
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓             ✓
L.D F2, 45(R3)      ✓      ✓        ✓             ✓
MUL.D F0, F2, F4    ✓      ✓
SUB.D F8, F6, F2    ✓      ✓
DIV.D F10, F0, F6   ✓
ADD.D F6, F8, F2    ✓
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  No
Add1   Yes   Sub   Mem[34+Regs[R2]]  Mem[45+Regs[R3]]  -   -   #4    -
Add2   Yes   Add   -                 Mem[45+Regs[R3]]  #4  -   #6    -
Add3   No
Mult1  Yes   Mult  Mem[45+Regs[R3]]  Regs[F4]          -   -   #3    -
Mult2  Yes   Div   -                 Mem[34+Regs[R2]]  #3  -   #5    -
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  #3   -    -    #6   #4   #5   -    ...  -
Busy      yes  no   no   yes  yes  yes  no   ...  no
Time=6 Second load commits, MUL (1/10), SUB (1/2), DIV stalled, ADD issued

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      No    L.D F6, 34(R2)      Commit              F6           Mem[34+Regs[R2]]
2      No    L.D F2, 45(R3)      Commit              F2           Mem[45+Regs[R3]]
3      Yes   MUL.D F0, F2, F4    Execute             F0
4      Yes   SUB.D F8, F6, F2    Execute             F8
5      Yes   DIV.D F10, F0, F6   Stalled in issue    F10
6      Yes   ADD.D F6, F8, F2    Issue               F6
Time=6
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓             ✓
L.D F2, 45(R3)      ✓      ✓        ✓             ✓
MUL.D F0, F2, F4    ✓      ✓
SUB.D F8, F6, F2    ✓      ✓
DIV.D F10, F0, F6   ✓
ADD.D F6, F8, F2    ✓
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  No
Add1   Yes   Sub   Mem[34+Regs[R2]]  Mem[45+Regs[R3]]  -   -   #4    -
Add2   Yes   Add   -                 Mem[45+Regs[R3]]  #4  -   #6    -
Add3   No
Mult1  Yes   Mult  Mem[45+Regs[R3]]  Regs[F4]          -   -   #3    -
Mult2  Yes   Div   -                 Mem[34+Regs[R2]]  #3  -   #5    -
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  #3   -    -    #6   #4   #5   -    ...  -
Busy      yes  no   no   yes  yes  yes  no   ...  no
Time=7 MUL (2/10), SUB (2/2), DIV stalled, ADD stalled

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      No    L.D F6, 34(R2)      Commit              F6           Mem[34+Regs[R2]]
2      No    L.D F2, 45(R3)      Commit              F2           Mem[45+Regs[R3]]
3      Yes   MUL.D F0, F2, F4    Execute             F0
4      Yes   SUB.D F8, F6, F2    Execute             F8
5      Yes   DIV.D F10, F0, F6   Stalled in issue    F10
6      Yes   ADD.D F6, F8, F2    Stalled in issue    F6
Time=7
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓             ✓
L.D F2, 45(R3)      ✓      ✓        ✓             ✓
MUL.D F0, F2, F4    ✓      ✓
SUB.D F8, F6, F2    ✓      ✓        ✓
DIV.D F10, F0, F6   ✓
ADD.D F6, F8, F2    ✓
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  No
Add1   No
Add2   Yes   Add   X                 Mem[45+Regs[R3]]  -   -   #6    -
Add3   No
Mult1  Yes   Mult  Mem[45+Regs[R3]]  Regs[F4]          -   -   #3    -
Mult2  Yes   Div   -                 Mem[34+Regs[R2]]  #3  -   #5    -
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  #3   -    -    #6   #4   #5   -    ...  -
Busy      yes  no   no   yes  yes  yes  no   ...  no
Time=8 MUL (3/10), SUB writes result, DIV stalled, ADD stalled

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      No    L.D F6, 34(R2)      Commit              F6           Mem[34+Regs[R2]]
2      No    L.D F2, 45(R3)      Commit              F2           Mem[45+Regs[R3]]
3      Yes   MUL.D F0, F2, F4    Execute             F0
4      Yes   SUB.D F8, F6, F2    Write result        F8           X
5      Yes   DIV.D F10, F0, F6   Stalled in issue    F10
6      Yes   ADD.D F6, F8, F2    Stalled in issue    F6
Time=8
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓             ✓
L.D F2, 45(R3)      ✓      ✓        ✓             ✓
MUL.D F0, F2, F4    ✓      ✓
SUB.D F8, F6, F2    ✓      ✓        ✓
DIV.D F10, F0, F6   ✓
ADD.D F6, F8, F2    ✓      ✓
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  No
Add1   No
Add2   Yes   Add   X                 Mem[45+Regs[R3]]  -   -   #6    -
Add3   No
Mult1  Yes   Mult  Mem[45+Regs[R3]]  Regs[F4]          -   -   #3    -
Mult2  Yes   Div   -                 Mem[34+Regs[R2]]  #3  -   #5    -
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  #3   -    -    #6   #4   #5   -    ...  -
Busy      yes  no   no   yes  yes  yes  no   ...  no
Time=9 MUL (4/10), DIV stalled, ADD executes (1/2)

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      No    L.D F6, 34(R2)      Commit              F6           Mem[34+Regs[R2]]
2      No    L.D F2, 45(R3)      Commit              F2           Mem[45+Regs[R3]]
3      Yes   MUL.D F0, F2, F4    Execute             F0
4      Yes   SUB.D F8, F6, F2    Waiting to commit   F8           X
5      Yes   DIV.D F10, F0, F6   Stalled in issue    F10
6      Yes   ADD.D F6, F8, F2    Execute             F6
Time=9
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓             ✓
L.D F2, 45(R3)      ✓      ✓        ✓             ✓
MUL.D F0, F2, F4    ✓      ✓
SUB.D F8, F6, F2    ✓      ✓        ✓
DIV.D F10, F0, F6   ✓
ADD.D F6, F8, F2    ✓      ✓        ✓
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  No
Add1   No
Add2   No
Add3   No
Mult1  Yes   Mult  Mem[45+Regs[R3]]  Regs[F4]          -   -   #3    -
Mult2  Yes   Div   -                 Mem[34+Regs[R2]]  #3  -   #5    -
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  #3   -    -    #6   #4   #5   -    ...  -
Busy      yes  no   no   yes  yes  yes  no   ...  no
Time=11 MUL (6/10), DIV stalled, ADD writes result

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      No    L.D F6, 34(R2)      Commit              F6           Mem[34+Regs[R2]]
2      No    L.D F2, 45(R3)      Commit              F2           Mem[45+Regs[R3]]
3      Yes   MUL.D F0, F2, F4    Execute             F0
4      Yes   SUB.D F8, F6, F2    Waiting to commit   F8           X
5      Yes   DIV.D F10, F0, F6   Stalled in issue    F10
6      Yes   ADD.D F6, F8, F2    Write result        F6           Y
Time=11
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓             ✓
L.D F2, 45(R3)      ✓      ✓        ✓             ✓
MUL.D F0, F2, F4    ✓      ✓
SUB.D F8, F6, F2    ✓      ✓        ✓
DIV.D F10, F0, F6   ✓
ADD.D F6, F8, F2    ✓      ✓        ✓
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  No
Add1   No
Add2   No
Add3   No
Mult1  Yes   Mult  Mem[45+Regs[R3]]  Regs[F4]          -   -   #3    -
Mult2  Yes   Div   -                 Mem[34+Regs[R2]]  #3  -   #5    -
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  #3   -    -    #6   #4   #5   -    ...  -
Busy      yes  no   no   yes  yes  yes  no   ...  no
Time=12 MUL (7/10), DIV stalled

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      No    L.D F6, 34(R2)      Commit              F6           Mem[34+Regs[R2]]
2      No    L.D F2, 45(R3)      Commit              F2           Mem[45+Regs[R3]]
3      Yes   MUL.D F0, F2, F4    Execute             F0
4      Yes   SUB.D F8, F6, F2    Waiting to commit   F8           X
5      Yes   DIV.D F10, F0, F6   Stalled in issue    F10
6      Yes   ADD.D F6, F8, F2    Waiting to commit   F6           Y
Time=12
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓             ✓
L.D F2, 45(R3)      ✓      ✓        ✓             ✓
MUL.D F0, F2, F4    ✓      ✓        ✓
SUB.D F8, F6, F2    ✓      ✓        ✓
DIV.D F10, F0, F6   ✓
ADD.D F6, F8, F2    ✓      ✓        ✓
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  No
Add1   No
Add2   No
Add3   No
Mult1  No
Mult2  Yes   Div   Z                 Mem[34+Regs[R2]]  -   -   #5    -
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  #3   -    -    #6   #4   #5   -    ...  -
Busy      yes  no   no   yes  yes  yes  no   ...  no
Time=16 MUL writes result, DIV stalled

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      No    L.D F6, 34(R2)      Commit              F6           Mem[34+Regs[R2]]
2      No    L.D F2, 45(R3)      Commit              F2           Mem[45+Regs[R3]]
3      Yes   MUL.D F0, F2, F4    Write result        F0           Z
4      Yes   SUB.D F8, F6, F2    Waiting to commit   F8           X
5      Yes   DIV.D F10, F0, F6   Stalled in issue    F10
6      Yes   ADD.D F6, F8, F2    Waiting to commit   F6           Y
Time=16
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓             ✓
L.D F2, 45(R3)      ✓      ✓        ✓             ✓
MUL.D F0, F2, F4    ✓      ✓        ✓             ✓
SUB.D F8, F6, F2    ✓      ✓        ✓
DIV.D F10, F0, F6   ✓      ✓
ADD.D F6, F8, F2    ✓      ✓        ✓
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  No
Add1   No
Add2   No
Add3   No
Mult1  No
Mult2  Yes   Div   Z                 Mem[34+Regs[R2]]  -   -   #5    -
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  -    -    -    #6   #4   #5   -    ...  -
Busy      no   no   no   yes  yes  yes  no   ...  no
Time=17 MUL commits, DIV executes (1/40)

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      No    L.D F6, 34(R2)      Commit              F6           Mem[34+Regs[R2]]
2      No    L.D F2, 45(R3)      Commit              F2           Mem[45+Regs[R3]]
3      No    MUL.D F0, F2, F4    Commit              F0           Z
4      Yes   SUB.D F8, F6, F2    Waiting to commit   F8           X
5      Yes   DIV.D F10, F0, F6   Execute             F10
6      Yes   ADD.D F6, F8, F2    Waiting to commit   F6           Y
Time=17
Instruction status
Instruction         Issue  Execute  Write result  Commit
L.D F6, 34(R2)      ✓      ✓        ✓             ✓
L.D F2, 45(R3)      ✓      ✓        ✓             ✓
MUL.D F0, F2, F4    ✓      ✓        ✓             ✓
SUB.D F8, F6, F2    ✓      ✓        ✓             ✓
DIV.D F10, F0, F6   ✓      ✓
ADD.D F6, F8, F2    ✓      ✓        ✓
Reservation station
Name   Busy  Op    Vj                Vk                Qj  Qk  Dest  A
Load1  No
Load2  No
Add1   No
Add2   No
Add3   No
Mult1  No
Mult2  Yes   Div   Z                 Mem[34+Regs[R2]]  -   -   #5    -
Register result status
          F0   F2   F4   F6   F8   F10  F12  ...  F30
Reorder#  -    -    -    #6   -    #5   -    ...  -
Busy      no   no   no   yes  no   yes  no   ...  no
Time=18 SUB commits, DIV executes (2/40)

Reorder buffer
Entry  Busy  Instruction         State               Destination  Value
1      No    L.D F6, 34(R2)      Commit              F6           Mem[34+Regs[R2]]
2      No    L.D F2, 45(R3)      Commit              F2           Mem[45+Regs[R3]]
3      No    MUL.D F0, F2, F4    Commit              F0           Z
4      No    SUB.D F8, F6, F2    Commit              F8           X
5      Yes   DIV.D F10, F0, F6   Execute             F10
6      Yes   ADD.D F6, F8, F2    Waiting to commit   F6           Y
Time=18
… and so on…
• Time 57: DIV writes result
• Time 58: DIV commits
• Time 59: Add commits
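The cycle numbers of the whole trace can be recomputed with a few lines of arithmetic. The sketch below is a simplified model, not the simulator behind the slides. It assumes one instruction issues per cycle, execution starts the cycle after the last operand is written, and at most one instruction commits per cycle. The load latency of 2 cycles is read off the trace rather than taken from the stated assumptions, and contention for the CDB and for reservation stations is ignored. The function name `schedule` is hypothetical.

```python
LATENCY = {"L.D": 2, "ADD.D": 2, "SUB.D": 2, "MUL.D": 10, "DIV.D": 40}

# (operation, destination register, source FP registers) in program order
PROGRAM = [
    ("L.D", "F6", []),
    ("L.D", "F2", []),
    ("MUL.D", "F0", ["F2", "F4"]),
    ("SUB.D", "F8", ["F6", "F2"]),
    ("DIV.D", "F10", ["F0", "F6"]),
    ("ADD.D", "F6", ["F8", "F2"]),
]


def schedule(program):
    """Return (op, dest, issue, first_exec, write, commit) cycles per instruction."""
    written_at = {}        # register -> write-result cycle of its latest producer
    prev_commit = 0
    rows = []
    for i, (op, dest, srcs) in enumerate(program):
        issue = i + 1                                    # one issue per cycle
        operands = max((written_at.get(s, 0) for s in srcs), default=0)
        first_exec = max(issue, operands) + 1            # cycle after last operand
        write = first_exec + LATENCY[op]                 # cycle after last exec cycle
        commit = max(write, prev_commit) + 1             # in order, one per cycle
        written_at[dest] = write                         # renaming: later readers see this producer
        prev_commit = commit
        rows.append((op, dest, issue, first_exec, write, commit))
    return rows


for row in schedule(PROGRAM):
    print(row)
```

Under these assumptions DIV.D starts executing in cycle 17, writes its result in cycle 57 and commits in cycle 58, and ADD.D, although finished since cycle 11, has to wait until cycle 59 to commit.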

Multiple Issue
• Take advantage of the fact that we have multiple functional units
– further decrease ideal CPI (<1)
• Two flavors of multiple-issue processors:
– Superscalar
– VLIW (Very long instruction Word)
                           Issue structure  Hazard detection  Scheduling                Distinguishing characteristic     Examples
Superscalar (static)       Dynamic          Hardware          Static                    In-order execution                Sun UltraSPARC II/III
Superscalar (dynamic)      Dynamic          Hardware          Dynamic                   Limited out-of-order exec.        IBM Power2
Superscalar (speculative)  Dynamic          Hardware          Dynamic with speculation  Out-of-order exec. with spec.     Pentium III/4, IBM RS64III, B
VLIW                       Static           Software          Static                    No hazards between issue packets  i860, Trimedia
EPIC                       Mostly static    Mostly software   Mostly static             Dependencies marked by compiler   Itanium
Superscalar architectures
• Issue a varying number of instructions per cycle
– Statically scheduled using compiler techniques
– Dynamically scheduled (e.g. using Tomasulo’s algorithm)
• Why a varying number of instructions?
– Statically scheduled -> no out-of-order execution
– Can check for hazards at issue time
– Issue logic will not issue instructions which cause a hazard

Some details
• Issue unit receives between one and n instructions from the instruction fetch unit (n being typically 4 or 8)
→ issue packet
• Issue unit examines each instruction in the issue packet in program order
• If an instruction causes a structural hazard or a data hazard, it will not be issued
• Since checking for structural and data hazards is a complex operation, the issue unit is implemented as a pipeline, e.g.
– First stage checks for hazards within the issue packet
– Second stage checks for hazards with already issued instructions
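The in-order check of an issue packet can be sketched as follows. This is an illustrative Python model under stated assumptions: each instruction is a hypothetical `(unit, dest, sources)` tuple, a functional unit accepts one instruction per cycle, and issue stops at the first instruction with a hazard so that later instructions in the packet wait as well. Both checks (within the packet and against earlier instructions) are folded into one loop instead of two pipeline stages.

```python
def issue_packet(packet, busy_units, pending_writes):
    """Return the instructions of `packet` that can issue this cycle.

    packet         : list of (unit, dest, sources) tuples in program order
    busy_units     : functional units occupied by already issued instructions
    pending_writes : registers still being produced by already issued instructions
    """
    issued = []
    units = set(busy_units)
    pending = set(pending_writes)
    for unit, dest, sources in packet:
        structural = unit in units
        data = dest in pending or any(s in pending for s in sources)
        if structural or data:
            break                      # in-order issue: stop at the first hazard
        issued.append((unit, dest, sources))
        units.add(unit)                # visible to later instructions in the packet
        pending.add(dest)
    return issued


packet = [
    ("int", "R1", ["R2"]),
    ("fp", "F0", ["F2"]),
    ("int", "R3", ["R1"]),             # needs the integer unit again and reads R1
    ("mem", "R4", ["R5"]),
]
issued = issue_packet(packet, busy_units=set(), pending_writes=set())
```

Only the first two instructions issue here. The third one hits both a structural and a data hazard, and the fourth one is held back to preserve in-order issue, which is why the number of instructions issued per cycle varies.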
Dynamically scheduled superscalar MIPS
• Extend Tomasulo’s algorithm to handle multiple issues
per cycle
– Instructions must be issued to the reservation stations in
program order to maintain program semantics
• Note: we do not cover the details of how multiple
issue works with Tomasulo’s algorithm

VLIW processors
• Issue a fixed number of instructions
– As one large instruction or
– As a fixed instruction packet
• Parallelism among instructions has to be explicitly
indicated by the instructions
– Statically scheduled by compiler
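What the compiler has to guarantee can be illustrated with a deliberately simple packing routine. This is a toy sketch, not a real VLIW scheduler. It keeps program order, ignores operation latencies and functional-unit types, and only ensures that no instruction reads or writes a register written earlier in the same packet. The name `pack_bundles` and the `(dest, sources)` tuple format are hypothetical.

```python
def pack_bundles(instructions, width):
    """Greedily pack (dest, sources) instructions into fixed-width packets.

    An instruction joins the current packet only if it neither reads nor
    writes a register written earlier in that packet. Unused slots are
    padded with "nop" so that every packet has exactly `width` slots.
    """
    bundles, current, written = [], [], set()
    for dest, sources in instructions:
        conflict = dest in written or any(s in written for s in sources)
        if conflict or len(current) == width:
            bundles.append(current + ["nop"] * (width - len(current)))
            current, written = [], set()
        current.append((dest, sources))
        written.add(dest)
    if current:
        bundles.append(current + ["nop"] * (width - len(current)))
    return bundles


# Register dependences of the six-instruction example used earlier.
code = [
    ("F6", ["R2"]),          # L.D   F6, 34(R2)
    ("F2", ["R3"]),          # L.D   F2, 45(R3)
    ("F0", ["F2", "F4"]),    # MUL.D F0, F2, F4
    ("F8", ["F6", "F2"]),    # SUB.D F8, F6, F2
    ("F10", ["F0", "F6"]),   # DIV.D F10, F0, F6
    ("F6", ["F8", "F2"]),    # ADD.D F6, F8, F2
]
bundles = pack_bundles(code, width=3)
```

With three slots per packet the six instructions end up in three packets of two instructions plus one `nop` each, which shows how dependences between neighbouring instructions leave issue slots empty.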