![Page 1: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/1.jpg)
1
RISC Pipeline
Han Wang
CS3410, Spring 2010
Computer Science
Cornell University
See: P&H Chapter 4.6
![Page 2: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/2.jpg)
2
Homework 2
[Figure: chart with an axis running from 0 to 9]
![Page 3: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/3.jpg)
3
Announcements
- Homework 2 due tomorrow midnight
- Programming Assignment 1 release tomorrow
- Pipelined MIPS processor (topic of today)
- Subset of MIPS ISA
- Feedback
- We want to hear from you!
- Content?
![Page 4: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/4.jpg)
4
Absolute Jump
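[Figure: single-cycle datapath with PC, Prog. Mem, Reg. File, ALU, Data Mem, control, extend (ext), branch compare (cmp, =?), offset adder (+), jump target (tgt) concatenation (||), and a +4 adder for the link address. Annotation: "Could have used ALU for link add".]

op    mnemonic     description
0x3   JAL target   r31 = PC+8 (+8 due to branch delay slot)
                   PC = (PC+4)[31..28] || (target << 2)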
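The JAL description above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 32-bit byte addresses and a 26-bit target field; the function name `jal` and the example addresses are illustrative and do not come from the slides.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits

def jal(pc, target):
    """Return (new_pc, link) for a JAL located at address pc.

    target is the 26-bit field from the instruction word.
    """
    link = (pc + 8) & MASK32                         # r31 = PC+8 (skips the delay slot)
    upper = (pc + 4) & 0xF0000000                    # (PC+4)[31..28]
    new_pc = upper | ((target & 0x03FFFFFF) << 2)    # || (target << 2)
    return new_pc, link

# Illustrative values only
new_pc, link = jal(0x00400000, 0x0100040)
print(hex(new_pc), hex(link))
```

The top four bits come from PC+4 rather than from the instruction, so an absolute jump can only reach addresses in the same 256 MB region as the delay slot.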
![Page 5: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/5.jpg)
5
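Review: Single cycle processor
[Figure: "A Processor". Single-cycle datapath with PC, instruction memory (inst), register file, control, extend (imm), alu, data memory (addr, din, dout), branch compare (cmp, =?), offset and target logic, +4 adders, and the new pc selection.]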
![Page 6: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/6.jpg)
6
Single Cycle Processor
Advantages
• A single cycle per instruction makes logic and clocking simple
Disadvantages
• Since instructions take different times to finish, memory and functional units are not efficiently utilized
• Cycle time is the longest delay
  – Load instruction
• Best possible CPI is 1
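To make the "cycle time is the longest delay" point concrete, here is a small sketch. The per-unit delays and the table of which units each instruction class uses are assumptions chosen for illustration; they are not taken from the slides.

```python
# Hypothetical delays in picoseconds for each datapath step (illustrative only).
STEP_DELAY_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Which steps each instruction class actually needs (simplified assumption).
USES = {
    "lw":  ["IF", "ID", "EX", "MEM", "WB"],
    "sw":  ["IF", "ID", "EX", "MEM"],
    "add": ["IF", "ID", "EX", "WB"],
    "beq": ["IF", "ID", "EX"],
}

def path_delay(op):
    """Time the instruction really needs to flow through the datapath."""
    return sum(STEP_DELAY_PS[s] for s in USES[op])

def single_cycle_period():
    """The clock must fit the slowest instruction, here the load."""
    return max(path_delay(op) for op in USES)

period = single_cycle_period()
for op in USES:
    print(op, path_delay(op), "ps needed,", period - path_delay(op), "ps idle")
```

Under these assumed numbers every instruction pays for the load's full path, which is the inefficiency the slide points at.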
![Page 7: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/7.jpg)
7
Pipeline Hazards
[Figure: timeline with marks at 0h, 1h, 2h, 3h, …]
![Page 8: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/8.jpg)
8
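A Processor
[Figure: the single-cycle datapath divided into five regions: Instruction Fetch, Instruction Decode, Execute, Memory, and Write-Back. Components: PC, +4 adder, instruction memory (inst), register file, control, extend (imm), alu, compute jump/branch targets, data memory (addr, din, dout), and the new pc path.]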
![Page 9: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/9.jpg)
9
Basic Pipeline
Five stage “RISC” load-store architecture
1. Instruction Fetch (IF)
   – get instruction from memory, increment PC
2. Instruction Decode (ID)
   – translate opcode into control signals and read registers
3. Execute (EX)
   – perform ALU operation, compute jump/branch targets
4. Memory (MEM)
   – access memory if needed
5. Writeback (WB)
   – update register file
Slides thanks to Sally McKee & Kavita Bala
![Page 10: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/10.jpg)
10
Pipelined Implementation
Break instructions across multiple clock cycles (five, in this case)
Design a separate stage for the execution performed during each clock cycle
Add pipeline registers to isolate signals between different stages
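The role of the pipeline registers can be sketched as a list of four latches that all update on the same clock edge. This is a simplified model, assuming no stalls and treating each latch as holding just an instruction label; the function name `tick` is hypothetical.

```python
def tick(latches, fetched):
    """Advance one clock edge.

    latches: contents of [IF/ID, ID/EX, EX/MEM, MEM/WB] before the edge.
    fetched: what the IF stage produced during this cycle.
    Returns (new latches, the entry that the WB stage consumed this cycle).
    """
    in_write_back = latches[3]                 # WB reads MEM/WB during the cycle
    new_latches = [fetched] + latches[:3]      # every stage hands its work to the next
    return new_latches, in_write_back

latches = ["nop"] * 4
for inst in ["add", "nand", "lw", "add", "sw"]:
    latches, wb = tick(latches, inst)
    print(latches, "WB:", wb)
```

Because each stage only reads the latch on its left and writes the latch on its right, signals from different instructions never mix within a cycle.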
![Page 11: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/11.jpg)
11
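Pipelined Processor
[Figure: the datapath split into Instruction Fetch, Instruction Decode, Execute, Memory, and Write-Back stages by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers. Components: PC, +4 adder, instruction memory (inst), register file, control, extend (imm), alu, compute jump/branch targets, data memory (addr, din, dout), and the new pc path. The pipeline registers carry ctrl along with the data values A, B, D, and M.]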
![Page 12: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/12.jpg)
12
IF
Stage 1: Instruction Fetch
Fetch a new instruction every cycle
• Current PC is index to instruction memory
• Increment the PC at end of cycle (assume no branches for now)
Write values of interest to pipeline register (IF/ID)
• Instruction bits (for later decoding)
• PC+4 (for later computing branch targets)
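A minimal sketch of the fetch stage as described above, assuming no branches, a byte-addressed PC, and instruction memory modeled as a Python list of 32-bit words. The dictionary keys `inst` and `pc_plus_4` are illustrative names for the IF/ID fields.

```python
def fetch(pc, imem):
    """Return (IF/ID contents, next PC) for one cycle, assuming no branches."""
    inst = imem[pc // 4]          # current PC indexes instruction memory
    pc_plus_4 = pc + 4            # incremented PC, also saved for branch targets
    if_id = {"inst": inst, "pc_plus_4": pc_plus_4}
    return if_id, pc_plus_4

# Two example instruction words: add r3, r1, r2 and lw r4, 20(r2)
imem = [0x00221820, 0x8C440014]
pc = 0
if_id, pc = fetch(pc, imem)
print(hex(if_id["inst"]), if_id["pc_plus_4"], pc)
```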
![Page 13: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/13.jpg)
13
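IF
[Figure: IF stage datapath. The PC supplies addr to instruction memory (mc = 00, read word); inst and PC+4 are written to the IF/ID register (WE = 1), which feeds the rest of the pipeline. A mux controlled by pcsel picks the new pc from PC+4, pcreg, pcrel, and pcabs.]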
![Page 14: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/14.jpg)
14
ID
Stage 2: Instruction Decode
On every cycle:
• Read IF/ID pipeline register to get instruction bits
• Decode instruction, generate control signals
• Read from register file
Write values of interest to pipeline register (ID/EX)
• Control information, Rd index, immediates, offsets, …
• Contents of Ra, Rb
• PC+4 (for computing branch targets later)
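The decode step can be sketched with the standard MIPS field positions (opcode in bits 31..26, source registers in bits 25..21 and 20..16, destination in bits 15..11, immediate in bits 15..0). The dictionary layout for the ID/EX register is an assumption for illustration, and control-signal generation is reduced to passing the opcode along.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def decode(if_id, regs):
    """Build ID/EX contents from IF/ID contents and the register file."""
    inst = if_id["inst"]
    ra = (inst >> 21) & 0x1F
    rb = (inst >> 16) & 0x1F
    return {
        "op": (inst >> 26) & 0x3F,      # real control logic would expand this
        "A": regs[ra],                  # contents of Ra
        "B": regs[rb],                  # contents of Rb
        "rd": (inst >> 11) & 0x1F,      # R-type destination; I-type uses the Rb field
        "imm": sign_extend16(inst),
        "pc_plus_4": if_id["pc_plus_4"],
    }

regs = [0] * 32
regs[1], regs[2] = 36, 9
id_ex = decode({"inst": 0x00221820, "pc_plus_4": 4}, regs)  # add r3, r1, r2
print(id_ex)
```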
![Page 15: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/15.jpg)
15
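ID
[Figure: ID stage datapath. From IF/ID (written by Stage 1: Instruction Fetch), inst goes to decode, which produces ctrl; Ra and Rb index the register file, and extend produces imm. The register file outputs A and B, together with imm, ctrl, and PC+4, are written to ID/EX for the rest of the pipeline. The register file write port (WE, Rd, D) is driven by the result and dest signals.]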
![Page 16: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/16.jpg)
16
EX
Stage 3: Execute
On every cycle:
• Read ID/EX pipeline register to get values and control bits
• Perform ALU operation
• Compute targets (PC+4+offset, etc.) in case this is a branch
• Decide if jump/branch should be taken
Write values of interest to pipeline register (EX/MEM)
• Control information, Rd index, …
• Result of ALU operation
• Value in case this is a memory store instruction
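A rough sketch of the execute stage follows. The control fields (`alu_op`, `alu_src_imm`, `is_branch`) are hypothetical names for decoded control bits, the branch test is limited to an equality compare, and the offset is assumed to be a word offset that is shifted left by 2 as in MIPS.

```python
ALU_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "nand": lambda a, b: ~(a & b),
}

def execute(id_ex):
    """Build EX/MEM contents from ID/EX contents."""
    a, b, imm = id_ex["A"], id_ex["B"], id_ex["imm"]
    operand2 = imm if id_ex["alu_src_imm"] else b
    return {
        "alu_result": ALU_OPS[id_ex["alu_op"]](a, operand2),
        "store_value": b,                                    # needed if this is a store
        "rd": id_ex["rd"],
        "branch_target": id_ex["pc_plus_4"] + (imm << 2),    # PC+4+offset
        "take_branch": id_ex["is_branch"] and a == b,
    }

# Address calculation for a load with base value 9 and offset 20
ex_mem = execute({"A": 9, "B": 0, "imm": 20, "alu_op": "add", "alu_src_imm": True,
                  "is_branch": False, "rd": 4, "pc_plus_4": 12})
print(ex_mem["alu_result"])
```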
![Page 17: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/17.jpg)
17
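EX
[Figure: EX stage datapath. From ID/EX (written by Stage 2: Instruction Decode), A, B, and imm feed the alu; an adder (+) and a concatenation (||) use PC+4 and imm to form jump/branch targets; branch? logic decides whether to redirect the PC. The signals pcrel, pcabs, pcreg, and pcsel return to the fetch stage. ctrl, the ALU result D, and B are written to EX/MEM for the rest of the pipeline.]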
![Page 18: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/18.jpg)
18
MEM
Stage 4: Memory
On every cycle:
• Read EX/MEM pipeline register to get values and control bits
• Perform memory load/store if needed
  – address is ALU result
Write values of interest to pipeline register (MEM/WB)
• Control information, Rd index, …
• Result of memory operation
• Pass result of ALU operation
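The memory stage can be sketched as below, with data memory modeled as a dictionary from address to value. The control fields `mem_read` and `mem_write` and the key names are assumptions for illustration.

```python
def memory_stage(ex_mem, dmem):
    """Build MEM/WB contents from EX/MEM contents, touching memory if needed."""
    addr = ex_mem["alu_result"]            # address is the ALU result
    mem_data = None
    if ex_mem["mem_read"]:
        mem_data = dmem[addr]              # load
    elif ex_mem["mem_write"]:
        dmem[addr] = ex_mem["store_value"] # store
    return {
        "alu_result": ex_mem["alu_result"],  # passed through for non-loads
        "mem_data": mem_data,
        "rd": ex_mem["rd"],
    }

dmem = {29: 99}
mem_wb = memory_stage({"alu_result": 29, "store_value": 0, "rd": 4,
                       "mem_read": True, "mem_write": False}, dmem)
print(mem_wb)
```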
![Page 19: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/19.jpg)
19
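MEM
[Figure: MEM stage datapath. From EX/MEM (written by Stage 3: Execute), D supplies the memory addr and B supplies din; mc controls the memory operation. ctrl, D, and the memory output M (dout) are written to MEM/WB for the rest of the pipeline.]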
![Page 20: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/20.jpg)
20
WB
Stage 5: Write-back
On every cycle:
• Read MEM/WB pipeline register to get values and control bits
• Select value and write to register file
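A small sketch of the write-back selection, assuming control bits named `reg_write` and `mem_to_reg` (hypothetical names) and a register file modeled as a list in which register 0 always reads as zero, as in MIPS.

```python
def write_back(mem_wb, regs):
    """Select the value to write and update the register file in place."""
    if not mem_wb["reg_write"]:
        return regs                          # e.g. stores and branches write nothing
    value = mem_wb["mem_data"] if mem_wb["mem_to_reg"] else mem_wb["alu_result"]
    if mem_wb["rd"] != 0:                    # register 0 stays zero
        regs[mem_wb["rd"]] = value
    return regs

regs = [0, 36, 9, 12, 18, 7, 41, 22]
write_back({"reg_write": True, "mem_to_reg": False,
            "alu_result": 45, "mem_data": None, "rd": 3}, regs)
print(regs)
```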
![Page 21: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/21.jpg)
21
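WB
[Figure: WB stage datapath. From MEM/WB (written by Stage 4: Memory), ctrl selects between M and D to produce result, which is sent with dest to the register file.]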
![Page 22: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/22.jpg)
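22
[Figure: complete five-stage pipelined datapath. PC, +4 adder, inst mem, register file (Ra, Rb, Rd, A, B, D), and data mem (addr, din, dout) are separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers. Values carried in the pipeline registers include inst, PC+4, OP, A, B, imm, Rd, D, and M.]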
![Page 23: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/23.jpg)
23
Example
add r3, r1, r2;
nand r6, r4, r5;
lw r4, 20(r2);
add r5, r2, r5;
sw r7, 12(r3);
![Page 24: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/24.jpg)
24
[Figure: the five-stage pipelined datapath (IF/ID, ID/EX, EX/MEM, MEM/WB) stepping through the example program, 0: add, 1: nand, 2: lw, 3: add, 4: sw.]
Initial registers: r0=0, r1=36, r2=9, r3=12, r4=18, r5=7, r6=41, r7=22
Stage occupancy per clock cycle:
Cycle  IF                ID                EX                MEM               WB
1      add r3, r1, r2
2      nand r6, r4, r5   add r3, r1, r2
3      lw r4, 20(r2)     nand r6, r4, r5   add r3, r1, r2
4      add r5, r2, r5    lw r4, 20(r2)     nand r6, r4, r5   add r3, r1, r2
5      sw r7, 12(r3)     add r5, r2, r5    lw r4, 20(r2)     nand r6, r4, r5   add r3, r1, r2
6                        sw r7, 12(r3)     add r5, r2, r5    lw r4, 20(r2)     nand r6, r4, r5
7                                          sw r7, 12(r3)     add r5, r2, r5    lw r4, 20(r2)
8                                                            sw r7, 12(r3)     add r5, r2, r5
9                                                                              sw r7, 12(r3)
![Page 25: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/25.jpg)
25
Time Graphs
Clock cycle  1    2    3    4    5    6    7    8    9
add          IF   ID   EX   MEM  WB
nand              IF   ID   EX   MEM  WB
lw                     IF   ID   EX   MEM  WB
add                         IF   ID   EX   MEM  WB
sw                               IF   ID   EX   MEM  WB

Latency:
Throughput:
Concurrency:
CPI =
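The time graph can be generated and measured programmatically. This is a sketch assuming an ideal five-stage pipeline with no stalls; the helper names are illustrative, and the printed figures apply to this five-instruction example rather than to a long-running program.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def time_graph(program):
    """For each instruction, map clock cycle -> stage (no stalls assumed)."""
    return [
        {i + 1 + s: stage for s, stage in enumerate(STAGES)}
        for i, _ in enumerate(program)
    ]

def total_cycles(n, depth=len(STAGES)):
    """Cycles to finish n instructions: fill the pipeline, then one per cycle."""
    return n + depth - 1

program = ["add", "nand", "lw", "add", "sw"]
graph = time_graph(program)
cycles = total_cycles(len(program))
cpi = cycles / len(program)
concurrency = max(sum(c in row for row in graph) for c in range(1, cycles + 1))

print("latency per instruction:", len(STAGES), "cycles")
print("total cycles:", cycles, "CPI for this run:", cpi)
print("peak instructions in flight:", concurrency)
```

As the instruction count grows, (n + 4) / n approaches 1, so the steady-state throughput approaches one instruction per cycle even though each instruction still takes five cycles.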
![Page 26: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/26.jpg)
26
Pipelining Recap
Powerful technique for masking latencies
• Logically, instructions execute one at a time
• Physically, instructions execute in parallel
  – Instruction level parallelism
Abstraction promotes decoupling
• Interface (ISA) vs. implementation (Pipeline)
![Page 27: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/27.jpg)
27
The end
![Page 28: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/28.jpg)
28
Sample Code (Simple)
Assume eight-register machine
Run the following code on a pipelined datapath

add 3 1 2     ; reg 3 = reg 1 + reg 2
nand 6 4 5    ; reg 6 = ~(reg 4 & reg 5)
lw 4 20 (2)   ; reg 4 = Mem[reg2+20]
add 5 2 5     ; reg 5 = reg 2 + reg 5
sw 7 12(3)    ; Mem[reg3+12] = reg 7
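As a cross-check for the cycle-by-cycle walkthrough, the sample code can be executed one instruction at a time. This sketch is a sequential interpreter, not a pipeline model; the initial register values and the memory word at address 29 are the ones that appear in the slide snapshots, and the tuple encoding of instructions is an assumption made for this example.

```python
def run(program, regs, mem):
    """Execute the program sequentially on an eight-register machine."""
    for op, x, y, z in program:
        if op == "add":            # add dest srcA srcB
            regs[x] = regs[y] + regs[z]
        elif op == "nand":         # nand dest srcA srcB
            regs[x] = ~(regs[y] & regs[z])
        elif op == "lw":           # lw dest offset (base)
            regs[x] = mem[regs[z] + y]
        elif op == "sw":           # sw src offset (base)
            mem[regs[z] + y] = regs[x]
    return regs, mem

regs = [0, 36, 9, 12, 18, 7, 41, 22]   # R0..R7 from the initial-state snapshot
mem = {29: 99}                          # word loaded by lw in the snapshots
program = [
    ("add", 3, 1, 2),
    ("nand", 6, 4, 5),
    ("lw", 4, 20, 2),
    ("add", 5, 2, 5),
    ("sw", 7, 12, 3),
]
regs, mem = run(program, regs, mem)
print(regs, mem)
```

The only read-after-write dependence in this sequence is sw reading reg 3 four instructions after add writes it, so the sequential result should match what the pipelined snapshots show.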
Slides thanks to Sally McKee
![Page 29: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/29.jpg)
29
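[Figure: pipelined datapath for the eight-register machine. PC, Inst mem, Register file (R0 to R7), ALU, Data mem, an adder (+), and several MUXes are separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers. Labeled signals: instruction, PC+1, regA, regB, op, dest, offset, valA, valB, target, ALU result, mdata, and data. Instruction fields shown: bits 0-2, bits 15-17, and bits 21-23.]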
![Page 30: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/30.jpg)
30
Initial State
[Figure: pipelined datapath snapshot. IF/ID, ID/EX, EX/MEM, and MEM/WB all hold nop with zero values.]
Register file: R0=0, R1=36, R2=9, R3=12, R4=18, R5=7, R6=41, R7=22
![Page 31: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/31.jpg)
31
Time: 1
Fetch: add 3 1 2
[Figure: pipelined datapath snapshot. IF: add 3 1 2. All later stages hold nop.]
Register file: R0=0, R1=36, R2=9, R3=12, R4=18, R5=7, R6=41, R7=22
![Page 32: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/32.jpg)
32
Time: 2
Fetch: nand 6 4 5
[Figure: pipelined datapath snapshot. IF: nand 6 4 5. ID: add 3 1 2 (reads R1=36 and R2=9, dest 3). Later stages hold nop.]
Register file: R0=0, R1=36, R2=9, R3=12, R4=18, R5=7, R6=41, R7=22
![Page 33: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/33.jpg)
33
Time: 3
Fetch: lw 4 20(2)
[Figure: pipelined datapath snapshot. IF: lw 4 20(2). ID: nand 6 4 5 (reads R4=18 and R5=7, dest 6). EX: add 3 1 2 (ALU result 45, dest 3). Later stages hold nop.]
Register file: R0=0, R1=36, R2=9, R3=12, R4=18, R5=7, R6=41, R7=22
![Page 34: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/34.jpg)
34
Time: 4
Fetch: add 5 2 5
[Figure: pipelined datapath snapshot. IF: add 5 2 5. ID: lw 4 20(2) (reads R2=9, offset 20, dest 4). EX: nand 6 4 5 (ALU result -3, dest 6). MEM: add 3 1 2 (result 45, dest 3). WB: nop.]
Register file: R0=0, R1=36, R2=9, R3=12, R4=18, R5=7, R6=41, R7=22
![Page 35: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/35.jpg)
35
Time: 5
Fetch: sw 7 12(3)
[Figure: pipelined datapath snapshot. IF: sw 7 12(3). ID: add 5 2 5 (reads R2=9 and R5=7, dest 5). EX: lw 4 20 (2) (address 9+20 = 29, dest 4). MEM: nand 6 4 5 (result -3, dest 6). WB: add 3 1 2 writes 45 to R3.]
Register file: R0=0, R1=36, R2=9, R3=45, R4=18, R5=7, R6=41, R7=22
![Page 36: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/36.jpg)
36
Time: 6
No more instructions
[Figure: pipelined datapath snapshot. ID: sw 7 12(3) (reads R3=45 and R7=22, offset 12). EX: add 5 2 5 (ALU result 16, dest 5). MEM: lw 4 20(2) (address 29, loaded value 99, dest 4). WB: nand 6 4 5 writes -3 to R6.]
Register file: R0=0, R1=36, R2=9, R3=45, R4=18, R5=7, R6=-3, R7=22
![Page 37: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/37.jpg)
37
Time: 7
No more instructions
[Figure: pipelined datapath snapshot. IF: nop. ID: nop. EX: sw 7 12(3) (address 45+12 = 57, store value 22). MEM: add 5 2 5 (result 16, dest 5). WB: lw 4 20(2) writes 99 to R4.]
Register file: R0=0, R1=36, R2=9, R3=45, R4=99, R5=7, R6=-3, R7=22
![Page 38: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/38.jpg)
38
Time: 8
No more instructions
[Figure: pipelined datapath snapshot. IF, ID, EX: nop. MEM: sw 7 12(3) (stores 22 at address 57). WB: add 5 2 5 writes 16 to R5.]
Register file: R0=0, R1=36, R2=9, R3=45, R4=99, R5=16, R6=-3, R7=22
Slides thanks to Sally McKee
![Page 39: 1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a0246a/html5/thumbnails/39.jpg)
39
Time: 9
No more instructions
[Figure: pipelined datapath snapshot. IF, ID, EX, MEM: nop. WB: sw 7 12(3) (no register write).]
Register file: R0=0, R1=36, R2=9, R3=45, R4=99, R5=16, R6=-3, R7=22