© kavita bala, computer science, cornell university kevin walsh cs 3410, spring 2010 computer...
Post on 19-Dec-2015
219 views
TRANSCRIPT
© Kavita Bala, Computer Science, Cornell University
Kevin WalshCS 3410, Spring 2010
Computer ScienceCornell University
Pipelining
See: P&H Chapter 4.5
© Kavita Bala, Computer Science, Cornell University 2
The Kids
Alice
Bob
They don’t always get along…
© Kavita Bala, Computer Science, Cornell University 5
The Instructions
N pieces, each built following same sequence:
Saw Drill Glue Paint
© Kavita Bala, Computer Science, Cornell University 6
Design 1: Sequential Schedule
Alice owns the roomBob can enter when Alice is finishedRepeat for remaining tasksNo possibility for conflicts
© Kavita Bala, Computer Science, Cornell University 7
Elapsed Time for Alice: 4Elapsed Time for Bob: 4Total elapsed time: 4*NCan we do better?
Sequential Performance
time1 2 3 4 5 6 7 8 …
Latency:Throughput:Concurrency:
© Kavita Bala, Computer Science, Cornell University 8
Design 2: Pipelined Design
Partition room into stages of a pipeline
One person owns a stage at a time4 stages4 people working simultaneouslyEveryone moves right in lockstep
AliceBobCarolDave
© Kavita Bala, Computer Science, Cornell University 9
Pipelined Performancetime1 2 3 4 5 6 7…
Latency:Throughput:Concurrency:
© Kavita Bala, Computer Science, Cornell University 10
Unequal Pipeline Stages
0h 1h 2h 3h…
15 min 30 min 45 min 90 min
Latency:Throughput:Concurrency:
© Kavita Bala, Computer Science, Cornell University 11
Poorly-balanced Pipeline Stages
0h 1h 2h 3h…
15 min 30 min 45 min 90 min
Latency:Throughput:Concurrency:
© Kavita Bala, Computer Science, Cornell University 12
Re-Balanced Pipeline Stages
0h 1h 2h 3h…
15 min 30 min 45 min 90 min
Latency:Throughput:Concurrency:
© Kavita Bala, Computer Science, Cornell University 13
Splitting Pipeline Stages
0h 1h 2h 3h…
15 min 30 min 45 min 45+45 min
Latency:Throughput:Concurrency:
© Kavita Bala, Computer Science, Cornell University 14
Pipeline Hazards
0h 1h 2h 3h…
Q: What if glue step of task 3 depends on output of task 1?
Latency:Throughput:Concurrency:
© Kavita Bala, Computer Science, Cornell University 15
Lessons
Principle:Latencies can be masked by parallel execution
Pipelining:• Identify pipeline stages• Isolate stages from each other• Resolve pipeline hazards
© Kavita Bala, Computer Science, Cornell University 16
A Processor
alu
PC
imm
memory
memory
din
do
ut
addr
target
offset cmpcontrol
=?
new pc
registerfile
inst
extend
+4 +4
© Kavita Bala, Computer Science, Cornell University 17
Write-BackMemory
InstructionFetch Execute
InstructionDecode
registerfile
control
A Processor
alu
imm
memory
din
do
ut
addr
inst
PC
memory
computejump/branch
targets
new pc
+4
extend
© Kavita Bala, Computer Science, Cornell University 18
Basic Pipeline
Five stage “RISC” load-store architecture1. Instruction fetch (IF)
– get instruction from memory, increment PC2. Instruction Decode (ID)
– translate opcode into control signals and read registers3. Execute (EX)
– perform ALU operation, compute jump/branch targets4. Memory (MEM)
– access memory if needed5. Writeback (WB)
– update register file
Slides thanks to Sally McKee & Kavita Bala