lec 8: pipelining kavita bala cs 3410, fall 2008 computer science cornell university

27
Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

Post on 21-Dec-2015

219 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

Lec 8: Pipelining

Kavita Bala

CS 3410, Fall 2008

Computer Science

Cornell University

Page 2: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Basic PipeliningFive stage “RISC” load-store architecture1. Instruction fetch (IF)

• get instruction from memory2. Instruction Decode (ID)

• translate opcode into control signals and read regs3. Execute (EX)

• perform ALU operation4. Memory (MEM)

• Access memory if load/store5. Writeback (WB)

• update register file

Following slides thanks to Sally McKee

Page 3: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Pipelined Implementation

• Break the execution of the instruction into cycles (five, in this case)

• Design a separate stage for the execution performed during each cycle

• Build pipeline registers (latches) to communicate between the stages

Slides thanks to Sally McKee

Page 4: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Stage 1: Fetch and Decode

• Design a datapath that can fetch an instruction from memory every cycle– Use PC to index memory to read instruction– Increment the PC (assume no branches for now)

• Write everything needed to complete execution to the pipeline register (IF/ID)– The next stage will read this pipeline register– Note that pipeline register must be edge triggered

Slides thanks to Sally McKee

Page 5: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell UniversityIn

stru

ctio

nb

its

IF / IDPipeline register

PC

InstructionMemory/

Cache

en

en

1

incr

MUX

Res

t of

pip

elin

ed d

atap

ath

PC

+ 1

Slides thanks to Sally McKee

Page 6: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Stage 2: Decode

• Reads the IF/ID pipeline register, decodes instruction, and reads register file (specified by regA and regB of instruction bits)– Decode can be easy, just pass on the opcode and let

later stages figure out their own control signals for the instruction

• Write everything needed to complete execution to the pipeline register (ID/EX)– Pass on the offset field and destination register

specifiers (or simply pass on the whole instruction!)– Pass on PC+1 even though decode didn’t use it

Page 7: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Destreg

Data

ID / EXPipeline register

Con

ten

tsO

f re

gAC

onte

nts

Of

regB

Register File

regAregB

en

Res

t of

pip

elin

ed d

atap

ath

Inst

ruct

ion

bit

s

IF / IDPipeline register

PC

+ 1

PC

+ 1

Con

trol

Sig

nal

s

Sta

ge 1

: In

st F

etch

dat

apat

h

Slides thanks to Sally McKee

Page 8: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Stage 3: Execute

• Design a datapath that performs the proper ALU operation for the instruction specified and values present in the ID/EX pipeline register– The inputs are the contents of regA and either the

contents of regB or the offset field in the instruction– Also, calculate PC+1+offset, in case this is a branch

• Write everything needed to complete execution to the pipeline register (EX/Mem)– ALU result, contents of regB and PC+1+offset– Instruction bits for opcode and destReg specifiers

Page 9: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

ID / EXPipeline register

Con

ten

tsO

f re

gAC

onte

nts

Of

regB

Res

t of

pip

elin

ed d

atap

ath

Alu

Res

ult

EX/MemPipeline register

PC

+ 1

Con

trol

Sig

nal

sSta

ge 2

: D

ecod

e d

atap

ath

Con

trol

Sign

als

con

ten

ts

of r

egB

ALU

MUX

PC

+ 1

+ o

ffse

t

Magic

Slides thanks to Sally McKee

Page 10: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Stage 4: Memory Operation

• Design a datapath that performs the proper memory operation for the instruction specified and values present in the EX/Mem pipeline register– ALU result contains address for ld and st instructions– Opcode bits control memory R/W and enable signals

• Write everything needed to complete execution to the pipeline register (Mem/WB)– ALU result and MemData– Instruction bits for opcode and destReg specifiers

Page 11: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Alu

Res

ult

Mem/WBPipeline register

Res

t of

pip

elin

ed d

atap

ath

Alu

Res

ult

EX/MemPipeline register

Sta

ge 3

: E

xecu

te d

atap

ath

Con

trol

Sign

als

PC

+1

+of

fset

con

ten

ts

of r

egB

This goes back to the MUXbefore the PC in stage 1

Mem

ory

Rea

d D

ata

Data Memory

en R/W

Con

trol

Sign

als

MUX controlfor PC input

Slides thanks to Sally McKee

Page 12: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Stage 5: Write Back

• Design a datapath that completes the execution of this instruction, writing to the register file if required– Write MemData to destReg for ld instruction– Write ALU result to destReg for

arithmetic/logic instructions– Opcode bits also control register write enable

signal

Slides thanks to Sally McKee

Page 13: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Alu

Res

ult

Mem/WBPipeline register

Sta

ge 4

: M

emor

y d

atap

ath

Con

trol

Sign

als

Mem

ory

Rea

d D

ata

MUX

This goes back to data input of register file

This goes back to thedestination register specifier

MUX

bits 0-2

bits 15-17

register write enable

Slides thanks to Sally McKee

Page 14: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Sample Code (Simple)

• Assume eight-register machine

• Run the following code on a pipelined datapath add 3 1 2 ; reg 3 = reg 1 + reg 2

nand 6 4 5 ; reg 6 = ~(reg 4 & reg 5)

lw 4 20 (2) ; reg 4 = Mem[reg2+20]

add 5 2 5 ; reg 5 = reg 2 + reg 5

sw 7 12(3) ; Mem[reg3+12] = reg 7

Slides thanks to Sally McKee

Page 15: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

op

dest

offset

valB

valA

PC+1PC+1target

ALUresult

op

dest

valB

op

dest

ALUresult

mdata

instru

ction

0

R2

R3

R4

R5

R1

R6

R0

R7

regA

regB

Bits 21-23

data

dest

Slides thanks to Sally McKee

Page 16: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

nop

0

0

0

0

000

0

nop

0

0

nop

0

0

0

0

nop

912187

36

41

0

22

R2

R3

R4

R5

R1

R6

R0

R7

Bits 21-23

data

dest

InitialState

Slides thanks to Sally McKee

Page 17: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

nop

0

0

0

0

010

0

nop

0

0

nop

0

0

0

0

add

3 1 2

912187

36

41

0

22

R2

R3

R4

R5

R1

R6

R0

R7

Bits 21-23

data

dest

Fetch: add 3 1 2

add 3 1 2

Time: 1Slides thanks to Sally McKee

Page 18: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

add

3

3

9

36

120

0

nop

0

0

nop

0

0

0

0nan

d 6 4 5

912187

36

41

0

22

R2

R3

R4

R5

R1

R6

R0

R7

1

2

Bits 21-23

data

dest

Fetch: nand 6 4 5

nand 6 4 5 add 3 1 2

Time: 2Slides thanks to Sally McKee

Page 19: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

nand

6

6

7

18

234

45

add

3

9

nop

0

0

0

0lw 4 20(2)

912187

36

41

0

22

R2

R3

R4

R5

R1

R6

R0

R7

4

5

Bits 21-23

data

dest

Fetch: lw 4 20(2)

lw 4 20(2) nand 6 4 5 add 3 1 2

Time: 3

36

9

3

Slides thanks to Sally McKee

Page 20: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

lw

4

20

18

9

348

-3

nand

6

7

add

3

45

0

0add

5 2 5

912187

36

41

0

22

R2

R3

R4

R5

R1

R6

R0

R7

2

4

Bits 21-23

data

dest

Fetch: add 5 2 5

add 5 2 5 lw 4 20(2) nand 6 4 5 add 3 1 2

Time: 4

18

7

6

45

3

Slides thanks to Sally McKee

Page 21: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

add

5

5

7

9

4523

29

lw

4

18

nand

6

-3

0

0sw 7 12(3)

945187

36

41

0

22

R2

R3

R4

R5

R1

R6

R0

R7

2

5

Bits 21-23

data

dest

Fetch: sw 7 12(3)

sw 7 12(3) add 5 2 5 lw 4 20 (2) nand 4 5 6 add

Time: 5

9

20

4

-3

6

45

3

Slides thanks to Sally McKee

Page 22: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

sw

7

12

22

45

5 9

16

add

5

7

lw

4

29

99

0

945187

36

-3

0

22

R2

R3

R4

R5

R1

R6

R0

R7

3

7

Bits 21-23

data

dest

No moreinstructions

sw 7 12(3) add 5 2 5 lw 4 20(2) nand

Time: 6

9

7

5

29

4

-3

6

Slides thanks to Sally McKee

Page 23: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

15

57

sw

7

22

add

5

16

0

0

945997

36

-3

0

22

R2

R3

R4

R5

R1

R6

R0

R7

Bits 21-23

data

dest

No moreinstructions

sw 7 12(3) add 5 2 5 lw

Time: 7

45

7

12

16

5

99

4

Slides thanks to Sally McKee

Page 24: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

sw

7

57

0

9459916

36

-3

0

22

R2

R3

R4

R5

R1

R6

R0

R7

Bits 21-23

data

dest

No moreinstructions

sw 7 12(3) add

Time: 8

2257

22

16

5

Slides thanks to Sally McKee

Page 25: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

PC Instmem

Reg

iste

r fi

le

MUXA

LU

MUX

1

Datamem

+

MUX

IF/ID

ID/EX

EX/Mem

Mem/WB

MUX

Bits 0-2

Bits 15-17

9459916

36

-3

0

22

R2

R3

R4

R5

R1

R6

R0

R7

Bits 21-23

data

dest

No moreinstructions

sw

Time: 9Slides thanks to Sally McKee

Page 26: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Time Graphs

Time: 1 2 3 4 5 6 7 8 9

add

nand

lw

add

sw

fetch decode execute memory writeback

fetch decode execute memory writeback

fetch decode execute memory writeback

fetch decode execute memory writeback

fetch decode execute memory writeback

Slides thanks to Sally McKee

Page 27: Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University

© Kavita Bala, Computer Science, Cornell University

Pipelining Recap• Powerful technique for masking latencies

– Logically, instructions execute one at a time– Physically, instructions execute in parallel

Instruction level parallelism

• Decouples the processor model from the implementation– Interface vs. implementation

• BUT dependencies between instructions complicate the implementation