ece 2300 digital logic & computer organization · lecture 17: 2 announcements • prelab 5(b)...

31
Lecture 17: Spring 2018 1 Pipelined Microprocessor ECE 2300 Digital Logic & Computer Organization

Upload: others

Post on 09-Jul-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Lecture 17:

Spring 2018

1

Pipelined Microprocessor

ECE 2300Digital Logic & Computer Organization

Lecture 17: 2

Announcements• Prelab 5(b) deadline is on Friday

• Prelim 2: Tues April 17, 7:30-9:00pm, PHL 101– Coverage: Lectures 8~16

• FSMs, timing analysis, binary arithmetic, memories, single-cycle microprocessor

• Supplementary notes on timing analysis posted on course web

– Closed book, closed notes, closed internet– A sample exam is posted on CMS– Instructor OH moved from Thursday 4/12 to

Monday 4/16, 4:00-5:30pm (one-time change)

Lecture 17: 3

Programmable Single-Cycle Processor

PC

RF

LDSASBDRD_in

ALU DataRAM

DataA

DataB

V C Z N

Fm … F0

SE

IMMMB

M_address

Data_in

MW MD

01

01

Inst. RAM

Decoder

DRSASB

IMMMBFSMDLD

MWBS

SE(OFF,0) AdderMP

01

+2

• Instruction RAM holds the program to be run

• Decoder derives control word from the instruction

MP

BS

01ZZ’NN’CV

01234567

Lecture 17: 4Lecture 17: !14

Instructions → Control Words

• Instruction decoder derives the control word from the instruction fields

• Combinational circuit ! instruction input, CW output

LD

ECE-2300 Instruction Set

Lecture 17: 5

Instruction Set Architecture (ISA)• The ISA describes a set of instructions supported by a

family of machines

• The ISA specification tells hardware and software (compiler and operating system) developers – Instruction formats – Operation of each instruction – Ways to form memory addresses – Data formats – Lots of other

• Examples: x86, ARM, MIPS, POWER, SPARC, RISC-V

Lecture 17:

Steps in Instruction Execution

6

• Instruction Fetch (IF) – Fetch instruction; Update PC

• Instruction Decode (ID) – Decode instruction; Read register file

• Execute (EX) – Perform ALU operation

• Memory (MEM) – Perform memory operation

• Write Back (WB) – Put result into register file

• Clock period is limited by the longest path– Suppose each step takes 1ns, clock period is 5ns

Lecture 17:

Pipelining: Basic Idea

7

IF/ID ID/EX EX/MEM MEM/WB

IF ID EX MEM WB

• Overlap instruction execution by performing each step in successive clock cycles

Lecture 17:

• Single-cycle execution

Pipelining: Overlapped Instructions

• Pipelined execution

8

IF-ID-EX-MEM-WB IF-ID-EX-MEM-WB

CC1 CC2

IF ID ALU MEM WBIF ID ALU MEM WB

IF ID ALU MEM WBIF ID ALU MEM WB

IF ID ALU MEM WB

CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9Instruction 1

Instruction 2

Instruction 3

Instruction 4

Instruction 5

Lecture 17:

Pipelining: Performance• Faster clock frequency than single cycle processor• Each instruction takes 5 cycles• Average number of cycles per instruction (CPI)

• ~1 instruction completed every cycle (ideally)

9

IF ID ALU MEM WBIF ID ALU MEM WB

IF ID ALU MEM WBIF ID ALU MEM WB

IF ID ALU MEM WB

CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9Instruction 1

Instruction 2

Instruction 3

Instruction 4

Instruction 5

Lecture 17: 10

Instruction Fetch Stage

PC

PCL

MUX

PCJ

+2

IF/ID

InstRAM

• Fetch the instruction into IF/ID (based on current PC)• Load PC+2 into PC• Place PC+2 into IF/ID

Lecture 17: 11

Instruction Decode StageControlSignalsCU

V C Z N

PC

PCL

MUX

PCJ

+2 Adder

IF/ID ID/EX

Decoder

SE

InstRAM

• Read source operands from RF into ID/EX• Place SE(IMM) into ID/EX• Place SE(IMM)+(PC+2) into ID/EX• Place DR into ID/EX

LDSASBDR

D_in

RF

Lecture 17: 12

Execute Stage

MUX

ControlSignals

ALU

V C Z N

Fm … F0

CU

PC

PCL

MUX

PCJ

+2 Adder

IF/ID ID/EX EX/MEMMB

Decoder

SE

InstRAM

• Perform ALU operation and place result into EX/MEM• Pass DataB from the RF to EX/MEM• Pass DR to EX/MEM• If taken branch, update PC (Is it too late?)

LDSASBDR

D_in

RF

V C Z N

Lecture 17: 13

Memory Stage

MUX

ControlSignals

ALU

Fm … F0

MUX

MD

CU

PC

PCL

MUX

PCJMW

D_IN

+2 Adder

IF/ID ID/EX EX/MEM MEM/WBMB

Decoder

SE

InstRAM

DataRAM

• Store: Write DataB into RAM• Load: Read data from RAM into MEM/WB• ALU operation: pass ALU result from EX/MEM to MEM/WB• Pass DR to MEM/WB

LDSASBDR

D_in

RF

V C Z N

V C Z N

Lecture 17: 14

Writeback Stage

MUX

ControlSignals

ALU

Fm … F0

MUX

MD

CU

PC

PCL

MUX

PCJMW

D_IN

+2 Adder

IF/ID ID/EX EX/MEM MEM/WBMB

Decoder

SE

InstRAM

DataRAM

• Load or ALU operation: Write register file

LDSASBDR

D_in

RF

V C Z N

V C Z N

Lecture 17: 15

Pipelined Microprocessor

MUX

ControlSignals

ALU

Fm … F0

MUX

MD

CU

PC

PCL

MUX

PCJMW

D_IN

+2 Adder

IF/ID ID/EX EX/MEM MEM/WBMB

Decoder

SE

InstRAM

DataRAM

LDSASBDR

D_in

RF

V C Z N

V C Z N

Lecture 17:

Abstract Representation

IM RegReg ALU

DM

16

IF/ID ID/EX EX/MEM MEM/WB

Lecture 17:

Example Instruction Sequence

IM RegReg ALU

DMADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

17

Lecture 17:

Example Instruction Sequence

IM RegReg ALU

DM

ADD R1,R2,R3

18

ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

Lecture 17:

Example Instruction Sequence

IM RegReg ALU

DM

ADD R1,R2,R3OR R4,R4,R3

19

ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

Lecture 17:

Example Instruction Sequence

IM RegReg ALU

DM

ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3

20

ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

Lecture 17:

Example Instruction Sequence

IM RegReg ALU

DM

ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2

21

ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

Lecture 17:

Example Instruction Sequence

IM RegReg ALU

DM

ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

22

ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

Lecture 17:

Example Instruction Sequence

IM RegReg ALU

DM

OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

23

ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

Lecture 17:

Example Instruction Sequence

IM RegReg ALU

DM

SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

24

ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3

Lecture 17:

Example Instruction Sequence

ADD R1,R2,R3

OR R4,R4,R3

SUB R5,R2,R3

AND R6,R6,R2

ADDI R7,R7,3

IM RegReg ALU

IM RegReg DMALU

IM RegReg DMALU

IM RegReg DMALU

IM RegReg DMALU

CC1 CC4 CC5 CC6 CC7 CC8 CC9CC2 CC3

DM

25

Lecture 17:

What About This Sequence?

ADD R1,R2,R3

OR R4,R1,R3

SUB R5,R2,R1

AND R6,R1,R2

ADDI R7,R7,3

IM RegReg ALU

IM RegReg DMALU

IM RegReg DMALU

IM RegReg DMALU

IM RegReg DMALU

CC1 CC4 CC5 CC6 CC7 CC8 CC9CC2 CC3

DM

26

The OR, SUB, and AND instructions are data dependent on the ADD instruction

Lecture 17:

Data Hazard• Occurs when a register is read before the write back of a

value to that register

ADD R1,R2,R3OR R4,R1,R3SUB R5,R2,R1AND R6,R1,R2

• What should happen– The 1st instruction calculates a new value for R1– The 2nd, 3rd, and 4th instructions use this new value

27

IF ID ALU MEM WBIF ID ALU MEM WB

IF ID ALU MEM WBIF ID ALU MEM WB

• What actually happens– The 2nd, 3rd, and 4th instructions read the old value of R1– The first instruction then writes the new value into R1

Lecture 17:

Solution 1: SW (compiler) Inserts NOPs

ADD R1,R2,R3

NOP

IM RegReg ALU

IM RegReg DMALU

IM RegReg DMALU

IM RegReg DMALU

IM RegReg DMALU

CC1 CC4 CC5 CC6 CC7 CC8 CC9CC2 CC3

DM

OR R4,R1,R3

NOP

NOP

28

Lecture 17:

Solution 2: HW Stalls the Pipeline

ADD R1,R2,R3

OR R4,R1,R3

SUB R5,R2,R1

AND R6,R1,R2

ADDI R7,R7,3

IM RegReg ALU

IM

IM Reg ALU

IM Reg DMALU

IM Reg

CC1 CC4 CC5 CC6 CC7 CC8 CC9CC2 CC3

DM

Reg ALU

DM Regbubble bubble bubble

29

The pipeline is stalled for three cycles

Lecture 17: 30

• Identify all data hazards in the following instruction sequences by circling each source register that is read before the updated value is written backADD R1, R2, R3 NOPADDI R2, R1, 1 SUB R3, R1, R2 SUB R4, R3, R1

Example: Data Hazards

Lecture 17:

Before Next Class • H&H 7.5.3-7.5.5

Next Time

More Pipelined Microprocessor

31