ece 2300 digital logic & computer organization · lecture 17: 2 announcements • prelab 5(b)...
TRANSCRIPT
Lecture 17: 2
Announcements• Prelab 5(b) deadline is on Friday
• Prelim 2: Tues April 17, 7:30-9:00pm, PHL 101– Coverage: Lectures 8~16
• FSMs, timing analysis, binary arithmetic, memories, single-cycle microprocessor
• Supplementary notes on timing analysis posted on course web
– Closed book, closed notes, closed internet– A sample exam is posted on CMS– Instructor OH moved from Thursday 4/12 to
Monday 4/16, 4:00-5:30pm (one-time change)
Lecture 17: 3
Programmable Single-Cycle Processor
PC
RF
LDSASBDRD_in
ALU DataRAM
DataA
DataB
V C Z N
Fm … F0
SE
IMMMB
M_address
Data_in
MW MD
01
01
Inst. RAM
Decoder
DRSASB
IMMMBFSMDLD
MWBS
SE(OFF,0) AdderMP
01
+2
• Instruction RAM holds the program to be run
• Decoder derives control word from the instruction
MP
BS
01ZZ’NN’CV
01234567
Lecture 17: 4Lecture 17: !14
Instructions → Control Words
• Instruction decoder derives the control word from the instruction fields
• Combinational circuit ! instruction input, CW output
LD
ECE-2300 Instruction Set
Lecture 17: 5
Instruction Set Architecture (ISA)• The ISA describes a set of instructions supported by a
family of machines
• The ISA specification tells hardware and software (compiler and operating system) developers – Instruction formats – Operation of each instruction – Ways to form memory addresses – Data formats – Lots of other
• Examples: x86, ARM, MIPS, POWER, SPARC, RISC-V
Lecture 17:
Steps in Instruction Execution
6
• Instruction Fetch (IF) – Fetch instruction; Update PC
• Instruction Decode (ID) – Decode instruction; Read register file
• Execute (EX) – Perform ALU operation
• Memory (MEM) – Perform memory operation
• Write Back (WB) – Put result into register file
• Clock period is limited by the longest path– Suppose each step takes 1ns, clock period is 5ns
Lecture 17:
Pipelining: Basic Idea
7
IF/ID ID/EX EX/MEM MEM/WB
IF ID EX MEM WB
• Overlap instruction execution by performing each step in successive clock cycles
Lecture 17:
• Single-cycle execution
Pipelining: Overlapped Instructions
• Pipelined execution
8
IF-ID-EX-MEM-WB IF-ID-EX-MEM-WB
CC1 CC2
IF ID ALU MEM WBIF ID ALU MEM WB
IF ID ALU MEM WBIF ID ALU MEM WB
IF ID ALU MEM WB
CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9Instruction 1
Instruction 2
Instruction 3
Instruction 4
Instruction 5
Lecture 17:
Pipelining: Performance• Faster clock frequency than single cycle processor• Each instruction takes 5 cycles• Average number of cycles per instruction (CPI)
• ~1 instruction completed every cycle (ideally)
9
IF ID ALU MEM WBIF ID ALU MEM WB
IF ID ALU MEM WBIF ID ALU MEM WB
IF ID ALU MEM WB
CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9Instruction 1
Instruction 2
Instruction 3
Instruction 4
Instruction 5
Lecture 17: 10
Instruction Fetch Stage
PC
PCL
MUX
PCJ
+2
IF/ID
InstRAM
• Fetch the instruction into IF/ID (based on current PC)• Load PC+2 into PC• Place PC+2 into IF/ID
Lecture 17: 11
Instruction Decode StageControlSignalsCU
V C Z N
PC
PCL
MUX
PCJ
+2 Adder
IF/ID ID/EX
Decoder
SE
InstRAM
• Read source operands from RF into ID/EX• Place SE(IMM) into ID/EX• Place SE(IMM)+(PC+2) into ID/EX• Place DR into ID/EX
LDSASBDR
D_in
RF
Lecture 17: 12
Execute Stage
MUX
ControlSignals
ALU
V C Z N
Fm … F0
CU
PC
PCL
MUX
PCJ
+2 Adder
IF/ID ID/EX EX/MEMMB
Decoder
SE
InstRAM
• Perform ALU operation and place result into EX/MEM• Pass DataB from the RF to EX/MEM• Pass DR to EX/MEM• If taken branch, update PC (Is it too late?)
LDSASBDR
D_in
RF
V C Z N
Lecture 17: 13
Memory Stage
MUX
ControlSignals
ALU
Fm … F0
MUX
MD
CU
PC
PCL
MUX
PCJMW
D_IN
+2 Adder
IF/ID ID/EX EX/MEM MEM/WBMB
Decoder
SE
InstRAM
DataRAM
• Store: Write DataB into RAM• Load: Read data from RAM into MEM/WB• ALU operation: pass ALU result from EX/MEM to MEM/WB• Pass DR to MEM/WB
LDSASBDR
D_in
RF
V C Z N
V C Z N
Lecture 17: 14
Writeback Stage
MUX
ControlSignals
ALU
Fm … F0
MUX
MD
CU
PC
PCL
MUX
PCJMW
D_IN
+2 Adder
IF/ID ID/EX EX/MEM MEM/WBMB
Decoder
SE
InstRAM
DataRAM
• Load or ALU operation: Write register file
LDSASBDR
D_in
RF
V C Z N
V C Z N
Lecture 17: 15
Pipelined Microprocessor
MUX
ControlSignals
ALU
Fm … F0
MUX
MD
CU
PC
PCL
MUX
PCJMW
D_IN
+2 Adder
IF/ID ID/EX EX/MEM MEM/WBMB
Decoder
SE
InstRAM
DataRAM
LDSASBDR
D_in
RF
V C Z N
V C Z N
Lecture 17:
Example Instruction Sequence
IM RegReg ALU
DMADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
17
Lecture 17:
Example Instruction Sequence
IM RegReg ALU
DM
ADD R1,R2,R3
18
ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
Lecture 17:
Example Instruction Sequence
IM RegReg ALU
DM
ADD R1,R2,R3OR R4,R4,R3
19
ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
Lecture 17:
Example Instruction Sequence
IM RegReg ALU
DM
ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3
20
ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
Lecture 17:
Example Instruction Sequence
IM RegReg ALU
DM
ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2
21
ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
Lecture 17:
Example Instruction Sequence
IM RegReg ALU
DM
ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
22
ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
Lecture 17:
Example Instruction Sequence
IM RegReg ALU
DM
OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
23
ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
Lecture 17:
Example Instruction Sequence
IM RegReg ALU
DM
SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
24
ADD R1,R2,R3OR R4,R4,R3SUB R5,R2,R3AND R6,R6,R2ADDI R7,R7,3
Lecture 17:
Example Instruction Sequence
ADD R1,R2,R3
OR R4,R4,R3
SUB R5,R2,R3
AND R6,R6,R2
ADDI R7,R7,3
IM RegReg ALU
IM RegReg DMALU
IM RegReg DMALU
IM RegReg DMALU
IM RegReg DMALU
CC1 CC4 CC5 CC6 CC7 CC8 CC9CC2 CC3
DM
25
Lecture 17:
What About This Sequence?
ADD R1,R2,R3
OR R4,R1,R3
SUB R5,R2,R1
AND R6,R1,R2
ADDI R7,R7,3
IM RegReg ALU
IM RegReg DMALU
IM RegReg DMALU
IM RegReg DMALU
IM RegReg DMALU
CC1 CC4 CC5 CC6 CC7 CC8 CC9CC2 CC3
DM
26
The OR, SUB, and AND instructions are data dependent on the ADD instruction
Lecture 17:
Data Hazard• Occurs when a register is read before the write back of a
value to that register
ADD R1,R2,R3OR R4,R1,R3SUB R5,R2,R1AND R6,R1,R2
• What should happen– The 1st instruction calculates a new value for R1– The 2nd, 3rd, and 4th instructions use this new value
27
IF ID ALU MEM WBIF ID ALU MEM WB
IF ID ALU MEM WBIF ID ALU MEM WB
• What actually happens– The 2nd, 3rd, and 4th instructions read the old value of R1– The first instruction then writes the new value into R1
Lecture 17:
Solution 1: SW (compiler) Inserts NOPs
ADD R1,R2,R3
NOP
IM RegReg ALU
IM RegReg DMALU
IM RegReg DMALU
IM RegReg DMALU
IM RegReg DMALU
CC1 CC4 CC5 CC6 CC7 CC8 CC9CC2 CC3
DM
OR R4,R1,R3
NOP
NOP
28
Lecture 17:
Solution 2: HW Stalls the Pipeline
ADD R1,R2,R3
OR R4,R1,R3
SUB R5,R2,R1
AND R6,R1,R2
ADDI R7,R7,3
IM RegReg ALU
IM
IM Reg ALU
IM Reg DMALU
IM Reg
CC1 CC4 CC5 CC6 CC7 CC8 CC9CC2 CC3
DM
Reg ALU
DM Regbubble bubble bubble
29
The pipeline is stalled for three cycles
Lecture 17: 30
• Identify all data hazards in the following instruction sequences by circling each source register that is read before the updated value is written backADD R1, R2, R3 NOPADDI R2, R1, 1 SUB R3, R1, R2 SUB R4, R3, R1
Example: Data Hazards