CS 161 Computer Architecture, Chapter 5, Lecture 11. Instructor: L.N. Bhuyan (www.cs.ucr.edu/~bhuyan). Adapted from notes by Dave Patterson (http.cs.berkeley.edu/~patterson). 1999 ©UCB

Post on 21-Dec-2015



Page 2: Implementing Main Control

[Figure: Main Control block. Input: op (6 bits). Outputs: RegDst, Branch, MemRead, MemtoReg, ALUOp (2 bits), MemWrite, ALUSrc, RegWrite.]

Main Control has one 6-bit input (op) and 9 output bits: seven 1-bit signals plus the 2-bit ALUOp.

To build Main Control as a sum of products:

(1) Construct a minterm for each distinct instruction; each minterm corresponds to a single instruction or to all of the R-type instructions together, e.g., M_R-format, M_lw

(2) Form each main control output as the logical OR of the relevant minterms (instructions), e.g., RegWrite = M_R-format OR M_lw
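The two steps above can be sketched in Python. This is only an illustration: the opcode values are the standard MIPS encodings (an assumption, since the slide does not list them), the output equations follow the control table in Fig. 5.18, don't-care entries are emitted as 0, and the names `OPCODES`, `minterm` and `main_control` are hypothetical.

```python
# Assumed standard MIPS opcodes (not given on the slide).
OPCODES = {"R-format": 0b000000, "lw": 0b100011, "sw": 0b101011, "beq": 0b000100}


def minterm(op, pattern):
    """AND of all six opcode literals: true only if op equals pattern bit for bit."""
    return all(((op >> i) & 1) == ((pattern >> i) & 1) for i in range(6))


def main_control(op):
    """Step 1: one minterm per instruction. Step 2: each output is an OR of minterms."""
    m = {name: minterm(op, code) for name, code in OPCODES.items()}
    return {
        "RegDst": int(m["R-format"]),
        "ALUSrc": int(m["lw"] or m["sw"]),
        "MemtoReg": int(m["lw"]),
        "RegWrite": int(m["R-format"] or m["lw"]),
        "MemRead": int(m["lw"]),
        "MemWrite": int(m["sw"]),
        "Branch": int(m["beq"]),
        # ALUOp1 = M_R-format, ALUOp0 = M_beq, packed into a 2-bit value.
        "ALUOp": (int(m["R-format"]) << 1) | int(m["beq"]),
    }
```

For example, `main_control(0b100011)` (lw) asserts RegWrite, MemRead, MemtoReg and ALUSrc, and leaves ALUOp at 0.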

Page 3: Single-Cycle MIPS-lite CPU

[Figure: single-cycle datapath. Components: PC, Imem (Read Addr, Instruction), Regs (ReadReg1, ReadReg2, WriteReg, WriteData, Read data 1, Read data 2), Sign Extend, ALU (Zero output) with ALU Control, Dmem (Address, Read data, WriteData), an adder for PC + 4 and an adder with a <<2 shifter for the branch target, and multiplexors. Instruction fields: 31:0, 25:21, 20:16, 15:11, 15:0, 5:0. Main Control takes op = [31:26]. Control signals: RegDst, RegWrite, ALUSrc, ALUOp (2 bits), MemRead, MemWrite, MemtoReg, Branch, PCSrc.]

Page 4: Fig. 5.17 Datapath with Control Signals

Page 5: Fig. 5.18 Setting of the Control Lines Depends on the Opcode

Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1

Page 6: Control Design

° Simple combinational logic (truth tables)

[Figure: ALU control block. Inputs: ALUOp (ALUOp1, ALUOp0) and the funct field F(5-0), of which F3, F2, F1, F0 are used. Outputs: Operation (Operation2, Operation1, Operation0).]

[Figure: main control logic. Inputs: Op5, Op4, Op3, Op2, Op1, Op0, decoded into one term each for R-format, lw, sw, beq. Outputs: RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0.]
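As a rough illustration of the ALU control block as combinational logic, the sketch below derives Operation2..Operation0 from ALUOp and the low funct bits. The slide shows only the block's inputs and outputs, so the equations, the funct encodings and the operation codes used here (010 add, 110 subtract, 000 AND, 001 OR, 111 slt) are assumptions taken from the usual textbook MIPS-lite convention, and `alu_control` is a hypothetical name.

```python
def alu_control(alu_op1, alu_op0, funct):
    """Return the 3-bit Operation value (Operation2..Operation0) as an integer.

    Assumed equations (textbook convention, not stated on the slide):
      Operation2 = ALUOp0 OR (ALUOp1 AND F1)
      Operation1 = NOT ALUOp1 OR NOT F2
      Operation0 = ALUOp1 AND (F3 OR F0)
    """
    f0 = funct & 1
    f1 = (funct >> 1) & 1
    f2 = (funct >> 2) & 1
    f3 = (funct >> 3) & 1
    operation2 = alu_op0 | (alu_op1 & f1)
    operation1 = (1 - alu_op1) | (1 - f2)
    operation0 = alu_op1 & (f3 | f0)
    return (operation2 << 2) | (operation1 << 1) | operation0
```

Under these assumptions, lw/sw (ALUOp = 00) select add, beq (ALUOp = 01) selects subtract, and R-format (ALUOp = 10) lets the funct field choose the operation.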

Page 7: Fig. 5.19 R-type operation, add $t1, $t2, $t3. Active parts are highlighted.

Page 8: Fig. 5.20 Active parts for a load instruction

Page 9: Fig. 5.21 Active parts for a beq instruction

Page 10: Fig. 5.24 Extension for the jump instruction

Page 11: Single-Cycle Machine: Appraisal

° All instructions complete in one clock cycle (CPI = 1)

° Some instructions take more steps than others

• lw is the most expensive (5 steps, vs. 4 for R-type and sw, 3 for beq)

° The clock cycle must cover the longest instruction, which is inefficient

• Suppose mul is added?

• Its 32 shift/add steps would stretch the clock cycle and so delay every other instruction

Page 12: Example

° Assume 2 ns for instruction/data memory, 1 ns for decode/register read, 2 ns for the ALU, and 1 ns for register write.

° Single-cycle datapath clock period = 8 ns.

° Assume an instruction mix of 24% loads, 12% stores, 44% R-format, 18% branches, and 2% jumps.

° Assuming a variable-cycle datapath, average clock period = 6.3 ns.

° Possible speed-up = 8 / 6.3 = 1.27
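The arithmetic behind these figures can be reproduced with a short Python sketch. The unit delays and the instruction mix come from the slide; which units each instruction class uses (`PATHS`) is an assumption chosen to match the slide's 8 ns and 6.3 ns results, with a jump assumed to need only the instruction fetch. All names are hypothetical.

```python
# Unit delays in ns, from the slide.
UNIT_NS = {"imem": 2, "regread": 1, "alu": 2, "dmem": 2, "regwrite": 1}

# Assumed units on each instruction class's path (not spelled out on the slide).
PATHS = {
    "load": ["imem", "regread", "alu", "dmem", "regwrite"],
    "store": ["imem", "regread", "alu", "dmem"],
    "r_format": ["imem", "regread", "alu", "regwrite"],
    "branch": ["imem", "regread", "alu"],
    "jump": ["imem"],
}

# Instruction mix in percent, from the slide.
MIX_PERCENT = {"load": 24, "store": 12, "r_format": 44, "branch": 18, "jump": 2}

latency = {kind: sum(UNIT_NS[u] for u in path) for kind, path in PATHS.items()}
single_cycle_ns = max(latency.values())  # clock must cover the slowest class
average_ns = sum(MIX_PERCENT[k] * latency[k] for k in PATHS) / 100
speedup = single_cycle_ns / average_ns
```

Under these assumptions the weighted average is 6.34 ns, which the slide rounds to 6.3 ns; dividing 8 ns by the rounded value gives the quoted 1.27, while the unrounded ratio is about 1.26.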

Page 13: Multicycle Implementation (MIPS-lite v.2)

° Want a more efficient implementation

° Each step will take one clock cycle (not each instruction) [CPI > 1]

• Shorter clock cycle: cycle time is constrained by the longest step, not the longest instruction

° Simpler instructions take fewer cycles

• Higher overall performance

° More complex control: finite state machine

° Versatile (can be extended with new instructions: add3, swap, etc.)

Page 14: Recap: Clocking: single-cycle vs. multicycle

[Figure: clock timelines for the sequence add $t0,$t1,$t2; beq $t0,$t1,L under the single-cycle implementation and under the multicycle implementation. Idle portions of a clock period are labeled "waste".]

• Multicycle implementation: less waste = higher performance

Page 15: Recap: How fast can we run the clock?

° Depends on how much we want done per clock cycle

• Can do: several “inexpensive” datapath operations per clock

- simple gates (AND, OR, …)

- single datapath registers (PC)

- sign extender, left shifter, multiplexor

• PLUS: exactly one “expensive” datapath operation per clock

- ALU operation

- Register File access (2 reads, or 1 write)

- Memory access (read or write)

Page 16: Multicycle Datapath (overview)

[Figure: MIPS-lite, multicycle version. PC supplies the Address to a single Memory (instruction or data); the memory output feeds the Instruction Register and the Memory Data Register; Registers (ReadReg1, ReadReg2, WriteReg, write Data, Read data 1, Read data 2) feed temporary registers A and B; the ALU result is held in ALUOut.]

• One ALU (no extra adders)

• One Memory (no separate Imem, Dmem)

• New temporary registers (“clocked”/require clock input)

Page 17: Multicycle Implementation

° Datapath changes

• one memory: both instructions and data (because they can be accessed on separate steps)

• one ALU (eliminate extra adders)

• extra “invisible” registers to capture intermediate (per-step) datapath results

° Controller changes

• controller must fire control lines in the correct sequence and at the correct time

• controller must remember the current execution step and advance to the next step
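A controller that remembers the current step and advances to the next is a finite state machine. The Python sketch below is a minimal illustration of the next-state function only (no control outputs). The state names are hypothetical, and the step counts it produces (3 for beq, 4 for R-type and sw, 5 for lw) are the ones these slides give for multicycle execution.

```python
def next_state(state, kind):
    """Next-state function of a hypothetical multicycle controller.

    kind is one of "beq", "r_type", "sw", "lw".
    """
    if state == "fetch":
        return "decode"
    if state == "decode":
        return "execute"
    if state == "execute":
        # beq finishes here; everything else needs at least one more step.
        return "fetch" if kind == "beq" else "memory_or_complete"
    if state == "memory_or_complete":
        # Only lw needs a fifth step to write the loaded data to a register.
        return "load_writeback" if kind == "lw" else "fetch"
    if state == "load_writeback":
        return "fetch"
    raise ValueError(f"unknown state: {state}")


def trace(kind):
    """List the states visited while executing one instruction of this kind."""
    states = ["fetch"]
    state = next_state("fetch", kind)
    while state != "fetch":
        states.append(state)
        state = next_state(state, kind)
    return states
```

`len(trace(kind))` is the number of clock cycles the instruction takes in this model.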

Page 18: Multicycle Datapath: Add Multiplexors

[Figure: multicycle datapath with multiplexors added. Components: PC, Mem (Address, ReadData, WriteData), IR, MDR, Regs (ReadReg1, ReadReg2, WriteReg, WriteData, Readdata1, Readdata2), sign extender, <<2 shifter, A, B, ALU (zero output), ALUOut. IR fields: 25:21, 20:16, 15:11, 15:0. The second ALU operand comes from a 4-input mux (inputs 0 to 3), one of whose inputs is the constant 4.]

Note the inputs to the multiplexors

Page 19: Datapath + Control Points

[Figure: the multicycle datapath with its multiplexors, now with control signals attached: IorD, MemRead, MemWrite, IRWrite, RegDst, RegWrite, ALUSrcA, ALUSrcB (2 bits), MemtoReg, ALUOp (2 bits), PCSrc, PCWrite, PCWriteCond. ALU Control takes ALUOp and the funct field (IR 5:0) and drives the 3-bit ALU operation input.]

Page 20: Multicycle Instruction Execution

° All instructions execute in 3-5 cycles

• 3 cycles: beq

• 4 cycles: R-type, sw

• 5 cycles: lw

° 1: fetch instruction, PC = PC + 4

° 2: decode, fetch registers, compute branch target

° 3: execute / compute address / complete branch

° 4: access memory / complete R-type

° 5: (lw) write loaded data to register
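With these cycle counts, an average CPI follows from an instruction mix. The Python sketch below reuses the mix from the earlier example slide (24% loads, 12% stores, 44% R-format, 18% branches, 2% jumps); the 3-cycle count for jumps is an assumption, since this slide does not list jumps, and the names are hypothetical.

```python
# Cycles per instruction class: lw, sw, R-type, beq from the slide; jump is assumed.
CYCLES = {"load": 5, "store": 4, "r_format": 4, "branch": 3, "jump": 3}

# Instruction mix in percent, taken from the earlier example slide.
MIX_PERCENT = {"load": 24, "store": 12, "r_format": 44, "branch": 18, "jump": 2}


def average_cpi(cycles, mix_percent):
    """Weighted average of cycle counts, with weights given in percent."""
    return sum(mix_percent[k] * cycles[k] for k in cycles) / 100


cpi = average_cpi(CYCLES, MIX_PERCENT)  # 4.04 under the assumptions above
```

A CPI above 1 is not a loss by itself: each of these cycles is much shorter than the single-cycle clock period, since it only has to cover the longest step.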

Page 21: Cycle 1 Datapath: IR = Mem[PC]; PC = PC + 4

[Figure: multicycle datapath with the paths used in this cycle highlighted.]

Page 22: Cycle 2: A = Reg[IR25:21]; B = Reg[IR20:16]; ALUOut = PC + (sgn-ext(IR15:0) << 2)

[Figure: multicycle datapath with the paths used in this cycle highlighted.]
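The branch-target computation in this cycle can be sketched in Python as follows. The function names are hypothetical, and the PC value passed in is assumed to be the one already incremented by 4 in cycle 1.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc_plus_4, ir):
    """ALUOut = PC + (sgn-ext(IR15:0) << 2), wrapped to 32 bits."""
    offset = sign_extend16(ir & 0xFFFF)
    return (pc_plus_4 + (offset << 2)) & 0xFFFFFFFF
```

For instance, an immediate field of 0xFFFF is the offset -1, so with the incremented PC at 0x1004 the target is 0x1000.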

Page 23: Cycle 3 R-format: ALUOut = A op B

[Figure: multicycle datapath with the paths used in this cycle highlighted.]

Page 24: Cycle 4 R-format: Reg[IR15:11] = ALUOut

[Figure: multicycle datapath with the paths used in this cycle highlighted.]

• How many times is the ALU used?

Page 25: Cycle 3 beq: if (A == B) PC = ALUOut

[Figure: multicycle datapath with the paths used in this cycle highlighted.]

Page 26: Cycle 3 lw: ALUOut = A + sgn-ext(IR15:0)

[Figure: multicycle datapath with control points. Settings marked for this cycle: ALUSrcA = 1, ALUSrcB = 2, ALUOp = 0; IorD = x, RegDst = x, MemtoReg = x, PCSrc = x (x = don't care).]

Page 27: Cycle 4 lw: MDR = Mem[ALUOut]

[Figure: multicycle datapath with control points. Settings marked for this cycle: IorD = 1; RegDst = x, ALUSrcA = x, ALUSrcB = x, MemtoReg = x, ALUOp = x, PCSrc = x (x = don't care).]

Page 28: Cycle 5 lw: Reg[IR20:16] = MDR

[Figure: multicycle datapath with control points. Settings marked for this cycle: RegDst = 0 (destination is the rt field, IR20:16), MemtoReg = 1; IorD = x, ALUSrcA = x, ALUSrcB = x, ALUOp = x, PCSrc = x (x = don't care).]

Page 29: Cycle 4 (sw): Mem[ALUOut] = B

[Figure: multicycle datapath with control points. Setting marked for this cycle: IorD = 1; the other control signals are shown without values.]
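The cycle-by-cycle register transfers on the preceding pages can be strung together in a small Python model. This is a sketch under stated assumptions, not the slides' implementation: opcodes are the standard MIPS encodings, the only R-format operation modeled is add, memory is a dictionary from byte address to word, lw writes the rt field (IR20:16) as RegDst = 0 implies, and all names are hypothetical.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def encode_i(op, rs, rt, imm):
    """Pack an I-format instruction word (helper for building test programs)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def step_instruction(pc, regs, mem):
    """Execute one instruction; return (new_pc, cycles_used)."""
    ir = mem[pc]                           # cycle 1: IR = Mem[PC]
    pc = pc + 4                            #          PC = PC + 4
    op = (ir >> 26) & 0x3F
    rs, rt, rd = (ir >> 21) & 31, (ir >> 16) & 31, (ir >> 11) & 31
    imm = sign_extend16(ir)
    a, b = regs[rs], regs[rt]              # cycle 2: A, B = Reg[rs], Reg[rt]
    alu_out = pc + (imm << 2)              #          ALUOut = branch target
    if op == 0x04:                         # beq (assumed opcode)
        if a == b:                         # cycle 3: if (A == B) PC = ALUOut
            pc = alu_out
        return pc, 3
    if op == 0x00:                         # R-format, modeled as add only
        alu_out = (a + b) & 0xFFFFFFFF     # cycle 3: ALUOut = A op B
        regs[rd] = alu_out                 # cycle 4: Reg[IR15:11] = ALUOut
        return pc, 4
    if op in (0x23, 0x2B):                 # lw / sw (assumed opcodes)
        alu_out = (a + imm) & 0xFFFFFFFF   # cycle 3: ALUOut = A + sgn-ext(imm)
        if op == 0x2B:
            mem[alu_out] = b               # cycle 4: Mem[ALUOut] = B
            return pc, 4
        mdr = mem[alu_out]                 # cycle 4: MDR = Mem[ALUOut]
        regs[rt] = mdr                     # cycle 5: Reg[IR20:16] = MDR
        return pc, 5
    raise ValueError(f"unsupported opcode: {op:#x}")
```

Calling `step_instruction` repeatedly, feeding each returned PC back in, walks through a program and reports 3, 4 or 5 cycles per instruction, matching the counts on the multicycle execution slide.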