cs 152 computer architecture and engineering lecture 8 single-cycle (con’t) designing a multicycle...

51
CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (www.cs.berkeley.edu/~kubitron) lecture slides: http://inst.eecs.berkeley.edu/~cs152/

Post on 21-Dec-2015

216 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

CS 152Computer Architecture and Engineering

Lecture 8

Single-Cycle (Con’t)Designing a Multicycle Processor

February 23, 2004

John Kubiatowicz (www.cs.berkeley.edu/~kubitron)

lecture slides: http://inst.eecs.berkeley.edu/~cs152/

Page 2: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.2

Recap: A Single Cycle Datapath

° Rs, Rt, Rd and Imed16 hardwired into datapath from Fetch Unit° We have everything except control signals (underline)

32

ALUctr

Clk

busW

RegWr

32

32

busA

32

busB

55 5

Rw Ra Rb

32 32-bitRegisters

Rs

Rt

Rt

Rd

RegDst

Exten

der

Mu

x

Mux

3216imm16

ALUSrc

ExtOp

Mu

x

MemtoReg

Clk

Data InWrEn

32

Adr

DataMemory

32

MemWrA

LU

InstructionFetch Unit

Clk

Equal

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

nPC_sel

Page 3: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.3

Recap: Flexible Instruction Fetch° Branch (nPC_sel = “Br”):

• if (Equal == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4° Other (nPC_sel = “+4”):

• PC=PC+4

° What is encoding of nPC_sel?• Direct MUX select?

• Branch / not branch

° Let’s choose second option

nPC_sel Equal MUX0 x 01 0 01 1 1

Adr

InstMemory

Ad

der

Ad

der

PC

Clk

00

Mu

x

4

nPC_sel

imm

16

Instruction<31:0>

0

1

Equal

nPC_MUX_sel

Page 4: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.4

Recap: The Single Cycle Datapath during Add

32

ALUctr = Add

Clk

busW

RegWr = 1

32

32

busA

32

busB

55 5

Rw Ra Rb

32 32-bitRegisters

Rs

Rt

Rt

RdRegDst = 1

Exten

der

Mu

x

Mux

3216imm16

ALUSrc = 0

ExtOp = x

Mu

x

MemtoReg = 0

Clk

Data InWrEn

32

Adr

DataMemory

32

MemWr = 0A

LU

InstructionFetch Unit

Clk

Equal

Instruction<31:0>° R[rd] <- R[rs] + R[rt]

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

op rs rt rd shamt funct

061116212631

nPC_sel= +4

Page 5: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.6

Recap: The Single Cycle Datapath during Load

32

ALUctr = Add

Clk

busW

RegWr = 1

32

32

busA

32

busB

55 5

Rw Ra Rb

32 32-bitRegisters

Rs

Rt

Rt

RdRegDst = 0

Exten

der

Mu

x

Mux

3216imm16

ALUSrc = 1

ExtOp = 1

Mu

x

MemtoReg = 1

Clk

Data InWrEn

32

Adr

DataMemory

32

MemWr = 0A

LU

InstructionFetch Unit

Clk

Equal

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

° R[rt] <- Data Memory {R[rs] + SignExt[imm16]}

op rs rt immediate

016212631

nPC_sel= +4

Page 6: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.7

Recap: The Single Cycle Datapath during Store

32

ALUctr = Add

Clk

busW

RegWr = 0

32

32

busA

32

busB

55 5

Rw Ra Rb

32 32-bitRegisters

Rs

Rt

Rt

RdRegDst = x

Exten

der

Mu

x

Mux

3216imm16

ALUSrc = 1

ExtOp = 1

Mu

x

MemtoReg = x

Clk

Data InWrEn

32Adr

DataMemory

32

MemWr = 1A

LU

InstructionFetch Unit

Clk

Equal

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

° Data Memory {R[rs] + SignExt[imm16]} <- R[rt]

op rs rt immediate

016212631

nPC_sel= +4

Page 7: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.8

Recap: The Single Cycle Datapath during Branch

32

ALUctr =Sub

Clk

busW

RegWr = 0

32

32

busA

32

busB

55 5

Rw Ra Rb

32 32-bitRegisters

Rs

Rt

Rt

RdRegDst = x

Exten

der

Mu

x

Mux

3216imm16

ALUSrc = 0

ExtOp = x

Mu

x

MemtoReg = x

Clk

Data InWrEn

32

Adr

DataMemory

32

MemWr = 0A

LU

InstructionFetch Unit

Clk

Equal

Instruction<31:0>

0

1

0

1

01<

21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

° if (R[rs] - R[rt] == 0) then Zero <- 1 ; else Zero <- 0

op rs rt immediate

016212631

nPC_sel= “Br”

Page 8: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.9

Recap: A Summary of Control Signals

inst Register Transfer

ADD R[rd] <– R[rs] + R[rt]; PC <– PC + 4

ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”

SUB R[rd] <– R[rs] – R[rt]; PC <– PC + 4

ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”

ORi R[rt] <– R[rs] + zero_ext(Imm16); PC <– PC + 4

ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel = “+4”

LOAD R[rt] <– MEM[ R[rs] + sign_ext(Imm16)]; PC <– PC + 4

ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”

STORE MEM[ R[rs] + sign_ext(Imm16)] <– R[rs]; PC <– PC + 4

ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”

BEQ if ( R[rs] == R[rt] ) then PC <– PC + sign_ext(Imm16)] || 00 else PC <– PC + 4

nPC_sel = “Br”, ALUctr = “sub”

Page 9: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.10

Step 5: Assemble Control logic

ALUctrRegDst ALUSrcExtOp MemtoRegMemWr Equal

Instruction<31:0>

<21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

nPC_sel

Adr

InstMemory

DATA PATH

Decoder

Op

<21:25>

Fun

RegWr

Page 10: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.11

A Summary of the Control Signals

add sub ori lw sw beq

RegDst

ALUSrc

MemtoReg

RegWrite

MemWrite

nPCsel

ExtOp

ALUctr<2:0>

1

0

0

1

0

0

x

Add

1

0

0

1

0

0

x

Subtract

0

1

0

1

0

0

0

Or

0

1

1

1

0

0

1

Add

x

1

x

0

1

0

1

Add

x

0

x

0

0

1

x

Subtract

op target address

op rs rt rd shamt funct

061116212631

op rs rt immediate

R-type

I-type

J-type

add, sub

ori, lw, sw, beq

jump

func

op 00 0000 00 0000 00 1101 10 0011 10 1011 00 0100Appendix A10 0000See 10 0010 We Don’t Care :-)

Page 11: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.12

The Concept of Local Decoding

MainControl

op

6

ALUControl(Local)

func

N

6ALUop

ALUctr

3

AL

U

R-type ori lw sw beq

RegDst

ALUSrc

MemtoReg

RegWrite

MemWrite

Branch

ExtOp

ALUop<N:0>

1

0

0

1

0

0

x

“R-type”

0

1

0

1

0

0

0

Or

0

1

1

1

0

0

1

Add

x

1

x

0

1

0

1

Add

x

0

x

0

0

1

x

Subtract

op 00 0000 00 1101 10 0011 10 1011 00 0100

Page 12: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.13

The Encoding of ALUop

° ALUop has to be 2 bits wide to represent:• (1) “R-type” instructions

• “I-type” instructions that require the ALU to perform:

- (2) Or, (3) Add, and (4) Subtract

MainControl

op

6

ALUControl(Local)

func

N

6ALUop

ALUctr

3

R-type ori lw sw beq

ALUop (Symbolic) “R-type” Or Add Add Subtract

ALUop<2:0> 1 00 0 10 0 00 0 00 0 01

Page 13: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.14

The Decoding of the “func” Field

MainControl

op

6

ALUControl(Local)

func

N

6ALUop

ALUctr

3

op rs rt rd shamt funct

061116212631

R-type

funct<5:0> Instruction Operation

10 0000

10 0010

10 0100

10 0101

10 1010

add

subtract

and

or

set-on-less-than

ALUctr<2:0> ALU Operation

000

001

010

110

111

And

Or

Add

Subtract

Set-on-less-than

P. 286 text:

ALUctr

AL

U

R-type ori lw sw beq

ALUop (Symbolic) “R-type” Or Add Add Subtract

ALUop<2:0> 1 00 0 10 0 00 0 00 0 01

Page 14: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.15

The Truth Table for ALUctr

R-type ori lw sw beqALUop(Symbolic) “R-type” Or Add Add Subtract

ALUop<2:0> 1 00 0 10 0 00 0 00 0 01

ALUop func

bit<2> bit<1> bit<0> bit<2> bit<1> bit<0>bit<3>

0 0 0 x x x x

ALUctrALUOperation

Add 0 1 0

bit<2> bit<1> bit<0>

0 x 1 x x x x Subtract 1 1 0

0 1 x x x x x Or 0 0 1

1 x x 0 0 0 0 Add 0 1 0

1 x x 0 0 1 0 Subtract 1 1 0

1 x x 0 1 0 0 And 0 0 0

1 x x 0 1 0 1 Or 0 0 1

1 x x 1 0 1 0 Set on < 1 1 1

funct<3:0> Instruction Op.

0000

0010

0100

0101

1010

add

subtract

and

or

set-on-less-than

Page 15: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.16

Step 5: Logic for each control signal

`define Rtype 6`b000000;`define BEQ 6`b000100;`define Ori 6`b001101;`define Load 6`b100011;`define Store 6`b101011;

… etc …

nPC_sel <= (OP == `BEQ) ? `Br : `plus4;ALUsrc <= (OP == `Rtype) ? `regB : `immed;ALUctr <= (OP == `Rtype`) ? funct : (OP == `ORi) ? `ORfunction : (OP == `BEQ) ? `SUBfunction : `ADDfunction;

ExtOp <= (OP == `ORi) : `ZEROextend : `SIGNextend;MemWr <= (OP == `Store) ? 1 : 0;MemtoReg<= (OP == `Load) ? 1 : 0;RegWr: <= ((OP == `Store) || (OP == `BEQ)) ? 0 : 1;RegDst: <= ((OP == `Load) || (OP == `ORi)) ? 0 : 1;

Page 16: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.17

The “Truth Table” for the Main Control

R-type ori lw sw beq

RegDst

ALUSrc

MemtoReg

RegWrite

MemWrite

nPC_sel

Jump

ExtOp

ALUop (Symbolic)

1

0

0

1

0

0

0

x

“R-type”

0

1

0

1

0

0

0

0

Or

0

1

1

1

0

0

0

1

Add

x

1

x

0

1

0

0

1

Add

x

0

x

0

0

1

0

x

Subtract

op 00 0000 00 1101 10 0011 10 1011 00 0100

ALUop <2> 1 0 0 0 0ALUop <1> 0 1 0 0 0ALUop <0> 0 0 0 0 1

MainControl

op

6

ALUControl(Local)

func

3

6

ALUop

ALUctr

3

RegDst

ALUSrc

:

Page 17: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.18

The “Truth Table” for RegWrite

R-type ori lw sw beq jump

RegWrite 1 1 1 0 0 0

op 00 0000 00 1101 10 0011 10 1011 00 0100 00 0010

° RegWrite = R-type + ori + lw= !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0> (R-type)

+ !op<5> & !op<4> & op<3> & op<2> & !op<1> & op<0> (ori)

+ op<5> & !op<4> & !op<3> & !op<2> & op<1> & op<0> (lw)

op<0>

op<5>. .op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

R-type ori lw sw beq jump

RegWrite

Page 18: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.19

PLA Implementation of the Main Control

op<0>

op<5>. .op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

R-type ori lw sw beq jumpRegWrite

ALUSrc

MemtoReg

MemWrite

Branch

Jump

RegDst

ExtOp

ALUop<2>

ALUop<1>

ALUop<0>

Page 19: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.20

Administrative Issues

° Read Chapter 5° This lecture and next one slightly different from the book

° Design Document for lab 3 due in section this Thursday!• Describe your division of labor• Your testing methodology (how will you test each step of the

way?)• Top-level block diagrams

° Midterm on Wednesday 3/10• 5:30pm to 8:30pm, location TBA• No class on that day:• Pencil, calculator, one 8.5” x 11” (both sides) of

handwritten notes° Meet at LaVal’s pizza after the midterm

Page 20: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.21

The Big Picture: Where are We Now?

° The Five Classic Components of a Computer

° Today’s Topic: Designing the Datapath for the Multiple Clock Cycle Datapath

Control

Datapath

Memory

Processor

Input

Output

Page 21: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.22

Abstract View of our single cycle processor

° looks like a FSM with PC as state

PC

Nex

t P

C

Reg

iste

rF

etch ALU Reg

. W

rt

Mem

Acc

ess

Dat

aM

emInst

ruct

ion

Fet

ch

Res

ult

Sto

re

AL

Uct

r

Reg

Dst

AL

US

rc

Ext

Op

Mem

Wr

Eq

ual

nPC

_sel

Reg

Wr

Mem

Wr

Mem

Rd

MainControl

ALUcontrol

op

fun

Ext

Page 22: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.23

What’s wrong with our CPI=1 processor?

° Long Cycle Time

° All instructions take as much time as the slowest

° Real memory is not as nice as our idealized memory• cannot always get the job done in one (short) cycle

PC Inst Memory mux ALU Data Mem mux

PC Reg FileInst Memory mux ALU mux

PC Inst Memory mux ALU Data Mem

PC Inst Memory cmp mux

Reg File

Reg File

Reg File

Arithmetic & Logical

Load

Store

Branch

Critical Path

setup

setup

Page 23: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.24

Memory Access Time

° Physics => fast memories are small (large memories are slow)

• question: register file vs. memory

° => Use a hierarchy of memories

Storage Array

selected word line

addressstorage cell

bit line

sense ampsaddressdecoder

CacheProcessor

1 time-period

proc

. bu

s

L2Cache

mem

. bu

s

2-3 time-periods20 - 50 time-periods

memory

Page 24: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.25

Reducing Cycle Time

° Cut combinational dependency graph and insert register / latch

° Do same work in two fast cycles, rather than one slow one

° May be able to short-circuit path and remove some components for some instructions!

storage element

Acyclic CombinationalLogic

storage element

storage element

Acyclic CombinationalLogic (A)

storage element

storage element

Acyclic CombinationalLogic (B)

Page 25: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.26

Worst Case Timing (Load)

Clk

PC

Rs, Rt, Rd,Op, Func

Clk-to-Q

ALUctr

Instruction Memoey Access Time

Old Value New Value

RegWr Old Value New Value

Delay through Control Logic

busA

Register File Access Time

Old Value New Value

busB

ALU Delay

Old Value New Value

Old Value New Value

New ValueOld Value

ExtOp Old Value New Value

ALUSrc Old Value New Value

MemtoReg Old Value New Value

Address Old Value New Value

busW Old Value New

Delay through Extender & Mux

RegisterWrite Occurs

Data Memory Access Time

Page 26: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.27

Basic Limits on Cycle Time

° Next address logic• PC <= branch ? PC + offset : PC + 4

° Instruction Fetch• InstructionReg <= Mem[PC]

° Register Access• A <= R[rs]

° ALU operation• R <= A + B

PC

Nex

t P

C

Ope

rand

Fet

ch Exec Reg

. F

ile

Mem

Acc

ess

Dat

aM

em

Inst

ruct

ion

Fet

ch

Res

ult

Sto

re

AL

Uct

r

Reg

Dst

AL

US

rc

Ext

Op

Mem

Wr

nPC

_sel

Reg

Wr

Mem

Wr

Mem

Rd

Control

Page 27: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.28

Partitioning the CPI=1 Datapath

° Add registers between smallest steps

° Place enables on all registers

PC

Nex

t P

C

Ope

rand

Fet

ch Exec Reg

. F

ile

Mem

Acc

ess

Dat

aM

em

Inst

ruct

ion

Fet

ch

Res

ult

Sto

re

AL

Uct

r

Reg

Dst

AL

US

rc

Ext

Op

Mem

Wr

nPC

_sel

Reg

Wr

Mem

Wr

Mem

Rd

Equ

al

Page 28: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.29

Example Multicycle Datapath

° Critical Path ?

PC

Nex

t P

C

Ope

rand

Fet

ch

Inst

ruct

ion

Fet

ch

nPC

_sel

IRRegFile E

xtA

LU Reg

. F

ile

Mem

Acc

ess

Dat

aM

em

Res

ult

Sto

reR

egD

stR

egW

r

Mem

Wr

Mem

Rd

S

M

Mem

ToR

eg

Equ

al

AL

Uct

rA

LU

Src

Ext

Op

A

B

E

Page 29: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.30

Recall: Step-by-step Processor Design

Step 1: ISA => Logical Register Transfers

Step 2: Components of the Datapath

Step 3: RTL + Components => Datapath

Step 4: Datapath + Logical RTs => Physical RTs

Step 5: Physical RTs => Control

Page 30: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.31

Step 4: R-rtype (add, sub, . . .)

° Logical Register Transfer

° Physical Register Transfers

inst Logical Register Transfers

ADDU R[rd] <– R[rs] + R[rt]; PC <– PC + 4

inst Physical Register Transfers

IR <– MEM[pc]

ADDU A<– R[rs]; B <– R[rt]

S <– A + B

R[rd] <– S; PC <– PC + 4

Exe

c

Reg

. F

ile

Mem

Acc

ess

Dat

aM

em

S

M

Reg

File

PC

Nex

t P

C

IR

Inst

. M

em

Tim

e

A

B

E

Page 31: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.32

Step 4: Logical immed

° Logical Register Transfer

° Physical Register Transfers

inst Logical Register Transfers

ORI R[rt] <– R[rs] OR ZExt(Im16); PC <– PC + 4

inst Physical Register Transfers

IR <– MEM[pc]

ORI A<– R[rs]; B <– R[rt]

S <– A or ZExt(Im16)

R[rt] <– S; PC <– PC + 4

Exe

c

Reg

. F

ile

Mem

Acc

ess

Dat

aM

em

S

M

Reg

File

PC

Nex

t P

C

IR

Inst

. M

em

Tim

e

A

B

E

Page 32: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.33

Step 4 : Load

° Logical Register Transfer

° Physical Register Transfers

inst Logical Register Transfers

LW R[rt] <– MEM[R[rs] + SExt(Im16)];

PC <– PC + 4

inst Physical Register Transfers

IR <– MEM[pc]

LW A<– R[rs]; B <– R[rt]

S <– A + SExt(Im16)

M <– MEM[S]

R[rd] <– M; PC <– PC + 4

Exe

c

Reg

. F

ile

Mem

Acc

ess

Dat

aM

em

S

M

Reg

File

PC

Nex

t P

C

IR

Inst

. M

em A

B

E

Tim

e

Page 33: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.34

Step 4 : Store

° Logical Register Transfer

° Physical Register Transfers

inst Logical Register Transfers

SW MEM[R[rs] + SExt(Im16)] <– R[rt];

PC <– PC + 4

inst Physical Register Transfers

IR <– MEM[pc]

SW A<– R[rs]; B <– R[rt]

S <– A + SExt(Im16);

MEM[S] <– B PC <– PC + 4

Exe

c

Reg

. F

ile

Mem

Acc

ess

Dat

aM

em

S

M

Reg

File

PC

Nex

t P

C

IR

Inst

. M

em A

B

E

Tim

e

Page 34: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.35

Step 4 : Branch° Logical Register Transfer

° Physical Register Transfers

inst Logical Register Transfers

BEQ if R[rs] == R[rt]

then PC <= PC + 4+SExt(Im16) || 00

else PC <= PC + 4

Exe

c

Reg

. F

ile

Mem

Acc

ess

Dat

aM

em

S

M

Reg

File

PC

Nex

t P

C

IR

Inst

. M

eminst Physical Register Transfers

IR <– MEM[pc]

BEQ E<– (R[rs] = R[rt])

if !E then PC <– PC + 4 else PC <–PC+4+SExt(Im16)||00

A

B

ET

ime

Page 35: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.36

Alternative datapath (book): Multiple Cycle Datapath

° Miminizes Hardware: 1 memory, 1 adder

IdealMemoryWrAdrDin

RAdr

32

32

32Dout

MemWr

32

AL

U

3232

ALUOp

ALUControl

Instru

ction R

eg

32

IRWr

32

Reg File

Ra

Rw

busW

Rb5

5

32busA

32busB

RegWr

Rs

Rt

Mu

x

0

1

Rt

Rd

PCWr

ALUSelA

Mux 01

RegDst

Mu

x

0

1

32

PC

MemtoReg

Extend

ExtOp

Mu

x

0

132

0

1

23

4

16Imm 32

<< 2

ALUSelB

Mu

x1

0

Target32

Zero

ZeroPCWrCond PCSrc BrWr

32

IorD

AL

U O

ut

Page 36: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.37

Our Control Model

° State specifies control points for Register Transfer

° Transfer occurs upon exiting state (same falling edge)

Control State

Next StateLogic

Output Logic

inputs (conditions)

outputs (control points)

State X

Register TransferControl Points

Depends on Input

Page 37: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.38

Step 4 Control Specification for multicycle proc

IR <= MEM[PC]

R-type

A <= R[rs]B <= R[rt]

S <= A fun B

R[rd] <= SPC <= PC + 4

S <= A or ZX

R[rt] <= SPC <= PC + 4

ORi

S <= A + SX

R[rt] <= MPC <= PC + 4

M <= MEM[S]

LW

S <= A + SX

MEM[S] <= BPC <= PC + 4

BEQPC <= Next(PC,Equal)

SW

“instruction fetch”

“decode / operand fetch”

Exe

cute

Mem

ory

Writ

e-ba

ck

Page 38: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.39

Traditional FSM Controller

State

6

4

11nextState

op

Equal

control points

state op condnextstate control points

Truth Table

datapath State

Page 39: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.40

Step 5 (datapath + state diagram control)

° Translate RTs into control points

° Assign states

° Then go build the controller

Page 40: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.41

Mapping RTs to Control Points

IR <= MEM[PC]

R-type

A <= R[rs]B <= R[rt]

S <= A fun B

R[rd] <= SPC <= PC + 4

S <= A or ZX

R[rt] <= SPC <= PC + 4

ORi

S <= A + SX

R[rt] <= MPC <= PC + 4

M <= MEM[S]

LW

S <= A + SX

MEM[S] <= BPC <= PC + 4

BEQ

PC <= Next(PC,Equal)

SW

“instruction fetch”

“decode”

imem_rd, IRen

ALUfun, Sen

RegDst, RegWr,PCen

Aen, Ben,Een

Exe

cute

Mem

ory

Writ

e-ba

ck

Page 41: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.42

Assigning States

IR <= MEM[PC]

R-type

A <= R[rs]B <= R[rt]

S <= A fun B

R[rd] <= SPC <= PC + 4

S <= A or ZX

R[rt] <= SPC <= PC + 4

ORi

S <= A + SX

R[rt] <= MPC <= PC + 4

M <= MEM[S]

LW

S <= A + SX

MEM[S] <= BPC <= PC + 4

BEQ

PC <= Next(PC)

SW

“instruction fetch”

“decode”

0000

0001

0100

0101

0110

0111

1000

1001

1010

00111011

1100

Exe

cute

Mem

ory

Writ

e-ba

ck

Page 42: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.43

(Mostly) Detailed Control Specification (missing0)

0000 ?????? ? 0001 10001 BEQ x 0011 1 1 1 0001 R-type x 0100 1 1 1 0001 ORI x 0110 1 1 10001 LW x 1000 1 1 10001 SW x 1011 1 1 1

0011 xxxxxx 0 0000 1 0 x 0 x0011 xxxxxx 1 0000 1 1 x 0 x0100 xxxxxx x 0101 0 1 fun 10101 xxxxxx x 0000 1 0 0 1 10110 xxxxxx x 0111 0 0 or 10111 xxxxxx x 0000 1 0 0 1 01000 xxxxxx x 1001 1 0 add 11001 xxxxxx x 1010 1 0 11010 xxxxxx x 0000 1 0 1 1 01011 xxxxxx x 1100 1 0 add 11100 xxxxxx x 0000 1 0 0 1 0

State Op field Eq Next IR PC Ops Exec Mem Write-Backen sel A B E Ex Sr ALU S R W M M-R Wr Dst

R:

ORi:

LW:

SW:

-all same in Moore machine

BEQ:

Page 43: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.44

Performance Evaluation

° What is the average CPI?• state diagram gives CPI for each instruction type

• workload gives frequency of each type

Type CPIi for type Frequency CPIi x freqIi

Arith/Logic 4 40% 1.6

Load 5 30% 1.5

Store 4 10% 0.4

branch 3 20% 0.6

Average CPI:4.1

Page 44: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.45

Controller Design

° The state digrams that arise define the controller for an instruction set processor are highly structured

° Use this structure to construct a simple “microsequencer”

° Control reduces to programming this very simple device microprogramming

sequencercontrol

datapath control

micro-PCsequencer

microinstruction

Page 45: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.46

Example: Jump-Counter

op-codeMap ROM

Counterzeroincload

0000i

i+1

i

None of above: Do nothing (for wait states)

Page 46: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.47

Using a Jump Counter

IR <= MEM[PC]

R-type

A <= R[rs]B <= R[rt]

S <= A fun B

R[rd] <= SPC <= PC + 4

S <= A or ZX

R[rt] <= SPC <= PC + 4

ORi

S <= A + SX

R[rt] <= MPC <= PC + 4

M <= MEM[S]

LW

S <= A + SX

MEM[S] <= BPC <= PC + 4

BEQ

PC <= Next(PC)

SW

“instruction fetch”

“decode”

0000

0001

0100

0101

0110

0111

1000

1001

1010

00111011

1100

inc

load

zero zerozero

zero

zeroinc inc inc inc

inc

Exe

cute

Mem

ory

Writ

e-ba

ck

Page 47: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.48

Our Microsequencer

op-code

Map ROM

Micro-PC

Z I Ldatapath control

taken

Page 48: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.49

Microprogram Control Specification

0000 ? inc 10001 0 load 1 1

0011 0 zero 1 00011 1 zero 1 10100 x inc 0 1 fun 10101 x zero 1 0 0 1 10110 x inc 0 0 or 10111 x zero 1 0 0 1 01000 x inc 1 0 add 11001 x inc 1 0 11010 x zero 1 0 1 1 01011 x inc 1 0 add 11100 x zero 1 0 0 1 0

µPC Taken Next IR PC Ops Exec Mem Write-Backen sel A B Ex Sr ALU S R W M M-R Wr Dst

R:

ORi:

LW:

SW:

BEQ

Page 49: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.50

Overview of Control

° Control may be designed using one of several initial representations. The choice of sequence control, and how logic is represented, can then be determined independently; the control can then be implemented with one of several methods using a structured logic technique.

Initial Representation Finite State Diagram Microprogram

Sequencing Control Explicit Next State Microprogram counter Function + Dispatch ROMs

Logic Representation Logic Equations Truth Tables

Implementation PLA ROM Technique

“hardwired control” “microprogrammed control”

Page 50: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.51

Summary

° Disadvantages of the Single Cycle Processor• Long cycle time

• Cycle time is too long for all instructions except the Load

° Multiple Cycle Processor:• Divide the instructions into smaller steps

• Execute each step (instead of the entire instruction) in one cycle

° Partition datapath into equal size chunks to minimize cycle time

• ~10 levels of logic between latches

° Follow same 5-step method for designing “real” processor

Page 51: CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (kubitron)

2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz

Lec8.52

Summary (cont’d)° Control is specified by finite state digram

° Specialize state-diagrams easily captured by microsequencer

• simple increment & “branch” fields

• datapath control fields

° Control design reduces to Microprogramming

° Control is more complicated with:• complex instruction sets

• restricted datapaths (see the book)

° Simple Instruction set and powerful datapath simple control

• could try to reduce hardware (see the book)

• rather go for speed => many instructions at once!