CS 152 Computer Architecture and Engineering
Lecture 8
Single-Cycle (Con’t)
Designing a Multicycle Processor
February 23, 2004
John Kubiatowicz (www.cs.berkeley.edu/~kubitron)
lecture slides: http://inst.eecs.berkeley.edu/~cs152/
2/23/04 ©UCB Spring 2004 CS152 / Kubiatowicz
Lec8.2
Recap: A Single Cycle Datapath
° Rs, Rt, Rd and Imm16 hardwired into datapath from Fetch Unit
° We have everything except control signals (underlined)
[Figure: single-cycle datapath. The Instruction Fetch Unit (nPC_sel, Equal, Clk) drives Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15> and Imm16 <0:15>. A register file of 32 32-bit registers (Rw, Ra, Rb; busA, busB, busW; RegWr) feeds the ALU (ALUctr); the Extender (ExtOp) widens imm16 from 16 to 32 bits; Data Memory has Adr, Data In and WrEn (MemWr). Muxes are selected by RegDst, ALUSrc and MemtoReg.]
Recap: Flexible Instruction Fetch
° Branch (nPC_sel = “Br”):
• if (Equal == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4
° Other (nPC_sel = “+4”):
• PC = PC + 4
° What is encoding of nPC_sel?
• Direct MUX select?
• Branch / not branch
° Let’s choose second option
nPC_sel   Equal   MUX
0         x       0
1         0       0
1         1       1
[Figure: instruction fetch unit. PC addresses Inst Memory; one adder computes PC + 4, a second adds imm16 with “00” appended; a mux driven by nPC_MUX_sel, derived from nPC_sel and Equal, selects the next PC.]
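The next-PC selection above can be sketched in Python. This is a minimal model, not the lecture's hardware description: the function names `sign_extend16` and `next_pc` and the 32-bit wrap-around mask are assumptions made for illustration.

```python
def sign_extend16(imm16: int) -> int:
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc: int, imm16: int, npc_sel: int, equal: int) -> int:
    """Next PC for the 'branch / not branch' encoding of nPC_sel.

    The mux select is nPC_sel AND Equal, matching the three-row truth table.
    """
    mux_sel = npc_sel & equal
    pc_plus4 = (pc + 4) & 0xFFFFFFFF
    if mux_sel:
        # PC + 4 + SignExt[imm16] * 4
        return (pc_plus4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    return pc_plus4
```

With these assumptions, a taken branch with imm16 = 0xFFFF (that is, -1) lands back on the branch itself, since PC + 4 - 4 = PC.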
Recap: The Single Cycle Datapath during Add
[Figure: single-cycle datapath with the control settings for add.]
° R[rd] <- R[rs] + R[rt]
R-type format: op <31:26>, rs <25:21>, rt <20:16>, rd <15:11>, shamt <10:6>, funct <5:0>
Control settings: nPC_sel = +4, RegDst = 1, RegWr = 1, ExtOp = x, ALUSrc = 0, ALUctr = Add, MemWr = 0, MemtoReg = 0
Recap: The Single Cycle Datapath during Load
[Figure: single-cycle datapath with the control settings for load.]
° R[rt] <- Data Memory {R[rs] + SignExt[imm16]}
I-type format: op <31:26>, rs <25:21>, rt <20:16>, immediate <15:0>
Control settings: nPC_sel = +4, RegDst = 0, RegWr = 1, ExtOp = 1, ALUSrc = 1, ALUctr = Add, MemWr = 0, MemtoReg = 1
Recap: The Single Cycle Datapath during Store
[Figure: single-cycle datapath with the control settings for store.]
° Data Memory {R[rs] + SignExt[imm16]} <- R[rt]
I-type format: op <31:26>, rs <25:21>, rt <20:16>, immediate <15:0>
Control settings: nPC_sel = +4, RegDst = x, RegWr = 0, ExtOp = 1, ALUSrc = 1, ALUctr = Add, MemWr = 1, MemtoReg = x
Recap: The Single Cycle Datapath during Branch
[Figure: single-cycle datapath with the control settings for branch.]
° if (R[rs] - R[rt] == 0) then Zero <- 1 ; else Zero <- 0
I-type format: op <31:26>, rs <25:21>, rt <20:16>, immediate <15:0>
Control settings: nPC_sel = “Br”, RegDst = x, RegWr = 0, ExtOp = x, ALUSrc = 0, ALUctr = Sub, MemWr = 0, MemtoReg = x
Recap: A Summary of Control Signals
inst    Register Transfer
ADD     R[rd] <– R[rs] + R[rt]; PC <– PC + 4
        ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”
SUB     R[rd] <– R[rs] – R[rt]; PC <– PC + 4
        ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”
ORi     R[rt] <– R[rs] or zero_ext(Imm16); PC <– PC + 4
        ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel = “+4”
LOAD    R[rt] <– MEM[ R[rs] + sign_ext(Imm16)]; PC <– PC + 4
        ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”
STORE   MEM[ R[rs] + sign_ext(Imm16)] <– R[rt]; PC <– PC + 4
        ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”
BEQ     if ( R[rs] == R[rt] ) then PC <– PC + 4 + (sign_ext(Imm16) || 00) else PC <– PC + 4
        nPC_sel = “Br”, ALUctr = “sub”
Step 5: Assemble Control logic
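[Figure: control logic assembled around the datapath. Inst Memory (Adr) supplies Instruction<31:0>; the fields Rs <21:25>, Rt <16:20>, Rd <11:15> and Imm16 <0:15> go to the DATA PATH, while Op and Fun go to the Decoder. The Decoder drives nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr and MemtoReg; the datapath returns Equal.]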
A Summary of the Control Signals
(See Appendix A)
               add       sub       ori       lw        sw        beq
func           10 0000   10 0010   We Don’t Care :-)
op             00 0000   00 0000   00 1101   10 0011   10 1011   00 0100
RegDst         1         1         0         0         x         x
ALUSrc         0         0         1         1         1         0
MemtoReg       0         0         0         1         x         x
RegWrite       1         1         1         1         0         0
MemWrite       0         0         0         0         1         0
nPCsel         0         0         0         0         0         1
ExtOp          x         x         0         1         1         x
ALUctr<2:0>    Add       Subtract  Or        Add       Add       Subtract

Instruction formats:
R-type   op <31:26>, rs <25:21>, rt <20:16>, rd <15:11>, shamt <10:6>, funct <5:0>   (add, sub)
I-type   op <31:26>, rs <25:21>, rt <20:16>, immediate <15:0>   (ori, lw, sw, beq)
J-type   op <31:26>, target address <25:0>   (jump)
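The control-signal summary can be read as a lookup table keyed by instruction. The sketch below transcribes the six columns into Python; the names `FIELDS`, `CONTROL`, `OPCODES`, `FUNCTS` and `decode` are hypothetical, and don't-care entries are kept as the string "x".

```python
# Row order of the table: one entry per control signal.
FIELDS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
          "MemWrite", "nPCsel", "ExtOp", "ALUctr")

# One column of the table per instruction ("x" = don't care).
CONTROL = {
    "add": ("1", "0", "0", "1", "0", "0", "x", "Add"),
    "sub": ("1", "0", "0", "1", "0", "0", "x", "Subtract"),
    "ori": ("0", "1", "0", "1", "0", "0", "0", "Or"),
    "lw":  ("0", "1", "1", "1", "0", "0", "1", "Add"),
    "sw":  ("x", "1", "x", "0", "1", "0", "1", "Add"),
    "beq": ("x", "0", "x", "0", "0", "1", "x", "Subtract"),
}

# op and func encodings listed with the table.
OPCODES = {0b001101: "ori", 0b100011: "lw", 0b101011: "sw", 0b000100: "beq"}
FUNCTS = {0b100000: "add", 0b100010: "sub"}


def decode(op: int, funct: int = 0) -> dict:
    """Return the control signals for an (op, funct) pair as a dict."""
    name = FUNCTS[funct] if op == 0 else OPCODES[op]
    return dict(zip(FIELDS, CONTROL[name]))
```

For example, `decode(0b100011)` returns the lw column, with MemtoReg and RegWrite both "1".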
The Concept of Local Decoding
[Figure: local decoding. Main Control takes op (6 bits) and produces ALUop (N bits); ALU Control (Local) takes func (6 bits) and ALUop and produces ALUctr (3 bits) for the ALU.]
               R-type     ori       lw        sw        beq
op             00 0000    00 1101   10 0011   10 1011   00 0100
RegDst         1          0         0         x         x
ALUSrc         0          1         1         1         0
MemtoReg       0          0         1         x         x
RegWrite       1          1         1         0         0
MemWrite       0          0         0         1         0
Branch         0          0         0         0         1
ExtOp          x          0         1         1         x
ALUop<N:0>     “R-type”   Or        Add       Add       Subtract
The Encoding of ALUop
° ALUop needs at least 2 bits to represent:
• (1) “R-type” instructions
• “I-type” instructions that require the ALU to perform:
- (2) Or, (3) Add, and (4) Subtract
° The encoding below uses 3 bits, ALUop<2:0>
[Figure: Main Control (op, 6 bits) produces ALUop (N bits); ALU Control (Local) combines ALUop with func (6 bits) to produce ALUctr (3 bits).]
                   R-type     ori     lw      sw      beq
ALUop (Symbolic)   “R-type”   Or      Add     Add     Subtract
ALUop<2:0>         1 00       0 10    0 00    0 00    0 01
The Decoding of the “func” Field
[Figure: Main Control (op, 6 bits) produces ALUop (N bits); ALU Control (Local) combines ALUop with func (6 bits) to produce ALUctr (3 bits) for the ALU.]
R-type format: op <31:26>, rs <25:21>, rt <20:16>, rd <15:11>, shamt <10:6>, funct <5:0>

funct<5:0>   Instruction Operation
10 0000      add
10 0010      subtract
10 0100      and
10 0101      or
10 1010      set-on-less-than

P. 286 text:
ALUctr<2:0>   ALU Operation
000           And
001           Or
010           Add
110           Subtract
111           Set-on-less-than
The Truth Table for ALUctr
                   R-type     ori     lw      sw      beq
ALUop (Symbolic)   “R-type”   Or      Add     Add     Subtract
ALUop<2:0>         1 00       0 10    0 00    0 00    0 01

ALUop                       func                                ALU         ALUctr
bit<2>  bit<1>  bit<0>      bit<3>  bit<2>  bit<1>  bit<0>      Operation   bit<2>  bit<1>  bit<0>
0       0       0           x       x       x       x           Add         0       1       0
0       x       1           x       x       x       x           Subtract    1       1       0
0       1       x           x       x       x       x           Or          0       0       1
1       x       x           0       0       0       0           Add         0       1       0
1       x       x           0       0       1       0           Subtract    1       1       0
1       x       x           0       1       0       0           And         0       0       0
1       x       x           0       1       0       1           Or          0       0       1
1       x       x           1       0       1       0           Set on <    1       1       1

funct<3:0>   Instruction Op.
0000         add
0010         subtract
0100         and
0101         or
1010         set-on-less-than
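The ALUctr truth table can be exercised with a small Python model. This is a sketch under stated assumptions: the function name `alu_control` is hypothetical, and the input ALUop = 011, which matches both the Subtract and the Or row, is resolved in favour of Or here even though no instruction in the table generates it.

```python
# funct<3:0> -> ALUctr<2:0> for R-type instructions (ALUop<2> = 1).
R_TYPE_ALUCTR = {
    0b0000: 0b010,  # add
    0b0010: 0b110,  # subtract
    0b0100: 0b000,  # and
    0b0101: 0b001,  # or
    0b1010: 0b111,  # set-on-less-than
}


def alu_control(aluop: int, funct: int) -> int:
    """Return ALUctr<2:0> given ALUop<2:0> and the funct field."""
    if aluop & 0b100:               # "R-type": decode the low 4 bits of funct
        return R_TYPE_ALUCTR[funct & 0xF]
    if aluop & 0b010:               # row 0 1 x -> Or
        return 0b001
    if aluop & 0b001:               # row 0 x 1 -> Subtract
        return 0b110
    return 0b010                    # row 0 0 0 -> Add
```

Passing the full 6-bit funct works because only funct<3:0> is inspected, which is the reduction the table makes.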
Step 5: Logic for each control signal
`define Rtype 6'b000000
`define BEQ   6'b000100
`define ORi   6'b001101
`define Load  6'b100011
`define Store 6'b101011
… etc …
nPC_sel  <= (OP == `BEQ) ? `Br : `plus4;
// beq compares two registers, so it also selects regB (ALUSrc = 0 in the control table)
ALUsrc   <= ((OP == `Rtype) || (OP == `BEQ)) ? `regB : `immed;
ALUctr   <= (OP == `Rtype) ? funct :
            (OP == `ORi)   ? `ORfunction :
            (OP == `BEQ)   ? `SUBfunction : `ADDfunction;
ExtOp    <= (OP == `ORi) ? `ZEROextend : `SIGNextend;
MemWr    <= (OP == `Store) ? 1 : 0;
MemtoReg <= (OP == `Load) ? 1 : 0;
RegWr    <= ((OP == `Store) || (OP == `BEQ)) ? 0 : 1;
RegDst   <= ((OP == `Load) || (OP == `ORi)) ? 0 : 1;
The “Truth Table” for the Main Control
                   R-type     ori       lw        sw        beq
op                 00 0000    00 1101   10 0011   10 1011   00 0100
RegDst             1          0         0         x         x
ALUSrc             0          1         1         1         0
MemtoReg           0          0         1         x         x
RegWrite           1          1         1         0         0
MemWrite           0          0         0         1         0
nPC_sel            0          0         0         0         1
Jump               0          0         0         0         0
ExtOp              x          0         1         1         x
ALUop (Symbolic)   “R-type”   Or        Add       Add       Subtract
ALUop <2>          1          0         0         0         0
ALUop <1>          0          1         0         0         0
ALUop <0>          0          0         0         0         1
[Figure: Main Control takes op (6 bits) and produces RegDst, ALUSrc, ... and ALUop (3 bits); ALU Control (Local) combines ALUop with func (6 bits) to produce ALUctr (3 bits).]
The “Truth Table” for RegWrite
           R-type    ori       lw        sw        beq       jump
op         00 0000   00 1101   10 0011   10 1011   00 0100   00 0010
RegWrite   1         1         1         0         0         0

° RegWrite = R-type + ori + lw
= !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0> (R-type)
+ !op<5> & !op<4> & op<3> & op<2> & !op<1> & op<0> (ori)
+ op<5> & !op<4> & !op<3> & !op<2> & op<1> & op<0> (lw)
[Figure: op<5>..op<0> feed six AND gates that decode R-type, ori, lw, sw, beq and jump; an OR gate combines the R-type, ori and lw terms into RegWrite.]
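The sum-of-products form of RegWrite can be checked against the truth table with a few lines of Python. The function name `reg_write` is hypothetical; the three product terms are copied from the equation above.

```python
def reg_write(op: int) -> int:
    """Evaluate RegWrite = R-type + ori + lw from the six opcode bits."""
    b = [(op >> i) & 1 for i in range(6)]   # b[i] is op<i>
    n = [1 - bit for bit in b]              # n[i] is !op<i>
    r_type = n[5] & n[4] & n[3] & n[2] & n[1] & n[0]
    ori = n[5] & n[4] & b[3] & b[2] & n[1] & b[0]
    lw = b[5] & n[4] & n[3] & n[2] & b[1] & b[0]
    return r_type | ori | lw
```

Evaluating it on the six opcodes in the table reproduces the row 1 1 1 0 0 0.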
PLA Implementation of the Main Control
[Figure: PLA implementation. op<5>..op<0> feed an AND plane that produces the R-type, ori, lw, sw, beq and jump product terms; an OR plane combines them into the outputs RegWrite, ALUSrc, MemtoReg, MemWrite, Branch, Jump, RegDst, ExtOp, ALUop<2>, ALUop<1> and ALUop<0>.]
Administrative Issues
° Read Chapter 5
° This lecture and next one slightly different from the book
° Design Document for lab 3 due in section this Thursday!
• Describe your division of labor
• Your testing methodology (how will you test each step of the way?)
• Top-level block diagrams
° Midterm on Wednesday 3/10
• 5:30pm to 8:30pm, location TBA
• No class on that day
• Pencil, calculator, one 8.5” x 11” (both sides) of handwritten notes
° Meet at LaVal’s pizza after the midterm
The Big Picture: Where are We Now?
° The Five Classic Components of a Computer
• Processor: Control and Datapath
• Memory
• Input
• Output
° Today’s Topic: Designing the Datapath for the Multiple Clock Cycle Datapath
Abstract View of our single cycle processor
° looks like a FSM with PC as state
[Figure: the single-cycle processor as a chain of blocks between PC and Next PC: Instruction Fetch, Register Fetch, Ext, ALU, Mem Access (Data Mem), Reg. Wrt and Result Store. Main control and ALU control decode op and fun and drive nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd and MemWr; Equal is fed back from the datapath.]
What’s wrong with our CPI=1 processor?
° Long Cycle Time
° All instructions take as much time as the slowest
° Real memory is not as nice as our idealized memory
• cannot always get the job done in one (short) cycle
Path through the datapath by instruction class:
Arithmetic & Logical:   PC, Inst Memory, Reg File, mux, ALU, mux, setup
Load (Critical Path):   PC, Inst Memory, Reg File, mux, ALU, Data Mem, mux, setup
Store:                  PC, Inst Memory, Reg File, mux, ALU, Data Mem
Branch:                 PC, Inst Memory, Reg File, cmp, mux
Memory Access Time
° Physics => fast memories are small (large memories are slow)
• question: register file vs. memory
° => Use a hierarchy of memories
[Figure: storage array with address decoder, selected word line, storage cell, bit line and sense amps.]
[Figure: memory hierarchy. Processor and Cache: 1 time-period; over the proc. bus, L2 Cache: 2-3 time-periods; over the mem. bus, memory: 20 - 50 time-periods.]
Reducing Cycle Time
° Cut combinational dependency graph and insert register / latch
° Do same work in two fast cycles, rather than one slow one
° May be able to short-circuit path and remove some components for some instructions!
[Figure: before, one block of Acyclic Combinational Logic sits between two storage elements; after, it is split into Acyclic Combinational Logic (A) and (B) with an extra storage element between them.]
Worst Case Timing (Load)
[Figure: timing diagram for a load within one clock period, each signal moving from Old Value to New Value. After the Clk edge, PC changes after Clk-to-Q; Rs, Rt, Rd, Op, Func after the Instruction Memory Access Time; ALUctr, RegWr, ExtOp, ALUSrc and MemtoReg after the Delay through Control Logic; busA after the Register File Access Time; busB after the Delay through Extender & Mux; Address after the ALU Delay; busW after the Data Memory Access Time; then the Register Write Occurs.]
Basic Limits on Cycle Time
° Next address logic
• PC <= branch ? PC + offset : PC + 4
° Instruction Fetch
• InstructionReg <= Mem[PC]
° Register Access
• A <= R[rs]
° ALU operation
• R <= A + B
[Figure: datapath as a chain of blocks between PC and Next PC: Instruction Fetch, Operand Fetch, Exec, Mem Access (Data Mem), Reg. File and Result Store. Control drives nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd and MemWr.]
Partitioning the CPI=1 Datapath
° Add registers between smallest steps
° Place enables on all registers
[Figure: datapath chain of PC, Next PC, Instruction Fetch, Operand Fetch, Exec, Mem Access (Data Mem), Reg. File and Result Store, with control points nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr and the Equal output; registers are to be added between the steps.]
Example Multicycle Datapath
° Critical Path ?
[Figure: example multicycle datapath. PC and Next PC (nPC_sel), Instruction Fetch into IR, Operand Fetch from the Reg File into registers A and B, Ext (ExtOp), ALU (ALUSrc, ALUctr) into register S, Equal into register E, Mem Access of Data Mem (MemRd, MemWr) into register M, and Result Store into the Reg. File (MemToReg, RegDst, RegWr).]
Recall: Step-by-step Processor Design
Step 1: ISA => Logical Register Transfers
Step 2: Components of the Datapath
Step 3: RTL + Components => Datapath
Step 4: Datapath + Logical RTs => Physical RTs
Step 5: Physical RTs => Control
Step 4: R-type (add, sub, . . .)
° Logical Register Transfer
inst    Logical Register Transfers
ADDU    R[rd] <– R[rs] + R[rt]; PC <– PC + 4
° Physical Register Transfers
inst    Physical Register Transfers
        IR <– MEM[pc]
ADDU    A <– R[rs]; B <– R[rt]
        S <– A + B
        R[rd] <– S; PC <– PC + 4
[Figure: multicycle datapath (PC, Next PC, Inst. Mem, IR, Reg File, registers A, B, E, Exec, S, Mem Access of Data Mem, M, Reg. File) laid out along a Time axis.]
Step 4: Logical immed
° Logical Register Transfer
inst    Logical Register Transfers
ORI     R[rt] <– R[rs] OR ZExt(Im16); PC <– PC + 4
° Physical Register Transfers
inst    Physical Register Transfers
        IR <– MEM[pc]
ORI     A <– R[rs]; B <– R[rt]
        S <– A or ZExt(Im16)
        R[rt] <– S; PC <– PC + 4
[Figure: multicycle datapath (PC, Next PC, Inst. Mem, IR, Reg File, registers A, B, E, Exec, S, Mem Access of Data Mem, M, Reg. File) laid out along a Time axis.]
Step 4 : Load
° Logical Register Transfer
inst    Logical Register Transfers
LW      R[rt] <– MEM[R[rs] + SExt(Im16)]; PC <– PC + 4
° Physical Register Transfers
inst    Physical Register Transfers
        IR <– MEM[pc]
LW      A <– R[rs]; B <– R[rt]
        S <– A + SExt(Im16)
        M <– MEM[S]
        R[rt] <– M; PC <– PC + 4
[Figure: multicycle datapath (PC, Next PC, Inst. Mem, IR, Reg File, registers A, B, E, Exec, S, Mem Access of Data Mem, M, Reg. File) laid out along a Time axis.]
Step 4 : Store
° Logical Register Transfer
inst    Logical Register Transfers
SW      MEM[R[rs] + SExt(Im16)] <– R[rt]; PC <– PC + 4
° Physical Register Transfers
inst    Physical Register Transfers
        IR <– MEM[pc]
SW      A <– R[rs]; B <– R[rt]
        S <– A + SExt(Im16)
        MEM[S] <– B; PC <– PC + 4
[Figure: multicycle datapath (PC, Next PC, Inst. Mem, IR, Reg File, registers A, B, E, Exec, S, Mem Access of Data Mem, M, Reg. File) laid out along a Time axis.]
Step 4 : Branch
° Logical Register Transfer
inst    Logical Register Transfers
BEQ     if R[rs] == R[rt]
        then PC <= PC + 4 + SExt(Im16) || 00
        else PC <= PC + 4
° Physical Register Transfers
inst    Physical Register Transfers
        IR <– MEM[pc]
BEQ     E <– (R[rs] = R[rt])
        if !E then PC <– PC + 4 else PC <– PC + 4 + SExt(Im16) || 00
[Figure: multicycle datapath (PC, Next PC, Inst. Mem, IR, Reg File, registers A, B, E, Exec, S, Mem Access of Data Mem, M, Reg. File) laid out along a Time axis.]
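The physical register transfers for the five instruction classes can be traced with a small Python model that performs one transfer group per cycle. This is a sketch, not the lecture's datapath: the function name `step_instruction`, the dict-based memory that reads 0 for unwritten addresses, and passing an already decoded instruction instead of modelling IR are all assumptions.

```python
MASK = 0xFFFFFFFF


def sext16(imm16: int) -> int:
    """Sign-extend a 16-bit immediate."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def step_instruction(kind, rs, rt, rd, imm16, regs, mem, pc):
    """Run one instruction as a series of register transfers.

    Returns (new_pc, cycles). regs is a list of 32 ints, mem a dict
    from address to word.
    """
    cycles = 1                                  # IR <- MEM[pc]
    if kind == "beq":
        E = regs[rs] == regs[rt]                # E <- (R[rs] = R[rt])
        cycles += 1
        offset = (sext16(imm16) << 2) if E else 0
        cycles += 1                             # PC <- PC + 4 (+ offset)
        return (pc + 4 + offset) & MASK, cycles
    A, B = regs[rs], regs[rt]                   # A <- R[rs]; B <- R[rt]
    cycles += 1
    if kind == "addu":
        S = (A + B) & MASK; cycles += 1         # S <- A + B
        regs[rd] = S; cycles += 1               # R[rd] <- S
    elif kind == "ori":
        S = A | (imm16 & 0xFFFF); cycles += 1   # S <- A or ZExt(Im16)
        regs[rt] = S; cycles += 1               # R[rt] <- S
    elif kind == "lw":
        S = (A + sext16(imm16)) & MASK; cycles += 1
        M = mem.get(S, 0); cycles += 1          # M <- MEM[S]
        regs[rt] = M; cycles += 1               # R[rt] <- M
    elif kind == "sw":
        S = (A + sext16(imm16)) & MASK; cycles += 1
        mem[S] = B; cycles += 1                 # MEM[S] <- B
    else:
        raise ValueError(f"unsupported instruction: {kind}")
    return (pc + 4) & MASK, cycles              # PC <- PC + 4 in the last step
```

Under these assumptions the cycle counts come out as 4 for R-type and ori, 5 for lw, 4 for sw and 3 for beq.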
Alternative datapath (book): Multiple Cycle Datapath
° Minimizes Hardware: 1 memory, 1 adder
[Figure: multiple-cycle datapath from the book. A single Ideal Memory (RAdr, WrAdr, Din, Dout; MemWr) is addressed through a mux selected by IorD; Instruction Reg (IRWr); Reg File (Ra, Rb, Rw, busA, busB, busW; RegWr) with the RegDst mux choosing Rt or Rd and the MemtoReg mux; Extend (ExtOp) and << 2 on Imm16; ALUSelA and ALUSelB muxes (the ALUSelB inputs include the constant 4) feeding one ALU (ALU Control, ALUOp, Zero); ALU Out and Target registers (BrWr); PC written under PCWr, PCWrCond and PCSrc.]
Our Control Model
° State specifies control points for Register Transfer
° Transfer occurs upon exiting state (same falling edge)
[Figure: a Control State register feeds Next State Logic, which also takes inputs (conditions), and Output Logic, which produces outputs (control points). Each State X is labelled with its Register Transfer and Control Points; the transition out of it Depends on Input.]
Step 4 Control Specification for multicycle proc
“instruction fetch”:        IR <= MEM[PC]
“decode / operand fetch”:   A <= R[rs]; B <= R[rt]
Then, by instruction:
          Execute                 Memory          Write-back
R-type    S <= A fun B                            R[rd] <= S; PC <= PC + 4
ORi       S <= A or ZX                            R[rt] <= S; PC <= PC + 4
LW        S <= A + SX             M <= MEM[S]     R[rt] <= M; PC <= PC + 4
SW        S <= A + SX             MEM[S] <= B; PC <= PC + 4
BEQ       PC <= Next(PC,Equal)
Traditional FSM Controller
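[Figure: traditional FSM controller. The inputs op (6 bits), Equal and the current State (4 bits) index a Truth Table with columns state, op, cond, next state and control points; it produces nextState, which is stored in the State register, and the control points that drive the datapath.]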
Step 5 (datapath + state diagram control)
° Translate RTs into control points
° Assign states
° Then go build the controller
Mapping RTs to Control Points
“instruction fetch”:   IR <= MEM[PC]                (imem_rd, IRen)
“decode”:              A <= R[rs]; B <= R[rt]       (Aen, Ben, Een)
Then, by instruction:
          Execute                        Memory          Write-back
R-type    S <= A fun B (ALUfun, Sen)                     R[rd] <= S; PC <= PC + 4 (RegDst, RegWr, PCen)
ORi       S <= A or ZX                                   R[rt] <= S; PC <= PC + 4
LW        S <= A + SX                    M <= MEM[S]     R[rt] <= M; PC <= PC + 4
SW        S <= A + SX                    MEM[S] <= B; PC <= PC + 4
BEQ       PC <= Next(PC,Equal)
Assigning States
State   Step                       Register transfers
0000    “instruction fetch”        IR <= MEM[PC]
0001    “decode”                   A <= R[rs]; B <= R[rt]
0100    R-type, Execute            S <= A fun B
0101    R-type, Write-back         R[rd] <= S; PC <= PC + 4
0110    ORi, Execute               S <= A or ZX
0111    ORi, Write-back            R[rt] <= S; PC <= PC + 4
1000    LW, Execute                S <= A + SX
1001    LW, Memory                 M <= MEM[S]
1010    LW, Write-back             R[rt] <= M; PC <= PC + 4
0011    BEQ                        PC <= Next(PC)
1011    SW, Execute                S <= A + SX
1100    SW, Memory                 MEM[S] <= B; PC <= PC + 4
(Mostly) Detailed Control Specification (missing0)
Column groups: State, Op field, Eq, Next, IR (en), PC (en, sel), Ops (A, B, E), Exec (Ex, Sr, ALU, S), Mem (R, W, M), Write-Back (M-R, Wr, Dst). Each row lists only the fields it sets, in column order.

State   Op field   Eq   Next   Fields set
0000    ??????     ?    0001   1
0001    BEQ        x    0011   1 1 1
0001    R-type     x    0100   1 1 1
0001    ORI        x    0110   1 1 1
0001    LW         x    1000   1 1 1
0001    SW         x    1011   1 1 1
(decode rows: all same in Moore machine)
BEQ:
0011    xxxxxx     0    0000   1 0 x 0 x
0011    xxxxxx     1    0000   1 1 x 0 x
R:
0100    xxxxxx     x    0101   0 1 fun 1
0101    xxxxxx     x    0000   1 0 0 1 1
ORi:
0110    xxxxxx     x    0111   0 0 or 1
0111    xxxxxx     x    0000   1 0 0 1 0
LW:
1000    xxxxxx     x    1001   1 0 add 1
1001    xxxxxx     x    1010   1 0 1
1010    xxxxxx     x    0000   1 0 1 1 0
SW:
1011    xxxxxx     x    1100   1 0 add 1
1100    xxxxxx     x    0000   1 0 0 1 0
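The State, Op field and Next columns of the specification define the next-state function of the controller. A minimal Python sketch follows; the names `DISPATCH`, `SEQUENTIAL`, `next_state` and `cycles_for` are hypothetical, and the control-point outputs are left out.

```python
# Next state out of decode (0001), selected by the op field.
DISPATCH = {"BEQ": 0b0011, "R-type": 0b0100, "ORI": 0b0110,
            "LW": 0b1000, "SW": 0b1011}

# States that step to a fixed successor; any other state returns to fetch.
SEQUENTIAL = {0b0100: 0b0101, 0b0110: 0b0111,
              0b1000: 0b1001, 0b1001: 0b1010, 0b1011: 0b1100}


def next_state(state: int, op: str) -> int:
    """Next-state function; Eq does not affect it in this table."""
    if state == 0b0000:
        return 0b0001
    if state == 0b0001:
        return DISPATCH[op]
    return SEQUENTIAL.get(state, 0b0000)


def cycles_for(op: str) -> int:
    """Count the states visited from fetch until control returns to fetch."""
    state, cycles = 0b0000, 0
    while True:
        cycles += 1
        state = next_state(state, op)
        if state == 0b0000:
            return cycles
```

Counting states this way gives the per-instruction cycle counts that a CPI estimate needs: 3 for BEQ, 4 for R-type, ORI and SW, and 5 for LW.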
Performance Evaluation
° What is the average CPI?
• state diagram gives CPI for each instruction type
• workload gives frequency of each type

Type          CPI_i for type   Frequency   CPI_i x freq_i
Arith/Logic   4                40%         1.6
Load          5                30%         1.5
Store         4                10%         0.4
branch        3                20%         0.6
Average CPI: 4.1
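The average CPI is the frequency-weighted sum of the per-type CPIs. The short Python check below uses the numbers from the table; the names `WORKLOAD` and `average_cpi` are hypothetical.

```python
# (CPI for the type, fraction of the workload), taken from the table.
WORKLOAD = {
    "Arith/Logic": (4, 0.40),
    "Load": (5, 0.30),
    "Store": (4, 0.10),
    "Branch": (3, 0.20),
}


def average_cpi(mix: dict) -> float:
    """Sum of CPI_i * freq_i over all instruction types."""
    return sum(cpi * freq for cpi, freq in mix.values())
```

The four products are 1.6, 1.5, 0.4 and 0.6, which add up to the 4.1 quoted on the slide.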
Controller Design
° The state diagrams that define the controller for an instruction set processor are highly structured
° Use this structure to construct a simple “microsequencer”
° Control reduces to programming this very simple device => microprogramming
[Figure: a sequencer with a micro-PC selects a microinstruction, whose fields provide sequencer control and datapath control.]
Example: Jump-Counter
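[Figure: the op-code indexes a Map ROM whose output can be loaded into a Counter that holds the current state i.]
Counter commands:
• zero: next state is 0000
• inc: next state is i+1
• load: next state comes from the Map ROM
• None of above: Do nothing (for wait states); state stays i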
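The jump counter's behaviour fits in a few lines of Python. In this sketch the function name `next_upc` and the 4-bit counter width are assumptions, and the `MAP_ROM` contents reuse the dispatch targets from the state assignment earlier in the lecture.

```python
# Map ROM: op-code -> first state of that instruction's sequence.
MAP_ROM = {"BEQ": 0b0011, "R-type": 0b0100, "ORI": 0b0110,
           "LW": 0b1000, "SW": 0b1011}


def next_upc(upc: int, command: str, opcode: str = "") -> int:
    """Apply one counter command to the current micro-PC value."""
    if command == "zero":
        return 0b0000
    if command == "inc":
        return (upc + 1) & 0xF      # 4-bit counter (assumed width)
    if command == "load":
        return MAP_ROM[opcode]
    return upc                      # none of the above: wait state
```

A load then runs as 0000 (inc), 0001 (load), 1000 (inc), 1001 (inc), 1010 (zero), which works only because each instruction's states were assigned consecutive numbers.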
Using a Jump Counter
State   Counter   Step                     Register transfers
0000    inc       “instruction fetch”      IR <= MEM[PC]
0001    load      “decode”                 A <= R[rs]; B <= R[rt]
0100    inc       R-type, Execute          S <= A fun B
0101    zero      R-type, Write-back       R[rd] <= S; PC <= PC + 4
0110    inc       ORi, Execute             S <= A or ZX
0111    zero      ORi, Write-back          R[rt] <= S; PC <= PC + 4
1000    inc       LW, Execute              S <= A + SX
1001    inc       LW, Memory               M <= MEM[S]
1010    zero      LW, Write-back           R[rt] <= M; PC <= PC + 4
0011    zero      BEQ                      PC <= Next(PC)
1011    inc       SW, Execute              S <= A + SX
1100    zero      SW, Memory               MEM[S] <= B; PC <= PC + 4
Our Microsequencer
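[Figure: microsequencer. The op-code indexes the Map ROM, whose output can be loaded into the Micro-PC; the microinstruction selected by the Micro-PC supplies the Z, I, L sequencing fields, qualified by the taken condition, and the datapath control fields.]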
Microprogram Control Specification
Column groups: µPC, Taken, Next, IR (en), PC (en, sel), Ops (A, B), Exec (Ex, Sr, ALU, S), Mem (R, W, M), Write-Back (M-R, Wr, Dst). Each row lists only the fields it sets, in column order.

µPC     Taken   Next    Fields set
0000    ?       inc     1
0001    0       load    1 1
BEQ:
0011    0       zero    1 0
0011    1       zero    1 1
R:
0100    x       inc     0 1 fun 1
0101    x       zero    1 0 0 1 1
ORi:
0110    x       inc     0 0 or 1
0111    x       zero    1 0 0 1 0
LW:
1000    x       inc     1 0 add 1
1001    x       inc     1 0 1
1010    x       zero    1 0 1 1 0
SW:
1011    x       inc     1 0 add 1
1100    x       zero    1 0 0 1 0
Overview of Control
° Control may be designed using one of several initial representations. The choice of sequence control, and how logic is represented, can then be determined independently; the control can then be implemented with one of several methods using a structured logic technique.
                           “hardwired control”            “microprogrammed control”
Initial Representation     Finite State Diagram           Microprogram
Sequencing Control         Explicit Next State Function   Microprogram counter + Dispatch ROMs
Logic Representation       Logic Equations                Truth Tables
Implementation Technique   PLA                            ROM
Summary
° Disadvantages of the Single Cycle Processor
• Long cycle time
• Cycle time is too long for all instructions except the Load
° Multiple Cycle Processor:
• Divide the instructions into smaller steps
• Execute each step (instead of the entire instruction) in one cycle
° Partition datapath into equal size chunks to minimize cycle time
• ~10 levels of logic between latches
° Follow same 5-step method for designing “real” processor
Summary (cont’d)
° Control is specified by finite state diagram
° Specialized state diagrams are easily captured by a microsequencer
• simple increment & “branch” fields
• datapath control fields
° Control design reduces to Microprogramming
° Control is more complicated with:
• complex instruction sets
• restricted datapaths (see the book)
° Simple instruction set and powerful datapath => simple control
• could try to reduce hardware (see the book)
• rather go for speed => many instructions at once!