computer architecture: a constructive approach multi-cycle smips implementations joel emer

23
Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology March 5, 2012 L8-1 http:// csg.csail.mit.edu/6.S078

Upload: holland

Post on 23-Feb-2016

27 views

Category:

Documents


0 download

DESCRIPTION

Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology. MemWrite. WBSrc. 0x4. clk. we. clk. rs1. rs2. ALU Control. we. rd1. addr. PC. ws. addr. inst. wd. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Computer Architecture: A Constructive Approach

Multi-cycle SMIPS Implementations

Joel EmerComputer Science & Artificial Intelligence Lab.Massachusetts Institute of Technology

March 5, 2012 L8-1http://csg.csail.mit.edu/

6.S078

Page 2: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Harvard-Style Datapath for MIPS

0x4

RegWrite

AddAdd

clk

WBSrcMemWrite

addr

wdata

rdataData Memory

we

RegDst BSrcExtSelOpCode

z

OpSel

clk

zero?

clk

addrinst

Inst.Memory

PC rd1

GPRs

rs1rs2

wswd rd2

we

ImmExt

ALU

ALUControl

31

PCSrcbrrindjabspc+4

old way

March 5, 2012 L8-2http://csg.csail.mit.edu/

6.S078

Page 3: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Hardwired Control TableOpcode ExtSel BSrc OpSel MemW RegW WBSrc RegDst PCSrcALUALUiALUiuLWSWBEQZz=0

BEQZz=1

JJALJRJALR

BSrc = Reg / Imm WBSrc = ALU / Mem / PC RegDst = rt / rd / R31 PCSrc = pc+4 / br / rind / jabs

* * * no yes rindPC R31rind* * * no no * *jabs* * * no yes PC R31

jabs* * * no no * *pc+4sExt16 * 0? no no * *

brsExt16 * 0? no no * *pc+4sExt16 Imm + yes no * *

pc+4Imm Op no yes ALU rt

pc+4* Reg Func no yes ALU rdsExt16 Imm Op pc+4no yes ALU rt

pc+4sExt16 Imm + no yes Mem rtuExt16

old way

March 5, 2012 L8-3http://csg.csail.mit.edu/

6.S078

Page 4: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Single-Cycle SMIPS

PC

InstMemory

Decode

Register File

Execute

DataMemory

+4

Datapath and control were derived automatically from a high-level rule-based description

new way

2 read & 1 write ports

separate Instruction &

Data memories

March 5, 2012 L8-4http://csg.csail.mit.edu/

6.S078

Page 5: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Single-Cycle SMIPS code structuremodule mkProc(Proc); Reg#(Addr) pc <- mkRegU; RFile rf <- mkRFile; Memory mem <- mkTwoPortedMemory; let iMem = mem.iport ; let dMem = mem.dport;

rule doProc; let inst = iMem(MemReq{op:Ld, addr:pc, data:?}); let dInst = decode(inst); Data rVal1 = rf.rd1(dInst.rSrc1); Data rVal2 = rf.rd2(dInst.rSrc2); let eInst = exec(dInst, rVal1, rVal2, pc);

update rf, pc and dMem

March 5, 2012 L8-5http://csg.csail.mit.edu/

6.S078

Page 6: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

decode

Decoding Instructions: input-output types

instructionbrComp

rDstrSrc1rSrc2immext

aluFunc

iType31:26, 5:0

31:265:0

31:26

20:1615:1125:21

20:16

15:025:0

Type DecodedInst

Bit#(32)

IType

AluFunc

RindexRindexRindex

BrType

Bit#(32)

Mux control logic not shown

immValidBoolMarch 5, 2012 L8-6

http://csg.csail.mit.edu/6.S078

Page 7: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Reading RegistersRead registers

RVal2RVal1

RFRSrc2

Pure combinational logic

RSrc1

March 5, 2012 L8-7http://csg.csail.mit.edu/

6.S078

Page 8: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Executing Instructionsexecute

dInst

addr

brTaken

data

iTyperDst

ALU

BranchAddresspc

rVal2

rVal1

Pure combinational logic

either for memory reference or branch target

either for rf write or St

ALUBr

March 5, 2012 L8-8http://csg.csail.mit.edu/

6.S078

Page 9: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Branch Address Calculationfunction Addr brAddrCalc(Address pc, Data val, IType iType, Data imm); let targetAddr = case (iType) J, Jal : {pc[31:28], imm[27:0]}; Jr, Jalr : val; default : pc + imm; endcase; return targetAddr;endfunction

March 5, 2012 L8-9http://csg.csail.mit.edu/

6.S078

Page 10: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Some Useful Functionsfunction Bool memType (IType i) return (i==Ld || i == St);endfunction

function Bool regWriteType (IType i) return (i==Alu || i==Ld || i==Jal || i==Jalr);endfunction

function Bool controlType (IType i) return (i==J || i==Jr || i==Jal || i==Jalr || i==Br);endfunction

March 5, 2012 L8-10http://csg.csail.mit.edu/

6.S078

Page 11: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Execute Functionfunction ExecInst exec(DecodedInst dInst, Data rVal1, Data rVal2, Addr pc); ExecInst einst = ?;

Data aluVal2 = (dInst.immValid)? dInst.imm : rVal2 let aluRes = alu(rVal1, aluVal2, dInst.aluFunc); let brAddr = brAddrCal(pc, rVal1, dInst.iType, dInst.imm); einst.itype = dInst.iType; einst.addr = (memType(dInst.iType)? aluRes : brAddr; einst.data = dInst.iType==St ? rVal2 : aluRes; einst.brTaken = aluBr(rVal1, aluVal2, dInst.brComp); einst.rDst = dInst.rDst; return einst;endfunction

March 5, 2012 L8-11http://csg.csail.mit.edu/

6.S078

Page 12: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Single-Cycle SMIPS atomic state updates if(memType(eInst.iType)) eInst.data <- dMem(MemReq{ op: eInst.iType==Ld ? Ld : St, addr: eInst.addr, data: eInst.data});

if(regWriteType(eInst.iType)) rf.wr(eInst.rDst, eInst.data);

pc <= eInst.brTaken ? eInst.addr : pc + 4;

endrule endmodule

March 5, 2012 L8-12http://csg.csail.mit.edu/

6.S078

Page 13: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Single-Cycle SMIPS: Clock Speed

PC

InstMemory

Decode

Register File

Execute

DataMemory

+4

tClock > tM + tDEC + tRF + tALU+ tM+ tWB

We can improve the clock speed if we execute each instruction in two clock cyclestClock > max {tM , (tDEC + tRF + tALU+ tM+ tWB

)}

March 5, 2012 L8-13http://csg.csail.mit.edu/

6.S078

Page 14: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Two-Cycle SMIPS

PC

InstMemory

Decode

Register File

Execute

DataMemory

+4 ir

stage

Introduce register “ir” to hold a fetched instruction and register “stage” to remember which stage (fetch/execute) we are in

March 5, 2012 L8-14http://csg.csail.mit.edu/

6.S078

Page 15: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

ir: The instruction registerYou may recall from our earlier discussion of pipelining that when we take multiple cycles to perform some operation (e.g., IFFT), there is a possibility that intermediate registers do not contain any meaningful data in some cyclesIt is straight forward to convert ir into a pipeline register

We can associate (Valid/Invalid) bit with ir Equivalently, we can think of ir as a single-element

FIFO

March 5, 2012 L8-15http://csg.csail.mit.edu/

6.S078

Page 16: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Additional Typestypedef struct { Addr pc; Bit#(32) inst;} TypeFetch2Decode deriving (Bits, Eq);

typedef enum {Fetch, Execute} TypeStage deriving (Bits, Eq);

March 5, 2012 L8-16http://csg.csail.mit.edu/

6.S078

Page 17: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Two-Cycle SMIPSmodule mkProc(Proc); Reg#(Addr) pc <- mkRegU; RFile rf <- mkRFile; Memory mem <- mkTwoPortedMemory; let iMem = mem.iport; let dMem = mem.dport; Reg#(TypeFetch2Decode) ir <- mkRegU; Reg#(TypeStage) stage <- mkReg(Fetch);

rule doFetch (state==Fetch); let inst = iMem(MemReq{op:Ld, addr:pc, data:?}); ir <= TypeFetch2Decode{pc: pc, inst: inst}; stage <= Execute; endrule

March 5, 2012 L8-17http://csg.csail.mit.edu/

6.S078

Page 18: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Two-Cycle SMIPS rule doExecute(stage==Execute); let irpc = ir.pc; let inst = ir.inst; let dInst = decode(inst); Data rVal1 = rf.rd1(dInst.rSrc1); Data rVal2 = rf.rd2(dInst.rSrc2); let eInst = exec(dInst, rVal1, rVal2, irpc); if(memType(eInst.iType)) eInst.data <- dMem(MemReq{ op: eInst.iType==Ld ? Ld : St, addr: eInst.addr, data: eInst.data});

if(regWriteType(eInst.iType)) rf.wr(eInst.rDst, eInst.data); pc <= eInst.brTaken ? eInst.addr : pc + 4; stage <= Fetch; endrule endmodule

no change from single-cycle

March 5, 2012 L8-18http://csg.csail.mit.edu/

6.S078

Page 19: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Princeton versus Harvard Architecture

Harvard architecture uses different memories for instructions and data

needed for a single-cycle implementationPrinceton architecture uses the same memory for instruction and data and thus, requires at least two cycles to execute Load/Store instructions

The two-cycle implementations of Princeton and Harvard architectures are almost the same

March 5, 2012 L8-19http://csg.csail.mit.edu/

6.S078

Page 20: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

SMIPS Princeton Architecture

PC

Memory

Decode

Register File

Execute+4 ir

Since both the Fetch and Execute stages want to use the memory, there is a structural hazard in accessing memory

March 5, 2012 L8-20http://csg.csail.mit.edu/

6.S078

stage

Page 21: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Two-Cycle SMIPS Princetonmodule mkProc(Proc); Reg#(Addr) pc <- mkRegU; RFile rf <- mkRFile; Memory mem <- mkOnePortedMemory; let uMem = mem.port; Reg#(TypeFetch2Decode) ir <- mkRegU; Reg#(TypeStage) stage <- mkReg(Fetch);

rule doFetch (stage==Fetch); let inst <- uMem(MemReq{op:Ld, addr:pc, data:?}); ir <= TypeFetch2Decode{pc: pc, inst: inst}; stage <= Execute; endrule

March 5, 2012 L8-21http://csg.csail.mit.edu/

6.S078

Page 22: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Two-Cycle SMIPS Princeton rule doExecute(stage==Execute); let irpc = ir.pc; let inst = ir.inst; let dInst = decode(inst); Data rVal1 = rf.rd1(dInst.rSrc1); Data rVal2 = rf.rd2(dInst.rSrc2); let eInst = exec(dInst, rVal1, rVal2, irpc); if(memType(eInst.iType)) eInst.data <- uMem(MemReq{ op: eInst.iType==Ld ? Ld : St, addr: eInst.addr, data: eInst.data});

if(regWriteType(eInst.iType)) rf.wr(eInst.rDst, eInst.data); pc <= eInst.brTaken ? eInst.addr : pc + 4; stage <= Fetch; endrule endmoduleMarch 5, 2012 L8-22

http://csg.csail.mit.edu/6.S078

Page 23: Computer Architecture: A Constructive Approach Multi-cycle SMIPS Implementations Joel Emer

Two-Cycle SMIPS: Analysis

PC

InstMemory

Decode

Register File

Execute

DataMemory

+4 ir

stage

In any given clock cycle, lots of unused hardware!

ExecuteFetch

next lecture: Pipelining to increase the throughputMarch 5, 2012 L8-23

http://csg.csail.mit.edu/6.S078