pipelining basics - bt.nitk.ac.in · data types and sizes signed and unsigned data – 2's...

Pipelining Basics

Upload: buituyen

Post on 31-Jul-2019


Pipelining Basics

Outline
● Addressing Modes
● MIPS ISA
● MIPS Pipeline

Addressing Modes
● How are operands specified in instructions?

Mode                Example                    Meaning
Register            Add R4, R3, R2             Regs[R4] <- Regs[R3] + Regs[R2]
Immediate           Add R4, R3, #5             Regs[R4] <- Regs[R3] + 5
Displacement        Add R4, R3, 100(R1)        Regs[R4] <- Regs[R3] + Mem[100 + Regs[R1]]
Register Indirect   Add R4, R3, (R1)           Regs[R4] <- Regs[R3] + Mem[Regs[R1]]
Absolute            Add R4, R3, (0x475)        Regs[R4] <- Regs[R3] + Mem[0x475]
Memory Indirect     Add R4, R3, @(R1)          Regs[R4] <- Regs[R3] + Mem[Mem[Regs[R1]]]
PC relative         Add R4, R3, 100(PC)        Regs[R4] <- Regs[R3] + Mem[100 + PC]
Scaled              Add R4, R3, 100(R1)[R5]    Regs[R4] <- Regs[R3] + Mem[100 + Regs[R1] + Regs[R5] * 4]
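The addressing modes above can be tried out on a toy machine. The sketch below is only an illustration: the register contents, memory contents, PC value and the helper names `operand` and `add` are all made up for this example and are not part of the slides.

```python
# Toy machine state; every value here is an assumption chosen for illustration.
regs = {"R1": 8, "R2": 7, "R3": 10, "R5": 2}
mem = {addr: 0 for addr in range(2048)}
mem.update({8: 120, 108: 11, 116: 13, 120: 17, 0x475: 19, 1100: 23})
PC = 1000
D = 4  # scale factor used by the scaled mode (operand size in bytes)


def operand(mode, disp=0, base=None, index=None):
    """Return the second source operand selected by the addressing mode."""
    if mode == "register":
        return regs[base]
    if mode == "immediate":
        return disp
    if mode == "displacement":
        return mem[disp + regs[base]]
    if mode == "register_indirect":
        return mem[regs[base]]
    if mode == "absolute":
        return mem[disp]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "pc_relative":
        return mem[disp + PC]
    if mode == "scaled":
        return mem[disp + regs[base] + regs[index] * D]
    raise ValueError(f"unknown addressing mode: {mode}")


def add(mode, **kwargs):
    """Value written to R4 by 'Add R4, R3, <operand>'."""
    return regs["R3"] + operand(mode, **kwargs)


if __name__ == "__main__":
    print(add("displacement", disp=100, base="R1"))        # Mem[100 + Regs[R1]]
    print(add("scaled", disp=100, base="R1", index="R5"))   # Mem[100 + Regs[R1] + Regs[R5] * 4]
```

Each branch of `operand` mirrors one line of the register-transfer notation, so the only difference between modes is how the memory address (if any) is formed.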

Data Types and Sizes
● Signed and Unsigned Data
– 2's complement representation
● Real numbers (Floating point)
– IEEE 754 Single precision and Double precision
● Addresses
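A short sketch of the two number representations named above, using only the Python standard library. The helper names are hypothetical; the bit widths are parameters rather than anything fixed by the slides.

```python
import struct


def to_twos_complement(value, bits):
    """Encode a signed integer as an unsigned bit pattern of the given width."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given width")
    return value & ((1 << bits) - 1)


def from_twos_complement(pattern, bits):
    """Decode an unsigned bit pattern as a signed 2's complement integer."""
    sign_bit = 1 << (bits - 1)
    return (pattern ^ sign_bit) - sign_bit


def float_bits(x, double=False):
    """Return the IEEE 754 bit pattern of x (single or double precision)."""
    float_fmt, int_fmt = (">d", ">Q") if double else (">f", ">I")
    return struct.unpack(int_fmt, struct.pack(float_fmt, x))[0]


if __name__ == "__main__":
    print(hex(to_twos_complement(-1, 8)))
    print(from_twos_complement(0x80, 8))
    print(hex(float_bits(1.0)), hex(float_bits(1.0, double=True)))
```

The same 8-bit pattern 0x80 reads as 128 when treated as unsigned and as -128 when treated as signed, which is why the ISA needs separate signed and unsigned variants of some instructions.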

ISA Encoding
● Fixed Width
– Eg.: RISC Architectures: MIPS, PowerPC, SPARC, ARM
● Variable Length
– Eg.: CISC Architectures: IBM 360, x86, Motorola 68K, VAX, …
● Mostly Fixed or Compressed
– Eg.: MIPS16, THUMB
● Very Long Instruction Words
– Multiple instructions in a fixed width bundle
– Eg.: Multiflow, HP/ST Lx, TI C6000

x86 (IA-32) Instruction Encoding

Field                       Size
Instruction Prefix          Up to four prefixes (1 byte each)
Opcode                      1, 2 or 3B
ModR/M                      1B (if needed)
Scale, Index, Base (SIB)    1B (if needed)
Displacement                0, 1, 2, or 4B (if needed)
Immediate                   0, 1, 2, or 4B (if needed)

x86 and x86-64 instruction format
● Instructions are 1 to 15 bytes long (the fields above can add up to more, but the architecture caps the instruction length at 15 bytes)
● Eg.: REP MOVSB (one prefix byte followed by a one-byte opcode)

Example – MIPS64 ISA
● RISC, load-store architecture
● 32-bit instructions, fixed format
● 32 64-bit GPRs, R0-R31
– R0 is hardwired to 0.
● 32 64-bit FPRs, F0-F31
– Can hold 32-bit floats also (with other ½ unused).
– “SIMD” extensions operate on more floats in 1 FPR
● Special registers
– Floating-point status register
● Load/store 8-, 16-, 32-, 64-bit integers
– Sign-extended to fill 64-bit GPR (zero-extended by the unsigned loads LBU, LHU, LWU)
– Also 32-bit floats and 64-bit doubles

MIPS64 Addressing Modes
● Register (Arithmetic, Logical ops only)
● Immediate (Arithmetic, Logical) & Displacement (load/stores only)
– 16-bit immediate/offset field
– Register indirect: use 0 as displacement offset
– Direct (absolute): use R0 as displacement base
● Byte-addressed memory, 64-bit address
● Software-settable big-endian/little-endian flag
● Alignment required
[Figure: byte addresses 100 101 102 103 and 104 105 106 107, grouped as word aligned addresses]
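The following sketch shows how a single displacement mode covers register indirect and absolute addressing, and what the alignment requirement means. The function names and the base value in R2 are assumptions made for this example.

```python
def sign_extend16(field):
    """Sign-extend the 16-bit immediate/offset field."""
    field &= 0xFFFF
    return field - 0x10000 if field & 0x8000 else field


def effective_address(regs, base, offset16):
    """Displacement mode: base register plus sign-extended offset, wrapped to 64 bits."""
    return (regs[base] + sign_extend16(offset16)) & 0xFFFFFFFFFFFFFFFF


def is_aligned(address, size_bytes):
    """An access is aligned when its address is a multiple of its size."""
    return address % size_bytes == 0


regs = [0] * 32   # R0 stays 0 because it is hardwired to 0
regs[2] = 100     # hypothetical base address held in R2

register_indirect = effective_address(regs, base=2, offset16=0)   # 0(R2)
absolute = effective_address(regs, base=0, offset16=104)          # 104(R0)
```

With these values, a word (4-byte) access to address 100 or 104 is aligned, while one to address 102 is not.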

MIPS64 Instructions
DATA TRANSFER INSTRUCTIONS

Instruction   Opcode/Mnemonic                             Examples
Load          LB, LBU, LH, LHU, LW, LWU, LD, L.S, L.D     LD R1, 30(R2)
                                                          L.S F0, 50(R3)
Store         SB, SH, SW, SD, S.S, S.D                    SH R3, 502(R2)
                                                          SB R2, 41(R3)

● L: Load
● S: Store
● B: Byte (8b), H: Half Word (16b), W: Word (32b), D: Double Word (64b)
● U: Unsigned in loads (LBU, LHU, LWU); Upper in LUI
● I: Immediate

Steps: Decode Instruction, Fetch Operands, Effective address calculation, Memory access, Update RF.

MIPS64 Instructions
ARITHMETIC/LOGICAL INSTRUCTIONS

Instruction: Logical and Arithmetic, Shift, Set less than…

Opcode/Mnemonic:
DADD, DADDI, DADDU, DADDIU, DSUB, DSUBU, DMUL, DMULU, DDIV, DDIVU
AND, OR, XOR, ANDI, ORI, XORI
LUI
DSLL, DSRL, SLT, SLTI, SLTU

Examples:
DADDU R1, R2, R3
ANDI R1, R2, #43
SLT R1, R2, R3

Steps: Decode Instruction, Fetch operands, Arithmetic operation, Update results in RF.

MIPS64 Instructions
CONTROL INSTRUCTIONS

Instruction: Branch, Jump, Control transfer

Opcode/Mnemonic:
BEQZ, BNEZ
BEQ, BNE
J, JR
JAL, JALR
ERET

Examples:
BEQ R1, R2, label
J label

Steps: Decode Instruction, Fetch operands, Compare condition, Update PC.

MIPS Instruction Formats

● R-type: OP rd, rs, rt
  op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)

● I-type: OP rt, rs, IMM
  op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)

● J-type: OP LABEL
  op (6 bits) | Offset added to PC (26 bits)

op: Opcode (class of instruction). Eg. ALU
funct: Which subunit of the ALU to activate?
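The three formats can be sketched as bit-packing helpers. This is a minimal illustration of the field layout only: the function names are hypothetical and the opcode and funct numbers in the usage lines are placeholder values, not taken from the slides.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """R-type: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, imm):
    """I-type: op(6) rs(5) rt(5) immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(op, offset):
    """J-type: op(6) offset(26)."""
    return (op << 26) | (offset & 0x3FFFFFF)


def decode_fields(word):
    """Split a 32-bit instruction word into every field any format might use."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
        "imm": word & 0xFFFF,
        "offset": word & 0x3FFFFFF,
    }


# Placeholder encodings: "OP R1, R2, R3" as R-type and "OP R4, R3, #5" as I-type.
r_word = encode_r(op=0, rs=2, rt=3, rd=1, shamt=0, funct=0x2D)
i_word = encode_i(op=0x18, rs=3, rt=4, imm=5)
```

Because every field sits at a fixed bit position, the decoder can extract rs and rt before it knows which format the instruction uses; this is the "fixed field decoding" that the ID stage relies on.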

Implementation of RISC ISA - 1
● Instruction Fetch (IF)

IR <- Mem[PC]
NPC <- PC + 4

[Figure: PC addresses the Instruction Memory, whose output is latched in IR; an adder (ADD) computes PC + 4 into NPC]

Implementation of RISC ISA - 2
● Instruction Decode/Register Fetch (ID)

A <- Regs[rs]
B <- Regs[rt]
Imm <- Sign-extended immediate field of IR

[Figure: the rs, rt and rd fields of IR index the Registers, which load A and B; Sign Extend widens the 16-bit immediate to 32 bits into Imm]

Implementation of RISC ISA - 3
● Execution/Effective Address (EX)

Memory Reference:
ALUOutput <- A + Imm

Register-Register Instructions:
ALUOutput <- A func B

Register-Immediate Instructions:
ALUOutput <- A func Imm

Branch Instruction:
ALUOutput <- NPC + (Imm << 2);
Cond <- (A == 0)

[Figure: two MUXes select the ALU inputs (NPC or A, and B or Imm); the ALU produces ALUOutput, and a Zero? test on A produces Cond]

Implementation of RISC ISA - 4
● Memory Access/Branch Completion (MEM)

Memory Reference:
LMD <- Mem[ALUOutput]    (load)
Mem[ALUOutput] <- B      (store)

Branch:
if (Cond) PC <- ALUOutput

[Figure: ALUOutput addresses the Data Memory, B supplies store data and LMD receives load data; a MUX controlled by Cond selects NPC or ALUOutput as the next PC]

Implementation of RISC ISA - 5
● Write back (WB)

Register-Register Instructions:
Regs[rd] <- ALUOutput

Register-Immediate Instructions:
Regs[rt] <- ALUOutput

Load Instruction:
Regs[rt] <- LMD

[Figure: a MUX selects ALUOutput or LMD as the value written to the Register File]

Implementation of RISC ISA - Stages
● Instruction Fetch (IF)
● Instruction Decode/Register Fetch (ID)
– Fixed field decoding
● Execution/Effective address (EX)
● Memory Access (MEM)
● Write back (WB)
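The five steps can be sketched as a small unpipelined interpreter that follows the register-transfer statements of the preceding slides. This is an illustration under several assumptions: instructions are pre-decoded Python dictionaries rather than 32-bit words, the `kind` and `func` names are invented, and the only branch modelled is the BEQZ-style test Cond <- (A == 0).

```python
def sign_extend16(x):
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


ALU = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b, "and": lambda a, b: a & b}


def step(state, imem):
    """Run one instruction through IF, ID, EX, MEM and WB; return the new PC."""
    regs, dmem = state["regs"], state["dmem"]
    # IF: IR <- Mem[PC]; NPC <- PC + 4
    ir = imem[state["pc"]]
    npc = state["pc"] + 4
    # ID: A <- Regs[rs]; B <- Regs[rt]; Imm <- sign-extended immediate
    a, b = regs[ir.get("rs", 0)], regs[ir.get("rt", 0)]
    imm = sign_extend16(ir.get("imm", 0))
    # EX
    kind, cond = ir["kind"], False
    if kind in ("load", "store"):
        alu_out = a + imm
    elif kind == "reg_reg":
        alu_out = ALU[ir["func"]](a, b)
    elif kind == "reg_imm":
        alu_out = ALU[ir["func"]](a, imm)
    elif kind == "branch":
        alu_out, cond = npc + (imm << 2), (a == 0)
    else:
        raise ValueError(f"unknown instruction kind: {kind}")
    # MEM: data access or branch completion
    lmd = None
    if kind == "load":
        lmd = dmem[alu_out]
    elif kind == "store":
        dmem[alu_out] = b
    state["pc"] = alu_out if cond else npc
    # WB
    if kind == "reg_reg":
        regs[ir["rd"]] = alu_out
    elif kind in ("reg_imm", "load"):
        regs[ir["rt"]] = alu_out if kind == "reg_imm" else lmd
    regs[0] = 0  # R0 is hardwired to 0
    return state["pc"]


# Hypothetical five-instruction program and initial memory contents.
imem = {
    0: {"kind": "load", "rs": 0, "rt": 1, "imm": 16},                  # R1 <- Mem[16]
    4: {"kind": "reg_imm", "func": "add", "rs": 1, "rt": 2, "imm": 5},  # R2 <- R1 + 5
    8: {"kind": "reg_reg", "func": "sub", "rs": 2, "rt": 1, "rd": 3},   # R3 <- R2 - R1
    12: {"kind": "store", "rs": 0, "rt": 3, "imm": 24},                 # Mem[24] <- R3
    16: {"kind": "branch", "rs": 0, "imm": -5 & 0xFFFF},                # taken, back to 0
}
state = {"pc": 0, "regs": [0] * 32, "dmem": {16: 37, 24: 0}}
for _ in range(5):
    step(state, imem)
```

Each instruction here finishes all five steps before the next one starts, which is the unpipelined behaviour that the pipeline slides then improve on.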

MIPS Datapath

[Figure: complete datapath combining the five steps. IF: PC, adder (+4), Instruction Memory (IM), NPC, IR. ID: register file (Regs) indexed by rs, rt, rd, Sign Extend (16 to 32 bits), A, B, Imm. EX: two input MUXes, ALU, ALUOutput, Zero? producing Cond. MEM: Data Memory (DM), LMD, MUX selecting the next PC. WB: MUX selecting the value written back to Regs.]

Stages: Instruction Fetch (IF), Instruction Decode/Register Fetch (ID), Execute/Address Calculation (EX), Memory Access (MEM), Write Back (WB)


MIPS Pipeline

Hennessy & Patterson, CA-QA, Appendix C, 5ed. MK, 2013

IF ID EX MEM WB

MIPS Pipeline

Time (clock cycles)
       1    2    3    4    5    6    7    8    9
i1     IF   ID   EX   MEM  WB
i2          IF   ID   EX   MEM  WB
i3               IF   ID   EX   MEM  WB
i4                    IF   ID   EX   MEM  WB
...

Example:
● When will i10000 complete? What is the average clock cycles per instruction (CPI)?
● If the processor were not pipelined, when would i10000 complete? What is the average CPI? (Assume same clock period for both designs)
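One way to work through the example, assuming an ideal five-stage pipeline with no stalls and an unpipelined design that spends one clock cycle per stage at the same clock period. The function names are made up for this sketch.

```python
def completion_cycle_pipelined(i, stages=5):
    """Cycle in which instruction i (1-based) finishes WB, with no stalls."""
    return stages + (i - 1)


def completion_cycle_unpipelined(i, stages=5):
    """Each instruction occupies the whole datapath for `stages` cycles."""
    return stages * i


n = 10_000
pipelined_done = completion_cycle_pipelined(n)
unpipelined_done = completion_cycle_unpipelined(n)
pipelined_cpi = pipelined_done / n
unpipelined_cpi = unpipelined_done / n

if __name__ == "__main__":
    print(pipelined_done, pipelined_cpi)
    print(unpipelined_done, unpipelined_cpi)
```

Under these assumptions i10000 completes in cycle 10004 when pipelined (CPI of about 1) and in cycle 50000 when not pipelined (CPI of 5): the first instruction pays the full five-cycle latency, and every later one completes one cycle after its predecessor.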

Some Equations
● Unpipelined: Time to execute one instruction, $T_{exec} = T + T_{ovh}$
● N stage pipeline. Time per stage, $T_{stage} = \frac{T}{N} + T_{ovh}$

[Figure: Unpipelined Processor: IF ID EX MEM WB take time T, followed by T_ovh. Pipelined Processor: each of IF ID EX MEM WB is followed by T_ovh, giving T_stage per stage.]

Some Equations
● Unpipelined: Time to execute one instruction, $T_{exec} = T + T_{ovh}$
● N stage pipeline. Time per stage, $T_{stage} = \frac{T}{N} + T_{ovh}$
● Total time per instruction = $T_{inst} = N \times \left(\frac{T}{N} + T_{ovh}\right) = T + N \times T_{ovh}$
● Clock cycle time = $T_{clock} = \frac{T}{N} + T_{ovh}$
● Clock speed = $\frac{1}{T_{clock}}$
● Ideal speedup = $Speedup_{ideal} = \frac{T + T_{ovh}}{T/N + T_{ovh}}$
● Cycles to complete one instruction = N
● Average CPI = 1
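The equations can be checked numerically. In the sketch below the values T = 10 ns, N = 5 and T_ovh = 0.5 ns are hypothetical inputs chosen for illustration, and `pipeline_timing` is an invented helper name.

```python
def pipeline_timing(T, N, T_ovh):
    """Evaluate the timing equations for logic delay T, N stages and per-stage overhead T_ovh."""
    t_exec = T + T_ovh            # unpipelined time per instruction
    t_stage = T / N + T_ovh       # time per pipeline stage
    t_clock = t_stage             # the clock period equals the stage time
    return {
        "T_exec": t_exec,
        "T_stage": t_stage,
        "T_inst": N * t_stage,    # latency of one instruction = T + N * T_ovh
        "T_clock": t_clock,
        "clock_speed": 1 / t_clock,
        "speedup_ideal": t_exec / t_stage,
    }


timing = pipeline_timing(T=10.0, N=5, T_ovh=0.5)

if __name__ == "__main__":
    for name, value in timing.items():
        print(f"{name}: {value:.3f}")
```

With these inputs the latency of a single instruction grows from 10.5 ns to 12.5 ns, yet the ideal speedup is 4.2 because a new instruction completes every 2.5 ns. The overhead term is what keeps the speedup below N.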

Pipeline Performance

An unpipelined processor has a 1 ns clock cycle. ALU operations and branches take 4 cycles and memory operations take 5 cycles. The relative frequencies of ALU operations, branches and memory operations are 40%, 20%, and 40% respectively. Suppose that, due to clock skew and setup, pipelining adds 0.2 ns of overhead to the clock. What is the speedup?

Average Instruction Execution time = Clock cycle × Average CPI

$CPI = \sum_{i=1}^{n} \frac{IC_i}{InstructionCount} \times CPI_i$
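A worked version of this problem, assuming the pipelined processor achieves an ideal CPI of 1 so that its average instruction time is just the lengthened clock cycle. Variable names are chosen for this sketch.

```python
# (relative frequency, cycles) per instruction class, from the problem statement
mix = {
    "alu": (0.40, 4),
    "branch": (0.20, 4),
    "memory": (0.40, 5),
}

clock_unpipelined_ns = 1.0
overhead_ns = 0.2

# CPI = sum over classes of (IC_i / InstructionCount) * CPI_i
avg_cpi = sum(freq * cycles for freq, cycles in mix.values())

time_unpipelined_ns = clock_unpipelined_ns * avg_cpi
time_pipelined_ns = (clock_unpipelined_ns + overhead_ns) * 1  # assumed ideal CPI of 1

speedup = time_unpipelined_ns / time_pipelined_ns

if __name__ == "__main__":
    print(f"average CPI = {avg_cpi:.2f}")
    print(f"speedup     = {speedup:.2f}")
```

The unpipelined average is 4.4 cycles, or 4.4 ns per instruction; the pipelined design completes one instruction every 1.2 ns, so the speedup is about 3.67.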

Multiple Issue Integer Pipeline

[Figure: five-stage pipeline (IF ID EX MEM WB) that fetches two instructions per cycle from IM into IR0 and IR1, with RF Read, ALU inputs A and B, a Zero? test, DM and RF Write]


EXTRA

Operations and Operands

[Figure: a PROCESSOR containing Control, an ALU with inputs i1 and i2 and output o, and a Register File, connected to Memory]

Machine Models

[Figure: four ALU organizations side by side]
● STACK (operands taken from the top of stack, TOS)
● ACCUMULATOR
● REGISTER-MEMORY
● REGISTER-REGISTER

C = A + B

STACK            ACCUMULATOR      REGISTER-MEMORY      REGISTER-REGISTER
Push A           Load A           Load R1, A           Load R1, A
Push B           Add B            Add R3, R1, B        Load R2, B
Add              Store C          Store R3, C          Add R3, R1, R2
Pop C                                                  Store R3, C
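The four code sequences for C = A + B can be compared mechanically. The sketch below counts, for each model, the number of instructions and how many of them name a memory operand; the regular expression treats the single letters A, B and C as memory locations, which is an assumption that holds only for this example.

```python
import re

SEQUENCES = {
    "stack": ["Push A", "Push B", "Add", "Pop C"],
    "accumulator": ["Load A", "Add B", "Store C"],
    "register-memory": ["Load R1, A", "Add R3, R1, B", "Store R3, C"],
    "register-register": ["Load R1, A", "Load R2, B", "Add R3, R1, R2", "Store R3, C"],
}

# A stand-alone A, B or C is a memory operand; R1..R3 are registers.
MEMORY_OPERAND = re.compile(r"\b[A-C]\b")


def summarize(sequence):
    """Return (instruction count, number of instructions that access memory)."""
    memory_instructions = sum(1 for ins in sequence if MEMORY_OPERAND.search(ins))
    return len(sequence), memory_instructions


summary = {model: summarize(seq) for model, seq in SEQUENCES.items()}

if __name__ == "__main__":
    for model, (count, mem_count) in summary.items():
        print(f"{model:18s} instructions={count} memory-accessing={mem_count}")
```

For this expression every model touches memory three times. What differs is the instruction count, which instructions are allowed to access memory, and how many operands each instruction names explicitly.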

Machine Models – Comparison
● Number of explicitly named operands
● Number of instructions that can access data from memory
● Code size
● Amount of data transferred between memory and processor
● Complexity of hardware
● Ease of compilation (ease of generation of machine code).

The Stack Machine Model

Example: How is the expression x = (a*b) + (c - (d/e)) evaluated on a stack based machine?

● What is the sequence of instructions?
● Convert the equation to its Reverse Polish Notation form.
– ab*cde/-+

The Stack Machine Model

Evaluate ab*cde/-+ on a stack based machine

[Figure: STACK growing down from addresses 0xFF, 0xFE, ... and MEMORY holding the operands a, b, c, d, ... from address 0x00 upward, with the stack contents shown after each step of the evaluation]

What is the minimum size of the stack required to evaluate this expression?
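A small evaluator can answer both questions for x = (a*b) + (c - (d/e)): it emits the instruction sequence and records the deepest the stack ever gets. The Reverse Polish form used is ab*cde/-+ (the trailing + combines the two sub-results). The operand values and the mnemonics Mul, Div, Sub are assumptions made for this sketch.

```python
import operator

OPS = {
    "+": ("Add", operator.add),
    "-": ("Sub", operator.sub),
    "*": ("Mul", operator.mul),
    "/": ("Div", operator.truediv),
}


def run_stack_machine(rpn, values, result_name="x"):
    """Evaluate an RPN string of one-letter operands; return (result, max depth, program)."""
    stack, max_depth, program = [], 0, []
    for token in rpn:
        if token in OPS:
            mnemonic, fn = OPS[token]
            right = stack.pop()   # top of stack is the right-hand operand
            left = stack.pop()
            stack.append(fn(left, right))
            program.append(mnemonic)
        else:
            stack.append(values[token])
            program.append(f"Push {token}")
        max_depth = max(max_depth, len(stack))
    program.append(f"Pop {result_name}")
    return stack.pop(), max_depth, program


# Hypothetical operand values.
values = {"a": 2, "b": 3, "c": 10, "d": 8, "e": 4}
result, depth, program = run_stack_machine("ab*cde/-+", values)

if __name__ == "__main__":
    print("\n".join(program))
    print("result =", result, "max stack depth =", depth)
```

The stack is deepest just after Push e, when it holds a*b, c, d and e, so four entries are enough for this expression regardless of the operand values.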

Class Work Example

Example: For each machine model, write a code sequence to evaluate the following expressions.

$b = a^3 + 3 \cdot a^2 + 2 \cdot a + 7$
$c = x^3 + 3 \cdot a^2 + 2 \cdot b + 7$

For each machine model, what is the (a) total instructions in the code sequence, (b) Execution time in clock cycles, (c) CPI?
Given: Load, store, arithmetic and logic tasks take 1 cycle. Multiply completes in 4 clock cycles.

Real World Instruction Sets

Arch            Type          #Oper  #Mem  Data Size     #Regs    Addr Size  Use
Alpha           Reg-Reg       3      0     64b           32       64b        Workstation
ARM             Reg-Reg       3      0     32/64b        16       32/64b     Cell Phone, Embedded
MIPS            Reg-Reg       3      0     32/64b        32       32/64b     Workstation
SPARC           Reg-Reg       3      0     32/64b        24-32    32/64b     Workstation
TI C6000        Reg-Reg       3      0     32b           32       32b        DSP
IBM 360         Reg-Mem       2      1     32b           16       24/31/64b  Mainframe
x86             Reg-Mem       2      1     8/16/32/64b   4/8/24   16/32/64b  Personal Computers (PC)
VAX             Mem-Mem       3      3     32b           16       32b        Minicomputers
Motorola 6800   Accumulator   1      1/2   8b            0        16b        Microcontroller

MIPS64 Instructions
DATA TRANSFER INSTRUCTIONS

Instruction   Opcode/Mnemonic                             Examples
Load          LB, LBU, LH, LHU, LW, LWU, LD, L.S, L.D     LD R1, 30(R2)
                                                          L.S F0, 50(R3)
Store         SB, SH, SW, SD, S.S, S.D                    SH R3, 502(R2)
                                                          SB R2, 41(R3)
Move          MOV.S, MOV.D, MFC0, MTC0, MFC1, MTC1        MOV.S F2, F3

● L: Load
● S: Store
● B: Byte (8b), H: Half Word (16b), W: Word (32b), D: Double Word (64b)
● U: Unsigned in loads (LBU, LHU, LWU); Upper in LUI
● I: Immediate

MIPS64 Instructions
ARITHMETIC/LOGICAL INSTRUCTIONS

Instruction: Multiply Accumulate, Logical and Arithmetic, Shift, Set less than…

Opcode/Mnemonic:
DADD, DADDI, DADDU, DADDIU, DSUB, DSUBU, DMUL, DMULU, DDIV, DDIVU
AND, OR, XOR, ANDI, ORI, XORI
LUI
DSLL, DSRL, DSRA, DSLLV
SLT, SLTI, SLTU

Examples:
DADDU R1, R2, R3
LUI R1, #43
SLT R1, R2, R3

[Figure: result of LUI R1, #43: the immediate 43 occupies bits 16 to 31 of R1 and all other bits are 0]
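The effect of LUI can be modelled in a few lines. This sketch assumes the MIPS64 behaviour in which the 16-bit immediate is placed in bits 16 to 31, the low 16 bits are cleared, and the 32-bit result is sign-extended to 64 bits; the function name `lui` is chosen for this example.

```python
def lui(imm16):
    """Return the 64-bit register value produced by LUI with the given immediate."""
    word32 = (imm16 & 0xFFFF) << 16          # immediate in the upper half, low half zero
    if word32 & 0x80000000:                  # sign-extend the 32-bit result to 64 bits
        word32 |= 0xFFFFFFFF00000000
    return word32


r1 = lui(43)                                 # LUI R1, #43
constant = lui(0x1234) | 0x5678              # LUI followed by ORI builds a 32-bit constant

if __name__ == "__main__":
    print(hex(r1))
    print(hex(constant))
```

Pairing LUI with ORI is the usual way to load a constant wider than the 16-bit immediate field of the I-type format.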

MIPS64 Instructions
CONTROL INSTRUCTIONS

Instruction: Branch, Jump, Control transfer

Opcode/Mnemonic:
BEQZ, BNEZ
BEQ, BNE
MOVN, MOVZ
J, JR
JAL, JALR
ERET

Examples:
BEQ R1, R2, label
MOVZ R1, R2, R3
J label

MIPS64 Instructions
FLOATING POINT

Instruction: FP Arithmetic

Opcode/Mnemonic:
ADD.D, ADD.S, ADD.PS
SUB.D, SUB.S, SUB.PS
MUL.D, MUL.S, MUL.PS
DIV.D, DIV.S, DIV.PS
CVT.D.S, CVT.D.L, CVT.D.W, CVT.S._
C.LT.D, C.GT.D, C.LE.D, C.GE.D, C.EQ.D, C.NE.D, C.__.S