multiple instruction issue and hardware based...

26
Multiple Instruction Issue and Hardware Based Speculation Soner Önder Michigan Technological University, Houghton MI www.cs.mtu.edu/~soner

Upload: others

Post on 06-Apr-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

Multiple Instruction Issueand

Hardware Based Speculation

Soner ÖnderMichigan Technological University, Houghton MI

www.cs.mtu.edu/~soner

Page 2: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

2

•Exploiting more ILP requires that we overcome the limitation of control dependence:

• With branch prediction we allowed the processor continue issuing instructions past a branch based on a prediction:

• Those fetched instructions do not modify the processor state.• These instructions are squashed if prediction is incorrect.

• We now allow the processor to execute these instructions before we know if it is ok to execute them:

• We need to correctly restore the processor state if such an instruction should not have been executed.

• We need to pass the results from these instructions to future instructions as if the program is just following that path.

Hardware Based Speculation

Page 3: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

3

•Assume the processor predicts B1 to be taken and executes.

•What will happen if the prediction was wrong?

•What value of each variable should be used if the processor predicts B1 and B2 taken and executes instructions along the way?

Hardware Based Speculation

x < y?

A =b+cC=c-1

C=0A=0

B=b+1A=a+1

C=a

D=a+b+c….

Use d

X < z

B1

B2

T

T

N

N

Page 4: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

4

•In order to execute instructions speculatively, we need to provide means:

• To roll back the values of both registers and the memory to their correct values upon a misprediction,

• To communicate speculatively calculated values to the new uses of those values.

•Both can be provided by using a simple structure called Reorder Buffer (ROB).

Hardware Based Speculation

Page 5: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

5

•It is a simple circular array with a head and a tail pointer:

• New instructions is allocated a position at the tail in program order.

• Each entry provides a location for storing the instruction’s result.• New instructions look for the values starting from tail – back.• When the instruction at the head complete and becomes non-

speculative the values are committed and the instruction is removed from the buffer.

Reorder Buffer

Tail Head

Page 6: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

6

3 fields: instr, destination, valueReorder buffer can be operand source => more registers like RSUse reorder buffer number instead of reservation station when execution completesSupplies operands between execution complete & commitOnce operand commits, result is put into registerInstructions commitAs a result, its easy to undo speculated instructions on mispredicted branches or on exceptions

Reorder Buffer

Page 7: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

7

Steps of Speculative Tomasulo Algorithm

1. Issue [get instruction from FP Op Queue]

1. Check if the reorder buffer is full.2. Check if a reservation station is available.3. Access the register file and the reorder buffer for the current

values of the source operands.4. Send the instruction, its reorder buffer slot number and the

source operands to the reservation station.

Once issued, the instruction stays in the reservation station until it gets both operands.

Page 8: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

8

Steps of Speculative Tomasulo Algorithm

2. Execute [operate on operands (EX)]

When both operands ready and a functional unit is available, the instruction executes.

This step checks RAW hazards and as long as operands are not ready, watches CDB for results.

Page 9: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

9

Steps of Speculative Tomasulo Algorithm

3. Write result [finish execution (WB)]Write on Common Data Bus to all awaiting FUs and the reorder buffer; mark reservation station available.

Page 10: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

10

Steps of Speculative Tomasulo Algorithm

4. Commit [update register file with reorder result]When instruction reaches the head of reorder buffer The result is presentNo exceptions associated with the instruction:

The instruction becomes non-speculative:Update register file with result (or store to memory)Remove the instruction from the reorder buffer.

A mispredicted branch flushes the reorder buffer.

Page 11: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

11MIPS FP Unit

Page 12: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

12Renaming RegistersCommon variation of speculative design

Reorder buffer keeps instruction information but not the result

Extend register file with extra renaming registers to hold speculative results

Rename register allocated at issue; result into rename register on execution complete; rename register into real register on commit

Operands read either from register file (real or speculative) or via Common Data Bus

Advantage: operands are always from single source (extended register file)

Page 13: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

13Renaming Registers1. Index a MAP table using the

source register identifiers to get the physical register number.

2. Get the previous physical register number for the destination register.

3. Allocate a free physical register and modify the MAP table by indexing it with the destination register identifier.

4. When instruction commits, return the previous physical register to the pool.

.

.

Map table125

012

293031

.

.

Physical registers

012

125126127

Page 14: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

14Renaming Registers

54

0

23

1

Map table

012345678 7

6

910221317

R7=r4+r3R6=r2+r6R3=r6+r7R6=r6+10

Code sequence

Page 15: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

15Renaming Registers

54

0

23

1

Map table

01234567 7

6

910221317

R7=r4+r3R6=r2+r6R3=r6+r7R6=r6+10

Code sequence Renamed Code sequence

Page 16: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

16Renaming Registers

54

0

23

1

Map table

01234567 9

6

10221317

R7=r4+r3R6=r2+r6R3=r6+r7R6=r6+10

Code sequence

R9=r4+r3

Renamed Code sequence

R7

Previous Dest

Page 17: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

17Renaming Registers

54

0

23

1

Map table

01234567 9

10

221317

R7=r4+r3R6=r2+r6R3=r6+r7R6=r6+10

Code sequence

R9=r4+r3R10=r2+r6

Renamed Code sequence

R7r6

Previous Dest

Page 18: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

18Renaming Registers

54

0

222

1

Map table

01234567 9

10

1317

R7=r4+r3R6=r2+r6R3=r6+r7R6=r6+10

Code sequence

R9=r4+r3R10=r2+r6R22=r10+r9

Renamed Code sequence

R7R6R3

Previous Dest

Page 19: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

19Renaming Registers

54

0

222

1

Map table

01234567 9

13

17

R7=r4+r3R6=r2+r6R3=r6+r7R6=r6+10

Code sequence

R9=r4+r3R10=r2+r6R22=r10+r9R13=r10+10

Renamed Code sequence

R7R6R3R10

Previous Dest

Page 20: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

20Renaming Registers

54

0

222

1

Map table

01234567 9

13

1710

R7=r4+r3R6=r2+r6R3=r6+r7R6=r6+10

Code sequence

R9=r4+r3R10=r2+r6R22=r10+r9R13=r10+10

Renamed Code sequence

R7R6R3R10

Previous Dest

When r13=r10+10 retires

Page 21: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

21Limits to ILP

Assumptions for ideal/perfect machine to start:

1. Register renaming–infinite virtual registers and all WAW & WAR hazards are avoided

2. Branch prediction–perfect; no mispredictions

3. Jump prediction–all jumps perfectly predicted => machine with perfect speculation & an unbounded buffer of instructions available

4. Memory-address alias analysis–addresses are known & a load can be moved before a store provided addresses not equal

1 cycle latency for all instructions; unlimited number of instructions issued per clock cycle

Page 22: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

22Upper Limit to ILP: Ideal Machine

Programs

0

20

40

60

80

100

120

140

160

gcc espresso li fpppp doducd tomcatv

54.862.6

17.9

75.2

118.7

150.1

Integer: 18 - 60

FP: 75 - 150

IPC

Page 23: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

23

Program

0

10

20

30

40

50

60

gcc espresso li fpppp doducd tomcatv

35

41

16

6158

60

9

1210

48

15

6 7 6

46

13

45

6 6 7

45

14

45

2 2 2

29

4

19

46

Perfect Selective predictor Standard 2-bit Static None

Change from Infinite window to examine to 2000 and maximum issue of 64 instructions per clock cycle

FP: 15 - 45

Integer: 6 - 12

IPC

More Realistic HW: Branch Impact

Page 24: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

24

Program

0

10

20

30

40

50

60

gcc espresso li fpppp doducd tomcatv

11

15

12

29

54

10

15

12

49

16

1013

12

35

15

44

9 10 11

20

11

28

5 5 6 5 57

4 45

45 5

59

45

Infinite 256 128 64 32 None

Change 2000 instrwindow, 64 instr issue, 8K 2 level Prediction

Integer: 5 - 15

FP: 11 - 45

IPC

More Realistic HW: Register Impact

Page 25: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

25

Program

0

5

10

15

20

25

30

35

40

45

50

gcc espresso li fpppp doducd tomcatv

10

15

12

49

16

45

7 79

49

16

4 5 4 46 5

35

3 3 4 4

45

Perfect Global/stack Perfect Inspection None

Change 2000 instr window, 64 instr issue, 8K 2 level Prediction, 256 renaming registers

FP: 4 - 45(Fortran,no heap)

Integer: 4 - 9

IPC

More Realistic HW: Alias Impact

Page 26: Multiple Instruction Issue and Hardware Based Speculationsoner/Classes/CS-4431/PDF-Slides/Lecture-0… · Hardware Based Speculation. 5 •It is a simple circular array with a head

26

Program

0

10

20

30

40

50

60

gcc expresso li fpppp doducd tomcatv

10

15

12

52

17

56

10

15

12

47

16

10

1311

35

15

34

910 11

22

12

8 8 9

14

9

14

6 6 68

79

4 4 4 5 46

3 2 3 3 3 3

45

22

Infinite 256 128 64 32 16 8 4

Perfect disambiguation (HW), 1K Selective Prediction, 16 entry return, 64 registers, issue as many as window

Integer: 6 - 12

FP: 8 - 45

IPC

Realistic HW for ‘9X: Window Impact