cpe 731 advanced computer architecture ilp: part ii – branch prediction dr. gheith abandah adapted...

24
CPE 731 Advanced Computer Architecture ILP: Part II – Branch Prediction Dr. Gheith Abandah Adapted from the slides of Prof. David Patterson, University of California, Berkeley

Upload: oswald-walton

Post on 18-Dec-2015

215 views

Category:

Documents


1 download

TRANSCRIPT

CPE 731 Advanced Computer

Architecture

ILP: Part II – Branch Prediction

Dr. Gheith Abandah

Adapted from the slides of Prof. David Patterson, University of California, Berkeley

04/18/23 CPE 731, ILP2 2

Outline

• Static Branch Prediction

• Dynamic Branch Prediction

• Branch History Table

• Correlated Branch Prediction

• Tournament Predictors

• Branch Target Buffer

04/18/23 CPE 731, ILP2 3

12%

22%

18%

11% 12%

4%6%

9% 10%

15%

0%

5%

10%

15%

20%

25%

compress

eqntott

espresso gc

c li

doduc

ear

hydro2d

mdljdp

su2cor

Mis

pre

dic

tio

n R

ate

Static Branch Prediction

• We learned how to schedule code around delayed branch

• To reorder code around branches, need to predict branch statically when compile

• Simplest scheme is to predict a branch as taken– Average misprediction = untaken branch frequency = 34% SPEC

• More accurate scheme predicts branches using profile information collected from earlier runs, and modify prediction based on last run:

Integer Floating Point

04/18/23 CPE 731, ILP2 4

Dynamic Branch Prediction

• Why does prediction work?– Underlying algorithm has regularities

– Data that is being operated on has regularities

– Instruction sequence has redundancies that are artifacts of way that humans/compilers think about problems

• Is dynamic branch prediction better than static branch prediction?– Seems to be

– There are a small number of important branches in programs which have dynamic behavior

• Performance = ƒ(accuracy, cost of misprediction)

04/18/23 CPE 731, ILP2 5

Branch History Table

• Use the lower k bits of the PC to address index the table

Branch Address

Branch History Table (BHT)

2k entries

Predict Taken or not Taken

k <n>

04/18/23 CPE 731, ILP2 6

BHT, n=1

• Problem: in a loop, 1-bit BHT will cause two mispredictions (avg is 9 iterations before exit):– End of loop case, when it exits instead of looping as before

– First time through loop on next time through code, when it predicts exit instead of looping

0 1

Not taken

TakenNot taken

Taken

Predict not taken

Predict taken

04/18/23 CPE 731, ILP2 7

• Solution: 2-bit scheme where change prediction only if get misprediction twice

• Red: stop, not taken

• Green: go, taken

• Adds hysteresis to decision making process

BHT, n=2

T

T NT

NT

Predict Taken

Predict Not Taken

Predict Taken

Predict Not TakenT

NTT

NT

04/18/23 CPE 731, ILP2 8

18%

5%

12%10%

9%

5%

9% 9%

0%1%

0%2%4%6%8%

10%12%14%16%18%20%

eqntott

espresso gc

c li

spice

doduc

spice

fpppp

matrix300

nasa7

Mis

pre

dic

tio

n R

ate

BHT Accuracy

• Mispredict because either:– Wrong guess for that branch

– Got branch history of wrong branch when index the table

• 4096 entry table:

IntegerFloating Point

04/18/23 CPE 731, ILP2 9

Outline

• Static Branch Prediction

• Dynamic Branch Prediction

• Branch History Table

• Correlated Branch Prediction

• Tournament Predictors

• Branch Target Buffer

04/18/23 CPE 731, ILP2 10

Correlated Branch Prediction

• Aka Two-Level Predictors

• Motivation1. if (a==2) a=0;

2. if (b==2) b=0;

3. if (a != b) {…}

Can predict that Condition (3) is not true if Conditions (1) and (2) are true.

04/18/23 CPE 731, ILP2 11

Correlated Branch Prediction

• Idea: record m most recently executed branches as taken or not taken, and use that pattern to select the proper n-bit branch history table

• In general, (m,n) predictor means record last m branches to select between 2m history tables, each with n-bit counters– Thus, old 2-bit BHT is a (0,2) predictor

• Global Branch History: m-bit shift register keeping T/NT status of last m branches.

• Each entry in table has m n-bit predictors.

04/18/23 CPE 731, ILP2 12

Correlating Branch Prediction

(2,2) predictor

– Behavior of recent branches selects between four predictions of next branch, updating just that prediction

Branch address

2-bits per branch predictor

Prediction

2-bit global branch history

4

04/18/23 CPE 731, ILP2 13

Correlated Branch Prediction

Example:

Find the size of (2, 2) predictor with k=10

Size = 2m * n * 2k

= 4 * 2 * 1K = 8 Kbits

04/18/23 CPE 731, ILP2 14

0%

Fre

quen

cy o

f M

isp

redi

ctio

ns

0%1%

5%6% 6%

11%

4%

6%5%

1%2%

4%

6%

8%

10%

12%

14%

16%

18%

20%

4,096 entries: 2-bits per entry Unlimited entries: 2-bits/entry 1,024 entries (2,2)

Accuracy of Different Schemes

4096 Entries 2-bit BHTUnlimited Entries 2-bit BHT1024 Entries (2,2) BHT

nasa

7

mat

rix3

00

dodu

cd

spic

e

fppp

p

gcc

expr

esso

eqnt

ott li

tom

catv

04/18/23 CPE 731, ILP2 15

Outline

• Static Branch Prediction

• Dynamic Branch Prediction

• Branch History Table

• Correlated Branch Prediction

• Tournament Predictors

• Branch Target Buffer

04/18/23 CPE 731, ILP2 16

Tournament Predictors

• Multilevel branch predictor

• Use n-bit saturating counter to choose between predictors

• Usual choice between global and local predictors

04/18/23 CPE 731, ILP2 17

Tournament Predictors

Tournament predictor using, say, 4K 2-bit counters indexed by local branch address. Chooses between:

• Global predictor

– 4K entries index by history of last 12 branches (212 = 4K)

– Each entry is a standard 2-bit predictor

• Local predictor

– Local history table: 1024 10-bit entries recording last 10 branches, index by branch address

– The pattern of the last 10 occurrences of that particular branch used to index table of 1K entries with 3-bit saturating counters

04/18/23 CPE 731, ILP2 18

Alpha 21264 Branch Predictor

Selector

Global

Local

BHR 4K

<2>

addr 4K

<2>

<12>

k=12PredictionSelect

addr 1K10

1K

<10> <3>

Size = 4K*2 + 4K*2 + 1K*10 + 1K*3 = 29 Kbits

04/18/23 CPE 731, ILP2 19

Comparing Predictors (Fig. 2.8)

• Advantage of tournament predictor is ability to select the right predictor for a particular branch– Particularly crucial for integer benchmarks.

– A typical tournament predictor will select the global predictor almost 40% of the time for the SPEC integer benchmarks and less than 15% of the time for the SPEC FP benchmarks

04/18/23 CPE 731, ILP2 20

Outline

• Static Branch Prediction

• Dynamic Branch Prediction

• Branch History Table

• Correlated Branch Prediction

• Tournament Predictors

• Branch Target Buffer

• Branch target calculation is costly and stalls the instruction fetch.

• BTB stores PCs the same way as caches

• The PC of a branch is sent to the BTB

• When a match is found the corresponding Predicted PC is returned

• If the branch was predicted taken, instruction fetch continues at the returned predicted PC

Branch Target Buffers (BTB)

Branch Target Buffers

04/18/23 CPE 731, ILP2 23

Pentium 4 Misprediction Rate (per 1000 instructions, not per branch)

11

13

7

12

9

10 0 0

5

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

164.gzip

175.vpr

176.gcc

181.mcf

186.crafty

168.wupwise

171.swim

172.mgrid

173.applu

177.mesa

Bra

nch

mis

pre

dic

tion

s p

er

10

00

In

str

ucti

on

s

SPECint2000 SPECfp2000

6% misprediction rate per branch SPECint (19% of INT instructions are branch)

2% misprediction rate per branch SPECfp(5% of FP instructions are branch)

04/18/23 CPE 731, ILP2 24

Dynamic Branch Prediction Summary

• Prediction becoming important part of execution

• Branch History Table: 2 bits for loop accuracy

• Correlation: Recently executed branches correlated with next branch– Either different branches (GA)

– Or different executions of same branches (PA)

• Tournament predictors take insight to next level, by using multiple predictors – usually one based on global information and one based on local

information, and combining them with a selector

– In 2006, tournament predictors using 30K bits are in processors like the Power5 and Pentium 4

• Branch Target Buffer: include branch address & prediction