Bias-Free Branch Predictor. 2014 47th Annual IEEE/ACM Int. Symposium on Microarchitecture. Dibakar Gope and Mikko H. Lipasti. 2015. 05. 12. Microprocessor Arch. 김인식

2015. 05. 12.

Bias-Free Branch Predictor

2014 47th Annual IEEE/ACM Int. Symposium on Microarchitecture

Dibakar Gope and Mikko H. Lipasti

Microprocessor Arch.

김인식

INDEX

1. Introduction

2. Global History : Biased Branch vs non-Biased Branch

3. Bias-Free Branch Predictor - Filtering Biased Branch from the Global History

- Biased Branch Detection : Branch Status Table

- Filtering multiple instances from the global history

- Positional History

4. Bias-Free Neural Branch Predictor

5. Bias-Free TAGE Branch Predictor

6. Bias-Free Branch Predictor – Experimental Results

Introduction - Branch Prediction - Bi-modal Branch Predictor

Introduction - Branch Prediction - Two-level Adaptive Branch Predictor

Only local branch history → no inter-branch correlation information.

Introduction - Branch Prediction

Only local information + instruction address access → aliasing → prediction miss

Global Branch History

Introduction - Branch Prediction

Global Branch History - Branch Correlation

Introduction - Branch Prediction

The number of cases (table index ↑)

Branch correlation (global history ↑)

High prediction accuracy
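The trade-off on this slide can be sketched in a few lines: every extra bit of global history doubles the number of table entries (cases), and in exchange the predictor can tell apart more branch-outcome contexts. The sketch below is only an illustration, not the predictor from the paper. The class name, the 2-bit saturating counters, and indexing by history alone (without the branch address) are assumptions made to keep it short.

```python
class GlobalHistoryPredictor:
    """Illustrative only: 2-bit counters indexed by the last h branch outcomes."""

    def __init__(self, history_bits):
        self.h = history_bits
        self.ghr = 0                               # global history register
        self.table = [1] * (1 << history_bits)     # 2**h counters, start weakly not-taken

    def predict(self):
        return self.table[self.ghr] >= 2           # True means "predict taken"

    def update(self, taken):
        c = self.table[self.ghr]
        self.table[self.ghr] = min(3, c + 1) if taken else max(0, c - 1)
        # Shift the newest outcome into the history and keep only h bits.
        self.ghr = ((self.ghr << 1) | int(taken)) & ((1 << self.h) - 1)


def count_mispredictions(predictor, outcomes):
    """Run the predictor over a list of outcomes and count wrong predictions."""
    misses = 0
    for taken in outcomes:
        if predictor.predict() != bool(taken):
            misses += 1
        predictor.update(bool(taken))
    return misses
```

With `history_bits=7` the table has 128 entries, and with 8 it has 256, which is the growth in the number of cases that the slide points at.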

Global History : Biased vs non-Biased Branch

[Figure: control-flow graph with branches A, B, C, D, and E. Path_1 passes through A, C, D, E; Path_2 passes through A, B, E.]

Path_1

A : 1 C : 1 D : 0 E : 1

Path_2

A : 0 B : 1 E : 0

Biased: B, C, D

Non-Biased: A, E

→ Branches B, C, and D are each fixed: always taken or always not taken

Global History : Biased vs non-Biased Branch

→ A biased branch is fixed: always taken or always not taken

→ Branch E depends only on the direction of Branch A

→ Branches B, C, and D contribute no useful information

Path_1

A : 1 C : 1 D : 0 E : 1

Path_2

A : 0 B : 1 E : 0
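The two paths above can be checked mechanically. The sketch below encodes only the outcomes listed on the slide and derives which branches are biased and whether E follows A. The variable and function names are illustrative.

```python
# Outcomes copied from the slide: 1 = taken, 0 = not taken.
paths = {
    "Path_1": {"A": 1, "C": 1, "D": 0, "E": 1},
    "Path_2": {"A": 0, "B": 1, "E": 0},
}


def observed_directions(paths):
    """Collect, per branch, the set of directions seen over all paths."""
    seen = {}
    for outcomes in paths.values():
        for branch, taken in outcomes.items():
            seen.setdefault(branch, set()).add(taken)
    return seen


seen = observed_directions(paths)
biased = sorted(b for b, dirs in seen.items() if len(dirs) == 1)
non_biased = sorted(b for b, dirs in seen.items() if len(dirs) == 2)
# E's outcome matches A's outcome on every path in this example.
e_follows_a = all(p["E"] == p["A"] for p in paths.values())
```

Here `biased` comes out as B, C, D and `non_biased` as A, E, so the history bits of B, C, and D add nothing when predicting E.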

Bias-Free Branch Predictor - 1. Filtering Biased Branches from the Global History

- The bias-free BP tracks only non-biased branches at runtime

Unfiltered GHR: 1 0 1 0 0 1 0
                A X Y B Z B C

Bias-Free GHR:  1 0 1 0
                A B B C

→ Non-biased branch history: decides the execution path, affects the next branch's direction, and carries the important bits for branch correlation

→ Biased branch history: always fixed by the execution path; does not add to the number of cases

→ We can save 3 bits without any loss of correlation information
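The A X Y B Z B C example can be reproduced with a small filter. This is a sketch of the idea only. The set of non-biased branches is given by hand here, whereas the slides obtain it at runtime from the Branch Status Table, and the function name is hypothetical.

```python
def filter_biased(history, non_biased):
    """Keep only the history entries that belong to non-biased branches."""
    return [(pc, taken) for pc, taken in history if pc in non_biased]


# (branch, outcome) pairs in the order shown on the slide.
unfiltered = [("A", 1), ("X", 0), ("Y", 1), ("B", 0), ("Z", 0), ("B", 1), ("C", 0)]
bias_free = filter_biased(unfiltered, non_biased={"A", "B", "C"})
bits_saved = len(unfiltered) - len(bias_free)
```

The filtered history holds A, B, B, C with bits 1 0 1 0, and `bits_saved` is 3, matching the slide.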

Bias-Free Branch Predictor - 2. Biased Branch Detection : Branch Status Table (B.S.T)

- Each entry encodes one of 4 possible states, driven by an F.S.M

- The B.S.T is a direct-mapped structure

- When a branch in an intermediate state (seen in only one direction so far) executes in the opposite direction, it is classified as a non-biased branch
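A minimal sketch of such a table follows. The slide only says that there are four states and that a branch becomes non-biased once it runs against its earlier direction. The state names, the table size, the modulo index, and the choice to make the non-biased state permanent are assumptions made for illustration.

```python
# Assumed names for the four states (the slide does not list them).
NOT_FOUND, TAKEN, NOT_TAKEN, NON_BIASED = range(4)


class BranchStatusTable:
    """Direct-mapped table of per-branch status; no tags, so aliasing is possible."""

    def __init__(self, entries=1024):
        self.entries = entries
        self.state = [NOT_FOUND] * entries

    def _index(self, pc):
        return pc % self.entries              # assumed index function

    def update(self, pc, taken):
        i = self._index(pc)
        s = self.state[i]
        if s == NOT_FOUND:
            # First sighting: remember the only direction seen so far.
            self.state[i] = TAKEN if taken else NOT_TAKEN
        elif (s == TAKEN and not taken) or (s == NOT_TAKEN and taken):
            # Opposite direction observed: the branch is non-biased.
            self.state[i] = NON_BIASED
        # Otherwise the state is unchanged (NON_BIASED is kept in this sketch).

    def is_non_biased(self, pc):
        return self.state[self._index(pc)] == NON_BIASED
```

A branch that resolves taken three times stays in the taken state, and its first not-taken outcome moves it to non-biased, after which its outcomes would be admitted into the filtered history.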

Bias-Free Branch Predictor - 3. Filtering multiple instances from the global history

Unfiltered GHR :       1 0 0 1 0 1 0
                       A B C B A C B

Non-Biased Branches :  A B C

Bias-Free GHR :        1 0 0

- B.F.B.P tracks only the latest occurrence of each non-biased branch

- This preserves correlation information from more distant branches

- It minimizes the footprint of the GHR
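The reduction from seven bits to three can be reproduced as follows. The slide does not state the time order of the history, so this sketch assumes the leftmost entry is the most recent one; under that assumption it yields the 1 0 0 shown for A, B, C. The function name is illustrative.

```python
def latest_occurrences(history_newest_first):
    """Keep only the most recent outcome of each branch, newest first."""
    seen = set()
    kept = []
    for pc, taken in history_newest_first:
        if pc not in seen:          # older duplicates of the same branch are dropped
            seen.add(pc)
            kept.append((pc, taken))
    return kept


# Branches and outcomes from the slide, assumed to be listed newest first.
unfiltered = list(zip("ABCBACB", [1, 0, 0, 1, 0, 1, 0]))
bias_free = latest_occurrences(unfiltered)
```

Because each branch occupies one slot, the same number of history bits reaches further back in time, which is the distant-correlation benefit named in the bullets.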

Bias-Free Branch Predictor - 3. Filtering multiple instances from the global history - Recency Stack structure

[Figure: recency stack. A chain of D flip-flops (D → Q) shifts the input h_in on CLK; each stage stores a branch address (PC_x, PC_y, PC_z) and compares it (=?) with the incoming non-biased branch address PC_nb. Step labels ② and ③ mark each stage; step ③ ends in "Hit!".]

- On a hit: clock gated → bit shifting is held
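The behaviour of the recency stack can be modelled in software as a move-to-front list. This is a behavioural sketch, not the circuit. It assumes that on a hit at some depth the younger entries shift down by one while the entries below the hit keep their place (the software counterpart of the gated clock), and that on a miss the oldest entry falls off the end. The class and method names are hypothetical.

```python
class RecencyStack:
    """Behavioural model: one entry per non-biased branch, most recent first."""

    def __init__(self, depth):
        self.depth = depth
        self.entries = []                     # index 0 = most recent; (pc, outcome)

    def push(self, pc, taken):
        # Compare the incoming PC with every stored PC (the "=?" comparators).
        for k, (entry_pc, _) in enumerate(self.entries):
            if entry_pc == pc:
                del self.entries[k]           # hit: entries older than k do not move
                break
        self.entries.insert(0, (pc, taken))   # newest outcome enters at the top
        del self.entries[self.depth:]         # miss on a full stack drops the oldest

    def history_bits(self):
        return [taken for _, taken in self.entries]


# Replay the slide's A B C B A C B history, oldest outcome first
# (assuming the slide lists it newest first).
stack = RecencyStack(depth=8)
for pc, taken in reversed(list(zip("ABCBACB", [1, 0, 0, 1, 0, 1, 0]))):
    stack.push(pc, taken)
```

After the replay the stack holds A, B, C with bits 1 0 0, the same bias-free GHR as on the earlier slide.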

Bias-Free Branch Predictor - 4. Positional History

if (some_condition)                  // Branch A
    array[10] = 1;

for (i = 0; i < 100; i++) {          // Branch L
    if (array[i] == 1) { ..... }     // Branch X -> 100 instances
}

- Only one instance of Branch X (when i is 10) is correlated with Branch A

- Branch X is classified as a biased branch until i == 10

- Each non-biased branch is recorded together with its positional history

- Positional history conveys how far back in the past history the non-biased branch occurred
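One way to picture positional history is to stamp each recency-stack entry with the time it was committed and report its distance from the current branch. The sketch below is an assumption-laden illustration: the class, the commit counter, and the order of branches inside one loop iteration (X, then L) are all hypothetical and are not taken from the slides.

```python
class PositionalRecencyStack:
    """Recency stack whose entries also remember when they were committed."""

    def __init__(self, depth):
        self.depth = depth
        self.clock = 0                 # number of branches committed so far
        self.entries = []              # newest first: (pc, taken, commit_time)

    def commit(self, pc, taken, non_biased):
        self.clock += 1
        if not non_biased:
            return                     # biased branches only advance the clock
        self.entries = [e for e in self.entries if e[0] != pc]
        self.entries.insert(0, (pc, taken, self.clock))
        del self.entries[self.depth:]

    def positions(self):
        """Distance, in committed branches, from each entry to the present."""
        return {pc: self.clock - t for pc, _, t in self.entries}


stack = PositionalRecencyStack(depth=4)
stack.commit("A", 1, non_biased=True)
distance_to_a = []
for i in range(11):
    # Distance to Branch A as seen when predicting Branch X in iteration i.
    distance_to_a.append(stack.positions()["A"])
    stack.commit("X", int(i == 10), non_biased=False)
    stack.commit("L", 1, non_biased=False)
```

Every instance of X sees Branch A at a different distance, so the distance lets a predictor single out the i == 10 instance even though A's outcome bit is the same for all of them.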

Bias-Free Neural Branch Predictor

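[Figure: recency-stack diagram as on the Recency Stack structure slide: a chain of D flip-flops (D → Q) with input h_in and CLK, each stage comparing (=?) a stored PC_x, PC_y, or PC_z with PC_nb.]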

Bias-Free TAGE Branch Predictor

Bias-Free Branch Predictor – Experimental Results

- Average MPKI: OH-SNAP 2.63 / BF-Neural 2.49 / TAGE 2.445

- BF-Neural improves the accuracy by 5.32% (relative to OH-SNAP)
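The 5.32% figure is consistent with a relative MPKI reduction computed against OH-SNAP; the slide does not name the baseline, so that reading is an assumption. The helper below is illustrative, and the second call applies the same arithmetic to the 7-table BF-TAGE and ISL-TAGE numbers reported later in the deck.

```python
def relative_mpki_reduction(baseline, improved):
    """Percentage by which MPKI drops when going from baseline to improved."""
    return (baseline - improved) / baseline * 100.0


# MPKI values as reported on the slides.
bf_neural_vs_oh_snap = relative_mpki_reduction(2.63, 2.49)   # about 5.32
bf_tage_vs_isl_tage = relative_mpki_reduction(2.73, 2.57)
```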

Bias-Free Branch Predictor – Experimental Results

- MPKI comparison for different numbers of tables

- MPKI with 7 tagged tables: BF-TAGE 2.57 / ISL-TAGE 2.73

※ Both use 70 history bits to index the 7th table

→ The BF GHR can offer much richer context

Bias-Free Branch Predictor – Experimental Results

- TAGE achieves its best accuracy using 15 tagged tables.

- In ARM's Cortex-A15, the branch predictor accounts for 12-15% of all core energy.

- It is therefore important to implement a smaller TAGE with fewer tables.

- BF-TAGE with 10 tagged tables matches the accuracy of TAGE with 15 tagged tables.

→ It is a suitable choice for embedded and mobile processors

Thank you.