polyhedral optimizations of explicitly parallel programs · 10 background explicit parallelism -...

38
1 Polyhedral Optimizations of Explicitly Parallel Programs Prasanth Chatarasi , Jun Shirako, Vivek Sarkar Habanero Extreme Scale Software Research Group Department of Computer Science Rice University The 24th International Conference on Parallel Architectures and Compilation Techniques (PACT) October 19, 2015 Prasanth Chatarasi , Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Upload: others

Post on 31-May-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

1

Polyhedral Optimizations of Explicitly Parallel Programs

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar

Habanero Extreme Scale Software Research GroupDepartment of Computer Science

Rice University

The 24th International Conference onParallel Architectures and Compilation Techniques (PACT)

October 19, 2015

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 2: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

2

Introduction and Motivation

Introduction

Moving towards Extreme-Scale and Exa-Scale computing systems

Billions of billions operations per second

Enabling applications to fully exploit the systems is not easy !

How ??

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 3: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

3

Introduction and Motivation

Introduction

Two approaches from past work:

1) Manually parallelize using explicitly-parallel programming models (E.g., CAF, Cilk,Habanero, MPI, OpenMP, UPC etc)

Optimizations performed by programmer not compiler !Tedious ! But can deliver good performance, with sufficient effort

2) Automatically parallelize sequential programs

Done by compilers not humans !Easy ! But, limitations exist.

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 4: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

4

Introduction and Motivation

Motivation and Our Approach

Motivation

Programmer expresses logical parallelism in the application and then let compilerperform optimizations accordingly

Our approach

Automatically optimize explicitly-parallel programs

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 5: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

5

Introduction and Motivation

Glimpse of benefits

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 6: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

6

Background

1 Introduction and Motivation

2 Background

3 Our framework

4 Evaluation

5 Related Work

6 Conclusions and Future work

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 7: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

7

Background

Explicit Parallelism - Loop level parallelism

Major difference between Sequential and Parallel programs

Sequential programs - total execution orderParallel programs - partial execution order

Loop-level parallelism (since OpenMP 1.0)

Loop is annotated with ’omp parallel for’Iterations of the loop can be run in parallel

1#pragma omp p a r a l l e l f o r2 f o r (i−loop ) {3 S1 ;4 S2 ;5 S3 ;6 }

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 8: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

8

Background

Explicit Parallelism - Task level parallelism

Task-level parallelism (OpenMP 3.0 & 4.0)

Region of code is annotated with ’omp task’Synchronization

B/w parent and children - ’omp taskwait’B/w siblings - ’depend’ clause

1#pragma omp ta sk depend ( out : A) //T12 {S1}3#pragma omp ta sk depend ( i n : A) //T24 {S2}5#pragma omp ta sk // T36 {S3}7#pragma omp t a s kwa i t // Tw

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 9: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

9

Background

Explicit Parallelism - Happens before relation

Happens-Before relation

Specification of partial order amongdynamic statement instancesHB(S1, S2) = true ↔ S1 must happenbefore S2, where S1 and S2 are statementinstances.

HB(S1 (i), S2(i)) = trueHB(S1, S2) = true, HB(S2, S3) = false

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 10: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

10

Background

Explicit Parallelism - Serial elision property

Serial-Elision property

Removal of all parallel constructs results in asequential program that is a valid (albeitinefficient) implementation of the parallel programsemantics.

1#pragma omp ta sk depend ( out : A) //T12 {S1}3#pragma omp ta sk depend ( i n : A) //T24 {S2}5#pragma omp ta sk // T36 {S3}7#pragma omp t a s kwa i t // Tw

Satisfies serial-elision

Original task graph

Removingparallelconstructs

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 11: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

11

Background

Polyhedral Compilation Techniques

Compiler techniques for analysis and transformation of codes with nested loops

Algebraic framework for affine program optimizations

Advantages over AST based frameworks

Reasoning at statement instance levelUnifies many complex loop transformations

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 12: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

12

Background

Polyhedral Representation (SCoP)

A statment (S) in the program is represented as follows in Static Control Part(SCoP):

1) Iteration domain (DS)

Set of statement (S) instances

2) Schedule (ΘS)

Assigns logical time stamp to the statement instances (S)Gives ordering information b/w statement instancesCaptures sequential execution order of a programStatement instances are executed in increasing order of schedules

3) Access function (AS)

Array subscripts in the statement (S)

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 13: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

13

Background

Polyhedral Compilation Techniques - Summary

Advantages

Precise data dependency computationUnified formulation of complex set of loop transformations

LimitationsAffine array subscripts

But, conservative approaches exist !

Static affine control flow

Control dependences are modeled in same way as data dependences.

Assumes input is sequential program

Unaware of happens-before relation in input parallel program

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 14: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

14

Background

Automatic parallelization of sequential programs

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 15: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

15

Our framework

1 Introduction and Motivation

2 Background

3 Our framework

4 Evaluation

5 Related Work

6 Conclusions and Future work

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 16: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

16

Our framework

Polyhedral optimizations of Parallel Programs (PoPP)

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 17: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

17

Our framework

PoPP - Program Analysis

Step1: Compute dependences based on the sequential order (use serial elision andignore parallel constructs)

1 #pragma omp p a r a l l e l2 #pragma omp s i n g l e3 {4 f o r ( i n t it = itold + 1 ; it <= itnew ; it++) {5 f o r ( i n t i = 0 ; i < nx ; i++) {6 #pragma omp ta s k depend ( out : u [ i ] ) \7 depend ( in : unew [ i ] ) // T18 f o r ( i n t j = 0 ; j < ny ; j++)9 S1 : u [ i ] [ j ] = unew [ i ] [ j ] ;

10 }11 f o r ( i n t i = 0 ; i < nx ; i++) {12 #pragma omp ta s k depend ( out : unew [ i ] ) \13 depend ( in : f [ i ] , u [ i −1 ] , u [ i ] , u [ i+1]) // T214 f o r ( i n t j = 0 ; j < ny ; j++)15 S2 : cpd (i , j , unew , u , f ) ;16 }17 }18 #pragma omp taskwait // Tw19 }

Conservative analysis, but may still capturevectorization possibility

i

it

nx

itnew b b b b b

b b b b b

b b b b b

b b b b b

rs

0 1 2 3

1

2

(S2 → S1)dependencesacross it & i loops

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 18: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

18

Our framework

PoPP - Program Analysis

Step1: Compute dependences based on the sequential order (use serial elision andignore parallel constructs)

Step2: Compute happens-before relation

1 #pragma omp p a r a l l e l2 #pragma omp s i n g l e3 {4 f o r ( i n t it = itold + 1 ; it <= itnew ; it++) {5 f o r ( i n t i = 0 ; i < nx ; i++) {6 #pragma omp ta s k depend ( out : u [ i ] ) \7 depend ( in : unew [ i ] ) // T18 f o r ( i n t j = 0 ; j < ny ; j++)9 S1 : u [ i ] [ j ] = unew [ i ] [ j ] ;

10 }11 f o r ( i n t i = 0 ; i < nx ; i++) {12 #pragma omp ta s k depend ( out : unew [ i ] ) \13 depend ( in : f [ i ] , u [ i −1 ] , u [ i ] , u [ i+1]) // T214 f o r ( i n t j = 0 ; j < ny ; j++)15 S2 : cpd (i , j , unew , u , f ) ;16 }17 }18 #pragma omp taskwait // Tw19 }

i

it

nx

itnew b b b b b

b b b b b

b b b b b

b b b b b

rs

0 1 2 3

1

2

(S2→S1) HB edgesacross it & i loops

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 19: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

19

Our framework

PoPP - Program Analysis

Step1: Compute dependences

Step2: Compute Happens-before relation

Step3: Intersect 1 & 2 (Gives best of both worlds)

i

it

nx

itnew b b b b b

b b b b b

b b b b b

b b b b b

rs

0 1 2 3

1

2

Conservativedependences PS2→S1

1

∩i

it

nx

itnew b b b b b

b b b b b

b b b b b

b b b b b

rs

0 1 2 3

1

2

HB relationHB

S2→S11

=

i

it

nx

itnew b b b b b

b b b b b

b b b b b

b b b b b

rs

0 1 2 3

1

2

Refined dependencesP′S2→S11

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 20: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

20

Our framework

PoPP - Program Analysis

Step1: Compute dependences

Step2: Compute Happens-before relation

Step3: Intersect 1 & 2 (Gives best of both worlds)

j

i

ny

nx b b b b b

b b b b b

b b b b b

b b b b b

0 1 2 3

1

2

Conservativedependences PS1→S1

1(j-loop is parallel for S1)

∩j

i

ny

nx b b b b b

b b b b b

b b b b b

b b b b b

0 1 2 3

1

2

HB relationHB

S1→S11 ( i-loop is

parallel for S1)

=

j

i

ny

nx b b b b b

b b b b b

b b b b b

b b b b b

0 1 2 3

1

2

Refined dependencesP′S1→S11 (No

dependences for S1)Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 21: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

21

Our framework

PoPP - Program Transformations

Step4: Use refined dependences in existing optimizations

i

it

nx

itnew b b b b

b b b b

b b b b

b b b b

0 1 2

1

2

Refined dependences, P′S2→S11

Skewing and tiling the iteration space

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 22: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

22

Our framework

PoPP - Code generation

Step5: Generate optimized code using fine grained synchronization

2 #pragma omp p a r a l l e l f o r \3 private (c3 , c5 ) ordered (2 )4 f o r ( c1 = itold + 1 ; c1 <= itnew ; c1++) {5 f o r ( c3 = 2 ∗ c1 ; c3 <= 2 ∗ c1 + nx ; c3++) {6 #pragma omp ordered \7 depend ( sink : c1−1 , c3 ) depend ( sink : c1 , c3−1)8 i f ( c3 <= 2 ∗ c1 + nx + −1) {9 f o r ( c5 = 0 ; c5 < ny ; c5++)

10 S1 : u [−2∗c1+c3 ] [ c5 ] = unew [−2∗c1+c3 ] [ c5 ] ;11 }

13 i f ( c3 >= 2 ∗ c1 + 1) {14 f o r ( c5 = 0 ; c5 < ny ; c5++)15 S2 : cpd (−2∗c1+c3−1 , c5 , unew , u , f ) ;16 }17 #pragma omp ordered depend ( source )18 }}

Doacross loop synchronization - OpenMP 4.1

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 23: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

23

Our framework

PoPP - Workflow (in ROSE Compiler)

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 24: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

24

Our framework

PoPP - Transformations & Code Generation

Transformations - PolyAST framework [Shirako et.al SC’2014]

To perform loop optimizationsHybrid approach of polyhedral and AST-based transformationsDetects reduction, doacross and doall parallelism from dependences

Code Generation

Doall parallelism - omp parallel forDoacross parallelism - omp doacross

Proposed in OpenMP 4.1 [Shirako et.al IWOMP’11]Allows fine grained synchronization in multidimensional loop nests

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 25: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

25

Our framework

Extensions to Polyhedral frameworks

Correctness of Intersection approach

Serial-elision property makes it correct !

Computing conservative dependences

Non-affine subscripts, Unknown function calls, Non-affine conditionals etcExtended access functions to support

Extracting and Encoding task-related constructs in polyhedral representation(SCoP)

Constructed task SCoP to compute HB relation

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 26: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

26

Evaluation

1 Introduction and Motivation

2 Background

3 Our framework

4 Evaluation

5 Related Work

6 Conclusions and Future work

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 27: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

27

Evaluation

Evaluation: Benchmarks and Platforms

Intel Xeon 5660(Westmere)

IBM Power 8E(Power 8)

Microarch Westmere Power PCClock speed 2.80GHz 3.02GHzCores/socket 6 12Total cores 12 24Compiler gcc -4.9.2 gcc -4.9.2Compiler flags -O3 -fast(icc) -O3

KASTORS -Task parallel (3)

Jacobi, Jacobi-blocked, SparseLU

RODINIA -Loop parallel (8)

Back propagation, CFDsolver, Hotspot, Kmeans,LUD, Needle-Wunch, Particlefilter, Path finder

Unanalyzable data accesspatterns - 7 benchmarks

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 28: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

28

Evaluation

Variants

Variants in the experiments

Original OpenMP program (Blue bars)

Written by programmer

Automatic parallelization and optimization of serial elision version ofOpenMP program (Green bars)

Automatic optimizers

Optimized OpenMP program with intersection approach (Yellow bars)

Our framework (PoPP)

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 29: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

29

Evaluation

KASTORS suite + Intel Westmere (12 cores)

Geometric mean improvement - 2.02x

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 30: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

30

Evaluation

KASTORS suite + IBM Power8 (24 cores)

Geometric mean improvement - 7.32x

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 31: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

31

Evaluation

RODINIA suite + Intel westmere (12 cores)

Geometric mean improvement - 1.48x

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 32: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

32

Evaluation

RODINIA suite + IBM Power8 (24 cores)

Geometric mean improvement - 1.89x

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 33: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

33

Related Work

1 Introduction and Motivation

2 Background

3 Our framework

4 Evaluation

5 Related Work

6 Conclusions and Future work

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 34: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

34

Related Work

Related work

Dataflow analysis of explicitly parallel programs

Extensions to data-parallel/ task-parallel languages [J.F.Collard et.al Europar’96]Extensions to X10 programs with async-finish languages [T. Yuki et.al PPoPP’13]Above work is limited to analysis but we also focus on transformations.

PENCIL - Platform Neutral Compute Intermediate Language [Baghdadi et.al.PACT’15]

Prunes data-dependence relation on parallel loopsNo support for task parallel constructs as yetEnforces certain coding restrictions related to aliasing, recursion etc.

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 35: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

35

Related Work

Related work (contd)

Polyhedral optimization framework for DFGL [Sbirlea et.al LCPC’15]

Dataflow programming model - Implicitly parallelOptimizations via polyhedral & AST-based framework

Preliminary approach to optimize parallel programs [Pop and Cohen CPC’10]

Extract parallel semantics into compiler IR and perform polyhedral optimizationsEnvisaged on considering OpenMP streaming extensions

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 36: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

36

Conclusions and Future work

1 Introduction and Motivation

2 Background

3 Our framework

4 Evaluation

5 Related Work

6 Conclusions and Future work

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 37: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

37

Conclusions and Future work

PoPP - Conclusions and Future work

Conclusions: Our approach

Reduced spurious dependences from conservative analysis by intersecting with HBrelationBroadened the range of legal transformations for parallel programsIntegrated HB relation from task-parallel constructs into Polyhedral frameworksGeometric mean performance improvement of 1.62X on Intel westmere and 2.75X onIBM Power8 - Larger improvements !!

Future work:

Parallel constructs that don’t satisfy serial-elision propertyExtend to distributed-memory programming models (Eg: MPI)Happens-Before relation for debuggingBeyond polyhedral

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs

Page 38: Polyhedral Optimizations of Explicitly Parallel Programs · 10 Background Explicit Parallelism - Serial elision property Serial-Elision property Removal of all parallel constructs

38

Conclusions and Future work

Finally,

Optimizing explicitly parallel programs is a new direction for Parallel Architecturesand Compilation Techniques (PACT)!

Acknowledgments

Rice Habanero Extreme Scale Software Research GroupPACT 2015 program committee

Thank you!

Prasanth Chatarasi, Jun Shirako, Vivek Sarkar Polyhedral Optimizations of Explicitly Parallel Programs