Array Operation Synthesis to Optimize Data Parallel Programs

Department of Computer Science,

National Tsing-Hua University

Student: Gwan-Hwan Hwang

Advisor: Dr. Jenq Kuen Lee

Array Operation Synthesis on Distributed-memory Machines

Department of Computer Science, National Tsing-Hua University

Gwan-Hwan Hwang, Ph.D.

Research Interests and Key Issues

Compiler Optimization for Parallel Computations on Distributed & Shared Memory Machines

• Communication Code for Block-Cyclic Distribution of HPF (IPPS’98)

• Array Operation Synthesis for Intrinsic Array Functions (JPDC, ACM PPoPP’95, ICPP’96)

• Automatic Alignment for Data Parallel Languages (LCPC’97)

Concurrent Testing

• Reachability Testing of Concurrent Programs (IJSEKE’95, APSEC’93)

Parallel Object Program Model & Heterogeneous Computing

• Java-Based Network Computing Environment

• Transparent Parallel Computing Environment (Ongoing)

Outline of Presentation

• Fortran 90 Intrinsic Array Operations

• Array Operation Synthesis (AOS)

• SYNTOOL

• Apply AOS to Shared-Memory Machines

• Apply AOS to Distributed-Memory Machines

• Integrate AOS with Automatic Data Alignment

• Conclusion and Future Work

Intrinsic Array Operations

• Provided by modern programming languages, e.g. Fortran 90, High Performance Fortran (HPF), HPF2, Fortran 97, APL, MATLAB, MATHEMATICA, NESL, C*

• Used in engineering and scientific applications

• Facilitate compile-time analysis for optimization

• Support parallel execution and portability

Intrinsic Array Operations (Cont’d)

• Array Operations Provided by Fortran 90, HPF.

• Examples: CSHIFT, TRANSPOSE, MERGE, EOSHIFT, RESHAPE, SPREAD, Section Move, Where Constructs, Reductions.

Array A:

     1  2  3  4
     5  6  7  8
     9 10 11 12
    13 14 15 16

B=CSHIFT(A,1,1)

     5  6  7  8
     9 10 11 12
    13 14 15 16
     1  2  3  4

C=TRANSPOSE(B)

     5  9 13  1
     6 10 14  2
     7 11 15  3
     8 12 16  4
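The two operations in the example above can be checked with a minimal Python sketch. The helper names `cshift` and `transpose` are hypothetical stand-ins written for this illustration (they are not library calls), and plain nested lists stand in for Fortran arrays, so indices are 0-based inside the helpers.

```python
def cshift(a, shift, dim):
    """Circular shift of a 2-D list, mimicking CSHIFT(ARRAY, SHIFT, DIM)."""
    n_rows, n_cols = len(a), len(a[0])
    if dim == 1:
        # result(i, j) = a((i + shift) mod n_rows, j)
        return [[a[(i + shift) % n_rows][j] for j in range(n_cols)]
                for i in range(n_rows)]
    if dim == 2:
        # result(i, j) = a(i, (j + shift) mod n_cols)
        return [[a[i][(j + shift) % n_cols] for j in range(n_cols)]
                for i in range(n_rows)]
    raise ValueError("dim must be 1 or 2 for a 2-D array")


def transpose(a):
    """result(i, j) = a(j, i)."""
    return [list(column) for column in zip(*a)]


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
B = cshift(A, 1, 1)   # rows move up by one, the first row wraps to the bottom
C = transpose(B)

for row in C:
    print(row)
```

Each call allocates a complete new array, which is the per-operation cost that array operation synthesis aims to remove.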

Consecutive Array Expressions

• Array Expression

C=EOSHIFT(MERGE(RESHAPE(S,/N,N/),A+B,T),1,0,1)

• Consecutive Array Operations

FXP=CSHIFT(F1,1,+1)
FXM=CSHIFT(F1,1,-1)
FYP=CSHIFT(F1,2,+1)
FYM=CSHIFT(F1,2,-1)
FDERIV=ZXP*(FXP-F1)+ZXM*(FXM-F1)+ZYP*(FYP-F1)+ZYM*(FYM-F1)

Classification of Array Operations

• Model Array Operations by Data Access Functions (DAF)

                       Single-Clause    Multiple-Clause
    Single-Source      TYPE 1           TYPE 3
    Multiple-Source    TYPE 2           TYPE 4

Data Access Functions

• Represent Array Operations by Mathematical Functions

• Model Array Operations by Data Access Functions (DAF)

  Single-Source, Multiple-Source

  Single-Clause, Multiple-Clause

Type 1: Single-source Single-clause Data Access Function

• One Source Array

• One Data Access Pattern

Array A:

     5  6  7  8
     9 10 11 12
    13 14 15 16
     1  2  3  4

B=TRANSPOSE(A)

Array B:

     5  9 13  1
     6 10 14  2
     7 11 15  3
     8 12 16  4

Data Access Function is B(I,J)=A(J,I)

Type 2: Multiple-source Single-clause Data Access Function

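• Multiple Source Arrays

• One Data Access Pattern

R=MERGE(T,F,M)

    Array T    Array F    Array M    Array R
    1 1 1      2 2 2      T F F      1 2 2
    1 1 1      2 2 2      F T T      2 1 1
    1 1 1      2 2 2      T F T      1 2 1

Data Access Function is

$$
R(I,J) = \mathcal{F}\big(T(I,J),\ F(I,J),\ M(I,J)\big),
\quad\text{where}\quad
\mathcal{F}(x,y,z) =
\begin{cases}
x & \text{if } z = \text{True} \\
y & \text{if } z = \text{False}
\end{cases}
$$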
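A minimal Python sketch of this multiple-source single-clause access function is given below. The helper name `merge` is hypothetical and written only for this illustration; every element of the result is produced by the same pattern, which reads position (I,J) of all three source arrays.

```python
def merge(t, f, m):
    """R(I,J) = T(I,J) where M(I,J) is true, otherwise F(I,J)."""
    return [[tv if mv else fv for tv, fv, mv in zip(t_row, f_row, m_row)]
            for t_row, f_row, m_row in zip(t, f, m)]


T = [[1, 1, 1] for _ in range(3)]
F = [[2, 2, 2] for _ in range(3)]
M = [[True, False, False],
     [False, True, True],
     [True, False, True]]

R = merge(T, F, M)
for row in R:
    print(row)
```

The single comprehension mirrors the single clause: no part of the index space needs a different access pattern.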

Type 3: Single-source Multiple-clause Data Access Function

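• Single Source Array

• Multiple Data Access Patterns

B=CSHIFT(A,1,1)

    Array A           Array B
     1  2  3  4        5  6  7  8
     5  6  7  8        9 10 11 12
     9 10 11 12       13 14 15 16
    13 14 15 16        1  2  3  4

Data Access Function is

$$
B(I,J) =
\begin{cases}
A(I+1,\,J) & \mid\ \phi(/I,J/,\ /1{:}3{:}1,\ 1{:}4{:}1/) \\
A(I-3,\,J) & \mid\ \phi(/I,J/,\ /4{:}4{:}1,\ 1{:}4{:}1/)
\end{cases}
$$

$\phi$: a segmentation descriptor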
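The two-clause access function can be made concrete with a short Python sketch. It assumes that a segmentation descriptor holds when every index expression lies inside its `lower:upper:stride` triplet; the names `in_segment`, `apply_daf`, and `CSHIFT_CLAUSES` are hypothetical and introduced only for this illustration.

```python
def in_segment(index, triplets):
    """phi(/index/, /l:u:s, .../): every coordinate lies in its l:u:s range."""
    return all(l <= x <= u and (x - l) % s == 0
               for x, (l, u, s) in zip(index, triplets))


# Each clause pairs an index mapping with the segment where it applies.
CSHIFT_CLAUSES = [
    (lambda i, j: (i + 1, j), [(1, 3, 1), (1, 4, 1)]),  # A(I+1,J) on rows 1..3
    (lambda i, j: (i - 3, j), [(4, 4, 1), (1, 4, 1)]),  # A(I-3,J) on row 4
]


def apply_daf(a, clauses, n):
    """Evaluate a single-source data access function over an n x n space."""
    b = [[None] * n for _ in range(n)]
    for i in range(1, n + 1):            # 1-based indices, as in Fortran
        for j in range(1, n + 1):
            for access, segment in clauses:
                if in_segment((i, j), segment):
                    si, sj = access(i, j)
                    b[i - 1][j - 1] = a[si - 1][sj - 1]
    return b


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
B = apply_daf(A, CSHIFT_CLAUSES, 4)
for row in B:
    print(row)
```

The two segments partition the index space, so exactly one clause fires for every (I,J).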

Type 4: Multiple-source Multiple-clause Data Access Function

• Multiple Source Arrays

• Multiple Data Access Patterns

• No array operation of Fortran 90 belongs to type 4.

• Synthesis of multiple array operations may derive a type 4 data access function.

Straightforward Compilation

• Translate each operation into a parallel loop

B=CSHIFT(TRANSPOSE(EOSHIFT(A,1,0,1)),1,1)

EOSHIFT:

FORALL (I=1:N:1; J=1:N:1)
  IF (1<=I<=N-1) and (1<=J<=N) THEN
    T1(I,J)=A(I+1,J)
  ELSE
    T1(I,J)=0
ENDFORALL

TRANSPOSE:

FORALL (I=1:N:1; J=1:N:1)
  T2(I,J)=T1(J,I)
ENDFORALL

CSHIFT:

FORALL (I=1:N:1; J=1:N:1)
  IF (1<=I<=N-1) and (1<=J<=N) THEN
    B(I,J)=T2(I+1,J)
  ELSE
    B(I,J)=T2(I-N+1,J)
ENDFORALL
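The three loops can be sketched in Python as follows. This is only an illustration of the operation-by-operation translation: the function name `straightforward` is hypothetical, the loops run sequentially rather than in parallel, and nested lists stand in for Fortran arrays (loop variables stay 1-based, with the offset applied when indexing).

```python
def straightforward(a, n):
    """B = CSHIFT(TRANSPOSE(EOSHIFT(A,1,0,1)),1,1), one loop nest per operation."""
    t1 = [[0] * n for _ in range(n)]   # temporary produced by EOSHIFT
    t2 = [[0] * n for _ in range(n)]   # temporary produced by TRANSPOSE
    b = [[0] * n for _ in range(n)]

    # EOSHIFT: T1(I,J) = A(I+1,J) for I <= N-1, else the boundary value 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            t1[i - 1][j - 1] = a[i][j - 1] if i <= n - 1 else 0

    # TRANSPOSE: T2(I,J) = T1(J,I)
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            t2[i - 1][j - 1] = t1[j - 1][i - 1]

    # CSHIFT: B(I,J) = T2(I+1,J) for I <= N-1, else T2(I-N+1,J)
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            b[i - 1][j - 1] = t2[i][j - 1] if i <= n - 1 else t2[i - n][j - 1]

    return t1, t2, b


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
T1, T2, B = straightforward(A, 4)
for row in B:
    print(row)
```

Two full temporaries and three sweeps over the index space are needed here, although B depends only on A.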

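Array Operation Synthesis

• Construct the Parse Tree of Array Expression

B=CSHIFT(TRANSPOSE(EOSHIFT(A,1,0,1)),1,1)

Parse tree, from root to leaf: CSHIFT, TRANSPOSE, EOSHIFT

• Represent Array Operations by Mathematical Functions (DAF)

CSHIFT:

$$
B(I,J) =
\begin{cases}
T2(I+1,\,J) & \mid\ \phi(/I,J/,\ /1{:}N{-}1{:}1,\ 1{:}N{:}1/) \\
T2(I-N+1,\,J) & \mid\ \phi(/I,J/,\ /N{:}N{:}1,\ 1{:}N{:}1/)
\end{cases}
$$

TRANSPOSE:

$$
T2(I,J) = T1(J,I)
$$

EOSHIFT:

$$
T1(I,J) =
\begin{cases}
A(I+1,\,J) & \mid\ \phi(/I,J/,\ /1{:}N{-}1{:}1,\ 1{:}N{:}1/) \\
0 & \mid\ \phi(/I,J/,\ /N{:}N{:}1,\ 1{:}N{:}1/)
\end{cases}
$$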

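Array Operation Synthesis (Cont’d)

CSHIFT:

$$
B(I,J) =
\begin{cases}
T2(I+1,\,J) & \mid\ \phi(/I,J/,\ /1{:}N{-}1{:}1,\ 1{:}N{:}1/) \\
T2(I-N+1,\,J) & \mid\ \phi(/I,J/,\ /N{:}N{:}1,\ 1{:}N{:}1/)
\end{cases}
$$

TRANSPOSE:

$$
T2(I,J) = T1(J,I)
$$

Synthesis of two functions (CSHIFT + TRANSPOSE):

$$
B(I,J) =
\begin{cases}
T1(J,\,I+1) & \mid\ \phi(/I,J/,\ /1{:}N{-}1{:}1,\ 1{:}N{:}1/) \\
T1(J,\,I-N+1) & \mid\ \phi(/I,J/,\ /N{:}N{:}1,\ 1{:}N{:}1/)
\end{cases}
$$

EOSHIFT:

$$
T1(I,J) =
\begin{cases}
A(I+1,\,J) & \mid\ \phi(/I,J/,\ /1{:}N{-}1{:}1,\ 1{:}N{:}1/) \\
0 & \mid\ \phi(/I,J/,\ /N{:}N{:}1,\ 1{:}N{:}1/)
\end{cases}
$$

Synthesis with EOSHIFT (CSHIFT + TRANSPOSE + EOSHIFT):

$$
B(I,J) =
\begin{cases}
A(J+1,\,I+1) & \mid\ \phi(/I,J/,\ /1{:}N{-}1,\ 1{:}N/) \wedge \phi(/J,\,I+1/,\ /1{:}N{-}1,\ 1{:}N/) \\
0 & \mid\ \phi(/I,J/,\ /1{:}N{-}1,\ 1{:}N/) \wedge \phi(/J,\,I+1/,\ /N{:}N,\ 1{:}N/) \\
A(J+1,\,I-N+1) & \mid\ \phi(/I,J/,\ /N{:}N,\ 1{:}N/) \wedge \phi(/J,\,I-N+1/,\ /1{:}N{-}1,\ 1{:}N/) \\
0 & \mid\ \phi(/I,J/,\ /N{:}N,\ 1{:}N/) \wedge \phi(/J,\,I-N+1/,\ /N{:}N,\ 1{:}N/)
\end{cases}
$$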

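Synthesis of two Data Access Functions

• Substitution (term-rewriting-like method)

Having two Data Access Patterns:

$$
T(i_1,\dots,i_n) = S\big(f_1(i_1,\dots,i_n),\ f_2(i_1,\dots,i_n),\ \dots,\ f_m(i_1,\dots,i_n)\big)
$$

$$
S(i_1,\dots,i_m) = Q\big(h_1(i_1,\dots,i_m),\ h_2(i_1,\dots,i_m),\ \dots,\ h_p(i_1,\dots,i_m)\big) \mid \phi
$$

where

$$
\phi = \phi(/i_1,\dots,i_m/,\ /l_1{:}u_1{:}s_1,\ \dots,\ l_m{:}u_m{:}s_m/)
$$

The Synthesized Data Access Pattern is:

$$
T(i_1,\dots,i_n) = Q\big(g_1(i_1,\dots,i_n),\ g_2(i_1,\dots,i_n),\ \dots,\ g_p(i_1,\dots,i_n)\big) \mid \phi'
$$

where

$$
g_i(i_1,\dots,i_n) = h_i\big(f_1(i_1,\dots,i_n),\ f_2(i_1,\dots,i_n),\ \dots,\ f_m(i_1,\dots,i_n)\big), \quad 1 \le i \le p
$$

$$
\phi' = \phi(/f_1(i_1,\dots,i_n),\ \dots,\ f_m(i_1,\dots,i_n)/,\ /l_1{:}u_1{:}s_1,\ \dots,\ l_m{:}u_m{:}s_m/)
$$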

• For example,

[Equation: data access function of T1 in terms of A, with its segmentation descriptor]

[Equation: data access function of B in terms of T1, with its segmentation descriptor]

• By the substitution rule

[Equation: data access function of B in terms of A, carrying both segmentation descriptors]

Synthesis of two DAFs (Cont’d)

[Equations: synthesis of two multiple-clause data access functions T and S; the synthesized function has one clause for each pair of clauses taken from the two functions]
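The substitution rule can be sketched in Python by treating each clause as a guard plus an index mapping and composing them pairwise. This is only an illustration under stated assumptions: the `Clause` tuple and the helpers `segment`, `compose`, and `evaluate` are hypothetical names, guards cover unit-stride two-dimensional segments only, and a source of `None` stands for the constant 0. The data access functions are the CSHIFT, TRANSPOSE, and EOSHIFT functions of B=CSHIFT(TRANSPOSE(EOSHIFT(A,1,0,1)),1,1) with N=4.

```python
from collections import namedtuple

# One clause: where it applies (guard), which array it reads (source, or None
# for the constant 0), and how the index is mapped (index).
Clause = namedtuple("Clause", ["guard", "source", "index"])


def segment(i_lo, i_hi, j_lo, j_hi):
    """Predicate for phi(/i,j/, /i_lo:i_hi, j_lo:j_hi/) with unit strides."""
    return lambda i, j: i_lo <= i <= i_hi and j_lo <= j <= j_hi


def compose(outer, inner, inner_name):
    """Substitute the clauses of `inner` for each read of `inner_name` in `outer`."""
    result = []
    for oc in outer:
        if oc.source != inner_name:
            result.append(oc)          # constants and other sources pass through
            continue
        for ic in inner:
            result.append(Clause(
                # new guard: outer guard AND inner guard at the substituted index
                guard=lambda i, j, oc=oc, ic=ic:
                    oc.guard(i, j) and ic.guard(*oc.index(i, j)),
                source=ic.source,
                # new index: g = h(f(i, j))
                index=lambda i, j, oc=oc, ic=ic: ic.index(*oc.index(i, j)),
            ))
    return result


def evaluate(daf, arrays, n):
    out = [[None] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            (clause,) = [c for c in daf if c.guard(i, j)]  # exactly one applies
            if clause.source is None:
                out[i - 1][j - 1] = 0
            else:
                si, sj = clause.index(i, j)
                out[i - 1][j - 1] = arrays[clause.source][si - 1][sj - 1]
    return out


N = 4
identity = lambda i, j: (i, j)
EOSHIFT = [Clause(segment(1, N - 1, 1, N), "A", lambda i, j: (i + 1, j)),
           Clause(segment(N, N, 1, N), None, identity)]
TRANSPOSE = [Clause(segment(1, N, 1, N), "T1", lambda i, j: (j, i))]
CSHIFT = [Clause(segment(1, N - 1, 1, N), "T2", lambda i, j: (i + 1, j)),
          Clause(segment(N, N, 1, N), "T2", lambda i, j: (i - N + 1, j))]

step1 = compose(CSHIFT, TRANSPOSE, "T2")       # B in terms of T1: 2 clauses
synthesized = compose(step1, EOSHIFT, "T1")    # B in terms of A: 4 clauses

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = evaluate(synthesized, {"A": A}, N)
for row in B:
    print(row)
```

Composing a two-clause function with another two-clause function gives four clauses, one per pair, and B is then computed directly from A without the temporaries T1 and T2.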

Code Generation for Synthesized Data Access Function

$$
B(I,J) =
\begin{cases}
A(J+1,\,I+1) & \mid\ \phi(/I,J/,\ /1{:}N{-}1,\ 1{:}N/) \wedge \phi(/J,\,I+1/,\ /1{:}N{-}1,\ 1{:}N/) \\
0 & \mid\ \phi(/I,J/,\ /1{:}N{-}1,\ 1{:}N/) \wedge \phi(/J,\,I+1/,\ /N{:}N,\ 1{:}N/) \\
A(J+1,\,I-N+1) & \mid\ \phi(/I,J/,\ /N{:}N,\ 1{:}N/) \wedge \phi(/J,\,I-N+1/,\ /1{:}N{-}1,\ 1{:}N/) \\
0 & \mid\ \phi(/I,J/,\ /N{:}N,\ 1{:}N/) \wedge \phi(/J,\,I-N+1/,\ /N{:}N,\ 1{:}N/)
\end{cases}
$$

Code Generation

FORALL (I=1:N:1; J=1:N:1)
  IF φ(/I,J/,/1:N-1,1:N/) ∧ φ(/J,I+1/,/1:N-1,1:N/) THEN B(I,J)=A(J+1,I+1)
  IF φ(/I,J/,/1:N-1,1:N/) ∧ φ(/J,I+1/,/N:N,1:N/) THEN B(I,J)=0
  IF φ(/I,J/,/N:N,1:N/) ∧ φ(/J,I-N+1/,/1:N-1,1:N/) THEN B(I,J)=A(J+1,I-N+1)
  IF φ(/I,J/,/N:N,1:N/) ∧ φ(/J,I-N+1/,/N:N,1:N/) THEN B(I,J)=0
ENDFORALL

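After Optimization

[Figure: iteration space of B, both axes marked 1, N-1, N]

$$
B(I,J) =
\begin{cases}
A(J+1,\,I+1) & \mid\ \phi(/I,J/,\ /1{:}N{-}1,\ 1{:}N{-}1/) \\
0 & \mid\ \phi(/I,J/,\ /1{:}N{-}1,\ N{:}N/) \\
A(J+1,\,I-N+1) & \mid\ \phi(/I,J/,\ /N{:}N,\ 1{:}N{-}1/) \\
0 & \mid\ \phi(/I,J/,\ /N{:}N,\ N{:}N/)
\end{cases}
$$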
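A short Python sketch can check that the simplified segments describe the same computation as the guarded clauses. It is an illustration only: `phi`, `guarded`, and `partitioned` are hypothetical names, guards use unit-stride ranges, and the loops run sequentially. `guarded` tests two descriptors per clause at every point, while `partitioned` iterates each simplified segment directly with no per-element test.

```python
def phi(index, ranges):
    """Segmentation descriptor with unit strides: each index lies in its lo:hi range."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(index, ranges))


def guarded(a, n):
    """Four clauses, each guarded by the conjunction of two descriptors."""
    b = [[None] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if phi((i, j), ((1, n - 1), (1, n))) and phi((j, i + 1), ((1, n - 1), (1, n))):
                b[i - 1][j - 1] = a[j][i]          # A(J+1, I+1)
            if phi((i, j), ((1, n - 1), (1, n))) and phi((j, i + 1), ((n, n), (1, n))):
                b[i - 1][j - 1] = 0
            if phi((i, j), ((n, n), (1, n))) and phi((j, i - n + 1), ((1, n - 1), (1, n))):
                b[i - 1][j - 1] = a[j][i - n]      # A(J+1, I-N+1)
            if phi((i, j), ((n, n), (1, n))) and phi((j, i - n + 1), ((n, n), (1, n))):
                b[i - 1][j - 1] = 0
    return b


def partitioned(a, n):
    """Same result, one loop nest per simplified segment."""
    b = [[None] * n for _ in range(n)]
    for i in range(1, n):                  # /1:N-1, 1:N-1/
        for j in range(1, n):
            b[i - 1][j - 1] = a[j][i]
    for i in range(1, n):                  # /1:N-1, N:N/
        b[i - 1][n - 1] = 0
    for j in range(1, n):                  # /N:N, 1:N-1/
        b[n - 1][j - 1] = a[j][0]
    b[n - 1][n - 1] = 0                    # /N:N, N:N/
    return b


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = partitioned(A, 4)
for row in B:
    print(row)
```

Because the segments are resolved before the loops run, the partitioned version performs no guard evaluation inside the loop bodies.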

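Optimization

• Simplifying the ranges at compilation time instead of runtime

• Optimization process:

Normalize:

$$
\phi(/\dots,\ 3I+5,\ \dots/,\ /\dots,\ 4{:}100{:}5,\ \dots/)
\ \Rightarrow\
\phi(/\dots,\ I,\ \dots/,\ /\dots,\ 3{:}28{:}5,\ \dots/)
$$

Intersection for each dimension:

$$
\phi(/\dots,\ I,\ \dots/,\ /\dots,\ 5{:}100{:}6,\ \dots/) \wedge \phi(/\dots,\ I,\ \dots/,\ /\dots,\ 7{:}200{:}5,\ \dots/)
\ \Rightarrow\
\phi(/\dots,\ I,\ \dots/,\ /\dots,\ 17{:}77{:}30,\ \dots/)
$$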
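Both steps reduce to integer arithmetic on `lower:upper:stride` triplets, which a compiler can do once instead of testing guards at run time. The Python sketch below is one possible way to compute them, not necessarily the procedure used in the original work; the function names `normalize` and `intersect` are hypothetical, and positive coefficients and strides are assumed.

```python
from math import gcd


def normalize(a, b, triplet):
    """Rewrite 'a*I + b lies in l:u:s' as 'I lies in l2:u2:s2' (a > 0, s > 0)."""
    l, u, s = triplet
    period = s // gcd(a, s)                 # spacing between valid values of I
    residue = next((i for i in range(period)
                    if (a * i + b - l) % s == 0), None)
    if residue is None:
        return None                         # no I ever hits the range
    lo = -((b - l) // a)                    # ceil((l - b) / a)
    hi = (u - b) // a                       # floor((u - b) / a)
    first = lo + (residue - lo) % period
    if first > hi:
        return None
    last = hi - (hi - residue) % period
    return (first, last, period)


def intersect(t1, t2):
    """Intersection of two l:u:s triplets, again as a triplet (or None)."""
    (l1, u1, s1), (l2, u2, s2) = t1, t2
    stride = s1 * s2 // gcd(s1, s2)         # least common multiple of the strides
    lo, hi = max(l1, l2), min(u1, u2)
    first = next((x for x in range(lo, lo + stride)
                  if (x - l1) % s1 == 0 and (x - l2) % s2 == 0), None)
    if first is None or first > hi:
        return None
    last = first + (hi - first) // stride * stride
    return (first, last, stride)


print(normalize(3, 5, (4, 100, 5)))          # 3I+5 in 4:100:5
print(intersect((5, 100, 6), (7, 200, 5)))   # 5:100:6 with 7:200:5
```

The two calls reproduce the triplets 3:28:5 and 17:77:30 from the slide.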
