![Page 1: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/1.jpg)
Array Operation Synthesis to Optimize Data Parallel Programs
Department of Computer Science,
National Tsing-Hua University
Student: Gwan-Hwan Hwang
Advisor: Dr. Jenq Kuen Lee
![Page 2: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/2.jpg)
Array Operation Synthesis to Optimize Data Parallel Programs
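Department of Computer Science, National Tsing-Hua University (國立清華大學資訊工程系)
Student: Gwan-Hwan Hwang (黃冠寰)
Advisor: Dr. Jenq Kuen Lee (李政崑博士)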
![Page 3: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/3.jpg)
Array Operation Synthesis on Distributed-memory Machines
Department of Computer Science, National Tsing-Hua University (國立清華大學資訊工程學系)
Gwan-Hwan Hwang (黃冠寰), Ph.D.
![Page 4: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/4.jpg)
Research Interests
Key Issues
Compiler Optimization for Parallel Computations on Distributed & Shared Memory Machines
• Communication Code for Block-Cyclic Distribution of HPF (IPPS’98)
• Array Operation Synthesis for Intrinsic Array Functions (JPDC, ACM PPoPP’95, ICPP’96)
• Automatic Alignment for Data Parallel Languages (LCPC’97)
Concurrent Testing
• Reachability Testing of Concurrent Programs (IJSEKE’95, APSEC’93)
Parallel Object Program Model & Heterogeneous Computing
• Java-Based Network Computing Environment
• Transparent Parallel Computing Environment (Ongoing)
![Page 5: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/5.jpg)
Outline of Presentation
• Fortran 90 Intrinsic Array Operations
• Array Operation Synthesis (AOS)
• SYNTOOL
• Apply AOS to Shared-Memory Machines
• Apply AOS to Distributed-Memory Machines
• Conclusion and Future Work
![Page 6: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/6.jpg)
Outline of Presentation
• Fortran 90 Intrinsic Array Operations
• Array Operation Synthesis (AOS)
• SYNTOOL
• Apply AOS to Shared-Memory Machines
• Apply AOS to Distributed-Memory Machines
• Integrate AOS with Automatic Data Alignment
• Conclusion and Future Work
![Page 7: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/7.jpg)
Intrinsic Array Operations
• Provided by Modern Programming Languages, e.g., Fortran 90, High Performance Fortran (HPF), HPF2, Fortran 97, APL, MATLAB, Mathematica, NESL, C*
• Used in Engineering and Scientific Applications
• Facilitate Compile-Time Analysis for Optimization
• Support Parallel Execution and Portability
![Page 8: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/8.jpg)
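Intrinsic Array Operations (Cont’d)
• Array Operations Provided by Fortran 90 and HPF.
• Examples: CSHIFT, TRANSPOSE, MERGE, EOSHIFT, RESHAPE, SPREAD, Section Move, WHERE Constructs, Reductions.

$$A = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{pmatrix}$$

B=CSHIFT(A,1,1)

$$B = \begin{pmatrix} 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \\ 1 & 2 & 3 & 4 \end{pmatrix}$$

C=TRANSPOSE(B)

$$C = \begin{pmatrix} 5 & 9 & 13 & 1 \\ 6 & 10 & 14 & 2 \\ 7 & 11 & 15 & 3 \\ 8 & 12 & 16 & 4 \end{pmatrix}$$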
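To make the two operations on this slide concrete, here is a small Python sketch that recomputes B=CSHIFT(A,1,1) and C=TRANSPOSE(B) on nested lists. The helper names `cshift_dim1` and `transpose` are invented for this illustration, and the sketch assumes the CSHIFT arguments are read as (array, shift, dimension) and that A holds the values 1 to 16 in row order.

```python
def cshift_dim1(a, shift):
    """Circular shift along the first dimension: b[i][j] = a[(i + shift) mod n][j]."""
    n = len(a)
    return [list(a[(i + shift) % n]) for i in range(n)]


def transpose(a):
    """Swap the two dimensions: b[i][j] = a[j][i]."""
    return [list(column) for column in zip(*a)]


# Assumed input: a 4x4 array holding 1..16 in row order.
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = cshift_dim1(A, 1)   # rows move up by one, the first row wraps to the bottom
C = transpose(B)

for row in C:
    print(row)
```

Each call allocates a full result array, which is the cost that array operation synthesis tries to avoid when operations are chained.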
![Page 9: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/9.jpg)
Consecutive Array Expressions
• Array Expression
    C=EOSHIFT(MERGE(RESHAPE(S,/N,N/),A+B,T),1,0,1)
• Consecutive Array Operations
    FXP=CSHIFT(F1,1,+1)
    FXM=CSHIFT(F1,1,-1)
    FYP=CSHIFT(F1,2,+1)
    FYM=CSHIFT(F1,2,-1)
    FDERIV=ZXP*(FXP-F1)+ZXM*(FXM-F1)+ZYP*(FYP-F1)+ZYM*(FYM-F1)
![Page 10: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/10.jpg)
Classification of Array Operations
• Model Array Operations by Data Access Functions (DAF)

| | Single-Clause | Multiple-Clause |
|---|---|---|
| Single-Source | TYPE 1 | TYPE 3 |
| Multiple-Source | TYPE 2 | TYPE 4 |
![Page 11: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/11.jpg)
Data Access Functions
• Represent Array Operations by Mathematical Functions
• Model Array Operations by Data Access Functions (DAF)
    - Single-Source, Multiple-Source
    - Single-Clause, Multiple-Clause
![Page 12: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/12.jpg)
Type 1: Single-source Single-clause Data Access Function
• One Source Array
• One Data Access Pattern
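B=TRANSPOSE(A)

$$A = \begin{pmatrix} 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \\ 1 & 2 & 3 & 4 \end{pmatrix}, \quad B = \begin{pmatrix} 5 & 9 & 13 & 1 \\ 6 & 10 & 14 & 2 \\ 7 & 11 & 15 & 3 \\ 8 & 12 & 16 & 4 \end{pmatrix}$$

Data Access Function is B(I,J)=A(J,I)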
![Page 14: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/14.jpg)
Type 2: Multiple-source Single-clause Data Access Function
• Multiple Source Arrays
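• One Data Access Pattern

R=MERGE(T,F,M)

$$T = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}, \quad F = \begin{pmatrix} 2 & 2 & 2 \\ 2 & 2 & 2 \\ 2 & 2 & 2 \end{pmatrix}, \quad M = \begin{pmatrix} \mathrm{T} & \mathrm{F} & \mathrm{T} \\ \mathrm{T} & \mathrm{T} & \mathrm{F} \\ \mathrm{F} & \mathrm{F} & \mathrm{T} \end{pmatrix}, \quad R = \begin{pmatrix} 1 & 2 & 1 \\ 1 & 1 & 2 \\ 2 & 2 & 1 \end{pmatrix}$$

Data Access Function is

$$R(I,J) = \mathcal{F}\bigl(T(I,J),\ F(I,J),\ M(I,J)\bigr), \quad \text{where } \mathcal{F}(x,y,z) = \begin{cases} x & \text{if } z = \text{True} \\ y & \text{if } z = \text{False} \end{cases}$$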
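The multiple-source, single-clause pattern can be sketched in a few lines of Python: every result element is produced by one and the same function of the corresponding elements of T, F and M. The function name `merge` and the concrete mask values below are assumptions chosen for this illustration.

```python
def merge(t, f, m):
    """Element-wise selection: r[i][j] = t[i][j] if m[i][j] else f[i][j]."""
    return [
        [tv if mv else fv for tv, fv, mv in zip(t_row, f_row, m_row)]
        for t_row, f_row, m_row in zip(t, f, m)
    ]


T = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
F = [[2, 2, 2], [2, 2, 2], [2, 2, 2]]
# Assumed mask for the example.
M = [[True, False, True], [True, True, False], [False, False, True]]

R = merge(T, F, M)
for row in R:
    print(row)
```

There is only one access pattern (the same index (I,J) in every source), which is why MERGE counts as single-clause even though it reads three arrays.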
![Page 16: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/16.jpg)
Type 3: Single-source Multiple-clause Data Access Function
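• Single Source Array
• Multiple Data Access Patterns

B=CSHIFT(A,1,1)

$$A = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{pmatrix}, \quad B = \begin{pmatrix} 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \\ 1 & 2 & 3 & 4 \end{pmatrix}$$

Data Access Function is

$$B(I,J) = \begin{cases} A(I+1,J) & \mid\ \phi(/I,J/,\ /1:3:1,\ 1:4:1/) \\ A(I-3,J) & \mid\ \phi(/I,J/,\ /4:4:1,\ 1:4:1/) \end{cases}$$

$\phi$: a segmentation descriptor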
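One way to read a segmentation descriptor is as a membership test: a clause applies when every index component falls inside its lower:upper:stride range. The Python sketch below evaluates the two CSHIFT clauses that way for a 4×4 array. The names `phi` and `cshift_daf` are hypothetical, and indices are kept 1-based to match the slide, so array accesses subtract 1.

```python
def phi(index, ranges):
    """Segmentation descriptor test: each component lies in its l:u:s range."""
    return all(l <= v <= u and (v - l) % s == 0
               for v, (l, u, s) in zip(index, ranges))


def cshift_daf(a, i, j):
    """Evaluate B(I,J) for B=CSHIFT(A,1,1) on a 4x4 array, clause by clause."""
    if phi((i, j), [(1, 3, 1), (1, 4, 1)]):
        return a[(i + 1) - 1][j - 1]    # clause 1: A(I+1, J)
    if phi((i, j), [(4, 4, 1), (1, 4, 1)]):
        return a[(i - 3) - 1][j - 1]    # clause 2: A(I-3, J)
    raise IndexError((i, j))


# Assumed input: a 4x4 array holding 1..16 in row order.
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[cshift_daf(A, i, j) for j in range(1, 5)] for i in range(1, 5)]
for row in B:
    print(row)
```

The two descriptors partition the index space of B, so exactly one clause fires for every (I,J).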
![Page 18: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/18.jpg)
Type 4: Multiple-source Multiple-clause Data Access Function
• Multiple Source Arrays
• Multiple Data Access Patterns
• No array operation of Fortran 90 belongs to type 4.
• Synthesis of multiple array operations may derive a type 4 data access function.
![Page 20: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/20.jpg)
Straightforward Compilation
• Translate each operation into a parallel loop

B=CSHIFT(TRANSPOSE(EOSHIFT(A,1,0,1)),1,1)

EOSHIFT:
    FORALL (I=1:N:1; J=1:N:1)
      IF (1<=I<=N-1) and (1<=J<=N) THEN T1(I,J)=A(I+1,J)
      ELSE T1(I,J)=0
    ENDFORALL

TRANSPOSE:
    FORALL (I=1:N:1; J=1:N:1)
      T2(I,J)=T1(J,I)
    ENDFORALL

CSHIFT:
    FORALL (I=1:N:1; J=1:N:1)
      IF (1<=I<=N-1) and (1<=J<=N) THEN B(I,J)=T2(I+1,J)
      ELSE B(I,J)=T2(I-N+1,J)
    ENDFORALL
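The three loops can be mimicked in Python to see how much intermediate storage the straightforward translation needs. This is only a sketch: `straightforward` and `get` are names made up here, indices are 1-based as in the loops, and the CSHIFT wrap-around is taken to read row 1 of T2 when I equals N.

```python
def get(m, i, j):
    """1-based element access, mirroring the M(I,J) notation of the loops."""
    return m[i - 1][j - 1]


def straightforward(a):
    """One full pass and one temporary array per operation."""
    n = len(a)
    idx = range(1, n + 1)
    # EOSHIFT: T1(I,J) = A(I+1,J) for I <= N-1, boundary value 0 for I = N
    t1 = [[get(a, i + 1, j) if i <= n - 1 else 0 for j in idx] for i in idx]
    # TRANSPOSE: T2(I,J) = T1(J,I)
    t2 = [[get(t1, j, i) for j in idx] for i in idx]
    # CSHIFT: B(I,J) = T2(I+1,J) for I <= N-1, wraps to T2(1,J) for I = N
    b = [[get(t2, i + 1, j) if i <= n - 1 else get(t2, i - n + 1, j)
          for j in idx] for i in idx]
    return t1, t2, b


# Assumed input: a 4x4 array holding 1..16 in row order.
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
T1, T2, B = straightforward(A)
for row in B:
    print(row)
```

Two N×N temporaries and three sweeps over the index space are needed for a result that depends on A alone.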
![Page 21: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/21.jpg)
Array Operation Synthesis
• Construct the Parse Tree of Array Expression
• Represent Array Operations by Mathematical Functions (DAF)
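B=CSHIFT(TRANSPOSE(EOSHIFT(A,1,0,1)),1,1)

Parse tree (root to leaf): CSHIFT, TRANSPOSE, EOSHIFT, A

CSHIFT:

$$B(I,J) = \begin{cases} T2(I+1,J) & \mid\ \phi(/I,J/,\ /1:N-1:1,\ 1:N:1/) \\ T2(I-N+1,J) & \mid\ \phi(/I,J/,\ /N:N:1,\ 1:N:1/) \end{cases}$$

TRANSPOSE:

$$T2(I,J) = T1(J,I)$$

EOSHIFT:

$$T1(I,J) = \begin{cases} A(I+1,J) & \mid\ \phi(/I,J/,\ /1:N-1:1,\ 1:N:1/) \\ 0 & \mid\ \phi(/I,J/,\ /N:N:1,\ 1:N:1/) \end{cases}$$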
![Page 22: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/22.jpg)
Array Operation Synthesis (Cont’d)
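Synthesis of two functions (CSHIFT and TRANSPOSE):

$$B(I,J) = \begin{cases} T2(I+1,J) & \mid\ \phi(/I,J/,\ /1:N-1:1,\ 1:N:1/) \\ T2(I-N+1,J) & \mid\ \phi(/I,J/,\ /N:N:1,\ 1:N:1/) \end{cases} \qquad T2(I,J) = T1(J,I)$$

CSHIFT + TRANSPOSE:

$$B(I,J) = \begin{cases} T1(J,I+1) & \mid\ \phi(/I,J/,\ /1:N-1:1,\ 1:N:1/) \\ T1(J,I-N+1) & \mid\ \phi(/I,J/,\ /N:N:1,\ 1:N:1/) \end{cases}$$

EOSHIFT:

$$T1(I,J) = \begin{cases} A(I+1,J) & \mid\ \phi(/I,J/,\ /1:N-1:1,\ 1:N:1/) \\ 0 & \mid\ \phi(/I,J/,\ /N:N:1,\ 1:N:1/) \end{cases}$$

CSHIFT + TRANSPOSE + EOSHIFT:

$$B(I,J) = \begin{cases} A(J+1,I+1) & \mid\ \phi(/I,J/,\ /1:N-1,\ 1:N/) \wedge \phi(/J,I+1/,\ /1:N-1,\ 1:N/) \\ 0 & \mid\ \phi(/I,J/,\ /1:N-1,\ 1:N/) \wedge \phi(/J,I+1/,\ /N:N,\ 1:N/) \\ A(J+1,I-N+1) & \mid\ \phi(/I,J/,\ /N:N,\ 1:N/) \wedge \phi(/J,I-N+1/,\ /1:N-1,\ 1:N/) \\ 0 & \mid\ \phi(/I,J/,\ /N:N,\ 1:N/) \wedge \phi(/J,I-N+1/,\ /N:N,\ 1:N/) \end{cases}$$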
![Page 23: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/23.jpg)
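Synthesis of two Data Access Functions
• Substitution (a term-rewriting-like method)

Having two Data Access Patterns:

$$T(i_1,\ldots,i_n) = S\bigl(f_1(i_1,\ldots,i_n),\ f_2(i_1,\ldots,i_n),\ \ldots,\ f_m(i_1,\ldots,i_n)\bigr)$$

$$S(i_1,\ldots,i_m) = Q\bigl(h_1(i_1,\ldots,i_m),\ h_2(i_1,\ldots,i_m),\ \ldots,\ h_p(i_1,\ldots,i_m)\bigr) \ \mid\ \phi(/i_1,\ldots,i_m/,\ /l_1:u_1:s_1,\ \ldots,\ l_m:u_m:s_m/)$$

The Synthesized Data Access Pattern is:

$$T(i_1,\ldots,i_n) = Q\bigl(g_1(i_1,\ldots,i_n),\ g_2(i_1,\ldots,i_n),\ \ldots,\ g_p(i_1,\ldots,i_n)\bigr) \ \mid\ \phi'$$

where

$$g_i(i_1,\ldots,i_n) = h_i\bigl(f_1(i_1,\ldots,i_n),\ f_2(i_1,\ldots,i_n),\ \ldots,\ f_m(i_1,\ldots,i_n)\bigr), \quad 1 \le i \le p$$

$$\phi' = \phi(/f_1(i_1,\ldots,i_n),\ \ldots,\ f_m(i_1,\ldots,i_n)/,\ /l_1:u_1:s_1,\ \ldots,\ l_m:u_m:s_m/)$$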
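The substitution rule can be sketched as function composition. In the hypothetical Python model below, a data access function is a list of clauses, each a pair of an index map and a guard; `synthesize` pairs every outer clause with every inner clause, composes the index maps, and evaluates the inner guard on the substituted index. The representation, the names, and the fixed size N = 4 are all assumptions of this illustration, applied to CSHIFT over TRANSPOSE.

```python
def phi(index, ranges):
    """Segmentation descriptor with stride 1: each component lies in l:u."""
    return all(l <= v <= u for v, (l, u) in zip(index, ranges))


def synthesize(outer, inner):
    """Combine T(i) = S(f(i)) | guard_t(i) with S(i) = Q(h(i)) | guard_s(i).

    The result has one clause per (outer, inner) pair, with index map h(f(i))
    and guard guard_t(i) and guard_s(f(i)).
    """
    clauses = []
    for f, guard_t in outer:
        for h, guard_s in inner:
            clauses.append((
                lambda idx, f=f, h=h: h(f(idx)),
                lambda idx, f=f, gt=guard_t, gs=guard_s: gt(idx) and gs(f(idx)),
            ))
    return clauses


def evaluate(clauses, source, idx):
    """Return the source element selected by the first clause whose guard holds."""
    for index_map, guard in clauses:
        if guard(idx):
            r, c = index_map(idx)
            return source[r - 1][c - 1]
    raise IndexError(idx)


N = 4
cshift = [  # B(I,J) = T2(I+1,J) or T2(I-N+1,J)
    (lambda ij: (ij[0] + 1, ij[1]), lambda ij: phi(ij, [(1, N - 1), (1, N)])),
    (lambda ij: (ij[0] - N + 1, ij[1]), lambda ij: phi(ij, [(N, N), (1, N)])),
]
transpose = [  # T2(I,J) = T1(J,I)
    (lambda ij: (ij[1], ij[0]), lambda ij: phi(ij, [(1, N), (1, N)])),
]

both = synthesize(cshift, transpose)   # B(I,J) = T1(J,I+1) or T1(J,I-N+1)
T1 = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[evaluate(both, T1, (i, j)) for j in range(1, N + 1)] for i in range(1, N + 1)]
for row in B:
    print(row)
```

The synthesized clauses read T1 directly, so the intermediate array T2 never has to be materialized.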
![Page 24: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/24.jpg)
• For example,
• By the substitution rule
3:3,1:/1j/,/i, ji,Aji,1T
Synthesis of two DAFs (Cont’d)
4:3,1:/1j/,/i, j1,iT3,1ij,T1ji,B
3:3,1:/11/,i/j,4:3,1:/1j/,/i, j1,iT3,1ij,Aji,B
![Page 25: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/25.jpg)
Synthesis of two DAFs (Cont'd)
• For example, [Equation: general form of the synthesis of two multiple-clause data access functions T and S]
![Page 26: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/26.jpg)
Code Generation for Synthesized Data Access Function

$$B(I,J) = \begin{cases} A(J+1,I+1) & \mid\ \phi(/I,J/,\ /1:N-1,\ 1:N/) \wedge \phi(/J,I+1/,\ /1:N-1,\ 1:N/) \\ 0 & \mid\ \phi(/I,J/,\ /1:N-1,\ 1:N/) \wedge \phi(/J,I+1/,\ /N:N,\ 1:N/) \\ A(J+1,I-N+1) & \mid\ \phi(/I,J/,\ /N:N,\ 1:N/) \wedge \phi(/J,I-N+1/,\ /1:N-1,\ 1:N/) \\ 0 & \mid\ \phi(/I,J/,\ /N:N,\ 1:N/) \wedge \phi(/J,I-N+1/,\ /N:N,\ 1:N/) \end{cases}$$

Code Generation

    FORALL (I=1:N:1; J=1:N:1)
      IF φ(/I,J/,/1:N-1,1:N/) and φ(/J,I+1/,/1:N-1,1:N/) THEN B(I,J)=A(J+1,I+1)
      IF φ(/I,J/,/1:N-1,1:N/) and φ(/J,I+1/,/N:N,1:N/) THEN B(I,J)=0
      IF φ(/I,J/,/N:N,1:N/) and φ(/J,I-N+1/,/1:N-1,1:N/) THEN B(I,J)=A(J+1,I-N+1)
      IF φ(/I,J/,/N:N,1:N/) and φ(/J,I-N+1/,/N:N,1:N/) THEN B(I,J)=0
    ENDFORALL
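A Python rendering of the generated loop shows the payoff: one sweep over the index space and no temporary arrays. The sketch assumes the second descriptor of the last two clauses tests the substituted index /J,I-N+1/; the names `phi` and `synthesized` are invented for this example and indices stay 1-based.

```python
def phi(index, ranges):
    """Segmentation descriptor with stride 1: each component lies in l:u."""
    return all(l <= v <= u for v, (l, u) in zip(index, ranges))


def synthesized(a):
    """Single loop nest for B=CSHIFT(TRANSPOSE(EOSHIFT(A,1,0,1)),1,1)."""
    n = len(a)
    b = [[None] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if phi((i, j), [(1, n - 1), (1, n)]) and phi((j, i + 1), [(1, n - 1), (1, n)]):
                value = a[(j + 1) - 1][(i + 1) - 1]        # A(J+1, I+1)
            elif phi((i, j), [(1, n - 1), (1, n)]) and phi((j, i + 1), [(n, n), (1, n)]):
                value = 0
            elif phi((i, j), [(n, n), (1, n)]) and phi((j, i - n + 1), [(1, n - 1), (1, n)]):
                value = a[(j + 1) - 1][(i - n + 1) - 1]    # A(J+1, I-N+1)
            else:
                value = 0
            b[i - 1][j - 1] = value
    return b


# Assumed input: a 4x4 array holding 1..16 in row order.
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = synthesized(A)
for row in B:
    print(row)
```

The guards are still evaluated per element here; simplifying them at compile time is a separate step.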
![Page 27: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/27.jpg)
Code Generation for Synthesized Data Access Function
After Optimization

[Figure: the N×N index space of B, split at N-1 along each dimension into four regions]

$$B(I,J) = \begin{cases} A(J+1,I+1) & \mid\ \phi(/I,J/,\ /1:N-1,\ 1:N-1/) \\ 0 & \mid\ \phi(/I,J/,\ /1:N-1,\ N:N/) \\ A(J+1,I-N+1) & \mid\ \phi(/I,J/,\ /N:N,\ 1:N-1/) \\ 0 & \mid\ \phi(/I,J/,\ /N:N,\ N:N/) \end{cases}$$
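A brute-force check, written as a Python sketch, can confirm that the simplified descriptors select the same clause as the original conjunctions for every index. The helper names are hypothetical, and the unoptimized guards are assumed to use /J,I-N+1/ in the last two clauses.

```python
def phi(index, ranges):
    """Segmentation descriptor with stride 1: each component lies in l:u."""
    return all(l <= v <= u for v, (l, u) in zip(index, ranges))


def clauses_before(i, j, n):
    """Indices of the unoptimized clauses whose two-descriptor guard holds."""
    guards = [
        phi((i, j), [(1, n - 1), (1, n)]) and phi((j, i + 1), [(1, n - 1), (1, n)]),
        phi((i, j), [(1, n - 1), (1, n)]) and phi((j, i + 1), [(n, n), (1, n)]),
        phi((i, j), [(n, n), (1, n)]) and phi((j, i - n + 1), [(1, n - 1), (1, n)]),
        phi((i, j), [(n, n), (1, n)]) and phi((j, i - n + 1), [(n, n), (1, n)]),
    ]
    return [k for k, holds in enumerate(guards) if holds]


def clauses_after(i, j, n):
    """Indices of the optimized clauses, each guarded by a single descriptor."""
    guards = [
        phi((i, j), [(1, n - 1), (1, n - 1)]),
        phi((i, j), [(1, n - 1), (n, n)]),
        phi((i, j), [(n, n), (1, n - 1)]),
        phi((i, j), [(n, n), (n, n)]),
    ]
    return [k for k, holds in enumerate(guards) if holds]


def equivalent(n):
    """True if both guard sets pick the same single clause for every (I,J)."""
    return all(
        clauses_before(i, j, n) == clauses_after(i, j, n)
        and len(clauses_after(i, j, n)) == 1
        for i in range(1, n + 1)
        for j in range(1, n + 1)
    )


print(all(equivalent(n) for n in range(1, 9)))
```

Because the optimized guards are plain rectangular regions, each clause can become its own loop nest with fixed bounds and no per-element test.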
![Page 28: Array Operation Synthesis to Optimize Data Parallel Programs Department of Computer Science, National Tsing-Hua University Student:Gwan-Hwan Hwang Advisor:](https://reader035.vdocuments.net/reader035/viewer/2022062308/56649c7d5503460f949324a4/html5/thumbnails/28.jpg)
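Optimization
• Simplifying the ranges at compilation time instead of runtime
• Optimization process:
    - Normalize:

$$\phi(/\ldots,\ 3I+5,\ \ldots/,\ /\ldots,\ 4:100:5,\ \ldots/) \ \Rightarrow\ \phi(/\ldots,\ I,\ \ldots/,\ /\ldots,\ 3:28:5,\ \ldots/)$$

    - Intersection for each dimension:

$$\phi(/\ldots,\ I,\ \ldots/,\ /\ldots,\ 5:100:6,\ \ldots/) \wedge \phi(/\ldots,\ I,\ \ldots/,\ /\ldots,\ 7:200:5,\ \ldots/) \ \Rightarrow\ \phi(/\ldots,\ I,\ \ldots/,\ /\ldots,\ 17:77:30,\ \ldots/)$$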
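Both steps reduce to small integer computations on lower:upper:stride triples. The Python sketch below is one possible way to carry them out, not necessarily the method used in the presented work: `normalize` solves for the values of I such that a*I+b falls in a range (assuming a > 0), and `intersect` finds the common elements of two ranges by scanning one period of the combined stride.

```python
from math import gcd


def normalize(a, b, l, u, s):
    """Range (l2, u2, s2) of all I with a*I + b in l:u:s, or None if empty (a > 0)."""
    g = gcd(a, s)
    if (l - b) % g != 0:
        return None
    step = s // g
    start = -((b - l) // a)                      # ceil((l - b) / a)
    first = next(i for i in range(start, start + step)
                 if (a * i + b - l) % s == 0)
    bound = (u - b) // a                         # largest I with a*I + b <= u
    if first > bound:
        return None
    return (first, first + (bound - first) // step * step, step)


def intersect(r1, r2):
    """Common elements of two l:u:s ranges as one range, or None if empty."""
    (l1, u1, s1), (l2, u2, s2) = r1, r2
    step = s1 * s2 // gcd(s1, s2)                # least common multiple of the strides
    lo, hi = max(l1, l2), min(u1, u2)
    for x in range(lo, min(hi, lo + step - 1) + 1):
        if (x - l1) % s1 == 0 and (x - l2) % s2 == 0:
            return (x, x + (hi - x) // step * step, step)
    return None


print(normalize(3, 5, 4, 100, 5))
print(intersect((5, 100, 6), (7, 200, 5)))
```

Doing this once at compile time replaces a per-element membership test with fixed loop bounds and strides.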