Interprocedural Symbolic Range Propagation for Optimizing Compilers
Hansang Bae and Rudolf Eigenmann
Purdue University
2005. 10. 22
04/19/23 Interprocedural Symbolic Range Propagation
Outline
Motivation
Symbolic Range Propagation
Interprocedural Symbolic Range Propagation
Experiments
Conclusions
Motivation
Symbolic analysis is key to static analysis
    X=Y+1 better than X=?
Range analysis has been successful
    X=[0,10] better than X=?
Relevant questions
    How much can we achieve interprocedurally?
    Can interprocedural range propagation outperform other alternatives?
Symbolic Range Propagation (Background)
Has been effective for compiler analyses
Abstract interpretation
Provides lower/upper bounds for variables
Sources of information
    Variable definition
    IF conditionals
    Loop variables
Intersect with new source, union at merge
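The "intersect with new source, union at merge" rule can be sketched in a few lines. This is only an illustration under a simplifying assumption: bounds are plain numbers, whereas the slides use symbolic expressions. The `Range` class and its method names are hypothetical, not taken from Polaris.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval [lb, ub]; numeric bounds only (a simplifying assumption)."""
    lb: float = -math.inf
    ub: float = math.inf

    def intersect(self, other):
        # A new source of information (definition, IF test, loop bound) narrows the range.
        return Range(max(self.lb, other.lb), min(self.ub, other.ub))

    def union(self, other):
        # At a control-flow merge the result must cover both incoming ranges.
        return Range(min(self.lb, other.lb), max(self.ub, other.ub))


unknown = Range()                            # X = [-INF, INF]
narrowed = unknown.intersect(Range(0, 10))   # a new source says 0 <= X <= 10
merged = Range(2, 2).union(Range(3, 3))      # two branches meet
print(narrowed, merged)
```

The sketch does not detect empty intersections; a real analysis would have to treat those as unreachable code.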
SRP – Example

Example Code              Ranges
                          X=[-INF,INF]
X = 1
                          X=[1,1]
IF (X.LE.N) THEN
                          X=[1,1]
    X = 2*X
                          X=[2,2]
ELSE
                          X=[1,1]
    X = X+2
                          X=[3,3]
ENDIF
                          X=[2,3]
…
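The ranges in this example can be reproduced by hand with a small sketch. It assumes numeric bounds stored as `(lb, ub)` tuples and treats the test `X.LE.N` as adding no information because `N` is unknown; the helper names are hypothetical.

```python
import math


def scale(r, k):
    """Range after X = k*X, assuming k > 0."""
    return (r[0] * k, r[1] * k)


def shift(r, d):
    """Range after X = X + d."""
    return (r[0] + d, r[1] + d)


def union(a, b):
    """Range at a merge point: the smallest interval covering both inputs."""
    return (min(a[0], b[0]), max(a[1], b[1]))


before = (-math.inf, math.inf)   # nothing known before the first definition
x = (1, 1)                       # X = 1
then_x = scale(x, 2)             # X = 2*X in the THEN branch
else_x = shift(x, 2)             # X = X+2 in the ELSE branch
after_if = union(then_x, else_x) # merge at ENDIF
print(then_x, else_x, after_if)
```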
Interprocedural Symbolic Range Propagation (ISRP)
Propagates ranges across procedure calls
Collects ISR at important program points
    Entry to a subroutine
    After a call site
SRP as the source of information
Iterative algorithm
Context sensitivity by procedure-cloning
ISRP – Terminology
Symbolic Range
    Mapping from a variable to its value range, V = [LB, UB], where LB and UB are expressions
Interprocedural Symbolic Range
    Symbolic range valid at relevant program points: subroutine entries and call sites (forward/backward)
Jump Function
    Set of symbolic ranges expressed in terms of input variables to a called subroutine (actual parameters, global variables)
Return Jump Function
    Set of symbolic ranges expressed in terms of return variables to a calling subroutine (formal parameters, global variables)
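[Figure: caller and callee. The jump function flows from the caller to the callee and yields the forward ISR; the return jump function flows from the callee back to the caller and yields the backward ISR.]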
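One way to picture how a jump function becomes a forward ISR, and a return jump function a backward ISR, is as a renaming between actual and formal parameters. The sketch below assumes bounds are stored as strings and ignores global variables; `rename` and `translate` are hypothetical helpers, and the sample values are taken from the MAIN/A/B example used later in the slides.

```python
import re


def rename(expr, mapping):
    """Rename every identifier in a bound expression in one simultaneous pass."""
    return re.sub(r"[A-Za-z_]\w*", lambda m: mapping.get(m.group(0), m.group(0)), expr)


def translate(ranges, mapping):
    """Re-express {var: (lb, ub)} in another procedure's names; drop unmapped variables."""
    return {
        mapping[var]: (rename(lb, mapping), rename(ub, mapping))
        for var, (lb, ub) in ranges.items()
        if var in mapping
    }


# CALL A(X, Y) meets SUBROUTINE A(T, U): map actual parameters to formals.
jump_function = {"X": ("1", "1"), "Y": ("2", "2")}
forward_isr = translate(jump_function, {"X": "T", "Y": "U"})

# CALL B(T, U, N) meets SUBROUTINE B(V, W, M): map formals back to actuals.
return_jump_function = {"V": ("W+M", "W+M")}
backward_isr = translate(return_jump_function, {"V": "T", "W": "U", "M": "N"})

print(forward_isr)
print(backward_isr)
```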
ISRP – Algorithm

Propagate_Interprocedural_Ranges() {
    Initialize_Call_Graph()
    while (there is any change in ISR) {
        foreach Subroutine (bottom-up) {
            Get_Backward_Interprocedural_Ranges()
            Compute_Jump_Functions()
            Compute_Return_Jump_Functions()
        }
        Get_Forward_Interprocedural_Ranges()
    }
}
ISRP – Algorithm
Get_Backward_Interprocedural_Ranges()
    Transforms return jump functions to ISRs
    Does nothing for leaf nodes of the call graph
Compute_Jump_Functions()
    Computes intraprocedural ranges
    Discards non-input-variables to the callee
Compute_Return_Jump_Functions()
    Discards non-return-variables to the caller
Get_Forward_Interprocedural_Ranges()
    Transforms jump functions to ISRs
    Clones procedures if necessary
    Keeps track of any changes
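The control structure of the algorithm is a fixed-point loop, which can be sketched as a small driver. Everything here is an assumption made for illustration: the four steps are passed in as callbacks with made-up signatures, the forward step is assumed to report whether any ISR changed, and the loop always runs at least one pass.

```python
def propagate_interprocedural_ranges(subroutines_bottom_up, backward, jump, return_jump, forward):
    """Repeat the bottom-up pass until the forward step reports no ISR change."""
    passes = 0
    changed = True
    while changed:
        for sub in subroutines_bottom_up:
            backward(sub)      # return jump functions -> backward ISRs
            jump(sub)          # intraprocedural ranges -> jump functions
            return_jump(sub)   # ranges of return variables only
        changed = forward()    # jump functions -> forward ISRs; True if anything changed
        passes += 1
    return passes


# Toy callbacks: record the call order and pretend the third pass finds nothing new.
log = []
pending = [True, True, False]
passes = propagate_interprocedural_ranges(
    ["B", "A", "MAIN"],
    backward=lambda s: log.append(("backward", s)),
    jump=lambda s: log.append(("jump", s)),
    return_jump=lambda s: log.append(("return_jump", s)),
    forward=lambda: pending.pop(0),
)
print(passes, log[:3])
```

Procedure cloning for context sensitivity is left out of this sketch entirely.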
ISRP – Example (1st iteration)

      PROGRAM MAIN
      INTEGER X, Y
      X = 1
      Y = 2
α     CALL A(X, Y)
      END

      SUBROUTINE A(T, U)
      INTEGER N, T, U
      DO N = 10, 40
β       CALL B(T, U, N)
      ENDDO
      END

      SUBROUTINE B(V, W, M)
      INTEGER M, V, W
      V = W + M
      END

Ranges in MAIN: X=[1]; X=[1],Y=[2]
Ranges in A: N=[10,40]; N=[10,40],T=[U+N]
Ranges in B: V=[W+M]

MAIN  J: X=[1],Y=[2]
A     ISR: T=[1],U=[2]   J: N=[10,40]   ISR: T=[U+N]
B     ISR: M=[10,40]   RJ: V=[W+M]

foreach subroutine (bottom-up)
    Get_Backward_ISRs
    Compute_Jump_Functions
    Compute_Return_Jump_Functions
Get_Forward_ISRs (for call graph)
ISRP – Example (2nd iteration)

      PROGRAM MAIN
      INTEGER X, Y
      X = 1
      Y = 2
α     CALL A(X, Y)
      END

      SUBROUTINE A(T, U)
      INTEGER N, T, U
      DO N = 10, 40
β       CALL B(T, U, N)
      ENDDO
      END

      SUBROUTINE B(V, W, M)
      INTEGER M, V, W
      V = W + M
      END

Ranges in MAIN: X=[1]; X=[1],Y=[2]; Y=[2]
Ranges in A: T=[1],U=[2]; T=[1],U=[2],N=[10,40]; N=[10,40],U=[2]; U=[2]
Ranges in B: M=[10,40]; M=[10,40],V=[W+M]

MAIN  J: X=[1],Y=[2]   ISR: Y=[2]
A     ISR: T=[1],U=[2]   J: U=[2],N=[10,40]   ISR: N=[10,40],T=[U+N]   RJ: U=[2]
B     ISR: M=[10,40]   RJ: M=[10,40],V=[W+M]

A     ISR: T=[1],U=[2]
B     ISR: M=[10,40],W=[2]

foreach subroutine (bottom-up)
    Get_Backward_ISRs
    Compute_Jump_Functions
    Compute_Return_Jump_Functions
Get_Forward_ISRs (for call graph)
ISRP – Example (3rd iteration)

      PROGRAM MAIN
      INTEGER X, Y
      X = 1
      Y = 2
α     CALL A(X, Y)
      END

      SUBROUTINE A(T, U)
      INTEGER N, T, U
      DO N = 10, 40
β       CALL B(T, U, N)
      ENDDO
      END

      SUBROUTINE B(V, W, M)
      INTEGER M, V, W
      V = W + M
      END

Ranges in MAIN: X=[1]; X=[1],Y=[2]; Y=[2]
Ranges in A: T=[1],U=[2]; T=[1],U=[2],N=[10,40]; N=[10,40],U=[2]; U=[2]
Ranges in B: M=[10,40],W=[2]; M=[10,40],W=[2],V=[W+M]

MAIN  J: X=[1],Y=[2]   ISR: Y=[2]
A     ISR: T=[1],U=[2]   J: U=[2],N=[10,40]   ISR: N=[10,40],U=[2],T=[U+N]   RJ: U=[2]
B     ISR: M=[10,40],W=[2]   RJ: M=[10,40],W=[2],V=[W+M]

A     ISR: T=[1],U=[2]
B     ISR: M=[10,40],W=[2]

foreach subroutine (bottom-up)
    Get_Backward_ISRs
    Compute_Jump_Functions
    Compute_Return_Jump_Functions
Get_Forward_ISRs (for call graph)
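Why the example needs three passes can be seen in a much reduced model that tracks only the forward ISRs at subroutine entries. The encoding below is hypothetical: return jump functions and backward ISRs are omitted, and `T` is marked as not trustworthy at call site β because the call to B redefines it inside the loop, which matches the slide's jump function `J: U=[2],N=[10,40]`.

```python
# caller -> (callee, actual-to-formal binding, ranges found locally at the call site,
#            variables whose entry range no longer holds at the call site)
CALLS = {
    "MAIN": ("A", {"X": "T", "Y": "U"}, {"X": (1, 1), "Y": (2, 2)}, set()),
    "A": ("B", {"T": "V", "U": "W", "N": "M"}, {"N": (10, 40)}, {"T"}),
}
BOTTOM_UP = ["B", "A", "MAIN"]


def forward_isrs(entry):
    """One pass: build jump functions from the current entry ISRs and bind them to callees."""
    new_entry = {name: {} for name in BOTTOM_UP}
    for caller in BOTTOM_UP:
        if caller not in CALLS:
            continue  # leaf of the call graph
        callee, binding, local, killed = CALLS[caller]
        known = {v: r for v, r in entry[caller].items() if v not in killed}
        known.update(local)
        new_entry[callee] = {binding[v]: r for v, r in known.items() if v in binding}
    return new_entry


def propagate():
    """Iterate until the entry ISRs stop changing; return the ISRs after each pass."""
    entry = {name: {} for name in BOTTOM_UP}
    history = []
    while True:
        new_entry = forward_isrs(entry)
        history.append(new_entry)
        if new_entry == entry:
            return history
        entry = new_entry


for number, isrs in enumerate(propagate(), start=1):
    print(number, isrs["A"], isrs["B"])
```

In this model the first pass gives B only `M=[10,40]`, the second adds `W=[2]` once A's own entry ISR is known, and the third pass changes nothing, so the loop stops.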
Experiments
Efficacy of ISRP for an optimizing compiler (Polaris)
    Test elision and dead-code elimination
    Data dependence analysis
    Other optimizations for parallelization
21 Fortran codes from SPEC CFP and Perfect
Best available optimizations in Polaris as Base
    Interprocedural expression propagation
    Forward substitution
    Intraprocedural symbolic range propagation
    Automatic partial inlining
Result – Test Elision
Codes Base ISRP Codes Base ISRP
ARC2D BDNA DYFESM FLO52Q MDG MIGRATION OCEAN QCD2 SPEC77 TRACK TRFD
41518
204
670342
51526
516
7233
108
applu apsi fpppp hydro2d mgrid su2cor swim tomcatv turb3d wupwise TOTAL
41
12907005
19176
418
790800
1527
242
ISRP found more or same number of cases for 20 codes
Base made an aggressive decision with hard-wired test elision for fpppp
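Test elision rests on deciding a comparison over a whole range rather than a single value. A minimal sketch, assuming numeric `(lb, ub)` ranges and a hypothetical helper `decide_le`; the sample range `M=[10,40]` comes from the ISRP example, while the constants compared against are made up.

```python
def decide_le(x_range, bound):
    """Evaluate 'X .LE. bound' over a whole range: True, False, or None if undecided."""
    lb, ub = x_range
    if ub <= bound:
        return True    # the test always holds, so it can be elided
    if lb > bound:
        return False   # the test never holds, so the guarded code is dead
    return None        # the range straddles the bound; the test must stay


m = (10, 40)
print(decide_le(m, 100), decide_le(m, 5), decide_le(m, 20))
```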
Result – Data Dependence Analysis
Codes Base ISRP Codes Base ISRP
ARC2D BDNA DYFESM FLO52Q MDG MIGRATION OCEAN QCD2 SPEC77 TRACK TRFD
80111241388263
11207180433
1725723922003
93
45910811108279908
6219370
357916521781
40
applu apsi fpppp hydro2d mgrid su2cor swim tomcatv turb3d wupwise TOTAL
6771139522064
11083
109150
45371
201381727
6778222
200066619
87120
45342
107156636
ISRP disproved more data dependences for 20 codes
Base benefits from forward substitution for FLO52Q
Data dependence analysis benefits from other improved optimizations
Result – Other Optimizations
[Figure: bar chart of the percent of preferred compiler decisions (Yes), Base vs. ISRP, for TRFD, QCD2, apsi, BDNA, and su2cor. Questions asked: Is non-zero-trip loop? Is closed range?]
“Yes” to the questions helps the compiler generate better codes
ISRP helped the compiler make better decisions for 5 codes
Induction Variable Substitution Reduction Translation
IF (((-num)+(-num**2))/2.LE.0.AND.(-num).LE.0) THEN
  ALLOCATE (xrsiq00(1:morb, 1:num, 1:numthreads))
!$OMP PARALLEL
!$OMP+IF(6+((-1)*num+(-1)*num**2)/2.LE.0)
!$OMP+DEFAULT(SHARED)
!$OMP+PRIVATE(MY_CPU_ID,MRS,MRSIJ1,MI0,MJ0,VAL,MQ,MP,XIJ00,MI,MJ)
  my_cpu_id = omp_get_thread_num()+1
!$OMP DO
  DO mrs = 1, (num*(1+num))/2, 1
    IF ((num*(1+num))/2.NE.mrs) THEN
      DO mq = 1, num, 1
        DO mi0 = 1, num, 1
          10 CONTINUE
          xrsiq00(mi0, mq, my_cpu_id) = zero
        ENDDO
      ENDDO
      DO mp = 1, num, 1
        DO mq = 1, mp, 1
          val = xrspq(mq+(mp**2+(-mp)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2)
          IF (zero.NE.val) THEN
            DO mi0 = 1, num, 1
              xrsiq00(mi0, mq, my_cpu_id) = xrsiq00(mi0, mq, my_cpu_id)+v(mp, mi0)*val
              xrsiq00(mi0, mp, my_cpu_id) = xrsiq00(mi0, mp, my_cpu_id)+v(mq, mi0)*val
              20 CONTINUE
            ENDDO
          ENDIF
          30 CONTINUE
        ENDDO
        40 CONTINUE
      ENDDO
      mrsij1 = ((-num)+(-num**2)+mrs*num+mrs*num**2)/2
      DO mi0 = 1, num, 1
        DO mj0 = 1, mi0, 1
          50 CONTINUE
          xij00(mj0) = zero
        ENDDO
        DO mq = 1, num, 1
          val = xrsiq00(mi0, mq, my_cpu_id)
          IF (zero.NE.val) THEN
            DO mj0 = 1, mi0, 1
              60 CONTINUE
              xij00(mj0) = xij00(mj0)+v(mq, mj0)*val
            ENDDO
          ENDIF
          70 CONTINUE
        ENDDO
        DO mj0 = 1, mi0, 1
          80 CONTINUE
          xrsij(mj0+(mi0**2+(-mi0))/2+((-num)+(-num**2)+mrs*num+mrs*num**2)/2) = xij00(mj0)
        ENDDO
        90 CONTINUE
      ENDDO
      100 CONTINUE
    ELSE
      DO mq = 1, num, 1
        DO mi = 1, num, 1
          316 CONTINUE
          xrsiq(mi, mq) = zero
        ENDDO
      ENDDO
      DO mp = 1, num, 1
        DO mq = 1, mp, 1
          val = xrspq(mq+(mp**2+(-mp)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2)
          IF (zero.NE.val) THEN
            DO mi = 1, num, 1
              xrsiq(mi, mq) = xrsiq(mi, mq)+v(mp, mi)*val
              xrsiq(mi, mp) = xrsiq(mi, mp)+v(mq, mi)*val
              317 CONTINUE
            ENDDO
          ENDIF
          318 CONTINUE
        ENDDO
        319 CONTINUE
      ENDDO
      mrsij1 = ((-num)+(-num**2)+mrs*num+mrs*num**2)/2
      DO mi = 1, num, 1
        DO mj = 1, mi, 1
          320 CONTINUE
          xij(mj) = zero
        ENDDO
        DO mq = 1, num, 1
          val = xrsiq(mi, mq)
          IF (zero.NE.val) THEN
            DO mj = 1, mi, 1
              321 CONTINUE
              xij(mj) = xij(mj)+v(mq, mj)*val
            ENDDO
          ENDIF
          322 CONTINUE
        ENDDO
        DO mj = 1, mi, 1
          323 CONTINUE
          xrsij(mj+(mi**2+(-mi))/2+((-num)+(-num**2)+mrs*num+mrs*num**2)/2) = xij(mj)
        ENDDO
        324 CONTINUE
      ENDDO
      325 CONTINUE
    ENDIF
  ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL
  DEALLOCATE (xrsiq00)
ELSE
  DO mrs = 1, (num*(1+num))/2, 1
!$OMP PARALLEL
!$OMP+IF(6+(-1)*num.LE.0)
!$OMP+DEFAULT(SHARED)
!$OMP+PRIVATE(MI)
!$OMP DO
    DO mq = 1, num, 1
      DO mi = 1, (num), 1
        306 CONTINUE
        xrsiq(mi, mq) = zero
      ENDDO
    ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL
    DO mp = 1, num, 1
      DO mq = 1, mp, 1
        mrspq = 1+mrspq
        val = xrspq(mrspq)
        IF (zero.NE.val) THEN
!$OMP PARALLEL
!$OMP+IF(6+(-1)*num.LE.0)
!$OMP+DEFAULT(SHARED)
!$OMP DO
          DO mi = 1, (num), 1
            xrsiq(mi, mq) = xrsiq(mi, mq)+v(mp, mi)*val
            xrsiq(mi, mp) = xrsiq(mi, mp)+v(mq, mi)*val
            307 CONTINUE
          ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL
        ENDIF
        308 CONTINUE
      ENDDO
      309 CONTINUE
    ENDDO
    mrsij = mrsij0
    DO mi = 1, (num), 1
!$OMP PARALLEL
!$OMP+IF(6+(-1)*mi.LE.0)
!$OMP+DEFAULT(SHARED)
!$OMP DO
      DO mj = 1, mi, 1
        310 CONTINUE
        xij(mj) = zero
      ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL
      ALLOCATE (xij1(1:mi, 1:numthreads))
!$OMP PARALLEL
!$OMP+IF(6+(-1)*num.LE.0)
!$OMP+DEFAULT(SHARED)
!$OMP+PRIVATE(MY_CPU_ID,MQ,TPINIT,VAL,MJ)
      my_cpu_id = omp_get_thread_num()+1
      DO tpinit = 1, mi, 1
        xij1(tpinit, my_cpu_id) = 0.0
      ENDDO
!$OMP DO
      DO mq = 1, num, 1
        val = xrsiq(mi, mq)
        IF (zero.NE.val) THEN
          DO mj = 1, mi, 1
            311 CONTINUE
            xij1(mj, my_cpu_id) = xij1(mj, my_cpu_id)+v(mq, mj)*val
          ENDDO
        ENDIF
        312 CONTINUE
      ENDDO
!$OMP END DO NOWAIT
!$OMP CRITICAL
      DO tpinit = 1, mi, 1
        xij(tpinit) = xij(tpinit)+xij1(tpinit, my_cpu_id)
      ENDDO
!$OMP END CRITICAL
!$OMP END PARALLEL
      DEALLOCATE (xij1)
      DO mj = 1, mi, 1
        mrsij = mrsij+1
        313 CONTINUE
        xrsij(mrsij) = xij(mj)
      ENDDO
      314 CONTINUE
    ENDDO
    mrsij0 = mrsij0+(num*(num+1))/2
    315 CONTINUE
  ENDDO
ENDIF

!$OMP PARALLEL
!$OMP+DEFAULT(SHARED)
!$OMP+PRIVATE(MRSIJ1,MI0,MJ0,VAL,MQ,MP,XIJ00,XRSIQ00)
!$OMP DO
DO mrs = 1, (num+num**2)/2, 1
  IF ((num+num**2)/2.NE.mrs) THEN
    DO mq = 1, num, 1
      DO mi0 = 1, num, 1
        10 CONTINUE
        xrsiq00(mi0, mq) = zero
      ENDDO
    ENDDO
    DO mp = 1, num, 1
      DO mq = 1, mp, 1
        val = xrspq(mq+(mp**2+(-mp)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2)
        IF (zero.NE.val) THEN
          DO mi0 = 1, num, 1
            xrsiq00(mi0, mq) = xrsiq00(mi0, mq)+v(mp, mi0)*val
            xrsiq00(mi0, mp) = xrsiq00(mi0, mp)+v(mq, mi0)*val
            20 CONTINUE
          ENDDO
        ENDIF
        30 CONTINUE
      ENDDO
      40 CONTINUE
    ENDDO
    mrsij1 = ((-num)+(-num**2)+mrs*num+mrs*num**2)/2
    DO mi0 = 1, num, 1
      DO mj0 = 1, mi0, 1
        50 CONTINUE
        xij00(mj0) = zero
      ENDDO
      DO mq = 1, num, 1
        val = xrsiq00(mi0, mq)
        IF (zero.NE.val) THEN
          DO mj0 = 1, mi0, 1
            60 CONTINUE
            xij00(mj0) = xij00(mj0)+v(mq, mj0)*val
          ENDDO
        ENDIF
        70 CONTINUE
      ENDDO
      DO mj0 = 1, mi0, 1
        80 CONTINUE
        xrsij(mj0+(mi0**2+(-mi0)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2) = xij00(mj0)
      ENDDO
      90 CONTINUE
    ENDDO
    100 CONTINUE
  ELSE
    DO mq = 1, num, 1
      DO mi = 1, num, 1
        306 CONTINUE
        xrsiq(mi, mq) = zero
      ENDDO
    ENDDO
    DO mp = 1, num, 1
      DO mq = 1, mp, 1
        val = xrspq(mq+(mp**2+(-mp)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2)
        IF (zero.NE.val) THEN
          DO mi = 1, num, 1
            xrsiq(mi, mq) = xrsiq(mi, mq)+v(mp, mi)*val
            xrsiq(mi, mp) = xrsiq(mi, mp)+v(mq, mi)*val
            307 CONTINUE
          ENDDO
        ENDIF
        308 CONTINUE
      ENDDO
      309 CONTINUE
    ENDDO
    mrsij1 = ((-num)+(-num**2)+mrs*num+mrs*num**2)/2
    DO mi = 1, num, 1
      DO mj = 1, mi, 1
        310 CONTINUE
        xij(mj) = zero
      ENDDO
      DO mq = 1, num, 1
        val = xrsiq(mi, mq)
        IF (zero.NE.val) THEN
          DO mj = 1, mi, 1
            311 CONTINUE
            xij(mj) = xij(mj)+v(mq, mj)*val
          ENDDO
        ENDIF
        312 CONTINUE
      ENDDO
      DO mj = 1, mi, 1
        313 CONTINUE
        xrsij(mj+(mi**2+(-mi)+(-num)+(-num**2)+mrs*num+mrs*num**2)/2) = xij(mj)
      ENDDO
      314 CONTINUE
    ENDDO
    315 CONTINUE
  ENDIF
ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL
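The question "Is non-zero-trip loop?" can be answered from ranges alone when the loop bounds are known tightly enough. A minimal sketch, assuming a unit-step DO loop and numeric `(lb, ub)` ranges for both bounds; `is_non_zero_trip` is a hypothetical helper, the first call uses the loop `DO N = 10, 40` from the ISRP example, and the second uses a made-up range.

```python
def is_non_zero_trip(lower, upper):
    """True when DO I = lower, upper (step 1) is proven to run at least once.

    lower and upper are (lb, ub) ranges of the loop bounds. The loop surely
    executes if the largest possible lower bound does not exceed the smallest
    possible upper bound.
    """
    return lower[1] <= upper[0]


# DO N = 10, 40: both bounds are constants, so the proof succeeds.
print(is_non_zero_trip((10, 10), (40, 40)))
# DO I = 1, K with only K = [0, 40] known: K may be 0, so nothing is proven.
print(is_non_zero_trip((1, 1), (0, 40)))
```

A `False` result here means "not proven", not "the loop never runs", so a compiler would fall back to a run-time test or a conservative choice.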
Conclusions
Interprocedural analysis of symbolic ranges
    Based on intraprocedural analysis
    Iterative algorithm
ISRP enhances other optimizations
Compilation time increases up to 150%
    Exceptions: OCEAN and TRACK
Thank you.