Optimizing Compilers for Modern Architectures
More Interprocedural Analysis
Chapter 11, Sections 11.2.5 to end
Review
• We analyzed several interprocedural problems:
—MOD: the set of variables modified in a procedure.
—ALIAS: the set of aliased variables in a procedure.
• The problem of aliasing makes interprocedural problems harder than global problems.
SUBROUTINE FOO(N)
  INTEGER N,M
  CALL INIT(M,N)
  DO I = 1,P
    B(M*I + 1) = 2*B(1)
  ENDDO
END

SUBROUTINE INIT(M,N)
  M = N
END

If N = 0 on entry to FOO, the loop is a reduction.
Otherwise, we can vectorize the loop.
Constant Propagation
• Propagating constants between procedures can cause significant improvements.
• Dependence testing can be made more precise.
• Definition: Let s = (p,q) be a call site in procedure p, and let x be a parameter of q. Then J_s^x, the jump function for x at s, gives the value of x in terms of the parameters of p.
• The support of J_s^x is the set of p-parameters that J_s^x depends on.
Constant Propagation
• Instead of a Def-Use graph, we construct an interprocedural value graph:
—Add a node to the graph for each jump function J_s^x.
—If x belongs to the support of J_t^y, where t lies in the procedure q, then add an edge between J_s^x and J_t^y for every call site s = (p,q) for some p.
• We can now apply the constant propagation algorithm to this graph.
—Might want to iterate with global propagation.
Constant Propagation
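To make the worklist scheme concrete, here is a minimal Python sketch. The node names, the three-valued lattice encoding, and the `evaluate` callback are inventions of this sketch, not the book's notation; the graph mirrors the MAIN/S/T example, where J_α^X = 1, J_α^Y = 2, J_β^U = X+Y, and J_β^V = X-Y.

```python
# Sketch of constant propagation over an interprocedural value graph.
# Nodes are jump functions; an edge J_s^x -> J_t^y exists when x is in
# the support of J_t^y.  Lattice: "top" (no info yet), a constant, "bot".

def meet(a, b):
    if a == "top": return b
    if b == "top": return a
    return a if a == b else "bot"

def propagate(nodes, edges, evaluate):
    values = {n: "top" for n in nodes}
    worklist = list(nodes)
    while worklist:
        n = worklist.pop()
        new = meet(values[n], evaluate(n, values))
        if new != values[n]:
            values[n] = new                     # value moved down the lattice
            worklist.extend(edges.get(n, []))   # revisit dependent nodes
    return values

# Jump functions of the example: J_a^X = 1, J_a^Y = 2, J_b^U = X+Y, J_b^V = X-Y.
def ev(n, v):
    x, y = v["Ja_X"], v["Ja_Y"]
    if n == "Ja_X": return 1
    if n == "Ja_Y": return 2
    if "bot" in (x, y): return "bot"
    if "top" in (x, y): return "top"
    return x + y if n == "Jb_U" else x - y

edges = {"Ja_X": ["Jb_U", "Jb_V"], "Ja_Y": ["Jb_U", "Jb_V"]}
result = propagate(["Ja_X", "Ja_Y", "Jb_U", "Jb_V"], edges, ev)
```

Nodes are re-queued only when their value moves down the lattice, so the loop terminates.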
Value graph for the example below: edges run from J_α^X and J_α^Y to J_β^U and J_β^V. The constant-propagation algorithm will eventually converge to J_α^X = 1, J_α^Y = 2, J_β^U = 3, J_β^V = -1 (might need to iterate with global propagation).
Example

PROGRAM MAIN
INTEGER A,B
A = 1
B = 2
α CALL S(A,B)
END
SUBROUTINE S(X,Y)
INTEGER X,Y,Z,W
Z = X + Y
W = X - Y
β CALL T(Z,W)
END
SUBROUTINE T(U,V)
PRINT U,V
END
PROGRAM MAIN
  INTEGER A
α CALL PROCESS(15,A)
  PRINT A
END

SUBROUTINE PROCESS(N,B)
  INTEGER N,B,I
β CALL INIT(I,N)
γ CALL SOLVE(B,I)
END

SUBROUTINE INIT(X,Y)
  INTEGER X,Y
  X = 2*Y
END

SUBROUTINE SOLVE(C,T)
  INTEGER C,T
  C = T*10
END
• Need a way of building J_γ^T.
• For x an output of p, define R_p^x to be the value of x on return from p in terms of p-parameters.
• The support of R_p^x is defined as above.

For the example:
  R_INIT^X = {2*Y}    R_SOLVE^C = {T*10}
  J_α^N = 15          J_β^Y = N

  J_γ^T = R_INIT^X(N)   if I ∈ MOD(β)
        = undefined     otherwise

  R_PROCESS^B = R_SOLVE^C(J_γ^T(N))   if C ∈ MOD(γ)
              = undefined             otherwise
Jump Functions
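The composition of jump and return jump functions above can be spelled out directly. This sketch hard-codes the functions from the example (R_INIT^X = 2*Y, R_SOLVE^C = T*10, J_α^N = 15) rather than deriving them from the source.

```python
# Composing jump and return jump functions for the PROCESS example.
# R_p^x gives an output's value in terms of p's parameters; jump
# functions for later call sites are then built by composition.

R_INIT_X  = lambda y: 2 * y        # INIT:  X = 2*Y
R_SOLVE_C = lambda t: t * 10       # SOLVE: C = T*10

# gamma's actual I was last written by INIT at beta (I in MOD(beta)),
# so the jump function for T at gamma is INIT's return jump function:
J_gamma_T = lambda n: R_INIT_X(n)

# PROCESS's output B is written by SOLVE at gamma (C in MOD(gamma)):
R_PROCESS_B = lambda n: R_SOLVE_C(J_gamma_T(n))

result = R_PROCESS_B(15)           # J_alpha^N = 15 at call site alpha
```

With J_α^N = 15, the value printed for A in MAIN is 300.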
• The NotKilled problem is easily solved globally:

  NotKilled(b) = ∪_{c ∈ succ(b)} (THRU(b,c) ∩ NotKilled(c))

• To extend NotKilled to the procedure level, construct the reduced control-flow graph for the procedure:
—Vertices consist of the procedure entry, the procedure exit, and the call sites.
—Every edge (x,y) in the graph is annotated with the set THRU(x,y) of variables not killed on that edge.
Kill Analysis
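A direct fixed-point iteration of the NotKilled equation, on a small hypothetical CFG, might look like this sketch (the block names and THRU sets are invented for illustration):

```python
# Direct fixed-point iteration of:
#   NotKilled(b) = U_{c in succ(b)} (THRU(b,c) & NotKilled(c))
# on a small hypothetical CFG (entry -> b1 -> exit; b1 kills y).

def not_killed(blocks, succ, thru, exit_block, all_vars):
    nk = {b: set() for b in blocks}
    nk[exit_block] = set(all_vars)      # nothing is killed past the exit
    changed = True
    while changed:                      # iterate until a fixed point
        changed = False
        for b in blocks:
            if b == exit_block:
                continue
            new = set()
            for c in succ.get(b, []):
                new |= thru[(b, c)] & nk[c]
            if new != nk[b]:
                nk[b], changed = new, True
    return nk

thru = {("entry", "b1"): {"x", "y"}, ("b1", "exit"): {"x"}}
nk = not_killed(["entry", "b1", "exit"],
                {"entry": ["b1"], "b1": ["exit"]},
                thru, "exit", {"x", "y"})
```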
Reduced Control Graph

ComputeReducedCFG(G)
  remove back edges from G
  for each successor s of the entry node, add (entry, s) to worklist
  while worklist isn't empty
    remove a ready element (b,s) from the worklist
    if s is a call site
      add (s,t) to worklist for each successor t of s
    otherwise if s isn't the exit node
      for each successor t of s
        if THRU[b,t] is undefined then THRU[b,t] ← {}
        THRU[b,t] ← THRU[b,t] ∪ (THRU[b,s] ∩ THRU[s,t])
end
Kill Analysis
ComputeNKILL(p)
  for each b in the reduced graph, in reverse topological order
    if b is the exit node then NKILL[b] ← {all variables}
    else
      NKILL[b] ← {}
      for each successor s of b
        NKILL[b] ← NKILL[b] ∪ (NKILL[s] ∩ THRU[b,s])
      if b is a call site (p,q) then
        NKILL[b] ← NKILL[b] ∩ NKILL[q]
  NKILL[p] ← NKILL[entry node]
end
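Assuming the THRU sets and the callees' NKILL sets are already available, ComputeNKILL might be sketched in Python as follows (the graph encoding and names are this sketch's own):

```python
# Sketch of ComputeNKILL over the reduced (acyclic) control-flow graph.
# Assumes THRU sets and the callees' NKILL sets are already computed
# (callees are processed before callers, as in a bottom-up traversal).

def compute_nkill(rev_topo, succ, thru, callee_of, nkill_callee,
                  entry_node, exit_node, all_vars):
    nkill = {}
    for b in rev_topo:                   # reverse topological order
        if b == exit_node:
            nkill[b] = set(all_vars)
            continue
        nk = set()
        for s in succ.get(b, []):
            nk |= nkill[s] & thru[(b, s)]
        if b in callee_of:               # b is a call site: callee may kill
            nk &= nkill_callee[callee_of[b]]
        nkill[b] = nk
    return nkill[entry_node]

# Procedure body: entry -> call_s -> exit; the callee q kills x.
result = compute_nkill(
    ["exit", "call_s", "entry"],
    {"entry": ["call_s"], "call_s": ["exit"]},
    {("entry", "call_s"): {"x", "y"}, ("call_s", "exit"): {"x", "y"}},
    {"call_s": "q"}, {"q": {"y"}},
    "entry", "exit", {"x", "y"})
```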
Kill Analysis
• ComputeNKILL only works in the absence of formal parameters.
• A more complicated algorithm takes parameter binding into account using a binding graph.
• This algorithm ignores aliasing for efficiency.
Symbolic Analysis
• Prove facts about variables other than constancy:
—Find a symbolic expression for a variable in terms of other variables.
—Establish a relationship between pairs of variables at some point in the program.
—Establish a range of values for a variable at a given point.
Example range lattice: [-∞:60], [50:∞], and [1:100] meet to [-∞:100] and [1:∞], and ultimately to [-∞:∞].
• Jump functions and return jump functions return ranges.
• Meet operation is now more complicated.
• If we can bound the number of times the upper bound increases and the lower bound decreases, the finite-descending-chain property is satisfied.
Range Analysis
Symbolic Analysis
• Range analysis and symbolic evaluation can be solved using a lattice framework.
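A minimal sketch of such a range lattice (the `merge`/`widen` names are this sketch's own): merging two paths takes the hull of the two ranges, and widening jumps a bound that keeps growing straight to infinity, which is one standard way to get the finite-chain property.

```python
import math

# Sketch of a range lattice for symbolic analysis.  A range is (lo, hi);
# merging two incoming paths takes the hull, and widening caps how many
# times a bound may move by sending a growing bound to infinity.

def merge(a, b):                       # union hull of two ranges
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new):                   # bound the lattice descent
    lo = old[0] if new[0] >= old[0] else -math.inf
    hi = old[1] if new[1] <= old[1] else math.inf
    return (lo, hi)

hull = merge((1, 100), (50, math.inf))           # two paths into a join
loop = widen((0, 10), merge((0, 10), (0, 11)))   # bound grew inside a loop
```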
DO I = 1,N
  CALL SOURCE(A,I)
  CALL SINK(A,I)
ENDDO

We want to know if this loop carries a dependence. MOD and USE are of some use, but not much.

Let M_A(I) be the set of locations in array A modified on iteration I, and U_A(I) the set of locations used on iteration I. Then the loop has a carried true dependence iff

  M_A(I1) ∩ U_A(I2) ≠ ∅   for some 1 ≤ I1 < I2 ≤ N
Array Section Analysis
• Consider the following code:
Array Section Analysis
• One possible lattice representation is sections of the form:
– A(I,L), A(I,*), A(*,L), A(*,*)
• The depth of the lattice is on the order of the number of array subscripts, and the meet operation is efficient.
• A better representation is one in which upper and lower bounds for each subscript are allowed.
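A sketch of the bounded-section representation (the tuple encoding is an assumption of this sketch): each subscript carries a lower and upper bound, `*` denotes the full extent, and a conservative overlap test supports the dependence check above.

```python
# Sketch of bounded array sections: one (lo, hi) pair per subscript,
# STAR for the full extent.  The meet widens bounds to cover both
# sections; the overlap test is the conservative dependence check.

STAR = (None, None)                      # unknown / full extent

def meet_dim(a, b):
    if a == STAR or b == STAR:
        return STAR
    return (min(a[0], b[0]), max(a[1], b[1]))

def meet(sec_a, sec_b):
    return tuple(meet_dim(a, b) for a, b in zip(sec_a, sec_b))

def dims_overlap(a, b):
    if a == STAR or b == STAR:
        return True                      # must assume they may overlap
    return a[0] <= b[1] and b[0] <= a[1]

def may_intersect(sec_a, sec_b):
    """May the modified and used sections touch the same element?"""
    return all(dims_overlap(a, b) for a, b in zip(sec_a, sec_b))

a = ((1, 10), (5, 5))                    # section A(1:10, 5)
b = ((11, 20), (5, 5))                   # section A(11:20, 5)
hull = meet(a, b)
dep = may_intersect(a, b)
```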
SUBROUTINE SUB1(X,Y,P)
  INTEGER X,Y
  CALL P(X,Y)
END

What procedure names can be passed into P?
To avoid loss of precision, we need to record sets of procedure-name tuples when a subroutine has multiple procedure parameters.
Call Graph Construction
• This problem is complicated by procedure parameters:
Call Graph Construction

Procedure ComputeProcParms
  for each procedure p
    ProcParms[p] ← {}
  for each call site s = (p,q) passing in procedure names
    let t = <N1,N2,…,Nk> be the procedure names passed in
    worklist ← worklist ∪ {<t,q>}
  while worklist isn't empty
    remove <t = <N1,N2,…,Nk>, p> from worklist
    ProcParms[p] ← ProcParms[p] ∪ {t}
    let <P1,P2,…,Pk> be the parameters bound to <N1,N2,…,Nk>
    for each call site (p,q) passing in some Pi
      let u = <M1,M2,…,Ml> be the procedure names and instances of Pi passed into q
      if u is not in ProcParms[q] then
        worklist ← worklist ∪ {<u,q>}
end
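The worklist above might be rendered in Python as follows. This sketch assumes callees are statically known and encodes "instance of Pi" as a positional reference into the caller's binding tuple; calling *through* a procedure parameter itself is elided.

```python
# Worklist sketch of ComputeProcParms.  Call sites are (caller, callee,
# args): each arg is either a literal procedure name or ("param", i),
# meaning position i of the caller's own procedure-parameter tuple.

def compute_procparms(call_sites):
    procparms = {}                       # procedure -> set of name tuples
    worklist = [(tuple(args), callee)    # seed with all-literal sites
                for caller, callee, args in call_sites
                if all(isinstance(a, str) for a in args)]
    while worklist:
        t, p = worklist.pop()
        if t in procparms.setdefault(p, set()):
            continue
        procparms[p].add(t)              # new binding discovered for p
        for caller, callee, args in call_sites:
            if caller != p:
                continue                 # only sites inside p see t
            u = tuple(a if isinstance(a, str) else t[a[1]] for a in args)
            if u not in procparms.get(callee, set()):
                worklist.append((u, callee))
    return procparms

sites = [("MAIN", "SUB1", ["FOO"]),            # MAIN passes FOO to SUB1
         ("SUB1", "SUB2", [("param", 0)])]     # SUB1 forwards its proc param
result = compute_procparms(sites)
```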
Inlining
• Inlining procedure calls has several advantages:
– Eliminates procedure call overhead.
– Allows more optimizations to take place.
• However, a study by Mary Hall showed that overuse of inlining can cause slowdowns:
– Breaks assumptions the compiler makes about procedures.
– Inlined code adds register spills.
– Changing a function forces global recompilation.
PROCEDURE UPDATE(A,N,IS)
  REAL A(N)
  DO I = 1,N
    A(I*IS-IS+1) = A(I*IS-IS+1) + PI
  ENDDO
END

If we knew that IS != 0 at a call, then the loop can be vectorized.
If we know that IS != 0 at specific call sites, clone a vectorized version of the procedure and use it at those sites.
Procedure Cloning
• Often specific values of function parameters result in better optimizations.
Before:
DO I = 1,N
  CALL FOO()
ENDDO

PROCEDURE FOO()
  …
END

After:
CALL FOO()

PROCEDURE FOO()
  DO I = 1,N
    …
  ENDDO
END
Hybrid optimizations
• Combining transformations across procedure boundaries can have benefit.
• One example is loop embedding:
Whole Program
• Want to efficiently do interprocedural analysis without constant recompilation.
• This is a software engineering problem.
• Basic idea: every time the optimizer passes over a component, calculate information telling what other procedures must be rescanned.
• Include a feedback loop that optimizes until no procedures are out of date.
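A toy sketch of that feedback loop (the names and the summary function are invented for illustration): recomputing a procedure's summary facts marks its dependents out of date, until the facts stabilize.

```python
# Toy sketch of the whole-program feedback loop: when a procedure's
# summary facts change, its dependents (callers) are out of date and
# are revisited until nothing changes.

def recompile(changed, summaries, compute_summary, dependents):
    out_of_date = set(changed)
    passes = 0
    while out_of_date:
        p = out_of_date.pop()
        passes += 1
        new = compute_summary(p, summaries)
        if new != summaries.get(p):
            summaries[p] = new                        # facts changed...
            out_of_date |= dependents.get(p, set())   # ...rescan callers
    return passes

# A calls B; B was just edited so it now also writes y.
local_writes = {"A": {"z"}, "B": {"x", "y"}}
calls = {"A": ["B"], "B": []}

def mod_summary(p, summaries):            # MOD-style summary: own writes
    s = set(local_writes[p])              # plus the callees' summaries
    for q in calls[p]:
        s |= summaries.get(q, set())
    return s

summaries = {"B": {"x"}, "A": {"x", "z"}}   # stale facts from the last build
passes = recompile({"B"}, summaries, mod_summary, {"B": {"A"}})
```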
Summary
• Solutions of flow-sensitive problems:
—Constant propagation
—Kill analysis
• Solutions to related problems such as symbolic analysis and array section analysis.
• Other optimizations and ways to integrate these into whole-program compilation.