Introduction to Optimization, II: Value Numbering & Larger Scopes
Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use.
Missed opportunities (need stronger methods)
A: m ← a + b; n ← a + b
B: p ← c + d; r ← c + d
C: q ← a + b; r ← c + d
D: e ← b + 18; s ← a + b; u ← e + f
E: e ← a + 17; t ← c + d; u ← e + f
F: v ← a + b; w ← c + d; x ← e + f
G: y ← a + b; z ← c + d
Value Numbering (EaC, Figure 8.5)
Local Value Numbering
• 1 block at a time
• Strong local results
• No cross-block effects
An Aside on Terminology
Control-flow graph (CFG)
• Nodes for basic blocks
• Edges for branches
• Basis for much of program analysis & transformation
This CFG, G = (N,E)
• N = {A,B,C,D,E,F,G}
• E = {(A,B),(A,C),(B,G),(C,D),(C,E),(D,F),(E,F),(F,G)}
• |N| = 7, |E| = 8
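As a rough sketch (not from the original slides), the example graph can be written as a node set and an edge set, with successor and predecessor maps derived from them. The names `N`, `E`, `succs`, and `preds` are chosen here for illustration. The edge list assumes the edge leaving F goes to G, since G is later described as merging B and F.

```python
# Sketch: the example CFG as plain Python sets (all names are illustrative).
N = {"A", "B", "C", "D", "E", "F", "G"}

# Assumption: the edge out of F is (F, G), because G merges B and F.
E = {("A", "B"), ("A", "C"), ("B", "G"), ("C", "D"),
     ("C", "E"), ("D", "F"), ("E", "F"), ("F", "G")}

# Successor and predecessor lists, derived from the edge set.
succs = {n: sorted(t for (s, t) in E if s == n) for n in N}
preds = {n: sorted(s for (s, t) in E if t == n) for n in N}

print(len(N), len(E))          # 7 8
print(preds["F"], preds["G"])  # ['D', 'E'] ['B', 'F']
```

The predecessor map is what singles out F and G: they are the only blocks with more than one predecessor.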
Superlocal Value Numbering
The Concept
• Apply local method to EBBs
• Do {A,B}, {A,C,D}, & {A,C,E}
• Obtain reuse from ancestors
• Avoid re-analyzing A & C
• Does not help with F or G
EBB: A set of blocks B1, B2, …, Bn where Bi is the sole predecessor of Bi+1, and B1 does not have a unique predecessor
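The EBB paths listed above can be recovered mechanically from the CFG. The following is a minimal sketch under stated assumptions: the helper name `ebb_paths` is hypothetical, the graph is the example CFG with the edge out of F taken to go to G, and a block heads an EBB when it is the entry or does not have exactly one predecessor.

```python
# Sketch: enumerate the root-to-leaf paths of each EBB in the example CFG.
succs = {"A": ["B", "C"], "B": ["G"], "C": ["D", "E"],
         "D": ["F"], "E": ["F"], "F": ["G"], "G": []}

# Build predecessor lists from the successor lists.
preds = {n: [] for n in succs}
for s, targets in succs.items():
    for t in targets:
        preds[t].append(s)

def ebb_paths(entry="A"):
    # A block heads an EBB if it is the entry or lacks a unique predecessor.
    heads = [n for n in sorted(succs) if n == entry or len(preds[n]) != 1]
    paths = []

    def walk(path):
        # Extend only through successors whose sole predecessor is the
        # block at the end of the current path.
        nxt = [s for s in succs[path[-1]] if len(preds[s]) == 1]
        if not nxt:
            paths.append(path)
        for s in nxt:
            walk(path + [s])

    for h in heads:
        walk([h])
    return paths

print(ebb_paths())
# [['A', 'B'], ['A', 'C', 'D'], ['A', 'C', 'E'], ['F'], ['G']]
```

F and G each come out as single-block paths, which is why the superlocal method gives them no help from earlier blocks.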
Superlocal Value Numbering
Efficiency
• Use A’s table to initialize tables for B & C
• To avoid duplication, use a scoped hash table
  A, AB, A, AC, ACD, AC, ACE, F, G
• Need a VN → name mapping to handle kills
  Must restore map with scope
  Adds complication, not cost
To simplify matters
• Unique name for each definition
• Makes each name equivalent to its VN
• Use the SSA name space
EaC: § 5 & App. B
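One simple way to picture a scoped hash table is as a stack of dictionaries, one per block on the current path. The sketch below is an illustration only: the class name `ScopedTable` and its methods are hypothetical, and a production table would likely be organized differently. It walks the first steps of the scope sequence A, AB, A, AC.

```python
class ScopedTable:
    """Sketch: a stack of dictionaries, one dictionary per open scope."""

    def __init__(self):
        self.scopes = []

    def push(self):
        # Open a new scope for the block being entered.
        self.scopes.append({})

    def pop(self):
        # Discard every entry added since the matching push.
        self.scopes.pop()

    def insert(self, key, value):
        self.scopes[-1][key] = value

    def lookup(self, key):
        # Search from the innermost scope outward.
        for scope in reversed(self.scopes):
            if key in scope:
                return scope[key]
        return None

table = ScopedTable()
table.push()                          # scope A
table.insert(("+", "a", "b"), "m")
table.push()                          # scope AB
table.insert(("+", "c", "d"), "p")
in_b = table.lookup(("+", "a", "b"))  # found through A's scope
table.pop()                           # back to A
table.push()                          # scope AC
in_c = table.lookup(("+", "c", "d"))  # B's entry is gone
print(in_b, in_c)                     # m None
```

Popping B's scope removes its entries in one step, so A's table is never copied or rebuilt when the walk moves on to C.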
Subscripted names from example in last lecture
SSA Name Space (locally)
Example (from last lecture)
With VNs (superscripts, written ^n)
a0^3 ← x0^1 + y0^2
b0^3 ← x0^1 + y0^2
a1^4 ← 17
c0^3 ← x0^1 + y0^2
Notation:
• While complex, the meaning is clear
Original Code
a0 ← x0 + y0
b0 ← x0 + y0
a1 ← 17
c0 ← x0 + y0
Renaming:
• Give each value a unique name
• Makes it clear
Rewritten (VNs written ^n)
a0^3 ← x0^1 + y0^2
b0^3 ← a0^3
a1^4 ← 17
c0^3 ← a0^3
Result:
• a0^3 is available
• rewriting works
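The rewriting in this example can be sketched as a small local value-numbering pass over a single block. This is an illustrative simplification: the function name `local_vn` and the tuple encoding are hypothetical, each SSA name doubles as its own value number, and constants are recorded but not folded or matched.

```python
def local_vn(block):
    """Sketch: block is a list of (target, op, args); returns rewritten block."""
    table = {}   # expression key -> SSA name that already holds the value
    vn = {}      # SSA name -> name acting as its value number
    out = []
    for target, op, args in block:
        if op == "const":
            # Simplification: each constant definition is a fresh value.
            vn[target] = target
            out.append((target, op, args))
            continue
        # Key the expression on the value numbers of its operands.
        key = (op,) + tuple(vn.get(a, a) for a in args)
        if key in table:
            # Redundant: reuse the name that already holds this value.
            vn[target] = table[key]
            out.append((target, "copy", (table[key],)))
        else:
            table[key] = target
            vn[target] = target
            out.append((target, op, args))
    return out

code = [("a0", "+", ("x0", "y0")),
        ("b0", "+", ("x0", "y0")),
        ("a1", "const", (17,)),
        ("c0", "+", ("x0", "y0"))]
result = local_vn(code)
for line in result:
    print(line)
```

Because `a1` is a new name rather than a redefinition of `a0`, the value in `a0` is still available when `c0` is processed, so both `b0` and `c0` become copies of `a0`.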
SSA Name Space (in general)
Two principles
• Each name is defined by exactly one operation
• Each operand refers to exactly one definition
To reconcile these principles with real code
• Insert φ-functions at merge points to reconcile name space
• Add subscripts to variable names for uniqueness
x ← ...      x ← ...
   ... ← x + ...
becomes
x0 ← ...     x1 ← ...
   x2 ← φ(x0,x1)
   ... ← x2 + ...
This is in SSA Form
Superlocal Value Numbering
A: m0 ← a + b; n0 ← a + b
B: p0 ← c + d; r0 ← c + d
C: q0 ← a + b; r1 ← c + d
D: e0 ← b + 18; s0 ← a + b; u0 ← e + f
E: e1 ← a + 17; t0 ← c + d; u1 ← e + f
F: e3 ← φ(e0,e1); u2 ← φ(u0,u1); v0 ← a + b; w0 ← c + d; x0 ← e + f
G: r2 ← φ(r0,r1); y0 ← a + b; z0 ← c + d
EaC, Figure 8.6
With all the bells & whistles
• Find more redundancy
• Pay little additional cost
• Still does nothing for F & G
Superlocal techniques
• Some local methods extend cleanly to superlocal scopes
• VN does not back up
• If C adds to A, it’s a problem
What About Larger Scopes?
We have not helped with F or G
• Multiple predecessors
• Must decide what facts hold in F and in G
  For G, combine B & F?
  Merging state is expensive
  Fall back on what’s known
Dominators
Definitions
x dominates y if and only if every path from the entry of the control-flow graph to the node for y includes x
• By definition, x dominates x
• We associate a Dom set with each node
• |Dom(x)| ≥ 1
Immediate dominators
• For any node x other than the entry, there must be a y ≠ x in Dom(x) closest to x
• We call this y the immediate dominator of x
• As a matter of notation, we write this as IDom(x)
Dominators
Dominators have many uses in analysis & transformation
• Finding loops
• Building SSA form
• Making code motion decisions
We’ll look at how to compute dominators later
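Dominator tree
A: children B, C, G
C: children D, E, F
Dominator sets
Dom(A) = {A}
Dom(B) = {A,B}
Dom(C) = {A,C}
Dom(D) = {A,C,D}
Dom(E) = {A,C,E}
Dom(F) = {A,C,F}
Dom(G) = {A,G}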
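For the example CFG, Dom and IDom can be checked with a simple iterative computation. The sketch below is one straightforward formulation chosen for brevity, not necessarily the algorithm presented in EaC. The function names `dominators` and `idom` are hypothetical, the edge out of F is taken to go to G, and every node is assumed reachable from the entry.

```python
# Predecessor lists for the example CFG (edge out of F assumed to go to G).
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["C"],
         "E": ["C"], "F": ["D", "E"], "G": ["B", "F"]}

def dominators(preds, entry="A"):
    nodes = set(preds)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes - {entry}):
            # Dom(n) = {n} united with the intersection of Dom(p) over preds p.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def idom(dom, n):
    strict = dom[n] - {n}
    # Strict dominators form a chain; the closest has the largest Dom set.
    return max(strict, key=lambda d: len(dom[d])) if strict else None

dom = dominators(preds)
for n in sorted(dom):
    print(n, sorted(dom[n]), idom(dom, n))
```

On this graph the computation gives IDom(F) = C and IDom(G) = A, which matches the dominator tree.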
Back to the discussion of value numbering over larger scopes ...
What About Larger Scopes?
We have not helped with F or G
• Multiple predecessors
• Must decide what facts hold in F and in G
  For G, combine B & F?
  Merging state is expensive
  Fall back on what’s known
• Can use table from IDom(x) to start x
  Use C for F and A for G
  Imposes a Dom-based application order
Leads to Dominator VN Technique (DVNT)
Dominator Value Numbering
The DVNT Algorithm
• Use superlocal algorithm on extended basic blocks
  Retain use of scoped hash tables & SSA name space
• Start each node with table from its IDom
  DVNT generalizes the superlocal algorithm
• No values flow along back edges (i.e., around loops)
• Constant folding, algebraic identities as before
Larger scope leads to (potentially) better results
  Local + Superlocal + good start for new EBBs
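The steps above can be sketched as a recursive walk over the dominator tree that opens a table scope per block. This is a simplified illustration, not the published algorithm: the function name `dvnt` and the tuple encoding are hypothetical, every φ-function is treated as a new value, and constant folding, algebraic identities, and commutativity are left out. The block contents follow the SSA-form example, with operand subscripts on e (e0, e1, e3) filled in as an assumption.

```python
# Hypothetical encoding of the SSA-form example: (target, op, arg1, arg2).
# Assumption: uses of e carry subscripts (e0, e1, e3).
code = {
    "A": [("m0", "+", "a", "b"), ("n0", "+", "a", "b")],
    "B": [("p0", "+", "c", "d"), ("r0", "+", "c", "d")],
    "C": [("q0", "+", "a", "b"), ("r1", "+", "c", "d")],
    "D": [("e0", "+", "b", "18"), ("s0", "+", "a", "b"), ("u0", "+", "e0", "f")],
    "E": [("e1", "+", "a", "17"), ("t0", "+", "c", "d"), ("u1", "+", "e1", "f")],
    "F": [("e3", "phi", "e0", "e1"), ("u2", "phi", "u0", "u1"),
          ("v0", "+", "a", "b"), ("w0", "+", "c", "d"), ("x0", "+", "e3", "f")],
    "G": [("r2", "phi", "r0", "r1"), ("y0", "+", "a", "b"), ("z0", "+", "c", "d")],
}
# Children in the dominator tree: A dominates B, C, G; C dominates D, E, F.
dom_children = {"A": ["B", "C", "G"], "B": [], "C": ["D", "E", "F"],
                "D": [], "E": [], "F": [], "G": []}

def dvnt(block, scopes, vn, found):
    scopes.append({})                 # new scope on top of the IDom's tables
    for target, op, x, y in code[block]:
        if op == "phi":
            vn[target] = target       # simplification: each phi is a new value
            continue
        key = (op, vn.get(x, x), vn.get(y, y))
        hit = next((s[key] for s in reversed(scopes) if key in s), None)
        if hit is None:
            scopes[-1][key] = target  # first computation of this value
            vn[target] = target
        else:
            vn[target] = hit          # redundant with an earlier name
            found[target] = hit
    for child in dom_children[block]:
        dvnt(child, scopes, vn, found)
    scopes.pop()                      # discard this block's entries

found = {}
dvnt("A", [], {}, found)
print(found)
```

In this sketch, `v0` and `w0` in F and `y0` in G are found redundant because F starts from C's tables and G from A's, while `z0` and `x0` are not, since no dominator of G computes c + d and no dominator of F computes an expression on e.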
Dominator Value Numbering
A: m ← a + b; n ← a + b
B: p ← c + d; r ← c + d
C: q ← a + b; r ← c + d
D: e ← b + 18; s ← a + b; u ← e + f
E: e ← a + 17; t ← c + d; u ← e + f
F: e3 ← φ(e1,e2); u2 ← φ(u0,u1); v ← a + b; w ← c + d; x ← e + f
G: r2 ← φ(r0,r1); y ← a + b; z ← c + d
DVNT advantages
• Find more redundancy
• Little additional cost
• Retains online character
DVNT shortcomings
• Misses some opportunities
• No loop-carried CSEs or constants
The Story So Far, …
• Local algorithm (Balke, 1967)
• Superlocal extension of Balke (many)
• Dominator VN technique (Simpson, 1996)
All these propagate along forward edges
None are global methods
Global Methods
• Classic CSE (Cocke 1970)
• Partitioning algorithms (Alpern et al. 1988, Click 1995)
• Partial Redundancy Elimination (Morel & Renvoise 1979)
• SCC/VDCM (Simpson 1996)
We will look at several global methods
Roadmap
To recap …
• We have seen value numbering
  Local, superlocal, dominator scopes
  We may look at global value numbering later
  We will look at global redundancy elimination today
• We have used dominators
  Chapter 9 in EaC shows how to compute dominators
  Gives world’s fastest algorithm (also simplest)
• We have used SSA
  Chapter 9 in EaC shows how to build SSA form & tear it down