
Introduction to Optimization, II: Value Numbering & Larger Scopes

Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use.

Missed opportunities (need stronger methods)

Blocks of the example CFG:

    A:  m ← a + b;   n ← a + b
    B:  p ← c + d;   r ← c + d
    C:  q ← a + b;   r ← c + d
    D:  e ← b + 18;  s ← a + b;  u ← e + f
    E:  e ← a + 17;  t ← c + d;  u ← e + f
    F:  v ← a + b;   w ← c + d;  x ← e + f
    G:  y ← a + b;   z ← c + d

Value Numbering EaC, Figure 8.5

Local Value Numbering

• 1 block at a time

• Strong local results

• No cross-block effects
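As a reminder of what the local method does inside one block, the following is a minimal Python sketch. The tuple format `(target, left, op, right)`, the function name `local_value_numbering`, and the two-element output form `(target, source)` are assumptions made for illustration, not notation from EaC. The sketch ignores commutativity, constant folding, and algebraic identities.

```python
from itertools import count

def local_value_numbering(block):
    """Value-number one basic block.

    block: list of (target, left, op, right) tuples (hypothetical format).
    Returns a new list in which a redundant computation is replaced by a
    two-element tuple (target, source), meaning "target <- source".
    """
    fresh = count()   # generator of unused value numbers
    vn_of = {}        # variable name or (op, vn, vn) key -> value number
    holder = {}       # value number -> a variable that currently holds it
    out = []
    for target, left, op, right in block:
        operand_vns = []
        for name in (left, right):
            if name not in vn_of:          # first use: assign a new VN
                vn_of[name] = next(fresh)
            operand_vns.append(vn_of[name])
        key = (op, operand_vns[0], operand_vns[1])
        if key in vn_of and vn_of[key] in holder:
            # Same value already computed and still held by some variable.
            out.append((target, holder[vn_of[key]]))
        else:
            if key not in vn_of:
                vn_of[key] = next(fresh)
            out.append((target, left, op, right))
        vn = vn_of[key]
        # Kill: if target was the recorded holder of its old value, forget it.
        # This is conservative; another variable may still hold that value.
        old = vn_of.get(target)
        if old is not None and holder.get(old) == target:
            del holder[old]
        vn_of[target] = vn
        holder.setdefault(vn, target)
    return out
```

Applied to block A, `local_value_numbering([("m", "a", "+", "b"), ("n", "a", "+", "b")])` keeps the first statement and turns the second into the copy `("n", "m")`. Nothing learned in A carries over to B or C, which is the limitation the slide points out.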


An Aside on Terminology


Control-flow graph (CFG)

• Nodes for basic blocks

• Edges for branches

• Basis for much of program analysis & transformation

This CFG, G = (N,E)

• N = {A,B,C,D,E,F,G}

• E = {(A,B),(A,C),(B,G),(C,D),(C,E),(D,F),(E,F),(F,G)}

• |N| = 7, |E| = 8

Superlocal Value Numbering

The Concept

• Apply local method to EBBs

• Do {A,B}, {A,C,D}, & {A,C,E}

• Obtain reuse from ancestors

• Avoid re-analyzing A & C

• Does not help with F or G

EBB: A set of blocks B1, B2, …, Bn where Bi is the sole predecessor of Bi+1, and B1 does not have a unique predecessor


Efficiency

• Use A’s table to initialize tables for B & C

• To avoid duplication, use a scoped hash table

A, AB, A, AC, ACD, AC, ACE, F, G

• Need a VN → name mapping to handle kills
    Must restore the map with each scope
    Adds complication, not cost

To simplify matters

• Unique name for each definition

• Makes each name serve as its own VN

• Use the SSA name space

EaC: § 5 & App. B
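The scoped-table idea can be sketched in Python with `collections.ChainMap`, where `new_child()` opens a scope and the scope disappears when the recursive call returns. The names `BLOCKS`, `EBB_CHILDREN`, and `superlocal_vn` are hypothetical. The sketch also assumes that no name is redefined along any single path through the EBB (true for this example), so each name can stand in for its value number and no kill handling is needed.

```python
from collections import ChainMap

# Blocks of the example CFG as (target, left, op, right) tuples.
BLOCKS = {
    "A": [("m", "a", "+", "b"), ("n", "a", "+", "b")],
    "B": [("p", "c", "+", "d"), ("r", "c", "+", "d")],
    "C": [("q", "a", "+", "b"), ("r", "c", "+", "d")],
    "D": [("e", "b", "+", "18"), ("s", "a", "+", "b"), ("u", "e", "+", "f")],
    "E": [("e", "a", "+", "17"), ("t", "c", "+", "d"), ("u", "e", "+", "f")],
}

# Tree structure of the EBB rooted at A: paths {A,B}, {A,C,D}, {A,C,E}.
EBB_CHILDREN = {"A": ["B", "C"], "C": ["D", "E"]}

def superlocal_vn(block, table, found):
    """Process block in a fresh scope layered on its ancestor's table."""
    scope = table.new_child()        # entries added here vanish on return
    for target, left, op, right in BLOCKS[block]:
        # Map each operand to the name that canonically holds its value.
        key = (op, scope.get(left, left), scope.get(right, right))
        if key in scope:
            scope[target] = scope[key]              # target copies an older name
            found.append((block, target, scope[key]))
        else:
            scope[key] = target                     # target names a new value
    for child in EBB_CHILDREN.get(block, []):
        superlocal_vn(child, scope, found)          # A and C are analyzed once
    return found

redundant = superlocal_vn("A", ChainMap(), [])
```

Here `redundant` lists one `(block, target, source)` entry per reuse, for instance `("D", "s", "m")` because D inherits A's entry for a + b. Blocks F and G are not in the EBB tree, so the sketch never visits them.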


SSA Name Space (locally)

Example (from last lecture, with subscripted names)

Original Code

    a0 ← x0 + y0
    b0 ← x0 + y0
    a1 ← 17
    c0 ← x0 + y0

Renaming:

• Give each value a unique name

• Makes it clear which value each name holds

With VNs

    a0^3 ← x0^1 + y0^2
    b0^3 ← x0^1 + y0^2
    a1^4 ← 17
    c0^3 ← x0^1 + y0^2

Notation:

• Subscripts distinguish definitions; superscripts (written ^n here) are value numbers

• While complex, the meaning is clear

Rewritten

    a0^3 ← x0^1 + y0^2
    b0^3 ← a0^3
    a1^4 ← 17
    c0^3 ← a0^3

Result:

• a0^3 is available

• rewriting works
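The renaming step for straight-line code can be sketched in a few lines of Python. The statement format `(target, rhs_tokens)` and the function name `rename_block` are assumptions for illustration. A name used before any definition in the block is given subscript 0, and each later definition of a name gets the next subscript.

```python
def rename_block(block):
    """Attach subscripts so every definition in the block gets a unique name.

    block: list of (target, rhs_tokens) pairs, e.g. ("a", ["x", "+", "y"]).
    """
    current = {}   # variable -> subscript of its most recent definition
    out = []
    for target, rhs in block:
        new_rhs = []
        for tok in rhs:
            if tok.isidentifier():
                current.setdefault(tok, 0)    # value that enters the block
                new_rhs.append(f"{tok}{current[tok]}")
            else:
                new_rhs.append(tok)           # operators and constants
        # A definition creates a new name: first definition 0, then 1, 2, ...
        current[target] = current[target] + 1 if target in current else 0
        out.append((f"{target}{current[target]}", new_rhs))
    return out

renamed = rename_block([
    ("a", ["x", "+", "y"]),
    ("b", ["x", "+", "y"]),
    ("a", ["17"]),
    ("c", ["x", "+", "y"]),
])
```

For the example, the second definition of a becomes a1, so a0 still names the value of x0 + y0 when c0 is computed. That is why the rewrite to a copy from a0 is safe.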

SSA Name Space (in general)

Two principles

• Each name is defined by exactly one operation

• Each operand refers to exactly one definition

To reconcile these principles with real code

• Insert φ-functions at merge points to reconcile name space

• Add subscripts to variable names for uniqueness

    x ← ...      x ← ...
         ... ← x + ...

becomes

    x0 ← ...     x1 ← ...
         x2 ← φ(x0,x1)
         ... ← x2 + ...

This is in SSA Form


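Blocks of the example CFG in SSA form:

    A:  m0 ← a + b;   n0 ← a + b
    B:  p0 ← c + d;   r0 ← c + d
    C:  q0 ← a + b;   r1 ← c + d
    D:  e0 ← b + 18;  s0 ← a + b;  u0 ← e + f
    E:  e1 ← a + 17;  t0 ← c + d;  u1 ← e + f
    F:  e3 ← φ(e0,e1);  u2 ← φ(u0,u1);  v0 ← a + b;  w0 ← c + d;  x0 ← e + f
    G:  r2 ← φ(r0,r1);  y0 ← a + b;  z0 ← c + d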

EaC, Figure 8.6

With all the bells & whistles

• Find more redundancy

• Pay little additional cost

• Still does nothing for F & G

Superlocal techniques

• Some local methods extend cleanly to superlocal scopes

• VN does not back up

• If C adds to A, it’s a problem

What About Larger Scopes?

We have not helped with F or G

• Multiple predecessors

• Must decide what facts hold in F and in G
    For G, combine B & F?
    Merging state is expensive
    Fall back on what’s known


Dominators

Definitions

x dominates y if and only if every path from the entry of the control-flow graph to the node for y includes x

• By definition, x dominates x

• We associate a Dom set with each node

• |Dom(x)| ≥ 1

Immediate dominators

• For any node x other than the entry, there must be a y ≠ x in Dom(x) closest to x

• We call this y the immediate dominator of x

• As a matter of notation, we write this as IDom(x)

Dominators

Dominators have many uses in analysis & transformation

• Finding loops

• Building SSA form

• Making code motion decisions

We’ll look at how to compute dominators later

Dominator tree

    A
        B
        C
            D
            E
            F
        G

Dominator sets

    Block   Dom        IDom
    A       {A}        (none)
    B       {A,B}      A
    C       {A,C}      A
    D       {A,C,D}    C
    E       {A,C,E}    C
    F       {A,C,F}    C
    G       {A,G}      A
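To make the Dom and IDom definitions concrete, here is a small Python sketch that computes them for the example CFG with a plain iterative set intersection. This is a simple formulation chosen for readability and is not necessarily the algorithm EaC presents. The function names are hypothetical, and the edge list is an assumption: it takes F → G as the last edge, which matches the later statement that G merges B and F. The sketch also assumes every node other than the entry has at least one predecessor.

```python
def dominators(nodes, edges, entry):
    """Return {node: set of nodes that dominate it}."""
    preds = {n: set() for n in nodes}
    for src, dst in edges:
        preds[dst].add(src)
    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            # n is dominated by itself plus whatever dominates all its preds.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def immediate_dominators(dom, entry):
    """Pick, for each node, its closest strict dominator."""
    idom = {}
    for n, ds in dom.items():
        if n == entry:
            continue
        strict = ds - {n}
        # Strict dominators form a chain; the closest has the largest Dom set.
        idom[n] = max(strict, key=lambda d: len(dom[d]))
    return idom

NODES = ["A", "B", "C", "D", "E", "F", "G"]
EDGES = [("A", "B"), ("A", "C"), ("B", "G"), ("C", "D"),
         ("C", "E"), ("D", "F"), ("E", "F"), ("F", "G")]   # F -> G assumed

DOM = dominators(NODES, EDGES, "A")
IDOM = immediate_dominators(DOM, "A")
```

With these edges, `IDOM["F"]` is C and `IDOM["G"]` is A, which agrees with the dominator tree drawn on the slide.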

Back to the discussion of value numbering over larger scopes ...


What About Larger Scopes?

We have not helped with F or G

• Multiple predecessors

• Must decide what facts hold in F and in G
    For G, combine B & F?
    Merging state is expensive
    Fall back on what’s known

• Can use table from IDom(x) to start x
    Use C for F and A for G
    Imposes a Dom-based application order

Leads to Dominator VN Technique (DVNT)


Dominator Value Numbering

The DVNT Algorithm

• Use superlocal algorithm on extended basic blocks
    Retain use of scoped hash tables & SSA name space

• Start each node with table from its IDom
    DVNT generalizes the superlocal algorithm

• No values flow along back edges (i.e., around loops)

• Constant folding, algebraic identities as before

Larger scope leads to (potentially) better results
    Local + Superlocal + good start for new EBBs
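A rough Python sketch of the dominator-based walk follows, under several stated assumptions. The statement format `(target, op, args)` and the names `BLOCKS`, `DOM_CHILDREN`, and `dvnt` are hypothetical. Uses of e are subscripted here (e0, e1, e3) so that every operand refers to one definition, which goes slightly beyond the figure. Each φ target is simply treated as a new value, and constant folding, algebraic identities, and commutativity are left out.

```python
from collections import ChainMap

# Example CFG in SSA form; uses of e are subscripted (an assumption).
BLOCKS = {
    "A": [("m0", "+", ("a", "b")), ("n0", "+", ("a", "b"))],
    "B": [("p0", "+", ("c", "d")), ("r0", "+", ("c", "d"))],
    "C": [("q0", "+", ("a", "b")), ("r1", "+", ("c", "d"))],
    "D": [("e0", "+", ("b", "18")), ("s0", "+", ("a", "b")),
          ("u0", "+", ("e0", "f"))],
    "E": [("e1", "+", ("a", "17")), ("t0", "+", ("c", "d")),
          ("u1", "+", ("e1", "f"))],
    "F": [("e3", "phi", ("e0", "e1")), ("u2", "phi", ("u0", "u1")),
          ("v0", "+", ("a", "b")), ("w0", "+", ("c", "d")),
          ("x0", "+", ("e3", "f"))],
    "G": [("r2", "phi", ("r0", "r1")), ("y0", "+", ("a", "b")),
          ("z0", "+", ("c", "d"))],
}

# Children in the dominator tree: IDom(B) = IDom(C) = IDom(G) = A, and so on.
DOM_CHILDREN = {"A": ["B", "C", "G"], "C": ["D", "E", "F"]}

def dvnt(block, table, found):
    """Value-number block starting from the table of its immediate dominator."""
    scope = table.new_child()            # scoped table: discarded on return
    for target, op, args in BLOCKS[block]:
        if op == "phi":
            continue                     # sketch: a phi target is a new value
        key = (op,) + tuple(scope.get(a, a) for a in args)
        if key in scope:
            scope[target] = scope[key]   # SSA name doubles as value number
            found.append((block, target, scope[key]))
        else:
            scope[key] = target
    for child in DOM_CHILDREN.get(block, []):
        dvnt(child, scope, found)        # dominator-tree (preorder) walk
    return found

redundant = dvnt("A", ChainMap(), [])
```

Compared with the superlocal walk, the only structural change is that the recursion follows the dominator tree instead of the EBB tree. In this sketch F reuses m0 and r1 through C's table and G reuses m0 through A's table, while z0 ← c + d in G is still computed because neither B's nor C's entries are visible from A.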

Dominator Value Numbering


DVNT advantages

• Find more redundancy

• Little additional cost

• Retains online character

DVNT shortcomings

• Misses some opportunities

• No loop-carried CSEs or constants

The Story So Far, …

• Local algorithm (Balke, 1967)

• Superlocal extension of Balke (many)

• Dominator VN technique (Simpson, 1996)

All these propagate along forward edges
None are global methods

Global Methods

• Classic CSE (Cocke 1970)

• Partitioning algorithms (Alpern et al. 1988, Click 1995)

• Partial Redundancy Elimination (Morel & Renvoise 1979)

• SCC/VDCM (Simpson 1996)

We will look at several global methods


Roadmap

To recap …

• We have seen value numbering
    Local, superlocal, dominator scopes
    We may look at global value numbering later
    We will look at global redundancy elimination today

• We have used dominators
    Chapter 9 in EaC shows how to compute dominators
    Gives world’s fastest algorithm (also simplest)

• We have used SSA
    Chapter 9 in EaC shows how to build SSA form & tear it down
