
Graphs and Sets in Compilers

Arun Chauhan

COMP 314

Lecture 14, 15 Feb 27, Mar 4, 2003

Recall: Compiler Structure

source code → front-end → IR → middle → IR → back-end → object code

(front-end: parsing; middle: optimization; back-end: code generation)

Lecture 14, 15: Graphs and Sets in Compilers Feb 27, Mar 4, 2003

Front-end

• parsing builds parse trees

- parse trees may be converted to syntax trees

- interpreters can evaluate with a walk of the parse tree

- a simple unparser (or source-to-source translator) can use the parse tree

Back-end: Register Allocation

• machine operations can only work on registers

• limited number of registers

• writing to and reading from memory is expensive

• each variable must live in memory

Reg Alloc ≡ Graph Coloring

x = a + b
...
z = d + e
y = x * c
...
p = z + f

live range 1 (of x): l1
live range 2 (of z): l2

[Figure: interference graph with nodes l1 and l2, one per live range]

• coloring the interference graph = allocating registers

- color nodes: no two adjacent nodes have the same color

- number of colors = number of registers

- can we color the interference graph?
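
The coloring question above can be tried out with a small sketch. It assumes each live range is an inclusive interval of statement indices (a simplification), treats overlapping intervals as interfering, and uses a plain greedy heuristic rather than any particular allocator's method. The function names and the index numbering are illustrative assumptions.

```python
from itertools import combinations

def build_interference(live_ranges):
    """live_ranges maps a name to (start, end) statement indices, inclusive."""
    graph = {name: set() for name in live_ranges}
    for a, b in combinations(live_ranges, 2):
        (s1, e1), (s2, e2) = live_ranges[a], live_ranges[b]
        if s1 <= e2 and s2 <= e1:  # the two intervals overlap
            graph[a].add(b)
            graph[b].add(a)
    return graph

def greedy_color(graph, k):
    """Try to give every node one of k colors; return None if the heuristic fails."""
    colors = {}
    # Visit high-degree nodes first: a common ordering, not an optimal one.
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        taken = {colors[m] for m in graph[node] if m in colors}
        free = [c for c in range(k) if c not in taken]
        if not free:
            return None  # a real allocator would spill a live range here
        colors[node] = free[0]
    return colors

# Statement indices for the fragment (the "..." lines are skipped):
# 0: x = a + b   1: z = d + e   2: y = x * c   3: p = z + f
live_ranges = {"l1": (0, 2), "l2": (1, 3)}
graph = build_interference(live_ranges)
two_regs = greedy_color(graph, 2)   # succeeds: l1 and l2 get different colors
one_reg = greedy_color(graph, 1)    # fails: the two ranges interfere
```

Because greedy coloring is only a heuristic, a `None` result does not prove that no k-coloring exists for larger graphs.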

Graph Coloring

• the decision problem for k-coloring of graphs is NP-complete (for k ≥ 3)

- graph-coloring is still used for register allocation

- excellent heuristics exist

• many problems in compilers are NP-complete

Basic Blocks and Beyond

Basic Block: a maximal-length sequence of straight-line code (control enters only at the first statement and leaves only at the last)

• representing a program (function)

- represent each basic block by a node

- connect node n1 to n2 with a directed edge whenever n2 can immediately follow n1 in program execution

⇒ Control Flow Graph (CFG)

Example: if-else

...
x = N;
if (x < 0) {
    y = sqrt(-x);
}
else {
    y = sqrt(x);
}
z = x + y;
...

[Figure: CFG of the fragment. After "x = N;", the test "x < 0 ?" branches to "y = sqrt(-x);" and to "y = sqrt(x);"; both flow into "z = x + y;".]
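
One common way to obtain basic blocks and CFG edges from lowered code is the "leader" rule: a block starts at the first instruction, at every jump target, and right after every jump. The sketch below assumes a hypothetical `Instr` record and a hand-lowered version of the if-else fragment; the labels `ELSE` and `JOIN` and the inverted test are illustrative choices, not taken from the slides.

```python
from typing import NamedTuple, Optional

class Instr(NamedTuple):
    text: str
    label: Optional[str] = None    # label attached to this instruction
    target: Optional[str] = None   # label this instruction may jump to
    falls_through: bool = True     # False for an unconditional jump

def build_cfg(instrs):
    if not instrs:
        return [], {}
    label_at = {ins.label: i for i, ins in enumerate(instrs) if ins.label}
    # Leaders: the first instruction, every jump target, and every
    # instruction that follows a jump.
    leaders = {0}
    for i, ins in enumerate(instrs):
        if ins.target is not None:
            leaders.add(label_at[ins.target])
            if i + 1 < len(instrs):
                leaders.add(i + 1)
    starts = sorted(leaders)
    ends = starts[1:] + [len(instrs)]
    blocks = [list(range(s, e)) for s, e in zip(starts, ends)]
    block_of = {b[0]: n for n, b in enumerate(blocks)}
    edges = {n: set() for n in range(len(blocks))}
    for n, block in enumerate(blocks):
        last = instrs[block[-1]]
        if last.target is not None:                 # taken branch
            edges[n].add(block_of[label_at[last.target]])
        if last.falls_through and block[-1] + 1 < len(instrs):
            edges[n].add(block_of[block[-1] + 1])   # fall-through successor
    return blocks, edges

program = [
    Instr("x = N"),
    Instr("if x >= 0 goto ELSE", target="ELSE"),
    Instr("y = sqrt(-x)"),
    Instr("goto JOIN", target="JOIN", falls_through=False),
    Instr("y = sqrt(x)", label="ELSE"),
    Instr("z = x + y", label="JOIN"),
]
blocks, edges = build_cfg(program)
# blocks holds instruction indices per block; edges maps block number to successors.
```

The result has four blocks, with the first block branching to the two arms and both arms flowing into the block that computes z.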

Example: while loop

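...
n = 1;
s = 2;
while (n < P-1) {
    s = 2 * s;
    n = n + 1;
}
print(s);
...

[Figure: CFG of the fragment. "n = 1; s = 2;" flows into the test "n < P-1 ?", which branches to the loop body "s = 2 * s; n = n + 1;" and to "print(s);"; the loop body flows back to the test.]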
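
The edge from the loop body back to the test is what makes this CFG cyclic. As a rough sketch, a depth-first search can report such edges: an edge whose target is still on the DFS stack closes a cycle. The block names (`init`, `header`, `body`, `exit`) are assumed labels, and on irreducible graphs this simple test is only an approximation of loop back edges.

```python
def back_edges(cfg, entry):
    """Return edges (u, v) where v is still on the DFS stack when u -> v is seen."""
    state = {}   # node -> "active" while on the DFS stack, "done" afterwards
    found = []

    def visit(u):
        state[u] = "active"
        for v in cfg[u]:
            if state.get(v) == "active":
                found.append((u, v))   # v is an ancestor of u: a cycle closes here
            elif v not in state:
                visit(v)
        state[u] = "done"

    visit(entry)
    return found

while_cfg = {
    "init":   ["header"],          # n = 1; s = 2;
    "header": ["body", "exit"],    # n < P-1 ?
    "body":   ["header"],          # s = 2 * s; n = n + 1;
    "exit":   [],                  # print(s);
}
loops = back_edges(while_cfg, "init")
```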

Why CFG?

• exposes the control semantics

• language independent

- works for most imperative languages

- functional languages use alternative (λ-calculus based) representations

• graph representation enables graph-based optimization algorithms

Redundant Code Elimination

• code that does not change the outcome of a program can be eliminated

- any optimization must preserve observational equivalence

x = 5;
y = 10;
w = 2 * exp(x) * (-y);
z = sqrt(w) + log(y);
print(w);

z is dead after it is defined

redundant computation of z can be eliminated
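
A minimal sketch of how such dead assignments could be removed from straight-line code: walk the statements backwards while tracking which variables are still needed. The tuple encoding of statements is a hypothetical representation, and the sketch assumes that right-hand sides have no side effects and cannot trap; without that assumption, dropping a statement could change observable behavior.

```python
def eliminate_dead_assignments(stmts, live_out):
    """stmts: list of (target, used_vars, text); target is None for statements
    kept for their effect (such as print). Assumes right-hand sides are pure."""
    live = set(live_out)
    kept = []
    for target, used, text in reversed(stmts):
        if target is not None and target not in live:
            continue                 # value is never read afterwards: drop it
        if target is not None:
            live.discard(target)     # this assignment kills the earlier value
        live.update(used)            # operands must be live before this point
        kept.append(text)
    kept.reverse()
    return kept

block = [
    ("x", [], "x = 5;"),
    ("y", [], "y = 10;"),
    ("w", ["x", "y"], "w = 2 * exp(x) * (-y);"),
    ("z", ["w", "y"], "z = sqrt(w) + log(y);"),
    (None, ["w"], "print(w);"),
]
result = eliminate_dead_assignments(block, live_out=[])
# result keeps the assignments to x, y, w and the print; the assignment to z is gone.
```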

Live Variable Analysis

• define sets of variables for each basic block

- DEF(b) ≡ defined in the basic block b

- USED(b) ≡ used in the basic block b, before they are defined

- LIVE(b) ≡ live on exit of the basic block b

• the goal is to compute LIVE sets for each block

- if a variable v is computed in b, is not used again within b, and is not in LIVE(b), the computation is redundant

- if a variable is not in the LIVE set of a block, it need not stay in a register past the end of that block

- a use before definition can be detected
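
As an illustration of the DEF and USED definitions, the sketch below computes both sets for one block in a single forward pass. The `(target, operands)` statement encoding is an assumption made for this example.

```python
def def_used(stmts):
    """stmts: list of (target_or_None, used_vars) in execution order."""
    defined, used = set(), set()
    for target, operands in stmts:
        # A use counts only if the block has not already defined the variable.
        used.update(v for v in operands if v not in defined)
        if target is not None:
            defined.add(target)
    return defined, used

# A two-statement block: s = abs(x); r = sqrt(s);
d, u = def_used([("s", ["x"]), ("r", ["s"])])
# s is read only after the block defines it, so it does not appear in USED.
```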

Computing the LIVE Set

n = A;
x = B;
if (x < 0) {
    s = abs(x);
    r = sqrt(s);
}
else {
    r = sqrt(x);
}
print(r);

CFG blocks:

B1: n = A; x = B; x < 0 ?
B2: s = abs(x); r = sqrt(s);
B3: r = sqrt(x);
B4: print(r);

Edges: B1 → B2, B1 → B3, B2 → B4, B3 → B4

Flow Equations for LIVE

\[
\mathrm{LIVE}(b) = \bigcup_{s \in \mathrm{succ}(b)} \Big( \big(\mathrm{LIVE}(s) \cap \overline{\mathrm{DEF}(s)}\big) \cup \mathrm{USED}(s) \Big)
\]

LIVE(B0) = {A, B}

DEF(B1) = {n, x}    USED(B1) = {A, B}    LIVE(B1) = {x}

DEF(B2) = {s, r}    USED(B2) = {x}       LIVE(B2) = {r}

DEF(B3) = {r}       USED(B3) = {x}       LIVE(B3) = {r}

DEF(B4) = ∅         USED(B4) = {r}       LIVE(B4) = ∅

[Figure: CFG of the fragment, with blocks B1 (n = A; x = B; x < 0 ?), B2 (s = abs(x); r = sqrt(s);), B3 (r = sqrt(x);), and B4 (print(r);)]
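
The flow equation can be solved by iterating until no LIVE set changes. The sketch below is one straightforward round-robin solver, not necessarily the method used in the lecture. It assumes B0 is an empty entry block that precedes B1, and that nothing is live on exit from B4.

```python
def solve_live(succ, DEF, USED):
    """Iterate LIVE(b) = union over successors s of (LIVE(s) - DEF(s)) | USED(s)."""
    live = {b: set() for b in succ}
    changed = True
    while changed:
        changed = False
        for b in succ:
            new = set()
            for s in succ[b]:
                new |= (live[s] - DEF[s]) | USED[s]
            if new != live[b]:
                live[b] = new
                changed = True
    return live

succ = {"B0": ["B1"], "B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
DEF = {
    "B0": set(),
    "B1": {"n", "x"},   # B1 assigns both n and x
    "B2": {"s", "r"},
    "B3": {"r"},
    "B4": set(),
}
USED = {"B0": set(), "B1": {"A", "B"}, "B2": {"x"}, "B3": {"x"}, "B4": {"r"}}
live = solve_live(succ, DEF, USED)
```

The sets only grow from one pass to the next and are bounded by the set of all variables, so the loop terminates. Since n is in DEF(B1) but not in LIVE(B1), the assignment to n is a candidate for elimination.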

Dataflow: Graphs & Sets

• set-based equations

- either forward or backward flow

- usually, iteratively solvable (fixed-point solution)

• control flow graph

- built from the parse or syntax tree

• dataflow sets

- most practical implementations use bit-vector representations
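
To give a feel for the bit-vector idea, the sketch below encodes each variable as one bit of a Python integer, so that set intersection, complement, and union become `&`, `~`, and `|`. The variable ordering in `VARS` is an arbitrary assumption; a production compiler would use fixed-width machine words.

```python
VARS = ["A", "B", "n", "x", "s", "r"]
BIT = {v: 1 << i for i, v in enumerate(VARS)}

def to_bits(names):
    bits = 0
    for v in names:
        bits |= BIT[v]
    return bits

def to_names(bits):
    return {v for v in VARS if bits & BIT[v]}

def transfer(live_s, def_s, used_s):
    # (LIVE(s) ∩ ¬DEF(s)) ∪ USED(s) with bitwise operations
    return (live_s & ~def_s) | used_s

# Contribution of a successor with LIVE = {}, DEF = {}, USED = {r}
from_exit = transfer(to_bits([]), to_bits([]), to_bits(["r"]))
# Contribution of a successor with LIVE = {r}, DEF = {s, r}, USED = {x}
from_branch = transfer(from_exit, to_bits(["s", "r"]), to_bits(["x"]))
```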

Other Operations on CFGs

• eliminating dead code

- graph reachability

• detecting loops

- strongly connected components

• many other flow-based optimizations

- common subexpression elimination

- constant propagation and folding

- loop-invariant code motion
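
The first two items can be sketched with standard graph algorithms: a reachability search from the entry block finds blocks that can never execute, and strongly connected components (computed here with Kosaraju's algorithm, one of several options) group blocks that lie on a common cycle. The CFG and the `orphan` block are hypothetical.

```python
def reachable(cfg, entry):
    seen, stack = {entry}, [entry]
    while stack:
        for v in cfg[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def strongly_connected_components(cfg):
    """Kosaraju's algorithm; the recursive pass is fine for small graphs."""
    order, seen = [], set()

    def finish(u):
        seen.add(u)
        for v in cfg[u]:
            if v not in seen:
                finish(v)
        order.append(u)          # record u once all its successors are done

    for u in cfg:
        if u not in seen:
            finish(u)
    reverse = {u: [] for u in cfg}
    for u in cfg:
        for v in cfg[u]:
            reverse[v].append(u)
    components, assigned = [], set()
    for root in reversed(order):  # decreasing finish time, on the reversed graph
        if root in assigned:
            continue
        comp, stack = set(), [root]
        assigned.add(root)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in reverse[u]:
                if v not in assigned:
                    assigned.add(v)
                    stack.append(v)
        components.append(comp)
    return components

cfg = {
    "init": ["header"],
    "header": ["body", "exit"],
    "body": ["header"],
    "exit": [],
    "orphan": ["exit"],   # hypothetical block with no path from the entry
}
unreachable = set(cfg) - reachable(cfg, "init")
# A component is a loop if it has several blocks or a single block with a self edge.
loops = [c for c in strongly_connected_components(cfg)
         if len(c) > 1 or any(u in cfg[u] for u in c)]
```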

Another Graph

• call graph

- nodes are procedures (functions)

- directed edges connect callers to the callees

- self edges are possible
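
A call graph can be stored the same way as a CFG. In the small sketch below (all function names are made up for illustration), a self edge marks a directly recursive procedure, and inverting the edges gives the callers of each procedure.

```python
call_graph = {
    "main": {"parse", "fact"},
    "parse": {"next_token"},
    "next_token": set(),
    "fact": {"fact"},        # fact calls itself: a self edge
}

# Procedures with a self edge are directly recursive.
directly_recursive = sorted(f for f, callees in call_graph.items() if f in callees)

# Invert the edges to find who calls each procedure.
callers_of = {f: set() for f in call_graph}
for caller, callees in call_graph.items():
    for callee in callees:
        callers_of[callee].add(caller)
```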

Applications Outside Compilers

• networks

- represent network topologies

- model network bandwidth or traffic through weighted graphs

• parallel computation

- represent interconnection networks as graphs whose nodes are CPUs

- model loads with annotated graphs

• artificial intelligence

- iterative deepening A∗ algorithm

- neural networks
