Code Generation and Optimisation
Register Allocation
via Graph Colouring
Broad problem
Source program:
y := 10; x := y*y*y; print x
Code produced by the current compiler:
li y, 10
mul t, y, y
mul x, t, y
print x
Problem: to automatically convert unlimited register code to limited register code.
Broad solution
The number of registers used in a piece of code can be reduced by:
register reuse: when two registers are never simultaneously live they can be remapped to a single register;
register spilling: register values can be stored in memory and loaded into registers only when needed.
Example: register reuse
Since x and t are never live at the same time they can be remapped to a single register r.
Before:
li y, 10
mul t, y, y
mul x, t, y
print x
After:
li y, 10
mul r, y, y
mul r, r, y
print r
Spill memory
Suppose registers a, b, and y are spilt to memory. They might be arranged in memory as follows:
Memory:
…
a   ⟵ spill
b
y
…
The spill register points to the beginning of the spilt registers in memory.
Loads and Stores
To access spilt registers, we need to extend our target language with instructions for accessing memory.
Concrete syntax of MIPS3:
instr ::= …                  (MIPS2)
        | load r0, r1, i     (r0 ⟵ mem[r1+i])
        | store r0, r1, i    (mem[r1+i] ⟵ r0)
Abstract syntax of MIPS3:
data Instr = … | LOAD(Reg, Reg, Int) | STORE(Reg, Reg, Int)
Example: register spilling
Suppose y has been spilt to memory. We must wrap accesses to y with load and/or store instructions.
Before:
mul y, x, x
...
print y
After:
mul t, x, x
store t, spill, 2
...
load t, spill, 2
print t
Register allocation
Aims:
Introduce as much register reuse as possible. (The number of registers used will fall significantly, but may still be beyond the limit of the target machine.)
Spill registers to memory only if necessary. Memory access is expensive!
INTERFERENCE GRAPHS
When are registers live at the same time?
Interference graphs
Using the results of liveness analysis, we can construct an interference graph:
each register r is represented by a node labelled r in the interference graph;
there is an edge between r1 and r2 if liveness analysis states that, at some point in the code, r1 and r2 are live at the same time.
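The construction above can be sketched as follows. This is a minimal illustration, not the course implementation: the names interference and example are hypothetical, and the graph is assumed to be built from a list of live sets, one per program point. The live-in sets used are those assumed for the earlier program li y, 10; mul t, y, y; mul x, t, y; print x.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Reg = String

-- An interference graph maps each register to the set of its neighbours.
type IG = Map.Map Reg (Set.Set Reg)

-- Build the graph from a list of live sets, one per program point.
-- Two distinct registers that appear in the same live set get an edge.
interference :: [[Reg]] -> IG
interference liveSets = Map.fromListWith Set.union
  [ (r, Set.fromList (filter (/= r) rs)) | rs <- liveSets, r <- rs ]

-- Live-in sets assumed for: li y, 10; mul t, y, y; mul x, t, y; print x
example :: IG
example = interference [[], ["y"], ["t", "y"], ["x"]]
```

In this sketch the only edge in example is between t and y. Registers x and t have no edge between them, which is why they could share a register in the reuse example.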
Exercise 1
Draw the interference graph for the following code.
li a, 0
addi b, a, 1
muli a, b, 2
blt a, n, loop
add c, c, b
L1:
loop:
L2:
L3:
L5: print c
L4:
Label Live-in
L1 c, n
loop a, c, n
L2 b, c, n
L3 b, c, n
L4 a, c, n
L5 c
GRAPH COLOURING
Register allocation
Let colours 1…N represent the N registers available in the target machine.
Register allocation is similar to graph colouring:
colour the interference graph with at most N colours, such that no two nodes connected by an edge have the same colour.
But there is one big difference: if the graph is not N-colourable then registers may be spilt (assigned no colour).
The number of spills should be kept to a minimum!
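The colouring condition can be written down as a small check. The sketch below is an assumption-laden illustration: the names valid, spilt and triangle are hypothetical, edges are given as a list of pairs, and a register missing from the colouring is taken to be spilt.

```haskell
import qualified Data.Map.Strict as Map

type Reg = String
type Colour = Int
type Edge = (Reg, Reg)

-- Registers absent from the map are treated as spilt.
type Colouring = Map.Map Reg Colour

-- A colouring is valid if it uses only colours 1..n and no edge joins two
-- registers with the same colour. Spilt registers impose no constraint.
valid :: Int -> [Edge] -> Colouring -> Bool
valid n edges c = all inRange (Map.elems c) && all ok edges
  where
    inRange k = 1 <= k && k <= n
    ok (x, y) = case (Map.lookup x c, Map.lookup y c) of
      (Just cx, Just cy) -> cx /= cy
      _                  -> True

-- Registers that received no colour and must therefore be spilt.
spilt :: [Reg] -> Colouring -> [Reg]
spilt regs c = filter (`Map.notMember` c) regs

-- Hypothetical graph: a triangle, which is not 2-colourable.
triangle :: [Edge]
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
```

With two registers, colouring a as 1 and b as 2 is valid for triangle, but c has no colour left and must be spilt. With three registers no spill is needed.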
Example 1
Suppose the target machine has 3 registers. Consider register allocation on the following interference graph.
[Figure: interference graph with nodes a, b, c and d]
Good answer:
Register  Colour
a         1
b         2
c         2
d         3
Bad answer:
Register  Colour
a         1
b         2
c         3
d         Spilt
Algorithm
Register allocation, like graph colouring, is NP-complete: no polynomial-time algorithm that always finds an optimal solution is known.
We use a mild variant of Chaitin's algorithm. It is an approximate, polynomial-time algorithm that typically gives good results.
The algorithm has two components:
The basic colouring algorithm that is parameterised by the order in which nodes should be coloured.
The simplification algorithm which decides a good order to colour the nodes.
Basic colouring algorithm
Inputs:
an interference graph g;
a list of registers rs (the order).
Outputs:
a colouring c that partially-maps registers to colours, initially empty. Uncoloured registers are to be spilt.
Let neighbours(g, c, r) denote the colours (if any) of the registers connected to r.
foreach r in rs
  possible := {1..N} – neighbours(g, c, r)
  if possible ≠ {} then c[r] := minimum(possible)
Exercise 2
Assuming the target machine has just two registers, apply the basic colouring algorithm twice to the following interference graph.
First use the node ordering b, e, f, h, g, c, d, a, and then use the node ordering h, d, g, a, c, f, e, b.
[Figure: interference graph with nodes a, b, c, d, e, f, g and h]
Order matters!
When using the basic colouring algorithm, some node orderings lead to fewer spills than others.
The simplification algorithm aims to find a good ordering for the basic colouring algorithm.
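To see the effect of ordering concretely, here is a minimal runnable sketch of the basic colouring algorithm. It is not the course implementation: the number of registers is passed as a parameter n, the helper neighbourColours and the graph path are hypothetical, and the graph is assumed to list each edge in both directions.

```haskell
import qualified Data.Map.Strict as Map
import Data.List ((\\))
import Data.Maybe (mapMaybe)

type Reg = String
type Colour = Int
type IG = Map.Map Reg [Reg]
type Colouring = Map.Map Reg Colour

-- Colours already given to the neighbours of r.
neighbourColours :: IG -> Colouring -> Reg -> [Colour]
neighbourColours g c r = mapMaybe (`Map.lookup` c) (Map.findWithDefault [] r g)

-- Colour registers in the given order using colours 1..n.
-- A register with no free colour is left out of the result, i.e. spilt.
basic :: Int -> IG -> [Reg] -> Colouring
basic n g = foldl step Map.empty
  where
    step c r = case [1 .. n] \\ neighbourColours g c r of
      []      -> c                 -- no colour available: spill r
      (k : _) -> Map.insert r k c  -- smallest free colour

-- Hypothetical path graph a - b - c - d.
path :: IG
path = Map.fromList
  [("a", ["b"]), ("b", ["a", "c"]), ("c", ["b", "d"]), ("d", ["c"])]
```

With two registers, basic 2 path ["a", "b", "c", "d"] colours every node (a and c get 1, b and d get 2). The ordering ["a", "d", "b", "c"] gives a and d colour 1 and b colour 2, which leaves no colour for c, so c is spilt.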
Chaitin’s observation
Suppose that
N is the number of available registers;
graph g contains a node r with fewer than N neighbours;
g-r is the graph obtained by removing r and its edges.
If g-r is N-colourable then so is g: when r and its edges are added back to g-r, the neighbours of r use fewer than N colours, so at least one colour remains free for r.
Node ordering
Chaitin's observation suggests the following recursively-defined node ordering.
To colour the nodes in graph g, first colour the nodes in g-r, where r has fewer than N neighbours, then colour node r.
(Note, this does not help us when there exists no node with fewer than N neighbours.)
Deterministic simplification algorithm
Inputs:
an interference graph g.
Outputs:
A list of registers rs representing an order in which to colour the nodes.
Let pick(g) return the register in g with the fewest neighbours; if more than one exists, return the first alphabetically.
Let del(r, g) return graph g with register r, and its edges, removed.
rs := []
while non-empty(g)
  r := pick(g)
  rs := [r] ++ rs
  g := del(r, g)
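The loop above can be sketched as runnable code. This is an illustration under assumptions rather than the course implementation: the graph is a map from each register to its neighbour list with every edge stored in both directions, and pick assumes a non-empty graph.

```haskell
import qualified Data.Map.Strict as Map

type Reg = String
type IG = Map.Map Reg [Reg]

-- Register with the fewest neighbours; ties go to the alphabetically first.
-- Assumes the graph is non-empty.
pick :: IG -> Reg
pick g = snd (minimum [ (length ns, r) | (r, ns) <- Map.toList g ])

-- Remove register r and every edge that mentions it.
del :: Reg -> IG -> IG
del r g = Map.map (filter (/= r)) (Map.delete r g)

-- Each picked register is placed in front of those picked before it,
-- so the register removed first is coloured last.
simplify :: IG -> [Reg]
simplify = go []
  where
    go rs g
      | Map.null g = rs
      | otherwise  = let r = pick g in go (r : rs) (del r g)

-- Hypothetical graph: triangle a, b, c with d attached to c.
example :: IG
example = Map.fromList
  [("a", ["b", "c"]), ("b", ["a", "c"]), ("c", ["a", "b", "d"]), ("d", ["c"])]
```

On this graph, simplify example picks d, then a, then b, then c, and so returns ["c", "b", "a", "d"].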
Exercise 3
Give the output of the simplification algorithm on the following interference graph.
[Figure: interference graph with nodes a, b, c, d, e, f, g and h]
REGISTER ALLOCATION
register: a device for storing small amounts of data
allocate: to apportion for a specific purpose
Webster’s Dictionary
Substitution
If register x is coloured c, then the register allocator replaces all occurrences of x in the code with the machine register rc (r1 for colour 1, r2 for colour 2, and so on).
For example, if registers x, y, and t are coloured 1, 2, and 1 then:
Before:
li y, 10
mul t, y, y
mul x, t, y
print x
After:
li r2, 10
mul r1, r2, r2
mul r1, r1, r2
print r1
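The substitution step can be sketched as below. The Instr type here is a hypothetical simplification (an opcode with a list of textual operands), not the abstract syntax used elsewhere in these slides, and the function name substitute is likewise an assumption.

```haskell
import qualified Data.Map.Strict as Map

type Reg = String
type Colouring = Map.Map Reg Int

-- Hypothetical instruction type: an opcode followed by textual operands.
data Instr = Instr String [String] deriving (Eq, Show)

-- Replace each coloured register x with "r<colour>". Operands that are not
-- in the colouring (immediates, uncoloured registers) are left unchanged.
substitute :: Colouring -> [Instr] -> [Instr]
substitute c = map rename
  where
    rename (Instr op args) = Instr op (map operand args)
    operand a = maybe a (\k -> "r" ++ show k) (Map.lookup a c)

-- The example from the text: x, y and t are coloured 1, 2 and 1.
example :: [Instr]
example = substitute (Map.fromList [("x", 1), ("y", 2), ("t", 1)])
  [ Instr "li" ["y", "10"]
  , Instr "mul" ["t", "y", "y"]
  , Instr "mul" ["x", "t", "y"]
  , Instr "print" ["x"]
  ]
```

Here example evaluates to the instructions li r2, 10; mul r1, r2, r2; mul r1, r1, r2; print r1, in the hypothetical Instr representation.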
Spilling
If register r is not coloured, then spill-code must be inserted to transfer the value of r to and from memory as required.
For example, if registers x and y are to be spilt to memory locations spill[offsetx] and spill[offsety] then
add x, x, y
becomes
load t1, spill, offsetx
load t2, spill, offsety
add t1, t1, t2
store t1, spill, offsetx
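A rough sketch of spill-code insertion for a single add instruction follows. It rests on several assumptions: the Instr type is a cut-down, hypothetical version of the abstract syntax, the temporaries t1 and t2 are assumed not to clash with other registers, and the offsets 0 and 1 in the usage note stand in for offsetx and offsety.

```haskell
import qualified Data.Map.Strict as Map

type Reg = String

data Instr
  = Add Reg Reg Reg     -- add d, s1, s2
  | Load Reg Reg Int    -- load r0, r1, i   (r0 <- mem[r1+i])
  | Store Reg Reg Int   -- store r0, r1, i  (mem[r1+i] <- r0)
  deriving (Eq, Show)

-- Rewrite an add instruction, given the spill offsets of spilt registers.
-- Spilt sources are loaded into temporaries before the add; a spilt
-- destination is computed in t1 and stored back afterwards.
spillAdd :: Map.Map Reg Int -> Instr -> [Instr]
spillAdd offs (Add d s1 s2) = loads ++ [Add d' s1' s2'] ++ stores
  where
    (s1', l1) = use "t1" s1
    (s2', l2) = use "t2" s2
    loads = l1 ++ l2
    (d', stores) = case Map.lookup d offs of
      Just o  -> ("t1", [Store "t1" "spill" o])
      Nothing -> (d, [])
    use t r = case Map.lookup r offs of
      Just o  -> (t, [Load t "spill" o])
      Nothing -> (r, [])
spillAdd _ i = [i]      -- other instructions are left unchanged in this sketch
```

With x at offset 0 and y at offset 1, spillAdd rewrites Add "x" "x" "y" into two loads, an add on t1 and t2, and a store of t1 back to offset 0, mirroring the example above.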
COALESCING
Reducing unnecessary move instructions
Coalescing
If we have an instruction
move x, y
and there is no edge between x and y in the interference graph then the instruction can be eliminated.
Registers x and y can be coalesced into a new register xy whose neighbours are the union of those of x and of y.
Snag
Since a coalesced register xy may have more neighbours than x or y alone, a graph that is N-colourable before coalescing may not be after.
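The snag can be illustrated with a small sketch. The function coalesce and the graph snag are hypothetical, the graph is assumed to store every edge in both directions, and x and y are assumed to be distinct registers.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Reg = String
type IG = Map.Map Reg (Set.Set Reg)

-- Merge registers x and y into a single node named by concatenating
-- their names. Returns Nothing if x and y interfere, in which case the
-- move cannot be eliminated.
coalesce :: Reg -> Reg -> IG -> Maybe IG
coalesce x y g
  | y `Set.member` nbrs x = Nothing
  | otherwise = Just (Map.insert xy merged (Map.map rename rest))
  where
    nbrs r = Map.findWithDefault Set.empty r g
    xy = x ++ y
    -- The new node's neighbours are the union of those of x and of y.
    merged = Set.union (nbrs x) (nbrs y)
    rest = Map.delete x (Map.delete y g)
    -- Edges that pointed at x or y now point at xy.
    rename = Set.map (\r -> if r == x || r == y then xy else r)

-- Hypothetical path graph x - a - b - y, which is 2-colourable.
snag :: IG
snag = Map.fromList
  [ ("x", Set.fromList ["a"]), ("a", Set.fromList ["x", "b"])
  , ("b", Set.fromList ["a", "y"]), ("y", Set.fromList ["b"]) ]
```

Coalescing x and y in snag yields a triangle on xy, a and b. The path was 2-colourable, but the triangle is not, so with two registers the coalesced graph forces a spill.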
Coalescing can be done selectively during register allocation without compromising N-colourability: see Section 11.2 of Appel[1].
[1] Andrew Appel, Modern Compiler Implementation in ML.
SUMMARY
What have we learnt?
Summary
Register allocation transforms unlimited-register code into limited-register code.
Aim to introduce as few spills as possible by reusing registers.
Chaitin's algorithm is not optimal, but it is efficient and typically gives good results.
Coalescing allows unnecessary move instructions to be removed.
Limitations
A long-living but infrequently-used variable may get mapped to a register r (and not spilt).
This seems wasteful of a valuable resource: it would be better to map a frequently-used variable to register r.
Acknowledgements
Many ideas in this chapter were taken from:
Gregory Chaitin of IBM: see “Register Allocation and Spilling via Graph Colouring”.
Andrew Appel: see “Modern Compiler Implementation in ML”.
IMPLEMENTATION
Interference graph (1)
Interference graph: each register r is mapped to a list of registers that are live at the same time as r.
type IG = Map Reg [Reg]
Compute all registers live at the same time as r:
liveWith :: (Liveness, Reg) -> [Reg]
liveWith(live, r) = bigUnion([rs | (l, rs) <- live, member(r, rs)])
Interference graph (2)
Construct the interference graph for a given code sequence:
interferenceGraph :: Code -> IG
interferenceGraph(code) = [(r, liveWith(live, r)) | r <- regs]
  where
    root = fresh()
    g    = cfg(code, root)
    live = liveness(g)
    regs = bigUnion([use(i) ++ def(i) | i <- code])
Basic colouring
A colour is an integer between 1 and numRegs and a colouring maps registers to colours:
type Colour = Int
type Colouring = Map Reg Colour
Colour graph g in the order specified by rs, given an initial colouring c:
basic :: ([Reg], IG, Colouring) -> Colouring
basic([], g, c) = c
basic(r:rs, g, c) =
  case possible of
    []   -> basic(rs, g, c)
    x:xs -> basic(rs, g, insert(c, r, x))
  where possible = [1..numRegs] \\ neighbours(g, c, r)
Simplification algorithm
Return registers in a good order for colouring:
simplify :: IG -> [Reg]
simplify([]) = []
simplify(g) = simplify(del(r, g)) ++ [r]
  where r = pick(g)
Colour the registers in the order given by the simplification algorithm:
colour :: IG -> Colouring
colour(g) = basic(simplify(g), g, [])