compiler optimisation - 7 register allocation · 2019-01-23 · register allocation definitions...
Post on 25-May-2020
Compiler Optimisation
7 – Register Allocation
Hugh Leather
IF 1.18a
hleather@inf.ed.ac.uk
Institute for Computing Systems Architecture
School of Informatics
University of Edinburgh
2019
Introduction
This lecture:
  Local Allocation - spill code
  Global Allocation based on graph colouring
  Techniques to reduce spill code
Register allocation
Physical machines have a limited number of registers
Scheduling and selection typically assume infinite registers
Register allocation and assignment map ∞ → k registers
Requirements
  Produce correct code that uses k (or fewer) registers
  Minimise added loads and stores
  Minimise space used to hold spilled values
  Operate efficiently
    O(n), O(n log₂ n), maybe O(n²), but not O(2ⁿ)
Register allocation: Definitions
Allocation versus assignment
  Allocation is deciding which values to keep in registers
  Assignment is choosing specific registers for values
Interference
  Two values[a] cannot be mapped to the same register wherever they are both live[b]
  Such values are said to interfere
  [a] A value is stored in a variable
  [b] A value is live from its definition to its last use
Live range
  The live range of a value is the set of statements at which it is live
  May be conservatively overestimated (e.g. just begin → end)
Spilling
  Spilling saves a value from a register to memory
  That register is then free; another value is often loaded into it
  Requires F registers to be reserved
Clean and dirty values
  A previously spilled value is clean if not changed since last spill
  Otherwise it is dirty
  A clean value can be spilled without a new store instruction
Spilling in ILOC
F is 0 (assuming rarp already reserved)
Dirty value
  storeAI rx ⇒ rarp,@x
  loadAI rarp,@y ⇒ ry
Clean value
  loadAI rarp,@y ⇒ ry
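The clean versus dirty distinction can be sketched as a small helper that emits the spill code shown above. The function name `spill_code` and its parameters are illustrative assumptions, not part of the lecture.

```python
def spill_code(evicted, loaded, dirty):
    """Return the ILOC that frees the register holding `evicted` for `loaded`.

    Hypothetical helper: value names double as register suffixes and
    memory labels, as in the slide (rx, @x).
    """
    code = []
    if dirty:
        # A dirty value has changed since its last spill, so it needs a store.
        code.append(f"storeAI r{evicted} ⇒ rarp,@{evicted}")
    # A clean value already has an up-to-date copy in memory: no store.
    code.append(f"loadAI rarp,@{loaded} ⇒ r{loaded}")
    return code
```

A dirty eviction costs two memory operations and a clean one costs one, which is why the bottom-up allocator prefers to evict clean values.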
Local register allocation
Register allocation restricted to a single basic block
MAXLIVE
  Let MAXLIVE be the maximum, over each instruction i in the block, of the number of values (pseudo-registers) live at i.
If MAXLIVE ≤ k, allocation should be easy
If MAXLIVE ≤ k, no need to reserve F registers for spilling
If MAXLIVE > k, some values must be spilled to memory
If MAXLIVE > k, need to reserve F registers for spilling
Two main forms:
  Top down
  Bottom up
Local register allocation: MAXLIVE
Example MAXLIVE computation
  Some simple code with virtual registers
  Live registers
  MAXLIVE is 4
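A minimal sketch of a MAXLIVE computation, assuming a block is given as a list of (defined, used) pairs and that "live at i" means live on entry to instruction i. The representation and the name `max_live` are assumptions for illustration; the block in the usage note is made up and is not the lecture's example.

```python
def max_live(block, live_out=()):
    """block: list of (defined, used) pairs, one per instruction, in order.

    live_out: values still live after the block (assumed empty by default).
    """
    live = set(live_out)
    best = len(live)
    # Scan backwards: a definition ends a live range, a use starts one.
    for defined, used in reversed(block):
        live -= set(defined)
        live |= set(used)
        best = max(best, len(live))
    return best
```

For a block `a = ...; b = ...; c = a + b; d = a * c; e = b + d; store e`, at most three values (a, b, c) are live at once, so three registers suffice without spilling.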
Local register allocation: Top down
Algorithm:
  If number of values > k
    Rank values by occurrences
    Allocate first k - F values to registers
    Spill other values
Example top down
  Usage counts
  Spill rc. Now only 3 values live at once
  Spill code inserted
  Register assignment straightforward
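The top-down algorithm as stated can be sketched as follows. The block representation (a list of (defined, used) pairs), the name `top_down`, and the alphabetical tie-break are assumptions for illustration.

```python
from collections import Counter

def top_down(block, k, feasible):
    """Return (values kept in registers, values spilled).

    block: list of (defined, used) pairs; feasible: the F reserved registers.
    """
    counts = Counter()
    for defined, used in block:
        counts.update(defined)
        counts.update(used)
    # Rank by number of occurrences; ties broken by name for determinism.
    ranked = sorted(counts, key=lambda v: (-counts[v], v))
    if len(ranked) <= k:
        return ranked, []          # everything fits, nothing to spill
    keep = k - feasible
    return ranked[:keep], ranked[keep:]
```

This version spills every value outside the first k - F, exactly as the algorithm slide says; the example slides spill only enough to bring the number of live values down.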
Local register allocation: Bottom up
Algorithm:
  Start with empty register set
  Load on demand
  When no register is available, free one
Replacement:
  Spill the value whose next use is farthest in the future
  Prefer clean value to dirty value
Example bottom up
  Spill ra. Now only 3 values live at once
  Spill code inserted
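A rough sketch of the bottom-up replacement rule, under several simplifying assumptions: a block is a list of (defined, used) pairs, each instruction defines at most one value, k is at least the number of operands of any instruction, and clean versus dirty is not modelled. The name `bottom_up` is hypothetical.

```python
def bottom_up(block, k):
    """Return the values that had to be spilled while still needed later."""
    def next_use(value, start):
        # Index of the first instruction at or after `start` that reads value.
        for i in range(start, len(block)):
            if value in block[i][1]:
                return i
        return float("inf")        # never read again

    in_regs, spilled = set(), []

    def ensure(value, start, pinned):
        if value in in_regs:
            return
        if len(in_regs) == k:
            # Free the register whose value is needed farthest in the future.
            victim = max(sorted(in_regs - pinned),
                         key=lambda v: next_use(v, start))
            in_regs.remove(victim)
            if next_use(victim, start) != float("inf"):
                spilled.append(victim)   # still needed: a genuine spill
        in_regs.add(value)

    for i, (defined, used) in enumerate(block):
        for v in used:                   # load operands on demand
            ensure(v, i, set(used))
        for v in defined:                # the result may reuse a dead operand
            ensure(v, i + 1, {v})
    return spilled
```

Values with no remaining use are evicted for free, so spill code is only counted when the evicted value is read again later.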
Global register allocation
Local allocation does not capture reuse of values across multiple blocks
Most modern, global allocators use a graph-colouring paradigm
  Build a “conflict graph” or “interference graph”
    Data flow based liveness analysis for interference
  Find a k-colouring for the graph, or change the code to a nearby problem that it can k-colour
  NP-complete under nearly all assumptions[1]
[1] Local allocation is NP-complete with dirty vs clean
Global register allocation: Algorithm sketch
From live ranges construct an interference graph
Colour interference graph so that no two neighbouring nodes have same colour
If graph needs more than k colours - transform code
  Coalesce merge-able copies
  Split live ranges
  Spill
Colouring is NP-complete so we will need heuristics
Map colours onto physical registers
Global register allocation: Graph colouring
Definition
  A graph G is said to be k-colourable iff the nodes can be labelled with integers 1 ... k so that no edge in G connects two nodes with the same label
Examples
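The definition can be checked mechanically. A minimal sketch, assuming a graph is given as a list of edges and a colouring as a dict from node to label; `is_k_colouring` is a hypothetical name.

```python
def is_k_colouring(edges, colour, k):
    """True iff every endpoint is labelled 1..k and no edge joins equal labels."""
    in_range = all(1 <= c <= k for c in colour.values())
    covered = all(u in colour and v in colour for u, v in edges)
    # `and` short-circuits, so colour[u] is only read when every endpoint is covered.
    return in_range and covered and all(colour[u] != colour[v] for u, v in edges)
```

A triangle, for instance, passes with three distinct labels but fails for any assignment that uses only two.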
Global register allocation: Interference graph
The interference graph, G_I = (N_I, E_I)
  Nodes in G_I represent values, or live ranges
  Edges in G_I represent individual interferences
    ∀ x, y ∈ N_I, x → y ∈ E_I iff x and y interfere[2]
A k-colouring of G_I can be mapped into an allocation to k registers
[2] Two values interfere wherever they are both live
    Two live ranges interfere if their values interfere at any point
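A minimal sketch of building the interference graph, assuming each live range is already available as a set of statement indices (in practice these come from data-flow liveness analysis). The name `interference_graph` and the dict-of-sets representation are assumptions.

```python
from itertools import combinations

def interference_graph(live_ranges):
    """live_ranges: dict name -> set of statement indices where it is live.

    Returns a dict name -> set of interfering names (undirected graph).
    """
    graph = {name: set() for name in live_ranges}
    for x, y in combinations(live_ranges, 2):
        # Two live ranges interfere if they overlap at any point.
        if live_ranges[x] & live_ranges[y]:
            graph[x].add(y)
            graph[y].add(x)
    return graph
```

This pairwise comparison is quadratic in the number of live ranges; it is meant to show the definition, not an efficient construction.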
Global register allocation: Colouring the interference graph
Degree[3] of a node (n°) is a loose upper bound on colourability
Any node, n, such that n° < k is always trivially k-colourable
  Trivially colourable nodes cannot adversely affect the colourability of neighbours[4]
  Can remove them from graph
  Reduces degree of neighbours - may be trivially colourable
If left with any nodes such that n° ≥ k spill one
  Reduces degree of neighbours - may be trivially colourable
[3] Degree is number of neighbours
[4] Proof as exercise
Global register allocation: Chaitin’s algorithm
1 While ∃ vertices with < k neighbours in G_I
    Pick any vertex n such that n° < k and put it on the stack
    Remove n and all edges incident to it from G_I
2 If G_I is non-empty (n° ≥ k, ∀ n ∈ G_I) then:
    Pick vertex n (heuristic), spill live range of n
    Remove vertex n and edges from G_I, put n on “spill list”
    Goto step 1
3 If the spill list is not empty, insert spill code, then rebuild the interference graph and try to allocate again
4 Otherwise, successively pop vertices off the stack and colour them in the lowest colour not used by some neighbour
Example: colouring with Chaitin’s algorithm
Colour with k = 3 colours
  a° = 2 < k. Choose a
  Push a and remove from graph
  b° = 2 < k and c° = 2 < k. Choose b
  Push b and remove from graph
  c° = 2 < k, d° = 2 < k, and e° = 2 < k. Choose c
  Push c and remove from graph
  d° = 1 < k and e° = 1 < k. Choose d
  Push d and remove from graph
  e° = 0 < k. Choose e
  Push e and remove from graph
  Pop e, neighbours use no colours, choose red
  Pop d, neighbours use red, choose green
  Pop c, neighbours use red and green, choose blue
  Pop b, neighbours use red and green, choose blue
  Pop a, neighbours use blue, choose red
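A sketch of Chaitin's simplify-and-select loop, assuming the graph is a dict from node to neighbour set. The spill heuristic (highest remaining degree) and the alphabetical tie-break are assumptions. The example graph is not in the transcript; it is reconstructed from the degrees and colours in the walkthrough, so treat it as a guess at the lost figure.

```python
def chaitin(graph, k):
    """Return (colouring, spill_list); colours are 0..k-1."""
    work = {n: set(adj) for n, adj in graph.items()}
    stack, spills = [], []

    def remove(n):
        for m in work.pop(n):
            work[m].discard(n)

    while work:
        trivial = sorted(n for n in work if len(work[n]) < k)
        if trivial:                       # step 1: push a trivially colourable node
            stack.append(trivial[0])
            remove(trivial[0])
        else:                             # step 2: every node has degree >= k
            victim = max(sorted(work), key=lambda n: len(work[n]))
            spills.append(victim)
            remove(victim)
    if spills:                            # step 3: caller inserts spill code, retries
        return {}, spills
    colouring = {}
    while stack:                          # step 4: pop and take the lowest free colour
        n = stack.pop()
        used = {colouring[m] for m in graph[n] if m in colouring}
        colouring[n] = min(c for c in range(k) if c not in used)
    return colouring, []

# Reconstructed example graph (assumption): edges a-b, a-c, b-d, b-e, c-d, c-e, d-e.
EXAMPLE = {"a": {"b", "c"}, "b": {"a", "d", "e"}, "c": {"a", "d", "e"},
           "d": {"b", "c", "e"}, "e": {"b", "c", "d"}}
```

With k = 3 and colours read as 0 = red, 1 = green, 2 = blue, `chaitin(EXAMPLE, 3)` pushes a, b, c, d, e in that order and assigns the same colours as the walkthrough.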
Global register allocation: Optimistic colouring
If Chaitin’s algorithm reaches a state where every node has k or more neighbours, it chooses a node to spill.
Example of Chaitin overzealous spilling
  k = 2
  Graph is 2-colourable
  Chaitin must immediately spill one of these nodes
Briggs said, take that same node and push it on the stack
  When you pop it off, a colour might be available for it!
Chaitin-Briggs algorithm uses this to colour that graph
Global register allocation: Chaitin-Briggs algorithm
1 While ∃ vertices with < k neighbours in G_I
    Pick any vertex n such that n° < k and put it on the stack
    Remove n and all edges incident to it from G_I
2 If G_I is non-empty (n° ≥ k, ∀ n ∈ G_I) then:
    Pick vertex n (heuristic) (Do not spill)
    Remove vertex n from G_I, put n on stack (Not spill list)
    Goto step 1
3 Otherwise, successively pop vertices off the stack and colour them in the lowest colour not used by some neighbour
    If some vertex cannot be coloured, then pick an uncoloured vertex to spill, spill it, and restart at step 1
Step 3 is also different from Chaitin’s algorithm
Example: colouring with Chaitin-Briggs algorithm
Colour with k = 2 colours
  a° = 2 ≥ k. Don’t Spill! Choose a
  Push a and remove from graph
  b° = 1 < k and c° = 1 < k. Choose b
  Push b and remove from graph
  c° = 1 < k and d° = 1 < k. Choose c
  Push c and remove from graph
  d° = 0 < k. Choose d
  Push d and remove from graph
  Pop d, neighbours use no colours, choose red
  Pop c, neighbours use red, choose green
  Pop b, neighbours use red, choose green
  Pop a, neighbours use green, choose red
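A sketch of the optimistic variant, again assuming a dict-of-sets graph. Instead of spilling when no trivial node exists, it pushes a candidate anyway and only reports a node as a spill candidate if no colour is free when it is popped. The restart after spilling is left to the caller, and the candidate heuristic (highest remaining degree) is an assumption. The 4-cycle is reconstructed from the degrees in the walkthrough.

```python
def chaitin_briggs(graph, k):
    """Return (colouring, uncoloured); colours are 0..k-1."""
    work = {n: set(adj) for n, adj in graph.items()}
    stack = []
    while work:
        trivial = sorted(n for n in work if len(work[n]) < k)
        # No trivial node: push a candidate optimistically, do not spill yet.
        n = trivial[0] if trivial else max(sorted(work), key=lambda v: len(work[v]))
        stack.append(n)
        for m in work.pop(n):
            work[m].discard(n)
    colouring, uncoloured = {}, []
    while stack:
        n = stack.pop()
        used = {colouring[m] for m in graph[n] if m in colouring}
        free = [c for c in range(k) if c not in used]
        if free:
            colouring[n] = free[0]
        else:
            uncoloured.append(n)      # would be spilled before restarting
    return colouring, uncoloured

# Reconstructed example graph (assumption): the 4-cycle a-b, a-c, b-d, c-d.
SQUARE = {"a": {"b", "c"}, "b": {"a", "d"}, "c": {"a", "d"}, "d": {"b", "c"}}
```

On `SQUARE` with k = 2, every node starts with degree 2, yet the optimistic push of a lets the graph be coloured with no spill, reading 0 as red and 1 as green as in the walkthrough.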
Global register allocation: Spill candidates
Minimise spill cost / degree
Spill cost is the loads and stores needed, weighted by scope, i.e. avoid inner loops
The higher the degree of a node to spill, the greater the chance that it will help colouring
Negative spill cost: load and store to same memory location with no other uses
Infinite cost: definition immediately followed by use. Spilling does not decrease live range
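One way to sketch the cost / degree metric. The weight of 10 per loop-nesting level is a common convention assumed here, not something the slides state, and both function names are hypothetical.

```python
def spill_cost(ref_depths, base=10):
    """ref_depths: loop-nesting depth of each load/store that spilling adds.

    Weighting each reference by base**depth makes inner loops expensive.
    """
    return sum(base ** d for d in ref_depths)

def choose_spill(graph, cost):
    """Pick the node minimising cost / degree.

    graph: dict node -> neighbour set; cost: dict node -> number, where
    float('inf') marks ranges that must not be spilled.
    """
    candidates = [n for n in graph if graph[n]]   # degree 0 never needs a spill
    return min(sorted(candidates), key=lambda n: cost[n] / len(graph[n]))
```

A range with infinite cost is only chosen if every candidate is infinite, and a negative cost sorts ahead of all positive ones.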
Global register allocation: Alternative spilling
Splitting live ranges
Coalesce
Global register allocation: Live range splitting
A whole live range may have many interferences, but perhaps not all at the same time
Split live range into two variables connected by copy
Can reduce degree of interference graph
Smart splitting allows spilling to occur in “cheap” regions
Splitting example
  Non-contiguous live ranges - cannot be 2-coloured
  Split live ranges - can be 2-coloured
Global register allocation: Coalescing
If two ranges don’t interfere and are connected by a copy, coalesce into one (the opposite of splitting)
Reduces degree of nodes that interfered with both
If x := y and x → y ∉ G_I then can combine LRx and LRy
  Eliminates the copy operation
  Reduces degree of LRs that interfere with both x and y
  If a node interfered with both before, coalescing helps
  As it reduces degree, often applied before colouring takes place
Coalescing can make the graph harder to colour
  Typically, LRxy° > max(LRx°, LRy°)
  If max(LRx°, LRy°) < k and k < LRxy° then LRxy might spill, while LRx and LRy would not spill
Observation led to conservative coalescing
Conceptually, coalesce x and y iff x → y ∉ G_I and LRxy° < k
We can do better
  Coalesce LRx and LRy iff LRxy has < k neighbours with degree ≥ k
  Only neighbours of “significant degree” can force LRxy to spill
Always safe to perform that coalesce
  Cannot introduce a node of non-trivial degree
  Cannot introduce a new spill
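The conservative test can be sketched as below, assuming a dict-of-sets interference graph and taking "significant degree" to mean degree ≥ k. The caller is assumed to have checked that x and y are connected by a copy; `can_coalesce` is a hypothetical name.

```python
def can_coalesce(graph, x, y, k):
    """True iff merging x and y passes the conservative (Briggs-style) test."""
    if y in graph[x]:
        return False                      # interfering ranges cannot be merged
    merged_neighbours = (graph[x] | graph[y]) - {x, y}

    def degree_after(n):
        # A node adjacent to both x and y loses one edge once they merge.
        return len(graph[n]) - (1 if x in graph[n] and y in graph[n] else 0)

    significant = [n for n in merged_neighbours if degree_after(n) >= k]
    return len(significant) < k
```

For the path x - p - q - y with k = 2, merging x and y would create a triangle, and the test refuses because p and q both keep degree 2.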
Global register allocation: Other approaches
Top-down uses high level priorities to decide on colouring
Hierarchical approaches - use control flow structure to guide allocation
Exhaustive allocation - go through combinatorial options - very expensive but occasional improvement
Re-materialisation - if easy to recreate a value do so rather than spill
Passive splitting using a containment graph to make spills effective
Linear scan - fast but weak; useful for JITs
Global register allocation: Ongoing work
Eisenbeis et al. examining optimality of combined register allocation and scheduling. Difficulty with general control-flow
Partitioned register sets complicate matters. Allocation can require insertion of code which in turn affects allocation. Leupers investigated use of genetic algorithms for TM series partitioned register sets.
New work by Fabrice Rastello and others. Chordal graphs reduce complexity
As latency increases see work in combined code generation, instruction scheduling and register allocation
Summary
Local Allocation - spill code
Global Allocation based on graph colouring
Techniques to reduce spill code