WCET-aware Register Allocation based on Integer-Linear Programming

Page 1: WCET-aware Register Allocation based on Integer-Linear Programming

ECRTS 2011

WCET-aware Register Allocation based on Integer-Linear Programming

Heiko Falk, Norman Schmitz, Florian Schmoll

TU Dortmund

Computer Science 12

Design Automation for Embedded Systems

Page 2: WCET-aware Register Allocation based on Integer-Linear Programming

Slide 2 / 18 © H. Falk | 2011-07-06 | ECRTS 2011

Outline

- Introduction
  - State of the Art in Compiler Design
  - Register Allocation
- Traditional ILP-based Register Allocation
  - ILP Model
  - Limitations
- WCET-aware Register Allocation using ILP
  - Model of the WCET
  - Model of Pipeline-Related Spill Costs
- Results
- Summary & Future Work

Page 3: WCET-aware Register Allocation based on Integer-Linear Programming

Current State of the Art in Compiler Design

Objective Function of Compiler Optimizations
- Usually reduction of Average-Case Execution Times (ACET): accelerate a “typical” execution of a program using “typical” input data
- No statements about WCETs possible

Optimization Strategy
- Naive: current compilers lack a precise ACET timing model
- Application of an optimization if “promising”
- Effect of optimizations on a program’s ACET fully unknown to the compiler itself
- ACET optimizations not useful for WCET minimization

Page 4: WCET-aware Register Allocation based on Integer-Linear Programming

Register Allocation

Goals
- Considered the most important compiler optimization
- Registers are the fastest and most efficient memories
- Register allocation should make optimal use of registers

Tasks
- Assembly code before register allocation: virtual registers (VREGs)
- Map all (potentially many) VREGs to the (usually few) physical registers (PHREGs) of a processor
- Insert memory loads and stores (spill code) whenever VREGs don’t fit into the register file

Page 5: WCET-aware Register Allocation based on Integer-Linear Programming

Well-Known Register Allocators

Graph Coloring
- De-facto standard approach nowadays
- Heuristics decide about allocation and spill code generation
- Fast approach of moderate complexity
- Spill heuristic might lead to poor code quality

Register Allocation via Integer-Linear Programming (ILP)
- Formal mathematical model of allocation and spilling
- Achieves minimal spill code overhead, i.e. minimizes the total number of spill instructions
- Relatively high complexity, but optimal quality

[P. Briggs, Register Allocation via Graph Coloring, 1992]

[D. W. Goodwin, K. D. Wilken, Optimal and Near-optimal Global Register Allocation Using 0-1 Integer Programming, 1996]
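
To make the graph coloring idea concrete, here is a minimal Python sketch of a simplify/select allocator on an interference graph. It is not the Briggs allocator itself: the function name `color_registers`, the input format, and the "spill the node with the most neighbours" heuristic are assumptions chosen for illustration only.

```python
def color_registers(interference, num_phregs):
    """Simplify/select coloring of an interference graph (illustrative sketch).

    interference: dict mapping each VREG to the set of VREGs it interferes
                  with (must be symmetric).
    Returns (assignment, spilled): VREG -> PHREG index, and the spilled VREGs.
    """
    graph = {v: set(ns) for v, ns in interference.items()}
    stack, spilled = [], []

    # Simplify: remove nodes with fewer than num_phregs remaining neighbours.
    while graph:
        candidates = [v for v in graph if len(graph[v]) < num_phregs]
        if candidates:
            node = min(candidates)  # deterministic tie-break
            stack.append(node)
        else:
            # Hypothetical spill heuristic: spill the node with most neighbours.
            node = max(graph, key=lambda v: (len(graph[v]), v))
            spilled.append(node)
        for neighbour in graph.pop(node):
            graph[neighbour].discard(node)

    # Select: pop nodes and pick the lowest PHREG not used by a neighbour.
    assignment = {}
    while stack:
        node = stack.pop()
        used = {assignment[n] for n in interference[node] if n in assignment}
        assignment[node] = min(r for r in range(num_phregs) if r not in used)
    return assignment, spilled


example = {
    "v0": {"v1", "v2"},
    "v1": {"v0", "v2"},
    "v2": {"v0", "v1", "v3"},
    "v3": {"v2"},
}
assignment, spilled = color_registers(example, num_phregs=2)
```

With two PHREGs, the triangle v0/v1/v2 cannot be colored, so the heuristic picks one VREG to spill without knowing whether that spill code ends up on a time-critical path.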

Page 6: WCET-aware Register Allocation based on Integer-Linear Programming

Traditional ILP-based Register Allocation

Spilling decisions

Constraints: guarantee correctness of allocation and spilling decisions, e.g. ensure that each VREG is assigned to at least one PHREG, that at most one VREG can be assigned to a single PHREG, ...

Allocation decisions: decision variables map VREGs to PHREGs

Page 7: WCET-aware Register Allocation based on Integer-Linear Programming

Traditional ILP-based Register Allocation

Objective Function
- Minimizes spill code-related overhead
- Under the assumption: each spill instruction contributes the same constant amount to the objective function
- Example: minimization of spill-related code size
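
The constant-weight objective can be illustrated with a very small Python sketch. It is a strong simplification of the 0-1 model: each VREG is either kept in one PHREG for its whole lifetime or spilled entirely, and exhaustive enumeration stands in for an ILP solver. The function name, the input format, and the per-VREG spill instruction counts are all assumptions.

```python
from itertools import product


def min_spill_allocation(vregs, interference, num_phregs, spill_instrs):
    """Enumerate all allocations and minimise the number of spill instructions.

    vregs:        list of VREG names
    interference: list of (a, b) pairs of VREGs that are live simultaneously
    spill_instrs: VREG -> number of spill loads/stores needed if it is spilled
    Each VREG is mapped to a PHREG index or to None (spilled).
    """
    choices = [None] + list(range(num_phregs))
    best_cost, best_alloc = None, None
    for combo in product(choices, repeat=len(vregs)):
        alloc = dict(zip(vregs, combo))
        # Constraint: interfering VREGs must not share a PHREG.
        if any(alloc[a] is not None and alloc[a] == alloc[b]
               for a, b in interference):
            continue
        # Objective: every spill instruction has the same constant weight 1.
        cost = sum(spill_instrs[v] for v in vregs if alloc[v] is None)
        if best_cost is None or cost < best_cost:
            best_cost, best_alloc = cost, alloc
    return best_cost, best_alloc


cost, alloc = min_spill_allocation(
    ["a", "b", "c"],
    [("a", "b"), ("b", "c"), ("a", "c")],
    num_phregs=2,
    spill_instrs={"a": 4, "b": 2, "c": 3},
)
# cost == 2: spilling b needs the fewest spill instructions
```

The objective only counts instructions. It cannot tell whether the two spill instructions for `b` sit in a hot loop on the longest path or in rarely executed code.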

Page 8: WCET-aware Register Allocation based on Integer-Linear Programming

WCET Minimization via ILP-based Allocation?

Limitation of the traditional approach
- Assumption: each spill instruction contributes the same constant amount to the objective function
- Assumption only holds for trivial objectives such as code size

Challenges
- How to model and minimize Worst-Case Execution Time (WCET) as a non-trivial objective?
- How to deal with complex processor pipelines executing spill instructions in parallel with other code?

Page 9: WCET-aware Register Allocation based on Integer-Linear Programming

Challenge 1: ILP Model of the WCET

The Worst-Case Execution Path (WCEP)
- WCET of a program = length of the program’s longest execution path (WCEP)
- WCET minimization: optimization of only those parts of a program lying on the WCEP
- Code optimization off the WCEP will not reduce the WCET

Only those spill-related decision variables that actually lie on the WCEP must contribute to the ILP’s objective function.

But: spilling decisions affect the WCET of basic blocks and thus the WCEP within a program.

How to model the WCEP via ILP depending on spill-related decision variables?

Page 10: WCET-aware Register Allocation based on Integer-Linear Programming

Spill Code-dependent Costs

Costs of a basic block: a cost variable models the WCET of the block depending on the WCET of potentially inserted spill code, i.e. the WCET of the block without any spill code, plus the WCET of all spill code inside the block.
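
One possible way to write this block cost as a formula is sketched below. The symbols are assumptions introduced here for illustration and are not the notation of the slide.

```latex
% Assumed notation (hypothetical, not taken from the slide):
%   c_b      cost variable of basic block b
%   C_b      WCET of b without any spill code (a constant from WCET analysis)
%   S_b      set of spill instructions that may be inserted into b
%   cost_s   WCET contribution of spill instruction s (0 if s is not generated)
\[
  c_b \;=\; C_b \;+\; \sum_{s \in S_b} \mathit{cost}_s
\]
```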

Page 11: WCET-aware Register Allocation based on Integer-Linear Programming

Intraprocedural Control Flow

Modeling of a function’s control flow:

Acyclic sub-graphs [figure: control flow graph with basic blocks A, B, C, D, E]
- Costs of A = WCET of the longest path starting at A

(Reducible) Loops [figure: control flow graph with basic blocks A, B, C, D, E; loop L consisting of B, C, D]
- Treat body of innermost loop like an acyclic sub-graph
- Fold loop
- Continue with next innermost loop
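
The two steps, longest path in an acyclic sub-graph and loop folding, can be sketched in Python as follows. The edge sets, block costs, and the loop iteration bound are made-up example values, and multiplying the body's longest path by an iteration bound is an assumption about how a folded loop is costed.

```python
from functools import lru_cache


def longest_path_wcet(succs, cost, entry):
    """WCET of the longest path starting at `entry` in an acyclic CFG.

    succs: block -> list of successor blocks (back edges already removed)
    cost:  block -> WCET of that block, including its spill code
    """
    @lru_cache(maxsize=None)
    def w(block):
        return cost[block] + max((w(s) for s in succs.get(block, ())), default=0)

    return w(entry)


def fold_loop(body_succs, cost, header, max_iterations):
    """Collapse a loop into a single cost (assumed: bound times body WCET)."""
    return max_iterations * longest_path_wcet(body_succs, cost, header)


cost = {"A": 10, "B": 5, "C": 20, "D": 8, "E": 3}

# Acyclic sub-graph (assumed edges): A -> B, A -> C, B -> D, C -> D, D -> E
acyclic = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
wcet_acyclic = longest_path_wcet(acyclic, cost, "A")  # 41, via A, C, D, E

# Loop L = {B, C, D} (assumed body edges B -> C, B -> D), at most 10 iterations
loop_cost = fold_loop({"B": ["C", "D"]}, cost, "B", max_iterations=10)  # 250

# After folding, L behaves like a single block in the enclosing graph.
folded = {"A": ["L"], "L": ["E"]}
wcet_folded = longest_path_wcet(folded, {"A": 10, "L": loop_cost, "E": 3}, "A")  # 263
```

Because `cost` already includes spill code, inserting a spill into C lengthens the path through C and can change which path is the longest one.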

Page 12: WCET-aware Register Allocation based on Integer-Linear Programming

Objective Function

WCET of entire function:
- Each function has a dedicated entry block
- A variable models the WCET of the longest path within the function starting at its entry block
- A variable models the WCET of the entire function
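
A common way to express "longest path" with linear constraints is one inequality per control flow edge, since a maximum is not linear by itself. The following is a sketch of that idiom under assumed notation. It is not necessarily the exact formulation used in the talk, and function calls are left out.

```latex
% Assumed notation (hypothetical, not taken from the slide):
%   w_b   WCET of the longest path starting at basic block b
%   c_b   cost of b including its spill code
%   w_F   WCET of function F, whose entry block is b_F^{entry}
\begin{align*}
  w_b &\ge c_b + w_{b'} && \text{for every successor } b' \text{ of } b \\
  w_b &= c_b            && \text{if } b \text{ has no successor} \\
  w_F &= w_{b_F^{\mathrm{entry}}}
\end{align*}
```

If the ILP minimizes the WCET variable of the function, the inequalities along the longest path become tight, so the variable settles at the length of the WCEP for the chosen spilling decisions.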

Page 13: WCET-aware Register Allocation based on Integer-Linear Programming

Challenge 2: Pipeline-Related Spill Costs

Example: The Infineon TriCore Pipelines
- Integer I-Pipeline: executes usual integer ALU instructions
- Load/Store LS-Pipeline: executes memory loads/stores and address arithmetic
- Ideal case: one I- and one LS-instruction executed in parallel within the same clock cycle
- However...

add d0,d1,d2;   # d0 = d1 + d2    (I-instruction)
ld d0,[a0];     # d0 = mem[a0]    (LS-instruction)

WAW hazard (write after write): stalled by 1 cycle

(Some even more subtle cases of the TriCore pipelines omitted here…)

Page 14: WCET-aware Register Allocation based on Integer-Linear Programming

ILP Example for Costs of Spill Instruction s

Case 1
If i is an LS-instruction: s costs 1 cycle if s is actually generated.

st [a1],d1;     # i: mem[a1] = d1
ld d0,[a0];     # s: d0 = mem[a0]

Case 2
If s is a spill-load and i is an I-instruction: s costs 1 cycle if s is actually generated and a WAW hazard between i and s exists via a PHREG.

add d0,d1,d2;   # i: d0 = d1 + d2
ld d0,[a0];     # s: d0 = mem[a0]
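
The two cases can be written as a small decision function. This Python sketch only covers the cases named on the slide, where i is the instruction preceding the spill instruction s. The function name, its parameters, and the assumption that s costs 0 extra cycles in every other situation (ideal dual issue) are illustrative choices. The more subtle TriCore cases are not modeled.

```python
def spill_cost_cycles(prev_kind, spill_kind, generated, waw_hazard):
    """Cycles attributed to spill instruction s that follows instruction i.

    prev_kind:  "I" or "LS", the pipeline that executes i
    spill_kind: "load" or "store"
    generated:  True if the allocator actually emits s
    waw_hazard: True if i and s write the same PHREG
    """
    if not generated:
        return 0  # s does not exist, so it costs nothing
    if prev_kind == "LS":
        return 1  # Case 1: i and s both need the LS pipeline
    if prev_kind == "I" and spill_kind == "load" and waw_hazard:
        return 1  # Case 2: WAW hazard stalls the spill-load by one cycle
    return 0      # assumed ideal case: s issues in parallel with i


# st [a1],d1 followed by ld d0,[a0]: Case 1
case1 = spill_cost_cycles("LS", "load", generated=True, waw_hazard=False)
# add d0,d1,d2 followed by ld d0,[a0]: Case 2, both write d0
case2 = spill_cost_cycles("I", "load", generated=True, waw_hazard=True)
```

In an ILP, the conditions `generated` and `waw_hazard` would be expressed through the spilling and allocation decision variables, which is why the cost of s depends on which PHREG is chosen.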

Page 15: WCET-aware Register Allocation based on Integer-Linear Programming

Results – Worst-Case Execution Times

Target Processor: TriCore TC1796
Compiler: WCC at optimization level -O3 (42 optimizations)
100%: WCET_EST using Graph Coloring

[Bar chart: relative WCET_EST [%], axis from 0% to 110%, series WCET-ILP and WCET-GC; annotations in the chart: 98%, 19%, 80%, x2]

[H. Falk, WCET-aware Register Allocation based on Graph Coloring, DAC 2009]

Page 16: WCET-aware Register Allocation based on Integer-Linear Programming

Results – Average-Case Execution Times

Target Processor: TriCore TC1796
Compiler: WCC at optimization level -O3 (42 optimizations)
100%: ACET using Graph Coloring

[Bar chart: relative ACET [%], axis from 0% to 110%, series WCET-ILP and WCET-GC]

Page 17: WCET-aware Register Allocation based on Integer-Linear Programming

Results – CPU Runtimes

ILP-based Allocator
- Runtimes range from 1 CPU second to 54:08 CPU minutes
- Including WCET analysis and ILP solver
- Average runtime for 55 benchmarks: 3:33 CPU minutes

WCET-aware Graph Coloring
- Average runtime for 55 benchmarks: 4:13 CPU minutes
- Reason: performs a costly WCET analysis after register allocation for each individual basic block

Page 18: WCET-aware Register Allocation based on Integer-Linear Programming

Summary & Future Work

Summary
- Current state of the art: compilers are unaware of timing, naive optimization strategies
- Standard register allocators are unaware of worst-case properties
- May thus lead to spill code generation along the WCEP
- WCET-aware ILP-based register allocation: sophisticated models of WCET and pipeline-related spill costs
- Average WCET reduction over 55 benchmarks: 20.2%
- Outperforms WCET-aware graph coloring by a factor of 2

Future Work
- Reduce runtimes of the ILP-based register allocator
- Improve code quality further by integrating rematerialization