Post on 20-Dec-2015
Intermediate representation
• Goals:
  – encode knowledge about the program
  – facilitate analysis
  – facilitate retargeting
  – facilitate optimization
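[Figure: compiler pipeline with the phases scanning, parsing, semantic analysis, intermediate code gen., optim, and code gen.; HIR and LIR label the representations passed between the later phases]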
• Components
  – code representation
  – symbol table
  – analysis information
  – string table
• Issues
  – Use an existing IR or design a new one?
  – How close should it be to the source/target?
IR selection
• Using an existing IR
  – cost savings due to reuse
  – it must be expressive and appropriate for the compiler operations
• Designing an IR
  – decide how close to machine code it should be
  – decide how expressive it should be
  – decide its structure
  – consider combining different kinds of IRs
IR classification: Level
• High-level
  – closer to source language
  – used early in the process
  – usually converted to lower form later on
  – Example: AST
• Medium-level
  – tries to reflect the range of features in the source language in a language-independent way
  – most optimizations are performed at this level
    • algebraic simplification
    • copy propagation
    • dead-code elimination
    • common subexpression elimination
    • loop-invariant code motion
    • etc.
• Low-level
  – very close to target-machine instructions
  – architecture dependent
  – useful for several optimizations
    • loop unrolling
    • branch scheduling
    • instruction/data prefetching
    • register allocation
    • etc.
• Example: the same loop at two levels
  High-level:
      for i := op1 to op2 step op3
          instructions
      endfor
  Medium-level (the step op3 may be negative, so there is one loop per direction):
          i := op1
          if op3 < 0 goto L2
      L1: if i > op2 goto L3
          instructions
          i := i + op3
          goto L1
      L2: if i < op2 goto L3
          instructions
          i := i + op3
          goto L2
      L3:
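The lowering from the high-level loop to the medium-level form can be sketched as a small code generator. This is only an illustration: the function name `lower_for`, the string-based instruction format, and the default label names are assumptions made for this sketch, not part of the slides.

```python
def lower_for(var, start, stop, step, body, labels=("L1", "L2", "L3")):
    """Lower 'for var := start to stop step step' into branch-based code.

    All operands are strings and body is a list of strings; the label
    names are placeholders chosen for this sketch.
    """
    up, down, done = labels
    increment = f"{var} := {var} + {step}"
    code = [f"{var} := {start}", f"if {step} < 0 goto {down}"]
    # Ascending case: exit once var has moved past stop.
    code += [f"{up}: if {var} > {stop} goto {done}", *body, increment, f"goto {up}"]
    # Descending case: the exit test is mirrored.
    code += [f"{down}: if {var} < {stop} goto {done}", *body, increment, f"goto {down}"]
    code.append(f"{done}:")
    return code


for line in lower_for("i", "op1", "op2", "op3", ["instructions"]):
    print(line)
```

Note that the loop body is emitted twice, once per direction, which is the price of not knowing the sign of the step at compile time.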
IR classification: Structure
• Graphical
  – trees, graphs
  – not easy to rearrange
  – large structures
• Linear
  – looks like pseudocode
  – easy to rearrange
• Hybrid
  – combine graphical and linear IRs
  – Example:
    • low-level linear IR for basic blocks, and
    • graph to represent flow of control
(Basic blocks)
• Basic block = a sequence of consecutive statements in which flow of control enters only at the beginning and leaves only at the end, with no halt and no possibility of branching except at the end.
• Partitioning a sequence of statements into BBs
  – Determine leaders (first statements of BBs)
    • the first statement is a leader
    • the target of a conditional or unconditional branch is a leader
    • a statement immediately following a branch is a leader
  – For each leader, its basic block consists of the leader and all the statements up to but not including the next leader (or the end of the program).
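The leader rules above can be sketched in a few lines. The instruction syntax (`LABEL: ...` prefixes and `goto LABEL` branches written as plain strings) and the helper names are assumptions made for this sketch.

```python
import re


def find_leaders(code):
    """Return the sorted indices of the leaders in a list of instruction strings."""
    # Map each label to the index of the instruction it marks.
    # The (?!=) keeps 't1:= ...' from being mistaken for a label.
    labels = {}
    for idx, instr in enumerate(code):
        m = re.match(r"\s*(\w+):(?!=)", instr)
        if m:
            labels[m.group(1)] = idx

    leaders = {0} if code else set()          # rule 1: the first statement
    for idx, instr in enumerate(code):
        m = re.search(r"\bgoto\s+(\w+)", instr)
        if m:
            leaders.add(labels[m.group(1)])   # rule 2: the target of a branch
            if idx + 1 < len(code):
                leaders.add(idx + 1)          # rule 3: the statement after a branch
    return sorted(leaders)


def basic_blocks(code):
    """Split code into blocks, each running from a leader up to the next leader."""
    bounds = find_leaders(code) + [len(code)]
    return [code[a:b] for a, b in zip(bounds, bounds[1:])]


loop = [
    "i := op1",
    "if op3 < 0 goto L2",
    "L1: if i > op2 goto L3",
    "instructions",
    "i := i + op3",
    "goto L1",
    "L2: if i < op2 goto L3",
    "instructions",
    "i := i + op3",
    "goto L2",
    "L3: halt",
]
for block in basic_blocks(loop):
    print(block)
```

The sketch treats any instruction containing `goto` as a branch; a real IR would carry an explicit opcode instead of relying on text matching.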
Linear IRs
• Sequence of instructions that execute in order of appearance
• Control flow is represented by conditional branches and jumps
• Common representations
  – stack machine code
  – three-address code
• stack machine code
  – assumes presence of operand stack
  – useful for stack architectures, JVM
  – operations typically pop operands and push results.
  – advantages
    • easy code generation
    • compact form
  – disadvantages
    • difficult to rearrange
    • difficult to reuse expressions
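A minimal sketch of why code generation for a stack machine is easy: a post-order walk of the expression tree emits the code directly, with no temporaries to name. The tuple-based tree format, the instruction names (`push`, `add`, `sub`, `mul`), and both function names are assumptions made for this sketch, not a real instruction set.

```python
import operator

# Hypothetical instruction set for this sketch.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def gen_stack_code(expr):
    """expr is a variable name, a number, or a tuple (op, left, right)."""
    if isinstance(expr, tuple):
        op, left, right = expr
        # Post-order: operands first, then the operator that consumes them.
        return gen_stack_code(left) + gen_stack_code(right) + [(op,)]
    return [("push", expr)]


def run_stack_code(code, env):
    """Interpret the code with an operand stack; env maps variable names to values."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            operand = instr[1]
            stack.append(env[operand] if isinstance(operand, str) else operand)
        else:
            right = stack.pop()   # operands come off in reverse order
            left = stack.pop()
            stack.append(OPS[instr[0]](left, right))
    return stack.pop()


code = gen_stack_code(("mul", ("add", "a", 2), "b"))   # (a + 2) * b
print(code)
print(run_stack_code(code, {"a": 3, "b": 4}))
```

The same example hints at the disadvantages listed above: the value of `a + 2` lives only on the stack, so it cannot be reused later without recomputing it or adding duplicate/store instructions.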
• three-address code
  – compact
  – generates temp variables
  – level of abstraction may vary
  – loses syntactic structure
  – quadruples
    • operator
    • up to two operands
    • destination
  – triples
    • similar to quadruples but the results are not named explicitly (index of operation is implicit name)
  – Implement as table, array of pointers, or list
• Example:
  Quadruples:
      L1: i := 2
          t1 := i + 1
          t2 := t1 > 0
          if t2 goto L1
  Triples:
      (1) 2
      (2) i st (1)
      (3) i + 1
      (4) (3) > 0
      (5) if (4), (1)
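The difference between the two encodings can be sketched by generating both from the same expression tree. The tuple layouts, the `t1, t2, ...` temporary names, and the `"(n)"` reference strings are assumptions made for this sketch; they mirror the notation of the example above rather than any particular compiler.

```python
from itertools import count


def to_quadruples(expr, quads, temps):
    """Append (op, arg1, arg2, dest) tuples; return the name holding expr's value."""
    if not isinstance(expr, tuple):
        return expr                      # a variable or constant names itself
    op, left, right = expr
    a = to_quadruples(left, quads, temps)
    b = to_quadruples(right, quads, temps)
    dest = f"t{next(temps)}"             # every result gets an explicit temporary
    quads.append((op, a, b, dest))
    return dest


def to_triples(expr, triples):
    """Append (op, arg1, arg2) tuples; return a reference to expr's value."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    a = to_triples(left, triples)
    b = to_triples(right, triples)
    triples.append((op, a, b))
    return f"({len(triples)})"           # the 1-based position is the implicit name


expr = (">", ("+", "i", 1), 0)           # i + 1 > 0

quads = []
print(to_quadruples(expr, quads, count(1)), quads)

triples = []
print(to_triples(expr, triples), triples)
```

Because triples refer to each other by position, reordering them means rewriting those references, whereas quadruples can be moved freely as long as the named temporaries keep their meaning.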
Graphical IRs
• Parse tree
• Abstract syntax tree
  – high-level
  – useful for source-level information
  – retains syntactic structure
  – Common uses
    • source-to-source translation
    • semantic analysis
    • syntax-directed editors
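• Tree, for basic block
  – root: operator
  – up to two children: operands
  – can be combined
• Uses:
  – algebraic simplifications
  – may generate locally optimal code.
• Example:
      L1: i := 2
          t1 := i + 1
          t2 := t1 > 0
          if t2 goto L1
  [Figure: one tree per statement: (assgn, i) with child 2; (add, t1) with children i and 1; (gt, t2) with children t1 and 0. Combined into a single tree: (gt, t2) with children (add, t1) and 0, where (add, t1) has children (assgn, i) and 1, and (assgn, i) has child 2.]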
• Directed acyclic graphs (DAGs)
  – Like compressed trees
    • leaves: variables, constants available on entry
    • internal nodes: operators
      – annotated with variable names
    • distinct left/right children
  – Used for basic blocks (doesn't show control flow)
  – Can generate efficient code.
    • Note: DAGs encode common expressions
  – But difficult to transform
  – Better for analysis
• Generating DAGs
  – check whether an operand is already present
    • if not, create a leaf for it
  – check whether there is a parent of the operand that represents the same operation
    • if not, create one
    • in either case, label the node representing the result with the name of the destination variable, and remove that label from all other nodes in the DAG.
• DAG example:
      m := 2 * y * z
      n := 3 * y * z
      p := 2 * y - z
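The DAG-generation steps can be sketched with a table that maps each (operator, left, right) combination to its node, so a repeated operation finds the existing node instead of creating a new one. The class name `DagBuilder`, its fields, and the temporaries `t1` to `t3` (used to split the example into two-operand statements, assuming left-to-right evaluation) are assumptions made for this sketch.

```python
class DagBuilder:
    """Builds a DAG for one basic block from three-address statements."""

    def __init__(self):
        self.nodes = []     # node id -> ("leaf", value) or (op, left id, right id)
        self.index = {}     # node description -> node id, to find existing nodes
        self.current = {}   # variable -> id of the node holding its current value
        self.labels = {}    # node id -> set of variable names attached to it

    def _node(self, key):
        if key not in self.index:            # create the node only if it is missing
            self.index[key] = len(self.nodes)
            self.labels[len(self.nodes)] = set()
            self.nodes.append(key)
        return self.index[key]

    def operand(self, x):
        # A variable assigned earlier in the block refers to its defining node;
        # anything else is a leaf (a constant or a value available on entry).
        if x in self.current:
            return self.current[x]
        return self._node(("leaf", x))

    def assign(self, dest, op, a, b):
        node = self._node((op, self.operand(a), self.operand(b)))
        for names in self.labels.values():   # dest no longer names any older node
            names.discard(dest)
        self.labels[node].add(dest)
        self.current[dest] = node
        return node


dag = DagBuilder()
dag.assign("t1", "*", 2, "y")
dag.assign("m", "*", "t1", "z")      # m := 2 * y * z
dag.assign("t2", "*", 3, "y")
dag.assign("n", "*", "t2", "z")      # n := 3 * y * z
dag.assign("t3", "*", 2, "y")
dag.assign("p", "-", "t3", "z")      # p := 2 * y - z
print(len(dag.nodes), dag.labels[dag.current["t1"]])
```

In this run the second `2 * y` reuses the node built for the first one, which is how the DAG exposes the common subexpression. The sketch does not treat `y * 2` and `2 * y` as the same operation, in line with the distinct left/right children noted above.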
• Control flow graphs (CFGs)
  – Each node corresponds to
    • a basic block, or
      – fewer nodes
      – may need to determine facts at specific points within BB
    • a single statement
      – more space and time
  – Each edge represents flow of control
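A minimal sketch of building a CFG whose nodes are basic blocks: the edges come from the last instruction of each block (its branch target, plus the fall-through successor unless the block ends in an unconditional jump). The string instruction syntax and the function name `build_cfg` are assumptions made for this sketch.

```python
import re


def build_cfg(blocks):
    """blocks: list of basic blocks, each a non-empty list of instruction strings.

    Returns {block index: sorted list of successor block indices}.
    """
    # A block is found by the label on its first instruction, if it has one.
    label_to_block = {}
    for idx, block in enumerate(blocks):
        m = re.match(r"\s*(\w+):(?!=)", block[0])
        if m:
            label_to_block[m.group(1)] = idx

    cfg = {}
    for idx, block in enumerate(blocks):
        last = block[-1]
        succs = set()
        jump = re.search(r"\bgoto\s+(\w+)", last)
        if jump:
            succs.add(label_to_block[jump.group(1)])      # taken-branch edge
        conditional = re.search(r"\bif\b", last) is not None
        # Fall through unless the block ends in an unconditional jump.
        if (jump is None or conditional) and idx + 1 < len(blocks):
            succs.add(idx + 1)
        cfg[idx] = sorted(succs)
    return cfg


blocks = [
    ["i := op1", "if op3 < 0 goto L2"],
    ["L1: if i > op2 goto L3"],
    ["instructions", "i := i + op3", "goto L1"],
    ["L2: if i < op2 goto L3"],
    ["instructions", "i := i + op3", "goto L2"],
    ["L3: halt"],
]
print(build_cfg(blocks))
```

With one node per statement instead, the same loop would need eleven nodes rather than six, which illustrates the space trade-off listed above.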
• Dependence graphs
  – Encode flow of values from definition to use
  – Nodes represent operations
  – Edges connect definitions to uses
  – Graph represents constraints on the sequencing of operations
  – Built for specific optimizations, then discarded
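For straight-line code, the definition-to-use edges can be collected in one pass by remembering which operation last defined each name. This sketch records flow (definition-to-use) dependences only; the `(dest, operands)` statement format and the function name `flow_dependences` are assumptions made for illustration.

```python
def flow_dependences(ops):
    """ops: list of (dest, operand names). Returns a set of (def index, use index) edges."""
    last_def = {}   # variable -> index of the operation that most recently defined it
    edges = set()
    for idx, (dest, operands) in enumerate(ops):
        for name in operands:
            if name in last_def:              # values defined outside the block have no edge
                edges.add((last_def[name], idx))
        last_def[dest] = idx
    return edges


ops = [
    ("a", ["x"]),        # 0: a := load x
    ("b", ["a"]),        # 1: b := a + 1
    ("c", ["a"]),        # 2: c := a * 2
    ("d", ["b", "c"]),   # 3: d := b + c
]
print(sorted(flow_dependences(ops)))
```

Operations 1 and 2 both depend on operation 0 but not on each other, so a scheduler is free to run them in either order; that is the kind of sequencing constraint the graph encodes.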
SSA form
• Static Single Assignment Form
  – Encodes information about data and control flow
  – Two constraints:
    • each definition has a unique name
    • each use refers to a single definition
  – all uses reached by a definition are renamed
  – Example:
        x := 5                  x0 := 5
        x := x+1     becomes    x1 := x0 + 1
        y := x * 2              y0 := x1 * 2
• What if we have a loop?
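Leaving loops and branches aside for a moment, renaming straight-line code into SSA form only needs a counter per variable. The `(dest, expression string)` statement format and the function name `rename_straight_line` are assumptions made for this sketch; it handles a single basic block and nothing more.

```python
import re


def rename_straight_line(stmts):
    """stmts: list of (dest, expression string). Returns the SSA-renamed statements."""
    version = {}   # variable -> subscript of its most recent definition
    renamed = []
    for dest, expr in stmts:
        def use(match):
            name = match.group(0)
            # A use refers to the latest definition; names never assigned
            # in this block (inputs) are left as they are.
            return f"{name}{version[name]}" if name in version else name

        new_expr = re.sub(r"[A-Za-z_]\w*", use, expr)   # rename uses first
        version[dest] = version.get(dest, -1) + 1       # then create a fresh name
        renamed.append((f"{dest}{version[dest]}", new_expr))
    return renamed


for dest, expr in rename_straight_line([("x", "5"), ("x", "x + 1"), ("y", "x * 2")]):
    print(f"{dest} := {expr}")
```

The uses on the right-hand side are renamed before the destination gets its new subscript, which is why `x := x+1` becomes `x1 := x0 + 1` rather than referring to itself.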
• The compiler inserts special join functions (called φ-functions) at points where different control flow paths meet.
• Example:
      read(x)                   read(x0)
      if (x>0)                  if (x0>0)
          y := 5                    y0 := 5
      else           becomes    else
          y := 10                   y1 := 10
      x := y                    y2 := φ(y0, y1)
                                x1 := y2
• Example 2:
      x := 0                       x0 := 0
      i := 1                       i0 := 1
      while (i<10)                 if (i0 >= 10) goto L2
          x := x+i             L1: x1 := φ(x0, x2)
          i := i+1                 i1 := φ(i0, i2)
                                   x2 := x1 + i1
                                   i2 := i1 + 1
                                   if (i2 < 10) goto L1
                               L2: x3 := φ(x0, x2)
                                   i3 := φ(i0, i2)
• Note: φ is not an executable function
• A program is in SSA form if
  – each variable is assigned a value in exactly one statement
  – each use of a variable is dominated by the definition.
    • point x dominates point y if every path from the start to y goes through x
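The dominance relation can be computed with a simple iterative scheme: start by assuming every node dominates every other, then repeatedly shrink each node's set to what all of its predecessors agree on. This is one common textbook formulation, not necessarily the one the course uses; the dictionary-based CFG format and the function name `dominators` are assumptions made for this sketch, and every node is assumed to appear as a key.

```python
def dominators(cfg, entry):
    """cfg: {node: list of successors}. Returns {node: set of nodes that dominate it}."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    dom = {n: set(nodes) for n in nodes}   # optimistic start: everything dominates
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            if preds[n]:
                # x dominates n if x is n itself or x dominates every predecessor of n.
                new = set.intersection(*(dom[p] for p in preds[n])) | {n}
            else:
                new = {n}                  # unreachable node (assumed not to occur)
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# CFG of an if/else: both branches meet again at 'join'.
cfg = {"entry": ["then", "else"], "then": ["join"], "else": ["join"], "join": []}
dom = dominators(cfg, "entry")
print(dom["join"])
```

Neither branch dominates the join point, so a definition made inside only one branch cannot by itself satisfy the dominance condition for a use after the join; the φ-function placed at the join provides the single dominating definition.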