code generation - sjtujiangli/teaching/cs308/cs308-slides09.pdf · issues in the design of a code...
CS308 Compiler Principles
Code Generation
Li Jiang
Department of Computer Science and Engineering
Shanghai Jiao Tong University
Compiler Principles
Background
• The final phase in our compiler model
• Requirements imposed on a code generator
  – Preserving the semantic meaning of the source program and producing code of high quality
  – Making effective use of the available resources of the target machine
  – The code generator itself must run efficiently.
• A code generator has three primary tasks:
  – Instruction selection, register allocation, and instruction ordering
Issues in the Design of a Code Generator
• The most important criterion for a code generator is that it produce correct code.
• Given the premium on correctness, a code generator is expected to be easily implemented, tested, and maintained.
Issues in the Design of a Code Generator
• Input to the Code Generator
• The Target Program
• Instruction Selection
• Register Allocation
• Evaluation Order
Input to the Code Generator
• The input to the code generator is
  – the intermediate representation of the source program produced by the front end
  – information in the symbol table
• Choices for the IR
  – Three-address representations, such as quadruples
  – Virtual machine representations, such as bytecodes
  – Linear representations, such as postfix notation
  – Graphical representations, such as syntax trees and DAGs
The Target Program
• The most common target-machine architectures are RISC, CISC, and stack-based.
  – A RISC machine typically has many registers, three-address instructions, simple addressing modes, and a relatively simple instruction-set architecture.
  – A CISC machine typically has few registers, two-address instructions, a variety of addressing modes, several register classes, variable-length instructions, and instructions with side effects.
  – In a stack-based machine, operations are done by pushing operands onto a stack and then performing the operations on the operands at the top of the stack.
The Target Program
• Use a very simple RISC-like computer as the target machine.
• Use assembly code as the target language:
  – generate symbolic instructions and
  – use the macro facilities of the assembler to help generate code
Instruction Selection
• The code generator must map the IR program into a code sequence that can be executed by the target machine.
• The complexity of the mapping is determined by factors such as
  – The level of the IR
  – The nature of the instruction-set architecture
  – The desired quality of the generated code
Instruction Selection Cont’d
• If the IR is high level, use code templates to translate each IR statement into a sequence of machine instructions.
  – Produces poor code that needs further optimization.
• If the IR reflects some of the low-level details of the underlying machine, the code generator can use this information to generate more efficient code sequences.
Instruction Selection Cont’d
• The nature of the instruction set of the target machine has a strong effect on the difficulty of instruction selection.
  – instruction speeds and machine idioms

x = y + z  ⇒  LD R0, y
              ADD R0, R0, z
              ST x, R0

a = b + c  ⇒  LD R0, b
d = a + e     ADD R0, R0, c
              ST a, R0
              LD R0, a
              ADD R0, R0, e
              ST d, R0
Instruction Selection Cont’d
• A given IR program can be implemented by many different code sequences, with significant cost differences.
• A naïve translation of the intermediate code may therefore lead to correct but unacceptably inefficient target code.
• For example, using INC for a = a + 1 instead of
    LD R0, a
    ADD R0, R0, #1
    ST a, R0
• We need to know instruction costs in order to design good code sequences.
Register Allocation
• A key problem in code generation is deciding what values to hold in which registers.
• Efficient utilization of registers is particularly important.
• The use of registers is often subdivided into two subproblems:
  1. Register allocation: select the set of variables that will reside in registers at each point in the program.
  2. Register assignment: pick the specific register that each variable will reside in.
Register Allocation Example
Evaluation Order
• The order in which computations are performed can affect the efficiency of the target code.
• Some computation orders require fewer registers to hold intermediate results than others.
• However, picking the best order in the general case is an NP-complete problem.
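For expression trees (as opposed to general DAGs, where the problem is NP-complete), the minimal register need can be computed by the classical Sethi-Ullman labeling. A minimal sketch, assuming a hypothetical binary `Node` class not present in the slides:

```python
# Sethi-Ullman labeling: minimum registers needed to evaluate a binary
# expression tree without spilling. Node is a hypothetical helper class.
class Node:
    def __init__(self, op, left=None, right=None):
        self.op, self.left, self.right = op, left, right

def regs_needed(n):
    """Return the minimum number of registers to evaluate the tree at n."""
    if n.left is None and n.right is None:
        return 1                                 # a leaf is loaded into one register
    l, r = regs_needed(n.left), regs_needed(n.right)
    # if both subtrees need k registers, one result must be held while the
    # other subtree is evaluated, so k + 1 are needed overall
    return l + 1 if l == r else max(l, r)
```

Evaluating the subtree with the larger label first achieves this minimum.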
The Target Language
• We shall use as a target language the assembly code for a simple computer that is representative of many register machines.
A Simple Target Machine Model
• Our target computer models a three-address machine with load and store operations, computation operations, jump operations, and conditional jumps.
• The underlying computer is a byte-addressable machine with n general-purpose registers.
• Assume the following kinds of instructions are available:
  – Load operations: LD dst, addr
  – Store operations: ST x, r
  – Computation operations: OP dst, src1, src2
  – Unconditional jumps: BR L
  – Conditional jumps: Bcond r, L
A Simple Target Machine Model Cont’d
• Addressing modes:
  – A variable name x refers to the memory location that is reserved for x
  – Indexed address a(r), where a is a variable and r is a register
  – A memory location represented by an integer indexed by a register
    • for example, LD R1, 100(R2)
  – Two indirect addressing modes:
    • *r: the memory location found in the location represented by the contents of register r
    • *100(r): the memory location found in the location obtained by adding 100 to the contents of r
  – Immediate constant addressing mode: #n
A Simple Target Machine Model Example

x = y - z    ⇒  LD R1, y
                LD R2, z
                SUB R1, R1, R2
                ST x, R1

b = a[i]     ⇒  LD R1, i
                MUL R1, R1, 8
                LD R2, a(R1)
                ST b, R2

a[j] = c     ⇒  LD R1, c
                LD R2, j
                MUL R2, R2, 8
                ST a(R2), R1

x = *p       ⇒  LD R1, p
                LD R2, 0(R1)
                ST x, R2

*p = y       ⇒  LD R1, p
                LD R2, y
                ST 0(R1), R2

if x < y goto L  ⇒  LD R1, x
                    LD R2, y
                    SUB R1, R1, R2
                    BLTZ R1, L
Program and Instruction Costs
• For simplicity, we take the cost of an instruction to be one plus the costs associated with the addressing modes of the operands.
• Addressing modes involving registers have zero additional cost, while those involving a memory location or a constant have an additional cost of one.
• For example,
  – LD R0, R1        cost = 1
  – LD R0, M         cost = 2
  – LD R1, *100(R2)  cost = 3
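The cost model above can be sketched as a small calculator. This is a simplified reading: it handles only direct operands (register names like R0, memory names, and #constants); indirect modes such as *100(R2) are left out, since their extra cost depends on convention.

```python
# Instruction cost per the model above: one, plus one for each operand
# that names a memory location or an immediate constant.
# Assumption of this sketch: registers look like R0, R1, ...; everything
# else is a memory name or #constant; indirect modes are not handled.
def operand_cost(op):
    if op.startswith('R') and op[1:].isdigit():
        return 0                      # register operand: no extra cost
    return 1                          # memory location or #constant

def instruction_cost(instr):
    mnemonic, *operands = instr.replace(',', ' ').split()
    return 1 + sum(operand_cost(o) for o in operands)
```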
Addresses in the Target Code
• A program runs in its own logical address space that is partitioned into four code and data areas:
  1. A statically determined area Code that holds the executable target code.
  2. A statically determined data area Static, for holding global constants and other data generated by the compiler.
  3. A dynamically managed area Heap for holding data objects that are allocated and freed during program execution.
  4. A dynamically managed area Stack for holding activation records as they are created and destroyed during procedure calls and returns.
Static Allocation for Procedure
• Focus on the following three-address statements:
  – call callee
  – return
  – halt
  – action  // placeholder
Static Allocation for Procedure
• We assume the first location in the activation record holds the return address.
• call callee
    ST callee.staticArea, #here + 20
    BR callee.codeArea
• return
    BR *callee.staticArea

Note: #here points to the current instruction; the calling sequence occupies 3 constants + 2 instructions = 5 words (20 bytes), so #here + 20 is the address of the instruction following the call.
Static Allocation Example
    // code for c
    action1
    call p
    action2
    halt
    // code for p
    action3
    return
Stack Allocation for Procedure
• Static allocation can become stack allocation by using relative addresses for storage in activation records.
• In stack allocation, the position of an activation record is usually stored in a register and determined at run time.
• Words in the activation record can be accessed as offsets from the value in this register.
  – Maintain a register SP pointing to the beginning of the activation record on top of the stack
  – When a procedure call occurs, the calling procedure increases SP and transfers control to the called procedure
  – After control returns to the caller, it decreases SP.
Stack Allocation
• The first procedure initializes the stack by setting SP to the start of the stack area:
    LD SP, #stackStart              // initialize the stack
    code for the first procedure
    HALT                            // terminate execution
• Procedure call:
    ADD SP, SP, #caller.recordSize  // increase stack pointer
    ST 0(SP), #here+16              // save return address
    BR callee.codeArea              // jump to the callee
• Return:
  – Callee: BR *0(SP)
  – Caller: SUB SP, SP, #caller.recordSize
Runtime Addresses for Names
• Assumption: a name in a three-address statement is really a pointer to a symbol-table entry for that name.
• Note that names must eventually be replaced by code to access storage locations.
• Example: x = 0
  – suppose the symbol-table entry for x contains a relative address 12
  – x is in a statically allocated area beginning at address static
  – the actual run-time address of x is static + 12
  – the actual assignment: static[12] = 0
  – for a static area starting at address 100: LD 112, #0
After discussing code generation for procedures, we now focus on the code inside each procedure.
Basic Blocks and Flow Graphs
• A graph representation of intermediate code that is helpful for discussing code generation
  – Partition the intermediate code into basic blocks
    • Basic blocks are maximal sequences of consecutive three-address instructions
    • The flow of control can only enter/leave a basic block through its first/last instruction
  – The basic blocks become the nodes of a flow graph, whose edges indicate which blocks can follow which other blocks.

How do we find the basic blocks?
Basic Block Partitioning Algorithm
• First, determine leader instructions:
  – The first three-address instruction in the intermediate code is a leader.
  – Any instruction that is the target of a conditional or unconditional jump is a leader.
  – Any instruction that immediately follows a conditional or unconditional jump is a leader.
• Next, for each leader, its basic block consists of the leader itself and all instructions up to, but not including, the next leader or the end of the intermediate program.
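The two steps above can be sketched as follows, with instructions modeled as dicts whose hypothetical 'jump_target' field holds the index a (conditional or unconditional) jump goes to; this is a simplification of real three-address code with labels.

```python
# Partition a list of instructions into basic blocks.
def basic_blocks(instrs):
    leaders = {0}                                   # rule 1: first instruction
    for i, ins in enumerate(instrs):
        if ins.get('jump_target') is not None:
            leaders.add(ins['jump_target'])         # rule 2: jump target
            if i + 1 < len(instrs):
                leaders.add(i + 1)                  # rule 3: follows a jump
    # each block runs from one leader up to (not including) the next
    bounds = sorted(leaders) + [len(instrs)]
    return [instrs[bounds[k]:bounds[k + 1]] for k in range(len(bounds) - 1)]
```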
Basic Block Partitioning Example (figure; leaders marked)
Flow Graphs
• The nodes of the flow graph are the basic blocks.
• There is an edge from block B to block C if and only if it is possible for the first instruction in block C to immediately follow the last instruction in block B.
• There are two ways that such an edge can be justified:
  – There is a conditional or unconditional jump from the end of B to the beginning of C.
  – C immediately follows B in the original order of the three-address instructions, and B does not end in an unconditional jump.

Basic blocks + control flow = flow graph
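The two edge rules above can be sketched on top of the block partition, again using simplified instruction dicts with hypothetical 'jump_target' and 'unconditional' fields (not the slides' notation):

```python
# Build flow-graph edges (B, C) between block indices.
def flow_edges(blocks):
    # map each global instruction index back to the block that owns it
    start, owner = 0, {}
    for b, blk in enumerate(blocks):
        for k in range(len(blk)):
            owner[start + k] = b
        start += len(blk)
    edges = set()
    for b, blk in enumerate(blocks):
        last = blk[-1]
        if last.get('jump_target') is not None:
            edges.add((b, owner[last['jump_target']]))   # jump from end of B
        if b + 1 < len(blocks) and not last.get('unconditional'):
            edges.add((b, b + 1))                        # fall-through edge
    return edges
```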
Flow Graph Example
Optimization of Basic Blocks
• Local optimization is performed within each basic block.
• Global optimization looks at how information flows among the basic blocks of a program.
• This chapter focuses on local optimization.
Next-Use Information
• The use of a name in a three-address statement:
  – Three-address statement i assigns a value to x
  – Statement j has x as an operand
  – Control can flow from statement i to j along a path that has no intervening assignments to x
  – Then statement j uses the value of x computed at i.
  – We say that x is live at statement i.
Determining the Liveness and Next-Use
• Start from the last statement and scan backwards to the beginning. At each statement i: x = y op z, do the following:
  1. Attach to i the information currently found in the symbol table regarding the next use and liveness of x, y, and z.
  2. In the symbol table, set x to "not live" and "no next use."
  3. In the symbol table, set y and z to "live" and the next uses of y and z to i.
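Steps 1-3 above can be sketched as a backward scan over one block. Statements are modeled as (x, y, z) triples for x = y op z, and the "symbol table" is a plain dict; both are assumptions of this sketch.

```python
# Backward next-use/liveness scan for one basic block.
def next_use(block, live_on_exit):
    table = {v: (True, None) for v in live_on_exit}    # name -> (live?, next use)
    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        x, y, z = block[i]
        # step 1: attach the table's current information to statement i
        info[i] = {n: table.get(n, (False, None)) for n in (x, y, z)}
        table[x] = (False, None)                       # step 2: x dead, no next use
        table[y] = (True, i)                           # step 3: operands live,
        table[z] = (True, i)                           #   next used at i
    return info
```

Note that step 2 runs before step 3, so a statement like x = x + 1 correctly leaves x live as an operand.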
DAG Representation of Basic Blocks
• Many important techniques for local optimization begin by transforming a basic block into a DAG (directed acyclic graph).
• Constructing a DAG for a basic block:
  1. There is a node in the DAG for each of the initial values of the variables appearing in the basic block.
  2. There is a node N associated with each statement s within the block. The children of N are those nodes corresponding to statements that are the last definitions, prior to s, of the operands used by s.
  3. Node N is labeled by the operator applied at s, and attached to N is the list of variables for which it is the last definition within the block.
  4. Certain nodes are designated output nodes. These are the nodes whose variables are live on exit from the block; that is, their values may be used later, in another block of the flow graph.

The DAG preserves dependency information!
DAG Representation of Basic Blocks
• The DAG representation of a basic block lets us perform several code improvements:
  – eliminating local common subexpressions (instructions that compute a value that has already been computed)
  – eliminating dead code (instructions that compute a value that is never used)
  – reordering statements to reduce the time a temporary value needs to be preserved in a register
  – applying algebraic laws to reorder operands of three-address instructions to simplify the computation
Finding Local Common Subexpressions
• Common subexpressions can be detected by checking whether there is an existing node N with the same children, in the same order, and with the same operator.

b = d   // b is live
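The check above can be sketched as DAG construction that reuses a node whenever an existing node has the same operator and the same children, in the same order. Statements are modeled as (dst, op, y, z) tuples and nodes as plain dicts; both are assumptions of this sketch.

```python
# Build a DAG for a block, reusing nodes for common subexpressions.
def build_dag(block):
    nodes, cur, key2node = [], {}, {}
    def leaf(name):
        # node for the initial value of a name (DAG construction rule 1)
        if name not in cur:
            n = {'op': name, 'kids': (), 'vars': [name]}
            nodes.append(n)
            cur[name] = n
        return cur[name]
    for dst, op, y, z in block:
        key = (op, id(leaf(y)), id(leaf(z)))           # operator + ordered children
        n = key2node.get(key)
        if n is None:                                  # no common subexpression
            n = {'op': op, 'kids': (cur[y], cur[z]), 'vars': []}
            nodes.append(n)
            key2node[key] = n
        old = cur.get(dst)                             # dst no longer labels its
        if old is not None and old is not n and dst in old['vars']:
            old['vars'].remove(dst)                    #   previous node
        if dst not in n['vars']:
            n['vars'].append(dst)                      # dst now labels this node
        cur[dst] = n
    return nodes
```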
Dead Code Elimination
• Delete from a DAG any root (node with no ancestors) that has no live variables attached.
• Repeated application of this transformation will remove from the DAG all nodes that correspond to dead code.
• Example: assume a and b are live but c and e are not.
  – e and then c can be deleted.
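The transformation above can be sketched directly on dict-based DAG nodes, with 'kids' for children and 'vars' for attached variables (an assumed representation, matching the DAG-construction sketch):

```python
# Repeatedly delete roots (nodes with no parents) with no live variables.
def eliminate_dead(nodes, live):
    changed = True
    while changed:
        changed = False
        has_parent = {id(k) for n in nodes for k in n['kids']}
        for n in list(nodes):
            if id(n) not in has_parent and not (set(n['vars']) & live):
                nodes.remove(n)                # dead root: delete it
                changed = True                 # its children may become roots
    return nodes
```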
The Use of Algebraic Identities
• Eliminate computations, e.g. x + 0 = x, x * 1 = x
• Reduction in strength, e.g. x^2 = x * x, 2 * x = x + x
• Constant folding
  – 2 * 3.14 = 6.28, evaluated at compile time
• Other algebraic transformations
  – x * y = y * x
  – x > y is equivalent to x - y > 0
  – a = b + c; e = c + d + b  ⇒  a = b + c; e = a + d
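A few of the identities above, sketched as rewrites on (op, y, z) triples. Constant operands are assumed to be normalized into the second position; the triple encoding is an assumption of this sketch, not the slides' notation.

```python
# Apply simple algebraic identities to one three-address operation.
def simplify(op, y, z):
    const = lambda v: isinstance(v, (int, float))
    if op in ('+', '*') and const(y) and const(z):
        return y + z if op == '+' else y * z   # constant folding
    if op == '+' and z == 0:
        return y                               # x + 0 = x
    if op == '*' and z == 1:
        return y                               # x * 1 = x
    if op == '*' and z == 2:
        return ('+', y, y)                     # 2 * x = x + x (strength red.)
    return (op, y, z)                          # no identity applies
```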
Representation of Array References
• An assignment from an array, like x = a[i], is represented by creating a node with operator =[] and two children representing the initial value of the array, a0, and the index i. Variable x becomes a label of this new node.
• An assignment to an array, like a[j] = y, is represented by a new node with operator []= and three children representing a0, j, and y. There is no variable labeling this node. The creation of this node kills all currently constructed nodes whose value depends on a0.
Representation of Array References
a is an array. b is a position in the array a.
x is killed by b[j]=y.
Pointer Assignments and Procedure Calls
• The problem with the following assignments:
    x = *p
    *q = y
  – we do not know what p or q point to
  – x = *p is a use of every variable
  – *q = y is a possible assignment to every variable
  – the operator =* must take all nodes that are currently associated with identifiers as arguments, which is relevant for dead-code elimination
  – the *= operator kills all other nodes so far constructed in the DAG
  – global pointer analyses can be used to limit the set of affected variables
• Procedure calls behave much like assignments through pointers.
  – Assume that a procedure uses and changes any data to which it has access.
  – If variable x is in the scope of a procedure P, a call to P both uses the node with attached variable x and kills that node.
Reassembling Basic Blocks From DAGsb is not live on exit
b is live on exit
Reassembling Basic Blocks From DAGs
• The rules of reassembling:
  1. The order of instructions must respect the order of nodes in the DAG.
  2. Assignments to an array must follow all previous assignments to, or evaluations from, the same array, according to the order of these instructions in the original basic block.
  3. Evaluations of array elements must follow any previous assignments to the same array.
  4. Any use of a variable must follow all previous procedure calls or indirect assignments through a pointer.
  5. Any procedure call or indirect assignment through a pointer must follow all previous evaluations of any variable.
Test Yourself
• Construct the DAG for the basic block
    d = b * c
    e = a + b
    b = b * c
    a = e - d
• Simplify the above three-address code, assuming only a is live on exit from the block.
Instruction Selection by Tree Rewriting
• Instruction selection
  – selecting target-language instructions to implement the operators in the intermediate representation
  – a large combinatorial task, especially for CISC machines
• In this section, we treat instruction selection as a tree-rewriting problem.
Intermediate-Code Tree
• A tree for the assignment statement a[i] = b + 1, where the array a is stored on the run-time stack and the variable b is a global in memory location Mb.
• The ind operator treats its argument as a memory address.
Tree-Translation Schemes
• The target code is generated by applying a sequence of tree-rewriting rules to reduce the input tree to a single node.
• Each tree-rewriting rule has the form
    replacement ← template { action }
  where replacement is a single node, template is a tree, and action is a code fragment.
• Example:
Tree-Translation Schemes (figure; rules shown: load, store, indexed load)
Tree-Translation Schemes (figure; rule shown: addition)
Code Generation by Tiling an Input Tree
• Given an input tree, the templates in the tree-rewriting rules are applied to tile its subtrees.
• If a template matches, the matching subtree in the input tree is replaced with the replacement node of the rule, and the action associated with the rule is performed. If the action contains a sequence of machine instructions, the instructions are emitted.
• This process is repeated until the tree is reduced to a single node, or until no more templates match.
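The tiling process above can be sketched as a greedy walk over a toy tree with only variables, constants, and addition. The templates, register policy, and emitted strings are illustrative assumptions, not the slides' rule set; note how the larger reg + constant tile is tried before the general addition tile.

```python
# Greedy tiling of a tiny expression tree into target instructions.
# Tree nodes: ('var', name) | ('const', n) | ('+', left, right).
def tile(node, out):
    """Append instructions to out; return the register holding the result."""
    kind = node[0]
    if kind == 'var':
        r = f"R{len(out)}"                # naive policy: fresh register
        out.append(f"LD {r}, {node[1]}")
        return r
    if kind == 'const':
        r = f"R{len(out)}"
        out.append(f"LD {r}, #{node[1]}")
        return r
    left, right = node[1], node[2]
    rl = tile(left, out)
    if right[0] == 'const':               # larger tile: reg + constant
        out.append(f"ADD {rl}, {rl}, #{right[1]}")
        return rl
    rr = tile(right, out)                 # general tile: reg + reg
    out.append(f"ADD {rl}, {rl}, {rr}")
    return rl
```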
Code Generation by Tiling an Input Tree
• To implement the tree-reduction process, we must address some issues related to tree-pattern matching:
  – How is tree-pattern matching to be done?
  – What do we do if more than one template matches at a given time?
Pattern Matching by Parsing
• The input tree can be treated as a string by using its prefix representation.
• Use an LR parser to do the pattern matching.
Pattern Matching by Parsing Cont’d
• The tree-translation scheme can be converted into a syntax-directed translation scheme.
• From the productions of the translation scheme we build an LR parser. The target code is generated by emitting the machine instructions in the semantic actions as the corresponding reductions are performed.
Ambiguity Elimination
• The "maximal munch" approach favors larger reductions over smaller ones:
  – in a reduce-reduce conflict, the longer reduction is favored
  – in a shift-reduce conflict, the shift move is favored
Routines for Semantic Checking
• Restrictions on attribute values
  – Generic templates can be used to represent classes of instructions, and the semantic actions can then be used to pick instructions for specific cases.
• Parsing-action conflicts can be resolved by disambiguating predicates that allow different selection strategies to be used in different contexts.
General Tree Matching
• The LR-parsing approach to pattern matching based on prefix representations favors the left operand of a binary operator.
• Postfix representation
  – an LR-parsing approach to pattern matching would favor the right operand.
• Hand-written code generator
  – an ad hoc matcher can be written.
• Code-generator generator
  – needs a general tree-matching algorithm.
  – An efficient top-down algorithm can be developed by extending string pattern-matching techniques.
Homework: Read the textbook and study the example: A Simple Code Generator.