cs412/413
DESCRIPTION
CS412/413. Introduction to Compilers and Translators Spring ’99 Lecture 12: Intermediate Representation and the Translation Function. Administration. Prelim 1 on Monday in class topics covered: regular expressions, tokenizing, context-free grammars, LL & LR parsers, static semantics - PowerPoint PPT PresentationTRANSCRIPT
CS412/413
Introduction to
Compilers and Translators
Spring ’99
Lecture 12: Intermediate Representation and the Translation Function
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
2
Administration
• Prelim 1 on Monday in class– topics covered: regular expressions, tokenizing,
context-free grammars, LL & LR parsers, static semantics
• No class Wednesday March 3
• Programming Assignment 2 due Friday March 5
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
3
Where we areSource code(character stream)
Lexical analysis
Syntactic Analysis
Token stream
Semantic Analysis
Intermediate CodeGeneration
Abstract syntax tree+ types
Abstract syntax tree
Intermediate Code
regular expressions
grammars
static semantics
translation functions
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
4
Intermediate Code• Abstract machine code - simpler
• Allows machine-independent code generation, optimization
AST
Pentium
Java bytecode
Alpha
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
5
Intermediate Code• Abstract machine code
• Allows machine-independent code generation, optimization
AST
Pentium
Java bytecode
Alpha
IR
optimize
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
6
Intermediate Code• High-level vs. low-level IR• High-level IR preserves high-level language
constructs– structured flow, variables, methods
– essentially, extended version of AST
– allows high-level optimization
• Low-level IR (ala Appel) is abstract machine code– unstructured jumps, registers, memory loc’ns
– allows low-level optimization
– convenient for translation to real machine code
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
7
Translations• Goal: get program closer to machine code
without losing information needed to do useful optimizations
AST
Pentium
Java bytecode
Alpha
L-L IRH-L IR
optimize optimize
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
8
Intermediate Code• High-level IR AST
– Some AST nodes may be replaced with other AST nodes
– New kinds of nodes not occurring in parse tree may be added
– Will consider later for high-level optimizations (e.g. inlining)
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
9
Low-level IR• Last time: variables in high-level code
are mapped to stack locations or regs• Stack loc’ns in function’s stack frame• Stack frame is region between frame
pointer (fp) and stack pointer (sp• Local variables, temporaries to
negative offsets• Arguments, static link are temporaries
in previous stack frame• If stack frame constant size, can use
(positive) offsets from sp -- (fp - sp) = const.
args
sp
fplocals
temps
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
10
Low-Level IR code• Intermediate Representation is a tree of
nodes representing abstract machine instructions
• Statement-like instructions return no value, are executed in a particular order (e.g. MOVE, SEQ, CJUMP)
• Expression-like instructions return a value, have non-determinism (ADD, SUB)
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
11
IR expressions• CONST(i) : the integer constant i• TEMP(t) : a temporary register t. The abstract
machine has an infinite number of these
• OP(e1, e2) : one of the following operations– PLUS, MINUS, MUL, DIV, MOD
– AND, OR, XOR, LSHIFT, RSHIFT, ARSHIFT
• MEM(e) : contents of memory locn w/ address e• CALL(f, l) : result of fcn f applied to arguments l• ESEQ(s, e) : result of e after stmt s is executed• NAME(n) : address of the statement labeled n
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
12
IR statements• MOVE(e, dest) : move result of e into dest
– dest = TEMP(t) : assign to temporary t– dest = MEM(e) : assign to memory locn e
• EXP(e) : evaluate e, discard result• SEQ(s1, s2) : execute s1 and then s2
• JUMP(e) : jump to address e• CJUMP(e, l1, l2) : jump to l1 or l2 depending on
whether e is true or false• LABEL(n) : a labeled statement (may be used
in NAME, JUMP, CJUMP)
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
13
Translation
• How do we translate an AST/High-level IR into this low-level IR representation?
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
14
Variables• Local variables, arguments
mapped to offsets from frame pointer (negative, positive resp.)
• Local variable v located at offset k -- reference to v in AST becomes IR expression MEM(PLUS(FP, k))
args
sp
fplocals
tempsMEM
+
FP CONST(k)v
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
15
Assignment• Assignment v = E translates to a MOVE(e,
dest) node, where e is the translation of expression E, and dest is the location of v.
MOVE
2 MEM
FP
+
CONST(k)
x = 2;
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
16
Statements• A sequence of two statements translates to a
SEQ node:
• If s1 translates to IR tree T1 and s2 to T2
• Then s1 ; s2 translates to SEQ(T1 , T2)
s1 ; s2
SEQ
T1 T2
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
17
Translation functions• Introduce function T[ E ] to represent the
translated version of an expression or statement E. Translation rule for a sequence:
T[ s1; s2 ] = SEQ(T[ s1 ], T[ s2 ])
• Assignment:
T[ v = E ] = ?
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
18
Translation function• How to translate a variable?
T [ v ] = MEM(PLUS(FP, CONST( k )))
• Where does k come from?
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
19
Need environment• Answer: put it in the symbol table
• Translation function takes another argument A. T[ E, A ] or T[ S, A ]
• For each variable v in scope, the environment contains entry “location v : k”
location v : k A
T [ v , A ] = MEM(PLUS(FP, CONST( k )))
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
20
Translation rules• Can write rules for remainder of AST• Like type-checking: provides recursive
recipe for translation codelocation v : k A
T [ v , A ] = MEM(PLUS(FP, CONST( k )))
T[ s1; s2 ] = SEQ(T[ s1 ] , T[ s2 ])
T[ v = E ] = MOVE( T [ E ] , T [ v ] )
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
21
Translation Code• Like type-checking: add method to AST
nodes that does the translation
abstract class ASTNode {
IRNode translate(SymTab A) { … }
}
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
22
Translating a blockT[ s1; s2 , A] = SEQ(T[ s1 , A] , T[ s2 , A])
class Block { Stmt [ ] stmts;IRNode translate(SymTab A) {
int n = stmts.length;IRNode ret = new Seq(stmts[n-2].translate(A),
stmts[n-1].translate(A));while (n > 2) { n--; ret = new Seq(stmts[n-2].translate(A), ret); }return ret;
}
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
23
Array index expressions• Array index expressions may be used both
as LHS and RHS of assignment.
• Translate to a MEM node pointing to the appropriate array element
• Question: how to find appropriate element?
• Need to decide on a representation for arrays
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
24
Array representation• Arrays can’t change in size, so store as
contiguous series of elements.
• Also need length of array
v: array[int] = new int[3] = 0;
v[1] = 5; v[2] = 10;
v305
10
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
25
Translating array index
T [ E1[ E2 ] , A ] =
MEM(PLUS(T[E1, A],
(MUL(PLUS(CONST(1), T[ E2,
A ]),
CONST(4))i.e. MEM(E1 + 4*(E2+1))
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
26
Bounds checking• Need to check that array index is within
bounds!
• Use ESEQ node
• E1[ E2 ] translates to ESEQ
boundschecking
code
MEM+
T[E1 ]...
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
27
Conclusions• Language constructs can be translated to a
small IR representation
• Translation process can be described conveniently as a translation function T[ E, A ]
• Translation function corresponds to natural recursive implementation that builds IR nodes bottom-up
• Next time: translating structured statements