
Compilers and Language Processing Tools
Summer Term 2011

Prof. Dr. Arnd Poetzsch-Heffter

Software Technology Group
TU Kaiserslautern

© Prof. Dr. Arnd Poetzsch-Heffter

Content of Lecture

1. Introduction
2. Syntax and Type Analysis
   2.1 Lexical Analysis
   2.2 Context-Free Syntax Analysis
   2.3 Context-Dependent Analysis
3. Translation to Target Language
   3.1 Translation of Imperative Language Constructs
   3.2 Translation of Object-Oriented Language Constructs
4. Selected Aspects of Compilers
   4.1 Intermediate Languages
   4.2 Optimization
   4.3 Data Flow Analysis
   4.4 Register Allocation
   4.5 Code Generation
5. Garbage Collection
6. XML Processing (DOM, SAX, XSLT)

3. Translation to Target Language

Chapter Outline

3. Translation to Target Language
   3.1 Translation of Imperative Language Constructs
       3.1.1 Language Constructs of Procedural Language
       3.1.2 Assembly and Machine Languages
       3.1.3 Translation of Variables and Data Types
       3.1.4 Translation of Expressions
       3.1.5 Translation of Statements
       3.1.6 Translation of Procedures and Local Objects
   3.2 Translation of Object-Oriented Language Constructs
       3.2.1 Concepts of Object-Oriented Programming Languages
       3.2.2 Translation with Procedural Languages
       3.2.3 Translation of Classes
       3.2.4 Problems of Multiple Inheritance
       3.2.5 Further Aspects of Object-Oriented Languages
       3.2.6 Summary - A Simple Compiler


Translation to Target Language

Focus:
• Differences between source languages and target languages/target machines
• Most important translation techniques for different programming paradigms (procedural/object-oriented)


Translation to Target Language (2)

Learning Objectives:
• Overview of imperative and procedural language constructs

• Typical language constructs of assembler languages

• Translation techniques for procedural language constructs

• Translation of object-oriented language constructs


3.1 Translation of Imperative Language Constructs



3.1.1 Language Constructs of Procedural Languages



Language Constructs of Procedural Languages

From a conceptual and semantic point of view, procedural languages have the following constructs:
• Domains with operations (often typed)
  - pre-defined: int, boolean, ...
  - user-defined: records, classes, ...
  - implicitly defined: array types, address types, function types
• Variables
  - simple and compound types
  - global, local, statically/dynamically allocated
  - define memory state
• Expressions
  - computation of values with implicit intermediate results
  - possibly in combination with execution control and state modification



Language Constructs of Procedural Languages (2)

• Statements
  - simple and combined statements
  - define execution control and state modification
• Procedures
  - abstraction of parametrized statements
  - may be recursive
  - may be nested

Modules usually do not have a semantic meaning and are only relevant for translation in name analysis and for binding and loading.



Nested Procedures

Nested (local) procedures are supported, for example, by Pascal and Ada.

Example from [Wilhelm, Maurer; Fig. 2.9]:

    proc P(a)
      var b
      var c
      proc Q
        var a
        proc R
          var b
        begin
          ... b ...
          ... a ...
          ... c ...
        end
      begin
        ... a ...
        ... b ...
        ... call Q ...
      end
      proc S
        var a
      begin
        ... a ...
        ... call Q ...
      end
    begin
      ... a ...
      ... call Q ...
    end



3.1.2 Assembly and Machine Languages



Assembly and Machine Languages

Assembly languages have the following language constructs:
• Finite sequences of bits of various length: byte, word, halfword, ...
• Global memory
  - registers, flags (addressing by name)
  - indexed, mostly word-addressed main memory
• Instructions
  - load, store
  - arithmetic and boolean operations
  - execution control (jumps, procedures)
  - simple, not combined statements
  - possibly complex addressing of operands
• Initialization instructions



The MIPS Assembler

MIPS - Microprocessor without interlocked pipeline stages

• RISC architecture, originally 32 bit (since 1991 64 bit)
• developed by John Hennessy (Stanford) starting 1981
• MARS Simulator: http://courses.missouristate.edu/KenVollmar/MARS/



MIPS Architecture

• Arithmetic-Logic Unit (ALU)
• Floating-Point Unit (FPU)
• 32 registers (incl. stack pointer, frame pointer, global pointer, return address)
• Main memory, 2^30 memory words (4 bytes each)
• 5-stage pipeline



MIPS Architecture (2)

[Figure: the five-stage MIPS pipeline with the stages Instruction Fetch (IF), Instruction Decode / Register Fetch (ID), Execute / Address Calculation (EX), Memory Access (MEM) and Write Back (WB), separated by the pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB. Image: Wikipedia]



Memory Structure

Address range              Segment
0x80000000 - 0xFFFFFFFF    Reserved for OS
up to 0x7FFFFFFF           Stack Segment ($sp points to the top of the stack)
                           free
                           Heap Segment
from 0x10000000            Data Segment
from 0x00400000            Text Segment
from 0x00000000            Reserved



Data Types and Literals in MIPS Assembly Language

Data Types

• Instructions are all 32 bits
• byte (8 bits), halfword (2 bytes), word (4 bytes)
• integer (1 word storage)
• single precision floats (1 word storage)
• double precision floats (2 word storage)

Literals

• Integers (e.g. 4, 2, -236, 0x44)
• Floats (e.g. 3.41, -0.323e5)
• Characters in single quotes, e.g. 'b'
• Strings in double quotes, e.g. "Hello World"



MIPS Registers

No.     Name         P*    Description
0       $zero        -     the constant 0
1       $at          -     assembler temporary (reserved by the assembler)
2-3     $v0, $v1     no    values for function results and expression evaluation
4-7     $a0 - $a3    no    arguments for subroutine calls
8-15    $t0 - $t7    no    temporaries
16-23   $s0 - $s7    yes   saved temporaries
24-25   $t8 - $t9    no    additional temporaries
26-27   $k0, $k1     no    reserved for OS kernel
28      $gp          yes   global pointer
29      $sp          yes   stack pointer
30      $fp          yes   frame pointer
31      $ra          yes   return address

*P: callee must preserve value



MIPS Instruction Format

• Instructions are always 32 bit
• Opcode in first 6 bits
• 3 types of instructions: R-, I-, and J-instructions

R-Instructions:  opcode (6) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)

I-Instructions:  opcode (6) | rs (5) | rt (5) | immediate (16)

J-Instructions:  opcode (6) | address (26)
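The field layout can be made concrete by packing the fields into a 32-bit word. The following is a minimal sketch in C; the function names `encode_r` and `encode_i` are chosen for illustration only and are not part of the lecture material. The opcode and funct values used in the usage note are those of the MIPS32 instruction set, and the register numbers come from the register table above.

```c
#include <stdint.h>

/* Pack the six R-format fields (6 + 5 + 5 + 5 + 5 + 6 bits) into one word.
   Each field is masked to its width before it is shifted into place. */
uint32_t encode_r(uint32_t opcode, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((opcode & 0x3Fu) << 26) |
           ((rs     & 0x1Fu) << 21) |
           ((rt     & 0x1Fu) << 16) |
           ((rd     & 0x1Fu) << 11) |
           ((shamt  & 0x1Fu) << 6)  |
            (funct  & 0x3Fu);
}

/* Pack the four I-format fields (6 + 5 + 5 + 16 bits) into one word.
   The signed immediate is stored as its 16-bit two's complement pattern. */
uint32_t encode_i(uint32_t opcode, uint32_t rs, uint32_t rt, int16_t imm)
{
    return ((opcode & 0x3Fu) << 26) |
           ((rs     & 0x1Fu) << 21) |
           ((rt     & 0x1Fu) << 16) |
            (uint32_t)(uint16_t)imm;
}
```

For example, `add $t0, $t1, $t2` has opcode 0, funct 0x20, rs = 9, rt = 10 and rd = 8, so `encode_r(0, 9, 10, 8, 0, 0x20)` yields 0x012A4020. Note that the destination register comes first in assembly syntax but third among the register fields of the encoding.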



MIPS Instructions

In the following, let r1, r2, r3 be registers (e.g. $s1, $t3) and let c be a constant value (e.g. 4, 100, -4).

Arithmetic

Instruction      Syntax             Meaning
add              add r1, r2, r3     r1 = r2 + r3
subtract         sub r1, r2, r3     r1 = r2 - r3
add immediate    addi r1, r2, c     r1 = r2 + c
multiply         mul r1, r2, r3     r1 = r2 * r3 (lower 32 bits of result)
move             move r1, r2        r1 = r2 (pseudo-instruction, e.g. addi r1, r2, 0)



MIPS Instructions (2)

Data Transfer

Instruction       Syntax           Meaning
load word         lw r1, c(r2)     r1 = Memory[r2 + c]
store word        sw r1, c(r2)     Memory[r2 + c] = r1
load immediate    li r1, c         r1 = c
load half         lh r1, c(r2)     r1 = Memory[r2 + c] (halfword)
store half        sh r1, c(r2)     Memory[r2 + c] = r1 (halfword)
load byte         lb r1, c(r2)     r1 = Memory[r2 + c] (byte)
store byte        sb r1, c(r2)     Memory[r2 + c] = r1 (byte)




Logical

Instruction            Syntax             Meaning
and                    and r1, r2, r3     r1 = r2 & r3
or                     or r1, r2, r3      r1 = r2 | r3
nor                    nor r1, r2, r3     r1 = ¬(r2 | r3)
and immediate          andi r1, r2, c     r1 = r2 & c
or immediate           ori r1, r2, c      r1 = r2 | c
shift left logical     sll r1, r2, c      r1 = r2 << c
shift right logical    srl r1, r2, c      r1 = r2 >> c




Conditional Branches

Instruction           Syntax               Meaning
branch on equal       beq r1, r2, label    if (r1 == r2) goto label
branch on not equal   bne r1, r2, label    if (r1 != r2) goto label
set on less than      slt r1, r2, r3       if (r2 < r3) r1 := 1 else r1 := 0
set o.l.t. immediate  slti r1, r2, c       if (r2 < c) r1 := 1 else r1 := 0

Unconditional Branches

Instruction       Syntax        Meaning
jump              j label       goto label
jump register     jr r1         goto r1
jump and link     jal label     $ra = PC + 4; goto label



Subroutine Calls

Subroutine call (jump and link)

jal label # jump and link

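• copy the return address (the address of the instruction after the call, PC + 4) to $ra
• jump to label
• Note: before the call, save the current $ra on the stack, because jal overwrites it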

Subroutine return (jump register)

jr $ra # jump register

• jump to return address in $ra



Working with the Stack

Push data on the stack

sw $ra, ($sp) # save return address on stack

addi $sp, $sp, -4 # decrement stack pointer

sw $fp, ($sp) # save frame pointer on stack

addi $sp, $sp, -4 # decrement stack pointer

Pop data from the stack

addi $sp, $sp, 4 # increment stack pointer

lw $fp, ($sp) # pop saved frame pointer

addi $sp, $sp, 4 # increment stack pointer

lw $ra, ($sp) # pop saved return address



Addressing in MIPS

• Immediate: Operand is a constant, e.g. 25

• Register: Operand is a register, e.g. $s2

• Base or Displacement Addressing: Operand is a memory location whose address is the sum of a register and a constant, e.g. 8($sp)

• PC relative: Address is the sum of PC and a constant

• Pseudodirect Addressing: Jump address is formed from the 26 address bits of the instruction combined with the upper bits of the PC
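The last two addressing modes can be written out as address arithmetic. The sketch below follows the MIPS32 convention that both modes are based on the address of the instruction following the branch or jump (PC + 4) and that offsets count words, not bytes; the function names are illustrative and do not come from the slides.

```c
#include <stdint.h>

/* Pseudodirect addressing: the upper 4 bits of PC + 4 are combined with
   the 26-bit address field of the J-instruction, shifted left by 2. */
uint32_t jump_target(uint32_t pc, uint32_t instruction)
{
    uint32_t field = instruction & 0x03FFFFFFu;   /* low 26 bits */
    return ((pc + 4) & 0xF0000000u) | (field << 2);
}

/* PC-relative addressing: the signed 16-bit word offset of a branch is
   scaled by 4 and added to PC + 4 (wrap-around arithmetic on 32 bits). */
uint32_t branch_target(uint32_t pc, int16_t offset)
{
    return pc + 4 + (uint32_t)((int32_t)offset * 4);
}
```

A jump instruction located at 0x00400000 with the encoding 0x08100008 therefore continues at 0x00400020, and a branch with offset -1 at the same address branches to itself.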



Syscalls for MARS/SPIM Simulators

How to use System Calls:
• load service number into register $v0
• load argument values, if any, into $a0, $a1, $a2
• issue call instruction syscall
• retrieve return values, if any

Example:

li $v0, 1 # print integer

move $a0, $t0 # load value into $a0

syscall



List of System Services

Service                       Code in $v0    Arguments
print integer                 1              $a0 = integer to print
print string                  4              $a0 = address of null-terminated string to print
exit (terminate execution)    10
print character               11             $a0 = character to print
exit2 (terminate with value)  17             $a0 = termination result



MIPS Assembly Program Structure

.data # data declarations follow this line

# ...

.text # instructions follow this line

# ...

main: # indicates the first instruction to execute

# ...



Data Declarations

Format

<name>: <type> (<initial values> | <allocated space>)

Example

.data                    # data declarations follow
var:    .word 3          # integer variable with initial value 3
array1: .byte 'a','b'    # 2-element character array initialized with 'a' and 'b'
array2: .space 40        # allocate 40 consecutive bytes, uninitialized



Example: Translation to MIPS

The example illustrates the MIPS assembler and typical translation tasks. Code quality is not considered.

Source Code in C

    char a[3], b[3];
    int i;
    char res;
    int main() {
      i = 2;
      res = 1;
      while( -1 < i ) {
        if( res ) {
          res = (a[i]==b[i]);
          i = i-1;
        } else {
          i = i-1;
        }
      }
      return res;
    }



Source Code in C with Labels

    char a[3], b[3];
    int i;
    char res;
    int main() {
    main:     i = 2;
              res = 1;
    loop:     while( -1 < i ) {
                if( res ) {
                  res = (a[i]==b[i]);
    after:        i = i-1;
                } else {
    elseif:       i = i-1;
                } // afterif:
              }
    exit:     return res;
    }



Source Code in C with Gotos

    char a[3], b[3];
    int i;
    char res;
    int main() {
              i = 2;
              res = 1;
    loop:     if (! (-1 < i ))
                goto exit;
              if( !res )
                goto elseif;
              if (a[i]==b[i])
                goto equal;
              res = 0;
              goto after;
    equal:    res = 1;
    after:    i = i-1;
              goto afterif;
    elseif:   i = i-1;
    afterif:  goto loop;
    exit:     return res;
    }



MIPS Program

    # sp + 0 : i
    # sp + 4 : res
    # sp + 5 : base address of a[3]
    # sp + 8 : base address of b[3]
    main:
        addi $sp, $sp, -12      # make space for the variables
        li   $t1, 2
        sw   $t1, 0($sp)        # i = 2
        li   $t1, 1
        sb   $t1, 4($sp)        # res = 1 (res is at sp + 4)
    loop:
        lw   $t2, 0($sp)        # load i into $t2
        li   $t3, -1            # load -1 into $t3
        slt  $t0, $t3, $t2      # -1 < i ?
        beq  $t0, $zero, exit   # if not -1 < i goto exit
        lb   $t1, 4($sp)        # load res from stack into $t1
        beq  $t1, $zero, elseif # if res == 0 goto elseif
        addi $t4, $sp, 5        # base address of array a
        add  $t4, $t4, $t2      # add offset / array index
        lb   $t0, 0($t4)        # load a[i]
        addi $t4, $sp, 8        # base address of array b
        add  $t4, $t4, $t2      # add offset / array index
        lb   $t1, 0($t4)        # load b[i]
        beq  $t0, $t1, equal    # if a[i] == b[i] goto equal
        sb   $zero, 4($sp)      # set res to 0
        j    after
    equal:
        addi $t3, $zero, 1      # $t3 = 1
        sb   $t3, 4($sp)        # res = $t3
    after:
        subi $t2, $t2, 1        # i = i-1
        sw   $t2, 0($sp)        # store i to sp + 0
        j    afterif            # goto end of if statement
    elseif:
        subi $t2, $t2, 1        # i = i-1
        sw   $t2, 0($sp)        # store i to sp + 0
    afterif:
        j    loop               # return to loop
    exit:
        lb   $a0, 4($sp)        # terminate with exit code res (res is a single byte)
        addi $sp, $sp, 12       # reset stack pointer
        li   $v0, 17
        syscall



Translation to MIPS

Remarks:
The example illustrates typical translation tasks:
• Translation of data types, memory management, addressing
• Translation of expressions, management of intermediate results, mapping of operations of the source language to operations of the target language
• Translation of statements by implementation with jumps
• Bad code quality with simple systematic approach



Translation Process

[Figure: translation process. Concrete syntax (SL) → lexical and context-free analysis → AST (SL) → context-dependent analysis → translator → AST (MIPS) → code generator → concrete syntax (MIPS)]



MIPS Abstract Syntax

Prog * Instruction
Instruction = ADD (Register reg0, Register reg1, Register reg2)
            | ADDI (Register reg0, Register reg1, Const const0)
            | BEQ (Register reg0, Register reg1, Label label0)
            | SLT (Register reg0, Register reg1, Register reg2)
            | SLTI (Register reg0, Register reg1, Const const0)
            | J (Label label0)
            | JR (Register reg0)
            | JAL (Label label0)
            ...

Const ( Integer value )
Label ( Integer labelId )
Register = Zero () | AT () | VReg | AReg | TReg | SReg
         | KReg | GP () | SP () | FP () | RA ()

VReg = V0 () | V1 ()
AReg = A0 () | A1 () | A2 () | A3 ()
...



3.1.3 Translation of Variables and Data Types



Translation of variables and data types

[Figure: the compiler maps the programming language (named variables, complex types) to the assembly language (addresses of memory regions, index and offset computation)]



Translation of variables and data types (2)

The translation of variables and data types comprises:

• handling of primitive data types
• conversion of data types (e.g. int → float)
• memory organisation
• translation of arrays
• translation of records and classes
• implementation of dynamic objects



Primitive data types

Usually, the primitive data types of source languages are supported by the target machine:
• int, long → 4 byte word with integer arithmetic
• float, double → accordingly

Potentially, data types have to be encoded:
• boolean → 1 byte or 4 byte words

Problems arise if the target machine does not comply with the requirements of the source language, e.g.
• floating point arithmetic is not handled according to the IEEE standard
• overflows are not dealt with correctly (cf. Java FP-strict expressions)
• operations for conversion are missing on the target machine



Memory layout

The conceptual memory layout of most imperative programming languages and target machines is similar. (Details depend on OS and machine.)

high addresses
    OS kernel
    stack            intermediate results, procedure-local values, objects with restricted scope
    heap             dynamic variables, objects, ...
    global values    global, static variables, constants, ...
    program
low addresses



Translation of arrays

Efficient translation of arrays is important for many tasks.

One-dimensional static arrays

• Allocate memory in the segment for global data (starting at $gp)
• Address computation with base address of array, index of array element and size of element type

Consider the array declaration T tarr[57]:

• $gp contains the base address of the global memory region
• Let R_rel contain the relative address of the array tarr in the global memory region
• Let Ri contain the index i of the array component

If k = sizeof(T), then the address of tarr[i] is $gp + R_rel + k * Ri.
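The address formula is plain integer arithmetic, which the following C sketch spells out. The function name `element_address` and its parameter names are assumptions made for this illustration; they mirror $gp, R_rel, k and Ri from the text.

```c
#include <stddef.h>
#include <stdint.h>

/* Address of tarr[i]: base of the global memory region, plus the relative
   address of the array in that region, plus element size times index. */
uintptr_t element_address(uintptr_t gp, size_t rel, size_t k, size_t i)
{
    return gp + rel + k * i;
}
```

With a global region starting at address 1000, an array placed at relative address 40 and 4-byte elements, element 5 lies at 1000 + 40 + 4 * 5 = 1060.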



Translation of Arrays (2)

Computation in MIPS

li $ti, k

mul $ti, Ri, $ti

add $ti, R_rel, $ti

add $ti, $gp, $ti

lw $ti, ($ti)



More Translation of Arrays

Multi-dimensional static arrays

Consider as example the Pascal declaration

var a:array[-5..5][1..9] of integer;

which corresponds to 99 integer variables:

a[-5, 1] ... a[-5,9]

...

a[5,1] ... a[5,9]

The matrix is stored row by row in memory. Storing rows is more efficient than storing columns because the second index is often the one incremented in inner loops.



Further Translation of Arrays

Translation of access to a[E1,E2]:

Assume the results of evaluating E1 and E2 are stored in $t1 and $t2.

As a is a static array, we know the dimensions at compile time.

a[$t1,$t2] is the r-th component of a linear array. Writing t_1 and t_2 for the contents of $t1 and $t2:

\[
\begin{aligned}
r &= (t_1 - (-5)) \cdot ((9 - 1) + 1) + (t_2 - 1) \\
  &= 9 \cdot t_1 + 45 + t_2 - 1 \\
  &= 9 \cdot t_1 + t_2 + 44
\end{aligned}
\]

Result: Store the address of the 44-th component as base address of the array in the symbol table. Then it suffices to add 9 * $t1 + $t2 to the base address.



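Further Translation of Arrays (2)

Code example for access to a[E1,E2]:

    [Code for E1 -> $t1]
    [Code for E2 -> $t2]
    LI ($t3, 9)
    MULT ($t1, $t1, $t3)
    ADD ($t1, $t1, $t2)
    LI ($t2, 4)
    MULT ($t1, $t1, $t2)
    ADDI ($t1, $t1, relA)
    ADD ($t1, $t1, $gp)
    LW ($t1, 0, $t1)

where relA = offset(a) + 4 * 44, i.e. the offset of the 44-th component in bytes, because the index has already been scaled by the element size of 4 bytes.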



General Translation of Arrays

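General array declaration of dimension k

var a: array [u1..o1, ..., uk..ok] of T;

Storing rows yields the following address for accessing a[R1, ..., Rk]:

\[
\begin{aligned}
r ={} & (R_1 - u_1) \cdot \mathit{size}(\text{array } [u_2..o_2, \ldots, u_k..o_k] \text{ of } T) \\
      & + (R_2 - u_2) \cdot \mathit{size}(\text{array } [u_3..o_3, \ldots, u_k..o_k] \text{ of } T) \\
      & + \ldots \\
      & + (R_k - u_k) \cdot \mathit{size}(T)
\end{aligned}
\]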



General Translation of Arrays (2)

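For i = 1, ..., k - 1, define

\[ \mathit{size}(i) := \mathit{size}(\text{array } [u_{i+1}..o_{i+1}, \ldots, u_k..o_k] \text{ of } T), \qquad \mathit{size}(k) := \mathit{size}(T) \]

This implies

\[ \mathit{size}(i - 1) = \mathit{size}(i) \cdot (o_i - u_i + 1) \]

Simplification yields:

\[ r = \sum_{i=1}^{k} R_i \cdot \mathit{size}(i) - \sum_{i=1}^{k} u_i \cdot \mathit{size}(i) \]

At runtime, only the first summand has to be computed, so code has to be generated only for it. The second summand is known at compile time.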
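The split into a compile-time and a run-time part can be sketched in C as follows. The type `Bounds` and the three function names are assumptions made for this illustration; array index j in the C code (counting from 0) corresponds to dimension j + 1 in the formulas.

```c
#include <stddef.h>

/* Bounds u..o (inclusive) of one array dimension. */
typedef struct { long lo, hi; } Bounds;

/* Fills size[0..k-1] with size(1)..size(k):
   size(k) = elem_size and size(i-1) = size(i) * (o_i - u_i + 1). */
void dimension_sizes(const Bounds *b, size_t k, long elem_size, long *size)
{
    size[k - 1] = elem_size;
    for (size_t i = k - 1; i > 0; i--)
        size[i - 1] = size[i] * (b[i].hi - b[i].lo + 1);
}

/* Second summand, known at compile time: sum of u_i * size(i). */
long constant_part(const Bounds *b, size_t k, const long *size)
{
    long c = 0;
    for (size_t i = 0; i < k; i++)
        c += b[i].lo * size[i];
    return c;
}

/* First summand, computed at run time: sum of R_i * size(i).
   Subtracting the constant gives the byte offset r of the element. */
long byte_offset(const long *idx, size_t k, const long *size, long constant)
{
    long r = 0;
    for (size_t i = 0; i < k; i++)
        r += idx[i] * size[i];
    return r - constant;
}
```

For the Pascal array a with bounds [-5..5] and [1..9] and 4-byte integers, the sizes are 36 and 4 and the constant part is -5 * 36 + 1 * 4 = -176. This matches the 44 components (176 bytes) added to the base address in the two-dimensional example.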



Code Generation for Array Access

Abstract syntax of source language:

    ArrayAccess ( UsedId uid, IndexExps ies )
    UsedId ( Ident id )
    IndexExps = IndexExpElem | IndexExp
    IndexExpElem ( IndexExp ie, IndexExps ies )
    IndexExp ( ... )

Code Generation for Array Access (2)

Attribution:

Attributes:
• Symbol table
• Result register Ri
• Address of array element
• Code for subtree
• List of sizes for each array dimension
• Relative address for array a:

\[ \mathit{relA} = \mathit{offset}(a) - \sum_{i=1}^{k} u_i \cdot \mathit{size}(i) \]

Operations for attribution:
• lkupRA: Ident × SymTab → Address
• lkupSZL: Ident × SymTab → IntList
• + : list concatenation; for an element e, [e] is the list containing only e.

In the following, the SymTab attribute is only explicitly given where it is required.




Attribution for ArrayAccess ( UsedId, IndexExps ):
• UsedId provides the relative address RA = lkupRA(id, symtab) and the size list lkupSZL(id, symtab); the size list is passed down to IndexExps
• IndexExps provides its code and the result register Ri
• Code for ArrayAccess:

    code of IndexExps + [ADD (Ri, Ri, $gp)] + [ADD (Ri, Ri, RA)]

Code Generation for Array Access (5)

To keep the attribution diagrams readable, identifiers can be used for attribute values.

Attribution for IndexExpElem ( IndexExp, IndexExps ), where CL and RL are the code and result register of the IndexExp child, CR and RR those of the IndexExps child, and FI = first(size list) is the size factor of this dimension (rest(size list) is passed on to the IndexExps child):

    CL +
    CR +
    [LOADI (RT, FI)] +
    [MUL (RL, RL, RT)] +
    [ADD (RR, RR, RL)]

During stepwise computation, array bounds can also be checked.



Array Access

Remarks:
• Computation of array indices offers great potential for optimizations.
• For translation of dynamic arrays, addressing has to be generalized appropriately. (cf. Wilhelm/Maurer, Sect. 2.6.2)



Translation of Records

Translation of records is similar to translation of arrays:
• Determine size and memory layout
• Compute addresses for selection of record components and pointer dereferencing
• Translation of record operations, e.g. assignments to record components

Recommended Reading: Wilhelm, Maurer, Section 2.6.2



Implementation of Dynamic Objects

Dynamic objects = dynamically allocated variables and objects in the sense of OO programming

Dynamic objects are stored on the heap:
• the number of dynamic objects is in general not known at compile time, so objects are created at runtime
• dynamic objects have a lifetime that in general does not permit stack-like handling

Memory representation and addressing of components is similar to static records.



Implementation of Dynamic Objects (2)

Example (dynamic objects):

    #include <stdlib.h>

    typedef struct listelem {
      int head;
      struct listelem* tail;
    }* list;

    /* size of one list element in bytes */
    #define listelemSIZE sizeof(struct listelem)

    /* allocates a new element on the heap and puts it in front of l */
    list append( int i, list l ) {
      list lvar = (list) calloc(1, listelemSIZE);
      lvar->head = i;
      lvar->tail = l;
      return lvar;
    }
    ...



Dynamic Memory Management

Dynamic memory management
• is handled by the runtime environment
• can be supported by the compiler
• can partially be handled by the user program

The runtime environment provides operations for dynamic memory management:
• for the programmer, e.g. in C: malloc, calloc, realloc, free
• for the compiler, as in Pascal, Java, Ada
• in some languages, e.g. Java, no memory deallocation by the programmer is possible; instead, the runtime environment performs garbage collection



Dynamic Memory Management (2)

General Problem: Provide memory blocks of different sizes from a linear memory and reuse memory after it has been freed.

Simple memory management by a linear list of free memory areas

Structure of a free memory area of variable length: a header containing the size, followed by the user data.

[Figure: linear memory divided into free and used areas; freelist points to the first free area, and the free areas are linked into a list]

Procedure to allocate and deallocate memory:

• Allocate memory
  - Search for a memory area B of appropriate size
  - Update references:
    • If the area has exactly the required size, remove it from the list.
    • Else update the header of the area, create a header for the rest of the free memory, and add this rest to the list in place of the old area.
  - Return a pointer to the memory cell after the header (the size information has to be kept).
  - If no memory area of the required size is found, new memory has to be requested from the OS.
• Free memory
  - Find the header of the memory area to be freed via the pointer to this area
  - If the previous or next memory areas are free, join the areas
  - Add the resulting memory area to the list
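The procedure above can be sketched as a small first-fit allocator in C. This is an illustration under simplifying assumptions, not the lecture's implementation: the names `Area`, `fl_init`, `fl_alloc` and `fl_free` are invented here, one block obtained from `malloc` stands in for memory requested from the OS, request sizes are rounded up to a multiple of the header size to keep headers aligned, and neighbouring free areas are not joined.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of every memory area, free or used. */
typedef struct Area {
    size_t size;         /* number of user-data bytes after the header */
    struct Area *next;   /* next free area; only meaningful while free */
} Area;

static Area *freelist = NULL;

/* Starts with one free area spanning a block of the given size
   (bytes must be larger than sizeof(Area)). Returns 0 on success. */
int fl_init(size_t bytes)
{
    Area *a = (Area *)malloc(bytes);
    if (a == NULL)
        return -1;
    a->size = bytes - sizeof(Area);
    a->next = NULL;
    freelist = a;
    return 0;
}

/* First fit: takes the first free area that is large enough. */
void *fl_alloc(size_t n)
{
    /* round up so that a header placed behind the block stays aligned */
    n = (n + sizeof(Area) - 1) / sizeof(Area) * sizeof(Area);
    for (Area **link = &freelist; *link != NULL; link = &(*link)->next) {
        Area *a = *link;
        if (a->size < n)
            continue;
        if (a->size >= n + 2 * sizeof(Area)) {
            /* split: the rest takes the place of the old area in the list */
            Area *rest = (Area *)((char *)(a + 1) + n);
            rest->size = a->size - n - sizeof(Area);
            rest->next = a->next;
            a->size = n;
            *link = rest;
        } else {
            *link = a->next;   /* (nearly) exact fit: remove area from list */
        }
        return a + 1;          /* pointer to the memory cell after the header */
    }
    return NULL;               /* a full allocator would ask the OS here */
}

/* The header lies directly in front of the user data. */
void fl_free(void *p)
{
    Area *a = (Area *)p - 1;
    a->next = freelist;        /* no joining of neighbours in this sketch */
    freelist = a;
}
```

Because the size stays in the header of a used area, `fl_free` needs nothing but the pointer handed out by `fl_alloc`. Joining adjacent free areas would additionally require a way to find the neighbours, for example by keeping the list sorted by address.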




Remarks:

• If a program writes beyond its assigned memory area, references or size information can be destroyed, with bad consequences.
• If memory cannot be allocated at byte granularity, alignment restrictions have to be obeyed.
• For practical use the above principle can be improved by
  - non-linear search
  - search for memory areas of exactly matching size, avoiding fragmentation
  - support for joining memory areas after deallocation



3.1.4 Translation of Expressions



Translation of Expressions

Difficulties for translation of expressions
• Management of intermediate results on stack or in registers
• Translation of source language operations
  - no counterpart in target language
  - addressing
  - context-dependent (a Boolean expression used as a condition is handled differently from a Boolean expression in an assignment)



Translation of Expressions (2)

Abstract Syntax of Expressions:

The general problems are demonstrated with a small example of direct expression translation.

    Exp = ArithmExp | Relation | IntConst
        | CharConst | ArrayAccess | Var
    ArithmExp = Add | Sub
    Add, Sub ( Exp left, Exp right )
    Relation = Lt | Eq
    Lt, Eq ( Exp left, Exp right )
    IntConst ( Int i )
    CharConst ( Char c )
    ArrayAccess ( UsedId uid, Exp e )
    Var ( UsedId uid )
    UsedId ( Ident id )

Design decisions:

• Intermediate results are stored on the stack.
• Comparisons are implemented by jumps:
  - compare the values on the stack
  - depending on the result, jump to a command pushing 1 or pushing 0
  - generate the associated labels




Attribution:

Attribute declarations:
• Relative address of a variable or array
• Type of an expression (int, char, int[], char[])
• Code for the subtree, of type CodeList
• Unique label for an expression, of type String

Attribution for the code attribute (CL and CR denote the code of the left and right subexpression, M the unique label of the expression):

Add ( Exp left, Exp right ):

    CL + CR +
    [LOAD (R2, 0, $sp)
     ADD ($sp, $sp, 4)
     LOAD (R1, 0, $sp)
     ADD (R1, R1, R2)
     STORE (R1, 0, $sp)]

Lt ( Exp left, Exp right ):

    CL + CR +
    [LOAD (R2, 0, $sp)
     ADD ($sp, $sp, 4)
     LOAD (R1, 0, $sp)
     SLT (R1, R1, R2)
     BEQ (R1, $zero, "PUSH_0_"+M)
     LOADI (R1, 1)
     STORE (R1, 0, $sp)
     JUMP ("ENDREL_"+M)
     LABEL ("PUSH_0_"+M)
     LOADI (R1, 0)
     STORE (R1, 0, $sp)
     LABEL ("ENDREL_"+M)]

(The attributions for Sub and Eq are analogous.)
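The scheme CL + CR + [combining code] translates directly into a recursive function over the expression tree. The following C sketch covers only integer constants and Add; the type `Exp`, the helper `emit` and the function `gen` are assumptions made for this illustration, and the instruction texts are copied from the attribution schemes with R1 chosen as the result register for constants.

```c
#include <stdio.h>
#include <string.h>

typedef enum { INT_CONST, ADD } Kind;

typedef struct Exp {
    Kind kind;
    int value;                        /* used by INT_CONST */
    const struct Exp *left, *right;   /* used by ADD */
} Exp;

/* Appends text to the string in out (capacity cap); 0 on success. */
static int emit(char *out, size_t cap, const char *text)
{
    if (strlen(out) + strlen(text) + 1 > cap)
        return -1;
    strcat(out, text);
    return 0;
}

/* Appends the code for e to out; returns 0 on success, -1 if out is full. */
int gen(const Exp *e, char *out, size_t cap)
{
    if (e->kind == INT_CONST) {
        /* push the constant onto the stack */
        char line[64];
        snprintf(line, sizeof line, "LOADI (R1, %d)\n", e->value);
        if (emit(out, cap, line) != 0)
            return -1;
        return emit(out, cap, "SUB ($sp, $sp, 4)\nSTORE (R1, 0, $sp)\n");
    }
    /* ADD: code of left operand, code of right operand, combining code */
    if (gen(e->left, out, cap) != 0 || gen(e->right, out, cap) != 0)
        return -1;
    return emit(out, cap,
                "LOAD (R2, 0, $sp)\n"
                "ADD ($sp, $sp, 4)\n"
                "LOAD (R1, 0, $sp)\n"
                "ADD (R1, R1, R2)\n"
                "STORE (R1, 0, $sp)\n");
}
```

Every subexpression leaves exactly one value on top of the stack, so the combining code for Add can pop the right operand and overwrite the left operand with the sum, regardless of how deeply the operands are nested.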




IntConst ( Int i ):

    [LOADI (Ri, int)] +
    [SUB ($sp, $sp, 4)] +
    [STORE (Ri, 0, $sp)]

Var ( UsedId uid ), where TV is the type of the variable and RA its relative address:

    if TV = int then
      [SUB ($sp, $sp, 4)
       LOADI (R1, RA)
       ADD (R1, R1, $gp)
       LOAD (R2, 0, R1)
       STORE (R2, 0, $sp)]
    else // TV = char
      [SUB ($sp, $sp, 1)
       LOADI (R1, RA)
       ADD (R1, R1, $gp)
       LOADB (R2, 0, R1)
       STOREB (R2, 0, $sp)]

ArrayAccess ( UsedId uid, Exp e ), where TV is the element type, RA the relative address of the array and CR the code of the index expression:

    CR +
    [LOAD (R1, 0, $sp)
     LOADI (R2, RA)
     ADD (R1, R1, R2)
     ADD (R1, R1, $gp)] +
    if TV = int then
      [LOAD (R2, 0, R1)
       STORE (R2, 0, $sp)]
    else // TV = char
      [LOADB (R2, 0, R1)
       STOREB (R2, 0, $sp)]

Note: The attributions for Var and ArrayAccess generate code that pushes the value of the expression onto the stack, not code for addressing the access.



Improvements

• Improvement of generated code by
  - storage of intermediate results in registers
  - context-dependent, optimizing instruction selection
  - avoiding redundant computations by evaluating common subexpressions only once
• Improvement of translation technique by usage of an intermediate language



3.1.5 Translation of Statements



Translation of Statements

Most statements can be translated by translation schemes with jumps:

While ( Exp, Stat ), where CE is the code of the condition, CS the code of the body and M the unique label of the statement:

    [LABEL ("BEGWHILE_"+M)] +
    CE +
    [LOAD (R1, 0, $sp)
     ADD ($sp, $sp, 4)
     BEQ (R1, $zero, "ENDWHILE_"+M)] +
    CS +
    [JUMP ("BEGWHILE_"+M)] +
    [LABEL ("ENDWHILE_"+M)]
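The shape of the generated code can be imitated in C with labels and gotos, in the same way as the earlier "Source Code in C with Gotos" slide. The function `sum_up_to` is a made-up example; the comments relate each line to the corresponding part of the While scheme.

```c
/* Sums the numbers 1..n. The loop "while (i <= n) { sum += i; i++; }"
   is written in the shape produced by the translation scheme. */
int sum_up_to(int n)
{
    int i = 1;
    int sum = 0;
BEGWHILE:                            /* LABEL BEGWHILE */
    if (!(i <= n)) goto ENDWHILE;    /* CE, then BEQ ... ENDWHILE */
    sum += i;                        /* CS: the loop body */
    i++;
    goto BEGWHILE;                   /* JUMP BEGWHILE */
ENDWHILE:                            /* LABEL ENDWHILE */
    return sum;
}
```

The condition is evaluated once per iteration at the top of the loop, and each iteration executes one conditional and one unconditional jump.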



More Complex Translation of Statements

A good translation of switch statements and the efficient handling of non-strict expressions are more complex.

We consider the translation of non-strict Boolean expressions as an example of an optimizing translation and of the usage of context information.

Example: Abstract Syntax

Stat = While | IfThenElse | ...
BExp = And | Or | Not | StrictExp

While ( BExp c, Stat b )
IfThenElse ( BExp c, Stat then, Stat else )
And, Or ( BExp left, BExp right )
Not ( BExp e )
StrictExp ( Exp e )



More Complex Translation of Statements (2)

A program fragment:

if( (B1 || B2) && ! B3 ) {
  while( !(B4 || B5) ) A1
} else {
  A2
}

where
• A1, A2 are statements
• B1 – B5 are strict expressions




In C and Java, the operators || and && are non-strict. For example, if B1 and B2 both evaluate to false, B3 need not and must not be evaluated.

Further, jump cascades should be avoided, i.e. jumps to unconditional jump instructions.

Idea for Attribution:
For each boolean expression, compute
• Label for true case (attribute TT)
• Label for false case (attribute FT)
• Information of type bool in which case to jump (attribute JI)




Further Attributes:
• Code for subtree, of type CodeList (written C with a suffix in the schemes, e.g. CB, CS)
• Unique label for each statement and for each boolean expression, of type String (written M)




IfThenElse (label M; children: BExp with code CB, Stat with code CT, Stat with code CE)
  Inherited attributes of the BExp child (true label TT, false label FT, jump flag JI):
    TT = "THEN" + M, FT = "ELSE" + M, JI = false
  Code:
    CB +
    [ LABEL ("THEN" + M) ] +
    CT +
    [ JUMP ("END" + M) ] +
    [ LABEL ("ELSE" + M) ] +
    CE +
    [ LABEL ("END" + M) ]




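While (label M; children: BExp with code CB, Stat with code CS)
  Inherited attributes of the BExp child:
    TT = "BODY" + M, FT = "ENDW" + M, JI = false
  Code:
    [ LABEL ("BEGW" + M) ] +
    CB +
    [ LABEL ("BODY" + M) ] +
    CS +
    [ JUMP ("BEGW" + M) ] +
    [ LABEL ("ENDW" + M) ]

Not (child: BExp)
  The child inherits TT and FT exchanged and JI negated, not(JI). Its code is passed on unchanged.

And (label M; children: BExp with code CL, BExp with code CR)
  Inherited attributes of the left child:
    TT = "BER" + M, FT = FT of the And node, JI = false
  The right child inherits TT, FT and JI of the And node.
  Code:
    CL +
    [ LABEL ("BER" + M) ] +
    CR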








Or (label M; children: BExp with code CL, BExp with code CR)
  Inherited attributes of the left child:
    TT = TT of the Or node, FT = "BER" + M, JI = true
  The right child inherits TT, FT and JI of the Or node.
  Code:
    CL +
    [ LABEL ("BER" + M) ] +
    CR





StrictExp (child: Exp with code CE; inherited attributes TT, FT, JI):

CE +
[ LOAD (R1, 0, $sp)
  ADD ($sp, $sp, 4) ] +
if JI then
  [ BNE (R1, $zero, LABEL(TT)) ]
else
  [ BEQ (R1, $zero, LABEL(FT)) ]
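The attribution for non-strict Boolean expressions can be sketched as a recursive function whose parameters play the role of the inherited attributes (true label tt, false label ft, jump flag ji). This is an illustrative sketch only: the class names, the string form of the instructions and the counter used for the "BER" labels are assumptions, and the distribution of the labels to the left operand of And and Or follows the usual short-circuit reading of the schemes.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Strict:
    name: str            # stands in for the code CE of a strict expression


@dataclass
class Not:
    e: object


@dataclass
class And:
    left: object
    right: object


@dataclass
class Or:
    left: object
    right: object


def gen_bexp(b, tt, ft, ji, fresh):
    """Code for b: with ji, jump to tt if b is true, else fall through;
    without ji, jump to ft if b is false, else fall through."""
    if isinstance(b, Strict):
        pop = [f"code for {b.name}", "LOAD (R1, 0, $sp)", "ADD ($sp, $sp, 4)"]
        jump = f"BNE (R1, $zero, {tt})" if ji else f"BEQ (R1, $zero, {ft})"
        return pop + [jump]
    if isinstance(b, Not):
        # exchange the targets and negate the jump flag
        return gen_bexp(b.e, ft, tt, not ji, fresh)
    if not isinstance(b, (And, Or)):
        raise TypeError(f"unexpected node {b!r}")
    ber = f"BER{next(fresh)}"          # label in front of the right operand
    if isinstance(b, And):
        left = gen_bexp(b.left, ber, ft, False, fresh)
    else:
        left = gen_bexp(b.left, tt, ber, True, fresh)
    return left + [f"LABEL ({ber})"] + gen_bexp(b.right, tt, ft, ji, fresh)


def gen_if(cond, then_code, else_code, m, fresh):
    """IfThenElse scheme: the condition falls through into the then-branch."""
    return (
        gen_bexp(cond, f"THEN{m}", f"ELSE{m}", False, fresh)
        + [f"LABEL (THEN{m})"] + then_code
        + [f"JUMP (END{m})", f"LABEL (ELSE{m})"] + else_code
        + [f"LABEL (END{m})"]
    )
```

For the condition (B1 || B2) && !B3 the sketch emits exactly one conditional branch per strict subexpression, and none of them targets an unconditional jump.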




Remarks:

If non-strict and strict Boolean expressions are mixed, code generation becomes more complex.

Example: a = ( b && f(c) ) + g;

Recommended Reading:
• Wilhelm, Maurer: Sec. 2.4, pp. 12–16



3.1.6 Translation of Procedures and Local Objects



Translation of Procedures and Local Objects

Most procedural languages support recursion, procedure-local variables and nested procedures. In the following, we consider
• Translation of recursive procedures
• Translation of local variables
• Translation of nested procedures

We do not consider the translation of procedures as parameters.



Procedures

The declaration of a procedure consists of
• the name of the procedure
• the declaration of the formal parameters
• the declaration of local variables
• the body of the procedure

Each dynamic call of a procedure corresponds to a procedure incarnation.

Analogy:
• Procedure declaration → procedure incarnation
• Class declaration → object/class instance



Procedure Call Tree

The runtime behaviour of a procedural program can be described by a procedure call tree.

Example (C-Program):

int odd (int n);
int even(int n){return n==0?1:odd(n-1);}
int odd (int n){return n==0?0:even(n-1);}
int main(){return even(2)?even(1):odd(1);}

Procedure call tree (children are the calls made by an incarnation):

main
  even(2)
    odd(1)
      even(0)
  even(1)
    odd(0)



Procedure Call Tree (2)

Remarks:

• The procedure call tree is an abstract description of the runtime behavior and depends on the inputs of the program.

• At each point of execution, there is an active path in the tree.
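A procedure call tree can be recorded by instrumenting the procedures. The sketch below transcribes the even/odd example to Python and keeps the active path as a list. The helper names (call_tree, traced, active) are assumptions chosen for illustration.

```python
def call_tree():
    """Run the even/odd example and return (tree, result of main).

    The tree consists of nested (name, children) pairs.
    """
    root = ("main", [])
    active = [root]                      # active path, innermost incarnation last

    def traced(fn):
        def wrapper(n):
            node = (f"{fn.__name__}({n})", [])
            active[-1][1].append(node)   # new child of the calling incarnation
            active.append(node)
            try:
                return fn(n)
            finally:
                active.pop()             # the incarnation is finished
        return wrapper

    @traced
    def even(n):
        return 1 if n == 0 else odd(n - 1)

    @traced
    def odd(n):
        return 0 if n == 0 else even(n - 1)

    result = even(1) if even(2) else odd(1)
    return root, result
```

At any moment during the run, the list active is the active path from main to the incarnation currently executing.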



Translation of Recursive Procedures

Main Tasks:
• Parameter passing on entry, return of result at exit of procedure
• Addressing of parameters
• Handling of recursion

Main Idea:
For each procedure incarnation, a stack frame is allocated. The stack frame contains:
• the actual parameters
• the return address
• the register contents of the caller
• further information



Stack Frame

Structure of stack frame

For procedures with a result, memory for the result also has to be allocated. (Where?)


Code Generation for Procedures

Code has to be generated
• at the call site
  - to pass the actual parameters to the procedure incarnation
  - to jump to the code of the procedure body
  - to make the procedure's result available for further processing
• at the beginning of the procedure (prolog)
  - to save registers
  - to set the argument pointer
• at the end of the procedure (epilog)
  - to restore registers

Note: Many tasks can be moved from the call site to the prolog and vice versa. Because a procedure has only one prolog, but potentially many call sites, moving code to the prolog (and to the epilog) yields more compact code.



Translation Scheme for Procedure Declaration

ProcDecl (children: Ident, ParamList, StatList with code CSL; M is the unique label of the node):

[ LABEL ("PROCBEG_" + M) ] +
<Prolog> +
CSL +
<Epilog> +
[ JR $ra ]



Translation Scheme for Procedure Call

Assume that the code for the list of parameter expressions ExpList pushes the actual parameters onto the stack.

Call (children: UsedId with entry label PLAB, ExpList with code CPL):

CPL +
[ JAL PLAB ] +
<Code to remove parameters from stack>

where PLAB is the entry label of the procedure.

Some machines have special instructions for procedure call and return (MIPS: JAL, JR).



Translation of Procedure-Local Variables

Analogously to parameters, procedure-local variables have to be stored in the stack frame, because there is one instance of the local variables for each procedure incarnation.



Dynamic and Static Local Variables

Local variables are static if their size is known at compile time; otherwise they are dynamic.

Example:

void foo( int hsize ) {
  int i, j;
  char f[ 2*hsize ];
  int g[ hsize ];
  int k;
  ...
}

where
• i, j, k are static local variables
• f, g are dynamic variables (arrays), because their size depends on the parameter hsize.


Memory Allocation for Local Variables

Memory is allocated in the prolog of a procedure. For dynamic variables, the amount depends on the actual parameters, so the compiler generates code for the allocation.

Addressing: Local variables are addressed relative to a reference point in the stack frame, e.g. the argument pointer/frame pointer.

When addressing dynamic variables, an additional indirection step is in general necessary so that relative addresses can be fixed statically for all local variables.



Stack Frame (Example)

Stack frame for procedure foo:



Stack Frame (Example) (2)

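Addresses of local variables in the example:
• i: AP - 64
• k: AP - 80
• f[Ri] – MIPS code:

  LI (R1, AP)
  SUBI (R1, R1, 72)    // address of the word reserved for f
  LW (R1, 0, R1)       // start address of f (indirection step)
  LI (R2, 4)
  MULT (Ri, Ri, R2)    // scale the index
  ADD (R1, R1, Ri)     // address of f[Ri]
  LW (R1, 0, R1)       // load f[Ri]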
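The addresses listed above (i at AP - 64, the word for f at AP - 72, k at AP - 80) are consistent with a layout in which every dynamic array occupies one pointer word in the statically sized part of the frame. The sketch below makes this explicit. The function names, the declaration format and the offsets of j and g are assumptions; the first offset 64 is simply taken from the example.

```python
def layout(decls, first_offset=64, word=4):
    """Assign each local a fixed offset below AP.

    decls is a list of (name, size) pairs; size is the size in bytes of a
    static variable, or None for a dynamic array, for which only one pointer
    word is reserved in the static part of the frame.
    """
    offsets, off = {}, first_offset
    for name, size in decls:
        offsets[name] = off
        slot = word if size is None else size
        off += -(-slot // word) * word   # round up to whole words
    return offsets


def element_address(memory, ap, offset, index, elem_size):
    """Address of element `index` of a dynamic array whose pointer is at AP - offset."""
    base = memory[ap - offset]           # the additional indirection step
    return base + index * elem_size
```

With this layout all offsets are known at compile time, although the sizes of f and g are only known when foo is entered.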



Translation of Nested Procedures

For each procedure incarnation, there exist instances of the local variables and of the parameters.

Problems:
• How are non-local variables (neither local nor global) addressed?
• Which instance of a non-local variable should be accessed?

These problems are also important for many functional languages.



Static and Dynamic Predecessors

• The direct static predecessor of a procedure declaration P is the procedure declaration enclosing P in the source text.

• The direct static predecessor of a procedure incarnation P is the currently youngest procedure incarnation of the direct static predecessor of P's declaration.

• The direct dynamic predecessor of a procedure incarnation P is the calling procedure incarnation.

• The static and dynamic predecessors are obtained as the transitive closure of the respective direct predecessor relation.



Nested Procedures (Example)

proc P
  var vp
  proc Q
    var vq
    proc R
      var vr
    begin
      (* here vp, vq, vr addressable *)
      call P
    end
  begin
    (* here vp, vq addressable *)
    call R
  end
  proc S
  begin
    (* here vp addressable *)
    if ... then call S
    if ... then call Q
  end
begin
  (* here vp addressable *)
  call S
end



Nested Procedures (Example) (2)

Procedure Call Tree:

[Figure: a possible procedure call tree for the example. Each incarnation is annotated with its accessible variables (P: vp; S: vp; Q: vp, vq; R: vp, vq, vr), and each call edge with the difference PND(callee) - PND(caller), e.g. +1, +0, -2.]



Nested Procedures (Example) (3)

The procedure nesting depth (PND) is an important characteristic for the translation of nested procedures.

If PG is a procedure that is callable from PA, then it holds that

PND(PG) ≤ PND(PA) + 1

In the example:

procedure   PND   callable
P           0     P, Q, S
Q           1     P, Q, R
R           2     P, Q, R
S           1     P, Q, S
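The table can be recomputed from the nesting structure of the example. The sketch below assumes a declaration-before-use visibility rule (a procedure can call itself, its enclosing procedures, and procedures declared earlier at an enclosing level or directly inside it); this rule and the names NESTING and analyse are assumptions made for illustration.

```python
# Nesting of the example, children in declaration order (an assumed encoding).
NESTING = ("P", [("Q", [("R", [])]), ("S", [])])


def analyse(tree):
    """Return {name: (PND, sorted list of callable procedures)}."""
    table = {}

    def visit(node, depth, visible):
        name, children = node
        seen = visible | {name}
        for child in children:           # children are declared before the body
            seen = seen | {child[0]}
            visit(child, depth + 1, seen)
        table[name] = (depth, sorted(seen))

    visit(tree, 0, frozenset())
    return table
```

The result can also be used to check the inequality PND(PG) ≤ PND(PA) + 1 for every caller PA and every procedure PG callable from it.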



Translation of nested procedures (approach)

• Which instance of a non-local (neither local nor global) variable should be accessed?

Answer (Programming language semantics):
If PI is a procedure incarnation accessing the non-local variable v of a procedure declaration P, choose the variable instance in the static predecessor of PI that belongs to P.

• How are non-local variables addressed?

Answer (Translation technique):

1. Manage all static predecessors of a procedure incarnation.
2. Access the stack frame of the respective static predecessor via the difference between the PND of the current procedure incarnation and the PND of the corresponding static predecessor.



Reference chain of static predecessors

• Each procedure incarnation has a reference to the procedure incarnation of its direct static predecessor (SPR).

• An incarnation is represented by the address of its argument pointer AP.

• The static predecessor reference (SPR) is stored in the stack frame.



Reference chain of static predecessors (2)

Stack frame with static predecessor reference (SPR):



Reference chain of static predecessors (3)

Snapshot of stack for example

The procedure P has no static predecessors.



Relevant aspects for code generation

1. Addressing with static predecessor reference chain:

Let V be a variable with PND(V) = n, i.e. V is declared as a local variable of a procedure P with PND(P) = n. Let RA(V) be the address of V relative to the argument pointer.

Let VA be an application position of V in a procedure Q (≠ P) with PND(Q) = m and m > n.

The address of VA is obtained by dereferencing the static predecessor references m − n times:

\[ \underbrace{M[M[\ldots M[AP] \ldots]]}_{m-n \text{ times}} + RA(V) \]



Relevant aspects for code generation (2)

Remark:
• The difference m − n is known at compile time for each application position of a variable.

• The address of VA can in general not be handled directly by the addressing techniques of the target machine. Instead, separate instructions have to be used that are executed each time the variable is accessed.
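The address computation along the SPR chain can be sketched on a toy memory model. The dictionary used as memory, the concrete addresses and the function name address_via_spr are assumptions for illustration; the only property taken from the text is that the SPR of an incarnation is stored at the address its AP points to.

```python
def address_via_spr(memory, ap, m, n, ra):
    """Address of a variable with PND n and relative address ra, accessed
    from an incarnation with argument pointer ap and PND m > n."""
    frame = ap
    for _ in range(m - n):       # m - n dereferences, known at compile time
        frame = memory[frame]    # M[frame] holds the static predecessor reference
    return frame + ra
```

For an incarnation of R (PND 2) whose static predecessors are an incarnation of Q and an incarnation of P, accessing vp takes two dereferences and accessing vq takes one.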




2. Management of static predecessor reference chain:

Let ∆PND =def PND(caller) - PND(callee). We distinguish two cases:

• ∆PND = -1: The argument pointer of the caller is stored as SPR of the callee.

• ∆PND > -1: Starting from the SPR of the caller, follow the SPR chain for ∆PND steps. The resulting SPR is the SPR of the callee.




The SPR can be set up by the caller before the call, e.g.:

  SUB $sp, $sp, 4
  LI $Ri, APcaller
  SW $Ri, 0, $sp
  [ LW $Ri, 0, $Ri
    SW $Ri, 0, $sp
    ... (∆PND + 1) times in total ]

First, the AP of the caller is pushed onto the stack, then the SPR chain is followed and the stack entry is overwritten with the reference that is found.
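The two cases for setting up the SPR of the callee collapse into one loop when the AP of the caller is taken as the starting point. The following sketch uses the same toy memory model as before (a dictionary in which M[AP] holds the SPR); the function name callee_spr and the addresses in the usage are assumptions.

```python
def callee_spr(memory, caller_ap, pnd_caller, pnd_callee):
    """SPR to be stored in the frame of the callee."""
    delta = pnd_caller - pnd_callee
    if delta < -1:
        raise ValueError("callee is nested too deeply to be callable")
    spr = caller_ap                # delta = -1: the AP of the caller itself
    for _ in range(delta + 1):     # otherwise follow the chain delta + 1 times
        spr = memory[spr]
    return spr
```

When Q calls its local procedure R, the loop body is not executed at all; when R calls P, the chain is followed three times and ends at the marker for "no static predecessor".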




Remarks:

• The SPR chain is relatively easy to implement.

• Addressing of non-local variables can be inefficient.



Static predecessors in stack frames

Observation: The number of static predecessors of a procedure (incarnation) P is known at compile time: PND(P).

Thus: All static predecessors of a procedure incarnation can be stored directly in the stack frame (instead of the SPR chain).

The stack area to store the static predecessors is called local display. Instead of one word for the SPR, we store PND(P) words.



Static predecessors in stack frames (2)

Stack frame with local display:



Static predecessors in stack frames (3)

Snapshot of stack for example with local display



Local display

1. Addressing with local display

Let V, n, RA(V), VA and m be defined as above, with m > n. The address of VA is obtained by:

M[AP − 4 ∗ (m − n)] + RA(V)

2. Management of the local display:

Let ∆PND =def PND(caller) - PND(callee). We distinguish two cases:
1. ∆PND = -1: The display of the callee is the display of the caller plus the AP of the caller.
2. ∆PND > -1: The display of the callee is the display of the caller minus the ∆PND entries of the innermost static predecessors.

Remarks:
• Addressing of non-local variables is more efficient with a local display.
• In general, more memory space on the stack is required.
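The management of the local display can be sketched with the display modelled as a Python list indexed by nesting depth, which abstracts from the concrete frame layout behind the formula M[AP − 4 ∗ (m − n)]. The function names and the addresses in the usage are assumptions for illustration.

```python
def callee_display(caller_display, caller_ap, pnd_caller, pnd_callee):
    """Display of the callee; display[d] is the AP of the static predecessor with PND d."""
    delta = pnd_caller - pnd_callee
    if delta < -1:
        raise ValueError("callee is nested too deeply to be callable")
    if delta == -1:
        return caller_display + [caller_ap]
    # drop the entries of the delta innermost static predecessors
    return caller_display[: len(caller_display) - delta]


def address_via_display(display, n, ra):
    """Address of a variable with PND n: one lookup instead of a chain walk."""
    return display[n] + ra
```

Compared with the SPR chain, every access to a non-local variable needs exactly one lookup, at the price of PND(P) words per frame.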



Static predecessors in global display

Observation: Many entries in the local display are identical.

Goal: Store the display in a global memory region. This memory area is called global display.



Static predecessors in global display (2)

Snapshot of stack for example with global display



Global display

1. Addressing with global display

Addressing with global display is like addressing with local display, but instead of AP the address of the global display is used.

2. Management of the global display:

Problem: The global display is changed on a procedure call if procedures with lower PND are executed that are later called by procedures with higher PND.

Observation: Each procedure incarnation changes at most one component of the global display, namely if PND(caller) - PND(callee) = -1.

Solution: It suffices to save the changed component and to restore it in the epilog of a procedure. For saving the component, a memory word has to be reserved in the stack frame.



Global display (2)

Remarks:

• If there are enough registers, the global display (or parts of it) should be stored in registers.

• For languages that use procedures as parameters, the display technique has to be adapted.

• The different variants for handling nested procedures show typical variation points in compiler design.

The introduced memory management can be seen as a schema that can be adapted for given source and target languages (considering properties of the target machines, e.g. caches).



Summary: Memory Management



Literature

Recommended Reading:

• Wilhelm, Maurer: Sect. 2.9, pp. 31 – 53

