Compiler Construction Code Generation II Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University

Post on 19-Dec-2015

IC compiler

[Figure: compiler pipeline. IC program (ic) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → x86 executable (exe)]

We saw:
  x86 assembly
  Code generation

Today:
  Code generation
  Runtime checks

x86 assembly

Two syntaxes: AT&T syntax and Intel syntax
We'll be using AT&T syntax
Work with GNU Assembler (GAS)

Immediate and register operands

Immediate
  Value specified in the instruction itself
  Preceded by $
  Example: add $4,%esp

Register
  Register name is used
  Preceded by %
  Example: mov %esp,%ebp

Reminder: accessing variables

Use offset from frame pointer (FP)
  Above FP = parameters
  Below FP = locals (and spilled LIR registers)

Examples
  %ebp + 4 = return address
  %ebp + 8 = first parameter
  %ebp - 4 = first local

Stack frame (higher addresses at the top):

    …
    param n
    …
    param 1           FP+8
    Return address    FP+4
    Previous fp       <- FP
    local 1           FP-4
    …
    local n           <- SP
    …

Memory and base displacement operands

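Memory operands
  Obtain value at given address
  Example: mov (%eax), %eax

Base displacement
  Obtain value at computed address
  Syntax: disp(base,index,scale)
  address = base + (index * scale) + displacement
  Example: movl $42, 2(%eax)          # store 42 at address %eax + 2
  Example: movl $42, (%eax,%ecx,4)    # store 42 at address %eax + %ecx*4

The l suffix gives the operand size (32 bits), which the assembler cannot infer when neither operand is a register.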
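
The address arithmetic behind disp(base,index,scale) can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and the register contents (1000 in %eax, 3 in %ecx) are arbitrary example values.

```java
public class EffectiveAddress {
    /**
     * Computes the address denoted by disp(base,index,scale):
     * base + index * scale + disp. On x86 the scale must be 1, 2, 4 or 8.
     */
    static int effectiveAddress(int disp, int base, int index, int scale) {
        if (scale != 1 && scale != 2 && scale != 4 && scale != 8) {
            throw new IllegalArgumentException("scale must be 1, 2, 4 or 8");
        }
        return base + index * scale + disp;
    }

    public static void main(String[] args) {
        int eax = 1000; // assumed example value of %eax
        int ecx = 3;    // assumed example value of %ecx
        System.out.println(effectiveAddress(2, eax, 0, 1));   // 2(%eax)        -> 1002
        System.out.println(effectiveAddress(0, eax, ecx, 4)); // (%eax,%ecx,4)  -> 1012
    }
}
```

With 4-byte array elements, (%eax,%ecx,4) is the natural way to address element %ecx of an array whose first element is at %eax.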

Reminder: accessing variables

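Use offset from frame pointer (FP)
  Above FP = parameters
  Below FP = locals (and spilled LIR registers)

Examples (the first parameter holds the value 572)
  %ebp + 8 = address of the first parameter
  If %eax = %ebp + 8, then (%eax) = the value 572
  8(%ebp) = the value 572

Stack frame (higher addresses at the top):

    …
    param n
    …
    572 (param 1)     FP+8, pointed to by %eax
    Return address
    Previous fp       <- FP
    local 1           FP-4
    …
    local n           <- SP
    …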

LIR to assembly

Need to know how to translate:
  Function bodies
    Translation for each kind of LIR instruction
    Calling sequences
    Correctly access parameters and variables
    Compute offsets for parameters and variables
  Dispatch tables
  String literals
  Runtime checks
  Error handlers

Translating LIR instructions

Translate function bodies:
1. Compute offsets for:
   Local variables (-4,-8,-12,…)
   LIR registers (considered extra local variables)
   Function parameters (+8,+12,+16,…); take the this parameter into account
2. Translate instruction list for each function
   Local translation for each LIR instruction
   Local (machine) register allocation

Memory offsets implementation

    // MethodLayout instance per function declaration
    class MethodLayout {
        // Maps variables/parameters/LIR registers to
        // offsets relative to frame pointer (BP)
        Map<Memory,Integer> memoryToOffset;
    }

IC source:

    void foo(int x, int y) {
        int z = x + y;
        g = z; // g is a field
        Library.printi(z);
    }

A virtual function takes one extra parameter: this

(manual) LIR translation (PA4):

    _A_foo:
        Move x,R0
        Add y,R0
        Move R0,z
        Move this,R1
        MoveField R0,R1.1
        Library __printi(R0),Rdummy

MethodLayout for foo (PA5):

    Memory    Offset
    this      +8
    x         +12
    y         +16
    z         -4
    R0        -8
    R1        -12
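
One way the offset table for foo could be computed is sketched below. It assumes 4-byte stack slots and uses plain strings as keys instead of the Memory type shown on the slide; the class name MethodLayoutSketch and the computeOffsets and frameSize helpers are hypothetical, not part of the course code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MethodLayoutSketch {
    static final int WORD = 4; // bytes per stack slot on 32-bit x86

    /** Maps each name to its offset relative to the frame pointer. */
    static Map<String, Integer> computeOffsets(boolean isVirtual, List<String> params,
                                               List<String> locals, List<String> lirRegisters) {
        Map<String, Integer> memoryToOffset = new LinkedHashMap<>();
        // FP+0 holds the previous frame pointer and FP+4 the return address,
        // so the first parameter lives at FP+8.
        int up = 2 * WORD;
        if (isVirtual) {            // virtual functions receive 'this' first
            memoryToOffset.put("this", up);
            up += WORD;
        }
        for (String p : params) {
            memoryToOffset.put(p, up);
            up += WORD;
        }
        // Locals grow downwards from FP-4; LIR registers are extra locals.
        int down = -WORD;
        for (String v : locals) {
            memoryToOffset.put(v, down);
            down -= WORD;
        }
        for (String r : lirRegisters) {
            memoryToOffset.put(r, down);
            down -= WORD;
        }
        return memoryToOffset;
    }

    /** Number of bytes the prologue must reserve with 'sub $n,%esp'. */
    static int frameSize(List<String> locals, List<String> lirRegisters) {
        return WORD * (locals.size() + lirRegisters.size());
    }

    public static void main(String[] args) {
        List<String> params = Arrays.asList("x", "y");
        List<String> locals = Arrays.asList("z");
        List<String> regs = Arrays.asList("R0", "R1");
        // Prints {this=8, x=12, y=16, z=-4, R0=-8, R1=-12}
        System.out.println(computeOffsets(true, params, locals, regs));
        // Prints sub $12,%esp
        System.out.println("sub $" + frameSize(locals, regs) + ",%esp");
    }
}
```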

Memory offsets example

MethodLayout for foo:

    Memory    Offset
    this      +8
    x         +12
    y         +16
    z         -4
    R0        -8
    R1        -12

LIR translation:

    _A_foo:
        Move x,R0
        Add y,R0
        Move R0,z
        Move this,R1
        MoveField R0,R1.1
        Library __printi(R0),Rdummy

Translation to x86 assembly:

    _A_foo:
        push %ebp             # prologue
        mov %esp,%ebp
        sub $12, %esp
        mov 12(%ebp),%eax     # Move x,R0
        mov %eax,-8(%ebp)
        mov 16(%ebp),%eax     # Add y,R0
        add -8(%ebp),%eax
        mov %eax,-8(%ebp)
        mov -8(%ebp),%eax     # Move R0,z
        mov %eax,-4(%ebp)
        mov 8(%ebp),%eax      # Move this,R1
        mov %eax,-12(%ebp)
        mov -8(%ebp),%eax     # MoveField R0,R1.1
        mov -12(%ebp),%ebx
        mov %eax,4(%ebx)
        mov -8(%ebp),%eax     # Library __printi(R0)
        push %eax
        call __printi
        add $4,%esp
    _A_foo_epilogoue:
        mov %ebp,%esp         # epilogue
        pop %ebp
        ret

Calls/returns

Direct function call syntax: call name
Example: call __println

Return instruction: ret

Handling functions

Need to implement call sequence

Caller code:
  Pre-call code:
    Push caller-save registers
    Push parameters
  Call (special treatment for virtual function calls)
  Post-call code:
    Copy returned value (if needed)
    Pop parameters
    Pop caller-save registers

Callee code:
  Each function has prologue and epilogue

Call sequences

caller: Caller push code
  Push caller-save registers
  Push actual parameters (in reverse order)

call
  push return address
  Jump to call address

callee: Callee push code (prologue)
  Push current base-pointer
  bp = sp
  Push local variables
  Push callee-save registers

callee: Callee pop code (epilogue)
  Pop callee-save registers
  Pop callee activation record
  Pop old base-pointer

return
  pop return address
  Jump to address

caller: Caller pop code
  Copy returned value
  Pop parameters
  Pop caller-save registers

Translating static calls

LIR code:

    StaticCall _A_foo(a=R1,b=5,c=x),R3

x86 translation:

    # push caller-saved registers
    # (only if the value stored in these registers is needed by the caller)
    push %eax
    push %ecx
    push %edx

    # push parameters (in reverse order)
    mov -4(%ebp),%eax     # push x
    push %eax
    push $5               # push 5
    mov -8(%ebp),%eax     # push R1
    push %eax

    call _A_foo

    # store returned value in R3
    # (only if return register is not Rdummy)
    mov %eax,-16(%ebp)

    # pop parameters (3 params*4 bytes = 12)
    add $12,%esp

    # pop caller-saved registers
    # (only if the value stored in these registers is needed by the caller)
    pop %edx
    pop %ecx
    pop %eax
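
As a rough sketch, the static call sequence could be produced by a small emitter like the one below. StaticCallEmitter and emitStaticCall are hypothetical names; the operand strings are assumed to be already-resolved x86 operands (frame offsets or immediates), and saving caller-saved registers is left out because it is needed only when their values are still needed by the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaticCallEmitter {
    /**
     * Emits AT&T-syntax code for a static call.
     *
     * @param target        label of the callee, e.g. "_A_foo"
     * @param argOperands   actual parameters in declaration order, each either an
     *                      immediate ("$5") or a memory operand ("-8(%ebp)")
     * @param resultOperand memory operand that receives the result,
     *                      or null when the return register is Rdummy
     */
    static List<String> emitStaticCall(String target, List<String> argOperands,
                                       String resultOperand) {
        List<String> out = new ArrayList<>();
        // Parameters are pushed in reverse order so that the first one ends up at FP+8.
        for (int i = argOperands.size() - 1; i >= 0; i--) {
            String arg = argOperands.get(i);
            if (arg.startsWith("$")) {
                out.add("push " + arg);
            } else {
                out.add("mov " + arg + ",%eax"); // load through a scratch register
                out.add("push %eax");
            }
        }
        out.add("call " + target);
        if (resultOperand != null) {
            out.add("mov %eax," + resultOperand); // returned value arrives in %eax
        }
        if (!argOperands.isEmpty()) {
            out.add("add $" + (4 * argOperands.size()) + ",%esp"); // pop parameters
        }
        return out;
    }

    public static void main(String[] args) {
        // StaticCall _A_foo(a=R1,b=5,c=x),R3 with R1 at -8, x at -4 and R3 at -16
        List<String> code = emitStaticCall("_A_foo",
                Arrays.asList("-8(%ebp)", "$5", "-4(%ebp)"), "-16(%ebp)");
        for (String line : code) {
            System.out.println(line);
        }
    }
}
```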

Virtual functions

Indirect call: call *(Reg)
Example: call *(%eax)
Used for virtual function calls
  Dispatch table lookup
  Passing/receiving the this variable

Translating virtual calls

LIR code:

    VirtualCall R1.2(b=5,c=x),R3

Object layout: R1 points to an object whose first word is DVPtr, followed by the fields x and y. DVPtr points to the dispatch table:

    _DV_A:
    0    _A_rise
    1    _A_shine
    2    _A_twinkle

x86 translation:

    # push caller-saved registers
    push %eax
    push %ecx
    push %edx

    # push parameters
    mov -4(%ebp),%eax     # push x
    push %eax
    push $5               # push 5

    # Find address of virtual method and call it
    mov -8(%ebp),%eax     # load this
    push %eax             # push this
    mov 0(%eax),%eax      # Load dispatch table address
    call *8(%eax)         # Call table entry 2 (2*4=8)

    mov %eax,-12(%ebp)    # store returned value in R3

    # pop parameters (2 params+this * 4 bytes = 12)
    add $12,%esp

    # pop caller-saved registers
    pop %edx
    pop %ecx
    pop %eax

Function prologue/epilogue

    _A_foo:
    # prologue
    push %ebp
    mov %esp,%ebp

    # push local variables of foo
    sub $12,%esp          # 3 local vars+regs * 4 = 12

    # push callee-saved registers
    push %ebx
    push %esi
    push %edi

    # ... function body ...

    _A_foo_epilogoue:     # extra label for each function

    # pop callee-saved registers
    pop %edi
    pop %esi
    pop %ebx

    # epilogue: pop local variables and restore the previous frame pointer
    mov %ebp,%esp
    pop %ebp
    ret

Pushing and popping the callee-saved registers is optional: it is needed only if the register allocation optimization is used (in PA5), and only if these registers will be modified by the callee.

Representing dispatch tables

file.ic:

    class A {
        void sleep() {…}
        void rise() {…}
        void shine() {…}
        static void foo() {…}
    }
    class B extends A {
        void rise() {…}
        void shine() {…}
        void twinkle() {…}
    }

file.lir (PA4):

    _DV_A: [_A_sleep,_A_rise,_A_shine]
    _DV_B: [_A_sleep,_B_rise,_B_shine,_B_twinkle]

file.s (PA5):

    # data section
    .data
        .align 4
    _DV_A:
        .long _A_sleep
        .long _A_rise
        .long _A_shine
    _DV_B:
        .long _A_sleep
        .long _B_rise
        .long _B_shine
        .long _B_twinkle
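
The tables for A and B follow a simple rule: copy the parent's table, overwrite the slots of overridden methods, and append new virtual methods at the end. The sketch below illustrates that rule under the assumption that static methods such as foo are simply not passed in; DispatchTableSketch and its helpers are hypothetical names, and the 4-byte slot size matches the .long entries above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DispatchTableSketch {
    /**
     * Builds a dispatch table as an ordered map from method name to label.
     * Re-inserting an existing key into a LinkedHashMap keeps its position,
     * so an overriding method reuses the slot of the method it overrides.
     */
    static Map<String, String> buildTable(String className, Map<String, String> parentTable,
                                          List<String> virtualMethods) {
        Map<String, String> table = new LinkedHashMap<>(parentTable);
        for (String method : virtualMethods) {
            table.put(method, "_" + className + "_" + method);
        }
        return table;
    }

    /** Labels in slot order, i.e. the contents of the _DV_ vector. */
    static List<String> labels(Map<String, String> table) {
        return new ArrayList<>(table.values());
    }

    /** Byte offset used by the indirect call, e.g. call *12(%eax); -4 if not found. */
    static int byteOffset(Map<String, String> table, String method) {
        return 4 * new ArrayList<>(table.keySet()).indexOf(method);
    }

    /** Renders the table in the file.s format. */
    static String toAssembly(String className, Map<String, String> table) {
        StringBuilder sb = new StringBuilder("_DV_" + className + ":\n");
        for (String label : table.values()) {
            sb.append("    .long ").append(label).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> a = buildTable("A", new LinkedHashMap<String, String>(),
                Arrays.asList("sleep", "rise", "shine"));
        Map<String, String> b = buildTable("B", a,
                Arrays.asList("rise", "shine", "twinkle"));
        System.out.print(toAssembly("A", a));
        System.out.print(toAssembly("B", b));
        System.out.println(byteOffset(b, "twinkle")); // slot 3 -> prints 12
    }
}
```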

Runtime checks

Insert code to check for attempts to perform illegal operations:
  Null pointer check
    MoveField, MoveArray, ArrayLength, VirtualCall
    Reference arguments to library functions should not be null
  Array bounds check
  Array allocation size check
  Division by zero

If a check fails, jump to error handler code that prints a message and gracefully exits the program

Null pointer check

    # null pointer check
    cmp $0,%eax
    je labelNPE

    labelNPE:
    push $strNPE          # error message
    call __println
    push $1               # error code
    call __exit

Single generated handler for entire program

Array bounds check

    # array bounds check
    mov -4(%eax),%ebx     # ebx = length
    mov $0,%ecx           # ecx = index
    cmp %ecx,%ebx
    jle labelABE          # ebx <= ecx ?
    cmp $0,%ecx
    jl labelABE           # ecx < 0 ?

    labelABE:
    push $strABE          # error message
    call __println
    push $1               # error code
    call __exit

Single generated handler for entire program

Array allocation size check

    # array size check
    cmp $0,%eax           # eax == array size
    jle labelASE          # eax <= 0 ?

    labelASE:
    push $strASE          # error message
    call __println
    push $1               # error code
    call __exit

Single generated handler for entire program

Division by zero check

    # division by zero check
    cmp $0,%eax           # eax is divisor
    je labelDBE           # eax == 0 ?

    labelDBE:
    push $strDBE          # error message
    call __println
    push $1               # error code
    call __exit

Single generated handler for entire program
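
All four checks share one shape: a compare, a conditional jump, and one handler per error kind emitted once per program. The sketch below shows how a code generator might produce them as strings. RuntimeCheckEmitter and its method names are hypothetical; the label and string names (labelNPE, strNPE and so on) are the ones used on the slides, and the registers are passed in by the caller.

```java
import java.util.Arrays;
import java.util.List;

public class RuntimeCheckEmitter {
    /** Jump to the null-pointer handler if the reference in reg is 0. */
    static List<String> nullCheck(String reg) {
        return Arrays.asList("cmp $0," + reg, "je labelNPE");
    }

    /** Jump to the division handler if the divisor in reg is 0. */
    static List<String> divisionByZeroCheck(String divisorReg) {
        return Arrays.asList("cmp $0," + divisorReg, "je labelDBE");
    }

    /** Jump to the allocation handler if the requested size is <= 0. */
    static List<String> arraySizeCheck(String sizeReg) {
        return Arrays.asList("cmp $0," + sizeReg, "jle labelASE");
    }

    /**
     * Bounds check, with the array length stored in the word just before the
     * first element (as in the slide): fail if length <= index or index < 0.
     */
    static List<String> boundsCheck(String arrayReg, String indexReg, String lengthReg) {
        return Arrays.asList(
                "mov -4(" + arrayReg + ")," + lengthReg,
                "cmp " + indexReg + "," + lengthReg,
                "jle labelABE",
                "cmp $0," + indexReg,
                "jl labelABE");
    }

    /** One handler per error kind, emitted once for the entire program. */
    static List<String> handler(String label, String messageLabel) {
        return Arrays.asList(
                label + ":",
                "push $" + messageLabel, // error message
                "call __println",
                "push $1",               // error code
                "call __exit");
    }

    public static void main(String[] args) {
        for (String line : boundsCheck("%eax", "%ecx", "%ebx")) {
            System.out.println(line);
        }
        for (String line : handler("labelABE", "strABE")) {
            System.out.println(line);
        }
    }
}
```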

Optimizations

More efficient register allocation for statements
  Allocate machine registers during translation

Eliminate unnecessary labels and jumps
  Post-translation pass

Optimizing labels/jumps

If we have subsequent labels:

    _label1:
    _label2:

We can merge labels and redirect jumps to the merged label
  After translation (easier)
  Map old labels to new labels

If we have

    jump label1
    _label1:

Can eliminate jump

Eliminate labels not mentioned by any instruction
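
A post-translation pass along these lines might look like the sketch below. It is an illustration under simplifying assumptions: each list element is one trimmed line without comments, labels are lines ending in a colon, any mnemonic starting with j is a jump, and call targets also count as label uses. LabelJumpOptimizer and the keep parameter (labels that must survive, such as exported symbols) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LabelJumpOptimizer {
    static boolean isLabel(String line) { return line.endsWith(":"); }
    static String labelName(String line) { return line.substring(0, line.length() - 1); }
    static boolean isReference(String line) {
        return !isLabel(line) && (line.startsWith("j") || line.startsWith("call "));
    }
    static String target(String line) { return line.substring(line.indexOf(' ') + 1).trim(); }

    static List<String> optimize(List<String> code, Set<String> keep) {
        // Pass 1: merge each run of consecutive labels into its first label.
        Map<String, String> rename = new HashMap<>();
        List<String> merged = new ArrayList<>();
        String runHead = null;
        for (String line : code) {
            if (!isLabel(line)) {
                runHead = null;
                merged.add(line);
            } else if (runHead == null || keep.contains(labelName(line))) {
                if (runHead == null) runHead = labelName(line);
                merged.add(line);
            } else {
                rename.put(labelName(line), runHead); // old label -> merged label
            }
        }
        // Pass 2: redirect jumps and calls to the merged labels.
        List<String> redirected = new ArrayList<>();
        for (String line : merged) {
            if (isReference(line)) {
                String op = line.substring(0, line.indexOf(' '));
                String t = target(line);
                redirected.add(op + " " + rename.getOrDefault(t, t));
            } else {
                redirected.add(line);
            }
        }
        // Pass 3: drop an unconditional jump to the label that directly follows it.
        List<String> noJumps = new ArrayList<>();
        for (int i = 0; i < redirected.size(); i++) {
            String line = redirected.get(i);
            boolean jumpToNext = line.startsWith("jmp ") && i + 1 < redirected.size()
                    && redirected.get(i + 1).equals(target(line) + ":");
            if (!jumpToNext) noJumps.add(line);
        }
        // Pass 4: drop labels that no instruction mentions (unless protected).
        Set<String> used = new HashSet<>(keep);
        for (String line : noJumps) {
            if (isReference(line)) used.add(target(line));
        }
        List<String> result = new ArrayList<>();
        for (String line : noJumps) {
            if (isLabel(line) && !used.contains(labelName(line))) continue;
            result.add(line);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
                "jmp _l1", "_l1:", "_l2:", "mov $0,%eax", "je _l2", "_l3:", "ret");
        // Prints [_l1:, mov $0,%eax, je _l1, ret]
        System.out.println(optimize(code, new HashSet<String>()));
    }
}
```

Running the passes in this order matters: merging first means a jump to the second label of a run is recognised as a jump to the next line once it has been redirected.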

Optimizing register allocation

Goal: associate machine registers with LIR registers as much as possible

Optimization is done only for the sequence of instructions translated from a single statement

See more details on the web site

Hello world example

    class Library {
        void println(string s);
    }

    class Hello {
        static void main(string[] args) {
            Library.println("Hello world!");
        }
    }

Assembly file structure

    .title "hello.ic"                 # header

    # global declarations
    .global __ic_main                 # symbol exported to linker

    # data section: statically-allocated data (string literals and dispatch tables)
    .data
        .align 4
        .int 13                       # string length in bytes
    str1: .string "Hello world\n"

    # text (code) section: method bodies and error handlers
    .text

    #----------------------------------------------------   (comment)
    .align 4
    __ic_main:
        push %ebp                     # prologue
        mov %esp,%ebp

        push $str1                    # print(...)
        call __print
        add $4, %esp

        mov $0,%eax                   # return 0

        mov %ebp,%esp                 # epilogue
        pop %ebp
        ret

Assembly file structure

Notes on the hello.ic listing above:
  Immediates have $ prefix
  Register names have % prefix
  Comments using #
  Labels end with the (standard) :

Instruction by instruction:
  push %ebp; mov %esp,%ebp: prologue, save ebp and set it to be esp
  push $str1: push print parameter
  call __print: call print
  add $4, %esp: pop parameter
  mov $0,%eax: store return value of main in eax
  mov %ebp,%esp; pop %ebp: epilogue, restore esp and ebp (pop)

From assembly to executable

[Figure: build pipeline. IC program (prog.ic) → compiler (Lexical Analysis → Syntax Analysis / Parsing → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation) → x86 assembly (prog.s) → GNU assembler → prog.o → GNU linker, together with libic.a (libic + gc) → prog.exe]

Can automate compilation + assembling + linking with a script / Ant