type-based verification of assembly language for compiler debugging bor-yuh evan changadam chlipala...

26
Type-Based Verification of Type-Based Verification of Assembly Language for Assembly Language for Compiler Debugging Compiler Debugging Bor-Yuh Evan Chang Adam Chlipala George C. Necula Robert R. Schneck University of California, Berkeley TLDI 2005 Long Beach, California January 10, 2005

Upload: justina-stevenson

Post on 17-Dec-2015

223 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

Type-Based Verification of Type-Based Verification of Assembly Language for Compiler Assembly Language for Compiler

DebuggingDebugging

Bor-Yuh Evan Chang Adam ChlipalaGeorge C. Necula Robert R. Schneck

University of California, Berkeley

TLDI 2005Long Beach, California

January 10, 2005

Page 2: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

2

Problem and MotivationProblem and Motivation

• Type check assembly codeassembly code produced by existingexisting compilers for a Java-like language

• Validate that certifying compilation aids compiler debugging– using undergraduate compilers course as a

test bed

• Introduce undergraduates to using compiler technology for software quality

Page 3: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

3

Problem and MotivationProblem and Motivation• Starting Point: Bytecode verification

– Effective, simple, and very accessible– But does not work for assembly code

• Why assembly code?– Native code compilers are more complex– Also debug JITs

• Why existing compilers?– Force to make the verifier flexible– Better accessibility to students (i.e., require little

or no annotations)– More test cases to validate “certified compilation

for debugging”

Page 4: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

4

Debugging Compilers with Debugging Compilers with VerificationVerification

Run

Compile Run

Compile

r Test

Case

Compile

r Test

Case

Compiled

Program Test

Inputs

Stressed Compilers Student

Compile Verify

Compile

r Test

Case

Compile

r Test

Case

Relaxed Compilers Student

Ð^K^V^H^P^L^V^H0^L^V^Hp^L^V^H°^L^V^Hð^L^V^H^P^

M^V^H0^M^V^Hp^M^V^H^Ð^M^V^Hð^

M^V^

Ð^K^V^H^P^L^V^H0^L^V^Hp^L^V^H°^L^V^Hð^L^V^H^P^

M^V^H0^M^V^Hp^M^V^H^Ð^M^V^Hð^

M^V^

Line 2147: Type error – argument -8(SP0) :

unknown, which is not <= nonnull String.

Line 2147: Type error – argument -8(SP0) :

unknown, which is not <= nonnull String.

Page 5: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

5

ChallengesChallengesclass P {

int f;int m() { … }

}class C extends P

{int m() { … }

}

…P p = new P();P c = new C();c.m();…

invokevirtual P.m()

branch (= rc 0) Labort

rtmp := m[rc + 8]

rtmp := m[rtmp +

12]

rarg0 := rc

rra := &Lret

jump [rtmp]

Lret:

invokevirtual P.m()

Page 6: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

6

ChallengesChallengesclass P {

int f;int m() { … }

}class C extends P

{int m() { … }

}

…P p = new P();P c = new C();c.m();…

invokevirtual P.m()

branch (= rc 0) Labort

rtmp := m[rc + 8]

rtmp := m[rtmp + 12]

rarg0 := rc

rra := &Lret

jump [rtmp]

Lret:

hrc : P, … i

hrc : nonnull P, …i

hrtmp : disp(P), …i

hrtmp : meth(P,12), …i

hrarg0 : P, …i

hrrv : int, …i

Page 7: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

7

ChallengesChallengesclass P {

int f;int m() { … }

}class C extends P

{int m() { … }

}

…P p = new P();P c = new C();c.m();…

invokevirtual P.m()

branch (= rc 0) Labort

rtmp := m[rc + 8]

rtmp := m[rtmp + 12]

rarg0 := rrpp

rra := &Lret

jump [rtmp]

Lret:

hrc : P, … i

hrc : nonnull P, …i

hrtmp : disp(P), …i

hrtmp : meth(P,12), …i

hrarg0 : PP, …i

hrrv : int, …i

unsoundunsound

Page 8: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

8

ChallengesChallengesclass P {

int f;int m() { … }

}class C extends

P {int m() { … }

}

…P p = new P();P c = new C();int f = p.f;p.m();x = f + 1;

branch (= rp 0) Labort

rtmp := rp + 12

rf := m[rtmp]

branch (= rp 0) Labort

rtmp := m[rp + 8]

rtmp := m[rtmp + 12]

rarg0 := rp

rra := &Lret

jump [rtmp]

Lret:rx := rf + 1

reorderingand

optimization

Page 9: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

9

ChallengesChallenges

branch (= rp 0) Labort

rtmp := rp + 12

rf := m[rtmp]

rx := rf + 1

branch (= rp 0) Labort

rtmp := m[rtmp - 4]

rtmp := m[rtmp + 12]

rarg0 := rp

rra := &Lret

jump [rtmp]

Lret:

hrtmp : int ptr, … i

“funny”pointer

arithmetic

class P {int f;int m() { … }

}class C extends

P {int m() { … }

}

…P p = new P();P c = new C();int f = p.f;p.m();x = f + 1;

Page 10: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

10

OutlineOutline

• Overview• Abstract State

– Types– Join algorithm

• Verification Procedure• Educational Experience• Concluding Remarks

Page 11: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

11

OverviewOverview

• To maximize accessibility (i.e., minimize annotations)– infer types at intermediate program

points using abstract interpretation– willingness to have a limited amount of

specialization to a compilation strategy• To handle assembly code

– use (simple) dependent types• To be less sensitive to the compilation

strategy– assign types to intermediate values lazily

Page 12: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

12

Abstract StateAbstract State

• Keep intermediate expressions in symbolic form

• Symbolic valuesSymbolic values abstract irrelevant expressions keeping only type information

• Intermediate expressions track many dependencies between registers

Page 13: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

13

Why Symbolic Values?Why Symbolic Values?

• Used to track dependencies between registers

• Value state form convenient for abstract interpretation

r1 = r2 Æ r1 = r3 + 4

r1 = + 4, r2 = + 4, r3 =

r1 := r2

[r1 ! (r2)]

Page 14: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

14

ValuesValues

• Define a normalization of expressions to values

• Values give the expressions of interest– For our case, address computation and

comparisons with constants– Would vary depending on the source language or

compiler of interest– Not all equal expressions normalize to the same

value, so value forms must be chosen carefully

• Registers are equal if they map to the same values (after normalization)

Page 15: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

15

TypesTypes

• Primitive Types and Object References

• Types for Run-time Structures

Tracks that the value is the dispatch table

of object

Page 16: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

16

JoinJoin

h r1 = +4, r2 = +4, r3 = D : disp(), : nonnull exactly Ci

h r1 = +4, r2 = +8, r3 = D : disp(), : nonnull Ci

h r1 = +4, r2 = , r3 = D : disp(), : nonnull C, : >i (,) !

(,) !

Page 17: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

17

JoinJoin

• Joining of type state basically given by subtyping– subtyping lattice still fairly simple – dictated

mostly by the class inheritance hierachy• Values are (fancy) labels for equivalence

classes of registers– lattice of conjunctions of equalities on

registers– first normalize expressions to values for join– each distinct pair of values in the input state

gives an equivalence class in the result state

Page 18: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

18

OutlineOutline

• Overview• Abstract State

– Types– Join algorithm

• Verification Procedure• Educational Experience• Concluding Remarks

Page 19: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

19

Verification ExampleVerification Exampleclass P {

int f;int m() { … }

}class C extends

P{ … }

branch (= rp 0) Labort

rtmp := rp + 12

rf := m[rtmp]

rx := rf + 1

rtmp := m[rtmp - 4]

rtmp := m[rtmp + 12]

rarg0 := rp

rra := &Lret

jump [rtmp]

Lret:

Typing Rule

hrtmp = + 12 D …i

hrp = D : Pi

h… D : nonnull Pi

hrf = D : inti

hrx = + 1 D : inti

Page 20: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

20

Verification ExampleVerification Exampleclass P {

int f;int m() { … }

}class C extends

P{ … }

branch (= rp 0) Labort

rtmp := rp + 12

rf := m[rtmp]

rx := rf + 1

rtmp := m[rtmp - 4]

rtmp := m[rtmp + 12]

rarg0 := rp

rra := &Lret

jump [rtmp]

Lret:

hrtmp = + 12 D …i

hrp = D : Pi

h… D : nonnull Pi

Typing Rule

hrf = D : inti

hrx = + 1 D : inti

hrtmp = D : disp()i

hrtmp = D : meth(,12)i

hrarg0 = D …i

hrrv = D : inti

Calling meth(,12)

checks rarg0 =

+ 12 – 4 + 8

Page 21: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

21

OutlineOutline

• Overview• Abstract State

– Types– Join algorithm

• Verification Procedure• Educational Experience• Concluding Remarks

Page 22: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

22

Comparing Compiler DevelopmentComparing Compiler Development

• Good compilations increase: 53% to 75%

0%

20%

40%

60%

80%

100%

2002 2003 2004

test passed, type safe(observably correct)

test passed, type safe butCoolaid failed (incompleteness)

test passed, type error (hiddentype error)

test failed, type error (visible typeerror)

test failed, type safe (semanticerror)

Without CoolaidWithout Coolaid With CoolaidWith Coolaid

Page 23: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

23

0%

20%

40%

60%

80%

100%

2002 2003 2004

test passed, type safe(observably correct)

test passed, type safe butCoolaid failed (incompleteness)

test passed, type error (hiddentype error)

test failed, type error (visible typeerror)

test failed, type safe (semanticerror)

Comparing Compiler DevelopmentComparing Compiler Development

• Type errors decrease: 44% to 19%– even 29% to 15% if only consider those that are visible

in testing

Without CoolaidWithout Coolaid With CoolaidWith Coolaid

Page 24: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

24

0%

20%

40%

60%

80%

100%

2002 2003 2004

test passed, type safe(observably correct)

test passed, type safe butCoolaid failed (incompleteness)

test passed, type error (hiddentype error)

test failed, type error (visible typeerror)

test failed, type safe (semanticerror)

Comparing Compiler DevelopmentComparing Compiler Development

• False alarms: 6% to 1%– price for ease of use

Without CoolaidWithout Coolaid With CoolaidWith Coolaid

Page 25: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

25

0.0%

5.0%

10.0%

15.0%

20.0%

25.0%

30.0%

35.0%

40.0%

45.0%

0–29 30–34 35–39 40–44 45–49

number of tests passed

perc

enta

ge o

f co

mpi

lers

2002 (without Coolaid)

2003 (without Coolaid)

2004 (with Coolaid)

Student PerspectiveStudent Perspective

• 0: “counter-productive” 6: “can’t imagine being without it”

• A favorite comment:“I would be totally lost without Coolaid. I learn best when I am using it hands-on …. I was able to really understand stack conventions and optimizations and to appreciate them.”

0

5

10

15

20

25

0 1 2 3 4 5 6

usefulness rating

num

ber

of g

roup

s

• Traditional testing procedure– 49 test cases / inputs

• Mean scores:– 33 (67%) in 2002– 34 (69%) in 2003– 39 (79%) in 2004

Page 26: Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

1/10/2005 Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging

26

ConclusionConclusion

• Extended data-flow based type inference of intermediate languages to assembly code– Able to retrofit existing compilers– Minimal annotations for accessibility

• Provided experimental confirmation that certifying compilation helps early compiler debugging

• Introduced undergraduates to the idea of improving software quality with program analysis