harry xu may 2012 - donald bren school of information and...

Harry Xu May 2012

Complex, concurrent software

Precision (no false positives): find real bugs in real executions

Need to modify the JVM (e.g., object layout, GC, or ISA-level code)

Need to demonstrate realism (usually performance)

Otherwise use RoadRunner, BCEL, Pin, LLVM, …

Keeping track of stuff as the program executes?

    Change application behavior (add instrumentation)
    Store per-object/per-field metadata
    Piggyback on GC
    Uninterruptible code

JVM written in Java?!

Jikes RVM resources: Guide, Research Archive, Research mailing list

Jikes RVM source code (includes the dynamic compilers and the boot image writer)

    Run the boot image writer with another JVM
    Result: boot image (native code + initial heap space)

Build configurations:

    BaseBase (prototype)
    BaseAdaptive (prototype-opt)
    FullAdaptive (development)
    FastAdaptive (production)

Configurations differ in purpose: testing, faster builds, faster runs, performance

Edit with Eclipse (see Guide)

Change application behavior (add instrumentation)

Baseline compiler: bytecode -> native code

    Each bytecode -> several x86 instructions (BaselineCompilerImpl.java)

Profiling -> adaptive optimization system -> optimizing compiler -> (faster) native code

Optimizing compiler: bytecode -> HIR -> LIR -> MIR -> (faster) native code

    HIR: resembles bytecode
    LIR: resembles typical compiler IR (3-address code)
    MIR: resembles assembly code
    Opt levels: 0, 1, 2 (even faster native code)

ExpandRuntimeServices.java

    Add instrumentation at reads, writes, allocation, synchronization
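To make the "each bytecode becomes several instructions" idea concrete, here is a minimal sketch of a template-style compiler that also inserts instrumentation calls at reads, writes, and allocation. Everything in it is hypothetical: the `Op` enum, the pseudo-instruction strings, and the `onRead`/`onWrite`/`onAlloc` hooks are illustration only and are not Jikes RVM classes or APIs.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyBaselineCompiler {
    // Hypothetical mini bytecode set, not the real JVM opcode list.
    enum Op { LOAD_LOCAL, GETFIELD, PUTFIELD, NEW, RETURN }

    // Expand each bytecode into a fixed template of pseudo-instructions,
    // adding an instrumentation call at reads, writes, and allocation.
    static List<String> compile(List<Op> bytecodes) {
        List<String> out = new ArrayList<>();
        for (Op op : bytecodes) {
            switch (op) {
                case LOAD_LOCAL:
                    out.add("push [fp+local]");
                    break;
                case GETFIELD:
                    out.add("call onRead");          // instrumentation hook
                    out.add("pop obj");
                    out.add("push [obj+field]");
                    break;
                case PUTFIELD:
                    out.add("call onWrite");         // instrumentation hook
                    out.add("pop val");
                    out.add("pop obj");
                    out.add("mov [obj+field], val");
                    break;
                case NEW:
                    out.add("call allocate");
                    out.add("call onAlloc");         // instrumentation hook
                    out.add("push result");
                    break;
                case RETURN:
                    out.add("ret");
                    break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Op> method = List.of(Op.LOAD_LOCAL, Op.GETFIELD, Op.RETURN);
        compile(method).forEach(System.out::println);
    }
}
```

A real baseline compiler emits machine code rather than strings, but the shape is the same: one switch over opcodes, with the instrumentation added inside the cases of interest.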

Store per-object/per-field metadata

Object layout (drawn from low address to high address)

    Scalar object: header plus field0, field1, field2
    Header: type info block, locking & GC
    Array: header plus array length, elem0, elem1
    The object reference points at a fixed position in this layout

Steal bits from the header

Add a misc word to the header (MiscHeader.java), e.g. to hold a counter

Magic! Compiles down to three x86 instructions

Gotcha: can't actually use the LSB of the leftmost word

What's the problem with this code? [The slide code is not preserved in this transcript; only the literal 2 survives.]

Leftmost word: not used

What if GC moves the object? What if GC collects the object?
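The following sketch shows one way a counter could share a header word whose least significant bit is off limits. It is an assumption-laden toy: the word is modeled as a plain `int`, the reserved-bit layout is invented for illustration, and none of the names come from MiscHeader.java.

```java
public class MiscWordCounter {
    // Assumption for illustration: bit 0 of the word is reserved and must
    // be preserved; the counter occupies bits 1..31.
    static final int RESERVED_MASK = 0x1;
    static final int COUNTER_UNIT = 2; // adding 2 bumps the counter by one

    // Adding an even number never changes bit 0, so the reserved bit survives.
    static int increment(int word) {
        return word + COUNTER_UNIT;
    }

    static int counter(int word) {
        return word >>> 1;
    }

    static int reserved(int word) {
        return word & RESERVED_MASK;
    }

    public static void main(String[] args) {
        int word = 1; // reserved bit set, counter zero
        for (int i = 0; i < 3; i++) {
            word = increment(word);
        }
        System.out.println(counter(word));  // prints 3
        System.out.println(reserved(word)); // prints 1
    }
}
```

The alternative named on the slides is simpler still: leave the problematic word unused and keep the metadata in a separate word.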

Piggyback on GC

GC tracing loop, extended to trace the misc header word:

    // Initially worklist populated with roots
    while worklist has elements
        Object obj = worklist.pop()
        foreach reference field obj.f
            obj.f = markAndPossiblyCopy(obj.f)
            worklist.push(obj.f)
        // Added: treat the misc word like another reference field
        obj.misc = markAndPossiblyCopy(obj.misc)
        worklist.push(obj.misc)

Where to look: TraceLocal.scanObject(), TraceLocal.processNode()
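The worklist loop above can be sketched as a small runnable program. This is a toy copying trace over ordinary Java objects, not MMTk code: `Node`, `ToyTrace`, and the `traceMisc` flag are hypothetical names, and unlike the slide pseudocode it queues an object only on its first visit so that cycles terminate.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class ToyTrace {
    // Hypothetical heap object: ordinary reference fields plus one misc slot.
    static class Node {
        final String name;
        final List<Node> fields = new ArrayList<>();
        Node misc; // metadata reference stored in the toy misc header word
        Node(String name) { this.name = name; }
    }

    final Map<Node, Node> forwarding = new IdentityHashMap<>();
    final Deque<Node> worklist = new ArrayDeque<>();

    // Copy the object on first visit and queue the copy for scanning;
    // on later visits return the existing copy.
    Node markAndPossiblyCopy(Node ref) {
        if (ref == null) return null;
        Node copy = forwarding.get(ref);
        if (copy == null) {
            copy = new Node(ref.name);
            copy.fields.addAll(ref.fields); // old references, fixed when scanned
            copy.misc = ref.misc;
            forwarding.put(ref, copy);
            worklist.push(copy);
        }
        return copy;
    }

    // Returns the number of objects that survived the collection.
    int collect(List<Node> roots, boolean traceMisc) {
        for (Node root : roots) markAndPossiblyCopy(root);
        while (!worklist.isEmpty()) {
            Node obj = worklist.pop();
            for (int i = 0; i < obj.fields.size(); i++) {
                obj.fields.set(i, markAndPossiblyCopy(obj.fields.get(i)));
            }
            if (traceMisc) {
                // Keeps the metadata alive and updates the stale reference.
                obj.misc = markAndPossiblyCopy(obj.misc);
            }
        }
        return forwarding.size();
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        root.fields.add(new Node("child"));
        root.misc = new Node("metadata"); // reachable only through the misc slot
        System.out.println(new ToyTrace().collect(List.of(root), true));  // prints 3
        System.out.println(new ToyTrace().collect(List.of(root), false)); // prints 2
    }
}
```

With `traceMisc` off, the metadata object is never copied and the survivor's misc slot still points at the old location, which is the "moved or collected" problem the slides raise.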

Uninterruptible code

Normal application code can be interrupted

    Allocation -> GC
    Synchronization & yield points -> join a GC

Some VM code shouldn't be interrupted

    Heap etc. in inconsistent state

Most instrumentation can't be interrupted

    Reads & writes aren't GC-safe points

    @Uninterruptible
    static void myMethod(Object o) {
        // No allocation or synchronization
        // No calls to interruptible methods
    }

    @Uninterruptible
    static void myMethod(Object o) {
        currentThread.deferGC = true;
        Metadata m = new Metadata();
        currentThread.deferGC = false;
        setMiscHeader(o, offset, m);
    }
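The defer-GC pattern in the second method can be modeled outside the VM as a rough sketch. This toy assumes that a collection requested while `deferGC` is set is postponed until the next yield point; the class and all of its members are hypothetical and only mimic the slide's `currentThread.deferGC` flag, not any Jikes RVM scheduler API.

```java
public class DeferGcSketch {
    boolean deferGC = false;   // mirrors currentThread.deferGC on the slide
    boolean gcPending = false; // a collection was requested while deferred
    int collections = 0;

    // Toy allocation path: a full heap normally triggers a collection.
    Object allocate(boolean heapFull) {
        if (heapFull) {
            if (deferGC) {
                gcPending = true;
            } else {
                collect();
            }
        }
        return new Object();
    }

    void collect() {
        collections++;
        gcPending = false;
    }

    // Toy yield point: the only place a deferred collection may run.
    void yieldPoint() {
        if (gcPending && !deferGC) {
            collect();
        }
    }

    // Shape of the slide's instrumentation: allocate metadata with GC deferred.
    Object instrument(boolean heapFull) {
        deferGC = true;
        Object metadata = allocate(heapFull);
        deferGC = false;
        return metadata; // the slide stores this into the misc header
    }

    public static void main(String[] args) {
        DeferGcSketch thread = new DeferGcSketch();
        thread.instrument(true);
        System.out.println(thread.collections); // prints 0
        thread.yieldPoint();
        System.out.println(thread.collections); // prints 1
    }
}
```

The point of the model is ordering: no collection happens between the allocation and the header store, and the postponed collection runs only once the thread reaches a safe point.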

Need to modify JVM internals

Need to demonstrate realism

Jikes RVM resources

    Guide: overview of other tasks & components
    Research Archive: dynamic analysis examples
    Research mailing list: help (especially for novices)

Object layout: extra bits or words in header; stealing bits from references; discuss magic here

Adding instrumentation: baseline & optimizing compilers; allocation sites; reads & writes; inlining instrumentation

Garbage collection: piggybacking on GC; new spaces

Low-level stuff: uninterruptible code; walking the stack

Concurrency: atomic stores; thread-local data