precise and efficient garbage collection in vmkit with...
TRANSCRIPT
Precise and Efficient Garbage Collection in VMKit with MMTk
LLVM Developer’s MeetingNicolas Geoffray
Background
• VMKit: Java and .Net on top of LLVM
• Uses LLVM’s JIT for executing code
• Uses Boehm for GC
• Performance bottlenecks
• No dynamic optimization
• Conservative GC
2
CPU-intensive Benchmarks (JGF)
3
0
1.25
2.50
3.75
5.00
series lufact heapsort crypt fft sor matmult
Tim
e no
rmal
ized
to
VM
Kit
VMKit Jikes Sun
MacOSX 10.58 x X86_32
2.66GHz 12Go
VM-intensive Benchmarks (Dacapo)
0
0.38
0.75
1.13
1.50
antlr fop luindex bloat jython pmd
Tim
e no
rmal
ized
to
VM
Kit
VMKit Jikes Sun
MacOSX 10.58 x X86_32
2.66GHz 12Go
4
Execution Overheads
7%7%10%
11% 65%
antlr
5%7%18%
17%
53%
bloat
9%2%8%6%
75%
fop
24%
4%12%
11%
49%
jython
5%2%8%12%
73%
luindex
4%20%
1%
20%17%
38%
pmd
Application Allocations Collections
System.arraycopy Interface calls Others5
Execution Overheads
7%7%10%
11% 65%
antlr
5%7%18%
17%
53%
bloat
9%2%8%6%
75%
fop
24%
4%12%
11%
49%
jython
5%2%8%12%
73%
luindex
4%20%
1%
20%17%
38%
pmd
Application Allocations Collections
System.arraycopy Interface calls Others6
21% 14% 20%
35% 23% 37%
Goal: replace Boehm with MMTk
• MMTk is JikesRVM’s GC
• Framework for writing GCs
• Multiple GC Implementations (Copying, Mark and trace, Immix)
• Copying collectors require precise stack scanning
• Locate pointers on the stack
7
But... it’s in Java?
• Yes, but nothing to be afraid of:
• Use of Magic tricks
• No use of runtime features (exceptions, inheritance)
• No use of standard library
• Use VMKit’s AOT compiler
• Transform MMTk into a .bc file
8
Outline
• Introduction
• Precise garbage collection
• Compiling MMTk with VMJC
• Putting it all together
• What’s left
9
• Introduction
• Precise garbage collection
• Compiling MMTk with VMJC
• Putting it all together
• What’s left
10
Precise Garbage Collection
• Write code that locates pointers in the stack
• llvm.gcroot in JIT-generated code
• llvm.gcroot in VMKit’s runtime written in C++
• Use LLVM’s GC framework to generate stack maps
• Caml stack maps for llvm-g++ generated code
• JIT stack maps for JIT-generated code
11
Precise Garbage Collection
12
VMKit.cppCaml Stackmap
VMKit.exellvm-g++
App.java
JIT Stackmap
JITted App
Stackmapsfor precise stack
scanning
Stack Scanning
• Problem: interweaving of different kinds of functions
• Application’s managed (Java or C#) functions: trusted
• VMKit’s C++ functions: trusted
• Application’s JNI functions: untrusted
• Solution: create a side-stack for frame addresses
• Updated upon entry of a kind of method
• VMKit knows the kind of each frame on the thread stack
13
Type of methods
• Trusted
• Has a stack map, so can manipulate objects (llvm.gcroot)
• Saves frame pointer (llvm::NoFramePointerElim)
• Untrusted
• Has no stack map, so should not manipulate objects
• May not save the frame pointer
14
Stack Scanning Example (1)
18
VMKit.main (C++)
App.main (Java)
App.function (Java)
VMKit.runtime (C++)
push(Enter Java)
push(Enter native)
Stack Scanning Example (1)
19
VMKit.main (C++)
App.main (Java)
App.function (Java)
VMKit.runtime (C++)
App.function2 (Java)
push(Enter Java)
push(Enter native)
push(Enter Java)
Stack Scanning Example (1)
20
VMKit.main (C++)
App.main (Java)
App.function (Java)
VMKit.runtime (C++)
App.function2 (Java)
push(Enter Java)
push(Enter native)
push(Enter Java) fp
Stack Scanning Example (1)
21
VMKit.main (C++)
App.main (Java)
App.function (Java)
VMKit.runtime (C++)
App.function2 (Java)
push(Enter Java)
push(Enter native)
push(Enter Java)
fp
fp
Stack Scanning Example (1)
22
VMKit.main (C++)
App.main (Java)
App.function (Java)
VMKit.runtime (C++)
App.function2 (Java)
push(Enter Java)
push(Enter native)
push(Enter Java)
fp
fp
fp
Stack Scanning Example (1)
23
VMKit.main (C++)
App.main (Java)
App.function (Java)
VMKit.runtime (C++)
App.function2 (Java)
push(Enter Java)
push(Enter native)
push(Enter Java)
fp
fp
fp
fp
Stack Scanning Example (2)
26
VMKit.main (C++)
App.main (Java)
App.function (JNI)
push(Enter Java)
push(Enter JNI)
Stack Scanning Example (2)
27
VMKit.main (C++)
App.main (Java)
App.function (JNI)
App.function2 (JNI)
push(Enter Java)
push(Enter JNI)
Stack Scanning Example (2)
28
VMKit.main (C++)
App.main (Java)
App.function (JNI)
App.function2 (JNI)
VMKit.jniRuntime (Java)
push(Enter Java)
push(Enter JNI)
push(Enter native)
Stack Scanning Example (2)
29
VMKit.main (C++)
App.main (Java)
App.function (JNI)
App.function2 (JNI)
VMKit.jniRuntime (Java)
push(Enter Java)
push(Enter JNI)
push(Enter native)
saved fp
Stack Scanning Example (2)
30
VMKit.main (C++)
App.main (Java)
App.function (JNI)
App.function2 (JNI)
VMKit.jniRuntime (Java)
push(Enter Java)
push(Enter JNI)
push(Enter native)
saved fp
fp
Stack Scanning Example (2)
31
VMKit.main (C++)
App.main (Java)
App.function (JNI)
App.function2 (JNI)
VMKit.jniRuntime (Java)
push(Enter Java)
push(Enter JNI)
push(Enter native)
saved fp
fp
fp
Running the GC
A precise GC scans the stacks at safe points: point during execution where the GC can know the type
of each value on the stack
32
Single-threaded Application
• GC always triggered at safe points
• gcmalloc instrunctions
• Collector::collect()
33
Multi-threaded Application
• When entering a GC, must wait for all threads to join
• Don’t use signals! or no safe point
• Use a thread-local variable to poll on method entry and backward branches
• Scan stacks of threads blocked in JNI or system calls
34
Application changes for GC
public static void runLoop(int a) {
while (a--) System.out.println(“Hello World”);
}
35
Application changes for GC
public static void runLoop(int a) {
if (getThreadID().doGC) GC()
while (a--) {
System.out.println(“Hello World”);
if (getThreadID().doGC) GC()
}
}
36
• Introduction
• Precise garbage collection
• Compiling MMTk with VMJC
• Putting it all together
• What’s left
37
What is VMJC?
• An Ahead of Time compiler (AOT)
• Generates .bc files from .class files
• Use of llvm tools to generate platform-dependant files
• shared library: llc -relocation-model=pic + gcc
• executable: llc + ld vmkit + gcc
38
Goal: compile MMTk with VMJC
• Generate a .bc file that can be linked with VMKit
• Interface MMTK → VMKit (e.g. threads synchronization, stack scanning)
• Interface VMKit → MMTk (e.g. gcmalloc)
39
Why MMTk does not need a Java runtime?
• No use of runtime features
• synchronizations, exceptions, inheritance
• No use of standard library
• HashMap, LinkedList, ArrayList
40
How MMTk is manipulating pointers?
• Definition of Magic classes and methods
• Address, Word, Offset
• Word Address.loadWord(Offset)
• Magic classes and methods translated by the compiler [VEE’09]
• Similar mechanism than Inline ASM for C
41
Example (Frampton [VEE’09])
42
@NoBoundsCheck voidprefetchObjects( ObjectReference[]buffer){ for(inti=0;i<buffer.length;i++){ ObjectReferencecurrent =buffer[i]; current.prefetch(); } }
voidprefetchObjects( OOP*buffer, intsize){ for(inti=0;i<size;i++){ OOPo=buffer[i]; asmvolatile( "prefetchnta(%0)":: "r"(o)); } }
Inline ASM in C Magic in Java
• Introduction
• Precise garbage collection
• Compiling MMTk with VMJC
• Putting it all together
• What’s left
43
Option 1: Object File
• Create a .o file of MMTk
• gcc mmtk.o vmkit.o -o vmkit
• But...
• No inlining in application code
44
Option 2: LLVM Bitcode File
• Create a .bc file of MMTk
• vmkit (-load mmtk.bc) -java HelloWorld
• Late binding of allocations in VMKit code
• gcmalloc in C++ are linked at runtime
• Inlining in Java code
• new in applications are inlined with MMTk’s malloc
45
Option 3: Everything is Bitcode
• Create a .bc file of MMTk
• Create a .bc file of VMKit
• Link, optimize and run
46
• Introduction
• Precise garbage collection
• Compiling MMTk with VMJC
• Putting it all together
• What’s left
47
What’s left
• Implementing the MMTK → VMKit interface
• Interactions between the GC and the VM
• Finish implementation with read/write barriers
• In VMKit code, in managed code
• Run benchmarks!
• Benchmark with different GCs from MMTk
48