An On-the-Fly Mark and Sweep Garbage Collector Based on Sliding Views
Hezi Azatchi (IBM), Yossi Levanoni (Microsoft), Harel Paz (Technion), Erez Petrank (Technion)
Erez Petrank GC via Sliding Views 2
Garbage Collection Today
Today’s advanced environments: multiprocessors + large memories
Dealing with multiprocessors
Stop The World
Concurrent collection
Parallel collection
On-the-fly collection
Informal pause times: 3ms, 300ms, 30ms
Informal throughput loss: 10%, 10%
This Talk
1. A new on-the-fly mark and sweep collector: a synergy of snapshot collection and sliding views.
2. Implementation and measurements on the Jikes RVM: pause times < 2ms, throughput loss 10%.
The Mark-Sweep algorithm [McCarthy 1960]
Traverse & mark live objects. White objects may be reclaimed.
[Figure: roots (stack, globals) pointing into the heap]
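As a minimal sketch of the two phases (not the collector's actual code), the heap below is modeled as a Python dictionary that maps each object name to the names it points to. The names `heap`, `roots`, `mark` and `sweep` are illustrative assumptions, not taken from the talk.

```python
def mark(heap, roots):
    """Return the set of objects reachable from the roots (the marked objects)."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj not in marked:
            marked.add(obj)
            stack.extend(heap[obj])   # follow the outgoing pointers
    return marked


def sweep(heap, marked):
    """Reclaim every unmarked (white) object and return the names reclaimed."""
    white = sorted(name for name in heap if name not in marked)
    for name in white:
        del heap[name]
    return white


# Toy heap: the roots point to "a" and "c"; "d" and "e" only point to each other.
heap = {"a": ["b"], "b": [], "c": ["a"], "d": ["e"], "e": ["d"]}
roots = ["a", "c"]
reclaimed = sweep(heap, mark(heap, roots))
```

In this toy run the cycle formed by "d" and "e" is reclaimed, which is one reason tracing is attractive compared with plain reference counting.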
Base: a snapshot collection
A naïve collector:
1. Stop program threads
2. Create a snapshot (replica) of the heap
3. Program threads resume
4. Trace replica concurrently with program
5. Objects identified as unreachable in the replica may be collected.
Problem: taking a replica of the heap is not realistic
[Furusou et al. 91]: use a copy-on-write barrier.
• No need to copy unless the area is written.
• Use virtual pages.
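The naïve snapshot collector can be sketched as follows. This is a sequential toy model under stated assumptions: the heap is a dictionary of object names, `copy.deepcopy` stands in for taking the replica, and the program "runs" as one callback between the snapshot and the trace rather than truly concurrently. All names (`snapshot_collect`, `program`, and so on) are hypothetical.

```python
import copy


def reachable(graph, roots):
    """Names reachable from the roots in the given object graph."""
    seen, stack = set(), list(roots)
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(graph[name])
    return seen


def snapshot_collect(heap, roots, run_program):
    replica = copy.deepcopy(heap)        # taken while program threads are stopped
    roots_at_snapshot = list(roots)
    run_program(heap)                    # threads resumed: the real heap keeps changing
    garbage = set(replica) - reachable(replica, roots_at_snapshot)
    for name in garbage:                 # unreachable in the replica, so safe to reclaim
        del heap[name]
    return garbage


def program(heap):
    heap["d"] = ["b"]                    # allocate d, pointing to b
    heap["a"] = ["d"]                    # a now reaches b only through d


heap = {"a": ["b"], "b": [], "c": ["b"]}
garbage = snapshot_collect(heap, ["a"], program)
```

Only "c" is reclaimed: it was already unreachable when the replica was taken, and garbage never becomes reachable again. The object "d" is not in the replica, so it is not a candidate in this cycle.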
Some inefficiencies
• Copying a page requires synchronization.
• Efficiency depends on the system.
• Triggering and copying apply to all fields, although only pointers are interesting.
• Programs work at the object level, while this mechanism works at the page level: copying a full page is a waste.
Synergy with recently developed techniques
Goal: copy the pointers of each modified object prior to its first modification.
The write barrier of the Levanoni-Petrank reference counting collector provides exactly this.
Use a dirty bit per object. Before a pointer is first modified – save object pointer values locally.
This can be done concurrently by a multithreaded program with no synchronization!
The write barrier (simplified)
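Update(Object **slot, Object *new) {
    Object *old = *slot        // read the old value before anything else
    if (!IsDirty(slot)) {      // first modification since the last collection?
        log(slot, old)         // save the old value in the thread's local buffer
        SetDirty(slot)
    }
    *slot = new                // the actual store comes last
}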
Observation: if two threads
1. invoke the write barrier in parallel, and
2. both log an old value,
then both record the same old value.
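The observation can be checked by brute force on a small model. The sketch below is an illustration, not part of the talk: it assumes that each line of the simplified barrier is one atomic step and that memory is sequentially consistent, then enumerates every interleaving of two threads updating the same slot. The function names and the values "A", "B", "C" are made up for the example.

```python
from itertools import combinations

STEPS = 5  # per thread: read old, test dirty, log, set dirty, store


def run(schedule, new_values, initial="A"):
    """Execute two barrier invocations on one slot in the given step order."""
    slot, dirty, log = initial, False, []
    pc, old, skip = [0, 0], [None, None], [False, False]
    for t in schedule:
        step = pc[t]
        if step == 0:
            old[t] = slot                  # Object *old = *slot
        elif step == 1:
            skip[t] = dirty                # if (!IsDirty(slot))
        elif step == 2 and not skip[t]:
            log.append((t, old[t]))        # log(slot, old)
        elif step == 3 and not skip[t]:
            dirty = True                   # SetDirty(slot)
        elif step == 4:
            slot = new_values[t]           # *slot = new
        pc[t] += 1
    return slot, log


def all_schedules():
    """Every interleaving of two threads that each take STEPS steps."""
    for first in combinations(range(2 * STEPS), STEPS):
        yield [0 if i in first else 1 for i in range(2 * STEPS)]


def check():
    """Return how many interleavings had both threads logging a value."""
    both_logged = 0
    for schedule in all_schedules():
        _, log = run(schedule, ["B", "C"])
        values = [value for _, value in log]
        assert values, "at least one thread must log"
        assert all(value == "A" for value in values), "a stale value was logged"
        both_logged += len(values) == 2
    return both_logged
```

The reason behind the result is the ordering inside the barrier: a thread that still sees the slot as clean read its old value before any other thread could have reached the final store, because that store only happens after the dirty flag has been set.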
The “real” write barrier:
• At the object level
• With an optimistic initial “if”
Concurrent (intermediate) Algorithm:
1. Stop all threads
2. Scan roots (locals)
3. Initiate write barrier usage
4. Resume threads
5. Trace from roots. Whenever a dirty object is discovered, use buffers to obtain its pointers.
6. Stop write barrier usage
7. Sweep to reclaim unmarked objects.
8. Clear all buffers and dirty bits.
Next goal: stop one thread at a time
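A single-threaded simulation can illustrate why tracing through the logged values is enough. This is a sketch under assumptions: the "program" runs as callbacks between tracing steps instead of in real threads, the dirty bit is per object, and the names `Heap`, `update`, `pointers_at_snapshot` and `collect` are invented for the example.

```python
class Obj:
    def __init__(self, name, n_fields=2):
        self.name, self.fields = name, [None] * n_fields
        self.dirty = self.marked = False


class Heap:
    def __init__(self):
        self.objects, self.log, self.barrier_on = [], {}, False

    def alloc(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def update(self, obj, i, new):
        """Write barrier: save obj's pointers before its first modification."""
        if self.barrier_on and not obj.dirty:
            self.log[obj] = tuple(obj.fields)
            obj.dirty = True
        obj.fields[i] = new

    def pointers_at_snapshot(self, obj):
        return self.log[obj] if obj.dirty else tuple(obj.fields)


def collect(heap, roots, program_steps=()):
    heap.barrier_on = True               # threads stopped, roots scanned, barrier on
    steps, stack = iter(program_steps), list(roots)
    while stack:                         # trace while the program keeps running
        obj = stack.pop()
        if obj is None or obj.marked:
            continue
        obj.marked = True
        stack.extend(heap.pointers_at_snapshot(obj))
        step = next(steps, None)
        if step is not None:
            step()                       # the program runs between tracing steps
    heap.barrier_on = False
    freed = [o.name for o in heap.objects if not o.marked]
    heap.objects = [o for o in heap.objects if o.marked]    # sweep
    for o in heap.objects:
        o.marked = o.dirty = False       # clear mark and dirty bits
    heap.log.clear()                     # clear buffers
    return freed


heap = Heap()
a, b, c, d = (heap.alloc(n) for n in "abcd")
a.fields[0], b.fields[0] = b, c          # a -> b -> c; d is garbage


def move_c():
    heap.update(a, 1, c)                 # hide c behind the already traced a...
    heap.update(b, 0, None)              # ...and erase the only untraced path to it


freed = collect(heap, [a], [move_c])
```

Without the logged values the trace would visit `b` after its pointer to `c` was erased and would miss `c`. With them, only `d` is reclaimed.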
The Sliding Views “Framework”
Avoid simultaneous halting. Instead, stop one thread at a time.
The view of the heap is a “sliding view”: there is a time interval in which all objects are read, but not one single point in time.
Danger in Sliding Views
Program does:
1. P1 ← O
2. P2 ← O
3. P1 ← NULL
The sliding view reads P2 before step 2 (sees NULL) and reads P1 after step 3 (sees NULL).
Problem: reachability of O not noticed!
Solution: “snooping”. If a pointer to O is stored while the sliding view is taken, do not reclaim O.
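The scenario on this slide can be replayed in a few lines. The toy model below is only an illustration: two slots, one object named "O", and a `store` helper that plays the role of the snooping barrier. None of these names come from the talk.

```python
def sliding_view_demo(snooping):
    """Replay the slide's scenario; return (objects retained, objects really reachable)."""
    slots = {"P1": None, "P2": None}
    snooped = set()

    def store(slot, target):
        if snooping and target is not None:
            snooped.add(target)          # remember anything stored while the view is taken
        slots[slot] = target

    view = {}
    store("P1", "O")                     # P1 <- O
    view["P2"] = slots["P2"]             # the sliding view reads P2: NULL
    store("P2", "O")                     # P2 <- O
    store("P1", None)                    # P1 <- NULL
    view["P1"] = slots["P1"]             # the sliding view reads P1: NULL

    retained = {v for v in view.values() if v is not None} | snooped
    reachable = {v for v in slots.values() if v is not None}
    return retained, reachable
```

With `snooping=False` the view contains no pointer to "O" although P2 still refers to it. With `snooping=True` the stored pointer is recorded and "O" is retained.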
The Sliding Views Algorithm:
1. Initiate snooping and write barrier usage
2. For each thread: stop thread and scan its roots (locals)
3. Stop snooping
4. Trace from roots and snooped objects. Whenever a dirty object is discovered, use buffers to obtain its actual values.
5. Stop write barrier usage
6. Sweep to reclaim unmarked objects.
7. Clear all buffers and dirty bits.
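The steps of the algorithm can be sketched as a small single-threaded simulation. It is an assumption-laden illustration, not the Jikes implementation: threads are plain objects with a list of locals, the program runs as callbacks between the per-thread root scans, and `SlidingViewsGC`, `Thread`, `cycle` and `demo` are hypothetical names.

```python
class Obj:
    def __init__(self, name):
        self.name, self.fields = name, [None]
        self.dirty = self.marked = False


class Thread:
    def __init__(self, *local_refs):
        self.locals = list(local_refs)


class SlidingViewsGC:
    def __init__(self, objects, use_snooping=True):
        self.objects, self.use_snooping = list(objects), use_snooping
        self.snoop_on = self.barrier_on = False
        self.snooped, self.log = [], {}

    def update(self, obj, i, new):
        if self.barrier_on and not obj.dirty:
            self.log[obj] = tuple(obj.fields)    # save pointers before first change
            obj.dirty = True
        if self.snoop_on and new is not None:
            self.snooped.append(new)             # snoop the stored pointer
        obj.fields[i] = new

    def cycle(self, threads, run_between_scans=()):
        self.snoop_on, self.barrier_on = self.use_snooping, True
        roots, hooks = [], iter(run_between_scans)
        for thread in threads:                   # stop one thread at a time
            roots.extend(thread.locals)
            hook = next(hooks, None)
            if hook is not None:
                hook()                           # the other threads keep running
        self.snoop_on = False
        stack = roots + self.snooped             # trace from roots and snooped objects
        while stack:
            obj = stack.pop()
            if obj is None or obj.marked:
                continue
            obj.marked = True
            stack.extend(self.log[obj] if obj.dirty else obj.fields)
        self.barrier_on = False
        freed = [o.name for o in self.objects if not o.marked]
        self.objects = [o for o in self.objects if o.marked]
        for o in self.objects:
            o.marked = o.dirty = False
        self.snooped, self.log = [], {}
        return freed


def demo(use_snooping):
    g, o, junk = Obj("g"), Obj("o"), Obj("junk")
    gc = SlidingViewsGC([g, o, junk], use_snooping)
    t1, t2 = Thread(g), Thread(g, o)

    def t2_runs():                               # T2 has not been scanned yet
        gc.update(g, 0, o)                       # store o into the heap...
        t2.locals.remove(o)                      # ...and drop the local reference

    return gc.cycle([t1, t2], [t2_runs])
```

In this model `demo(True)` reclaims only `junk`, while `demo(False)` also reclaims `o` even though `g` still points to it, which is exactly the danger that snooping removes.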
Optimizing the write barrier
We only need to store:
1. non-null pointer values of an object,
2. while tracing is on,
3. for objects that have not been traced,
4. each object only once.
Slow path of the write barrier is seldom taken (~ 1/300)
Implication of 3: new objects are never stored.
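The four conditions translate into a single test on the fast path. The sketch below is a hypothetical illustration: it assumes a per-object mark bit and dirty bit, and it assumes that objects allocated while tracing is on are treated as already traced, which is one way to obtain the stated implication for new objects. `Collector`, `alloc` and `update` are invented names.

```python
class Obj:
    def __init__(self, name, marked=False):
        self.name, self.fields = name, [None, None]
        self.dirty, self.marked = False, marked


class Collector:
    def __init__(self):
        self.tracing = False
        self.buffer = []            # (object, non-null pointer values) records
        self.slow_path_taken = 0

    def alloc(self, name):
        # Assumption: objects created while tracing is on count as already traced.
        return Obj(name, marked=self.tracing)

    def update(self, obj, i, new):
        # Conditions 2, 3 and 4 form the fast-path test; it usually fails at once.
        if self.tracing and not obj.marked and not obj.dirty:
            self.slow_path_taken += 1
            pointers = tuple(p for p in obj.fields if p is not None)   # condition 1
            self.buffer.append((obj, pointers))
            obj.dirty = True
        obj.fields[i] = new


gc = Collector()
a, b = gc.alloc("a"), gc.alloc("b")
gc.update(a, 0, b)          # tracing is off: nothing is stored
gc.tracing = True
gc.update(a, 1, b)          # first write to the untraced a: stored once
gc.update(a, 0, None)       # a is already dirty: fast path
b.marked = True             # pretend the collector has already traced b
gc.update(b, 0, a)          # traced object: fast path
n = gc.alloc("n")
gc.update(n, 0, a)          # new object: never stored
```

Of the five updates only one takes the slow path, and the single buffer entry holds the one non-null pointer that `a` had at that moment.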
Write Barrier Statistics
Benchmark      Long path frac.
SPECjbb2000    1 / 299
Compress       1 / 894
Jess           1 / 13,210
Db             1 / 305
Javac          1 / 160
Mpegaudio      1 / 64,099
jack           1 / 16,572
mtrt2          1 / 4,116
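To put these fractions on a common scale, the short calculation below converts each "1 in N" figure from the table into a percentage of write-barrier invocations. The dictionary simply restates the table; the variable names are illustrative.

```python
# "1 / N" long-path fractions, copied from the table above.
LONG_PATH_ONE_IN = {
    "SPECjbb2000": 299,
    "Compress": 894,
    "Jess": 13210,
    "Db": 305,
    "Javac": 160,
    "Mpegaudio": 64099,
    "jack": 16572,
    "mtrt2": 4116,
}


def long_path_percent(one_in):
    """Percentage of barrier invocations that take the long path."""
    return 100.0 / one_in


percent = {name: long_path_percent(n) for name, n in LONG_PATH_ONE_IN.items()}
most_frequent = max(percent, key=percent.get)    # benchmark with the largest fraction
least_frequent = min(percent, key=percent.get)   # benchmark with the smallest fraction
```

Every benchmark stays below one percent; Javac has the largest long-path fraction and Mpegaudio the smallest.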
Performance Measurements
• Implementation for Java on the Jikes Research JVM
• Compared collectors:
  • Jikes parallel collector (Parallel)
  • Jikes concurrent RC (Jikes concurrent)
• Benchmarks:
  • Server benchmark: SPECjbb2000 (business-like transactions in a large firm)
  • Client benchmarks: SPECjvm98 (mostly single-threaded client benchmarks)
Pause Times vs. Parallel
Pause times (ms)
                             jess     db       javac    mpeg     jack     mtrt2    jbb-1    jbb-2    jbb-3
tracing                      1.3      0.66     2.04     0.54     0.91     0.91     0.6      0.73     0.93
Jikes STW (Jikes parallel)   260.67   188.33   643.33   205.67   225      376      322      416.67   511.33
Pause Times vs. Jikes Concurrent
Pause times (ms)
                   jess   db     javac   mpeg   jack   mtrt2   jbb-1   jbb-2   jbb-3
tracing            1.3    0.66   2.04    0.54   0.91   0.91    0.6     0.73    0.93
Jikes Concurrent   2.77   1.84   2.81    0.8    1.66   1.8     1.79    2.6     3.15
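The pause-time figures from the two comparisons can be turned into ratios with a few lines of arithmetic. The numbers below are copied from the slides; the unit is assumed to be milliseconds (consistent with the "< 2ms" claim earlier in the talk), and the variable names are illustrative.

```python
BENCHMARKS = ["jess", "db", "javac", "mpeg", "jack", "mtrt2", "jbb-1", "jbb-2", "jbb-3"]
TRACING = [1.3, 0.66, 2.04, 0.54, 0.91, 0.91, 0.6, 0.73, 0.93]
JIKES_STW = [260.67, 188.33, 643.33, 205.67, 225, 376, 322, 416.67, 511.33]
JIKES_CONCURRENT = [2.77, 1.84, 2.81, 0.8, 1.66, 1.8, 1.79, 2.6, 3.15]


def ratios(baseline):
    """Baseline pause divided by the tracing collector's pause, per benchmark."""
    return {name: base / ours for name, base, ours in zip(BENCHMARKS, baseline, TRACING)}


stw_ratio = ratios(JIKES_STW)
concurrent_ratio = ratios(JIKES_CONCURRENT)
longest_tracing_pause = max(TRACING)
```

On these figures the stop-the-world pauses are more than 200 times longer on every benchmark, and the Jikes concurrent pauses are longer on every benchmark as well, by a much smaller factor.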
SPECjbb2000 Throughput
[Chart: SPECjbb2000, Tracing vs. Jikes STW (Jikes parallel). Y axis: throughput ratio, 0.4 to 1.1; X axis: heap sizes, 256 to 704; series: jbb1 to jbb8.]
SPECjvm98 Throughput
[Chart: SPECjvm98, Jikes STW (Jikes parallel) / Tracing. Y axis: throughput ratio, 0.5 to 1.1; X axis: 24 to 96; series: compress, jess, db, javac, mpeg, jack, mtrt.]
SPECjbb2000 Throughput
[Chart: SPECjbb2000, Tracing vs. Jikes concurrent. Y axis: throughput ratio, 0.8 to 1.8; X axis: heap sizes, 256 to 704; series: jbb1 to jbb8.]
SPECjvm98 Throughput
[Chart: SPECjvm98, Jikes concurrent / Tracing. Y axis: throughput ratio, 0.7 to 1.6; X axis: 24 to 96; series: jess, db, javac, mpeg, jack, mtrt.]
SPECjbb2000 Throughput
[Chart: SPECjbb2000, Tracing vs. Lev-Pet. Y axis: throughput ratio, 0.4 to 1.1; X axis: heap sizes, 256 to 704; series: jbb1 to jbb8.]
Most Related Collector
• Vast literature on on-the-fly mark & sweep collectors.
• The state-of-the-art collector is by Doligez-Leroy-Gonthier [POPL 93-94].
• Implemented for Java by IBM research: Domani-Kolodner-Petrank [PLDI 2000], Domani et al. [ISMM 2000].
• Our new collector is the only alternative for tracing on-the-fly.
Comparison?
No available research implementation for Java.
[Figure: a Parent object with pointer p, and objects o1 and o2]
Some thoughts on locality: a difference in the write barrier on pointer modification:
• [DLG]: Mark the ex-referenced object.
• [This work]: Copy (seldom) parent pointers, check (frequently) parent mark bits.
Related Work
• Snapshot tracing: Demers et al. (1990), Furusou et al. (1991)
• On-the-fly tracing: Dijkstra et al. (1976), Steele (1976), Lamport (1976), Kung & Song (1977), Gries (1977), Ben-Ari (1982, 1984), Huelsbergen et al. (1993, 1998), Doligez-Gonthier-Leroy (1993-4), Domani-Kolodner-Petrank (2000)
• The RC sliding views algorithm: Levanoni & Petrank [OOPSLA 01]
• Generational extension of sliding views: Azatchi & Petrank [Compiler Construction 2003]