Compiler Optimizations for Nondeferred Reference-Counting Garbage Collection

Pramod G. Joisha

Microsoft Research, Redmond

ISMM’06 2

Classic Reference-Counting (RC) Garbage Collection

• All references (stack, statics, heap) tallied

• Based on the nondeferred RC invariant
  – Nonzero means at least one incident reference and zero means garbage

• High processing costs
  – Counts need to be updated on every mutation
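The invariant and its cost can be illustrated with a minimal sketch. The names `Obj`, `NondeferredRC`, and `assign` are hypothetical and do not come from the talk; the sketch only assumes that every reference store, whether into a stack frame, the statics, or an object field, adjusts the counts, and that an object is reclaimed the moment its count reaches zero.

```python
class Obj:
    """A heap object with a reference count and named reference fields."""
    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.fields = {}


class NondeferredRC:
    """Hypothetical toy collector: all references (stack, statics, heap) are tallied."""
    def __init__(self):
        self.freed = []  # names of reclaimed objects, in reclamation order

    def assign(self, slots, key, new):
        """Store `new` into slots[key]; `slots` may be a frame, the statics, or obj.fields."""
        old = slots.get(key)
        if new is not None:
            new.rc += 1              # count the new reference first
        slots[key] = new
        if old is not None:
            self._dec(old)           # then drop the overwritten reference

    def _dec(self, obj):
        obj.rc -= 1
        if obj.rc == 0:              # invariant: zero means garbage, reclaim immediately
            self.freed.append(obj.name)
            for child in list(obj.fields.values()):
                if child is not None:
                    self._dec(child)
            obj.fields.clear()


gc = NondeferredRC()
frame = {}                           # stands in for a stack frame
a, b = Obj("a"), Obj("b")
gc.assign(frame, "x", a)             # stack reference: counted
gc.assign(a.fields, "f", b)          # heap reference: counted
gc.assign(frame, "x", None)          # last reference to a dies, so a and then b are freed
```

Every call to `assign` touches one or two counts, which is the per-mutation overhead the slide refers to.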

Past Solution to High Overhead

• Count only a subset of references
  – Deferred RC collection (1976)
  – Ulterior RC collection (2003)

• Based on the deferred RC invariant
  – Nonzero means at least one incident reference but zero means maybe garbage

• Faster, but
  – more “floating” garbage
  – longer pauses
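For contrast, a rough sketch of the deferred scheme follows. It assumes the usual formulation in which only heap references are counted and objects with a zero count are parked in a zero-count table until the stack is scanned; the names `DeferredRC`, `zct`, `write_field`, and `collect` are hypothetical.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.rc = 0            # heap references only; stack references are not counted
        self.fields = {}
        self.freed = False


class DeferredRC:
    """Hypothetical toy collector following the deferred RC invariant."""
    def __init__(self):
        self.zct = []          # zero-count table: zero count, so "maybe garbage"
        self.freed = []

    def alloc(self, name):
        obj = Obj(name)
        self.zct.append(obj)   # a new object has no heap references yet
        return obj

    def write_field(self, holder, field, new):
        old = holder.fields.get(field)
        if new is not None:
            new.rc += 1
        holder.fields[field] = new
        if old is not None:
            self._dec(old, self.zct)

    @staticmethod
    def _dec(obj, table):
        obj.rc -= 1
        if obj.rc == 0:
            table.append(obj)  # cannot free yet: a stack slot may still refer to it

    def collect(self, stack_roots):
        """Pause: scan the stack, then free zero-count objects it does not reference."""
        rooted = {id(o) for o in stack_roots if o is not None}
        pending, self.zct = self.zct, []
        while pending:
            obj = pending.pop()
            if obj.freed or obj.rc != 0:
                continue                   # already handled, or referenced from the heap
            if id(obj) in rooted:
                self.zct.append(obj)       # still live through the stack
                continue
            obj.freed = True
            self.freed.append(obj.name)
            for child in obj.fields.values():
                if child is not None:
                    self._dec(child, pending)
            obj.fields.clear()
```

Stack writes cost nothing here, but a dead object lingers as floating garbage until the next `collect`, and `collect` itself is the longer pause.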

Our Solution

• Program analyses
  – Idea: Eliminate redundant RC updates
    • Redundancy with respect to RC invariant
  – Advantages
    • Reclamation characteristics unchanged
    • Pause time no worse than unoptimized case

Talk Outline

• Optimizations (and related analyses)
  – RC subsumption
  – Acyclic object RC update specialization

• Experimental results
  – Impact on execution times
  – Comparison with deferred RC collection

• Conclusions

Optimizations

• Fall into three categories
  – Data-centric (immortal RC update elision, acyclic object RC update specialization)
  – Program-centric (RC subsumption, RC update coalescing, null-check omission)
  – RC update-centric (RC update inlining)

RC Subsumption: Intuition


Flow-Insensitive RC Subsumption

• y is always RC subsumed by x if
  1. All live ranges of y are contained in x
  2. The variable y is never live through a redefinition of either y or x
  3. Everything reachable from y is also reachable from x
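A small simulation may make the payoff concrete. It assumes a hypothetical straight-line fragment (`x := new object; y := x; ... y ...; ... x ...`) in which y satisfies the three provisions with respect to x, and it simply tallies the count of the one object involved; `simulate` is an illustrative name, not part of the talk.

```python
def simulate(subsume_y):
    """Return (number of RC updates, count seen at each use, final count)."""
    rc = 0
    updates = 0
    rc_at_uses = []

    # x := new object
    rc += 1
    updates += 1

    # y := x  (elided when y is RC subsumed by x)
    if not subsume_y:
        rc += 1
        updates += 1

    rc_at_uses.append(rc)        # ... y ...

    # y dies here, inside the live range of x
    if not subsume_y:
        rc -= 1
        updates += 1

    rc_at_uses.append(rc)        # ... x ...

    # x dies
    rc -= 1
    updates += 1
    return updates, rc_at_uses, rc


print(simulate(subsume_y=False))
print(simulate(subsume_y=True))
```

In both runs the count is nonzero at every use and reaches zero at the same point, so the nondeferred invariant and the reclamation point are unchanged; only the number of updates differs.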

Live Range Webs

[Figure: code fragments x := ...; y := x; ... y ...; ... x ...; ... y ...; x := ...; y := x]

Provision 1: Live-Range Subsumption Graph

• Directed graph G_L
  – Nodes represent local references
  – Edges denote live-range containment
  – (y, x) means “y is always contained in x”

• Quadratic algorithm
  – Start with G = (V, E)
  – Add (u, v) if u is live and v dead at point P
  – Complement of G is G_L
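The construction can be sketched as follows, assuming liveness has already been computed and is supplied as one set of live variables per program point. The function name and the input encoding are hypothetical, and the sketch starts from an empty edge set.

```python
from itertools import product


def live_range_subsumption_graph(variables, live_sets):
    """Build G_L: (y, x) is an edge when x is live at every point where y is live."""
    g = set()
    for live in live_sets:                 # one entry per program point P
        for u in live:
            for v in variables:
                if v not in live:          # u is live and v is dead at P
                    g.add((u, v))
    # The complement of G over all ordered pairs is G_L (self-pairs included).
    return set(product(variables, repeat=2)) - g


# x is live at three points, y only at the middle one.
g_l = live_range_subsumption_graph(["x", "y"], [{"x"}, {"x", "y"}, {"x"}])
```

Here `("y", "x")` ends up in `g_l` because y is never live where x is dead, while `("x", "y")` does not.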

A Contingent Opportunity


Provision 2: Uncut Live-Range Subsumption Graph

• Handles redefinition provision

• Directed graph G_E
  – Start with G_L
  – Find livethru(s) and defsmay(s)
  – Then liverdef(s) = livethru(s) ∩ defsmay(s)
  – Delete (u, x) if u ∈ liverdef(s)
  – Delete (y, u) if y ∈ livethru(s) and u ∈ liverdef(s)
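A sketch of this pruning step is given below. It assumes that liverdef(s) is the intersection of livethru(s) and defsmay(s) and that the deletion conditions are set memberships, which is how the extracted slide text is read here; the function name and the per-statement input encoding are hypothetical.

```python
def uncut_live_range_subsumption_graph(g_l, statements):
    """Prune G_L into G_E.

    `statements` is a list of (livethru, defsmay) pairs of sets, one per statement s.
    """
    g_e = set(g_l)
    for livethru, defsmay in statements:
        liverdef = livethru & defsmay      # live through s and possibly redefined by s
        g_e = {
            (a, b)
            for (a, b) in g_e
            if a not in liverdef                          # delete (u, x) if u in liverdef(s)
            and not (a in livethru and b in liverdef)     # delete (y, u) if y lives through a redefinition of u
        }
    return g_e


g_l = {("y", "x"), ("z", "x")}
# A statement that may redefine x while both x and y are live through it.
g_e = uncut_live_range_subsumption_graph(g_l, [({"x", "y"}, {"x"})])
```

The edge `("y", "x")` is dropped because y is live through a redefinition of x, while `("z", "x")` survives because z is not live through that statement.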

Overlooking Roots

[Figure: stack with roots v and u, and objects A and B]

u := v

u := v.g (g is a read-only field)

u := v[e] (v is thread local and v[e] isn’t written into before v dies)

u := v.f (v is thread local and v.f isn’t written into before v dies)

Provision 3: RC Subsumption Graph

• Start with G_E

• Delete (u, v), where u ≠ v
  – nothing overlooks u at its definition
  – u is overlooked by w and (w, v) ∉ G_R

• Delete until fixed point is reached

• Approximate overlooking roots’ set used
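The fixed-point deletion can be sketched as below, under several simplifying assumptions that are not taken from the talk: each local has a single definition with at most one overlooking root, given by the hypothetical map `overlooker`; an edge (u, v) is kept when that root is v itself or is in turn RC subsumed by v; and self-edges are never deleted.

```python
def rc_subsumption_graph(g_e, overlooker):
    """Prune G_E into G_R by deleting unsupported edges until nothing changes."""
    g_r = set(g_e)
    changed = True
    while changed:
        changed = False
        for (u, v) in sorted(g_r):         # sorted() copies, so deleting inside the loop is safe
            if u == v:
                continue
            w = overlooker.get(u)          # root overlooking u at its definition, if any
            supported = w is not None and (w == v or (w, v) in g_r)
            if not supported:
                g_r.discard((u, v))
                changed = True             # a deletion may invalidate other edges
    return g_r


g_e = {("u", "v"), ("u", "w"), ("w", "v")}
kept_all = rc_subsumption_graph(g_e, {"u": "w", "w": "v"})
kept_some = rc_subsumption_graph(g_e, {"u": "w"})
```

In the second call nothing overlooks w, so `("w", "v")` is deleted first; that removes the support for `("u", "v")`, which is deleted on the next pass. This cascading is why the deletion has to run to a fixed point.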


The Problem of Garbage Cycles

• Reference counting can’t capture cycles

• Three solutions:
  – Programming paradigms
  – Back-up tracing collector
  – Local tracing solution: trial deletion

Background on Trial Deletion

• Decremented references buffered

• Trial deletion adds overheads
  – Bookkeeping memory (PLC buffer, PLC link)
  – Extra processing in RC updates

• Idea: Statically identify acyclic objects
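The extra processing, and what specialization saves, might look roughly like this. The sketch assumes the common form of trial deletion in which an object whose count is decremented to a nonzero value is buffered as a possible root of a garbage cycle; `dec_general`, `dec_acyclic`, and the list standing in for the PLC buffer are hypothetical, and the recursive release of children is omitted for brevity.

```python
class Obj:
    def __init__(self, name, rc):
        self.name = name
        self.rc = rc


def dec_general(obj, plc_buffer, freed):
    """Decrement for an object that might sit on a cycle."""
    obj.rc -= 1
    if obj.rc == 0:
        freed.append(obj.name)
    else:
        plc_buffer.append(obj)     # remember it as a candidate for trial deletion


def dec_acyclic(obj, freed):
    """Specialized decrement for an object statically known to be acyclic."""
    obj.rc -= 1
    if obj.rc == 0:
        freed.append(obj.name)     # no buffering: it can never be part of a garbage cycle


plc_buffer, freed = [], []
dec_general(Obj("node", rc=2), plc_buffer, freed)   # buffered
dec_acyclic(Obj("string", rc=2), freed)             # nothing buffered
```

A compiler that knows the static type of the operand is acyclic could emit the second form, avoiding both the buffer entry and the branch that maintains it.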

Acyclic Type Analysis

• Determine types that are always acyclic

• Type hierarchy and field information
  – Type connectivity (TC) graph

• SCC decomposition of TC graph
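One way to realize this analysis is sketched below. It assumes the TC graph is given as a dictionary from each type to the types its reference fields may point to (with subtyping already folded in), and it treats a type as acyclic when it forms a single-node strongly connected component with no self-edge. The example types and function names are hypothetical; the SCC routine is Tarjan's algorithm.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; returns a list of sets of nodes."""
    index, low = {}, {}
    stack, on_stack, sccs = [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:             # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs


def acyclic_types(tc_graph):
    """Types whose instances can never lie on a reference cycle."""
    result = set()
    for comp in strongly_connected_components(tc_graph):
        if len(comp) == 1:
            (t,) = comp
            if t not in tc_graph.get(t, ()):   # no self-edge
                result.add(t)
    return result


tc_graph = {
    "Node": ["Node", "String"],   # Node.next : Node makes Node cyclic
    "List": ["Node"],
    "Pair": ["String", "Int"],
    "String": [],
    "Int": [],
}
print(sorted(acyclic_types(tc_graph)))
```

A more conservative variant would also exclude types that can reach a cyclic component, such as `List` above; which definition is appropriate depends on how the collector uses the result.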

Building the TC Graph

• Separate compilation

• Immortal object optimization

• Array subtyping issues


Other Optimizations

• RC updates on immortal objects
  – vtables, string literals, GC tables

• Coalescing of RC updates

• Non-null operand RC update specialization

• RC update inlining


Benchmarks


Optimization Effects


Overlooking Roots’ Set Effects


RC Update Distributions


Summary

• High overheads can be drastically reduced without compromising on benefits!
  – Key: a new analysis called RC subsumption
    • Improvements due to it alone often significant
  – Execution times on a par with deferred RC collection on a number of programs
  – Challenges wisdom on classic RC efficiency

• Scope for further improvement exists

• Future Work: Multithreading