
380C Lecture 17

• Where are we & where we are going
  – Managed languages
    • Dynamic compilation
    • Inlining
    • Garbage collection
  – Why you need to care about workloads & experimental methodology
  – Alias analysis
  – Dependence analysis
  – Loop transformations
  – EDGE architectures

1


Today

• Garbage Collection
  – Why use garbage collection?
  – What is garbage?
    • Reachable vs live, stack maps, etc.
  – Allocators and their collection mechanisms
    • Semispace
    • Marksweep
    • Performance comparisons
  – Incremental age-based collection
    • Write barriers: Friend or foe?
    • Generational
    • Beltway
  – More performance

3

Basic VM Structure

[Diagram: Program/Bytecode feeds the Executing Program; VM components: Dynamic Compilation Subsystem, Class Loader, Verifier, etc., Heap, Thread Scheduler, Garbage Collector]

4

True or False?

• Real programmers use languages with explicit memory management.
  – I can optimize my memory management much better than any garbage collector
  – Scope of effort?

6

Why Use Garbage Collection?

• Software engineering benefits
  – Less user code compared to explicit memory management (MM)
  – Less user code to get correct
  – Protects against some classes of memory errors
    • No free(), thus no premature free(), no double free(), no forgetting to free()
    • Not perfect: memory can still leak; programmers still need to eliminate all pointers to objects the program no longer needs
• Performance: space-time tradeoff
  – Time proportional to dead objects (explicit MM, reference counting) or live objects (semispace, marksweep)
  – Throughput versus pause time
    • Less frequent collection typically reduces total time but increases space requirements and pause times
  – Hidden locality benefits?

7

What is Garbage?

• In theory, any object the program will never reference again
  – But the compiler & runtime system cannot figure that out
• In practice, any object the program cannot reach is garbage
  – Approximate liveness with reachability
• Managed languages couple GC with “safe” pointers
  – Programs may not access arbitrary addresses in memory
  – The compiler can identify and provide to the garbage collector all the pointers, thus “once garbage, always garbage”
  – The runtime system can move objects by updating pointers
  – “Unsafe” languages can do non-moving GC by assuming anything that looks like a pointer is one

8
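The “approximate liveness with reachability” idea above can be sketched as a transitive closure over an object graph. This is a toy Python model (the dict-based objects and the `reachable` helper are illustrative inventions, not anything from the lecture): anything not reached from the roots is garbage, even if the program could in principle still name it.

```python
# Hypothetical object model: each "object" is a dict whose values may
# reference other objects. Roots stand in for stacks/globals/registers.
def reachable(roots):
    """Transitive closure over the points-to graph; everything whose id
    is not returned here is garbage under the reachability approximation."""
    seen = set()
    work = list(roots)
    while work:
        obj = work.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        work.extend(v for v in obj.values() if isinstance(v, dict))
    return seen

a = {}              # reachable only through b
b = {"f": a}
live = reachable([b])
assert id(a) in live and id(b) in live
c = {}              # no root or field points to c: garbage
assert id(c) not in live
```

Note that `reachable` over-approximates true liveness: an object the program never touches again still counts as live as long as some pointer chain reaches it.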

Reachability

[Diagram: roots in the stack, globals, and registers point into the heap; code: ... r0 = obj; p.f = obj ...]

• Compiler produces a stack-map at GC safe-points and Type Information Blocks
• GC safe points: new(), method entry, method exit, & back-edges (thread switch points)
• Stack-map: enumerate global variables, stack variables, live registers (this code is hard to get right! Why?)
• Type Information Blocks: identify reference fields in objects

9

Reachability

[Diagram: roots in the stack, globals, and registers point into the heap]

• Compiler produces a stack-map at GC safe-points and Type Information Blocks
• Type Information Blocks: identify reference fields in objects
  – for each type i (class) in the program, a map TIB_i of its reference-field offsets

10

Reachability

• Tracing collector (semispace, marksweep)
  – Marks the objects reachable from the roots live, and then performs a transitive closure over them
• All unmarked objects are dead, and can be reclaimed

[Diagram sequence: the mark phase traces from the stack, globals, and registers into the heap; the sweep reclaims the unmarked objects]

15


Semispace

• Fast bump pointer allocation
• Requires copying collection
• Cannot incrementally reclaim memory, must free en masse
• Reserves 1/2 the heap to copy into, in case all objects are live

[Diagram: heap split into to space and from space]

17
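Bump-pointer allocation as described above is just an add and a bounds check, with half the heap held in reserve. A minimal sketch (toy Python class; sizes are abstract bytes and the `Semispace` name is mine, not the lecture's):

```python
class Semispace:
    """Toy semispace heap: bump-pointer allocation into the current
    to-space; the other half of the heap is reserved for copying."""
    def __init__(self, size):
        self.half = size // 2   # each semispace is half the heap
        self.free = 0           # bump pointer into the current to-space
    def alloc(self, n):
        if self.free + n > self.half:
            return None         # out of space: would trigger a collection
        addr = self.free
        self.free += n          # allocation is just an add + compare
        return addr

h = Semispace(100)
assert h.alloc(20) == 0
assert h.alloc(20) == 20        # contemporaneous objects land adjacent
assert h.alloc(20) is None      # only 50 of the 100 bytes are usable
```

The last assert shows the slide's "wasted space" point: the reserve means a collection triggers while half the heap is still empty.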


Semispace

• Mark phase:
  – copies an object when the collector first encounters it
  – installs forwarding pointers
  – performs the transitive closure, updating pointers as it goes
  – reclaims “from space” en masse
  – starts allocating again into “to space”

[Diagram sequence: live objects copied from from space to to space]

27

Semispace

• Notice:
  + fast allocation
  + locality of contemporaneously allocated objects
  + locality of objects connected by pointers
  - wasted space

[Diagram: heap split into from space and to space]

28
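The copy-on-first-encounter / forwarding-pointer scheme above can be sketched Cheney-style in a few lines. This is a toy Python model (objects are dicts, a `fwd` entry stands in for the forwarding pointer; names are mine): the to-space doubles as the scan queue, and the closure updates pointers as it copies.

```python
def copy_collect(roots):
    """Cheney-style copying sketch: copy each reachable object once,
    install a forwarding pointer, and scan copies to fix up fields."""
    to_space = []                      # copied objects, in copy order
    def forward(obj):
        if obj is None:
            return None
        if "fwd" not in obj:           # first encounter: copy it
            new = {k: v for k, v in obj.items()}
            obj["fwd"] = new           # install the forwarding pointer
            to_space.append(new)
        return obj["fwd"]
    new_roots = [forward(r) for r in roots]
    scan = 0                           # transitive closure over copies
    while scan < len(to_space):
        obj = to_space[scan]
        for k, v in obj.items():
            if isinstance(v, dict):
                obj[k] = forward(v)    # update pointer as we go
        scan += 1
    return new_roots, to_space

x = {"f": None}
y = {"g": x}
dead = {"h": x}                        # unreachable: never copied
roots, heap = copy_collect([y])
assert len(heap) == 2                  # only y and x survive
assert roots[0]["g"] is heap[1]        # pointer updated to the copy
```

The forwarding pointer is what makes objects shared by multiple pointers get copied exactly once, and copy order gives connected objects the locality noted on the slide.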

Marksweep

• Free-lists organized by size
  – blocks of same size, or
  – individual objects of same size
• Most objects are small (< 128 bytes)

[Diagram: free lists for size classes 4, 8, 12, 16, ..., 128 over the heap]

29
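The size-segregated free lists above can be sketched as one list per size class, with allocation rounding a request up to its class. A toy Python model (the class names, the particular size classes, and the string "cells" are illustrative assumptions):

```python
SIZES = [4, 8, 12, 16, 128]     # assumed size classes, as on the slide

class FreeLists:
    """Toy size-segregated free-list allocator."""
    def __init__(self):
        # one free list per size class; pre-populate two cells each
        self.lists = {s: [f"{s}b-cell{i}" for i in range(2)] for s in SIZES}
    def size_class(self, n):
        return min(s for s in SIZES if s >= n)   # round up to a class
    def alloc(self, n):
        lst = self.lists[self.size_class(n)]
        if not lst:
            return None          # empty list would trigger a collection
        return lst.pop()         # grab a free cell off the list
    def free(self, n, cell):
        self.lists[self.size_class(n)].append(cell)

fl = FreeLists()
c = fl.alloc(10)                 # rounds up to the 12-byte class
assert c.startswith("12b")
fl.alloc(12)
assert fl.alloc(12) is None      # class exhausted: time to collect
```

Allocation here is a pop rather than a bump, which is why the later slides measure it as slower than the bump pointer, and why contemporaneously allocated objects need not be adjacent.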

Marksweep

• Allocation
  – Grab a free object off the free list

[Diagram: free lists 4, 8, 12, 16, ..., 128 over the heap]

30


Marksweep

• Allocation
  – Grab a free object off the free list
  – No more memory of the right size triggers a collection
  – Mark phase: find the live objects
  – Sweep phase: put free ones on the free list

[Diagram: free lists 4, 8, 12, 16, ..., 128 over the heap]

33

Marksweep

• Mark phase
  – Transitive closure marking all the live objects
• Sweep phase
  – sweep the memory for free objects, populating the free list

[Diagram: free lists 4, 8, 12, 16, ..., 128 over the heap]

34


Marksweep

• Mark phase
  – Transitive closure marking all the live objects
• Sweep phase
  – sweep the memory for free objects, populating the free list
  – can be made incremental by organizing the heap in blocks and sweeping one block at a time on demand

[Diagram: free lists 4, 8, 12, 16, ..., 128 over the heap]

37
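The mark and sweep phases above fit in a short sketch. This is a toy Python model (the heap is a list of cells holding field dicts of heap indices; the representation is mine, not the lecture's): mark is the closure from the roots, sweep turns every unmarked cell into a free-list entry.

```python
def mark_sweep(heap, roots):
    """Toy mark-sweep: heap is a list of cells, each either None (free)
    or a dict of fields whose values are heap indices."""
    marked = set()
    work = list(roots)                   # mark: closure from the roots
    while work:
        i = work.pop()
        if i in marked or heap[i] is None:
            continue
        marked.add(i)
        work.extend(heap[i].values())
    free_list = []
    for i, cell in enumerate(heap):      # sweep: unmarked cells go free
        if cell is not None and i not in marked:
            heap[i] = None               # reclaim in place, no copying
        if heap[i] is None:
            free_list.append(i)
    return free_list

heap = [{"f": 1}, {}, {"g": 0}, None]    # cell 2 is unreachable
free = mark_sweep(heap, roots=[0])
assert free == [2, 3] and heap[2] is None
```

Because objects stay where they are, reclamation is incremental-friendly (the sweep loop could stop after any block), matching the slide's point; the cost is that the sweep touches the whole heap rather than just the survivors.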

Marksweep

+ space efficiency
+ incremental object reclamation
- relatively slower allocation time
- poor locality of contemporaneously allocated objects

[Diagram: free lists 4, 8, 12, 16, ..., 128 over the heap]

38

How do these differences play out in practice?

Marksweep
+ space efficiency
+ incremental object reclamation
- relatively slower allocation time
- poor locality of contemporaneously allocated objects

Semispace
+ fast allocation
+ locality of contemporaneously allocated objects
+ locality of objects connected by pointers
- wasted space

39

Methodology [SIGMETRICS 2004]

• Compare Marksweep (MS) and Semispace (SS)
• Mutator time, GC time, total time
• Jikes RVM & MMTk
  • replay compilation
  • measure second iteration without compilation
• Platforms
  • 1.6GHz G5 (PowerPC 970)
  • 1.9GHz AMD Athlon 2600+
  • 2.6GHz Intel P4
  • Linux 2.6.0 with perfctr patch & libraries
    – Separate accounting of GC & mutator counts
• SPECjvm98 & pseudojbb

40

Allocation Mechanism

• Bump pointer
  – ~70 bytes of IA32 instructions, 726 MB/s
• Free list
  – ~140 bytes of IA32 instructions, 654 MB/s
• Bump pointer 11% faster in a tight loop
  – < 1% in a practical setting
  – No significant difference (?)

41

Mutator Time

[Charts for jess: normalized mutator time, L1 misses, L2 misses, and TLB misses vs. normalized heap size, MarkSweep vs. SemiSpace]

46

javac

[Charts: normalized mutator time, L1 misses, L2 misses, and TLB misses vs. normalized heap size, MarkSweep vs. SemiSpace]

47

pseudojbb

[Charts: normalized mutator time, L1 misses, L2 misses, and TLB misses vs. normalized heap size, MarkSweep vs. SemiSpace]

48

Geometric Mean Mutator Time

[Chart: mutator time (normalized to best) vs. heap size (relative to min), MarkSweep vs. SemiSpace]

49

Garbage Collection Time

[Charts: GC time (normalized to best) vs. heap size (relative to min), MarkSweep vs. SemiSpace, for the geometric mean, pseudojbb, jess, and javac]

51

Total Time

[Charts: total time (normalized to best) vs. heap size (relative to min), MarkSweep vs. SemiSpace, for the geometric mean, pseudojbb, jess, and javac]

53

MS/SS Crossover

[Charts: normalized total time vs. heap size relative to minimum, SemiSpace vs. MarkSweep, on 1.6GHz PPC, 1.9GHz AMD, 2.6GHz P4, and 3.2GHz P4; the crossover point between MarkSweep's space efficiency and SemiSpace's locality shifts with processor speed]

58

Today

• Garbage Collection
  – Why use garbage collection?
  – What is garbage?
    • Reachable vs live, stack maps, etc.
  – Allocators and their collection mechanisms
    • Semispace
    • Marksweep
    • Performance comparisons
  – Incremental age-based collection
    • Enabling mechanisms
      – write barrier & remembered sets
    • Heap organizations
      – Generational
      – Beltway
  – Performance comparisons

59

One Big Heap?

• Pause times
  – it takes too long to trace the whole heap at once
• Throughput
  – the heap contains lots of long-lived objects; why collect them over and over again?
• Incremental collection
  – divide up the heap into increments and collect one at a time

[Diagram: Increment 1 and Increment 2, each with its own to space and from space]

60

Incremental Collection

Ideally
• perfect knowledge of live pointers between increments
• but that requires scanning the whole heap, which defeats the purpose

[Diagram: Increment 1 and Increment 2, each with to space and from space]

61


Incremental Collection

Ideally
• perfect knowledge of live pointers between increments
• but that requires scanning the whole heap, which defeats the purpose

Mechanism: write barrier
• records pointers between increments when the mutator installs them: a conservative approximation of reachability

[Diagram: Increment 1 and Increment 2, each with to space and from space]

64

Write barrier

The compiler inserts code that records pointers between increments when the mutator installs them:

// original program      // compiler support for incremental collection
p.f = o;                 if (incr(p) != incr(o)) {
                           remset(incr(o)) = remset(incr(o)) ∪ {p.f};
                         }
                         p.f = o;

[Diagram: Increment 1 holds objects a b c d e f g; Increment 2 holds t u v w x y z; remset1 = {w}, remset2 = {f, g}]

65
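The barrier pseudocode above can be sketched concretely. A toy Python model (the `increment_of` bookkeeping table and the `write_barrier` function are illustrative assumptions; a real VM would derive the increment from the object's address):

```python
increment_of = {}            # obj id -> increment number (assumed bookkeeping)
remsets = {1: set(), 2: set()}

def write_barrier(p, field, o):
    """Record cross-increment stores, then perform the store p.f = o."""
    if increment_of[id(p)] != increment_of[id(o)]:
        # remember the source slot so the target increment can later
        # treat it as a root; this is conservative, not exact liveness
        remsets[increment_of[id(o)]].add((id(p), field))
    p[field] = o             # the original store

d = {}; v = {}; y = {}
increment_of[id(d)] = 1
increment_of[id(v)] = increment_of[id(y)] = 2
write_barrier(d, "f", v)                 # cross-increment: recorded
assert (id(d), "f") in remsets[2]
write_barrier(d, "f", y)                 # same slot updated again
assert len(remsets[2]) == 1              # a set dedupes; the slides' plain
                                         # store buffer would hold d twice
```

The second store illustrates the next slides' point: logging the same slot repeatedly is harmless but wasteful, so real remsets either deduplicate or tolerate duplicate entries.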

Write barrier

• Install a new pointer d → v: d is in Increment 1 and v in Increment 2, so the barrier records d's slot, giving remset2 = {f, g, d}
• Then update the same slot to d → y: the barrier records again, giving remset2 = {f, g, d, d}; a simple store buffer happily logs duplicates

[Diagram: Increment 1 holds objects a b c d e f g; Increment 2 holds t u v w x y z; remset1 = {w}]

68

Write barrier

At collection time
• the collector re-examines all entries in the remset for the increment, treating them like roots
• Collect Increment 2

[Diagram: Increment 1 and Increment 2; remset1 = {w}, remset2 = {f, g, d, d}]

70

Summary of the costs of incremental collection

• write barrier to catch pointer stores crossing boundaries
• remsets to store crossing pointers
• processing remembered sets at collection time
• excess retention

71

Heap Organization

What objects should we put where?
• Generational hypothesis
  – young objects die more quickly than older ones [Lieberman & Hewitt '83, Ungar '84]
  – most pointers are from younger to older objects [Appel '89, Zorn '90]

Organize the heap into young and old; collect young objects preferentially

[Diagram: Young (to space) and Old (to space, from space)]

72

Generational Heap Organization

• Divide the heap into two spaces: young and old
• Allocate into the young space
• When the young space fills up,
  – collect it, copying survivors into the old space
• When the old space fills up,
  – collect both spaces (ignoring remembered sets)
• Generalizing to m generations
  – if space n (n < m) fills up, collect spaces 1 through n

[Diagram sequence: allocation fills the young space; minor collections copy survivors into the old space; a full collection copies within the old space]

86
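A minor collection as described above only traces the young space, using remset entries (old-to-young pointers) as extra roots. A toy Python sketch (the nursery-as-dict representation, `minor_collect` name, and name-based "pointers" are illustrative assumptions):

```python
def minor_collect(nursery, old, roots, remset):
    """Collect only the nursery: survivors (reachable from the roots or
    from remset entries, i.e. old-to-young pointers) are promoted to the
    old space; everything else in the nursery dies en masse."""
    survivors = set()
    work = [o for o in roots if o in nursery] + list(remset)
    while work:
        obj = work.pop()
        if obj in survivors or obj not in nursery:
            continue                       # already done, or an old object
        survivors.add(obj)
        work.extend(nursery.get(obj, []))  # follow young-space pointers
    for obj in survivors:                  # promote survivors to old space
        old[obj] = nursery[obj]
    nursery.clear()                        # reclaim the nursery en masse
    return old

nursery = {"a": ["b"], "b": [], "c": []}   # c has no path from the roots
old = {}
minor_collect(nursery, old, roots=["a"], remset=[])
assert set(old) == {"a", "b"}              # c was garbage
assert nursery == {}
```

Note what the remset buys: without it, a young object referenced only from the old space would be reclaimed incorrectly, which is why the write barrier on the next slides must catch every old-to-young store.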

Generational Write Barrier

• Unidirectional barrier: record only older-to-younger pointers
  – no need to record younger-to-older pointers, since we never collect the old space independently
• most pointers are from younger to older objects [Appel '89, Zorn '90]
• track the address barrier between the young and old spaces

[Diagram: Young (to space) and Old (to space, from space), separated by the address barrier]

87

Generational Write Barrier

Unidirectional boundary barrier:

// original program      // compiler support for generational collection
p.f = o;                 if (p > barrier && o < barrier) {
                           remset_nursery = remset_nursery ∪ {p.f};
                         }
                         p.f = o;

[Diagram: Young (to space) and Old (to space, from space), separated by the address barrier]

88
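The boundary test above becomes two address comparisons. A toy Python sketch (the `BARRIER` constant and the convention that old-space addresses sit above it are assumptions taken from the slide's `p > barrier && o < barrier` test):

```python
BARRIER = 1000               # addresses above are old space, below are young
nursery_remset = set()

def gen_write_barrier(p_addr, slot, o_addr):
    """Unidirectional barrier: record only old -> young stores, since the
    old space is never collected on its own."""
    if p_addr > BARRIER and o_addr < BARRIER:   # old object -> young object
        nursery_remset.add((p_addr, slot))
    # ... then perform the store itself: p.f = o

gen_write_barrier(1500, "f", 200)    # old -> young: the rare case, recorded
gen_write_barrier(300, "f", 1500)    # young -> old: the common case, no record
gen_write_barrier(300, "g", 400)     # young -> young: no record
assert nursery_remset == {(1500, "f")}
```

Because young-to-older pointers dominate (the Appel/Zorn observation on the slide), the common-case path is just two compares that fail, keeping the mutator overhead small.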

Generational Write Barrier

• Unidirectional: record only older-to-younger pointers; no need to record younger-to-older pointers, since we never collect the old space independently
  – most pointers are from younger to older objects [Appel '89, Zorn '90]
  – most mutations are to young objects [Stefanovic et al. '99]

[Diagram: Young (to space) and Old (to space, from space)]

89

Results

[Chart: GC time (normalized to best) vs. heap size (relative to min), MarkSweep, SemiSpace, GenMS, GenCopy]

91

Mutator Time

[Chart: mutator time (normalized to best) vs. heap size (relative to min), MarkSweep, SemiSpace, GenMS, GenCopy]

92

Total Time

[Chart: total time (normalized to best) vs. heap size (relative to min), MarkSweep, SemiSpace, GenMS, GenCopy]

McKinley, UT

Recap

• Copying improves locality
• Incrementality improves responsiveness
• Generational hypothesis
  – Young objects: most are very short lived
    • Infant mortality: ~90% die young (within 4MB of allocation)
  – Old objects: most are very long lived (bimodal)
    • Mature mortality: ~5% die per 4MB of new allocation
• Help from pointer mutations
  – In Java, pointers go in both directions, but older-to-younger pointers across many objects are rare (less than 1%)
  – Most mutations are among young objects (92 to 98% of pointer mutations)

380C

• Where are we & where we are going
  – Managed languages
    • Dynamic compilation
    • Inlining
    • Garbage collection
  – Can we get mutator locality, space efficiency, and collector efficiency all in one collector?
  – Read: Blackburn and McKinley, "Immix: A Mark-Region Garbage Collector with Space Efficiency, Fast Collection, and Mutator Performance," ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pp. 22-32, Tucson, AZ, June 2008.
  – Why you need to care about workloads & methodology
  – Alias analysis
  – Dependence analysis
  – Loop transformations
  – EDGE architectures

94