GC Advantage: Improving Program Locality

Xianglong Huang, Zhenlin Wang, Stephen M Blackburn, Kathryn S McKinley,

J Eliot B Moss, Perry Cheng

Motivation

Memory gap
How are Java programs affected?

Marksweep vs. Copying

pseudojbb

Motivation

Javac with perfect L1 and L2 cache.

16K L1, 256K L2. Appel, GCTk. Breadth first.

[Figure: _213_javac execution time in 10^9 cycles (axis 0 to 25) for three configurations: original, perfect L2, perfect L1]

Motivation

Copying collector can reorder objects
Goal: take advantage of copying collectors to reorder objects and improve locality

Exploring The Space

Different policies for traversing roots
Class-oblivious traversal orders
  Which traversal order is the best?
Class-based traversal orders
  How to find the “important” data structure?

Different Root Traversal Policies

Two different types of roots:
  Stack, global variables
  Remembered sets (for generational)
Different traversal orders
  Copy all roots before traversing any children
  Copy each root and its children (root-by-root)
  Split roots
    Stack first and the children
    Remset first and the children
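
The root policies above can be illustrated on a toy heap. This is only a sketch under stated assumptions: the heap is a plain dictionary, children are scanned breadth first, and the split-roots variants copy all roots of one kind before their children. The function name `copy_order` and the policy strings are hypothetical, not JikesRVM or JMTk identifiers.

```python
from collections import deque

def copy_order(heap, stack_roots, remset_roots, policy):
    """Return the order in which objects are copied under a root policy.

    heap maps an object id to the list of object ids it points to.
    (Illustrative model only; names and policy strings are assumptions.)
    """
    copied, order = set(), []

    def copy(obj):
        # Copy an object at most once; report whether it was newly copied.
        if obj in copied:
            return False
        copied.add(obj)
        order.append(obj)
        return True

    def scan(queue):
        # Breadth-first scan of everything reachable from the queued objects.
        while queue:
            for child in heap.get(queue.popleft(), []):
                if copy(child):
                    queue.append(child)

    def all_roots_first(roots):
        # Copy every root, then traverse the children.
        scan(deque([r for r in roots if copy(r)]))

    def root_by_root(roots):
        # Copy one root and everything reachable from it, then the next root.
        for r in roots:
            if copy(r):
                scan(deque([r]))

    if policy == "all-roots-first":
        all_roots_first(stack_roots + remset_roots)
    elif policy == "root-by-root":
        root_by_root(stack_roots + remset_roots)
    elif policy == "stack-first":
        all_roots_first(stack_roots)
        all_roots_first(remset_roots)
    elif policy == "remset-first":
        all_roots_first(remset_roots)
        all_roots_first(stack_roots)
    else:
        raise ValueError(policy)
    return order

# Two stack roots and one remembered-set root, each with one child.
heap = {"s1": ["a"], "s2": ["b"], "r1": ["c"], "a": [], "b": [], "c": []}
for policy in ("all-roots-first", "root-by-root", "stack-first", "remset-first"):
    print(policy, copy_order(heap, ["s1", "s2"], ["r1"], policy))
```

In this model, root-by-root places each root next to its own children in copy order, whereas copying all roots first separates them.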

Experiment Setup

JikesRVM, JMTk
Generational copying collector with bounded nursery size of 4MB
PseudoAdaptive 2nd iteration

Different Root Traversal Policies

•RxR (root-by-root) has the best mutator locality

Different Root Traversal Policies

•Total execution time

Exploring The Space

Different policies for traversing roots
Class-oblivious traversal orders
  Which traversal order is the best?
Class-based traversal orders
  How to find the “important” data structure?

Different Traversal Orders

Breadth first: 1,2,3,4,5,6,7
Pure depth first: 1,2,6,3,4,7,5
Pure depth first, LIFO: 1,5,4,7,3,2,6

[Figure: example object graph, nodes 1 to 7]

Different Traversal Orders

Breadth first: 1,2,3,4,5,6,7
Pure depth first: 1,2,6,3,4,7,5
Pure depth first, LIFO: 1,5,4,7,3,2,6
Partial depth first, 2 children: 1,2,6,3,4,5,7

[Figure: example object graph, nodes 1 to 7]
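
The four orders listed above can be reproduced with a short sketch. The figure itself did not survive, so the graph below is an assumption inferred from the listed orders: node 1 points to 2, 3, 4 and 5; node 2 points to 6; node 4 points to 7. The partial depth-first mechanics (follow the first k children depth first, copy the remaining children immediately and scan them later) are also an assumption chosen to match the listed order, not a description of the collector's implementation.

```python
from collections import deque

# Assumed example graph, inferred from the traversal orders on the slide.
TREE = {1: [2, 3, 4, 5], 2: [6], 3: [], 4: [7], 5: [], 6: [], 7: []}

def breadth_first(tree, root):
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree[node])
    return order

def depth_first(tree, root):
    # Recursive depth first, children visited in field order.
    order = [root]
    for child in tree[root]:
        order.extend(depth_first(tree, child))
    return order

def depth_first_lifo(tree, root):
    # Explicit stack: the last child pushed is the first one visited.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(tree[node])
    return order

def partial_depth_first(tree, root, k=2):
    order, queue = [root], deque()

    def place_children(node):
        children = tree[node]
        for child in children[:k]:   # first k children: follow depth first
            order.append(child)
            place_children(child)
        for child in children[k:]:   # remaining children: copy now, scan later
            order.append(child)
            queue.append(child)

    place_children(root)
    while queue:
        place_children(queue.popleft())
    return order

print(breadth_first(TREE, 1))        # [1, 2, 3, 4, 5, 6, 7]
print(depth_first(TREE, 1))          # [1, 2, 6, 3, 4, 7, 5]
print(depth_first_lifo(TREE, 1))     # [1, 5, 4, 7, 3, 2, 6]
print(partial_depth_first(TREE, 1))  # [1, 2, 6, 3, 4, 5, 7]
```

The sketch assumes a tree, so no object is reachable by two paths; a real collector would also check whether an object has already been copied.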

Class Oblivious Type

Different traversal policies
  Partial DF is the best

Exploring The Space

Different policies for traversing roots
Class-oblivious traversal orders
  Which traversal order is the best?
Class-based traversal orders
  How to find the “important” data structure?

Class-based Traversal

Class-oblivious traversal orders are inflexible
Class-based object traversal
  Static profiling
  Dynamic sampling

Static Profiling

Profile object accesses
Find hot pairs with strong correlation
Example
  (1,4), (4,7) and (2,6) have strong correlation
  Order: 1,4,7,2,6,3,5

[Figure: example object graph, nodes 1 to 7]
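
One way to arrive at the order in the example is a depth-first traversal that visits hot-pair children before the other children. This is a minimal sketch, not the authors' implementation: the graph is the one inferred from the earlier traversal-order slides (1 points to 2, 3, 4, 5; 2 points to 6; 4 points to 7), and `hot_first_order` is a hypothetical name.

```python
# Assumed example graph and the hot pairs given on the slide.
TREE = {1: [2, 3, 4, 5], 2: [6], 3: [], 4: [7], 5: [], 6: [], 7: []}
HOT_PAIRS = {(1, 4), (4, 7), (2, 6)}

def hot_first_order(tree, root, hot_pairs):
    """Depth-first copy order in which hot (parent, child) edges go first."""
    order = [root]
    children = tree[root]
    hot = [c for c in children if (root, c) in hot_pairs]
    cold = [c for c in children if (root, c) not in hot_pairs]
    for child in hot + cold:
        order.extend(hot_first_order(tree, child, hot_pairs))
    return order

print(hot_first_order(TREE, 1, HOT_PAIRS))  # [1, 4, 7, 2, 6, 3, 5]
```

With an empty set of hot pairs the function falls back to pure depth first, which gives 1,2,6,3,4,7,5 on this graph.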

Online Profiling

Use the adaptive compiler's sampling
  Hot method
  Hot basic block
Use field accesses to indicate hot fields
Example (in a hot method):

{
  Class A a;
  a.b = …;
  …
}

[Figure: object of class A whose field b references an object of class B]
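
The idea of turning sampled field accesses into copying advice can be sketched as follows. Everything here is an illustrative assumption: the sample format (class name, field name), the `threshold` parameter, and the function names are hypothetical and do not correspond to the JikesRVM adaptive compiler's interfaces.

```python
from collections import Counter

def hot_fields(samples, threshold):
    """Map each class name to its hot fields, most frequently sampled first.

    samples is a list of (class_name, field_name) pairs observed in hot
    methods; a field is treated as hot once it has at least `threshold` hits.
    """
    counts = Counter(samples)
    advice = {}
    for (cls, field), hits in counts.most_common():
        if hits >= threshold:
            advice.setdefault(cls, []).append(field)
    return advice

def field_copy_order(cls, fields, advice):
    """Order an object's pointer fields so advised hot fields come first."""
    hot = [f for f in advice.get(cls, []) if f in fields]
    cold = [f for f in fields if f not in hot]
    return hot + cold

# Field b of class A is accessed often in a hot method, so it becomes hot.
samples = [("A", "b")] * 5 + [("A", "c")] + [("B", "x")] * 3
advice = hot_fields(samples, threshold=3)
print(advice)
print(field_copy_order("A", ["c", "b", "d"], advice))
```

A collector using such advice would copy the referent of `b` right after an object of class A, while classes without advice keep their default field order.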

Online Profiling

Micro benchmark results

Online Profiling

Geometric mean

Reasons

No advice for most of the objects copied
  For jess, db and raytrace, we only pick <<1% of the objects as hot objects
  5% for javac
The hot fields are within the first 2 pointers
  90% of the advised objects for javac

Online Profiling

PseudoJBB mutator results
  Generate advice for 23% of the copied objects
  75% of the objects have advised hot fields other than the first 2

Questions

Have we found all the hot objects?
  Not all hot objects are connected?
Is class-based good enough?
  For pseudojbb, do we need instance-based?
Locality for the nursery objects?

Future Work

Sampling technique
  Catch more hot object accesses
    Lower the threshold
  Hot objects that are not connected
  Dynamically change the advice for phase changes
Nursery locality
Different traversal orders for cold objects
Instance-based

Conclusion

Reordering objects during copying collection can improve locality

Among class-oblivious traversal orders, partial depth first is the best

Online profiling with class-based traversal is more flexible
  Up to 50% better
  Very low overhead, ~0%

Still mysteries

Questions?

Answers?

Lower the threshold of the sampling, not only the hot methods

For objects with only 1 or 2 pointers, it may be easier to just use depth first

Maybe the nursery locality is more important

Instance-based advice

Online Profiling

Execution overhead

[Figure: execution overhead, axis from -6.00% to 5.00%]

Online Profiling

Micro benchmark results for mutator time

Different Root Traversal Policies

_227_mtrt

Static Profiling

Results

Answers?

Most objects have only one pointer
Percentage of objects copied by advice (whether it is really hot?)
  For pseudojbb ~50%, for jess <<1%, for our micro benchmark ~16%
Change! Half of the pairs do not form chains longer than 2
Maybe the nursery locality is more important

Class Oblivious Orderings

Different traversal policies
  Partial DF is better

pseudoJBB

Motivation

MarkSweep vs. Copying Collector

Mutator time of _213_javac

Motivation

Mutator L2 misses of _213_javac