garbage collection auto tuning for java map reduce on multi-cores


Presentation by: Pradeeban Kathiravelu, INESC-ID Lisboa, Instituto Superior Técnico, Universidade de Lisboa

Garbage Collection Auto-Tuning for Java MapReduce on Multi-Cores

Jeremy Singer, George Kovoor, Gavin Brown, Mikel Luján



Agenda

Introduction

Motivation

Contributions

Evaluation

    Scalability

    GC Impact

    GC Auto Tuning

Related Work

Conclusions



Introduction

MRJ, a MapReduce Java framework for multi-core architectures.

Use of memory management auto-tuning techniques based on machine learning.

MRJ performance within 10% of optimal on 75% of the benchmark tests.



Why GC Auto Tuning?

An MRJ end user cannot be expected to perform the expert analysis needed to determine:

    Which GC activity is reducing MRJ performance.

    How to improve the JVM configuration.



Motivation

Efficient adaptation to benchmark-specific or heap-size-specific anomalies.

Could be installed by the system administrator and automatically enabled for users who do not have sufficient permissions to change JVM parameters.

Enables rapid deployment of MRJ on new multi-core architecture layouts.



Contributions

A scalable Java fork/join framework for MapReduce (MRJ) on a commodity multi-core platform.

A comprehensive study of the impact of Java runtime garbage collection (GC) on MRJ.

An auto-tuning approach to optimize GC for MRJ.



MRJ

Same application interface as Hadoop.

Only map() and reduce() need to be defined.

Abstracts away the details of parallelization, runtime scheduling, and so on.

Lets the developer focus on the application logic.
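To make the "only map() and reduce()" idea concrete, here is a minimal sketch of a MapReduce-style job running on the standard Java fork/join pool. It is not MRJ's actual API: the class MiniMapReduce, its run() method, and the countWords() helper are hypothetical names chosen for illustration, and the leaf threshold is arbitrary. The reduce function is assumed to be associative and commutative.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.function.BinaryOperator;
import java.util.function.Function;

/** Hypothetical sketch of a map/reduce job on fork/join; none of these names come from MRJ. */
final class MiniMapReduce {

    /** Splits the input recursively, maps the leaves, and merges partial results with reduce. */
    private static final class Job<I, K, V> extends RecursiveTask<Map<K, V>> {
        private static final int THRESHOLD = 2; // leaf size, chosen arbitrarily
        private final List<I> input;
        private final Function<I, Map<K, V>> map;
        private final BinaryOperator<V> reduce;

        Job(List<I> input, Function<I, Map<K, V>> map, BinaryOperator<V> reduce) {
            this.input = input;
            this.map = map;
            this.reduce = reduce;
        }

        @Override
        protected Map<K, V> compute() {
            if (input.size() <= THRESHOLD) {
                // Leaf: apply map() to each item and combine values that share a key.
                Map<K, V> out = new HashMap<>();
                for (I item : input) {
                    map.apply(item).forEach((k, v) -> out.merge(k, v, reduce));
                }
                return out;
            }
            int mid = input.size() / 2;
            Job<I, K, V> left = new Job<>(input.subList(0, mid), map, reduce);
            Job<I, K, V> right = new Job<>(input.subList(mid, input.size()), map, reduce);
            left.fork();                          // run the left half asynchronously
            Map<K, V> result = right.compute();   // run the right half in this thread
            left.join().forEach((k, v) -> result.merge(k, v, reduce));
            return result;
        }
    }

    /** Runs the job on a fork/join pool with the given number of worker threads. */
    static <I, K, V> Map<K, V> run(List<I> input, Function<I, Map<K, V>> map,
                                   BinaryOperator<V> reduce, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            return pool.invoke(new Job<>(input, map, reduce));
        } finally {
            pool.shutdown();
        }
    }

    /** Example map(): counts the words in one line of text. */
    static Map<String, Integer> countWords(String line) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

A word count would then be MiniMapReduce.run(lines, MiniMapReduce::countWords, Integer::sum, 4): the caller supplies only the map and reduce functions, and the splitting and scheduling stay inside the framework.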



Evaluation

Scalability evaluation on a four-core, hyperthreaded Intel Core i7 processor, using standard MapReduce benchmarks.



Scalability Study



Scalability of grep

The scalability of grep degrades with an increasing number of processors for small heap sizes.



GC Overhead

GC overhead increases with the number of processors, more significantly for small heap sizes
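One way to observe GC overhead from inside a running JVM is through the standard java.lang.management beans. The sketch below is not how the authors measured it. It assumes GC overhead means accumulated collection time divided by wall-clock time; the class GcOverheadProbe and its method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical probe: estimates the share of wall-clock time spent in GC. */
final class GcOverheadProbe {

    /** Sums the accumulated collection time (ms) reported by every collector in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** GC time as a fraction of wall time, clamped to [0, 1]. */
    static double overhead(long gcMillis, long wallMillis) {
        if (wallMillis <= 0) {
            return 0.0;
        }
        return Math.min(1.0, Math.max(0.0, (double) gcMillis / wallMillis));
    }

    /** Runs the workload once and returns the GC overhead observed during that run. */
    static double measure(Runnable workload) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        workload.run();
        long wallMillis = (System.nanoTime() - start) / 1_000_000;
        return overhead(totalGcMillis() - gcBefore, wallMillis);
    }
}
```

Repeating measure() for different thread counts and -Xmx settings would give one data point per configuration. The reported times are coarse, so short workloads may show an overhead of zero.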



Relative GC Performance

Input dependent

    Application performance differs across inputs.

    Small → Serial. Medium, Large → Parallel and Concurrent.

    Different heap sizes.

Application dependent

    Parallel >> Serial & Concurrent ??
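The three policies compared here correspond to HotSpot command-line flags. The sketch below shows how a tuner could build a JVM command line for each policy and pick the fastest one from measured run times. It uses plain exhaustive comparison, not the machine-learning approach of the paper, and the class GcPolicyChooser and its methods are hypothetical. Note that the concurrent (CMS) collector flag was removed from HotSpot in JDK 14.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: maps GC policies to HotSpot flags and picks the fastest measured one. */
final class GcPolicyChooser {

    enum Policy {
        SERIAL("-XX:+UseSerialGC"),
        PARALLEL("-XX:+UseParallelGC"),
        CONCURRENT("-XX:+UseConcMarkSweepGC"); // CMS; removed from HotSpot in JDK 14

        final String flag;

        Policy(String flag) {
            this.flag = flag;
        }
    }

    /** Builds a java command line with a fixed heap size (in MB) and the chosen GC policy. */
    static List<String> command(Policy policy, int heapMb, String jarPath) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add(policy.flag);
        cmd.add("-Xms" + heapMb + "m");
        cmd.add("-Xmx" + heapMb + "m");
        cmd.add("-jar");
        cmd.add(jarPath);
        return cmd;
    }

    /** Returns the policy with the smallest measured run time. */
    static Policy fastest(Map<Policy, Long> runtimeMillis) {
        Policy best = null;
        long bestTime = Long.MAX_VALUE;
        for (Map.Entry<Policy, Long> entry : runtimeMillis.entrySet()) {
            if (entry.getValue() < bestTime) {
                best = entry.getKey();
                bestTime = entry.getValue();
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no timings supplied");
        }
        return best;
    }
}
```

Because the best policy depends on both the input and the application, the timings passed to fastest() would have to be collected per benchmark, per input size, and per heap size.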



sm: concurrent > parallel ?

sm: Search for a word in an input file.

Death rate = Total garbage collected / Total execution time



GC Auto Tuning Performance (relative to optimal policy)



GC Auto Tuning Performance (relative to default policy)



Related Work

The original work on MapReduce [13, 14] applies to compute-clusters.

Ranger et al. describe the first application of MapReduce to multi-core processors [31].

Conventional memory management techniques do not scale to large multi-core environments [40].

Application of machine learning to Java runtime performance auto-tuning is a growing trend [26, 39].



Conclusions

MRJ: a Java-based framework for MapReduce parallelism.

Targets conventional multi-core architectures.

Speedups of up to 6x relative to the default GC policy.

10% geometric mean speedup over all benchmarks with the largest input data sets.

Scalable performance with an increasing number of threads in the underlying Java fork/join pool.

Machine-learning GC auto-tuning policy improving the runtime performance.
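For readers unfamiliar with the geometric mean used to summarize speedups, the sketch below shows the calculation. It assumes speedup is defined as baseline time divided by tuned time; the class SpeedupStats is a hypothetical name, and no figures from the paper are reproduced here.

```java
/** Hypothetical helper for summarizing per-benchmark speedups. */
final class SpeedupStats {

    /** Speedup of a tuned run relative to a baseline run (values above 1 mean faster). */
    static double speedup(double baselineMillis, double tunedMillis) {
        if (baselineMillis <= 0 || tunedMillis <= 0) {
            throw new IllegalArgumentException("times must be positive");
        }
        return baselineMillis / tunedMillis;
    }

    /** Geometric mean, computed through logarithms to avoid overflow on long inputs. */
    static double geometricMean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double logSum = 0.0;
        for (double v : values) {
            if (v <= 0) {
                throw new IllegalArgumentException("values must be positive");
            }
            logSum += Math.log(v);
        }
        return Math.exp(logSum / values.length);
    }
}
```

The geometric mean is the usual choice for averaging ratios such as speedups, because a single large outlier inflates it far less than it would an arithmetic mean.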



Thank you! Questions?



Selected References

[13] J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. In Proceedings of the 6th symposium on operating systems design and implementation, pages 137–150, 2004.

[14] J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.

[26] F. Mao and X. Shen. Cross-input learning and discriminative prediction in evolvable virtual machines. In Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization, pages 92–101, 2009.

[31] C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis. Evaluating mapreduce for multi-core and multiprocessor systems. In Proceedings of the 13th International Symposium on High Performance Computer Architecture, pages 13–24, 2007.

[39] C. Zhang and M. Hirzel. Online phase-adaptive data layout selection. In ECOOP 2008 Object-Oriented Programming, pages 309–334, 2008.

[40] Y. Zhao, J. Shi, K. Zheng, H. Wang, H. Lin, and L. Shao. Allocation wall: a limiting factor of Java applications on emerging multi-core platforms. ACM SIGPLAN Notices, 44(10):361–376, 2009.

