Page 1: Integration and Application of the TAU Performance System in Parallel Java Environments

Sameer Shende, Allen D. Malony

{sameer,malony}@cs.uoregon.edu

Computer & Information Science Department

Computational Science Institute

University of Oregon

Integration and Application of the TAU Performance System in Parallel Java Environments

Page 2: Integration and Application of the TAU Performance System in Parallel Java Environments

April 19, 2023 Java Grande – ISCOPE 2001

Java HPC and Performance Technology

Interest in performance tools for Java HPC
  Shared- and distributed-memory parallelism
  Multi-level (semantic) performance views
Java environment challenges performance technology
  Language and packages
    object-oriented, interfaces, RMI, reflection, …
  Java Virtual Machine (JVM) execution model
    thread mapping, scheduling, SMP execution, event access
  Just-In-Time (JIT) compilation and dynamic loading
  Java Native Interface (JNI)
    inter-language execution, non-Java events / execution
  Portability of performance tools and methods

Page 3: Integration and Application of the TAU Performance System in Parallel Java Environments

Research Problems

General
  How to create robust and ubiquitous performance technology for the analysis and tuning of parallel high-performance software and systems in the presence of (evolving) complexity challenges?
Specific
  Can performance technology developed for use in HPC environments be successfully applied to parallel Java environments, and how are the new performance instrumentation, measurement, and analysis problems addressed?

Page 4: Integration and Application of the TAU Performance System in Parallel Java Environments

Talk Outline

Java HPC and Performance Technology
TAU Performance System
  Computation model for performance technology
  TAU performance system toolkit
Target HPC Java Environment
  SMP clusters and distributed computing
  Multi-threading + MPI message passing
Integration (Adaption) of TAU Performance System
  User-level, JVM-level, JNI-level, inter-language
Example “Mixed-Mode” Application
Conclusions

Page 5: Integration and Application of the TAU Performance System in Parallel Java Environments

TAU Performance System

Tuning and Analysis Utilities
Performance system framework
  scalable parallel and distributed HPC
Targets a general complex system computation model
  nodes / contexts / threads
  Multi-level: system / software / parallelism
  Measurement and analysis abstraction
Integrated performance toolkit
  instrumentation, measurement, analysis, visualization
  Portable facility based on open software approach
Robust and widely applied

Page 6: Integration and Application of the TAU Performance System in Parallel Java Environments

General Complex System Computation Model

Node: physically distinct shared memory machine
  Message passing node interconnection network
Context: distinct virtual memory space within node
Thread: execution threads (user/system) in context

[Figure: physical view and model view of the computation model. Labels: SMP, node memory, VM space, context, threads, interconnection network, inter-node message communication.]
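One way to picture the node / context / thread model is as a three-part key that says where a measurement came from. The sketch below is illustrative only: `Location`, `ModelDemo`, and the sample values are hypothetical names and data, not part of TAU.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical names throughout; this is not TAU code.
final class ModelDemo {
    // Build a small table of made-up measurements keyed by location.
    static Map<Location, Long> sample() {
        Map<Location, Long> timeMicros = new TreeMap<>();
        timeMicros.put(new Location(1, 0, 0), 250L);
        timeMicros.put(new Location(0, 0, 1), 75L);
        timeMicros.put(new Location(0, 0, 0), 100L);
        return timeMicros;
    }

    public static void main(String[] args) {
        // TreeMap iterates in node, context, thread order.
        sample().forEach((loc, t) -> System.out.println(loc + " " + t));
    }
}

final class Location implements Comparable<Location> {
    final int node;     // physically distinct shared-memory machine
    final int context;  // distinct virtual memory space within the node
    final int thread;   // execution thread within the context

    Location(int node, int context, int thread) {
        this.node = node;
        this.context = context;
        this.thread = thread;
    }

    @Override
    public int compareTo(Location other) {
        if (node != other.node) return Integer.compare(node, other.node);
        if (context != other.context) return Integer.compare(context, other.context);
        return Integer.compare(thread, other.thread);
    }

    @Override
    public String toString() {
        return node + "." + context + "." + thread;
    }
}
```

Ordering by node first, then context, then thread keeps all measurements from one machine adjacent, which is convenient when a per-node view is wanted.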

Page 7: Integration and Application of the TAU Performance System in Parallel Java Environments

TAU Performance System Framework

Page 8: Integration and Application of the TAU Performance System in Parallel Java Environments

Target HPC Java Environment

Hybrid, multi-language scientific applications
  Java + {C, C++, Fortran} libraries
  Numerical, system, communications support
  Performance optimization
Mixed-mode parallelism
  Multi-threaded shared memory parallelism
  Distributed memory parallelism using communications
Cluster of SMP nodes
  Scalable parallelism
  Distributed

Page 9: Integration and Application of the TAU Performance System in Parallel Java Environments

Performance Technology Issues

Object-oriented programming
  Object-based performance analysis
  High-level classes and performance mapping
Multi-level performance events
  User / source / byte code / VM / OS / libraries / external
  Multiple performance instrumentation strategies
  Integration of performance measurements
Mixed-mode parallel computation
  Multi-threading performance measurement
  Cross-mode performance correspondence
Hybrid, multi-language performance measurement

Page 10: Integration and Application of the TAU Performance System in Parallel Java Environments

Java Source-Level Instrumentation

TAU Java package
User-defined events
TAU.Profile class for new “timers”
  Start/Stop
Performance data output at end
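The start/stop timer pattern named on this slide can be sketched in plain Java. The class below is a hypothetical stand-in written for illustration; it is not the `TAU.Profile` API, its method names are assumptions, and it is meant for use from a single thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a user-defined timer. This is NOT the TAU.Profile API.
final class SketchTimer {
    private static final Map<String, SketchTimer> REGISTRY = new LinkedHashMap<>();

    private final String name;
    private boolean running = false;
    private long startNanos;
    private long totalNanos = 0;
    private int calls = 0;

    private SketchTimer(String name) {
        this.name = name;
    }

    // Create the timer on first use and return the same object afterwards.
    static synchronized SketchTimer named(String name) {
        return REGISTRY.computeIfAbsent(name, SketchTimer::new);
    }

    void start() {
        if (running) throw new IllegalStateException(name + " is already running");
        running = true;
        startNanos = System.nanoTime();
    }

    void stop() {
        long now = System.nanoTime();
        if (!running) throw new IllegalStateException(name + " is not running");
        totalNanos += now - startNanos;
        running = false;
        calls++;
    }

    int calls() { return calls; }

    long totalNanos() { return totalNanos; }

    // Print one line per timer; a real tool would write profile data instead.
    static synchronized void report() {
        for (SketchTimer t : REGISTRY.values()) {
            System.out.println(t.name + ": calls=" + t.calls + " nanos=" + t.totalNanos);
        }
    }

    public static void main(String[] args) {
        SketchTimer loop = SketchTimer.named("compute loop");
        long sum = 0;
        for (int rep = 0; rep < 3; rep++) {
            loop.start();
            for (int i = 0; i < 1_000_000; i++) sum += i;
            loop.stop();
        }
        System.out.println("sum=" + sum);
        report(); // performance data output at end
    }
}
```

Because the timer is looked up by name, any code section can be bracketed with `start()` and `stop()` without passing timer objects around.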

Page 11: Integration and Application of the TAU Performance System in Parallel Java Environments

TAU Java Source Instrumentation Architecture

Any code section can be measured
Portability
Measurement options
  Profiling, tracing
Limitations
  Source access only
  Lack of thread information
  Lack of node information

[Figure: architecture diagram. Boxes: Java program, TAU package, JNI, TAU, Profile DB. Annotations: TAU.Profile class (init, data, output); JNI C bindings; TAU as dynamic shared object; profile database stored in JVM heap.]

Page 12: Integration and Application of the TAU Performance System in Parallel Java Environments

Multi-Threading Performance Measurement

General issues
  Thread identity and per-thread data storage
  Performance measurement support and synchronization
  Fine-grained parallelism
    different forms and levels of threading
    greater need for efficient instrumentation
TAU general threading and measurement model
  Common thread layer and measurement support
  Interface to system specific libraries (reg, id, sync)
Target different thread systems with core functionality
  Pthreads, Windows, Java, OpenMP
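Thread identity and per-thread data storage can be illustrated with a small Java sketch in which each thread updates only its own table, so the measurement path needs no lock. The class and method names are hypothetical, this is not TAU's thread layer, and keying by thread name assumes the names are unique.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of per-thread measurement storage; not TAU's thread layer.
final class PerThreadCounts {
    // Registry of every thread's private table, keyed by thread name (assumed unique).
    private final Map<String, Map<String, Integer>> all = new ConcurrentHashMap<>();

    // Each thread lazily creates its own table and registers it once.
    private final ThreadLocal<Map<String, Integer>> local = ThreadLocal.withInitial(() -> {
        Map<String, Integer> mine = new HashMap<>();
        all.put(Thread.currentThread().getName(), mine);
        return mine;
    });

    // Hot path: touches only the calling thread's table, so no lock is taken.
    void record(String event) {
        local.get().merge(event, 1, Integer::sum);
    }

    int threadsSeen() {
        return all.size();
    }

    // Sum one event over all threads. Call only after the workers have been joined.
    int total(String event) {
        int sum = 0;
        for (Map<String, Integer> table : all.values()) {
            sum += table.getOrDefault(event, 0);
        }
        return sum;
    }

    // Run `threads` workers that each record `perThread` events, then wait for them.
    static PerThreadCounts runWorkers(int threads, int perThread) {
        PerThreadCounts counts = new PerThreadCounts();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) counts.record("step");
            }, "worker-" + i);
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        PerThreadCounts counts = runWorkers(4, 1000);
        System.out.println(counts.threadsSeen() + " threads, " + counts.total("step") + " events");
    }
}
```

Synchronization is confined to registration and to the final read after `join()`, which matters when instrumentation is fine-grained.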

Page 13: Integration and Application of the TAU Performance System in Parallel Java Environments

Virtual Machine Performance Instrumentation

Integrate performance system with VM
  Captures robust performance data (e.g., thread events)
  Maintain features of environment
    portability, concurrency, extensibility, interoperation
  Allow use in optimization methods
JVM Profiling Interface (JVMPI)
  Generation of JVM events and hooks into JVM
  Profiler agent (TAU) loaded as shared object
    registers events of interest and address of callback routine
  Access to information on dynamically loaded classes
  No need to modify Java source, bytecode, or JVM
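The idea of an agent that registers events of interest together with a callback routine can be modeled in a few lines of Java. JVMPI itself is a native interface, so the sketch below only mirrors the pattern: `SimulatedVm`, `VmEventKind`, and the event names are hypothetical and do not correspond to JVMPI identifiers.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical model of "enable events, register a callback". Not the JVMPI API.
final class AgentDemo {
    // Returns the events the agent saw for a fixed sequence of simulated VM activity.
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        SimulatedVm vm = new SimulatedVm();
        vm.setCallback(seen::add);
        vm.enable(VmEventKind.THREAD_START);
        vm.enable(VmEventKind.METHOD_ENTRY);

        vm.raise(VmEventKind.THREAD_START, "main");
        vm.raise(VmEventKind.CLASS_LOAD, "java/lang/String"); // not enabled, dropped
        vm.raise(VmEventKind.METHOD_ENTRY, "Life.step");
        vm.raise(VmEventKind.METHOD_EXIT, "Life.step");       // not enabled, dropped
        return seen;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}

enum VmEventKind { THREAD_START, THREAD_END, METHOD_ENTRY, METHOD_EXIT, CLASS_LOAD }

final class SimulatedVm {
    private final Set<VmEventKind> enabled = EnumSet.noneOf(VmEventKind.class);
    private Consumer<String> callback = description -> { };

    // The agent supplies the routine the VM should call for each delivered event.
    void setCallback(Consumer<String> agentCallback) {
        this.callback = agentCallback;
    }

    // The agent names the event kinds it is interested in.
    void enable(VmEventKind kind) {
        enabled.add(kind);
    }

    // The VM raises an event; only enabled kinds are delivered to the agent.
    void raise(VmEventKind kind, String detail) {
        if (enabled.contains(kind)) {
            callback.accept(kind + " " + detail);
        }
    }
}
```

Filtering at the source, before the callback is invoked, is what lets an agent limit overhead to the events it actually asked for.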

Page 14: Integration and Application of the TAU Performance System in Parallel Java Environments

TAU Java JVM Instrumentation Architecture

Robust set of events
Portability
Access to thread info
Measurement options
Limitations
  Overhead
  Many events
  Event control
  No user-defined events

[Figure: architecture diagram. Boxes: Java program, JVMPI, Thread API, JNI, TAU, Profile DB. Label: event notification.]

Page 15: Integration and Application of the TAU Performance System in Parallel Java Environments

Java Multi-Threading Performance (Test Case)

Profile and trace Java (JDK 1.2+) applications
Observe user-level and system-level threads
Observe events for different Java packages
  /lang, /io, /awt, …
Test application
  SciVis, NPAC, Syracuse University

% ./configure -jdk=<dir_where_jdk_is_installed>

% setenv LD_LIBRARY_PATH $LD_LIBRARY_PATH\:<taudir>/<arch>/lib

% java -XrunTAU svserver

Page 16: Integration and Application of the TAU Performance System in Parallel Java Environments

TAU Profiling of Java Application (SciVis)

Profile for each Java thread
Captures events for different Java packages
24 threads of execution!

Page 17: Integration and Application of the TAU Performance System in Parallel Java Environments

TAU Tracing of Java Application (SciVis)

Performance groups
Timeline display
Parallelism view

Page 18: Integration and Application of the TAU Performance System in Parallel Java Environments

Vampir Dynamic Call Tree View (SciVis)

Per thread call tree

Annotated performance

Expanded call tree

Page 19: Integration and Application of the TAU Performance System in Parallel Java Environments

Message Communications Performance

Explicit message communications libraries for Java
MPI performance measurement
  MPI profiling interface - link-time interposition library
  TAU wrappers in native profiling interface library
  Send/Receive events and communication statistics
mpiJava (Syracuse, JavaGrande, 1999)
  Java wrapper package
  JNI C bindings to MPI communication library
  Dynamic shared object (libmpijava.so) loaded in JVM
  prunjava calls mpirun to distribute program to nodes
  Contrast to Java RMI-based schemes (MPJ, CCJ)
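The MPI profiling interface interposes at link time in native code. The same idea, a wrapper that presents the library's own interface, records a measurement, and then forwards the call, can be mirrored in plain Java as a rough analogy. All names below are hypothetical; the sketch calls neither MPI, mpiJava, nor TAU.

```java
// Hypothetical names throughout; this mirrors the interposition idea in plain Java.
final class InterposeDemo {
    static ProfilingMessenger run() {
        ProfilingMessenger m = new ProfilingMessenger(new NativeMessenger());
        m.send(1, new byte[10]);
        m.send(2, new byte[20]);
        return m;
    }

    public static void main(String[] args) {
        ProfilingMessenger m = run();
        System.out.println("sends=" + m.sends() + " bytes=" + m.bytesSent());
    }
}

interface Messenger {
    void send(int destination, byte[] data);
}

// Stand-in for the underlying communication library.
final class NativeMessenger implements Messenger {
    @Override
    public void send(int destination, byte[] data) {
        // A real library would transmit the data here.
    }
}

// Same interface as the library, so callers need no source changes.
final class ProfilingMessenger implements Messenger {
    private final Messenger inner;
    private long sends = 0;
    private long bytesSent = 0;

    ProfilingMessenger(Messenger inner) {
        this.inner = inner;
    }

    @Override
    public void send(int destination, byte[] data) {
        sends++;                        // communication statistics
        bytesSent += data.length;
        inner.send(destination, data);  // forward to the real implementation
    }

    long sends() { return sends; }

    long bytesSent() { return bytesSent; }
}
```

Because the wrapper and the wrapped object share one interface, the calling program is unaware of the measurement layer, which is the property that makes interposition attractive when source instrumentation is not wanted.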

Page 20: Integration and Application of the TAU Performance System in Parallel Java Environments

TAU Java Instrumentation Architecture

No source instrumentation
Portability
Measurement options
Limitations
  MPI events only
  No mpiJava events
  Node info only
  No thread info

[Figure: architecture diagram. Boxes: Java program, mpiJava package, TAU package, JNI, MPI profiling interface, TAU wrapper, native MPI library, TAU, Profile DB.]

Page 21: Integration and Application of the TAU Performance System in Parallel Java Environments

Mixed-mode Parallel Programs (Java + MPI)

Java threads and MPI communications
  Shared-memory multi-threading events
  Message communications events
Unified performance measurement and views
  Integration of performance mechanisms
  Integrated association of performance events
    thread event and communication events
    user-defined (source-level) performance events
    JVM events
Support for performance measurement scaling
Support for performance data access

Page 22: Integration and Application of the TAU Performance System in Parallel Java Environments

Instrumentation and Measurement Cooperation

Problem
  JVMPI doesn’t see MPI events (e.g., rank (node))
  MPI profiling interface doesn’t see threads
  Source instrumentation doesn’t see either!
Need cooperation between interfaces
  MPI exposes rank, gets thread information
  JVMPI exposes thread information, gets rank
  Source instrumentation gets both
  Post-mortem matching of sends and receives
Selective instrumentation
  java -XrunTAU:exclude=java/io,sun
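How an exclusion list such as `exclude=java/io,sun` might be applied to class names can be sketched as follows. The matching rule used here (a name is excluded if it equals a listed entry or lies below it in the package tree) is an assumption made for illustration and is not taken from TAU; `ExcludeFilter` is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical filter; the matching rule is an assumption, not TAU's documented behavior.
final class ExcludeFilter {
    private final List<String> prefixes = new ArrayList<>();

    // Accepts an option such as "exclude=java/io,sun"; anything else excludes nothing.
    ExcludeFilter(String option) {
        String key = "exclude=";
        if (option != null && option.startsWith(key)) {
            for (String p : option.substring(key.length()).split(",")) {
                if (!p.isEmpty()) prefixes.add(p);
            }
        }
    }

    // A name is excluded if it equals an entry or lies below it in the package tree.
    boolean isExcluded(String slashName) {
        for (String p : prefixes) {
            if (slashName.equals(p) || slashName.startsWith(p + "/")) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ExcludeFilter f = new ExcludeFilter("exclude=java/io,sun");
        for (String name : new String[] {"java/io/File", "sun/misc/Signal", "java/lang/String"}) {
            System.out.println(name + " excluded=" + f.isExcluded(name));
        }
    }
}
```

Matching on whole path segments avoids excluding an unrelated package that merely shares a leading string, for example a package named `sunny` when `sun` is listed.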

Page 23: Integration and Application of the TAU Performance System in Parallel Java Environments

TAU Java Instrumentation Architecture

[Figure: combined architecture diagram. Boxes: Java program, TAU package, mpiJava package, JVMPI, Thread API, JNI, MPI profiling interface, TAU wrapper, native MPI library, TAU, Profile DB. Label: event notification.]

Page 24: Integration and Application of the TAU Performance System in Parallel Java Environments

Parallel Java Game of Life (Profile)

mpiJava test case
4 nodes, 28 threads
Thread 4 executes all MPI routines
Merged Java and MPI event profiles

[Figure: profile displays for Node 0, Node 1, and Node 2.]

Page 25: Integration and Application of the TAU Performance System in Parallel Java Environments

Parallel Java Game of Life (Trace)

Integrated event tracing
Merged trace viz
Node process grouping
Thread message pairing
Vampir display
Multi-level event grouping
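Pairing messages in a merged trace amounts to matching each receive with the send that produced it. A minimal sketch of one way to do this post-mortem is given below. It assumes that messages between the same source, destination, and tag do not overtake one another and that events from each node appear in that node's time order; the record layout and class names are hypothetical, not TAU's trace format.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical matcher over a hypothetical trace record; not TAU's trace format.
final class MessageMatcher {
    // Returns {sendTime, receiveTime} pairs for every receive that has a matching send.
    static List<long[]> match(List<MsgEvent> trace) {
        // Pass 1: queue sends per (source, destination, tag) in trace order.
        Map<String, Deque<MsgEvent>> pending = new HashMap<>();
        for (MsgEvent e : trace) {
            if (e.isSend) {
                pending.computeIfAbsent(key(e.node, e.peer, e.tag), k -> new ArrayDeque<>()).add(e);
            }
        }
        // Pass 2: pair each receive with the oldest unmatched send on the same channel.
        List<long[]> pairs = new ArrayList<>();
        for (MsgEvent e : trace) {
            if (!e.isSend) {
                Deque<MsgEvent> queue = pending.get(key(e.peer, e.node, e.tag));
                if (queue != null && !queue.isEmpty()) {
                    pairs.add(new long[] {queue.poll().time, e.time});
                }
            }
        }
        return pairs;
    }

    private static String key(int source, int destination, int tag) {
        return source + ">" + destination + "#" + tag;
    }

    // Two messages from node 0 to node 1 with the same tag.
    static List<long[]> demo() {
        List<MsgEvent> trace = new ArrayList<>();
        trace.add(new MsgEvent(true, 0, 1, 7, 10));
        trace.add(new MsgEvent(false, 1, 0, 7, 15));
        trace.add(new MsgEvent(true, 0, 1, 7, 20));
        trace.add(new MsgEvent(false, 1, 0, 7, 30));
        return match(trace);
    }

    public static void main(String[] args) {
        for (long[] p : demo()) {
            System.out.println("send@" + p[0] + " -> recv@" + p[1]);
        }
    }
}

final class MsgEvent {
    final boolean isSend;  // true for a send, false for a receive
    final int node;        // node where the event was recorded
    final int peer;        // destination of a send, or source of a receive
    final int tag;
    final long time;

    MsgEvent(boolean isSend, int node, int peer, int tag, long time) {
        this.isSend = isSend;
        this.node = node;
        this.peer = peer;
        this.tag = tag;
        this.time = time;
    }
}
```

Queuing all sends before looking at any receive keeps the matching independent of clock differences between nodes, since a receive may carry an earlier timestamp than its send when clocks are not synchronized.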

Page 26: Integration and Application of the TAU Performance System in Parallel Java Environments

Node / Thread Event Timeline

Temporal event behavior
Event relationships

Page 27: Integration and Application of the TAU Performance System in Parallel Java Environments

Integrated Performance View (Callgraph)

Source level
MPI level
Java packages level

Page 28: Integration and Application of the TAU Performance System in Parallel Java Environments

Conclusion

Integrate robust and portable performance system (TAU) in Java HPC environment
Apply performance system to observe multiple levels of Java HPC operation
Leverage performance system framework based on common performance measurement API
Key: define multi-level events and define associations
Opportunities for improvement and application
  JVM instrumentation and JIT (dynamic compilation)
  Runtime access to performance data
  Java scientific packages, communication libraries (CCJ, MPJ, RMI), parallel compilers (JOMP), applications, ...

Page 29: Integration and Application of the TAU Performance System in Parallel Java Environments

More Information and Acknowledgments

URLs
  TAU: www.cs.uoregon.edu/research/paracomp/tau
Grant support (TAU)
  DOE 2000 ACTS
    http://www-unix.mcs.anl.gov/DOE2000
    http://www.nersc.gov/ACTS
  DOE ASCI Level 3 (LANL, LLNL)
  DARPA

Page 30: Integration and Application of the TAU Performance System in Parallel Java Environments

TAU Distributed Monitoring Framework

Extend usability of TAU performance analysis
Access TAU performance data during execution
Framework model
  each application context is a performance data server
  monitor agent thread is created within each context
  client processes attach to agents and request data
  server thread synchronization for data consistency
  pull mode of interaction
Distributed TAU performance data space
“A Runtime Monitoring Framework for the TAU Profiling System” (ISCOPE ’99)
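The framework model above (an agent thread inside each context, synchronized access for consistency, and clients that pull data on request) can be sketched within a single JVM. The class below is a hypothetical illustration, not the TAU monitor: the client is simply another thread, and the transport a real client process would need is omitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical single-process sketch of a pull-mode monitor agent; not the TAU monitor.
final class MonitoredContext {
    private final Map<String, Long> profile = new HashMap<>();
    private final BlockingQueue<CompletableFuture<Map<String, Long>>> requests =
            new LinkedBlockingQueue<>();

    // Start the agent as a daemon thread so it does not keep the JVM alive.
    void startAgent() {
        Thread agent = new Thread(this::serve, "monitor-agent");
        agent.setDaemon(true);
        agent.start();
    }

    // Application side: accumulate time for an event.
    void addTime(String event, long micros) {
        synchronized (profile) {
            profile.merge(event, micros, Long::sum);
        }
    }

    // Agent side: answer each request with a consistent copy of the profile.
    private void serve() {
        try {
            while (true) {
                CompletableFuture<Map<String, Long>> reply = requests.take();
                Map<String, Long> copy;
                synchronized (profile) {
                    copy = new HashMap<>(profile);
                }
                reply.complete(copy);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // exit quietly
        }
    }

    // Client side: nothing is sent unless asked for. Requires startAgent() to have been called.
    Map<String, Long> pull() {
        CompletableFuture<Map<String, Long>> reply = new CompletableFuture<>();
        requests.add(reply);
        return reply.join();
    }

    public static void main(String[] args) {
        MonitoredContext ctx = new MonitoredContext();
        ctx.startAgent();
        ctx.addTime("step", 5);
        ctx.addTime("step", 7);
        System.out.println(ctx.pull());
    }
}
```

Copying the table while holding the lock gives the client a snapshot that cannot change underneath it, at the cost of briefly blocking the application thread during the copy.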

Page 31: Integration and Application of the TAU Performance System in Parallel Java Environments

TAU Distributed Monitor Architecture

Each context has a monitor agent

Client in separate thread directs agent

Pull model of interaction

TAU profile database

Page 32: Integration and Application of the TAU Performance System in Parallel Java Environments

Java Implementation of TAU Monitor

Motivations
  More portable monitor middleware system (RMI)
  More flexible and programmable server interface (JNI)
  More robust client development (EJB, JDBC, Swing)