performance analysis necessity or add-on in grid computing michael gerndt technische universität...

Performance Analysis: Necessity or Add-on in Grid Computing

Michael Gerndt

Technische Universität München


LRR at Technische Universität München

•Chair for Computer Hardware & Organisation / Parallel Computer Architecture (Prof. A. Bode)

•Three groups in parallel & distributed architectures

• Architectures
  – SCI Smile project
  – DAB
  – Hotswap

• Tools
  – CrossGrid
  – APART

• Applications
  – CFD
  – Medicine
  – Bioinformatics

New Campus at Garching

Outline

PA on parallel systems

Scenarios for PA in Grids

PA support in Grid projects

APART

Performance Analysis for Parallel Systems

•Development cycle
  • Assumption: Reproducibility

•Instrumentation
  • Static vs dynamic
  • Source-level vs object-level

•Monitoring
  • Software vs hardware
  • Statistical profiles vs event traces

•Analysis
  • Source-based tools
  • Visualization tools
  • Automatic analysis tools

[Figure: development cycle with the stages Coding, Performance Monitoring and Analysis, Program Tuning, and Production]

Grid Computing

•Grids
  • enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals, assuming the absence of
    – central location,
    – central control,
    – omniscience,
    – existing trust relationships.

[Globus Tutorial]

•Major differences to parallel systems
  • Dynamic system of resources
  • Large number of diverse systems
  • Sharing of resources
  • Transparent resource allocation

Scenarios for Performance Monitoring and Analysis

• Post-mortem application analysis
• Self-tuning applications
• Grid scheduling
• Grid management

[GGF performance working group, DataGrid, CrossGrid]

Post-Mortem Application Analysis

• Requires
  • either resources with known performance characteristics (QoS)
  • or system-level information to assess performance data
  • scalability of performance tools

• Focus will be on interacting components

1. George submits job to the Grid
2. Job is executed on some resources
3. George receives performance data
4. George analyzes performance

Self-Tuning Applications

• Requires
  • Integration of system and application monitoring
  • On-the-fly performance analysis
  • API for accessing monitor data (if PA is done by the application)
  • Performance model and interface to steer adaptation (if PA and the tuning decision are made by an external component)

1. Chris submits job
2. Application adapts to assigned resources
3. Application starts
4. Application monitors performance and adapts to resource changes

Grid-Scheduling

• Requires
  • PA of the grid application
  • Possibly benchmarking the application
  • Access to current performance capabilities of resources
  • Even better: access to predicted capabilities

1. Gloria determines performance-critical application properties
2. She specifies a performance model
3. Grid scheduler selects resources
4. Application is started
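The scheduling scenario above can be sketched as a scheduler that evaluates a user-supplied performance model on each candidate resource and picks the best one. This is only an illustration: the `Resource` fields, the model function, and the resource names are assumptions, not part of any Grid scheduler's interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str               # hypothetical resource identifier
    flops_per_sec: float    # current or predicted compute capability
    bytes_per_sec: float    # current or predicted network capability


def gloria_model(resource, flop_count=4e9, byte_count=1e9):
    """Hypothetical performance model: compute time plus transfer time."""
    return flop_count / resource.flops_per_sec + byte_count / resource.bytes_per_sec


def select_resource(candidates, model):
    """Return the candidate with the smallest predicted runtime."""
    return min(candidates, key=model)


candidates = [
    Resource("cluster-a", flops_per_sec=2e9, bytes_per_sec=1e7),
    Resource("cluster-b", flops_per_sec=1e9, bytes_per_sec=1e8),
]
best = select_resource(candidates, gloria_model)
print(best.name, gloria_model(best))
```

The faster processor does not win here because the model also charges for data transfer, which is the reason the scheduler needs the application's performance-critical properties and not only the resource capabilities.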

Grid-Management

• Requires
  • PA of historical system information
  • Needs to be done in a distributed fashion

1. George reports that he has seen bad performance for a week.
2. The helpdesk runs the Grid performance analysis software.
3. Periodic saturation of connections is detected.
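As a rough illustration of the last step, periodic saturation can be found in historical data by checking whether the same hour of the day is saturated on several days. The sample layout `(day, hour, utilization)`, the threshold, and the `min_days` parameter are assumptions made for this sketch.

```python
from collections import defaultdict


def saturated_hours(samples, threshold=0.9, min_days=3):
    """Return hours of the day that were saturated on at least min_days days.

    samples: iterable of (day, hour, utilization) with utilization in 0..1.
    """
    days_hit = defaultdict(set)
    for day, hour, utilization in samples:
        if utilization >= threshold:
            days_hit[hour].add(day)
    return sorted(hour for hour, days in days_hit.items() if len(days) >= min_days)


# Hypothetical link history: hour 9 saturates every day, hour 14 only once.
history = [
    (1, 9, 0.97), (1, 14, 0.40),
    (2, 9, 0.95), (2, 14, 0.93),
    (3, 9, 0.99), (3, 14, 0.35),
]
print(saturated_hours(history))
```

In a Grid, each site could run such a check on its own history and report only the result, which fits the requirement that the analysis be done in a distributed fashion.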

New Aspect of Performance Analysis

•Transparent resource allocation
•Dynamism in resource availability

•Approaches in the following projects:
  • DAMIEN
  • DataGrid
  • CrossGrid
  • GrADS

Analyzing Meta-Computing Applications

•DAMIEN (IST-25406), 5 partners
  www.hlrs.de/organization/pds/projects/damien/

•Goals
  • Analysis of GRID-enabled applications
    – using MpCCI (www.mpcci.org)
    – using PACX-MPI (www.hlrs.de/organization/pds/projects/pacx-mpi)
  • Analysis of GRID components
    – PACX-MPI and MpCCI
  • Extend Vampir/Vampirtrace technology

MetaVampirtrace for Application Analysis

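[Figure: interposition layers for application analysis. Application code (MPI_Send) is renamed by a CPP name shift, so the compiled code calls PACX_Send. The routine call enters the MetaVT wrapper (PACX_Send), which writes the tracefile and calls the GRID-MPI profiling routine (PPACX_Send) on top of the native MPI / GRID communication layer.]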

MetaVampirtrace for GRID Component Analysis

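[Figure: interposition layers for GRID component analysis. Application code (MPI_Send) is renamed by a CPP name shift, so the compiled code calls PACX_Send in the GRID-MPI layer. Routine calls from that layer enter the MetaVT wrapper (MPI_Send), which writes the tracefile and calls the MPI profiling routine (PMPI_Send). The GRID-MPI communication layer uses TCP/IP.]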

MetaVampir

•General counter support
  • Grid component metrics

•Hierarchical analysis
  • Analysis at each level
  • Aggregate data for groups
  • Improves scalability

•Structured tracefiles
  • Subdivided into frames
  • Stripe data across multiple files
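The idea of aggregating data for groups at each level can be sketched as follows. The process names, the two-level grouping (process to SMP node, SMP node to metacomputer), and the metric values are assumptions for illustration; this is not MetaVampir's data model.

```python
from collections import defaultdict


def aggregate(values, group_of):
    """Sum per-member values into per-group totals."""
    totals = defaultdict(float)
    for member, value in values.items():
        totals[group_of[member]] += value
    return dict(totals)


# Hypothetical communication time in seconds per MPI process.
per_process = {"P_1": 1.5, "P_2": 2.5, "P_3": 4.0}
process_to_node = {"P_1": "SMP node 1", "P_2": "SMP node 1", "P_3": "SMP node 2"}
node_to_system = {"SMP node 1": "Metacomputer", "SMP node 2": "Metacomputer"}

per_node = aggregate(per_process, process_to_node)   # process level -> node level
per_system = aggregate(per_node, node_to_system)     # node level -> system level
print(per_node, per_system)
```

A tool that starts from the system-level total and drills down only into the groups that look suspicious has to display far fewer items than one that shows every process, which is where the scalability gain comes from.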

[Figure: hierarchical view of a metacomputer. System level: Node 1 and Node 2, connected through GRID daemons (Send, Recv). Each node contains SMP nodes (SMP node 1, SMP node 2). Process level: all MPI processes P_1 to P_n.]

Grid Monitoring Architecture

•Developed by GGF Performance working group
•Separation of data discovery and data transfer
  • Data discovery via (possibly distributed) directory service
  • Data transfer between producer and consumer

•GMA interactions
  • Publish/subscribe
  • Query/response
  • Notification

•Directory includes
  • Types of events
  • Accepted protocols
  • Security mechanisms

[Figure: GMA components Consumer, Producer, and Directory Service. Consumer and Producer each exchange event publication information with the Directory Service.]
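The separation of data discovery from data transfer can be illustrated with a small in-process sketch. All class and method names below are invented for this example and do not correspond to a GMA implementation; the directory stores only publication information, while event data flows directly from producer to consumer.

```python
class DirectoryService:
    """Stores publication information only; event data never passes through it."""

    def __init__(self):
        self._entries = {}  # event type -> list of registered producers

    def register(self, event_type, producer):
        self._entries.setdefault(event_type, []).append(producer)

    def lookup(self, event_type):
        return list(self._entries.get(event_type, []))


class Producer:
    def __init__(self, name, event_type, directory):
        self.name = name
        self.latest = None
        self._subscribers = []
        directory.register(event_type, self)  # data discovery step

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def query(self):
        """Query/response: return the most recent event value."""
        return self.latest

    def publish(self, value):
        """Publish/subscribe: push the event directly to each consumer."""
        self.latest = value
        for callback in self._subscribers:
            callback(self.name, value)


directory = DirectoryService()
producer = Producer("host1", "cpu_load", directory)

# The consumer discovers producers via the directory, then talks to them directly.
received = []
for found in directory.lookup("cpu_load"):
    found.subscribe(lambda name, value: received.append((name, value)))

producer.publish(0.75)
print(received, producer.query())
```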

R-GMA in DataGrid

•DataGrid: www.eu-datagrid.org
•R-GMA: www.cs.nwu.edu/~rgis
•DataGrid WP3: hepunx.rl.ac.uk/edg/wp3

•Relational approach to GMA
  • Producers
    – announce: SQL “CREATE TABLE”
    – publish: SQL “INSERT”
  • Consumers
    – collect: SQL “SELECT”
  • Approach to use the relational model in a distributed environment
  • It can be used for information service as well as system and application monitoring.

P-GRADE and R-GMA

•P-GRADE Environment developed at MTA SZTAKI
  • GRM (Distributed monitor)
  • Prove (Visualization tool)

•GRM creates two tables in R-GMA
  • GRMTrace (String appName, String event): all events
  • GRMHeader (String appName, String event): important header events only

•GRM Main Monitor
  • SELECT “*” FROM GRMHeader WHERE appName=“...”
  • SELECT “*” FROM GRMTrace WHERE appName=“...”
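The announce / publish / collect pattern for the two GRM tables can be tried out locally. In the sketch below an in-memory SQLite database stands in for R-GMA, which is an assumption made purely for illustration: R-GMA is a distributed service and its client API is not shown here. The application name and event strings are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for R-GMA, not the real service

# Producer side: announce the tables (SQL CREATE TABLE).
conn.execute("CREATE TABLE GRMTrace (appName TEXT, event TEXT)")
conn.execute("CREATE TABLE GRMHeader (appName TEXT, event TEXT)")

# Producer side: publish events (SQL INSERT). Event strings are hypothetical.
conn.execute("INSERT INTO GRMHeader VALUES (?, ?)", ("demo", "process 0 started"))
conn.executemany(
    "INSERT INTO GRMTrace VALUES (?, ?)",
    [("demo", "send 0->1"), ("demo", "recv 1<-0"), ("other", "send 2->3")],
)


def collect(table, app_name):
    """Consumer side: collect all events of one application (SQL SELECT)."""
    if table not in ("GRMTrace", "GRMHeader"):
        raise ValueError("unknown table: " + table)
    query = f"SELECT * FROM {table} WHERE appName = ? ORDER BY rowid"
    return conn.execute(query, (app_name,)).fetchall()


header_events = collect("GRMHeader", "demo")
trace_events = collect("GRMTrace", "demo")
print(header_events, trace_events)
```

Filtering on `appName` lets several monitored applications share the same two tables, which matches the WHERE clauses issued by the GRM Main Monitor.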

[Figure: GRM connection to R-GMA. Application processes on Host 1 and Host 2 of a site publish to R-GMA; the Main Monitor on the user's host collects from R-GMA and feeds PROVE.]

Analyzing Interactive Applications in CrossGrid

•CrossGrid funded by EU: 03/2002 – 02/2005
  www.eu-crossgrid.org

•Simulation of vascular blood flow
  • Interactive visualization and simulation
    – response times are critical
    – 0.1 sec (head movement) to 5 min (change in simulation)
  • Performance analysis
    – response time and its breakdown
    – performance data for specific interactions

CrossGrid Application Monitoring Architecture

•OCM-G = Grid-enabled OMIS-Compliant Monitor
•OMIS = On-line Monitoring Interface Specification

•Application-oriented
  • Information about running applications

•On-line
  • Information collected at runtime
  • Immediately delivered to consumers

•Information collected via instrumentation
  • Activated / deactivated on demand
  • Information of interest defined at runtime (lower overhead)
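Instrumentation that is activated and deactivated on demand can be pictured as a probe that records events only while it is switched on. The `Probe` class below is a hypothetical illustration of that idea and is not the OCM-G or OMIS interface.

```python
import time


class Probe:
    """A named instrumentation point that records events only while active."""

    def __init__(self, name):
        self.name = name
        self.active = False
        self.events = []

    def hit(self, **data):
        # An inactive probe costs a single flag check, keeping overhead low.
        if self.active:
            self.events.append((time.perf_counter(), data))


probe = Probe("after_send")

probe.hit(bytes=1024)   # ignored: nobody has asked for this information yet
probe.active = True     # a tool activates the probe at runtime
probe.hit(bytes=2048)   # recorded
probe.active = False    # the tool deactivates it again
probe.hit(bytes=4096)   # ignored

print([data for _, data in probe.events])
```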

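[Figure: distribution of an OMIS request. The Performance Tool sends th_stop(Sim) to the Service Manager, which forwards th_stop(P1,P2), th_stop(P3), and th_stop(P4,P5) to three Local Monitors (LM). Each LM sends Stop to its processes P1 to P5.]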

Application Specific Measurement

•G-PM offers standard metrics
  • CPU time, communication time, disk I/O, ...

•Application programmer provides
  • Relevant events inside application (probes)
  • Relevant data computed by the application
  • Association between events in different processes

•G-PM allows new metrics to be defined
  • Based on existing ones and application-specific information
  • Metric Definition Language under development
  • Compilation or interpretation will be done by the High-Level Analysis Component.
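Since the Metric Definition Language is described as still under development, the following is only a sketch of the general idea of deriving a new metric from existing ones. The registry, the function names, and the metric `comm_fraction` are all hypothetical.

```python
derived_metrics = {}


def define_metric(name, formula):
    """Register a metric computed from other metrics via a lookup function."""
    derived_metrics[name] = formula


def evaluate(name, measurements):
    """Return a measured metric directly, or compute a derived one."""
    if name in measurements:
        return measurements[name]
    return derived_metrics[name](lambda other: evaluate(other, measurements))


# Hypothetical user-defined metric built from two standard metrics.
define_metric(
    "comm_fraction",
    lambda m: m("comm_time") / (m("cpu_time") + m("comm_time")),
)

measurements = {"cpu_time": 6.0, "comm_time": 2.0}  # seconds, made-up values
print(evaluate("comm_fraction", measurements))
```

Application-specific data, such as a value reported by a probe, could be added to `measurements` in the same way and then used inside a formula.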

Managing Dynamism: The GrADS Approach

•GrADS (Grid Application Development Software)
  • Funded by the National Science Foundation, started in 2000

•Goal: Provide application development technologies that make it easy to construct and execute applications with reliable [and often high] performance in the constantly-changing environment of the Grid.

•Major techniques to handle transparency and dynamism:
  • Dynamic configuration to available resources (configurable object programs)
  • Performance contracts and dynamic reconfiguration

GrADS Software Architecture

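[Figure: GrADS software architecture. Program Preparation System: PSE, source application, libraries, software components, whole-program compiler, configurable object program. Execution Environment: scheduler / service negotiator (negotiation), dynamic optimizer, real-time performance monitor, Grid runtime system (Globus). Performance feedback flows from the Execution Environment back to the Program Preparation System.]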

Configurable Object Programs

•Integrated mapping strategy and cost model

•Performance enhanced by context-dependent variants

•Context includes potential execution platforms

•Dynamic Optimizer performs final binding
  • Implements mapping strategy
  • Chooses machine-specific variants
  • Inserts sensors and actuators
  • Performs final compilation and optimization

Performance Contracts

A performance contract specifies the measurable performance of a grid application.

Given
  • set of resources,
  • capabilities of resources,
  • problem parameters

the application will
  • achieve a specified, measurable performance

Creation of Performance Contracts

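[Figure: creation of a performance contract. The Program's Performance Model (provided by the developer, the compiler, or measurements) and the Resource Assignment from the Resource Broker (which uses MDS and NWS) are combined into the Performance Contract.]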

History-Based Contracts

•Resources given by broker

•Capabilities of resources given by
  • Measurements of this code on those resources
  • Possibly scaled by the Network Weather Service
  • e.g. Flops/second and Bytes/second

•Problem parameters
  • Given by the input data set

•Application intrinsic parameters
  • Independent of execution platform
  • Measurements of this code with same problem parameters
  • e.g. floating point operation count, message count, message bytes count

•Measurable Performance Prediction
  • Combining application parameters and resource capabilities
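One simple way to combine the two kinds of parameters is to divide each application-intrinsic count by the matching resource capability and add the results. The additive model and the availability factor used for scaling are assumptions of this sketch; the slides do not specify the formula used in GrADS.

```python
def scaled_capability(measured_rate, availability):
    """Scale a historically measured rate by a forecast availability in 0..1.

    The availability fraction is a hypothetical stand-in for a Network
    Weather Service forecast.
    """
    return measured_rate * availability


def predict_time(flop_count, message_bytes, flops_per_sec, bytes_per_sec):
    """Assumed additive model: compute time plus communication time."""
    return flop_count / flops_per_sec + message_bytes / bytes_per_sec


# Application-intrinsic parameters from an earlier run (made-up values).
flop_count = 2e9
message_bytes = 1e8

# Resource capabilities measured for this code, scaled by forecasts.
flops_per_sec = scaled_capability(1e9, availability=0.5)
bytes_per_sec = scaled_capability(1e8, availability=0.5)

predicted = predict_time(flop_count, message_bytes, flops_per_sec, bytes_per_sec)
print(predicted)  # seconds
```

The predicted value is what a contract would record as the measurable performance the application is expected to achieve on the assigned resources.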

Application and System Space Signature

Application Signature
  • trajectory of values through N-dimensional metric space
  • one trajectory per process
  • e.g. one point per iteration
  • e.g. metric: iterations/flop

System Signature
  • trajectory of values through N-dimensional metric space
  • will vary across application executions, even on the same resources
  • e.g. metric: iterations/second

[Figure: two trajectory plots, each over the metric axes M1, M2, M3; the system signature plot is labeled "resource capabilities"]

Verification of Performance Contracts

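[Figure: verification of a performance contract. Sensor data from the Execution feeds the Contract Monitor, which performs violation detection and fault detection and can trigger Rescheduling or steer the Dynamic Optimizer.]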
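Violation detection can be sketched as a check of measured sensor values against the contracted value. The tolerance band and the requirement of several consecutive low samples are assumptions chosen for this example to avoid reacting to a single outlier; they are not taken from the GrADS contract monitor.

```python
def check_contract(predicted_rate, measured_rates, tolerance=0.2, patience=3):
    """Return the index of the sample at which a violation is declared, or None.

    A violation is declared once `patience` consecutive samples fall below
    predicted_rate * (1 - tolerance).
    """
    lower_bound = predicted_rate * (1 - tolerance)
    consecutive = 0
    for index, rate in enumerate(measured_rates):
        if rate < lower_bound:
            consecutive += 1
            if consecutive >= patience:
                return index
        else:
            consecutive = 0
    return None


# Hypothetical sensor data in iterations/second; the contract promises 10.
sensor_data = [10, 7, 9, 7, 7, 7, 10]
violation_at = check_contract(10, sensor_data)
print(violation_at)
```

A monitor built this way would hand the violation to the rescheduling component or the dynamic optimizer, while a stream that stops arriving altogether would be treated as a fault.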

APART

•ESPRIT IV Working Group, 01/1999 – 12/2000

•IST Working Group, 08/2001 – 07/2004

www.fz-juelich.de/apart

Focus:
  • Network European development projects for automatic performance analysis tools
    – Test suite for automatic analysis tools
  • Automatic Performance Analysis and Grid Computing (WP3 – Peter Kacsuk)

Summary

•Scenarios
  • Post-mortem Application Tuning
  • Self-tuning applications
  • Grid scheduling
  • Grid management

•How to handle transparency and dynamism?

•Approaches here:
  • DAMIEN: Provide static environment.
  • DataGrid: Combining system and application monitoring
  • CrossGrid: On-line analysis
  • GrADS: Performance models and contracts
