Post on 05-Jan-2016

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Dynaprof Evaluation Report Adam Leko, Hans Sherburne UPC Group HCS Research Laboratory University of Florida Color encoding key: Blue: Information Red:

Dynaprof Evaluation Report

Adam Leko, Hans Sherburne

UPC Group

HCS Research Laboratory
University of Florida

Color encoding key:

Blue: Information

Red: Negative note

Green: Positive note


Basic Information
  Name: Dynaprof
  Developer: Philip Mucci (UTK)
  Current versions:
    Dynaprof CVS as of 2/21/2005
    DynInst API v4.1.1 (dependency)
    PAPI v3.0.7 (dependency)
  Website:
    http://www.cs.utk.edu/~mucci/dynaprof/
  Contact:
    Philip Mucci ([email protected])


Dynaprof Overview
  Merges existing tools
    PAPI
    DynInst API
  Command-line tool
  Dynamically instruments programs at runtime
    Requires no recompilation!
    Insert probes at runtime
  Metrics available
    Wall clock time
    Any PAPI metrics
    Can be extended
  Only simple GUI available (see right)
    Just wrapper around command-line version
    Currently pretty broken

Screenshot text (Dynaprof GUI startup banner):
  DynaProf 0.9
  Philip J. Mucci, [email protected], 2000-2003
  Provided courtesy of UTK's Innovative Computing Laboratory. See http://icl.cs.utk.edu for more information.
  This is Open Source Software!
  (dynaprof)|


Instrumentation Overview
  Instrumentation very easy
    Especially for sequential/threaded applications
  Compile application regularly (-g eases naming later)
    gcc -O3 -g -o camel camel.c
  Dynaprof commands
    Load the exe
      load camel
    Specify which probe you wish to use
      use papiprobe [args]
    List available functions
      list camel.c
    Instrument command
      All functions in a file: instr module camel.c
      A single function: instr function camel.c main
    Run command
      continue
      <CTRL-C> pauses execution (currently does not work)
    Instrumentation output is produced in an additional file (will be shown at runtime)
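The command sequence above can be collected into a script. The sketch below is only an illustration in Python: the helper name build_dynaprof_script is hypothetical, it emits only the commands named on this slide (load, use, instr module, instr function, continue), and it leaves the probe arguments empty because their format is not given here. How the resulting text is handed to dynaprof is not covered by this sketch.

```python
def build_dynaprof_script(executable, probe, probe_args=(), modules=(), functions=()):
    """Return Dynaprof commands, one per line, in the order shown on the slide.

    Hypothetical helper: it only formats text and does not run dynaprof.
    """
    lines = [f"load {executable}"]
    use_line = f"use {probe}"
    if probe_args:
        # The argument format is an assumption; the slide only says "[args]".
        use_line += " " + " ".join(probe_args)
    lines.append(use_line)
    for module in modules:
        # Instrument every function in a source file.
        lines.append(f"instr module {module}")
    for module, function in functions:
        # Instrument a single function.
        lines.append(f"instr function {module} {function}")
    lines.append("continue")
    return "\n".join(lines) + "\n"


script = build_dynaprof_script("camel", "papiprobe", functions=[("camel.c", "main")])
print(script, end="")
```

Keeping the commands in a generated file makes it cheap to change which functions are instrumented between runs.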


Instrumentation Overview [2]
  No special commands needed for
    sequential applications
    pthread applications
  MPI not supported directly through command line
    Wrapper scripts available for MPICH and LAM
    Dynaprof must be run in “batch mode”
      A file containing all instrumentation commands
      Halts the app before MPI_Init() is called
    However, not working with current version of MPICH
      Get assertion failure and stops working
      Can only use MPI programs with 1 process
  UPC?
    Tried
      GCC-UPC
      BUPC (smp + pthreads)
    Both produced no output or crashed Dynaprof


Dynaprof Probe Information
  Probes perform all data collection and analysis
    Provide code to insert into a function when instrumented
  Probes can be called at 4 different points
    Function entry point
    Function exit point
    Function call point
    Function return point
  Each probe is encapsulated in a shared library
    Allows relatively easy creation of new probes
  Available probes
    “Wallclock” probe (records wall clock time)
    PAPI wallclock probe (same as wallclock, uses high-resolution timers)
    PAPI probe (records any PAPI metric, such as FLOPs)
      Specify PAPI metrics as args in use papiprobe [args] command
  Existing probes provide profile-style data only
    Although there is no reason that a trace could not also be collected
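To make the entry/exit idea concrete, here is a toy model in Python. It is not Dynaprof's probe interface (real probes are shared libraries inserted through DynInst); the class WallclockProbe, its on_entry/on_exit methods, and the instrument helper are all hypothetical names. It only shows how code run at function entry and exit can accumulate profile-style data (call counts and inclusive time).

```python
import time
from collections import defaultdict


class WallclockProbe:
    """Toy profile-style probe: accumulates call counts and inclusive time."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.calls = defaultdict(int)
        self.inclusive = defaultdict(float)
        self._starts = []  # stack of (function name, start time)

    def on_entry(self, name):
        # Runs at the function entry point.
        self._starts.append((name, self.clock()))

    def on_exit(self, name):
        # Runs at the function exit point.
        started_name, start = self._starts.pop()
        assert started_name == name, "unbalanced entry/exit"
        self.calls[name] += 1
        self.inclusive[name] += self.clock() - start


def instrument(probe, func):
    """Wrap func so the probe fires at entry and exit (illustration only)."""
    def wrapper(*args, **kwargs):
        probe.on_entry(func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            probe.on_exit(func.__name__)
    return wrapper


# Deterministic fake clock so the example does not depend on real timing.
ticks = iter(range(100))
probe = WallclockProbe(clock=lambda: next(ticks))


def work():
    return 42


work = instrument(probe, work)
work()
work()
print(dict(probe.calls), dict(probe.inclusive))
```

A trace-style probe would append one timestamped record per entry and exit instead of summing into totals, which is the kind of extension the last bullet alludes to.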


Probe Output
  After running, an ASCII file containing raw data is created
    At runtime, a message like “…output will be in /home/leko/…” will be printed indicating where file will be
  Three programs are provided which analyze the raw data
    wallclockrpt – for wall clock probe
    papiclockrpt – for PAPI wall clock probe
    papiproberpt – for PAPI probe
  Summary statistics are provided
    Exclusive profile (metric collected excluding children)
    Inclusive profile (metric collected including children)
    1-call level deep profile (see which functions an instrumented function called)
  Output from *rpt programs is simple ASCII (sample next page)
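The relationship between the exclusive and inclusive profiles can be sketched as follows. This is not how the *rpt programs are implemented; the function name, the call tree, and the numbers are hypothetical. It assumes each child is called from only one parent and that there is no recursion, so a function's exclusive value is its inclusive value minus the inclusive values of its direct children.

```python
def exclusive_times(inclusive, children):
    """Exclusive metric = inclusive metric minus the inclusive metric of direct children."""
    return {
        name: total - sum(inclusive[child] for child in children.get(name, ()))
        for name, total in inclusive.items()
    }


# Hypothetical inclusive totals and a one-level call tree.
inclusive = {"main": 100.0, "solve": 70.0, "io": 20.0}
children = {"main": ["solve", "io"]}

result = exclusive_times(inclusive, children)
print(result)
```

With these made-up numbers, main keeps only the 10 units it spent outside of solve and io.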


Sample Probe Report (lu.W.1)

[leko@eta-1 dynaprof]$ wallclockrpt lu-1.wallclock.16143

Exclusive Profile.

Name            Percent     Total       Calls
-------------   -------     -----       -------
TOTAL           100         1.436e+11   1
unknown         100         1.436e+11   1
main            3.837e-06   5511        1

Inclusive Profile.

Name            Percent     Total       SubCalls
-------------   -------     -----       -------
TOTAL           100         1.436e+11   0
main            100         1.436e+11   5

1-Level Inclusive Call Tree.

Parent/-Child   Percent     Total       Calls
-------------   -------     -----       --------
TOTAL           100         1.436e+11   1
main            100         1.436e+11   1
- f_setarg.0    1.414e-05   2.03e+04    1
- f_setsig.1    1.324e-05   1.902e+04   1
- f_init.2      2.569e-05   3.691e+04   1
- atexit.3      7.042e-06   1.012e+04   1
- MAIN__.4      0           0           1

Note: only “main” was instrumented in this profiled run
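Because the report is plain ASCII, it is easy to post-process. The following Python sketch parses the sample above into sections and rows. The parsing rules are assumptions inferred from this one sample (section titles end with a period, separator rows start with dashes, child rows start with "- ", and the last three columns are numeric), not a documented format; parse_report is a hypothetical name.

```python
SAMPLE = """\
Exclusive Profile.
Name Percent Total Calls
------------- ------- ----- -------
TOTAL 100 1.436e+11 1
unknown 100 1.436e+11 1
main 3.837e-06 5511 1
Inclusive Profile.
Name Percent Total SubCalls
------------- ------- ----- -------
TOTAL 100 1.436e+11 0
main 100 1.436e+11 5
1-Level Inclusive Call Tree.
Parent/-Child Percent Total Calls
------------- ------- ----- --------
TOTAL 100 1.436e+11 1
main 100 1.436e+11 1
- f_setarg.0 1.414e-05 2.03e+04 1
- f_setsig.1 1.324e-05 1.902e+04 1
- f_init.2 2.569e-05 3.691e+04 1
- atexit.3 7.042e-06 1.012e+04 1
- MAIN__.4 0 0 1
"""


def parse_report(text):
    """Return {section title: [(name, percent, total, count, is_child), ...]}."""
    sections = {}
    rows = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("--"):
            continue  # blank line or separator row
        if line.endswith("."):
            rows = sections.setdefault(line[:-1], [])  # new section title
            continue
        parts = line.split()
        if rows is None or len(parts) < 4:
            continue
        try:
            percent, total, count = float(parts[-3]), float(parts[-2]), int(parts[-1])
        except ValueError:
            continue  # column header row
        is_child = parts[0] == "-"
        name = " ".join(parts[1:-3] if is_child else parts[:-3])
        rows.append((name, percent, total, count, is_child))
    return sections


report = parse_report(SAMPLE)
for name, percent, total, count, is_child in report["1-Level Inclusive Call Tree"]:
    if is_child:
        print(name, total)
```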


Instrumentation Overhead
  Could only instrument one-process MPI code
    MPI run wrapper script broken
    No PPerf apps! (all require > 1 process)
  Camel overhead very high
    Only instrumented main
  LU overhead really low?
  Possible causes of overhead
    Frequent subroutine calls from main
    Use of tsc.h processor counters for timers confuses Dynaprof
  Expect overhead similar to Paradyn
    5-10% for most applications with a reasonable number of instrumentation points

[Figure: bar chart “Dynaprof overhead”. X axis: Benchmark (CAMEL wallclock, CAMEL papiprobe, LU wallclock, LU papiprobe). Y axis: Overhead (instrumented/uninstrumented), 0% to 140%.]
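For reference, one common way to express instrumentation overhead as a percentage is the relative increase in run time. The Python sketch below assumes that definition; the chart's exact formula is not stated beyond the axis label, and the timings used here are made-up numbers, not measurements from this evaluation.

```python
def overhead_percent(instrumented_s, uninstrumented_s):
    """Relative slowdown of the instrumented run, in percent (assumed definition)."""
    if uninstrumented_s <= 0:
        raise ValueError("uninstrumented time must be positive")
    return 100.0 * (instrumented_s - uninstrumented_s) / uninstrumented_s


# Hypothetical timings in seconds.
print(overhead_percent(12.0, 10.0))
```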


Bottleneck Identification: Test Suite
  Testing metric: what did output of probe tell us?
  CAMEL: FAILED
    Instrumenting main caused too much application perturbation
  NAS LU (“W” workload): TOSS-UP
    Given enough time, any bottleneck could be identified
      Even cache miss problems, thanks to PAPI!
    But how much time to identify bottlenecks?
    Communication problems difficult/impossible to pinpoint
      No tracing
      No communication visualization
  Could not evaluate PPerfMark suite (running MPI commands broken)
    However, same comments for LU would probably apply to all
  In general,
    Heavily reliant on user’s proficiency with pinpointing problems
    Incremental approach
      Instrument, re-run, instrument w/PAPI, re-run…
    Process can be tedious
      But, ease of instrumentation does ease this


Adding UPC/SHMEM Support to Dynaprof
  UPC support
    Would need to do a ton of work
    Best bet
      Provide a UPC probe
      Instrument “known” UPC runtime functions
        GASNet functions for Berkeley
        Etc.
    Need one probe per UPC runtime/compiler environment
  SHMEM support
    No extra work necessary!
    Handles instrumenting libraries like any other code
  However, a few potential problems
    Reliance on DynInst
      Hard to port
      Hard to compile!
    Reliance on PAPI
      Can add own probes which do not use PAPI though…
  Best way to use Dynaprof
    Steal ideas on how to make tool extensible
      Probes as shared libraries nice idea!
    Steal code on how to use DynInst & PAPI


Dynaprof General Comments
  Good points
    Free
    Source code available, relatively organized
    Good reference on how to use PAPI & DynInst API
    Very easy to use
    Relatively easy to extend
    Developer very responsive to questions
  Bad points
    High instrumentation overhead in a few cases
    Simple to understand, but not much available functionality
    Only profiling data with current probes
    Not really being updated much any more
    Changing program arguments requires reloading & reinstrumenting executable
  Dynaprof illustrates that a tool doesn’t have to be ultra-complicated to be useful
    KISS!


Evaluation (1)
  Available metrics: 2/5
    Can use PAPI to get lots of data
    Limited in what you can collect in a single run, only
      Two PAPI metrics or
      Wall clock time
  Cost: 5/5
    Free
  Documentation quality: 4/5
    Minimal documentation, but covers the basics pretty well
  Extendibility: 3.5/5
    Open source
    Can add new functionality by writing new probes
    Must write new code to extend (not much existing functionality)
  Filtering and aggregation: 2/5
    Most program data is filtered out for you
      Direct result of profile-nature of current probes
    Many times too much information is lost
    Filtering and aggregation behavior fixed in source code of probes
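Given the limit of two PAPI metrics per run noted above, collecting more metrics means planning several runs. The Python sketch below just splits a wish list into groups of two; plan_runs is a hypothetical helper, the event names are standard PAPI presets chosen for illustration, and whether a given pair can actually be counted together on particular hardware is not checked.

```python
def plan_runs(metrics, per_run=2):
    """Split a list of metrics into consecutive groups, one group per run."""
    if per_run < 1:
        raise ValueError("per_run must be at least 1")
    return [metrics[i:i + per_run] for i in range(0, len(metrics), per_run)]


# Illustrative wish list of PAPI preset events.
wanted = ["PAPI_TOT_CYC", "PAPI_FP_OPS", "PAPI_L1_DCM", "PAPI_L2_TCM", "PAPI_TOT_INS"]
runs = plan_runs(wanted)
for number, group in enumerate(runs, start=1):
    print(f"run {number}: {' '.join(group)}")
```

Five metrics therefore need three separate instrumented runs, which adds to the manual effort of the incremental approach.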


Evaluation (2)
  Hardware support: 4/5
    Most everything supported: Linux, AIX, IRIX, HP-UX
    Reliance on PAPI and DynInst could hinder porting
    No Cray support
  Heterogeneity support: 0/5 (not supported)
  Installation: 3/5
    Dynaprof easy to compile, but PAPI and DynInst a nightmare to install
    Also had to hack up some source code a bit to work with newer versions of gcc & javac (JDK1.5)
  Interoperability: 1.5/5
    No export interoperability with other tools
    There is a half-done TAU probe
      Not sure if it works
      Or how useful it is!
  Learning curve: 4.5/5
    Very easy to use
    Anyone used to prof/gprof will feel right at home


Evaluation (3)
  Manual overhead: 5/5
    Very easy to choose which functions you want instrumented
    Can script behavior of dynaprof executable
    Reinstrumenting requires no recompilation
  Measurement accuracy: 4/5
    Tracing overhead small as long as number of instrumented functions kept reasonable
    Program’s correctness of execution not affected
    Dynamic instrumentation does not get in compiler’s way for optimizations
      Function wrappers, etc. can affect the compiler’s ability to inline functions
  Multiple analyses: 1/5
    Not supported, can do it manually
  Multiple executions: 1/5
    Supported to the extent you can run two independent versions of a program
  Multiple views: 1/5
    One way of recording data, one way of presenting it
    Probes could theoretically present things differently, but none currently do


Evaluation (4)
  Performance bottleneck identification: 1.5/5
    No automatic detection
    Usefulness of tool directly related to cleverness of user
  Profiling/tracing support: 1.5/5
    Only supports profiling
    Could feasibly add tracing if you wanted to code
  Response time: 2/5
    No data at all until after run has completed and tracefile has been opened
    Generating reports from raw data instantaneous though
  Software support: 5/5
    Can link against (and instrument!!) any existing library
    Supports MPI (although broken) and shared-memory threaded programs
  Source code correlation: 3/5
    Data reported to user at the function name level


Evaluation (5)
  System stability: 3/5
    Command-line interface relatively stable
    <CTRL-C> pause while running broken in command-line
    GUI severely broken
  Technical support: 3/5
    Responses from contact within 24 hours
    Philip Mucci very helpful, knowledgeable