3. VampirTrace in Detail
Robert Henschel
Contents
• Instrumentation
– Automatic, manual and binary instrumentation
• Runtime measurement
– Behind the scenes, post-processing
– Trace file format
• Options, settings, parameters
– Environment variables
– PAPI hardware performance counters
– Memory allocation counters, application I/O calls
– Filtering, grouping
Instrumentation in General
Instrumentation: the process of modifying programs to detect and report events by calling instrumentation functions.
• Instrumentation functions are provided by the trace library
• They notify the trace library about runtime events
• There are various ways to instrument a program
Instrumentation in General
• This is all for the Linux/Unix version of VampirTrace!
• The VampirTrace manual is really helpful!
Instrumentation in General
[Figure: Edit – Compile – Run Cycle: Source Code → (Compiler) → Binary → (Run) → Results]
[Figure: Edit – Compile – Run Cycle with VampirTrace: Source Code → (VT Wrapper, Compiler) → Binary → (Run) → Results, Traces]
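Instrumentation in General
Original function:
int foo(void* arg){
    if (cond){
        return 1;
    }
    return 0;
}
Instrumented function:
int foo(void* arg){
    enter(7);          /* report function entry */
    if (cond){
        leave(7);      /* report exit on the early return */
        return 1;
    }
    leave(7);          /* report exit on the normal return */
    return 0;
}
The enter/leave calls are inserted manually or automatically.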
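To picture what the enter(7)/leave(7) calls might do, here is a minimal sketch of a trace library. It is not VampirTrace's implementation: the names trace_enter, trace_leave, event_log and the fixed-size array are hypothetical and were chosen only for illustration. The slide's cond is passed as a parameter so the sketch compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical event record: which region, entry or exit, and when. */
typedef enum { EV_ENTER, EV_LEAVE } event_kind;

typedef struct {
    int        region_id;
    event_kind kind;
    clock_t    timestamp;
} trace_event;

#define MAX_EVENTS 1024
static trace_event event_log[MAX_EVENTS];
static size_t      event_count = 0;

/* Append one event; silently drop it if the log is full. */
static void record(int region_id, event_kind kind)
{
    assert(region_id >= 0);
    if (event_count < MAX_EVENTS) {
        event_log[event_count].region_id = region_id;
        event_log[event_count].kind      = kind;
        event_log[event_count].timestamp = clock();
        event_count++;
    }
}

void trace_enter(int region_id) { record(region_id, EV_ENTER); }
void trace_leave(int region_id) { record(region_id, EV_LEAVE); }

/* The function from the slide, with cond passed in as a parameter. */
int foo(int cond)
{
    trace_enter(7);
    if (cond) {
        trace_leave(7);   /* the early exit must be reported too */
        return 1;
    }
    trace_leave(7);
    return 0;
}
```

Each call to foo adds exactly one enter and one leave event to event_log, whichever return path is taken.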
Instrumentation Types
• Automatic instrumentation using compiler wrappers
– Think vtcc, vtcxx, vtf90
• Manual instrumentation
– Think VT_USER_START(""), and -DVTRACE
• Binary instrumentation
– Think Dyninst (not covered in this presentation)
Automatic Instrumentation
• Easiest way of using VampirTrace
• No source code modifications
• In the build system of your application, substitute calls to the regular compiler with calls to the VampirTrace compiler wrappers
– For compiling and linking
– e.g. in the makefile change "icc" to "vtcc"
• Rebuild the application
• Set environment variables as required
• Run the application to produce trace data
Automatic Instrumentation
• Captured events:
– All user function entries and exits
  • If supported by the compiler (Intel, IBM, GNU, PGI, PathScale, NEC)
– MPI calls and messages
– OpenMP regions
– Pthread events
– Fork/Exec/System calls
– I/O events
– Memory events
– Hardware performance counters
• Some events require setting an environment variable!
Automatic Instrumentation
icc hello.c -o hello
vtcc hello.c -o hello

icpc hello_parallel.cpp -lmpi -o hello_parallel
vtcxx hello_parallel.cpp -lmpi -o hello_parallel

mpicc hello_mpi.c -o hello_mpi
vtcc -vt:cc mpicc hello_mpi.c -o hello_mpi
Manual Instrumentation
• Allows for detailed source code instrumentation
– e.g. regions of functions such as loops
• Can be combined with automatic instrumentation
• Be sure to instrument all function exits!
– And compile with "-DVTRACE"
Manual Instrumentation
• Add the following to your source code to instrument a region, e.g. in C (also available for C++ and Fortran):
#include "vt_user.h"
...
VT_USER_START("Region_1");
...
VT_USER_END("Region_1");
...
• Compile with "-DVTRACE"
– Otherwise, VampirTrace macros will expand to empty blocks, producing zero overhead
vtcc -vt:inst manual prog.c -DVTRACE -o prog
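The advice to instrument all function exits is easiest to see on a function with an early return. The following is a sketch only: first_negative and the region name "search_loop" are made up for illustration, and the empty stand-in macros are an assumption so that the file also builds on a machine without VampirTrace installed.

```c
#include <assert.h>
#include <stddef.h>

#ifdef VTRACE
#include "vt_user.h"            /* the real VampirTrace macros */
#else
/* Assumption for this sketch: empty stand-ins when VTRACE is not defined. */
#define VT_USER_START(name) ((void)0)
#define VT_USER_END(name)   ((void)0)
#endif

/* Returns the index of the first negative value, or -1 if there is none. */
int first_negative(const double *values, size_t n)
{
    assert(values != NULL || n == 0);
    VT_USER_START("search_loop");
    for (size_t i = 0; i < n; i++) {
        if (values[i] < 0.0) {
            VT_USER_END("search_loop");   /* early exit: close the region here too */
            return (int)i;
        }
    }
    VT_USER_END("search_loop");           /* normal exit */
    return -1;
}
```

Without the VT_USER_END before the early return, the region would be entered but never left whenever a negative value is found.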
Binary Instrumentation
• Is described in the manual but not supported on BigRed or Quarry.
Runtime Measurement
• Runtime measurement
– Behind the scenes
– Unifying (post-processing)
– OTF (Open Trace Format)
Behind the Scenes
• Trace data is written to a buffer in memory first
• When this buffer is full, data is flushed to local storage
• After the application has run to completion, these trace files are unified to produce the final OTF trace
• If all goes well, this should be transparent to the user
• Most aspects of this behavior can, and in some cases must, be customized with environment variables
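The buffer-then-flush behavior described above can be sketched in a few lines of C. This is an illustration under stated assumptions, not VampirTrace code: trace_buffer, record_event, finish_trace and the tiny capacity of 4 records are all hypothetical, and a plain FILE* stands in for node-local storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define BUFFER_CAPACITY 4          /* tiny on purpose, to force flushes */

typedef struct {
    int    records[BUFFER_CAPACITY];
    size_t used;                   /* records currently held in memory */
    size_t flushes;                /* how often the buffer was written out */
    FILE  *local_file;             /* stands in for node-local storage */
} trace_buffer;

/* Write all buffered records to the local file and empty the buffer. */
static int flush_buffer(trace_buffer *buf)
{
    if (buf->used == 0) return 0;
    size_t written = fwrite(buf->records, sizeof buf->records[0],
                            buf->used, buf->local_file);
    if (written != buf->used) return -1;
    buf->used = 0;
    buf->flushes++;
    return 0;
}

/* Append one record; flush first if the buffer is already full. */
int record_event(trace_buffer *buf, int record)
{
    assert(buf != NULL && buf->local_file != NULL);
    if (buf->used == BUFFER_CAPACITY && flush_buffer(buf) != 0)
        return -1;
    buf->records[buf->used++] = record;
    return 0;
}

/* Called once at the end of the run so that nothing stays in memory. */
int finish_trace(trace_buffer *buf) { return flush_buffer(buf); }
```

A larger buffer means fewer flushes during the run, which is the trade-off that settings such as VT_BUFFER_SIZE and VT_MAX_FLUSHES control in the real tool.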
Trace Unification
• Normally, trace data is unified automatically after the application has run to completion. If this is not the case, you can run vtunify manually.
• vtunify <number-of-trace-files> <trace-file-prefix>
• Unification will be MPI parallel in the future, thus much faster for MPI jobs.
vtunify 16 my_trace
OTF – Open Trace Format
• Open source trace file format
– Available from the homepage of TU Dresden, ZIH
  • http://www.tu-dresden.de/zih/otf/
• Includes powerful libotf for use in custom applications and other OTF tools
• API / Interfaces
– High level interface for analysis tools
– Low level interface for trace libraries
• Actively developed
– In cooperation with the University of Oregon, Lawrence Livermore National Laboratory and Forschungszentrum Jülich
Options, Settings, Parameters
• General environment variables
• Influencing trace file size
• PAPI hardware performance counters
• Memory allocation counters
• Application I/O calls
• Pthread tracing
• Filtering
• Grouping
General Environment Variables
• By default, trace data is written to the current working directory
• Almost all of this can be customized with environment variables
• Environment variables must be set prior to running the application, not prior to building the application
• They must be set on all nodes that participate in the MPI job
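Why the variables matter at run time rather than at build time becomes clearer with a sketch of how a trace library could read its settings when the program starts. The helpers read_setting and setting_enabled are hypothetical; the only assumption taken from the slides is that switches are turned on with the value "yes".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of an environment variable,
 * or fallback if it is unset or empty. */
const char *read_setting(const char *name, const char *fallback)
{
    assert(name != NULL && fallback != NULL);
    const char *value = getenv(name);
    return (value != NULL && value[0] != '\0') ? value : fallback;
}

/* Return 1 if the variable is set to "yes", otherwise 0.
 * Assumption: other spellings are ignored in this sketch. */
int setting_enabled(const char *name)
{
    assert(name != NULL);
    const char *value = getenv(name);
    return value != NULL && strcmp(value, "yes") == 0;
}
```

Because getenv is evaluated in the running process, a call such as read_setting("VT_FILE_PREFIX", "trace") sees whatever was exported in the shell that launched the job, on each node separately.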
General Environment Variables
• VT_PFORM_GDIR Directory where final trace file is stored
• VT_PFORM_LDIR Directory for local trace data
• VT_FILE_PREFIX Trace file name
• VT_MEMTRACE Enable memory allocation tracing
• VT_IOTRACE Enable I/O tracing
• VT_LIBCTRACE Enable fork/exec/system tracing
• VT_FILTER_SPEC Name of filter file
• VT_GROUPS_SPEC Name of function groups file
• VT_METRICS List of PAPI counters
• VT_VERBOSE Print diagnostic messages (0, 1, 2)
Influencing Trace File Size
• VT_BUFFER_SIZE Size of trace buffer (per process)
• VT_MAX_FLUSHES Number of buffer flushes (0 = unlimited)
• VT_FILTER_SPEC Filter function calls (see the Filtering slide)
PAPI Hardware Performance Counters
• PAPI counters can be included in traces
– If VampirTrace was built with PAPI support
– If PAPI is available on the platform
  • Which is currently not the case for BigRed or Quarry!
• VT_METRICS can be used to specify a colon-separated list of PAPI counters
export VT_METRICS=PAPI_FP_OPS:PAPI_L2_TCM
Memory Allocation Counters
• Memory allocation counters can be included in traces
– If VampirTrace was built with memory allocation support
– If GNU glibc is used on the platform
– Which is the case for both BigRed and Quarry
• Memory functions in glibc like "malloc" and "free" are traced
• VT_MEMTRACE can be used to enable memory allocation tracing
export VT_MEMTRACE=yes
Application I/O Calls
• I/O counters can be included in traces
– If VampirTrace was built with I/O tracing support
– If GNU glibc is used on the platform
– Which is the case for both BigRed and Quarry
• Standard I/O calls like "open" and "read" are recorded
• VT_IOTRACE can be used to enable I/O tracing
export VT_IOTRACE=yes
Pthread Tracing
• Pthread usage is detected automatically by the VT compiler wrappers
• For C/C++ applications, it is possible to trace the overhead of Pthread functions (thread joins etc.)
– Include "vt_user.h" in all source files that contain C Pthread API calls
– Compile the code with the additional define "-DVTRACE_PTHREAD"
vtcc -DVTRACE_PTHREAD hello.c -o hello
Other Advanced Options
• Resource usage counters
• Fork/Exec/System tracing
• User-defined counters
• User-defined markers
Filtering
• Filtering is one of the ways to reduce trace file size
• Activated by setting the "VT_FILTER_SPEC" environment variable
• Filter file contains a list of filters for functions that are applied during the execution of the application
• "vtfilter" tool can create a filter file
• "vtfilter" tool can reduce the size of trace files
export VT_FILTER_SPEC=/home/user/filter.spec
Example filter file (functions, then the call limit; -1 means unlimited):
my*;test -- 1000
calculate -- -1
* -- 1000000
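How such a filter file could be applied is sketched below. The sketch assumes the example file holds three rules: my* and test limited to 1000 calls, calculate unlimited (-1), and everything else limited to 1000000. It also assumes that the first matching rule wins. The names filter_rule, matches_any and call_limit_for are hypothetical, the rules are hard-coded rather than parsed from a file, and POSIX fnmatch is used for the wildcard matching.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* One filter line: patterns on the left, call limit on the right. */
typedef struct {
    const char *patterns;   /* semicolon-separated, may contain wildcards */
    long        limit;      /* assumed meaning: -1 = unlimited */
} filter_rule;

static const filter_rule rules[] = {
    { "my*;test",  1000    },
    { "calculate", -1      },
    { "*",         1000000 },
};

/* Return 1 if name matches any pattern in a semicolon-separated list. */
static int matches_any(const char *patterns, const char *name)
{
    char pattern[128];
    const char *start = patterns;
    while (*start != '\0') {
        size_t len = strcspn(start, ";");
        if (len > 0 && len < sizeof pattern) {
            memcpy(pattern, start, len);
            pattern[len] = '\0';
            if (fnmatch(pattern, name, 0) == 0) return 1;
        }
        start += len;
        if (*start == ';') start++;   /* skip the separator */
    }
    return 0;
}

/* Assumption: the first matching rule wins.
 * Functions without any matching rule are treated as unlimited (-1). */
long call_limit_for(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++)
        if (matches_any(rules[i].patterns, name)) return rules[i].limit;
    return -1;
}
```

With these rules, a function named my_solver would stop being recorded after 1000 calls, while calculate would always be recorded. A tracer would compare a per-function call counter against the returned limit.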
Function Grouping
• Groups can be defined by the user to group related functions
– Groups can be assigned different colors in Vampir and VampirServer, highlighting application behavior
• Activated by setting the "VT_GROUPS_SPEC" environment variable
• Groups file contains a list of groups with associated functions
export VT_GROUPS_SPEC=/home/user/groups.spec
Example groups file:
CALC=calculate
MISC=my*;test
UNKNOWN=*