5/30/00 CSE 225
Performance Analysis Tools
Nadya Williams
Spring, 2000
UCSD
Outline
• Background
• Performance measurement
• SvPablo
• Autopilot
• Paradyn
• XPVM
Background
• Goal - high performance computing for applications that are distributed:
– by design, e.g. collaborative environments, distributed data analysis, computer-enhanced instruments
– by implementation, e.g. metacomputing, high-throughput computing
• Goal - to achieve & maintain performance guarantees in heterogeneous, dynamic environments
Background
Performance-robust grid applications need to
• Identify resources required to meet application performance requirements
• Select from problem specification, algorithm & code variants
• Establish hierarchical performance contracts
• Select and manage adaptation strategies when performance contracts are violated
Computational grids
[Figure: visualization engines, visualization and steering, MPPs, and real-time data analysis, connected by networks]
Shared resources
– computation, network, and data archives
Complexity
• Emerging applications are dynamic
– time varying resource demands
– time varying resource availability
– heterogeneous execution environments
– geographically distributed
• Display and analysis hierarchy
– code, thread, process, processor
– system and local area network
– national/international network
Grid performance challenges
• Wide area infrastructure
• Many resource models
• Behavioral variability
– complex applications, diverse systems and networks
– irreproducible behavior
• Heterogeneous applications
– multilingual and multimodel
– real-time constraints and shared resources
• Prediction & scheduling
Performance analysis
• The ability to
– capture
– analyze
– present
– optimize
• Multiple analysis levels
– hardware
– system software
– runtime systems
– libraries
– applications
Good tools must accommodate all
Real-time Multilevel Analysis
• Multilevel Drilldown
– multiple sites
– multiple metrics
– real-time display
• Problems
– uncertainty and perturbation
– confusion of cause and effect
Guidelines
• Design for locality
– regardless of programming model
– threads, MPI, data parallel -- it’s the same
• Recognize historical models
– large codes develop over time
– assumptions change
• Think about more than FLOPS
– I/O, memory, networking, user interfaces
Initial steps
• Develop infrastructure for structural and performance information
• Provide instrumentation of end-user applications & communication libraries
• Study performance characteristics of real grid applications
Peak and Sustained Performance
• Peak performance
– perfect conditions
• Actual performance
– considerably less
• Environment dictates performance
– locality really matters
– we must design for performance stability
– more of less may be better than less of more
Measurement developments
• Hardware counters
– once rare (Cray), now common (Sun, IBM, Intel, Compaq)
– metrics
  • operation types
  • memory stalls
• Object code patching
– run-time instrumentation
• Compiler integration
– inverse compiler transformations
– high-level language analysis
Correlating semantic levels
• Performance measurements
– capture behavior of executing software
– reflect output of multi-level transformations
• Performance tools
– must relate data to “user” semantic model
  • cache miss ratios cannot help a MATLAB user
  • message counts cannot help an HPF user
– should suggest possible performance remedies
Analysis developments
• Visualization techniques
– traces and statistics
• Search and destroy
– AI suggestions and consultants
– critical paths and zeroing
• Data reduction and processing
– statistical clustering/projection pursuit
– neural net, and time series classification
• Real-time control
– sensor/actuator models
Performance tool checkpoint
• An incomplete view …
– representative techniques and tools
• Major evolution
– from architectural views/post-mortem analysis
– to deeper correlation and derived metrics
• Key open problems
– adaptivity
– scale
– semantic correlation
Representative vendor tools
• IBM VT™
– “ParaGraph” trace display and statistical metrics
• Silicon Graphics Speedshop™
– R10000, R12000 hardware counter tools
• Pallas Vampir™
– event tracing and display tools
• Cray ATExpert™ (autotasking)
– basic AI suggestions for tuning
• Intel SPV™
– ParaGraph and hardware counter displays
• TMC/SUN Prism™
– data parallel and message passing analysis
Representative research tools
• Illinois SvPablo™
– performance data metaformat
– Globus integration (sensor/actuator control)
• Illinois Autopilot™
– performance steering
• Wisconsin Paradyn™
– runtime code patching
– performance consultant
• Oak Ridge National Lab XPVM
– X Windows based, graphical console and monitor for PVM
Department of Computer Science
University of Illinois at Urbana-Champaign
SvPablo:
Graphical source code browser for performance tuning and visualization
SvPablo Outline
• Background
• SvPablo overview
• SvPablo model
• Automatic/Interactive instrumentation of programs
• The Pablo Self-Defining Data Format
SvPablo Background
• Motivations
– emerging high-level languages (HPF and HPC++)
– aggressive code transformations for parallelism
– large semantic gap between user and code
• Goals
– relate dynamic performance data to source
– hide semantic gap
– generate instrumented executable/simulated code
– support performance scalability predictions
Background
• Tools should provide the performance data and suggestions for performance improvements at the level of an abstract, high-level program
• Tools should integrate dynamic performance data with information recorded by the compiler that describes the mapping from the high-level source to the resulting low-level explicitly parallel code
SvPablo overview
A graphical user interface tool for:
• source code instrumentation
• browsing runtime performance data
Two major components:
• performance instrumentation libraries
• performance analysis and presentation
Provides:
• performance data capture
• analysis
• presentation
SvPablo overview
• Instrumentation
– automatic
  • HPF (from PGI)
– interactive
  • ANSI C
  • Fortran 77
  • Fortran 90
• Data capture
– dynamic software statistics (no traces)
– SGI R10000 counter values
SvPablo overview
• Source code instrumentation
– HPF: PGI runtime system invokes instrumentation
  • each procedure call
  • each HPF source line
– C and Fortran programs: interactively instrumented
  • outer loops
  • function calls
• Instrumentation maintains statistical summary
• Summaries correlated across processors
• Correlated summary input to browser
SvPablo overview
• Architectures:
– any system with the PGI HPF compiler
– any system with F77 or F90
– C applications supported on
  • single processor Unix workstations
  • network of Unix workstations using MPI
  • Intel Paragon
  • Meiko CS2
• GUI supports:
– Sun (Solaris)
– SGI (IRIX)
Statistics metrics
For procedures:
– count
– exclusive / inclusive duration
– send / receive message duration (HPF only)
For lines:
– count
– duration
– exclusive duration
– message send and message receive (HPF only)
  • duration
  • count
  • size
– event counters (SGI)
Mean, STD, Min, Max
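The slides say that instrumentation keeps a statistical summary (count, mean, STD, min, max) instead of a trace, and that summaries are later correlated across processors. They do not say how SvPablo computes these values. The sketch below is one possible approach, assuming Welford's online update and the standard pairwise merge formula; the `Summary` class and the sample durations are hypothetical and are not SvPablo's API.

```python
import math
from dataclasses import dataclass


@dataclass
class Summary:
    """Running count, mean, std, min and max for one instrumented construct."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean
    minimum: float = math.inf
    maximum: float = -math.inf

    def add(self, duration: float) -> None:
        """Fold one measured duration into the summary (Welford update)."""
        self.count += 1
        delta = duration - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (duration - self.mean)
        self.minimum = min(self.minimum, duration)
        self.maximum = max(self.maximum, duration)

    @property
    def std(self) -> float:
        """Population standard deviation of the durations seen so far."""
        return math.sqrt(self.m2 / self.count) if self.count else 0.0

    def merge(self, other: "Summary") -> "Summary":
        """Combine two summaries, e.g. the same source line on two processors."""
        total = self.count + other.count
        if total == 0:
            return Summary()
        delta = other.mean - self.mean
        return Summary(
            count=total,
            mean=self.mean + delta * other.count / total,
            m2=self.m2 + other.m2
            + delta * delta * self.count * other.count / total,
            minimum=min(self.minimum, other.minimum),
            maximum=max(self.maximum, other.maximum),
        )


# Hypothetical durations (seconds) for one source line on two processors.
proc0, proc1 = Summary(), Summary()
for d in (1.0, 2.0, 3.0):
    proc0.add(d)
for d in (4.0, 5.0):
    proc1.add(d)
combined = proc0.merge(proc1)
print(combined.count, combined.mean, round(combined.std, 4))
```

Each summary stays a fixed size however often the construct executes, which is the practical advantage of statistics over traces.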
SvPablo model
[Figure: SvPablo model diagram]
New project dialog box
HPF performance analysis data flow
[Figure: data flow diagram. HPF source code → PGI HPF compiler → instrumented object code → Linker (with the SvPablo data capture library) → instrumented executable → parallel architecture → per-process performance files → SvPabloCombine → performance file → graphical performance browser]
HPF instrumentation
• pghpf -c -Mprof=lines source1.F
• pghpf -c -Mprof=lines source2.F
• pghpf -Mstats -o prog source1.o source2.o /usr/local/SvPablo/lib/pghpf2SDDF.o
• prog -pghpf -np 8
• SvPabloCombine HPF_SDDF*
Performance visualization
Metrics: count & exclusive duration
Performance metric selection dialog
C / F77/ F90 data flow
[Figure: data flow diagram. Create or edit project → instrument C or Fortran files → instrumented source code → compiler → instrumented object code → Linker (with the SvPablo data capture library) → instrumented executable → parallel architecture → per-process performance files → SvPabloCombine → performance file → visualize performance file]
Interactive instrumentation
Instrumentable constructs (function calls and outer loops)
Generating an instrumented executable program
• mpicc -c file1.Context1.inst.c
• mpicc -c file2.Context1.inst.c
• mpicc -c Context1/InstrumentationInit.c
• mpicc -o instFile InstrumentationInit.o file1.Context1.inst.o file2.Context1.inst.o svPabloLib.a
SDDF: a medium of exchange
Self-Defining Data Format
– data meta-format language for performance data description
– specifies both data record structures and data record instances
– separates data structure and semantics
– allows the definition of records containing scalars and arrays
– supported by the Pablo SDDF library
SDDF files: classes of records
• Command: conveys action to be taken
• Stream Attribute: gives information pertinent to the entire file
• Record Descriptor: declares record structure
• Record Data: encapsulates data values
Record descriptors
Describe record layout
• Each Record Descriptor contains:
– A unique tag and record name
– An optional Record Attribute
– Field Descriptors, each one containing:
  • an optional Field Attribute
  • field type specifier
  • field name
  • optional field dimension
SDDF: record descriptor & data
#300:
// "description" "PGI Line-Based Profile Record"
"PGI Line Profile" {
    int "Line Number";
    int "Processor Number"[];
    int "Procedure ID";
    int "Count";
    double "Inclusive Seconds";
    double "Exclusive Seconds";
    int "Send Data Count";
    int "Send Data Byte";
    double "Send Data Seconds";
    int "Receive Data Count";
    int "Receive Data Byte";
    double "Receive Data Seconds";
};

"PGI Line Profile" {359, [2]{7,9}, 4, 399384, 31.071, 31.071, 0, 0, 0, 0, 0, 0};;

Callouts: #300 is the tag, "PGI Line Profile" is the record name, and the typed entries inside the braces are the field descriptors.
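To make the descriptor/data split concrete, here is a minimal Python sketch that pairs the field names from the slide with the values of the sample data record and prints the record in the same textual layout. It only mimics what is visible on the slide. It is not the Pablo SDDF library, and the names `FIELDS`, `format_value`, `format_record`, and `label_record` are hypothetical.

```python
# Field descriptors copied from the slide: (type, name, is_array).
FIELDS = [
    ("int", "Line Number", False),
    ("int", "Processor Number", True),  # declared with [] on the slide
    ("int", "Procedure ID", False),
    ("int", "Count", False),
    ("double", "Inclusive Seconds", False),
    ("double", "Exclusive Seconds", False),
    ("int", "Send Data Count", False),
    ("int", "Send Data Byte", False),
    ("double", "Send Data Seconds", False),
    ("int", "Receive Data Count", False),
    ("int", "Receive Data Byte", False),
    ("double", "Receive Data Seconds", False),
]


def format_value(value):
    """Arrays are written as [length]{v1,v2,...}; scalars as plain text."""
    if isinstance(value, (list, tuple)):
        return "[%d]{%s}" % (len(value), ",".join(str(v) for v in value))
    return str(value)


def format_record(name, values):
    """Render one data record in the layout shown on the slide."""
    return '"%s" {%s};;' % (name, ", ".join(format_value(v) for v in values))


def label_record(fields, values):
    """Attach descriptor field names to the bare values of a data record."""
    if len(fields) != len(values):
        raise ValueError("record does not match its descriptor")
    return {name: value for (_type, name, _arr), value in zip(fields, values)}


# Values of the sample record from the slide.
values = [359, [7, 9], 4, 399384, 31.071, 31.071, 0, 0, 0, 0, 0, 0]
line = format_record("PGI Line Profile", values)
labeled = label_record(FIELDS, values)
print(line)
print(labeled["Line Number"], labeled["Exclusive Seconds"])
```

Because the descriptor travels with the data, a reader such as the browser needs no built-in knowledge of the record layout.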
SvPablo language transparency
Meta-format for performance data
• language defined by line and byte offsets
• metrics defined by mapping to offsets
• SDDF records
– performance mapping information
– performance measurements
Result
• language independent performance browser
• mechanism for scalability model integration
SvPablo conclusions
Versatility – yes
• analysis GUI is quite versatile and provides the ability to define new modules, but has a steep learning curve
• theoretically, any type of view could be constructed from the toolkit provided
Portability – not quite
• intended for a wide range of parallel platforms and programming languages, but in practice the GUI supports only Sun and SGI
Scalability – some
• Pablo trace library monitors and dynamically alters the volume, frequency, and types of event data recorded
• not clear how: automatically or by user at low level?
• need to integrate predictions
Autopilot - a performance steering toolkit
Provides flexible infrastructure for real-time adaptive control of parallel and distributed computing resources
Department of Computer Science
University of Illinois at Urbana-Champaign
Autopilot outline
• Background
• Autopilot overview
• Autopilot components
• Conclusions
Autopilot background
• HPC: from single parallel systems to distributed collections of heterogeneous sequential and parallel systems
• emerging applications are irregular
– have complex, data dependent execution behavior
– dynamic, with time varying resource demands
• failure to recognize that resource allocation and management must evolve with applications
Consequence: small changes in application structure can lead to large changes in observed performance.
Autopilot background
• interactions between application and system resources change
– across applications
– during a single application's execution
Autopilot approach: create adaptable
– runtime libraries
– resource management policies
Autopilot overview
After the integration of
– dynamic performance instrumentation
– on-the-fly performance data reduction
– configurable, malleable resource management algorithms
– real-time adaptive control mechanism
we have an adaptive resource management infrastructure
Given:
– application request patterns
– observed system performance
Automatically choose & configure resource management algorithms:
– increase portability
– increase achieved performance
Autopilot components
1. Autopilot - implements the core features of the Autopilot system.
2. Fuzzy Library - needed to build the classes supporting the fuzzy logic decision procedure infrastructure
3. Autodriver - provides a graphical user interface (written in Java)
4. Performance Monitor - provides tools to retrieve and record various system performance statistics on a set of machines.
1 Autopilot component
• libAutopilot.a – creation, registration, and use of
– sensors
– actuators (enable and configure resource management policies)
– decision procedures
• AutopilotManager - a utility program which displays the sensors and actuators currently registered with the Autopilot Manager
2 Fuzzy library component
• Fuzzy Rules to C++ translator
• related classes used by the Autopilot fuzzy logic decision procedure infrastructure.
3 Autodriver component
• Autopilot Adapter program
– provides a Java interface to Autopilot (must run on UNIX)
• Java GUI
– talks to Autopilot through the Adapter
– allows a user to monitor and interact with live sensors and actuators (runs on any platform that supports Java)
4 Performance monitor component
Two kinds of processes
• Collectors
– run on the machines to be monitored
– capture quantitative application and system performance data
• Recorders
– compute performance metrics
– record or output them
Collectors and recorders communicate via the Autopilot component
Closed loop adaptive control
[Figure: closed control loop. Sensors on the system supply inputs to a fuzzifier; a fuzzy logic decision process, drawing on a knowledge repository of fuzzy sets and rules, feeds a defuzzifier whose outputs drive actuators on the system.]
Illinois Autopilot Toolkit (Reed et al.)
• Real-time measurement
• Globus integration
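The loop on this slide (sensor, fuzzifier, rules, defuzzifier, actuator) can be sketched in a few lines of Python. This is an illustration of the general technique only, not Autopilot's API. The sensor (a cache hit ratio), the triangular membership functions, the rule outputs, and the weighted-average defuzzification are all assumptions chosen for the example.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify(hit_ratio):
    """Map a crisp sensor reading in [0, 1] to degrees of membership."""
    x = min(1.0, max(0.0, hit_ratio))  # clamp out-of-range readings
    return {
        "low": triangle(x, -0.5, 0.0, 0.5),
        "medium": triangle(x, 0.0, 0.5, 1.0),
        "high": triangle(x, 0.5, 1.0, 1.5),
    }


# Hypothetical rule base: IF hit ratio is <set> THEN grow the cache by <MB>.
RULES = {"low": 8.0, "medium": 2.0, "high": 0.0}


def decide(hit_ratio):
    """Defuzzify with a membership-weighted average of the rule outputs."""
    memberships = fuzzify(hit_ratio)
    total = sum(memberships.values())
    return sum(memberships[s] * RULES[s] for s in RULES) / total


class CacheActuator:
    """Stand-in actuator: adjusts a resource management parameter."""

    def __init__(self, size_mb):
        self.size_mb = size_mb

    def apply(self, delta_mb):
        self.size_mb += delta_mb


def control_loop(sensor_readings, actuator):
    """One decision per sensor reading; the actuator closes the loop."""
    for reading in sensor_readings:
        actuator.apply(decide(reading))
    return actuator.size_mb


final_size = control_loop([0.0, 0.25, 1.0], CacheActuator(16.0))
print(final_size)
```

In a real closed loop the actuator's change would influence the next sensor reading; here the readings are a fixed list so that the example stays deterministic.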
Autopilot conclusions
• Goal is creation of an infrastructure for building resilient, distributed and parallel applications.
• allow the creation of software that can change its behavior and optimize its performance in response to real-time data
– on software dynamics
– performance
• order of magnitude performance improvements
Paradyn
performance measurement tool for parallel and distributed programs
Computer Science,
University of Wisconsin
Paradyn outline
• Motivations
• Approach
• Performance Consultant
• Conclusions
Paradyn motivations
• provide a performance measurement tool that scales to long-running programs on large parallel and distributed systems
• automate much of the search for performance bottlenecks
• avoid the space and time overhead typically associated with trace-based tools.
• go beyond post-mortem analysis
Paradyn approach
Dynamic instrumentation
– based on dynamically controlling what performance data is to be collected
– allows data collection instructions to be inserted into an application program during runtime
Paradyn
• dynamically instruments the application
• automatically controls the instrumentation in search of performance problems
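Paradyn inserts instrumentation into the running program's code. As a loose analogy only, the Python sketch below inserts and removes a timing probe on a method while the program runs, so that measurement cost is paid only while the probe is in place. The `Instrumenter` and `Solver` classes are hypothetical, and replacing a Python attribute is a stand-in for, not an implementation of, runtime code patching.

```python
import time


class Instrumenter:
    """Insert and remove timing probes on class methods at run time."""

    def __init__(self):
        self.originals = {}  # (id(owner), name) -> original function
        self.stats = {}      # name -> [call count, total seconds]

    def insert(self, owner, name):
        original = getattr(owner, name)
        record = self.stats.setdefault(name, [0, 0.0])

        def probe(*args, **kwargs):
            start = time.perf_counter()
            try:
                return original(*args, **kwargs)
            finally:
                record[0] += 1
                record[1] += time.perf_counter() - start

        self.originals[(id(owner), name)] = original
        setattr(owner, name, probe)

    def remove(self, owner, name):
        # Restore the uninstrumented function; later calls cost nothing extra.
        setattr(owner, name, self.originals.pop((id(owner), name)))


class Solver:
    def step(self, n):
        return sum(i * i for i in range(n))


solver = Solver()
inst = Instrumenter()
uninstrumented = Solver.step

solver.step(10)               # runs before any probe exists: not measured
inst.insert(Solver, "step")   # start collecting data mid-run
solver.step(10)
solver.step(10)
inst.remove(Solver, "step")   # stop collecting
result = solver.step(10)      # not measured again

print(inst.stats["step"][0], result)
```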
Paradyn model
• the Paradyn front-end and user interface
– display performance visualizations
– use the Performance Consultant to find bottlenecks
– start and stop the application
– monitor the status of the application
• the Paradyn daemons
– monitor and instrument the application processes
Performance consultant module
• automatically directs the placement of instrumentation
• has a knowledge base of performance bottlenecks and program structure
• can associate bottlenecks with specific causes and with specific parts of a program.
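One way to picture a search that directs instrumentation is top-down refinement: test a hypothesis on the whole program, and instrument a component's children only if the hypothesis holds for that component. The sketch below assumes a single hypothetical hypothesis ("CPU fraction exceeds a threshold") and made-up measurements. It illustrates the idea only and is not the Performance Consultant's actual algorithm or data model.

```python
from dataclasses import dataclass, field


@dataclass
class Focus:
    """A program component with a hypothetical measured CPU fraction."""
    name: str
    cpu_fraction: float
    children: list = field(default_factory=list)


def search(focus, threshold, instrumented):
    """Return the most specific components for which the hypothesis is true.

    `instrumented` collects every component that had to be measured, to show
    that pruned subtrees are never instrumented.
    """
    instrumented.append(focus.name)
    if focus.cpu_fraction < threshold:
        return []  # hypothesis false here: do not refine further
    refined = []
    for child in focus.children:
        refined += search(child, threshold, instrumented)
    # If no child explains the problem, report this component itself.
    return refined or [focus.name]


program = Focus("program", 1.0, [
    Focus("moduleA", 0.7, [Focus("solve", 0.6), Focus("init", 0.1)]),
    Focus("moduleB", 0.1, [Focus("io", 0.1)]),
])

instrumented = []
bottlenecks = search(program, threshold=0.2, instrumented=instrumented)
print(bottlenecks, instrumented)
```

The procedure `io` is never measured because its parent already tested false, which is the sense in which the search keeps instrumentation small.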
Paradyn runtime
Concepts for performance data analysis/presentation
1. metric-focus grid – cross-product of two vectors
– list of performance metrics (CPU time, blocking time…)
– list of program components (procedures, processors, disks)
– elements of the matrix can be single-valued (e.g., current value, average, min, or max) or time-histograms
2. time-histogram – fixed-size data structure recording behavior of a metric as it varies over time
Performance data granularity
– global phase
– local phase
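A fixed-size structure can still cover an arbitrarily long run if it coarsens its time resolution when it fills up. The Python sketch below assumes that approach: when a sample falls past the last bin, adjacent bins are merged pairwise and the bin width doubles. The slide only states that the time-histogram is fixed-size, so the folding rule, the `TimeHistogram` class, and the metric and focus names are assumptions made for illustration.

```python
class TimeHistogram:
    """Fixed number of bins; bin width doubles when time runs past the end."""

    def __init__(self, num_bins=8, bin_width=1.0):
        if num_bins < 2 or num_bins % 2:
            raise ValueError("num_bins must be an even number >= 2")
        self.bins = [0.0] * num_bins
        self.bin_width = bin_width

    def add(self, t, value):
        """Accumulate `value` for a sample taken at time t >= 0."""
        if t < 0:
            raise ValueError("time must be non-negative")
        while t >= len(self.bins) * self.bin_width:
            self._fold()
        self.bins[int(t // self.bin_width)] += value

    def _fold(self):
        # Merge neighbouring bins pairwise; the freed upper half starts empty.
        n = len(self.bins)
        merged = [self.bins[i] + self.bins[i + 1] for i in range(0, n, 2)]
        self.bins = merged + [0.0] * (n // 2)
        self.bin_width *= 2


# Metric-focus grid: one cell per (metric, program component) pair.
metrics = ["cpu_time", "blocking_time"]
foci = ["procedure:solve", "processor:0"]
grid = {(m, f): TimeHistogram(num_bins=4) for m in metrics for f in foci}

cell = grid[("cpu_time", "procedure:solve")]
for t in (0.5, 1.5, 2.5, 3.5):
    cell.add(t, 1.0)
cell.add(4.5, 1.0)  # past the last bin: triggers one fold

print(cell.bins, cell.bin_width)
```

Memory use is constant per cell, at the cost of coarser resolution for long runs.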
Performance consultant
Wisconsin Paradyn Toolkit (Miller et al.)
[Figure: Performance Consultant search display; legend: unknown, true, false]
XPVM
Graphical console and monitor for PVM
developed at the Oak Ridge National Lab
• Provides a graphical user interface to the PVM console commands
• Provides several animated views to monitor the execution of PVM programs
XPVM overview
• XPVM generates trace records during PVM program execution. The resulting trace file is used to play back a program's execution.
• The XPVM views provide information about the interactions among tasks in a parallel PVM program, to assist in debugging and performance tuning.
• XPVM writes a Pablo self-defining trace file
![Page 71: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/71.jpg)
XPVM menus
• Host menu: configures the parallel virtual machine by adding or removing hosts
• Tasks menu: spawns, signals, or kills PVM processes, and can monitor selected PVM system tasks, such as the group server process
![Page 72: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/72.jpg)
XPVM menus
• Reset menu: resets the parallel virtual machine, the XPVM views, or the trace file
• Help menu: provides help features
• Views menu: selects any of the five XPVM displays for monitoring program execution
![Page 73: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/73.jpg)
XPVM menus
• Trace file playback controls - play, step forward, stop, or reset the execution trace file
• Trace file selection window - displays the name of the current trace file
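The playback controls above (play, step forward, stop, reset) can be pictured as operations on a cursor into a time-ordered list of trace records. The sketch below is illustrative only: `TraceRecord`, `TracePlayback`, and the record fields are hypothetical names, and it does not read XPVM's actual trace file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    timestamp: float
    task: str    # hypothetical task label
    event: str   # hypothetical event name


class TracePlayback:
    """Cursor over trace records with play/step/stop/reset controls."""

    def __init__(self, records):
        # Replay in timestamp order, whatever order records arrived in.
        self.records = sorted(records, key=lambda r: r.timestamp)
        self.position = 0
        self.playing = False

    def step(self):
        """Return the next record, or None at the end of the trace."""
        if self.position >= len(self.records):
            self.playing = False
            return None
        record = self.records[self.position]
        self.position += 1
        return record

    def play(self, on_record):
        """Feed records to `on_record` until stopped or exhausted."""
        self.playing = True
        while self.playing:
            record = self.step()
            if record is None:
                break
            on_record(record)

    def stop(self):
        self.playing = False

    def reset(self):
        """Stop and rewind to the start of the trace."""
        self.playing = False
        self.position = 0


playback = TracePlayback([
    TraceRecord(2.0, "worker", "recv"),
    TraceRecord(1.0, "master", "send"),
])
playback.play(lambda r: print(r.timestamp, r.task, r.event))
```

Because views only consume records through the callback, the same controller can drive any number of displays from one trace file.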
![Page 74: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/74.jpg)
XPVM views (5)
Network
• Displays high-level activity on each node in the virtual machine
• Each host is represented by an icon image showing host name and architecture
• Icons are color illuminated to indicate status
– Active - at least one task on that host is doing useful work
– System - no tasks are doing user work and at least one task is busy executing PVM system routines
– No tasks
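The three icon states can be read as a simple rule over the tasks running on a host. The sketch below is one way to express that rule. `TaskState` and `host_status` are hypothetical names, and the assumption is that each task is either doing user work or executing PVM system routines at the instant being displayed.

```python
from enum import Enum


class TaskState(Enum):
    USER = "user"       # doing useful user work
    SYSTEM = "system"   # executing PVM system routines


def host_status(task_states):
    """Classify a host from the states of the tasks on it."""
    states = list(task_states)
    if not states:
        return "No tasks"
    if TaskState.USER in states:
        # At least one task on the host is doing useful work.
        return "Active"
    # Tasks exist, none doing user work, so all are in system routines.
    return "System"


print(host_status([TaskState.SYSTEM, TaskState.USER]))  # Active
print(host_status([TaskState.SYSTEM]))                  # System
print(host_status([]))                                  # No tasks
```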
![Page 75: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/75.jpg)
Network
![Page 76: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/76.jpg)
Space time
Shows status of all tasks as they execute across all hosts
• Computing - executing useful user computations
• Overhead - executing PVM system routines for communication, task control, etc.
• Waiting - waiting for messages from other tasks
• Message - indicates communications between tasks
![Page 77: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/77.jpg)
Space time
![Page 78: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/78.jpg)
Utilization
• Summarizes the Space-Time view at each instant by showing the aggregate number of tasks computing, in overhead or waiting for a message.
• Shares same horizontal time scale as the Space-Time view
• Zooming-in
• Zooming-out
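To illustrate how the Utilization view summarizes the Space-Time view, the sketch below counts, at one instant, how many tasks are computing, in overhead, or waiting. The interval tuples and the function name `utilization_at` are assumptions made for this example; they are not XPVM data structures.

```python
def utilization_at(intervals, t):
    """Count tasks per state at time t.

    `intervals` is an iterable of (task, start, end, state) tuples, where
    state is "computing", "overhead", or "waiting" and the interval covers
    start <= time < end.
    """
    counts = {"computing": 0, "overhead": 0, "waiting": 0}
    for _task, start, end, state in intervals:
        if start <= t < end:
            counts[state] += 1
    return counts


# Hypothetical Space-Time data for three tasks.
space_time = [
    ("task0", 0.0, 4.0, "computing"),
    ("task1", 0.0, 2.0, "overhead"),
    ("task1", 2.0, 4.0, "waiting"),
    ("task2", 1.0, 3.0, "computing"),
]
print(utilization_at(space_time, 2.5))
# {'computing': 2, 'overhead': 0, 'waiting': 1}
```

Evaluating this function across the shared horizontal time scale yields the stacked counts that the Utilization view plots.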
![Page 79: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/79.jpg)
Utilization
![Page 80: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/80.jpg)
Call trace
• Displays each task's most recent PVM call
• Changes as program executes
• Useful for debugging
• Clicking on a task in the scrolling task list will display that task's full name and TID
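A small sketch of the bookkeeping behind a call-trace display: keep only the most recent PVM call seen for each task as trace events arrive. The function name `latest_calls` and the task IDs are made up for this example. The call names are real PVM library routines, but the event stream itself is invented.

```python
def latest_calls(events):
    """Map each task ID to its most recent PVM call.

    `events` is an iterable of (tid, call_name) pairs in chronological order.
    """
    latest = {}
    for tid, call in events:
        latest[tid] = call  # later events overwrite earlier ones
    return latest


# Hypothetical event stream for two tasks.
events = [
    (0x40001, "pvm_mytid"),
    (0x40002, "pvm_mytid"),
    (0x40001, "pvm_send"),
    (0x40002, "pvm_recv"),
]
for tid, call in latest_calls(events).items():
    print(hex(tid), call)
```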
![Page 81: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/81.jpg)
Call trace
![Page 82: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/82.jpg)
Task output
• Provides a view of output (stdout) generated by tasks in a scrolling window
• Can be saved to a file at any point
![Page 83: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/83.jpg)
Concluding remarks
• System complexity is rising fast
– computational grids
– multidisciplinary applications
– performance tools
• There are many open problems
– adaptive optimization
– performance prediction
– compiler/tool integration
– performance “quality of service” (QoS)
![Page 84: 5/30/00CSE 225 Performance Analysis Tools Nadya Williams Spring, 2000 UCSD](https://reader037.vdocuments.net/reader037/viewer/2022110322/56649d425503460f94a1ced0/html5/thumbnails/84.jpg)
Concluding remarks
– the software problems are large & cannot be solved in isolation
– open source collaboration
– vendors, laboratories, and academics
– technology assessment