Gabriel Antoniu <[email protected]>
KerData Project-Team, Inria Rennes - Bretagne Atlantique
Damaris: How To Efficiently Leverage Multicore Parallelism To Achieve Scalable, Jitter-Free I/O And Non-Intrusive In-Situ Visualization
Maison de la Simulation, Saclay, January 2014
Work involving Matthieu Dorier (IRISA, ENS Cachan), Gabriel Antoniu (INRIA), Marc Snir (ANL), Franck Cappello (ANL), Leigh Orf (CMICH), Dave Semeraro (NCSA, UIUC), Roberto Sisneros (NCSA, UIUC), Tom Peterka (ANL)
Context: HPC simulations on Blue Waters
• INRIA-UIUC-ANL Joint Lab for Petascale Computing
• Blue Waters: sustained petaflop performance
• Large-scale simulation at unprecedented accuracy
• Terabytes of data produced every minute
Big Data challenge on post-petascale machines
• How to efficiently store and move data?
• How to index, process, and compress these data?
• How to analyze, visualize, and finally understand them?
Outline
1. Big Data challenge at exascale
2. The Damaris approach
3. Achieving scalable I/O
4. Non-impacting in-situ visualization
5. Conclusion
1. Big Data challenges at exascale: when parallel file systems don't scale anymore
The standard I/O flow: offline data analysis
Periodic data generation from the simulation (100,000+ cores)
→ storage in a parallel file system (Lustre, PVFS, GPFS, …)
→ offline data analysis on another cluster (~10,000 cores)
Petabytes of data flow through this pipeline.
Writing from HPC simulations: File per process vs. Collective I/O
Two main approaches to I/O in HPC simulations:

File-per-process:
• Too many files
• High metadata overhead
• Hard to read back

Collective I/O (implemented in MPI-I/O, available in HDF5, NetCDF, …):
• Requires coordination
• Data communications
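To make the contrast concrete, here is a minimal sketch of the two patterns in plain MPI (illustrative only, not the CM1 code; file names and sizes are made up):

/* Sketch: file-per-process vs. collective I/O (illustrative only).
   Build with an MPI compiler, e.g. mpicc. */
#include <mpi.h>
#include <stdio.h>

#define N 1024                 /* elements per process (made-up size) */

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double data[N];
    for (int i = 0; i < N; i++) data[i] = rank + i * 1e-6;

    /* File-per-process: no coordination, but one file (and its
       metadata traffic) per MPI rank. */
    char fname[64];
    snprintf(fname, sizeof(fname), "out_%06d.dat", rank);
    FILE* f = fopen(fname, "wb");
    if (f) {
        fwrite(data, sizeof(double), N, f);
        fclose(f);
    }

    /* Collective I/O: one shared file; the ranks coordinate and
       write to disjoint offsets (MPI-I/O, as pHDF5 does underneath). */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "out_shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_Offset offset = (MPI_Offset)rank * N * sizeof(double);
    MPI_File_write_at_all(fh, offset, data, N, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}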
Periodic synchronous snapshots lead to I/O bursts and high variability
[Figure: the “cardiogram” of a PVFS data server during a run of the CM1 simulation, showing input and output throughput over time]
Visualizing throughput variability: between cores and between iterations.
2. The Damaris approach: dedicating cores to enable scalable asynchronous I/O
Writing without Damaris
[Diagram: all the cores of an SMP node share the main memory and push their data to the network at the same time]
Huge access contention, degraded performance.
With Damaris: using dedicated I/O cores
[Diagram: the compute cores of an SMP node write to shared memory; a dedicated core processes the data and sends it to the network]
• Dedicate cores to I/O
• Write in shared memory
• Process data asynchronously
• Produce results
• Output results
Leave a core, go faster!
Moving I/O to a dedicated core
[Diagram comparing time-partitioning (all cores periodically stop to write) with space-partitioning (dedicated cores handle I/O while the others keep computing)]
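For readers who want to see the space-partitioning idea in code, here is a minimal sketch using plain MPI-3; it illustrates the pattern only, not Damaris internals, and the choice of node-local rank 0 as the I/O core is an assumption:

/* Sketch: space-partitioning with MPI-3 (pattern only, not Damaris
   internals). One core per node is set aside for I/O. */
#include <mpi.h>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Group the ranks that share a node (i.e. shared memory). */
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);
    int node_rank;
    MPI_Comm_rank(node_comm, &node_rank);

    /* Node-local rank 0 becomes the dedicated I/O core (assumption);
       the other cores keep computing. */
    int is_io_core = (node_rank == 0);
    MPI_Comm role_comm;
    MPI_Comm_split(MPI_COMM_WORLD, is_io_core, world_rank, &role_comm);

    if (is_io_core) {
        /* ... drain snapshots from node-local shared memory and
           write them asynchronously ... */
    } else {
        /* ... run the simulation; hand snapshots to the I/O core
           instead of hitting the file system directly ... */
    }

    MPI_Comm_free(&role_comm);
    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}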
Damaris at a glance
• Dedicated Adaptable Middleware for Application Resources Inline Steering
• Main idea: dedicate one or a few cores in each SMP node for data management
• Features:
  - Shared-memory-based communications
  - Plugin system (C, C++, Python)
  - Connection to visualization engines (e.g. VisIt, ParaView)
  - External XML description of data
Damaris: current state of the software
• Version 0.7.2 available at http://damaris.gforge.inria.fr/ (1.0 available soon), along with documentation, tutorials, examples, and a demo!
• Written in C++; uses Boost for IPC, and Xerces-C and XSD for XML parsing
• API for Fortran, C, and C++
• Tested on: Grid’5000 (Linux Debian), Kraken (Cray XT5, NICS), Titan (Cray XK6, Oak Ridge), JYC and Blue Waters (Cray XE6, NCSA), Surveyor and Intrepid (Blue Gene/P, Argonne)
• Tested with: CM1 (climate), OLAM (climate), GTC (fusion), Nek5000 (CFD)
3. Results: achieving scalable I/O
Running the CM1 simulation on Kraken, G5K and BluePrint with Damaris
• The CM1 simulation
  - Atmospheric simulation, one of the Blue Waters target applications
  - Uses HDF5 (file-per-process) and pHDF5 (for collective I/O)
• Kraken: Cray XT5 at NICS, 12 cores/node, 16 GB/node, Lustre file system
• Grid’5000: 24 cores/node, 48 GB/node, PVFS file system
• BluePrint: Power5, 16 cores/node, 64 GB/node, GPFS file system
Damaris achieves almost perfect scalability
[Plot: weak scalability factor (0 to 10,000) vs. number of cores (up to 10,000), comparing perfect scaling, Damaris, file-per-process, and collective I/O]
Weak scalability factor: S = N × Tbase / T, where N is the number of cores, Tbase the time of an iteration on one core without a write, and T the time of an iteration plus a write.
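With purely illustrative numbers (not measurements from the talk): if an iteration takes Tbase = 10 s without I/O and T = 12.5 s once the write is included, then on N = 1024 cores S = 1024 × 10 / 12.5 ≈ 819, whereas perfect scaling (T = Tbase) would give S = N = 1024.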
[Plot: application run time in seconds (50 iterations + 1 write phase) at 576, 2304, and 9216 cores on Kraken (Cray XT5)]
[Left plot: average and maximum time to write (seconds) at 576, 2304, and 9216 cores on Kraken (Cray XT5), with 28 MB per process, comparing collective I/O, file-per-process, and Damaris]
[Right plot: average, maximum, and minimum write time (seconds) vs. total amount of data (GB, up to 30) on BluePrint (Power5, 1024 cores)]
Damaris hides the I/O jitter
Damaris increases effective throughput
[Plot: average aggregate throughput from the writer processes (GB/s, log scale from 0.0625 to 16) vs. number of cores on Kraken (Cray XT5), comparing file-per-process, Damaris, and collective I/O]
More results in: M. Dorier, G. Antoniu, F. Cappello, M. Snir, L. Orf, “Damaris: How to Efficiently Leverage Multicore Parallelism to Achieve Scalable, Jitter-free I/O”, in Proceedings of IEEE CLUSTER 2012, Beijing, China.
[Plots: time spent by Damaris writing data (“used time”) vs. time spent waiting (“spare time”), on Kraken (Cray XT5, at 576, 2304, and 9216 cores) and on BluePrint (Power5, 1024 cores, for 0.05 to 24.7 GB of data)]
Damaris spares time? Let’s use it!
Damaris spares time for data management
4. Recent work, non-impacting in-situ visualization: getting insights from running simulations
From offline to coupled visualization
• Offline approach
  - I/O performance issues in the simulation
  - I/O performance issues in the visualization software
  - Too much data!
• Coupled approach
  - Direct insight into the simulation
  - Bypasses the file system
  - Interactive, BUT hardly accepted by users
Towards in-situ visualization
• Loosely coupled strategy
  - Visualization runs on a separate, remote set of resources
  - Partially or fully asynchronous
  - Solutions include staging areas and file-format wrappers (HDF5 DSM, ADIOS, …)
• Tightly coupled strategy
  - Visualization is collocated with the simulation
  - Synchronous (time-partitioning): the simulation periodically stops
  - Implemented by instrumenting the simulation code
  - Memory-constrained
Four main challenges
User friendliness:
• Low impact on the simulation code
• Adaptability (to different simulations and visualization scenarios)

Performance:
• Low impact on the simulation run time
• Good resource utilization (low memory footprint, use of GPUs, …)
In-situ visualization strategies
Coupling             Tight   Loose        ?
Impact on code       High    Low          Minimal
Adaptability
Interactivity        Yes     None         Yes
Instrumentation      High    Low          Minimal
Impact on run time   High    Low          Minimal
Resource usage       Good    Non-optimal  Better
• Researchers seldom accept tightly coupled in-situ visualization
  - Because of the development overhead, the performance impact, …
  - “Users are stupid, greedy, lazy slobs” [1]
• Is there a solution achieving all these goals? Yes: Damaris

[1] D. Thompson, N. Fabian, K. Moreland, L. Ice, “Design Issues for Performing In Situ Analysis of Simulation Data”, Technical Report, Sandia National Laboratories.
Connect to a visualization backend
[Diagram: compute nodes and the file system, with the dedicated cores also feeding a visualization backend directly]
Interact with your simulation
[Diagram: the visualization tool connects back through the dedicated cores, letting the user interact with the running simulation]
Let’s take a representative example
// rectilinear grid coordinates
float mesh_x[NX];
float mesh_y[NY];
float mesh_z[NZ];

// temperature field
double temperature[NX][NY][NZ];
“Instrumenting” with Damaris
Damaris_write("mesh_x", mesh_x);
Damaris_write("mesh_y", mesh_y);
Damaris_write("mesh_z", mesh_z);

Damaris_write("temperature", temperature);
(Yes, that’s all)
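To see where these calls sit, here is a hedged sketch of a simulation main loop. Only Damaris_write appears on the slides; the header name and the commented-out init / end-of-iteration / finalize calls are placeholders for whatever the installed Damaris version provides:

#include "damaris.h"   /* header name assumed; adjust to your install */

#define NX 4
#define NY 4
#define NZ 4

float  mesh_x[NX], mesh_y[NY], mesh_z[NZ];  /* rectilinear coordinates */
double temperature[NX][NY][NZ];             /* temperature field */

void compute_one_step(void) { /* ... advance the simulation ... */ }

void simulate(int n_steps) {
    /* damaris_init("config.xml");  -- hypothetical initialization */
    for (int step = 0; step < n_steps; step++) {
        compute_one_step();

        /* Expose the snapshot: data lands in shared memory and is
           processed asynchronously by the dedicated cores. */
        Damaris_write("mesh_x", mesh_x);
        Damaris_write("mesh_y", mesh_y);
        Damaris_write("mesh_z", mesh_z);
        Damaris_write("temperature", temperature);

        /* damaris_end_iteration();  -- hypothetical: signals that
           this iteration's data is complete */
    }
    /* damaris_finalize();  -- hypothetical shutdown */
}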
Now describe your data in an XML file
<parameter name="NX" type="int" value="4"/>
<layout name="px" type="float" sizes="NX"/>
<variable name="mesh_x" layout="px"/>
<!-- idem for NY and NZ, py and pz, mesh_y and mesh_z -->

<layout name="data_layout" type="double" sizes="NX,NY,NZ"/>
<variable name="temperature" layout="data_layout" mesh="my_mesh"/>

<mesh type="rectilinear" name="my_mesh" topology="3">
  <coord name="mesh_x" unit="cm" label="width"/>
  <coord name="mesh_y" unit="cm" label="depth"/>
  <coord name="mesh_z" unit="cm" label="height"/>
</mesh>
• Unified data description for different visualization software
• Damaris translates this description into the right function calls to any such software (right now: Python and VisIt)
• Damaris handles the interactivity through dedicated cores!
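For comparison, here is roughly the kind of code the XML description saves you from writing by hand. This sketch uses VisIt's libsim (SimV2) data interface from memory, so treat the header name and exact signatures as approximations rather than as authoritative API documentation:

#include <VisItDataInterface_V2.h>  /* libsim data interface (header name assumed) */

#define NX 4
#define NY 4
#define NZ 4

extern float mesh_x[NX], mesh_y[NY], mesh_z[NZ];

/* Hand-built equivalent of the <mesh> block above: allocate a
   rectilinear mesh and attach the three coordinate arrays. */
visit_handle make_mesh(void)
{
    visit_handle mesh = VISIT_INVALID_HANDLE;
    if (VisIt_RectilinearMesh_alloc(&mesh) == VISIT_OKAY) {
        visit_handle hx, hy, hz;
        VisIt_VariableData_alloc(&hx);
        VisIt_VariableData_alloc(&hy);
        VisIt_VariableData_alloc(&hz);
        VisIt_VariableData_setDataF(hx, VISIT_OWNER_SIM, 1, NX, mesh_x);
        VisIt_VariableData_setDataF(hy, VISIT_OWNER_SIM, 1, NY, mesh_y);
        VisIt_VariableData_setDataF(hz, VISIT_OWNER_SIM, 1, NZ, mesh_z);
        VisIt_RectilinearMesh_setCoordsXYZ(mesh, hx, hy, hz);
    }
    return mesh;
}

With Damaris, this boilerplate (and the matching callbacks) is generated from the XML description, which is what the line-count comparison below measures.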
[Screenshot: VisIt interacting with the CM1 simulation through Damaris]
[Screenshot: a plugin exposing vector fields in the Nek5000 simulation with VTK]
Damaris has a low impact on the code
           VisIt       Damaris
Curve.c    144 lines   6 lines
Mesh.c     167 lines   10 lines
Var.c      271 lines   12 lines
Life.c     305 lines   8 lines

Number of lines of code required to instrument sample simulations with VisIt and with Damaris.
Nek5000:
• VTK: 600 lines of C and Fortran
• Damaris: 20 lines of Fortran, 60 of XML
Results with the Nek5000 simulation
• The Nek5000 simulation
  - CFD solver developed at ANL
  - Based on the spectral element method
  - Written in Fortran 77 and MPI
  - Scales to over 250,000 cores
• Data in Nek5000
  - A fixed set of elements constituting an unstructured mesh
  - Each element is a curvilinear mesh
• Damaris already knows how to handle curvilinear meshes and pass them to VisIt
• Tested on up to 384 cores with Damaris so far; the traditional approach does not even scale to this number!
Damaris removes the variability inherent to in-situ visualization tasks
Experiments done with the turbChannel test-case in Nek5000, on 48 cores (2 nodes) of Grid’5000’s Reims cluster.
Conclusion: using Damaris for in-situ analysis
Coupling             Tight   Loose        Damaris
Impact on code       High    Low          Minimal
Adaptability
Interactivity        Yes     None         Yes
Instrumentation      High    Low          Minimal
Impact on run time   High    Low          Minimal
Resource usage       Good    Non-optimal  Better
• Impact on code: 1 line per object (variable or event)
• Adaptability to multiple visualization software (Python, VisIt, ParaView, etc.)
• Interactivity through VisIt
• Impact on run time: the simulation run time is independent of the visualization
• Resource usage:
  - preserves the “zero-copy” capability of VisIt thanks to shared memory
  - can asynchronously use the GPUs attached to Cray XK6 nodes
More results in: M. Dorier, R. Sisneros, T. Peterka, G. Antoniu, D. Semeraro, “Damaris/Viz: a Nonintrusive, Adaptable and User-Friendly In Situ Visualization Framework”, in Proceedings of IEEE LDAV 2013 (IEEE Symposium on Large-Scale Data Analysis and Visualization), Atlanta, United States, October 2013.
Conclusion
The Damaris approach in a nutshell
• Dedicated cores
• Shared memory
• Highly adaptable system thanks to plugins and XML
• Connection to VisIt

Results on I/O
• Fully hides the I/O jitter and I/O-related costs
• 15x sustained write throughput compared to collective I/O
• Almost perfect scalability
• Execution time divided by 3.5 compared to collective I/O
• Enables a 600% compression ratio without any overhead

Recent work on in-situ visualization
• Efficient coupling of simulation and analysis tools
• Perfectly hides the run-time impact of in-situ visualization
• Minimal code instrumentation
• High adaptability
Thank you!