ERLANGEN REGIONAL COMPUTING CENTER
FEPA: Project status and further steps
J. Eitzinger, T. Röhl, W. Hesse, A. Jeutter, E. Focht
15.12.2015
Motivation

§ Cluster administrators employ monitoring to
  § Detect errors or faulty operation
  § Observe total system utilization
§ Application developers use (mostly GUI) tools to do performance profiling

Primary Target: Provide a monitoring infrastructure that allows continuous, system-wide application performance and energy profiling based on hardware performance counter measurements.

FEPA: A flexible framework for energy and performance analysis of highly parallel applications in the computing center
Objectives

§ Detect applications with pathological performance behavior
§ Help to identify applications with large optimization potential
§ Give users feedback about application performance
§ Ease access to hardware performance counter data
STATUS
RRZE (Thomas Röhl)

§ Support for new architectures: Intel Silvermont, Intel Broadwell and Broadwell-EP, Intel Skylake
§ Improved overflow detection (including RAPL)
§ Improved documentation with many new examples (Cilk+, C++11 threads)
§ More performance groups and validated metrics for many architectures
§ Improvements in likwid-bench and likwid-mpirun
§ New access layer to support platform-independent code (x86, Power, ARM)
NEC (Andreas Jeutter)

AggMon

[Architecture diagram: per-node collectors feed per-group aggregators; a controller, driven by job start/stop events from the resource scheduler, instantiates a per-job aggregator (and its program tagger) at job start to trigger aggregation and kills it when the job stops; aggregated data goes to per-group stores backed by NoSQL DBs with sharding and replication.]

§ Componentized
§ Fully distributed
§ Separate processes: truly parallel
§ Implemented in Python
§ Connected through ZeroMQ
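The component wiring above can be sketched with pyzmq. This is an illustrative sketch, not AggMon's actual code: the endpoint name and the message fields are assumptions; it only shows the PUSH/PULL pattern between a collector and an aggregator exchanging JSON-serialized dicts.

```python
# Minimal PUSH/PULL sketch of a collector feeding an aggregator.
# Endpoint name and message layout are hypothetical.
import threading
import zmq

ENDPOINT = "inproc://agg"  # hypothetical in-process endpoint

ctx = zmq.Context()
pull = ctx.socket(zmq.PULL)
pull.bind(ENDPOINT)        # aggregator side: bind before any connect

def collector():
    """Push one metric message per node downstream."""
    push = ctx.socket(zmq.PUSH)
    push.connect(ENDPOINT)
    for host in ("node01", "node02", "node03"):
        push.send_json({"host": host, "metric": "load", "value": 0.5})
    push.close()

t = threading.Thread(target=collector)
t.start()
# Aggregator side: pull and (trivially) aggregate the three messages.
values = [pull.recv_json()["value"] for _ in range(3)]
t.join()
pull.close()
ctx.term()
print(sum(values))  # 1.5
```

Because the processes are connected only through such sockets, each component can run as its own OS process and be restarted or scaled independently.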
AggMon: Collector

[Data-flow diagram: a gmond collector pushes messages via ZMQ PUSH into a ZMQ PULL queue; a tagger matches and modifies messages and publishes them via ZMQ PUSH; an RPC interface adds/removes tags and handles subscribe/unsubscribe. Throughput: O(50k) msg/s into the queue, O(10k) msg/s published.]

Messages: JSON-serialized dicts/maps
Tagger: adds a key-value pair to a message, based on a match condition
Subscribe: based on a match condition (key-value, key-value regex)
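The tagger and subscription semantics described above can be sketched in plain Python. This is a hedged illustration, not AggMon's implementation: the function names and message fields are made up; it only shows "add a key-value when a match condition holds" and "subscribe by key-value or key-value regex".

```python
# Sketch of tagger/subscription matching on dict messages (hypothetical code).
import re

def make_tagger(match, tag_key, tag_value):
    """Return a function that tags a message dict if all match conditions hold."""
    def tagger(msg):
        if all(msg.get(k) == v for k, v in match.items()):
            msg[tag_key] = tag_value
        return msg
    return tagger

def matches(subscription, msg):
    """Subscription match: each key's pattern must fully match the value."""
    return all(re.fullmatch(str(pat), str(msg.get(k, "")))
               for k, pat in subscription.items())

tag_rack1 = make_tagger({"host": "node01"}, "group", "rack1")
msg = tag_rack1({"host": "node01", "metric": "load", "value": 0.5})
print(msg["group"])                          # rack1
print(matches({"metric": "load|mem"}, msg))  # True
```

The regex form lets one subscription cover a family of metrics without enumerating them.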
AggMon: Data Store

§ TokuMX: MongoDB compatible
§ Collections can be sharded
  § Spread documents on different mongod instances
  § Entry point: any mongos instance
§ Replication (for example master-slave) is possible

[Deployment diagram: per-rack groups (rack1, rack2, rack3, ...), each with a group master running mongos and several mongod instances, plus a config server; documents are routed by a shard key such as { group: rack1, ... }; ingest rate O(10k) msg/s.]
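How a shard key routes documents can be illustrated in a few lines. This is a conceptual sketch, not TokuMX code: the shard map and host names are hypothetical; it only shows that the mongos entry point uses the shard key field to pick the owning mongod instance.

```python
# Hypothetical shard map: group value -> owning mongod instance.
shards = {
    "rack1": "mongod-a:27018",
    "rack2": "mongod-b:27018",
    "rack3": "mongod-c:27018",
}

def route(doc):
    """Mimic mongos routing: forward the document by its shard key."""
    return shards[doc["group"]]

doc = {"group": "rack1", "host": "node01", "metric": "load", "value": 0.5}
print(route(doc))  # mongod-a:27018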
LRZ (Wolfram Hesse, Carla Guillen)

Successful completion of C. Guillen's doctorate:
§ Validation of the performance patterns used
§ Statistical evaluation of the performance patterns
§ Documentation of the PerSyst monitoring system

Knowledge-based Performance Monitoring for Large Scale HPC Architectures; Dissertation, C. Guillen Carias; 2015; http://mediatum.ub.tum.de?id=1237547
LRZ: PerSyst Status

§ PerSyst monitoring is in production at SuperMUC Phase I + II
§ Definition and implementation of the performance patterns for Phase 1 (Westmere-EX, SandyBridge-EP) and Phase 2 (Haswell-EP)
§ Used and verified by:
  § The LRZ application support group and IBM staff
    › Notifying users when obvious bottlenecks are present, with suggestions for optimizations
    › Screening applications for extreme scaling and benchmarks
  § SuperMUC users
    › Positive feedback regarding usefulness
§ Implementation of the PerSyst web frontend at RRZE
ONGOING WORK
Integrate complete stack at RRZE
Validate performance patterns from profiling data
Current Questions

§ How to deal with established monitoring infrastructure (Ganglia)?
  § Easy: Use existing monitoring infrastructures
  § Target: Replace existing software with the FEPA stack
§ Concerns about the large overhead of continuous HPM profiling
  § Overhead could be lower with a better interface to HPM (ISA, OS)
  § Knowledge about overheads in general is still missing
§ Picking the right building blocks:
  § Backend daemon: diamond (https://github.com/python-diamond/Diamond)
  § Communication protocol: ZeroMQ (http://zeromq.org)
  § Storage: TokuMX (NoSQL)
Integration of FEPA components

§ Target system: 80-node Nehalem cluster system in normal production use

Objectives
§ Sort out issues between components
§ Validate and benchmark the solution:
  § diamond
  § mongoDB/TokuMX
  § Liferay framework based PerSyst frontend
§ Experiment on application profiling data
  § Required granularity for phase detection
  § Performance pattern validation on a set of known codes
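Why sampling granularity matters for phase detection can be shown with a toy example. This is a hedged sketch of the general idea, not FEPA's algorithm: the function, the threshold, and the sample values are all assumptions; it just splits a per-job metric series wherever the value jumps sharply.

```python
# Toy phase detection: a new phase starts where the metric jumps
# by more than a threshold between consecutive samples (hypothetical).
def detect_phases(series, threshold):
    """Return the indices where a new phase begins."""
    boundaries = [0]
    for i in range(1, len(series)):
        if abs(series[i] - series[i - 1]) > threshold:
            boundaries.append(i)
    return boundaries

# A compute-bound phase (high FLOP rate) between two I/O-ish phases:
flops = [1.0, 1.1, 1.0, 5.0, 5.2, 5.1, 1.0]
print(detect_phases(flops, 2.0))  # [0, 3, 6]
```

If the sampling interval is coarser than a phase's duration, adjacent samples average over the transition and the jump never exceeds the threshold, which is why the required granularity has to be determined experimentally.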
Conclusion and Outlook

§ Layers are ready to be integrated into the complete stack
§ Convergence on the choice of external building blocks
§ LRZ PerSyst system in production use

Next:
§ Continue integrating the stack to make FEPA ready to be distributed to associated HPC centers
§ Validate FEPA on a set of known benchmarks (Mantevo, NPB, SPEC)
ERLANGEN REGIONAL COMPUTING CENTER
Thank You.
Leibniz-Rechenzentrum
NEC Deutschland GmbH
Regionales Rechenzentrum Erlangen