New Capabilities in QosCosGrid Middleware for Advanced Job Management, Advance Reservation and Co-allocation of Computing Resources
B. Bosak, P. Kopta, K. Kurowski, M. Mamonski, T. Piontek
Poznan Supercomputing and Networking Center
Cracow Grid Workshop, 8th November 2011
2
Plan of the Presentation
Introduction
Main features of QCG
Comparison with other Grid systems
QCG Architecture and its main Components
Status on deployments
3
Introduction
Co-allocation
Advance Reservation
Large-scale parallel applications
Cross-cluster MPI and ProActive
Multiscale simulations
Workflows
4
QCG for Parallel Applications
• Multicluster OpenMPI and ProActive
• Hybrid applications, e.g. OpenMPI/OpenMP
• Multiscale, cross-cluster applications based on the MUSCLE framework
• Applications consisting of groups of processes with different resource requirements
• Topology-aware scheduling:
  - by QCG, based on application requirements
  - by the application, based on topology discovery
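As an illustration of process groups with different resource requirements, here is a minimal sketch that places each group on a single cluster so that intra-group communication stays local. It is not QCG's scheduling algorithm: the first-fit strategy, the function name `place_groups` and the cluster and group names are all assumptions made for this example.

```python
def place_groups(groups, clusters):
    """Assign each process group to one cluster, first fit, largest group first.

    groups:   {name: (processes, memory_gb_per_process)}   (hypothetical format)
    clusters: {name: (free_cores, memory_gb_per_core)}     (hypothetical format)
    Returns {group: cluster}, or None if some group cannot be placed.
    """
    free = {name: cores for name, (cores, _) in clusters.items()}
    placement = {}
    # Place the largest groups first so they are not starved by small ones.
    for group, (procs, mem) in sorted(groups.items(), key=lambda g: -g[1][0]):
        for cluster, (_, mem_per_core) in clusters.items():
            if free[cluster] >= procs and mem_per_core >= mem:
                free[cluster] -= procs
                placement[group] = cluster
                break
        else:
            return None  # no cluster satisfies this group's requirements
    return placement


# Hypothetical example: a large solver group and a small, memory-hungry coupler.
groups = {"solver": (64, 2), "coupler": (4, 8)}
clusters = {"cluster_a": (64, 2), "cluster_b": (32, 16)}
placement = place_groups(groups, clusters)
print(placement)
```

In this example the solver fills cluster_a and the coupler lands on cluster_b, the only cluster with enough memory per core.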
5
QCG for Workflow Applications
• Workflows based on directed acyclic graphs (DAGs)
• A task may be triggered by the status of preceding tasks (e.g. a task may be started when the preceding task is in the “Running” state)
• Multidimensional parameter sweep experiments (as part of a workflow)
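The status-based triggering and the parameter sweep above can be sketched as follows. This is only an illustration under stated assumptions: the three-state model, the task names and the helper functions are hypothetical and do not reflect QCG's actual state model or job description format.

```python
from itertools import product

# Hypothetical task states, ordered by progress; not QCG's actual state model.
STATES = ["PENDING", "RUNNING", "FINISHED"]


def reached(current, required):
    """True if `current` is at or past `required` in the STATES ordering."""
    return STATES.index(current) >= STATES.index(required)


def ready_tasks(triggers, status):
    """Return pending tasks whose every trigger (parent, required_state) is met."""
    ready = []
    for task, deps in triggers.items():
        if status[task] != "PENDING":
            continue
        if all(reached(status[parent], state) for parent, state in deps):
            ready.append(task)
    return sorted(ready)


def parameter_sweep(**axes):
    """Expand named parameter axes into one dict per point of the grid."""
    names = list(axes)
    return [dict(zip(names, values)) for values in product(*axes.values())]


# 'monitor' may start once 'simulate' is RUNNING; 'analyse' waits for FINISHED.
triggers = {
    "simulate": [],
    "monitor": [("simulate", "RUNNING")],
    "analyse": [("simulate", "FINISHED")],
}
status = {"simulate": "RUNNING", "monitor": "PENDING", "analyse": "PENDING"}
print(ready_tasks(triggers, status))

# A two-dimensional sweep: every combination becomes one task instance.
points = parameter_sweep(temperature=[300, 350], pressure=[1, 2, 5])
print(len(points))
```

Each trigger is a (parent, state) pair, so a task can depend on a parent that is merely running as well as on one that has finished.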
6
Advance Reservation and Co-allocation
• AR is a mechanism for executing applications in specified time slots
• The main use case is cross-cluster application execution, where co-allocation of resources is required
• QCG creates a co-allocation based on parameters specified by the user: not only resource requirements, but also the requested start time, end time or duration
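To make co-allocation concrete, the sketch below searches for the earliest time slot in which every cluster is free for the requested duration. It is an assumption-laden illustration, not the QCG-Broker algorithm: the busy-interval representation, the abstract time units and the function names are invented for this example.

```python
def free_at(busy, start, duration):
    """True if [start, start + duration) overlaps no busy interval."""
    end = start + duration
    return all(end <= b_start or start >= b_end for b_start, b_end in busy)


def earliest_common_slot(busy_by_cluster, duration, not_before, deadline):
    """Earliest start at which every cluster is free for `duration`.

    Candidate starts are `not_before` and the end of every busy interval,
    because the earliest feasible slot must begin at one of those points.
    Returns None when nothing fits before `deadline`.
    """
    candidates = {not_before}
    for busy in busy_by_cluster.values():
        candidates.update(b_end for _, b_end in busy if b_end > not_before)
    for start in sorted(candidates):
        if start + duration > deadline:
            break
        if all(free_at(busy, start, duration) for busy in busy_by_cluster.values()):
            return start
    return None


# Hypothetical existing reservations per cluster, as (start, end) in hours.
busy = {"cluster_a": [(0, 4), (10, 12)], "cluster_b": [(2, 6)]}
slot = earliest_common_slot(busy, duration=3, not_before=0, deadline=24)
print(slot)
```

Once a common start is found, an advance reservation would be created on each cluster for the same window. A real broker also has to handle the case where one of those reservations fails and the others must be released.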
7
QosCosGrid vs. Popular Grid Middleware
Middleware | Single jobs | Workflows | MPI jobs | Cross-cluster MPI jobs | Interactive jobs | Parametric jobs
gLite      | Yes         | Yes       | Yes      | No                     | Yes              | Yes
UNICORE    | Yes         | Yes       | Yes      | No                     | No               | Yes
QCG        | Yes         | Yes       | Yes      | Yes                    | No               | Yes
9
QCG-Broker
• Grid domain meta-scheduling framework
• Deals with load-balancing and scheduling of cross-cluster jobs
• Provides a consistent WebService interface to the Grid; the XML-based JobProfile language is used as the job description format
• Interacts directly with cluster-level services (QCG-Computing, QCG-Notification, gridFTP, …)
10
QCG-Computing
• The key component of the cluster domain;
• Provides a WebService interface to various DRMs, with integration based on DRMAA (e.g. PBS Pro, LoadLeveler, GE, Torque/Maui);
• Compliant with OGF HPC Basic Profile Specification (JSDL as a job description language, BES interface);
• Offers methods for creation and management of advance reservations;
• Many plugins for authentication, authorization and accounting.
12
QCG-Notification
• Its main function in the QCG system is brokering asynchronous notifications between the QCG-Computing and QCG-Broker services
• Implementation of the brokered version of the WS-Notification standard; features:
  - Advanced two-level filtering based on topics and content of the notification messages
  - Pull and push styles of distributing notification messages
  - HTTP/HTTPS and XMPP transport protocols
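The two-level filtering idea can be sketched with a toy in-memory broker that first matches on topic and then applies an optional predicate to the message content. The class, its methods and the topic names are hypothetical; this is not the QCG-Notification or WS-Notification API, and only push-style delivery is shown.

```python
class NotificationBroker:
    """Toy in-memory broker for illustration; not the QCG-Notification API."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, topic, callback, content_filter=None):
        """Register a push-style consumer for `topic`.

        `content_filter` is an optional predicate over the message body.
        """
        self._subscriptions.append((topic, content_filter, callback))

    def notify(self, topic, message):
        """Deliver `message` to matching subscribers; return the delivery count."""
        delivered = 0
        for sub_topic, content_filter, callback in self._subscriptions:
            if sub_topic != topic:
                continue  # first level: topic filter
            if content_filter is not None and not content_filter(message):
                continue  # second level: content filter
            callback(message)
            delivered += 1
        return delivered


broker = NotificationBroker()
received = []
# A consumer interested only in jobs that have finished (hypothetical topic name).
broker.subscribe("job/status", received.append, lambda m: m["state"] == "FINISHED")
broker.notify("job/status", {"job": "j1", "state": "RUNNING"})
broker.notify("job/status", {"job": "j1", "state": "FINISHED"})
broker.notify("reservation/status", {"job": "j1", "state": "FINISHED"})
print(received)
```

Only the second message passes both filters. The first is rejected on content and the third on topic.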
14
QCG Science Gateways and Tools
• Nano portal – advanced web-based portal dedicated to nanotechnologists (Abinit, NAMD, Quantum Espresso)
• QCG-Icon – lightweight desktop interface to QCG (MATLAB)
• QCG-Mobile – mobile access to QosCosGrid services (Android, JME)
15
Deployments
• QCG is deployed at 4 production sites in PL-Grid (PSNC, Cyfronet AGH, TASK, WSNC):
  - proxy certificates,
  - LDAP grid-mapfile generation,
  - BAT accounting,
  - Nagios probes,
  - RPM packages.
• NEL, a quantum chemistry application written by Prof. Jacek Komasa, was adapted to cross-cluster execution on top of QosCosGrid. Tests were performed on the PL-Grid infrastructure.
• Ongoing production deployments in Europe on EGI and PRACE resources (e.g. LRZ, UCL, SARA) – MAPPER project.
16
Summary
• QosCosGrid is an alternative grid middleware
• Great support for cross-cluster application execution (MPI, ProActive, MUSCLE)
• Ready production deployments in PL-Grid infrastructure.
• Further reading: http://www.qoscosgrid.org