New Capabilities in QosCosGrid Middleware for Advanced Job Management, Advance Reservation and Co-allocation of Computing Resources B. Bosak, P. Kopta, K. Kurowski, M. Mamonski, T. Piontek Poznan Supercomputing and Networking Center Cracow Grid Workshop 8-th November 2011


New Capabilities in QosCosGrid Middleware for Advanced Job Management, Advance Reservation and Co-allocation of Computing Resources

B. Bosak, P. Kopta, K. Kurowski, M. Mamonski, T. Piontek

Poznan Supercomputing and Networking Center

Cracow Grid Workshop, 8th November 2011

2

Plan of the Presentation

Introduction

Main features of QCG

Comparison with other Grid systems

QCG Architecture and its main Components

Status on deployments

3

Introduction

• Co-allocation

• Advance Reservation

• Large-scale parallel applications

• Cross-cluster MPI and ProActive

• Multiscale simulations

• Workflows

4

QCG for Parallel Applications

• Multicluster OpenMPI and ProActive

• Hybrid applications, e.g. OpenMPI/OpenMP

• Multiscale, cross-cluster applications based on the MUSCLE framework

• Applications consisting of groups of processes with different resource requirements

• Topology-aware scheduling:
  - by QCG, based on application requirements
  - by the application, based on topology discovery
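A minimal sketch of how groups of processes with different requirements could be mapped onto clusters. It is not QCG's scheduler: the function name, the first-fit strategy and the resource fields (core count, memory per core) are assumptions made for illustration. Each group is kept on a single cluster so that intra-group communication stays local.

```python
def place_groups(groups, clusters):
    """Assign each process group to one cluster (first fit, largest group first).

    groups:   group name   -> (processes, memory_per_process_gb)
    clusters: cluster name -> (free_cores, memory_per_core_gb)
    Returns a dict group -> cluster, or None if some group cannot be placed.
    All names and fields here are hypothetical.
    """
    free = {name: cores for name, (cores, _) in clusters.items()}
    placement = {}
    # Place the largest groups first; they are the hardest to fit.
    for group, (procs, mem) in sorted(groups.items(), key=lambda g: -g[1][0]):
        for cluster, (_, mem_per_core) in clusters.items():
            if free[cluster] >= procs and mem_per_core >= mem:
                free[cluster] -= procs
                placement[group] = cluster
                break
        else:
            return None  # no cluster satisfies this group's requirements
    return placement


groups = {"solver": (64, 2), "coupler": (8, 8)}
clusters = {"alpha": (64, 2), "beta": (32, 16)}
print(place_groups(groups, clusters))
```

Here the 64-process solver group fills cluster "alpha", and the memory-hungry coupler group lands on "beta".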

5

QCG for Workflow Applications

• Workflows based on directed acyclic graphs (DAG)

• A task may be triggered by the status of preceding tasks (e.g. a task may be started once the preceding task is in the “Running” state)

• Multidimensional parameter sweep experiments (as a part of a workflow)
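The status-based triggering and the parameter sweep can be sketched as below. This is an illustration only: the three-state model, the function names and the data layout are assumptions, not QCG's workflow description or state model.

```python
from itertools import product

# Hypothetical task states, ordered by progress (not QCG's actual states).
STATES = ["PENDING", "RUNNING", "FINISHED"]


def ready_tasks(triggers, states):
    """Return pending tasks whose trigger conditions are all satisfied.

    triggers maps task -> list of (predecessor, required_state) pairs.
    A condition holds once the predecessor has reached required_state
    or any later state.
    """
    rank = {s: i for i, s in enumerate(STATES)}
    ready = []
    for task, conditions in triggers.items():
        if states[task] != "PENDING":
            continue
        if all(rank[states[p]] >= rank[req] for p, req in conditions):
            ready.append(task)
    return ready


def sweep(parameters):
    """Expand a multidimensional parameter sweep into one dict per task."""
    names = sorted(parameters)
    return [dict(zip(names, values))
            for values in product(*(parameters[n] for n in names))]


# "monitor" may start as soon as "simulate" is running;
# "analyse" waits until "simulate" has finished.
triggers = {
    "simulate": [],
    "monitor": [("simulate", "RUNNING")],
    "analyse": [("simulate", "FINISHED")],
}
states = {"simulate": "RUNNING", "monitor": "PENDING", "analyse": "PENDING"}
print(ready_tasks(triggers, states))
print(len(sweep({"temperature": [280, 300], "pressure": [1, 2, 5]})))
```

With "simulate" running, only "monitor" is ready; the two-parameter sweep expands to six tasks.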

6

Advance Reservation and Co-allocation

• Advance reservation (AR) is a mechanism for executing applications in specified time slots

• The main use case is cross-cluster application execution, where co-allocation of resources is required

• QCG creates a co-allocation based on parameters specified by users: it may use not only resource requirements, but also the requested start time, end time or duration
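The core of co-allocation is finding a time slot in which all requested clusters are free at once. A minimal sketch, assuming each cluster reports a list of disjoint free windows in common time units; the function name and data layout are hypothetical and do not reflect QCG's internal algorithm.

```python
def earliest_common_slot(free_windows, duration, not_before=0):
    """Find the earliest start at which every cluster is free for `duration`.

    free_windows maps cluster -> list of disjoint (start, end) free intervals,
    in arbitrary time units (e.g. minutes). Returns the start time of a slot
    common to all clusters, or None if no such slot exists.
    """
    # The earliest feasible start is either `not_before` or the start
    # of some free window, so only those instants need to be checked.
    candidates = sorted(
        {not_before}
        | {max(s, not_before) for ws in free_windows.values() for s, _ in ws}
    )
    for start in candidates:
        end = start + duration
        if all(
            any(s <= start and end <= e for s, e in windows)
            for windows in free_windows.values()
        ):
            return start
    return None


free = {
    "cluster_a": [(0, 30), (60, 180)],
    "cluster_b": [(20, 100), (120, 240)],
}
print(earliest_common_slot(free, duration=45))
```

A user-requested start time maps to `not_before`; a requested end time could be handled by rejecting slots whose end exceeds it.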

7

QosCosGrid vs. Popular Grid Middleware

Middleware   Single jobs   Workflows   MPI jobs   Cross-cluster MPI jobs   Interactive jobs   Parametric jobs
gLite        Yes           Yes         Yes        No                       Yes                Yes
UNICORE      Yes           Yes         Yes        No                       No                 Yes
QCG          Yes           Yes         Yes        Yes                      No                 Yes

8

QCG Architecture

9

QCG-Broker

• Grid domain meta-scheduling framework

• Deals with load balancing and scheduling of cross-cluster jobs

• Provides a consistent WebService interface to the Grid; the XML-based JobProfile language is used as the job description format

• Interacts directly with cluster-level services (QCG-Computing, QCG-Notification, gridFTP, …)

10

QCG-Computing

• The key component of the cluster domain;

• Provides a WebService interface to various DRMs – integration based on DRMAA (e.g. PBS Pro, LoadLeveler, GE, Torque/Maui);

• Compliant with the OGF HPC Basic Profile specification (JSDL as the job description language, BES interface);

• Offers methods for creation and management of advance reservations;

• Many plugins for authentication, authorization and accounting.
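For readers unfamiliar with JSDL, the sketch below builds a minimal job description with Python's standard library. It uses the JSDL 1.0 namespaces and the POSIXApplication extension purely for illustration; which JSDL extensions QCG-Computing accepts is not stated in the slides, and the job name, executable and file names are made up.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the JSDL 1.0 specification.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(job_name, executable, arguments, output):
    """Return a minimal JSDL job definition as an XML string."""
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
    ident = ET.SubElement(desc, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL}}}JobName").text = job_name
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    ET.SubElement(posix, f"{{{POSIX}}}Output").text = output
    return ET.tostring(job, encoding="unicode")


# Hypothetical job: values are placeholders, not taken from the slides.
print(build_jsdl("hello", "/bin/echo", ["hello", "grid"], "stdout.txt"))
```

A document of this kind would be passed to a BES-style job creation call; the submission step itself is omitted here because the client API is not described in the slides.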

11

QCG-Computing Performance

12

QCG-Notification

• Its main function in the QCG system is brokering asynchronous notifications between the QCG-Computing and QCG-Broker services

• Implements the brokered version of the WS-Notification standard; features:
  – Advanced two-level filtering based on topics and content of the notification messages
  – Pull and push styles of distributing notification messages
  – HTTP/HTTPS and XMPP transport protocols
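The two-level filtering and the push and pull delivery styles can be illustrated with a small in-memory broker. This is a conceptual sketch only: the class, method names and message layout are hypothetical, and it implements neither the WS-Notification protocol nor any transport.

```python
from collections import deque


class NotificationBroker:
    """Toy broker: topic filter first, then a content filter per subscriber."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, topic, content_filter=None, callback=None):
        """Register a consumer and return its message queue.

        With a callback the subscription is push-style (the queue stays
        empty); without one, messages accumulate in the queue until the
        consumer pulls them.
        """
        sub = {
            "topic": topic,
            "filter": content_filter or (lambda message: True),
            "callback": callback,
            "queue": deque(),
        }
        self._subscriptions.append(sub)
        return sub["queue"]

    def publish(self, topic, message):
        """Deliver a message to matching subscribers; return their count."""
        delivered = 0
        for sub in self._subscriptions:
            if sub["topic"] != topic:          # level 1: topic filter
                continue
            if not sub["filter"](message):     # level 2: content filter
                continue
            if sub["callback"] is not None:
                sub["callback"](message)       # push delivery
            else:
                sub["queue"].append(message)   # stored for pull delivery
            delivered += 1
        return delivered


broker = NotificationBroker()
pushed = []
# Push consumer interested only in finished jobs.
broker.subscribe("job/status", lambda m: m["state"] == "FINISHED", pushed.append)
# Pull consumer interested in every status change.
inbox = broker.subscribe("job/status")

broker.publish("job/status", {"job": "j1", "state": "RUNNING"})
broker.publish("job/status", {"job": "j1", "state": "FINISHED"})
print(len(pushed), len(inbox))
```

The push consumer receives only the "FINISHED" message, while the pull consumer finds both messages waiting in its queue.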

13

QCG-Notification Performance

14

QCG Science Gateways and Tools

• Nano portal – advanced web-based portal dedicated to nanotechnologists (Abinit, NAMD, Quantum Espresso)

• QCG-Icon – lightweight desktop interface to QCG (MATLAB)

• QCG-Mobile – mobile access to QosCosGrid services (Android, JME)

15

Deployments

• QCG is deployed at 4 production sites in PL-Grid (PSNC, Cyfronet AGH, TASK, WSNC):
  – proxy certificates,
  – LDAP grid-mapfile generation,
  – BAT accounting,
  – Nagios probes,
  – RPM packages.

• NEL, a quantum chemistry application written by Prof. Jacek Komasa, was adapted to cross-cluster execution on top of QosCosGrid. Tests were performed on the PL-Grid infrastructure.

• Ongoing production deployments in Europe on EGI and PRACE resources (e.g. LRZ, UCL, SARA) – MAPPER project.

16

Summary

• QosCosGrid is an alternative grid middleware

• Great support for cross-cluster application execution (MPI, ProActive, MUSCLE)

• Production deployments are ready in the PL-Grid infrastructure

• Further reading: http://www.qoscosgrid.org

17

Thank You!

?