Executing Multi-workflow simulations on a mixed grid/cloud infrastructure using the SHIWA Technology
Peter Kacsuk, MTA SZTAKI and Univ. of Westminster
[email protected]


Page 1

Executing Multi-workflow simulations on a mixed grid/cloud infrastructure using the SHIWA Technology

Peter Kacsuk
MTA SZTAKI and Univ. of Westminster
[email protected]

Page 2

What is a multi-workflow simulation?

• A simulation workflow where the nodes of the simulation are themselves workflows, potentially based on different workflow languages

• A practical example: the LINGA application (outGRID project)
  – Combines several workflows (2 CIVETs + FreeSurfer + STATS)
  – Heterogeneous workflow systems (LONI Pipeline / MOTEUR)

• It should also enable execution in different DCIs. LINGA application:
  – LONI Cluster (USA)
  – gLite-based neuGRID infrastructure (Europe)
  – CBRAIN HPC infrastructure (Canada)
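The nesting idea above can be sketched in a few lines of Python (all class and task names are hypothetical illustrations, not SHIWA APIs): a top-level workflow treats whole sub-workflows, each owned by a different engine, as ordinary nodes.

```python
class Task:
    """An atomic node: a single executable step."""
    def __init__(self, name, action):
        self.name = name
        self.action = action

    def run(self, data):
        return self.action(data)


class SubWorkflow:
    """A node that is itself a complete workflow, interpreted by its own
    engine (as the LONI Pipeline and MOTEUR workflows are in LINGA)."""
    def __init__(self, name, engine, steps):
        self.name = name
        self.engine = engine  # name of the workflow system owning these steps
        self.steps = steps    # ordered list of Task nodes

    def run(self, data):
        for step in self.steps:
            data = step.run(data)
        return data


class MultiWorkflow:
    """Top-level workflow whose nodes may be Tasks or whole SubWorkflows."""
    def __init__(self, nodes):
        self.nodes = nodes

    def run(self, data):
        for node in self.nodes:
            data = node.run(data)
        return data


# Toy instance loosely modeled on LINGA: a LONI Pipeline sub-workflow
# feeding a MOTEUR sub-workflow.
civet = SubWorkflow("CIVET", "LONI Pipeline",
                    [Task("segment", lambda d: d + ["segmented"])])
stats = SubWorkflow("STATS", "MOTEUR",
                    [Task("analyse", lambda d: d + ["analysed"])])
linga = MultiWorkflow([civet, stats])
result = linga.run(["raw-scan"])  # ['raw-scan', 'segmented', 'analysed']
```

The point of the sketch is only the composition: the multi-workflow never needs to understand a sub-workflow's language, it just invokes the node.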

Page 3

Experiment setup


• CIVET @ CBRAIN: LONI Pipeline, 151 input data items
• CIVET @ neuGRID: LONI Pipeline, 146 input data items
• FreeSurfer @ CRANIUM: LONI Pipeline
• STATS @ EGI: MOTEUR, takes the outputs of both CIVETs
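The setup above forms a small dependency graph: the MOTEUR-based STATS step may only start once both CIVET runs have produced their outputs. A minimal sketch with Python's standard `graphlib` (node names taken from the slide; the scheduling itself is illustrative, not how the SHIWA platform schedules):

```python
from graphlib import TopologicalSorter

# Dependency graph of the experiment: the predecessors of each node
# must finish before it may start. Only STATS has predecessors here.
deps = {
    "CIVET @ CBRAIN": set(),        # LONI Pipeline, 151 input data items
    "CIVET @ neuGRID": set(),       # LONI Pipeline, 146 input data items
    "FreeSurfer @ CRANIUM": set(),  # LONI Pipeline
    "STATS @ EGI": {"CIVET @ CBRAIN", "CIVET @ neuGRID"},  # MOTEUR
}

order = list(TopologicalSorter(deps).static_order())
# Any valid schedule runs STATS after both CIVET sub-workflows.
assert order.index("STATS @ EGI") > order.index("CIVET @ CBRAIN")
assert order.index("STATS @ EGI") > order.index("CIVET @ neuGRID")
```

The three independent sub-workflows can run in parallel on their respective DCIs; only the STATS node synchronizes on them.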

Page 4

SHIWA solution for LINGA

Sub-Workflows Management


Multi-Workflow

Page 5

SHIWA (SHaring Interoperable Workflows for Large-Scale Scientific Simulations on Available DCIs) project

Start date: 01/07/2010
Duration: 27 months
Total budget: 2,101,980 €
Funding from the EC: 1,800,000 €
Total funded effort in person-months: 231
Web site: www.shiwa-workflow.eu
Coordinator: Prof. Peter Kacsuk, email: [email protected]

Page 6

Motivations 1

• In many cases large simulations are organized as scientific workflows that run on DCIs
• However, there are too many different
  – WF formalisms
  – WF languages
  – WF engines
• If a community has selected a WF system, it is locked into this system:
  – They cannot share their WFs with other communities (even in the same scientific field)
  – They cannot utilize WFs developed by other communities

Page 7

WF Ecosystem

Page 8

Who are the members of an e-science community from the WF applications point of view?

End-users (e-scientists) (5000-50000)
• Execute the published WF applications with custom input parameters, creating application instances that use the published WF applications as templates

WF Application Developers (500-1000)
• Develop WF applications
• Publish the completed WF applications for end-users

WF System Developers (50-100)
• Develop WF systems
• Write technical, user and installation manuals

Page 9

What does a WF developer need?

• WF App. Repository: access to a large set of ready-to-run scientific WF applications
• Portal: using a portal/desktop to parameterize and run these applications, and to further develop them
• Access to a large set of various DCIs to make these WF applications run:
  – Clouds
  – Local clusters
  – Supercomputers
  – Grid systems of the e-science infrastructure: desktop grids (DGs) (BOINC, Condor, etc.), cluster-based service grids (SGs) (EGEE, OSG, etc.), supercomputer-based SGs (DEISA, TeraGrid)

Page 10

In the past: WF developers worked in an isolated way, on a single DCI

• Accessing a single DCI to make these WF applications run, e.g. a cluster-based service grid (SG) such as ARC
• Using a portal/desktop to develop WF applications

As a result, if a community selected a WF system it became locked into this DCI:
• Porting the WF to another DCI required large effort
• Parallel execution of the same WF in several DCIs is usually not possible

Page 11

After SHIWA: Collaboration between WF application developers

SHIWA App. Repository and SSP Portal, on top of local clusters, supercomputers, and grid systems: desktop grids (DGs) (BOINC, Condor, etc.), cluster-based service grids (SGs) (EGEE, OSG, etc.), supercomputer-based SGs (DEISA, TeraGrid)

• Application developers publish WF applications in the repository, to be continued by other application developers
• Application developers use the portal/desktop to develop complex applications (executable on various DCIs) for various end-user communities

Page 12

Project objectives
• Enable user communities to share their WFs
  – Publish the developed WFs
  – Access and re-use the published WFs
  – Build multi-workflows from the published WFs
• Toolset: the SHIWA Simulation Platform
  – WF Repository (production)
  – SHIWA Portal (production)
  – SHIWA Desktop (prototype)

Page 13

Coarse-grained interoperability (CGI)

• CGI = Nesting of different workflow systems to achieve interoperability of WF execution frameworks
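A toy sketch of this nesting (the engine classes and method names are hypothetical, not the GEMLCA interface): the master engine never parses the foreign workflow language; it hands the whole sub-workflow to the engine that owns it and only sees a single node.

```python
class Engine:
    """Stand-in for a concrete workflow engine (MOTEUR, Taverna, ...)."""
    def __init__(self, name):
        self.name = name

    def enact(self, workflow, inputs):
        # A real engine would parse and execute its native workflow
        # language; here we only record who ran what.
        return f"{workflow} run by {self.name} on {inputs}"


class MasterEngine(Engine):
    """Master engine whose nodes delegate whole sub-workflows to other
    engines instead of interpreting the foreign languages itself."""
    def run_node(self, node):
        engine, workflow, inputs = node
        return engine.enact(workflow, inputs)


master = MasterEngine("WS-PGRADE")
moteur = Engine("MOTEUR")
result = master.run_node((moteur, "stats-workflow", "civet-outputs"))
# result == "stats-workflow run by MOTEUR on civet-outputs"
```

This is why CGI scales to many workflow systems: adding one engine wrapper makes every workflow of that system usable as a node.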

Multi-workflow

Page 14

Fine-grained interoperability (FGI)

• WFA: export to IWIR; WFB: import from IWIR
• IWIR: Interoperable Workflow Intermediate Representation
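The export/import path can be illustrated with two invented toy formats (IWIR itself is an XML-based language; the dict-based intermediate representation below only mimics the idea): system A exports its native workflow into the neutral form, and system B imports that form as its own native workflow.

```python
def export_to_ir(wfa_text):
    """System A's exporter: native 'x -> y' lines to a neutral edge list.
    Both text formats here are invented for illustration."""
    edges = []
    for line in wfa_text.strip().splitlines():
        src, dst = (part.strip() for part in line.split("->"))
        edges.append({"from": src, "to": dst})
    return {"format": "toy-IR", "edges": edges}


def import_from_ir(ir):
    """System B's importer: neutral edge list to native 'x ==> y' lines."""
    assert ir["format"] == "toy-IR"
    return "\n".join(f"{e['from']} ==> {e['to']}" for e in ir["edges"])


wfa = "civet -> stats\nfreesurfer -> stats"
wfb = import_from_ir(export_to_ir(wfa))
# wfb == "civet ==> stats\nfreesurfer ==> stats"
```

With n workflow languages, a shared intermediate representation needs only n exporters and n importers instead of n×(n-1) direct translators.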

Page 15

Tools for CGI: SHIWA services

• SHIWA Repository, to:
  – Describe workflows
  – Share workflows
• SHIWA Portal, to:
  – Access and enact registered workflows
  – Compose and enact multi-workflows
  – Monitor the execution of workflows and multi-workflows in various DCIs
  – Retrieve the results of the execution

Page 16

SHIWA Repository facilitates publishing and sharing workflows

Supports:
• Abstract workflows with multiple implementations from over 10 workflow systems
• Storing execution-specific data

Available:
• from the SHIWA Portal
• as a standalone service at repo.shiwa-workflow.eu

Page 17

EGI Community Forum, Munich, March 28, 2012

Scenario: Find and test WFs
• SHIWA Repository: analyze the description, inputs and outputs of published WFs
• SHIWA Portal: instantiate a WF from the repository and execute it with given sample data (inside WS-PGRADE, used as the Master WF system)

Page 18

SHIWA Portal: Workflow Editor

Page 19

SHIWA Portal: Configuring Workflow

Page 20

SHIWA Portal: Executing Workflow

Page 21

CGI User Scenario with WS-PGRADE as master

[Architecture diagram. The SHIWA Science Gateway comprises the SHIWA Portal (with the WS-PGRADE workflow editor and workflow engine), the SHIWA Repository, the GEMLCA Service with GIB, and the GEMLCA Repository holding workflows (WF1..WFm) and workflow engines (WE1..WEp). Scenario steps s1-s6: the researcher searches for a WF (s1), edits it (s2), the WE + WF are retrieved from the repositories (s3-s5), and the WE is submitted and invoked (s6), with the SHIWA Proxy Server supplying proxy credentials. Pre-deployed WEs in the SHIWA VO: MOTEUR and GWES on a gLite DCI; MOTEUR, Kepler, Taverna, and Triana on a Globus DCI; ASKALON on a local cluster.]

Page 22

CGI User Scenario with MOTEUR as master

[Architecture diagram. Same SHIWA Science Gateway (SHIWA Portal with the WS-PGRADE workflow editor and engine, SHIWA Repository, GEMLCA Service and Repository with GIB), but the user now works through the MOTEUR workflow editor and engine, and a GEMLCA admin configures engine access via the GEMLCA Client and GEMLCA UI. Scenario steps s1-s7: search for a WF (s1), configure and retrieve the WE + WF (s2-s6), and invoke the WE (s7) on the pre-deployed engines of the SHIWA VO (MOTEUR and GWES on gLite; MOTEUR, Kepler, Taverna, and Triana on Globus; ASKALON on a local cluster), with the SHIWA Proxy Server supplying credentials.]

Page 23

Integration of SCI-BUS and SHIWA gateways with clouds

[Architecture diagram. SCI-BUS Science Gateways (built on Liferay and P-GRADE): the SCI-BUS generic gateway plus community gateways for the German MoSGrid community, Statistical Seismology, Blender Rendering, the Amsterdam Medical Center, Swiss Proteomics, Astrophysics, Heliophysics, Business Process, Software build and test, Citizen Web, and the PireGrid commercial community. These access applications (App 1, App 2, App 3, ...) through the commercial CloudBroker Platform, an app store for clouds exposing a Web Service / API, which dispatches via APIs to: cloud providers (Amazon EC2, IBM, SGI, and private Eucalyptus/OpenNebula clouds), clusters (PBS, LSF, Condor, SGE), desktop grids (BOINC, XtremWeb, OurGrid), supercomputers, and grid middleware (gLite, UNICORE, ARC). Grids reachable include: AraGrid (Spain), Armenian Grid, Baltic Grid, Belgian Grid, BIFI Desktop Grid, Bulgarian Grid, ClGrid (Chile), COMPCHEM VO of EGEE, Croatian Grid, EGRID (Economy Grid, Italy), GILDA training grid, Grid Ireland, HunGrid (Hungarian National Grid), IberGrid (Portugal and Spain), MathGrid (Spain), MoSGrid (Molecule Simulation Grid of D-Grid), KnowledgeGrid Malaysia, PireGrid (Spain and France), See-Grid (South-East European Grid), SwissGrid, Turkish Grid, UK NGS, UK White Rose Grid, VOCE (Central European Grid), Westminster Desktop Grid.]

Page 24

ER-flow Project

Partners:
• University of Westminster (UoW), United Kingdom
• Magyar Tudomanyos Akademia Szamitastechnikai es Automatizalasi Kutato Intezete (MTA-SZTAKI), Hungary
• Centre National de la Recherche Scientifique (CNRS), France
• Stichting European Grid Initiative (EGI.eu), The Netherlands
• Academic Medical Center of the University of Amsterdam (AMC), The Netherlands
• Technische Universität Dresden (TUD), Germany
• Ludwig-Maximilians-Universität München (LMU), Germany
• University College London (UCL), United Kingdom
• Trinity College Dublin (TCD), Ireland
• Istituto Nazionale di Astrofisica (INAF), Italy

Technology providers: CNRS, EGI.eu, MTA-SZTAKI, UoW

Research communities:
• Astrophysics: INAF
• Computational Chemistry: LMU + TUD
• Heliophysics: TCD + UCL
• Life Science: AMC

Duration: September 2012 – August 2014

Page 25

Advantages for the various types of user communities using SHIWA

• WF system developers
  – Better visibility: many more WF developers can access and use their WF system than before (through the applications stored in the SHIWA repository)
  – The joint impact is much bigger than what the individual WF systems can achieve
• WF developers
  – They can collaborate: share and re-use existing WF applications
  – WF application development can be accelerated
  – More complex WFs can be created in a shorter time
  – They gain access to many different DCIs (so their WFs will be more popular)
• End-users
  – A much bigger set of usable and more sophisticated WF applications
  – These applications can run on various DCIs

Page 26

Conclusions

• SHIWA brings advantages for all 3 kinds of user communities:
  – WF system developers
  – WF developers
  – End-users
• With relatively little effort
  – WF systems can join the SSP
  – WF system developers can adapt the SHIWA technology

• Further information: www.shiwa-workflow.eu

Page 27

Summer school on workflows and gateways
