
Sharing, integrating and executing different workflows in heterogeneous multi-cloud systems

Peter Kacsuk
MTA SZTAKI

[email protected]

SCI-BUS is supported by the FP7 Capacities Programme under contract no. RI-283481

Motivations 1

• In many cases large simulations are organized as scientific workflows (WFs) that run on Distributed Computing Infrastructures (DCIs)

• However, there are too many different
  – WF formalisms
  – WF languages
  – WF engines

• If a community selects a WF system, it is locked into that system:
  – They cannot share their WFs with other communities (even in the same scientific field)
  – They cannot utilize WFs developed by other communities

Motivations 2

• A WF engine is typically connected to one particular DCI (Distributed Computing Infrastructure)

• As a result, a community that selects a WF system is also locked into that DCI
  – Porting the WF to another DCI requires extra effort
  – Parallel execution of the same WF in several DCIs is usually not possible

Part of WF Ecosystem

What do we want to achieve?

Users should be able to access and use any WF and any infrastructure in an interoperable way, no matter which is their home WF system

[Figure labels: Taverna, Galaxy, Kepler (WF systems); XSEDE, BOINC, Amazon (infrastructures); Bio1, Bio2, BioN; Cyberspace; Workflows]

What does WF interoperability mean?

• If a user developed WF Y in WF system B, she (or other users) should be able to
  1. Re-use WF Y as part of another WF (e.g. WF X) developed in WF system A (coarse-grained interoperability, CGI)

Coarse-grained interoperability (CGI)

Coarse-grained interoperability (CGI) = nesting of different workflow systems to achieve interoperability of WF execution frameworks

[Figure labels: A, Y, DCI 1, DCI 2, DCI 3]

Features of CGI approach

• Advantages
  – No restrictions on the embedded WFs
  – You can run the embedded WFs in their native DCI (even in parallel, which makes a high degree of DCI parallelism easy to achieve)
  – Easy to implement and connect a new WF system
• Drawbacks
  – Black-box approach: you cannot
    • modify the embedded workflow
    • control and observe the internal operation of the embedded WF
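The black-box nesting behind CGI can be illustrated with a short Python sketch. Everything named here (`NativeEngine`, `EmbeddedWorkflow`, `run_host_workflow`, `fake_engine_b`) is hypothetical and not part of SHIWA or any WF system; the point is only that the host WF hands the embedded WF, untouched, to a stand-in for its native engine and sees nothing but inputs and outputs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a native WF engine: it receives the opaque
# workflow definition plus inputs and returns the outputs.
NativeEngine = Callable[[str, Dict[str, str]], Dict[str, str]]


@dataclass
class EmbeddedWorkflow:
    """A non-native WF wrapped as a single node of the host WF (black box)."""
    name: str
    definition: str       # opaque to the host WF system
    engine: NativeEngine  # the embedded WF's own engine

    def run(self, inputs: Dict[str, str]) -> Dict[str, str]:
        # The host cannot modify or observe the internals; it only
        # passes inputs in and collects outputs.
        return self.engine(self.definition, inputs)


def run_host_workflow(nodes: List[EmbeddedWorkflow],
                      inputs: Dict[str, str]) -> Dict[str, str]:
    """Run the nodes in sequence, feeding each node's outputs to the next."""
    data = dict(inputs)
    for node in nodes:
        data.update(node.run(data))
    return data


def fake_engine_b(definition: str, inputs: Dict[str, str]) -> Dict[str, str]:
    # Toy engine for the example: uppercases the "text" input.
    return {"text": inputs["text"].upper()}


wf_y = EmbeddedWorkflow(name="WF Y", definition="<opaque>", engine=fake_engine_b)
result = run_host_workflow([wf_y], {"text": "hello"})
```

Because the host only calls `run`, connecting a new WF system in this sketch means supplying one more engine callable, which mirrors the "easy to connect" advantage and the "cannot observe internals" drawback.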

What does WF interoperability mean?

• If a user developed WF Y in WF system B, she (or other users) should be able to
  1. Run WF Y under another WF system (e.g. WF system A) (fine-grained interoperability, FGI)
  2. Further develop WF Y under another WF system (e.g. WF system A) (FGI)

Fine-grained interoperability (FGI) or white-box WF interoperability: makes it possible to transform a WF from one WF system to another and to develop it further in the new system

[Figure: WF Y -> Transform WF Y into IWIR -> Interoperable Workflow Intermediate Representation (IWIR) -> Transform IWIR into WF X -> WF X]

Features of FGI approach

• Advantages
  – White-box approach: you can
    • modify the embedded workflow
    • control and observe the internal operation of the embedded WF
• Drawbacks
  – There are some restrictions on the WFs that can be transformed
  – You can run the embedded WFs only in the native DCIs of the target WF system
  – Not easy to implement and connect a new WF system
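The two-step transformation behind FGI (source WF -> intermediate representation -> target WF) can be sketched in Python. This is a toy under stated assumptions: `IntermediateWF` is not the real IWIR schema, and the "system B" and "system A" formats are invented dictionaries chosen only to show that each WF system needs one converter to and from the shared form.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class IntermediateWF:
    """Toy intermediate representation (NOT the real IWIR schema)."""
    tasks: List[str] = field(default_factory=list)
    links: List[Tuple[str, str]] = field(default_factory=list)  # (from, to)


def from_system_b(wf_b: Dict[str, List[str]]) -> IntermediateWF:
    """Hypothetical system B format: task name -> list of successor tasks."""
    ir = IntermediateWF(tasks=sorted(wf_b))
    for task in ir.tasks:
        for successor in wf_b[task]:
            ir.links.append((task, successor))
    return ir


def to_system_a(ir: IntermediateWF) -> Dict[str, List[str]]:
    """Hypothetical system A format: task name -> list of predecessor tasks."""
    wf_a: Dict[str, List[str]] = {task: [] for task in ir.tasks}
    for source, target in ir.links:
        wf_a[target].append(source)
    return wf_a


# WF Y as written in (hypothetical) system B, then transformed into system A.
wf_y = {"fetch": ["align"], "align": ["report"], "report": []}
wf_x = to_system_a(from_system_b(wf_y))
```

The result is an ordinary system A workflow that can be edited further, which is the white-box property; anything the intermediate form cannot express would be lost, which corresponds to the restrictions listed under drawbacks.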

What does infrastructure interoperability mean?

• If a user developed WF X in WF system A, she (or other users) should be able to
  – Run WF X in any DCI without significant porting effort
  – Run different nodes of WF X in different DCIs (if these nodes are in parallel branches, they can run simultaneously in different DCIs)

[Figure labels: Cloud1, Cloud N]
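Running parallel branches of one WF in different DCIs at the same time can be pictured with the following Python sketch. The submitter functions and the `placement` mapping are hypothetical stand-ins; a real submitter would contact the DCI instead of returning a string.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


# Hypothetical submitters, one per DCI.
def submit_to_cloud1(node: str) -> str:
    return f"{node} finished on Cloud1"


def submit_to_cloud_n(node: str) -> str:
    return f"{node} finished on Cloud N"


# Assumed mapping of parallel-branch nodes to DCIs, fixed at configuration time.
placement: Dict[str, Callable[[str], str]] = {
    "branch_a": submit_to_cloud1,
    "branch_b": submit_to_cloud_n,
}


def run_parallel_branches(
        placement: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Submit every branch node to its DCI concurrently and collect results."""
    with ThreadPoolExecutor(max_workers=len(placement)) as pool:
        futures = {node: pool.submit(submit, node)
                   for node, submit in placement.items()}
        return {node: future.result() for node, future in futures.items()}


results = run_parallel_branches(placement)
```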

EU Projects that develop solutions for these goals

• SHIWA
  – To solve WF and DCI interoperability issues
  – Duration: 2 years (July 2010 – June 2012)

• SCI-BUS
  – To provide the required gateway technology
  – Duration: 3 years (Oct 2011 – Sep 2014)

What does a WF developer need?

• WF App. Repository: access to a large set of ready-to-run scientific WF applications
• Portal: using a portal/desktop to parameterize and run these applications, and to further develop them
• E-science infrastructure: accessing a large set of various DCIs to make these WF applications run
  – Clouds
  – Local clusters
  – Supercomputers
  – Grid systems
    • Desktop grids (DGs) (BOINC, Condor, etc.)
    • Cluster based service grids (SGs) (EGEE, OSG, etc.)
    • Supercomputer based SGs (DEISA, TeraGrid)

Reference production infrastructure of SHIWA

Application developers
• Publish WF applications in a repository to be continued/used by other application developers
• Use the portal/desktop to develop complex applications (executable on various DCIs) based on WFs stored in the repository

[Figure labels: SHIWA App. Repository; SHIWA Portal; Local clusters; Supercomputers; Grid systems: Desktop grids (DGs) (BOINC, Condor, etc.), Cluster based service grids (SGs) (EGEE, OSG, etc.), Supercomputer based SGs (DEISA, TeraGrid)]

SHIWA Repository

Facilitates publishing and sharing workflows

Supports:
• Abstract workflows with multiple implementations of 10 workflow systems
• Storing execution specific data

Available:
• from the SHIWA Portal
• standalone service at: repo.shiwa-workflow.eu

SHIWA Bundle and SHIWA Desktop for WF interoperability

[Figure labels: SHIWA Bundle; SHIWA App. Repository; SHIWA Desktop (shown four times); WS-PGRADE; Triana; MOTEUR; ASKALON]

SHIWA Bundle and SHIWA Desktop

• SHIWA Bundle:
  – Object (stored as a zip file) containing everything needed to expose a workflow for use
  – Provides a common language/format for workflow engines
• Workflows are stored as SHIWA Bundles in the SHIWA Repository
• SHIWA Desktop connects a user's desktop workflow environment to the SHIWA Repository
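The idea of a bundle as a zip file that carries a workflow together with the information needed to use it can be sketched with Python's standard `zipfile` module. The slides do not describe the actual SHIWA Bundle layout, so the file names (`manifest.json`, `workflow.def`) and manifest fields below are assumptions made purely for illustration.

```python
import io
import json
import zipfile


def make_bundle(workflow_name: str, engine: str, definition: bytes) -> bytes:
    """Pack a workflow definition and a manifest into an in-memory zip."""
    # Hypothetical manifest fields; the real bundle format may differ.
    manifest = {"name": workflow_name, "engine": engine,
                "definition": "workflow.def"}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as bundle:
        bundle.writestr("manifest.json", json.dumps(manifest))
        bundle.writestr("workflow.def", definition)
    return buffer.getvalue()


def read_bundle(data: bytes) -> dict:
    """Read the manifest back and attach the workflow definition it points to."""
    with zipfile.ZipFile(io.BytesIO(data)) as bundle:
        manifest = json.loads(bundle.read("manifest.json"))
        manifest["payload"] = bundle.read(manifest["definition"])
    return manifest


bundle_bytes = make_bundle("WF Y", "Triana", b"<workflow/>")
info = read_bundle(bundle_bytes)
```

A repository or desktop client in this sketch only needs to understand the manifest, not the engine-specific definition inside, which is what lets one container format serve several workflow engines.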

Extension of WF interoperability with DCI interoperability

[Figure labels: SHIWA Bundle; SHIWA App. Repository; SHIWA Desktop (shown four times); WS-PGRADE; Triana; MOTEUR; ASKALON; DCI Bridge with BES interface; Local clusters; Supercomputers; Desktop grids (DGs) (BOINC, Condor, etc.); Cluster based service grids (SGs); Supercomputer based SGs; Grid systems]

Accessing DCI Bridge

• BES requires JSDL for job submission
• Therefore we need a JSDL generator to help WF engines create the JSDL for the jobs generated for WF nodes

[Figure: Workflow Engine -> jobs J1 to J4 in non-JSDL -> JSDL Translator -> jobs J1 to J4 in JSDL -> DCI Bridge -> DCI 1 ... DCI n; label: Production service]
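What a JSDL generator has to produce for each WF node can be sketched with Python's standard XML library. This is a minimal illustration and not the JSDL Translator shipped with the DCI Bridge: the function name `job_to_jsdl` is hypothetical, and only the job name, executable and arguments are filled in, whereas real submissions typically also describe resources and data staging. The element names and namespaces follow the JSDL 1.0 specification and its POSIX application extension.

```python
import xml.etree.ElementTree as ET
from typing import List

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def job_to_jsdl(name: str, executable: str, arguments: List[str]) -> str:
    """Build a minimal JSDL document for the job of one WF node."""
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("jsdl-posix", POSIX_NS)

    job_def = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    job_desc = ET.SubElement(job_def, f"{{{JSDL_NS}}}JobDescription")

    # Human-readable identification of the job.
    ident = ET.SubElement(job_desc, f"{{{JSDL_NS}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL_NS}}}JobName").text = name

    # What to execute, expressed with the POSIX application extension.
    app = ET.SubElement(job_desc, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = arg

    return ET.tostring(job_def, encoding="unicode")


jsdl_doc = job_to_jsdl("J1", "/bin/echo", ["hello"])
```

A WF engine with its own job format would map its fields onto these elements once per node, after which every job can be passed to a BES-compliant endpoint in the same form.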

Extension of WF interoperability with DCI interoperability (2)

[Figure labels: SHIWA Bundle; SHIWA App. Repository; SHIWA Desktop (shown four times); WS-PGRADE; Triana; MOTEUR; ASKALON; JSDL Translator (shown twice); DCI Bridge with BES interface; Local clusters; Supercomputers; Desktop grids (DGs) (BOINC, Condor, etc.); Cluster based service grids (SGs); Supercomputer based SGs; Grid systems]

Where are we?

• Workflow interoperability is done by
  – SHIWA Bundle
  – SHIWA Desktop
  – SHIWA Repository
• DCI interoperability is done by
  – DCI Bridge
  – JSDL Translator
• All of them are production services
• What else do we need?
  – A reference service through which anyone can try the technology
• The reference service is the SHIWA Portal

SHIWA Portal: WS-PGRADE/gUSE general-purpose gateway framework

• Based on Liferay
• General purpose
• Workflow-oriented portal framework
• Supports the development and execution of workflow-based applications
• Enables the multi-cloud, multi-DCI execution of any WF
• Provides access to
  – internal repository
  – external SHIWA Repository

Creating and running WS-PGRADE workflows

Step 1: Edit workflow

Step 2: Configuring the workflow

[Figure labels: Cloud1, Cloud N]

Step 3: Running workflow instance

Scalable architecture based on collaborating services

Seamless access to various types of DCIs

[Figure: architecture in three tiers]
• Client machine: WF Graph editor
• Portal Server machine: Liferay, WS-PGRADE portal, Information System, WF Storage, File Storage, WF Repository, WF Interpreter, DCI-Bridge (BES interface)
• DCIs: GT5 Grid, BOINC Grid, ARC Grid, CloudBroker

WFs in the clouds

• Running WFs in clouds is addressed by the SCI-BUS project by integrating WS-PGRADE/gUSE with the CloudBroker Platform
• Motivation:
  – Cloud resources are becoming more and more popular
  – Clouds are more reliable than grids
  – WFs with cloud access are capable of satisfying the compute needs of complex scientific computations
  – Clouds can provide a vast amount of resources
• Aim:
  – Provide access to cloud resources in a transparent way

WS-PGRADE/gUSE and SCI-BUS

Integrated WS-PGRADE/CloudBroker Platform to access multi-clouds

[Figure: WS-PGRADE 1 ... WS-PGRADE n -> CloudBroker Platform -> IaaS Cloud 1 ... IaaS Cloud N (multi-cloud); WF nodes labeled SEQ]

• Supported clouds: Amazon, IBM, OpenStack, Eucalyptus, OpenNebula
• SaaS solution:
  – Preregistered services/jobs can run from WS-PGRADE (supported from gUSE 3.5.0)
• IaaS solution:
  – Any services/jobs can be submitted from WS-PGRADE (supported from gUSE 3.5.1)

CloudBroker Platform

• Web-based application repository for the deployment and execution of scientific and technical software in the cloud

• Offers these stored applications as a SaaS service for end users

• On demand, pay per use, browser / programmatic / command-line access, cross-domain

• Uses infrastructure as a service (IaaS) from resource providers and offers these IaaS resources to users

CloudBroker Platform Architecture

[Figure: layered architecture, dated 20.09.2012]
• End Users, Software Vendors, Resource Providers
• User Tools: Web Browser UI, Java Client Library, CLI
• REST Web Service API
• CloudBroker Platform
• Applications: Chemistry, Biology, Health, Engineering, …
• Clouds: Amazon, OpenStack, IBM, Eucalyptus, …
• CloudBroker Integration


Integrated architecture

Integration features

• Support for commercial clouds with costs (prices configured in CloudBroker Platform):
  – Estimated job cost before submission
  – Actual job and workflow cost after execution
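The relation between per-job estimates and the cost of a whole workflow can be shown with a small worked example in Python. The pricing model used here (a flat price per core-hour) and all names and numbers are assumptions for illustration; the slides do not say how prices are actually configured in the CloudBroker Platform.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobRequest:
    cores: int
    hours: float  # expected runtime before submission, measured runtime after


def estimate_job_cost(job: JobRequest, price_per_core_hour: float) -> float:
    """Cost of one job under the assumed flat per-core-hour price."""
    return job.cores * job.hours * price_per_core_hour


def workflow_cost(job_costs: List[float]) -> float:
    """The cost of a workflow is the sum of the costs of its jobs."""
    return sum(job_costs)


price = 0.25  # hypothetical price per core-hour
jobs = [JobRequest(cores=4, hours=2.0), JobRequest(cores=8, hours=0.5)]
estimates = [estimate_job_cost(job, price) for job in jobs]
total = workflow_cost(estimates)
```

With these numbers the two jobs are estimated at 2.0 and 1.0 cost units, so the workflow estimate is 3.0; the same functions applied to measured runtimes would give the actual cost after execution.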

Accessible Cloud Resources

• Access provided by the CloudBroker Platform
• Commercial:
  – Amazon EC2
  – IBM
• Open source/free:
  – OpenStack
  – OpenNebula
  – Eucalyptus
• Currently accessible within SCI-BUS:
  – MTA SZTAKI OpenNebula (400 cores)
  – BIFI OpenStack (50 cores)

Collaboration within and among communities based on gUSE

[Figure labels: SHIWA Repository; gUSE Portal (shown twice), each with a gUSE WF Repo and a "WF upload as SHIWA bundle" link; Cloud 1 (OpenNebula); Cloud 2 (Amazon); Cloud n (OpenStack)]

Success story: SHIWA solution for the LINGA experiment

[Figure labels: Sub-Workflows; Management; Multi-Workflow]

Maturity of implementation

• Production services:
  – SHIWA Repository
  – SHIWA Bundle and SHIWA Desktop
  – CGI approach, connected WF systems:
    • ASKALON, Galaxy, MOTEUR, Pegasus, Taverna, Triana, WS-PGRADE
  – SHIWA Portal based on gUSE
  – CloudBroker Platform, connected clouds:
    • Amazon, IBM, Eucalyptus, OpenNebula, OpenStack
• Prototype services:
  – FGI approach, connected WF systems:
    • ASKALON, MOTEUR, Triana, WS-PGRADE
• New EU project ER-Flow supports 6 user communities

Recent WS-PGRADE/gUSE releases

History since v3.4.0
• Nov 2011: v3.4.0 (DCI Bridge)
• Feb 2012: v3.4.1 (usage statistics portlet)
• March 2012: v3.4.2 (support for new EMI release)
• April 2012: v3.4.3 (support for Liferay 6.1)
• …
• Aug 2012: v3.5.0 (SaaS cloud access via CBP)
• Sep 2012: v3.5.1 (IaaS cloud access via CBP)
• Oct 2012: v3.5.2 (SHIWA workflow repository export/import)
• March 2013: v3.5.3 (REST support, EMI-UI v1/v2 support, …)
• April 2013: v3.5.4 (cloud cost estimation/reporting)
• April 2013: v3.5.5 (robot certificates)
• May 2013: v3.5.6 (improved SHIWA workflow repository export/import)

gUSE SourceForge statistics

Where to find further information?

• gUSE/WS-PGRADE:
  – http://www.guse.hu/
• gUSE on SourceForge:
  – http://sourceforge.net/projects/guse/
  – http://sourceforge.net/projects/guse/forums/forum/
  – http://sourceforge.net/projects/guse/develop
• SCI-BUS web page:
  – http://www.sci-bus.eu/
• SHIWA web page:
  – http://www.shiwa-workflow.eu/
• ER-Flow web page:
  – http://www.erflow.eu


Summary

• We have created a technology that makes it possible to combine many different WFs, WF systems and DCIs in many different ways

• It is like a puzzle where you can put together the required pieces to create the final picture

[Figure labels (puzzle pieces): gUSE WF system; OpenNebula Cloud; Kepler WF; Galaxy WF]