Beyond Workflows - DOE Cloud Computing Paradigm and the SDM Role and Future


Page 1: Beyond Workflows - DOE Cloud Computing Paradigm and the SDM Role and Future

Beyond Workflows - DOE Cloud Computing Paradigm and the SDM Role and Future

Mladen A. Vouk, Nagiza Samatova, Paul Breimyer, Pierre Mouallem, Mei Nagappan, and the whole SPA team (list available separately)

Scientific Data Management Center – Scientific Process Automation Group

NC State University, Raleigh, NC 27695

Page 2

Overview

• Scientific workflow technology: a success story from the past 7 years in the SDM center (a technology used in production or otherwise by application people). Developed components: Workflows, Provenance, “Dashboard”, other.

• DOE SDM “Cloud”: vision for the future of the SDM center. Integration of components: Intelligent Analytics and Social Networks, component-based “cloud”, Integrated Services (service-oriented architecture).

• Sustainable science: long-term approach for the survival of SDM center technology (beyond SciDAC and longer). Integration of Research, Engineering, Transfer-of-Technology, Partnerships, Results (ROI, TCO).

Page 3

Scientific Process Automation

• A key differentiating element of a successful information technology (IT) is its ability to become a true, valuable, and economical contributor to cyberinfrastructure.

• An IT-assisted workflow represents a series of structured activities and computations that arise in information-assisted problem solving.

• Scientific process automation principles, as well as production-level pilots, are SDM’s key contribution over the last 7 years (Smoky Mountains retreat).

• From NC State: numerous publications, 3 graduated PhD students and 4 MS students with theses, several in progress, several generations of software.
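To make the notion of a workflow as "a series of structured activities and computations" concrete, here is a minimal sketch, not the SDM or Kepler implementation. It assumes a workflow can be described as named steps plus their dependencies; the step names (`simulate`, `analyze`, `report`) and the `run_workflow` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, depends_on, inputs):
    """Run each step once all of its prerequisites have produced results.

    steps:      maps a step name to a function taking the dict of results so far
    depends_on: maps a step name to the set of names it needs first
    inputs:     initial values, stored under their own names
    """
    results = dict(inputs)
    for name in TopologicalSorter(depends_on).static_order():
        if name in steps:  # plain inputs appear in the order but have no function
            results[name] = steps[name](results)
    return results


# Hypothetical three-step pipeline: simulate -> analyze -> report
steps = {
    "simulate": lambda r: [x * r["scale"] for x in range(4)],
    "analyze": lambda r: sum(r["simulate"]) / len(r["simulate"]),
    "report": lambda r: f"mean={r['analyze']:.1f}",
}
depends_on = {
    "simulate": {"scale"},
    "analyze": {"simulate"},
    "report": {"analyze"},
}

results = run_workflow(steps, depends_on, {"scale": 2})
print(results["report"])
```

A production engine would add what the rest of the deck discusses: provenance recording, remote execution, and monitoring.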

Page 4

[Diagram: Environment. Labels: Computations; Orchestration (Kepler); Data, Databases, Provenance, ... Storage; Analytics; Control Panels (Dashboard) & Display; Networking, Local/Remote, ... “Cloud” Services]

Page 5

Workflow Framework

[Diagram labels: Provenance, Tracking & Meta-Data (DBs and Portals); Control Plane (light data flows); Execution Plane (“heavy lifting” computations and flows); Synchronous or Asynchronous; Kepler]
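The split between a light control plane and a heavy execution plane, run either synchronously or asynchronously, can be sketched as follows. This is an illustrative assumption, not Kepler code: `heavy_computation` is a stand-in for a real job, and a local thread pool plays the role of the execution plane.

```python
from concurrent.futures import ThreadPoolExecutor


def heavy_computation(n):
    """Stand-in for an execution-plane job (for example, a simulation run)."""
    return sum(i * i for i in range(n))


def run_synchronously(sizes):
    # The control plane blocks on each job before issuing the next one.
    return [heavy_computation(n) for n in sizes]


def run_asynchronously(sizes, workers=2):
    # The control plane only hands out small job descriptors (the sizes)
    # and collects futures; the pool does the heavy lifting in the background.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(heavy_computation, n) for n in sizes]
        return [f.result() for f in futures]


sizes = [10, 100, 1000]
sync_results = run_synchronously(sizes)
async_results = run_asynchronously(sizes)
print(sync_results == async_results)
```

In both modes only small messages cross the control plane; the data-intensive work stays inside the function that represents the execution plane.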

Page 6

Actor/Process in a Broader Sense

[Diagram labels: In; Out; Network/“Cloud”]

Example of a process submitted through the LSF batch system:

    bsub < code_run

where code_run is a script:

    #! /bin/csh
    # LSF options: job name, processor count, run-time limit in minutes,
    # and stdout/stderr files (%J expands to the job ID).
    #BSUB -J codevouk
    #BSUB -n 100
    #BSUB -W 5
    #BSUB -o /share/vouk/WFLOW/code.out.%J
    #BSUB -e /share/vouk/WFLOW/code.err.%J
    # Load the LSF environment, then start the MPI program.
    source /usr/local/lsf/conf/cshrc.lsf
    mpiexec ./code
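One way to read "actor in a broader sense" is that a workflow actor takes parameters as input and emits a batch job as output. The sketch below assumes that reading; the `LsfJobActor` class and its field names are hypothetical, and only the `#BSUB` options that appear in the slide's script are used. The `fire` method needs an LSF installation, so the example only renders the script.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class LsfJobActor:
    """Hypothetical actor: inputs are job parameters, output is a job script."""
    command: str
    job_name: str
    processors: int = 1
    minutes: int = 5
    log_prefix: str = "job"

    def render_script(self) -> str:
        lines = [
            "#! /bin/csh",
            f"#BSUB -J {self.job_name}",
            f"#BSUB -n {self.processors}",
            f"#BSUB -W {self.minutes}",
            f"#BSUB -o {self.log_prefix}.out.%J",
            f"#BSUB -e {self.log_prefix}.err.%J",
            self.command,
        ]
        return "\n".join(lines) + "\n"

    def fire(self) -> str:
        """Submit the script on bsub's standard input, like `bsub < code_run`."""
        done = subprocess.run(["bsub"], input=self.render_script(),
                              capture_output=True, text=True, check=True)
        return done.stdout


actor = LsfJobActor(command="mpiexec ./code", job_name="codevouk",
                    processors=100, log_prefix="/share/vouk/WFLOW/code")
script = actor.render_script()
print(script)
```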

Page 7

Modular Framework

[Diagram labels: Supercomputers + Analytics Nodes; Kepler; Dash; Storage; Orchestration; Auth; Data Store; Rec API; Disp API; Management API; Access; Trust; Meta-Data about: Processes, Data, Workflows, System, Apps & Environment]

Page 8

Read More …

• Singh M.P. and M.A. Vouk, "Network Computing," in John G. Webster (editor), Encyclopedia of Electrical and Electronics Engineering, John Wiley & Sons, New York, Vol. 14, pp. 114-132, 1999

• S Klasky, M Beck, V Bhat, E Feibush, B Ludäscher, M Parashar, A Shoshani, D Silver and M Vouk, "Data management on the fusion computational pipeline," SciDAC 2005, Journal of Physics: Conference Series 16 (2005), 510-520, doi:10.1088/1742-6596/16/1/070

• Ilkay Altintas, Oscar Barney, Zhengang Cheng, Terence Critchlow, Bertram Ludaescher, Steve Parker, Arie Shoshani and Mladen Vouk, "Accelerating the scientific exploration process with scientific workflows," SciDAC 2006, Journal of Physics: Conference Series 46 (2006), 468-478, doi:10.1088/1742-6596/46/1/065

• M. A. Vouk, I. Altintas, R. Barreto, J. Blondin, Z. Cheng, T. Critchlow, A. Khan, S. Klasky, J. Ligon, B. Ludaescher, P. A. Mouallem, S. Parker, N. Podhorszki, A. Shoshani, C. Silva, "Automation of Network-Based Scientific Workflows," Proc. of the IFIP WoCo 9 on Grid-based Problem Solving Environments: Implications for Development and Deployment of Numerical Software, IFIP WG 2.5 on Numerical Software, Prescott, AZ, 2006, printed in IFIP, Vol 239, "Grid-Based Problem Solving Environments," eds. Gaffney PW and Pool JCT (Boston: Springer), pp. 35-61, 2007

• Klasky, S.; Barreto, R.; Kahn, A.; Parashar, M.; Podhorszki, N.; Parker, S.; Silver, D.; Vouk, M.A. "Collaborative visualization spaces for petascale simulations," Proceedings of the CTS 2008 - International Symposium on Collaborative Technologies and Systems, pp 203-211, Digital Object Identifier 10.1109/CTS.2008.4543933, 10-23 May 2008

• More… http://sdm.ncsu.edu

Page 9

DOE Cloud

• “Cloud” computing builds on decades of research in virtualization, distributed computing, utility computing, grids, and more recently networking, web and software services.

• It implies a seamless service-oriented and component-based architecture: delivery of an integrated and orchestrated suite of on-demand functions to an end-user through composition of both loosely and tightly coupled functions, or services, often network-based. Characteristics: reduced information technology overhead for the end-user, service orchestration, virtualization of resources, great flexibility, reduced total cost of ownership, different “flavors”.

• Intelligent Analytics and Knowledge-Creating Social Networks, Component-based “Clouds”, Seamless/Integrated Services

• Necessary in the context of Peta- and Exa- sciences, data, etc.

Page 10

“Analytics Cloud”

[Diagram labels: Knowledge creation & Integration, Social Networking, Provenance, Tracking & Meta-Data (DBs and Portals); Workflow control plane; W/F Engine; Concept-driven Analytics; W/F Generation Wizard; Run-time Manager and Scheduler; Synchronous & Asynchronous Services; Execution Plane: “heavy duty” in-cloud Computations, Flows, Services; Analytics-Enabled Resources: Supercomputers, Clusters, Active Storage, other “cloud” devices]

Page 11

Components

• Reusability (elements can be re-used in other workflows),

• Substitutability (alternative implementations are easy to insert, very precisely specified interfaces are available, run-time component replacement mechanisms exist, there is ability to verify and validate substitutions, etc.),

• Extensibility and scalability (ability to readily extend the system component pool and to scale it, increase capabilities of individual components, have an extensible and scalable architecture that can automatically discover new functionalities and resources, etc.),

• Customizability (ability to customize generic features to the needs of a particular scientific domain and problem),

• Composability (easy construction of more complex functional solutions using basic components, reasoning about such compositions, etc.).

There are other characteristics that are also very important:

• Reliability and availability of the components and services,

• Cost: the cost of the services, total cost of ownership, economy of scale,

• Security and privacy, and so on.
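Substitutability and composability from the list above can be illustrated with a small sketch. It assumes components are plain functions that share one precisely specified interface; the `Smoother` protocol and the functions `moving_average`, `identity`, `rescale`, and `compose` are hypothetical names chosen for this example.

```python
from typing import Protocol


class Smoother(Protocol):
    """Precisely specified interface that any substitute must satisfy."""
    def __call__(self, values: list[float]) -> list[float]: ...


def moving_average(values):
    # Three-point window that shrinks at the edges.
    out = []
    for i in range(len(values)):
        window = values[max(0, i - 1): i + 2]
        out.append(sum(window) / len(window))
    return out


def identity(values):
    # Drop-in alternative implementation of the same interface.
    return list(values)


def rescale(values):
    # Divide by the largest value so the maximum becomes 1.0.
    top = max(values)
    return [v / top for v in values]


def compose(*stages: Smoother):
    """Composability: build a larger component by chaining smaller ones."""
    def pipeline(values):
        for stage in stages:
            values = stage(values)
        return values
    return pipeline


smoothed = compose(moving_average, rescale)([0.0, 3.0, 6.0])
passthrough = compose(identity, rescale)([0.0, 3.0, 6.0])
print(smoothed, passthrough)
```

Swapping `moving_average` for `identity` changes the behavior without touching `compose` or `rescale`, which is the property the slide calls substitutability.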

Page 12

Example: Meta-Data Framework

[Diagram labels: Supercomputers + Analytics; Kepler?; Dash; Storage; Orchestration; Auth; DB; Rec API; Disp API; Custom Web; Other...]
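The diagram places a database between a recording API and a display API. A minimal sketch of that arrangement follows; the `MetaDataStore` class, its table layout, and its method names are assumptions made for illustration and are not taken from the SDM software.

```python
import sqlite3


class MetaDataStore:
    """Hypothetical store; table and column names are illustrative."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY, workflow TEXT, step TEXT, status TEXT)"
        )

    # Recording API: called by the orchestration layer as steps run.
    def record(self, workflow, step, status):
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO events (workflow, step, status) VALUES (?, ?, ?)",
                (workflow, step, status),
            )

    # Display API: called by a dashboard to show the latest state per step.
    def latest_status(self, workflow):
        rows = self.db.execute(
            "SELECT step, status FROM events WHERE workflow = ? ORDER BY id",
            (workflow,),
        )
        return dict(rows)  # later rows overwrite earlier ones for the same step


store = MetaDataStore()
store.record("run42", "simulate", "started")
store.record("run42", "simulate", "done")
store.record("run42", "analyze", "started")
status = store.latest_status("run42")
print(status)
```

Keeping the two APIs separate means a dashboard or custom web front end can be replaced without changing how the orchestration layer records events.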

Page 13

Fault-Tolerance – Clouds of Clouds

[Diagram label: Master DB (replicated)]
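The slide gives only the idea of a replicated master database spread across clouds. One simple way such replication could be used, shown here purely as an assumption, is to try each replica in turn until one answers. The names `query_with_failover`, `ReplicaUnavailable`, `cloud-a`, and `cloud-b` are hypothetical.

```python
class ReplicaUnavailable(Exception):
    """Raised by a replica handler that cannot serve the request."""


def query_with_failover(replicas, request):
    """Try each (name, handler) replica in order; return the first answer."""
    errors = []
    for name, handler in replicas:
        try:
            return name, handler(request)
        except ReplicaUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise ReplicaUnavailable("all replicas failed: " + "; ".join(errors))


def down(request):
    # Simulates a cloud site that is currently unreachable.
    raise ReplicaUnavailable("site offline")


def up(request):
    # Simulates a healthy replica of the master database.
    return f"metadata for {request}"


served_by, answer = query_with_failover(
    [("cloud-a", down), ("cloud-b", up)], "run42"
)
print(served_by, answer)
```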

Page 14

User Categories

• Developers (10)

• Service Authors (100 to 1,000)

• Service Integrators (100 to 10,000)

• End-users (1,000 - ?)

Page 15

Read More …

• Sam Averitt, Michael Bugaev, Aaron Peeler, Henry Shaffer, Eric Sills, Sarah Stein, Josh Thompson, Mladen Vouk, “Virtual Computing Laboratory (VCL),” in the Proceedings of the International Conference on Virtual Computing Initiative, May 7-8, 2007, IBM Corp., Research Triangle Park, NC, pp. 1-16.

• Mladen Vouk, Sam Averitt, Michael Bugaev, Andy Kurth, Aaron Peeler, Andy Rindos, Henry Shaffer, Eric Sills, Sarah Stein, Josh Thompson, “‘Powered by VCL’ - Using Virtual Computing Laboratory (VCL) Technology to Power Cloud Computing,” published in the Prelim. Proceedings of the 2nd International Conference on Virtual Computing Initiative, 15-16 May 2008, RTP, NC, pp. 1-10, final version to be available through the ACM Digital Library.

• Mladen A. Vouk, “Cloud Computing – Issues, Research and Implementations,” ITI08, to appear in IEEE Digital Library.

• Google for “cloud computing” …

• Other ...

Page 16

Sustainable Science

• A long-term approach for the survival of SDM center technology (beyond SciDAC and longer)

• Research

• Engineering

• Transfer-of-Technology

• Partnerships with scientists

• Operational open-source tools

• Visible results (agreed-upon ROI, and an accounting of TCO)