workshop on the future of scientific workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf...

13
dr hgf dj hngng f mhgm gh mgh j mgh f mf Tom Peterka [email protected] Mathematics and Computer Science Division ASCR SSIO Workshop September 19, 2018 Workshop on the Future of Scientific Workflows April 20-21, 2015, Rockville, MD Co-organizers Ewa Deelman (USC) and Tom Peterka (ANL) Program committee IlkayAltintas (SDSC), Chris Carothers (RPI), Ken Moreland (SNL), Manish Parashar (Rutgers), Lavanya Ramakrishnan (LBNL), Jeff Vetter (ORNL), Kerstin Kleese van Dam (BNL), Michela Taufer (UD) Program managers Lucy Nowell (DOE) and Rich Carlson (DOE) QDONQSNESGDCNDMFMR.BRRBHDMSHEHBVNQJEKNVRVNQJRGNO @OQHK1/fl10+1/04 RonmrnqdcaxsgdNeehbdne@cu‘mbdcRbhdmshehbBnlotshmfQdrd‘qbg T-R-Cdo‘qsldmsneDmdqfx NeehbdneRbhdmbd THE FUTURE OF SCIENTIFIC WORKFLOWS

Upload: others

Post on 24-Aug-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

d rh g fd jh n g n gfmh g mg hmg hjmg hfm f

Tom [email protected]

Mathematics and Computer Science Division

ASCR SSIO WorkshopSeptember 19, 2018

Workshop on the Future of Scientific WorkflowsApril 20-21, 2015, Rockville, MD

Co-organizersEwa Deelman (USC) and Tom Peterka (ANL)

Program committeeIlkay Altintas (SDSC), Chris Carothers (RPI), Ken Moreland (SNL),

Manish Parashar (Rutgers), Lavanya Ramakrishnan (LBNL), Jeff Vetter (ORNL),Kerstin Kleese van Dam (BNL), Michela Taufer (UD)

Program managersLucy Nowell (DOE) and Rich Carlson (DOE)

2%0/24�/&�4(%�$/%�.'.3�#3�3#)%.4)&)#�7/2+&,/73�7/2+3(/0!02),���n��������

3PONSORED�BY�THE�/FFICE�OF�!DVANCED�3CIENTIFIC�#OMPUTING�2ESEARCH5�3��$EPARTMENT�OF�%NERGY/FFICE�OF�3CIENCE

5)&�'6563&�0'�4$*&/5*'*$�803,'-084

Page 2: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

Workflows 101

2

Page 3: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

Definitions

• Workflow: Sequencing and orchestrating operations, along with the attendant tasks of, for example, moving data between workflow processing stages.

• Workflow management systems: Aiding in the automation and capture the provenance of these processes, freeing the scientist from the details of the process.• Manage the execution of constituent tasks • Manage the information exchanged between them

TaskA

B

C

D E

F

cycles are OK

parallelprograms

parallelcommunication

• Directed graph of tasks and communication

• Graph nodes are the tasks

• Graph links are the communication

• Graph does not have to be acyclic

• “Large tasks” (programs), not “small tasks” (threads)

• Nodes and links can be parallel

Graph Model

Page 4: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

In Situ (IS) and Distributed Area (DA)

• In Situ (IS): Within an HPC system (synonyms: in situ, in transit, coprocessing, run-time, online)• Distributed Area (DA): Across systems, potentially geographically distributed

Task

Source

Source

Exploration

Exploration

... ...

Task

Task

Task

Task

In Situ (IS) Distributed Area (DA)

Steering

Pers

isten

t Sto

rage

Data /Control

Page 5: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

Examples

5

Page 6: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

Cosmology: HPC Simulation

• Computational workflow is one small part of a complete cosmology campaign

• Converts dark matter particles to an unstructured mesh

• Converts an unstructured mesh to a regular grid

• Computes statistics over the grid and visualizes the results

6

Page 7: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

Materials Science: Simulation + Experiment

7Ulvestad et al.: In-situ 3D Imaging of Catalysis Induced Strain in Gold Nanoparticles. Physical Chemistry Letters, 2016.

Science workflow for the comparison of a molecular dynamics simulation with a high-energy X-ray microscopy of the same material system includes three interrelated in situ and distributed area experimental workflows.

LatticeFitting

Thresholding

DensityIsovolume

DiffractionPattern

AtomDisplacements

Strain Stats

Atom Positions

Max

Max

Strain Stats

Atom Positions

Reconstruction

Ampitude &Phase

Strain Stats

CDI

LAMMPS

Experiment Workflow (DAC)

Analysis Workflow (DAC)

Modeling Workflow (HPC)

Page 8: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

Bioinformatics: High Throughput + Cloud

8

Courtesy Folker Meyer

High-throughput workflows often feature many small tasks, web services, cloud architectures, and big data tools.

Page 9: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

The Workshop

9

Page 10: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

Mission

Develop requirements for workflow methods and tools in a combined IS and DA environment to enable science applications to better manage their end-to-end data flow.

Objectives• Identify the workflows of representative science use cases in IS and DA settings• Understand the state of the art in existing workflow technologies, including creation, execution, provenance, (re)usability, and reproducibility• Address emerging hardware and software trends, both in centralized and distributed environments, as they relate to workflows• Bridge the gap between IS and DA workflows

Page 11: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

Findings and Research Priorities Related to SSIO

• Applications: Lagging I/O bandwidth motivates in situ workflows• Hardware: New storage technology, heterogeneity, hierarchy• System software: Provisioning storage resources and services as first-class citizens• Dataflow: Storage for communication• Provenance: Fast storage and retrieval of log data (e.g., performance profiles)

• Validation: Predicting performance and validate results (e.g., storage systems)

Page 12: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

Reporthttp://science.energy.gov/~/media/ascr/pdf/programdocuments/docs/workflows_final_report.pdf

12

Journal paper

Ewa Deelman, Tom Peterka, Ilkay Altintas, Christopher Carothers, Kerstin Kleese van Dam, Kenneth Moreland, Manish Parashar, Lavanya Ramakrishnan, Michela Taufer, Jeffrey Vetter: The Future of Scientific Workflows. International Journal of High Performance Computing Applications, 2017.

Page 13: Workshop on the Future of Scientific Workflows · 2018. 10. 9. · drhgfdjhngngfmhgmghmghjmghfmf Tom Peterka tpeterka@mcs.anl.gov Mathematics and Computer Science Division ASCR SSIO

Tom [email protected]

Mathematics and Computer Science Division

Meeting OrganizersEwa Deelman (USC)Tom Peterka (ANL)Ilkay Altintas (SDSC)Christopher Carothers (RPI)Kerstin Kleese van Dam (PNNL)Kenneth Moreland (SNL)Manish Parashar (RU)Lavanya Ramakrishnan (LBNL)Michela Taufer (UD)Jeffrey Vetter (ORNL)

Meeting Participants Greg Abram (UT)Gagan Agrawal (OSU)Jim Ahrens (LANL)Pavan Balaji (ANL)Ilya Baldin (RENCI)Jim Belak (LLNL)Amber Boehnlein (SLAC)Shreyas Cholia (NERSC)Alok Choudhary (NU)Constantinos Evangelinos (IBM)Ian Foster (ANL)Geoffrey Fox (IU)Foss Friedman-Hill (SNL)Jonathan Gallmeier (AMD)Al Geist (ORNL)Berk Geveci (Kitware)Gary Grider (LANL)Mary Hall (UU)

Ming Jiang (LLNL)Elizabeth Jurrus (UU)Gideon Juve (USC)Larry Kaplan (Cray)Dimitri Katramatos (BNL)Dan Katz (UC)Darren Kerbyson (PNNL)Scott Klasky (ORNL)Jim Kowalkowski (FNAL)Dan Laney (LLNL)Miron Livny (UW)Allen Malony (UO)Anirban Mandal (RENCI)Folker Meyer (ANL)Dave Montoya (LANL)Peter Nugent (LBNL)Valerio Pascucci (UU)Wilfred Pinfold (Intel)Thomas Proffen (ORNL)

Rob Ross (ANL)Erich Strohmaier (LBNL)Christine Sweeney (LANL)Douglas Thain (UND)Craig Tull (LBNL)Tom Uram (ANL)Dean Williams (LLNL)Matt Wolf (GT)John Wright (MIT)John Wu (LBNL)Frank Wuerthwein (UCSD)Dantong Yu (BNL)

DOE Program ManagersRichard CarlsonLucy Nowell

ASCR SSIO WorkshopSeptember 19, 2018

Acknowledgments

http://science.energy.gov/~/media/ascr/pdf/programdocuments/docs/workflows_final_report.pdf