Pegasus on the Virtual Grid: A Case Study of Workflow Planning over Captive Resources
Yang-Suk Kee, Eun-Kyu Byun, Ewa Deelman, Karan Vahi, Jin-Soo Kim
Oracle US Inc
Korea Advanced Institute of Science and Technology
Information Sciences Institute/University of Southern California
Sungkyunkwan University
Overview
Motivation
Background
– Pegasus
– Virtual Grid
Pegasus-VG Proxy
Conclusion
Discussion
Motivation
Challenges in scientific application development
– Data/control flow, task scheduling, data replication, fault tolerance, etc.
Challenges in resource management
– Availability, performance, cost, reliability, fault tolerance, etc.
How to leverage existing cyber infrastructures for easy and efficient scientific computing?
Separation of Concerns
Application domain
– Workflow management: application management can be conducted independently of target execution environments
– E.g., Pegasus, Askalon, Triana
Resource domain
– Resource provisioning: resource management can be encapsulated underneath abstractions or virtualizations
– E.g., Virtual Grid, virtual cluster, cloud
Workflow planning and execution over provisioned resources
Pegasus
A framework for workflow planning and execution
Workflow lifecycle
– Design: describe the data/control flows of the application via an abstract workflow
– Planning: map the workflow tasks onto physical resources
– Execution: schedule and run the workflow tasks on the mapped resources
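The three lifecycle stages can be illustrated with a minimal sketch. It does not use the Pegasus API: `abstract_workflow`, `plan`, `execute`, and the site names are hypothetical, the task dependencies are an assumed diamond shape, and the round-robin mapping is only a placeholder for a real planning policy.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Design: an abstract workflow as task -> set of predecessor tasks.
# The diamond-shaped dependencies are an assumption for illustration.
abstract_workflow = {
    "preprocess": set(),
    "findrange_1": {"preprocess"},
    "findrange_2": {"preprocess"},
    "analyze": {"findrange_1", "findrange_2"},
}


def plan(workflow, sites):
    """Planning: order tasks by dependency and map each onto a site.

    Round-robin mapping is a placeholder policy, not what Pegasus does.
    """
    order = list(TopologicalSorter(workflow).static_order())
    return [(task, sites[i % len(sites)]) for i, task in enumerate(order)]


def execute(executable_workflow, run):
    """Execution: run each task on its mapped site, in dependency order."""
    return [run(task, site) for task, site in executable_workflow]


executable = plan(abstract_workflow, ["site_a", "site_b"])
log = execute(executable, lambda task, site: f"{task}@{site}")
```

Keeping the abstract workflow free of site names is what lets the same description be re-planned onto a different set of resources.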
Pegasus Workflow Management
[Figure labels: abstract workflow, Pegasus mapper, executable workflow, Condor DAGMan, Condor, tasks, computing environment, Condor pool, monitoring information, provenance]
Virtual Grid
A programmable virtualized resource provisioning framework
Components
– vgDL (Virtual Grid Description Language): specifies resource requirements
– vgES (Virtual Grid Execution System): compiles and coordinates resources
– PC (Personal Cluster): provides uniform job management
[Figure labels: application (tasks A, B, C, D), Virtual Grid resource abstraction, VG, timeshare, lease, batch, PBS, P4 nodes; stages: classification, selection, binding, environment]
vgdl = ClusterOf (node) [2] { node = [Processor == "P4"] }
Pegasus on Virtual Grid
Scope
– A basic integration for workflow planning and execution over provisioned resources
Issues
– Resource capacity estimation: resource specification (vgDL) synthesis for Virtual Grid
– Resource information publication: site catalog generation for Pegasus
Resource Capacity Estimation
What Virtual Grid expects from Pegasus
– vgDL description
Available information
– Task execution time, data transfer time, performance metrics, minimum memory capacity, cost, deadline, etc.
Unknown information
– Number of virtual processors
Resource capacity estimate
– Find the minimum number of processors that can execute a workflow within a deadline
– BTS (Balanced Time Scheduling)
Ref: E.-K. Byun, Y.-S. Kee, et al., e-Science '08
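[Figure: example workflow with six tasks drawn in three levels: task 1; tasks 2, 3, 4, 5; task 6]
ID  1  2  3  4  5  6
ET  1  5  2  2  1  1
[Figure: schedule of tasks 1 to 6 over time on processors p1 and p2]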
How many processors do we need to run this workflow within 7 units?
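The question can be explored with a small sketch. This is not the BTS algorithm from the cited paper; it is a plain greedy list scheduler used as a stand-in. It assumes a fork-join dependency structure (task 1 precedes tasks 2 to 5, which all precede task 6), which is only inferred from the figure layout, and it ignores data transfer time. The names `ET`, `DEPS`, `makespan`, and `min_processors` are hypothetical.

```python
# Execution time (ET) per task ID, taken from the slide.
ET = {1: 1, 2: 5, 3: 2, 4: 2, 5: 1, 6: 1}
# Assumed dependencies (task -> predecessors), inferred from the figure layout.
DEPS = {2: [1], 3: [1], 4: [1], 5: [1], 6: [2, 3, 4, 5]}


def bottom_levels(et, deps):
    """Length of the longest path from each task to the end of the workflow."""
    succs = {t: [] for t in et}
    for task, preds in deps.items():
        for pred in preds:
            succs[pred].append(task)
    memo = {}

    def level(t):
        if t not in memo:
            memo[t] = et[t] + max((level(s) for s in succs[t]), default=0)
        return memo[t]

    return {t: level(t) for t in et}


def makespan(et, deps, num_procs):
    """Greedy list scheduling on identical processors; returns the finish time."""
    levels = bottom_levels(et, deps)
    finish = {}
    proc_free = [0] * num_procs
    pending = set(et)
    while pending:
        ready = [t for t in pending if all(p in finish for p in deps.get(t, []))]
        # Prefer the ready task with the longest remaining path (ties by ID).
        task = min(ready, key=lambda t: (-levels[t], t))
        data_ready = max((finish[p] for p in deps.get(task, [])), default=0)
        # Place it on the processor that allows the earliest start.
        proc = min(range(num_procs), key=lambda i: max(proc_free[i], data_ready))
        finish[task] = max(proc_free[proc], data_ready) + et[task]
        proc_free[proc] = finish[task]
        pending.remove(task)
    return max(finish.values())


def min_processors(et, deps, deadline):
    """Smallest processor count whose greedy schedule meets the deadline."""
    for n in range(1, len(et) + 1):
        if makespan(et, deps, n) <= deadline:
            return n
    return None  # the deadline is shorter than the critical path


answer = min_processors(ET, DEPS, deadline=7)
```

Under these assumptions the search settles on two processors: the chain 1, 2, 6 alone takes 7 units, and the remaining 5 units of work fit on a second processor within the same window.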
Example
– Execution time of each task: Xeon processor
– Data transfer time: network with 1 Gb/s bandwidth
– Deadline: 1 hour
Diamond = ClusterOf [2] (nd) [, 0:30:00] { nd = [Processor == "Xeon"] }
[Figure: diamond workflow with f.input, preprocess, two findrange tasks, analyze, and f.output]
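Once a processor count has been estimated, the vgDL description can be synthesized as a string. The sketch below only mirrors the shape of the example on this slide; it is not a vgDL reference. `synthesize_vgdl` and its parameters are hypothetical, and the bracketed time value is passed through as an opaque string because the slides do not define its meaning.

```python
def synthesize_vgdl(name: str, processors: int, processor_type: str, time_field: str) -> str:
    """Format a vgDL-like request in the shape shown on the slide (hypothetical helper)."""
    if processors < 1:
        raise ValueError("at least one processor is required")
    return (
        f"{name} = ClusterOf [{processors}] (nd) [, {time_field}] "
        f'{{ nd = [Processor == "{processor_type}"] }}'
    )


request = synthesize_vgdl("Diamond", 2, "Xeon", "0:30:00")
```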
Resource Information Publication
What Pegasus expects from Virtual Grid
– Site catalog
Virtual Grid
– VG instance
Resource information publication
– Devirtualize a VG instance and generate a site catalog for Pegasus
Personal Cluster
A partition of resources dedicated to a user under the control of a user-level resource manager during a limited time period
[Figure label: GT4/PBS]
Ref: Y.-S. Kee and C. Kesselman, HCW '08
Site Catalog Publication
<sitecatalog xmlns="http://pegasus.isi.edu/schema/sitecatalog" …>
  …
    <profile namespace="env" key="PEGASUS_HOME">/home/globus/pegasus-2.1.0</profile>
    <profile namespace="condor" key="grid_type">gt4</profile>
    <profile namespace="condor" key="jobmanager_type">PBS</profile>
    <lrc url="rlsn://cat7.kaist.ac.kr" />
    <gridftp url="gsiftp://cat7.kaist.ac.kr:2811" storage="/home/globus" major="4" minor="0" patch="7" />
    <jobmanager universe="transfer" url="https://cat7.kaist.ac.kr:9000/wsrf/services/ManagedJobFactoryService" major="4" minor="0" patch="7" total-nodes="2" />
    <jobmanager universe="vanilla" url="https://cat7.kaist.ac.kr:9000/wsrf/services/ManagedJobFactoryService" major="4" minor="0" patch="7" total-nodes="2" />
    <workdirectory>$HOME/workdir</workdirectory>
  </site>
  …
</sitecatalog>
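A site entry of this shape could be generated from a devirtualized VG instance roughly as follows. This is a sketch, not the VG-Pegasus Proxy code: `site_entry` and its parameters are hypothetical, the element and attribute names are copied from the excerpt above, and the attributes of `<site>` are left out because the excerpt elides them.

```python
import xml.etree.ElementTree as ET


def site_entry(host: str, total_nodes: int, pegasus_home: str, storage: str) -> ET.Element:
    """Build one <site> element for a provisioned cluster (hypothetical helper)."""
    version = {"major": "4", "minor": "0", "patch": "7"}  # values copied from the excerpt
    site = ET.Element("site")  # <site> attributes are elided in the slide, so none are set
    env = ET.SubElement(site, "profile", namespace="env", key="PEGASUS_HOME")
    env.text = pegasus_home
    for key, value in (("grid_type", "gt4"), ("jobmanager_type", "PBS")):
        ET.SubElement(site, "profile", namespace="condor", key=key).text = value
    ET.SubElement(site, "lrc", url=f"rlsn://{host}")
    ET.SubElement(site, "gridftp", url=f"gsiftp://{host}:2811", storage=storage, **version)
    factory = f"https://{host}:9000/wsrf/services/ManagedJobFactoryService"
    for universe in ("transfer", "vanilla"):
        jm = ET.SubElement(site, "jobmanager", universe=universe, url=factory, **version)
        jm.set("total-nodes", str(total_nodes))  # hyphenated name, so set explicitly
    ET.SubElement(site, "workdirectory").text = "$HOME/workdir"
    return site


site = site_entry("cat7.kaist.ac.kr", 2, "/home/globus/pegasus-2.1.0", "/home/globus")
xml_text = ET.tostring(site, encoding="unicode")
```

The node count passed as `total_nodes` is the one value that depends on the capacity estimate; the other fields describe how the provisioned cluster is reached.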
Workflow Planning over Provisioned Resources
[Figure: three phases (creation, planning, scheduling/execution). Labels: abstract workflow, BTS, vgDL, Virtual Grid, VG, devirtualization, site catalog, Pegasus, VG-Pegasus Proxy, executable workflow, GT4+PBS]
vgdl = ClusterOf (nd) [2] { nd = [Proc == "Xeon"] }
Conclusion
Pegasus on Virtual Grid
– Implements workflow planning and execution over on-demand captive resources
– Enables easy and efficient application development and execution
Issues
– Resource capacity estimation
– Site catalog publication
Discussion
Effective performance
– What is the cost that a user has to pay to have a successful execution?
Ongoing studies
– Fine-grained planning for resource provisioning: performance, cost, reliability
– Workflow execution for virtualization: recovery of failed tasks
Need More Information?
Pegasus
– http://pegasus.isi.edu
VGrADS
– Tuesday, 11:30am, RENCI booth (2633)
– Wednesday, noon, GCAS booth (285)
– Wednesday, 2:00pm, SDSC booth (568)
– Wednesday, 4:00pm, RENCI booth (2633)
Questions & Answers