STORK: A Scheduler for Data Placement Activities in Grid

Tevfik Kosar, University of Wisconsin-Madison
www.cs.wisc.edu/condor
Some Remarkable Numbers
Characteristics of four physics experiments targeted by GriPhyN:

| Application | First Data | Data Volume (TB/yr) | User Community |
|-------------|------------|---------------------|----------------|
| SDSS        | 1999       | 10                  | 100s           |
| LIGO        | 2002       | 250                 | 100s           |
| ATLAS/CMS   | 2005       | 5,000               | 1000s          |

Source: GriPhyN Proposal, 2000
Even More Remarkable…
“…the data volume of CMS is expected to subsequently increase rapidly, so that the accumulated data volume will reach 1 Exabyte (1 million Terabytes) by around 2015.”
Source: PPDG Deliverables to CMS
Other Data Intensive Applications
- Genomic information processing applications
- Biomedical Informatics Research Network (BIRN) applications
- Cosmology applications (MADCAP)
- Methods for modeling large molecular systems
- Coupled climate modeling applications
- Real-time observatories, applications, and data-management (ROADNet)
Need to Deal with Data Placement
Data need to be moved, staged, replicated, cached, and removed; storage space for data must be allocated and de-allocated. We call all of these data-related activities in the Grid Data Placement (DaP) activities.
State of the Art
Data placement activities in the Grid are performed either manually or by simple scripts. They are simply regarded as “second class citizens” of the computation-dominated Grid world.
Our Goal
Our goal is to make data placement activities “first class citizens” in the Grid, just like computational jobs! They need to be queued, scheduled, monitored, managed, and even checkpointed.
Outline
- Introduction
- Grid Challenges
- Stork Solutions
- Case Study: SRB-UniTree Data Pipeline
- Conclusions & Future Work
Grid Challenges
- Heterogeneous Resources
- Limited Resources
- Network/Server/Software Failures
- Different Job Requirements
- Scheduling of Data & CPU together
Stork
Stork intelligently and reliably schedules, runs, monitors, and manages Data Placement (DaP) jobs in a heterogeneous Grid environment, and ensures that they complete. What Condor is to computational jobs, Stork is to DaP jobs: just submit a bunch of DaP jobs and then relax.
Stork Solutions to Grid Challenges
- Specialized in Data Management
- Modularity & Extendibility
- Failure Recovery
- Global & Job Level Policies
- Interaction with Higher Level Planners/Schedulers
Already Supported URLs
- file:/ -> Local File
- ftp:// -> FTP
- gsiftp:// -> GridFTP
- nest:// -> NeST (chirp) protocol
- srb:// -> SRB (Storage Resource Broker)
- srm:// -> SRM (Storage Resource Manager)
- unitree:// -> UniTree server
- diskrouter:// -> UW DiskRouter
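Since every DaP job names its endpoints by URL, the protocol module to use can be chosen from the URL scheme alone. A minimal sketch of that dispatch in Python (the module names in the table are illustrative stand-ins, not Stork's actual internals):

```python
from urllib.parse import urlparse

# Illustrative scheme -> transfer-module table; the module names
# are hypothetical, not Stork's real internals.
TRANSFER_MODULES = {
    "file": "local-file",
    "ftp": "ftp",
    "gsiftp": "gridftp",
    "nest": "nest-chirp",
    "srb": "srb",
    "srm": "srm",
    "unitree": "unitree",
    "diskrouter": "uw-diskrouter",
}

def pick_module(url: str) -> str:
    """Select a transfer module based on the URL scheme."""
    scheme = urlparse(url).scheme
    if scheme not in TRANSFER_MODULES:
        raise ValueError(f"unsupported protocol: {scheme}")
    return TRANSFER_MODULES[scheme]

print(pick_module("srb://ghidorac.sdsc.edu/kosart.condor/x.dat"))  # srb
```

New protocols can then be supported by adding one entry and one module, which is the kind of modularity and extendibility the previous slide claims.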
Higher Level Planners
[Architecture diagram: DAGMan as the higher-level planner, dispatching compute jobs to Condor-G and DaP jobs to Stork; underlying services include the GateKeeper and StartD on the compute side, and RFT, SRM, SRB, NeST, and GridFTP on the data side.]
Interaction with DAGMan

A DAG file can mix computational jobs and DaP jobs:

    Job A A.submit
    DaP X X.submit
    Job C C.submit
    Parent A child C, X
    Parent X child B
    .....

[Diagram: DAGMan dispatches the computational jobs (A, B, C, D) to the Condor job queue and the DaP jobs (X, Y) to the Stork job queue.]
Sample Stork submit file
    [
      Type     = "Transfer";
      Src_Url  = "srb://ghidorac.sdsc.edu/kosart.condor/x.dat";
      Dest_Url = "nest://turkey.cs.wisc.edu/kosart/x.dat";
      ............
      Max_Retry  = 10;
      Restart_in = "2 hours";
    ]
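The Max_Retry and Restart_in attributes suggest the retry semantics: on failure the transfer is restarted after the given interval, up to the retry limit. A minimal sketch of that loop in Python (illustrative only; transfer_once and the injectable sleep are stand-ins, not Stork code):

```python
import time

def run_with_retries(transfer_once, max_retry=10, restart_in=2 * 3600,
                     sleep=time.sleep):
    """Attempt a transfer up to max_retry times, waiting restart_in
    seconds between failed attempts; True on success, False otherwise."""
    for attempt in range(1, max_retry + 1):
        if transfer_once():
            return True
        if attempt < max_retry:
            sleep(restart_in)  # wait before restarting the transfer
    return False
```

Passing sleep as a parameter keeps the policy testable without actually waiting two hours between attempts.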
Case Study: SRB-UniTree Data Pipeline
We have transferred ~3 TB of DPOSS data (2611 x 1.1 GB files) from SRB to UniTree using 3 different pipeline configurations. The pipelines are built using Condor and Stork scheduling technologies. The whole process is managed by DAGMan.
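A quick sanity check on the quoted volume (2611 files of 1.1 GB each):

```python
files = 2611
file_size_gb = 1.1
total_tb = files * file_size_gb / 1000  # decimal units
print(f"{total_tb:.2f} TB")  # 2.87 TB, i.e. roughly the ~3 TB quoted
```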
[Pipeline configuration 1: SRB get from the SRB server into the NCSA cache, then UniTree put into the UniTree server; managed from the submit site.]
[Pipeline configuration 2: SRB get from the SRB server into the SDSC cache, GridFTP transfer to the NCSA cache, then UniTree put into the UniTree server; managed from the submit site.]
[Pipeline configuration 3: SRB get from the SRB server into the SDSC cache, DiskRouter transfer to the NCSA cache, then UniTree put into the UniTree server; managed from the submit site.]
Outcomes of the Study
1. Stork interacted easily and successfully with different underlying systems: SRB, UniTree, GridFTP, and DiskRouter.
Outcomes of the Study (2)
2. We had the chance to compare different pipeline topologies and configurations:

| Configuration | End-to-end rate (MB/s) |
|---------------|------------------------|
| 1             | 5.0                    |
| 2             | 3.2                    |
| 3             | 5.95                   |
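Assuming the rates were sustained, the ~3 TB transfer implies wall-clock times between roughly 5.5 and 10.5 days depending on the configuration (a back-of-the-envelope estimate, decimal units):

```python
total_mb = 2611 * 1.1 * 1000  # ~2,872,100 MB of DPOSS data
for config, rate_mb_s in [(1, 5.0), (2, 3.2), (3, 5.95)]:
    days = total_mb / rate_mb_s / 86400
    print(f"Configuration {config}: ~{days:.1f} days")
```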
Outcomes of the Study (3)
3. Almost all possible network, server, and software failures were recovered automatically.
Failure Recovery

[Timeline chart of automatically recovered failures: UniTree not responding; DiskRouter reconfigured and restarted; SDSC cache reboot & UW CS network outage; SRB server maintenance.]
For more information on the results of this study, please check:
http://www.cs.wisc.edu/condor/stork/
Conclusions
- Stork makes data placement a “first class citizen” in the Grid.
- Stork is the Condor of the data placement world.
- Stork is fault tolerant, easy to use, modular, extendible, and very flexible.
Future Work
- More intelligent scheduling
- Data-level management instead of file-level management
- Checkpointing for transfers
- Security
You don’t have to FedEx your data anymore… Stork delivers it for you!

For more information, drop by my office anytime:
- Room: 3361, Computer Science & Stats. Bldg.
- Email: [email protected]