Post on 21-Dec-2015
1
Trace-Based Characteristics of Grid Workflows
Alexandru Iosup and Dick Epema
PDS Group, Delft University of Technology
The Netherlands
Simon Ostermann, Radu Prodan, and Thomas Fahringer
DPS Group, University of Innsbruck
Austria
2
Why are Grid Workflows Interesting?
• Grids promise reliable and easy-to-use computational infrastructure for e-Science
• Full automation from experiment design to final result
• Often, automation = workflows
  • Jobs comprising inter-related computing and data-transfer tasks
3
Why are the Characteristics of Grid Workflows Interesting?
• For focusing on the right research problems
  • What are the interesting characteristics? Number of nodes? Number of edges? Other characteristics…
• For simulation studies
  • Optimizing a scheduler for one workload does not make it useful for another (often quite the contrary)
  • … optimizing for a workload type is better
• For performance evaluation in real environments
  • The system tuned to one workload
4
Outline
• Introduction
• Method for Grid Workflow Analysis
• Austrian Grid Traces
• Grid Workflow Characteristics
• Conclusion and Future Work
5
Method for Grid Workflow Analysis [1/3]
Overview
• Goal: establish the main characteristics of grid workflows, so that building a workflow-based grid workload model is greatly facilitated
• Grid workflow characteristics
  • Workflow-intrinsic
  • Environment-related
6
Method for Grid Workflow Analysis [2/3]
Intrinsic Workflow Characteristics
• Size and structure of the workflow
  • Number of nodes (N) / edges (E)
  • Branching Factor = N/E
  • Work Size = runtime of a task on a base platform [SI2k]
  • Work Size Variability = ratio of the longest to the shortest WF task
  • Sequential execution path
  • Critical execution path
  • Graph level (L) = length of critical execution path
• Arrival patterns
  • Daily patterns: Peak Hours
  • Weekly patterns: Week-end vs. Work Days
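The size and structure metrics listed on this slide can be sketched for a small workflow graph as follows. This is a minimal illustration, not the authors' tooling: the four-task workflow, its work sizes, and the function name `intrinsic_characteristics` are hypothetical, and the graph level is taken here as the number of tasks on the critical execution path, which is one possible reading of "length".

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: task -> work size (seconds on the base platform).
work = {"a": 10.0, "b": 40.0, "c": 20.0, "d": 5.0}
# Hypothetical dependencies: task -> set of predecessor tasks.
preds = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}


def intrinsic_characteristics(work, preds):
    """Compute the size and structure metrics for one workflow DAG."""
    n = len(work)
    e = sum(len(p) for p in preds.values())
    finish = {}  # heaviest cumulative work of any path ending at a task
    hops = {}    # number of tasks on that heaviest path
    for t in TopologicalSorter(preds).static_order():
        best = max(preds[t], key=lambda p: finish[p], default=None)
        finish[t] = work[t] + (finish[best] if best is not None else 0.0)
        hops[t] = 1 + (hops[best] if best is not None else 0)
    last = max(finish, key=finish.get)  # final task of the critical path
    return {
        "nodes": n,
        "edges": e,
        # Branching factor as defined on the slide (N/E).
        "branching_factor": n / e if e else float("inf"),
        "work_size_variability": max(work.values()) / min(work.values()),
        # Sequential path: all tasks executed one after another.
        "sequential_path": sum(work.values()),
        "critical_path": finish[last],
        # Assumption: level = number of tasks on the critical path.
        "graph_level": hops[last],
    }


result = intrinsic_characteristics(work, preds)
print(result)
```

For real traces the work sizes would first have to be normalized to the base platform; the sketch assumes that has already been done.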
7
Method for Grid Workflow Analysis [3/3]
Environment-Related WF Characteristics
• Time-related
  • Makespan (MS) = time between WF entering and exiting system
• Scheduler-related
  • Speedup (S) = Sequential Execution Path Size / MS
  • Normalized Schedule Length (NSL) = MS / Critical Path Size
• Failure-related
  • Success rate = % tasks finished correctly, per WF
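The environment-related metrics can be sketched from a few per-workflow values. This is a hedged illustration with made-up inputs: the function name, argument names, and numbers are hypothetical. Speedup is taken here in its conventional orientation (sequential execution path size divided by makespan, so values above 1 indicate a gain), which is an assumption of this sketch.

```python
def environment_characteristics(submit, finish, sequential_path,
                                critical_path, task_ok):
    """Compute makespan, speedup, NSL and success rate for one workflow.

    submit/finish: times at which the workflow entered/left the system.
    sequential_path, critical_path: path sizes in the same time unit.
    task_ok: one boolean per task, True if the task finished correctly.
    """
    makespan = finish - submit
    return {
        "makespan": makespan,
        # Assumption: conventional orientation, sequential time / makespan.
        "speedup": sequential_path / makespan,
        # NSL = makespan / critical path size (1.0 is the ideal schedule).
        "nsl": makespan / critical_path,
        "success_rate": 100.0 * sum(task_ok) / len(task_ok),
    }


# Hypothetical workflow: entered at t=100 s, left at t=160 s, 4 tasks.
metrics = environment_characteristics(
    submit=100.0, finish=160.0,
    sequential_path=75.0, critical_path=55.0,
    task_ok=[True, True, True, False],
)
print(metrics)
```

An NSL close to 1 means the schedule is close to the critical-path lower bound; larger values point to waiting time or other overheads in the environment.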
9
The Austrian Grid Traces
• Austrian Grid: 8 sites, ~500 processors
• Two non-overlapping long-term traces from two workflow engines: Askalon DEE, Askalon EE2
• Workflows: mostly testing, but many jobs similar to production workflows
• Production areas: material sciences, astrophysics, weather prediction, engineering, movie rendering
11
Intrinsic Workflow Characteristics [1/3]
Number of nodes
• 75% of WFs have < 40 tasks
• 95% of WFs have < 200 tasks
12
Intrinsic Workflow Characteristics [2/3]
Task Work Size
• >80% of WFs take < 2 minutes on a 1000-SI2k machine
• >95% of WFs take < 10 minutes on a 1000-SI2k machine
14
Classes of Workflows
• Simple classifier (experience from previous work)
• Future: data mining techniques
15
Environment-Related Characteristics
• Workflow class matters: better speedup (S) for “easier” classes
  • Large-and-Flat “easier” than Large-and-Branchy
  • Large-and-Branchy “easier” than Branchy (overhead)
17
Conclusion and Future Work
• Method for the analysis of grid workflows
  • Intrinsic workflow characteristics
  • Environment-dependent workflow characteristics
  • More statistical details than average/std. deviation (Normal is not the typical distribution in computer science)
• Analysis of two workflow-based traces from Austrian Grid
• Future work
  • Apply method to more traces
  • Design workflow-based grid workload model
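The point about reporting more than average and standard deviation can be illustrated with a small skewed sample. The numbers below are invented for illustration and are not taken from the Austrian Grid traces; `fraction_below` is a hypothetical helper that mirrors statements of the form "x% of WFs have fewer than k tasks".

```python
import statistics

# Hypothetical workflow sizes (tasks per workflow); NOT trace data.
sizes = [3, 4, 4, 5, 6, 8, 10, 12, 20, 35, 180, 950]

mean = statistics.mean(sizes)
stdev = statistics.stdev(sizes)
median = statistics.median(sizes)
# Quartile cut points (default 'exclusive' method of the statistics module).
q1, q2, q3 = statistics.quantiles(sizes, n=4)


def fraction_below(values, threshold):
    """Empirical fraction of values strictly below the threshold."""
    return sum(v < threshold for v in values) / len(values)


print(f"mean={mean:.1f} stdev={stdev:.1f}")
print(f"median={median} quartiles=({q1}, {q2}, {q3})")
print(f"fraction of workflows with < 40 tasks: {fraction_below(sizes, 40):.2f}")
```

In this made-up sample a few very large workflows pull the mean above the third quartile, so the mean and standard deviation alone would give a misleading picture of the typical workflow, whereas the quantiles and threshold fractions describe the bulk of the data.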