Post on 21-Dec-2015


1

Trace-Based Characteristics of Grid Workflows

Alexandru Iosup and Dick Epema

PDS Group, Delft University of Technology

The Netherlands

Simon Ostermann, Radu Prodan, and Thomas Fahringer

DPS Group, University of Innsbruck

Austria

2

Why are Grid Workflows Interesting?

• Grids promise reliable and easy-to-use computational infrastructure for e-Science
• Full automation from experiment design to final result
• Often, automation = workflows
  • Jobs comprising inter-related computing and data-transfer tasks

3

Why are the Characteristics of Grid Workflows Interesting?

• For focusing on the right research problems
  • What are the interesting characteristics? Number of nodes? Number of edges? Other characteristics…
• For simulation studies
  • Optimizing a scheduler for one workload does not make it useful for another (often quite the contrary)
  • … optimizing for a workload type is better
• For performance evaluation in real environments
  • The system tuned to one workload

4

Outline

• Introduction
• Method for Grid Workflow Analysis
• Austrian Grid Traces
• Grid Workflow Characteristics
• Conclusion and Future Work

5

Method for Grid Workflow Analysis [1/3]: Overview

• Goal: establish the main characteristics of grid workflows, such that building a workflow-based grid workload model is greatly facilitated
• Grid workflow characteristics
  • Workflow-intrinsic
  • Environment-related

6

Method for Grid Workflow Analysis [2/3]: Intrinsic Workflow Characteristics

• Size and structure of the workflow
  • Number of nodes (N) / edges (E)
  • Branching Factor = N/E
  • Work Size = runtime of a task on a base platform [SI2k]
  • Work Size Variability = ratio of longest vs. shortest WF task
  • Sequential execution path
  • Critical execution path
  • Graph level (L) = length of critical execution path
• Arrival patterns
  • Daily patterns: Peak Hours
  • Weekly patterns: Week-end vs. Work Days
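The size and structure metrics listed on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' tooling: the function name `intrinsic_characteristics`, the input layout (a dict of task work sizes plus a list of edges), and the example workflow are all hypothetical. The branching factor follows the slide's N/E definition, and the graph level is assumed here to be the number of tasks on the longest chain of dependent tasks.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def intrinsic_characteristics(work_sizes, edges):
    """Compute intrinsic metrics for one workflow (hypothetical helper).

    work_sizes: {task: work size, e.g. seconds on the base platform}
    edges: list of (predecessor, successor) pairs forming a DAG
    """
    preds = {task: set() for task in work_sizes}
    for src, dst in edges:
        preds[dst].add(src)

    # Longest path ending at each task, measured both in work size
    # (critical path) and in number of tasks (assumed graph level).
    path_work, path_len = {}, {}
    for task in TopologicalSorter(preds).static_order():
        path_work[task] = work_sizes[task] + max(
            (path_work[p] for p in preds[task]), default=0.0)
        path_len[task] = 1 + max(
            (path_len[p] for p in preds[task]), default=0)

    n, e = len(work_sizes), len(edges)
    return {
        "N": n,
        "E": e,
        # Branching factor as written on the slide: N/E.
        "branching_factor": n / e if e else None,
        "work_size_variability":
            max(work_sizes.values()) / min(work_sizes.values()),
        "sequential_path": sum(work_sizes.values()),
        "critical_path": max(path_work.values()),
        "graph_level": max(path_len.values()),
    }


# Hypothetical diamond-shaped workflow: A -> {B, C} -> D.
example = intrinsic_characteristics(
    {"A": 10, "B": 30, "C": 60, "D": 20},
    [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")],
)
print(example)
```

In the example workflow the sequential execution path is the sum of all task work sizes, while the critical execution path follows A, C, D, the chain with the most work.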

7

Method for Grid Workflow Analysis [3/3]: Environment-Related WF Characteristics

• Time-related
  • Makespan (MS) = time between WF entering and exiting system
• Scheduler-related
  • Speedup (S) = Sequential Execution Path Size / MS
  • Normalized Schedule Length (NSL) = MS / Critical Path Size
• Failure-related
  • Success rate = % tasks finished correctly, per WF
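As a rough illustration, the environment-related metrics can be computed from a handful of per-workflow numbers. The function name `environment_characteristics` and its parameters are hypothetical, and the example values are made up. One assumption deserves emphasis: speedup is computed here in the conventional direction, sequential execution path size divided by makespan, so that larger values mean a better schedule.

```python
def environment_characteristics(submit_time, finish_time,
                                sequential_path, critical_path,
                                tasks_ok, tasks_total):
    """Environment-related metrics for one workflow (hypothetical helper).

    Times and path sizes must use the same unit (e.g. seconds).
    """
    # Makespan: time between the workflow entering and exiting the system.
    makespan = finish_time - submit_time
    return {
        "makespan": makespan,
        # Assumption: conventional speedup, sequential time / makespan.
        "speedup": sequential_path / makespan,
        # Normalized Schedule Length: makespan / critical path size.
        "nsl": makespan / critical_path,
        # Percentage of tasks that finished correctly in this workflow.
        "success_rate": 100.0 * tasks_ok / tasks_total,
    }


# Made-up numbers: a workflow with 120 s of total work and a 90 s
# critical path that spent 100 s in the system, all 4 tasks succeeding.
metrics = environment_characteristics(
    submit_time=0, finish_time=100,
    sequential_path=120, critical_path=90,
    tasks_ok=4, tasks_total=4,
)
print(metrics)
```

An NSL close to 1 means the makespan is close to the critical path size, which is the best any scheduler can do for that workflow on the base platform.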

8

Outline

• Introduction
• Method for Grid Workflow Analysis
• Austrian Grid Traces
• Grid Workflow Characteristics
• Conclusion and Future Work

9

The Austrian Grid Traces

• Austrian Grid: 8 sites, ~500 processors
• Two non-overlapping long-term traces from two workflow engines: Askalon DEE, Askalon EE2
• Workflows: mostly testing, but many jobs similar to production workflows
• Production areas: material sciences, astrophysics, weather prediction, engineering, movie rendering

10

Outline

• Introduction
• Method for Grid Workflow Analysis
• Austrian Grid Traces
• Grid Workflow Characteristics
• Conclusion and Future Work

11

Intrinsic Workflow Characteristics [1/3]: Number of nodes

• 75% WFs have < 40 tasks
• 95% WFs have < 200 tasks

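Statements such as "75% of WFs have fewer than 40 tasks" are points on the empirical distribution of workflow sizes. A minimal sketch of how such points could be read off a trace is shown below; the helper names and the sample sizes are invented for illustration and are not taken from the Austrian Grid traces.

```python
import math


def fraction_below(values, threshold):
    """Fraction of values strictly below the threshold."""
    values = list(values)
    return sum(1 for v in values if v < threshold) / len(values)


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


# Invented workflow sizes (number of tasks per workflow).
sizes = [3, 5, 8, 12, 20, 35, 38, 60, 150, 400]

share_under_40 = fraction_below(sizes, 40)
share_under_200 = fraction_below(sizes, 200)
p75 = percentile(sizes, 75)
print(share_under_40, share_under_200, p75)
```

Reporting a few such thresholds or percentiles conveys more about a skewed distribution than a mean and standard deviation alone.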

12

Intrinsic Workflow Characteristics [2/3]: Task Work Size

• >80% WFs take < 2 minutes on a 1000-SI2k machine
• >95% WFs take < 10 minutes on a 1000-SI2k machine


14

Classes of Workflows

• Simple classifier (experience from previous work)

• Future: data mining techniques

15

Environment-Related Characteristics

• Workflow class matters: better speedup (SU) for “easier” classes
  • Large-and-Flat “easier” than Large-and-Branchy
  • Large-and-Branchy “easier” than Branchy (overhead)

16

Outline

• Introduction
• Method for Grid Workflow Analysis
• Austrian Grid Traces
• Grid Workflow Characteristics
• Conclusion and Future Work

17

Conclusion and Future Work

• Method for the analysis of grid workflows
  • Intrinsic workflow characteristics
  • Environment-dependent workflow characteristics
  • More statistical details than average/std. deviation (Normal is not the typical distribution in computer science)
• Analysis of two workflow-based traces from Austrian Grid
• Future work
  • Apply method to more traces
  • Design workflow-based grid workload model

18

Thank you! Questions? Remarks? Observations?

Help build our community’s Grid Workloads Archive:

http://gwa.ewi.tudelft.nl/
