performance metrics and ontology for describing performance data of grid workflows

21
Hong-Linh Truong , Thomas Fahringer, Francesco Nerieri Distributed and Parallel Systems Group Institute for Computer Science, University of Innsbruck {truong,tf,nero}@dps.uibk.ac.at Schahram Dustdar Information Systems Institute, Vienna University of Technology [email protected] http://dps.uibk.ac.at/projects/pma 1st Performability Workshop, CCGrid05, Cardiff 09 May, 2005 Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Upload: hong-linh-truong

Post on 22-Apr-2015

1.823 views

Category:

Education


0 download

DESCRIPTION

Many Grid workow middleware services require knowledge about the performancebehavior of Grid applications/services in order to e ectively select, compose, andexecute workows in dynamic and complex Grid systems. To provide performanceinformation for building such knowledge, Grid workow performance tools haveto select, measure, and analyze various performance metrics of workows. However,there is a lack of a comprehensive study of performance metrics which canbe used to evaluate the performance of a workow executed in the Grid. Moreover,given the complexity of both Grid systems and workows, semantics of essentialperformance-related concepts and relationships, and associated performance datain Grid workows should be well described. In this paper, we analyze performancemetrics that performance monitoring and analysis tools should provide during theevaluation of the performance of Grid workows. Performance metrics are associatedwith multiple levels of abstraction. We introduce an ontology for describingperformance data of Grid workows and illustrate how the ontology can be utilizedfor monitoring and analyzing the performance of Grid workows.

TRANSCRIPT

Page 1: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Hong-Linh Truong, Thomas Fahringer, Francesco NerieriDistributed and Parallel Systems Group

Institute for Computer Science, University of Innsbruck {truong,tf,nero}@dps.uibk.ac.at

Schahram DustdarInformation Systems Institute, Vienna University of Technology

[email protected]

http://dps.uibk.ac.at/projects/pma

1st Performability Workshop, CCGrid05, Cardiff 09 May, 2005

Performance Metrics and Ontology for Describing Performance Data of Grid

Workflows

Page 2: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 2

OutlineOutline

Motivation

Grid workflows and workflow execution model

Performance metrics of Grid workflows

WfPerfOnto: Ontology for describing performance data of Grid workflows

Utilizing WfPerfOnto

Conclusion and Future work

Page 3: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 3

MotivationMotivationLack of comprehensive study of useful performance metrics for

Grid workflowsA few metrics are studied and supportedMost of metrics are being limited to the activity (task) level.

study performance metrics at multiple levels of abstraction

Describing and sharing performance data of Grid workflowsHighly heterogeneous, inter-related and dynamicInter-organizational Multiple types of performance and monitoring data provided by various tools

an ontology for performance data • Can be used to describe concepts associated with workflow

executions• Will facilitate the performance data sharing

Page 4: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 4

mProject1.c

int main() {

}

Hierarchical Structure View of a Workflow

WorkflowWorkflow

A();

<parallel>

</parallel>

Workflow Construct nWorkflow Construct n

Activity mActivity m

Invoked Application mInvoked Application m

Code Code Region 1Region 1

Code Code Region qRegion q

Code Code Region Region ……

<activity name="mProject2"><executable name="/home/truong/mProject2"/>

</activity>

<activity name="mProject1"><executable name="/home/truong/mProject1"/>

</activity>

while () {...}

Page 5: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 5

Workflow Execution Model (Simplified)

Workflow execution Spanning multiple Grid sites Highly inter-organizational, inter-related and dynamic

Multiple levels of job scheduling At workflow execution engine (part of WfMS)At Grid sites

Local schedulerLocal scheduler

Page 6: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 6

Performance Metrics of Grid Workflows

Interesting performance metrics associated with multiple levels of abstraction

Metrics can be used in workflow composition, forcomparing different invoked applications of a single activity, etc.

Five levels of abstractionCode region, Invoked applicationActivity,Workflow construct, Workflow

Performance metrics of a lower level can be used to construct similar metrics for the immediate higher-level

By using aggregate operatorBased on metric definition and structure of workflows

Page 7: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 7

Performance Metrics at Code Region Level

Most existing conventional performance tools provide these metricsExisting workflow monitoring and analysis tools normally do notChallenging issues

Integrate conventional performance monitoring tools into workflow monitoring tools

CondSynTime, ExclSynTimeSynchronizationTotalCommTime, TotalTransSizeData Movement

Counter

CachMissRatio, MFLOPS, etc.temporal overhead of parallel code regionsTemporal overhead

MeanElapsedTime, CommPerComp, MeanTransRate, MeanTranSizeRatio

NCalls, NSubs, RecvMsgCount, SendMsgCountL2_TCM, L2_TCA, etc., (hardware counters)

CategoryElapsedTIme, UserCPUTime, SystemCPUTime, SerialTime, EncodingTime

Execution timeMetric

Page 8: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 8

Performance Metrics at Invoked ApplicationLevel

Most metrics can be constructed from metrics at code regionlevel

FailedFreqRatio

NCallFailed

SpeedupFactorPerformance Improvement

NCallsCounter

FailedTime

CategoryElapsedTimeExecution time

Metric

Page 9: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 9

Performance Metrics at Activity Level

Metrics can be defined for both activity and activity instance

Aggregate metrics of an activity can be defined based on itsinstances and the execution of instances at runtime

Challenging problemsHow to monitor and correlate metrics when a resource is shared among applications

FailedTime, SharedResTime

SlowdownFactorPerformance Improvement

SynDelay, ExecDelaySynchronizationThroughput, MeanTimePerState, TransRateRatioRedandantActivity, NIteration, PathSelectionRatio, ResUtilizationCounter

CategoryElapsedTime, ProcessingTime, QueuingTime, SuspendingTimeExecution time

Metric

Page 10: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 10

Performance Metrics at Workflow ConstructLevel

Generic and construct-specific metricsRedundantProcessingResource

SpeedupFactorPerformance ImprovementLoadIm (Load imbalance)Load balancingNIteration, PathSelectionRatio, ResUtilizationRedandantActivity, Counter

CategoryElapsedTime, ProcessingTimeExecution time

Metric

Aggregate metrics of a workflow construct/workflow constructinstance are defined based on the structure of the construct. E.g.,

LoadIm (load imbalance) is for parallel constructElapsedTime/ProcessingTime is defined based on critical path

Page 11: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 11

Performance Metrics at Workflow Level

SpeedupPerformance ImprovementNAPerRes,ProcInRes,LoadImResCorrelation

QueuingRatio, MeanProcessingTime, MeanQueuingTime, ResUtilization

RatioParTime,SeqTime

CategoryElapsedTime,ProcesingTimeExecution time

Metric

Page 12: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 12

Performance Metrics Ontology

WfMetricOnto OWL-based performance metrics ontology

Metrics ontologySpecifies which performance metrics a tool can provideSimplifies the access to performance metrics provided by various tools

Page 13: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 13

Monitoring and Measuring Performance Metrics

Performance monitoring and analysis toolsOperate at multiple levelsCorrelate performance metrics from multiple levels

Middleware and application instrumentationInstrument execution engine of WfMS • Execution engine can be distributed or centralized

Instrument applications • Distributed, spanning multiple Grid sites

Challenging problems: Performance tool and data complexity Integrate multiple performance monitoring tools executed on multiple Grid sites Integrate performance data produced by various tools

Page 14: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 14

Ontology Describing Performance Data of Grid Workflows

ObjectivesUnderstanding basic concepts associated with performance data ofGrid workflowsPerformance data integration for Grid workflowsTowards distributed/intelligent performance analysis

WfPerfOnto (Ontology describing Performance data of GridWorkflows)

Basic concepts• Concepts reflects the hierarchical view of a workflow • Static and dynamic performance and monitoring data of workflow

Relationships

• Static and dynamic relationships among concepts

Page 15: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 15

Ontology for Describing Performance Dataof Grid Workflows

WfPerfOnto

Page 16: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 16

Utilizing WfPerfOnto

Describing Performance Data and Data IntegrationDifferent monitoring and analysis tools can store/export performance data in/to ontological representationHigh-level search and retrieval of performance data

Knowledge base performance data of Grid workflows Utilized by high-level tools such as schedulers, workflow composition tools, etc. Used to re(discover) workflow patterns, interactions in workflows, to check correct execution, etc.

Distributed Performance AnalysisPerformance analysis requests can be built based on WfPerfOnto

Page 17: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 17

Utilizing WfPerfOnto: DescribingPerformance Data

<rdf:Description rdf:about="http://dps.uibk.ac.at/wfperfonto#mImgtbl21">

<wfperfonto:hasPerfMetric rdf:resource="http://dps.uibk.ac.at/wfperfonto#ElapsedTime78"/>

<rdf:type rdf:resource="http://dps.uibk.ac.at/wfperfonto#ActivityInstance"/>

<wfperfonto:instanceName>mImgtbl21</wfperfonto:instanceName>

<wfperfonto:hasPerfMetric rdf:resource="http://dps.uibk.ac.at/wfperfonto#QueuingTime80"/>

<wfperfonto:ofActivity rdf:resource="http://dps.uibk.ac.at/wfperfonto#mImgtbl2"/>

<wfperfonto:hasPerfMetric rdf:resource="http://dps.uibk.ac.at/wfperfonto#ProcessingTime79"/>

</rdf:Description>

Page 18: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 18

Utilizing WfPerfOnto: CheckingCorrect Execution

<rdf:Description rdf:about="http://dps.uibk.ac.at/wfperfonto#Seq4ForkJoin5">

<wfperfonto:hasActivityInstance rdf:resource="http://dps.uibk.ac.at/wfperfonto#tRawImage4"/>

<wfperfonto:hasActivityInstance rdf:resource="http://dps.uibk.ac.at/wfperfonto#tProjectedImage4"/>

<wfperfonto:hasPerfMetric rdf:resource="http://dps.uibk.ac.at/wfperfonto#ElapsedTime57"/>

<wfperfonto:hasPerfMetric rdf:resource="http://dps.uibk.ac.at/wfperfonto#QueuingTime59"/>

<wfperfonto:hasActivityInstance rdf:resource="http://dps.uibk.ac.at/wfperfonto#mProject14"/>

<wfperfonto:hasActivityInstance rdf:resource="http://dps.uibk.ac.at/wfperfonto#mImgtbl14"/>

<rdf:type rdf:resource="http://dps.uibk.ac.at/wfperfonto#WorkflowConstructInstance"/>

<wfperfonto:ofWorkflowConstruct rdf:resource="http://dps.uibk.ac.at/wfperfonto#SeqForkJoin"/>

<wfperfonto:hasPerfMetric rdf:resource="http://dps.uibk.ac.at/wfperfonto#ProcessingTime58"/>

<wfperfonto:instanceName>Seq4ForkJoin5</wfperfonto:instanceName>

</rdf:Description>

Page 19: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 19

Utilizing WfPerfOnto: Distributed Performance Analysis

Resources Applications

Monitoring Service

DIPAS

Clients External Tools

KnowledgeBuilder Agent

Grid analysisagent

Grid analysisagent

Grid analysisagent

Grid analysisagent

Grid analysisagent

GOM

Page 20: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 20

Utilizing WfPerfOnto: Analysis Request

Grid analysis agentGrid analysis agent

Analysisagent

Monitoringagent

Ontological Ontological datadata

Requests based on WfPerfOnto

To the Monitoring Service

Page 21: Performance Metrics and Ontology for Describing Performance Data of Grid Workflows

Performance Metrics and Ontology for Describing Performance Data of Grid Workflows, CCGrid 05 21

ConclusionConclusion and Future and Future WorkWorkPerformance metrics of Grid workflows that characterize

the performance and dependability of Grid workflows; metrics associated with multiple levels of abstraction

Ontology describing performance data of Grid workflows

Current implementationOWL-based ontologies, Jena toolkit for processing ontology-related taskStore and export performance data in/to WfPerfOnto representation

Future workExtend and revise performance metrics and WfPerfOntoDistributed performance analysis Reasoning performance data

Shared conceptualization community work?