S-Cube LP: Analyzing Business Process Performance Using KPI Dependency Analysis

www.s-cube-network.eu S-Cube Learning Package Analyzing Business Process Performance Using KPI Dependency Analysis University of Stuttgart (USTUTT), TU Wien (TUW) Branimir Wetzstein, USTUTT

Post on 23-Jan-2015


Page 1: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

www.s-cube-network.eu

S-Cube Learning Package

Analyzing Business Process Performance Using KPI Dependency Analysis

University of Stuttgart (USTUTT), TU Wien (TUW)

Branimir Wetzstein, USTUTT

Page 2: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Learning Package Categorization

S-Cube

Adaptable Coordinated Service Compositions

Adaptable and QoS-aware Service Compositions

Analyzing Business Process Performance Using KPI Dependency Analysis

Page 3: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Learning Package Overview

Problem Description

KPI Dependency Analysis

Discussion

Conclusions

Page 4: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Let’s Consider a Scenario (1)

Assume we have implemented a business process as a service orchestration.

It is a reseller process which interacts with external services of the customer, suppliers, bank, and shipper, and with internal services such as the warehouse.

Page 5: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Let’s Consider a Scenario (2)

We are interested in measuring the performance of business processes (time, cost, quality, customer satisfaction).

This is done by defining Key Performance Indicators (KPIs), which specify target values on key metrics based on business goals.
– A KPI target value function maps metric value ranges to KPI classes (e.g., “good”, “medium”, “bad”)

Some typical KPI metrics in our scenario:
– Order Fulfillment Lead Time
– Perfect Order Fulfillment (in time and in full)
– Customer Complaint Rate
– Availability of the reseller service

Page 6: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Let’s Consider a Scenario (3)

In the first step, KPIs are monitored at process runtime for a set of process instances (what?).

If monitoring of the KPIs shows unsatisfactory results, we want to be able to analyze and explain the violations (why?).

That is not trivial, as a KPI often depends on many influential factors measured by lower-level metrics.

[Figure: Purchase Order Process, “is measured by”: Order Fulfillment Lead Time; PPMs (Avail. in Stock, Customer, Products, …); QoS (Service Availability, Response Time)]

Page 7: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Learning Package Overview

Problem Description

KPI Dependency Analysis

Discussion

Conclusions

Page 8: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Architectural Overview (1)

Page 9: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Architectural Overview (2)

Model and deploy the business process (e.g., in WS-BPEL).

Define and monitor a set of KPIs and potential influential metrics.
– Event-based monitoring based on Complex Event Processing (CEP)
– Supports in particular both process events and QoS events and their correlation

Train a decision tree (KPI Dependency Tree) from the monitored data.
– Gather the monitored data from the Metrics DB
– Classify the monitored process instances according to their KPI class
– Use decision tree learning algorithms to learn the dependencies between the KPI and the lower-level metrics

Page 10: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Background: Event-Based Monitoring

In order to be used for analysis, runtime data needs to be monitored.

Event-based monitoring is a commonly used way to implement this.

Basic principle: register for and receive lifecycle events from the service composition, and use Complex Event Processing (CEP) to extract, correlate, and aggregate monitoring data from the raw event data.

This can be used to monitor both QoS and domain-specific data.
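The extract, correlate, and aggregate steps can be illustrated with a minimal sketch. It is plain Python rather than a CEP engine, and the event layout (the `instance`, `source`, `name`, `time`, `service`, and `value` fields) is a hypothetical simplification, not the BPEL event model used by the approach.

```python
from collections import defaultdict

# Hypothetical raw events; field names are assumptions made for this sketch.
events = [
    {"instance": "p1", "source": "process", "name": "started", "time": 0.0},
    {"instance": "p1", "source": "qos", "name": "response_time",
     "service": "supplier", "value": 1.5},
    {"instance": "p2", "source": "process", "name": "started", "time": 2.0},
    {"instance": "p1", "source": "process", "name": "completed", "time": 30.0},
]


def aggregate(raw_events):
    """Correlate events by process instance id and derive per-instance metrics."""
    per_instance = defaultdict(dict)
    for event in raw_events:
        collected = per_instance[event["instance"]]
        if event["source"] == "process" and event["name"] == "started":
            collected["start"] = event["time"]
        elif event["source"] == "process" and event["name"] == "completed":
            collected["end"] = event["time"]
        elif event["source"] == "qos":
            # QoS events are attached to the process instance they belong to.
            collected[f'{event["service"]}.{event["name"]}'] = event["value"]

    metrics = {}
    for instance_id, collected in per_instance.items():
        row = {k: v for k, v in collected.items() if k not in ("start", "end")}
        # The duration is only known once both lifecycle events have arrived.
        if "start" in collected and "end" in collected:
            row["duration"] = collected["end"] - collected["start"]
        metrics[instance_id] = row
    return metrics


metrics = aggregate(events)
```

Here `metrics["p1"]` combines a process-level metric (`duration`) with a QoS metric (`supplier.response_time`), while the still-running instance `p2` has no duration yet.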

Page 11: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Background: Monitoring of Service Orchestrations

Our monitoring approach for service orchestrations supports:
– Process Performance Metrics (PPMs) based on process events (BPEL event model)
– QoS metrics based on QoS events provided by QoS monitors
– Correlation of process events and QoS events
– Metric calculation based on Complex Event Processing (ESPER)

[Figure: monitoring architecture with Process Engine, Listener, QoS Monitor, Service, Complex Event Processing, Metric definitions, Metrics Database, and Dashboard, connected by events]

Page 12: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

KPI Dependency Analysis - Motivation

So far we are able to monitor metrics and find out which KPI targets are violated (what?).

In the next step, we want to explain the violations (why?).

That is not trivial, as a KPI often depends on many influential factors measured by lower-level metrics.

Typically, such an analysis is done manually (if at all) by a business analyst using OLAP queries on a data warehouse.
– That is very cumbersome and time-consuming
– We want to “discover” the problems in an automated way
– Therefore we can use data mining techniques

In particular, we construct a classification problem and use existing classification learning techniques (decision trees).

Page 13: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Background: Machine Learning and Data Mining

Automated discovery of interesting patterns from large amounts of data (typically stored in data warehouses)
– Manual discovery (e.g., by using OLAP queries) could take days or weeks

Functionalities include:
– Mining of association rules, correlation analysis
– Classification and prediction (our focus here)
– Clustering
– Time-series analysis
– Graph mining and text mining

Interdisciplinary field using techniques from machine learning, statistics, pattern recognition, and data visualization

Page 14: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Background: Classification Learning (1)

Given: a (historical) dataset containing a set of instances described in terms of:
– a set of explanatory (a.k.a. predictive) categorical or numerical attributes
– a categorical target attribute (a.k.a. class)

Goal: based on the historical dataset (“supervised learning”), create a classification model which helps with…
– explaining the dependencies between the class and the explanatory attributes in historical data (interpretation)
– making predictions about future data, i.e., predicting the class based on future explanatory attribute values (prediction)

Some classification learning techniques:
– Decision Trees
– Classification Rules
– Support Vector Machines

Page 15: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Background: Classification Learning (2)

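[Figure: classification learning workflow. Training phase: Training Data is fed into the Classification Algorithm, which produces the Classification Model. Test phase: the model is evaluated on Test Data. Prediction phase: the model maps the explanatory attribute values of New Data to a Predicted Class. Interpretation phase: the model yields Knowledge.]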

Page 16: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Background: Decision Tree Learning

[Figure: example decision tree. The root node A1 has three outgoing edges (< 2, 2 < x < 4, > 4); the inner nodes A2, A3, and A4 branch on yes/no or on numeric thresholds (< 3 / > 3, < 1 / > 1); the leaves are labeled with a class (C1 to C4) and instance counts such as 50, 20/1, or 80/2]

A non-leaf node represents an explanatory categorical or numeric attribute.

Outgoing edges represent conditions on the values of the parent explanatory attribute.

A leaf node represents a target attribute class.

A path shows which attribute values lead to a certain class. The leaf node shows the corresponding number of instances from the training set.
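How a path through such a tree leads to a class can be sketched in a few lines. The tree below is a hypothetical example in the spirit of the figure (attribute names A1 to A3 and classes C1 to C4 are placeholders, and the exact shape of the original figure is not reproduced); the tuple-based representation is an assumption of this sketch.

```python
# A non-leaf node is (attribute, [(condition, subtree), ...]); a leaf is a class label.
# The concrete thresholds and classes are illustrative only.
tree = ("A1", [
    (lambda v: v < 2, ("A2", [
        (lambda v: v == "yes", "C1"),
        (lambda v: v == "no", "C2"),
    ])),
    (lambda v: 2 <= v <= 4, "C3"),
    (lambda v: v > 4, ("A3", [
        (lambda v: v < 1, "C4"),
        (lambda v: v >= 1, "C1"),
    ])),
])


def classify(node, instance):
    """Follow the edges whose conditions match until a leaf (class label) is reached."""
    while not isinstance(node, str):
        attribute, branches = node
        value = instance[attribute]
        # Take the first outgoing edge whose condition holds for this value.
        node = next(subtree for condition, subtree in branches if condition(value))
    return node


predicted = classify(tree, {"A1": 1, "A2": "no"})
```

Only the attributes on the chosen path are read, so an instance with `A1 = 3` can be classified without knowing `A2` or `A3`.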

Page 17: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

KPI Dependency Analysis

The KPI class of a process instance (alternatively: choreography instance, activity instance, business object, …) depends on a set of influential factors (PPMs and QoS metrics).

To find those dependencies, we use classification learning:
– The data set consists of a set of (historical) process instances; for each process instance, the KPI class and a set of metrics are evaluated
– The KPI is the target attribute, which maps values of the underlying metric to categorical values (KPI classes)
– The potential influential lower-level metrics are the explanatory attributes (predictive variables)
– Goal: based on a set of monitored instances, create a classification model (decision tree) which identifies recurring relationships among the explanatory attributes that describe the instances belonging to the same KPI class

The decision tree (KPI dependency tree) can be used to explain the KPI classes of past process instances and also to predict the class of process instances for which only the values of some of the lower-level metrics are known.

Page 18: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Defining KPIs

The KPI definition includes:
– KPI metric definition (e.g., order fulfillment time)
– A set of at least two categorical values defining the KPI classes (e.g., “green”, “yellow”, “red”)
– Target value function mapping KPI metric values to KPI classes (e.g., m < 2 days → green; 2 days ≤ m < 4 days → yellow; otherwise red)

The KPI metric is specified for a monitored entity type:
– Process instance (e.g., duration of a reseller process instance)
– Activity instance (e.g., duration of the supplier service invocation)
– Choreography instance (e.g., duration …)
– Service endpoint (e.g., availability)
– Set of process instances per day (e.g., average duration)
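A target value function of the kind described above can be sketched as a small lookup over thresholds. The helper name `make_target_function` is hypothetical, and assigning a value that lies exactly on a threshold to the higher class is an assumption of this sketch.

```python
from bisect import bisect_right


def make_target_function(thresholds, classes):
    """Build a function mapping a metric value to a KPI class.

    thresholds must be sorted ascending; classes needs one more entry than
    thresholds (one class per value range).
    """
    if len(classes) != len(thresholds) + 1:
        raise ValueError("need exactly one more class than thresholds")

    def target_function(value):
        # bisect_right counts how many thresholds are <= value, which is
        # the index of the range the value falls into.
        return classes[bisect_right(thresholds, value)]

    return target_function


# Example from the slide: below 2 days green, below 4 days yellow, otherwise red.
order_fulfillment_class = make_target_function([2, 4], ["green", "yellow", "red"])
```

Calling `order_fulfillment_class(1.5)` yields `"green"` and `order_fulfillment_class(6)` yields `"red"`; a two-class KPI only needs a single threshold.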

Page 19: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Generating Metric Definitions

A set of metric definitions (representing potential influential factors) can be generated automatically.

We support rules to automatically generate the following metrics:
– Service invocation:
  - availability and response time of the invoked service (for both synchronous and asynchronous (invoke-receive) invocations)
– WS-BPEL invoke activity (other basic activities are not interesting for long-running processes):
  - execution time of the activity (i.e., the time between starting and finishing the activity)
  - if part of a loop, in addition:
    - average/minimum/maximum execution time per process instance
    - number of executions per process instance
– For every branching activity and for fault, compensation, and event handlers, we generate a metric representing the branch that has been executed

Metrics based on process variable data elements are created manually.
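The generation rules above can be sketched as a function over a simplified process description. The dictionary layout of an activity, the type names in `BRANCHING_TYPES`, and the metric naming scheme are all hypothetical; the actual approach works on WS-BPEL process models.

```python
# Illustrative type names for constructs that select a branch (assumption).
BRANCHING_TYPES = {"if", "pick", "faultHandlers", "compensationHandler", "eventHandlers"}


def generate_metric_definitions(activities):
    """Apply the generation rules to a list of simplified activity descriptions."""
    definitions = []
    for activity in activities:
        name = activity["name"]
        if activity["type"] == "invoke":
            service = activity["service"]
            # Rule 1: QoS metrics of the invoked service.
            definitions += [f"{service}.availability", f"{service}.response_time"]
            # Rule 2: execution time of the invoke activity itself.
            definitions.append(f"{name}.execution_time")
            if activity.get("in_loop"):
                # Inside a loop, add per-instance aggregates and a counter.
                definitions += [f"{name}.execution_time.{agg}"
                                for agg in ("avg", "min", "max")]
                definitions.append(f"{name}.executions")
        elif activity["type"] in BRANCHING_TYPES:
            # Rule 3: record which branch was executed.
            definitions.append(f"{name}.branch")
        # Other basic activities (e.g., assign) produce no metrics.
    return definitions


activities = [
    {"type": "invoke", "name": "checkStock", "service": "warehouse"},
    {"type": "invoke", "name": "orderFromSupplier", "service": "supplier",
     "in_loop": True},
    {"type": "if", "name": "inStock"},
    {"type": "assign", "name": "copyOrder"},
]
definitions = generate_metric_definitions(activities)
```

For this hypothetical process the rules yield eleven metric definitions; metrics over process variables, such as an order amount, would still have to be added by hand.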

Page 20: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Data Preparation and Learning

Create a KPI Analysis Model
– Select a KPI and a set of potential influential factors

Gather the metric values of the monitored entity instances and create a training set:
– Each monitored entity instance, with its KPI class and influential metric values, maps to a row in the training set

A decision tree is learned (e.g., using the J48 algorithm).
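The learning step can be sketched with a simplified ID3-style learner that picks the attribute with the highest information gain at each node. This stands in for J48 and is not the WEKA implementation: it handles only categorical attributes and does no pruning. The training rows and metric names are invented for illustration, with numeric metrics assumed to be discretized beforehand.

```python
import math
from collections import Counter

TARGET = "kpi"  # column holding the KPI class of each instance


def entropy(rows):
    """Entropy of the KPI class distribution in rows."""
    counts = Counter(row[TARGET] for row in rows)
    return -sum(c / len(rows) * math.log2(c / len(rows)) for c in counts.values())


def information_gain(rows, attribute):
    """Reduction in entropy achieved by splitting rows on attribute."""
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


def learn_tree(rows, attributes):
    """Return a class label (leaf) or (attribute, {value: subtree}) (inner node)."""
    classes = Counter(row[TARGET] for row in rows)
    majority = classes.most_common(1)[0][0]
    if len(classes) == 1 or not attributes:
        return majority
    best = max(attributes, key=lambda a: information_gain(rows, a))
    if information_gain(rows, best) <= 0:
        return majority  # no attribute separates the classes any further
    remaining = [a for a in attributes if a != best]
    return (best, {
        value: learn_tree([r for r in rows if r[best] == value], remaining)
        for value in sorted({row[best] for row in rows})
    })


# One row per monitored process instance (hypothetical data).
training_set = [
    {"order_in_stock": "yes", "shipper_delivery": "fast", "customer_type": "gold", "kpi": "green"},
    {"order_in_stock": "yes", "shipper_delivery": "slow", "customer_type": "silver", "kpi": "green"},
    {"order_in_stock": "no", "shipper_delivery": "fast", "customer_type": "gold", "kpi": "green"},
    {"order_in_stock": "no", "shipper_delivery": "slow", "customer_type": "gold", "kpi": "red"},
    {"order_in_stock": "yes", "shipper_delivery": "fast", "customer_type": "silver", "kpi": "green"},
    {"order_in_stock": "no", "shipper_delivery": "slow", "customer_type": "silver", "kpi": "red"},
    {"order_in_stock": "yes", "shipper_delivery": "slow", "customer_type": "gold", "kpi": "green"},
]
kpi_tree = learn_tree(training_set, ["order_in_stock", "shipper_delivery", "customer_type"])
```

On this data the learned tree splits on `order_in_stock` first and on `shipper_delivery` below the "no" branch, while `customer_type` does not appear because it carries almost no information about the KPI class.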

Page 21: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

KPI Dependency Analysis

[Figure: KPI dependency analysis at design time and runtime. The recoverable content is listed below.]

KPI Analysis Model
– Metric: Order Fulfillment Lead Time
– Analyzed time window: last 2 months
– Instances filter: customerType = “gold”
– Target value: < 5 days → green; >= 5 days → red
– Metric set: M = {orchestration.all, qos.all}
– Algorithm: Classification Tree (J48)

Monitor Model
– Choreography level: Order Fulfillment Lead Time, Delivery Time Shipment, …
– Orchestration level: Order In Stock, Delivery Time Supplier, Order amount, Packaging time, …
– Service infrastructure level: Process Av., response time banking service, …

Metric values (excerpt)

KPI     Supplier Deliv. Time   Order In Stock   Process Availability   …
Red     28 h                   No               1,00                   …
Green   N/A                    yes              0,84                   …
Red     32 h                   No               0,9                    …
…       …                      …                …                      …

Decision tree (result of decision tree learning): inner nodes Delivery Time Shipment, Delivery Time Supplier, and Order In Stock?; edges labeled < 2, 2 < x < 4, > 4, yes/no, < 3 / > 3, < 1 / > 1; leaves labeled green or red with instance counts (green 20/1, green 50, red 20, red 10, red 80/2, green 30, green 5)

Design time: designate a metric as KPI (KPI definition). Runtime: metric values are fed into decision tree learning.

Page 22: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Prototype Implementation

The prototype is based on…

Apache ODE (BPEL execution engine)
– Publishes events to JMS topics
– A standalone QoS monitor evaluates the QoS metrics of services

Monitoring Tool
– Based on the ESPER CEP framework
– Metrics DB in MySQL
– BAM dashboard as a Java Swing application

Process Analyzer
– Uses the WEKA machine learning toolkit

Page 23: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Learning Package Overview

Problem Description

KPI Dependency Analysis

Discussion

Conclusions

© Philipp Leitner

Page 24: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Experimental Results

Generated tree for KPI = order fulfillment time (J48 algorithm)

The tree contains the expected influential metrics and produces suitable results ‘out of the box’.

In our setting, on a standard laptop computer, generating a decision tree based on 1000 instances takes about 30 sec.

Page 25: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Experimental Results: Drill-Down

Generated tree for KPI = “order in stock” (J48 algorithm)

Here, we perform a “drill-down” analysis by setting the metric “order in stock” as the KPI.

We want to understand which factors influence whether an order can be processed from stock.

Page 26: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Experimental Results: Differences between Algorithms

We have experimented with J48 and ADTree and generated trees for different numbers of process instances (100, 400, 1000).
– The ADTree algorithm produces bigger trees than J48 (third column: number of leaves and nodes) for the same number of instances. However, it also reaches a higher precision (last column: correctly classified instances).
– Both algorithms show very similar results concerning the displayed influential metrics. Typically only one or at most two (marginal) metrics differ.

Page 27: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Experimental Results: Tree Size

Trees get bigger with the number of process instances
– For 400 instances, J48 generated a tree with 11 nodes; for 1000 instances, a tree with 18 nodes, while the precision improved by only 1%
– When the tree gets bigger, it shows factors which have only marginal influence and thus make the tree less readable (‘Displayed Metrics’ shows how many distinct metrics are displayed in the tree)

Page 28: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Experimental Results: Tree Size (2)

To improve the readability…
– Using algorithm parameters led to only marginal changes in our experiments (for example, J48 -U with no pruning). The only parameter that turned out to be useful for reducing the size of the tree was ‘reduced error pruning’ (J48 -R)
– Another option, in the case of too many undesirable (marginal) metrics, is to simply remove those metrics from the set of potential influential factors and repeat the analysis

Page 29: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Some Important Earlier Work

We were not the first to have similar ideas.

Important earlier work includes:

Castellanos, M., et al., 2005. iBOM: a platform for intelligent business operation management. In: Proceedings of the 21st International Conference on Data Engineering (ICDE'05). Washington, DC: IEEE Computer Society, 1084–1095.

M. Castellanos, F. Casati, U. Dayal, and M.-C. Shan, “A Comprehensive and Automated Approach to Intelligent Business Processes Execution Analysis,” Distributed and Parallel Databases, vol. 16, no. 3, pp. 239–273, 2004.

Page 30: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Main Advances Over Earlier Work

The S-Cube approach to KPI analysis based on event logs improves on earlier work in some important aspects:
– KPI Dependency Analysis incorporates both process-level metrics and QoS metrics
– Semi-automated generation of potential influential metric definitions for WS-BPEL processes
– Many different algorithms can be used for analysis
  - Courtesy of the WEKA backend

Page 31: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Discussion - Advantages

The KPI Dependency Analysis based on decision trees has a number of clear advantages…
– Simplicity – the basic approach is relatively easy to understand; the generated trees can also be understood by non-IT users
– Efficiency – the analysis of influential factors is automated; the traditional approach is to manually pose analysis questions using OLAP queries over data marts, which is much more time-consuming
– Proven in the real world – machine learning is by now a proven technique that has been successfully applied in many areas

Page 32: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Discussion - Disadvantages

… but of course the approach also has some disadvantages.
– Bootstrapping problem – the approach assumes that some recorded historical event logs are available for training
– Necessary domain knowledge – some domain knowledge is necessary to define the set of potential influential metrics
– Availability of monitoring data – one of the basic assumptions of the approach is that all necessary data can be monitored (if this is not the case, the approach cannot be used)

Page 33: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Learning Package Overview

Problem Description

KPI Dependency Analysis

Discussion

Conclusions


Page 34: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Summary

Classification-learning-based techniques can be used to explain performance problems in service compositions.

Steps:
1. Define a KPI and a set of potential influential metrics
2. Monitor all metrics for a set of process instances
3. Train a decision tree from the historical event log

The created KPI dependency tree explains the dependencies between the KPI classes and a set of lower-level process metrics and QoS metrics.

Page 35: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Further S-Cube Reading

Wetzstein, Leitner, Rosenberg, Brandic, Dustdar, and Leymann. Monitoring and Analyzing Influential Factors of Business Process Performance. In Proceedings of the 13th IEEE international conference on Enterprise Distributed Object Computing (EDOC'09). IEEE Press, Piscataway, NJ, USA, 118-127.

Wetzstein, Branimir; Leitner, Philipp; Rosenberg, Florian; Dustdar, Schahram; Leymann, Frank: Identifying Influential Factors of Business Process Performance Using Dependency Analysis. In: Enterprise Information Systems. Vol. 5(1), Taylor & Francis, 2010.

Kazhamiakin, Raman; Wetzstein, Branimir; Karastoyanova, Dimka; Pistore, Marco; Leymann, Frank: Adaptation of Service-Based Applications Based on Process Quality Factor Analysis. In: Proceedings of the 2nd Workshop on Monitoring, Adaptation and Beyond (MONA+), co-located with ICSOC/ServiceWave 2009.

Leitner, Wetzstein, Rosenberg, Michlmayr, Dustdar, and Leymann. Runtime Prediction of Service Level Agreement Violations for Composite Services. In Proceedings of the 2009 International conference on Service-Oriented Computing (ICSOC/ServiceWave'09), Springer-Verlag, Berlin, Heidelberg, 176-186.

Leitner, Michlmayr, Rosenberg, and Dustdar. Monitoring, Prediction and Prevention of SLA Violations in Composite Services. In Proceedings of the 2010 IEEE International Conference on Web Services (ICWS '10). IEEE Computer Society, Washington, DC, USA, 369-376.

Page 36: S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

Acknowledgements

The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] under grant agreement 215483 (S-Cube).
