process mining

52
Process Mining: The next step in Business Process Management Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology Department of Information and Technology P.O. Box 513, 5600 MB Eindhoven The Netherlands [email protected] & Centre for Information Technology Innovation (CITI) Queensland University of Technology (QUT) Brisbane, Australia

Upload: freek-hermkens

Post on 31-Oct-2014


Page 1: Process Mining

Process Mining: The next step in Business Process Management

Prof.dr.ir. Wil van der Aalst
Eindhoven University of Technology
Department of Information and Technology
P.O. Box 513, 5600 MB Eindhoven
The Netherlands
[email protected]

&

Centre for Information Technology Innovation (CITI)
Queensland University of Technology (QUT)
Brisbane, Australia

Page 2: Process Mining

Outline

• Motivation
• Overview of process mining
– Basic performance metrics
– Process models
– Organizational models
– Social networks
– Performance characteristics
• Process Mining: Some of our tools
– EMiT
– Thumb
– MinSocN

• Conclusion

Page 3: Process Mining

Workflow/BPM in The Netherlands

• “The Netherlands is the country with the highest density of workflow systems per capita.” John O'Connell (CEO Staffware) (cf. population density per sq. km: 390 versus 2.5 for Australia)

• Emphasis on process modeling and analysis (the European way)

• Innovative companies like Pallas Athena, Baan, …

Page 4: Process Mining

I&T department, Eindhoven University of Technology

• Embedded in research institute BETA joining multiple disciplines

• Three subgroups:
– Business Process Management (workflow management, Petri nets, mining, ...)
– ICT Architectures (agents, transactions, ...)
– Software Engineering (software quality, ...)

• Team working on process mining: Wil van der Aalst, Ton Weijters, Ana Karla Alves de Medeiros, Boudewijn van Dongen, Eric Verbeek, Minseok Song, Monique Vullers-Jansen, Laura Maruster, …

Page 5: Process Mining

Motivation

Page 6: Process Mining
Page 7: Process Mining

25 years of workflow

[Figure: timeline of commercial workflow systems, 1980 to 2000, showing product lineages such as FlowMark / MQSeries Workflow, Staffware, FileNet WorkFlo / Visual WorkFlo / Panagon WorkFlo, Xerox InConcert / TIB/InConcert, Plexus FloWare / BancTec FloWare, COSA, and many others]

• Pioneers like Skip Ellis and Michael Zisman already worked on “office automation” in the 1970s

• The WFM hype is over …, but there are more and more applications, it has become a mature technology, and WFM is adopted by many other technologies (ERP, Web Services, etc.).

(Zur Muehlen 2003)

Page 8: Process Mining
Page 9: Process Mining

Let us reverse the process!

• Process mining can be used for:
– Process discovery (What is the process?)
– Delta analysis (Are we doing what was specified?)
– Performance analysis (How can we improve?)

• Particularly interesting in pre- and post-workflow settings!

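[Figure: process mining derives a process model from observed behavior; the example model contains the tasks Start, Register order, Prepare shipment, Ship goods, (Re)send bill, Receive payment, Contact customer, Archive order, and End]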

Page 10: Process Mining

Process mining: Overview

Page 11: Process Mining

Classification of process mining

The following types of process mining can be distinguished:

1) Determine basic performance metrics

2) Determine process model

3) Determine organizational model

4) Analyze social network (i.e., relations between actors)

5) Analyze performance characteristics (i.e., derive rules explaining performance)

Page 12: Process Mining

[Figure: overview of the five types of process mining: 1) basic performance metrics, 2) process model (the order-processing example from Start to End), 3) organizational model, 4) social network, 5) performance characteristics (rules of the form "If … then …")]

Page 13: Process Mining

(1) Determine basic performance metrics

• Process/control-flow perspective: flow time, waiting time, processing time, and synchronization time. Questions:
– What is the average flow time of orders?
– What is the maximum waiting time for activity approve?
– What percentage of requests is handled within 10 days?
– What is the minimum processing time of activity reject?
– What is the average time between scheduling an activity and actually starting it?

• Resource perspective: frequencies, time, utilization, and variability. Questions:
– How many times did Sue complete activity reject claim?
– How many times did John withdraw activity go shopping?
– How many times did Clare suspend some running activity?
– How much time did Peter work on instances of activity reject claim?
– How much time did people with role Manager work on this process?
– What is the utilization of John?
– What is the average utilization of people with role Manager?
– How many times did John work for more than 2 hours without interruption?
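The control-flow questions above reduce to simple arithmetic on timestamps. The following is a minimal sketch, assuming each event carries a case, an activity, an event type (schedule, start, or complete), a resource, and a timestamp. The records, the field layout, and the function names are hypothetical and are not taken from any of the tools mentioned in this talk.

```python
from datetime import datetime
from statistics import mean

# Hypothetical event records: (case, activity, event type, resource, timestamp).
EVENTS = [
    ("order-1", "approve", "schedule", "Sue",  "2003-02-05 09:00"),
    ("order-1", "approve", "start",    "Sue",  "2003-02-05 11:00"),
    ("order-1", "approve", "complete", "Sue",  "2003-02-05 11:30"),
    ("order-2", "approve", "schedule", "John", "2003-02-05 10:00"),
    ("order-2", "approve", "start",    "John", "2003-02-05 10:30"),
    ("order-2", "approve", "complete", "John", "2003-02-05 12:30"),
]

def hours_between(earlier, later):
    """Elapsed hours between two 'YYYY-MM-DD HH:MM' timestamps."""
    fmt = "%Y-%m-%d %H:%M"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds() / 3600

def activity_times(events):
    """Waiting time (schedule to start) and processing time (start to complete)."""
    stamps = {}
    for case, activity, kind, _resource, stamp in events:
        stamps.setdefault((case, activity), {})[kind] = stamp
    waiting = {k: hours_between(s["schedule"], s["start"]) for k, s in stamps.items()}
    processing = {k: hours_between(s["start"], s["complete"]) for k, s in stamps.items()}
    return waiting, processing

def flow_times(events):
    """Flow time per case: first event to last event.

    Zero-padded timestamps in this format sort chronologically as strings.
    """
    first, last = {}, {}
    for case, _activity, _kind, _resource, stamp in events:
        first[case] = min(first.get(case, stamp), stamp)
        last[case] = max(last.get(case, stamp), stamp)
    return {case: hours_between(first[case], last[case]) for case in first}

waiting, processing = activity_times(EVENTS)
flow = flow_times(EVENTS)
print("maximum waiting time for approve (h):", max(waiting.values()))
print("minimum processing time for approve (h):", min(processing.values()))
print("average flow time (h):", mean(flow.values()))
```

The resource questions can be answered the same way by grouping on the resource field instead of the activity.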

Page 14: Process Mining

Example (ARIS PPM)

[Screenshot: IDS Scheer's ARIS Process Performance Manager]

Page 15: Process Mining

(2) Determine process model

• Discover a process model (e.g., in terms of a PN or EPC) without prior knowledge about the structure of the process.

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

[Figure: α(W) maps the log to a Petri net with tasks A, B, C, D, E, and F]

Page 16: Process Mining

(3) Determine organizational model

• Discover the organizational model (i.e., roles, departments, etc.) without prior knowledge about the structure of the organization.

[Figure: correspondence analysis plot "Row Points for Source" (symmetrical normalization), Dimension 1 versus Dimension 2, both axes from -1.0 to 2.0, showing the positions of Mary, Peter, Lucia, Alex, and John]

e.g., correspondence analysis (typically applied in ecology)

         A     B     C     D     E     F
John    88     0     8     0    38    50
Alex     0   189     0     2     0     0
Lucia  112     0     0     0    62    40
Peter    0    11   192     0     0     0
Mary     0     0     0   198     0     0
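The table above counts how often each person executed each task. As a rough illustration of how such a table hints at roles, the sketch below compares the activity profiles of each pair of people with cosine similarity. This is a much simpler stand-in for the correspondence analysis named on the slide, not an implementation of it, and the threshold of 0.5 is an arbitrary assumption.

```python
from itertools import combinations
from math import sqrt

# Person-by-task frequencies copied from the table (tasks A to F).
TABLE = {
    "John":  [88, 0, 8, 0, 38, 50],
    "Alex":  [0, 189, 0, 2, 0, 0],
    "Lucia": [112, 0, 0, 0, 62, 40],
    "Peter": [0, 11, 192, 0, 0, 0],
    "Mary":  [0, 0, 0, 198, 0, 0],
}

def cosine(u, v):
    """Cosine similarity of two frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def similar_pairs(table, threshold):
    """Pairs of people whose task profiles are more similar than the threshold."""
    return [
        (p, q)
        for p, q in combinations(table, 2)
        if cosine(table[p], table[q]) > threshold
    ]

print(similar_pairs(TABLE, 0.5))
```

On this table only John and Lucia end up paired, which suggests that they share a role while Alex, Peter, and Mary each mainly perform one task of their own.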

Page 17: Process Mining

(4) Analyze social network

• Social Network Analysis (SNA)

• Based on:
– Handover of work
– Subcontracting
– Working together
– Reassignments
– Doing similar tasks

Page 18: Process Mining

Example

        John  Alex  Lucia  Peter  Mary
John       0     0      0      0     2
Alex       0     0      0      0     0
Lucia      0     0      0      2     2
Peter      0     0      2      0     2
Mary       2     0      2      2     0
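A matrix like the one above can be derived from a log that records who performed each task. The sketch below counts handovers of work: within a case, work passes from the performer of one event to the performer of the next. The log is hypothetical and does not reproduce the numbers in the example matrix, and the function name is an assumption rather than part of MinSocN.

```python
from collections import Counter

# Hypothetical log: per case, the ordered list of (task, performer) pairs.
LOG = {
    "case 1": [("A", "John"), ("B", "Mary"), ("C", "Peter"), ("D", "Mary")],
    "case 2": [("A", "Lucia"), ("C", "Peter"), ("B", "Mary"), ("D", "Mary")],
}

def handovers(log):
    """Count how often work passes directly from one performer to another."""
    counts = Counter()
    for events in log.values():
        for (_, source), (_, target) in zip(events, events[1:]):
            if source != target:  # ignore consecutive tasks by the same person
                counts[(source, target)] += 1
    return counts

for (source, target), n in sorted(handovers(LOG).items()):
    print(f"{source} -> {target}: {n}")
```

The resulting counts are the weighted arcs of a social network and can be analyzed with standard SNA measures.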

Page 19: Process Mining

(5) Analyze performance characteristics

• Each case (process/workflow instance) has a number of properties:
– Resource that worked on a specific activity
– Value of a characteristic data element (e.g., size of order, age of customer, etc.)
– Performance metrics of case (e.g., flow time)

• Using machine-learning techniques it is possible to find relevant relations between these properties.

Page 20: Process Mining

Example

• If John and Mike work together, it takes longer.

• Expensive cases require less processing.

• Etc.

caseid  Act A  Act B  ...  Act Z  Data D1  Data D2  ...  Data D9  Proc time  Wait time  Flow time
1       John   Mike   ...  Anne   $50      20y      ...  80%      12h        3d         3.5d
2       Clare  Jim    ...  Ike    $75      15y      ...  75%      6h         3d         3.25d
3       John   Mike   ...  Clare  $55      20y      ...  80%      18h        4d         4.75d
...     ...    ...    ...  ...    ...      ...      ...  ...      ...        ...        ...
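Once every case is a row in such a table, a first look at performance characteristics needs only a group-by. The sketch below averages flow time per value of one attribute, using the three visible rows of the table. It is a deliberately simple stand-in for the machine-learning techniques mentioned earlier, and the dictionary keys and function name are assumptions.

```python
from collections import defaultdict
from statistics import mean

# The three visible rows of the table; elided columns are left out.
CASES = [
    {"caseid": 1, "Act A": "John",  "Act B": "Mike", "Act Z": "Anne",  "flow_days": 3.5},
    {"caseid": 2, "Act A": "Clare", "Act B": "Jim",  "Act Z": "Ike",   "flow_days": 3.25},
    {"caseid": 3, "Act A": "John",  "Act B": "Mike", "Act Z": "Clare", "flow_days": 4.75},
]

def mean_flow_by(cases, attribute):
    """Average flow time (in days) for each value of the given case attribute."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[attribute]].append(case["flow_days"])
    return {value: mean(times) for value, times in groups.items()}

print(mean_flow_by(CASES, "Act A"))
```

With only three rows the difference between John and Clare proves nothing; a real analysis would need many cases and a proper learner (for example a decision tree) to derive rules.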

Page 21: Process Mining

Process mining: The tools

• EMiT
• Thumb
• MinSocN

Page 22: Process Mining

Process Mining: Tooling

[Figure: workflow management systems (Staffware, InConcert, MQ Series), case handling / CRM systems (FLOWer, Vectus, Siebel), and ERP systems (SAP R/3, BaaN, Peoplesoft) export to a common XML format for storing/exchanging workflow logs, which is read by the mining tools EMiT, Thumb, and MinSocN]

Page 23: Process Mining

Example: processing customer orders

Example in Staffware: 7 tasks and all basic routing constructs

Page 24: Process Mining

Case 21
Diractive Description   Event          User                 yyyy/mm/dd hh:mm
----------------------------------------------------------------------------
                        Start          swdemo@staffw_edl    2003/02/05 15:00
Register order          Processed To   swdemo@staffw_edl    2003/02/05 15:00
Register order          Released By    swdemo@staffw_edl    2003/02/05 15:00
Prepare shipment        Processed To   swdemo@staffw_edl    2003/02/05 15:00
(Re)send bill           Processed To   swdemo@staffw_edl    2003/02/05 15:00
(Re)send bill           Released By    swdemo@staffw_edl    2003/02/05 15:01
Receive payment         Processed To   swdemo@staffw_edl    2003/02/05 15:01
Prepare shipment        Released By    swdemo@staffw_edl    2003/02/05 15:01
Ship goods              Processed To   swdemo@staffw_edl    2003/02/05 15:01
Ship goods              Released By    swdemo@staffw_edl    2003/02/05 15:02
Receive payment         Released By    swdemo@staffw_edl    2003/02/05 15:02
Archive order           Processed To   swdemo@staffw_edl    2003/02/05 15:02
Archive order           Released By    swdemo@staffw_edl    2003/02/05 15:02
                        Terminated                          2003/02/05 15:02

Case 22
Diractive Description   Event          User                 yyyy/mm/dd hh:mm
----------------------------------------------------------------------------
                        Start          swdemo@staffw_edl    2003/02/05 15:02
Register order          Processed To   swdemo@staffw_edl    2003/02/05 15:02
Register order          Released By    swdemo@staffw_edl    2003/02/05 15:02
Prepare shipment        Processed To   swdemo@staffw_edl    2003/02/05 15:02

Fragment of Staffware log
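Before such a log can be mined, each line has to be split into task, event, user, and timestamp. The sketch below does this with a regular expression. It assumes that the only event names are the four that appear in the fragment (Start, Processed To, Released By, Terminated) and that users always contain an "@"; a real Staffware audit trail may contain other events, so the pattern is illustrative only.

```python
import re
from datetime import datetime

# Assumed line layout: [task] event [user] yyyy/mm/dd hh:mm
LINE = re.compile(
    r"^(?P<task>.*?)\s*"
    r"(?P<event>Processed To|Released By|Start|Terminated)\s+"
    r"(?:(?P<user>\S+@\S+)\s+)?"
    r"(?P<stamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2})$"
)

def parse_line(line):
    """Return (task, event, user, timestamp), or None for headers and rules."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match["stamp"], "%Y/%m/%d %H:%M")
    return (match["task"], match["event"], match["user"], stamp)

SAMPLE = [
    "Case 21",
    "Start swdemo@staffw_edl 2003/02/05 15:00",
    "Register order Processed To swdemo@staffw_edl 2003/02/05 15:00",
    "(Re)send bill Released By swdemo@staffw_edl 2003/02/05 15:01",
    "Terminated 2003/02/05 15:02",
]

events = [event for event in map(parse_line, SAMPLE) if event is not None]
for event in events:
    print(event)
```

Lines that do not match, such as the "Case 21" header or the dashed rule, yield None and are skipped.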

Page 25: Process Mining

Fragment of XML file

<?xml version="1.0"?>
<!DOCTYPE WorkFlow_log SYSTEM "http://www.tm.tue.nl/it/research/workflow/mining/WorkFlow_log.dtd">
<WorkFlow_log>
  <source program="staffware"/>
  <process id="main_process">
    <case id="case_0">
      <log_line>
        <task_name>Case start</task_name>
        <event kind="normal"/>
        <date>05-02-2003</date>
        <time>15:04</time>
      </log_line>
      <log_line>
        <task_name>Register order</task_name>
        <event kind="schedule"/>
        <date>05-02-2003</date>
        <time>15:04</time>
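A log in this format can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`. Because the fragment above is cut off, the sample string closes the open elements itself and leaves out the XML declaration and DOCTYPE; those closing tags are an assumption about how the file continues, not part of the original fragment.

```python
import xml.etree.ElementTree as ET

# The two log lines from the fragment, with closing tags added (assumed).
XML = """<WorkFlow_log>
  <source program="staffware"/>
  <process id="main_process">
    <case id="case_0">
      <log_line>
        <task_name>Case start</task_name>
        <event kind="normal"/>
        <date>05-02-2003</date>
        <time>15:04</time>
      </log_line>
      <log_line>
        <task_name>Register order</task_name>
        <event kind="schedule"/>
        <date>05-02-2003</date>
        <time>15:04</time>
      </log_line>
    </case>
  </process>
</WorkFlow_log>"""

def read_events(xml_text):
    """Flatten the log into (process, case, task, event kind, date, time) tuples."""
    root = ET.fromstring(xml_text)
    events = []
    for process in root.iter("process"):
        for case in process.iter("case"):
            for line in case.iter("log_line"):
                events.append((
                    process.get("id"),
                    case.get("id"),
                    line.findtext("task_name"),
                    line.find("event").get("kind"),
                    line.findtext("date"),
                    line.findtext("time"),
                ))
    return events

for event in read_events(XML):
    print(event)
```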

Page 26: Process Mining

EMiT

Focus on time.

Page 27: Process Mining

Thumb

Focus on noise.

Page 28: Process Mining

Thumb is able to deal with noise (D/F-graphs)

[Figure: causality graphs mined by Thumb from a log with no noise and from a log with 10% noise]

Page 29: Process Mining

Representation in terms of an EPC… (collaboration with IDS Scheer)

[Figure: EPC of the order-processing example with the tasks Start, Register order, Prepare shipment, Ship goods, (Re)send bill, Receive payment, Contact customer, Archive order, and End]

Page 30: Process Mining

MinSocN (Mining Social Networks)

Page 31: Process Mining

Real case: CJIB

• Processing of fines

• 130,136 cases

• 99 different activities

Page 32: Process Mining

Process in EMiT

Page 33: Process Mining

Complete process model

Validated by CJIB

Page 34: Process Mining

Conclusion

Page 35: Process Mining

Conclusion (1)

• Process mining is practically relevant and the logical next step in Business Process Management.

[Figure: BPM life cycle: process design → implementation/configuration → process enactment → diagnosis]

Page 36: Process Mining

Conclusion (2)

[Figure: overview of the five types of process mining: 1) basic performance metrics, 2) process model (the order-processing example from Start to End), 3) organizational model, 4) social network, 5) performance characteristics (rules of the form "If … then …")]

• Process mining provides many interesting challenges for scientists, customers, users, managers, consultants, and tool developers.

Page 37: Process Mining

More information

http://www.tm.tue.nl/it/research/workflow_mining.htm

http://www.tm.tue.nl/it/research/patterns

http://www.tm.tue.nl/it/staff/wvdaalst

W.M.P. van der Aalst and K.M. van Hee. Workflow Management: Models, Methods, and Systems. MIT press, Cambridge, MA, 2002.

Page 38: Process Mining

References BPM (just books and far from complete)

• W.M.P. van der Aalst and K.M. van Hee. Workflow Management: Models, Methods, and Systems. MIT press, Cambridge, MA, 2002.

• Workflow Management: Modeling Concepts, Architecture and Implementation by Stefan Jablonski and Christoph Bussler; Paperback: 351 pages; International Thomson Publishing, October 1996.

• Production Workflow: Concepts and Techniques, by Frank Leymann, Dieter Roller, Andreas Reuter; Paperback, 479 pages; Prentice Hall PTR, 1st edition, September 1999.

• Workflow-Based Process Controlling: Foundation, Design and Application of Workflow-Driven Process Information Systems, by Michael Zur Muehlen. Logos, Berlin, 2003

• Proceedings of the International Conference on Business Process Management (BPM), Eindhoven, The Netherlands, June 26-27, 2003, by Wil M. P. van der Aalst, Arthur H. M. ter Hofstede, and Mathias Weske (Editors); Paperback, 391 pages; Springer Verlag, 2003.

• W.M.P. van der Aalst, J. Desel, and A. Oberweis, editors. Business Process Management: Models, Techniques, and Empirical Studies, volume 1806 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 2000.

Page 39: Process Mining

References (2)

• Internet Based Workflow Management: Towards a Semantic Web, by Dan C. Marinescu; Hardcover, 626 pages; John Wiley & Sons, 1st edition, April 2002.

• Web Services, by Gustavo Alonso, Fabio Casati, Harumi Kuno, and Vijay Machiraju; Hardcover, 480 pages; Springer Verlag, June 2003.

• The Workflow Imperative, by Thomas M. Koulopoulos; Hardcover, 240 pages; Van Nostrand Reinhold, 1st edition, January 1995.

• Database Support for Workflow Management: The WIDE Project, by Paul Grefen, Barbara Pernici, and Gabriel Sanchez (Editors); Hardcover, 296 pages; Kluwer Academic Publishers, February 1999.

• Design and Control of Workflow Processes: Business Process Management for the Service Industry (Lecture Notes in Computer Science # 2617), by Hajo Reijers; Paperback, 320 pages; Springer Verlag, October 2003.

• Practical Workflow for SAP - Effective Business Processes using SAP's WebFlow Engine, by Alan Rickayzen et al.; Hardcover, 52 pages; SAP Press, July 2002.

• Workflow Modeling: Tools for Process Improvement and Application Development, by Alec Sharp and Patrick McDermott; Hardcover, 345 pages; Artech House, 1st edition, February 2001.

• Business Process Modelling With ARIS: A Practical Guide, by Rob Davis; Paperback, 545 pages; Springer Verlag, August 2001.

Page 40: Process Mining

References (3)

• Workflow Handbook 2003, by Layna Fischer (Editor); Hardcover, 384 pages; Future Strategies, April 2003.

Specific for process mining:

• W.M.P. van der Aalst, B.F. van Dongen, J. Herbst, L. Maruster, G. Schimm, and A.J.M.M. Weijters. Workflow Mining: A Survey of Issues and Approaches. Data and Knowledge Engineering, 47(2):237-267, 2003.

• W.M.P. van der Aalst and B.F. van Dongen. Discovering Workflow Performance Models from Timed Logs. EDCIS 2002, volume 2480 of Lecture Notes in Computer Science, pages 45-63. Springer-Verlag, Berlin, 2002.

• A.J.M.M. Weijters and W.M.P. van der Aalst. Rediscovering Workflow Models from Event-Based Data using Little Thumb. Integrated Computer-Aided Engineering, 10(2):151-162, 2003.

• W.M.P. van der Aalst and A.J.M.M. Weijters, editors. Process Mining, Special Issue of Computers in Industry, Elsevier Science Publishers, Amsterdam, 2004.

• W.M.P. van der Aalst, A.J.M.M. Weijters, and L. Maruster. Workflow Mining: Discovering Process Models from Event Logs. IEEE Transactions on Knowledge and Data Engineering (to appear).

Page 41: Process Mining

Appendix: A concrete algorithm

Page 42: Process Mining

Process Mining: The alpha algorithm

[Figure: "alpha algorithm" illustrated with a large Petri net of an offer-handling process with Dutch labels: tasks numbered 1 to 22 (e.g., "1 start", "10 registreren" [register], "22 Opbergen en einde" [archive and end]) connected by places such as "begin proces" [start of process] and "klaar voor einde" [ready to end]]

Page 43: Process Mining

Process log

• Minimal information in log: case id’s and task id’s.
• Additional information: event type, time, resources, and data.
• In this log there are three possible sequences:
– ABCD
– ACBD
– EF

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D
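The three sequences can be recovered by grouping the interleaved log lines per case while keeping their order. A minimal sketch, with the log typed in as (case, task) pairs and a hypothetical function name:

```python
# The process log from the slide, in its original interleaved order.
LOG = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"),
    (2, "C"), (4, "A"), (2, "B"), (2, "D"), (5, "E"), (4, "C"),
    (1, "D"), (3, "C"), (3, "D"), (4, "B"), (5, "F"), (4, "D"),
]

def traces(log):
    """Group tasks per case, preserving the order in which they were logged."""
    per_case = {}
    for case, task in log:
        per_case.setdefault(case, []).append(task)
    return {case: "".join(tasks) for case, tasks in per_case.items()}

by_case = traces(LOG)
print(by_case)
print(sorted(set(by_case.values())))
```

Cases 1 and 3 follow ABCD, cases 2 and 4 follow ACBD, and case 5 follows EF.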

Page 44: Process Mining

>, →, ||, # relations

• Direct succession: x>y iff for some case x is directly followed by y.
• Causality: x→y iff x>y and not y>x.
• Parallel: x||y iff x>y and y>x.
• Choice: x#y iff not x>y and not y>x.

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

Direct succession: A>B, A>C, B>C, B>D, C>B, C>D, E>F

Causality: A→B, A→C, B→D, C→D, E→F

Parallel: B||C, C||B
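The four relations follow mechanically from the traces. The sketch below derives them for the three sequences in the example log; the function name and the representation of a relation as a set of pairs are assumptions made for illustration.

```python
from itertools import product

TRACES = ["ABCD", "ACBD", "EF"]

def relations(traces):
    """Return the relations >, ->, || and # as sets of (x, y) pairs."""
    tasks = sorted({task for trace in traces for task in trace})
    # x > y: x is directly followed by y in at least one trace.
    succ = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}
    causal = {(x, y) for (x, y) in succ if (y, x) not in succ}
    parallel = {(x, y) for (x, y) in succ if (y, x) in succ}
    choice = {
        (x, y)
        for x, y in product(tasks, repeat=2)
        if (x, y) not in succ and (y, x) not in succ
    }
    return succ, causal, parallel, choice

succ, causal, parallel, choice = relations(TRACES)
print("causality:", sorted(causal))
print("parallel:", sorted(parallel))
```

Every pair of tasks ends up in exactly one of →, its inverse, ||, or #, so for example A#D and A#E hold here.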

Page 45: Process Mining

Basic idea (1)

x→y

[Figure: transition x connected to transition y through a place]

Page 46: Process Mining

Basic idea (2)

x→y, x→z, and y||z

[Figure: x is followed by y and z in parallel (AND-split)]

Page 47: Process Mining

Basic idea (3)

x→y, x→z, and y#z

[Figure: x is followed by a choice between y and z (XOR-split)]

Page 48: Process Mining

Basic idea (4)

x→z, y→z, and x||y

[Figure: x and y run in parallel and are both followed by z (AND-join)]

Page 49: Process Mining

Basic idea (5)

x→z, y→z, and x#y

[Figure: either x or y is followed by z (XOR-join)]

Page 50: Process Mining

It is not that simple: Basic alpha algorithm

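Let $W$ be a workflow log over $T$. $\alpha(W)$ is defined as follows.

1. $T_W = \{\, t \in T \mid \exists_{\sigma \in W}\; t \in \sigma \,\}$,

2. $T_I = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{first}(\sigma) \,\}$,

3. $T_O = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{last}(\sigma) \,\}$,

4. $X_W = \{\, (A,B) \mid A \subseteq T_W \wedge B \subseteq T_W \wedge \forall_{a \in A} \forall_{b \in B}\; a \rightarrow_W b \wedge \forall_{a_1,a_2 \in A}\; a_1 \#_W a_2 \wedge \forall_{b_1,b_2 \in B}\; b_1 \#_W b_2 \,\}$,

5. $Y_W = \{\, (A,B) \in X_W \mid \forall_{(A',B') \in X_W}\; A \subseteq A' \wedge B \subseteq B' \Rightarrow (A,B) = (A',B') \,\}$,

6. $P_W = \{\, p_{(A,B)} \mid (A,B) \in Y_W \,\} \cup \{ i_W, o_W \}$,

7. $F_W = \{\, (a, p_{(A,B)}) \mid (A,B) \in Y_W \wedge a \in A \,\} \cup \{\, (p_{(A,B)}, b) \mid (A,B) \in Y_W \wedge b \in B \,\} \cup \{\, (i_W, t) \mid t \in T_I \,\} \cup \{\, (t, o_W) \mid t \in T_O \,\}$, and

8. $\alpha(W) = (P_W, T_W, F_W)$.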
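The eight steps translate almost literally into code. The sketch below is a brute-force rendering for small logs: it enumerates all pairs of task sets, so it is exponential in the number of tasks and meant only to make the definition concrete. It assumes that A and B in step 4 are nonempty (the usual reading, since an empty set would yield a place without input or output), and the place names and function names are illustrative choices, not those of any tool.

```python
from itertools import chain, combinations

def nonempty_subsets(items):
    """All nonempty subsets of items, as tuples."""
    return chain.from_iterable(
        combinations(items, n) for n in range(1, len(items) + 1)
    )

def place_name(a, b):
    return "p({%s},{%s})" % (",".join(sorted(a)), ",".join(sorted(b)))

def alpha(traces):
    """Basic alpha algorithm; returns (places, transitions, arcs)."""
    t_w = sorted({t for trace in traces for t in trace})   # step 1
    t_i = {trace[0] for trace in traces}                    # step 2
    t_o = {trace[-1] for trace in traces}                   # step 3
    succ = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}

    def causal(x, y):
        return (x, y) in succ and (y, x) not in succ

    def choice(x, y):
        return (x, y) not in succ and (y, x) not in succ

    # Step 4: all a in A cause all b in B; members of A (and of B) exclude each other.
    x_w = [
        (set(a), set(b))
        for a in nonempty_subsets(t_w)
        for b in nonempty_subsets(t_w)
        if all(causal(x, y) for x in a for y in b)
        and all(choice(x, y) for x in a for y in a)
        and all(choice(x, y) for x in b for y in b)
    ]
    # Step 5: keep only the maximal pairs.
    y_w = [
        (a, b) for a, b in x_w
        if not any(a <= a2 and b <= b2 and (a, b) != (a2, b2) for a2, b2 in x_w)
    ]
    # Step 6: one place per maximal pair, plus source and sink place.
    places = {place_name(a, b) for a, b in y_w} | {"i_W", "o_W"}
    # Step 7: arcs into and out of each place.
    arcs = (
        {(x, place_name(a, b)) for a, b in y_w for x in a}
        | {(place_name(a, b), y) for a, b in y_w for y in b}
        | {("i_W", t) for t in t_i}
        | {(t, "o_W") for t in t_o}
    )
    return places, set(t_w), arcs                           # step 8

places, transitions, arcs = alpha(["ABCD", "ACBD", "EF"])
print(sorted(places))
```

For the example log this yields five internal places, one for each causal pair (A→B, A→C, B→D, C→D, E→F), plus the source place i_W and the sink place o_W.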

Page 51: Process Mining

Results

• If the log is complete with respect to relation >, it can be used to mine any SWF-net!
• Structured Workflow Nets (SWF-nets) have no implicit places and the following two constructs cannot be used:

[Figure: the two constructs that cannot be used in SWF-nets]

(Short loops require some refinement, but this is not a problem.)

Page 52: Process Mining

Example

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

[Figure: α(W) maps the log W to a Petri net with tasks A, B, C, D, E, and F]