process mining chapter_06_advanced_process_discovery_techniques

43
Chapter 6 Advanced Process Discovery Techniques prof.dr.ir. Wil van der Aalst www.processmining.org

Upload: muhammad-ajmal

Post on 25-Dec-2014

107 views

Category:

Documents


2 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Process mining chapter_06_advanced_process_discovery_techniques

Chapter 6Advanced Process Discovery Techniquesprof.dr.ir. Wil van der Aalstwww.processmining.org

Page 2: Process mining chapter_06_advanced_process_discovery_techniques

Overview

PAGE 1

Part I: Preliminaries

Chapter 2 Process Modeling and Analysis

Chapter 3Data Mining

Part II: From Event Logs to Process Models

Chapter 4 Getting the Data

Chapter 5 Process Discovery: An Introduction

Chapter 6 Advanced Process Discovery Techniques

Part III: Beyond Process Discovery

Chapter 7 Conformance Checking

Chapter 8 Mining Additional Perspectives

Chapter 9 Operational Support

Part IV: Putting Process Mining to Work

Chapter 10 Tool Support

Chapter 11 Analyzing “Lasagna Processes”

Chapter 12 Analyzing “Spaghetti Processes”

Part V: Reflection

Chapter 13Cartography and Navigation

Chapter 14Epilogue

Chapter 1 Introduction

Page 3: Process mining chapter_06_advanced_process_discovery_techniques

Process discovery

PAGE 2

software system

(process)model

eventlogs

modelsanalyzes

discovery

records events, e.g., messages,

transactions, etc.

specifies configures implements

analyzes

supports/controls

enhancement

conformance

“world”

people machines

organizationscomponents

businessprocesses

Page 4: Process mining chapter_06_advanced_process_discovery_techniques

Challenge

PAGE 3

process discovery

fitness

precisiongeneralization

simplicity

“able to replay event log” “Occam’s razor”

“not overfitting the log” “not underfitting the log”

Page 5: Process mining chapter_06_advanced_process_discovery_techniques

Observing a stable process infinitely long

PAGE 4

trace in event log

frequent behavior

all behavior(including noise)

Page 6: Process mining chapter_06_advanced_process_discovery_techniques

Target model

PAGE 5

target model

Page 7: Process mining chapter_06_advanced_process_discovery_techniques

Non-fitting model

PAGE 6

non-fitting model

Page 8: Process mining chapter_06_advanced_process_discovery_techniques

Overfitting model

PAGE 7

overfitting model

Page 9: Process mining chapter_06_advanced_process_discovery_techniques

Underfitting model

PAGE 8

underfitting model

Page 10: Process mining chapter_06_advanced_process_discovery_techniques

Characteristics of process discovery algorithms

• Representational bias− Inability to represent concurrency− Inability to deal with (arbitrary) loops− Inability to represent silent actions− Inability to represent duplicate actions− Inability to model OR-splits/joins− Inability to represent non-free-choice behavior− Inability to represent hierarchy

• Ability to deal with noise• Completeness notion assumed• Approach used (direct algorithmic approaches, two-

phase approaches, computational intelligence approaches, partial approaches, etc.)

PAGE 9

Page 11: Process mining chapter_06_advanced_process_discovery_techniques

Examples

PAGE 10

• Algorithmic techniques• Alpha miner• Alpha+, Alpha++, Alpha#• FSM miner• Fuzzy miner• Heuristic miner• Multi phase miner

• Genetic process mining• Single/duplicate tasks• Distributed GM

• Region-based process mining• State-based regions• Language based regions

• Classical approaches not dealing with concurrency• Inductive inference (Mark Gold, Dana Angluin et al.)• Sequence mining

Page 12: Process mining chapter_06_advanced_process_discovery_techniques

Heuristic mining

• To deal with noise and incompleteness.• To have a better representational bias than the α

algorithm (AND/XOR/OR/skip).• Uses C-nets.

PAGE 11

a

register claim

c e

close case

check damage

dconsult expert

b

check policy

Page 13: Process mining chapter_06_advanced_process_discovery_techniques

Example log; problem α algorithm

PAGE 12

a

b

c

ed

p2

end

p4

p3p1

start

p5

Page 14: Process mining chapter_06_advanced_process_discovery_techniques

Taking into account frequencies

PAGE 13

Page 15: Process mining chapter_06_advanced_process_discovery_techniques

Dependency measure

PAGE 14

Page 16: Process mining chapter_06_advanced_process_discovery_techniques

Example

PAGE 15

Page 17: Process mining chapter_06_advanced_process_discovery_techniques

Lower threshold (2 direct successions and a dependency of at least 0.7)

PAGE 16

a c e

d

b

11(0.92)

11(0.92)

13(0.93)

5(0.83)

4(0.80)

13(0.93)

11(0.92)

11(0.92)

Page 18: Process mining chapter_06_advanced_process_discovery_techniques

Higher threshold (5 direct successions and a dependency of at least 0.9)

PAGE 17

a c e

d

b11(0.92)

11(0.92)

13(0.93) 13(0.93)

11(0.92)

11(0.92)

Page 19: Process mining chapter_06_advanced_process_discovery_techniques

Learning splits and joins

PAGE 18

a40

c21

e 40

d 17

b21

5

20

20

13

20

20

13

4

5

20

13

20

20 20

20

5

20

13 1313

4

4

Page 20: Process mining chapter_06_advanced_process_discovery_techniques

Alternative visualization

PAGE 19

a40

c21

e 40

d 17

b21

5

20

20

13

20

20

13

4

5

20

13

20

20 20

20

5

20

13 1313

4

4

a c e

d

b

AND AND

Page 21: Process mining chapter_06_advanced_process_discovery_techniques

Characteristics of heuristic mining

• Can deal with noise and therefore quite robust.• Improved representational bias.• Split and join rules are only considered locally

(therefore most of the discovered model are not sound and require repair actions).

PAGE 20

Page 22: Process mining chapter_06_advanced_process_discovery_techniques

Genetic process mining

PAGE 21

next generationcomputefitness

elitism

parents

crossover

children

mutation

create initial population

“dead” individuals

tournament

select best individual

event log

termination

Page 23: Process mining chapter_06_advanced_process_discovery_techniques

Design decisions

• Representation of individuals• Initialization• Fitness function• Selection strategy (tournament and elitism)• Crossover• Mutation

PAGE 22

next generationcomputefitness

elitism

parents

crossover

children

mutation

create initial population

“dead” individuals

tournament

select best individual

event log

termination

Page 24: Process mining chapter_06_advanced_process_discovery_techniques

Example: crossover

PAGE 23

a

start register request

b

examine thoroughly

c

examine casually

d

check ticket

decide

pay compensation

reject request

reinitiate request

e

g

h

f

end

a

start register request

b

examine thoroughly

c

examine casually

d

check ticket

decide

pay compensation

reject request

reinitiate request

e

g

h

f

end

a

start register request

b

examine thoroughly

c

examine casually

d

check ticket

decide

reinitiate request

e

f

a

start register request

b

examine thoroughly

c

examine casually

d

check ticket

decide

pay compensation

reject request

reinitiate request

e

g

h

f

end

pay compensation

reject request

g

hend

Page 25: Process mining chapter_06_advanced_process_discovery_techniques

Example: mutation

PAGE 24

a

start register request

b

examine thoroughly

c

examine casually

d

check ticket

decide

pay compensation

reject request

reinitiate request

e

g

h

f

end

a

start register request

b

examine thoroughly

c

examine casually

d

check ticket

decide

pay compensation

reject request

reinitiate request

e

g

h

f

end

remove place

added arc

Page 26: Process mining chapter_06_advanced_process_discovery_techniques

Characteristics of genetic process mining

• Requires a lot of computing power.• Can be distributed easily.• Can deal with noise, infrequent behavior, duplicate tasks,

invisible tasks, etc.• Allows for incremental improvement and combinations

with other approaches (heuristics post-optimization, etc.).PAGE 25

Page 27: Process mining chapter_06_advanced_process_discovery_techniques

Region-based mining

• Two types of regions theory:− State-based regions− Language-based regions

• All about discovering places (like in the α algorithm)!

PAGE 26

a1

...

a2

am

b1

b2

bn

p(A,B) ...

A={a1,a2, … am} B={b1,b2, … bn}

Page 28: Process mining chapter_06_advanced_process_discovery_techniques

State-based regions

Two steps:1.Discover a transition system (different abstractions

are possible)2.Convert transition system into an “equivalent” Petri

net.

PAGE 27

Page 29: Process mining chapter_06_advanced_process_discovery_techniques

Step 1: learning a transition system

• past, future, past+future• sequence, multiset, set abstraction• limited horizon to abstract further• filtering e.g. based on transaction type, names, etc.• labels based on activity name or other features

PAGE 28

a b c d c d c d e f a g h h h ipast future

current state

past and future

trace:

Page 30: Process mining chapter_06_advanced_process_discovery_techniques

Past without abstraction (full sequence)

PAGE 29

‹a,e,d›‹a,e›‹a›

‹a,b,c,d›

‹›

‹a,c,b,d›

‹a,b,c›‹a,b›

‹a,c,b›‹a,c›

a

c d

e d

b dc

b

Page 31: Process mining chapter_06_advanced_process_discovery_techniques

Future without abstraction

PAGE 30

‹a,e,d› ‹d›

‹a,b,c,d›

‹›

‹a,c,b,d›

d

‹b,c,d›‹c,d›

‹c,b,d›‹b,d›

ea

a

a

b

bc

c

‹e,d›

Page 32: Process mining chapter_06_advanced_process_discovery_techniques

Past with multiset abstraction

PAGE 31

[ ]

a

d

b dc

e

c

b

[a,b,c,d][a,b,c][a,c]

[a,b]

[a,e]

[a,d,e]

[a]

Page 33: Process mining chapter_06_advanced_process_discovery_techniques

Only last event matters for state

PAGE 32

‹›a b

c

d

e d

d‹a›

‹b›

‹c›

‹d›

‹e›

c b

Page 34: Process mining chapter_06_advanced_process_discovery_techniques

Step 2: constructing a Petri net using regions

PAGE 33

a

a

b

c

d

fpR

e

a

b

e

c

d

df

f

e

e

a = enterb = enterc = exitd = exite = do not crossf = do not cross

R

Page 35: Process mining chapter_06_advanced_process_discovery_techniques

Example

PAGE 34

[ ]

a

d

b dc

e

cb

[a,b,c,d][a,b,c][a,c]

[ a,b][a,e] [a,d,e]

[a]

a

b

c

de

p2

end

p4

p3p1

start

Page 36: Process mining chapter_06_advanced_process_discovery_techniques

Language based regions

PAGE 35

a1

a2

b1

b2

dpR

e

c1

c

f

YX

Region R = (X,Y,c) corresponding to place pR: X = {a1,a2,c1} = transitions producing a token for pR, Y = {b1,b2,c1} = transitions consuming a token from pR, and c is the initial marking of pR.

Page 37: Process mining chapter_06_advanced_process_discovery_techniques

Based idea: enough tokens should be present when consuming

PAGE 36

a1

a2

b1

b2

dpR

e

c1

c

f

YX

A place is feasible if it can be added without disabling any of thetraces in the event log.

Page 38: Process mining chapter_06_advanced_process_discovery_techniques

Example

PAGE 37

Page 39: Process mining chapter_06_advanced_process_discovery_techniques

Regions

PAGE 38

Page 40: Process mining chapter_06_advanced_process_discovery_techniques

Model

PAGE 39

b

c

a

e

d

p1 p2 p3 p4

p6

p5

Page 41: Process mining chapter_06_advanced_process_discovery_techniques

Characteristics of region-based mining

• Can be used to discover more complex control-flow structures.

• Classical approaches need to be adapted (overfitting!).

• Representational bias can be parameterized (e.g., free-choice nets, label splitting, etc.).

• Problems dealing with noise.

PAGE 40

Page 42: Process mining chapter_06_advanced_process_discovery_techniques

Other approaches, e.g. fuzzy mining

PAGE 41

Page 43: Process mining chapter_06_advanced_process_discovery_techniques

Evaluating the discovered process

PAGE 42

Structure: Is this the simplest model (Occam's Razor)?

Fitness: Is the event log possible according to the model?

Precision: Is the model not underfitting (allow for too much)?

Generalization: Is the model not overfitting (only allow for the “accidental” examples)?