process mining chapter_06_advanced_process_discovery_techniques

Chapter 6Advanced Process Discovery Techniquesprof.dr.ir. Wil van der Aalstwww.processmining.org

Overview

Part I: Preliminaries

Chapter 2 Process Modeling and Analysis

Chapter 3Data Mining

Part II: From Event Logs to Process Models

Chapter 4 Getting the Data

Chapter 5 Process Discovery: An Introduction

Chapter 6 Advanced Process Discovery Techniques

Part III: Beyond Process Discovery

Chapter 7 Conformance Checking

Chapter 8 Mining Additional Perspectives

Chapter 9 Operational Support

Part IV: Putting Process Mining to Work

Chapter 10 Tool Support

Chapter 11 Analyzing “Lasagna Processes”

Chapter 12 Analyzing “Spaghetti Processes”

Part V: Reflection

Chapter 13Cartography and Navigation

Chapter 14Epilogue

Chapter 1 Introduction

Process discovery

software system

(process)model

eventlogs

modelsanalyzes

discovery

records events, e.g., messages,

transactions, etc.

specifies configures implements

analyzes

supports/controls

enhancement

conformance

“world”

people machines

organizationscomponents

businessprocesses

Challenge

process discovery

fitness

precisiongeneralization

simplicity

“able to replay event log” “Occam’s razor”

“not overfitting the log” “not underfitting the log”

Observing a stable process infinitely long

trace in event log

frequent behavior

all behavior(including noise)

Target model

target model

Non-fitting model

non-fitting model

Overfitting model

overfitting model

Underfitting model

underfitting model

Characteristics of process discovery algorithms

• Representational bias− Inability to represent concurrency− Inability to deal with (arbitrary) loops− Inability to represent silent actions− Inability to represent duplicate actions− Inability to model OR-splits/joins− Inability to represent non-free-choice behavior− Inability to represent hierarchy

• Ability to deal with noise• Completeness notion assumed• Approach used (direct algorithmic approaches, two-

phase approaches, computational intelligence approaches, partial approaches, etc.)

Examples

• Algorithmic techniques• Alpha miner• Alpha+, Alpha++, Alpha#• FSM miner• Fuzzy miner• Heuristic miner• Multi phase miner

• Genetic process mining• Single/duplicate tasks• Distributed GM

• Region-based process mining• State-based regions• Language based regions

• Classical approaches not dealing with concurrency• Inductive inference (Mark Gold, Dana Angluin et al.)• Sequence mining

Heuristic mining

• To deal with noise and incompleteness.• To have a better representational bias than the α

algorithm (AND/XOR/OR/skip).• Uses C-nets.

register claim

close case

check damage

dconsult expert

check policy

Example log; problem α algorithm

Taking into account frequencies

Dependency measure

Example

Lower threshold (2 direct successions and a dependency of at least 0.7)

11(0.92)

13(0.93)

5(0.83)

4(0.80)

13(0.93)

11(0.92)

Higher threshold (5 direct successions and a dependency of at least 0.9)

b11(0.92)

11(0.92)

13(0.93) 13(0.93)

11(0.92)

Learning splits and joins

13 1313

Alternative visualization

13 1313

AND AND

Characteristics of heuristic mining

• Can deal with noise and therefore quite robust.• Improved representational bias.• Split and join rules are only considered locally

(therefore most of the discovered model are not sound and require repair actions).

Genetic process mining

next generationcomputefitness

elitism

parents

crossover

children

mutation

create initial population

“dead” individuals

tournament

select best individual

event log

termination

Design decisions

• Representation of individuals• Initialization• Fitness function• Selection strategy (tournament and elitism)• Crossover• Mutation

next generationcomputefitness

elitism

parents

crossover

children

mutation

create initial population

“dead” individuals

tournament

select best individual

event log

termination

Example: crossover

start register request

examine thoroughly

examine casually

check ticket

decide

pay compensation

reject request

reinitiate request

examine thoroughly

examine casually

check ticket

decide

pay compensation

reject request

reinitiate request

examine thoroughly

examine casually

check ticket

decide

reinitiate request

examine thoroughly

examine casually

check ticket

decide

pay compensation

reject request

reinitiate request

pay compensation

reject request

Example: mutation

examine thoroughly

examine casually

check ticket

decide

pay compensation

reject request

reinitiate request

examine thoroughly

examine casually

check ticket

decide

pay compensation

reject request

reinitiate request

remove place

added arc

Characteristics of genetic process mining

• Requires a lot of computing power.• Can be distributed easily.• Can deal with noise, infrequent behavior, duplicate tasks,

invisible tasks, etc.• Allows for incremental improvement and combinations

with other approaches (heuristics post-optimization, etc.).PAGE 25

Region-based mining

• Two types of regions theory:− State-based regions− Language-based regions

• All about discovering places (like in the α algorithm)!

p(A,B) ...

A={a1,a2, … am} B={b1,b2, … bn}

State-based regions

Two steps:1.Discover a transition system (different abstractions

are possible)2.Convert transition system into an “equivalent” Petri

Step 1: learning a transition system

• past, future, past+future• sequence, multiset, set abstraction• limited horizon to abstract further• filtering e.g. based on transaction type, names, etc.• labels based on activity name or other features

a b c d c d c d e f a g h h h ipast future

current state

past and future

trace:

Past without abstraction (full sequence)

‹a,e,d›‹a,e›‹a›

‹a,b,c,d›

‹›

‹a,c,b,d›

‹a,b,c›‹a,b›

‹a,c,b›‹a,c›

Future without abstraction

‹a,e,d› ‹d›

‹a,b,c,d›

‹›

‹a,c,b,d›

‹b,c,d›‹c,d›

‹c,b,d›‹b,d›

‹e,d›

Past with multiset abstraction

[a,b,c,d][a,b,c][a,c]

[a,d,e]

Only last event matters for state

‹›a b

d‹a›

‹b›

‹c›

‹d›

‹e›

Step 2: constructing a Petri net using regions

a = enterb = enterc = exitd = exite = do not crossf = do not cross

Example

[a,b,c,d][a,b,c][a,c]

[ a,b][a,e] [a,d,e]

Language based regions

Region R = (X,Y,c) corresponding to place pR: X = {a1,a2,c1} = transitions producing a token for pR, Y = {b1,b2,c1} = transitions consuming a token from pR, and c is the initial marking of pR.

Based idea: enough tokens should be present when consuming

A place is feasible if it can be added without disabling any of thetraces in the event log.

Example

Regions

p1 p2 p3 p4

Characteristics of region-based mining

• Can be used to discover more complex control-flow structures.

• Classical approaches need to be adapted (overfitting!).

• Representational bias can be parameterized (e.g., free-choice nets, label splitting, etc.).

• Problems dealing with noise.

Other approaches, e.g. fuzzy mining

Evaluating the discovered process

Structure: Is this the simplest model (Occam's Razor)?

Fitness: Is the event log possible according to the model?

Precision: Is the model not underfitting (allow for too much)?

Generalization: Is the model not overfitting (only allow for the “accidental” examples)?

process mining chapter_06_advanced_process_discovery_techniques

Documents

schenck process mining

process mining introduction

process mining chapter_08_mining_additional_perspectives

process mining: control-flow mining algorithms technologie...

process mining and predictive process monitoring

process mining bei softwareprozessen - kneuper.de ·...

process mining - chapter 3 - data mining

process mining tutorial -...

process mining and security: detecting anomalous process...

mining process

process mining methodology and its application at...

process mining @

business process management e process mining

process mining chapter_12_analyzing_spaghetti_processes

lean & process mining

statistical learning introduction: data mining process and...

process mining chapter_09_operational_support

process mining manifesto

meeting of the ieee task force on process mining - tu/e...

process mining manifesto - win.tue.nl mining...