jorge munoz-gama advisor: josep carmona december 2014 conformance checking and diagnosis in process...
TRANSCRIPT
Jorge Munoz-Gama
Advisor: Josep Carmona
December 2014
CONFORMANCE CHECKING
AND DIAGNOSIS IN PROCESS MINING
PRECISION DECOMPOSITION
CONFORMANCE CHECKING
CONCLUSIONS
Conformance Checking in a Nutshell
[Figure: MODEL vs. REALITY (?); the PROCESS is described by DOMAIN EXPERTS]
Biased Vision
Conformance Checking in a Nutshell
[Figure: MODEL vs. REALITY (?); the PROCESS is observed through DOMAIN EXPERTS and event LOGS, then through LOGS alone. Example model activities: Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care]
Structure and Outline
• Structure of the Presentation
Problem – Context – Contributions
• Outline of the Presentation
  • Precision
    • Precision based on the Log
    • Qualitative Analysis of Precision Checking
    • Precision based on Alignments
  • Fitness Decomposition
    • Decomposed Conformance Checking
    • Topological Conformance Diagnosis
    • Data-aware Decomposed Conformance Checking
    • Event-based Real-time Decomposed Conformance Checking
[Outline slide: Precision > Precision based on the Log]
Problem: “Low Criticality Diagnosis” Process
[Figure: the “Low Criticality Diagnosis” process model (Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care), used by the hospital staff through the hospital process-aware information system]
First trace shown: Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Home Care
Second trace shown: Initial Examination, Radiology Test, Radiology Test, Radiology Test, Radiology Test, …
Context: The Importance of Precision
A good model must be fitting but also be precise
Context: Efficient and Comprehensive
• Approach to measure precision
• Based on potential points of improvement
• Does not require an exhaustive exploration of the model state space
• Previous works require model exploration or simulation
• Identifies precision problems at a fine granularity
• Results usable for analysis and process improvement
Contributions: Precision based on Escaping Arcs
[Figure: MODEL BEHAVIOR and LOG BEHAVIOR]
Exploring the model's behavior is costly, possibly infinite, or requires simulation.
Contributions: Precision based on Escaping Arcs
Model behavior traversal is restricted by the log behavior.
Escaping arcs: points where the model allows more behavior than that observed in the log.
[Figure: LOG BEHAVIOR with an ESCAPING ARC leaving it]
Contributions: Outline of Precision based on Escaping Arcs
Log:
  a, b, d, g, i
  a, c, d, e, f, h, i
  a, c, e, d, f, h, i
  a, c, e, f, d, h, i
[Figure: Petri net model over the activities a, b, c, d, e, f, g, h, i]
[Figure: pipeline. The log yields the Observed Behavior and the model yields the Modeled Behavior; Compute Precision combines both into ETC Precision (etcp) = 0.81 and the Minimal Imprecise Traces]
Minimal Imprecise Traces:
  a, c, f
  a, c, d, f
  a, c, d, e, e
  a, c, e, d, e
  a, c, e, e
Contributions: Observed Behavior
[Figure: prefix automaton built incrementally from the four log traces. Each state is weighted with the number of traces that reach it: 4 for the initial state and for the state after a; 3 after a, c; 2 after a, c, e; 1 for every other state]
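The construction of the observed behavior can be sketched in a few lines. This is only an illustration, assuming the automaton is stored as a mapping from trace prefixes to weights; the name `prefix_weights` is hypothetical and not taken from the thesis.

```python
from collections import Counter

def prefix_weights(log):
    """Weight of each prefix state: the number of traces that reach it."""
    weights = Counter()
    for trace in log:
        # Every prefix of the trace, including the empty one, is a state.
        for i in range(len(trace) + 1):
            weights[tuple(trace[:i])] += 1
    return weights

# The four-trace log from the slides, one character per activity.
LOG = ["abdgi", "acdefhi", "acedfhi", "acefdhi"]

weights = prefix_weights(LOG)
print(weights[()], weights[("a",)], weights[("a", "c")], weights[("a", "c", "e")])
```

The weights of the initial state, the state after a, the state after a, c and the state after a, c, e come out as 4, 4, 3 and 2, matching the annotations in the figure.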
Contributions: Modeled Behavior
[Figure: the prefix automaton extended with the behavior that the model allows in each observed state; five additional arcs (two labeled f, three labeled e) leave the observed behavior]
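Contributions: Compute Precision
• For each state of the automaton we take into account the weight, the observed arcs and the allowed arcs (escaping arcs are the allowed arcs that are not observed):
  $etc_p = 1 - \dfrac{\sum_{s \in \text{observed states}} \text{weight}(s) \cdot |\text{escaping arcs}(s)|}{\sum_{s \in \text{observed states}} \text{weight}(s) \cdot |\text{allowed arcs}(s)|}$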
Contributions: Computing Precision
[Figure: the weighted automaton with its escaping arcs, highlighting one state at a time]
• A state with weight 4, no escaping arcs and 2 allowed arcs contributes: etcp = 1 - (… + 4 · 0 + …) / (… + 4 · 2 + …)
• A state with weight 1, 1 escaping arc and 2 allowed arcs contributes: etcp = 1 - (… + 1 · 1 + …) / (… + 1 · 2 + …)
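The whole computation can be reproduced on the running example. The sketch below is not the thesis implementation: it assumes that the allowed arcs of a state are its observed arcs plus the escaping arcs implied by the minimal imprecise traces listed on the slides (each such trace is an observed prefix followed by one escaping activity). The function name `etc_precision` is hypothetical.

```python
from collections import Counter, defaultdict

LOG = ["abdgi", "acdefhi", "acedfhi", "acefdhi"]
# Minimal imprecise traces from the slides: observed prefix + one escaping activity.
MINIMAL_IMPRECISE_TRACES = ["acf", "acdf", "acdee", "acede", "acee"]

def etc_precision(log, imprecise_traces):
    weight = Counter()            # state (prefix) -> number of traces reaching it
    observed = defaultdict(set)   # state -> activities observed next in the log
    for trace in log:
        for i in range(len(trace) + 1):
            weight[trace[:i]] += 1
            if i < len(trace):
                observed[trace[:i]].add(trace[i])
    escaping = defaultdict(set)   # state -> activities allowed but never observed
    for t in imprecise_traces:
        escaping[t[:-1]].add(t[-1])
    # Assumption: allowed arcs = observed arcs + escaping arcs.
    numerator = sum(weight[s] * len(escaping[s]) for s in weight)
    denominator = sum(weight[s] * len(observed[s] | escaping[s]) for s in weight)
    return 1 - numerator / denominator

print(round(etc_precision(LOG, MINIMAL_IMPRECISE_TRACES), 2))
```

Under these assumptions the weighted sums are 8 for the escaping arcs and 43 for the allowed arcs, so the result rounds to the 0.81 reported on the slides.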
Contributions: Challenges Addressed
• The precision based on escaping arcs does not require a complete exploration of the model behavior.
• Instead, the model exploration is restricted by the behavior observed in the log.
• Escaping arcs pinpoint the situations that need to be fixed to achieve a completely precise system.
• Imprecisions are collected as an event log, the Minimal Imprecise Log:
  a, c, f
  a, c, d, f
  a, c, d, e, e
  a, c, e, d, e
  a, c, e, e
[Outline slide: Precision > Qualitative Analysis of Precision Checking]
Problem: The Effects of Exceptional Behavior
[Figure: the “Low Criticality Diagnosis” process model with observed traces over Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis and Home Care; a second build adds a further trace over the same activities]
Problem: Variability of Precision in the Future
[Figure: timeline from present to future. Current moment: ETC Precision 0.81; close future: ETC Precision ?; far future: ETC Precision ??]
Problem: Limited Resources and Imprecision Points
[Figure: a Hospital Process with its Imprecision Points, to be examined with Limited Analysts and Resources]
Context: Robustness, Confidence and Severity
• Make the precision based on escaping arcs more robust to exceptional behavior.
• Estimate the possible variability of the metric in the future.
• Assess the severity of imprecision points and compare them.
Contributions: Robustness on Escaping Arcs
[Figure: the automaton annotated with weights from a larger log (3199, 1765, 1435, 946, 818, 764 and 54); escaping arcs have weight 0]
[Figure: the same automaton after one more trace: the weights along its path rise to 3200 and 947, and a new branch of weight 1 appears]
Contributions: Robustness on Escaping Arcs
• Threshold parameter to cut exceptional behavior.
• Parametric threshold
  • High cut factor for main behavior
  • Low cut factor for extreme cases
• Local-context cut, not global-context cut
[Figure: two example states with the weights 499, 1500, 2, 13 and 499, 1500, 200, 300, 500]
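A local-context cut can be sketched as follows. The exact definition used in the thesis is not given on the slides, so this assumes that an allowed arc counts as escaping when its weight does not exceed a fraction `gamma` of the weight of its source state; `escaping_arcs`, `gamma` and the example numbers are all hypothetical.

```python
def escaping_arcs(state_weight, arc_weights, allowed, gamma=0.0):
    """Allowed activities whose weight stays at or below gamma * state weight.

    The cut is local: the threshold depends on the weight of this state,
    not on the size of the whole log.
    """
    cutoff = gamma * state_weight
    return {a for a in allowed if arc_weights.get(a, 0) <= cutoff}

# Hypothetical state reached by 1000 traces; w is allowed but never observed.
arcs = {"x": 700, "y": 299, "z": 1}
allowed = {"x", "y", "z", "w"}

print(sorted(escaping_arcs(1000, arcs, allowed)))              # no cut
print(sorted(escaping_arcs(1000, arcs, allowed, gamma=0.05)))  # cut rare arcs
```

With `gamma=0` only the unobserved arc w is escaping; with `gamma=0.05` the arc z, observed once in 1000 traces, is treated as exceptional and is cut as well.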
Contributions: Robustness on Escaping Arcs
[Figure: the same weighted automaton with its branch of weight 1]
Contributions: Confidence on Escaping Arcs Metric
[Figure: confidence in the metric, from Low Confidence to High Confidence, as a function of the log and a parameter K]
Contributions: Upper Estimation of Precision
[Figure: the weighted automaton with K = 3]
• Best scenario = covering escaping arcs
Contributions: Upper Estimation of Precision
• Optimization problem.
• Cover escaping arcs with the given k to maximize the metric.
• Cost of covering an escaping arc: the number of traces needed to surpass the threshold.
• Gain of covering an escaping arc: the weight of the state.
[Figure: BIP Formulation yields the Upper Estimation]
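The cost/gain trade-off above has the shape of a 0/1 selection problem. The sketch below solves a simplified version by exhaustive search rather than with a BIP solver, and it maximizes the total gain instead of the precision metric itself; the arc names, costs and gains are hypothetical.

```python
from itertools import combinations

def best_cover(arcs, k):
    """arcs: {name: (cost, gain)}. Best (total gain, names) with total cost <= k."""
    best = (0, ())
    names = sorted(arcs)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            cost = sum(arcs[n][0] for n in subset)
            gain = sum(arcs[n][1] for n in subset)
            # Keep the subset only if it fits the budget and improves the gain.
            if cost <= k and gain > best[0]:
                best = (gain, subset)
    return best

# Hypothetical escaping arcs: (traces needed to cover the arc, weight of its state).
ARCS = {"e1": (1, 947), "e2": (1, 946), "e3": (2, 1435), "e4": (3, 3200)}

print(best_cover(ARCS, 3))
print(best_cover(ARCS, 2))
```

Exhaustive search is exponential in the number of escaping arcs, which is why a dedicated BIP formulation is preferable for real logs.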
Contributions: Lower Estimation of Precision
[Figure: the weighted automaton with K = 1; a new trace of length avg adds states of weight 1, each annotated with A-1, which yields the Lower Estimation]
• Worst scenario = new escaping arcs
Contributions: Severity of the Escaping Arcs
• All imprecisions equally important?
• Subjective and multifactor
• Weight, Alternation, Stability, Criticality
[Figure: a weighted automaton over the activities A to H whose escaping arcs (weight 0) are classified as severe, mid or low]
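Contributions: Weight of an Escaping Arc
• Escaping arcs in parts with more weight are more severe
[Figure: an escaping arc (weight 0) in a state of weight 10000 (other arcs 7000 and 3000) compared with one in a state of weight 10 (other arcs 7 and 3)]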
Contributions: Alternation of an Escaping Arc
• The more chances to make a mistake, the more severe
Contributions: Stability of an Escaping Arc
• Apply a perturbation
  • Increase the number of instances at that point
  • Proportional to the current number of occurrences
• Measure how easy it is to surpass the threshold
• An imprecision that is stable under perturbation is more severe
[Figure: arc weights 10000, 0, 7000, 3000 before the perturbation and 10000, 99, 6901, 3000 after it]
Contributions: Criticality of an Escaping Arc
• Importance of the task involved in the escaping arc
[Figure: escaping arcs involving the tasks Check Date Format and Bank Transfer, compared by severity]
Contributions: Challenges Addressed
• Robustness of the precision based on escaping arcs.
• Confidence interval for the precision metric.
• Severity assessment of the precision problems.
[Outline slide: Precision > Precision based on Alignments]
Problem: Precision on Unfitting Scenarios
[Figure: the “Low Criticality Diagnosis” process model with observed traces over Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis and Home Care]
Perfect fitness is uncommon in real life
Contributions: Unfitting Observed Behavior
[Figure: Log Trace vs. Model Behavior (?)]
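Problem: Fitness Effects on Precision based on the Log
Log trace: a, a, b, b, d
[Figure: Petri net model over the activities a, b, c, d, and the automaton states reached while replaying the trace: weights 1, 1, 1, 1, then ???]
What state does the model reach when the trace does not fit?
• Option: not considering the unfitting part.
• The position of the fitting problem influences the precision.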
Context: Precision Independent of Fitness
• Unfitting scenarios are common in real life
• Precision independent of fitness
• Precision is not based directly on the log but on a prior alignment between the observed behavior and the modeled behavior.
Context: Aligning Observed and Modeled Behavior
[Figure: a Log Trace aligned with the Model Behavior]
• Find the closest model trace in the model behavior for a given log trace
  • From a global perspective
  • Able to deal with unfitting behavior
  • Optimality guaranteed
• Time-consuming problem based on A* search algorithms
* Adriansyah, A.: Aligning Observed and Modeled Behavior. PhD Thesis. Eindhoven University of Technology. 2014
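Contributions: Precision based on Alignments
[Figure: the precision pipeline (Observed Behavior, Modeled Behavior, Compute Precision, ETC Precision (etcp) 0.81, Minimal Imprecise Traces) with Alignments as input]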
Contributions: Aligning Observed and Modeled Behavior
[Figure: Process Model, a Petri net over the activities a, b, c, d]
Log Trace: a, d, a, b
[Figure: Alignment of the log trace with the model trace a, a, b, d; the deviations appear as Log Moves and Model Moves]
The aligned model trace a, a, b, d is a fitting trace, the closest to the original
Contributions: Aligning Observed and Modeled Behavior
[Figure: Process Model, a Petri net over the activities a, b, c, d]
Log Trace: a, d
Alignment 1: model trace a, b, d
Alignment 2: model trace a, c, d
Both alignments are optimal
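The notion of an optimal alignment can be illustrated on the two examples above. This sketch is not the A* technique cited on the slides: it assumes the model behavior is given as a small explicit list of model traces and that every log move and every model move costs 1. The function names are hypothetical.

```python
from functools import lru_cache

def alignment_cost(log_trace, model_trace):
    """Minimal number of log moves plus model moves (synchronous moves cost 0)."""
    @lru_cache(maxsize=None)
    def cost(i, j):
        if i == len(log_trace):
            return len(model_trace) - j      # only model moves remain
        if j == len(model_trace):
            return len(log_trace) - i        # only log moves remain
        best = 1 + min(cost(i + 1, j), cost(i, j + 1))
        if log_trace[i] == model_trace[j]:
            best = min(best, cost(i + 1, j + 1))
        return best
    return cost(0, 0)

def optimal_model_traces(log_trace, model_traces):
    """Cost of the optimal alignments and every model trace that reaches it."""
    costs = {m: alignment_cost(log_trace, m) for m in model_traces}
    best = min(costs.values())
    return best, sorted(m for m, c in costs.items() if c == best)

print(alignment_cost("adab", "aabd"))
print(optimal_model_traces("ad", ["abd", "acd"]))
```

For the log trace a, d, a, b and the model trace a, a, b, d the cost is 2 (one log move and one model move), and for the log trace a, d both a, b, d and a, c, d are optimal with cost 1.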
Contributions: Precision based on Alignments
[Figure: the precision pipeline with Alignments as input and a New weight function for the Observed Behavior]
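Contributions: Observed Behavior from 1-Alignment
Event Log:
  a, a, b, d
  a, b, d
  a, d, a, b
  a, d
Fitting Traces (one optimal alignment per log trace; a, d is aligned with a, b, d rather than a, c, d):
  a, a, b, d
  a, b, d
  a, a, b, d
  a, b, d
[Figure: Process Model over the activities a, b, c, d, and the automaton built from the fitting traces: weight 4 for the initial state and the state after a, weight 2 for every other state]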
Contributions: Observed Behavior from All-Alignment
Event Log:
  a, a, b, d
  a, b, d
  a, d, a, b
  a, d
Fitting Traces (all optimal alignments per log trace):
  a, a, b, d
  a, b, d
  a, a, b, d
  a, b, d / a, c, d
[Figure: Process Model over the activities a, b, c, d, and the automaton built from the fitting traces: weight 4 for the initial state and the state after a; 2 along a, a, b, d; 1.5 along a, b, d; 0.5 along a, c, d]
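The fractional weights in the figure are consistent with a weight function in which each log trace contributes a total weight of 1, split evenly among its optimal alignments. The sketch below assumes exactly that; it is inferred from the numbers on the slides rather than quoted from the thesis, and the function name is hypothetical.

```python
from collections import defaultdict

def prefix_weights_from_alignments(aligned):
    """aligned: for each log trace, the list of its optimally aligned model traces.

    Each log trace contributes a total weight of 1, split evenly among its alignments.
    """
    weights = defaultdict(float)
    for model_traces in aligned:
        share = 1 / len(model_traces)
        for trace in model_traces:
            for i in range(len(trace) + 1):
                weights[trace[:i]] += share
    return dict(weights)

# Aligned traces for the event log a,a,b,d / a,b,d / a,d,a,b / a,d.
ONE_ALIGNMENT = [["aabd"], ["abd"], ["aabd"], ["abd"]]
ALL_ALIGNMENTS = [["aabd"], ["abd"], ["aabd"], ["abd", "acd"]]

one = prefix_weights_from_alignments(ONE_ALIGNMENT)
all_ = prefix_weights_from_alignments(ALL_ALIGNMENTS)
print(one["a"], one["aa"], one["ab"])
print(all_["a"], all_["aa"], all_["ab"], all_["ac"])
```

With one alignment per trace the states after a, a and after a, b both weigh 2; with all alignments the trace a, d is shared between a, b, d and a, c, d, which gives the weights 1.5 and 0.5.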
Contributions: Challenges Addressed
• Precision based on alignments.
• Precision for unfitting cases.
• Precision independent of fitness.
• Precision based on 1-alignment or all-alignments.
Contributions: Extensions to Precision based on Alignments
• Extensions to represent the modeled behavior.
• Use of representative alignments.
• Multisets to represent automaton states.
• Backwards use of the alignments.
[Figure: example automata for these extensions, built from the traces a, c, d, e and b, c, d, e]
[Outline slide: Fitness Decomposition > Decomposed Conformance Checking]
Problem: Fitness in Large Models
[Figure: the “Low Criticality Diagnosis” process model with observed traces over Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis and Home Care]
Context: Fast, Comprehensible and Guaranteed
• Decompose the fitness checking problem.
• Comprehensible decomposition and understandable diagnosis results.
• Formal guarantees.
• There is a fitness problem in the original net iff there is a fitness problem in one or more of the components.
• Fast compared to the monolithic approach.
The decomposition preserves fitness.
Contributions: Alignment Fitness Checking
[Figure: a Log Trace aligned with the Model Behavior]
Contributions: Decomposing Alignment Fitness Checking
[Figure: the Log Trace and the Model Behavior split into parts that are aligned separately]
Contributions: Decomposition based on Graphs
• Based on graph decomposition
[Figure: example graph over the transitions t1 to t7]
• Decomposition based on:
  • Single-Entry Single-Exit components (SESE)
  • Refined Process Structure Tree (RPST)
* Polyvyanyy, A.: Structuring Process Models. PhD Thesis. University of Potsdam (Germany), January 2012
* Hopcroft, J., Tarjan, R.E.: Dividing a graph into triconnected components. SIAM J. Comput. 2(3), 1973
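Contributions: Interior, Boundary, Entry, and Exit Nodes
• Entry node: boundary node where
  • no incoming edge belongs to the component
  • or all outgoing edges belong to the component
• Exit node: boundary node where
  • no outgoing edge belongs to the component
  • or all incoming edges belong to the component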
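The node classification above can be checked mechanically. The following sketch assumes a graph given as a list of directed edges and a component given as a subset of those edges; the function names and the example graph are hypothetical.

```python
def classify_nodes(edges, fragment):
    """Classify each node of `fragment` (a subset of `edges`) as interior, entry or exit."""
    fragment = set(fragment)
    result = {}
    for n in {node for edge in fragment for node in edge}:
        incoming = [e for e in edges if e[1] == n]
        outgoing = [e for e in edges if e[0] == n]
        if all(e in fragment for e in incoming + outgoing):
            result[n] = "interior"      # touches only edges of the fragment
        elif not any(e in fragment for e in incoming) or all(e in fragment for e in outgoing):
            result[n] = "entry"
        elif not any(e in fragment for e in outgoing) or all(e in fragment for e in incoming):
            result[n] = "exit"
        else:
            result[n] = "boundary"      # boundary node that is neither entry nor exit
    return result

def is_sese(edges, fragment):
    kinds = list(classify_nodes(edges, fragment).values())
    return kinds.count("entry") == 1 and kinds.count("exit") == 1 and "boundary" not in kinds

# Hypothetical graph: s -> a, then a -> b -> c and a -> c, then c -> t.
EDGES = [("s", "a"), ("a", "b"), ("b", "c"), ("a", "c"), ("c", "t")]
INNER = [("a", "b"), ("b", "c"), ("a", "c")]

print(classify_nodes(EDGES, INNER))
print(is_sese(EDGES, INNER), is_sese(EDGES, [("a", "b"), ("a", "c")]))
```

In this sketch a boundary node that satisfies both conditions is reported as an entry, a simplification that does not matter for the example.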
Contributions: SESE and RPST
Example of SESE and RPST
• SESE: set of edges whose graph has a Single Entry node and a Single Exit node
• Refined Process Structure Tree (RPST): tree containing non-overlapping SESEs
  • Unique
  • Modular
  • Computable in linear time
Contributions: SESE and RPST
• Why SESE?
  • Only one entry; only one exit
  • Represent subprocesses within the process
  • Intuitive for conformance diagnosis
• Why RPST?
  • Partitioning over the RPST
  • Any cut is a partitioning
  • Algorithm to partition by size (k)
[Figure: an RPST with node sizes 16, 4, 8, 4 and 4, partitioned with k < 5]
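Partitioning by size over the tree can be sketched as a top-down cut. This is only an illustration of the idea that any cut of the RPST is a partitioning: it assumes each tree node is a `(name, size, children)` tuple and that a node is kept whole once its size is at most `k`; the tree below is hypothetical and the thesis algorithm may differ in its details.

```python
def partition_by_size(node, k):
    """Names of the components obtained by cutting the tree at nodes of size <= k."""
    name, size, children = node
    if size <= k or not children:
        return [name]            # small enough, or a leaf that cannot be split further
    parts = []
    for child in children:
        parts.extend(partition_by_size(child, k))
    return parts

# Hypothetical RPST: a root of size 16 with children of sizes 8, 4 and 4.
TREE = ("S", 16, [
    ("S1", 8, [("S1a", 4, []), ("S1b", 4, [])]),
    ("S2", 4, []),
    ("S3", 4, []),
])

print(partition_by_size(TREE, 4))
print(partition_by_size(TREE, 8))
print(partition_by_size(TREE, 16))
```

A smaller k yields more and smaller components, while a k at least as large as the root returns the monolithic model.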
Contributions: Preserving the Fitness
• Does a decomposition based on SESEs preserve fitness?
• Fitness preservation: a model/log pair is perfectly fitting if and only if all the components are perfectly fitting
Contributions: SESE Decomposition does not Preserve Fitness
• SESEs (per se) do not preserve fitness.
[Figure: a net S with the transitions a, b, c, d, e, f decomposed into the SESEs S1 and S2, which share the place p]
• 0 tokens in p: for abcdef, S2 is blocked
• 1 token in p: abcdef fits S but not S2
• 2 tokens in p: abdecf fits S1 and S2 but not S
Contributions: Valid Decomposition
• The problem is in the shared places
• They are not reflected in the log, therefore there is no synchronization.
• Valid decomposition: a partition where only transitions are shared among components, neither places nor arcs.
• There is a fitness problem in the original net iff there is a fitness problem in one or more of the components.
Theorem: a valid decomposition preserves fitness.
* W.M.P. van der Aalst: Decomposing Petri nets for process mining: A generic approach. Distributed and Parallel Databases, 2013
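The sharing condition can be checked directly. The sketch below covers only the condition stated on the slide (no place and no arc in more than one component) and none of the further requirements of the cited definition; the component layout as dictionaries and the small nets are hypothetical.

```python
def shares_only_transitions(components):
    """True if no place and no arc belongs to more than one component."""
    for kind in ("places", "arcs"):
        seen = set()
        for comp in components:
            if seen & comp[kind]:
                return False
            seen |= comp[kind]
    return True

# Two components that share the place p.
S1 = {"places": {"p1", "p"}, "transitions": {"a", "b"},
      "arcs": {("a", "p1"), ("p1", "b"), ("b", "p")}}
S2 = {"places": {"p", "p2"}, "transitions": {"d", "e"},
      "arcs": {("p", "d"), ("d", "p2"), ("p2", "e")}}

# The same net with p moved into its own component B1; b and d are now shared.
S1_BRIDGED = {"places": {"p1"}, "transitions": {"a", "b"},
              "arcs": {("a", "p1"), ("p1", "b")}}
B1 = {"places": {"p"}, "transitions": {"b", "d"},
      "arcs": {("b", "p"), ("p", "d")}}
S2_BRIDGED = {"places": {"p2"}, "transitions": {"d", "e"},
              "arcs": {("d", "p2"), ("p2", "e")}}

print(shares_only_transitions([S1, S2]))
print(shares_only_transitions([S1_BRIDGED, B1, S2_BRIDGED]))
```

The first decomposition fails the check because p appears twice; isolating p together with its adjacent transitions makes the transitions the only shared elements.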
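Contributions: Bridging a SESE Decomposition
• Create a ‘bridge’ for each shared place
[Figure: S1 and S2 become S1’ and S2’, and the bridge B1 contains the shared place p together with the transitions b, c, d and e]
Notice that the components are not SESEs anymore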
Contributions: SESE + Bridging Theorem
SESE decomposition with Bridging is a valid decomposition.
Theorem: SESE decomposition with Bridging post-processing preserves fitness.
Contributions: Decomposition Fitness Results
[Figure: computation time. Monolithic: 1h 15min; Decomposition (7): 2min]
[Outline slide: Fitness Decomposition > Topological Conformance Diagnosis]
Problem: Locate Fitness Problems in Large Models
Context: Problematic Components
• Do more than just report the list of model components with fitness problems.
• Provide a structure among the components.
• Visualize the structure of the decomposition.
• Use the structure to detect highly related conflicting components.
Contributions: Topological Fitness Checking
• Non-Fitting (Weakly) Connected Components
• Non-Fitting Subnet
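Grouping non-fitting components that are (weakly) connected can be sketched with a plain graph traversal. The sketch assumes the topology of the decomposition is available as a directed adjacency mapping between components; the function name and the example topology are hypothetical.

```python
def non_fitting_groups(adjacency, non_fitting):
    """Weakly connected groups among the non-fitting components.

    adjacency: {component: set of components it is directly connected to}.
    """
    # Ignore edge direction and keep only edges between non-fitting components.
    neighbours = {c: set() for c in non_fitting}
    for src, targets in adjacency.items():
        for dst in targets:
            if src in neighbours and dst in neighbours:
                neighbours[src].add(dst)
                neighbours[dst].add(src)
    groups, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:                      # depth-first traversal of one group
            c = stack.pop()
            if c not in group:
                group.add(c)
                stack.extend(neighbours[c] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

# Hypothetical chain of five components; S1, S2 and S4 have fitness problems.
TOPOLOGY = {"S1": {"S2"}, "S2": {"S3"}, "S3": {"S4"}, "S4": {"S5"}, "S5": set()}

print(non_fitting_groups(TOPOLOGY, {"S1", "S2", "S4"}))
```

Adjacent problematic components such as S1 and S2 are reported together, which points the analyst to one larger problematic region instead of two unrelated ones.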
[Outline slide: Fitness Decomposition > Data-aware Decomposed Conformance Checking]
Problem: Fitness in Data-aware Models
[Figure: the “Low Criticality Diagnosis” process model with an observed trace: Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Home Care]
[Figure: the same model and trace with data (tests, diagnosis): Allergy Test - FAIL, Blood Test - PASS, Radiology Test - PASS, Diagnosis - HOME]
Large medical data-aware models
Context: Data-aware Conformance Checking
• Existing techniques for data-aware fitness checking are time-consuming: they are based on A* (control-flow) + ILP (data)
• Decompose the data-aware fitness problem.
• Meaningful decomposition and diagnostic results.
• Formal guarantees on the fitness correctness.
• Fast compared with the monolithic approach.
Contributions: Valid Decomposition of Data-aware Models
[Figure: a net over the transitions t1 to t7 split into two components that share the place p: No Synchronization]
• Shared places can be out of synchronization during the fitness checking.
• Valid decompositions (no places or arcs shared) preserve fitness.
[Figure: two components (t1 to t4, and t4 to t7) that share a data variable: No Synchronization]
Theorem: a valid decomposition of Petri nets with data (no shared places, arcs, or data variables) preserves fitness.
Details in the thesis
Contributions: Valid Decomposition of Data-aware Models
• Petri nets with data are graphs.
• Decomposition based on SESEs for comprehensive results.
[Figure: two components (t1 to t4, and t4 to t7), each with its own data]
Contributions: Valid Decomposition of Data-aware Models
• Improvement in the control flow + improvement in the data
[Figure: average computation time (s), from 0 to 8000, against the average number of events per event-log trace, from 5 to 25, for Decomp and No Decomp]
Real case (Dutch municipality): from 52891 seconds to 52 seconds (99%)
[Outline slide: Fitness Decomposition > Event-based Real-time Decomposed Conformance Checking]
Problem: Real-life Monitoring of Hospital Processes
[Figure: the running Hospital Processes and a Large Process Model feed a Process-aware Monitoring System, which produces Conformance Reports and Conformance Alarms]
Context: Event-based, Fast, and Comprehensible
• Real-life fitness monitoring architecture for large process models.
• Based on events, not on complete traces.
• Real-time operation requires time efficiency.
• Comprehensive results as part of the monitoring procedure.
Contributions: Event-based Real-time Decomposed Fitness
[Figure: a Stream of Events checked against the Decomposed Model]
Contributions: Decomposition based on SESE
Contributions: Event-based Real-time Decomposed Fitness
• Heuristic Replay
  • Faster compared with alignments.
  • Consequences of bad decisions are limited to the fragment.
  • Event based.
  • Not optimal, but heuristic.
[Figure: Replay of the Log Trace a, c, f on a model over the activities a, b, c, d, e, f, using a Look-ahead Heuristic]
Contributions: Example of Real-time Decomposed Fitness
[Outline slide: Conclusions]
Contributions of the Thesis
Precision
• Approach to quantify and analyze the precision between a log and a model based on escaping arcs.
• Robustness and confidence interval for precision based on escaping arcs.
• Severity assessment of the imprecision points detected.
• Precision checking based on aligning observed and modeled behavior.
• Abstraction and directionality in precision based on alignments.
Fitness Decomposition
• Decomposed conformance checking based on SESE components.
• Hierarchical and topological decomposition based on SESE components for conformance diagnosis.
• Decomposed conformance checking for data-aware models.
• Decomposed conformance checking for real-time scenarios.
Publications of the Thesis (Precision)
• Jorge Munoz-Gama, Josep Carmona: A Fresh Look at Precision in Process Conformance. BPM 2010, pp. 211-226
• Jorge Munoz-Gama, Josep Carmona: Enhancing Precision in Process Conformance: Stability, Confidence and Severity. CIDM 2011, pp. 184-191
• Jorge Munoz-Gama, Josep Carmona: A General Framework for Precision Checking. Journal of Innovative Computing, Information and Control, vol. 8, no. 7B
• Arya Adriansyah, Jorge Munoz-Gama, Josep Carmona, Boudewijn F. van Dongen, Wil M. P. van der Aalst: Alignment Based Precision Checking. BPM Workshops 2012, pp. 137-149
• Arya Adriansyah, Jorge Munoz-Gama, Josep Carmona, Boudewijn F. van Dongen, Wil M. P. van der Aalst: Measuring Precision of Modeled Behavior. Information Systems and e-Business Management
Publications of the Thesis (Decomposition)
• Jorge Munoz-Gama, Josep Carmona, Wil M. P. van der Aalst: Conformance Checking in the Large: Partitioning and Topology. BPM 2013, pp. 130-145, Best Student Paper Award
• Jorge Munoz-Gama, Josep Carmona, Wil M. P. van der Aalst: Hierarchical Conformance Checking of Process Models Based on Event Logs. Petri Nets 2013, pp. 291-310
• Jorge Munoz-Gama, Josep Carmona, Wil M. P. van der Aalst: Single-Entry Single-Exit Decomposed Conformance Checking. Information Systems, vol. 46, pp. 102-122
• Massimiliano de Leoni, Jorge Munoz-Gama, Josep Carmona, Wil M. P. van der Aalst: Decomposing Conformance Checking on Petri Nets with Data. CoopIS 2014, pp. 3-20
• Seppe K.L.M. vanden Broucke, Jorge Munoz-Gama, Josep Carmona, Bart Baesens, Jan Vanthienen: Event-based Real-time Decomposed Conformance Analysis. CoopIS 2014, pp. 345-363
Impact of the Thesis
• Published in international journals and international conferences
• Best Student Paper Award at BPM 2013 (acceptance rate 14%)
• Extensively used in the field
  • 150 citations
  • Used for:
    • measuring precision and fitness in models
    • evaluating discovery algorithms
    • guiding discovery techniques based on genetic algorithms
    • the CoBeFra framework
    • recommender systems
    • training
Directions for Future Work
• New metrics, new dimensions
• Decomposed alignment of observed and modeled behavior
• Decomposed conformance for other dimensions
• Visualization and diagnosis
• Model repair
Thesis and Acknowledgements
• More details in:
• … and to all the people that made this work possible, THANKS!
Backup Slides
Contributions: Precision based on Escaping Arcs
Escaping arcs: points where the model allows more behavior than that observed in the log.
Problem: “Low Criticality Diagnosis” Process
[Figure: the “Low Criticality Diagnosis” process model (Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care) is run by Process Simulation Software to produce Simulation Results]