Process Mining Approaches
TRANSCRIPT
Process Mining
A process management technique that allows for the analysis of business processes based on event logs.
Algorithms are applied to event log datasets to find patterns and details contained in the event logs recorded by an information system.
The objective is to make processes more efficient and to improve them.
Classification
Discovery: A discovery technique takes an event log and produces a process model without using any a-priori information.
Conformance checking: An existing process model is compared with an event log of the same process.
Enhancement: The main idea is to extend or improve an existing process model using information about the actual process recorded in some event log.
Approaches Used
Direct algorithmic approaches
Two-phase approaches
Computational intelligence approaches
Partial approaches
Direct Algorithmic Approaches
These approaches extract a footprint from the event log and use this footprint to directly construct a process model.
The Alpha algorithm is an example of a direct approach: an ordering relation is extracted from the log and, based on this relation, a Petri net is constructed.
Language-based region techniques are also placed in this family.
We apply an algorithm to the logs and directly derive the process model.
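The footprint idea can be illustrated with a minimal Python sketch. It only computes the ordering relations that the Alpha algorithm starts from; it does not build the Petri net. The function name `footprint` and the three-trace example log are assumptions made for illustration.

```python
from itertools import product

def footprint(log):
    """Return the footprint relation for every ordered pair of activities."""
    activities = sorted({a for trace in log for a in trace})
    # Directly-follows pairs: (a, b) is present if b immediately follows a in some trace.
    follows = {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}
    relation = {}
    for a, b in product(activities, repeat=2):
        if (a, b) in follows and (b, a) not in follows:
            relation[a, b] = "->"   # causality: a is followed by b, never the reverse
        elif (b, a) in follows and (a, b) not in follows:
            relation[a, b] = "<-"   # reverse causality
        elif (a, b) in follows and (b, a) in follows:
            relation[a, b] = "||"   # both orders occur: treated as parallel
        else:
            relation[a, b] = "#"    # the two never directly follow each other
    return relation

# Hypothetical event log: each trace is the activity sequence of one case.
log = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]
fp = footprint(log)
print(fp["a", "b"], fp["b", "c"], fp["b", "e"])
```

In this toy log, b and c occur in both orders, so they are classified as parallel, while b and e never follow each other directly.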
Two-Phase Approaches
These use a two-step approach. In the first step, a “low-level model” (e.g., a transition system or Markov model) is constructed.
In the second step, the low-level model is converted into a “high-level model” that can express concurrency and other (more advanced) control-flow patterns.
For example, a transition system is extracted from the log using a customizable abstraction mechanism.
The transition system is then converted into a Petri net using so-called state-based regions. The resulting model can be visualized as a Petri net, but can also be converted into other notations (e.g., BPMN and EPCs).
Similar approaches can be envisioned using hidden Markov models. Using an Expectation-Maximization (EM) algorithm such as the Baum–Welch algorithm, the “most likely” Markov model can be derived from a log.
This model is then converted into a high-level model.
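The first phase can be sketched in Python as follows. This is a minimal illustration under assumptions: states are abstractions of the prefix of a trace, the function name `build_transition_system` and the example log are hypothetical, and the second phase (synthesizing a Petri net with state-based regions) is not shown.

```python
from collections import defaultdict

def build_transition_system(log, abstraction=frozenset):
    """Map each state to its outgoing (activity, target state) arcs.

    A state is the abstraction of the trace prefix seen so far. The
    abstraction is the customizable part: frozenset forgets order and
    frequency, tuple keeps the full prefix.
    """
    transitions = defaultdict(set)
    for trace in log:
        for i, activity in enumerate(trace):
            source = abstraction(trace[:i])
            target = abstraction(trace[:i + 1])
            transitions[source].add((activity, target))
    return transitions

# Hypothetical log in which b and c occur in either order.
log = [("a", "b", "c", "d"), ("a", "c", "b", "d")]
ts = build_transition_system(log)             # set abstraction
ts_seq = build_transition_system(log, tuple)  # full-sequence abstraction
print(len(ts), len(ts_seq))
```

With the set abstraction both traces meet again in the state {a, b, c}, so the transition system is smaller than with the full-sequence abstraction, where every distinct prefix is its own state.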
Hidden Markov Model
Set of states: {s1, s2, s3, …, sn}
The process moves from one state to another, generating a sequence of states: s1, s2, …
Markov chain property: the probability of each subsequent state depends only on the previous state.
Hidden Markov Model
We want to infer a robot's mood, happy or sad, by observing what it does: watching a movie (W), sleeping (S), crying (C), or using Facebook (F).
X is the hidden state: X = h if the robot is happy, X = s if it is sad.
Y is the observation: w, s, c, or f.
We want to answer queries such as: P(X = h | Y = f)? P(X = s | Y = c)?
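Such a query can be answered for a single observation with Bayes' rule. The sketch below is only an illustration: the prior and emission probabilities are invented numbers, not values from the slides, and the hidden states are spelled out as "happy" and "sad" to avoid a clash with the observation symbol s. For a whole sequence of observations, the forward algorithm would be needed instead.

```python
# Assumed prior over the hidden mood (illustrative numbers only).
prior = {"happy": 0.5, "sad": 0.5}

# Assumed emission probabilities P(Y | X); each row sums to 1.
emission = {
    "happy": {"w": 0.4, "s": 0.3, "c": 0.1, "f": 0.2},
    "sad":   {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1},
}

def posterior(state, observation):
    """P(X = state | Y = observation) by Bayes' rule."""
    # Joint probability P(X = x, Y = observation) for every hidden state x.
    joint = {x: prior[x] * emission[x][observation] for x in prior}
    return joint[state] / sum(joint.values())

p_happy_given_f = posterior("happy", "f")
p_sad_given_c = posterior("sad", "c")
print(round(p_happy_given_f, 3), round(p_sad_given_c, 3))
```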
Computational Intelligence Approaches
Techniques originating from the field of computational intelligence form the basis for the third family of process discovery approaches.
Examples of techniques are genetic programming, genetic algorithms, simulated annealing, reinforcement learning, machine learning, neural networks, fuzzy sets, rough sets, and swarm intelligence.
The log is not directly converted into a model; instead, an iterative procedure is used to mimic the process of natural evolution.
A genetic process mining approach starts with an initial population of individuals. Each individual corresponds to a randomly generated process model. For each individual, a fitness value is computed describing how well the model fits the log.
Populations evolve by selecting the fittest individuals and generating new individuals using genetic operators such as crossover (combining parts of two individuals) and mutation (random modification of an individual). The fitness gradually increases from generation to generation. The process stops once an individual of acceptable quality is found.
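The evolutionary loop can be sketched in Python. This is a toy under stated assumptions, not the genetic miner itself: an individual is simply a set of directly-follows arcs rather than a Petri net, the fitness function (share of replayable traces minus a small size penalty) is invented for illustration, and all names and parameter values are hypothetical.

```python
import random

def fitness(model, log):
    """Share of traces whose every step is an arc of the model, minus a size penalty."""
    replayed = sum(all(pair in model for pair in zip(t, t[1:])) for t in log)
    return replayed / len(log) - 0.01 * len(model)

def crossover(parent_a, parent_b, arcs, rng):
    """For each possible arc, inherit its presence from one parent at random."""
    return frozenset(arc for arc in arcs
                     if arc in (parent_a if rng.random() < 0.5 else parent_b))

def mutate(model, arcs, rng, rate=0.05):
    """Flip the presence of each arc with a small probability."""
    return frozenset(arc for arc in arcs if (arc in model) != (rng.random() < rate))

def evolve(log, generations=40, size=30, target=0.9, seed=0):
    rng = random.Random(seed)
    activities = sorted({a for t in log for a in t})
    arcs = [(a, b) for a in activities for b in activities]
    # Initial population: randomly generated models.
    population = [frozenset(arc for arc in arcs if rng.random() < 0.5)
                  for _ in range(size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda m: fitness(m, log), reverse=True)
        history.append(fitness(population[0], log))
        if history[-1] >= target:        # acceptable quality reached
            break
        elite = population[: size // 2]  # selection: keep the fittest half
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite), arcs, rng),
                           arcs, rng)
                    for _ in range(size - len(elite))]
        population = elite + children
    return max(population, key=lambda m: fitness(m, log)), history

log = [("a", "b", "c", "d"), ("a", "c", "b", "d")]
best, history = evolve(log)
print(sorted(best), round(fitness(best, log), 2))
```

Because the fittest half survives unchanged into the next generation, the best fitness recorded in `history` never decreases, which mirrors the gradual improvement described above.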
Machine Learning
Determine rules from data/facts
Improve performance with experience
Getting computers to program themselves
Sketch of an Induction Algorithm
1. Calculate for each attribute how well it classifies the elements of the training set.
2. Classify with the best attribute.
3. Repeat the first two steps for each resulting subtree.
4. Stop this recursive process as soon as a termination condition is satisfied.
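The four steps above can be sketched as a small decision-tree learner. The slides do not say how "how well it classifies" is measured; information gain is assumed here, as in ID3. The training rows and attribute names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(rows, target):
    counts = Counter(row[target] for row in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())

def information_gain(rows, attribute, target):
    """Reduction in entropy obtained by splitting the rows on one attribute."""
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset, target)
    return entropy(rows, target) - remainder

def induce_tree(rows, attributes, target):
    labels = Counter(row[target] for row in rows)
    # Step 4: stop when all rows share one class or no attribute is left.
    if len(labels) == 1 or not attributes:
        return labels.most_common(1)[0][0]
    # Steps 1 and 2: score every attribute and split on the best one.
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attributes if a != best]
    # Step 3: repeat for each resulting subtree.
    return {best: {value: induce_tree([r for r in rows if r[best] == value], rest, target)
                   for value in {row[best] for row in rows}}}

# Hypothetical training set.
rows = [
    {"amount": "low",  "customer": "new",   "outcome": "accept"},
    {"amount": "low",  "customer": "known", "outcome": "accept"},
    {"amount": "high", "customer": "new",   "outcome": "reject"},
    {"amount": "high", "customer": "known", "outcome": "accept"},
    {"amount": "low",  "customer": "new",   "outcome": "accept"},
]
tree = induce_tree(rows, ["amount", "customer"], "outcome")
print(tree)
```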
Partial Approaches
The approaches described so far produce a complete end-to-end process model.
It is also possible to focus on rules or frequent patterns, for example with an approach for mining sequential patterns.
This approach is similar to the discovery of association rules; however, now the order of events is taken into account.
For frequent episodes, a sliding window is used to analyze how frequently an “episode” (a partial order) appears.
Approaches also exist to learn declarative (LTL-based) languages like Declare.
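The sliding-window idea can be sketched in Python. Several simplifications are assumed: events carry no timestamps (the position in the sequence stands in for time), only windows that lie fully inside the sequence are counted, the window width does not exceed the sequence length, and only serial episodes (a total order) are checked rather than general partial orders. The function names and the example sequence are hypothetical.

```python
def contains_serial(window, episode):
    """True if the episode's activities occur in the window in the given order."""
    events = iter(window)
    # 'activity in events' consumes the iterator, so order is respected.
    return all(activity in events for activity in episode)

def episode_frequency(sequence, episode, width):
    """Fraction of sliding windows of the given width that contain the episode."""
    windows = [sequence[i:i + width] for i in range(len(sequence) - width + 1)]
    hits = sum(contains_serial(w, episode) for w in windows)
    return hits / len(windows)

# Hypothetical event sequence.
sequence = list("abcabdacb")
freq_ab = episode_frequency(sequence, ("a", "b"), 3)
freq_ac = episode_frequency(sequence, ("a", "c"), 3)
print(freq_ab, freq_ac)
```

An episode would then be reported as frequent when this fraction exceeds a chosen threshold.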
PROLOG
PROLOG (= PROgramming in LOGic) is a programming language based on Horn clauses.
father(peter,mary).
father(peter,john).
mother(mary,mark).
mother(jane,mary).
grandfather(X,Z) :- father(X,Y), father(Y,Z).
grandfather(X,Z) :- father(X,Y), mother(Y,Z).
Heuristic Miner
The Heuristics Miner is a practically applicable mining algorithm that can deal with noise and can be used to express the main behavior (that is, not all details and exceptions) registered in an event log.
It extends the alpha algorithm by considering the frequency of traces in the log.
The Heuristics Miner plug-in mines the control-flow perspective of a process model.
It considers the order of the events within a case. These algorithms take frequencies of events and sequences into account when constructing a process model.
Steps
The construction of the dependency graph
For each activity, the construction of the input and output expressions
The search for long-distance dependency relations
1. Read a log
2. Get the set of tasks
3. Infer the ordering relations based on their frequencies
4. Build the net based on the inferred relations
5. Output the net
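The first step, the construction of the dependency graph, can be sketched in Python. The sketch uses the commonly cited dependency measure (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), where |a>b| counts how often b directly follows a. The threshold value, the example log, and the function names are assumptions; self-loops, input and output expressions, and long-distance dependencies are not covered.

```python
from collections import Counter

def dependency_measures(log):
    """Map each ordered pair of distinct activities to a value between -1 and 1."""
    # How often b directly follows a, summed over all traces.
    follows = Counter((t[i], t[i + 1]) for t in log for i in range(len(t) - 1))
    activities = sorted({a for t in log for a in t})
    measures = {}
    for a in activities:
        for b in activities:
            if a == b:
                continue  # self-loops use a separate formula, omitted here
            ab, ba = follows[a, b], follows[b, a]
            measures[a, b] = (ab - ba) / (ab + ba + 1)
    return measures

def dependency_graph(log, threshold=0.8):
    """Keep only the arcs whose dependency value reaches the threshold."""
    return {pair for pair, value in dependency_measures(log).items()
            if value >= threshold}

# Hypothetical log: 20 regular cases plus one noisy case.
log = [("a", "b", "c")] * 20 + [("a", "c", "b")]
measures = dependency_measures(log)
graph = dependency_graph(log)
print(sorted(graph))
```

The single noisy trace produces the pair (a, c) only once, so its dependency value stays well below the threshold and the arc is left out of the graph. This is how frequencies make the approach robust to noise.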
Genetic Miner
The Genetic Miner uses a genetic algorithm to mine a Petri net representation of the process model from execution traces.
It follows a global search strategy: the quality or fitness of a candidate model is calculated by comparing the process model with all traces in the event log, so the search process takes place at a global level. For a local strategy there is no guarantee that the outcome of the locally optimal steps will be a globally optimal process model.
Steps
The first concern is to define the internal representation.
The second concern is to define the fitness measure.
The third concern relates to the genetic operators (crossover and mutation).
Read the event log
Build the initial population
Calculate the fitness of the individuals in the population
If the stop criterion is met, stop and return the fittest individuals
Otherwise, create the next population and calculate the fitness again
Fuzzy Miner
Process mining is a technique for extracting process models from execution logs.
People have an idealized view of reality. Real-life processes turn out to be less structured than people tend to believe.
As a result, the mined models are often spaghetti-like.
Output
Phase I: Fuse similarly behaving attributes
Phase II: Generate meta rules
Phase III: Generate frequent fuzzy itemsets
Phase IV: Make fuzzy association rules
Questions