eis 2011
DESCRIPTION
Slides of my presentation at EIS conference, 31 October 2011, Delft, NL
TRANSCRIPT
Ghent University, Faculty of Economics and Business Administration Department of Management Information and Operations Management
Jan Claes for EIS 2011
8 April 2023
FACULTY OF ECONOMICS AND BUSINESS ADMINISTRATION
Merging Computer Log Files for Process Mining: An Artificial Immune System Technique
Jan Claes and Geert Poels
http://processmining.ugent.be
Process Mining
Processes are supported by IT systems
IT systems record actual process data
Process data can be used to
  Discover process model
  Check conformance with existing process info
  Improve or extend existing process model
Attention
  Only As-Is
  Only (correctly) recorded information
Process data in event logs
[Figure labels: The process, Process support, Recorded events, Grouped events, Event log]
Process Mining steps
Preparation
  Collect data: find event information
  Merge data: from different sources
  Structure data: group per instance
  Convert data: to tool specific format
Process mining
Make decisions, take action
[Figure: each step is marked M and/or A]
  M = Manual task: analysts needed in most cases
  A = Automated task: less human involvement needed
Merging log files
My research: Merging log files
Merging log files
1. Find links between traces
2. Merge events chronologically
3. Add unlinked traces
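The three steps can be sketched in a few lines of Python. This is only an illustration under assumptions, not the authors' implementation: the log layout (a dict from trace id to a list of `(timestamp, activity)` events), the function name `merge_logs`, the helper `at`, and the activity names are all hypothetical, and the links from step 1 are simply passed in as input.

```python
from datetime import datetime


def merge_logs(log1, log2, links):
    """Merge two event logs given the trace links found in step 1.

    log1, log2: dict mapping trace id -> list of (timestamp, activity) events.
    links: list of (trace_id_in_log1, trace_id_in_log2) pairs (assumed input).
    """
    merged = []
    # Step 2: merge the events of each linked pair chronologically.
    for id1, id2 in links:
        merged.append(sorted(log1[id1] + log2[id2], key=lambda event: event[0]))
    # Step 3: add the traces that were not linked, unchanged.
    linked1 = {id1 for id1, _ in links}
    linked2 = {id2 for _, id2 in links}
    merged.extend(events for tid, events in log1.items() if tid not in linked1)
    merged.extend(events for tid, events in log2.items() if tid not in linked2)
    return merged


def at(hour, minute):
    # Hypothetical helper: a timestamp on the day of the talk.
    return datetime(2011, 10, 31, hour, minute)


# Hypothetical example data.
log1 = {
    "16": [(at(12, 0), "Register"), (at(12, 20), "Ship")],
    "17": [(at(12, 5), "Register")],
}
log2 = {
    "A": [(at(12, 10), "Pay")],
    "B": [(at(13, 0), "Pay")],
}
merged = merge_logs(log1, log2, links=[("16", "A")])
```

Here the linked pair ("16", "A") becomes one trace with its events in time order, while trace "17" and trace "B" are carried over as they are.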
Find links
Required properties of solution
  Finds traces in both log files that belong to the same process execution
  Without prior knowledge about the provided log files (as generic as possible)
  But with maximal possibilities for the (expert) user to include his knowledge about the log files
Find links
Proposed solution
  Take the best possible guess based on assumptions
  Include multiple indicator factors in analysis
  Calculate factor scores for each analysed solution
  Combine factor scores into global score per solution
  ‘Best guess’ is the solution with the highest combined score, because, given the assumed indicators, most indicator value points to this solution
  Provide user interaction possibilities
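A minimal sketch of the scoring idea follows. The slides do not say how factor scores are combined, so the weighted sum used here is an assumption, as are the names `global_score` and `best_guess`, the factor names, the weights, and the example scores.

```python
def global_score(factor_scores, weights):
    """Combine per-factor scores into one global score.

    A weighted sum is assumed here; the slides do not specify the combination.
    """
    return sum(weights[name] * score for name, score in factor_scores.items())


def best_guess(candidates, weights):
    """Return the candidate solution with the highest combined score."""
    return max(candidates, key=lambda c: global_score(c["scores"], weights))


# Hypothetical weights and candidate solutions.
weights = {"same_id": 2.0, "equal_attributes": 1.0}
candidates = [
    {"name": "solution A", "scores": {"same_id": 0.5, "equal_attributes": 0.9}},
    {"name": "solution B", "scores": {"same_id": 1.0, "equal_attributes": 0.2}},
]
best = best_guess(candidates, weights)
```

With these made-up numbers, solution B wins because its same-id score carries twice the weight. An expert user could express their knowledge of the log files by adjusting the weights.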
Decisions to make
Which indicator factors?
How to calculate a score for each factor?
How to combine factor scores to global score?
Which solutions to analyse? (analyse = calculate & compare scores)
Which user interactions to include (expert) user knowledge?
See paper for more details
Indicator factors
Same trace identifier
  Assumption: If both logs contain a trace with the same id, there is a very high chance they match
  Not always though (e.g. customer id vs. order id)
[Figure: example trace ids, log 1: 16, 17, 18, 19, 20, 21; log 2: 10, 12, 14, 16, 18, 20]
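As a small illustration of this factor, the sketch below finds the trace ids that occur in both logs. It assumes the two digit runs in the slide's figure read as the ids 16 to 21 and 10 to 20 in steps of 2; the function name `same_id_links` is hypothetical.

```python
def same_id_links(ids_log1, ids_log2):
    """Return the trace ids that occur in both logs, in sorted order."""
    return sorted(set(ids_log1) & set(ids_log2))


# Trace ids as read from the slide's figure (an assumption about the layout).
ids_log1 = [16, 17, 18, 19, 20, 21]
ids_log2 = [10, 12, 14, 16, 18, 20]
common_ids = same_id_links(ids_log1, ids_log2)
```

Under that reading, only ids 16, 18 and 20 appear in both logs. As the slide warns, a shared id is a strong hint rather than proof, since one log may number customers and the other orders.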
Indicator factors
Equal attribute values
  Assumption: The more attributes of a trace and its events from both logs are equal, the higher the chance they match
[Figure: example attribute values (JAN 12:00 to JAN 12:50; JC 14 14:00 to JC 19 14:50) and trace ids (16 to 21; 17, 18, 19, 1A, 1B, 1C)]
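One way to turn this assumption into a number is to count the distinct values that two traces share. The sketch below does that; the slides do not give the actual calculation, so the counting rule, the trace layout (a dict with `attributes` and `events`), the names `all_values` and `equal_value_score`, and the example data are all hypothetical.

```python
def all_values(trace):
    """Collect every attribute value of a trace and of its events."""
    values = set(trace["attributes"].values())
    for event in trace["events"]:
        values.update(event.values())
    return values


def equal_value_score(trace1, trace2):
    """Count the distinct values that occur in both traces (assumed scoring)."""
    return len(all_values(trace1) & all_values(trace2))


# Hypothetical traces from two logs.
trace_log1 = {"attributes": {"customer": "JC"}, "events": [{"order": "17"}]}
trace_log2 = {"attributes": {"ref": "17"}, "events": [{"handler": "JC"}, {"handler": "AB"}]}
unrelated = {"attributes": {"ref": "1A"}, "events": [{"handler": "XY"}]}

score_match = equal_value_score(trace_log1, trace_log2)
score_other = equal_value_score(trace_log1, unrelated)
```

The first pair shares the values "JC" and "17", so it scores higher than the unrelated pair, which shares nothing.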
Test results
Simulated data (300-400 msec on standard laptop)
  Benefit of controllable parameters, known solution
  Correct number of linked traces in all tests
  Perfect results for same trace id and up to 50% noise, worse results for higher overlap of traces
Real data (6-10 min on standard laptop)
  Correct number of linked traces in all tests
  Almost perfect results for same trace id and up to 50% noise, worse results for higher overlap
New approach
Rule Based Merger
  User has to configure rules for linking traces
  Rule = relationship between attributes in both logs
  Events of linked traces are merged chronologically
“Merge all traces where attribute A of the trace in log 1 equals attribute B of any event in the trace in log 2”
  Select attributes, contexts and operator
Research focus: suggesting merging rules
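The quoted example rule can be sketched as follows. This is an illustration under assumptions, not the Rule Based Merger itself: the trace layout, the function name `link_by_rule`, the attribute names `order_id` and `ref`, and the data are hypothetical, and only the "equals" operator is shown.

```python
def link_by_rule(log1, log2, attr_a, attr_b):
    """Link traces where trace attribute attr_a in log 1 equals
    event attribute attr_b of any event in a trace of log 2."""
    links = []
    for id1, trace1 in log1.items():
        value = trace1["attributes"].get(attr_a)
        if value is None:
            continue  # the rule cannot apply without attribute A
        for id2, trace2 in log2.items():
            if any(event.get(attr_b) == value for event in trace2["events"]):
                links.append((id1, id2))
    return links


# Hypothetical logs: trace id -> trace attributes and events.
log1 = {
    "t1": {"attributes": {"order_id": "17"}, "events": []},
    "t2": {"attributes": {"order_id": "99"}, "events": []},
}
log2 = {
    "x": {"attributes": {}, "events": [{"ref": "17"}, {"ref": "18"}]},
    "y": {"attributes": {}, "events": [{"ref": "20"}]},
}
links = link_by_rule(log1, log2, attr_a="order_id", attr_b="ref")
```

Only trace "t1" is linked, to trace "x", because one of that trace's events carries the same value. The events of each linked pair would then be merged chronologically.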
Contact information
http://processmining.ugent.be
Twitter: @janclaesbelgium