33 process cubes - pa.win.tue.nl

57
33 PAGE 0 Process Cubes Wil van der Aalst AIS Meeting 13-2-2014


Page 1: 33 Process Cubes - pa.win.tue.nl

33

PAGE 0

Process Cubes

Wil van der Aalst

AIS Meeting 13-2-2014

Page 2: 33 Process Cubes - pa.win.tue.nl

DSC/e's Motto

Turning Data into Value

Page 3: 33 Process Cubes - pa.win.tue.nl

What

happened?

Page 4: 33 Process Cubes - pa.win.tue.nl

Why did

it happen?

Page 5: 33 Process Cubes - pa.win.tue.nl

What will

happen?

Page 6: 33 Process Cubes - pa.win.tue.nl

What is the

best that

can happen?

Page 7: 33 Process Cubes - pa.win.tue.nl

Data Science

aims to

answer such

questions

Process Mining

Page 8: 33 Process Cubes - pa.win.tue.nl

Motivation

PAGE 7

compute average (freq., sum, variance, mode, …): 4.56

create model (alpha, heuristic, fuzzy, social, dotted, compliance, etc.)

Page 9: 33 Process Cubes - pa.win.tue.nl

2 dimensions: model = number

PAGE 8

hours: 50.5

number: 32 passed

failed

TU/e HSE QUT

hours: 48.5

number: 55

hours: 23.2

number: 49

hours: 23.5

number: 23

hours: 10.5

number: 4

hours: 24.5

number: 8

Page 10: 33 Process Cubes - pa.win.tue.nl

2 dimensions: higher-level models

PAGE 9

passed

failed

TU/e HSE QUT

Page 11: 33 Process Cubes - pa.win.tue.nl

Process Cubes

PAGE 10

W.M.P. van der Aalst. Process Cubes: Slicing, Dicing, Rolling Up and Drilling Down Event Data for Process Mining. In M. Song, M. Wynn, and J. Liu, editors, Asia Pacific Conference on Business Process Management (AP-BPM 2013), volume 159 of Lecture Notes in Business Information Processing, pages 1-22. Springer-Verlag, Berlin, 2013.

doi:10.1007/978-3-319-02922-1_1

Page 12: 33 Process Cubes - pa.win.tue.nl

Process Cube: Dimensions and Cells

PAGE 11

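[Figure: a process cube with the dimensions time window (tw), event class (ec), and case type (ct). Each process cell contains a cell sublog (events grouped into cases over time) and a cell model. Each event has properties such as case, activity, timestamp, resource, type, …. Further label in the figure: new behavior.]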

Page 13: 33 Process Cubes - pa.win.tue.nl

Example Dimensions

PAGE 12

• time: concept drift analysis

• location: cross-organizational process mining

• group: clustering and classification

[Figure: three cells, each with the example traces acbe, abce, ade]

Cases may be distributed over multiple cells; the same event may belong to multiple cells.

Page 14: 33 Process Cubes - pa.win.tue.nl

Example

PAGE 13

gold

silver

normal

Page 15: 33 Process Cubes - pa.win.tue.nl

Another Example

PAGE 14

>100k

≤50k

>50k & ≤100k

Page 16: 33 Process Cubes - pa.win.tue.nl

Another Example

PAGE 15

Page 17: 33 Process Cubes - pa.win.tue.nl

Another Example

PAGE 16

{1,2,3,4}

{5,6,7}

{8,9,10}

Page 18: 33 Process Cubes - pa.win.tue.nl

Roll up and drill down

PAGE 17

[Figure: one cell of a process cube (dimensions: time window, event class, case type) is drilled down into smaller cells; rolling up is the reverse step. Labels in the figure: gold customer / silver customer, 2012 / 2013, sales / delivery. The cell models use activities A-G.]

Rolled-up cell (one event log):
1: ACDEG, 2: BCFG, 3: BCFG, 4: ACEDG, 5: ACFG, 6: ACEDG, 7: BCEDG, 8: BCDEG

Cells after drill down:
• 1: AC, 4: AC, 5: AC, 6: AC
• 1: CDEG, 4: CEDG, 5: CFG, 6: CEDG
• 2: BC, 3: BC, 7: BC, 8: BC
• 2: CFG, 3: CFG, 7: CEDG, 8: CDEG
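The drill-down step on this slide can be sketched in Python. This is a minimal sketch, not the ProCube implementation: the names `CASE_TYPE`, `EVENT_CLASS`, and `drill_down` are hypothetical, and the assignment of cases and activities to dimension values is an assumption read off the sublogs in the figure.

```python
# The eight cases from the slide (case id -> trace of activities).
LOG = {1: "ACDEG", 2: "BCFG", 3: "BCFG", 4: "ACEDG",
       5: "ACFG", 6: "ACEDG", 7: "BCEDG", 8: "BCDEG"}

# Assumed dimension values (not stated explicitly on the slide): cases 1, 4, 5, 6
# form one case type and cases 2, 3, 7, 8 the other; "sales" covers A, B, C and
# "delivery" covers C, D, E, F, G. Activity C belongs to both event classes.
CASE_TYPE = {"type_1": {1, 4, 5, 6}, "type_2": {2, 3, 7, 8}}
EVENT_CLASS = {"sales": set("ABC"), "delivery": set("CDEFG")}


def drill_down(log):
    """Split one cell into one sublog per (case type, event class) pair."""
    cells = {}
    for ct, cases in CASE_TYPE.items():
        for ec, activities in EVENT_CLASS.items():
            cells[(ct, ec)] = {
                case: "".join(a for a in trace if a in activities)
                for case, trace in log.items() if case in cases
            }
    return cells


cells = drill_down(LOG)
for key, sublog in cells.items():
    print(key, sublog)
```

Rolling up is the reverse: the sublogs of neighbouring cells are merged back into one event log before a model is discovered.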

Page 19: 33 Process Cubes - pa.win.tue.nl

Two foundational ways of splitting event data: horizontal or vertical

PAGE 18

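[Figure: one event log (model with activities A-G between start and complete) is split and merged in two ways, labelled split/merge horizontally and split/merge vertically.]

Full event log:
1: ACDEG, 2: BCFG, 3: BCFG, 4: ACEDG, 5: ACFG, 6: ACEDG, 7: BCEDG, 8: BCDEG

Split in which every sublog keeps all cases but only part of the activities:
• 1: AC, 2: BC, 3: BC, 4: AC, 5: AC, 6: AC, 7: BC, 8: BC
• 1: CDEG, 2: CFG, 3: CFG, 4: CEDG, 5: CFG, 6: CEDG, 7: CEDG, 8: CDEG

Split in which every sublog keeps whole cases:
• 1: ACDEG, 4: ACEDG, 6: ACEDG
• 7: BCEDG, 8: BCDEG
• 5: ACFG
• 2: BCFG, 3: BCFG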
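Both kinds of split, and the corresponding merge, can be sketched as set operations on unique events. This is a minimal sketch under assumptions: events are modelled as (case id, position, activity) triples, and the function names and the case groups passed in are illustrative, taken from the sublogs shown in the figure.

```python
# Events are (case id, position, activity) triples so that every event is unique.
TRACES = {1: "ACDEG", 2: "BCFG", 3: "BCFG", 4: "ACEDG",
          5: "ACFG", 6: "ACEDG", 7: "BCEDG", 8: "BCDEG"}
EVENTS = {(c, i, a) for c, t in TRACES.items() for i, a in enumerate(t)}


def split_by_cases(events, case_groups):
    """Each sublog keeps whole cases (the grouping is an assumed input)."""
    return [{e for e in events if e[0] in group} for group in case_groups]


def split_by_activities(events, activity_sets):
    """Each sublog keeps every case, projected onto one activity set."""
    return [{e for e in events if e[2] in acts} for acts in activity_sets]


def merge(sublogs):
    """Merging is a set union; shared events (e.g. activity C) are kept once."""
    return set().union(*sublogs)


def traces(events):
    """Rebuild case id -> trace string from a set of events."""
    out = {}
    for case, _, activity in sorted(events):
        out[case] = out.get(case, "") + activity
    return out


by_case = split_by_cases(EVENTS, [{1, 4, 6}, {7, 8}, {5}, {2, 3}])
by_activity = split_by_activities(EVENTS, [set("ABC"), set("CDEFG")])
print(traces(by_case[0]))
print(traces(by_activity[1]))
```

Because events carry their own identity, merging the sublogs of either split gives back the original event set.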

Page 20: 33 Process Cubes - pa.win.tue.nl

Ingredients

PAGE 19

• Event Base (EB) with (raw) event data

• Process Cube Structure (PCS) defining dimensions independent of content

• Process Cube View (PCV) based on PCS that can be linked to real events

Page 21: 33 Process Cubes - pa.win.tue.nl

Event Base (E, P, π)

• Set of events E

• Set of properties P

• Partial mapping π (π_p(e) = v means event e has property p with value v)
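An event base can be sketched with plain dictionaries. This is a minimal sketch; the event identifiers, property names, and values are hypothetical, and `value` is an illustrative helper name.

```python
# Event base EB = (E, P, pi): events, properties, and a partial mapping.
E = {"e1", "e2", "e3"}
P = {"case", "activity", "time", "resource"}

# pi[p][e] = v means event e has property p with value v.
# The mapping is partial: e3 has no resource.
pi = {
    "case": {"e1": 1, "e2": 1, "e3": 2},
    "activity": {"e1": "A", "e2": "C", "e3": "B"},
    "time": {"e1": "2012-03-01", "e2": "2012-03-02", "e3": "2013-01-15"},
    "resource": {"e1": "Sue", "e2": "Pete"},
}


def value(p, e):
    """Return pi_p(e), or None where the partial mapping is undefined."""
    return pi.get(p, {}).get(e)


print(value("activity", "e1"))
print(value("resource", "e3"))
```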

PAGE 20

Page 22: 33 Process Cubes - pa.win.tue.nl

Slightly more formal

PAGE 21

Page 23: 33 Process Cubes - pa.win.tue.nl

Process Cube Structure (D,type,hier)

• D is the set of dimensions

• type defines the type of each dimension (string, int, etc.)

• hier defines the hierarchy per dimension (does not need to be a tree, elements at the same level may be partially overlapping)
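A process cube structure can be sketched as a small data class. This is a minimal sketch under assumptions: the field names, the representation of a hierarchy as a list of value sets, and the `is_well_formed` check are illustrative choices, and the resource values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProcessCubeStructure:
    """PCS = (D, type, hier); field names are illustrative."""
    dimensions: set  # D
    type_of: dict    # dimension -> set of all values the dimension may take
    hier: dict       # dimension -> list of frozensets (hierarchy elements)

    def is_well_formed(self):
        """Assumed sanity check: hierarchy elements only contain typed values."""
        return all(
            element <= self.type_of[d]
            for d in self.dimensions
            for element in self.hier[d]
        )


# Hypothetical resource dimension: three people, two departments sharing Pete.
people = {"Sue", "Pete", "Ann"}
dept_1 = frozenset({"Sue", "Pete"})
dept_2 = frozenset({"Pete", "Ann"})
pcs = ProcessCubeStructure(
    dimensions={"resource"},
    type_of={"resource": people},
    hier={"resource": [frozenset({p}) for p in sorted(people)] + [dept_1, dept_2]},
)
print(pcs.is_well_formed(), dept_1 & dept_2)
```

The two departments overlap in one person, which is the kind of non-tree hierarchy the last bullet allows.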

PAGE 22

case activity resource type total time

Process Cube Dimensions

Page 24: 33 Process Cubes - pa.win.tue.nl

Compatible: D ⊆ P and correct types

PAGE 23

[Figure: the Event Base (EB) with (raw) event data is linked to the Process Cube Structure (PCS), which defines the dimensions D (here resource, activity, and case type) independent of content.]
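The compatibility condition (every dimension is also a property, and recorded values have the right type) can be sketched as a check. This is a minimal sketch; the function name, the example data, and the decision to test only values that the partial mapping actually defines are assumptions.

```python
def is_compatible(dimensions, type_of, properties, pi):
    """D must be a subset of P, and every recorded value must lie in type(d)."""
    if not dimensions <= properties:
        return False
    return all(
        v in type_of[d]
        for d in dimensions
        for v in pi.get(d, {}).values()
    )


# Hypothetical event base fragment and cube structure.
properties = {"case", "activity", "resource", "time"}
pi = {"activity": {"e1": "A", "e2": "C"}, "resource": {"e1": "Sue"}}
type_of = {"activity": {"A", "B", "C"}, "resource": {"Sue", "Pete"}}

ok = is_compatible({"activity", "resource"}, type_of, properties, pi)
bad = is_compatible({"activity", "cost"}, type_of, properties, pi)
print(ok, bad)
```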

Page 25: 33 Process Cubes - pa.win.tue.nl

Slightly more formal

PAGE 24

Page 26: 33 Process Cubes - pa.win.tue.nl

Process Cube View (Dsel , sel)

• View on a process cube structure (D,type,hier).

• Dsel ⊆ D are the selected dimensions.

• sel is a function determining the structure of each dimension in D.
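A view and its validity check can be sketched as follows. This is a minimal sketch under assumptions: hierarchies are modelled as sets of value sets, `sel` is only checked on the selected dimensions, and all names and values are hypothetical.

```python
def singletons(values):
    """One hierarchy element per individual value."""
    return {frozenset({v}) for v in values}


def is_valid_view(d_sel, sel, dimensions, hier):
    """Dsel must be a subset of D; sel(d) may only pick elements of hier(d)."""
    return d_sel <= dimensions and all(sel[d] <= hier[d] for d in d_sel)


D = {"resource", "activity", "case type"}
hier = {
    "resource": singletons({"Sue", "Pete"}),
    "activity": singletons("ABCD") | {frozenset("AB"), frozenset("CD")},
    "case type": singletons({"gold", "silver", "normal"}),
}

# View 1: dimension resource is not selected.
v1 = ({"activity", "case type"},
      {"activity": singletons("ABCD"), "case type": hier["case type"]})
# View 2: activity is coarsened to two subprocesses.
v2 = (D, {"resource": hier["resource"],
          "activity": {frozenset("AB"), frozenset("CD")},
          "case type": hier["case type"]})
# View 3: case type is filtered to gold and silver.
v3 = (D, {"resource": hier["resource"],
          "activity": singletons("ABCD"),
          "case type": singletons({"gold", "silver"})})
# Invalid: {A, C} is not an element of the activity hierarchy.
v4 = (D, {**v3[1], "activity": {frozenset("AC")}})

results = [is_valid_view(ds, s, D, hier) for ds, s in (v1, v2, v3, v4)]
print(results)
```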

PAGE 25

[Figure: cubes with the dimensions resource, activity, and case type]

Page 27: 33 Process Cubes - pa.win.tue.nl

Slightly more formal

PAGE 26

Page 28: 33 Process Cubes - pa.win.tue.nl

Process Cube View Example (1/3):

Dimension resource is not selected

PAGE 27

[Figure: PCS = (D,type,hier) has the dimensions resource, activity, and case type; PCV = (Dsel, sel) keeps only activity and case type.]

Page 29: 33 Process Cubes - pa.win.tue.nl

Process Cube View Example (2/3):

Dimension activity is coarsened

PAGE 28

[Figure: PCS = (D,type,hier) and PCV = (Dsel, sel), both with the dimensions resource, activity, and case type; in the view the activity dimension is coarser, e.g., per subprocess rather than per activity.]

Page 30: 33 Process Cubes - pa.win.tue.nl

Process Cube View Example (3/3):

Dimension case type is filtered

PAGE 29

[Figure: PCS = (D,type,hier) and PCV = (Dsel, sel), both with the dimensions resource, activity, and case type; in the view, part of the case type dimension is not selected.]

Page 31: 33 Process Cubes - pa.win.tue.nl

Materialized Process Cube View

PAGE 30

[Figure: three materialized views over the dimensions resource, activity, and case type]

• 3 x 2 x 1 = 6 event logs

• 2 x 2 x 3 = 12 event logs

• 3 x 1 x 3 = 9 event logs
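The cell counts follow from taking one selected element per dimension. A minimal sketch, assuming a view is given as a mapping from dimension to its selected elements; the labels below are hypothetical stand-ins for value sets.

```python
from itertools import product
from math import prod


def cells(sel):
    """Enumerate the cells of a view: one selected element per dimension."""
    dims = sorted(sel)
    return [dict(zip(dims, combo)) for combo in product(*(sel[d] for d in dims))]


# Hypothetical view with 3 x 2 x 1 selected elements.
sel = {
    "activity": ["sales", "delivery", "billing"],
    "resource": ["dept_1", "dept_2"],
    "case type": ["gold"],
}
all_cells = cells(sel)
print(len(all_cells))  # 6, one event log per cell
```

Materializing the view then means filling each of these cells with the events whose property values fall into the chosen elements.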

Page 32: 33 Process Cubes - pa.win.tue.nl

An event may be part of multiple cells!!

• departments may overlap

• subprocesses share interfaces

• contracts involving multiple parties

• …

PAGE 31

[Figure: cube with the dimensions activity and case type]

• events are unique, but may be part of multiple cells
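How one unique event ends up in several cells can be sketched with overlapping hierarchy elements. This is a minimal sketch; the departments, event classes, and the helper name `cells_of` are hypothetical.

```python
from itertools import product


def cells_of(event, sel):
    """Return every cell (one element label per dimension) containing the event."""
    matches = []
    for d in sorted(sel):
        hits = [label for label, values in sorted(sel[d].items())
                if event.get(d) in values]
        matches.append(hits)
    return list(product(*matches))


# Hypothetical view: departments overlap in Pete, event classes overlap in C.
sel = {
    "activity": {"sales": set("ABC"), "delivery": set("CDEFG")},
    "resource": {"dept_1": {"Sue", "Pete"}, "dept_2": {"Pete", "Ann"}},
}

shared = cells_of({"activity": "C", "resource": "Pete"}, sel)
single = cells_of({"activity": "A", "resource": "Sue"}, sel)
print(shared)
print(single)
```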

Page 33: 33 Process Cubes - pa.win.tue.nl

Slightly more formal

PAGE 32

Page 34: 33 Process Cubes - pa.win.tue.nl

OLAP (Online Analytical Processing)

PAGE 33

Page 35: 33 Process Cubes - pa.win.tue.nl

OLAP operations can be formalized (see W.M.P. van der Aalst. Process Cubes: Slicing, Dicing, Rolling Up and Drilling Down Event Data for Process Mining. In AP-BPM 2013, volume 159 of Lecture Notes in Business Information Processing, pages 1-22. Springer-Verlag, Berlin, 2013.)

PAGE 34

Page 36: 33 Process Cubes - pa.win.tue.nl

Initial Implementation

(work of Tatiana Mamaliga)

PAGE 35 http://www.processmining.org/_media/blogs/pub2013/masterthesistatianamamaliga.pdf

Page 37: 33 Process Cubes - pa.win.tue.nl

Architecture

PAGE 36

Page 38: 33 Process Cubes - pa.win.tue.nl

Results Generated Using ProCube

PAGE 37

Page 39: 33 Process Cubes - pa.win.tue.nl

PAGE 38

problem 1

performance

Page 40: 33 Process Cubes - pa.win.tue.nl

Sparsity is a problem when unloading cells to ProM

PAGE 39

Process cubes tend to be sparse. Palo does not handle this well.

Ongoing work of Shengnan Guo.

Page 41: 33 Process Cubes - pa.win.tue.nl

PAGE 40

problem 2

interpretation

Page 42: 33 Process Cubes - pa.win.tue.nl

PAGE 41

Page 43: 33 Process Cubes - pa.win.tue.nl

Two variants of the same "model" …

PAGE 42

comparing models is closely related to configurable process models

Page 44: 33 Process Cubes - pa.win.tue.nl

PAGE 43

link to decomposed

process mining

Page 45: 33 Process Cubes - pa.win.tue.nl

Remember

PAGE 44

[Figure repeated from PAGE 18: the event log with cases 1-8 is split and merged horizontally and vertically.]

Both process cubes and decomposed process mining split event logs.

Page 46: 33 Process Cubes - pa.win.tue.nl

Decomposing Conformance Checking

PAGE 45

[Figure: decomposed conformance checking]

• Input: process model SN and event log L.

• A decomposition technique (e.g., maximal decomposition, passage-based decomposition, or SESE/RPST-based decomposition) decomposes the model. It yields a (valid) activity partitioning, which is used to decompose the event log.

• Each submodel M1, M2, …, Mn is paired with a sublog L1, L2, …, Ln.

• A conformance checking technique (e.g., A* based alignments, token-based replay, or simple replay until first deviation) performs one conformance check per pair.

• The results are combined into conformance diagnostics.

See "divide and conquer" framework by Eric Verbeek.

Page 47: 33 Process Cubes - pa.win.tue.nl

Example of a valid decomposition

PAGE 46

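[Figure: an overall net with places start, c1-c9, end, transitions t1-t11, and activities a-h, decomposed into six subnets.]

• SN1: place start; transition t1; activity a
• SN2: places c1, c3; transitions t1, t2, t3, t5, t6; activities a, b, d, e
• SN3: place c2; transitions t1, t4, t6; activities a, c, e
• SN4: place c4; transitions t4, t5; activities c, d
• SN5: places c5, c6, c7; transitions t5-t10; activities d, e, f, g, h
• SN6: places c8, c9, end; transitions t8-t11; activities f, g, h

Log can be split in the same way!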
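Splitting the log "in the same way" amounts to projecting each trace onto the activities of each subnet. A minimal sketch, assuming the activity sets below, which are read off the subnet labels in the flattened figure; the trace is the observed trace a,b,c,d,e,c,d,g,f used later in the deck, and `project` is an illustrative helper name.

```python
# Activities per subnet, as read from the figure (an assumption of this sketch).
ACTIVITY_SETS = {
    "SN1": {"a"},
    "SN2": {"a", "b", "d", "e"},
    "SN3": {"a", "c", "e"},
    "SN4": {"c", "d"},
    "SN5": {"d", "e", "f", "g", "h"},
    "SN6": {"f", "g", "h"},
}


def project(trace, activities):
    """Keep only the events of the trace whose activity is in the given set."""
    return [a for a in trace if a in activities]


trace = list("abcdecdgf")
subtraces = {name: project(trace, acts) for name, acts in ACTIVITY_SETS.items()}
for name, sub in subtraces.items():
    print(name, ",".join(sub))
```

Each subtrace can then be checked against its own subnet, independently of the others.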

Page 48: 33 Process Cubes - pa.win.tue.nl

Example of an alignment for observed

trace a,b,c,d,e,c,d,g,f

PAGE 47

[Figure: the overall net and the subnets SN1-SN6 from PAGE 46, shown with the observed trace a,b,c,d,e,c,d,g,f, etc.]

Page 49: 33 Process Cubes - pa.win.tue.nl

Conformance checking can be decomposed!!!

• General result for any valid decomposition: an event log or trace perfectly fits the overall model if and only if it also fits all the individual fragments.
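The result can be written compactly. This is a sketch in notation assumed here rather than taken from the slide: $L$ is the event log, $SN$ the overall model, $SN^1, \dots, SN^n$ the fragments of a valid decomposition, $A_i$ the activities of fragment $SN^i$, and $L{\upharpoonright}_{A_i}$ the log projected onto $A_i$. The cited paper gives the precise definitions.

```latex
\[
L \text{ is perfectly fitting } SN
\quad\Longleftrightarrow\quad
\forall\, i \in \{1,\dots,n\}:\;
L{\upharpoonright}_{A_i} \text{ is perfectly fitting } SN^{i}
\]
```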

PAGE 48

[Figure: the overall net and the six subnets SN1-SN6, as on PAGE 46]

Wil van der Aalst, Decomposing Petri nets for process mining: A generic approach. Distributed and Parallel Databases, Volume 31, Issue 4, pp 471-507, 2013

Page 50: 33 Process Cubes - pa.win.tue.nl

Decomposing Process Discovery

PAGE 49

[Figure: decomposed process discovery]

• Input: event log L. Output: process model M.

• A decomposition technique (e.g., causal graph based on frequencies is decomposed using passages or SESE/RPST) yields a (valid) activity partitioning, which is used to decompose the event log into sublogs L1, L2, …, Ln.

• A process discovery technique (e.g., language/state-based region discovery, variants of alpha algorithm, genetic process mining) performs discovery on each sublog, giving submodels M1, M2, …, Mn.

• The submodels are composed into the process model M (compose model).

See "divide and conquer" framework by Eric Verbeek.
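The decompose, discover, compose pipeline can be sketched with a deliberately simple stand-in for the discovery step. This is a minimal sketch, not one of the techniques named on the slide: discovery is replaced by collecting the directly-follows relation, composition by a set union, and the activity sets are hypothetical rather than derived from a causal graph.

```python
def directly_follows(traces):
    """Stand-in 'discovery': the set of pairs (x, y) where y directly follows x."""
    return {(x, y) for t in traces for x, y in zip(t, t[1:])}


def decomposed_discovery(log, activity_sets, discover=directly_follows):
    """Project the log per activity set, discover a submodel, compose by union."""
    submodels = []
    for acts in activity_sets:
        sublog = ["".join(a for a in t if a in acts) for t in log]
        submodels.append(discover(sublog))
    return set().union(*submodels)


# The eight traces used earlier in the deck; the two activity sets share C.
LOG = ["ACDEG", "BCFG", "BCFG", "ACEDG", "ACFG", "ACEDG", "BCEDG", "BCDEG"]
model = decomposed_discovery(LOG, [set("ABC"), set("CDEFG")])
print(sorted(model))
```

For this log and these activity sets the composed relation equals the one computed on the whole log; that is a property of this example, not a general guarantee of the sketch.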

Page 51: 33 Process Cubes - pa.win.tue.nl

Learn more about decomposing process mining problems?

• W.M.P. van der Aalst. Decomposing Petri Nets for Process Mining: A Generic Approach. Distributed and Parallel Databases, 31(4):471-507, 2013.

• W.M.P. van der Aalst. A General Divide and Conquer Approach for Process Mining. In M. Ganzha, L. Maciaszek, and M. Paprzycki, editors, Federated Conference on Computer Science and Information Systems (FedCSIS 2013), pages 1-10. IEEE Computer Society, 2013.

• W.M.P. van der Aalst and E. Verbeek. Process Discovery and Conformance Checking Using Passages. Fundamenta Informaticae, 2014.

• W.M.P. van der Aalst. Decomposing Process Mining Problems Using Passages. In S. Haddad and L. Pomello, editors, Applications and Theory of Petri Nets 2012, volume 7347 of Lecture Notes in Computer Science, pages 72-91. Springer-Verlag, Berlin, 2012.

• J. Munoz-Gama, J. Carmona, and W.M.P. van der Aalst. Hierarchical Conformance Checking of Process Models Based on Event Logs. In J.M. Colom and J. Desel, editors, Applications and Theory of Petri Nets 2013, volume 7927 of Lecture Notes in Computer Science, pages 291-310. Springer-Verlag, Berlin, 2013.

• J. Munoz-Gama, J. Carmona, and W.M.P. van der Aalst. Conformance Checking in the Large: Partitioning and Topology. In F. Daniel, J. Wang, and B. Weber, editors, International Conference on Business Process Management (BPM 2013), volume 8094 of Lecture Notes in Computer Science, pages 130-145. Springer-Verlag, Berlin, 2013.

• J. Munoz-Gama, J. Carmona, and W.M.P. van der Aalst. Single-Entry Single-Exit Decomposed Conformance Checking. Information Systems, 2014.

• E. Verbeek and W.M.P. van der Aalst. Decomposing Replay Problems: A Case Study. In D. Moldt and H. Roelke, editors, Proceedings of the International Workshop on Petri Nets in Software Engineering (PNSE 2013), volume 989 of CEUR Workshop Proceedings, pages 213-232. CEUR-WS.org, 2013.

PAGE 50

Page 52: 33 Process Cubes - pa.win.tue.nl

PAGE 51

Challenges

Page 53: 33 Process Cubes - pa.win.tue.nl

PAGE 52

Page 54: 33 Process Cubes - pa.win.tue.nl

PAGE 53

[Figure: the process discovery challenge, with five numbered requirements (1-5)]

• formal (not just a picture)

• fast (should not take years)

• sound (result should at least be free of deadlocks, etc.)

• ability to balance all conformance dimensions (fitness, precision, generalization, and simplicity) incl. noise

• provide guarantees (not just a best effort)

Page 55: 33 Process Cubes - pa.win.tue.nl

Selected challenges in process mining

• Process mining in the Large (DeLiBiDa project)

− decomposition & distribution (SESE, passages, etc.)

− streaming data

− not just discovery (conformance, repair, ext., pred.!)

• Mining with/for stochastic models

− likelihood of models, alignments, delays

• Operational support

− cases!

• Data-aware and resource-aware process mining

− decision points, work distribution, etc.

− exploiting more informative event logs

− low-level logs versus higher-level processes

PAGE 54

Page 56: 33 Process Cubes - pa.win.tue.nl

Selected challenges in process mining

• Process of process mining

− workflow support, reporting

− parameter tuning, experimentation

− links to other types of mining (e.g., text mining)

− repositories (models, data, and analysis techniques)

− guidelines

• Log extraction

− correlation, n:m relations, etc.

− continuous to discrete ("mine your own body")

• Comparative process mining

− process cubes, concept drift, etc.

• Process improvement

− optimization (improve your past), simulation, cigar box

PAGE 55

Page 57: 33 Process Cubes - pa.win.tue.nl

Conclusion

• Challenges in process mining: concept drift, comparing processes, cases, and organizations, need for contextual analysis, limited scalability, performance problems.

• Unifying view: Process Cubes.

• Similar to OLAP but also many differences.

• Related to decomposing process mining problems.

• Initial implementations, but just the starting point.

PAGE 56