8/10/2019 11-Data Science and Big Data
1/38
Wil van der Aalst & TU/e (use only with permission & acknowledgements)
Process Mining: Data Science in Action
Data Science and Big Data
prof.dr.ir. Wil van der Aalst
www.processmining.org
-
Data is the new oil!
In the last 10 minutes we generated more data
than from prehistoric times until 2003
-
We are all generating event data!
taking the train
refueling your car
buying a coffee
adjusting the temperature
getting a spee...
sending ...
making an a...
making a phone call
w... this
-
GPS
Proximity sensor
Ambient light sensor
Accelerometer
Magnetometer
Gyroscopic sensor
Touchscreen
Camera (front)
Camera (back)
Bluetooth
Finger-p...
M...
GSM/HSDPA/LTE
14+ sensors
-
Internet of Events
-
Internet of Events: 4 sources of event data
-
Moore's law
Gordon E. Moore, "Cramming More Components onto Integrated Circuits", Electronics, 1965.
Diagram by Wgsimon/CC BY 3.0
Other e...
Compu...
Capaci...
Bytes p...
2^20 = 1,048,576x in 40 years
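The factor on this slide can be reproduced by assuming one doubling every two years, so that 40 years give 20 doublings. The following is a minimal sketch of that arithmetic; the function name `moore_factor` and the two-year doubling period are assumptions made for illustration, not something stated on the slide.

```python
def moore_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor after `years`, assuming one doubling per period (assumed: 2 years)."""
    return 2 ** (years / doubling_period_years)


factor = moore_factor(40)  # 40 / 2 = 20 doublings
print(f"{factor:,.0f}x in 40 years")
```

With a different assumed doubling period (for example 1.5 years) the factor changes substantially, so the result is only as good as that assumption.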
-
Question
40 years ago it took approximately 1.5 hours to go from Eindhoven to Amsterdam.
How long would it take today if transportation technology had followed Moore's law?
-
Answer
1.5 x 60 x 60 / 2^20 = 0.00515 seconds
-
Question
40 years ago it took approximately 7 hours to go from A... to New York.
How long would it take today if transportation technology had followed Moore's law?
-
Answer
7 x 60 x 60 / 2^20 = 0.0240 seconds
-
Question
40 years ago it took approximately 4000 liters of petrol to go around the world.
How much petrol would it take today if transportation technology had followed Moore's law?
-
Answer
4000 / 2^20 = 0.0038 liters
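The three answers all divide the value from 40 years ago by the same factor of 2^20. A small sketch of that calculation follows; the helper name `scaled` and the variable names are hypothetical, and the input values (1.5 hours, 7 hours, 4000 liters) are the ones used in the answers.

```python
SPEEDUP = 2 ** 20  # 20 doublings in 40 years, the factor used in the answers


def scaled(value_then: float, factor: int = SPEEDUP) -> float:
    """Value today if the quantity had improved by `factor`."""
    return value_then / factor


eindhoven_amsterdam_s = scaled(1.5 * 60 * 60)  # 1.5 hours in seconds
new_york_s = scaled(7 * 60 * 60)               # 7 hours in seconds
petrol_l = scaled(4000)                        # liters of petrol

print(f"{eindhoven_amsterdam_s:.5f} seconds")
print(f"{new_york_s:.4f} seconds")
print(f"{petrol_l:.4f} liters")
```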
-
Drowning in data
How to extract value from event data?
-
4 V's of Big Data
VERACITY
VELOCITY
VARIETY
VOLUME
-
Data does not have to be "Big" to be challenging
Need for data scien...
Data ana... question...
everywhere
-
A data scientist is able to ... analyze, and interpret data from a
variety of sources (social interaction, business processes,
cyber-physical systems).
Turning data into ...
-
Four generic
data science
questions
-
What
happened?
#1
-
Why did
it happen?
#2
-
What will
happen?
#3
-
What is
the best that
can happen?
#4
-
Can we predict waiting times?
Why do patients have to wait so long?
How can we ... costs?
How much staff is needed tomorrow?
Do doctors follow the guidelines?
-
Why and when do X-ray machines malfunction?
Which components should be replaced?
How are X-ray machines really used?
Can w... that the... will bre... next...
Wh... ne... im...
from the organizational level to the hardware/software level
-
Data science skills needed to answer such questions
-
DSC/e
http://www.tue.nl/
-
It is the process s...
In the end it is th... that matters (a... data or the...
Not jus... and... but e... p...
-
Process-centric view on data science
Understand and improve "care flows" in a hospital by intelligently
using event data scattered over hundreds of database tables with
patient data.
Understand and improve ... and performance of X-ray machines
in the field using teraby... level event data collected ...
remote services ne...
-
Focus of this course
-
Process mining use cases
What is the process that people really follow?
Where are the bottlenecks in my process?
Where do people (or machines) deviate from the expected or idealized process?
-
Many more use cases for process mining
What are the "highways" in my process?
What factors are influencing a bottleneck?
Can we predict problems (delay, deviation, etc.) for running cases?
Can we recommend countermeasures?
How to redesign the process / organization / machine?
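To make questions such as "highways" and bottlenecks concrete, the sketch below counts how often one activity directly follows another within a case and how long that step takes on average. It is only an illustration under stated assumptions: the event log, its activity names, and the function `directly_follows` are all made up here, and this is not presented as the technique taught in the course.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, timestamp). All values are invented.
EVENTS = [
    ("c1", "register", "2019-08-10 09:00"),
    ("c1", "examine",  "2019-08-10 09:30"),
    ("c1", "pay",      "2019-08-10 11:30"),
    ("c2", "register", "2019-08-10 09:10"),
    ("c2", "examine",  "2019-08-10 09:50"),
    ("c2", "pay",      "2019-08-10 12:50"),
    ("c3", "register", "2019-08-10 10:00"),
    ("c3", "pay",      "2019-08-10 10:20"),
    ("c4", "register", "2019-08-10 10:30"),
    ("c4", "examine",  "2019-08-10 11:00"),
]


def directly_follows(events):
    """Return (counts, mean waiting minutes) per pair of consecutive activities in a case."""
    traces = defaultdict(list)
    for case, activity, ts in events:
        traces[case].append((datetime.strptime(ts, "%Y-%m-%d %H:%M"), activity))
    counts, total_minutes = Counter(), Counter()
    for trace in traces.values():
        trace.sort()  # order the events of one case by timestamp
        for (t1, a1), (t2, a2) in zip(trace, trace[1:]):
            counts[(a1, a2)] += 1
            total_minutes[(a1, a2)] += (t2 - t1).total_seconds() / 60
    mean_wait = {pair: total_minutes[pair] / n for pair, n in counts.items()}
    return counts, mean_wait


counts, mean_wait = directly_follows(EVENTS)
highway = counts.most_common(1)[0][0]           # most frequent step
bottleneck = max(mean_wait, key=mean_wait.get)  # slowest step on average
print("highway:", highway)
print("bottleneck:", bottleneck, mean_wait[bottleneck], "minutes")
```

In this invented log, case c3 skips the "examine" activity, which is the kind of deviation from the expected process that the use cases above ask about.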
-
Process Mining: Data Science in Action
Chapter 1: Introduction
Part I: Preliminaries
Chapter 2: Process Modeling and Analysis
Chapter 3: Data Mining
Part II: From Event Logs to Process Models
Chapter 4: Getting the Data
Chapter 5: Process Discovery: An Introduction
Chapter 6: Advanced Process Discovery Techniques
Part III: Beyond Process Discovery
Chapter 7: Conformance Checking
Chapter 8: Mining Additional Perspectives
Part IV: Putting Process Mining to Work
Chapter 10: Tool Support
Chapter 11: Analyzing Lasagna Processes
Part V: Reflection
Chapter 13: Cartography and Navigation
Chapter 14: Epilogue