Introduction to Machine Learning and Data Mining Advanced Information Systems and Business Analytics for Air Transportation M.Sc. Air Transport Management June 1-6, 2015 Slides prepared by N. Kemal Üre

Upload: others

Post on 29-May-2020

12 views

Category:

Documents


0 download

TRANSCRIPT

Introduction to Machine Learning and Data Mining

Advanced Information Systems and Business Analytics for Air Transportation

M.Sc. Air Transport Management

June 1-6, 2015

Slides prepared by N. Kemal Üre

A Framework for Business Analytics


What is Machine Learning?

Study of algorithms that can learn and make predictions from data

Data → Model → Prediction

• Also referred to as predictive modeling or predictive analytics

• Strong ties with statistics, computer science and optimization

• A wide range of applications: spam filtering, optical character recognition (OCR), search engines and computer vision

What is Machine Learning?

• How is Machine Learning (ML) different from Data Mining and Statistics?

• Statistics

– Sub-field of mathematics

– Inference of probabilistic models

– The main objective is understanding the underlying data generation process

• Data Mining (DM)

– Carried out by a person, uses methods from statistics and ML

– Usually works with massive datasets with problematic entries

– Gain preliminary insight and make predictions

ML/DM Process

Source: Kantardzic


Types of Data


Data Preparation

• Transformations

– Normalization

• Decimal Scaling

• Min-max normalization

• Standard Deviation Normalization

– Smoothing

Source: Kantardzic
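
The three normalization schemes listed above can be sketched in a few lines of Python. This is a minimal illustration, not the formulation used in the lecture: the function names and the `delays` sample are hypothetical, and the standard-deviation version assumes the population standard deviation (the sample standard deviation is an equally common choice).

```python
from statistics import mean, pstdev

def decimal_scaling(values):
    """Divide by the smallest power of 10 that brings every |v| below 1."""
    max_abs = max(abs(v) for v in values)
    k = 0
    while max_abs / (10 ** k) >= 1:
        k += 1
    return [v / (10 ** k) for v in values]

def min_max(values, new_min=0.0, new_max=1.0):
    """Map the observed range [min, max] linearly onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

def z_score(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical feature: departure delays in minutes.
delays = [5, 12, 45, 120, 310]
print(decimal_scaling(delays))
print(min_max(delays))
print(z_score(delays))
```

Min-max normalization is sensitive to a single extreme value, since that value defines the range; the standard-deviation version is less affected but does not bound the output.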

Data Preparation

• Missing Data

• Time-Dependent Data

Source: Kantardzic
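
As a rough illustration of these two preparation steps, the sketch below fills missing entries with the mean of the observed values and then turns a time series into fixed-width windows that a supervised learner could use. Both are common simple choices rather than the specific methods shown on the slide; the function names and the `load_factors` series are hypothetical.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def sliding_windows(series, width):
    """Turn a time series into (previous `width` values, next value) pairs."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]

# Hypothetical daily load factors with two missing readings.
load_factors = [0.82, None, 0.78, 0.91, None, 0.85]
filled = impute_mean(load_factors)
pairs = sliding_windows(filled, width=3)
for window, target in pairs:
    print(window, "->", target)
```

Mean imputation keeps the sample mean unchanged but shrinks the variance, so it is best treated as a baseline.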

Data Preparation

• Outliers

Source: Kantardzic
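
One simple statistical way to flag outliers is to mark values that lie more than k standard deviations from the mean. The sketch below assumes that rule with the sample standard deviation; the threshold `k`, the function name, and the `turnaround_minutes` data are illustrative assumptions.

```python
from statistics import mean, stdev

def find_outliers(values, k=2.0):
    """Return the values lying more than k sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) > k * sigma]

# Hypothetical aircraft turnaround times in minutes; 240 is a suspicious entry.
turnaround_minutes = [12, 15, 11, 14, 13, 16, 12, 240, 14, 13]
print(find_outliers(turnaround_minutes, k=2.0))
print(find_outliers(turnaround_minutes, k=3.0))
```

With `k=2.0` the value 240 is flagged, while with `k=3.0` nothing is flagged: the extreme value inflates the standard deviation enough to hide itself. This masking effect is one reason median-based rules are often preferred on small samples.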

Primary ML/DM Problems

• Supervised Learning

– Data is labeled <x_i,y_i>

– Learn the association between x and y

• Unsupervised Learning

– Data is unlabeled, we only have x_i

– Learn the structure and patterns in x

• Reinforcement Learning

– Learn how to “control” a dynamic system

Supervised Learning

• Classification

• Regression


Classification

• Predict the class of the input variable

• Function approximation approach y = f(x)

• Probabilistic approach P(y|x)

Source: Murphy 2011
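
The two views of classification can be seen side by side in a small logistic regression, which is used here only as one convenient example of a probabilistic classifier. The model estimates P(y=1|x), and thresholding that probability at 0.5 gives a hard decision y = f(x). The training data, learning rate, and function names below are assumptions made for the sketch.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit P(y=1|x) = sigmoid(w * (x - center) + b) by batch gradient descent."""
    center = sum(xs) / len(xs)  # centering the feature helps gradient descent
    w, b = 0.0, 0.0
    for _ in range(epochs):
        errors = [sigmoid(w * (x - center) + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * (x - center) for e, x in zip(errors, xs)) / len(xs)
        grad_b = sum(errors) / len(xs)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, center

def predict_proba(x, model):
    """Probabilistic view: return the estimate of P(y=1|x)."""
    w, b, center = model
    return sigmoid(w * (x - center) + b)

def classify(x, model):
    """Function view: y = f(x), obtained by thresholding the probability."""
    return 1 if predict_proba(x, model) >= 0.5 else 0

# Hypothetical 1-D feature (e.g. a congestion score) with binary labels.
xs = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_logistic(xs, ys)
for x in (2.0, 5.5, 8.0):
    print(x, round(predict_proba(x, model), 3), classify(x, model))
```

The probabilistic output is useful when the cost of a wrong decision depends on the confidence, since the threshold can then be moved away from 0.5.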

Classification Examples

Document Classification, Spam Filtering, Hand-written Digit Recognition


Classification Examples


Face Detection

Credit Risk Calculation

Classification for Delay Prediction

Source: Rebollo, Balakrishnan 2014

Regression

• Like classification, but the response variable is continuous

• Curve fitting and model selection


Regression

Beware of the noise in the data!
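
The link between curve fitting, model selection, and noise can be illustrated with polynomial least squares. The sketch below fits polynomials of increasing degree to noisy samples of a sine curve and compares the error on the training points with the error on a separate validation sample. The target function, noise level, candidate degrees, and random seed are all assumptions chosen for illustration.

```python
import math
import random

def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations; coeffs[i] multiplies x**i."""
    n = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(ata[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs

def mse(coeffs, xs, ys):
    """Mean squared error of the polynomial on the given points."""
    preds = [sum(c * x ** i for i, c in enumerate(coeffs)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)

rng = random.Random(0)  # fixed seed so the run is repeatable

def noisy_sample(n):
    """Draw n points from sin(2*pi*x) plus Gaussian noise (assumed setup)."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.2) for x in xs]
    return xs, ys

train_x, train_y = noisy_sample(30)
valid_x, valid_y = noisy_sample(30)

results = {}
for degree in (1, 3, 5):
    coeffs = fit_polynomial(train_x, train_y, degree)
    results[degree] = (mse(coeffs, train_x, train_y), mse(coeffs, valid_x, valid_y))
    print(degree, results[degree])

# Model selection: keep the degree with the lowest validation error.
best_degree = min(results, key=lambda d: results[d][1])
print("selected degree:", best_degree)
```

The training error can only go down as the degree grows, because each model contains the previous one. The validation error is the quantity to watch: once a model starts fitting the noise rather than the curve, it stops improving.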


Regression Examples

• Predict tomorrow’s stock market price given current market conditions and other possible side information.

• Predict the age of a viewer watching a given video on YouTube.

• Predict the location in 3D space of a robot arm end effector, given control signals (torques) sent to its various motors.

• Predict the amount of prostate specific antigen (PSA) in the body as a function of a number of different clinical measurements.

• Predict the temperature at any location inside a building using weather data, time, door sensors, etc.

Source: Murphy 2011

Regression for Predicting Ticket Prices

Source: Gini 2011

Unsupervised Learning

• Clustering

• Learning Graphs

• Matrix Completion


Clustering

• Segment the data into different groups
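
A common way to segment data into groups is k-means, sketched below as one possible clustering method (the slides do not name a specific algorithm). The customer coordinates are hypothetical, and the centroids are initialised naively from the first k points, which is an assumption that works here only because those points fall in different groups.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points]

def kmeans(points, k, iterations=20):
    centroids = list(points[:k])  # naive initialisation (assumption)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = centroid(members)
    return centroids, assign(points, centroids)

# Hypothetical customer locations (x, y) for a delivery network.
customers = [(1.0, 1.0), (9.0, 8.0), (1.5, 2.0), (8.0, 9.0), (2.0, 1.5), (9.5, 9.0)]
centers, labels = kmeans(customers, k=2)
print(centers)
print(labels)
```

In a delivery setting the resulting centroids could serve as candidate depot locations, with each customer served from the nearest one.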


Clustering Examples

Astronomy, Social Networks

Clustering for Delivery Network


Bayesian Networks

Network: smart, study → prepared; smart, prepared, fair → pass

p(smart)=.8   p(study)=.6   p(fair)=.9

p(prep|…)    smart   ¬smart
study         .9      .7
¬study        .5      .1

p(pass|…)       smart            ¬smart
              prep   ¬prep     prep   ¬prep
fair           .9     .7        .7     .2
¬fair          .1     .1        .1     .1

Query: What is the probability that a student is smart, given that they pass the exam?
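
The query for the student network can be answered by brute-force enumeration: sum the joint probability over all unobserved variables, once with smart fixed to true and once over everything, then divide. The sketch below assumes that the second "smart", "study", "prep", and "fair" columns and rows in the tables are the negated cases, and that the dependencies are smart, study → prepared and smart, prepared, fair → pass. The variable and function names are illustrative.

```python
from itertools import product

P_SMART, P_STUDY, P_FAIR = 0.8, 0.6, 0.9

# p(prepared = True | smart, study), keyed by (smart, study)
P_PREP = {
    (True, True): 0.9, (False, True): 0.7,
    (True, False): 0.5, (False, False): 0.1,
}

# p(pass = True | smart, prepared, fair), keyed by (smart, prepared, fair)
P_PASS = {
    (True, True, True): 0.9, (True, False, True): 0.7,
    (False, True, True): 0.7, (False, False, True): 0.2,
    (True, True, False): 0.1, (True, False, False): 0.1,
    (False, True, False): 0.1, (False, False, False): 0.1,
}

def bernoulli(p_true, value):
    """Probability of a binary outcome given the probability of True."""
    return p_true if value else 1.0 - p_true

def joint(smart, study, prepared, fair, passed):
    """Joint probability as the product of the network's local tables."""
    return (bernoulli(P_SMART, smart)
            * bernoulli(P_STUDY, study)
            * bernoulli(P_FAIR, fair)
            * bernoulli(P_PREP[(smart, study)], prepared)
            * bernoulli(P_PASS[(smart, prepared, fair)], passed))

def posterior_smart_given_pass():
    """P(smart | pass) by summing the joint over the hidden variables."""
    numerator = 0.0
    evidence = 0.0
    for smart, study, prepared, fair in product([True, False], repeat=4):
        p = joint(smart, study, prepared, fair, True)
        evidence += p
        if smart:
            numerator += p
    return numerator / evidence

print(round(posterior_smart_given_pass(), 4))
```

Under these assumptions the posterior comes out at roughly 0.886, up from the prior of 0.8. Enumeration is exponential in the number of variables, so it is only practical for small networks like this one.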

Bayesian Networks

“Asia” network nodes: Visit to Asia, Smoking, Tuberculosis, Lung Cancer, Bronchitis, Abnormality in Chest, X-Ray, Dyspnea

BN Application: Fare Value and Passenger Behavior

• What is the expected fare value for a specific passenger behavior?

• Can predictive models be developed for reservation changes and no-show rates for individual passengers on individual itineraries?

Source: Booz Allen

Matrix Completion

Source: Murphy 2011

MC for Image Recovery

Source: Murphy 2011

MC for Product Recommendation

• Filtering: Given my purchase history, what is my next likely purchase?

• Collaborative Filtering: Given the purchase history of customers similar to me, what is my next likely purchase?

Source: Murphy 2011
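
The collaborative filtering question can be made concrete with a small user-based recommender: measure how similar each other customer is to me, then score the items I have not bought by the similarity of the customers who did buy them. The sketch uses cosine similarity on binary purchase sets, which is one common choice rather than the method from the cited source. The customers and products are invented.

```python
import math

# Hypothetical purchase histories: customer -> set of purchased items.
purchases = {
    "me": {"seat_upgrade", "lounge_pass"},
    "alice": {"seat_upgrade", "lounge_pass", "extra_bag"},
    "bob": {"seat_upgrade", "extra_bag", "wifi"},
    "carol": {"car_rental"},
}

def cosine(a, b):
    """Cosine similarity between two binary purchase vectors given as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def recommend(user, purchases, top_n=1):
    """Rank unseen items by the summed similarity of the customers who bought them."""
    mine = purchases[user]
    scores = {}
    for other, items in purchases.items():
        if other == user:
            continue
        similarity = cosine(mine, items)
        if similarity == 0:
            continue  # customers with nothing in common carry no signal
        for item in items - mine:
            scores[item] = scores.get(item, 0.0) + similarity
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]

print(recommend("me", purchases))
print(recommend("me", purchases, top_n=3))
```

Here "extra_bag" ranks first because both similar customers bought it, and "car_rental" is never suggested because its only buyer shares no purchases with me. The pairwise comparison also shows why sparsity and scalability are concerns: with few overlapping purchases most similarities are zero, and the cost grows with the number of customers.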

Collaborative Filtering Challenges

• Data Sparsity

• Scalability

• Synonymy

• Gray Sheep

• Attacks


Beyond the User-Item Matrix

Source: Shi 2014


Product Recommendation System For Airlines

Source: Barth 2014

Reinforcement Learning


Maze Exploration

Source: Geramifard 2011

RL Application - Maintenance Optimization

• A machine/component degradation model

• Maintenance costs money but restores the machine to its original state

• If not maintained, the machine eventually breaks down

• In which state is it optimal to repair the machine?

Source: Bertsekas 2006
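
The maintenance question can be posed as a small Markov decision process and solved with value iteration. Value iteration is a dynamic-programming method that assumes the degradation model is known; a reinforcement learning method would estimate the same values from experience instead. Every number below (wear levels, costs, degradation probability, discount factor) is a hypothetical choice for the sketch, not a value from the cited source.

```python
GAMMA = 0.9                            # discount factor (assumed)
P_DEGRADE = 0.5                        # chance of moving to the next wear level (assumed)
OPERATE_COST = [0.0, 1.0, 3.0, 6.0]    # running cost at wear levels 0..3 (assumed)
REPAIR_COST = 10.0                     # planned repair, restores wear level 0 (assumed)
BREAKDOWN_COST = 100.0                 # forced replacement after failure (assumed)
BROKEN = len(OPERATE_COST)             # index of the broken state

def action_costs(state, values):
    """Expected discounted cost of each action available in a state."""
    if state == BROKEN:
        return {"replace": BREAKDOWN_COST + GAMMA * values[0]}
    next_value = P_DEGRADE * values[state + 1] + (1 - P_DEGRADE) * values[state]
    return {
        "operate": OPERATE_COST[state] + GAMMA * next_value,
        "repair": REPAIR_COST + GAMMA * values[0],
    }

def value_iteration(sweeps=500):
    """Repeat the Bellman update, then read off the cheapest action per state."""
    values = [0.0] * (BROKEN + 1)
    for _ in range(sweeps):
        values = [min(action_costs(s, values).values()) for s in range(BROKEN + 1)]
    policy = []
    for s in range(BROKEN + 1):
        costs = action_costs(s, values)
        policy.append(min(costs, key=costs.get))
    return values, policy

values, policy = value_iteration()
for state, (value, action) in enumerate(zip(values, policy)):
    print(state, round(value, 2), action)
```

With these numbers the policy keeps a new machine running and repairs it before the most worn state can fail, since a breakdown costs far more than a planned repair. The wear level at which the policy switches from "operate" to "repair" answers the question posed above for this particular cost model.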

RL Application – Active Web Advertising

Source: Silver 2013