Post on 18-Sep-2020

Demystifying Machine Learning

us.sogeti.com

Topics

What is Machine Learning?
• Definition
• A Couple of Motivating Examples
• Why the Hype?

Types of Machine Learning
• Techniques and Use-Cases
• Common ML and AI Algorithms
• AI vs. ML vs. Deep Learning

Machine Learning Process
• Lifecycle
• Analysis and Model Building

Common Challenges in Machine Learning
• Challenges with Data
• Poor Model Performance
• Production Deployment and Maintenance

Machine Learning in the Public Sector
• A Few Applications

What is Machine Learning?

“Science of getting computers to act without explicit programming”


Motivating Example – Predicting Home Prices

Living Area (sq. ft.)   # of Bedrooms   Parking Space   Finished Basement?   Other Parameters (zipcode, school district, tax rate..)   House Price
2400                    4               3               1                    …                                                         $350,000
1400                    2               1               0                    …                                                         $190,000
1900                    3               2               0                    …                                                         $250,000

Living Area (sq. ft.)   # of Bedrooms   Parking Space   Finished Basement?   Other Parameters (zipcode, school district, tax rate..)   House Price
2000                    2               2               1                    …                                                         ?
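As a rough illustration of how a computer could learn a price from examples like these, the sketch below fits a straight line of price against living area using ordinary least squares. Using living area alone and assuming a linear relationship are simplifications made here for brevity, not something the slides prescribe, and `fit_line` is a hypothetical helper name.

```python
# Toy sketch: learn price from living area only (an assumption; the
# table also lists bedrooms, parking, basement, and other parameters).

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# The three known sales from the table.
living_area = [2400, 1400, 1900]
price = [350_000, 190_000, 250_000]

slope, intercept = fit_line(living_area, price)

# Estimate the price of the 2,000 sq. ft. house.
predicted_price = slope * 2000 + intercept
print(f"Estimated price: ${predicted_price:,.0f}")
```

No pricing rule is written by hand: the slope and intercept come entirely from the example rows. With only three examples the estimate is very rough, and a realistic model would use the other columns too.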


Another Motivating Example – Self Driving Cars

Basic High Level Technique Behind Self Driving Cars

1) A human drives a car in varying traffic conditions
2) While the car is being driven, a set of cameras records:
   a. The traffic conditions
   b. The corresponding action(s) taken by the driver
3) The data points collected (billions/trillions of telemetry records, videos, and images!) are fed to computers
4) Machine Learning algorithms train on the data and the computers learn what action(s) to take under different traffic conditions
5) The computers are now given charge of driving the car!

Note that no explicit rules about what to do, and when, are fed to the computers


Machine Learning - Why the Hype Now?!

[Figure: Performance (y-axis) versus Amount of Data (x-axis), comparing Traditional Machine Learning Algorithms with Modern Machine Learning Algorithms]

1. The term ‘Machine Learning’ was coined in 1959

2. However, it is only in the last 7 to 8 years that it has caught on and been adopted widely in business

3. This is for two reasons:
   a. Explosion of data generation all around us
   b. Availability of compute power in terms of GPUs, and horizontally scalable platforms like Hadoop

Courtesy - Andrew Ng: Artificial Intelligence is the New Electricity


Machine Learning – Types

Machine Learning
• Predictive
   • Supervised
      • Regression
        Examples:
        1. House price prediction
        2. Stock price prediction
        3. Demand prediction
      • Classification
        Examples:
        1. Image classification
        2. Email spam detection
        3. Tumor classification
        4. Fraud detection
   • Un-Supervised
     Examples:
     1. Customer segmentation
     2. Document classification
     3. Fraud detection
• Prescriptive (Optimizations)
  Examples:
  1. Inventory optimization
  2. Truck route optimization
  3. Retail store assortment optimization

Machine Learning – Common Algorithms

Machine Learning
• Predictive
   • Supervised
      • Regression
        1. Linear Regression
        2. SVM
        3. K-Nearest Neighbors
        4. Decision Trees
      • Classification
        1. Logistic Regression
        2. Neural Networks
        3. K-Nearest Neighbors
        4. Decision Trees
   • Un-Supervised
     1. K-means Clustering
     2. Principal Component Analysis (PCA)
• Prescriptive (Optimizations)
  1. Linear Programming
  2. Non-Linear Programming
  3. Metaheuristics:
     a. Genetic Algorithms
     b. Simulated Annealing
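To make one of the listed algorithms concrete, here is a minimal sketch of K-Nearest Neighbors used for classification, in the spirit of the email spam detection use case. The two features (number of links and number of exclamation marks), the data, and the name `knn_predict` are all made up for illustration.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical features per email: (links, exclamation marks).
emails = [(0, 0), (1, 0), (0, 1), (7, 9), (8, 6), (9, 8)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(emails, labels, (6, 7)))  # close to the spam examples
print(knn_predict(emails, labels, (1, 1)))  # close to the ham examples
```

The same idea carries over to regression by averaging the neighbors' numeric values instead of taking a vote, which is why K-Nearest Neighbors appears in both lists.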


The Jargon: Artificial Intelligence, Machine Learning, Deep Learning??!!

The Jargon: Venn Diagram Representation

[Venn diagram: nested circles, with Deep Learning inside Machine Learning inside AI]

Definitions: Artificial Intelligence vs. Machine Learning vs. Deep Learning

Artificial Intelligence: Techniques that enable computers to mimic human intelligence.

This can include things like making predictions, planning, understanding language, recognizing objects etc.

AI can be achieved using a variety of techniques such as if-then-rules, decision trees, Robotic Process Automation (RPA), machine learning etc.

Machine Learning: A subset of AI techniques based around the idea that we should really just be able to give machines access to data and mimic human intelligence by letting them learn for themselves.

Some techniques include Linear/Logistic Regression, SVM, Random Forests, Neural Networks etc.

Artificial Intelligence vs. Machine Learning vs. Deep Learning

Deep Learning: A subset of machine learning algorithms that can perform higher-level human tasks such as image recognition, speech-to-text conversion, sentiment classification of text, language translation etc.

These algorithms are inspired by the neural networks in the human brain.


Machine Learning Lifecycle

[Figure: Machine Learning lifecycle diagram, including a Deployment stage]

Machine Learning – Analysis and Model Building

1. Explore Data
2. Prepare Data
3. Perform Feature Engineering
4. Divide Data into Train/Validation/Test Datasets
5. Train Different Models on Training Dataset
6. Evaluate Trained Models on ‘Validation’ Dataset
7. Pick the best performing model
8. Test the chosen model on ‘Test’ Dataset

70-80% of the time is spent on the earlier steps; only 20-30% of the time is spent on the later steps.
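The split, train, evaluate, pick, and test steps can be sketched in a few lines. The example below is only an illustration under stated assumptions: the data is synthetic, the 60/20/20 split is one common choice rather than a rule from the slides, and the two candidate models (a constant and a straight line) are deliberately simple stand-ins.

```python
import random

def mse(model, data):
    """Mean squared error of `model` (a function x -> y) on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def train_constant(data):
    """Candidate 1: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def train_linear(data):
    """Candidate 2: least-squares line y = slope * x + intercept."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = sxy / sxx
    return lambda x: slope * x + (mean_y - slope * mean_x)

# Synthetic data (an assumption for illustration): y = 3x + 2 plus noise.
rng = random.Random(0)
data = [(x, 3 * x + 2 + rng.gauss(0, 1)) for x in range(100)]
rng.shuffle(data)

# Divide into train / validation / test (60% / 20% / 20%).
train, validation, test = data[:60], data[60:80], data[80:]

# Train each candidate on the training set only.
candidates = {
    "constant": train_constant(train),
    "linear": train_linear(train),
}

# Evaluate on the validation set and pick the best performer.
validation_error = {name: mse(m, validation) for name, m in candidates.items()}
best_name = min(validation_error, key=validation_error.get)

# Use the test set once, for the chosen model only.
test_error = mse(candidates[best_name], test)
print(best_name, round(test_error, 3))
```

Keeping the test set out of both training and model selection is what makes the final number an honest estimate of how the chosen model behaves on unseen data.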


Machine Learning – Challenges

1. Lack of data or lack of labeled/tagged data for supervised learning
   a. Many modern Machine Learning algorithms are very data hungry
   b. Some organizations do have the required data but it is not labeled
   c. Ways to address the challenge:
      • Gather more data!
      • Data synthesis

2. Poor Data Quality:
   a. A Machine Learning Model is as good as the data!
   b. Typical reasons for an organization to have poor data quality are:
      • Manual data entry
      • Lack of consistent data dictionaries
      • Inconsistent entries by different users
      • System Integration Issues
   c. Ways to improve data quality:
      • Automation
      • Good data architecture
      • Implementation of good data governance principles

Machine Learning – Challenges (Contd.)

3. Poor Performance of Machine Learning Algorithms:
   a. Bias vs. Variance tradeoff a.k.a. Underfitting vs. Overfitting tradeoff
   b. Stale Models
      i. Models when left alone get stale very quickly
      ii. They need to be recalibrated and retrained on a regular basis
      iii. Having a feedback loop from production data is a best practice
   c. Unbalanced datasets
      i. For example, in Fraud Detection
      ii. A useless model can still show very high accuracy!
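The unbalanced-dataset point is easy to check with arithmetic. The sketch below assumes a made-up fraud dataset in which 10 of 1,000 transactions are fraudulent, and a "model" that never flags anything. Recall is a metric introduced here for contrast; the slides themselves only mention accuracy.

```python
# Hypothetical fraud data: 1,000 transactions, only 10 are fraud (label 1).
labels = [1] * 10 + [0] * 990

# A "useless" model that never flags anything as fraud.
predictions = [0] * len(labels)

# Accuracy: the share of all predictions that match the true label.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall: the share of actual fraud cases the model caught.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(f"accuracy = {accuracy:.1%}, recall = {recall:.1%}")
# prints: accuracy = 99.0%, recall = 0.0%
```

The model is right 99% of the time and still misses every fraud case, which is why accuracy alone is a poor yardstick when one class is rare.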


Challenge: Overfitting and Underfitting

Underfitting – The model does not capture the structure of the data. It is too simple.

Overfitting – The model tries too hard to fit all the outliers and errors in the data and does not do well with new data. A high-order polynomial model is a typical example.
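The overfitting case can be reproduced on a tiny made-up dataset. In the sketch below, six noisy points lie around the trend y = 2x. One model is a straight line; the other is the degree-5 polynomial that passes through every training point exactly. The data, the noise pattern, and the function names are assumptions chosen for illustration.

```python
def fit_line(points):
    """Least-squares straight line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return lambda x: slope * x + (mean_y - slope * mean_x)

def fit_interpolating_polynomial(points):
    """Degree n-1 polynomial through all n points (Lagrange form)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict

def max_abs_error(model, points):
    """Largest absolute prediction error of `model` on (x, y) points."""
    return max(abs(model(x) - y) for x, y in points)

# Made-up training data: the true trend is y = 2x, plus alternating noise.
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
train = [(x, 2 * x + e) for x, e in zip(range(6), noise)]

# New data the models have not seen, taken from the true trend.
new_points = [(6, 12.0), (7, 14.0)]

line = fit_line(train)
poly = fit_interpolating_polynomial(train)

print("line: train", max_abs_error(line, train), "new", max_abs_error(line, new_points))
print("poly: train", max_abs_error(poly, train), "new", max_abs_error(poly, new_points))
```

The polynomial has zero error on the training points because it has fitted the noise, yet its error on the new points is far larger than the line's.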


Challenge: How to Overcome Underfitting and Overfitting

Underfitting
- Increase model complexity by adding more features, adding higher order terms or interaction features
- Try different Machine Learning algorithms

Overfitting
- Make the model simpler by:
   - Reducing the number of features
   - Penalizing a complex model by using a mathematical technique called Regularization
- Train on a larger dataset
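As a small sketch of what Regularization does, the example below uses one common variant, an L2 ("ridge") penalty, on a model with a single weight and no intercept. The choice of ridge, the data, and the name `ridge_weight` are assumptions made for illustration; the slides do not name a specific technique.

```python
def ridge_weight(xs, ys, penalty):
    """Minimise sum((y - w*x)**2) + penalty * w**2 for a single weight w.

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + penalty).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # exactly y = 2x

# A larger penalty pulls the learned weight toward zero.
for penalty in (0.0, 1.0, 10.0):
    print(penalty, ridge_weight(xs, ys, penalty))
```

With no penalty the weight is exactly 2; as the penalty grows, the weight shrinks. In models with many weights, the same pressure discourages the large coefficients that overfitted models tend to rely on.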


Machine Learning – Challenges (Contd.)

4. Production Deployment and Maintenance:
   a. The industry is still immature in this area. Many organizations have ML models running on individual laptops!
   b. Without an effective deployment solution, it is hard to:
      i. Make the model available to the larger organization and embed it in business applications
      ii. Determine why a model works well on training data but not on production data
      iii. Maintain different versions of the model and do A/B testing
      iv. Gain the confidence of the business when the results cannot be interpreted by them


Machine Learning Applications in the Public Sector

1. Fraud Detection:
   • Payments
   • Insider threat detection
   • DMV applications

2. Deployment of Resources:
   • Optimal deployment of police and traffic cops
   • Readiness for adverse events like earthquakes and forest fires

3. Education:
   • Automatic computer grading of student papers and answer sheets
   • Detect cheating in tests and plagiarism

Machine Learning Applications in the Public Sector (Contd.)

4. Process Automation:
   • Chat bots to answer citizen queries
   • Deciphering the sentiment and mood of citizens calling in to government agencies' call centers
   • Matching job descriptions with resumes
   • Intelligent automation of things like event registration, sending communication emails etc.

5. Predictive Analytics:
   • Predict and reduce reincarceration rates
   • Predict and reduce hospital readmission rates
   • Detect system intrusion into IT systems and hacking activity

6. Predictive Maintenance:
   • Predictive maintenance of heavy equipment

Questions?


Thank You!


References

i. https://medium.com/iotforall/the-difference-between-artificial-intelligence-machine-learning-and-deep-learning-3aa67bff5991

ii. https://medium.com/@diamond_io/artificial-intelligence-101-everything-you-need-to-know-to-understand-ai-8e20fe4bd750

iii. https://www.forbes.com/sites/bernardmarr/2016/12/06/what-is-the-difference-between-artificial-intelligence-and-machine-learning/#25895eea2742

iv. https://dzone.com/articles/10-interesting-use-cases-for-the-k-means-algorithm

v. https://www.youtube.com/watch?v=21EiKfQYZXc&list=PL4KifhYqFlly5ynC4WqwNSwzGcPmT_t6g&index=2&t=0s
