Demystifying Machine Learning
us.sogeti.com
Topics
What is Machine Learning?
• Definition
• A Couple of Motivating Examples
• Why the Hype?
Types of Machine Learning
• Techniques and Use-Cases
• Common ML and AI Algorithms
• AI vs. ML vs. Deep Learning
Machine Learning Process
• Lifecycle
• Analysis and Model Building
Common Challenges in Machine Learning
• Challenges with Data
• Poor Model Performance
• Production Deployment and Maintenance
Machine Learning in the Public Sector
• A Few Applications
What is Machine Learning?
“Science of getting computers to act without explicit programming”
Motivating Example – Predicting Home Prices
Known sales (training data):
Living Area (sq. ft.) | # of Bedrooms | Parking Spaces | Finished Basement? | Other Parameters (zip code, school district, tax rate, ...) | House Price
2400 | 4 | 3 | 1 | … | $350,000
1400 | 2 | 1 | 0 | … | $190,000
1900 | 3 | 2 | 0 | … | $250,000
House to be priced:
Living Area (sq. ft.) | # of Bedrooms | Parking Spaces | Finished Basement? | Other Parameters (zip code, school district, tax rate, ...) | House Price
2000 | 2 | 2 | 1 | … | ?
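As a rough illustration of how the unknown price could be estimated from the three known sales above, here is a minimal sketch that fits a straight line of price against living area using ordinary least squares. Using only the living-area column is a simplifying assumption made for this example, and the helper names `fit_line` and `predict` are hypothetical; a realistic model would use all columns and far more data.

```python
# Minimal sketch: ordinary least squares on a single feature (living area).
# Assumption: all other columns of the table are ignored for simplicity.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept

# The three known sales from the table.
living_area = [2400, 1400, 1900]
price = [350_000, 190_000, 250_000]

model = fit_line(living_area, price)
estimate = predict(model, 2000)
print(f"Estimated price for 2000 sq. ft.: ${estimate:,.0f}")
```

The point of the sketch is that the relationship between area and price is learned from the examples rather than written down as a rule by a programmer.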
Another Motivating Example – Self Driving Cars
Basic High Level Technique Behind Self Driving Cars
1) A human drives a car in varying traffic conditions
2) While the car is being driven, a set of cameras record:
   a. The traffic conditions
   b. And the corresponding action(s) taken by the driver
3) The collected data points (billions/trillions of telemetry readings, videos, and images!) are fed to computers
4) Machine Learning algorithms train on the data and the computers learn what action(s) to take under different traffic conditions
5) The computers are now given charge of driving the car!
Note that no rules about what to do in which situation are explicitly fed to the computers
Machine Learning - Why the Hype Now?!
[Figure: Performance (y-axis) vs. Amount of Data (x-axis), comparing Traditional Machine Learning Algorithms with Modern Machine Learning Algorithms]
1. The term ‘Machine Learning’ was coined in 1959
2. However, it is only in the last 7 to 8 years that it has caught on and been adopted widely in business
3. This is for two reasons:
   a. Explosion of data generation all around us
   b. Availability of compute power in the form of GPUs and horizontally scalable platforms like Hadoop
Courtesy - Andrew Ng: Artificial Intelligence is the New Electricity
Machine Learning – Types
Machine Learning
   Predictive
      Supervised
         Regression
            Examples: 1. House price prediction 2. Stock price prediction 3. Demand prediction
         Classification
            Examples: 1. Image classification 2. Email spam detection 3. Tumor classification 4. Fraud detection
      Un-Supervised
         Examples: 1. Customer segmentation 2. Document classification 3. Fraud detection
   Prescriptive (Optimizations)
      Examples: 1. Inventory optimization 2. Truck route optimization 3. Retail store assortment optimization
Machine Learning – Common Algorithms
Machine Learning
   Predictive
      Supervised
         Regression: 1. Linear Regression 2. SVM 3. K-Nearest Neighbors 4. Decision Trees
         Classification: 1. Logistic Regression 2. Neural Networks 3. K-Nearest Neighbors 4. Decision Trees
      Un-Supervised: 1. K-means Clustering 2. Principal Component Analysis (PCA)
   Prescriptive (Optimizations):
      1. Linear Programming
      2. Non-Linear Programming
      3. Metaheuristics:
         a. Genetic Algorithms
         b. Simulated Annealing
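To make one of the listed algorithms concrete, the following is a minimal sketch of K-Nearest Neighbors used for classification: a new point receives the majority label of its k closest training points. The toy data, the labels "A" and "B", and the function name `knn_classify` are assumptions made up for this illustration.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Label `query` with the majority label among its k nearest training points.

    train: list of (features, label) pairs, where features is a tuple of numbers.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups of 2-D points.
train = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.5), "A"),
    ((6.0, 6.5), "B"),
    ((7.0, 6.0), "B"),
    ((6.5, 7.5), "B"),
]

print(knn_classify(train, (1.8, 1.6)))  # falls inside the "A" group
print(knn_classify(train, (6.4, 6.2)))  # falls inside the "B" group
```

The same idea works for regression by averaging the numeric targets of the nearest neighbors instead of taking a majority vote, which is why the algorithm appears under both headings.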
The Jargon: Artificial Intelligence, Machine Learning, Deep Learning??!!
The Jargon: Venn Diagram Representation
[Venn diagram: three nested circles, AI ⊃ Machine Learning ⊃ Deep Learning]
Definitions: Artificial Intelligence vs. Machine Learning vs. Deep Learning
Artificial Intelligence: Techniques that enable computers to mimic human intelligence.
This can include things like making predictions, planning, understanding language, recognizing objects, etc.
AI can be achieved using a variety of techniques such as if-then rules, decision trees, Robotic Process Automation (RPA), machine learning, etc.
Machine Learning: A subset of AI techniques based on the idea that machines can mimic human intelligence if we give them access to data and let them learn for themselves.
Some techniques include Linear/Logistic Regression, SVM, Random Forests, Neural Networks, etc.
Artificial Intelligence vs. Machine Learning vs. Deep Learning
Deep Learning: A subset of machine learning algorithms that can perform higher-level human tasks such as image recognition, speech-to-text conversion, sentiment classification of text, language translation, etc.
These algorithms are inspired by the networks of neurons in the human brain.
Machine Learning Lifecycle
Deployment
Machine Learning – Analysis and Model Building
1. Explore Data
2. Prepare Data
3. Perform Feature Engineering
4. Divide Data into Train/Validation/Test Datasets
5. Train Different Models on Training Dataset
6. Evaluate Trained Models on ‘Validation’ Dataset
7. Pick the best performing model
8. Test the chosen model on ‘Test’ Dataset
70 – 80% of time is spent on the earlier steps; only 20 – 30% of time is spent on the later steps
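The split, train, evaluate, pick, and test steps can be sketched as follows. This is only an illustration under stated assumptions: the data is synthetic (generated from a hypothetical rule y = 3x + 2 plus noise), the 60/20/20 split ratio is arbitrary, and the two candidate "models" (a constant predictor and a fitted line) are stand-ins for whatever algorithms a real project would compare.

```python
import random

def mse(model, rows):
    """Mean squared error of a model (a function x -> prediction) on (x, y) rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)

def train_constant(rows):
    """Candidate 1: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y

def train_line(rows):
    """Candidate 2: least-squares straight line."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Synthetic data (assumption): y = 3x + 2 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
data = [(x, 3 * x + 2 + rng.gauss(0, 1)) for x in xs]
rng.shuffle(data)

# Divide data into train / validation / test datasets (60 / 20 / 20).
train, validation, test = data[:60], data[60:80], data[80:]

# Train different models on the training dataset.
candidates = {"constant": train_constant, "line": train_line}
trained = {name: fit(train) for name, fit in candidates.items()}

# Evaluate on the validation dataset and pick the best performing model.
val_error = {name: mse(model, validation) for name, model in trained.items()}
best_name = min(val_error, key=val_error.get)

# Test the chosen model once on the held-out test dataset.
test_error = mse(trained[best_name], test)
print(f"best model: {best_name}, test MSE: {test_error:.2f}")
```

Keeping the test dataset untouched until the very end is the design choice that matters here: the validation dataset is used for choosing between models, so only the test dataset gives an unbiased estimate of how the chosen model behaves on new data.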
Machine Learning – Challenges
1. Lack of data or lack of labeled/tagged data for supervised learning
   a. Many modern Machine Learning algorithms are very data hungry
   b. Some organizations do have the required data but it is not labeled
   c. Ways to address the challenge:
      • Gather more data!
      • Data synthesis
2. Poor Data Quality:
   a. A Machine Learning model is only as good as the data!
   b. Typical reasons for an organization to have poor data quality are:
      • Manual data entry
      • Lack of consistent data dictionaries
      • Inconsistent entries by different users
      • System integration issues
   c. Ways to improve data quality:
      • Automation
      • Good data architecture
      • Implementation of good data governance principles
Machine Learning – Challenges (Contd.)
3. Poor Performance of Machine Learning Algorithms:
   a. Bias vs. Variance tradeoff, a.k.a. Underfitting vs. Overfitting tradeoff
   b. Stale Models
      i. Models, when left alone, get stale very quickly
      ii. They need to be recalibrated and retrained on a regular basis
      iii. Having a feedback loop from production data is a best practice
   c. Unbalanced datasets
      i. For example, in Fraud Detection
      ii. Accuracy can be very high even for a useless model!
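The point about unbalanced datasets can be checked with a few lines of arithmetic. In this sketch the numbers are hypothetical (1,000 transactions, 10 of them fraudulent), and the "model" simply predicts "not fraud" every time. Recall, the share of actual fraud cases that were caught, is one common metric that exposes the problem; it is brought in here as an illustration and is not mentioned on the slide.

```python
# Hypothetical data: 1,000 transactions, 10 of them fraudulent (label 1).
labels = [1] * 10 + [0] * 990

# A useless "model" that always predicts "not fraud" (label 0).
predictions = [0] * len(labels)

# Accuracy: share of all predictions that are correct.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall: share of the actual fraud cases that the model caught.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
actual_positives = sum(labels)
recall = true_positives / actual_positives

print(f"accuracy = {accuracy:.1%}, recall = {recall:.1%}")
```

The model is right 99% of the time yet catches none of the fraud, which is why accuracy alone is a poor yardstick when one class is rare.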
Challenge: Overfitting and Underfitting
Underfitting – The model does not capture the structure of the data. It is too simple.
Overfitting – The model tries too hard to fit all the outliers and errors in the training data and does not do well with new data. A typical example is a high-order polynomial model.
Challenge: How to Overcome Underfitting and Overfitting
Underfitting
- Increase model complexity by adding more features, adding higher order terms or interaction features
- Try different Machine Learning algorithms
Overfitting
- Make the model simpler by:
   - Reducing the number of features
   - Penalizing a complex model by using a mathematical technique called Regularization
- Train on a larger dataset
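To give a feel for what Regularization does, here is a minimal sketch of one common variant, ridge (L2) regularization, for the simplest possible model: a single weight w with no intercept. Minimizing sum((y - w*x)^2) + lam * w^2 has the closed-form solution w = sum(x*y) / (sum(x*x) + lam). The data points, the penalty values, and the function name `ridge_weight` are assumptions chosen for this illustration.

```python
def ridge_weight(xs, ys, lam):
    """Weight w that minimizes sum((y - w*x)**2) + lam * w**2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Hypothetical data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# lam = 0 is plain least squares; larger lam penalizes large weights more.
for lam in (0.0, 1.0, 10.0, 100.0):
    print(f"lam = {lam:>5}: w = {ridge_weight(xs, ys, lam):.3f}")
```

As the penalty grows, the weight shrinks toward zero, which is the sense in which regularization makes a model "simpler". In practice the penalty strength is itself tuned, typically on a validation dataset.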
Machine Learning – Challenges (Contd.)
4. Production Deployment and Maintenance:
   a. The industry is still immature in this area. Many organizations have ML models running on individual laptops!
   b. Without an effective deployment solution, it is hard to:
      i. Make the model available to the larger organization and embed it in business applications
      ii. Determine why a model works well on training data but not on production data
      iii. Maintain different versions of the model and do A/B testing
      iv. Gain the confidence of the business when the results cannot be interpreted by them
Machine Learning Applications in the Public Sector
1. Fraud Detection:
   • Payments
   • Insider threat detection
   • DMV applications
2. Deployment of Resources:
   • Optimal deployment of police and traffic cops
   • Readiness for adverse events like earthquakes and forest fires
3. Education:
   • Automatic computer grading of student papers and answer sheets
   • Detecting cheating in tests and plagiarism
Machine Learning Applications in the Public Sector (Contd.)
4. Process Automation:
   • Chat bots to answer citizen queries
   • Deciphering the sentiment and mood of citizens calling in to government agency call centers
   • Matching job descriptions with resumes
   • Intelligent automation of things like event registration, sending communication emails, etc.
5. Predictive Analytics:
   • Predict and reduce reincarceration rates
   • Predict and reduce hospital readmission rates
   • Detect system intrusion into IT systems and hacking activity
6. Predictive Maintenance:
   • Predictive maintenance of heavy equipment
Questions?
Thank You!