machine learning using cloud services

MACHINE LEARNING USING CLOUD SERVICES

Max Pagels, Data Science Specialist [email protected], @maxpagels

12.6.2016

A general overview


WHAT IS MACHINE LEARNING?

“… FIELD OF STUDY THAT GIVES COMPUTERS THE ABILITY TO LEARN WITHOUT BEING EXPLICITLY PROGRAMMED” - Arthur Lee Samuel, 1959

ALGORITHMS THAT LEARN FROM DATA IN ORDER TO FIND STRUCTURE, PROVIDE INSIGHTS, CLASSIFY AND PREDICT

DATA > COMPUTER LEARNS A MODEL > MODEL USED TO SOLVE TASK
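The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data points are made up, `fit_line` and `predict` are hypothetical helper names, and the "model" is nothing more than the slope and intercept of a least-squares line.

```python
from statistics import mean

def fit_line(xs, ys):
    """Learn a model (slope, intercept) from data via ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def predict(model, x):
    """Use the learned model to solve the task: estimate y for a new x."""
    slope, intercept = model
    return slope * x + intercept

# 1. DATA (made-up points that lie on y = 2x + 1)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# 2. COMPUTER LEARNS A MODEL
model = fit_line(xs, ys)

# 3. MODEL USED TO SOLVE TASK
prediction = predict(model, 5.0)
print(model, prediction)
```

Real projects would normally reach for a library rather than hand-written formulas, but the three stages stay the same.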

SUPERVISED MACHINE LEARNING

UNSUPERVISED MACHINE LEARNING

REINFORCEMENT LEARNING

MACHINE LEARNING, IN A NUTSHELL

1. FETCH & PREPARE DATA

2. TRAIN MODELS

3. DEPLOY MODELS

HOW CAN CLOUD SERVICES HELP?

1. FETCHING & PREPARING DATA

90% of all time is spent on getting & cleaning data

TYPICAL PROBLEMS

• Data is stored in multiple DBs

• Data access is behind multiple systems

• Data is missing

• Data is in the incorrect format

• Data is only available in aggregated form

• Running queries takes a long time
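Two of the problems listed above, missing data and data in the incorrect format, can be illustrated with a small cleaning step. This is a minimal sketch, not a recipe from the talk: the rows, the field names (`customer`, `signup`, `spend`) and the choice of median imputation are all assumptions made for the example.

```python
from datetime import datetime
from statistics import median

# Made-up raw rows: mixed date formats and one missing value.
RAW_ROWS = [
    {"customer": "a", "signup": "2016-06-12", "spend": "120.5"},
    {"customer": "b", "signup": "12.6.2016", "spend": ""},
    {"customer": "c", "signup": "2016-05-01", "spend": "80"},
]

def parse_date(text):
    """Try each known date format; return None if none of them match."""
    for fmt in ("%Y-%m-%d", "%d.%m.%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None

def clean(rows):
    """Normalise dates and numbers, then fill missing spend with the median."""
    parsed = []
    for row in rows:
        spend = float(row["spend"]) if row["spend"].strip() else None
        parsed.append({
            "customer": row["customer"],
            "signup": parse_date(row["signup"]),
            "spend": spend,
        })
    # Assumes at least one row has a known spend value.
    fill = median(r["spend"] for r in parsed if r["spend"] is not None)
    for r in parsed:
        if r["spend"] is None:
            r["spend"] = fill
    return parsed

cleaned = clean(RAW_ROWS)
for row in cleaned:
    print(row)
```

Whether to impute, drop, or flag missing values depends on the problem at hand; the median is used here only because it is simple.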

SOLUTION: CLOUD DATA WAREHOUSING

• Data is stored in one logical, petabyte-scale DB

• Centralised user access management

• Usually much cheaper to run than in-house solutions

• Can save (and query) raw data

• Querying is typically much faster

2. TRAINING MODELS

Test, validate, rinse & repeat
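One way to read "test, validate, rinse & repeat" is as a loop over candidate model settings, each scored on held-out validation data, with a final check on a separate test set. The sketch below assumes a toy one-dimensional k-nearest-neighbour classifier and made-up data; the talk itself does not prescribe a particular model or procedure.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, held_out, k):
    """Share of held-out points whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in held_out)
    return hits / len(held_out)

# Made-up data; (2.3, "high") is a deliberately noisy training label.
train = [(1.0, "low"), (1.5, "low"), (2.0, "low"), (2.3, "high"),
         (6.0, "high"), (6.5, "high"), (7.0, "high")]
validation = [(1.2, "low"), (2.2, "low"), (6.2, "high")]
test = [(1.8, "low"), (6.8, "high")]

# Validate: score each candidate setting on data not used for training.
scores = {k: accuracy(train, validation, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)

# Test: report performance of the chosen setting on untouched data.
test_accuracy = accuracy(train, test, best_k)
print(scores, best_k, test_accuracy)
```

With larger models and datasets, each pass through this loop can be expensive, which is where the compute options on the following slides come in.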

PROBLEM

Depending on the type of classifier and the problem at hand, training a model can take ages on a normal laptop/desktop computer.

SOLUTION: GPU(S)

SOLUTION: CLOUD-BASED COMPUTATION

Example: on AWS EC2, a p2.8xlarge instance has:

• 32 vCPUs

• 488 GiB RAM

• 8 NVIDIA K80 GPUs, each with 2,496 parallel processing cores and 12 GiB of GPU memory

Cost of buying one K80 yourself: $5,000

Cost of buying the equivalent hardware yourself: $50,000

Cost of running the instance in AWS: about $8 per hour
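The figures above imply a break-even point that is easy to work out. The short calculation below uses only the two numbers from the slide; it deliberately ignores electricity, maintenance, depreciation and price changes, so treat it as a rough comparison rather than a full cost model.

```python
# Figures taken from the slide above.
PURCHASE_COST_USD = 50_000       # equivalent hardware bought outright
CLOUD_RATE_USD_PER_HOUR = 8      # approximate hourly price of the instance

# Hours of cloud usage that cost as much as buying the hardware.
break_even_hours = PURCHASE_COST_USD / CLOUD_RATE_USD_PER_HOUR
break_even_days = break_even_hours / 24

print(f"{break_even_hours:.0f} hours, about {break_even_days:.0f} days of non-stop use")
```

That is 6,250 instance-hours, or roughly 260 days of continuous training, before buying would come out cheaper on these numbers alone.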

3. DEPLOYING MODELS

Putting your machine learning models to good use

DEPLOYING MODELS

• ML models can take a long time to train, but the models themselves usually don’t take much (disk/RAM) space

• Getting a prediction/result from an ML model typically doesn’t take that much time, either (milliseconds)

• Building a REST API on top of your model allows other services to get predictions on demand

• Use functions-as-a-service as your first choice

SERVERLESS + API GATEWAY = QUICK PREDICTION REST API
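A prediction endpoint of this kind can be very small. The sketch below assumes AWS Lambda behind API Gateway with proxy integration, where the request body arrives as a JSON string in `event["body"]` and the function returns a dict with `statusCode` and `body`. The `MODEL` dict and the `predict` helper are hypothetical stand-ins for a real trained model, which would typically be loaded once when the function starts.

```python
import json

# Hypothetical pre-trained model: the parameters of a tiny linear model.
# A real deployment would load a serialised model here, outside the handler,
# so that it is reused across invocations.
MODEL = {"slope": 2.0, "intercept": 1.0}

def predict(features):
    """Return a prediction for a request like {"x": 1.5}."""
    return MODEL["slope"] * features["x"] + MODEL["intercept"]

def handler(event, context):
    """Entry point invoked by the functions-as-a-service platform."""
    try:
        features = json.loads(event.get("body") or "{}")
        prediction = predict(features)
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing "x", or a non-numeric value.
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "expected a JSON body like {\"x\": 1.5}"}),
        }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"prediction": prediction}),
    }

# Local usage example: simulate the event the gateway would pass in.
response = handler({"body": json.dumps({"x": 5})}, None)
print(response["statusCode"], response["body"])
```

Because the model is small and a prediction takes milliseconds, a function like this fits comfortably within typical function-as-a-service limits.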

DON’T REINVENT THE WHEEL

Pre-made models and services may work well

SOME READY-MADE AI/ML SERVICES

IBM WATSON

• Natural language processing

• Language translation

• Sentiment analysis

• Speech-to-text

• Text-to-speech

• Personality insights

AWS AI

• AWS ML: linear/logistic regression (classification & real-number prediction)

• Amazon Lex: conversational interfaces

• Amazon Rekognition: object detection

• Amazon Polly: text-to-speech

LET’S END WITH AN EXAMPLE

THANK YOU

Have a great evening!
