A few Challenges to Make Machine Learning Easy

June 3rd, 2013 BigML Inc, 2013 Challenges to Make Machine Learning Easy ACM San Francisco Bay Area Professional Chapter Francisco J Martin, Ph.D. BigML Co-founder & CEO eBay Whitman Campus


DESCRIPTION

Dr. Francisco J Martin: In the age of data, Machine Learning is the key component to make data-driven decisions, develop smart applications, and build predictive analytics. However, Machine Learning is complex. The current tools are complicated and do not scale well. Most solutions are costly, easily involving hundreds of thousands of dollars and substantial resources. Additionally, experts with industry experience are very scarce. BigML is building a scalable, cloud-based service that makes Machine Learning easy or, at least, lowers the barriers that most developers and business folks face to learn from data. In this talk, I will first demo BigML and then describe the efforts, highlight some of the key findings, and discuss some of the challenges from a technical, user, and business perspective, related to developing a Machine Learning service for the masses.

TRANSCRIPT

Page 1: A few Challenges to Make Machine Learning Easy

June 3rd, 2013

BigML Inc, 2013

Challenges to Make Machine Learning Easy

ACM San Francisco Bay Area Professional Chapter

Francisco J Martin, Ph.D.

BigML Co-founder & CEO

eBay Whitman Campus

Page 2: A few Challenges to Make Machine Learning Easy

Sampling the Audience

Expert: Published papers at KDD, ICML, NIPS, etc. or developed own ML algorithms used at large scale.

Aficionado: Understands pros/cons of different techniques and/or can tweak algorithms as needed.

Practitioner: Very familiar with ML packages (Weka, Scikit, R, etc.).

Newbie: Just taking Coursera ML class or reading an introductory book to ML.

Absolute beginner: ML sounds like science fiction.

Page 3: A few Challenges to Make Machine Learning Easy

Why make ML easy?

Data, data everywhere: A special report on managing information

In the age of data, Machine Learning is the key component to:

‣ make data-driven decisions
‣ develop smart applications
‣ build predictive analytics

Page 4: A few Challenges to Make Machine Learning Easy

Why make ML easy?

However, Machine Learning is COMPLEX:

‣ tools are complicated and do not scale well
‣ solutions are costly
‣ experts with industry experience are scarce

http://ttic.uchicago.edu/~samory/

Page 5: A few Challenges to Make Machine Learning Easy


Why make ML easy?

Page 6: A few Challenges to Make Machine Learning Easy


Why make ML easy?

Page 7: A few Challenges to Make Machine Learning Easy


BigML: a cloud-based service that makes Machine Learning SIMPLE

$ bigmler --train customer2012.csv \
          --test new_customers.csv \
          --objective churn

>>> from bigml.api import BigML
>>> api = BigML()
>>> source = api.create_source("s3://bigml-public/csv/sales.csv")
>>> dataset = api.create_dataset(source)
>>> model = api.create_model(dataset)

$ curl https://bigml.io/model?$BIGML_AUTH \
       -X POST \
       -H "content-type: application/json" \
       -d '{"dataset": "dataset/50ca447b3b56356ae0000029"}'
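The curl call above is a plain HTTP POST with a JSON body. As a rough sketch of what it sends, the following builds the equivalent request in Python without sending it. The endpoint pattern and the JSON body come from the slide. The `build_create_request` helper is hypothetical, and the `username=...;api_key=...` shape of the auth string is an assumption about how `$BIGML_AUTH` is set; the values are placeholders.

```python
import json
from urllib.request import Request

BIGML_URL = "https://bigml.io"  # base URL taken from the curl example


def build_create_request(resource_type, payload, auth):
    """Build (but do not send) a POST request that creates a resource."""
    url = f"{BIGML_URL}/{resource_type}?{auth}"
    body = json.dumps(payload).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


# Placeholder credentials in an assumed format, not real values.
auth = "username=alice;api_key=0000"
request = build_create_request(
    "model", {"dataset": "dataset/50ca447b3b56356ae0000029"}, auth)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing `"dataset"` with a `{"source": ...}` payload instead would mirror a call that creates a dataset from a source, so one helper covers each step of the source, dataset, model chain.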

Page 8: A few Challenges to Make Machine Learning Easy

Agenda

BigML web-based interface (10-15 min)

BigML API, API Bindings, BigMLer (5 min)

$ bigmler --train customer2012.csv \
          --test new_customers.csv \
          --objective churn

>>> from bigml.api import BigML
>>> api = BigML()
>>> source = api.create_source("s3://bigml-public/csv/iris.csv")
>>> dataset = api.create_dataset(source)
>>> model = api.create_model(dataset)

$ curl https://bigml.io/dataset?$BIGML_AUTH \
       -X POST \
       -H "content-type: application/json" \
       -d '{"source": "source/50ca447b3b56356ae0000029"}'

Challenges (10-15 min)

#1 Machine Learning Breadth and Depth
#2 User Diversity
#3 Simplicity
#4 Scalability
#5 Measuring Impact
#6 Pricing

Questions (10-15 min)

Page 9: A few Challenges to Make Machine Learning Easy


How it works

Page 10: A few Challenges to Make Machine Learning Easy

BigML Resources

Sources

‣ local and remote
‣ csv, arff, xls
‣ https, s3, azure, odata

Datasets

‣ Stream histograms
‣ Statistics

Models

‣ Interactive
‣ Compoundable
‣ Random Decision Forests
‣ Actionable: exportable to rules, code, pmml

Predictions

‣ Form-based Predictions
‣ Question by Question
‣ Local predictions

Evaluations

‣ Classification
‣ Regression
‣ Comparison
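The Datasets entry mentions stream histograms. The slide does not say how they are computed, so the following is only a minimal sketch of one well-known approach: a fixed-budget histogram that merges its two closest bins whenever the budget is exceeded, in the style of Ben-Haim and Tom-Tov. It is not presented as BigML's implementation. The class name `StreamHistogram` and the `max_bins` parameter are hypothetical.

```python
import bisect


class StreamHistogram:
    """Summarize a stream of numbers with at most `max_bins` [center, count] bins."""

    def __init__(self, max_bins=4):
        self.max_bins = max_bins
        self.bins = []  # kept sorted by center

    def insert(self, value):
        # Add the value as its own bin, keeping bins sorted by center.
        bisect.insort(self.bins, [float(value), 1])
        if len(self.bins) > self.max_bins:
            self._merge_closest()

    def _merge_closest(self):
        # Find the adjacent pair of bins whose centers are closest.
        i = min(range(len(self.bins) - 1),
                key=lambda k: self.bins[k + 1][0] - self.bins[k][0])
        (c1, n1), (c2, n2) = self.bins[i], self.bins[i + 1]
        # Replace the pair with one bin at the count-weighted mean.
        self.bins[i:i + 2] = [[(c1 * n1 + c2 * n2) / (n1 + n2), n1 + n2]]

    def total(self):
        return sum(count for _, count in self.bins)

    def mean(self):
        return sum(c * n for c, n in self.bins) / self.total()


hist = StreamHistogram(max_bins=4)
for x in [1, 2, 3, 10, 11, 12, 50, 51]:
    hist.insert(x)
print(hist.bins)
print(hist.total(), hist.mean())
```

The point of the design is that memory stays bounded by `max_bins` no matter how many values arrive, while the total count and the mean are preserved exactly by the weighted merge.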

Page 11: A few Challenges to Make Machine Learning Easy


BigML API

Page 12: A few Challenges to Make Machine Learning Easy


3,500+ users

35,000+ models

BigML

Page 13: A few Challenges to Make Machine Learning Easy


FREE subscription?

mail your username to: [email protected]

Page 14: A few Challenges to Make Machine Learning Easy

Challenges

#1 Machine Learning breadth and depth
#2 User Diversity
#3 Simplicity
#4 Scalability
#5 Measuring Machine Learning Impact
#6 Pricing

Page 15: A few Challenges to Make Machine Learning Easy

#1 machine learning breadth and depth

#1 Supervised learning
#2 Unsupervised learning
#3 Semi-supervised learning
#4 Reinforcement learning
#5 Learning to Learn

...or you can deal with that!

Page 16: A few Challenges to Make Machine Learning Easy

#1 machine learning breadth and depth

...or you can deal with that!

Page 17: A few Challenges to Make Machine Learning Easy

The stages of an ML application

Phrase a problem as an ML task

Data Wrangling

Feature Engineering

Learn from Data

Pre-evaluate

Measure Impact
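The stages listed above can be sketched end to end on a toy churn problem. This is only an illustration under stated assumptions: the records, the `calls` and `months` fields, the single derived feature, and the threshold learner are all hypothetical and are not taken from the talk or from BigML.

```python
# Phrase the problem as an ML task: predict churn (True/False) from usage data.
raw = [
    {"calls": 40, "months": 10, "churn": False},
    {"calls": 5, "months": 10, "churn": True},
    {"calls": 30, "months": 6, "churn": False},
    {"calls": None, "months": 8, "churn": True},  # incomplete record
    {"calls": 8, "months": 8, "churn": True},
    {"calls": 24, "months": 4, "churn": False},
    {"calls": 3, "months": 6, "churn": True},
    {"calls": 60, "months": 12, "churn": False},
]


def wrangle(rows):
    """Data wrangling: drop records with missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def engineer(rows):
    """Feature engineering: derive calls per month as the single feature."""
    return [(r["calls"] / r["months"], r["churn"]) for r in rows]


def learn(examples):
    """Learn from data: pick the threshold with the fewest training errors."""
    def errors(t):
        return sum((x < t) != y for x, y in examples)
    return min(sorted(x for x, _ in examples), key=errors)


def pre_evaluate(threshold, examples):
    """Pre-evaluate: accuracy of 'churn if feature < threshold' on held-out data."""
    return sum((x < threshold) == y for x, y in examples) / len(examples)


examples = engineer(wrangle(raw))
train, test = examples[:5], examples[5:]
threshold = learn(train)
accuracy = pre_evaluate(threshold, test)
print(f"churn if calls/month < {threshold}; held-out accuracy = {accuracy:.2f}")
# Measuring impact happens after deployment and is not captured by this score.
```

Only one of the five functions-or-steps here is the learning step, which is the slide's point: most of the work sits before and after it.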

Page 18: A few Challenges to Make Machine Learning Easy

#1 machine learning breadth and depth

Problems

Classification, Regression, Clustering, Density Estimation, Manifold learning, Active learning, etc.

Techniques

Applications

churn prevention, date matching, decision making, diagnostics, fraud detection, detecting tumors, detecting investment opportunities, human body pose estimation, pedestrian tracking, predictive analytics, recommendation systems, risk analysis, spam detection, etc.

Just solving a couple of problems and using a few techniques, thousands of applications can be developed.

Page 19: A few Challenges to Make Machine Learning Easy


Understanding the past

Predicting the future

Why Trees first?

Page 20: A few Challenges to Make Machine Learning Easy


Why Trees?

Page 21: A few Challenges to Make Machine Learning Easy


A Machine Learning application requires more tasks than just learning from data, and some of them are even more important.

Solving just one more problem will enable a huge number of additional applications.

What problem(s) to tackle next and which techniques to use?

#1 machine learning breadth and depth

Page 22: A few Challenges to Make Machine Learning Easy

#2 user diversity

Experts

Aficionados

Practitioners

Newbies

Absolute beginners

How to prioritize what to build next? More features for the expert, or more simplification for the newbies?

Page 23: A few Challenges to Make Machine Learning Easy

#2 user diversity

[Chart: Time-to-productivity versus Expertise]

Page 24: A few Challenges to Make Machine Learning Easy

#2 user diversity

[Figure: two scales running from MBs to PBs, labeled "Size" and "Actual size"]

Most users believe their data is much bigger than it really is

Page 25: A few Challenges to Make Machine Learning Easy

#2 user diversity

[Chart: Number of Jobs versus Size of Job]

Page 26: A few Challenges to Make Machine Learning Easy


#3 simplicity

Page 27: A few Challenges to Make Machine Learning Easy

“Any fool can make something complicated. It takes a genius to make it simple.”

― Woody Guthrie

#3 simplicity

Page 28: A few Challenges to Make Machine Learning Easy

#3 simplicity

Simple means much more than an easy-to-use interface:

‣ install
‣ configure
‣ use
‣ train
‣ understand
‣ test
‣ pre-evaluate
‣ measure impact
‣ deploy
‣ scale
‣ access programmatically (API)

Page 29: A few Challenges to Make Machine Learning Easy

#4 scalability

1 JOB from 1 USER

N CONCURRENT JOBS from 1 CUSTOMER

N JOBS from M CUSTOMERS

Page 30: A few Challenges to Make Machine Learning Easy


Infrastructure

Page 31: A few Challenges to Make Machine Learning Easy


#5 measuring machine learning impact

Page 32: A few Challenges to Make Machine Learning Easy

#5 measuring machine learning impact

Measuring “actual” impact is complex and goes beyond traditional performance evaluation.

Imagine that an algorithm predicts that user Alice is going to buy a Magic Potion.

‣ But Magic Potions are out of stock.

‣ Should we blame
  ‣ the algorithm for the “false positive” prediction?
  ‣ the data scientist for not including that feature?
  ‣ operations for running out of stock on things that customers want to buy?

Page 33: A few Challenges to Make Machine Learning Easy


Kiri Wagstaff, Machine Learning that Matters, ICML, 2012

The stages of an ML research program

Very inspirational!!!

Page 34: A few Challenges to Make Machine Learning Easy

The stages of an ML application

Phrase a problem as an ML task

Data Wrangling

Feature Engineering

Learn from Data

Pre-evaluate

Measure Impact !!!!!

Page 35: A few Challenges to Make Machine Learning Easy


#6 pricing

Page 36: A few Challenges to Make Machine Learning Easy


#6 pricing

Page 37: A few Challenges to Make Machine Learning Easy


Pre-pay-as-you-go

Page 38: A few Challenges to Make Machine Learning Easy


Subscriptions

Page 39: A few Challenges to Make Machine Learning Easy

Machine Learning made easy?

BigML 1-click model

You can deal with this...

...or you can deal with that!

Page 40: A few Challenges to Make Machine Learning Easy


BigML 1-click model

You can deal with this...

...or you can deal with that!

Machine Learning made easy?

Page 41: A few Challenges to Make Machine Learning Easy

Machine Learning made easy?

[Chart: Ease-of-use over time, 2013]

Page 42: A few Challenges to Make Machine Learning Easy

Machine Learning made Easy!!!

[Chart: Ease-of-use over time, 2013 to 2018]

Page 43: A few Challenges to Make Machine Learning Easy


Questions

Page 44: A few Challenges to Make Machine Learning Easy

Learning from Data

Unknown Model f : X -> Y

Example: ideal credit approval formula

Training Examples (x1, l1), (x2, l2), ..., (xN, lN)

Example: historical records of credit customers

[Table: rows x1 ... xN; columns f1, f2, ..., fn, label]

Models M

Example: set of candidate credit approval formulas

Learning Algorithm

Final Model g ~ f

Example: learned credit approval formula

Based on Learning from Data by Y. Abu-Mostafa, M. Magdon-Ismail and H. Lin
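The components of this diagram map onto a few lines of code. A minimal sketch, assuming a toy credit-approval setting: the hidden target `f`, the salary figures, the candidate set `M` of salary thresholds, and the error-minimizing `learn` function are all hypothetical illustrations and are not taken from the book or from BigML.

```python
def f(salary):
    """Unknown model f : X -> Y (hidden from the learner in practice)."""
    return salary >= 42_000


# Training examples (x1, l1), ..., (xN, lN): historical customer records.
salaries = [20_000, 35_000, 41_000, 43_000, 50_000, 80_000]
training = [(x, f(x)) for x in salaries]

# Models M: candidate approval formulas, one per salary threshold.
M = {t: (lambda x, t=t: x >= t) for t in range(10_000, 100_000, 10_000)}


def learn(models, examples):
    """Learning algorithm: return the candidate with the fewest training errors."""
    def errors(name):
        return sum(models[name](x) != label for x, label in examples)
    best = min(models, key=errors)
    return best, models[best]


threshold, g = learn(M, training)  # final model g ~ f
print("learned threshold:", threshold)
print("g agrees with f on 45,000:", g(45_000) == f(45_000))
print("g agrees with f on 41,000:", g(41_000) == f(41_000))
```

Because no candidate in `M` matches the hidden threshold exactly, the learned `g` only approximates `f`: it agrees on most inputs but can disagree near the boundary, which is what the `g ~ f` notation on the slide expresses.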