San Francisco Hacker News - Machine Learning for Hackers

Upload: adam-gibson

Post on 06-May-2015

DESCRIPTION

This was for the San Francisco Hacker News meetup in February at Engine Yard. It was intended as a basic intro to machine learning for people who want to step into the field. Video coming shortly.

TRANSCRIPT

Page 1: San Francisco Hacker News - Machine Learning for Hackers

Machine Learning for Hackers

Machine learning is how we make sense of big data.

Adam Gibson

2-27-2014 SFHN

Page 2: San Francisco Hacker News - Machine Learning for Hackers

BIG DATA & STATISTICS

• Statistics – Group by, aggregate, count, average, mean, p-values, mode, correlations, exploring, < 100 variables

• Machine Learning – Label this image, predict the next event, pick out the anomalies; in other words, learn from data rather than count it, group data by similarities, > 100 variables.
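The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `daily_requests` numbers are invented, and the anomaly check is a simple "far from the mean" rule standing in for the much richer methods the talk has in mind.

```python
from statistics import mean, mode, stdev

# Hypothetical daily request counts; the values are made up for illustration.
daily_requests = [102, 98, 101, 97, 103, 99, 100, 100, 250, 100]

# Statistics: describe the data by counting and aggregating it.
summary = {
    "count": len(daily_requests),
    "mean": mean(daily_requests),
    "mode": mode(daily_requests),
}


def find_anomalies(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean.

    Assumption: a plain z-score rule, chosen only because it is short.
    """
    center = mean(values)
    spread = stdev(values)
    return [v for v in values if abs(v - center) > threshold * spread]


# "Pick out the anomalies": let the data decide what counts as unusual.
anomalies = find_anomalies(daily_requests)
```

The first half answers "what does the data look like?", while the second half answers "which points do not belong?", which is closer to the kind of question the Machine Learning bullet describes.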

Page 3: San Francisco Hacker News - Machine Learning for Hackers

What is data?!

Page 4: San Francisco Hacker News - Machine Learning for Hackers

Unstructured

Text

Video

Images

Time Series

Structured

SQL

XML

JSON

CSV

Many kinds of data. Wow.

Data Scientists: We know this, and just process it.

Page 5: San Francisco Hacker News - Machine Learning for Hackers

WHAT do machines learn?

• Machine learning is a general tool that can work with various data types.

• Images = Machine vision

• Text = Natural-language processing

• Time-series = Prediction

• Facial recognition => Security

• Text => Customer profiles/Recommendation engines

• Time-series => Stock-market trading platforms

• NLP => Customer service
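Whatever the data type, it has to become numbers before a machine can learn from it. As a minimal sketch for the text case, assuming plain word counts (often called a bag of words) and an invented customer-service message, the conversion might look like this. The function name `bag_of_words` is hypothetical and not taken from the talk.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into words, and count each word."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Hypothetical customer-service message, invented for this example.
message = "My order is late. Where is my order?"
features = bag_of_words(message)
```

Each distinct word becomes one variable, which is one reason text problems quickly pass the "> 100 variables" mark mentioned earlier in the deck.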

Page 6: San Francisco Hacker News - Machine Learning for Hackers

WHAT IS A DATA SCIENTIST?

Analyst: Exploratory analysis of data, typically on smaller data sets. Understands the algorithms and interprets data.

Distributed Systems Engineer: Implements production data crunching, also known as the NoSQL person. They handle distributed systems and workloads, APIs, perhaps even data collection and storage.

Page 7: San Francisco Hacker News - Machine Learning for Hackers

What kinds of machine learning are there?

Unsupervised – Clustering (group things that are similar), regression (correlation != causation, ring a bell?)

Supervised – Label all the things! Predict the future!
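To make the two families concrete, here is a small sketch using only the Python standard library. The animal measurements are invented, and both algorithms (1-nearest neighbor for the supervised case, a tiny two-cluster k-means for the unsupervised case) are illustrative choices, not methods named in the talk.

```python
import math

# Hypothetical labeled examples: (height_cm, weight_kg) -> label.
labeled = [
    ((30.0, 4.0), "cat"),
    ((28.0, 5.0), "cat"),
    ((60.0, 25.0), "dog"),
    ((65.0, 30.0), "dog"),
]


def predict(point):
    """Supervised: copy the label of the closest labeled example (1-nearest neighbor)."""
    _, label = min(labeled, key=lambda item: math.dist(item[0], point))
    return label


def cluster_two_groups(values, rounds=10):
    """Unsupervised: split 1-D values into two groups with a tiny k-means (k=2)."""
    low, high = min(values), max(values)  # initial cluster centers
    if low == high:
        raise ValueError("need at least two distinct values to form two groups")
    for _ in range(rounds):
        # Assign every value to the nearer center, then move each center
        # to the average of its group.
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high
```

The supervised function needs the labels in `labeled` to work at all, while the clustering function is handed bare numbers and has to find the groups on its own. For example, `predict((29.0, 4.5))` looks up the nearest labeled animal, and `cluster_two_groups([1.0, 2.0, 1.5, 10.0, 11.0, 10.5])` separates the small values from the large ones without being told what either group means.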

Page 8: San Francisco Hacker News - Machine Learning for Hackers

How does this affect me?

Page 9: San Francisco Hacker News - Machine Learning for Hackers

Ad Targeting

Recommends you movies

Brings you search results

Recognizes your face in the camera

Drives your car

Automatically disables your credit card when you leave the country

I will leave who does this to your imagination

Page 10: San Francisco Hacker News - Machine Learning for Hackers

Can I do this?

The shortcut here is to start with the basics, for example Google Analytics and understanding churn rate.

Pick up a more advanced understanding after that if it still seems interesting.

If you are into backends, start with distributed systems, and get your math basics up enough to understand what the guy on the other side of the table who's asking you to put the algorithm into production is saying.
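Churn rate is a good example of how small the basics can be. The slide does not define it, so the sketch below assumes one common definition: customers lost during a period divided by customers at the start of that period. The function name and the numbers are hypothetical.

```python
def churn_rate(customers_at_start, customers_lost):
    """Fraction of the starting customers who left during the period.

    Assumption: churn = customers lost / customers at start of period.
    """
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


# Hypothetical month: 200 customers at the start, 10 of them cancelled.
monthly_churn = churn_rate(customers_at_start=200, customers_lost=10)
print(f"Monthly churn: {monthly_churn:.1%}")
```

Other definitions exist (for instance, dividing by the average customer count over the period), so it is worth checking which one a given analytics tool reports.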

Page 11: San Francisco Hacker News - Machine Learning for Hackers

Resources

Coursera Machine Learning

Reddit Machine Learning

DataTau (Hacker News for data scientists)

More mathy: Stanford Machine Learning

Page 13: San Francisco Hacker News - Machine Learning for Hackers

Analysts

http://scikit-learn.org/stable/

Julia Lang

R Lang

Data Engineers

Spark

Hadoop

Storm

Hadoop QuickStart VM