How to get started in Data Science from scratch
Frank D. Evans
A Data Scientist is a data analyst that lives in California
A Data Scientist is
a better mathematician than any of the programmers,
and a better programmer than any of the mathematicians.
OSEMN
Obtain
Scrub
Explore
Model
Interpret
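The five OSEMN steps can be sketched as one small pipeline. This is only an illustration, not something from the slides: the CSV data (hours studied vs. exam score), the function names, and the choice of a least-squares line for the Model step are all assumptions made for the example.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data: hours studied vs. exam score, with one incomplete row.
RAW_CSV = """hours,score
1,52
2,61
3,
4,79
5,88
"""

def obtain(text):
    """Obtain: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def scrub(rows):
    """Scrub: drop rows with missing values and convert strings to numbers."""
    clean = []
    for row in rows:
        if row["hours"] and row["score"]:
            clean.append((float(row["hours"]), float(row["score"])))
    return clean

def explore(data):
    """Explore: summarize the data before modeling."""
    return {
        "rows": len(data),
        "mean_hours": mean([x for x, _ in data]),
        "mean_score": mean([y for _, y in data]),
    }

def model(data):
    """Model: fit a least-squares line, score = slope * hours + intercept."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in data) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar

def interpret(slope, intercept):
    """Interpret: turn the fitted numbers into a plain-language statement."""
    return f"Each extra hour is associated with about {slope:.1f} more points."

data = scrub(obtain(RAW_CSV))
summary = explore(data)
slope, intercept = model(data)
print(summary)
print(interpret(slope, intercept))
```

In practice each step is usually much larger than the one before it suggests; Scrub in particular tends to take more effort than Model.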
TOOLS
Data is huge. People are expensive. Computation is cheap.
Use applied statistics to let the computers program themselves.
Types
Supervised: "I have a set of examples with the right answers, I want to learn a pattern and use it on examples where I don't have the answers."
Unsupervised: "I have data with no answers, but I want to find a pattern that might lead me to an answer."
Reinforcement: "I want to start with what I know now, and be able to learn new things as new data comes along."
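To make the first two types concrete, here is a minimal sketch, assuming made-up one-dimensional data. The supervised half uses a 1-nearest-neighbor lookup and the unsupervised half a bare-bones two-group k-means; both techniques and all names are chosen for illustration and are not taken from the slides. Reinforcement is left out because it needs an environment that supplies new data over time.

```python
# Supervised: every example comes with the right answer (a label).
labeled = [(1.0, "small"), (1.2, "small"), (8.0, "large"), (8.5, "large")]

def predict_nearest(x, examples):
    """Return the label of the labeled example closest to x (1-nearest-neighbor)."""
    return min(examples, key=lambda ex: abs(ex[0] - x))[1]

# Unsupervised: the same kind of values, but with no answers attached.
unlabeled = [1.0, 1.2, 0.9, 8.0, 8.5, 7.9]

def two_means(values, iterations=10):
    """Split 1-D values into two groups with a simple k-means (k = 2)."""
    centers = [min(values), max(values)]  # simple deterministic starting points
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

print(predict_nearest(2.0, labeled))   # uses the known answers
centers, groups = two_means(unlabeled)
print(groups)                          # a pattern found without any answers
```

The supervised function can only return labels it has already seen, while the unsupervised one returns groups that a person still has to name and interpret.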
Regression vs Classification
Supervised Learning
Regression: Use continuous data to make a model that predicts where new data will fit.
Classification: Label data into "buckets", and make predictions on which bucket a new data point will fall into.
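The difference shows up in what the prediction looks like: regression returns a number, classification returns a bucket. The sketch below assumes hypothetical flat sizes, rents, and size labels, and uses a least-squares line and a nearest-centroid rule as stand-ins for whatever models would be used in practice.

```python
from statistics import mean

# Hypothetical data: flat size in square metres, monthly rent, and a size label.
sizes = [30.0, 45.0, 60.0, 80.0]
rents = [600.0, 900.0, 1200.0, 1600.0]              # continuous target: regression
labels = ["studio", "studio", "family", "family"]   # buckets: classification

def fit_line(xs, ys):
    """Regression: fit y = slope * x + intercept by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar

def fit_centroids(xs, ys):
    """Classification: remember the average x of each bucket."""
    buckets = {}
    for x, label in zip(xs, ys):
        buckets.setdefault(label, []).append(x)
    return {label: mean(vals) for label, vals in buckets.items()}

def classify(x, centroids):
    """Assign x to the bucket whose average is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

slope, intercept = fit_line(sizes, rents)
predicted_rent = slope * 50.0 + intercept    # a value on a continuous scale
centroids = fit_centroids(sizes, labels)
predicted_label = classify(50.0, centroids)  # one of the known buckets

print(predicted_rent, predicted_label)
```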
exaptive.com/blog
Frank D. Evans @frankdevans
@exaptive
slideshare.net/frankdevans