![Page 1: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/1.jpg)
Data Science 101
A Love Story
![Page 2: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/2.jpg)
Agenda
• Introduction to Data Science
• Who’s who in Data Science?
• That Data Science Life.
• [Case Study] How Spotify manages their data.
• [VM] The Data Science life at VaynerMedia.
• Conclusions.
![Page 3: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/3.jpg)
“If you can measure it, you can hack it.”
E -> A -> E
![Page 4: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/4.jpg)
We’re generating (and tracking) exponentially more data online than ever before.
![Page 5: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/5.jpg)
Big Data is big.
![Page 6: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/6.jpg)
5,000,000,000 GB/2 Days
![Page 7: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/7.jpg)
We’re always playing catch-up.
![Page 8: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/8.jpg)
“Innovative Solutions” >
“Industry Standards”
![Page 9: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/9.jpg)
Data Scientists are “Innovative Problem Solvers”
![Page 10: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/10.jpg)
I get it. “Big Data” is real, and Data Scientists are
awesome.
![Page 11: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/11.jpg)
But what is a Data Scientist? Who are they, and
how do they work with “Big Data”?
![Page 12: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/12.jpg)
![Page 13: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/13.jpg)
VM
![Page 14: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/14.jpg)
DJ Patil is a huge influencer in this space.
![Page 15: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/15.jpg)
Why is DJ Patil so popular?
![Page 16: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/16.jpg)
LinkedIn and People You May Know
![Page 17: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/17.jpg)
Angel has 2 mutual friends with Vikash.
Tim has 20 mutual friends with Vikash.
If John is friends with Vikash, he might know Tim and his mutual friends.
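A minimal sketch of the mutual-friends idea, not LinkedIn's actual algorithm. The graph, the extra name "Maya", and the function `suggest_connections` are hypothetical and exist only for illustration. It ranks people a user is not yet connected to by how many friends they share.

```python
from collections import Counter

def suggest_connections(friends, user):
    """Rank people `user` is not connected to by number of mutual friends."""
    direct = friends.get(user, set())
    counts = Counter()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            # Skip the user themselves and anyone they already know.
            if candidate != user and candidate not in direct:
                counts[candidate] += 1
    return counts.most_common()

# Hypothetical toy graph (friendships are listed in both directions).
friends = {
    "John": {"Vikash", "Maya"},
    "Vikash": {"John", "Tim", "Angel"},
    "Maya": {"John", "Tim"},
    "Tim": {"Vikash", "Maya"},
    "Angel": {"Vikash"},
}

suggestions = suggest_connections(friends, "John")
print(suggestions)
```

Tim shares two friends with John and Angel shares one, so Tim is suggested first. A production system would weigh many more signals than a raw count.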
![Page 18: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/18.jpg)
This increased platform usage, making the experience on LinkedIn more valuable.
![Page 19: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/19.jpg)
Active Users = selling point for LinkedIn when pitching to Brands.
![Page 20: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/20.jpg)
Gives a leg up to users looking for employment in the informal job market.
![Page 21: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/21.jpg)
Big Data.
Real Business Objective.
Simple Analysis.
Valuable Data-driven Product.
![Page 22: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/22.jpg)
“Patil Effect”
![Page 23: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/23.jpg)
VM analysts do the same thing, we just don’t use the same tools.
![Page 24: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/24.jpg)
![Page 25: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/25.jpg)
10^100
![Page 26: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/26.jpg)
Google started downloading the entire internet in the late 90s-early 00s.
![Page 27: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/27.jpg)
“It’s not you, it’s me.” - Google
![Page 28: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/28.jpg)
Google created a better way to process Big Data. They created MapReduce.
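To make the MapReduce idea concrete, here is a toy, single-machine sketch of a word count in plain Python. It is an illustration of the map, shuffle, and reduce steps only, not Google's implementation; the function names are assumptions chosen for this example, and a real system would spread each step across many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single total."""
    return key, sum(values)

def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data is big", "data science is fun"])
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both steps can run in parallel, which is what makes the pattern useful for Big Data.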
![Page 29: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/29.jpg)
Yahoo! wanted to download the internet too.
![Page 30: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/30.jpg)
They liked MapReduce so much that they backed Hadoop, an open-source implementation of it.
![Page 31: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/31.jpg)
![Page 32: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/32.jpg)
Hadoop is an open-source framework that pairs a distributed file system (HDFS) with a MapReduce processing engine.
![Page 33: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/33.jpg)
![Page 34: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/34.jpg)
![Page 35: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/35.jpg)
![Page 36: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/36.jpg)
![Page 37: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/37.jpg)
Developed by the folks over at Facebook.
![Page 38: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/38.jpg)
Hive is a data “warehouse” tool built to query Hadoop systems.
![Page 39: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/39.jpg)
Querying this data also allows us to work on our data retrieval skills.
![Page 40: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/40.jpg)
Less time cleaning data.
Less time “fishing”.
Fewer spreadsheets.
BOOM.
![Page 41: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/41.jpg)
![Page 42: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/42.jpg)
Amazon Web Services makes computing data in the cloud easy and cheap.
![Page 43: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/43.jpg)
No need for huge data centers on site.
![Page 44: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/44.jpg)
Pay for what you use.
![Page 45: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/45.jpg)
Makes it easy to move data around in the cloud.
![Page 46: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/46.jpg)
How does a company actually use all of these cool tools?
![Page 47: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/47.jpg)
![Page 48: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/48.jpg)
Spotify Client
AWS EMR (Hadoop)
PostgreSQL
Hive (data warehouse infrastructure; SQL-like syntax)
Ad hoc MapReduce jobs
![Page 49: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/49.jpg)
How does all of this fit into VaynerMedia?
![Page 50: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/50.jpg)
VM
![Page 51: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/51.jpg)
Where do analysts fall under the VM umbrella?
![Page 52: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/52.jpg)
Optimizing Content.
Optimizing Ad Spends.
Understanding Overall Trends.
![Page 53: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/53.jpg)
We could also develop data-driven products.
![Page 54: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/54.jpg)
Business Objective(s):
- How are we doing against our competitors/ourselves?
- How is our content performing this week?
![Page 55: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/55.jpg)
Math Skills: How do we calculate engagements appropriately? What are my KPIs?
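One possible way to turn that question into a KPI, sketched under loud assumptions: the formula (likes + comments + shares divided by impressions), the function `engagement_rate`, and the sample numbers are all hypothetical and not taken from the deck. The right definition depends on the platform and the business objective.

```python
def engagement_rate(likes, comments, shares, impressions):
    """Assumed KPI: total interactions divided by impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return (likes + comments + shares) / impressions

# Made-up sample posts, for illustration only.
posts = [
    {"name": "post_a", "likes": 120, "comments": 30, "shares": 50, "impressions": 10_000},
    {"name": "post_b", "likes": 45, "comments": 5, "shares": 10, "impressions": 1_500},
]

rates = {
    p["name"]: engagement_rate(p["likes"], p["comments"], p["shares"], p["impressions"])
    for p in posts
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

Normalizing by impressions lets a small post be compared fairly with a large one: here post_b has fewer interactions in total but a higher rate.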
![Page 56: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/56.jpg)
Hacking Skills: How do I get a hold of all of the public data needed for the analysis?
![Page 57: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/57.jpg)
We can also apply a similar methodology to ads.
![Page 58: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/58.jpg)
![Page 59: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/59.jpg)
Trending topics in real time.
![Page 60: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/60.jpg)
Big Picture
![Page 61: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/61.jpg)
Top phrases are available through the API in real time.
![Page 62: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/62.jpg)
Demographic information is also available.
![Page 63: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/63.jpg)
Other data points attached to stories.
![Page 64: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/64.jpg)
Using the Bit.ly API, we can pull all of this data.
Using R, we can analyze the data.
![Page 65: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/65.jpg)
We can adjust our targeting buckets in real time.
![Page 66: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/66.jpg)
It doesn’t matter what we do, as long as we develop our core skills.
![Page 67: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/67.jpg)
We don’t need all of the cool tools that large companies use to be called “Data Scientists”.
![Page 68: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/68.jpg)
A carpenter isn’t judged by the tools he uses, but by the things he builds.
![Page 69: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/69.jpg)
Data Science is a method of problem solving.
![Page 70: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/70.jpg)
We are Data Scientists.
![Page 71: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/71.jpg)
![Page 72: Data Science 101](https://reader035.vdocuments.net/reader035/viewer/2022062305/56816625550346895dd98125/html5/thumbnails/72.jpg)
Questions/Comments?