new professional careers in data
![Page 1: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/1.jpg)
new professional careers
…in data
![Page 2: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/2.jpg)
Who am I?
• David Rostcheck
• I’m a consulting data scientist
• Follow my articles on LinkedIn
![Page 3: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/3.jpg)
We will talk about 4 things:
• Big Data
• Data Science
• Data Engineering
• Business Intelligence
![Page 4: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/4.jpg)
BIG DATA
![Page 5: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/5.jpg)
What is big data?
![Page 6: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/6.jpg)
Big data is data that is so big that it requires specialized techniques to handle
![Page 7: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/7.jpg)
like: clusters
![Page 8: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/8.jpg)
or cloud computing
![Page 9: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/9.jpg)
or graph algorithms
![Page 10: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/10.jpg)
Data may change rapidly, so big data may also be fast data
![Page 11: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/11.jpg)
big data requires specialized tools to handle, such as MapReduce
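To make the map/reduce idea concrete, here is a minimal single-machine sketch in Python. It is an illustration only, not how Hadoop or any particular framework is implemented; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # On a real cluster, each document (or block of data) could be mapped
    # on a different machine; here everything runs in one process.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data is big", "fast data is data"]))
```

The point of the split is that the map and reduce steps do not depend on each other's data, so a framework can spread them across many machines.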
![Page 12: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/12.jpg)
big data tools are in demand, but keep your perspective
![Page 13: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/13.jpg)
Big Data tools can be complex
It is often easier to solve problems at small scale, then scale up, if possible
![Page 14: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/14.jpg)
remember: not all companies use big data, but all companies use data
![Page 15: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/15.jpg)
DATA SCIENCE
![Page 16: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/16.jpg)
What is data science?
![Page 17: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/17.jpg)
Data science is industrial research on a company’s own data
![Page 18: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/18.jpg)
What is its goal?
![Page 19: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/19.jpg)
to produce advanced algorithms that deliver a competitive advantage
![Page 20: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/20.jpg)
data scientists often work with unstructured data
… which can be large
![Page 21: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/21.jpg)
“The qualifications for the job include the strength to tunnel through mountains of information and the vision to discern patterns where others see none”
- Bloomberg Businessweek
![Page 22: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/22.jpg)
Is data science really science?
![Page 23: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/23.jpg)
let’s compare…

| | academic science | data science |
|---|---|---|
| Teams | PhDs, graduate students | PhDs, technologists |
| Setting | University | Company |
| Publication | Formal (academic publications, conferences) | Less formal (blogs, white papers, open source) |
| Funding | Public grants | Corporate |
| Goal | Advance human knowledge | Create competitive advantage |
![Page 24: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/24.jpg)
Data science is industrial science
It shares some attributes with academic science but differs in others
![Page 25: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/25.jpg)
What kind of work do data scientists do?
![Page 26: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/26.jpg)
data scientists create artificially intelligent systems
these are often called “narrow AI”
![Page 27: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/27.jpg)
examples
• Recommender systems
• Self-driving cars
• AI agents
• Smart energy management
• Medical diagnosis
• Machine vision
![Page 28: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/28.jpg)
DATA ENGINEERING
![Page 29: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/29.jpg)
What is data engineering?
![Page 30: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/30.jpg)
data engineering is a specialized kind of software engineering with additional skills in handling and processing data
![Page 31: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/31.jpg)
data science vs. data engineering

| | data science | data engineering |
|---|---|---|
| Approach | Scientific (Exploration) | Engineering (Development) |
| Problems | Unbounded | Bounded |
| Path to solution | Iterative, exploratory, nonlinear | Mostly linear |
| Education | More is better (PhDs common) | BS and/or self-trained |
| Presentation skills | Important | Not as important |
| Research experience | Important | Not as important |
| Programming skills | Not as important | Important |
| Data skills | Important | Important |
![Page 32: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/32.jpg)
What kind of special training does a data engineer need?
![Page 33: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/33.jpg)
• Data storage and processing
  – structured (SQL)
  – unstructured (NoSQL)
  – Big Data (Hadoop, Apache Spark/Storm/Flink, cloud)
• Data visualization
• Machine Learning algorithms and platforms (ex. Dato)
• Predictive APIs (ex. Watson)
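As a rough illustration of the structured vs. unstructured distinction in the list above, the sketch below computes the same per-customer total two ways: with SQL against a fixed schema, and by walking schema-free JSON documents. It uses only Python's standard library (`sqlite3`, `json`); the table name, field names, and sample records are made up for the example, and the JSON records only stand in for what a NoSQL document store would hold.

```python
import json
import sqlite3

# Structured: a fixed schema, queried with SQL (in-memory SQLite database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0)],
)
sql_totals = dict(
    conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
)
conn.close()

# Document style: each record may carry different fields, so the code
# (not the database schema) decides which fields to read.
raw_documents = [
    '{"customer": "alice", "amount": 30.0, "coupon": "SPRING"}',
    '{"customer": "bob", "amount": 12.5}',
    '{"customer": "alice", "amount": 20.0, "notes": ["gift"]}',
]
doc_totals = {}
for raw in raw_documents:
    doc = json.loads(raw)
    customer = doc["customer"]
    doc_totals[customer] = doc_totals.get(customer, 0.0) + doc["amount"]

print(sql_totals)
print(doc_totals)
```

Both approaches give the same totals here; the trade-off is that the SQL version enforces one shape for every row, while the document version tolerates extra or missing fields at the cost of handling them in code.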
![Page 34: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/34.jpg)
Does a data engineer need more math than a regular software engineer?
![Page 35: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/35.jpg)
It really helps.
Linear algebra & calculus are important to understand machine learning
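A small sketch of why the math matters: fitting a straight line by gradient descent uses calculus (the partial derivatives of the error) and, at larger scale, linear algebra (the same sums become matrix-vector products). This is an illustrative toy under assumed values, not a production method; `fit_line`, the learning rate, and the sample data are assumptions for the example.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every data point with the current w and b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Calculus: partial derivatives of mean((w*x + b - y)^2)
        # with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step downhill, against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

Machine learning libraries hide these steps, but knowing what the gradient is and why the step size matters helps when a model fails to converge.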
![Page 36: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/36.jpg)
BUSINESS INTELLIGENCE
![Page 37: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/37.jpg)
Wait – aren’t data science and business intelligence really the same thing?
![Page 38: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/38.jpg)
Maybe. Let’s compare…

| | business intelligence (BI) | data science |
|---|---|---|
| Data analysis | Yes | Yes |
| Statistics | Yes | Yes |
| Visualization | Yes | Yes |
| Data sources | Usually SQL, often Data Warehouse | Less structured (logs, cloud data, SQL, NoSQL, text) |
| Tools | Statistics, Visualization | Statistics, Machine Learning, Graph Analysis, NLP |
| Focus | Present and past | Future |
| Approach | Analytic | Scientific |
| Goal | Better strategic decisions | Advanced functionality |
![Page 39: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/39.jpg)
The two fields are closely related.
In some ways data science is an evolution of business intelligence.
![Page 40: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/40.jpg)
which industries most use data-focused jobs?
![Page 41: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/41.jpg)
right now:
• Technology
• Education
• Finance
• Consulting
• Health Care
(Technology employs over 50% of data workers)
![Page 42: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/42.jpg)
but... “Technology” companies like Uber, Amazon, AirBnB compete in other industries (transportation, retail, hotels)
![Page 43: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/43.jpg)
“Software is eating the world”
– Marc Andreessen (Andreessen Horowitz)
![Page 44: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/44.jpg)
which industries will AI change?
![Page 45: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/45.jpg)
Ultimately, all of them.
Incorporating AI is a large business opportunity
![Page 46: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/46.jpg)
data jobs are in demand
• “The hot job of the decade… Data scientists today are akin to Wall Street ‘quants’ of the 1980s and 1990s”
- Harvard Business Review
• “18.7% projected growth 2010-2020”
- VentureBeat
• “McKinsey projects […] ‘50 percent to 60 percent gap between supply and requisite demand’”
- Bloomberg Businessweek
![Page 47: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/47.jpg)
On the other hand…
Some people believe data jobs themselves will be automated:
“New Teradata Platform Reduces Demand For Data Scientists”
- Forbes
“Automating the Data Scientist”
- MIT Technology Review
![Page 48: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/48.jpg)
What do we think?
• Yes, advanced tools will automate some data exploration
• But: research and communication are fundamental skills and are always in demand when the world is changing
• Data will continue to explode (Internet of Things)
• We will see more change and faster change
![Page 49: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/49.jpg)
education for data jobs
options include: academic programs, boot camps, and online classes (Coursera, Udacity)
![Page 50: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/50.jpg)
for data engineering:
– documentation and webinars (self-education)
– focus on data manipulation tools and machine learning
![Page 51: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/51.jpg)
for data science:
– The more academic science and research expertise, the better
– Focus on projects that solve unknown problems
– Work with more experienced data scientists
![Page 52: New professional careers in data](https://reader034.vdocuments.net/reader034/viewer/2022042906/58aa350c1a28abbb108b623f/html5/thumbnails/52.jpg)
Questions?
Contact: [email protected], twitter: @davidrostcheck
Articles: http://linkedin.com/in/davidrostcheck