lessons from teaching data science to over a million people · the data science specialization •...

35
Lessons from teaching data science to over a million people Sean Kross CRUNCH Conference Budapest 2017-10-20

Upload: others

Post on 28-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Lessons from teaching data science to over a

million people

Sean KrossCRUNCH Conference

Budapest2017-10-20

Page 2: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

A little about me

• Formerly: The Johns Hopkins Data Science Lab

• Currently: The University of California San Diego

• Main interests:• Data Science• Online Education• Open Science

Page 3: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Jeff Leek Roger Peng Brian Caffo

The Johns Hopkins Data Science Lab

@rdpeng @bcaffo@jtleek

jhudatascience.org

Page 4: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

?

Page 5: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Part 1: The Data Science Specialization

Page 6: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

https://peerj.com/collections/50-practicaldatascistats/

Page 7: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

• Forecasting at Scale by Sean Taylor and Benjamin Letham (Facebook)

• How to Share Data for Collaboration by Shannon Ellis and Jeffrey Leek (Johns Hopkins Data Science Lab)

• Opinionated Analysis Development by Hilary Parker (Stitch Fix)

• Data Organization in Spreadsheets by Karl Broman and Kara Woo (The University of Wisconsin & DataCamp)

Page 8: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Rationale: “Let’s put in-person courses online to augment in-person teaching.”

We were on to something.

Page 9: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Nine Courses1. The Data Scientist’s Toolbox

2. R Programming

3. Getting and Cleaning Data

4. Exploratory Data Analysis

5. Reproducible Research

6. Statistical Inference

7. Regression Models

8. Practical Machine Learning

9. Developing Data Products

Page 10: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Enrollment and Completions of the Data Science Specialization

Page 11: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Key innovations

Page 12: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Give everything away for free

Page 13: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Capstones -> Portfolios -> Jobs

Page 14: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Run every course every month

Page 15: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Integrate content

https://ubc-mds.github.io/

Page 16: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 17: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 18: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 19: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Lessons from Alumni: The Data Science Specialization

• Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s technical blogs. Also see open source software projects like Prophet, a forecasting library from Facebook.

• Data Scientists want to be able to do in-house data science training.

• Domain expertise is important but so is buy-in from management.

Page 20: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Part 2: Executive Data Science

Page 21: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 22: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 23: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 24: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 25: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 26: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 27: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 28: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Lessons from Alumni: Executive Data Science

• Data science technologies tend to “trickle up.” (Especially good to know if you develop data science technologies.)

• Invest your precious time into basic statistics over basic programming.

• Your expectations for developing an analysis should resemble your expectations for developing software.

Page 29: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Part 3: Mastering Software Development in R

Page 30: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Data Scientist

Data Engineer

Page 31: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s
Page 32: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Lessons from Alumni: Mastering Software Development in R

• The field of data science is still taking form.

• Experimentation with roles can give you a competitive advantage.

• Taking risks is easier of you’re part of a community.

Page 33: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

?

Page 34: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

• Learn how to use the command line from the ground up.

• No previous experience expected.

• The gateway into computationally intensive tasks.

• Includes an introduction to cloud computing.

• Free!

leanpub.com/unix

Page 35: Lessons from teaching data science to over a million people · The Data Science Specialization • Data scientists want to create online artifacts. See Stitch Fix and Stack Overflow’s

Thank you!

Questions?

Link to these slides: seankross.com/crunch-talk/

Let’s talk: [email protected]

Find me on Twitter: @seankross