Coursera Data Science Classes



Coursera Data Science Discussion Earl F Glynn

Principal Programmer/Analyst UMKC Center for Health Insights

[email protected]

19 April 2014


Outline

• Coursera

• Johns Hopkins University Data Science Specialization series

• First three classes in series

– Data Scientist’s Toolbox

– R Programming

– Getting and Cleaning Data


Online Classes

• www.coursera.org
• Wide variety of classes from many universities
• Many technical topics
• Class materials online and free
• Learn on your own schedule
• Interact with others worldwide via class forums
• Do as much or as little as you want
• Video lectures, quizzes, peer assessment assignments, programming assignments, exams
• Can receive Statement of Accomplishment (PDF)
• Free or paid “signature track”


Johns Hopkins University

Data Science Specialization

• www.coursera.org/specialization/jhudatascience/

• Nine classes in series taught by Professors Brian Caffo, Jeff Leek and Roger Peng

• Each class four weeks long

• Three classes start each month

• All nine classes to run concurrently by June

• All use R programming language

• Free or paid “signature track”

• Signature track adds a Capstone project


Course Dependencies

https://d396qusza40orc.cloudfront.net/rprog/doc/JHDSS_CourseDependencies.pdf


Data Scientist’s Toolbox

• Short video about each class in series
• Install R, R packages [Mac or PC]
• Install RStudio, an IDE for R
• Command line interface
• Install Git software
• Establish GitHub account
• Work with software repositories
• Basic markdown (.md files)

• 2 dozen videos
• 3 quizzes
• 1 peer assessment project (GitHub submission)


Git / GitHub

Source: www.eqqon.com/index.php/Collaborative_Github_Workflow


GitHub Repository for Series


R Programming

• Data Types
• Reading/Writing Data
• Control Structures
• Functions
• Scoping Rules
• Subsetting a data.frame
• Vectorized operations, including “apply” functions (apply, lapply, tapply, mapply)
• Debugging and R Profiler

• ~40 videos
• 4 quizzes
• 3 programming assignments
• 1 peer assessment assignment
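The “apply” functions named in the topic list can be illustrated with a minimal base R sketch. The data here (a small matrix, a list, and the `scores` and `group` vectors) are made up for illustration and are not taken from the course materials.

```r
# Illustrative data only; none of these objects come from the course.
m <- matrix(1:6, nrow = 2)               # 2 x 3 matrix, filled column-wise

# apply: run a function over rows (MARGIN = 1) or columns (MARGIN = 2)
row_sums <- apply(m, 1, sum)             # sums of 1,3,5 and 2,4,6

# lapply: run a function over each list element; always returns a list
lens <- lapply(list(a = 1:3, b = 1:5), length)

# tapply: summarize a vector within groups defined by a factor
scores <- c(90, 80, 70, 60)
group  <- factor(c("x", "x", "y", "y"))
group_means <- tapply(scores, group, mean)

# mapply: call a function over several argument vectors in parallel
reps <- mapply(rep, 1:3, 3:1)            # rep(1, 3), rep(2, 2), rep(3, 1)
```

Each call replaces an explicit loop: the function is passed as an argument and R handles the iteration.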


Programming Assignment 1 (week 2)


Getting and Cleaning Data

• Raw Data, Processed Data, “Tidy Data”
• Downloading Files
• Reading data: Excel, XML, JSON, MySQL, HDF5, HTML, APIs, fixed-width fields, images, …
• data.table Package
• Subsetting, Sorting, Summarizing
• Reshaping, Merging, Editing
• Regular Expressions
• Dates
• Data Sources

• 2 dozen videos
• 4 quizzes
• 1 peer assessment project
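A few of the topics listed above (regular expressions, dates, merging, subsetting, sorting) can be sketched in base R. The `visits` and `people` data frames are hypothetical and are not from the class assignments.

```r
# Hypothetical raw data for illustration only.
visits <- data.frame(
  id   = c(2, 1, 3),
  date = c("2014-04-19", "2014-03-01", "2014-04-02"),
  stringsAsFactors = FALSE
)
people <- data.frame(
  id   = c(1, 2, 3),
  name = c(" ann", "Bob ", "cy"),
  stringsAsFactors = FALSE
)

# Regular expressions: strip leading and trailing whitespace from names
people$name <- gsub("^[[:space:]]+|[[:space:]]+$", "", people$name)

# Dates: convert ISO-formatted strings to Date objects
visits$date <- as.Date(visits$date, format = "%Y-%m-%d")

# Merging: join the two tables on the shared id column
tidy <- merge(people, visits, by = "id")

# Subsetting and sorting: keep April visits, ordered by date
april <- tidy[format(tidy$date, "%m") == "04", ]
april <- april[order(april$date), ]
```

The result has one row per visit and one column per variable, which is the shape the class calls “tidy data”.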


Markdown File for GitHub Submission


Six more to go!
