3 notebooks in jupyter 1. download all files for this


Post on 13-Apr-2022



Page 1

Good morning! Please:

1. Download all files for this lesson
2. Open all 3 notebooks in Jupyter
3. Make a Twitter account (optional)
4. Log in to your Twitter account

Page 2

Outline

1. APIs
   a. Overview
   b. Example with Twitter
2. Scraping
   a. Overview
   b. Examples

Page 3

What’s an API?

● “Application Programming Interface”
  ○ “Application:” program that does things for humans
  ○ API: does things for other programs
● Uses
  ○ Get data
  ○ Get services
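
A rough sketch of one program asking another for data: the snippet below builds a request URL from query options and turns a JSON reply into Python data. The endpoint `https://api.example.com/search`, the parameter names, and the response fields are all made up for illustration, and the reply is a hard-coded string, so nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://api.example.com/search"

def build_request_url(base_url, **params):
    """Attach query options to the endpoint as a URL query string."""
    return f"{base_url}?{urlencode(params)}"

def parse_response(body):
    """Turn a JSON response body into a list of Python dicts."""
    return json.loads(body)["results"]

url = build_request_url(BASE_URL, q="open data", limit=2)

# Stand-in for what a real API might send back.
sample_body = '{"results": [{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}]}'
records = parse_response(sample_body)
texts = [record["text"] for record in records]
```

With a real API, the URL would be sent with an HTTP client and the reply parsed the same way; the exact parameters and fields depend on that API's documentation.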

Page 4

Some things with APIs

● Twitter
  ○ Get tweets, post them, etc.
● Google
  ○ Search, translate, NLP…
● Patents <link>
● New York Times
● Library of Congress
● _____?

Page 5

Cautions

● Every API is different
● Read the documentation
  ○ Especially: rate limits, query options
● Google for example code
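
One way to stay under a rate limit is to space out calls. The sketch below is a minimal throttle, not any particular API's client: the class name `Throttle`, the helper `fetch_all`, and the interval are assumptions, and the real limit has to come from the API's documentation.

```python
import time

class Throttle:
    """Space out calls so they are at least `min_interval` seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Sleep just long enough to respect the interval, then record the call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now

def fetch_all(queries, fetch, throttle):
    """Call `fetch` once per query, pausing between calls."""
    results = []
    for query in queries:
        throttle.wait()
        results.append(fetch(query))
    return results
```

A call such as `fetch_all(queries, fetch=my_api_call, throttle=Throttle(1.0))` would make at most about one request per second, where `my_api_call` is a placeholder for whatever function actually contacts the API.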

Page 6

Neat example with Twitter

www.proporti.onl

Page 7

Ethics sidebar

Randall Collins. 1998. Sociology of Philosophies.

Page 8

Ethics sidebar

Gunter Grau. 1995. Hidden Holocaust.

Page 9

Twitter example

Page 10

Scraping Overview

● Sometimes, there is no API.

● “Scraping:” converting web pages to usable data
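
To make "converting web pages to usable data" concrete, here is a small sketch that pulls link text and addresses out of an HTML string using only Python's standard library. The class name `LinkExtractor` and the sample page are invented for illustration; the lesson's notebooks may well use a different library.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect (link text, href) pairs from every <a href=...> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag we are currently inside
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

# Made-up page content standing in for a downloaded web page.
sample_html = """
<ul>
  <li><a href="/events/1">Spring lecture</a></li>
  <li><a href="/events/2">Data <b>workshop</b></a></li>
</ul>
"""

parser = LinkExtractor()
parser.feed(sample_html)
links = parser.links
```

The result is an ordinary list of tuples that can be filtered, counted, or written to a file like any other data.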

Page 11

Things one might scrape

● Event information
● Policy statements
● Data tables
● Faculty lists
● Public comments or posts
  ○ (e.g. on legislation, news)
● _____?
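
For the data-table case, a sketch under simple assumptions: the page holds one well-formed table whose first row is the header. The names `TableParser` and `table_to_records` and the sample faculty table are hypothetical; real pages are usually messier and need site-specific handling.

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect the text of each table cell, one list per <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None outside a row
        self._cell = None   # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

def table_to_records(html):
    """Treat the first row as the header and return one dict per later row."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]

# Made-up table standing in for a scraped faculty list.
sample_html = """
<table>
  <tr><th>Name</th><th>Department</th></tr>
  <tr><td>A. Example</td><td>Sociology</td></tr>
  <tr><td>B. Sample</td><td>History</td></tr>
</table>
"""
records = table_to_records(sample_html)
```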

Page 12

Cautions

1. Use the API (if it exists)
2. Every website is different
3. Read robots.txt
4. Think seriously about ethics
   a. (OKC debacle, TOS, CAPTCHA)
5. BE NICE (or get us all banned...)
6. Recursion is dangerous (exponential growth)
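
Two of these cautions can be sketched in code: checking robots.txt before fetching, and capping how deep a crawl follows links so it cannot grow without bound. The example runs on a made-up in-memory "site" rather than the network; the user-agent name `course-bot`, the `crawl` function, and the URLs are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

def make_robots(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    return robots

def crawl(start, get_links, robots, agent="course-bot", max_depth=1):
    """Visit pages breadth-first, skipping disallowed or already-seen URLs."""
    visited = []
    seen = {start}
    frontier = [start]
    for _depth in range(max_depth + 1):   # hard cap on link-following depth
        next_frontier = []
        for url in frontier:
            if not robots.can_fetch(agent, url):
                continue                  # respect robots.txt
            visited.append(url)
            for link in get_links(url):
                if link not in seen:      # never queue the same page twice
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return visited

# Made-up site: each URL maps to the links found on that page.
SITE = {
    "https://example.com/": ["https://example.com/a", "https://example.com/private/x"],
    "https://example.com/a": ["https://example.com/", "https://example.com/b"],
    "https://example.com/b": ["https://example.com/c"],
}
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""

robots = make_robots(ROBOTS_TXT)
pages = crawl("https://example.com/", lambda url: SITE.get(url, []), robots, max_depth=1)
```

Here the crawl stops after the start page and the pages it links to directly, and it never touches `/private/`. A crawler hitting a real site would also pause between requests, in the spirit of "BE NICE".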

Page 13

Scraping examples