openrefine - wordpress.com · openrefine in firefox •start the program. still called...

80
OpenRefine ‘a free, open source, powerful tool for working with messy data’ http://upload.wikimedia.org/wikipedia/commons/thumb/7/76/Google-refine-logo.svg/2000px-Google-refine-logo.svg.png Jeff Moon Data Librarian & Academic Director, Queen’s Research Data Centre Queen’s University Library ACCOLEDS, University of Calgary, December , 2015

Upload: others

Post on 31-Oct-2019

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

OpenRefine‘a free, open source, powerful tool for working with messy data’

http://upload.wikimedia.org/wikipedia/commons/thumb/7/76/Google-refine-logo.svg/2000px-Google-refine-logo.svg.png

Jeff MoonData Librarian &Academic Director, Queen’s Research Data CentreQueen’s University Library

ACCOLEDS, University of Calgary, December, 2015

Page 2: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

What is OpenRefine?

• Formerly known as “Google Refine”

• Free & open-source

• Tool for cleaning large datasets

•Can create links between metadata sets

•Desktop application – run locally

http://openrefine.org

Page 3: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

openrefine.org

Page 4: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Real-world question…

• Start the program.Still called ‘google-refine’

• You’ll see:

Page 5: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data
Page 6: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

So, how do we get these data into a spreadsheet and manipulate them?

Page 7: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

You could use Excel…

Page 8: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

But let’s try OpenRefine

• Start the program Still called ‘google-refine’

• You’ll see:

Page 9: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Warning to use a ‘modern browser’

Page 10: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

On my PC, Internet Explorer is the default, BUT, IE is not the best choice for OpenRefine

https://mozorg.cdn.mozilla.net/media/img/firefox/firefox-256.e2c1fc556816.jpg

So, if necessary, copy the URL

& paste into

Firefox or Chrome

Page 11: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

OpenRefine in Firefox

• Start the program.Still called ‘google-refine’

• You’ll see:

Create a project by importing data. What kinds of data files can I import?TSV, CSV, *SV, Excel (.xls and .xlsx), JSON, XML, RDF as XML, and Google Data documents are all supported. Support for other formats can be added with Google Refine extensions.

Create a project by importing data. What kinds of data files can I import?

TSV, CSV, *SV, Excel (.xls and .xlsx), JSON, XML, RDF as XML, and Google Data documents

are all supported. Support for other formats can be added with Google Refine extensions.

Page 12: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Browse to the XML file

• Start the program.Still called ‘google-refine’

• You’ll see:

Page 13: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Click ‘Next’ when ready…

Page 14: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

OpenRefine recognizes your file

• Start the program.Still called ‘google-refine’

• You’ll see:

Page 15: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

We get a preview of the data…

• Start the program.Still called ‘google-refine’

• You’ll see:

We’re not quite there yet…

Page 16: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

We’re ready to ‘create’ our project…

• Start the program.Still called ‘google-refine’

• You’ll see:

Page 17: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Data is now in OpenRefine

• Start the program.Still called ‘google-refine’

• You’ll see:

Click on ’50’ to show 50 rows

Page 18: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Text facets – let’s look at the 2nd

‘Location’ column…

Page 19: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Dropdown: Facet Text facet

Page 20: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Facets show all unique entries, with frequencies

• Start the program.Still called ‘google-refine’

• You’ll see:

Page 21: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Unique values

&

Frequencies

Page 22: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Let’s look at another example

1911 Census Microdata

http://127.0.0.1:3333/

For this workshop I created a subset of ~100 records from each Province/Territory

Page 23: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Start OpenRefine

• Start the program Still called ‘google-refine’

• You’ll see:

Page 24: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

If needed, copy the URL to Chrome…

https://mozorg.cdn.mozilla.net/media/img/firefox/firefox-256.e2c1fc556816.jpg

If OpenRefineopens in IE,

copy/paste the URL into Chrome

Page 25: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

OpenRefine start screen Click Browse

Click ‘Browse’ to find your data file

Page 26: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Navigate to the ‘.csv’ fileD:\pccf\1911Census_subset.csv

And click ‘Open’

Click ‘Next’ to proceed

Page 27: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Text

You can adjust how files get read into OpenRefine

Page 28: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Text

Click ‘Create Project’

Page 29: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

What is OpenRefine?

Your data is now open as a ‘Project’ in OpenRefine

Page 30: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

What is OpenRefine?

You can specify how many rows to show

Page 31: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

What is OpenRefine?

You now have 50 rows displayed

Page 32: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 1. Transform dataCensus Year: Edit cells Common transforms To text

Census_Year

Page 33: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 1. Transform dataCensus Year: Edit cells Common transforms To text

Census_year is now in ‘Text’ format

Census_Year

Page 34: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 2. Transform back to numberCensus_Year: Edit cells Common transforms To number

Switch Census_yearback to ‘number’ format

Page 35: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 3. Text facetFirst, SCROLL to Province column

Province

Page 36: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 3. Text facetProvince: Facet Text facet

Province

Page 37: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 3. Facets

Facets appear at the left of the screen

Page 38: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 3. FacetsNote frequencies for each value in the column…

Frequencies

Page 39: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 4. EDIT entry for Alberta [ab]

Text

edit include

‘Mouse over’ to see ‘edit’,

Then click ‘edit’

Page 40: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Note that all‘Alberta’ records are now under

‘AB’

This is an efficient way of reviewing and ‘normalizing’ your data…

Exercise 4. EDIT entry for Alberta [ab]

Page 41: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & ClusteringSub_District_Name: Facet Text facet

Sub_District_Na

Page 42: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & ClusteringSub_District_Name: Again, facets shown on left side

Page 43: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & ClusteringSub_District_Name: Click on Cluster

Page 44: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & ClusteringNote different ‘methods’ and ‘keying functions’

Page 45: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & ClusteringNote suggested ‘clusters’…

Page 46: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & ClusteringNote suggested ‘clusters’…

Check them over, then Click ‘Select all’

Page 47: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & ClusteringNow we’re ready to ‘Re-Cluster’

Click

‘Merge Selected & Re-Cluster

Page 48: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & Clustering‘fingerprint’ clustering done!

DONE!

Page 49: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & ClusteringNow try the next function, ‘ngram-fingerprint’

Try the second option: ngram-

fingerprint

Page 50: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Two more ‘clusters’

Exercise 5. Text facets & ClusteringResults for ‘ngram-fingerprint’

Review, Select All, then Click

Merge Selected & Re-Cluster

Page 51: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & Clustering‘ngram-fingerprint’ clustering done!

DONE!

Page 52: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & Clustering

Now try the next keying function, ‘metaphone3’

Try the third option:

metaphone3

Page 53: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

But do we want to change all of

these??

Exercise 5. Text facets & ClusteringWe get a lot of new clusters…

Page 54: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Are these really the

same?

Exercise 5. Text facets & ClusteringWe get a lot of new clusters… scroll through the list

Go through the list and decide which clusters to include (or not), then

Merge Selected & Re-Cluster

Page 55: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 5. Text facets & ClusteringClustering… a powerful clean-up tool

You can use, and re-use clustering techniques

until your data is ‘clean’

Page 56: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 6. Editing data in columns LAST_NAME: Facet Text facet

edit include

Change the ‘asterisk’ to nothing (delete ‘*’)

LAST_NAME

Page 57: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 6. Editing data in columnsLAST_NAME: Edit cells Fill down

Page 58: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 6. Editing data in columns LAST_NAME: Edit cells Fill down

Not so easy to do in Excel…

Page 59: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Exercise 7. Creating new columnsCCRIUID_1911: Edit column Add column based on this column

CCRIUID_1911

Page 60: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

value[0,2]

Exercise 7. Creating new columnsCCRIUID_1911: Define values for new column

using ‘regular expression’

Page 61: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

‘grel’

‘A regular expression is a string that describes a

text pattern occurring in other strings.’

OpenRefine [formerly Google]Regular Expression Language

https://github.com/OpenRefine/OpenRefine/wiki/Understanding-Regular-Expressions

Regular ExpressionsA very powerful tool!

Page 62: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Regular Expressions. Source:XKCD, CC BY-NC 2.5 license.

Page 63: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Regular Expressions

Source: http://enipedia.tudelft.nl/wiki/OpenRefine_Tutorial

In Transformations:Edit cells Transform:• value.toNumber() change to ‘number’• value.toString() change to ‘string’• value.replace("+", "") replace one thing• value.replace("~", "").replace(",","") replace more than 1 thing• value.unescape('url') remove ‘odd’ characters• value.toUppercase() change to Uppercase

In FacetsFind entries based on what they contain:Facet Custom text facet:• value.contains("million") Facet based on presence/absence of a

“string”: returns ‘True’ or ‘False’

Page 64: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Regular Expressions

Source: http://enipedia.tudelft.nl/wiki/OpenRefine_Tutorial

value.replace("F", "f").replace("a","A")

Page 65: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Regular Expressions

Source: http://enipedia.tudelft.nl/wiki/OpenRefine_Tutorial

value.contains(“Farmer”)

Page 66: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

There is so much more… Edit cells Common transforms

Removes HTML or XML character

references and entities from a text string

Just what it says…

Collapses embedded >1 whitespace

Just what they say…

Page 67: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

There is so much more…

Use Google Regular Expression Language (GREL) to query external

data sources

For example, using Postal Codes to extract information from

Google Maps

Page 68: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Here’s a very simple example…

Page 69: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Use PCODE to retrieve external data…

Page 70: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Add GREL: Google Regular Expression Language

http://datadrivenjournalism.net/resources/geocoding_using_the_google_maps_geocoder_via_openrefine

"http://maps.googleapis.com/maps/api/geocode/json?

sensor=false&address="+escape(value,"url")+",CAN"

Page 71: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Google Map API is queried and sends back results…

Embedded in this data, you’ll find ‘latitude’ & ‘longitude’

Page 72: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

JSON output can be formatted…

{ "results" : [ { "address_components" : [ { "long_name" : "K7L 5C4", "short_name" : "K7L 5C4", "types" : [ "postal_code" ] }, { "long_name" : "Kingston", "short_name" : "Kingston", "types" : [ "locality", "political" ] }, { "long_name" : "Frontenac County", "short_name" : "Frontenac County", "types" : [ "administrative_area_level_2", "political" ] }, { "long_name" : "Ontario", "short_name" : "ON", "types" : [ "administrative_area_level_1", "political" ] }, { "long_name" : "Canada", "short_name" : "CA", "types" : [ "country", "political" ] } ], "formatted_address" : "Kingston, ON K7L 5C4, Canada", "geometry" : { "bounds" : { "northeast" : { "lat" : 44.2275953, "lng" : -76.4946668 }, "southwest" : { "lat" : 44.2269466, "lng" : -76.4954425 } }, "location" : { "lat" : 44.2272811, "lng" : -76.49506989999999 }, "location_type" : "APPROXIMATE", "viewport" : { "northeast" : { "lat" : 44.22861993029149, "lng" : -76.49370566970849 }, "southwest" : { "lat" : 44.22592196970849, "lng" : -76.4964036302915 } } }, "partial_match" : true, "place_id" : "ChIJo1lXewSr0kwRqBwN8Nxp4HM", "types" : [ "postal_code" ] } ], "status" : "OK" }

Online JSON Viewer

http://jsonviewer.stack.hu

Page 73: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

JSON output can be formatted…

{ "results" : [ { "address_components" : [ { "long_name" : "K7L 5C4", "short_name" : "K7L 5C4", "types" : [ "postal_code" ] }, { "long_name" : "Kingston", "short_name" : "Kingston", "types" : [ "locality", "political" ] }, { "long_name" : "Frontenac County", "short_name" : "Frontenac County", "types" : [ "administrative_area_level_2", "political" ] }, { "long_name" : "Ontario", "short_name" : "ON", "types" : [ "administrative_area_level_1", "political" ] }, { "long_name" : "Canada", "short_name" : "CA", "types" : [ "country", "political" ] } ], "formatted_address" : "Kingston, ON K7L 5C4, Canada", "geometry" : { "bounds" : { "northeast" : { "lat" : 44.2275953, "lng" : -76.4946668 }, "southwest" : { "lat" : 44.2269466, "lng" : -76.4954425 } }, "location" : { "lat" : 44.2272811, "lng" : -76.49506989999999 }, "location_type" : "APPROXIMATE", "viewport" : { "northeast" : { "lat" : 44.22861993029149, "lng" : -76.49370566970849 }, "southwest" : { "lat" : 44.22592196970849, "lng" : -76.4964036302915 } } }, "partial_match" : true, "place_id" : "ChIJo1lXewSr0kwRqBwN8Nxp4HM", "types" : [ "postal_code" ] } ], "status" : "OK" }

Looks a lot like XML

Page 74: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Now it is easier to find elements we might want to use

We can find ‘latitude’ & ‘longitude’

nested in ‘geometry’ &

‘location’

Page 75: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Edit column Add column based on this column

value.parseJson().results[0]["geometry"]["location"]["lat"]

Page 76: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

New ‘Latitude’ column

Page 77: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Edit column Add column based on this column

value.parseJson().results[0]["geometry"]["location"]["lng"]

Page 78: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

New ‘Longitude’ column

Page 79: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Closing comments

• Just scratched the surface of what OpenRefine can do

• More resources can be found at OpenRefine.org

Using OpenRefine to Update, Clean up, and Link your Metadata to the

Wider World, Sarah Weeks, St. Olaf College.

Youtube: https://www.youtube.com/watch?v=E-NbMR3_MRw

Data: https://raw.githubusercontent.com/PaulMakepeace/refine-client-py/master/tests/data/louisiana-elected-officials.csv

Regular Expressions Cheat Sheet: http://arcadiafalcone.net/GoogleRefineCheatSheets.pdf

Manual Videos

Page 80: OpenRefine - WordPress.com · OpenRefine in Firefox •Start the program. Still called ‘google-refine’ •You’ll see: Create a project by importing data. What kinds of data

Questions