open source, open data
DESCRIPTION
My presentation from Florida Linux Show 2009. Find out how open source's principles are being used outside of software, and how open source and open data can work together to change the world.
TRANSCRIPT
Open Source, Open Data
Kirrily Robert
Florida Linux Show, 2009
From Open Source to Open Data
1993
Me in 1993
My Linux desktop looked like this
1993
• I started using Linux in 1993
• I was very excited by it, even though it was quite primitive at the time
• Other people thought I was a little crazy
Image: Wikipedia Image: Engadget
1999
Google’s servers in 1999
Jar Jar in 1999
1999
• By 1999 Linux + open source was starting to take off
• Companies using and building services on Linux etc.
• We were calling it “Open Source” - a more marketable term for Free Software
Four Software Freedoms
http://www.gnu.org/philosophy/free-sw.html
• Freedom to run the program
• Freedom to study the program and modify it for your own use
• Freedom to redistribute verbatim copies
• Freedom to improve the program, and release your improvements
Free Culture
• A similar movement
• Make cultural works freely available
• Mostly over the Internet
Free Culture
http://wiki.freeculture.org/Free_Culture_Definition
• Freedom to use the work
• Freedom to study the work and to apply knowledge acquired from it
• Freedom to make and redistribute copies
• Freedom to make changes and improvements, and to distribute derivative works
Image: masternewmedia.org
What is Open Data?
Data
Image: himmelskratzer @ Flickr
What is data?
• Ones and zeroes (obviously)
• But also filing cabinets, research archives, and other offline resources
• It’s not OPEN data unless you can get at it
Open Data Freedoms
• Freedom to use the data
• Freedom to study the data and modify it for your own use
• Freedom to make and share verbatim copies
• Freedom to improve the data and redistribute the results
Data availability
• Digital
• Online
• Well formatted
Open Data Projects
public.resource.org
• Created 2007 by Carl Malamud
• “Making Government Information More Accessible”
public.resource.org
• SEC EDGAR records
• Patents database
• Copyright database
• Congressional records
• Legal decisions
• Fedflix
Data.gov
• Founded 2008
• “Increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.”
OpenStreetMap
Compare...
OpenStreetMap
Open Library Project
• CD data
• Tracks, artists, releases...
• CC license
Flickr
• Images
• Metadata
  • tags, timestamps, geolocations, etc.
• Range of CC licenses and permissive TOS
Infochimps
• Large data sets
• Various licenses
• Tools for transformation
Freebase
• Open data about “everything”
• 8.5m concepts
• CC-BY license
• API and data dumps
2,416,683 books
16,608 ships
488 cheeses
Structured data
{
  "name": "Asiago cheese",
  "id": "/en/asiago_cheese",
  "region": [{
    "id": "/en/asiago",
    "name": "Asiago",
    "type": "/location/location"
  }],
  "source_of_milk": [{
    "id": "/en/cattle",
    "name": "Cow",
    "type": "/biology/organism_classification"
  }]
}
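A minimal sketch of how a program might read a record like the one on this slide. It uses only Python's standard `json` module; the helper name `linked_names` is made up for illustration, and the record is the slide's example with the comma after "name" restored so that it parses.

```python
import json

# The record from the slide, with the missing comma after "name" restored.
RECORD = """
{
  "name": "Asiago cheese",
  "id": "/en/asiago_cheese",
  "region": [
    {"id": "/en/asiago", "name": "Asiago", "type": "/location/location"}
  ],
  "source_of_milk": [
    {"id": "/en/cattle", "name": "Cow", "type": "/biology/organism_classification"}
  ]
}
"""


def linked_names(record, prop):
    """Return the names of the topics a record links to via `prop`.

    Hypothetical helper: a missing property yields an empty list.
    """
    return [item["name"] for item in record.get(prop, [])]


cheese = json.loads(RECORD)
print(cheese["name"], "comes from", linked_names(cheese, "region"))
print("Milk source:", linked_names(cheese, "source_of_milk"))
```

Because every linked topic carries its own id and type, a program can follow the links rather than guess at meaning from free text.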
Open Data Apps
• Apps for America competition
• Open source and open data
• Round 1: various data sources
• Round 2: Data.gov
Legistalker
Filibusted
Where the money goes
Open Source for Open Data
What can open source do?
Input
Processing
Output
Scrape
Munge
Visualise
Scraping data
• APIs
  • XML, RSS, JSON...
• Downloadable data sets
  • XML, Excel, CSV, triple dumps...
• Beautiful Soup (Python)
  • http://www.crummy.com/software/
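As a rough illustration of scraping, here is a sketch that pulls rows out of an HTML table. The slide names Beautiful Soup; this sketch swaps in Python's standard-library `html.parser` so it runs without extra packages. The sample HTML and the class name `TableScraper` are invented for the example.

```python
from html.parser import HTMLParser

# Made-up sample page; a real scraper would fetch this over HTTP.
SAMPLE_HTML = """
<table>
  <tr><th>Cheese</th><th>Milk</th></tr>
  <tr><td>Asiago</td><td>Cow</td></tr>
  <tr><td>Feta</td><td>Sheep</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])        # start a new row
        elif tag in ("td", "th"):
            self._in_cell = True
            self.rows[-1].append("")    # start a new, empty cell

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


scraper = TableScraper()
scraper.feed(SAMPLE_HTML)

# Treat the first row as column names and turn the rest into dicts.
header, *records = scraper.rows
scraped = [dict(zip(header, row)) for row in records]
print(scraped)
```

When a site offers an API or a downloadable data set, that is usually a sturdier input than scraped markup, which breaks whenever the page layout changes.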
Munging data
• Perl
  • http://perl.org/
• R (statistical analysis)
  • http://r-project.org/
• Hadoop (parallel data processing)
  • http://hadoop.apache.org/
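To give a feel for what "munging" means in practice, here is a small sketch that cleans a messy CSV and counts values. The slide lists Perl, R and Hadoop; Python is used here only to keep the examples in one language. The sample data and the function name `munge` are assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Made-up input with typical problems: stray spaces, mixed case, a missing name.
RAW_CSV = """name, milk
Asiago, Cow
feta , sheep
Cheddar,COW
,goat
"""


def munge(raw):
    """Trim whitespace, lower-case the milk column, drop rows with no name."""
    reader = csv.reader(io.StringIO(raw))
    header = [column.strip().lower() for column in next(reader)]
    rows = []
    for fields in reader:
        record = {key: value.strip() for key, value in zip(header, fields)}
        if not record.get("name"):
            continue                      # skip blank or nameless rows
        record["milk"] = record["milk"].lower()
        rows.append(record)
    return rows


clean = munge(RAW_CSV)
milk_counts = Counter(row["milk"] for row in clean)
print(milk_counts)
```

The cleaned rows are now consistent enough to feed into a statistics package or a visualisation tool.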
Visualisations
• MIT Simile
  • http://simile.mit.edu/
• Processing
  • http://processing.org/
Semantic Web
• Describe meaning, not markup
• Triples: subject, predicate, object
• Expression: RDF
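The triple idea can be sketched with nothing more than tuples. This is a plain-Python stand-in, not RDF itself and not any particular triple store's API; the identifiers reuse the Asiago example from earlier, and the function names `match` and `name_of` are invented for illustration.

```python
# Each triple is (subject, predicate, object). Identifiers are illustrative.
TRIPLES = [
    ("/en/asiago_cheese", "name", "Asiago cheese"),
    ("/en/asiago_cheese", "region", "/en/asiago"),
    ("/en/asiago_cheese", "source_of_milk", "/en/cattle"),
    ("/en/asiago", "name", "Asiago"),
    ("/en/cattle", "name", "Cow"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def name_of(triples, topic):
    """Look up a topic's name, falling back to its identifier."""
    found = match(triples, subject=topic, predicate="name")
    return found[0][2] if found else topic


# Follow a link from one topic to another, then resolve the target's name.
for _, _, milk in match(TRIPLES, "/en/asiago_cheese", "source_of_milk"):
    print("Asiago cheese is made from the milk of:", name_of(TRIPLES, milk))
```

Since the object of one triple can be the subject of another, separate data sets that share identifiers can be joined just by following links.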
Linked Open Data
Semantic web tools
• Triple stores
  • Sesame, BigData, Virtuoso...
• Libraries
  • RDFLib (Python), Redland RDF (librdf)...
Freebase Acre
Open source for open data
• Low barrier to entry
• Hooks into Freebase data
• Share and clone apps
• Apps are BSD licensed
FMDB
Gendered names app
Query editor
Clone!
Where next?
Open Data: Issues
• License clarity
• Govt + Corporate acceptance
• Developer literacy
• What do we DO with it?
What do we do with it?
• 10 years ago we were asking the same questions of Open Source
• With Open Data, we are just starting to realise its potential
• Please join us!
Keep in touch
• Email
  • [email protected]
• Freebase blog
  • http://blog.freebase.com/
• Twitter
  • @fbase