open data - a goldmine (javazone 2009)
DESCRIPTION
An introduction to the basics and benefits of using Open Data. Slides from my presentation of this topic at the JavaZone 2009 conference in Oslo, Norway.
TRANSCRIPT
Open Data - a goldmine 1
Open Data – a goldmine
Photo by BullionVault @ Flickr, CC BY-ND
the speaker
Svein-Magnus Sørensen
Master of Science in Communications Technology from NTNU
Graduate from the Norwegian School of Entrepreneurship (Gründerskolen)
Past experience includes:
• Knowledge engineer at Computas AS (Oslo, Norway)
• Integration engineer at Searchforce Inc. (San Mateo, California)
Currently: Business Analyst at Objectware AS
Weblog: http://blog.menneske.org
Twitter: http://twitter.com/sveinmagnus
Slideshare: http://slideshare.net/sveinmagnus
CONTENT MATTERS
Why isn’t open source enough?
Open source doesn’t require open formats
Open source only covers the software
Data often lasts longer than software
Data is more valuable when accessible
Any code will be acceptable, any data won’t
Graphic by Open Source Initiative, CC BY
Open data – real gold
Canadian GoldCorp Inc. was near collapse in the late 1990s. Its Red Lake mine showed reduced output after 50 years of production. Then something previously unheard of happened:
Inspired by the crowd-sourcing of Linux and Open Source, Rob McEwen announced The GoldCorp Challenge: a competition to find new gold in the mine. The full geological dataset from Red Lake was made available to contestants.
Photo by Rickz @ Flickr, CC BY-NC-ND
And the result?
110 new targets were suggested by contestants from around the world.
80% of the targets submitted yielded substantial quantities of new gold.
GoldCorp got a first look at a wealth of new technologies for mine analysis.
Production at Red Lake increased tenfold while mining costs dropped to 1/6th of their previous levels.
Photo by BullionVault @ Flickr, CC BY-ND
What is Open Data?
• Open Knowledge Definition (http://www.opendefinition.org/)
Open data/content/information must:
1. Be Available and Accessible at Reproduction Cost “As a Whole”
2. Permit Free Redistribution
3. Permit Reuse Under Same Terms
4. Be Absent of Technological Restrictions
5. Be Attributed as Required
6. Keep Source Integrity
7. Not Discriminate Access From Persons or Groups
8. Not Discriminate Against Fields of Endeavor
9. Be Distributed with only the Original License
10. Must Not Be Licensed Specific to a Package
11. Must Not by License Restrict the Distribution of Other Works
Graphic by ronin691 @ Flickr, CC BY-SA
Why should we create open data?
Restrictions on data re-use can create an anti-commons and its related tragedy.
Photo by robokow @ Flickr, CC BY-NC-SA
Why should we create open data?
Sponsors may not get full value of research unless the results are made freely available.
The rate of discovery often accelerates with better access to data.
Photo by Victor.Correa @ Flickr, CC BY-NC-SA
Why should we create open data?
Data access is often required for the operation of communal human activities.
Photo by coreytempleton @ Flickr, CC BY-NC-SA
Why should we create open data?
This presentation would have been really boring without the fully and partially open data available from
Wikipedia, Flickr and the open data projects online!
If you love something…
Set it free!
Photo by keltanen @ Flickr, CC BY-NC
When should we demand Open Data?
When the data belongs to the human race
Photo by guiguibu91 @ Flickr, CC BY
When should we demand Open Data?
When the data consists of independently verifiable facts or common knowledge
When should we demand Open Data?
When public money funded the creation of the data
When the data was created at a government institution
When the source of the data was a public endeavor
Photo by Steve Wampler @ Flickr, CC BY
Saving lives with open data
o M.V. Rocknes was a 166-metre cargo ship with a crew of 30.
o On January 19th, 2004, she ran aground and capsized. 18 people died in the accident.
o The use of outdated maps by both the crew and the Norwegian pilotage authorities contributed to the wreck.
Photos by Smit International / Scanpix
Examples from Norway
Norwegian Medicines Agency
o Open medical databases can aid both research and healthcare.
o Data on approved medicines in Norway were made available online in 2008.
Norwegian Pollution Control Authority
o A central database of information on various materials can improve safety.
o Many databases on hazardous chemicals are outdated and of limited scope.
o No single source of up-to-date and complete information is available.
Norwegian Mapping Authority
o Open maps can prevent fatal accidents, especially at sea.
o The "Rocknes" wreck in 2004 might have been avoided with open maps.
o Updated maps are still not available to Norwegian pilotage authorities.
o In the United States, official nautical maps are freely available online.
Graphic by W3C SWEO Linking Open Data, CC BY-SA
Wikipedia defines Linked Data as “a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF.”
Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods.
The semantic web currently contains several billion triples of linked data.
http://linkeddata.org/
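As a rough illustration of the idea above, the following Python sketch models linked data as subject-predicate-object triples whose parts are identified by URIs. The URIs follow DBpedia's naming pattern but are illustrative assumptions, not an extract from a live dataset, and the `match` and `follow` helpers are hypothetical names introduced only for this example.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Illustrative URIs only (assumption): they mimic DBpedia-style identifiers.
OSLO = "http://dbpedia.org/resource/Oslo"
NORWAY = "http://dbpedia.org/resource/Norway"
COUNTRY = "http://dbpedia.org/ontology/country"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

TRIPLES: List[Triple] = [
    (OSLO, RDF_TYPE, "http://dbpedia.org/ontology/City"),
    (OSLO, COUNTRY, NORWAY),      # the link between two resources
    (NORWAY, RDFS_LABEL, "Norway"),
]


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


def follow(triples: Iterable[Triple], start: str, predicate: str) -> List[str]:
    """Return the objects reachable from `start` over one `predicate` link."""
    return [obj for _, _, obj in match(triples, s=start, p=predicate)]


if __name__ == "__main__":
    # Follow the "country" link from Oslo to the resource it points at.
    print(follow(TRIPLES, OSLO, COUNTRY))
```

Because every subject and object is a URI, a triple published by one dataset can point at a resource described by another, which is what makes the data "linked".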
Graphic by semanticwebcompany @ Flickr, CC BY-NC-SA
DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data.
The DBpedia knowledge base currently describes more than 2.6 million things, including at least 213,000 persons, 328,000 places, 57,000 music albums, 36,000 films, 20,000 companies. The knowledge base consists of 274 million pieces of information (RDF triples).
http://dbpedia.org/
DBpedia and all other linked data is searchable with SPARQL
http://en.wikipedia.org/wiki/SPARQL
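To make the SPARQL remark concrete, here is a minimal Python sketch of how a query could be packaged for a SPARQL endpoint over HTTP. It rests on several assumptions that are not stated in the slides: that DBpedia exposes an endpoint at `http://dbpedia.org/sparql`, that films are typed with the `dbo:Film` class, and that the endpoint honours the `application/sparql-results+json` media type. The helper names `build_query_url` and `build_request` are hypothetical. The code only builds the request; it does not contact the network.

```python
import urllib.parse
import urllib.request

# Assumption: address of DBpedia's public SPARQL endpoint.
DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"

# Assumption: the class and property names below exist in the DBpedia ontology.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?film ?label WHERE {
  ?film a dbo:Film ;
        rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
LIMIT 5
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Encode the query as the `query` parameter of an HTTP GET URL."""
    params = urllib.parse.urlencode({"query": query.strip()})
    return f"{endpoint}?{params}"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request asking for JSON results."""
    return urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )


if __name__ == "__main__":
    request = build_request(DBPEDIA_ENDPOINT, QUERY)
    print(request.full_url)
    # To actually run the query, pass `request` to urllib.request.urlopen().
```

Sending the prepared request with `urllib.request.urlopen` would return the result bindings, provided the assumptions about the endpoint hold.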
OpenStreetMap
OpenStreetMap is a free editable map of the whole world. It is made by people like you. OpenStreetMap allows you to view, edit and use geographical data in a collaborative way from anywhere on Earth.
www.openstreetmap.org
GeoNames
The GeoNames geographical database is available for download free of charge under a Creative Commons Attribution license. It contains over eight million geographical names and consists of 6.5 million unique features.
www.geonames.org
Ensuring truly Open Data
• Public Domain – Only after the expiration of copyright
• Science Commons protocol for open data
o Creative Commons Zero (CC0)
o Public Domain Dedication & Licence (PDDL)
Photo by suttonhoo @ Flickr, CC BY-NC-SA
Follow the PDDL Community Norms:
o Avoid technical protection measures
o Always give credit where credit is due
o Use open formats
o Let others know!
o Share your work too!
Questions?
Photo by danesparza @ Flickr, CC BY-ND
The road to open knowledge starts here!