Potentials and Benefits of Linked Open Data (LOD)
DESCRIPTION
Talk by Sören Auer: Potentials and benefits of linked open data - given at the (Linked) Open Data MeetUp at The Waag in Amsterdam on 24 March 2013. See also: http://bit.ly/146hu1B
TRANSCRIPT
Potentials and Benefits of Linked Open Data
Dr. Sören Auer
What's wrong with Open Data?
Creating Knowledge out of Interlinked Data
1st Resource Download Attempt
2nd Resource Download Attempt
After Installing 7zip for opening .gz files
3rd Resource Download Attempt
5th Resource Download Attempt
Giving up
<kindergarten>
<name>Seven Dwarfs</name>
<location>...</location>
<description>...</description>
</kindergarten>
Publishing Data about Kindergartens in XML (1)
<child_care name="Seven Dwarfs">
<address>
<street>...</street>
<zip>...</zip>
</address>
<text>...</text>
</child_care>
Publishing Data about Kindergartens in XML (2)
<daycare id="Seven Dwarfs"
address="...">
. . .
</daycare>
Publishing Data about Kindergartens in XML (3)
Syntactic heterogeneity – different trees
Semantic heterogeneity – different tags and attributes (e.g. kindergarten, child_care, daycare)
<kindergarten>
<name>Seven Dwarfs</name>
<location>...</location>
<description>...</description>
</kindergarten>
<child_care name="Seven Dwarfs">
<address>
<street>...</street>
<zip>...</zip>
</address>
<text>...</text>
</child_care>
<daycare id="Seven Dwarfs"
address="...">
. . .
</daycare>
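To make the cost of this heterogeneity concrete, here is a minimal Python sketch that reads the facility name from each of the three styles above. The XML strings are simplified stand-ins for the slide examples (elided content is left out) and the function name `extract_name` is hypothetical; the point is only that every publisher's layout needs its own extraction rule.

```python
import xml.etree.ElementTree as ET

# Simplified stand-ins for the three publishing styles shown above.
DOC_1 = "<kindergarten><name>Seven Dwarfs</name></kindergarten>"
DOC_2 = '<child_care name="Seven Dwarfs"><address/></child_care>'
DOC_3 = '<daycare id="Seven Dwarfs" address="..."></daycare>'


def extract_name(xml_text):
    """Return the facility name; each schema needs a separate rule."""
    root = ET.fromstring(xml_text)
    if root.tag == "kindergarten":   # name is a child element
        return root.findtext("name")
    if root.tag == "child_care":     # name is an attribute
        return root.get("name")
    if root.tag == "daycare":        # name is stored in the id attribute
        return root.get("id")
    # Any further publisher means yet another branch here.
    raise ValueError(f"unknown schema: {root.tag}")


names = [extract_name(doc) for doc in (DOC_1, DOC_2, DOC_3)]
print(names)
```

With thousands of publishers, the chain of special cases grows with every new file layout, which is the scaling problem the following slides describe.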
Maybe CSV helps?
Kindergarten | Location | Description | …
Seven Dwarfs | Rosentalgasse 9, 04105 | … | …
… | … | … | …

Child_care | street | Zip | text
Seven Dwarfs | Rosentalgasse | 04105 | …
… | … | … | …

Type | Name | Location | Features
Daycare | Seven Dwarfs | 42.052384|13.273679 | …
… | … | … | …
Imagine you have 10,000 open data files describing child care from communities all over Europe, all in different XML, CSV, Excel, JSON, … formats
And then you want to look into pollution, road congestion, health care, …
A nightmare …
Distribution of file formats at PublicData.eu
How can we fix open data?
• Increasing data literacy???
• Organizing hackdays, hackathons???
• Publish more data???
Yes, but this won't scale
We also need:
• Standard formats which preserve semantics: RDF
• Reuse of vocabularies
• Visualization widgets, mashups, apps which can make sense of those vocabularies
How can we fix Open Data?
Linked Data in a Nutshell
1. Uses RDF Data Model
[Graph: OKFN -organizes-> LOD-MeetUp; LOD-MeetUp -starts-> 24.3.2013; LOD-MeetUp -takesPlaceIn-> Amsterdam]
2. Is serialised in triples (Subject Predicate Object):
OKFN organizes LOD-MeetUp .
LOD-MeetUp starts "2013-03-24"^^xsd:date .
LOD-MeetUp takesPlaceAt Amsterdam .
3. Uses Content-negotiation
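The subject/predicate/object model can be sketched in a few lines of Python. This is only an illustration: the plain names stand in for full URIs, the date literal is written in the `YYYY-MM-DD` form that `xsd:date` requires, and the helpers `serialize` and `objects` are hypothetical names, not part of any RDF library.

```python
# Each statement is one (subject, predicate, object) tuple.
# The bare names are placeholders from the slide, not real URIs.
triples = [
    ("OKFN", "organizes", "LOD-MeetUp"),
    ("LOD-MeetUp", "starts", '"2013-03-24"^^xsd:date'),
    ("LOD-MeetUp", "takesPlaceAt", "Amsterdam"),
]


def serialize(triples):
    """Write one 'subject predicate object .' statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def objects(triples, subject, predicate):
    """Return every object stated for a subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(serialize(triples))
print(objects(triples, "LOD-MeetUp", "takesPlaceAt"))
```

Because every statement has the same three-part shape, the same two helpers work for any dataset, whatever it describes.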
Seven_Dwarfs rdf:type Kindergarten
Seven_Dwarfs rdfs:label "Seven Dwarfs"
Seven_Dwarfs foaf:location "Rosentalgasse 9"
Seven_Dwarfs rdfs:description "..."
...
Different Kindergarten descriptions might also look different, but there will definitely be less variety than with XML or CSV
You can mix and match different vocabularies (RDF, RDFS, FOAF)
More information can be added without destroying the data structure
7 Dwarfs in RDF
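The claim that information can be added without destroying the structure can be illustrated with plain Python sets. This is a sketch under assumptions: the two "publishers" and their data are invented, the property names are copied from the slide as written, and `describe` is a hypothetical helper.

```python
# Hypothetical triples from two publishers describing the same kindergarten.
# Property names are taken from the slide as-is.
city_data = {
    ("Seven_Dwarfs", "rdf:type", "Kindergarten"),
    ("Seven_Dwarfs", "rdfs:label", "Seven Dwarfs"),
}
other_data = {
    ("Seven_Dwarfs", "rdfs:label", "Seven Dwarfs"),
    ("Seven_Dwarfs", "foaf:location", "Rosentalgasse 9"),
}

# Merging is a set union: identical statements collapse, new ones are added,
# and nothing already stated has to be restructured.
merged = city_data | other_data


def describe(triples, subject):
    """Collect all property values stated about one subject."""
    description = {}
    for s, p, o in sorted(triples):
        if s == subject:
            description.setdefault(p, []).append(o)
    return description


print(describe(merged, "Seven_Dwarfs"))
```

Doing the same with two XML trees or two CSV files of different layout would require a schema-specific merge step first.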
• Publish Open Data in RDF reusing vocabularies
which can be understood and combined by apps in
unforeseen ways (e.g. visualization widgets)
What has to be done?
Where we are now
Where we should be
make your stuff available on the Web (whatever format) under an open license
make it available as structured data (e.g., Excel instead of image scan of a table)
use non-proprietary formats (e.g., CSV instead of Excel)
use URIs to denote things
link your data
How can we lift Open Data to Linked Open Data?
All CSV on PublicData.eu is transformed into RDF
• Automatic CSV to RDF transformation won't render good results
• Mappings Wiki enables the crowdsourcing of mappings
Mapping Wiki
{{CSV2RDFHeader}}

...

{{RelCSV2RDF
| name = default-mapping
| header = 1
| omitRows = -1
| omitCols = -1
| delimiter =
| col1 = Department Family
| col2 = Entity
| col3 = Payment Date^^xsd:date
| col4 = rdf:type
| col5 = Cost Centre Name
| col6 = Supplier
| col7 = Transaction No.
| col8 = Line Amount
| col9 = Invoice Total^^xsd:decimal
}}
CSV2RDF Mapping Syntax
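How such a column mapping could be applied to a CSV file is sketched below in Python. This is not the actual CSV2RDF implementation: the base URI, the sample row, the way property and row URIs are built, and the function names are all assumptions made for illustration. Only the `colN = Label^^datatype` idea is taken from the mapping above.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace

# Column number -> property spec, mirroring the first columns of the mapping.
MAPPING = {
    1: "Department Family",
    2: "Entity",
    3: "Payment Date^^xsd:date",
}


def row_to_triples(row_number, row, mapping=MAPPING):
    """Turn one CSV row into triples, one per mapped, non-empty cell."""
    subject = f"<{BASE}row/{row_number}>"
    triples = []
    for col, spec in mapping.items():
        value = row[col - 1]
        if not value:
            continue  # empty cells produce no statement
        label, _, datatype = spec.partition("^^")
        predicate = f"<{BASE}{label.replace(' ', '_')}>"
        # Note: quotes inside values are not escaped in this sketch.
        literal = f'"{value}"' + (f"^^{datatype}" if datatype else "")
        triples.append((subject, predicate, literal))
    return triples


def csv_to_triples(text, header_rows=1):
    """Skip the header rows, then convert every remaining row."""
    rows = list(csv.reader(io.StringIO(text)))
    triples = []
    for number, row in enumerate(rows[header_rows:], start=1):
        triples.extend(row_to_triples(number, row))
    return triples


# Invented sample data with the first three columns of the mapping.
SAMPLE = (
    "Department Family,Entity,Payment Date\n"
    "Example Dept,Example Entity,2013-03-24\n"
)
result = csv_to_triples(SAMPLE)
for triple in result:
    print(" ".join(triple), ".")
```

The mapping is data rather than code, which is what makes it practical to crowdsource: a contributor only edits the column labels and datatypes.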
How can we make this happen?
Exploration Widgets:
• Spatial faceted-browsing
• Faceted-browsing
• Statistical visualization
• Entity-/faceted-based browsing
• Domain specific visualizations
• …
Open Datasets
Data Portal:
• Dataset analysis (size, vocabularies, properties)
• Selection of suitable visualization widgets
SemMap, OntoWiki
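The two data portal steps, dataset analysis followed by widget selection, might look roughly like the Python sketch below. The selection rules are invented for illustration: `geo` and `qb` are the prefixes commonly used for the W3C Basic Geo and RDF Data Cube vocabularies, but which widget a real portal picks for them is an assumption here, as are the function names and the sample data.

```python
from collections import Counter

# Hypothetical rules: vocabulary prefix -> widget category from the slide.
WIDGET_RULES = {
    "geo": "spatial faceted browsing",
    "qb": "statistical visualization",
}
DEFAULT_WIDGET = "faceted browsing"


def analyse(triples):
    """Report dataset size, vocabularies used and property frequencies."""
    properties = Counter(p for _, p, _ in triples)
    vocabularies = Counter(p.split(":", 1)[0] for _, p, _ in triples)
    return {
        "size": len(triples),
        "vocabularies": vocabularies,
        "properties": properties,
    }


def suggest_widgets(report):
    """Pick widgets whose vocabulary occurs in the dataset."""
    widgets = [
        widget
        for prefix, widget in WIDGET_RULES.items()
        if prefix in report["vocabularies"]
    ]
    return widgets or [DEFAULT_WIDGET]


# Invented sample dataset with coordinates.
dataset = [
    ("Seven_Dwarfs", "rdfs:label", "Seven Dwarfs"),
    ("Seven_Dwarfs", "geo:lat", "42.052384"),
    ("Seven_Dwarfs", "geo:long", "13.273679"),
]
report = analyse(dataset)
print(report["size"], dict(report["vocabularies"]))
print(suggest_widgets(report))
```

The selection only works because the dataset reuses known vocabularies; a dataset with ad hoc property names would fall through to the default.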
Browsing Statistical Data with CubeViz
Browsing Spatial Data with SemMap
LOD Lifecycle supported by Debian based LOD2 Stack
• Inter-linking / Fusing
• Classification / Enrichment
• Quality Analysis
• Evolution / Repair
• Search / Browsing / Exploration
• Extraction
• Storage / Querying
• Manual revision / authoring
http://stack.lod2.eu
• Open Data will only scale when it is Linked Open Data
• The RDF data model helps to reduce syntactic and semantic heterogeneity
• When Open Data is published as LOD adhering to standard vocabularies, visualization widgets, mashups, apps etc. can be applied to the data at runtime and in possibly unforeseen ways
• By ultimately reducing the entrance and usage barrier, LOD will facilitate long-tail applications
Take home
Thank You!!!
http://lod2.eu
http://aksw.org
The emerging Web of Data (2007, 2008, 2009, 2010)
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
Why do we need Linked Open Data?
Problem: Try to search for these things on the current Web:
• Apartments near German-English bilingual childcare in Leipzig
• ERP service providers with offices in Vienna and London
• Researchers working on multimedia topics in Eastern Europe
Information is available on the Web, but opaque to current search.
leipzig.de: Has everything about childcare in Leipzig.
Immobilienscout.de: Knows all about real estate offers in Germany.
[Diagram: each site runs a DB behind a Web server; the search engine receives HTML from both, complemented by RDF in the solution]
Solution: complement text on Web pages with structured linked open data & intelligently combine/integrate/join such structured information from different sources.
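A join across two sources, in the spirit of the apartment search mentioned in the problem statement, can be sketched in Python as follows. Everything here is invented for illustration: the `ex:` names, the sample data, and the simplification that "near" means "in the same district". A real setup would use full URIs and a query language such as SPARQL.

```python
# Hypothetical triples as a city portal might publish them.
childcare = [
    ("ex:Seven_Dwarfs", "rdf:type", "ex:Kindergarten"),
    ("ex:Seven_Dwarfs", "ex:language", "German-English"),
    ("ex:Seven_Dwarfs", "ex:district", "ex:District_A"),
]

# Hypothetical triples as a real estate portal might publish them.
real_estate = [
    ("ex:Flat_1", "rdf:type", "ex:Apartment"),
    ("ex:Flat_1", "ex:district", "ex:District_A"),
    ("ex:Flat_2", "rdf:type", "ex:Apartment"),
    ("ex:Flat_2", "ex:district", "ex:District_B"),
]


def subjects(triples, predicate, obj):
    """Return all subjects that have the given predicate/object pair."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def apartments_near_bilingual_childcare(childcare, real_estate):
    """Join both sources on the shared district identifier."""
    bilingual = subjects(childcare, "ex:language", "German-English")
    districts = {
        o for s, p, o in childcare
        if s in bilingual and p == "ex:district"
    }
    apartments = subjects(real_estate, "rdf:type", "ex:Apartment")
    return sorted(
        s for s, p, o in real_estate
        if s in apartments and p == "ex:district" and o in districts
    )


print(apartments_near_bilingual_childcare(childcare, real_estate))
```

The join only succeeds because both sources use the same identifier for the district, which is why linking data with shared URIs matters more than the file format alone.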