linked data in action
Post on 18-Sep-2014
LINKED DATA
KIN Webinar, 11 June 2010
Stephen Dale
Associate Consultant – IDeA
Director, Semantix (UK) Ltd
www.semanrtix.co.uk
SAN FRANCISCO CRIME DATA
Visualizations created by a private individual from datasets made available by the City & County of San Francisco
http://www.informationisbeautiful.net/visualizations/because-every-country-is-the-best-at-something/
GOOGLE PUBLIC DATA EXPLORER
http://www.google.com/publicdata/home
DATA SOURCED FROM GUARDIAN OPEN PLATFORM
OPEN DATA, LINKED DATA, DATA VISUALIZATION
Open Data + Visualization can surface hidden insights
Open + Linked Data provides opportunities for sharing these insights
SOME BASICS
Linked data and open data are often used interchangeably, but they are not the same thing.
Open data is non-personal data published in a digital format, e.g. HTML, CSV, Excel, Access, Word, PDF
Linked data is data that is structured in such a way that data sets can be combined to create something greater than the sum of its parts. It is also described as ‘the semantic web’. Sir Tim Berners-Lee, the inventor of the World Wide Web, describes it as a “web of data”, in the same way that the Internet is currently a “web of documents”.
Whereas we use URLs to connect the web of documents, we use URIs to connect the web of data.
URI Definition at http://tools.ietf.org/html/rfc3986
OPEN DATA DEFINITION
A work is open if its manner of distribution satisfies the following conditions:
1. Complete: All public data are made available. Public data are data that are not subject to valid privacy, security or privilege limitations.
2. Primary: Data are collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.
3. Timely: Data are made available as quickly as necessary to preserve the value of the data.
4. Accessible: Data are available to the widest range of users for the widest range of purposes.
5. Machine processable: Data are reasonably structured to allow automated processing.
6. Non-discriminatory: Data are available to anyone, with no requirement of registration.
7. Non-proprietary: Data are available in a format over which no entity has exclusive control.
8. License-free: Data are not subject to any copyright, patent, trademark or trade secret regulation.
Source: http://www.opengovdata.org/home/8principles
SOME DEFINITIONS
Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods. More specifically, Wikipedia defines Linked Data as "a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF."
Source: http://linkeddata.org
PUBLIC DATA PRINCIPLES
· Public data will be published in reusable, machine readable form.
· Public data will be available and easy to find through a single easy to use online access point (www.data.gov.uk).
· Public data will be published using open standards and following the recommendations of the World Wide Web Consortium (W3C).
· Any ‘raw’ dataset will be represented in linked data form.
· More public data will be released under an open licence which enables free reuse, including commercial reuse.
· Data underlying the Government’s own websites will be published in reusable form for others to use.
· Personal, classified, commercially sensitive and third-party data will continue to be protected.
Source: Smarter Government – December 2009
RESOURCE DESCRIPTION FRAMEWORK (RDF)
Linked Data is published on the web for machines to read rather than humans, using the RDF data model. RDF breaks a statement down into three parts, so an RDF statement is known as a ‘triple’:
· Subject
· Predicate
· Object
[Diagram: RDF Triple, with the Subject linked to the Object by the Predicate]
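As a rough illustration of the subject / predicate / object split, the sketch below holds a few statements as plain Python tuples and filters them by any combination of the three parts. The `Triple` type and the `match` helper are hypothetical names invented for this example, and the statements are shortened versions of those on the RDF TRIPLES slide; real RDF would use URIs rather than text labels.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples whose parts equal every argument that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Text labels are used here for readability only.
triples = [
    Triple("King Edward Primary School", "is located within", "Mansfield"),
    Triple("Mansfield", "is located within", "Nottinghamshire"),
    Triple("King Edward Primary School", "is a type of", "Community School"),
]

# Everything stated about the school.
about_school = match(triples, subject="King Edward Primary School")
for t in about_school:
    print(t.subject, "|", t.predicate, "|", t.obj)
```

Leaving an argument as `None` acts as a wildcard, which is loosely how variables behave in a query pattern.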
MARKING UP DOCUMENTS FOR RDF
Use the XHTML+RDFa DOCTYPE at the top of the document:
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
RDFa makes extensive use of URIs which can get a little unwieldy. RDFa supports a mechanism for shortening URIs
(called Compact URIs, or CURIEs), which involves using a prefix to replace part of the URI.
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:argot="http://purl.oclc.org/argot/" xmlns:dc="http://purl.org/dc/terms/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:v="http://www.w3.org/2006/vcard/ns#" xmlns:dbp="http://dbpedia.org/resource/"> ...</html>
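The prefix mechanism can be sketched in a few lines of Python. The prefix-to-URI map below is copied from the `xmlns:` declarations above; `expand_curie` is a hypothetical helper written for this illustration, not part of any RDFa library, and it ignores edge cases such as the default prefix.

```python
# Prefix map taken from the xmlns declarations in the example markup.
PREFIXES = {
    "argot": "http://purl.oclc.org/argot/",
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "v": "http://www.w3.org/2006/vcard/ns#",
    "dbp": "http://dbpedia.org/resource/",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'prefix:reference' into a full URI (hypothetical helper)."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + reference


print(expand_curie("foaf:name"))
print(expand_curie("dc:title"))
```

The expansion is simple concatenation: the prefix is swapped for its URI and the remainder is appended unchanged.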
RDF TRIPLES
Subject | Predicate | Object
King Edward Primary School | is located within | the district of Mansfield
Mansfield | is located within | the county of Nottinghamshire
King Edward Primary School | is a type of | Community School
Community Schools in Nottinghamshire | achieved | an average score of 4 for Key Stage 4 Mathematics in School Year 2008/9
Stephen Dale | achieved | a score of 5 for Key Stage 2 Mathematics for School Year 2008/9
The examples above are ‘human-readable’, but the actual RDF representation of this data is in a form that machines can interpret and draw inferences from. Each part of a ‘triple’ is therefore represented by a ‘universal identifier’ (a URI) rather than the actual text.
Source: LeGSB
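To make the idea of machine inference a little more concrete, here is a minimal Python sketch in which each part of a triple is a URI and a new fact is derived from two stated ones. The school URI is the one quoted on the SPARQL results slide; everything under `http://example.org/` is a made-up placeholder, and treating "located within" as transitive is an assumption of this example rather than something the source data declares.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only

SCHOOL = "http://education.data.gov.uk/id/school/133274"  # URI quoted in the deck
LOCATED_WITHIN = EX + "locatedWithin"        # assumed predicate URI
MANSFIELD = EX + "district/Mansfield"        # assumed district URI
NOTTS = EX + "county/Nottinghamshire"        # assumed county URI

# The two stated facts from the table, expressed with identifiers.
triples = {
    (SCHOOL, LOCATED_WITHIN, MANSFIELD),
    (MANSFIELD, LOCATED_WITHIN, NOTTS),
}


def infer_transitive(triples, predicate):
    """Add (a, p, c) whenever (a, p, b) and (b, p, c) exist, until nothing new appears."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for s, p, o in closed if p == predicate]
        for a, b in pairs:
            for b2, c in pairs:
                if b == b2 and (a, predicate, c) not in closed:
                    closed.add((a, predicate, c))
                    changed = True
    return closed


inferred = infer_transitive(triples, LOCATED_WITHIN)
# The school was never stated to be in Nottinghamshire, but it follows.
print((SCHOOL, LOCATED_WITHIN, NOTTS) in inferred)
```

In practice this kind of rule would usually be declared in an ontology and applied by a reasoner; the loop above only shows the principle.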
RESULTS FROM A SPARQL QUERY
Property | Value
Label | King Edward Primary School
Latitude | 53.13866
Longitude | -1.19177
Type | Community School
Easting | 454165
Northing | 360470
E y government funded children | 0
FSM | 41
FSMPercentage | 13
LLSC | Nottinghamshire
LSOA | Mansfield 013F
MSOA | Mansfield 013
PFI | false
A description of the resource identified by http://education.data.gov.uk/id/school/133274
SPARQL
SPARQL (pronounced "sparkle") is an RDF query language; its name is a recursive acronym that stands for SPARQL Protocol and RDF Query Language. It was standardized by the RDF Data Access Working Group (DAWG) of the World Wide Web Consortium, and is considered a key semantic web technology. On 15 January 2008, SPARQL became an official W3C Recommendation.
Source: Wikipedia
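A description like the one on the results slide could plausibly be requested with a SPARQL DESCRIBE query. The Python sketch below only builds the query text and the GET request URL, following the SPARQL Protocol convention of passing the query in a `query` parameter; it does not contact any server. The endpoint address is a placeholder, since the deck does not say which endpoint was used, and the exact query behind the slide is not known.

```python
from urllib.parse import urlencode

RESOURCE = "http://education.data.gov.uk/id/school/133274"  # URI quoted in the deck
ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint, not from the deck


def describe_query(uri):
    """Build a SPARQL DESCRIBE query for a single resource."""
    return f"DESCRIBE <{uri}>"


def request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


query = describe_query(RESOURCE)
url = request_url(ENDPOINT, query)
print(query)
print(url)
```

Sending `url` with any HTTP client, against a real endpoint, would return whatever triples the server chooses to give as the description of the resource.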
REFERENCE DATA
Reference data and Ontologies are the basis for making connections between linked datasets. There are a number of well established sources of reference data that form the basis of the connections that mash-up style sites use. The latest view of these is given at http://linkeddata.org
Typical types of reference data are:
· DBpedia – to define terms
· GeoNames – to define places
Source: Richard Wallis http://www.slideshare.net/rjw/linked-data-in-action
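The role of reference data as a connection point can be sketched as a simple join: two datasets that were published separately can be combined wherever they use the same reference URI. In the Python sketch below the predicate names, the second dataset and the choice of a DBpedia-style URI for Mansfield are all assumptions made for illustration; only the school URI comes from the deck.

```python
# Assumed shared reference URI that both datasets point at.
MANSFIELD = "http://dbpedia.org/resource/Mansfield"

# Dataset A (illustrative): a school linked to a place.
schools = [
    ("http://education.data.gov.uk/id/school/133274", "locatedIn", MANSFIELD),
]

# Dataset B (illustrative): facts about places, keyed by the same URI.
places = [
    (MANSFIELD, "label", "Mansfield"),
]


def link(left, right):
    """Pair each left triple with the right triples whose subject equals its object."""
    by_subject = {}
    for s, p, o in right:
        by_subject.setdefault(s, []).append((p, o))
    return [
        (s, p, o, p2, o2)
        for s, p, o in left
        for p2, o2 in by_subject.get(o, [])
    ]


for row in link(schools, places):
    print(row)
```

The join works only because both sides use an identical identifier for the place, which is the job that shared reference data performs.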
GROWTH OF THE INTERNET
Source: Richard Wallis http://www.slideshare.net/rjw/linked-data-in-action
MATURITY OF LINKED DATA
DATA.GOV.UK MAKING PUBLIC DATA PUBLIC
OPEN OR LINKED?
http://coins.guardian.co.uk/coins-explorer/search
http://alpine.coinsdata.co.uk
MORE EXAMPLES OF LINKED DATA IN ACTION
http://www.lichfielddc.gov.uk/site/custom_scripts/map.php
http://labs.timesonline.co.uk/2009/cycling_accidents/
http://filmlondon.org.uk/film_culture/film_tourism/movie_maps/love_from_london
CAN LINKED DATA SOLVE PROBLEMS WITHIN THE ENTERPRISE?
CONNECTING SILO’D DATA REPOSITORIES
[Diagram: silo’d repositories connected through a Linked Data Cloud. Labels: Intranet (Intranet Documents, Mail System); RDBMS (ERP, CRM, Legacy Data); Data warehouse, Data marts; Business Systems (HR, Finance); Corp Website; Documents (DMS, RMS); Records]
QUESTIONS FOR THE PANEL (YOU!)
How could linked data be used in your organisation to create value?
Could linked data techniques be used to connect silo’d data across the enterprise?
What are the risks (e.g. provenance, reliability, accuracy)?
Do we all need to become statisticians to correctly interpret the data?
Where next for linked data (semantic web)?
Thank you!
http://www.linkedin.com/in/stevedalexxx
http://www.flickr.com/photos/stephendale/
http://twitter.com/stephendale stephendale
http://stephendale.net
Steve.dale@gmail.com
http://friendfeed.com/dissident
http://www.delicious.com/stephendale
http://steve-dale.net
http://www.diigo.com/user/stephendale
http://slideshare.net/stephendale