linked data and tools

Post on 23-Jun-2015


DESCRIPTION

American Art Collaborative Planning Grant Educational Briefings Linked Data and Tools Pedro Szekely - USC/Information Sciences Institute September 30, 2014

TRANSCRIPT

Linked Data and Tools

Pedro Szekely

USC/Information Sciences Institute

pszekely@isi.edu, http://isi.edu/~szekely

September 2014

CC-By 2.0

Outline

•  Introduction to linked open data

•  RDF: the Resource Description Framework

•  Tools to convert data to RDF

•  Tools for linking/reconciliation/resolution

•  Storing and maintaining the data

•  Applications

Linked Open Data

The Web of Documents

What We See

What the Computer Sees

blah blah blah blah blah blah blah blah blah blah

Problem

Web pages are machine processable, but not machine understandable.

This makes it impractical to build applications using the data.

Solution

publish the data as Linked Open Data

What Is Linked Data?

A method of publishing structured data so that it can be interlinked and become more useful.

Builds upon standard Web technologies such as HTTP and URIs to share information in a way that can be read automatically by computers (from Wikipedia).

“Linked” Open Data

Crystal Bridges Museum of American Art

Dallas Museum of Art

Indianapolis Museum of Art

The Metropolitan Museum of Art

National Portrait Gallery

Smithsonian American Art Museum

✔  … data is public … in a common format

✖  … but we only have islands of data

Linked Open Data


Linked Data Principles

•  Use URIs as names for things

•  Use HTTP URIs so that people can look up those names

•  When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)

•  Include links to other URIs so that they can discover more things

http://youtu.be/OM6XIICm_qo

http://www.w3.org/DesignIssues/LinkedData.html
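Looking up a name over HTTP usually means asking the server for RDF rather than an HTML page. The following is a minimal sketch of that idea, not a reference client: the helper name build_lookup_request and the list of media types in RDF_ACCEPT are assumptions made for illustration, and the request is only built, not sent.

```python
from urllib.request import Request

# Media types commonly used for RDF; this list and its order are an assumption.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9, application/ld+json;q=0.8"


def build_lookup_request(uri: str) -> Request:
    """Build an HTTP GET request that asks for RDF instead of HTML."""
    if not uri.startswith(("http://", "https://")):
        raise ValueError("Principle 2 asks for HTTP URIs: " + uri)
    return Request(uri, headers={"Accept": RDF_ACCEPT})


request = build_lookup_request(
    "http://dbpedia.org/resource/University_of_Southern_California"
)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
# Sending it would be: urllib.request.urlopen(request).read()
```

Whether a given server honors the Accept header depends on how that server is configured.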

Principle 1: Use URIs as names for things

Principle 2: Use HTTP URIs so that people can look up those names

Can USC Have a URI?

http://dbpedia.org/resource/University_of_Southern_California

Can the Pythagorean Theorem Have a URI?

http://www.freebase.com/m/05r2j

My Dog: Can He Have a URI?

http://szekelys.com/diego

Principle 3: When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)

http://dbpedia.org/resource/University_of_Southern_California

http://www.freebase.com/m/05r2j

http://szekelys.com/diego

Principle 4: Include links to other URIs so that they can discover more things

http://szekelys.com/diego

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpprop: <http://dbpedia.org/property/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix fb: <http://rdf.freebase.com/ns/> .

<http://szekelys.com/diego>
    rdf:type "Dog" ;
    <http://szekelys.com/name> "Diego" ;
    dbpedia-owl:species "Labrador Retriever" ;
    dbpprop:country "Canada" ;
    dbpprop:color "Yellow" ;
    fb:base.petbreeds.dog.gender "Male" .

Linked Data?

Not Linked Data: every value is a string literal, so there are no links to other data.

http://szekelys.com/diego

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpprop: <http://dbpedia.org/property/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix fb: <http://rdf.freebase.com/ns/> .

<http://szekelys.com/diego>
    rdf:type dbpedia:Dog ;
    <http://szekelys.com/name> "Diego" ;
    dbpedia-owl:species dbpedia:Labrador_Retriever ;
    dbpprop:country dbpedia:Canada ;
    dbpprop:color dbpedia:Yellow ;
    fb:base.petbreeds.dog.gender fb:en.male .

Almost Linked Data


http://szekelys.com/diego

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpprop: <http://dbpedia.org/property/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix fb: <http://rdf.freebase.com/ns/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://szekelys.com/diego>
    rdf:type dbpedia:Dog ;
    foaf:name "Diego" ;
    dbpedia-owl:species dbpedia:Labrador_Retriever ;
    dbpprop:country dbpedia:Canada ;
    dbpprop:color dbpedia:Yellow ;
    fb:base.petbreeds.dog.gender fb:en.male .

Linked Data

foaf is a widely used ontology
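The difference between the versions of the Diego description comes down to whether property values are string literals or names of other things. The sketch below makes that check explicit. It assumes a simplified representation in which literals are written with surrounding double quotes; the helper names is_literal and outgoing_links are hypothetical, and only a few of the properties are included.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def is_literal(term: str) -> bool:
    """In this sketch a term written with surrounding double quotes is a literal."""
    return term.startswith('"') and term.endswith('"')


def outgoing_links(triples: List[Triple]) -> List[str]:
    """Return the objects that are names of other things (not string literals)."""
    return [obj for _, _, obj in triples if not is_literal(obj)]


DIEGO = "<http://szekelys.com/diego>"

not_linked = [
    (DIEGO, "rdf:type", '"Dog"'),
    (DIEGO, "dbpedia-owl:species", '"Labrador Retriever"'),
    (DIEGO, "dbpprop:country", '"Canada"'),
]
linked = [
    (DIEGO, "rdf:type", "dbpedia:Dog"),
    (DIEGO, "foaf:name", '"Diego"'),
    (DIEGO, "dbpedia-owl:species", "dbpedia:Labrador_Retriever"),
    (DIEGO, "dbpprop:country", "dbpedia:Canada"),
]

print(outgoing_links(not_linked))  # no links: every value is a literal
print(outgoing_links(linked))
```

A name such as "Diego" is still a literal in the linked version; what changes is that the other values point at resources someone can look up.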

RDF

Resource Description Framework

Intended for representing metadata about Web resources, such as the title, author, and modification date of a Web document

… can also be used to represent information about things that can be identified on the Web, even when they cannot be directly retrieved on the Web

Represent Resources Using URIs

http://szekelys.com/family#pedro

http://xmlns.com/foaf/0.1/firstName

"Pedro"

That guy has first name "Pedro"

Represent Information as Triples

Subject: http://szekelys.com/family#pedro (the resource being described)

Predicate: http://xmlns.com/foaf/0.1/firstName (a property of the resource)

Object: "Pedro" (the value of the property)
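A triple maps naturally onto a three-field record. The sketch below shows one possible way to hold the triple from this slide in Python and write it out as a line of N-Triples. The Triple class and the to_ntriples helper are illustrative names, and the serializer only handles plain string literals as objects.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of a property of the resource
    obj: str        # value of the property (a plain string literal in this sketch)


def to_ntriples(triple: Triple) -> str:
    """Serialize a triple whose object is a plain string literal as one N-Triples line."""
    escaped = triple.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{triple.subject}> <{triple.predicate}> "{escaped}" .'


pedro = Triple(
    subject="http://szekelys.com/family#pedro",
    predicate="http://xmlns.com/foaf/0.1/firstName",
    obj="Pedro",
)
print(to_ntriples(pedro))
```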

Use Namespaces

http://szekelys.com/family#pedro  http://xmlns.com/foaf/0.1/firstName  "Pedro"

http://szekelys.com/family#pedro  foaf:firstName  "Pedro"
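A namespace prefix is only an abbreviation: foaf:firstName stands for the full URI. A minimal sketch of the expansion, assuming a small hand-written prefix table (the PREFIXES dictionary and the expand function are illustrative, not part of any library):

```python
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(name: str, prefixes: dict = PREFIXES) -> str:
    """Expand a prefixed name such as foaf:firstName into a full URI."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name  # already a full URI, or an unknown prefix: leave unchanged


print(expand("foaf:firstName"))
print(expand("http://szekelys.com/family#pedro"))
```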

RDF Graphs

http://szekelys.com/family#pedro  foaf:firstName  "Pedro"

http://szekelys.com/family#pedro  rdf:type  foaf:Person

http://szekelys.com/family#pedro  foaf:homepage  http://isi.edu/~szekely

In this graph:

Real-world objects: http://szekelys.com/family#pedro

Kinds of things: foaf:Person

Literals: "Pedro"

Properties of things: foaf:firstName, rdf:type, foaf:homepage
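An RDF graph is a set of triples, and most questions about it are pattern matches over that set. The sketch below stores the graph from this slide as Python tuples and matches patterns in which None acts as a wildcard. The match function is a hypothetical helper written for illustration; it stands in for what a query language such as SPARQL does at much larger scale.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]

PEDRO = "http://szekelys.com/family#pedro"

graph: Set[Triple] = {
    (PEDRO, "foaf:firstName", '"Pedro"'),
    (PEDRO, "rdf:type", "foaf:Person"),
    (PEDRO, "foaf:homepage", "http://isi.edu/~szekely"),
}


def match(
    triples: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield the triples that agree with every position that is not None."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


# What kind of thing is Pedro?
print(list(match(graph, s=PEDRO, p="rdf:type")))
# Which properties describe Pedro?
print(sorted(predicate for _, predicate, _ in match(graph, s=PEDRO)))
```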

Mix Vocabularies

http://szekelys.com/family#pedro  foaf:firstName  "Pedro"

http://szekelys.com/family#pedro  rdf:type  foaf:Person

http://szekelys.com/family#pedro  foaf:homepage  http://isi.edu/~szekely

http://szekelys.com/family#pedro  schema:spouse  http://szekelys.com/family#claudia

http://szekelys.com/family#claudia  rdf:type  schema:Person

Linked Open Data

Tools to Convert Data to RDF

Steps to Create Linked Open Data

•  Select ontologies … that define classes and properties for our data

•  Convert data to RDF … from the museum database to the ontologies

•  Identify links to other Linked Data datasets … to other museums and Linked Data hubs

CIDOC CRM

•  Select ontologies … that define classes and properties for our data

http://www.cidoc-crm.org/

RDF Mapping Tools

•  Convert data to RDF … from the museum database to the ontologies

Tool | Shortcomings | Benefits
custom code | labor intensive, error prone | flexible
R2RML | difficult to learn, only for SQL databases | W3C standard, good documentation, multiple vendors
RDF Refine | only for tabular data | graphical user interface, support for reconciliation, open source
Karma | | semi-automatic, graphical user interface, supports tabular data, XML and JSON, multiple export formats, R2RML compatible, open source

R2RML

About 6,550 results

R2RML Example

:Table1 rdf:type rr:TriplesMap ;
    rr:logicalTable "Select ('<http:..isbn/' || ISBN || '>') AS isbn, Author, Title, Publisher, Year from book_table" ;
    rr:subjectMap [ rdf:type rr:IRIMap ; rr:column "isbn" ] ;
    rr:propertyObjectMap [ rr:property a:title ; rr:column "Title" ] ;
    rr:propertyObjectMap [ rr:property a:year ; rr:column "Year" ] .

http://ivan-herman.name/2010/11/02/my-first-mapping-from-rdb-to-rdf-using-r2rml/

http://www.w3.org/TR/r2rml/
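The "custom code" alternative to a mapping language can be sketched in a few lines. The example below mirrors the shape of the mapping above: a subject URI built from the ISBN column and one triple per mapped column. It is an illustration under stated assumptions, not R2RML itself: the namespaces ISBN_BASE and TERMS are hypothetical placeholders (the slide elides the real ones), and the sample row is made up.

```python
from typing import Dict, Iterator, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespaces; the slide elides the real ones.
ISBN_BASE = "http://example.org/isbn/"
TERMS = "http://example.org/terms/"

# Column name -> property URI, playing the role of the property/column pairs.
COLUMN_TO_PROPERTY = {"Title": TERMS + "title", "Year": TERMS + "year"}


def row_to_triples(row: Dict[str, str]) -> Iterator[Triple]:
    """Turn one row of a book table into triples about the book."""
    subject = ISBN_BASE + row["ISBN"]  # subject URI built from the key column
    for column, prop in COLUMN_TO_PROPERTY.items():
        value = row.get(column)
        if value:  # skip empty cells
            yield (subject, prop, value)


# Made-up sample row, for illustration only.
book = {"ISBN": "0000000000", "Title": "An Example Book", "Year": "2014", "Author": "A. Writer"}
for triple in row_to_triples(book):
    print(triple)
```

Columns without an entry in COLUMN_TO_PROPERTY, such as Author here, are silently dropped, which is one way hand-written mappings become error prone.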

RDF Refine

http://refine.deri.ie/rdfExportDocs

Karma

https://github.com/InformationIntegrationGroup/Web-Karma

Tools for Linking

Multiple “John Singer Sargent”

ima:SaamPerson_John_Singer_Sargent
    a saam:SaamPerson ;
    dct:date "1856-1925" ;
    foaf:name "John Singer Sargent" .

saam:SaamPerson_4253
    a saam:SaamPerson ;
    saam:associatedPlace
        saam:SaamPlace_1357324439768t1r13950_0,
        saam:SaamPlace_1357324439768t1r13951_0 ;
    saam:constituentId "4253" ;
    rdaGr2:biographicalInformation
        "Painter. Sargent traveled …" ;
    rdaGr2:dateAssociatedWithThePerson "1990-10-1", "1995-5-8" ;
    rdaGr2:dateOfBirth "1856-1-12" ;
    rdaGr2:dateOfDeath "1925-4-15" ;
    rdaGr2:placeOfBirth saam:SaamPlace_1357324439768t1r13952_0 ;
    rdaGr2:placeOfDeath saam:SaamPlace_1357324439768t1r13953_0 ;
    skos:altLabel "John S. Sargent" ;
    skos:prefLabel "John Singer Sargent" .

cb:SaamPerson_John_Singer_Sargent
    a saam:SaamPerson ;
    ont0:dateOfBirth "1879", "1885" ;
    ont0:dateOfDeath "1925" ;
    skos:prefLabel "John Singer Sargent" .

met:SaamPerson_John_Singer_Sargent
    a saam:SaamPerson ;
    ont0:placeOfResidence
        "North and Central America",
        "United States" ;
    foaf:name "John Singer Sargent" .

dallas:SaamPerson_John_Singer_Sargent
    a saam:SaamPerson ;
    ont0:dateOfBirth "1856" ;
    ont0:dateOfDeath "1925" ;
    foaf:name "John Singer Sargent" .


Linking “John Singer Sargent”

saam:SaamPerson_4253
    owl:sameAs cb:SaamPerson_John_Singer_Sargent ;
    owl:sameAs dallas:SaamPerson_John_Singer_Sargent ;
    owl:sameAs ima:SaamPerson_John_Singer_Sargent ;
    owl:sameAs met:SaamPerson_John_Singer_Sargent ;
    owl:sameAs dbpedia:John_Singer_Sargent ;
    owl:sameAs nytimes/N49129220686803623753 ;
    owl:sameAs w-flick/John_Singer_Sargent ;
    ...
    .
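Once owl:sameAs links exist, an application usually wants the groups of identifiers that name the same person. One common way to compute them is union-find, sketched below. The function name same_as_clusters is hypothetical, the first five links are taken from the slide, and the ex:other pair is a made-up extra link added only to show that unrelated identifiers stay in a separate group.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def same_as_clusters(links: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group identifiers connected by owl:sameAs links, using union-find."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)  # merge the two groups

    groups: Dict[str, List[str]] = defaultdict(list)
    for node in parent:
        groups[find(node)].append(node)
    return sorted(sorted(group) for group in groups.values())


links = [
    ("saam:SaamPerson_4253", "cb:SaamPerson_John_Singer_Sargent"),
    ("saam:SaamPerson_4253", "dallas:SaamPerson_John_Singer_Sargent"),
    ("saam:SaamPerson_4253", "ima:SaamPerson_John_Singer_Sargent"),
    ("saam:SaamPerson_4253", "met:SaamPerson_John_Singer_Sargent"),
    ("saam:SaamPerson_4253", "dbpedia:John_Singer_Sargent"),
    ("ex:other_1", "ex:other_2"),  # made-up link, unrelated to Sargent
]
clusters = same_as_clusters(links)
print(len(clusters))     # number of distinct groups
print(len(clusters[0]))  # size of the Sargent group
```

Treating owl:sameAs as transitive is what lets a link to one museum's record reach every other museum's record for the same artist.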

Linking/Reconciliation Tools

Tool | Shortcomings | Benefits
custom code | very difficult | tuned to the data
SILK, LIMES | experimental, poor support | work with RDF, efficient, relatively easy to use
RDF Refine | requires implementing a new reconciliation service | integrated with RDF conversion, user interface for curation
Karma | under development |
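To give a feel for what the "custom code" row involves, here is a minimal sketch of a matching rule for person records: names must be similar and birth years must not conflict. It is not how SILK, LIMES, RDF Refine or Karma work. The function names and the 0.85 threshold are assumptions chosen for illustration, and string similarity comes from Python's standard difflib module.

```python
from difflib import SequenceMatcher
from typing import Optional


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names, ignoring case and extra spaces."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def likely_same_person(
    name_a: str,
    birth_a: Optional[str],
    name_b: str,
    birth_b: Optional[str],
    threshold: float = 0.85,  # assumed cut-off; tune it on real data
) -> bool:
    """Propose a match when names are close and birth years do not conflict."""
    if birth_a and birth_b and birth_a[:4] != birth_b[:4]:
        return False  # both records give a birth year and they disagree
    return name_similarity(name_a, name_b) >= threshold


# Same name, compatible birth years ("1856-1-12" and "1856").
print(likely_same_person("John Singer Sargent", "1856-1-12", "John Singer Sargent", "1856"))
# Same name, conflicting birth years.
print(likely_same_person("John Singer Sargent", "1856-1-12", "John Singer Sargent", "1879"))
```

The second call uses a birth year that one of the museum records for Sargent actually carries ("1879"), so a strict rule like this would reject a correct match. Cases like that are why tuning to the data and a user interface for curation matter.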

SILK

http://wifo5-03.informatik.uni-mannheim.de/bizer/silk

RDF Refine

http://refine.deri.ie/reconciliationDocs

Storing and Maintaining the Data

Storage Options

Technology | Shortcomings | Benefits
SPARQL endpoint | low reliability, esoteric, slow | sophisticated query language
RDF dump | no query capability, esoteric | flexibility: clients can download and use in applications, easy to publish
JSON-LD + ElasticSearch | restricted query language | very high performance, mainstream technology, easy to publish

JSON-LD

{
  "@type": "http://www.cidoc-crm.org/cidoc-crm/E21_Person",
  "@id": "http://americanart.si.edu/data/person-institution/99",
  "P1_is_identified_by": {
    "@type": "http://www.cidoc-crm.org/cidoc-crm/E82_Actor_Appellation",
    "@id": "http://americanart.si.edu/data/person-institution/99/appellation/Birth-or-Maiden-Name",
    "label": "Walter Inglis Anderson",
    "lastname": "Anderson",
    "firstname": "Walter Inglis"
  }
}
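Part of the appeal of JSON-LD is that ordinary JSON tooling can read it. The sketch below parses the document from this slide with Python's standard json module and walks the nested objects, printing one (subject, key, value) statement per property. It is a simplification for illustration: the walk function is a hypothetical helper, keys are left as written, and no JSON-LD context processing or expansion is performed.

```python
import json
from typing import Iterator, Tuple

DOC = """
{
  "@type": "http://www.cidoc-crm.org/cidoc-crm/E21_Person",
  "@id": "http://americanart.si.edu/data/person-institution/99",
  "P1_is_identified_by": {
    "@type": "http://www.cidoc-crm.org/cidoc-crm/E82_Actor_Appellation",
    "@id": "http://americanart.si.edu/data/person-institution/99/appellation/Birth-or-Maiden-Name",
    "label": "Walter Inglis Anderson",
    "lastname": "Anderson",
    "firstname": "Walter Inglis"
  }
}
"""


def walk(node: dict) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject id, key, value) for every property of a nested object."""
    subject = node["@id"]
    for key, value in node.items():
        if key == "@id":
            continue
        if isinstance(value, dict):
            yield (subject, key, value["@id"])  # link to the nested object
            yield from walk(value)
        else:
            yield (subject, key, value)


for statement in walk(json.loads(DOC)):
    print(statement)
```

The same nested structure is what a document store such as ElasticSearch would index, which is why the two are paired in the storage options.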

Applications

we have expanded the reach of linked data within the BBC to more audience facing products and presented our ambitions to using linked data as glue for the plethora of content the BBC produces

http://www.bbc.co.uk/blogs/internet/posts/Linked-Data-new-ontologies-website

http://www.bbc.co.uk/blogs/internet/posts/Linked-Data-Connecting-together-the-BBCs-Online-Content

http://www.bbc.co.uk/blogs/internet/posts/Opening-up-the-BBCs-Linked-Data

Thanks for your attention! Questions?