linked data and tools

Linked Data and Tools Pedro Szekely USC/Information Sciences Institute [email protected], http://isi.edu/~szekely September 2014 CC-By 2.0

Upload: american-art-collaborative

Post on 23-Jun-2015


DESCRIPTION

American Art Collaborative Planning Grant Educational Briefings Linked Data and Tools Pedro Szekely - USC/Information Sciences Institute September 30, 2014

TRANSCRIPT

Page 1: Linked Data and Tools

Linked Data and Tools Pedro Szekely

USC/Information Sciences Institute [email protected], http://isi.edu/~szekely

September 2014

CC-By 2.0

Page 2: Linked Data and Tools

Outline

•  Introduction to linked open data

•  RDF: the Resource Description Framework

•  Tools to convert data to RDF

•  Tools for linking/reconciliation/resolution

•  Storing and maintaining the data

•  Applications


Page 3: Linked Data and Tools

Linked Open Data

Page 4: Linked Data and Tools

The Web of Documents


Page 5: Linked Data and Tools

What We See


Page 6: Linked Data and Tools

What the Computer Sees

blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah …

[the whole page rendered as paragraphs of repeated “blah”]

Page 7: Linked Data and Tools

Problem

Web pages are machine processable, but not machine understandable

Impractical for building applications using the data

Page 8: Linked Data and Tools

Solution

publish the data as Linked Open Data


Page 9: Linked Data and Tools

What Is Linked Data?

A method of publishing structured data so that it can be interlinked and become more useful

Builds upon standard Web technologies such as HTTP and URIs to share information in a way that can be read automatically by computers

(from Wikipedia)

Page 10: Linked Data and Tools

“Linked” Open Data

Crystal Bridges Museum of American Art
Dallas Museum of Art
Indianapolis Museum of Art
The Metropolitan Museum of Art
National Portrait Gallery
Smithsonian American Art Museum

Page 11: Linked Data and Tools

“Linked” Open Data

Crystal Bridges Museum of American Art
Dallas Museum of Art
Indianapolis Museum of Art
The Metropolitan Museum of Art
National Portrait Gallery
Smithsonian American Art Museum

✔ … data is public … in a common format

✖ … but we only have islands of data

Page 12: Linked Data and Tools

Linked Open Data


Page 13: Linked Data and Tools

Linked Data Principles

•  Use URIs as names for things

•  Use HTTP URIs so that people can look up those names

•  When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)

•  Include links to other URIs so that they can discover more things

http://youtu.be/OM6XIICm_qo

http://www.w3.org/DesignIssues/LinkedData.html

Page 14: Linked Data and Tools

Principle 1: Use URIs as names for things

Principle 2: Use HTTP URIs so that people can look up those names

Page 15: Linked Data and Tools

Can USC Have a URI?


Page 16: Linked Data and Tools

http://dbpedia.org/resource/University_of_Southern_California


Page 17: Linked Data and Tools

Can the Pythagorean Theorem Have a URI?

Page 18: Linked Data and Tools

http://www.freebase.com/m/05r2j


Page 19: Linked Data and Tools

My Dog: Can He Have a URI?


Page 20: Linked Data and Tools

http://szekelys.com/diego


Page 21: Linked Data and Tools

Principle 3: When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
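In practice, "looking up" a URI is an HTTP GET, and a client can ask for RDF rather than a web page by sending an Accept header. The following is a minimal sketch under stated assumptions: the helper name build_lookup_request is hypothetical, text/turtle is just one possible RDF media type, and the request is only constructed here, not sent.

```python
from urllib.request import Request

def build_lookup_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build (but do not send) an HTTP GET that asks for RDF instead of HTML."""
    # The Accept header tells the server which representation the client prefers.
    return Request(uri, headers={"Accept": media_type}, method="GET")

request = build_lookup_request(
    "http://dbpedia.org/resource/University_of_Southern_California"
)
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the lookup; which representation comes back is up to the server.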


Page 22: Linked Data and Tools

http://dbpedia.org/resource/University_of_Southern_California

Page 23: Linked Data and Tools

http://www.freebase.com/m/05r2j

Page 24: Linked Data and Tools

http://szekelys.com/diego

Principle 3: When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)

Page 25: Linked Data and Tools

Principle 4: Include links to other URIs so that they can discover more things

Page 26: Linked Data and Tools

http://szekelys.com/diego

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpprop: <http://dbpedia.org/property/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix fb: <http://rdf.freebase.com/ns/> .

<http://szekelys.com/diego>
    rdf:type "Dog" ;
    <http://szekelys.com/name> "Diego" ;
    dbpedia-owl:species "Labrador Retriever" ;
    dbpprop:country "Canada" ;
    dbpprop:color "Yellow" ;
    fb:base.petbreeds.dog.gender "Male" .

Linked Data?

Page 27: Linked Data and Tools

http://szekelys.com/diego

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpprop: <http://dbpedia.org/property/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix fb: <http://rdf.freebase.com/ns/> .

<http://szekelys.com/diego>
    rdf:type "Dog" ;
    <http://szekelys.com/name> "Diego" ;
    dbpedia-owl:species "Labrador Retriever" ;
    dbpprop:country "Canada" ;
    dbpprop:color "Yellow" ;
    fb:base.petbreeds.dog.gender "Male" .

Not Linked Data

Page 28: Linked Data and Tools

http://szekelys.com/diego

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpprop: <http://dbpedia.org/property/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix fb: <http://rdf.freebase.com/ns/> .

<http://szekelys.com/diego>
    rdf:type dbpedia:Dog ;
    <http://szekelys.com/name> "Diego" ;
    dbpedia-owl:species dbpedia:Labrador_Retriever ;
    dbpprop:country dbpedia:Canada ;
    dbpprop:color dbpedia:Yellow ;
    fb:base.petbreeds.dog.gender fb:en.male .

Almost Linked Data

Page 30: Linked Data and Tools

http://szekelys.com/diego

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpprop: <http://dbpedia.org/property/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix fb: <http://rdf.freebase.com/ns/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://szekelys.com/diego>
    rdf:type dbpedia:Dog ;
    foaf:name "Diego" ;
    dbpedia-owl:species dbpedia:Labrador_Retriever ;
    dbpprop:country dbpedia:Canada ;
    dbpprop:color dbpedia:Yellow ;
    fb:base.petbreeds.dog.gender fb:en.male .

Linked Data

foaf is a widely used ontology
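The difference between the "Not Linked Data" and "Linked Data" versions of the Diego description can be checked mechanically: a value that is a string gives a client nothing to follow, while a value that is a URI does. The sketch below is illustrative only; the Term type and the helper names are hypothetical, and the description is abbreviated to three properties.

```python
from typing import Dict, List, NamedTuple

class Term(NamedTuple):
    value: str
    is_uri: bool  # True for a URI reference, False for a plain literal

def uri(value: str) -> Term:
    return Term(value, True)

def literal(value: str) -> Term:
    return Term(value, False)

DBPEDIA = "http://dbpedia.org/resource/"

# Property -> value for http://szekelys.com/diego, abbreviated.
as_strings: Dict[str, Term] = {
    "rdf:type": literal("Dog"),
    "dbpedia-owl:species": literal("Labrador Retriever"),
    "foaf:name": literal("Diego"),
}
as_links: Dict[str, Term] = {
    "rdf:type": uri(DBPEDIA + "Dog"),
    "dbpedia-owl:species": uri(DBPEDIA + "Labrador_Retriever"),
    "foaf:name": literal("Diego"),  # a name is legitimately a literal
}

def outgoing_links(description: Dict[str, Term]) -> List[str]:
    """Return the URIs a client could follow to discover more things."""
    return [term.value for term in description.values() if term.is_uri]

print(outgoing_links(as_strings))  # no links to follow
print(outgoing_links(as_links))    # two DBpedia URIs
```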


Page 31: Linked Data and Tools

RDF

Page 32: Linked Data and Tools

Resource Description Framework

Intended for representing metadata about Web resources, such as the title, author, and modification date of a Web document

… can also be used to represent information about things that can be identified on the Web, even when they cannot be directly retrieved on the Web

Page 33: Linked Data and Tools

Represent Resources Using URIs

http://szekelys.com/family#pedro

http://xmlns.com/foaf/0.1/firstName

“Pedro”

That guy has first name “Pedro”

Page 34: Linked Data and Tools

Represent Information as Triples

Subject: http://szekelys.com/family#pedro (the resource being described)

Predicate: http://xmlns.com/foaf/0.1/firstName (a property of the resource)

Object: “Pedro” (the value of the property)
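A triple maps naturally onto a three-field record. The sketch below is one possible representation, not a standard API: the Triple class and its to_ntriples method are hypothetical names, and the object is always treated as a plain string literal.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # a property of the resource
    obj: str        # the value of the property (a plain literal here)

    def to_ntriples(self) -> str:
        """Serialize as one N-Triples line, escaping backslashes and quotes."""
        escaped = self.obj.replace("\\", "\\\\").replace('"', '\\"')
        return f'<{self.subject}> <{self.predicate}> "{escaped}" .'

pedro = Triple(
    "http://szekelys.com/family#pedro",
    "http://xmlns.com/foaf/0.1/firstName",
    "Pedro",
)
print(pedro.to_ntriples())
```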


Page 35: Linked Data and Tools

Use Namespaces

http://szekelys.com/family#pedro  foaf:firstName  “Pedro”

is shorthand for

http://szekelys.com/family#pedro  http://xmlns.com/foaf/0.1/firstName  “Pedro”
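Expanding a prefixed name is a plain string substitution. A minimal sketch, assuming a small hand-written prefix table (the PREFIXES dictionary and the expand function are hypothetical helpers, not part of any RDF library):

```python
from typing import Dict

PREFIXES: Dict[str, str] = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}

def expand(name: str, prefixes: Dict[str, str] = PREFIXES) -> str:
    """Expand 'prefix:local' to a full URI; leave anything else unchanged."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name

print(expand("foaf:firstName"))
print(expand("http://szekelys.com/family#pedro"))  # already a full URI
```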


Page 36: Linked Data and Tools

RDF Graphs

http://szekelys.com/family#pedro  foaf:firstName  “Pedro”

http://szekelys.com/family#pedro  rdf:type  foaf:Person

http://szekelys.com/family#pedro  foaf:homepage  http://isi.edu/~szekely

Page 37: Linked Data and Tools

RDF Graphs

http://szekelys.com/family#pedro  foaf:firstName  “Pedro”

http://szekelys.com/family#pedro  rdf:type  foaf:Person

http://szekelys.com/family#pedro  foaf:homepage  http://isi.edu/~szekely

Callouts on the graph: real world objects, kinds of things, literals, properties of things

Page 38: Linked Data and Tools

Mix Vocabularies

http://szekelys.com/family#pedro

“Pedro”  foaf:firstName

foaf:Person  rdf:type

http://isi.edu/~szekely  foaf:homepage

schema:Person  rdf:type

http://szekelys.com/family#claudia  schema:spouse
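An RDF graph is a set of triples, and the simplest query is a pattern with wildcards. The sketch below assumes one reading of the diagram, in which every edge starts at the pedro node; the transcript does not preserve which node each edge leaves from. The match function is a hypothetical helper, and literals are written with their quotes to tell them apart from URIs.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]

PEDRO = "http://szekelys.com/family#pedro"
CLAUDIA = "http://szekelys.com/family#claudia"

# Assumed reading of the diagram: every edge starts at the pedro node.
graph: Set[Triple] = {
    (PEDRO, "foaf:firstName", '"Pedro"'),
    (PEDRO, "rdf:type", "foaf:Person"),
    (PEDRO, "rdf:type", "schema:Person"),
    (PEDRO, "foaf:homepage", "http://isi.edu/~szekely"),
    (PEDRO, "schema:spouse", CLAUDIA),
}

def match(
    graph: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    pattern = (s, p, o)
    for triple in graph:
        if all(want is None or want == have for want, have in zip(pattern, triple)):
            yield triple

# Both vocabularies (foaf and schema) answer the same question.
types = sorted(o for _, _, o in match(graph, s=PEDRO, p="rdf:type"))
print(types)
```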


Page 39: Linked Data and Tools

Linked Open Data


Page 40: Linked Data and Tools

Tools to Convert Data to RDF

Page 41: Linked Data and Tools

Steps to Create Linked Open Data

•  Select ontologies … that define classes and properties for our data

•  Convert data to RDF … from the museum database to the ontologies

•  Identify links to other Linked Data datasets … to other museums and Linked Data hubs

Page 42: Linked Data and Tools

CIDOC CRM

•  Select ontologies … that define classes and properties for our data

http://www.cidoc-crm.org/

Page 43: Linked Data and Tools


•  Select ontologies … that define classes and properties for our data

•  Convert data to RDF … from the museum database to the ontologies

Page 44: Linked Data and Tools

RDF Mapping Tools

Tool: custom code
Shortcomings: labor intensive, error prone
Benefits: flexible

Tool: R2RML
Shortcomings: difficult to learn, only for SQL databases
Benefits: W3C standard, good documentation, multiple vendors

Tool: RDF Refine
Shortcomings: only for tabular data
Benefits: graphical user interface, support for reconciliation, open source

Tool: Karma
Benefits: semi-automatic, graphical user interface, supports tabular data, XML and JSON, multiple export formats, R2RML compatible, open source

Page 45: Linked Data and Tools

R2RML

About 6,550 results

Page 46: Linked Data and Tools

R2RML Example

:Table1 rdf:type rr:TriplesMap ;
    rr:logicalTable "Select ('<http:..isbn/' || ISBN || '>') AS isbn,
                     Author, Title, Publisher, Year from book_table";
    rr:subjectMap [ rdf:type rr:IRIMap ; rr:column "isbn" ] ;
    rr:propertyObjectMap [ rr:property a:title ; rr:column "Title" ; ] ;
    rr:propertyObjectMap [ rr:property a:year ; rr:column "Year" ; ] ;

http://ivan-herman.name/2010/11/02/my-first-mapping-from-rdb-to-rdf-using-r2rml/

http://www.w3.org/TR/r2rml/
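For comparison, the "custom code" route can be sketched in a few lines: read rows from a table and emit one triple per mapped column. This is an illustration under assumptions, not an R2RML processor. The BASE and NS URIs are hypothetical placeholders (the slide elides the real base URI and does not define the a: prefix), and the sample row is made up.

```python
import sqlite3
from typing import Iterator

BASE = "http://example.org/isbn/"  # hypothetical subject base URI
NS = "http://example.org/ns#"      # hypothetical stand-in for the a: prefix

def literal(value: object) -> str:
    """Format a value as a quoted N-Triples literal."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'

def rows_to_ntriples(conn: sqlite3.Connection) -> Iterator[str]:
    """Emit a title triple and a year triple for every row of book_table."""
    query = "SELECT ISBN, Title, Year FROM book_table ORDER BY ISBN"
    for isbn, title, year in conn.execute(query):
        subject = f"<{BASE}{isbn}>"
        yield f"{subject} <{NS}title> {literal(title)} ."
        yield f"{subject} <{NS}year> {literal(year)} ."

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book_table "
    "(ISBN TEXT, Author TEXT, Title TEXT, Publisher TEXT, Year INTEGER)"
)
# Made-up sample row, for illustration only.
conn.execute(
    "INSERT INTO book_table VALUES (?, ?, ?, ?, ?)",
    ("0000000000", "An Author", "A Title", "A Publisher", 2000),
)
lines = list(rows_to_ntriples(conn))
print("\n".join(lines))
```

The mapping logic lives in code rather than in a declarative document, which is the flexibility and the maintenance cost that the tool comparison attributes to custom code.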

Page 47: Linked Data and Tools

RDF Refine

http://refine.deri.ie/rdfExportDocs

Page 48: Linked Data and Tools

Karma

https://github.com/InformationIntegrationGroup/Web-Karma

Page 49: Linked Data and Tools

Tools for Linking

Page 50: Linked Data and Tools

Multiple “John Singer Sargent”

ima:SaamPerson_John_Singer_Sargent
    a saam:SaamPerson ;
    dct:date "1856-1925" ;
    foaf:name "John Singer Sargent" .

saam:SaamPerson_4253
    a saam:SaamPerson ;
    saam:associatedPlace
        saam:SaamPlace_1357324439768t1r13950_0,
        saam:SaamPlace_1357324439768t1r13951_0 ;
    saam:constituentId "4253" ;
    rdaGr2:biographicalInformation
        "Painter. Sargent traveled …" ;
    rdaGr2:dateAssociatedWithThePerson "1990-10-1", "1995-5-8" ;
    rdaGr2:dateOfBirth "1856-1-12" ;
    rdaGr2:dateOfDeath "1925-4-15" ;
    rdaGr2:placeOfBirth saam:SaamPlace_1357324439768t1r13952_0 ;
    rdaGr2:placeOfDeath saam:SaamPlace_1357324439768t1r13953_0 ;
    skos:altLabel "John S. Sargent" ;
    skos:prefLabel "John Singer Sargent" .

cb:SaamPerson_John_Singer_Sargent
    a saam:SaamPerson ;
    ont0:dateOfBirth "1879", "1885" ;
    ont0:dateOfDeath "1925" ;
    skos:prefLabel "John Singer Sargent" .

met:SaamPerson_John_Singer_Sargent
    a saam:SaamPerson ;
    ont0:placeOfResidence
        "North and Central America",
        "United States" ;
    foaf:name "John Singer Sargent" .

dallas:SaamPerson_John_Singer_Sargent
    a saam:SaamPerson ;
    ont0:dateOfBirth "1856" ;
    ont0:dateOfDeath "1925" ;
    foaf:name "John Singer Sargent" .
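Deciding whether such records describe the same person usually combines several weak signals. The sketch below is only an illustration of that idea, using name similarity from the Python standard library and a birth-year comparison. The function names are hypothetical, and no particular reconciliation tool is claimed to work this way.

```python
from difflib import SequenceMatcher
from typing import Optional, Set

def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def birth_years_agree(a: Set[int], b: Set[int]) -> Optional[bool]:
    """True if any year is shared, False on conflict, None if either side has no data."""
    if not a or not b:
        return None
    return bool(a & b)

# Birth years as they appear in the records above; met lists none.
saam: Set[int] = {1856}
dallas: Set[int] = {1856}
cb: Set[int] = {1879, 1885}
met: Set[int] = set()

print(name_similarity("John S. Sargent", "John Singer Sargent"))
print(birth_years_agree(saam, dallas))  # same year on both sides
print(birth_years_agree(saam, cb))      # conflicting years
print(birth_years_agree(saam, met))     # nothing to compare
```

The cb record shows why names alone are not enough: the label matches exactly, yet the birth years conflict, so a curator or a tuned rule has to decide.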


Page 51: Linked Data and Tools

[the same five records as on page 50, all describing one person]

John Singer Sargent

Page 52: Linked Data and Tools

Linking “John Singer Sargent”

saam:SaamPerson_4253
    owl:sameAs cb:SaamPerson_John_Singer_Sargent ;
    owl:sameAs dallas:SaamPerson_John_Singer_Sargent ;
    owl:sameAs ima:SaamPerson_John_Singer_Sargent ;
    owl:sameAs met:SaamPerson_John_Singer_Sargent ;
    owl:sameAs dbpedia:John_Singer_Sargent ;
    owl:sameAs nytimes/N49129220686803623753 ;
    owl:sameAs w-flick/John_Singer_Sargent ;
    ... .
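Because owl:sameAs is symmetric and transitive, links stated from one record imply that all the linked records identify the same entity. A minimal sketch of computing those groups with a union-find structure follows; same_as_clusters is a hypothetical helper, and the ex: pair is a made-up unrelated link added only to show that separate groups stay separate.

```python
from typing import Dict, Iterable, List, Set, Tuple

def same_as_clusters(pairs: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group identifiers connected by sameAs links (union-find with path halving)."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the path while walking up
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters: Dict[str, Set[str]] = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())

hub = "saam:SaamPerson_4253"
pairs = [
    (hub, "cb:SaamPerson_John_Singer_Sargent"),
    (hub, "dallas:SaamPerson_John_Singer_Sargent"),
    (hub, "ima:SaamPerson_John_Singer_Sargent"),
    (hub, "met:SaamPerson_John_Singer_Sargent"),
    (hub, "dbpedia:John_Singer_Sargent"),
    ("ex:a", "ex:b"),  # hypothetical unrelated link
]
clusters = same_as_clusters(pairs)
print(sorted(len(c) for c in clusters))
```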


Page 53: Linked Data and Tools

Linking/Reconciliation Tools

Tool: custom code
Shortcomings: very difficult
Benefits: tuned to the data

Tool: SILK, LIMES
Shortcomings: experimental, poor support
Benefits: work with RDF, efficient, relatively easy to use

Tool: RDF Refine
Shortcomings: requires implementing a new reconciliation service
Benefits: integrated with RDF conversion, user interface for curation

Tool: Karma (under development)

Page 54: Linked Data and Tools

SILK

http://wifo5-03.informatik.uni-mannheim.de/bizer/silk

Page 55: Linked Data and Tools

RDF Refine

http://refine.deri.ie/reconciliationDocs

Page 56: Linked Data and Tools

Storing and Maintaining the Data

Page 57: Linked Data and Tools

Storage Options

Technology: SPARQL endpoint
Shortcomings: low reliability, esoteric, slow
Benefits: sophisticated query language

Technology: RDF dump
Shortcomings: no query capability, esoteric
Benefits: flexibility: clients can download and use in applications, easy to publish

Technology: JSON-LD + ElasticSearch
Shortcomings: restricted query language
Benefits: very high performance, mainstream technology, easy to publish

Page 58: Linked Data and Tools

JSON-LD

{
  "@type": "http://www.cidoc-crm.org/cidoc-crm/E21_Person",
  "@id": "http://americanart.si.edu/data/person-institution/99",
  "P1_is_identified_by": {
    "@type": "http://www.cidoc-crm.org/cidoc-crm/E82_Actor_Appellation",
    "@id": "http://americanart.si.edu/data/person-institution/99/appellation/Birth-or-Maiden-Name",
    "label": " Walter Inglis Anderson",
    "lastname": "Anderson",
    "firstname": "Walter Inglis"
  }
}
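Part of the appeal of JSON-LD is that it is still ordinary JSON, so mainstream tooling can read it without any RDF library. The sketch below parses the document above with Python's json module. It does not perform JSON-LD processing such as context expansion, and the variable names are illustrative.

```python
import json

DOC = """
{
  "@type": "http://www.cidoc-crm.org/cidoc-crm/E21_Person",
  "@id": "http://americanart.si.edu/data/person-institution/99",
  "P1_is_identified_by": {
    "@type": "http://www.cidoc-crm.org/cidoc-crm/E82_Actor_Appellation",
    "@id": "http://americanart.si.edu/data/person-institution/99/appellation/Birth-or-Maiden-Name",
    "label": " Walter Inglis Anderson",
    "lastname": "Anderson",
    "firstname": "Walter Inglis"
  }
}
"""

person = json.loads(DOC)
appellation = person["P1_is_identified_by"]

# "@id" is the URI naming the resource; "@type" is its class in the ontology.
print(person["@id"])
print(person["@type"])
# The label carries a leading space in the source data, so strip it for display.
print(appellation["label"].strip())
```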

Page 59: Linked Data and Tools


Page 60: Linked Data and Tools

Applications

Page 61: Linked Data and Tools

we have expanded the reach of linked data within the BBC to more audience facing products and presented our ambitions to using linked data as glue for the plethora of content the BBC produces

http://www.bbc.co.uk/blogs/internet/posts/Linked-Data-new-ontologies-website

http://www.bbc.co.uk/blogs/internet/posts/Linked-Data-Connecting-together-the-BBCs-Online-Content

http://www.bbc.co.uk/blogs/internet/posts/Opening-up-the-BBCs-Linked-Data

Page 62: Linked Data and Tools


Page 63: Linked Data and Tools


Page 64: Linked Data and Tools


Page 65: Linked Data and Tools

Thanks for your attention! Questions?