Data-mining the Semantic Web and spatially visualising the results
Data Visualization for the Arts and Humanities
Queen’s University Belfast, 5-6 March 2015
1 of 47 | @flynam @bilusaurus | Data-mining the Semantic Web and spatially visualising the results | Data Visualization for the Arts and Humanities
Workshop overview
• Day 1 : Data-mining
– Open Data
– Linked Data
– Linked Open Data implementation
– Semantic Web and ontologies
– Hands-on practicals
Workshop overview
• Day 2 : Data visualisation
– Data visualisation concepts introduction
– Web maps and geo-tagging
– Hands-on practical
– Interpretations
– Hermeneutic circle
From the horse’s mouth
(source: www.ted.com/talks/tim_berners_lee_on_the_next_web)
Terminology
• Open Access
• Open Data
• Big Data
• The web of data
• The Semantic Web
• Linked Data
• data mining
Asking questions of digital datasets
Terminology
Open Access
Terminology
Design by Julie Beck for the Harvard University Neuroinformatics dept (source: www.juliebcreative.com/portfolio/open-data-logo/)
http://linkedarc.net/surveys/arch-datasharing
Linked Data
Terminology
The linkages between the major Linked Data datasets (source: lod-cloud.net)
Big Data
Terminology
Wordle of terms associated with Big Data activity (source: sfdata.startupweekend.org)
5 Stars of Open Data
put your data online under an open license
make it structured (e.g. as an Excel file)
use non-proprietary formats (e.g. XML and not Excel)
use URIs to identify resources
link your data to external datasets
The RDF Triple
A Triple Example
‘…the boy’s name is Tom…’
subject
predicate
object
Triple Linking
‘…Tom is short for Thomas…’
subject
predicate
object
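The two example sentences can be sketched as data. This is a minimal illustration, not a real RDF library: the `Triple` type, the `outgoing` and `walk` helpers and the plain-string node names are all assumptions made for the example (in real RDF the linking node would be a URI, not a bare string). It shows how the object of one triple becomes the subject of the next, which is what turns a set of triples into a graph.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# The two statements from the slides, written as plain strings for illustration.
TRIPLES = [
    Triple("the boy", "name", "Tom"),
    Triple("Tom", "short for", "Thomas"),
]


def outgoing(node, triples):
    """Return every triple whose subject is the given node."""
    return [t for t in triples if t.subject == node]


def walk(start, triples):
    """Follow links breadth-first from start and return the nodes in visit order."""
    path = [start]
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop(0)
        for t in outgoing(node, triples):
            if t.obj not in seen:
                seen.add(t.obj)
                path.append(t.obj)
                frontier.append(t.obj)
    return path


print(walk("the boy", TRIPLES))
```

Because "Tom" is the object of the first triple and the subject of the second, the walk reaches "Thomas" from "the boy" even though no single statement connects them.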
Graph data
Serialising RDF
• Turtle
• JSON
• RDF/XML
• N-Triples
RDF Turtle
@base <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rel: <http://www.perceive.net/schemas/relationship/> .

<green-goblin>
    rel:enemyOf <spiderman> ;
    a foaf:Person ;    # in the context of the Marvel universe
    foaf:name "Green Goblin" .

<spiderman>
    rel:enemyOf <green-goblin> ;
    a foaf:Person ;
    foaf:name "Spiderman", "Человек-паук"@ru .
As N-Triples
<http://example.org/green-goblin> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/spiderman> .
<http://example.org/green-goblin> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/green-goblin> <http://xmlns.com/foaf/0.1/name> "Green Goblin" .
<http://example.org/spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/green-goblin> .
<http://example.org/spiderman> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/spiderman> <http://xmlns.com/foaf/0.1/name> "Spiderman" .
<http://example.org/spiderman> <http://xmlns.com/foaf/0.1/name> "\u0427\u0435\u043B\u043E\u0432\u0435\u043A-\u043F\u0430\u0443\u043A"@ru .
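As a rough sketch of how one N-Triples line is put together, the snippet below writes a single statement by hand using only the Python standard library. The function names `escape_literal` and `to_ntriples_line` are invented for this example, and the escaping covers only the common cases; a real project would more likely use an RDF library. Non-ASCII characters are written as code-point escapes (one escape per character, not per byte), which is one permitted way to keep the output ASCII-only.

```python
def escape_literal(text):
    # Escape a literal for N-Triples: backslash, quote, common control
    # characters, and any non-ASCII character as a code-point escape.
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif 0x20 <= cp < 0x7F:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)
        else:
            out.append("\\U%08X" % cp)
    return "".join(out)


def to_ntriples_line(subject, predicate, obj, literal=False, lang=None):
    # Subjects and predicates are full URIs; the object is either a URI
    # or a (possibly language-tagged) literal.
    if literal:
        rendered = '"%s"' % escape_literal(obj)
        if lang:
            rendered += "@" + lang
    else:
        rendered = "<%s>" % obj
    return "<%s> <%s> %s ." % (subject, predicate, rendered)


line = to_ntriples_line(
    "http://example.org/spiderman",
    "http://xmlns.com/foaf/0.1/name",
    "Человек-паук",
    literal=True,
    lang="ru",
)
print(line)
```

Each Cyrillic letter becomes exactly one four-digit escape, so the twelve-character name yields eleven escapes plus the unescaped hyphen.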
As JSON
{
  "http://example.org/green-goblin": {
    "http://www.perceive.net/schemas/relationship/enemyOf": [
      {"type": "uri", "value": "http://example.org/spiderman"}
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Green Goblin"}
    ]
  },
  "http://example.org/spiderman": {
    "http://www.perceive.net/schemas/relationship/enemyOf": [
      {"type": "uri", "value": "http://example.org/green-goblin"}
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Spiderman"},
      {"type": "literal", "value": "\u0427\u0435\u043b\u043e\u0432\u0435\u043a-\u043f\u0430\u0443\u043a", "lang": "ru"}
    ]
  }
}
As RDF/XML
<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:ns0="http://www.perceive.net/schemas/relationship/">

  <foaf:Person rdf:about="http://example.org/green-goblin">
    <ns0:enemyOf>
      <foaf:Person rdf:about="http://example.org/spiderman">
        <ns0:enemyOf rdf:resource="http://example.org/green-goblin"/>
        <foaf:name>Spiderman</foaf:name>
        <foaf:name xml:lang="ru">Человек-паук</foaf:name>
      </foaf:Person>
    </ns0:enemyOf>
    <foaf:name>Green Goblin</foaf:name>
  </foaf:Person>

</rdf:RDF>
Visualised as a Graph
Triplestores and Infrastructure
Practical: Making RDF
http://www.franklynam.com/blog.aspx?id=85
Q: Create RDF representations of yourself and your relationships
The Semantic Web and Ontologies
The stages of the Web (source: urenio.org)
Ontological Classes and Properties
The British Museum data mapping onto the CIDOC CRM (source: confluence.ontotext.com/display/ResearchSpace/BM+Mapping)
The CIDOC CRM basic entity types and their relationships (source: www.cidoc-crm.org/)
Vocabularies
Graph data
Minna Sundberg (source: www.sssscomic.com/comic.php?page=196)
Querying using SPARQL
SELECT *
WHERE {
?s ?p ?o
} LIMIT 10
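A query like the one above is normally sent to a SPARQL endpoint over HTTP. The sketch below, using only the Python standard library, builds such a request and reads a result set in the standard SPARQL JSON results layout. It makes no network call: the helper names are invented for this example, the endpoint is simply one that appears elsewhere in these slides, and `SAMPLE_RESPONSE` is a hand-written stand-in for what a server might return, not real output.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

ENDPOINT = "http://vocab.getty.edu/sparql"  # any SPARQL endpoint would do here
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint, query):
    # The SPARQL protocol accepts the query text in a 'query' URL parameter.
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows_from_results(results_json):
    # Flatten the JSON results format into one {variable: value} dict per row.
    data = json.loads(results_json)
    variables = data["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in data["results"]["bindings"]
    ]


# Hand-written illustrative response; a live endpoint would return its own data.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["s", "p", "o"]},
    "results": {"bindings": [{
        "s": {"type": "uri", "value": "http://example.org/spiderman"},
        "p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/name"},
        "o": {"type": "literal", "value": "Spiderman"},
    }]},
})

request = build_sparql_request(ENDPOINT, QUERY)
rows = rows_from_results(SAMPLE_RESPONSE)

# Decode the URL again to confirm the query survived encoding intact.
print(parse_qs(urlsplit(request.full_url).query)["query"][0])
print(rows)
```

To run it against a live endpoint, the request object could be passed to `urllib.request.urlopen` and the response body handed to `rows_from_results`.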
More complex SPARQL
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX letters1916: <http://letters1916.linkedarc.net/ontology/>
PREFIX letters1916data: <http://letters1916.linkedarc.net/data/>
PREFIX schema: <http://schema.org/>

SELECT DISTINCT ?letter ?letterName ?recipientPostalAddressName ?recipientLongitude ?recipientLatitude
WHERE {
  # 1: letters and the postal address they were sent to
  ?letter rdf:type letters1916:Letter ;
          schema:name ?letterName ;
          letters1916:recipientLocation ?recipientPostalAddress .

  # 2: keep only addresses whose region mentions Galway (case-insensitive)
  ?recipientPostalAddress schema:addressRegion ?recipientPostalAddressRegion ;
                          schema:name ?recipientPostalAddressName .
  FILTER regex(?recipientPostalAddressRegion, 'Galway', 'i')

  # 3: the place at that address and its geo-coordinates
  ?recipientPlace schema:address ?recipientPostalAddress ;
                  schema:geo ?recipientGeoCoordinates .
  ?recipientGeoCoordinates schema:longitude ?recipientLongitude ;
                           schema:latitude ?recipientLatitude
}
Practical: Universities on DBpedia
http://www.franklynam.com/blog.aspx?id=86
Q: Get a list of all of the universities that DBpedia knows about
SKOS
@prefix dct: <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix cc: <http://creativecommons.org/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://linkedarc.net/vocabs/vessel-jar> a skos:Concept ;
    cc:license <http://creativecommons.org/licenses/by/3.0> ;
    cc:attributionURL <http://linkedarc.net> ;
    cc:attributionName "linkedarc.net" ;
    skos:inScheme <http://linkedarc.net/vocabs> ;
    skos:prefLabel "Jar" ;
    skos:scopeNote "A jar concept. Pottery. This isn't a great scope note." ;
    dct:publisher <http://linkedarc.net> ;
    dct:identifier <http://linkedarc.net/vocabs/vessel-jar> ;
    dct:issued "2015-02-23"^^xsd:date ;
    skos:exactMatch <http://purl.org/heritagedata/schemes/mda_obj/concepts/97609> .
SPARQL + FILTER
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  ?s rdfs:label ?label .
  FILTER langMatches(lang(?label), "en")
}
SPARQL + FILTER
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  ?s rdfs:label ?label .
  FILTER langMatches(lang(?label), "en") .
  FILTER regex(?label, "bell", "i")
}
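To make the two filters concrete, here is a small Python approximation of what they test, applied to a hand-made list of labels. The `lang_matches` helper and the sample labels are assumptions for illustration only, and Python's `re` module is not identical to the regular-expression dialect SPARQL uses, though simple patterns like this one behave the same way.

```python
import re


def lang_matches(tag, lang_range):
    # Approximates SPARQL langMatches: case-insensitive, "en" matches "en"
    # and sub-tags such as "en-GB", and "*" matches any non-empty tag.
    tag = tag.lower()
    lang_range = lang_range.lower()
    if lang_range == "*":
        return tag != ""
    return tag == lang_range or tag.startswith(lang_range + "-")


# Invented (label, language tag) pairs; an empty tag means an untagged literal.
LABELS = [
    ("Bell", "en"),
    ("bellows", "en-GB"),
    ("cloche", "fr"),
    ("Drum", "en"),
    ("Campbell tartan", ""),
]

matches = [
    text
    for text, lang in LABELS
    if lang_matches(lang, "en") and re.search("bell", text, re.IGNORECASE)
]
print(matches)
```

Note that the untagged label is dropped by the language filter even though it contains "bell", which is a common surprise when querying datasets that do not tag their literals.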
SPARQL + FILTER
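PREFIX dct: <http://purl.org/dc/terms/>
SELECT * WHERE {
  ?s dct:dateCreated ?dateCreated .
  # compares against a plain string; for typed dates use "1900-01-01"^^xsd:date
  FILTER (?dateCreated > '1900-01-01')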
}
Practical: Getty Concepts
Q: Get all of the Getty URIs that represent concepts related to amphorae
SPARQL endpoint: http://vocab.getty.edu/sparql
Practical: British Museum Sarcophagi
Q: Get the find spots of all of the sarcophagi in the British Museum collection
SPARQL endpoint: http://collection.britishmuseum.org/sparql
Geo-coding the Find Spots
with Google Refine
The Google Maps API
Address String
Geo-coordinates as JSON
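The address-to-coordinates step can be sketched in Python as follows. This assumes the current Google Geocoding web service URL and its `address` and `key` parameters; the helper names, the placeholder API key and the sample response (including its coordinate values) are illustrative only and are not taken from the workshop. No request is actually sent.

```python
import json
from urllib.parse import urlencode

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def geocode_url(address, api_key):
    # Build the request URL for one address string.
    return GEOCODE_URL + "?" + urlencode({"address": address, "key": api_key})


def extract_lat_lng(response_text):
    # Return (lat, lng) from the first result, or None if nothing was found.
    data = json.loads(response_text)
    if data.get("status") != "OK" or not data.get("results"):
        return None
    location = data["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Trimmed, hand-written response in the service's JSON layout (values illustrative).
SAMPLE_RESPONSE = json.dumps({
    "status": "OK",
    "results": [{
        "formatted_address": "Galway, Ireland",
        "geometry": {"location": {"lat": 53.27, "lng": -9.05}},
    }],
})

print(geocode_url("Galway, Ireland", "YOUR_API_KEY"))
print(extract_lat_lng(SAMPLE_RESPONSE))
```

Checking the `status` field before reading the results avoids an error when an address string cannot be matched.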
Export as CSV
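For readers working outside a spreadsheet-style tool, the export step might look like the following Python sketch. The column names and the two rows are made up for illustration; the point is only that a CSV writer takes care of quoting values that themselves contain commas.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    # Serialise a list of dicts to CSV text, header row first.
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical geo-coded find spots (coordinate values are illustrative).
FIND_SPOTS = [
    {"find_spot": "Galway, Ireland", "latitude": 53.27, "longitude": -9.05},
    {"find_spot": "Example Place", "latitude": 10.5, "longitude": 20.25},
]

csv_text = rows_to_csv(FIND_SPOTS, ["find_spot", "latitude", "longitude"])
print(csv_text)
```

The resulting text can be written to a file and loaded directly by most web-mapping tools that accept latitude and longitude columns.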
Practical: data.cso.ie
Q: Get the employment figures generated by the 2011 Irish census by region
SPARQL endpoint: http://data.cso.ie/query.html
Practical: Nomisma and Ancient Coins
Q: Get the geo-coordinates of all of the coin hoards stored in the Nomisma triplestore
SPARQL endpoint: http://nomisma.org/sparql
Additional Linked Data Resources
http://www.franklynam.com/blog.aspx?id=89
Thank you!
Martin Lemay (source: twitter.com/martinlemay)