data-mining the semantic web

Post on 20-Jul-2015

Data-mining the Semantic Web and spatially visualising the results

Data Visualization for the Arts and Humanities

Queen’s University Belfast, 5-6 March 2015

1 of 47 | @flynam @bilusaurus | Data-mining the Semantic Web and spatially visualising the results | Data Visualization for the Arts and Humanities

Workshop overview

• Day 1 : Data-mining

– Open Data

– Linked Data

– Linked Open Data implementation

– Semantic Web and ontologies

– Hands-on practicals

Workshop overview

• Day 2 : Data visualisation

– Data visualisation concepts introduction

– Web maps and geo-tagging

– Hands-on practical

– Interpretations

– Hermeneutic circle

From the horse’s mouth

(source: www.ted.com/talks/tim_berners_lee_on_the_next_web)

Terminology

Open Access

Open Data

Big Data

The web of data

The Semantic Web

Linked Data

data mining

Asking questions of digital datasets

Terminology

Open Access

Terminology

Design by Julie Beck for the Harvard University Neuroinformatics dept (source: www.juliebcreative.com/portfolio/open-data-logo/)

http://linkedarc.net/surveys/arch-datasharing

Linked Data

Terminology

The linkages between the major Linked Data datasets (source: lod-cloud.net)

Big Data

Terminology

Wordle of terms associated with Big Data activity (source: sfdata.startupweekend.org)

5 Stars of Open Data

put your data online under an open license

make it structured (e.g. as an Excel file)

use non-proprietary formats (e.g. XML and not Excel)

use URIs to identify resources

link your data to external datasets

The RDF Triple

A Triple Example

‘…the boy’s name is Tom…’

subject

predicate

object

Triple Linking

‘…Tom is short for Thomas…’

subject

predicate

object
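
The two example sentences can be sketched as data. This is a minimal illustration in Python, assuming plain strings stand in for the URIs a real dataset would use; the names `name` and `shortFor` are made-up predicates, not terms from any published vocabulary.

```python
# Each triple is a (subject, predicate, object) tuple.
# All identifiers here are illustrative placeholders, not real vocabulary terms.
triples = [
    ("the boy", "name", "Tom"),
    ("Tom", "shortFor", "Thomas"),
]


def triples_about(graph, subject):
    """Return every triple whose subject is the given node."""
    return [t for t in graph if t[0] == subject]


# Linking: the object of the first triple is the subject of the second,
# so following it leads from "the boy" to "Thomas".
first = triples_about(triples, "the boy")[0]
linked = triples_about(triples, first[2])
print(linked)
```

Because the object of one statement can be the subject of another, a set of triples naturally forms a graph rather than a table.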

Graph data

Serialising RDF

• Turtle

• JSON

• RDF/XML

• N-Triples

RDF Turtle

@base <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rel: <http://www.perceive.net/schemas/relationship/> .

<green-goblin>
    rel:enemyOf <spiderman> ;
    a foaf:Person ;    # in the context of the Marvel universe
    foaf:name "Green Goblin" .

<spiderman>
    rel:enemyOf <green-goblin> ;
    a foaf:Person ;
    foaf:name "Spiderman", "Человек-паук"@ru .

As N-Triples

<http://example.org/green-goblin> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/spiderman> .
<http://example.org/green-goblin> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/green-goblin> <http://xmlns.com/foaf/0.1/name> "Green Goblin" .
<http://example.org/spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/green-goblin> .
<http://example.org/spiderman> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/spiderman> <http://xmlns.com/foaf/0.1/name> "Spiderman" .
<http://example.org/spiderman> <http://xmlns.com/foaf/0.1/name> "\u0427\u0435\u043B\u043E\u0432\u0435\u043A-\u043F\u0430\u0443\u043A"@ru .
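
N-Triples writes one full triple per line, with no prefixes. A rough Python sketch of how a language-tagged literal might be serialised follows, using only the standard library. The helper names `nt_literal` and `nt_triple` are invented for illustration. The point to note is that a non-ASCII character is escaped by its Unicode code point, not by its individual UTF-8 bytes.

```python
def nt_literal(text, lang=None):
    # Escape one Python string as an N-Triples literal, optionally language-tagged.
    simple = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}
    parts = []
    for ch in text:
        code = ord(ch)
        if ch in simple:
            parts.append(simple[ch])
        elif code < 0x80:
            parts.append(ch)
        elif code <= 0xFFFF:
            # Escape the code point itself, not its UTF-8 byte sequence.
            parts.append("\\u%04X" % code)
        else:
            parts.append("\\U%08X" % code)
    literal = '"' + "".join(parts) + '"'
    return literal + "@" + lang if lang else literal


def nt_triple(subject, predicate, obj):
    # subject and predicate are IRIs; obj is an already-serialised term.
    return "<%s> <%s> %s ." % (subject, predicate, obj)


line = nt_triple(
    "http://example.org/spiderman",
    "http://xmlns.com/foaf/0.1/name",
    nt_literal("Человек-паук", "ru"),
)
print(line)
```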

As JSON

{
  "http://example.org/green-goblin": {
    "http://www.perceive.net/schemas/relationship/enemyOf": [
      {"type": "uri", "value": "http://example.org/spiderman"}
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Green Goblin"}
    ]
  },
  "http://example.org/spiderman": {
    "http://www.perceive.net/schemas/relationship/enemyOf": [
      {"type": "uri", "value": "http://example.org/green-goblin"}
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Spiderman"},
      {"type": "literal", "value": "\u0427\u0435\u043b\u043e\u0432\u0435\u043a-\u043f\u0430\u0443\u043a", "lang": "ru"}
    ]
  }
}
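
The JSON form nests subject, then predicate, then a list of objects. As a sketch of how that nesting maps back onto triples, the Python below flattens a small fragment in the same layout using the standard `json` module. The function name `to_triples` and the four-element tuple shape are choices made for this illustration only.

```python
import json

# A fragment in the same subject -> predicate -> [objects] layout.
# The raw string keeps the \u escapes intact so that json.loads decodes them.
RDF_JSON = r'''
{
  "http://example.org/spiderman": {
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Spiderman"},
      {"type": "literal", "value": "\u0427\u0435\u043b\u043e\u0432\u0435\u043a-\u043f\u0430\u0443\u043a", "lang": "ru"}
    ]
  }
}
'''


def to_triples(doc):
    """Flatten the nested layout into (subject, predicate, value, lang) tuples."""
    result = []
    for subject, predicates in doc.items():
        for predicate, objects in predicates.items():
            for obj in objects:
                result.append((subject, predicate, obj["value"], obj.get("lang")))
    return result


triples = to_triples(json.loads(RDF_JSON))
for triple in triples:
    print(triple)
```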

As RDF/XML

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:ns0="http://www.perceive.net/schemas/relationship/">
  <foaf:Person rdf:about="http://example.org/green-goblin">
    <ns0:enemyOf>
      <foaf:Person rdf:about="http://example.org/spiderman">
        <ns0:enemyOf rdf:resource="http://example.org/green-goblin"/>
        <foaf:name>Spiderman</foaf:name>
        <foaf:name xml:lang="ru">Человек-паук</foaf:name>
      </foaf:Person>
    </ns0:enemyOf>
    <foaf:name>Green Goblin</foaf:name>
  </foaf:Person>
</rdf:RDF>

Visualised as a Graph

Triplestores and Infrastructure

Practical: Making RDF

http://www.franklynam.com/blog.aspx?id=85

Q: Create RDF representations of yourself and your relationships

The Semantic Web and Ontologies

The stages of the Web (source: urenio.org)

Ontological Classes and Properties

The British Museum data mapping onto the CIDOC CRM (source: confluence.ontotext.com/display/ResearchSpace/BM+Mapping)

The CIDOC CRM basic entity types and their relationships (source: www.cidoc-crm.org/)

Vocabularies

Graph data

Minna Sundberg (source: www.sssscomic.com/comic.php?page=196)

Querying using SPARQL

SELECT *

WHERE {

?s ?p ?o

} LIMIT 10
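
A query like this is usually sent to an endpoint over HTTP. The Python sketch below builds such a request with the standard library, following the SPARQL protocol convention of a `query` parameter on a GET request. The endpoint URL is a placeholder, and nothing is sent over the network until `urlopen` is called.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: replace with a real SPARQL endpoint before running a query.
ENDPOINT = "http://example.org/sparql"

QUERY = """
SELECT *
WHERE {
  ?s ?p ?o
} LIMIT 10
"""


def build_request(endpoint, query):
    """Build a GET request that asks for results in the SPARQL JSON format."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.full_url)
# To execute against a live endpoint:
#   from urllib.request import urlopen
#   body = urlopen(request).read()
```

Some endpoints choose the result format from a URL parameter instead of the `Accept` header, so check the documentation of the endpoint you are querying.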

More complex SPARQL

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX letters1916: <http://letters1916.linkedarc.net/ontology/>
PREFIX letters1916data: <http://letters1916.linkedarc.net/data/>
PREFIX schema: <http://schema.org/>

SELECT DISTINCT ?letter ?letterName ?recipientPostalAddressName ?recipientLongitude ?recipientLatitude
WHERE {
    ?letter rdf:type letters1916:Letter ;
        schema:name ?letterName ;
        letters1916:recipientLocation ?recipientPostalAddress .

    ?recipientPostalAddress schema:addressRegion ?recipientPostalAddressRegion .
    FILTER regex(?recipientPostalAddressRegion, 'Galway', 'i')
    ?recipientPostalAddress schema:name ?recipientPostalAddressName .

    ?recipientPlace schema:address ?recipientPostalAddress ;
        schema:geo ?recipientGeoCoordinates .

    ?recipientGeoCoordinates schema:longitude ?recipientLongitude ;
        schema:latitude ?recipientLatitude
}

Practical: Universities on DBpedia

http://www.franklynam.com/blog.aspx?id=86

Q: Get a list of all of the universities that DBpedia knows about

SKOS

@prefix dct: <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix cc: <http://creativecommons.org/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://linkedarc.net/vocabs/vessel-jar> a skos:Concept ;
    cc:license <http://creativecommons.org/licenses/by/3.0> ;
    cc:attributionURL <http://linkedarc.net> ;
    cc:attributionName "linkedarc.net" ;
    skos:inScheme <http://linkedarc.net/vocabs> ;
    skos:prefLabel "Jar" ;
    skos:scopeNote "A jar concept. Pottery. This isn’t a great scope note." ;
    dct:publisher <http://linkedarc.net> ;
    dct:identifier <http://linkedarc.net/vocabs/vessel-jar> ;
    dct:issued "2015-02-23"^^xsd:date ;
    skos:exactMatch <http://purl.org/heritagedata/schemes/mda_obj/concepts/97609> .

SPARQL + FILTER

SELECT * WHERE {

?s rdfs:label ?label .

FILTER langMatches(lang(?label), "en")

}

SPARQL + FILTER

SELECT * WHERE {

?s rdfs:label ?label .

FILTER langMatches(lang(?label), "en") .

FILTER regex(?label, "bell", "i")

}
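
To see what these two filters keep and discard, here is a rough Python imitation run over a handful of made-up labels. The `lang_matches` helper is written for this sketch and follows the basic language-range rule: the tag either equals the range or extends it with a hyphen. The `"i"` flag corresponds to a case-insensitive regular-expression search.

```python
import re

# Made-up (label, language tag) pairs; an empty tag means no language was given.
labels = [
    ("Bell", "en"),
    ("bellows", "en-GB"),
    ("Glocke", "de"),
    ("Campbell", "en"),
    ("bell", ""),
]


def lang_matches(tag, lang_range):
    """Basic language-range matching: 'en' matches 'en' and 'en-GB', not 'de' or ''."""
    if lang_range == "*":
        return tag != ""
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


hits = [
    text
    for text, tag in labels
    if lang_matches(tag, "en") and re.search("bell", text, re.IGNORECASE)
]
print(hits)
```

Note that the regular expression matches anywhere in the label, so "Campbell" is kept, while the untagged "bell" is dropped by the language filter.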

SPARQL + FILTER

SELECT * WHERE {

?s dct:dateCreated ?dateCreated .

FILTER (?dateCreated > '1900-01-01')

}
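
The same comparison, sketched in Python with made-up records, shows the intent of the filter: keep only items created after 1 January 1900. Whether a plain string such as '1900-01-01' is compared as a date in SPARQL depends on how the values are typed in the store; a typed literal such as "1900-01-01"^^xsd:date is commonly used to make that explicit.

```python
from datetime import date

# Made-up (item, creation date) records for illustration only.
records = [
    ("item-a", "1899-12-31"),
    ("item-b", "1900-01-01"),
    ("item-c", "1916-04-24"),
]

cutoff = date(1900, 1, 1)

# Parse each ISO 8601 string into a date before comparing, so the
# comparison is between dates and not between arbitrary strings.
created_after = [
    name for name, created in records if date.fromisoformat(created) > cutoff
]
print(created_after)
```

The comparison is strict, so an item dated exactly on the cutoff is excluded.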

Practical: Getty Concepts

Q: Get all of the Getty URIs that represent concepts related to amphorae

SPARQL endpoint: http://vocab.getty.edu/sparql

Practical: British Museum Sarcophagi

Q: Get the find spots of all of the sarcophagi in the British Museum collection

SPARQL endpoint: http://collection.britishmuseum.org/sparql

Geo-coding the Find Spots

with Google Refine

The Google Maps API

Address String

Geo-coordinates as JSON
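
The step from address string to coordinates ends with a JSON document that has to be unpacked. The Python sketch below reads a hard-coded sample shaped like a geocoding response, with `results`, `geometry` and `location` holding `lat` and `lng`. The sample values are placeholders written for this example, not real API output, and no request is made.

```python
import json

# Illustrative sample only: the coordinates are placeholders, not real API output.
SAMPLE_RESPONSE = """
{
  "status": "OK",
  "results": [
    {
      "formatted_address": "Belfast, UK",
      "geometry": {"location": {"lat": 54.597, "lng": -5.93}}
    }
  ]
}
"""


def extract_coordinates(response_text):
    """Return (lat, lng) of the first result, or None when nothing was found."""
    data = json.loads(response_text)
    if data.get("status") != "OK" or not data.get("results"):
        return None
    location = data["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


coords = extract_coordinates(SAMPLE_RESPONSE)
print(coords)
```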

Export as CSV
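
Outside a tool like Refine, the same export can be sketched with Python's standard `csv` module. The rows and column names below are hypothetical placeholders. The writer takes care of quoting any value that itself contains a comma.

```python
import csv
import io

# Hypothetical rows: (find spot, latitude, longitude); values are placeholders.
rows = [
    ("Place A", 10.5, 20.25),
    ("Place B, Region", -3.0, 45.0),
]


def to_csv(data):
    """Write a header line plus one line per row and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["find_spot", "latitude", "longitude"])
    writer.writerows(data)
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```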

Practical: data.cso.ie

Q: Get the employment figures generated by the 2011 Irish census by region

SPARQL endpoint: http://data.cso.ie/query.html

Practical: Nomisma and Ancient Coins

Q: Get the geo-coordinates of all of the coin hoards stored in the Nomisma triplestore

SPARQL endpoint: http://nomisma.org/sparql

Additional Linked Data Resources

http://www.franklynam.com/blog.aspx?id=89

Thank you!

Martin Lemay (source: twitter.com/martinlemay)