linked dataworkshopintro14aug2014
DESCRIPTION
Linked Data workshop for Archives Hub contributors. An introduction to Linked Data concepts, including entities, URIs, RDF, and use of Open Refine for name matching.
TRANSCRIPT
Linked Data: a practical approach
Wellcome Institute, 14 August 2014
Adrian Stevenson and Jane Stevenson
“Linked Data is Storytelling for computers. It doesn’t have the full richness, complexity and nuance that we invest in our narratives, but it does at least help computers to fit all the bits together in meaningful ways.”
Linked Data workshop
• Entities and Identities
• Documents and Data
• URIs and Connections
• Triples
• Data Creation
• RDF Graphs and the Archives Hub Graph
• Vocabularies
• Locah: our experience of creating RDF
• Connecting datasets
• Demonstration websites
• Name Matching and Demo of Open Refine
• Linking Lives interface
• Calm and Linked Data
• Round up and close
Beatrice Webb
Martha Beatrice Webb, 1858-1943, social reformer
Martha Beatrice Webb, 1858-1943, social reformer
is the creator of some archive collections
Each of these is about an archive collection
Each of these is a document
Each document has lots of useful information
Each is formatted so a human reader can understand it
But let’s give each document an identifier that works on the Web…
The Web works with http://
http://archiveshub.ac.uk/data/gb394we
http://archiveshub.ac.uk/data/gb227msda865.w4
http://archiveshub.ac.uk/data/gb0097collmisc0241
http://archiveshub.ac.uk/data/gb0097sr1100
http://archiveshub.ac.uk/data/gb0097webblocalgovernment
http://archiveshub.ac.uk/data/gb0097collmisc0243
http://archiveshub.ac.uk/data/gb0097collmisc0242
http://archiveshub.ac.uk/data/gb0097passfield
http://archiveshub.ac.uk/data/gb0097webbtradeunion
http://archiveshub.ac.uk/data/gb227msda865.w4
Martha Beatrice Webb, 1858-1943, social reformer
http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer
http://data.archiveshub.ac.uk/id/person/86607236
or….
http://viaf.org/viaf/86607236/
or….
Now we can make the statement:
http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer
<creator-of>
http://data.archiveshub.ac.uk/id/archivalresource/gb394-we
Martha Beatrice Webb
is the creator of the archive
Beatrice Webb: A summer holiday in Scotland, 1884
…identifiers for the Web (for a machine) …labels for humans
<creator-of> is the creator of the archive
George Bernard Shaw Diaries
…identifiers for the Web (for a machine) …labels for humans
George Bernard Shaw, 1856-1950, playwright
http://archiveshub.ac.uk/id/archivalresource/gb0097sr0293
http://data.archiveshub.ac.uk/id/person/ncarules/shawgeorgebernard1856-1950irishdramatistcriticandnovelist
We can start to say things about relationships…
http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer
<knew>
http://data.archiveshub.ac.uk/id/person/ncarules/shawgeorgebernard1856-1950irishdramatistcriticandnovelist
We can start to say things that go beyond what is known within our own space…
http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer
<is the same as>
http://viaf.org/viaf/86607236/
We can start to find different sources about the same person…
<is the same as>
http://viaf.org/viaf/121884166/
http://data.archiveshub.ac.uk/id/person/ncarules/shawgeorgebernard1856-1950irishdramatistcriticandnovelist
http://dbpedia.org/page/George_Bernard_Shaw
<is the same as>
We can put these ideas together…
http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer
<knew>
http://data.archiveshub.ac.uk/id/person/ncarules/shawgeorgebernard1856-1950irishdramatistcriticandnovelist
http://dbpedia.org/page/George_Bernard_Shaw
<also known as>
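The statements built up on the slides above can be sketched in a few lines of Python. This is only an illustration, not how the Archives Hub stores its data: the predicate strings `creator-of`, `knew` and `is-the-same-as` are the informal labels from the slides, and the helper `same_as` is a hypothetical name.

```python
# Informal triples from the slides, written as (subject, predicate, object).
WEBB = "http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer"
SHAW = "http://data.archiveshub.ac.uk/id/person/ncarules/shawgeorgebernard1856-1950irishdramatistcriticandnovelist"

triples = [
    (WEBB, "creator-of", "http://data.archiveshub.ac.uk/id/archivalresource/gb394-we"),
    (WEBB, "knew", SHAW),
    (WEBB, "is-the-same-as", "http://viaf.org/viaf/86607236/"),
    (SHAW, "is-the-same-as", "http://dbpedia.org/page/George_Bernard_Shaw"),
]


def same_as(uri, statements):
    """Return every identifier asserted to be the same as `uri` (hypothetical helper)."""
    return [o for s, p, o in statements if s == uri and p == "is-the-same-as"]


# Who did Webb know, and where else is that person described?
for s, p, o in triples:
    if s == WEBB and p == "knew":
        print(o, "->", same_as(o, triples))
```

Because every person is identified by a URI, following a `knew` statement and then an `is-the-same-as` statement takes us from an Archives Hub record to a description held elsewhere.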
GRAPHS AND DATA MODELLING
Archival Resource Person
Created
Subject: Archival Resource
Predicate: CreatedBy
Object: Person
Subject > Predicate > Object
CreatedBy
Triple statement
CREATING AN ARCHIVE DESCRIPTION
Archival Resource Start date
Subject: Archival Resource
Predicate: dateCreated
Object: Start date
Subject > Predicate > Object
dateCreated
Triple statement
Archival Resource
biographical history
has
Beatrice Webb (1858-1943), née Potter, social reformer and diarist. Married to Sidney Webb, pioneers of social science. She was involved in many spheres of political and social activity including the Labour Party, Fabianism, social observation, investigations into poverty, development of socialism, the foundation of the National Health Service and post war welfare state, the London School of Economics, and the New Statesman.
has
http://archiveshub.ac.uk/data/gb227msda865.w4
http://data.archiveshub.ac.uk/id/archivalresource/gb394-we
http://lexvo.org/id/iso639-3/eng
isLanguageOf
Subject: Archival Resource
Predicate: hasLanguage
Object: Language
Subject > Predicate > Object
hasLanguage
Triple statement
Archival Resource Repository
Archival Record
describedBy
heldAt
encodedAs
EAD document
Title
has
An RDF Graph
[Diagram: the Archives Hub data model as an RDF graph. Classes shown: ArchivalResource, Finding Aid, EAD Document, Biographical History, Agent (Family, Person, Org), Repository (Agent), Place, Concept (Genre, Function), ConceptScheme, Book, Object, Language, Level, Extent, PostcodeUnit, Creation, Birth, Death, TemporalEntity. Properties shown: locatedIn, maintainedBy/maintains, origination, associatedWith, accessProvidedBy/providesAccessTo, topic/page, hasPart/partOf, encodedAs/encodes, administeredBy/administers, hasBiogHist/isBiogHistFor, foaf:focus, level, language, inScheme, representedBy, extent, participates in, at time, product of, in.]
Subject
Archival Resource ‘Creator’
?
Vocabularies
“You share vocabularies, so that other people (and computers) know when you’re talking about the same sorts of things. You share identifiers, so that other people (and computers) know that you’re talking about a specific person, place, object or whatever.”
Tim Sherratt, Web Developer and Digital Historian, Australia
http://archiveshub.ac.uk/locah/2011/03/describing-the-things-the-rdf-terms-used-part-2/
http://data.archiveshub.ac.uk/id/archivalresource/gb394-we
http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer
Subject: Archival Resource
Predicate: ‘origination’
Object: Person
Subject > Predicate > Object
archiveshub:origination
Triple statement
Archival Resource
biographical history
has
Beatrice Webb (1858-1943), née Potter, social reformer and diarist. Married to Sidney Webb, pioneers of social science. She was involved in many spheres of political and social activity including the Labour Party, Fabianism, social observation, investigations into poverty, development of socialism, the foundation of the National Health Service and post war welfare state, the London School of Economics, and the New Statesman.
archiveshub:hasBiographicalHistory
http://data.archiveshub.ac.uk/id/archivalresource/gb394-we
http://data.archiveshub.ac.uk/id/archivalresource/gb394-we
http://lexvo.org/id/iso639-3/eng
Subject: Archival Resource
Predicate: dcterms:language
Object: Language
Subject > Predicate > Object
dcterms:language
Triple statement
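Prefixed names such as `dcterms:language` are shorthand for full URIs. The sketch below shows the expansion in Python. The `dcterms` namespace is the standard Dublin Core Terms one; the `archiveshub` namespace URI is an assumption made for illustration, and `expand` is a hypothetical helper name.

```python
# Namespace prefixes. dcterms is the standard Dublin Core Terms namespace;
# the archiveshub namespace below is an ASSUMPTION made for illustration.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "archiveshub": "http://data.archiveshub.ac.uk/def/",  # assumed
}


def expand(curie):
    """Turn a prefixed name such as 'dcterms:language' into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


resource = "http://data.archiveshub.ac.uk/id/archivalresource/gb394-we"
triple = (resource, expand("dcterms:language"), "http://lexvo.org/id/iso639-3/eng")
print(triple)
```

Using a shared term like `dcterms:language` rather than a local one means other datasets and tools can recognise the statement without knowing anything about the Archives Hub.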
CONNECTING DATASETS
Linking Datasets
• If something is identified, it can be linked to
• We can then take items from one dataset and link them to items from other datasets
BBC
VIAF
DBpedia
Archives Hub
Copac
GeoNames
“Humans, presented with pieces of information about people, put things into the form of a story.” (Edward Ayers)
“even isolated and inert pieces of evidence – a list, a letter, a map, a picture – can assume new and unimagined meanings when placed in juxtaposition with other fragments.” (Edward Ayers)
“You use a glass mirror to see your face; you use works of art to see your soul”
http://archiveshub.ac.uk/blog/2013/08/hub-viaf-namematching/
historywall.nma.gov.au
wraggelabs.com/shed/presentations/anzsi
USING OPEN REFINE
Workshop resources at http://data.archiveshub.ac.uk/workshops/wellcome2014/
Workshop Resources
• Workshop resources available from:
http://data.archiveshub.ac.uk/workshops/wellcome2014/
owl:sameAs
Archives Hub Person
owl:sameAs
VIAF Person
owl:sameAs
<http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer>
owl:sameAs
<http://viaf.org/viaf/86607236> .
Matching Tools
• LOD Refine
– http://code.zemanta.com/sparkica/download.html
• SILK Framework
– http://wifo5-03.informatik.uni-mannheim.de/bizer/silk/#workbench
• Module 3 at http://euclid-project.eu/ is good for use of Open Refine and SILK
LOD Refine
• Install files available from:
– Mac: http://data.archiveshub.ac.uk/workshops/wellcome2014/Mac.zip
– Windows: http://data.archiveshub.ac.uk/workshops/wellcome2014/Windows.zip
– Direct: http://code.zemanta.com/sparkica/download.html
• Install LOD Refine, run it and then in a web browser go to http://localhost:3333/
LOD Refine
• Download the example matching file from:
– http://data.archiveshub.ac.uk/workshops/wellcome2014/Matching_Sample.csv
• In LOD Refine go to ‘Create Project’ and import the Matching_Sample.csv data.
Name Concatenation
• To concatenate the FamilyName, GivenName and Dates, add a new column:
– Click the down arrow to the left of ‘?Dates’ and select ‘Edit Column’ > ‘Add Column Based on this Column’
– Name the new column, e.g. ‘ConcatName’
– Use the following GREL expression:
• cells["?FamilyName"].value + ", " + cells["?GivenName"].value + ", " + cells["?Dates"].value
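For anyone who wants to check the result outside LOD Refine, the same concatenation can be sketched in Python. The column names come from the GREL expression above; the single sample row is made up for illustration and is not taken from Matching_Sample.csv.

```python
import csv
import io

# A one-row stand-in for the CSV file (illustrative data, not the real sample file).
sample = io.StringIO(
    "?FamilyName,?GivenName,?Dates\n"
    "Webb,Martha Beatrice,1858-1943\n"
)

rows = list(csv.DictReader(sample))
for row in rows:
    # Same result as the GREL expression: the three values joined with ", ".
    row["ConcatName"] = ", ".join(
        [row["?FamilyName"], row["?GivenName"], row["?Dates"]]
    )

print(rows[0]["ConcatName"])
```

The concatenated form resembles the way names are usually written in authority files, which gives the reconciliation step more to match on than the family name alone.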
Reconcile to VIAF
• Info on Roderick Page’s VIAF reconciliation service at:
– http://iphylo.blogspot.co.uk/2013/04/reconciling-author-names-using-open.html
• Add the VIAF reconciliation service by clicking the down arrow on the concatenated name column and selecting ‘Reconcile’ > ‘Start reconciling’
• Add the URI for the VIAF reconciliation service:
– http://iphylo.org/~rpage/phyloinformatics/services/reconciliation_viaf.php
• Start Reconciling!
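Behind the ‘Start reconciling’ button, Refine sends the names to the service as a batch of queries. The sketch below builds such a request in Python without sending it. It assumes the service accepts the usual Refine reconciliation convention of a `queries` parameter holding a JSON object; `build_reconcile_url` is a hypothetical helper, and the exact options this particular service supports are not confirmed here.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

SERVICE = "http://iphylo.org/~rpage/phyloinformatics/services/reconciliation_viaf.php"


def build_reconcile_url(names, limit=3):
    """Build a batch reconciliation URL (assumed 'queries' convention)."""
    queries = {
        "q{}".format(i): {"query": name, "limit": limit}
        for i, name in enumerate(names)
    }
    return SERVICE + "?" + urlencode({"queries": json.dumps(queries)})


url = build_reconcile_url(["Webb, Martha Beatrice, 1858-1943"])

# Decode the parameter again to show what the service would receive.
payload = json.loads(parse_qs(urlsplit(url).query)["queries"][0])
print(payload)
```

Each key (`q0`, `q1`, and so on) identifies one name in the batch, so the candidates that come back can be paired with the rows that asked for them.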
VIAF Reconciliation
• Facet the reconciliation results by judgement
• Confirm the matched and unmatched data as required
• Possibly create another column for e.g. SKOS close matches or ‘isLikes’
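Faceting by judgement simply groups the rows by the outcome of reconciliation. A minimal Python sketch of the idea follows. The rows are invented for illustration, and the judgement values `matched` and `none` are assumed to follow Refine's own spelling (`judgment`).

```python
from collections import defaultdict

# Illustrative reconciliation outcomes (made-up rows, assumed judgment values).
results = [
    {"name": "Webb, Martha Beatrice, 1858-1943", "judgment": "matched"},
    {"name": "Shaw, George Bernard, 1856-1950", "judgment": "matched"},
    {"name": "Example, Person, 1900-1950", "judgment": "none"},
]

# Group names by judgment, as the facet panel does.
facet = defaultdict(list)
for result in results:
    facet[result["judgment"]].append(result["name"])

for judgment, names in sorted(facet.items()):
    print(judgment, len(names))
```

Rows in the `none` group are the ones that need a human decision before any link is published.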
Create VIAF URI Column
• Select the reconciled column’s dropdown menu > Edit column > Add column based on this column
• Give the column a name and add the GREL expression:
– "http://viaf.org/viaf/"+cell.recon.match.id
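The GREL expression prepends the VIAF base URI to the identifier of the confirmed match. A Python equivalent is sketched below. `viaf_uri` is a hypothetical helper, and returning `None` for unmatched cells is a choice made for this sketch, not something the GREL expression does itself.

```python
VIAF_BASE = "http://viaf.org/viaf/"


def viaf_uri(match_id):
    """Build a VIAF URI from a match id; None when there is no confirmed match."""
    return VIAF_BASE + match_id if match_id else None


print(viaf_uri("86607236"))
print(viaf_uri(None))
```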
Export the VIAF Triples
• Edit the RDF skeleton to include the columns to be matched and link them using the owl:sameAs property.
• Check the preview
• Export the RDF as Turtle or RDF/XML as required.
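The exported Turtle is just one owl:sameAs statement per matched row. The sketch below writes that by hand in Python to show the shape of the output. It is not what LOD Refine's RDF export does internally, and `to_turtle` is a hypothetical helper.

```python
def to_turtle(pairs):
    """Write (Archives Hub URI, VIAF URI) pairs as owl:sameAs statements in Turtle."""
    lines = ["@prefix owl: <http://www.w3.org/2002/07/owl#> .", ""]
    for hub_uri, viaf_uri in pairs:
        lines.append("<{}>".format(hub_uri))
        lines.append("    owl:sameAs <{}> .".format(viaf_uri))
    return "\n".join(lines) + "\n"


pairs = [
    (
        "http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer",
        "http://viaf.org/viaf/86607236",
    ),
]
doc = to_turtle(pairs)
print(doc)
```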
How we created the tabular data
http://wraggelabs.com/shed/presentations/anzsi/
What we need is a data framework that sits beneath the text, identifying people, dates and places, and defining relationships between them and our documentary sources. A framework that computers could understand and interpret, so that if they saw something they knew was a placename they could head off and look for other people associated with that place. Instead of just presenting our research we’d be creating a whole series of points of connection, discovery and aggregation. (Tim Sherratt)
…this is the goal of Linked Data.
http://archiveshub.ac.uk
http://archiveshub.ac.uk/blog
http://archiveshub.ac.uk/locah
http://data.archiveshub.ac.uk/linkinglives
This presentation is available under Creative Commons Non Commercial-Share Alike: http://creativecommons.org/licenses/by-nc/2.0/uk/