linking library data
DESCRIPTION
Slides accompanying the Linking Library Data workshop at the European Libraries Automation Group conference 2011.
TRANSCRIPT
Linking Library Data
ELAG 2011 Workshop
Jindřich Mynarz @jindrichmynarz
linked data is sooo 2009
Workshop
• Introduction
  o Motivation
  o Involved technologies
• Discussion
  o Key questions
  o Potential issues
• Practical linking
Shared document: bit.ly/linking-library-data
Twitter hashtag: #elag2011
conversion
lots about
linking
little about
library links
raw data
linked data
Key technologies
• URIs
• RDF
• SPARQL
• Linked data
URIs
• Uniform Resource Identifiers
  o <http://example.com>
• "Cool URIs"
  o resolvable
  o stable
  o implement content negotiation
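Content negotiation means that one URI can serve several representations (HTML for people, RDF for machines), chosen according to the client's Accept header. A minimal sketch of the server-side choice follows, assuming a hypothetical list of available media types; it handles q values and the */* wildcard but ignores partial wildcards such as text/*.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header, available):
    """Return the best available media type, or None if nothing matches."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]  # fall back to the server's default
    return None


# Hypothetical representations offered for one resource URI
AVAILABLE = ["text/html", "text/turtle", "application/rdf+xml"]

print(negotiate("text/turtle;q=0.9, text/html;q=0.5", AVAILABLE))
print(negotiate("application/json, */*;q=0.1", AVAILABLE))
```

The first call prefers Turtle because of its higher q value; the second falls back to the server's default representation through the wildcard.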
Learn what RDF looks like
• Data format for formalizing directed graphs.
• Standard for data interchange on the Web.
• Unit of RDF is a triple.
Step 1: a triple
Step 2: triples
Step 3: a graph
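Steps 1 to 3 can be sketched without any RDF library: a triple is a (subject, predicate, object) tuple, several triples form a set, and the set is read as a directed graph whose edges are labelled with predicates. The namespace, property names and literal values below are made up for illustration.

```python
from collections import defaultdict

EX = "http://example.com/"  # hypothetical namespace

# Step 1: a triple is (subject, predicate, object)
triple = (EX + "book1", EX + "author", EX + "person1")

# Step 2: triples (illustrative values only)
triples = {
    triple,
    (EX + "book1", EX + "title", "An Example Book"),
    (EX + "person1", EX + "name", "Example Person"),
}


# Step 3: a graph, here indexed as outgoing edges per subject node
def outgoing_edges(triples):
    graph = defaultdict(list)
    for s, p, o in sorted(triples):
        graph[s].append((p, o))
    return dict(graph)


graph = outgoing_edges(triples)
for node, edges in graph.items():
    for predicate, obj in edges:
        print(node, "--", predicate, "->", obj)
```

Because book1 points to person1 and person1 is itself the subject of another triple, the two descriptions join into one connected graph.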
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
Step 4: linked data
Learn how to SPARQL
• Query language for RDF

PREFIX ex: <http://example.com>
SELECT [DISTINCT] ?what
[FROM ?where]
WHERE { ?triplePattern }
[ORDER BY ?variable]
[LIMIT ?limit]
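To show what the WHERE clause does, here is a toy matcher for a single triple pattern. It is not a SPARQL engine, only a sketch of the idea: terms starting with "?" are variables, everything else must match exactly. The ex: names and data are hypothetical.

```python
def match_pattern(pattern, triples):
    """Yield one dict of variable bindings per triple matching the pattern."""
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # a variable used twice must bind to the same value
                if bindings.get(term, value) != value:
                    break
                bindings[term] = value
            elif term != value:
                break
        else:
            yield bindings


def select(variables, pattern, triples, distinct=False, limit=None):
    """Rough analogue of SELECT ... WHERE { pattern } ORDER BY ... LIMIT."""
    rows = [tuple(b[v] for v in variables) for b in match_pattern(pattern, triples)]
    if distinct:
        rows = list(dict.fromkeys(rows))  # drop duplicates, keep order
    rows.sort()  # stands in for ORDER BY on the selected variables
    return rows[:limit]  # limit=None keeps all rows


# Hypothetical data, written with prefixed names for brevity
TRIPLES = [
    ("ex:book1", "ex:author", "ex:person1"),
    ("ex:book2", "ex:author", "ex:person1"),
    ("ex:book3", "ex:author", "ex:person2"),
    ("ex:book1", "ex:title", "An Example Book"),
]

# SELECT ?book WHERE { ?book ex:author ex:person1 }
print(select(["?book"], ("?book", "ex:author", "ex:person1"), TRIPLES))

# SELECT DISTINCT ?who WHERE { ?book ex:author ?who } LIMIT 1
print(select(["?who"], ("?book", "ex:author", "?who"), TRIPLES, distinct=True, limit=1))
```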
Linked data
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
4. Include links to other URIs, so that they can discover more things.
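The four principles together allow "follow your nose" discovery: look up a URI, read the triples, then look up the URIs they point to. The sketch below simulates this with an in-memory dictionary instead of real HTTP requests; all URIs and property names are hypothetical.

```python
# Hypothetical in-memory "web": looking up a URI returns triples about it.
WEB = {
    "http://example.com/book1": [
        ("http://example.com/book1", "http://example.com/author",
         "http://example.org/person1"),
    ],
    "http://example.org/person1": [
        ("http://example.org/person1", "http://example.org/bornIn",
         "http://example.net/place1"),
    ],
    "http://example.net/place1": [],
}


def look_up(uri):
    """Stand-in for an HTTP GET that returns RDF; unknown URIs yield nothing."""
    return WEB.get(uri, [])


def discover(start, max_hops=2):
    """Follow links breadth-first from start and return every URI reached."""
    seen = {start}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for uri in frontier:
            for _s, _p, obj in look_up(uri):
                # only HTTP URIs can be looked up; literals are skipped
                if obj.startswith("http://") and obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return seen


print(sorted(discover("http://example.com/book1")))
```

Starting from a single book URI, two hops reach a person in a second dataset and a place in a third, which is the point of principle 4.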
Linking data
• linking...
  o is a continuous integration of heterogeneous dataspaces?
  o creates context?
  o is a job for librarians? Or machines?
  o is good?
@href is a blunt instrument
typed links
= identity
~ similarity
> hierarchy
? aboutness
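In RDF, the type of a link is its predicate. The slide does not name specific properties, so the mapping below is an assumption: it pairs each symbol with one commonly used property (owl:sameAs, skos:closeMatch, skos:broader, dcterms:subject). Other vocabularies would serve equally well.

```python
OWL = "http://www.w3.org/2002/07/owl#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCTERMS = "http://purl.org/dc/terms/"

# Assumed mapping from the slide's symbols to link predicates
LINK_TYPES = {
    "=": OWL + "sameAs",       # identity
    "~": SKOS + "closeMatch",  # similarity
    ">": SKOS + "broader",     # hierarchy: the object is the more general concept
    "?": DCTERMS + "subject",  # aboutness
}


def typed_link(source, symbol, target):
    """Return an RDF triple whose predicate states the type of the link."""
    return (source, LINK_TYPES[symbol], target)


# Hypothetical resources in two datasets
print(typed_link("http://example.com/person1", "=", "http://example.org/authority/1"))
print(typed_link("http://example.com/book1", "?", "http://example.org/subject/7"))
```

Unlike a bare @href, each link now says how its two ends relate, so a consumer can decide, for example, to merge descriptions only across identity links.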
typed links
Discussion
• How to find datasets suitable for interlinking?
• How to make my dataset worth linking to?
• How to encourage others to link to my data?
• What is the added value of links?
• How to determine the quality of a link?
• How to maintain links?
find and examine data
added value of links
link baiting
link maintenance
Linking
• Record linkage, identity resolution, duplicate detection, instance matching, co-reference detection
• Determinism:
  o Deterministic (e.g., dictionary-based)
  o Probabilistic (e.g., graph matching)
• Level:
  o Schema (e.g., ontology mapping)
  o Instances (e.g., record linkage)
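The difference between the two kinds of instance matching can be sketched on labels alone. The deterministic variant is a dictionary lookup on normalized labels. The second variant is not graph matching and not a true probabilistic model; it swaps in a simple string-similarity score with a threshold (an arbitrary 0.85 here) to show links that carry a degree of confidence. Identifiers and labels are made up.

```python
from difflib import SequenceMatcher


def normalize(label):
    """Lowercase, drop commas and collapse whitespace."""
    return " ".join(label.lower().replace(",", " ").split())


def deterministic_links(local, authority):
    """Link records whose normalized labels are identical (dictionary lookup)."""
    index = {normalize(label): uri for uri, label in authority.items()}
    return {
        uri: index[normalize(label)]
        for uri, label in local.items()
        if normalize(label) in index
    }


def similarity_links(local, authority, threshold=0.85):
    """Link each record to its most similar target if the score is high enough."""
    links = {}
    for uri, label in local.items():
        best_uri, best_score = None, 0.0
        for target, target_label in authority.items():
            score = SequenceMatcher(
                None, normalize(label), normalize(target_label)
            ).ratio()
            if score > best_score:
                best_uri, best_score = target, score
        if best_score >= threshold:
            links[uri] = (best_uri, round(best_score, 2))
    return links


# Hypothetical local records and authority file; "Webb" is a deliberate typo
LOCAL = {"ex:s1": "Linked data", "ex:s2": "Semantic Webb"}
AUTHORITY = {"auth:1": "Linked Data", "auth:2": "Semantic Web", "auth:3": "Cataloguing"}

print(deterministic_links(LOCAL, AUTHORITY))
print(similarity_links(LOCAL, AUTHORITY))
```

The exact lookup misses the misspelled record, while the scored variant links it at the cost of possible false positives, which is why the threshold and the resulting link quality need review.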
Linking
1. Untyped links to typed links.
2. Literals to links.
3. Links to other links.
Interlinking with Silk
• Silk is an interlinking framework for instance matching.
• Uses the link specification language to describe the interlinking process.
• Powerful and relatively easy-to-use.
Link specification language
• Your turn!
connect the dots
get this
or this
the end.