CNI Spring 2016 Membership Meeting, San Antonio TX
Linked Data Implementations—Who, What and Why?
Karen Smith-Yoshimura, OCLC Research
International Linked Data Surveys for Implementers
Number of institutional responses
2014: 48
2015: 71
Both: 29
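The response counts can be reconciled with the figure of 90 responding institutions by a short calculation. This is a minimal sketch, assuming that "Both" counts institutions that answered in both 2014 and 2015 and that those institutions are included in each yearly total; the variable names are illustrative only.

```python
# Response counts as shown on the slide.
responses_2014 = 48
responses_2015 = 71
responded_both_years = 29  # assumption: included in both yearly totals

# Inclusion-exclusion: count each institution once across the two surveys.
distinct_institutions = responses_2014 + responses_2015 - responded_both_years

print(distinct_institutions)
```

Under that assumption the result matches the 90 institutions used for the geographic breakdown.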
Geographic breakdown of 90 responding institutions
20 countries represented
USA
Spain
UK
The Netherlands
Norway
Canada
Australia
France
Germany
Italy
Switzerland
Austria
Czech Republic
Hungary
Ireland
Japan
Malaysia
Portugal
Singapore
Sweden
Linked Data Survey Respondents
Responding institutions by type, 2014 and 2015
[Bar chart comparing 2014 and 2015 counts for: Academic library, National library, Network, Government, Scholarly, Public library, Museum, Other]
2015 responding institutions by type
Academic library 31%
National library 20%
Network 14%
Government 10%
Scholarly 8%
Public library 7%
Museum 4%
Other 6%
How long linked data project or service in production 2015 2014
Not yet in production 37 27
Less than one year 19 13
More than one year, less than two years 10 12
More than two years 46 24
Total 112 76
How linked data is used 2015 2014
Consume linked data 38 25
Publish linked data 10 4
Both consume & publish 64 47
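To compare the two survey years on an equal footing, the counts can be converted into shares of all reported projects. The sketch below assumes the three categories are mutually exclusive, so each year's counts sum to the number of projects reported (112 in 2015, 76 in 2014); the dictionary layout and names are illustrative only.

```python
# Counts per usage category, as shown on the slide.
usage_counts = {
    "2015": {"Consume": 38, "Publish": 10, "Both consume & publish": 64},
    "2014": {"Consume": 25, "Publish": 4, "Both consume & publish": 47},
}

# Share of each category within its survey year, as a percentage.
usage_shares = {}
for year, counts in usage_counts.items():
    total = sum(counts.values())  # 112 for 2015, 76 for 2014
    usage_shares[year] = {
        category: round(100 * count / total, 1)
        for category, count in counts.items()
    }

for year, shares in usage_shares.items():
    print(year, shares)
```

In both years, more than half of the reported projects both consume and publish linked data.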
Reasons for publishing linked data 2015 2014
Expose to larger audience on the Web. 67 45
Demonstrate what could be done with datasets as linked data. 59 41
Heard about linked data and wanted to try it out by exposing our data as linked data. 43 21
See if publishing linked data would improve our Search Engine Optimization (SEO). 29 9
Types of data published as linked data
Authority files
Bibliographic data
Data about museum objects
Datasets
Descriptive metadata
Digital collections
Encoded archival descriptions
Geographic data
Ontologies/vocabularies
Other
SOME EXAMPLES IN PRODUCTION
North Rhine-Westphalian Library Service Center
http://www.lib.ncsu.edu/ld/onld/
http://bnb.data.bl.uk
http://kn.ndl.go.jp/
https://linkedjazz.org/
Barriers to publishing linked data 2015
Steep learning curve for staff 40
Inconsistency in legacy data 33
Selecting appropriate ontologies to represent our data 31
Establishing the links 27
Little documentation or advice on how to build the systems 21
Reasons for consuming linked data 2015 2014
Provide our users with a richer experience. 51 35
Enhance our own data by consuming linked data from other sources. 50 37
More effective internal metadata management. 32 16
Greater accuracy and scope in our search results 27 12
See if consuming linked data would improve our Search Engine Optimization (SEO). 19 12
Experiment with combining different types of data into a single triple store. 17 15
Heard about linked data and wanted to try it out by using linked data sources. 17 13
Linked data sources most consumed 2015
VIAF (Virtual International Authority File) 41
DBpedia 36
GeoNames 35
id.loc.gov 35
Resources we convert to linked data ourselves 17
Getty's AAT 16
FAST (Faceted Application of Subject Terminology) 15
WorldCat.org 15
data.bnf.fr 12
Deutsche Nationalbibliothek (DNB) Linked Data Service 12
PROFILES OF MOST CONSUMED SOURCES CITED
VIAF
http://viaf.org
Combines multiple name authority files into a single OCLC-hosted name authority service.
More than 100,000 requests/day
Size: 500 million – 1 billion triples
Consumes:
• GeoNames
• id.loc.gov
• ISNI
• Wikidata
• WorldCat.org
• WorldCat.org Works
RDF Vocabularies/Ontologies:
• Bibliographic Ontology
• Dublin Core & DC Terms
• FOAF
• OWL 2 Web Ontology Language
• RDF Schema
• Schema.org
• SKOS
id.loc.gov
Enables developers to interact with vocabularies found in data & standards promulgated by LC as linked data.
More than 100,000 requests/day
Size: 100 million – 500 million triples
Consumes:
• AGROVOC
• data.bnf.fr
• DNB’s Linked Data Service
• id.loc.gov
• VIAF
• Wikidata
• WorldCat.org Works
• Resources we convert to linked data ourselves
RDF Vocabularies/Ontologies:
• BIBFRAME
• FOAF
• MADS/RDF
• RDF Schema
• SKOS
Getty’s AAT
http://vocab.getty.edu
A structured vocabulary for generic concepts related to art and architecture.
More than 100,000 requests/day
Size: 10 million – 50 million triples
Consumes: None
RDF Vocabularies/Ontologies:
• Bibliographic Ontology
• Dublin Core & DC Terms
• FOAF
• Local vocabulary
• OWL 2 Web Ontology Language
• RDF Schema
• SKOS
FAST
http://id.worldcat.org/fast/
Adapts LC Subject Headings with a simplified syntax to retain LCSH’s rich vocabulary while making the schema easier to understand, control, apply and use.
10,000 – 50,000 requests/day
Size: 10 million – 50 million triples
Consumes:
• DBpedia
• GeoNames
• id.loc.gov
• VIAF
RDF Vocabularies/Ontologies:
• Dublin Core & DC Terms
• FOAF
• Schema.org
• SKOS
• WGS84 Geo Positioning
WorldCat.org
OCLC has made WorldCat.org bibliographic metadata experimentally available in linked data form.
More than 100,000 requests/day
Size: 15 billion triples
Consumes:
• DBpedia
• FAST
• VIAF
• WorldCat.org
RDF Vocabularies/Ontologies:
• Dublin Core
• FOAF
• Schema.org
• SKOS
data.bnf.fr
Makes the data produced by the Bibliothèque nationale de France more useful on the Web.
10,000 – 50,000 requests/day
Size: 100 million – 500 million triples
Consumes:
• AGROVOC
• data.bnf.fr
• DBpedia
• DNB’s Linked Data Service
• GeoNames
• id.loc.gov
• ISNI
• VIAF
• http://datos.bne.es (+ others)
RDF Vocabularies/Ontologies:
• Bibliographic Ontology
• Biographical Ontology
• Dublin Core & DC Terms
• FOAF
• FRBR
• ISNI
• Music Ontology
• OAI ORE Terms
• OWL 2 Web Ontology Language
• RDA
• RDF Schema
• SKOS
• WGS84 Geo Positioning
• …
DNB’s Linked Data Service
http://www.dnb.de/EN/lds
Publishes authority and bibliographic data in RDF to make the data accessible to the semantic Web community with no need to know library-specific metadata schemes.
Size: 100 million – 500 million triples
Consumes: None
RDF Vocabularies/Ontologies:
• Bibliographic Ontology
• Dublin Core & DC Terms
• FOAF
• ISBD
• OWL 2 Web Ontology Language
• RDA
• RDF Schema
• SKOS
Barriers to consuming linked data 2015
Matching, disambiguating and aligning source data and linked data resources 23
Mapping of vocabulary 17
What's published as linked data is not always reusable or lacks URIs 16
Lack of authority control 15
Datasets not being updated 14
Size of RDF dumps 12
Understanding how data is structured before using it 12
What would you do differently? 2015
Have more time allocated for its development 38
Would do nothing differently 30
Get more staff 28
Get wider organizational support 23
Have more realistic expectations 12
Advice from the implementers
• Focus on what you want to achieve, not technical stuff.
• Build on what you have that others don’t.
• Pick a problem you can solve.
• Model data that solves your use cases.
• Consider legal issues from the beginning.
• Read as widely as possible, consult community experts.
• Have a good understanding of linked data structure, available ontologies and your own data.
• Strive for long-term data reconciliation & consolidation.
• Involve your institution/community.
• Experiment and start small.
• Start now! Just do it!
Full details of responses http://www.oclc.org/content/dam/research/activities/linkeddata/oclc-research-linked-data-implementers-survey-2014.xlsx
Thank you!
Contact: Karen Smith-Yoshimura
CNI Spring 2016 Membership Meeting, San Antonio TX, 4 April 2016
@KarenS_Y
©2016 OCLC. This work is licensed under a Creative Commons Attribution 4.0 International License. Suggested attribution: “This work uses content from Linked Data Implementations—Who, What and Why? © OCLC, used under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0/.”