Linked Data: A Personal Perspective

The world’s libraries. Connected. Linked Data A Personal Perspective Janifer Gatenby OCLC EMEA With acknowledgements to Richard Wallis and Anila Angjeli


Post on 24-Feb-2016



TRANSCRIPT

Page 1: Linked Data A Personal Perspective

The world’s libraries. Connected.

Linked Data: A Personal Perspective

Janifer Gatenby

OCLC EMEA

With acknowledgements to Richard Wallis and Anila Angjeli

Page 2: Linked Data A Personal Perspective


• What is it?

• What does it promise?

• How do we get there?

• What happens when we get there?

Page 3: Linked Data A Personal Perspective


What is it?

• Not really a new way of linking, but a new way of expressing a link

“It is about using canonical, trusted, globally referenceable identifiers for concepts, people, organisations, locations etc. instead of copying text strings and losing the connection with the authoritative sources they came from.”

Richard Wallis

Page 4: Linked Data A Personal Perspective


MARC21 links

• 700 10 $a name $e role $0 authority control number

• (added entry in a MARC record for a name related to a work, not the main author)

These familiar links reference an authority record in the same database as the bibliographic record, and hence have no address portion. Linked data extends the linking range.
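To make the "no address portion" point concrete, here is a minimal Python sketch that splits a 700 field written in the slide's "$"-delimited notation into subfields. The field values are placeholders, not taken from a real record, and the parser is a simplification for illustration, not a MARC library.

```python
def parse_subfields(field: str) -> dict[str, str]:
    """Parse a '$'-delimited MARC field string into {subfield code: value}."""
    _tag_and_indicators, *chunks = field.split("$")
    subfields = {}
    for chunk in chunks:
        if not chunk:
            continue  # skip empty pieces such as a doubled '$'
        code, value = chunk[0], chunk[1:].strip()
        subfields[code] = value
    return subfields


# Placeholder values: the name, role and control number are hypothetical.
field_700 = "700 10 $a Surname, Forename $e editor $0 12345"
subfields = parse_subfields(field_700)
print(subfields)

# The $0 value is only a number local to one database: it carries no address.
has_address = "://" in subfields["0"]
print(has_address)
```

The control number in $0 is meaningful only inside the database that issued it, which is the gap a URI closes.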

Page 5: Linked Data A Personal Perspective


Extending the linking range: URI

• URI: an immutable address as well as an identifier

• http://id.loc.gov/authorities/names/nr89009099

• http://viaf.org/viaf/116774723

• http://isni-url.oclc.nl/isni/000000114556841

9 NACO libraries: LC, National Agricultural Library, National Library of Medicine, British Library, NL Mexico, NLNZ, NL Scotland, NL South Africa, NL Wales
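As a rough illustration of a URI being both an address and an identifier, the sketch below splits two of the URIs above into those two parts. The function name split_uri is made up for this example, and treating the last path segment as the local identifier is an assumption that happens to hold for these URIs.

```python
from urllib.parse import urlsplit


def split_uri(uri: str) -> tuple[str, str]:
    """Split an identifier URI into (address part, local identifier)."""
    parts = urlsplit(uri)
    base_path, _, local_id = parts.path.rpartition("/")
    address = f"{parts.scheme}://{parts.netloc}{base_path}/"
    return address, local_id


for uri in (
    "http://id.loc.gov/authorities/names/nr89009099",
    "http://viaf.org/viaf/116774723",
):
    print(split_uri(uri))
```

The local identifier alone corresponds to what a MARC $0 subfield carries; the address part is what linked data adds.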

Page 6: Linked Data A Personal Perspective


Extending the linking range: RDF

• RDF: metadata is expressed in triples

• Data

• Data label (properties)

• Vocabulary from which the label comes (gives context to the label)
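A small Python sketch of one triple may help: the thing described, the data label (property) and the data itself, with the vocabulary recoverable from the label's URI. The subject URI is hypothetical; the FOAF namespace is the real one, and the name literal is the one used in the VIAF example in this deck.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # data label (property), a URI inside a vocabulary
    obj: str        # the data: a literal value or another URI


FOAF = "http://xmlns.com/foaf/0.1/"  # FOAF vocabulary namespace

# Hypothetical subject URI, used only for illustration.
triple = Triple(
    subject="http://example.org/person/1",
    predicate=FOAF + "name",
    obj="De Groot, Gerard J., 1955-",
)


def vocabulary_of(t: Triple) -> str:
    """Return the namespace of the vocabulary that gives the label its context."""
    cut = max(t.predicate.rfind("/"), t.predicate.rfind("#"))
    return t.predicate[: cut + 1]


print(vocabulary_of(triple))
```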

Page 7: Linked Data A Personal Perspective


Linked Data Principles

1. Use URIs as names for things

2. Use HTTP URIs so people can look up those names

3. When someone looks up a URI, provide useful information, using the standards (RDF)

4. Include links to other URIs, so that they can discover more

Tim Berners-Lee, 2006
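Principles 2 and 3 amount to an ordinary HTTP lookup in which the client asks for RDF. The sketch below only builds such a request with Python's standard library and does not send it. Whether a given server answers with RDF for this Accept header is an assumption, and build_lookup is a name invented for the example.

```python
import urllib.request


def build_lookup(uri: str, media_type: str = "application/rdf+xml") -> urllib.request.Request:
    """Build (but do not send) an HTTP GET that asks for an RDF representation."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


request = build_lookup("http://viaf.org/viaf/116774723")
print(request.get_method(), request.full_url, request.get_header("Accept"))

# Dereferencing would be: urllib.request.urlopen(request)  (needs network access)
```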

Page 8: Linked Data A Personal Perspective


Vocabularies

• Vocabularies are not schemas; they are lists of defined data labels (concepts)

• Schema.org (search engines)

• BibFrame (library community)

• FOAF (Friend of a Friend)

• OWL (e.g. owl:sameAs)

• Vocabularies can be mixed

foaf:name "Jimmy Wales" ;
foaf:mbox <mailto:[email protected]> ;
foaf:homepage <http://www.jimmywales.com/> ;
foaf:nick "Jimbo" ;
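To show what "vocabularies can be mixed" looks like in practice, this sketch expands prefixed labels from two vocabularies into full URIs. The prefix table is a small hand-written assumption; adding schema:name alongside the FOAF labels is an illustration and not part of the original slide.

```python
# Namespace URIs for the prefixes used below.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "schema": "http://schema.org/",
}


def expand(prefixed_label: str) -> str:
    """Expand a 'prefix:local' label into the full property URI."""
    prefix, _, local = prefixed_label.partition(":")
    return PREFIXES[prefix] + local


# One description that mixes labels from FOAF and Schema.org.
statements = [
    ("foaf:name", "Jimmy Wales"),
    ("foaf:nick", "Jimbo"),
    ("schema:name", "Jimmy Wales"),  # illustrative addition, not on the slide
]

for label, value in statements:
    print(expand(label), "->", value)
```

Each label stays unambiguous after mixing because its full URI still points at the vocabulary that defines it.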

Page 9: Linked Data A Personal Perspective


What does it promise?

• Enriched displays without data maintenance

• Better harvesting and ranking

• because of markup

• and because of links

• Navigation to pages with additional information

• Example: from VIAF via ISNI to encyclopaedias, rights management societies (digitisation rights), Bowker (biographies from fly leaves)

Page 10: Linked Data A Personal Perspective


Page 11: Linked Data A Personal Perspective


Page 12: Linked Data A Personal Perspective

Interconnecting French cultural heritage treasures on the Web

[Diagram] Inputs: BnF Main catalogue (MARC); Digital documents (DC); BnF Archives and Manuscripts catalogue (EAD); other BnF resources; external resources

Semantic Web techniques: modeling, matching, clustering, alignments

Outputs: web pages for Internet users; raw data for machines

Page 13: Linked Data A Personal Perspective

example

BnF persistent ID

Imported from Wikipedia and integrated in the page

Links

ISNI 0000 0001 2283 1567 (soon)

Page 14: Linked Data A Personal Perspective

Vocabularies used: existing ones + others defined for the specific needs of the project

Data can be downloaded

Information about the data model (or ontology) at: http://data.bnf.fr/about-en

Page 15: Linked Data A Personal Perspective


How do we get there?

DNB CultureGraph

• “It’s all about creating connections”

• DDC to RVK (German classification) by comparing search results

• GND (names) to German Wikipedia
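The slide does not say how CultureGraph compares search results, so the following is only one plausible sketch: score each candidate RVK class by the overlap (Jaccard similarity) between its result set and the result set of a DDC class. The record identifiers and class labels are hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of records that two result sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(source_hits: set[str], candidates: dict[str, set[str]]) -> tuple[str, float]:
    """Return the candidate class whose result set overlaps most with source_hits."""
    scored = {label: jaccard(source_hits, hits) for label, hits in candidates.items()}
    label = max(scored, key=scored.get)
    return label, scored[label]


# Hypothetical record identifiers returned by a search on each class.
ddc_hits = {"rec1", "rec2", "rec3", "rec4"}
rvk_hits = {
    "RVK class A": {"rec1", "rec2", "rec3"},
    "RVK class B": {"rec4", "rec9"},
}

print(best_match(ddc_hits, rvk_hits))
```

In a real alignment a threshold on the score, and probably human review, would decide whether a link is actually created.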

Page 16: Linked Data A Personal Perspective


Example VIAF

• Ingesting data to compare and create links

• Makes clusters; cluster identifier

• Ingesting preferred to external linking

• Wikipedia, ISNI, WorldCat Identities

• More data used for clustering, so more reliable

• VIAFBot for making reciprocal links in Wikipedia / Wikidata

<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<rdf:type rdf:resource="http://rdvocab.info/uri/schema/FRBRentitiesRDA/Person"/>
<foaf:name>De Groot, Gerard J., 1955-</foaf:name>
<foaf:name>DeGroot, Gerard J., 1955-</foaf:name>
<rdaGr2:dateOfBirth>1955-06-22</rdaGr2:dateOfBirth>
<owl:sameAs rdf:resource="http://data.bnf.fr/ark:/12148/cb12299846b#foaf:Person"/>
<owl:sameAs rdf:resource="http://www.idref.fr/034977651/id"/>
<owl:sameAs rdf:resource="http://d-nb.info/gnd/12422900X"/>
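A hedged sketch of how a consumer might pull the outgoing links from RDF/XML like the fragment above, using only Python's standard library. The rdf:RDF wrapper, the rdf:Description element and the subject URI are assumptions added to make the fragment a complete document; the slide shows only the inner properties.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Wrapper elements and the subject URI are hypothetical.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}" xmlns:foaf="{FOAF}">
  <rdf:Description rdf:about="http://example.org/cluster/1">
    <foaf:name>De Groot, Gerard J., 1955-</foaf:name>
    <owl:sameAs rdf:resource="http://www.idref.fr/034977651/id"/>
    <owl:sameAs rdf:resource="http://d-nb.info/gnd/12422900X"/>
  </rdf:Description>
</rdf:RDF>"""


def same_as_links(rdf_xml: str) -> list[str]:
    """Collect the target URIs of every owl:sameAs statement in the document."""
    root = ET.fromstring(rdf_xml)
    return [el.attrib[f"{{{RDF}}}resource"] for el in root.iter(f"{{{OWL}}}sameAs")]


def names(rdf_xml: str) -> list[str]:
    """Collect every foaf:name literal in the document."""
    root = ET.fromstring(rdf_xml)
    return [el.text for el in root.iter(f"{{{FOAF}}}name")]


print(names(DOC))
print(same_as_links(DOC))
```

A dedicated RDF library would be the more robust choice for real data, since RDF/XML allows several equivalent serialisations that this element-level approach does not cover.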

Page 17: Linked Data A Personal Perspective

Libraries

Text Rights

Music Rights

Trade Sources

Encyclopaedias

Researchers & Professional

Page 18: Linked Data A Personal Perspective


7 million NEW LINKS to & from VIAF

Linked Data: isni-url.oclc.nl/isni/

Source type               Assigned        bnf        dnb         lc        nta      nukat        wkp   All VIAF
Text Rights Sources        123,964     37,383     25,177     72,960     83,498     32,184     14,935    406,178
Research & Profess’l       404,272     24,141     14,688     76,986     30,526     16,730      3,465    223,305
Music Sources              189,000     27,542     33,997     38,218     13,560      8,675     19,700    207,231
Trade sources          2.4 million    570,224    384,230  2,138,955    741,671    442,037    138,636  6,100,349
Totals                                659,290    458,092  2,327,119    869,255    499,626    176,736  6,937,063


Page 19: Linked Data A Personal Perspective


ISNI: an identifier

• Identifiers seal uniqueness: “n” number of other elements are necessary for uniqueness

• Stable identifier; stable metadata:

• assigned where there is confidence in the quality and completeness of the metadata to establish uniqueness

• ISNI system + Quality Team (BL & BnF)

Linking erroneous data propagates errors.
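One cheap safeguard before creating a link is to check that an ISNI is at least well formed. The slides do not cover this; the sketch relies on the fact that the last character of an ISNI is a check character computed with ISO 7064 MOD 11-2. It catches transcription errors only and says nothing about whether the identifier belongs to the right person. The function names are invented for the example.

```python
def isni_check_character(first_15_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in first_15_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_isni(isni: str) -> bool:
    """Return True if the string is 16 characters long and its check character matches."""
    compact = isni.replace(" ", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return isni_check_character(compact[:15]) == compact[15]


# The ISNI shown on the BnF example slide.
print(is_valid_isni("0000 0001 2283 1567"))
```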

Page 20: Linked Data A Personal Perspective


Links are made once and inherited, e.g. by local catalogues

• URI: an immutable address as well as an identifier

• http://id.loc.gov/authorities/names/nr89009099

• http://viaf.org/viaf/116774723

• http://isni-url.oclc.nl/isni/000000114556841

9 NACO libraries: Library of Congress, National Agricultural Library, National Library of Medicine, British Library, NL Mexico, NLNZ, NL Scotland, NL South Africa, NL Wales

Page 21: Linked Data A Personal Perspective


What happens when we get there?

• Search happens mostly in the search engines

• Library catalogue concentrates on:

• Being linked to (& linking out)

• Delivery, particularly of the digitised and immediate

Page 22: Linked Data A Personal Perspective


What happens when we get there?

• How do search and linked data interact?

• Is search really fully delegated to search engines & larger union catalogues?

Page 23: Linked Data A Personal Perspective


Types of search

Search type      Happening in
Known item       Search engines, also in more specific sources where expected to reduce noise
Subject search   Search engines, also in more specific sources
Index browse     In catalogues
Follow a link    Everywhere. In library catalogues from a full record display.

The more your catalogue is linked in, the more likely it is to attract all types of searches

Page 24: Linked Data A Personal Perspective


Links plus data needed in catalogues

• Data needed

• For making indexes

• For comparisons, e.g. for de-duplication

• Data mining

It is about using canonical trusted globally referenceable identifiers for concepts, people, organisations, locations etc. instead of copying text strings and losing the connection with the authoritative sources they came from.

This doesn’t mean that you only need the links; you often also need to ingest the data.

Besides, data storage is no longer the constraint it once was.
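A minimal sketch of why ingested data is still needed: de-duplication compares the data itself, which links alone cannot supply. The match key here (lower-cased name with everything but ASCII letters and digits removed) is a deliberately crude assumption, and the record identifiers are hypothetical; real clustering uses far more evidence.

```python
import re
from collections import defaultdict


def match_key(name: str) -> str:
    """Crude comparison key: lower-case, keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def cluster(records: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group record identifiers whose names reduce to the same match key."""
    clusters = defaultdict(list)
    for record_id, name in records:
        clusters[match_key(name)].append(record_id)
    return dict(clusters)


# Hypothetical local records; the two name forms come from the VIAF example.
records = [
    ("local:1", "De Groot, Gerard J., 1955-"),
    ("local:2", "DeGroot, Gerard J., 1955-"),
    ("local:3", "Wales, Jimmy"),
]

duplicates = {key: ids for key, ids in cluster(records).items() if len(ids) > 1}
print(duplicates)
```

The same ingested name strings could also feed a browse index, which is the other use the slide names.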

Page 25: Linked Data A Personal Perspective


Richard Wallis: Further Reading

• http://www.slideshare.net/tulipbiru64/the-single-power-of-link-richard-wallis

• http://www.slideshare.net/rjw/linked-data-and-oclc