Making Library Collections Discoverable on the Web

Ted Fons, Executive Director, Data Services & WorldCat Quality

OCLC Collective Insight Event: Library Data [R]evolution: Applying Linked Data Concepts
San Francisco Public Library - February 10, 2015



Page 2

Some Trends in Libraries Today…

Trend 1: Shift from Collections to Services
Explain: Libraries shift their focus from collections on the shelf to the services they offer. Pressure on collection and staff budgets.

Trend 2: Search for Distinctiveness
Explain: As licensed content is available in commodity collections, academic libraries strive to declare their distinctiveness.

Trend 3: Be Found on the Web
Explain: As users search everywhere, libraries want their collections to be found on the web.

Improve Library Workflows

Help Libraries Be Found on the Web

Page 3

Photo credit: http://timemanagementninja.com/wp-content/uploads/2013/06/Make-Decisions.jpg

So, what is OCLC going to do?

Page 4

OCLC’s Strategy

Model things of interest to the web.

Make those things available in structures familiar to the web.

Improve library workflows.

Improve discovery

Reinvent cataloging

Focus first on data elements for web discovery

Start with Schema.org

Manage entities, not records

Page 5

The library knowledge graph

A graph of relationships

person | place | object | concept | organization | work

author | subject | item | availability

The solution starts here.

Page 6

The library knowledge graph

A graph of relationships

person | place | object | concept | organization | work
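One way to picture a graph of relationships is as a list of subject, predicate, object triples. The sketch below is illustrative only: the identifiers, the `exampleOf` and `place` predicates, and the helper names are assumptions made for this example, not OCLC's data model. The author, Umberto Eco, is not named on the slides.

```python
from collections import defaultdict

# Hypothetical identifiers and relationships, for illustration only.
TRIPLES = [
    ("work:name-of-the-rose", "author", "person:umberto-eco"),
    ("work:name-of-the-rose", "subject", "concept:semiotics"),
    ("work:name-of-the-rose", "subject", "concept:monastic-libraries"),
    ("concept:monastic-libraries", "place", "place:italy"),
    ("item:copy-1", "exampleOf", "work:name-of-the-rose"),
    ("item:copy-1", "availability", "organization:example-library"),
]


def build_graph(triples):
    """Index triples by subject so each entity lists its outgoing relationships."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def related(graph, entity, predicate):
    """Return the entities linked from `entity` by `predicate`, in input order."""
    return [obj for pred, obj in graph.get(entity, []) if pred == predicate]


graph = build_graph(TRIPLES)
print(related(graph, "work:name-of-the-rose", "subject"))
```

Because every entity is addressed by an identifier rather than a text string, the same person or concept node can be shared by any number of works.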

Page 7

Photo credit: http://www.ufunk.net/wp-content/uploads/2011/01/make-it-better-2.png

What will be better when we have that graph thingy?

Page 8

What will be better?

Improved coherence of the data
Explain: Clustering related things (manifestations, items, formats, etc.)

Improved library applications
Explain: Improved discovery. Make it more web-like. Reduce redundancy of data by managing entities in large aggregations.

Be Found on the Web
Explain: Using the data models and structures familiar to the web.
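Clustering related things can be sketched as grouping manifestation records under a shared work key. This is a minimal illustration, assuming a naive key built from a normalized author and title. The records, field names, and matching rule are hypothetical, and the slides do not describe the clustering method OCLC actually uses.

```python
from collections import defaultdict
import unicodedata

# Hypothetical manifestation records; the field names are illustrative.
RECORDS = [
    {"id": "m1", "title": "The Name of the Rose", "author": "Eco, Umberto", "format": "Printed Book"},
    {"id": "m2", "title": "The name of the rose.", "author": "Eco, Umberto", "format": "eBook"},
    {"id": "m3", "title": "The Name of the Rose", "author": "Eco, Umberto", "format": "Audio Book"},
    {"id": "m4", "title": "Foucault's Pendulum", "author": "Eco, Umberto", "format": "Printed Book"},
]


def normalize(text):
    """Lowercase, drop accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    kept = "".join(ch for ch in decomposed if ch.isalnum() or ch.isspace())
    return " ".join(kept.lower().split())


def cluster_by_work(records):
    """Group manifestation ids under a (author, title) work key."""
    clusters = defaultdict(list)
    for record in records:
        key = (normalize(record["author"]), normalize(record["title"]))
        clusters[key].append(record["id"])
    return dict(clusters)


clusters = cluster_by_work(RECORDS)
for key, ids in clusters.items():
    print(key, ids)
```

A string key like this would over-merge and under-merge on real data; it only shows why one work entity can stand in for many format-specific records.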

Page 9

The library knowledge graph

Lots of things… if we do it right.

ILL and Analytics | Cataloging | Discovery | Integration with the web

Page 10

The library knowledge graph

Discovery works…

person | place | object | concept | organization | work

Page 11

Entities and library workflows

Discovery

The Name of the Rose

Summary: The year is 1327. Franciscans in a wealthy Italian abbey are suspected of heresy, and Brother William of Baskerville arrives to investigate. His delicate mission is suddenly overshadowed by seven bizarre deaths that take place in seven days and nights of apocalyptic terror.

Subjects
Monastic libraries -- Italy -- Fiction | Semiotics -- Fiction

Borrowing Options
eBooks | Printed Books | Audio Books

Other Languages

Page 12

Entities and library workflows

Cataloging

Cataloging will be different…

• Managing the quality of Works
  • Improving clusters
• Managing the quality of Persons
  • Links to works, Other IDs

Consistent with RDA

Soon

Page 13

Photo credit: http://media02.hongkiat.com/freebies-for-web-designers-2011/progress-bar.jpg

What has OCLC done?

So what progress have we made?

Page 14

The Work Entity

• 197+ million Work descriptions and URIs
• Schema.org + BiblioGraph.net
• RDF Data formats
  • RDF/XML, Turtle, Triples, JSON-LD
• Links to WorldCat manifestations
• Links to Dewey, LCSH, LCNAF, VIAF, FAST
• Open Data license via Linked Data Explorer
• 2015: Discovery API, Metadata API
• Released April 2014

http://www.oclc.org/data
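To give a feel for a Work description expressed with Schema.org terms and serialized as JSON-LD, here is a small sketch. The `example.org` URIs, the helper name `work_description`, and the choice of properties (`author`, `about`, `workExample`) are assumptions for illustration, and the author name is not given on the slides. This is not the actual shape of the WorldCat Work data.

```python
import json


def work_description(uri, name, author_name, subjects, manifestation_uris):
    """Build a Schema.org-style Work description as a JSON-LD dictionary."""
    return {
        "@context": "http://schema.org",
        "@id": uri,
        "@type": "CreativeWork",
        "name": name,
        "author": {"@type": "Person", "name": author_name},
        "about": [{"@type": "Thing", "name": subject} for subject in subjects],
        # workExample points from the Work to its manifestations.
        "workExample": [{"@id": m} for m in manifestation_uris],
    }


# All URIs below are placeholders, not real WorldCat identifiers.
work = work_description(
    uri="http://example.org/work/1",
    name="The Name of the Rose",
    author_name="Umberto Eco",
    subjects=["Monastic libraries", "Semiotics"],
    manifestation_uris=[
        "http://example.org/manifestation/1",
        "http://example.org/manifestation/2",
    ],
)

print(json.dumps(work, indent=2))
```

The same description could be written as RDF/XML or Turtle; JSON-LD is used here only because it needs nothing beyond the standard library.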

Page 15

The Person Entity

• 98+ million Person descriptions and URIs
  • Person entities with authority: 20.2 million
  • Person entities without authority: 78.3 million
• Schema.org + BiblioGraph.net
• Harvested from WorldCat data and enriched from other hubs
• RDF Data formats
  • RDF/XML, Turtle, Triples, JSON-LD
• Links to WorldCat Works. Added links from WC Works.
• Open Data license via Linked Data Explorer
• 2015: Linked Data Explorer, Discovery API

http://www.oclc.org/data

Page 16

The Data Strategy: WorldCat Entities

Work and Person Creation Process Flow

There are three components to the pipeline for creating Work and Person entities. The harvest component extracts the data from the different sources. The map component identifies the objects and combines the triples through name recognition and authority linkages. The reduce component pulls together the entity descriptions and writes them out to HBase.

[Diagram: 1. Harvest, 2. Map, 3. Reduce. Labels: Enhanced WC Records, VIAF, LCNAF, DBPedia, Extractors, Harvested Triples, ObjectMapper, PersonCombine, WorkCombine, Refined Triples, CreateWorkReducer, CreatePersonReducer]
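The three pipeline stages described on this slide can be sketched in a few functions. This is a toy illustration under stated assumptions: the source records, field names, and the name-matching rule are invented for the example, and a plain dictionary stands in for HBase. It does not reproduce OCLC's extractors, mappers, or reducers.

```python
from collections import defaultdict

# Hypothetical source records, keyed by source name.
SOURCES = {
    "worldcat": [
        {"name": "Eco, Umberto", "authority_id": "auth:1", "work": "The Name of the Rose"},
        {"name": "Eco, Umberto", "authority_id": None, "work": "Foucault's Pendulum"},
    ],
    "authority_file": [
        {"name": "Eco, Umberto", "authority_id": "auth:1", "work": None},
    ],
}


def harvest(sources):
    """Harvest: extract (name, authority_id, predicate, value) rows from every source."""
    for source, records in sources.items():
        for rec in records:
            yield (rec["name"], rec["authority_id"], "name", rec["name"])
            yield (rec["name"], rec["authority_id"], "source", source)
            if rec["work"]:
                yield (rec["name"], rec["authority_id"], "creatorOf", rec["work"])


def map_entities(rows):
    """Map: key each row by authority id, falling back to a naive name match."""
    rows = list(rows)
    name_to_authority = {name.lower(): auth for name, auth, _, _ in rows if auth}
    for name, auth, predicate, value in rows:
        key = auth or name_to_authority.get(name.lower()) or "name:" + name.lower()
        yield key, (predicate, value)


def reduce_entities(mapped, store):
    """Reduce: merge rows per entity key and write each description to the store."""
    grouped = defaultdict(lambda: defaultdict(set))
    for key, (predicate, value) in mapped:
        grouped[key][predicate].add(value)
    for key, description in grouped.items():
        store[key] = {pred: sorted(values) for pred, values in description.items()}
    return store


# A dictionary stands in for the HBase table in this sketch.
store = reduce_entities(map_entities(harvest(SOURCES)), {})
print(store)
```

In this toy data the record without an authority id is attached to the same entity through its matching name, which loosely mirrors the name recognition and authority linkage the slide describes for the map step.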

Page 17

New “Common Ground” white paper

• Co-authored by the Library of Congress and OCLC Research
• This executive summary compares and contrasts the compatible linked data initiatives at both institutions
• A more detailed technical analysis will be released later this year

Download a copy at: oc.lc/CommonGround

Page 18

High Level Conclusion

“The alignment…is still accurate and is perhaps even more defensible than in 2013 because the primary BIBFRAME concepts are now more consistent with the corresponding concepts defined in the OCLC/Schema model.”

p. 8

Page 19

Photo credit: http://www.quickmeme.com/img/4c

Are you skeptical?

Page 20

Photo credit: http://measuringupblog.com/app/wp-content/uploads/2013/11/blogpic2.jpg

Can we measure impact?

[Chart: monthly values, May 2014 to Oct 2014; vertical axis from 0 to 5,000,000 in steps of 500,000]

Page 21

Entities and library workflows

Discovery

The Name of the Rose

Summary: The year is 1327. Franciscans in a wealthy Italian abbey are suspected of heresy, and Brother William of Baskerville arrives to investigate. His delicate mission is suddenly overshadowed by seven bizarre deaths that take place in seven days and nights of apocalyptic terror.

Subjects
Monastic libraries -- Italy -- Fiction | Semiotics -- Fiction

Borrowing Options
eBooks | Printed Books | Audio Books

Other Languages

Page 22

Build the Library Knowledge Graph

Connecting local systems, OPACs and the web via identifiers

- 2015 +2016

Release Works

Release Person

Enhance Discovery

Release other Entities + add Articles

Begin Cataloging Entities

Improve Other Workflows

Ongoing data modeling, research, experiments, and engagement

Measure impact of change, review actions, improve plans, repeat

Import and Export record data from WorldCat: MARC, UNIMARC, ONIX, BIBFRAME, etc.

Page 23

Register
Add your holdings & metadata to WorldCat

Follow
Schema – Bib Extend: http://www.w3.org/community/schemabibex/
OCLC – Data: http://www.oclc.org/data

Page 24

Explore. Share. Magnify.

Ted Fons
Executive Director, Data Services & WorldCat Quality

Links and Entities

@tedfons