faceted browsing for combined access to a digital repository … · •the marc records are indexed...

25
Faceted Browsing for Combined Access to a Digital Repository and a Library Catalog Bess Sadler Leslie Johnston University of Virginia Library DLF Fall 2007 Forum

Upload: others

Post on 03-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

Faceted Browsing for CombinedAccess to a Digital Repository

and a Library Catalog

Bess SadlerLeslie Johnston

University of Virginia Library

DLF Fall 2007 Forum

Page 2: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

What is Project Blacklight?• Blacklight is a research project. Bess Sadler and

Erik Hatcher started an experiment to index MARCdata in Lucene/Solr in January 2007.

• The first prototype included ~3.7M MARC records,320 Tang Dynasty Chinese poems in TEI, and 470Digital Collections Repository objects.

• MARC indexing consulting came from Erin Stalberg(formerly UVA but now at NC State) and EdSummers at Library of Congress.

• Why “Blacklight”? Solr? UVA? Blacklight. Get it?Credit (or blame) Erik Hatcher.

Page 3: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

What Blacklight Can Solve For Us• Faceted browsing where we have none.• Relevancy ranking where we have none.• Facilitates the use of search and faceted browse together to make it

easier to perform complex discovery operations without knowing thelogic behind it.

• In addition to providing access to the entire catalog, we can createmultiple additional interfaces to accommodate specialized browsing ofdifferent types of collections, such as music collections (audio andscores), that takes advantage of specialized use of MARC.

• We can mix in data that’s not explicit in the MARC record. In our musicinterface, we solved a user frustration where they couldn’t find music bycentury – we’re extrapolating at the time of indexing from the year that isin the metadata.

• We can federate indexes of MARC records with indexes of metadata fordigital objects in our Repository for a single discovery method.

• We can potentially add additional applications into the mix, such as onewhere we will track our holdings that are included in Google BookSearch that will supply link URLS for the interface.

Page 4: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

Blacklight Technical Details• The MARC records are indexed through a Ruby script,

directly from MARC binary without transforming intoMARCXML first.

• The implementation was accomplished using Ruby on Rails.• The box it’s all running on is a development server, with four

medium fast CPUs and 3.5Gb of RAM.• Solr Flare is running through Jetty, and Blacklight is running

on a pack of four mongrel instances, with apache doing loadbalancing out front.

• Sessions get written to a MySQL database, and we useCapistrano 2.0 for deployment and versioning.

• It sends us all an email every time an error gets generated.

Page 5: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

Issues in Project Blacklight

• Our MARC data isn’t utf-8-compliant and thatcaused issues with the diacritics.

• Indexing in new ways always exposes newdata inconsistencies.

• We have not yet identified a productionworkflow for keeping the catalog updateddaily.

Page 6: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

Blacklight Portal Entry

Page 7: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

Combined Catalog with MARC Recordsand Repository Objects

Page 8: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

Music Facets for Browsing

Page 9: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

More Music Facets for Browsing

Page 10: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

And Even More

Page 11: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

20th-century Opera in English

Page 12: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

View a Record

Page 13: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

Books on Africa for Semester at Sea:Alternate Interface for a Shadowed Record Set

Page 14: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

BlacklightDL

• A corollary project to replace the XPATindexes and search interface for our DigitalCollections Repository (on top of Fedora).

• Indexes full text and metadata for over10,000 TEI texts, almost 4,000 EAD, andover 20,000 images.

• Work done by OpenSource Connections inconjunction with Bess Sadler and MattMitchell at the UVA Library.

• BlacklightDL will be folded into Blacklight.

Page 15: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

BlacklightDL Main Screen

Page 16: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

BlacklightDL Collection Browse

Page 17: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

BlacklightDL Subject Facet

Page 18: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

BlacklightDL SearchPlus Subject Facet Filter

Page 19: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

19th-century Books in English withWomen as a Subject

Page 20: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

Advanced Search in BlacklightDL

Page 21: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

The Advanced Search ResultsPlus a Subject Facet Filter

Page 22: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

BlacklightDL Image Object View

Page 23: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

BlacklightDL Image Object Viewwith Menu Options

Page 24: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

BlacklightDL Image Object Viewwith Metadata

Page 25: Faceted Browsing for Combined Access to a Digital Repository … · •The MARC records are indexed through a Ruby script, directly from MARC binary without transforming into MARCXML

Questions?

Contact us:

[email protected]@virginia.edu