Bibliographic references in BHL. Coordination and routes for cooperation across organizations, projects and e-infrastructures, 23rd of May 2013. William Ulate R., Missouri Botanical Garden

Upload: william-ulate

Post on 10-May-2015

TRANSCRIPT

Page 1: Bibliographic References in BHL

Bibliographic references in BHL

Coordination and routes for cooperation across organizations, projects and e-infrastructures

23rd of May 2013

William Ulate R., Missouri Botanical Garden

Page 2: Bibliographic References in BHL

Questions to Answer

1. Type of content we discuss (e.g., occurrences, genes, behaviour, morphology, etc.)
2. Sources of content (from where)
3. Formats of content (formats, standards)
4. Methods of gathering information (e.g., harvesting, ftp uploads, protocols)
5. Methods of delivery of information (e.g., free searches, API, web services, automated exports, linking mechanisms, etc.; provide links to API and web services documentation)
6. Identifiers used (type, persistence, dereferencing, resolvability)
7. Present or forthcoming interoperability features with other platforms
8. Constraints, needs and expectations to:
   a) Suppliers of content, and
   b) Users of content
9. What is needed for Bibliographic References?

Page 3: Bibliographic References in BHL

A brief history…

Page 4: Bibliographic References in BHL

The Biodiversity Heritage Library

www.biodiversitylibrary.org

Page 5: Bibliographic References in BHL

Book Viewer

Page 6: Bibliographic References in BHL

Sharing

BHL shares data through:

• APIs
• Data Export
• OpenURL
• OAI-PMH

Page 7: Bibliographic References in BHL

Open Data

• Downloads
  – Simple tab-delimited exports of core data
  – http://www.biodiversitylibrary.org/data/BHLExportSchema.pdf
• Data model
  – DB schema as ERD
  – http://bhl-bits.googlecode.com/files/20090930_BHLDataModel.pdf
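As a rough illustration of working with a tab-delimited export, the sketch below parses tab-separated text with a header row using Python's standard csv module. The column names and sample rows are made up for the example and are not taken from the BHL export schema; the schema PDF linked on this slide is the place to check the real fields.

```python
import csv
import io

# Hypothetical sample data: the column names and rows are illustrative
# only and are NOT taken from the BHL export schema.
SAMPLE_EXPORT = (
    "TitleID\tFullTitle\tStartYear\n"
    "1\tExample journal of botany\t1850\n"
    "2\tExample monograph on beetles\t1901\n"
)


def read_export(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


rows = read_export(SAMPLE_EXPORT)
for row in rows:
    print(row["TitleID"], row["FullTitle"], row["StartYear"])
```

For a real export file, the same reader could be given an open file handle instead of an in-memory string.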

Page 8: Bibliographic References in BHL

Services

• Names Service
  – Return all occurrences of a name throughout BHL digitized corpus
    • Documentation: http://bit.ly/2e6sg9
  – Access to 100+ million name strings using TaxonFinder & NetiNeti
    • 1.5 million unique names
  – Algorithm to detect nomenclatural & taxonomic acts
• OpenURL
  – Facilitate links to citations: protologues, articles, references
    • Documentation: http://www.biodiversitylibrary.org/openurlhelp.aspx
  – Useful to Nomenclators, Reference Systems
    • IPNI
    • Tropicos

Page 9: Bibliographic References in BHL

Services: OpenURL

http://www.biodiversitylibrary.org/openurl?pid=title:3934&volume=14&issue=&spage=301&date=1879

http://www.tropicos.org/Name/1200408
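The OpenURL link on this slide can be assembled from a title identifier plus volume, start page and year. The sketch below rebuilds that exact URL; the query parameters (pid, volume, issue, spage, date) are the ones visible in the slide's example, while the function name and its arguments are only illustrative. Other parameters the resolver may accept are described in the OpenURL help page cited earlier and are not assumed here.

```python
from urllib.parse import urlencode

# Resolver address as it appears in the slide's example link.
BASE = "http://www.biodiversitylibrary.org/openurl"


def build_openurl(title_id, volume, start_page, year, issue=""):
    """Build an OpenURL query from article coordinates.

    The function name and signature are hypothetical; only the query
    parameter names come from the example URL on the slide.
    """
    params = [
        ("pid", f"title:{title_id}"),
        ("volume", volume),
        ("issue", issue),
        ("spage", start_page),
        ("date", year),
    ]
    # safe=":" keeps the colon in "title:3934" unescaped, as in the slide.
    return BASE + "?" + urlencode(params, safe=":")


url = build_openurl(3934, 14, 301, 1879)
print(url)
```

A nomenclator that already stores journal, volume, page and year for a protologue could generate such links without storing any BHL page identifiers.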

Page 10: Bibliographic References in BHL

DOIs

Page 11: Bibliographic References in BHL

DOIs for Legacy Literature

• BHL member of CrossRef through Smithsonian
• Started assigning DOIs to BHL monographs
  – Low hanging fruit: Easy, non-controversial
  – 54,856 DOIs Approved to date
• Next, other publication types / articles?
  – Process of automatically assigning CrossRef DOIs to articles has a higher potential for collisions.

Page 12: Bibliographic References in BHL

Article-level metadata

• Disambiguating and locating structural components in the corpus

• Done by automated and crowdsourced means– Thanks Rod Page! Welcome others!

• Greatly increases semantic value of the dataset

• Makes data addressable and thus linkable

Chapter-level metadata
Treatment-level metadata
Part-level metadata

Page 13: Bibliographic References in BHL

Genesis: “BHL Article Repository”

• Idea first introduced at TDWG 2008, Fremantle (by BHL, many have discussed for years)
• YouTube for biodiversity articles
• Needed (need) a way to access articles in BHL
  – “BHL has no articles.”
  – BHL has hundreds of thousands of articles but you can’t search for them by author or article title
  – Can find via “article coordinates” using BHL’s UI & OpenURL resolver: Journal / Volume / Start Page / Year

Page 14: Bibliographic References in BHL

CiteBank

• Objectives
  – Create a repository for community-vetted taxonomic bibliographies.
  – Ability to ingest, display, download, and index articles so that the BHL can operate as an article repository.
  – Provide links to content published online through other repositories.
• Launched on December 6th 2010
• 185,609 bibliographic records to date

Page 15: Bibliographic References in BHL

Citations today: http://citebank.org

Page 16: Bibliographic References in BHL

Citations Providers

Page 17: Bibliographic References in BHL

[Diagram: categories of citation providers]

• Specimen Databases
• Commercial Aggregators
• Software Tools
• Open Access Digital Libraries
• Indices
• Nomenclators
• Open Access Publishers
• International Collaborative Projects

Page 18: Bibliographic References in BHL

Lessons Learned

• Biblio/Drupal data model insufficient for mass of data envisioned for all biodiversity, too flat and difficult to expand in collaboration with Biblio development community

• Data providers want their content findable and managed in the Biodiversity Heritage Library, not a system alongside BHL

• Maintaining two platforms for biodiversity literature threatens sustainability of the literature resources over the longer term

Page 19: Bibliographic References in BHL

Global Names Architecture

Page 20: Bibliographic References in BHL

What have we done?

• Articles
  – Extended BHL data model to store article metadata
  – Built process to harvest data from BioStor
• Created user interfaces for adding article metadata and associated files
  – Defined functional requirements as improvements to Drupal-based CiteBank
  – Defined process flow for adding article metadata and associated files
  – Implemented UI changes
• Changed BHL UI to accommodate article search
• Changed BHL UI to accommodate article display (TOC)

Page 21: Bibliographic References in BHL

Articles in the BHL UI

Page 22: Bibliographic References in BHL

Articles

Page 23: Bibliographic References in BHL

Articles

Page 24: Bibliographic References in BHL

Articles

Page 25: Bibliographic References in BHL

Requirements for a citation repository?

Admin. Interface
  – IMPORT AND MAPPING TOOL
    • Preview/Accept/Reject/Undo/Report on Import
    • No standard schema, MODS or BibTeX
    • Drag & drop GUI or mapped source and target field config.
  – USER MANAGEMENT
    • Self-Registration
    • Admin. Approval & Deletion
    • User Roles Assignment
  – GLOBAL UPDATES
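One way to picture the "mapped source and target field config." requirement is a small table that renames a provider's fields to the repository's fields and reports whatever it could not map, so an administrator can preview the import before accepting it. Everything in this sketch (field names, the mapping, the function) is hypothetical and only illustrates the idea; it is not CiteBank's or BHL's implementation.

```python
# Hypothetical mapping from one provider's field names to target fields.
FIELD_MAP = {"ti": "title", "au": "author", "py": "year", "jo": "journal"}


def map_record(source, field_map):
    """Rename source fields per field_map.

    Returns (mapped, unmapped): fields with no mapping are kept aside so
    they can be shown in an import preview rather than silently dropped.
    """
    mapped, unmapped = {}, {}
    for key, value in source.items():
        if key in field_map:
            mapped[field_map[key]] = value
        else:
            unmapped[key] = value
    return mapped, unmapped


# Illustrative input record from an imaginary data provider.
record = {"ti": "An example article", "au": "Doe, J.", "py": "1901", "xx": "?"}
mapped, unmapped = map_record(record, FIELD_MAP)
print(mapped)
print(unmapped)
```

Keeping the mapping as plain data rather than code is what would let a drag and drop interface edit it per provider.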

Page 26: Bibliographic References in BHL

Requirements for a citation repository?

General User Interface
  – IMPORT
    • Upload/Preview/Accept/Reject/Undo/Report on Import
  – CREATE CITATION
    • By filling a Form, via BibTeX
  – BROWSE
    • Faceted: title, author, subject, year, contributor, my citations

Page 27: Bibliographic References in BHL

Requirements for a citation repository?

• CITATION TYPES
  – Journal Article, Book Chapter, Conference Proceedings, Conference Paper, Thesis, Government Report, Note, etc.
• OAI HARVESTING
  – Harvest and serve data through OAI-PMH
• SPECIFICATIONS FOR DATA PROVIDERS PAGE
• CONTRIBUTORS PAGE
  – Recognize ALL contributions
• REPORTING
  – Statistics Page by Citation and Publication type
  – Recent/Latest Uploads
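For the OAI harvesting requirement, a harvester typically issues ListRecords requests and follows resumption tokens until none is returned. The sketch below shows only the two pure steps, building the request URL and reading identifiers and the token from a response page. The verb, the metadataPrefix and resumptionToken arguments, and the XML namespace come from the OAI-PMH 2.0 protocol; the base URL, identifiers and sample response are invented for the example, and no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token is sent on its own."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Invented response page; a real one would come from the repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:item/1</identifier></header></record>
    <record><header><identifier>oai:example.org:item/2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_page(SAMPLE)
# "http://example.org/oai" is a placeholder endpoint, not a real service.
next_url = list_records_url("http://example.org/oai", resumption_token=token)
print(ids, next_url)
```

A full harvester would loop: fetch the URL, parse the page, and stop once the returned token is None.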

Page 28: Bibliographic References in BHL

What are we doing?

• Integrate BHL’s Services with ZooBank, IPNI & IF

• Authoritative list of titles in common use for nomenclatural acts (“TL3”)

• Harvest relevant content from Mendeley

• Integrate services and interfaces with the GNUB data model

• Interoperate with citation parsing tools & services

Page 29: Bibliographic References in BHL

Support citation reconciliation

L. Sp. Pl. 2: 971. 1753

Linneaus, C. Species Plantarum, vol. 2 p. 971. 1753

Linné, Carl von. Sp. Pl. Vol. 2 Page 971. 1753

Caroli Linnaei, Species Plantarum exhibentes plantas rite cognitas, ad genera relatas, cum Differentis Specificis, Nominibus Trivialibus, Synonymis Selectis, Locis Natalibus, secundum SYSTEMA SEXUALE digestas.. 2:971. 1753

Zea mays
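The citation strings on this slide all point to the same place in Species Plantarum even though author and title are written differently. A minimal sketch of one reconciliation step, assuming the volume, page and year sit at the end of the string, is to extract those three numbers and group citations that share them. The regular expression and function name are illustrative and cover only the patterns shown here; they are not the method used by BHL or by any citation parsing service, and a real reconciler would also need to compare titles and authors.

```python
import re
from collections import defaultdict

# Matches "<volume> : <page> . <year>" at the end of a citation, with
# optional "vol." and "p."/"Page" labels. Illustrative only.
COORDS = re.compile(
    r"""
    (?:vol\.?\s*)?             # optional "vol." / "Vol." label
    (?P<volume>\d+)\s*         # volume number
    (?::|,?\s*(?:p\.|page))    # ":" or a "p." / "Page" label
    \s*(?P<page>\d+)           # page number
    \.?\s*(?P<year>\d{4})\s*$  # four-digit year at the end of the string
    """,
    re.IGNORECASE | re.VERBOSE,
)


def coordinates(citation):
    """Return (volume, page, year) as ints, or None if nothing matches."""
    m = COORDS.search(citation)
    if m is None:
        return None
    return int(m["volume"]), int(m["page"]), int(m["year"])


# The four citation variants shown on the slide, copied as written there.
CITATIONS = [
    "L. Sp. Pl. 2: 971. 1753",
    "Linneaus, C. Species Plantarum, vol. 2 p. 971. 1753",
    "Linné, Carl von. Sp. Pl. Vol. 2 Page 971. 1753",
    "Caroli Linnaei, Species Plantarum exhibentes plantas rite cognitas, "
    "ad genera relatas, cum Differentis Specificis, Nominibus Trivialibus, "
    "Synonymis Selectis, Locis Natalibus, secundum SYSTEMA SEXUALE "
    "digestas.. 2:971. 1753",
]

groups = defaultdict(list)
for text in CITATIONS:
    groups[coordinates(text)].append(text)

for key, members in groups.items():
    print(key, len(members))
```

Grouping on numeric coordinates alone would merge unrelated works that happen to share volume, page and year, which is why the title and author still have to be matched in a second step.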

Page 30: Bibliographic References in BHL

Questions to Answer

1. Type of content - Literature, Images, OCR Text and Bibliographic Citations
2. Sources of content - BHL, CB & other Repositories
3. Formats of content - BibTeX, MODS, DC
4. Methods of gathering info - Harvesting, FTP Uploads
5. Methods of delivery of info - Free Searches, API, web services, exports, linking mechanisms
6. Identifiers used - CrossRef DOIs for Monographs
7. Interoperability with other platforms - ZooBank, IPNI, IF
8. Constraints, needs and expectations to suppliers of content and users of content

Page 31: Bibliographic References in BHL

Thank you

pro-iBiosphere Meeting 3
Coordination and routes for cooperation across organizations, projects and e-infrastructures
Berlin, Germany
May 23rd, 2013

[email protected] BHL Project Manager
BHL Technical Director
Senior Project Manager
Missouri Botanical Garden