REPRESENTING SERIALS METADATA IN INSTITUTIONAL REPOSITORIES
Lisa Gonzalez, Electronic Resources Librarian
NASIG 2015
Washington, DC
Do we even need journal metadata?
“What we want is articles,” said Gorman, calling the idea of putting them together in things called journals “irrelevant.”
Tenopir, Carol. “The Value of the Container.” Library Journal 131, no. 2 (February 1, 2006): 32.
Consider Discoverability
“If you're using repository or journal management software, such as Eprints, DSpace, Digital Commons or OJS, please configure it to export bibliographic data in HTML "<meta>" tags. Google Scholar supports Highwire Press tags (e.g., citation_title), Eprints tags (e.g., eprints.title), BE Press tags (e.g., bepress_citation_title), and PRISM tags (e.g., prism.title). Use Dublin Core tags (e.g., DC.title) as a last resort - they work poorly for journal papers because Dublin Core doesn't have unambiguous fields for journal title, volume, issue, and page numbers.”
Google Scholar Indexing Guidelines, https://scholar.google.com/intl/en-us/scholar/inclusion.html#indexing
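As a rough illustration of the guideline quoted above, the following Python sketch renders a citation record as Highwire Press `<meta>` tags. The record values, the repository URL, and the helper name `highwire_meta_tags` are hypothetical; the tag names follow the citation_* convention that Google Scholar documents.

```python
from html import escape


def highwire_meta_tags(record):
    """Render a dict of citation fields as Highwire Press <meta> tags."""
    lines = []
    for name, value in record.items():
        # Repeatable fields such as citation_author are passed as lists.
        values = value if isinstance(value, list) else [value]
        for v in values:
            lines.append(
                f'<meta name="{escape(name)}" content="{escape(str(v))}">'
            )
    return "\n".join(lines)


# Hypothetical article record, for illustration only.
example = {
    "citation_title": "An Example Article",
    "citation_author": ["Doe, Jane", "Roe, Richard"],
    "citation_journal_title": "Journal of Examples",
    "citation_publication_date": "2015/05/28",
    "citation_volume": "12",
    "citation_issue": "3",
    "citation_firstpage": "101",
    "citation_lastpage": "115",
    "citation_issn": "1612-9768",
    "citation_pdf_url": "https://repository.example.edu/article.pdf",
}

print(highwire_meta_tags(example))
```

Unlike plain Dublin Core, this gives journal title, volume, issue, and pages a separate, unambiguous field each.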
Example Article Citation Elements
Chicago Manual of Style
Article Title
Article Author
Journal Title
Journal Date
Journal Volume
Journal Issue
Page Range
Journal Article Tag Suite (JATS)
<article>
<article-meta>
<journal-meta>
<contrib>
<ref-list>
(Peroni, Lapeyre, and Shotton, 2012)
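To show how the JATS elements listed above nest, here is a minimal Python sketch that builds a skeleton document with the standard library. It is illustrative only: the function name and sample values are hypothetical, and the output is not validated against the JATS DTD, which requires further elements.

```python
import xml.etree.ElementTree as ET


def jats_skeleton(journal_title, article_title, surname, given_names):
    """Build a bare-bones JATS-style tree (not DTD-valid, structure only)."""
    article = ET.Element("article")
    front = ET.SubElement(article, "front")

    # <journal-meta> carries the container (journal) description.
    journal_meta = ET.SubElement(front, "journal-meta")
    title_group = ET.SubElement(journal_meta, "journal-title-group")
    ET.SubElement(title_group, "journal-title").text = journal_title

    # <article-meta> carries the article description, including contributors.
    article_meta = ET.SubElement(front, "article-meta")
    art_titles = ET.SubElement(article_meta, "title-group")
    ET.SubElement(art_titles, "article-title").text = article_title
    contrib_group = ET.SubElement(article_meta, "contrib-group")
    contrib = ET.SubElement(contrib_group, "contrib", {"contrib-type": "author"})
    name = ET.SubElement(contrib, "name")
    ET.SubElement(name, "surname").text = surname
    ET.SubElement(name, "given-names").text = given_names

    # <ref-list> sits in the back matter.
    back = ET.SubElement(article, "back")
    ET.SubElement(back, "ref-list")
    return article


root = jats_skeleton("Journal of Examples", "An Example Article", "Doe", "Jane")
print(ET.tostring(root, encoding="unicode"))
```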
OpenDOAR Directory for IRs
Includes description, policies summary, software platform, OAI-PMH availability, and size
Statistics for repositories include location, frequent languages, frequent content types, metadata and data re-use policies, and content, submission and preservation policies
About 85% of repositories represented have unknown, unstated, or undefined metadata re-use policies
University of Michigan Characteristics
DCTERMS.bibliographicCitation can refer to pre-print or publisher’s PDF
DC.type indicates the genre is article
DC.date.issued is year of publication
University of Queensland Characteristics
Includes journal title, volume, issue, start page, end page and date, plus ISSN – Highwire Press tags
Sub-type for article is not contained in <meta> tags with other Dublin Core elements, but in <body>
Now has Open Access Mandate Compliance field
Columbia University Characteristics
Includes Publisher and CU DOIs
Includes journal title, volume, issue, start page, end page and date – Highwire Press tags
Uses MODS metadata schema, but not in the <meta> tags
eLIS Characteristics
eprints.type and dc.type indicate preprint or journal article
eprints tags include publication title, volume, issue number and date range
Identifier examples include eprints.issn, eprints.id_number, eprints.official_url, and dc.identifier
University of Nebraska-Lincoln Characteristics
Uses bepress_citation tags – author, title and date
The citation information for the journal is contained in <body>
PDFs appear to be formatted according to Google Scholar inclusion guidelines
Bielefeld University Characteristics
Uses Highwire Press tags
Includes DOI
Includes ISSN
RDF example:
<link rel="DC.relation" href="urn:ISSN:0361-073x" />
UPEI Characteristics
Highwire Press tags for journal citation, except for citation_lastpage
Additional Dublin Core elements - DC.isPartOf also used for journal title, DC.type for Journal Article, and DC.identifier used for PMID
Pre-print status appears in record display
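The repository comparisons above come down to reading each landing page's `<meta>` tags and noting which journal-level fields are present. A minimal Python sketch of that check follows. The list of expected fields and the sample page are assumptions made for illustration, not an official checklist.

```python
from html.parser import HTMLParser

# Assumed set of journal-level Highwire Press tags to look for.
JOURNAL_FIELDS = [
    "citation_journal_title",
    "citation_volume",
    "citation_issue",
    "citation_firstpage",
    "citation_lastpage",
    "citation_issn",
]


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name, content = attr_map.get("name"), attr_map.get("content")
        if name and content is not None:
            self.tags.setdefault(name, []).append(content)


def missing_journal_fields(html_text):
    """Return the expected journal fields that the page does not expose."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    present = {name.lower() for name in collector.tags}
    return [field for field in JOURNAL_FIELDS if field not in present]


# Hypothetical landing page with no last page and no ISSN.
sample = """<html><head>
<meta name="citation_journal_title" content="Journal of Examples">
<meta name="citation_volume" content="12">
<meta name="citation_issue" content="3">
<meta name="citation_firstpage" content="101">
<meta name="DC.type" content="Journal Article">
</head><body></body></html>"""

print(missing_journal_fields(sample))
```

A check like this only sees what is in `<meta>` tags, so citation details that a repository places in `<body>` would be reported as missing.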
Starting a Data Dictionary
Identifier – ISSN (ISSN:1612-9768)
Identifier – DOI (URI)
Relation-IsPartOf – journal title
Identifier-BibliographicCitation – full citation
Type - “Journal Article” : http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_Type_Vocabulary_Encoding_Scheme#JournalArticle
Type - “text” : DCMI
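One way to make a data dictionary like this actionable is to attach a simple validation rule to each element. The Python sketch below checks the ISSN element using the standard ISSN check-digit calculation (weights 8 down to 2, modulo 11). The record's field names, DOI, journal title, and citation are hypothetical placeholders.

```python
import re

# Accepts bare ISSNs as well as the "ISSN:" and "urn:ISSN:" prefixed forms.
ISSN_PATTERN = re.compile(
    r"(?:urn:)?(?:issn:)?\s*(\d{4})-(\d{3})([\dx])", re.IGNORECASE
)


def issn_is_valid(value):
    """Check the form and the check digit of an ISSN string."""
    match = ISSN_PATTERN.fullmatch(value.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the first seven digits 8, 7, 6, 5, 4, 3, 2.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return match.group(3).upper() == expected


# Hypothetical record following the data dictionary elements above.
record = {
    "identifier.issn": "ISSN:1612-9768",
    "identifier.doi": "https://doi.org/10.1234/example",  # placeholder DOI
    "relation.isPartOf": "Journal of Examples",
    "identifier.bibliographicCitation": "Doe, Jane. \"An Example Article.\" "
    "Journal of Examples 12, no. 3 (2015): 101-115.",
    "type": ["Journal Article", "text"],
}

print(issn_is_valid(record["identifier.issn"]))
```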
Developing Good Practices
Try some tools to practice with Dublin Core metadata: http://www.dublincoregenerator.com/generator.html
Examples of useful documentation for our library include UIC Data Dictionary for CONTENTdm, Best Practices for CONTENTdm and Other OAI-PMH Compliant Repositories
Examples directly related to journal articles can be scattered across many data dictionaries, best practices, and other guidelines
Use Case – Green OA
“About 50% journal articles published during the past 12 months are freely available on the Internet. Nearly half of those OA articles are Green OA. There are millions of them on IRs, traditional journal Web sites, authors’ social network sites, and other Web sites.”
Xiaotian Chen, “Open Access Articles Reaching 50% But Their Retrieval is Lagging,” CARLI Annual Meeting, 2014.
Distinguishing Article Versions
MIT metadata indicating publisher’s PDF
Example record: http://hdl.handle.net/1721.1/92550
dc.eprint.version – Final published version
dc.relation.isversionof - http://dx.doi.org/10.1038/srep07467
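A harvester could use these two fields to tell the article version apart and to find the publisher's copy. The Python sketch below is a minimal illustration. Only the "Final published version" label comes from the example record; the keyword matching and the category names are assumptions.

```python
def describe_version(record):
    """Classify an IR record's article version and return the related DOI."""
    label = record.get("dc.eprint.version", "").strip().lower()
    doi = record.get("dc.relation.isversionof")

    # Keyword matching is a guess at how version labels might be worded.
    if "published" in label:
        kind = "publisher's version"
    elif "manuscript" in label or "preprint" in label or "pre-print" in label:
        kind = "author manuscript"
    else:
        kind = "unknown"
    return kind, doi


# Fields as shown in the MIT example record above.
mit_record = {
    "dc.eprint.version": "Final published version",
    "dc.relation.isversionof": "http://dx.doi.org/10.1038/srep07467",
}

print(describe_version(mit_record))
```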
Use Case – Zotero Integration and IRs
COinS – recognizes genre as article, but can be missing key citation elements
Embedded Metadata – often detects journal articles as web pages
DOI – can record publisher’s URL, rather than article version present in IR
Retrieve Metadata for PDF – only works if article is indexed in Google Scholar
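For reference, a COinS record is an empty HTML span of class "Z3988" whose title attribute holds an OpenURL key/value string. The Python sketch below builds one that carries the journal-level elements in addition to the genre. The function name, dictionary keys, and sample values are hypothetical.

```python
from html import escape
from urllib.parse import urlencode

# OpenURL journal-format keys to copy over when the record supplies them.
RFT_KEYS = ("atitle", "jtitle", "aulast", "aufirst", "date",
            "volume", "issue", "spage", "epage", "issn")


def coins_span(article):
    """Build a COinS <span> for a journal article from a dict of fields."""
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
    ]
    for key in RFT_KEYS:
        if article.get(key):
            pairs.append((f"rft.{key}", article[key]))
    # URL-encode the key/value pairs, then HTML-escape the ampersands.
    return f'<span class="Z3988" title="{escape(urlencode(pairs))}"></span>'


# Hypothetical article, for illustration only.
print(coins_span({
    "atitle": "An Example Article",
    "jtitle": "Journal of Examples",
    "aulast": "Doe",
    "aufirst": "Jane",
    "date": "2015",
    "volume": "12",
    "issue": "3",
    "spage": "101",
    "epage": "115",
    "issn": "1612-9768",
}))
```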
Use Case – OpenURL Link Resolver
SFX links to Google Scholar via getWebSearch, which is a citation title search
Could a link resolver link to IRs individually or, more likely, to a collection of IR metadata, such as OpenDOAR?