NATIONAL LIBRARY OF MEDICINE
PubMed Central
Edwin Sequeira
National Library of Medicine
May 26, 2004
What is PubMed Central?
• Digital archive of life sciences journals
  • includes health policy, bioinformatics and other fields
• Participation is open to journals:
  • covered by a major abstracting/indexing service
  • or that have 3 editorial board members with current grants from major non-profit funding agencies
• Free access to full-text articles and supporting data
• Integrated with PubMed and other bibliographic and factual databases in NCBI’s Entrez network
PMC Basic Policy
• Journal deposits an authoritative electronic copy that meets PMC data quality standards
  • full-text XML
  • original high-resolution graphics
  • PDF
  • supplementary data
• Journal may delay free access to its content
  • research articles are generally free in a year or less
• Copyright is retained by publisher or author
• Deposits – and free access permissions – are permanent
  • journal may stop depositing new material but may not withdraw material already deposited
PMC Archiving Model
• Multiple copies of archive on DVD and tape
• Catalog database tracks what’s where
[Diagram: PMC archiving workflow. Labels, in flow order:]
Journal files: SGML or XML in publisher’s DTD; images, PDFs, supplementary data files
→ Convert SGML to PMC XML common format; convert images to Web display format
→ Stored files: high-resolution image files, supplementary data files, PDFs, PMC XML files (common DTD), Web display images, source SGML/XML files
→ PMC Archive and PMC Public Access Database
→ Create online display pages dynamically from PMC database: PMC search results, TOC pages, full-text pages, and others
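The "catalog database tracks what’s where" idea can be sketched in a few lines. This is only an illustration, not the PMC catalog: the table layout, the `record_copy` and `locate` helpers, and the object and volume labels are all hypothetical.

```python
import sqlite3

# Hypothetical catalog: one row per stored copy of an archived file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE copies ("
    "object_id TEXT, file_name TEXT, medium TEXT, volume_label TEXT)"
)


def record_copy(object_id, file_name, medium, volume_label):
    """Register that a copy of file_name is stored on the given volume."""
    conn.execute(
        "INSERT INTO copies VALUES (?, ?, ?, ?)",
        (object_id, file_name, medium, volume_label),
    )


def locate(object_id):
    """Return (medium, volume_label) for every known copy of an object."""
    rows = conn.execute(
        "SELECT medium, volume_label FROM copies "
        "WHERE object_id = ? ORDER BY medium, volume_label",
        (object_id,),
    )
    return rows.fetchall()


# The same file is written to two media, so the catalog holds two rows.
record_copy("obj-0001", "article.xml", "DVD", "DVD-017")
record_copy("obj-0001", "article.xml", "tape", "TAPE-003")
locations = locate("obj-0001")
print(locations)
```

Keeping one row per copy, rather than one row per file, is what lets the catalog answer "where are all the copies of this object?" when a DVD or tape has to be replaced.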
Why???
Why XML?
• Preserves structure of an article
• Lends itself to intelligent processing
  • citation matching, selective searching, etc.
• Human readable – not dependent on technology
• Portable
Why Free?
• Readers provide another level of quality control
• The more eyes the better
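To make the "intelligent processing" point under Why XML concrete, here is a minimal sketch of selective searching and citation extraction over a structured article. The tag names in `SAMPLE` are simplified and invented for illustration; they are not the PMC DTD, and `search_field` and `citations` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

# Toy article using made-up, simplified tags (not the PMC DTD).
SAMPLE = """<article>
  <title>Example article</title>
  <abstract>Archiving of journals.</abstract>
  <body><p>XML preserves structure.</p></body>
  <references>
    <ref id="r1"><source>Journal A</source><year>2001</year></ref>
    <ref id="r2"><source>Journal B</source><year>2003</year></ref>
  </references>
</article>"""

root = ET.fromstring(SAMPLE)


def search_field(root, tag, term):
    """Selective searching: match term only inside elements named tag."""
    term = term.lower()
    return any(
        term in "".join(el.itertext()).lower() for el in root.iter(tag)
    )


def citations(root):
    """Pull each reference out as (id, source, year) for citation matching."""
    return [
        (ref.get("id"), ref.findtext("source"), ref.findtext("year"))
        for ref in root.iter("ref")
    ]


print(search_field(root, "abstract", "archiving"))
print(search_field(root, "title", "archiving"))
print(citations(root))
```

Because the structure is explicit, a search can be limited to the abstract or title, and each reference arrives as separate fields instead of a string that has to be re-parsed.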
Digital Journal Archiving Issues
• Ensuring quality of source materials
• Active use to ensure effective preservation
• Distribution of content to collaborating archives for added security
  • standard agreement covering rights and responsibilities of archiving organization
• Basic toolset for archive duplication / exchange:
  • common interchange DTD
  • standard file names
  • unique object and accession IDs
  • possibly, core software for loading content to database and displaying it online
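As a rough sketch of how standard file names and unique accession IDs could support exchange between archives, the example below builds a manifest entry that a receiving archive can check. The naming pattern, the field names, and the use of SHA-256 checksums are assumptions made for illustration; the slides do not specify any of them.

```python
import hashlib


def standard_name(journal_abbrev, accession_id, kind, ext):
    """Build a predictable file name (the pattern is a made-up example)."""
    return f"{journal_abbrev.lower()}-{accession_id}-{kind}.{ext}"


def manifest_entry(journal_abbrev, accession_id, kind, ext, data):
    """Describe one file so a receiving archive can identify and check it."""
    return {
        "accession_id": accession_id,
        "file_name": standard_name(journal_abbrev, accession_id, kind, ext),
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def verify(entry, data):
    """Check that received bytes match the manifest entry."""
    return (
        entry["size"] == len(data)
        and entry["sha256"] == hashlib.sha256(data).hexdigest()
    )


payload = b"<article>example</article>"
entry = manifest_entry("JEx", "000123", "fulltext", "xml", payload)
print(entry["file_name"])
print(verify(entry, payload))
```

The checksum is an extra beyond the listed toolset: it lets a collaborating archive confirm that a copy arrived intact without comparing it against the original.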
Genesis of the NLM Journal DTDs
In the beginning (Jan 2000) … custom handling of each journal (with different DTDs)
Within months … we need a common DTD
… enter the PMC DTD – keep it simple
… a simple DTD that accommodates a growing variety of incoming DTDs? Really?
Summer / Fall 2001 … we completely redesign and expand the PMC DTD
Early 2002 … Harvard / Mellon says “can we share?”
Early 2003 … we have the NLM modular DTD suite
… and we’ve learned that an Archiving DTD should not be a Publishing DTD
NLM Journal DTDs
• Journal Archiving and Interchange XML DTD
  • common format for storing and distributing content supplied in a variety of “source” DTDs
  • developed in cooperation with Mellon Foundation E-journal archiving program
• Journal Publishing XML DTD for original tagging of content at source
• Adopted by HighWire Press, JSTOR and many others
• Technical advisory group includes American Physical Society, HighWire Press, JSTOR, Microsoft
What To Archive?
“…you don't know what you've got
Till it's gone”
– Joni Mitchell
What the World Needs Now
• Journal production – authoring and copy-editing – using XML-based tools
  • published article comes from the XML, not vice versa
• Straightforward, universal standard for defining ownership and access rights, similar to copyright indication
  • evolving flavors of Open Access
  • changes of ownership
• Other operational, free archives that can form a collaborative archiving network
Back Issue Digitization
• Create a complete digital archive of PMC journals for today’s “if not online, it doesn’t exist” user
• Cover-to-cover digital copy of everything up to where journal began producing electronic copy
• Publisher gets free, unencumbered copy
• First complete archive, Bulletin of the Medical Library Association (1911), released in November 2003
• Expected collaboration with Wellcome Trust and UK Joint Information Systems Committee (JISC)