LOCKSS/CLOCKSS and Portico
What does that content look like?
Coalition for Networked Information
Fall 2006 Task Force Meeting
December 4, 2006
Presented by
Geneva L. Henry, Executive Director, Digital Library Initiative
Rice University
Carolyn Walters, Executive Associate Dean
Indiana University Libraries
Phyllis Davidson, Assistant Dean of Digital & Information Technology Services
Indiana University Libraries
Kerry A. Keck, Assistant University Librarian, Collections
Rice University
Agenda
Overview/Background
Content - What’s the difference?
Costs
Additional uses of the systems
Summary
Overview/Background
Why are we up here?
Warning: WE’RE NOT EXPERTS!
Rice and Indiana Universities are both users and members of LOCKSS, CLOCKSS and Portico
We hear people from various libraries and publishers say the darnedest things about these solutions
We thought you might want to hear an unbiased review and comparison from the library community
These efforts are important for libraries and should be taken very seriously if you subscribe to any electronic journals and would like to ensure that they are preserved for future access
Some background information
LOCKSS, CLOCKSS and Portico are solutions for preserving electronic journal content
LOCKSS (Lots of Copies Keep Stuff Safe) developed by Stanford. Began in 1999, beta tested through 2002, production system developed 2002 - 2004, released April 2004
CLOCKSS (Controlled LOCKSS) based on LOCKSS software, started early 2006, piloting with a small number of libraries and publishers for 2 years
Portico began as a JSTOR initiative in 2002 with funding from Mellon, became part of Ithaka Harbors, Inc. in 2004, then launched under the Portico name in 2005
Overview of system approaches
Portico http://www.portico.org/index.html
Centralized, hosted platform
Proprietary software
No customer equipment required except browser access
Source files collected, not web presentation and PDF where available
LOCKSS/CLOCKSS http://www.lockss.org/lockss/Home
Distributed, peer-to-peer platforms with error detection
Open source software
Small workstation required for LOCKSS; runs off of a CD
Specific server hardware required for CLOCKSS
Web presentation and PDF where available collected for LOCKSS; source also for CLOCKSS
Content - what’s the difference?
How the systems support perpetual access to content
Default access is always to the publisher’s existing website
Only if the publisher’s website is unavailable does access revert to the archival site
How the archival sites differ in supporting requests
Portico/CLOCKSS
Systems provide qualifying libraries supporting the archive with campus-wide access to archived content when specific trigger events occur, and when titles are no longer available from the publisher or other source. Trigger events include:
A publisher stops operations; or
A publisher ceases to publish a title; or
A publisher no longer offers back issues; or
A publisher's delivery platform suffers catastrophic and sustained failure
LOCKSS
The institution must cache archive units (journal volumes) to its local box as they are released by the publisher
Institutions often run web proxies, to allow off-campus access to subscriptions and to reduce the bandwidth cost of Web access. The LOCKSS Box integrates with these systems, intercepting requests from the community's browsers to the journals being preserved.
When a request for a page from a preserved journal arrives, it is first forwarded to the publisher. If the publisher returns content, that is what the browser gets. Otherwise the browser gets the preserved copy.
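The publisher-first, archive-second behaviour described above can be sketched in a few lines of Python. This is only an illustration of the fallback logic, not the LOCKSS software itself: the function name `serve_request`, the two fetcher callables and the example URLs are all assumptions made for the sketch, and in-memory dictionaries stand in for the publisher site and the LOCKSS box.

```python
from typing import Callable, Optional

# Hypothetical fetcher type: returns page bytes, or None when nothing is available.
Fetcher = Callable[[str], Optional[bytes]]


def serve_request(url: str,
                  fetch_from_publisher: Fetcher,
                  fetch_from_cache: Fetcher) -> Optional[bytes]:
    """Try the publisher first; fall back to the preserved copy otherwise."""
    try:
        content = fetch_from_publisher(url)
    except OSError:
        # Treat a network-level failure the same as "no content returned".
        content = None
    if content is not None:
        return content
    return fetch_from_cache(url)


# In-memory stand-ins for the publisher's site and the local LOCKSS box.
publisher_site = {"http://journal.example/v1/a1": b"publisher copy"}
lockss_box = {
    "http://journal.example/v1/a1": b"preserved copy",
    "http://journal.example/v1/a2": b"preserved copy of a2",
}

# The publisher still serves a1, so the reader gets the publisher's version.
print(serve_request("http://journal.example/v1/a1", publisher_site.get, lockss_box.get))
# The publisher no longer serves a2, so the reader gets the preserved copy.
print(serve_request("http://journal.example/v1/a2", publisher_site.get, lockss_box.get))
```

The point of the sketch is that the reader never chooses between sources: the same URL is requested either way, and the preserved copy is used only when the publisher returns nothing.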
And then there is CLOCKSS…
CLOCKSS differs from LOCKSS by both its structure and purpose: it is conceived as a small, responsible network providing a safety net - or dark archive - of subscription-based journals on behalf of a much broader community.
Content will be made available, if needed, through a third party to be determined (e.g. Google)
Publishers and archiving libraries will have to pay to participate
Participating pilot libraries: Indiana University, New York Public Library, OCLC, Rice University, Stanford University, University of Virginia, University of Edinburgh
What content is available?
There is substantial overlap in the publishers announcing content via Portico and via LOCKSS/CLOCKSS
There is notably less overlap in the serial titles and/or issues available for preservation via Portico and LOCKSS
Existing content comparison
                        Portico               LOCKSS
                        Number   % Unique     Number   % Unique
Publishers              4        50           48       96
Titles available (1)    408      96           548      97
(1) Data as of 11/30/2006. Many additional titles and publishers are committed to both systems
And then there is CLOCKSS
12 publishers are participating in the CLOCKSS initiative:
American Chemical Society
American Medical Association
American Physiological Society
Blackwell Publishing
Elsevier
Institute of Physics
Nature Publishing Group
Oxford University Press
SAGE Publications
Springer
Taylor and Francis
John Wiley & Sons
Sample displays from Audit menus
Note that these do not represent the experience of the user in the event of “publisher failure”
Portico’s display for available issues of a journal title
Portico has a very navigable auditing interface, comparable to end user resources
Portico’s display for an individual article
An individual article entry provides the DOI and links to the html and PDF files
Viewing the article in Portico
In html
Note that internal hyperlinks are all active and links to separately maintained images are present
External links (e.g. back to table of contents) may not be active within the audit view
As a PDF (where present)
LOCKSS’ display for available issues of journal titles
LOCKSS display for components from the journal volume…
Each article and image component of a volume is listed
An auditor must determine an appropriate ‘starting’ point - links from a table of contents screen may fail
LOCKSS display of an abstract-level screen from Oxford University Press
See a typical abstract record from the list
And the article html full text
Note internal links and separate images
And the PDF (where present)
Costs
Institutional costs of participation
LOCKSS
Membership/Fees: Free or $10,800 (Alliance member)
Equipment: Servers
Staff: Set up, maintain LOCKSS box, manage lists
CLOCKSS (over two year pilot project)
Membership/Fees: $28,500 + $10,800 (LOCKSS Alliance membership)
Equipment: $3,000 (servers)
Staff: Set up, maintain CLOCKSS box, manage lists
Portico
Membership/Fees: Tiered based on materials expenditures
Equipment: None
Staff: None
Our cost experience
LOCKSS costs:
Alliance Member: $10,800 annually
Three Servers: $3,085 one-time
Programmer: Two hours/week
Technical Services: 2-12 hours/month
Our cost experience
CLOCKSS (Two year pilot) *
Programmer support: $28,500
Servers: $3,000
Programmer: One hour/week
Technical Services: One hour/month
* Must be a member of the LOCKSS Alliance
Our cost experience
Portico - Indiana
Annual Membership: $15,200
Portico Archive Founder* (-25%): - $3,800
Annual cost after discount: $11,400
No equipment
No ongoing expenses
*25% discount for five years if joined in 2006; 10% discount for five years if joining in 2007
Portico - Rice
Annual Membership: $13,000
Portico Archive Founder* (-25%): - $3,250
Annual cost after discount: $9,750
No equipment
No ongoing expenses
*25% discount for five years if joined in 2006; 10% discount for five years if joining in 2007
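The discount arithmetic on the two Portico slides can be checked with a short Python sketch. The helper name `discounted_fee` is an assumption made for illustration; the membership fees and the 25% and 10% rates come from the slides, while the 2007 figures are simply what the 10% rate would give if the same annual fee applied, not numbers reported by either institution.

```python
def discounted_fee(annual_fee: int, discount_percent: int) -> int:
    """Annual fee after a percentage discount, in whole dollars."""
    return annual_fee * (100 - discount_percent) // 100


FOUNDER_DISCOUNT_2006 = 25  # Portico Archive Founder rate for joining in 2006
DISCOUNT_2007 = 10          # rate for joining in 2007

# Annual membership fees as shown on the slides.
fees = {"Indiana": 15_200, "Rice": 13_000}

for name, fee in fees.items():
    joined_2006 = discounted_fee(fee, FOUNDER_DISCOUNT_2006)
    joined_2007 = discounted_fee(fee, DISCOUNT_2007)  # assumes the same base fee
    print(f"{name}: ${joined_2006:,} if joined in 2006, ${joined_2007:,} if joining in 2007")

# Prints:
# Indiana: $11,400 if joined in 2006, $13,680 if joining in 2007
# Rice: $9,750 if joined in 2006, $11,700 if joining in 2007
```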
Additional Uses of the LOCKSS software
Other current applications for the LOCKSS software
Preserving federal digital publications
GPO LOCKSS Pilot Project http://www.access.gpo.gov/su_docs/fdlp/lockss/index.html
State of Alaska Project http://www.library.state.ak.us/asp/shippinglists/fy_2007/fy_2007_shippinglists.html
Preserving born-digital, freely available humanities journals
Humanities Project http://www.lockss.org/lockss/Related_Projects#Humanities_Project
Other current applications for the LOCKSS software
Electronic theses and dissertations repositories
Association of Southeastern Research Libraries http://scholar.lib.vt.edu/theses/ETDsASERLLOCKSS20050711PR.pdf (project announcement)
The ASERL LOCKSS-ETD INITIATIVE: Developing Preservation Strategies for Libraries that Publish E-Scholarship http://www.cni.org/tfms/2005b.fall/abstracts/handouts/CNI_ASERL_McDonald.ppt
International ETDs Preservation http://www6.bibl.ulaval.ca:8080/etd2006/pages/papers/SP10_ Kamini_Santhanagopalan.ppt
MetaArchive of Southern Digital Culture http://www.metaarchive.org/index.html
Summary
In summary …
These are the three highest profile preservation solutions available at this time for subscription-based library content
Others may be coming
Institutions have a responsibility to participate, contributing to developing solutions for preservation of the digital cultural record just as we did in the earlier, print-based era
There’s a solution for everyone, whether or not you have an IT staff
Considerations in selecting a solution
We currently preserve print journals, but the quantity of electronic journals is much greater. Even so, the cost of supporting all of these initiatives is still lower.
When looking at the options, where is the overlap with your titles?
Do faculty have interests in specific niche journals? These may be the most vulnerable.
Libraries can have tremendous influence with publishers in educating them about the need to preserve publications. LOCKSS, CLOCKSS and Portico provide an easy avenue for them to preserve their content.