Designing for the Discipline: Open Libraries and Scholarly Communication Thomas Krichel 2005-05-20

Designing for the Discipline: Open Libraries and Scholarly Communication

Thomas Krichel

2005-05-20

about this talk

• Three parts
  – normative theory
  – RePEc history
  – rclis future ideas

• And a final plea: all of this needs help.

scholarly communication

• is mainly about scholars communicating
  – between themselves
  – to students, occasionally

• Thus it is essentially a community activity

• Traditionally, there have been two intermediaries acting as external agents.
  – libraries
  – publishers

when tradition ends

• Two external shocks
  – The Internet arrives and reduces distribution costs to zero.
  – Computer technology arrives and reduces storage costs somewhat.

• The “opportunity sets” of community members and external agents increase.

• Proposition: the future depends much on what the community members decide. External agents have little impact.

discipline communities

• Scholars of various disciplines have varying habits of research, publication, and evaluation

• It is likely that the Internet will emphasize those differences rather than reducing them.

examples: disciplines with established informal publishing

• Preprint communities
  – Physics: arxiv.org
  – Mathematics: arxiv.org, partially

• Working paper communities
  – Computer Science: CiteSeer (working papers disappearing)
  – Economics: RePEc

change is tough

• Change has to come from inside the discipline.

• There has to be a pioneering individual who
  – is technically well versed
  – is managerially smart
  – has extraordinary forward thinking
  – is willing to take considerable risk with her career

• Ginsparg, Krichel, Giles & Lawrence are rare

and what about libraries?

• Libraries do it systematically wrong
  – concentrate on access
  – concentrate on readers
  – concentrate on documents

• They need to
  – move from access to impact
  – move from the reader to the writer
  – move from documents to people

RePEc

• RePEc is a freely available digital library related to Economics.

• It also provides a partial evaluative database.

• It is run entirely by a virtual organization of volunteers.

• I am the person who got it started in 1993.

• I will skip over the history.

RePEc principle

• Many archives
  – archives offer metadata about digital objects (mainly working papers)

• One database
  – The data from all archives forms one single logical database despite the fact that it is held on different servers.

• Many services
  – users can access the data through many interfaces.
  – providers of archives offer their data to all interfaces at the same time. This provides for an optimal distribution.
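
The "many archives, one database, many services" principle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the archive names, the `handle` and `title` fields, and both service functions are hypothetical and do not reflect the actual RePEc data format or software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    handle: str   # hypothetical unique identifier of an item
    title: str
    archive: str  # name of the archive that supplied the metadata


def merge_archives(archives):
    """Combine the metadata of all archives into one logical database."""
    database = {}
    for archive_name, items in archives.items():
        for item in items:
            database[item["handle"]] = Record(item["handle"], item["title"], archive_name)
    return database


def search_service(database, word):
    """One possible service: case-insensitive title search."""
    return sorted(r.handle for r in database.values() if word.lower() in r.title.lower())


def listing_service(database):
    """Another service over the same data: item counts per archive."""
    counts = {}
    for record in database.values():
        counts[record.archive] = counts.get(record.archive, 0) + 1
    return counts


# Two made-up archives, each held separately by its provider.
archives = {
    "archive_a": [{"handle": "a:1", "title": "Trade and Growth"}],
    "archive_b": [
        {"handle": "b:1", "title": "Growth Accounting"},
        {"handle": "b:2", "title": "Labour Markets"},
    ],
}
database = merge_archives(archives)
print(search_service(database, "growth"))   # ['a:1', 'b:1']
print(listing_service(database))            # {'archive_a': 1, 'archive_b': 2}
```

Both services read the same merged data, so an archive that contributes once is visible in every interface at the same time.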

RePEc is based on 460+ archives

• WoPEc
• EconWPA
• DEGREE
• S-WoPEc
• NBER
• CEPR
• US Fed in Print
• IMF
• OECD
• MIT
• University of Surrey
• CO PAH

to form a 312+k item dataset

• 153,000 working papers
• 157,000 journal articles
• 1,700 software components
• 1,000 book and chapter listings

and the really important stuff

• 7,000 author contact and publication listings
• 8,700 institutional contact listings

RePEc is used in many services

• EconPapers

• NEP: New Economics Papers

• Inomics

• RePEc author service

• Z39.50 service by the DEGREE partners

• IDEAS

• RuPEc

• EDIRC

• LogEc

• CitEc

institutional registration

• This works through a system called EDIRC.

• Christian Zimmermann started it as a list of departments that have a web site.

• I persuaded him that his data would be more widely used if integrated into the RePEc database.

• Now he is a crucial RePEc leader.

author registration

• It started when funding allowed us to hire a student programmer to write an author registration system.

• The system went online as "HoPEc" in late 2000.

• It has been renamed "RePEc author service" (RAS)

• In 2002, a grant from OSI allowed for a rewrite and expansion.

RePEc author service

• RePEc document data has author names as strings.

• The authors register with RAS to list contact details and identify the papers they wrote.

• This is classic access control, but done by the authors.

• Currently one in three items in RePEc has at least one identified author.
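
The relation between name strings and registered authors can be sketched as follows. The identifiers, the record layout, and the function names are assumptions made for illustration only; they are not the RAS data model.

```python
# Document data carries author names only as plain strings (made-up examples).
documents = {
    "doc:1": ["J. Smith", "A. Jones"],
    "doc:2": ["John Smith"],
    "doc:3": ["B. Lee"],
}

# Hypothetical registration records: a person lists contact details
# and claims the papers he or she wrote.
registrations = {
    "person:smith": {"name": "John Smith", "claimed": {"doc:1", "doc:2"}},
}


def identified_authors(documents, registrations):
    """Map each document to the set of registered persons who claimed it."""
    links = {doc_id: set() for doc_id in documents}
    for person_id, record in registrations.items():
        for doc_id in record["claimed"]:
            if doc_id in links:  # ignore claims on unknown documents
                links[doc_id].add(person_id)
    return links


def identified_share(documents, registrations):
    """Fraction of documents with at least one identified author."""
    links = identified_authors(documents, registrations)
    return sum(1 for persons in links.values() if persons) / len(documents)


print(identified_authors(documents, registrations)["doc:1"])  # {'person:smith'}
print(round(identified_share(documents, registrations), 2))   # 0.67
```

Note that "J. Smith" and "John Smith" are different strings; only the author's own claim ties both papers to the same person.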

LogEc

• It is a service by Sune Karlsson that tracks usage of items in the RePEc database
  – abstract views
  – downloads

• Christian Zimmermann sends a mail containing a monthly usage summary to
  – archive maintainers
  – RAS registrants

authors' incentives

• Authors perceive the registration as a way to achieve common advertising for their papers.

• Author records are used to aggregate usage logs across RePEc user services for all papers of an author.

• This stimulates an "I am bigger than you are" mentality. Size matters!
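
Aggregating usage logs per author might look like the sketch below. The log layout, the service names, and the event labels are hypothetical and chosen only to illustrate the idea; the actual LogEc processing is not described in this talk.

```python
from collections import Counter

# Hypothetical log entries: (service, document, event).
usage_log = [
    ("service_a", "doc:1", "abstract_view"),
    ("service_a", "doc:1", "download"),
    ("service_b", "doc:1", "abstract_view"),
    ("service_b", "doc:2", "download"),
    ("service_a", "doc:3", "download"),
]

# Papers claimed by each registered author (made-up identifiers).
author_papers = {
    "person:smith": {"doc:1", "doc:2"},
    "person:lee": {"doc:3"},
}


def usage_by_author(usage_log, author_papers):
    """Sum events over all services for all papers of each author."""
    totals = {person: Counter() for person in author_papers}
    for _service, doc_id, event in usage_log:
        for person, papers in author_papers.items():
            if doc_id in papers:
                totals[person][event] += 1
    return totals


totals = usage_by_author(usage_log, author_papers)
# Rank authors by total recorded usage, largest first.
ranking = sorted(totals, key=lambda p: sum(totals[p].values()), reverse=True)
print(dict(totals["person:smith"]))  # {'abstract_view': 2, 'download': 2}
print(ranking)                       # ['person:smith', 'person:lee']
```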

summary: keys to success

• Have a small group of volunteers.

• Disseminate as widely as possible.

• Collect precise usage logs.

• Demonstrate to authors and institutions that it works for them.
  – institutional registration
  – author registration

rclis

• rclis stands for Research in Computing and Library and Information Science.

• It is pronounced as “reckless”.

• It is a RePEc clone.

• It is my attempt to show that the same ideas that propel RePEc can also work in that area.

technical innovation

• RePEc is built on attribute: value templates.

• rclis is built on a purpose-built format called the Academic Metadata Format (AMF).

• I set up this format. It is tailor-made to suit the needs of rclis and RePEc.

• There is some usage of AMF in RePEc
  – RePEc OAI interface
  – ernad, the software feeding NEP
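
To make the idea of attribute: value templates concrete, here is a minimal parser sketch. The attribute names in the sample (`title`, `author-name`, `handle`) and the blank-line separator are assumptions for illustration; they are not taken from the actual RePEc template specification.

```python
def parse_templates(text):
    """Parse blank-line-separated templates made of 'attribute: value' lines.

    Repeated attributes (for example several authors) are collected in lists.
    """
    templates = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current template, if any.
            if current:
                templates.append(current)
                current = {}
            continue
        # Split at the first colon only, so values may contain colons.
        attribute, _, value = line.partition(":")
        current.setdefault(attribute.strip().lower(), []).append(value.strip())
    if current:
        templates.append(current)
    return templates


sample = """\
title: Trade and Growth
author-name: J. Smith
author-name: A. Jones
handle: example:1

title: Labour Markets
author-name: B. Lee
handle: example:2
"""

templates = parse_templates(sample)
print(len(templates))                # 2
print(templates[0]["author-name"])   # ['J. Smith', 'A. Jones']
print(templates[1]["handle"])        # ['example:2']
```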

E-LIS

• It is the largest LIS eprint archive on this planet.

• It lives at http://eprints.rclis.org.

• It contains over 2400 documents.

• It runs in Italy but uses a system of national editors to feed in material.

• I am one of the US editors.

DoIS

• DoIS is a service based on a Spanish LIS bibliography.

• It used to run at Manchester Computing, but JISC regulations forced us to move it. It now lives at http://wotan.liu.edu/dois.

• It contains 13k records, 9k with free full text, but the data has many errors.

using already existing resources

• There is already a very large computer science bibliography called DBLP, see http://dblp.uni-trier.de

• The data has no abstracts. It has some full-text links, mainly to toll-gated sites.

• I have done work to convert parts of it to AMF.

• I am now searching for free full-text versions of the papers anywhere on the Web. This is the Konz project.

the Konz project

• Current state
  – I use the Google API to search for titles.
  – I examine the responses and download pages.
  – I scan the pages for PDF and Word files.
  – I examine the text in the file to find the title.

• Limitations
  – only PDF and Word full text
  – conference paper data still being processed
  – significant hardware and disk problems
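
Two of the steps above, scanning a page for PDF and Word files and checking a file's text for the title, can be sketched as follows. This is not the Konz code: the function names, the file-extension test, and the title normalization are assumptions, and the search, download, and text-extraction steps are left out.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class DocumentLinkParser(HTMLParser):
    """Collect absolute URLs of links that look like PDF or Word files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: the file type is judged by the extension alone.
        if href and href.lower().split("?")[0].endswith((".pdf", ".doc")):
            self.links.append(urljoin(self.base_url, href))


def find_document_links(html, base_url):
    parser = DocumentLinkParser(base_url)
    parser.feed(html)
    return parser.links


def normalize(text):
    """Lowercase and reduce every run of non-alphanumerics to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def title_in_text(title, text):
    """Check whether the normalized title occurs in the normalized text."""
    return normalize(title) in normalize(text)


page = (
    '<p><a href="papers/growth.pdf">PDF</a> '
    '<a href="/about.html">About</a> '
    '<a href="talk.DOC">Word</a></p>'
)
links = find_document_links(page, "http://example.org/home/")
print(links)
print(title_in_text("Trade and Growth: A Survey", "TRADE AND GROWTH\nA survey\nJ. Smith"))
```

Normalizing both strings before comparison tolerates differences in case, punctuation, and line breaks between the bibliographic title and the text found in the file.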

DoCIS

• Konz currently finds free versions for 25k of the 98k papers searched. Not particularly exciting.

• This data is integrated with DBLP AMF data and the result forms a new service called DoCIS.

• DoCIS lives at http://wotan.liu.edu/docis

DoCIS service

• DoCIS is implemented in mod_perl with swish++ and is therefore very fast.

• The web pages are written by XSLT scripts directly from the AMF data.

• The service is available to copy from the web. I am more than happy for it to run on other sites.

• But the most interesting things are the service principles.

construction transparency

• DoCIS is an open digital library service because it allows users to inspect exactly how the service runs.
  – DoCIS is built using open source software.
  – There is a special interface, http://wotan.liu.edu/strip/docis/, that lets users see almost all internal files. Non-visible files are specially documented.

• The hope is that it may be used for teaching purposes.

transportability

• Everything in DoCIS is built in such a way that it should be easy to move the service somewhere else and establish copies.

• The idea may not make a lot of technical sense, but it should increase the non-proprietary nature of the system.

• Note that this has not been tested ;--)

usage transparency

• All usage is logged and the logs are made public.

• It is hoped that the logs can be used for digital library research.

• Ways will be found to aggregate usage on different physical installations.

to do list

• finish a version of konz that recognizes HTML full text

• integrate DoCIS and DoIS

• finish conversion of DBLP to AMF

• open institutional registration for rclis

• open author registration for rclis

• open a NEP-like service for rclis

http://openlib.org/home/krichel

Thank you for your attention!

collaboration is welcome!