
Page 1: dorsdl2006-arrow

The ARROW Project: A consortial institutional repository solution, combining Open Source and proprietary software

David Groenewegen

ARROW Project Manager

Page 2: dorsdl2006-arrow

DORSDL Workshop, 21 September 2006

Outline

• Why did we want a repository?
• What is ARROW?
• What is VITAL and how does it relate to Fedora?
• Where is ARROW now?
• What have we learnt so far?
• ARROW Stage-2

Page 3: dorsdl2006-arrow

Why did we want a repository?

• provides a platform for promoting research output in the ARROW context
• safeguards digital information
• gathers an institution’s research output into one place
• provides consistent ways of finding similar objects
• allows information to be preserved over the long term
• allows information from many repositories to be gathered and searched in one step
• enables resources to be shared, while respecting access constraints (when software allows access controls)
• enables effective communication and collaboration between researchers

Page 4: dorsdl2006-arrow

What is ARROW?

ARROW Project:
• Originally funded for 3 years until December 31, 2006, recently extended for 12 months.
• Funded by the Australian Commonwealth Department of Education, Science and Training (DEST), under the Research Information Infrastructure Framework for Australian Higher Education.
• “The ARROW project will identify and test software or solutions to support best practice institutional digital repositories comprising e-prints, digital theses and electronic publishing.”

Page 5: dorsdl2006-arrow

Who is ARROW?

Founding ARROW Partners:
• Monash University (lead institution)
• National Library of Australia
• The University of New South Wales
• Swinburne University of Technology

ARROW Members:
• University of South Australia
• University of Southern Queensland
• Queensland University of Technology
• Central Queensland University
• University of Western Sydney
• La Trobe University
• 4 other RUBRIC members are expected to sign soon

Together they form the ARROW Community

Page 6: dorsdl2006-arrow

What did the ARROW project set out to achieve?

• Solution for storing any digital output
  • Initial focus on print equivalents – theses, journal articles
  • Now looking at other datasets, learning objects
• More than just Open Access – some things need to be restricted
  • Copyright
  • Confidentiality/ethical considerations
  • Work in progress

Page 7: dorsdl2006-arrow

What did the ARROW project set out to achieve? (2)

• Meeting DEST reporting requirements
  • Expected move to Research Quality Framework (RQF) has increased the focus on repositories
• Employ Open Standards
  • Making sure the data is transferable in the future
• Deliver Open Source Tools back to the Fedora Community
• Solution that could offer ongoing technical support and development past the end of the funding period

Page 8: dorsdl2006-arrow

What is ARROW now?

• A development project
• Combining Open Source and proprietary software:
  • Fedora™
  • VITAL
  • Open Journal Systems (OJS)
• NOT a centralised or hosting solution
  • Every member has their own hardware and software

Page 9: dorsdl2006-arrow

Why Fedora?

ARROW wanted:
• a robust, well architected underlying platform
• a flexible object-oriented data model
• to be able to have persistent identifiers down to the level of individual datastreams, accommodating its compound content model
• to be able to version both content and disseminators (think of software behaviours for content)
• clean and open exposure of APIs with well-documented SOAP/REST web services.
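
The data-model requirements above (compound objects, identifiers down to individual datastreams, versioned content) can be illustrated with a small sketch. This is not Fedora's API or storage format: the class names, the `demo:1` identifier and the `PID/datastream-ID` identifier scheme are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Datastream:
    """One component of a compound object, keeping every version (hypothetical model)."""
    ds_id: str
    versions: list = field(default_factory=list)  # oldest version first

    def add_version(self, content: bytes) -> int:
        """Append a new version and return its 1-based version number."""
        self.versions.append(content)
        return len(self.versions)

    def get(self, version=None) -> bytes:
        """Return the requested version, or the latest one when version is None."""
        if not self.versions:
            raise LookupError(f"datastream {self.ds_id} has no content")
        if version is None:
            return self.versions[-1]
        if not 1 <= version <= len(self.versions):
            raise LookupError(f"no version {version} of datastream {self.ds_id}")
        return self.versions[version - 1]


@dataclass
class DigitalObject:
    """A compound object: one persistent identifier, many named datastreams."""
    pid: str
    datastreams: dict = field(default_factory=dict)

    def datastream(self, ds_id: str) -> Datastream:
        """Return the named datastream, creating it on first use."""
        if ds_id not in self.datastreams:
            self.datastreams[ds_id] = Datastream(ds_id)
        return self.datastreams[ds_id]

    def identifier(self, ds_id: str) -> str:
        """Identifier for a single datastream (assumed scheme: object PID plus datastream ID)."""
        return f"{self.pid}/{ds_id}"


thesis = DigitalObject("demo:1")
pdf = thesis.datastream("PDF")
pdf.add_version(b"draft")
pdf.add_version(b"final")
print(thesis.identifier("PDF"), pdf.get(1), pdf.get())
```

Earlier versions stay addressable after an update, which is the property the versioning bullet asks for.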

Page 10: dorsdl2006-arrow

ARROW and Fedora™

• Since the beginning of the project ARROW has worked actively and closely with Fedora™ and the Fedora Community
  • ARROW Project Technical Architect is a member of Fedora Advisory Board
  • ARROW Project Technical Architect sits on Fedora Development Group
• This is reinforced by VTLS Inc.
  • VTLS President is a member of Fedora Advisory Board
  • VITAL Lead Developer sits on Fedora Development Group

Page 11: dorsdl2006-arrow

Partnering for success, support and survivability

• ARROW needed to partner with a developer who could not only produce the software but could provide ongoing user support and development after December 31, 2006
• Why VTLS Inc.?
  • VTLS wanted to be a development partner
  • Had begun work on a repository solution already
  • Familiar with library sector
  • Willing to produce a combination of a proprietary solution, Fedora and other Open Source software

Page 12: dorsdl2006-arrow

What is VITAL?

ARROW specified software created and fully supported by VTLS Inc. built on top of Fedora™ that currently provides:
• VITAL Manager
• VITAL Portal
• VITAL Access Portal
• VALET - Web Self-Submission Tool
• Batch Loader Tool
• Handles Server (CNRI)
• Google Indexing and Exposure
• SRU / SRW Support

Page 13: dorsdl2006-arrow

VITAL architecture overview

[Diagram. Components shown: Fedora™, Indexes, Handles server, Web services, Google exposure, SRU/SRW, Batch Loading Tool, Access Portal, VALET, VITAL Manager]

Page 14: dorsdl2006-arrow

Where are we now?

2004
• Developed architecture
• Selected, tested Fedora™ OJS
• VITAL 1.0

2005
• VITAL 1.3
• Started populating repositories
• OAI-PMH harvesting
• ARROW Discovery Service
• Open source tools released
• VITAL 2.0

2006
• VITAL 2.1
• VITAL 3.0 (in test)
• Authentication/Authorization Services
• Enhanced Content Models
• Usage and access statistics
• User configurable interfaces
• Movement towards a pure Web based interface
• Support for OAI sets
• Integration with 3rd party modules like federated search

2007
• ARROW Stage-2

Page 15: dorsdl2006-arrow


ARROW Repositories

Monash University http://arrowprod.lib.monash.edu.au:8000/access

University of New South Wales http://arrow.unsw.edu.au/

Swinburne University of Technology http://researchbank.swinburne.edu.au/access/

Central Queensland University http://library-resources.cqu.edu.au:8888/access/

Page 16: dorsdl2006-arrow

Implementation decisions

• Atomistic or compound objects?
• Descriptive metadata
  • adopt one or enjoy MANY types?
  • JHOVE validation
  • JHOVE metadata extraction
• Use cases and content modelling
• What import/export formats?
  • honouring what standards?
  • validation, when and how?
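
As one concrete instance of the descriptive-metadata and export-format questions above, here is a minimal sketch that serialises a record as unqualified Dublin Core in the `oai_dc` container that OAI-PMH harvesters expect. The two namespace URIs are the standard ones; the function name and the sample record are hypothetical, and nothing here reflects how VITAL or Fedora actually store metadata.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_oai_dc(fields):
    """Serialise {element name: [values]} as an oai_dc XML string.

    Repeated values become repeated elements, which Dublin Core allows.
    """
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record, for illustration only.
record = {
    "title": ["A sample thesis"],
    "creator": ["Doe, J."],
    "type": ["Thesis"],
}
print(build_oai_dc(record))
```

A richer schema such as MODS could be produced the same way; the "one or many types" decision is about which of these serialisations the repository commits to maintaining.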

Page 17: dorsdl2006-arrow

Policy frameworks and decisions

• Direct or mediated deposit?
  • managing workflows
• Open or closed access?
  • LDAP authentication?
  • XACML authorisation
  • creating policies - who can do what?
  • Shibboleth
• Persistent URL format?
• External searching and harvesting?
  • OAI-PMH
  • spidering
• post ARROW project support

For more detail see Andrew Treloar’s talk at:
http://www.lib.virginia.edu/digital/fedoraconf/schedule.shtml
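
The "who can do what?" question can be made concrete with a toy policy evaluator. This is a sketch of the general idea behind rule-based authorisation (ordered rules, first match wins, deny by default), not XACML syntax and not the mechanism used by Fedora or VITAL; the roles, actions and collection names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    role: str        # who: a role name, or "*" for anyone
    action: str      # what: e.g. "read" or "deposit", or "*"
    collection: str  # where: a collection name, or "*"
    effect: str      # "permit" or "deny"


def decide(rules, role, action, collection):
    """Return the effect of the first matching rule; deny when nothing matches."""
    for rule in rules:
        if (rule.role in ("*", role)
                and rule.action in ("*", action)
                and rule.collection in ("*", collection)):
            return rule.effect
    return "deny"


# Hypothetical policy: embargoed theses are closed, everything else is
# open to read, and only staff may deposit.
POLICY = [
    Rule("*", "read", "theses-embargoed", "deny"),
    Rule("*", "read", "*", "permit"),
    Rule("staff", "deposit", "*", "permit"),
]

print(decide(POLICY, "public", "read", "theses"))
print(decide(POLICY, "public", "deposit", "theses"))
```

Rule order matters in this sketch: the specific embargo rule has to come before the general open-access rule, which is the kind of detail a policy framework has to settle.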

Page 18: dorsdl2006-arrow

External searching and harvesting

• Realised need to develop a discovery service for Australian institutional repositories
  • The ARROW Discovery Service, developed by the NLA, provides consolidated searching across many Australian repositories (uses OAI-PMH)
  • Picture Australia, developed by the NLA, harvests image collections (uses OAI-PMH)
• SRU/SRW interface released as Open Source Software
• Harvesting
  • Google and other service providers
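
For readers unfamiliar with how OAI-PMH harvesting works, the sketch below builds a `ListRecords` request and reads one page of a response. The verb, the `metadataPrefix`, `set` and `resumptionToken` parameters and the XML namespace come from the OAI-PMH 2.0 specification; the base URL, set name and sample response are made up, and this is not the ARROW Discovery Service's own harvester.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build a ListRecords request URL.

    When continuing a harvest, the resumption token is the only argument
    allowed alongside the verb.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    return identifiers, (token or None)


# Hypothetical response fragment, trimmed to the parts the parser reads.
SAMPLE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
    "<record><header><identifier>oai:repo.example.org:1</identifier></header></record>"
    "<resumptionToken>page2</resumptionToken>"
    "</ListRecords></OAI-PMH>"
)

print(list_records_url("http://repo.example.org/oai", set_spec="theses"))
print(parse_page(SAMPLE))
```

A harvester repeats the request with each returned token until the response carries no token, which is how a service provider can gather many repositories into one index.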

Page 19: dorsdl2006-arrow

ARROW Discovery Service

http://search.arrow.edu.au/

Provides a national resource discovery service including:
• providing an appropriate search interface
  • simple search, advanced search, & browse options
• contributing to other networks
  • OAIster, Yahoo, Google
• ensuring appropriate local institutional and national “branding” of the service
  • occurs throughout the ADS interface and the exchanged metadata
• providing appropriate subject-based access
  • The Australian Standard Research Classification list

Page 20: dorsdl2006-arrow

Open Source contributions for Fedora

Already made:
• SRU/SRW
• HANDLES
• JHOVE Metadata extraction
• Exposure to Web indexing crawlers.

Coming in 2006:
• LDAP Authentication
• Administrative Reporting
• Bulk Citation Export
• Statistics for Public Users
• Metadata Synchronisation Requirements
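
To show what the SRU contribution exposes to clients, here is a minimal sketch of an SRU 1.1 `searchRetrieve` request. The parameter names (`operation`, `version`, `query`, `maximumRecords`) are from the SRU specification and the query is written in CQL; the endpoint URL and the `dc.title` index are assumptions, since the indexes a given server supports vary.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, maximum_records=10):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,  # urlencode percent-escapes the CQL for us
        "maximumRecords": str(maximum_records),
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and index name.
print(sru_search_url("http://repo.example.org/sru", 'dc.title = "open access"'))
```

Because the whole request is a URL, any HTTP client can search the repository without a dedicated library, which is the main attraction of SRU over its SOAP-based sibling SRW.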

Page 21: dorsdl2006-arrow

Future of VITAL

Upcoming VITAL Version 3.0
• Authentication/Authorization Services.
  • XACML (Policy enforcement)
• Enhanced Content Models.
• Usage and access statistics.
• User configurable interfaces.
• Movement towards a pure Web based interface.
• Support for OAI sets.
• Integration with 3rd party modules like federated search.
• Access to content via VTLS reseller arrangements.

Page 22: dorsdl2006-arrow

What have we learnt so far?

• Multiple partners are good:
  • Sharing of information and experiences
  • Sharing of development work
  • Multiple perspectives on issues
• and bad:
  • Multiple perspectives on issues
  • Scope creep
  • Managing expectations
  • Pressure on the project management team
  • Pressure on development team and partners
  • Deadline conflicts
• Software development feels slow, both commercial and open source
• Development with a commercial partner can be tricky

Page 23: dorsdl2006-arrow

What have we learnt so far? (2)

• That there aren’t enough real standards in this area
• Open versus closed repositories, or information management versus accessibility, is a BIG ISSUE
• Repositories are only partly about software - advocacy, policy, institutional engagement and grunt work need equal attention
• Constraints of dealing with copyright

Page 24: dorsdl2006-arrow

ARROW Stage 2

• Funded to the end of 2007
• Supporting the RQF
• Creative development of institutional repositories
• Supporting Australian engagement with institutional repositories
• Building partnerships to further enhance repositories
• Identifier Management Infrastructure for e-Research Resources (PILIN)

Page 25: dorsdl2006-arrow

Some changes in direction

• Trying to do more development ourselves to:
  • Spread the knowledge
  • Leverage our use of Fedora
• Want to work with VTLS in new ways
  • Contract is finished now
  • Some work we need to do is too local for VTLS
  • VINES

Page 26: dorsdl2006-arrow

Supporting the RQF

• Inclusion of all discrete pieces of evidence, regardless of content type
  • Including traditional text evidence and less traditional evidence, such as art works and music compositions or performances
• Provision for maximum possible exposure of content
  • Subject to copyright constraints.
• Inclusion of metadata and links to content in commercial resources.
• Reporting to DEST through multiple channels
  • Such as Research Master, or direct to the repository.
• Support for access and authorisation regimes.
• Retention of all evidence
  • To build institutional research profiles over time.

Page 27: dorsdl2006-arrow

Creative development of ARROW institutional repositories

• Inclusion of multimedia and creative works produced in Australian universities.
  • To date have had limited exposure nationally or internationally.
• Addition of annotation capability
• Inclusion of datasets and other research output not easily provided in any other publishing channel.
  • In conjunction with the DART (ARCHER) Project.
• Exploration of the research-teaching nexus by facilitating multiple uses of content held in repositories.
• Integration with or development of new tools that will allow value added services for repositories.
  • For instance the creation of e-portfolios or CVs of research output of individual academics.

Page 28: dorsdl2006-arrow


ARROW Projects

ARROW is planning a number of local projects targeting local and community needs. These will interact directly with Fedora™ and VITAL where appropriate. The development is being done within the ARROW Community.

Page 29: dorsdl2006-arrow

Partner projects in 2007

• Gathering research output from websites (UNSW)
• Displaying outputs through websites (portfolios) (UNSW/Swinburne)
• Understanding workflows and needs of academics (UNSW/Swinburne)
• Improving the ARROW Discovery Service (NLA)
  • OAI Sets support
  • Greater automation
  • Statistics capture and reporting
  • Integration of e-journals
• Usability analysis (Swinburne)
• Data needs survey (Swinburne)
• Building Rules for Access to Controlled Electronic Resources (BRACER) (Monash)

Page 30: dorsdl2006-arrow

Supporting Australian engagement with institutional repositories

The FRODO and MERRI projects have resulted in a significant leap in the levels of understanding of and engagement with repositories in Australia.

Now the challenge is to translate this into substantial repository activity.

The newly formed ARROW Community is intended to provide a central platform for support and the exchange of information.

Page 31: dorsdl2006-arrow

The ARROW community

• Sharing knowledge and experiences
• Annual meeting – inaugural one September 8, 2006
• Regular workshops
• Working Groups
  • ARROW Repository Managers Group
  • ARROW Development Group
  • Possibly groups for:
    • Portfolio design
    • Metadata: METS, MODS, DC and the future
• Discussion group
  • GoogleGroup
• ARROW provides logistical and admin support

Page 32: dorsdl2006-arrow

Building partnerships to further enhance repositories

Through partnerships with other projects ARROW will endeavour to use best practice and innovations to enhance Australian repositories beyond their current limitations.

These include:
• APSR: http://www.apsr.edu.au/
• DART/ARCHER: http://www.dart.edu.au/
• ICE: http://ice.usq.edu.au/
• MAMS: http://www.melcoe.mq.edu.au/projects/MAMS/
• OAK-Law: http://www.oaklaw.qut.edu.au/
• RUBRIC: http://www.rubric.edu.au/

Page 33: dorsdl2006-arrow

PILIN - Persistent Identifiers and Linking INfrastructure

• Growing realisation that sustainable identifier infrastructure is required to deal with the vast amount of digital assets being produced and stored within universities.
• This is a particular challenge for e-research communities where massive amounts of data are being generated without any means of managing this data over any length of time.
• The broad objectives are to:
  • Support adoption and use of persistent identifiers and shared persistent identifier management services by the project stakeholders.
  • Plan for a sustainable, shared identifier management infrastructure that enables persistence of identifiers and associated services over archival lengths of time.
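
The core idea behind persistent identifiers is a level of indirection: the identifier never changes, and only the registry entry behind it is updated when an object moves. The sketch below shows that idea with an in-memory registry. It is not the Handle System protocol or anything PILIN specified; the class, the prefix `12345` and the URLs are hypothetical.

```python
class IdentifierRegistry:
    """Toy registry mapping persistent identifiers to current locations."""

    def __init__(self, prefix):
        self.prefix = prefix  # naming authority prefix (made-up value below)
        self._locations = {}
        self._next_suffix = 1

    def mint(self, url):
        """Create a new identifier pointing at url and return it."""
        identifier = f"{self.prefix}/{self._next_suffix}"
        self._next_suffix += 1
        self._locations[identifier] = url
        return identifier

    def move(self, identifier, new_url):
        """Record that the object now lives at new_url; the identifier is unchanged."""
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_url

    def resolve(self, identifier):
        """Return the current location; raises KeyError for unknown identifiers."""
        return self._locations[identifier]


registry = IdentifierRegistry("12345")
pid = registry.mint("http://old-server.example.org/objects/42")
registry.move(pid, "http://new-server.example.org/items/42")
print(pid, "->", registry.resolve(pid))
```

Citations that use the identifier keep working after the move. The sustainability question raised above is who keeps such a registry running, and correct, over archival lengths of time.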

Page 34: dorsdl2006-arrow


Questions?

ARROW Project [email protected] http://arrow.edu.au/

ARROW Project Manager [email protected]