

Flexible and Extensible Digital Object and Repository Architecture (FEDORA)

Sandra Payette

Cornell University

[email protected]

http://www.cs.cornell.edu/payette/presentations/fedora-gdz.ppt

Dritter Workshop der Digitalisierungszentren, October 5, 1999

Cornell Digital Library Research Group

• Computer Science Department: Bill Arms, Carl Lagoze, Sandy Payette, Naomi Dushay, David Fielding

• Affiliates: Anne Kenney (Cornell Library), Geri Gay (Human Computer Interaction), CNRI

CDLRG - Projects

• Prism (DLI2)

• Fedora

• Harmony (IDL)

• Dienst and NCSTRL

• Electronic Scholarly Publishing: D-Lib, Citation Linking (IDL)

Digital Library Interoperability

[Figure: interoperability between digital libraries; labels: Library of Congress, Cornell Digital Library]

Principles for Digital Library Architecture

• Open Architecture: functionality partitioned into a set of well-defined services; services accessible via a well-defined protocol

• Modularization: promotes interoperability; scalable to different clientele (library, informal web)

• Federation: enables aggregation into logical collections

• Distribution: of content and services; of administration and management

Component-Ware Digital Libraries

[Figure: component services; labels: Repository Service (DigitalObjects), Collection Service, Index Service, Name Service (Identifiers), UI Gateway Service, Query Mediator Service, UI]

FEDORA

• Digital Object Model: container for aggregating any digital material; disseminations of complex types; global extensibility mechanisms; access management

• Repository Service: service layer for “contained” DigitalObjects; object lifecycle management; secure environment; open interface

FEDORA: Goals

• Distribution - of digital content and services

• Interface Stability - for digital objects

• Interoperability - for digital objects and repositories

• Extensibility - naturally evolving type system

• Flexibility - community-driven type development

• Security - rights management and access control

• Preservation - longevity of digital objects

FEDORA History

• Kahn/Wilensky

• Warwick Framework

• Distributed Active Relationships

• Cornell FEDORA (Lagoze, Payette)

• CNRI Repository (Arms, Blanchi, Overly)

• CNRI/FEDORA - Interoperability Project

• UVA - Complex disseminators, distribution

• Project Prism (DLI2)

FEDORA DigitalObjects can be...

• Simple, familiar entities

• Complex, compound, dynamic objects

[Figure: a DigitalObject with example disseminator types; labels: DublinCore, Book, Diary, Future]

FEDORA DigitalObject Model

• Internal DataStream: MIME-typed stream of bytes

• Reference DataStream: service request upon an external source

• Dissemination

Disseminator Type

A set of behaviors that formally describes the functionality of any global or community-specific notion of content.

Example behaviors: getSection, getArticle, getChapter, getPage, getFrame, getLength

Disseminator

A generic component that associates a set of behaviors with a DigitalObject.

• Primitive Disseminator: generic behaviors

• Extensible Type Disseminator: extended behaviors

FEDORA DigitalObject

[Figure: a DigitalObject with a PrimitiveDisseminator over its DataStreams; labels: DS1 (application/MARC), DS2 (application/postscript), several image/gif streams]

Client communicates with generic requests

• ListDisseminatorTypes returns: Book, DublinCore

• GetMethods(Book) returns: GetChapter(n), GetPage(n), GetTOC()

• GetDissemination(Book.GetPage(1))

[Figure: a DigitalObject with a Book Disseminator (GetChapter, GetTOC, GetPage) and a DublinCore Disseminator]
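The request pattern on this slide can be sketched in Python. This is a minimal illustration, not the FEDORA reference implementation. Only the request names (ListDisseminatorTypes, GetMethods, GetDissemination) and the Book and DublinCore types come from the slide; the class names, the snake_case method names, and the sample page text are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Disseminator:
    """Associates a named set of behaviors (a Disseminator Type) with an object."""
    type_name: str
    methods: Dict[str, Callable[..., object]]


@dataclass
class DigitalObject:
    """Hypothetical container that answers the three generic requests."""
    disseminators: Dict[str, Disseminator] = field(default_factory=dict)

    def list_disseminator_types(self) -> List[str]:
        return sorted(self.disseminators)

    def get_methods(self, type_name: str) -> List[str]:
        return sorted(self.disseminators[type_name].methods)

    def get_dissemination(self, type_name: str, method: str, *args) -> object:
        # The client names a type and a method; the object dispatches the call.
        return self.disseminators[type_name].methods[method](*args)


# Sample content, invented for the example.
pages = ["Title page", "Chapter 1 begins"]

book = Disseminator("Book", {
    "GetPage": lambda n: pages[n - 1],
    "GetTOC": lambda: "1. Chapter 1",
    "GetChapter": lambda n: f"Chapter {n}",
})
dublin_core = Disseminator("DublinCore", {
    "GetDCField": lambda name: {"Title": "Sample Book"}.get(name),
    "GetDCRecord": lambda: {"Title": "Sample Book"},
})
obj = DigitalObject({"Book": book, "DublinCore": dublin_core})

print(obj.list_disseminator_types())
print(obj.get_methods("Book"))
print(obj.get_dissemination("Book", "GetPage", 1))
```

The client never needs type-specific code paths: it discovers the types, discovers their methods, and then issues one generic dissemination request.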

A Disseminator...

… references a Servlet

TYPE DESCRIPTION = DublinCore

SERVLET = cornell.dli2/DC-from-MARC

… to produce non-generic behaviors for the DigitalObject

• GetMethods(DC) returns: GetDCField(Title), GetDCRecord

[Figure: a DigitalObject with a DC Disseminator (GetDCField, GetDCRecord) over DataStreams DS1 (application/MARC) and DS2 (application/postscript)]
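As a rough sketch of how a servlet such as DC-from-MARC could produce DublinCore behaviors from a MARC DataStream, consider the following Python. It is an assumption-laden illustration: the slides do not show the servlet's code, the two-entry crosswalk is hypothetical, and the MARC record is modeled as a flat tag-to-value mapping rather than real MARC encoding.

```python
from typing import Dict, Optional

# Hypothetical crosswalk from MARC field tags to Dublin Core element names.
# 245 (title statement) and 100 (main entry, personal name) are standard MARC
# tags; a real crosswalk would cover many more fields and subfields.
MARC_TO_DC: Dict[str, str] = {"245": "Title", "100": "Creator"}


def dc_from_marc(marc_fields: Dict[str, str]) -> Dict[str, str]:
    """Stand-in for the servlet: derive a Dublin Core record from MARC fields."""
    return {
        dc_name: marc_fields[tag]
        for tag, dc_name in MARC_TO_DC.items()
        if tag in marc_fields
    }


class DCDisseminator:
    """Exposes DublinCore behaviors over a MARC DataStream."""

    def __init__(self, marc_stream: Dict[str, str]) -> None:
        self._marc_stream = marc_stream

    def get_dc_record(self) -> Dict[str, str]:
        return dc_from_marc(self._marc_stream)

    def get_dc_field(self, name: str) -> Optional[str]:
        # Returns None when the record has no such element.
        return self.get_dc_record().get(name)


# DS1 with invented sample values; tag 300 has no crosswalk entry and is skipped.
ds1 = {"245": "A Sample Title", "100": "Author, Sample", "300": "200 p."}
dc = DCDisseminator(ds1)

print(dc.get_dc_field("Title"))
print(dc.get_dc_record())
```

The stored data stays in MARC; the Dublin Core view exists only as a dissemination computed on request.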

DigitalObject Interface Stability

[Figure: three layers of a DigitalObject: Structure, Mechanism (Servlet-1, Servlet-2, Servlet-3), Interface (Disseminator Type)]

Mechanisms can be updated or replaced as technology changes ...

… and the interface to the DigitalObject remains stable
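The separation of interface from mechanism can be illustrated with a small Python sketch. Everything here is hypothetical: the two mechanism classes and the form-feed page convention are invented for the example and merely stand in for Servlet-1 and Servlet-2 on the slide.

```python
from typing import List, Protocol


class BookMechanism(Protocol):
    """The contract a mechanism must satisfy; clients never see past it."""

    def get_page(self, n: int) -> str: ...


class FormFeedMechanism:
    """Older mechanism: pages are separated by form feeds in one text stream."""

    def __init__(self, text: str) -> None:
        self._pages = text.split("\f")

    def get_page(self, n: int) -> str:
        return self._pages[n - 1]


class PageListMechanism:
    """Replacement mechanism: pages are already stored individually."""

    def __init__(self, pages: List[str]) -> None:
        self._pages = list(pages)

    def get_page(self, n: int) -> str:
        return self._pages[n - 1]


class BookDisseminator:
    """Stable interface; the mechanism behind it can be swapped at any time."""

    def __init__(self, mechanism: BookMechanism) -> None:
        self.mechanism = mechanism

    def get_page(self, n: int) -> str:
        return self.mechanism.get_page(n)


book = BookDisseminator(FormFeedMechanism("page one\fpage two"))
before = book.get_page(2)

# Technology changes: replace the mechanism, leave the interface untouched.
book.mechanism = PageListMechanism(["page one", "page two"])
after = book.get_page(2)

print(before == after)
```

Client code written against `get_page` keeps working across the swap, which is the stability property the slide describes.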

DigitalObject Extensibility: Adding New Types

[Figure: Structure, Mechanism, and Interface layers of a DigitalObject, with a Book disseminator and a Photo Collection disseminator over the same data]

The same underlying data can be operated on in novel ways to create new disseminations not originally conceived of for the particular digital object.

Extensibility: a look under the hood

[Figure: handling of GetDissemination(GetDCRecord). The DC Disseminator in the DigitalObject (DataStreams: application/MARC, application/postscript) holds two references: URNDC, the DublinCore Disseminator Type signature (interface definition; method list: GetDCField, GetDCRecord), and URNDC1, the DublinCore mechanism (servlet). The DC servlet returns a DublinCore record.]

Proliferation of Disseminator Types

• We use FEDORA DigitalObjects to store Disseminator Signatures and Servlets.

• Type Registration (via name service): a Disseminator Type’s global identifier is the URN of a DigitalObject containing a Signature; a Servlet’s global identifier is the URN of a DigitalObject containing a Servlet.

Types can be globally recognizable and mechanisms can be shared.
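Type registration can be sketched as follows in Python. This is an illustrative model only: the in-memory `name_service` dictionary, the `urn:example:...` identifiers, and the dictionary layout of the stored objects are all assumptions, not the FEDORA data model.

```python
from typing import Callable, Dict

# Hypothetical name service: URN -> DigitalObject (modeled as a plain dict).
name_service: Dict[str, dict] = {}


def register(urn: str, digital_object: dict) -> None:
    """Bind a URN to a DigitalObject; URNs must be unique."""
    if urn in name_service:
        raise ValueError(f"URN already registered: {urn}")
    name_service[urn] = digital_object


# A DigitalObject containing a Signature (the Disseminator Type).
register("urn:example:dc-signature",
         {"signature": ["GetDCField", "GetDCRecord"]})

# A DigitalObject containing a Servlet (one mechanism for that type).
sample_record = {"Title": "Sample"}
servlet_methods: Dict[str, Callable[..., object]] = {
    "GetDCRecord": lambda: dict(sample_record),
    "GetDCField": lambda name: sample_record.get(name),
}
register("urn:example:dc-servlet-1", {"servlet": servlet_methods})


def disseminate(type_urn: str, servlet_urn: str, method: str, *args) -> object:
    """Resolve both URNs, check the method against the signature, then run it."""
    signature = name_service[type_urn]["signature"]
    if method not in signature:
        raise ValueError(f"{method} is not part of the type's signature")
    servlet = name_service[servlet_urn]["servlet"]
    return servlet[method](*args)


print(disseminate("urn:example:dc-signature", "urn:example:dc-servlet-1",
                  "GetDCRecord"))
```

Because a disseminator holds only the two URNs, any repository that can resolve them can recognize the type and reuse the mechanism.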

Interoperable Digital Objects and Repositories

[Figure: a RAP Client and a Name Service (Identifiers) connected to several Repositories; labels: Image Database System, Cornell Library Collections, Audio/Visual Archive]

Persistent Identifiers

• In FEDORA, use them for: Repositories, DigitalObjects, Disseminator Types, Servlet Mechanisms

• Benefits: ensure uniqueness; provide stability (location independence); promote global extensibility; promote interoperability

[Figure: Name Service (Identifiers)]

Identifiers - A Brief Primer

IETF Uniform Resource Name (URN) Spec

• Naming Scheme: the policies and procedures for creating and assigning URNs within a particular domain.

• Resolution System: a system that translates URNs into their location-specific identifiers (e.g., URLs).

• Registries: a set of global directories that provide information on which resolution systems can translate any particular URN.
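The three roles above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `ResolutionSystem` class, the `registry` dictionary, the `urn:example:...` name, and the `example.org` URLs are hypothetical, and the parser checks only the basic `urn:<NID>:<NSS>` shape rather than the full URN syntax.

```python
from typing import Dict, List, Tuple


def parse_urn(urn: str) -> Tuple[str, str]:
    """Split 'urn:<NID>:<NSS>' into (namespace identifier, namespace-specific string)."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn" or not parts[1] or not parts[2]:
        raise ValueError(f"not a URN: {urn!r}")
    return parts[1].lower(), parts[2]


class ResolutionSystem:
    """Translates URNs in one namespace into location-specific URLs."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[str]] = {}

    def bind(self, urn: str, urls: List[str]) -> None:
        self._locations[urn] = list(urls)

    def resolve(self, urn: str) -> List[str]:
        return list(self._locations[urn])


# Registry: which resolution system handles which naming scheme (by NID).
registry: Dict[str, ResolutionSystem] = {}


def resolve(urn: str) -> List[str]:
    nid, _ = parse_urn(urn)
    return registry[nid].resolve(urn)


example_resolver = ResolutionSystem()
registry["example"] = example_resolver

urn = "urn:example:book-1"
example_resolver.bind(urn, ["http://repo-a.example.org/book-1"])
first = resolve(urn)

# The object moves to another repository; the URN itself does not change.
example_resolver.bind(urn, ["http://repo-b.example.org/book-1"])
second = resolve(urn)

print(first, second)
```

Anything that stored the URN keeps working after the move, which is the location independence that persistent identifiers are meant to provide.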

Identifiers - Existing Solutions

• CNRI’s Handle System: good implementation of the URN specification; one Handle resolves to one or more locations; resolves to different data types (URL, IOR, …)

• OCLC’s PURL: persistent URLs, not really URNs; one PURL resolves to only one location (an HTTP redirect)

• Community-specific Initiatives: Digital Object Identifier (DOI) for publishers (Handle System + rights metadata); PubMedID for Medline; BibCode for astrophysics journals

FEDORA Status

• Reference Implementation: CORBA IDL defines open interfaces for the Repository Access Protocol (RAP); Java/CORBA repository and clients

• Collaborations: CNRI (core design and interoperability; dynamic complex disseminations); U of Virginia (web integration; complex disseminations, e.g., e-texts)

New Research

• DLI2 - Project Prism: security (associating enforceable policies and mechanisms with DigitalObjects); preservation (enabling long-term survival of DigitalObjects in a distributed environment)

• IDL - Harmony: aggregation and interaction of multiple, complex metadata sets in DigitalObjects; RDF and XML

PRISM Security Policy Enforcement

• Challenges: what is enforceable? distributed object environment; interoperability and extensibility

• Monitor all operations, generic and extended

• Enforce a wide array of policies: basic security violations; rights management; access control

[Figure: a DigitalObject with DataStreams application/MARC and text/x-acl, and a DC Disseminator (GetDCField, GetDCRecord)]
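One way to picture "monitor all operations, generic and extended" is a single checkpoint that every request passes through. The Python below is a speculative sketch: the slides do not describe the format of the text/x-acl stream or the enforcement mechanism, so the ACL layout (operation name mapped to allowed principals), the default-deny rule, and all class and principal names are assumptions.

```python
from typing import Callable, Dict, List, Set, Tuple


class PolicyViolation(Exception):
    """Raised when a principal requests an operation the policy does not allow."""


class MonitoredObject:
    """Routes every operation through one policy check and records the outcome."""

    def __init__(self,
                 operations: Dict[str, Callable[..., object]],
                 acl: Dict[str, Set[str]]) -> None:
        self._operations = operations
        self._acl = acl
        self.audit: List[Tuple[str, str, str]] = []

    def invoke(self, principal: str, operation: str, *args) -> object:
        # Default deny: an operation missing from the ACL is allowed to no one.
        if principal not in self._acl.get(operation, set()):
            self.audit.append((principal, operation, "denied"))
            raise PolicyViolation(f"{principal} may not call {operation}")
        self.audit.append((principal, operation, "allowed"))
        return self._operations[operation](*args)


record = {"Title": "Sample"}
monitored = MonitoredObject(
    operations={
        "ListDisseminatorTypes": lambda: ["DC"],  # generic operation
        "GetDCRecord": lambda: dict(record),      # extended operation
    },
    acl={
        "ListDisseminatorTypes": {"alice", "bob"},
        "GetDCRecord": {"alice"},
    },
)

print(monitored.invoke("alice", "GetDCRecord"))
try:
    monitored.invoke("bob", "GetDCRecord")
except PolicyViolation as exc:
    print("blocked:", exc)
```

Because generic and extended operations share the same entry point, a newly added disseminator type is covered by the policy check without extra work.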

PRISM: Preservation

[Figure: labels: Handles, Preservation Service, Fedora Repositories]

PRISM: Preservation Policy Enforcement

Preservation Service: monitors DigitalObject state and catches unacceptable or risky transitions.

[Figure: labels: Book, preservation metadata, PreserveP, DS1, DS2 (application/postscript), Preservation Surrogate Object]

References

• Payette, Blanchi, Lagoze, and Overly: Interoperability for Digital Objects and Repositories: The Cornell/CNRI Experiments, D-Lib Magazine, May 1999. http://www.dlib.org/dlib/may99/payette/05payette.html

• Payette and Lagoze: Flexible and Extensible Digital Object and Repository Architecture (FEDORA), ECDL 1998. http://www.cs.cornell.edu/payette/papers/ECDL98/FEDORA.html

• Lagoze and Payette: An Infrastructure for Open-Architecture Digital Libraries. http://ncstrl.cs.cornell.edu/Dienst/UI/1.0/Display/ncstrl.cornell/TR98-1690

• Daniel, Lagoze, and Payette: A Metadata Architecture for Digital Libraries, IEEE ADL 1998. http://www.cs.cornell.edu/lagoze/papers/ADL98/dar-adl.html

• FEDORA Home Page. http://www.cs.cornell.edu/NCSTRL/CDLRG/FEDORA.html

• Payette: Persistent Identifiers on the Digital Terrain, RLG DigiNews, April 1998, Volume 2, Number 2. http://www.rlg.org/preserv/diginews/diginews22.html