Flexible and Extensible Digital Object and Repository Architecture (FEDORA)
Sandra Payette, Cornell University
http://www.cs.cornell.edu/payette/presentations/fedora-gdz.ppt
Dritter Workshop der Digitalisierungszentren, October 5, 1999
Cornell Digital Library Research Group
• Computer Science Department: Bill Arms, Carl Lagoze, Sandy Payette, Naomi Dushay, David Fielding
• Affiliates: Anne Kenney (Cornell Library), Geri Gay (Human Computer Interaction), CNRI
CDLRG - Projects
• Prism (DLI2)
• Fedora
• Harmony (IDL)
• Dienst and NCSTRL
• Electronic Scholarly Publishing: D-Lib; Citation Linking (IDL)
Principles for Digital Library Architecture
• Open Architecture: functionality partitioned into a set of well-defined services; services accessible via a well-defined protocol
• Modularization: promotes interoperability; scalable to different clientele (library, informal web)
• Federation: enable aggregations into logical collections
• Distribution: of content and services; of administration and management
Component-Ware Digital Libraries
[Diagram: Repository Service (DigitalObjects), Collection Service, Index Service, Name Service (Identifiers), UI Gateway Service, Query Mediator Service, UI]
FEDORA
• Digital Object Model: container for aggregating any digital material; disseminations of complex types; global extensibility mechanisms; access management
• Repository Service: service layer for “contained” DigitalObjects; object lifecycle management; secure environment; open interface
FEDORA: Goals
• Distribution - of digital content and services
• Interface Stability - for digital objects
• Interoperability - for digital objects and repositories
• Extensibility - naturally evolving type system
• Flexibility - community-driven type development
• Security - rights management and access control
• Preservation - longevity of digital objects
FEDORA History
• Kahn/Wilensky
• Warwick Framework
• Distributed Active Relationships
• Cornell FEDORA (Lagoze, Payette)
• CNRI Repository (Arms, Blanchi, Overly)
• CNRI/FEDORA - Interoperability Project
• UVA - Complex disseminators, distribution
• Project Prism (DLI2)
[Diagram labels: DublinCore, Book, Diary, Future]
FEDORA DigitalObject Model
• Internal DataStream: MIME-typed stream of bytes
• Reference DataStream: service request upon external source
• Dissemination
• Disseminator Type: a set of behaviors that formally describes the functionality of any global or community-specific notion of content (e.g., getSection, getArticle, getChapter, getPage, getFrame, getLength)
• Disseminator: a generic component that associates a set of behaviors with a DigitalObject
• PrimitiveDisseminator: generic behaviors
• Extensible Type Disseminator: extended behaviors
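The object model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FEDORA reference implementation (which the slides describe as Java/CORBA): the class and method names, the `urn:example:book-1` identifier, and the form-feed page separator are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DataStream:
    """Internal DataStream: a MIME-typed stream of bytes."""
    stream_id: str
    mime_type: str
    content: bytes


@dataclass
class Disseminator:
    """Associates a named set of behaviors with a DigitalObject."""
    type_name: str
    # Behavior name -> function taking the object plus call arguments.
    behaviors: Dict[str, Callable[..., bytes]]


@dataclass
class DigitalObject:
    """Container that aggregates DataStreams and Disseminators."""
    urn: str
    datastreams: Dict[str, DataStream] = field(default_factory=dict)
    disseminators: Dict[str, Disseminator] = field(default_factory=dict)

    # Generic behaviors, available on every object (the primitive layer).
    def list_datastreams(self) -> List[str]:
        return sorted(self.datastreams)

    def get_datastream(self, stream_id: str) -> bytes:
        return self.datastreams[stream_id].content

    # Extended behaviors, supplied by an attached Disseminator.
    def disseminate(self, type_name: str, behavior: str, *args) -> bytes:
        return self.disseminators[type_name].behaviors[behavior](self, *args)


def get_page(obj: DigitalObject, n: int) -> bytes:
    # Assumption for illustration: pages in DS2 are separated by form feeds.
    return obj.get_datastream("DS2").split(b"\f")[n - 1]


book = DigitalObject(urn="urn:example:book-1")  # hypothetical identifier
book.datastreams["DS1"] = DataStream("DS1", "application/MARC", b"marc record bytes")
book.datastreams["DS2"] = DataStream("DS2", "application/postscript", b"page one\fpage two")
book.disseminators["Book"] = Disseminator("Book", {"getPage": get_page})
```

The split between `list_datastreams`/`get_datastream` and `disseminate` mirrors the generic versus extended behaviors on the slide.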
FEDORA DigitalObject
[Diagram: DigitalObjects with a PrimitiveDisseminator over MIME-typed DataStreams, e.g. DS1 (application/MARC), DS2 (application/postscript), and a set of image/gif streams]
Client communicates with generic requests
• ListDisseminatorTypes returns: Book, DublinCore
• GetMethods(Book) returns: GetChapter(n), GetPage(n), GetTOC()
• GetDissemination(Book.GetPage(1))
[Diagram: a DigitalObject with a Book Disseminator (GetChapter, GetTOC, GetPage) and a DublinCore Disseminator]
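The three generic requests on this slide amount to a small discovery-then-invoke pattern. A minimal sketch follows, assuming a plain in-memory table of behaviors; the Python method names, the sample pages, and the sample title are hypothetical stand-ins, and the real requests travel over the Repository Access Protocol rather than local calls.

```python
from typing import Callable, Dict, List

# Disseminator type name -> method name -> implementation.
Behaviors = Dict[str, Dict[str, Callable[..., str]]]


class DigitalObject:
    def __init__(self, behaviors: Behaviors) -> None:
        self._behaviors = behaviors

    def list_disseminator_types(self) -> List[str]:
        """Generic request: which types does this object support?"""
        return sorted(self._behaviors)

    def get_methods(self, type_name: str) -> List[str]:
        """Generic request: which methods does one type offer?"""
        return sorted(self._behaviors[type_name])

    def get_dissemination(self, type_name: str, method: str, *args) -> str:
        """Generic request: run one type-specific method."""
        return self._behaviors[type_name][method](*args)


pages = ["Title page", "Chapter 1 text"]  # made-up content
book = DigitalObject({
    "Book": {
        "GetPage": lambda n: pages[n - 1],
        "GetTOC": lambda: "1. Title page; 2. Chapter 1",
    },
    "DublinCore": {
        "GetDCField": lambda name: {"Title": "An Example Book"}[name],
    },
})

# A client needs no prior knowledge of the Book type:
for type_name in book.list_disseminator_types():
    print(type_name, book.get_methods(type_name))
print(book.get_dissemination("Book", "GetPage", 1))
```

The client only ever calls the three generic entry points; everything type-specific is discovered at run time.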
A Disseminator...
… references a Servlet
TYPE DESCRIPTION = DublinCore
SERVLET = cornell.dli2/DC-from-MARC
… to produce non-generic behaviors for the DigitalObject
• GetMethods(DC) returns: GetDCField(Title), GetDCRecord
[Diagram: a DigitalObject with a DC Disseminator (GetDCField, GetDCRecord) over DataStreams DS1 (application/MARC) and DS2 (application/postscript)]
DigitalObject Interface Stability
[Diagram: three layers (Mechanism, Structure, Interface); one Disseminator Type backed by interchangeable Servlet-1, Servlet-2, and Servlet-3]
Mechanisms can be updated or replaced as technology changes ...
… and the interface to the Digital Object remains stable
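The stability claim can be illustrated with a disseminator whose mechanism is swapped while its interface stays fixed. This is a sketch under assumptions: `servlet_1` and `servlet_2` are hypothetical stand-ins for the slide's Servlet-1 and Servlet-2, and the MARC record is reduced to a dict keyed by tag (245 for the title statement, 100 for the main personal name).

```python
from typing import Callable, Dict

Record = Dict[str, str]
Mechanism = Callable[[Record, str], str]

# Simplified MARC record: tag -> value.
MARC_RECORD: Record = {"245": "An Example Book", "100": "Doe, Jane"}


def servlet_1(record: Record, field_name: str) -> str:
    # Original mechanism: hard-coded lookups.
    if field_name == "Title":
        return record["245"]
    if field_name == "Creator":
        return record["100"]
    raise KeyError(field_name)


def servlet_2(record: Record, field_name: str) -> str:
    # Replacement mechanism: table-driven, same observable results.
    mapping = {"Title": "245", "Creator": "100"}
    return record[mapping[field_name]]


class DublinCoreDisseminator:
    """Stable interface; the mechanism behind it can be replaced."""

    def __init__(self, record: Record, mechanism: Mechanism) -> None:
        self._record = record
        self._mechanism = mechanism

    def replace_mechanism(self, mechanism: Mechanism) -> None:
        self._mechanism = mechanism

    def get_dc_field(self, field_name: str) -> str:
        return self._mechanism(self._record, field_name)


dc = DublinCoreDisseminator(MARC_RECORD, servlet_1)
before = dc.get_dc_field("Title")
dc.replace_mechanism(servlet_2)  # technology changes ...
after = dc.get_dc_field("Title")  # ... the client call does not
```

Clients that call `get_dc_field` are unaffected by the swap, which is the property the slide describes.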
DigitalObject Extensibility: Adding New Types
[Diagram: layers Mechanism, Structure, Interface; a Book disseminator and a Photo Collection disseminator over the same DigitalObject]
The same underlying data...
can be operated on in novel ways…
to create new disseminations not originally conceived of for the particular digital object.
Extensibility: a look under the hood
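[Diagram: a DigitalObject with DataStreams application/MARC and application/postscript and a DC Disseminator (Servlet = URNDC1). GetDissemination(GetDCRecord) invokes the DC servlet, which returns a DublinCore Record.]
• URNDC: DublinCore Disseminator Type, a DigitalObject whose Signature Disseminator exposes the DC Signature (interface definition) with method list GetDCField, GetDCRecord
• URNDC1: DublinCore Mechanism (Servlet), a DigitalObject whose Servlet Disseminator exposes the DC Mechanism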
Proliferation of Disseminator Types
• We use FEDORA DigitalObjects to store Disseminator Signatures and Servlets.
• Type Registration (via name service): a Disseminator Type’s global identifier is the URN of a DigitalObject containing a Signature; a Servlet’s global identifier is the URN of a DigitalObject containing a Servlet
Types can be globally recognizable and mechanisms can be shared.
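Type registration as described here can be sketched as a name service that maps URNs to objects holding either a Signature or a Servlet. Everything below is an assumption made for illustration: the class names, the `urn:example:` identifiers (modeled on the slide labels URNDC and URNDC1), and the toy servlet that reads an already-parsed record.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass(frozen=True)
class SignatureObject:
    """DigitalObject containing a Disseminator Type's Signature."""
    methods: Tuple[str, ...]


@dataclass(frozen=True)
class ServletObject:
    """DigitalObject containing a Servlet that implements one Signature."""
    signature_urn: str
    run: Callable[..., str]


Registered = Union[SignatureObject, ServletObject]


class NameService:
    """Maps URNs to registered objects; each URN is assigned only once."""

    def __init__(self) -> None:
        self._entries: Dict[str, Registered] = {}

    def register(self, urn: str, obj: Registered) -> None:
        if urn in self._entries:
            raise ValueError(f"URN already registered: {urn}")
        self._entries[urn] = obj

    def resolve(self, urn: str) -> Registered:
        return self._entries[urn]


def disseminate(names: NameService, servlet_urn: str, method: str,
                record: Dict[str, str], *args: str) -> str:
    servlet = names.resolve(servlet_urn)
    signature = names.resolve(servlet.signature_urn)
    # Only methods declared in the registered Signature may be called.
    if method not in signature.methods:
        raise ValueError(f"{method} is not part of the registered signature")
    return servlet.run(method, record, *args)


def dc_servlet(method: str, record: Dict[str, str], *args: str) -> str:
    # Toy mechanism: the record is already a dict of Dublin Core fields.
    if method == "GetDCField":
        return record[args[0]]
    return "; ".join(f"{k}={v}" for k, v in sorted(record.items()))


names = NameService()
names.register("urn:example:URNDC", SignatureObject(("GetDCField", "GetDCRecord")))
names.register("urn:example:URNDC1", ServletObject("urn:example:URNDC", dc_servlet))
```

Because both the type and the mechanism are found by URN, any repository that can reach the name service can recognize the type and reuse the servlet.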
Interoperable Digital Objects and Repositories
[Diagram: a RAP Client reaches several Repositories (Cornell Library Collections, Audio/Visual Archive, Image Database System) and resolves Identifiers through the Name Service]
Persistent Identifiers
• In FEDORA, use them for: Repositories; DigitalObjects; Disseminator Types; Servlet Mechanisms
• Benefits: ensure uniqueness; provide stability (location independence); promote global extensibility; promote interoperability
[Diagram: Identifiers resolved through the Name Service]
Identifiers - A Brief Primer
IETF Uniform Resource Name (URN) Spec
• Naming Scheme: the policies and procedures for creating and assigning URNs within a particular domain.
• Resolution System: a system that translates URNs into their location-specific identifiers (e.g., URLs).
• Registries: a set of global directories that provide information on which resolution systems can translate any particular URN.
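The three roles in this primer fit together as a two-step lookup: a registry picks the resolution system for a URN's namespace, and that system returns locations. The sketch below assumes the `urn:<NID>:<NSS>` syntax with a case-insensitive namespace identifier; the class names, the `example` namespace, and the URLs are hypothetical.

```python
from typing import Dict, List


def parse_namespace(urn: str) -> str:
    """Return the lower-cased namespace identifier of 'urn:<NID>:<NSS>'."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn" or not parts[1] or not parts[2]:
        raise ValueError(f"not a URN: {urn!r}")
    return parts[1].lower()


class ResolutionSystem:
    """Translates URNs into location-specific identifiers (here, URLs)."""

    def __init__(self, table: Dict[str, List[str]]) -> None:
        self._table = table

    def resolve(self, urn: str) -> List[str]:
        return list(self._table.get(urn, []))


class Registry:
    """Directory of which resolution system handles which namespace."""

    def __init__(self) -> None:
        self._by_namespace: Dict[str, ResolutionSystem] = {}

    def register(self, namespace_id: str, resolver: ResolutionSystem) -> None:
        self._by_namespace[namespace_id.lower()] = resolver

    def resolver_for(self, urn: str) -> ResolutionSystem:
        return self._by_namespace[parse_namespace(urn)]


def resolve_urn(registry: Registry, urn: str) -> List[str]:
    return registry.resolver_for(urn).resolve(urn)


registry = Registry()
registry.register("example", ResolutionSystem({
    "urn:example:book-42": [
        "http://repo-a.example.org/objects/42",
        "http://mirror.example.org/objects/42",
    ],
}))
locations = resolve_urn(registry, "urn:example:book-42")
```

One URN maps to two locations here, so the object can move or be mirrored without its name changing.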
Identifiers - Existing Solutions
• CNRI’s Handle System: good implementation of URN specification; 1 Handle >> one or more locations; resolves to different data types (URL, IOR, …)
• OCLC’s PURL: persistent URLs, not really URNs; 1 PURL >> only one location (an HTTP redirect)
• Community-specific Initiatives: Digital Object Identifier (DOI) - publishers (Handle System + Rights Metadata); PubMedID - Medline; BibCode - astrophysics journals
FEDORA Status
• Reference Implementation: CORBA IDL defines open interfaces for Repository Access Protocol (RAP); Java/CORBA repository and clients
• Collaborations: CNRI (core design and interoperability; complex disseminations (dynamic)); U of Virginia (web integration; complex disseminations (e.g., e-texts))
New Research
• DLI2 - Project Prism: security (associating enforceable policies and mechanisms with DigitalObjects); preservation (enable long-term survival of DigitalObjects in distributed environment)
• IDL - Harmony: aggregation and interaction of multiple, complex metadata sets in DigitalObjects; RDF and XML
PRISM Security Policy Enforcement
• Challenges: what is enforceable?; distributed object environment; interoperability and extensibility
• Monitor all operations, generic and extended
• Enforce a wide array of policies: basic security violations; rights management; access control
[Diagram: a DigitalObject with DataStreams application/MARC and text/x-acl and a DC Disseminator (GetDCField, GetDCRecord)]
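Monitoring every operation, generic and extended alike, can be sketched as a single choke point that checks an access-control list before dispatching. This is not the PRISM design, only an illustration: the `PolicyMonitor` class, the ACL layout, the principals, and the default-deny rule are assumptions.

```python
from typing import Callable, Dict, List, Set, Tuple


class AccessDenied(Exception):
    pass


class PolicyMonitor:
    """Routes every operation through one access check and audit log."""

    def __init__(self, operations: Dict[str, Callable[..., str]],
                 acl: Dict[str, Set[str]]) -> None:
        self._operations = operations
        self._acl = acl  # operation name -> principals allowed to call it
        self.audit_log: List[Tuple[str, str, bool]] = []

    def invoke(self, principal: str, operation: str, *args) -> str:
        # Default deny: an operation missing from the ACL is refused.
        allowed = principal in self._acl.get(operation, set())
        self.audit_log.append((principal, operation, allowed))
        if not allowed:
            raise AccessDenied(f"{principal} may not call {operation}")
        return self._operations[operation](*args)


dc_fields = {"Title": "An Example Book"}  # made-up metadata
monitor = PolicyMonitor(
    operations={
        # Generic operation (raw datastream access).
        "GetDataStream": lambda stream_id: f"<bytes of {stream_id}>",
        # Extended operation (Dublin Core disseminator).
        "GetDCField": lambda name: dc_fields[name],
    },
    acl={
        "GetDataStream": {"alice"},
        "GetDCField": {"alice", "guest"},
    },
)

title = monitor.invoke("guest", "GetDCField", "Title")
```

Here a guest may read derived metadata but not the raw DataStream, a policy that is enforceable only because both kinds of operation pass through the same monitor.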
PRISM: Preservation Policy Enforcement
[Diagram: a DigitalObject with Book and PreserveP disseminators over DataStreams DS1 (preservation metadata) and DS2 (application/postscript), alongside a Preservation Surrogate Object]
Preservation Service: monitors DigitalObject state and catches unacceptable or risky transitions
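One way to read "catches unacceptable or risky transitions" is as a comparison between an object's state before and after a change. The sketch below assumes state is just a map from DataStream id to MIME type, and the two rules (a removed stream, a changed format) are hypothetical examples of risky transitions, not rules taken from PRISM.

```python
from typing import Dict, List


def risky_transitions(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Compare two states (stream id -> MIME type) and list the problems."""
    problems: List[str] = []
    for stream_id, mime in sorted(old.items()):
        if stream_id not in new:
            problems.append(f"{stream_id}: datastream removed")
        elif new[stream_id] != mime:
            problems.append(
                f"{stream_id}: format changed from {mime} to {new[stream_id]}"
            )
    return problems


before = {"DS1": "application/MARC", "DS2": "application/postscript"}
after_delete = {"DS1": "application/MARC"}
after_migrate = {"DS1": "application/MARC", "DS2": "application/pdf"}

for candidate in (after_delete, after_migrate):
    for problem in risky_transitions(before, candidate):
        print("preservation warning:", problem)
```

A monitoring service could run such a check on every update and refuse or flag the change; adding a new DataStream passes silently under these rules.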
References
• Payette, Blanchi, Lagoze, and Overly: Interoperability for Digital Objects and Repositories: The Cornell/CNRI Experiments, D-Lib Magazine, May 1999. http://www.dlib.org/dlib/may99/payette/05payette.html
• Payette and Lagoze: Flexible and Extensible Digital Object and Repository Architecture (FEDORA), ECDL 1998. http://www.cs.cornell.edu/payette/papers/ECDL98/FEDORA.html
• Lagoze and Payette: An Infrastructure for Open-Architecture Digital Libraries. http://ncstrl.cs.cornell.edu/Dienst/UI/1.0/Display/ncstrl.cornell/TR98-1690
• Daniel, Lagoze, and Payette: A Metadata Architecture for Digital Libraries, IEEE ADL 1998. http://www.cs.cornell.edu/lagoze/papers/ADL98/dar-adl.html
• FEDORA Home Page. http://www.cs.cornell.edu/NCSTRL/CDLRG/FEDORA.html
• Payette: Persistent Identifiers on the Digital Terrain, RLG DigiNews, April 1998, Volume 2, Number 2. http://www.rlg.org/preserv/diginews/diginews22.html