Preservation Metadata: The PREMIS Experience
PRESERVATION METADATA: IMPLEMENTATION STRATEGIES
Preservation Metadata: The PREMIS Experience
Priscilla Caplan
Florida Center for Library Automation (FCLA)
The Mission
• Build on the first OCLC/RLG Working Group and A Metadata Framework to Support the Preservation of Digital Objects
• Define an implementable set of core preservation metadata elements
OAIS Information Model
Content Information
• Content Data Object
• Representation Information
Preservation Description Information
• Context Info
• Reference Info
• Provenance Info
• Fixity Info
What is Implementable?
• As rigorous as possible
• As much explanation as possible
• Implementation neutral -- “This is what you have to know”
• Values can be automatically supplied and processed -- no lengthy textual descriptions
What is Core?
What most working preservation repositories are likely to need to know in order to support the long-term Viability, Renderability, Understandability, Authenticity and Identity of archived objects.
(Maybe.)
Types of Objects
A bitstream is data within a file that has common attributes for preservation purposes and that cannot be transformed into a stand-alone file without the addition of file structure and/or reformatting.
A filestream is a contiguous bitstream within a file that can be transformed into a stand-alone file conforming to some file format without adding any additional information.
A file is a named, ordered sequence of bytes known by an operating system and accessible by applications. It contains zero or more bytes, has access permissions, and is an object for which file systems store statistics such as size and last modification date. A file has a file format.
A representation is the set of files, including structural metadata, needed to provide a complete and reasonable rendition of an intellectual entity.
From the Framework
NAME: Global identification
DEFINITION: Uniquely identifies the Content Data Object and its associated metadata (Archival Information Package) to systems external to the Archive in which it is stored
EXAMPLE: ISBN, persistent URL
Sub-elements:
  NAME: Value
  DEFINITION: Value of the Global Identification used to identify the AIP
  EXAMPLE: PURL: http://purl.oclc.org/file.pdf
  NAME: Construction method
  DEFINITION: Description of the means by which the Global Identification is created and assigned
  EXAMPLE: Archival Information Package is registered with the OCLC PURL service upon ingest into the Archive.
  NAME: Responsible agency
  DEFINITION: Entity responsible for assigning and maintaining the Global Identification
  EXAMPLE: OCLC PURL Service
Semantic Units Pertaining to OBJECTS
• objectIdentifier
• contentLocation
• originalName
• preservationLevel
• objectCharacteristics
• environment
objectCharacteristics
• compositionLevel
• fixity
• size
• format
• inhibitors
• significantProperties
• creatingApplication
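Two of these units, size and fixity, are the kind of value that can be supplied automatically from the file itself. The following is a minimal sketch of that idea, not a PREMIS-defined procedure: the function name `describe_file`, the dictionary layout, and the choice of SHA-256 as the digest algorithm are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def describe_file(path, algorithm="sha256", chunk_size=65536):
    """Compute size and a fixity digest from the bytes of a file.

    The key names and the default algorithm are illustrative assumptions.
    """
    digest = hashlib.new(algorithm)
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return {
        "size": size,
        "fixity": {"algorithm": algorithm, "digest": digest.hexdigest()},
    }


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than a real archived object.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
    try:
        print(describe_file(tmp.name))
    finally:
        os.remove(tmp.name)
```

Recomputing the digest later and comparing it with the stored value is one way a repository could check that the bytes have not changed.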
Composition Level
[Figure: nested objects labeled .XML, Foo.PDF, .XML, Foo.tar, Foo.tgz]
Format
• What types of objects have format?
• Is there a usable authority list of formats?
• Is there a difference between a format and a profile?
• What’s a format anyway?
• What if there are format registries?
The evolution of a semantic unit
• Format
  – formatValue
  – formatScheme
• Format
  – formatName
  – formatScheme
Evolution (2)
• formatName
  – formatNameValue
  – formatVersion
• formatRegistry
  – formatRegistryEntry
  – formatRegistryKey
Evolution (3)
• formatName
  – formatNameValue
  – formatVersion
• formatRegistry
  – formatRegistryIdentifier
    • formatRegistryIdentifierScheme
    • formatRegistryIdentifierValue
  – formatRegistryName
  – formatRegistryEntry
Evolution (4)
• Format (Required, not repeatable)
  – formatName (Optional, repeatable)
    • formatNameValue
    • formatNameVersion
    • formatNameRole
  – formatRegistry (Optional, repeatable)
    • formatRegistryIdentifier
    • formatRegistryName
    • formatRegistryEntry
    • formatRegistryRole
PREMIS Format Entry
• Format (Required, Not Repeatable)
  – formatName (Optional, Not repeatable)
    • formatNameValue
    • formatVersion
  – formatRegistry (Optional, Repeatable)
    • formatRegistryName
    • formatRegistryKey
    • formatRegistryRole
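One way to read the cardinalities in this entry is as a small data structure: a single Format holding at most one formatName and any number of formatRegistry entries. The sketch below is an illustration only. The Python class and field names are hypothetical renderings of the semantic units, the slide does not say which sub-units are mandatory (so that is an assumption here), and the sample values, including the registry name and key, are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FormatName:
    format_name_value: str
    # Assumed optional: the slide does not state sub-unit obligation.
    format_version: Optional[str] = None


@dataclass
class FormatRegistry:
    format_registry_name: str
    format_registry_key: str
    format_registry_role: Optional[str] = None


@dataclass
class Format:
    # Optional, not repeatable: at most one formatName.
    format_name: Optional[FormatName] = None
    # Optional, repeatable: zero or more formatRegistry entries.
    format_registries: List[FormatRegistry] = field(default_factory=list)

    def is_empty(self) -> bool:
        """True when neither a name nor any registry entry is recorded."""
        return self.format_name is None and not self.format_registries


if __name__ == "__main__":
    fmt = Format(format_name=FormatName("PDF", "1.4"))
    # Hypothetical registry name and key, for illustration only.
    fmt.format_registries.append(
        FormatRegistry("example-registry", "example-key-0001", "specification")
    )
    print(fmt)
```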
Environment
• environmentCharacteristic
• environmentPurpose
• environmentDescription
• dependencies
• software
  – swName
  – swVersion
  – swType
  – additionalReq
  – swDependency
• hardware
  – hwName
  – hwVersion
  – hwType
  – additionalReq
Events
• eventIdentifier
  – eventIdentifierScheme
  – eventIdentifierValue
• eventType
• eventOutcome
• eventOutcomeDetail
• eventDetail
• eventDateTime
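The event units listed above lend themselves to machine-generated records. The following is a rough sketch under stated assumptions: the `Event` class and `record_event` helper are hypothetical, a UUID is assumed as the identifier scheme, an ISO 8601 UTC timestamp is assumed for eventDateTime, and the event type and outcome strings are invented examples rather than controlled vocabulary.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Event:
    event_identifier_scheme: str
    event_identifier_value: str
    event_type: str
    event_outcome: str
    event_date_time: str
    event_outcome_detail: Optional[str] = None
    event_detail: Optional[str] = None


def record_event(event_type, outcome, outcome_detail=None, detail=None):
    """Build an Event with an automatically supplied identifier and timestamp."""
    return Event(
        event_identifier_scheme="UUID",  # assumed scheme, for illustration
        event_identifier_value=str(uuid.uuid4()),
        event_type=event_type,
        event_outcome=outcome,
        event_date_time=datetime.now(timezone.utc).isoformat(),
        event_outcome_detail=outcome_detail,
        event_detail=detail,
    )


if __name__ == "__main__":
    # The type and outcome values here are made-up examples.
    print(record_event("fixity check", "success", detail="digest recomputed"))
```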
Agents
• agentIdentifier
  – agentIdentifierScheme
  – agentIdentifierValue
• agentName
Relationships
• relationshipType
• relatedIdentifier
  – relatedIdentifierType
  – relatedIdentifierValue
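A relationship, as outlined above, pairs a type with the identifier of the related object. The sketch below shows one possible way to hold such records and look them up. The class name, the helper `related_values`, and every relationship type and identifier value are hypothetical, since the slide does not give a vocabulary for them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Relationship:
    relationship_type: str
    related_identifier_type: str
    related_identifier_value: str


def related_values(relationships: List[Relationship], relationship_type: str) -> List[str]:
    """Return the identifier values of all relationships of the given type."""
    return [
        rel.related_identifier_value
        for rel in relationships
        if rel.relationship_type == relationship_type
    ]


if __name__ == "__main__":
    # Invented relationship types and local identifiers, for illustration only.
    relationships = [
        Relationship("derived from", "local", "obj-001"),
        Relationship("part of", "local", "rep-010"),
        Relationship("derived from", "local", "obj-002"),
    ]
    print(related_values(relationships, "derived from"))
```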
Metadata Best Practices
• Have a rigorous data model and relate metadata clearly to appropriate objects
• Store metadata in database and also with content data object
• Use METS, Z39.87
• Store complete metadata for all versions of objects