chapter 11 - linking data and 'metadata'-packaging

Upload: foveros-foveridis

Post on 05-Apr-2018


  • 7/31/2019 Chapter 11 - Linking Data and 'Metadata'-Packaging

    1/6

    Chapter 11

    Linking Data and Metadata: Packaging

    11.1 Information Packaging Overview

    OAIS describes packaging at a high level, as outlined in Sect. 6.3.4, where it is
    stressed that the package is a logical structure, i.e. it does not have to be a single
    file. Despite this emphasis on the logical structure, it can be useful to package
    digital objects (let's say files) together in a single file, for example a ZIP [142]
    file. However, if one simply did that then there would be no indication of the
    relationship between the files, so there must be some mechanism for specifying the
    relationship. In any practical system one needs to encode the links somehow.
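
    As a minimal sketch of this point, the following Python example puts two files into a
    ZIP together with a small manifest that records how they relate. The manifest name
    `manifest.json`, its JSON layout and the relation name `describes` are assumptions
    made for illustration only; they are not defined by ZIP or by OAIS.

```python
import io
import json
import zipfile


def build_package(files, relationships):
    """Return ZIP bytes holding the files plus a manifest of their relationships.

    files: dict mapping member name -> bytes
    relationships: list of (subject, relation, object) tuples over member names
    """
    manifest = {
        "members": sorted(files),
        "relationships": [
            {"subject": s, "relation": r, "object": o} for s, r, o in relationships
        ],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in files.items():
            archive.writestr(name, payload)
        # Hypothetical convention: the relationships live in "manifest.json".
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


def read_relationships(package_bytes):
    """Read the relationship triples back out of the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        manifest = json.loads(archive.read("manifest.json"))
    return [
        (r["subject"], r["relation"], r["object"])
        for r in manifest["relationships"]
    ]


# Illustrative content only: a data file and a note describing its columns.
package = build_package(
    {
        "data.csv": b"t,value\n0,1.5\n",
        "data-format.txt": b"Columns: t (s), value (V)\n",
    },
    [("data-format.txt", "describes", "data.csv")],
)
```

    Without the manifest the ZIP would still hold both files, but nothing in it would say
    that one of them is needed to interpret the other.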

    If it is not practical to put everything into a single file then an alternative would
    be to point to one or more of the digital objects using some kind of identifier system.
    As in the single file case, one would need to specify the relationships somehow.

    There are many ways of implementing this kind of packaging and each has its own
    mechanism for specifying such relationships. Regarding the package as a digital
    object, another way of thinking about this is that one needs the appropriate
    Representation Information in order to use the package; however, it seems useful to
    have some special terminology in this case.

    One can imagine that the mechanisms for specifying the relationships between the
    components of the package could include:

    - Naming conventions for the components.
    - Reliance on specific software to extract the components.
    - Indirection, for example by means of an XML schema which provides the semantics
      to distinguish different components. Of course the schema would need its own
      Representation Information, in particular for the semantics associated with the
      element names.
    - General relationship techniques such as RDF. Again, additional Representation
      Information would be needed, because the meaning of the tags would have to be
      specified separately.
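
    The first mechanism in the list, a naming convention, can be sketched in a few lines
    of Python. The `.repinfo` suffix used here is an invented convention, not one taken
    from OAIS or from any of the packaging standards discussed in this chapter.

```python
def pair_by_convention(names, suffix=".repinfo"):
    """Map each data file to its companion file, or None if it has none.

    Hypothetical convention: the companion of "X" is named "X" + suffix.
    """
    present = set(names)
    pairs = {}
    for name in sorted(present):
        if name.endswith(suffix):
            continue  # companions appear as values, never as keys
        companion = name + suffix
        pairs[name] = companion if companion in present else None
    return pairs


pairs = pair_by_convention(["image.fits", "image.fits.repinfo", "notes.txt"])
```

    The weakness of this approach is visible in the code: the relationship exists only in
    the rule itself, so the rule must be documented and preserved alongside the files.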

    A number of techniques have been proposed, including IMS content packaging [143],
    SOAP [144], METS [145] and XFDU [146].

    D. Giaretta, Advanced Digital Preservation, DOI 10.1007/978-3-642-16809-3_11, © Springer-Verlag Berlin Heidelberg 2011


    Of these, only XFDU has close connections to OAIS and in particular full support for
    all types of Representation Information. Therefore we use XFDU in our examples, but
    this should not be taken to mean that it is the only way.

    OAIS describes several package variants, but only the Archival Information Package
    (AIP) has mandatory contents, and we look in detail at the AIP next.

    11.2 Archival Information Packaging

    The AIP is a critical element in OAIS. A distinction is made between an Archival
    Information Unit (AIU) and an Archival Information Collection (AIC), both of which
    are special types of AIP (Fig. 11.1).

    There is an analogy here with what were termed Simple Objects and Composite Objects
    in Sect. 4.1.

    OAIS defines:

    - Archival Information Collection (AIC): An Archival Information Package whose
      Content Information is an aggregation of other Archival Information Packages.
    - Archival Information Unit (AIU): An Archival Information Package where the
      archive chooses not to break down the Content Information into other Archival
      Information Packages. An AIU can consist of multiple digital objects (e.g.,
      multiple files).

    This shows that an AIC is a Composite Object, and the AIU could in some ways be
    described as a Simple Object, although clearly it has components.

    For further details of the useful terminology associated with AICs the reader should
    consult OAIS.
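
    The AIU/AIC distinction can be illustrated with a small data-structure sketch. The
    class and field names below are assumptions chosen for this example; OAIS defines the
    concepts, not a programming interface, and a real AIP carries far more than an
    identifier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AIP:
    """Base Archival Information Package; only an identifier is modelled here."""
    identifier: str


@dataclass
class AIU(AIP):
    """Content Information is not broken down into further AIPs."""
    digital_objects: List[str] = field(default_factory=list)  # e.g. file names


@dataclass
class AIC(AIP):
    """Content Information is an aggregation of other AIPs."""
    members: List[AIP] = field(default_factory=list)


def list_units(package):
    """Return every AIU reachable from a package, depth first."""
    if isinstance(package, AIU):
        return [package]
    if isinstance(package, AIC):
        units = []
        for member in package.members:
            units.extend(list_units(member))
        return units
    return []


# An AIU may hold several files; an AIC may nest other AICs.
scan = AIU("aiu-1", ["page1.tif", "page2.tif"])
audio = AIU("aiu-2", ["talk.wav"])
collection = AIC("aic-1", [scan, AIC("aic-2", [audio])])
```

    Here `scan` is a single unit even though it has two files, which mirrors the remark
    that an AIU resembles a Simple Object while still having components.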

    Fig. 11.1 Specialisations of AIP


    11.3 XFDU

    Much of the packaging described in Part II uses XFDU and, although this is not the
    only possible packaging technique, it is convenient to provide a little more detail
    here.

    XFDU has been standardized and well documented by CCSDS, with the idea of supporting
    OAIS terminology from its conception. One key feature is the flexibility it allows in
    terms of which things are pointed to and which are physically inside the XFDU
    encoding.

    It has been used in an operational environment by the European Space Agency (ESA) in
    the form of the Standard Archive Format for Europe (SAFE) [147], a packaging format
    fully compatible with XFDU. Developing XFDU solutions can be facilitated through
    existing open-source Java toolkits and APIs, which have been created by ESA and NASA,
    allowing the construction, editing and analysis of standardized XFDU Information
    Packages.

    The Manifest document shown in Fig. 11.2 contains the information about the
    relationships between the information that is packaged together. XFDU uses an XML
    schema to describe this manifest file, which is split into five sections
    (packageHeader, informationPackageMap, metadataSection, dataObjectSection and
    behaviorSection). The packageHeader documents information about the package itself,
    its versioning, its position in a sequence or volume, and PDI about its existence.

    The dataObjectSection and metadataSection are used to relate the digital information
    to be preserved to its RepInfo or PDI, respectively. Both data objects and metadata
    objects can be either connected by reference or encoded within the manifest itself
    (Fig. 11.3). Each object is assigned an XML identifier, which is used to link objects
    between the two sections. Objects in both sections can be given built-in
    classifications or associated with user-defined classification schemes.
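
    The linkage between the two sections can be sketched with Python's standard
    `xml.etree.ElementTree`. This is a deliberately simplified, un-namespaced manifest
    that has not been validated against the XFDU schema: the section and object element
    names come from the text above, while the `reference` and `inline` child elements,
    the attribute names, the pairing of categories with classifications, and the file
    names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_manifest():
    """Build a toy manifest with one data object and two metadata objects."""
    root = ET.Element("xfdu")
    ET.SubElement(root, "packageHeader", {"ID": "header"})

    metadata = ET.SubElement(root, "metadataSection")
    # Metadata connected by reference: the manifest stores only a pointer.
    rep = ET.SubElement(
        metadata,
        "metadataObject",
        {"ID": "rep1", "category": "REP", "classification": "SYNTAX"},
    )
    ET.SubElement(rep, "reference", {"href": "file:format-description.xml"})
    # Metadata encoded within the manifest itself (placeholder text).
    pdi = ET.SubElement(
        metadata,
        "metadataObject",
        {"ID": "pdi1", "category": "PDI", "classification": "PROVENANCE"},
    )
    ET.SubElement(pdi, "inline").text = "placeholder provenance note"

    data = ET.SubElement(root, "dataObjectSection")
    # The data object names its RepInfo through the metadata object's identifier.
    obj = ET.SubElement(data, "dataObject", {"ID": "data1", "repID": "rep1"})
    ET.SubElement(obj, "reference", {"href": "file:measurements.dat"})
    return root


def rep_info_for(root, data_id):
    """Follow the identifier link from a data object to its metadata object."""
    by_id = {m.get("ID"): m for m in root.iter("metadataObject")}
    for obj in root.iter("dataObject"):
        if obj.get("ID") == data_id:
            return by_id.get(obj.get("repID"))
    return None


manifest = build_manifest()
linked = rep_info_for(manifest, "data1")
```

    The lookup in `rep_info_for` is the whole mechanism: the data object carries nothing
    but an identifier, and the identifier is resolved in the other section.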

    Fig. 11.2 Conceptual view of an XFDU


    Fig. 11.3 XFDU manifest logical view. [Diagram not reproduced. Labels recoverable
    from it: the xfdu root; packageHeader; informationPackageMap (structure map) with
    ContentUnit; dataObjectSection with dataObject; metadataSection with metadataObject;
    behaviorSection with behaviorObject; metadata categories REP, PDI, DMD, OTHER and
    ANY; classes DED, SYNTAX, CONTEXT, PROVENANCE, REFERENCE, FIXITY, DESCRIPTION and
    OTHER; objects carry XML IDs and URIs.]

    The informationPackageMap records information about content units, which are used to
    associate data in the dataObjectSection to metadata in the metadataSection. The
    association is done via XML identifiers, and maps to the OAIS concept of Content
    Information Object, the combination of a digital object and its RepInfo.
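
    A content unit can be thought of as a small record of identifiers that must resolve
    in the other two sections. The sketch below parses a toy manifest and resolves each
    content unit. The `dataRefs` and `metadataRefs` attribute names are assumptions made
    for this example rather than names taken from the XFDU schema, and the manifest is
    un-namespaced and unvalidated.

```python
import xml.etree.ElementTree as ET

# Toy manifest; attribute names on ContentUnit are hypothetical.
MANIFEST = """
<xfdu>
  <informationPackageMap>
    <ContentUnit ID="cu1" dataRefs="data1" metadataRefs="rep1 pdi1"/>
  </informationPackageMap>
  <metadataSection>
    <metadataObject ID="rep1" category="REP"/>
    <metadataObject ID="pdi1" category="PDI"/>
  </metadataSection>
  <dataObjectSection>
    <dataObject ID="data1"/>
  </dataObjectSection>
</xfdu>
"""


def resolve_content_units(xml_text):
    """Map each content unit to its data IDs and its metadata IDs with categories."""
    root = ET.fromstring(xml_text.strip())
    known_data = {d.get("ID") for d in root.iter("dataObject")}
    categories = {
        m.get("ID"): m.get("category") for m in root.iter("metadataObject")
    }
    resolved = {}
    for unit in root.iter("ContentUnit"):
        data_ids = unit.get("dataRefs", "").split()
        meta_ids = unit.get("metadataRefs", "").split()
        # Every identifier must point at an object that exists in the manifest.
        dangling = [i for i in data_ids if i not in known_data]
        dangling += [i for i in meta_ids if i not in categories]
        if dangling:
            raise ValueError(
                f"unresolved identifiers in {unit.get('ID')}: {dangling}"
            )
        resolved[unit.get("ID")] = {
            "data": data_ids,
            "metadata": {i: categories[i] for i in meta_ids},
        }
    return resolved


units = resolve_content_units(MANIFEST)
```

    Rejecting dangling identifiers matters for preservation: a content unit whose RepInfo
    pointer cannot be resolved is a digital object that may no longer be interpretable.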

    A diagram of the full XML schema of the XFDU is shown in Fig. 11.4. This schema keeps
    AIPs consistent and standard while allowing a flexible and adaptable implementation.
    By extending the XFDU schema to provide domain-specific AIPs it is possible to allow
    the inclusion of additional information while maintaining the standardization and
    consistency that are two of the main advantages of using XFDU for preservation. ESA
    has demonstrated this by extending the XFDU schema into SAFE, which includes
    spacecraft mission-specific information embedded in the XFDU manifest.

    A toolkit for creating and reading XFDUs is available from the XFDU web site [148]
    and the GAEL XFDU web site [149].

    11.3.1 XFDU and TDO

    Because both embody packaging techniques, the XFDU structure does implement many,
    perhaps all, of the concepts of the Trustworthy Digital Object (TDO) [8]. However,
    the latter seems to rely on emulation (see Sect. 7.9), and in particular the UVC (see
    Sect. 7.9.4.3), as its ultimate preservation technique.


    Fig. 11.4 Full XFDU schema diagram


    Emulation has its place in preservation but, as we point out in Sect. 7.9, it is
    limiting, not least because in essence one is limited to what has been possible with
    the digital object in the past. Moreover, especially because the semantics of the
    digital object are not made explicit in the TDO, even if one could link the emulation
    to modern applications, one would be limited in what new things could be done.

    The XFDU is not tied in any way to emulation, although an emulator can be one part of
    the Representation Information in the package. Therefore it is fair to say that the
    XFDU is a superset of the TDO technical concept.

    11.4 Summary

    Packaging is an important requirement with many possible solutions. This chapter has
    tried to elucidate the key considerations and describe in some detail one possible
    packaging mechanism.
