chapter 11 - linking data and 'metadata'-packaging
7/31/2019 Chapter 11 - Linking Data and 'Metadata'-Packaging
Chapter 11
Linking Data and Metadata: Packaging
11.1 Information Packaging Overview
OAIS describes packaging at a high level, as outlined in Sect. 6.3.4, where it is
stressed that the package is a logical structure, i.e. does not have to be a single file.
Although the package is a logical structure, it can be useful to package digital objects,
let's say files, together in a single file, for example a ZIP [142] file. However, if
one simply did that there would be no indication of the relationships between
the files, so there must be some mechanism for specifying those relationships. In any
practical system one needs to encode the links somehow.
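As a minimal sketch of this idea, the following Python fragment places files in a ZIP archive together with a small manifest that records how they relate. The manifest name (manifest.json), its JSON layout, the relation name "describes" and the example file names are all assumptions made for illustration; none of them comes from OAIS or from any packaging standard.

```python
import io
import json
import zipfile


def build_package(files, relationships):
    """Return the bytes of a ZIP holding the files plus a manifest.

    files: dict mapping member name -> bytes
    relationships: list of (subject, relation, object) tuples over member names
    """
    manifest = {
        "members": sorted(files),
        "relationships": [
            {"subject": s, "relation": r, "object": o}
            for s, r, o in relationships
        ],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in files.items():
            archive.writestr(name, payload)
        # Hypothetical convention: the links live in a member called manifest.json
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


def read_relationships(package_bytes):
    """Recover the (subject, relation, object) links from a package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        manifest = json.loads(archive.read("manifest.json"))
    return [
        (r["subject"], r["relation"], r["object"])
        for r in manifest["relationships"]
    ]


package = build_package(
    {
        "data/obs.csv": b"t,value\n0,1.5\n",
        "repinfo/obs_format.txt": b"Columns: t (s), value (V)\n",
    },
    [("repinfo/obs_format.txt", "describes", "data/obs.csv")],
)
links = read_relationships(package)
```

Without the manifest the archive would still contain both files, but nothing would say that one of them explains how to read the other.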
If it is not practical to put everything into a single file then an alternative would
be to point to one or more of the digital objects using some kind of identifier system.
As in the single-file case, one would need to specify the relationships somehow.
There are many ways of implementing this kind of packaging and each has its
own mechanism for specifying such relationships. Regarding the package as a digital
object, another way of thinking about this is that one needs the appropriate
Representation Information in order to use the package; however, it seems useful
to have some special terminology in this case.
One can imagine that these mechanisms for specifying the relationships between
the components of the package could include:
• Naming conventions for the components
• Reliance on specific software to extract the components
• Indirection, for example by means of an XML schema which provides the semantics
  to distinguish different components. Of course the schema would need its
  own Representation Information, and in particular the semantics associated with
  the element names.
• General relationship techniques such as RDF; again there would need to be additional
  Representation Information, because the meaning of the tags would have to be specified
  separately.
There are a number of techniques which have been proposed including
IMS content packaging [143], SOAP [144], METS [145] and XFDU [146].
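To make the first of these mechanisms concrete, the sketch below groups file names into components purely by a naming convention. The suffixes (.repinfo.xml, .pdi.xml) and the role names are hypothetical choices for illustration only; the point is that the convention itself carries the relationships, and so would have to be documented as Representation Information for the package.

```python
from collections import defaultdict

# Hypothetical convention: a suffix on the file name states the role
# that the file plays for the data file sharing the same stem.
ROLE_SUFFIXES = {
    ".repinfo.xml": "representation_information",
    ".pdi.xml": "pdi",
}


def group_by_convention(names):
    """Group file names into {stem: {role: name}} using the suffix convention."""
    groups = defaultdict(dict)
    for name in names:
        for suffix, role in ROLE_SUFFIXES.items():
            if name.endswith(suffix):
                groups[name[: -len(suffix)]][role] = name
                break
        else:
            # Anything without a known suffix is treated as the data itself.
            stem = name.rsplit(".", 1)[0]
            groups[stem]["data"] = name
    return dict(groups)


components = group_by_convention(
    ["scan01.dat", "scan01.repinfo.xml", "scan01.pdi.xml", "scan02.dat"]
)
```

Here scan02.dat ends up with no Representation Information at all, which a naming convention alone cannot flag as an error unless that rule is also written down.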
D. Giaretta, Advanced Digital Preservation, DOI 10.1007/978-3-642-16809-3_11, © Springer-Verlag Berlin Heidelberg 2011
Of these only XFDU has close connections to OAIS and in particular full
support for all types of Representation Information. Therefore we use XFDU in
our examples, but this should not be taken to mean this is the only way.
OAIS describes several package variants, but only the Archival Information
Package (AIP) has mandatory contents and we look in detail at the AIP next.
11.2 Archival Information Packaging
The AIP is a critical element in OAIS. A distinction is made between
an Archival Information Unit (AIU) and an Archival Information Collection (AIC),
both of which are special types of AIPs (Fig. 11.1).
There is an analogy here with what were termed in Sect. 4.1 Simple Objects and
Composite Objects.
OAIS defines:
Archival Information Collection (AIC): An Archival Information Package
whose Content Information is an aggregation of other Archival Information
Packages.
Archival Information Unit (AIU): An Archival Information Package where
the archive chooses not to break down the Content Information into other
Archival Information Packages. An AIU can consist of multiple digital objects (e.g., multiple files).
This shows that an AIC is a Composite Object, and the AIU could in some ways be
described as a Simple Object although clearly it has components.
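The AIU/AIC distinction can be sketched as a small composite data structure. The class and field names below are assumptions chosen for illustration, and the sketch deliberately leaves out everything else an AIP must carry (Representation Information, PDI and so on); it shows only that an AIC aggregates other AIPs while an AIU is not broken down further.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AIP:
    """Common base: every package has an identifier (hypothetical field)."""
    identifier: str


@dataclass
class AIU(AIP):
    """Not broken down into further AIPs, but may hold several digital objects."""
    digital_objects: List[str] = field(default_factory=list)


@dataclass
class AIC(AIP):
    """Content Information is an aggregation of other AIPs (AIUs or AICs)."""
    members: List[AIP] = field(default_factory=list)


def count_units(package: AIP) -> int:
    """Count the AIUs reachable from a package, descending through AICs."""
    if isinstance(package, AIC):
        return sum(count_units(member) for member in package.members)
    if isinstance(package, AIU):
        return 1
    raise TypeError("expected an AIU or an AIC")


collection = AIC(
    "mission",
    [
        AIU("orbit-1", ["a.dat", "a.xml"]),
        AIC("calibration", [AIU("cal-1", ["c.dat"])]),
    ],
)
unit_count = count_units(collection)
```

The AIU "orbit-1" holds two files yet still counts as a single unit, which is the sense in which an AIU resembles a Simple Object despite having components.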
For further details of the useful terminology associated with AICs the reader
should consult OAIS.
Fig. 11.1 Specialisations of AIP
11.3 XFDU
Much of the packaging described in Part II uses the XFDU and, although this is not
the only possible packaging technique, it is convenient to provide a little more
detail here.
XFDU has been standardized and well documented by CCSDS with the idea of
supporting OAIS terminology from its conception. One key feature is the flexibility
it allows in terms of which things are pointed to and which are physically inside the
XFDU encoding.
It has been used in an operational environment by the European Space Agency
(ESA) in the form of the Standard Archive Format for Europe (SAFE) [147], a
packaging format fully compatible with XFDU. Developing XFDU solutions can
be facilitated through existing open-source Java toolkits and APIs, which have
been created by ESA and NASA, allowing the construction, editing and analysis
of standardized XFDU Information Packages.
The Manifest document shown in Fig. 11.2 contains the information about the
relationships between the information that is packaged together. XFDU uses an
XML schema to describe this manifest file which is split into five sections. The
packageHeader documents information about the package itself, its versioning,
its position in a sequence or volume, and PDI about its existence.
The dataObjectSection and metadataSection are used to relate the digital
information to be preserved to its RepInfo or PDI, respectively. Both data objects
and metadata objects can be either connected by reference or encoded within the
manifest itself (Fig. 11.3). Each object is assigned an XML identifier, which is used
to link objects between the two sections. Objects in both sections can be given built-in
classifications or associated with user-defined classification schemes.
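The linking of the two sections through XML identifiers can be illustrated with a deliberately simplified manifest. The section and object element names follow the text above, but the reference and embedded child elements, the metadataRefs and href attributes, the absence of namespaces and the example values are assumptions for illustration; the result is not a schema-valid XFDU manifest.

```python
import xml.etree.ElementTree as ET


def build_manifest():
    """Build a simplified, non-schema-valid manifest with two linked sections."""
    root = ET.Element("xfdu")
    ET.SubElement(root, "packageHeader", {"ID": "header"})

    metadata = ET.SubElement(root, "metadataSection")
    # RepInfo connected by reference (hypothetical 'reference' child element).
    rep = ET.SubElement(
        metadata, "metadataObject",
        {"ID": "rep1", "category": "REP", "classification": "SYNTAX"},
    )
    ET.SubElement(rep, "reference", {"href": "repinfo/obs_format.txt"})
    # PDI encoded within the manifest itself (hypothetical 'embedded' element).
    pdi = ET.SubElement(
        metadata, "metadataObject",
        {"ID": "pdi1", "category": "PDI", "classification": "PROVENANCE"},
    )
    ET.SubElement(pdi, "embedded").text = "Recorded by instrument X"

    data = ET.SubElement(root, "dataObjectSection")
    # 'metadataRefs' is an assumed attribute holding space-separated XML IDs.
    obj = ET.SubElement(
        data, "dataObject", {"ID": "data1", "metadataRefs": "rep1 pdi1"}
    )
    ET.SubElement(obj, "reference", {"href": "data/obs.csv"})
    return root


def dangling_references(root):
    """List (dataObject ID, missing ID) pairs whose target is not defined."""
    known = {el.get("ID") for el in root.iter("metadataObject")}
    missing = []
    for obj in root.iter("dataObject"):
        for ref in obj.get("metadataRefs", "").split():
            if ref not in known:
                missing.append((obj.get("ID"), ref))
    return missing


manifest = build_manifest()
problems = dangling_references(manifest)
```

A check of this kind matters because the identifiers are the only thing connecting a data object to its RepInfo or PDI; a misspelt identifier silently breaks the link.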
Fig. 11.2 Conceptual view of an XFDU
[Figure: the xfdu manifest comprises packageHeader, informationPackageMap (structure map of ContentUnits), metadataSection (metadataObjects), dataObjectSection (dataObjects) and behaviorSection (behaviorObjects); objects carry XML IDs and URIs, and content units point to them by XML ID. Metadata categories: REP, PDI, DMD, OTHER, ANY. Classes: DED, SYNTAX, CONTEXT, PROVENANCE, REFERENCE, FIXITY, DESCRIPTION, OTHER.]
Fig. 11.3 XFDU manifest logical view
The informationPackageMap records information about content units, which
are used to associate data in the dataObjectSection to metadata in the
metadataSection. The association is done via XML identifiers, and maps to the OAIS
concept of Content Information Object, the combination of a digital object and its
RepInfo.
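The role of a content unit can be sketched by resolving its identifier pointers into a pairing of digital object and RepInfo. The manifest below is a simplified stand-in: the dataRef, repRefs and href attribute names are hypothetical and do not reproduce the real XFDU schema, which should be consulted for the actual element and attribute definitions.

```python
import xml.etree.ElementTree as ET

# Simplified, non-schema-valid manifest; attribute names are assumptions.
MANIFEST = """
<xfdu>
  <informationPackageMap>
    <contentUnit ID="cu1" dataRef="data1" repRefs="rep1"/>
  </informationPackageMap>
  <metadataSection>
    <metadataObject ID="rep1" category="REP" href="repinfo/obs_format.txt"/>
  </metadataSection>
  <dataObjectSection>
    <dataObject ID="data1" href="data/obs.csv"/>
  </dataObjectSection>
</xfdu>
"""


def content_information(manifest_xml):
    """Map each content unit ID to (data location, [RepInfo locations])."""
    root = ET.fromstring(manifest_xml.strip())
    # Index every element that carries an XML identifier.
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    resolved = {}
    for unit in root.iter("contentUnit"):
        data = by_id[unit.get("dataRef")]
        rep_info = [by_id[ref] for ref in unit.get("repRefs", "").split()]
        resolved[unit.get("ID")] = (
            data.get("href"),
            [rep.get("href") for rep in rep_info],
        )
    return resolved


content = content_information(MANIFEST)
```

Each entry in the result corresponds to one Content Information Object in the OAIS sense: the digital object together with the RepInfo needed to understand it.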
A diagram of the full XML schema of the XFDU is shown in Fig. 11.4. This
schema keeps AIPs consistent and standard while allowing a flexible and adaptable
implementation. By extending the XFDU schema to provide domain-specific AIPs
it is possible to allow the inclusion of additional information while maintaining
the standardization and consistency that are two of the main advantages of using
XFDU for preservation. ESA has demonstrated this by extending the XFDU schema
into SAFE, which includes spacecraft mission-specific information embedded in the
XFDU manifest.
A toolkit for creating and reading XFDUs is available from the XFDU web site
[148] and GAEL XFDU web site [149].
11.3.1 XFDU and TDO
Because both embody packaging techniques, the XFDU structure does implement
many, perhaps all, of the concepts of the Trustworthy Digital Object (TDO) [8].
However the latter seems to rely on emulation (see Sect. 7.9) and in particular the
UVC (see Sect. 7.9.4.3) as its ultimate preservation technique.
Fig. 11.4 Full XFDU schema diagram
Emulation has its place in preservation but, as we point out in Sect. 7.9, it
is limiting, not least because in essence one is restricted to what has been possible
with the digital object in the past. Moreover, especially because the semantics of the
digital object are not made explicit in the TDO, even if one could link the emulation
to modern applications, one would be limited in what new things could be done.
The XFDU is not tied in any way to emulation, although an emulator can be one
part of the Representation Information in the package. Therefore it is fair to say that
the XFDU is a superset of the TDO technical concept.
11.4 Summary
Packaging is an important requirement with many possible solutions. This chapter
has tried to elucidate the key considerations and describe in some detail one possible
packaging mechanism.