7/31/2019 Chapter 15 - Testing Claims About Digital Preservation
Chapter 15
Testing Claims About Digital Preservation
In this part of the book we show a number of real examples of digital preservation
activities. These have been chosen to illustrate a range of scenarios and
preservation strategies, using a great variety of types of data, from the simplest
to the highly complex.
15.1 Accelerated Lifetime Testing of Digital Preservation
Techniques
In order to understand which claims about digital preservation should be tested,
and how, we need to understand what things can change over time and what we might
expect to be able to rely on. Then we can simulate the passage of time, at an
accelerated rate. For convenience, some of the text from Chap. 5 is repeated here.
15.1.1 What Can Change?
We can consider some of the things that can change over time, and against which
an archive must therefore safeguard the digitally encoded information.
15.1.1.1 Hardware and Software Changes
Use of many digital objects relies on specific software and hardware, for example
applications which run on specific versions of Microsoft Windows, which in
turn run on Intel processors. Experience shows that while it may be possible to
keep hardware and software available for some time after they have become obsolete,
this is not a practical proposition into the indefinite future. However, there are
several projects and proposals which aim to emulate hardware systems and hence run
software systems.
D. Giaretta, Advanced Digital Preservation, DOI 10.1007/978-3-642-16809-3_15, © Springer-Verlag Berlin Heidelberg 2011
15.1.1.2 Environment Changes
These include changes to licences or copyright and changes to organisations, all of
which affect the usability of digital objects. External information that is vital to
the use and understandability of those objects, ranging from the DNS to XML DTDs and
Schema, may also become unavailable.
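One way to make this kind of dependency visible is to list the external resources
that a file points at. The following is a minimal sketch in Python, assuming the
object is a well-formed XML document. The function name external_dependencies and
the example.org addresses are hypothetical and serve only as illustration. The
sketch reports the DTD named in a DOCTYPE declaration and any schema locations given
through the standard xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of the standard XML Schema instance attributes.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Matches the system identifier of a DOCTYPE declaration.
# Assumption: identifiers are written with double quotes.
DOCTYPE = re.compile(
    r'<!DOCTYPE\s+\S+\s+(?:SYSTEM|PUBLIC\s+"[^"]*")\s+"([^"]+)"'
)


def external_dependencies(xml_text):
    """Return the set of external DTD and schema locations named in xml_text."""
    deps = set(DOCTYPE.findall(xml_text))
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # xsi:schemaLocation holds whitespace-separated namespace/location pairs.
        pairs = element.get(f"{{{XSI}}}schemaLocation", "").split()
        deps.update(pairs[1::2])
        location = element.get(f"{{{XSI}}}noNamespaceSchemaLocation")
        if location:
            deps.add(location)
    return deps


# Hypothetical sample document.
SAMPLE = """<!DOCTYPE note SYSTEM "http://example.org/note.dtd">
<note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://example.org/ns http://example.org/note.xsd">
  <body>hello</body>
</note>"""

if __name__ == "__main__":
    for dependency in sorted(external_dependencies(SAMPLE)):
        print(dependency)
```

Each address reported is something an archive would need either to capture alongside
the object or to be confident of obtaining in the future.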
15.1.1.3 Termination of the Archive
Without permanent funding, any archive will, at some time, end. It is therefore
possible for the bits, i.e. the binary objects, to be lost, and much else besides,
including the knowledge of the curators about the information encoded in those bits.
Experience shows that much essential knowledge, such as the linkage between holdings,
the operation of specialised hardware and software, and the links between data files
and events recorded in system logs, is held by such curators (in their heads) but not
written down or encoded for exchange or preservation. Bearing these things in mind,
it is clear that any repository must be prepared to hand over its holdings, together
with all these tacit pieces of information, to its successor(s).
Other major threats include financial or political upheaval and environmental
upheaval such as floods or earthquakes.
15.1.1.4 Changes in What People Know
As described earlier, the Knowledge Base of the Designated Community determines
the amount of Representation Information which must be available. This Knowledge
Base changes over time as terminology, tools and theories change.
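The relationship between the Knowledge Base and the Representation Information that
must be supplied can be sketched as a walk over a dependency graph. The following
Python fragment is an illustration only. The function name, the dictionary layout and
the example entries are all assumptions made for this sketch and do not describe any
particular system. It assumes that each item lists what is needed to understand it,
and that the walk stops at anything already in the Knowledge Base.

```python
def required_representation_information(data_object, depends_on, knowledge_base):
    """Return the items that must be supplied for data_object to be understood.

    depends_on maps an item to the items needed to understand it.
    knowledge_base is the set of items the community already understands.
    """
    required = set()
    pending = [data_object]
    while pending:
        item = pending.pop()
        for needed in depends_on.get(item, ()):
            # Stop at anything already known or already collected.
            if needed in knowledge_base or needed in required:
                continue
            required.add(needed)
            pending.append(needed)
    return required


# Hypothetical dependency graph for a single XML data set.
DEPENDS_ON = {
    "dataset.xml": ["XML specification", "dataset schema"],
    "dataset schema": ["XML Schema specification", "domain glossary"],
    "XML specification": ["Unicode", "English"],
    "XML Schema specification": ["XML specification", "English"],
    "domain glossary": ["English"],
}

# Two hypothetical Knowledge Bases: one now, one after XML tools fall out of use.
KNOWLEDGE_NOW = {"XML specification", "XML Schema specification", "Unicode", "English"}
KNOWLEDGE_LATER = {"Unicode", "English"}

if __name__ == "__main__":
    for label, base in (("now", KNOWLEDGE_NOW), ("later", KNOWLEDGE_LATER)):
        needed = required_representation_information("dataset.xml", DEPENDS_ON, base)
        print(label, sorted(needed))
```

With the smaller Knowledge Base the set of required items grows, which is the effect
that a test simulating the passage of time would need to exercise.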
15.1.2 What can be Relied on in the Long Term?
While we cannot provide rigorous proofs, it is worth, at this point, listing those
things which we might credibly argue would be available in the long term, in order
to clarify the basis of our approach. We should be able to trace back our preservation
plans to these assumptions. Were we able to undertake a rigorous mathematical
proof, these would form the basis of the axioms for our theorems.
Words on paper (or titanium sheets) that people can read
ISO standards kept in national libraries are an example of this. Over the long
term there may be an issue of language and character shape.
Carvings in stone and books have proven track records of preserving
information over hundreds of years.
The information, such as Representation Information, which is collected
This is a somewhat recursive assumption; however, it is difficult to make
progress without it. This Representation Information includes both digital and
physical (e.g. books) objects.
Some kind of remote access
Network access is the natural assumption but in principle other methods of
obtaining information from a given address/location would suffice, for example
fax or horse-back rider.
Some kind of computers
Perhaps not strictly necessary but this seems a sensible assumption given the
amount of calculation needed to do some of the most trivial operations, such as
displaying anything beyond simple ASCII text, or extracting information from
large datasets.
People? Organisations?
Clearly neither the originators of the digital objects nor the initial host
organisations can be relied on to continue to exist. However, if no people and no
organisations exist at all, then perhaps digital preservation becomes a moot topic.
Identifiers?
Some kind of identifier system is needed, but clearly we cannot assume that
any given URL, for example, will remain valid.
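The point about identifiers can be illustrated with a level of indirection between a
stable identifier and the current location of an object. The following Python sketch
is an assumption-laden illustration, not a description of any existing identifier
system. The class IdentifierResolver, the identifier "example:dataset-42" and the
example.org addresses are all hypothetical.

```python
class IdentifierResolver:
    """Map stable identifiers to the locations an object has had over time."""

    def __init__(self):
        # identifier -> list of locations, oldest first
        self._locations = {}

    def register(self, identifier, location):
        """Record a new current location for identifier, keeping older ones."""
        self._locations.setdefault(identifier, []).append(location)

    def resolve(self, identifier):
        """Return the most recently registered location for identifier."""
        history = self._locations.get(identifier)
        if not history:
            raise KeyError(f"unknown identifier: {identifier}")
        return history[-1]

    def history(self, identifier):
        """Return every location registered for identifier, oldest first."""
        return list(self._locations.get(identifier, []))


if __name__ == "__main__":
    resolver = IdentifierResolver()
    # The object is first held by one (hypothetical) archive ...
    resolver.register("example:dataset-42", "http://old-archive.example.org/data/42")
    # ... and later handed over to a successor, so the URL changes.
    resolver.register("example:dataset-42", "http://successor.example.org/holdings/42")
    print(resolver.resolve("example:dataset-42"))
```

The identifier cited by users stays the same while the URL behind it changes. The
sketch leaves open who maintains the resolver over the long term, which is itself one
of the assumptions listed above.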
15.2 Summary
This short chapter provides a very brief introduction to what we need to think about
when we are planning to preserve digitally encoded information. Later chapters
discuss these topics in much more detail.