Storage of digital objects
Adolf Knoll, National Library of the Czech Republic, [email protected]
Basic principles
Store digital objects in such a way that you can always:
find them – know where they are
identify them – know what they contain
open and access them
manipulate them
provide users with access to them
The search for reliability
image and other data files
metadata
platform and definitions
structural schemes
storage media
Everything is in motion – we should evaluate its speed and prefer more durable solutions.
Image and data files
they should have unique names
they should be attached to identification metadata
they should be archived in widely used ISO formats (TIFF, JPEG)
we should be aware that the source archival image can be reused, e.g. the Czech NL is now optimizing its access image sets
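A minimal sketch of how a unique file name can be tied to identification metadata. The naming pattern, the field names and the use of a SHA-256 checksum are assumptions made for illustration; they are not the scheme used by the Czech NL.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """Links one archival image file to its identification metadata."""
    document_id: str  # hypothetical identifier of the digitized document
    page: int         # page number within the document
    file_name: str    # unique name derived from the two fields above
    sha256: str       # checksum of the image data, for later identification


def make_record(document_id: str, page: int, data: bytes, ext: str = "tif") -> ImageRecord:
    # The name is built only from identifying fields, so it is unique
    # as long as (document_id, page) is unique.
    file_name = f"{document_id}_{page:04d}.{ext}"
    return ImageRecord(document_id, page, file_name, hashlib.sha256(data).hexdigest())
```

Because the name is derived from the identifier, the file can be traced back to its metadata record even when it is found without its database.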
Metadata
openly readable platforms (SGML family)
content markup and the possibility of reusing it
content description rules with enough granularity
reliable storage of metadata
awareness that one day they might be migrated, e.g. the Czech NL is now migrating metadata sets for ca. 1.3 million digitized pages (in continuing production)
make them independent of access tools
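A minimal sketch of metadata kept in an openly readable markup (XML, a member of the SGML family) and written with a standard library only, so that it does not depend on any access tool. The element and attribute names are hypothetical and do not reproduce any real schema.

```python
import xml.etree.ElementTree as ET


def page_metadata_xml(document_id: str, page: int, image_file: str, title: str) -> str:
    """Return a small, tool-independent XML description of one page."""
    root = ET.Element("document", {"id": document_id})
    ET.SubElement(root, "title").text = title
    page_el = ET.SubElement(root, "page", {"n": str(page)})
    # The image is referenced by file name, not embedded.
    ET.SubElement(page_el, "image", {"href": image_file})
    return ET.tostring(root, encoding="unicode")
```

Any XML parser can read the result back, which is what makes a later migration feasible.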
Structuring and storing
Structure metadata and link them to image entities
Store the structured units together
If creating database access, pay a lot of attention to preservation of the database
Always keep reference off-line archives for digitized documents
Reference digital archive on-line
Disk arrays for access
Reference archive of preservation microfilm
Reference digital archive of off-line media (CD, magnetic tapes)
Administration
Media
media are not critical for long-term storage
the dependence of digital documents on software and hardware environments is critical
media can be refreshed
always keep several independently stored copies
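A minimal sketch of how independently stored copies can be checked before a media refresh. It assumes that a SHA-256 checksum was recorded when the file was archived; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum of a file, read in chunks so large images fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def check_copies(copies, expected: str) -> dict:
    """Map each copy location to True if it exists and matches the checksum."""
    result = {}
    for location in copies:
        p = Path(location)
        result[str(location)] = p.is_file() and sha256_of(p) == expected
    return result
```

A copy reported as False can then be rewritten from one that is reported as True.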
Stability of media vs. obsolescence
It seems that the most reliable medium is the CD:
obsolete, but well standardized
it contains redundant data for error correction
far from being dead – it is also well suited to video thanks to new compression standards
cheap
CD archives should be measured and monitored
do not OVERBURN!
Other media
Most data is stored on magnetic tapes
problems with writing software after years, no data redundancy
fragility; tapes should be rewound
mass robotic storage facilities/libraries: self-checking, rewriting, but…
Hard disks – backups – more access media
DVD, flash cards, …
An example – compact disc
scratch caused by handling in the reader
printing matrix made this scratch
insufficient metallic coating
Origin of damage to CDs
manipulation scratches – 43%
dust – 28%
damage incurred in production – 15%
Table of Contents errors – 2%
other – 15%
Not all of them are critical for readability, but they warn us.
Quality vs. commerce
Industrial production works for current users – acceptable quality of media
Long-term storage needs more reliable quality
Co-operation with producers is the only solution
Especially compact discs and magnetic tapes (UNESCO involvement)
Proprietary vs. Open source
Open source – we possess the sources and can change the product – usually required by the academic sphere
However: Microsoft, Adobe, Ex-Libris – no chance
Society has a consumer character also in more important fields than libraries
I.e., work with small and medium enterprises and local companies; try to develop in harmony with them so that both sides grow