DRS 2 Orientation
Harvard University Library
September 30, 2010
DRS = Digital Repository Service
Agenda
1. DRS 2
    1. Concepts (Andrea)
    2. New metadata (Robin)
    3. Overall schedule (Andrea)
2. BatchBuilder 2 demo (Vitaly)
3. Testing instructions (Vitaly)
4. Questions & comments
DRS 2 Concepts
DRS 1: everything’s a file
[Diagram: a flat set of DRS files: METS XML, TIFF, JPEG, JP2, text, ZIP, and PDF files]
File level is not a meaningful level for curatorial uses…
Which DRS files make up my digital manuscript?
HOLLIS number 009412949
http://nrs.harvard.edu/urn-3:FHCL.HOUGH:1116980
http://pds.lib.harvard.edu/pds/view/6522882
DRS file ID = 6522882
[Diagram: files grouped by page (page 1, page 2), each group with a METS XML file, TIFF image files, and JP2 image files]
Objects
Aggregations of files that together represent a coherent unit of content
    All the files that make up a single digital book
    All the master and use copies representing a single photograph
Useful for management, reporting and searching
    “How many PDS document objects do I have in the DRS?”
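As a rough illustration of the object idea, the sketch below groups files into objects and answers the reporting question from the slide. The class name, field names, and sample file names are hypothetical and are not taken from the DRS data model.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class DigitalObject:
    """A coherent unit of content made up of one or more files (hypothetical model)."""
    name: str
    content_model: str                       # e.g. "PDS document", "Still image"
    files: List[str] = field(default_factory=list)


def count_by_content_model(objects):
    """Count objects per content model, regardless of how many files each has."""
    return Counter(obj.content_model for obj in objects)


# Illustrative data only: two page-turned documents and one photograph.
objects = [
    DigitalObject("manuscript", "PDS document",
                  ["page1.tif", "page1.jp2", "page2.tif", "page2.jp2"]),
    DigitalObject("book", "PDS document", ["page1.jp2", "page1.txt"]),
    DigitalObject("photograph", "Still image", ["master.tif", "use.jpg"]),
]

counts = count_by_content_model(objects)
print(counts["PDS document"])  # 2
```

Counting at the object level gives 2 page-turned documents here, whereas a file-level count would only report 6 unrelated files.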
Objects
New hook for metadata
    Administrative categories (projects, exhibits, collections, etc.)
    Descriptive metadata, catalog records
[Diagram: an Object linked to HOLLIS # 009412949, "Digital Medieval Manuscripts at Houghton Library", and "Moralia in Job: manuscript"]
Content models
Object types
Define
    valid file formats and relationships
    known delivery and rendering applications
    associated assessments and preservation plans
Enforce conformity - we know what we have in the DRS and can monitor & preserve it
DRS 2.1 content models – deposit & delivery
1. Still image
    Image objects, delivered by IDS
2. PDS document
    Page-turned documents, delivered by PDS
3. Document
    Initially just PDF files, delivered by FDS
4. Opaque
    Files in any format
5. Text
    Text, XML, etc. delivered by FDS
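One way to picture how a content model can enforce conformity is a lookup of allowed file formats per model. The sketch below is only an illustration: the format sets are assumptions loosely inferred from the examples in this deck, and the function name is hypothetical. It is not the actual DRS validation logic.

```python
# Hypothetical format rules per content model (assumed, not the real DRS rules).
# None means "any format is accepted", as for the Opaque content model.
ALLOWED_FORMATS = {
    "Still image": {"tiff", "jpeg"},
    "PDS document": {"jp2", "txt"},
    "Document": {"pdf"},
    "Opaque": None,
    "Text": {"txt", "xml"},
}


def nonconforming_files(content_model, files):
    """Return the names of files whose format the content model does not allow.

    `files` is a list of (file name, format) pairs.
    """
    allowed = ALLOWED_FORMATS[content_model]
    if allowed is None:  # Opaque: nothing to check
        return []
    return [name for name, fmt in files if fmt not in allowed]


batch = [("report.pdf", "pdf"), ("notes.txt", "txt")]
print(nonconforming_files("Document", batch))  # ['notes.txt']
print(nonconforming_files("Opaque", batch))    # []
```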
Still image CM – print
TIFF archival master
Several derivative JPEG deliverables
Derivative JPEG thumbnail
Pope Joan
Series: Illustration from Philippus Bergomensis, De Claribus Mulieribus. Ferrara, Rossi
Harvard Art Museum/Fogg Museum, Gift of Philip Hofer
PDS document CM - book
Zoeller, Karl William. Merchandising the plumbing business. Chicago : Domestic Engineering Co., c1921. Baker Library.
JP2 archival master / deliverable images per page
Plain text files per page
…
Document CM - report
Intergovernmental Panel on Climate Change (IPCC) WG1 Fourth Assessment Report, Environmental Science and Public Policy Archives, Harvard College Library
PDF deliverable
Opaque content model
The contents of Judge Trager’s hard drive, Harvard Law School Library
    WordPerfect files, text files, PDF documents, etc.
    Plus documentation about the collection
Text CM – methodology
Plain text file
Processing methodology for Intergovernmental Panel on Climate Change (IPCC) documents, HCL Imaging Services.
New metadata
Object descriptors
A METS metadata file per object on the file system alongside content files
Descriptive, administrative, preservation, technical and structural metadata
Describes the object, all its files and bitstreams and related significant events
Gives the metadata the same secure storage as the content files
Self-contained, portable objects
The move to standards
PREMIS -- for key preservation metadata, including
    Events that affect content
    Relationships that are not implicit
MODS -- for descriptive metadata
Format-specific schemas for technical metadata, including
    MIX for images
    textMD for text
    DocumentMD for PDF and other document formats
    More to come…
Supplemented by local administrative schemas
New local metadata
adminCategory
adminFlag
captions, phase 2
    Behavior, default, unit name, description
for objects
    content model identification
    DRS URI
isFirstGenerationInDrs
    Closest to original capture
isPreferredDeliverableSource
Changes to local metadata
OwnerSuppliedName
    Required for objects, optional for files
Role
    Repeatable for both objects and files
Processing
    Instead of “purpose”; repeatable
Quality
    Optional
Methodology
    Now for objects and files of all types
Tracking changes
DRS 2 will keep track of
    Changes that affect content
    Troubleshooting content errors
    Key administrative metadata
Three types:
    Events
    Administrative flags
    “Versioned” metadata elements
Not tracking every metadata change
Events
Object
    creation
    deletion / recovery from deletion
    ingest
    merge
File
    addition
    deletion / recovery from deletion
    integrity check confirmation
    replacement
    virus check confirmation
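The event types above could be recorded in an append-only log. The following is a minimal sketch under stated assumptions: the `Event` fields, the `record_event` helper, and the identifiers are hypothetical, and only the event type names come from the slide. In DRS 2 these events are expressed as PREMIS events in the object descriptor; this sketch does not reproduce that encoding.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Event types from the slide, keyed by what they apply to.
EVENT_TYPES = {
    "object": {"creation", "deletion", "recovery from deletion", "ingest", "merge"},
    "file": {"addition", "deletion", "recovery from deletion",
             "integrity check confirmation", "replacement",
             "virus check confirmation"},
}


@dataclass(frozen=True)
class Event:
    target_kind: str   # "object" or "file"
    target_id: str     # hypothetical identifier
    event_type: str
    when: datetime


def record_event(log, target_kind, target_id, event_type):
    """Append an event to the log, rejecting types not listed for that target."""
    if event_type not in EVENT_TYPES[target_kind]:
        raise ValueError(f"{event_type!r} is not a {target_kind} event")
    event = Event(target_kind, target_id, event_type, datetime.now(timezone.utc))
    log.append(event)
    return event


log = []
record_event(log, "object", "obj-1", "ingest")
record_event(log, "file", "file-1", "virus check confirmation")
print(len(log))  # 2
```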
Other tracking
Metadata where changes will be tracked:
    Access Flag
    Administrative Flag
    Billing Code
    Owner Code
What’s inside a descriptor?
Descriptive Metadata
    MODS
Administrative Metadata
    For the object:
        PREMIS (including relationships)
        DRS administrative metadata
    For each file:
        PREMIS (including relationships)
        Format-specific metadata
        DRS administrative metadata
PREMIS Events
Inventory of Files
Structure Map
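To make the descriptor layout more concrete, here is a minimal sketch that builds an empty METS skeleton with Python's standard library. The section elements (`dmdSec`, `amdSec`, `fileSec`, `structMap`) and namespaces are standard METS and MODS; how DRS 2 actually arranges PREMIS, format-specific, and local metadata inside them is not shown in this deck, so that mapping, the ID values, and the function name are assumptions. The result is not intended to be schema-valid.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("mods", MODS_NS)


def mets(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def mods(tag):
    """Qualify a tag name with the MODS namespace."""
    return f"{{{MODS_NS}}}{tag}"


def build_descriptor_skeleton(title):
    """Build a bare METS tree mirroring the sections listed on the slide."""
    root = ET.Element(mets("mets"))
    # Descriptive metadata: a MODS record wrapped in a dmdSec.
    dmd = ET.SubElement(root, mets("dmdSec"), ID="dmd-1")
    wrap = ET.SubElement(dmd, mets("mdWrap"), MDTYPE="MODS")
    record = ET.SubElement(ET.SubElement(wrap, mets("xmlData")), mods("mods"))
    ET.SubElement(ET.SubElement(record, mods("titleInfo")), mods("title")).text = title
    # Administrative metadata (assumed home of PREMIS and DRS admin metadata).
    ET.SubElement(root, mets("amdSec"), ID="amd-1")
    # Inventory of files.
    ET.SubElement(root, mets("fileSec"))
    # Structure map.
    ET.SubElement(root, mets("structMap"))
    return root


root = build_descriptor_skeleton("Moralia in Job: manuscript")
sections = [child.tag.split("}")[1] for child in root]
print(sections)  # ['dmdSec', 'amdSec', 'fileSec', 'structMap']
```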
Overall schedule
Available now: first release of BatchBuilder 2 for depositor training and testing
    Supports 5 content models
Fall 2010 – Summer 2011
    BatchBuilder 2 enhancements & bug fixes
    Web Admin 2 development and testing
~September 2011: BatchBuilder 2 and Web Admin 2 in production
BatchBuilder 2
Will build batches of objects rather than batches of files
Will automatically determine most technical metadata (using FITS)
Will automatically create all object descriptors (using OTS)
BatchBuilder 1 vs. BatchBuilder 2

BatchBuilder 1: Expects files and creates batches of files.
BatchBuilder 2: Expects objects and creates batches of objects.

BatchBuilder 1: Can use an existing PDS METS file for PDS objects.
BatchBuilder 2: Can import a structmap from an “old-style” PDS METS file to create a PDS Document descriptor.

BatchBuilder 1: Uses batch genres.
BatchBuilder 2: Uses DRS Content Models.

BatchBuilder 1: Uses a supplied HOLLIS ID to import contents of a HOLLIS record to a PDS METS Label.
BatchBuilder 2: Uses a supplied HOLLIS ID to import contents of a HOLLIS record into the MODS section of the object descriptor.

BatchBuilder 1: Batch level and directory level metadata entered in Batch Template panel.
BatchBuilder 2: Object level and directory level metadata entered in Object Template.

BatchBuilder 1: Project level metadata is entered in Administrative Properties panel.
BatchBuilder 2: Project level metadata is entered in Deposit Settings panel.

BatchBuilder 1: No depositor authorization – anyone with access to the ftp dropbox can load batches.
BatchBuilder 2: Depositor authorization – only depositors with permission to load into a particular owner code can load batches into that owner code.
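The depositor authorization rule in the last comparison can be pictured as a per-owner-code permission check. This is a minimal sketch with a hypothetical permission table, depositor names, and owner codes; it does not reflect how BatchBuilder 2 or the DRS actually stores or checks permissions.

```python
# Hypothetical permission table: depositor -> owner codes they may load into.
PERMISSIONS = {
    "depositor_a": {"OWNER.ONE"},
    "depositor_b": {"OWNER.ONE", "OWNER.TWO"},
}


def can_load_batch(depositor, owner_code, permissions=PERMISSIONS):
    """Return True only if the depositor has permission for this owner code."""
    return owner_code in permissions.get(depositor, set())


print(can_load_batch("depositor_a", "OWNER.ONE"))  # True
print(can_load_batch("depositor_a", "OWNER.TWO"))  # False
print(can_load_batch("unknown", "OWNER.ONE"))      # False
```

Under the BatchBuilder 1 behavior described above, the equivalent check would simply be whether the user can reach the ftp dropbox, with no owner code involved.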
Testing instructions
Questions & Comments