BitstreamFormat Renovation: DSpace Gets Real Technical Metadata

Upload: maxine-romero

Post on 31-Dec-2015

Page 1: BitstreamFormat Renovation:

BitstreamFormat Renovation: DSpace Gets Real Technical Metadata

Page 2: BitstreamFormat Renovation:

Benefits (Why Should I Care?)

• The new format identifier resolved the data formats of 858 previously unidentified Bitstreams in DSpace@MIT (1,020 unidentified before, 162 after). How many are mis-identified in your repository?

• Accurate MIME-types improve delivery to Web clients

• Quality preservation requires accurate data format knowledge

• Interoperability with internal and external tools relies on correct technical metadata in commonly-recognized standards

• Without automated tools, maintenance of format technical metadata is a tedious manual job for repository managers

BitstreamFormat Renovation Prototype

Page 3: BitstreamFormat Renovation:

Data Formats

A “Data Format” is defined as:

Technical Metadata that describes how abstract information is encoded and structured in a digital document.

“Abstract Information” refers to the actual intellectual content contained in the digital object.

Page 4: BitstreamFormat Renovation:

Problems with Current Format Technical Metadata

• Formats are identified with arbitrary names, which hinders interoperability

• There is no means to collect additional format technical metadata, e.g. format specification documents

• Identifying formats only by filename extension is imprecise and unreliable

• Current internal format model is inflexible
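To illustrate why extension-only identification is unreliable, here is a minimal sketch that compares a guess based on the filename with a guess based on the leading bytes of the content. The signature table and the function names are assumptions made for this example; they are not part of DSpace or the prototype, and real identification tools such as DROID use far richer signature definitions.

```python
import mimetypes

# Leading byte signatures ("magic numbers") for a few common formats.
# This short table is illustrative only, not a complete signature set.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def guess_by_extension(filename):
    """Return the MIME type implied by the filename alone, or None."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime


def guess_by_content(data):
    """Return the MIME type implied by the leading bytes, or None."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


# A PDF that was deposited under a misleading filename.
filename = "report.txt"
content = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n1 0 obj\n"

by_name = guess_by_extension(filename)
by_content = guess_by_content(content)
print("by extension:", by_name)
print("by content:  ", by_content)
```

The two guesses disagree for this input: the extension suggests a text file, while the content starts with a PDF header that also carries the format version.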

Page 5: BitstreamFormat Renovation:

Terms

• Format Registry
    – PRONOM
    – GDFR

• Identification Plugins
    – DROID
    – JHOVE

• Interoperable Format Identifiers
    – MIME Type
    – PUID (PRONOM Unique IDentifier)

Page 6: BitstreamFormat Renovation:

Object-Model Architecture Connected to PRONOM

Page 7: BitstreamFormat Renovation:

Identification of Two BitstreamFormat Types

Page 8: BitstreamFormat Renovation:

The Local “DSpace” and Provisional Registries

Page 9: BitstreamFormat Renovation:

Interface to External Registries

• Get Synonyms
    – Returns a list of identifiers that are bound to the same format record

• Import
    – Turns an external format description into a new BitstreamFormat entry, initializing its metadata fields from the external registry

• Update
    – Refreshes the metadata fields of a BitstreamFormat to keep up with changes

• ConformsTo
    – Tests whether the format described by one identifier “conforms to” or is a sub-type of another format
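The four operations above can be sketched as an abstract interface with a toy in-memory implementation. Everything in this block is hypothetical: the class names, method names, record layout, and the identifiers ("toy:pdf", "toy:pdf-1.4", "alt:pdf14") are invented for illustration and are not the prototype's actual API or real registry identifiers.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class BitstreamFormat:
    """Hypothetical stand-in for a local format entry."""
    name: str
    identifiers: list
    description: str = ""


class ExternalRegistry(ABC):
    """Hypothetical interface covering the four operations on the slide."""

    @abstractmethod
    def get_synonyms(self, identifier): ...

    @abstractmethod
    def import_format(self, identifier): ...

    @abstractmethod
    def update(self, fmt): ...

    @abstractmethod
    def conforms_to(self, identifier, other): ...


class ToyRegistry(ExternalRegistry):
    """In-memory registry with invented records and identifiers."""

    def __init__(self):
        self._records = {
            "pdf": {"name": "PDF (generic)", "ids": ["toy:pdf"],
                    "parent": None, "description": "Any PDF version"},
            "pdf-1.4": {"name": "PDF 1.4", "ids": ["toy:pdf-1.4", "alt:pdf14"],
                        "parent": "pdf", "description": "PDF version 1.4"},
        }

    def _key_for(self, identifier):
        # Find the record that a given identifier is bound to.
        for key, record in self._records.items():
            if identifier in record["ids"]:
                return key
        raise KeyError(identifier)

    def get_synonyms(self, identifier):
        # All identifiers bound to the same format record.
        return list(self._records[self._key_for(identifier)]["ids"])

    def import_format(self, identifier):
        # Build a new local entry from the external description.
        record = self._records[self._key_for(identifier)]
        return BitstreamFormat(record["name"], list(record["ids"]),
                               record["description"])

    def update(self, fmt):
        # Refresh a local entry from the current external record.
        record = self._records[self._key_for(fmt.identifiers[0])]
        fmt.name = record["name"]
        fmt.identifiers = list(record["ids"])
        fmt.description = record["description"]
        return fmt

    def conforms_to(self, identifier, other):
        # Walk up the parent chain looking for the other format's record.
        target = self._key_for(other)
        key = self._key_for(identifier)
        while key is not None:
            if key == target:
                return True
            key = self._records[key]["parent"]
        return False


registry = ToyRegistry()
fmt = registry.import_format("alt:pdf14")
print(registry.get_synonyms("toy:pdf-1.4"))
print(registry.conforms_to("alt:pdf14", "toy:pdf"))

# Simulate a change in the external registry, then refresh the local entry.
registry._records["pdf-1.4"]["description"] = "PDF version 1.4 (revised entry)"
registry.update(fmt)
print(fmt.description)
```

Modelling sub-typing as a parent link is one simple design choice; a real registry may express conformance relationships differently.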

Page 10: BitstreamFormat Renovation:

The Bitstream Format Metadata Admin Panel

Page 11: BitstreamFormat Renovation:

Importing New Bitstream Formats

Page 12: BitstreamFormat Renovation:

Editing BitstreamFormat Metadata

Page 13: BitstreamFormat Renovation:

Digital Preservation Strategies

• Pluggable architecture allows for access to external identification and technical metadata tools

• Access and preservation rely on accurate format identification

• Migration / Obsolescence tools are only effective with correct and precise identification, because format versions matter

• The creation of derivatives (e.g. thumbnails or delivery versions) via MediaFilter will also rely on accurate identification

Page 14: BitstreamFormat Renovation:

Interoperability Benefits

• Avoids platform lock-in

• Reliable delivery functionality

• Consistent object description semantics (ORE)

• Interoperability with digital preservation services

Page 15: BitstreamFormat Renovation:

Quantitative Results

DSpace@MIT (155,000 Bitstreams)

• Before
    – 1,020 Unidentified (0.65%)

• After
    – 162 Unidentified (0.104%)
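As a quick check of the arithmetic behind these figures, the sketch below recomputes the counts and percentages from the numbers on the slide. It assumes the 155,000 total is a rounded value, so the computed percentages can differ slightly from the ones shown above.

```python
# Figures taken from the slide; the total is assumed to be rounded.
total_bitstreams = 155_000
unidentified_before = 1_020
unidentified_after = 162

# Bitstreams whose format was resolved by the new identifier.
resolved = unidentified_before - unidentified_after

# Share of the repository left unidentified, before and after.
share_before = unidentified_before / total_bitstreams
share_after = unidentified_after / total_bitstreams

# Fraction of the originally unidentified Bitstreams that were resolved.
reduction = resolved / unidentified_before

print(f"resolved: {resolved}")
print(f"unidentified before: {share_before:.3%}")
print(f"unidentified after:  {share_after:.3%}")
print(f"reduction in unidentified Bitstreams: {reduction:.1%}")
```

The resolved count matches the 858 Bitstreams cited in the benefits slide, and roughly five out of six previously unidentified Bitstreams received a format.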

Page 16: BitstreamFormat Renovation:

Related Links

http://wiki.dspace.org/index.php/BitstreamFormat_Renovation

http://www.nationalarchives.gov.uk/PRONOM/

http://droid.sourceforge.net/wiki/index.php/Introduction

https://collaborate.oclc.org/wiki/gdfr/about.html

http://www.loc.gov/standards/premis/

http://pilot.apsr.edu.au/wiki/index.php/AONS_II

http://www.ijdc.net/ijdc/article/view/53/

http://hul.harvard.edu/jhove/

http://web.mit.edu/sands/www/bfr
