Archives, Digital Archives and Encoded Archival Description Chris Prom Assistant University Archivist University of Illinois Mortenson Visiting Scholars Tech Training April 19, 2006

Page 1: Archives, Digital Archives and Encoded Archival Description

Archives, Digital Archives and Encoded Archival Description

Chris Prom

Assistant University Archivist

University of Illinois

Mortenson Visiting Scholars Tech Training

April 19, 2006

Page 2: Archives, Digital Archives and Encoded Archival Description

Intro

• Overview of Archives, Arrangement and Description

• Review Standards and Tools related to Archival Description

• Review Standards and Tools for providing access to digital archival materials

• Lots of interaction

Page 3: Archives, Digital Archives and Encoded Archival Description

Archives Background

• Archives: Organized non-current “records”; generated by institutions

• Manuscripts: non-current “papers”; generated by individuals or families

• Preserved because of ‘enduring’ value

– Not necessarily ‘permanent value’

• Both generally referred to as “collections”

Page 4: Archives, Digital Archives and Encoded Archival Description

The Archival Mission

• Identify, preserve, make available records and papers

From Gregory Hunter, Developing and Maintaining Practical Archives

Page 5: Archives, Digital Archives and Encoded Archival Description

• Nature

– Libraries: Published, discrete, make sense on own, multiple copies

– Archives: Unpublished, grouped with related items, make no sense on own

• Creator

– Libraries: Many

– Archives: One parent organization

• Method of Creation

– Libraries: Each created separately

– Archives: Organically produced as part of normal business or life

• How Received

– Libraries: Selected as items

– Archives: Appraised as groups

• How Arranged

– Libraries: By subject classification

– Archives: Provenance and original order (structure and function)

• How Described

– Libraries: By item

– Archives: In aggregate (record group, series, collection)

• Where Described

– Libraries: Built into item itself (provided title, author, CIP data), in catalog

– Archives: Prepared by archivist (e.g. supplied title) in finding aids, guides, inventories, databases

• How Accessed

– Libraries: Items circulate

– Archives: No circulation

Based on chart in Hunter, Developing. . . p. 7

Page 6: Archives, Digital Archives and Encoded Archival Description

Archival Appraisal 101

• Process of determining ‘value’

• Done over aggregates not items

• Primary: operational, legal, fiscal, administrative

• Secondary: Historical or ‘archival’ value

• Types of archival value

– Evidential: documents organization and functioning of organization

– Informational: sheds light on people, events, things aside from organization

Credit: Hunter, p. 51

Page 7: Archives, Digital Archives and Encoded Archival Description

Archival Arrangement 101

• Provenance

– Records from one creator must not be intermingled with those from another

– NOT by subject

• Original order

– Maintain records in order placed by creator

• Five “levels” of arrangement

– Repository

– Record group/subgroup (organizationally related group)

– Record series (set of files or documents maintained as a unit)

– File (folder, binder, packs for convenient use)

– Item (one document, letter, etc)

Page 8: Archives, Digital Archives and Encoded Archival Description

Levels of Arrangement: Examples

• University Archives example

– Repository: University Archives

– Record Group: College of Engineering

– Series: Dean’s Office Correspondence Files

– File Unit: Federal Aviation Administration

– Item: Letter to FAA Director, June 12, 1968

• Special Collections example

– Repository: Special Collections

– Record Group: Champaign County Republican Party

– Series: Speaker’s Committee File

– File Unit: Barry Goldwater, 1960-70

– Item: Copy of remarks by Goldwater to CCRP, August 23, 1965
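
The five levels can also be thought of as a simple tree. The sketch below is one hypothetical way to model them in Python, using the University Archives example from this slide. The `Unit` class, the `LEVELS` list and the `outline` helper are names invented for illustration and are not part of any archival standard or tool.

```python
from dataclasses import dataclass, field
from typing import List

# The five levels named on the slide, from most general to most specific.
LEVELS = ["repository", "record group", "series", "file", "item"]


@dataclass
class Unit:
    """One unit of arrangement (hypothetical class, for illustration only)."""
    level: str
    title: str
    children: List["Unit"] = field(default_factory=list)

    def add(self, child: "Unit") -> "Unit":
        # A child must sit at a more specific level than its parent.
        if LEVELS.index(child.level) <= LEVELS.index(self.level):
            raise ValueError(f"{child.level} cannot be nested under {self.level}")
        self.children.append(child)
        return child


def outline(unit: Unit, depth: int = 0) -> List[str]:
    """Return an indented outline of the hierarchy, one line per unit."""
    lines = ["  " * depth + f"{unit.level}: {unit.title}"]
    for child in unit.children:
        lines.extend(outline(child, depth + 1))
    return lines


repo = Unit("repository", "University Archives")
group = repo.add(Unit("record group", "College of Engineering"))
series = group.add(Unit("series", "Dean's Office Correspondence Files"))
folder = series.add(Unit("file", "Federal Aviation Administration"))
folder.add(Unit("item", "Letter to FAA Director, June 12, 1968"))

print("\n".join(outline(repo)))
```

Printing the outline shows each level indented under its parent, which is the shape a container listing in a finding aid usually follows.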

Page 10: Archives, Digital Archives and Encoded Archival Description

Description of Archives

• Establish administrative control over archival materials

– Locate collections

– Identify their source, creators (chain of custody)

– Outline contents

• Establish intellectual control

– General nature of repository

– General contents of collection

– Detailed information on specific collections

– Summarize information across several collections

• Important for both authentication and access

• Internal vs. Public finding aids

Page 11: Archives, Digital Archives and Encoded Archival Description

Principles of Description*

• “Multilevel Description”

– Proceed from general to specific

– Provide information relevant to the level of description

– Link each level of description to next higher unit of description

– Do not repeat information, provide it only at highest appropriate level

* Summarized from ISAD(G) General International Standard Archival Description
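
The last two principles (link each level to the next higher unit, and record information only once at the highest appropriate level) imply that a lower level "inherits" what is stated above it. A minimal Python sketch of that idea follows. The `Level` class and its `resolve` method are hypothetical names, and the access statement is a made-up sample value.

```python
from typing import Dict, Optional


class Level:
    """A level of description (hypothetical class, for illustration only)."""

    def __init__(self, name: str, parent: Optional["Level"] = None, **fields: str):
        self.name = name
        self.parent = parent  # link to the next higher unit of description
        self.fields: Dict[str, str] = dict(fields)

    def resolve(self, key: str) -> Optional[str]:
        """Look up a field here, then in each higher level in turn."""
        level: Optional[Level] = self
        while level is not None:
            if key in level.fields:
                return level.fields[key]
            level = level.parent
        return None


# The creator and access note are recorded once, at the highest level.
# The access value is invented for this example.
collection = Level("Speaker's Committee File",
                   creator="Champaign County Republican Party",
                   access="Open to researchers")
folder = Level("Barry Goldwater, 1960-70", parent=collection)
item = Level("Copy of remarks by Goldwater to CCRP, August 23, 1965",
             parent=folder, date="1965-08-23")

print(item.resolve("creator"))  # found two levels up
print(item.resolve("date"))     # found on the item itself
```

The item carries only what is specific to it (its date), while the creator is looked up from the level where it was recorded.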

Page 12: Archives, Digital Archives and Encoded Archival Description

Finding Aid

• Basic Access Tool is the “Finding Aid”, also known as ‘inventory’ or ‘register’.

– Prefatory material

– Introduction

– Biographical sketch/agency history

– Scope and content note

– Series description (organization)

– Container Listing

– Index (less used now with electronic finding aids)

Page 13: Archives, Digital Archives and Encoded Archival Description

Elements of Description

• 26 in ISAD (G) (www.ica.org/biblio/cds/isad_g_2e.pdf)

• Identity

– Reference code, title, dates, level of description

• Context

– Name of creator, biographical or admin history, source of materials

• Content/Structure

– Scope/content, appraisal information, arrangement

• Conditions of Access/Use

• Allied Materials (copies, originals, related)

• Notes

• Description Control (author of description, revisions)

Page 14: Archives, Digital Archives and Encoded Archival Description

Finding Aid Examples

• Reston Papers and Third Armored Division Assn (bring along)

• American Crystal Sugar Co.

• Thurgood Marshall Papers

Page 15: Archives, Digital Archives and Encoded Archival Description

Questions?

• Next:

– Overview of standards and tools for description of paper and electronic materials, and tools for access to electronic collections.

Page 16: Archives, Digital Archives and Encoded Archival Description

Establishing a good descriptive system

• Takes planning, awareness of resources

• Deciding on ‘platform’ or computers should be LAST step

• Better to describe all materials at high level than put all effort into one collection

• Beware tendency to do lower levels of description before higher levels

• Inventory MUST be the key

• Use a content standard

Page 17: Archives, Digital Archives and Encoded Archival Description

Describing Archives: A Content Standard

• Provides rules/advice about the quality and structure of informational content

– 8 principles

– What to put in the 26 elements recommended by ISAD (G)

– Rules for describing creators and forms of names

– Complement to AACR2

– Provides mapping to appropriate data structure standards

Page 18: Archives, Digital Archives and Encoded Archival Description

MARC21

• Advantages: Can use regular library software, provides integrated access with non-archival materials

• Disadvantages: Can undermine provenance, relationship to other materials may be lost

• Recommendation: USE MARC Cataloging as first step in PUBLIC finding aids

Page 19: Archives, Digital Archives and Encoded Archival Description

Cataloging Archival Materials

Page 20: Archives, Digital Archives and Encoded Archival Description

MARC 21 Sample

Page 21: Archives, Digital Archives and Encoded Archival Description

Typical Fields for Cataloging Archival Materials

Personal Name 100

Corporate Name 110

Title 245a,b

Inclusive Dates 245f

Physical Description (volume) 300

Arrangement/Organization 351

Biographical/Historical Note 545

Scope/content note 520

Restrictions on Access 506

Terms of Use 540

Provenance 561

Subject added entry 650s

Personal name added entry 700

Personal name as subject 600

Corporate name as subject 610

Link to finding aid or digital collection 856
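
To make the tag list concrete, the sketch below stores the slide's tags in a lookup table and prints a small record as labelled lines. It is only an illustration of how the numeric tags map to descriptive elements. It does not produce real MARC 21 output, the `describe` function is a hypothetical helper, and the sample values are invented.

```python
from typing import Dict, List

# Tag-to-label table taken from the list on this slide.
FIELD_LABELS: Dict[str, str] = {
    "100": "Personal Name",
    "110": "Corporate Name",
    "245": "Title and Inclusive Dates",
    "300": "Physical Description",
    "351": "Arrangement/Organization",
    "506": "Restrictions on Access",
    "520": "Scope/Content Note",
    "540": "Terms of Use",
    "545": "Biographical/Historical Note",
    "561": "Provenance",
    "600": "Personal Name as Subject",
    "610": "Corporate Name as Subject",
    "650": "Subject Added Entry",
    "700": "Personal Name Added Entry",
    "856": "Link to Finding Aid or Digital Collection",
}


def describe(record: Dict[str, str]) -> List[str]:
    """Return 'tag label: value' lines in ascending tag order."""
    unknown = sorted(set(record) - set(FIELD_LABELS))
    if unknown:
        raise KeyError(f"tags not in the slide's list: {unknown}")
    return [f"{tag} {FIELD_LABELS[tag]}: {record[tag]}" for tag in sorted(record)]


# Sample values are invented for illustration.
sample = {
    "245": "Speaker's Committee File, 1960-70",
    "110": "Champaign County Republican Party",
    "300": "2 cubic feet",
}
for line in describe(sample):
    print(line)
```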

Page 22: Archives, Digital Archives and Encoded Archival Description

Word-Processed Finding Aids

• Advantages: Easy to create, maintain

• Disadvantages: Not in standard format, cannot exchange with others, lack of coded fields

• Recommendation: Very useful for most institutions. Can be published to Internet via PDF

Page 23: Archives, Digital Archives and Encoded Archival Description

Encoded Archival Description (EAD)

• Data structure standard for descriptions of manuscripts or archives (finding aids)

• At any level of granularity

• Typically collection level

• SGML and XML versions of DTD

• <dao> tag for linking to archival surrogates

Page 24: Archives, Digital Archives and Encoded Archival Description

EAD

• Advantages: Best interoperability and data exchange, easier to implement with others (consortia)

• Disadvantages: Tool development still weak, steep learning curve.

• Recommendation: If you have good technical skills, and a basic archival program is in place, and resources are available, implement it

Page 25: Archives, Digital Archives and Encoded Archival Description

EAD Samples

• Static:

– http://web.library.uiuc.edu/ahx/ead/ua/1505023/1505023f.html

– http://www.amphilsoc.org/library/mole/e/edwards.htm

• Conversion on server: http://www.amphilsoc.org/library/mole/e/edwards.xml

• PDF: http://www.amphilsoc.org/library/mole/e/edwards.pdf

• In digital library software:

– http://www.umich.edu/~bhl/EAD/index.html

– http://www.oac.cdlib.org/

• Other implementations

– Cheshire: http://www.archiveshub.ac.uk/

Page 26: Archives, Digital Archives and Encoded Archival Description

EAD Structure 1

• XML: perfect way to implement principles of ‘multi-level description’

– many elements optional

– most repeatable at any level, nesting can vary

– Normalization possible, but not common for most finding aids

Page 27: Archives, Digital Archives and Encoded Archival Description

EAD Structure 2

• <eadheader> (information about EAD File)

– <eadid> unique id

– <filedesc>: <titlestmt>, <publicationstmt>, <notestmt>

– <profiledesc>: <creation>, <langusage>

– <revisiondesc>

– <frontmatter> (deprecated element, repeats info for display)

• <archdesc> (information about materials being described)

Page 28: Archives, Digital Archives and Encoded Archival Description

Common Top-Level <archdesc> Elements

• <did> (descriptive identification): <origination>, <unittitle>, <unitdate>, <physdesc>, <abstract>, <repository>, <unitid>

• <bioghist>

• <scopecontent>

• <arrangement>

• <controlaccess>

• <accessrestrict>

Other elements include <accruals>, <acqinfo>, <altformatavail>, <appraisal>, <custodhist>, <prefercite>, <processinfo>, <userestrict>, <relatedencoding>, <separatedmaterial>, <otherfindaid>, <bibliography>, <odd>

Linking elements: some based on XLink spec, suite of linking elements includes <archref>, <extref>, <daogrp>

All of the above elements are repeatable for components of the collection, at any level in the <dsc> (description of subordinate components)

Page 29: Archives, Digital Archives and Encoded Archival Description

Description of Subordinate Components

• nested components (i.e. <c> [unnumbered] or <c01>, <c02>, etc. [numbered]) represent intellectual structure of materials being described

• <container> elements (within each level) represent physical arrangement

• Maximum depth of 12 levels (not a good idea to use all of them)

• All elements available in archdesc top level also available in any component (typically not used)
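
As a rough illustration of how the header, the top-level description and the nested components fit together, the Python sketch below builds a very small EAD-style document with the standard library. It uses element names from these slides, but it has not been validated against the EAD DTD. The identifier and the guide title are made up, and `text_el` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def text_el(parent: ET.Element, tag: str, text: str) -> ET.Element:
    """Create a child element holding only text (hypothetical helper)."""
    el = ET.SubElement(parent, tag)
    el.text = text
    return el


ead = ET.Element("ead")

# <eadheader>: information about the EAD file itself.
header = ET.SubElement(ead, "eadheader")
text_el(header, "eadid", "example-0001")  # made-up identifier
filedesc = ET.SubElement(header, "filedesc")
titlestmt = ET.SubElement(filedesc, "titlestmt")
text_el(titlestmt, "titleproper", "Guide to the Speaker's Committee File")

# <archdesc>: information about the materials being described.
archdesc = ET.SubElement(ead, "archdesc", level="series")
did = ET.SubElement(archdesc, "did")
text_el(did, "unittitle", "Speaker's Committee File")
text_el(did, "origination", "Champaign County Republican Party")

# <dsc>: nested, numbered components carry the intellectual structure;
# <container> records the physical location.
dsc = ET.SubElement(archdesc, "dsc")
c01 = ET.SubElement(dsc, "c01", level="file")
c01_did = ET.SubElement(c01, "did")
text_el(c01_did, "container", "1").set("type", "box")
text_el(c01_did, "unittitle", "Barry Goldwater, 1960-70")
c02 = ET.SubElement(c01, "c02", level="item")
c02_did = ET.SubElement(c02, "did")
text_el(c02_did, "unittitle", "Copy of remarks by Goldwater to CCRP, August 23, 1965")

xml_text = ET.tostring(ead, encoding="unicode")
print(xml_text)
```

Note how the <c02> sits inside the <c01>: the nesting mirrors the file/item relationship, while the box number is recorded separately in <container>.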

Page 30: Archives, Digital Archives and Encoded Archival Description

A “raw” EAD File

• http://web.library.uiuc.edu/ahx/ead/xml/2620016.xml

Page 31: Archives, Digital Archives and Encoded Archival Description

EAD Tools: Creation

• Current options

– Text editors (cheap, no built in validation, transformation or unicode support)

• Notetab

• Word Processors

– XML editors (graphical view, built in validation, transformation, unicode support, FOP; tend to be buggy)

• XML Spy

• oXygen

• XMetal (not recommended)

– EAD Cookbook highly recommended, templates for Notetab, oXygen

Page 32: Archives, Digital Archives and Encoded Archival Description
Page 33: Archives, Digital Archives and Encoded Archival Description
Page 34: Archives, Digital Archives and Encoded Archival Description

EAD Tools: Display

• Most common to transform to HTML

– Static via xsl stylesheet on command line or in authoring software, then upload files to server

– Client-side via link to css or xsl (dicey)

– Server side transform engine (saxon, msxml, xalan, etc) via servlets

• Dynamic (searchable)

– dlxs findaid class
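
The "static" option amounts to reading the XML once and writing out an HTML page. The sketch below shows that idea in plain Python as a stand-in for an XSL stylesheet, because the Python standard library does not include an XSLT processor. The EAD fragment is a made-up, heavily trimmed sample, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# A tiny EAD-like fragment, made up for this example.
EAD_XML = """
<ead>
  <archdesc level="series">
    <did><unittitle>Speaker's Committee File</unittitle></did>
    <dsc>
      <c01 level="file">
        <did><unittitle>Barry Goldwater, 1960-70</unittitle></did>
        <c02 level="item">
          <did><unittitle>Copy of remarks by Goldwater to CCRP, August 23, 1965</unittitle></did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def is_component(el: ET.Element) -> bool:
    """True for <c> and for numbered components such as <c01> or <c12>."""
    tag = el.tag
    return tag == "c" or (len(tag) == 3 and tag[0] == "c" and tag[1:].isdigit())


def components_to_html(parent: ET.Element) -> str:
    """Render nested components as nested HTML lists."""
    items = []
    for comp in parent:
        if not is_component(comp):
            continue
        title = comp.findtext("did/unittitle", default="(untitled)")
        items.append(f"<li>{escape(title)}{components_to_html(comp)}</li>")
    return f"<ul>{''.join(items)}</ul>" if items else ""


def finding_aid_to_html(xml_text: str) -> str:
    """Turn the sample finding aid into one static HTML page."""
    root = ET.fromstring(xml_text)
    archdesc = root.find("archdesc")
    title = archdesc.findtext("did/unittitle", default="Finding aid")
    body = components_to_html(archdesc.find("dsc"))
    return f"<html><body><h1>{escape(title)}</h1>{body}</body></html>"


html_page = finding_aid_to_html(EAD_XML)
print(html_page)
```

A real stylesheet would handle many more elements, but the recursion over components is the same pattern an XSL template for <c01>, <c02> and so on would follow.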

Page 35: Archives, Digital Archives and Encoded Archival Description

XML Transformations

[Diagram: XML source passed to an XSL parser; stylesheets XSLT1 to XSLT4 produce HTML1 to HTML4, and XSL-FO produces PDF]

Page 36: Archives, Digital Archives and Encoded Archival Description

Typical XSL file

Page 37: Archives, Digital Archives and Encoded Archival Description

Collection Management Tools

• Advantages: Software tailored for Archives, easy data entry

• Disadvantages: Few options currently exist. May be difficult to ‘migrate’ forward at a future point. Also not automatically online

Page 38: Archives, Digital Archives and Encoded Archival Description

“CMT” Examples

• Past Perfect http://www.museumsoftware.com/

• Archivists’ Toolkit http://www.archiviststoolkit.org/

• UIUC “Archival Information System”

Page 39: Archives, Digital Archives and Encoded Archival Description

AIS Demo

• www.chrisprom.com/ais/admin

• Login: guest

• Password: guest

Page 40: Archives, Digital Archives and Encoded Archival Description

Break for Questions

• Next: Digital Archives Standards and Tools

Page 41: Archives, Digital Archives and Encoded Archival Description

Digital Libraries or Archives?

• Nature

– Libraries: Published items, each item discrete, make sense on own, multiple copies

– Archives: Unpublished, grouped with related items, make no sense on own

• Creator

– Libraries: Many different

– Archives: One parent organization

• Method of Creation

– Libraries: Each created separately

– Archives: Organically produced as part of normal business or life

• How Received

– Libraries: Selected as items

– Archives: Appraised as groups

• How Arranged

– Libraries: By subject classification

– Archives: Provenance and original order (structure and function)

• How Described

– Libraries: By item

– Archives: In aggregate (record group, series, collection)

• Where Described

– Libraries: Built into item itself (provided title, author, CIP data), in catalog

– Archives: Prepared by archivist (e.g. supplied title) in finding aids, guides, inventories, databases

• How Accessed

– Libraries: Items circulate

– Archives: No circulation

Page 42: Archives, Digital Archives and Encoded Archival Description

The “on a horse” problem

• Best systems mix archival and library approaches

• Complete item description AND

• Full context AND

• Link to complete collection (including description of off line items)

Page 43: Archives, Digital Archives and Encoded Archival Description

Sample of Digital Library/Archive Projects

• http://memory.loc.gov/ammem/index.html

• http://www.oac.cdlib.org/

• http://www.ohiomemory.org/index.html

• http://www.library.yale.edu/mssa/

• http://www.marquette.edu/library/MUDC/

• http://www.library.uiuc.edu/archives/coll/dl/bot/bot.html

Page 44: Archives, Digital Archives and Encoded Archival Description

Digital Library/Archive Standards

• Background on Metadata

• For images: Dublin Core

• For texts: TEI

• For information exchange: METS, OAI

• For Digital Preservation: OAIS Reference Model

Page 45: Archives, Digital Archives and Encoded Archival Description

Archivists and Metadata

• Structured data about an information resource

• Metadata by itself doesn’t “do” anything.

• Metadata schemas provide “buckets” for information about resources.

• Metadata needs to be interpreted by a system or user.

• Metadata provides context to help machines (and more importantly people) interpret content

• People usually talk about applying metadata to digital materials, but. . . . . .

Page 46: Archives, Digital Archives and Encoded Archival Description
Page 47: Archives, Digital Archives and Encoded Archival Description

This is Metadata

These are metadata fields

Page 48: Archives, Digital Archives and Encoded Archival Description

same thing electronically

Metadata Fields

The metadata itself

Page 49: Archives, Digital Archives and Encoded Archival Description

Now as xml “metadata”

Descriptive and administrative

Page 50: Archives, Digital Archives and Encoded Archival Description

This is Not Metadata

This is!

Page 51: Archives, Digital Archives and Encoded Archival Description

Metadata is about context and relationships

This is metadata, but. . .

Incomplete

Embedded in object

Not self-explaining

Page 52: Archives, Digital Archives and Encoded Archival Description

More complete

Not embedded

Relational

Not self-explaining

Page 53: Archives, Digital Archives and Encoded Archival Description

Metadata and Code and human user

beginning to do something with metadata

But. . .

Not self-explaining

Can’t be exchanged

Page 54: Archives, Digital Archives and Encoded Archival Description

Non-embedded

Self-explaining

But relationships lost

now as xml metadata

Page 55: Archives, Digital Archives and Encoded Archival Description

Dublin Core

• Developed in 1995 for authors to describe own web resources

• Very simple, only 15 broad categories in the “simple” version

• Advantages: commonly held set of elements is easy to understand, built into many current tools

• Disadvantages: loss of specificity

Page 56: Archives, Digital Archives and Encoded Archival Description

The 15 elements:

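• Content

– Coverage

– Description

– Title

– Type

– Relation

– Source

– Subject

– Audience (a later addition to the Dublin Core terms, not one of the 15 “simple” elements)

• Intellectual Property

– Contributor

– Creator

– Publisher

– Rights

• Instantiation

– Date

– Format

– Identifier

– Language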
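
To show what a simple Dublin Core description looks like in practice, the sketch below builds one record in the `oai_dc` XML container commonly used for exchange. The two namespace URIs are the published ones. The `dc_record` function is a hypothetical helper, and the sample values are illustrative only.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# The 15 elements of simple Dublin Core.
SIMPLE_DC = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def dc_record(fields: Dict[str, List[str]]) -> str:
    """Serialize a dict of element name -> values as an oai_dc record."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        if name not in SIMPLE_DC:
            raise ValueError(f"{name!r} is not a simple Dublin Core element")
        for value in values:  # every element is optional and repeatable
            el = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            el.text = value
    return ET.tostring(root, encoding="unicode")


# Sample values are illustrative only.
record_xml = dc_record({
    "title": ["Copy of remarks by Goldwater to CCRP"],
    "date": ["1965-08-23"],
    "type": ["Text"],
    "subject": ["Republican Party", "Speeches"],
})
print(record_xml)
```

Because every element is optional and repeatable, the record can be as sparse or as full as the available description allows, which is both the appeal and the "loss of specificity" noted on the previous slide.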

Page 57: Archives, Digital Archives and Encoded Archival Description
Page 58: Archives, Digital Archives and Encoded Archival Description

Dublin Core Resources

• http://dublincore.org/

• http://www.ukoln.ac.uk/metadata/dcdot/

Page 59: Archives, Digital Archives and Encoded Archival Description

Text Encoding Initiative

• Encode any text with structural markup, deep semantic markup, or any combination of the two

• Section for metadata in <teiHeader>

• http://www.tei-c.org/

• Typically need xml editor to create, software such as DLXS to display

• http://media.library.uiuc.edu/projects/bot/xml/index.htm

Page 60: Archives, Digital Archives and Encoded Archival Description

OAIS Reference Model

• Based on Archival Principles

• Three parties involved with digital information

– Producers; SIP: Submission Information Package

– Managers; AIP: Archival Information Package

– Consumers (Users); DIP: Dissemination Information Package

• http://www.library.cornell.edu/iris/tutorial/dpm/foundation/oais/index.html

Page 61: Archives, Digital Archives and Encoded Archival Description

“Simple” OAIS Model

Page 62: Archives, Digital Archives and Encoded Archival Description

METS

• Metadata Encoding and Transmission Standard

• Standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library

• Outgrowth of the Making of America II project

• Provides metadata for compound text and image-based works

• Need purpose-built software to display and navigate.

Page 63: Archives, Digital Archives and Encoded Archival Description

METS: Why bother?

• Based on the OAIS Reference Model. It includes support for:

– Submission Information Package

– Archival Information Package

– Dissemination Information Package

• Not only for transfer and archival management, but for giving access to, navigating an object

• It “plays well” with other systems (EAD, MARC, TEI, VRA etc)

• Software will be coming (support in Archivists’ Toolkit, NDIIPP projects)

• BUT. . . . It is currently very complex.

Page 64: Archives, Digital Archives and Encoded Archival Description

OAI-PMH

• Open Archives Initiative Protocol for Metadata Harvesting

• Not cross-database searching

• metadata harvesting

• Data Providers (expose collections in a common syntax)

• Service Providers (use metadata harvested via the OAI-PMH as a basis for building value-added services)
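
A harvest is just an HTTP request whose query string names a protocol verb. The sketch below builds such a request URL for the `ListRecords` verb with simple Dublin Core as the metadata format. The verb and argument names come from the OAI-PMH protocol. The base URL is hypothetical, `list_records_url` is an invented helper, and no network request is made.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None,
                     from_date: Optional[str] = None) -> str:
    """Build a ListRecords request URL for a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec    # restrict the harvest to one set
    if from_date:
        params["from"] = from_date  # only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical data provider address, used only for illustration.
BASE_URL = "https://archives.example.org/oai"

url = list_records_url(BASE_URL, from_date="2006-01-01")
print(url)
# https://archives.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2006-01-01
```

A service provider would fetch this URL on a schedule, store the returned records, and build its search interface on the stored copies, which is why harvesting is not the same as cross-database searching.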

Page 65: Archives, Digital Archives and Encoded Archival Description
Page 66: Archives, Digital Archives and Encoded Archival Description

OAI Example

• OAIster: http://oaister.umdl.umich.edu/o/oaister/

Page 67: Archives, Digital Archives and Encoded Archival Description

Tools for Digital Library/Archive Projects

• CONTENTdm http://www.dimema.com/

– Very good, support for Dublin Core, OAI

– Con: expensive

– Recommendation: Skip it

• Greenstone http://www.greenstone.org/cgi-bin/library

– Pros: Free, (relatively) easy to configure, low hardware requirements, can run on internet or publish to CD, supported by UNESCO, targeted at developing nations

– Con: tends to be ‘item-centric’, difficult to aggregate materials

– Recommendation: Use it, but as part of large descriptive system

Page 68: Archives, Digital Archives and Encoded Archival Description

Thanks!!!!

• This PowerPoint is online at:

– http://web.library.uiuc.edu/ahx/workpap