Post on 04-Jan-2016

Linked Digital Archive: Institutional Repository

Rathachai Chawuthai, CSIM/SET/AIT

ROADMAP

[Roadmap diagram. Labels: Digital Archive; Institutional Repository; OAIS; PREMIS; numbered steps 1, 2, 3 (Fedora); Challenges; Share; Reuse; Link; My Solution; Related Works]

Agenda

• Institutional Repository & Digital Preservation
• PREMIS
• OAIS

Institutional Repositories and Digital Preservation: Assessing Current Practices at Research Libraries

Yuan Li, Syracuse University, yli115@syr.edu

Meghan Banach, University of Massachusetts Amherst, mbanach@library.umass.edu

• Archive
  – A collection of historical records, or the physical place where they are located.
  – Contains primary source documents that have accumulated over the course of an individual's or organization's lifetime, and that are kept to show the function of that organization.
• Digital Archive
  – An archive in digital form, which needs digital preservation of:
    • Digital media
    • The environment needed to render it

Digital Archive

Institutional Repository

• An Institutional Repository is an online locus for collecting, preserving, and disseminating - in digital form - the intellectual output of an institution, particularly a research institution.

• For a university, this would include materials such as research journal articles, before (preprints) and after (postprints) undergoing peer review, and digital versions of theses and dissertations, but it might also include other digital assets generated by normal academic life, such as administrative documents, course notes, or learning objects.

• The four main objectives for having an institutional repository are:
  – to provide open access to institutional research output by self-archiving it;
  – to create global visibility for an institution's scholarly research;
  – to collect content in a single location;
  – to store and preserve other institutional digital assets, including unpublished or otherwise easily lost ("grey") literature (e.g., theses or technical reports).

Age of Information

• Printed Age
  – Paper is a durable format
  – It lasts when stored under proper conditions
• Digital Age
  – Information is fragile
    • Technological obsolescence
    • Deterioration of media

Preservation

Capacity vs. Age

[Chart residue: labels "1000 Years" and "15 Years"]

Media

Do you have these?

Born-Digital

Preserve?

Need

• Community
  – Manage and disseminate digital material created by the institution and its community members
• Preservation
  – Loooooooooooooooooooooong-term

Long-term?

[Diagram: short-term, medium-term, and long-term preservation set against period of time and technology change, out to infinity]

I think that …

[Diagram of preservation concerns. Labels: Read; Render; Environment; File Format; Access; Store; Media; Right & Agreement; Security; Search; Migration; Software; Hardware; Correctness]

How?

• Using METADATA to describe preservation information

[The "I think that …" diagram of preservation concerns is repeated here, with the added label "M"]

Next challenge

• Fit METADATA with digital archive requirement

E.g.

• Implement a Digital Archive in the Institutional Repository
• Take care of Rights & Agreements
• Have guidance on digital format preservation
• Have content policies, enforced by monitoring user activities or by peer review
• Plan for long-term digital preservation
• Solve the lack of preservation funding

Summary

PREMIS: PREservation Metadata: Implementation Strategies

Overview

• PREservation Metadata: Implementation Strategies
• Sponsored by the Library of Congress (LOC)
• People usually refer to "PREMIS" as the "Data Dictionary"
• Represented in XML format
• Metadata that supports activities intended to ensure the long-term usability of a digital resource
• Example
  – Stored securely: nobody has changed it. This needs metadata about the checksum.
  – Can be read: DVD, CD, floppy disk?, 5" disk?, …
  – Can be rendered: Word 6?, Word 2003, Word 2010, … This needs metadata about the application and its version.
  – Looks like the original: support changes of source, and record how to render it.
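The "stored securely" example above can be sketched as a fixity check. This is only a minimal illustration using Python's standard hashlib; the function names and the shape of the `recorded` dictionary are assumptions made for this sketch, not part of PREMIS.

```python
import hashlib
from pathlib import Path


def compute_checksum(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, recorded: dict) -> bool:
    """Compare the file's current digest with the digest kept as metadata.

    `recorded` is a hypothetical metadata record of the form
    {"algorithm": "sha256", "digest": "<hex string>"}.
    """
    current = compute_checksum(path, recorded["algorithm"])
    return current == recorded["digest"]
```

The digest and the algorithm name are stored at ingest time; running `verify_fixity` later shows whether anybody has changed the file.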

Need

• Include information that supports and documents the digital preservation process
• Include information that supports, over time, the object's:
  – Viability (survival)
  – Renderability (readable)
  – Understandability
  – Authenticity (correct)

Core Element

• Common data model for organizing and thinking about preservation metadata
• Guidance for local implementations
• Standard for exchanging information packages between repositories

Data Dictionary

• Pieces of information or knowledge related to PREMIS entities that digital repository systems need to know and should be able to export to other systems

• More structured than ordinary METADATA

Semantic Unit

Entities

Entity: Intellectual Entities

• May be called "Bibliographic Entities"
• A set of content that is considered a single intellectual unit for purposes of management and description
• E.g. book, map, photograph, or database

Entity: Object Entities

• What is actually stored and managed in the preservation repository
• E.g.
  – Intellectual Entity: "Thailand Map"
  – Object Entity: image file
• 3 kinds of object
  – File
    • A computer file, such as a PDF or JPEG
  – Representation
    • A set of files that work together
    • E.g. a web page, including HTML, images, CSS, and JavaScript
  – Bitstream
    • A part of a file
    • E.g. a single frame image in a video file

Entity: Object Entities

• Data Dictionary includes:
  – a unique identifier for the object (type and value),
  – fixity information such as a checksum (message digest) and the algorithm used to derive it,
  – the size of the object,
  – the format of the object, which can be specified directly or by linking to a format registry,
  – the original name of the object,
  – information about its creation,
  – information about inhibitors,
  – information about its significant properties,
  – information about its environment (e.g. OS: MacOS, browser: Safari),
  – where and on what medium it is stored,
  – digital signature information,
  – relationships with other objects and other types of entities.
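Since PREMIS records are represented in XML, a few of the Object semantic units listed above can be sketched with Python's standard xml.etree.ElementTree. The element names follow the PREMIS semantic unit names as I understand them; the namespace and many units are left out, so the output is a simplified illustration and is not validated against the PREMIS schema. The identifier, digest, and file name are hypothetical values.

```python
import xml.etree.ElementTree as ET


def build_object_record(identifier: str, digest: str, size: int,
                        format_name: str, original_name: str) -> ET.Element:
    """Build a simplified, PREMIS-like description of one file object."""
    obj = ET.Element("object")

    # Unique identifier: a type plus a value.
    ident = ET.SubElement(obj, "objectIdentifier")
    ET.SubElement(ident, "objectIdentifierType").text = "local"
    ET.SubElement(ident, "objectIdentifierValue").text = identifier

    # Fixity, size, and format are grouped as object characteristics.
    chars = ET.SubElement(obj, "objectCharacteristics")
    fixity = ET.SubElement(chars, "fixity")
    ET.SubElement(fixity, "messageDigestAlgorithm").text = "SHA-256"
    ET.SubElement(fixity, "messageDigest").text = digest
    ET.SubElement(chars, "size").text = str(size)
    fmt = ET.SubElement(chars, "format")
    designation = ET.SubElement(fmt, "formatDesignation")
    ET.SubElement(designation, "formatName").text = format_name

    ET.SubElement(obj, "originalName").text = original_name
    return obj


# Hypothetical example values for an image file of the "Thailand Map".
record = build_object_record("obj-0001", "0" * 64, 20480,
                             "image/jpeg", "thailand_map.jpg")
xml_text = ET.tostring(record, encoding="unicode")
```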

Entity: Events

• An action that affects an object in the repository
  – E.g. changing an object
• Data Dictionary includes:
  – a unique identifier for the event (type and value),
  – the type of event (creation, ingestion, migration, etc.),
  – the date and time the event occurred,
  – a detailed description of the event,
  – a coded outcome of the event (result of the event: success | fail | …),
  – a more detailed description of the outcome,
  – agents involved in the event and their roles,
  – objects involved in the event and their roles.
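The Event fields listed above can be pictured as a small record. The sketch below is an assumption-laden illustration: the class, the field names, and the "fixity check" helper are hypothetical and only mirror the list, they are not the PREMIS schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    identifier: str              # unique identifier for the event
    event_type: str              # e.g. creation, ingestion, migration
    date_time: datetime          # when the event occurred
    detail: str                  # detailed description of the event
    outcome: str                 # coded outcome: "success" or "fail"
    outcome_detail: str = ""     # more detailed description of the outcome
    linked_objects: list = field(default_factory=list)


def record_fixity_check(object_id: str, matched: bool, sequence: int) -> Event:
    """Create the event that documents one checksum comparison."""
    return Event(
        identifier=f"event-{sequence:04d}",
        event_type="fixity check",
        date_time=datetime.now(timezone.utc),
        detail="Recomputed the checksum and compared it with the stored value",
        outcome="success" if matched else "fail",
        outcome_detail="" if matched else "Digest mismatch",
        linked_objects=[object_id],
    )
```

Each action on an object adds one such record, so the list of events becomes the object's history.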

Entity: Agents

• An actor, e.g. a person, organization, or software
• Related metadata standards, e.g. FOAF, vCard, eduPerson, …
• Data Dictionary includes:
  – a unique identifier for the agent (type and value),
  – the agent's name,
  – designation of the type of agent (person, organization, software).
• Note: an agent can have many roles, depending on the Event or Rights entities it is linked to
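The note about roles can be sketched by keeping the role on the link between an agent and an event or rights statement, rather than on the agent itself. All class names, identifiers, and role strings below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    identifier: str
    name: str
    agent_type: str   # "person", "organization", or "software"


@dataclass(frozen=True)
class AgentLink:
    agent: Agent
    target_id: str    # identifier of an event or a rights statement
    role: str         # role the agent plays for that target


def roles_of(agent: Agent, links: list) -> set:
    """Collect every role an agent plays across all linked entities."""
    return {link.role for link in links if link.agent == agent}


archivist = Agent("agent-01", "Example Archivist", "person")
checker = Agent("agent-02", "Example Checksum Tool", "software")
links = [
    AgentLink(archivist, "event-0001", "implementer"),
    AgentLink(checker, "event-0001", "executing program"),
    AgentLink(archivist, "rights-0001", "rights holder"),
]
```

Here the same person is an implementer for one event and a rights holder for one rights statement, which is why the role is not a field of `Agent`.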

Entity: Rights

• Information about rights and permissions that are directly relevant to preserving objects in the repository
• Assertion of rights
  – Provides actionable information to the preservation repository system
• Data Dictionary includes:
  – a unique identifier for the rights statement (type and value),
  – whether the basis for claiming the right is copyright, license or statute,
  – more detailed information about the copyright status, license terms, or statute, as applicable,
  – the action(s) that the rights statement allows,
  – any restrictions on the action(s),
  – the term of grant, or time period in which the statement applies,
  – the object(s) to which the statement applies,
  – agents involved in the rights statement and their roles.
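"Actionable" rights information means a repository system can decide from the record alone whether an action is allowed. A minimal sketch follows, assuming a hypothetical RightsStatement class whose fields mirror the list above; restrictions are left out to keep it short.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RightsStatement:
    identifier: str
    basis: str                   # "copyright", "license", or "statute"
    allowed_actions: set         # e.g. {"replicate", "migrate"}
    object_ids: set              # objects the statement applies to
    start: date                  # start of the term of grant
    end: Optional[date] = None   # None means the grant is open-ended


def is_permitted(stmt: RightsStatement, object_id: str,
                 action: str, on: date) -> bool:
    """Check object, action, and term of grant against one statement."""
    if object_id not in stmt.object_ids:
        return False
    if action not in stmt.allowed_actions:
        return False
    if on < stmt.start:
        return False
    return stmt.end is None or on <= stmt.end


license_stmt = RightsStatement(
    identifier="rights-0001",
    basis="license",
    allowed_actions={"replicate", "migrate"},
    object_ids={"obj-0001"},
    start=date(2010, 1, 1),
    end=date(2020, 12, 31),
)
```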

Example

Example

OAIS: Open Archival Information System

• In 2000 the Research Libraries Group (RLG) and the Online Computer Library Center (OCLC) discussed how the two organizations could build an infrastructure for archiving digital objects.

Overview

• Purpose
  – Model a system for archival information, represented in digital format, for long-term preservation
• Scope
  – Framework for long-term preservation and access
  – Terminology
• Architectures and operation
• Preservation strategies and techniques
• Data model

Overview

• Primary functions
  – To preserve information over an extended period of time
  – To provide user access to the information in archives

Overview

High Level Concept

• Producer: person(s), or client systems, who provide the information to be preserved

• Management: person(s) who set the overall policy of the OAIS. Management is separate from administrative functions

• Consumer: person(s), or client systems, who interact with the OAIS system and services

Model: Archive External Data Workflow

Package Model

[Diagram: Package 1, made up of Content Information and PDI (Preservation Description Information), with Archive Packaging Information and Descriptive Information about Package 1]

• Content Information:
  – The original information targeted for preservation.
  – A physical/digital object and its Representation Information.

Package Model

• Preservation Description Information (PDI):
  – What is needed to preserve the Content Information
• Provenance
  – For reliability
  – Source of content
  – Histories
• Context
  – Environment to render
• Reference
  – Refers to things outside, e.g. ISBN
• Fixity
  – Checksum, MD5, …

Package Model

• Descriptive Information:
  – Information used to discover which package has the Content Information of interest
  – The full set of attributes that are searchable in a catalog service
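The package model described on these slides can be pictured as a nested data structure. The sketch below is an illustration under assumptions: the class and field names are hypothetical, and OAIS itself is a reference model that does not prescribe any code.

```python
from dataclasses import dataclass, field


@dataclass
class PDI:
    """Preservation Description Information, split into its four parts."""
    provenance: list = field(default_factory=list)   # source and history
    context: str = ""                                # environment to render
    reference: str = ""                              # outside identifier, e.g. ISBN
    fixity: dict = field(default_factory=dict)       # e.g. {"MD5": "<digest>"}

    def missing_parts(self) -> list:
        """Name the PDI parts that are still empty."""
        parts = {
            "provenance": self.provenance,
            "context": self.context,
            "reference": self.reference,
            "fixity": self.fixity,
        }
        return [name for name, value in parts.items() if not value]


@dataclass
class InformationPackage:
    content: bytes             # the object targeted for preservation
    representation_info: str   # what is needed to interpret the content
    pdi: PDI
    descriptive_info: dict = field(default_factory=dict)  # searchable attributes


# Hypothetical package whose PDI is only partly filled in.
package = InformationPackage(
    content=b"(hypothetical file bytes)",
    representation_info="PDF 1.4",
    pdi=PDI(provenance=["Deposited by the author"], reference="local-0001"),
    descriptive_info={"title": "Thailand Map", "type": "map"},
)
```

Calling `package.pdi.missing_parts()` lists which of the four PDI parts still have to be supplied before the package can be considered fully described.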

3 Package Types

• Submission Information Package (SIP)
  – Is sent to an OAIS by a Producer
  – Has some Content Information
  – Has some associated PDI
• Archival Information Package (AIP)
  – Is stored in Archival Storage
  – Is transformed from one or more SIPs
  – Has a complete set of PDI for the associated Content Information
  – Conforms to the OAIS internal standard
  – Is managed by the OAIS
• Dissemination Information Package (DIP)
  – Is presented to a Consumer
  – May not have complete PDI
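The relationship between the three package types can be sketched as two transformations: one or more SIPs are merged into an AIP, which must carry complete PDI, and a DIP is derived from an AIP with whatever subset the Consumer asked for. The `Package` class, the function names, and the way PDI is stored as a plain dictionary are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field

PDI_PARTS = ("provenance", "context", "reference", "fixity")


@dataclass
class Package:
    kind: str                                  # "SIP", "AIP", or "DIP"
    content: dict                              # file name -> bytes
    pdi: dict = field(default_factory=dict)    # PDI part -> value


def build_aip(sips: list, extra_pdi: dict) -> Package:
    """Merge one or more SIPs into an AIP; refuse if the PDI is incomplete."""
    content, pdi = {}, {}
    for sip in sips:
        if sip.kind != "SIP":
            raise ValueError("only SIPs can be ingested")
        content.update(sip.content)
        pdi.update(sip.pdi)
    pdi.update(extra_pdi)                      # PDI added by the archive itself
    missing = [part for part in PDI_PARTS if not pdi.get(part)]
    if missing:
        raise ValueError(f"AIP needs complete PDI; missing: {missing}")
    return Package("AIP", content, pdi)


def build_dip(aip: Package, wanted: list, pdi_parts: tuple = ()) -> Package:
    """Derive a DIP holding only the requested files and PDI parts."""
    return Package(
        "DIP",
        {name: aip.content[name] for name in wanted},
        {part: aip.pdi[part] for part in pdi_parts},
    )
```

A SIP that arrives with only provenance is rejected until the archive supplies context, reference, and fixity, while `build_dip` may return a package with no PDI at all.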

Functional Models

1) Ingest

• Accept SIPs from Producers
  – or from internal elements under Administration control
• Prepare the AIPs for archival storage

2) Archival Storage

• Storage of AIPs
• Maintenance of AIPs
• Retrieval of AIPs

3) Data Management

• Populate
  – Descriptive Information
  – Administrative Data
• Maintain
  – Descriptive Information
  – Administrative Data
• Access
  – Descriptive Information
  – Administrative Data

4) Administration

• Solicit and negotiate submission agreements
  – With Producers
• Audit submissions
  – To ensure that they meet standards
• Maintain configuration management of
  – System hardware
  – Software
• Day-to-day governance of the other OAIS functional entities

5) Preservation Planning

• Monitor the environment of the OAIS
• Provide recommendations
  – Still accessible?
  – Long-term?
  – What if the original computing environment becomes obsolete?

6) Access

• Determine, for information in the OAIS, its
  – Existence
  – Description
  – Location
  – Availability
• Allow Consumers to
  – Request
  – Retrieve
  information products
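The Access function can be sketched as two lookups against a catalog of descriptive information. The catalog layout (a dictionary keyed by package identifier) and every field name in it are hypothetical, chosen only to mirror the existence, description, location, and availability bullets.

```python
def find_packages(catalog: dict, **criteria) -> list:
    """Return ids of packages whose descriptive information matches all criteria."""
    return sorted(
        package_id
        for package_id, info in catalog.items()
        if all(info.get(key) == value for key, value in criteria.items())
    )


def describe(catalog: dict, package_id: str) -> dict:
    """Report existence, description, location, and availability of a package."""
    info = catalog.get(package_id)
    if info is None:
        return {"exists": False}
    return {
        "exists": True,
        "description": info.get("title", ""),
        "location": info.get("location"),
        "available": info.get("available", False),
    }


# Hypothetical catalog entries maintained by Data Management.
catalog = {
    "aip-0001": {"title": "Thailand Map", "type": "map",
                 "location": "storage-a", "available": True},
    "aip-0002": {"title": "Annual Report", "type": "report",
                 "location": "storage-b", "available": False},
}
```

A Consumer would first call `find_packages(catalog, type="map")` to discover candidates, then `describe` to learn whether a package can be retrieved.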

?
