Linked Digital Archive
Institutional Repository
Rathachai Chawuthai CSIM/SET/AIT
ROADMAP
Diagram labels: OAIS; PREMIS; Digital Archive; Institutional Repository; 1; 2; 3; Fedora; Challenges; Share; Reuse; Link; My Solution; Related Works
Agenda
• Institutional Repository & Digital Preservation
• PREMIS
• OAIS
Institutional Repositories and Digital Preservation: Assessing Current Practices at Research Libraries
Yuan Li, Syracuse University
Meghan Banach, University of Massachusetts Amherst
• Archive
– A collection of historical records, or the physical place where they are located.
– Contains primary source documents that have accumulated over the course of an individual's or organization's lifetime, and are kept to show the function of that individual or organization.
• Digital Archive
– An archive in digital form, which requires digital preservation of:
• Digital media
• The environment needed to render it
Digital Archive
Institutional Repository
• An Institutional Repository is an online locus for collecting, preserving, and disseminating - in digital form - the intellectual output of an institution, particularly a research institution.
• For a university, this would include materials such as research journal articles, before (preprints) and after (postprints) undergoing peer review, and digital versions of theses and dissertations, but it might also include other digital assets generated by normal academic life, such as administrative documents, course notes, or learning objects.
• The four main objectives for having an institutional repository are:
– to provide open access to institutional research output by self-archiving it;
– to create global visibility for an institution's scholarly research;
– to collect content in a single location;
– to store and preserve other institutional digital assets, including unpublished or otherwise easily lost ("grey") literature (e.g., theses or technical reports).
Age of Information
• Printed Age
– Paper is a durable format
– Stored under proper conditions
• Digital Age
– Information is fragile
• Technological obsolescence
• Deterioration of media
Preservation
Capacity vs. Age
1000 Years
15 Years
Media
Do you have these?
Born-Digital
Preserve?
Need
• Community
– Manage and disseminate digital material created by the institution and its community members
• Preservation
– Loooooooooooooooooooooong-term
Long-term?
Diagram labels: Period; Time; Technology Change; Infinity; Short-term; Medium-term; Long-term
I think that …
Diagram labels: Read; Render; Environment; File Format; Access; Store; Media; Right & Agreement; Security; Search; Migration; Software; Hardware; Correctness
How?
• Using METADATA to describe preservation information
Next challenge
• Fit METADATA to digital archive requirements
E.g.
• Implement a Digital Archive in the Institutional Repository
• Address concerns about Rights & Agreements
• Have guidance on digital format preservation
• Have content policies, enforced by monitoring user activities or by peer review
• Plan for long-term digital preservation
• Solve the issue of a lack of preservation funding
Summary
PREMIS
PREservation Metadata: Implementation Strategies
Overview
• PREservation Metadata: Implementation Strategies
• Sponsored by the Library of Congress (LOC)
• People usually refer to “PREMIS” as the “Data Dictionary”
• Represented in XML format
• Metadata that supports activities intended to ensure the long-term usability of a digital resource
• Example
– Stored securely ……… nobody can change it
……… This needs metadata about the checksum
– Can be read ……… DVD, CD, floppy disk?, 5” disk?, …
– Can be rendered ……… Word 6?, Word 2003, Word 2010, …
……… This needs metadata about the application and its version
– Looks like the original ……… support changing source, how to render
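The "nobody change" requirement above can be illustrated with a small fixity check. This is only a sketch: the function names `compute_fixity` and `verify_fixity`, the record layout, and the sample bytes are assumptions made for illustration and are not part of PREMIS.

```python
import hashlib


def compute_fixity(data: bytes, algorithm: str = "sha256") -> dict:
    """Return a small fixity record: the algorithm name and the hex digest."""
    digest = hashlib.new(algorithm, data).hexdigest()
    return {"algorithm": algorithm, "digest": digest}


def verify_fixity(data: bytes, record: dict) -> bool:
    """Recompute the digest with the recorded algorithm and compare it."""
    recomputed = compute_fixity(data, record["algorithm"])
    return recomputed["digest"] == record["digest"]


# Hypothetical content standing in for a stored file.
original = b"Thailand Map, scanned image bytes"
record = compute_fixity(original)

print(verify_fixity(original, record))         # content unchanged
print(verify_fixity(original + b"!", record))  # content was altered
```

Storing the algorithm name next to the digest matters because a repository may switch algorithms over the years, and old records must stay verifiable.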
Need
• Include information that supports and documents the digital preservation process
• Include information that supports the following properties of the object over time:
– Viability (survival)
– Renderability (readable)
– Understandability
– Authenticity (correct)
Core Element
• Common data model for organizing/thinking about preservation metadata
• Guidance for local implementations
• Standard for exchanging information packages between repositories
Data Dictionary
• Pieces of information or knowledge related to PREMIS entities that digital repository systems need to know and should be able to export to other systems
• More structure than normal METADATA
Semantic Unit
Entities
Entity
• May be called “Bibliographic Entities”
• A set of content that is considered a single intellectual unit for purposes of management and description
• E.g. book, map, photograph, or database
Intellectual Entities
Entity
• To be stored and managed in the preservation repository
• E.g.
– Intellectual Entity: “Thailand Map”
– Object Entity: image file
• 3 kinds of object
– File
• A computer file, like a PDF or JPEG
– Representation
• Set of files that work together
• E.g. a web page including HTML, images, CSS, JavaScript
– Bitstream
• A part of a file
• E.g. a frame image in a video file
Object Entities
Entity
Object Entities
• Data Dictionary includes:
– a unique identifier for the object (type and value),
– fixity information such as a checksum (message digest) and the algorithm used to derive it,
– the size of the object,
– the format of the object, which can be specified directly or by linking to a format registry,
– the original name of the object,
– information about its creation,
– information about inhibitors,
– information about its significant properties,
– information about its environment
• E.g. OS: macOS, browser: Safari
– where and on what medium it is stored,
– digital signature information,
– relationships with other objects and other types of entities.
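A few of the items listed above can be pictured as one record per object. This is a rough sketch, not the PREMIS schema: the class `ObjectRecord`, its field names, the helper `describe_file`, and the sample file are all hypothetical, and only a subset of the listed items is covered.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ObjectRecord:
    """Illustrative subset of the object information listed above."""
    identifier_type: str
    identifier_value: str
    digest_algorithm: str
    digest: str
    size: int
    format_name: str
    original_name: str
    environment: dict = field(default_factory=dict)
    storage_medium: str = ""
    relationships: list = field(default_factory=list)


def describe_file(name: str, data: bytes, format_name: str) -> ObjectRecord:
    """Build a record from the raw bytes of a file (hypothetical helper)."""
    return ObjectRecord(
        identifier_type="local",
        identifier_value=f"obj:{name}",
        digest_algorithm="sha256",
        digest=hashlib.sha256(data).hexdigest(),
        size=len(data),
        format_name=format_name,
        original_name=name,
    )


# Made-up bytes standing in for the "Thailand Map" image file.
record = describe_file("thailand-map.jpg", b"\xff\xd8fake image bytes", "image/jpeg")
record.environment = {"os": "macOS", "browser": "Safari"}
print(record.identifier_value, record.size, record.format_name)
```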
Entity
• An action that affects an object in the repository
– E.g. changing an object
• Data Dictionary includes:
– a unique identifier for the event (type and value),
– the type of event (creation, ingestion, migration, etc.),
– the date and time the event occurred,
– a detailed description of the event,
– a coded outcome of the event (result of event: success | fail | …),
– a more detailed description of the outcome,
– agents involved in the event and their roles,
– objects involved in the event and their roles.
Events
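The event fields above can be sketched as a record that is created each time something happens to an object. The class `EventRecord`, the helper `record_migration`, the role strings, and the identifiers below are assumptions for illustration, not names taken from the Data Dictionary.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EventRecord:
    """Illustrative event record following the list above."""
    event_type: str        # e.g. creation, ingestion, migration
    detail: str            # detailed description of the event
    outcome: str           # coded outcome: "success" or "fail"
    linked_objects: list   # (object identifier, role) pairs
    linked_agents: list    # (agent identifier, role) pairs
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))
    date_time: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def record_migration(source_id: str, target_id: str,
                     agent_id: str, succeeded: bool) -> EventRecord:
    """Describe a format migration from one object to another."""
    return EventRecord(
        event_type="migration",
        detail=f"Migrated {source_id} to {target_id}",
        outcome="success" if succeeded else "fail",
        linked_objects=[(source_id, "source"), (target_id, "outcome")],
        linked_agents=[(agent_id, "executing program")],
    )


# Hypothetical identifiers for two objects and a software agent.
event = record_migration("obj:report.doc", "obj:report.pdf", "agent:converter", True)
print(event.event_type, event.outcome, event.date_time)
```

Keeping the agent and object links as (identifier, role) pairs mirrors the last two items of the list: the same agent or object can play different roles in different events.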
Entity
• Actor, e.g. person, organization, or software
• Metadata standards, e.g. FOAF, vCard, eduPerson, …
• Data Dictionary includes:
– a unique identifier for the agent (type and value),
– the agent's name,
– designation of the type of agent (person, organization, software).
• Note: An agent can have many roles, depending on the Event or Rights entities it is linked to
Agents
Entity
• Information about Rights and Permissions that are directly relevant to preserving objects in the repository
• Assertion of Rights
– To provide actionable info to the preservation repository system
• Data Dictionary includes:
– a unique identifier for the rights statement (type and value),
– whether the basis for claiming the right is copyright, license or statute,
– more detailed information about the copyright status, license terms, or statute, as applicable,
– the action(s) that the rights statement allows,
– any restrictions on the action(s),
– the term of grant, or time period in which the statement applies,
– the object(s) to which the statement applies,
– agents involved in the rights statement and their roles.
Rights
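"Actionable info" can be read as: the repository can ask a rights statement whether a given action on a given object is allowed on a given date. The sketch below assumes a simplified model. `RightsStatement`, `is_permitted`, and all sample values are hypothetical, and restrictions on actions are left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RightsStatement:
    """Illustrative subset of the rights information listed above."""
    identifier: str
    basis: str                   # "copyright", "license" or "statute"
    allowed_actions: set         # actions the statement allows
    object_ids: set              # objects the statement applies to
    start: date                  # start of the term of grant
    end: Optional[date] = None   # None means the grant is open-ended


def is_permitted(stmt: RightsStatement, object_id: str,
                 action: str, on: date) -> bool:
    """True if the statement allows the action on the object on that date."""
    in_term = stmt.start <= on and (stmt.end is None or on <= stmt.end)
    return (object_id in stmt.object_ids
            and action in stmt.allowed_actions
            and in_term)


# Made-up license covering one object for a fixed term.
stmt = RightsStatement(
    identifier="rights:1",
    basis="license",
    allowed_actions={"replicate", "migrate"},
    object_ids={"obj:thesis.pdf"},
    start=date(2012, 1, 1),
    end=date(2022, 12, 31),
)
print(is_permitted(stmt, "obj:thesis.pdf", "migrate", date(2015, 6, 1)))
print(is_permitted(stmt, "obj:thesis.pdf", "disseminate", date(2015, 6, 1)))
```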
Example
Example
OAIS
Open Archival Information System
• In 2000 the Research Libraries Group (RLG) and the Online Computer Library Center (OCLC) discussed how both organizations could build an infrastructure for archiving digital objects.
Overview
• Purpose
– Model a system for archival information, represented in digital format, for long-term preservation
• Scope
– Framework for long-term preservation and access
– Terminology
• Architectures and operation
• Preservation strategies and techniques
• Data model
Overview
• Primary functions
– To preserve information over an extended period of time
– To provide user access to the information in archives
Overview
High Level Concept
Producer: Person(s), or client systems, who provide the information to be preserved
Management: Person(s) who set the overall policy of the OAIS. Management is separate from administrative functions
Consumer: Person(s), or client systems, who interact with the OAIS system and services
Model: Archive External Data Workflow
Package Model
Diagram labels: Content Information; Preservation Description Information (PDI); Archive Packaging Information; Descriptive Information about Package 1; Package 1
• Content Information:
– The original information targeted for preservation.
– The physical/digital object and its Representation Information.
Package Model
• Preservation Description Information (PDI):
– What is needed to preserve the Content Information
• Provenance
– For reliability
– Source of content
– Histories
• Context
– Environment to render
• Reference
– Refers to things outside, e.g. ISBN
• Fixity
– Checksum, MD5, …
Package Model
• Descriptive Information:
– Information that is used to discover which package has the Content Information of interest
– Full set of attributes that are searchable in the catalog service
• Submission Information Package (SIP)
– Is sent to an OAIS by a Producer
– Has some Content Information
– Has some associated PDI
• Archival Information Package (AIP)
– Is stored in Archival Storage
– Is transformed from one or more SIPs
– Has a complete set of PDI for the associated Content Information
– Conforms to the OAIS internal standard
– Is managed by the OAIS
• Dissemination Information Package (DIP)
– Is presented to a Consumer
– May not have complete PDI
3 Package Types
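The difference between the three package types can be sketched with one class and two transformations. Everything here is an illustrative assumption: `InformationPackage`, `PDI_KEYS`, `sip_to_aip`, and `aip_to_dip` are made-up names, and the sketch builds an AIP from a single SIP although an AIP may come from several.

```python
from dataclasses import dataclass, field

# The four PDI categories named on the Package Model slides.
PDI_KEYS = ("provenance", "context", "reference", "fixity")


@dataclass
class InformationPackage:
    kind: str      # "SIP", "AIP" or "DIP"
    content: bytes # Content Information (simplified to raw bytes)
    pdi: dict = field(default_factory=dict)
    descriptive: dict = field(default_factory=dict)


def has_complete_pdi(pkg: InformationPackage) -> bool:
    """An AIP needs every PDI category; SIPs and DIPs may lack some."""
    return all(key in pkg.pdi for key in PDI_KEYS)


def sip_to_aip(sip: InformationPackage, extra_pdi: dict) -> InformationPackage:
    """Complete the PDI of a SIP and return it as an AIP."""
    aip = InformationPackage("AIP", sip.content,
                             {**sip.pdi, **extra_pdi}, dict(sip.descriptive))
    if not has_complete_pdi(aip):
        raise ValueError("an AIP needs a complete set of PDI")
    return aip


def aip_to_dip(aip: InformationPackage, pdi_keys=()) -> InformationPackage:
    """Derive a DIP that carries only the requested part of the PDI."""
    selected = {key: aip.pdi[key] for key in pdi_keys}
    return InformationPackage("DIP", aip.content, selected, dict(aip.descriptive))


# Hypothetical submission with partial PDI.
sip = InformationPackage("SIP", b"map scan", {"provenance": "Scanned by library"},
                         {"title": "Thailand Map"})
aip = sip_to_aip(sip, {"context": "JPEG viewer", "reference": "local:map-1",
                       "fixity": "sha256:..."})
dip = aip_to_dip(aip, pdi_keys=("reference",))
print(aip.kind, has_complete_pdi(aip), dip.kind, has_complete_pdi(dip))
```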
Functional Models
• Accept SIPs from Producers
– or from internal elements under Administration control
• Prepare the AIPs for archival storage
1) Ingest
• Storage of AIPs
• Maintenance of AIPs
• Retrieval of AIPs
2) Archival Storage
• Populate
– Descriptive Information
– Administrative Data
• Maintain
– Descriptive Information
– Administrative Data
• Access
– Descriptive Information
– Administrative Data
3) Data Management
• Solicit and negotiate submission agreements
– With Producers
• Audit submissions
– To ensure that they meet the standard
• Maintain configuration management of
– System hardware
– Software
• Day-to-day governance of the other OAIS functional entities
4) Administration
• Monitor the environment of the OAIS
• Provide recommendations
– Still accessible?
– Long-term?
– What if the original computing environment becomes obsolete?
5) Preservation Planning
• Determine the following for information in the OAIS:
– Existence
– Description
– Location
– Availability
• Allow Consumers to
– Request information products
– Retrieve information products
6) Access
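How some of the functional entities fit together can be sketched with a toy in-memory archive. This is an assumption-heavy illustration, not an OAIS implementation: the class `TinyArchive` and its methods are hypothetical, Administration and Preservation Planning are not modelled, and an AIP is reduced to content plus a fixity record.

```python
import hashlib


class TinyArchive:
    """Toy model of Ingest, Archival Storage, Data Management and Access."""

    def __init__(self):
        self._storage = {}  # Archival Storage: AIP id -> AIP
        self._catalog = {}  # Data Management: AIP id -> Descriptive Information

    def ingest(self, content: bytes, descriptive: dict) -> str:
        """Ingest: accept a submission, build an AIP, store and catalog it."""
        digest = hashlib.sha256(content).hexdigest()
        aip_id = digest[:12]
        self._storage[aip_id] = {
            "content": content,
            "fixity": {"algorithm": "sha256", "digest": digest},
        }
        self._catalog[aip_id] = dict(descriptive)
        return aip_id

    def find(self, **criteria) -> list:
        """Access: determine which packages match the given description."""
        return [aip_id for aip_id, desc in self._catalog.items()
                if all(desc.get(key) == value for key, value in criteria.items())]

    def retrieve(self, aip_id: str) -> bytes:
        """Access: return the content after re-checking its fixity."""
        aip = self._storage[aip_id]
        if hashlib.sha256(aip["content"]).hexdigest() != aip["fixity"]["digest"]:
            raise ValueError("fixity check failed")
        return aip["content"]


# Hypothetical submission from a Producer, then a Consumer request.
archive = TinyArchive()
aip_id = archive.ingest(b"thesis text", {"title": "My Thesis", "type": "thesis"})
matches = archive.find(type="thesis")
print(matches == [aip_id], archive.retrieve(aip_id))
```

Keeping the catalog separate from the stored packages reflects the split between Data Management and Archival Storage: searching never has to touch the preserved content itself.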
?