![Page 1: Metadata for Digital Libraries: A Functional Approach Sandra Payette Digital Library Research Group Cornell University payette@cs.cornell.edu Cornell Digital](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649d7a5503460f94a5e87d/html5/thumbnails/1.jpg)
Metadata for Digital Libraries: A Functional Approach
Sandra Payette
Digital Library Research Group
Cornell University
Cornell Digital Imaging Workshop
October 21, 1998
Metadata
CREATOR: Plato
TITLE: The Republic
Image 1 cdrom 1
Image 2 cdrom 1
Image 3 cdrom 2
Image File Storage
Metadata is structured data about data that imposes order on a disordered information universe.
Access Control List
Many Types of Metadata
• Descriptive
• Structural
• Terms and conditions
• Administrative
• Content ratings
• Provenance
• Relationship
Basic Functions We Must Support
• Resource Discovery
• Access and Use
• Preservation and Administration
Resource Discovery:
Focus on Descriptive Metadata
Metadata for Resource Discovery
• Catalogs
  – OPAC / MARC Records
• Indexes
  – Structured descriptive records (e.g., Dublin Core)
  – Abstracts
  – Full-text surrogates (e.g., via OCR)
Challenges
• Impracticality of large-scale traditional cataloging
  – time consuming, labor intensive, special skills
  – limited coverage - only “selected” items
• Problems with resource discovery
  – full-text indexing ineffective (false hits, irrelevancy, overload)
  – full-text approaches not useful for non-textual data (e.g., audio, video, executable programs)
One Solution: Simple Descriptive Surrogates
• Easy to create
• Applicable across domains
• Applicable to different genres of objects
• Allows interoperability among robots, indexers, and search clients
Dublin Core Element Set
• Good baseline descriptive record
• Can exist alongside other specialized metadata
• Common ground for discovery across disparate resources
• No specialized skills required
• Flexibility through qualifiers
Source: http://www.purl.org/Metadata/dublin_core/
Dublin Core: 15 Elements
• Title: name given to the work by the author
• Author or Creator: person(s) responsible for the intellectual content
• Subject and Keywords: the topic of the work, keywords, or formal classification schemes
• Description: textual description of the content (abstract, prose describing an image, etc.)
• Publisher: the organization making the work available in its present form
• Other Contributor: person(s) other than the author who have made significant contributions to the intellectual content
• Date: the date the work was made available
• Resource Type: category of the resource
• Format: data representation of the resource
• Resource Identifier: unique identification string (e.g., URL, URN, ISBN...)
• Source: object from which this object is derived (if applicable)
• Language: language of the intellectual content of the object
• Relation: relationship of the object to other objects or collections
• Coverage: spatial locations and temporal duration characteristics
• Rights Management: a pointer to a copyright notice, a rights management statement, or a rights server
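A record built from these elements can be modeled as a mapping from element name to one or more values. The following is a minimal sketch in Python, not something shown in the original slides: the names `DC_ELEMENTS` and `make_record` are hypothetical, the short lowercase labels are assumed abbreviations of the element names above, and the subject values are invented for illustration.

```python
# Short labels for the fifteen elements, in the order listed above.
# (Assumed abbreviations, e.g. "creator" for "Author or Creator".)
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**fields):
    """Build a record; every element is optional and may repeat."""
    record = {}
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        # Store every value as a list so repeated elements look uniform.
        values = list(value) if isinstance(value, (list, tuple)) else [value]
        record[name] = values
    return record


record = make_record(
    title="The Republic",
    creator="Plato",
    subject=["philosophy", "political theory"],  # illustrative values
)
print(record["subject"])
```

Keeping every value in a list reflects the idea that no element is mandatory and any element may appear more than once.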
Dublin Core in HTML META Tags
<html>
<head>
<title>Cornell Digital Library Research Group</title>
<META name="DC.subject" content="digital library research">
<META name="DC.subject" content="networked object description">
<META name="DC.publisher" content="Cornell University">
<META name="DC.creator" content="Lagoze, Carl, [email protected].">
<META name="DC.creator" content="Payette, Sandra, [email protected].">
<META name="DC.title" content="Cornell Digital Library Research Group">
<META name="DC.date" content="1998-05-15">
<META name="DC.form" scheme="IMT" content="text/html">
<META name="DC.language" scheme="ISO639" content="en">
<META name="DC.identifier" scheme="URL" content="http://www2.cs.cornell.edu/NCSTRL/CDLRG/cdlrg.htm">
</head>
<IMG SRC="/mydir/mysubdir/mypicture.gif" WIDTH=208 HEIGHT=216>
</html>
Source: http://www.w3.org/TR/REC-html40/
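An indexer or robot can harvest such tags with very little code. The sketch below uses Python's standard `html.parser` module; the class name `DublinCoreExtractor` and the shortened sample page are assumptions made for this example, not part of the original slides.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collect (element, value) pairs from META tags named DC.*"""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lower case.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            self.pairs.append((name[3:], attributes.get("content") or ""))


page = """<html><head>
<title>Cornell Digital Library Research Group</title>
<META name="DC.subject" content="digital library research">
<META name="DC.publisher" content="Cornell University">
<META name="DC.date" content="1998-05-15">
</head></html>"""

extractor = DublinCoreExtractor()
extractor.feed(page)
print(extractor.pairs)
```

Tags whose name does not start with `DC.` are ignored, so ordinary page metadata does not leak into the harvested record.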
Warwick Framework
• Developed by Dublin Core community
• Broader framework to accommodate diverse metadata schemes
• Encourages community-specific definition and administration of metadata
• Modularity supports interoperability among:
  – content providers
  – catalogers and indexers
  – automated resource discovery systems
Warwick Framework Container
[Diagram: a Container holding three Packages: Dublin Core, Other Descriptive, and a Reference to MARC. A simple package is a typed metadata set. The reference package points, by URI, to a separate Package holding the MARC Record.]
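The container idea can be sketched as a small data structure. This is an illustrative reading of the diagram in Python, not an implementation from the Warwick Framework itself; the class names `MetadataSet`, `Reference`, and `Container`, and the example URI, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class MetadataSet:
    """A simple package: a typed metadata set held directly."""
    scheme: str  # e.g. "Dublin Core"
    data: dict


@dataclass
class Reference:
    """An indirect package: a URI pointing at metadata stored elsewhere."""
    scheme: str
    uri: str


@dataclass
class Container:
    """Aggregates packages of any kind."""
    packages: List[Union[MetadataSet, Reference]] = field(default_factory=list)

    def schemes(self):
        return [package.scheme for package in self.packages]


container = Container([
    MetadataSet("Dublin Core", {"creator": "Plato", "title": "The Republic"}),
    MetadataSet("Other Descriptive", {}),
    Reference("MARC", "http://example.org/marc/record-1"),  # hypothetical URI
])
print(container.schemes())
```

Because each package carries its own type label, a client can pick out the schemes it understands and skip the rest.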
WWW Infrastructure Evolving in this Direction
• Dublin Core submitted to IETF as RFC
  – ftp://ftp.isi.edu/in-notes/rfc2413.txt
• Resource Description Framework (RDF)
  – http://www.w3.org/RDF/
• Extensible Markup Language (XML)
  – http://www.w3.org/XML/
Resource Description Framework (RDF)
• Influenced by the Warwick Framework, among others
• Enables interoperability between applications that exchange metadata
• Mix and match of metadata elements from different schemas
• An application of XML (transfer syntax)
A Simple RDF Model
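[Diagram: a directed graph. The resource www2.cs.cornell.edu/CDLRG/doc1 has the properties DC:Creator, DC:Publisher, and QCSchema:Rating. The rating points to www.xxx.org/rate, which carries the labels MyRating and YourRating with the values A and B.]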
RDF Expressed in XML
Dublin Core
Element Set
<?xml:namespace name="http://www.purl.org/Metadata/dublin_core/" as="DC">
<?xml:namespace name="http://www.w3.org/Schemas/RDF/" as="RDF">
<RDF:Serialization>
  <RDF:Assertions href="http://www2.cs.cornell.edu/CDLRG/doc1">
    <DC:Creator>Sandy Payette</DC:Creator>
    <DC:Publisher>Cornell DLRG</DC:Publisher>
  </RDF:Assertions>
</RDF:Serialization>
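Whatever the transfer syntax, the assertions reduce to statements of the form (resource, property, value). A minimal sketch in Python, assuming plain tuples and a hypothetical helper `values_of`, shows the two assertions from the XML and a lookup over them:

```python
# Each statement is a (resource, property, value) triple.
DOC = "http://www2.cs.cornell.edu/CDLRG/doc1"

triples = [
    (DOC, "DC:Creator", "Sandy Payette"),
    (DOC, "DC:Publisher", "Cornell DLRG"),
]


def values_of(resource, prop):
    """Return every value asserted for one property of one resource."""
    return [v for (r, p, v) in triples if r == resource and p == prop]


print(values_of(DOC, "DC:Creator"))
```

Properties from other schemas can be added as further triples without changing the structure, which is the mix-and-match property the framework aims for.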
RDF: Why is it important?
• Market demand for metadata deployment
• Software infrastructure will be ubiquitous (e.g., free in browsers, servers, proxies, editors)
• RDF is a general-purpose framework that provides structured, human-readable and machine-understandable metadata for the web
• Allows stakeholder communities to independently develop, maintain, and reuse vocabularies
Access and Use
Focus on Structural Metadata
Structural Metadata
• What is it? Data that:
  – Defines structure within documents
  – Aggregates images into meaningful entities
  – Correlates document components to image files
  – Organizes a collection of objects
• Where is it?
  – ASCII text files in directories
  – Relational databases
  – Embedded in documents or surrogates (e.g., SGML)
First... A Data Model
Data models mirror natural attributes and relationships of real-world objects
[Diagram: entity-relationship model of a document with the entities Front, Table of Contents, Chapter, Page, and Index, linked by 0:1 and 1:N cardinalities.]
“Binding” Document Images with SGML
<!DOCTYPE EBIND PUBLIC "-//UC Berkeley//DTD ebind.dtd (Electronic Binding (Ebind))//EN" [
<!ENTITY % birch PUBLIC "-//UC Berkeley//ENTITIES Birch-tree fairy book (Page Images)//EN">
%birch;
]>
<ebind type="book">
<front>
<page><image entityref="birch001" seqno="1" nativeno="i"></page>
<page><image entityref="birch002" seqno="2" nativeno="ii"></page>
<page><image entityref="birch003" seqno="3" nativeno="iii"></page>
<page><image entityref="birch004" seqno="4" nativeno="iv"></page>
<div0 type="titlepage">
<page><image entityref="birch005" seqno="5" nativeno="v"></page>
<page><image entityref="birch006" seqno="6" nativeno="vi"></page>
</div0>
<div0 type="introduction">
<head>Introductory note</head>
<page><image entityref="birch007" seqno="7" nativeno="vii"></page>
</div0>
Source: http://sunsite.berkeley.edu/Ebind/
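The `entityref`, `seqno`, and `nativeno` attributes are what let software correlate a printed page number with an image file. The sketch below restates the seven pages from the excerpt as Python data and builds a lookup; the class name `PageImage` and the index `by_native` are hypothetical names chosen for this illustration, and no SGML parsing is attempted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageImage:
    entityref: str  # name of the image entity
    seqno: int      # position in scanning order
    nativeno: str   # page number as printed on the page


pages = [
    PageImage("birch001", 1, "i"),
    PageImage("birch002", 2, "ii"),
    PageImage("birch003", 3, "iii"),
    PageImage("birch004", 4, "iv"),
    PageImage("birch005", 5, "v"),
    PageImage("birch006", 6, "vi"),
    PageImage("birch007", 7, "vii"),
]

# Index by printed page number so a citation can be turned into an image.
by_native = {page.nativeno: page for page in pages}

print(by_native["v"].entityref)
```

Keeping the sequence number separate from the printed number matters because front matter, unnumbered plates, and misnumbered pages all break the assumption that the two agree.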
Finding Aids in SGML
• Encoded Archival Description (EAD)
  – SGML markup of descriptive access tools (inventories, registers, indexes, and guides)
  – provides more detail about a collection than a typical catalog record
  – facilitates access - “drill down” into collection
  – potential international standard
  – maintained jointly by Library of Congress and Society of American Archivists (SAA)
Source: http://www.loc.gov/rr/ead/eadhome.html
Preservation and Administration
Focus on Administrative Metadata
and Persistent Identifiers
Administrative Metadata
• Information for managing images... over time
  – relocation
  – migration (new formats)
  – copyright tracking
  – archiving of objects and services
• Where is it?
  – File headers (to help prevent orphaned images)
  – External databases (e.g., relational db)
  – Separate files stored with images
Create a Preservation Audit Trail
Image File Attributes:
• formats
• versions
• compression
Image Attributes:
• resolution
• bit depth
• orientation
Process Data:
• creation date/time
• equipment used
Rights Management Data:
• expiration dates
• copyright info
• source statements
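One way to capture these four groups is a single record per image that can be stored in a database or in a file beside the image. The following Python sketch is an assumption about how such a record might look: the class name `ImageAdminRecord`, the field names, and every sample value are placeholders invented for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class ImageAdminRecord:
    # Image file attributes
    file_format: str
    format_version: str
    compression: str
    # Image attributes
    resolution_dpi: int
    bit_depth: int
    orientation: str
    # Process data
    created: str            # ISO 8601 date/time
    equipment: str
    # Rights management data
    rights_expiration: str
    copyright_info: str
    source_statement: str


# All values below are placeholders, not data from a real project.
record = ImageAdminRecord(
    file_format="TIFF",
    format_version="6.0",
    compression="Group 4",
    resolution_dpi=600,
    bit_depth=1,
    orientation="portrait",
    created="1998-10-21T09:00:00",
    equipment="flatbed scanner (placeholder)",
    rights_expiration="2000-01-01",
    copyright_info="placeholder copyright statement",
    source_statement="placeholder source statement",
)

# A plain dictionary is easy to write to a database row or a sidecar file.
print(asdict(record)["file_format"])
```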
Persistent Identifiers
• Globally unique names
• Persistent … names are permanent, lasting
• Used in resolution services to locate the object (locations change over time).
Unique Identifier: cnri.dlib/april97-payette
  – Naming Authority: cnri.dlib
  – Item Name: april97-payette
URL: http://www.somewebserver.org/somedirectory/somefile
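The separation between a permanent name and a changeable location can be sketched in a few lines of Python. This is a toy stand-in for a resolution service, not the API of any real system: the functions `split_identifier` and `resolve` and the in-memory table `locations` are hypothetical.

```python
def split_identifier(identifier):
    """Split 'authority/item' at the first slash."""
    authority, separator, item = identifier.partition("/")
    if not separator or not authority or not item:
        raise ValueError(f"expected 'authority/item', got {identifier!r}")
    return authority, item


# Stand-in for a resolution service: names stay fixed, locations may change.
locations = {
    "cnri.dlib/april97-payette":
        "http://www.somewebserver.org/somedirectory/somefile",
}


def resolve(identifier):
    split_identifier(identifier)  # reject malformed names early
    return locations[identifier]


print(split_identifier("cnri.dlib/april97-payette"))
print(resolve("cnri.dlib/april97-payette"))
```

When the object moves, only the entry in the table changes; every citation that uses the name keeps working.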
Identifiers: Current Initiatives
• IETF Uniform Resource Names (URN)
  – specification of URN framework
  – requirements for resolution systems
  – syntax definition
• Existing Systems
  – CNRI’s Handle System
  – OCLC PURLs
  – DOI Initiative
Further reading
• IFLA: A Good List - http://www.nlc-bnc.ca/ifla/II/metadata.htm
• Lynch, et al.: CNI Resource Discovery White Paper - http://www.cni.org/projects/nidr/nidr.html
• Lagoze: Resource Discovery in the Digital Age - http://www.dlib.org/dlib/june97/06lagoze.html
• Payette: Persistent Identifiers, RLG DigiNews - http://www.rlg.org/preserv/diginews/diginews22.html
• W3C: Metadata Overview - http://www.w3.org/Metadata