Digital Libraries Made Easy 2004 SAMLA Convention Roanoke, Virginia November 12, 2004 Edward A. Fox Digital Library Research Laboratory & Dept. of Computer Science, Virginia Tech, Blacksburg, VA 24061 [email protected] http://fox.cs.vt.edu http://fox.cs.vt.edu/talks/2004/




Acknowledgements (Selected)

• Sponsors: ACM, Adobe, AOL, CNI, CONACyT, DFG, IBM, Microsoft, NASA, NDLTD, NLM, NSF (IIS-9986089, 0086227, 0080748, 0325579; DUE-0121679, 0136690, 0121741, 0333601), OCLC, SOLINET, SUN, SURA, UNESCO, US Dept. Ed. (FIPSE), VTLS

• VT Faculty/Staff: Debra Dudley, Weiguo Fan, Gail McMillan, Manuel Perez, Naren Ramakrishnan, Layne Watson, …

• VT Students: Yuxin Chen, Shahrooz Feizabadi, Marcos Goncalves, Nithiwat Kampanya, S.H. Kim, Aaron Krowne, Bing Liu, Ming Luo, Paul Mather, Fernando Das Neves, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ohm Sornil, Hussein Suleman, Ricardo Torres, Wensi Xi, Baoping Zhang, …

• Leonid Kalinichenko: for advice shaping this tutorial


Other Collaborators (Selected)

• Brazil: FUA, UFMG, UNICAMP

• Case Western Reserve University

• Emory, Notre Dame, Oregon State

• Germany: Univ. Oldenburg

• Mexico: UDLA (Puebla), Monterrey

• College of NJ, Hofstra, Penn State, Villanova

• University of Arizona

• University of Florida, Univ. of Illinois

• University of Virginia


Outline

1. Introduction

2. Historical Perspective

3. Topical Perspective

4. Software Solutions

5. Advanced Issues


For More Information

• Magazine: www.dlib.org
• Books: http://fox.cs.vt.edu/DLSB.html (1994)
  • MIT Press: Arms, plus related by Borgman, Licklider (1965)
  • Morgan Kaufmann: Witten... (several), Lesk (2nd edition soon)
• Conferences
  • ECDL: www.ecdl2005.org
  • ICADL: http://icadl2004.sjtu.edu.cn
  • JCDL: www.jcdl2005.org
• Associations
  • ASIS&T DL SIG
  • IEEE TCDL: www.ieee-tcdl.org (student awards, consortium)
• NSF: www.dli2.nsf.gov
• Labs: VT: www.dlib.vt.edu, http://ei.cs.vt.edu/~dlib/


[Figure: modeling layers for digital libraries. Nodes: Domain Concepts (theory), Modeling Language (Meta-Model), Model, DL Architecture, Running DL, Actors, “Real” World, “real” world object. Edge labels: instance of, used to compose, abstracted from, represented by, interpreted as.]


Digital Libraries Shorten the Chain from

Editor

Publisher

A&I

Consolidator

Library

Reviewer


DLs Shorten the Chain to

Author

Reader

Digital Library

Editor

Reviewer

Teacher

Learner

Librarian


A Digital Library Case Study

• Domain: graduate education, research

• Genre: ETDs = electronic theses & dissertations

• Submission: http://etd.vt.edu

• Collection: http://www.theses.org

Project: Networked Digital Library of Theses & Dissertations (NDLTD) http://www.ndltd.org


DLs: Why of Global Interest?

• National projects can preserve antiquities and heritage: cultural, historical, linguistic, scholarly

• Knowledge and information are essential to economic and technological growth, education

• DL - a domain for international collaboration
  • wherein all can contribute and benefit
  • which leverages investment in networking
  • which provides useful content on Internet & WWW
  • which will tie nations and peoples together more strongly and through deeper understanding


Libraries of the Future

J.C.R. Licklider, 1965, MIT Press

World

Nation

State

City

Community


5S Definition: Digital Libraries are complex systems that

• help satisfy info needs of users (societies)

• provide info services (scenarios)

• organize info in usable ways (structures)

• present info in usable ways (spaces)

• communicate info with users (streams)


Synchronous Scholarly Communication

Same time, Same or different place


Asynchronous, Digital Library Mediated Scholarly Communication

Different time and/or place


Locating Digital Libraries in Computing and Communications Technology Space

[Figure: axes labeled Computing (flops), Digital content, and Communications (bandwidth, connectivity), running from less to more. Digital Libraries technology trajectory: intellectual access to globally distributed information.]

Note: we should consider 4 dimensions: computing, communications, content, and community (people)


Digital Library Content

Content Types

• Text Documents: Articles, Reports, Books
• Video, Audio: Speech, Music
• Geographic Information: (Aerial) Photos
• Software, Programs: Models, Simulations
• Bio Information: Genome (Human, animal, plant)
• Images and Graphics: 2D, 3D, VR, CAT


AmericanSouth.Org – Roles, Content

SOLINET | Libraries (Data Providers) | Scholars

Intellectual Organization Controlled vocabulary Metadata extension development

Collection Decisions Selection Criteria Selection Criteria Controlled vocabulary

Central Server Maintenance | Local Server Maintenance | Provision of Context

Metadata Repository | Metadata Creation/Maintenance | Organizational Structure and Annotation Tools

Central Interface Design/Maintenance | Local Interface Design/Maintenance | Selection of Other Annotation Tools

Central Indices Creation/Maintenance | Local Indices | Selection of Thesauri

Coordination of Metadata Gateway Development | Gateway Implementation | Concept Mapping

Digital Objects


Content Area Description: Audio, Digital, Finding Aid, MSS, Other, Photo, Video, MF, Print, Total

African-American cultural life: 6 4 6 9 4 12 3 10 18 72

Agricultural crisis of late 19th century: 1 1 3 1 1 4 8 19

Codification of segregation laws: 1 3 2 1 1 8 16

Configuration of white supremacy: 1 3 3 3 1 9 20

Cultural values and activities: 3 1 5 17 4 15 1 5 20 71

Disenfranchising movements: 1 2 2 1 2 1 6 15

Educational movements: 6 1 1 18 6 21 3 5 27 98

Emergence of Holiness & Pentecostal Groups: 1 1 1 7 10

Emergence of new musical forms: 3 1 1 1 2 8

Emergence of organized groups expressing farmers' concerns: 2 2 1 8 13

…

Total Each Format: 41 14 51 161 38 133 13 79 301 831


Outline

1. Introduction

2. Historical Perspective
  • Computing-related (ACM-DL, CSTC, CITIDEL), NSDL
  • DLI, Workshop Results: Chatham

3. Topical Perspective

4. Software Solutions

5. Advanced Issues


CS -> CSTC -> CRIM

• NSF and ACM Education Committee are funding a 2 year project “A Computer Science Teaching Center” - CSTC - http://www.cstc.org/

• College of NJ, U. Ill. Springfield, Virginia Tech

• Focus initially on labs, visualization, multimedia

• Multimedia part is also supported by a 2nd grant to Virginia Tech and The George Washington University: http://www.cstc.org/~crim/ (with curricular guidelines also under development)


CS Teaching Center (CSTC)

• Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units.

• Learners benefit from having well-crafted modules that have been reviewed and tested.

• Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built.

• ACM support led to Journal of Educational Resources in Computing (JERIC), accessible from www.cstc.org


Browsing (1)


Browsing (2)


Computing and Information Technology Interactive Digital Educational Library (CITIDEL)

• Domain: computing / information technology

• Genre: one-stop-shopping for teachers & learners: courseware (CSTC, JERIC), leading DLs (ACM, IEEE-CS, DB&LP, CiteSeer), PlanetMath.org, NCSTRL (technical reports), …

• Submission & Collection: sub/partner collections www.citidel.org


www.CITIDEL.org

• Led by Virginia Tech, with co-PIs:
  • Fox (director, DL systems)
  • Lee (history)
  • Perez (user interface, Spanish support)

• Partners
  • College of New Jersey (Knox)
  • Hofstra (Impagliazzo)
  • Villanova (Cassel)
  • Penn State (Giles)


Multi-dimensional Categorization

• Language: English, Spanish
• Topic: Java, Multimedia, Algorithms
• Quality: Nominated, Editor reviewed, Identified by crawl, Peer reviewed
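
Multi-dimensional categorization can be read as faceted filtering: each resource carries one value per dimension (language, topic, quality), and a user narrows the collection by fixing any subset of them. The following is a minimal sketch under that reading, not CITIDEL's actual implementation; the `Resource` class, the `filter_resources` helper, and the sample catalog entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A catalog entry with one value per categorization dimension (hypothetical)."""
    title: str
    language: str
    topic: str
    quality: str


def filter_resources(resources, **facets):
    """Return the resources that match every requested facet value."""
    return [
        r for r in resources
        if all(getattr(r, dim) == value for dim, value in facets.items())
    ]


# Invented sample entries; the facet values come from the slide.
CATALOG = [
    Resource("Sorting visualizer", "English", "Algorithms", "Peer reviewed"),
    Resource("Applets de Java", "Spanish", "Java", "Editor reviewed"),
    Resource("Video codecs primer", "English", "Multimedia", "Identified by crawl"),
    Resource("Graph search lab", "English", "Algorithms", "Nominated"),
]

if __name__ == "__main__":
    hits = filter_resources(CATALOG, language="English", topic="Algorithms")
    print([r.title for r in hits])
```

Because the dimensions are independent, adding a fourth one only means adding a field; the filter itself does not change.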


DIGITAL LIBRARY SERVICES

REPOSITORIES

USER PORTALS

Overview of CITIDEL architecture


Digital library architecture for local and interoperable CITIDEL services

[Figure: three layers.
PORTALS: EDUCATORS, ADMINISTRATORS, LEARNERS.
SERVICES: Multilingual Searching, Revising, Annotating, Filtering, Browsing, Administering.
REPOSITORIES: Annotations, Filtering Profiles, User Profiles, Union Metadata.
An OAI Data Harvester and an OAI Data Provider connect to Remote and Peer Digital Libraries (e.g., NSDL-CIS).]


CITIDEL Technology Features

• Component architecture (Open Digital Library)
  • Re-use and compose re-deployable digital library components.
• Built Using Open Standards & Technologies
  • OAI: Used to collect DL resources and for DL interoperability
  • XSL and XML: Interface rendering with multi-lingual, community-based translation of screens and content (Spanish, …)
  • Perl: Component Integration
• ESSEX: Search Engine Functionality
  • Very fast, utilizing in-memory processing
  • Includes snap-shots for persistence
• Multi-scheming
  • Integrates multiple classifications / views through maps, closure
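
The OAI item above refers to harvesting metadata with the OAI Protocol for Metadata Harvesting (OAI-PMH): a harvester sends an HTTP request with a `verb` such as `ListRecords` and parses the XML reply. The sketch below only builds the request URL and parses a canned response, so it runs without a network; the base URL, record identifier, and title are invented, and it is not CITIDEL's harvester (which the slide says was integrated in Perl).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract the identifier and Dublin Core titles from a ListRecords reply."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(f"{{{OAI_NS}}}record"):
        ident = rec.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        titles = [t.text for t in rec.iter(f"{{{DC_NS}}}title")]
        records.append({"identifier": ident, "titles": titles})
    return records


# Canned reply with invented values, shaped like an oai_dc ListRecords response.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also follow `resumptionToken` elements to page through large result sets; that step is omitted here.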


Cluster Search Results from CITIDEL


Cluster NDLTD-Computing


CITIDEL -> NSDL

• A collection project in the

• National STEM (science, technology, engineering, and mathematics) education Digital Library – NSDL

• National Science Digital Library

• www.nsdl.org

• (Next slides courtesy Lee Zia, NSF)


Supports:

• Users (profiles): Learning communities
• Content (metadata): Customizable collections
• Tools (protocols): Application services


Enables:

Environments for
• Communication
• Collaboration
• Creation
• Validation
• Evaluation
• Recognition
• ...

AND

• Discovery
• Stability
• Reliability
• Reusability
• Interoperability
• Customizability
• ...
of Resources


NSDL Program Tracks

• Core Integration: coordinate a distributed alliance of resource collection and service providers; and ensure reliable and extensible access to and usability of the resulting network of learning environments and resources

• Collections: aggregate and actively manage a subset of the digital library’s content within a coherent theme / specialty

• Services: increase the impact, reach, efficiency, and value of the digital library in its fully operational form

• Targeted (Applied) Research: have immediate impact on one or more of the other three tracks

• Pathways: large efforts across broad ranges of areas or approaches or users


NSDL Information Architecture

Essentially as developed by the Technical Infrastructure Workgroup

[Figure: a Core NSDL “Bus” connecting three groups.
Collection Building: NSDL Collections, referenced items & collections, Special Databases.
Usage Enhancement: NSDL Services and Other NSDL Services; CI Services (annotation, discussion, personalization, authentication, browsing); Core Services (information retrieval, metadata gathering); Core Collection-Building Services (harvesting, protocols).
User Interfaces: Portals & Clients.]


Outline

1. Introduction

2. Historical Perspective
  • Computing-related (ACM-DL, CSTC, CITIDEL), NSDL
  • DLI, Workshop Results

3. Topical Perspective

4. Software Solutions

5. Advanced Issues


Borgman et al.: Workshop Report on Social Aspects of Digital Libraries: http://www-lis.gseis.ucla.edu/DL/

Information Life Cycle


Information Life Cycle

• Authoring, Modifying
• Organizing, Indexing
• Storing, Retrieving
• Distributing, Networking
• Retention / Mining
• Accessing, Filtering
• Using, Creating


[Figure: Information Life Cycle annotated with quality dimensions.
Phases: Creation, Distribution, Seeking, Utilization.
Activities: Authoring, Modifying, Describing, Organizing, Indexing, Storing, Archiving, Networking, Accessing, Filtering, Searching, Browsing, Recommending, Retention, Mining, Discard.
Object states: Active, Semi-Active, Inactive.
Quality dimensions: Accuracy, Completeness, Conformance, Timeliness, Accessibility, Preservability, Similarity, Significance, Pertinence, Relevance, Believability.]


Benefits

• Ease of use

• Effectiveness

• “The benefits of digital libraries will not be appreciated unless they are easy to use effectively.” - IITA Workshop report


Application Domain | Related Institutions | Examples | Technical Challenges | Benefit / Impact

Publishing | Publishers, Eprint archives | OAI | Quality control, openness | Aggregation, organization

Education | Schools, colleges, universities | NSDL, NCSTRL | Knowledge management, reusability | Access to data

Art, Culture | Museum | AMICO, PRDLA | Digitization, describing, cataloging | Global understanding

Science | Government, Academia, Commerce | NVO, PDG, SwissProt, UK eScience, European Union Commission | Data models | Reproducibility, faster reuse, faster advance

(e) Government | Government Agencies (all levels) | Census | Intellectual property rights, privacy, multi-national | Accountability, homeland security

(e) Commerce, (e) Industry | Legal institutions | Court cases, patents | Developing standards | Standardization, economic development

History, Heritage | Foundations | American Memory | Content, context, interpretation | Long term view, perspective, documentation, recording, facilitating, interpretation, understanding

Cross-cutting | Library, Archive | Web, personal collections | Multi-language, preservation, scalability, interoperability, dynamic behavior, workflow, sustainability, ontologies, distributed data, infrastructure | Reduced cost, increased access, preservation, democratization, leveling, peace, competitiveness

Reagan Moore, Ed Fox, June 2002, for NSF


Outline

1. Introduction

2. Historical Perspective

3. Topical Perspective
  • Key concepts: do, mdo, coll, catalog, repository, service, archive, DL

• Interoperability: federated search, harvesting, OAI

• Architecture: distributed, clusters, LOCKSS

• Digitization, preservation

4. Software Solutions

5. Advanced Issues


Outside Key Set, but Important: Interfaces

• 5S perspective: spaces, scenarios

• Taxonomy of interface components

• Workflow

• Visualization

• Environments

• Design

• Usability testing


Also Important: Epub, SGML, XML

• 5S perspective: streams, structures, scenarios

• Authoring

• Rendering, presenting

• Tagging, Markup, DOM

• Semi-structured information

• Dual-publishing, eBooks

• Styles (XSL, XSLT)

• Structure queries


Also Important: Databases

• 5S perspective: structures, streams, scenarios

• Extending database technology

• Structured and unstructured info

• Multimedia databases

• Link databases

• Performance

• Replicated storage, I2-DSI (details following)


Also Important: Agents

• 5S perspective: societies, streams, spaces, scenarios, structures

• Protocols

• Knowledge interchange

• Negotiation, registries

• Distributed issues

• Webbots (automatic indexing)

• Ontologies (standard upper)


Also Important: Economics

• 5S perspective: societies, scenarios

• E-commerce

• Sustainability

• Preservation and archiving
  • DLF, Besser, Lorie, Gladney

• Self-archiving

• Open collections

• Economic models, business plans


Also Important: IPR

• 5S perspective: societies, scenarios

• Intellectual property rights

• Legal issues

• Terms and conditions

• Copyright

• Patents, trademarks

• Distributed rights management

• Security


Also Important: Social Issues

• 5S perspective: societies, scenarios

• Cooperation, collaboration

• Annotation, ratings

• Digital divide

• Educational applications

• Cultural heritage

• Museums (AMICO)

• Organizational acceptance

• Personalization

• Internationalization


What is Key depends on your DL Definition

• Library ++ (library+archive+museum+…)

• Distributed information system + organization + effective interface

• User community + collection + services

• Digital objects, repositories, IPR management, handles, indexes, federated search, hyperbase, annotation


Our Perspective on Key Concepts

• Recall the 5S approach
  • Minimal digital library
  • Metamodel for minimal digital library
  • Metamodel for “born digital standard” DLs
  • Metamodel for architectural DL

• Here, focus on key concepts in minimal DL


Digital Objects (DOs)

• Born digital

• Digitized version of “real” object
  • Is the DO version the same, better, or worse?
  • Decision for ETDs: structured + rendered

• Surrogate for “real” object
  • Not covered explicitly in metamodel for a minimal DL
  • Crucial in metamodel for archaeology DL


Metadata Objects (MDOs)

• MARC

• Dublin Core

• RDF

• IMS

• OAI (Open Archives Initiative)

• Crosswalks, mappings

• Ontologies

• Topic maps, concept maps
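
A crosswalk between two of the schemes listed above can be sketched as a lookup table from one scheme's fields to the other's. The example below maps a handful of MARC tags to Dublin Core elements. The pairings follow commonly used MARC-to-Dublin-Core conventions but are heavily simplified (no subfields, no indicators), and the `crosswalk` function and sample record are illustrative assumptions rather than anything prescribed by the slides.

```python
# Simplified tag-level mapping; real crosswalks work at the subfield level.
MARC_TO_DC = {
    "100": "creator",      # main entry, personal name
    "245": "title",        # title statement
    "260": "publisher",    # publication information
    "520": "description",  # summary note
    "650": "subject",      # topical subject heading
}


def crosswalk(marc_fields):
    """Convert (tag, value) pairs into a dict of Dublin Core element lists.

    Tags without a mapping are skipped, which is why crosswalks are lossy.
    """
    dc = {}
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue
        dc.setdefault(element, []).append(value)
    return dc


if __name__ == "__main__":
    record = [
        ("245", "Digital Libraries Made Easy"),
        ("100", "Fox, Edward A."),
        ("650", "Digital libraries"),
        ("650", "Metadata"),
        ("999", "local processing note"),  # no Dublin Core counterpart here
    ]
    print(crosswalk(record))
```

Dublin Core elements are repeatable, so every element holds a list; the two 650 fields both land under `subject`.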


Other Key Definitions

• coll, catalog, repository, service, archive, (minimal) DL

• See Gonçalves et al. in April 2004 ACM Transactions on Information Systems (TOIS)
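
To make the key definitions concrete, here is a toy sketch of a minimal DL: a collection of digital objects, a metadata catalog describing them, and indexing, searching, and browsing services over both. It is an informal illustration only. The formal definitions are in the cited TOIS paper, and every class, method, and field name below is a hypothetical choice.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    handle: str   # unique identifier
    content: str  # a text stream, for simplicity


@dataclass
class MinimalDL:
    collection: dict = field(default_factory=dict)  # handle -> DigitalObject
    catalog: dict = field(default_factory=dict)     # handle -> metadata dict
    index: dict = field(default_factory=dict)       # term -> set of handles

    def submit(self, obj, metadata):
        """Store an object, record its metadata, and index its terms."""
        self.collection[obj.handle] = obj
        self.catalog[obj.handle] = metadata
        for term in obj.content.lower().split():
            self.index.setdefault(term, set()).add(obj.handle)

    def search(self, query):
        """Searching service: handles of objects containing every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self.index.get(t, set()) for t in terms))
        return sorted(hits)

    def browse(self, key):
        """Browsing service: all handles ordered by one metadata field."""
        return sorted(self.catalog, key=lambda h: self.catalog[h].get(key, ""))


if __name__ == "__main__":
    dl = MinimalDL()
    dl.submit(DigitalObject("etd:1", "Digital libraries made easy"), {"title": "B"})
    dl.submit(DigitalObject("etd:2", "Easy search engines"), {"title": "A"})
    print(dl.search("easy"))
    print(dl.browse("title"))
```

The split between `collection` and `catalog` mirrors the distinction between digital objects and metadata objects: searching here runs over content, browsing over metadata.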


5S

[Figure: dependencies among 5S concepts, numbered as on the slide.
Mathematical basis: function (A.2), sequence (A.3), graph (A.6), measurable (A.10), measure (A.11), probability (A.12), vector (A.13), topological (A.14) spaces.
5S: streams (1), structures (2), spaces (4), scenarios (7), societies (10), with state (5), event (6), services (8).
Built on these: structural metadata specification (11), descriptive metadata specification (12), structured stream (15), digital object (16), collection (17), metadata catalog (18), repository (19), indexing service (20), searching service (21), hypertext (22), browsing service (23), digital library (minimal) (24).]


[Figure: 5S concepts extended for an archaeology DL.
5S: Streams, Structures, Spaces, Scenarios, Societies.
Minimal DL concepts: Structured Stream, Descriptive Metadata specification, hypertext, services (indexing, browsing, searching).
Archaeology-specific concepts: SpaTemOrg, StraDia, ArchObj, ArchColl, Arch Descriptive Metadata specification, Arch Metadata catalog, ArchDO, ArchDColl, ArchDR, Minimal ArchDL.]


[Figure: concepts and relationships across the 5S.
Streams: text, audio, image, video.
Structures: digital object, metadata specifications, Collection, Catalog, Repository, Index.
Spaces: Measurable, Measure, Probabilistic, Metric, Topological, Vector.
Scenarios: Service, Scenario, event, operation.
Societies: ServiceManager, Actor.
Relationship labels: describes, stores, contains, is_a, is_version_of/cites/links_to, extends, reuses, redefines, invokes, precedes, happens_before, executes, participates_in, recipient, runs, uses, association, inherits_from/includes, employs, produces.]


Models | Examples | Objectives

Stream | Text; video; audio; image | Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data

Structures | Collection; catalog; hypertext; document; metadata; organization tools | Specifies organizational aspects of the DL content

Spaces | Measure; measurable, topological, vector, probabilistic | Defines logical and presentational views of several DL components

Scenarios | Searching, browsing, recommending | Details the behavior of DL services

Societies | Service managers, learners, teachers, etc. | Defines managers responsible for running DL services; actors that use those; and relationships among them


Infrastructure Services

• Repository-Building
  • Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
  • Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
• Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)

Information Satisfaction Services

• Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing


[Figure: composition of DL services. Information Satisfaction Services shown: Searching, Browsing, Requesting, Recommending, Filtering, Binding, Visualizing, Expanding query. Infrastructure Services (Add_Value) shown: Rating, Training, Indexing. Services are marked as fundamental or composite and are connected by arcs labeled p and e. Inputs and outputs shown: Society, actor, query, anchor, handle, user model, query/category, space, query′, Collection, {digital object}, {(digital object, actor, rate)}, binder, classifier, transformer, Index.]


[Figure: DL development with 5S. Phases: Requirements, Analysis, Design, Implementation, Test. Artifacts and tools shown: 5S, 5SL, OO Classes, Workflow, Components, DL Evaluation, 5SGraph, 5SLGen, Formal Theory/Metamodel, DL XML Log.]


5SLGen: Automatic DL Generation

[Figure: elements shown are the 5S Meta Model, 5SLGraph, a DL Expert, a DL Designer, the 5SL DL Model, 5SLGen, a component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …), and Tailored DL Services for a Practitioner, Researcher, and Teacher. Stages: Requirements (1), Analysis (2), Design (3), Implementation (4).]


Outline

1. Introduction

2. Historical Perspective

3. Topical Perspective
• Key concepts: do, mdo, coll, catalog, repository, service, archive, DL
• Interoperability: federated search, harvesting, OAI
• Architecture: distributed, clusters, LOCKSS
• Digitization, preservation

4. Software Solutions

5. Advanced Issues


Interoperability through Standards

• Protocols/federation
  • Z39.50, CIMI
  • Dienst, NCSTRL
  • OAI protocol

• Metadata
  • TEI: inline, detailed (structure in stream)
  • MARC: two-level, fine-grained
  • Dublin Core: high-level, 15 elements
  • RDF: describing resources/collections, annotation
  • OAMS -> DC and others used in OAI
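
To show how compact the "high-level, 15 elements" Dublin Core set is, the sketch below lists the 15 elements and builds a minimal record. The helper name `make_dc_record` and the record values are hypothetical, invented only for illustration.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_dc_record(**fields):
    """Build a simple Dublin Core record as a dict of lists.

    Every element is optional and repeatable, so each value is stored
    as a list. Names outside the 15-element set are rejected.
    """
    record = {}
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        values = value if isinstance(value, (list, tuple)) else [value]
        record[name] = list(values)
    return record


# Hypothetical example record (values invented for illustration).
example = make_dc_record(
    title="A Sample Electronic Thesis",
    creator="Doe, Jane",
    subject=["digital libraries", "metadata"],
    type="Text",
    language="en",
)
```

Storing each value as a list reflects that Dublin Core elements may repeat; a real system would serialize such a record to XML rather than keep it as a dict.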


Interoperability and IR

• Information storage and retrieval

• Search, Retrieval, Resource Discovery

• Boolean vs. natural language

• Search engines

• Indexing, phrases, thesauri, concepts

• Federated search and harvesting, OAI

• Integrating links and ratings

• Crawlers, spiders, metasearch, fusion


Open Archives Initiative (OAI)

• Advocacy for interoperability
• Standard for transferring metadata among digital libraries
• Protocol for Metadata Harvesting (PMH)
  • Simplicity
  • Generality
  • Extensibility
• Support for PMH => Open Archive (OA)
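
The simplicity of PMH shows in its requests, which are plain HTTP queries carrying one of six verbs. The following is a minimal sketch of building such a request URL; the base URL and the helper name `oai_request_url` are assumptions for illustration, not part of any particular archive or toolkit.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url, verb, **params):
    """Return the HTTP GET URL for an OAI-PMH request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    # 'from' is a reserved word in Python, so callers pass 'from_'
    # and it is renamed to the protocol's 'from' argument here.
    for key, value in params.items():
        query["from" if key == "from_" else key] = value
    return f"{base_url}?{urlencode(query)}"


# Hypothetical base URL, used only for illustration.
BASE = "http://archive.example.org/oai"
url = oai_request_url(
    BASE, "ListRecords", metadataPrefix="oai_dc", from_="2004-01-01"
)
```

An archive that answers such requests correctly is, in the slide's terms, an Open Archive.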


OAI = Technical Umbrella for Practical Interoperability…

…that can be exploited by different communities

[Figure: communities shown under the OAI umbrella: Reference Libraries, Publishers, E-Print Archives, Museums.]


OAI – Repository Perspective

Required: Protocol

[Figure: a repository containing digital objects (DO) and a larger number of metadata objects (MDO).]


OAI – Black Box Perspective

[Figure: seven open archives, OA 1 through OA 7, each shown as a black box.]


Tiered Model of Interoperability

Mediator services

Metadata harvesting

Document models


The World According to OAI

[Figure: two layers, Service Providers (Discovery, Current Awareness, Preservation) and Data Providers, connected by metadata harvesting.]
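
A service provider harvests by repeating a list request until the data provider stops returning a resumption token. The sketch below illustrates that loop against two canned responses; the `fetch` function, the identifiers, and the token value are hypothetical stand-ins for a real HTTP exchange, and the sample XML is stripped down (real responses also carry elements such as `responseDate`, `request`, and `datestamp`).

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Two canned response pages standing in for a remote data provider.
PAGES = {
    None: """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example.org:1</identifier></header>
    <header><identifier>oai:example.org:2</identifier></header>
    <resumptionToken>page2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>""",
    "page2": """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example.org:3</identifier></header>
    <resumptionToken/>
  </ListIdentifiers>
</OAI-PMH>""",
}


def fetch(token):
    """Stand-in for an HTTP request to a data provider (hypothetical)."""
    return PAGES[token]


def harvest_identifiers(fetch_page):
    """Collect identifiers, following resumption tokens until exhausted."""
    identifiers, token = [], None
    while True:
        root = ET.fromstring(fetch_page(token))
        for ident in root.iter(OAI_NS + "identifier"):
            identifiers.append(ident.text)
        token_el = root.find(f".//{OAI_NS}resumptionToken")
        token = token_el.text if token_el is not None else None
        if not token:  # absent or empty token: the list is complete
            return identifiers


harvested = harvest_identifiers(fetch)
```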


Architectural Issues

• Internet middleware
• Independent system / part of federation
• Decompositions vary
  • search engine, browser, DBMS, MM support
  • repository, handle server, client
  • information resources + mediators, bus or agent collection + client with workspace/environment
• Metrics: e.g., for federated search


Clusters

• How can computer clusters scale with collections and user communities to achieve cost-effective solutions for DLs?

• Paul Mather dissertation by early 2005
  • Modeling and simulation
  • Cluster size
  • Communication fabric and patterns
  • Disks and nodes
  • Characterize DL collections: file sizes
  • Characterize user workload: logs
  • Special considerations:
    • Linear hashing of names
    • Replication of popular objects
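
Linear hashing lets a table of names grow one bucket at a time instead of rehashing everything at once, which is one reason it suits a growing collection. Below is a minimal in-memory sketch of the general technique; it is not taken from the dissertation mentioned above, and the class name, load threshold, and use of CRC-32 as the hash are assumptions for illustration.

```python
import zlib


class LinearHash:
    """Minimal linear hashing table that grows one bucket at a time."""

    def __init__(self, initial_buckets=2, max_load=2.0):
        self.n0 = initial_buckets   # bucket count at level 0
        self.level = 0              # number of completed doublings
        self.split = 0              # next bucket to be split
        self.max_load = max_load    # average items per bucket before a split
        self.buckets = [[] for _ in range(initial_buckets)]
        self.count = 0

    @staticmethod
    def _hash(name):
        # crc32 is stable across runs, unlike Python's built-in str hash.
        return zlib.crc32(name.encode("utf-8"))

    def address(self, name):
        h = self._hash(name)
        addr = h % (self.n0 * 2 ** self.level)
        if addr < self.split:       # that bucket was already split this round
            addr = h % (self.n0 * 2 ** (self.level + 1))
        return addr

    def insert(self, name):
        self.buckets[self.address(name)].append(name)
        self.count += 1
        if self.count / len(self.buckets) > self.max_load:
            self._split_one()

    def _split_one(self):
        old = self.buckets[self.split]
        self.buckets[self.split] = []
        self.buckets.append([])     # new bucket at index n0 * 2**level + split
        self.split += 1
        if self.split == self.n0 * 2 ** self.level:
            self.level += 1         # round complete: the table has doubled
            self.split = 0
        for name in old:            # redistribute using the updated state
            self.buckets[self.address(name)].append(name)

    def lookup(self, name):
        return name in self.buckets[self.address(name)]
```

In a cluster setting each bucket would correspond to a node or disk rather than a Python list, so a split moves only the objects of one bucket.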


LOCKSS

• Lots of copies keep stuff safe
• Stanford (Vicky Reich)
  • Initial focus on lower levels
  • Initial content: journals
• Emory (Martin Halbert)
  • Help deploy and adapt
  • Help apply in other contexts
• Another registry
  • Set of publisher manifests (information providers)
  • Set of storage systems (archival storage)
• NDIIPP: AmericanSouth, MetaArchive
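
The idea that many copies keep content safe can be illustrated by comparing replicas and repairing the ones that disagree with the majority. The sketch below is a centralized simplification written for this purpose; it is not the LOCKSS protocol itself, which relies on peer-to-peer polling, and the site names and function names are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 hex digest used to compare replicas."""
    return hashlib.sha256(content).hexdigest()


def audit_copies(copies):
    """Flag replicas whose digest disagrees with the majority.

    `copies` maps a (hypothetical) site name to the bytes it holds.
    Returns (majority_digest, sorted list of sites needing repair).
    """
    digests = {site: digest(data) for site, data in copies.items()}
    majority, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) / 2:
        raise ValueError("no majority: cannot decide which copy is good")
    damaged = sorted(s for s, d in digests.items() if d != majority)
    return majority, damaged


def repair(copies, damaged):
    """Overwrite damaged replicas with content from an agreeing site."""
    good_site = next(s for s in copies if s not in damaged)
    for site in damaged:
        copies[site] = copies[good_site]
    return copies
```

With three sites holding `b"x"`, `b"x"`, and `b"y"`, the audit flags the third and the repair step restores it from one of the other two.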


OCKHAM Library Network

[Figure: labels shown are NSDL, NSDL Services, OCKHAM Services, Library Services, and the OCKHAM Library Network, serving Teachers, Learners, and Librarians.]


OCKHAM

• Simplicity (à la Occam’s razor)

• Support by Mellon and DLF

• Four main ideas:

1. Components

2. Lightweight protocols

3. Open reference models (e.g., 5S, OAIS)

4. Community perspective and involvement

• Funded by NSF in NSDL, with P2P


Lightweight Protocols

• “Lightweight” protocols, i.e., relatively small and simple ones, seem to have clear advantages over “full” protocols that attempt to be comprehensive.

• The success of protocols considered lightweight is illuminating.

• Examples: TCP/IP, HTTP, LDAP, and the OAI PMH


Reference Models

• Reference Model: a common vocabulary and description of components, services, and inter-relationships that comprise a system under consideration

• Useful as a tool to foster consensus and common understanding in a time of rapid change and/or disagreement


OCKHAM Proposed Services

• Alerting
• Browsing
• Cataloging
• Conversion
• OAI – Z39.50
• Pathfinding
• Registry
• (plus others such as from adapted ODL)


Digitization and Preservation
Community and Activity (selected)

• Archivists worldwide
• International collaboration
  • Million book project in US, China, India (Reddy, Chen, Balakrishnan)
• US Library of Congress
  • Matching funds
  • American Memory
  • Infrastructure: NDIIPP
• Dutch National Library + IBM
• Associations: ARL, DLF
• People
  • Harnad: Self-archiving movement
  • Lorie: Universal virtual computer
  • Gladney: technology, philosophy (http://home.pacbell.net/hgladney/ddq_3_1.htm)
  • Besser, Trant, …


Outline

1. Introduction

2. Historical Perspective

3. Topical Perspective

4. Software Solutions
• Open Source: Greenstone, eprints, Kepler, DSpace, Fedora, ETD-DB, ODL
• Commercial: IBM Content Manager, VTLS’ VITAL
• Comparison: by capability - institutional repository, by environment (library, WWW, personal use)
• Evaluation, usability

5. Advanced Issues


Open Source DL Examples

• Eprints (www.eprints.org)

• Fedora

• Greenstone (www.greenstone.org)

• Many systems in NSF DLI projects

• VT systems: CITIDEL, CSTC, DL-in-a-box, ETANA, MARIAN, NCSTRL, NDLTD


What is a Digital Object Repository?

Also called: digital rep., digital asset rep., institutional repository

Stores and maintains digital objects (assets)

Provides external interface for Digital Objects
  Creation, Modification, Access

Enforces access policies

Provides for content type disseminations

Adapted from Slide by V. Chachra, VTLS


Goals of Institutional Repositories (by Stevan Harnad, U. Southampton)

• Self-archiving of institutional research
  • Theses and dissertations (VTLS NDLTD Project)
  • Article preprints and postprints
  • Internal documents and maps
• Management of digital collections
• Preservation of materials – decentralized approach
• Housing of teaching materials
• Electronic publishing of journals, books, posters, maps, audio, video and other multimedia objects

Adapted from Slide by V. Chachra, VTLS


Fedora™ Digital Object Architecture

• Persistent ID (PID): globally unique persistent id
• Disseminators: public view; access methods for obtaining “disseminations” of digital object content
• System Metadata: internal view; metadata necessary to manage the object
• Datastreams: protected view; content that makes up the “basis” of the object
  • EAD, TEI, DC, MARC, VRA Core, MIX, etc.
  • Images, E-books, E-journals, Music, Video, etc.

The Mellon Fedora Project

Adapted from Slide by V. Chachra, VTLS
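
The four parts of the digital object described on this slide (persistent ID, disseminators, system metadata, datastreams) can be pictured as a small data structure. The sketch below only illustrates that layout; the class, field, and method names are hypothetical and are not the Fedora API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class DigitalObject:
    """Illustrative container mirroring the four parts on the slide."""

    pid: str  # globally unique persistent ID
    # Internal view: metadata needed to manage the object.
    system_metadata: Dict[str, str] = field(default_factory=dict)
    # Protected view: the content and descriptive metadata themselves.
    datastreams: Dict[str, bytes] = field(default_factory=dict)
    # Public view: named access methods that produce disseminations.
    disseminators: Dict[str, Callable[["DigitalObject"], str]] = field(
        default_factory=dict
    )

    def disseminate(self, method: str) -> str:
        """Run a named access method and return its dissemination."""
        return self.disseminators[method](self)


def get_title(obj: DigitalObject) -> str:
    """Hypothetical disseminator: expose the title held in a datastream."""
    return obj.datastreams["DC"].decode("utf-8")


# Hypothetical object; the PID and contents are invented for illustration.
thesis = DigitalObject(
    pid="demo:1",
    system_metadata={"state": "active"},
    datastreams={"DC": b"A Sample Thesis"},
    disseminators={"getTitle": get_title},
)
```

Callers use `thesis.disseminate("getTitle")` rather than reading the datastream directly, which mirrors the split between the public and protected views.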


Fedora™ Repository

[Figure: Fedora repository architecture. Clients (Client Application, Batch Program, Server Application, Web Browser) connect over HTTP and SOAP to the Web Service Exposure Layer (Manage, Access, Search, OAI Provider). Other labeled parts: Management Subsystem (Component Mgmt, Object Mgmt, Object Validation, PID Generation); Access Subsystem (Object Dissemination, Object Reflection); Security Subsystem (Policy Enforcement, Policy Mgmt, Session Management, User Authentication, Policies, Users/Groups); Storage Subsystem (Digital Objects, Datastreams, XML Files, Relational DB); External Content Retriever reaching External Content Sources over HTTP and FTP; Local Service and Remote Service; Content.]

Adapted from Slide by V. Chachra, VTLS


[Figure: users on one side and digital objects (programs, documents, images, videos, drawn as bit strings) on the other, with a question mark between them.]


[Figure: a digital library, built as a monolithic and/or custom-built web-based application, placed in front of the digital objects (programs, documents, images, videos).]


[Figure: a componentized digital library in front of the same digital objects (programs, documents, images, videos); the components and their connections are drawn as question marks.]


[Figure: an open digital library in front of the same digital objects (programs, documents, images, videos); nodes are labeled OA and connections are labeled PMH or XPMH.]


Open Digital Library Protocol

Extended OAI-PMH

Protocol for Metadata Harvesting


Open Digital Library Component

Extended OPEN ARCHIVE

OPEN ARCHIVE


Open Digital Library Deployments

• NDLTD (www.ndltd.org)
• Computer Science Teaching Center (www.cstc.org)
• Computing and Information Technology Interactive Digital Educational Library (www.citidel.org)
• Open Archives Distributed (NSF, DFG) – enhancements to PhysNet
• OCKHAM
• Open to others through DL-in-a-box


Open Digital Library

• Network of Extended Open Archives, where each node acts as a provider of data, services, or both.

• Component = Node

• Protocol = Arc
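
Treating components as nodes and protocol connections as arcs means an open digital library can be described as a small directed graph. The sketch below is one way to picture that; the component names in `ARCS` and the helper functions are hypothetical, chosen to resemble a union-search-browse arrangement.

```python
# Each arc (consumer, provider) means "consumer harvests from provider".
# The component names are hypothetical examples.
ARCS = [
    ("Union", "ArchiveA"),
    ("Union", "ArchiveB"),
    ("Search", "Union"),
    ("Browse", "Union"),
]


def providers_of(node, arcs):
    """Return the components that `node` harvests from directly."""
    return sorted(p for c, p in arcs if c == node)


def upstream_sources(node, arcs):
    """Return every component whose data can reach `node` via arcs."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for provider in providers_of(current, arcs):
            if provider not in seen:
                seen.add(provider)
                stack.append(provider)
    return sorted(seen)
```

Here `upstream_sources("Search", ARCS)` lists the union component and both archives, showing that a service node can itself be a data provider for other nodes.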


Open Digital Library Components

• Running now
  • XML-File (data provider from file system)
  • Search: simple or in-memory (Essex) or generalized
  • Union, browse, recent, filter
  • E-journal/review, Submit, Edit, Annotation
  • Recommender, Rating; Mirroring (see JCDL’02)
  • Working with NCSA: from DB, unstructured text

• Others in process
  • Classification/categorization
  • Registry (and other connections with web services)


Example Open Digital Library

[Figure: ETD DL for the Networked Digital Library of Theses and Dissertations (www.ndltd.org). ETD collections (ETD-1 to ETD-4, among other digital objects) are exposed via PMH. Component labels: ODLUnion, ODLSearch, ODLBrowse, ODLRecent, Filter. The user interface offers Search, Browse, and Recent to students and researchers.]


OAI, ODL, DL-in-a-box

• Open Archives Initiative
  • since 1999, www.openarchives.org

• Open Digital Libraries
  • since 2001, from www.dlib.vt.edu
  • with Hussein Suleman (now U. Cape Town)

• DL-in-a-box
  • NSDL support since 2001
  • Aimed to help new collections / services projects
  • http://dlbox.nudl.org


Commercial DL Examples

• IBM Digital Library

• Virtua (www.vtls.com)
• Fedora -> VITAL
• Some systems from NSF DLI projects
• Google


Conceptual Category | Feature Name
Discovery Tools | Searching; Browsing; Syndication & Notification
Aggregation Tools | Personal Collections; Content Aggregator and Packaging Tool
Community & Evaluation | Evaluation System; Context Usage Illustrators; Wish Lists

(WCET LOR Study 2004)


Case Study: NCSTRL Costs/Benefits

Role | Stakeholders | Sample Potential Cost | Sample Potential Benefit
Providers | Faculty | Lower value for P&T | Faster publishing
 | Students | Less recognition | Broader set of outlets
 | Practitioners | Limited relevance | Ease of publishing, > quantity
Users | Faculty | Lower quality of work | Broader access to resources
 | Students | Higher access costs (vs. department available material) | Lower access costs (vs. journal available material)
 | Departments | New maintenance costs | Broader visibility
 | University libraries | Additional access costs | Access to new resources
 | Practitioners | More difficult access | Access to new resources


Outline

1. Introduction

2. Historical Perspective

3. Topical Perspective

4. Software Solutions

5. Advanced Issues

• Challenges, open problems

• Promising approaches


Digital Libraries --- Objectives

• World Lit.: 24hr / 7day / from desktop
• Integrated “super” information systems: 5S: streams, structures, spaces, scenarios, societies
• Ubiquitous, Higher Quality, Lower Cost
• Education, Knowledge Sharing, Discovery
• Disintermediation -> Collaboration
• Universities Reclaim Property
• Interactive Courseware, Student Works
• Scalable, Sustainable, Usable, Useful


DL Challenges

• Preservation - so people will trust DLs

• Supporting infrastructure - networks, ...

• Scalability, sustainability, interoperability

• DL industry - critical mass by covering libraries, archives, museums, corporate info, govt info, personal info - “quality WWW” integrating IR, HT, MM, ...

• Need tools & methods to make them easier to build


NDLTD: How can a university get involved?

• Select planning/implementation team

• Graduate School

• Library

• Computing / Information Technology

• Institutional Research / Educ. Tech.

• Join online, give us contact names

• www.ndltd.org/join

• Adapt Virginia Tech or other proven approach

• Build interest and consensus

• Start trial / allow optional submission

Student Gets Committee Signatures and Submits ETD

Signed

Grad School

Library Catalogs ETD, Access is Opened to the New Research

WWW

NDLTD

ETD Union Collection (OAI)

[Diagram labels. Archives: Virginia Tech ETD Archive, Brazil ETD Archive, OCLC ETD Archive, … Merged Metadata Collection. Services: VIRTUA, ODL (VT). Future: recommender, … Legend: OAI Data Provider, OAI Service Provider, OAI Harvesting.]

Union catalog: OCLC

• OCLC will expand OAI data provider on TDs.

• Is getting data from WorldCat (so, from many sites!).

• Will harvest from all others who contact them.

• Need DC and either ETD-MS or MARC.

• Has a set for ETDs.
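The harvesting described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical base URL (`https://example.org/oai`) and a hypothetical set name (`ETD`); neither comes from the talk. The `ListRecords` verb and the `metadataPrefix` and `set` arguments are standard OAI-PMH request parameters, and the sample response is hand-written so the sketch runs without network access.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL, optionally limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hand-written sample response (hypothetical record, not real harvested data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:etd-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


url = list_records_url("https://example.org/oai", set_spec="ETD")
records = parse_records(SAMPLE_RESPONSE)
```

A real harvester would also fetch the URL, follow resumption tokens, and handle error responses; those steps are omitted here.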

OCLC SRU Interface

Union catalog: VTLS, VT

• VTLS will enhance search/browse service for ETDs

• Will harvest from OCLC’s set of ETD records

• Will receive through other mechanisms

• Will work with MARC-21 and ETD-MS

• VT will continue to offer experimental services

ETD Union Search Mirror Site in China (CALIS) (http://ndltd.calis.edu.cn – popular site!)

VTLS Union Catalog: Content Languages

The VTLS NDLTD Union Catalog has data in 6 different languages: English, German, Greek, Korean, Portuguese, and Spanish.

Examples follow

Language = German; hits = 137

Full record display

Complex to Simple

MARC ($50) Dublin Core (DC)

+thesis
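The move from MARC to Dublin Core plus thesis information can be illustrated with a small crosswalk. This is a simplified sketch, not the official mapping: the handful of MARC field/subfield pairs shown follow common MARC-to-DC practice, the `thesis` key and the dictionary record layout are assumptions made for illustration, and the sample record is invented.

```python
# Simplified, illustrative crosswalk from MARC field+subfield to Dublin Core.
CROSSWALK = {
    "100a": "creator",      # main entry, personal name
    "245a": "title",        # title statement
    "260c": "date",         # date of publication
    "520a": "description",  # summary / abstract
    "650a": "subject",      # topical subject heading
}

# MARC 502 is the dissertation note; here it feeds a hypothetical "thesis" key.
THESIS_NOTE = "502a"


def marc_to_dc(marc):
    """Map a dict of MARC 'tag+subfield' -> list of values onto DC elements."""
    dc = {}
    for field, values in marc.items():
        element = CROSSWALK.get(field)
        if element:
            dc.setdefault(element, []).extend(values)
    if THESIS_NOTE in marc:
        dc["thesis"] = list(marc[THESIS_NOTE])
        dc.setdefault("type", []).append("Electronic Thesis or Dissertation")
    return dc


# Invented sample record; "999x" stands in for local data with no DC equivalent.
sample = {
    "100a": ["Doe, Jane"],
    "245a": ["A Sample Dissertation"],
    "260c": ["2004"],
    "502a": ["Thesis (Ph.D.)"],
    "999x": ["local data"],
}
record = marc_to_dc(sample)
```

Fields without a mapping are dropped, which is the price of going from complex to simple.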

Why ETD? Short Answer

• For Students:

• Gain knowledge and skills for the Information Age

• Richer communication (digital information, multimedia, …)

• For Universities:

• Easy way to enter the digital library field and benefit thereby

• For the World:

• Global digital library – large, useful, many services

• General:

• Save time and money

• Increased visibility for all associated with research results

ETANA-DL: 5S Extension

• 5S and component architecture to allow handling of very complex DL applications: archaeology

• Information visualization, clustering

• Mappings across streams, structure, spaces

Case Study (Archaeology): ETANA

• NSF ITR with CWRU (and Vanderbilt …)

• Faster DL development

• for complex application domains,

• with suitable tailoring

• Approach

• ODL – pool of components

• 5S – theory-based generation of systems

ETANA Website

Lahav Website

Megiddo Opening Screen

Locus Screen: Pictures

View all

Area Screen: Distribution of Artifacts

ETANA-DL Website

Archaeology DL – Approach

• Solve the following DL problems:

• interoperability,

• making primary data available,

• data preservation

• Modeling archaeological information systems

• using 5S theory to design system and services

• Rapidly prototyping DLs that handle

• heterogeneous archaeological data using

• componentized frameworks
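One way to picture the interoperability problem is a small mapping layer that converts each site's own record layout into a shared union record. The sketch below is only an illustration under stated assumptions: the site names (`SiteA`, `SiteB`), the field names, and the `UnionRecord` shape are all hypothetical and do not reproduce the actual ETANA-DL schema or components.

```python
from dataclasses import dataclass


@dataclass
class UnionRecord:
    """Hypothetical shared record shape for a union catalog."""
    site: str
    object_type: str
    locus: str
    description: str


# Per-site mapping from local field names to union field names (all invented).
FIELD_MAPS = {
    "SiteA": {"type": "object_type", "loc": "locus", "desc": "description"},
    "SiteB": {"artifact_kind": "object_type", "locus_id": "locus", "notes": "description"},
}


def to_union(site, raw):
    """Convert one site-specific record (a dict) into a UnionRecord."""
    mapping = FIELD_MAPS[site]
    fields = {
        target: str(raw[source])
        for source, target in mapping.items()
        if source in raw
    }
    return UnionRecord(
        site=site,
        object_type=fields.get("object_type", ""),
        locus=fields.get("locus", ""),
        description=fields.get("description", ""),
    )


# Two heterogeneous inputs end up in the same shape.
a = to_union("SiteA", {"type": "bone", "loc": "L12", "desc": "fragment"})
b = to_union("SiteB", {"artifact_kind": "seed", "locus_id": 7})
```

Adding a new site then means adding one mapping entry rather than changing the services that consume union records.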

ETANA-DL Schema Design

[Diagram labels: ETANA-DL Object; Bone, Seed, Figurine; Count, Animal, Species, Name, Description, Dimensions, …; Owner, Collection, Partition, Subpartition, Locus, Container, ID, …]

Data Mapping

ETANA-DL Architecture

[Diagram labels: Users, Services, Data; ETANA-DL Union; DigBase; DigKit]

ETANA-DL Architecture: DigBase and DigKit

[Diagram labels. Sites: Lahav, Nimrin, Umayri, Hisban, Megiddo, Jalul, New Sites. Database wrappers. ETANA-DL union catalog. User interface: Search, Browse, Recommend, Note, Personalize, Review, Visualizations, Archaeology Specific. Work in progress.]

ETANA-DL Architecture

[Diagram labels: Archaeological Site; ETANA-DL; DigBase; DB; Data Mapping Component; OAI Data Provider; OAI; XOAI; Union Catalog; Inverted Files; Index; Search Component; Browse Component; Browse DB; Services DB; Other ETANA-DL Services; Web Interface; DigKit; Configure]

Searching – Search Results

Searching – Advanced Search

Searching – Advanced Search Results

Summary

1. Introduction

2. Historical Perspective

3. Topical Perspective

4. Software Solutions

5. Advanced Issues

Selected Links - http://fox.cs.vt.edu

• CITIDEL (computing education resources): www.citidel.org

• NCSTRL (computing technical reports): www.ncstrl.org

• NDLTD (electronic theses and dissertations worldwide): www.ndltd.org and etdguide.org

• NSDL (National Science Digital Library): www.nsdl.org

• OAI (Open Archives Initiative): www.openarchives.org

• Virginia Tech Digital Library Research Laboratory (DLRL, www.dlib.vt.edu)

• 5S, AmericanSouth.Org, CSTC, DL-in-a-box, ENVISION, ETANA, MARIAN, NDLTD, NSDL, OAD, ODL, …

Questions/Discussion?