CS 502: Computing Methods for Digital Libraries, Lecture 20: Multimedia digital libraries

Post on 20-Dec-2015

CS 502: Computing Methods for Digital Libraries

Lecture 20

Multimedia digital libraries

Administration

Discussion classes

• attend!

• speak!

Assignment 3

• email messages have been forwarded

Harvest

Gatherer

• Program that collects indexing information from digital library collections.

• Most effective when installed on the same system as the collections.

• Allows indexes to include materials with restricted access.

Broker

• Builds a combined index with information about many collections. (Also called a union catalog.)

Harvest

[Figure: Harvest architecture. Labels: Repositories, Gatherer, Metadata (shown twice), Search Systems, Users]

Harvest

Gatherers

• need not be restricted to web pages or any specific format

• can incorporate dictionaries or lexicons for specialized topic areas

• combine the benefits of local indexing with a central index for users

• particularly effective for federated digital libraries.

In a federation, each library can run its own gatherer and transmit indexing information to brokers that build consolidated indexes for the entire library.
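The gatherer and broker roles can be sketched in a few lines. This is a minimal illustration, not Harvest's actual implementation: the function names `gather` and `build_union_index`, the record layout, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict

def gather(library_name, documents):
    """Hypothetical gatherer: runs next to a collection and emits index records."""
    records = []
    for doc_id, text in documents.items():
        terms = sorted(set(text.lower().split()))
        records.append({"library": library_name, "doc_id": doc_id, "terms": terms})
    return records

def build_union_index(record_batches):
    """Hypothetical broker: merges records from many gatherers into one index."""
    index = defaultdict(set)
    for batch in record_batches:
        for record in batch:
            for term in record["terms"]:
                index[term].add((record["library"], record["doc_id"]))
    return index

# Each library in the federation runs its own gatherer (sample data is invented).
maps_batch = gather("maps", {"m1": "Aerial photographs of Santa Barbara"})
news_batch = gather("news", {"n1": "Santa Barbara news broadcast"})

# The broker sees only the indexing records, never the documents themselves.
union_index = build_union_index([maps_batch, news_batch])
print(sorted(union_index["barbara"]))
```

Because only the indexing records leave each library, a collection with restricted access can still appear in the combined index.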

Multimedia 1: Geospatial Information

Example: Alexandria Digital Library at the University of California, Santa Barbara

• Funded by the NSF Digital Libraries Initiative from 1994.

• Collections include any data referenced by a geographical footprint: terrestrial maps, aerial and satellite photographs, astronomical maps, databases, and related textual information.

• Program of research with practical implementation at the university's map library

Alexandria user interface

Computer systems and user interfaces

Computer systems

• Digitized maps and geospatial information -- large files

• Wavelets provide multi-level decomposition of image

first level is a small coarse image; extra levels provide greater detail
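Multi-level decomposition can be illustrated with the Haar wavelet on a single row of pixels. The slides do not say which wavelet Alexandria uses, so Haar is an assumption chosen for simplicity. The function names and the sample row are also hypothetical, and the row length must be divisible by 2 for each level.

```python
def haar_step(signal):
    """One level: pairwise averages (coarse) and pairwise half-differences (detail)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    coarse = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return coarse, detail

def decompose(signal, levels):
    """Return the coarsest version plus the detail layers, finest layer first."""
    coarse, details = list(signal), []
    for _ in range(levels):
        coarse, detail = haar_step(coarse)
        details.append(detail)
    return coarse, details

def refine(coarse, detail):
    """Add one detail layer back, doubling the resolution."""
    out = []
    for c, d in zip(coarse, detail):
        out.extend([c + d, c - d])
    return out

row = [8, 6, 2, 4, 5, 7, 9, 9]           # invented pixel values
coarse, details = decompose(row, levels=2)

# A client can display `coarse` first, then refine as more layers arrive.
image = coarse
for detail in reversed(details):          # coarsest detail layer first
    image = refine(image, detail)
print(coarse, image)
```

Sending the small coarse version first lets a user see something quickly over a slow connection, with detail layers requested only when needed.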

User interfaces

• Small size of computer displays

• Slow performance of Internet in delivering large files

retain state throughout a session

Alexandria: information discovery

Metadata for information discovery

Coverage: geographical area covered, such as the city of Santa Barbara or the Pacific Ocean.

Scope: varieties of information, such as topographical features, political boundaries, or population density.

Latitude and longitude provide basic metadata for maps and for geographical features.
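A search by coverage can be sketched as a bounding-box overlap test on latitude and longitude. This is an illustration only: the `Footprint` class, the catalog entries, and their coordinates are invented, and the test ignores boxes that cross the 180° meridian.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Footprint:
    """Bounding box in decimal degrees (south/north latitude, west/east longitude)."""
    south: float
    west: float
    north: float
    east: float

def overlaps(a, b):
    """True if the two boxes share any area or edge."""
    return (a.south <= b.north and b.south <= a.north
            and a.west <= b.east and b.west <= a.east)

def find_items(query, catalog):
    """Return the ids of catalog items whose footprint overlaps the query box."""
    return [item_id for item_id, fp in catalog.items() if overlaps(query, fp)]

# Hypothetical catalog: item ids and boxes are made up for the example.
catalog = {
    "map-A": Footprint(34.0, -120.0, 35.0, -119.0),
    "photo-B": Footprint(10.0, 10.0, 20.0, 20.0),
}
query = Footprint(34.3, -119.9, 34.5, -119.6)
print(find_items(query, catalog))
```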

Gazetteer

Gazetteer: database and a set of procedures that translate between representations of geospatial references:

place names, geographic features, coordinates, postal codes, census tracts

Search engine tailored to peculiarities of searching for place names.
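A toy gazetteer lookup shows the idea of translating a place name into coordinates while tolerating spelling variation. It is a sketch under stated assumptions: the table, the `normalize` and `lookup` helpers, and the 0.8 similarity cutoff are hypothetical, and the coordinates are rounded approximations for illustration only. Fuzzy matching uses the standard library's `difflib.get_close_matches`.

```python
import difflib
import string

# Hypothetical gazetteer: normalized place name -> approximate (latitude, longitude).
GAZETTEER = {
    "santa barbara": (34.4, -119.7),
    "los angeles": (34.05, -118.24),
}

def normalize(name):
    """Lower-case, strip punctuation, and collapse whitespace."""
    cleaned = name.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())

def lookup(name, cutoff=0.8):
    """Return (matched_name, coordinates), or None if nothing is close enough."""
    key = normalize(name)
    if key in GAZETTEER:
        return key, GAZETTEER[key]
    close = difflib.get_close_matches(key, list(GAZETTEER), n=1, cutoff=cutoff)
    if close:
        return close[0], GAZETTEER[close[0]]
    return None

print(lookup("SANTA BARBARA."))   # exact match after normalization
print(lookup("Santa Barbra"))     # misspelling resolved by fuzzy matching
print(lookup("Atlantis"))         # no sufficiently close entry
```

A real gazetteer would also handle alternate names, historical names, and the other reference types listed above, such as postal codes and census tracts.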

Research is making steady progress at feature extraction, using automatic programs to identify objects in aerial photographs or printed maps -- topic for long-term research.

Multimedia 2: video segmentation

Example: Informedia research program at Carnegie Mellon University

• objects in library -- broadcast news and documentary programs

• more than one thousand hours of digitized video from Cable News Network, the British Open University, and WQED television

• automatically broken into short segments of video, such as the individual items in a news broadcast

• automatic methods for extracting information from the video -- populate the library with minimal human intervention
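One common way to break video into segments is to look for abrupt changes between consecutive frames. The sketch below compares grayscale histograms. It is not necessarily the method Informedia uses. The frame representation (flat lists of 0-255 pixel values), the bin count, the threshold, and the function names are all assumptions.

```python
def histogram(frame, bins=4):
    """Normalized grayscale histogram of a frame given as a flat list of pixels."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    return [c / len(frame) for c in counts]

def histogram_distance(h1, h2):
    """Half the L1 distance: 0.0 for identical histograms, 1.0 for disjoint ones."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 2

def find_cuts(frames, threshold=0.5):
    """Indices where a frame differs sharply from the frame before it."""
    cuts = []
    for i in range(1, len(frames)):
        d = histogram_distance(histogram(frames[i - 1]), histogram(frames[i]))
        if d > threshold:
            cuts.append(i)
    return cuts

def split_segments(frames, cuts):
    """Split the frame sequence at the detected cut positions."""
    bounds = [0] + cuts + [len(frames)]
    return [frames[a:b] for a, b in zip(bounds, bounds[1:])]

# Invented frames: two dark frames followed by two bright ones.
dark, dark2 = [10, 20, 30, 40], [12, 22, 28, 41]
bright = [200, 220, 240, 250]
frames = [dark, dark2, bright, bright]

cuts = find_cuts(frames)
segments = split_segments(frames, cuts)
print(cuts, [len(s) for s in segments])
```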

Automatic processing

Video skimming:

automatic methods to extract important words and images from the video, conveying the essence of the full video segment

Text extraction:

• speech recognition from sound track

• closed captions

• text on screen

Multi-modal information discovery

The multi-modal approach to information retrieval

Computer programs analyze video materials for clues, e.g., changes of scene

• methods from artificial intelligence, e.g., speech recognition, natural language processing, image recognition

• analysis of video track, sound track, closed captioning if present, any other information

Each mode gives imperfect information but combining the evidence from all can be effective.
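One simple way to combine imperfect evidence is a noisy-OR: treat each mode's score as the probability that the segment is relevant, and compute the probability that at least one mode is right. This is an illustrative assumption, not the method described in the lecture. It presumes the modes are independent, and the scores and segment names below are invented.

```python
import math

def combine_evidence(scores):
    """Noisy-OR over per-mode scores in [0, 1], assuming the modes are independent."""
    return 1.0 - math.prod(1.0 - s for s in scores.values())

# Hypothetical per-mode relevance scores for two video segments.
segments = {
    "segment-1": {"speech": 0.6, "captions": 0.5, "image": 0.3},
    "segment-2": {"speech": 0.2, "captions": 0.1},
}

# Rank segments by their combined score, highest first.
ranked = sorted(segments, key=lambda s: combine_evidence(segments[s]), reverse=True)
for name in ranked:
    print(name, round(combine_evidence(segments[name]), 2))
```

The combined score is never lower than the best single mode, which reflects the point that weak clues from several modes can add up to useful evidence.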

Combined browsing and searching

Browsing

• see what books are stored together

• begin with one item and then move to the items that it refers to -- citations or hyperlinks

Measures of effectiveness

• evaluating the effectiveness of information retrieval in an interactive session with the user in the loop -- unsolved

Overall

• thorough information discovery hard except in limited circumstances

• most users find most of what they want