Post on 20-Dec-2015
1
CS 502: Computing Methods for Digital Libraries
Lecture 20
Multimedia digital libraries
2
Administration
Discussion classes
• attend!
• speak!
Assignment 3
• email messages have been forwarded
3
Harvest
Gatherer
• Program that collects indexing information from digital library collections.
• Most effective when installed on the same system as the collections it indexes.
• Allows indexes to include materials with restricted access.
Broker
• Builds a combined index with information about many collections. (Also called a union catalog.)
4
Harvest
[Diagram of the Harvest architecture. Labels: Repositories, Gatherer, Metadata, Search Systems, Users]
5
Harvest
Gatherers
• need not be restricted to web pages or any specific format
• can incorporate dictionaries or lexicons for specialized topic areas
• combine the benefits of local indexing with a central index for users
• particularly effective for federated digital libraries.
In a federation, each library can run its own gatherer and transmit indexing information to brokers that build consolidated indexes for the entire library.
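The gatherer and broker roles described above can be sketched in a few lines. This is a minimal illustration, not Harvest's actual software or record format: the names `gather`, `Broker`, `library_a`, and `library_b` are hypothetical, and the "indexing information" is assumed to be nothing more than the set of words in each document.

```python
from collections import defaultdict


def gather(collection_name, documents):
    """Gatherer: runs next to one collection and emits compact index records."""
    records = []
    for doc_id, text in documents.items():
        terms = sorted(set(text.lower().split()))
        records.append({"collection": collection_name, "id": doc_id, "terms": terms})
    return records


class Broker:
    """Broker: merges records from many gatherers into one combined index."""

    def __init__(self):
        # term -> set of (collection, document id) pairs
        self.index = defaultdict(set)

    def add(self, records):
        for rec in records:
            for term in rec["terms"]:
                self.index[term].add((rec["collection"], rec["id"]))

    def search(self, term):
        return sorted(self.index.get(term.lower(), set()))


# Two hypothetical libraries in a federation, each with its own gatherer.
library_a = {"a1": "Maps of Santa Barbara", "a2": "Satellite photographs"}
library_b = {"b1": "Historic maps of the Pacific Ocean"}

broker = Broker()
broker.add(gather("library_a", library_a))
broker.add(gather("library_b", library_b))

print(broker.search("maps"))
```

Only the small index records travel to the broker; the documents themselves stay in their home collections, which is why a gatherer can index restricted material without exposing it.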
6
Multimedia 1: Geospatial Information
Example: Alexandria Digital Library at the University of California, Santa Barbara
• Funded by the NSF Digital Libraries Initiative from 1994.
• Collections include any data referenced by a geographical footprint.
terrestrial maps, aerial and satellite photographs, astronomical maps, databases, related textual information
• Program of research with practical implementation at the university's map library
7
Alexandria user interface
8
Computer systems and user interfaces
Computer systems
• Digitized maps and geospatial information -- large files
• Wavelets provide multi-level decomposition of image
first level is a small coarse image
extra levels provide greater detail
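To make the multi-level idea concrete, here is a small sketch using the Haar wavelet on a single row of pixel values. The choice of Haar and the one-dimensional simplification are assumptions made for illustration; the slides do not say which wavelet Alexandria uses. The row length is assumed to be divisible by 2 raised to the number of levels.

```python
def haar_step(signal):
    """Split a signal into pairwise averages (coarse) and half-differences (detail)."""
    averages = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return averages, details


def decompose(signal, levels):
    """Apply haar_step repeatedly; returns the coarsest signal and all detail levels."""
    coarse = list(signal)
    detail_levels = []
    for _ in range(levels):
        coarse, details = haar_step(coarse)
        detail_levels.append(details)
    return coarse, detail_levels


def refine(coarse, details):
    """Add one level of detail back, doubling the resolution."""
    finer = []
    for avg, diff in zip(coarse, details):
        finer.extend([avg + diff, avg - diff])
    return finer


row = [8, 6, 2, 4, 9, 7, 5, 1]           # one row of pixel intensities
coarse, detail_levels = decompose(row, levels=2)

# A client can display `coarse` first, then refine as more levels arrive.
image = coarse
for details in reversed(detail_levels):
    image = refine(image, details)

print(coarse)   # small, coarse version of the row
print(image)    # full-resolution row after all detail levels are applied
```

The coarse version is a quarter of the original size, so it can be delivered and shown quickly; each extra detail level is sent only if the user wants a closer look.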
User interfaces
• Small size of computer displays
• Slow performance of Internet in delivering large files
retain state throughout a session
9
Alexandria: information discovery
Metadata for information discovery
Coverage: geographical area covered, such as the city of Santa Barbara or the Pacific Ocean.
Scope: varieties of information, such as topographical features, political boundaries, or population density.
Latitude and longitude provide basic metadata for maps and for geographical features.
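A simple way to use latitude and longitude as coverage metadata is to store a bounding box (a "footprint") for each item and test it against the area the user asks for. The sketch below assumes rectangular footprints that do not cross the 180° meridian; the `Footprint` class, the catalog entries, and the coordinates are rough illustrative values, not Alexandria's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Bounding box in decimal degrees (west/east = longitude, south/north = latitude)."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other):
        # Two boxes overlap when their ranges intersect on both axes.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


# Hypothetical catalog: item title -> approximate footprint.
catalog = {
    "Santa Barbara street map": Footprint(-119.9, 34.3, -119.6, 34.5),
    "California coast aerial photographs": Footprint(-124.5, 32.5, -117.0, 42.0),
    "Pittsburgh topographic map": Footprint(-80.1, 40.3, -79.8, 40.6),
}

query = Footprint(-119.8, 34.35, -119.65, 34.45)   # area drawn by the user
hits = [title for title, box in catalog.items() if box.overlaps(query)]
print(hits)
```

Scope metadata (topography, political boundaries, and so on) would then be applied as an ordinary filter on the items that pass the footprint test.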
10
Gazetteer
Gazetteer: database and a set of procedures that translate representations of geospatial references:
place names, geographic features, coordinates
postal codes, census tracts
Search engine tailored to peculiarities of searching for place names.
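A toy gazetteer shows the two parts named above: a database of entries and a procedure that translates a place name into coordinates. The normalization step stands in for the "peculiarities" of place-name searching (case, accents, punctuation, spelling variants). The `Gazetteer` class, its fields, and the sample entries are hypothetical, and the coordinates are approximate.

```python
import unicodedata


def normalize(name):
    """Fold case, accents, and punctuation so name variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in stripped.lower())
    return " ".join(cleaned.split())


class Gazetteer:
    def __init__(self):
        self._by_name = {}   # normalized name -> entry

    def add(self, names, kind, lat, lon):
        """Register one place under its preferred name and any variants."""
        entry = {"name": names[0], "type": kind, "lat": lat, "lon": lon}
        for name in names:
            self._by_name[normalize(name)] = entry

    def lookup(self, name):
        """Translate a place name to its entry, or None if unknown."""
        return self._by_name.get(normalize(name))


gazetteer = Gazetteer()
gazetteer.add(["Santa Barbara", "Sta. Barbara"], "city", 34.42, -119.70)
gazetteer.add(["Zürich", "Zurich"], "city", 47.37, 8.54)

print(gazetteer.lookup("SANTA  BARBARA"))
print(gazetteer.lookup("zurich"))
```

A real gazetteer would also translate in the other directions (coordinates to names, postal codes to footprints) and handle ambiguous names that refer to several places.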
Research is making steady progress at feature extraction, using automatic programs to identify objects in aerial photographs or printed maps -- topic for long-term research.
11
Multimedia 2: video segmentation
Example: Informedia research program at Carnegie Mellon University
• objects in library -- broadcast news and documentary programs
• more than one thousand hours of digitized video:
Cable News Network, British Open University, WQED television
• automatically broken into short segments of video, such as the individual items in a news broadcast
• automatic methods for extracting information from the video -- populate the library with minimal human intervention
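One standard way to break video into segments automatically is to compare the brightness histograms of consecutive frames and declare a cut where they differ sharply. The sketch below illustrates that general technique only; it is not a description of Informedia's own segmentation method. Frames are assumed to be flat lists of gray values from 0 to 255, and the bin count and threshold are arbitrary illustrative choices.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized brightness histogram of one frame (a flat list of gray values)."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [c / len(frame) for c in counts]


def histogram_distance(h1, h2):
    """Half the L1 distance: 0.0 for identical histograms, 1.0 for disjoint ones."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 2


def find_cuts(frames, threshold=0.5):
    """Indices of frames that start a new scene."""
    cuts = []
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        if histogram_distance(previous, current) > threshold:
            cuts.append(index)
        previous = current
    return cuts


def segments(frames, threshold=0.5):
    """(start, end) frame ranges, end exclusive, one per detected segment."""
    bounds = [0] + find_cuts(frames, threshold) + [len(frames)]
    return list(zip(bounds[:-1], bounds[1:]))


dark = [10, 20, 30, 40]
bright = [200, 210, 220, 230]
frames = [dark, dark, [12, 22, 28, 70], bright, bright, dark]

print(find_cuts(frames))
print(segments(frames))
```

The third frame differs slightly from the first two but stays below the threshold, so it remains in the same segment; only the abrupt dark-to-bright and bright-to-dark changes are treated as cuts.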
12
13
Automatic processing
Video skimming:
automatic methods to extract important words and images from the video -- conveys the essence of the full video segment
Text extraction:
• speech recognition from sound track
• closed captions
• text on screen
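Once text has been extracted from a segment, some rule is needed to decide which words are "important". The sketch below uses TF-IDF weighting, a common choice in information retrieval; this is an assumed stand-in, since the slides do not say how Informedia selects words. The `transcripts` data and the function name `top_words` are hypothetical.

```python
import math
from collections import Counter


def top_words(transcripts, segment_id, k=3):
    """Rank a segment's words by term frequency times inverse document frequency."""
    doc_freq = Counter()
    for words in transcripts.values():
        doc_freq.update(set(words))          # count segments containing each word
    n_docs = len(transcripts)
    counts = Counter(transcripts[segment_id])
    scores = {w: c * math.log(n_docs / doc_freq[w]) for w, c in counts.items()}
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:k]


# Hypothetical text extracted from three news segments.
transcripts = {
    "news_1": "the election results the election night".split(),
    "news_2": "the storm hits the coast".split(),
    "news_3": "the market results".split(),
}

print(top_words(transcripts, "news_1", k=2))
```

Words that appear in every segment, such as "the", score zero and drop out, while words concentrated in one segment rise to the top.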
14
Multi-modal information discovery
The multi-modal approach to information retrieval
Computer programs analyze video materials for clues, e.g., changes of scene
• methods from artificial intelligence, e.g., speech recognition, natural language processing, image recognition.
• analysis of video track, sound track, closed captioning if present, any other information.
Each mode gives imperfect information but combining the evidence from all can be effective.
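A small sketch of why combining imperfect modes helps. It assumes each mode reports a probability-like score that a segment is relevant, and that the modes err independently, so the scores can be merged with a "noisy-OR" rule. Both the rule and the sample scores are illustrative assumptions, not the Informedia method.

```python
def combine_evidence(scores):
    """Noisy-OR: probability that at least one mode is right, assuming independence."""
    missed = 1.0
    for p in scores.values():
        missed *= (1.0 - p)      # chance that every mode so far is wrong
    return 1.0 - missed


def rank_segments(evidence):
    """Sort segments by combined score, highest first."""
    combined = {seg: combine_evidence(modes) for seg, modes in evidence.items()}
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-mode scores for one query.
evidence = {
    "segment_1": {"speech": 0.5, "captions": 0.5},
    "segment_2": {"speech": 0.6},
    "segment_3": {"image": 0.2, "screen_text": 0.5, "speech": 0.5},
}

for segment, score in rank_segments(evidence):
    print(segment, round(score, 2))
```

Here segment_2 has the single strongest clue, yet segment_3 ranks first because three weak clues agree. That is the sense in which combining evidence from all modes can be effective even when each mode alone is unreliable.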
15
Combined browsing and searching
Browsing
• see what books are stored together
• begin with one item and then move to the items that it refers to -- citations or hyperlinks
Measures of effectiveness
• evaluating the effectiveness of information retrieval in an interactive session with the user in the loop -- unsolved
Overall
• thorough information discovery hard except in limited circumstances
• most users find most of what they want