Dec 9-11, 2003, ICADL 2003
A Case Study of a Stream-Based Digital Library: Medical Data
Mohamed Kholief
Kurt Maly
Stewart Shen

Overview
Introduction
Approach
Architecture
Implementation of the case study
Conclusions and future work

Introduction
Data streams are important sources of information for applications such as:
  News-on-demand
  Weather services
  Scientific research
  …
A data stream is a sequence of data units produced over a period of time
Examples: video, audio, sensor readings, …

Introduction
Saving data streams in digital libraries is advantageous:
  Archival
  Preservation
  Administration
  Access control
  …

Introduction
The most common method of retrieval from digital libraries is by using bibliographic information
Content-based retrieval from multimedia digital libraries is a hot research area
We introduce event-based retrieval
An event is a noteworthy occurrence during the stream
There is psychological evidence that events are easier to remember than the specific time instances at which they occurred. Searching data streams in a digital library by events gives users extra flexibility and allows for faster and more precise retrieval

Approach
Considering some potential applications:
  digital libraries for: stock market, news streams, census bureau statistics, weather, sports games, and the educational environment
Forming categories of possible users and their basic requirements
Identifying a list of design considerations
Implementing a medical digital library to illustrate and validate our approach

Digital Library Architecture
Data streams organization
Events organization
Retrieval architecture
All to be presented in the context of the case study.

Case Study: A Medical Digital Library
Actual CT scan streams, sample text and audio streams
Domain expert – radiologist:
  Provided the CT scans
  Generated sample events
  Specified the specific metadata fields
  Described the mapping between the raw metadata format and the bibliographic metadata

Structure of the Stream Object
Stream Object Folder
  Data Folder
    A Stream Data File
  Metadata Folder
    Temporal Metadata
    Bib Metadata
    Specific Metadata

Stream Metadata
Standard Bib: ID, ENTRY, TITLE, TYPE, CREATOR, DESCRIPTION, FORMAT, SOURCE, START_DATE, END_DATE
Type Specific Bib: patient name, birth date, age, sex, history, slice thickness, contrast, and reason
Temporal: timing information of stream frames
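
The three metadata sets of a stream could be held together roughly as follows. This is a minimal sketch, not the library's actual schema: the class name `StreamMetadata`, the method `missing_standard_fields`, and all sample values are hypothetical; only the standard field names come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Standard bibliographic field names, as listed on the slide.
STANDARD_BIB_FIELDS = (
    "ID", "ENTRY", "TITLE", "TYPE", "CREATOR", "DESCRIPTION",
    "FORMAT", "SOURCE", "START_DATE", "END_DATE",
)


@dataclass
class StreamMetadata:
    """Hypothetical in-memory view of one stream's three metadata sets."""
    standard_bib: Dict[str, str]                                # shared by all stream types
    specific_bib: Dict[str, str] = field(default_factory=dict)  # type-specific fields
    temporal: List[float] = field(default_factory=list)         # one timestamp per frame

    def missing_standard_fields(self) -> List[str]:
        """Return the standard fields that have not been filled in yet."""
        return [name for name in STANDARD_BIB_FIELDS if name not in self.standard_bib]


# Illustrative values only; they do not come from the case study.
ct_scan = StreamMetadata(
    standard_bib={"ID": "ct-0001", "TITLE": "Chest CT", "TYPE": "CT scan"},
    specific_bib={"slice thickness": "5 mm", "contrast": "yes"},
    temporal=[0.0, 1.0, 2.0],
)
missing = ct_scan.missing_standard_fields()
```

Keeping the type-specific fields in a separate mapping mirrors the split between standard and type-specific bibliographic metadata, so a new stream type needs no change to the standard set.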

Structure of the Digital Library Repository
[Figure labels: Digital Library Repository; Archive 1, Archive 2, …, Archive n; Stream Objects]

The ER Diagram of the Database

Publishing
Mass publishing using PERL scripts
Individual publishing manually
In a future extension: a publishing tool
[Figure labels: Publisher; Stream Type; Stream Templates Pool; Stream Template; Stream Data and metadata; Publishing Tool; Stream Object; Indexing Tool; Streams Repository]

Events
The following events were used:
  Calcification
  Tumor started to appear
  Maximum diameter of the tumor
  Necrosis appearing
  Maximum necrosis diameter
Event information and time instances were manually inserted in the database

Structure of the Event Object
Event
  Bib. Metadata
  Time Instances List
  Related Streams List
  Event Criteria
  Display Method

Structure of the Event-Class Object
Bibliographic Metadata (e.g. id, type, description, terms and conditions)
Event-class display method
List of events of that type
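
The event object and the event-class object can be pictured as two small record types. The sketch below is an assumption-laden illustration: the class names, the `display` methods, the plain-text criteria field, and the sample values are hypothetical stand-ins for the components named on the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    """One event: bibliographic metadata, time instances, related streams, criteria."""
    bib_metadata: Dict[str, str]
    time_instances: List[float] = field(default_factory=list)
    related_streams: List[str] = field(default_factory=list)
    criteria: str = ""  # assumption: criteria kept as a plain description

    def display(self) -> str:
        """Hypothetical display method: event name followed by its time instances."""
        times = ", ".join(f"{t:g}" for t in self.time_instances)
        return f"{self.bib_metadata.get('name', '?')} at [{times}]"


@dataclass
class EventClass:
    """A class of events: its own metadata plus the list of events of that type."""
    bib_metadata: Dict[str, str]
    events: List[Event] = field(default_factory=list)

    def display(self) -> str:
        """Hypothetical event-class display method: one line per event."""
        return "\n".join(event.display() for event in self.events)


# Illustrative values only.
calcification = Event(
    bib_metadata={"name": "Calcification"},
    time_instances=[12.0, 40.5],
    related_streams=["ct-0001"],
)
tumor_events = EventClass(bib_metadata={"type": "tumor"}, events=[calcification])
```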

The Event Generation Process
[Figure labels: Domain Experts; Specifications (Bib metadata, Event criteria, Related streams); Event Generation Tool; Event Template (Criteria Module, Display Module, Bibliographic Fields, Related Streams List); Event Templates Repository; Digital library administrators; Event Editing Tool; Related Stream Objects; Event Object (Bibliographic metadata, Related streams list, Time instances list, Criteria Module, Display Module)]

Related Streams
Represented in database tables
Generated in two ways:
  Automated
    PERL scripts used to relate streams using simple criteria (e.g. same patient)
  Manual
    Insert directly into database
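
The automated path, relating streams by a simple criterion such as "same patient", might look like the sketch below. The case study used PERL scripts against database tables; this Python version, the function name `relate_by_patient`, the record keys, and the sample records are assumptions made only to illustrate the idea.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple


def relate_by_patient(streams: List[Dict[str, str]]) -> Set[Tuple[str, str]]:
    """Return pairs of stream ids whose records share the same patient name."""
    by_patient = defaultdict(list)
    for stream in streams:
        by_patient[stream["patient_name"]].append(stream["id"])

    pairs = set()
    for ids in by_patient.values():
        # Every pair of streams for one patient is considered related.
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


# Illustrative records only.
streams = [
    {"id": "ct-1", "patient_name": "Patient A"},
    {"id": "audio-1", "patient_name": "Patient A"},
    {"id": "ct-2", "patient_name": "Patient B"},
]
related = relate_by_patient(streams)
```

Each resulting pair would correspond to one row in a related-streams table.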

Retrieval Architecture
[Figure labels: Resource Discovery user; Web Client (Search Interface, Search Results, Object’s Display Interface, Control Applet, Player Applets); Web Server (Search Script, Display Script, Control Script, Streaming Server); Storage (Metadata, Objects Repository, Database Server, File System)]

The User Interface
Search: event-based, simple, browse
Search results
Stream display
Playback interface
Demo

The Simple Search Form

The Event Based Search Form

The Events Menu

Search Results

Browsing Results

Stream Display Interface

Sample Playback Interface

Playing Synchronous Streams

Conclusions
Introduced a new approach to retrieving data streams from digital libraries that is natural and powerful
  Showed the feasibility of using events for retrieval from digital libraries
  Provided a new paradigm for medical digital libraries
    Accessing medical streams more efficiently
Key to the specification of events and what metadata to use were the domain experts (radiologists)
The library prototype can be adapted to easily support more stream types
The system is efficient and scales to support a production DL

Future Work
Event generation tool
Automatic event generation
Dynamic event generation
Advanced search interface
OAI compliance
Manipulating a greater number of events
Event-related streams
Further automating the process of relating streams
Publishing tool
More applications

Introduction
Digital libraries are storehouses of information available through the Internet that provide ways to collect, store, and organize data and make it accessible for search, retrieval, and processing
Digital library collections are not limited to document resources: they extend to digital artifacts that cannot be represented or distributed in printed formats. A digital library may contain text, images, sound and video in an integrated repository

Motivation
First, some realizations from the (old) IRI system.
  Many streams are used: video, audio, button clicks, slides, …
  Playback of recorded sessions required much knowledge
    session title, date, and time.
    use of a slider for random access.
    Using events seemed more intuitive.
  Playback could only be done through IRI.
    Using the web seemed more flexible.
  Only students could access the recordings.
    Using the web would increase the audience.
    Using DL’s would still provide access control.
Then, generalization to any stream and for other applications such as weather services.

Objective
To study the issues involved in building a digital library that:
  contains data streams
  allows event-based retrieval
  allows other traditional retrieval approaches.

Objective Details
Building a streams-based digital library
Supporting event-based retrieval
Developing a prototype for the digital library
Developing a digital library application of the proposed prototype

Issues of Building a Streams-Based Digital Library
Data organization
Required metadata
Services provided to users
Other retrieval approaches

Issues of Supporting Event-Based Retrieval
Creating events and inserting them
The physical representation
Event metadata
Using events for retrieval
Representing events that depend on many streams
Using events to synchronize the playback of multiple streams

Issues of Developing a Prototype
Process of deciding on important streams
Process of determining events to be considered
Process of applying the prototype for an application
Deciding on software components to be included in the prototype
Supporting extensibility in the prototype

Developing an Application
A repository
  A few hundred data streams of text, images, and audio formats
Search interfaces
  employ event-based retrieval
  retrieval based on bibliographic information
A player application for each supported stream type and the corresponding streaming servers
A control module
A process for extending the digital library to support any new stream type or format

System Model
A frame: $f = (t, d)$; or $f(t) = d$
A stream: $s = \{f(t_0), f(t_1), f(t_2), \ldots\}$
An event: $E = (\text{name}, T)$, where $T = \{t_i, t_j, \ldots\}$
Criteria of an event $E$: $C_E$
$T = \{t_i \mid C_E \text{ is not satisfied for } f(t_{i-1}) \text{ and is satisfied for } f(t_i)\}$
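
The definition of T says an event instance is recorded at each time where the criteria switch from unsatisfied to satisfied. A minimal sketch of that rule follows; the function name `event_times`, the tumor-diameter readings, and the choice to treat the criteria as unsatisfied before the first frame are assumptions, since the slide does not define the case i = 0.

```python
from typing import Any, Callable, List, Tuple

Frame = Tuple[float, Any]  # a frame f = (t, d): timestamp and data unit


def event_times(stream: List[Frame], criteria: Callable[[Any], bool]) -> List[float]:
    """Return the times t_i where criteria fail for f(t_{i-1}) and hold for f(t_i)."""
    times = []
    # Assumption: before the first frame the criteria count as unsatisfied.
    previous_satisfied = False
    for t, d in stream:
        satisfied = criteria(d)
        if satisfied and not previous_satisfied:
            times.append(t)
        previous_satisfied = satisfied
    return times


# Illustrative stream: (time, tumor diameter) pairs, values invented for the example.
diameters = [(0, 0.0), (1, 0.0), (2, 1.2), (3, 1.5), (4, 0.0), (5, 0.8)]
tumor_visible = event_times(diameters, lambda d: d > 0)
# tumor_visible == [2, 5]: the criteria become true at t=2 and again at t=5
```

Recording only the transitions, rather than every frame that satisfies the criteria, is what lets one event have several distinct time instances in a single stream.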

Generalization
All streams in the library: $S = \{s_1, s_2, s_3, \ldots\}$
Related streams to $E$: $S_E \subseteq S$
New event: $E = (\text{name}, T, S_E)$
New criteria: $C_E(S_E)$
$F_{S_E}(t)$ is the set of frames from all the related streams that happen at time $t$

Generalization
If $f(t)$ exists,
  then $f(t_m) = f(t)$,
else
  if $f(t \pm \tau)$ exists ($\tau$ is a tolerance)
    then $f(t_m) = f(t \pm \tau)$
    else $f(t_m) = \text{NULL}$.
$T = \{t \mid C_E(S_E) \text{ is not satisfied for } F_{S_E}(t_{m-1}) \text{ and satisfied for } F_{S_E}(t_m)\}$
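
For events that depend on several related streams, each stream is sampled at a common time, falling back to a frame within a tolerance and otherwise to NULL. The sketch below illustrates that lookup and the resulting time set. All names, the dictionary representation of a stream, the wind and rain readings, and the choice of the nearest frame when several lie within the tolerance are assumptions not stated on the slides.

```python
from typing import Any, Callable, Dict, List, Optional

Stream = Dict[float, Any]  # maps a timestamp t to the frame data d


def frame_at(stream: Stream, t: float, tol: float) -> Optional[Any]:
    """Frame at exactly t, else the nearest frame within tol, else None (NULL)."""
    if t in stream:
        return stream[t]
    nearby = [u for u in stream if abs(u - t) <= tol]
    if nearby:
        # Assumption: when several frames qualify, take the closest one.
        return stream[min(nearby, key=lambda u: abs(u - t))]
    return None


def frames_at(related: List[Stream], t: float, tol: float) -> List[Optional[Any]]:
    """Frames from all related streams at time t, one entry per stream."""
    return [frame_at(s, t, tol) for s in related]


def event_times_multi(
    related: List[Stream],
    timeline: List[float],
    criteria: Callable[[List[Optional[Any]]], bool],
    tol: float,
) -> List[float]:
    """Times on the timeline where the multi-stream criteria become satisfied."""
    times = []
    previous = False  # assumption: unsatisfied before the first time step
    for t in timeline:
        satisfied = criteria(frames_at(related, t, tol))
        if satisfied and not previous:
            times.append(t)
        previous = satisfied
    return times


def storm(frames: List[Optional[Any]]) -> bool:
    """Hypothetical criteria: both readings present, wind above 70, rain above 10."""
    wind, rain = frames
    return wind is not None and rain is not None and wind > 70 and rain > 10


# Illustrative readings; the rain sensor reports slightly off the common clock.
wind = {0: 30, 1: 80, 2: 85, 3: 40}
rain = {0.1: 5, 1.1: 20, 2.9: 25}
storm_times = event_times_multi([wind, rain], [0, 1, 2, 3], storm, tol=0.2)
```

At t = 2 the rain stream has no frame within the tolerance, so its entry is NULL and the criteria are treated as unsatisfied.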

Applications
An educational digital library
A weather-related digital library
A stock market digital library
A digital library containing news-streams
A digital library containing census bureau statistics
A digital library containing sports games multimedia streams
A medical digital library

An Educational Digital Library
Example: IRI
Streams:
  teacher audio and video
  students audio and video
  shared room view
  button clicks
Events:
  teacher started tool
  students joined session
  somebody asked a question, etc.
Potential users:
  teachers, students, general public, education researchers

An Educational Digital Library
Query: When the teacher started Netscape, what did he say?
Query: Return all sessions in which Dr. Smith is the teacher
Query: When any student started speaking, what was the running application, who was the teacher, what is that student’s name, etc.?
Query: Return all streams in which Netscape was executing and the teacher opened an XTERM

A Weather-Related Digital Library
Streams:
  temperatures, wind, humidity, radar maps, and satellite images
Events:
  wind, temp., rain, etc. exceeding certain limit, hurricane upgrade, downgrade, etc.
Potential users:
  Meteorologists, public, weather researchers, …
Queries:
  Display the wind stream starting from the time at which a hurricane upgrade was reported over this area
  Display the radar maps and satellite images during a snowstorm
  Display the TV weather reports that relate to a specific event
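
The first query, playing a stream starting from an event's time instance, reduces to locating the first frame at or after that time. A minimal sketch, assuming frames are sorted by timestamp; the function name `stream_from_event` and the sample wind readings are hypothetical.

```python
from bisect import bisect_left
from typing import Any, List, Tuple

Frame = Tuple[float, Any]  # (timestamp, data)


def stream_from_event(frames: List[Frame], event_time: float) -> List[Frame]:
    """Return the frames at or after event_time; frames must be sorted by timestamp."""
    timestamps = [t for t, _ in frames]
    start = bisect_left(timestamps, event_time)
    return frames[start:]


# Illustrative wind readings and one "hurricane upgrade" time instance.
wind = [(0, 35), (10, 50), (20, 75), (30, 90)]
upgrade_times = [15]
playback = stream_from_event(wind, upgrade_times[0])
# playback == [(20, 75), (30, 90)]
```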

Design Considerations
Multiple modes for retrieval
Event instances
  an event usually occurs more than once during a data stream
  Some events occur only once (esp. media events)
  There may be a very large number of events to browse through
Atomic and composite events
  Composite events have durations
Issues with event generation
  Who? How? Manual or Automated? Tools?
  Domain or field experts are required to define and specify event instances

Design Considerations
Accommodation for related streams
  streams the user might want to retrieve whenever she retrieves a stream
  Stream-related
    e.g.: audio and video streams of a class session
  Event-related
    e.g.: audio stream of a description of that event
  Instance-related
    e.g.: the CNN broadcast when an upgrade occurs in a specific hurricane
  Concurrent streams
    can be synchronized while they are played back, e.g. the video and audio streams of the same class session
  Non-concurrent
    e.g.: news broadcast about a hurricane upgrade
  Domain experts to determine the relationship criteria

Design Considerations
Multiple stream formats
Multiple stream types
  A set of streams that share the same characteristics are of the same stream type: metadata, players, data access methods
  Type is not format (Teacher audio is not the same type as TV audio)
Different bibliographic metadata fields for different stream types
  “Specific metadata” and “Standard metadata”
Certain software modules needed for every stream type
  To display stream information, to play back, and to access data
Various roles for DL users
  domain experts, publishers, resource discovery users

Type Extension
Add new record in the stream types table in the database
Implement the Player Applet for streams of this new type:
  It has to implement a pre-defined JAVA interface
  An existing player could be used as a template
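
The two extension steps, registering the new type and supplying a player that implements a fixed interface, can be mirrored in a short sketch. The slides name a Java interface but do not give its methods, so this Python version is only an analogy: `Player`, `play`, `register_stream_type`, the in-memory `STREAM_TYPES` table, and the "radiology report" type are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Type


class Player(ABC):
    """Stand-in for the pre-defined player interface (methods are assumed)."""

    @abstractmethod
    def play(self, frames: List[Any], start: int = 0) -> List[str]:
        """Render frames beginning at index start."""


# Stand-in for the stream types table in the database.
STREAM_TYPES: Dict[str, Type[Player]] = {}


def register_stream_type(name: str, player_cls: Type[Player]) -> None:
    """Add a new stream type record, refusing players that skip the interface."""
    if not issubclass(player_cls, Player):
        raise TypeError("player must implement the Player interface")
    STREAM_TYPES[name] = player_cls


class TextPlayer(Player):
    """Minimal player; an existing player like this could serve as a template."""

    def play(self, frames: List[Any], start: int = 0) -> List[str]:
        return [str(frame) for frame in frames[start:]]


register_stream_type("radiology report", TextPlayer)
player = STREAM_TYPES["radiology report"]()
rendered = player.play(["a", "b", "c"], start=1)
```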

Objective
To study the issues involved in building a digital library that:
  Contains data streams
  Allows event-based retrieval
  Allows other traditional retrieval approaches
Components:
  Building a streams-based digital library
  Supporting event-based retrieval
  Developing a prototype for the digital library
  Developing a digital library application of the proposed prototype