Medical Information Retrieval: eEvidence System
By Zhao Jin, Mar-12-2010
Domain-specific Information Retrieval
• Research
  – What are the characteristics of the users, the documents and the search process in a specific domain?
  – What changes should be made in an IR system?
• Domains
  – Math
    • User study, Prototype Implementation, Probabilistic Framework and Iterative Readability Computation
  – Medical
    • eEvidence system for evidence-based practice
Outline
• What is Evidence-based Practice (EBP)
• How EBP is implemented and what are the issues
• Design of eEvidence system
• Discussion and Future work
Evidence-based practice (EBP)
• Decide what to do with the patients based on research findings
  – Instead of common sense, conventions, etc.
• Promote the publication and use of reviews and summaries of research articles
• Advantages:
  – Satisfy the information needs of the practitioners
  – Reduce the amount of literature to keep up with
  – Accelerate the implementation of research findings
Implementation of EBP
• Guideline (active search)
  – Form clinical question
  – Identify key elements
    • Patient, Intervention, Comparison, Outcome
  – Search EBP resources
    • Availability
    • Applicability
    • Validity / Strength of evidence (Study Design)
• Issues
  – Generic vs. specialized search engine
  – Hard to assess applicability and validity
  – Time constraint
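As a rough illustration of the "identify key elements" step, a clinical question could be held in a small record with one field per element (Patient, Intervention, Comparison, Outcome). This is only a sketch: the class name, the field names, the to_query helper and the example question are all hypothetical and are not taken from the eEvidence system.

```python
from dataclasses import dataclass


@dataclass
class ClinicalQuestion:
    """One field per key element (hypothetical structure, for illustration)."""
    patient: str
    intervention: str
    comparison: str
    outcome: str

    def to_query(self) -> str:
        # Join the non-empty elements into a plain keyword query string.
        parts = [self.patient, self.intervention, self.comparison, self.outcome]
        return " ".join(p.strip() for p in parts if p.strip())


# Example question, invented for illustration only.
question = ClinicalQuestion(
    patient="adults with hypertension",
    intervention="low-sodium diet",
    comparison="usual diet",
    outcome="blood pressure",
)
print(question.to_query())
```

Keeping the four elements separate, rather than as one free-text string, is what would later allow a search engine to match each element against the corresponding extracted data.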
Implementation of EBP
• Alternative (passive search)
  – Receive suggestion/support while working
    • Knowledge-based system
    • Decision support system (meta-search)
• Issues
  – Less precise
  – Limited resources
  – Difficult to encode and update findings
eEvidence System
• Features
  – Crawling-based
    • Generic, available, updated and flexible
  – Automatic Classification and Extraction
    • More organized results
    • Applicability and Validity assessment
  – Dual Interface
    • Different seeking behaviors
eEvidence-based System
[Architecture diagram. Components: Crawler, Classifier/Extractor, Indexer. Data: Medical Websites, Webpages, Classification / Extracted Data, Index, Profile. Interfaces: Read Interface, Search Interface. Users.]
Crawling
• Implemented with Nutch
• Periodic crawling of websites selected by experts
• Advantage:
  – Generic, available, updated, flexible
Classification and Extraction
• Type classification on webpages
  – Three classes: abstract, full text and others
  – Ensure proper organization of search results and filter out webpages that are not useful
• Key sentence and word extraction
• Maxent classification with text features, parse features and medical features
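To make the maxent step concrete, here is a minimal pure-Python sketch of a maximum-entropy (multinomial logistic regression) classifier over the three webpage types. It is not the system's implementation. It uses bag-of-words text features only (the parse and medical features are omitted), and the training snippets, function names and hyperparameters are invented for illustration.

```python
import math
from collections import defaultdict


def features(text):
    # Bag-of-words text features only; parse and medical features are omitted.
    return set(text.lower().split())


def predict_probs(weights, labels, feats):
    # Softmax over the per-label sums of feature weights.
    scores = [sum(weights[(label, f)] for f in feats) for label in labels]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(examples, labels, epochs=200, rate=0.5):
    # Stochastic gradient ascent on the conditional log-likelihood.
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, gold in examples:
            feats = features(text)
            probs = predict_probs(weights, labels, feats)
            for label, p in zip(labels, probs):
                # Gradient per active feature: observed minus expected count.
                step = rate * ((1.0 if label == gold else 0.0) - p)
                for f in feats:
                    weights[(label, f)] += step
    return weights


def classify(weights, labels, text):
    probs = predict_probs(weights, labels, features(text))
    return labels[probs.index(max(probs))]


LABELS = ["abstract", "full text", "others"]
# Toy training snippets, invented for illustration only.
TRAIN = [
    ("abstract background methods results conclusions", "abstract"),
    ("abstract objective design setting participants", "abstract"),
    ("introduction methods results discussion references tables", "full text"),
    ("full article introduction discussion acknowledgements references", "full text"),
    ("login subscribe contact about us", "others"),
    ("home news subscribe advertise contact", "others"),
]
weights = train(TRAIN, LABELS)
print(classify(weights, LABELS, "subscribe contact"))
```

On this toy data the final call labels the page as "others", since both words occur only in the "others" training snippets. A real system would use far more training pages and typically add regularization.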
Dual Interface (Read)
Dual Interface (Search)
Discussion & Future work
• Size of article collection
  – 17 websites, 16,522 abstracts and 3,371 full text articles
  – Not large enough for evaluation with practical task
• Classification and extraction
  – Good accuracy on webpage type classification, to be extended to more types
  – High precision but low recall on sentence extraction
  – Handling of word classes with open-vocabulary still tricky
Some results…

Type Classification
              Precision  Recall  F1-Measure
Abstract      .95        .98     .97
Full text     .94        .97     .96
Others        .99        .98     .99

Sentence Extraction
              Precision  Recall  F1-Measure
Patient       .68        .21     .33
Result        .81        .55     .66
Intervention  .84        .22     .35
Study Design  .94        .30     .45
Research Goal .93        .37     .53

Word Extraction
              Precision  Recall  F1-Measure
Age           .74        .52     .61
Gender        .89        .68     .77
Condition     .58        .49     .53
Race          -          -       -
Intervention  .59        .45     .51
Study Design  .84        .73     .78
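For readers less familiar with the metric, the F1-Measure reported above is the harmonic mean of precision and recall. The short sketch below recomputes it for two of the reported rows; the function name is our own, and the convention of returning 0 when both inputs are 0 is an assumption.

```python
def f1_measure(precision, recall):
    # Harmonic mean of precision and recall; taken as 0 when both are 0.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported figures: Result (P .81, R .55) and Study Design (P .94, R .30).
print(round(f1_measure(0.81, 0.55), 2))  # 0.66
print(round(f1_measure(0.94, 0.30), 2))  # 0.45
```

Because the harmonic mean is pulled toward the smaller value, a row with high precision but low recall still ends up with a modest F1.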