evaluating xml retrieval: the inex initiative mounia lalmas queen mary university of london
TRANSCRIPT
Evaluating XML retrieval:
The INEX initiative
Mounia Lalmas
Queen Mary University of London
http://qmir.dcs.qmul.ac.uk
Outline
Information retrieval
(Content-oriented) XML retrieval
Evaluating information retrieval
Evaluating XML retrieval: INEX
Information retrieval
Example of a user information need:
“Find all documents about sailing charter agencies that (1) offer sailing boats in the Greek islands, and (2) are registered with the RYA. The documents should contain boat specification, price per week, e-mail and other contact details.”
A formal representation of an information need constitutes a query.
Information retrieval
IR is concerned with the representation, storage and organisation of, and access to, repositories of information, usually in the form of documents.
Primary goal of an IR system: “Retrieve all the documents which are relevant (useful) to a user query, while retrieving as few non-relevant documents as possible.”
Conceptual model for IR
[Diagram. Components: Documents, Query; Indexing, Formulation; Document representation, Query representation; Retrieval function; Retrieval results; Relevance feedback]
Structured Document Retrieval
Traditional IR is about finding relevant documents to a user’s information need, e.g. an entire book.
SDR allows users to retrieve document components that are more focussed on their information needs, e.g. a chapter of a book instead of an entire book.
The structure of documents is exploited to identify which document components to retrieve.
Structured Documents
Linear order of words, sentences, paragraphs …
Hierarchy or logical structure of a book’s chapters, sections …
Links (hyperlink), cross-references, citations …
Temporal and spatial relationships in multimedia documents
[Figure: a book's hierarchy (Book, Chapters, Sections, Paragraphs) and a World Wide Web page]
Structured Documents
Explicit structure formalised through document representation standards (mark-up languages)
Layout: LaTeX (publishing), HTML (Web publishing)
Structure: SGML, XML (Web publishing, engineering), MPEG-7 (broadcasting)
Content/Semantic: RDF, DAML+OIL, OWL (semantic web)
<b><font size=+2>SDR</font></b><img src="qmir.jpg" border=0>
<section>
  <subsection>
    <paragraph>… </paragraph>
    <paragraph>… </paragraph>
  </subsection>
</section>
<Book rdf:about="book">
  <rdf:author=".."/>
  <rdf:title="…"/>
</Book>
XML: eXtensible Mark-up Language
Meta-language (user-defined tags) currently being adopted as the document format language by W3C
Used to describe content and structure (and not layout)
Grammar described in DTD (used for validation)
<lecture>
  <title> Structured Document Retrieval </title>
  <author> <fnm> Smith </fnm> <snm> John </snm> </author>
  <chapter>
    <title> Introduction into XML retrieval </title>
    <paragraph> …. </paragraph>
    …
  </chapter>
  …
</lecture>
<!ELEMENT lecture (title, author+, chapter+)>
<!ELEMENT author (fnm*, snm)>
<!ELEMENT fnm (#PCDATA)>
…
XML: eXtensible Mark-up Language
Use of XPath notation to refer to the XML structure
chapter/title: title is a direct sub-component of chapter
//title: any title
chapter//title: title is a direct or indirect sub-component of chapter
chapter/paragraph[2]: any direct second paragraph of any chapter
chapter/*: all direct sub-components of a chapter
<lecture>
  <title> Structured Document Retrieval </title>
  <author> <fnm> Smith </fnm> <snm> John </snm> </author>
  <chapter>
    <title> Introduction into SDR </title>
    <paragraph> …. </paragraph>
    …
  </chapter>
  …
</lecture>
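The XPath expressions on this slide can be tried out with a minimal sketch like the one below. It uses Python's standard xml.etree.ElementTree, which supports only a subset of XPath. The sample document is an assumption: it follows the slide's lecture example, but the paragraph texts and the nested section are made up so that every expression has something to match. Because paths are evaluated relative to the lecture root, "//title" is written as ".//title".

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the slide's <lecture> example;
# the paragraph texts and the nested <section> are invented for illustration.
DOC = """
<lecture>
  <title>Structured Document Retrieval</title>
  <author><fnm>Smith</fnm><snm>John</snm></author>
  <chapter>
    <title>Introduction into SDR</title>
    <paragraph>First paragraph.</paragraph>
    <paragraph>Second paragraph.</paragraph>
    <section><title>Nested title</title></section>
  </chapter>
</lecture>
"""

root = ET.fromstring(DOC)  # root is the <lecture> element


def texts(path):
    """Return the text of every element matched by path, relative to <lecture>."""
    return [element.text for element in root.findall(path)]


direct_titles = texts("chapter/title")            # title directly under chapter
any_titles = texts(".//title")                    # any title below lecture
nested_titles = texts("chapter//title")           # direct or indirect titles
second_paragraphs = texts("chapter/paragraph[2]")  # second paragraph child
chapter_children = [e.tag for e in root.findall("chapter/*")]

print(direct_titles, any_titles, nested_titles, second_paragraphs, chapter_children)
```

Note that the position predicate [2] counts among the paragraph children of each chapter, not among all paragraphs in the document.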
Querying XML documents
Content-only (CO) queries
'open standards for digital video in distance learning'
Content-and-structure (CAS) queries
//article [about(., 'formal methods verify correctness aviation systems')]
/body//section [about(., 'case study application model checking theorem proving')]
Structure-only (SA) queries
/article//*section/paragraph[2]
Conceptual model for XML retrieval
[Diagram: the conceptual model for IR (Documents, Query; Indexing, Formulation; Document representation, Query representation; Retrieval function; Retrieval results; Relevance feedback) annotated with the following labels]
Structured documents
Content + structure
Inverted file + structure index
tf, idf, acc
Matching content + structure
Presentation of related components
Content-oriented XML retrieval
Return document components of varying granularity (e.g. a book, a chapter, a section, a paragraph, a table, a figure, etc.), relevant to the user’s information need both with regards to content and structure.
Content-oriented XML retrieval
Retrieve the best components according to content and structure criteria:
INEX: most specific component that satisfies the query, while being exhaustive to the query
Shakespeare study: best entry points, which are components from which many relevant components can be reached through browsing
Challenges
[Figure: an Article with components Title, Section 1 and Section 2. Component term weights: 0.9 XML, 0.5 XML, 0.2 XML, 0.4 retrieval, 0.7 authoring. The Article's own weights for XML, retrieval and authoring are marked "?". Other values shown in the figure: 0.4, 0.5, 0.2, 0.6, 0.4, 0.4, 0.2]
No fixed retrieval unit + nested document components + different types of document components
how to obtain document and collection statistics?
which component is a good retrieval unit?
which components contribute best to content of Article?
how to estimate?
how to aggregate?
Approaches …
vector space model
probabilistic model
bayesian network
language model
extending DB model
boolean model
natural language processing
cognitive model
ontology
parameter estimation
tuning
smoothing
fusion
phrase
term statistics
collection statistics
component statistics
proximity search
logistic regression
belief model
relevance feedback
Evaluation
The goal of an IR system: retrieve as many relevant documents as possible and as few non-relevant documents as possible
Comparative evaluation of technical performance of IR systems = effectiveness: ability of the IR system to retrieve relevant documents and suppress non-relevant documents
Effectiveness: combination of recall and precision
Relevance
A document is relevant if it “has significant and demonstrable bearing on the matter at hand”.
Common assumptions: Objectivity, Topicality, Binary nature, Independence
Recall / Precision
[Venn diagram: within the document collection, the set of retrieved documents and the set of relevant documents overlap; the overlap is labelled "Retrieved and relevant"]
recall = number of relevant documents retrieved / number of relevant documents
precision = number of relevant documents retrieved / number of documents retrieved
Recall / Precision
Relevant documents for a given query: {d3, d5, d9, d25, d39, d44, d56, d71, d89, d123}
rank  doc   precision  recall
1     d123  1/1        1/10
2     d84
3     d56   2/3        2/10
4     d6
5     d8
6     d9    3/6        3/10
7     d511
8     d129
9     d187
10    d25   4/10       4/10
11    d48
12    d250
13    d113
14    d3    5/14       5/10
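The precision and recall values in the ranked list above can be reproduced with a short sketch. The relevant set and the ranking are taken from the slide; the function name precision_recall_points is an assumption made for illustration. Fractions are used so the values read like the slide's, although Python reduces them (3/6 is shown as 1/2, 5/10 as 1/2).

```python
from fractions import Fraction

# Relevant documents and ranking as given on the slide.
relevant = {"d3", "d5", "d9", "d25", "d39", "d44", "d56", "d71", "d89", "d123"}
ranking = ["d123", "d84", "d56", "d6", "d8", "d9", "d511",
           "d129", "d187", "d25", "d48", "d250", "d113", "d3"]


def precision_recall_points(ranking, relevant):
    """Return (rank, doc, precision, recall) at each rank holding a relevant doc."""
    points = []
    hits = 0  # relevant documents seen so far
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision = Fraction(hits, rank)        # relevant retrieved / retrieved
            recall = Fraction(hits, len(relevant))  # relevant retrieved / relevant
            points.append((rank, doc, precision, recall))
    return points


points = precision_recall_points(ranking, relevant)
for rank, doc, precision, recall in points:
    print(rank, doc, precision, recall)
```

Only 5 of the 10 relevant documents appear in the top 14, so recall never exceeds 5/10 for this ranking.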
[Figure: precision (y-axis, 0 to 100) plotted against recall (x-axis, 0 to 100) for two systems, s1 and s2]
Test collection
Document collection = the documents themselves: depends on the task, e.g. evaluating web retrieval requires a collection of HTML documents.
Queries / requests: simulate real user information needs.
Relevance judgements: state, for a query, the relevant documents.
See TREC, CLEF, etc.
Evaluation of XML retrieval: INEX
Evaluating the effectiveness of content-oriented XML retrieval approaches
Collaborative effort: participants contribute to the development of the collection (queries, relevance assessments)
Similar methodology as for TREC, but adapted to XML retrieval
40+ participants worldwide
Workshop in Schloss Dagstuhl in December (20+ institutions)
INEX Test Collection
Documents (~500MB), which consist of 12,107 articles in XML format from the IEEE Computer Society; 8 million elements
INEX 2002
30 CO and 30 CAS queries
inex_eval metric
INEX 2003
36 CO and 30 CAS queries
CAS queries are defined according to an enhanced subset of XPath
inex_eval and inex_eval_ng metrics
INEX 2004 is just starting
Relevance in XML
An element is relevant if it “has significant and demonstrable bearing on the matter at hand”
Common assumptions in IR: Objectivity, Topicality, Binary nature, Independence
[Figure: an article tree with section and paragraph elements, numbered 1 2 and 1 2 3]
Relevance in INEX
Exhaustivity: how exhaustively a document component discusses the query: 0, 1, 2, 3
Specificity: how focused the component is on the query: 0, 1, 2, 3
Relevance: (3,3), (2,3), (1,1), (0,0), …
[Figure: an article and its sections]
all sections relevant: article very relevant
all sections relevant: article better than sections
one section relevant: article less relevant
one section relevant: section better than article
…
Relevance assessment task
Completeness: element → parent element, children elements
Consistency: parent of a relevant element must also be relevant, although to a different extent
Exhaustivity increases going up
Specificity decreases going up
Use of an online interface
Assessing a query takes a week!
Average 2 topics per participant
Only participants that complete the assessment task have access to the collection
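The consistency rule can be sketched as a simple check over a set of assessments. This is an illustration under stated assumptions, not the INEX assessment tool: the rule is read as "a parent's exhaustivity is never lower than that of a relevant child", and the element names, the parent_of mapping and the (exhaustivity, specificity) values are all hypothetical.

```python
def consistency_violations(assessments, parent_of):
    """Return (child, parent) pairs where the parent is less exhaustive than a relevant child.

    assessments maps element -> (exhaustivity, specificity), each in 0..3.
    parent_of maps element -> its parent element (the root has no entry).
    """
    violations = []
    for child, (child_e, _child_s) in assessments.items():
        parent = parent_of.get(child)
        if parent is None or parent not in assessments:
            continue  # root element, or parent not assessed yet
        parent_e, _parent_s = assessments[parent]
        # Assumed reading of the rule: exhaustivity must not decrease going up.
        if child_e > 0 and parent_e < child_e:
            violations.append((child, parent))
    return violations


# Hypothetical assessments for one article.
parent_of = {
    "sec[1]": "article",
    "sec[1]/p[1]": "sec[1]",
    "sec[1]/p[2]": "sec[1]",
}
assessments = {
    "article": (2, 1),
    "sec[1]": (2, 2),
    "sec[1]/p[1]": (3, 3),  # more exhaustive than its parent section
    "sec[1]/p[2]": (1, 3),
}

violations = consistency_violations(assessments, parent_of)
print(violations)
```

Specificity is left unchecked in this sketch; only the exhaustivity direction is enforced.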
[Figure: an article tree with section and paragraph elements, numbered 1 2 and 1 2 3]
Metrics
Recall / precision-based
quantisation functions to obtain one relevance value
expected search length
penalise overlap
consider size
Others
expected ratio of relevant
cumulated gain-based metrics
tolerance to irrelevance
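A quantisation function turns an (exhaustivity, specificity) pair into a single relevance value, so that recall/precision-style measures can be applied. The slides do not define a particular function; the sketch below assumes a strict quantisation, where only a (3,3) element counts as relevant. The element paths, assessments and run are hypothetical.

```python
def strict_quantisation(exhaustivity, specificity):
    """Assumed strict mapping: 1 for a highly exhaustive, highly specific element, else 0."""
    return 1 if (exhaustivity, specificity) == (3, 3) else 0


# Hypothetical assessments: element path -> (exhaustivity, specificity).
assessments = {
    "/article[1]": (3, 1),
    "/article[1]/sec[2]": (3, 3),
    "/article[1]/sec[2]/p[1]": (1, 3),
    "/article[1]/sec[3]": (0, 0),
}

# Hypothetical ranked output of a system for the same query.
run = ["/article[1]/sec[2]", "/article[1]", "/article[1]/sec[3]"]

# Binary relevance per returned element, then precision over the returned list.
binary = [strict_quantisation(*assessments[element]) for element in run]
precision_at_3 = sum(binary) / len(run)
print(binary, precision_at_3)
```

A graded mapping could be substituted for strict_quantisation without changing the rest of the sketch; the choice of function decides how much credit near-misses such as (3,1) receive.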
[Figure: article and section elements]
Lessons learnt
Good definition of relevance
Expressing CAS queries was not easy
Relevance assessment process must be “improved”
Further development on metrics needed
User studies required
Conclusion
XML retrieval is not just about the effective retrieval of XML documents, but also about how to evaluate effectiveness
INEX 2004 tracks: Relevance feedback, Interactive, Heterogeneous collection, Natural language query
http://inex.is.informatik.uni-duisburg.de:2004/