INEX: Evaluating content-oriented XML retrieval Mounia Lalmas Queen Mary University of London


Upload: katelyn-roy

Post on 10-Dec-2015


Page 1: INEX: Evaluating content-oriented XML retrieval Mounia Lalmas Queen Mary University of London

INEX: Evaluating content-oriented XML retrieval

Mounia Lalmas

Queen Mary University of London

http://qmir.dcs.qmul.ac.uk

Page 2

Outline

Content-oriented XML retrieval

Evaluating XML retrieval: INEX

Page 3

XML Retrieval

Traditional IR is about finding documents relevant to a user’s information need, e.g. an entire book.

XML retrieval allows users to retrieve document components that are more focussed on their information needs, e.g. a chapter of a book instead of an entire book.

The structure of documents is exploited to identify which document components to retrieve.

Page 4

Structured Documents

Linear order of words, sentences, paragraphs …

Hierarchy or logical structure of a book’s chapters, sections …

Links (hyperlink), cross-references, citations …

Temporal and spatial relationships in multimedia documents

[Figure: a book broken down into chapters, sections and paragraphs, next to a World Wide Web page]

Page 5

Structured Documents

Explicit structure formalised through document representation standards (mark-up languages)

Layout: LaTeX (publishing), HTML (Web publishing)

<b><font size=+2>SDR</font></b><img src="qmir.jpg" border=0>

Structure: SGML, XML (Web publishing, engineering), MPEG-7 (broadcasting)

<section> <subsection> <paragraph>… </paragraph> <paragraph>… </paragraph> </subsection></section>

Content/Semantic: RDF, DAML + OIL, OWL (semantic web)

<Book rdf:about="book"> <rdf:author=".."/> <rdf:title="…"/></Book>

Page 6

XML: eXtensible Mark-up Language

Meta-language (user-defined tags) currently being adopted as the document format language by W3C

Used to describe content and structure (and not layout)

Grammar described in DTD (used for validation)

<lecture>
  <title> Structured Document Retrieval </title>
  <author> <fnm> Smith </fnm> <snm> John </snm> </author>
  <chapter>
    <title> Introduction into XML retrieval </title>
    <paragraph> …. </paragraph>
    …
  </chapter>
  …
</lecture>

<!ELEMENT lecture (title, author+, chapter+)>
<!ELEMENT author (fnm*, snm)>
<!ELEMENT fnm (#PCDATA)>
…

Page 7

XML: eXtensible Mark-up Language

Use of XPath notation to refer to the XML structure

chapter/title: title is a direct sub-component of chapter

//title: any title

chapter//title: title is a direct or indirect sub-component of chapter

chapter/paragraph[2]: any direct second paragraph of any chapter

chapter/*: all direct sub-components of a chapter

<lecture>
  <title> Structured Document Retrieval </title>
  <author> <fnm> Smith </fnm> <snm> John </snm> </author>
  <chapter>
    <title> Introduction into SDR </title>
    <paragraph> …. </paragraph>
    …
  </chapter>
  …
</lecture>
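The path expressions above can be tried out with a small sketch using Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath. The paragraph texts are placeholders invented for this illustration (the slide elides them), and `//title` is written as `.//title` because ElementTree evaluates paths relative to an element.

```python
import xml.etree.ElementTree as ET

# The lecture example from the slide; paragraph contents are placeholders.
LECTURE = """
<lecture>
  <title>Structured Document Retrieval</title>
  <author><fnm>Smith</fnm><snm>John</snm></author>
  <chapter>
    <title>Introduction into SDR</title>
    <paragraph>First paragraph (placeholder text).</paragraph>
    <paragraph>Second paragraph (placeholder text).</paragraph>
  </chapter>
</lecture>
"""

root = ET.fromstring(LECTURE)  # root is the <lecture> element


def texts(path):
    """Return the stripped text of every element matched by the path."""
    return [el.text.strip() for el in root.findall(path)]


chapter_titles = texts("chapter/title")            # direct title of a chapter
all_titles = texts(".//title")                     # any title below lecture
nested_titles = texts("chapter//title")            # direct or indirect
second_paragraphs = texts("chapter/paragraph[2]")  # second paragraph child
chapter_children = [el.tag for el in root.findall("chapter/*")]

print(chapter_titles, all_titles, second_paragraphs, chapter_children)
```

Note that `chapter/title` and `chapter//title` return the same element here only because the sample chapter has no nested sections with titles of their own.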

Page 8

Querying XML documents

Content-only (CO) queries

'open standards for digital video in distance learning'

Content-and-structure (CAS) queries

//article[about(., 'formal methods verify correctness aviation systems')]/body//section[about(., 'case study application model checking theorem proving')]

Structure-only (SA) queries

/article//*section/paragraph[2]
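To make the two parts of a CAS query visible, the following rough sketch separates the structural path from the `about()` content conditions using regular expressions. It is only an illustration for queries shaped like the example above, not a parser for the INEX query language; the function name `split_cas_query` is hypothetical.

```python
import re

CAS_QUERY = (
    "//article[about(., 'formal methods verify correctness aviation systems')]"
    "/body//section[about(., 'case study application model checking theorem proving')]"
)

# Matches about(<relative path>, '<keywords>') and captures both arguments.
ABOUT_RE = re.compile(r"about\(\s*([^,]+?)\s*,\s*'([^']*)'\s*\)")
# Matches a whole [about(...)] predicate so that it can be stripped.
PREDICATE_RE = re.compile(r"\[about\([^\]]*\)\]")


def split_cas_query(query):
    """Return (structural path, list of (context, keywords) conditions)."""
    conditions = ABOUT_RE.findall(query)
    structure = PREDICATE_RE.sub("", query)
    return structure, conditions


structure, conditions = split_cas_query(CAS_QUERY)
print(structure)
for context, keywords in conditions:
    print(context, "->", keywords)
```

The structural part that remains, `//article/body//section`, names the target element; the two keyword strings are the content conditions on the article and on the section.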

Page 9

Content-oriented XML retrieval

Return document components of varying granularity (e.g. a book, a chapter, a section, a paragraph, a table, a figure, etc.), relevant to the user’s information need both with regard to content and structure.

Page 10

Content-oriented XML retrieval

Retrieve the best components according to content and structure criteria:

INEX: most specific component that satisfies the query, while being exhaustive to the query

Shakespeare study: best entry points, which are components from which many relevant components can be reached through browsing

???

Page 11

Challenges

[Figure: an Article with sub-components Title, Section 1 and Section 2. The sub-components carry term weights (0.9 XML, 0.5 XML, 0.2 XML, 0.4 retrieval, 0.7 authoring); the weights of XML, retrieval and authoring for the Article itself are shown as unknown (?).]

no fixed retrieval unit + nested elements + element types

how to obtain document and collection statistics?

which component is a good retrieval unit?

which components contribute best to content of Article?

how to estimate?

how to aggregate?

Page 12

Approaches …

vector space model

probabilistic model

bayesian network

language model

extending DB model

boolean model

natural language processing

cognitive model

ontology

parameter estimation

tuning

smoothing

fusion

phrase

term statistics

collection statistics

component statistics

proximity search

logistic regression

belief model

relevance feedback

Page 13

Vector space model

Separate indexes: article index, abstract index, section index, sub-section index, paragraph index

Each index: RSV → normalised RSV

The normalised RSVs of all indexes are merged

tf and idf as for fixed and non-nested retrieval units

(IBM Haifa, INEX 2003)
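The index-per-element-type pipeline can be sketched as follows. The slide does not say how the RSVs are normalised, so dividing by the top score of each index is an assumption made for this illustration, and `merge_indexes` and the element identifiers are hypothetical names.

```python
def merge_indexes(results_by_index):
    """Normalise the RSVs of each index and merge them into one ranking.

    results_by_index maps an index name (e.g. "article") to a list of
    (element_id, rsv) pairs. Normalisation by the per-index maximum is an
    assumption; the slide only states that RSVs are normalised.
    """
    merged = []
    for index_name, scored in results_by_index.items():
        top = max((rsv for _, rsv in scored), default=0.0)
        for element_id, rsv in scored:
            normalised = rsv / top if top > 0 else 0.0
            merged.append((element_id, index_name, normalised))
    # Stable sort: ties keep the order in which the indexes were given.
    merged.sort(key=lambda item: item[2], reverse=True)
    return merged


ranking = merge_indexes({
    "article": [("a1", 12.0), ("a2", 6.0)],
    "paragraph": [("a1/p3", 3.0), ("a2/p1", 2.4)],
})
for element_id, index_name, score in ranking:
    print(f"{score:.2f}  {index_name:<10} {element_id}")
```

Because every index holds elements of one type only, tf and idf can be computed within each index exactly as for flat, non-nested documents; the normalisation step is what makes scores from different indexes comparable before merging.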

Page 14

Language model

[Formula not recoverable from the transcript; surviving labels: element language model, collection language model, smoothing parameter λ]

[Formula not recoverable from the transcript; surviving labels: element score; element size, element score, article score; rank element]

query expansion with blind feedback

ignore small elements (20-term threshold)

high value of λ leads to increase in size of retrieved elements

results with λ = 0.9, 0.5 and 0.2 similar

(University of Amsterdam, INEX 2003)
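One common way to combine an element language model with a collection language model through a smoothing parameter is linear interpolation. The sketch below shows that technique only; whether the formula on the slide has exactly this form, and which of the two models λ weights, is an assumption, and the size and article-score components are left out. All names and the toy data are hypothetical.

```python
import math
from collections import Counter


def smoothed_log_score(query, element_terms, collection_terms, lam=0.5):
    """Log query likelihood of an element with linear interpolation smoothing.

    Assumed form: P(t | e) = lam * P_mle(t | element)
                           + (1 - lam) * P_mle(t | collection).
    Every query term is assumed to occur in the collection.
    """
    element_counts = Counter(element_terms)
    collection_counts = Counter(collection_terms)
    score = 0.0
    for term in query:
        p_element = element_counts[term] / len(element_terms) if element_terms else 0.0
        p_collection = collection_counts[term] / len(collection_terms)
        score += math.log(lam * p_element + (1 - lam) * p_collection)
    return score


collection = ["xml", "retrieval", "xml", "database", "query", "index"]
section = ["xml", "retrieval", "xml"]
paragraph = ["database", "query", "index"]
query = ["xml", "retrieval"]

section_score = smoothed_log_score(query, section, collection)
paragraph_score = smoothed_log_score(query, paragraph, collection)
print(section_score, paragraph_score)
```

The collection model keeps the score finite for elements that miss a query term, which matters more for XML elements than for whole documents because elements can be very short.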

Page 15

Evaluation of XML retrieval: INEX

Evaluating the effectiveness of content-oriented XML retrieval approaches

Collaborative effort: participants contribute to the development of the collection

queries

relevance assessments

Similar methodology as for TREC, but adapted to XML retrieval

40+ participants worldwide

Workshop in Schloss Dagstuhl in December (20+ institutions)

Page 16

INEX Test Collection

Documents (~500MB), which consist of 12,107 articles in XML format from the IEEE Computer Society; 8 million elements

INEX 2002

30 CO and 30 CAS queries

inex2002 metric

INEX 2003

36 CO and 30 CAS queries

CAS queries are defined according to an enhanced subset of XPath

inex2002 and inex2003 metrics

INEX 2004 is just starting

Page 17

Tasks

CO: aim is to decrease user effort by pointing the user to the most specific relevant portions of documents.

SCAS: retrieve relevant nodes that match the structure specified in the query.

VCAS: retrieve relevant nodes that may not be the same as the target elements, but are structurally similar.

Page 18

Relevance in XML

An element is relevant if it “has significant and demonstrable bearing on the matter at hand”

Common assumptions in IR

Objectivity

Topicality

Binary nature

Independence

[Figure: an article with nested sections (1, 2) and paragraphs (1, 2, 3)]

Page 19

Relevance in INEX

Exhaustivity: how exhaustively a document component discusses the query: 0, 1, 2, 3

Specificity: how focused the component is on the query: 0, 1, 2, 3

Relevance: (3,3), (2,3), (1,1), (0,0), …

[Figure: sections within an article]

all sections relevant → article very relevant

all sections relevant → article better than sections

one section relevant → article less relevant

one section relevant → section better than article

…

Page 20

Relevance assessment task

Completeness

Element → parent element, child elements

Consistency

Parent of a relevant element must also be relevant, although to a different extent

Exhaustivity increases going up

Specificity decreases going up

Use of an online interface

Assessing a query takes a week!

Average 2 topics per participant

[Figure: an article with nested sections (1, 2) and paragraphs (1, 2, 3)]
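The consistency rule on exhaustivity lends itself to an automatic check. The sketch below flags every child whose exhaustivity exceeds that of its parent, which also covers the case of a relevant child under a non-relevant parent. The data layout, the function name and the reading of the rule as "exhaustivity never decreases towards the root" are assumptions made for this illustration, not a description of the INEX assessment tool.

```python
def inconsistent_pairs(parent_of, exhaustivity):
    """Return (child, parent) pairs that violate the exhaustivity rule.

    parent_of maps an element id to the id of its parent element.
    exhaustivity maps an element id to its assessed value (0 to 3);
    elements without an assessment are treated as 0.
    """
    violations = []
    for child, parent in parent_of.items():
        if exhaustivity.get(child, 0) > exhaustivity.get(parent, 0):
            violations.append((child, parent))
    return sorted(violations)


# Hypothetical assessments for one article.
parent_of = {"sec1": "article", "sec2": "article", "p1": "sec1"}
exhaustivity = {"article": 2, "sec1": 3, "sec2": 1, "p1": 1}

violations = inconsistent_pairs(parent_of, exhaustivity)
print(violations)
```

Specificity is not checked here because a parent may legitimately be as specific as its only relevant child.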

Page 21

Interface

[Screenshot of the online assessment interface; labelled areas: Current assessments, Navigation, Groups]

Page 22

Assessments

With respect to the elements to assess

26 % assessments on elements in the pool (66 % in INEX 2002).

68 % highly specific elements not in the pool

7 % elements automatically assessed

INEX 2002

23 inconsistent assessments per query for one rule

Page 23

Metrics

Need to consider:

Two dimensions of relevance

Independence assumption does not hold

No predefined retrieval unit

Overlap

Linear vs. clustered ranking

[Figure: a section nested within an article]

Page 24

INEX 2002 metric

Quantization:

strict

$$f_{strict}(exh, spec) = \begin{cases} 1 & \text{if } exh = 3 \text{ and } spec = 3 \\ 0 & \text{otherwise} \end{cases}$$

generalized

$$f_{gen}(exh, spec) = \begin{cases} 1.00 & \text{if } (exh, spec) = 33 \\ 0.75 & \text{if } (exh, spec) \in \{23, 32, 31\} \\ 0.50 & \text{if } (exh, spec) \in \{13, 22, 21\} \\ 0.25 & \text{if } (exh, spec) \in \{11, 12\} \\ 0.00 & \text{if } (exh, spec) = 00 \end{cases}$$
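The two quantization functions map an (exhaustivity, specificity) assessment to a single relevance value. A direct transcription into Python might look like this; the function names are chosen for this sketch, and pairs that the slide does not list are rejected rather than guessed.

```python
# Generalized quantization table, transcribed from the slide.
# Keys are (exhaustivity, specificity) pairs.
GENERALIZED = {
    (3, 3): 1.00,
    (2, 3): 0.75, (3, 2): 0.75, (3, 1): 0.75,
    (1, 3): 0.50, (2, 2): 0.50, (2, 1): 0.50,
    (1, 1): 0.25, (1, 2): 0.25,
    (0, 0): 0.00,
}


def f_strict(exh, spec):
    """Strict quantization: only (3, 3) counts as relevant."""
    return 1 if (exh, spec) == (3, 3) else 0


def f_gen(exh, spec):
    """Generalized quantization: graded relevance value in [0, 1]."""
    try:
        return GENERALIZED[(exh, spec)]
    except KeyError:
        raise ValueError(f"no quantization defined for {(exh, spec)}") from None


for pair in [(3, 3), (2, 3), (2, 1), (1, 2), (0, 0)]:
    print(pair, f_strict(*pair), f_gen(*pair))
```

Under strict quantization an element assessed (2, 3) contributes nothing, while under generalized quantization it still contributes 0.75.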

Page 25

INEX 2002 metric

Precision as defined by Raghavan’89 (based on ESL)

$$P(rel \mid retr)(x) = \frac{x \cdot n}{x \cdot n + j + \frac{s \cdot i}{r + 1}}$$

where n is estimated

$$n = \sum_{c \in C} f(assess(c))$$
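A small numeric sketch of the precision formula follows. The slide does not define j, s, i and r; the reading used here follows the usual expected search length (ESL) setting for weakly ordered rankings and is an assumption: j is the number of non-relevant elements in the ranks before the final rank, s the number of relevant elements still needed from the final rank, and r and i the numbers of relevant and non-relevant elements in that final rank.

```python
def esl_precision(x, n, j, s, i, r):
    """Precision at recall point x, following the formula on the slide.

    x: recall level in (0, 1]
    n: estimated total number of relevant elements
    j, s, i, r: expected-search-length quantities (meaning assumed, see text)
    """
    relevant_needed = x * n
    expected_search_length = j + (s * i) / (r + 1)
    return relevant_needed / (relevant_needed + expected_search_length)


# Hypothetical values: half of 4 relevant elements are to be found,
# 1 non-relevant element precedes the final rank, and the final rank
# holds 1 relevant and 2 non-relevant elements, 1 of which is needed.
precision = esl_precision(x=0.5, n=4, j=1, s=1, i=2, r=1)
print(precision)
```

With these numbers the expected search length is 1 + (1 · 2) / 2 = 2, so precision is 2 / (2 + 2) = 0.5.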

Page 26

Overlap problem

Page 27

INEX 2003 metric

Ideal concept space (Wong & Yao ‘95)

$$spec = \frac{|t \cap c|}{|c|}$$

$$exh = \frac{|t \cap c|}{|t|}$$

[Figure: overlapping regions c and t]
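In the ideal concept space both dimensions are ratios of set sizes. A minimal sketch, treating the topic t and the component c as plain Python sets of concept identifiers (the identifiers are invented for the example):

```python
def specificity(t, c):
    """Share of the component's concepts that belong to the topic."""
    return len(t & c) / len(c)


def exhaustivity(t, c):
    """Share of the topic's concepts that the component covers."""
    return len(t & c) / len(t)


topic = {1, 2, 3, 4}
component = {3, 4, 5, 6, 7, 8}

spec = specificity(topic, component)
exh = exhaustivity(topic, component)
print(spec, exh)
```

The component covers half of the topic (exh = 2/4) but only a third of its own content is on topic (spec = 2/6), which is the situation of a large element that contains some relevant material.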

Page 28

INEX 2003 metric

Quantization:

strict

$$exh_{strict}(exh) = \begin{cases} 1 & \text{if } exh = 3 \\ 0 & \text{otherwise} \end{cases}$$

$$spec_{strict}(spec) = \begin{cases} 1 & \text{if } spec = 3 \\ 0 & \text{otherwise} \end{cases}$$

generalised

$$exh_{gen}(exh) = exh / 3$$

$$spec_{gen}(spec) = spec / 3$$

Page 29

INEX 2003 metric

Ignoring overlap:

$$recall_s = \frac{\sum_{i=1}^{k} |t \cap c_i^U|}{\sum_{i=1}^{N} |t \cap c_i^U|} = \frac{\sum_{i=1}^{k} exh(c_i^U)}{\sum_{i=1}^{N} exh(c_i^U)}$$

$$precision_s = \frac{\sum_{i=1}^{k} \frac{|t \cap c_i^U|}{|c_i^U|} \cdot |c_i^T|}{\sum_{i=1}^{k} |c_i^T|} = \frac{\sum_{i=1}^{k} spec(c_i^U) \cdot |c_i^T|}{\sum_{i=1}^{k} |c_i^T|}$$
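The right-hand forms of the two formulas can be computed directly from quantized exhaustivity and specificity values. In this sketch the lists are in rank order, `sizes` stands for the component sizes |c_i^T|, and all values are invented for illustration.

```python
def recall_s(exh, k):
    """Sum of exhaustivity over the top k, relative to all N components."""
    return sum(exh[:k]) / sum(exh)


def precision_s(spec, sizes, k):
    """Size-weighted average specificity of the top k components."""
    weighted = sum(s * size for s, size in zip(spec[:k], sizes[:k]))
    return weighted / sum(sizes[:k])


# Hypothetical ranking of N = 4 components (generalised quantization).
exh = [1.0, 2 / 3, 0.0, 1 / 3]
spec = [1 / 3, 1.0, 0.0, 2 / 3]
sizes = [300, 100, 50, 150]  # component sizes, e.g. in words

r = recall_s(exh, k=2)
p = precision_s(spec, sizes, k=2)
print(r, p)
```

The large first component dominates the precision value through its size, which is the effect the later slide questions ("is it intuitive that large is good or not?").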

Page 30

INEX 2003 metric

Considering overlap:

$$recall_o = \frac{\sum_{i=1}^{k} exh(c_i^U) \cdot \frac{\left| c_i^T - \bigcup_{j=1}^{i-1} c_j^T \right|}{|c_i^T|}}{\sum_{i=1}^{N} exh(c_i^U)}$$

$$precision_o = \frac{\sum_{i=1}^{k} spec(c_i^U) \cdot \left| c_i^T - \bigcup_{j=1}^{i-1} c_j^T \right|}{\sum_{i=1}^{k} \left| c_i^T - \bigcup_{j=1}^{i-1} c_j^T \right|}$$
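The overlap-aware variants only credit the part of each component that has not been seen at an earlier rank. The sketch below represents each c_i^T as a set of text-unit positions, which is an assumption about the representation; the function name and data are hypothetical.

```python
def overlap_metrics(ranked, k):
    """Return (recall_o, precision_o) for the top k of a ranking.

    ranked: list of (exh, spec, units) in rank order, where units is the
    set of text-unit positions covered by the component.
    """
    seen = set()
    recall_num = 0.0
    precision_num = 0.0
    precision_den = 0
    for exh, spec, units in ranked[:k]:
        novel = units - seen  # text not returned at an earlier rank
        recall_num += exh * len(novel) / len(units)
        precision_num += spec * len(novel)
        precision_den += len(novel)
        seen |= units
    recall_den = sum(exh for exh, _, _ in ranked)
    recall = recall_num / recall_den
    precision = precision_num / precision_den if precision_den else 0.0
    return recall, precision


# A section (positions 0-4) ranked before the article that contains it.
section = (1.0, 1.0, set(range(0, 5)))
article = (1.0, 0.5, set(range(0, 10)))

recall_o, precision_o = overlap_metrics([section, article], k=2)
print(recall_o, precision_o)
```

The article is credited only for its five novel positions: recall is (1.0 + 1.0 · 5/10) / 2.0 = 0.75 and precision is (1.0 · 5 + 0.5 · 5) / 10 = 0.75.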

Page 31

INEX 2003 metric

Penalises overlap by only scoring novel information in overlapping results

Assume uniform distribution of relevant information

Issue of stability

Size considered directly in precision (is it intuitive that large is good or not?)

Recall defined using exh only

Precision defined using spec only

Page 32

Alternative metrics

User-effort oriented measures

Expected Relevant Ratio

Tolerance to Irrelevance

Discounted Cumulated Gain

Page 33

Lessons learnt

Good definition of relevance

Expressing CAS queries was not easy

Relevance assessment process must be “improved”

Further development on metrics needed

User studies required

Page 34

Conclusion

XML retrieval is not just about the effective retrieval of XML documents, but also about how to evaluate effectiveness

INEX 2004 tracks

Relevance feedback

Interactive

Heterogeneous collection

Natural language query

http://inex.is.informatik.uni-duisburg.de:2004/

Page 35

INEX: Evaluating content-oriented XML retrieval

Mounia Lalmas

Queen Mary University of London

http://qmir.dcs.qmul.ac.uk