Page 1: Evaluating content-oriented XML retrieval: The INEX initiative Mounia Lalmas Queen Mary University of London

Evaluating content-oriented XML retrieval: The INEX initiative

Mounia Lalmas

Queen Mary University of London

http://qmir.dcs.qmul.ac.uk

Page 2

Outline

• Information retrieval
• XML retrieval
• Evaluating information retrieval
• Evaluating XML retrieval: INEX

Page 3

Information retrieval

Example of a user information need (e.g. on the WWW):

“Find all documents about sailing charter agencies that (1) offer sailing boats in the Greek islands, and (2) are registered with the RYA. The documents should contain boat specification, price per week, e-mail and other contact details.”

A formal representation of an information need constitutes a query.

Page 4

Information retrieval

IR is concerned with the representation, storage and organisation of, and access to, repositories of information, usually in the form of documents.

Primary goal of an IR system: “Retrieve all the documents which are relevant (useful) to a user query, while retrieving as few non-relevant documents as possible.”

Page 5

Conceptual model for IR

[Figure: conceptual model for IR. Labels: documents, indexing, document representation; query, formulation, query representation; retrieval function; retrieval results; relevance feedback.]

Page 6

XML Retrieval

• Traditional IR is about finding documents relevant to a user’s information need, e.g. an entire book.
• XML allows users to retrieve document components that are more focussed on their information needs, e.g. a chapter of a book instead of an entire book.
• The structure of documents is exploited to identify which document components (XML elements) to retrieve.

Page 7

XML: eXtensible Mark-up Language

• Meta-language (user-defined tags) currently being adopted as the document format language by W3C
• Used to describe content and structure (and not layout)
• Grammar described in a DTD (used for validation)

    <lecture>
      <title> Structured Document Retrieval </title>
      <author> <fnm> Smith </fnm> <snm> John </snm> </author>
      <chapter>
        <title> Introduction into XML retrieval </title>
        <paragraph> …. </paragraph>
        …
      </chapter>
      …
    </lecture>

    <!ELEMENT lecture (title, author+, chapter+)>
    <!ELEMENT author (fnm*, snm)>
    <!ELEMENT fnm (#PCDATA)>
    …

Page 8

XML: eXtensible Mark-up Language

Use of XPath notation to refer to the XML structure:

• chapter/title: title is a direct sub-component of chapter
• //title: any title
• chapter//title: title is a direct or indirect sub-component of chapter
• chapter/paragraph[2]: any direct second paragraph of any chapter
• chapter/*: all direct sub-components of a chapter

    <lecture>
      <title> Structured Document Retrieval </title>
      <author> <fnm> Smith </fnm> <snm> John </snm> </author>
      <chapter>
        <title> Introduction into SDR </title>
        <paragraph> …. </paragraph>
        …
      </chapter>
      …
    </lecture>
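
The XPath expressions on this slide can be tried out with a small sketch. It uses Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath; the sample document is modelled on the slide's `<lecture>` example, but the `<section>` element and the paragraph texts are invented so that the fragment is well-formed and the difference between `/` and `//` is visible. Because ElementTree evaluates paths relative to an element, `//title` is written as `.//title`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the slide's <lecture> document; the
# <section> element and the paragraph texts are invented for illustration.
SAMPLE = """
<lecture>
  <title>Structured Document Retrieval</title>
  <author><fnm>Smith</fnm><snm>John</snm></author>
  <chapter>
    <title>Introduction into SDR</title>
    <paragraph>First paragraph</paragraph>
    <paragraph>Second paragraph</paragraph>
    <section><title>Nested title</title></section>
  </chapter>
</lecture>
"""

root = ET.fromstring(SAMPLE)  # root is the <lecture> element


def texts(path):
    """Return the stripped text of every element matched by path, relative to <lecture>."""
    return [(el.text or "").strip() for el in root.findall(path)]


# chapter/title: titles that are direct children of a chapter
direct_titles = texts("chapter/title")
# //title: any title below the root (written .//title in ElementTree)
any_titles = texts(".//title")
# chapter//title: titles that are direct or indirect sub-components of a chapter
nested_titles = texts("chapter//title")
# chapter/paragraph[2]: the second direct paragraph of each chapter
second_paragraphs = texts("chapter/paragraph[2]")
# chapter/*: all direct sub-components of a chapter
chapter_children = [el.tag for el in root.findall("chapter/*")]
```

In this sample, `chapter/title` matches one element, while `chapter//title` also picks up the title nested inside the invented `<section>`.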

Page 9

Queries

• Content-only (CO) queries
  • Standard IR queries, but here we are retrieving document components
  • “London tube strikes”
• Content-and-structure (CAS) queries
  • Put constraints on which types of components are to be retrieved
  • E.g. “Sections of an article in the Times about congestion charges”
  • E.g. “Articles that contain sections about congestion charges in London, and that contain a picture of Ken Livingstone”

Page 10

Conceptual model for XML retrieval

[Figure: the conceptual model for IR (documents, indexing, document representation; query, formulation, query representation; retrieval function; retrieval results; relevance feedback), annotated for XML retrieval with: structured documents; content + structure; inverted file + structure index; tf, idf, acc; matching content + structure; presentation of related components.]

Page 11

Example of XML approaches

The representation of a composite element (e.g. article and section) is defined as the aggregated representation of its sub-elements.

[Figure: document tree with article, section and paragraph elements]

p1 is about “XML”, “retrieval”
p2 is about “XML”, “authoring”

Sec3 is then also about “XML” (in fact very much about “XML”), “retrieval”, “authoring”
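
The aggregation idea can be illustrated with a minimal sketch. The slide does not say how sub-element representations are combined; summing term counts is an assumption made here only because it reproduces the observation that the section is "very much about" a term both paragraphs share. The function name `aggregate` is hypothetical.

```python
from collections import Counter


def aggregate(children):
    """Combine the term counts of sub-elements into one representation.

    Summation is an assumed aggregation rule, chosen for illustration only.
    """
    total = Counter()
    for child in children:
        total.update(child)  # Counter.update adds counts term by term
    return total


# Paragraph representations from the slide, as bags of index terms.
p1 = Counter({"XML": 1, "retrieval": 1})
p2 = Counter({"XML": 1, "authoring": 1})

# The enclosing section inherits every term; "XML" is counted twice.
sec3 = aggregate([p1, p2])
```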

Page 12

Example of XML approaches

Document {?t1, ?t2, ?t3}

Title {0.9 t1, 0.4 t2}    Section_1 {0.5 t1}    Section_2 {0.2 t1, 0.7 t3}

? = aggregated weight of ti in Document, based on the instances of ti in the sub-elements (Title, Section_1 and Section_2)
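
The slide leaves the aggregated weight as "?", so the sketch below does not claim any particular formula. It only shows the mechanics with a pluggable combining function; taking the maximum or the mean of the sub-element weights are two assumed candidates, and `aggregate_weights` is a hypothetical name.

```python
from statistics import mean

# Term weights of the sub-elements, copied from the slide.
SUB_ELEMENTS = {
    "Title": {"t1": 0.9, "t2": 0.4},
    "Section_1": {"t1": 0.5},
    "Section_2": {"t1": 0.2, "t3": 0.7},
}


def aggregate_weights(children, combine):
    """Collect each term's weights across sub-elements and combine them.

    `combine` maps a list of weights to one weight; which function is
    appropriate is a modelling decision that the slide leaves open.
    """
    collected = {}
    for weights in children.values():
        for term, weight in weights.items():
            collected.setdefault(term, []).append(weight)
    return {term: combine(ws) for term, ws in collected.items()}


by_max = aggregate_weights(SUB_ELEMENTS, max)    # strongest evidence wins
by_mean = aggregate_weights(SUB_ELEMENTS, mean)  # average over sub-elements containing the term
```

With `max`, the Document inherits 0.9 for t1; with `mean`, t1 is averaged over the three sub-elements in which it occurs.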

Page 13

Evaluation

• The goal of an IR system
  • retrieve as many relevant documents as possible and as few non-relevant documents as possible
• Comparative evaluation of the technical performance of IR systems = effectiveness
  • ability of the IR system to retrieve relevant documents and suppress non-relevant documents
• Effectiveness
  • combination of recall and precision

Page 14

Relevance

A document is relevant if it “has significant and demonstrable bearing on the matter at hand”.

Common assumptions:
• Objectivity
• Topicality
• Binary nature
• Independence

Page 15

Recall / Precision

[Figure: Venn diagram of the document collection showing the retrieved documents, the relevant documents, and their overlap (retrieved and relevant)]

\[ \text{recall} = \frac{\text{number of relevant documents retrieved}}{\text{number of relevant documents}} \]

\[ \text{precision} = \frac{\text{number of relevant documents retrieved}}{\text{number of documents retrieved}} \]

Page 16

Recall / Precision

Relevant documents for a given query:

{d3, d5, d9, d25, d39, d44, d56, d71, d89, d123}

rank  doc   precision  recall
1     d123  1/1        1/10
2     d84
3     d56   2/3        2/10
4     d6
5     d8
6     d9    3/6        3/10
7     d511
8     d129
9     d187
10    d25   4/10       4/10
11    d48
12    d250
13    d113
14    d3    5/14       5/10
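
The table can be recomputed with a few lines of code. The sketch below uses the relevant set and the ranking from the slide and reports precision and recall at each rank where a relevant document appears; the function name is hypothetical.

```python
# Relevant documents and ranked output, copied from the slide.
RELEVANT = {"d3", "d5", "d9", "d25", "d39", "d44", "d56", "d71", "d89", "d123"}
RANKING = ["d123", "d84", "d56", "d6", "d8", "d9", "d511",
           "d129", "d187", "d25", "d48", "d250", "d113", "d3"]


def precision_recall_at_relevant_ranks(ranking, relevant):
    """Return (rank, doc, precision, recall) for every relevant document retrieved."""
    rows = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision = hits / rank          # relevant retrieved / retrieved so far
            recall = hits / len(relevant)    # relevant retrieved / all relevant
            rows.append((rank, doc, precision, recall))
    return rows


rows = precision_recall_at_relevant_ranks(RANKING, RELEVANT)
for rank, doc, precision, recall in rows:
    print(f"{rank:>2}  {doc:<5} precision={precision:.2f} recall={recall:.2f}")
```

Recall never reaches 1 here because five of the ten relevant documents are not in the top 14.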

Page 17

Comparison of systems

[Figure: precision (y-axis, 0 to 100) plotted against recall (x-axis, 0 to 100) for system 1 and system 2]

Page 18

Test collection

• Document collection = the documents themselves
  • depends on the task, e.g. evaluating web retrieval requires a collection of HTML documents
• Queries / requests
  • simulate real user information needs
• Relevance judgements
  • state, for each query, the relevant documents
• See TREC

Page 19

Evaluation of XML retrieval: INEX

• Evaluating the effectiveness of content-oriented XML retrieval approaches
• Collaborative effort = participants contribute to the development of the collection
  • queries
  • relevance assessments
• Similar methodology as for TREC, but adapted to XML retrieval

Page 20

INEX Test Collection

• The INEX test collection (2002)
  • Documents (~500MB), which consist of 12,107 articles in XML format from the IEEE Computer Society
  • 30 CO and 30 CAS queries
  • Relevance assessments per retrieved component, by participating groups
  • Relevance defined in terms of “relevance” and “coverage”
  • Participants: 36 active groups worldwide
• In 2003, INEX has 36 CO and 30 CAS queries
  • Same document collection
  • CAS queries are defined according to a subset of XPath
  • Relevance assessments per retrieved component, by participating groups
  • Relevance defined in terms of “exhaustivity” and “specificity”
  • Participants: 40 active groups worldwide
• INEX 2004 is just starting

Page 21

Example of CO topic

<inex_topic topic_id="126" query_type="CO" ct_no="25">

<title>Open standards for digital video in distance learning</title>

<description>Open technologies behind media streaming in distance learning projects</description>

<narrative> I am looking for articles/components discussing methodologies of digital video production and distribution that respect free access to media content through internet or via CD-ROMs or DVDs in connection to the learning process. Discussions of open versus proprietary standards of storing and sending digital video will be appreciated. </narrative>

<keywords>media streaming,video streaming,audio streaming, digital video,distance learning,open standards,free access</keywords>
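
A topic in this format can be read with a standard XML parser. The sketch below is illustrative only and is not the INEX tooling: it uses an abridged copy of the topic (the narrative is omitted), and it adds a closing `</inex_topic>` tag, which the slide does not show, so that the fragment parses. The function name `read_topic` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the slide's topic. The closing </inex_topic> tag is an
# assumption added so the fragment is well-formed; the narrative is omitted.
TOPIC_XML = """<inex_topic topic_id="126" query_type="CO" ct_no="25">
<title>Open standards for digital video in distance learning</title>
<description>Open technologies behind media streaming in distance learning projects</description>
<keywords>media streaming,video streaming,audio streaming, digital video,distance learning,open standards,free access</keywords>
</inex_topic>"""


def read_topic(xml_text):
    """Extract the identifier, query type, title and keyword list of a topic."""
    topic = ET.fromstring(xml_text)
    raw_keywords = topic.findtext("keywords", default="")
    return {
        "id": topic.get("topic_id"),
        "type": topic.get("query_type"),
        "title": topic.findtext("title", default="").strip(),
        # Keywords are comma-separated; strip stray spaces around each one.
        "keywords": [k.strip() for k in raw_keywords.split(",") if k.strip()],
    }


topic = read_topic(TOPIC_XML)
```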

Page 22

Example of CAS topic

<title>//article[about(.,'formal methods verify correctness aviation systems')]/body//*[about(.,'case study application model checking theorem proving')]</title>

<description>Find documents discussing formal methods to verify correctness of aviation systems. From those articles extract parts discussing a case study of using model checking or theorem proving for the verification. </description>

<narrative>To be considered relevant a document must be about using formal methods to verify correctness of aviation systems, such as flight traffic control systems, airplane- or helicopter- parts. From those documents a body-part must be returned (I do not want the whole body element, I want something smaller). That part should be about a case study of applying a model checker or a theorem prover to the verification. </narrative>

<keywords>SPIN, SMV, PVS, SPARK, CWB</keywords>
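
To make the structure of the CAS title easier to see, the sketch below pulls it apart with regular expressions. This is not the INEX topic parser; it is a rough illustration that assumes the quoted content descriptions contain no quotes or square brackets, and the function names are hypothetical.

```python
import re

# The CAS title from the slide, split across two literals for readability.
CAS_TITLE = (
    "//article[about(.,'formal methods verify correctness aviation systems')]"
    "/body//*[about(.,'case study application model checking theorem proving')]"
)

# Matches about(<path>, '<terms>') and captures the path and the quoted terms.
ABOUT = re.compile(r"about\(\s*([^,]+?)\s*,\s*'([^']*)'\s*\)")


def about_clauses(title):
    """Return (relative path, list of content terms) for each about() clause."""
    return [(path, words.split()) for path, words in ABOUT.findall(title)]


def target_path(title):
    """Drop every [...] predicate, leaving only the structural path."""
    return re.sub(r"\[[^\]]*\]", "", title)


clauses = about_clauses(CAS_TITLE)
structure = target_path(CAS_TITLE)
```

The first clause constrains the article as a whole; the second constrains the element to be returned, which is any descendant of the article's body.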

Page 23

Relevance in XML

An element is relevant if it “has significant and demonstrable bearing on the matter at hand”

Common assumptions in IR:
• Objectivity
• Topicality
• Binary nature
• Independence

[Figure: document tree with article, section and paragraph elements]

Page 24

Relevance in XML

• Exhaustivity
  • how exhaustively a document component discusses the topic of request
• Specificity
  • how focused the component is on the topic of request (i.e. discusses no other, irrelevant topics)
• 4-graded: 0, 1, 2, 3
  • needed because of the structure
• Relevance: (3,3), (2,3), (1,1), (0,0), etc.

Page 25

Relevance assessment task

• Exhaustivity
  • Element, parent element, children element
• Consistency
  • Parent of a relevant element must also be relevant, although to a different extent
  • Exhaustivity increases going up the tree
  • Specificity decreases going up the tree
• Use of an online interface
• Assessing a query takes a week!
• Average 2 topics per participant
• Only participants that complete the assessment task have access to the collection

[Figure: document tree with article, section and paragraph elements]
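
The consistency rule lends itself to an automatic check. The sketch below is not the INEX assessment interface. It assumes that "relevant" means exhaustivity greater than 0 and that the rule implies a parent is at least as exhaustive as each of its children; the class `Assessed`, the function `violations` and the example grades are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Assessed:
    """An element with its (exhaustivity, specificity) grades, each in 0..3."""
    name: str
    exhaustivity: int
    specificity: int
    children: list = field(default_factory=list)


def violations(element):
    """List the places where a parent breaks the assumed consistency rule."""
    problems = []
    for child in element.children:
        if child.exhaustivity > 0 and element.exhaustivity == 0:
            problems.append(f"{element.name}: parent of relevant {child.name} is not relevant")
        elif child.exhaustivity > element.exhaustivity:
            problems.append(f"{element.name}: less exhaustive than child {child.name}")
        problems.extend(violations(child))  # check the subtree as well
    return problems


# Invented assessments: the article is graded less exhaustive than its section.
p1 = Assessed("p[1]", 3, 3)
p2 = Assessed("p[2]", 0, 0)
sec = Assessed("sec[1]", 3, 2, [p1, p2])
article = Assessed("article", 2, 1, [sec])

found = violations(article)

# Raising the article's exhaustivity to match its section removes the conflict.
article_fixed = Assessed("article", 3, 1, [sec])
found_after_fix = violations(article_fixed)
```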

Page 26

Metrics

Recall/precision can be used but must take into consideration:

• near misses (we do not retrieve the best component, e.g. p[4], but one near enough, e.g. p[2])
• overlap (we retrieve a component, e.g. doc[23], and one of its sub-components, e.g. sec[3])

[Figure: element tree showing doc[23], sec[3], p[2] and p[4]]
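
Overlap in a result list can be detected mechanically once each retrieved element is identified by its path from the document root. The sketch below assumes, from the figure, that p[2] sits inside sec[3], which sits inside doc[23]. It represents paths as tuples and shows one possible way of filtering a ranked list; it is not an INEX metric, and the function names are hypothetical.

```python
def is_ancestor(a, b):
    """True if path a is a proper ancestor of path b."""
    return len(a) < len(b) and b[:len(a)] == a


def overlapping_pairs(retrieved):
    """Return every (ancestor, descendant) pair found in a result list."""
    return [(a, b) for a in retrieved for b in retrieved if is_ancestor(a, b)]


def remove_overlap(ranked):
    """Keep an element only if no higher-ranked kept element contains it or is contained in it."""
    kept = []
    for path in ranked:
        clashes = any(
            k == path or is_ancestor(k, path) or is_ancestor(path, k) for k in kept
        )
        if not clashes:
            kept.append(path)
    return kept


# Ranked result list using the element names from the slide.
run = [
    ("doc[23]",),
    ("doc[23]", "sec[3]"),
    ("doc[23]", "sec[3]", "p[2]"),
]

pairs = overlapping_pairs(run)
filtered = remove_overlap(run)
```

Keeping the highest-ranked element of each overlapping group is only one policy; a system could equally prefer the most specific element.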

Page 27

Conclusion

• XML retrieval is not just about the effective retrieval of XML documents, but also about how to evaluate the effectiveness
• INEX 2004
  • More rigorous query topic format (e.g. parser)
  • New metrics (e.g. not based on precision/recall)
  • Tracks
    • Relevance feedback
    • Interactive
    • Heterogeneous collection
    • Natural language query

Page 28

Thank you

http://inex.is.informatik.uni-duisburg.de:2004/