Overview of the INEX 2008 Efficiency Track

Martin Theobald, Ralf Schenkel

Max Planck Institute

General Idea

• We already have a nice collection of assessed INEX Ad-Hoc topics from 2006-2008, so why not consider runtimes for a change?

• Attract more people from DB&IR to efficient XML-IR

• Investigate effectiveness/efficiency trade-offs for different retrieval modes and topic types: Article, Thorough, Focused, NEXI CO/CAS, XPath 2.0 Full-Text, high-dimensional content (query expansion/relevance feedback), deep structure, top-k, distribution, sequential vs. parallel executions

Test Collection

• Default INEX-Wikipedia collection 2007
– 4.38 GB XML sources
– > 659,000 documents and > 115,000,000 elements
– Not very heterogeneous, but the structure is sometimes rather awkward, with many deeply nested paths
– > 3,000 distinct tags (> 1,000 of which have content)
– > 120,000 distinct root-to-leaf paths
– No DTD available
– But: allows reuse of the large body of Ad-Hoc topics & assessments

Topics

• 540 type (A) topics (no. 289-828)
– Previous and current Ad-Hoc topics taken from INEX 2006-2008
– 308 topics have assessments
– Topic titles in NEXI CO & CAS and XPath 2.0 Full-Text syntax
– Full-text predicates: "", +, -

• 21 type (B) topics (no. 829-849)
– High-dimensional content with up to 112 keywords
– Obtained from the 2006 Interactive Track feedback experiments by the Royal School of Library and Information Science, Denmark
– Originally CO topics only, cast into CAS using //*[about(…)]
– Mapped to the original Ad-Hoc topic id, so the assessments can be reused

• 7 type (C) topics (no. 850-856)
– High-dimensional structure with multiple branches
– Newly submitted by Efficiency Track participants
– Assessments skipped due to low expected impact on overall effectiveness results
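The ID ranges and the CO-to-CAS cast described above can be sketched in a few lines of Python. The function names `topic_type`, `cast_to_cas` and `cast_to_xpath` are hypothetical, and the XPath form simply mirrors the `ftcontains` style of the example topic 844; this is an illustration under those assumptions, not the organizers' conversion script.

```python
def topic_type(topic_id: int) -> str:
    """Map a topic number to its type using the ID ranges listed above."""
    if 289 <= topic_id <= 828:
        return "A"
    if 829 <= topic_id <= 849:
        return "B"
    if 850 <= topic_id <= 856:
        return "C"
    raise ValueError(f"topic {topic_id} is not part of the Efficiency Track")


def cast_to_cas(co_title: str) -> str:
    """Wrap a keyword-only (CO) title into a NEXI CAS query on any element."""
    keywords = " ".join(co_title.split())  # normalize whitespace
    return f"//*[about(., {keywords})]"


def cast_to_xpath(co_title: str) -> str:
    """Same cast, written in the ftcontains style of the example topic."""
    keywords = " ".join(co_title.split())
    return f'//*[. ftcontains "{keywords}"]'


print(topic_type(844))
print(cast_to_cas("castle mound fortress"))
print(cast_to_xpath("castle mound fortress"))
```

Because the cast targets `//*`, any element may be returned, so the structural part of such a query adds no real constraint beyond the keywords.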

<topic id="844" adhocid="517" type="B">

<co_title>

castle mound castles fortress defensive earthworks offensive herefordshire french fortification ditch hollingbury scrob type circular conquest defend siegecraft surrounded dry essential walls norman ages weapon skagerrak kattegat inseparably internees feature citadels halland bayeux connotes palisade zar segovia mota provide earth castel word motte richard middle twofold fitz moat inroads tapestry confessor shropshire country flattened article disambiguation examines perimeter include bordering angular denotes styled crest prehistoric discusses fortified maiden timber denote countryside occupy welsh summit hostile erected towers parish danish mainland siege depicted wait mechanism stronger restricted residence aspect familiar provinces knight subjects survive virtually medieval lay swedish estate enemies describes measure denmark structures architecture traditionally domestic techniques store permanent normally camp fort

</co_title>

<cas_title>

//*[about(., castle mound castles fortress defensive earthworks offensive herefordshire french fortification ditch hollingbury scrob type circular conquest defend siegecraft surrounded dry essential walls norman ages weapon skagerrak kattegat inseparably internees feature citadels halland bayeux connotes palisade zar segovia mota provide earth castel word motte richard middle twofold fitz moat inroads tapestry confessor shropshire country flattened article disambiguation examines perimeter include bordering angular denotes styled crest prehistoric discusses fortified maiden timber denote countryside occupy welsh summit hostile erected towers parish danish mainland siege depicted wait mechanism stronger restricted residence aspect familiar provinces knight subjects survive virtually medieval lay swedish estate enemies describes measure denmark structures architecture traditionally domestic techniques store permanent normally camp fort)]

</cas_title>

<xpath_title>//*[. ftcontains "castle mound castles fortress defensive earthworks offensive herefordshire french fortification ditch hollingbury scrob type circular conquest defend siegecraft surrounded dry essential walls norman ages weapon skagerrak kattegat inseparably internees feature citadels halland bayeux connotes palisade zar segovia mota provide earth castel word motte richard middle twofold fitz moat inroads tapestry confessor shropshire country flattened article disambiguation examines perimeter include bordering angular denotes styled crest prehistoric discusses fortified maiden timber denote countryside occupy welsh summit hostile erected towers parish danish mainland siege depicted wait mechanism stronger restricted residence aspect familiar provinces knight subjects survive virtually medieval lay swedish estate enemies describes measure denmark structures architecture traditionally domestic techniques store permanent normally camp fort"]

</xpath_title>

</topic>

Example Topic - Type (B)

Conjunctive evaluation not feasible!
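One reading of this remark can be illustrated on a toy example. The sketch below uses a made-up three-document collection and hypothetical helper names; it only shows that requiring every keyword (AND) quickly yields an empty result as queries grow, while a disjunctive (OR) evaluation with a simple match count still produces a ranking. The match count is a stand-in, not any participant's scoring model.

```python
from collections import Counter

# Made-up mini collection, for illustration only.
DOCS = {
    "d1": "norman castle with motte and moat",
    "d2": "medieval fortress siege walls",
    "d3": "danish pastry recipe",
}

QUERY = ["castle", "motte", "moat", "fortress", "siege", "danish"]


def conjunctive(query_terms, docs):
    """Return documents that contain every query term."""
    return [
        doc_id
        for doc_id, text in docs.items()
        if all(term in text.split() for term in query_terms)
    ]


def disjunctive(query_terms, docs):
    """Rank documents by the number of distinct query terms they contain."""
    scores = Counter()
    for doc_id, text in docs.items():
        words = set(text.split())
        hits = sum(1 for term in query_terms if term in words)
        if hits:
            scores[doc_id] = hits
    return scores.most_common()


print(conjunctive(QUERY, DOCS))
print(disjunctive(QUERY, DOCS))
```

With six terms the conjunctive result is already empty here; with up to 112 keywords per type (B) topic the effect is far stronger.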

Example Topic - Type (C)

<topic id="856" type="C">

<co_title>

State Parks Geology Geography +Canyon

</co_title>

<cas_title>

//article//body[about(.//section//p, State Park) and

about(.//section//title, Geology) and

about(.//section//title, Geography)]

//figure[about(.//caption, +Canyon)]

</cas_title>

<xpath_title>

//article//body[.//section//p ftcontains "State Park" and

.//section//title ftcontains "Geology" and

.//section//title ftcontains "Geography"]

//figure[.//caption ftcontains "Canyon"]

</xpath_title>

<description>

I’m looking for state parks with sections describing their geology and/or

geography, preferably with a figure of a canyon as target element.

</description>

<narrative>

State park pages often follow the common pattern of having sections entitled

with "Geology" or "Geography". I’m particularly interested in those pages with a figure

of a canyon, e.g., the Grand Canyon.

</narrative>

</topic>
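Topics in this form can be read with a standard XML parser. The following minimal sketch assumes a topic is available as a string shaped like the example above (abbreviated here to two title fields); `read_topic` and `required_terms` are hypothetical names, and treating a leading `+` as "mandatory term" follows the full-text predicates listed for the topic titles.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of topic 856, kept on single lines for readability.
TOPIC_XML = """<topic id="856" type="C">
  <co_title>State Parks Geology Geography +Canyon</co_title>
  <cas_title>//article//body[about(.//section//p, State Park) and about(.//section//title, Geology) and about(.//section//title, Geography)]//figure[about(.//caption, +Canyon)]</cas_title>
</topic>"""


def read_topic(xml_text):
    """Return the topic attributes and its child fields as a dictionary."""
    root = ET.fromstring(xml_text)
    topic = {"id": int(root.get("id")), "type": root.get("type")}
    for child in root:
        # Collapse line breaks and indentation inside each field.
        topic[child.tag] = " ".join((child.text or "").split())
    return topic


def required_terms(co_title):
    """Terms marked with a leading '+' in a CO title."""
    return [term[1:] for term in co_title.split() if term.startswith("+")]


topic = read_topic(TOPIC_XML)
print(topic["id"], topic["type"])
print(required_terms(topic["co_title"]))
print(topic["cas_title"].count("about("), "about() predicates")
```

Counting the `about()` predicates gives a rough feel for how many branches a type (C) query has.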

Sub-Tasks

• Article
– Article-only runs, naturally overlap-free
– In combination with CO queries, this resembles a classic IR setting with keyword queries and documents as results

• Thorough
– Used in INEX 2003-2006, allows overlapping results
– May be more efficient for some systems

• Focused
– Current default mode in INEX, overlap-free at both passage and element level
– Removing overlap may be an expensive post-processing step
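The overlap removal that Focused mode requires can be sketched as a greedy post-processing pass over a Thorough-style result list. This is one simple strategy under stated assumptions (results as `(file, path, score)` tuples with XPath-like element paths, highest score wins); the names are hypothetical and it is not claimed to be what any participating system does.

```python
def overlaps(path_a, path_b):
    """True if one element path equals or is an ancestor of the other."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    # Compare whole path steps, so section[2] never matches section[20].
    return longer[: len(shorter)] == shorter


def remove_overlap(results):
    """Keep results in descending score order, dropping any result that
    overlaps an already kept result from the same file."""
    kept = []
    for file, path, score in sorted(results, key=lambda r: r[2], reverse=True):
        if not any(file == f and overlaps(path, p) for f, p, _ in kept):
            kept.append((file, path, score))
    return kept


thorough = [
    ("a.xml", "/article[1]/body[1]", 0.5),
    ("a.xml", "/article[1]/body[1]/section[2]", 0.9),
    ("a.xml", "/article[1]/body[1]/section[3]", 0.4),
    ("b.xml", "/article[1]", 0.7),
]
for result in remove_overlap(thorough):
    print(result)
```

Each candidate is compared against all kept results, so the pass is quadratic in the worst case, which hints at why this step can be expensive for long result lists.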

Submissions

<!ELEMENT efficiency-submission (topic-fields,
                                 general_description,
                                 ranking_description,
                                 indexing_description,
                                 caching_description,
                                 topic+)>

<!ATTLIST efficiency-submission
  participant-id CDATA #REQUIRED
  run-id         CDATA #REQUIRED
  task           (article|thorough|focused) #REQUIRED
  query          (automatic|manual) #REQUIRED
  sequential     (yes|no) #REQUIRED
  no_cpu         CDATA #IMPLIED
  ram            CDATA #IMPLIED
  no_nodes       CDATA #IMPLIED
  hardware_cost  CDATA #IMPLIED
  hardware_year  CDATA #IMPLIED
  topk           (15|150|1500) #IMPLIED >

<!ELEMENT topic-fields EMPTY>
<!ATTLIST topic-fields
  co_title        (yes|no) #REQUIRED
  cas_title       (yes|no) #REQUIRED
  xpath_title     (yes|no) #REQUIRED
  text_predicates (yes|no) #REQUIRED
  description     (yes|no) #REQUIRED
  narrative       (yes|no) #REQUIRED >

<!ELEMENT general_description (#PCDATA)>
<!ELEMENT ranking_description (#PCDATA)>
<!ELEMENT indexing_description (#PCDATA)>
<!ELEMENT caching_description (#PCDATA)>

<!ELEMENT topic (result*)>
<!ATTLIST topic
  topic-id      CDATA #REQUIRED
  total_time_ms CDATA #REQUIRED
  cpu_time_ms   CDATA #IMPLIED
  io_time_ms    CDATA #IMPLIED >

<!ELEMENT result (file, path, rank?, rsv?) >
<!ELEMENT file (#PCDATA)>
<!ELEMENT path (#PCDATA)>
<!ELEMENT rank (#PCDATA)>
<!ELEMENT rsv (#PCDATA)>

DTD for run submissions
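A run file following this DTD could be assembled as in the sketch below, which uses Python's standard `xml.etree.ElementTree`. All values (participant id "99", run id, file name, path, score, runtime) are made-up placeholders, `build_submission` is a hypothetical helper, and only the required attributes plus `topk` are filled in.

```python
import xml.etree.ElementTree as ET

DESCRIPTION_TAGS = (
    "general_description",
    "ranking_description",
    "indexing_description",
    "caching_description",
)


def build_submission(participant_id, run_id, task, topics):
    """topics: list of (topic_id, total_time_ms, [(file, path, rsv), ...])."""
    root = ET.Element("efficiency-submission", {
        "participant-id": participant_id,
        "run-id": run_id,
        "task": task,               # article | thorough | focused
        "query": "automatic",
        "sequential": "yes",
        "topk": "15",               # optional: 15 | 150 | 1500
    })
    # Children must appear in the order given by the content model.
    ET.SubElement(root, "topic-fields", {
        "co_title": "yes", "cas_title": "no", "xpath_title": "no",
        "text_predicates": "no", "description": "no", "narrative": "no",
    })
    for tag in DESCRIPTION_TAGS:
        ET.SubElement(root, tag).text = "placeholder text"
    for topic_id, total_time_ms, results in topics:
        topic = ET.SubElement(root, "topic", {
            "topic-id": str(topic_id),
            "total_time_ms": str(total_time_ms),
        })
        for rank, (file, path, rsv) in enumerate(results, start=1):
            result = ET.SubElement(topic, "result")
            ET.SubElement(result, "file").text = file
            ET.SubElement(result, "path").text = path
            ET.SubElement(result, "rank").text = str(rank)
            ET.SubElement(result, "rsv").text = str(rsv)
    return root


run = build_submission(
    "99", "demo-run", "focused",
    [(856, 42, [("12345.xml", "/article[1]/body[1]/figure[1]", 0.87)])],
)
print(ET.tostring(run, encoding="unicode"))
```

The per-topic `total_time_ms` is the only timing attribute the DTD requires; `cpu_time_ms` and `io_time_ms` are optional.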


Metrics

• Interpolated Precision (iP) and Mean Average Interpolated Precision (MAiP) for Focused & Article modes
– New INEX evaluation software 2008 (passage-based)
– Qrels from 2006-2007 transformed into 2008 format

• Classic precision/recall plots for Thorough mode
– INEX EvalJ 2006-2007 (element-based)
– Qrels from 2008 transformed back into 2006 format (made available for download at the track homepage)
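The idea behind iP and MAiP can be sketched as follows. This is a deliberately simplified version that treats each returned result as either relevant (1) or not (0), whereas the passage-based INEX measures work on amounts of relevant text; the 101 recall levels (0.00 to 1.00) and all function names are assumptions of this sketch, not the official evaluation software.

```python
def interpolated_precision(relevance, total_relevant, recall_level):
    """Highest precision at any rank whose recall is >= recall_level."""
    best = 0.0
    hits = 0
    for rank, rel in enumerate(relevance, start=1):
        hits += rel
        precision = hits / rank
        recall = hits / total_relevant
        if recall >= recall_level:
            best = max(best, precision)
    return best


def average_interpolated_precision(relevance, total_relevant):
    """Average iP over the recall levels 0.00, 0.01, ..., 1.00."""
    levels = [i / 100 for i in range(101)]
    total = sum(
        interpolated_precision(relevance, total_relevant, x) for x in levels
    )
    return total / len(levels)


def mean_aip(per_topic):
    """MAiP: mean of the per-topic averages; per_topic holds
    (relevance_list, total_relevant) pairs."""
    values = [average_interpolated_precision(r, n) for r, n in per_topic]
    return sum(values) / len(values)


ranking = [1, 0, 1, 0]  # relevant, non-relevant, relevant, non-relevant
print(interpolated_precision(ranking, 2, 0.01))
print(interpolated_precision(ranking, 2, 1.0))
print(round(average_interpolated_precision(ranking, 2), 4))
```

For the toy ranking, precision at low recall levels is 1.0 and drops to 2/3 once full recall is required, which is the kind of trade-off the early-precision figures (such as iP at low recall) are meant to capture.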

Participants & Runs

• Max-Planck-Institut Informatik [10], 8 runs
• University of Frankfurt [16], 5 runs
• University of Toronto [42], 2 runs
• University of Twente & CWI [53], 4 runs
• JustSystems Corporation [56], 1 run

Results Overview

• General Setting (parameters taken from submission headers)

Results Overview

• Effectiveness (iP, MAiP) vs. Efficiency (wallclock runtime)

Effectiveness: Focused & Article, All Topics

Focused, Type (A)

Focused, Type (B)

Focused, Type (C)

Thorough, All Topics

Conclusions

• Continue in 2009 with more new topics & subtasks

• Establish the track as a reference benchmark for XML-IR experiments for a broad DB&IR audience

• Also make it available to non-INEX participants?

http://www.inex.otago.ac.nz/efficiency/efficiency.asp