michael bendersky, w. bruce croft dept. of computer science univ. of massachusetts amherst amherst,...

Michael Bendersky, W. Bruce Croft
Dept. of Computer Science
Univ. of Massachusetts Amherst
Amherst, MA

SIGIR 2012

Outline

• Motivation
• Query Hypergraphs
• Ranking Documents
• Parameter Estimation
• Evaluation
• Conclusion

Motivation

• Goal: retrieve more relevant documents for users
• Query representation:
  - bag-of-words
  - term dependencies
  - concept dependencies (this paper)

Example

• "Provide information on the use of dogs worldwide for law enforcement purposes."

• bag-of-words: {provide, information, dog, …}
• term dependency: {(provide, information), (law, enforcement)}
• concept dependency: {(dog, law enforcement), …}

Example (cont.)

• "Provide information on the use of dogs worldwide for law enforcement purposes."

{provide, information, (law, enforcement)}   {(dog, law enforcement)}

Model concept dependency

• Use query hypergraphs
  1. Build linguistic structures over the query, e.g. "members of the rock group nirvana"
  2. Each element in the structures can be represented as a concept

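Query Hypergraphs

• Query hypergraph for the query "international art crime"

D: a document
i = international, a = art, c = crime, ac = "art crime"

V = {D, i, a, c, ac}

E = {({i}, D), ({a}, D), ({c}, D), ({ac}, D), ({i, a, c, ac}, D)}

Each element of E is a hyperedge.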

Query Hypergraph Induction

• Three types of structures
  - query term structure: individual query words
  - phrase structure: bigrams (order matters)
  - proximity structure: arbitrary subsets of query terms
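The three structures can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `induce_structures` and the `max_subset_size` cap are assumptions made here to keep the proximity structure small.

```python
from itertools import combinations

def induce_structures(query_terms, max_subset_size=3):
    """Group candidate concepts by structure type (illustrative sketch)."""
    # Query term structure: each individual query word is a concept.
    terms = [(t,) for t in query_terms]
    # Phrase structure: adjacent bigrams, order preserved.
    phrases = [tuple(query_terms[i:i + 2]) for i in range(len(query_terms) - 1)]
    # Proximity structure: unordered subsets of two or more query terms.
    # The size cap is a hypothetical knob, not something the slides specify.
    proximity = [
        frozenset(subset)
        for n in range(2, max_subset_size + 1)
        for subset in combinations(query_terms, n)
    ]
    return {"term": terms, "phrase": phrases, "proximity": proximity}

structures = induce_structures(["international", "art", "crime"])
```

For the three-word query this yields three term concepts, two ordered bigrams, and four unordered subsets (three pairs and one triple).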

Hyperedges

• Local hyperedges: ({k}, D) for each $k \in K^Q$
• Global hyperedge: ($K^Q$, D)

k: a concept
$K^Q$: the set of query concepts
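As a rough sketch, the hypergraph can be assembled from a list of concepts: one local hyperedge per concept plus a single global hyperedge over all of them. The function name `build_query_hypergraph` and the use of plain strings for vertices are assumptions for illustration only.

```python
def build_query_hypergraph(concepts, document="D"):
    """Return (V, E) for a query hypergraph (illustrative sketch)."""
    # Vertices: every query concept plus the document vertex.
    vertices = set(concepts) | {document}
    # Local hyperedges: one per concept, linking that concept to the document.
    local_edges = [(frozenset([k]), document) for k in concepts]
    # Global hyperedge: links the entire concept set to the document.
    global_edge = (frozenset(concepts), document)
    return vertices, local_edges + [global_edge]

# Concepts for "international art crime": three terms and the phrase "art crime".
V, E = build_query_hypergraph(["i", "a", "c", "ac"])
```

With the four concepts from the "international art crime" example, this gives five vertices and five hyperedges, matching the V and E shown earlier.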

Ranking Documents

• Relevance score, computed from the hyperedge factors

Q: a query
D: a document
e: a hyperedge
E: the set of hyperedges

Factor: $\phi_e(k_e, D)$

Local Factors

$\lambda(k)$: the importance weight of the concept k

$f(k, D)$: a matching function between the concept k and the document D

Matching Function

$$f(k, D) = \log \frac{tf(k, D) + \mu \, \frac{tf(k, C)}{|C|}}{|D| + \mu}$$

C: the collection
$|D|$: the number of terms in the document
$|C|$: the number of terms in the collection
$\mu$: Dirichlet smoothing parameter
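A small sketch of a Dirichlet-smoothed matching function, assuming the standard form log((tf(k, D) + μ · tf(k, C) / |C|) / (|D| + μ)) built from the quantities defined above. The function name and the default `mu=2500.0` are illustrative choices, not values taken from the slides.

```python
import math

def matching_score(tf_doc, doc_len, tf_coll, coll_len, mu=2500.0):
    """Dirichlet-smoothed log-likelihood of a concept in a document (sketch).

    tf_doc:   occurrences of the concept in the document, tf(k, D)
    doc_len:  number of terms in the document, |D|
    tf_coll:  occurrences of the concept in the collection, tf(k, C); must be > 0
    coll_len: number of terms in the collection, |C|
    mu:       Dirichlet smoothing parameter (illustrative default)
    """
    smoothed_count = tf_doc + mu * tf_coll / coll_len
    return math.log(smoothed_count / (doc_len + mu))

score_present = matching_score(tf_doc=5, doc_len=100, tf_coll=1000, coll_len=1_000_000)
score_absent = matching_score(tf_doc=0, doc_len=100, tf_coll=1000, coll_len=1_000_000)
```

Smoothing keeps the score finite when the concept is absent from the document, as long as it occurs somewhere in the collection.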

Global Factor

• Considers the dependency between the entire set of query concepts
• The dependency range is much longer for concept dependencies.

$\pi_D$: the highest-scoring passage from the document

$\lambda(k, K^Q)$: the importance weight of concept k in the context of the entire set of query concepts $K^Q$ (with the concept matched in the passage $\pi_D$)
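To make the interplay of local and global factors concrete, here is a hedged sketch of a scoring function. It assumes each factor is an exponentiated weighted match and that the score is the sum of log-factors, so local factors contribute weight × match over the whole document and the global factor contributes the best weighted sum over any single passage. The fixed-size sliding passages, single-term concepts, and all names below are simplifications introduced here, not details confirmed by the slides.

```python
import math

def match(concept, tokens, coll_tf, coll_len, mu=2500.0):
    # Dirichlet-smoothed match of a single-term concept against a token list.
    tf = tokens.count(concept)
    return math.log((tf + mu * coll_tf[concept] / coll_len) / (len(tokens) + mu))

def passages(tokens, size, step):
    # Fixed-size windows starting every `step` tokens (hypothetical passage model).
    last_start = max(len(tokens) - size, 0)
    return [tokens[s:s + size] for s in range(0, last_start + 1, step)]

def score(concepts, tokens, local_w, global_w, coll_tf, coll_len, size=4, step=2):
    # Local part: each concept is matched independently against the whole document.
    local = sum(local_w[k] * match(k, tokens, coll_tf, coll_len) for k in concepts)
    # Global part: all concepts are matched together inside the best passage.
    best_passage = max(
        sum(global_w[k] * match(k, p, coll_tf, coll_len) for k in concepts)
        for p in passages(tokens, size, step)
    )
    return local + best_passage

concepts = ["dog", "police"]
weights = {"dog": 1.0, "police": 1.0}
coll_tf, coll_len = {"dog": 10, "police": 10}, 10_000
near = ["dog", "police", "x", "x", "x", "x", "x", "x"]
far = ["dog", "x", "x", "x", "x", "x", "x", "police"]
score_near = score(concepts, near, weights, weights, coll_tf, coll_len)
score_far = score(concepts, far, weights, weights, coll_tf, coll_len)
```

Both toy documents contain the same terms, so their local parts are equal; the document where the two concepts fall in the same passage scores higher through the global part.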

Example

{(dog, law enforcement)}

The two concepts do not appear in the same sentence, but they co-occur in a larger text passage.

Query Hypergraph Parameterization

• Goal: parameterize the concept weights, local $\lambda(k)$ and global $\lambda(k, K^Q)$
• Parameterization by structure
• Parameterization by concept

Parameterization By Structure

$\sigma$: a structure

Parameterization By Concept

• Parameterize the concept weights based on the concepts themselves

[Equation not preserved; its annotated parts were "concept importance feature" and "estimation"]
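The two parameterizations can be contrasted with a short sketch. It assumes that parameterization by structure shares one weight among all concepts of the same structure, and that parameterization by concept computes each weight as a linear combination of concept importance features with estimated coefficients. Both readings, and every name and number below (including the feature names), are assumptions for illustration, since the slide equations are not preserved.

```python
def weight_by_structure(structure, structure_weights):
    # One shared weight per structure type (term, phrase, proximity).
    return structure_weights[structure]

def weight_by_concept(features, feature_weights):
    # Weight depends on the concept itself, via its importance features.
    return sum(feature_weights[name] * value for name, value in features.items())

# Hypothetical weights and features, for illustration only.
structure_weights = {"term": 0.8, "phrase": 0.1, "proximity": 0.1}
feature_weights = {"idf": 0.5, "is_phrase": 0.25}

w_struct = weight_by_structure("phrase", structure_weights)
w_concept = weight_by_concept({"idf": 2.0, "is_phrase": 1.0}, feature_weights)
```

Under the first scheme every phrase concept gets the same weight; under the second, two phrases with different feature values get different weights.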

Parameter Estimation

• Optimize a target metric (mean average precision)
• Rely on a large collection
• Use the coordinate ascent algorithm
  - a coordinate-level hill-climbing search
  - repeatedly cycles through each of the parameters, while holding all other parameters fixed
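A minimal sketch of coordinate ascent as described: cycle through the parameters, nudge one at a time while the others stay fixed, and keep a change only if the objective improves. The fixed step size, the stopping rule, and the toy objective are assumptions made here; in the setting of the slides the objective would be mean average precision on training queries.

```python
def coordinate_ascent(objective, params, step=0.1, max_rounds=50):
    """Coordinate-level hill climbing (illustrative sketch)."""
    params = list(params)
    best = objective(params)
    for _ in range(max_rounds):
        improved = False
        # Visit each parameter in turn, holding all the others fixed.
        for i in range(len(params)):
            for delta in (step, -step):
                candidate = params.copy()
                candidate[i] += delta
                value = objective(candidate)
                if value > best:
                    params, best, improved = candidate, value, True
        if not improved:
            break  # no single-coordinate move helps any more
    return params, best

# Toy objective with its maximum at (1.0, -0.5), standing in for a retrieval metric.
def toy_objective(p):
    return -((p[0] - 1.0) ** 2) - ((p[1] + 0.5) ** 2)

best_params, best_value = coordinate_ascent(toy_objective, [0.0, 0.0])
```

Because only one coordinate changes per trial, the search needs nothing beyond the ability to evaluate the metric, which is why it suits non-differentiable targets such as MAP.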

Parameter Estimation (cont.)

1. Optimize the local component (the weights $\lambda(k)$)
2. Retrieve the top thousand documents
3. Optimize the global component (the weights $\lambda(k, K^Q)$)

Parameter Estimation (cont.)

(Robust04 collection)

Evaluation (testing)

• Search engine: Indri
• Test collections
• Queries

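Evaluation (evaluation metric)

• MAP (mean average precision)
  e.g. Topic 1 has 3 relevant documents, retrieved at ranks 1, 3, and 5: AP = (1/1 + 2/3 + 3/5) / 3

• ERR@k (expected reciprocal rank, k = 20)

$$\mathrm{ERR@}k = \sum_{i=1}^{k} \frac{1}{i} \, R(g_i) \prod_{j=1}^{i-1} \bigl(1 - R(g_j)\bigr)$$

g = 0, 1, 2, 3, 4: the relevance grade
$R(g) = (2^g - 1) / 16$
$R(g_i)$: the user is satisfied by the document at rank i
$\prod_{j=1}^{i-1} (1 - R(g_j))$: the user was not satisfied by the previous documents (ranks 1 to i-1)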
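Both metrics are easy to check by hand with a short sketch. The function names are illustrative; `average_precision` assumes every relevant document was retrieved, as in the Topic 1 example, and `err_at_k` uses the gain R(g) = (2^g - 1) / 16 with grades from 0 to 4.

```python
def average_precision(relevant_ranks):
    # Precision at each relevant rank, averaged over the relevant documents.
    ranks = sorted(relevant_ranks)
    return sum((i + 1) / r for i, r in enumerate(ranks)) / len(ranks)

def err_at_k(grades, k=20, max_grade=4):
    # grades[i] is the relevance grade of the document at rank i + 1.
    err, p_not_satisfied_yet = 0.0, 1.0
    for rank, g in enumerate(grades[:k], start=1):
        r = (2 ** g - 1) / 2 ** max_grade  # probability this document satisfies the user
        err += p_not_satisfied_yet * r / rank
        p_not_satisfied_yet *= 1 - r
    return err

ap_topic1 = average_precision([1, 3, 5])  # (1/1 + 2/3 + 3/5) / 3
err_example = err_at_k([0, 4])            # the only useful document is at rank 2
```

MAP is the mean of `average_precision` over all topics. ERR discounts a document both by its rank and by how likely the user was to have stopped at an earlier document.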

Evaluation (retrieval performance)

Conclusion

• Models arbitrary term dependencies as concepts
• Uses passage-level evidence to model the dependencies between the concepts
• Assigns weights to both concepts and concept dependencies
• The proposed retrieval framework improves retrieval effectiveness for verbose natural language queries