Discovering Key Concepts in Verbose Queries Michael Bendersky and W. Bruce Croft University of Massachusetts SIGIR 2008

Page 1: Discovering Key Concepts in Verbose Queries

Discovering Key Concepts in Verbose Queries

Michael Bendersky and W. Bruce Croft

University of Massachusetts

SIGIR 2008

Page 4: Discovering Key Concepts in Verbose Queries

Objective

• “Discovering Key Concepts in Verbose Queries”

• <num> Number 829

<title> Spanish Civil War support

<desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War

Page 6: Discovering Key Concepts in Verbose Queries

Objective

• “Discovering Key Concepts in Verbose Queries”

• Use of key concepts?

• Combine with current IR model

Page 7: Discovering Key Concepts in Verbose Queries

Retrieval Model

• Conventional Language Model:

$\mathrm{score}(q,d) = p(q \mid d) = \frac{p(q,d)}{p(d)}$

• New Model:

$\mathrm{score}(q,d) = p(q \mid d) = \frac{p(q,d)}{p(d)} = \frac{\sum_{c_i} p(q,d,c_i)}{p(d)}$
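The sum over concepts in the new model can be related to a product of two per-concept probabilities in a few steps. What follows is only a sketch of one such derivation: the conditional-independence and uniform-prior assumptions are introduced here for illustration and are not stated on the slides.

```latex
\begin{align*}
p(q \mid d) &= \frac{\sum_{c_i} p(q, d, c_i)}{p(d)}
  = \sum_{c_i} p(q \mid c_i, d)\, p(c_i \mid d)
  && \text{(chain rule)} \\
&\approx \sum_{c_i} p(q \mid c_i)\, p(c_i \mid d)
  && \text{(assume $q \perp d$ given $c_i$)} \\
&= \sum_{c_i} \frac{p(c_i \mid q)\, p(q)}{p(c_i)}\, p(c_i \mid d)
  && \text{(Bayes' rule)} \\
&\propto \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)
  && \text{(assume uniform $p(c_i)$; $p(q)$ is constant for a given query)}
\end{align*}
```

Under these assumptions, ranking documents by the last line is equivalent to ranking by the concept sum, and that quantity can then be mixed with the ordinary query likelihood $p(q \mid d)$.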

Page 9: Discovering Key Concepts in Verbose Queries

Final Retrieval Function

$\mathrm{score}(q,d) = \lambda\, p(q \mid d) + (1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$

• Language Model: $\lambda\, p(q \mid d)$

• Key Concepts: $(1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$
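The retrieval function is a weighted mix of a language-model term, p(q|d), and a key-concept term that sums p(ci|q) p(ci|d) over the concepts. A minimal Python sketch of that combination follows. The function name, the parameter name `lam` for the interpolation weight, and the toy probabilities are assumptions made for illustration; how p(q|d) and p(ci|d) are estimated is left outside the sketch.

```python
from typing import Mapping


def key_concept_score(
    p_q_given_d: float,
    p_c_given_q: Mapping[str, float],
    p_c_given_d: Mapping[str, float],
    lam: float,
) -> float:
    """Return lam * p(q|d) + (1 - lam) * sum_i p(c_i|q) * p(c_i|d).

    All probabilities are assumed to be precomputed by the caller.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Concepts missing from the document model contribute zero here;
    # a real system would presumably smooth p(c_i|d) instead.
    concept_term = sum(
        weight * p_c_given_d.get(concept, 0.0)
        for concept, weight in p_c_given_q.items()
    )
    return lam * p_q_given_d + (1.0 - lam) * concept_term


# Toy numbers, invented purely to exercise the function.
example = key_concept_score(
    p_q_given_d=0.02,
    p_c_given_q={"Spanish Civil War": 0.6, "material international support": 0.4},
    p_c_given_d={"Spanish Civil War": 0.05},
    lam=0.5,
)
```

Setting `lam` to 1 recovers the plain language-model score, and setting it to 0 ranks by key concepts alone.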

Page 14: Discovering Key Concepts in Verbose Queries

What is a Concept?

• Noun phrase in a query

• <num> Number 829

<title> Spanish Civil War support

<desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War
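To make "noun phrase in a query" concrete, here is a small Python sketch that pulls noun phrases out of the <desc> text above. It assumes the query is already part-of-speech tagged; the Penn-Treebank-style tags are hand-assigned for this illustration, and the simple "adjectives followed by nouns" pattern is a stand-in, not the chunker used in the presented work.

```python
from typing import List, Tuple


def noun_phrases(tagged: List[Tuple[str, str]]) -> List[str]:
    """Collect maximal runs matching JJ* NN+ (adjectives, then nouns)."""
    phrases: List[str] = []
    current: List[str] = []  # words of the candidate phrase being built
    has_noun = False         # True once the candidate contains a noun
    for word, tag in tagged:
        if tag.startswith("NN"):
            current.append(word)
            has_noun = True
        elif tag.startswith("JJ") and not has_noun:
            current.append(word)
        else:
            # Any other tag (or an adjective after a noun) ends the phrase.
            if has_noun:
                phrases.append(" ".join(current))
            current, has_noun = [], False
            if tag.startswith("JJ"):
                current.append(word)
    if has_noun:
        phrases.append(" ".join(current))
    return phrases


# Hand-tagged <desc> of topic 829 (tags are an assumption of this sketch).
TAGGED_DESC = [
    ("Provide", "VB"), ("information", "NN"), ("on", "IN"), ("all", "DT"),
    ("kinds", "NNS"), ("of", "IN"), ("material", "JJ"),
    ("international", "JJ"), ("support", "NN"), ("provided", "VBN"),
    ("to", "IN"), ("either", "DT"), ("side", "NN"), ("in", "IN"),
    ("the", "DT"), ("Spanish", "NNP"), ("Civil", "NNP"), ("War", "NNP"),
]

concepts = noun_phrases(TAGGED_DESC)
```

With these tags the candidate concepts are "information", "kinds", "material international support", "side", and "Spanish Civil War".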

Page 16: Discovering Key Concepts in Verbose Queries

Finding ‘Key’ Concepts

• Rank concepts by p(ci|q)

• Compute p(ci|q) by frequency?

• <num> Number 829

<title> Spanish Civil War support

<desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War

Page 17: Discovering Key Concepts in Verbose Queries

Finding ‘Key’ Concepts

• Approximate p(ci|q) by machine learning

• h(ci) is ci’s query-independent importance score

• $p(c_i \mid q) = \frac{h(c_i)}{\sum_{c_j \in q} h(c_j)}$

$c_i$ → AdaBoost.M1 → $h(c_i)$
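The normalization step, which turns query-independent scores h(ci) into a distribution p(ci|q) over the concepts of one query, can be sketched in Python as follows. The h values below are invented placeholders; in the presentation they come from an AdaBoost.M1 classifier, which this sketch does not implement.

```python
from typing import Dict, List, Tuple


def concept_distribution(h_scores: Dict[str, float]) -> Dict[str, float]:
    """Normalize importance scores h(c_i) so they sum to 1 within a query."""
    total = sum(h_scores.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {concept: h / total for concept, h in h_scores.items()}


def rank_concepts(h_scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (concept, p(c_i|q)) pairs, most important first."""
    dist = concept_distribution(h_scores)
    return sorted(dist.items(), key=lambda item: item[1], reverse=True)


# Placeholder scores, not classifier output.
toy_h = {"Spanish Civil War": 3.0, "information": 1.0}
ranking = rank_concepts(toy_h)
```

Because every score is divided by the same total, the normalization changes the scale of the scores but not the order of the concepts.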

Page 18: Discovering Key Concepts in Verbose Queries

Features of a Concept

• is_cap : is capitalized

• tf : in corpus

• idf : in corpus

• ridf : idf modified by Poisson model

• wig : weighted information gain; change in entropy from corpus to retrieved data

• g_tf : Google term frequency

• qp : number of times the concept appears as a part of a query in MSN Live

• qe : number of times the concept appears as exact query in MSN Live
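Two of these features, idf and ridf, follow directly from corpus counts. The Python sketch below uses the standard residual-idf definition: observed idf minus the idf a Poisson model would predict from the collection frequency. The base-2 logarithm and the argument names are assumptions of this sketch; the slides do not give the exact formula used.

```python
import math


def idf(df: int, n_docs: int) -> float:
    """Inverse document frequency: -log2(df / N)."""
    return -math.log2(df / n_docs)


def ridf(df: int, cf: int, n_docs: int) -> float:
    """Residual idf: observed idf minus the idf expected under a Poisson model.

    df: number of documents containing the concept
    cf: total number of occurrences of the concept in the corpus
    """
    # Under a Poisson model with rate cf / N, the probability that a
    # document contains the concept at least once is 1 - exp(-cf / N).
    expected_idf = -math.log2(1.0 - math.exp(-cf / n_docs))
    return idf(df, n_docs) - expected_idf


# Toy counts: the same 100 occurrences, packed into 10 documents versus
# spread over 95 documents of a 1000-document corpus.
bursty = ridf(df=10, cf=100, n_docs=1000)
spread = ridf(df=95, cf=100, n_docs=1000)
```

A concept whose occurrences cluster in a few documents gets a high ridf, while one spread roughly as the Poisson model predicts gets a value near zero.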

Page 19: Discovering Key Concepts in Verbose Queries

TREC Corpus

Page 20: Discovering Key Concepts in Verbose Queries

Exp 1: Identifying Key Concept

• Cross-validation on corpus

• Each fold has 50 queries

• Check whether the top concept is a key concept

• Assume 1 key concept per query during annotation
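The evaluation described here can be sketched in Python as two small helpers: one that splits the query ids into folds of 50, and one that checks whether the top-ranked concept matches the single annotated key concept. Function names, data layout, and the toy data are assumptions of this sketch; training the ranker on the remaining folds is omitted.

```python
from typing import Dict, List, Sequence


def make_folds(query_ids: Sequence[str], fold_size: int = 50) -> List[List[str]]:
    """Split query ids into consecutive folds of fold_size queries each."""
    return [
        list(query_ids[start:start + fold_size])
        for start in range(0, len(query_ids), fold_size)
    ]


def top1_accuracy(
    ranked_concepts: Dict[str, List[str]],
    key_concept: Dict[str, str],
) -> float:
    """Fraction of queries whose top-ranked concept is the annotated key concept."""
    if not ranked_concepts:
        raise ValueError("no queries to evaluate")
    hits = sum(
        1
        for qid, ranking in ranked_concepts.items()
        if ranking and ranking[0] == key_concept[qid]
    )
    return hits / len(ranked_concepts)


# Toy data, invented for illustration only.
folds = make_folds([f"q{i}" for i in range(150)])
accuracy = top1_accuracy(
    {"q1": ["Spanish Civil War", "side"], "q2": ["kinds", "information"]},
    {"q1": "Spanish Civil War", "q2": "information"},
)
```

In a cross-validation loop, each fold would take a turn as the test set for `top1_accuracy` while the remaining folds are used for training.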

Page 22: Discovering Key Concepts in Verbose Queries

Exp 1: Identifying Key Concept

• Better than idf ranking

Page 23: Discovering Key Concepts in Verbose Queries

Exp 2: Information Retrieval

$\mathrm{score}(q,d) = \lambda\, p(q \mid d) + (1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$

• Use only the top 2 concepts for each query

• q is the entire <desc> section

• $\lambda = 0.8$

Page 24: Discovering Key Concepts in Verbose Queries

Exp 2: Information Retrieval

• KeyConcept[2]<desc> : the authors’ method

• SeqDep<desc> : sequential dependence model; includes all bigrams in the query

Page 26: Discovering Key Concepts in Verbose Queries

What to take home?

• Singling out key concepts improves retrieval