Extending Faceted Search to the General Web

2014/11/25 (Tue.), Chang Wei-Yuan @ MakeLab Group Meeting
Paper by Weize Kong, James Allan, CIKM '14

Upload: chang-wei-yuan

Post on 15-Jul-2015


+Outline

- Introduction
- Method
  - Facet Generation
  - Facet Feedback
  - Evaluation
- Experiment
- Conclusion
- Thought


+ Introduction

- Faceted search helps users by offering drill-down options as a complement to the keyword input box.

+ Introduction

- However, this idea is not well explored for general web search.
  - heterogeneous nature

Example: a faceted search interface (labels translated from Chinese)

  baggage allowance   ← query
  All routes, Freight company, Baggage type   ← facet
  All routes, Domestic routes, International routes   ← facet term
  ↓ search result (document)

+ Introduction

- Goal:
  - query-dependent automatic facet generation
  - incorporating user feedback on these query facets into document ranking


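+Flow Chart

[Figure: flow chart of the pipeline. Labels recovered from the slide, in pipeline order: Query → Search Result → Top-ranked Documents → Extracting Candidates → Candidate Facets → Refining Candidates → Facets → Selected Terms → Facet Feedback → Search Result]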


+ Facet Generation

- Input: query and search results
- Step 1: Extracting candidates
- Step 2: Refining candidates
- Output: query facets

+ Facet Generation

- Step 1: Extracting candidates
  - apply both textual and HTML patterns to the top search results

+ Facet Generation

- Step 1: Extracting candidates
- Query: "mars landing"
- Search results
  - "Mars rovers such as Curiosity, Opportunity and Spirit"
- Candidate facets
  - C: { Curiosity, Opportunity, Spirit }
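
The textual pattern in this example can be sketched in a few lines of Python. This is only a minimal illustration, assuming a single "such as item, item and item" pattern. The function name `extract_such_as_list` and the regular expression are hypothetical; they are not the patterns used in the paper, which also applies HTML patterns.

```python
import re

# Hypothetical pattern: capture the text that follows "such as" up to the
# next sentence-level punctuation mark (or the end of the string).
SUCH_AS = re.compile(r"such as\s+([^.;:!?]+)", re.IGNORECASE)

def extract_such_as_list(sentence):
    """Return the coordinated items that follow 'such as', or [] if none."""
    match = SUCH_AS.search(sentence)
    if match is None:
        return []
    # Split on commas and on the conjunctions "and" / "or".
    parts = re.split(r",|\band\b|\bor\b", match.group(1))
    items = [p.strip() for p in parts if p.strip()]
    # A candidate facet needs at least two terms to count as a list.
    return items if len(items) >= 2 else []

snippet = "Mars rovers such as Curiosity, Opportunity and Spirit"
candidates = extract_such_as_list(snippet)
print(candidates)
```

Lists extracted this way are only candidates; they still have to go through the refinement step.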

+ Facet Generation

- Step 1: Extracting candidates
- The extracted candidate query facets can be
  - noisy
  - non-relevant to the issued query
  - made up of terms that are not members of the same class

+ Facet Generation

- Step 1: Extracting candidates
- Query: "mars landing"
- Candidate facets: [figure of the extracted candidate lists, not preserved in the transcript]
- → Refine

+ Facet Generation

- Step 2: Refining candidates
  - re-cluster the query facets or their facet terms into higher-quality query facets

+ Facet Generation

- Step 2: Refining candidates
- Topic modeling
  - pLSA, LDA
- Unsupervised clustering method
  - QDMiner (QDM)
- Supervised methods based on a graphical model
  - QF-I, QF-J
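
To make the idea of re-clustering concrete, here is a deliberately simple sketch. It is not pLSA, LDA, QDMiner, or QF-I/QF-J: it swaps in a greedy merge based on Jaccard overlap, and the function names, the threshold, and the sample candidate lists are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def merge_candidates(candidate_lists, threshold=0.3):
    """Greedily merge candidate lists whose term overlap reaches the threshold."""
    clusters = []
    for terms in candidate_lists:
        terms = set(terms)
        for cluster in clusters:
            if jaccard(cluster, terms) >= threshold:
                cluster |= terms  # grow the existing cluster in place
                break
        else:
            # No sufficiently similar cluster was found: start a new one.
            clusters.append(terms)
    return clusters

# Illustrative candidate lists (made up for this sketch).
candidates = [
    ["Curiosity", "Opportunity", "Spirit"],
    ["Curiosity", "Spirit", "Sojourner"],
    ["2007", "2011", "2012"],
]
facets = merge_candidates(candidates)
print(facets)
```

The two overlapping rover lists end up in one facet while the list of years stays separate. The methods named on the slide make this decision with topic models, clustering, or a learned graphical model instead of a fixed overlap threshold.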

+ Facet Generation

- Input: query and search results
- Step 1: Extracting candidates
- Step 2: Refining candidates
- Output: facets, each a set of terms
  - Year: { 2007, 2011, 2012 }
  - Lab: { NASA, Mars Science Lab, Curiosity Lab }


+ Facet Feedback

- Input: document, query, user selection
  - Document = one of the search results
- Boolean filtering model
- Soft ranking model
- Output: the score of each document

+ Facet Feedback

- Boolean filtering model
- Fu denotes the set of feedback facets that the user selected
- Condition B can be AND, OR, or A+O
- S(D, Q) is the score returned by the original retrieval model
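
The slide's equation did not survive in the transcript, so the following is only a sketch under stated assumptions: a document that satisfies condition B keeps its original score S(D, Q), and any other document is scored minus infinity. It also assumes that A+O means OR among the terms of one facet and AND across facets, and that "contains a term" is plain set membership. All function names are hypothetical.

```python
import math

def satisfies(doc_terms, feedback_facets, condition):
    """Check a document against the selected terms under AND, OR, or A+O."""
    per_facet = [[t in doc_terms for t in facet] for facet in feedback_facets]
    all_hits = [hit for facet in per_facet for hit in facet]
    if condition == "AND":   # every selected term must appear
        return all(all_hits)
    if condition == "OR":    # at least one selected term must appear
        return any(all_hits)
    if condition == "A+O":   # assumed: OR within a facet, AND across facets
        return all(any(facet) for facet in per_facet)
    raise ValueError(f"unknown condition: {condition}")

def boolean_filter_score(original_score, doc_terms, feedback_facets, condition):
    """Keep S(D, Q) for documents that pass the filter, else minus infinity."""
    if satisfies(doc_terms, feedback_facets, condition):
        return original_score
    return -math.inf

# Illustrative document and user selection (made up for this sketch).
doc = {"mars", "landing", "curiosity", "2012"}
selected = [["curiosity", "spirit"], ["2012"]]
for condition in ("AND", "OR", "A+O"):
    print(condition, boolean_filter_score(1.5, doc, selected, condition))
```

With this selection the document fails AND (it lacks "spirit") but passes OR and A+O, which shows how strict each condition is.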

+ Facet Feedback

- Soft ranking model
- λ is a parameter for adjusting the weight
- SE(D, Fu) is the expansion part, which captures the relevance between the document and the feedback facets
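
As with the Boolean model, the equation itself is missing from the transcript. The sketch below assumes the final score has the form S(D, Q) + λ · SE(D, Fu), and it uses a stand-in for SE(D, Fu): the fraction of selected terms that occur in the document. Both the combination and the stand-in are assumptions for illustration, not the paper's definitions.

```python
def expansion_score(doc_terms, feedback_facets):
    """Hypothetical SE(D, Fu): fraction of selected terms found in the document."""
    selected = [t for facet in feedback_facets for t in facet]
    if not selected:
        return 0.0
    return sum(t in doc_terms for t in selected) / len(selected)

def soft_ranking_score(original_score, doc_terms, feedback_facets, lam=0.5):
    """Assumed combination: S(D, Q) + lam * SE(D, Fu)."""
    return original_score + lam * expansion_score(doc_terms, feedback_facets)

# Illustrative data: two documents with their original scores S(D, Q).
selected = [["curiosity", "spirit"], ["2011", "2012"]]
docs = {
    "d1": (1.0, {"mars", "landing", "curiosity", "2012"}),
    "d2": (1.1, {"mars", "landing", "history"}),
}
reranked = sorted(
    docs,
    key=lambda d: soft_ranking_score(docs[d][0], docs[d][1], selected),
    reverse=True,
)
print(reranked)
```

Unlike Boolean filtering, no document is discarded: a document that matches none of the selected terms simply gets no boost.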

+ Facet Feedback

- Input: documents, query, user selection
- Boolean filtering model
- Soft ranking model
- Output: the score of each document

+ Evaluation

- Intrinsic evaluation
  - Ground truth: query facets are constructed by human annotators
    - annotators are asked to group or re-group terms in the pool into preferred query facets
    - the pool is built from the facets generated by the different systems
  - the ground truth is compared with the facets generated by the different systems
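
The slides do not say which measures are used for this comparison. As a rough illustration only, the sketch below computes term-level precision, recall, and F1 between a system's facets and the annotators' facets. It ignores how well the terms are grouped, and the function name and sample data are assumptions.

```python
def term_prf(system_facets, truth_facets):
    """Term-level precision, recall, and F1, ignoring how terms are grouped."""
    system_terms = {t for facet in system_facets for t in facet}
    truth_terms = {t for facet in truth_facets for t in facet}
    correct = len(system_terms & truth_terms)
    precision = correct / len(system_terms) if system_terms else 0.0
    recall = correct / len(truth_terms) if truth_terms else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Illustrative facets (made up for this sketch).
system = [["Curiosity", "Spirit"], ["2007", "2013"]]
truth = [["Curiosity", "Opportunity", "Spirit"], ["2007", "2011", "2012"]]
precision, recall, f1 = term_prf(system, truth)
print(precision, recall, f1)
```

A fuller intrinsic measure would also score the clustering itself, that is, whether terms the annotators put in the same facet also end up together in the system output.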

+ Evaluation

- Extrinsic evaluation
  - User model
    - The user model describes how a user selects feedback terms from facets; based on it, we can estimate the time cost for the user.
    - The time cost has two parts: time for scanning facets and time for selecting terms.
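
The time-cost equation is not preserved in the transcript, only its two annotated parts. A minimal sketch of such an estimate, assuming the user scans every term in the presented facets and then pays a fixed cost per selected term, could look like this. The per-term costs `scan_time` and `select_time` are hypothetical parameters, not values from the paper.

```python
def feedback_time_cost(facets, selected_terms, scan_time=1.0, select_time=2.0):
    """Estimate user time as scanning cost plus selection cost.

    Assumes every term in every presented facet is scanned once, and that
    only terms that were actually presented can be selected.
    """
    scanned = sum(len(facet) for facet in facets)
    presented = {t for facet in facets for t in facet}
    chosen = [t for t in selected_terms if t in presented]
    return scan_time * scanned + select_time * len(chosen)

# Illustrative facets and selection (made up for this sketch).
facets = [["Curiosity", "Opportunity", "Spirit"], ["2007", "2011", "2012"]]
cost = feedback_time_cost(facets, ["Curiosity", "2012"])
print(cost)
```

An estimate of this kind lets retrieval gains be reported against the user effort spent on feedback rather than against the number of selected terms alone.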

+ Evaluation

- Extrinsic evaluation
  - Oracle feedback and annotator feedback
    - The oracle feedback model selects only effective terms as feedback.
    - For annotator feedback, the annotator is asked to select all the terms from the facets that would help address the information need.


+Experiment Settings

- Dataset
  - For the document corpus, we use the ClueWeb09 Category-B collection.
  - 196 queries and 678 query subtopics
- Facet generation models
  - pLSA, LDA, QDM, QF-I, and QF-J
- Facet feedback models
  - Boolean filtering models, soft ranking models
- Baseline retrieval model
  - SDM, with MAP (mean average precision) = 0.185
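
For readers unfamiliar with the baseline metric, mean average precision can be computed as in the sketch below. This is the standard textbook definition with binary relevance; the function names and the toy rankings are illustrative and have nothing to do with the ClueWeb09 runs.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Average of precision@k over the ranks k where a relevant document appears."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_doc_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy rankings for two queries (made up for this sketch).
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
    (["d5", "d6"], {"d6"}),
]
print(mean_average_precision(runs))
```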

+Facet Generation Models

[Figures not preserved in the transcript: results for the facet generation models, one based on annotator feedback and the SF feedback model, one based on oracle feedback and the SF feedback model.]

Our experiments testify to the potential of Faceted Web Search.

+Facet Feedback Models

[Figures not preserved in the transcript.]

Our experiments show that the feedback models are effective.


+Conclusion

- This paper proposed Faceted Web Search,
  - an extension of faceted search to the general Web
- query-dependent automatic facet generation
- incorporating user feedback on these query facets into document ranking


+Thanks for listening.

2014/11/25 (Tue.) @ MakeLab Group Meeting
[email protected]