+Extending Faceted Search to the General Web

2014/11/25 (Tue.), Chang Wei-Yuan @ MakeLab Group Meeting

Weize Kong, James Allan, CIKM '14

+Outline

- Introduction
- Method
  - Facet Generation
  - Facet Feedback
  - Evaluation
- Experiment
- Conclusion
- Thought


+ Introduction

- Faceted search helps users by offering drill-down options as a complement to the keyword input box.

+ Introduction

- However, this idea is not well explored for general web search.
  - The general web is heterogeneous in nature.


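+ Introduction

[Figure: a faceted search example for the query "baggage allowance". Labels shown: "All routes", "Domestic routes", "International routes", "Cargo company", "Baggage type". Annotations mark the query, a facet, a facet term, and the search results (documents).]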

+ Introduction

- Goal:
  - query-dependent automatic facet generation
  - incorporating user feedback on these query facets into document ranking


+Flow Chart

[Flow chart. Labels: Query, Search Result, Top-ranked Documents, Extracting Candidates, Candidate Facets, Refining Candidates, Facets, Facet Feedback, Selected Terms, Search Result.]


+Facet Generation

- Input: query and search results
- Step 1: Extracting candidates
- Step 2: Refining candidates
- Output: query facets

+Facet Generation

- Step 1: Extracting candidates
  - Both textual and HTML patterns are applied to the top search results.

+Facet Generation

- Query: "mars landing"
- Search results
  - "Mars rovers such as Curiosity, Opportunity and Spirit"
- Candidate facets
  - C: { Curiosity, Opportunity, Spirit }
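The example above can be sketched with a single textual pattern. This is only an illustration: the "such as" regular expression and the function name `extract_candidates` are assumptions, not the patterns used in the paper, and HTML patterns are not covered.

```python
import re

# Hypothetical textual pattern: "<class noun> such as A, B and C".
SUCH_AS = re.compile(r"such as ([^.;:!?]+)", re.IGNORECASE)

def extract_candidates(snippet):
    """Return one candidate facet (a list of terms) per 'such as' match."""
    candidates = []
    for match in SUCH_AS.finditer(snippet):
        # Split the enumerated items on commas and on "and"/"or".
        items = re.split(r",|\band\b|\bor\b", match.group(1))
        terms = [item.strip() for item in items if item.strip()]
        if len(terms) > 1:  # a single item is not a useful list
            candidates.append(terms)
    return candidates

snippet = "Mars rovers such as Curiosity, Opportunity and Spirit"
print(extract_candidates(snippet))
# [['Curiosity', 'Opportunity', 'Spirit']]
```

A real extractor would need many more patterns and would run over the text of each top-ranked document.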

+Facet Generation

- The extracted candidate query facets can be:
  - noisy
  - not relevant to the issued query
  - made up of terms that are not members of the same class

+Facet Generation

- Candidate facets:

+Facet Generation

- Refine

+Facet Generation

- Step 2: Refining candidates
  - Re-cluster the candidate query facets, or their facet terms, into higher-quality query facets.
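To make the re-clustering idea concrete, here is a deliberately simple stand-in: candidate lists that share enough terms are greedily merged by Jaccard similarity. It is not any of the models compared in the talk, and the threshold value and the example candidates (including "Sojourner") are illustrative assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def refine(candidates, threshold=0.3):
    """Greedily merge candidate lists whose term overlap meets the threshold."""
    facets = []
    for cand in candidates:
        terms = set(cand)
        for facet in facets:
            if jaccard(facet, terms) >= threshold:
                facet |= terms  # merge into the existing facet
                break
        else:
            facets.append(terms)  # no similar facet found: start a new one
    return facets

candidates = [
    ["Curiosity", "Opportunity", "Spirit"],
    ["Curiosity", "Spirit", "Sojourner"],
    ["2007", "2011", "2012"],
]
for facet in refine(candidates):
    print(sorted(facet))
```

Here the two rover lists merge into one facet and the list of years stays separate. The greedy pass depends on input order, which is one reason stronger clustering or supervised models are used in practice.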

+Facet Generation

- Topic modeling
  - pLSA, LDA
- Unsupervised clustering method
  - QDMiner (QDM)
- Supervised methods based on a graphical model
  - QF-I, QF-J

+Facet Generation

- Input: query and search results
- Output: facets, each a set of terms
  - Year: { 2007, 2011, 2012 }
  - Lab: { NASA, Mars Science Lab, Curiosity Lab }


+Facet Feedback

- Input: document, query, user selection
  - Document = one of the search results
- Boolean filtering model
- Soft ranking model
- Output: the score of each document

+Facet Feedback

- Fu denotes the set of feedback facets that the user selected.
- The condition B can be AND, OR, or A+O.
- S(D, Q) is the score returned by the original retrieval model.
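The formula for this slide was not preserved, so the following is a sketch under stated assumptions rather than the paper's definition: a document that satisfies condition B keeps its original score S(D, Q), and any other document is filtered out with a score of negative infinity. A+O is assumed to mean OR within a facet and AND across facets. The function names and the bag-of-terms document representation are hypothetical.

```python
import math

def satisfies(doc_terms, feedback, condition):
    """Check whether a document meets the Boolean condition B.

    feedback maps each selected facet to the set of terms picked in it.
    """
    per_facet = [terms for terms in feedback.values() if terms]
    all_terms = set().union(*per_facet)
    if condition == "AND":   # every selected term must appear
        return all_terms <= doc_terms
    if condition == "OR":    # at least one selected term must appear
        return bool(all_terms & doc_terms)
    if condition == "A+O":   # assumed: OR within a facet, AND across facets
        return all(terms & doc_terms for terms in per_facet)
    raise ValueError(f"unknown condition: {condition}")

def boolean_filter_score(original_score, doc_terms, feedback, condition):
    """Keep S(D, Q) for documents that satisfy B; filter out the rest."""
    if satisfies(doc_terms, feedback, condition):
        return original_score
    return -math.inf

doc = {"curiosity", "nasa", "2012"}
feedback = {"Rover": {"curiosity", "spirit"}, "Year": {"2012"}}
for condition in ("AND", "OR", "A+O"):
    print(condition, boolean_filter_score(0.4, doc, feedback, condition))
```

In this example AND rejects the document because "spirit" is missing, while OR and A+O keep it.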

+Facet Feedback

- λ is a parameter for adjusting the weight.
- SE(D, Fu) is the expansion part, which captures the relevance between the document and the feedback facets.
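The slide's equation is missing, so this sketch assumes the simplest combination consistent with the two definitions above: the original score plus λ times the expansion part. The stand-in for SE(D, Fu), the fraction of selected terms found in the document, is purely illustrative and is not the paper's expansion model.

```python
def expansion_score(doc_terms, feedback_terms):
    """Stand-in for SE(D, Fu): fraction of selected terms found in the document."""
    if not feedback_terms:
        return 0.0
    return len(feedback_terms & doc_terms) / len(feedback_terms)

def soft_ranking_score(original_score, doc_terms, feedback_terms, lam=0.5):
    """Assumed combination: S(D, Q) + lam * SE(D, Fu)."""
    return original_score + lam * expansion_score(doc_terms, feedback_terms)

# Each document: (original score S(D, Q), set of terms in the document).
docs = {
    "d1": (0.40, {"mars", "landing", "curiosity", "2012"}),
    "d2": (0.45, {"mars", "landing", "history"}),
}
selected = {"curiosity", "2012"}
ranked = sorted(
    docs,
    key=lambda d: soft_ranking_score(docs[d][0], docs[d][1], selected),
    reverse=True,
)
print(ranked)  # ['d1', 'd2']
```

Unlike Boolean filtering, a document that matches none of the selected terms stays in the ranking; it simply receives no boost.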

+Facet Feedback

- Input: documents, query, user selection
- Output: the score of each document

+Evaluation

- Intrinsic evaluation
  - Ground truth: query facets are constructed by human annotators.
    - Facets generated by the different systems are pooled.
    - Annotators are asked to group or re-group terms in the pool into preferred query facets.
  - The ground truth is compared with the facets generated by the different systems.
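The slides do not say which measures are used for the comparison. As one simple possibility, assumed here only for illustration, term-level precision, recall and F1 can be computed between a system's facets and the annotators' facets. The function name `term_prf` and the example data are hypothetical.

```python
def term_prf(system_facets, truth_facets):
    """Term-level precision, recall and F1 of system facets against ground truth."""
    system_terms = set().union(*system_facets)
    truth_terms = set().union(*truth_facets)
    hits = len(system_terms & truth_terms)
    precision = hits / len(system_terms) if system_terms else 0.0
    recall = hits / len(truth_terms) if truth_terms else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

system = [{"Curiosity", "Opportunity", "NASA"}, {"2007", "2011"}]
truth = [{"Curiosity", "Opportunity", "Spirit"}, {"2007", "2011", "2012"}]
precision, recall, f1 = term_prf(system, truth)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

This measure only checks which terms were found. It ignores how terms are grouped, so judging the grouping itself would need an additional clustering measure.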

+Evaluation

- Extrinsic evaluation
  - User model
    - The user model describes how a user selects feedback terms from facets; from it, the time cost for the user can be estimated.

[Equation not preserved. Its two annotated parts are the time for scanning facets and the time for selecting terms.]
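A rough sketch of such a time estimate follows, assuming the total cost is the sum of a scanning part and a selecting part. The per-term cost values, the assumption that every displayed term is scanned, and the function name `feedback_time` are all hypothetical.

```python
def feedback_time(facets, selected, scan_cost=1.0, select_cost=2.0):
    """Estimate the user's time cost for giving facet feedback.

    facets:   list of facets, each a list of terms shown to the user
    selected: set of terms the user picks
    The per-term costs are placeholder values, not measured ones.
    """
    # Assumed: the user scans every term of every displayed facet.
    scanned = sum(len(facet) for facet in facets)
    picked = sum(1 for facet in facets for term in facet if term in selected)
    return scan_cost * scanned + select_cost * picked

facets = [["Curiosity", "Opportunity", "Spirit"], ["2007", "2011", "2012"]]
print(feedback_time(facets, {"Curiosity", "2012"}))  # 10.0
```

With an estimate like this, retrieval gains can be reported against the time a user would have to spend giving feedback.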

+Evaluation

- Extrinsic evaluation
  - Oracle feedback and annotator feedback
    - The oracle feedback model selects only effective terms as feedback.
    - The annotator is asked to select all the terms from the facets that would help address the information need.


+Experiment Settings

- Dataset
  - The document corpus is the ClueWeb09 Category-B collection.
  - 196 queries and 678 query subtopics
- Facet generation models
  - pLSA, LDA, QDM, QF-I, and QF-J
- Facet feedback models
  - Boolean filtering models, soft ranking models
- Baseline retrieval model
  - SDM, with MAP (mean average precision) = 0.185
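For reference, MAP averages the average precision (AP) of each query's ranked list. A minimal sketch with made-up document ids follows; it is not the evaluation script used in the experiments.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant documents."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant_ids)

def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d4", "d5"], {"d5"}),              # AP = (1/2) / 1
]
print(round(mean_average_precision(runs), 4))  # 0.6667
```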

+Facet Generation Models

[Result figures not preserved. Captions: results based on annotator feedback and the SF feedback model; results based on oracle feedback and the SF feedback model.]

Our experiments testify to the potential of Faceted Web Search.

+Facet Feedback Models

Our experiments show that the feedback models are effective.


+Conclusion

- This paper proposed Faceted Web Search,
  - an extension of faceted search to the general Web
- Query-dependent automatic facet generation
- Incorporating feedback on these query facets into document ranking


+Thanks for listening.

2014/11/25 (Tue.) @ MakeLab Group Meeting
[email protected]
