Personalized Search. Cheng Cheng (cc2999), Department of Computer Science, Columbia University. A Large Scale Evaluation and Analysis of Personalized Search Strategies

Upload: agnes-kelly

Post on 13-Jan-2016


Page 1: Personalized Search Cheng Cheng (cc2999) Department of Computer Science Columbia University A Large Scale Evaluation and Analysis of Personalized Search

Personalized Search

Cheng Cheng (cc2999)
Department of Computer Science

Columbia University

A Large Scale Evaluation and Analysis of Personalized Search Strategies

Page 2:

Contents

1. Introduction

2. Evaluation Framework

3. Experiment Results

4. Conclusion

Page 3:

Introduction

What is Personalized Search?

"Personalized Search is the fine-tuning of search results and advertising based on an individual's preferences, information and other factors." [Steve Johnson]

Personalized Search Engines

Google (http://www.google.com)

Yahoo's Myweb (http://myweb.yahoo.com)

Page 4:

Personalization Strategies

Person-level Re-ranking Based on Historical Clicks

P-Click

Person-level Re-ranking Based on User Interests

L-Profile

S-Profile

LS-Profile

Group-level Re-ranking

G-Click

Page 5:

Evaluation Framework

Step 1: Download the top 50 search results from the MSN search engine for the test query. We denote the downloaded web pages by U and the rank list by R1.

Step 2: Compute a personalized score for each web page in U using a personalization strategy and generate a new rank list R2. (Five different strategies are given in the previous slide.)

Step 3: Combine the rank lists R1 and R2 using Borda's ranking fusion method and sort the pages by their combined rankings. The final rank list is the personalized search result list, denoted by R.

Step 4: Use the measurements in the next slide to evaluate the personalization performance on R.
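Step 3 can be illustrated with a minimal sketch of a Borda-style fusion of two rank lists. This is only one plain reading of a Borda count, not necessarily the exact variant used in the study: the function name borda_fuse, the point scheme (n points for first place down to 1 for last), and the tie-breaking rule (fall back to the order in R1) are all assumptions made for illustration.

```python
def borda_fuse(r1, r2):
    """Fuse two rank lists over the same pages with a simple Borda count.

    r1, r2: lists of page identifiers, best first (R1 and R2 in the slides).
    Returns the fused list R, best first.
    """
    n = len(r1)
    if len(r2) != n or set(r1) != set(r2):
        raise ValueError("both rank lists must contain the same pages")

    # Assumed point scheme: position 0 earns n points, the last position 1.
    points = {page: 0 for page in r1}
    for ranking in (r1, r2):
        for position, page in enumerate(ranking):
            points[page] += n - position

    # Assumed tie-break: keep the original search-engine order (R1).
    original_position = {page: i for i, page in enumerate(r1)}
    return sorted(r1, key=lambda p: (-points[p], original_position[p]))


if __name__ == "__main__":
    R1 = ["a", "b", "c", "d"]  # hypothetical original ranking
    R2 = ["c", "a", "d", "b"]  # hypothetical personalized ranking
    print(borda_fuse(R1, R2))
```

Pages ranked highly in both lists rise to the top of R, while a page favoured by only one list ends up in the middle.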

Page 6:

Evaluation Metrics

Ranking Scoring (evaluate the accuracy of personalized search)

The expected utility of a ranked list of web pages:

The final rank scoring (reflecting the utility of all test queries):

Average Rank (evaluate the quality of personalized search)

The average rank of a query s:

The final average rank on the test query set S:
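The formulas for these two metrics did not survive in the transcript, so the sketch below rests on assumptions rather than on the slide itself. Average rank is taken to be the mean rank of the clicked pages of a query, averaged over all test queries (lower is better). Rank scoring is taken to be a half-life utility in which a click at rank j contributes 1 / 2^((j - 1) / (alpha - 1)), normalized by the best achievable utility and scaled to 100 (higher is better). The function names, the input layout, and the default alpha = 5 are hypothetical.

```python
def average_rank(clicked_ranks_per_query):
    """Mean over queries of the mean 1-based rank of the clicked pages."""
    per_query = [sum(r) / len(r) for r in clicked_ranks_per_query if r]
    if not per_query:
        raise ValueError("need at least one query with clicks")
    return sum(per_query) / len(per_query)


def rank_scoring(clicked_ranks_per_query, alpha=5):
    """Half-life utility of clicked ranks, as a percentage of the maximum.

    alpha is the assumed half-life rank: a click at rank alpha is worth
    half as much as a click at rank 1.
    """
    def weight(j):
        return 0.5 ** ((j - 1) / (alpha - 1))

    achieved = 0.0
    best = 0.0
    for ranks in clicked_ranks_per_query:
        achieved += sum(weight(j) for j in ranks)
        # Best case: the clicked pages occupy ranks 1..len(ranks).
        best += sum(weight(j) for j in range(1, len(ranks) + 1))
    if best == 0:
        raise ValueError("need at least one click")
    return 100 * achieved / best


if __name__ == "__main__":
    clicks = [[1, 3], [2]]  # hypothetical clicked ranks for two test queries
    print(average_rank(clicks), rank_scoring(clicks))
```

Under these assumptions, a strategy that moves clicked pages toward the top lowers the average rank and raises the rank scoring.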

Page 7:

Experiment Data

Dataset (large scale)

Randomly sample 10,000 distinct users from the MSN query logs for 12 days in August 2006. These users and their click-through logs are extracted as our dataset.

Training Set and Testing Set

Training set: the log data of the first 11 days.

Testing set: the log data of the last day.
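The split by day can be sketched as follows. The record layout (a dict with a "day" field holding a date) and the function name split_last_day are hypothetical; the actual MSN log format is not described in the slides.

```python
from datetime import date


def split_last_day(records):
    """Put the most recent day into the test set, all earlier days into training."""
    if not records:
        raise ValueError("no log records")
    last_day = max(r["day"] for r in records)
    train = [r for r in records if r["day"] < last_day]
    test = [r for r in records if r["day"] == last_day]
    return train, test


if __name__ == "__main__":
    # Hypothetical log records; the dates are placeholders, not the real sample.
    logs = [
        {"user": "u1", "query": "q1", "day": date(2006, 8, 1)},
        {"user": "u1", "query": "q2", "day": date(2006, 8, 2)},
        {"user": "u2", "query": "q1", "day": date(2006, 8, 3)},
    ]
    train, test = split_last_day(logs)
    print(len(train), len(test))
```

Splitting by time rather than at random keeps the test clicks strictly later than the history used to build each user's model.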

Page 8:

Experiment Results (1)

Overall Performance of Strategies

1) Click-based personalization methods G-Click and P-Click perform better than the method WEB on the whole.

2) Profile-based methods L-Profile, S-Profile, and LS-Profile perform less well.

A not-optimal query is a query on which users select not only the top result returned by the MSN search engine.

Page 9:

Experiment Results (2)

Overall Performance of Strategies

3) Though the L-Profile, S-Profile, and LS-Profile methods improve the search accuracy on many queries, they harm the performance on even more queries, which makes them perform worse on average.

Page 10:

Click Entropy

Click entropy is a direct indication of query click variation.

ClickEntropy(q) is the click entropy of query q:

P(p|q) is the percentage of the clicks on web page p among all the clicks on q:

Smaller click entropy means that the majority of users agree with each other on a small number of web pages. In such a case, there is no need to do personalization.
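The click entropy formula itself is not preserved in the transcript. A minimal sketch, assuming the standard Shannon entropy over the click distribution, ClickEntropy(q) = sum over clicked pages p of -P(p|q) * log2 P(p|q), is given below. The base-2 logarithm, the function name click_entropy, and the input layout (one page identifier per click on the query) are assumptions.

```python
import math
from collections import Counter


def click_entropy(clicked_pages):
    """Entropy (in bits, by assumption) of the click distribution for one query.

    clicked_pages: one page identifier per click recorded for the query.
    """
    counts = Counter(clicked_pages)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("query has no clicks")
    # P(p|q) = clicks on p / all clicks on q; log2(1/P) equals -log2(P).
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    print(click_entropy(["home", "home", "home", "home"]))  # every user agrees
    print(click_entropy(["a", "b", "c", "d"]))              # clicks fully spread out
```

A query whose clicks all land on one page has entropy zero, while a query whose clicks are spread evenly over many pages has high entropy and is a better candidate for personalization.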

Page 11:

Experiment Results (3)

Performance on Different Click Entropies

1) The improvement in personalized search performance increases as the click entropy of the query becomes larger, especially when ClickEntropy ≥ 1.5.

2) All these results indicate that on queries with small click entropy (which means that these queries are less ambiguous), personalization yields little improvement and is thus unnecessary.

Page 12:

Analysis of Profile-based Strategies

Profile-based personalization strategies perform less well, which contradicts existing investigations. This is probably caused by the rough implementation of our strategies.

Method LS-Profile is more stable than methods L-Profile and S-Profile. In other words, both long-term and short-term search contexts are very important for personalizing search results.

Combining the two types of search context can make the prediction of the user's real information need more reliable.

Page 13:

Conclusion

All proposed methods improve significantly over common web search on queries with large click entropy.

Personalized search has different effectiveness on different queries, and thus not all queries should be handled in the same manner.

Click-based personalization strategies work well; they are straightforward and stable.

An appropriate combination of long-term and short-term profiles can be more reliable than using either of them alone.

Page 14:

Thank you!