Query Logs – Used everywhere and for everything
Sai Vallurupalli

Upload: laura-farmer

Post on 12-Jan-2016

What are query logs useful for?

In Social Sciences, Medical & Health, Advertising & Marketing, Law Enforcement, etc.

• Understanding Search Behavior – Trends and Hot Trends
    • average length of search terms
    • most frequently searched terms
    • percentage of repeat queries
    • query term frequency distributions
    • number of users using the advanced features
    • number of queries a user entered before being satisfied with the results or giving up
    • average number of result pages and links examined
• Understanding and Categorizing Queries & Users
    • Informational, Navigational, Transactional, Connectivity
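Several of the search-behavior statistics listed above can be computed directly from the raw query strings. The following is a minimal sketch, assuming a log reduced to (user_id, query) pairs; the sample data, the `summarize` name, and the definition of a "repeat" (an exact string match with an earlier query) are illustrative assumptions, not taken from the slides.

```python
from collections import Counter

# Hypothetical sample log: (user_id, query) pairs; a real log carries more fields.
SAMPLE_LOG = [
    ("u1", "cheap flights"),
    ("u1", "cheap flights to boston"),
    ("u2", "weather"),
    ("u2", "weather"),
    ("u3", "python tutorial"),
]

def summarize(log):
    """Return a few aggregate statistics for a non-empty list of (user_id, query) pairs."""
    queries = [q for _, q in log]
    # Term frequency distribution over whitespace-separated, lowercased terms.
    term_counts = Counter(t for q in queries for t in q.lower().split())
    query_counts = Counter(queries)
    # Assumption: a query counts as a "repeat" if the same string occurred earlier.
    repeats = sum(c - 1 for c in query_counts.values())
    return {
        "avg_terms_per_query": sum(len(q.split()) for q in queries) / len(queries),
        "top_terms": term_counts.most_common(3),
        "repeat_query_pct": 100.0 * repeats / len(queries),
    }

stats = summarize(SAMPLE_LOG)
```

A production analysis would likely normalize queries (case, punctuation, spelling) before counting repeats; this sketch compares raw strings only.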

What are query logs useful for (contd.): For improving applications that produce these logs

• Improving Document Scoring
    • Scoring based on usage statistics, i.e., number of users, type of users, nature of the visit, etc.
    • Scoring based on usage patterns. The score increases if
        • more users select the document
        • more time is spent on the document, or the rate of time spent increases
        • the number of search terms that lead to the document increases, or the rate of that increase rises
        • the document moves up in rank position, or the rate of position movement increases
• Improving Performance through Query Caching
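The usage-pattern idea above (a document's score rises with both the level of usage and its growth) can be illustrated with a small sketch. The slides do not give a formula, so everything here is an assumption: the `UsageStats` fields, the weights, and the additive blend with a base retrieval score are hypothetical choices made only to show the shape of such a scheme.

```python
from dataclasses import dataclass

@dataclass
class UsageStats:
    """Per-document usage counts for one time window (all fields hypothetical)."""
    selections: int        # number of users who clicked the document
    dwell_seconds: float   # total time spent on the document
    distinct_queries: int  # number of distinct queries that led to it

# Illustrative weights only; the slides do not specify how signals are combined.
WEIGHTS = {"selections": 1.0, "dwell_seconds": 0.01, "distinct_queries": 0.5}

def usage_boost(prev: UsageStats, curr: UsageStats) -> float:
    """Score the current window, adding a bonus for growth since the previous one."""
    boost = 0.0
    for name, weight in WEIGHTS.items():
        now, before = getattr(curr, name), getattr(prev, name)
        boost += weight * now                   # level of usage
        boost += weight * max(now - before, 0)  # growth in usage, never negative
    return boost

def rescore(base_score: float, prev: UsageStats, curr: UsageStats) -> float:
    """Combine a retrieval score with the usage boost (simple additive blend)."""
    return base_score + usage_boost(prev, curr)
```

For example, `rescore(1.0, UsageStats(10, 100.0, 4), UsageStats(15, 200.0, 4))` rewards the document both for its current usage and for the increase in selections and dwell time between the two windows.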

What is logged?

• User identifier or session identifier
• IP address identifying the device
• Query terms
• Query timestamp, additional timestamps for clicked results
• List of URLs or results, ranks, whether they were clicked on
• Click-through data
• Relevance feedback links
• Page dwell time
• Search exit
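One way to picture the logged fields listed above is as a record type. This is a sketch only: the class and field names (`QueryLogRecord`, `ResultEntry`, and so on) are hypothetical, and real search engines use their own schemas.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class ResultEntry:
    url: str
    rank: int                              # 1-based position on the results page
    clicked_at: Optional[datetime] = None  # None if the result was never clicked
    dwell_seconds: Optional[float] = None  # time on the page, if measured

@dataclass
class QueryLogRecord:
    session_id: str        # user or session identifier
    ip_address: str        # identifies the device
    query: str             # query terms as typed
    issued_at: datetime    # query timestamp
    results: List[ResultEntry] = field(default_factory=list)

    def clicked_results(self) -> List[ResultEntry]:
        """Click-through data: the results the user actually opened."""
        return [r for r in self.results if r.clicked_at is not None]

# Made-up example record; the IP is from a documentation-only address range.
record = QueryLogRecord(
    session_id="s-001",
    ip_address="192.0.2.10",
    query="query log analysis",
    issued_at=datetime(2016, 1, 12, 9, 30, 0),
    results=[
        ResultEntry("https://example.com/a", rank=1),
        ResultEntry("https://example.com/b", rank=2,
                    clicked_at=datetime(2016, 1, 12, 9, 30, 8), dwell_seconds=42.0),
    ],
)
```

Here `record.clicked_results()` returns only the rank-2 entry, which is the kind of click-through signal the later slides build on.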

Improving Information Retrieval

• Determining Query Intent
    • Select a set of adjacent queries for a single need by a single user
    • Understand user query modification & reformulation
    • Determine equivalent descriptions for an information need
    • Identify and account for misspelled terms
• Query Recommendation
    • Query expansion
    • Relevance feedback
• Query Suggestion
• Query Caching
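Selecting the adjacent queries that belong to a single need is often approximated by splitting each user's queries wherever the time gap grows too large. The sketch below assumes that approach; the 30-minute default, the `split_sessions` name, and the (user_id, timestamp, query) tuple layout are assumptions for illustration, and a time gap alone does not guarantee that the grouped queries share one intent.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Event = Tuple[str, datetime, str]  # (user_id, timestamp, query)

def split_sessions(events: List[Event],
                   max_gap: timedelta = timedelta(minutes=30)) -> List[List[Event]]:
    """Group each user's queries into runs whose consecutive queries are close in time."""
    sessions: List[List[Event]] = []
    # user_id -> (timestamp of that user's last query, that user's open session)
    last_seen: Dict[str, Tuple[datetime, List[Event]]] = {}
    for event in sorted(events, key=lambda e: e[1]):
        user, ts, _ = event
        previous = last_seen.get(user)
        if previous is not None and ts - previous[0] <= max_gap:
            previous[1].append(event)          # continue the open session
            last_seen[user] = (ts, previous[1])
        else:
            session = [event]                  # start a new session
            sessions.append(session)
            last_seen[user] = (ts, session)
    return sessions
```

Within each resulting group, consecutive queries such as "jaguar" followed by "jaguar car price" are candidates for studying reformulation, equivalent descriptions, and misspelling corrections.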

Privacy Concerns & Current Research

• Contain Sensitive Information
• Can be Mined for User Information
• Anonymizing
    • Privacy/utility tradeoff
• Can be used to Determine User Intent
    • Topical obfuscation with dummy query injection
    • By not logging unique queries
    • Substituting the user query with a group of queries that produce the same results
• Length of query logs
    • Do not keep logs for more than a certain period
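Three of the measures above (replacing identifiers, not keeping unique queries, and limiting how long logs are kept) can be combined in one filtering pass. This is a rough sketch under stated assumptions: the `sanitize` and `pseudonymize` names, the 180-day retention period, and the two-user threshold are hypothetical. Note that a keyed hash only pseudonymizes the identifier; the query text itself can still reveal who a user is.

```python
import hashlib
import hmac
from collections import defaultdict
from datetime import datetime, timedelta

def pseudonymize(user_id: str, secret: bytes) -> str:
    """Replace a user identifier with a keyed hash (a pseudonym, not true anonymity)."""
    return hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

def sanitize(log, secret, now, min_users=2, max_age=timedelta(days=180)):
    """log: iterable of (user_id, timestamp, query). Returns a filtered, pseudonymized list."""
    # Retention limit: keep only entries newer than max_age.
    recent = [e for e in log if now - e[1] <= max_age]
    # Count how many distinct users issued each query within the retained window.
    users_per_query = defaultdict(set)
    for user, _, query in recent:
        users_per_query[query].add(user)
    # Drop queries issued by fewer than min_users distinct users (rare or unique queries).
    return [
        (pseudonymize(user, secret), ts, query)
        for user, ts, query in recent
        if len(users_per_query[query]) >= min_users
    ]
```

Raising `min_users` or shortening `max_age` protects users more but removes more of the rare queries that researchers often find most informative, which is the privacy/utility tradeoff named in the slide.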

Future Work

Studying search patterns/behaviors in
• mobile environments
• question answering, longer queries
• data-driven search: chained queries and intent revision
• more privacy protection techniques

References

• Analysis of a Very Large Web Search Engine Query Log, Craig Silverstein, Monika Henzinger, Hannes Marais, Michael Moricz.
• Users’ Interactions with the Excite Web Search Engine – A Query Reformulation and Relevance Feedback Analysis, Amanda Spink, Carol Chang, Agnes Goz.
• Learning about the World through Long-Term Query Logs, Matthew Richardson.
• User 4XXXXX9: Anonymizing Query Logs, Eytan Adar.
• “I Know What You Did Last Summer” – Query Logs and User Privacy, Rosie Jones, Ravi Kumar, Bo Pang, Andrew Tomkins.
• Query Logs Alone are not Enough, Carrie Grimes, Diane Tang, Daniel Russell.
• Providing Privacy through Plausibly Deniable Search, Mummoorthy Murugesan, Chris Clifton.
• Web Search Log Analysis – Programmers rarely refine queries, but are good at it, Joel Brandt, Philip J Guo, Joel Lewenstein, Mira Dontcheva, Scott Klemmer.
• Clustering Query Refinements by User Intent.
• Analysis of Long Queries in a Large Scale Search Log, Michael Bendersky, Bruce Croft.
• Search Trends: Are Compound Queries the Start of the Shift to Data Driven Search?
• Google patents for document scoring, employing usage statistics in document retrieval, methods for determining equivalent descriptions for an information need, and extracting user intent from query logs.