Improving relevance with log information
TRANSCRIPT
Sources of ranking information
Document text
Term-frequency-based weights
Vector space models, cosine similarity
BM25, BM25F
Based purely on the document and the query
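As a concrete illustration of term-frequency-based weighting, here is a minimal sketch of the standard BM25 formula for a single query term. The parameter defaults (k1=1.2, b=0.75) are common choices, not values from the talk.

```python
import math

def bm25_score(tf, df, doc_len, avg_doc_len, n_docs, k1=1.2, b=0.75):
    """Score one query term against one document with BM25.

    tf: term frequency in the document
    df: number of documents containing the term
    doc_len / avg_doc_len: document length and corpus average
    n_docs: total number of documents in the corpus
    """
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm

# A document's score for a whole query is the sum of bm25_score over
# the query's terms; BM25F extends this with per-field weighting.
```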
Link analysis
Google – PageRank
Citation analysis
Historical behaviour
Which results were picked for this query before
How well did similar results perform in the past?
How to analyse historical behaviour
Finding past behaviour
Keep logs of searches, together with their results.
Keep click-through information.
Keep track of eventual outcomes (sales, ad views, content downloads).
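One way to structure such logs (field names here are illustrative, not from the talk): give each search an id, and have click and outcome records reference it so the three streams can be joined later.

```python
import json
import time
import uuid

def log_search(query, result_ids):
    """Record a search with the results it returned; carries a fresh search id."""
    return {"event": "search", "search_id": str(uuid.uuid4()),
            "ts": time.time(), "query": query, "results": result_ids}

def log_click(search_id, doc_id, position):
    """Record a click-through, tied back to the originating search."""
    return {"event": "click", "search_id": search_id,
            "ts": time.time(), "doc": doc_id, "position": position}

def log_outcome(search_id, doc_id, outcome):
    """Record an eventual outcome (sale, ad view, download, ...)."""
    return {"event": "outcome", "search_id": search_id,
            "ts": time.time(), "doc": doc_id, "outcome": outcome}

# Each record would typically be appended to a log as one JSON line:
#   logfile.write(json.dumps(entry) + "\n")
```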
Hadoop
Distributed data processing
Map-combine-reduce
Very good for log analysis!
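The map/reduce pattern can be sketched in plain Python, without Hadoop, for a typical log-analysis job: counting clicks per (query, document) pair. The single-machine "shuffle" here is just a sort plus groupby; the tab-separated log format is an assumption for illustration.

```python
from itertools import groupby
from operator import itemgetter

def mapper(log_line):
    """Emit ((query, doc), 1) for each click record."""
    event, query, doc = log_line.split("\t")
    if event == "click":
        yield (query, doc), 1

def reducer(key, values):
    """Sum the click counts for one (query, doc) pair."""
    yield key, sum(values)

def run(lines):
    """Simulate map -> shuffle/sort -> reduce on one machine."""
    mapped = [kv for line in lines for kv in mapper(line)]
    mapped.sort(key=itemgetter(0))          # the "shuffle" phase
    out = {}
    for key, group in groupby(mapped, key=itemgetter(0)):
        for k, v in reducer(key, (v for _, v in group)):
            out[k] = v
    return out
```

On a real cluster the mapper and reducer run in parallel over log shards, which is what makes this pattern so well suited to large search logs.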
Dumbo
Python interface for writing Hadoop jobs.
Very simple to use.
Very poor documentation, sadly.
Some performance penalty for using Python, but very good for ad-hoc jobs and rapid development.
Past results
Easy to track results which were picked, but:
New results have never been picked
New queries have never had results picked
Need massive query volume to get anywhere
Past behaviour
Use the history better by building models
Represent documents in terms of features.
Use history to produce a score for each result.
Use machine learning to build a model to predict the score for a set of features.
Use model to produce scores for ranking.
Features
BM25 scores for each field
Review scores
Categories
Prices
Price within a category (computed with Dumbo)
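A sketch of how these features might come together into one vector per (query, document) pair. The field names and structure are hypothetical; the per-category mean price is assumed to be precomputed, e.g. by a Hadoop/Dumbo job over the catalogue.

```python
def features(doc, query_bm25, category_mean_price):
    """Build a feature vector for one (query, document) pair.

    doc: document attributes (review score, price, ...)
    query_bm25: per-field BM25 scores for this query against the doc
    category_mean_price: mean price in the doc's category (precomputed)
    """
    return [
        query_bm25["title"],                  # BM25 score, title field
        query_bm25["body"],                   # BM25 score, body field
        doc["review_score"],
        doc["price"],
        doc["price"] / category_mean_price,   # price relative to category
    ]
```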
Scores
Account for position bias
Model click-throughs for each position
“An Experimental Comparison of Click Position-Bias Models” - Craswell et al.
Account for old data being less relevant
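One illustrative way to combine both corrections, assuming a simple examination-style model in the spirit of Craswell et al.: divide observed clicks by the baseline click-through rate of the position they were shown at, and exponentially down-weight old events. The half-life value and the structure of the score are assumptions, not the talk's exact method.

```python
def score_clicks(events, position_ctr, now, half_life_days=30.0):
    """Position- and age-corrected click score for one result.

    events: (timestamp, position, clicked) tuples for this result
    position_ctr: baseline click-through rate per position, estimated
    from the logs (captures position bias: top slots attract clicks)
    """
    num = den = 0.0
    for ts, pos, clicked in events:
        age_days = (now - ts) / 86400.0
        w = 0.5 ** (age_days / half_life_days)   # old data counts less
        num += w * (1.0 if clicked else 0.0)
        den += w * position_ctr[pos]             # expected clicks at this slot
    return num / den if den else 0.0
```

A score above 1.0 means the result was clicked more often than its display positions would predict.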
Building a model
Logistic regression
LIBLINEAR / LIBSVM
Apache Mahout
Neural nets
libfann
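To make the modelling step concrete, here is a minimal logistic regression trained by plain gradient descent on log loss. In practice you would reach for LIBLINEAR or Mahout as the slides suggest; this self-contained version just shows the idea.

```python
import math

def train_logistic(data, epochs=200, lr=0.1):
    """Fit w, b for P(relevant | x) = sigmoid(w.x + b) by gradient descent.

    data: list of (feature_vector, label) pairs with label in {0, 1}
    """
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in data:
            p = predict(w, b, x)
            g = p - y                      # gradient of log loss wrt the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Predicted probability of relevance; used directly as a ranking score."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
```

Ranking then just sorts candidate results by their predicted score.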
Interesting results
BM25 weights for the title field should be weighted about five times higher than weights for body text.*
Don't need very much data to build a useful model.
* for some sample news data.
Summary
Keep your logs!
Tie searches to results in logs
Dumbo + Hadoop makes ad-hoc investigation of behaviour easy.