behavioral dynamics from the serp’s perspective: what are failed serps and how to fix them?

44
Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and How to Fix Them Julia Kiseleva, Jaap Kamps, Vadim Nikulin , Nikita Makarov Eindhoven University of Technology University of Amsterdam Yandex CIKM’15, Melbourne, Australia

Upload: julia-kiseleva

Post on 11-Jan-2017

581 views

Category:

Internet


0 download

TRANSCRIPT

Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and How to Fix Them?

Julia Kiseleva, Jaap Kamps, Vadim Nikulin , Nikita Makarov

Eindhoven University of TechnologyUniversity of Amsterdam

Yandex

CIKM’15, Melbourne, Australia

Want to go to CIKM conference

QUERY SERP

What is User Satisfaction?

Want to go to CIKM conference

QUERY SERP

What is User Satisfaction?

?A YEAR AGO

Want to go to CIKM conference

QUERY SERP

What is User Satisfaction?

? TODAY

Want to go to CIKM conference

QUERY SERP

What is User Satisfaction?

What about freshness

Want to go to CIKM conference

QUERY SERP

What is User Satisfaction?

What about freshness???

What is User Satisfaction?

QUERY SERP,

What is User Satisfaction?

QUERY SERP,

Changes in User Intents

Wik

iped

ia P

age

View

2015

Changes in User Intents

Wik

iped

ia P

age

View

2015

Malaysia Airlines Flight 370

Changes in User Intents

Wik

iped

ia P

age

View

2015

Malaysia Airlines Flight 370

Changes in User Intents

Wik

iped

ia P

age

View

2015

Malaysia Airlines Flight 370

Malaysia Airlines Flight 17

• By analyzing behavioral dynamics at the SERP level, • can we detect an important class of detrimental cases (such as search failure)• based on changes in observable behavior

caused by low user satisfaction?

Research Problem

ti

ti+1

Tim

elin

e

How Can We Detect the Changes?

QUERY SERP,

QUERY SERP,

ti

ti+1

Tim

elin

e

How Can We Detect the Changes?

QUERY SERP,

QUERY SERP, SAT

DSAT

ti

ti+1

Tim

elin

e

How Can We Detect the Changes?

BF1 = Reformulation Signal (RS)

BF2 = Abandonment Signal (AS)

BF3 = Volume Signal (VS)

QUERY SERP,

QUERY SERP, BF4 = Click Position Signal (CS)

ti

ti+1

Tim

elin

e

How Can We Detect the Changes?

BF1 = Reformulation Signal (RS)

BF2 = Abandonment Signal (AS)

BF3 = Volume Signal (VS)

QUERY SERP,

QUERY SERP, BF4 = Click Position Signal (CS)

• Change detection techniques o In dynamically changing and non-stationary environments, the data

distribution can change over time yielding the phenomenon of concept drift

o The real concept drift refers to changes in the conditional distribution of the output (i.e., target variable) given the input (input features)

• Concept drift types:

Change Detection Techniques

• Change detection techniques o In dynamically changing and non-stationary environments, the data

distribution can change over time yielding the phenomenon of concept drift

o The real concept drift refers to changes in the conditional distribution of the output (i.e., target variable) given the input (input features)

• Concept drift types:

Time

Dat

a m

ean

Sudden/abrupt Increment

alGradual Reoccurring Outlier

(not concept drift)The example:

“flawless Beyoncé”

Seasonal change such as

“black Friday 2014”

The example:“idaho bus

crash investigation”

The example:“cikm conference

2015”

Change Detection Techniques

Detecting Drift in User Satisfaction

QUERY SERP,BFi Pti BFi( ),

Detecting Drift in User Satisfaction

QUERY SERP,BFi Pti BFi( )

QUERY SERP,Pti+1 BFi( )

,

,

Detecting Drift in User Satisfaction

QUERY SERP,BFi Pti BFi( )

QUERY SERP,Pti+1 BFi( )

,

,Failed SERP !!!

Failed SERP

Failed SERP

schedule of matches of the World Cup 2014schedule of matches of the World Cup 2014 ON TV

Detecting Drifts in Behavioral Signals

Query: “cikm conference”

0.1TimeLinet0

0.1 0.2 0.2 0.3

Reformulation: “2015”

Window W0ti

Query: “cikm conference”

0.1TimeLinet0 ti+ t

0.1 0.2 0.2 0.3 0.7 0.8 0.8

Reformulation: “2015”

Window W0 Window W1ti

E(W0) E(W1)Size of Window W1

= n1

Size of Window W0

= n0

Drift

If |E(W1) - E(W2)|> eout

Then Drift Detected

Detecting Drifts in Behavioral Signals

Calculating Threshold eout

Confidence

Variance at W = W0 U W1

m = 1/(1/n0 + 1/n1)

eout

Time

Dat

a m

ean Sudden drift

ti ti+1 ti+1+W1

Sudden VS Incremental Drifts

Time

Dat

a m

ean Sudden drift

Sudden Drift:Disambiguation

such as ‘medal Olympics

2016’

ti ti+1 ti+1+W1

Sudden VS Incremental Drifts

Time

Dat

a m

ean Sudden drift

Sudden Drift:Disambiguation

such as ‘medal Olympics

2016’

ti ti+1 ti+1+W1

Sudden VS Incremental Drifts

Abandonment Signal (AS)

Volume Signal (VS)

Time

Dat

a m

ean Sudden drift Incremental Drift

Sudden Drift:Disambiguation

such as ‘medal Olympics

2016’

Incremental Drift:

Disambiguation such as ‘CIKM

conference 2015’

ti ti+1

ti+1+W1

tj tj+1 tj+1+W2

Sudden VS Incremental Drifts

Time

Dat

a m

ean Sudden drift Incremental Drift

Sudden Drift:Disambiguation

such as ‘medal Olympics

2016’

Incremental Drift:

Disambiguation such as ‘CIKM

conference 2015’

ti ti+1

ti+1+W1

tj tj+1 tj+1+W2

Sudden VS Incremental Drifts

Click Position Signal (CS)

Abandonment Signal (AS)

Volume Signal (VS)

Time

Dat

a m

ean Sudden drift Incremental

Drift

Sudden Drift:Disambiguation

such as ‘medal Olympics

2016’

Incremental Drift:

Disambiguation such as ‘CIKM

conference 2015’

ti ti+1

ti+1+W1

tj tj+1 tj+1+W2W1 <<

W2

Sudden VS Incremental Drifts

Click Position Signal (CS)

Abandonment Signal (AS)

Volume Signal (VS)

Time

Dat

a m

ean Reoccurring

driftDisambiguation

such as ‘movie

premieres November 2014’

Disambiguation such as ‘movie

premieres December 2014’

Disambiguation such as ‘movie

premieres January 2015’

+ _

Changes in query intent

_ _+ +

Positive Sudden

Drift

Negative

Sudden Drift

Negative

Sudden Drift

Negative

Sudden Drift

Positive Sudden

Drift

Positive Sudden

Drift

Reoccurring Drifts

If sign(E(Wi+1) - E(Wi)) > 0 then “+”If sign(E(Wi+1) - E(Wi)) < 0 then “-”

Gradual Drifts

Time

Dat

a m

ean

Gradual drift Disambiguation

such as ‘novak djokovic

fiancée’

Disambiguation such as

‘novak djokovic wedding’

Disambiguation such as

‘novak djokovic baby’

+ + +__ _

Positive Sudden

Drift

Negative

Sudden Drift

Positive Incremental

Drift

Changes in query intent

oDataset consists of 12 months of the behavioral log data from Yandex (2015)

o~25 millions users per dayo In total ~ 150 millions of queries per dayoTrain Period – one monthoTest Period – 3, 7, and 14 days

Experimentation

Freq

uenc

y of

det

ecte

d D

rift

s

o 100s of thousands of query drifts detected• huge number, but small fraction of traffic

o Over 200,000 unique <Q,Q’> pairso Revisions are varied• unique revision term(s) occurring in 3-4 unique pairs• 2-3 % are year revisions (‘2014’, ‘2015’)• 17-18 % of revisions contain any number

o Detect far more revisions than standard rules/templates• Queries and revision in many language• Demonstrates general applicability of the approach

Detected Query Drifts

Evaluation

Return most clicked URL

Results

Results

Incremental Drift

• We conducted a conceptual analysis of success and failure at the SERP levelo we introduced the concept of a successful and failed SERPo we analyzed their behavioral consequences identifying indicators

of success and failure• We conducted an analysis of different types of drifts in

query intent over timeo we studied different changes in query intent: sudden,

incremental, gradual and reoccurringo we introduced an unsupervised approach to detect failed SERPs

caused by drift (sudden, incremental)• We tested our detector on massive raw search logs

Conclusions

Questions?

• We conducted a conceptual analysis of success and failure at the SERP levelo we introduced the concept of a successful and failed SERPo we analyzed their behavioral consequences identifying indicators

of success and failure• We conducted an analysis of different types of drifts in

query intent over timeo we studied different changes in query intent: sudden,

incremental, gradual and reoccurringo we introduced an unsupervised approach to detect failed SERPs

caused by drift (sudden, incremental)• We tested our detector on massive raw search logs

Conclusions