Query Session Detection as a Cascade
Matthias Hagen, Benno Stein, Tino Rüb
Bauhaus-Universität Weimar
SIR 2011, Dublin, Ireland, April 18, 2011
Hagen, Stein, Rub Query Session Detection as a Cascade 1
Introduction Motivation
It’s quiz time!
What is the user searching for?
paris hilton
Without context . . .
paris hilton
source: [http://upload.wikimedia.org/wikipedia/commons/2/26/Paris Hilton 3 Crop.jpg]
What if you knew the previous queries?
paris hotels
paris marriott
paris hyatt
paris hilton
sources: [http://www.alison-anderson.com/wp-content/uploads/hilton hotel paris 2.jpg] [http://maps.google.de/] [http://upload.wikimedia.org/wikipedia/en/e/eb/HI mk logo hiltonbrandlogo.jpg]
Query sessions: same information need
The benefits
Improved understanding of user intent
Improved retrieval performance via session knowledge
The “minor” issue
Users do not announce when querying for a new information need.
A typical query log
| User | Query | Click domain | Click rank | Time |
|------|-------|--------------|------------|------|
| 773 | istanbul | en.wikipedia.org | 1 | 2011-04-16 20:34:17 |
| 773 | istanbul archeology | | | 2011-04-17 12:02:54 |
| 773 | istanbul archeology | www.kulturturizm.tr | 6 | 2011-04-17 12:03:15 |
| 773 | istanbul archeology | www.arkeoloji.gov.tr | 13 | 2011-04-17 18:24:07 |
| 773 | constantinople | | | 2011-04-17 19:00:40 |
| 773 | constantinople | www.roman-empire.net | 4 | 2011-04-17 19:01:02 |
| 773 | hurling | | | 2011-04-17 19:03:01 |
| 773 | hurling | en.wikipedia.org | 1 | 2011-04-17 19:03:05 |
| 773 | liam mccarthy cup | | | 2011-04-17 23:33:04 |
| 773 | liam mccarthy cup | www.hurling.net | 5 | 2011-04-17 23:33:12 |
| 773 | liam mccarthy cup | starbets.ie | 16 | 2011-04-18 12:42:48 |
How to determine the break points?
| User | Query | Click domain | Click rank | Time |
|------|-------|--------------|------------|------|
| 773 | istanbul | en.wikipedia.org | 1 | 2011-04-16 20:34:17 |
| 773 | istanbul archeology | | | 2011-04-17 12:02:54 |
| 773 | istanbul archeology | www.kulturturizm.tr | 6 | 2011-04-17 12:03:15 |
| 773 | istanbul archeology | www.arkeoloji.gov.tr | 13 | 2011-04-17 18:24:07 |
| 773 | constantinople | | | 2011-04-17 19:00:40 |
| 773 | constantinople | www.roman-empire.net | 4 | 2011-04-17 19:01:02 |
| - - - | - - - | - - - | - - - | - - - |
| 773 | hurling | | | 2011-04-17 19:03:01 |
| 773 | hurling | en.wikipedia.org | 1 | 2011-04-17 19:03:05 |
| 773 | liam mccarthy cup | | | 2011-04-17 23:33:04 |
| 773 | liam mccarthy cup | www.hurling.net | 5 | 2011-04-17 23:33:12 |
| 773 | liam mccarthy cup | starbets.ie | 16 | 2011-04-18 12:42:48 |
Introduction The Problem
The key is . . .
Automatic query session detection
Automatic query session detection
Usual “technique”
For each pair of consecutive queries, check whether they express the same or a new information need.
Example
773 istanbul 2011-04-16 20:34:17
✓ same
773 istanbul archeology 2011-04-17 18:24:07
✓ same
773 constantinople 2011-04-17 19:01:02
- - - new - - -
773 hurling 2011-04-17 19:03:05
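The pairwise check above can be sketched as a small segmentation loop. This is only an illustration: `split_sessions` and `toy_same_need` are hypothetical names, and the hard-coded judgments stand in for whatever same/new decision procedure is actually used.

```python
from typing import Callable, List


def split_sessions(queries: List[str],
                   same_need: Callable[[str, str], bool]) -> List[List[str]]:
    """Group a user's queries into sessions by checking consecutive pairs."""
    sessions: List[List[str]] = []
    for query in queries:
        if sessions and same_need(sessions[-1][-1], query):
            sessions[-1].append(query)   # same information need: extend session
        else:
            sessions.append([query])     # new information need: start a session
    return sessions


def toy_same_need(q: str, q_next: str) -> bool:
    # Hypothetical stand-in: judgments copied from the slide's example.
    same_pairs = {
        ("istanbul", "istanbul archeology"),
        ("istanbul archeology", "constantinople"),
    }
    return (q, q_next) in same_pairs


log = ["istanbul", "istanbul archeology", "constantinople", "hurling"]
sessions = split_sessions(log, toy_same_need)
```

With the example judgments, the first three queries form one session and `hurling` starts a new one.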
Introduction Related Work
Typical features
| Feature | Variant | Reference |
|---------|---------|-----------|
| Temporal thresholds | 5 minutes | [Silverstein et al., 1999] |
| | 10–15 minutes | [He and Göker, 2000] |
| | 30 minutes | [Downey et al., 2007] |
| | user specific | [Murray et al., 2006] |
| Lexical similarity | n-gram overlap | [Zhang and Moffat, 2006] |
| | Levenshtein distance | [Jones and Klinkner, 2008] |
| Semantic similarity | Search results | [Radlinski and Joachims, 2005] |
| | ESA | [Lucchese et al., 2011] |
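As a rough illustration of the simplest feature in the table, a fixed temporal threshold can be sketched as follows. The function name `session_starts` is hypothetical; the timestamps are the four from the earlier example (istanbul, istanbul archeology, constantinople, hurling), and 30 minutes is one of the listed thresholds.

```python
from datetime import datetime, timedelta
from typing import List


def session_starts(timestamps: List[datetime], threshold: timedelta) -> List[int]:
    """Indices at which a new session starts under a fixed time-gap threshold."""
    starts = [0] if timestamps else []
    for i in range(1, len(timestamps)):
        if timestamps[i] - timestamps[i - 1] > threshold:
            starts.append(i)
    return starts


times = [
    datetime(2011, 4, 16, 20, 34, 17),  # istanbul
    datetime(2011, 4, 17, 18, 24, 7),   # istanbul archeology
    datetime(2011, 4, 17, 19, 1, 2),    # constantinople
    datetime(2011, 4, 17, 19, 3, 5),    # hurling
]
starts_30min = session_starts(times, timedelta(minutes=30))
```

On this example the 30-minute threshold splits the three related queries apart and keeps `hurling` together with `constantinople`, since only about two minutes separate them.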
Previous methods
Observations
Temporal thresholds: fast but bad accuracy
Feature combinations: more accurate
One of the best: Geometric method (time + lexical) [Gayo-Avello, 2009]
Shortcomings
All features are evaluated simultaneously → runtime suffers
Geometric method ignores semantics → accuracy suffers
Examples
Subset test suffices: hurling → hurling gaa (same session)
Geometric method fails: hurling → mccarthy cup (same session)
Cascading Method The Framework
We address the shortcomings in a cascade . . .
source: [http://wp.ltchambon.com/wp-content/uploads/2010/09/Cascade-de-Tufs-Baume-les-messieurs-Jura.jpg]
. . . well . . . a small 4-step cascade
source: [http://www.solarshop.com/solarpix/Solar Cascade 4 Tier GreenL.jpg]
Step 1: Subset tests
→ Step 2: Geometric method
→ Step 3: ESA similarity
→ Step 4: Search results
Basic Idea
Feature cost (runtime) increases from step to step.
Expensive features are computed only if the previous steps are “unreliable.”
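The control flow of such a cascade might look like the sketch below. It assumes each step returns `True` (same session), `False` (new session), or `None` when it is "unreliable"; the step functions, the call counter, and the `fallback` parameter are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, Optional, Sequence, Tuple

# A step returns True (same session), False (new session),
# or None when it considers its own verdict unreliable.
Step = Callable[[str, str], Optional[bool]]


def cascade_decision(q: str, q_next: str, steps: Sequence[Step],
                     fallback: bool = False) -> Tuple[bool, int]:
    """Run the steps from cheap to expensive; stop at the first reliable verdict."""
    for number, step in enumerate(steps, start=1):
        verdict = step(q, q_next)
        if verdict is not None:
            return verdict, number
    # Assumption: if no step is reliable, fall back to a fixed decision.
    return fallback, len(steps)


calls = {"cheap": 0, "expensive": 0}


def cheap_step(q: str, q_next: str) -> Optional[bool]:
    # Toy cheap feature: only identical queries are decided here.
    calls["cheap"] += 1
    return True if q == q_next else None


def expensive_step(q: str, q_next: str) -> Optional[bool]:
    # Toy expensive feature: any shared keyword counts as same session.
    calls["expensive"] += 1
    return bool(set(q.split()) & set(q_next.split()))


steps = [cheap_step, expensive_step]
first = cascade_decision("hurling", "hurling", steps)
second = cascade_decision("hurling gaa", "gaa rules", steps)
```

The counter shows the point of the design: the expensive step runs only for the second pair, where the cheap step could not decide.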
Cascading Method Step 1: Subset tests
Simple string comparison
Criterion
Consecutive queries q and q′ are in the same session if q is a subset or superset of q′.
Else: go to Step 2.
Remarks: This covers repetition, specialization, and generalization.
A time gap means the user is continuing a pending session.
Example
Repetition: hurling → hurling (same session)
Specialization: hurling → hurling gaa (same session)
Generalization: hurling gaa → hurling (same session)
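A minimal sketch of Step 1, assuming that queries are compared as sets of lowercased, whitespace-separated keywords (the slides only say "simple string comparison", so the tokenization is an assumption). `subset_test` is a hypothetical name; it returns `None` to signal "go to Step 2".

```python
from typing import Optional


def subset_test(q: str, q_next: str) -> Optional[bool]:
    """Return True if one query's keyword set contains the other's, else None."""
    terms = set(q.lower().split())
    terms_next = set(q_next.lower().split())
    if not terms or not terms_next:
        return None  # assumption: empty queries are left to later steps
    if terms <= terms_next or terms_next <= terms:
        return True  # repetition, specialization, or generalization
    return None      # undecided: hand over to Step 2


repetition = subset_test("hurling", "hurling")
specialization = subset_test("hurling", "hurling gaa")
generalization = subset_test("hurling gaa", "hurling")
undecided = subset_test("hurling", "mccarthy cup")
```

Note that the test never answers "new session"; it either confirms the same session or defers.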
Cascading Method Step 2: Geometric method
Combination of temporal and lexical features [Gayo-Avello, 2009]
For consecutive queries q and q′
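$f_{\text{temp}} = \max\left(0,\ 1 - \frac{t}{24\,\text{h}}\right)$, where $t$ is the time between q and q′
$f_{\text{lex}}$ = cosine similarity of the 3- to 5-grams of q′ and $s$, where $s$ is the session of q
Criterion (original)
Consecutive queries q and q′ are in the same session if $\sqrt{f_{\text{temp}}^2 + f_{\text{lex}}^2} \ge 1$.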
[Figure: decision plot with temporal similarity on the x-axis and lexical similarity on the y-axis, both from 0 to 1.0. Regions: “Same session” and “New session.” Annotations: “Nearly identical queries at long temporal distance” and “Different queries with no temporal distance.”]
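The original criterion can be sketched in a few lines. The assumptions here are mine: n-grams are taken over characters, the session is represented by joining its queries with spaces, and cosine similarity is computed on raw n-gram counts. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import List

DAY_SECONDS = 24 * 3600


def f_temp(gap_seconds: float) -> float:
    """Temporal similarity: max(0, 1 - t / 24h)."""
    return max(0.0, 1.0 - gap_seconds / DAY_SECONDS)


def char_ngrams(text: str, n_min: int = 3, n_max: int = 5) -> Counter:
    """Counts of character 3- to 5-grams (assumed n-gram unit)."""
    text = text.lower()
    return Counter(text[i:i + n]
                   for n in range(n_min, n_max + 1)
                   for i in range(len(text) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def f_lex(q_next: str, session: List[str]) -> float:
    """Lexical similarity between the new query and the current session."""
    return cosine(char_ngrams(q_next), char_ngrams(" ".join(session)))


def geometric_same_session(q_next: str, session: List[str],
                           gap_seconds: float) -> bool:
    """Same session iff sqrt(f_temp^2 + f_lex^2) >= 1."""
    return math.hypot(f_temp(gap_seconds), f_lex(q_next, session)) >= 1.0


repeat_after_hour = geometric_same_session("hurling", ["hurling"], 3600)
related_but_different = geometric_same_session("mccarthy cup", ["hurling"], 120)
```

Under these assumptions, `mccarthy cup` shares no character n-gram with `hurling`, so a two-minute gap alone is not enough and the pair is labeled as a new session.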
Performs well on standard test corpus . . .
[Figure: two panels, “Same session” and “New session,” each plotting lexical similarity (y-axis) against temporal similarity (x-axis), both from 0 to 1.0.]
. . . but has some problems “on the edge”
[Figure: grid over temporal similarity (x-axis) and lexical similarity (y-axis), both from 0 to 1.0, annotated with numeric cell values.]
Major problems
Similar queries, time gap (upper left) → Merely a matter of opinion
Different queries, same semantics (lower right) → Incorporate semantics
Criterion (adapted)
Apply the original geometric method if $f_{\text{temp}} < 0.8$ or $f_{\text{lex}} > 0.4$. Else: go to Step 3.
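The adapted criterion amounts to a guard in front of the original one. In this sketch, `adapted_geometric` is a hypothetical name and the two feature values are passed in precomputed; `None` signals "go to Step 3".

```python
import math
from typing import Optional


def adapted_geometric(f_temp: float, f_lex: float) -> Optional[bool]:
    """Trust the geometric method only outside the lower-right region."""
    if f_temp < 0.8 or f_lex > 0.4:
        # Original criterion: same session iff sqrt(f_temp^2 + f_lex^2) >= 1.
        return math.hypot(f_temp, f_lex) >= 1.0
    # Short time gap but low lexical similarity: unreliable, defer to Step 3.
    return None


deferred = adapted_geometric(0.99, 0.0)       # e.g. hurling -> mccarthy cup
same_session = adapted_geometric(0.5, 0.95)   # similar queries, 12 h apart
new_session = adapted_geometric(0.3, 0.2)     # dissimilar and far apart
```

The deferred case is exactly the region where different query strings may still share semantics.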
Cascading Method Step 3: Explicit Semantic Analysis
How ESA works [Gabrilovich and Markovitch, 2007]
Preprocessing
tf·idf-weighted inverted index of Wikipedia articles → term-document matrix $M$
For consecutive queries q and q′
$f_{\text{esa}}$ = cosine similarity of $M^T \cdot q'$ and $M^T \cdot s$, where $s$ is the session of q
Criterion
Consecutive queries q and q′ are in the same session if $f_{\text{esa}} \ge 0.35$. Else: go to Step 4.
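To make the projection $M^T \cdot q$ concrete, here is a toy sketch. `TOY_INDEX` is an invented miniature index with made-up weights and concept labels (a real ESA index is built from tf·idf weights over Wikipedia articles); the session is again assumed to be the space-joined queries, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Optional

# Invented toy index: term -> {concept: weight}. Not real Wikipedia data.
TOY_INDEX: Dict[str, Dict[str, float]] = {
    "hurling": {"Hurling": 0.9, "Liam MacCarthy Cup": 0.4},
    "mccarthy": {"Liam MacCarthy Cup": 0.8, "Hurling": 0.3},
    "cup": {"Liam MacCarthy Cup": 0.5, "FIFA World Cup": 0.6},
    "istanbul": {"Istanbul": 0.9},
}


def concept_vector(text: str, index: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Sum the concept weights of all terms, i.e. M^T times the term vector."""
    vector: Dict[str, float] = {}
    for term in text.lower().split():
        for concept, weight in index.get(term, {}).items():
            vector[concept] = vector.get(concept, 0.0) + weight
    return vector


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def f_esa(q_next: str, session: List[str], index: Dict[str, Dict[str, float]]) -> float:
    return cosine(concept_vector(q_next, index),
                  concept_vector(" ".join(session), index))


def esa_step(q_next: str, session: List[str], index: Dict[str, Dict[str, float]],
             threshold: float = 0.35) -> Optional[bool]:
    """True if f_esa >= threshold, else None (go to Step 4)."""
    return True if f_esa(q_next, session, index) >= threshold else None


related = esa_step("mccarthy cup", ["hurling"], TOY_INDEX)
unrelated = esa_step("istanbul", ["hurling"], TOY_INDEX)
```

With the toy weights, the two hurling-related queries overlap in concept space even though they share no keyword, while `istanbul` is passed on to Step 4.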
Cascading Method Step 4: Search results
Even more “semantics”
Idea
Enrich the short query strings with the results of some web search engine.
Criterion
Consecutive queries q and q′ are in the same session iff they share at least one of their top 10 search results.
Remark
If q and q′ share no top 10 result, the decision should be “not sure.”
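The overlap test itself is simple once the result lists are available. The sketch below takes the ranked results as plain lists, since the slides do not name a search engine or API; the URLs are invented placeholders, and `None` stands for "not sure".

```python
from typing import List, Optional


def search_result_step(results_q: List[str], results_q_next: List[str],
                       k: int = 10) -> Optional[bool]:
    """True if the two top-k result lists share an entry, else None ("not sure")."""
    shared = set(results_q[:k]) & set(results_q_next[:k])
    return True if shared else None


# Invented placeholder result lists, for illustration only.
results_hurling = ["en.wikipedia.org/wiki/Hurling", "www.example.org/gaa"]
results_cup = ["www.example.org/cup", "en.wikipedia.org/wiki/Hurling"]
results_istanbul = ["en.wikipedia.org/wiki/Istanbul"]

shares_result = search_result_step(results_hurling, results_cup)
not_sure = search_result_step(results_hurling, results_istanbul)
```

How a caller resolves "not sure" (for example, by defaulting to a new session) is left open here.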
Cascading Method Experimental Results
That’s the complete cascade
source: [http://www.solarshop.com/solarpix/Solar Cascade 4 Tier GreenL.jpg]
Step 1: Subset tests
→ Step 2: Geometric method
→ Step 3: ESA similarity
→ Step 4: Search results
What about accuracy and performance?
Accuracy and runtime
Accuracy on Gayo-Avello’s corpus (11 000 queries, 2.7 per session)
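| Method | Precision | Recall | F-Measure (β = 1.5) |
|--------|-----------|--------|---------------------|
| Geometric | 0.8673 | 0.9431 | 0.9184 |
| Cascading | 0.8618 | 0.9676 | 0.9328 |

Performance per step on Gayo-Avello’s corpus

| Step | Affected | F-Measure | Time | Factor |
|------|----------|-----------|------|--------|
| Step 1 | 40.49% | 0.8303 | 0.08 ms | 1.0 |
| Step 2 | 35.15% | 0.9292 | 0.20 ms | 2.5 |
| Step 3 | 2.05% | 0.9316 | 0.27 ms | 3.4 |
| Step 4 | 0.85% | 0.9328 | 9.85 ms | 123.1 |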
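For reference, the weighted F-measure with β = 1.5 follows the standard definition $F_\beta = (1+\beta^2)\,PR / (\beta^2 P + R)$. The sketch below recomputes the geometric method's value from its precision and recall, and the time factors relative to Step 1; because the table entries are rounded, recomputed values can differ from reported ones in the last digits.

```python
def f_measure(precision: float, recall: float, beta: float = 1.5) -> float:
    """Standard F-beta: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    return (1 + beta_sq) * precision * recall / denominator if denominator else 0.0


geometric_f = f_measure(0.8673, 0.9431)

# Time factor of each step relative to Step 1 (0.08 ms).
step_times_ms = [0.08, 0.20, 0.27, 9.85]
factors = [round(t / step_times_ms[0], 1) for t in step_times_ms]
```

With β > 1, recall is weighted more heavily than precision.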
Goal: high quality session test data
Our own use case
Sample sessions from the AOL log as test data.
AOL log (cleaned): 35.4 million interactions from 470 000 users.
Some figures
Step 4 involved in 22.5% → 8 million web queries → 300 ms per search → 1 month
Way out
Drop Step 4 and the sessions on which it would have been invoked
Remaining sessions: F-Measure = 0.9755
Cleaned AOL log: 27 minutes
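The one-month estimate can be checked with a short back-of-the-envelope calculation. It assumes that the 22.5% refers to the 35.4 million interactions and that each affected interaction triggers one web search; both readings are my interpretation of the figures above.

```python
interactions = 35_400_000      # cleaned AOL log
step4_share = 0.225            # share on which Step 4 is involved
seconds_per_search = 0.3       # 300 ms per web search

web_queries = interactions * step4_share
total_seconds = web_queries * seconds_per_search
total_days = total_seconds / (24 * 3600)

print(f"{web_queries / 1e6:.2f} million web queries, about {total_days:.1f} days")
```

This gives roughly 8 million searches and somewhat under 28 days of sequential querying, in line with the "1 month" on the slide.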
Conclusion
Almost the end: The take-away messages!
What we have (not) done
Results
Cascading method
Cheap features first
Beats geometric
3-step version: simple, fast, high-quality sessions
Future Work
Postprocessing for multi-tasking
Postprocessing for goals/missions
Thank you,