search query disambiguation from short sessions
DESCRIPTION
Search Query Disambiguation from Short Sessions. Lilyana Mihalkova & Raymond Mooney The University of Texas at Austin. Query Disambiguation. scrubs. ?. Existing Work. Well-studied problem: [e.g., Sugiyama et al. 04, Sun et al. 05, Dou et al. 07] - PowerPoint PPT PresentationTRANSCRIPT
1
Search Query Disambiguation
from Short SessionsLilyana Mihalkova & Raymond
Mooney
The University of Texas at Austin
2
Query Disambiguation
scrubs
?
3
Existing Work
• Well-studied problem: [e.g., Sugiyama et al. 04, Sun et al. 05, Dou et al. 07]
• Common Assumption: Information about each user is available over a relatively long period of time.
4
Privacy Concerns
• NY Times: “A Face is Exposed for AOL Searcher no. 4417749”
• [Conti, 06]: “Googling Considered Harmful”
5
Pragmatic Concerns
• Identifying users across search sessions– Log-in?– IP Address?
• Managing and protecting user-specific information
6
Proposed Setting
• Base personalization only on short-term search histories– complete search histories cannot be
reconstructed
• Relate current session to previous short sessions of other users, based on the search activity in these sessions
7
How Short is Short-Term?N
umbe
r of
ses
sion
s w
ith th
at m
any
quer
ies
Number of queries before ambiguous query
8
Is This Enough Info?
98.7 fm
kroq
scrubs
www.star987.com
www.kroq.com
???
huntsville hospital
ebay.com
scrubs
www.huntsvillehospital.com
www.ebay.com
???scrubs-tv.com scrubs.com
9
More Closely Related Work
• [Almeida & Almeida 04]: Similar assumption of short sessions, but better suited for a specialized search engine (e.g. on computer science literature)
• [Krause & Horvitz 08]: Explicitly models the tradeoff between better performance and more user information.
10
Main Challenge
• How to harness this small amount of potentially noisy information available for each user?– Exploit the relations among users,
sessions, URLs– Use statistical relational learning (SRL)
[Getoor & Taskar 07]
11
Using Relational Information
huntsville hospital
ebay
scrubs
huntsvillehospital.org
ebay.com
???
huntsville school
. . .
. . .
hospitallink.com
scrubs
scrubs-tv.com
…
ebay.com
scrubs
scrubs.com
12
Details
• Used Markov logic networks (MLNs) [Richardson & Domingos 06]– MLN structure is provided as domain
knowledge– Weights are learned from the data
• Weight learning: Adapted contrastive divergence [Lowd & Domingos 07] for incremental learning
13
Predicates
• Evidence predicates– provide information about clicked URLs and
keywords shared between sessions, i.e.• shares-keyword-between-clicks(ActiveS, backgroundS,
keyword)• shares-keyword-between-click-and-search(ActiveS,
backgroundS, keyword)• shares-clicks(ActiveS, BackgroundS, hostname)
– provide information about clicked URLs and keywords in current session
• Query predicate– states that user will chose particular URL
• clicks-on(ActiveS, hostname)
14
Re-Ranking of Search Results
• Search engine produces a list of search results
• For each possible search result R, compute the probability that
clicks-on(ActiveS, R)
• Rank the search results by their likelihood of being clicked
15
MLN 1
• User will click on at least one result• User will select result chosen by
previous user with whom a click is shared
ambiguous query www.someplace1.com
some query ambiguous query
www.clickedResult.com
. . .
www.someplace1.com
16
MLN 2
• MLN1 +• User will select result chosen by
previous user with whom a keyword is shared– click-to-click, click-to-search, search-to-
click, search-to-search
ambiguous query www.aClick.com
some query ambiguous query
www.clickedResult.comwww.someplace1.com
some other
17
MLN 3
• MLN 2 + • User will choose result that shares a
keyword with a previous search or click in the current session
ambiguous query
some query
www.someplace1.com
www.someResult.com
www.anotherPossibility.com
www.yetAnother.com
18
Data
• Collected from the MSN engine in May 2006
• Contains time-stamped records of searches and clicked URLs, grouped by sessions – Average session length is 3.28– No across-session identifiers
• Used first 25 days for training/validation and last 6 days for test
19
Data Limitation #1:
• Data does not specify what queries are ambiguous– Consider query as ambiguous, if over all
pages clicked after searching for this query, at least 2 fall in different high-level categories in the DMOZ (dmoz.org) hierarchy.
– Limit to query strings of up to two words (43.7%)• 6,360 ambiguous queries (2.4% of all two-word
query strings)
20
Data Limitation #2
• Data does not provide the full list of search results presented to the user; only the ones actually clicked– Assume that the URLs seen by the user
are those clicked by at least one person after searching for the exact query string
– Consequence: result sets have differing lengths
21
Result Set Sizes
Size of result set for ambiguous queryNum
ber
of q
ueri
es w
ith th
at r
esul
t set
siz
e
22
Evaluation Metrics: MAP
• Mean average precision – identical to the area under the
interpolated precision/recall curve
23
Evaluation Metrics: AUC-ROC
• Area under the ROC curve– identical to the mean average true
negative rate
24
Baselines
• Random: Rank randomly• Click-Sim: Rank by similarity based on
shared clicks• Click-KW-Sim: Rank by similarity based
on shared clicks and keywords
25
Click-Sim
huntsville hospital
scrubs
huntsvillehospital.org
???
scrubs
scrubs.tv
scrubs
scrubs.tv
scrubsscrubs
scrubs
scrubs
scrubs.med
scrubs.med
scrubs.med
scrubs.med
. . . . . . . . .
. . .
. . .
. . .
scrubs.tv
Average similaritybased on shared clicks
26
Click-KW-Sim
huntsville hospital
scrubs
huntsvillehospital.org
???
scrubs
scrubs.tv
scrubs
scrubs.tv
scrubsscrubs
scrubs
scrubs
scrubs.med
scrubs.med
scrubs.med
scrubs.med
. . . . . . . . .
. . .
. . .
. . .
scrubs.tv
Average similaritybased on shared clicksand keywords
27
Results (MAP)MAP
0.28
0.29
0.3
0.31
0.32
0.33
0.34
0.35
0.36
0.37
0.38
0.39
Random Click-Sim Click-KW-Sim
MLN1 MLN2 MLN3
* **
*
28
Results (AUC-ROC)
AUC-ROC
0.46
0.48
0.5
0.52
0.54
0.56
0.58
Random Click-Sim
Click-KW-Sim
MLN1 MLN2 MLN3
*
**
29
Current/Future Work
• Incorporating more information in the models– Actual content of clicked pages– Popularity of pages– Weighing evidence based on how close it
is in time to ambiguous query• Learning separate weights for each
connecting keyword or domain/group of keywords or domains
• Revising the provided clauses
30
Questions?
31
1
• The popularity of a possible result provides a strong signal, but providing relational information on top of popularity gives further performance improvements– Rank by popularity + click-KW-Sim baseline:
• MAP (0.383), AUC-ROC (0.536)
– Rank by popularity only:• MAP(0.380), AUC-ROC (0.525)
32
2N
umbe
r of
ses
sion
s w
ith th
at m
any
clic
ks
Number of distinct clicks before ambiguous query