Output URL Bidding
Panagiotis Papadimitriou, Hector Garcia-Molina (Stanford University)
Ali Dasdan, Santanu Kolay (eBay Inc.)
Related papers: VLDB 2011, InfoLab TR-939, AdAuctions 2009
Search Engine Results Page (SERP)
Organic Results
Sponsored Ads
Query
Sponsored Search Ads
Keyword Bidding
Advertiser Search Engines
KEYWORDS
the social network
lord of the rings
the matrix
lotr III
...
# keywords = ~10K
Example SERPs
the social network
– en.wikipedia.org/wiki/The_Social_Network
– www.imdb.com/title/tt1285016/
the matrix
– www.imdb.com/title/tt133093/
– en.wikipedia.org/wiki/The_Matrix
the lord of the rings
– en.wikipedia.org/wiki/The_Lord_of_the_rings
– www.imdb.com/title/tt120737/
lotr iii
– en.wikipedia.org/wiki/The_Lord_of_the_rings
– www.imdb.com/title/tt167260/
Output Bidding
Advertiser Search Engines
imdb.com AND wikipedia.org
# URLs = 2
URLs
Outline
• Architectures
• Bid Language
• Output bid/expression generation
• Spill Evaluation
• Experiments
Architectures: Current Search Engine Architecture
Architectures: Serialization
• Overview
– First, retrieve organic results
– Then, retrieve ads
• Pros
– Simplicity
• Cons
– Results latency
O: Organic Search System
S: Sponsored Search System
SERP
Architectures: Pipelining
• Split the organic search system into
– Or: retrieval subsystem (retrieve relevant docs)
– Op: post-processing subsystem (create result snippets)
• Op and S run in parallel
• Pros
– No additional latency
• Cons
– Sponsored search system depends on organic system
O: Organic Search System = Or + Op
S: Sponsored Search System
SERP
Architectures: Parallelization
• URLs with ads are known a priori
• S can use
– Or’: Or replica that indexes only URLs with ads
• Pros
– No additional latency
– Independent organic and sponsored search systems
• Cons
– More resources
O: Organic Search System (Or + Op)
S: Sponsored Search System
Or’: Small replica of Or
V: Ad validation
SERP
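The parallel architecture can be illustrated with a small sketch. Everything named here is an assumption made for illustration: `serve_query`, the two retriever callables standing in for Or and Or’, and the `ads_by_url` map. The slide only names V as "ad validation"; the sketch assumes that V keeps an ad only if one of its URLs really appears in the organic results. Snippet creation (Op) is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Set, Tuple

Retriever = Callable[[str], List[str]]  # query -> ranked list of URLs


def serve_query(
    query: str,
    organic_retrieve: Retriever,      # stands in for Or (full index)
    replica_retrieve: Retriever,      # stands in for Or' (indexes only URLs with ads)
    ads_by_url: Dict[str, Set[str]],  # hypothetical map: URL -> ids of ads bidding on it
) -> Tuple[List[str], List[str]]:
    # Or and Or' are independent, so both retrievals run concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        organic_future = pool.submit(organic_retrieve, query)
        replica_future = pool.submit(replica_retrieve, query)
        organic_urls = organic_future.result()
        replica_urls = replica_future.result()

    # S: collect candidate ads from the replica's results.
    candidates: Set[str] = set()
    for url in replica_urls:
        candidates |= ads_by_url.get(url, set())

    # V (assumed behaviour): keep a candidate only if one of its URLs
    # is present in the organic results that will actually be shown.
    validated: Set[str] = set()
    for url in organic_urls:
        validated |= ads_by_url.get(url, set()) & candidates

    return organic_urls, sorted(validated)
```

In this sketch the replica may return a URL that the full index does not rank highly enough to show, which is why a validation step is needed at all.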
Bid Language Model
• Output Expression
– e.g., a := (u1 ∧ u2) ∨ u3 ∨ (h1 ∧ h2)
– u: URL
• e.g., en.wikipedia.org/wiki/The_Social_Network
– h: host
• e.g., en.wikipedia.org
• Questions
– URLs or hosts or both?
– complex or simple?
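A minimal sketch of how such an expression could be represented and matched against a results page, assuming the expression is a disjunction of conjunctions of URL or host terms. The names `Expression`, `term_matches`, and `matches` are hypothetical, as are two conventions: a term containing "/" is treated as a full URL, and a host term also matches its subdomains (so that `imdb.com` matches `www.imdb.com`).

```python
from typing import FrozenSet, List

# A list of disjuncts; each disjunct is a set of terms that must ALL be present.
Expression = List[FrozenSet[str]]


def host_of(url: str) -> str:
    """Host part of a scheme-less URL such as 'en.wikipedia.org/wiki/The_Matrix'."""
    return url.split("/", 1)[0]


def term_matches(term: str, result_urls: List[str]) -> bool:
    if "/" in term:
        # URL term: must appear verbatim among the results.
        return term in result_urls
    # Host term: matches the host itself or any of its subdomains (assumption).
    return any(
        host_of(u) == term or host_of(u).endswith("." + term) for u in result_urls
    )


def matches(expr: Expression, result_urls: List[str]) -> bool:
    """True if at least one disjunct has all of its terms in the results."""
    return any(all(term_matches(t, result_urls) for t in d) for d in expr)


# Usage: the bid 'imdb.com AND wikipedia.org' against one results page.
bid = [frozenset({"imdb.com", "wikipedia.org"})]
serp = ["en.wikipedia.org/wiki/The_Matrix", "www.imdb.com/title/tt133093/"]
print(matches(bid, serp))
```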
Output Expression Generation: Motivation
• Use existing keyword campaigns to generate realistic output expressions to study
the social network
lord of the rings
the matrix
lotr III
...
Output Expression Generator
imdb.com AND wikipedia.org
Output Expression Generation: Motivating Example
• Problem
– INPUT: keyword set R
– OUTPUT: expression a that “covers” R
• Candidate solutions
– a1 := u1 u2 u3
– a2 := u1 u4
– a3 := u5
Output Expression Generation: Objectives
• Compactness: contain few URLs
• Spill minimization: do not match “irrelevant” queries

Output Expression    Size |a|    Spill spill(a, R)
a1 := u1 u2 u3       3           {}
a2 := u1 u4          2           {q5}
a3 := u5             1           {q4, q5, q6}
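The size and spill columns can be reproduced with a small sketch. The query-to-URL data below is hypothetical (the slide's figure is not available in the transcript) and was chosen so that the three candidates give the sizes and spills listed above. The sketch also assumes that each candidate is a disjunction of single URLs, so an expression matches every query matched by any of its URLs.

```python
from typing import Dict, List, Set

# Hypothetical toy data: the queries whose results contain each URL.
QUERIES_OF: Dict[str, Set[str]] = {
    "u1": {"q1"},
    "u2": {"q2"},
    "u3": {"q3"},
    "u4": {"q2", "q3", "q5"},
    "u5": {"q1", "q2", "q3", "q4", "q5", "q6"},
}
R: Set[str] = {"q1", "q2", "q3"}  # the advertiser's keyword set (assumed)


def matched_queries(expr: List[str]) -> Set[str]:
    """Queries matched by a disjunction of single URLs."""
    matched: Set[str] = set()
    for url in expr:
        matched |= QUERIES_OF[url]
    return matched


def covers(expr: List[str], keywords: Set[str]) -> bool:
    """True if every query in the keyword set is matched."""
    return keywords <= matched_queries(expr)


def spill(expr: List[str], keywords: Set[str]) -> Set[str]:
    """Matched queries that are not in the keyword set."""
    return matched_queries(expr) - keywords


for name, expr in [("a1", ["u1", "u2", "u3"]), ("a2", ["u1", "u4"]), ("a3", ["u5"])]:
    print(name, len(expr), sorted(spill(expr, R)))
```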
Output Expression Generation: Problem Statement
• Query Set Output Cover
– minimize γ|a| + (1−γ)|spill(a, R)|
– subject to m(a, q) for all q ∈ R
• γ: regularization parameter
• Related to
– Set Cover
– Red-Blue Set Cover
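To see how γ trades compactness against spill, the objective can be evaluated for the three candidate expressions from the objectives table, using their sizes (3, 2, 1) and spill sizes (0, 1, 3). The helper names `cost` and `best` are hypothetical, and the sketch takes for granted that all three candidates cover R.

```python
from typing import Dict, Tuple

# (size |a|, spill size |spill(a, R)|) taken from the objectives table.
CANDIDATES: Dict[str, Tuple[int, int]] = {
    "a1": (3, 0),
    "a2": (2, 1),
    "a3": (1, 3),
}


def cost(size: int, spill_size: int, gamma: float) -> float:
    """Objective value: gamma * |a| + (1 - gamma) * |spill(a, R)|."""
    return gamma * size + (1 - gamma) * spill_size


def best(gamma: float) -> str:
    """Name of the candidate with the lowest objective value for this gamma."""
    return min(CANDIDATES, key=lambda name: cost(*CANDIDATES[name], gamma))


for gamma in (0.1, 0.6, 0.9):
    print(gamma, best(gamma))
```

A small γ penalizes spill most and favors a1, a large γ penalizes size most and favors a3, and values in between favor a2.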
Output Expression Generation: Greedy Algorithm
• Pre-compute
– C[u]: Queries covered by URL u
– S[u]: Spill of URL u w.r.t. R
• Algorithm
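One way a greedy step over C[u] and S[u] could look is sketched below. This is a standard weighted set-cover heuristic adapted to the objective, not necessarily the authors' algorithm. It assumes single-URL disjuncts, takes C[u] to be the queries of R covered by u, and picks at each step the URL with the lowest added cost per newly covered query. The function name `greedy_cover` and the toy data are hypothetical.

```python
from typing import Dict, List, Set


def greedy_cover(
    R: Set[str],
    C: Dict[str, Set[str]],  # C[u]: queries of R covered by URL u
    S: Dict[str, Set[str]],  # S[u]: spill of URL u w.r.t. R
    gamma: float,
) -> List[str]:
    uncovered = set(R)
    chosen: List[str] = []
    spilled: Set[str] = set()  # spill already incurred by chosen URLs
    while uncovered:
        best_url, best_ratio = None, float("inf")
        for u in C:
            gain = len(C[u] & uncovered)
            if u in chosen or gain == 0:
                continue
            # Added cost: one more URL plus any spill queries not yet incurred.
            price = gamma + (1 - gamma) * len(S[u] - spilled)
            if price / gain < best_ratio:
                best_url, best_ratio = u, price / gain
        if best_url is None:
            raise ValueError("R cannot be covered by the given URLs")
        chosen.append(best_url)
        uncovered -= C[best_url]
        spilled |= S[best_url]
    return chosen


# Hypothetical toy data.
R = {"q1", "q2", "q3"}
C = {"u1": {"q1"}, "u2": {"q2"}, "u3": {"q3"}, "u4": {"q2", "q3"}, "u5": {"q1", "q2", "q3"}}
S = {"u1": set(), "u2": set(), "u3": set(), "u4": {"q5"}, "u5": {"q4", "q5", "q6"}}
print(greedy_cover(R, C, S, gamma=0.1))
print(greedy_cover(R, C, S, gamma=0.9))
```

Like other greedy set-cover heuristics, this does not guarantee an optimal expression; it only guarantees that the result covers R when a cover exists.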
Spill Evaluation
• Spill queries may be relevant to R
• Divide spill(a, R) into
– positive: relevant
– negative: irrelevant
• Use query clustering for evaluation
• Example:
– a := u2 u3
– Positive spill = {q1}
– Negative spill = {q5}
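The split can be sketched as follows, assuming a spill query counts as positive when it falls in the same cluster as at least one query of R. The slides do not say how clusters are built, so the `cluster_of` assignment, the cluster labels, and the keyword set R below are hypothetical, chosen so that the result agrees with the example above.

```python
from typing import Dict, Set, Tuple


def split_spill(
    spill: Set[str], R: Set[str], cluster_of: Dict[str, str]
) -> Tuple[Set[str], Set[str]]:
    """Return (positive, negative) spill based on shared query clusters."""
    # Clusters that contain at least one of the advertiser's own queries.
    relevant = {cluster_of[q] for q in R if q in cluster_of}
    positive = {q for q in spill if cluster_of.get(q) in relevant}
    return positive, spill - positive


# Hypothetical cluster assignment.
cluster_of = {"q1": "movies", "q2": "movies", "q3": "movies", "q5": "travel"}
positive, negative = split_spill({"q1", "q5"}, {"q2", "q3"}, cluster_of)
print(sorted(positive), sorted(negative))
```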
Experimental Evaluation: Goals
• Compare output URL bidding variations
– 1-URL, 2-URL, 3-URL
• e.g., 2-URL: use only URLs, up to 2 URLs in a disjunct
– 1-host, 2-host, 3-host
– 1-mixed, 2-mixed
• Comparison criteria
– Compactness-spill trade-off
– Spill evaluation
Experimental Evaluation: Setup
• Dataset (from Yahoo query logs)
– 12,931,117 queries
– 62,666,514 URLs
– 7,185,392 hosts
– 2,251 ads
• Process
– For each variation (1-URL, 2-URL, …)
• For different γ values
– Generate output expressions for all 2,251 ads
Experimental Evaluation: Compactness vs. Spill
Experimental Evaluation: Positive and Negative Spill
Experimental Evaluation: Summary
• Compactness-spill trade-off
– Using both URLs and hosts outperforms the other options
– Up to 2 conjuncts in a disjunct is sufficient
• Spill evaluation
– Output expressions can bring additional queries
• Other experiments in Combining keyword and output bidding
– Output expressions are suitable for half of the keywords
– Using only hosts seems to be sufficient
Conclusions
• Output URL bidding can be implemented efficiently
• Advantages over keyword bidding
– Bid compactness
– More relevant queries
THANK YOU!