scalable advertising recommender systems
DESCRIPTION
In this presentation I will talk about the design of scalable recommender systems and their similarity to advertising systems. The problem of generating and delivering recommendations of content/products to appropriate audiences, and ultimately to individual users at scale, is largely similar to the matching problem in computational advertising, especially in the context of self- and cross-promotional content. In this analogy with online advertising, a display opportunity triggers a recommendation. The actors are the publisher (website/medium/app owner) and the advertiser (content owner or promoter), whereas the ads or creatives represent the items being recommended, which compete for the display opportunity and may have different monetary value to the actors. To effectively control what is recommended to whom, targeting constraints need to be defined over an attribute space, typically grouped by type (Audience, Content, Context, etc.), where some associated values are not known until decisioning time. In addition to constraints, there are business objectives (e.g. delivery quota) defined by the actors. Both constraints and objectives can be encapsulated into and expressed as campaigns. Finally, there is the concept of relevance, directly related to users' response prediction, which is computed using the same attribute space as signals. As in advertising, recommendation systems require a serving platform where decisioning happens in real time (a few milliseconds), typically selecting an optimal set of items to display to the user from hundreds, sometimes thousands or millions, of items. User actions are then taken as feedback and used to learn models that dynamically adjust the ordering to meet business objectives.
This is a radical departure from the traditional item-based and user-based collaborative filtering approach to recommender systems, which fails to factor in context, such as time of day, geo-location or category of the surrounding content, to generate more accurate recommendations. Traditional approaches also fail to recognize that recommendations don't happen in a vacuum and as such may require the evaluation of business constraints and objectives. All this should be considered when designing and developing true commercial recommender/advertising systems.
Speaker Bio
Joaquin A. Delgado is currently Director of Advertising Technology at Intel Media (a wholly owned subsidiary of Intel Corp.), working on disruptive technologies in the Internet TV space. Prior to that he held CTO positions at AdBrite, Lending Club and TripleHop Technologies (acquired by Oracle). He was also Director of Engineering and Sr. Architect Principal at Yahoo! His expertise lies in distributed systems, advertising technology, machine learning, recommender systems and search. He holds a Ph.D. in computer science and artificial intelligence from Nagoya Institute of Technology, Japan.
TRANSCRIPT
Scalable Advertising Recommender Systems: From Search, Display to Mobile, Social and TV
By Joaquin A. Delgado, PhD.
ACM San Francisco Bay Area Professional Chapter
Disclaimer
• The contents of this presentation are my own personal opinion and do not officially represent my employer's view in any way. Included content is especially not intended to convey the views of Intel Media (an Intel Corp. subsidiary) or Intel Corporation.
Objectives
• Demonstrate the strong similarities between advertising and recommender systems
• Illustrate some of the techniques used to build large-scale advertising systems that can be used to build effective and scalable recommender systems.
Agenda
• Introduction to Recommender Systems
• Introduction to Advertising Systems
• Example: Video Advertising Exchange
• Ok, So How Do We Scale?
• The Business of Recommendations
• The Crux of Metrics and Evaluation
• Q&A
Introduction to Recommender Systems
Recommender Systems
• Recommender systems or recommendation systems (a.k.a. recommendation engines/platforms) are a subclass of information filtering systems that seek to predict the 'rating' or 'preference' that a user would give to an item (such as music, books, or movies) or social element (e.g. people or groups) they had not yet considered.
• Recommender systems have been around since the 1980s, primarily applied to e-commerce and various social and media services.
• E.g. Movie Recommendations
– Univ. Minnesota, MovieLens (circa 1984)
– 2009 Netflix $1M Challenge
Evolution of Recommender Systems
Problem
• User i, with user features xi (demographics, browse history, search history, …), interacts with the system
• Algorithm selects item j from those available, with item features xj (keywords, content categories, ...)
• (i, j): response yij (explicit rating, implicit click/no-click)
• Predict the unobserved entries based on features and the observed entries
Algorithmic Approaches (1) : Collaborative Filtering
• Better performance for old users and old items
• Does not naturally handle new users and new items (cold-start)
Algorithmic Approaches (2) Content Based Classification Task
Intel Confidential
• Limitation: need predictive features
• Bias often high, does not capture signals at granular levels
Other Critical Limitations
• Lack of Contextually-Aware Recommendations
– Recommendations do not happen in a vacuum; context such as time-of-day, type/size of the device, geo-location, surrounding content and even more granular user information (e.g. behavioral user segments) is key to providing more relevant and timely recommendations
• Scaling Recommender Systems is hard!
– Dimensionality reduction and some recent map-reduce implementations of matrix factorization and ML algorithms are a step in the right direction, yet on their own have not been tested at "Internet Scale"
Recommender System Redux
• True goals of a Recommender System
– Amaze the user by suggesting captivating content and useful services that are contextually relevant and timely
– Enable further monetization via potential up-sell and cross-sell opportunities of content and services that actually matter to the user.
– Do all this at scale!
Introduction to Advertising Systems
Advertising
• Advertising is a form of communication for marketing, used to encourage, persuade, or manipulate an audience (viewers, readers or listeners; sometimes a specific group) to continue or take some new action. Most commonly, the desired result is to drive consumer behavior with respect to a commercial offering, although political and ideological advertising is also common.
Long History of Traditional Advertising
Online Advertising
A form of promotion that uses Internet technology for the expressed purpose of delivering marketing messages to attract customers.
The Rise of Online Advertising
Online Advertising Spending Tops $100 Billion in 2012
Why Online Advertising?
Computational Advertising
• Computational advertising is at the intersection of large-scale search and text analysis, information retrieval, statistical modeling, machine learning, optimization, and microeconomics. The central challenge of computational advertising is to find the "best match" between a given user in a given context and a suitable advertisement.
• Depending on the definition of "best match", this challenge leads to a variety of massive optimization and search problems, with complicated constraints.
Key Enabling Technology
• Systems that Scale
– Distributed Computing
– Distributed Data Processing
– No-SQL/New-SQL Databases
• Marketplace Design
– Auction and Game Theory
– Yield Optimization
– Bidding Agents
• Connecting Markets
– Real-time Bidding (RTB)
• Pervasive Internet Computing
– Proliferation of Internet Connected Devices
The World of Online Advertising
• Format: Text, Image, Rich Media, Video
• Device: Computer, Tablet, Phone, Television
• Channel: Search, Display, Email, Social
• Objective: Brand, Performance
• UX: In-App or In-Browser
The Marketplace
Audiences
Advertising Opportunities
Publishers
Service Providers
Advertisers
Ads
How Are Audiences Selected?
• Search Keyword
• Geo-location
• Contextual
• Behavioral
• Retargeting
Data is King!
Delivery Options and Market Types
GD means Guaranteed Delivery and is synonymous with brand, wholesale and fixed-price online advertising.
NGD means Non-Guaranteed Delivery and is synonymous with performance, retail, spot-market (auction-based) online advertising.
How are ad opportunities priced?
– CPM (Cost Per Mille), also called "Cost Per Thousand" (CPT), is where advertisers pay for impressions or exposures of their message to a specific target audience, priced per thousand.
– CPC (Cost Per Click) is also known as pay-per-click (PPC). Advertisers pay each time a user clicks on their listing and is redirected to their website.
– CPA (Cost Per Action), or cost-per-acquisition advertising, is performance based and is common in the affiliate marketing sector of the business.
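The three pricing models differ only in which event is billed. As a minimal sketch (the function name, rates and event counts below are hypothetical and chosen purely for illustration), the cost of one campaign under each model could be computed as:

```python
def media_cost(pricing, rate, impressions=0, clicks=0, actions=0):
    """Return what the advertiser pays under one pricing model."""
    if pricing == "CPM":
        # The rate is quoted per thousand impressions.
        return rate * impressions / 1000
    if pricing == "CPC":
        # The rate is paid once per click.
        return rate * clicks
    if pricing == "CPA":
        # The rate is paid once per completed action (e.g. a sale).
        return rate * actions
    raise ValueError(f"unknown pricing model: {pricing}")


# Hypothetical campaign: 50,000 impressions, 200 clicks, 10 actions.
cpm_cost = media_cost("CPM", 2.00, impressions=50_000)
cpc_cost = media_cost("CPC", 0.50, clicks=200)
cpa_cost = media_cost("CPA", 10.00, actions=10)
```

With these made-up numbers all three models charge the same amount, which shows how a CPM price can be translated into an equivalent CPC or CPA price once click and action rates are known.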
Advertising Funnel and Marketing Strategies
• Brand Advertising
• Performance Advertising
Bidding and Yield Optimization
Real-time Bidding (RTB) facilitates the connection of Supply and Demand from different private marketplaces
Summary
Channel | Market | Formats | Pricing | Devices | Targeting | UX
Search | NGD | Text | CPC | All | Keyword, Geo-location | Browser
Display | GD, NGD | All | All | All | All | Browser, In-App
Social | NGD | Text, Image | CPC | All | Behavioral, geo-location, contextual, retargeting | Browser, In-App
Email | GD, NGD | Text, Image | CPM, CPL | All | Geo-location, behavioral, retargeting | Email App
Example Ad System: Video Exchange
3rd-Party Data Is Used to Identify a User and Match It to Advertiser Demand via Impression-Level Bidding
1. User visits pubs in an exchange auction marketplace
2. User clicks on video player to play video
3. Exchange simultaneously pings all twelve 3rd-party data partners to see whether they have relevant demographic and/or behavioral information matching the target to available impressions across the exchange
4. Exchange matches advertiser demand to qualified users
5. The ad server serves a relevant pre-roll to that user in real time.
Responding to a Pub Ad Call
1. Publisher ad call: the Publisher (P) tells the Exchange/Network, "Yo! I need an ad!"
2. Exchange/Network responds with XML doc: "No prob Home Slice! Here's an XML doc with all the info to execute the ad."
The XML file is the recipe to execute the video ad!
Pub follows XML file recipe to execute ad
1. Publisher (P), on the publisher page: "I now have my XML doc recipe … Now I'll follow the recipe to show the ad"
2. Request for video ad file is sent to the 3rd Party Video Ad Server
3. Video ad file sent to the Publisher's video player
4. Pre-roll ad plays for the End User & beacon events provide metrics
More Players, More Redirections
Advertisers use their "primary" ad server to manage the campaign and then hand off the ad calls to a "secondary" rich media ad server, finally pulling the ad from a content delivery network as in the diagram above. This type of daisy-chaining is also quite common with ad exchanges that handle remnant inventory, thus creating even more redirections.
OK, So How Do We Scale?
• What is the Right Architecture?
• What are the best Data Structures?
• What family of Algorithms?
Scalable FE Serving Architecture
[Diagram labels: Publisher Data; Impression-Processing Server; impression; Bid-Generation Servers (Index, Model Partitions); bids, auction info]
Bid-Generation Server Farm
[Diagram: grid of Bid-Generation Servers]
#columns = #partitions = M
#rows = #replicas = N
Bidding System Structure
• Impression-Processing Server annotates the submitted impression, scatters the impression to a set of Bid-Generation Servers, gathers top bids from local auctions, and computes the overall top bids for the impression by running a global auction
• Each Bid-Generation Server works on a partition of demand data, generates bids for a given impression based on that data partition, conducts a local auction across those bids, and returns local winners and the corresponding auction info
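The scatter/gather flow described above can be sketched as follows. This is a minimal single-process illustration, not the production design: the partition layout, the `country` eligibility check and the names `local_auction` and `global_auction` are assumptions, and threads stand in for separate Bid-Generation Servers.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def local_auction(partition, impression, k):
    """Bid-Generation Server: bid on eligible ads in one partition, keep top k."""
    bids = [
        (ad["bid"], ad["id"])
        for ad in partition
        if ad["country"] == impression["country"]  # toy eligibility check
    ]
    return heapq.nlargest(k, bids)


def global_auction(partitions, impression, k=2):
    """Impression-Processing Server: scatter, gather local winners, re-rank."""
    with ThreadPoolExecutor() as pool:
        local_results = pool.map(
            lambda part: local_auction(part, impression, k), partitions
        )
        candidates = [bid for winners in local_results for bid in winners]
    return heapq.nlargest(k, candidates)


# Hypothetical demand data split across two partitions.
partitions = [
    [{"id": "a", "bid": 1.0, "country": "FR"},
     {"id": "b", "bid": 3.0, "country": "US"}],
    [{"id": "c", "bid": 2.5, "country": "FR"},
     {"id": "d", "bid": 0.5, "country": "FR"}],
]
winners = global_auction(partitions, {"country": "FR"}, k=2)
```

Because every partition returns its own top k, the global top k is guaranteed to be among the gathered candidates, so the Impression-Processing Server never needs to see the full demand data.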
Unified BE Data Analytics
• Descriptive Analytics
– OLAP
– Reports & Visualization
• Predictive Analytics
– OLTP
– Indexes and Models
• Ranking
• Prediction
• Classification
• Optimization
Analyzing an Ad Request Flow
1. Eligibility
2. Ranking (Auction)
3. Delivery
4. Display Ad
EXCHANGE
Eligibility: The Ad Matching Problem
• BE: age ∈ {10,20} & country ∉ {US}
• S: age=20 & country=FR & gender=F
• Given an assignment S, find all matching Boolean expressions (BEs)
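Before any indexing, the matching test itself is simple to state. The sketch below evaluates one conjunctive BE against one assignment by brute force; the tuple representation and the operator labels `"in"` and `"not_in"` are assumptions made for illustration.

```python
def matches(expression, assignment):
    """Return True if the assignment satisfies every predicate of the BE."""
    for attribute, operator, values in expression:
        present = assignment.get(attribute) in values
        if operator == "in" and not present:
            return False
        if operator == "not_in" and present:
            return False
    return True


# The BE and assignment S from the slide.
be = [("age", "in", {10, 20}), ("country", "not_in", {"US"})]
s = {"age": 20, "country": "FR", "gender": "F"}

is_match = matches(be, s)                             # S satisfies both predicates
us_match = matches(be, {"age": 20, "country": "US"})  # excluded by country ∉ {US}
```

Scanning every BE this way costs time linear in the number of expressions per request, which is what motivates indexing the expressions instead.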
Background: Inverted Indexes
• Posting lists of occurring terms (tokens) with list of documents:positions
• Used to match queries
– Tokens
– Boolean operators
• Search returns documents with relevance score
Indexing Boolean Expressions
• E1: A ∈ {1}
• E2: A ∈ {1} & B ∈ {2} & C ∈ {3,4}
• S: A=1 & B=2
Key | Posting List
(A,1) | E1, E2
(B,2) | E2
(C,3) | E2
(C,4) | E2
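The posting-list table above can be built and queried with a counting scheme: an expression matches when the number of its predicates hit by the assignment equals its size. This is a minimal sketch under stated assumptions: it handles only conjunctions of ∈ predicates (∉ is omitted), and the helper names `build_index` and `match` are hypothetical.

```python
from collections import defaultdict


def build_index(expressions):
    """Map each (attribute, value) key to the expressions that list it."""
    index = defaultdict(list)
    sizes = {}
    for expr_id, conjunction in expressions.items():
        sizes[expr_id] = len(conjunction)  # number of ∈ predicates
        for attribute, values in conjunction.items():
            for value in values:
                index[(attribute, value)].append(expr_id)
    return index, sizes


def match(index, sizes, assignment):
    """Return expressions whose every predicate is hit by the assignment."""
    hits = defaultdict(int)
    for key in assignment.items():
        for expr_id in index.get(key, []):
            hits[expr_id] += 1
    return sorted(e for e, count in hits.items() if count == sizes[e])


expressions = {
    "E1": {"A": {1}},
    "E2": {"A": {1}, "B": {2}, "C": {3, 4}},
}
index, sizes = build_index(expressions)
matched = match(index, sizes, {"A": 1, "B": 2})
```

For S: A=1 & B=2, E1 is hit on its single predicate and matches, while E2 is hit on only two of its three predicates (no value is assigned to C) and is rejected.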
K-Inverted List Construction
Figure 1: A set of conjunctions
ID | Expression | K
1 | age ∈ {3} ∧ state ∈ {NY} | 2
2 | age ∈ {3} ∧ gender ∈ {F} | 2
3 | age ∈ {3} ∧ gender ∈ {M} ∧ state ∉ {CA} | 2
4 | state ∈ {CA} ∧ gender ∈ {M} | 2
5 | age ∈ {3, 4} | 1
6 | state ∉ {CA, NY} | 0
Figure 2: Inverted list for Figure 1
K | Key and UB | Posting List
0 | (state,CA), 2.0 | (6, ∉, 0)
0 | (state,NY), 5 | (6, ∉, 0)
0 | Z, 0 | (6, ∈, 0)
1 | (age,3), 1.0 | (5, ∈, 0.1)
1 | (age,4), 3.0 | (5, ∈, 0.5)
2 | (state,NY), 5 | (1, ∈, 4.0)
2 | (age,3), 1.0 | (1, ∈, 0.1) (2, ∈, 0.1) (3, ∈, 0.2)
2 | (gender,F), 2 | (2, ∈, 0.3)
2 | (state,CA), 2.0 | (3, ∉, 0) (4, ∈, 1.5)
2 | (gender,M), 1.0 | (3, ∈, 0.5) (4, ∈, 0.9)
Ranking Phase I: Top-K Selection
• Search algorithm for DNF/CNF BEs with relevance ranking
• The score of a BE E reflects the "relevance" of E to an assignment S. For example, a user interested in running might be more interested in an advertisement on shoes than an advertisement on flowers
Example: Scoring
• S = {age=3, state=NY, gender=F}
• Ws = (1, 2, 3), the assignment weights for (age, state, gender)
• Score(BE1) = 0.1*1 + 4.0*2 = 8.1
• Score(BE2) = 0.5*1 + 0.3*3 = 1.4
ID | Expression | K
1 | age ∈ {3} ∧ state ∈ {NY} | 2
2 | age ∈ {3} ∧ gender ∈ {F} | 2
K | Key and UB | Posting List
2 | (state,NY), 5 | (1, ∈, 4.0)
2 | (age,3), 1.0 | (1, ∈, 0.1) (2, ∈, 0.5)
2 | (gender,F), 2 | (2, ∈, 0.3)
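The two scores in this example are consistent with summing, over every matched (attribute, value) key, the product of the expression's weight for that key and the assignment's weight for that attribute. The sketch below reproduces the arithmetic under that reading, assuming the assignment carries age=3 so that the age ∈ {3} predicates match; the function name and dictionary layout are hypothetical.

```python
def score(expression_weights, assignment, assignment_weights):
    """Sum w_E(key) * w_S(attribute) over the keys the assignment matches."""
    return sum(
        w_e * assignment_weights[attribute]
        for (attribute, value), w_e in expression_weights.items()
        if assignment.get(attribute) == value
    )


# Assumed assignment (age=3 so that both expressions match on age).
s = {"age": 3, "state": "NY", "gender": "F"}
ws = {"age": 1, "state": 2, "gender": 3}

# Expression-side weights taken from the posting lists of the example.
be1 = {("age", 3): 0.1, ("state", "NY"): 4.0}
be2 = {("age", 3): 0.5, ("gender", "F"): 0.3}

score_be1 = score(be1, s, ws)  # 0.1*1 + 4.0*2
score_be2 = score(be2, s, ws)  # 0.5*1 + 0.3*3
```

BE1 therefore outranks BE2 for this assignment, mainly because of its large weight on the state key.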
Matching Requires Two Kinds of Indexes
Example: Ad Matching
• Assignment [S]: age=20 & country=FR & gender=F
• Boolean Expression [SF]: age ∈ {10,20} & country ∉ {US}
Given an assignment S, find all matching Boolean expressions (SFs)
• Boolean Expression [DF]: ad_size ∈ {800x400,200x50} & type ∉ {flash}
• Assignment [D]: crtv_tag=sports & size=800x400 & type=Flash
Given a Boolean expression DF, find all matching assignments (Ds)
Return all matching Ad Units satisfying the two-way match!
Opportunity Query = Supply Attributes (values) ^ Demand Filters (BE)
Indexed Ad Units = Demand Attributes (values) ^ Supply Filters (BE)
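The two-way match can be sketched as two filter checks per ad unit: the ad unit's supply filters are tested against the opportunity's supply attributes, and the opportunity's demand filters are tested against the ad unit's demand attributes. This brute-force version ignores the two indexes and only illustrates the matching rule; the ad units, the field names and the harmonised `ad_size` attribute are hypothetical.

```python
def satisfies(filters, attributes):
    """True if the attribute values pass every ∈ / ∉ predicate."""
    for attribute, operator, values in filters:
        present = attributes.get(attribute) in values
        if present != (operator == "in"):
            return False
    return True


def two_way_match(opportunity, ad_units):
    """Return ids of ad units where both directions of the match hold."""
    return [
        unit["id"]
        for unit in ad_units
        if satisfies(unit["supply_filters"], opportunity["supply_attributes"])
        and satisfies(opportunity["demand_filters"], unit["demand_attributes"])
    ]


opportunity = {
    "supply_attributes": {"age": 20, "country": "FR", "gender": "F"},
    "demand_filters": [("ad_size", "in", {"800x400", "200x50"}),
                       ("type", "not_in", {"flash"})],
}
sf = [("age", "in", {10, 20}), ("country", "not_in", {"US"})]
ad_units = [
    {"id": "ad1", "supply_filters": sf,
     "demand_attributes": {"ad_size": "800x400", "type": "image"}},
    {"id": "ad2", "supply_filters": sf,
     "demand_attributes": {"ad_size": "800x400", "type": "flash"}},
    {"id": "ad3", "supply_filters": [("country", "in", {"US"})],
     "demand_attributes": {"ad_size": "200x50", "type": "image"}},
]
eligible = two_way_match(opportunity, ad_units)
```

Here ad2 is rejected by the opportunity's demand filter (flash creatives are excluded) and ad3 by its own supply filter (it targets US only), leaving ad1 as the only eligible unit.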
Ranking Phase II: Auction
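• Bids are computed as an optimization based on objectives subject to budget constraints.
$$v_G = \sum_{g} p_g \, v_g$$
where $g$ ranges over the goals, $p_g$ is the action-rate for goal $g$ (symbol assumed; the original glyph is unreadable), and $v_g$ is the goal value.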
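Reading the expression above as a sum over goals of action-rate times goal value, the value of an impression can be computed directly. This is a minimal sketch: the goal names, rates and values are invented for illustration, and `p_g` and `v_g` are assumed symbol names for the action-rate and goal value.

```python
def impression_value(goals):
    """Sum of action-rate * goal value over all goals (v_G)."""
    return sum(p_g * v_g for p_g, v_g in goals.values())


# Hypothetical goals: (predicted action-rate, value of the action in dollars).
goals = {
    "completed_view": (0.60, 0.01),
    "click": (0.02, 0.50),
    "conversion": (0.001, 20.00),
}
v_G = impression_value(goals)  # 0.006 + 0.010 + 0.020
```

A bidding agent could then cap or scale this value according to the remaining budget, which is the "subject to budget constraints" part that the sketch leaves out.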
Predictive Analytics and Models
• ML and CF techniques can be used to compute
– Weights for Relevance Ranking: assigned to BE clauses and assignment pairs
– Action-Rates, e.g. response prediction: what is the probability of a user completing an ad view, clicking or converting
– Optimization
  – Delivery: Availability and Pacing based on Budgets
  – Revenue/ROI-based Optimization
• Exploration-Exploitation is required to "learn" new signals.
• Resulting models should be partitioned and loaded into Bidding Servers
The Business of Recommendations
• Recommendations impact your business
– Create campaigns that target certain audiences, sections of the application, geo-location, etc.
– Use recommendations as a way to do promotions as well as upsell and cross-sell
• Not all item-actions are created equal
– Assess the value of the goals. Bidding agents will take care of the rest
– Some items have a limited life-span (e.g. window of availability). Be sure to represent this as constraints or budgets
Summary
Advertising | Recommender Systems
Targeting | Constraints
Budget | Availability
Bid | Relevance
Auction | Selection
Model | Model
The All Encompassing Data Engine
Data Engine
Search & Discovery
Recommendations
Advertising
Intelligence
Data Engine = Data Core + Analytics
The Crux of Metrics and Evaluation
Business
• Revenue • User Experience • Product and Service Rating
Systems
• Conversion Rate • ROC Curves • Precision • Recall
User
• Relevance • Enjoyable • Novelty • Originality
Bucket Testing and Offline Evaluation
[Diagram labels: Traces (100%); Replayer; Ad Calls; Ad Server (to be evaluated); HTTP Response; Ad Server (Random Bucket); Event Data (impr, click, conv, prob); Join; Final Data for Evaluation]
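The slide does not say how the joined data is scored. One common option, shown here purely as an assumption, is an inverse-propensity estimate: when the server under evaluation picks the same ad that the random bucket actually served, the logged click is counted and re-weighted by the logged serving probability (`prob`). The field names and rows below are hypothetical.

```python
def ips_click_rate(joined_rows):
    """Inverse-propensity estimate of the click rate of the evaluated server."""
    total = 0.0
    for row in joined_rows:
        # Only traffic where both servers agree tells us about the candidate.
        if row["candidate_ad"] == row["logged_ad"]:
            total += row["click"] / row["prob"]
    return total / len(joined_rows)


# Hypothetical joined data: replayed responses joined with random-bucket events.
joined = [
    {"candidate_ad": "a", "logged_ad": "a", "click": 1, "prob": 0.5},
    {"candidate_ad": "a", "logged_ad": "b", "click": 1, "prob": 0.5},
    {"candidate_ad": "b", "logged_ad": "b", "click": 0, "prob": 0.5},
    {"candidate_ad": "b", "logged_ad": "a", "click": 0, "prob": 0.5},
]
estimated_ctr = ips_click_rate(joined)
```

Dividing by the logged probability compensates for the fact that the random bucket served each ad only part of the time, so the estimate stays unbiased as long as every ad the candidate might choose had a non-zero chance of being served.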
The Big Fish: OTT Television
• Online TV and Video-on-Demand are here to stay
• Starting to tap into traditional TV/Cable advertising budgets
• Viewership + Web Data will power new forms of Online Advertisement
References
• Indexing Boolean Expressions
• Computational Advertising and Recommender Systems
• A Market-Based Approach to Recommender Systems
• ICML'11 Tutorial on Machine Learning for Large Scale Recommender Systems