scalable advertising recommender systems

Scalable Advertising Recommender Systems: From Search, Display to Mobile, Social and TV

By Joaquin A. Delgado, PhD.

ACM San Francisco Bay Area Professional Chapter

Disclaimer

•  The content of this presentation is my own personal opinion and does not officially represent my employer's view in any way. Included content is especially not intended to convey the views of Intel Media (an Intel Corp. subsidiary) or Intel Corporation.

Objectives

•  Demonstrate the strong similarities between advertising and recommender systems

•  Illustrate some of the techniques used to build large-scale advertising systems that can be used to build effective and scalable recommender systems.

Agenda

•  Introduction to Recommender Systems

•  Introduction to Advertising Systems

•  Example: Video Advertising Exchange

•  Ok, So How Do We Scale?

•  The Business of Recommendations

•  The Crux of Metrics and Evaluation

•  Q&A

Introduction to Recommender Systems

Recommender Systems

•  Recommender systems or recommendation systems (a.k.a. recommendation engines/platforms) are a subclass of information filtering systems that seek to predict the 'rating' or 'preference' that a user would give to an item (such as music, books, or movies) or social element (e.g. people or groups) they had not yet considered.

•  Recommender systems have been around since the 1980s, primarily applied to e-commerce and various social and media services.

•  E.g. movie recommendations

  –  Univ. Minnesota, MovieLens (circa 1984)

  –  2009 Netflix $1M Challenge

Evolution of Recommender Systems

Problem

User i, with user features x_i (demographics, browse history, search history, …), interacts with item j, available with item features x_j (keywords, content categories, ...).

The algorithm selects item j for user i; each pair (i, j) has a response y_ij (explicit rating, implicit click/no-click).

Predict the unobserved entries based on features and the observed entries

Algorithmic Approaches (1) : Collaborative Filtering

Better performance for old users and old items. Does not naturally handle new users and new items (cold start).

Algorithmic Approaches (2): Content-Based Classification Task

Intel Confidential

Limitation: needs predictive features. Bias is often high; does not capture signals at granular levels.

Other Critical Limitations

•  Lack of contextually-aware recommendations
  –  Recommendations do not happen in a vacuum; context such as time-of-day, type/size of the device, geo-location, surrounding content and even more granular user information (e.g. behavioral user segments) is key to providing more relevant and timely recommendations

•  Scaling recommender systems is hard!
  –  Dimensionality reduction and some recent map-reduce implementations of matrix factorization and ML algorithms are a step in the right direction, yet alone have not been tested at "Internet scale"

Recommender System Redux

•  True goals of a Recommender System
  –  Amaze the user by suggesting captivating content and useful services that are contextually relevant and timely
  –  Enable further monetization via potential up-sell and cross-sell opportunities of content and services that actually matter to the user.
  –  Do all this at scale!

Introduction to Advertising Systems

Advertising

•  Advertising is a form of communication for marketing, used to encourage, persuade, or manipulate an audience (viewers, readers or listeners; sometimes a specific group) to continue or take some new action. Most commonly, the desired result is to drive consumer behavior with respect to a commercial offering, although political and ideological advertising is also common.

Long History of Traditional Advertising

Online Advertising

A form of promotion that uses Internet technology for the express purpose of delivering marketing messages to attract customers.

The Rise of Online Advertising

Online Advertising Spending Tops $100 Billion in 2012

Why Online Advertising?

Computational Advertising

•  Computational advertising is at the intersection of large-scale search and text analysis, information retrieval, statistical modeling, machine learning, optimization, and microeconomics. The central challenge of computational advertising is to find the "best match" between a given user in a given context and a suitable advertisement.

•  Depending on the definition of "best match", this challenge leads to a variety of massive optimization and search problems, with complicated constraints.

Key Enabling Technology

•  Systems that Scale
  –  Distributed Computing
  –  Distributed Data Processing
  –  No-SQL/New-SQL Databases

•  Marketplace Design
  –  Auction and Game Theory
  –  Yield Optimization
  –  Bidding Agents

•  Connecting Markets
  –  Real-time Bidding (RTB)

•  Pervasive Internet Computing
  –  Proliferation of Internet Connected Devices

The World of Online Advertising

•  Objective: Brand, Performance

•  Channel: Search, Display, Email, Social

•  Format: Text, Image, Rich Media, Video

•  Device: Computer, Tablet, Phone, Television

•  UX: In-App or In-Browser

The Marketplace

Audiences

Advertising Opportunities

Publishers

Service Providers

Advertisers

Ads

How Are Audiences Selected?

•  Search Keyword

•  Geo-location

•  Contextual

•  Behavioral

•  Retargeting

Data is King!

Delivery Options and Market Types

GD means Guaranteed Delivery and is synonymous with brand, wholesale and fixed-price online advertising.

NGD means Non-Guaranteed Delivery and is synonymous with performance, retail, spot-market (auction-based) online advertising.

How are ad opportunities priced?

–  CPM (Cost Per Mille), also called "Cost Per Thousand" (CPT), is where advertisers pay per thousand impressions, or exposures, of their message to a specific target audience.

–  CPC (Cost Per Click) is also known as pay-per-click (PPC). Advertisers pay each time a user clicks on their listing and is redirected to their website.

–  CPA (Cost Per Action), or cost per acquisition, advertising is performance based and is common in the affiliate marketing sector of the business.
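To compare ads priced under these three models, it helps to express them in one unit. The sketch below assumes the common "effective CPM" (eCPM) normalization, which the slides do not spell out; the function names and the predicted rates passed in are illustrative, not part of the original deck.

```python
def ecpm_from_cpm(cpm: float) -> float:
    """A CPM price is already a price per 1,000 impressions."""
    return cpm


def ecpm_from_cpc(cpc: float, ctr: float) -> float:
    """Expected revenue per 1,000 impressions for a cost-per-click price.

    ctr is the (predicted) click-through rate: clicks / impressions.
    """
    return cpc * ctr * 1000.0


def ecpm_from_cpa(cpa: float, ctr: float, cvr: float) -> float:
    """Expected revenue per 1,000 impressions for a cost-per-action price.

    cvr is the (predicted) conversion rate: actions / clicks.
    """
    return cpa * ctr * cvr * 1000.0
```

Under this assumption, a $0.50 CPC ad with a 1% click-through rate is worth about as much per impression as a $5.00 CPM ad, which is why response prediction matters so much in performance advertising.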

Advertising Funnel and Marketing Strategies

Brand Advertising

Performance Advertising

Bidding and Yield Optimization

Real-time Bidding (RTB) facilitates the connection of supply and demand from different private marketplaces.

Summary

Channel | Market  | Formats     | Pricing  | Devices | Targeting                                        | UX
Search  | NGD     | Text        | CPC      | All     | Keyword, geo-location                            | Browser
Display | GD, NGD | All         | All      | All     | All                                              | Browser, In-App
Social  | NGD     | Text, Image | CPC      | All     | Behavioral, geo-location, contextual, retargeting | Browser, In-App
Email   | GD, NGD | Text, Image | CPM, CPL | All     | Geo-location, behavioral, retargeting            | Email App

Example Ad System: Video Exchange

3rd-party data is used to identify a user and match it to advertiser demand via impression-level bidding

•  User visits pubs in an exchange auction marketplace

•  User clicks on video player to play video

•  Exchange simultaneously pings all twelve 3rd-party data partners to see whether they have relevant demographic and/or behavioral information matching the target to available impressions across the exchange

•  Exchange matches advertiser demand to qualified users

•  The ad server serves a relevant pre-roll to that user in real time.

Match

Responding to a Pub Ad Call

1.  Publisher ad call. Publisher to Exchange/Network: "Yo! I need an ad!"

2.  Exchange/Network responds with XML doc: "No prob Home Slice! Here's an XML doc with all the info to execute the ad."

The XML file is the recipe to execute the video ad!

Pub follows XML file recipe to execute ad

1.  Publisher page: "I now have my XML doc recipe … Now I'll follow the recipe to show the ad."

2.  Request for video ad file is sent to the 3rd Party Video Ad Server.

3.  Video ad file is sent to the Publisher's video player.

4.  Pre-roll ad plays for the End User, and beacon events provide metrics.

More Players, More Redirections

Advertisers use their "primary" ad server to manage the campaign and then hand off the ad calls to a "secondary" rich media ad server, finally pulling the ad from a content delivery network as in the diagram above. This type of daisy-chaining is also quite common with ad exchanges that handle remnant inventory, thus creating even more redirections.

OK, So How Do We Scale?

•  What is the Right Architecture?

•  What are the best Data Structures?

•  What family of Algorithms?

Scalable FE Serving Architecture

[Figure labels: Impression-Processing Server; Publisher Data; impression sent to Bid-Generation Servers (Index, Model Partitions); bids and auction info returned]

Bid-Generation Server Farm

[Figure: a grid of Bid-Generation Servers, where #columns = #partitions = M and #rows = #replicas = N]

Bidding System Structure

•  The Impression-Processing Server annotates the submitted impression, scatters the impression to a set of Bid-Generation Servers, gathers top bids from local auctions, and computes the overall top bids for the impression by running a global auction

•  Each Bid-Generation Server works on a partition of demand data, generates bids for a given impression based on that data partition, conducts a local auction across those bids, and returns local winners and the corresponding auction info
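The scatter-gather flow described above can be sketched as follows. This is a minimal illustration, not the system from the talk: the class names mirror the slide, but the partition layout (an `ad_id` mapped to an eligibility function and a price), the thread pool, and the highest-price auction rule are all assumptions made for the example.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    ad_id: str
    price: float


class BidGenerationServer:
    """Owns one partition of demand data (hypothetical layout:
    ad_id -> (eligibility function, bid price))."""

    def __init__(self, partition):
        self.partition = partition

    def local_auction(self, impression, k):
        # Generate bids only from this partition, then keep the local top k.
        bids = [
            Bid(ad_id, price)
            for ad_id, (eligible, price) in self.partition.items()
            if eligible(impression)
        ]
        return heapq.nlargest(k, bids, key=lambda b: b.price)


class ImpressionProcessingServer:
    def __init__(self, servers):
        self.servers = servers  # assumed non-empty

    def top_bids(self, impression, k=1):
        annotated = dict(impression)  # placeholder for the annotation step
        # Scatter the impression to every partition in parallel.
        with ThreadPoolExecutor(max_workers=len(self.servers)) as pool:
            local_winners = list(
                pool.map(lambda s: s.local_auction(annotated, k), self.servers)
            )
        # Gather local winners and run the global auction over them.
        all_bids = [bid for winners in local_winners for bid in winners]
        return heapq.nlargest(k, all_bids, key=lambda b: b.price)
```

Because each partition returns its own top k, the global top k over the gathered bids equals the top k over all demand, so only a small amount of data crosses the network per impression.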

Unified BE Data Analytics

•  Descriptive Analytics
  –  OLAP
  –  Reports & Visualization

•  Predictive Analytics
  –  OLTP
  –  Indexes and Models

•  Ranking

•  Prediction

•  Classification

•  Optimization

Analyzing an Ad Request Flow

1.  Eligibility

2.  Ranking (Auction)

3.  Delivery

4.  Display Ad

EXCHANGE

Eligibility: The Ad Matching Problem

•  BE: age ∈ {10,20} & country ∉ {US}

•  S: age=20 & country=FR & gender=F

•  Given an assignment S, find all matching Boolean expressions (BEs)
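Before any indexing, the matching problem can be stated as a brute-force check of each expression against the assignment. The sketch below is illustrative: the `Clause` type and function names are hypothetical, and treating a missing attribute as satisfying a ∉ clause is an assumption the slide does not address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    attribute: str
    values: frozenset
    negated: bool = False  # False: attribute ∈ values; True: attribute ∉ values


def matches(be, assignment):
    """True if every clause of the conjunction `be` holds for `assignment`."""
    for clause in be:
        inside = assignment.get(clause.attribute) in clause.values
        # A ∈ clause needs inside == True; a ∉ clause needs inside == False.
        # Assumption: an attribute absent from the assignment satisfies ∉.
        if inside == clause.negated:
            return False
    return True


def find_matching(expressions, assignment):
    """Scan every expression: O(number of expressions) work per request."""
    return [name for name, be in expressions.items() if matches(be, assignment)]
```

A linear scan like this is the baseline that an inverted index over the expressions is meant to avoid when there are very many expressions.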

Background: Inverted Indexes

•  Posting lists of occurring terms (tokens) with list of documents:positions

•  Used to match queries
  –  Tokens
  –  Boolean operators

•  Search returns documents with relevance score

Indexing Boolean Expressions

•  E1: A ∈ {1}

•  E2: A ∈ {1} & B ∈ {2} & C ∈ {3,4}

•  S: A=1 & B=2

Key   | Posting List
(A,1) | E1, E2
(B,2) | E2
(C,3) | E2
(C,4) | E2
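The posting lists in this small example can be built and queried with a simple counting scheme. This is a simplified sketch that handles only ∈ clauses (no ∉); the function names and the dictionary layout are assumptions for illustration.

```python
from collections import defaultdict


def build_index(expressions):
    """expressions: expr_id -> {attribute: set of allowed values} (∈ only).

    Returns posting lists keyed by (attribute, value) and the number of
    clauses in each expression.
    """
    postings = defaultdict(list)
    sizes = {}
    for expr_id, clauses in expressions.items():
        sizes[expr_id] = len(clauses)
        for attribute, values in clauses.items():
            for value in values:
                postings[(attribute, value)].append(expr_id)
    return postings, sizes


def match(postings, sizes, assignment):
    """An expression matches when every one of its clauses is hit."""
    hits = defaultdict(int)
    for key in assignment.items():  # key is an (attribute, value) pair
        for expr_id in postings.get(key, ()):
            hits[expr_id] += 1
    return sorted(e for e, count in hits.items() if count == sizes[e])
```

With S: A=1 & B=2, E1 is hit once and has one clause, so it matches; E2 is hit twice but has three clauses, so it does not.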

K-Inverted List Construction

ID | Expression                                | K
1  | age ∈ {3} ∧ state ∈ {NY}                  | 2
2  | age ∈ {3} ∧ gender ∈ {F}                  | 2
3  | age ∈ {3} ∧ gender ∈ {M} ∧ state ∉ {CA}   | 2
4  | state ∈ {CA} ∧ gender ∈ {M}               | 2
5  | age ∈ {3, 4}                              | 1
6  | state ∉ {CA, NY}                          | 0

Figure 1: A set of conjunctions

K | Key and UB       | Posting List
0 | (state, CA), 2.0 | (6, ∉, 0)
0 | (state, NY), 5   | (6, ∉, 0)
0 | Z, 0             | (6, ∈, 0)
1 | (age, 3), 1.0    | (5, ∈, 0.1)
1 | (age, 4), 3.0    | (5, ∈, 0.5)
2 | (state, NY), 5   | (1, ∈, 4.0)
2 | (age, 3), 1.0    | (1, ∈, 0.1) (2, ∈, 0.1) (3, ∈, 0.2)
2 | (gender, F), 2   | (2, ∈, 0.3)
2 | (state, CA), 2.0 | (3, ∉, 0) (4, ∈, 1.5)
2 | (gender, M), 1.0 | (3, ∈, 0.5) (4, ∈, 0.9)

Figure 2: Inverted list for Figure 1

Ranking Phase I: Top-K Selection

•  Search algorithm for DNF/CNF BEs with relevance ranking

•  The score of a BE E reflects the "relevance" of E to an assignment S. For example, a user interested in running might be more interested in an advertisement on shoes than an advertisement on flowers

Example: Scoring

•  S = {age=3, state=NY, gender=F}

•  Ws = (1, 2, 3), the weights of the age, state and gender values in S

•  Score(BE1) = 0.1*1 + 4.0*2 = 8.1

•  Score(BE2) = 0.5*1 + 0.3*3 = 1.4

K | Key and UB     | Posting List
2 | (state, NY), 5 | (1, ∈, 4.0)
2 | (age, 3), 1.0  | (1, ∈, 0.1) (2, ∈, 0.5)
2 | (gender, F), 2 | (2, ∈, 0.3)

ID | Expression               | K
1  | age ∈ {3} ∧ state ∈ {NY} | 2
2  | age ∈ {3} ∧ gender ∈ {F} | 2
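The scores in this example can be reproduced with a short sketch. It assumes the score of an expression is the sum, over its matched ∈ clauses, of the clause weight from the posting list times the weight of that attribute in the assignment, and that the assignment carries age=3 so both expressions match; the function name and data layout are illustrative.

```python
def score(expr_weights, assignment, assignment_weights):
    """Sum of (clause weight) * (assignment weight) over matched clauses.

    expr_weights: {(attribute, value): weight taken from the posting list}
    assignment_weights: {attribute: weight of that attribute's value in S}
    """
    total = 0.0
    for (attribute, value), clause_weight in expr_weights.items():
        if assignment.get(attribute) == value:
            total += clause_weight * assignment_weights[attribute]
    return total


# Values taken from the example tables on this slide.
S = {"age": 3, "state": "NY", "gender": "F"}
WS = {"age": 1, "state": 2, "gender": 3}
BE1 = {("age", 3): 0.1, ("state", "NY"): 4.0}
BE2 = {("age", 3): 0.5, ("gender", "F"): 0.3}

print(round(score(BE1, S, WS), 2), round(score(BE2, S, WS), 2))
```

The UB column in the posting-list table plays a different role: it bounds how much a key can contribute, which lets a top-K search skip expressions that cannot reach the current K-th best score.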

Matching Requires Two Kinds of Indexes

Example: Ad Matching

•  Assignment [S]: age=20 & country=FR & gender=F

•  Boolean Expression [SF]: age ∈ {10,20} & country ∉ {US}

Given an assignment S, find all matching Boolean expressions (SFs)

•  Boolean Expression [DF]: ad_size ∈ {800x400,200x50} & type ∉ {flash}

•  Assignment [D]: crtv_tag=sports & size=800x400 & type=Flash

Given a Boolean expression DF, find all matching assignments (Ds)

Return all matching Ad Units satisfying the two-way match!

Opportunity Query = Supply Attributes (values) ^ Demand Filters (BE)

Indexed Ad Units: Demand Attributes (values), Supply Filters (BE)

Ranking Phase II: Auction

•  Bids are computed as an optimization based on objectives subject to budget constraints.

$v_G = \sum_g \lambda_g v_g$

where g ranges over the goals, $\lambda_g$ is the action rate for goal g, and $v_g$ is the goal value.
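Reading the formula on this slide as a sum over goals of action rate times goal value, the value of an impression can be sketched as below. The budget cap in `capped_bid` is an assumption standing in for the "subject to budget constraints" part, which the slide does not detail; all names and numbers are illustrative.

```python
def expected_goal_value(action_rates, goal_values):
    """v_G: sum over goals of (predicted action rate) * (goal value)."""
    return sum(action_rates[goal] * goal_values[goal] for goal in goal_values)


def capped_bid(action_rates, goal_values, remaining_budget):
    """Hypothetical bidding rule: bid the expected value, never more than
    the budget that is left."""
    return min(expected_goal_value(action_rates, goal_values), remaining_budget)


# Illustrative numbers only.
rates = {"view": 0.6, "click": 0.02}
values = {"view": 0.01, "click": 1.0}
bid = capped_bid(rates, values, remaining_budget=5.0)
```

A production bidder would typically also pace spend over time rather than only capping at the remaining budget.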

Predictive Analytics and Models

•  ML and CF techniques can be used to compute
  –  Weights for relevance ranking
    –  Assigned to BE clauses and assignment pairs
  –  Action rates
    –  E.g. response prediction: what is the probability of a user completing an ad view, clicking or converting
  –  Optimization
    –  Delivery: availability and pacing based on budgets
    –  Revenue/ROI-based optimization

•  Exploration-exploitation is required to "learn" new signals.

•  Resulting models should be partitioned and loaded into Bidding Servers

The Business of Recommendations

•  Recommendations impact your business
  –  Create campaigns that target certain audiences, sections of the application, geo-location, etc.
  –  Use recommendations as a way to do promotions as well as upsell and cross-sell

•  Not all item-actions are created equal
  –  Assess the value of the goals. Bidding agents will take care of the rest
  –  Some items have a limited life-span (e.g. window of availability). Be sure to represent this as constraints or budgets

Summary

Advertising | Recommender Systems
Targeting   | Constraints
Budget      | Availability
Bid         | Relevance
Auction     | Selection
Model       | Model

The All Encompassing Data Engine

Data Engine

Search & Discovery

Recommendations

Advertising

Intelligence

Data Engine = Data Core + Analytics

The Crux of Metrics and Evaluation

Business

• Revenue • User Experience • Product and Service Rating

Systems

• Conversion Rate • ROC Curves • Precision • Recall

User

• Relevance • Enjoyable • Novelty • Originality


Bucket Testing and Offline Evaluation

[Figure: offline evaluation pipeline. Labels: Ad Server (Random Bucket); Traces (100%); Event Data (impr, click, conv, prob); Replayer; Ad Calls; Ad Server to be evaluated; HTTP Response; Join; Final Data for Evaluation]
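One common way to use data of this shape is an inverse-propensity replay estimate: replay each logged ad call against the server under evaluation and count the logged outcome only when the new server picks the ad that was actually shown. The slide does not say which estimator is used, so the estimator, the event field names, and the reading of "prob" as the logged serving probability are all assumptions.

```python
def replay_estimate(events, policy):
    """Estimate the click rate `policy` would have achieved.

    events: iterable of dicts with keys
        "context" - the logged ad call,
        "ad"      - the ad the random bucket actually served,
        "click"   - 1 if the user clicked, else 0,
        "prob"    - probability with which the random bucket served "ad".
    policy: function mapping a context to the ad it would serve.
    """
    weighted_clicks = 0.0
    count = 0
    for event in events:
        count += 1
        # Only events where the evaluated policy agrees with the log carry
        # information; reweight them by the inverse logging probability.
        if policy(event["context"]) == event["ad"]:
            weighted_clicks += event["click"] / event["prob"]
    return weighted_clicks / count if count else 0.0
```

The random bucket matters because the reweighting is only unbiased when every ad had a known, non-zero chance of being served for each logged ad call.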

The Big Fish: OTT Television

•  Online TV and Video-on-Demand are here to stay

•  Starting to tap into traditional TV/cable advertising budgets

•  Viewership + Web data will power new forms of online advertisement

References

•  Indexing Boolean Expressions

•  Computational Advertising and Recommender Systems

•  A Market-Based Approach to Recommender Systems

•  ICML'11 Tutorial on Machine Learning for Large Scale Recommender Systems