Display Advertising with Real-Time Bidding (RTB) and Behavioural Targeting

Jun Wang, University College London
Weinan Zhang, Shanghai Jiao Tong University
Shuai Yuan, MediaGamma Ltd

July 18, 2017

arXiv:1610.03013v2 [cs.GT] 15 Jul 2017



  • Contents

    1 Introduction
        1.1 A short history of online advertising
            1.1.1 The birth of sponsored search and contextual advertising
            1.1.2 The arrival of ad exchange and real-time bidding
        1.2 The major technical challenges and issues
            1.2.1 Towards information general retrieval (IGR)
        1.3 The organisation of this monograph

    2 How RTB Works
        2.1 RTB ecosystem
        2.2 User behavioural targeting: the steps
        2.3 User tracking
        2.4 Cookie syncing

    3 RTB Auction & Bid Landscape
        3.1 The second price auction in RTB
            3.1.1 Truthful bidding is the dominant strategy
        3.2 Winning probability
        3.3 Bid landscape forecasting
            3.3.1 Tree-based log-normal model
            3.3.2 Censored linear regression
            3.3.3 Survival Model

    4 User Response Prediction
        4.1 Data sources and problem statement
        4.2 Logistic regression with stochastic gradient descent
        4.3 Logistic regression with follow-the-regularised-leader
        4.4 Bayesian probit regression
        4.5 Factorisation machines
        4.6 Decision trees
        4.7 Ensemble learning
            4.7.1 Bagging (bootstrap aggregating)
            4.7.2 Gradient boosted regression trees
            4.7.3 Hybrid Models
        4.8 User lookalike modelling
        4.9 Transfer learning from Web browsing to ad clicks
        4.10 Deep learning over categorical data
        4.11 Dealing with missing data
        4.12 Model comparison
        4.13 Benchmarking

    5 Bidding Strategies
        5.1 Bidding problem: RTB vs. sponsored search
        5.2 Concept of quantitative bidding in RTB
        5.3 Single-campaign bid optimisation
            5.3.1 Notations and preliminaries
            5.3.2 Truth-telling bidding
            5.3.3 Linear bidding
            5.3.4 Budget constrained clicks and conversions maximisation
            5.3.5 Discussions
        5.4 Multi-campaign statistical arbitrage mining
        5.5 Budget pacing
        5.6 Benchmarking

    6 Dynamic Pricing
        6.1 Reserve price optimisation
            6.1.1 Optimal auction theory
            6.1.2 Game tree based heuristics
            6.1.3 Exploration with a regret minimiser
        6.2 Programmatic direct
        6.3 Ad options and first look contracts

    7 Attribution Models
        7.1 Heuristic models
        7.2 Shapley value
        7.3 Data-driven probabilistic models
            7.3.1 Bagged logistic regression
            7.3.2 A simple probabilistic model
            7.3.3 An extension to the probabilistic model
        7.4 Other models
        7.5 Applications of attribution models
            7.5.1 Lift-based bidding
            7.5.2 Budget allocation

    8 Fraud Detection
        8.1 Ad fraud types
        8.2 Ad fraud sources
            8.2.1 Pay-per-view networks
            8.2.2 Botnets
            8.2.3 Competitors attack
            8.2.4 Other sources
        8.3 Ad fraud detection with co-visit networks
            8.3.1 Feature engineering
        8.4 Viewability methods
        8.5 Other methods

    9 The Future of RTB

    A RTB Glossary

  • Abstract

    The most significant progress in recent years in online display advertising is what is known as the Real-Time Bidding (RTB) mechanism to buy and sell ads. RTB essentially facilitates buying an individual ad impression in real time while it is still being generated from a user's visit. RTB not only scales up the buying process by aggregating a large amount of available inventories across publishers but, most importantly, enables direct targeting of individual users. As such, RTB has fundamentally changed the landscape of digital marketing. Scientifically, the demand for automation, integration and optimisation in RTB also brings new research opportunities in information retrieval, data mining, machine learning and other related fields. In this monograph, an overview is given of the fundamental infrastructure, algorithms, and technical solutions of this new frontier of computational advertising. The covered topics include user response prediction, bid landscape forecasting, bidding algorithms, revenue optimisation, statistical arbitrage, dynamic pricing, and ad fraud detection.


  • 1 Introduction

    An advertisement is a marketing message intended to encourage potential customers to purchase a product or to subscribe to a service. Advertising is also a way to establish a brand image through the repeated presence of an advertisement (ad) associated with the brand in the media. Television, radio, newspapers, magazines, and billboards are among the major channels that traditionally place ads; however, the advancement of the Internet enables users to seek information online. Using the Internet, users are able to express their information requests, navigate specific websites and perform e-commerce transactions. Major search engines have continued to improve their retrieval services and users' browsing experience by providing relevant results. Since many more businesses and services are transitioning into the online space, the Internet is a natural choice for advertisers to widen their strategy, reaching potential customers among Web users [Yuan et al., 2012].

    As a result, online advertising is now one of the fastest advancing areas in the IT industry. In display and mobile advertising, the most significant technical development in recent years is the growth of Real-Time Bidding (RTB), which facilitates a real-time auction for a display opportunity. Real-time means the auction is per impression and the process usually occurs less than 100 milliseconds before the ad is placed. RTB has fundamentally changed the landscape of the digital media market by scaling the buying process across a large number of available inventories among publishers in an automatic fashion. It also encourages user behaviour targeting, a significant shift towards buying focused on user data rather than contextual data [Yuan et al., 2013].
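    To make the idea of a per-impression auction concrete, the following is a minimal sketch in Python, not the implementation used by any real ad exchange. It assumes a second-price rule (the auction type named in the contents for Chapter 3) and assumes that bids arriving after a 100 millisecond budget are simply ignored. The names `Bid`, `AuctionResult`, `run_auction`, the `floor` parameter and the bidder names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Latency budget quoted in the text; dropping late bids is an assumption of this sketch.
LATENCY_BUDGET_MS = 100.0


@dataclass
class Bid:
    bidder: str        # hypothetical bidder name
    price: float       # bid price for this single impression
    latency_ms: float  # time the bidder took to respond


@dataclass
class AuctionResult:
    winner: str
    clearing_price: float


def run_auction(bids: List[Bid], floor: float = 0.0) -> Optional[AuctionResult]:
    """Run one second-price auction for a single ad impression."""
    # Keep only bids that arrived within the latency budget and meet the floor price.
    valid = [b for b in bids if b.latency_ms <= LATENCY_BUDGET_MS and b.price >= floor]
    if not valid:
        return None  # in this sketch the impression simply goes unsold
    ranked = sorted(valid, key=lambda b: b.price, reverse=True)
    # The winner pays the second-highest valid bid, or the floor if unopposed.
    second = ranked[1].price if len(ranked) > 1 else floor
    return AuctionResult(winner=ranked[0].bidder, clearing_price=second)


result = run_auction([
    Bid("dsp_a", 2.50, 40.0),
    Bid("dsp_b", 3.10, 85.0),
    Bid("dsp_c", 4.00, 130.0),  # highest price, but too slow, so it is ignored
])
print(result)
```

    In this example the highest bid misses the latency budget, so `dsp_b` wins and pays the price offered by `dsp_a`. One such auction would run for every impression, which is what separates RTB from buying inventory in bulk.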

    Scientifically, the further demand for automation, integration and optimisation in RTB opens new research opportunities in fields such as Information Retrieval (IR), Data Mining (DM), Machine Learning (ML), and Economics. IR researchers, for example, are facing the challenge of defining the relevancy of underlying audiences given a campaign goal, and consequently, developing techniques to find and filter them out in the real-time bid request data stream [Zhang et al., 2016a, Perlich et al., 2012]. For data miners, a fundamental task is identifying repeated patterns over the large-scale streaming data of bid requests, winning bids and ad impressions [Cui et al., 2011]. For machine learners, an emerging problem is telling a machine to react to a data stream, i.e., learning to bid cleverly on behalf of advertisers and brands to maximise conversions while keeping costs to a minimum [Xu et al., 2016, Kan et al., 2016, Cai et al., 2017].

    It is also of great interest to study learning over multi-agent systems and consider the incentives and interactions of each individual learner (bidding agent). For economics researchers, RTB provides a new playground for micro impression-level auctions with various bidding strategies and macro multiple marketplace competitions