Machine learning on big data for personalized Internet advertising

Post on 12-Jul-2015

TRANSCRIPT

  • M. RECCE

    11/18/2011 2011 Quantcast. All Rights Reserved QCon

    Machine Learning on Big Data for Personalized Adver

  • Adver
  • Outline

    Internet advertising (the business)

    Internet advertising (the data)

    Understanding consumers (the models)

    Organizing for success

  • The Personalized Media Economy

    Media is transitioning from a one-size-fits-all broadcast model to dynamic real-time choice

    Online Advertising Ecosystem

  • Money Follows Media Consumption

    Globally, hundreds of billions of dollars of ad spend will shift

    $30B opportunity

    ?

  • Why the Spending Disparity?

    Media spend processes are well established

    New media channels lag until audiences and value can be properly quantified

    Historically, digital audiences were poorly quantified

    Stratified sampling has been the norm in media measurement for decades

    Bias and sampling error prevail

  • Enter Quantcast

    Launched September 2006 to enable addressable advertising at scale

    First we had to fix audience measurement

    Launched a free service based on direct measurement of media consumption

    Use machine learning to infer audience characteristics

  • Broad Par

  • An Adver

    Massive expansion in number of decisions

    Individuals, not whole audiences

    Impressions, not whole sites

    Screens/times/locations/

    Decision timeframe reduced from weeks to milliseconds

    This problem can only be solved algorithmically

  • Data Rich Environment

    4 Billion Cookies/mo. observed

    400,000+ Events/sec real-

  • .let adver
  • RTB A Rapid & Transforma

  • Media Buying & Execu
  • Data Mining Challenges

    Audience Estimation: Using reference data from a small number of people and a small number of web sites, infer the demographics/attributes of the audience of all sites.

    User Estimation: Using media consumption records and audience estimates, determine the characteristics of an Internet user across arbitrary dimensions.

    Lookalike Selection: From the behavior of a small number of buyers of a product, determine the set of people who will buy it next.

    Live Traffic Modeling: Compute the value for showing an advertisement to a user as a function of the user, advertising environment, time of day, etc.
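
The live traffic modeling problem above can be illustrated with a small sketch. It assumes, purely for illustration, that the value of an impression is the probability of a conversion times the advertiser's value per conversion, and that the probability comes from a logistic model. The feature names, the weights, and the functions `conversion_probability` and `impression_value` are hypothetical and are not taken from the talk.

```python
import math

# Hypothetical, hand-set weights for illustration only.
# A production model would learn these from observed impressions and conversions.
WEIGHTS = {
    "user_lookalike_score": 2.5,  # the user: similarity to past buyers, in [0, 1]
    "site_relevance": 1.2,        # the advertising environment, in [0, 1]
    "is_evening": 0.4,            # time of day, 0 or 1
}
BIAS = -6.0  # conversions are rare, so the baseline probability is small


def conversion_probability(features):
    """Logistic estimate of P(conversion | user, environment, time of day)."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def impression_value(features, value_per_conversion):
    """Expected value of showing the advertisement once."""
    return conversion_probability(features) * value_per_conversion


weak_match = {"user_lookalike_score": 0.1, "site_relevance": 0.5, "is_evening": 0}
strong_match = {"user_lookalike_score": 0.9, "site_relevance": 0.5, "is_evening": 1}

low_value = impression_value(weak_match, value_per_conversion=40.0)
high_value = impression_value(strong_match, value_per_conversion=40.0)
print(f"weak match: {low_value:.4f}  strong match: {high_value:.4f}")
```

In a real-time setting a number like this would be computed per bid request and compared with the expected price of the impression; how that comparison is made is outside what the slide states.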

  • Quantcast Lookalikes for Marketers

    Revolutionary Ad Targeting for Performance and Brand

    1. Understand marketers' BEST CUSTOMERS with Quantcast Measurement

    2. Isolate DISTINCTIVE INTERESTS

    3. Find MILLIONS OF LOOKALIKES

    4. Reach them ANYWHERE

    PERFORMANCE LOOKALIKES: Quantcast technology continually optimizes real-time media for advertiser

    BRAND LOOKALIKES: Buy custom audiences from trusted media partners

    Your Site Traffic

  • Lookalike Selec

    Given an archetype group of users, find the feature set that best separates them from their complement

    Features can be positive or negative indicators of content relevance

    Find more that look like them
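
One simple way to picture the steps on this slide is sketched below. It is an assumption-laden illustration, not the method used at Quantcast: each feature gets a smoothed log-odds weight comparing how often it appears in the archetype (seed) group versus the complement, so weights can be positive or negative, and candidates are ranked by the sum of the weights of their features. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def feature_weights(seed_users, complement_users, smoothing=1.0):
    """Smoothed log-odds of a feature appearing in the seed group vs. its complement."""
    seed_counts = Counter(f for feats in seed_users for f in set(feats))
    comp_counts = Counter(f for feats in complement_users for f in set(feats))
    n_seed, n_comp = len(seed_users), len(complement_users)
    weights = {}
    for feature in set(seed_counts) | set(comp_counts):
        p_seed = (seed_counts[feature] + smoothing) / (n_seed + 2 * smoothing)
        p_comp = (comp_counts[feature] + smoothing) / (n_comp + 2 * smoothing)
        # Positive weight: indicator of the seed group; negative: indicator against it.
        weights[feature] = math.log(p_seed / p_comp)
    return weights


def score(user_features, weights):
    """Sum of the weights of the features a user exhibits; unseen features count as 0."""
    return sum(weights.get(f, 0.0) for f in set(user_features))


def top_lookalikes(candidates, weights, k):
    """Return the ids of the k candidates that score most like the seed group."""
    ranked = sorted(candidates, key=lambda uid: score(candidates[uid], weights), reverse=True)
    return ranked[:k]


# Toy data: each user is the set of content categories they consumed.
seed = [{"golf", "finance"}, {"golf", "travel"}, {"golf", "finance", "news"}]
complement = [{"news", "gossip"}, {"gossip", "travel"}, {"news", "gossip"}, {"gossip"}]
candidates = {"u1": {"golf", "travel"}, "u2": {"gossip", "news"}, "u3": {"finance"}}

weights = feature_weights(seed, complement)
lookalikes = top_lookalikes(candidates, weights, k=2)
print(lookalikes)
```

The smoothing term keeps the log-odds finite for features seen in only one group, which matters when the seed group is small.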

  • Problem Statement

    Math competition

    Largest number of conversions (purchasers) during contest wins

    Leverage information on prior purchasers to find more

    Decide how to compete

    Bring mathematicians

    More data on each converter

    Management by metrics

    Know what the competitors are doing

  • Lookalike Mass-Produc

  • Lookalikes Iden
  • Wide Range of Ac

  • RTLAL Bidding Architecture

    Model Definition Pixel Data Real Time Ad Exchange

    Model Training and Scoring Bidding Auction Mgmt

  • Activity Level Variations

  • Cookie Deletion Rates

  • Media consumption is non-stationary

    [Figure: Michael Jackson media consumption, June 25, 2009; pages consumed per minute, 13:00 to 19:00]

  • Choose the Right Objec

  • Machines High Performance Platform

    450,000 / Second: Real-time events

    5PB / Day: Processing throughput

    Multiple Global Datacenters: Ultra-high availability with advanced traffic management

  • Collabora
  • Measuring Lift ROC
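
The chart from this slide did not survive extraction, so the following is only a generic sketch of the two measures the title names, under common textbook definitions: lift as the conversion rate among the top-scored fraction of users divided by the overall conversion rate, and ROC AUC as the probability that a randomly chosen converter outscores a randomly chosen non-converter. The function names and the toy scores are hypothetical.

```python
def lift_at(scores, labels, fraction):
    """Conversion rate in the top-scored fraction divided by the overall rate.

    Assumes labels are 0/1 and that at least one label is 1.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as half a win.

    Assumes both classes are present. Quadratic in the number of users,
    which is fine for a sketch but not for large data sets.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy example: model scores for ten users and whether each one converted.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift_top_20 = lift_at(scores, labels, fraction=0.2)
auc = roc_auc(scores, labels)
print(f"lift in top 20%: {lift_top_20:.2f}  AUC: {auc:.3f}")
```

A lift of 1.0 means the model does no better than showing ads at random, and an AUC of 0.5 means the ranking carries no information.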

  • Cumula

  • Learning experimentation

    2 Days Mins

    6 Hours

    New model development

    New model in production

    To process 100TB with first MapReduce job

    Hours Live performance assessment

    2 Weeks To influence billions of real-time decisions every day and millions of dollars of advertising spend

  • Technology Matters

    Leaders will be world-class in every discipline, and will operate all as a fully integrated whole.

    Machine Learning & Optimization

    Comprehensive Coherent Data

    Petascale Big-Data Computing

    Real-Time Tech Mastery

  • If you have all that then....

    Having more Data really matters.

  • Numerous Open Challenges

    Dealing with sparsity

    Feature selection

    Real-time scoring and bidding

    True performance & attribution modeling

    Lift, lift and more lift!

    Handling 100,000s of concurrent models

  • Summary

    Digital advertising is a vast analytical environment

    Enormous data volumes

    Rich behaviors

    Objective performance metrics

    Marketing will be transformed by computational approaches

    Hundreds of billions of dollars of spend are at stake

  • Quantcast

  • Contact: mrecce@quantcast.com
