survival analysis for marketing attribution · regression in many cases, log data is available only...

33
Revolution Confidential Survival Analysis for Marketing Attribution LondonR July 2013 Andrie de Vries Business Services Director – Europe @RevoAndrie Revolution Analytics @RevolutionR

Upload: others

Post on 21-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution Confidential

Survival Analysis for

Marketing Attribution

LondonR

July 2013

Andrie de Vries

Business Services Director – Europe

@RevoAndrie

Revolution Analytics

@RevolutionR

Page 2: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialWho am I?

CRAN package ggdendro

StackOverflow

LondonR, July 2013 2

Page 3: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialFrom Toledo to Albacete

LondonR, July 2013 3

Page 4: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialRent a car?

LondonR, July 2013 4

Page 5: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution Confidential… or take a bus?

LondonR, July 2013 5

Page 6: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution Confidential… but what’s happening here?

LondonR, July 2013 6

Page 7: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialMarketing attribution: the Question

How to attribute conversion success to marketing spend?

Where to spend the next marketing dollar?

LondonR, July 2013 7

Page 8: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialAgenda

Digital marketing attribution

Using Survival models

At scale, on big data

LondonR, July 2013 8

Page 9: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialAgenda

Digital marketing attribution

Using Survival models

At scale, on big data

LondonR, July 2013 9

Page 10: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialTwo-click conversion journey

Click1:

Open landing page

Click 2:

Sign up to offer

LondonR, July 2013 10

Page 11: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialTypical conversion journey…

Impressions

• Banner ad

• Page post ad

• Sponsored tweet

• Search ad

Click

• Landing page

• Special offer

• Application form

Conversion

• Sign up

• Ask for more detail

11LondonR, July 2013

Page 12: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution Confidential…but no two journeys are the same…

Impressions

• Banner ad

• Page post ad

• Sponsored tweet

• Search ad

Click

• Landing page

• Special offer

• Application form

Conversion

• Sign up

• Ask for more detail

12LondonR, July 2013

Person 2

Person 3

Person 1

Page 13: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution Confidential

Attribution models

Last click only

All events even

Rule based

Statistical modelling

…so how to attribute the value?

13LondonR, July 2013

Person 2

Person 3

Person 1

Page 14: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialAttribution with statistical modelling

Regression

In many cases, log data is available only

for conversions

And when non-conversion data is available,

these people may convert in near future

LondonR, July 2013 14

Page 15: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialAttribution with statistical modelling

Regression

In many cases, log data is available only

for conversions

And when non-conversion data is available,

these people may convert in near future

Survival analysis

Use time to conversion as dependent

variable

Can use each interaction (view or click)

as an observation

Can include censored (incomplete) data

No need to flatten the data

LondonR, July 2013 15

Page 16: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialAgenda

Digital marketing attribution

Using Survival models

At scale, on big data

LondonR, July 2013 16

Page 17: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialSurvival models

Kaplan-Meier survivor function

Cox proportional hazards model

𝑆𝑘𝑚 =

𝑡𝑖<𝑡

𝑟 𝑡𝑖 − 𝑑(𝑡𝑖)

𝑟(𝑡𝑖)

𝐿(𝛽) = 𝐿𝑖(𝛽)

𝐿𝑖(𝛽) =𝑟𝑖 𝑡∗

𝑗 𝑌𝑗 𝑡∗ 𝑟𝑗 𝑡∗

𝜆 𝑡; 𝑍𝑖 = 𝜆0(𝑡)𝑟𝑖(𝑡) Hazard function

𝑟𝑖 𝑡 = 𝑒𝛽𝑍𝑖(𝑡) Risk score

Likelihood that

individual i dies

Partial likelihood

> library(survival)

> Surv(…)

> library(survival)

> coxph( Surv(…) ~ …)

LondonR, July 2013 17

Page 18: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialWhat is death?

LondonR, July 2013

Medicine: actual death of patient

Engineering: failure of component

18

Page 19: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialWhat is death?

LondonR, July 2013

For attribution: cookie conversion

19

Page 20: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution Confidential

Worked example

Attribution of digital media for

telecoms client

LondonR, July 2013 20

Page 21: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialRead the data

> rdsFile <- "survival_data.rds"

> xd <- readRDS(rdsFile)

> class(xd)

[1] "data.table" "data.frame"

> nrow(xd)

[1] 775782

> ncol(xd)

[1] 31

LondonR, July 2013 21

Page 22: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialWhat does the data look like?> xd[1:25, 1:6, with=FALSE]

id Conversion.Time Event.Number Event.Time Event.Type Campaign

1: 10101:49721794 01/10/2012 00:05 1 01/10/2012 00:02 Click Free Sims

2: 10101:49721801 01/10/2012 00:05 1 29/09/2012 16:25 View BAU High Media

6: 10101:49721854 01/10/2012 00:07 3 17/09/2012 18:32 View BAU High Media

7: 10101:49721854 01/10/2012 00:07 4 17/09/2012 19:13 View BAU High Media

8: 10101:49721854 01/10/2012 00:07 5 17/09/2012 19:17 View BAU High Media

9: 10101:49721854 01/10/2012 00:07 6 17/09/2012 19:20 View BAU High Media

10: 10101:49721854 01/10/2012 00:07 7 17/09/2012 19:21 View BAU High Media

11: 10101:49721854 01/10/2012 00:07 8 17/09/2012 19:47 View BAU High Media

12: 10101:49721854 01/10/2012 00:07 9 17/09/2012 19:49 View BAU High Media

13: 10101:49721854 01/10/2012 00:07 10 17/09/2012 19:53 View BAU High Media

14: 10101:49721854 01/10/2012 00:07 11 17/09/2012 20:04 View BAU High Media

15: 10101:49721854 01/10/2012 00:07 12 18/09/2012 10:02 View BAU High Media

16: 10101:49721854 01/10/2012 00:07 13 18/09/2012 10:03 View BAU High Media

17: 10101:49721854 01/10/2012 00:07 14 18/09/2012 10:03 View BAU High Media

18: 10101:49721854 01/10/2012 00:07 15 18/09/2012 20:06 View BAU High Media

19: 10101:49721854 01/10/2012 00:07 16 18/09/2012 20:10 View BAU High Media

20: 10101:49721854 01/10/2012 00:07 17 19/09/2012 18:14 View BAU High Media

21: 10101:49721854 01/10/2012 00:07 18 19/09/2012 20:23 View BAU High Media

22: 10101:49721854 01/10/2012 00:07 19 20/09/2012 20:22 View BAU High Media

23: 10101:49721854 01/10/2012 00:07 20 22/09/2012 14:57 View BAU High Media

24: 10101:49721854 01/10/2012 00:07 21 22/09/2012 22:18 View BAU High Media

25: 10101:49721854 01/10/2012 00:07 22 23/09/2012 21:06 View BAU High Media

LondonR, July 2013 22

Page 23: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialHistogram of cookie lifetime

LondonR, July 2013

Impressions and clicks in customer journey

Cookie duration (days)

Eve

nts

(im

pre

ssio

ns a

nd

clicks)

0 5 10 15 20 25 30

05

00

00

15

00

00

25

00

00

23

Page 24: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialFitting the model

LondonR, July 2013

> library(survival)

> fitp <- coxph(

Surv(times, event=Converted) ~ Type +

Event.Type +

Supplier +

PrevClicks +

AdFormat,

data=xd)

24

Page 25: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialWhat does the data say?AdFormat Event.Type PrevClicks Supplier Type

0

1

2

Super

Sky-1

60x600

Leaderb

oard

- 728x90

MP

U-3

00x250

Vie

w

Click

Pre

vC

licks

MexA

d

Specific

Media

Valu

eC

lick

Ebay

AO

L N

etw

ork

Gum

tree

Google

Dis

pla

y N

etw

ork

Facebook A

PI

Pay m

onth

ly S

IM

Contr

act

Phone

SIM

only

Exp

on

en

tia

ted

co

eff

icie

nt

LondonR, July 2013 25

Page 26: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution Confidential

AdFormat Event.Type PrevClicks Supplier Type

0

1

2

Super

Sky-1

60x600

Leaderb

oard

- 728x90

MP

U-3

00x250

Vie

w

Click

Pre

vC

licks

MexA

d

Specific

Media

Valu

eC

lick

Ebay

AO

L N

etw

ork

Gum

tree

Google

Dis

pla

y N

etw

ork

Facebook A

PI

Pay m

onth

ly S

IM

Contr

act

Phone

SIM

only

Exp

on

en

tia

ted

co

eff

icie

nt

What does the data say?

• Advertise the right product!

• Some suppliers are better at generating

conversion

• But note the data wasn’t an unbiased

experiment!

LondonR, July 2013 26

Page 27: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialEstimated hazard function

LondonR, July 2013

> x <- survfit(fitp)

> xx <- with(x, data.frame(time, surv, upper, lower))

> ggplot(xx, aes(time, surv)) + geom_step() …

27

Page 28: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialAgenda

Digital marketing attribution

Using Survival models

At scale, on big data

LondonR, July 2013 28

Page 29: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialCase study: Datasong

Profile:

Multi-channel marketing analytics

Software developer and service provider

Growing, innovative, cost-conscious

Technology:

LondonR, July 2013 29

Page 30: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialModeling the Baseline Hazard

LondonR, July 2013

Capture nonlinear trends in

baseline, while overlaying

marketing treatment variables

as well as other customer

attributes

Revolution R package used:

• RevoScaleR

Revolution R functions used:

• rxImport()

• rxSummary()

• rxCube()

• rxLogit()

• rxPredict()

• rxRoc()

30

Page 31: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialTransformations

LondonR, July 2013

Catalog Email

31

Page 32: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution ConfidentialOutcome

Massively scalable infrastructure

Attribution and optimization at individual customer level for clients

such as Williams-Sonoma

Client saved $250K in one campaign

Rapid deployment of customer-specific models

Innovative techniques, e.g. GAM Survival models

Performance improvement

Experienced 4x performance improvement on 50 million records

LondonR, July 2013 32

Page 33: Survival Analysis for Marketing Attribution · Regression In many cases, log data is available only for conversions And when non-conversion data is available, these people may convert

Revolution Confidential

33

www.revolutionanalytics.com Twitter: @RevolutionR

The leading commercial provider of software and support for the popular open source R statistics language.

LondonR, July 2013