Collective Response Spike Prediction for Mutually Interacting Consumers


Upload: rikiya-takahashi

Post on 18-Nov-2014


DESCRIPTION

The presentation deck used at the 13th IEEE International Conference on Data Mining (ICDM 2013).

TRANSCRIPT

Page 1: Collective Response Spike Prediction for Mutually Interacting Consumers

Modeling of the Individual Factors | Modeling of the Collective Factors | Experimental Results | References

Collective Response Spike Prediction for Mutually Interacting Consumers

Rikiya Takahashi (1), Hideyuki Mizuta (1), Naoki Abe (2), Ruby L. Kennedy (3), Vincent J. Jeffs (3), Ravi Shah (3), Robert H. Crites (3)

(1) IBM Research - Tokyo
(2) IBM Thomas J. Watson Research Center
(3) IBM Software Group, Enterprise Marketing Management

December 8, 2013

ICDM 2013: IEEE International Conference on Data Mining Collective Response Spike Prediction

Page 2: Collective Response Spike Prediction for Mutually Interacting Consumers

Response Spike Forecasting in e-Commerce

Goal: predict the probability of consumer i's response spike in the time interval [t, t+∆t), using two types of factors.

Individual factor: consumer i's own experiences before time t.

Collective factor: many consumers' experiences before time t.

Page 3: Collective Response Spike Prediction for Mutually Interacting Consumers

Omni-Channel Events and Essential Questions

Examples of past events that could affect future responses:

Response: purchase, web-site browsing, inquiry to call center

Stimulus: e-mail, direct mail, reach of TV ads

Interaction: observable word-of-mouth on online review sites; unobservable offline word-of-mouth; physiological sync among humans

We need to answer two essential questions:

1. How to model the time-dependency among these events?

2. How to handle the unobservable interaction?

Page 4: Collective Response Spike Prediction for Mutually Interacting Consumers

Agenda

1. Introduction of the Goal and Issues

2. Modeling of the Individual Factors
   Basics in Continuous-Time Event Prediction
   Hyperbolic Discounting in Human Memory
   Efficient Learning with Piecewise-Constant States

3. Modeling of the Collective Factors

4. Experimental Results

5. Conclusion

Page 5: Collective Response Spike Prediction for Mutually Interacting Consumers

Regression for Inhomogeneous Poisson Processes

Regressing Poisson Process Rates with a State Vector

Inhomogeneous Poisson process: a point process whose probability of event occurrence per unit time is time-varying.

$Y_{ik}(t, t+\Delta t)$: random variable representing the number of type-k response events in the time interval [t, t+∆t).

Model the time-varying log-intensity

$$ z_{ik}(t) = \lim_{\Delta t \to 0} \log \frac{P\left(Y_{ik}(t, t+\Delta t) \ge 1\right)}{\Delta t} $$

as a function of some state vector $x_i(t) \in \mathbb{R}^d$.

$x_i(t)$ must be designed from the past events before time t, observed through multiple response and stimulus channels.
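As a quick numerical check of this definition, the sketch below assumes a constant-rate Poisson process, for which P(Y ≥ 1) = 1 − exp(−λ∆t). The rate `lam` and the function name are illustrative and not taken from the slides. It shows that log(P/∆t) approaches log λ as ∆t shrinks.

```python
import math

def log_intensity_estimate(lam: float, dt: float) -> float:
    """Return log(P(Y >= 1) / dt) for a Poisson process with constant rate lam."""
    # 1 - exp(-lam * dt), computed stably for small dt
    p_at_least_one = -math.expm1(-lam * dt)
    return math.log(p_at_least_one / dt)

lam = 0.3  # hypothetical rate (events per day)
for dt in (1.0, 0.1, 0.001):
    # The estimate approaches log(lam) as dt gets smaller
    print(dt, log_intensity_estimate(lam, dt), math.log(lam))
```

For a finite ∆t the estimate sits slightly below log λ, because more than one event can fall into the same interval.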

Page 6: Collective Response Spike Prediction for Mutually Interacting Consumers

Desirable Property of State: Piecewise-Constancy

How to appropriately model the state $x_i(t)$? Exploit piecewise-constant features (Rajaram et al., 2005; Gunawardana et al., 2011).

Analytically tractable Poisson log-likelihood terms.
Example: "number of type-k events in the past L days/weeks".

Efficiently computed with terminators of sliding windows.

Figure: Computing state time-series with multiple sliding windows. Every element of the state vector is finally time-aligned.
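One possible reading of the sliding-window computation is sketched below. Each event raises a window count by one at its own time, and a "terminator" lowers it by one when the event leaves the window, so the count is constant between change points. The function names and the toy event times are hypothetical; the slides do not give an implementation.

```python
from typing import List, Tuple

def window_count_series(event_times: List[float], window: float) -> List[Tuple[float, int]]:
    """Change points (time, count) of 'number of events in the past `window`'."""
    deltas = []
    for t in event_times:
        deltas.append((t, +1))           # the event enters the window
        deltas.append((t + window, -1))  # terminator: the event leaves the window
    deltas.sort()
    series, count = [], 0
    for time, delta in deltas:
        count += delta
        if series and series[-1][0] == time:
            series[-1] = (time, count)   # merge simultaneous changes
        else:
            series.append((time, count))
    return series

def aligned_state(event_times: List[float], windows: List[float]) -> List[Tuple[float, List[int]]]:
    """State vectors (one count per window length) on the union of all change points."""
    per_window = [window_count_series(event_times, w) for w in windows]
    times = sorted({t for series in per_window for t, _ in series})
    rows = []
    for t in times:
        row = []
        for series in per_window:
            current = 0
            for change_time, count in series:
                if change_time > t:
                    break
                current = count          # latest value at or before t
            row.append(current)
        rows.append((t, row))
    return rows

print(aligned_state([0.0, 1.0, 5.0], [3.0, 10.0]))
```

Between two consecutive rows the whole state vector is constant, which is what makes the Poisson log-likelihood of each interval available in closed form.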

Page 7: Collective Response Spike Prediction for Mutually Interacting Consumers

Desirable Property: Hyperbolic Discounting

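The state should be psychologically interpretable for marketers. Human memory exhibits power-law decay (hyperbolic discounting) (Rubin and Wenzel, 1996; Wixted and Ebbesen, 1997).

Fast initial decay but long-term persistence.

Figure: magnitude (0 to 1) versus elapsed time after an event, comparing hyperbolic and exponential decay.

Formed as an infinite mixture of exponential discounting:

$$ \frac{1}{(1 + t/\tau)^{\alpha}} \equiv \int_{0}^{\infty} e^{-\lambda t}\, \mathrm{Ga}(\lambda; \alpha, \tau)\, d\lambda, \quad \text{where} \quad \mathrm{Ga}(\lambda; \alpha, \tau) \triangleq \frac{\tau^{\alpha}}{\Gamma(\alpha)} \lambda^{\alpha-1} \exp(-\tau\lambda). $$

Finite-mixture approximation with sampling:

$$ \frac{1}{(1 + t/\tau)^{\alpha}} \simeq \sum_{i=1}^{K} w_i e^{-\lambda_i t}, \quad \text{where} \quad w_i = 1/K, \quad \lambda_i \sim \mathrm{Ga}(\lambda; \alpha, \tau). $$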
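The finite-mixture approximation can be checked numerically. The sketch below draws K decay rates from a gamma distribution with shape α and rate τ and compares the averaged exponentials with the hyperbolic curve. The values of `alpha`, `tau`, and `K` are arbitrary choices for illustration.

```python
import math
import random

def hyperbolic(t: float, alpha: float, tau: float) -> float:
    """Hyperbolic discounting 1 / (1 + t/tau)^alpha."""
    return (1.0 + t / tau) ** (-alpha)

def sampled_mixture(t: float, rates) -> float:
    """Equal-weight mixture of exponential decays with the sampled rates."""
    return sum(math.exp(-lam * t) for lam in rates) / len(rates)

rng = random.Random(0)
alpha, tau, K = 1.5, 2.0, 20000  # illustrative values, not from the slides
# random.gammavariate takes (shape, scale); a rate of tau is a scale of 1/tau
rates = [rng.gammavariate(alpha, 1.0 / tau) for _ in range(K)]

for t in (0.5, 2.0, 10.0):
    print(t, hyperbolic(t, alpha, tau), sampled_mixture(t, rates))
```

The Monte Carlo error shrinks at the usual 1/sqrt(K) rate, so a small K gives only a rough match.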

Page 8: Collective Response Spike Prediction for Mutually Interacting Consumers

Nonparametric Staircase Modeling

An effective compromise between piecewise constancy and hyperbolic discounting: staircase function approximation.

Finite mixture of step functions, instead of mixing exponentials:

$$ \sum_{i=1}^{K} w_i\, I(t_i < t < t_i + L_i) $$

Nonparametric curve fitting: include many combinations of event types and window lengths in the state $x_i(t)$.
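To illustrate how nested windows can form a staircase that follows a hyperbolic curve, the sketch below picks the step heights so that the staircase matches the curve at the left edge of each step. This matching rule, the window lengths in days, and the function names are assumptions made for illustration; in the deck the heights are learned from data.

```python
from typing import Callable, List

def hyperbolic(t: float, alpha: float = 1.0, tau: float = 1.0) -> float:
    return (1.0 + t / tau) ** (-alpha)

def staircase_weights(window_lengths: List[float], target: Callable[[float], float]) -> List[float]:
    """Heights w_i such that sum_i w_i * I(s < L_i) equals target at each step's left edge.

    window_lengths must be sorted in increasing order.
    """
    edges = [0.0] + list(window_lengths)
    k = len(window_lengths)
    levels = [target(edges[j]) for j in range(k)] + [0.0]
    return [levels[j] - levels[j + 1] for j in range(k)]

def staircase(s: float, window_lengths: List[float], weights: List[float]) -> float:
    """Value of the staircase at elapsed time s after a single event."""
    return sum(w for length, w in zip(window_lengths, weights) if 0.0 <= s < length)

windows = [1.0, 2.0, 4.0, 7.0, 14.0, 28.0]  # hypothetical window lengths in days
weights = staircase_weights(windows, hyperbolic)
for s in (0.0, 1.5, 5.0, 20.0, 30.0):
    print(s, staircase(s, windows, weights), hyperbolic(s))
```

Because the target is decreasing, every height is non-negative, and window lengths on a roughly doubling grid keep steps short where the curve changes fastest.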

Page 9: Collective Response Spike Prediction for Mutually Interacting Consumers

Nonparametric Staircase Modeling

An example of forecasting future response rates.

How to efficiently fit event-specific curves from real data?

Fix every sliding-window length a priori, and optimize only the height of each step function.

Page 10: Collective Response Spike Prediction for Mutually Interacting Consumers

Convex Variable-Interval Poisson Regression

Linear model with a mapping $\Phi: \mathbb{R}^d \to \mathbb{R}^{d_\Phi}$ to represent diminishing returns (e.g., an element-wise sub-linear function):

$$ P\left(Y_{ik}(t, t+\Delta t) \ge 1\right) = \Delta t \exp\left(b_k + w_k^{\top} \Phi(x_i(t))\right) $$

Maximum a posteriori estimation with an L1 penalty:

$$ \max_{b_k, w_k} \left[ -n C_0 \|w_k\|_1 + \sum_{i=1}^{n} \sum_{j=1}^{T_i} \ell\left(y_{ijk};\; b_k + w_k^{\top} \Phi(x_{ij}),\; \tau_{ij}\right) \right] $$

$\ell(y; z, \tau) \triangleq yz - \tau \exp(z)$: continuous-time log-likelihood

$n$: number of consumers

$\tau_{ij}$: the length of the j-th interval of consumer i

Update with either coordinate-wise batch learning or online learning algorithms (e.g., FOBOS (Duchi and Singer, 2009)).
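A minimal sketch of this kind of estimator follows, using batch proximal gradient steps (a gradient step on the log-likelihood followed by soft-thresholding, in the spirit of forward-backward splitting). It is not the authors' implementation: the objective is normalized by the number of intervals rather than by n, the bias is left unpenalized, and the toy rows, step size, and iteration count are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

def loglik(y: float, z: float, tau: float) -> float:
    """Continuous-time Poisson log-likelihood term y*z - tau*exp(z)."""
    return y * z - tau * math.exp(z)

def soft_threshold(v: float, thr: float) -> float:
    """Proximal operator of the L1 penalty."""
    return math.copysign(max(abs(v) - thr, 0.0), v)

def fit_l1_poisson(rows: Sequence[Tuple[List[float], float, float]], dim: int,
                   c0: float, step: float = 0.05, iters: int = 5000):
    """rows: (mapped features Phi(x), event count y, interval length tau). Returns (b, w)."""
    b, w = 0.0, [0.0] * dim
    n_rows = len(rows)
    for _ in range(iters):
        grad_b, grad_w = 0.0, [0.0] * dim
        for phi, y, tau in rows:
            z = b + sum(wi * xi for wi, xi in zip(w, phi))
            g = y - tau * math.exp(z)  # derivative of loglik with respect to z
            grad_b += g / n_rows
            for d in range(dim):
                grad_w[d] += g * phi[d] / n_rows
        b += step * grad_b             # gradient ascent on the unpenalized bias
        w = [soft_threshold(wi + step * gi, step * c0) for wi, gi in zip(w, grad_w)]
    return b, w

# Toy data: feature 0 multiplies the rate by 5, feature 1 is irrelevant
rows = [([0.0, 0.0], 1, 10.0), ([1.0, 0.0], 5, 10.0),
        ([0.0, 1.0], 1, 10.0), ([1.0, 1.0], 5, 10.0)]
b, w = fit_l1_poisson(rows, dim=2, c0=0.01)
print(b, w)
```

On this toy data the penalty drives the weight of the irrelevant feature to zero while shrinking the informative one only slightly, which is the sparsity behavior the L1 term is meant to provide.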

Page 11: Collective Response Spike Prediction for Mutually Interacting Consumers

Examples of the Fitted Curves

Data: individual-level daily records of two years (2009-2011) of marketing-action and response events at an online retailer.

Figure: fitted curves over time [week] for six event pairs: purchase→purchase, catalog→purchase, e-mail→purchase, e-mail→browsing, browsing→purchase, browsing→browsing.

Page 12: Collective Response Spike Prediction for Mutually Interacting Consumers

Agenda

1. Introduction of the Goal and Issues

2. Modeling of the Individual Factors

3. Modeling of the Collective Factors
   Dependence among Consumer Responses
   Frequency Aggregation with Residual Clustering
   Multi-Task Learning of Cluster-Specific Models

4. Experimental Results

5. Conclusion

Page 13: Collective Response Spike Prediction for Mutually Interacting Consumers

Evidence of Interactions among Consumers

Reject the hypothesis of independence among consumers.

Left: the actual totals frequently fall outside the confidence intervals. The response frequency of each individual is Poisson-distributed, so the sum of response frequencies over independent consumers must also obey a Poisson distribution.

Right: significant autocorrelation (predictability) in the sum of regression residuals.

Figure panels: left, total response frequency by date; right, autocorrelation (0.0 to 1.0) versus lag [days] (0 to 25).

Figure: High dispersion from the sum of individuals. Black: actual. Blue: predicted mean. Green: predicted 0.5th and 99.5th percentiles.
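Both checks can be sketched in a few lines. Under independence the daily total is Poisson with mean equal to the sum of the individual predicted rates, so one can count how often the actual total leaves the 0.5% to 99.5% Poisson band, and one can compute the autocorrelation of the summed residuals. The function names and the toy numbers are hypothetical, and the simple quantile loop assumes a moderate mean (exp(-mean) underflows for very large means).

```python
import math
from typing import Sequence

def poisson_quantile(q: float, mean: float) -> int:
    """Smallest k with P(X <= k) >= q for X ~ Poisson(mean)."""
    k = 0
    term = math.exp(-mean)
    total = term
    while total < q:
        k += 1
        term *= mean / k
        total += term
    return k

def count_exceedances(actual_totals: Sequence[int], predicted_means: Sequence[float],
                      lower_q: float = 0.005, upper_q: float = 0.995) -> int:
    """Number of periods whose actual total lies outside the Poisson band."""
    outside = 0
    for y, mu in zip(actual_totals, predicted_means):
        if y < poisson_quantile(lower_q, mu) or y > poisson_quantile(upper_q, mu):
            outside += 1
    return outside

def autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var

print(count_exceedances([10, 25, 2, 19, 3], [10.0] * 5))
print(autocorrelation([1.0, -1.0, 1.0, -1.0], 1))
```

With a 99% band, independent Poisson totals should leave the band on roughly 1% of days, so frequent exceedances point to over-dispersion and hence to dependence among consumers.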

Page 14: Collective Response Spike Prediction for Mutually Interacting Consumers

Approaches to Incorporate Interactions

Graphical Granger modeling (e.g., Lozano et al., 2009) to connect individuals: unscalable, with cost quadratic in the number of consumers.

Our approach: fit a graph that connects only clusters, with cost quadratic only in the number of clusters.

In our data, fitting an individual-to-individual graph did not improve accuracy even with L1 sparsification.

Page 15: Collective Response Spike Prediction for Mutually Interacting Consumers

The 3-Step Estimator with Residual Clustering

Aggregating frequencies over the population stabilizes the fitting.
Detect each group whose members follow a common trend.

Clustering to exclude the autocorrelation within the same consumer.

A) Initial Fitting: regression using only the individual factors.

B) Residual Clustering: clustering of the residual time-series from A.

C) Final Fitting: fit interaction coefficients among clusters.

Page 16: Collective Response Spike Prediction for Mutually Interacting Consumers

Residual Clustering

Cluster the vectors $\{ r_{ik} \triangleq (r_{i1k}, \ldots, r_{iHk})^{\top} \}_{i=1}^{n}$, where

$$ r_{ijk} \triangleq y_{ijk} - \int_{\tilde{t}_{j-1}}^{\tilde{t}_{j}} \exp\left(b_k + w_k^{\top} \Phi(x_i(t))\right) dt. $$

$y_{ijk}$: weekly-smoothed actual response frequency.
We recommend m-medians for suppressing outlying peaks.

Persistent biases are confirmed in the residual time-series of each cluster, implying autocorrelation within the same cluster.

Figure panels: actual, predicted, and residual frequencies by date.

Figure: Frequency time-series for 4 example clusters. The aggregated residuals of each cluster have non-zero means.
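A small m-medians sketch is given below for the residual vectors: assignment uses the L1 distance and each center is the coordinate-wise median of its members, which is why a single outlying peak barely moves the center. The initialization from given centers, the fixed iteration count, and the toy residuals are assumptions for illustration.

```python
from statistics import median
from typing import List, Sequence, Tuple

def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))

def m_medians(vectors: List[List[float]], initial_centers: List[List[float]],
              iters: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Cluster vectors into m = len(initial_centers) groups with coordinate-wise medians."""
    centers = [list(c) for c in initial_centers]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest center in L1 distance
        labels = [min(range(len(centers)), key=lambda c: l1_distance(v, centers[c]))
                  for v in vectors]
        # Update step: coordinate-wise median of the members
        for c in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [median(col) for col in zip(*members)]
    return labels, centers

# Toy residual time-series (H = 3); the third vector contains an outlying peak of 50
residuals = [[1.0, 1.0, 1.0], [1.2, 0.9, 1.1], [0.8, 1.0, 50.0],
             [-1.0, -1.0, -1.0], [-0.9, -1.1, -1.0]]
labels, centers = m_medians(residuals, [residuals[0], residuals[3]])
print(labels, centers)
```

With means instead of medians, the peak of 50 would pull the third coordinate of the first center to about 17.4; the median keeps it at 1.1.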

Page 17: Collective Response Spike Prediction for Mutually Interacting Consumers

Multi-Task Regression for Clusters

Reformulate the log-intensity as

$$ z_{ik}(t) = b_{c[i]k} + w_{c[i]k}^{\top} \Phi(x_i(t)) + \sum_{c'=1}^{m} \theta_{c[i]c'k}^{\top} \Psi(z_{c'}(t)). $$

$z_{c'}(t)$: sum of frequencies within cluster $c'$, computed with sliding windows

$\Psi$: another mapping function

$\theta_{c[i]c'k}$: interaction strength from cluster $c'$ to $c[i]$.

Using the initial estimate $W$ from step A (with $w_k$ its weight vector for response type k), we maximize

$$ L(D_k \mid \Theta^{*}_{k}) - \frac{n}{m} \sum_{c=1}^{m} \left( C_1 \|w_{ck} - w_k\|_1 + C_2 \sum_{c'=1}^{m} \|\theta_{cc'k}\|_1 \right), $$

which is the sum of the data log-likelihood $L(D_k \mid \Theta^{*}_{k})$ (details are omitted) and a convex multi-task learning penalty.
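The two pieces of this formulation can be written out directly for a single response type k. The sketch below evaluates the cluster-specific log-intensity and the multi-task penalty; the data layout (nested lists indexed by cluster) and all numeric values are hypothetical, and the data log-likelihood term is left out because the slides omit its details.

```python
from typing import List, Sequence

def log_intensity(i: int, phi_x: Sequence[float], psi_z: Sequence[Sequence[float]],
                  cluster_of: Sequence[int], b: Sequence[float],
                  w: Sequence[Sequence[float]],
                  theta: Sequence[Sequence[Sequence[float]]]) -> float:
    """z_ik(t) = b_c + w_c . Phi(x_i(t)) + sum_c' theta_cc' . Psi(z_c'(t)), with c = c[i]."""
    c = cluster_of[i]
    z = b[c] + sum(wc * f for wc, f in zip(w[c], phi_x))
    for c_prime, psi in enumerate(psi_z):
        z += sum(t * p for t, p in zip(theta[c][c_prime], psi))
    return z

def multitask_penalty(w: Sequence[Sequence[float]], w_init: Sequence[float],
                      theta: Sequence[Sequence[Sequence[float]]],
                      n: int, c1: float, c2: float) -> float:
    """(n/m) * sum_c (C1 * ||w_c - w_init||_1 + C2 * sum_c' ||theta_cc'||_1)."""
    m = len(w)
    total = 0.0
    for c in range(m):
        total += c1 * sum(abs(a - a0) for a, a0 in zip(w[c], w_init))
        total += c2 * sum(abs(t) for row in theta[c] for t in row)
    return (n / m) * total

# Two clusters, two individual features, one collective feature per cluster
w = [[1.0, 0.0], [0.5, 0.5]]
w_init = [1.0, 0.0]                        # step-A estimate shared by all clusters
theta = [[[0.1], [0.0]], [[0.0], [-0.2]]]  # theta[c][c'] is a vector
z = log_intensity(0, [2.0, 1.0], [[3.0], [4.0]], [1], [-1.0, -2.0], w, theta)
penalty = multitask_penalty(w, w_init, theta, n=10, c1=0.1, c2=0.2)
print(z, penalty)
```

The first penalty term pulls every cluster-specific weight vector toward the shared step-A estimate, and the second keeps the cluster-to-cluster interaction graph sparse.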

Page 18: Collective Response Spike Prediction for Mutually Interacting Consumers

Experimental Setting

Evaluate the predictive accuracy using individual-level daily events over two years (2009-2011) at an online retailer.

3 folds: each fold contains about 2M events (browsing: 900K & purchase: 60K) by 30K customers.

Response: 1 type of purchase and 9 types of browsing

Stimulus: 5 types of omni-channel direct marketing

8 types of window lengths: 1 day, 2 days, 4 days, 1 week, 2 weeks, 4 weeks, . . ., and 32 weeks

Each dataset is split into a training set and a test set, using the middle date as the dividing point.

Page 19: Collective Response Spike Prediction for Mutually Interacting Consumers

Performance Metric

In addition to the log-likelihood, we evaluate the Continuous-Time Area-Under-Curve (CTAUC), based on the Continuous-Time Receiver-Operator-Characteristics (CTROC) curve, which represents what fraction of the actual response events is covered by the high-score periods.

Figure: Principle in computing the CTAUC.
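One plausible way to compute such a metric is sketched below, under the assumption that the curve plots the fraction of actual events covered (vertical axis) against the fraction of total time spent in the highest-scored periods (horizontal axis). The exact definition used in the paper may differ; the function name and the toy intervals are hypothetical.

```python
from typing import List, Tuple

def ctauc(intervals: List[Tuple[float, float, int]]) -> float:
    """Area under the time-coverage versus event-coverage curve.

    intervals: (predicted score, duration, number of actual events) per period.
    Periods with tied scores are taken in the order returned by the sort.
    """
    total_time = sum(duration for _, duration, _ in intervals)
    total_events = sum(events for _, _, events in intervals)
    ranked = sorted(intervals, key=lambda row: row[0], reverse=True)
    area, x, y = 0.0, 0.0, 0.0
    for _, duration, events in ranked:
        next_x = x + duration / total_time
        next_y = y + events / total_events
        area += (next_x - x) * (y + next_y) / 2.0  # trapezoid under the curve
        x, y = next_x, next_y
    return area

# A short high-score period holding most events gives an area above 0.5
print(ctauc([(2.0, 1.0, 3), (1.0, 3.0, 1)]))
# Events spread uniformly over time give an uninformative 0.5
print(ctauc([(2.0, 1.0, 1), (1.0, 1.0, 1)]))
```

Weighting each period by its duration is what distinguishes this from the ordinary AUC, since the piecewise-constant intervals have variable lengths.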

Page 20: Collective Response Spike Prediction for Mutually Interacting Consumers

Predictability with Rich Individual Factors

Results when using only the individual factors:

Baseline: models containing only past purchase events.

Adding past browsing events and stimuli improves the accuracy.

Over 80% of responses are covered in the top-20% periods.

Figure panels: (1) CTAUC (0.6 to 1) versus index of the dataset (1 to 3); (2) average test-set log-likelihood (-0.1 to -0.02) versus index of the response type (1 to 3); (3) coverage for actual responses versus coverage for high-intensity periods (both 0 to 1). Series in each panel: All (Proposed), PastPurchase+Action, OnlyPastPurchase.

Figure: Comparing performances for the inclusion of covariates.

Page 21: Collective Response Spike Prediction for Mutually Interacting Consumers

Implication from the Fitted Curves

Marketing implications from autocorrelation:

Strong impacts from browsing on the next purchase.
   Cause a sequence of (campaigns→browsing→purchase).

Necessity of simulation in long-term forecasting.
   To marginalize direct impacts and chains of responses.

Figure panels: curves over time [week] for purchase→purchase, catalog→purchase, e-mail→purchase, e-mail→browsing, browsing→purchase, browsing→browsing.

Figure: Examples of the nonparametrically fitted curves.

Page 22: Collective Response Spike Prediction for Mutually Interacting Consumers

Gains Obtained with the Collective Factors

Collective factors are useful for some of the labels.
   Significant improvements in CTAUC from adding the collective factor, even without clustering (m=1).
   Somewhat better log-likelihood when m>1.

Figure panels for dataset #1 (results for datasets #2 and #3 are similar): (1) CTAUC (0.8 to 1) versus index of the response type (1 to 10), series: Individual, m=1, m=16, m=64; (2) gain in average test log-likelihood (-0.001 to 0.003) versus index of the response type (1 to 10), series: m=1, m=16, m=64.

Figure: Label-dependent performances for collective-factor models.

Page 23: Collective Response Spike Prediction for Mutually Interacting Consumers

Conclusion and Future Directions

Proposed continuous-time response prediction models for marketing decision making:

Piecewise-constant states as a staircase function to approximate hyperbolic discounting.

Collective factors for unobservable interactions:
   Stabilized estimation by frequency aggregation.
   Detect groups following the common trends by clustering regression residuals.

Issues to be handled in the future:

Richer structures of social interactions.
Nonlinear functional approximations.
Non-stationarity of correlation: rare but suddenly exploding word-of-mouth events.

Page 24: Collective Response Spike Prediction for Mutually Interacting Consumers

References I

Duchi, J. and Singer, Y. (2009). Efficient online and batch learning using forward backward splitting. Journal of Machine Learning Research, 10:2899–2934.

Gunawardana, A., Meek, C., and Xu, P. (2011). A model for temporal dependencies in event streams. In Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., and Weinberger, K., editors, Advances in Neural Information Processing Systems 24, pages 1962–1970.

Lozano, A. C., Abe, N., Liu, Y., and Rosset, S. (2009). Grouped graphical Granger modeling methods for temporal causal modeling. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '09, pages 577–586, New York, NY, USA. ACM.

Page 25: Collective Response Spike Prediction for Mutually Interacting Consumers

References II

Rajaram, S., Graepel, T., and Herbrich, R. (2005). Poisson-networks: A model for structured point processes. In Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (AISTATS 2005).

Rubin, D. C. and Wenzel, A. E. (1996). One hundred years of forgetting: A quantitative description of retention. Psychological Review, 103:734–760.

Wixted, J. T. and Ebbesen, E. B. (1997). Genuine power curves in forgetting: A quantitative analysis of individual subject forgetting functions. Memory and Cognition, 25:731–739.
