mining the network value of customers

Mining the Network Value of Customers Zhenwei He & Cen Zhe Qiao School of Informatics University of Edinburgh


Post on 23-Feb-2016


Page 1: Mining the Network Value of Customers

Mining the Network Value of Customers

Zhenwei He & Cen Zhe Qiao

School of Informatics

University of Edinburgh

Page 2: Mining the Network Value of Customers

Outline

• Introduction
• Modeling Markets as a Markov Random Field
• Mining from a Collaborative Filtering System (CFS)
• Example: the EachMovie collaborative filtering database
• Future work
• Conclusion

Page 3: Mining the Network Value of Customers

Introduction

• Mass marketing
• Direct marketing: independence assumption (each customer decides on their own)
• Viral marketing: customers' decisions are strongly dependent on one another
• Data mining: plays a key role

General framework:
• Optimize the choice of which customers to market to
• Estimate what customer acquisition cost is justified for each

Page 4: Mining the Network Value of Customers

How to do that?

• Modeling markets as Social Network

• Mining the network from Collaborative Filtering Databases

Page 5: Mining the Network Value of Customers

Modeling Markets as Social Network

• Some mathematical notation:
  - $n$: the number of customers
  - $X_i \in \{0, 1\}$: $X_i = 1$ if customer $i$ buys the product ($X_i$ also denotes the $i$th customer)
  - $N_i = \{X_{i,1}, \ldots, X_{i,n_i}\}$: the set of neighbors of $X_i$
  - $X^k$ ($X^u$): the customers whose value is known (unknown)
  - $N_i^u = N_i \cap X^u$: the set of unknown neighbors of $X_i$
  - $Y = \{Y_1, \ldots, Y_m\}$: the set of attributes of the product
  - $M_i$: the marketing action that is taken for customer $i$
  - $C$: the cost of marketing to a customer

Page 6: Mining the Network Value of Customers

Modeling Markets as Social Network

$r_0$ - the revenue from selling the product to a customer if NO marketing action is performed

$r_1$ - the revenue from selling the product to a customer if a marketing action is performed

$f_i^1(M)$ - the result of setting $M_i$ to 1 and leaving the rest of $M$ unchanged

$f_i^0(M)$ - similarly, the result of setting $M_i$ to 0

where $M = \{M_1, \ldots, M_n\}$

Page 7: Mining the Network Value of Customers

Modeling Markets as Social Network

• The customer's network value = {the customer's TOTAL value} - {the customer's INTRINSIC value}

• The total value of customer $i$ is measured in terms of the expected lift in profit $ELP(X^k, Y, M)$, as

$$ELP(X^k, Y, f_i^1(M)) - ELP(X^k, Y, f_i^0(M))$$

• The intrinsic value of customer $i$ is

$$ELP_i(X^k, Y, M)$$

Page 8: Mining the Network Value of Customers

Modeling Markets as Social Network

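• The global lift in profit:

$$ELP(X^k, Y, M) = \sum_{i=1}^{n} r_i \, P(X_i = 1 \mid X^k, Y, M) - r_0 \sum_{i=1}^{n} P(X_i = 1 \mid X^k, Y, M_0) - |M| \, C$$

where $r_i = r_1$ if $M_i = 1$, $r_i = r_0$ otherwise, $|M|$ is the number of 1's in $M$, and $M_0$ is the assignment with no marketing ($M_i = 0$ for all $i$).

• The expected lift in profit from marketing to customer $i$ alone:

$$ELP_i(X^k, Y, M) = r_1 \, P(X_i = 1 \mid X^k, Y, f_i^1(M)) - r_0 \, P(X_i = 1 \mid X^k, Y, f_i^0(M)) - C$$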
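The lift formulas on this slide, together with the network value defined on the previous one, can be sketched in Python as follows. This is only an illustration: `prob` is a hypothetical callable standing in for $P(X_i = 1 \mid X^k, Y, M)$, and the helper names are not from the slides.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: prob(i, M) returns P(X_i = 1 | X^k, Y, M).
ProbFn = Callable[[int, Sequence[int]], float]


def with_action(M: Sequence[int], i: int, value: int) -> List[int]:
    """f_i^value(M): a copy of M with M_i set to `value`, the rest unchanged."""
    out = list(M)
    out[i] = value
    return out


def global_elp(prob: ProbFn, M: Sequence[int],
               r0: float, r1: float, c: float) -> float:
    """Global lift in profit ELP(X^k, Y, M), relative to no marketing (M_0)."""
    n = len(M)
    m0 = [0] * n
    revenue = sum((r1 if M[i] == 1 else r0) * prob(i, M) for i in range(n))
    baseline = r0 * sum(prob(i, m0) for i in range(n))
    return revenue - baseline - sum(M) * c


def intrinsic_value(prob: ProbFn, M: Sequence[int], i: int,
                    r0: float, r1: float, c: float) -> float:
    """ELP_i(X^k, Y, M): lift from marketing to customer i alone."""
    return (r1 * prob(i, with_action(M, i, 1))
            - r0 * prob(i, with_action(M, i, 0))
            - c)


def network_value(prob: ProbFn, M: Sequence[int], i: int,
                  r0: float, r1: float, c: float) -> float:
    """Total value of customer i minus the intrinsic value of customer i."""
    total = (global_elp(prob, with_action(M, i, 1), r0, r1, c)
             - global_elp(prob, with_action(M, i, 0), r0, r1, c))
    return total - intrinsic_value(prob, M, i, r0, r1, c)
```

With a toy `prob` in which marketing to a customer also raises the purchase probability of that customer's neighbors, `network_value` comes out positive for well-connected customers even when their intrinsic value is small.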

Page 9: Mining the Network Value of Customers

Modeling Markets as Social Network

• Our goal: find the assignment of values to $M$ that maximizes $ELP$

• Problem: this requires trying all possible combinations of assignments!

• Solution: approximate procedures
  - Single-pass method
  - Greedy search
  - Hill-climbing search
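The slide only names the three approximate procedures. The sketch below shows one common way to formulate them, under the assumption that `elp` is a hypothetical callable returning $ELP(X^k, Y, M)$ for a candidate assignment `M`; the exact stopping rules used by the authors are not given in the transcript.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: elp(M) returns the global lift in profit for M.
ElpFn = Callable[[Sequence[int]], float]


def single_pass(elp: ElpFn, n: int) -> List[int]:
    """Market to i if doing so alone (starting from M_0) has positive lift."""
    result = []
    for i in range(n):
        trial = [0] * n
        trial[i] = 1
        result.append(1 if elp(trial) > 0 else 0)
    return result


def greedy_search(elp: ElpFn, n: int) -> List[int]:
    """Scan customers repeatedly, switching M_i on whenever that improves ELP."""
    M = [0] * n
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if M[i] == 1:
                continue
            trial = M.copy()
            trial[i] = 1
            if elp(trial) > elp(M):
                M = trial
                changed = True
    return M


def hill_climbing(elp: ElpFn, n: int) -> List[int]:
    """At each step switch on the single M_i that improves ELP the most."""
    M = [0] * n
    current = elp(M)
    while True:
        best_i, best_value = None, current
        for i in range(n):
            if M[i] == 1:
                continue
            trial = M.copy()
            trial[i] = 1
            value = elp(trial)
            if value > best_value:
                best_i, best_value = i, value
        if best_i is None:
            return M
        M[best_i] = 1
        current = best_value
```

Single pass evaluates each customer once; greedy search and hill climbing re-evaluate as the assignment grows, so they can pick up customers who only pay off once their neighbors are also marketed to, at a higher computational cost.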

Page 10: Mining the Network Value of Customers

Modeling Markets as Social Network

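• There may be another problem.
• How do we compute $P(X_i \mid X^k, Y, M)$?

$$
\begin{aligned}
P(X_i \mid X^k, Y, M) &= \sum_{C(N_i^u)} P(X_i, N_i^u \mid X^k, Y, M) \\
&= \sum_{C(N_i^u)} P(X_i \mid N_i^u, X^k, Y, M) \, P(N_i^u \mid X^k, Y, M) \\
&= \sum_{C(N_i^u)} P(X_i \mid N_i, Y, M) \prod_{X_j \in N_i^u} P(X_j \mid X^k, Y, M)
\end{aligned}
$$

where $C(N_i^u)$ is the set of all possible configurations of the unknown neighbors of $X_i$. The last step uses the Markov property (given all of its neighbors $N_i$, $X_i$ is independent of the other customers) together with the approximation below.

• L. Pelkowitz (1990), A continuous relaxation labeling algorithm for Markov random fields

• $P(N_i^u \mid X^k, Y, M)$ can be approximated by its maximum entropy estimate given the marginals $P(X_j \mid X^k, Y, M)$ for $X_j \in N_i^u$, which is the product of those marginals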

Page 11: Mining the Network Value of Customers

Modeling Markets as Social Network

• The sum over $C(N_i^u)$ expresses the probabilities $P(X_i \mid X^k, Y, M)$ as a function of themselves
• It can be applied iteratively to find them
• Relaxation labeling: guaranteed to converge to locally consistent values as long as the initial assignment is sufficiently close to them
• Initialization: the network-less probabilities $P(X_i \mid Y, M)$

$$P(X_i \mid Y, M_i) = \frac{P(X_i) \, P(M_i \mid X_i) \prod_{k=1}^{m} P(Y_k \mid X_i)}{P(Y, M_i)}$$

• Problem: the sum is exponential in the size of $N_i^u$

• Solution: Gibbs sampling / k-shortest-path algorithm
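The iteration described on this slide can be illustrated with a small Python sketch. It assumes a hypothetical callable `cond(i, values)` that returns $P(X_i = 1 \mid N_i, Y, M)$ for a full assignment of the neighbors of $i$, and it enumerates every configuration of the unknown neighbors, so it is only practical for small neighborhoods (the exponential cost noted above). It does not implement the Gibbs sampling or k-shortest-path alternatives.

```python
from itertools import product
from typing import Callable, Dict, List


def relaxation_labeling(
    neighbors: Dict[int, List[int]],
    known: Dict[int, int],
    cond: Callable[[int, Dict[int, int]], float],
    init: Dict[int, float],
    tol: float = 1e-6,
    max_iter: int = 100,
) -> Dict[int, float]:
    """Iterate P(X_i = 1 | X^k, Y, M) for the unknown customers.

    neighbors: customer -> list of neighbor ids (N_i)
    known:     customer -> observed value 0/1 (X^k)
    cond:      hypothetical model of P(X_i = 1 | N_i, Y, M)
    init:      starting probabilities for the unknown customers,
               e.g. the network-less estimates
    """
    p = dict(init)
    for _ in range(max_iter):
        new_p = {}
        for i in p:
            unknown = [j for j in neighbors[i] if j not in known]
            fixed = {j: known[j] for j in neighbors[i] if j in known}
            total = 0.0
            # Sum over all configurations C(N_i^u) of the unknown neighbors.
            for config in product((0, 1), repeat=len(unknown)):
                weight = 1.0
                for j, x in zip(unknown, config):
                    # Product of marginals (maximum entropy approximation).
                    weight *= p[j] if x == 1 else 1.0 - p[j]
                values = dict(fixed)
                values.update(zip(unknown, config))
                total += cond(i, values) * weight
            new_p[i] = total
        delta = max((abs(new_p[i] - p[i]) for i in p), default=0.0)
        p = new_p
        if delta < tol:
            break
    return p
```

All probabilities are updated from the previous sweep's values; updating in place would be an equally reasonable choice for a sketch like this.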

Page 12: Mining the Network Value of Customers

Modeling Markets as Social Network

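• Recall:

$$P(X_i \mid X^k, Y, M) = \sum_{C(N_i^u)} P(X_i \mid N_i, Y, M) \prod_{X_j \in N_i^u} P(X_j \mid X^k, Y, M)$$

• $P(X_i \mid N_i, Y, M)$ is still unknown!
• From Naïve Bayes:

$$P(X_i \mid N_i, Y, M) = \frac{P(X_i \mid N_i) \, P(M_i \mid X_i)}{P(Y, M_i \mid N_i)} \prod_{k=1}^{m} P(Y_k \mid X_i)$$

where

$$P(Y, M_i \mid N_i) = P(Y, M_i \mid X_i = 1) \, P(X_i = 1 \mid N_i) + P(Y, M_i \mid X_i = 0) \, P(X_i = 0 \mid N_i)$$

• Now $P(X_i \mid N_i, Y, M)$ can be computed from: $P(X_i \mid N_i)$, $P(X_i)$, $P(M_i \mid X_i)$, $P(Y_k \mid X_i)$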
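As a small worked illustration of the Naïve Bayes step, the function below combines $P(X_i \mid N_i)$ with the likelihoods of the observed marketing action and product attributes. The argument layout (pairs indexed by $X_i = 0$ and $X_i = 1$) is an assumption made for this sketch, not something specified on the slide.

```python
from math import prod
from typing import Sequence, Tuple


def posterior_given_neighbors(
    p_x1_given_n: float,
    p_m_given_x: Tuple[float, float],
    p_y_given_x: Sequence[Tuple[float, float]],
) -> float:
    """Return P(X_i = 1 | N_i, Y, M) under the Naive Bayes assumption.

    p_x1_given_n: P(X_i = 1 | N_i)
    p_m_given_x:  (P(M_i = m | X_i = 0), P(M_i = m | X_i = 1)) for the observed m
    p_y_given_x:  one pair (P(Y_k = y_k | X_i = 0), P(Y_k = y_k | X_i = 1))
                  per product attribute k
    """
    # P(Y, M_i | X_i = x) = P(M_i | X_i = x) * prod_k P(Y_k | X_i = x)
    like0 = p_m_given_x[0] * prod(p[0] for p in p_y_given_x)
    like1 = p_m_given_x[1] * prod(p[1] for p in p_y_given_x)
    # P(Y, M_i | N_i), obtained by summing over X_i = 1 and X_i = 0.
    evidence = like1 * p_x1_given_n + like0 * (1.0 - p_x1_given_n)
    return like1 * p_x1_given_n / evidence
```

For example, with $P(X_i = 1 \mid N_i) = 0.5$, an uninformative marketing term, and a single attribute that is four times as likely under $X_i = 1$ as under $X_i = 0$, the posterior is 0.8. The sketch does not guard against a zero `evidence` term.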

Page 13: Mining the Network Value of Customers

Mining the network from Collaborative Filtering Databases

• $P(X_i \mid N_i)$: varies from application to application

• Collaborative filtering system:
  - Users rate a set of items (e.g., amazon.com)
  - These ratings are then used to recommend other items the user might be interested in

• But… how?

• The basic idea (from GroupLens):
  - Predict a user's rating of an item as a weighted average of the ratings given by similar users
  - Then recommend items with high predicted ratings

Page 14: Mining the Network Value of Customers

Mining the network from Collaborative Filtering Databases

• The Pearson correlation coefficient (PCC):

$$W_{ij} = \frac{\sum_k (R_{ik} - \bar{R}_i)(R_{jk} - \bar{R}_j)}{\sqrt{\sum_k (R_{ik} - \bar{R}_i)^2 \, \sum_k (R_{jk} - \bar{R}_j)^2}}$$

where $R_{ik}$ is user $i$'s rating of item $k$, $\bar{R}_i$ is the mean of user $i$'s ratings (likewise for $j$), and the summations and means are computed over the items $k$ that both $i$ and $j$ have rated.

• Given an item $k$ that user $i$ has not rated, the rating of $k$ for the user is then predicted as:

$$\hat{R}_{ik} = \bar{R}_i + \rho \sum_{X_j \in N_i} W_{ji} (R_{jk} - \bar{R}_j)$$

where $\rho = 1 / \sum_{X_j \in N_i} |W_{ji}|$ is a normalization factor, and $N_i$ is the set of $n_i$ users most similar to $i$ according to the PCC.
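A minimal Python sketch of the correlation and the rating prediction is given below. The nested-dictionary layout for ratings is an assumption of the sketch, as is the choice to use each user's overall mean rating in the prediction step and to rank neighbors by signed correlation; the slide does not spell these details out.

```python
from math import sqrt
from typing import Dict

Ratings = Dict[str, Dict[str, float]]  # user -> item -> rating (assumed layout)


def pearson(ratings: Ratings, i: str, j: str) -> float:
    """W_ij computed over the items that both i and j have rated."""
    common = ratings[i].keys() & ratings[j].keys()
    if len(common) < 2:
        return 0.0
    mean_i = sum(ratings[i][k] for k in common) / len(common)
    mean_j = sum(ratings[j][k] for k in common) / len(common)
    num = sum((ratings[i][k] - mean_i) * (ratings[j][k] - mean_j) for k in common)
    den = sqrt(sum((ratings[i][k] - mean_i) ** 2 for k in common)
               * sum((ratings[j][k] - mean_j) ** 2 for k in common))
    return num / den if den > 0 else 0.0


def predict(ratings: Ratings, i: str, item: str, n_neighbors: int) -> float:
    """Predicted rating of `item` for user i from the most similar users."""
    mean_i = sum(ratings[i].values()) / len(ratings[i])
    # Candidate neighbors: other users who have rated the item.
    scored = [(pearson(ratings, j, i), j)
              for j in ratings if j != i and item in ratings[j]]
    scored.sort(reverse=True)
    top = scored[:n_neighbors]
    norm = sum(abs(w) for w, _ in top)
    if norm == 0:
        return mean_i
    weighted = 0.0
    for w, j in top:
        mean_j = sum(ratings[j].values()) / len(ratings[j])
        weighted += w * (ratings[j][item] - mean_j)
    return mean_i + weighted / norm  # rho = 1 / sum |W_ji|
```

Users with fewer than two co-rated items get a correlation of 0 here, which simply removes their influence; this cutoff is a simplification and is separate from the correlation penalty mentioned later in the slides.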

Page 15: Mining the Network Value of Customers

Mining the network from Collaborative Filtering Databases

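• Thus we can compute $P(X_i \mid N_i)$:

$$P(X_i \mid N_i) = P(X_i \mid \hat{R}_i(N_i))$$

• Piecewise-linear model:
  - Obtained by dividing $\hat{R}_i$'s range into bins
  - Compute the mean $\hat{R}_i$ and $P(X_i \mid \hat{R}_i)$ for each bin
  - Estimate $P(X_i \mid \hat{R}_i(N_i))$ by interpolating linearly between the two nearest means

• Finally, for the model we need:

$$P(X_i \mid \hat{R}_i), \quad P(X_i), \quad P(M_i \mid X_i), \quad P(Y_k \mid X_i), \quad P(\hat{R}_i \mid Y)$$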
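The piecewise-linear model can be sketched as follows. The sketch assumes equal-width bins and clamps predictions outside the range of bin means to the nearest bin; the slides do not say how the bins were chosen or how the tails were handled, so both are assumptions.

```python
from typing import List, Sequence, Tuple


def fit_bins(r_hat: Sequence[float], bought: Sequence[int],
             n_bins: int) -> List[Tuple[float, float]]:
    """Return (mean predicted rating, purchase frequency) per non-empty bin."""
    lo, hi = min(r_hat), max(r_hat)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values are equal
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for r, x in zip(r_hat, bought):
        b = min(int((r - lo) / width), n_bins - 1)
        sums[b][0] += r
        sums[b][1] += x
        sums[b][2] += 1
    return [(s[0] / s[2], s[1] / s[2]) for s in sums if s[2] > 0]


def estimate(points: Sequence[Tuple[float, float]], r: float) -> float:
    """Estimate P(X_i = 1 | R_hat = r) by linear interpolation between bin means."""
    if r <= points[0][0]:
        return points[0][1]
    if r >= points[-1][0]:
        return points[-1][1]
    for (r_a, p_a), (r_b, p_b) in zip(points, points[1:]):
        if r_a <= r <= r_b:
            t = (r - r_a) / (r_b - r_a)
            return p_a + t * (p_b - p_a)
    return points[-1][1]
```

Fitting on a handful of (predicted rating, purchased) pairs and calling `estimate` at a rating halfway between two bin means returns the average of the two bins' purchase frequencies.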

Page 16: Mining the Network Value of Customers

Example: the ‘EachMovie’ collaborative filtering database

• ‘EachMovie’
  ---word of mouth
  ---Rating
  ---Movie Information
• The Data
• Model Accuracy
• Network Value
• Marketing Experiments

Page 17: Mining the Network Value of Customers

The Model

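• $Y = \{Y_1, Y_2, \ldots, Y_{10}\}$, $P(Y \mid X_i)$
• Pearson correlation coefficient for $W_{ij}$ (with a penalty value of 0.05)
• $P(X_i \mid \hat{R}_i)$, $P(X_i)$, $P(M_i \mid X_i)$, $P(Y_k \mid X_i)$, $P(\hat{R}_i \mid Y)$
• $P(X_i = 1 \mid M_i = 1) = \min\{\alpha \, P(X_i = 1 \mid M_i = 0), 1\}$, where $\alpha > 1$ is a parameter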

Page 18: Mining the Network Value of Customers

Frame of the model

Page 19: Mining the Network Value of Customers

Empirical distribution

Page 20: Mining the Network Value of Customers

The Data

• Training set: all movies before Sep 1 1996
  ---$S_{old}$: before Jan 1996
  ---$S_{recent}$: Jan-Sep 1996
• Test set: movies Sep-Dec 1996
• Inactive people

Page 21: Mining the Network Value of Customers

Model Accuracy

• Set $M = M_0$
• Estimate $P(X_i \mid X^k, Y, M)$
• No ratings from inactive people: $P(X_i \mid Y) = 0$
• Correlation between $P(X_i \mid X^k, Y, M)$ and the actual $X_i$

• Not really satisfactory, as the genre is the only input

Page 22: Mining the Network Value of Customers

Network Value

Page 23: Mining the Network Value of Customers

Weight ranking function

Page 24: Mining the Network Value of Customers

A good customer to market to

• Likely to give a high rating
• Has a strong weight in influencing neighbors
• Has many neighbors who are easily influenced
• High probability of purchasing

Page 25: Mining the Network Value of Customers

Marketing Experiments

• Traditional direct marketing
• Network-based marketing
  ---single pass
  ---greedy search
  ---hill climbing
• Scenarios: Free Movie, Discounted Movie, Advertising

Page 26: Mining the Network Value of Customers

Profits and runtimes obtained using different marketing strategies

Page 27: Mining the Network Value of Customers

Related Work

• Regarding the network
  ---Email logs (Schwartz and Wood)
  ---ReferralWeb
  ---MRF classification of Web pages (Chak)
• Regarding the marketing
  ---Impact on the customers' closest friends (Krackhardt)

Page 28: Mining the Network Value of Customers

Future Work

• Expect larger networks to be mined
• Mining a network from multiple sources of relevant information
• Mining unknown networks
• Towards more detailed node models and multiple types of relations between nodes

Page 29: Mining the Network Value of Customers

Conclusion

• Data mining in viral marketing
• Customers as nodes, with their impact on each other
• Social network mined from a collaborative filtering database
• Optimize marketing decisions