webinar - pattern mining log data - vega (20160426)

Churn Prediction: Understanding your customers and taking action. @datoinc #churnPredictionDato

Upload: turi-inc

Post on 16-Apr-2017


Category:

Technology



TRANSCRIPT

Page 1: Webinar - Pattern Mining Log Data - Vega (20160426)

Churn Prediction: Understanding your customers and taking action.

@datoinc #churnPredictionDato

Page 2: Webinar - Pattern Mining Log Data - Vega (20160426)

Hi! My name is …

Antoine Atallah, Principal Data Scientist

Dato toolkits team, novice powerlifter, Hawks fan.

Page 3: Webinar - Pattern Mining Log Data - Vega (20160426)

Hi! My name is …

Karla Vega, Customer Success Manager

Aerospace engineer, dog trainer, running fan. @vegakp

Page 4: Webinar - Pattern Mining Log Data - Vega (20160426)

About Us!


Page 5: Webinar - Pattern Mining Log Data - Vega (20160426)

[Graphic: … + … = Webinar!]

Questions?
• (Now) We love questions. Feel free to interrupt for questions!
• (Later) Email us: [email protected], [email protected]

Page 6: Webinar - Pattern Mining Log Data - Vega (20160426)

Extracting Insights from Data

Page 7: Webinar - Pattern Mining Log Data - Vega (20160426)

Data Science Workflow

Ingest → Transform → Model → Insight

Page 8: Webinar - Pattern Mining Log Data - Vega (20160426)

Log Journey

Lots of data → Insights → Profits

Page 9: Webinar - Pattern Mining Log Data - Vega (20160426)

Mining Log Data

Page 10: Webinar - Pattern Mining Log Data - Vega (20160426)

Logs are everywhere!


Page 11: Webinar - Pattern Mining Log Data - Vega (20160426)

Different kinds of logs
• Raw logs
  • Each row contains an individual event for a user at a given time
• Aggregated logs
  • Each row contains the interactions for a user over a period of time
  • For instance, user activity over one-month rollups
  • This is the traditional data output of Business Intelligence infrastructures
• User side-data
  • Information about each user (demographics, etc.)
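To make the difference between raw and aggregated logs concrete, here is a minimal sketch that rolls raw events up into one row per user per month. The log rows, user names, and the `monthly_rollup` helper are hypothetical and only illustrate the idea; they are not part of any Dato API.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw log: one row per (user, timestamp, event).
raw_log = [
    ("alice", datetime(2016, 1, 3), "visit"),
    ("alice", datetime(2016, 1, 17), "purchase"),
    ("alice", datetime(2016, 2, 2), "visit"),
    ("bob", datetime(2016, 1, 9), "visit"),
]

def monthly_rollup(rows):
    """Aggregate raw events into one row per (user, year, month)."""
    counts = defaultdict(lambda: defaultdict(int))
    for user, ts, event in rows:
        counts[(user, ts.year, ts.month)][event] += 1
    # Convert nested defaultdicts to plain dicts, sorted by key.
    return {key: dict(events) for key, events in sorted(counts.items())}

aggregated = monthly_rollup(raw_log)
for key, events in aggregated.items():
    print(key, events)
```

The rollup is compact, but it loses the exact timing of each event, which is why raw logs are the richer input for pattern mining.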

Page 12: Webinar - Pattern Mining Log Data - Vega (20160426)

Logs contain usage patterns

[Figure labels: Small Purchase, Large Purchase]

Page 13: Webinar - Pattern Mining Log Data - Vega (20160426)

Different kinds of usage patterns

Kinds of Patterns
• Frequency of visits, purchases, and events
• Quantity of visits and purchases
• Changes in value over time
• Change in time between visits, purchases, events
• Time since last action or visit
• Demographic information (age, gender, …)
• Types of items purchased (seasonality, quality)
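As a rough illustration of a few of these patterns (frequency, time since last action, and change in time between visits), the sketch below computes them for a single user. The `usage_features` helper and its feature names are assumptions made for this example, not the features any particular toolkit produces.

```python
from datetime import datetime

def usage_features(timestamps, now):
    """Compute simple frequency, recency and gap features for one user."""
    ts = sorted(timestamps)
    # Days between consecutive events.
    gaps = [(b - a).days for a, b in zip(ts, ts[1:])]
    return {
        "event_count": len(ts),
        "days_since_last": (now - ts[-1]).days if ts else None,
        "mean_gap_days": sum(gaps) / len(gaps) if gaps else None,
        # Positive value: the latest gap is longer than the first one,
        # i.e. the user's visits are slowing down.
        "gap_change_days": gaps[-1] - gaps[0] if len(gaps) >= 2 else None,
    }

# Hypothetical visit timestamps for one user.
visits = [
    datetime(2016, 3, 1),
    datetime(2016, 3, 3),
    datetime(2016, 3, 10),
    datetime(2016, 3, 24),
]
features = usage_features(visits, now=datetime(2016, 4, 1))
print(features)
```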

Page 14: Webinar - Pattern Mining Log Data - Vega (20160426)

Retaining customers/visitors is important
• The cost of acquiring a new customer is high compared with retaining an existing one
• Gives a pulse on the health of the business
• Can help take preventive actions and act before it’s too late
• Can help create more effective marketing campaigns

Page 15: Webinar - Pattern Mining Log Data - Vega (20160426)

What is Churn Prediction

Page 16: Webinar - Pattern Mining Log Data - Vega (20160426)

What is Churn
• Churn Prediction is predicting a user’s probability of no longer coming back (churn)
• Works by observing past user behavior

Page 17: Webinar - Pattern Mining Log Data - Vega (20160426)

Churn Prediction

[Figure: daily activity logs for Jan 2015 – April 2016, annotated "(Apr 2016)"]

Page 18: Webinar - Pattern Mining Log Data - Vega (20160426)

More Precisely
• Churn Prediction is predicting a user’s probability of no longer coming back (churn)
• Works by observing past user behavior
• We define a time boundary at which we want to predict churn
• Anyone not present in the N days (default is 30) after the boundary is considered to have churned
• The M days (default 60) before the boundary are used to generate features
• Multiple boundaries can be specified to extract more patterns
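The boundary logic described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: `label_and_window` is a hypothetical helper, the defaults mirror the slide (N = 30, M = 60), and treating an event exactly at the boundary as falling "after" it is a choice made here, not something the slide specifies.

```python
from datetime import datetime, timedelta

def label_and_window(events, boundary, churn_days=30, lookback_days=60):
    """Return (feature_events, churned) for one user's event timestamps."""
    lookback_start = boundary - timedelta(days=lookback_days)
    churn_end = boundary + timedelta(days=churn_days)
    # Events in the M days before the boundary feed feature generation.
    feature_events = [t for t in events if lookback_start <= t < boundary]
    # A user with no event in the N days after the boundary has churned.
    active_after = any(boundary <= t < churn_end for t in events)
    return feature_events, not active_after

# Hypothetical user: active in late 2015, then silent until late February.
events = [datetime(2015, 11, 20), datetime(2015, 12, 15), datetime(2016, 2, 20)]
boundary = datetime(2016, 1, 1)

feature_events, churned = label_and_window(events, boundary)
print(len(feature_events), churned)

# With a longer churn period, the February visit counts as "coming back".
_, churned_60 = label_and_window(events, boundary, churn_days=60)
print(churned_60)
```

The same user is labeled differently depending on the churn period, which is why that parameter deserves deliberate thought.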

Page 19: Webinar - Pattern Mining Log Data - Vega (20160426)

Feature and Label Generation

[Figure: daily activity logs for Jan 2015 – April 2016, annotated "(Apr 2016)"]

Page 20: Webinar - Pattern Mining Log Data - Vega (20160426)

How to use Churn Prediction

Page 21: Webinar - Pattern Mining Log Data - Vega (20160426)

Choosing Time Boundaries
• Time Boundaries are moments in the past that are used to observe user behavior and generate labels
• The time before the boundary is used to observe patterns
• The time after the boundary is used to generate labels

Boundaries | Meaning
January 1st 2016 | Will use the patterns from before January 1st 2016 to predict User Churn after January 1st 2016
January 1st 2016, December 1st 2015 | Will use the patterns from before January 1st 2016 to predict User Churn after January 1st 2016; will use the patterns from before December 1st 2015 to predict User Churn after December 1st 2015. This will analyze more patterns and build a richer model
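One way to picture why extra boundaries give a richer model: each boundary yields its own labeled row per user, so two boundaries roughly double the training examples. The sketch below assumes a 30-day churn period and a hypothetical `activity` log; skipping users with no activity before a boundary is also an assumption of this example.

```python
from datetime import datetime, timedelta

# Hypothetical activity log: user -> list of event timestamps.
activity = {
    "alice": [datetime(2015, 11, 10), datetime(2015, 12, 20), datetime(2016, 1, 15)],
    "bob": [datetime(2015, 11, 5), datetime(2015, 11, 25)],
}
boundaries = [datetime(2016, 1, 1), datetime(2015, 12, 1)]
CHURN_DAYS = 30  # assumed churn period

def training_rows(activity, boundaries, churn_days=CHURN_DAYS):
    """Build one (user, boundary, events_before, churned) row per boundary."""
    rows = []
    for boundary in boundaries:
        churn_end = boundary + timedelta(days=churn_days)
        for user, events in activity.items():
            before = [t for t in events if t < boundary]
            if not before:
                continue  # user had not been seen yet at this boundary
            churned = not any(boundary <= t < churn_end for t in events)
            rows.append((user, boundary.date().isoformat(), len(before), churned))
    return rows

rows = training_rows(activity, boundaries)
for row in rows:
    print(row)
```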

Page 22: Webinar - Pattern Mining Log Data - Vega (20160426)

Choosing a Churn Period
• The Churn Period corresponds to how far in the future we want to predict.
• It also means that for training purposes, users who have not been active for this amount of time will be considered to have churned

Churn Period | Predicts
7 Days | Probability for each user to be leaving next week
30 Days | Probability for each user to be leaving next month
3 Months | Probability for each user to be leaving next quarter

Page 23: Webinar - Pattern Mining Log Data - Vega (20160426)

Choosing Lookback Periods
• Lookback Periods are how far in the past we look to extract user behavior patterns (features)
• Multiple lookback periods can be provided to generate richer features

Lookback Periods | Features
3 Days | Will use the 3 days before each Time Boundary to extract usage patterns
30 Days | Will use the 30 days before each Time Boundary to extract usage patterns
7 Days, 1 Month | Will use the week and the month before each Time Boundary to extract usage patterns
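As a small sketch of how several lookback periods produce richer features, the hypothetical `lookback_counts` helper below counts one user's events in each window ending at the boundary. The feature naming and the use of plain event counts are assumptions for illustration only.

```python
from datetime import datetime, timedelta

def lookback_counts(events, boundary, lookback_days=(7, 30)):
    """Count events in each lookback window ending at the boundary."""
    return {
        f"events_last_{d}d": sum(
            boundary - timedelta(days=d) <= t < boundary for t in events
        )
        for d in lookback_days
    }

# Hypothetical event timestamps for one user in December 2015.
events = [
    datetime(2015, 12, 5),
    datetime(2015, 12, 20),
    datetime(2015, 12, 28),
    datetime(2015, 12, 30),
]
counts = lookback_counts(events, datetime(2016, 1, 1))
print(counts)
```

Comparing the short window with the long one hints at whether activity is speeding up or slowing down just before the boundary.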

Page 24: Webinar - Pattern Mining Log Data - Vega (20160426)

Choosing appropriate parameters
• If we want to predict Churn for this quarter, we might want to set:
  • Churn Period to be 3 Months (how far in the future we predict)
  • Lookback Periods to be 2, 4, 8, 16 weeks (how far in the past to extract patterns from)
  • Time Boundaries to be January 1st 2016, January 1st 2015, January 1st 2014
• Notice that we chose the same quarter each year for the Time Boundary
• Choosing past data with the same underlying behavior will provide more accurate predictions

Page 25: Webinar - Pattern Mining Log Data - Vega (20160426)

Choosing appropriate parameters
• If we want to predict Churn for this month, we might want to set:
  • Churn Period to be 1 Month (how far in the future we predict)
  • Lookback Periods to be 7, 14, 30, 60 days (how far in the past to extract patterns from)
  • Time Boundaries to be January 1st 2016, October 1st 2015, September 1st 2015, August 1st 2015
• In this case, we intentionally skipped over November and December 2015 since it is the holiday season and may exhibit very different behavior
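The monthly settings above could be written down as plain Python values, as in the sketch below. The `params` dictionary layout, the `usable_boundaries` check, and the log start and end dates are all assumptions for illustration. The check encodes one sanity rule implied by the earlier slides: each boundary needs enough log before it for the longest lookback and enough log after it for a full churn period.

```python
from datetime import datetime, timedelta

# Parameters from the slide, expressed as plain Python values (1 month ~ 30 days).
params = {
    "churn_period": timedelta(days=30),
    "lookback_periods": [timedelta(days=d) for d in (7, 14, 30, 60)],
    "time_boundaries": [
        datetime(2016, 1, 1),
        datetime(2015, 10, 1),
        datetime(2015, 9, 1),
        datetime(2015, 8, 1),
    ],
}

def usable_boundaries(params, log_start, log_end):
    """Keep boundaries whose longest lookback and churn period fit in the log."""
    longest = max(params["lookback_periods"])
    return [
        b
        for b in params["time_boundaries"]
        if b - longest >= log_start and b + params["churn_period"] <= log_end
    ]

# Hypothetical log coverage: January 2015 through April 2016.
ok = usable_boundaries(params, datetime(2015, 1, 1), datetime(2016, 4, 30))
print(len(ok))

# If the log ended mid-January 2016, the January 1st boundary could not be labeled.
short = usable_boundaries(params, datetime(2015, 1, 1), datetime(2016, 1, 15))
print(len(short))
```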

Page 26: Webinar - Pattern Mining Log Data - Vega (20160426)

Key Takeaways
• Label generation is extremely simplified (choose a Churn Period)
• Feature generation is extremely simplified (choose Lookback Periods and Time Boundaries)
• Choose representative time frames to predict churn in the desired time frame

Page 27: Webinar - Pattern Mining Log Data - Vega (20160426)

Interpreting the Results

Page 28: Webinar - Pattern Mining Log Data - Vega (20160426)

Output of the model
• The Churn Prediction model returns a probability of churn for each provided user

Page 29: Webinar - Pattern Mining Log Data - Vega (20160426)

Using the Probabilities

[Figure: Number of Users (y-axis) vs. Churn Probability (x-axis)]

• High Probability of Churn: Might be hard to rescue these users
• Mid-Probability of Churn: We should try to rescue these users
• Low-Probability of Churn: Send a thank-you note!

Page 30: Webinar - Pattern Mining Log Data - Vega (20160426)

Using the Probabilities
• We can target different users, using their probability of Churn as a guideline
• Different marketing messages can be created based on the probability of Churn
• The highest-probability users are not always the best to target, depending on the cost of the action to take to retain them
• Gives a new dimension on the user base
• Can be used to monitor the health of the user population over time
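To illustrate why the highest-probability users are not always the best targets, here is a minimal sketch that segments users by churn probability and estimates the net gain of a retention action. Every number in it (thresholds, customer value, action cost, and the per-segment chance that the action works) is made up for illustration, and both helper functions are hypothetical.

```python
def segment(prob, low=0.3, high=0.7):
    """Assign a user to a churn-risk segment; thresholds are illustrative."""
    if prob >= high:
        return "high"
    if prob >= low:
        return "mid"
    return "low"

def expected_gain(prob, customer_value, action_cost, save_rate):
    """Expected net gain of a retention action under assumed inputs."""
    return prob * save_rate * customer_value - action_cost

# Hypothetical model output: user -> churn probability.
churn_probs = {"alice": 0.9, "bob": 0.5, "carol": 0.1}

# Assumed chance that the action retains a user, per segment:
# users who are almost gone are assumed to be much harder to win back.
SAVE_RATE = {"high": 0.05, "mid": 0.4, "low": 0.4}

gains = {
    user: expected_gain(p, customer_value=100, action_cost=5,
                        save_rate=SAVE_RATE[segment(p)])
    for user, p in churn_probs.items()
}
best = max(gains, key=gains.get)
print({user: segment(p) for user, p in churn_probs.items()})
print(best)
```

Under these assumed numbers, the mid-probability user is the only one for whom the action pays off, which matches the slide's advice to focus rescue efforts on the middle of the distribution.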

Page 31: Webinar - Pattern Mining Log Data - Vega (20160426)

Demo

Page 32: Webinar - Pattern Mining Log Data - Vega (20160426)

Summary

Log Data Mining ≠ Rocket Science

• Define time parameters to identify patterns and generate labels.

• Extract predictions to gain insights about your user population.

• Take action and help grow your healthy business.

Churn Prediction


Page 33: Webinar - Pattern Mining Log Data - Vega (20160426)

SELECT questions FROM audience
WHERE difficulty == “Easy”

Thanks!