
Page 1:

My Five Predictive Analytics Pet Peeves
Dean Abbott, Abbott Analytics, Inc.
Predictive Analytics World, San Francisco, CA (#pawcon)
April 16, 2013

Email: [email protected]
Blog: http://abbottanalytics.blogspot.com
Twitter: @deanabb

© Abbott Analytics, Inc. 2001-2013 1

Page 2:

Topics

•  Why Pet Peeves?
•  A call for humility for Predictive Modelers

•  The Five Pet Peeves
   1.  Machine Learning Skills > Domain Expertise
   2.  Just Build the Most Accurate Model!
   3.  Significance?…What do you mean by Significance?
   4.  My Algorithm is better than Your Algorithm
   5.  My classifier calls everything 0…time to resample!

© Abbott Analytics, Inc. 2001-2013 2

Page 3:

Peeve 1: Which is Better, Machine Learning Expertise or Domain Expertise?

•  Question: who is more important in the process of building predictive models:
   •  The Data Scientist / Predictive Modeler / Data Miner?
   •  The Domain Expert / Business Stakeholder?

© Abbott Analytics, Inc. 2001-2013 3

Photo from http://despair.com/superioritee.html

Page 4:

Which is Better: 2012 Strata Conference Debate?

From Strata Conference: http://radar.oreilly.com/2012/03/machine-learning-expertise-google-analytics.html

© Abbott Analytics, Inc. 2001-2013 4

“I think you can get pretty far with some common sense, maybe Google-ing the basic information you need to know about a domain, and a lot of statistical intuition.”

Page 5:

Formula for Success?

© Abbott Analytics, Inc. 2001-2013 5

Page 6:

Conclusion: Frame the Problem First

•  Mike Driscoll, moderator of the Strata debate:
   •  “could you currently prepare your data for a Kaggle competition? If so, then hire a machine learner. If not, hire a data scientist who has the domain expertise and the data hacking skills to get you there.”
      – http://medriscoll.com/post/18784448854/the-data-science-debate-domain-expertise-or-machine

•  But even this may not work, which brings me to the second pet peeve…

© Abbott Analytics, Inc. 2001-2013 6

Page 7:

Peeve 2: Just Build Accurate Models

•  The Problems with Model Accuracy:

1.  There’s More to Success than “Accuracy”

2.  Which Accuracy?

© Abbott Analytics, Inc. 2001-2013 7
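The deck answers “Which Accuracy?” with the examples that follow; as a hedged, generic illustration (mine, not from the talk), the same model can look quite different under classification accuracy, AUC, and top-decile response rate. A minimal Python sketch, assuming scikit-learn and a synthetic 7%-responder dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced data (illustrative only)
X, y = make_classification(n_samples=20_000, n_features=20,
                           weights=[0.93, 0.07], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

prob = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

print("classification accuracy @0.5:", accuracy_score(y_te, prob >= 0.5))
print("AUC (threshold-free ranking):", roc_auc_score(y_te, prob))
top_decile = np.argsort(prob)[::-1][: len(prob) // 10]   # top 10% by score
print("top-decile response rate:", y_te[top_decile].mean())
```

Which of these numbers matters depends entirely on how the model will be used, which is the point of the peeve.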

Page 8:

The Winner is… Best Accuracy

© Abbott Analytics, Inc. 2001-2013 8

http://www.netflixprize.com/leaderboard

Page 9:

Why Model Accuracy is Not Enough: Netflix Prize

http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html

© Abbott Analytics, Inc. 2001-2013 9

Page 10:

Why Data Science is Not Enough: Netflix Prize

http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html

There’s more to a solution than accuracy—you have to be able to use it!

© Abbott Analytics, Inc. 2001-2013 10

Page 11:

Peeve 3: The Best Model Wins

•  We select the “winning model”, but is there a significant difference in model performance?

© Abbott Analytics, Inc. 2001-2013 12

Page 12:

KDD Cup 98 Results

© Abbott Analytics, Inc. 2001-2013 13

Calculator from http://www.answersresearch.com/means.php

Page 13:

Example: Statistical Significance without Practical Significance

Measure                              Control        Campaign (based on model)
Number mailed                        5,000,000      4,000,000
Response rate                        1%             1.011%
Outside margin of error?                            yes
i.e., statistically significant?                    yes
Expected responders                  50,000         40,000
Actual responders                    50,000         40,440
Difference                           0              440
Revenue per responder                               $100
Total revenue expected                              $4,000,000
Total revenue actual                                $4,044,000
Difference in revenue                               $44,000

Significance based on z=2 (95.45% confidence)

•  Cost per contact: negligible (email)
•  Cost for analysts to build model: $80,000

(A quick arithmetic check of these figures appears after this slide.)

© Abbott Analytics, Inc. 2001-2013 14
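A hedged reconstruction of the slide's arithmetic (my own, assuming a one-sample proportion z-test of the campaign response rate against the 1% control baseline), showing both halves of the argument:

```python
import math

# Assumed test: campaign response rate vs. the 1% baseline (not necessarily
# the exact calculator the slide used)
n_campaign = 4_000_000
p_baseline = 0.01        # control response rate
p_campaign = 0.01011     # observed campaign (model-targeted) response rate

se = math.sqrt(p_baseline * (1 - p_baseline) / n_campaign)
z = (p_campaign - p_baseline) / se
print(f"z = {z:.2f}")    # ~2.2, beyond the z = 2 (95.45%) cutoff on the slide

# Practical significance: incremental responders and revenue vs. model cost
extra_responders = (p_campaign - p_baseline) * n_campaign   # ~440
extra_revenue = extra_responders * 100                      # $100 per responder
print(f"incremental revenue ~ ${extra_revenue:,.0f} vs. $80,000 to build the model")
```

Statistically significant, yes; practically significant, no: the lift does not cover the cost of building the model.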

Page 14:

Peeve 4: My Algorithm is Better than Your Algorithm

© Abbott Analytics, Inc. 2001-2013 15

From 2011 Rexer Analytics Data Mining Survey

http://www.rexeranalytics.com/Data-Miner-Survey-Results-2011.html

Page 15:

Every Algorithm Has its Day

© Abbott Analytics, Inc. 2001-2013 16

Elder, J. F., IV, and Lee, S. S. (1997), “Bundling Heterogeneous Classifiers with Advisor Perceptrons,” Technical Report, University of Idaho, October, 14.

Page 16:

PAKDD Cup 2007 Results: Look at all them Algorithms!

•  18 different algorithms were used in the top 20 solutions.

Modeling Technique | Implementation | Location | Type | AUC-ROC (Trapezoidal Rule) | AUC-ROC Rank | Top Decile Response Rate | Top Decile Rank
TreeNet + Logistic Regression | Salford Systems | Mainland China | Practitioner | 70.01% | 1 | 13.00% | 7
Probit Regression | SAS | USA | Practitioner | 69.99% | 2 | 13.13% | 6
MLP + n-Tuple Classifier | | Brazil | Practitioner | 69.62% | 3 | 13.88% | 1
TreeNet | Salford Systems | USA | Practitioner | 69.61% | 4 | 13.25% | 4
TreeNet | Salford Systems | Mainland China | Practitioner | 69.42% | 5 | 13.50% | 2
Ridge Regression | Rank | Belgium | Practitioner | 69.28% | 6 | 12.88% | 9
2-Layer Linear Regression | | USA | Practitioner | 69.14% | 7 | 12.88% | 9
Log. Regr. + Decision Stump + AdaBoost + VFI | | Mainland China | Academia | 69.10% | 8 | 13.25% | 4
Logistic Average of Single Decision Functions | | Australia | Practitioner | 68.85% | 9 | 12.13% | 17
Logistic Regression | Weka | Singapore | Academia | 68.69% | 10 | 12.38% | 16
Logistic Regression | | Mainland China | Practitioner | 68.58% | 11 | 12.88% | 9
Decision Tree + Neural Network + Log. Regression | | Singapore | | 68.54% | 12 | 13.00% | 7
Scorecard Linear Additive Model | Xeno | USA | Practitioner | 68.28% | 13 | 11.75% | 20
Random Forest | Weka | USA | | 68.04% | 14 | 12.50% | 14
Expanding Regression Tree + RankBoost + Bagging | Weka | Mainland China | Academia | 68.02% | 15 | 12.50% | 14
Logistic Regression | SAS + Salford | India | Practitioner | 67.58% | 16 | 12.00% | 19
J48 + BayesNet | Weka | Mainland China | Academia | 67.56% | 17 | 11.63% | 21
Neural Network + General Additive Model | Tiberius | USA | Practitioner | 67.54% | 18 | 11.63% | 21
Decision Tree + Neural Network | | Mainland China | Academia | 67.50% | 19 | 12.88% | 9
Decision Tree + Neural Network + Log. Regression | SAS | USA | Academia | 66.71% | 20 | 13.50% | 2

© Abbott Analytics, Inc. 2001-2013 17

http://lamda.nju.edu.cn/conf/pakdd07/dmc07/results.htm
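“Every Algorithm Has its Day” can be seen in miniature with a quick cross-validation experiment (a sketch of mine, assuming scikit-learn on synthetic data, not the PAKDD Cup data): several very different algorithms often land within a whisker of one another on a single dataset, much as the top-20 PAKDD entries bunch within a few AUC points.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced data (illustrative only)
X, y = make_classification(n_samples=5_000, n_features=20,
                           weights=[0.93, 0.07], random_state=7)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree":       DecisionTreeClassifier(max_depth=5, random_state=7),
    "random forest":       RandomForestClassifier(n_estimators=200, random_state=7),
    "gradient boosting":   GradientBoostingClassifier(random_state=7),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name:20s} AUC = {auc.mean():.3f} +/- {auc.std():.3f}")
```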

Page 17:

Peeve 5: You Must Stratify Data to Balance the Target Class

•  For example, 93% non-responders (N), 7% responders (R)
•  What’s the Problem? (The justification for resampling)
   •  “Sample is biased toward responders”
   •  “Models will learn non-responders better”
   •  “Most algorithms will generate models that say ‘call everything a non-responder’ and get 93% correct classification!” (I used to say this too)
•  Most common solution:
   •  Stratify the sample to get 50%/50% (some will argue that one only needs 20-30% responders)

© Abbott Analytics, Inc. 2001-2013 18

Page 18:

Neural Network Results on Same Data

© Abbott Analytics, Inc. 2001-2013 19

Distribution of Target

NOTE: all models built using JMP 10, SAS Institute, Inc.

Page 19:

Sample Decision Tree Built on Imbalanced Population

© Abbott Analytics, Inc. 2001-2013 20

Distribution of Target

[ROC curve for the tree’s predictions of the target variable: Sensitivity vs. 1-Specificity]

But… the ROC curve looks like this.

Why do we get an ROC curve that looks OK, but the confusion matrix says “everything is N (No)”?

[JMP decision tree diagram: 5,388 records at the root; splits on AVG_DON, REC_DON_AMT, RFA_2, MAX_DON_DT, CARDPM12, MAX_DON_AMT, and CARDGIFT_LIFE; each node reports Count, G^2, and LogWorth.]

Page 20:

So What Happened?

•  Note: no algorithm predicts decisions (N or R); they all produce probabilities/likelihoods/confidences

•  Every data mining tool creates decisions (and, by extension, forms confusion matrices) by thresholding the predicted probability at 0.5 (i.e., taking equal likelihood as the baseline)

•  When the imbalance is large, algorithms will not produce probabilities/likelihoods > 0.5…a score that large is far too unlikely for an algorithm to be “that sure” (see the sketch below)

© Abbott Analytics, Inc. 2001-2013 21
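A minimal sketch of this effect (my own illustration, assuming scikit-learn and a synthetic 93%/7% dataset, not the deck’s donation data): ranking quality (AUC) can look fine while the predicted probabilities typically stay below 0.5, so the default threshold calls everything N.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Synthetic 93% N / 7% R data (illustrative only)
X, y = make_classification(n_samples=10_000, n_features=20, class_sep=0.5,
                           weights=[0.93, 0.07], random_state=1)
prob = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]

print("AUC:", roc_auc_score(y, prob))             # ranking can look fine...
print("max predicted probability:", prob.max())   # ...yet scores tend to sit well below 0.5
print("called R at threshold 0.5:", int((prob >= 0.5).sum()))
print("called R at the base-rate threshold:", int((prob >= y.mean()).sum()))
```

Moving the threshold down to the base rate, rather than resampling, is usually enough to get a sensible confusion matrix.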

Page 21:

What the Predictions Look Like

© Abbott Analytics, Inc. 2001-2013 22

Page 22:

Confusion Matrices For the Decision Tree: Before and After

Decision Tree: Threshold at 0.5 (before)

Response_STR    N       R       Total
N               5,002   0       5,002
R               386     0       386
Total           5,388   0       5,388

Decision Tree: Threshold at 0.071 (after)

Response_STR    N       R       Total
N               2,798   2,204   5,002
R               45      341     386
Total           2,843   2,545   5,388

© Abbott Analytics, Inc. 2001-2013 24

(The arithmetic behind the 0.071 threshold is sketched below.)
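A quick arithmetic check (my own, using only the numbers on this slide): 0.071 is essentially the responder base rate, and moving the threshold there recovers most responders at the cost of contacting more non-responders.

```python
# Numbers taken from the confusion matrices above
responders, total = 386, 5_388
print(f"base rate = {responders / total:.3f}")   # ~0.072, i.e. the 0.071 threshold

tp, fn = 341, 45        # responders correctly flagged / missed at threshold 0.071
tn, fp = 2_798, 2_204   # non-responders correctly excluded / flagged anyway
print(f"sensitivity = {tp / (tp + fn):.2f}")     # ~0.88
print(f"specificity = {tn / (tn + fp):.2f}")     # ~0.56
```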

Page 23:

Conclusions

•  The Rant is Done!
•  The Five Pet Peeves
   1.  Machine Learning Skills > Domain Expertise
       •  Be humble; we need both data scientists and domain experts!
   2.  Just Build the Most Accurate Model!
       •  Select the model that addresses your metric
   3.  Significance?…What do you mean by Significance?
       •  Don’t get hung up on “best” when many models will do well
       •  Learn from the differences in patterns found by these models
   4.  My Algorithm is better than Your Algorithm
       •  Don’t stress about the algorithm; learn to use a few very well
   5.  My classifier calls everything 0…time to resample!
       •  Don’t throw away 0s needlessly; only do it when there are enough of them that you won’t miss them.

© Abbott Analytics, Inc. 2001-2013 25