
Uplift Modeling with ROC: An SRL Case Study

Houssam Nassif, Finn Kuusisto, Elizabeth S. Burnside, and Jude Shavlik

University of Wisconsin, Madison, USA

Thursday, August 29, 2013

Introduction

The Task

What are we trying to accomplish?

- Identify patients with breast cancer who may be good candidates for watchful waiting.
- Use ILP to take advantage of our relational dataset and produce interpretable classifiers.
- Use metrics that are understandable to medical experts.

Uplift Modeling with ROC: An SRL Case Study University of Wisconsin, Madison, USA

In Situ vs. Invasive Breast Cancer

Breast Cancer Stages

There are two main stages of breast cancer.

In Situ
- Earlier stage
- Cancer is localized

Invasive
- Later stage
- Cancer has invaded surrounding tissue


Breast Cancer Age Differences

Breast cancer differs between older and younger patients.

Older
- Cancer tends to progress less aggressively
- Patient has less time remaining for progression

Younger
- Cancer tends to progress more aggressively
- Patient has more time remaining for progression


Overtreatment Problem

Who is treated?

Everyone

Can we reduce costly and risky overtreatment in older patients with in situ cancer?

That is the goal


Watchful Waiting

Who are our most viable candidates for watchful waiting?

- Older
- In situ
- Cancer sufficiently different from that of younger patients


Our Dataset

        Older                   Younger
In Situ    Invasive     In Situ    Invasive
132        401          110        264
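The counts above fix the class balance within each age group. A minimal sketch of that arithmetic follows; the dictionary layout and the function name are illustrative choices, not part of the original slides.

```python
# Counts copied from the dataset slide; the layout is an illustrative choice.
counts = {
    "older": {"in_situ": 132, "invasive": 401},
    "younger": {"in_situ": 110, "invasive": 264},
}


def in_situ_fraction(group):
    """Fraction of a group's cases that are in situ."""
    c = counts[group]
    return c["in_situ"] / (c["in_situ"] + c["invasive"])


for group in counts:
    total = sum(counts[group].values())
    print(f"{group}: {total} cases, {in_situ_fraction(group):.1%} in situ")
```

With these counts, in situ cases make up roughly a quarter of the older group (132 of 533) and a little under 30% of the younger group (110 of 374).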


Learning Method

Uplift Modeling

A predictive modeling technique that attempts to specifically characterize a particular subgroup of a population.


Lift and Uplift

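[Figure, two panels. Lift: lift curves for the target group (Lift_t) and the control group (Lift_c); x-axis: Examples Labeled Positive (0.0 to 1.0); y-axis: True Positives (0 to 1000). Uplift: the resulting uplift curve; x-axis: Examples Labeled Positive (0.0 to 1.0); y-axis: Uplift (0 to 250).]

$\mathrm{Uplift} = \mathrm{Lift}_t - \mathrm{Lift}_c$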
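The relation between lift and uplift can be made concrete with a small sketch. It assumes that lift at a given fraction is the number of true positives among the top-scoring fraction of examples, which matches the axes of the lift plot, and that the subscripts t and c denote the target and control groups. The function names and the toy scores and labels are invented for illustration.

```python
def lift(scores, labels, fraction):
    """True positives among the top `fraction` of examples, ranked by score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = round(fraction * len(ranked))  # number of examples labeled positive
    return sum(label for _, label in ranked[:k])


def uplift(target, control, fraction):
    """Lift on the target group minus lift on the control group."""
    return lift(*target, fraction) - lift(*control, fraction)


# Toy data, invented for illustration: (scores, labels) for each group.
target = ([0.9, 0.8, 0.7, 0.4, 0.2], [1, 1, 0, 1, 0])
control = ([0.9, 0.6, 0.5, 0.3, 0.1], [0, 1, 0, 0, 1])

for fraction in (0.2, 0.4, 0.6, 0.8, 1.0):
    print(fraction, uplift(target, control, fraction))
```

The sketch subtracts raw counts, so it implicitly assumes the two groups are of comparable size; with very unequal groups some normalization would be needed before the difference is meaningful.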


Understandable Metrics

What’s the problem?

Lift isn’t a common metric

Can we achieve the same characterization using a different metric?

That is the goal


Differential ILP

How do we get an ILP algorithm to consider metrics like lift and ROC?

Start with Score as You Use (SAYU)

Now how do we make it differential?

Train two classifiers instead of one


SAYL Algorithm

SAYL

Initialize naïve classifiers and theory
while Stop criteria not met do
    Select seed example
    Construct bottom clause
    while Clause space not exhausted do
        Select new clause
        Train classifiers with theory and new clause
        if New clause improves ROC difference then
            Add new clause to theory
            break
        end if
    end while
end while
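The loop above can be sketched in Python under strong simplifications. The seed-example and bottom-clause steps are replaced by a fixed list of candidate clauses, the two trained classifiers are replaced by a stand-in that counts how many clauses fire, and the stop criterion is "a full pass adds no clause". The clause names and toy data are invented, and none of this is the authors' implementation.

```python
def auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def score_examples(theory, examples):
    """Stand-in classifier (assumption): number of theory clauses that fire."""
    return [sum(clause(x) for clause in theory) for x in examples]


def roc_difference(theory, target, control):
    """AUC on the target group minus AUC on the control group."""
    (t_x, t_y), (c_x, c_y) = target, control
    return (auc(score_examples(theory, t_x), t_y)
            - auc(score_examples(theory, c_x), c_y))


def greedy_search(candidates, target, control):
    theory = []
    best = roc_difference(theory, target, control)
    improved = True
    while improved:  # stop criterion (assumption): a full pass adds no clause
        improved = False
        for clause in candidates:
            if clause in theory:
                continue
            trial = roc_difference(theory + [clause], target, control)
            if trial > best:  # the new clause improves the ROC difference
                theory.append(clause)
                best = trial
                improved = True
                break  # return to the outer loop, as in the pseudocode
    return theory, best


# Hypothetical clauses and toy data, invented for illustration.
def low_density(x):
    return x["low_density"]


def prior_biopsy(x):
    return x["prior_biopsy"]


def make(rows):
    return [{"prior_biopsy": a, "low_density": b} for a, b in rows]


target = (make([(1, 0), (1, 1), (0, 1), (0, 0)]), [1, 1, 0, 0])
control = (make([(1, 1), (0, 1), (1, 0), (0, 0)]), [0, 1, 1, 0])

theory, best = greedy_search([low_density, prior_biopsy], target, control)
print([clause.__name__ for clause in theory], best)
```

On this toy data the search keeps only `prior_biopsy`: it separates the classes in the target group but not in the control group, so it raises the ROC difference, while adding `low_density` afterwards would lower it.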


Evaluation

SAYL-ROC Performance

[Figure: uplift curves for SAYL-ROC, SAYL, DPS, MF, and Baseline; x-axis: Fraction of total mammograms (0 to 1); y-axis: Uplift (number of positives, 0 to 100).]


Conclusions and Future Work


Conclusions

- No significant difference between SAYL and SAYL-ROC
- SAYL-ROC training may be easier to understand outside of marketing
- SAYL-ROC tends to construct much larger theories
- SAYL-ROC theories may be more difficult to interpret

Future Work

- Experiment with different class skews
- Experiment with different domains


Questions?

Appendix

Learned Rules

Some example rules.

1. Patient had prior in situ biopsy, BI-RADS score of prior biopsy was 1

2. Patient has low breast density, principal finding is calcification or single dilated duct


SAYL TAN Models

TAN model learned on older population

[Figure: network diagram with the following nodes]
- breast category
- combined BI-RADS increased up to 3 points over previous mammogram
- had previous in situ biopsy at same location
- breast BI-RADS score = 4
- no family history of cancer, and no prior surgery
- breast has mass size ≤ 13 mm

TAN model learned on younger population

[Figure: network diagram with the following nodes]
- breast category
- combined BI-RADS increased up to 3 points over previous mammogram
- had previous in situ biopsy at same location
- breast BI-RADS score = 4
- no family history of cancer, and no prior surgery
- breast has mass size ≤ 13 mm


Marketing Customer Groups

Persuadables: Customers who will respond only when targeted.

Sure Things: Customers who will respond even when not targeted.

Lost Causes: Customers who will not respond, regardless of whether they were targeted or not.

Sleeping Dogs: Customers who will not respond as a result of being targeted.


Marketing Ideal Ranking

[Figure: ideal ranking along an axis of increasing probability of response from targeting: Sleeping Dogs, then Sure Things and Lost Causes, then Persuadables.]
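The ideal ranking follows from the definitions of the four customer groups. One way to see this is to encode each group as a pair of potential outcomes (responds if targeted, responds if not targeted) and sort by the difference. This encoding and the names `GROUPS` and `targeting_effect` are assumptions made for illustration.

```python
# Each group as two potential outcomes:
# (responds if targeted, responds if not targeted). Illustrative encoding.
GROUPS = {
    "Persuadables": (True, False),
    "Sure Things": (True, True),
    "Lost Causes": (False, False),
    "Sleeping Dogs": (False, True),
}


def targeting_effect(group):
    """Change in response caused by targeting: +1, 0, or -1."""
    if_targeted, if_not_targeted = GROUPS[group]
    return int(if_targeted) - int(if_not_targeted)


# Rank from most to least benefit of targeting.
ranking = sorted(GROUPS, key=targeting_effect, reverse=True)
for group in ranking:
    print(f"{group}: {targeting_effect(group):+d}")
```

Persuadables come first (+1), Sure Things and Lost Causes are tied in the middle (0), and Sleeping Dogs come last (-1).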


Marketing Dataset

              Target                            Control
Response         No Response        Response         No Response
Persuadables,    Sleeping Dogs,     Sleeping Dogs,   Persuadables,
Sure Things      Lost Causes        Sure Things      Lost Causes
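The table above can be derived mechanically from the group definitions: a customer is observed in only one arm, so each observed cell mixes two latent groups. The sketch below assumes the same potential-outcome encoding of the groups; `GROUPS` and `observed_cell` are illustrative names.

```python
# (responds if targeted, responds if not targeted). Illustrative encoding.
GROUPS = {
    "Persuadables": (True, False),
    "Sure Things": (True, True),
    "Lost Causes": (False, False),
    "Sleeping Dogs": (False, True),
}


def observed_cell(group, targeted):
    """Arm and outcome actually observed for a member of `group`."""
    if_targeted, if_not_targeted = GROUPS[group]
    responded = if_targeted if targeted else if_not_targeted
    arm = "Target" if targeted else "Control"
    return arm, "Response" if responded else "No Response"


# Collect the latent groups that can appear in each observed cell.
cells = {}
for group in GROUPS:
    for targeted in (True, False):
        cells.setdefault(observed_cell(group, targeted), []).append(group)

for cell, members in sorted(cells.items()):
    print(cell, members)
```

Because every cell contains two groups, a single observation never identifies a customer's group, which is why uplift has to be estimated by contrasting the target and control populations.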


In Situ vs. Invasive Dataset

                 Older                                  Younger
In Situ                Invasive         In Situ               Invasive
Indolent In Situ,      Always Excise    Aggressive In Situ    Always Excise
Aggressive In Situ
