The “Bellwether” Effect and Its Implications to Transfer Learning


Upload: rahul-krishna

Post on 12-Apr-2017


Page 1: The “Bellwether” Effect and Its Implications to Transfer Learning

The “Bellwether” Effect and Its Implications to Transfer Learning

Rahul Krishna ([email protected]), Tim Menzies, and Wei Fu

Page 2: The “Bellwether” Effect and Its Implications to Transfer Learning

Today’s topic: Transfer Learning

[Turhan09] Data from Turkish toasters can predict defects in NASA flight systems

WeTOSM ‘14

Page 3: The “Bellwether” Effect and Its Implications to Transfer Learning

Today’s topic: Simpler Transfer Learning with “Bell…. what?”

Page 5: The “Bellwether” Effect and Its Implications to Transfer Learning

Definitions

Bellwether effect

• If a community builds many software projects
• There exists one ∈ many from which
• quality predictors can be built …
• … and used for all

Bellwether method

• find the one
• use it

Note: vastly simpler than other transfer learning methods [Turhan09, Turhan11, Nam13, etc]

Page 6: The “Bellwether” Effect and Its Implications to Transfer Learning

Outline

● Motivation
● Background
  ○ Evaluating Quality
  ○ Transfer Learning
  ○ The “Bellwether”
● Experimental Setup
  ○ Benchmark Data
  ○ Prediction Model
  ○ Statistical Measures
● Results
● Conclusions

Page 7: The “Bellwether” Effect and Its Implications to Transfer Learning

The “Cold-Start” Problem

Past Projects Prediction Model Upcoming releases

7

Page 8: The “Bellwether” Effect and Its Implications to Transfer Learning

The “Cold-Start” Problem

Past Projects, Prediction Model, Upcoming releases (?)

Page 9: The “Bellwether” Effect and Its Implications to Transfer Learning

Challenges: Variable Datasets

... “New projects are always emerging, and old ones are being rewritten…”

… “the quality, representativeness, and volume of the training data have a major influence on the usefulness and stability of model performance…”

— Rahman et al. [Rah12]

Growing Volume of Projects

Page 12: The “Bellwether” Effect and Its Implications to Transfer Learning

Challenges: Conclusion Instability

• Unstable conclusions are typical in SE [Menzies12]
• Usefulness of some lesson “X” is contradictory

Kitchenham et al. ‘07
• Are data from other organizations as useful as local data?
• Inconclusive
  • 3 cases: Just as good.
  • 4 cases: Worse.

Zimmermann et al. ‘09
• 622 pairs of projects
• Only 4% of pairs were useful

Page 13: The “Bellwether” Effect and Its Implications to Transfer Learning

How to Reduce this Instability?

• Menzies et al. [Men12] offer several ways
  • They ask for better experimental practice.
• Is there a better way?
  • Yes! Look for the “Bellwether”
• As long as the bellwether continues to offer good quality predictions
  • Then conclusions from one…
  • ... are conclusions for all

Page 15: The “Bellwether” Effect and Its Implications to Transfer Learning

Estimating Quality: Why not Static Analyzers?

• Rahman et al. [Rahman14] compared
  • Code analysis tools: FindBugs, JLint, and PMD
  • with static code defect predictors
  • Found no difference (measurement: AUCEC)
• And
  • Using lightweight parsers...
  • … defect predictors can quickly jump to new languages
  • Same is not true for static code analysis tools
• Fewer bugs, better software

Page 16: The “Bellwether” Effect and Its Implications to Transfer Learning

Estimating Quality: Why not Static Analyzers?

• And
  • They work surprisingly well!
  • [Ostrand04]: ~80% of the bugs localized in 20% of the code

Page 17: The “Bellwether” Effect and Its Implications to Transfer Learning

Estimating Quality: Static Code Defect Prediction

1. Ubiquitous
  • Researchers and industrial practitioners frequently use them, e.g. companies like Google [Lew14], V&V books [Raktin01]
2. A lot of (ongoing) research
  • Tremendous attention [Nam13]
  • Better approaches are constantly being proposed
3. They are easy to use
  • Software metrics can be collected fast
  • Wide variety of tools, open source data miners [sklearn][weka]

Page 19: The “Bellwether” Effect and Its Implications to Transfer Learning

Transfer Learning: Introduction

• Extract knowledge from source (S) and apply to target (T)
• Data needs to be massaged before use [Zhang15]
  • Careful sub-sampling
  • Transformation
• Based on data source, TL is categorized as:
  • Homogeneous vs. Heterogeneous
• Based on transformation [Nam13, Nam15, Jing15]
  • Similarity vs. Dimensionality

Page 21: The “Bellwether” Effect and Its Implications to Transfer Learning

Transfer Learning: Categories

Homogeneous (This Talk)
• Source (S) and Target (T) are quantified using the same attributes

Heterogeneous
• Source (S) and Target (T) are quantified using different attributes

Similarity (This Talk)
• Learn from subsampled rows/columns of the source (S)

Dimensionality
• Manipulate rows/columns of source (S) to match target (T)

Page 22: The “Bellwether” Effect and Its Implications to Transfer Learning

Homogeneous TL: Burak Filter

• Burak [Tur09] used relevancy filtering
  • Filter using kNN
• Gather two sets of data
  • Validation set (S): Test Data
  • Candidate set (T): Train Data
• Use kNN
  • Pick “similar” instances from T
  • Filter T using S
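The filtering step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the Euclidean distance, and the default `k=10` are assumptions made here for the example. For every row of the validation set (S) it keeps the k closest rows of the candidate set (T) and returns the union.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length metric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_filter(candidates, validation, k=10):
    """Return sorted indices of candidate rows (T) that are among the
    k nearest neighbours of at least one validation row (S).
    k=10 is an assumed default for this sketch."""
    keep = set()
    for v in validation:
        ranked = sorted(range(len(candidates)),
                        key=lambda i: euclidean(candidates[i], v))
        keep.update(ranked[:k])
    return sorted(keep)

# Toy data: three candidate rows sit near the validation rows, two do not.
candidates = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [9.0, 9.0], [0.2, 0.1]]
validation = [[0.0, 0.1], [0.3, 0.3]]
selected = knn_filter(candidates, validation, k=2)
```

Only the selected candidate rows would then be used for training; the distant rows are dropped.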

Page 23: The “Bellwether” Effect and Its Implications to Transfer Learning

Homogeneous TL: Burak Filter

• First study on relevancy
• Their conclusion:

… The performances of defect predictors based on the NN-filtered data do not give necessary empirical evidence to make a strong conclusion …

… Sometimes NN data based models may perform better than WC data based models …

Page 24: The “Bellwether” Effect and Its Implications to Transfer Learning

Homogeneous TL: Mixed Model Learner

• Turhan et al. [Tur11] proposed a mixed-model learner
• Combine local data with curated non-local data
• Gather two sets of data
  • Validation set (S): Pick a random 10% of local data
  • Candidate set (T): Remaining 90% and non-local data
• For non-local data, they use Burak filter [Tur09]
• Experiment with various 90%-10% splits
  • 400 experiments were conducted to pick the best model
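One way to picture the data assembly on this slide is the sketch below. It is an assumption-laden illustration rather than the published procedure: `mixed_model_split` and `keep_all` are hypothetical names, and `keep_all` is only a placeholder where a relevancy filter such as the Burak filter would be plugged in.

```python
import random

def mixed_model_split(local_rows, nonlocal_rows, relevancy_filter,
                      frac=0.10, seed=0):
    """Split local data into a validation set (about `frac` of the rows)
    and build the candidate set from the remaining local rows plus the
    non-local rows that pass `relevancy_filter`."""
    rng = random.Random(seed)
    shuffled = list(local_rows)
    rng.shuffle(shuffled)
    n_val = max(1, round(frac * len(shuffled)))
    validation = shuffled[:n_val]
    local_rest = shuffled[n_val:]
    candidate = local_rest + relevancy_filter(nonlocal_rows, validation)
    return validation, candidate

def keep_all(rows, validation):
    """Placeholder filter (hypothetical): keeps every non-local row."""
    return list(rows)

local = [("local", i) for i in range(20)]
foreign = [("nonlocal", i) for i in range(5)]
validation, candidate = mixed_model_split(local, foreign, keep_all, seed=1)
```

Repeating the call with different seeds gives the different 90%-10% splits the slide mentions.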

Page 25: The “Bellwether” Effect and Its Implications to Transfer Learning

Homogeneous TL: Mixed Model Learner

• Extension to Burak Filter
  • Incorporated local data

Challenges
• Similar issues as Burak Filter
  • Biased; unstable model.
• The authors report:

… mixed project models offer only limited improvements, i.e., 3 out of 10 projects

— Turhan ‘11

25

Page 26: The “Bellwether” Effect and Its Implications to Transfer Learning

Homogeneous TL: Addressing the challenges

• Researchers have offered a bleak view of TL
• Zimmermann et al. [Zimm09]
  • Transfer is not always consistent
  • IE could learn from Firefox but not vice versa
• Rahman et al. [Rahman12]
  • The “imprecision” of learning across projects
• Recent research has resorted to more complex approaches

Page 27: The “Bellwether” Effect and Its Implications to Transfer Learning

More Transfer Learners …

27 WeTOSM ‘14

Page 29: The “Bellwether” Effect and Its Implications to Transfer Learning

Is this complexity necessary?

• Short answer: No
• Just look for the “Bellwether”
  • Use our bellwether method
  • Build your model
  • Et voilà!

Page 30: The “Bellwether” Effect and Its Implications to Transfer Learning

The Bellwether Method

Generate, Apply, Monitor

Page 31: The “Bellwether” Effect and Its Implications to Transfer Learning

The Bellwether Method

Generate
• Project pairs (Pi, Pj)
• Perform a leave-one-out test: train on Pi, test on Pj
• Pick the project with the best model

Apply

Monitor
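The Generate step can be sketched as a round-robin over projects. Everything below is illustrative: the threshold learner stands in for the deck's Random Forest, accuracy stands in for its Balance measure, and ranking candidates by median score is an assumption of this sketch.

```python
from statistics import mean, median

def train_threshold(rows):
    """Toy learner (stand-in only): the midpoint between the mean metric
    value of buggy rows (label 1) and clean rows (label 0)."""
    buggy = [x for x, y in rows if y == 1]
    clean = [x for x, y in rows if y == 0]
    return (mean(buggy) + mean(clean)) / 2

def accuracy(threshold, rows):
    """Fraction of rows where 'metric > threshold' matches the label."""
    hits = sum((x > threshold) == bool(y) for x, y in rows)
    return hits / len(rows)

def find_bellwether(projects, train=train_threshold, score=accuracy):
    """Train on each project Pi, test on every other Pj, and return the
    project whose model has the best median score (higher is better)."""
    results = {}
    for i, train_rows in projects.items():
        model = train(train_rows)
        results[i] = median(score(model, rows)
                            for j, rows in projects.items() if j != i)
    return max(results, key=results.get), results

# Each row is (metric value, defect label).
projects = {
    "A": [(1, 0), (2, 0), (3, 1), (4, 1)],
    "B": [(1, 0), (4, 0), (6, 1), (9, 1)],
    "C": [(3, 0), (4, 0), (8, 1), (9, 1)],
}
bellwether, scores = find_bellwether(projects)
```

With a lower-is-better measure such as the deck's Balance, the `max` would become a `min`.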

Page 32: The “Bellwether” Effect and Its Implications to Transfer Learning

The Bellwether Method

Generate

Apply
• Predict quality on future projects

Monitor

Page 33: The “Bellwether” Effect and Its Implications to Transfer Learning

The Bellwether Method

Generate

Apply

Monitor
• When predictions fail, restart.
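The slide does not say what counts as a failing prediction, so the check below is only one possible reading. The `threshold` and `window` parameters are hypothetical: the bellwether is flagged for regeneration once the most recent `window` scores all fall below `threshold`.

```python
def needs_restart(recent_scores, threshold=0.6, window=3):
    """Return True when the last `window` scores (higher is better) are all
    below `threshold`, signalling that Generate should be re-run.
    Both parameters are assumptions made for this sketch."""
    if len(recent_scores) < window:
        return False
    return all(s < threshold for s in recent_scores[-window:])

still_fine = needs_restart([0.5, 0.5, 0.8])
failing = needs_restart([0.8, 0.5, 0.5, 0.5])
```

Requiring several consecutive poor scores is a design choice that avoids restarting on a single noisy release.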

Page 36: The “Bellwether” Effect and Its Implications to Transfer Learning

Experiment Setup: Benchmark Data

• 120 Datasets from 4 communities
• Defects in 3 levels of granularity
  • File, Class, and Function
• Open source and Proprietary

Page 37: The “Bellwether” Effect and Its Implications to Transfer Learning

Experiment Setup: Benchmark Data

• BTW, Apache has local data
  • Multiple versions
  • Temporally ordered

A total of 54 datasets

Page 39: The “Bellwether” Effect and Its Implications to Transfer Learning

Experiment Setup: Prediction Model

• We use Random Forests [Zimmerman08]
  • Build several decision trees from random subsamples
  • Use ensemble learning
• Samples are imbalanced [Pelayo07]
  • More “clean” examples
• Use SMOTE [Chawla01] to rebalance data*
  • Randomly down-sample “clean” instances
  • Up-sample “buggy” instances

*Apply only to training data
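The rebalancing idea can be sketched without any library. This is a simplified illustration, not the reference SMOTE implementation: `rebalance`, `target`, and `k` are names chosen here. Clean rows are randomly down-sampled, and synthetic buggy rows are created by interpolating between a buggy row and one of its nearest buggy neighbours.

```python
import random

def rebalance(clean, buggy, target, k=2, seed=1):
    """Return (clean_out, buggy_out) with about `target` rows each.
    Needs at least two buggy rows. Apply to training data only."""
    rng = random.Random(seed)
    # Down-sample the majority ("clean") class.
    clean_out = rng.sample(clean, target) if len(clean) > target else list(clean)
    # Up-sample the minority ("buggy") class with synthetic rows.
    buggy_out = list(buggy)
    while len(buggy_out) < target:
        a = rng.choice(buggy)
        neighbours = sorted(
            (b for b in buggy if b is not a),
            key=lambda b: sum((x - y) ** 2 for x, y in zip(a, b)),
        )[:k]
        b = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from a to b
        buggy_out.append([x + gap * (y - x) for x, y in zip(a, b)])
    return clean_out, buggy_out

clean = [[float(i), float(i)] for i in range(10)]
buggy = [[20.0, 20.0], [21.0, 22.0], [23.0, 21.0]]
clean_out, buggy_out = rebalance(clean, buggy, target=6)
```

Test data is left untouched, matching the footnote on the slide.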

Page 41: The “Bellwether” Effect and Its Implications to Transfer Learning

Experiment Setup: Statistical Measures

• Prediction is usually measured using ROC
  • ROC is a plot of Recall vs. False Alarm
• Plot requires several treatments
  • Obtained by cross validation.
• We refrain from Cross-Validation
  • It tends to mix the test data with the bellwether
• Instead,
  • We use Balance [Ma07]

Page 42: The “Bellwether” Effect and Its Implications to Transfer Learning

Experiment Setup: Statistical Measures

• Instead of a set of points for ROC,
  • Produce one point.
  • X, Y = Pd (Recall), Pf (False Alarm)
• Balance is the weighted distance from the ideal point
  • Ideal Point => (Pd, Pf) = (1, 0)
  • Balance =
  • The lower the Balance, the better the performance
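The formula itself did not survive in the transcript, so the sketch below should be read as one plausible form, not the authors' definition. It assumes a plain Euclidean distance from (Pd, Pf) to the ideal point (1, 0), divided by the square root of 2 so the result lies between 0 and 1; the slide's "weighted" distance may use different weights.

```python
import math

def pd_pf(tp, fn, fp, tn):
    """Recall (Pd) and false alarm rate (Pf) from confusion-matrix counts."""
    pd = tp / (tp + fn)
    pf = fp / (fp + tn)
    return pd, pf

def distance_from_ideal(pd, pf):
    """Assumed form: normalised Euclidean distance from (pd, pf) to the
    ideal point (1, 0). 0 is perfect, 1 is the worst case."""
    return math.sqrt((1 - pd) ** 2 + (0 - pf) ** 2) / math.sqrt(2)

pd, pf = pd_pf(tp=8, fn=2, fp=1, tn=9)
score = distance_from_ideal(pd, pf)
```

Some papers report one minus this distance so that higher is better; the slide uses the lower-is-better convention.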

Page 43: The “Bellwether” Effect and Its Implications to Transfer Learning

Experiment Setup: Statistical Measures

• Prediction Model is inherently random
  • Rerun model 40 times with different seeds
  • Collect Balance measure in every run
• Use Scott-Knott Test to compare Balance values
  • Scott-Knott ranks Balance values (best to worst)
  • Rank -> Effect Size Test + Hypothesis Test
• Why SK?
  • It’s been used by recent high profile papers at TSE [Mittas13] and ICSE [Ghotra15]

Page 45: The “Bellwether” Effect and Its Implications to Transfer Learning

Results: Research Questions

How rare are “Bellwethers”?

How does the bellwether fare against local models?

Is Bellwether better than other transfer learning methods?

Can we predict which data set will be bellwether?

How much of the “Bellwether” data is required?

Page 47: The “Bellwether” Effect and Its Implications to Transfer Learning

Results: Research Question 1

How rare are “Bellwethers”?

Research Answer: Our results suggest bellwethers are not rare.

Page 48: The “Bellwether” Effect and Its Implications to Transfer Learning

Results: Research Question 1

How rare are “Bellwethers”?

Community: Apache, Bellwether: Lucene
Community: NASA, Bellwether: MC
Community: AEEEM, Bellwether: LC
Community: ReLink, Bellwether: Safe

Page 53: The “Bellwether” Effect and Its Implications to Transfer Learning

Results: Research Question 2

How does the bellwether fare against local models?

Research Answer: For projects measured with the same quality metrics, training models with the bellwether is just as good as, if not better than, local models.

Page 55: The “Bellwether” Effect and Its Implications to Transfer Learning

Results: Research Question 3

Is Bellwether better than other transfer learning methods?

Research Answer: The bellwether outperforms standard homogeneous transfer learners.

Page 57: The “Bellwether” Effect and Its Implications to Transfer Learning

Results: Research Question 4

Can we predict which data set will be bellwether?

Research Answer: This is non-trivial. Our attempts to statistically determine whether a project will be a bellwether were unsuccessful. This is open to further examination.

Page 59: The “Bellwether” Effect and Its Implications to Transfer Learning

Results: Research Question 5

How much data is required before detecting the “Bellwether”?

Research Answer: A few dozen defective samples from the bellwether are sufficient to build a reliable model.

Page 61: The “Bellwether” Effect and Its Implications to Transfer Learning

Practical Implications

• The problem of generality in SE
  • Reproducibility is hard to achieve.
• With Bellwethers, Transfer Learners can
  • Not only be reproducible
  • But also be stable
  • and reliable
• Identification of Bellwether earlier
  • Would have changed course of research
  • More focus on coarse grain analysis
  • Less on relevancy filtering, model generation

Page 62: The “Bellwether” Effect and Its Implications to Transfer Learning

Future Work

• Bellwethers in heterogeneous learners
  • Promising heterogeneous transfer learners [Nam15][Jing15]
  • Perform complex dimensionality mapping transforms
  • Can Bellwethers assist in finding the best mapping?
• Study and quantify bellwether
  • What makes a bellwether a bellwether?
• Bellwethers beyond defect prediction
  • Are there bellwethers in other data?

Page 63: The “Bellwether” Effect and Its Implications to Transfer Learning

In conclusion...

• Look for bellwethers
  • To use as a baseline
  • To justify the use of transfer learning
• Stabilize the pace of conclusions
  • Not permanent conclusion stability
• Easy to find
  • Look when necessary
  • New data can be discarded
  • Updated only as they start failing