Impact analysis - A Seismology-inspired Approach to Study Change Propagation


Upload: icsm-2011

Post on 13-Jan-2015

DESCRIPTION

Paper: "A Seismology-inspired Approach to Study Change Propagation". Authors: Salima Hassaine, Ferdaous Boughanmi, Yann-Gaël Guéhéneuc, Sylvie Hamel, and Giuliano Antoniol. Session: "Research Track Session 2: Impact Analysis"

TRANSCRIPT

Page 1: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Soccerlab and Ptidej team Seminar

Salima Hassaine, Ferdaous Boughanmi, Yann-Gaël Guéhéneuc, Sylvie Hamel, Giuliano Antoniol

Introduction

The Earthquake Metaphor

Approach

Empirical Study

Study Results

Conclusion

A Seismology-inspired Approach to Study Change Propagation

Salima Hassaine, Ferdaous Boughanmi, Yann-Gaël Guéhéneuc, Sylvie Hamel, Giuliano Antoniol

SOCCER Lab. and Ptidej Team – DGIGL, École Polytechnique de Montréal, Québec, Canada

September 27, 2011

Pattern Trace Identification, Detection, and Enhancement in Java

SOftware Cost-effective Change and Evolution Research Lab

Page 2: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Context and Motivation

- Software evolves continuously, requiring continuous maintenance and development.

- Software maintenance is the most costly and difficult phase in the software life cycle.

- Making changes without understanding their effects can lead to poor effort estimation and delays in release schedules because of their consequences (e.g., the introduction of bugs).

Page 3: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Change Impact Analysis

- Change impact analysis is defined by Bohner and Arnold [1] as "identifying the potential consequences of a change, or estimating what needs to be modified to accomplish a change".

[1] S. A. Bohner and R. S. Arnold, Software Change Impact Analysis. IEEE Computer Society Press, 1996.

Page 4: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Existing approaches

- Structure-based analysis
  - Dependency analysis of source code is performed using static or dynamic program analyses.
  - The relationships between classes make change impact difficult to anticipate (e.g., hidden propagation).

- History-based analysis
  - Software repositories are mined to identify co-changes of software artefacts within a change-set.
  - It can often capture change couplings that static and dynamic analyses cannot.
  - Such analyses fail to capture how changes spread over space (e.g., over the class diagram). As a result, they cannot help developers prioritise their changes according to the forecast scope of changes.

- Probabilistic approaches
  - Change propagation models are built to predict future change couplings using probabilistic tools (e.g., Bayesian networks, time series analysis).
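
As a rough illustration of the history-based idea, co-changes can be counted by tallying every pair of classes that appears in the same change-set. This is a minimal sketch, not a tool from the talk: the function name `co_change_counts`, the class names, and the change-sets are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(change_sets):
    """Count how often each unordered pair of classes changes together."""
    counts = Counter()
    for change_set in change_sets:
        # Sort so that (A, B) and (B, A) map to the same key.
        for pair in combinations(sorted(set(change_set)), 2):
            counts[pair] += 1
    return counts


# Hypothetical change-sets (one set of class names per commit).
change_sets = [
    {"A", "B"},
    {"A", "C"},
    {"A", "B", "D"},
]
counts = co_change_counts(change_sets)
```

Pairs with a high count are candidate change couplings. The count alone says nothing about how far apart the two classes are in the class diagram.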

Page 5: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Motivating example

- Bug ID 200551 reports a bug in Rhino that was introduced when a developer implemented a change to class Kit and missed a required change to class DefiningClassLoader.

- Information passes from class Kit to class DefiningClassLoader through an intermediary class, ContextFactory, which remains unchanged.

Page 6: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Our goal

- Propose an approach to study the scope of change propagation based on a seismology metaphor.

- Our approach considers a change to a class as an earthquake that propagates through a long chain of relationships.

- Our approach combines static dependencies between classes and historical co-change relations to study how far a change propagation will proceed from a given class to the others.

Page 7: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

The Earthquake Metaphor

Active seismic areas         ->  “Important” classes
Earthquake                   ->  Software change
Epicenter                    ->  “Important” changed class
Seismic wave propagation     ->  Change propagation
Damaged sites                ->  “Impacted” classes
Distance from an epicenter   ->  Class level

Page 8: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Approach

- Step 1: Identifying the most important classes
  - Using a PageRank-based metric, a history-based metric, and a combination of both metrics.

- Step 2: Identifying class levels
  - Using static dependencies between classes.

- Step 3: Identifying impacted classes
  - Using historical co-change relations extracted from software repositories.

Page 9: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Step 1: Identifying the most important classes
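
The transcript does not say how the PageRank-based metric is computed, so the following is only a sketch of plain PageRank by power iteration over a class dependency graph. The function name, the damping factor of 0.85, the edge direction (a class passes rank to the classes it depends on), and the example graph are all assumptions made for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [successors]}."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for source, targets in graph.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical dependencies: A and B depend on C, and C depends on D.
graph = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
rank = pagerank(graph)
most_important = max(rank, key=rank.get)
```

Under these assumptions, classes that many others depend on, directly or transitively, end up with the highest rank and are the natural epicenter candidates.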

Page 10: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Step 2: Identifying class levels (1/2)

Figure: The conversion of a class diagram into string (from [2]). Panels: (a) UML-like model of classes A to G, with relationships labelled cr, co, in, ag, and as; (b) Eulerian model, which adds edges labelled dm; (c) String representation of the Eulerian model.

[2] O. Kaczor, Y.-G. Guéhéneuc, and S. Hamel, Efficient identification of design patterns with bit-vector algorithm, pp. 175–184. IEEE Computer Society Press, 2006.

Page 11: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Step 2: Identifying class levels (2/2)

Bit-Vector Algorithm

- Input:
  - The epicenter class (e.g., class A)
  - The string representation of the program

- Output:
  - Class levels (e.g., Level0 = {A}, Level1 = {B, F}, Level2 = {D, E, C}, Level3 = {G}, Level4 = {F})
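
The talk computes levels with a bit-vector algorithm over the string representation from [2]. As a simpler stand-in, the sketch below uses a breadth-first search over an adjacency dictionary. It assumes that a class's level is its shortest distance from the epicenter in the dependency graph. The function name and the graph are hypothetical and do not reproduce the figure or the example levels above.

```python
from collections import deque


def class_levels(dependencies, epicenter):
    """Group classes by their shortest distance from the epicenter."""
    distance = {epicenter: 0}
    queue = deque([epicenter])
    while queue:
        current = queue.popleft()
        for neighbour in dependencies.get(current, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    levels = {}
    for cls, level in distance.items():
        levels.setdefault(level, set()).add(cls)
    return levels


# Hypothetical dependency graph, listed in both directions.
dependencies = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
levels = class_levels(dependencies, "A")
```

With this graph, `levels` maps 0 to the epicenter, 1 to its direct neighbours, and so on, which mirrors the "distance from an epicenter" reading of class levels.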

Page 12: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Step 3: Identifying impacted classes

- We define the observation time window T as the median time between two consecutive changes to the epicenter class.

- We extract all the commits that happened after any change to the epicenter class and within the chosen time window T.

- We use our framework Ibdoos to implement queries that collect the set of classes that changed after any change to the epicenter class and during T.
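
The two steps above can be sketched in a few lines. This is not the Ibdoos implementation, whose queries are not shown in the talk. The function names, the commit representation (a list of timestamp and class-set pairs), and the choice to count only commits strictly after an epicenter change are assumptions.

```python
from datetime import datetime, timedelta
from statistics import median


def time_window(epicenter_change_times):
    """Median gap between consecutive changes to the epicenter class."""
    times = sorted(epicenter_change_times)
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
    return timedelta(seconds=median(gaps))


def impacted_classes(commits, epicenter, window):
    """Classes changed within `window` after any change to the epicenter."""
    epicenter_times = [when for when, classes in commits if epicenter in classes]
    impacted = set()
    for when, classes in commits:
        # Assumption: a commit counts only if it is strictly after an epicenter change.
        if any(timedelta(0) < when - start <= window for start in epicenter_times):
            impacted.update(classes)
    impacted.discard(epicenter)
    return impacted


# Hypothetical commit history: (timestamp, set of changed classes).
commits = [
    (datetime(2011, 1, 1), {"A"}),
    (datetime(2011, 1, 2), {"B"}),
    (datetime(2011, 1, 5), {"A", "C"}),
    (datetime(2011, 1, 6), {"D"}),
    (datetime(2011, 1, 20), {"E"}),
    (datetime(2011, 1, 25), {"A"}),
    (datetime(2011, 1, 27), {"A"}),
]
window = time_window([when for when, classes in commits if "A" in classes])
impacted = impacted_classes(commits, "A", window)
```

Here the gaps between changes to `A` are 4, 20, and 2 days, so T is 4 days. Class `E` falls outside every window and is not counted as impacted.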

Page 13: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Empirical Study Design

- Goal: to show the applicability and usefulness of our approach.

- Purpose: to gather interesting observations on the scope of change propagation and to confirm these observations statistically.

- Quality focus: the accuracy of the identified scope of change propagation.

- Perspective: researchers and practitioners who should be aware of the scope of a change to estimate the effort required for future maintenance tasks. The observed phenomena can help in making decisions concerning the process of future software projects.

- Context: three open source systems: Pooka, Rhino, and Xerces-J.

Page 14: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Research Questions (1/3)

RQ1: Does our metaphor allow us to observe the scope of change impact?

- We investigate whether it is possible to apply our approach to observe change propagation through class levels.

- We perform a qualitative study to confirm our observations of change propagation, using external information.

- Thus, we can show that, indeed, as in seismology, certain levels are more impacted by a change than others.

Page 15: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Research Questions (2/3)

RQ2: What is the level most impacted by a change?

- We perform a quantitative study to confirm our observations of change propagation, using statistical tests to investigate which level may be the most impacted by a change and to group the levels that have a similar impact.

- Thus, we can deduce all the classes with a higher risk of being impacted by any change to the epicenter class.

Page 16: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Research Questions (3/3)

RQ3: What is the most reachable level by a change?

- As in RQ2, we perform a quantitative study to confirm our observations of change propagation, using statistical tests to investigate, for each level, the number of earthquakes that propagate up to that level.

- Thus, we can deduce the most reachable level.

Page 17: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Analysis Methods

- RQ1: Using the R statistical system, we build a 3D graph visualising the change propagation from the epicenter class to other classes.

- RQ2: We compute, for each level, the number of classes that changed after any change to the considered epicenter class.

- RQ3: For each level, we create a subset that contains the number of earthquakes that stop at this level.

- We conduct Duncan's multiple range test to classify the subsets with respect to the differences between them.
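
The per-level counts behind RQ2 and RQ3 can be sketched as follows. This is an illustration under assumptions, not the authors' scripts: the function names and data are hypothetical, an earthquake is represented by the set of classes it impacted, and it is taken to "stop" at the farthest level containing an impacted class. Duncan's multiple range test itself is left out.

```python
from collections import Counter


def changes_per_level(levels, impacted):
    """RQ2: number of impacted classes found at each level."""
    return {level: len(classes & impacted) for level, classes in levels.items()}


def stopping_levels(levels, earthquakes):
    """RQ3: for each level, the number of earthquakes whose farthest impact is there."""
    stops = Counter()
    for impacted in earthquakes:
        reached = [level for level, classes in levels.items() if classes & impacted]
        if reached:
            stops[max(reached)] += 1
    return stops


# Hypothetical class levels around epicenter A and four hypothetical earthquakes.
levels = {0: {"A"}, 1: {"B", "C"}, 2: {"D"}, 3: {"E"}}
earthquakes = [{"B"}, {"B", "D"}, {"C", "D"}, {"E"}]

per_level = changes_per_level(levels, earthquakes[1])
stops = stopping_levels(levels, earthquakes)
```

Collecting these counts over every earthquake of every epicenter class yields the per-level samples that a multiple range test would then compare.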

Page 18: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Study Results (1/3)

RQ1: Does our metaphor allow us to observe the scope of change impact?

(a) class XMLEventImpl (b) class TypeValidator

Figure: Change propagation

- Epicenter class XMLEntityScanner: we found bug ID 1099, which relates the changes to the epicenter class to changes to XMLParser (level 3).

Page 19: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Study Results (2/3)

RQ2: What is the level most impacted by a change?

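Homogeneous subsets for alpha = 0.1 (columns: Levels, Range 1, Range 2, Range 3)

Levels   Value
5        107.5410
4        147.7778
3        150.0000
2        202.0408
1        354.4828

[The Range column that holds each value is not recoverable from this transcript.]

Table: Rhino: Duncan's test applied on "number of changes"

Homogeneous subsets for alpha = 0.1 (columns: Levels, Range 1, Range 2, Range 3)

Levels   Value
6        6.4015
5        10.8485
4        24.8333
3        50.2789
2        83.7273
1        895.2652

[The Range column that holds each value is not recoverable from this transcript.]

Table: Xerces-J: Duncan's test applied on "number of changes"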

Page 20: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Study Results (3/3)

RQ3: What is the most reachable level by a change?

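Homogeneous subsets for alpha = 0.1 (columns: Max Level, Range 1, Range 2, Range 3)

Max Level   Value
5           0.5833
4           1.3712
3           1.7500
2           4.6136
1           11.7121

[The Range column that holds each value is not recoverable from this transcript.]

Table: Rhino: Duncan's test applied on "number of earthquakes"

Homogeneous subsets for alpha = 0.1 (columns: Max Level, Range 1, Range 2, Range 3)

Max Level   Value
6           10.5333
5           16.3333
4           21.6667
3           30.0033
2           43.2000
1           54.8667

[The Range column that holds each value is not recoverable from this transcript.]

Table: Xerces-J: Duncan's test applied on "number of earthquakes"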

Page 21: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Threats to Validity (1/2)

- Construct validity concerns the relation between theory and observations. In this study, threats to construct validity could stem from the chosen time windows, which may affect our observations.

- Internal validity of a study is the extent to which a treatment impacts the dependent variable. The internal validity of our study is not threatened because we have not manipulated the independent variable, the extent of the change propagation.

Page 22: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Threats to Validity (2/2)

- External validity of a study relates to the extent to which we can generalise its results. The main threat to the external validity of our study relates to the analysed programs, which could limit the generalisation of the presented results. Future work includes replicating this study on other programs to confirm our results.

- Threats to conclusion validity deal with the relation between the treatment and the outcome. We paid attention not to violate the assumptions of the statistical tests we performed. We improved our conclusion validity by accepting a higher risk of making a Type I error (that is, a higher chance of finding a relationship when in fact there is none), which we did statistically by raising the alpha level: instead of the 0.05 significance level, we use 0.1 as our cutoff point.

Page 23: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Conclusion (1/2)

- We proposed an approach to study how far a change propagation will proceed from a given class to the others.

- We performed one qualitative and two quantitative studies. We showed that our intuition that the classes impacted by a change must be near the changed class is incorrect in some cases: some change propagations reach the 5th level in Rhino (and the 6th in Xerces-J).

Page 24: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Conclusion (2/2)

- Identifying the scope of change propagation could help both developers and managers. Developers could easily locate the change impact. Managers could more accurately estimate the effort required to perform changes.

- Future work: apply our metaphor and our approach to other programs to confirm our observations. We will also adapt seismology models to predict changes to classes.

Page 25: Impact analysis - A Seismology-inspired Approach to Study Change Propagation

Questions?
