130705 zephyrin soh - how developers spend their effort during maintenance activities

École Polytechnique Montréal July, 5th 2013 How Developers Spend Their Effort During Maintenance Activities Zéphyrin Soh, Foutse Khomh, Yann-Gaël Guéhéneuc, Giuliano Antoniol

Upload: ptidej-team

Post on 26-Jun-2015

DESCRIPTION

Software maintenance, developers' effort, exploration strategies

TRANSCRIPT

Page 1

École Polytechnique Montréal, July 5th, 2013

How Developers Spend Their Effort During Maintenance Activities

Zéphyrin Soh, Foutse Khomh, Yann-Gaël Guéhéneuc, Giuliano Antoniol

Page 2

Outline

Introduction

Goal and Benefits

Background

Empirical Study

– RQ1

– RQ2

Conclusion

Threats to Validity

Page 3

Introduction

Performing a maintenance task = spending a certain effort to produce some results

Effort:

explore the program

find relevant program entities

understand and make changes to the program entities

Result:

Changes made to the source code

Provided as a patch or a commit in the repository

Page 4

Motivating Example

[Figure: exploration graph showing explored, relevant, and additional files]

Page 7

Motivating Examples

Is the effort spent proportional to the results?

What are the factors that may affect developers' effort?

Page 8

Motivations

How developers spend their effort may affect their productivity

Some tasks need less effort while others need more effort

Some developers may spend comparatively less effort than others:

for a task of equal complexity

to produce results of equal size

==> It is important to know why these differences exist

Page 9

Goal

Main goal

How can we help developers based on the way they spend their effort?

Specific goals

Understand how developers spend their effort

Study whether there are factors which can be influenced by tooling

==> Propose such tooling

Page 10

Benefits

Better assign tasks to developers

A developer may perform well on some kinds of tasks, e.g., tasks that change an existing method vs. tasks that add a new method

Refine recommendation systems

Be careful when recommending effort-consuming entities that are not relevant to a task

Guide developers during program exploration

Page 11

Background

Need detailed information about:

Developers' programming activities

Interaction histories

Recorded by the Eclipse Mylyn plugin

==> Opportunity to estimate developers' effort

The changes made to address a task

Patches: contain the source code before and after the changes

Page 12

Background

Developers must activate the task to start gathering interactions, and disable it to stop gathering them

XML files

– How developers explore program entities

– The time spent on each entity

Page 13

Empirical Study

Four open-source projects: ECF, Mylyn, PDE, Eclipse Platform

RQ1: Does the complexity of the implementation of a task reflect developers’ effort?

[Diagram: Effort and Complexity of the implementation, linked by a question mark]

Page 14

Empirical Study

RQ2: How do developers spend their effort?

What are the factors affecting developers’ effort?

[Diagram: Additional files, Bug severity, and Developers' experience, each linked to Effort by a question mark]

Page 15

RQ1: Effort vs. Complexity of the implementation

Does a task that requires a complex implementation require more effort?

– Sometimes, a simple implementation can also require a lot of effort

Effort: interaction data

Time spent

Cyclomatic complexity

Complexity of the implementation: patch data

Entropy

Change distance

Page 16

Developers' effort

Time spent

Total duration spent on all files and their contents

f1(2min), f2.m1(3min), f3.field1(1min) ==> 6min

The more time a task takes, the more effort is spent understanding the code and performing changes.
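
The time-spent metric can be sketched in a few lines. This is a minimal illustration, assuming each interaction event has already been reduced to an (entity, minutes) pair; the function name `total_time_spent` and the tuple layout are hypothetical and are not part of Mylyn.

```python
def total_time_spent(events):
    """Sum the minutes spent on every explored entity (file, method, field)."""
    return sum(minutes for _entity, minutes in events)


# The example from the slide: f1(2min), f2.m1(3min), f3.field1(1min)
events = [("f1", 2), ("f2.m1", 3), ("f3.field1", 1)]
print(total_time_spent(events))  # 6
```

An entity visited several times simply appears several times in the list, so repeated visits add up.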

Cyclomatic complexity

Cyclomatic complexity = edges - vertices + connected_components

connected_components = 1

The higher the cyclomatic complexity of the exploration graph, the more effort developers spend exploring the program.
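
A possible way to compute this value from an ordered list of visited entities is sketched below. The graph construction is an assumption: one vertex per explored entity and one directed edge per distinct move between two consecutive entities. The slides give only the formula, and the name `exploration_cyclomatic` is hypothetical.

```python
def exploration_cyclomatic(visits, connected_components=1):
    """Cyclomatic complexity (edges - vertices + components) of an exploration graph.

    Assumption: vertices are the distinct visited entities and edges are the
    distinct directed moves between consecutive, different entities.
    """
    vertices = set(visits)
    edges = {(a, b) for a, b in zip(visits, visits[1:]) if a != b}
    return len(edges) - len(vertices) + connected_components


# A straight walk through three files: 2 edges - 3 vertices + 1 = 0
print(exploration_cyclomatic(["f1", "f2", "f3"]))  # 0

# Going back and forth around f1: 4 edges - 3 vertices + 1 = 2
print(exploration_cyclomatic(["f1", "f2", "f1", "f3", "f1"]))  # 2
```

Under this construction, back-and-forth navigation raises the value while a linear walk keeps it low.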

Page 17

Complexity of the Implementation

Entropy

How much the changes are scattered across files

Changing 5 LOC: in 1 file vs. across 5 files

The higher the entropy, the more the changes are scattered across files, and thus the more complex the implementation
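
The 1-file vs. 5-files contrast can be made concrete with a short sketch. It assumes the entropy is the Shannon entropy of the share of changed LOC per file; the slides do not state the exact formula, and `change_entropy` is a hypothetical name.

```python
import math


def change_entropy(loc_per_file):
    """Shannon entropy (in bits) of the distribution of changed LOC over files."""
    total = sum(loc_per_file.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for loc in loc_per_file.values():
        if loc > 0:
            share = loc / total
            entropy -= share * math.log2(share)
    return entropy


# 5 changed LOC concentrated in one file: no scattering
print(change_entropy({"A.java": 5}))  # 0.0

# 5 changed LOC spread over five files: maximum scattering, log2(5) bits
spread = {"A.java": 1, "B.java": 1, "C.java": 1, "D.java": 1, "E.java": 1}
print(round(change_entropy(spread), 2))  # 2.32
```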

Change distance

How large the difference is between the source code before the changes and the source code after the changes

Adding a new LOC vs. renaming a variable in a LOC

The greater the difference between the old and the new code (i.e., the larger the change distance), the more complex the change
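
One way to quantify such a difference is a character-level edit distance. The sketch below uses the Levenshtein distance as a stand-in; this is an assumption, since the slides do not say which distance the study uses, and the code fragments are made up for illustration.

```python
def levenshtein(old, new):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(new) + 1))
    for i, old_char in enumerate(old, start=1):
        current = [i]
        for j, new_char in enumerate(new, start=1):
            cost = 0 if old_char == new_char else 1
            current.append(min(previous[j] + 1,          # delete old_char
                               current[j - 1] + 1,       # insert new_char
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


# Renaming a variable inside an existing line: a small distance
renamed = levenshtein("int cnt = 0;", "int count = 0;")

# Adding a whole new line: the distance equals the length of the new line
added_line = "total += cnt;"
added = levenshtein("", added_line)

print(renamed < added)  # True
```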

Page 18

RQ1: Approach

Identify the patch that is the result of a given interaction

Ensure that the changes in the patch are the results of the effort spent in the interaction

==> Use working dates and modification dates

When do developers create a patch?

Before disabling the task: 26.04%

After disabling the task: 73.95%

delta(t) ~ 12 min

==> Cannot automatically match based on these dates

Page 19

RQ1: Approach

A patch is the result of the corresponding interaction if and only if both are attached to the same bug report, by the same developer, at the same date and time.
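
This matching rule amounts to a join on three keys. The sketch below is only an illustration: the `Attachment` record, its field names, and the sample values are hypothetical and do not come from the study's tooling.

```python
from collections import namedtuple

# Hypothetical record for anything attached to a bug report.
Attachment = namedtuple("Attachment", "bug_id developer attached_at files")


def match_patches(interactions, patches):
    """Pair each interaction with the patch sharing its bug, developer, and timestamp."""
    by_key = {(p.bug_id, p.developer, p.attached_at): p for p in patches}
    pairs = []
    for interaction in interactions:
        key = (interaction.bug_id, interaction.developer, interaction.attached_at)
        patch = by_key.get(key)
        if patch is not None:
            pairs.append((interaction, patch))
    return pairs


interactions = [
    Attachment(101, "alice", "2010-03-12 10:15", ["f1", "f2"]),
    Attachment(102, "bob", "2010-03-13 09:00", ["f3"]),
]
patches = [
    Attachment(101, "alice", "2010-03-12 10:15", ["f2"]),
    Attachment(102, "bob", "2010-03-14 17:30", ["f3"]),  # different time: no match
]
print(len(match_patches(interactions, patches)))  # 1
```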

Unbalanced matchings

Files are changed but not explored

==> Refactoring (propagation) or interruption period

No common files between the interaction and the patch

==> Changes made without activating the task

The patch is not the result of the interaction
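
The two unbalanced cases reduce to set comparisons between the explored files and the changed files. The following is a minimal sketch; the function name and the returned labels are hypothetical.

```python
def classify_matching(explored_files, changed_files):
    """Label an interaction/patch pair from the files explored and the files changed."""
    explored, changed = set(explored_files), set(changed_files)
    if not explored & changed:
        # Nothing in common: the patch is probably not the result of the interaction.
        return "no common files"
    if changed - explored:
        # Some changed files were never explored (e.g. refactoring or interruption).
        return "changed but not explored"
    return "balanced"


print(classify_matching(["f1", "f2"], ["f2"]))        # balanced
print(classify_matching(["f1", "f2"], ["f2", "f7"]))  # changed but not explored
print(classify_matching(["f1", "f2"], ["f8"]))        # no common files
```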

Page 20

RQ1: Results

After finding where and how to perform the task, developers disable the task before performing the changes

Project     Matching   Unbalanced
ECF               15            2
Mylyn            663          122
PDE              132           27
Platform         218           66
Total          1,028          217

Page 21

RQ1: Results

Spearman correlation: Effort vs. Complexity of the implementation (0.16 to 0.33)

Developers do not necessarily spend more effort on tasks requiring more complex implementations

==> Does some of the effort spent by developers on a task not materialise in the patch?
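
For readers unfamiliar with the statistic, here is a minimal pure-Python sketch of Spearman's rank correlation, computed as the Pearson correlation of average ranks. The sample values are made up for illustration and are not the study's data.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho; undefined (division by zero) if either input is constant."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical effort (minutes) and implementation complexity for four tasks
effort = [6, 30, 12, 45]
complexity = [0.0, 1.5, 0.8, 2.3]
print(spearman(effort, complexity))  # 1.0 (the two rankings are identical)
```

A value between 0.16 and 0.33, as reported on the slide, indicates that the two rankings agree only weakly.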

[Diagram: Effort vs. Complexity of the implementation]

Page 22

How do developers spend their effort?

Additional files may increase developers’ effort and decrease their productivity

How much do developers use additional files?

– Jaccard similarity between the significantly relevant files and the explored files

– On average, developers use about 62% of additional files.

Developers who explore a large number of additional files spend more effort to perform the task
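
The Jaccard similarity between two file sets can be sketched as follows. The helper `additional_share`, which reports the fraction of explored files that are not relevant, is an assumed reading of how the share of additional files relates to the similarity; the file names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two collections of file names."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def additional_share(explored, relevant):
    """Fraction of explored files that are not among the relevant files."""
    explored = set(explored)
    additional = explored - set(relevant)
    return len(additional) / len(explored) if explored else 0.0


explored = ["f1", "f2", "f3", "f4", "f5"]
relevant = ["f1", "f2"]
print(jaccard(relevant, explored))           # 0.4
print(additional_share(explored, relevant))  # 0.6
```

When every relevant file was also explored, the two values sum to 1, so a low similarity means many additional files.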

[Diagram: Additional files and Effort]

Page 23

What are the factors affecting developers’ effort?

Bug severity

Developers' experience

[Diagram: Bug severity and Developers' experience, each linked to Effort by a question mark]

Page 24

Effort vs. Bug Severity

The effort spent to perform easy tasks should differ from the effort spent to perform more complex tasks

Approach

– Make sure that the bug severity indicates the complexity of the implementation

• Kruskal-Wallis test: Bug severity vs. Complexity of the implementation

– Kruskal-Wallis test: Bug severity vs. Effort
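
As a reminder of what the test computes, the sketch below evaluates the Kruskal-Wallis H statistic for groups of values, one group per severity level. It is a simplified version without the tie correction and without the p-value, and the severity groups are made-up numbers, not the study's data.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = sorted(value for group in groups for value in group)
    # Average 1-based rank of each distinct value in the pooled sample.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        i = j + 1
    n = len(pooled)
    weighted = sum(sum(rank_of[v] for v in group) ** 2 / len(group)
                   for group in groups)
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Hypothetical effort values (minutes) grouped by bug severity
effort_by_severity = {
    "minor": [1, 2, 3],
    "normal": [4, 5, 6],
    "major": [7, 8, 9],
}
h = kruskal_wallis_h(list(effort_by_severity.values()))
print(round(h, 2))  # 7.2
```

A large H suggests that at least one severity level has a different distribution; a statistics library would additionally supply the p-value and handle ties.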

Page 25

Effort vs. Bug Severity

Bug severity indicates the complexity of the implementation

Bug severities that required fewer files did not necessarily require fewer changes

Page 26

Effort vs. Bug Severity

The relation between developers’ effort and bug severity is project-dependent

[Diagram: Bug severity, Complexity of the implementation, Effort; label: Project dependent]

Page 27

Effort vs. Experience

Experienced developers would spend less effort than inexperienced developers

Approach

– Compute experience metrics

– Correlation: Experience vs. Effort

Page 28

Effort vs. Experience

Experience metrics

– #Bugs (NB) that the developer fixed before

– #Files (NF) that the developer changed before

– #LOC (NLOC): sum of added and deleted LOC

Task1: f1(2 LOC) and f2(5 LOC)

– First task ==> experience (0, 0, 0)

Task2:

– Experience (1 task, 2 files, 7 LOC)

– C1: Significantly relevant files for Task2 = f2, f3, f4

– C2: Significantly relevant files for Task2 = f6, …, f9

Experience(C1) vs. Experience(C2)?

Page 29

Effort vs. Experience

More experience when the significantly relevant files for a given task have already been used in previous tasks.

==> Two kinds of experience:

Overall Experience (OE): task-independent

(Task) Relevant Experience (RE): based on relevant files

C1: Significantly relevant files for Task2 = f2, f3, f4

– OE = 1 task, 2 files, and 7 LOC

– RE = 1 task, 1 file, and 5 LOC

C2: Significantly relevant files for Task2 = f6, …, f9

– OE = 1 task, 2 files, and 7 LOC

– RE = 1 task, 0 files, and 0 LOC
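
The two kinds of experience can be sketched with one function. This is an illustration under assumptions: the history is a list of past tasks, each mapping a file to the LOC changed in it, and Relevant Experience restricts the file and LOC counts to the significantly relevant files while, as in the example above, the number of previous tasks is left unfiltered. Under this reading, the LOC value for C1 is the LOC previously changed in f2 (5 in Task1). The name `experience` is hypothetical.

```python
def experience(history, relevant_files=None):
    """Return (tasks, files, LOC) of past work.

    history: list of past tasks, each a dict {file: changed LOC}.
    relevant_files: if given, only these files count (Relevant Experience);
    if None, every file counts (Overall Experience).
    """
    changed = {}
    for task in history:
        for file_name, loc in task.items():
            if relevant_files is None or file_name in relevant_files:
                changed[file_name] = changed.get(file_name, 0) + loc
    return len(history), len(changed), sum(changed.values())


history = [{"f1": 2, "f2": 5}]  # Task1

print(experience(history))                            # (1, 2, 7)  OE
print(experience(history, {"f2", "f3", "f4"}))        # (1, 1, 5)  RE for C1
print(experience(history, {"f6", "f7", "f8", "f9"}))  # (1, 0, 0)  RE for C2
```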

Page 30

Effort vs. Experience

No consensus about the benefits of experience on the effort spent

Experience may reduce the effort in a few cases

==> Weak negative correlation between effort and experience

Robbes and Röthlisberger [1] also found a weak negative correlation using the number of commits to measure experience

==> One can use the number of tasks and the number of files changed before to measure developers’ experience

[1] R. Robbes and D. Röthlisberger. Using developer interaction data to compare expertise metrics. MSR ’13

Page 31

Effort vs. Experience

No benefits of experience

When a program evolves, developers increasingly perform tasks on the files that they never used before

[Figure: files never used; labels: More files never used, Less files never used]

Page 32

Conclusion

[Summary diagram: Effort in relation to Complexity of the implementation, Additional files, Bug severity, and Developers' experience; label: Project dependent]

Page 33

Threats to Validity

Construct validity: mismatching (the patch is not the result of the interaction)

– Remove unbalanced matchings

– No guarantee that we didn't miss some matchings

Conclusion validity

– No violation of assumptions of the statistical tests

– No causation : only observations based on correlation and distribution of metrics values

Internal validity

– Only Mylyn's interaction histories

External validity

– Open-source projects

Page 34

Thanks for your attention!
