Impact of Design Complexity on Software Quality - A Systematic Review
METRICON 2010, Germany
© Fraunhofer IESE
Apr 13, 2023
Impact of design complexity on software quality
Student: Nguyen Duc Anh
First supervisor: Marcus Ciolkowski, Fraunhofer IESE
Second supervisor: Sebastian Barney, BTH
General supervisor: Prof. Dr. Dr. h.c. Dieter Rombach
Master thesis presentation
Agenda
Motivation
Problem statement
Research methodology
Research results
Threats to validity
Conclusion
Future work
Motivation
High complexity leads to high cost and low quality.
[Diagram: design complexity → impact on cost & quality attributes]
A complex design structure takes more effort to understand, implement, and maintain, and shows increased fault proneness and reduced maintainability.
Problem statement
What is the impact of design complexity on software cost & quality?
SQ1: Which cost & quality attributes are predicted using design complexity metrics?
SQ2: What kind of design complexity metrics are most frequently used in the literature?
SQ3: Which complexity metrics are potential predictors of quality attributes?
SQ4: Is there an overall influence of these metrics on quality attributes? If yes, what are the impacts of those metrics on those attributes?
SQ5: If no, what explains the inconsistency between studies? Is this explanation consistent across different metrics?
Research methodology
What is the impact of design complexity on software cost & quality?
Search for relevant publications
Extract information about design complexity metrics & quality attributes
Extract numerical representation of impact relationship & context factors
Synthesize data & interpret results
Study selection result
Search range: 1960 to 2010
Scope: object-oriented metrics
Search terms: N = 906
Title + abstract exclusion: N = 281
Application of detailed exclusion criteria: N = 85
Full-text exclusion: N = 39
Reference scan, search for grey literature: N = 18
57 primary studies (39 + 18)
Research results
SQ1: Which quality attributes are predicted using software design metrics?
Cost (effort) is excluded due to an insufficient number of studies investigating it.
Probability of a module being faulty (fault proneness)
Effort to maintain a software module (maintainability)
Number of faults per LOC
Probability of a module being changed
Research results
SQ2: What kind of complexity metrics are most frequently used in the literature?
[Bar chart: number of studies per design complexity dimension]
Research results
SQ2: Which complexity metrics are most frequently used in the literature?
Design complexity metrics: Chidamber & Kemerer (CK) metric set (*)

Fault proneness:
Metric                              Type         No. of studies
NOC (Number Of Children)            inheritance  28
DIT (Depth of Inheritance Tree)     inheritance  27
CBO (Coupling Between Objects)      coupling     22
LCOM (Lack of Cohesion in Methods)  cohesion     22
WMC (Weighted Method Count)         scale        22
RFC (Response For a Class)          coupling     21
…                                   …            12

Maintainability:
Metric  Type         No. of studies
WMC     scale        9
RFC     coupling     8
DIT     inheritance  7
NOC     inheritance  6
CBO     coupling     4
LCOM    cohesion     3
…       …            3

(*) S.R. Chidamber and C.F. Kemerer, "A Metrics Suite for Object Oriented Design," IEEE Trans. Softw. Eng., vol. 20, 1994, pp. 476-493.
Research results
SQ3: Which complexity metrics are potential predictors of fault proneness?
Potential prediction – statistical correlation analysis:
Correlation coefficient: Spearman
Odds ratios (estimated from a univariate logistic regression model)
Significant correlation
Vote counting: count the number of reported significant impacts over the total number of studies
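As an illustration of the two association measures named above, here is a minimal Python sketch. The `cbo` and `faulty` vectors are hypothetical module-level data, not data from the primary studies; the odds ratio is computed for the metric dichotomised at its median, which for a binary predictor coincides with exp(β) from a univariate logistic regression.

```python
from statistics import mean

def ranks(xs):
    """1-based ranks, with ties replaced by their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Hypothetical data: CBO value and fault flag for ten modules
cbo    = [2, 9, 4, 12, 1, 7, 3, 15, 6, 11]
faulty = [0, 1, 0, 1, 0, 0, 1, 1, 0, 1]

rho = spearman(cbo, faulty)   # positive: higher coupling, more faults

# Odds ratio with CBO dichotomised at its median; equals exp(beta) of a
# univariate logistic regression when the predictor is binary.
s = sorted(cbo)
median = (s[len(s) // 2 - 1] + s[len(s) // 2]) / 2
high = [f for c, f in zip(cbo, faulty) if c > median]
low  = [f for c, f in zip(cbo, faulty) if c <= median]
a, b = sum(high), len(high) - sum(high)   # faulty / non-faulty above median
c, d = sum(low), len(low) - sum(low)      # faulty / non-faulty below median
odds_ratio = (a * d) / (b * c)
```

On this toy data rho is positive and the odds ratio is well above 1, i.e. modules with above-median coupling are more often faulty.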
Research results
SQ3: Which complexity metrics are potential predictors of fault proneness?
(Example: vote counting for the Spearman correlation coefficient in fault-proneness studies)

Metric      No. of studies  No. of +  No. of -  Non-significant  Proportion of +  Positive impact?
NOC         19              6         1         12               32%              No
DIT         14              2         0         12               14%              No
CBO         17              10        0         7                59%              Yes
LCOM        14              6         0         8                43%              No
WMC         26              18        0         8                69%              Yes
RFC         15              9         0         6                60%              Yes
WMC McCabe  16              11        0         5                69%              Yes
SDMC        6               6         0         0                100%             Yes
AMC         6               6         0         0                100%             Yes
NIM         6               6         0         0                100%             Yes
NCM         6               6         0         0                100%             Yes
NTM         6               6         0         0                100%             Yes

Decision rule: > 50% positive outcomes → positive impact; otherwise no impact.
Except for NOC, DIT, and LCOM, the listed metrics are potential predictors of fault proneness.
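The vote-counting procedure can be reproduced directly from the tabulated counts. The tuples below restate part of the table; the decision rule is the >50% threshold used above.

```python
# (metric, significant +, significant -, non-significant) from the table above
votes = [
    ("NOC", 6, 1, 12), ("DIT", 2, 0, 12), ("CBO", 10, 0, 7),
    ("LCOM", 6, 0, 8), ("WMC", 18, 0, 8), ("RFC", 9, 0, 6),
]

def vote_count(pos, neg, nonsig):
    """Proportion of outcomes with a significant positive impact, and verdict."""
    ratio = pos / (pos + neg + nonsig)
    return ratio, ratio > 0.5   # > 50% of outcomes => potential predictor

for name, p, n, ns in votes:
    ratio, is_predictor = vote_count(p, n, ns)
    print(f"{name}: {ratio:.0%} positive -> {'Yes' if is_predictor else 'No'}")
```

Running this reproduces the proportions and Yes/No verdicts of the table (CBO 59% Yes, LCOM 43% No, and so on).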
Research results
SQ3: Which complexity metrics are potential predictors of fault proneness?
Strength of correlation (*): trivial / small / medium / large
(*) J. Cohen, Statistical Power Analysis for the Behavioral Sciences, Lawrence Erlbaum, Hillsdale, New Jersey, 1988.
Meta-analysis:
Synthesize reported correlation coefficients
Assess the agreement among studies on the aggregated result
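The four strength bands can be encoded as a small classifier. The slide names only the bands, so the numeric cut-offs below (0.1, 0.3, 0.5) are Cohen's conventional thresholds for correlations, not values stated in the thesis.

```python
def correlation_strength(r):
    """Classify |r| using Cohen's conventional cut-offs for correlations."""
    r = abs(r)
    if r < 0.1:
        return "trivial"
    if r < 0.3:
        return "small"
    if r < 0.5:
        return "medium"
    return "large"

# e.g. an aggregated coefficient of 0.31 falls in the "medium" band
print(correlation_strength(0.31))
```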
Research results
SQ4: Is there an overall influence of these metrics on fault proneness?
[Plot: 95% confidence intervals of the aggregated correlation coefficient between each metric and fault proneness, on a trivial / small / medium / large scale]
Scale and coupling metrics correlate more strongly than cohesion and inheritance metrics.
LOC correlates most strongly with fault proneness.
Research results
SQ4: Is there an overall influence of these metrics on fault proneness?
(Example: meta-analysis of the Spearman coefficient of metric RFC in fault-proneness studies)
[Forest plot of RFC]
Aggregated results:
Global Spearman coefficient: 0.31
95% confidence interval: [0.22; 0.40]
p-value: 0.000
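The slides do not spell out the pooling formula; a standard fixed-effect aggregation of correlation coefficients uses Fisher's z transform with inverse-variance weights, sketched below. The per-study coefficients and sample sizes are made up for illustration, not taken from the primary studies.

```python
import math

def pool_correlations(studies):
    """Fixed-effect meta-analysis of correlations via Fisher's z.

    studies: list of (r, n) pairs. Returns (pooled_r, ci_low, ci_high),
    where the 95% CI comes from the normal approximation on the z scale.
    """
    wsum = sum(n - 3 for _, n in studies)                 # inverse-variance weights
    zbar = sum((n - 3) * 0.5 * math.log((1 + r) / (1 - r))
               for r, n in studies) / wsum
    se = 1 / math.sqrt(wsum)
    back = lambda z: (math.exp(2 * z) - 1) / (math.exp(2 * z) + 1)  # inverse (tanh)
    return back(zbar), back(zbar - 1.96 * se), back(zbar + 1.96 * se)

# Hypothetical per-study Spearman coefficients for one metric, with sample sizes
r, lo, hi = pool_correlations([(0.25, 120), (0.40, 80), (0.30, 200)])
```

Each coefficient is mapped to the z scale, averaged with weights n − 3, and the pooled z is transformed back to a correlation, which is how a global coefficient with a confidence interval like the one above is obtained.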
Research results
SQ4: Is there an overall influence of these metrics on fault proneness?
Is this result consistent across studies?
I² test for heterogeneity: 0% = studies totally agree, 100% = totally disagree; < 40% small, 40-70% medium, > 70% high heterogeneity.
(Example: meta-analysis of the Spearman coefficient of metric RFC in fault-proneness studies)
RFC: I² = 78%

Metric  I²
CBO     95%
DIT     83%
NOC     75%
LCOM    74%
RFC     78%
WMC     93%
LOC     84%
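I² is derived from Cochran's Q statistic; a sketch on the Fisher-z scale follows, again with hypothetical study data (the thesis reports only the resulting percentages).

```python
import math

def i_squared(studies):
    """Higgins' I^2 (in percent) from Cochran's Q over Fisher-z correlations.

    studies: list of (r, n) pairs; 0% = studies agree, 100% = total disagreement.
    """
    zw = [(0.5 * math.log((1 + r) / (1 - r)), n - 3) for r, n in studies]
    wsum = sum(w for _, w in zw)
    zbar = sum(w * z for z, w in zw) / wsum
    q = sum(w * (z - zbar) ** 2 for z, w in zw)   # Cochran's Q
    df = len(studies) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
```

Identical study results give I² = 0%; strongly diverging coefficients push it toward 100%, as with the values above.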
Research results
SQ4*: How many cases are needed to draw a statistically significant conclusion?
(Example: power analysis for the Spearman coefficient of metric RFC in fault-proneness studies)
α value: 0.1
Tails: 2
Expected effect size: 0.31
Expected power: 80%
Number of cases needed: 60
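The sample-size figure can be approximated with the standard Fisher-z power formula for correlations. The slide, presumably produced with a power-analysis tool, reports 60 cases; the textbook approximation below, which adds 3 to the z-based count, lands at 64, so the exact constant is tool-dependent.

```python
import math
from statistics import NormalDist

def n_for_correlation(r, alpha=0.1, power=0.80, tails=2):
    """Approximate n needed to detect correlation r (Fisher-z method)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)  # 1.645 for alpha=0.1, 2 tails
    z_beta = NormalDist().inv_cdf(power)               # 0.842 for 80% power
    c = 0.5 * math.log((1 + r) / (1 - r))              # Fisher transform of r
    return math.ceil(((z_alpha + z_beta) / c) ** 2 + 3)

n = n_for_correlation(0.31)
```

Larger expected effects need fewer cases: at r = 0.5 the same formula requires well under half the sample.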
Research results
SQ5: What explains the inconsistency between studies? Is this explanation consistent across different metrics?
Moderator variables:
Programming language: C++ & Java
Project type: open source, closed source academic & closed source industry
Defect collection phase: pre-release defects & post-release defects
Business domain: embedded systems & information systems
Dataset size: small, medium & large
Are the correlations different across each moderator variable?
Research results
SQ5: What explains the inconsistency between studies? Is this explanation consistent across different metrics?
Variance explained by each moderator variable, in percent:

Metric  Prog. language  Project type  Defect col. phase  Business domain  Dataset size
CBO     6%              4%            83%                4%               8%
DIT     3%              0%            20%                0%               1%
NOC     34%             24%           15%                22%              14%
LCOM    1%              0%            60%                0%               6%
RFC     5%              3%            78%                3%               2%
WMC     32%             4%            60%                4%               3%
LOC     7%              2%            51%                15%              0%

Remaining inconsistency is still excessive.
There is no consistent explanation for heterogeneity across metrics.
Comparison of results with perception in literature
Vote counting & meta-analysis vs. common claims in the literature:

Common claim in literature | In lit. | Ours
The more classes a given class is coupled to, the more likely that class is faulty | Yes | Yes
The more methods that can potentially be executed in response to a message received by an object of a given class, the more likely that class is faulty | Yes | Yes
The deeper the inheritance tree of a given class, the more likely that class is faulty | Yes | No
The more immediate sub-classes a given class has, the more likely that class is faulty | No | No
The less similar the methods within a given class are, the more likely that class is faulty | Yes | No
The more local methods a given class has, the more likely that class is faulty | Yes | Yes
The larger a given class is, the more likely that class is faulty | Yes | Yes
The effects of CK metrics differ across programming languages | Yes | No
The effects of CK metrics differ between embedded systems and information systems | Unknown | No
Design metrics are stronger predictors than LOC | No | No
Limitations
Internal validity:
Selection of publications
Quality of selected studies
External validity:
Limited to models with a single complexity metric
Limited to object-oriented systems
Conclusion validity:
Lack of comparable studies
Lack of reported context information
Conclusion
SQ1: Most commonly predicted attributes: fault proneness & maintainability
SQ2: Most common design complexity dimensions & metrics:
Coupling: CBO, RFC
Scale: WMC
Inheritance: DIT, NOC
Cohesion: LCOM
SQ3/SQ4: Overall impact of design complexity on software quality:
Moderate impact of WMC, CBO, RFC on fault proneness
LOC shows the strongest impact on fault proneness
SQ5: What explains the inconsistency between studies?
The inconsistency could not be fully explained
The defect collection phase explains part of the inconsistency
Interpretation
Look for quality predictors in source code: LOC
Look for quality predictors in design: CBO, RFC, and WMC
Build separate prediction models for pre-release and post-release defects
Context information is needed to increase predictive performance
Adapt the design metrics to the software system at hand
Future work
Construction of a generic prediction model
Quality benchmarking
[Diagram: metric profiles (CBO, RFC, WMC, LCOM, DIT) of System A and System B, together with the context setting, feed model selection & modification to produce prediction models]
Q&A