Impact of Design Complexity on Software Quality - A Systematic Review
METRICON 2010, Germany
© Fraunhofer IESE
Apr 13, 2023
Impact of design complexity on software quality
Student: Nguyen Duc Anh
First supervisor: Marcus Ciolkowski, Fraunhofer IESE
Second supervisor: Sebastian Barney, BTH
General supervisor: Prof. Dr. Dr. h.c. Dieter Rombach
Master thesis presentation
Agenda
Motivation
Problem statement
Research methodology
Research results
Threats to validity
Conclusion
Future work
Motivation
High complexity leads to high cost and low quality.
[Diagram: design complexity → impact on cost & quality attributes]
A complex design structure takes more effort to understand, implement, and maintain, and shows increased fault proneness and reduced maintainability.
Problem statement
What is the impact of design complexity on software cost & quality?
SQ1: Which cost & quality attributes are predicted using design complexity metrics?
SQ2: What kind of design complexity metrics are most frequently used in the literature?
SQ3: Which complexity metrics are potential predictors of quality attributes?
SQ4: Is there an overall influence of these metrics on quality attributes? If yes, what are the impacts of those metrics on those attributes?
SQ5: If no, what explains the inconsistency between studies? Is this explanation consistent across different metrics?
Research methodology
What is the impact of design complexity on software cost & quality?
Search for relevant publications
Extract information about design complexity metrics & quality attributes
Extract numerical representation of impact relationship & context factors
Synthesize data & interpret results
Study selection result
Search range: 1960 to 2010
Scope: object-oriented metrics
Search terms: N = 906
Title + abstract exclusion: N = 281
Application of detailed exclusion criteria: N = 85
Full-text exclusion: N = 39
Reference scan, search for grey literature: N = 18
57 primary studies (39 + 18)
Research results
SQ1: Which quality attributes are predicted using software design metrics?
Cost (effort) is excluded due to an insufficient number of studies investigating it.
Probability of a module being faulty (fault proneness)
Effort to maintain a software module (maintainability)
Number of faults per LOC
Probability of a module being changed
Research results
SQ2: What kind of complexity metrics are most frequently used in the literature?
[Bar chart: number of studies per design complexity dimension]
Research results
SQ2: Which complexity metrics are most frequently used in the literature?
Design complexity metrics: Chidamber & Kemerer (CK) metric set (*)

Fault proneness:
Metric                              Type         No. of studies
NOC (Number Of Children)            inheritance  28
DIT (Depth of Inheritance Tree)     inheritance  27
CBO (Coupling Between Objects)      coupling     22
LCOM (Lack of Cohesion in Methods)  cohesion     22
WMC (Weighted Method Count)         scale        22
RFC (Response For a Class)          coupling     21
…                                   …            12

Maintainability:
Metric  Type         No. of studies
WMC     scale        9
RFC     coupling     8
DIT     inheritance  7
NOC     inheritance  6
CBO     coupling     4
LCOM    cohesion     3
…       …            3

(*) S.R. Chidamber and C.F. Kemerer, "A Metrics Suite for Object Oriented Design," IEEE Trans. Softw. Eng., vol. 20, 1994, pp. 476-493.
Research results
SQ3: Which complexity metrics are potential predictors of fault proneness?
Potential prediction – statistical correlation analysis:
Correlation coefficient: Spearman
Odds ratios (estimated from a univariate logistic regression model)
Significant correlation
Vote counting: count the number of reported significant impacts over the total number of studies
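As an illustration of the two association measures named above, here is a minimal Python sketch. The `cbo` and `faulty` vectors are hypothetical module-level data, not data from the primary studies; the odds ratio is computed for the metric dichotomised at its median, which for a binary predictor coincides with exp(β) from a univariate logistic regression.

```python
from statistics import mean

def ranks(xs):
    """1-based ranks, with ties replaced by their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Hypothetical data: CBO value and fault flag for ten modules
cbo    = [2, 9, 4, 12, 1, 7, 3, 15, 6, 11]
faulty = [0, 1, 0, 1, 0, 0, 1, 1, 0, 1]

rho = spearman(cbo, faulty)   # positive: higher coupling, more faults

# Odds ratio with CBO dichotomised at its median; equals exp(beta) of a
# univariate logistic regression when the predictor is binary.
s = sorted(cbo)
median = (s[len(s) // 2 - 1] + s[len(s) // 2]) / 2
high = [f for c, f in zip(cbo, faulty) if c > median]
low  = [f for c, f in zip(cbo, faulty) if c <= median]
a, b = sum(high), len(high) - sum(high)   # faulty / non-faulty above median
c, d = sum(low), len(low) - sum(low)      # faulty / non-faulty below median
odds_ratio = (a * d) / (b * c)
```

On this toy data rho is positive and the odds ratio is well above 1, i.e. modules with above-median coupling are more often faulty.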
Research results
SQ3: Which complexity metrics are potential predictors of fault proneness?
(Example: vote counting for the Spearman correlation coefficient in fault-proneness studies)

Metric      No. of studies  No. of +  No. of -  Non-significant  Proportion of +  Positive impact?
NOC         19              6         1         12               32%              No
DIT         14              2         0         12               14%              No
CBO         17              10        0         7                59%              Yes
LCOM        14              6         0         8                43%              No
WMC         26              18        0         8                69%              Yes
RFC         15              9         0         6                60%              Yes
WMC McCabe  16              11        0         5                69%              Yes
SDMC        6               6         0         0                100%             Yes
AMC         6               6         0         0                100%             Yes
NIM         6               6         0         0                100%             Yes
NCM         6               6         0         0                100%             Yes
NTM         6               6         0         0                100%             Yes

Decision rule: > 50% positive outcomes → positive impact; otherwise no impact.
Except for NOC, DIT, and LCOM, the listed metrics are potential predictors of fault proneness.
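The vote-counting procedure can be reproduced directly from the tabulated counts. The tuples below restate part of the table; the decision rule is the >50% threshold used above.

```python
# (metric, significant +, significant -, non-significant) from the table above
votes = [
    ("NOC", 6, 1, 12), ("DIT", 2, 0, 12), ("CBO", 10, 0, 7),
    ("LCOM", 6, 0, 8), ("WMC", 18, 0, 8), ("RFC", 9, 0, 6),
]

def vote_count(pos, neg, nonsig):
    """Proportion of outcomes with a significant positive impact, and verdict."""
    ratio = pos / (pos + neg + nonsig)
    return ratio, ratio > 0.5   # > 50% of outcomes => potential predictor

for name, p, n, ns in votes:
    ratio, is_predictor = vote_count(p, n, ns)
    print(f"{name}: {ratio:.0%} positive -> {'Yes' if is_predictor else 'No'}")
```

Running this reproduces the proportions and Yes/No verdicts of the table (CBO 59% Yes, LCOM 43% No, and so on).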
Research results
SQ3: Which complexity metrics are potential predictors of fault proneness?
Strength of correlation (*): trivial / small / medium / large
(*) J. Cohen, Statistical Power Analysis for the Behavioral Sciences, Lawrence Erlbaum, Hillsdale, New Jersey, 1988.
Meta-analysis:
Synthesize reported correlation coefficients
Assess the agreement among studies on the aggregated result
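The four strength bands can be encoded as a small classifier. The slide names only the bands, so the numeric cut-offs below (0.1, 0.3, 0.5) are Cohen's conventional thresholds for correlations, not values stated in the thesis.

```python
def correlation_strength(r):
    """Classify |r| using Cohen's conventional cut-offs for correlations."""
    r = abs(r)
    if r < 0.1:
        return "trivial"
    if r < 0.3:
        return "small"
    if r < 0.5:
        return "medium"
    return "large"

# e.g. an aggregated coefficient of 0.31 falls in the "medium" band
print(correlation_strength(0.31))
```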
Research results
SQ4: Is there an overall influence of these metrics on fault proneness?
[Plot: 95% confidence intervals of the aggregated correlation coefficient between each metric and fault proneness, on a trivial / small / medium / large scale]
Scale and coupling metrics correlate more strongly than cohesion and inheritance metrics.
LOC correlates most strongly with fault proneness.
Research results
SQ4: Is there an overall influence of these metrics on fault proneness?
(Example: meta-analysis of the Spearman coefficient of metric RFC in fault-proneness studies)
[Forest plot of RFC]
Aggregated results:
Global Spearman coefficient: 0.31
95% confidence interval: [0.22; 0.40]
p-value: 0.000
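The slides do not spell out the pooling formula; a standard fixed-effect aggregation of correlation coefficients uses Fisher's z transform with inverse-variance weights, sketched below. The per-study coefficients and sample sizes are made up for illustration, not taken from the primary studies.

```python
import math

def pool_correlations(studies):
    """Fixed-effect meta-analysis of correlations via Fisher's z.

    studies: list of (r, n) pairs. Returns (pooled_r, ci_low, ci_high),
    where the 95% CI comes from the normal approximation on the z scale.
    """
    wsum = sum(n - 3 for _, n in studies)                 # inverse-variance weights
    zbar = sum((n - 3) * 0.5 * math.log((1 + r) / (1 - r))
               for r, n in studies) / wsum
    se = 1 / math.sqrt(wsum)
    back = lambda z: (math.exp(2 * z) - 1) / (math.exp(2 * z) + 1)  # inverse (tanh)
    return back(zbar), back(zbar - 1.96 * se), back(zbar + 1.96 * se)

# Hypothetical per-study Spearman coefficients for one metric, with sample sizes
r, lo, hi = pool_correlations([(0.25, 120), (0.40, 80), (0.30, 200)])
```

Each coefficient is mapped to the z scale, averaged with weights n − 3, and the pooled z is transformed back to a correlation, which is how a global coefficient with a confidence interval like the one above is obtained.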
Research results
SQ4: Is there an overall influence of these metrics on fault proneness?
Is this result consistent across studies?
I² test for heterogeneity: 0% = studies totally agree, 100% = totally disagree; < 40% small, 40-70% medium, > 70% high heterogeneity.
(Example: meta-analysis of the Spearman coefficient of metric RFC in fault-proneness studies)
RFC: I² = 78%

Metric  I²
CBO     95%
DIT     83%
NOC     75%
LCOM    74%
RFC     78%
WMC     93%
LOC     84%
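I² is derived from Cochran's Q statistic; a sketch on the Fisher-z scale follows, again with hypothetical study data (the thesis reports only the resulting percentages).

```python
import math

def i_squared(studies):
    """Higgins' I^2 (in percent) from Cochran's Q over Fisher-z correlations.

    studies: list of (r, n) pairs; 0% = studies agree, 100% = total disagreement.
    """
    zw = [(0.5 * math.log((1 + r) / (1 - r)), n - 3) for r, n in studies]
    wsum = sum(w for _, w in zw)
    zbar = sum(w * z for z, w in zw) / wsum
    q = sum(w * (z - zbar) ** 2 for z, w in zw)   # Cochran's Q
    df = len(studies) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
```

Identical study results give I² = 0%; strongly diverging coefficients push it toward 100%, as with the values above.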
Research results
SQ4*: How many cases are needed to draw a statistically significant conclusion?
(Example: power analysis for the Spearman coefficient of metric RFC in fault-proneness studies)
α value: 0.1
Tails: 2
Expected effect size: 0.31
Expected power: 80%
Number of cases needed: 60
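The sample-size figure can be approximated with the standard Fisher-z power formula for correlations. The slide, presumably produced with a power-analysis tool, reports 60 cases; the textbook approximation below, which adds 3 to the z-based count, lands at 64, so the exact constant is tool-dependent.

```python
import math
from statistics import NormalDist

def n_for_correlation(r, alpha=0.1, power=0.80, tails=2):
    """Approximate n needed to detect correlation r (Fisher-z method)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)  # 1.645 for alpha=0.1, 2 tails
    z_beta = NormalDist().inv_cdf(power)               # 0.842 for 80% power
    c = 0.5 * math.log((1 + r) / (1 - r))              # Fisher transform of r
    return math.ceil(((z_alpha + z_beta) / c) ** 2 + 3)

n = n_for_correlation(0.31)
```

Larger expected effects need fewer cases: at r = 0.5 the same formula requires well under half the sample.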
Research results
SQ5: What explains the inconsistency between studies? Is this explanation consistent across different metrics?
Moderator variables:
Programming language: C++ & Java
Project type: open source, closed source academic & closed source industry
Defect collection phase: pre-release defects & post-release defects
Business domain: embedded systems & information systems
Dataset size: small, medium & large
Are the correlations different across each moderator variable?
Research results
SQ5: What explains the inconsistency between studies? Is this explanation consistent across different metrics?
Variance explained by each moderator variable, in percent:

Metric  Prog. language  Project type  Defect col. phase  Business domain  Dataset size
CBO     6%              4%            83%                4%               8%
DIT     3%              0%            20%                0%               1%
NOC     34%             24%           15%                22%              14%
LCOM    1%              0%            60%                0%               6%
RFC     5%              3%            78%                3%               2%
WMC     32%             4%            60%                4%               3%
LOC     7%              2%            51%                15%              0%

Remaining inconsistency is still excessive.
There is no consistent explanation for heterogeneity across metrics.
Comparison of results with perception in literature
Vote counting & meta-analysis vs. common claims in the literature:

Common claim in literature | In lit. | Ours
The more classes a given class is coupled to, the more likely that class is faulty | Yes | Yes
The more methods that can potentially be executed in response to a message received by an object of a given class, the more likely that class is faulty | Yes | Yes
The deeper the inheritance tree of a given class, the more likely that class is faulty | Yes | No
The more immediate sub-classes a given class has, the more likely that class is faulty | No | No
The less similar the methods within a given class are, the more likely that class is faulty | Yes | No
The more local methods a given class has, the more likely that class is faulty | Yes | Yes
The larger a given class is, the more likely that class is faulty | Yes | Yes
The effects of CK metrics differ across programming languages | Yes | No
The effects of CK metrics differ between embedded systems and information systems | Unknown | No
Design metrics are stronger predictors than LOC | No | No
Limitations
Internal validity:
Selection of publications
Quality of selected studies
External validity:
Limited to models with a single complexity metric
Limited to object-oriented systems
Conclusion validity:
Lack of comparable studies
Lack of reported context information
Conclusion
SQ1: Most commonly predicted attributes: fault proneness & maintainability
SQ2: Most common design complexity dimensions & metrics:
Coupling: CBO, RFC
Scale: WMC
Inheritance: DIT, NOC
Cohesion: LCOM
SQ3/SQ4: Overall impact of design complexity on software quality:
Moderate impact of WMC, CBO, RFC on fault proneness
LOC shows the strongest impact on fault proneness
SQ5: What explains the inconsistency between studies?
The inconsistency could not be fully explained
The defect collection phase explains part of the inconsistency
Interpretation
Look for quality predictors in source code: LOC
Look for quality predictors in design: CBO, RFC, and WMC
Build separate prediction models for pre-release and post-release defects
Context information is needed to increase predictive performance
Adapt the design metrics to the software system at hand
Future work
Construction of a generic prediction model
Quality benchmarking
[Diagram: metric profiles (CBO, RFC, WMC, LCOM, DIT) of System A and System B, together with the context setting, feed model selection & modification to produce prediction models]
Q&A