CHEP 2001
September, 2001 Lassi A. Tuura, Northeastern University http://iguana.cern.ch
Analysing Software Dependencies With Ignominy
Lucas Taylor, Lassi A. Tuura
Northeastern University, Boston
Motivation
- IGUANA is largely an integrator for CMS: we need a good grasp of external software before including it in our system
  - By and large we are not seeking to select one product
  - … but are trying to merge the strengths of several packages into a very good physics analysis environment
  - … and are seeking to provide feedback to component authors
- We are interested in, among other things:
  - How much of the external package we would use
  - Its impact on our physical software structure
  - How well it fits in with the philosophy of CMS software and other imports: in design and architecture, usage patterns, GUI, …
  - What other software it depends on
  - The commitment required, and the possibility of varying how much we use
Examples
See http://iguana.cern.ch/2_4_3/dependencies.html
ignominy: dishonour, disgrace, shame; infamy; the condition of being in disgrace, etc. (Oxford English Dictionary)
Ignominy
ignominy: a suite of Perl and shell scripts plus a number of configuration files (IGUANA)
Model
- Examines and reports on direct and transitive source and binary dependencies
- Creates reports of the collected results
  - As a set of web pages
  - Numerically
  - Graphically
  - As tables
[Diagram labels: Source Code, Build Products, + User-defined logical dependencies, Dependency Database, Metrics, Graphs, Tables]
Dependency Analysis
- Ignominy scans…
  - Make dependency data produced by the compilers (*.d files)
  - Source code for #includes (resolved against the ones actually seen)
  - Shared library dependencies (“ldd” output)
  - Defined and required symbols (“nm” output)
- And maps…
  - Source code and binaries into packages
  - #include dependencies into package dependencies
  - Unresolved/defined symbols into package dependencies
- And warns… about problems and ambiguities (e.g. multiply defined symbols or dependent shared libraries not found)
- Produces a simple text file database for the different dependencies: source only, binaries only, combined, forward and reverse, by package, by domain, …
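The include-scanning and mapping steps listed above might look roughly like the following. This is a minimal sketch, not Ignominy's actual implementation (which is Perl and shell): the rule that a package is the directory holding the file, the function names and the sample paths are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches both #include "x.h" and #include <x.h>.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)

def package_of(path):
    """Hypothetical rule: the package is the path minus its file name."""
    return path.rsplit("/", 1)[0]

def include_dependencies(sources):
    """Map {source path: text} to {package: {package: set of header names}}."""
    # Index files by base name so that includes are resolved only
    # against files that were actually seen.
    by_name = defaultdict(list)
    for path in sources:
        by_name[path.rsplit("/", 1)[-1]].append(path)

    deps = defaultdict(lambda: defaultdict(set))
    for path, text in sources.items():
        for included in INCLUDE_RE.findall(text):
            name = included.rsplit("/", 1)[-1]
            # Several files may share a name: depend on all of them.
            for target in by_name.get(name, []):
                src_pkg, dst_pkg = package_of(path), package_of(target)
                if src_pkg != dst_pkg:
                    deps[src_pkg][dst_pkg].add(name)
    return deps

# Hypothetical two-file source tree.
sources = {
    "Ig_Examples/Reader/main.cc":
        '#include "Ig_Modules/Browser/Model.h"\n#include <vector>\n',
    "Ig_Modules/Browser/Model.h": "class Model {};\n",
}
deps = include_dependencies(sources)
for src, targets in deps.items():
    for dst, headers in targets.items():
        print(src, "->", dst, "via", sorted(headers))
# prints: Ig_Examples/Reader -> Ig_Modules/Browser via ['Model.h']
```

System headers such as <vector> are ignored here simply because no scanned file has that name; a real tool would also consult the compiler-generated *.d files where they exist.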
Caveats
- Ignominy handles only static dependencies, not dynamic ones
  - Indirect calls through pointers, virtual function calls
  - State dependencies: data reads and writes, thread synchronisation, …
- The analysis of external software is heuristic; exact information from the build system helps considerably
  - Difficulties are posed by copied code (copy and paste or merged libraries) and defaults dependent on link order (“dummies” that are supposed to be overridden by the client)
  - Most headaches so far with FORTRAN code
- Ignominy must guess software structure when in doubt
  - Based on project-defined heuristic search rules; usually works fine
  - In the face of an ambiguity Ignominy warns and assumes the worst
    - Multiply defined symbol: dependency on all definitions
    - Multiple header matches: dependency on all (but correct with compiler-generated dependency data!)
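The "warn and assume the worst" rule for multiply defined symbols can be illustrated with a small sketch over "nm"-style output. It is an illustration under stated assumptions, not Ignominy's code: the function names and package names are hypothetical, and only the global symbol types T, D, B and R are treated as definitions, which is a simplification.

```python
from collections import defaultdict

DEFINED = set("TDBR")  # global text/data/bss/read-only symbols; a simplification

def parse_nm(output):
    """Split nm output lines into (defined, required) symbol sets."""
    defined, required = set(), set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        kind, name = fields[-2], fields[-1]
        if kind == "U":
            required.add(name)
        elif kind in DEFINED:
            defined.add(name)
    return defined, required

def symbol_dependencies(nm_by_package):
    """Return ({package: set of packages}, list of warning strings)."""
    parsed = {pkg: parse_nm(out) for pkg, out in nm_by_package.items()}
    providers = defaultdict(set)
    for pkg, (defined, _) in parsed.items():
        for name in defined:
            providers[name].add(pkg)

    warnings = [f"{name} multiply defined in {sorted(pkgs)}"
                for name, pkgs in sorted(providers.items()) if len(pkgs) > 1]

    deps = defaultdict(set)
    for pkg, (defined, required) in parsed.items():
        for name in required - defined:
            # Assume the worst: depend on every package defining the symbol.
            targets = providers.get(name, set()) - {pkg}
            if targets:
                deps[pkg] |= targets
    return deps, warnings

# Hypothetical packages: "draw" is defined twice, so App depends on both.
nm_by_package = {
    "App":  "                 U draw\n0000000000001000 T main\n",
    "LibA": "0000000000002000 T draw\n",
    "LibB": "0000000000003000 T draw\n",
}
deps, warnings = symbol_dependencies(nm_by_package)
print(dict(deps))
print(warnings)
```

A symbol that no scanned package defines simply produces no dependency in this sketch; whether to warn about it is a separate policy choice.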
Single Package Dependencies
Cmscan/IgCmscan
Testing Level: 5
Outgoing edges: 6
- from includes: 6 (145 files)
- from symbols: 4 (636 symbols)
Incoming edges: 1
- from includes: 1 (1 file)
- from symbols: 1 (1 symbol)
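The "Testing Level" in this report is presumably a level number in the sense of Lakos's levelization: a package with no local dependencies is at level 1, and any other package sits one level above its highest dependency. That reading is an assumption, as are the package names below. A minimal sketch for an acyclic dependency graph:

```python
from functools import lru_cache

def levels(deps):
    """Return {package: level} for an acyclic {package: set of packages} graph.

    Packages without dependencies get level 1. Cycles are not handled:
    they would make the recursion below fail.
    """
    @lru_cache(maxsize=None)
    def level(pkg):
        return 1 + max((level(d) for d in deps.get(pkg, ())), default=0)
    return {pkg: level(pkg) for pkg in deps}

# Hypothetical three-package system.
deps = {"App": {"Gui", "Core"}, "Gui": {"Core"}, "Core": set()}
print(levels(deps))
```

The level also suggests a bottom-up testing order: level 1 packages can be tested in isolation, then level 2, and so on.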
Domain Test Plan
Package Impact Diagram
“Used-by” dependencies
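A "used-by" view is the forward dependency graph reversed and followed transitively: everything that could be affected when one package changes. The sketch below illustrates that idea with hypothetical package and function names; it does not show how Ignominy draws its diagrams.

```python
from collections import defaultdict

def reverse(deps):
    """Turn {package: set of dependencies} into {package: set of direct users}."""
    used_by = defaultdict(set)
    for pkg, targets in deps.items():
        for target in targets:
            used_by[target].add(pkg)
    return used_by

def impact(deps, package):
    """All packages that directly or transitively use `package`."""
    used_by = reverse(deps)
    seen, stack = set(), [package]
    while stack:
        for user in used_by.get(stack.pop(), ()):
            if user not in seen:
                seen.add(user)
                stack.append(user)
    seen.discard(package)  # a package in a cycle would otherwise list itself
    return seen

# Hypothetical four-package system.
deps = {"App": {"Gui"}, "Gui": {"Core"}, "Tool": {"Core"}, "Core": set()}
print(sorted(impact(deps, "Core")))
# prints: ['App', 'Gui', 'Tool']
```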
An Extra Dependency
Bad dependency in prototype code; it was traced to bad class placement
1 IgSoReaderAppDriver IgQtTwigBrowser via IgQtTwigModel.h
1 IgSoReaderAppDriver IgQtTwigBrowser via IgQtTwigRep.h
Static vs. Logical
Logical dependencies from packages used through “Interfaces”
Discovering Forms of Modularity
- A fairly good tool for discovering “philosophical structure”
  - IGUANA and Geant4 mostly use direct abstract interfaces
    - The interfaces normally generate “correct” functional dependencies: interface definitions are in packages that obviously imply the function
  - “Plug in one implementation of this interface”
    - Some use in Lizard/AIDA and ROOT
  - All interfaces bundled into “interface” (or framework) packages
    - Used by Lizard/AIDA and ROOT
  - Explicit dynamic loading to solve modularity issues
    - Used extensively by ROOT
  - Fall back on scripts or commands evaluated at run-time
    - Some use in Geant4
    - Used quite a bit in ROOT
Analysis of Anaphe
- Distribution of tools and utilities for LHC-era physics
  - Combination of commercial, free and HEP software
  - Claims to be a toolkit
- Appears to live up to its toolkit claims
  - Good work on modularity
  - Clean design is evident in many places
  - Dependency diagrams often split naturally into functional units
Analysis of ATLAS
- Torture-test exercise for the tool
  - Large release size (~50% F77, ~50% mainly C++ but also C, Java)
  - Near the limit of Ignominy’s ability to discover software structure
  - Pictures below illustrate analysis difficulties
- Visible (and known) problems
  - Many cleanly designed packages shadowed by a cycle with very unpleasant effects on the overall structure
  - A number of places show poor packaging and/or lack of abstract interfaces
[Figure panels: Known by build system; Misconfigured analysis (1.3.2); Improved analysis (1.3.7)]
Analysis of CMS/ORCA
- Large C++ project
  - Deliberately fast development shows in places
  - Good design in key parts has helped
- Recognised problems
  - Especially with the length of the release sequence
  - Clean-up/restructuring necessary soon
    - To some extent starting already
[Figure annotation: ORCA Visualisation needs most of the rest]
Analysis of CMS/COBRA, IGUANA
- COBRA
  - CMS reconstruction, analysis and simulation framework
  - Recently successfully split off from ORCA
  - Quite many small packages
  - Has helped with modularity
    - Some issues with partitioning: some small cycles, certain package groups appear quite frequently
- IGUANA
  - Generic data analysis environment with CMS focus
  - Many fairly small packages with targeted purpose (similar to Anaphe)
  - Project focus as an integrator and glue provider is fairly evident
  - We too have some rats' nests to clean up, but at least they are small…
  - Has had the advantage of considerable monitoring!
Analysis of Geant4
- Fairly large C++ project
  - Very fine-grained (and multi-level) package structuring
  - Seems quite clean from the preliminary analysis
- Fine package subdivision helps in many ways but makes analysis and code understanding more complicated
[Figure annotations: One subsystem seems strongly coupled and needs attention; Need to study the use of the internal command system]
Analysis of ROOT
- ROOT developers have done a formidable job of breaking binary (shared library) dependencies, but…
  - It makes dubious use of its internal scripting facility
  - For example: by static analysis, nothing seems to use the postscript package directly (no incoming dependencies), but there is this code:
void TPad::Print (const char *filename, Option_t *option) {
   […]
   TVirtualPS *psave = gVirtualPS;
   if (gROOT->LoadClass("TPostScript","Postscript")) return;
   gROOT->ProcessLineFast("new TPostScript()");
   gVirtualPS->Open(psname,pstype);
   gVirtualPS->SetBit(kPrintingPS);
   […]
}
- Taking these and global objects into account makes the dependency diagrams very different, and casts doubt on the usefulness of binary-only dependency diagrams for ROOT
  - Sign of fast growth? Need a “next evolutionary step”?
  - So “coherent” that replacing parts could get painful…
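The slides say that user-defined logical dependencies are an input to Ignominy but not how they are collected. One conceivable aid, sketched here purely as an assumption, is a scan for string-based loading calls like the LoadClass call quoted above, producing candidates for a human to review. The function name and the package name "gpad" are hypothetical.

```python
import re

# Matches calls such as gROOT->LoadClass("TPostScript","Postscript").
LOAD_CLASS_RE = re.compile(
    r'LoadClass\s*\(\s*"([^"]+)"\s*,\s*"([^"]+)"\s*\)')

def logical_dependency_candidates(package, text):
    """Return (package, library, class) triples found in the source text."""
    return [(package, library, cls)
            for cls, library in LOAD_CLASS_RE.findall(text)]

# The two relevant lines from the excerpt quoted on the slide.
excerpt = '''
   if (gROOT->LoadClass("TPostScript","Postscript")) return;
   gROOT->ProcessLineFast("new TPostScript()");
'''
print(logical_dependency_candidates("gpad", excerpt))
# prints: [('gpad', 'Postscript', 'TPostScript')]
```

A pattern match like this only finds literal strings; class or library names built at run-time would still need to be recorded by hand.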
Analysis of ROOT…
[Figure panels: Binary only; Binary + Source + Logical = Real]
Package Metrics
Project     Release  Packages  Avg. # of direct deps  Cycles (packages involved)  # of levels  ACD*  CCD*   NCCD*  Size
Anaphe      3.6.1    31        2.6                    --                          8            5.4   167    1.3    630/170k
ATLAS       1.3.2    230       6.3                    2 (92)                      96           70    16211  10     1350k
ATLAS       1.3.7    236       7.0                    2 (92)                      97           77    18263  11     1350k
CMS/ORCA    4.6.0    199       7.4                    7 (22)                      35           24    4815   3.6    420k
CMS/COBRA   5.2.0    87        6.7                    4 (10)                      19           15    1312   2.7    180k
CMS/IGUANA  2.4.2    35        3.9                    --                          6            5.0   174    1.2    150/38k
Geant4      4.3.2    108       7.0                    3 (12)                      21           16    1765   2.8    680k
ROOT        2.25/05  30        6.4                    1 (19)                      22           19    580    4.7    660k
*) John Lakos, Large-Scale C++ Software Design
- Size = total amount of source code (roughly; not normalised across projects!)
- ACD = average component dependency (~ libraries linked in)
- CCD = sum of single-package component dependencies over whole release
  - Indicates testing/integration cost
- NCCD = measure of CCD compared to a balanced binary tree
  - A good toolkit’s NCCD will be close to 1.0
  - < 1.0: structure is flatter than a binary tree (= independent packages)
  - > 1.0: structure is more strongly coupled (vertical or cyclic)
  - Aim: minimise NCCD for given software/functionality
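The three Lakos metrics can be computed from a package dependency graph as sketched below. The normalisation assumed here is Lakos's CCD of a balanced binary tree, (N + 1) log2(N + 1) - N for N packages; the table values are consistent with it (for Anaphe, 167 / 129 ≈ 1.3), but the slides do not state that Ignominy uses exactly this formula. Function names and the sample graph are hypothetical, and every package must appear as a key of the graph.

```python
import math

def component_dependencies(deps):
    """Return {package: number of packages it needs, itself included}."""
    def closure(pkg):
        seen, stack = {pkg}, [pkg]
        while stack:
            for dep in deps.get(stack.pop(), ()):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return seen
    return {pkg: len(closure(pkg)) for pkg in deps}

def ccd_balanced_binary_tree(n):
    """CCD of a balanced binary tree of n packages (Lakos's reference value)."""
    return (n + 1) * math.log2(n + 1) - n

def metrics(deps):
    """Compute CCD, ACD and NCCD for a {package: set of packages} graph."""
    n = len(deps)
    ccd = sum(component_dependencies(deps).values())
    return {"CCD": ccd, "ACD": ccd / n,
            "NCCD": ccd / ccd_balanced_binary_tree(n)}

# A perfect binary tree of seven hypothetical packages: NCCD is 1.0.
tree = {"root": {"a", "b"}, "a": {"a1", "a2"}, "b": {"b1", "b2"},
        "a1": set(), "a2": set(), "b1": set(), "b2": set()}
print(metrics(tree))

# Cross-check against the Anaphe row: 31 packages, CCD 167.
print(round(167 / ccd_balanced_binary_tree(31), 1))
# prints: 1.3
```

Because the transitive closure is computed per package, cycles are handled naturally: every member of a cycle counts all other members, which is why cyclic structures push CCD and NCCD up so sharply.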
Metrics: NCCD vs Cycles
[Figure: scatter plot of NCCD (0 to 12) against Fraction of Packages in Cycles (0% to 70%); points for ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4 and ROOT, with a “Toolkits & Frameworks” label]
Metrics: NCCD vs Size
[Figure: scatter plot of NCCD (0 to 12) against Size (k-lines of source [files]) (0 to 1600); points for ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4 and ROOT, with a “Toolkits & Frameworks” label]
Metrics: NCCD vs ACD
[Figure: scatter plot of NCCD (0 to 12) against Av. Component Deps (Fraction of Packages) (0% to 70%); points for ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4 and ROOT, with a “Toolkits & Frameworks” label]
Metrics: NCCD vs AID
[Figure: scatter plot of NCCD (0 to 12) against Av. Immediate Deps (Fraction of Packages) (0% to 25%); points for ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4 and ROOT, with a “Toolkits & Frameworks” label]
Metrics: Packages vs Size
[Figure: scatter plot of Packages (0 to 250) against Size (Own Only) (0 to 1600); points for ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4 and ROOT, with a “Toolkits & Frameworks” label]
Metrics: Packages vs Size
[Figure: scatter plot of Packages (0 to 250) against Size (All) (0 to 1600); points for ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4 and ROOT, with a “Toolkits & Frameworks” label]
Summary
- Ignominy is a rather simple tool, and as such tremendously helpful in keeping a project on track
  - Especially for keeping external software in check
  - Also for giving hard facts about the project itself
- It provides tools to study a software system structure
  - It should not be used blindly: results must be understood and interpreted correctly; a human is certainly required!
  - We find it valuable: output is now a part of our release documentation
- It doesn’t do everything, but what it does, it seeks to do well
  - Feedback, suggestions for improvements etc. would be most welcome!
  - Planning to add support for Java
- Available for free at http://iguana.cern.ch/
  - See the IGUANA distributions (latest = 2.4.3 recommended)
  - For questions please mail lassi.tuura@cern.ch or iguana-interest@cern.ch