Transferring Evolutionary Couplings to Industry
Piërre van de Laar
Embedded Systems Institute
2009 October 14
Introducing Piërre van de Laar
’89 - ’94 Master - Catholic University Nijmegen, The Netherlands
Theoretical and Computational Physics
’94 - ’98 PhD - Catholic University Nijmegen, The Netherlands
Selection in Neural Information Processing
’98 - ’00 Philips Research, TU/e, and KUN
From PhD in natural sciences to computer scientist
’00 - ’04 Philips Research Eindhoven, The Netherlands
Resource-limited Component-Based Software Architecting
’04 - ’06 Philips Research @ Embedded Systems Institute
Television Related Architecture and Design to Enhance Reliability
’06 - now Research Fellow of Embedded Systems Institute
Darwin: an Industry-as-Laboratory project on evolvability
2Darwin Project © 2009 ESI/Philips04/20/23
Embedded Systems Institute
Founded by Industry and Academia:
Partners in Darwin
PH-MRI Best
&
Philips Research
The Evolvability Challenge
• PII (Philips Medical Informatics Infrastructure)
• Intervention (HIFU)
• Multi-modality, e.g. PET
• Product integration
• Cost Reduction
• New RF Coils
Darwin Project Goal
specific methods, techniques, and patterns
to improve the evolvability
of product families
within industrial constraints
and while maintaining other qualities
• very relevant for MRI, also relevant for others
• (partially) validated
• faster to market, less effort, more predictable
• market response to anticipated and unexpected changes
• based on modeling and reference architectures
• patient throughput, system responsiveness, image quality, safety, reliability
• diverse products, installed base diversity
• scientifically sound, suitable for PhD
• people, process, project duration, and cost
Impact of change
• Scan with table down
• Table contains motor
• Motor moves with table movement
• Motor contains iron
• Moving iron changes magnetic field

• Change in BIOS setting crashed viewing software
• New video-card driver crashed viewing software
Change Propagation
• Changes propagate along couplings
• Couplings caused by
  – Calls from caller to callee
  – Include from includer to includee
  – Inheritance from derived class to base class
  – Communication from sender to receiver
  – …
• Direct and indirect coupling
  – Transitive closure
How to find all couplings?
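The step from direct to indirect coupling can be illustrated with a minimal sketch. It assumes the direct couplings are already known as (source, target) pairs; the module names are hypothetical and not taken from the MRI code base.

```python
def transitive_closure(direct):
    """Map each module to all modules it is directly or indirectly coupled to."""
    graph = {}
    for source, target in direct:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())
    closure = {}
    for start in graph:
        seen = set()
        stack = list(graph[start])
        # Depth-first walk along the couplings leaving `start`.
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph[node])
        closure[start] = seen
    return closure

# Hypothetical direct couplings (caller -> callee, includer -> includee, ...).
direct_couplings = [("ui", "scanner"), ("scanner", "table"), ("table", "motor")]
closure = transitive_closure(direct_couplings)
print(sorted(closure["ui"]))
```

A change to "motor" can then reach every module whose closure contains it, even when no direct call or include exists between them.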
• Evolutionary couplings are based on historical evidence of couplings
• Software is developed incrementally
• Often a feature is developed, tested, accepted, and checked-in as one increment
• Files regularly checked-in together are coupled
• Mine change history to find couplings
CouplingViewer
• Visualize evolutionary couplings
• According to state-of-art
• Try to transfer to industry
  – What are the industrial requirements for evolutionary couplings?
  – Are all industrial requirements met by the state-of-art?
  – If not, what research directions for improvement exist?
How to visualize couplings?
Columns: modules; rows: changes
         m1 m2 m3 m4 m5 m6 m7 m8 …
change1   1  1  1  0  0  1  0  0 …
change2   1  0  0  1  1  0  0  0 …
change3   0  1  0  1  0  0  0  1 …
change4   0  0  1  0  0  0  1  0 …
change5   0  0  1  0  0  1  1  1 …
change6   1  0  0  0  1  0  1  0 …
…         …  …  …  …  …  …
• Too many changes
• Too many modules
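As a sketch of how such a change/module matrix can be reduced to pairwise information, the snippet below counts how often each pair of modules changed together. It uses the six example rows from the slide; the function name `co_change_counts` is an illustration only, not part of CouplingViewer.

```python
from collections import Counter
from itertools import combinations

modules = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"]
matrix = [
    [1, 1, 1, 0, 0, 1, 0, 0],  # change1
    [1, 0, 0, 1, 1, 0, 0, 0],  # change2
    [0, 1, 0, 1, 0, 0, 0, 1],  # change3
    [0, 0, 1, 0, 0, 0, 1, 0],  # change4
    [0, 0, 1, 0, 0, 1, 1, 1],  # change5
    [1, 0, 0, 0, 1, 0, 1, 0],  # change6
]

def co_change_counts(modules, matrix):
    """Count, per pair of modules, the number of changes touching both."""
    counts = Counter()
    for row in matrix:
        changed = [m for m, flag in zip(modules, row) if flag]
        for pair in combinations(changed, 2):
            counts[pair] += 1
    return counts

counts = co_change_counts(modules, matrix)
print(counts[("m1", "m5")], counts[("m3", "m7")])
```

The result has one number per module pair instead of one row per change, which addresses "too many changes" but not yet "too many modules".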
How to visualize couplings?
• Too many changes
  – Similarity measures between modules
    • Absolute vs relative
    • Symmetric
      – Except one
• Too many modules
  Tens of thousands of files, hundreds of components, tens of subsystems
  – Use hierarchy to limit entities visible at a time
  – Use navigation to make all entities visible
Historical Evidence
• Postlists
  – Tested by integrator
  – Signed off
  – Explicit change set
• ClearCase
  – Backup of a day’s work
  – Not tested
  – Heuristic to get change set
Different conceptual models
• Not all changes for one reason
  – Merges of multiple bug-fixes and features from one branch to another
  – Ignored based on naming convention
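The slides do not say which heuristic was used to derive change sets from ClearCase. One common option, shown here purely as an assumption, is a sliding time window: check-ins by the same user are grouped as long as each follows the previous one within a fixed interval. The record layout, the 30-minute window, and the log entries are all hypothetical.

```python
from datetime import datetime, timedelta

def group_checkins(checkins, window=timedelta(minutes=30)):
    """Group (user, timestamp, path) check-ins into approximate change sets."""
    change_sets = []
    previous_user = previous_time = None
    for user, when, path in sorted(checkins):
        # Start a new change set on a new user or after a long pause.
        if user != previous_user or when - previous_time > window:
            change_sets.append({"user": user, "files": set()})
        change_sets[-1]["files"].add(path)
        previous_user, previous_time = user, when
    return change_sets

# Hypothetical check-in log.
log = [
    ("alice", datetime(2009, 10, 14, 9, 0), "a.c"),
    ("bob", datetime(2009, 10, 14, 9, 5), "a.c"),
    ("alice", datetime(2009, 10, 14, 9, 10), "b.c"),
    ("alice", datetime(2009, 10, 14, 14, 0), "c.c"),
]
change_sets = group_checkins(log)
print(len(change_sets))
```

Unlike a postlist, a change set reconstructed this way is a guess: unrelated edits made close together in time end up in the same set.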
Coupling Information
[Figure labels: # changed A; # changed together; # changed B]
Number of times entities changed together
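The three counts in the figure are enough to compute several similarity measures. The slides do not name the measures used, so the sketch below assumes three that are common in the co-change literature: support (absolute), confidence (relative and directed), and the Jaccard index (relative and symmetric).

```python
def coupling_measures(changed_a, changed_b, changed_together):
    """Derive similarity measures from the three counts in the figure."""
    union = changed_a + changed_b - changed_together
    return {
        # Absolute: raw number of joint changes.
        "support": changed_together,
        # Relative and directed: share of A's changes that also touched B.
        "confidence_a_to_b": changed_together / changed_a if changed_a else 0.0,
        "confidence_b_to_a": changed_together / changed_b if changed_b else 0.0,
        # Relative and symmetric.
        "jaccard": changed_together / union if union else 0.0,
    }

measures = coupling_measures(changed_a=10, changed_b=2, changed_together=2)
print(measures)
```

Confidence is an example of a measure that is not symmetric: in the call above, B always changes together with A, while A mostly changes without B.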
CouplingViewer
Identical to standard hierarchy
• Tens of subsystems
• Hundreds of building blocks
• Hundreds of thousands of files
CouplingViewer
Zoom into “reason”
• Explain coupling
Files changed together
Tens of thousands of postlists
CouplingViewer
Zoom into “reason”
Example: exam card & patient support table due to contrast fluid
Transfer to industry
• Failed
  – Pilot study 20 software developers & architects
  – Performance requirement not met
    • Too many false positives and false negatives
• Quantitative experiment
  – 15% false positives
  – At least 6.3% false negatives
  – First measurement in literature
• Is state-of-art performance
  – Feedback MSR 2009 & Microsoft Research
Threats to validity
• CouplingViewer replacement of or addition to the tool set
• Differences between developers and architects
• A lot of data, yet no information: no guidance to relevant couplings
A more dedicated tool is also more expensive (economy of scale)
Analysis of faults
• What causes the false positives and negatives?
• Do research directions for improvement exist?
Not all modules changed are related
[Figure: assumed vs. actual couplings, two examples]
• Mozilla: pref-appearance.dtd, pref-applications.dtd, pref-applications-edit.dtd, pref-advanced.dtd, /base directory
• C compiler: a.c, b.c, c.c, d.c
How to capture couplings?
• More information is needed than just “files changed together”
• What information?
  – Directed or not?
  – Dependencies on not-changed entities?
• Where to get the information?
  – Capture while editing by development environment
  – Ask developer on check-in
    • Everything or confirmation only
Hierarchy of changes
[Figure: hierarchy of changes, levels 1 to 5]
• new feature
• existing product
• early feedback
• backwards compatible
• strong code ownership
couplings between hierarchical layers remain hidden
Support a hierarchy of change
• New version of program has
  • New and changed features implemented in
    • Many development steps [work breakdowns]
• How can changes be linked and combined?
  – Link based on unique feature identifiers
  – How to avoid false positives?
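Linking on feature identifiers can be sketched as follows, assuming every postlist carries the identifier of the feature it contributes to. The identifiers, file names, and record layout are hypothetical.

```python
from collections import defaultdict

def combine_by_feature(postlists):
    """Merge the file sets of all postlists that share a feature identifier."""
    combined = defaultdict(set)
    for feature_id, files in postlists:
        combined[feature_id].update(files)
    return dict(combined)

# Hypothetical postlists: (feature identifier, files changed).
postlists = [
    ("FEAT-1", {"examcard.c", "table.c"}),
    ("FEAT-2", {"viewer.c"}),
    ("FEAT-1", {"table.c", "contrast.c"}),
]
features = combine_by_feature(postlists)
print(sorted(features["FEAT-1"]))
```

The merged sets grow with every development step, so files that merely share a feature identifier appear coupled. That is the false-positive risk the slide asks about, and this sketch does nothing to avoid it.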
Minimizing overhead
• Work effectively and efficiently
• Find bug while developing
  – Not two postlists
• Combine small & trivial changes
• Postlists rejected by integrator
  – Changing postlist is difficult
  – Additional postlist to solve problem
How to minimize overhead?
• Both in tooling and process
  – Filing bug
  – Getting bug-id
  – Correcting bug
  – Checking in bug-fix
  – Documenting bug resolution
• Always modify postlist
  – Instead of create & occasionally modify
Different kind of changes
• Merges
  – Literature: exclude
• Infrastructural
  – E.g. compiler changes
    • Exclusion also hides couplings
• Organizational
  – E.g. typos in translations
    • Collected by service
    • Submitted in batch to development to correct
• Implement a feature
  – Tens of files
• Change a feature
  – A few files
• Fix bug in feature
  – A single file
Long-lived feature implemented by strongly coupled modules seemed weakly coupled due to many small changes and bug-fixes
Strong coupling between unrelated language modules
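The first observation above can be reproduced with a small calculation. The numbers are invented for illustration, and the Jaccard index is assumed as the relative similarity measure: one feature check-in touches modules A and B together, after which each module receives nine single-file bug-fixes.

```python
def jaccard(changed_a, changed_b, changed_together):
    """Relative, symmetric similarity between two modules."""
    union = changed_a + changed_b - changed_together
    return changed_together / union if union else 0.0

# Right after the feature check-in: both modules changed once, together.
initial = jaccard(changed_a=1, changed_b=1, changed_together=1)

# After nine single-file fixes per module: 10 changes each, still 1 joint change.
diluted = jaccard(changed_a=10, changed_b=10, changed_together=1)

print(initial, round(diluted, 3))
```

The modules still implement the same feature, yet the relative measure drops from 1 to 1/19 because the denominator counts every small change.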
Mismatch in conceptual models
• When couplings are removed, the files are changed together, hence the evolutionary coupling increases!
• Is a coupling binary or does it have a strength?
  – One call is enough?
  – Developer or architect
• Are evolutionary couplings direct or transitive?
  – Does a brittle interface decouple two modules?
Research questions
• What is the appropriate conceptual model?
  – When are modules coupled?
    • Different language modules translate the same messages in the UI
  – Likelihood of propagation instead of coupling?
• Should changes for different kinds of reasons be combined?
  – If so, how could they be combined?
• Can compression techniques, like Latent Semantic Indexing, create a module-feature relation?
Conclusions
Current state-of-art in evolutionary couplings does not meet industrial requirements
Research directions exist to improve the current state-of-art