On the Topology of Package Dependency Networks: A Comparison of Programming Language Ecosystems
Alexandre Decan, Tom Mens, Maëlick Claes
Software Engineering Lab
29 November 2016, Int’l Workshop Software Ecosystem Architectures (WEA)
Research Team
Previous Work
• A. Decan, T. Mens, M. Claes, P. Grosjean
  – IWSECO-WEA 2015: “On the Development and Distribution of R Packages: An Empirical Analysis of the R Ecosystem”
  – SANER 2016: “When GitHub Meets CRAN: An Analysis of Inter-Repository Package Dependency Problems”
• A. Serebrenik, T. Mens
  – WEA 2015: “Challenges in Software Ecosystems Research”
    • Generalizability
    • Comparing different ecosystems
Software Packaging Ecosystems
• Ecosystem: “a collection of software projects which are developed and evolve together in the same environment” [Lungu]
• Software distributed as packages
  – Dependency relationships between packages
  – Package versioning
Software Packaging Ecosystems for programming languages
• Many programming-language-specific package managers
  – npm (JavaScript)
  – PyPI (Python)
  – RubyGems (Ruby)
  – CRAN (R)
Software Packaging Ecosystems for programming languages
IEEE Spectrum ranking of most popular programming languages (http://spectrum.ieee.org/image/Mjc5MjI0Ng.png)
“The real standard library people want is more like what you find in Python or Ruby, and it’s more batteries included, feature complete, and that is not in JavaScript. That’s in the NPM world or the larger world.”
Ecosystem comparison
|                      | CRAN       | PyPI       | npm        |
|----------------------|------------|------------|------------|
| Snapshot date        | 2016-04-26 | 2016-02-17 | 2016-06-28 |
| Packages             | 9k         | 56k        | 317k       |
| Dependencies         | 21k        | 53k        | 728k       |
| New packages in 2015 | 1.6k       | 17k        | 113k       |
| Updates in 2015      | 8k         | 131k       | 711k       |
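One quick way to read the comparison table is to divide the number of dependencies by the number of packages. The sketch below does that with the rounded figures from the slide, assuming the "Dependencies" row counts direct dependency relationships; the variable and function names are illustrative only.

```javascript
// Figures copied from the comparison table, in thousands (rounded on the slide).
const ecosystems = {
  CRAN: { packages: 9, dependencies: 21 },
  PyPI: { packages: 56, dependencies: 53 },
  npm: { packages: 317, dependencies: 728 },
};

// Average number of dependencies per package (assumption: the table counts
// direct dependency relationships).
function dependenciesPerPackage({ packages, dependencies }) {
  return dependencies / packages;
}

const ratios = Object.fromEntries(
  Object.entries(ecosystems).map(([name, counts]) => [
    name,
    dependenciesPerPackage(counts),
  ])
);

for (const [name, ratio] of Object.entries(ratios)) {
  console.log(`${name}: ${ratio.toFixed(2)} dependencies per package`);
}
```

With these rounded inputs, CRAN and npm both come out at a little over two dependencies per package, while PyPI stays just below one.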
Data extraction
• CRAN: https://github.com/ecos-umons/extractoR
• npm: https://registry.npmjs.org
• PyPI: Missing dependencies information => https://kgullikson88.github.io/blog/pypi-analysis.html
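For npm, dependency information can be read from the JSON document the registry serves for each package. The following is a minimal sketch, not the authors' extraction tooling: `samplePackument` is a hand-written, hypothetical document with made-up version numbers, and `latestDependencies` and `fetchPackument` are illustrative helper names. It assumes the document exposes `dist-tags` and `versions` fields and that package names are unscoped.

```javascript
// Hypothetical, hand-written sample shaped like a registry package document;
// the package name and version numbers are made up for illustration.
const samplePackument = {
  name: "example-package",
  "dist-tags": { latest: "1.1.0" },
  versions: {
    "1.0.0": { dependencies: { isarray: "^1.0.0" } },
    "1.1.0": { dependencies: { isarray: "^1.0.0", leftpad: "0.0.3" } },
  },
};

// Return the names of the direct dependencies declared by the latest version.
function latestDependencies(packument) {
  const latest = packument["dist-tags"].latest;
  const manifest = packument.versions[latest] || {};
  return Object.keys(manifest.dependencies || {}).sort();
}

// Sketch of fetching a real document. It needs network access and a Node.js
// version with a global fetch, and it is not called in this example.
async function fetchPackument(name) {
  const url = `https://registry.npmjs.org/${encodeURIComponent(name)}`;
  const response = await fetch(url);
  if (!response.ok) throw new Error(`registry returned ${response.status}`);
  return response.json();
}

console.log(latestDependencies(samplePackument));
```

Repeating this for every package name yields the edge list of the dependency graph for one snapshot.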
Terminology
• b is a dependency of a
• a is a reverse dependency of b
• c is a transitive dependency of a
• a is a transitive reverse dependency of c
• {a, b, c, d, e, f} is a (weakly connected) component
• g is an isolated package
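The terminology above can be made concrete on a small graph. In this sketch the chain a -> b -> c follows the definitions, but the edges involving d, e and f are assumptions, because the figure from the slide is not part of the transcript. Note that `reachable` returns direct and transitive dependencies together.

```javascript
// Edges point from a package to the packages it depends on.
// a -> b -> c follows the definitions above; the edges involving d, e and f
// are assumed for illustration.
const dependencies = {
  a: ["b", "d"],
  b: ["c"],
  c: [],
  d: [],
  e: ["c"],
  f: ["e"],
  g: [],
};

// All packages reachable from `start` by following edges in `graph`.
function reachable(graph, start) {
  const seen = new Set();
  const stack = [...graph[start]];
  while (stack.length > 0) {
    const node = stack.pop();
    if (seen.has(node)) continue;
    seen.add(node);
    stack.push(...graph[node]);
  }
  return seen;
}

// Reverse every edge: maps each package to its reverse dependencies.
function reverse(graph) {
  const reversed = Object.fromEntries(Object.keys(graph).map((n) => [n, []]));
  for (const [from, targets] of Object.entries(graph)) {
    for (const to of targets) reversed[to].push(from);
  }
  return reversed;
}

// Weakly connected components: connectivity when edge direction is ignored.
function weakComponents(graph) {
  const rev = reverse(graph);
  const undirected = Object.fromEntries(
    Object.keys(graph).map((n) => [n, [...graph[n], ...rev[n]]])
  );
  const assigned = new Set();
  const components = [];
  for (const node of Object.keys(graph)) {
    if (assigned.has(node)) continue;
    const members = [...new Set([node, ...reachable(undirected, node)])].sort();
    members.forEach((n) => assigned.add(n));
    components.push(members);
  }
  return components;
}

const transitiveDepsOfA = [...reachable(dependencies, "a")].sort();
const transitiveReverseDepsOfC = [...reachable(reverse(dependencies), "c")].sort();
const components = weakComponents(dependencies);
// With no self-dependencies, a single-package component is an isolated package.
const isolated = components.filter((c) => c.length === 1).map((c) => c[0]);

console.log({ transitiveDepsOfA, transitiveReverseDepsOfC, components, isolated });
```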
Dependency usage in programming language ecosystems
PyPI has proportionally more isolated Python packages (due to its extensive standard library?)
Topology of programming language ecosystems
The majority of packages are part of a single huge component
Largest component:
• 76.5% (CRAN), 35.6% (PyPI), 63.8% (npm) of all packages
• 91% (CRAN), 88% (PyPI), 92% (npm) of all non-isolated packages
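The two percentage lines together imply the share of isolated packages in each ecosystem, since the share of all packages equals the share of non-isolated packages multiplied by the fraction of packages that are non-isolated. The sketch below works that out from the slide's rounded percentages, so the results are approximate; the names used are illustrative.

```javascript
// Percentages as reported on the slide (already rounded there).
const largestComponent = {
  CRAN: { ofAll: 76.5, ofNonIsolated: 91 },
  PyPI: { ofAll: 35.6, ofNonIsolated: 88 },
  npm: { ofAll: 63.8, ofNonIsolated: 92 },
};

// largest / all = (largest / nonIsolated) * (nonIsolated / all),
// hence nonIsolated / all = ofAll / ofNonIsolated.
function isolatedShare({ ofAll, ofNonIsolated }) {
  const nonIsolatedShare = ofAll / ofNonIsolated;
  return 100 * (1 - nonIsolatedShare);
}

const isolatedPercent = Object.fromEntries(
  Object.entries(largestComponent).map(([name, p]) => [name, isolatedShare(p)])
);

for (const [name, percent] of Object.entries(isolatedPercent)) {
  console.log(`${name}: about ${Math.round(percent)}% isolated packages`);
}
```

The derived values are roughly 16% for CRAN, 60% for PyPI and 31% for npm, which is in line with the observation that PyPI has proportionally more isolated packages.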
Differences in dependencies between programming language ecosystems
npm packages have a much higher ratio of transitive dependencies
Differences in reverse dependencies between programming language ecosystems
There are proportionally more very popular npm packages (i.e. higher number of transitive reverse dependencies)
Differences in reverse dependencies between programming language ecosystems
Number of packages required by more than 2% of the ecosystem
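A metric such as "packages required by more than 2% of the ecosystem" could be computed as sketched below. This is not the authors' implementation: `ecosystem` is a hypothetical toy graph, and the sketch assumes that "required" means directly or transitively and that every dependency is itself listed in the graph.

```javascript
// Hypothetical toy ecosystem: each package lists its direct dependencies.
const ecosystem = {
  app1: ["util"],
  app2: ["util"],
  app3: ["lib"],
  lib: ["util"],
  util: [],
  lonely: [],
};

// For every package, count how many other packages require it,
// directly or transitively.
function transitiveReverseDependencyCounts(graph) {
  const counts = Object.fromEntries(Object.keys(graph).map((n) => [n, 0]));
  for (const start of Object.keys(graph)) {
    const seen = new Set();
    const stack = [...graph[start]];
    while (stack.length > 0) {
      const node = stack.pop();
      if (node === start || seen.has(node)) continue;
      seen.add(node);
      stack.push(...graph[node]);
    }
    for (const node of seen) counts[node] += 1;
  }
  return counts;
}

// Packages required by more than `fraction` of all packages in the ecosystem.
function requiredByMoreThan(graph, fraction) {
  const total = Object.keys(graph).length;
  const counts = transitiveReverseDependencyCounts(graph);
  return Object.keys(counts)
    .filter((name) => counts[name] / total > fraction)
    .sort();
}

// The slide uses a 2% threshold; a toy graph needs a larger one to be selective.
console.log(requiredByMoreThan(ecosystem, 0.5));
console.log(requiredByMoreThan(ecosystem, 0.02));
```

One traversal per package is fine for a toy graph; for an ecosystem the size of npm a more economical traversal strategy would likely be needed.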
Possible explanation: micro-packages in npm
“In a lot of JavaScript environments, space is at a premium. [...] Several larger libraries [...] have actually intentionally split themselves into sub-modules because people usually only ever load them to use a single merge function.”
Example: isarray (150 direct and 77K transitive reverse dependencies in August 2016)

    // Use the native Array.isArray when available, else compare the type tag.
    var toString = {}.toString;
    module.exports = Array.isArray || function (arr) {
      return toString.call(arr) == '[object Array]';
    };
Known problems: leftpad

    function leftpad (str, len, ch) {
      str = String(str);
      var i = -1;
      if (!ch && ch !== 0) ch = ' ';
      len = len - str.length;
      while (++i < len) {
        str = ch + str;
      }
      return str;
    }

Its developer removed all his packages from npm:
“This impacted many thousands of projects. [...] We began observing hundreds of failures per minute, as dependent projects – and their dependents, and their dependents... – all failed when requesting the now-unpublished package.”
npm managers un-unpublished leftpad but …
“a number of dependency chains [...] explicitly requested 0.0.3.”
http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm
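The remark about dependency chains that "explicitly requested 0.0.3" can be illustrated with a toy resolver. This is a deliberately simplified sketch: `satisfies` understands only exact versions and `*`, not npm's real range syntax, and the two registry states and the version 1.0.0 are hypothetical.

```javascript
// Deliberately simplified matcher: it understands only exact versions and "*".
// npm's real range syntax (^, ~, >=, ...) is much richer and not modelled here.
function satisfies(version, requested) {
  return requested === "*" || requested === version;
}

// Pick the first published version matching the request, or null if none does.
function resolve(publishedVersions, requested) {
  return publishedVersions.find((v) => satisfies(v, requested)) ?? null;
}

// Hypothetical registry states for a package such as leftpad.
const withoutOldVersion = ["1.0.0"];       // the pinned release is gone
const withOldVersion = ["0.0.3", "1.0.0"]; // the pinned release is available

const pinnedWithout = resolve(withoutOldVersion, "0.0.3");
const anyWithout = resolve(withoutOldVersion, "*");
const pinnedWith = resolve(withOldVersion, "0.0.3");

console.log({ pinnedWithout, anyWithout, pinnedWith });
```

A dependent that pins an exact version cannot be satisfied by any other release, which is why publishing a different version alone would not repair such chains.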
Conclusion
• Simple metrics can be used to compare the topology of different package-based software ecosystems
• Similarities in the dependency graph structure
  – Most non-isolated packages are part of a large weakly connected component
• Differences that can be explained by the specificities of each ecosystem
  – Python’s extensive standard library
  – CRAN’s particular versioning policy
  – npm’s abundance of micro-packages
Future work
• See our SANER 2017 article “An empirical comparison of dependency issues in OSS packaging ecosystems”
  – Include RubyGems
  – Study the evolution over time
  – Frequency of package updates
  – Resilience of packages to failures in dependencies
  – Impact of solutions that rely on dependency constraints and semantic versioning
• Beyond SANER 2017: study the interplay between social and technical aspects
Thanks for your attention!
Questions?