Data Science Challenges in Personal Program Analysis
Bas van Schaik
New York R Conference (April 2016)
- Cloud service for personal program analysis
- Free for OSS projects
- Currently in private beta, release imminent
Personal Program Analysis: why?
We are passionate about code.
We wish everyone would write better code.
We help people build better software better.
Ehm… Program analysis?
Compiler
What’s an ‘Alert’?
Short answer: a bug or a violation of good coding practice
Example: define the same key twice in a Python dict
E.g. in OpenStack Designate:
self.target = objects.PoolTarget.from_dict({
    'type': 'powerdns',
    'options': [{
        'key': 'connection', 'value': 'memory://',
        'key': 'host', 'value': '127.0.0.1',
        'key': 'port', 'value': 53}],
})
My guess of what was intended:
self.target = objects.PoolTarget.from_dict({
    'type': 'powerdns',
    'options': [
        {'key': 'connection', 'value': 'memory://'},
        {'key': 'host', 'value': '127.0.0.1'},
        {'key': 'port', 'value': 53}],
})
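To see why this pattern is worth an alert: Python accepts a repeated key in a dict literal without complaint and keeps only the last value. The sketch below detects such repeats with the standard `ast` module. It illustrates the idea only and is not the analysis the service runs; the function name `find_duplicate_dict_keys` and the sample snippet are assumptions made for this example.

```python
import ast


def find_duplicate_dict_keys(source):
    """Return (line, key) pairs for constant keys repeated within one dict literal.

    Hypothetical helper for illustration; requires Python 3.8+ (ast.Constant).
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        seen = set()
        for key in node.keys:
            # key is None for ``**mapping`` unpacking; only constant keys are compared
            if not isinstance(key, ast.Constant):
                continue
            if key.value in seen:
                findings.append((key.lineno, key.value))
            seen.add(key.value)
    return findings


# Assumed sample input, modelled on the 'options' entry shown above
snippet = """
options = [{
    'key': 'connection', 'value': 'memory://',
    'key': 'host', 'value': '127.0.0.1',
    'key': 'port', 'value': 53}]
"""

duplicates = find_duplicate_dict_keys(snippet)
for line, key in duplicates:
    print(f"line {line}: key {key!r} is defined more than once in the same dict")
```

Each reported pair corresponds to one alert: a later definition of a key that silently overrides an earlier one.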
What’s an ‘Alert’?
Alerts are found by queries:
● The source code is our database
● Every query result is an alert.
Support for 10 different programming languages (and counting), with more than 1000 queries and metrics in total.
What does a query look like?
from Method m
where m.hasName("hashcode") and m.hasNoParameters()
select m, "Should this method be called 'hashCode' rather than 'hashcode'?"
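The from/where/select shape of the query above can be mimicked in plain Python to show how "every query result is an alert" works in practice. This is only an analogy under stated assumptions: the `Method` record, its helper methods, and the sample facts are hypothetical stand-ins for whatever the real database stores about a code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    """Hypothetical record describing one method found in the analysed code."""
    name: str
    parameters: tuple = ()

    def has_name(self, name):
        return self.name == name

    def has_no_parameters(self):
        return len(self.parameters) == 0


# Assumed facts, as if extracted from a code base
methods = [
    Method("hashcode"),
    Method("hashCode"),
    Method("hashcode", ("seed",)),
    Method("equals", ("other",)),
]


def misspelled_hashcode_alerts(methods):
    """from Method m where ... select m, message (expressed as a comprehension)."""
    message = "Should this method be called 'hashCode' rather than 'hashcode'?"
    return [
        (m, message)
        for m in methods
        if m.has_name("hashcode") and m.has_no_parameters()
    ]


alerts = misspelled_hashcode_alerts(methods)
for method, message in alerts:
    print(method.name, "->", message)
```

Only the parameterless `hashcode` entry matches both conditions, so it is the single row selected and therefore the single alert.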
Making it interesting: project over time
[Figure: OpenStack Nova (Python) over time; panels: net alerts, activity, composition, net LOC]
Or: compare different projects
[Figure: OpenStack projects Cinder, Nova, Neutron, Horizon, Heat, Swift, Sahara, Glance, Designate, Keystone, Fuel, Ironic; axes: alerts, LOC]
Even more interesting: make it personal
[Figure: contributors labelled A, B, and X; axes: net alerts, net LOC contributed (all OpenStack modules)]
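A per-developer view like this one needs net figures aggregated by contributor. The sketch below shows one plausible way to compute them, assuming "net" means added minus removed (lines of code) and introduced minus fixed (alerts). The slides do not define these terms, and the commit records and the name `net_contributions` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical per-commit records:
# (author, lines added, lines deleted, alerts introduced, alerts fixed)
commits = [
    ("A", 120, 20, 3, 0),
    ("B", 40, 90, 0, 4),
    ("A", 10, 5, 0, 1),
    ("X", 300, 0, 9, 0),
]


def net_contributions(commits):
    """Sum net LOC and net alerts per author (assumed definition of 'net')."""
    totals = defaultdict(lambda: {"net_loc": 0, "net_alerts": 0})
    for author, added, deleted, introduced, fixed in commits:
        totals[author]["net_loc"] += added - deleted
        totals[author]["net_alerts"] += introduced - fixed
    return dict(totals)


per_author = net_contributions(commits)
for author, stats in sorted(per_author.items()):
    print(author, stats["net_loc"], stats["net_alerts"])
```

Under this definition a contributor who mostly removes code and fixes alerts ends up with negative values on both axes.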
Data Science for PPA: finding fun facts
[Figure: Who's doing what in OpenStack? Categories: Trailblazer, Bug squasher, Refactorer, None; marker: Major release; axes: total contributors, % contributors]
Data science for PPA: cleaning
PostgreSQL (net churn and net alerts - before cleaning)
PostgreSQL: after cleaning
Warning:
DEMO of beta software
But… why make it personal?
Some developers not so happy:
“Are you questioning my ability to write code?”
No. We're helping you to improve.
By making it personal, we make people care.
When people care, they improve.
When developers improve, the code improves.
● Automated code review on GitHub pull requests
● “On 12/11/2015 you introduced X, fancy fixing that?”
● “You recently fixed alert A in file B. Based on your expertise, you might also be interested in fixing alert X in file Y?”
● “Compared to developers like you, you rank 20 out of 100”
● “… and by fixing these 5 alerts, you'll be in the top 10!”
● Found a bug in your project? Write a query for it, share it!
Not rocket science… Or is it?
DEMO (continued)
Interested in…
Early access to CodingStars?
Having your OSS project analysed?
Working for us in New York, San Francisco, Oxford (UK), or Copenhagen (Denmark)?
Talk to us! (in person, or [email protected])