Patrick Adam Wagstrom
http://patrick.wagstrom.net/
October 2004
Community Building in Open Source Software Ecosystems
Patrick Adam Wagstrom
Department of Engineering and Public Policy
Carnegie Mellon University
Advisors: Jim Herbsleb and Kathleen Carley
October 2004
Overview
What is Free and Open Source Software (F/OSS)?
Previous Work
Research Questions
Modeling Software Development
Sources of Validation Data
Preliminary Results
Virtual Experiment
What is F/OSS?
Utilize copyright law to protect the rights of the user
Term “Open Source” was coined in 1998
“Free” software started by Richard Stallman in 1984
Free for any use
Free to redistribute and modify
Communities are dynamic and driven by merit
Increasing amounts of commercial interest
Previous Work
Developer motivation (Ghosh 2003, Lakhani et al. 2002, Lakhani 2003, Shah 2003)
Economic Basis for Development (Lerner and Tirole 2002, Schiff 2002)
Social network overview (Xu and Madey 2004, Sandusky et al 2004)
Simulation of communities (Gao et al 2003)
Research Questions
Can we predict adoption of open source projects?
Can we predict development of open source projects?
In what ways does the community affect the development of projects?
What happens to a project when new users or developers join the project?
Modeling Software Development
Multi-Agent simulation called OSSim (pronounced AWESOME!!)
Developers have differing skill levels, problem sets, and motives
Find the best project that solves each problem
Some agents choose to modify the project to solve problems better
OSSim Overview
Pool of users and projects created
Users join projects that best solve their problem
If a user is a developer, they modify the project
Users vote on changes that satisfy the most people
Users re-evaluate satisfaction with projects
Social networks of users and projects updated
Unhappy users may leave project or start new project
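The cycle above can be sketched as a small loop. This is only an illustrative toy, not the OSSim code: the `User` class, the single-number `need` and `features` values, the `satisfaction` function, and the `leave_below` threshold are all assumptions made for the example, and the social network update step is left out.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class User:
    need: float                     # hypothetical stand-in for the user's problem, in [0, 1]
    is_developer: bool
    project: Optional[int] = None   # index of the project the user currently belongs to


def satisfaction(user: User, feature: float) -> float:
    """Toy satisfaction in [0, 1]: the closer the project is to the need, the happier."""
    return 1.0 - abs(user.need - feature)


def step(users: List[User], features: List[float], rng: random.Random,
         leave_below: float = 0.5) -> List[float]:
    """Run one cycle; `features` holds one number per project (an assumption)."""
    # Users join the project that best solves their problem.
    for u in users:
        u.project = max(range(len(features)),
                        key=lambda p: satisfaction(u, features[p]))
    # Developers propose a modification; project members vote on it.
    for u in users:
        if not u.is_developer:
            continue
        p = u.project
        proposal = min(1.0, max(0.0, features[p] + rng.uniform(-0.1, 0.1)))
        members = [m for m in users if m.project == p]
        better = sum(satisfaction(m, proposal) > satisfaction(m, features[p]) for m in members)
        worse = sum(satisfaction(m, proposal) < satisfaction(m, features[p]) for m in members)
        if better > worse:
            features[p] = proposal
    # Users re-evaluate; an unhappy user starts a new project tailored to their need.
    for u in users:
        if satisfaction(u, features[u.project]) < leave_below:
            features.append(u.need)
            u.project = len(features) - 1
    return features


if __name__ == "__main__":
    rng = random.Random(0)
    users = [User(need=rng.random(), is_developer=rng.random() < 0.2) for _ in range(50)]
    features = [rng.random(), rng.random()]
    for _ in range(10):
        step(users, features, rng)
    print(len(features), "projects after 10 cycles")
```

The population of 50 users and 2 starting projects mirrors the input parameters listed in the deck; everything else is a placeholder chosen to keep the loop short.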
OSSim Input Parameters
Name                      Value                            Source
Number of users           50                               Arbitrary for virtual experiment
Number of projects        2                                Arbitrary for virtual experiment
Percentage of developers  0.1-0.3, uniform distribution    Based on Advogato.org and bug information from Tigris.org
Developer skill           0-100, custom distribution       Bimodal distribution with lognormal-like tail; from Advogato.org and Ghosh
Social network size       0-50, uniform distribution       Advogato.org, Sandusky 2004, Xu 2004
Project network size      2-20, uniform distribution       Conjecture based on Ghosh data
Agent attention span      normal distribution (m=10, s=2)  My expert estimate
Problem model             NK                               Kauffman 1993
Problem complexity        N=15, K=2                        Experiments to provide results similar to observed values
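As a rough illustration, the parameters above could be collected in one configuration object and sampled per agent. The class and function names are hypothetical, and the constants inside `sample_skill` (mixture weight, means, spreads) are invented for the example, since the slide gives only the qualitative shape of the skill distribution.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass
class OSSimParams:
    n_users: int = 50
    n_projects: int = 2
    dev_fraction_range: Tuple[float, float] = (0.1, 0.3)   # uniform
    social_net_range: Tuple[int, int] = (0, 50)            # uniform
    project_net_range: Tuple[int, int] = (2, 20)           # uniform
    attention_mean: float = 10.0
    attention_sd: float = 2.0
    n: int = 15                                            # NK problem length
    k: int = 2                                             # NK neighbours per bit


def sample_skill(rng: random.Random) -> float:
    """Assumed bimodal draw on 0-100 with a lognormal-like right tail."""
    if rng.random() < 0.6:
        value = rng.gauss(20, 5)                       # lower mode (illustrative)
    else:
        value = 50 + rng.lognormvariate(2.0, 0.6)      # upper mode with long tail
    return min(100.0, max(0.0, value))


def sample_agent(params: OSSimParams, rng: random.Random, dev_fraction: float) -> dict:
    """Draw one agent's attributes from the distributions in the table."""
    is_dev = rng.random() < dev_fraction
    return {
        "is_developer": is_dev,
        "skill": sample_skill(rng) if is_dev else None,
        "social_network_size": rng.randint(*params.social_net_range),
        "project_network_size": rng.randint(*params.project_net_range),
        "attention_span": rng.gauss(params.attention_mean, params.attention_sd),
    }


if __name__ == "__main__":
    rng = random.Random(2004)
    params = OSSimParams()
    dev_fraction = rng.uniform(*params.dev_fraction_range)
    agents = [sample_agent(params, rng, dev_fraction) for _ in range(params.n_users)]
    print(sum(a["is_developer"] for a in agents), "developers out of", len(agents))
```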
OSSim Output Values
Project members
Social network information
Project code growth information
Meta data to determine project direction
Validation Data Sources
Advogato.org: Free/Open Source developer site
  Only tracks social aspects of interactions
Tigris.org: Free/Open Source hosting site
  Does not explicitly track social interactions
SourceForge.net: Similar to Tigris.org, but much larger
Preliminary Results
Preliminary Results (2)
Virtual Experiment
Medium scale open source project (5-10 volunteer developers)
Corporate interest spawns new contributors
Vary number of contributors (1-2, 3-5, 8-10)
Vary skill and motivation of contributor
Observe progress of project (users, overall fitness)
What happens to volunteer developers?
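One way to lay out the experimental conditions is a full factorial grid. The contributor ranges come from the slide; the two-level "low"/"high" settings for skill and motivation are hypothetical, since the slide does not say which levels were planned.

```python
from itertools import product

CONTRIBUTOR_RANGES = [(1, 2), (3, 5), (8, 10)]   # from the slide
SKILL_LEVELS = ["low", "high"]                    # hypothetical levels
MOTIVATION_LEVELS = ["low", "high"]               # hypothetical levels


def experiment_conditions():
    """Yield one dict per combination of contributor range, skill and motivation."""
    for contributors, skill, motivation in product(
            CONTRIBUTOR_RANGES, SKILL_LEVELS, MOTIVATION_LEVELS):
        yield {"contributors": contributors, "skill": skill, "motivation": motivation}


if __name__ == "__main__":
    for condition in experiment_conditions():
        print(condition)
```

Each condition would then be run against a baseline project of 5-10 volunteer developers while tracking user counts and overall fitness.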
Virtual Experiment Results
Note: These are like WMDs in Iraq... still imaginary results
When compared to the BATIK project, OSSim showed similar results 72% of the time
When compared to SpamAssassin, OSSim showed similar results 82% of the time
Corporate influence frequently skews projects away from their original volunteer hacker nature
Major Contributions
New analysis of the social network structure of Free/Open Source Projects using new data streams
First model of software development focusing exclusively on free/open source software engineering
Results from OSSim have been shown to be close to those of real projects
Questions, comments, and large amounts of currency are now welcome
NK Model
Agents have a genotype bit string of length N
0 1 1 1 0 0 1 1 0 1 1 1 0 1 1
N=15
Each bit is associated with K neighbors to create alleles
011 111 110 100 001
011 110 101 011 111
110 101 011 110 101
K=2
Each allele is evaluated against the randomly generated fitness landscape for that position. Overall fitness is the average of these values.
0.4913 = (0.15 + 0.19 + 0.20 + 0.46 + 0.34 + 0.85 + 0.67 + 0.12 + 0.77 + 0.91 + 0.85 + 0.55 + 0.72 + 0.06 + 0.59) / 15
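A minimal sketch of this fitness evaluation follows. It assumes each allele is a bit plus its next K bits, wrapping around at the end of the string, which is consistent with the alleles listed on the slide but is not stated there explicitly. The function names and the uniform random landscape are illustrative choices.

```python
import random
from typing import Dict, List


def alleles(genotype: List[int], k: int) -> List[str]:
    """One allele per position: the bit followed by its next k neighbours (wrapping)."""
    n = len(genotype)
    return ["".join(str(genotype[(i + j) % n]) for j in range(k + 1))
            for i in range(n)]


def make_landscape(n: int, k: int, rng: random.Random) -> List[Dict[str, float]]:
    """One random lookup table per position, keyed by every possible allele."""
    keys = [format(v, f"0{k + 1}b") for v in range(2 ** (k + 1))]
    return [{key: rng.random() for key in keys} for _ in range(n)]


def fitness(genotype: List[int], landscape: List[Dict[str, float]], k: int) -> float:
    """Average of the per-position contributions looked up in the landscape."""
    parts = [landscape[i][a] for i, a in enumerate(alleles(genotype, k))]
    return sum(parts) / len(parts)


if __name__ == "__main__":
    genotype = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1]   # bit string from the slide
    landscape = make_landscape(n=15, k=2, rng=random.Random(0))
    print(alleles(genotype, 2)[:5])
    print(round(fitness(genotype, landscape, 2), 4))
```

Because the landscape here is freshly generated, the fitness printed will not match the slide's value; only the allele construction is meant to line up with the slide.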
Mutation in the NK Model
Select a bit at random, flip it, and evaluate the new fitness
Before mutation:
0 1 1 1 0 0 1 1 0 1 1 1 0 1 1
011 111 110 100 001
011 110 101 011 111
110 101 011 110 101
K=2
0.4913 = (0.15 + 0.19 + 0.20 + 0.46 + 0.34 + 0.85 + 0.67 + 0.12 + 0.77 + 0.91 + 0.85 + 0.55 + 0.72 + 0.06 + 0.59) / 15
After mutation:
0 1 1 1 1 0 1 1 0 1 1 1 0 1 1
011 111 111 110 101
011 110 101 011 111
110 101 011 110 101
0.5493 = (0.15 + 0.19 + 0.82 + 0.31 + 0.74 + 0.85 + 0.67 + 0.12 + 0.77 + 0.91 + 0.85 + 0.55 + 0.72 + 0.06 + 0.59) / 15
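The mutation step can be sketched as below. Flipping one bit changes only the K+1 alleles that contain it, which is why three terms of the sum differ on the slide. The helper names are hypothetical, and the forward-wrapping neighbourhood is an assumption inferred from the alleles shown.

```python
import random
from itertools import product
from typing import Dict, List, Tuple

Allele = Tuple[int, ...]


def alleles(genotype: List[int], k: int) -> List[Allele]:
    """Bit i together with its next k neighbours, wrapping at the end."""
    n = len(genotype)
    return [tuple(genotype[(i + j) % n] for j in range(k + 1)) for i in range(n)]


def make_landscape(n: int, k: int, rng: random.Random) -> List[Dict[Allele, float]]:
    """Random fitness contribution for every allele at every position."""
    return [{a: rng.random() for a in product((0, 1), repeat=k + 1)} for _ in range(n)]


def fitness(genotype: List[int], landscape: List[Dict[Allele, float]], k: int) -> float:
    return sum(landscape[i][a] for i, a in enumerate(alleles(genotype, k))) / len(genotype)


def mutate(genotype: List[int], rng: random.Random) -> Tuple[List[int], int]:
    """Flip one randomly chosen bit; return the new genotype and the flipped index."""
    i = rng.randrange(len(genotype))
    child = list(genotype)
    child[i] ^= 1
    return child, i


def affected_positions(i: int, n: int, k: int) -> List[int]:
    """Positions whose allele contains bit i under forward-wrapping neighbourhoods."""
    return sorted((i - j) % n for j in range(k + 1))


if __name__ == "__main__":
    rng = random.Random(1)
    parent = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1]
    landscape = make_landscape(15, 2, rng)
    child, flipped = mutate(parent, rng)
    print("flipped bit", flipped, "affects positions", affected_positions(flipped, 15, 2))
    print(round(fitness(parent, landscape, 2), 4), "->", round(fitness(child, landscape, 2), 4))
```

In the slide's example the fifth bit (index 4) is flipped, so the alleles at indices 2, 3 and 4 are the ones that change.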
The OSSim NK Process
A pool of people and projects is initially created
We'll focus on just two agents
  Agent 1: Developer, Skill 30, problem string 021211112220112
  Agent 2: Non-Developer, problem string 120011222012121
Agents evaluate problem string against each project
  Fitness values shown: 0.43, 0.52, 0.33, 0.39, 0.61, 0.57
Associate with project that provides highest fitness
  Highest values shown: 0.52, 0.61
Developer agents modify fitness landscape of projects
  0:010 +0.13
  3:222 -0.05
  12:121 +0.22
  4:201 -0.08
  . . .
People evaluate new fitness
  0.50, 0.67
Vote to accept changes, ties settled by fitness
Cycle repeats until terminated
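The selection, modification and voting steps might look roughly like this. Several readings here are assumptions rather than facts from the slide: that the first three fitness values belong to Agent 1 and the last three to Agent 2, that an entry such as "0:010 +0.13" means position 0, allele 010, change of +0.13, and that "ties settled by fitness" means comparing total fitness across voters. All function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def best_project(fitness_by_project: Sequence[float]) -> int:
    """Index of the project giving this agent the highest fitness."""
    return max(range(len(fitness_by_project)), key=fitness_by_project.__getitem__)


def apply_change(landscape: List[Dict[str, float]],
                 change: Tuple[int, str, float]) -> List[Dict[str, float]]:
    """Return a copy of the landscape with one (position, allele, delta) edit applied."""
    position, allele, delta = change
    updated = [dict(table) for table in landscape]
    updated[position][allele] = updated[position].get(allele, 0.0) + delta
    return updated


def vote(old_scores: Sequence[float], new_scores: Sequence[float]) -> bool:
    """Accept if more voters gain than lose; on a tie, compare total fitness."""
    up = sum(n > o for o, n in zip(old_scores, new_scores))
    down = sum(n < o for o, n in zip(old_scores, new_scores))
    if up != down:
        return up > down
    return sum(new_scores) > sum(old_scores)


if __name__ == "__main__":
    # Assumed grouping of the slide's values: three projects per agent.
    print(best_project([0.43, 0.52, 0.33]), best_project([0.39, 0.61, 0.57]))
    # One voter gains (0.61 to 0.67), one loses (0.52 to 0.50): a tie on votes.
    print(vote([0.52, 0.61], [0.50, 0.67]))
```

Under these assumptions the slide's example is a one-to-one vote, and the change is accepted because total fitness rises from 1.13 to 1.17.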
Agent Properties
Set of problems the agent is encountering
Attempt to match problems to projects (solutions)
Whether the agent is a developer or a free rider
Skill level, if the agent is a developer
Focus of the agent is controlled by attention-span
Who an agent knows – social network
Projects an agent knows – project network
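These properties map naturally onto a small record type. The sketch below is one possible layout, not the OSSim data structure; the field names, the use of integer ids, and the default attention span of 10 (the mean listed in the input parameters) are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Agent:
    problems: List[str]                  # problem strings the agent is trying to solve
    is_developer: bool                   # False means the agent is a free rider
    skill: Optional[float] = None        # only meaningful for developers
    attention_span: float = 10.0         # controls how long the agent stays focused
    social_network: Set[int] = field(default_factory=set)    # ids of agents it knows
    project_network: Set[int] = field(default_factory=set)   # ids of projects it knows

    def __post_init__(self) -> None:
        # Free riders carry no skill value, whatever was passed in.
        if not self.is_developer:
            self.skill = None


if __name__ == "__main__":
    dev = Agent(problems=["021211112220112"], is_developer=True, skill=30)
    user = Agent(problems=["120011222012121"], is_developer=False)
    dev.social_network.add(2)
    print(dev)
    print(user)
```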
Project Properties
A set of developers who create the project and users who utilize the project
Social norms that characterize the interaction process
A walled server that controls access to the project resources and mediates communication
A fitness landscape that can be used to evaluate problems of agents
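A project could be represented along the same lines. Everything below is a hedged sketch: the field names, the dictionary encoding of social norms, the membership-based access check standing in for the walled server, and the choice to score alleles missing from the landscape as 0 are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Project:
    landscape: List[Dict[str, float]]                       # one fitness table per position
    developers: Set[int] = field(default_factory=set)       # ids of agents who build it
    users: Set[int] = field(default_factory=set)            # ids of agents who use it
    norms: Dict[str, float] = field(default_factory=dict)   # hypothetical norm encoding

    def can_access(self, agent_id: int) -> bool:
        """Walled server: only members may reach project resources."""
        return agent_id in self.developers or agent_id in self.users

    def evaluate(self, problem: str, k: int = 2) -> float:
        """Average landscape value over the problem's alleles (missing entries count as 0)."""
        n = len(problem)
        total = 0.0
        for i in range(n):
            allele = "".join(problem[(i + j) % n] for j in range(k + 1))
            total += self.landscape[i].get(allele, 0.0)
        return total / n


if __name__ == "__main__":
    project = Project(landscape=[{"01": 0.4}, {"10": 0.6}], developers={1}, users={2})
    print(project.can_access(1), project.can_access(3))
    print(project.evaluate("01", k=1))
```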