analyzing foss collaboration and social dynamics with...

13
Analyzing FOSS Collaboration and Social Dynamics with Temporal Social Networks Amir Azarbakht, Carlos Jensen Oregon State University, School of Electrical Engineering & Computer Science 1148 Kelley Engineering Center, Corvallis OR 97331, USA {azarbaam,cjensen}@eecs.oregonstate.edu http://eecs.oregonstate.edu/~azarbaam http://eecs.oregonstate.edu/people/jensen Abstract. How can we understand FOSS collaboration better? Can so- cial issues that emerge be identified and addressed before it is too late? Can the community heal itself, become more transparent and inclusive, and promote diversity? We propose a technique to address these issues by quantitative analysis of social dynamics in FOSS communities. We propose using social network analysis metrics to identify growth pat- terns and unhealthy dynamics; giving the community a heads-up when they can still take action to ensure the sustainability of the project. Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source Software, FLOSS, Forking, Social Network Analysis 1 Introduction Social networks are a ubiquitous part of our social lives, and the creation of online social communities has been a natural extension of this phenomena. Free/Open Source Software (FOSS) development efforts are prime examples of how community can be leveraged in software development, as efforts are formed around communities of interest, and depend on continued interest and involve- ment in order to stay alive [Nyman 2011]. Though the bulk of collaboration and communication in FOSS communities occurs online and is publicly accessible, there are many open questions about the social dynamics in FOSS communities. Projects might go through a meta- morphosis when faced with an influx of new developers or the involvement of an outside organization. Conflicts between developers raised as the result of di- vergent opinions about the future of the project might lead to an ensuing fork of project and the dilution of the community. Forking, either as a violent split when there is a conflict or as a friendly divide when new features are experi- mentally added both affect the community [Bezrukova et al. 2010]. Most recent studies of FOSS communities have tended to suffer from two important limitations. First, they treat community as a static structure rather

Upload: others

Post on 28-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

Analyzing FOSS Collaboration and SocialDynamics with Temporal Social Networks

Amir Azarbakht, Carlos Jensen

Oregon State University, School of Electrical Engineering & Computer Science1148 Kelley Engineering Center, Corvallis OR 97331, USA

{azarbaam,cjensen}@eecs.oregonstate.eduhttp://eecs.oregonstate.edu/~azarbaam

http://eecs.oregonstate.edu/people/jensen

Abstract. How can we understand FOSS collaboration better? Can so-cial issues that emerge be identified and addressed before it is too late?Can the community heal itself, become more transparent and inclusive,and promote diversity? We propose a technique to address these issuesby quantitative analysis of social dynamics in FOSS communities. Wepropose using social network analysis metrics to identify growth pat-terns and unhealthy dynamics; giving the community a heads-up whenthey can still take action to ensure the sustainability of the project.

Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/OpenSource Software, FLOSS, Forking, Social Network Analysis

1 Introduction

Social networks are a ubiquitous part of our social lives, and the creation ofonline social communities has been a natural extension of this phenomena.Free/Open Source Software (FOSS) development efforts are prime examples ofhow community can be leveraged in software development, as efforts are formedaround communities of interest, and depend on continued interest and involve-ment in order to stay alive [Nyman 2011].

Though the bulk of collaboration and communication in FOSS communitiesoccurs online and is publicly accessible, there are many open questions aboutthe social dynamics in FOSS communities. Projects might go through a meta-morphosis when faced with an influx of new developers or the involvement ofan outside organization. Conflicts between developers raised as the result of di-vergent opinions about the future of the project might lead to an ensuing forkof project and the dilution of the community. Forking, either as a violent splitwhen there is a conflict or as a friendly divide when new features are experi-mentally added both affect the community [Bezrukova et al. 2010].

Most recent studies of FOSS communities have tended to suffer from twoimportant limitations. First, they treat community as a static structure rather

Page 2: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

2 Amir Azarbakht, Carlos Jensen

than a dynamic process. Second, many social dynamics in FOSS have beenstudied using a case-study methodology, focusing on a selected subset of theavailable data.

In this paper, we propose to use social network analysis to study the evo-lution and social dynamics of FOSS communities. With these techniques wehope to identify measures associated with unhealthy group dynamics, e.g. asimmering conflict, as well as early indicators of major events in the lifespanof an online community. One dynamic we are especially interested in are thoseof forked FOSS projects. We will seek to validate this technique by comparingthe results of our analysis to the results of a study of forked FOSS projectsby [Robles and Gonzalez-Barahona 2012]. The goal is to demonstrate that thisquantitative approach can be applied to commonly available FOSS archives toget a better understanding of the evolution of these communities.

This paper is organized as follows: We present related literature on onlinesocial communities, recounting their focus and the findings. We then presentthe gap in the literature, and what further study needs to be done. Next, wediscuss why the issue needs to be addressed and who benefits from it, in themotivation section. Following that, we present three research questions framedas hypotheses that we are going to test. After that, we propose a methodologyas to how to test the validity of the hypotheses, which includes gathering data,doing the analysis, and the visualization of the findings. At the end, we presentfuture work and challenges.

2 Related Work

The social structures of FOSS communities have been studied extensively overthe past decade. Researchers have studied the social structure and dynamicsof team communications [Howison et al. 2006, Bird et al. 2008], identifyingknowledge brokers and their associated activities in FOSS projects [Sowe etal. 2006], their sustainability [Nyman 2011], FOSS forking [Nyman 2011, Rob-les and Gonzalez-Barahona 2012], their topology [Bird et al. 2008], their de-mographic diversity [Kunegis et al. 2012], gender differences in the process ofjoining them [Kuechler et al. 2012] and the role of the core team in their com-munities [Torres et al. 2011]. They have tended to look at community as a staticstructure rather than a dynamic process. This makes it hard to determine causeand effect, or the exact impact of social changes.

The study of communities has grown in popularity in part thanks to ad-vances in social network analysis. From the earliest works on studying infor-mation flow and predicting conflict and fission in groups by Zachary [Zachary1977], to the more recent works of [Leskovec et al.] on the statistical propertiesof community structure in social networks, there is a growing body of quanti-

Page 3: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

Analyzing FOSS Collab. & Social Dynamics w/ Temporal Social Networks 3

tative research on online communities.

The earliest works on communities was done with a focus on informationdiffusion in a community [Zachary 1977]. Zachary investigated the fission of acommunity, the process of communities splitting into two or more parts. Hefound that fission could be predicted by applying the Ford-Fulkerson min-cutalgorithm [Ford and Fulkerson 1957] on the group’s communication graph; “theunequal flow of sentiments across the ties” and discriminatory sharing of infor-mation lead to “subcommunities with more internal stability than the commu-nity as a whole.”

Community splits in FOSS projects are referred to as forks, and are rela-tively common. Forking is defined as “when a part of a development community(or a third party not related to the project) starts a completely independentline of development based on the source code basis of the project.” Robles andGonzalez-Barahona [Robles and Gonzalez-Barahona 2012] identified 220 signif-icant FOSS projects that have forked over the past 30 years, and compiled acomprehensive list of the dates and reasons for forking. They classified theseinto six main categories. Table 3. shows their results, discussed further in thenext sections, and which we build on extensively in this work. They identified agap in the literature in case of “how the community moves when a fork occurs”.

The dynamic behavior of a network and identifying key events was the aimof a study by [Asur et al. 2009]. They studied three DBLP co-authorship net-works and defined the evolution of these networks as following one of thesepaths: a) Continue, b) k-Merge, c) k-Split, d) Form, or e) Dissolve. They alsodefined four possible transformation events for individual members: 1) Appear,2) Disappear, 3) Join, and 4) Leave. They compared groups extracted from con-secutive snapshots, based on the size and overlap of every pair of groups. Then,they labeled groups with events, and used these identified events to calculatethe metrics in Table 1 for the nodes and the network.

Table 1. The measures of diversity used by [Asur et al. 2009]

Metrics Meaning

Stability Tendency of a node to have interactions with the same nodes overtime

Sociability Tendency of a node to have different interactions

Influence Number of followers a node has on a network and how its actions arecopied and/or followed by other nodes. (e.g. when it joins/leaves aconversation, many other nodes join/leave the conversation, too)

Popularity Number of nodes in a cluster (how crowded a sub-community is)

Page 4: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

4 Amir Azarbakht, Carlos Jensen

The communication patterns of FOSS developers in a bug repository wereexamined by [Howison et al. 2006]. They calculated out-degree centrality astheir metric. Out-degree centrality measures the proportion of the number oftimes a node contacted other nodes (outgoing) over how many times it was con-tacted by other nodes (incoming). They calculated this centrality over time “in90-day windows, moving the window forward 30 days at a time.” They foundthat “while change at the center of FOSS projects is relatively uncommon,”participation across the community is highly skewed, following a power-law dis-tribution, where many participants appear for a short period of time, and a verysmall number of participants are at the center for long periods. Our approach issimilar to theirs in how we form collaboration graphs and perform our temporalanalysis. Our approach is different in terms of our project selection criteria, themetrics we examine, and our research questions.

The tension between diversity and homogeneity in a community was studiedby [Kunegis et al. 2012]. Table 2 lists five network statistics they used to examinethe evolution of large-scale networks over time. They found that except for thediameter, all other measures of diversity shrunk as the networks matured overtheir lifespan. Kunegis et al. [Kunegis et al. 2012] argued that one possiblereason could be that the community structure consolidates as projects mature.

Table 2. The measures of diversity used by [Kunegis et al. 2012]

Network property Network is diverse when A network is diverse when

Paths between nodes Paths are long Effective diameter

Degrees of nodes Degrees are equal Gini coefficient of the degreedistribution

Communities Communities have similar sizes Fractional rank of the adja-cency matrix

Random walks Random walks have high proba-bility of return

Weighted spectral distribution

Control of nodes Nodes are hard to control Number of driver nodes

To date, most studies on FOSS have only been carried out on a small numberof projects, and using snapshots in time. To our knowledge, no study has beendone of project forking that has taken into account the temporal dimension.

3 Motivation

To better understand and measure the evolution and social dynamics of FOSSprojects, integral components to understanding their evolution and direction,we need new and better tools. With this knowledge and these tools, we could

Page 5: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

Analyzing FOSS Collab. & Social Dynamics w/ Temporal Social Networks 5

help projects reflect on their actions, and help community leaders make in-formed decisions about possible changes or interventions. We want to map thedynamics of communities to real world phenomena. Identification is the firststep to rectify an undesired dynamic before the damage is done. A communitythat does not manage growing pains may end up stagnating or dissolving.

Managing growing pains is especially important in case of FOSS, wherenear half the project contributors are volunteers [Forrest et al. 2012]. [Oh et al.2004] have argued that openness in FOSS is “[...] generally perceived as havinga positive connotation, however, the term can also be interpreted as referringto some unconstructive characteristics, such as unobstructed exit, susceptible,vulnerable, fragile, lacking effective regulation, and so on. The unobstructedexit and lack of regulatory force inherent in the OSS community can result in acommunity’s susceptibility and vulnerability to herded exits by its participants.Commercial vendor intervention, an alternative project becoming available, andlicensing issues can result in some original core members ceasing to provide theirloyal service for the community, which can prompt their coworkers to leave aswell” [Oh et al. 2004].

Recipes for success or stagnation, sustainability or fragmentation could beidentifiable, leading to a set of best practices and pitfalls.

4 Research Questions

We argue that the social interactions data reflects the changes the communitygoes through, and will be able to describe the context surrounding a forkingevent.Robles and Gonzalez-Barahona [Robles and Gonzalez-Barahona 2012] classifythe main reasons for forking into six classes, listed in Table 3.

Table 3. The main reasons for forking as classified by [Robles and Gonzalez-Barahona2012]

Reason for forking Example forks

Technical (Addition of functionality) Xpdf & Poppler

More community-driven development EGCS & GCC

Discontinuation of the original project Apache web server

Commercial strategy forks LibreOffice & OpenOffice.org

Legal issues X.Org & XFree

Differences among developer team OpenBSD & NetBSD

Three of the six listed reasons are socially related, and so should arguably bereflected somehow in the social interaction data. As an example, if a fork occurs

Page 6: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

6 Amir Azarbakht, Carlos Jensen

because of a desire for “more community-driven development”, we expect to seean interaction patterns in the collaboration data showing. a strongly-connectedcore community getting more isolated from the rest of the community, and theformation of another very active and well-connected core. More specifically, ourresearch hypotheses are:

Hypothesis #1: Social interactions data can be used to predict an imminentfork.

Hypothesis #2: “Personal differences” and “technical differences” are as-sociated with distinctly different interaction patterns. It is possible to tell thedifference between when the reason for forking was personal vs. technical.

Hypothesis #3: It is possible to tell through collaboration patterns betweenwhen the reason for forking is “more community-driven development” and “dif-ferences among the developer team”.

5 Proposed Methodology

Determining how FLOSS networks evolve over time requires us to adopt a casestudy methodology. To this end, we propose to examine several FLOSS projects.This will involve obtaining mailing list archives, bug repositories, Git logs, cre-ating the collaboration graphs, applying social network analysis technique andmeasuring SNA metrics, and visualizing the evolving graphs. We plan to havefive phases as described in the following:

5.1 Phase 1: Data Mining

We plan to study six projects that have forked and six projects that have notforked as a control group. Data collection involves mining mailing list archives,bug repositories, and Git logs. [Howison and Crowston 2004] described theirexperiences of “spidering, parsing and analysis of SourceForge data.” Buildingon their research, we will carefully avoid the pitfalls prone to skew the study.We plan to contribute to and use FLOSSmole [Howison et al. 2006] resources,which provides scripts for FLOSS research data and analyses.

5.2 Phase 2: Creating Communication Graphs

Many social structures can be represented as graphs. The nodes represent ac-tors/players and the edges between them represent the interaction among theactors. Such graphs can be a snapshot of the network, which forms a staticgraph, or a longitudinally changing network, also called a dynamic graph.

Page 7: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

Analyzing FOSS Collab. & Social Dynamics w/ Temporal Social Networks 7

In this phase, we process the data to form a communication graph represent-ing the community. This is fraught with complications. Collaboration graph fora large community, e.g. Linux Kernel with 9515 contributors will be a complexnetwork of 9515 nodes and millions of edges. Special graph sampling algorithmsare needed to use clustering techniques and social network analysis tools on sucha large and complex network. These graph-sampling techniques have been stud-ied and compared by [Wang et al. 2011] and [Lu et al. 2011]. [Chi et al. 2007]e.g. applied a temporal smoothness to their spectral clustering algorithm, andargued that incorporating this method results in a more robust clustering re-sults, while it provides less sensitivity to short-term noise and is “adaptive tolong-term clustering drift.”

5.3 Phase 3: Temporal Evolution Analysis

In this phase, we want to analyze the changes that happen to the communityover a given period of time, e.g. two years before and two years after the forkingevent. To this end, we are interested in seeing the changes in the roles individ-uals have, any shifts in the center of gravity of the project, unusual dilution orconcentration of the part of the community and how information can diffusealong the community. These changes can translate into questions, e.g. is every-one still connected to everyone like before, or there are bridges burnt down. Theroles individuals take in the community are in par with the individual’s impor-tance or centrality in the network. We can measure each individual’s importancewith a social network analysis measure called centrality. i.e. who is more im-portant/central than the others? e.g. who has the most number of friends in acommunity? (degree centrality)

There are many ways of looking at an individual’s importance. This is mea-sured by a centrality called closeness centrality. The farness of a node is definedas the sum of its distances to all other nodes. The closeness of a node is de-fined as the inverse of the farness. More informally, the more central a nodeis the lower its total distance to all other nodes. Closeness centrality can beused as a measure of how fast information will spread through the network[Chakrabarty and Faloutsos 2006]. Secondly, if we are looking for people whocan serve as bridges between two distinct communities, we could measure thenode’s betweenness centrality. Betweenness centralities for mediators who act asintermediate entities between other nodes are higher [Chakrabarty and Falout-sos 2006]. Third, if cross-community collaboration is the focus, we can measureedge betweenness centrality. Edges connecting nodes from different communitieshave higher edge centrality values. In the community collaboration graph, edgebetweenness or stress of an edge is the number of these shortest paths thatthe edge belongs to, considering all shortest paths between all pairs of nodesin the graph. Fourth, one can claim that certain people in the community aremore important than others, and whoever is close to them, is relatively more

Page 8: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

8 Amir Azarbakht, Carlos Jensen

important than others. In graph terms, this is measured by eigenvector central-ity, which is based on the assumption that connections to high-profile nodescontribute more to the importance of a node. Google’s PageRank link-analysisalgorithm by [Brin and Page 1998] is a variant of the eigenvector centrality mea-sure. In short, centrality measures have been used in several studies to identifykey player in a community.

In addition to the centrality measures, we plan to look into the resilienceof the community as well. By resilience, we mean how well the network holdsits structure and form when some parts of it are deleted, added, or changed.For a graph, the resilience of a graph is a measure of its robustness to node oredge failures. This could occur for instance when an influential member of thecommunity leaves. Many real-world graphs are resilient to random failures butvulnerable to targeted attacks. Resilience can be related to the graph diameter :a graph whose diameter does not increase much on node or edge removal hashigher resilience [Chakrabarty and Faloutsos 2006].

Fig. 1. Heat-map color-coded examples of nodes with high centrality metric areshown above. The same network is analysed four times with the following centralitymeasures: A) Degree centrality, B) Closeness centrality, C) Betweenness centrality andD) Eigenvector centrality [Rocchini 2012]

Page 9: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

Analyzing FOSS Collab. & Social Dynamics w/ Temporal Social Networks 9

Fig. 2. Github community visualization, color-coded by users’ primary programminglanguage used [Cuny 2010]

We plan to measure these metrics defined on the communication graph, aswell as clustering coefficient of node, degree distribution of the entire network,and the graph’s diameter, whose definitions are out of the scope of this paper.

The significance of SNA measures on a single collaboration graph is definedas compared to a random graph, e.g. an Erdos-Renyi random graph of the samesize [Erdos and Renyi 1959], to examine their relative significance. We plan todo this analysis over time, rather than just a one-snapshot analysis. So, thetemporal changes in measured values are what we are primarily interested in.

We are aware that while datasets consisting of huge real-world graphs are ob-tainable, their sizes may range from thousands nodes and several million edges.While conventional algorithms to compute graph metrics (shortest paths, cen-trality, betweenness, etc.) exist, many of these algorithms become impracticalfor large graphs. Leskovec [Leskovec et al. 2008] argues that one can use sam-pling algorithms – but that is a non-trivial problem since the goal is to maintainstructural properties of the network.

Page 10: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

10 Amir Azarbakht, Carlos Jensen

5.4 Phase 4: Temporal Visualization

Several visualization techniques and tools are used in the field of social net-work analysis, for instance Gephi [Bastian et al. 2009], NetLogo [Wilensky1999], igraph [Csardi and Nepusz 2006], NetworkX [Hagberg et al. 2008], SoNIA[Bender-deMoll and McFarland 2006], NodeXL [Smith et al. 2009]. Gephi, is aFLOSS tool for exploring and manipulating networks. It is capable of handlinglarge networks with more than 20,000 nodes and features several SNA algo-rithms. It is customizable with plugins and can be used for dynamic networkvisualization.

5.5 Phase 5: Interviewing & Triangulation

To complement the quantitative techniques, we plan to interview key playersidentified in the community and the situation studied to gain their perspectiveon the community as well. This will be useful in cross-validating the results,as the part of communication that was not logged online could be taken intoaccount in this way. Furthermore, we will compare our findings to those of[Robles and Gonzalez-Barahona 2012]. This should allow us to determine howwell the SNA techniques work to identify the underlying reasons for a fork.

5.6 Challenges

The proposed technique uses the data from online communications. The as-sumption that all the communication can be captured by mining repositoriesis intuitively imperfect, but inevitable. Hence, to minimize the effect of thisassumption, we complement the quantitative approach with a qualitative ap-proach of interviewing key individuals from the community. This will help tri-angulate the results.

6 Preliminary Results

Some initial work is done, namely scraping mailing list data off several FOSSprojects, as well as their bug repositories. We have also done a literature reviewof the existing work on social network analysis on FOSS communities, andhave experienced working with large graphs. There are a myriad of possibleapplications for the possible outcomes of this line of research, and the gap inthe existing body of research shows a need for further study of FOSS socialnetwork analysis.

7 Acknowledgments

We would like to thank the peers at the Human-Computer Interaction Lab atOregon State University, with special thanks to Victor Kuechler and JenniferL. Davidson for their feedback.

Page 11: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

Analyzing FOSS Collab. & Social Dynamics w/ Temporal Social Networks 11

References

1. Asur, S., S. Parthasarathy, and D. Ucar, (2009), “An event-based framework forcharacterizing the evolutionary behavior of interaction graphs,” in ACM Trans.Knowledge Discovery Data. 3, 4, Article 16, (November 2009), 36 pages. 2009.

2. Bastian, M., S. Heymann, and M. Jacomy, “Gephi: an open source software forexploring and manipulating networks,” presented at the International AAAI Con-ference on Weblogs and Social Media, 2009.

3. Bender-deMoll, S. and D. A. McFarland, “The Art and Science of Dynamic Net-work Visualization,” Journal of Social Structure, vol. 7, 2006.

4. Bergquist, M. and Ljungberg, J., (2001). , “The power of gifts: organizing socialrelationships in open source communities,” Info. Systems J., 11, 305-320. 2001.

5. Bird, C., D. Pattison, R. D’Souza, V. Filkov, and P. Devanbu, “Latent socialstructure in open source projects,” in Proceedings of the 16th ACM SIGSOFTInternational Symposium on Foundations of software engineering, New York, NY,USA: ACM, pp. 24-35, 2008.

6. Brin, S. and L. Page, “The anatomy of a large-scale hypertextual Web searchengine,” in Proceedings of the 7th international conference on World Wide Web7 (WWW7), Philip H. Enslow, Jr. and Allen Ellis (Eds.). Elsevier Science Pub-lishers B. V., Amsterdam, The Netherlands, The Netherlands, 107-117. 1998.

7. Chakrabarti, D. and C. Faloutsos. “Graph mining: Laws, generators, and algo-rithms,” ACM Computing Surveys, 38, 1, Article 2, 2006.

8. Chi, Y., X. Song, D. Zhou, K. Hino, and B. L. Tseng, “Evolutionary spectral clus-tering by incorporating temporal smoothness,” in Proceedings of the 13th ACMSIGKDD International Conference on Knowledge Discovery and Data Mining(KDD ’07). ACM, New York, NY, USA, 153-162. 2007.

9. Crowston, K., and Scozzi, B., (2002). “Open Source Software Projects as VirtualOrganizations: Competency Rallying for Software Development,” IEE Proceed-ings Software. 149(1), 3-17.

10. Crowston, K., K. Wei, J. Howison. 2012. “Free/Libre open-source software devel-opment: What we know and what we do not know,” ACM Comput. Surv. 44, 2,Article 7, 2012.

11. Csardi, G. and T. Nepusz, “The igraph software package for complex networkresearch,” in InterJournal Complex Systems, 2006.

12. Cuny, F., (Mar. 25 2010), Github Explorer, Available at:http://www.flickr.com/photos/franck /4460144638/, 2010.

13. De Souza, C. R. B., Froehlich, J., and Dourish, P. (2005), “Seeking the Source:Software Source Code as a Social and Technical Artifact,” Proc. ACM Intern.Conf. Supporting Group Work (GROUP 2005), 197-206, Sanibel Island, Florida.2005.

14. Elliott, M. and Scacchi, W., (2003), “Free Software Developers as an OccupationalCommunity: Resolving Conflicts and Fostering Collaboration,” Proc. ACM In-tern. Conf. Supporting Group Work, 21-30, Sanibel Island, FL, November. 2003.

15. Erdos, P. and A. Renyi, “On random graphs, I,” Publicationes Mathematicae(Debrecen), vol. 6, pp. 290-297, 1959.

16. FLOSS (2002). , “Free/Libre and Open Source Software: Survey and Study,FLOSS Final Report,”http://www.flossproject.org/report/. 2006.

Page 12: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

12 Amir Azarbakht, Carlos Jensen

17. Ford, L. R. and D. R. Folkerson, “A simple algorithm for finding maximal net-work flows and an application to the Hitchcock problem,” Canadian Journal ofMathematics, vol. 9, pp. 210-218, 1957.

18. Hagberg, A. A., D. A. Schult, and P. J. Swart, “Exploring network structure,dynamics, and function using NetworkX,” in Proceedings of the 7th Python inScience Conference (SciPy2008), Pasadena, CA USA, 2008.

19. Howison, J. and K. Crowston. “The perils and pitfalls of mining SourceForge,”In Proceedings of the International Workshop on Mining Software Repositories(MSR 2004), pp. 7-11. 2004.

20. Howison, J., K. Inoue, and K. Crowston, “Social dynamics of free and open sourceteam communications,” in Proceedings of the IFIP Second International Confer-ence on Open Source Systems, 319-330, 2006.

21. Howison, J., M. Conklin, and K. Crowston, “FLOSSmole: A collaborative reposi-tory for FLOSS research data and analyses,” International Journal of InformationTechnology and Web Engineering, 1(3), 1726. 2006.

22. Huntley, C.L., (2003), “Organizational Learning in Open-Source SoftwareProjects: An Analysis of Debugging Data,” IEEE Trans. Engineering Manage-ment, 50(4), 485-493. 2003.

23. Jergensen, C., A. Sarma, and P. Wagstrom. 2011. “The onion patch: migrationin open source ecosystems,” In Proceedings of the 19th ACM SIGSOFT sympo-sium and the 13th European conference on Foundations of software engineering(ESEC/FSE ’11). ACM, New York, NY, USA, 70-80. 2011.

24. Koch, S. (Ed.), (2005), “Free/Open Source Software Development,” Idea GroupPublishing, Hershey, PA. 2005.

25. Kuechler, V., C. Gilbertson, and C. Jensen, “Gender Differences in Early Freeand Open Source Software Joining Process,” Open Source Systems: Long-TermSustainability, pp. 78-93. 2012.

26. Kunegis, J., S. Sizov, F. Schwagereit, and D. Fay, “Diversity dynamics in onlinenetworks,” in Proceedings of the 23rd ACM Conference on Hypertext and SocialMedia, Milwaukee, Wisconsin, USA, 2012.

27. Leskovec, J., Kleinberg, J., and Faloutsos, C.: “Graphs over time: densifica-tion laws, shrinking diameters and possible explanations,” in Proceedings of theSIGKDD International Conference on Knowledge Discovery and data Mining,2005.

28. Leskovec, J., K. J. Lang, A. Dasgupta, and M. W. Mahoney, “Statistical propertiesof community structure in large social and information networks,” in Proceedingsof the 17th International Conference on World Wide Web (WWW ’08), ACM,New York, NY, USA, 695-704. 2008.

29. Lu, J. and D. Li, “Sampling online social networks by random walk,” in Proceed-ings of the First ACM International Workshop on Hot Topics on InterdisciplinarySocial Networks Research, ACM, New York, NY, USA, 33-40. 2012.

30. Madey, G., Freeh, V., and Tynan, R., (2002), “The Open Source DevelopmentPhenomenon: An Analysis Based on Social Network Theory,” Proc. AmericasConf. Info. Systems (AMCIS2002), 1806-1813, Dallas, TX. 2002.

31. Madey, G., Freeh, V., and Tynan, R., (2005), “Modeling the F/OSS Community:A Quantitative Investigation,” in S. Koch (ed.), Free/Open Source Software De-velopment, 203-221, Idea Group Publishing, Hershey, PA. 2005.

32. Nyman, L. , “Understanding code forking in open source software,” in Proceedingsof the 7th International Conference on Open Source Systems Doctoral Consor-tium, Salvador, Brazil, 2011.

Page 13: Analyzing FOSS Collaboration and Social Dynamics with ...research.engr.oregonstate.edu/hci/sites/research... · Keywords. FOSS, Social Dynamics, Temporal Analysis, Free/Open Source

Analyzing FOSS Collab. & Social Dynamics w/ Temporal Social Networks 13

33. Nyman, L., T. Mikkonen, J. Lindman, and M. Fougere, “Forking: the invisiblehand of sustainability in open source software,” in Proceedings of SOS 2011:Towards Sustainable Open Source, 2011.

34. Oh, W. and S. Jeon, “Membership dynamics and network stability in the open-source community: the ising perspective,” in Proceedings of the InternationalConference on Information Systems, 2004.

35. Ovaska, P., Rossi, M. and Marttiin, P. (2003), “Architecture as a CoordinationTool in Multi-Site Software Development,” Software ProcessImprovement andPractice, 8(3), 233-247. 2003.

36. Robles, G. and J. M. Gonzalez-Barahona, “A comprehensive study of softwareforks: Dates, reasons and outcomes,” in Proceedings of the 8th InternationalConference on Open Source Systems, Hammamet, Tunisia, 2012.

37. Rocchini, C. (Nov. 27 2012), Wikimedia Commons, Available:http://en.wikipedia.org/wiki/File:Centrality.svg, 2012.

38. Scacchi, W., (2004), “Free/Open Source Software Development Practices in theComputer Game Community,” IEEE Software, 21(1), 59-67, January/February.2004.

39. Scacchi, W., (2007), “Free/Open Source Software Development: Recent ResearchResults and Methods,” in M.V. Zelkowitz (ed.), Advances in Computers, 69, 243-295, 2007.

40. Smith, M. A., B. Shneiderman, N. Milic-Frayling, E. Mendes Rodrigues, V.Barash, C. Dunne, T. Capone, A. Perer, and E. Gleave, “Analyzing (social me-dia) networks with NodeXL,” in Proceedings of the 4th International Conferenceon Communities and Technologies (C& T ’09), 2009.

41. Sowe, S., L. Stamelos, and L. Angelis, “Identifying knowledge brokers that yieldsoftware engineering knowledge in OSS projects,” Information and Software Tech-nology, vol. 48, pp. 1025-1033, Nov 2006.

42. Torres, M. R. M., S. L. Toral, M. Perales, and F. Barrero, “Analysis of the CoreTeam Role in Open Source Communities,” in Complex, Intelligent and SoftwareIntensive Systems (CISIS), 2011 International Conference on, pp. 109-114. IEEE,2011.

43. Wang, T., Y. Chen, Z. Zhang, T. Xu, L. Jin, P. Hui, B. Deng, and X. Li. “Under-standing graph sampling algorithms for social network analysis,” in DistributedComputing Systems Workshops (ICDCSW), 31st International Conference on,pp. 123-128. IEEE, 2011.

44. Wilensky, U., “NetLogo,” in Center for Connected Learning and Computer-BasedModeling, Northwestern University. Evanston, IL., 1999.

45. Xu, J., Y. Gao, S Christley, and G Madey. 2005, “A Topological Analysis of theOpen Souce Software Development Community,” in Proceedings of the Proceed-ings of the 38th Annual Hawaii International Conference on System Sciences -Volume 07 (HICSS ’05), Vol. 7. IEEE Computer Society, Washington, DC, USA.2005.

46. Ye, Y. and Kishida, K., (2003), “Towards an understanding of the motivation ofopen source software developers,” Proc. 25th Intern. Conf. Software Engineering,Portland, OR, 419-429, IEEE Computer Society, May. 2003.

47. Zachary, W., “An information flow model for conflict and fission in small groups,”Journal of Anthropological Research, vol. 33, no. 4, pp. 452-473, 1977.