Crowdsourcing Documentation in Software Engineering
Presented at the ICSE 2014 Workshop on Crowdsourcing in Software Engineering, June 2, 2014, Hyderabad, India.
Crowdsourcing Documentation in Software Engineering
Margaret-Anne (Peggy) Storey
ICSE 2014 1st International Workshop on Crowdsourcing in Software Engineering
Acknowledgements
Christoph Treude, Brendan Cleary, Fernando Figueira Filho, Jamie Starke, Gargi Bougie, Peter Rigby, Lars Grammel, Leif Singer, Laura MacLeod, Daniel German, Alexey Zagalsky
Chris Parnin, Georgia Tech
Ohad Barzilay, Tel-Aviv University, Israel
Arie van Deursen, TU Delft, the Netherlands
Li-Te Cheng, IBM Research
Ian Bull, Eclipsesource
“Documentation is the castor oil of software development”
Gerald Weinberg, The Psychology of Computer Programming, 1971
Documentation to capture…
Requirements
Architecture
Features, implementation
Scenarios of use
Examples of use
Testing
Decisions
And more?
Created by…
Developers, contributors
Documenters
Automatically generated
Users
The crowd!
Designed for…
End users
Client developers
Contributors
Documentation rationale…
To replace communication
To specify a contract with partners
To provide organizational memory
To reflect
To seek feedback
For the public good! [Wasko et al.]
Documentation formats…
Formal documentation (hierarchically structured)
Technical articles
Books
Self-documenting code
Source code comments
Forums
Email lists
Usenet
Issues, bug tracking
Archived chats
Wikis
Blog posts, microblogs
Tagging
Stack Overflow
Videos, podcasts
Community portals (aggregate channels)
Documentation challenges…
Navigability, discoverability
Audience and “fit for purpose”
Boring prose
Consistent use of terminology
Staying current
Costly, slow
Explicit versus tacit knowledge
Lack of good examples
Crowdsourcing…
“…obtaining needed services, ideas, or content by soliciting contributions from a large group of people, and especially from an online community, rather than from traditional employees or suppliers… the work comes from an undefined public rather than being commissioned from a specific, named group…
Explicit crowdsourcing lets users work together to evaluate, share and build different specific tasks, while implicit crowdsourcing means that users solve a problem as a side effect of something else they are doing.” [Wikipedia, June 1, 2014]
Community versus crowd contributions?
Individual or team contributions (e.g. design documents, podcasts)
Community contributions: created by a few (e.g. translation efforts)
Crowdsourcing contributions: many small contributions that add value (e.g. views, likes, comments, tags, votes)
Social production [Yochai Benkler]
Industrial revolution: high costs to access broadcast media
Low-cost, distributed, small contributions at scale
Not just turning levers but adding wisdom, creativity
Not a fad! Critical long-term shift caused by the internet
Social media as a disruptive force: an enabler for crowdsourcing
Enhancing the participatory culture in software development and in software documentation
Storey, M.-A., L. Singer, F. Figueira Filho, B. Cleary and A. Zagalsky, The (R)evolutionary Role of Social Media in Software Engineering (ICSE 2014 Future of Software Engineering Track), Hyderabad, 2014.
Social Media Channels for Software Documentation
Community Portal
Tagging
Microblogging
Question & Answer Websites
Videos, podcasts
Blogging
Wikis
Outline of the rest of this talk
Some insights on how social media channels can support “crowdsourced” documentation in software development
Discussion
Wikis
Wikis for documenting Software
Wikis and software documentation
Used extensively (requirements, design, planning), integrated with many tools
Some shortcomings: lack of authoritativeness [Dagenais and Robillard FSE 2010]
Designed by Ward Cunningham in 1994
Social Tagging
How does tagging help with crowdsourced software documentation?
TagSEA: Tagging Waypoints in source code and gathering into Tours
M.-A. Storey, J. Ryall, J. Singer, D. Myers, L.-T. Cheng, M. Muller. How Software Developers Use Tagging to Support Reminding and Refinding. IEEE Transactions on Software Engineering (TSE), 2009.
Tagging in
Studied the introduction and adoption of tags by several teams for work items
C. Treude and M.-A. Storey. Work Item Tagging: Communicating Concerns in Collaborative Software Development. In IEEE Transactions on Software Engineering 38, 1 (January/February 2012). pp. 19-34.
Tagging in
Findings:
– Categorization (cross-cutting concerns, see also Martin Robillard’s Feat tool)
– Organization
– Finding and refinding
ConcernLines
Treude, C., and M.-A. Storey, ConcernLines: A timeline view of co-occurring concerns, formal research demonstration, IEEE ICSE’09.
Microblogging
Why do developers tweet?
Microblogging
Software engineers actively tweet (share) facts about software engineering topics and technology
G. Bougie, J. Starke, M.-A. Storey and D. German. Towards Understanding Twitter Use in Software Engineering: Preliminary Findings, Ongoing Challenges and Future Questions. In Proceedings of the 2nd International Workshop on Web 2.0 for Software Engineering. 2011.
Survey/Interviews/Survey
Findings:
– Awareness
– Learning
– Relationships
“It was evolving way faster than I was able to keep up with it. And the only way to keep up was to follow some Node.js people on Twitter.”
Leif Singer, Fernando Figueira Filho, Margaret-Anne Storey. Software Engineering at the Speed of Light: How Developers Stay Current Using Twitter. ICSE 2014.
Blogging
Why do developers blog?
Blogging
Determining requirements through blogs [Park and Maurer, CHASE 2009]
How developers blog: high-level concept discussion and requirements [Pagano and Maalej, MSR 2011]
Blogs play a role in documenting APIs [Treude and Parnin, Web2SE 2011]
Is there potential to increase the size of the Blogging crowd for software documentation?
Question and Answer Websites
What role do Question and Answer websites play in documentation?
Over 92% of the questions on Stack Overflow are answered, and for those 92% the median answer time is 11 minutes
L. Mamykina, B. Manoim, M. Mittal, G. Hripcsak, and B. Hartmann. Design Lessons from the Fastest Q&A Site in the West. CHI 2011.
Stack Overflow
How-to questions prevalent, and used frequently by novices
C. Treude, O. Barzilay and M.-A. Storey. How do Programmers Ask and Answer Questions on the Web? NIER/ICSE 2011.
Linking Stack Overflow data with API usage
C. Parnin, C. Treude, L. Grammel and M.-A. Storey. Crowd Documentation: Exploring the Coverage and the Dynamics of API Discussions on Stack Overflow. Under submission, blogged (50,000 hits) at http://blog.ninlabs.com/2012/05/crowd-documentation/, May 2012.
Stack Overflow as Crowd Documentation
Coverage of API documentation: 77% of the Java API classes & 87% of Android API classes
Speed of coverage:
Impact on documentation tools?
Automatically generating documentation
Visualizing crowd documentation
http://latest-print.crowd-documentation.appspot.com/?api=android
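One way to read the coverage figures above is as the share of API classes that are mentioned in at least one Stack Overflow thread. The following is a minimal sketch of that idea only. It assumes a plain word match on class names, which is not necessarily how the study linked threads to classes, and the function name class_coverage, the class list and the thread texts are all hypothetical toy data.

```python
import re

def class_coverage(api_classes, threads):
    """Return the fraction of API classes mentioned in at least one thread.

    Assumption: a class counts as "mentioned" when its simple name appears
    as a whole word in the thread text. Real linking would need to handle
    ambiguous names, code snippets and hyperlinks.
    """
    wanted = set(api_classes)
    if not wanted:
        return 0.0
    mentioned = set()
    for text in threads:
        # Split the thread text into identifier-like tokens.
        tokens = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))
        mentioned |= tokens & wanted
    return len(mentioned) / len(wanted)

# Hypothetical toy data, not taken from the study.
api_classes = ["Activity", "Intent", "Fragment", "Parcel"]
threads = [
    "How do I start an Activity with an Intent?",
    "Fragment lifecycle confusion",
]

print(f"Covered: {class_coverage(api_classes, threads):.0%}")
```

Running the same count per month of thread creation dates would give a rough view of how quickly coverage grows.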
How do Developers use YouTube to Share Knowledge?
Videos, podcasts
Developer motivations?
Documentation! But also …
Reputation: Improves their online persona
Dedication to helping others: “What I wish I had known when I started”
Efficiency: “Throw it up on the internet and forget about it”
http://lmacleod.com/
Implications
Many projects use videos to support documentation and onboarding (e.g. MSDN) so…
How can they be improved for the recipient?
How effective are videos at sharing tacit knowledge?
Tool enhancements? Integration with IDE? [e.g. Tours]
Cheng, L.-T., M. Desmond and M.-A. Storey, “Presentations by Programmers for Programmers”, ICSE 2007, IEEE 29th International Conference on Software Engineering.
Is this crowdsourcing? Are code walkthroughs on YouTube effective?
How much do the social features matter?
A social platform for crowd input for video documentation?
Community portals
Stores code and project resources
Provides version control
Hosts web pages
Connects people
Links to communication tools
Records interactions
C. Treude and M.-A. Storey. Effective Communication of Software Development Knowledge Through Community Portals. ESEC/FSE ’11.
Implications of different media
Content on wikis is often stale, but useful for posting information quickly
Blog posts create more buzz or fanfare
Official product documentation is trusted (review it carefully or rely on the crowd?)
Have an updating process (or crowdsource it?)
Have mechanisms to solicit feedback (e.g. commenting, blog posts, voting)
Social Media Channels to support Software Documentation
Community Portal
Tagging
Microblogging
Question & Answer Websites
Videos, podcasts
Blogging
Wikis
Discussion
Documentation challenges revisited
Recommenders to aid in discoverability
Keeping up: leverage the crowd
Incentive: participatory culture
Video and podcasts for tacit knowledge
Mining of social media can point to code examples (implicit mechanism)
Discussion points
When does a community become a crowd?
Gaps and nichification?
Incentives? Dynamics?
Study other portals, hubs?
Do these mechanisms translate to industry?
What do you see as challenges, opportunities for involving the crowd?
http://www.thechiselgroup.org http://margaretannestorey.wordpress.com/
@thechiselgroup, @margaretstorey
Funded by NSERC/DRDC/IBM