Crowdsourcing Documentation in Software Engineering
Margaret-Anne (Peggy) Storey
ICSE 2014 1st International Workshop on Crowdsourcing in Software Engineering

DESCRIPTION

Presented at the ICSE 2014 Workshop on Crowdsourcing in Software Engineering, June 2, 2014, Hyderabad, India.

TRANSCRIPT

Page 1: Crowdsourcing Documentation in Software Engineering

Crowdsourcing Documentation in Software Engineering

Margaret-Anne (Peggy) Storey
ICSE 2014 1st International Workshop on Crowdsourcing in Software Engineering

Page 2: Crowdsourcing Documentation in Software Engineering

Acknowledgements

Christoph Treude, Brendan Cleary, Fernando Figueira Filho, Jamie Starke, Gargi Bougie, Peter Rigby, Lars Grammel, Leif Singer, Laura MacLeod, Daniel German, Alexey Zagalsky

Chris Parnin, Georgia Tech
Ohad Barzilay, Tel-Aviv University, Israel
Arie van Deursen, TU Delft, the Netherlands
Li-Te Cheng, IBM Research
Ian Bull, Eclipsesource

Page 3: Crowdsourcing Documentation in Software Engineering

“Documentation is the castor oil of software development”

Gerald Weinberg, The Psychology of Computer Programming, 1971

Page 4: Crowdsourcing Documentation in Software Engineering

Documentation to capture…

Requirements
Architecture
Features, implementation
Scenarios of use
Examples of use
Testing
Decisions
And more?

Page 5: Crowdsourcing Documentation in Software Engineering

Created by…

Developers, contributors
Documenters
Automatically generated
Users
The crowd!

Designed for…

End users
Client developers
Contributors

Page 6: Crowdsourcing Documentation in Software Engineering

Documentation rationale…

To replace communication
To specify a contract with partners
To provide organizational memory
To reflect
To seek feedback

For the public good! [Wasko et al.]

Page 7: Crowdsourcing Documentation in Software Engineering

Documentation formats…

Formal documentation (hierarchically structured)

Technical articles
Books
Self-documenting code
Source code comments
Forums
Email lists
Usenet
Issues, bug tracking
Archived chats
Wikis
Blog posts, microblogs
Tagging
Stackoverflow
Videos, podcasts
Community portals (aggregate channels)

Page 8: Crowdsourcing Documentation in Software Engineering

Documentation challenges…

Navigability, discoverability
Audience and “fit for purpose”
Boring prose
Consistent use of terminology
Staying current
Costly, slow
Explicit versus tacit knowledge
Lack of good examples

Page 9: Crowdsourcing Documentation in Software Engineering

Crowdsourcing…

“…obtaining needed services, ideas, or content by soliciting contributions from a large group of people, and especially from an online community, rather than from traditional employees or suppliers… the work comes from an undefined public rather than being commissioned from a specific, named group…

Explicit crowdsourcing lets users work together to evaluate, share and build different specific tasks, while implicit crowdsourcing means that users solve a problem as a side effect of something else they are doing.” [Wikipedia, June 1, 2014]

Page 10: Crowdsourcing Documentation in Software Engineering

Community versus crowd contributions?

Individual or team contributions (e.g. design documents, podcasts)

Community contributions: created by a few (e.g. translation efforts)

Crowdsourcing contributions: many small contributions that add value (e.g. views, likes, comments, tags, votes)

Page 11: Crowdsourcing Documentation in Software Engineering

Social production [Yochai Benkler]

Industrial revolution, high costs to access broadcast media
Low-cost, distributed small contributions at scale
Not just turning levers but adding wisdom, creativity
Not a fad! Critical long-term shift caused by the internet

Page 12: Crowdsourcing Documentation in Software Engineering

Social media as a disruptive force: an enabler for crowdsourcing

Enhancing the participatory culture in software development and in software documentation

Storey, M.-A., L. Singer, F. Figueira Filho, B. Cleary and A. Zagalsky, The (R)evolutionary Role of Social Media in Software Engineering, ICSE 2014 (Future of Software Engineering Track), Hyderabad, 2014.

Page 13: Crowdsourcing Documentation in Software Engineering

Social Media Channels for Software Documentation

Community Portal

Tagging

Microblogging

Question & Answer Websites

Videos, podcasts

Blogging

Wikis

Page 14: Crowdsourcing Documentation in Software Engineering

Outline of the rest of this talk

Some insights on how social media channels can support “crowdsourced” documentation in software development

Discussion

Page 15: Crowdsourcing Documentation in Software Engineering

Community Portals

Tagging

MicroBlogging

Question & Answer Websites

Videos, podcasts

Blogging

Wikis

Page 16: Crowdsourcing Documentation in Software Engineering

Wikis

Wikis for documenting Software

Page 17: Crowdsourcing Documentation in Software Engineering

Wikis and software documentation

Used extensively (requirements, design, planning), integrated with many tools

Some shortcomings: lack of authoritativeness [Dagenais and Robillard FSE 2010]

Designed by Ward Cunningham in 1994

Page 18: Crowdsourcing Documentation in Software Engineering

Community Portals

Question & Answer Websites

Videos, podcasts

Tagging

Wikis

MicroBlogging

Blogging

Page 19: Crowdsourcing Documentation in Software Engineering

Social Tagging

How does tagging help with crowdsourced software documentation?

Page 21: Crowdsourcing Documentation in Software Engineering

TagSEA: Tagging Waypoints in source code and gathering into Tours

M.-A. Storey, J. Ryall, J. Singer, D. Myers, L.-T. Cheng, M. Muller, 2009. How Software Developers Use Tagging to Support Reminding and Refinding. IEEE Transactions on Software Engineering (TSE), 2009.

Page 22: Crowdsourcing Documentation in Software Engineering

Tagging in

Studied the introduction and adoption of tags by several teams for work items

C. Treude and M.-A. Storey. Work Item Tagging: Communicating Concerns in Collaborative Software Development. In IEEE Transactions on Software Engineering 38, 1 (January/February 2012). pp. 19-34.

Page 23: Crowdsourcing Documentation in Software Engineering

Tagging in

Findings:
– Categorization (cross-cutting concerns, see also Martin Robillard’s FEAT tool)
– Organization
– Finding and refinding

Page 24: Crowdsourcing Documentation in Software Engineering

ConcernLines

Treude, C., and M.-A. Storey, Concernlines: A timeline view of co-occurring concerns, formal research demonstration, IEEE ICSE’09.
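
The idea of a timeline of co-occurring concerns can be illustrated with a small counting sketch. This is not the ConcernLines implementation; the work items, tag names, and the `cooccurrence_by_period` helper below are all hypothetical, and the only assumption is that each work item carries a time period and a set of tags.

```python
from collections import Counter
from itertools import combinations

# Hypothetical work items: (ISO week, tags attached to the item).
work_items = [
    ("2009-W01", {"ui", "performance"}),
    ("2009-W01", {"ui", "performance", "testing"}),
    ("2009-W02", {"testing", "ui"}),
]

def cooccurrence_by_period(items):
    """Count, per period, how often each unordered tag pair shares a work item."""
    timeline = {}
    for period, tags in items:
        counts = timeline.setdefault(period, Counter())
        # Sorting the tags gives every pair one canonical (a, b) ordering.
        for pair in combinations(sorted(tags), 2):
            counts[pair] += 1
    return timeline

timeline = cooccurrence_by_period(work_items)
for period in sorted(timeline):
    print(period, timeline[period].most_common())
```

Pairs that keep a high count across consecutive periods are candidates for concerns that travel together, which is the kind of pattern a timeline view is meant to surface.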

Page 25: Crowdsourcing Documentation in Software Engineering

Question & Answer Websites

Tagging

MicroBlogging

Community Portals

Videos, podcasts

Wikis

Blogging

Page 26: Crowdsourcing Documentation in Software Engineering

Microblogging

Why do developers tweet?

Page 27: Crowdsourcing Documentation in Software Engineering

Microblogging

Software engineers actively tweet (share) facts about software engineering topics and technology

G. Bougie, J. Starke, M.-A. Storey and D. German. Towards Understanding Twitter Use in Software Engineering: Preliminary Findings, Ongoing Challenges and Future Questions. In Proceedings of the 2nd International Workshop on Web 2.0 for Software Engineering. 2011.

Page 28: Crowdsourcing Documentation in Software Engineering

Survey/Interviews/Survey

Findings:
– Awareness
– Learning
– Relationships

“It was evolving way faster than I was able to keep up with it. And the only way to keep up was to follow some Node.js people on Twitter.”

Leif Singer, Fernando Figueira Filho, Margaret-Anne Storey. Software Engineering at the Speed of Light: How Developers Stay Current Using Twitter. ICSE 2014.

Page 29: Crowdsourcing Documentation in Software Engineering

Question & Answer Websites

Tagging

MicroBlogging

Blogging

Community Portal

Videos, podcasts

Wikis

Page 30: Crowdsourcing Documentation in Software Engineering

Blogging

Why do developers blog?

Page 31: Crowdsourcing Documentation in Software Engineering

Blogging

Determining requirements through blogs [Park and Maurer, CHASE 2009]

How developers blog: high-level concept discussion and requirements [Pagano and Maalej, MSR 2011]

Blogs play a role in documenting APIs [Treude and Parnin, Web2SE 2011]

Is there potential to increase the size of the Blogging crowd for software documentation?

Page 32: Crowdsourcing Documentation in Software Engineering

Question & Answer Websites

Tagging

MicroBlogging

Blogging

Community Portal

Videos, podcasts

Wikis

Page 33: Crowdsourcing Documentation in Software Engineering

Question and Answer Websites

What role do Question and Answer websites play in documentation?

Page 35: Crowdsourcing Documentation in Software Engineering

Over 92% of the questions on Stackoverflow are answered, and for those 92% the median answer time is 11 minutes

L. Mamykina, B. Manoim, M. Mittal, G. Hripcsak, and B. Hartmann. Design lessons from the fastest q&a site in the west. CHI 2011.
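
To make the two figures concrete, here is a minimal sketch of how an answered share and a median answer time could be computed. The sample delays are invented, the `answer_stats` helper is hypothetical, and the sketch assumes "answer time" means the delay until a question's first answer.

```python
from statistics import median

# Hypothetical sample: minutes until the first answer, None if never answered.
first_answer_minutes = [3, 11, None, 45, 7, 120, 11, 2, None, 15]

def answer_stats(delays):
    """Return (share of questions answered, median minutes to first answer)."""
    answered = [d for d in delays if d is not None]
    share = len(answered) / len(delays) if delays else 0.0
    # The median is taken over answered questions only, as on the slide.
    med = median(answered) if answered else None
    return share, med

share, med = answer_stats(first_answer_minutes)
print(f"{share:.0%} answered, median first answer after {med} minutes")
```

The median is used rather than the mean because a few very slow answers (the 120 in the sample) would otherwise dominate the result.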

Page 36: Crowdsourcing Documentation in Software Engineering

Stackoverflow

How-to questions prevalent, and used frequently by novices

C. Treude, O. Barzilay and M.-A. Storey. How do Programmers Ask and Answer Questions on the Web? NIER/ICSE 2011.

Page 37: Crowdsourcing Documentation in Software Engineering

Linking Stackoverflow data with API usage

C. Parnin, C. Treude, L. Grammel and M.-A. Storey. Crowd Documentation: Exploring the Coverage and the Dynamics of API Discussions on Stack Overflow. Under submission, blogged (50,000 hits) at http://blog.ninlabs.com/2012/05/crowd-documentation/, May 2012.
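
One simple way to picture linking discussion data with an API is to match class names against thread text and report the share of classes mentioned at least once. The slides do not describe the authors' actual linking method, so the following is only a naive whole-word matching sketch; the class names, thread texts, and the `covered_classes` helper are hypothetical.

```python
import re

# Hypothetical inputs: a handful of API class names and some thread texts.
api_classes = {"ArrayList", "HashMap", "Thread", "BitSet"}
threads = [
    "How do I sort an ArrayList of strings?",
    "HashMap vs Hashtable: which one is thread-safe?",
    "Why does my Thread never stop?",
]

def covered_classes(classes, texts):
    """Return the classes whose exact name appears as a whole word in any text."""
    found = set()
    for text in texts:
        # Case-sensitive identifier-like tokens, so "thread-safe" != "Thread".
        words = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))
        found |= classes & words
    return found

covered = covered_classes(api_classes, threads)
coverage = len(covered) / len(api_classes)
print(sorted(covered), f"{coverage:.0%}")
```

Plain word matching over-counts classes with common names and misses classes referred to only through their methods, so a real study would need a more careful linking step.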

Page 38: Crowdsourcing Documentation in Software Engineering

Stackoverflow as Crowd Documentation

Coverage of API documentation: 77% of the Java API classes & 87% of Android API classes

Speed of coverage:

Page 39: Crowdsourcing Documentation in Software Engineering

Impact on documentation tools?

Automatically generating documentation
Visualizing crowd documentation

http://latest-print.crowd-documentation.appspot.com/?api=android

Page 41: Crowdsourcing Documentation in Software Engineering

Community Portals

Question & Answer Websites

Videos, podcasts

Tagging

Wikis

MicroBlogging

Blogging

Page 42: Crowdsourcing Documentation in Software Engineering

How do Developers use YouTube to Share Knowledge?

Videos, podcasts

Page 44: Crowdsourcing Documentation in Software Engineering

Developer motivations?

Documentation! But also …

Reputation: Improves their online persona

Dedication to helping others: “What I wish I had known when I started”

Efficiency: “Throw it up on the internet and forget about it”

http://lmacleod.com/

Page 45: Crowdsourcing Documentation in Software Engineering

Implications

Many projects use videos to support documentation and onboarding (e.g. MSDN) so…

How can they be improved for the recipient?
How effective are videos at sharing tacit knowledge?
Tool enhancements? Integration with IDE? [e.g. Tours]

Cheng, L.-T., M. Desmond and M.-A. Storey, “Presentations by Programmers for Programmers”, ICSE 2007, IEEE 29th International Conference on Software Engineering.

Page 46: Crowdsourcing Documentation in Software Engineering

Is this crowdsourcing?

Are code walkthroughs on YouTube effective?

How much do the social features matter?

A social platform for crowd input for video documentation?

Page 47: Crowdsourcing Documentation in Software Engineering

Question & Answer Websites

Tagging

MicroBlogging

Community Portal

Videos, podcasts

Blogging

Wikis

Page 49: Crowdsourcing Documentation in Software Engineering

Community portals

Stores code and project resources
Provides version control
Hosts web pages
Connects people
Links to communication tools
Records interactions

Page 50: Crowdsourcing Documentation in Software Engineering

C. Treude and M.-A. Storey. Effective Communication of Software Development Knowledge Through Community Portals. ESEC/FSE ’11.

Page 52: Crowdsourcing Documentation in Software Engineering

Implications of different media

Content on wikis is often stale, but useful for posting information quickly

Blog posts create more buzz or fanfare

Official product documentation is trusted (review it carefully or rely on the crowd?)

Have an updating process (or crowdsource it?)

Have mechanisms to solicit feedback (e.g. commenting, blog posts, voting)

Page 53: Crowdsourcing Documentation in Software Engineering

Social Media Channels to support Software Documentation

Community Portal

Tagging

Microblogging

Question & Answer Websites

Videos, podcasts

Blogging

Wikis

Page 54: Crowdsourcing Documentation in Software Engineering

Discussion

Page 55: Crowdsourcing Documentation in Software Engineering

Documentation challenges revisited

Recommenders to aid in discoverability
Keeping up: leverage the crowd
Incentive: participatory culture
Video and podcasts for tacit knowledge
Mining of social media can point to code examples (implicit mechanism)

Page 56: Crowdsourcing Documentation in Software Engineering

Discussion points

When does a community become a crowd?
Gaps and nichification?
Incentives? Dynamics?
Study other portals, hubs?
Do these mechanisms translate to industry?

What do you see as challenges, opportunities for involving the crowd?

Page 57: Crowdsourcing Documentation in Software Engineering

http://www.thechiselgroup.org
http://margaretannestorey.wordpress.com/

@thechiselgroup, @margaretstorey

Funded by NSERC/DRDC/IBM