A Virtual Crowdsourcing Community for Open Collaboration in Science Processes
Felix Michel (1, 2), Yolanda Gil (2), Varun Ratnakar (2), Matheus Hauder (1)
(1) Software Engineering for Business Information Systems, Technical University of Munich
(2) Information Sciences Institute, University of Southern California
21st Americas Conference on Information Systems 2015
Organic Data Science Framework: http://www.organicdatascience.org/
TECHNICAL UNIVERSITY OF MUNICH | USC INFORMATION SCIENCES INSTITUTE Felix Michel
Motivation
Evolution of the scientific enterprise
Stages: single authorship, co-authorship, large number of co-authors, the community as author
Figure: Evolution of the scientific enterprise from [Barabasi, 2005], extended with the ATLAS Detector Project at the Large Hadron Collider [The ATLAS Collaboration, 2012].
Introduction
Requirements of Scientific Collaborations
R1: Significant organization and coordination
R2: Maintaining a community over the longer term
R3: Growing the community based on unanticipated needs
Approach
Organic Data Science Framework
Key objectives are:
I. Self-organization of the work
II. Sustainable on-line communities
III. Open science processes that expose all tasks and activities publicly
Reducing the coordination effort and lowering the barriers to growing the community
Approach
Social Design Principles
Selected social principles from [Kraut and Resnick 2012] for building successful online communities that can be applied to Organic Data Science.
A. Starting communities
A1: Carve a niche of interest, scoped in terms of topics, members, activities, and purpose
A2: Relate to competing sites, integrate content
A3: Organize content, people, and activities into subspaces once there is enough activity
A4: Highlight more active tasks
A5: Inactive tasks should have “expected active times”
A6: Create mechanisms to match people to activities
B. Encouraging contributions through motivation
B1: Make it easy to see and track needed contributions
B2: Ask specific people on tasks of interest to them
B3: Simple tasks with challenging goals are easier to comply with
B4: Specify deadlines for tasks, while leaving people in control
B5: Give frequent feedback specific to the goals
… B10 …
C. Encouraging commitment
C1: Cluster members to help them identify with the community
C2: Give subgroups a name and a tagline
C3: Put subgroups in the context of a larger group
C4: Make community goals and purpose explicit
C5: Interdependent tasks increase commitment and reduce conflict
D. Dealing with newcomers
D1: Members recruiting colleagues is most effective
D2: Appoint people responsible for immediate friendly interactions
D3: Introducing newcomers to members increases interactions
D4: Entry barriers for newcomers help screen for commitment
D5: When small, acknowledge each new member
… D12 …
Approach
Best Practices from Polymath and ENCODE
Selected best practices from the Polymath [Nielsen 2012] project and lessons learned from ENCODE [Encode 2004].
E. Best practices from Polymath
E1: Permanent URLs for posts and comments, so others can refer to them
E2: Appoint a volunteer to summarize periodically
E3: Appoint a volunteer to answer questions from newcomers
E4: Low barrier of entry: make it VERY easy to comment
E5: Advance notice of tasks that are anticipated
E6: Keep few tasks active at any given time, helps focus
F. Lessons learned from ENCODE
F1: Spine of leadership, including a few leading scientists and 1-2 operational project managers, that resolves complex scientific and social problems and has transparent decision making
F2: Written and publicly accessible rules to transfer work between groups, to assign credit when papers are published, to present the work
F3: Quality inspection with visibility into intermediate steps
F4: Export of data and results, integration with existing standards
Organic Data Science Wiki
The Organic Data Science Framework
Organic Data Science Wiki
Feature 0: Welcome Page
A1: Carve a niche of interest, scoped in terms of topics, members, activities, and purpose
D6: Advertise members, particularly community leaders; include pictures
D7: Provide concrete incentives to early members
E6: Keep few tasks active at any given time, helps focus
[A1, A2, A3, B7, D1, D5, D6, D7, E2, E6, F1, F2, F4]
II. Sustainable On-Line Communities
III. Opening Science Process
Organic Data Science Wiki
Feature 1: Task Representation
B1: Make it easy to see and track needed contributions
C3: Put subgroups in the context of a larger group
E1: Permanent URLs for posts and comments, so others can refer to them
[A3, A4, A6, B1, B3, B10, C2, C3, C4, C5, E1, F3]
I. Self-Organization
III. Opening Science Process
Organic Data Science Wiki
Feature 2: Task Metadata
A5: Inactive tasks should have “expected active times”
A6: Create mechanisms to match people to activities
B4: Specify deadlines for tasks, while leaving people in control
[A4, A5, A6, B1, B2, B4, B5, B6, C1, C2, C5, E5, F3]
I. Self-Organization
II. Sustainable On-Line Communities
III. Opening Science Process
Organic Data Science Wiki
Task Status (Progress Estimation)
Task metadata is used to estimate the progress and status of tasks.
Figure: example task hierarchy (labels include "PIHM model Documentation" and "Calibrating …") with a progress percentage between 70% and 95% on each task.
Legend:
Low-level task: low uncertainty in estimation; estimated by users
Medium-level task: medium uncertainty in estimation; average of its subtasks
High-level task: high uncertainty in estimation; linear progress based on start and target date
Rule: The abstraction level of a parent node/task can be equal or higher but not lower.
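The three estimation rules and the parent/child rule from this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the framework's actual implementation: the `Task` class, its field names, the level labels "low"/"medium"/"high", and the clamping of time-based progress to the range 0 to 1 are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Hypothetical ordering of abstraction levels, used for the parent/child rule.
LEVEL_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass
class Task:
    name: str
    level: str                              # "low", "medium", or "high" (assumed labels)
    user_estimate: Optional[float] = None   # low-level: progress entered by users, 0..1
    start: Optional[date] = None            # high-level: start date
    target: Optional[date] = None           # high-level: target date
    subtasks: List["Task"] = field(default_factory=list)


def progress(task: Task, today: date) -> float:
    """Estimate progress in [0, 1] using the rule for the task's level."""
    if task.level == "low":
        # Low-level task: estimated by users.
        return task.user_estimate or 0.0
    if task.level == "medium":
        # Medium-level task: average of its subtasks.
        if not task.subtasks:
            return 0.0
        return sum(progress(t, today) for t in task.subtasks) / len(task.subtasks)
    # High-level task: linear progress based on start and target date.
    if task.start is None or task.target is None:
        return 0.0
    total_days = (task.target - task.start).days
    if total_days <= 0:
        return 1.0
    elapsed = (today - task.start).days
    return min(1.0, max(0.0, elapsed / total_days))


def valid_hierarchy(task: Task) -> bool:
    """Rule: a parent's abstraction level is equal to or higher than each child's."""
    return all(
        LEVEL_RANK[task.level] >= LEVEL_RANK[child.level] and valid_hierarchy(child)
        for child in task.subtasks
    )
```

With this sketch, a medium-level task whose three low-level subtasks report 70%, 80%, and 90% averages to about 80%, while a high-level task halfway between its start and target dates reports 50% regardless of its subtasks.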
Organic Data Science Wiki
Feature 3: Task Navigation
B1: Make it easy to see and track needed contributions
C3: Put subgroups in the context of a larger group
F3: Quality inspection with visibility into intermediate steps
[B1, B4, B10, C1, C2, C3, C4, C5, F3]
I. Self-Organization
III. Opening Science Process
Organic Data Science Wiki
Feature 6: Timeline Navigation
A5: Inactive tasks should have “expected active times”
B5: Give frequent feedback specific to the goals
E5: Advance notice of tasks that are anticipated
[A5, B1, B5, E5, F3]
I. Self-Organization
III. Opening Science Process
Organic Data Science Wiki
Feature 7: Task Alert
B1: Make it easy to see and track needed contributions
B4: Specify deadlines for tasks, while leaving people in control
[B1, B4]
I. Self-Organization
II. Sustainable On-Line Communities
Organic Data Science Wiki
Feature 9: User Tasks and Expertise
B2: Ask specific people on tasks of interest to them
C1: Cluster members to help them identify with the community
C5: Interdependent tasks increase commitment and reduce conflict
[B1, B2, B5, B8, B10, C1, C5]
I. Self-Organization
II. Sustainable On-Line Communities
III. Opening Science Process
Organic Data Science Wiki
Feature 10: Task State
B1: Make it easy to see and track needed contributions
B5: Give frequent feedback specific to the goals
E5: Advance notice of tasks that are anticipated
[B1, B5, E5, F3]
I. Self-Organization
II. Sustainable On-Line Communities
III. Opening Science Process
Organic Data Science Wiki
Feature 11: Train New Members
D3: Introducing newcomers to members increases interactions
D8: Design common learning experiences for newcomers
D9: Design clear sequence of stages to newcomers
[D2, D3, D4, D8, D9, D10, D11, D12, E3, E4]
II. Sustainable On-Line Communities
III. Opening Science Process
Organic Data Science Wiki
Mapping Features, Objectives and Social Principles
Objectives: I. Self-Organization; II. Sustainable On-Line Communities; III. Opening Science Process
Organic Data Science Features | Social Principles and Best Practices
Welcome Page | A1, A2, A3, B7, D1, D5, D6, D7, E2, E6, F1, F2, E4
Task Representation | A3, A4, A6, B1, B3, B10, C2, C3, C4, C5, E1, E5, F3
Task Metadata | A4, A5, A6, B1, B2, B4, B5, B6, C1, C2, C5, F3
Task Navigation | B1, B4, B10, C1, C2, C3, C4, C5, F3
Personal Worklist | A4, B1, B4, C3
Subtask Navigation | B1, B5, B9, B10, C5, F3
Timeline Navigation | A4, A5, B1, B5, E5, F3
Task Alert | B1, B4
Task Management | A3, B3, B10, F3
User Tasks and Expertise | B1, B2, B5, B8, B10, C1, C5
Task State | B1, B5, E5
Training New Members | D2, D3, D4, D8, D9, D10, D11, D12, E3, E4
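The mapping on this slide reads from feature to principle; inverting it shows which features address a given principle. The following Python sketch is one possible way to do that. The dictionary is transcribed from the mapping slide as it appears here, while the names `FEATURE_PRINCIPLES` and `features_by_principle` are hypothetical and not part of the framework.

```python
from collections import defaultdict

# Feature -> principle codes, transcribed from the mapping slide.
# Codes are stored as space-separated strings purely for compactness.
FEATURE_PRINCIPLES = {
    "Welcome Page": "A1 A2 A3 B7 D1 D5 D6 D7 E2 E6 F1 F2 E4",
    "Task Representation": "A3 A4 A6 B1 B3 B10 C2 C3 C4 C5 E1 E5 F3",
    "Task Metadata": "A4 A5 A6 B1 B2 B4 B5 B6 C1 C2 C5 F3",
    "Task Navigation": "B1 B4 B10 C1 C2 C3 C4 C5 F3",
    "Personal Worklist": "A4 B1 B4 C3",
    "Subtask Navigation": "B1 B5 B9 B10 C5 F3",
    "Timeline Navigation": "A4 A5 B1 B5 E5 F3",
    "Task Alert": "B1 B4",
    "Task Management": "A3 B3 B10 F3",
    "User Tasks and Expertise": "B1 B2 B5 B8 B10 C1 C5",
    "Task State": "B1 B5 E5",
    "Training New Members": "D2 D3 D4 D8 D9 D10 D11 D12 E3 E4",
}


def features_by_principle(mapping):
    """Invert feature -> principles into principle -> list of features."""
    index = defaultdict(list)
    for feature, principles in mapping.items():
        for code in principles.split():
            index[code].append(feature)
    return dict(index)


index = features_by_principle(FEATURE_PRINCIPLES)
print(index["B4"])  # features that address "Specify deadlines for tasks"
```

An inverted index like this makes it easy to see, for example, that B1 ("Make it easy to see and track needed contributions") is addressed by most of the task-related features.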
Evaluation
Is the Framework Helping Users Organize their Work?
10 weeks:
122 tasks
Task pages accessed 2,900 times
Person pages accessed 328 times
One paper was written with the ODS framework
19,000 log entries used for evaluation
Figure: Subtask Hierarchies
(a) Number of tasks by number of ancestors: 0 ancestors: 13; 1: 33; 2: 27; 3: 32; 4: 11; 5: 6
(b) Number of tasks by number of children (0 to 9), split into tasks with incomplete metadata and tasks with completed metadata
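The subtask-hierarchy charts count, for each task, how many ancestors and how many children it has. A minimal Python sketch of that counting is shown below, assuming the hierarchy is available as a hypothetical `parent_of` dictionary that maps each task to its parent (or to `None` for top-level tasks) and contains no cycles. The task names are made up for illustration.

```python
from collections import Counter


def ancestor_count(task, parent_of):
    """Number of ancestors = number of steps from the task up to its root."""
    count = 0
    while parent_of.get(task) is not None:
        task = parent_of[task]
        count += 1
    return count


def hierarchy_histograms(parent_of):
    """Return (tasks per ancestor count, tasks per child count)."""
    ancestors = Counter(ancestor_count(t, parent_of) for t in parent_of)
    # How many direct children each parent has.
    children_per_task = Counter(p for p in parent_of.values() if p is not None)
    # Tasks that never appear as a parent have zero children.
    children = Counter(children_per_task.get(t, 0) for t in parent_of)
    return ancestors, children


# Hypothetical four-task hierarchy: root -> (a -> c), b
parent_of = {"root": None, "a": "root", "b": "root", "c": "a"}
ancestors, children = hierarchy_histograms(parent_of)
print(dict(ancestors))
print(dict(children))
```

Each histogram sums to the total number of tasks, which is a quick sanity check when reading charts of this kind.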
Evaluation
Is the Framework Helping to Create Communities?
Figure: four bar charts showing the percentage of tasks (0 to 100) by number of different persons; bars are labeled with percentages and task counts (T = number of tasks).
How many tasks are viewed by more than one person?
How many tasks have more than one person signed up?
How many tasks have more than one person editing task metadata?
How many tasks have more than one person editing their content?
Evaluation
Is the Framework Helping to Open the Science Processes?
Figure: Organic Data Science Collaboration Graph
Number of tasks in common = edge strength
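The edge-strength definition can be illustrated with a short Python sketch: two people are connected when they share at least one task, and the edge weight is the number of shared tasks. The slide does not say which kind of involvement (viewing, signing up, editing) counts as having a task in common, so the `tasks_by_person` input and the person and task names below are hypothetical.

```python
from itertools import combinations


def collaboration_edges(tasks_by_person):
    """Map each pair of persons to the number of tasks they have in common."""
    edges = {}
    for a, b in combinations(sorted(tasks_by_person), 2):
        shared = len(set(tasks_by_person[a]) & set(tasks_by_person[b]))
        if shared:  # no edge when the two persons share no task
            edges[(a, b)] = shared
    return edges


# Hypothetical participation data.
tasks_by_person = {
    "alice": {"t1", "t2", "t3"},
    "bob": {"t2", "t3"},
    "carol": {"t4"},
}
print(collaboration_edges(tasks_by_person))
```

In this example only the alice-bob pair gets an edge, with strength 2; carol shares no task with anyone and would appear as an isolated node.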
Conclusion
Conclusions
The Organic Data Science Framework provides a task-centered organization, incorporates social design principles, and openly exposes scientific processes.
Future work: Analyzing the evolution of the communities in quantitative terms.
Thank You
Organic Data Science Framework: http://www.organicdatascience.org/
Development: https://github.com/IKCAP/organicdatascience
Acknowledgments: We gratefully acknowledge funding from the US National Science Foundation under grant IIS-1344272.
Organic Data Science Wiki
Feature 8: Task Management
A3: Organize content, people, and activities into subspaces once there is enough activity
F3: Quality inspection with visibility into intermediate steps
[A3, B3, B10, F3]
I. Self-Organization