
Page 1: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes

A Virtual Crowdsourcing Community for Open Collaboration in Science Processes

1 Software Engineering for Business Information Systems, Technical University of Munich
2 Information Sciences Institute, University of Southern California

Felix Michel 1,2, Yolanda Gil 2, Varun Ratnakar 2, Matheus Hauder 1

21st Americas Conference on Information Systems 2015

Organic Data Science Framework
http://www.organicdatascience.org/

Page 2: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes

TECHNICAL UNIVERSITY OF MUNICH | USC INFORMATION SCIENCES INSTITUTE | Felix Michel

Outline: Motivation, Introduction, Approach, Organic Data Science Wiki, Evaluation, Conclusion

Evolution of the scientific enterprise

[Figure: evolution of the scientific enterprise, from single authorship through co-authorship and large numbers of co-authors to the community as author. From [Barabasi, 2005], extended with the ATLAS Detector Project at the Large Hadron Collider [The ATLAS Collaboration, 2012].]

Page 3: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Requirements of Scientific Collaborations

R1: Significant organization and coordination
R2: Maintaining a community over the longer term
R3: Growing the community based on unanticipated needs

Page 4: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Organic Data Science Framework

Key objectives are:
I. Self-organization of the work
II. Sustainable on-line communities
III. Open science processes that expose all tasks and activities publicly

Reducing the coordination effort and lowering the barriers to growing the community

Page 5: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Social Design Principles

Selected social principles from [Kraut and Resnick 2012] for building successful online communities that can be applied to Organic Data Science.

A. Starting communities
A1: Carve a niche of interest, scoped in terms of topics, members, activities, and purpose
A2: Relate to competing sites, integrate content
A3: Organize content, people, and activities into subspaces once there is enough activity
A4: Highlight more active tasks
A5: Inactive tasks should have “expected active times”
A6: Create mechanisms to match people to activities

B. Encouraging contributions through motivation
B1: Make it easy to see and track needed contributions
B2: Ask specific people on tasks of interest to them
B3: Simple tasks with challenging goals are easier to comply with
B4: Specify deadlines for tasks, while leaving people in control
B5: Give frequent feedback specific to the goals
…B10 …

C. Encouraging commitment
C1: Cluster members to help them identify with the community
C2: Give subgroups a name and a tagline
C3: Put subgroups in the context of a larger group
C4: Make community goals and purpose explicit
C5: Interdependent tasks increase commitment and reduce conflict

D. Dealing with newcomers
D1: Members recruiting colleagues is most effective
D2: Appoint people responsible for immediate friendly interactions
D3: Introducing newcomers to members increases interactions
D4: Entry barriers for newcomers help screen for commitment
D5: When small, acknowledge each new member
…D12 …

Page 6: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Best Practices from Polymath and ENCODE

Selected best practices from the Polymath [Nielsen 2012] project and lessons learned from ENCODE [Encode 2004].

E. Best practices from Polymath
E1: Permanent URLs for posts and comments, so others can refer to them
E2: Appoint a volunteer to summarize periodically
E3: Appoint a volunteer to answer questions from newcomers
E4: Low barrier of entry: make it VERY easy to comment
E5: Advance notice of tasks that are anticipated
E6: Keep few tasks active at any given time, helps focus

F. Lessons learned from ENCODE
F1: Spine of leadership, including a few leading scientists and 1-2 operational project managers, that resolves complex scientific and social problems and has transparent decision making
F2: Written and publicly accessible rules to transfer work between groups, to assign credit when papers are published, to present the work
F3: Quality inspection with visibility into intermediate steps
F4: Export of data and results, integration with existing standards

Page 7: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


The Organic Data Science Framework

Page 8: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Welcome Page (0)

A1: Carve a niche of interest, scoped in terms of topics, members, activities, and purpose
D6: Advertise members, particularly community leaders; include pictures
D7: Provide concrete incentives to early members
E6: Keep few tasks active at any given time, helps focus

[A1, A2, A3, B7, D1, D5, D6, D7, E2, E6, F1, F2, F4]

II. Sustainable On-Line Communities
III. Opening Science Process

Page 9: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Task Representation (1)

B1: Make it easy to see and track needed contributions
C3: Put subgroups in the context of a larger group
E1: Permanent URLs for posts and comments, so others can refer to them

[A3, A4, A6, B1, B3, B10, C2, C3, C4, C5, E1, F3]

I. Self-Organization
III. Opening Science Process

Page 10: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Task Metadata (2)

A5: Inactive tasks should have “expected active times”
A6: Create mechanisms to match people to activities
B4: Specify deadlines for tasks, while leaving people in control

[A4, A5, A6, B1, B2, B4, B5, B6, C1, C2, C5, E5, F3]

I. Self-Organization
II. Sustainable On-Line Communities
III. Opening Science Process

Page 11: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Task Status (Progress Estimation)

Task metadata is used to estimate the progress and status of tasks.

Legend:
Low-level task: low uncertainty in estimation; estimated by users
Medium-level task: medium uncertainty in estimation; average of its subtasks
High-level task: high uncertainty in the estimation; linear progress based on start and target date

Rule: The abstraction level of a parent node/task can be equal or higher but not lower.

[Figure: example task tree (nodes include "PIHM model", "Documentation", "Calibrating …") annotated with progress estimates from 70% to 95%.]
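The three estimation rules from the legend, together with the parent/child abstraction rule, could be sketched roughly as follows. This is only an illustration in Python, not the framework's actual code: the `Level`, `Task`, and `progress` names, the field names, and the clamping of linear progress to the range 0 to 1 are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import IntEnum
from typing import List, Optional


class Level(IntEnum):
    """Hypothetical abstraction levels; a larger value is more abstract."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Task:
    name: str
    level: Level
    user_estimate: Optional[float] = None  # LOW: progress in [0, 1] entered by a user
    start: Optional[date] = None           # HIGH: start date
    target: Optional[date] = None          # HIGH: target date
    subtasks: List["Task"] = field(default_factory=list)

    def add(self, child: "Task") -> "Task":
        # Rule from the slide: a parent's level may be equal or higher, never lower.
        if self.level < child.level:
            raise ValueError(f"{self.name!r} is less abstract than {child.name!r}")
        self.subtasks.append(child)
        return child


def progress(task: Task, today: date) -> float:
    """Estimate progress in [0, 1] according to the task's abstraction level."""
    if task.level == Level.LOW:
        # Estimated by users; treated as 0 when no estimate was entered (assumption).
        return task.user_estimate or 0.0
    if task.level == Level.MEDIUM:
        # Average of the subtasks' progress.
        if not task.subtasks:
            return 0.0
        return sum(progress(t, today) for t in task.subtasks) / len(task.subtasks)
    # HIGH: linear progress between start and target date, clamped to [0, 1].
    if task.start is None or task.target is None:
        raise ValueError(f"{task.name!r} needs a start and a target date")
    total_days = (task.target - task.start).days
    if total_days <= 0:
        return 1.0
    elapsed_days = (today - task.start).days
    return min(1.0, max(0.0, elapsed_days / total_days))


# Hypothetical example tree.
study = Task("study", Level.HIGH, start=date(2015, 1, 1), target=date(2015, 1, 11))
calibration = study.add(Task("calibration", Level.MEDIUM))
calibration.add(Task("collect data", Level.LOW, user_estimate=1.0))
calibration.add(Task("fit parameters", Level.LOW, user_estimate=0.5))
```

In this sketch the high-level task ignores its subtasks entirely and depends only on the dates, which mirrors the "high uncertainty" reading of the legend; whether the real system mixes both sources is not stated on the slide.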

Page 12: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Task Navigation (3)

B1: Make it easy to see and track needed contributions
C3: Put subgroups in the context of a larger group
F3: Quality inspection with visibility into intermediate steps

[B1, B4, B10, C1, C2, C3, C4, C5, F3]

I. Self-Organization
III. Opening Science Process

Page 13: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Timeline Navigation (6)

A5: Inactive tasks should have “expected active times”
B5: Give frequent feedback specific to the goals
E5: Advance notice of tasks that are anticipated

[A5, B1, B5, E5, F3]

I. Self-Organization
III. Opening Science Process

Page 14: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Task Alert (7)

B1: Make it easy to see and track needed contributions
B4: Specify deadlines for tasks, while leaving people in control

[B1, B4]

I. Self-Organization
II. Sustainable On-Line Communities

Page 15: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


User Tasks and Expertise (9)

B2: Ask specific people on tasks of interest to them
C1: Cluster members to help them identify with the community
C5: Interdependent tasks increase commitment and reduce conflict

[B1, B2, B5, B8, B10, C1, C5]

I. Self-Organization
II. Sustainable On-Line Communities
III. Opening Science Process

Page 16: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Task State (10)

B1: Make it easy to see and track needed contributions
B5: Give frequent feedback specific to the goals
E5: Advance notice of tasks that are anticipated

[B1, B5, E5, F3]

I. Self-Organization
II. Sustainable On-Line Communities
III. Opening Science Process

Page 17: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Train New Members (11)

D3: Introducing newcomers to members increases interactions
D8: Design common learning experiences for newcomers
D9: Design a clear sequence of stages for newcomers

[D2, D3, D4, D8, D9, D10, D11, D12, E3, E4]

II. Sustainable On-Line Communities
III. Opening Science Process

Page 18: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Mapping Features, Objectives and Social Principles

Objectives (columns): I. Self-Organization; II. Sustainable On-Line Communities; III. Opening Science Process

Organic Data Science Feature | Social Principles and Best Practices
Welcome Page | A1, A2, A3, B7, D1, D5, D6, D7, E2, E6, F1, F2, E4
Task Representation | A3, A4, A6, B1, B3, B10, C2, C3, C4, C5, E1, E5, F3
Task Metadata | A4, A5, A6, B1, B2, B4, B5, B6, C1, C2, C5, F3
Task Navigation | B1, B4, B10, C1, C2, C3, C4, C5, F3
Personal Worklist | A4, B1, B4, C3
Subtask Navigation | B1, B5, B9, B10, C5, F3
Timeline Navigation | A4, A5, B1, B5, E5, F3
Task Alert | B1, B4
Task Management | A3, B3, B10, F3
User Tasks and Expertise | B1, B2, B5, B8, B10, C1, C5
Task State | B1, B5, E5
Training New Members | D2, D3, D4, D8, D9, D10, D11, D12, E3, E4
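One way to work with the feature-to-principle mapping is to hold it as a small lookup table and invert it. The sketch below, in Python, copies the principle codes from the mapping on this slide; the names `FEATURE_PRINCIPLES`, `features_for`, and `principle_coverage` are hypothetical helpers introduced only for illustration.

```python
from collections import Counter
from typing import Dict, List

# Principle codes per feature, transcribed from the mapping slide.
FEATURE_PRINCIPLES: Dict[str, List[str]] = {
    "Welcome Page": "A1 A2 A3 B7 D1 D5 D6 D7 E2 E6 F1 F2 E4".split(),
    "Task Representation": "A3 A4 A6 B1 B3 B10 C2 C3 C4 C5 E1 E5 F3".split(),
    "Task Metadata": "A4 A5 A6 B1 B2 B4 B5 B6 C1 C2 C5 F3".split(),
    "Task Navigation": "B1 B4 B10 C1 C2 C3 C4 C5 F3".split(),
    "Personal Worklist": "A4 B1 B4 C3".split(),
    "Subtask Navigation": "B1 B5 B9 B10 C5 F3".split(),
    "Timeline Navigation": "A4 A5 B1 B5 E5 F3".split(),
    "Task Alert": "B1 B4".split(),
    "Task Management": "A3 B3 B10 F3".split(),
    "User Tasks and Expertise": "B1 B2 B5 B8 B10 C1 C5".split(),
    "Task State": "B1 B5 E5".split(),
    "Training New Members": "D2 D3 D4 D8 D9 D10 D11 D12 E3 E4".split(),
}


def features_for(principle: str) -> List[str]:
    """Return the features that list the given principle, in table order."""
    return [f for f, codes in FEATURE_PRINCIPLES.items() if principle in codes]


def principle_coverage() -> Counter:
    """Count how many features list each principle."""
    return Counter(code for codes in FEATURE_PRINCIPLES.values() for code in codes)
```

For example, `features_for("B4")` lists the features tied to task deadlines, and `principle_coverage().most_common(3)` shows which principles recur across the most features.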

Page 19: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Is the Framework Helping Users Organize their Work?

10 weeks:
122 tasks
Task pages accessed 2,900 times
Person pages accessed 328 times
One paper was written with the ODS framework
19,000 log entries used for evaluation

[Figure: Subtask hierarchies. (a) Number of tasks by number of ancestors. (b) Number of tasks by number of children (0 to 9), split into tasks with incomplete metadata and tasks with completed metadata.]

(a) Number of ancestors | Number of tasks
0 | 13
1 | 33
2 | 27
3 | 32
4 | 11
5 | 6
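The two subtask-hierarchy distributions (tasks by number of ancestors, tasks by number of children) can be derived from a parent relation. The following Python sketch assumes the hierarchy is a forest stored as a hypothetical `parent` dictionary mapping each task to its parent, or to `None` for top-level tasks; it is not the evaluation script used for the slides.

```python
from collections import Counter
from typing import Dict, Optional


def ancestor_counts(parent: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Number of ancestors of each task (0 for top-level tasks).

    Assumes the parent relation has no cycles.
    """
    depth: Dict[str, int] = {}

    def depth_of(task: str) -> int:
        if task not in depth:
            p = parent[task]
            depth[task] = 0 if p is None else depth_of(p) + 1
        return depth[task]

    for task in parent:
        depth_of(task)
    return depth


def child_counts(parent: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Number of direct children of each task (0 for leaf tasks)."""
    counts = {task: 0 for task in parent}
    for p in parent.values():
        if p is not None:
            counts[p] += 1
    return counts


def histogram(values: Dict[str, int]) -> Counter:
    """Map each count to the number of tasks having that count."""
    return Counter(values.values())


# Hypothetical four-task hierarchy.
parent = {"root": None, "a": "root", "b": "root", "a1": "a"}
ancestors_hist = histogram(ancestor_counts(parent))
children_hist = histogram(child_counts(parent))
```

Splitting the children histogram by metadata completeness, as panel (b) does, would only need a second pass that filters tasks on a completeness flag before counting.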

Page 20: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Is the Framework Helping to Create Communities?

How many tasks are viewed by more than one person?
How many tasks have more than one person signed up?
How many tasks have more than one person editing task metadata?
How many tasks have more than one person editing their content?

[Figure: one bar per question showing the percentage of tasks (0 to 100) by number of different persons, labeled with percentages and task counts; T = number of tasks.]
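Questions of the form "how many tasks have more than one person doing X" reduce to counting distinct persons per task for one kind of action. The Python sketch below assumes a simplified log of `(task, person, action)` tuples; the tuple layout, the action names, and both function names are hypothetical and do not describe the real log format.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

LogEntry = Tuple[str, str, str]  # (task, person, action), a hypothetical layout


def persons_per_task(log: Iterable[LogEntry], action: str) -> Dict[str, int]:
    """Number of different persons who performed `action` on each task.

    Tasks on which nobody performed the action do not appear in the result.
    """
    persons: Dict[str, Set[str]] = defaultdict(set)
    for task, person, act in log:
        if act == action:
            persons[task].add(person)
    return {task: len(people) for task, people in persons.items()}


def share_with_multiple_persons(log: Iterable[LogEntry], action: str) -> float:
    """Fraction of the tasks touched by `action` that involve more than one person."""
    counts = persons_per_task(list(log), action)
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n > 1) / len(counts)


# Hypothetical log excerpt.
log = [
    ("t1", "ann", "view"),
    ("t1", "bob", "view"),
    ("t1", "ann", "view"),          # repeated views by one person count once
    ("t2", "ann", "view"),
    ("t1", "ann", "edit_content"),
]
view_share = share_with_multiple_persons(log, "view")
edit_share = share_with_multiple_persons(log, "edit_content")
```

Note that the denominator here is the set of tasks that saw the action at all, which is one possible reading of the chart; using all tasks as the denominator would be an equally simple variation.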

Page 21: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Is the Framework Helping to Open the Science Processes?

[Figure: Organic Data Science collaboration graph; edge strength = number of tasks in common.]
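The edge-strength definition (number of tasks two people have in common) can be computed directly from task membership. This Python sketch assumes a hypothetical `task_members` mapping from task to the set of persons involved; how "involved" is determined in the actual graph is not stated on the slide.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set


def collaboration_edges(task_members: Dict[str, Set[str]]) -> Counter:
    """Weight each pair of persons by the number of tasks they share.

    Keys are (person_a, person_b) tuples with person_a < person_b,
    so every undirected edge appears exactly once.
    """
    edges: Counter = Counter()
    for members in task_members.values():
        for a, b in combinations(sorted(members), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical membership data.
task_members = {
    "t1": {"ann", "bob", "cy"},
    "t2": {"ann", "bob"},
    "t3": {"cy"},
}
edges = collaboration_edges(task_members)
```

Sorting the members before pairing keeps each undirected edge under a single key; persons who share no task with anyone simply have no edges.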

Page 22: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Conclusions

The Organic Data Science Framework provides a task-centered organization, incorporates social design principles, and exposes scientific processes openly.

Future work: Analyzing the evolution of the communities in quantitative terms.

Page 23: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Thank You

Development
https://github.com/IKCAP/organicdatascience

Organic Data Science Framework
http://www.organicdatascience.org/

Acknowledgments
We gratefully acknowledge funding from the US National Science Foundation under grant IIS-1344272.

Page 24: A Virtual Crowdsourcing Community for Open Collaboration in Science Processes


Task Management (8)

A3: Organize content, people, and activities into subspaces once there is enough activity
F3: Quality inspection with visibility into intermediate steps

[A3, B3, B10, F3]

I. Self-Organization