Virtual Research Environments for implementing Long-tail Open Science

BlueBRIDGE receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 675680 (www.bluebridge-vres.eu). Virtual Research Environments for implementing Long-tail Open Science, 30 September 2016, Krakow. Pasquale Pagano, CNR, Italy.

Page 1: Virtual research environments for implementing long tail open science

BlueBRIDGE receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 675680 www.bluebridge-vres.eu

Virtual Research Environments for implementing Long-tail Open Science

30 September 2016 - Krakow

Pasquale Pagano, CNR, Italy

Page 2: Virtual research environments for implementing long tail open science

Context

New approaches to collect, analyse and curate data

New tools to exchange and guarantee the longevity of the data and the reusability of the experiments

Open Science: make scientific research, data and dissemination accessible to all levels of an inquiring society, amateur or professional

Keywords: Open Access, Open Research, Open Notebook Science

Science 2.0: use network technologies to process large data sets and share experimental results and processes using a collaborative approach. Support Reproducibility-Repeatability-Reusability (R-R-R) of Science

Keywords: Provenance of the scientific process and workflows, Collaborative and Repeatable Science

Page 3: Virtual research environments for implementing long tail open science

Long-tail Open Science

Not individual researchers but groups of researchers, dynamically aggregated to address research questions/problems, that:

• are multidisciplinary and involve members belonging to diverse organisations

• need access to data and services that are spread among many providers

• cannot rely on pre-organised and costly supporting environments managed by dedicated organisations

• build and operate their own supporting environments

• wish to effectively inject open science into daily tasks

The cost and time required to implement this approach largely exceed the available capacities.

Page 4: Virtual research environments for implementing long tail open science

Requirements for IT systems

• Support collaborative research and experimentation

• Implement Reproducibility-Repeatability-Reusability

• Allow sharing data and findings

• Grant open access to produced scientific knowledge and data

• Simplify access to existing computing and storage resources

• Ensure low operational and maintenance costs

• Manage heterogeneous data access policies

Page 5: Virtual research environments for implementing long tail open science

Virtual Research Environment

An operational environment

• where a set of resources (data, services, computational and storage resources)

• is assigned to a group of users via interfaces

• for a limited timeframe

L. Candela, D. Castelli, P. Pagano (2013) Virtual Research Environments: An Overview and a Research Agenda. Data Science Journal, Vol. 12

Created on demand

Regulated by tailored policies

No cost for the resource providers

Open to host and operate custom software
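
The definition above can be read as a small data model. The following Python sketch is illustrative only: the class `VirtualResearchEnvironment`, its fields, and the helpers `is_active` and `can_use` are hypothetical names chosen for this example and are not part of any D4Science API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class VirtualResearchEnvironment:
    """Hypothetical model: resources assigned to a group of users for a limited time."""
    name: str
    resources: set      # data, services, computational and storage resources
    members: set        # the group of users
    created_at: datetime
    lifetime: timedelta  # the limited timeframe
    policies: dict = field(default_factory=dict)  # tailored policies (free-form here)

    def is_active(self, now: datetime) -> bool:
        # The environment exists only within its timeframe.
        return self.created_at <= now < self.created_at + self.lifetime

    def can_use(self, user: str, resource: str, now: datetime) -> bool:
        # A user may use a resource only if both belong to an active VRE.
        return (
            self.is_active(now)
            and user in self.members
            and resource in self.resources
        )


# Example values are invented for illustration.
vre = VirtualResearchEnvironment(
    name="demo-vre",
    resources={"species-dataset", "mapping-service"},
    members={"alice", "bob"},
    created_at=datetime(2016, 9, 30),
    lifetime=timedelta(days=90),
)
```

In this reading, "created on demand" corresponds to constructing a new instance, and the limited timeframe is enforced by checking `is_active` before any access.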

Page 6: Virtual research environments for implementing long tail open science

VRE Creation

[Figure: a VRE is created by selecting Configuration, Applications, Metadata and Data]

Simple and effective process to define a new environment

Page 7: Virtual research environments for implementing long tail open science

Applications vs Services

[Figure: a Registry relates two views. Logical View: Applications, Data. Physical View: Hardware; Software, Tools, Services; Configuration; Data.]

Page 8: Virtual research environments for implementing long tail open science

VREs in operation: The iMarine Use Case

[Figure: Data Infrastructures and Computing Infrastructures, each accessed through a Mediator and a Connector. Functions shown: Data Curation, Data Preparation, Data Analysis, Data Sharing, Data Publication, Data Provenance. Also shown: VRE Builder, Security, Monitoring.]

Page 9: Virtual research environments for implementing long tail open science

VREs in operation: The D4Science e-Infrastructure

D4Science supports scientists in several domains

1. More than 25 000 taxonomic studies per month (i-marine.d4science.org)

2. More than 60 000 species distribution maps produced and hosted (i-marine.d4science.org)

3. Used to build a pan-European geothermal energy map (egip.d4science.org)

4. Performs cross-disciplinary social mining research (sobigdata.d4science.org)

5. Enhances communication and exchange in Linguistic Studies, Humanities, Cultural Heritage, History and Archaeology (services.d4science.org)

PARTHENOS

Page 10: Virtual research environments for implementing long tail open science

VREs: Social Networking

Social networking is key to sharing information in the VRE

It offers a continuously updated list of events / news produced by users and applications

User-shared News

Application-shared News

Share News

Page 11: Virtual research environments for implementing long tail open science

VREs: The Workspace

A folder-based file system for managing and sharing information objects

Information objects can be

• files, datasets, workflows, experiments, etc.

• organized into folders

Users can

• share with selected users

• disseminate via persistent public URLs
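
The Workspace behaviour listed above can be sketched as a tiny in-memory store. Everything here is an assumption made for illustration: the `Item` and `Workspace` classes, their method names, and the `https://example.org/public/` base URL are hypothetical and do not describe the actual D4Science Workspace interface.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    """An information object: a file, dataset, workflow, experiment, etc."""
    name: str
    owner: str
    shared_with: set = field(default_factory=set)
    public_id: Optional[str] = None  # assigned once the item is disseminated


class Workspace:
    """Hypothetical folder-based store keyed by paths like 'experiments/run1.csv'."""

    BASE_URL = "https://example.org/public/"  # placeholder, not a real endpoint

    def __init__(self):
        self.items = {}

    def put(self, path: str, owner: str) -> Item:
        self.items[path] = Item(name=path.rsplit("/", 1)[-1], owner=owner)
        return self.items[path]

    def list_folder(self, folder: str) -> list:
        # Folders are modelled as path prefixes.
        prefix = folder.rstrip("/") + "/"
        return sorted(p for p in self.items if p.startswith(prefix))

    def share(self, path: str, user: str) -> None:
        # Share with a selected user.
        self.items[path].shared_with.add(user)

    def can_read(self, path: str, user: str) -> bool:
        item = self.items[path]
        return user == item.owner or user in item.shared_with

    def publish(self, path: str) -> str:
        # The identifier is generated once, so the public URL stays persistent.
        item = self.items[path]
        if item.public_id is None:
            item.public_id = uuid.uuid4().hex
        return self.BASE_URL + item.public_id
```

Calling `publish` twice on the same path returns the same URL, which is the property the slide refers to as a persistent public URL.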

Page 12: Virtual research environments for implementing long tail open science

Software Integration

Download the script (Python, R, Java, …) and the user’s data

Execute the script

Collect the output

Destroy local copies of I/O and script

Save the output to the user’s Workspace, with provenance info

Scientist-provided script

User’s data

Infrastructure
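
The five integration steps on this slide can be sketched as a single function. This is a minimal illustration under stated assumptions, not the infrastructure's implementation: the function name `run_in_sandbox`, the file names, and the provenance fields are hypothetical, "download" is simulated by writing strings to a scratch directory, and only Python scripts are handled.

```python
import hashlib
import json
import shutil
import subprocess
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def run_in_sandbox(script_text: str, input_text: str, workspace: Path) -> Path:
    """Run a user script on user data, keeping only the output plus provenance."""
    workdir = Path(tempfile.mkdtemp(prefix="vre-run-"))
    try:
        # 1. "Download" the script and the user's data into a scratch directory.
        script = workdir / "script.py"
        data = workdir / "input.txt"
        script.write_text(script_text)
        data.write_text(input_text)

        # 2. Execute the script, and 3. collect its output from stdout.
        result = subprocess.run(
            [sys.executable, str(script), str(data)],
            capture_output=True, text=True, check=True,
        )
        output = result.stdout
    finally:
        # 4. Destroy the local copies of I/O and script.
        shutil.rmtree(workdir, ignore_errors=True)

    # 5. Save the output to the user's workspace, with provenance info.
    workspace.mkdir(parents=True, exist_ok=True)
    out_file = workspace / "output.txt"
    out_file.write_text(output)
    provenance = {
        "script_sha256": hashlib.sha256(script_text.encode()).hexdigest(),
        "input_sha256": hashlib.sha256(input_text.encode()).hexdigest(),
        "executed_at": datetime.now(timezone.utc).isoformat(),
    }
    (workspace / "output.prov.json").write_text(json.dumps(provenance, indent=2))
    return out_file


# Invented demo script and data: upper-case the contents of the input file.
demo_script = "import sys\nprint(open(sys.argv[1]).read().upper(), end='')\n"
demo_workspace = Path(tempfile.mkdtemp(prefix="vre-ws-"))
out = run_in_sandbox(demo_script, "tuna,cod", demo_workspace)
```

Hashing the script and the input is one simple way to record which exact versions produced a result; a real provenance record would likely carry more detail.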

Page 13: Virtual research environments for implementing long tail open science

Collaborative experiments

[Figure: the Workspace (WS) provides shared online folders for Inputs, Outputs and Results, connected to a computational system that runs in the e-Infrastructure or through third-party software]

Page 14: Virtual research environments for implementing long tail open science

Scientific Workflow

The script provider updates the script in their private Workspace

The service downloads the script on-the-fly

A user executes an experiment on their own data

The output, the input and the parameters can be shared with another user

That user can execute the experiment again and share the computation with other users
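
The workflow above can be sketched in a few lines to show why sharing input, parameters and output makes an experiment repeatable. All names here (`Computation`, `provider_workspace`, `publish_script`, `execute`, `rerun`, and the `scale` demo script) are hypothetical, and a dictionary stands in for the provider's private Workspace.

```python
from dataclasses import dataclass


@dataclass
class Computation:
    """What gets shared: the input, the parameters and the output of one run."""
    script_name: str
    input_data: tuple
    parameters: dict
    output: object
    executed_by: str


# Hypothetical stand-in for the provider's private Workspace: name -> current script.
provider_workspace = {}


def publish_script(name, func):
    # The script provider updates the script in their Workspace.
    provider_workspace[name] = func


def execute(script_name, input_data, parameters, user):
    # The service fetches the current script at execution time ("on-the-fly").
    script = provider_workspace[script_name]
    output = script(list(input_data), **parameters)
    return Computation(script_name, tuple(input_data), dict(parameters), output, user)


def rerun(shared, user):
    # Another user repeats the experiment from the shared record alone.
    return execute(shared.script_name, shared.input_data, shared.parameters, user)


def scale(values, factor=1):
    # Invented demo script: multiply every value by a factor.
    return [v * factor for v in values]


publish_script("scale", scale)
first = execute("scale", [1, 2, 3], {"factor": 2}, user="alice")
second = rerun(first, user="bob")
```

Because the script is looked up at execution time, a rerun picks up whatever version the provider has published most recently; pinning a script version would be needed for strict reproducibility and is left out of this sketch.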

Page 15: Virtual research environments for implementing long tail open science

Conclusions

VREs are defined by users and created on demand

• New software can be integrated and used as-a-Service

• Invoked via standard interfaces

VREs ensure

• Provenance management

• Results on an easy-to-use storage system

• Collaboration and sharing

VREs enable

• Repeatability, Reproducibility and Reusability

Page 16: Virtual research environments for implementing long tail open science

Visit us at www.bluebridge-vres.eu

Try it at i-marine.d4science.org