2013 07-18 myexperiment research objects poster (pdf)


Upload: stian-soiland-reyes

Post on 01-Jun-2015


DESCRIPTION

Poster to be presented at BOSC 2013, ISMB. We have evolved Research Objects as a mechanism to preserve digital resources related to research, by providing mechanisms, formats and architecture for describing aggregated resources (hypothesis, workflow, datasets, scripts, services), their relations (is input for, explains, used by), provenance (graph was derived from dataset A, B and C) and attribution (who contributed what, and when?). The website myExperiment is already popular for collaborating on, publishing and sharing scientific workflows. However, we have found that a workflow’s definition alone is not enough for understanding and preserving it over time, especially in the face of workflow decay, as services and tools change. We have therefore adapted the Research Object model as a foundation for the myExperiment packs, allowing users to upload workflow runs, inputs, outputs and other files relevant to the workflow and to relate them with annotations. We have also integrated the Wf4Ever architecture for performing decay analysis and tracking a research object’s evolution as it and its constituent resources change over time. Submitted abstract: https://docs.google.com/document/d/1jaAuPV-EnbsyI14L56HKHBQP7eDVfeXGLlK-LwohnWw/edit?usp=sharing See also PPTX version at http://www.slideshare.net/soilandreyes/2013-0718-myexperiment-research-objects-poster

TRANSCRIPT

Page 1: 2013 07-18 myExperiment research objects poster (PDF)


myExperiment Research Objects: Beyond Workflows and Packs

Stian Soiland-Reyes (1), Don Cruickshank (2), Finn Bacall (1), Jun Zhao (2), Khalid Belhajjame (1), David De Roure (3), Carole A. Goble (1)

(1) School of Computer Science, University of Manchester, UK
(2) Department of Zoology, University of Oxford, UK
(3) Oxford e-Research Centre, University of Oxford, UK

ABSTRACT

We have evolved Research Objects as a mechanism to preserve digital resources related to research, by providing mechanisms, formats and architecture for describing aggregated resources (hypothesis, workflow, datasets, scripts, services), their relations (is input for, explains, used by), provenance (graph was derived from dataset A, B and C) and attribution (who contributed what, and when?).

The website myExperiment is already popular for collaborating on, publishing and sharing scientific workflows. However, we have found that a workflow’s definition alone is not enough for understanding and preserving it over time, especially in the face of workflow decay, as services and tools change. We have therefore adapted the Research Object model as a foundation for the myExperiment packs, allowing users to upload workflow runs, inputs, outputs and other files relevant to the workflow and to relate them with annotations. We have also integrated the Wf4Ever architecture for performing decay analysis and tracking a research object’s evolution as it and its constituent resources change over time.

MAKING RESEARCH OBJECTS

myExperiment is a website for collaboration and sharing of experiments, in particular scientific workflows. We are enhancing myExperiment’s packs to be based on the Research Object model, allowing users to form collections of workflows, example input data, results, presentation slides, hypotheses, workflow runs and documentation, effectively building a Research Object (RO). These uploaded resources can then be further related, typed, described and given their own attribution and provenance records. ROs are versioned and shareable.

WF4EVER ARCHITECTURE

The architecture for Research Objects is realized as a Linked Data platform of RESTful web services that support preservation aspects such as decay monitoring and evolution tracking, presented to the user through a regular web interface on myExperiment.

RESEARCH OBJECT MODEL

A research object (RO) is described by an RDF manifest, which lists the aggregated resources and their annotations. The annotations are held in separate RDF graphs containing user annotations (title, description, example value), typing information (hypothesis, workflow, input data, etc.) and automatically extracted metadata (provenance, workflow structure).

The ontologies of the RO model are based on standards for aggregations (OAI-ORE) and annotations (Annotation Ontology; W3C Open Annotation Core, OAC).
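The manifest structure described above can be sketched with plain Python tuples standing in for RDF triples. This is only an illustration under stated assumptions: the property names ore:aggregates, oa:hasTarget and oa:hasBody are the ones shown on the poster's diagram, while the function names, the relative URIs such as "ro/workflow" and the choice of tuples instead of an RDF library are hypothetical.

```python
# Prefixed property names as they appear on the poster's diagram.
ORE_AGGREGATES = "ore:aggregates"
OA_HAS_TARGET = "oa:hasTarget"
OA_HAS_BODY = "oa:hasBody"


def build_manifest(ro, resources, annotations):
    """Return (subject, predicate, object) triples describing one RO.

    `annotations` maps an annotation URI to a (target, body) pair, where
    the body names a separate annotation graph. All URIs are hypothetical.
    """
    triples = [(ro, ORE_AGGREGATES, res) for res in resources]
    for ann, (target, body) in annotations.items():
        triples.append((ann, OA_HAS_TARGET, target))
        triples.append((ann, OA_HAS_BODY, body))
    return triples


def aggregated(triples, ro):
    """List the resources that the RO aggregates."""
    return [o for s, p, o in triples if s == ro and p == ORE_AGGREGATES]


def bodies_for(triples, resource):
    """List the annotation graphs whose annotations target `resource`."""
    anns = {s for s, p, o in triples if p == OA_HAS_TARGET and o == resource}
    return [o for s, p, o in triples if p == OA_HAS_BODY and s in anns]


manifest = build_manifest(
    "ro/",
    ["ro/workflow", "ro/inputA.txt"],
    {"ro/ann1": ("ro/workflow", "ro/annotations/title.ttl")},
)
print(aggregated(manifest, "ro/"))
print(bodies_for(manifest, "ro/workflow"))
```

Keeping each annotation body as a reference to a separate graph mirrors the text: the manifest only lists resources and annotations, while titles, types and provenance live in the graphs it points to.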

RESEARCH OBJECTS AS FILES

An RO Bundle is a JSON-LD-based serialization of a research object as a ZIP file (Adobe UCF, ePub), allowing a hybrid of embedded resources and external references (URIs). A self-contained RO can thus be downloaded, transferred, modified and inspected without requiring a dedicated web server, which makes the format well suited to desktop environments such as scientific workflow systems.

Taverna uses RO Bundle to make a workflow run bundle: a single file that contains the input and output values and the workflow definition, complete with provenance of the run and intermediate values. This allows a workflow run to be shared, e.g. by uploading it to myExperiment, and later reloaded in a different Taverna installation.
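As a rough sketch of what such a bundle looks like on disk, the snippet below writes a ZIP file with the mimetype entry and the .ro/manifest.json path named on the poster. The function name make_bundle, the example file names and URI, and in particular the manifest keys ("aggregates", "uri") are illustrative assumptions, not taken from the RO Bundle specification. Storing the mimetype entry first and uncompressed follows the general convention of ePub-style containers.

```python
import io
import json
import zipfile

# Media type shown on the poster for RO Bundles.
MIMETYPE = "application/vnd.wf4ever.robundle+zip"


def make_bundle(files, external_uris):
    """Build an RO-Bundle-like ZIP in memory and return its bytes.

    `files` maps a path inside the ZIP to its bytes (embedded resources);
    `external_uris` lists resources that are only referenced by URI.
    The manifest layout here is a hypothetical placeholder.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # First entry, stored without compression so the type is easy to sniff.
        zf.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            zf.writestr(name, data)
        manifest = {
            "aggregates": [{"uri": "/" + name} for name in files]
            + [{"uri": uri} for uri in external_uris]
        }
        zf.writestr(".ro/manifest.json", json.dumps(manifest, indent=2))
    return buf.getvalue()


bundle = make_bundle(
    {"inputA.txt": b"example input", "outputA.txt": b"example output"},
    ["http://example.org/workflow"],  # hypothetical external reference
)
with zipfile.ZipFile(io.BytesIO(bundle)) as zf:
    print(zf.namelist())
```

Because the result is an ordinary ZIP file, it can be inspected with any archive tool, which is the property the text highlights for desktop use.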

This work was enabled by the Wf4Ever project funded by the European Commission’s 7th FWP (FP7-ICT-2007-6 270192), and the myGrid platform grant by the EPSRC (EP/G026238/1)

Project sites http://www.myexperiment.org/ http://www.wf4ever-project.org/

Source code http://myexperiment.rubyforge.org/svn/ https://github.com/wf4ever/

License BSD 3-Clause License MIT license

http://www.researchobject.org/

[Figure: Research Object model. Labels: Research Object, Manifest, Resource, Annotation, Annotation graph. Relations: ore:aggregates, oa:hasTarget, oa:hasBody. Model: http://purl.org/wf4ever/model]

[Figure: ZIP folder structure (RO Bundle), aggregating in a Research Object. Entries shown: mimetype (application/vnd.wf4ever.robundle+zip), .ro/manifest.json, workflowrun.prov.ttl (provenance), inputA.txt, workflow, outputA.txt, outputB/, outputC.jpg, intermediates/, 1.txt, 2.txt, 3.txt, de/def2e58b-50e2-4949-9980-fd310166621a.txt. Other labels: URI references, attribution, execution environment.]

http://alpha.myexperiment.org/packs/387

The Research Object is stored and manipulated in a Research Object Digital Library using REST APIs, allowing any tool to view and modify the RO, such as the RO Portal: http://sandbox.wf4ever-project.org/portal/

https://w3id.org/bundle

[Figure: Wf4Ever architecture. Components: REST API, RDF triple store (RO structure, annotations), RO index, uploaded files, RO Portal, Checklist service, ...]