Stian Soiland-Reyes myGrid, School of Computer Science University of Manchester, UK UKOLN DevSci: Workflow Tools Bath, 2010-11-30 http://taverna.org.uk/


Stian Soiland-Reyes
myGrid, School of Computer Science
University of Manchester, UK

UKOLN DevSci: Workflow Tools
Bath, 2010-11-30

http://taverna.org.uk/


http://taverna.org.uk/
http://mygrid.org.uk/

What is myGrid?
• An e-Science collaboration since 2001
• Not a grid!
• Numerous partners involved:
  – University of Manchester
  – University of Southampton
  – University of Oxford
  – EMBL-EBI
• Provides sustainable and production-quality software
• Supported by OMII-UK, EPSRC and BBSRC
• Mixture of developers, bioinformaticians and researchers

Software | Services | Content | Skills | Community


Motivation
• Challenge:
  – Bioinformatics
  – Large amounts of data
  – Many open questions
  – Numerous freely available public datasets and analysis tools


Huge amounts of data
• QTL regions: 100+ genes
• Microarray: 1000+ genes
• Next Gen Sequencing: 10,000+ genes

How do I look at all the genes systematically?


Manual approach
• Search using public web sites and databases
  – Pubmed
  – Uniprot
  – EBI BioMart
• Copy and paste to web tools for analysis
  – NCBI Blast
  – EBI InterPro
• Further processing locally
  – R
  – Perl
  – Python


Manual: disadvantages
• Scale of analysis task overwhelms researchers – lots of data
• User bias and premature filtering of datasets – cherry picking
• Hypothesis-driven approach to data analysis
• Constant changes in data - problems with re-analysis of data
• Implicit methodologies (hyper-linking through web pages)
• Error proliferation from any of the listed issues – notably human error


Web services and workflows
• Web services
  – Technology and standards for exposing code and data resources that can be programmatically consumed by a remote third party
  – Description of how to interact with the service: parameters, documentation
• Workflows
  – General technique for describing and executing a process
  – Describe what you want to do and which services to run
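The idea of describing a process separately from executing it can be sketched in a few lines of Python. This is only an illustration and not Taverna's engine or file format: the service names, the link triples and the `run_workflow` helper are all hypothetical, and the "services" are local functions standing in for remote web services. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


def run_workflow(services, links, inputs):
    """Run every service once, in an order that respects the data links.

    services: name -> callable taking keyword arguments (stand-in for a web service)
    links:    (source, target, port) triples; source is a service or a workflow input
    inputs:   workflow input name -> value
    """
    # A service depends on every other service that feeds one of its ports.
    deps = {name: set() for name in services}
    for source, target, _port in links:
        if source in services:
            deps[target].add(source)

    values = dict(inputs)
    for name in TopologicalSorter(deps).static_order():
        kwargs = {port: values[source]
                  for source, target, port in links if target == name}
        values[name] = services[name](**kwargs)
    return values


# Hypothetical stand-ins for remote services.
services = {
    "fetch": lambda gene: f"SEQ({gene})",
    "align": lambda sequence: f"ALIGNED[{sequence}]",
    "annotate": lambda sequence: f"DOMAINS[{sequence}]",
    "report": lambda alignment, domains: f"{alignment} + {domains}",
}

# The workflow description: what feeds what, with no explicit ordering.
links = [
    ("gene_id", "fetch", "gene"),
    ("fetch", "align", "sequence"),
    ("fetch", "annotate", "sequence"),
    ("align", "report", "alignment"),
    ("annotate", "report", "domains"),
]

result = run_workflow(services, links, {"gene_id": "BRCA1"})
print(result["report"])
```

The description (`services` plus `links`) says nothing about execution order; the order is derived from the links when the workflow is run.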


Taverna workflows
• A set of (local and remote) services to analyze or manage data
• Nested workflows are also services
• Data links connect services, i.e. output from service A is input to service B and C
• Describes the desired dataflow instead of process coordination
• Automatic iterations
• Can customize list handling and control links
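Automatic iteration and list handling can be pictured with a small Python sketch. It is an illustration under assumptions, not Taverna code: `invoke` and the three services are hypothetical, and the sketch returns a flat list where a real cross product would give nested lists. A service written for single values is called once per item when it receives a list, and with several list inputs the items are combined either as a cross product or pairwise as a dot product.

```python
from itertools import product


def invoke(service, inputs, strategy="cross"):
    """Call `service` once per combination of input items.

    If no input is a list the service is called once. Otherwise single
    values are treated as one-item lists and the lists are combined with
    "cross" (every combination) or "dot" (position by position).
    """
    if not any(isinstance(value, list) for value in inputs):
        return service(*inputs)
    lists = [value if isinstance(value, list) else [value] for value in inputs]
    combos = product(*lists) if strategy == "cross" else zip(*lists)
    return [service(*args) for args in combos]


# Hypothetical single-value services.
def service_a(gene_id):
    return gene_id.upper()


def service_b(name):
    return f"B({name})"


def service_c(name):
    return len(name)


def join(left, right):
    return left + right


# Output of service A is input to both B and C; each iterates over the list.
a_out = invoke(service_a, [["brca1", "tp53"]])
b_out = invoke(service_b, [a_out])
c_out = invoke(service_c, [a_out])

cross = invoke(join, [["a", "b"], ["1", "2"]])
dot = invoke(join, [["a", "b"], ["1", "2"]], strategy="dot")
print(a_out, b_out, c_out, cross, dot)
```

None of the services contains a loop; the iteration comes from the shape of the data arriving on the link.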


What types of services?
• Public/private/secured WSDL/SOAP web services
• RESTful web services
• Spreadsheet import
• Command line tools (local/ssh)
• Inline scripts (Beanshell, R)
• Java APIs
• Customizations:
  – BioMart, BioMoby / SADI
  – Soaplab
  – Grid services (Globus, EGEE gLite, caGrid)
  – … your tool (Plugin tutorial on wiki)


Which services?
• Taverna is general, can connect to standard web services for any domain
• Bioinformatics:
  – From professional third-party organisations providing robust & open data/analysis services
  – ..to under-the-desk web services for one particular purpose, run by PhD students
• http://biocatalogue.org/ - 1730 services from 130 providers – crowd sourced and quality monitored


Taverna workbench
• Graphical desktop tool
• No server installation required
• Drag-and-drop services into diagram
• Connect services, run, reconnect, rerun
• Integrates diverse set of tools


Sharing workflows
• myExperiment.org allows users to share, find, download and rate workflows
• “Facebook for the scientist”
• 3000 members, 1100 workflows


Extensible UI and engine
• Plugins can provide new “perspectives”
  – e.g. BioCatalogue, myExperiment
• Provide service-specific customization
  – BioMart interface replicates web site
• Adding new functionality
  – Looping, branching, dynamic service resolution
  – New service types
  – Design helpers, “Find matching service”


Taverna 3 “Next-gen”
• Under development for 2011
• Interactive, component-centric and data-centric workflow design
  – Pre-packaged workflow components
  – Searching for workflow components from BioCatalogue and myExperiment
  – New myGrid workflow components library


Taverna command line
• Executes from Windows/Linux/OSX shells
• Takes a predefined workflow with files as inputs and outputs
• Quick way to “productionize” a workflow
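A script that wraps the command line tool might look like the sketch below. Treat every specific in it as an assumption: the launcher name `executeworkflow.sh` and the `-inputfile` / `-outputdir` flags are as recalled from the Taverna 2 command line tool and should be checked against the tool's own help output, and the workflow, port and file names are hypothetical.

```python
import subprocess


def build_taverna_command(workflow, input_files, output_dir,
                          launcher="executeworkflow.sh"):
    """Build the argument list for one workflow run.

    input_files maps a workflow input port name to a file path.
    The launcher and flag names are assumptions, not verified here.
    """
    cmd = [launcher]
    for port, path in sorted(input_files.items()):
        cmd += ["-inputfile", port, path]
    cmd += ["-outputdir", output_dir, workflow]
    return cmd


# Hypothetical workflow, port and file names.
command = build_taverna_command(
    "snp_annotation.t2flow",
    {"chromosome": "chr1.fa"},
    "results/chr1",
)
print(" ".join(command))

# To actually run it (requires the command line tool on the PATH):
# subprocess.run(command, check=True)
```

Building the argument list separately from running it keeps the wrapper easy to test and to loop over many input files.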


Taverna Server
• REST/SOAP interface to execute workflows
• Client libraries for Ruby and Java
• Two demonstration web interfaces
  – Ruby
  – Java Portlets
• Future
  – Detailed execution support and control
  – Security delegation


Taverna portlet
• Example portlet implementation
• Executes workflows using Taverna Server


Ruby web interface
• Example customized web interface
• Uses Ruby gem t2-server


Taverna on the cloud
• Use-case:
  – SNP analysis and annotation of genomes sequenced from breeds of cows in Africa – why are some of them resistant to X?
• Amazon EC2 with Taverna Server and local services
• Custom (built-in-a-week) Ruby on Rails web interface
• Runs through 31 chromosomes in 6.5 hours using 10 instances - $26
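The figures on this slide can be broken down with some quick arithmetic. The sketch assumes that all ten instances ran for the full 6.5 hours and that $26 is the total bill; neither is stated on the slide, and the variable names are only illustrative.

```python
# Figures from the slide.
chromosomes = 31
wall_clock_hours = 6.5
instances = 10
total_cost_usd = 26.0

# Assumption: every instance was running for the whole wall-clock time.
instance_hours = instances * wall_clock_hours
cost_per_instance_hour = total_cost_usd / instance_hours
cost_per_chromosome = total_cost_usd / chromosomes

print(f"{instance_hours:.0f} instance-hours")
print(f"${cost_per_instance_hour:.2f} per instance-hour")
print(f"${cost_per_chromosome:.2f} per chromosome")
```

Under these assumptions the run comes to 65 instance-hours, about $0.40 per instance-hour and roughly $0.84 per chromosome.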


Open source, open development
• The Taverna tools are all open source and free to use
• Large user community, active mailing lists
• Lead developers: myGrid in Manchester
• Contributors from across the world
• PAL programme
• myGrid provides training, tutorials and documentation


Acknowledgements


More information

http://www.mygrid.org.uk/

http://www.taverna.org.uk/

http://www.myexperiment.org/

http://www.biocatalogue.org/