Post on 19-Dec-2015
A FRAMEWORK BASED ON WEB SERVICES ORCHESTRATION FOR BIOINFORMATICS WORKFLOW MANAGEMENT
Laboratory for Bioinformatics (LBI), Institute of Computing (IC) - UNICAMP
PhD Student: Luciano Antonio Digiampietri
Advisor: João Carlos Setubal
Co-advisor: Cláudia Bauzer Medeiros
Motivation
Genome assembly and annotation pipeline
[Figure: pipeline diagram. Labels: reads, assembly, contigs, human validation, new reads generation, incremental assembly, assembly OK, annotation]
Motivation
The growth of bioinformatics activities
– Data
– Services
Data and services don’t use public standards
Goals
Specification and development of a framework that allows:
– Data integration;
– Service integration;
– Modeling of complex tasks as workflows;
– Coordination of workflow execution.
This talk
Work in progress
– no results yet
Overview: user interaction
Related issues
Web services
– “a software application identified by a URI, whose interfaces and bindings are capable of being defined, described, and discovered as XML artifacts” [W3C:webservices]
Related issues
Workflows
– A workflow represents a set of activities to be executed, their interdependencies, and their inputs and outputs.
[Figure: example workflow. Labels: input data1, input data2, input data3, activity1, activity2, activity3, activity4, output data]
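The definition above (activities, interdependencies, inputs, outputs) can be sketched as a small data structure. This is only an illustration: the wiring between the activities is not recoverable from the slide, so the connections and the intermediate names d1, d2, d3 below are assumptions.

```python
from graphlib import TopologicalSorter

# Hypothetical wiring: the slide only names the activities and data items.
# d1, d2 and d3 are invented intermediate results.
workflow = {
    "activity1": {"inputs": ["input data1"], "outputs": ["d1"]},
    "activity2": {"inputs": ["input data2"], "outputs": ["d2"]},
    "activity3": {"inputs": ["input data3"], "outputs": ["d3"]},
    "activity4": {"inputs": ["d1", "d2", "d3"], "outputs": ["output data"]},
}


def dependencies(wf):
    """Map each activity to the set of activities that produce its inputs."""
    producer = {out: name for name, act in wf.items() for out in act["outputs"]}
    return {
        name: {producer[item] for item in act["inputs"] if item in producer}
        for name, act in wf.items()
    }


def execution_order(wf):
    """Return one valid order in which the activities can be executed."""
    return list(TopologicalSorter(dependencies(wf)).static_order())


order = execution_order(workflow)
print(order)
```

Deriving the dependencies from the declared inputs and outputs, rather than listing them by hand, keeps the two descriptions from drifting apart.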
Related issues
Service coordination
– Service orchestration is a centralized mechanism that describes how diverse services can interact. This interaction includes message exchange, business logic and order of execution;
– We are using BPEL4WS as the specification language for service orchestration.
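To make "centralized" concrete, here is a toy orchestrator in Python. It is not BPEL4WS and not the framework's code: the three services are local stand-in functions with invented behavior, loosely named after the steps of the assembly pipeline in the Motivation slide. The point is only that one coordinator holds the order of execution, the business logic (retry until validation passes) and the messages passed between services.

```python
def assemble(reads):
    """Stand-in for an assembly service: groups every 3 reads into a 'contig'."""
    return [reads[i:i + 3] for i in range(0, len(reads), 3)]


def validate(contigs, min_contigs):
    """Stand-in for validation: accept once enough contigs exist."""
    return len(contigs) >= min_contigs


def generate_reads(count, start):
    """Stand-in for new reads generation."""
    return [f"read{start + i}" for i in range(count)]


def orchestrate(reads, min_contigs=2, max_rounds=5):
    """Central coordinator: decides which service runs next and with what data."""
    log = []
    for _ in range(max_rounds):
        contigs = assemble(reads)
        log.append(("assemble", len(contigs)))
        if validate(contigs, min_contigs):
            log.append(("validate", "ok"))
            return contigs, log
        log.append(("validate", "retry"))
        # Business rule: request more reads, then assemble again.
        reads = reads + generate_reads(3, start=len(reads))
    raise RuntimeError("assembly did not validate within max_rounds")


contigs, log = orchestrate(["read0", "read1", "read2"])
print(log)
```

None of the services knows about the others; only `orchestrate` does, which is the property the slide attributes to orchestration.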
Related issues
Bioinformatics tools and data
– Selection of basic bioinformatics tools for genomic assembly and annotation;
– Selection of some important data sources;
– Use of tool and data ontologies.
Related issues
Example of part of a tool ontology:
Alignment service
    Local Alignment
        Heuristic Alignment
        Non-Heuristic Alignment
    Global Alignment
        Heuristic Alignment
        Non-Heuristic Alignment
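One simple way to hold such an ontology fragment in code is a nested dictionary. The sketch below is an assumption about representation only (the slides do not say how the ontology is stored); it encodes the alignment fragment and lists every specialization of a given concept.

```python
# The alignment fragment as a nested dict: concept -> sub-concepts.
TOOL_ONTOLOGY = {
    "Alignment service": {
        "Local Alignment": {
            "Heuristic Alignment": {},
            "Non-Heuristic Alignment": {},
        },
        "Global Alignment": {
            "Heuristic Alignment": {},
            "Non-Heuristic Alignment": {},
        },
    }
}


def paths(tree, prefix=()):
    """Yield every concept as a path from the root down to that concept."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield path
        yield from paths(children, path)


def specializations(tree, concept):
    """Return the paths of all concepts strictly below `concept`."""
    return [p for p in paths(tree) if concept in p[:-1]]


for path in specializations(TOOL_ONTOLOGY, "Local Alignment"):
    print(" > ".join(path))
```

Paths are used instead of bare names because "Heuristic Alignment" appears under both the local and the global branch.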
The framework
[Figure: framework architecture. Labels: User interface; Workflow management; Workflow design; Workflow engine; Workflow discovery; Workflow repository; Service discovery; Service request; Service Catalog; Service layer; application1, application2, application3]
The framework
Service layer – basic bioinformatics Web services
• assembly,
• matching,
• consensus,
• etc.
Service catalog layer – stores each Web service's
• syntactic description;
• semantic description;
• URI.
The framework
Service discovery – search by:
• functionality,
• context,
• syntax.
Service request layer – management of each Web service request:
• Sending input data;
• Receiving results;
• Detecting service failure.
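A minimal sketch of how the catalog, discovery and request layers could fit together, under stated assumptions: the entry fields, the example.org URIs and the two stand-in services are all hypothetical, and the "remote" call is a local function so the example runs without a network. Search by context is not modeled.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CatalogEntry:
    uri: str                       # placeholder URI, not a real endpoint
    functionality: str             # semantic description (e.g. an ontology concept)
    input_type: str                # syntactic description: what the service accepts
    output_type: str               # syntactic description: what it returns
    invoke: Callable[[str], str]   # local stand-in for the remote call


def fake_assembly(reads: str) -> str:
    """Invented behavior: concatenate comma-separated reads."""
    return reads.replace(",", "")


def broken_service(_: str) -> str:
    raise ConnectionError("service unreachable")


catalog = [
    CatalogEntry("http://example.org/services/assembly", "assembly",
                 "reads", "contigs", fake_assembly),
    CatalogEntry("http://example.org/services/matching", "matching",
                 "contigs", "matches", broken_service),
]


def discover(entries, functionality=None, input_type=None, output_type=None):
    """Search by functionality and/or syntax; None means 'do not filter'."""
    return [
        e for e in entries
        if (functionality is None or e.functionality == functionality)
        and (input_type is None or e.input_type == input_type)
        and (output_type is None or e.output_type == output_type)
    ]


def request(entry, data):
    """Send input data, receive the result, and report failure instead of crashing."""
    try:
        return {"ok": True, "result": entry.invoke(data)}
    except Exception as exc:
        return {"ok": False, "error": str(exc)}


found = discover(catalog, functionality="assembly")
print(request(found[0], "AC,GT"))
print(request(catalog[1], "ACGT"))
```

Returning a status record from `request` lets the caller decide what to do about a failed service, for example retry or pick another entry from `discover`.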
The framework
Workflow engine layer – controls execution of all workflow tasks, via orchestration.
– The main functions:
• interpretation of the process (or task) definition,
• creation and management of process instances,
• navigation between activities,
• supervisory functions.
Workflow design layer – supports workflow specification and editing.
– The facilities provided are:
• graphical interface for workflow editing,
• service list,
• interface description of selected services,
• syntactic check of the workflow.
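The slides do not say what the syntactic check of a workflow covers. As one plausible reading, the sketch below verifies that every input of every activity is either a workflow input or the output of some activity, and that no data item has two producers. The workflow format and the activity names are assumptions made for the example.

```python
def check_workflow(activities, workflow_inputs):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    producers = {}
    for name, act in activities.items():
        for out in act["outputs"]:
            if out in producers:
                problems.append(
                    f"'{out}' is produced by both {producers[out]} and {name}"
                )
            producers[out] = name

    available = set(workflow_inputs) | set(producers)
    for name, act in activities.items():
        for item in act["inputs"]:
            if item not in available:
                problems.append(f"{name}: input '{item}' is never produced")
    return problems


# Hypothetical two-step workflow that passes the check.
good = {
    "assemble": {"inputs": ["reads"], "outputs": ["contigs"]},
    "annotate": {"inputs": ["contigs"], "outputs": ["annotation"]},
}

# Hypothetical workflow with a dangling input.
bad = {
    "annotate": {"inputs": ["contigs", "genes"], "outputs": ["annotation"]},
}

print(check_workflow(good, ["reads"]))
print(check_workflow(bad, ["contigs"]))
```

A design tool could run a check like this before handing the workflow to the engine, so that wiring mistakes surface at design time rather than during execution.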
Methodology
A bottom-up approach:
[Figure: bottom-up methodology. Labels: Service1, Service2, Service3; Service discovery and requisition; Workflow design and engine; Workflow management]
Methodology
Services
– basic bioinformatics services
• specification
• development.
– metadata types definition
Service discovery
– development of techniques for service discovery and request using syntactic and semantic search mechanisms.
Methodology
Workflow
– Specification and development of methods for workflow design and execution;
– Design of workflows is being done using WOODS*;
– Specification and implementation of an orchestration mechanism.
* Seffino, L.A., Medeiros, C.B., Rocha, J.V.R., Yi, B.: WOODS - a spatial decision support system based on workflows. Decision Support Systems 27 (1999) 105-123
Conclusions
The main contribution is the framework itself:
– It allows multi-institutional cooperation, sharing:
• Data
• Tools
• Workflows
– It can serve as the interface among various kinds of users at different research centers.
Conclusions
Other contributions lie in
– Scientific workflow specification and publishing (using Web services as basic units);
– Semantic specification of bioinformatics tasks;
– Definition of a generic methodology for data and tool integration.
Thank you!
Laboratory for Bioinformatics www.lbi.ic.unicamp.br
Institute of Computing (IC) www.ic.unicamp.br
University of Campinas (UNICAMP) www.unicamp.br
Luciano Antonio Digiampietri [email protected]