Page 1: 1 New EMBOSS Web Service Shaun McGlinchey (shaun@ebi.ac.uk)

New EMBOSS Web Service

Shaun McGlinchey (shaun@ebi.ac.uk)

Page 2

Outline

The presentation will discuss the challenges encountered in exposing the EMBOSS suite of command line sequence analysis tools as a ‘stateful’ SOAP-based web service. An overview of the proposed framework for client-side requests, server-side job submission and results delivery will then be given.

Page 3

What is EMBOSS?

EMBOSS is "The European Molecular Biology Open Software Suite".

What can I use EMBOSS for?

It consists of approx. 300 command line applications covering areas such as:

Sequence alignment

Rapid database searching with sequence patterns

Protein motif identification, including domain analysis

Phylogenetic analysis

Presentation tools for publication

Page 4

What is JAX-WS?

In the words of Sun: Java API for XML Web Services (JAX-WS) is the centerpiece of a newly rearchitected API stack for Web services, the so-called "integrated stack" that includes JAX-WS 2.0, JAXB 2.0, and SAAJ 1.3.

Essentially a SOAP toolkit for Java

It is the renamed successor to JAX-RPC

It brings clear improvements in data binding capabilities through its tight integration with JAXB (Java Architecture for XML Binding)

Page 5

Current State of (old) EBI EMBOSS Web Service

The current server-side implementation is Perl-based. Sample clients are available for .NET, SOAP::Lite and Java (Axis).

It currently accepts free text as data input: weak typing, poor validation capability

Supports both synchronous and asynchronous job submission. Asynchronous requests are allocated a job id

Migrating to a Java-based JAX-WS server-side implementation gives us more control over the generated artifacts, increased data validation capabilities, and the ability to rapidly improve the functionality provided.

Page 6

EMBOSS Data Types

There are 52 datatypes (at the last count) used within the EMBOSS suite of applications. These fall under five headings:

1. Simple: Array, Boolean, Integer, String …

2. Input: Codon, Features, Sequence, Seqall …

3. Selection Lists: List, Selection …

4. Output: Align, Report, Seqout …

5. Graphics: Graph, Xygraph

Page 7

EMBOSS Qualifiers

An EMBOSS command line program accepts an application name plus qualifiers (each of which is a datatype):

water -asequence tsw:hba_human -bsequence tsw:hbb_human

(i.e. water sequence seqall)

-asequence is of datatype Sequence, -bsequence of Seqall

Qualifiers can have associated qualifiers, which can also be passed on the command line to enable advanced configuration of the application call, e.g. -sbegin, -send, -sformat

Page 8

General, Additional & Advanced Qualifiers

General qualifiers are common to all EMBOSS applications, e.g.

-auto true - Turn off prompts (boolean datatype)

-stdout true - Write to standard output (boolean datatype)

Page 9

Web Service Development

In accordance with the Technology Recommendation we have chosen the Top-Down approach to WS development, not Bottom-Up.

Page 10

Top-Down Approach to WS Development

Top-Down:

1. Express data types in schema

2. Write WSDL (include schema)

3. Generate artifacts (JavaBeans as data objects, server-side stubs, implementation class)

4. Package (WAR file)

5. Deploy WAR file to server

Page 11

Sample EMBOSS Application Schema (Head)

<?xml version="1.0" encoding="UTF-8"?>
<definitions targetNamespace="emboss"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <types>
    <xsd:schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.ebi.ac.uk/ws/emboss/water/">

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="http://www.ebi.ac.uk/ws/emboss/applications/water/"
    xmlns:jxb="http://java.sun.com/xml/ns/jaxb" jxb:version="1.0">

Page 12

Application Schema: Custom Bindings (cont’d)

<xsd:annotation>
  <xsd:appinfo>
    <jxb:schemaBindings>
      <jxb:package name="uk.ac.ebi.ws.emboss.applications.water">
      </jxb:package>
    </jxb:schemaBindings>
  </xsd:appinfo>
</xsd:annotation>

Page 13

Express Application Parameters

<xsd:element name="asequence" type="tns:asequence"/>
<xsd:complexType name="asequence">
  <xsd:sequence>
    <xsd:element name="asequence" type="xsd:string" nillable="false"/>
    <xsd:element name="asequenceQualifiers" type="tns:asequenceQualifiers" nillable="true"/>
  </xsd:sequence>
</xsd:complexType>

Page 14

Express asequenceQualifiers

<xsd:element name="asequenceQualifiers" type="tns:asequenceQualifiers"/>
<xsd:complexType name="asequenceQualifiers">
  <xsd:sequence>
    <xsd:element name="sbegin" type="xsd:integer"/>
    <xsd:element name="send" type="xsd:integer"/>
    <xsd:element name="usa" type="xsd:string"/>
    ……
  </xsd:sequence>
</xsd:complexType>

Page 15

Encapsulate all data types inside an application element

<xsd:element name="water" type="tns:water"/>
<xsd:complexType name="water">
  <xsd:sequence>
    <xsd:element name="asequence" type="tns:asequence"/>
    <xsd:element name="bsequence" type="tns:bsequence"/>
    <xsd:element name="datafile" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

Page 16

Using JAXB Generated Java Beans at the client side

JavaBean objects are generated for the client using the JAX-WS ‘wsimport’ tool, which compiles the WSDL and schema

Generated objects are populated using setter methods (client side), e.g.

Sequence asequence = new Sequence();
asequence.setUsa("tsw:hba_human");
// Qualifier bean; the class name is assumed from the asequenceQualifiers schema type
AsequenceQualifiers asequenceQual = new AsequenceQualifiers();
asequenceQual.setSprotein(true);
asequenceQual.setSbegin(0);

Page 17

EMBOSS Applications (300)

Manually creating the schema for each application is not scalable

Maven is a software project management & build tool.

We have written an EMBOSS ACD parser plugin for our Maven WS software build:

It is a Java class

It takes EMBOSS application definitions (ACD) as input

It outputs an XML Schema and a WSDL representing each EMBOSS application

These schemas are passed to a JAXB compiler, which generates our JavaBean objects
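The ACD-to-schema step can be illustrated with a rough sketch. The slides do not show the actual Maven plugin, so everything here is an assumption: the class name `AcdToSchema`, the simplified ACD fragment, and the rule that each `datatype: name [` block becomes one `xsd:element` whose type is named after the qualifier (the pattern used in the water schema).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the real plugin: turns ACD block headers
// into xsd:element declarations.
class AcdToSchema {
    // Matches a block header such as "sequence: asequence [".
    private static final Pattern BLOCK =
            Pattern.compile("(?m)^\\s*(\\w+):\\s*(\\w+)\\s*\\[");

    static List<String> elements(String acd) {
        List<String> out = new ArrayList<>();
        Matcher m = BLOCK.matcher(acd);
        while (m.find()) {
            String datatype = m.group(1);
            String name = m.group(2);
            // Structural blocks carry no qualifier, so skip them.
            if (datatype.equals("application") || datatype.equals("section")) {
                continue;
            }
            out.add("<xsd:element name=\"" + name + "\" type=\"tns:" + name + "\"/>");
        }
        return out;
    }

    public static void main(String[] args) {
        // Simplified ACD fragment, assumed for illustration only.
        String acd = String.join("\n",
                "application: water [",
                "  documentation: \"Smith-Waterman local alignment\"",
                "]",
                "sequence: asequence [",
                "  parameter: \"Y\"",
                "]",
                "seqall: bsequence [",
                "  parameter: \"Y\"",
                "]");
        elements(acd).forEach(System.out::println);
    }
}
```

A real generator would also need to emit the complex types, the associated qualifiers and the WSDL; this only shows the shape of the mapping.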

Page 18

Advantages of WS EMBOSS Software Build

The advantages of this approach are:

We can auto-generate the XML schema and application WSDLs

We can generate Java objects for use on the client side

We can easily integrate new EMBOSS applications as a WS by running the ACD file through our software build

Page 19

Generated Artifacts

Page 20

Why go to these lengths?

Because of the sheer number of EMBOSS apps, it is necessary to provide a clear means of representing the invocation of separate applications and the passing of parameters appropriate to that app.

******* CLIENT-SIDE CODE **********

RunEmbossRequest run = new RunEmbossRequest();
EmbossParams water = new EmbossParams();
water.setAsequence(asequence);
water.setBsequence(bsequence);
Emboss emboss = new Emboss();
emboss.setApplication(EmbossApplication.WATER);
emboss.setApplicationParams(water);
run.setEmbossParams(emboss);
WSEmbossService service = new WSEmbossService();
WSEmboss wsemboss = service.getWSEmboss();
RunEmbossResponse response = wsemboss.run(run);

Page 21

Server-side: Reverse Process

On the server side, values are read back from the deserialised objects using the Java getter methods, e.g.

******* SERVER-SIDE CODE **********

Emboss emboss = input.getEmbossParams();
EmbossApplication embossApp = emboss.getApplication();
String appname = embossApp.value();
EmbossParams water = emboss.getApplicationParams();
Sequence asequence = water.getAsequence();
Seqall bsequence = water.getBsequence();

This solution does not scale well to approx. 300 applications

Page 22

How do we get from a Web Service payload to a valid command line?

We are looking at the possibility of developing a generic mechanism to transform the SOAP envelope (our WS inputs: Water params etc.) using XSL (Extensible Stylesheet Language) into a form that can be used to invoke the EMBOSS binary (application).
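As a rough illustration of the XSL idea, the sketch below applies one generic stylesheet to a payload using the JDK's `javax.xml.transform` API. The payload is a simplified assumption (just the application element, with no SOAP envelope or namespaces), and the class name `PayloadToCommandLine` and the stylesheet are hypothetical, not the service's actual transform.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: XML payload -> command line text via XSLT.
class PayloadToCommandLine {
    // Simplified payload, assumed for illustration only.
    static final String PAYLOAD =
            "<water>"
          + "<asequence>tsw:hba_human</asequence>"
          + "<bsequence>tsw:hbb_human</bsequence>"
          + "</water>";

    // Generic rule: the root element name is the program,
    // and each child element becomes "-name value".
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/*'>"
          + "<xsl:value-of select='local-name()'/>"
          + "<xsl:for-each select='*'>"
          + "<xsl:text> -</xsl:text><xsl:value-of select='local-name()'/>"
          + "<xsl:text> </xsl:text><xsl:value-of select='.'/>"
          + "</xsl:for-each>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(PAYLOAD));
    }
}
```

Because the stylesheet keys only on element names, the same transform would apply to any application element of this shape, which is the appeal of the generic approach. The output is plain text and would still need validating before being executed.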

Page 23

Understanding our Job Submission Requirements

Building a valid & secure command line (approx. 300 EMBOSS applications)

Issuing the command line (300 applications)

Retrieving results from the EMBOSS application

Our WS job submission should fulfill the EMBRACE Technology Recommendations of:

Being a ‘Stateful Web Service’

Implementing both synchronous and asynchronous functionality

Synchronous: submit a job and stay locked in to that application until it returns a result

Asynchronous (not synchronised): submit a job but retain a free hand (not locked in); we can poll the service with a jobid to obtain job status and results
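One way to approach the "valid & secure command line" requirement is to build an argument list rather than a shell string, and to reject anything outside a conservative character set. The sketch below is only an assumption about how that could look: the class `EmbossCommandLine`, the application whitelist and the patterns are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper: builds an argument vector for an EMBOSS application.
class EmbossCommandLine {
    // Assumed whitelist; a real service could derive it from the ACD files.
    private static final Set<String> ALLOWED_APPS = Set.of("water", "needle");
    // Conservative patterns: no spaces, quotes or shell metacharacters.
    private static final Pattern SAFE_NAME = Pattern.compile("[a-z][a-z0-9]*");
    private static final Pattern SAFE_VALUE = Pattern.compile("[A-Za-z0-9_:.\\-]+");

    static List<String> build(String app, Map<String, String> qualifiers) {
        if (!ALLOWED_APPS.contains(app)) {
            throw new IllegalArgumentException("unknown application: " + app);
        }
        List<String> argv = new ArrayList<>();
        argv.add(app);
        for (Map.Entry<String, String> q : qualifiers.entrySet()) {
            if (!SAFE_NAME.matcher(q.getKey()).matches()
                    || !SAFE_VALUE.matcher(q.getValue()).matches()) {
                throw new IllegalArgumentException("rejected qualifier: " + q.getKey());
            }
            argv.add("-" + q.getKey());
            argv.add(q.getValue());
        }
        return argv;
    }

    public static void main(String[] args) {
        Map<String, String> q = new LinkedHashMap<>();
        q.put("asequence", "tsw:hba_human");
        q.put("bsequence", "tsw:hbb_human");
        // Prints: water -asequence tsw:hba_human -bsequence tsw:hbb_human
        System.out.println(String.join(" ", build("water", q)));
    }
}
```

The resulting list could be passed to `java.lang.ProcessBuilder`, which starts the program directly without going through a shell, so the values are never re-parsed as shell syntax.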

Page 24

Operations to support requirement of ‘Stateful’ WS

RunJob: e.g. runJob(water); all parameters for the job are encapsulated in the water object. The operation returns a jobid.

CancelJob: e.g. cancelJob("water12"); cancels the job execution.

GetStatus: e.g. getStatus("water12"); returns the job status (Waiting, Scheduled, Running, Done, Cancelled, Aborted).

GetResult: e.g. getResult("water12"); retrieves the result of a job, given an identifier.
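The four operations can be sketched as a small in-memory job store. This is an illustrative assumption rather than the service's design: the class `JobStore`, the `complete` callback and the job id scheme are hypothetical, and `runJob` takes only an application name instead of the full parameter object to keep the example short.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory model of the stateful operations.
class JobStore {
    enum Status { WAITING, SCHEDULED, RUNNING, DONE, CANCELLED, ABORTED }

    private final Map<String, Status> statuses = new ConcurrentHashMap<>();
    private final Map<String, String> results = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    // RunJob: registers the job and returns its id, e.g. "water1".
    String runJob(String application) {
        String jobId = application + counter.incrementAndGet();
        statuses.put(jobId, Status.WAITING);
        return jobId;
    }

    // GetStatus: looks up the current state of a known job.
    Status getStatus(String jobId) {
        Status s = statuses.get(jobId);
        if (s == null) {
            throw new IllegalArgumentException("unknown job: " + jobId);
        }
        return s;
    }

    // CancelJob: marks the job cancelled unless it has already finished.
    void cancelJob(String jobId) {
        getStatus(jobId); // validates the id
        statuses.computeIfPresent(jobId,
                (id, s) -> s == Status.DONE ? s : Status.CANCELLED);
    }

    // Called by the execution back end (not shown) when a job finishes.
    void complete(String jobId, String output) {
        getStatus(jobId); // validates the id
        results.put(jobId, output);
        statuses.put(jobId, Status.DONE);
    }

    // GetResult: only available once the job is done.
    String getResult(String jobId) {
        if (getStatus(jobId) != Status.DONE) {
            throw new IllegalStateException("job not finished: " + jobId);
        }
        return results.get(jobId);
    }

    public static void main(String[] args) {
        JobStore store = new JobStore();
        String id = store.runJob("water");
        System.out.println(id + " " + store.getStatus(id));
        store.complete(id, "alignment output");
        System.out.println(store.getResult(id));
    }
}
```

An asynchronous client would call `runJob`, keep the returned id, and poll `getStatus` until the job is done before calling `getResult`.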

Page 25

Do we have to reinvent the wheel? Enter OMII

We propose borrowing established technology as one possible solution to our requirements

Recently (this week) I met with the Software Group Leader at OMII, the Open Middleware Infrastructure Institute, based at the University of Southampton (www.omii.ac.uk)

OMII is an established GRID middleware service provider and is very keen to have real users (developers using their products)

OMII designs GRID-related software products

Page 26

What can they offer us?

We are interested in their GridSAM product

GridSAM consists of several subsystems that support:

Pluggable job persistence (if your job fails, it will be retried)

Job queuing and launching

Job monitoring (states: pending, staging in, active, executed, staging out, job completed)

Page 27

GridSAM cont’d

File staging (stage in input files, stage out output files)

All this functionality is available through an API, the JobManager interface

This provides us with rich job submission functionality at little cost

Typically this functionality will be invoked from within the embedding application (the web service) using the API

Page 28

How do I pass my job content to the GridSAM server?

Jobs are launched by passing a JSDL (Job Submission Description Language) document to the GridSAM server from a GridSAM client using the JobManager API

All of this can exist underneath your web service layer

Opportunity for a shared EMBRACE server perhaps!

Page 29

Sample JSDL

<?xml version="1.0" encoding="UTF-8"?>
<JobDefinition xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <JobDescription>
    <Application>
      <POSIXApplication xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
        <Executable>/bin/echo</Executable>
      </POSIXApplication>
    </Application>
  </JobDescription>
</JobDefinition>
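A document of this shape could also be produced programmatically. The sketch below builds it with the JDK's DOM and transformer APIs. The class name `JsdlWriter` and the executable path `/usr/bin/water` are assumptions for illustration. `Argument` is the JSDL POSIX element for command line arguments. No GridSAM API is used or implied here.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch: writes a minimal JSDL job description.
class JsdlWriter {
    static final String JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl";
    static final String POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix";

    // Creates a namespaced child element and attaches it to the parent.
    private static Element child(Document doc, Element parent, String ns, String name) {
        Element e = doc.createElementNS(ns, name);
        parent.appendChild(e);
        return e;
    }

    static String toJsdl(String executable, List<String> arguments) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element definition = doc.createElementNS(JSDL, "JobDefinition");
            doc.appendChild(definition);
            Element description = child(doc, definition, JSDL, "JobDescription");
            Element application = child(doc, description, JSDL, "Application");
            Element posix = child(doc, application, POSIX, "POSIXApplication");
            child(doc, posix, POSIX, "Executable").setTextContent(executable);
            for (String argument : arguments) {
                child(doc, posix, POSIX, "Argument").setTextContent(argument);
            }

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build JSDL", e);
        }
    }

    public static void main(String[] args) {
        // The executable path is an assumption for illustration.
        System.out.println(toJsdl("/usr/bin/water",
                List.of("-asequence", "tsw:hba_human", "-bsequence", "tsw:hbb_human")));
    }
}
```

Passing each qualifier and value as its own `Argument` element keeps them separate all the way to the executor, which avoids quoting problems.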

Page 30

Very good! What about the EMBOSS WS?

As mentioned, we propose to transform the EMBOSS WS payloads (SOAP messages) at runtime into a valid JSDL document to be submitted to GridSAM

GridSAM looks promising!

We will use the EMBOSS WS as a test bed

If successful we may make a recommendation to WP3

Page 31

Thank you for listening!