biomoby: an architecture for interoperability
Post on 06-Jan-2016
BioMOBY: An architecture for interoperability
Benjamin Good
Wilkinson Laboratory
iCAPTURE Centre
University of British Columbia
Acknowledgements
Mark Wilkinson, Edward Kawas, Nina Opushneva – iCAPTURE @ UBC
Phillip Lord, Martin Senger – myGrid @ U Manchester
Heiko Schoof, Rebecca Ernst – MIPS
Paul Gordon – University of Calgary
Carole Goble – myGrid @ U Manchester
Lincoln Stein – CSHL
Damian Gessler, Andrew Farmer, Gary Schiltz – NCGR
Bill Crosby, Matthew Links, Luke McCarthy – U of S
Midori Harris – EBI & GO Consortium
Mike Niemi – IBM
Fiona Cunningham, Shuly Avraham – CSHL
Ken Stuebe – SDSC
Richard Bruskiewich – IRRI
Outline
• What BioMOBY is
• Why it was needed
• How it works
• Current Status
• Works in Progress
What BioMOBY is
A generic solution for sharing distributed computational resources
Why it was/is needed
High throughput Biology
[Figure, built up over five slides: the set of data sources grows from SGD alone to SGD, TAIR, MIPS, Gramene, IRRI, GO and IPGRI, ending in “?!?!?”]
Integration?
DB1 Program DB2
Web Services
Another architecture for Dis-Integration?
NCI WuBlast Genbank
API1 API2 API3
BioMOBY
An architecture for Integration
DB1 Program DB2
Note the Target Audience
• Not NCBI
• Small to medium sized resource providers
  • First priority to support their own users
  • Limited time and money
• Makes certain options impossible
  • No massive data warehouse
  • No standardization of implementation (database, programming language)
Outline
• What BioMOBY is
• Why it was needed
• How it works
• Current Status
The Moby plan
1. Design an ontological framework for data-type creation
2. Let independent service providers build data-types using this framework
3. Use these data-types to define web service interfaces.
4. Register these interfaces in a “yellow pages”
• Machines can find an appropriate service
• Machines can execute that service unattended
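The plan above can be sketched in a few lines of Python. This is a toy in-memory stand-in for the “yellow pages”, not the MOBY Central API: the `ServiceInterface` and `Registry` classes, their methods, and the service names are all hypothetical and exist only to illustrate steps 3 and 4.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceInterface:
    """A service described purely by the data-types it consumes and produces."""
    name: str          # hypothetical service name
    input_type: str    # data-type name taken from the shared ontology
    output_type: str


@dataclass
class Registry:
    """Toy 'yellow pages': lets a program look up services by input data-type."""
    services: list = field(default_factory=list)

    def register(self, service: ServiceInterface) -> None:
        self.services.append(service)

    def find_by_input(self, data_type: str) -> list:
        """Return every registered service that accepts `data_type`."""
        return [s for s in self.services if s.input_type == data_type]


registry = Registry()
registry.register(ServiceInterface("getSequence", "Object", "DNASequence"))
registry.register(ServiceInterface("alignSequence", "DNASequence", "Alignment"))

# A machine holding a DNASequence can discover what it may do next.
for service in registry.find_by_input("DNASequence"):
    print(service.name, "->", service.output_type)
```

Because interfaces are declared only in terms of shared data-types, discovery reduces to a lookup on the type of the data in hand.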
Object Ontology
• Data types defined in an open, shared GO-like ontology
  – Nodes define data Classes
  – Edges define the relationships between Classes
• Edges define one of three relationships
  – ISA
    • Inheritance relationship
    • All properties of the parent are present in the child
  – HASA
    • Container relationship of ‘exactly 1’
  – HAS
    • Container relationship with ‘1 or more’
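The three edge types can be modelled as a small data structure. The sketch below is an assumption about one reasonable in-memory representation, not the BioMOBY implementation; the `DataClass` type and its method names are hypothetical, and the class names are borrowed from the data-type examples in this deck.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataClass:
    """A node in the ontology; edges are stored on the node they start from."""
    name: str
    isa: Optional["DataClass"] = None          # ISA: the single parent class
    hasa: dict = field(default_factory=dict)   # HASA: article name -> class (exactly 1)
    has: dict = field(default_factory=dict)    # HAS: article name -> class (1 or more)

    def ancestors(self):
        """Yield this class, then each ISA ancestor up to the root."""
        node = self
        while node is not None:
            yield node
            node = node.isa

    def is_a(self, other: "DataClass") -> bool:
        return any(node is other for node in self.ancestors())

    def all_hasa(self) -> dict:
        """HASA members, including those inherited through ISA edges."""
        members = {}
        for node in reversed(list(self.ancestors())):
            members.update(node.hasa)
        return members


obj = DataClass("Object")
integer = DataClass("Integer", isa=obj)
string = DataClass("String", isa=obj)
virtual_seq = DataClass("VirtualSequence", isa=obj, hasa={"length": integer})
generic_seq = DataClass("GenericSequence", isa=virtual_seq,
                        hasa={"SequenceString": string})

print(generic_seq.is_a(obj))
print(sorted(generic_seq.all_hasa()))
```

Walking the ISA chain is what makes “all properties of the parent are present in the child” concrete: `GenericSequence` reports both its own `SequenceString` and the inherited `length`.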
Data-typing is the key
• Each Object in the ontology maps to a simple, concise XML Schema
• This rigid yet easily extensible structure facilitates serialization and parsing in any language.
• Sharing a framework for creating data-types turns out to be largely sufficient to achieve interoperability
The Simplest Data-Type
<Object namespace='NCBI_gi' id='111076'/>
Object
The combination of a namespace and an identifier within that namespace uniquely identifies a data ‘entity’ (not its representation).
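As a minimal sketch, the base `Object` can be built and read with Python's standard `xml.etree.ElementTree`. The helper names `make_object` and `entity_key` are hypothetical; only the element name and the `namespace` and `id` attributes come from the slide.

```python
import xml.etree.ElementTree as ET


def make_object(namespace: str, identifier: str) -> ET.Element:
    """Build a bare Object element that names an entity without representing it."""
    return ET.Element("Object", {"namespace": namespace, "id": identifier})


def entity_key(element: ET.Element) -> tuple:
    """The (namespace, id) pair is what identifies the entity."""
    return (element.get("namespace"), element.get("id"))


gi = make_object("NCBI_gi", "111076")
print(ET.tostring(gi, encoding="unicode"))
print(entity_key(gi))
```

Two elements with the same key refer to the same entity even if one of them carries additional child content.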
MOBY Primitives
[Ontology diagram: Integer, String, Float and DateTime each ISA Object]
<Integer namespace='' id=''>38</Integer>
A MOBY Data-Type
<VirtualSequence namespace='NCBI_gi' id='111076'>
  <Integer namespace='' id='' articleName="length">38</Integer>
</VirtualSequence>
[Ontology diagram: Integer, String and VirtualSequence each ISA Object; VirtualSequence HASA Integer]
A MOBY Data-Type
<GenericSequence namespace='NCBI_gi' id='111076'>
  <Integer namespace='' id='' articleName="length">38</Integer>
  <String namespace='' id='' articleName="SequenceString">
    ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC
  </String>
</GenericSequence>
[Ontology diagram: as before, plus GenericSequence ISA VirtualSequence and GenericSequence HASA String]
A MOBY Data-Type
<DNASequence namespace='NCBI_gi' id='111076'>
  <Integer namespace='' id='' articleName="length">38</Integer>
  <String namespace='' id='' articleName="SequenceString">
    ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC
  </String>
</DNASequence>
[Ontology diagram: as before, plus DNASequence ISA GenericSequence]
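To illustrate why this structure is easy to parse in any language, here is a hedged sketch that reads the DNASequence example with Python's standard library. The `articles` helper is hypothetical, and the XML is the slide's example retyped with straight quotes so that it is well-formed.

```python
import xml.etree.ElementTree as ET

# The DNASequence example from the slide, with straight quotes so it parses as XML.
DOC = """
<DNASequence namespace='NCBI_gi' id='111076'>
  <Integer namespace='' id='' articleName="length">38</Integer>
  <String namespace='' id='' articleName="SequenceString">
    ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC
  </String>
</DNASequence>
"""


def articles(element: ET.Element) -> dict:
    """Map each child's articleName to its whitespace-stripped text."""
    return {child.get("articleName"): (child.text or "").strip()
            for child in element}


root = ET.fromstring(DOC.strip())
members = articles(root)
print(root.tag, root.get("namespace"), root.get("id"))
print(members["length"], members["SequenceString"])
```

A consumer that only understands VirtualSequence needs nothing but the `length` article, so the same lookup works unchanged on a GenericSequence or a DNASequence.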
A portion of the MOBY-S Object Ontology
…community-built!
170 registered by 34 authorities
Gene names
MOBY-S follows the typical Web Service Paradigm
MOBYCentral - yellowpages
MOBY hosts & services
[Figure labels: Sequence, Alignment, Sequence, Express., Protein, Alleles, …; Align, Phylogeny, Primers]
Outline
• What BioMOBY is
• Why it was needed
• How it works
• Current status
• Works in progress
Moby Stats
• Mailing list count: 162 members
• Google Scholar – ‘BioMOBY’: 103
  – Citations of original BioMOBY paper: 52
• Google links to biomoby.org: 322
Deployed Moby Services
http://castor.brc.mcw.edu/files/mobysphere/
[Map legend: > 10, < 10]
Thanks to Simon Twigger
• Services registered: 478
• Service developers (by contact email): 69
Major Implementations
• PlaNet consortium
  – European consortium of plant databases
  – 121 Services
• European Bioinformatics Institute
  – SOAPLab, myGrid
• National Bioinformatics Institute of Spain
  – Nationwide initiative
  – 35 public services (plus many more on private registry)
MOBY Central Activity
[Chart: API Calls Per Weekday 2004-2005. X-axis: Sun to Sat. Y-axis: Cumulative Hits, 0 to 450000. Series: reqs.]
It seems to be working! Why?
• It provides useful functionality for the target audience.
• Functionality not currently available from any other WS/SWS project
• It is not difficult to deploy services.
Outline
• What BioMOBY is
• Why it was needed
• How it works
• Current Status
• Works in Progress
Is it useful outside of these consortia?
• Many public services now available (via passive altruism).
• As a result, interesting clients are emerging.
Client style 1,2,3
1. Power User: when you want to do what you already know how to do
  – Taverna
    • Produced by the myGrid Consortium
    • Graphical workflow composer and invoker
    • Supports BioMOBY services (and many others)
Taverna
Client style 1,2,3
2. Quick and Dirty: you know what you have and what you want, but you don’t know how to make it happen
  – MobyGraphs
    • Martin Senger of myGrid
    • Discovers service connectivity between two datatypes
  – PlaNet Service Aggregator
    • Precomputes all possible workflows starting from a single input
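Finding connectivity between two data-types amounts to a path search over a graph whose nodes are data-types and whose edges are services. The sketch below uses a breadth-first search as one plausible way to do this; it is not taken from MobyGraphs or the PlaNet Service Aggregator, and the service and data-type names in `SERVICES` are hypothetical.

```python
from collections import deque

# Hypothetical registry contents: (service name, input data-type, output data-type).
SERVICES = [
    ("getSequence", "Object", "DNASequence"),
    ("translate", "DNASequence", "AminoAcidSequence"),
    ("blast", "AminoAcidSequence", "BlastReport"),
    ("findPrimers", "DNASequence", "PrimerSet"),
]


def find_workflow(services, start, goal):
    """Breadth-first search for the shortest service chain from start to goal.

    Returns a list of service names, or None if the types are not connected.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        data_type, path = queue.popleft()
        if data_type == goal:
            return path
        for name, input_type, output_type in services:
            if input_type == data_type and output_type not in seen:
                seen.add(output_type)
                queue.append((output_type, path + [name]))
    return None


print(find_workflow(SERVICES, "Object", "BlastReport"))
# → ['getSequence', 'translate', 'blast']
```

Running the same search from one starting type without a fixed goal, and recording every path reached, would enumerate all workflows available from a single input.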
Client style 1,2,3
3. Exploration Mode
  – Gbrowse_moby
  – Ahab
Starting Data
Ahab
• Java Server Pages
• Simultaneous service invocations
• Session stored as RDF graph
• Results displayed with clickable graph.
• 0_1 Runs all possible services
• 0_2 Gives user control
http://bioinfo.icapture.ubc.ca/bgood/Ahab.html
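Simultaneous invocation of every applicable service can be illustrated with a thread pool. This is only a sketch of the idea in Python: Ahab itself is built on Java Server Pages, and the two local functions below are hypothetical stand-ins for remote services.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for remote services; a real client would make network calls.
def get_length(sequence: str) -> int:
    return len(sequence)


def gc_fraction(sequence: str) -> float:
    return sum(base in "GC" for base in sequence) / len(sequence)


def run_all(services: dict, data) -> dict:
    """Submit the same input to every service at once and collect results by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(services))) as pool:
        futures = {name: pool.submit(fn, data) for name, fn in services.items()}
        return {name: future.result() for name, future in futures.items()}


results = run_all({"getLength": get_length, "gcFraction": gc_fraction}, "ATGC")
print(results)
```

Submitting all calls before waiting on any of them means the total wait is bounded by the slowest service rather than by the sum of all of them.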
Core Development
1. Make service development even easier
2. Expand myGrid collaboration
  – Migrate to their registry & service ontology
  – Enhance support for BioMOBY in Taverna
    • Validation of workflows
    • Workflow construction “wizards”
3. Continue Development of Ahab
  – Visualization
Conclusions
• BioMOBY was designed to allow distributed communities to share their computational resources, and it seems to be working
• Many new opportunities for real distributed data integration are starting to appear
Sponsors
BioMOBY
BC Bioinformatics Training Program
National Science Foundation (NSF), USA
Canadian Bioinformatics Resource, NRC, Halifax
Open-Bio Foundation
IBM