Page 1: MyGrid: Personalised e-Biology on the Grid Professor Carole Goble  Contact mygrid@cs.man.ac.uk e-Science

myGrid: Personalised e-Biology on the Grid

Professor Carole Goble

http://www.mygrid.org.uk

Contact mygrid@cs.man.ac.uk

e-Science

Page 2

myGrid: Personalised e-Science on the Grid

Personalised extensible environments for data-intensive in silico experiments in biology

Page 3

e-Science & Biology

• Biology is a multi-faceted & increasingly multi-disciplinary science.

• Bioinformatics is an “e-Science”.
– Discovery is done in silico on results obtained from experiments using a number of analysis & data resources.

• Molecular biology & genomics are our particular focus.

Page 4

Circadian Rhythms

• Has anyone studied the effect of neurotransmitters on the circadian rhythms in Drosophila?

• How do the functions of the clusters of proteins from my experiment interrelate? What are the proteins with a particular function?

• Is a structure known for this protein and what other proteins have a similar structure?

• Can I build a homology 3D model?

• What is known about the homologous protein?

Page 6

Information Weaving

• Large amounts of data & many applications.

• Highly heterogeneous.
– Different types, algorithms, forms, implementations, communities, service providers.

• Highly complex and inter-related.

• Highly volatile.

• Obstacles everywhere.

Page 7

Descriptive knowledge

Page 8

Circadian Rhythms

1. Has anyone else studied the effect of neurotransmitters on the circadian rhythms in Drosophila?

2. How do the functions of the clusters of proteins from my experiment interrelate? And what are the proteins with a particular function?

3. Is a structure known for this protein and what other proteins have a similar structure?

4. Can I build a homology 3D model?

5. What is known about the homologous protein?

Page 9

E-Science Q & A

Who else has asked this question & can I use/adapt their approach?
– Workflow.

What were the results at each stage?
– Dynamic Data Repositories.

When was P12345 last updated? Which BLAST did I use?
– Provenance.

Has PDB changed since I last ran this?
– Notification.

Personalisation.
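The provenance and notification questions on this slide can be illustrated with a minimal Python sketch. It is not myGrid code: the `ProvenanceRecord` class, the `changed_since` helper, the version string and every date are assumptions made up for illustration. The idea is only that recording which tool version and which data releases a step used makes it possible to answer "Which BLAST did I use?" and "Has PDB changed since I last ran this?" later.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ProvenanceRecord:
    """One workflow step: which tool ran, and which data releases it saw."""
    step: str
    tool: str
    tool_version: str   # answers "Which BLAST did I use?"
    inputs: dict        # resource name -> release date seen at run time
    run_on: date


def changed_since(record, current_releases):
    """Return the resources whose current release is newer than the recorded one."""
    return sorted(
        name
        for name, seen in record.inputs.items()
        if current_releases.get(name, seen) > seen
    )


# All values below are made up for illustration.
record = ProvenanceRecord(
    step="similarity search",
    tool="BLAST",
    tool_version="example-version",
    inputs={"PDB": date(2002, 1, 10), "P12345": date(2001, 11, 3)},
    run_on=date(2002, 2, 1),
)
current_releases = {"PDB": date(2002, 3, 5), "P12345": date(2001, 11, 3)}

print(changed_since(record, current_releases))  # ['PDB']
```

A notification service could run a check like `changed_since` whenever a resource publishes a new release and alert the owner of each affected record.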

Page 10

myGrid Objectives

• Straightforward discovery, interoperation, fusion, sharing of data, knowledge and workflows.

• Explicit management of workflows.
– information & processes & best practice.

• Improving quality of experiments & data.
– provenance & propagating change.

• Scientific discovery is personal & global.
– personalisation & collaborative working.

• Security, ownership -> valuable assets.

Page 11

Who is myGrid for?

– Users, developers, maintainers.
– Biologists.
– Bioinformaticians, resource providers.
– Tool builders, system administrators.

[Diagram: myGrid users. Labels as extracted: biologists, IS specialists, infrequent, problem specific, bioinformaticians, tool builders, service provider, systems administrators, bioinformatics tool builders.]

Page 12

myGrid Outcomes

1. e-Scientists
– Environment built on toolkits for service access, personalisation & community.
– Gene function expression analysis (fly & yeast).
– Annotation workbench for the PRINTS pattern database.

2. Developers
– Protocols and service descriptions.
– myGrid-in-a-Box developers kit of core services.
– Reference implementation services & applications.
– Bio services – already delivered.

Page 13

myGrid Stack

[Diagram: myGrid stack. Labels as extracted: Metadata Services, Coordination Services, Data, Workflow, Directory, Networked Services, Applications, Client Framework, Governance, Directory, Provenance, Personalisation, Semantic Services, Info. Extraction, Workflow, Ontology, Portal, User Agent, Collaboration, Data, Admin.]

Page 14

myGrid Pre-Prototype

[Diagram components: Portal, Personal Repository, Metadata: Ontology, Workflow Enactment, Metadata: Service Directory, Workflow Repository, Bioinformatic Services.]

Page 15

Locating a workflow

How do the functions of the clusters of proteins from my experiment interrelate?

[Diagram components: Portal, Personal Repository, Meta Data: Ontology, Workflow Repository, Meta Data: Service Type Directory, Repository Client, Ontology Client, Workflow Client.]

Page 16

Locating a workflow

[Diagram: same components as page 15.]

Page 17

Locating a workflow

[Diagram: same components as page 15.]

Page 18

Locating a workflow

[Diagram: same components as page 15.]
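One way to picture "locating a workflow" through ontology metadata is the Python sketch below. It is an assumption-laden illustration, not the myGrid implementation: the ontology terms, the workflow names and the functions `is_a` and `locate_workflows` are all hypothetical. Workflows in a repository are annotated with a task term, and a query for a general term also returns workflows annotated with more specific terms.

```python
# Hypothetical ontology fragment: child term -> parent term.
ONTOLOGY = {
    "protein function analysis": "protein analysis",
    "protein structure prediction": "protein analysis",
    "protein analysis": "bioinformatics task",
}

# Hypothetical workflow repository: workflow name -> annotated task term.
WORKFLOWS = {
    "cluster-function-comparison": "protein function analysis",
    "homology-model-builder": "protein structure prediction",
    "literature-search": "bioinformatics task",
}


def is_a(term, ancestor):
    """True if `term` equals `ancestor` or is one of its descendants."""
    while term is not None:
        if term == ancestor:
            return True
        term = ONTOLOGY.get(term)  # walk up to the parent; None at the root
    return False


def locate_workflows(query_term):
    """Return the workflows whose annotation falls under `query_term`."""
    return sorted(
        name for name, term in WORKFLOWS.items() if is_a(term, query_term)
    )


print(locate_workflows("protein analysis"))
# ['cluster-function-comparison', 'homology-model-builder']
```

Querying by meaning rather than by name is what lets a biologist find a workflow written by someone else who described it in different words.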

Page 19

Running a workflow

[Diagram components: Repos. Client, Workflow Client, Service Selection Client, Workflow Enactment, Service Directory, Bioinformatic Services, Personal Repository, Provenance Data. Interactions are numbered 1 to 4; two are marked "2?".]

Page 20

Running a workflow

[Diagram: same components and numbering as page 19.]
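The exact order of the numbered interactions cannot be recovered from the transcript, so the following Python sketch shows only one plausible reading: for each step, an enactment loop selects a service from a directory, invokes it, and logs provenance before passing the result on. The function `run_workflow`, the directory layout and the two toy services are hypothetical stand-ins for real bioinformatic services.

```python
def run_workflow(steps, directory, data):
    """Run each task in order, picking a service and logging provenance."""
    provenance = []
    for task in steps:
        # Service selection: take the first service registered for this task.
        service_name, service = directory[task][0]
        data = service(data)
        provenance.append(
            {"task": task, "service": service_name, "output": data}
        )
    return data, provenance


# Hypothetical service directory: task name -> list of (name, callable).
directory = {
    "transcribe": [("toy-transcribe", lambda seq: seq.replace("T", "U"))],
    "reverse": [("toy-reverse", lambda seq: seq[::-1])],
}

result, log = run_workflow(["transcribe", "reverse"], directory, "ATGC")
print(result)                           # CGUA
print([entry["service"] for entry in log])
```

In a fuller system the final result and the provenance log would be written to the user's personal repository rather than returned to the caller.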

Page 21

myGrid generic technologies

1. Ontologies, Protocols & APIs.

2. Database access from the Grid.
– Reference implementation for UK DBTF.

3. Process enactment on the Grid.

4. Provenance services.

5. Metadata services.
– From Semantic Web: DAML+OIL, RDF(S).

6. Personalisation services.

7. Reference implementation of OGSA.
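As a rough illustration of RDF-style metadata services, the Python sketch below stores service descriptions as subject, predicate, object triples and answers pattern queries over them. It uses plain Python rather than an RDF library, and the `my:` vocabulary, the service identifiers and the `match` function are assumptions invented for the example.

```python
# Hypothetical service metadata as (subject, predicate, object) triples.
TRIPLES = {
    ("service:blast", "rdf:type", "my:SimilaritySearchService"),
    ("service:blast", "my:consumes", "my:ProteinSequence"),
    ("service:blast", "my:produces", "my:AlignmentReport"),
    ("service:modeller", "rdf:type", "my:HomologyModellingService"),
    ("service:modeller", "my:consumes", "my:ProteinSequence"),
}


def match(subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in TRIPLES
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Which services accept a protein sequence as input?
consumers = [
    s for s, _, _ in match(predicate="my:consumes", obj="my:ProteinSequence")
]
print(consumers)  # ['service:blast', 'service:modeller']
```

Because the descriptions are data rather than code, new services can be registered and discovered without changing the clients that query them.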

Page 22

Converging Technologies

• Grid Computing: Globus, Sun Grid Engine, Condor, DS (Jini, Corba)

• Web Technologies: SOAP, WSDL, UDDI, WSFL; DAML+OIL, OWL, RDF(S)

• Agents: ACL, methodology

An early adopter for OGSA

Page 23

The myGrid Team

• Carole Goble • Norman Paton • Brian Warboys • Stephen Pettifer • Luc Moreau • Dave De Roure • Chris Greenhalgh • Tom Rodden • John Brooke • Paul Watson • Alan Robinson • Rob Gaizauskas • Robert Stevens • Ian Horrocks • Neil Wipat

• Matthew Addis • Nick Sharman • Rich Cawley • Simon Harper • Karon Mee • Simon Miles • Vijay Dailani • Xiaojian Liu • Tom Oinn • Martin Senger • Milena Radenkovic • Kevin Glover • Angus Roberts • Chris Wroe

• Mark Greenwood • Phil Lord • Neil Davis • Darren Marvin • Justin Ferris • Peter Li • Nedim Alpdemir • Luca Toldo • Robin McEntire • Anne Westcott • Tony Storey • Bernard Horan • Paul Smart • Robert Haynes

Page 24

myGrid Partners

Page 25

myGrid Summary

• myGrid aims to develop infrastructure middleware for an e-Biologist’s workbench.

• The setting is bioinformatics but the results are intended to be generally applicable to e-Science.

• A mix of standard, vanguard and bleeding edge technologies, advanced development and (some) research.

• Academic & commercial partnership.

• myGrid project is timely & reflects a community desire to “collaborate, or die”.