In silico systems biology: network reconstruction, analysis and network based modelling

EMBO practical course, 10-13 April 2010, Hinxton, UK


Integration of genomic data with biological networks: state of the art and future challenges. Laura I. Furlong

Page 1: In silico systems biology:network reconstruction, analysis and network based modelling

In silico systems biology: network reconstruction, analysis and network based modelling

EMBO practical course

10-13 April 2010, Hinxton, UK


Integration of genomic data with biological networks: state of the art and future challenges

Laura I. Furlong

Integrative Biomedical Informatics Group, Research Unit on Biomedical Informatics (GRIB)

Integration of SNPs and their effects with networks

• SNP
• Functional effect (e.g. loss of binding site)
• Phenotypic effect
• Disease association
• Network modelling
• Network visualization

Bauer-Mehren A, Furlong L, Rautschka M, Sanz F: From SNPs to pathways: integration of functional effect of sequence variations on models of cell signalling pathways. BMC Bioinformatics 2009, 10(Suppl 8):S6.


Prediction of pathogenic effect of mutations and SNPs


A data integration approach

• Sources: EntrezGene, dbSNP
• SNP and mutagenesis information
  • Association to disease
  • Functional effect
• Mapping to dbSNP
• Mapping to NCBI Gene
• Identification of GO concepts
• MySQL DB
• Cytoscape node attribute file


Biological network data

• More than 200 pathway repositories and over 60 specialized in reactions in human

• More than 200 curated models

Sequence variation data

• Manually curated information on nsSNPs and mutations
  • Association to disease
  • Results from mutagenesis experiments
• dbSNP: broad collection of SNPs and short range sequence variants


Visualization


Modelling the impact of sequence variation

• S->A mutation at position 218 leads to protein inactivation

Birtwistle MR, Hatakeyama M, Yumoto N, Ogunnaike BA, Hoek JB, Kholodenko BN (2007) Ligand-dependent responses of the ErbB signaling network: experimental and modeling analyses. Mol Syst Biol 3: 144.

Challenges

Concerning sequence variations:

• Too few have been functionally characterized

• Synonymous (“silent”) mutations can also alter function, e.g. by modulating splicing or altering protein folding

• Tools are needed to predict the impact of coding and non-coding SNPs on gene/protein function (and even on biological processes)


The IntAct project


IntAct goals & achievements

1. Define a standard for the representation and annotation of molecular interaction data
   - Curation manual available from home page
   - Member of the International Molecular interaction Exchange consortium (IMEx)

2. Provide a public repository
   - http://www.ebi.ac.uk/intact
   - ftp://ftp.ebi.ac.uk/pub/databases/intact

3. Populate the repository with experimental data from project partners and curated literature data
   - 4200+ distinct publications, 209,000+ binary interactions, 63,000+ proteins imported from UniProt

4. Provide modular analysis tools
   - search & advanced search, hierarchView, pay-as-you-go, MiNe…

5. Provide portable versions of the software to allow installation of local IntAct nodes
   - Known installations: AstraZeneca, GSK, MERCK, MINT, Proteome Center of Shanghai

IntAct Curation: “Lifecycle of an Interaction”

[Diagram: a publication (full text) is annotated by a curator using the CVs and the curation manual; a super curator checks the entry and either rejects it or accepts it; accepted entries go to the public web site, the FTP site and IMEx (MatrixDB, Mint, DIP); nightly sanity checks produce reports.]


Public data

• All data is manually curated by expert curators

• Curation manual rigorously followed

• All curated data is reviewed by a senior curator

• All data is made available on the FTP site: ftp://ftp.ebi.ac.uk/pub/databases/intact

(!) data updated every week

(!) format available:

Controlled vocabularies

• Why do we use them?

e.g. far too many ways to write: yeast two hybrid, Y2H, 2H, two-hybrid, …

• Full integration of PSI-MI ontology

• Over 1,500 terms, fully defined and cross-referenced
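
As a minimal sketch of why a controlled vocabulary helps, the snippet below maps the free-text spellings listed above onto a single term. The lookup table and the function name `normalise_method` are hypothetical, and the identifier MI:0018 ("two hybrid") is assumed to be the matching PSI-MI term; it should be checked against the ontology before use.

```python
# Hypothetical synonym table: free-text spellings -> one controlled term.
# The PSI-MI identifier below is an assumption; verify it in the ontology.
TWO_HYBRID = ("MI:0018", "two hybrid")

SYNONYMS = {
    "yeast two hybrid": TWO_HYBRID,
    "y2h": TWO_HYBRID,
    "2h": TWO_HYBRID,
    "two-hybrid": TWO_HYBRID,
}

def normalise_method(free_text):
    """Return (identifier, name) for a free-text method, or None if unknown."""
    # Lower-case and collapse whitespace so trivial variants match.
    key = " ".join(free_text.lower().split())
    return SYNONYMS.get(key)
```

With a single identifier per method, a search for two-hybrid data no longer depends on which spelling a curator or author happened to use.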

How to deal with Complexes

• Some experimental protocols generate complex data, e.g. tandem affinity purification (TAP)

• One may want to convert these complexes into sets of binary interactions; two algorithms are available (spoke and matrix expansion)

Both are approximations; spoke is said to generate three times fewer false positives (Bader et al.).
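
The two ways of turning a complex into binary interactions can be sketched as follows. This is an illustration only: the slide names the spoke model, the second algorithm is assumed here to be the matrix model, and the function names and member identifiers are hypothetical.

```python
from itertools import combinations

def spoke_expansion(bait, preys):
    """Spoke model: pair the bait with every prey, nothing else."""
    return [(bait, prey) for prey in preys]

def matrix_expansion(members):
    """Matrix model: pair every member of the complex with every other one."""
    return list(combinations(members, 2))

# Made-up complex: one bait and three co-purified proteins.
complex_members = ["bait", "p1", "p2", "p3"]
spoke = spoke_expansion(complex_members[0], complex_members[1:])
matrix = matrix_expansion(complex_members)
print(len(spoke), len(matrix))  # 3 6
```

For a complex of n members the spoke model yields n-1 pairs and the matrix model n(n-1)/2, which is one way to see why the matrix model tends to report more pairs that are not direct contacts.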


The IntAct web site

IntAct: Home page

http://www.ebi.ac.uk/intact

IntAct: Search and results

[Screenshot callouts: UniProt, Taxonomy, PubMed, Method (PSI-MI CV), Interaction details, Complex?, Interactors, IMEx data, Other PSICQUIC services, Export, Custom columns, Filters]


IntAct: Browse


IntAct: Advanced search: Ontologies


IntAct > Advanced search: Fields

Filtering options

Add more filtering options


IntAct > Advanced search: MIQL

• Molecular Interaction Query Language
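
A MIQL query is a text string of `field:value` terms combined with boolean operators. The helper below is a small sketch of how such a string could be assembled; the helper names are hypothetical, and the field names `species` and `detmethod` are assumed from the PSICQUIC documentation rather than taken from the slides.

```python
def miql_term(field, value):
    """Build one field:value term, quoting values that contain spaces."""
    if " " in value:
        value = '"%s"' % value
    return "%s:%s" % (field, value)

def miql_and(*terms):
    """Combine terms with the AND operator."""
    return " AND ".join(terms)

query = miql_and(miql_term("species", "human"),
                 miql_term("detmethod", "two hybrid"))
print(query)  # species:human AND detmethod:"two hybrid"
```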


IntAct > Chemical search

1. Draw your compound

2. View matching molecules

3. View known interactions


IntAct > Interaction details


IntAct > Interaction details > More ..


IntAct > Interaction details > Find similar interactions

We search for similar interactions by looking for interactions that share the same participants. Interactions with the most participants in common are shown first.

So far all hits are shown; we will work on speeding up that view, as it can be rather slow when the original interaction has many participants.
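
The ranking described above can be sketched as a count of shared participants. This is not IntAct's implementation, only an illustration of the idea; the function name and all identifiers are made up.

```python
def rank_similar(query_participants, interactions):
    """Rank interactions by the number of participants shared with the query.

    `interactions` maps an interaction id to its list of participant ids.
    Interactions sharing nothing with the query are left out.
    """
    query = set(query_participants)
    scored = []
    for interaction_id, participants in interactions.items():
        shared = len(query & set(participants))
        if shared > 0:
            scored.append((shared, interaction_id))
    # Most shared participants first; ties broken by id for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(interaction_id, shared) for shared, interaction_id in scored]

# Made-up identifiers, for illustration only.
example = {
    "int-1": ["A", "B", "C"],
    "int-2": ["A", "D"],
    "int-3": ["E", "F"],
}
ranking = rank_similar(["A", "B", "C", "D"], example)
```

The cost grows with the number of participants in the original interaction, which is consistent with the remark that the view can be slow for large interactions.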


IntAct > List of interactors


IntAct > List of interactors > Compounds


IntAct > Graph view


IntAct > Linking to Cytoscape


Molecular Interaction Standards

Engineering 1850

• Nuts and bolts fit perfectly together, but only if they originate from the same factory

• Standardisation proposal in 1864 by William Sellers

• It was not generally accepted until after WWII, though …

Proteomics 2003

• Proteomics data are perfectly compatible, but only if they are from the same lab / database / software

• “Publish and vanish” by data producers

• Collecting all publicly available data requires huge effort

• Urgent need for standardisation

PSI-MI XML format

• Community standard for Molecular Interactions

• XML schema and detailed controlled vocabularies

• Jointly developed by major data providers: BIND, CellZome, DIP, GSK, HPRD, Hybrigenics, IntAct, MINT, MIPS, Serono, U. Bielefeld, U. Bordeaux, U. Cambridge, and others

• Version 1.0 published in February 2004
  The HUPO PSI Molecular Interaction Format - A community standard for the representation of protein interaction data. Henning Hermjakob et al., Nature Biotechnology 2004.

• Version 2.5 published in October 2007
  Broadening the horizon - Level 2.5 of the HUPO-PSI format for molecular interactions. Samuel Kerrien et al., BMC Biology 2007.

PSIMITAB Format

Standard columns (15):
• ID(s) interactor A & B
• Alt. ID(s) interactor A & B
• Alias(es) interactor A & B
• Interaction detection method(s)
• Publication 1st author(s)
• Publication Identifier(s)
• Taxid interactor A & B
• Interaction type(s)
• Source database(s)
• Interaction identifier(s)
• Confidence value(s)

+ IntAct specific columns (+11):
• Experimental role(s) of interactors
• Biological role(s) of interactors
• Properties (CrossReference) of interactors
• Type(s) of interactors
• HostOrganism(s)
• Expansion method(s)
• Dataset name(s)

Standardization in progress !!
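
Because MITAB is a tab-separated format, the 15 standard columns can be read with a few lines of code. The sketch below assumes the column order listed above, that `|` separates multiple values within a column and that `-` marks an empty column; the key names, the function name and the sample line are made up for illustration.

```python
MITAB_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]

def parse_mitab_line(line):
    """Split one MITAB line into a dict of the 15 standard columns.

    Each column becomes a list of values; '-' gives an empty list.
    Any further (e.g. IntAct specific) columns are kept under 'extra'.
    Note: the naive split on '|' ignores quoting inside values.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(MITAB_COLUMNS):
        raise ValueError("expected at least 15 tab-separated columns")
    record = {}
    for name, field in zip(MITAB_COLUMNS, fields):
        record[name] = [] if field == "-" else field.split("|")
    record["extra"] = fields[len(MITAB_COLUMNS):]
    return record

# Made-up example line, for illustration only.
sample = "\t".join([
    "uniprotkb:P12345", "uniprotkb:Q67890", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2009)", "pubmed:0000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)',
    'psi-mi:"MI:0469"(IntAct)', "intact:EBI-0000000", "-",
])
record = parse_mitab_line(sample)
```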

PSI-MI

[Overview diagram]

• Data format (standard format): PSI-MI XML, PSI-MITAB
• Controlled vocabulary: PSI-MI CV
• Reporting guideline: MIMIx
• Data distribution: PSICQUIC (servers, registry, clients)
• Data submission: PSI-MI XML files, PSI Excel Sheet, PSI Web Form
• Tools: XML Java API, MITAB Java API, XMLMakerFlattener, Semantic Validator, RPsiXML (Bioconductor)
• Website: Search, Interactions, Interaction details, Interactors, Molecular view, Graph view

MIMIx

• Experiments
  • Interaction detection method (e.g. yeast two hybrid)
  • Participant detection method (e.g. mass spectrometry)
  • Host organism
• Interactions
• Interactors
  • Identifiers from public database
  • Species of origin
  • Biological/experimental roles (e.g. enzyme, target / bait, prey)
• Confidence


IMEx: The International Molecular Exchange Consortium

• Group of major public interaction data providers sharing curation effort: DIP, IntAct, MINT, MPact, MatrixDB, MPIDB and BioGRID

• Independent molecular interaction resources

• Common curation standards for detailed curation

• Common data formats (PSI-MI XML, PSICQUIC)

• Common accession number space

• Coordinated & non-redundant curation

• In production mode since February 2010

• Since 3/2009 supported by the European Commission under PSIMEx, contract number FP7-HEALTH-2007-223411, with additional partners Vital-IT, Nature, Wiley, BiaCore (GE), U. Maryland, CSIC, TU Munich, MIPS, SCBIT (Shanghai)

Imex.sf.net


IMEx website Imex.sf.net


Getting access to more data is easy !!

Data distribution: PSICQUIC

• Proteomics Standards Initiative Common QUery InterfaCe.

• Community effort to standardise the way to access and retrieve data from Molecular Interaction databases.

• Widely implemented by independent interaction data resources.

• Based on the PSI standard formats (PSI-MI XML and MITAB)

• Not limited to protein-protein interactions, also e.g.
  • Drug-target interactions
  • Simplified pathway data

• A registry listing resources implementing PSICQUIC

• Documentation: http://psicquic.googlecode.com

PSICQUIC implementation

[Diagram labels: Sample, Observation error, Publications, Annotation error, Interaction databases, PSICQUIC sources, PSICQUIC Registry, PSICQUIC client, User]

PSICQUIC implementation

• PSICQUIC Server (SOAP/REST web service)
• PSICQUIC Registry
• PSICQUIC Clients
  • PSICQUIC view
  • Cytoscape
  • Envision2
  • …

[Diagram: Service Oriented Architecture. A service provider publishes to a service broker, a service consumer finds services through the broker and interacts with the provider under a service contract. Labels on the PSICQUIC side: PSICQUIC sources, PSICQUIC Registry, PSICQUIC clients, PSI-MI format.]

PSICQUIC Registry

• http://www.ebi.ac.uk/Tools/webservices/psicquic/registry/registry?action=STATUS

• 1,693,000 binary interactions!

What can I do with PSICQUIC?

• Users can query the registry to get a list of available services

• Registry supports tagging

• Users can script against these services using pretty much any programming language (SOAP / REST)

• Easy to parse MITAB to extract data of interest

• Data can be loaded in Cytoscape to visualize a network
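
Scripting against a PSICQUIC REST service mostly comes down to building a URL that carries the query. The sketch below only constructs the URL and sends no request. The base URL, the `query/` path segment and the `firstResult` / `maxResults` parameters are assumptions based on the PSICQUIC documentation and should be checked against the registry entry of the service you want to use; the function name is hypothetical.

```python
from urllib.parse import quote

# Assumed base URL of one PSICQUIC service; the registry lists the real ones.
INTACT_BASE = ("http://www.ebi.ac.uk/Tools/webservices/psicquic/"
               "intact/webservices/current/search/")

def psicquic_query_url(base_url, miql, first_result=0, max_results=50):
    """Build a REST URL asking a PSICQUIC service for MITAB rows."""
    # Percent-encode the whole query so ':' and spaces survive in the path.
    encoded = quote(miql, safe="")
    return "%squery/%s?firstResult=%d&maxResults=%d" % (
        base_url, encoded, first_result, max_results)

url = psicquic_query_url(INTACT_BASE, "species:human AND brca2")
print(url)
```

The resulting URL could then be fetched with any HTTP client, for example `urllib.request.urlopen(url)`, and the response read line by line as MITAB.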

PSICQUIC limitations

• Currently users can only download MITAB format

• We are planning to enable PSI-MI XML download too so users can get the original complex

• We are currently working on adding additional data formats:
  • BioPAX (only in IntAct’s PSICQUIC so far)
  • SBML

Systems Biology Markup Language

Overview of SBML

• A machine-readable format for representing computational models in systems biology
• Tool-neutral exchange language for software applications
• Declares model not procedure
• Independent of modelling formalism
• Expressed in XML
• Not really meant for humans to read

SBML structure and syntax

Model components are grouped in <listOfXYZ> containers, for example <listOfSpecies>:

<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  ...
  <model ...>
    <listOfSpecies>
      <species … />
    </listOfSpecies>
    ...
  </model>
</sbml>

Compartment: a container of finite size for well-stirred substances

<listOfCompartments>
  <compartment id="cell" spatialDimensions="3" size="2.3" units="litre" constant="true"/>
</listOfCompartments>

Species: a pool of a chemical substance

<listOfSpecies>
  <species id="s" compartment="cell" initialAmount="4.6" substanceUnits="mole"
           hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false"/>
</listOfSpecies>

Parameter: a quantity of whatever type is appropriate

<listOfParameters>
  <parameter id="p1" value="3000" constant="false"/>
  <parameter id="p2" value="8000" constant="true"/>
</listOfParameters>
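
To show how the compartment, species and parameter fragments fit into one document, the sketch below assembles them into a small SBML Level 3 file and reads it back with Python's standard `xml.etree.ElementTree`. The model id `example` and the function name `read_components` are hypothetical; dedicated libraries such as libSBML are the usual route for real models, and this only illustrates the structure.

```python
import xml.etree.ElementTree as ET

SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Small document assembled from the fragments shown on the slides.
SBML_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="example">
    <listOfCompartments>
      <compartment id="cell" spatialDimensions="3" size="2.3" units="litre" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="s" compartment="cell" initialAmount="4.6" substanceUnits="mole"
               hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfParameters>
      <parameter id="p1" value="3000" constant="false"/>
      <parameter id="p2" value="8000" constant="true"/>
    </listOfParameters>
  </model>
</sbml>"""

def read_components(sbml_text):
    """Return compartment sizes, species (compartment, amount) and parameter values."""
    root = ET.fromstring(sbml_text.encode("utf-8"))
    model = root.find("sbml:model", SBML_NS)
    compartments = {
        c.get("id"): float(c.get("size"))
        for c in model.iterfind("sbml:listOfCompartments/sbml:compartment", SBML_NS)
    }
    species = {
        s.get("id"): (s.get("compartment"), float(s.get("initialAmount")))
        for s in model.iterfind("sbml:listOfSpecies/sbml:species", SBML_NS)
    }
    parameters = {
        p.get("id"): float(p.get("value"))
        for p in model.iterfind("sbml:listOfParameters/sbml:parameter", SBML_NS)
    }
    return compartments, species, parameters

compartments, species, parameters = read_components(SBML_TEXT)
```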

Reaction: a statement describing some transformation, transport or binding process that can change one or more species

• Reactants R
• Products P
• Modifiers M
• ‘Kinetic law’: v = f(R, P, M, parameters)

S0 → S1 (modifier: S2)

rate law: k * S0 * S2

<listOfCompartments>
  <compartment id="comp1" size="1e-16"/>
</listOfCompartments>

<listOfSpecies>
  <species id="S0" compartment="comp1" initialAmount="1.66057788110262e-21"/>
  <species id="S1" compartment="comp1" initialAmount="0"/>
  <species id="S2" compartment="comp1" initialAmount="2e-21"/>
</listOfSpecies>

<listOfReactions>
  <reaction>
    <listOfReactants>
      <speciesReference species="S0"/>
    </listOfReactants>
    <listOfProducts>
      <speciesReference species="S1"/>
    </listOfProducts>
    <listOfModifiers>
      <modifierSpeciesReference species="S2"/>
    </listOfModifiers>
    <kineticLaw>
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <apply>
          <times/>
          <ci> comp1 </ci> <ci> k </ci> <ci> S0 </ci> <ci> S2 </ci>
        </apply>
      </math>
      <listOfLocalParameters>
        <localParameter id="k" value="2"/>
      </listOfLocalParameters>
    </kineticLaw>
  </reaction>
</listOfReactions>

The <ci> elements hold the ‘id’ of other elements: the compartment comp1, the species S0 and S2, and the local parameter k.

S0 → S1 (modifier: S2)

rate law: k * S0 * S2

dS0/dt = - k * S0 * S2 * comp1

dS1/dt = + k * S0 * S2 * comp1
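
The two equations can be integrated numerically to see how the reaction behaves. The sketch below uses forward Euler steps with illustrative values, not the ones from the slides; the function name `simulate` is hypothetical, and S2 is held constant because it only appears as a modifier. With S2 constant, S0 decays exponentially, which gives an exact value to compare against.

```python
import math

def simulate(s0, s1, s2, k, comp, t_end, steps):
    """Integrate dS0/dt = -k*S0*S2*comp and dS1/dt = +k*S0*S2*comp (forward Euler)."""
    dt = t_end / steps
    for _ in range(steps):
        rate = k * s0 * s2 * comp   # the kinetic law evaluated at the current state
        s0 -= rate * dt             # reactant is consumed
        s1 += rate * dt             # product is formed at the same rate
    return s0, s1

# Illustrative values only.
s0_end, s1_end = simulate(s0=1.0, s1=0.0, s2=0.5, k=2.0, comp=1.0,
                          t_end=1.0, steps=10000)

# Closed form for constant S2: S0(t) = S0(0) * exp(-k * S2 * comp * t)
exact = math.exp(-2.0 * 0.5 * 1.0 * 1.0)
```

Whatever S0 loses S1 gains, so the sum S0 + S1 stays constant throughout the run.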

Rule: a mathematical expression that is added to the model equations

• assignmentRule: x = f(y)
• rateRule: dx/dt = f(y)
• algebraicRule: f(x,y) = 0

At a specific point (t > 30): flag = 1

<listOfEvents>
  <event id="Turn_on_current">
    <trigger> … </trigger>
    <listOfEventAssignments>
      <eventAssignment variable="flag">
        <math xmlns="http://www.w3.org/1998/Math/MathML">
          <cn> 1 </cn>
        </math>
      </eventAssignment>
    </listOfEventAssignments>
  </event>
</listOfEvents>
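
An event fires when its trigger condition changes from false to true, and its assignments are then applied. The loop below is a minimal sketch of that behaviour for the trigger t > 30 on a fixed time grid; the function name, the step size and the end times are assumptions chosen for illustration.

```python
def run_with_event(t_end, dt, threshold=30.0):
    """Step time forward and fire the event when 't > threshold' becomes true."""
    flag = 0
    fired_at = None
    was_true = False
    steps = int(round(t_end / dt))
    for i in range(steps + 1):
        t = i * dt
        is_true = t > threshold
        if is_true and not was_true:   # false -> true transition of the trigger
            flag = 1                   # the event assignment
            fired_at = t
        was_true = is_true
    return flag, fired_at

flag, fired_at = run_with_event(t_end=40.0, dt=0.5)
```

On this grid the event is detected at the first time point after 30; a real simulator would locate the crossing more precisely.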

Components introduced in each SBML Level and Version:

• Level 1 Version 1, Level 1 Version 2: Model, Compartment, Reaction, Species, Rule, Unit, Parameter
• Level 2 Version 1: Function, Event
• Level 2 Version 2, Level 2 Version 3, Level 2 Version 4: InitialAssignment, Constraint, CompartmentType, SpeciesType

SBML Level 3

[Diagram: the core surrounded by packages: comp (Submodel 1, Submodel 2), qual, spatial, layout. Packages are labelled as mathematically necessary for correct interpretation, possibly necessary, or additional information.]

SBML Resources

• 181 applications (that we know about)
• Online validator
• Online Test Suite
• MathSBML (Mathematica)
• SBMLToolbox (MATLAB, Octave)
• Converters: BioPAX

Conclusions

• No idea how to integrate discrete models

• No optimal solution how to fit data to the model for discrete modelling

• We are at the beginning of in silico systems biology

• New modelling, data analysis, integration approaches and tools are needed