the vcell database - the cellml project€¦ · the vcell database – sharing, ... document...

30
The The VCell VCell Database Database Sharing, Publishing, Reusing Sharing, Publishing, Reusing VCell VCell Models Models http://vcell.org

Upload: others

Post on 21-Apr-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

The The VCellVCell DatabaseDatabase–– Sharing, Publishing, Reusing Sharing, Publishing, Reusing VCellVCell Models Models ––

http://vcell.org

Design RequirementsDesign Requirements

►► ResourcesResourcesCompilers, libraries, addCompilers, libraries, add--ons, HPC hardware ons, HPC hardware –– NO!NO!

►► PortabilityPortabilityRun on Windows, Mac, UnixRun on Windows, Mac, Unix

►► AvailabilityAvailabilitySome simulations run for weeksSome simulations run for weeks……

►► Sharing and PersistenceSharing and PersistenceCollaboration; immutable and reproducible public modelsCollaboration; immutable and reproducible public models

►► Maintenance and InteroperabilityMaintenance and InteroperabilityContinuous feature updates; backwards compatibility; links to Continuous feature updates; backwards compatibility; links to other data/services; interchange with other toolsother data/services; interchange with other tools

Minimal Usage RequirementsMinimal Usage Requirements

►► RegistrationRegistrationFree; open sourceFree; open source

►► JavaJavaVersion 1.5 or later (except Mac Version 1.5 or later (except Mac –– 1.4 required)1.4 required)

►► Internet connectionInternet connectionRequired for:Required for:►► Database accessDatabase access►► Running simulationsRunning simulations►► Viewing resultsViewing results

But also standalone versionsBut also standalone versions……

►► A large monitorA large monitor…… !!

Compute Cluster

Simulation Worker Service

Physiology Editor Geometry EditorApplication Editor

Physiology

Reactions

Species

Structures

Fluxes

Diagrams

ApplicationReaction

Specification

Species Specificatio

n

Electrical Protocols

Structure Mapping

Model Analysis

Simulation Editor Simulation

Monitor

MathDescription

Geometry

Domains

Parameters

Equations

Simulation

Math Description

Parameter Overrides

Solver Specifications Data ViewerData Exporter

Math Generation Service

Slow Reaction Stoichiometry Analyzer

Fast Reaction Stoichiometry Analyzer

Electrical Circuit Analyzer

Math Description Generator

Geometry

Subvolumes

Regions

Surfaces

Connection Service

Authentication Service

Job Control Service

Data Service

Persistence Service

Remote Message Handler

Document Manager

Simulation Data

JMS Broker

(SonicMQ)

Siumulation Data Service

Data Export Service

Database Service

Simulation Dispatch Service

Database(Oracle)

ConnectionManager

ServerManager

Database ServiceDatabase

Service

Data Export ServiceData Export

Service

Siumulation Data ServiceSiumulation Data

Service

Simulation Dispatch ServiceSimulation

Dispatch Service

Simulation Worker ServiceSimulation

Worker Service

Compiled Simulation JobsCompiled

Simulation JobsCompiled Simulation JobsCompiled

Simulation JobsCompiled Simulation JobsCompiled

Simulation JobsCompiled Simulation Jobs

Batch Scheduler(PBSPro)

StorageCluster

Distributed ArchitectureDistributed Architecture

Mathematical Description(view-only, automaticallygenerated)

Mathematical Description(view-only, automaticallygenerated)

Mathematical Description(view-only, automaticallygenerated)

ResultsResultsResults

Applications

Structure mapping(topology to geometry)Initial ConditionsBoundary conditionsDiffusion constants (if spatial)Electrophysiology protocolsEnable/disable reactionsFast reactionsModel analysisStochastic rate conversion

Applications

Structure mapping(topology to geometry)Initial ConditionsBoundary conditionsDiffusion constants (if spatial)Electrophysiology protocolsEnable/disable reactionsFast reactionsModel analysisStochastic rate conversion

Applications

Structure mapping(topology to geometry)Initial ConditionsBoundary conditionsDiffusion constants (if spatial)Electrophysiology protocolsEnable/disable reactionsFast reactionsModel analysisStochastic rate conversion

Simulations

TimecourseTimestepMesh sizeSolver typeSolver settingsParameter changesParameter scansParameter sensitivity

Simulations

TimecourseTimestepMesh sizeSolver typeSolver settingsParameter changesParameter scansParameter sensitivity

Simulations

TimecourseTimestepMesh sizeSolver typeSolver settingsParameter changesParameter scansParameter sensitivity

Physiology

MoleculesStructures(topology)

ReactionsFluxes

single modellocations/molecules/mechanisms

non-spatial appsODEs, sensitivity analysismultiple simulations

spatial apps1D,2D,3D PDEsreaction/diffusion/advectionmultiple simulations

non-spatial “Math Model”ODEs, sensitivity analysismultiple simulations

spatial “Math Model”1D,2D,3D PDEsreaction/diffusion/advectionmultiple simulations

Math ModelsMath Models

Current Scope and Future PlansCurrent Scope and Future Plans►► Intended UsersIntended Users

BiologistsBiologistsBiophysicists/MathematiciansBiophysicists/Mathematicians

►► Modeling DomainModeling DomainCompartmental (0D) or Spatial (1D, 2D, 3D)Compartmental (0D) or Spatial (1D, 2D, 3D)Reaction/Diffusion/Membrane TransportReaction/Diffusion/Membrane TransportElectric Potential and CurrentsElectric Potential and CurrentsAdvection & Directed TransportAdvection & Directed TransportMembrane DiffusionMembrane Diffusion

►► Algorithms and SolversAlgorithms and SolversDeterministic Deterministic –– ODE and PDEODE and PDEStochastic and HybridStochastic and HybridParameter ScansParameter ScansParameter EstimationParameter Estimation

►► Under developmentUnder developmentComplexes and RulesComplexes and RulesStandStand--alone, Gridalone, Grid--Enabled & Customized VersionsEnabled & Customized VersionsProtocols & Virtual ExperimentsProtocols & Virtual ExperimentsPlugPlug--ins, Modules, Web Servicesins, Modules, Web ServicesConstraint HandlingConstraint HandlingMechanical ForcesMechanical ForcesCell motilityCell motility

Pollard & Borisy, 2003

13. Thymosin-ß4buffers G-actin

14. Actin filaments anneal and fragment

3 kinds of actin:ATP, ADPPi, ADP

2 kinds of end:barbed, pointed

At the Membrane In the Cytosol

Reaction Network in Virtual Cell

Aging: ATP hydrolysisfollowed by

Pi dissociation

Barbed endcapping

Profilin binding andADP/ATP Exchange

Thymosin-ß4buffering

Pointed end turnover

Barbed end turnoverwith and without profilin

Cofilin cooperativebinding, severing andenhanced Pi release

Annealingand

fragmentation

Aging and dissociation of

branches

Arp2/3 activationand side

branching

VCell Top-Level Documents

• Database object containers– BioModel– MathModel– Geometry

• Referential objects– ResultSet– FieldData

BioModel Object Hierarchy

SimulationSpec Element – a.k.a. “Application” wizard, a.k.a. SimulationContext object –

Reactions Database Retrieval

Reactions Database RetrievalReactions Database Retrieval

Controlled Vocabulary

KEGG LIGAND database– COMPOUND: small molecules

– ENZYME: enzyme classifications

– REACTION: enzymatic reactions

– GLYCAN: glycolipids, glycoproteins

SwissProt database– proteins

Species binding

Import Enzymatic Reactions

ControlledVocabulary

Phys Model

VCellVCell Database (~60 tables)Database (~60 tables)BioModelinstance

[cached XML]SwissProt Molecular

entities

Reactions

Experimental Context

Geometry/Images

GeneratedMathematics

Protocols,conditions

ExperimentalData

CellularStructures

Simulations

Model & solverparameters

Job Status

Kegg Compound

Reference DataKegg

EnzymaticReactions

Database design featuresDatabase design features►► StorageStorage of intermediate models.of intermediate models.

Manage multiple editions of same model.Manage multiple editions of same model.Client allows compare/mergeClient allows compare/merge

►► Records are immutableRecords are immutableEntire models cached in XML (fast loads)Entire models cached in XML (fast loads)Incremental saves (fast saves).Incremental saves (fast saves).

►► Access ControlAccess ControlUsers have private storageUsers have private storagesupports collaboration (with list of users)supports collaboration (with list of users)

►► Scalable: Scalable: multiple stateless serversmultiple stateless servers

►► Reliable: Reliable: Distributed transactions, persistent messaging middlewareDistributed transactions, persistent messaging middleware

►► Searchable (as permitted by access control).Searchable (as permitted by access control).Search for reactions from users models or Search for reactions from users models or KeggKeggBind molecular species to controlled vocabulary (Bind molecular species to controlled vocabulary (KeggKegg, , SwissProtSwissProt.).)

VCell Database Contents• Top-level object containers

– BioModels, MathModels, Geometries– All their elements (fine granularity)– Link tables and references

• References to external objects– ResultSets, FieldData (file-based storage)

• Ancillary data– User data (registration info, preferences)– Access control lists

• Controlled vocabulary data– Kegg, Swissprot bindings

VCell Usage

Feb-09 May-08 diff Increase

Users Who Ran Simulations 2224 1885 339 18%

Currently Stored Models 29117 25204 3913 16%

Currently Stored Simulations 160539 118403 42136 36%

Publicly Available Models 687 626 61 10%

Publicly Available Simulations 2377 1945 432 22%

Standards and ResourcesStandards and Resources

►► Languages and Languages and OntologiesOntologiesSBML, SBML, CellMLCellML►► VCellVCell imports/exports SBML, imports/exports SBML, CellMLCellML…… VCMLVCML

BNGLBNGLBioPAXBioPAXSBOSBOKiSAOKiSAO……SBGNSBGNMIRIAMMIRIAMMIASE MIASE –– SEDMLSEDML

►► RepositoriesRepositoriesBioModelsBioModels,, DoQCSDoQCS, JWS Online, JWS OnlineCellMLCellML model repositorymodel repositoryReactomeReactome,, PID, PID, BioCycBioCycPSLIDPSLID

Model or Pathway Representations?Model or Pathway Representations?

►► PathwaysPathwaysQualitativeQualitativeStaticStaticNo kineticsNo kineticsOften lack spatial Often lack spatial contextcontextMinimal mergingMinimal mergingMinimal experimental Minimal experimental contextcontext

►► ModelsModelsQuantitativeQuantitativeDynamicDynamicKineticsKineticsUsually encode some Usually encode some spatial contextspatial contextFrequent merging and Frequent merging and other approximationsother approximationsOften encode/depend Often encode/depend on experimental contexton experimental context

VCML and Model ExchangeVCML and Model ExchangeWhere are the problems?Where are the problems?

►► VCML is not the DOM for VCML is not the DOM for VCellVCell►► VCellVCell separates models from equations (imposes math restrictions)separates models from equations (imposes math restrictions)

The math is always derived!The math is always derived!No support for Rate Rules, Algebraic RulesNo support for Rate Rules, Algebraic RulesUnitsUnits……

►► VCellVCell automates compartmental to spatial porting of models (imposes automates compartmental to spatial porting of models (imposes physical physical ““realizabilityrealizability”” restrictions)restrictions)

Reactions must have locationReactions must have locationMolecules canMolecules can’’t cross double boundariest cross double boundaries

►► VCellVCell supports spatial informationsupports spatial information►► VCellVCell has hierarchical structurehas hierarchical structure

Multiple Multiple ““ApplicationsApplications””Multiple Multiple ““SimulationsSimulations””, includes simulator specification, includes simulator specification

►► VCellVCell includes database and ontology informationincludes database and ontology informationExternal bindings, native External bindings, native ““rolesroles””

EBI collaboration1. What?

– Connect VCell to BMDB– Simulate BMDB models with

VCell and search/import from BMDB

2. Why?– Provide a source of public, peer-

reviewed, curated quantitative models to VCell users

– Provide support for modules and model aggregation

3. How?– Publish VCML specification– Create standalone batch SBML

<–> VCML converter – Create automated links on BMDB

web interface– Customize granular

query/retrieval API

CMU collaboration1. What?

– Connect VCell to PSLID and SLIF– Use generative models for

“virtual” geometries and “virtual”molecular distributions

2. Why?– Provide a large source of public

image-based geometries and quantitative data to VCell users

– Provide realistic “artificial” data to complement/supplant real data

3. How?– Search/import interface for CMU

databases– XML repository of generative

models– Server-side Matlab libraries– Use field data for conversion

PSLID Modules for VCell

1. Asynchronous Communication Layer– XML parser for PSLID script queries and

downloads2. Generalized Data Structure

– Combined geometry regions and protein distributions

3. Custom User Interface– Experimental images and generated images

Field Data Structure