![Page 1: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/1.jpg)
BioPAX - Biological Pathway Data Exchange Format
Tutorial
Gary Bader
CCBR, University of Toronto
BioPAX Workgroup
www.biopax.org
NETTAB, June 12, 2007, Pisa
http://baderlab.org
![Page 2: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/2.jpg)
BioPAX Supporting Groups
Current Participants
• Memorial Sloan-Kettering Cancer Center: E. Demir, M. Cary, C. Sander
• University of Toronto: G. Bader
• SRI Bioinformatics Research Group: P. Karp, S. Paley, J. Pick
• Bilkent University: U. Dogrusoz
• Université Libre de Bruxelles: C. Lemer
• CBRC Japan: K. Fukuda
• Dana Farber Cancer Institute: J. Zucker
• Millennium: J. Rees, A. Ruttenberg
• Cold Spring Harbor/EBI: G. Wu, M. Gillespie, P. D'Eustachio, I. Vastrik, L. Stein
• BioPathways Consortium: J. Luciano, E. Neumann, A. Regev, V. Schachter
• Argonne National Laboratory: N. Maltsev, E. Marland, M. Syed
• Harvard: F. Gibbons
• AstraZeneca: E. Pichler
• BIOBASE: E. Wingender, F. Schacherer
• NCI: M. Aladjem, C. Schaefer
• Università di Milano Bicocca, Pasteur, Rennes: A. Splendiani
• Vassar College: K. Dahlquist
• Columbia: A. Rzhetsky
Collaborating Organizations
• Proteomics Standards Initiative (PSI)
• Systems Biology Markup Language (SBML)
• CellML
• Chemical Markup Language (CML)
Databases
• BioCyc, WIT, KEGG, BIND, PharmGKB, aMAZE, INOH, Transpath, Reactome, PATIKA, eMIM, NCI PID, CellMap
Wouldn’t be possible without
• Gene Ontology
• Protégé, U. Manchester, Stanford
Grants/Support
• Department of Energy (Workshop)
• caBIG
![Page 3: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/3.jpg)
http://creativecommons.org/licenses/by-sa/3.0/
Will be made available from biopax.org wiki
![Page 4: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/4.jpg)
The Cell
How does it work?
How does it fail in disease?
![Page 5: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/5.jpg)
Pathways
• Pathways are biological processes
• But not really pathways: networks
• Metabolic, signaling, regulatory and genetic
• Define gene function at many different levels
• Processes that biologists have found useful to group together for organizational, historical, biophysical or other reasons
Note: generally out of cell context
![Page 6: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/6.jpg)
![Page 7: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/7.jpg)
![Page 8: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/8.jpg)
![Page 9: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/9.jpg)
![Page 10: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/10.jpg)
Pathway information for systems biology, Cary MP, Bader GD, Sander C, FEBS Lett. 2005 Mar 21;579(8):1815-20
![Page 11: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/11.jpg)
Pathway Information
• Databases
  – Fully electronic
  – Easily computer readable
• Literature
  – Increasingly electronic
  – Human readable
• Biologists’ brains
  – Richest data source
  – Limited bandwidth access
• Experiments
  – Basis for models
![Page 12: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/12.jpg)
Pathway Databases
• Arguably the most accessible data source, but...
• Varied formats, representation, coverage
• Pathway data extremely difficult to combine and use
Pathguide Pathway Resource List (http://www.pathguide.org)
220 pathway databases!
![Page 13: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/13.jpg)
http://pathguide.org
Vuk Pavlovic
![Page 14: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/14.jpg)
Gathering Pathway Information is Hard
>100 DBs and tools
Tower of Babel
Database
Software
User
![Page 15: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/15.jpg)
Biological Pathway Exchange (BioPAX)
Before BioPAX: >100 DBs and tools, Tower of Babel
After BioPAX: unifying language
Reduces work, promotes collaboration, increases accessibility
Database
Software
User
![Page 16: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/16.jpg)
BioPAX Pathway Language
• Represent:
  – Metabolic pathways
  – Signaling pathways
  – Protein-protein, molecular interactions
  – Gene regulatory pathways
  – Genetic interactions
• Community effort: pathway databases distribute pathway information in standard format
![Page 17: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/17.jpg)
Ontologies: Components
• Classes, relations & attributes, constraints, objects, values
• Classes (AKA “Concepts”, “Types”)
  – Arranged into a specialization hierarchy (AKA “Taxonomy”)
    • Parent-child relationships between classes
    • Class A is a parent of class B iff all instances of B are also instances of A
  – E.g. “Protein”, “RNA”, “Reaction”
• Relations & Properties (AKA “Slots”, “Attributes”, “Fields”)
  – Classes have properties, which may have values of specific types
  – Relationships: the value type is some other class in the ontology
    • E.g. “Substrate”, “Transporter”, “Participant”
  – Attributes: the value type is a simple data type
    • E.g. “Molecular Wt.”, “Sequence”, “∆G”
From Peter Karp, “Ontologies: Definitions, Components, Subtypes”, SRI International, presentation available at http://www.biopax.org
![Page 18: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/18.jpg)
Ontologies: Components (cont)
• Constraints
  – Define allowable values and connections within an ontology
  – E.g. “MOLECULAR_WT must be a positive real number”
• Objects and Values
  – Objects are instances of classes
  – Values occupy the slots of those instances
  – Strictly speaking, an ontology with instances is a knowledge base
  – Creating instances is beyond the scope of the BioPAX workgroup; our users will create the instances of classes in the BioPAX ontology
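The ontology components described on these two slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and slot names (`PhysicalEntity`, `Protein`, `Reaction`, `molecular_wt`, `sequence`, `substrate`) are hypothetical stand-ins, and the BioPAX ontology itself is defined in OWL, not in Python.

```python
from dataclasses import dataclass


@dataclass
class PhysicalEntity:
    """A class (concept) in a toy ontology; all names here are illustrative."""
    name: str
    molecular_wt: float  # attribute: the value type is a simple data type

    def __post_init__(self) -> None:
        # Constraint: MOLECULAR_WT must be a positive real number.
        if self.molecular_wt <= 0:
            raise ValueError("molecular_wt must be a positive real number")


@dataclass
class Protein(PhysicalEntity):
    """Child class: every Protein instance is also a PhysicalEntity instance."""
    sequence: str = ""  # another attribute with a simple data type


@dataclass
class Reaction:
    """A class whose 'substrate' slot is a relationship to another class."""
    name: str
    substrate: PhysicalEntity  # relationship: the value type is another class


# Objects are instances of classes; values occupy the slots of those instances.
# The values below are placeholders, not real data.
enzyme = Protein(name="example protein", molecular_wt=1000.0, sequence="MKV")
reaction = Reaction(name="example reaction", substrate=enzyme)
```

Constructing `PhysicalEntity(name="bad", molecular_wt=-1.0)` raises `ValueError`, which is how the constraint shows up in this sketch.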
![Page 19: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/19.jpg)
BioPAX Structure
• Pathway
  – A set of interactions
  – E.g. Glycolysis, MAPK, Apoptosis
• Interaction
  – A basic relationship between a set of entities
  – E.g. Reaction, Molecular Association, Catalysis
• Physical Entity
  – A building block of simple interactions
  – E.g. Small molecule, Protein, DNA, RNA
Class diagram: Entity, with Pathway, Interaction and Physical Entity
Legend: Subclass (is a), Contains (has a)
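As a rough sketch of the structure on this slide, the three top-level classes can be modeled as subclasses of a common `Entity` root, with a pathway containing interactions and an interaction containing its participants. The field names (`participants`, `interactions`) and the helper `physical_entities` are assumptions made for illustration, not BioPAX property names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """Root class; the three classes below are its subclasses (is a)."""
    name: str


@dataclass
class PhysicalEntity(Entity):
    """A building block of simple interactions (e.g. a protein)."""


@dataclass
class Interaction(Entity):
    """A basic relationship between a set of entities (has a)."""
    participants: List[Entity] = field(default_factory=list)


@dataclass
class Pathway(Entity):
    """A set of interactions (has a)."""
    interactions: List[Interaction] = field(default_factory=list)


def physical_entities(pathway: Pathway) -> List[str]:
    """Collect the names of physical entities in a pathway, without duplicates."""
    names: List[str] = []
    for interaction in pathway.interactions:
        for participant in interaction.participants:
            if isinstance(participant, PhysicalEntity) and participant.name not in names:
                names.append(participant.name)
    return names


# Toy instances; the molecules named here are placeholders.
a = PhysicalEntity("molecule A")
b = PhysicalEntity("molecule B")
step = Interaction("A binds B", participants=[a, b])
toy_pathway = Pathway("toy pathway", interactions=[step])
print(physical_entities(toy_pathway))
```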
![Page 20: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/20.jpg)
BioPAX: Interactions
Interaction
Physical Interaction
Control: Catalysis, Modulation
Conversion: BiochemicalReaction, ComplexAssembly, Transport, TransportWithBiochemicalReaction
![Page 21: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/21.jpg)
BioPAX: Physical Entities
PhysicalEntity: Complex, RNA, Protein, Small Molecule, DNA
![Page 22: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/22.jpg)
BioPAX Ontology
![Page 23: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/23.jpg)
XML Snippet
![Page 24: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/24.jpg)
Biochemical Reaction
Glycolysis Pathway
Source: BioCyc.org
Phosphofructokinase
![Page 25: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/25.jpg)
Right
Left
EC # 2.7.1.11
![Page 26: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/26.jpg)
Controller
Controlled
Phosphofructokinase
Direction: reversible
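The annotated phosphofructokinase step (left and right participants, EC number, controller, controlled, direction) can be written down as a small BioPAX-style RDF/XML document and read back with Python's standard library. The snippet below is hand-written for illustration and is not taken from the slides or from BioCyc. The element and property names follow my reading of BioPAX Level 2 and should be checked against the specification. It is also simplified: as far as I know, real Level 2 files link participants through intermediate participant objects and carry datatype annotations, both omitted here.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BP_NS = "http://www.biopax.org/release/biopax-level2.owl#"

# Hand-written, simplified snippet (an assumption, not an exported file).
SNIPPET = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bp="http://www.biopax.org/release/biopax-level2.owl#">
  <bp:smallMolecule rdf:ID="f6p"><bp:NAME>fructose 6-phosphate</bp:NAME></bp:smallMolecule>
  <bp:smallMolecule rdf:ID="atp"><bp:NAME>ATP</bp:NAME></bp:smallMolecule>
  <bp:smallMolecule rdf:ID="fbp"><bp:NAME>fructose 1,6-bisphosphate</bp:NAME></bp:smallMolecule>
  <bp:smallMolecule rdf:ID="adp"><bp:NAME>ADP</bp:NAME></bp:smallMolecule>
  <bp:protein rdf:ID="pfk"><bp:NAME>Phosphofructokinase</bp:NAME></bp:protein>
  <bp:biochemicalReaction rdf:ID="rxn">
    <bp:EC-NUMBER>2.7.1.11</bp:EC-NUMBER>
    <bp:LEFT rdf:resource="#f6p"/>
    <bp:LEFT rdf:resource="#atp"/>
    <bp:RIGHT rdf:resource="#fbp"/>
    <bp:RIGHT rdf:resource="#adp"/>
  </bp:biochemicalReaction>
  <bp:catalysis rdf:ID="cat">
    <bp:CONTROLLER rdf:resource="#pfk"/>
    <bp:CONTROLLED rdf:resource="#rxn"/>
    <bp:DIRECTION>REVERSIBLE</bp:DIRECTION>
  </bp:catalysis>
</rdf:RDF>
"""


def q(namespace: str, local: str) -> str:
    """Build the '{namespace}local' form that ElementTree uses for qualified names."""
    return "{%s}%s" % (namespace, local)


root = ET.fromstring(SNIPPET)
# Index every top-level element by its rdf:ID so references can be resolved.
by_id = {element.get(q(RDF_NS, "ID")): element for element in root}


def referenced_names(element, prop):
    """Return the NAME of each element referenced by a `prop` child of `element`."""
    names = []
    for child in element.findall(q(BP_NS, prop)):
        target = by_id[child.get(q(RDF_NS, "resource")).lstrip("#")]
        names.append(target.findtext(q(BP_NS, "NAME")))
    return names


catalysis = root.find(q(BP_NS, "catalysis"))
controlled_ref = catalysis.find(q(BP_NS, "CONTROLLED")).get(q(RDF_NS, "resource"))
reaction = by_id[controlled_ref.lstrip("#")]

summary = {
    "controller": referenced_names(catalysis, "CONTROLLER"),
    "direction": catalysis.findtext(q(BP_NS, "DIRECTION")),
    "ec_number": reaction.findtext(q(BP_NS, "EC-NUMBER")),
    "left": referenced_names(reaction, "LEFT"),
    "right": referenced_names(reaction, "RIGHT"),
}
print(summary)
```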
![Page 27: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/27.jpg)
Catalysis
BiochemicalReaction
Transport
Complex
Protein
DNA
![Page 28: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/28.jpg)
Controlled Vocabularies (CVs)
• BioPAX uses existing CVs where available via openControlledVocabulary instances
  – Cellular location: Gene Ontology (GO) component
  – PSI-MI CVs for:
    • Protein post-translational modifications
    • Interaction detection experimental methods
    • Experimental form
  – PATO phenotypic quality ontology
  – Some database providers use their own CVs
    • E.g. BioCyc evidence codes
• More at the Ontology Lookup Service
  – http://www.ebi.ac.uk/ontology-lookup/
![Page 29: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/29.jpg)
Worked examples
• Metabolic pathway
  – EcoCyc Glycolysis (energy metabolism pathway)
• Protein-protein interaction
  – Proteomics, PSI-MI
• Signaling pathway step
  – Reactome CHK2-ATM
• Switch to Protégé
• Available from biopax.org
  – http://www.biopax.org/Downloads/Level2v1.0/biopax-level2.zip
![Page 30: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/30.jpg)
Exchange Formats in the Pathway Data Space
[Figure: coverage of pathway data types by exchange format]
Formats: BioPAX, PSI-MI, SBML, CellML
Format categories: Database Exchange Formats, Simulation Model Exchange Formats
Data types:
• Genetic Interactions
• Molecular Interactions (Pro:Pro, All:All)
• Interaction Networks (Molecular: Pro:Pro, TF:Gene; Non-molecular: Genetic)
• Regulatory Pathways (Low Detail, High Detail)
• Metabolic Pathways (Low Detail, High Detail)
• Biochemical Reactions
• Small Molecules (Low Detail, High Detail)
• Rate Formulas
![Page 31: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/31.jpg)
Using BioPAX
• Databases
  – BioCyc (EcoCyc, MetaCyc, many pathway genome databases)
  – KEGG (available soon – KEGG, aMAZE, Sander)
  – MSKCC Cancer Pathway Resource
  – Reactome
  – PSI-MI (via converter)
  – Switch to Pathguide
• Tools
  – cPath, Cytoscape, GenMAPP, PATIKA, QPACA, VisANT
• caBIG
![Page 32: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/32.jpg)
The Cancer Cell Map
cancer.cellmap.org
![Page 33: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/33.jpg)
http://visant.bu.edu/
![Page 34: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/34.jpg)
Ethan Cerami, MSKCC
![Page 35: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/35.jpg)
Switch to Cytoscape
• Load BioPAX pathway from Reactome (reactome.org)
  – http://reactome.org/cgi-bin/biopaxexporter?DB=gk_current&ID=195721
• Load, view + lay out
• Extract UniProt IDs from Cytoscape attributes
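Outside Cytoscape, a similar list of UniProt IDs can be pulled straight from a BioPAX file. The sketch below swaps in direct XML parsing in place of the Cytoscape attribute browser used in the demo. It assumes Level 2 style `unificationXref` elements with `DB` and `ID` children, and that the database label is spelled "UniProt"; both should be verified against the actual export. The sample document is hand-written for illustration and is not the Reactome export.

```python
import xml.etree.ElementTree as ET
from typing import List

BP_NS = "http://www.biopax.org/release/biopax-level2.owl#"

# Hand-written sample (an assumption); a real export would be read from a file.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bp="http://www.biopax.org/release/biopax-level2.owl#">
  <bp:unificationXref rdf:ID="xref1">
    <bp:DB>UniProt</bp:DB><bp:ID>Q13315</bp:ID>
  </bp:unificationXref>
  <bp:unificationXref rdf:ID="xref2">
    <bp:DB>UniProt</bp:DB><bp:ID>O96017</bp:ID>
  </bp:unificationXref>
  <bp:unificationXref rdf:ID="xref3">
    <bp:DB>GO</bp:DB><bp:ID>GO:0005634</bp:ID>
  </bp:unificationXref>
</rdf:RDF>
"""


def uniprot_ids(biopax_xml: str) -> List[str]:
    """Return the sorted, de-duplicated IDs of xrefs whose DB is UniProt."""
    root = ET.fromstring(biopax_xml)
    found = set()
    for xref in root.iter("{%s}unificationXref" % BP_NS):
        db = (xref.findtext("{%s}DB" % BP_NS) or "").strip()
        accession = (xref.findtext("{%s}ID" % BP_NS) or "").strip()
        # Compare case-insensitively, since providers may spell the name differently.
        if db.lower() == "uniprot" and accession:
            found.add(accession)
    return sorted(found)


print(uniprot_ids(SAMPLE))
```

For a downloaded file, the same function can be fed the file's text, for example by opening it and passing the result of `read()`.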
![Page 36: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/36.jpg)
Systems Biology Graphical Notation
http://sbgn.org
In progress
![Page 37: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/37.jpg)
Software Development
• PaxTools
  – Open source Java
  – Read/write BioPAX files (Level 1,2)
  – Object model in memory that can be populated and queried
  – Validation on create, read (under development by MSKCC, OHSU)
  – http://biopax.cvs.sourceforge.net/biopax/Paxerve/
![Page 38: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/38.jpg)
BioPAX Level 3 (in progress)
• States and generics
  – E.g. phosphorylated P53, alcohols
• Gene regulation
  – E.g. Transcription regulation by transcription factors, translation regulation by miRNAs
• Genetic interactions
  – E.g. synthetic lethality, epistasis
• Better controlled vocabulary integration
  – More accessible to reasoners
• Switch to Protégé
![Page 39: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/39.jpg)
How to participate and contribute
• Visit biopax.org and join the discussion mailing list
  – [email protected]
• Make pathway data available in BioPAX
• Build software that supports BioPAX
• Contribute BioPAX worked examples, documentation and specification reviews
• Spread the word about BioPAX
![Page 40: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/40.jpg)
![Page 41: BioPAX - Biological Pathway Data Exchange Format Tutorial](https://reader036.vdocuments.net/reader036/viewer/2022071602/613d749e736caf36b75d8992/html5/thumbnails/41.jpg)
Aim: Convenient Access to Pathway Information
Facilitate creation and communication of pathway data
Aggregate pathway data in the public domain
Provide easy access for pathway analysis
http://www.pathwaycommons.org
Long term: Converge to integrated cell map