Network Biology Data
Biological, Conceptual and Computational Issues around Network, System, and Pathway Data
The Abstract and The Concrete
Topic Outline
• Lessons from the Genome Program and abstract ideas to transform data to information when looking at systems data.
• Two examples of concrete tools (ready for use): WebGestalt (for large sets of genes) and Ingenuity (for networks).
• A concrete thing: Bioinformatics Resource Center (under development).
• Other tools under development.
Human Genome Project (HGP): Past Lessons and Future Directions in Data…
Phenotype and System Data
Individualized Genotype data within populations
Genome Data
Genome-encoded “parts list” as data integrator: common data elements of genes and gene products (transcripts and proteins), enabling integration and comparison of data in NEW ways…
GeneKeyDB and related work as an integrative foundation that can help merge with other data.
HGP highlighted some ways to succeed or fail with large data sets.
Are the lessons learned applicable to systems biology data sets (expression, proteomics, genetics)? Yes.
But are some new approaches needed to understand SYSTEM data? Yes.
Biggest lesson: a biodata item has 2 questions attached to it (Mayr). HGP showed the importance of the "why" questions in thinking about and organizing data.
Other genotype, phenotype, system data
Genome Data
How? Why?
A datum…
Genotype + Environment + DEVELOPMENT ==> Phenotype
1) Astounding results: importance of network thinking in development and physiology for data to explain phenotype (e.g. PAX6).
2) Some relevance from HGP data approaches, but new bioinformatics tools are needed for network data and thinking…
HGP results and Future Issues for new data….
Δ data in Regulatory networks
Δ data in Cellular signaling networks
Δ data in protein coding
A way of thinking about data…
Bioinformatics: Finding the (genotypic, environmental data) difference that makes the (phenotypic data) difference.
(Many differences that make an interesting difference, NOT at protein coding, but at complex networks)
e.g. Alon U. 2003. Science 301: 1866; Barabasi AL. 2003. Linked. Plume Books; Barabasi AL, Oltvai ZN. 2004. Nat. Rev. Genetics 5: 101
What is a “Network” way of viewing data…
A Biological network can be expressed and manipulated in terms of “graph theory.”
Combinatorial algorithms are needed to analyze graphs.
Nodes or Vertices may be
• Genes
• Gene products
• Hormones, signals
• Metabolites
• Publications
• Functional sequence elements
Edges or Lines may be
• Undirected vs. directed
• Weighted vs. unweighted
[Figure: example graph with edge weights 1.2, 1.7, 0.9 and "+" edge signs]
Could be…
• Co-expression networks
• Gene regulatory networks
• Cell-cell communication and signal transduction networks
• Phylogenetic relationships among genes, species, networks: orthology, paralogy, etc. (trees, clades, etc.)
• Gene Ontology or other Directed Acyclic Graphs
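The node and edge vocabulary above can be sketched as a small data structure. This is only an illustration, not code from the talk: the class name `Graph`, the gene names, and the transcription factor `TF1` are hypothetical, and the weights simply reuse the 1.2, 1.7, 0.9 values shown on the slide figure.

```python
class Graph:
    """Minimal adjacency-map graph: node -> {neighbor: weight}."""

    def __init__(self, directed=False):
        self.directed = directed
        self.adj = {}

    def add_edge(self, u, v, weight=1.0):
        # Register both endpoints as nodes, then store the edge u -> v.
        self.adj.setdefault(u, {})[v] = weight
        self.adj.setdefault(v, {})
        # An undirected edge is stored in both directions.
        if not self.directed:
            self.adj[v][u] = weight

    def neighbors(self, u):
        return sorted(self.adj.get(u, {}))


# Undirected, weighted: e.g. a co-expression network (hypothetical genes).
coexpr = Graph(directed=False)
coexpr.add_edge("geneA", "geneB", weight=1.2)
coexpr.add_edge("geneB", "geneC", weight=1.7)
coexpr.add_edge("geneA", "geneC", weight=0.9)

# Directed, unweighted: e.g. a regulatory network (hypothetical regulator TF1).
regulatory = Graph(directed=True)
regulatory.add_edge("TF1", "geneA")
regulatory.add_edge("TF1", "geneB")

print(coexpr.neighbors("geneB"))
print(regulatory.neighbors("TF1"), regulatory.neighbors("geneA"))
```

The same structure covers the other cases in the list: a Gene Ontology style DAG is a directed graph whose edges point from a term to its parent.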
Edges or Lines may also be
• Experimental correlation (can be undirected) vs. mechanistic & directed
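As a sketch of an "experimental correlation" edge: one common way to draw undirected, weighted edges is to connect two genes when their expression profiles correlate strongly. The profiles, the gene names, and the 0.9 cutoff below are made up for illustration; the talk does not specify a method or threshold.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(profiles, threshold=0.9):
    """Return {(gene1, gene2): r} for every pair with |r| >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r  # undirected edge, weight = correlation
    return edges


# Hypothetical expression values across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}
edges = coexpression_edges(profiles, threshold=0.9)
for pair, r in sorted(edges.items()):
    print(pair, round(r, 3))
```

Such an edge says only that two genes vary together; a mechanistic, directed edge (A regulates B) needs additional evidence.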
Tightly connected modules might be found…
Might be loosely analogous to a protein sequence module that is conserved, duplicated, and diverged.
Might see similarity across different tissues, species, etc.
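One possible reading of "tightly connected module" is a maximal clique: a set of nodes that are all pairwise connected and cannot be extended. The sketch below uses the basic Bron-Kerbosch enumeration on a hypothetical five-gene graph; it is an assumption for illustration, not necessarily the module definition or algorithm used in the work described here.

```python
def maximal_cliques(adj):
    """Yield maximal cliques of an undirected graph given as {node: set(neighbors)}.

    Basic Bron-Kerbosch recursion (no pivoting); fine for small examples.
    """
    def expand(r, p, x):
        # r: current clique, p: candidates to add, x: already-processed nodes.
        if not p and not x:
            yield sorted(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Hypothetical network: g1, g2, g3 form a triangle; g4 and g5 hang off it.
adj = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g4"},
    "g4": {"g3", "g5"},
    "g5": {"g4"},
}
modules = sorted(c for c in maximal_cliques(adj) if len(c) >= 3)
print(modules)
```

Comparing the modules found in different tissues or species is then a comparison of node sets.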
Data Storage & Collaborative Bioinformatics
Integrative Bioinformatics
Genotype & Phenotype Data Sets
Comparative Bioinformatics & Data Mining
Data Visualization & Stats
GeneKeyDB
Large Molecular data sets
Genetic Data
Existing Knowledge
Phenotype Data
Microarray data, proteome, etc.
MuTrack; WebQTL (Williams et al., UTHSC)
Gene-centered data integration (via GeneKeyDB, BioFoundation)
Comparative, Boolean, other operations on Gene Sets & Networks: WebGestalt and Ingenuity are two examples
Comparative Cladistic Phylogenetic Analysis
Network Analysis: CS, Stats, Bio
Graph Algorithms
Sequence and Network Modularity
Network modules: Duplicated, Diverged, Converged
Need to collaborate, integrate, and COMPARE to find differences in biological NETWORKS.
Collaborative, Integrative, and Comparative Bioinformatics
WebGestalt: Web-based Gene Set Analysis Toolkit
http://bioinfo.vanderbilt.edu/webgestalt
Bing Zhang
Can upload gene sets based on
1) IDs (e.g. Affy, LocusLink, protein IDs from chip, proteome, etc.)
2) Genome location
Or…
3) Gene Ontology (common biological process, molecular function, cellular location)
Manipulate data as a set of genes or gene products
RNA expression, proteomics, genomics, statistical genetics, etc. all produce lists of genes that may function in a network.
1 of 3 things to do: Boolean operations on multiple sets or retrieving orthologs.
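The Boolean operations on gene sets amount to ordinary set algebra. The sketch below is not WebGestalt's interface: the two gene lists and the small human-to-mouse ortholog table are hand-made for illustration, and a real lookup would come from an ortholog database.

```python
# Two hypothetical gene lists, e.g. from an expression study and a proteomics study.
set_a = {"PAX6", "TP53", "MYC", "BRCA1"}
set_b = {"PAX6", "MYC", "EGFR"}

shared = set_a & set_b        # AND: genes in both lists
either = set_a | set_b        # OR: genes in at least one list
only_a = set_a - set_b        # NOT: genes in A but not in B

# Illustrative human -> mouse ortholog table (assumption, not a real resource).
HUMAN_TO_MOUSE = {"PAX6": "Pax6", "TP53": "Trp53", "MYC": "Myc"}


def orthologs(genes, table):
    """Map each gene that has an entry in the table to its ortholog."""
    return {table[g] for g in genes if g in table}


print(sorted(shared), sorted(only_a))
print(sorted(orthologs(shared, HUMAN_TO_MOUSE)))
```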
2 of 3 things to do: Retrieve data and other IDs.
3 of 3 things to do: “Unusual” properties across the set
e.g. Which GO categories (biological processes, molecular functions, and cellular locations) are in the set? Are there any that seem to occur more often than expected…
Co-occurrence of genes and publications (GRIF)
Protein Domains in set
Chromosome locations in set…
Pathways in set (1)
Pathways in set (2)
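The "more often than expected" question for a GO category, protein domain, chromosome region, or pathway can be framed as an over-representation test. A minimal sketch using the hypergeometric tail probability follows; this is one common choice and the slides do not state which statistic is used. The function name and the counts are hypothetical.

```python
from math import comb


def enrichment_p(population, annotated, drawn, hits):
    """P(X >= hits) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` carry the category of interest."""
    upper = min(annotated, drawn)
    favorable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favorable / comb(population, drawn)


# Hypothetical numbers: 20 genes on the platform, 5 annotated to a GO term,
# a gene set of 4, of which 3 carry the term.
p = enrichment_p(population=20, annotated=5, drawn=4, hits=3)
print(round(p, 4))
```

When many categories are tested at once, the resulting p-values need a multiple-testing correction before any category is called unusual.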
Ingenuity
A commercial tool for manipulating graphs (networks).
VU License: http://bioinfo.vanderbilt.edu/wiki/Ingenuity
(Also some open-source tools: Cytoscape, GeNetViz, etc.)
Use of the commercial tool Ingenuity by Dr. N. Deanne and Dr. Beauchamp
Pathways (3)
Bioinformatics Resource Center
Developing a Bioinformatics Resource Center (BRC) that will consist of:
• Training infrastructure and applied workshops: support faculty using existing tools and databases (caBIG, custom statistical packages, NCBI genomics, imaging, molecular structure resources).
• Collaborative IT: establish accessible databases in shared cores and support faculty using these resources. …
• Integrative IT: web sites that integrate information from disparate data sets.
• Comparative IT: systems biology, comparing data across multiple platforms to identify new patterns (tissues and cells, molecular pathways, model organisms, toxins, etc.).
(taken from VUMC Strategic Plan).
Other systems…
Construction projects that can be further formed by your needs…
• CollabCore and Lab Blogs
• Genepedia, GeneKeyDB, BioFoundation
• Extensions to WebGestalt
• TFCAT, GeneCAT, CladeCAT, Pazar
Acknowledgments
Bing Zhang, Stefan Kirov, Leslie Galloway, Barbara Jackson, Betty Lou Alspaugh, Oakley Crawford, Suzanne Baktash, Xinxia Peng, Harold Shanafield, Sam Wang, Adam Tebbe, Shawn Ericson, Jeff Horner
A few collaborators…
Bonnie LaFleur, Shawn Levy, Phil Dexheimer, Michael Langston (CS collaborator), Wyeth Wasserman, Dan Goldowitz and the TMGC, Rob Williams et al. (WebQTL, etc.), Erich Baker, Dan Beauchamp, Natasha Deanne, Chad Johnson