Post on 20-Dec-2015
The Elucidation of Regulatory Networks in Complex Biological Systems:
The Convergence of Biology, Medicine and Computing
G. Poste
Stanford University, 15 March 2002
The Analysis and Application of Principles of Biological Design
biology
biology chemistry
genomics computing
1750-1980
1980-2010
• the descriptive narrative
• empirical technology
• mechanistic reductionism
systems biology
• the encoded information content of biological systems
• mapping the basis of biological variation
• rational medicine and customized care
Biology and Medicine as Information-Based Sciences
From Reductionism to Integrated Systems Biology
• individual genes and proteins
• molecular interactions in simple systems
• limited, fragmented datasets
• poor annotation
• limited capacity for predictive simulation
• analog information
• biological circuits, pathways and networks
• assembly of higher order systems
• massive, integrated datasets
• stringent, standardised annotation
• robust algorithms for predictive biology
• biology in silico
• digital information
21st Century Biology and Medicine
“SYSTEMS BIOLOGY”
• the design principles of biological order and complexity
• mapping the information content of biopathways and networks
Biotechnology and Systems Biology
New Analytical Capabilities
Large Scale Computing
“BIG BIOLOGY”
• interdisciplinary, massive datasets, information-based
• infrastructure, investment and education
Convergence: The Technological Platforms Shaping the Evolution of Healthcare
Biotechnology and Systems Biology
New Analytical Capabilities
Large Scale Computing
Rule-Based Design Principles
Computational Biology
Exploring “Biospace”
Automation Engineering and Robotics
Materials Science
Micro-/Opto-Electronics
From Reductionism to Integrated Systems Biology
understanding the information content encoded in biological networks
mapping the design rules for progressively greater complexity of biological order
gene(s)
pathways, circuits and networks
progressively ordered assemblies: organelles, cells, tissues, organs
homeostatic integration of myriad, complex, interactive networks (physiology)
High Level Abstraction of Biological Pathways and Network Systems
Encoded Information
Pathways and Networks
Rule Sets
Plasticity
• adaptive fitness
• pathological perturbation
• directed evolution
• biology in silico
Predictive Biology
Novel Biospace and Carbon : Silicon Union
Global and Nodal Pathway Map of Genomic and Proteomic Elements in Yeast Galactose Utilization
From: T. Ideker et al. 2001. Science 292, 929
Genetic Networks
bioinformation processing involves leverage of interactive feedback loops in diverse domains: physical, chemical, electrical
genomic and proteomic codes represent a dense network of nested hyperlinks
matter becomes code
Nonlinear Complexity in Biological Systems
distinct classes of nonlinear interactions
long-range (fractal) correlations
self-similarity, self-dissimilarity and organized criticality
pattern formation
complex adaptive networks
highly optimized tolerance = robustness with fragility
barriers to cascading failures
deterministic chaos
emergent properties
Nonlinear Complexity in Biological Systems
abrupt changes
- bifurcations; intermittency/bursting; bistability/multistability; phase transitions
nonlinear oscillations
- limit cycles; phase-resetting; entrainment
nonlinear waves
- spirals; scrolls; solitons
complex periodic cycles and quasiperiodicities
scale invariance
- fractal and multifractal scaling; long-range correlations; self-organized criticality
stochastic resonance and related noise-modulated mechanisms
time irreversibility
Information and Technology Platform Overload
Principal Themes in the Analysis of Biological Systems
large scale
miniaturization
automation
parallelism
networked systems
real time, interactive, adaptive
Major Technology Gaps
rapid gene ID in complex genomes
structural genomics and protein structure-function prediction
mapping the proteome
- abundance, modification, localisation and protein-protein interactions
- large scale parallelism (protein-arrays)
- small organic molecule networks
mapping the metabolome
- circuits, modules, networks
robust predictive algorithms for ADMET profiling of drug candidate SAR
The Need for Standards and Stringent Semantics
“... without which ….. wanton and luxuriant fancies climbing up into the Bed of Reason, do not only defile it by unchaste and illegitimate embraces, but instead of real conceptions and notices of things do impregnate the mind with nothing but Ayerie and Subventaneous Phantasmes”
Samuel Parker, FRS 1666
standards
standards
STANDARDS
The Analysis and Comprehension of Biological Systems
descriptive ignorance
complexity
defined rule sets
initial mechanistic insights
• elucidation of patterns
• defining rule sets
• disease heterogeneity
• patient heterogeneity
• disease predisposition
burgeoning, bewildering complexity
• elegant simplicity revealed
• predictive biology
• right Rx : right disease
• right Rx : right patient
• from reactive treatment to proactive prevention
molecular phylogenies and genealogy
population genetics
clinical databanks
chemical SAR
biological order
Integrated Distributed Heterogeneous Databases and Databanks
data warehousing and data mining
human-computer interface systems
evolving hardware and electronic evolution
object-oriented and pattern / spatial array recognition
Expert Systems and Knowledge Management
Convergence, Consilience, Cognition and Computing
• more science
• better science
• faster science
• cross-disciplinary science
• interdisciplinary convergence
• technological convergence
• corporate convergence
MEGADATA
Performance
Volume
• burgeoning data volumes
• more transactions
• increasing diversity of datasets/apps
• expanding user communities
• complexity of distributed environments
• rising performance expectations
• confidentiality and privacy
• pressures on network bandwidth
The Scalability Crisis
Major Challenges for Life Sciences Computing
exponentially growing data repositories (10² TB/PB)
highly variable data formats and standards as obstacles to data access and mining
inadequate attention to data Q.C./annotation standards
excessive reliance on customized solutions and fragmented data sources
inadequate access and integration of public and private datasets
primitive data visualization tools
80% time spent on data preparation tasks and 20% on productive exploration
Major Challenges for Life Sciences Computing
infrastructure scale and capital investment
new tools for mining, visualization, simulation
data storage conventions and technologies
dynamic, adaptive, scalable systems
active networks
- software into the network
- subnet interoperability
- integration of distributed and collaborative working environments
fast data access at all levels
- storage, I/O and networks to support analysis and simulation
expanded bandwidth for high usage and high transfer rates
Big Biology
Bracing For the Inevitable : Petabyte-Size Databases
1000 terabytes
250 billion text pages
20 million four drawer filing cabinets
2000 mile high tower of 1 billion diskettes
typical US consumer generates 100 Gbytes personal data/lifetime
- education, insurance, credit, medical
100 million consumers: 10,000 petabytes
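The petabyte figures on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration, assuming decimal (SI) units, which the slide does not state; the variable names are made up for the example.

```python
# Sanity check of the slide's petabyte arithmetic.
# Assumption: decimal (SI) units, i.e. 1 GB = 10**9 bytes.
GB = 10**9
TB = 10**12
PB = 10**15

petabyte_database = 1000 * TB        # "1000 terabytes"
per_consumer = 100 * GB              # lifetime personal data per consumer
consumers = 100_000_000              # "100 million consumers"
population_total = per_consumer * consumers

# Implied sizes of the slide's physical analogies
bytes_per_text_page = petabyte_database // (250 * 10**9)  # 250 billion pages
bytes_per_diskette = petabyte_database // 10**9           # 1 billion diskettes

print(petabyte_database // PB)   # petabytes in 1000 TB
print(population_total // PB)    # petabytes for 100 million consumers
print(bytes_per_text_page, bytes_per_diskette)
```

Under these assumptions the slide's numbers are mutually consistent: a page works out to roughly 4 kB of text and a diskette to roughly 1 MB.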
Data Grids
from Napster and Gnutella
to
ubiquitous peer-to-peer exchange of data sets
to
apportioned distributed computing for solutions of computationally massive problems
Informatics for Big Biology and e.Health Networks
• instructive precedents in high end computing from other disciplines
- cosmology, quantum chromodynamics, climate research, materials
USA
• Scientific Simulation Initiative
• National Computational Science Alliance
• Long Term Ecological Research
• NASA, DOE, NOAA
• Accelerated Strategic Computing Initiative
• Grid Physics Network
Europe
• UNICORE
• Pangea
• E-Science
• LHC Challenge
• E-Grid
The Bibliome
Modified from: T. Berners-Lee and J. Hendler. 2001. Nature 410, 1023
The Global Virtual Archive / Universal Knowledge Web
Proof, logic and ontology languages
• shared terms / terminology
• machine-machine communication
• inter-memetic translation
• self-evolving translators
Metadata
• Resource Description Framework
• eXtensible Markup Language
• metadata tagging standards for interoperable distributed archives
• self-assembling datasets
• self-describing documents
WWW I
• HyperText Markup Language
• HyperText Transfer Protocol
• the first generation Web
Standardized Lexical Foundations for the Annotation, Archiving and Analysis of Complex Biological Systems
unique complexity of biological systems
multiple levels of abstraction
- organismal
- ecosystem dynamics
- social/memetic networks
qualitative not quantitative data
- diversity of experimental conditions
- inaccessibility/replication of experimental conditions
upgrading to hybrid qualitative/quantitative analysis tools
Standardized Lexical Foundations for the Annotation, Archiving and Analysis of Complex Biological Systems
entity classes : finite elements
action properties : state properties
intramolecular site interactions
intermolecular site interactions
massively parallel networks : unit modules
continuum systems
compartments
economy and parsimony
evolutionary relationships
network pathways
- redundancy (degeneracy), pleiotropy
- complex emergent properties
submodels for searchable characteristics of functional knowledge
integration of submodels into web-based distributed model networks
Jabberwocky
“ ’Twas brillig and the slithy toves Did gyre and gimble in the wabe; All mimsy were the borogoves And the mome raths outgrabe”
Lewis Carroll
The Divide Between Syntax and Semantics
“Colorless green ideas sleep furiously”
Noam Chomsky (1957)
syntactically valid
semantically void
The Divide Between Syntax and Semantics
“Colorless green ideas sleep furiously”
Noam Chomsky (1957)
encoded genome structure (syntax) and diverse expression repertoires (semantics)
- alternative splicing
- overlapping reading frames
- nonsense mutations
- differential modulation by different transcription factors
database formats (syntax) and ontology (semantics)
The Conceptual Complexity of Ontology Design
ontology
- set of axioms in a logical language
- representational vocabulary with precise definitions of shared understanding
- axioms constrain interpretation of defined terms
XML versus ontology and evolution of the semantic web
- XML less complex since semantics are not represented
- objective to reduce uncertainty favors ontologies
- objective to reduce complexity favors XML
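The contrast between XML (syntax only) and an ontology (axioms that constrain interpretation) can be sketched in a few lines. This is a toy illustration, not a real ontology language: the `AXIOMS` table, the `entity`/`assertion` element names and the `violations` helper are all hypothetical, and the yeast GAL4 gene and its Gal4p protein are used only as a familiar example. The XML parser accepts both assertions because both are well formed; only the axiom check rejects the one that is semantically void.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy "ontology": each relation carries one axiom stating
# which entity classes may appear as (subject, object).
AXIOMS = {
    "encodes": ("gene", "protein"),
    "binds": ("protein", "gene"),
}

def violations(xml_text):
    """Return assertions that are well-formed XML but break an axiom."""
    root = ET.fromstring(xml_text.strip())  # raises ParseError on bad syntax
    classes = {e.get("id"): e.get("class") for e in root.iter("entity")}
    bad = []
    for a in root.iter("assertion"):
        relation = a.get("relation")
        pair = (classes.get(a.get("subject")), classes.get(a.get("object")))
        if AXIOMS.get(relation) != pair:
            bad.append((a.get("subject"), relation, a.get("object")))
    return bad

# Both assertions are syntactically valid; the second reverses the roles.
DOC = """
<annotations>
  <entity id="GAL4" class="gene"/>
  <entity id="Gal4p" class="protein"/>
  <assertion subject="GAL4" relation="encodes" object="Gal4p"/>
  <assertion subject="Gal4p" relation="encodes" object="GAL4"/>
</annotations>
"""

print(violations(DOC))
```

The trade-off on the slide shows up directly: the XML layer needs no vocabulary at all, while the axiom table adds complexity but removes the ambiguity about what a given assertion is allowed to mean.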
Convergence, Consilience, Cognition and Computing
scientific, technological and economic convergence
data complexity
data scale
data diversity
optimized data representation
optimized data comprehension
optimized data utilization
• novel visualization and mining tools
• human medicine interfaces
• ‘mind in the loop’ computing
• modulation of brain function for optimum perceptualization
• adaptive IT
• novel emergent networks
Bounded Rationality
human mind’s processing capacity is small relative to the size of the problems requiring analysis/comprehension (Simon)
objective solutions require complexity reduction in information, task and coordination
complexity reduction
- omission and abstraction
- division of labor (systems decomposition)
complexity reduction simultaneously increases uncertainty (Fox)
implications for evolution of ontologies for the semantic web
Enhancing Human Cognitive Capacities for Optimizing Information Utilization
escalating quantities and types of information
real time decision making
new multi-modal, multi-sensory high performance human : information interfaces
representation and comprehensibility of information flows
- optimize information representation (perception)
- modulation of brain function to optimize comprehension
systemic application of advances in cognitive neurobiology
Enhancing Human Cognitive Capacities for Optimizing Information Utilization
optimizing representations of information
- perceptualization
optimizing cognitive capacities
- states of the brain affect states of mind (perception and cognition)
- perceptual modulation techniques
Interdisciplinary Linguistics : Memetic Engineering
molspeak, medspeak, nerdspeak
standardization coding
speech recognition
object-oriented computing
synthetic intelligence
Molecular Medicine, Population Segmentation and Targeted Patient Care
large-scale population genetics
geno-phenotype correlations in subpopulations
‘at-risk’ subpopulations
individual risk profiling
Population Genetics
Linking Clinical Outcomes to Genetic Variation
population genetics
haplotype blocks
SNP maps
low cost high-throughput genotyping
gene-disease associations
ethics
dbases informatics
Large-Scale Disease Association Genetics and Disease Predisposition Risk Profiling
formidable logistics and cost
robust algorithms for combinatorial gene interactions
slow evolution
complex ethical, legal and social issues
public acceptance and legislative controls
evidentiary standards and regulation
Legislative and Regulatory Considerations in the Creation and Management of Large Scale Population Health Data Networks
consent
identifiable (clinical) versus anonymous (research) data
authentication of communicating parties
compliance
- HIPAA (USA)
- EU Data Directive
- individual nation/US State requirements
- ICH5 Common Technical Document
e.health
Content
Care
Population Databanks and the Rise of Molecular Medicine
individual / family records
privacy and confidentiality
gene-disease correlations
gene-outcome correlations
gene-disease predisposition associations
individual (targeted) care
- optimum Tx
- predisposition and proactive risk management
infection
CNS
CPD
CVD
diabetes
renal
stroke
cancer
Shaping Physician Behaviour
decision support / control Dx/PDx Rx, PRx clinical guidelines education
Rx validation utilization compliance AE avoidance
wellness education compliance risk mitigation remote monitoring
population dBase
individual record and risk profile
Shaping Consumer / Patient Behaviour
Physician Desk-Top Network
e.Pharmacy
e.Home Health
Who Knows Wins!
Health Databanks
Population Segmentation and Individual Patient Profiling
• clinical
• pharmacy
• lab data
• outcomes
“The average person will have three to five internet devices on their body by the end of 2010 ….. not just the mobile phone, but health monitors, maybe even an implanted device, a GPS type of system, etc. ………..”
John Chambers
Cisco Systems
dot.CEO January 2001, p. 53
Consumer Health Information Systems and Services
in-home to physician / pharmacy links
next generation tele-medicine and personal health monitoring
compliance monitoring
independent living
emergency management
integration of new imaging / diagnostic sensor systems
Biology and Medicine as Information-Based Disciplines
on-body / in-body / in-home remote devices for health status / compliance monitoring
interactive computational software and Rx of behavioral disorders
ubiquitous physician decision-support software to optimize clinical care and compliance
Cyber-Medicine
The Evolution of Large-Scale Biology
genome sequencing
comparative genomics
proteomics
functional genomics
structural genomics
genetic circuits
biological order
complex systems
SNPs and gene-disease association studies
large-scale population and statistical genetics
robust geno-phenotype correlations
individual genotyping and disease risk profiling
INFORMATICS
Biology and Medicine as Information-Based Disciplines
understanding the encoded instructions for biological design
- genes, proteins, higher order assemblies
- abnormal information coding in disease
assembly of large-scale population databases
- gene-disease correlations
- gene-Rx outcome correlations
- individual genotyping and disease predisposition risk profiling
Research
Clinical Medicine
Systems Analysis
Biology as an Informational Science
new technological platforms
- automation, miniaturization, high-throughput
- parallelism
new computational tools
- scale, diversity of content
- mining algorithms
new organizational linkages
- convergence of biology and computing (science)
- health / telco / compco (technology)
Systems Analysis
Biology as an Informational Science
new skills
- graduate / post-graduate curricula
- clinical training
new organizational structures
- inter-disciplinary
new policies
- grant agencies
- national / international science
- regulation, legislation
Computational Biology
predictive simulation of gene regulation and genetic networks
- from genotype to phenotype
fast algorithms for molecular simulations
modeling of molecular interactions, chemical dynamics, transport and compartmentalization in cells
metabolic and physiological simulations
scalar modeling
- molecules to cells to tissues to organs to organisms to populations
predictive tools for pre-emptive stabilization of system dysregulation
Grand Challenges
From Bioinformatics to Computational Biology
Bioinformatics : The Phenomenological Era
• ID and classification of statistical regulation among the most recurrent objects
• optimum database design
• fast classification/clustering algorithms
• data mining software and ontological relationships
Computational Biology : The Theoretical Era
• elucidation of robust design rules
• higher order multistate detector and component interactions
• contextual recognition
• pathways, circuits, networks and higher order assemblies
• predictive biology
biology and medicine are in transition to become information-based sciences
this transition will shift R&D focus from the current reductionist framework to the analysis of biological complexity (systems biology)
these transitions will demand adoption of large scale analyses (big biology) and obligate adoption of more stringent standardization
- data QC, annotation, curation
- dBase formats and clinical profiling tools
- massive computational capacity and dynamic, scalable networks
- distributed computing and collaborative networks
- from bioinformatics to ‘rules-based’ computational biology and cybermedicine