systems biology: bioinformatics on complete biological systems
Lars Juhl Jensen
Systems biology
Bioinformatics on complete biological systems
can a biologist fix a radio?
Lazebnik, Biochemistry, 2004
single gene studies
many experiments
knockout phenotype
Lazebnik, Biochemistry, 2004
everything about one gene
high-throughput biology
single technology
microarrays
one thing about every gene
systems biology
model complete systems
mathematical modeling
a simple system
Chen, Mol. Biol. Cell, 2004
simulation
Chen, Mol. Biol. Cell, 2004
many equations
Chen, Mol. Biol. Cell, 2004
many parameters
Chen, Mol. Biol. Cell, 2004
requires detailed knowledge
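To make the point about equations and parameters concrete, here is a minimal sketch of what simulating a single equation looks like. It is not the model from Chen et al.; the equation dX/dt = k_syn - k_deg * X, the function name `simulate`, and all parameter values are assumptions chosen for illustration. Even this one-species toy needs two rate constants, which is why a model with many equations requires detailed knowledge.

```python
def simulate(k_syn, k_deg, x0=0.0, dt=0.01, t_end=50.0):
    """Euler integration of dX/dt = k_syn - k_deg * X (toy model, assumed)."""
    steps = int(round(t_end / dt))
    x = x0
    trajectory = [x]
    for _ in range(steps):
        # One Euler step: synthesis at a constant rate, first-order degradation.
        x += dt * (k_syn - k_deg * x)
        trajectory.append(x)
    return trajectory


# Hypothetical parameter values; the analytical steady state is k_syn / k_deg.
trajectory = simulate(k_syn=2.0, k_deg=0.5)
print(round(trajectory[-1], 3))
```

The simulated concentration approaches k_syn / k_deg, so the behaviour depends entirely on parameters that would have to be measured or fitted.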
molecular networks
what is an interaction?
physical contact
stable interactions
transient interactions
interaction assays
yeast two-hybrid
fragment complementation
affinity purification
Jensen & Bork, Science, 2008
Jensen et al., Drug Discovery Today: TARGETS, 2004
spoke representation
Jensen et al., Drug Discovery Today: TARGETS, 2004
matrix representation
Jensen et al., Drug Discovery Today: TARGETS, 2004
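The difference between the spoke and matrix representations of an affinity purification can be sketched in a few lines. The function names `spoke_pairs` and `matrix_pairs` and the example purification are assumptions made for illustration, not taken from the cited paper.

```python
from itertools import combinations


def spoke_pairs(bait, preys):
    """Spoke model: the bait is linked to each prey; preys are not linked to each other."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def matrix_pairs(bait, preys):
    """Matrix model: every pair of proteins in the purification is linked."""
    members = sorted({bait, *preys})
    return {frozenset(pair) for pair in combinations(members, 2)}


# Hypothetical purification: bait A pulls down preys B, C and D.
spoke = spoke_pairs("A", ["B", "C", "D"])
matrix = matrix_pairs("A", ["B", "C", "D"])
print(len(spoke), len(matrix))  # 3 6
```

The spoke pairs are always a subset of the matrix pairs; the matrix model adds the prey-prey pairs, which is where many of the extra false positives are likely to come from.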
interaction databases
BioGRID: Biological General Repository for Interaction Datasets
DIP: Database of Interacting Proteins
IntAct
MINT: Molecular Interactions Database
Exercise 1
Go to http://thebiogrid.org
Query for human TYMS
Find the interaction partners
Check their sources
Think of possible problems
possibly many errors
purely high-throughput
one assay
one study
functional associations
guilt by association
STRING
experimental data
physical interactions
genetic interactions
Beyer et al., Nature Reviews Genetics, 2007
gene coexpression
curated knowledge
complexes
pathways
Letunic & Bork, Trends in Biochemical Sciences, 2008
genomic context
operons
Korbel et al., Nature Biotechnology, 2004
bidirectional promoters
Korbel et al., Nature Biotechnology, 2004
gene fusion
Korbel et al., Nature Biotechnology, 2004
phylogenetic profiles
Korbel et al., Nature Biotechnology, 2004
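The idea behind phylogenetic profiles is that genes present and absent in the same set of genomes are likely to be functionally associated. A minimal sketch follows, assuming each profile is stored as the set of genomes in which the gene has an ortholog. The Jaccard index is used here only as one simple similarity measure; it is not necessarily the measure used by STRING, and the gene and species names are hypothetical.

```python
def profile_similarity(profile_a, profile_b):
    """Jaccard similarity between two presence/absence profiles (sets of genomes)."""
    union = profile_a | profile_b
    if not union:
        return 0.0
    return len(profile_a & profile_b) / len(union)


# Hypothetical profiles: genomes in which each gene has an ortholog.
profiles = {
    "geneX": {"sp1", "sp2", "sp4", "sp5"},
    "geneY": {"sp1", "sp2", "sp4", "sp5"},
    "geneZ": {"sp3", "sp5"},
}

sim_xy = profile_similarity(profiles["geneX"], profiles["geneY"])
sim_xz = profile_similarity(profiles["geneX"], profiles["geneZ"])
print(sim_xy, sim_xz)  # 1.0 0.2
```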
visualization
Franceschini et al., Nucleic Acids Research, 2013
many databases
different formats
different identifiers
variable quality
not comparable
not same species
hard work
(students)
quality scores
von Mering et al., Nucleic Acids Research, 2005
calibrate vs. gold standard
von Mering et al., Nucleic Acids Research, 2005
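Calibrating against a gold standard can be sketched as follows: bin the pairs by raw score and, for each bin, compute the fraction of pairs that the gold standard confirms. This is a simplified illustration, not the procedure from the cited paper; the function name `calibrate`, the bin edges, and the data are all assumptions.

```python
def calibrate(scored_pairs, gold_standard, bin_edges):
    """For each raw-score bin, return the fraction of pairs found in the gold standard."""
    result = []
    for low, high in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = [pair for pair, score in scored_pairs if low <= score < high]
        if in_bin:  # skip empty bins to avoid dividing by zero
            hits = sum(1 for pair in in_bin if pair in gold_standard)
            result.append(((low, high), hits / len(in_bin)))
    return result


def pair(a, b):
    """Unordered protein pair."""
    return frozenset((a, b))


# Hypothetical raw scores from one evidence source.
scored_pairs = [
    (pair("A", "B"), 0.9), (pair("A", "C"), 0.8), (pair("B", "C"), 0.7),
    (pair("B", "D"), 0.3), (pair("C", "D"), 0.2), (pair("A", "D"), 0.1),
]
# Hypothetical gold standard of trusted associations.
gold_standard = {pair("A", "B"), pair("A", "C"), pair("C", "D")}

calibration = calibrate(scored_pairs, gold_standard, [0.0, 0.5, 1.0])
for (low, high), fraction in calibration:
    print(f"raw score {low}-{high}: {fraction:.2f} confirmed")
```

Because every evidence source is mapped onto the same scale (fraction confirmed), scores from different sources become comparable.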
homology-based transfer
Franceschini et al., Nucleic Acids Research, 2013
Exercise 2
Query STRING for human TYMS
Show network in confidence mode
Show up to 20 interaction partners
Show only experimental evidence
Show also low-confidence links
text mining
>10 km
too much to read
computer
as smart as a dog
teach it specific tricks
named entity recognition
comprehensive lexicon
cyclin dependent kinase 1
CDC2
flexible matching
cyclin dependent kinase 1
cyclin-dependent kinase 1
orthographic variation
CDC2
hCdc2
“black list”
SDS
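The ingredients above (lexicon, flexible matching, orthographic variation, black list) can be combined in a small dictionary-based tagger. This is a sketch under stated assumptions, not the tagger behind STRING: the three-entry lexicon, the identifier strings, and the functions `name_to_pattern` and `tag` are hypothetical, and the single optional prefix letter is a deliberately crude way to accept forms such as hCdc2.

```python
import re

# Hypothetical mini-lexicon mapping names to an identifier.
LEXICON = {"cyclin dependent kinase 1": "CDK1", "CDC2": "CDK1", "SDS": "SDS"}
# Names that usually mean something else in text (SDS is also a detergent).
BLACKLIST = {"SDS"}


def name_to_pattern(name):
    """Compile a lexicon name into a case-insensitive, hyphen-tolerant pattern."""
    # Allow a space or a hyphen wherever the lexicon name has either one.
    parts = re.split(r"[ -]", name)
    core = r"[ -]".join(re.escape(part) for part in parts)
    # Optional single-letter prefix, such as the "h" in hCdc2.
    return re.compile(r"\b[a-z]?" + core + r"\b", re.IGNORECASE)


def tag(text):
    """Return (matched text, identifier) for every non-blacklisted lexicon hit."""
    hits = []
    for name, identifier in LEXICON.items():
        if name in BLACKLIST:
            continue
        for match in name_to_pattern(name).finditer(text):
            hits.append((match.group(0), identifier))
    return hits


text = "Cyclin-dependent kinase 1 (hCdc2) was separated by SDS gel electrophoresis."
hits = tag(text)
print(hits)  # [('Cyclin-dependent kinase 1', 'CDK1'), ('hCdc2', 'CDK1')]
```

Both spellings resolve to the same identifier, while the blacklisted name SDS is ignored even though it is in the lexicon.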
co-mentioning
within documents
within paragraphs
within sentences
scoring scheme
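A co-mentioning score can be sketched by counting how often two names occur together and giving more weight to closer co-occurrences. The weights below and the nested-list input format are assumptions for illustration only; they are not the scoring scheme used by STRING.

```python
from collections import defaultdict
from itertools import combinations

# Assumed weights: closer co-mentions count more. Values are illustrative only.
WEIGHTS = {"document": 1.0, "paragraph": 2.0, "sentence": 3.0}


def comention_scores(documents):
    """Score entity pairs by weighted co-mentioning.

    documents: list of documents; each document is a list of paragraphs;
    each paragraph is a list of sentences; each sentence is a set of
    entity names that have already been tagged.
    """
    scores = defaultdict(float)

    def add(entities, weight):
        for pair in combinations(sorted(entities), 2):
            scores[pair] += weight

    for paragraphs in documents:
        doc_entities = set()
        for sentences in paragraphs:
            par_entities = set()
            for sentence in sentences:
                add(sentence, WEIGHTS["sentence"])
                par_entities |= sentence
            add(par_entities, WEIGHTS["paragraph"])
            doc_entities |= par_entities
        add(doc_entities, WEIGHTS["document"])
    return dict(scores)


# Hypothetical corpus: one document with two paragraphs.
documents = [
    [
        [{"CYC1", "HAP1"}, {"CYC7"}],  # paragraph 1: two sentences
        [{"TYMS"}],                    # paragraph 2: one sentence
    ],
]
scores = comention_scores(documents)
print(scores[("CYC1", "HAP1")], scores[("CYC1", "CYC7")], scores[("CYC1", "TYMS")])  # 6.0 3.0 1.0
```

A pair mentioned in the same sentence also shares a paragraph and a document, so it collects all three weights; a pair that only shares the document collects the smallest one.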
NLP: Natural Language Processing
grammatical analysis
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
more precise
worse recall
related web resources
STITCH
STRING + 300k chemicals
drugs
metabolites
known drug targets
high-throughput screens
metabolic pathways
Exercise 3
Go to http://stitch-db.org
Query for human TYMS
What is the role of thymidylate?
What is the role of dUMP?
What is the role of Pemetrexed?
general approach
suite of new resources
COMPARTMENTS
TISSUES
DISEASES
curated knowledge
experimental data
text mining
computational predictions
common identifiers
quality scores
visualization
compartments.jensenlab.org
tissues.jensenlab.org
thank you!