![Page 1: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/1.jpg)
Strategies & Examples for Functional Modeling
COST Functional Modeling Workshop
22-24 April, Helsinki
![Page 2: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/2.jpg)
Types of data sets and modeling
• Commercial array data – more likely to have tools that support the use of array IDs.
• Custom/USDA array data – problems with updating IDs, linking to function and using array IDs directly in functional modeling tools.
• Proteomics data – larger data sets; need to make background references to determine enrichment.
• RNA-Seq data – larger and more complex data sets; novel transcripts currently can’t be included in modeling (contact AgBase to assign GO).
• Real-time data or quantitative proteomics data – hypothesis testing.
![Page 3: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/3.jpg)
Functional Modeling Strategies
1. GO summary (using Slim sets)
2. GO enrichment (statistical!)
3. Pathways analysis
4. Interaction or networks analysis
5. Hypothesis testing
Note:
• Functional modeling should be integrated.
• Approaches are complementary, not exclusive.
• Modeling is driven by the biology (not the other way round).
![Page 4: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/4.jpg)
Modeling Strategy
• Think about using multiple functional approaches.
  • GO, pathways, networks
  • complementary
• What is available for your species?
  • What GO is available?
  • What species does the pathways/network analysis use?
• What resources do you have?
  • at your institute (e.g. commercial pathways analysis)
  • open source (e.g. GO Enrichment analysis)
  • using online vs installed
• Iterative – further functional modeling based on initial results
  • GO hypothesis testing?
![Page 5: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/5.jpg)
1. GO Functional Summary
• high throughput data sets give us 1000s to 10,000s of gene products
  • can’t know everything about all gene products
  • tendency to ‘cherry pick’ ones you recognize
• instead, can group gene products by function
  • this gives us a manageable number of categories to process
  • enables us to see trends, patterns, etc.
• Use GO Slim sets to ‘summarize’ data
  • Lose details (but can gain perspective).
  • Some GO Slim sets are ageing – not being updated as changes to the GO are made.
  • Different Slim sets have different terms – which is best for your data?
AgBase GOSlimViewer tool.
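A slim summary can be sketched as walking each detailed term up the ontology until it reaches a term in the chosen slim set, then counting gene products per slim term. The snippet below is a minimal illustration under stated assumptions: the term IDs, the parent relationships, and the slim set are invented placeholders (not real GO accessions), and it is not how GOSlimViewer is implemented.

```python
from collections import Counter

# Toy "is_a" parents (child -> parents). These term IDs are invented
# placeholders for illustration, not real GO accessions.
PARENTS = {
    "T:apoptosis": ["T:cell_death"],
    "T:cell_death": ["T:biological_process"],
    "T:t_cell_activation": ["T:immune_response"],
    "T:immune_response": ["T:biological_process"],
    "T:biological_process": [],
}

# Hypothetical slim set: the broad categories we want to report.
SLIM = {"T:cell_death", "T:immune_response"}


def ancestors(term, parents):
    """Return the term itself plus every term reachable through its parents."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def slim_counts(annotations, parents, slim):
    """Count gene products per slim term.

    A gene product is counted once per slim term, however many of its
    detailed terms map there. Gene products that reach no slim term are
    tallied under 'unmapped' so that they are not silently dropped.
    """
    counts = Counter()
    for terms in annotations.values():
        hits = set()
        for term in terms:
            hits |= ancestors(term, parents) & slim
        if hits:
            counts.update(hits)
        else:
            counts["unmapped"] += 1
    return counts


annotations = {
    "geneA": ["T:apoptosis"],
    "geneB": ["T:t_cell_activation", "T:apoptosis"],
    "geneC": ["T:biological_process"],
}
summary = slim_counts(annotations, PARENTS, SLIM)
```

Swapping in a different `SLIM` set changes the summary, which is why the slim set used should be reported alongside the results.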
![Page 6: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/6.jpg)
http://www.agbase.msstate.edu/help/slimviewerhelp.htm
The Slim set you use matters - need to determine which one to use & report it in Methods.
![Page 7: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/7.jpg)
![Page 8: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/8.jpg)
![Page 9: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/9.jpg)
Functional Summary
• Not all GO terms are annotated equally, e.g., metabolism!
  • can slim the complete GO for a species as a background set and then determine which terms in your data are disproportionately represented.
• Can use Slims to compare two data sets (e.g., control vs treatment).
• Use Slims for your own sanity – are you seeing what you expect to see?
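Comparing a data set against a slimmed background comes down to comparing proportions per slim term. The sketch below assumes you already have slim-term counts for both sets; the counts are made up for illustration, and the ratio is a descriptive summary, not a statistical test.

```python
def compare_to_background(data_counts, background_counts):
    """Return {term: (data fraction, background fraction, ratio)}.

    A ratio well above 1 suggests the term is more common in the data set
    than in the background; well below 1 suggests the opposite.
    """
    data_total = sum(data_counts.values())
    background_total = sum(background_counts.values())
    if data_total == 0 or background_total == 0:
        raise ValueError("both count tables must contain at least one annotation")
    rows = {}
    for term, background_n in background_counts.items():
        if background_n == 0:
            continue  # a ratio is undefined for terms absent from the background
        data_fraction = data_counts.get(term, 0) / data_total
        background_fraction = background_n / background_total
        rows[term] = (data_fraction, background_fraction,
                      data_fraction / background_fraction)
    return rows


# Invented counts: slim-term tallies for a data set and a whole-species background.
data_counts = {"metabolism": 30, "immune response": 20}
background_counts = {"metabolism": 600, "immune response": 100, "transport": 300}
comparison = compare_to_background(data_counts, background_counts)
```

In this toy example "metabolism" dominates the data set only because it also dominates the background, while "immune response" stands out relative to the background.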
![Page 10: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/10.jpg)
Membrane proteins grouped by GO BP:
[Figure: B-cells vs stroma. Category labels: ion/proton transport, cell migration, cell adhesion, cell growth, apoptosis, immune response, cell cycle/cell proliferation, cell-cell signaling, function unknown, development, endocytosis, proteolysis and peptidolysis, protein modification, signal transduction.]
![Page 11: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/11.jpg)
Membrane proteins grouped by GO BP:
[Figure: B-cells vs stroma. Labeled categories: cell migration, apoptosis, immune response, cell cycle/cell proliferation, cell-cell signaling, function unknown.]
![Page 12: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/12.jpg)
BVDV Infection – cytopathic (CP) vs non-cytopathic (NCP) infection
(comparing function between 2 different conditions)
![Page 13: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/13.jpg)
![Page 14: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/14.jpg)
2. Determining over-represented or under-represented function.
• most typically used functional analysis method
• many, many tools that do this – see: http://www.geneontology.org/GO.tools.microarray.shtml
• very different visualization
• will use some of these tools in practical session
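Most enrichment tools ask the same question for each GO term: given how many background genes carry the term, is the number seen in my list surprisingly high or low? One common way to answer it is a hypergeometric tail probability (equivalent to a one-sided Fisher's exact test). The sketch below uses only the Python standard library; the function names are illustrative, and multiple-testing correction, which real tools usually apply, is not shown.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(exactly k annotated genes in a list of n drawn from N, K annotated)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def representation_p_values(k, N, K, n):
    """Return (over, under) one-sided p-values for a single GO term.

    k: genes in my list annotated to the term
    N: genes in the background
    K: background genes annotated to the term
    n: genes in my list
    over  = P(X >= k), small when the term is over-represented
    under = P(X <= k), small when the term is under-represented
    """
    lowest = max(0, n - (N - K))
    highest = min(K, n)
    over = sum(hypergeom_pmf(i, N, K, n) for i in range(k, highest + 1))
    under = sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))
    return over, under


# Toy numbers: 20 background genes, 5 annotated to the term,
# and a list of 5 genes of which 3 carry the term.
over, under = representation_p_values(k=3, N=20, K=5, n=5)
```

The result depends heavily on the background (`N` and `K`), which is why the choice of background set matters as much as the choice of tool.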
![Page 15: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/15.jpg)
http://david.abcc.ncifcrf.gov/home.jsp
![Page 16: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/16.jpg)
Some useful expression analysis tools:
• Database for Annotation, Visualization and Integrated Discovery (DAVID)
  • http://david.abcc.ncifcrf.gov/
• AgriGO -- GO Analysis Toolkit and Database for Agricultural Community
  • http://bioinfo.cau.edu.cn/agriGO/
  • used to be EasyGO
  • chicken, cow, pig, mouse, cereals, dicots
  • adding new species by request
• Onto-Express
  • http://vortex.cs.wayne.edu/projects.htm#Onto-Express
  • can provide your own gene association file
• Ontologizer
  • WebStart widget (requires Java); now on Galaxy
  • http://compbio.charite.de/contao/index.php/ontologizer2.html
  • requires OBO file & GAF (enables users to select their own annotations)
![Page 17: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/17.jpg)
GO Enrichment tools that support agricultural species.
![Page 18: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/18.jpg)
![Page 19: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/19.jpg)
![Page 20: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/20.jpg)
• structurally and functionally re-annotated a microarray
• quantified the impact of this re-annotation based on GO annotations & pathways represented on the array
• tested using a previously published experiment that used this microarray
• re-annotation allows more comprehensive GO based modeling and improves pathway coverage
• re-annotation resulted in a different model from previously published research findings
![Page 21: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/21.jpg)
![Page 22: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/22.jpg)
Evaluating GO tools
Some criteria for evaluating GO Tools:
1. Does it include my species of interest (or do I have to “humanize” my list)?
2. What does it require to set up (computer usage/online)?
3. What was the source for the GO (primary or secondary) and when was it last updated?
4. Does it report the GO evidence codes (and is IEA included)?
5. Does it report which of my gene products has no GO?
6. Does it report both over/under represented GO groups and how does it evaluate this?
7. Does it allow me to add my own GO annotations?
8. Does it represent my results in a way that facilitates discovery?
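Criteria 4 and 5 can be checked by hand if the tool's annotations are available as a gene association file (GAF). The sketch below assumes the tab-delimited GAF 2.x layout, where column 2 is the object ID, column 5 the GO ID, column 7 the evidence code, and lines starting with "!" are comments. The sample rows, accessions, and references are invented placeholders.

```python
from collections import defaultdict


def read_gaf(lines, include_iea=True):
    """Collect {object_id: {(go_id, evidence_code), ...}} from GAF lines."""
    annotations = defaultdict(set)
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete annotation row
        object_id, go_id, evidence = cols[1], cols[4], cols[6]
        if evidence == "IEA" and not include_iea:
            continue
        annotations[object_id].add((go_id, evidence))
    return annotations


def coverage_report(gene_ids, annotations):
    """Split my gene list into products with and without any GO annotation."""
    with_go = [g for g in gene_ids if annotations.get(g)]
    without_go = [g for g in gene_ids if not annotations.get(g)]
    return with_go, without_go


# Invented sample rows: only the first nine GAF columns are filled in.
SAMPLE_GAF = [
    "!gaf-version: 2.0",
    "\t".join(["DB", "P00001", "GENE1", "", "GO:0006915", "PMID:0", "IDA", "", "P"]),
    "\t".join(["DB", "P00002", "GENE2", "", "GO:0006955", "GO_REF:0", "IEA", "", "P"]),
]
my_genes = ["P00001", "P00002", "P00003"]

all_evidence = coverage_report(my_genes, read_gaf(SAMPLE_GAF))
no_iea = coverage_report(my_genes, read_gaf(SAMPLE_GAF, include_iea=False))
```

Running the report with and without IEA shows how much of a data set depends on electronically inferred annotation.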
![Page 23: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/23.jpg)
RNASeq GO Enrichment
• RNASeq experiments: longer transcripts and more highly expressed transcripts are more likely to be detected as differentially expressed.
• Current GO enrichment tools do not account for RNASeq platform bias (most based upon arrays).
  • assume that all genes are independent and equally likely to be selected as DE
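Before running an array-style enrichment tool on RNASeq results, it can help to check whether this bias is visible in your own data. The sketch below is a simple diagnostic, not a correction method: it sorts genes by transcript length, splits them into equal-sized bins, and reports the fraction called differentially expressed (DE) in each bin. The gene lengths and DE calls are invented.

```python
def de_fraction_by_length(genes, n_bins=4):
    """Return [(shortest, longest, DE fraction), ...] for length-ordered bins.

    genes is a list of (transcript_length, is_de) pairs.
    """
    if n_bins < 1 or len(genes) < n_bins:
        raise ValueError("need at least one gene per bin")
    ordered = sorted(genes, key=lambda gene: gene[0])
    total = len(ordered)
    summary = []
    for i in range(n_bins):
        chunk = ordered[i * total // n_bins:(i + 1) * total // n_bins]
        n_de = sum(1 for _, is_de in chunk if is_de)
        summary.append((chunk[0][0], chunk[-1][0], n_de / len(chunk)))
    return summary


# Invented example: (transcript length in bases, called DE?)
genes = [
    (500, False), (800, False), (1000, True), (1200, False),
    (3000, True), (4000, True), (5000, False), (6000, True),
]
bins = de_fraction_by_length(genes, n_bins=2)
```

A DE fraction that climbs steadily with length suggests that the "equally likely to be selected" assumption does not hold for the data set.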
![Page 24: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/24.jpg)
3. Pathway Analysis
• Freely available tools:
  • from public databases, e.g. KEGG & Reactome
  • Freely available tools, e.g. Cytoscape
• Commercial pathways analysis tools: e.g., Ingenuity Pathways Analysis (IPA), Pathway Studio, etc.
  • some tools only have limited species – need to “humanize” animal data, or map plant data to Arabidopsis
  • everything gives you cancer
• Many pathways analysis tools combine pathways analysis and network analysis.
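"Humanizing" a list means replacing each identifier with its ortholog in a species the tool supports. The sketch below assumes you already have an ortholog table from some source; the identifiers and the `orthologs` mapping are hypothetical. Its main purpose is to make the losses visible: identifiers with no ortholog drop out, and several identifiers can collapse onto one target.

```python
def map_to_reference(gene_ids, orthologs):
    """Return (mapped, unmapped) for a list of identifiers.

    mapped is {original_id: reference_id}; unmapped lists the identifiers
    that have no entry in the ortholog table.
    """
    mapped = {}
    unmapped = []
    for gene_id in gene_ids:
        target = orthologs.get(gene_id)
        if target is None:
            unmapped.append(gene_id)
        else:
            mapped[gene_id] = target
    return mapped, unmapped


# Hypothetical ortholog table (species ID -> reference species ID).
orthologs = {"bt1": "hsA", "bt2": "hsA", "bt3": "hsB"}
my_list = ["bt1", "bt2", "bt3", "bt4"]

mapped, unmapped = map_to_reference(my_list, orthologs)
unique_targets = set(mapped.values())
```

Reporting how many identifiers were lost or merged at this step makes it easier to judge how far the resulting pathways reflect the original data.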
![Page 25: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/25.jpg)
Reactome Skypainter
http://www.reactome.org/cgi-bin/skypainter2
![Page 26: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/26.jpg)
KEGG Pathways
http://www.kegg.jp/kegg/download/kegtools.html
![Page 27: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/27.jpg)
Analysis tools (commercial)
• Ingenuity Pathway Analysis (http://www.ingenuity.com)
  • Networks
  • Pathways
  • Functions and diseases
• Pathway Studio (http://www.ariadnegenomics.com/)
  • Gene Ontology (GO) groups
  • GSEA
  • Pathways
IPA analysis included as IPA.txt
![Page 28: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/28.jpg)
Data Curation
• Ingenuity: Manually curated database by Ph.D level scientists (mining 32 different peer reviewed journals).
• Pathway Studio: Automated curation by Medscan Reader using Natural language processing (NLP) technology. Mining PubMed abstracts and peer reviewed journals
  • users can do their own text mining
![Page 29: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/29.jpg)
(Comparison by Divya Peddinti)
Comparison Criteria
• Features
• Proportion of proteins involved in modeling
• Data generation
• Display
• Test Dataset: 3,600 bovine spermatozoa proteins
![Page 30: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/30.jpg)
| Feature | Ingenuity Pathway Analysis (IPA) | Pathway Studio |
| --- | --- | --- |
| Input | GI number, Microarray ID, Affymetrix ID, GenBank, Swiss Prot Accession, Unigene ID, Name or Alias, HUGO ID | Entrez gene, GenBank, Microarray ID, Swiss Prot Accession, Unigene ID, Name or Alias, HUGO ID |
| Databases | Contains biological interactions data for human, mouse, rat. Orthologous mapping available for dog, cow, chimp, chicken, Rhesus macaque monkey, Arabidopsis thaliana, Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Danio rerio | Contains biological data for human, mouse, rat, bacteria, chicken, zebrafish, frog, cow, bee, dog, Arabidopsis, Drosophila, yeast, and transplantation research etc. |
![Page 31: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/31.jpg)
| Feature | Ingenuity Pathway Analysis (IPA) | Pathway Studio |
| --- | --- | --- |
| Statistical test | The significance value (p value) assigned to the function / pathways using Fisher’s exact test | The statistical significance of the overlap between the protein list and a GO group or pathway using Fisher’s exact test |
| Updates | Quarterly | Quarterly |
| Networks | Builds networks with a maximum of 35 genes/proteins | - |
![Page 32: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/32.jpg)
Proteins involved in modeling
[Figure: bar chart comparing Ingenuity and Pathway Studio, axis 0 to 120. Series: proteins involved in modeling, proteins not involved in modeling. Data labels: 57.5, 99.85, 42.5, 0.15.]
![Page 33: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/33.jpg)
Data generation
[Figure: bar chart of pathways (axis 0 to 50) for Ingenuity Pathway Analysis and Pathway Studio, data labels 44 and 33. Additional labels: 37, 7, 26.]
![Page 34: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/34.jpg)
Pathway display
EGF signaling pathway
![Page 35: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/35.jpg)
4. Network Analysis
• IPA & Pathway Studio equally efficient at drawing networks of relationships.
• IPA: simplifies the pathway display and creates more manageable, user-friendly networks for users to analyze.
• Pathway Studio: shows the relations in a table format.
• STRING Database - known and predicted protein interactions.
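Whichever tool draws the network, the underlying data is a list of pairwise relationships, which is also what the table view shows. The sketch below is a minimal illustration of turning such a table into a network and ranking the most connected nodes; the protein names and interactions are placeholders, not real data from any of the tools above.

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def top_hubs(neighbors, n=3):
    """Return the n most connected nodes; ties are broken alphabetically."""
    ranked = sorted(neighbors, key=lambda node: (-len(neighbors[node]), node))
    return ranked[:n]


# Placeholder interaction table (one row per relationship).
pairs = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
network = build_network(pairs)
hubs = top_hubs(network, n=2)
```

Highly connected nodes are often a useful starting point when deciding which part of a large network to inspect first.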
![Page 36: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/36.jpg)
http://string-db.org/
![Page 37: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/37.jpg)
http://www.cytoscape.org/
![Page 38: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/38.jpg)
5. Hypothesis Testing
• high throughput data sets – ‘fishing expedition’ or hypothesis generation
• but GO also serves as a repository of biological function – can be used for hypothesis testing based on these data sets
![Page 39: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/39.jpg)
[Figure: mean total lesion score (axis 0 to 18) vs days post infection (axis 0 to 100), by genotype: Susceptible (L72) and Resistant (L61).]
Non-MHC associated resistance and susceptibility
The critical time point in MD lymphomagenesis
Hypothesis: At the critical time point of 21 dpi, MD-resistant genotypes have a T-helper (Th)-1 microenvironment (consistent with CTL activity), but MD-susceptible genotypes have a T-reg or Th-2 microenvironment (antagonistic to CTL).
![Page 40: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/40.jpg)
CYTOKINES AND T HELPER CELL DIFFERENTIATION
[Diagram labels: APC, naive CD4+ T cell, Th-1, Th-2, T reg.]
Shyamesh Kumar
![Page 41: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/41.jpg)
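[Diagram: cytokines and T helper cell differentiation. Cell labels: APC, naive CD4+ T cell, Th-1, Th-2, T reg, macrophage, NK cell, CTL. Cytokine and factor labels: IFN γ, IL 12, IL 18, IL 4, IL10, TGFβ, Smad 7. Sample labels: L6 Whole, L7 Whole, L7 Micro.]
Th-1, Th-2, T-reg? Inflammatory?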
![Page 42: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/42.jpg)
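Step I. GO-based Phenotype Scoring.

| Gene product | Th1 | Th2 | Treg | Inflammation |
| --- | --- | --- | --- | --- |
| IL-2 | 1 | ND | 1 | -1 |
| IL-4 | -1 | 1 | 1 | ND |
| IL-6 | | 1 | -1 | 1 |
| IL-8 | ND | ND | 1 | 1 |
| IL-10 | -1 | 1 | 1 | 0 |
| IL-12 | 1 | -1 | ND | ND |
| IL-13 | -1 | 1 | ND | ND |
| IL-18 | 1 | 1 | 1 | 1 |
| IFN-g | 1 | -1 | 1 | 1 |
| TGF-b | -1 | 0 | 1 | -1 |
| CTLA-4 | -1 | -1 | 1 | -1 |
| GPR-83 | -1 | -1 | 1 | -1 |
| SMAD-7 | 1 | 1 | -1 | 1 |

ND = No data

Step II. Multiply by quantitative data for each gene product.

Step III. Inclusion of quantitative data to the phenotype scoring table and calculation of net effect.

| Gene product | Th1 | Th2 | Treg | Inflammation |
| --- | --- | --- | --- | --- |
| IL-2 | 1.58 | | 1.58 | -1.58 |
| IL-4 | 0.00 | 0.00 | 0.00 | 0.00 |
| IL-6 | 0.00 | -1.20 | 1.20 | -1.20 |
| IL-8 | 0.00 | 0.00 | 1.18 | 1.18 |
| IL-10 | 0.00 | 0.00 | 0.00 | 0.00 |
| IL-12 | 0.00 | 0.00 | 0.00 | 0.00 |
| IL-13 | 1.51 | -1.51 | 0.00 | 0.00 |
| IL-18 | 0.91 | 0.91 | 0.91 | 0.91 |
| IFN-g | 0.00 | 0.00 | 0.00 | 0.00 |
| TGF-b | -1.71 | 0.00 | 1.71 | -1.71 |
| CTLA-4 | -1.89 | -1.89 | 1.89 | -1.89 |
| GPR-83 | -1.69 | -1.69 | 1.69 | -1.69 |
| SMAD-7 | 0.00 | 0.00 | 0.00 | 0.00 |
| Net Effect | -1.29 | -5.38 | 10.15 | -5.98 |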
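The three steps on this slide amount to a weighted sum: score each gene product per phenotype (Step I), multiply each score by that gene product's quantitative value (Step II), and add up each phenotype column to get the net effect (Step III). The sketch below illustrates this for three gene products only. The Step I scores are read from the slide's scoring table; the quantitative values are an assumption, inferred from the slide's weighted table rather than stated on it, and ND is treated as contributing nothing.

```python
PHENOTYPES = ("Th1", "Th2", "Treg", "Inflammation")

# Step I: phenotype scores per gene product (None stands for ND, no data).
SCORES = {
    "IL-2": (1, None, 1, -1),
    "TGF-b": (-1, 0, 1, -1),
    "CTLA-4": (-1, -1, 1, -1),
}

# Assumed quantitative values, inferred from the slide's weighted table.
QUANT = {"IL-2": 1.58, "TGF-b": 1.71, "CTLA-4": 1.89}


def net_effect(scores, quant):
    """Steps II and III: weight each score and sum per phenotype."""
    totals = dict.fromkeys(PHENOTYPES, 0.0)
    for gene, row in scores.items():
        value = quant.get(gene)
        if value is None:
            continue  # no quantitative measurement for this gene product
        for phenotype, score in zip(PHENOTYPES, row):
            if score is not None:  # ND entries contribute nothing
                totals[phenotype] += score * value
    return totals


totals = net_effect(SCORES, QUANT)
```

With only these three gene products the Treg column comes out strongly positive and the Inflammation column strongly negative; the slide's full table sums all thirteen gene products in the same way.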
![Page 43: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/43.jpg)
[Figure: net effect (axis -20 to 60) by phenotype (Th-1, Th-2, T-reg, Inflammation) for L6 (R) and L7 (S). Inset: microscopic lesions, 5 mm scale.]
![Page 44: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/44.jpg)
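[Diagram comparing L7 Susceptible and L6 Resistant. Labels in extraction order: Pro T-reg, Pro Th-1, Anti Th-2, Pro CTL, Anti CTL, L7 Susceptible, Pro CTL, Anti CTL, L6 Resistant, Pro T-reg, Pro Th-2, Anti Th-1.]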
![Page 45: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/45.jpg)
![Page 46: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/46.jpg)
Concluding thoughts on functional modeling.
“By doing just a little every day, I can gradually let the task overwhelm me.”
Ashleigh Brilliant
![Page 47: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/47.jpg)
Bringing it all together…
• There is no one “correct” way; there is no “right” answer.
• Using multiple functional modeling strategies (e.g., GO, pathways, networks) can help with insights.
• Need to use biological knowledge to bring these different approaches together.
• Functional modeling is often iterative.
• Need to focus not only on what is known but what is new!
![Page 48: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/48.jpg)
Overview of Functional Modeling Strategy
[Flowchart. Yellow boxes represent AgBase tools; green boxes are non-AgBase resources.]
• Data types: Microarrays, Proteomics, RNASeq
• Tool boxes: ArrayIDer, Genome2seq, Blast2GO, GORetriever, GOanna, GOSlimViewer, AutoSlim
• Data boxes: Protein/Gene identifiers; GO annotations; Genes/Proteins with no GO annotations
• Pathways and network analysis: Ingenuity Pathways Analysis (IPA), Pathway Studio, Cytoscape, DAVID
• GO Enrichment analysis: Ingenuity Pathways Analysis (IPA), Pathway Studio, Cytoscape, DAVID, AgriGO, Onto-tools
![Page 49: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/49.jpg)
Functional Modeling Considerations
• Should I add my own GO?
  • use GOProfiler to see how much GO is available for your species
  • use GORetriever to find existing GO for your dataset
  • Does analysis tool allow me to add my own GO?
• Should I do GO analysis and pathway analysis and network analysis?
  • different functional modeling methods show different aspects about your data (complementary)
  • is this type of data available for your species (or a close ortholog)?
• What tools should I use?
  • which tools have data for your species of interest?
  • what type of accessions are accepted?
  • availability (commercial and freely available)
![Page 50: Strategies & Examples for Functional Modeling](https://reader033.vdocuments.net/reader033/viewer/2022051003/5681649c550346895dd67e71/html5/thumbnails/50.jpg)
Some Limitations
• Annotation is not complete.
  • not all the data is annotated
  • some gene products have no functional information
• Gene Ontology is only one aspect of functional modeling.
  • anatomy, tissue expression, phenotype, disease, etc
• Gene nomenclature – need to know what we are annotating!
• Functional modeling tools need to handle larger data sets (& multiple ontologies?).