Progress – Annotation

www.geneontology.org
www.ucl.ac.uk/functional-gene-annotation/neurological
Twitter: @UCLgene
[email protected]

Using Gene Ontology to characterise key players in Parkinson's disease

Paul Denny¹, Rebecca E Foulger¹, Marc Feuermann², David P Hill³,⁶, Paola Roncaglia⁴,⁶, Maria J Martin⁴, John Hardy⁵ and Ruth C Lovering¹

1. Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London, WC1E 6JF, UK
2. SIB, Swiss Institute of Bioinformatics, Geneva, Switzerland
3. The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine 04609, USA
4. European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
5. Department of Molecular Neuroscience & Reta Lila Weston Laboratories, Institute of Neurology, University College London, Queen Square, London WC1N 3BG, UK
6. The Gene Ontology Consortium www.geneontology.org/

Funded by: Parkinson’s UK, grant G-1307 (Co-grant holders: RC Lovering (PI), J Hardy, P Denny); NIH NHGRI U41 HG002273, Gene Ontology Consortium (Co-grant holders: JA Blake, JM Cherry, S Lewis, PW Sternberg and P Thomas).

How YOU can help

• We are keen to hear your suggestions for our workshop, and even more so if you would like to be involved. Please speak to us, or email [email protected].

• Search the GO annotations associated with your favourite autophagy gene, and let us know if you think any annotations are missing.

GENE ONTOLOGY Unifying Biology

www.ucl.ac.uk/functional-gene-annotation [email protected] @UCLgene

A critical application of GO annotations is in the functional analysis of high-throughput datasets, e.g. to group together proteins in an interactome (Figure 2).

Our main focus has been on the annotation of the proteins involved in 11 processes relevant to Parkinson’s (Table 1). Across all species, we have created over 6000 annotations to more than 1400 distinct proteins (including over 800 human proteins) from the curation of over 500 papers (as of 2nd June 2016).

Table 1. Parkinson’s-relevant processes & GO annotations for each (2nd June 2016). For each GO process term, the table gives the number of proteins manually associated with the term by the Parkinson’s UK-funded UCL team and the total number of proteins associated with the term. NB Some proteins are associated with a single term multiple times, based on evidence from different sources.
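The difference between protein counts and annotation counts in Table 1 can be illustrated with a small sketch. The Python below is illustrative only: the `Annotation` record, the protein identifiers and the reference labels are hypothetical placeholders, not data from this project. It shows how a protein annotated to the same term from two sources contributes one protein but two annotations, which is how an entry of the form "proteins (annotations)" can be read.

```python
from collections import defaultdict
from typing import NamedTuple


class Annotation(NamedTuple):
    """Hypothetical, minimal annotation record: one protein, one GO term, one source."""
    protein: str
    go_term: str
    reference: str


def summarise(annotations):
    """Return {go_term: (distinct_proteins, annotation_count)}."""
    proteins = defaultdict(set)
    counts = defaultdict(int)
    for ann in annotations:
        proteins[ann.go_term].add(ann.protein)  # a set ignores repeated proteins
        counts[ann.go_term] += 1                # every record counts as an annotation
    return {term: (len(proteins[term]), counts[term]) for term in counts}


# Placeholder records: PROTEIN_A is linked to the same term by two different papers.
EXAMPLE = [
    Annotation("PROTEIN_A", "mitophagy", "paper_1"),
    Annotation("PROTEIN_A", "mitophagy", "paper_2"),
    Annotation("PROTEIN_B", "mitophagy", "paper_3"),
]

for term, (n_proteins, n_annotations) in summarise(EXAMPLE).items():
    print(f"{term}: {n_proteins} ({n_annotations})")
```

Real GO annotation files carry more fields per record (for example evidence codes), so this sketch covers only the counting logic.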

We have worked with GO editors to create >360 new GO terms relevant to Parkinson’s.

Curation of the literature for a biological process often reveals the need for new GO terms, or for revisions of existing terms, including their definitions and placement in the ontology.

Improvements to an ontology domain benefit both those new to a field, and scientists who wish to analyse their datasets accurately.

The focus on autophagy and Wnt signalling has led to the creation of 65 new GO terms; two examples are:

• ‘parkin-mediated mitophagy in response to mitochondrial depolarization’: this term was created because autophagy of mitochondria (mitophagy) is abnormal in early-onset Parkinson’s, and a key step in mitophagy is triggered by the E3 ubiquitin ligase parkin. This term has been associated with 35 proteins in GO.

• ‘Wnt signalosome’: this term was added to improve the description of Wnt signalling components in GO, and to align our project with the Parkinson’s disease map (PD-map; minerva.uni.lu/MapViewer/), a database that offers a pathway view of Parkinson’s. This term has been associated with 23 proteins across species.

Progress – Ontology

Comprehensive annotation of all biological processes relevant to neurological diseases would support the identification of additional candidate genes involved in many pathological conditions. It would also enable researchers to interpret high-throughput datasets efficiently and to understand interaction networks better. Manual GO annotation is essential to creating the required resource, and is very time-consuming. However, these annotations have a significant impact on the quality of data interpretation.

Figure 2. Gene Ontology annotations associated with the PARK2 protein interactome. Protein interactors of human PARK2 (centre) analysed with BiNGO. Each node is a protein and each edge is an experiment. GO terms associated with each protein are indicated by the node colour (see Table 2 for colour key); white nodes indicate proteins that are not annotated to any of the selected GO terms. More specific terms are also enriched, e.g. ‘regulation of establishment of protein localization to mitochondrion’ is associated with 11 nodes, but these are not shown, for simplicity.

Further reading

• Using the Gene Ontology to Annotate Key Players in Parkinson's Disease. Foulger RE, Denny P, Hardy J, Martin MJ, Sawford T, Lovering RC. Neuroinformatics. Jan 29 (2016). PMID: 26825309

• Computational analysis of the LRRK2 interactome. Manzoni C, Denny P, Lovering RC, Lewis PA. PeerJ. (2015) PMID: 25737818

• Cytoscape 2.8: new features for data integration and network visualization. Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T (2011) Bioinformatics 27: 431-432. PMID: 21149340

• BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Maere S, Heymans K, Kuiper M (2005) Bioinformatics 21: 3448-3449. PMID: 15972284

Table 2. Selection of GO terms significantly enriched in the PARK2 interactome. The human PARK2 interactome was analysed using the BiNGO plugin for Cytoscape v3.3.0, with the May 2016 GO annotation dataset and the March 2016 ontology files. Coloured cells provide a key for the terms in Figure 2.
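For readers unfamiliar with what "significantly enriched" means here, the sketch below shows a one-sided hypergeometric overrepresentation test, one of the tests BiNGO offers. The poster does not state which test or multiple-testing correction was used for Table 2, so this is an assumption chosen for illustration. The function name and all numbers in the example are hypothetical, not results from the PARK2 analysis.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """One-sided hypergeometric p-value, P(X >= hits).

    population: proteins in the reference set
    annotated:  proteins in the reference set that carry the GO term
    sample:     proteins in the interactome being tested
    hits:       interactome proteins that carry the GO term
    """
    max_hits = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of every outcome at least as extreme as the observed one.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Hypothetical numbers: 11 of 100 interactors carry a term that is held by
# 200 of 20000 proteins in the reference set.
p_value = hypergeom_enrichment_p(population=20000, annotated=200, sample=100, hits=11)
print(f"p = {p_value:.3g}")
```

In practice one p-value is computed per GO term, so a multiple-testing correction is normally applied before a term is called enriched.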

Functional Analysis and Future Work

The Parkinson’s GO annotation project: aims, priorities

The primary aims of this project are to:

• Modify and extend the Gene Ontology to describe the functions of products of genes involved in processes relevant to Parkinson’s.

• Provide high-quality GO annotations (associations between a gene product and a GO term) for Parkinson’s-relevant proteins.

• These aims are also combined in a special project, focused on autophagy.

• This is the first annotation effort to focus on Parkinson’s disease, and we have established collaborations with local and international neurological researchers to guide our priorities.

• We extract data from primary papers and reviews to attach GO terms to proteins. Our primary focus is human, but we also capture information from model organisms including mouse, rat and fly.

Our annotation priorities include:

• Our current focus: products of genes associated with Parkinson’s, including causative genes and risk genes. Includes 48 high-priority genes, identified through GWAS and genetic analyses.

• Our ongoing focus: Parkinson’s UK-funded research. Within each topic, and for each priority protein, we prioritise publications funded by Parkinson’s UK.

• Our 2014-2015 focus: cellular pathways known to be dysregulated in Parkinson’s disease, including mitophagy, autophagy, oxidative stress response, unfolded protein response, regulation of neuron death, Wnt-regulated dopaminergic neuron differentiation, and synaptic vesicle transport.

The work described here is a collaboration between members of the Gene Ontology Consortium at University College London (UCL), the European Bioinformatics Institute (EMBL-EBI), the Jackson Laboratory (MGI) and the Swiss Institute of Bioinformatics (SIB).

Introduction to GO

• The Gene Ontology (GO) project is a collaborative effort to provide consistent descriptions of gene products across all kingdoms of life, and is a key resource for researchers wishing to understand the biological role of a gene product, or a list of gene products.

• GO contains three structured controlled vocabularies (ontologies; see e.g. Figure 1) that describe gene products in terms of their associated molecular functions, biological processes, and cellular components, in a species-independent manner.

• There are now over 42,000 terms describing a wide range of concepts to differing levels of specificity.

Figure 1. ‘Parkin-mediated mitophagy’ & selected ancestor terms in the Gene Ontology. Blue lines indicate is_a relations to the parent terms. Created using OBO-Edit 2.3.1.
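The idea of ancestor terms linked by is_a relations can be sketched as a walk over a small directed graph. The Python below is a minimal illustration under stated assumptions: the `IS_A` map is a toy, simplified hierarchy invented for this example and is not a copy of the real GO graph shown in Figure 1, and `ancestors` is a hypothetical helper rather than part of any GO tool.

```python
from collections import deque

# Toy child -> parents map. These edges are simplified assumptions for
# illustration; consult the ontology itself for the real relations.
IS_A = {
    "parkin-mediated mitophagy": ["mitophagy"],
    "mitophagy": ["autophagy"],
    "autophagy": ["cellular process"],
    "cellular process": ["biological_process"],
    "biological_process": [],
}


def ancestors(term, is_a):
    """Collect every term reachable from `term` by following is_a edges upwards."""
    seen = set()
    queue = deque(is_a.get(term, []))
    while queue:
        parent = queue.popleft()
        if parent in seen:
            continue  # a term can be reached by several paths in a real ontology
        seen.add(parent)
        queue.extend(is_a.get(parent, []))
    return seen


print(sorted(ancestors("parkin-mediated mitophagy", IS_A)))
```

A common convention in GO-based analyses is that a protein annotated to a specific term is also treated as associated with that term's ancestors, which is why precise term placement matters.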

Human proteins (annotations)

Priority process                                                  Parkinson’s UK  Total
autophagy                                                         299 (417)       780 (1352)
mitophagy                                                         167 (183)       210 (242)
response to oxidative stress                                      30 (68)         625 (1019)
regulation of neuron death                                        25 (38)         300 (445)
ERAD pathway (ER stress associated protein degradation)           49 (98)         103 (220)
cellular response to unfolded protein (UPR)                       32 (50)         170 (263)
intrinsic apoptotic signaling pathway in response to ER stress    26 (37)         64 (104)
synaptic vesicle transport                                        25 (39)         196 (298)
Wnt signaling pathway                                             72 (126)        764 (1917)
dopaminergic neuron differentiation                               21 (35)         38 (67)
retrograde transport, endosome to Golgi (incl. retromer complex)  9 (10)          93 (124)

Over the next 6 months we will:

• Fully annotate 48 high-priority genes that are candidate genes for GWAS loci.

• Curate processes related to neuroinflammation, an important aspect of Parkinson’s that is currently poorly represented in GO.

• Further assess the impact of our improvements to annotation and ontology.

In the longer term, we plan to:

• Annotate proteins involved in other key processes relevant to Parkinson’s, e.g. dopamine metabolism and lysosomal processes.

• Fully annotate further high-priority genes identified recently in meta-GWAS.

• Improve annotation of the novel candidate genes being uncovered in exome and genome sequencing.