

FINDING CANDIDATE GENES FOR DEVELOPMENTAL DYSLEXIA

BACKGROUND

Developmental Dyslexia is a learning disorder that involves difficulty in reading, spelling, writing comprehension, and phonological processing of words. It is one of the most common neurocognitive disorders, affecting up to 10% of people. It is thought to be caused by brain development problems in a foetus.

Symptoms include trouble with coordination and motor skills, lack of fluency, poor visual gestalt, and trouble with decoding words.

The cerebral cortex has been shown to play a role in dyslexia, with regions including the frontal lobe, temporal lobe, occipital lobe, and parietal lobe.

- The frontal lobe controls speech, the ability to read, and phonological processing of words. It contains Broca’s area, situated in the inferior frontal gyrus, which is involved in speech production and articulation.
- The temporal lobe is involved in verbal memory and contains Wernicke’s area, which is involved in the comprehension and expression of language and word analysis. The occipital lobe is involved in the identification of visual information.
- The parietal lobe links spoken and written language to memory and contains the angular gyrus, which helps in memory retrieval and number processing.

METHODS

A scientific report, ‘Identification of NCAN as a Candidate Gene for Dyslexia’, published on August 24, 2017 on nature.com (https://www.nature.com/articles/s41598-017-10175-7.pdf), was used to identify previously suggested candidate genes for Developmental Dyslexia.

STRING (https://string-db.org), a database consisting of protein interaction networks, was used to identify other potential genes of interest.

DAVID (https://david.ncifcrf.gov/), a bioinformatics tool, and GeneWeaver (http://geneweaver.org), a database that consists of experimentally validated gene sets and lists, were used to determine whether any of the genes could be associated with Dyslexia by finding their Gene Ontology classifications.

KEGG (https://www.genome.jp/kegg/), a collection of databases dealing with biological pathways, was used to find whether any of the genes of interest participate in pathways relating to Dyslexia.

The BrainSpan Atlas of the Developing Human Brain (http://www.brainspan.org), a database used for studying brain development, was used to find heatmaps of several brains donated to the Allen Brain Institute which depicted gene expression data for the chosen genes of interest. It was also used to find correlates for the genes of interest, that is, genes with similar expression patterns in the chosen brain regions (DFC, VFC, MFC, OFC, M1C, IPC, A1C, V1C, HIP, STR, MD, CBC, ITC, S1C, STC) and time period (8 pcw, post-conception weeks, to 4 years).

1. IDENTIFYING GENES OF INTEREST : PROTEIN INTERACTION NETWORKS

Ten previously reported Developmental Dyslexia susceptibility genes obtained from the research paper (DYX1C1, ROBO1, CYP19A1, CTNND2, FOXP2, GRIN2B, CNTNAP5, KIAA0319, GCFC2, DCDC2) were entered into the STRING database to identify other genes that have been experimentally determined to associate with these ten genes.
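The STRING lookup described above was done through the website, but the same kind of query can be scripted. The following is a minimal sketch, assuming STRING's public REST endpoint `interaction_partners` with its `identifiers`, `species` and `required_score` parameters; the function names, the default score cut-off of 400 and the two-gene usage example are illustrative choices, not taken from the poster.

```python
from urllib.parse import urlencode

# Base URL of the STRING REST API (assumption: the public endpoint).
STRING_API = "https://string-db.org/api"


def build_partner_query(genes, species=9606, required_score=400, output_format="tsv"):
    """Build a STRING 'interaction_partners' request URL for a list of gene symbols."""
    params = {
        # STRING expects multiple identifiers separated by a carriage return (%0d).
        "identifiers": "\r".join(genes),
        # NCBI taxonomy ID; 9606 is Homo sapiens.
        "species": species,
        # Minimum combined score on STRING's 0-1000 scale (illustrative default).
        "required_score": required_score,
    }
    return f"{STRING_API}/{output_format}/interaction_partners?{urlencode(params)}"


def parse_partners(tsv_text):
    """Turn a tab-separated STRING response into a list of dicts keyed by the header row."""
    lines = [line for line in tsv_text.strip().splitlines() if line]
    header = lines[0].split("\t")
    return [dict(zip(header, line.split("\t"))) for line in lines[1:]]


if __name__ == "__main__":
    # Print the URL only; fetching it (e.g. with urllib.request) is left to the reader.
    print(build_partner_query(["DYX1C1", "ROBO1"]))
```

The response contains one row per interaction, so restricting the result to experimentally supported partners would mean filtering the parsed rows on the experimental-evidence column of the response.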

RESULTS

Out of the ten genes, only 6 of them (DYX1C1, ROBO1, CTNND2, GCFC2, GRIN2B, DCDC2) were found to associate with other genes, either by direct physical binding or by indirect interaction such as participation in the same metabolic pathway or cellular process. All the genes shown in the networks were used for further analysis.

[Figure: STRING interaction networks for CTNND2, GRIN2B, DYX1C1, GCFC2, ROBO1 and DCDC2]

2. IDENTIFYING TOP GENES OF INTEREST : GENE FUNCTIONS

PART 1- FINDING GENE ONTOLOGY CLASSIFICATIONS

An analysis of these genes by finding their Gene Ontology classifications from DAVID and GeneWeaver revealed that 83% of the genes are involved in processes related to dyslexia, such as learning, memory and development of the nervous system, the brain, and the cerebral cortex. Most of these genes are expressed in several regions of the brain relating to dyslexia, such as the cortex, cerebellum, cerebrum and hippocampus.

Genes of general interest are highlighted in orange while the top 5 genes of interest are highlighted in blue.

PART 2- INVOLVEMENT IN PATHWAYS

Each of the 10 previously suggested DD candidate genes is implicated in brain development processes, such as neuronal migration, axonal guidance and ciliary functions. Interestingly, out of the 49 genes displayed above (excluding the original 10), 4 genes were found to be involved in axon guidance, as shown below. Furthermore, 11 genes turned up in the Wnt signaling pathway and 8 genes turned up in the MAPK signaling pathway; both pathways are involved in axon guidance.

The axon guidance pathway and some of the genes involved in the process. 4 genes of interest (circled) are a part of the list.

The top 5 genes of interest were all found to have roles in axon guidance, either by being indirectly involved (MAPK signaling pathway and Wnt signaling pathway) or directly involved in the process. CTNNB1 was found to be involved in neuron migration.
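Counting how many genes of interest fall into each pathway comes down to intersecting the gene list with each pathway's member list. The sketch below assumes the pathway gene sets have already been exported (for example from KEGG) as plain lists of gene symbols; the function name and the input layout are hypothetical.

```python
def count_pathway_hits(genes_of_interest, pathways):
    """Return, for each pathway, the sorted genes of interest that it contains.

    genes_of_interest: iterable of gene symbols.
    pathways: dict mapping a pathway name to an iterable of member gene symbols.
    Symbols are compared case-insensitively.
    """
    wanted = {gene.upper() for gene in genes_of_interest}
    return {
        name: sorted(wanted & {member.upper() for member in members})
        for name, members in pathways.items()
    }
```

The length of each returned list gives the per-pathway count reported in the text, and the lists themselves show which genes produced it.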

PART 3- COMBINED SCORES

The combined scores of the genes of interest with respect to the initially identified susceptibility genes were obtained from the STRING database. High scores indicate that the genes of interest are highly associated with the identified DD candidate genes and may therefore be potential DD candidate genes themselves.

This shows the combined score of CTNNB1 with respect to CTNND2 (obtained from STRING).
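For readers unfamiliar with the combined score: STRING merges the scores of its individual evidence channels (experiments, databases, text mining and so on) into one value. The sketch below follows the scheme STRING describes in its documentation as far as the editor understands it, where a prior of 0.041 is removed from each channel, the channels are combined as independent probabilities, and the prior is added back once. The homology correction that STRING applies to some channels is deliberately omitted, so treat this as an approximation rather than STRING's exact computation.

```python
from math import prod

# Prior probability of a random interaction (assumption: value from the STRING FAQ).
PRIOR = 0.041


def combined_score(channel_scores, prior=PRIOR):
    """Combine per-channel scores (each in the range 0-1) into one score."""
    # Remove the prior from each channel, clamping at zero.
    without_prior = [max(score - prior, 0.0) / (1.0 - prior) for score in channel_scores]
    # Combine the channels as independent probabilities.
    combined = 1.0 - prod(1.0 - score for score in without_prior)
    # Add the prior back once.
    return combined + prior * (1.0 - combined)
```

With a single channel the function returns that channel's score unchanged, and each additional supporting channel can only raise the result.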

3. GENE EXPRESSION PROFILE

CALM2, DLG4, GRIN1, CTNNB1 and SLIT2 were entered into the Allen Brain Atlas gene search to obtain heat maps of their gene expression profiles. The heat maps were filtered to show gene expression only in certain brain structures (DFC, VFC, MFC, OFC, M1C, IPC, A1C, V1C, HIP, STR, MD, CBC, ITC, S1C, STC) and the time period of interest (8 pcw to 5 years). The columns in the heat map are sorted first by donor, then by structure. The light blue, green and yellow parts of the map indicate high gene expression and dark blue indicates low gene expression.

The heatmaps above compare the gene expression in RPKM of the top genes of interest to DYX1C1 and GRIN2B (reported DD susceptibility genes). The 4 genes of interest (CALM2, DLG4, GRIN1, CTNNB1) that had higher gene expression values than DYX1C1 and GRIN2B were considered for further analysis.
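The comparison against the two reference genes can be made explicit in a few lines. The poster does not state the exact criterion, so the sketch below assumes one possible reading: a candidate passes if its mean expression across all selected samples exceeds the mean of every reference gene. The function name and the input layout (a dict of gene symbol to a list of expression values) are hypothetical.

```python
from statistics import mean


def genes_above_references(expression, references):
    """Return the candidate genes whose mean expression exceeds that of every reference gene.

    expression: dict mapping gene symbol -> list of expression values (one per sample).
    references: gene symbols in `expression` to use as the baseline.
    """
    # Assumed criterion: compare sample means; the highest reference mean is the bar to clear.
    bar = max(mean(expression[ref]) for ref in references)
    return sorted(
        gene
        for gene, values in expression.items()
        if gene not in references and mean(values) > bar
    )
```

Other criteria, such as comparing medians or requiring higher expression at every developmental stage, would need only the two `mean` calls to be swapped out.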

4. ANALYZING GENE EXPRESSION IN REGIONS OF THE CEREBRAL CORTEX

Using the data obtained from the Allen Brain Atlas, graphs were made showing the gene expression in regions of the cerebral cortex. The x-axis shows the brain donor and the age of the donor. The y-axis shows the gene expression in RPKM. Each colored line represents a region of the cerebral cortex.
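The expression unit on the y-axis, RPKM (reads per kilobase of transcript per million mapped reads), normalizes a raw read count for both gene length and sequencing depth. Its standard definition is given here for reference; the symbols are the editor's notation and do not come from the poster.

```latex
\[
\mathrm{RPKM}_g \;=\; \frac{10^{9} \cdot C_g}{N \cdot L_g}
\]
% C_g : number of reads mapped to gene g
% N   : total number of mapped reads in the sample
% L_g : length of gene g (its exons) in base pairs
```

Because of this normalization, values can be compared between genes of different lengths and between samples sequenced to different depths.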

All four genes were found to have consistently high gene expression in all regions of the cerebral cortex throughout a child’s developing years, suggesting that their functions are closely tied to those of the cerebral cortex (reading, language and phonological processing of words). CALM2, CTNNB1 and DLG4 were found to have consistent gene expression over time, whereas GRIN1 showed a gradual increase in gene expression in the cerebral cortex.

5. EVIDENCE OF INTERACTIONS BETWEEN THE GENES OF INTEREST

Three of the four candidate genes were found to interact with each other and with GRIN2B, an identified DD candidate gene. Interactions between these genes indicate that they are commonly associated with one another or that they frequently work together to perform a similar function. GRIN1, DLG4 and CALM2 were picked for further analysis.

6. FINDING CORRELATES

GRIN1, DLG4 and CALM2 were entered into the Allen Brain Atlas using the correlative search. The search was filtered to show gene correlates with similar expression patterns over a subset of brain regions and developmental stages. Gene Ontology classifications of all correlate genes with r values above 0.8 were obtained from the DAVID Functional Annotation Table. The correlates found to be involved in processes related to dyslexia, such as learning, memory, development of the cerebral cortex, axon guidance and balance, are displayed below.
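The correlative search was run inside the Allen Brain Atlas, but the selection rule (keep genes with r above 0.8 relative to a seed gene) is easy to reproduce on exported expression profiles. The following is a minimal sketch, assuming r is Pearson's correlation coefficient computed over matched samples (same regions and stages in the same order); the function names and the input layout are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def correlates(seed, profiles, threshold=0.8):
    """Return (gene, r) pairs with r above the threshold, strongest first.

    profiles: dict mapping gene symbol -> list of expression values,
    with every list covering the same samples in the same order.
    """
    scored = [
        (gene, pearson_r(profiles[seed], values))
        for gene, values in profiles.items()
        if gene != seed
    ]
    return sorted(
        ((gene, r) for gene, r in scored if r > threshold),
        key=lambda pair: pair[1],
        reverse=True,
    )
```

The resulting gene list is what would then be submitted to DAVID for functional annotation.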

Ayati Sharma, The Shri Ram School Moulsari, Gurugram, India and BioScience Project, Wakefield, MA

CONCLUSION

The correlates for CALM2 were generally found to be involved in neuronal and cellular processes like axon guidance and synaptic transmission. DLG4 had the fewest correlates for the given brain regions and time period; its correlates aided in a variety of processes, including learning and memory. The correlates of GRIN1 were likewise found to be actively involved in cellular and neuronal processes, learning, memory, development, and motor and coordination skills.

The top 6 correlates of interest, based on their r values and involvement in functions directly related to dyslexia, are highlighted in blue.

• Using known risk factor genes, a bioinformatics approach was used to identify other candidate genes for developmental dyslexia.

• A variety of databases including STRING, DAVID, GeneWeaver, KEGG and The Allen Brain Atlas were used.

• The top genes of interest, based on their gene ontology classifications, involvement in pathways and combined scores with reported DD susceptibility genes, were found to be GRIN1, DLG4, SLIT2, CTNNB1 and CALM2.

• Evidence was found that the genes of interest were actively involved in the activities of the cerebral cortex (reading, learning, language and phonological processing, verbal memory).

• SLIT2 does not have high gene expression in the cerebral cortex but, based on its gene ontology classifications, it could be a valid candidate gene for dyslexia.

• Gene interaction networks were found between GRIN1, DLG4 and CALM2.

• CTNNB1 does not interact with the other top genes of interest but, according to the findings, there is a high likelihood that it is related to dyslexia.

• Gene ontology classifications of the correlates of the top 3 genes of interest (GRIN1, CALM2 and DLG4) showed that all correlate genes were involved in learning, memory, formation and development of regions of the brain, motor and coordination skills as well as cellular and neuronal processes related to dyslexia. All of these correlates could be potential candidate genes for Developmental Dyslexia.

• From amongst the correlates of the top 3 genes of interest, NDRG4, PPP1R9B, JPH4, and CAMK2B have been shown to be highly associated with processes and regions relating to Developmental Dyslexia.

• Overall, the results indicate that GRIN1, CALM2 and DLG4 are highly associated with dyslexia-related phenotypes and should be further investigated to identify their role in developmental dyslexia.

7. CORRELATE GENES – GENE EXPRESSION IN THE CEREBRAL CORTEX

The columns in the heat map are sorted first by donor, then by brain structure. The light blue, green, yellow and orange parts of the map indicate high gene expression and dark blue indicates low gene expression. Based on the heat maps, it can be inferred that NDRG4 is the most highly expressed in the cerebral cortex, especially between the ages of 4 months and 1 year, followed by PPP1R9B, JPH4 and CAMK2B. PPP1R9B shows a gradual increase in gene expression over the years, JPH4 is expressed in the cerebral cortex more or less consistently, and CAMK2B shows a sharp increase in gene expression in the cortex. The gene expression of both CISD1 and SHANK1 is low and constant.

The top 6 correlates of interest were entered into the Allen Brain Atlas gene search to obtain heat maps showing their gene expression in regions of the cerebral cortex over the time period of interest (8 pcw to 5 years).