

Genomic Analyses of Transport Proteins in Mycobacterium tuberculosis and Mycobacterium leprae

Ji-Won Youm and Milton H. Saier Jr.

Section of Molecular Biology, University of California, San Diego, La Jolla, CA 92093

Introduction

It had long been believed that tuberculosis had been eradicated, but recent findings of virulent, multidrug-resistant strains indicate otherwise. Tuberculosis has resurfaced as a serious pandemic and has thus drawn much attention from the scientific community. Leprosy has likewise been a major public health problem.

Tuberculosis and leprosy are caused by Mycobacterium tuberculosis (Mtu) and Mycobacterium leprae (Mle), respectively. While Mle is an intracellular bacterium, Mtu is an obligate aerobe capable of both intracellular and extracellular life. Previous genomic analyses have revealed that Mle lost many genes through evolution, some of which are presumably required for extracellular life, since Mle is restricted to an intracellular lifestyle whereas Mtu can also live extracellularly. Various events such as deletions, missense and nonsense mutations, and translocations are thought to have contributed to the inactivation of genes. The remnants left by such evolutionary events are pseudogenes: inactivated genes that no longer produce functional proteins. Pseudogenes are known to be present in Mtu and are very prevalent in Mle. We are interested in pseudogenes because the presence of a pseudogene in Mle together with a functional counterpart in Mtu suggests that the gene is "on its way out" because it is unnecessary for intracellular life.

My research focuses on the transport proteins of these two major parasitic bacteria. Comparative analysis of the transport proteins encoded within the two proteomes, followed by analysis of the putative functions of the pseudogenes, may shed light on the transport proteins required for intracellular life and/or extracellular life. The former may open up possibilities for synthesizing drugs that inhibit growth entirely, and the latter may aid in curbing the contagious nature of tuberculosis.

My project is not yet complete. Table 1 reflects an incomplete set of data, but the protein-specific figures accurately depict the selected findings.

Computer Methods

The complete protein sequences of Mtu and Mle were extracted from the NCBI non-redundant database. Computer-aided analyses were conducted to retrieve all proteins encoded within the genomes of Mtu and Mle that are recognizably homologous to transport system constituents included in the Transporter Classification Database (TCDB) [1,2]. Briefly, all proteins were searched in an automated manner (using BLASTP, tBLASTn, and NCBI PSI-BLAST) against the Transporter Classification and NCBI databases. Additional databases used for protein functional analysis were the non-redundant SWISSPROT and TrEMBL protein sequence databases. Several protein pattern databases (the Conserved Domain Database at NCBI and Pfam) were also used. Charge-bias analyses of membrane protein topology were performed using the TMHMM [6] and WHAT [7] programs. A side-by-side comparison of transport proteins within the two genomes was performed using the Integrated Microbial Genomes (IMG) system. We are currently about to incorporate PSI-FI in order to search for pseudogenes systematically and consistently.
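The automated search step described above can be illustrated with a minimal sketch. It assumes the BLAST results were saved in the standard 12-column tabular format of BLAST+ (`-outfmt 6`). The e-value cutoff of 1e-4, the function name `best_hits`, and the sample rows are assumptions made for illustration; the poster does not state the thresholds or scripts actually used.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-4):
    """Return, for each query, its lowest-e-value hit that passes the cutoff.

    The cutoff value is an assumed default, not one taken from the poster.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["pident"] = float(hit["pident"])
        if hit["evalue"] > max_evalue:
            continue  # not recognizably homologous under this cutoff
        current = best.get(hit["qseqid"])
        if current is None or hit["evalue"] < current["evalue"]:
            best[hit["qseqid"]] = hit
    return best


# Hypothetical rows for illustration only; these are not data from this study.
SAMPLE = (
    "queryA\ttc_entry_1\t33.0\t260\t165\t3\t5\t264\t12\t271\t2e-40\t150\n"
    "queryA\ttc_entry_2\t28.0\t250\t175\t4\t10\t259\t20\t269\t1e-20\t95\n"
    "queryB\ttc_entry_3\t22.0\t120\t90\t2\t5\t124\t9\t128\t0.5\t30\n"
)

if __name__ == "__main__":
    for query, hit in best_hits(io.StringIO(SAMPLE)).items():
        print(query, hit["sseqid"], hit["evalue"])
```

Keeping only the best hit per query is one simple design choice; a real pipeline would likely keep several hits per protein for manual inspection, as the per-hit table later in this poster suggests.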

References

1. Busch W and Saier MH Jr. (2002) The Transporter Classification (TC) System, 2002. CRC Crit. Rev. Biochem. Mol. Biol. 37:287-337.

2. Tran CV, Yang NM, and Saier MH Jr. (2003) TC-DB: An architecture for membrane transport protein analysis. Proc. 2nd Intl. IEEE Computer Society Computational Systems Bioinformatics Conference, p. 658.

3. Lerat E and Ochman H (2004) Exploring the outer limits of bacterial pseudogenes. Genome Research 14:2273-2278.

4. Lerat E and Ochman H (2005) Recognizing pseudogenes in bacterial genomes. Nucleic Acids Research 33:3125-3132.

5. Gerstein M and Zheng D (2006) The Real Life of Pseudogenes. Scientific American August:49-55.

6. Krogh A, Larsson B, von Heijne G, and Sonnhammer EL (2001) Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J. Mol. Biol. 305:567-580.

7. Zhai Y and Saier MH Jr. (2001) A web-based program (WHAT) for the simultaneous prediction of hydropathy, amphipathicity, secondary structure and transmembrane topology for a single protein sequence. J. Mol. Microbiol. Biotechnol. 3:501-502.

Results

GENE FUSIONS

Nonetheless, the results of my scripts and subsequent analyses have already revealed that M. tuberculosis abounds with examples of gene fusions. For example, a drug transporter (15843349, 2.A.1.3.12) was found to consist of an N-terminal fragment of the native transport protein fused to a regulatory region D. Other transport proteins showing fusions include a virulence factor (15843543, 2.A.66.4.1), a threonine export carrier (15843358, 2.A.79.1.1), and an arsenical resistance protein/arsenate reductase (15842183, 2.A.59.1.2). Further analyses revealed that fusions of regulatory domains, such as the cAMP-binding domain and various phosphatases (as in the case of 15842183), are rather prevalent in M. tuberculosis. This is consistent with the notion that evolution tends toward complexity and that mycobacteria, like humans, fuse related genes together.

There were other interesting discoveries as well. 3.A.1.5.2, which belongs to the Peptide/Opine/Nickel Uptake Transporter (PepT) Family, is known to be a 5-component transport system. That is, the transport system requires 5 different proteins encoded by 5 different genes. The genes encoding proteins that work in concert towards a specific goal, such as a transport mechanism, are usually located within the same operon. What was particularly interesting is that while four of these five proteins showed significant similarity scores, one of them (15843277) did not show significant similarity to its expected match: the dipeptide transport ATP-binding protein DppD, found in Bacillus subtilis. This suggests that a gene different from those of the characterized ABC systems was recruited for energization of this mycobacterial transporter. However, it is not yet clear whether this 5th gene is essential for proper function of the transporter system.

The most striking finding concerned 3.A.2.1.2, an F-type ATP synthase. In M. tuberculosis, a delta subunit is fused to the C-terminus of a b subunit and, in contrast to all known F-type ATPases, there are 3 b subunits. It is possible that evolutionary pressure to make the genome more compact, and the transcription and translation of genes more efficient, led to these fusion events.

PSEUDOGENES

A very interesting putative pseudogene, encoding the protein of interest (gi: 15841200), was found in M. tuberculosis, as shown in Fig. 1. It appears to derive from the anaerobic, respiratory, membrane-bound nitrate reductase (5.A.3.1.2) of the Prokaryotic Molybdopterin-containing Oxidoreductase (PMO) Family. The nitrate reductase system has 3 components. In the order in which they are transcribed in the operon, these are the alpha chain (1245 aas, the hydrophilic component that reduces NO3- to NO2-), the beta chain (512 aas, the hydrophilic component that has 4 iron-sulfur centers), and the gamma chain (225 aas, the 5 TMS hydrophobic component that anchors the alpha and beta chains). Assembly of this system is aided by a chaperone protein, the delta chain (narJ, 206 aas, encoded between the genes for the beta and gamma chains). The first 214 aas of the protein of interest (gi: 15841200, 652 aas, 5 TMSs) show significant sequence similarity to the first 214 aas of the alpha chain, and the last 250 aas (containing the 5 TMSs) align to the entire gamma chain, which also has 5 TMSs. Since the middle region aligns to the first three quarters of the delta chain, there must have been at least two deletion events to give rise to this putative pseudogene. It will be interesting to see whether this pseudogene is still present in M. leprae or whether the gene was deleted in its entirety.
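The lengths quoted above can be checked with a little arithmetic. The sketch below uses only numbers stated in the text (652 aas for the protein of interest, 214 aas at the N-terminus, 250 aas at the C-terminus, 1245 aas for the alpha chain, 206 aas for the delta chain). The function and variable names are illustrative, and the exact boundaries of the middle region are not given in the source, so only its maximum possible length is computed.

```python
# Lengths in amino acids, as quoted in the text.
QUERY_LEN = 652          # protein of interest (gi: 15841200)
N_TERMINAL_LEN = 214     # region similar to the start of the alpha chain
C_TERMINAL_LEN = 250     # region containing the 5 TMSs
ALPHA_LEN = 1245
DELTA_LEN = 206


def middle_region_length(query_len, n_term, c_term):
    """Residues left between the N-terminal and C-terminal aligned regions."""
    return query_len - n_term - c_term


def fraction_retained(aligned_len, chain_len):
    """Fraction of a full-length chain represented in the aligned region."""
    return aligned_len / chain_len


if __name__ == "__main__":
    middle = middle_region_length(QUERY_LEN, N_TERMINAL_LEN, C_TERMINAL_LEN)
    print("middle region (aas):", middle)
    # About three quarters of the delta chain, as described in the text.
    print("3/4 of delta chain (aas):", 0.75 * DELTA_LEN)
    print("alpha chain retained: {:.1%}".format(
        fraction_retained(N_TERMINAL_LEN, ALPHA_LEN)))
```

The middle region (188 aas) is long enough to hold roughly three quarters of the 206-aa delta chain (about 155 aas), and only about 17% of the alpha chain is represented, which fits the interpretation that large deletions produced this composite sequence.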

Acknowledgement

I would like to thank Dr. Saier for his endless guidance, as well as Dorjee Tamang and Ming Zheng for their technical support. Also, I would like to thank Calit2 for their financial support and general research advice.

Discussion

Query gi # 15841821
Hit #: 1
TC #: 3.A.1.1.7
E-value: 8.00E-42
Query length: 290
Subject length: 300
TMS, query/subject: 6 (TC 8)
Identity/Score: 35
Query alignment: 7~276
Subject alignment: 17~290
Query TMS locations: [14,38] [69,93] [106,127] [154,178] [209,231] [252,276]
Subject TMS locations: [23,47] [81,105] [114,131] [148,172] [181,203] [212,229] [238,262] [271,295]
NCBI annotation: sugar ABC transporter, permease protein
TCDB annotation: INNER MEMBRANE PROTEIN MALF - Thermococcus litoralis.
Query sequence: MRDAPRRRTALAYALLAPSLVGVVAFLLLPILVVVWLSLHRWDLLGPLRYVGLTNWRSVLTDSGFADSLVVTAVFVAIVVPAQTVLGLLAASLLARRLPGTGLFRTLYVLPWICAPLAIAVMWRWILAPTDGAISTVLGHRIEWLTDPGLALPVVSAVVVWTNVGYVSLFFLAGLMAIPQDIHNAARTDGASAWQRFWRITLPMLRPTMFFVLVTGIISAAQVFDTVYALTGGGPQGSTDLVAHRIYAEAFGAAA
Subject sequence: MEGKIMDNNLTSKLKYREAKLGYLMILPLLTVVLVFIILPVMGTFWISLHRDVTFIPEKPFVGLRNYLRVLSAREFWYSTFVTVSFSFVSVSLETILGLSFALILNERLKGRGVLRAIVLIPWAVPTIISARTWELMYNYSYGLFNWILSILGVSPVNWLGTPISAFFAIVIADVWKTTPLMTLLLLAGLQAIPQDLYEAALIDGASMFERFKSITLPLLKPVLIVALILRTIDALRVFDIIYVLTGGGPGGATT
SwissProt: O51924
Query location: 2584684..2585556
System: 3 comp

Query gi # 15841822
Hit #: 5
TC #: 3.A.1.1.7
E-value: 8.00E-30
Query length: 274
Subject length: 278
TMS, query/subject: 6
Identity/Score: 30
Query alignment: 20~274
Subject alignment: 19~276
Query TMS locations: [11,35] [69,93] [106,129] [138,156] [181,204] [241,260]
Subject TMS locations: ?
NCBI annotation: sugar ABC transporter, permease protein
TCDB annotation: INNER MEMBRANE PROTEIN MALG (TREHALOSE/MALTOSE TRANSPORT INNER MEMBRANE PROTEIN) - Thermococcus litoralis, and Pyrococcus furiosus.
Query sequence: MSSPSRVSNTAVYAVLTIGAVITLSPFLLGLLTSFTSAHQFATGTPLQLPRPPTLANYADIADAGFRRAAVVTALMTAVILLGQLTFSVLAAYAFARLQFRGRDALFWVYVATLMVPGTVTVVPLYLMMAQLGLRNTFWALVLPFMFGSPYAIFLLREHFRLIPDDLINAARLDGANTLDVIVHVVIPSSRPVLAALAMITVVSQWNNFMWPLVITSGHKWRVLTVATADLQSRFNDQWTLVMAATTVAIVPLIA
Subject sequence: ?
SwissProt: Q9R9Q5
Query location: 2585543..2586367
System: 3 comp

Query gi # 15841823
Hit #: 2
TC #: 3.A.1.1.7
E-value: 2.00E-06
Query length: 426
Subject length: 450
TMS, query/subject: 1
Identity/Score: 24
Query alignment: 59~369
Subject alignment: 66~389
Query TMS locations: [7,30]
Subject TMS locations: ?
NCBI annotation: sugar ABC transporter, sugar-binding protein
TCDB annotation: TREHALOSE/MALTOSE BINDING PROTEIN - Thermococcus litoralis, and Pyrococcus furiosus.
Query sequence: MTRPRQSTLVATALVLVAILLGVTAVLLGLSAEPRGGKIVVTVRLWDEPIAAAYRQSFAAFTRSHPDIEVRTNLVAYSTYFETLRTDVAGGSADDIFWLSNAYFAAYADSGRLMKIQTDAADWEPAVVDQFTRSGVLWGVPQLTDAGIAVFYNADLLAAAGVDPTQVDNLRWSRGDDDTLRPMLARLTVDADGRTANTPGFDARRVRQWGYNAANDPQAIYLNYIGSAGGVFQRDGKFAFDNPGAIEAFRYLVGL
Subject sequence: ?
SwissProt: O51923
Query location: 2586364..2587644
System: 3 comp

Query gi # 15842226
Hit #: 1
TC #: 3.A.1.1.7?
E-value: 5.00E-35
Query length: 317
Subject length: 330
TMS, query/subject: 0 (TC 1)
Identity/Score: 35
Query alignment: 56~313
Subject alignment: 29~312
Subject TMS locations: [29,46]
NCBI annotation: ABC transporter, ATP-binding protein
TCDB annotation: Daunorubicin resistance ATP-binding protein drrA - Streptomyces peucetius.
Query sequence: MDEPAHRARPKGNGANHDGAQPCCGIGTCGNRGDPRARAHLPLPKGGRAGGAWHGVTVGRGEIFGLLGPSGAGKSTTQKLLIGLLRDHGGQATVWDKEPAEWGPDYYERIGVSFELPNHYQKLTGYENLRFFASLYAGATADPMQLLAAVGLADDAHTLVGKYSKGMQMRLTFARSLINDPELLFLDEPTSGLDPVNARKIKDIIVDLKARGRTIFLTTHDMATADELCDRVAFVVDGRIVALDSPTELKIARSR
Subject sequence: MNTQPTRAIETSGLVKVYNGTRAVDGLDLNVPAGLVYGILGPNGAGKSTTIRMLATLLRPDGGTARVFGHDVTSEPDTVRRRISVTGQYASVDEGLTGTENLVMMGRLQGYSWARARERAAELIDGFGLGDARDRLLKTYSGGMRRRLDIAASIVVTPDLLFLDEPTTGLDPRSRNQVWDIVRALVDAGTTVLLTTQYLDEADQLADRIAVIDHGRVIAEGTTGELKSSLGSNVLRLRLHDAQSRAEAERLLSAE
SwissProt: ?
Query location: 2999689..3000642
System: 2 comp
Conclusion: is this the C??
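As a small illustration of how the TMS location fields in the hit table above can be read, the sketch below parses the bracketed coordinate pairs for the first hit (gi 15841821) and counts the predicted segments, which should agree with the "6 (TC 8)" entry in the TMS column. The function names are illustrative and are not part of TMHMM, WHAT, or any script used in this work.

```python
import re


def parse_tms(field):
    """Parse a string such as '[14,38] [69,93]' into (start, end) tuples."""
    return [(int(start), int(end))
            for start, end in re.findall(r"\[(\d+),(\d+)\]", field)]


def summarize(field):
    """Return the number of TMSs and the length of each one in residues."""
    segments = parse_tms(field)
    lengths = [end - start + 1 for start, end in segments]
    return len(segments), lengths


# TMS locations for the first hit, copied from the table above.
QUERY_TMS = "[14,38] [69,93] [106,127] [154,178] [209,231] [252,276]"
SUBJECT_TMS = ("[23,47] [81,105] [114,131] [148,172] "
               "[181,203] [212,229] [238,262] [271,295]")

if __name__ == "__main__":
    print("query:", summarize(QUERY_TMS))
    print("subject:", summarize(SUBJECT_TMS))
```

A field containing only "?" parses to an empty list, so unassigned entries are easy to tell apart from proteins with predicted segments.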

Classes of Transporters Found in M. tuberculosis and M. leprae

Table 1. Categories of Recognized Transport Proteins and Pseudogenes Found in M. tuberculosis and M. leprae

TC Class | Description | Number of Transport Proteins, M. tuberculosis | Number of Transport Proteins, M. leprae | Pseudogenes, M. tuberculosis | Pseudogenes, M. leprae
1 | Channels | 7 | 0 | 0 (0%) | 0 (0%)
2 | Secondary carriers | 82 | 26 | 0 (0%) | 4 (15.4%)
3 | Primary transporters | 136 | - | - | -
4 | Group translocators (PTS) | 0 | - | - | -
5 | Transmembrane electron carriers | 9 | - | 1 (< 0.37%) | -
8 | Auxiliary transport proteins | 1 | - | - | -
9 | Poorly defined systems | 35+ | - | - | -

'-' = not yet analyzed/assigned
(%) = relative percentage of pseudogenes with respect to total number of proteins
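The percentages in Table 1 can be reproduced with a short calculation. The sketch below assumes, following the table footnote, that each percentage is the number of pseudogenes divided by the total number of transport proteins counted so far in that organism (all analyzed classes summed). This reading is an assumption, adopted because it reproduces the reported 15.4% and 0.37% values. The names are illustrative.

```python
# Transport protein counts per TC class, copied from Table 1.
# Class 9 in M. tuberculosis is reported as "35+", so 35 is a lower bound.
MTU_COUNTS = {1: 7, 2: 82, 3: 136, 4: 0, 5: 9, 8: 1, 9: 35}
# Only classes 1 and 2 have been analyzed for M. leprae so far.
MLE_COUNTS = {1: 0, 2: 26}


def pseudogene_percent(n_pseudogenes, counts):
    """Pseudogenes as a percentage of all transport proteins counted so far."""
    total = sum(counts.values())
    if total == 0:
        return 0.0  # nothing analyzed yet, so avoid dividing by zero
    return 100.0 * n_pseudogenes / total


if __name__ == "__main__":
    print("Mtu total proteins:", sum(MTU_COUNTS.values()))
    print("Mtu, 1 pseudogene: {:.2f}%".format(pseudogene_percent(1, MTU_COUNTS)))
    print("Mle, 4 pseudogenes: {:.1f}%".format(pseudogene_percent(4, MLE_COUNTS)))
```

Because the class 9 count is a lower bound, the M. tuberculosis total is at least 270, which is why the table reports that percentage as an upper bound ("< 0.37%").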