

Small protein modules with similar 3D structure but different amino acid sequence

Institute of Evolution, University of Haifa, ISRAEL

Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, ISRAEL

* [email protected]

Z.M. Frenkel*, E. N. Trifonov

Acknowledgements

This work was supported by an ISF grant 710/02-19.0 and by an EU grant QLG2-CT-2002-01298. Z.M. Frenkel is a Post-Doctoral Fellow of the Center for Complexity Science and supported by the Ministry of Absorption.

1qap A (218)  LDELDDALKA GADIIMLDNF  [94]
1dv1 A (64)   PAIISAAEIT GAVAIHPGYG  [23]
1tre A (218)  SNAAELFAQP DIDGALVGGA  1tre

1l0o C (134)  TVTEIADHLG ISPEDVVLAQ  [52]
1fse A (28)   TTKEIASELF ISEKTVRNHI  [48]
1bl0 A (29)   TLYDVAEYAG VSYQTVSRVV  1bl0

1ecr A (13)   TFRQMEQELA IFAAHLEQHK  [33]
1set A (79)   EAKRLEEALR EKEARLEALL  [30]
1eiy A (64)   ALEAREKALE EAALKEALER  1eiy

Abstract

It is well known that protein fragments with almost identical 3D structure often have quite different amino acid sequences. This effect is observed both for entire protein domains (folds) and for much smaller fragments present in different folds, and it complicates structure prediction and classification.

We propose a new method that reveals the relationship between protein fragments with similar 3D structure but different amino acid sequences by identifying 'intermediate' sequences in a proteomic database. The aim of this paper is to define proper parameters for protein structure/sequence comparison and to illustrate the new approach. All calculations were carried out for fragments of 20 aa. It was shown (for selected protein fragments) that at this length it is practically impossible to find two fragments with a sequence identity of 12 aa or more whose structures have an RMSD (root mean square deviation) larger than 3 Å.

With these thresholds of sequence and structure similarity, all fragments similar to a given fragment from the PDB (Protein Data Bank) were found in a database of 113 prokaryotic proteomes. The same was done for every such similar fragment, repeatedly. For every new sequence, all similar fragments in the PDB were located, if any existed. If the respective PDB structures differed from the structure of the initial fragment, the sequence was excluded from further searches in the proteomes. The result is an 'evolutionary tree' originating from the initial PDB fragment. The tree contains elements similar in structure to the initial fragment yet with different sequences. These elements are connected to the origin in chain-like fashion, with adjacent elements having similar sequences.

Three such evolutionary trees for different protein structures are described. This technique opens new possibilities for protein evolution studies and structure prediction.
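The iterative search described in the abstract can be sketched as a breadth-first expansion from the initial fragment. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, the naive all-against-all scan, and the `structure_ok` callback (a hypothetical stand-in for the PDB lookup and RMSD comparison) are all introduced here. It is also an assumption that a fragment whose PDB structure conflicts with the initial one is dropped from the tree altogether, since the poster states only that it is excluded from further searches.

```python
from collections import deque

FRAGMENT_LEN = 20   # fragment length used in the poster
MIN_IDENTITY = 12   # identical positions required out of 20


def identity(a, b):
    """Number of positions at which two equal-length fragments match."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return sum(x == y for x, y in zip(a, b))


def windows(protein, k=FRAGMENT_LEN):
    """All overlapping k-residue fragments of a protein sequence."""
    return (protein[i:i + k] for i in range(len(protein) - k + 1))


def build_tree(seed, proteins, structure_ok=lambda frag: True,
               min_identity=MIN_IDENTITY):
    """Breadth-first search from `seed`; returns {fragment: parent}.

    `structure_ok` is a hypothetical callback standing in for the PDB
    check: it should return False when a fragment matches a PDB
    structure that differs from the structure of the seed.
    """
    # Naive scan over every window; adequate for a toy example only.
    fragments = {w for p in proteins for w in windows(p, len(seed))}
    parent = {seed: None}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for frag in sorted(fragments):
            if frag in parent or identity(current, frag) < min_identity:
                continue
            if not structure_ok(frag):
                continue  # assumption: conflicting fragments are dropped
            parent[frag] = current
            queue.append(frag)
    return parent


def path_to_seed(parent, frag):
    """Chain of fragments leading from the seed to `frag`."""
    chain = []
    while frag is not None:
        chain.append(frag)
        frag = parent[frag]
    return chain[::-1]
```

With real data, `proteins` would hold the proteome sequences and `seed` a 20-residue PDB fragment. A scan of this kind over 113 proteomes would need an indexed search rather than the exhaustive loop shown here.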

בס"ד

[Figures 1–3 and Tables I–IV: images not included in this transcript]

Figure and Table description

Fragments of the 'evolutionary trees' are shown in Figs. 1–3. The numbers mark the nodes with a known structure similar to that of the initial fragment. These structures are described for Fig. 1 in Table I (only the more similar structures are described). The structural similarity thresholds are 3 Å for Figs. 1–2 and 1.5 Å for Fig. 3. The thin lines mark examples of branches described in Tables II–IV. In these tables, the sequences from bacterial proteomes that correspond to the nodes along the thin lines are given on the left. Each sequence differs from its immediate neighbors at no more than 8 of 20 amino acid positions. Corresponding structures from the PDB (if found) are listed on the right. Some of them are shown near the respective tree, superimposed on the structure of the initial element.
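The branch criterion stated above (neighbors differ at no more than 8 of 20 positions) can be checked with a few lines of code. The sketch below is illustrative only: the function names and the `toy_chain` sequences are hypothetical and are not taken from Tables II–IV. The two real sequences are the 1qap and 1tre fragments listed at the top of the poster, used here only to show how low the identity between structurally related fragments can be.

```python
MAX_DIFF = 8  # maximum differing positions between neighbors (of 20)


def differences(a, b):
    """Number of positions at which two equal-length fragments differ."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_valid_branch(chain, max_diff=MAX_DIFF):
    """True if every pair of adjacent fragments is within max_diff."""
    return all(differences(a, b) <= max_diff
               for a, b in zip(chain, chain[1:]))


def end_to_end_identity(chain):
    """Identical positions between the first and last fragment."""
    return len(chain[0]) - differences(chain[0], chain[-1])


# Hypothetical 20-residue chain: each step changes at most 8 positions,
# yet the two ends share no residue at any position.
toy_chain = [
    "A" * 20,
    "A" * 12 + "C" * 8,
    "D" * 8 + "A" * 4 + "C" * 8,
    "D" * 12 + "C" * 8,
]

# Fragments listed on the poster for 1qap A (218) and 1tre A (218).
frag_1qap = "LDELDDALKAGADIIMLDNF"
frag_1tre = "SNAAELFAQPDIDGALVGGA"

print(is_valid_branch(toy_chain), end_to_end_identity(toy_chain))
print(end_to_end_identity([frag_1qap, frag_1tre]))
```

The toy chain passes the neighbor test even though its ends have nothing in common, which is the situation the thin-line branches are meant to illustrate.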

Conclusions

Protein fragments with similar 3D structure can be selected such that the sequences of the group's members are related to one another in chain-like fashion, with incremental changes in sequence and no similarity between the ends of the chain.