tekaia evol trends proteomes
![Page 1: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/1.jpg)
Exploring Evolutionary Trends in Proteomes
[Figure: correspondence analysis map of proteomes; labelled groups: Eukaryotes, Hyperthermophiles, Thermophiles, Psychrophiles, Prokaryotic mesophiles]
Fredj Tekaia Edouard Yeramian
Institut [email protected]
![Page 2: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/2.jpg)
Tree of life (http://www.genomesonline.org/)
[Figure: tree of life; figure labels: 433, 36, 46]
Complete genomes: 2434 projects (01-03-07)
• 520 published
• 1086 Bacteria
• 59 Archaea
• 696 eukaryotes
• 73 metagenomes
• 3 phylogenetic domains;
• Lifestyles: mesophiles; (hyper)thermophiles; psychrophiles; extreme conditions,...
![Page 3: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/3.jpg)
• In the post-genomic era, multidimensional data resulting from large-scale genome comparisons are available.
• Multivariate analysis methods are particularly helpful for the discovery of evolutionary trends associated with such data.
• These are data-driven exploratory analyses, as opposed to model-driven methods.
![Page 4: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/4.jpg)
Methodology
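[Figure: data matrix T (entries k_ij > 0, with supplementary "sup" rows and columns) → Correspondence Analysis → factorial axes F1 ... Fp]
Supplementary elements are projected onto a factor with the transition formula, where λ is the eigenvalue associated with that factor:
$$F(i_s) = \lambda^{-1/2} \sum_{j=1}^{p} f^{i_s}_j \, G(j)$$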
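As a minimal sketch of the projection of a supplementary row onto one factor, the function below computes the row profile and applies the transition formula. The column coordinates `column_coords` and the eigenvalue are hypothetical inputs here; in practice they would come from a correspondence analysis of the active rows, which is not performed in this sketch.

```python
import math

def project_supplementary(counts, column_coords, eigenvalue):
    """Coordinate of a supplementary row on one factor.

    counts        : raw values k_{i_s j} of the supplementary row
    column_coords : G(j), coordinates of the columns on the factor (assumed given)
    eigenvalue    : lambda, eigenvalue of the factor (assumed given)
    """
    total = sum(counts)
    profile = [c / total for c in counts]  # f_j^{i_s}: row profile
    weighted = sum(f * g for f, g in zip(profile, column_coords))
    return weighted / math.sqrt(eigenvalue)

# Hypothetical two-column example: a row concentrated on the first column
# falls on the positive side of the factor.
print(project_supplementary([3, 1], [1.0, -1.0], 0.25))
```

A row whose profile is balanced between the two columns would project onto the origin in this toy setting.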
![Page 5: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/5.jpg)
Methodology
[Figure: data matrix T (entries k_ij > 0, with supplementary "sup" elements) → Correspondence Analysis → factorial axes F1 ... Fp → Classification]
• orthogonal system;
• use of euclidean distance;
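Because the factorial axes are orthogonal, a classification can be run directly on factor coordinates with the Euclidean distance. The sketch below uses a simple average-linkage agglomerative clustering as an illustration; the linkage rule, the species names and the coordinates are assumptions, not the procedure used in the slides.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(points, k):
    """Merge the two closest clusters (average linkage) until k remain.

    points : dict mapping a species name to its factor coordinates
    """
    clusters = [[name] for name in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(euclidean(points[a], points[b]) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical coordinates on the first two factors.
coords = {"sp1": (0.0, 0.0), "sp2": (0.1, 0.0), "sp3": (5.0, 5.0), "sp4": (5.1, 5.0)}
print(agglomerate(coords, 2))
```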
![Page 6: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/6.jpg)
1. Evolution of Proteomes: Signatures and Trends in Amino Acid Compositions
2. Genome Trees from Whole Proteome Comparisons
![Page 7: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/7.jpg)
Evolution of Proteomes: Signatures and Trends in Amino Acid Compositions
[Figure: correspondence analysis map of proteomes; labelled groups: Eukaryotes, Hyperthermophiles, Thermophiles, Psychrophiles, Prokaryotic mesophiles]
![Page 8: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/8.jpg)
Mining the wealth of information contained in complete genomes, to decipher genomic characteristics related to the adaptive evolution of organisms in extreme conditions such as high or low temperatures, has long been a matter of interest:
• Kreil DP, Ouzounis CA (2001). Identification of thermophilic species by the amino acid compositions deduced from their genomes. NAR 2001, 468: 1608-15.
• Tekaia F, Yeramian E, Dujon B (2002). Amino acid composition of genomes, lifestyles of organisms, and evolutionary trends: a global picture with correspondence analysis. Gene, 297: 51-60.
• Suhre K, Claverie JM (2003). Genomic correlates of hyperthermostability, an update. J. Biol. Chem., 278: 17198-202.
• Hickey DA, Singer GA (2004). Genomic and proteomic adaptations to growth at high temperature. Genome Biol., 5: 117. Epub 2004.
• Brocchieri L (2004). Environmental signatures in proteome properties. Proc Natl Acad Sci U S A., 101: 8257-8.
• Cavicchioli R (2006). Cold-adapted archaea. Nat. Rev. Microbiology, 4: 331-3.
• Lobry JR, Necsulea A (2006). Synonymous codon usage and its potential link with optimal growth temperature in prokaryotes. Gene, 385: 128-36.
• Zeldovich KB, Berezovsky IN, Shakhnovich EI (2007). Protein and DNA Sequence Determinants of Thermophilic Adaptation. PLoS Comput Biol. 3: e5.
![Page 9: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/9.jpg)
The significant number of available completely sequenced genomes with different lifestyles offers an unprecedented opportunity to explore species evolution.
• Which universal properties can be deduced from amino acid compositions of proteomes?
• Are there specific properties associated with lifestyles and with phylogeny?
• What are the underlying evolutionary trends?
Among simple analyses:
amino acid composition of proteomes.
![Page 10: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/10.jpg)
Outline
• Methodology;
• Species considered and data analysed;
• Species and amino acids distributions;
• Amino acids distribution and comparison with theoretical and experimental model chronologies of amino acids recruitment into the genetic code;
• Example: application to predicting candidate thermostable proteins in Aspergillus fumigatus.
![Page 11: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/11.jpg)
Methodology
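[Figure: data matrix T (entries k_ij > 0, with supplementary "sup" rows and columns) → Correspondence Analysis → factorial axes F1 ... Fp]
Supplementary elements are projected onto a factor with the transition formula, where λ is the eigenvalue associated with that factor:
$$F(i_s) = \lambda^{-1/2} \sum_{j=1}^{p} f^{i_s}_j \, G(j)$$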
![Page 12: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/12.jpg)
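Previous work showed (Tekaia, F., Yeramian, E. and Dujon, B. 2002. Gene 297: 51-60; 54 species):
[Figure: correspondence analysis map of 54 species; trends associated with GC% and growth temperature; labelled groups: Hyperthermophiles, Thermophiles, Mesophiles]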
![Page 13: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/13.jpg)
Amino Acid composition of 208 proteomes
(http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genomeprj + specific sites)
including:
• 20 hyperthermophiles (HTH) (optimal growth temperature, OGT, >60°C up to 120°C),
• 7 thermophiles (TH) (OGT >50°C up to 60°C),
• 8 psychrophiles (PSYC) (OGT from -10°C up to 15°C),
• 173 mesophiles (BMES) including 53 eukaryotes (EUK)
Data table: 222 rows (208 + 14 sup) vs 23 columns (20 aa + pol, char, hyd)
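A minimal sketch of how one row of such a data table could be computed from the protein sequences of a proteome: the percentage of each of the 20 amino acids, followed by the three class totals. The class memberships below (`char`, `pol`, `hyd`) are an assumption, chosen because they approximately reproduce the class totals tabulated in this document; the slides do not state them explicitly.

```python
from collections import Counter

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"
# Assumed class memberships (not stated in the source).
CLASSES = {"char": "DEKRH", "pol": "NQSTYCG", "hyd": "AILMFPWV"}

def composition(proteins):
    """Return amino acid percentages and class totals for a list of sequences."""
    counts = Counter()
    for seq in proteins:
        # Ignore ambiguous or non-standard residue codes such as X or B.
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    row = {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}
    for name, members in CLASSES.items():
        row[name] = sum(row[aa] for aa in members)
    return row

# Hypothetical two-protein "proteome".
row = composition(["AAKD", "GGSV"])
print(row["A"], row["char"], row["pol"], row["hyd"])
```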
![Page 14: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/14.jpg)
Amino Acid composition
Correspondence Analysis was used to explore relationships between species and amino acids.
Data table: 208 species rows (excerpt below) and 13 group rows; values in %; PC = pol minus char.
org A R N D C Q E G H I L K M F P S T W Y V char pol hyd PC
sc 5.5 4.4 6.1 5.8 1.3 3.9 6.6 5.0 2.1 6.6 9.6 7.4 2.1 4.5 4.3 9.0 5.8 1.0 3.3 5.6 26.3 34.4 39.1 8.1
sp 6.3 4.8 5.2 5.3 1.5 3.8 6.5 5.0 2.3 6.1 9.8 6.4 2.1 4.6 4.7 9.4 5.6 1.1 3.4 6.0 25.3 33.9 40.7 8.6
ncu 8.7 6.2 3.7 5.6 1.1 4.3 6.5 7.2 2.5 4.4 8.4 5.1 2.2 3.4 6.5 8.3 6.1 1.4 2.6 6.0 25.8 33.3 40.8 7.5
ca 4.9 3.7 6.7 5.7 1.2 4.4 6.2 5.0 2.1 7.1 9.3 7.2 1.9 4.5 4.5 9.3 6.2 1.0 3.5 5.5 25.0 36.2 38.7 11.2
mgr 9.4 6.6 3.5 5.7 1.3 4.1 5.9 7.4 2.3 4.4 8.5 4.8 2.2 3.5 6.3 8.0 5.9 1.5 2.5 6.2 25.3 32.7 42.0 7.4
fg 8.2 5.8 3.9 5.9 1.3 4.0 6.2 6.7 2.4 5.1 8.7 5.1 2.3 3.8 5.9 8.1 6.1 1.5 2.8 6.1 25.4 32.9 41.6 7.5
an 8.6 6.2 3.7 5.6 1.2 4.0 6.2 6.8 2.4 5.0 9.2 4.6 2.0 3.7 6.0 8.4 6.0 1.5 2.9 6.1 24.9 32.9 42.0 8.0
ecun 5.0 6.7 3.9 5.5 2.0 2.3 8.1 6.5 1.9 6.7 9.5 7.1 3.0 4.8 3.4 8.0 4.1 0.8 3.6 7.0 29.3 30.4 40.2 1.1
...
HTH 7.4 5.8 3.5 4.7 0.8 2.0 8.3 7.4 1.6 7.4 10.6 7.0 2.2 4.2 4.5 5.2 4.4 1.1 3.9 8.0 27.4 27.0 45.4 -0.4
TH 9.0 6.3 3.6 5.3 0.8 3.1 6.4 7.5 1.9 7.0 9.9 4.7 2.6 4.0 4.7 6.1 5.1 1.2 3.6 7.4 24.6 29.7 45.6 5.2
PSYC 8.4 4.6 4.3 5.7 1.1 4.0 6.3 6.9 2.2 7.2 9.9 5.5 2.7 4.1 3.9 6.5 5.8 1.1 3.2 6.9 24.2 31.8 44.0 7.6
BMES 8.6 5.1 4.4 5.4 1.0 3.8 6.3 7.0 2.1 6.9 10.2 5.8 2.3 4.3 4.1 6.2 5.4 1.1 3.2 6.9 24.6 30.9 44.4 6.3
EUK 6.9 5.4 4.9 5.4 1.7 4.2 6.6 6.0 2.4 5.6 9.3 6.1 2.2 4.0 5.2 8.4 5.6 1.2 3.1 6.0 25.9 33.8 40.2 7.9
SPEC 7.6 6.1 4.8 5.1 1.8 4.0 6.3 6.1 2.5 4.9 8.8 5.7 2.2 3.6 5.7 8.8 5.8 1.2 2.9 6.0 25.8 34.2 39.9 8.4
A 6.7 5.4 4.8 5.4 1.2 2.6 7.8 6.3 1.8 7.3 9.6 6.9 2.3 4.1 4.0 6.7 5.0 1.1 3.9 7.1 27.2 30.5 42.2 3.3
B 9.4 5.8 4.1 5.4 1.0 4.1 6.0 7.3 2.1 5.6 10.1 5.0 2.2 3.9 4.7 6.6 5.5 1.4 3.0 6.8 24.3 31.5 44.0 7.2
E 6.9 5.7 4.4 5.3 2.0 4.6 6.6 6.0 2.6 4.8 9.1 5.8 2.2 3.8 5.8 8.7 5.7 1.2 2.9 5.9 26.0 34.2 39.7 8.2
EA 6.8 5.7 4.5 5.5 1.8 4.1 6.8 5.8 2.4 5.7 9.6 6.5 2.3 4.0 4.6 7.6 5.5 1.1 3.2 6.5 26.8 32.4 40.7 5.6
EB 7.4 5.5 4.3 5.5 1.5 4.0 6.3 6.7 2.5 5.4 9.5 5.4 2.2 4.1 5.4 7.7 5.6 1.3 3.1 6.5 25.1 33.0 41.8 7.9
AB 8.6 5.3 3.9 5.0 1.0 3.3 6.3 7.1 1.9 7.0 10.7 5.5 2.4 4.5 4.2 6.2 5.1 1.3 3.3 7.3 24.0 29.8 46.0 5.8
EAB 8.1 5.4 4.0 5.4 1.3 3.8 6.6 7.0 2.2 6.1 9.9 5.7 2.4 4.1 4.6 6.9 5.5 1.2 3.0 7.0 25.2 31.4 43.3 6.2
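Correspondence analysis compares row profiles with the chi-square distance. As an illustration, the sketch below computes that distance between four of the group rows listed above (HTH, PSYC, BMES, EUK; 20 amino acid columns only). One simplifying assumption: the column masses are estimated from these four rows alone rather than from the full 208-species table, so the values are indicative only.

```python
import math

# Group rows copied from the composition table (A R N D C Q E G H I L K M F P S T W Y V).
ROWS = {
    "HTH":  [7.4, 5.8, 3.5, 4.7, 0.8, 2.0, 8.3, 7.4, 1.6, 7.4, 10.6, 7.0, 2.2, 4.2, 4.5, 5.2, 4.4, 1.1, 3.9, 8.0],
    "PSYC": [8.4, 4.6, 4.3, 5.7, 1.1, 4.0, 6.3, 6.9, 2.2, 7.2, 9.9, 5.5, 2.7, 4.1, 3.9, 6.5, 5.8, 1.1, 3.2, 6.9],
    "BMES": [8.6, 5.1, 4.4, 5.4, 1.0, 3.8, 6.3, 7.0, 2.1, 6.9, 10.2, 5.8, 2.3, 4.3, 4.1, 6.2, 5.4, 1.1, 3.2, 6.9],
    "EUK":  [6.9, 5.4, 4.9, 5.4, 1.7, 4.2, 6.6, 6.0, 2.4, 5.6, 9.3, 6.1, 2.2, 4.0, 5.2, 8.4, 5.6, 1.2, 3.1, 6.0],
}

def chi2_distance(table, a, b):
    """Chi-square distance between the profiles of rows a and b."""
    grand = sum(sum(r) for r in table.values())
    ncol = len(table[a])
    # Column masses: share of each column in the grand total.
    col_mass = [sum(r[j] for r in table.values()) / grand for j in range(ncol)]
    pa = [x / sum(table[a]) for x in table[a]]
    pb = [x / sum(table[b]) for x in table[b]]
    return math.sqrt(sum((x - y) ** 2 / m for x, y, m in zip(pa, pb, col_mass)))

for pair in [("BMES", "PSYC"), ("HTH", "BMES"), ("HTH", "EUK")]:
    print(pair, round(chi2_distance(ROWS, *pair), 3))
```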
![Page 15: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/15.jpg)
Species specific comparisons
[Diagram: a new proteome (NP) is compared with each proteome P1 ... Pn, in both directions, using blastp (PAM250 matrix, SEG filter)]
Output files for each proteome Pi (i = 1, ..., n):
• NP against Pi: bestnppi, allnppi, segmatchnppi
• Pi against NP: bestpinp, allpinp, segmatchpinp
File contents:
• bestnppi: np1 size pij e-value1 HS/IS/NS
• allnppi: np1 size pij e-value1 HS/IS/NS; np1 size pik e-value HS/IS/NS
The expected number of HSPs with score at least S is given by: $E = K m n\, e^{-\lambda S}$, where m and n are the sequence and database lengths, and K and λ are statistical parameters of the scoring system.
• Paralogs • Orthologs
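The expected number of HSPs follows the Karlin-Altschul form E = K·m·n·exp(-λS). A minimal sketch of that calculation is given below; the values of K and λ are hypothetical placeholders, since they depend on the scoring matrix and gap penalties and are not given in the slides.

```python
import math

def expected_hsps(score, query_len, db_len, K, lam):
    """Expected number of HSPs with score >= score: E = K * m * n * exp(-lambda * S)."""
    return K * query_len * db_len * math.exp(-lam * score)

# Hypothetical parameters (K and lam are placeholders, not BLAST defaults).
K, lam = 0.1, 0.3
for s in (30, 50, 70):
    print(s, expected_hsps(s, query_len=350, db_len=5_000_000, K=K, lam=lam))
```

The E-value decreases exponentially with the score and grows linearly with both the query length and the database length, which is why the same alignment score is less significant against a larger proteome.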
![Page 16: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/16.jpg)
[Figure: correspondence analysis map of proteomes; labelled groups: Eukaryotes, Hyperthermophiles, Thermophiles, Psychrophiles, Prokaryotic mesophiles; highlighted species: Encephalitozoon cuniculi, Thermosynechococcus elongatus]
![Page 17: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/17.jpg)
[Figure: correspondence analysis map of species; trends indicated: GC% and growth temperature. Labelled species, with GC% where given:]
• Mycoplasma mycoides: 23%
• Nocardia farcinica: 70%
• Streptomyces coelicolor: 72%
• Tetrahymena thermophila (Protists)
• Saccharomyces
• Entamoeba histolytica (Protists)
• Cryptosporidium hominis
• Leishmania major: 60%
• Cyanidioschyzon merolae
• Aspergillus fumigatus: 50%
• Homo sapiens
• Methanococcus jannaschii: 31%
• Pyrococcus abyssi: 44%
• Methanopyrus kandleri: 61%
• Thermus thermophilus: 69%
• Colwellia psychrerythraea
• Pseudoalteromonas haloplanktis
• Encephalitozoon cuniculi
• A. nidulans
• A. oryzae
• C. neoformans
• Mus musculus
• Rat
• Candida glabrata
![Page 18: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/18.jpg)
Statistical characterization of the observed groups:
Mean amino acid compositions of the 3 groups were compared using:
-One-way analysis of variance;
-Newman-Keuls multiple comparison test to detect significant differences at the probability level of p<0.001.
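As an illustration of the first step, the sketch below computes the one-way ANOVA F statistic for one amino acid across three groups. The percentages are hypothetical, and the Newman-Keuls post-hoc test is not implemented here; the F statistic would still need to be compared with the F distribution at (k-1, n-k) degrees of freedom to obtain a p-value.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way analysis of variance.

    groups : list of lists of observations, one list per group
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical Val percentages in three groups of proteomes.
val = {
    "HTH-TH": [8.1, 7.9, 8.3],
    "BMES-PSYC": [6.8, 7.0, 6.9],
    "EUK": [6.0, 5.9, 6.2],
}
print(one_way_anova_f(list(val.values())))
```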
![Page 19: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/19.jpg)
[Figure: bar chart (scale 0 to 11) of mean aa composition in (hyper)thermophiles, prokaryotic mesophiles-psychrophiles and eukaryotes (*: sig. different at p<0.001). Amino acids along the axis: V (Val), Y (Tyr), E (Glu), G (Gly), I (Ile), L (Leu), A (Ala), H (His), S (Ser), Q (Gln), T (Thr), C (Cys), D (Asp), P (Pro), N (Asn), R (Arg), M (Met), K (Lys), F (Phe), W (Trp)]
![Page 20: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/20.jpg)
[Figure: bar chart (scale 0 to 50) of AA physico-chemical properties (hyd, pol, pol-char, char) in (hyper)thermophiles, prokaryotic mesophiles-psychrophiles and eukaryotes (*: sig. different at p<0.001)]
![Page 21: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/21.jpg)
Amino acid signatures (p<0.001)
Groups compared: HTH-TH, BMES-PSYC, EUK
• Listed under all three groups: V (Val), H (His), S (Ser), pol, pol-char
• Other signature items: Y (Tyr), E (Glu), Q (Gln), T (Thr), D (Asp), G (Gly), I (Ile), L (Leu), C (Cys), hyd
• R (Arg), M (Met), F (Phe), K (Lys), N (Asn) and W (Trp) show no significant difference (at p<0.001).
![Page 22: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/22.jpg)
Species evolutionary trends
![Page 23: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/23.jpg)
[Figure: map with trend axes T1 and T2, related to GC% and growth temperature; poles labelled [moderate_temperature]-[low_GC] and [high_temperature]-[high_GC]; group points: A, B, E, AB, EA, EB, EAB, SPEC; direction labelled from Ancient to Recent]
![Page 24: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/24.jpg)
Comparison with model chronologies of amino acids recruitment into the genetic code
• Comparison of amino acid distribution with recent models of:
- Jordan et al. Nature 433: 633-638 (2005)
- Trifonov, J. Biomol. Struct. & Dyn. 22: 1-11 (2004)
• and with ancient amino acids:
- Miller's experiments: Science 117, 528-529. (1953)
- Analysis of Murchison meteorite (1983)
![Page 25: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/25.jpg)
Model of Jordan et al. 2005: A universal trend of amino acid gain and loss in protein evolution. Nature 433: 633-8.
• They analysed 15 sets of three-way alignments of orthologous proteins encoded by triplets of closely related genomes from 15 taxa representing all three domains of life (Bacteria, Archaea and Eukaryota), and used phylogenies to polarize amino acid substitutions.
• All amino acids with declining frequencies are thought to be among the first incorporated into the genetic code;
• conversely, all amino acids with increasing frequencies, except Ser, were probably recruited late.
![Page 26: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/26.jpg)
Jordan et al. 2005.
Following observed frequencies, they subdivided amino acids into what they called:
• 5 “strong gainers”: Cys, Met, His, Ser and Phe (accrue in 14/15 taxa); “were probably recruited late”, i.e. most recent aa.
• 4 “weak gainers”: Asn, Thr, Ile (accrue in 11 taxa/15) and Val (accrues slowly in all taxa);
• 4 “strong losers”: Pro, Ala, Glu, and Gly (decline in at least 13 taxa/15); “thought to be among the first incorporated into the genetic code”, i.e. most ancient aa.
• 1 “weak loser”: Lys (lost in 10 taxa/15).
• In contrast: the remaining six amino acids (Arg, Asp, Gln, Trp, Leu and Tyr) evolve more erratically.
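The classification can be written down as a small lookup table, which also makes it easy to derive the amino acids that fall outside the four gainer and loser categories. This is only a bookkeeping sketch of the categories listed on this slide; the variable names are illustrative.

```python
# Categories as listed on the slide (Jordan et al. 2005).
JORDAN_2005 = {
    "strong gainer": ["Cys", "Met", "His", "Ser", "Phe"],
    "weak gainer": ["Asn", "Thr", "Ile", "Val"],
    "strong loser": ["Pro", "Ala", "Glu", "Gly"],
    "weak loser": ["Lys"],
}

ALL_AA = [
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
]

# Amino acids not assigned to any category: those evolving "more erratically".
classified = {aa for members in JORDAN_2005.values() for aa in members}
erratic = sorted(aa for aa in ALL_AA if aa not in classified)
print(erratic)
```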
![Page 27: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/27.jpg)
Jordan et al., Nature 433, 633 (2005). A universal trend of aa gain and loss in protein evolution.
[Figure: amino acids on the T1/T2 map (GC%, growth t°)]
• “strong losers” in T1: most ancient aa: Pro, Ala, Glu, Gly
• “weak gainers”: Asn, Thr, Ile, Val
• “strong gainers” in T2: recruited late to the genetic code: His, Ser, Cys, Phe, Met
![Page 28: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/28.jpg)
Model of Trifonov, E.N. 2004. The triplet code from first principles. J. Biomol. Struct. & Dyn. 22: 1-11.
• A consensus chronology of amino acids is built on the basis of 60 different criteria, each offering a certain temporal order.
• The chronology results in the consensus order:
G1 (Gly), A2 (Ala), D3 (Asp), V4 (Val), P5 (Pro), S6 (Ser), E7 (Glu), (L8 (Leu), T8 (Thr)), R10 (Arg), (I11 (Ile), Q11 (Gln), N11 (Asn)), H14 (His), K15 (Lys), C16 (Cys), F17 (Phe), Y18 (Tyr), M19 (Met), W20 (Trp).
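To make the comparison between the two models concrete, the sketch below stores the consensus ranks and computes the mean rank of the strong losers and strong gainers of Jordan et al. as listed earlier in this document. The mean-rank summary is an illustrative choice, not an analysis taken from the slides.

```python
# Consensus chronology ranks (Trifonov 2004), ties share a rank.
TRIFONOV_RANK = {
    "Gly": 1, "Ala": 2, "Asp": 3, "Val": 4, "Pro": 5, "Ser": 6, "Glu": 7,
    "Leu": 8, "Thr": 8, "Arg": 10, "Ile": 11, "Gln": 11, "Asn": 11,
    "His": 14, "Lys": 15, "Cys": 16, "Phe": 17, "Tyr": 18, "Met": 19, "Trp": 20,
}

# Groups of Jordan et al. (2005) as given in this document.
STRONG_LOSERS = ["Pro", "Ala", "Glu", "Gly"]
STRONG_GAINERS = ["Cys", "Met", "His", "Ser", "Phe"]

def mean_rank(amino_acids):
    """Average chronology rank of a group of amino acids."""
    return sum(TRIFONOV_RANK[aa] for aa in amino_acids) / len(amino_acids)

print("strong losers:", mean_rank(STRONG_LOSERS))
print("strong gainers:", mean_rank(STRONG_GAINERS))
```

A lower mean rank for the strong losers than for the strong gainers is what one would expect if both models agree on which amino acids are ancient and which were recruited late.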
![Page 29: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/29.jpg)
Trifonov, E.N. (2004). The triplet code from first principles. J. Biomol. Struct. & Dyn. 22: 1-11.
[Figure: amino acids on the T1/T2 map (GC%, growth t°), labelled with their rank in the consensus chronology: Gly1, Ala2, Asp3, Val4, Pro5, Ser6, Glu7, Leu8, Thr8, Arg10, Asn11, Gln11, Ile11, His14, Lys15, Cys16, Phe17, Tyr18, Met19, Trp20]
![Page 30: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/30.jpg)
Comparison with ancient amino acids
![Page 31: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/31.jpg)
Miller/Urey Experiment: 1953
• By the 1950s, scientists were in hot pursuit of the origin of life. The scientific community was examining what kind of environment would be needed to allow life to begin.
• In 1953, Miller took molecules which were believed to represent the major components of the early Earth's atmosphere and put them into a closed system.
• Miller's experiment showed that organic compounds such as amino acids, which are essential to cellular life, could be made easily under the conditions that scientists believed to be present on the early earth.
![Page 32: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/32.jpg)
Miller, S.L. (1953). Production of aa under possible primitive earth conditions. Science 117, 528-529.
[Figure: amino acids on the T1/T2 map (GC%, growth t°); amino acids marked with "+": Gly, Ala, Val, Asp, Glu, Pro, Ser, Leu, Thr, Ile]
![Page 33: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/33.jpg)
Murchison meteorite 09-28-1969
The Murchison meteorite fall occurred on September 28, 1969 over Murchison, Australia. Over 100 kilograms of this meteorite have been found. This meteorite is of possible cometary origin due to its high water content of 12%.
An abundance of amino acids found within this meteorite has led to intense study by researchers as to its origins. More than 92 different amino acids have been identified within the Murchison meteorite to date. Nineteen of these are found on Earth. The remaining amino acids have no apparent terrestrial source.
![Page 34: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/34.jpg)
Cronin, J.R. and Pizzarello, S. (1983). Amino acids in meteorites. Adv Space Res. 3: 5-18. Murchison meteorite 28-09-1969
[Figure: amino acids on the T1/T2 map (GC%, growth t°); amino acids marked with "+": Leu, Asp, Val, Gly, Ala, Glu, Pro, Ile]
![Page 35: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/35.jpg)
Conclusions:
• Simple description of amino acid compositions of proteomes (free from a priori model) revealed fundamental evolutionary properties:
- segregation of eukaryotes;
- segregation of hyperthermophiles;
- non-discrimination of psychrophiles.
• Amino acid signatures for hyperthermophiles and for eukaryotes.
![Page 36: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/36.jpg)
Conclusions...:
• Correspondence Analysis helped to reveal these properties.
• Amino acids distribution is consistent with suggested model chronologies of their recruitment into the genetic code;
![Page 37: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/37.jpg)
General Conclusion
• Amino acids are significant markers for species evolution.
![Page 38: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/38.jpg)
Genome Trees from Whole Proteome Comparisons
![Page 39: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/39.jpg)
Outline
• Species tree construction and difficulties;
• Post genome era species tree construction;
• Conservation profiles;
• Genome tree construction based on conservation profiles;
• Conclusions;
• References.
![Page 40: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/40.jpg)
Species tree - Tree Of Life
• 16/18S rRNA tree (Woese 1990): Woese and others have used rRNA comparisons to construct a “Tree Of Life” showing the evolutionary relationships of a wide variety of organisms.
The « Tree Of Life » has long served as a useful tool for describing the history and relationships of organisms over evolutionary time. One species is represented as a branching point, or node, on the tree, and the branches represent paths of descent from a parental node.
![Page 41: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/41.jpg)
The three-domain proposal based on the ribosomal RNA tree. Woese et al. PNAS 87: 4576-4579. (1990)
The two-empire proposal, separating eukaryotes from prokaryotes and eubacteria from archaebacteria. Mayr, E. PNAS 95: 9720-23. (1998)
The three-domain proposal, with continuous lateral gene transfer among domains. Doolittle. Science 284: 2124-8. (1999)
The ring of life, incorporating lateral gene transfer but preserving the prokaryote-eukaryote divide. Rivera & Lake. Nature 431: 152-5. (2004)
Martin & Embley. Nature 431: 152-5. (2004)
![Page 42: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/42.jpg)
The 1.2-Megabase Genome Sequence of Mimivirus. Raoult et al. Science, 306: 1344-1350. (2004)
Genomic Databases and the Tree of Life. Keith A. Crandall and Jennifer E. Buhay. Science, 306: 1144-1145. (2004)
Prospects for Building the Tree of Life from Large Sequence Databases. Driskell et al. Science, 306: 1172-1174. (2004)
![Page 43: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/43.jpg)
Pennisi, E. (1998). Genome data shake tree of life. Science 280: 672-4.
New genome sequences are mystifying evolutionary biologists by revealing unexpected connections between microbes thought to have diverged hundreds of millions of years ago.
This suggests constructing species trees from the whole gene content of genomes.
![Page 44: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/44.jpg)
Genome phylogeny based on gene content. Snel, Bork, Huynen (1999). Nature Genetics 21, 108-110.
[Figure: genome tree with three clusters labelled E, A and B]
![Page 45: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/45.jpg)
Tekaia, Lazcano & Dujon (1999). Genome Research 9: 550-7.
[Figure: genome tree with three clusters labelled E, A and B]
![Page 46: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/46.jpg)
Tree of life (http://www.genomesonline.org/)
[Figure: tree of life; figure labels: 433, 36, 46]
Complete genomes: 2434 projects (01-03-07)
• 520 published
• 1086 Bacteria
• 59 Archaea
• 696 eukaryotes
• 73 metagenomes
Abundance of genome data is raising expectations to accurately depict the evolutionary history of all genomes.
Idea: construct a species tree from many genes instead of only one gene.
![Page 47: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/47.jpg)
Gene tree - Species tree
[Figure, from Genomes, 2nd edition, T.A. Brown (2002): a species tree and a gene tree for species A, B and C, with duplication and speciation events marked along a time axis]
![Page 48: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/48.jpg)
Problems with species tree construction
• The main difficulties in species tree construction include extensive incongruence between alternative phylogenies generated from single-gene data sets:
- Genes don't evolve at the same rate nor in the same way;
- the evolutionary history inferred from one gene may be different from what another gene appears to show.
![Page 49: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/49.jpg)
Alternative solutions: integrative methods
• “supertree”: The supertree approach estimates phylogenies for subsets of genes with good overlap, then combines these subtree estimates into a supertree (Bininda-Emonds et al. 2002).
• Depends on the ability to distinguish between orthologs and paralogs;
• Supertree approaches are controversial, in part because the methodology results in a degree of disconnection between the underlying genetic data and the final tree produced.
![Page 50: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/50.jpg)
• “phylogenomic tree” (based on concatenation of a gene sample common to the considered species S1 ... Sn);
• genes don't evolve at the same rate nor in the same way;
• a limited number of genes are shared among all species;
The tree of one percent. Dagan and Martin (2006). Genome Biology, 7: 118.
![Page 51: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/51.jpg)
More generally, these methods suffer from difficulties related to phylogenetic tree construction:
• global sequence alignment (quality, gaps,...);
• different evolutionary histories of genes;
• substitution saturation;...
and, more seriously, from gene sampling difficulties.
![Page 52: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/52.jpg)
Gene tree - Species tree: The gene sampling problem
[Figure, adapted from Linder, Moret, Nakhleh, Warnow: true species tree for A, B and C, and the gene tree obtained after gene losses]
• Red is lost in C
• Blue is lost in A and B
• gene tree ≠ species tree
![Page 53: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/53.jpg)
Gene tree - Species tree: The gene sampling problem
[Figure: species tree and gene tree for A, B and C]
• All red orthologs have been lost in the 3 species.
• Luckily: sampling gives the blue orthologs. The true species tree is reconstructed.
![Page 54: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/54.jpg)
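Gene tree - Species tree: The gene sampling problem
[Figure: species tree for A, B and C, and gene tree with both gene versions present in each species (A A B B C C)]
• All versions of the gene are in the 3 species
• Gene trees are the same as the species tree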
![Page 55: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/55.jpg)
Genome trees are another alternative for constructing species trees.
• The concept of genome tree is based on overall gene content similarity (it considers more than single-gene information).
![Page 56: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/56.jpg)
Methodology
[Figure: data matrix T (entries k_ij > 0, with supplementary "sup" elements) → Correspondence Analysis → factorial axes F1 ... Fp → Classification]
• orthogonal system;
• use of euclidean distance;
![Page 57: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/57.jpg)
Systematic Analysis of Completely Sequenced Organisms
• In silico species specific comparisons (Tekaia & Dujon. J. Mol. Evol. 1999)
(27 eucaryal, 19 archaeal and 33 bacterial species: 541880 proteins)
[Diagram: each proteome is compared with Proteome1 ... Proteomen using blastp (PAM250 matrix, SEG filter)]
• 99 species (B: 33; A: 19; E: 27)
• total of 541880 proteins
![Page 58: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/58.jpg)
Systematic Analysis of Completely Sequenced Organisms
• In silico species specific comparisons (27 eucaryal, 19 archaeal and 33 bacterial species: 541880 proteins)
• Degree of ancestral duplication and of ancestral conservation between pairs of species;
• Families of paralogs (Partition-MCL);
• Families of orthologs (Partition-MCL);
• Distribution of orthologous families according to the three domains of life;
• Determination of the protein dictionary (orthologs);
• Determination of protein conservation profiles;
![Page 59: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/59.jpg)
Genome trees: data matrices
$T = \{T_{ij};\ i = 1,\dots,n;\ j = 1,\dots,n\}$, where n is the number of surveyed species.
$T_{ij}$ is the overall similarity score between species j and i.
• Ancestral duplication and ancestral conservation:
$T = \{T_{ij} = w_{ij} = (\text{number of proteins in } j \text{ conserved in } i)/\text{size}(j);\ i = 1,\dots,n;\ j = 1,\dots,n\}$.
n = 99 species and T corresponds to 541880 total proteins
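A minimal sketch of how such a conservation matrix could be assembled once the pairwise comparison results are available. The input layout (a dictionary of counts keyed by `(j, i)`) and the numbers are hypothetical; the diagonal entries would presumably correspond to proteins of a species with a match inside its own proteome.

```python
def conservation_matrix(conserved_counts, proteome_sizes):
    """Build T[i][j] = (number of proteins of j conserved in i) / size(j).

    conserved_counts : dict mapping (j, i) to the number of proteins of
                       species j that have a significant match in species i
    proteome_sizes   : dict mapping a species to its number of proteins
    """
    species = list(proteome_sizes)
    return {
        i: {j: conserved_counts[(j, i)] / proteome_sizes[j] for j in species}
        for i in species
    }

# Hypothetical two-species example.
sizes = {"X": 100, "Y": 50}
counts = {("X", "X"): 40, ("X", "Y"): 30, ("Y", "X"): 20, ("Y", "Y"): 10}
T = conservation_matrix(counts, sizes)
print(T["Y"]["X"], T["X"]["Y"])
```

Because each entry is normalised by the size of the proteome in column j, the matrix is generally not symmetric.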
![Page 60: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/60.jpg)
Ancestral duplication and ancestral conservation: Wij (values in %)
org SC SP CE DM AG CA ATH HS MUS FR PF ECUN
SC 40.5 63.9 17.5 27.1 22.3 65.9 23.4 22.9 27.3 18.0 22.5 35.8
SP 58.4 37.4 18.8 29.3 26.3 54.3 25.0 25.0 29.6 20.0 24.6 38.4
CE 38.1 46.6 65.2 51.9 50.6 35.5 27.5 44.6 54.4 42.4 24.8 34.8
DM 40.5 50.2 39.2 65.8 69.9 37.5 29.5 50.3 62.7 47.9 26.5 36.3
AG 40.9 50.2 39.8 73.1 59.5 38.0 30.6 50.2 60.3 48.7 26.5 36.0
CA 71.8 65.5 18.4 27.7 25.7 35.8 24.3 23.2 27.8 18.5 22.3 35.7
ATH 40.3 47.8 21.7 31.5 30.3 37.0 83.6 25.6 29.7 21.9 26.2 33.4
HS 43.0 53.3 40.0 61.3 54.5 39.7 32.1 66.7 90.8 68.8 28.2 37.7
MUS 41.7 52.5 39.5 62.1 54.7 39.1 31.5 76.8 77.8 67.7 27.6 37.2
FR 42.0 52.6 40.0 60.7 59.9 39.5 32.7 68.7 81.8 63.4 27.6 37.4
PF 25.9 31.2 13.1 19.3 15.9 22.2 16.3 17.2 21.0 13.2 28.3 28.9
ECUN 19.5 23.4 8.9 13.1 10.8 16.2 11.4 12.0 15.2 9.0 13.6 26.1
MJ 11.5 13.3 4.9 6.7 6.0 10.2 6.0 4.8 5.6 3.7 8.7 15.4
MTH 13.6 16.2 4.6 7.4 7.6 11.2 8.0 5.1 6.1 4.0 8.3 15.2
AF 14.4 16.5 5.9 8.2 8.7 11.8 8.7 5.6 6.6 4.5 8.6 15.4
PH 16.3 18.7 5.0 7.1 9.2 11.1 9.7 5.2 6.0 4.1 7.9 15.3
PA 14.3 15.2 5.4 7.5 7.3 11.9 7.4 5.5 6.4 4.3 8.3 15.9
APEM 15.5 20.1 4.8 7.3 10.6 10.3 9.4 5.2 5.9 3.9 7.2 14.9
TA 15.2 17.5 5.9 8.3 8.3 12.7 8.2 5.3 6.3 4.2 8.6 14.8
TV 15.4 17.8 6.2 8.3 8.7 13.3 8.3 5.6 6.8 4.4 8.7 15.0
H 14.8 17.7 5.8 8.3 9.8 12.0 10.2 5.5 6.6 4.5 8.0 13.9
SSP2 16.7 19.4 7.1 9.1 9.4 14.2 9.5 6.2 7.4 4.9 9.5 15.9
PFU 17.0 22.8 6.5 9.3 11.1 13.3 12.3 7.0 8.0 5.6 9.1 17.1
STO 18.6 23.1 6.8 8.6 11.4 13.7 11.1 5.9 7.1 4.5 9.1 15.7
PYAE 15.6 19.5 5.3 8.2 9.9 11.8 9.5 5.8 6.9 4.5 8.1 15.0
MA 16.0 18.9 7.1 10.8 12.5 14.7 9.7 7.4 8.7 6.4 9.8 17.0
MK 13.0 14.6 4.0 6.2 6.1 10.7 6.9 4.6 5.4 3.5 7.3 14.1
MMA 14.8 17.4 6.4 9.2 9.5 13.5 8.1 6.6 7.9 5.3 9.7 15.8
HI 13.0 14.3 4.8 7.3 8.5 11.1 8.7 4.4 5.4 4.0 8.2 8.7
…..
tnsp 74.4 79.2 49.7 76.4 81.0 72.6 58.8 78.7 93.7 72.8 42.3 48.1
![Page 61: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/61.jpg)
conservation tree
• species are clustered into 3 phylogenetic domains;
• bacterial species cluster with archaeal species;
• similar species cluster together;
• “whole genome” species clustering tree;
• very low resolution of deep clustering;
![Page 62: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/62.jpg)
Genome trees: data matrices
T = {Tij ; i=1,n; j=1,n; n is the number of surveyed species}
Tij is the overall similarity score between species j and i.
• Shared orthologous genes
{sij = (shared orthologs between i and j) }
T = {Tij = sij/size(j); i=1,n; j=1,n }
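A minimal sketch of this normalization, assuming that size(j) is the number of proteins in species j. The function name and the proteome sizes are hypothetical and serve only as illustration; the three sij counts are the SC, SP and CE values from the shared orthologous genes table.

```python
def normalized_ortholog_matrix(shared, sizes):
    """Return T with T[i][j] = shared[i][j] / sizes[j]."""
    n = len(sizes)
    return [[shared[i][j] / sizes[j] for j in range(n)] for i in range(n)]


# s_ij counts for SC, SP and CE.
shared = [
    [0, 2532, 1533],
    [2532, 0, 1753],
    [1533, 1753, 0],
]
# Hypothetical proteome sizes, for illustration only.
sizes = [6000, 5000, 20000]

T = normalized_ortholog_matrix(shared, sizes)
```

Because each column is divided by its own size(j), T is generally not symmetric even though sij is.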
![Page 63: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/63.jpg)
Note on: Homologs - Paralogs - Orthologs
Homologs: A1, B1, A2, B2
Paralogs : A1 vs B1 and A2 vs B2
Orthologs: A1 vs A2 and B1 vs B2
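[Figure: gene A in the ancestor is duplicated into A and B; speciation then gives A1 and B1 in Species-1 (S1) and A2 and B2 in Species-2 (S2). Labels: Ancestor, Duplication, Speciation, Evolution, Time, Sequence analysis.]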
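The definitions in this note can be sketched as a small classification rule. This assumes each gene is labelled by its copy (A or B) and its species (1 or 2), as on the slide; the function name and the "homologs" fallback label are hypothetical. It only restates the definitions and is not a method for detecting orthologs from sequence data.

```python
from itertools import combinations


def relationship(gene_a, gene_b):
    """Classify a pair of homologs given as (copy, species) tuples."""
    copy_a, species_a = gene_a
    copy_b, species_b = gene_b
    if species_a == species_b and copy_a != copy_b:
        return "paralogs"   # different copies within one species
    if species_a != species_b and copy_a == copy_b:
        return "orthologs"  # same copy separated by speciation
    return "homologs"       # related, but neither case listed on the slide


genes = {"A1": ("A", 1), "B1": ("B", 1), "A2": ("A", 2), "B2": ("B", 2)}
pairs = {
    (x, y): relationship(genes[x], genes[y])
    for x, y in combinations(genes, 2)
}
```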
![Page 64: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/64.jpg)
Shared orthologous genes (sij)
org SC SP CE DM AG CA ATH HS MUS FR PF ECUN
SC 0 2532 1533 1660 1671 3371 1582 1789 1733 1731 890 600
SP 2532 0 1753 1917 1907 2588 1754 2060 2032 2024 1008 645
CE 1533 1753 0 3910 3869 1611 1902 4036 3994 4047 1015 580
DM 1660 1917 3910 0 7018 1728 2094 5057 5147 5035 1106 616
AG 1671 1907 3869 7018 0 1738 2160 5016 5013 5059 1085 617
CA 3371 2588 1611 1728 1738 0 1590 1850 1824 1827 873 595
ATH 1582 1754 1902 2094 2160 1590 0 2404 2406 2399 1067 539
HS 1789 2060 4036 5057 5016 1850 2404 0 14053 10286 1185 638
MUS 1733 2032 3994 5147 5013 1824 2406 14053 0 10304 1169 632
FR 1731 2024 4047 5035 5059 1827 2399 10286 10304 0 1146 626
PF 890 1008 1015 1106 1085 873 1067 1185 1169 1146 0 453
ECUN 600 645 580 616 617 595 539 638 632 626 453 0
MJ 238 233 214 216 242 230 279 223 216 217 169 142
MTH 254 247 237 247 278 245 306 251 248 249 171 141
AF 261 255 254 260 303 248 310 260 263 265 182 151
PH 251 245 250 259 297 237 281 273 258 271 187 155
PA 267 261 255 268 311 256 312 276 273 278 189 156
APEM 212 233 228 228 251 215 242 248 237 230 165 136
TA 264 260 252 254 279 261 298 268 264 261 182 141
TV 263 255 256 249 276 258 296 260 258 270 184 138
H 255 264 258 249 284 248 318 271 267 272 173 140
SSP2 302 317 293 292 326 300 360 310 309 311 200 155
PFU 264 284 256 275 324 286 316 292 274 280 195 150
STO 281 291 273 263 313 278 329 293 282 298 196 143
PYAE 245 258 236 249 285 238 278 258 246 256 170 143
MA 303 316 298 293 368 301 369 329 326 326 200 161
MK 210 214 195 204 216 211 244 205 202 195 160 125
MMA 289 298 276 280 338 280 349 305 299 297 194 160
HI 268 273 231 243 388 268 382 259 259 267 181 86
![Page 65: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/65.jpg)
orthologs tree
• 3 phylogenetic domains;
• bacterial species cluster with archaeal species;
• similar species cluster together;
• better resolution of deep species clustering.
![Page 66: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/66.jpg)
• Large scale comparative analysis of predicted proteomes revealed significant evolutionary processes:
[Diagram: from ancestor to species genome. Evolutionary processes include Phylogeny*; Expansion* (duplication, genesis); Exchange* (HGT); Deletion* (loss, selection).]
Expansion, Exchange and Deletion are noise. They should be eliminated or at least reduced.
![Page 67: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/67.jpg)
To overcome some of these limitations, we consider genome tree construction from “Protein Conservation Profiles” and attempt to reduce noisy evolutionary processes.
![Page 68: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/68.jpg)
Conservation profiles
p 0111111000111111111000110110111101001111101111
• A “conservation profile” is an n-component binary vector describing a protein conservation pattern across n species. Components are 0 and 1, following absence or presence of homologs.
• 99 species (B: 33; A: 19; E:27); 541880 proteins
Main interesting properties of conservation profiles:
• A conservation profile is the trace of protein evolutionary histories jointly captured in a set of n species (multidimensional feature);
• Conservation profiles are signatures of evolutionary relationships;
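As an illustration of the definition, a conservation profile can be built from the set of species in which a protein has a homolog. This is a sketch under assumptions: the function name and the six-species ordering are hypothetical, and how homologs are detected is outside its scope.

```python
def conservation_profile(homolog_species, species_order):
    """Binary string with '1' where a homolog is present and '0' where it is absent."""
    present = set(homolog_species)
    return "".join("1" if s in present else "0" for s in species_order)


# Hypothetical ordering of a small subset of species.
species_order = ["SC", "SP", "CE", "DM", "HS", "MJ"]

# A protein with homologs in SC, SP and HS only.
profile = conservation_profile({"SC", "SP", "HS"}, species_order)
```

With n species, every protein gets a string of length n, so the whole data set is a proteins x species binary table.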
![Page 69: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/69.jpg)
Protein conservation profiles
Table: 541880 proteins x 99 species
E A B
S1..............I.............I................Sn
G1,1 100000000000000000000000000000000000000000000000
G2,1 111111111111111111111111111111111111111111111111
G3,1 111111110011111111111111011101110101111111101111
.......................................................
Gn1,1 100001110001000000000000000000000000000000000000
G1,2 010000000000000000010100000000000111000011100011
G2,2 010000000000000000010100000000000111000011100011
........................................................
Gn2,2 111111110011111111111111011101110101111111101111
........................................................
G1,n 011110100000000000000000001000000000000000000001
G2,n 111111110011111111100011011101110101111111101111
G3,n 111111110011111111100011011101110101111111101111
........................................................
Gnp,n 100110000000000000000000000000000000000000000001
• Different conservation profiles represent different evolutionary histories
![Page 70: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/70.jpg)
Distinct conservation profiles
original total proteins (99 species): 541880
non-specific proteins, i.e. conservation profiles (82%): 442460
distinct conservation profiles (42%): 184130
111111110011111111111111011101110101111111101111
100110000000000000000000000000000000000000000001
100000000000000000000000000000000000000000000000
111111111111111111111111111111111111111111111111
010000000000000000010100000000000111000011100011
................................................
• This set is indicative of the various observed evolutionary histories.
• Effect of the duplication process is reduced (one representative from each set of identical conservation profiles).
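Keeping one representative from each set of identical profiles is a de-duplication step. A minimal sketch, assuming profiles are stored as strings; the function name and the toy profiles are hypothetical. The last two lines only re-check the percentages quoted on the slide from its own counts.

```python
def distinct_profiles(profiles):
    """Keep one representative per set of identical profiles, in first-seen order."""
    return list(dict.fromkeys(profiles))


# Toy profiles over 4 species; identical profiles stand for duplicated histories.
profiles = ["1000", "1111", "1011", "1111", "1011", "1011"]
distinct = distinct_profiles(profiles)
fraction_distinct = len(distinct) / len(profiles)

# Percentages from the counts given on the slide.
pct_non_specific = round(100 * 442460 / 541880)  # share of all proteins
pct_distinct = round(100 * 184130 / 442460)      # share of non-specific profiles
```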
![Page 71: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/71.jpg)
[Plot: x-axis: conservation weights c01 to c99 (sum of "1": presence); y-axis: fractions (*10000) of distinct conservation profiles, scale 0 to 250.]
Presence in the 184130 distinct conservation profiles: Mean=32.2; SD=23.3; min=1; Max=99.
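The conservation weight of a profile is its number of 1s, and the plotted quantity is the fraction of distinct profiles at each weight, scaled by 10000. A sketch on toy data, with hypothetical function names; whether the SD quoted on the slide is a population or a sample standard deviation is not stated, so the population form used here is an assumption.

```python
from collections import Counter
from statistics import mean, pstdev


def conservation_weight(profile):
    """Number of species in which the protein is present (count of '1')."""
    return profile.count("1")


def weight_fractions(profiles, scale=10000):
    """Fraction of profiles at each weight, multiplied by `scale`."""
    counts = Counter(conservation_weight(p) for p in profiles)
    total = len(profiles)
    return {w: scale * c / total for w, c in sorted(counts.items())}


# Toy set of distinct profiles over 4 species.
profiles = ["1000", "1111", "1011", "0110"]
weights = [conservation_weight(p) for p in profiles]
fractions = weight_fractions(profiles)
summary = (mean(weights), pstdev(weights), min(weights), max(weights))
```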
![Page 72: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/72.jpg)
Genome tree construction: data matrices
• 184130 distinct conservation profiles (d.c.prof), representing various evolutionary histories.
[Figure: the distinct conservation profiles listed as rows, with two species columns i and j marked.]
• Jaccard similarity scores between species: sij = N11/(N11+N01+N10);
N11, N01 and N10 are respectively the total occurrences of (1,1), (0,1) and (1,0) between species i and j.
T = { Tij = sij ; i=1,n; j=1,n; n is the number of surveyed species }
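The Jaccard score between two species can be sketched by comparing their columns across all distinct profiles. Function names and the toy profiles are hypothetical; returning 0.0 when both columns are all zeros is an assumption for a case the slide does not cover.

```python
def jaccard(profiles, i, j):
    """s_ij = N11 / (N11 + N01 + N10) for species columns i and j."""
    n11 = n10 = n01 = 0
    for p in profiles:
        a, b = p[i], p[j]
        if a == "1" and b == "1":
            n11 += 1
        elif a == "1":
            n10 += 1
        elif b == "1":
            n01 += 1
    denom = n11 + n10 + n01
    return n11 / denom if denom else 0.0  # assumption: 0.0 when undefined


def similarity_matrix(profiles):
    """T = {T_ij = s_ij} over all pairs of species columns."""
    n = len(profiles[0])
    return [[jaccard(profiles, i, j) for j in range(n)] for i in range(n)]


# Toy distinct profiles over 4 species (one row per profile).
profiles = ["1100", "1110", "0111", "1001"]
T = similarity_matrix(profiles)
```

Joint absences (0,0) are ignored by this score, so two species are not counted as similar merely because both lack a protein.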
![Page 73: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/73.jpg)
Tekaia F, Yeramian E. (2005). PLoS Comput Biol.1(7):e75
profiles tree
![Page 74: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/74.jpg)
Conclusions: Methodology
• Species classification is not an easy task!
• Methods that take into account whole genome information are still needed;
• The correspondence analysis method can help reveal evolutionary trends embedded in the multidimensional relationships obtained from large scale genome comparisons;
• Species tree construction should take into account the whole information included in the genomes;
![Page 75: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/75.jpg)
Conclusions...
• Conservation profiles represent the most conserved and meaningful evolutionary signals jointly captured in a set of species;
• Thus they should correspond to the most accurate type of markers for species classification;
• In principle, the profiles tree derived from distinct conservation profiles should considerably minimize genome acquisition effects and should reflect less noisy phylogenetic signals;
• The profiles tree presents evidence of conservation of stable phylogenetic relationships and reveals unconventional species clustering;
• The profiles tree corresponds to the classification of the evolutionary scenarios.
![Page 76: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/76.jpg)
References:
• Tekaia, F. and Dujon, B. (1999). Pervasiveness of gene conservation and persistence of duplicates in cellular genomes. Journal of Molecular Evolution 49:591-600.
• Tekaia, F., Lazcano, A. and Dujon, B. (1999). Genome tree as revealed from whole proteome comparisons. Genome Res. 12:17-25.
• Tekaia, F., Yeramian, E. and Dujon, B. (2002). Amino acid composition of genomes, lifestyles of organisms, and evolutionary trends: a global picture with correspondence analysis. Gene 297:51-60.
• Tekaia, F. and Yeramian, E. (2005). Genome Trees from Conservation Profiles. PLoS Comput Biol. 1(7):e75.
• Tekaia, F. and Latgé, J.P. (2005). Aspergillus fumigatus: saprophyte or pathogen? Curr Opin Microbiol. 8:385-92. Review.
• Tekaia, F. and Yeramian, E. (2006). Evolution of Proteomes: Fundamental signatures and global trends in amino acid composition. BMC Genomics 7:307.
• Systematic analysis of completely sequenced organisms: http://www.pasteur.fr/~tekaia/sacso.html
![Page 77: Tekaia Evol Trends Proteomes](https://reader036.vdocuments.net/reader036/viewer/2022062412/577ccf2c1a28ab9e788f1279/html5/thumbnails/77.jpg)
References:
• Bininda-Emonds, O.R.P. (2005). Supertree Construction in the Genomic Age. Methods in Enzymology 395:745-757.
• Bininda-Emonds, O.R.P., Gittleman, J.L. and Steel, M.A. (2002). The (super)Tree Of Life: Procedures, Problems, and Prospects. Annual Review of Ecology and Systematics 33:265-289.
• Dagan, T. and Martin, W. (2006). The tree of one percent. Genome Biology 7:118.
• Delsuc, F., Brinkmann, H. and Philippe, H. (2005). Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet. 6:361-75. Review.
• Doolittle (1999). Science 284:2124-8.
• Driskell et al. (2004). Science 306:1172-1174.
• http://www.genomesonline.org/gold.cgi (list of genome projects)
• Crandall, K.A. and Buhay, J.E. (2004). Science 306:1144-1145.
• Linder, Moret, Nakhleh and Warnow: http://compbio.unm.edu/networks1.ppt
• Martin & Embley (2004). Nature 431:152-5.
• MCL: a cluster algorithm for graphs: http://micans.org/mcl/
• Pennisi, E. (1998). Genome data shake tree of life. Science 280:672-4.
• Rivera & Lake (2004). Nature 431:152-5.
• Raoult et al. (2004). Science 306:1344-1350.
• Snel, Bork and Huynen (1999). Genome phylogeny based on gene content. Nature Genetics 21:108-110.
• Snel, B., Huynen, M.A. and Dutilh, B.E. (2005). Genome trees and the nature of genome evolution. Annu Rev Microbiol. 59:191-209. Review.
• Woese et al. (1990). PNAS 87:4576-4579.