MS 1093857: Environmental Genome Shotgun Sequencing of the Sargasso Sea Venter et al., revised

Supplemental Online Materials

Material and Methods

Sampling Protocols. Sampling on the RV Weatherbird II was done as follows: Seawater (170 liters) from stations 11 and 13 was directly filtered through a 0.8 µm Supor membrane disc filter (Pall Life Sciences) followed in series by a 0.22 µm Supor membrane disc filter (Pall Life Sciences). The sample from station 3 was pumped into a 250 L carboy prior to being filtered through the impact filters. The length of time from collection of the sample until the end of the filtration step was approximately one hour. Filters were placed in 5 ml of sucrose lysis buffer (20 mM EDTA, 400 mM NaCl, 0.75 M sucrose, 50 mM Tris-HCl, pH 9.0) and stored in liquid nitrogen on the Weatherbird, then placed at -80ºC until DNA extractions were done. Alternatively, seawater (340 liters) was collected from 5 meters below the surface into a carboy, then filtered through a 0.8 µm Supor membrane disc filter (Pall Life Sciences), followed by concentration to 1 liter using a Pellicon tangential flow filtration (TFF) system (Millipore) with a 0.1 µm Durapore VVPP cartridge (Millipore); again the total time for the filtration and concentration was approximately one hour. Cells were pelleted at 10,000 rpm, 4ºC for 30 minutes. The impact filters and the retentate from the TFF were then handled as described above. The carboys, tubing and filter systems were cleaned with a 10% hydrochloric acid wash prior to each leg of the sampling. Any of the sampling equipment (tubing, etc.) that could reasonably be soaked was soaked in an acid bath for at least 24 hours. Sampling carboys were filled with the acid wash and "soaked" for at least 24 hours as well. All acid-washed items were subsequently rinsed very liberally with Milli-Q water. A liberal Milli-Q water rinse was also conducted between samples on the same leg. All spigots from the carboys were covered with a ziploc bag until needed. Tubing was stored in clean ziploc bags until needed.

Sample preparation. The impact filters were cut into quarters and placed in individual 50 ml conical tubes. TE buffer (5 ml, pH 8) containing 150 µg/ml lysozyme was added to each tube. The tubes were incubated at 37ºC for 2 hours. SDS was added to 0.1% and the samples were then put through three freeze/thaw cycles. The lysate was then treated with Proteinase K (100 µg/ml) for one hour at 55ºC, followed by three aqueous phenol extractions and one extraction with phenol/chloroform. The supernatant was then precipitated with two volumes of 100% ethanol and the DNA pellet washed with 70% ethanol.

DNA preparation. DNA was randomly sheared, end-polished with consecutive BAL31 nuclease and T4 DNA polymerase treatments, and size-selected by electrophoresis on 1% low-melting-point agarose. After ligation to Bst XI adapters (Invitrogen, catalog no. N408-18), DNA was purified by three rounds of gel electrophoresis to remove excess adapters, and the fragments, now with 3'-CACA overhangs, were inserted into Bst XI-linearized plasmid vector with 3'-TGTG overhangs. Fragments were cloned in a medium-copy pBR322 derivative.

Sequence assembly. With default parameter settings, the highly covered genome sequences would have been treated as repetitive DNA by the Celera Assembler. Since the Celera Assembler constructs scaffolds only from a backbone of sequence heuristically classified as unique, these organisms would not have been eligible for scaffolding and would have been absent from the final assembly. However, by tuning the threshold parameter for classifying unique sequence, we were able to compensate for the apparent repetitiveness of these genomic regions and scaffold them appropriately. This was accomplished by identifying the most deeply assembling, obviously non-repetitive contigs in an initial run of the assembler (in this case, the strong assemblies at 21-36x coverage which were identified as gene-rich Burkholderia-like and plasmid scaffolds), and using a value slightly below the calculated "A-statistic" (an empirical uniqueness measure within the Assembler) of these contigs as the threshold parameter in a subsequent run. This allows the deep contigs to be treated as unique sequence, when they would otherwise be labeled as repetitive. At the other end of the spectrum, rare organisms in the sample have been sampled by sequencing only to a shallow depth of coverage. Routine assembly would not have considered the small, shallow-coverage assemblies based on fragment overlaps as an eligible basis for scaffolding, due to a minimum length requirement of 1000 bp, which is typically in place for efficiency. Therefore, in the present use case, the organisms represented by these sequences would not have been ordered and oriented with mate-pairs without adjusting the default minimum length to compensate for the low anticipated coverage depth and assembly length. With this selection of parameters, more suitable to the environmental project at hand, we were able to adequately assemble both the dominant and rare species simultaneously.

Scaffold taxonomy. The taxonomy of different scaffolds was determined using sequence similarity as follows. Taxonomically informative blast hits were identified along the length of the scaffolds. Taxonomically uninformative hits are those instances where the best blast hit closely resembles the next best blast hit, implying that the gene is highly conserved and potentially ambiguous when informing the taxon of the containing scaffold. Informative blast hits were defined as full length hits (80%+ length of highest scoring blast hit) which had 15% or higher identity relative to the next most identical blast hit. The spans of the scaffold were associated with the appropriate NCBI taxonomies through the informative blast hits. The global percentage of each taxon present on the scaffold was determined. If the most prevalent taxon covered 20%+ more sequence or was 4 times more common than the second most prevalent taxon, then the taxon of the scaffold was assigned to the given taxonomy at the given taxonomic level. If these conditions were not met, each taxon was transformed to a higher, more general taxonomic level and the termination conditions were retested. If no taxon had been chosen upon reaching the superkingdom taxonomic level, then the most prevalent superkingdom taxon was chosen. The following taxonomic levels as specified in the NCBI taxonomic database were used: genus species, family, order, class, phylum, and superkingdom.
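
The level-by-level assignment described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the input layout (covered base pairs paired with a lineage dictionary), and the reading of "20%+ more sequence" as 20 percentage points of scaffold length are all assumptions made for the example.

```python
from collections import defaultdict

# Taxonomic levels in the order given in the text, most specific first.
LEVELS = ["genus species", "family", "order", "class", "phylum", "superkingdom"]


def assign_scaffold_taxon(spans, scaffold_length, min_margin=0.20, min_ratio=4.0):
    """Pick a taxon for one scaffold from its informative blast-hit spans.

    spans: list of (covered_bp, lineage), where lineage maps a level name
           to a taxon name (hypothetical input layout).
    Returns (level, taxon), or None if no span carries any lineage.
    """
    for level in LEVELS:
        covered = defaultdict(int)
        for bp, lineage in spans:
            taxon = lineage.get(level)
            if taxon is not None:
                covered[taxon] += bp
        if not covered:
            continue
        ranked = sorted(covered.items(), key=lambda kv: kv[1], reverse=True)
        best_taxon, best_bp = ranked[0]
        second_bp = ranked[1][1] if len(ranked) > 1 else 0
        # Assumption: the 20% rule is a difference in fraction of scaffold length.
        margin = (best_bp - second_bp) / scaffold_length
        dominant = margin >= min_margin or best_bp >= min_ratio * second_bp
        # At the last level the most prevalent taxon is taken unconditionally.
        if dominant or level == LEVELS[-1]:
            return level, best_taxon
    return None
```

With two spans that disagree at the genus species level but share a family, the sketch climbs one level and reports the shared family.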

Evidence-based gene finding. Evidence in the form of protein alignments was used to determine the most likely coding frame. The approximate start and stop positions were likewise determined from the bounding coordinates of the alignments and refined to identify specific start and stop codons. This methodology was applied to several different lines of evidence in a series of steps as follows: First, evidence was generated by searching the bacterial portion of the nraa dataset against the Sargasso data using tblastn. All blast searches were performed using NCBI blastall version 2.2.6. Unless stated otherwise, all blast searches were performed with these generic parameters: -Y 3000000000000 -F "m L" -U T -v 5. The NCBI non-redundant amino acid dataset (nraa) was downloaded from NCBI on September 2nd and contains 1,510,260 peptides, of which 626,877 were classified as bacterial. The tblastn searches using the bacterial subset of nraa were performed with these additional parameters: -e 1e-4 -b 10000 -K 10000. The blastx searches against all of nraa were done with these additional parameters: -e 1e-3 -b 5000 -K 10000. The tblastx searches used these parameters: -e 1e-3 -b 5000 -K 5000. The search of the Sargasso data against itself using blastn used these parameters: -W 20 -K 10000 -b 10000 -q -9 -e 1e-40.

Computing role categories. A valid overlap was defined as over 40bp in length and occurring in the same reading frame as the predicted gene. Genes with three or more corroborating blast hits were assigned to the most frequent role category. Additional roles were assigned to genes if the additional roles were supported by at least three quarters as many blast hits as the most frequent role.
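
The role-assignment rule above can be illustrated with a short sketch. The function names and the input (one role label per valid overlapping blast hit) are hypothetical, and the sketch assumes that the "three or more" requirement counts all valid hits for the gene rather than hits for a single role.

```python
from collections import Counter


def is_valid_overlap(overlap_bp, hit_frame, gene_frame):
    """A hit counts only if it overlaps by more than 40 bp in the gene's frame."""
    return overlap_bp > 40 and hit_frame == gene_frame


def assign_roles(hit_roles, min_hits=3, extra_fraction=0.75):
    """Return the role categories for one gene, most frequent first.

    hit_roles: role category of each valid corroborating blast hit.
    """
    if len(hit_roles) < min_hits:
        return []  # too little evidence to assign any role
    ranked = Counter(hit_roles).most_common()
    top_count = ranked[0][1]
    # Keep the top role plus any role with at least 3/4 as many hits.
    return [role for role, n in ranked if n >= extra_fraction * top_count]
```

For a gene with four "energy metabolism" hits, three "transport" hits and one "DNA metabolism" hit, the sketch returns the first two roles, since three hits meet the three-quarters threshold and one does not.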

Phylogenetic marker identification. 16S ribosomal sequences were identified in the assembled Sargasso sequences using NCBI blastall (version 2.2.6) with the following parameter set: -p blastn -e 1e-25 -r 3 -q -4 -Y 3000000000000 -F "m L" -U T -K 20000 -b 20000 -v 5. The queries were a set of 187 ungapped representative 16S ribosomal sequences (1). The results were then filtered to remove HSPs less than 80% identical and shorter than 200 bp. Intervals were determined using the bounding coordinates of overlapping HSPs.
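
The filtering and interval step can be sketched as below. The tuple layout and function name are hypothetical, and the sketch assumes that an HSP is dropped if it fails either the identity or the length criterion; the wording above leaves that point open.

```python
def marker_intervals(hsps, min_identity=80.0, min_length=200):
    """Filter HSPs, then merge overlapping ones into bounding intervals.

    hsps: list of (scaffold_id, start, end, percent_identity) with
          1-based inclusive coordinates (hypothetical layout).
    Returns a list of (scaffold_id, start, end) intervals.
    """
    kept = [
        h for h in hsps
        if h[3] >= min_identity and (h[2] - h[1] + 1) >= min_length
    ]
    kept.sort(key=lambda h: (h[0], h[1]))

    merged = []
    for scaffold, start, end, _identity in kept:
        last = merged[-1] if merged else None
        if last and last[0] == scaffold and start <= last[2]:
            # Overlaps the previous interval: extend its right boundary.
            last[2] = max(last[2], end)
        else:
            merged.append([scaffold, start, end])
    return [tuple(m) for m in merged]
```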

Computational taxonomy. Phylogenetically informative genes were assigned to prokaryotic taxonomic groups using an automated, iterative phylogenetic analysis (Eisen et al., in preparation). In summary, the method works by (1) aligning the new rRNA sequence to a pre-aligned set of small-subunit rRNAs that contain representatives of major phylogenetic groups; (2) performing a phylogenetic analysis of these sequences and identifying the closest 2-3 major phylogenetic groups to the query sequence; (3) aligning the query sequence to a pre-aligned set of small-subunit rRNAs from just those 2-3 major taxonomic groups identified in step 2 along with 3-4 outgroup sequences; (4) performing another phylogenetic analysis; and (5) identifying the nearest neighbor(s) in the phylogenetic tree and assigning the query sequence the taxonomic ID of the nearest neighbor. The number of markers assigned to different major phylogenetic groups was then counted, weighting by the fold coverage of the contig in which the marker was found so that the final count is an estimate of the abundance of each phylo-type in our sample.
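
The coverage-weighted count in the last sentence can be sketched as follows. The function name and input pairs are hypothetical; the sketch only shows the weighting and does not perform any of the alignment or tree-building steps.

```python
from collections import defaultdict


def weighted_group_abundance(markers):
    """Estimate relative abundance of each phylogenetic group.

    markers: iterable of (group, contig_fold_coverage), one entry per
             marker gene assigned to that group (hypothetical layout).
    Returns a dict mapping group to its share of the weighted count.
    """
    totals = defaultdict(float)
    for group, coverage in markers:
        # Each marker contributes the fold coverage of its contig.
        totals[group] += coverage
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {group: total / grand_total for group, total in totals.items()}
```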

Log-normal parametric estimation of diversity. Curtis et al. (2) describe a method for estimating total species diversity given the total number of organisms, the fraction of the total made up by the most abundant species, and the abundance of the least abundant species. The first two are readily estimated from our data; cell counts imply approximately one billion cells per liter or 2x10^11 cells per 200 L sample, while the relative abundance of the most common organism ranged from 12% of the total (sample A) to approximately 3% of the total (sample B). For the third parameter, we arbitrarily assumed a lowest abundance of 1 cell per liter (1 part per billion). From this, we obtain an estimate of total diversity of more than 3000 species in the A sample and more than 17000 species in the B sample. (If we were to assume a reduced minimum abundance, this would lead to higher estimated diversity; if we increased the minimum abundance, the estimates would decrease.)

Depth of Coverage Diversity Models. Assuming standard random models for shotgun sequencing, sequencing of an environmental sample should result in depths of sequence coverage reflecting a mixture of Poisson distributions. We computed the empirical distribution of coverage depth at every position in the full set of assemblies (including single fragment contigs, but not counting gaps between contigs), and compared it with hand-constructed mixtures of Poisson distributions. An excellent fit can be obtained; the first column of Table 3 specifies one such mixture, whose fit to the empirical data is shown in Fig. S2. To the extent that a limited range of mixtures give acceptable fits, this may be used to estimate the diversity of the water from which DNA was extracted. For instance, the mixture in the first model of Table 3 implies the following: The assembly as a whole covers 650 Mbp of consensus sequence (including singletons). Assume an average genome size of 2.0 Mbp (≈ total base pairs in the assembly / # of distinct copies of phylogenetic markers). The number of genomes corresponding to a specific sequence coverage level (i.e. a row in Table 3) can be computed as follows: letting the total assembly length be L and the average genome size be g, if fraction f of the assembly is contributed by organisms sampled to sequence coverage c, the number of species represented by this coverage level may be estimated as f * L / (g * c); the corresponding estimate is shown as the last column in the table.
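
The per-row estimate f * L / (g * c) can be worked through in a few lines. The values of L and g are the ones stated above; the mixture rows in the usage note are made-up placeholders, since Table 3 is not reproduced in this text, and the function name is hypothetical.

```python
ASSEMBLY_LENGTH_BP = 650e6   # L: total consensus sequence, from the text
GENOME_SIZE_BP = 2.0e6       # g: assumed average genome size, from the text


def species_per_coverage_class(mixture, assembly_length=ASSEMBLY_LENGTH_BP,
                               genome_size=GENOME_SIZE_BP):
    """Estimate the number of species in each coverage class.

    mixture: list of (f, c) rows, where f is the fraction of the assembly
             contributed by organisms sampled to sequence coverage c.
    Returns one estimate per row, computed as f * L / (g * c).
    """
    return [f * assembly_length / (genome_size * c) for f, c in mixture]
```

For an illustrative two-row mixture `[(0.5, 0.25), (0.5, 5.0)]`, the half of the assembly at 0.25X corresponds to 650 genomes and the half at 5X to 32.5 genomes, which shows how strongly the total is driven by the low-coverage class.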

The resulting estimate of the total number of organisms is not a tight one, in the sense that it is difficult to set an upper bound. The second model in Table 3 shows an alternative mixture leading to as good a fit to the empirical distribution, but different conclusions regarding the total diversity (nearly 10-fold increase in estimated total species). For this model, we set the lowest abundance organism to make up about 2 ppm of all organisms. Dropping this parameter to, e.g., 1 part per billion (i.e. about 1 cell per liter of water), the estimated total species increases nearly another 2000-fold.

However, a stable lower bound is more approachable. We found that it was possible to give a good fit to the empirical distribution with the lowest abundance class corresponding to organisms sampled to 0.25X (see Model 3 of Table 3), while similarly good fits were not possible with a lowest abundance of 0.35X (not shown). Model 3 indicates that the least divergent acceptable fit that we achieved implies a total number of species of approximately 1800. Notably, the great bulk of "species" are covered only at 0.25X; such organisms would constitute approximately 1 cell per 3000 in the sample. Thus, even under the most conservative model, we would require 12-fold greater sampling depth to obtain 95% of the sequence of all species in the sample (i.e. 3X coverage).
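
The link between 3X coverage, 95% of the sequence, and the 12-fold figure follows from the Poisson model used in this section: at coverage c, the expected fraction of a genome sampled at least once is 1 - exp(-c). The sketch below checks that arithmetic; the function names are hypothetical.

```python
import math


def fraction_covered(coverage):
    """Expected fraction of a genome hit at least once at a given coverage."""
    return 1.0 - math.exp(-coverage)


def coverage_needed(target_fraction):
    """Coverage required to sample a target fraction of a genome."""
    return -math.log(1.0 - target_fraction)


current = 0.25                      # coverage of the rarest class in Model 3
required = coverage_needed(0.95)    # just under 3X
fold_increase = required / current  # close to 12
```

Under the same model, organisms at 0.25X have only about 22% of their sequence sampled, which is why most of the low-abundance genomes remain largely unseen.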

Phage searches. A total of 241,900 predicted ORFs from 23,651 scaffolds ranging in size from 826 to 2,098,317 bp, as well as ~600,000 predicted proteins from singletons, were searched (BLASTP) against a database of all complete and incomplete phage genomes from GenBank as well as an in-house catalog of prophage regions (255 phages, ~30,000 protein sequences) for the presence of putative phage genes and regions.

Supplemental Text

In Sample 1, the most abundant species made up an average of 11.8% of the total fragments (range 7.5% to 18.7%), with at least the high end of the range in keeping with previous observations regarding diversity in ocean waters (28). However, in the other samples we observed an average maximum abundance of only 3.3% (range 2.1-5.7%). This is a level of diversity more akin to what has been observed in terrestrial samples (28).

Figure S1. Comparison of hydrographic features at the three stations sampled during February 2003. Upper panel) water column temperature; Mid panel) dissolved oxygen concentrations; Lower panel) 685nm in vivo fluorescence.

Figure S2. Circular depiction of Burkholderia species. The first concentric circle represents genes whose colors have been assigned to the following role categories: amino acid biosynthesis, violet; biosynthesis of cofactors, prosthetic groups, and carriers, light blue; cell envelope, light green; cellular processes, red; central intermediary metabolism, brown; DNA metabolism, gold; energy metabolism, light gray; fatty acid and phospholipid metabolism, magenta; protein fate and protein synthesis, pink; purines, pyrimidines, nucleosides, and nucleotides, orange; regulatory functions and signal transduction, olive; transcription, dark green; transport and binding proteins, blue-green; genes with no known homology to other proteins and genes with homology to genes with no known function, white; genes of unknown function, gray. The second outermost concentric circle depicts the positions of scaffold breaks in the assembly of this environmentally sequenced organism. The third concentric ring depicts Blast similarity scores to the partially determined sequenced genome of Burkholderia fungorum (sequenced by the Department of Energy Joint Genome Institute). Matches of >98% similarity are shown in red, >90% are shown in gold, and >80% similarity are shown in grey. The fourth concentric circle depicts the location of ribosomal RNAs. Tick marks are placed at 100 kb intervals.

Figure S3. Gene conservation among closely related Shewanella species. The outermost concentric circle of the above diagram depicts the completed genomic sequence of Shewanella oneidensis MR1 []. Fragments from environmental sequencing were compared to this completed Shewanella genome; they are shown in the inner concentric circles and were given boxed outlines. Genes for the outermost circle have been assigned pseudo-spectrum colors based on the position of those genes along the chromosome, where genes nearer to the start of the genome are colored in red and genes nearer to the end of the genome are colored in blue. Fragments from environmental sequencing were subjected to an analysis that identifies conserved gene order between those fragments and the completed Shewanella genome. Genes on the environmental genome segments that exhibited conserved gene order are colored with the same color assignments as the Shewanella MR1 chromosome. Colored regions on the environmental segments exhibiting color differences from the adjacent outermost concentric circle are the result of conserved gene order with other MR1 regions and probably represent chromosomal rearrangements. Genes that did not exhibit conserved gene order are colored in black.

[Figure: tip labels from what appears to be a tree figure, pairing environmental sequence fragment identifiers (for example UEAHX82TF and SLARZ85TF) with reference sequences (for example S000008864_Coxiella_burnetii, S000010884 Helicobacter pylori, and S000002415_Thermotoga_maritima). The image, its caption, and the remaining labels were not recoverable from the extracted text.]

S000089958_Ehrlichia_sp.S000010380_Neorickettsia_risticii

S000003165_Rickettsia_rickettsiiS000021452_Rickettsia_conorii

S000019137_Phaeoceros_laevisS000016997_Notothylas_orbicularis

S000097133_Chaetosphaeridium_globosumS000000011_Caedibacter_caryophila

S000003055_Gluconacetobacter_oboediens

S000006293_Rhodobacter_sphaeroidesS000001437_Sphingomonas_paucimobilis

S000017512_Mesorhizobium_lotiS000004588_Phyllobacterium_myrsinacearumS000003962_Bartonella_vinsoniiS000005680_Rhizobium_radiobacterS000008688_Rhizobium_radiobacter

S000010599_Rhizobium_tropiciS000010858_Rhizobium_sp.

S000016377_Brucella_melitensisS000006156_Brucella_melitensis

S000093399_Sinorhizobium_melilotiS000011258_Rhodobium_orientis

S000001633_Bradyrhizobium_j__ aponicumS000115943_Bradyrhizobium_j__ aponicum

S000005069_Beijii erinckia_indicaS000011209_Hyphomicrobium_zavarziniiS000000819_Methylosinus_trichosporiumS00000113111_Caulobacter_fuff sifoff rmis

S000002398_Rhodospirillum_rubrum

S000006281_magnetic_coccusS000091698_Shigella_flff exneriS000006111 i i iS000006111 E h i hi li

a-Proteobacteria

g-Proteobacteria

2206004_26398

2223777_0SZACL47TF_0

2017930_0SSBFP38TR_0

SSBQN35TF_222185004_02185004_2652185004_808

2207297_2542222222000677022099377_7224552442204889_859

2205996_496SLAAK89TR_0

SSARQ77TR_01979184_0

SLAMC19TF_214SLAON28TR_306

2168235_90811962431_0SSAFY10TF_43

SLAVP43TR_0SLBL242TR_12SLADF40TR_11SKARF80TR_42014277_3412206005_540370SSADN80TF_1

1982203_288SLAQE59TF_34SKAA971TR_352SS1KK6KKAA8AA2AAAA3AA9959977_911TT6TTRR2RR4RR__SLABP03TR_267SLAQE59TF_4172205996_24SLBAY06TR_96SKAA971TR_418SKBHQ63TF_11SKAA952TF_0SLADA36TF_9

2207365_3SSAI655TR_0SKAKK NZ80TF_02004803_0

SLAHPHH 3PP 3TF_18SLAHP3PP 3TF_401SSAS691TF_02204941_0SSAVB84TR_101SSAVB84TR_572SSAJD86TF_39SLBHS89TR_510SSSLLLALBBLLL LHHAABB LHHSSLLHH 0SS88LS 58899008888 T99TTFTTRRTTTT _1__855SSAEO68TR_0SSAEO68TR_298SLATJ90TR_385SLATJ90TR_2

SKBE206TF_02207297_22952206003_2702204889_3871978298_272SSAVX78TR_0SLBHS89TR_39SLATK58TR_32SLAL LAA LLL 050 TF_4044 1SLALH24TR_0SLALH24TR_4502209875_30742209875_2518SLATK58TR_415SKATN61TR_0SKAY816TR_250SSSKKKAAAYYY888111666TTTRRR_022

UDAD419TR_0UEALK85TR_0SLBP273TR_0

SKAEQ60TF_4052172204_669UBAXC40TF_556SSAEO68TF_616

2206003_13062204889_14232207297_3106SSAK572TR_0

UEUU AEE TAA L3L 8TF_0SKAQ237TF_0

UBAA908TF_455UBAF370TR_154

SSBAC40TRB_02164519_0

UEAAP04TF_4302174847_859SLBBP45TR_02223541_27332

SLBMM72TF_460UGAAH07TF_549

2224350_28556SSAWG31TR_0

SSAFY10TR_1151957233_339

SKAFC20TF_01960893_220SLBDQ05TR_5SLBHD42TF_25

2109787_647UBAN704TF_9UBAWN49TF_4662172645_0SSBQ163TR_0

UBAAY34TR_5UAAZA46TF_597

2134317_3UEAAT69TR_544

2204303_2080SZADAA FF4622T2222F2200_04433SHACC41TF_0

2210169_1170UDABM31TF_0

UEAXR41TF_534SKBFO87TR_386SSAZO52TF_0

SZAKH75TF_671SZAH833TF_468

UDAVV91TF_459UAAVH60TR_540

UEAX479TR_8UAAUH72TR_397

2209885_32431SLBBK59TF_02164841_9442164841_5652014371_258

SLBI047TF_0SKBKK IZ87TF_02003824_418233881882262244244__9__444441111188_7

2164162_02209885_31905SSAE887TF_289SKBDQ26TR_1

2217636_6469SVADF22TF_189

UDAPR05TR_402UDAPR05TR_72168383_0

1997588_02200583_1378

2222148_66262222148_7016UDAF289TR_1842224283_4408

UAALJ43TR_02196367_12196367_490

UEAJA31TR_6UEATB96TR_102161755_18330

SZANR04TF_273UAATX69TR_6

2161755_2374UEAAB62TF_0UEAAB62TF_500

2222754_11045UBAAP03TR_491UBAAP03TR_42196059_3654

2022711_12212055_8784

2223448_937732223448_944762222648_472

2222648_10UAANY84TF_0UBBDM95TR_0

UAAQT22TR_2942118258_222118258_566222011611088622955388_255366366UAAQT22TR_2

2029321_4232029321_50

UBAJO67TF_0UAAJI88TR_0SZAB747TR_1

UEALJ18TR_255SZAP043TR_19SZAP043TR_400SSAOU01TF_121UAAON55TR_0SZAH554TR_290SZAH554TR_29SKAWD20TR_02222323_54452216346_9054UEAEG84TR_52UBARV21TF_366UDAUF20TR_152040891_274222100744700488299311_122177UABC623TF_520

UBAQ034TR_0UEAOK14TF_44UEAOK14TF_507

2222024_77752222024_8318

SZAK983TF_0SSBW144TF_577SVAC945TF_33SXADC45TR_285

SSBIE24TR_0UBAXR71TR_0UEAG944TR_1SSBW144TF_34UDAXT40TR_782205986_0UEAA502TR_83UEAA502TR_5462222168_524962027331_21222000222777333333111_4228114SVAB757TR_5282219763_63682222168_53320SVAB757TR_0

2216346_8028SZAPWPP 7WW 1TF_1SZAUC77TF_44UEAYC52TR_2

2183284_02221838_62162117115_384SXAAP55TF_8

SSBIE24TR_530SSBV348TR_7SSBV348TR_364UBARV21TF_0

UABC623TF_0SZANM37TR_0SZANM37TR_4332222000_25053S2222S2222B22220W000000200__4__2242255T5500R0055_651

2222000_203772222000_240262210126_16062210126_1985UEAYO57TF_505

2210749_19432210749_2886

2221059_02223785_7092SSBLW54TF_2262223785_8179

2222168_52757UBAXR71TR_418

2216346_8491UDAS070TR_0UDAS070TR_330

2222000_244902210749_2323

SLBBW75TF_0SLBBW75TF_510

UDANC82TF_0 UAAFJ36TR_0UEABG25TF_303UAAS019TF_1UBAKY86TF_0SZAOS18TF_2242214423_1735

UAAZS84TR_3172098146_02220759_8348SZAQQ62TF_0UBAKA68TR_302UBBD071TR_208

SZARS20TR_3342215512_1185UAAIA61TF_288

2191384_02222182_0UBAZ466TR_0

2127406_0UDALP80TR_0UBANZ54TF_486SSZANUUCUUBB3BBAA3AANNTNNFZZ_05544

2217535_33572210361_6423UBAP231TR_02208716_1835SSBFN78TR_02213293_1661UEABF84TF_0

1974399_02217805_9338

SLBCE58TR_2SZAOS18TR_11UAAS754TR_342

2174080_3312174080_92

SSBBV40TR_0SSBR745TR_2

2204718_6322SLALW15TR_0

2079300_254SZAB409TF_422SS2ZZ2ZZAA0AABB4BB7BB447400_1TT0TTFF9FF__0__442223558_4098

UDAG039TF_02224350_26885

2042287_354SSBTJ43TF_225

SZADA60TF_0UBAR243TF_02189846_0

2189846_3702217250_2631UBBE956TR_0

SZABC47TF_17SZABC47TF_301

2220847_90402220847_87772005986_3472220847_10049

2222979_19952222979_2451

2223725_3053222222222333777222555_3335003551332190204_4489

2190204_4964SSBCE05TF_191

SSBMV68TF_19SSBMV68TF_486

2163373_3402S22S1166A6633Q33337B7733233__1_33T3344R440022_0

SSBK370TF_21SSBK370TF_325

S000106052_Pseudomonas_putida

S000006111_Escherichia_coliS000108032_Escherichia_coli

S000002445_Escherichia_coliS000016511_Escherichia_coliS000010177_Salmonella_entericaS000118416_Salmonella_entericaS000016621_Salmonella_tyt phimurium

S000011588_Enterobacter_nimipressuralisS000084622_Yersinia_pestisS000100555_Yersinia_pestis

S000002974_Buchnera_sS000019453_Buchnera_a

S000116859_Buchnera_aphidicolaS00001SS7SS00600000000020011_P1166ast559e__u__reluucclcca_mulaa__t__o__ cppippdhhaS000002757_Pasteurella_multocida

S000010886_Haemophilus_inflff uenzaeS000085926_Haemophilus_ducreyi

S000008226_Vibrio_vulnififf cusS000004812_Vibrio_vulnififf cusS000118506_Vibrio_parahaemolyticusS000119969_Vibrio_parahaemolyticus

S000009984_Aeromonas_salmonicida

S000001297_Alteromonas_macleodiiS000004608_Colwellia_maris

S000101290_Shewanella_oneidensis

S000011298_Xylella_faff stidiosaS000111743_Xylella_faff stidiosa

S000005301_Xanthomonas_campestrisS000014800_Xanthomonas_axonopodis

S000002403_Legionella_lyticaS000013909_Moraxella_catarrhalis

S000001626_Piscirickettsia_salmonisS000010363_Cardiobacterium_hominis

S000011651_Thiothrix_niveaS000002740_Methylococcus_capsulatus

S000009262_Methylococcus_capsulatusS000002004_Methylobacter_whittenburyiS000004362_phototrophic_bacterium

S000018505_Allochromatium_minutissimum

S000114785_Pseudomonas_aeruginosaS000118952_Pseudomonas_syringae

S000002018_Pseudomonas_flff uorescens

S000010423_Oceanospirillum_linumS000010430_Halomonas_halodenitrifif cans

S000118053_Coxiella_burnetiiS000008864 Coxiella burnetii

S000006111 E h i hi li

saa

S000006111 E h i hi li

S000008864 Coxiella burnetiiSLBAZ62TR_20SLBKR58TR_02166002_723

2222437_02223162_15636

SLBKR58TR_4002166002_1287SSAEX35TF_4162216180_25393SKBDP39TR_02KKBB2BBDD2DD0PP9PP3323399299TT_3RR9RR__6__0053SSAYB15TR_16SKAGP42TR_0

SKAK T789TF_0SSAEA X3X 5TF_02205994_0SKBQK49TR_70

2012328_1652216180_24998SKBO377TR_0SKBI184TR_133SKAVM22TR_0SSAIA 2II 322 1TF_4SKAVP05TF_0SKAXG19TF_0

SSAKV78TF_336SSAZB85TR_0

2220922_441842166002_305SLACQ45TR_159SLAD430TR_214SLLBSSPSSLL9LLAAPPLLLL2AA4ADDTDD44F4433_000TT

SSAWK76TF_0SKBSQ09TR_0

2168827_438SKBDY39TF_245SKAD325TR_0

SSBHS32TR_0S000000620_Francisella_tularensis

2184371 1202

S000001305_Bordetella_bronchiseptica

S000013804_Burkholderia_malleiS000015422_Burkholderia_glathei

S000010054_Oxalobacter_foff rmigenesS000088903_Ralstonia_solanacearum

S000022198_Comamonas_testosteroniS000007648_Alcaligenes_defrf agrans

S000001010_Nitrosospira_multifoff rmisS000009840_Nitrosomonas_sp.

S000000630_Methylophilus_methylotrophusS000003176_Spirillum_volutans

S000004788_Rhodocyclus_tenuisS000016789_Neisseria_meningitidisS000003950_Neisseria_gonorrhoeae

S000021436_Chromobacterium_violaceumS0000167667_Hydrogenophilus_thermolSSut000eol00us

_ _

b-Proteobacteria2184371_12022184371_147UBAUS46TF_44

UBAIL87TF_02223086_115872223086_121432221592_6108

UBBDU27TF_0

UAAS013TR_133SSBSM11TR_211SZAZ T6466 6TF_98

SZAT646TF_594UAAYH66TR_0

2218061_3090UBAQY50TF_0

2218061_3942UAACQ89TF_0UUUAAAAAABCC9QQ1QQ8898899T99TTRTTFF_0__00

2222093_13902222093_1946UAAYH52TR_02223614_41430

SSAO409TR_8UBBBS40TR_54UBBBS40TR_5282223906_43440SSAO409TF_222199184_13212199184_844

SZAIAA AII 3AA 8TF_02031035_22142268_3822223903_58123SSBAX28TR_193UAAF858TF_732223903_576432223903_58698

2223906_44310UBAL9441TR_7222101614_4112101614_5

SSBK801TF_0SSBK801TF_381SZAK683TF_2

SSBA737TF_519SSBA737TFB_4702208975_2163

UEAPB65TF_02171897_2079SKAWQ36TR_22

UAAOW61TF_0UDAR923TF_17UEU ATS71TF_352216321_141682223779_160052223779_16548UBADR91TR_1

UDANH18TF_13UDANH18TF_3762UU1UU8DD0DDAA9AA2NN8NNHH_5113887TT

2215722_56522221689_3571

S000019656_uncultured_eubacterium

2184371 12022184371 1202

UE HX TF

2184371 1202

A 82 208

Sar406UEAHX82TF_208UEAHX82TF_623SKBMS36TR_0SVAEP01TR_0SZAFG56TR_2UDABC77TF_625UDABC77TF_210UDAJ409TR_0SKABC71TF_141SKAX690TR_0UDAL343TR_0SKBCD03TR_0SKBIP95TR_0UEAP990TF_0SKBIV20TR_0SZAOR48TR_2882223338_139142223338_13481S22K2222B2233M3333L88_3__1111133T3344R4488_323

UBAOX12TR_445UHAAS27TF_0

UHABC36TF_1SLBNJ24TF_434UEAY946TR_0SHAAC33TF_19SHAAC33TF_432

UEAY946TR_3842207251_34872207251_39001967294_0

SSBIY81TR_12006646_0SLAUL19TRB_4

SLASQ30TR_1SSARV90TF_19SSARV90TF_432

SLAAO50TR_250

2211650_2130

UAAM001TR_3

2216336 1212

S000010676_unidentififf ed_bacterium_SAR7S000009062_Synechococcus_sp.S000SS0SS000000050000000003000060099_P0066ro__c__hSSylyyoyy rococcus_ma__rppippnpp us

S000015664_Synechococcus_PCC6301S000012320_Anabaena_sp.S000000043_Nostoc_sp.S000001387_Arthrospira_sp.

S000004880_Symploca_sp.S000004780_Pleurocapsa_sp.

S000002567_Gloeothece_sp.S000003495_Synechocystis_PCC6803

S000092098_Thermosynechococcus_elongatus

S000020846_uncultured_bacillariophyte

S000018212_Colacium_vesiculosumS000019808_Strombomonas_costata

S000116525_Cuscuta_subinclusaS000104718_Sphagnum_rubellumS000020841_Phaeoceros_laevis

S000099098_Chaetosphaeridium_globosum

UEAHX82TF 208

2216336 1212

Cyanobacteria

CFB

S000015337_Desulfoff bulbus_elongatus

SSS000000000000111000888888444_HHHeeellliiicccooobbbaaacccttteeerrr_pppyyylllooorrriiiS000006688_Helicobacter_pylori

S000004328_Helicobacter_hepaticusS000003598_Wolinella_succinogenes

S000011564_Campylobacter_j__ ejee uniS000005472_Campylobacter_j__ ejee uniS000011167_Campylobacter_feff tus

S000007307_Desulfuff rella_propionicaS000013074_Desulfoff vibrio_halophilusS000003903_Desulfoff vibrio_sp.

S000022341_Desulfoff halobium_retbaenseS000019504_Desulfoff microbium_bacul00at0000umS000008608_Cystobacter_fuff scusS000011612_Myxococcus_xanthus

S000000283_Nannocystis_exedensS000017105_Syntrophobacter_pfeff nnigii

S000015868_Geobacter_metallireducensS000004966_uncultured_bacterium

S000008920_Pelobacter_carbinolicusS000007761_Desulfuff romonas_thiophila

S000020733_Desulfoff bacter_vibriofoff rmisS000015757_Syntrophus_buswellii

SSS000000000000111666777555333 SSStttreptttococcus pyogenes

S000010884 Helicobacter pylori

S000016753 Streptococcus pyogenes d,e-ProteobacteriaLowGC Gram Positive

HighGC Gram Positive

2220155_02223541_2846021221222202222022333335565544_311__7__222170711_15284

2093817_02166142_0

UBAH566TF_02221038_4027

UDAL384TR_0SLAVD64TR_103SSAZK64TR_155SLAJQ21TR_188

UBAX012TR_4252177971_868

SLBIF29TF_483SS000000002200770033 CChh ii tii

2220155 0

S000020703 Ch i ti

Misc. Phyla

2198450_2652198450_641UAAMN53TR_263

UAAPX38TF_34UAASH38TR_181

UBAC657TF_02198383_0

UBAM108TR_505S000014329_unidentififf ed_eubacterium_SAR307

S000007398_unidentififf ed_bacterium_env.slt78S000015625_uncultured_eubacterium_env.wchB50

S000015685_Chloroflff exus_aurantiacusS000008784_Herpetosiphon_geysericola

S000002414_Thermomicrobium_roseumS000013783 candidate division env.OPB46

2198450 265

Green non-sulfur bacteria

a. rRNA

SSS000000000000000000555666999_uuunnncccuuullltttuuurrreeeddd_eeeuuubbbaaacccttteeerrriiiuuummm_eeennnvvv.WWWCCCHHHBBB111111S000012622_candidate_division_env.OPB92

S000011088_uncultured_eubacterium_env.WCHB07S000011850_uncultured_eubacterium_env.WCHB15

S000015978_uncultured_eubacterium_env.WCHB58S000017730_Chlorobium_tepidumS000004578_Chlorobium_vibriofoff rme

S000000569 l d b i WCHB11 Misc. PhylaUnknown

S000116107_Coprothermobacter_proteolyticusS000006882_candidate_division_env.OPB80

S000022188_Aquifeff x_pyrophilusS000115099_Aquifeff x_aeolicus

S000015239_unidentififf ed_ThermotogalesS000002415 Thermotoga maritima

S000013783_candidate_division_env.OPB46_ _ _S000013783 candidate division env.OPB46

S000002415 Thermotoga maritima

Unknown

Page 11: Supplemental Online Materials

[Figure, panel b. RecA: phylogenetic tree. Leaf labels (IBEA environmental sequence identifiers and reference taxa) are not reproduced here. Labeled groups: α-Proteobacteria, γ-Proteobacteria, β-Proteobacteria, ε-Proteobacteria, Low GC Gram Positive 1, Low GC Gram Positive 2, High GC Gram Positive, Chlamydia, CFB, Cyanobacteria, Spirochetes, Misc. Phyla, Unknown.]

Page 12: Supplemental Online Materials

[Figure: phylogenetic tree. Leaf labels (IBEA environmental sequence identifiers and reference taxa, including Chromobacterium violaceum, Xylella fastidiosa, Synechocystis sp., Chlorobium tepidum, Aquifex aeolicus and Pirellula sp.) are not reproduced here.]

IBEA-344100265IBEA-5652724IIB5BBIBEIIBBABBEE-7AA2AA 555055662665565228227777722IBIIBEBBEEAEEA-5AA 477022255700922966188IBEA-64517379IBIIEIIBBABEE-5AA366566440445545511811771773313377IAAB55E55335A5500-744882881101111112315

IBEA-67015937IIIBBBEEEAAA-966377700011155299533177IIIBBBEEEAAA-1990335775000110220556113IAA B1100E0055A5550-500007000050066063323 401IBEA-333000199I33B3333E3300A0000-501121990999913937IIIBBBEEEAAA-955122200011633399833977IIIBBBEEEAAA-1991113228000660333889993

IBEA-70501155IIIBIIBBEBBEEAEEAA-57787700500551550020011911111115595555IIIBBBBBEEEEEAAAAA -75588288550551101122522992991151199799IBEAII-3BBEE3EE0EEAA7AA 0770772212200400070055IBEA-60515571IBEAIIB-5BBEE2EEAA0AA0669660000055155171155IEEBAAEAA 55A522-500001009959900100118117747 09

IBEA-63523293IBBEEBEEAEAAA66-933550552222233033223229959933433 1IIIBIIBBEBBEEAEAA-5994900500221220010033033554554434411IBEAIIB-7BBEE2EEAA0AA15585544444555551131111IBEA-52003363IBEIIAIIBB-9EE0AA9AA 0555552242006000133IIIBBBEEEAAA-699300599100155144066911 IBEA-94506337

IBEA-119900155IIIBBBEEEAAA-611011099299600100511755IBIIEIIBBABEE-5AA66606600000003002262266166111155IBEA-61512547IIIBBBEEEAAA-766211555211622655344577IBBEEBEEAAEAAA77-622557552252266066666663393355355 7IBEIIABBE-6EEAA5AA06616678775515501006616699

IBEA-61501313IBIIBBEBBEEAEEA-5AA 766511055500811733911IBIIBEBBEEAEEA-9AA 455577055500555888177IBIIBBEBBEEAEEA-7AA 099044055700655155988IBEA-72024391IBEIIABBE-6EEAA9AA5772772242200300221224474433I66B6699E995522A2244-533113117757708749IBEA-57010687IIIBBBEEEAAA-5552770002110000660884777IIIBBBBBEEEEEAAAAA -9552212200500220220010000300005004454477IIIBIIBBEBBEEAEAA-59919911511550550040011211330335555555

IBEA-61015975IBIIBBEBBEEAEEA-6AA 366011100611155599977IIIBBBEEEAAA-966033500011466811655999IBIIEIIBBABEE-1AA09919900400550550000044444885886616699IBEA-93405311IBIIEIIBBABEE-5AA399599331334424400900554553353311IBEAI-6BBEE1EE5EAA0AA 85515338335595511IBEA-9390033IIB5BB

IBEA-93405305IIIBBBEEEAAA-799133544100155833300355IIIBIIBBEBBEEAEEAA-5778771151155155111111211887883393333IBEA-114000349IIIBBBEEEAAA-911211444000000200533544IBEA-5II3BB5BBEE1EEA4AA0A 939929224IBBEEBEEAEAAA55-633555551101144044004003343399499 9

IBEA-92609597I99B9922E226600A0099-655993997757705501IBEIIBBABBEE-6AA2AA 066266334355855008005595555IBIIBBEBBEEAEEA-5AA 666522100622344388788IBEA-59516093IIIBBBEEEAAA-555799555211766200299533IBEIIAIBB-9EE3AA3AA 0559557797755652272277IBEA-94208187IIIBBBEEEAAA-1990446222000880111884779IIIBIIBBEBBEEAEEAA-91111100900660662282200800003001151144IIIBBBEEEAAA-599911599000688288633555

IBEA-94002297IBEAII-7BBEE2EE5EAA2AA 399399443440070000IIIBBBBBEEEEEAAAAA -57722322550552222233533339333383377777IIIBBBEEEAAA-555933500122455999988177IBEA-72524371IIIBIIBBEBBEEAEAA-57757722522551552262244944336337797711IIIBBBEEEAAA-755055555111366099666199IBEA-93910057IBIIBBEBBEEAEEA-6AA 899033099211400000755IIIBBBBBEEEEEAAAAA -56688488000000000022722444440060077977IIIBIIBBEBBEEAEEAA-55505544044000000010077777442446696699

IBEA-70528317IBEAII-6BBEE6EE0EAA2AA 277077006005515522IIIBIIBBEBBEEAEEAA-96636666266000002262222522001006696611IBBBBBEEEEEAAAA 9-9333229001660558110997IBEA-71509741IBIIBBEBBEEAEEA-7AA 177011155000599677744IBBEEBEEAEAAA77-611001001101100000558556636677377 1IBEA-72029545IEEAABAAE77A220-52299699550554424455955173IBEIIBBABBEE-5AA9AA 055255664600900220229979911IIIBBBEEEAAA-955399000022544099300177IBEA-61521641IBEA-6II6BB0BBEE0EEAA1AA8AA 66669115IIIBBBEEEAAA-566166000200311388966599I55B5511E0022A223-73399199555511973

IBEA-91901583IAABAA 99E9911A199-60011611550558808833833 847IIIBIIBBEBBEEAEAA-96616666066000000040088388887884474477IBIIEIIBBABEE-5AA499599111110020000800449443393377IBEA-57025315IBIIBBEBBEEAEEA-6AA 555577200822555033911IIIBBBBBEEEEEAAAAA -76655155555552212288788558550000099599IBEA-59011039IBEIIBBABBE-7AA0AA 555259999900111111500IEEBEEAAEAA A770-55523229599211911515517IBEIIBBABBE-5AA2AA 555053323554522429939911

IBEA-61006289IIIBBBBBEEEEEAAAAA -5661131100500002006666622922884899999I55B3355E5522A2669-694484499099 11185IBEA-9II2IIBB3BB0EE6EAA9AA46676688IBEA-63009965I66B3300E0000A0999-999664655055 03733IBEIIBBABBEE-9AA3AA 599099446400800003003393377IBEA-93203781 IBEA-109400177IIIBBBEEEAAA-911400499044100300011377IIIBBBEEEAAA-999044144000611833900733

IBEA-94407517IBBEEBEEAAEAAA99-544441440050077277557551131177777 3IBEAII-9BBEE4EE4EAA0AA 155355110115555522IIIBBBBBEEEEEAAAAA -79944244445440000011611333330030055155IBEA-71521879IEEAABAAE77A115-1220118388787790990191IIIBBBEEEAAA-911000633088600200111799IBEA-93009819IBEIIBBABBEE-5AA8AA 099133530000000609999988I55B5588E880011A1155-900662669959907331IBIIEIIBBABEE-5AA699599222225565500200778773373333

IBEA-90007107IIIBBBEEEAAA-999200900000177011700377IIIBBBBBEEEEEAAAAA -6992242995990000011811002007737733133IBEIIABBE-6EEAA4AA566266446445525500600881882IBEA-65527235IBEA-5II8IIBB5BB2EE7EEAA1AA06636655IIIBBBEEEAAA-555788555122777311500133IBEIIABBE-5EEAA7AA0552557727755555112117737733

IBEA-53501813IBIIEIIBBABEE-9AA055755331335515500700112118878811IBBBBBEEEEEAAAA 9-6004770112114776225771IBEA-68006423IAAB66E66880A0000-666441442022332333445IBIIEIIBBABEE-5AA766066111110070022722337334494444IBIIBBEBBEEAEEA-9AA 355577000411477177177

IBEA-60516679IBIIBEBBEEAEEA-1AA 166900655011066066577999IIIBBBEEEAAA-911011299066100400100955IBBBBBEEEEEAAAA 9-6008220002116440113991IBEA-69025501IEEBAAEAA 66A6699-5002282255555550550010011211 23IBBEEBEEAAEAAA55-588555550050011211223222242233433 9IBEA-93803763IIIBBBBBEEEEEAAAAA -19933033883880080033033770776626633033 5IIIBBBEEEAAA-911400233088100900722900IBEIIBBABBEE-9AA0AA 29909441221000011999

IBEA-57503605IIIBBBEEEAAA-955277255000133966600755IEEBEEAAEAA A992-622000001192996266717741IBEA-67II0IIBB2BB5EE9EEAA7AA966IBEA-91502175IBEIIBBABBEE-6AA4AA 59909911555855007002292211IBEAIIB-5BEE0EEAA0AA2667664414456550030055

IBEA-58017289IEEAABAAE55A880-511776772022808899999795IIIBIIBBEBBEEAEAA-55515566566002000080099399779779919955IBEA-5II5BB5BBEE1EEAA3AA2A 5585571155IBEA-71024467I0022B2244E4444A4667-93110805IIIBBBEEEAAA-1990337115110000882003553IAABAA 1E1100A077-555004000000222223323333233 63

IBEA-67519527IEEBEEAAEAA A667-65591195995155722757741IEEBAAEAA 66A699-5551131177577552554414411111 15IBEA-69005807IIIBIIBBEBBEEAEAA-96616699799000000010055655886880090077IBEA-3II3BB2BBEE9EEAA0AA0A 9929910117777700IIIBIIBBEBBEEAEAA-3334333313322822990990000000000228220050077

IBEA-56012201IBBBBBEEEEEAAAA 5-5668005110227224001113IBEA-3II3IIBB5BB0EE0EEAA0AA1551558818855IBIIEIIBBABEE-6AA833033331335555500100004000090011IBEA-54023159IIIBBBBBEEEEEAAAAA -15544144007002272233033110115515599699 3IIIBBBEEEAAA-911311777077100700811566IBIIBBEBBEEAEEA-7AA 199033277300911277388IBEA-93210345IBIIBBEBBEEAEEA-6AA 799033222711700333344IBEIIAIBB-5EE2AA0AA 1660667717008002232277

IBEA-55026267 IBEA-55507415IIIBBBEEEAAA-5553557550000770440112551IIIBBBBBEEEEEAAAAA -6553343377577002000010000700002002212211IBEA-339900067IBIIBBEBBEEAEEA-5AA 933033099299500300300I2255B5533E3333A-64009727IBBBBBEEEEEAAAA 6-6448005000996777222779

IBEA-91603627IIIBBBEEEAAA-699711066200433766422977IBEIIBBABBEE-6AA2AA 56606774700700228224494477IBEA-90608219IIIBIIBBEBBEEAEEAA-69979900000662660030088888223221111199IIIBIIBBEBBEEAEEAA-76606677577001002202233333888883353311IIIBBBBBEEEEEAAAAA -57700500555551121100500337338828855955

IBEA-54001531IIIBBBEEEAAA-555044000200611655133711IBEAIIB-5BBEE5EEAA0AA15535500700005002292266IIIBBBBBEEEEEAAAAA -95555355009001101133533774775565599199IBEA-91305735IIIBBBBBEEEEEAAAAA -59911611335330020055655775773343355555IIIBBBEEEAAA-655166055122066255744955IBEA-105300305IIIBBBEEEAAA-911000155033900700333300IBIIBBEBBEEAEEA-9AA 299800011800599677333IBIIEIIBBABEE-6AA799599222228868800700881885515566

IBEA-69013241IIIBIIBBEBBEEAEEAA-76616699599000001181133533221224494411IBBEEBEEAEAAA77-111550550010088088550551101199299 91IIIBBBEEEAAA-911200811000800500522799IBEA-53019997IIIBBBBBEEEEEAAAAA -55533033005001101199699997999999977977IBEA-63II0IIBB1BBEE8EEAA3AA5AA 57550IBEA-72021075

IBEA-56011029IBEA-6450IIBB8BB9EE5EEAA7AAIIIBBBEEEAAA-766144555200288999355977IBEA-71011821I1111B1188E8822A211 -340000439IIIBBBEEEAAA-933144500000400100644533IIIBBBEEEAAA-699911555100544711866355

IBEA-69012789IEEBEEAAEAA A669-700211252272778088949951IBEA-3460II0IIBB0BBEE3EEA2AA5AAIAAB33E334466A660-900040033333202255555865IBEA-91504169IEEBEEAAEAA A991-6550100044111066939977IBEIIABBE-5EEAA2AA5660661191100200114110070033

IBEA-67008371IBEA-58527603IBEA-690220IIB2BB1EEIAA B6699E9900A0222-9220010229221101 6925IBEA-93611637IEEBEEAAEAA A993-5664115116166733777745IBEA-560148IIB6BB9EEIIIBIIBBEBBEEAEEAA-55505566566001001171144244887886676699

IBEA-56508411IBEIIABBE-6EEAA5AA5552556616655855005008858844IBBBBBEEEEEAAAA 6-5558555221117889553551IBEA-93109991IIIBIIBBEBBEEAEEAA-5993993303311211008009979999399991991IIIBBBEEEAAA-655333500022788977833511IBIIBEBBEEAEEA-6AA 366033055700577699588

IBEA-71030123IBIIBBEBBEEAEEA-6AA 277511000333200511522I66B6622E225500A003-622552555505526127IBEIIBBABBEE-5AA4AA 566266221003002272266111IBEA-69013169IBEIIABBE-5EEAA5AA56616691990800119113393311IBBEEBEEAEAAA55-955553551111111111880889939999199 5

IBEA-339300039I99 BEA-72005671IIIBBBEEEAAA-577722500000455466777111IIIBBBEEEAAA-555177555000844444877511IBEA-54529073IBEAIIB-9BBEE4EEAA4AA05575544244559552252299IBIIEIIBBABEE-5AA199099440444494400200770772252299IBEAII-7BBEE0EE5EAA1AA 15515511010070000

IBEA-502000111IIIBBBEEEAAA-655100022100400700911511IIIBIIBBEBBEEAEEAA-66676611511000001151144144779779959955IBEIIBBABBEE-7AA2AA 066066779775505500005575511IBEA-90609437IBIIBEBBEEAEEA-9AA 199700066200899144933IBBEEBEEAAEAAA99-511777770050022022887881151199599 3IBEA-55522541IBEA-5II6IIBB5BB1EE2EAA2AA05515555IBEA-344400IIBB2BB6EE3EEAA IBEA-50515275

IBEA-67008731IBEAIIB-6BEE2EEAA5AA0660667777700100030088IBEIIBBABBEE-5AA6AA 06612222556500300005077IBIIEIIBBABEE-6AA255055660660020011611227226676633IBEA-72029227IBIIBEBBEEAEEA-5AA 977022100422199922922I99B9999EA-54012661

IBEA-51026739IBIIBBEBBEEAEEA-5AA 555011200322366177733IIIBIIBBEBBEEAEEAA-55585555555002002232233233330331151177IIIBIIBBEBBEEAEAA-65575588588550552272233533229220030055IBEA-93605419IBEA-34II7IIBB7BB0EE0EEAA0AA89999933IBEA-66027II3IIBB9BBEE7EEIIIBBBBBEEEEEAAAAA -76666066005002202277377339339989977777

IBEA-50512509IIIBBBBBEEEEEAAAAA -5550040555551111122622558550050099999IEEBAAEAA 55A544-6551111166066882885515599399 61IBEA-67513343IBEIIBBABBEE-6AA6AA 5662667767551110339333IBEA-6001973IIB9BBIEEBAAEAA 66A6600-9001131199799770773353399199 65

IBEA-56003671IIIBBBEEEAAA-755066500100433866177711IIIBBBBBEEEEEAAAAA -67700200555551111144444889881161177377IBEA-93603193IBEA-91405277IBBBBBEEEEEAAAA 9-6117440000553225778771IBEIIBBABBEE-9AA4AA 2660667707700300006033155IIIBBBBBEEEEEAAAAA -5994444225220020000400337336676611111IBEA-65508833IBBEEBEEAEAAA66-55555455000008828888488330333343 9IBEA-920II0IBB5BBEE8EEA6AA1AA

IBEA-51507599IBEAIIB-6BEE7EEAA5AA055155111150550090077IBIIEIIBBABEE-7AA066566772775575500300115111131100IEEAABAAE77A005-522771773433545530330073IBEA-56510415IBEA-331II3IIBB0BBEE0EEA5AA7AA 755IIIBBBBBEEEEEAAAAA -9333323311911331330020000000550557777777IIIBBBBBEEEEEAAAAA -59922822990991121122622008000050077177

IBEA-71024939IIIBIIBBEBBEEAEAA-67717711511000002272244044994993333399IBEA-115100415IBEA-94300777IIIBBBBBEEEEEAAAAA -69944944330330000000000772777777777377IBEIIBBABBEE-9AA0AA 26606994900900000000122IIIBBBEEEAAA-799000522200744399400311IBEA-347600335IEEBEEAAEAA A334-37746600000200330333033535513IBEIIABBE-9EEAA1AA9330334454400600223220050000IIIBIIBBEBBEEAEEAA-99939911411990990060055055660663333355

IBEA-90203775IBIIEIIBBABEE-5AA699099001002232200300330337737777IBEA-560206II8IIBB1BBEEIBEA-69007815IBEAII-6BBEE5EE0EEAA0AA 966766992990090000I0000B0099E7722A2299-93902837IIIBBBEEEAAA-1991330993000220881332771

IBEA-72002111IBEA-5850699II3IIBBI33BEA-71503011IIIBBBEEEAAA-577911055200533800411711IBEA-52519925IBEA-68021933 IBEA-64004207IBEA-94002167IIIBIIBBEBBEEAEEAA-69949944044000000080022722112116636677I000B0088E7722A2233-50513025IBEIIBBABBEE-6AA4A 5551003005565112333300IBEA-55527665IBEA-61013529IIIBEA-91202159IEEBEEAAEAA A991-5220060022022111455989939

IBEA-92105585IIIBBBBBEEEEEAAAAA -39922322119110080055055550558818855955 7IIIBBBBBEEEEEAAAAA -5333393399599881880010000600118119959977IBEIIBBABBEE-5AA4AA 555255993995585511511966IBEA-59008399IBEA-6750885II5IIBBIBEIIBBABBEE-6AA5AA 56606677275575500188188

IBEA-61020751IBEAII-5BBEE3EE5EAA0AA 066566113110090022IIIBBBBBEEEEEAAAAA -55533633555550010000200551553373399599IBIIEIIBBABEE-7AA055555660665595511011222221111177IBEA-67001287IBEIIABBE-9EEAA2AA3660667707700900003001191122IEEAABAAE99A223-700000009599330339099943IBEA-65003791

IBEA-58011089IBEIIBBABBEE-9AA0AA 9550588080070011711100IBIIBBEBBEEAEEA-7AA 299500399000700977977IBEA-52528657IIIBIIBBEBBEEAEEAA-55595522022551552202288388665665535577IBEA-650167IIBB3BB1EEIEEBEEAAEAA A665-70011106670773373301175

IBEA-59016335IBEA-5II0BB0BBEE1EEA8AA0A 5855979900IBEIIABBE-5EEAA5AA0552550050000400111118898800IBEIIABBE-9EEAA0AA2551555505500200225225555544IBEA-53015801IBIIBBEBBEEAEEA-6AA 155033200411155288100

IBEA-345600113IIIBBBBBEEEEEAAAAA -6334444455555662660020000300112111111133 IBEA-332100337IBBBBBEEEEEAAAA 3-533022011100400733233777IBEA-65502477IIIBBBEEEAAA-966155655000922344277977IBEAII-6BBEE6EE5EAA0AA 5996991161665600

IBEA-60006339IBEIIBBABBE-7AA2AA 5660600600020000900665633IBBBBBEEEEEAAAA 7-6226555000664226998551IEEAABAAE66A665-5004434460668088611735IBEA-65510073IBIIEIIBBABEE-9AA066066551555505511111009000090077IBBEEBEEAEAAA99-600006001151100100110119909999899 3IBEAI-5BB9BBEE0EEAA0AA9A 664666616659551

IBEA-60510073IIIBBBEEEAAA-666300055111100200477333IBEIIABBE-9EEAA0AA9660663393300700112111171122IBEA-64524957IIIBIIBBEBBEEAEAA-76616644044552552232244344995995515577IIIBIIBBEBBEEAEEAA-57787711511002002242233433336335595511IBEIIABBE-9EEAA1AA4550558838855255224224454444

IBEA-57001021IBEIIABBE-5EEAA6AA555255778770050000300111110IAAB55E55665A552-6885535533533110116555IAABAA 6E6633A355-5006666555555515555755 077IBEA-90200789IEEBEEAAEAA A990-6220600000070778988969979IBEIIAIIBB-9EE3AA2AA 16606666666000000030099

IBEA-65508435IBIIEIIBBABEE-9AA466466551555515500500884884474433IIIBBBEEEAAA-1991449448110110553447775IIIBBBBBEEEEEAAAAA -911114199299881880000000600332337757755IBEA-93210601IBEIIAIIBB-6EE3AA0AA 09979933833227221191100IBBEEBEEAAEAAA66-633009000050077177889887797799699 9IBEIIABBE-5EEAA2AA0661669939955855116119959999

IBEA-55516277IAAB55E55555A551-966221227777770779133IBBBBBEEEEEAAAA 9-6113770000997110337337IBEA-93403687IBEAII-1BBE0EE6EAA4AA 099099330334414001033IIIBBBEEEAAA-911100666044300200300911I99B9911E6600A0033-522339339959927657IBEA-63509233IBEA-6IIBB7BBEE5EE0EEAA7AA 6663663353355IBEAIIB-7BBEE2EEAA5AA06656677377554550030077IBEA-52521365II

IBEA-66011987IEEBAAEAA 66A666-5001171111511991998808877977 29IBIIBBEBBEEAEEA-9AA 255477055011000099922IBEIIAIIBB-7EE0AA5AA 299799229224434400100IBEA-93200503I0055B5500E0033A-60017913IAABAA 6E6600A000-3117737799199111133033 0443

IBEA-92500937IBEA-5II5BB0BBEE0EEAA5AA1999929225IBBBBBEEEEEAAAA 5-6557000001554110992991IBBBBBEEEEEAAAA 6-5771005111446006222113IBEA-53522507IBEIIBBABBEE-9AA2AA 655053365525229222292255IBBBBBEEEEEAAAA 9-5227660000669220994997IBEIIAIIBB-5EE3AA0AA 1554577177000000050099

IBEA-64024763IIIBBBEEEAAA-666944500122244677266933IIIBBBEEEAAA-566599055111722466822399IBEA-61510017IBEA-53002635IBEAIIB-9BBEE0EEAA4AA0551553333005000050022IBBBBBEEEEEAAAA 9-9002445000111336554557

IBEA-101700037IBIIBBEBBEEAEEA-6A 011000011177100900900

IBEA-71012019IBEA-67018325IBEA-91300173IBEA-6351II6IBB8BBEE7EEAA1AA

gi27380514_Bradyrhizobium_j__ aponiBBcumgi16127430_Caulobacter_crescentusgi3300100_5__C8CC8CCaa9aauu2uull4oo4bb_Aacctgro__b__ccactsserieennum_tumefaff ciensgggiii111755988388599822344944_AAAgggrobbbacttteriiium_tttumefffaaffffaff ciiiensgi159gg6ggii5117177099633_S88i33no__r__ hAAgiggzgg oboobibbaum_melmmimmlmm__o__tttuiuu

gi13470549_Mesorhizobium_lotigi17987037_Brucella_mgg el33i44t4477ensi44sgggigg2ii1131177577990998828877177001003333377_B__BBruceleellla_s__uimmeesOggii22R2233F5500B0022022112111101133033__1__BB_Brruurucelaal__a__ _ovis

gi99199 5604009_Rickettsia_prowazekiigi1gg5ggii8ii1195525566066009004474400_R0099__i__c_ kRRietcckktkksieetta_co__n__ppopp riooiwwORF00486_Wolbachia_sp.

gi18407650_Arabidopsis_thaliana

gi17533571 Caenorhabditis elegggi17533571 Caenorhabditis ele

IBEA-91905679IBEA-101500313

g _ _ ggi24582462_Drosophila_melanogasterggi18390331__Homo__sappiens gi23509012_Plasmodium_faff lciparumgi19112264_Schizosaccharomyces_pombe

gi6323098_Saccharomyces_cerevisiaegi15225397_Arabidopsis_thalianagEEggiEEAAii1AA115AA 955299222119220990550005533755699366777_7799A99AArabbbiiidddopppsiiis_ttthhhallliiiana

ORF03963_Myxococcus_xanthus ORF02522_Treponema_denticolagi15639754_Treponema_pallidumgi15594885_Borrelia_burgdorfeff rigi24213013_Leptospira_interrogansORF03973 Fibrobacter succinggogenes

_ _ gg gggg

IBEA-331700239IBEA-57510679IIIBBBEEEAAA-755177555011700866777599IBIIBBEBBEEAEEA-9AA 177411055100977388977IBEA-60512891IBBBBBEEEEEAAAA 6-5007550110223886997111IIIBBBEEEAAA-655777500000933166677311IBEA-3IIBB3BBE2EE2EAA0AA 066166770775575500

IBEA-93803425IIIBIIBBEBBEEAEEAA-99919933833881880000033133447442232255IIIBBBBBEEEEEAAAAA -59911511880881121100000110117717733733IBIIEIIBBABEE-9AA155355550550090022922000000070011IBEA-341000273IBEIIABBE-9EEAA1AA53303344144121100800001000IBBEEBEEAEAAA99-511555550000011211227228838811111 7IBEA-93511345IIIBBBEEEAAA-1991331553110110331449555IBIIEIIBBABEE-7AA01101111111115113313300500001001IBBBBBEEEEEAAAA 7-9002005111550113556113IBEA-63008361IBEA-5IIBB2BBE0EE2EEAA1AA 9662663333300IIIBBBEEEAAA-955322700022611699822733IIIBIIBBEBBEEAEEAA-99909933933770770050066766667668878877

IBEA-90905787IIIBIIBBEBBEEAEEAA-99939900500990990010055855779778818877IIIBBBEEEAAA-999033955000511788699911IBBEEBEEAAEAAA99-50099399000005515577277666669949 3IBEA-60514405IBBEEBEEAAEAAA66-600552551151144144447440070055855 3IBEAIIB-6BBEE7EEAA5AA2668662222255155191177

IBEA-92810579IBEA-5II5IIBB5BB2EE7EEAA9AA39959922IIIBBBBBEEEEEAAAAA -95555355554552202277477995993363355955IBBEEBEEAEAAA99-533446440000044244551556616699399 1IBEA-91411681IIIBBBEEEAAA-999111344011511266488111IIIBBBEEEAAA-1991113333000550223448111IIIBBBBBEEEEEAAAAA -9111111133433331330010000600338338898811IBEA-91406959IBEIIAIIBB-9EE2AA6AA 0992991161446400166IIIBIIBBEBBEEAEEAA-59919922522661660080022722668666616611

IBEA-339000221IIIBIIBBEBBEEAEEAA-73313333033991990030000800009002252222IBEA-51526099IIIBIIBBEBBEEAEEAA-55565511011551552222266666003009939999IBEA-335500153IBBBBBEEEEEAAAA 3-933355855000000811655333IBIIEIIBBABEE-7AA299599330338898800800009008838866IIIBIIBBEBBEEAEEAA-57727722022550550060099299882889979933

IBEA-51508431IEEBAAEAA 55A5511-655001008808844144334311811 45IBEA-61II5IBB0BBEE7EEAA0AA7AA 6366IBEA-118200197IIIBBBEEEAAA-911311088022300400811799IIIBBBEEEAAA-599333500200533544988377IBIIBBEBBEEAEEA-5AA 555033055822055755399

IBEA-56511063IBBEEBEEAEAAA55-966553551151111111001006686633433 1IBIIEIIBBABEE-9AA3996993303355155111115118858844IBIIEIIBBABEE-9AA39969933033661660010016111151155IBEA-349600315IBBEEBEEAAEAAA33-64499399660660020000300333331111155355IBBBBBEEEEEAAAA 6-9333008220334339115333

IBEA-64009807IIIBIIBBEBBEEAEAA-66606644044002000050099199886880070077IIIBBBBBEEEEEAAAAA -66600900005002202255855116116626677377IBIIEIIBBABEE-5AA766566991995525500500886886616622IBEA-68007847IBEA-55027IIBB3BB4EE3EEAAIBEA-6II7BB5BBEE2EEAA7AA5AA 5155535500IBEA-339700353

IBEA-94411401IEEBEEAAEAA A994-5444115114044010091137IBEAI-7BB1BBEE5EEAA2AA3AA 5505544144535500IBIIEIIBBABEE-9AA077577110115555522422336330070011IBEA-90709141IBBEEBEEAEAAA99-500776770000099299111114474411911 3I022B2211E7799A9933-53000081

IBEA-64014151IBEA-9II2IIBB6BBE0EE0EEAA7AA56636644IBEA-905010IIB6BB3BEEIIIBBBEEEAAA-1991009555000110000661331IBEA-332900443IBEIIBBABBEE-6AA7AA 53303333532202299590030000IBEIIBBABBEE-5AA9AA 5661778775525003005595500IIIBBBBBEEEEEAAAAA -75599099555551121188088226223343399799

IBEA-90101749I00B0011E7744A4499-50523853IBBBBBEEEEEAAAA 5-5008555221330888558335IBEA-93301745IIIBIIBBEBBEEAEAA-39949933633330330000011011771774484455155IBEIIBBABBEE-6AA6AA 0331442466460050005000IBEA-5II4IIBB5BB1EE1EEAA4AA26616666

IBEA-57512739

IBEA-90811603IBEA-93507425IBEA-71503531

IBEA-68509137IBIIBBEBBEEAEEA-9AA 066488055200599611733IBEA-71511793IIIBIIBBEBBEEAEEAA-97737711511550551171111711771779999933 IBEA-59500527IIIBBBEEEAAA-755199555000500655922177

IBEA-58015025IBIIEIIBBABEE-7AA055055880880070011511555550050022IBEA-59014349IBEA-64014569 IBEA-90106335IEEAABAAE99A001-600660663533331335955509

IBEA-64523973IIIBBBEEEAAA-1661449559220330991771335IIIBBBEEEAAA-911011399199000300111511IBEA-1II1BB6BBEE6EEAA0AA0AA 9199060033933IBEA-93808893IEEBAAEAA 99A9933-688008008808888188998993343 35IIIBIIBBEBBEEAEAA-96606688788000001161188388442443373355IBEA-93307017IIIBBBBBEEEEEAAAAA -59933333335330010077677006001131177577IBIIBBEBBEEAEEA-5AA 255533155611266966933IIIBIIBBEBBEEAEEAA-65575522522552551111166266224229959999IBEA-92204915IIIBIIBBEBBEEAEEAA-99939922122221220000044544999991111155IAABAA 9E9933A311-7110020055555992991101 641IIIBIIBBEBBEEAEAA-37737722722558552202200000664664424411311

IBEA-93110583IIIBBBEEEAAA-699433511111000155188133 IBEA-332100389IBEA-90805423IBEIIBBABBEE-7AA1AA 59910070088100605595544 IBEA-55509483IBEIIBBABBE-7AA2AA 5551559555755009009979944

IBEA-92701799IBEA-570II0IBB1BBEE1EEA7AA5AAIIIBBBEEEAAA-655677500100011011177355IBIIEIIBBABEE-5AA766066662665585511511008000050011IBEA-93609205IIIBBBBBEEEEEAAAAA -59933933660660000099599225220090055555IBEA-51030791

IBEA-66513023IBEA-53516201IBEIIAIIBB-5EE6AA5AA 155555335335505115166

IBEA-344600449

ggi27364737_Vibrio_vulnififf cusgi15640388_Vibrio_cholerae

g _ p y _

gcci77hh277__hh8__VVll8VViill 9iibb9bbrr5rriioo4oo__5__vv_Vuullinnbnniriiicco_parahaemolyticus

giBB2EE4EAA3AA37991990080088II28811BB6111BBEE_EEAAS6600AAh003399e9933w3355a5500n0077e44l22l22a55_oneidensis

OIIBB1133RBBEE3377FEEAA0AA077377113115505500_C35533ol11wellia_psychrerythraea

gi26987192_Pseudomonas_putidagi15599462_Pggii22seud8877omon__as_aerugioonosa__gi2gg8ii11811556555575599899995994414466_P22__s__ eudeuuomonas_s__yrieengaggegggi1559gg7ggii22222886888878866_P7788seud__PPomonas_aerug__ issnyyosagggi26990806_Pseudomonas_putida

gi2IIBB9BB6BEE5EEAA3AA5668666676655_C1133oxi2233ella_burnetii

O3444R4466F66000004244434491998_Methylococcus_capsulatus

IBEA 331700239g _ p y _

IBEA-65501171

IBEA-93510089

IBEA-59020063

g 34499644_Chromobacterium_violaceumgi3gg0ggii3303344644454499399997996646644_S44__h__CiCCgelrrlrrooa_fmmoolooffooooexnerirrgggiii233400100166455633077444_SSShhhiiigggelllllla__ffffffflllfffffffff exneriiigi16131219_Escherichia_coligggiii211666211433911922311699_EEEschhheriiichhhiiia_collliiigggiii122566822344399499433566_EEEschhheriiichhhiiia_collliiigggiii111555888033333844544355_EEEschhheriiichhhiiia_collliiigi16766735_Salmonella_tytt phimuriumgIIBBggiBBii2BBEE119EE661AA774A66646666735577325533540055___0011S11SSaa77l11llmmm11 onel

lllllllaaa___eyynyypptpphherimmcagi16762837_Salmonella_enterica gi32490778_Wigglesworthia_glossinidiagi33gg5ii332332202244044990990080077_C7788a__n_ dWWiiiiigdggggagg tleeus_Boorrlrrttochiiaam__ggagg nnissiiagi27904945_Buchnera_aphidicola

gi15617121_Buchnera_aphidicolagi2167277gg3ggii_B5566uch771ner__a__ _aphhhninndneeieerrco__l__agi22127855_Yersinia_pestisgggiii122622111222077588455155_YYYersiiiniiia_pppestttiiisggi5511150060000008838899299 21_Pasteurella_multocidagi222232211311__1__PP5PPa1aas8sstt4tee0euu_Hrreellae__m__ ophllttoioolooccus_da ucreyigi16272523_Haggeggmoph551i1188l88us_i__HHnfHHaalaaeffaaaauenzppapp e

ggi21241734_Xanthomonas_axonopodisggiII112IIBB11221BBEE22442EEAA44113AA11770AA7733355334469944__100__XX_2200X0000aanna0066n6633t33hoohoomomona__s__ _campppppestddiiris

gi28199863_Xylella_faff stidiosagggiii122588811399999288166833 XXXyyylllelllllla fffaaffffaff stttiiidddiiiosa

g 34499644 Chromobacterium violaceumi3gg0gii33033446444454499399997996646644S44hCCC rrlrroo fmmoooooooo rr

gi15839218 Xylella faff stidiosa

IBEA-53001107IIIBIIBBEBBEEAEEAA-95535533533000000030011711111110030077

IBEA-109100451IIIBBBEEEAAA-911100999011900200244755

IBEA-94500075IBEA-118100023IBEA-93802813IBIIEIIBBABEE-5AA399099330338848800100220228898811

IBEA-72500651

O1100R009911F110010000100448445585511911 _Burkholderia_mallei

gi003993226220773234_Bordetella_bronchisepticagi33598323_Bordetella_parapertussisg5500i001007005774559025_Ralstonia_sggogg l33anacearum__

OBBEEIBBREEAAEEAFAA 11AA0111AA 9311899318833511008803000002_2288B00228811u1133r33 kholderia_mallei

gggi17547741_Ralstonia_solanacearumgi773553445779774444117__6RR_Baallssordonnetaae__l__slssa_pertccussirruusgggiii333333555999444744477866_BBBordddetttelllllla_ppparapertssussisgggigg3ii3333333533559559999944044771774494488_B__BBordrrddeteeeleellla_b__pprpponcpphppeieeseptssisscagi30II2IIBB4BBEE9EE9AA9AA 377_N225500i00t000rosomonas_europaeagi15676066_Neisseria_meggngg i33ngi44t449i99d999i99s

gi15793163_Neisseria_meningitidisgi34499644 Chrggogg mob7799act1166er_i__um visseeolrriiac__e__ um

IBEA-344600449 IBEA 53001107IBEA 53001107IBEA-344600449

gi34499644 Chromobacterium violaceum

IBEA 67017893IIIBBBBBEEEEEAAAAA -96677377001001111177277880889929933733IBIIBEBBEEAEEA-6AA 599033111911322800722IBEA-93104513IIIBBBBBEEEEEAAAAA -59933133115110010044944552551121133333IIIBBBEEEAAA-655611055111999722022733IBEA-54II0IIBB1BBEE5EEAA8AA8AA 9666

IBEA-56026509IBEA-9IIBB2BBE6EE0EAA5AA 3551556656600IBIIBBEBBEEAEEA-7AA 299022166700955833111IEEBEEAAEAA A772-50071177577919968881147IBEA-93403367IBEAII-5BBEE8EE0EAA1AA 499299331334454400IIIBBBEEEAAA-555788000211744222511155

IBEA-60514959IBIIBBEBBEEAEEA-7AA 266500255111644199555IBEA-68016093II IBEA-71530379IBEA-67524299IBIIBBEBBEEAEEA-7AA 266077155422444622199IBBEEBEEAEAAA77-522006001101144244444446636611211 1IBEAIIB-6BEE6EEAA5AA1575566866004002232244

IBEA-59013805IBEA-90II0IIBB0BB4EE8EEAA0AA155IEEBEEAAEAA A990-60000100044818890071119IBEA-93109279IBIIBBEBBEEAEEA-5AA 099033111800099222977IBEA-92008II6IIBB8BBEE7EEIBBEEBEEAEAAA99-922003000030088088669668808877077 9

IBEA-50010695IIIBIIBBEBBEEAEEAA-95535500100000001141100700667669939955IIIBBBBBEEEEEAAAAA -59933033110110010044944779777747733133IBEA-9IIBB2BBE5EE0EEAA4AA 855450070000IBEA-342400281IBEA-66517083IBBBBBEEEEEAAAA 6-7662555112771008885331IBEA-50516407

IBEA-56016015 IBEA-108300359IIIBBBEEEAAA-911000788033700700033755IBEA-338400245IIIBBBEEEAAA-933033788044000400022544IAAB99E990077A770-50044444000051553851IEEBEEAAEAA A554-90041132338088565521163

IBEA-90707709IIIBBBEEEAAA-1990008773000770773006991IIIBBBEEEAAA-911200188133000400733566IBBBBBEEEEEAAAA 9-9220118111002442774555IBEA-61512615IBBBBBEEEEEAAAA 6-7112555110220669117555IBEIIBBABBEE-9AA0AA 277077229225575500500030099

IBEA-91407719IBEA-7251IIB7BB4EE4EAA7AAIBEA-66021555II IBEA-58514343IBEA-60524935IBEA-91000503 gi28895118 Streptococcus pyogenes

IBEA 67017893

gi21909736_Streptococcus_pyogenesgi15674452_Streptococcus_pyogenesggggiii111955766477544444055522_SSStttreppptttococcus_pppyyyogggenesgi99199775774494455055440440020055055__7__SS_Sttrrteerppepp ptooccococcus__pp_pyy nggegg umoniaegi15902294_Streptococcus_pneumoniaegi250gg1ii111155855995990010022_S2299t994r_e__ptttocpoppccus_aga__l__pappctnneeieeuaegggigg2ii2222255555003001171111911880885585511_S__SStSStreppptpptococcus_a__ gggagg laaactcctittaeOggiiR2222F2255B55330770999007880___Sttrrtrrerppeppptoocococcus___agggalaaccactaaiaaeeae

c. EF-G

[Tree figure. Leaf labels (IBEA sequence identifiers and gi/ORF reference-genome identifiers) overlap in this panel and are not reproduced here.]

Clade labels: High GC gram positive; α-Proteobacteria; β-Proteobacteria; γ-Proteobacteria; ε-Proteobacteria; Chlorobia/CFB; Low GC gram positive; Cyanobacteria; Chlamydia; Misc. Phyla; Unknown

IBEA-338600047IBEA-55523751

IBEA-92511717IBEA-342600151IBEA-503400051IBEA-55522651IBEA-64011587IBEA-50511783

IBEA-61522739IBEA-58017185IBEA-69022969

IBEA-93404593IBEA-91908549

IBEA-93012093IBEA-68008463

IBEA-91105979IBEA-64506233

IBEA-94201149IBEA-91011449IBEA-62511123IBEA-59019243

IBEA-65014123IBEA-90808615

IBEA-54014527IBEA-59006505

IBEA-59513245IBEA-53009003

IBEA-72025747IBEA-58028375

IBEA-51514657IBEA-56511517

IBEA-333200267IBEA-63516943IBEA-56511937

IBEA-92912165IBEA-93804977

IBEA-56009277IBEA-60518539

IBEA-72012273IBEA-92603783IBEA-71521479

IBEA-57018747IBEA-92600723

IBEA-62523127IBEA-93809599

IBEA-60503253IBEA-58002839

IBEA-63503985

IBEA-72505721IBEA-56516247

IBEA-90900947IBEA-66026207IBEA-335400359IBEA-92005589

IBEA-51507995IBEA-92204929IBEA-93208063

IBEA-94009925IBEA-91511147IBEA-346900229IBEA-92105009IBEA-92906229IBEA-113500227

IBEA-341200393IBEA-93310515

IBEA-115000403IBEA-50021065

IBEA-90706341IBEA-521800121IBEA-72516139

IBEA-342900449IBEA-93110037

IBEA-65527455IBEA-94510085

IBEA-60508515IBEA-57514879

IBEA-91505803IBEA-115700247IBEA-90106317gi29653575_Coxiella_burnetiigi29653588_Coxiella_burnetii

IBEA-92308733

IBEA-52502141

IBEA-90402565

IBEA-90811601IBEA-104600127

IBEA-50006927IBEA-53005385IBEA-93510121

IBEA-93510087IBEA-104900233

IBEA-90108643IBEA-92403449

IBEA-54504855

IBEA-52504101IBEA-94500073IBEA-58502405

IBEA-56000051IBEA-72005853IBEA-330000089IBEA-118100017IBEA-66500711

IBEA-93802817IBEA-62004959

IBEA-93802787IBEA-344600031

IBEA-58002327

IBEA-59502491IBEA-94305921IBEA-59018315

IBEA-53006941

IBEA-90607481IBEA-63006687

IBEA-70504073IBEA-59520569

IBEA-54503027IBEA-64009941IBEA-67022321

IBEA-63023555IBEA-90105429

IBEA-94005241IBEA-68025697

IBEA-91302501IBEA-93507791IBEA-50006069

IBEA-91500841IBEA-93810207IBEA-53523195IBEA-65503319

IBEA-91904281IBEA-69510975IBEA-90504983

IBEA-93801313IBEA-57003273

IBEA-58023597IBEA-347100091

IBEA-57028443IBEA-68006063

IBEA-93512571IBEA-92105991

IBEA-52000407IBEA-93901575IBEA-64004001

IBEA-93901577IBEA-109700169

IBEA-56503295

IBEA-92210501

IBEA-70018843

IBEA-61016037

IBEA-92803339IBEA-51029913

IBEA-90107465

IBEA-341200243

IBEA-66507961IBEA-114700251IBEA-94503631

IBEA-92110473IBEA-90812249

IBEA-90707705IBEA-72520679

IBEA-55025891IBEA-91407723

IBEA-55517197IBEA-526000123IBEA-53016485

IBEA-349100497

IBEA-343400421IBEA-91807863IBEA-93605523

IBEA-71015767IBEA-337800341

IBEA-50516025IBEA-58509865IBEA-60017395

IBEA-50007899IBEA-93002293IBEA-93308997

IBEA-58517747IBEA-100900109IBEA-93309013IBEA-56506651IBEA-62525803

IBEA-62023937IBEA-67014613

IBEA-51021465IBEA-62505733IBEA-62013125IBEA-67519691IBEA-50528069IBEA-53524381IBEA-53517663IBEA-93104523

IBEA-72528699IBEA-119300101IBEA-90309071

IBEA-56020973IBEA-64013741

IBEA-60516743IBEA-67517137

IBEA-340800459IBEA-92001401

IBEA-91711791IBEA-93104751

IBEA-92504827IBEA-72511935

IBEA-66017049IBEA-62018885

IBEA-93508457IBEA-53022527

gi26987193_Pseudomonas_putidagi26987181_Pseudomonas_putidagi28867852_Pseudomonas_syringae

gi15599461_Pseudomonas_aeruginosagi15599473_Pseudomonas_aeruginosa

gi21241723_Xanthomonas_axonopodisgi21241735_Xanthomonas_axonopodisgi21230350_Xanthomonas_campestrisgi21230362_Xanthomonas_campestris

gi28199874_Xylella_fastidiosagi28199862_Xylella_fastidiosagi15839229_Xylella_fastidiosagi15839217_Xylella_fastidiosa

ORF04785_Methylococcus_capsulatusORF02319_Methylococcus_capsulatus

gi15834157_Escherichia_coligi30064737_Shigella_flexnerigi15804570_Escherichia_coli

gi16131810_Escherichia_coligi26250749_Escherichia_coligi24114603_Shigella_flexnerigi26249935_Escherichia_coli

gi24115265_Shigella_flexnerigi15833444_Escherichia_coligi16131218_Escherichia_coligi30065375_Shigella_flexnerigi15803852_Escherichia_coli

gi16762307_Salmonella_entericagi29143795_Salmonella_entericagi16767400_Salmonella_typhimuriumgi29144325_Salmonella_entericagi16766734_Salmonella_typhimuriumgi16762838_Salmonella_enterica

gi16120542_Yersinia_pestisgi22127856_Yersinia_pestis

gi16123891_Yersinia_pestisgi22124392_Yersinia_pestis

gi27904944_Buchnera_aphidicolagi32491264_Wigglesworthia_glossinidia

gi33520007_Candidatus_Blochmanniagi28952057_Buchnera_aphidicola

gi21672772_Buchnera_aphidicolagi27364738_Vibrio_vulnificus

gi27364609_Vibrio_vulnificusgi28899704_Vibrio_parahaemolyticusgi28899544_Vibrio_parahaemolyticus

gi15640348_Vibrio_choleraegi15640389_Vibrio_cholerae

gi24371815_Shewanella_oneidensisgi24371827_Shewanella_oneidensis

ORF00331_Colwellia_psychrerythraeaORF00315_Colwellia_psychrerythraea

gi16272522_Haemophilus_influenzaegi16272575_Haemophilus_influenzaegi15603611_Pasteurella_multocida

gi15603222_Pasteurella_multocidagi33151327_Haemophilus_ducreyigi33151841_Haemophilus_ducreyi

gi15676052_Neisseria_meningitidisgi15676067_Neisseria_meningitidis

gi15793162_Neisseria_meningitidisgi15793177_Neisseria_meningitidis

gi34499643_Chromobacterium_violaceumgi34499655_Chromobacterium_violaceum

gi17547740_Ralstonia_solanacearumgi17547760_Ralstonia_solanacearum

ORF03150_Burkholderia_malleiORF03176_Burkholderia_mallei

gi33594749_Bordetella_parapertussisgi33591281_Bordetella_pertussisgi33599020_Bordetella_bronchisepticagi33594477_Bordetella_pertussisgi33594730_Bordetella_parapertussisgi33599000_Bordetella_bronchiseptica

gi30249992_Nitrosomonas_europaeagi30248416_Nitrosomonas_europaea

gi15645819_Helicobacter_pylorigi15612193_Helicobacter_pylori

gi32265868_Helicobacter_hepaticusgi15791834_Campylobacter_jejuniORF00809_Campylobacter_jejuni

gi19704887_Fusobacterium_nucleatumORF03811_Desulfovibrio_vulgaris

gi33862064_Prochlorococcus_marinus

gi33241113_Prochlorococcus_marinusgi33866670_Synechococcus_sp.

gi33864049_Prochlorococcus_marinus

gi12545449_Euglena_longagi11466993_Euglena_gracilis

gi7524809_Chlorella_vulgaris

gi11467799_Nephroselmis_olivacea

gi11466416_Mesostigma_viridegi15237059_Arabidopsis_thaliana

gi11467443_Odontella_sinensisgi11467350_Cyanophora_paradoxa

gi11465753_Porphyra_purpureagi11465449_Cyanidium_caldarium

gi17231829_Nostoc_sp.gi22299293_Thermosynechococcus_elongatus

gi16330913_Synechocystis_sp.

gi32475088_Pirellula_sp.

gi21282232_Staphylococcus_aureusgi15926226_Staphylococcus_aureusORF00078_Staphylococcus_aureusgi15923538_Staphylococcus_aureus

ORF00911_Staphylococcus_epidermidisgi27467230_Staphylococcus_epidermidis

gi29374847_Enterococcus_faecalisgi16077181_Bacillus_subtilis

gi15612695_Bacillus_haloduransgi23097572_Oceanobacillus_iheyensis

ORF00542_Listeria_monocytogenesgi16804690_Listeria_monocytogenes

gi16801863_Listeria_innocuagi30260299_Bacillus_anthracisgi30018378_Bacillus_cereus

ORFB06299_Bacillus_cereusgi25010837_Streptococcus_agalactiaeORFB01922_Streptococcus_agalactiaegi22536926_Streptococcus_agalactiae

gi19745719_Streptococcus_pyogenesgi28896335_Streptococcus_pyogenesgi21909968_Streptococcus_pyogenesgi15674691_Streptococcus_pyogenes

gi15901337_Streptococcus_pneumoniaegi15903386_Streptococcus_pneumoniae

gi24379182_Streptococcus_mutansgi15673843_Lactococcus_lactis

gi28378740_Lactobacillus_plantarumgi15606942_Aquifex_aeolicus

gi15605614_Aquifex_aeolicusgi15644250_Thermotoga_maritima

gi29833570_Streptomyces_avermitilisgi21219827_Streptomyces_coelicolor

ORF01016_Carboxydothermus_hydrogenoformansORF01027_Carboxydothermus_hydrogenoformans

gi20808676_Thermoanaerobacter_tengcongensisgi20808662_Thermoanaerobacter_tengcongensis

gi28212176_Clostridium_tetanigi28212186_Clostridium_tetani

gi15896386_Clostridium_acetobutylicumgi18311403_Clostridium_perfringensgi18311389_Clostridium_perfringens

gi13508404_Mycoplasma_pneumoniaegi12045310_Mycoplasma_genitalium

gi31544430_Mycoplasma_gallisepticumgi26553484_Mycoplasma_penetrans

gi15828876_Mycoplasma_pulmonisgi13358086_Ureaplasma_urealyticum

gi15840088_Mycobacterium_tuberculosisgi31791869_Mycobacterium_bovisgi15828004_Mycobacterium_leprae

gi19551739_Corynebacterium_glutamicumgi25027073_Corynebacterium_efficiens

gi28493087_Tropheryma_whippleigi28572294_Tropheryma_whipplei

gi23465666_Bifidobacterium_longumgi29831463_Streptomyces_avermitilis

gi21223043_Streptomyces_coelicolor

gi15639180_Treponema_pallidumORF02154_Treponema_denticola

gi15594821_Borrelia_burgdorferigi24213437_Leptospira_interrogans

ORF02907_Myxococcus_xanthusORF03172_Myxococcus_xanthus

ORF04734_Geobacter_sulfurreducensORF04754_Geobacter_sulfurreducens

gi21675000_Chlorobium_tepidum

10

IBEA-93006581IBEA-71528531IBEA-58516005IBEA-50031259IBEA-66017945IBEA-62001795IBEA-50508907IBEA-90008871IBEA-90104749IBEA-116300521IBEA-90104761

IBEA-90607737IBEA-57527979IBEA-55020291IBEA-93006417IBEA-93011257IBEA-55521207

IBEA-55513819IBEA-91807447IBEA-55513105IBEA-93502263IBEA-67012347

IBEA-70517361IBEA-52020257IBEA-93102631IBEA-92804803

IBEA-91704689IBEA-112300281

IBEA-60021283IBEA-63008427

IBEA-56525325IBEA-56027739

IBEA-90504451IBEA-51521999IBEA-61013603IBEA-72020459IBEA-90803465IBEA-56509727

IBEA-92404247IBEA-54029799IBEA-60011943

IBEA-63526207IBEA-69520001

IBEA-53503879IBEA-66016449IBEA-57014263

IBEA-58008609IBEA-61000299IBEA-54014651

IBEA-52002405IBEA-90809485IBEA-50508167IBEA-71512431IBEA-90301167IBEA-67013523IBEA-51002367

IBEA-348500379IBEA-60509365IBEA-64506333IBEA-66521555IBEA-69522559

IBEA-71024573IBEA-54503761IBEA-70013987

IBEA-61002465IBEA-53511985IBEA-63509895

IBEA-341700059IBEA-65528319IBEA-64027877

IBEA-90505615IBEA-58026485

IBEA-53514161IBEA-94410425

IBEA-59508701IBEA-52005261

IBEA-69002871IBEA-67528805

IBEA-65507141IBEA-90310361IBEA-68506899IBEA-61502185IBEA-62502241IBEA-94500155IBEA-62000961

IBEA-69500767IBEA-92307113IBEA-94500649

IBEA-55014841IBEA-66523765

IBEA-64502105IBEA-91503699IBEA-114700169IBEA-61021557

IBEA-71019035IBEA-105900259IBEA-90310369IBEA-53528663

IBEA-68012137IBEA-53506757

IBEA-59005467IBEA-59004815IBEA-65007069IBEA-61508599

IBEA-94409441IBEA-51501347

IBEA-91004625IBEA-52006425IBEA-70509463IBEA-65512319

IBEA-64007639IBEA-50529077IBEA-105300133IBEA-91901039IBEA-70010945

IBEA-91710323IBEA-60025049

IBEA-62518713IBEA-93408095

IBEA-51017733IBEA-72519841

IBEA-94103941IBEA-70521031IBEA-93710639IBEA-68528617

IBEA-105300135IBEA-91901037IBEA-54522183

IBEA-71015701IBEA-66505447

IBEA-92510031IBEA-63510381IBEA-94501725

IBEA-58508881IBEA-90401783

IBEA-92308783IBEA-50527657IBEA-58524827IBEA-59500147

IBEA-69525387IBEA-68000567

IBEA-62014467IBEA-50512381IBEA-57501331IBEA-57517401

IBEA-64008493IBEA-59000631

IBEA-94110535IBEA-61509125IBEA-55003243

IBEA-69004761IBEA-91702011

IBEA-59508815IBEA-72029927

IBEA-90201921IBEA-50017747IBEA-72029115

IBEA-72502217IBEA-65000367

IBEA-67023155IBEA-68021589

IBEA-56521775IBEA-511000189IBEA-93700769IBEA-52027069

IBEA-72516899IBEA-71027877IBEA-62023299IBEA-60525265IBEA-53010435IBEA-71528957

IBEA-94303769IBEA-56505811

IBEA-64508951IBEA-72010181IBEA-94010963IBEA-101600063

IBEA-93500167IBEA-90010441

IBEA-61023709IBEA-92910131

IBEA-55016535IBEA-58513215

IBEA-70512513IBEA-64009081

IBEA-50516311IBEA-71010365IBEA-51008409

IBEA-54013573IBEA-63508799

IBEA-57515495IBEA-54519933IBEA-59020833IBEA-94009755

IBEA-66003791IBEA-54528573IBEA-71002963

IBEA-70013215IBEA-91803951IBEA-62017303IBEA-68520285IBEA-62520839IBEA-60008475

IBEA-65501791IBEA-59516605IBEA-92007605IBEA-91000073

IBEA-68027715IBEA-54018609IBEA-52528477IBEA-71511477

IBEA-91105319IBEA-59024687IBEA-50510961IBEA-92112155

IBEA-72017321IBEA-60504747

IBEA-341300159IBEA-54505359

IBEA-69502861IBEA-92804895IBEA-56507663IBEA-56514337IBEA-67526195IBEA-71513209IBEA-67026859

IBEA-64506781IBEA-90706713

IBEA-94311061IBEA-67510655IBEA-59527219

IBEA-72031155IBEA-67513191

IBEA-71521205IBEA-55023783IBEA-50511189IBEA-65015369

IBEA-55513277IBEA-71010391

IBEA-58503653IBEA-92903827IBEA-100500223

IBEA-90810263IBEA-62023101

IBEA-91103039IBEA-64028139

IBEA-72506043IBEA-57505877IBEA-63020525IBEA-57521485IBEA-53013211

IBEA-64025151IBEA-62522317

IBEA-90901495IBEA-119500321

IBEA-54008657IBEA-66021003IBEA-520200169IBEA-335100513

IBEA-65526105IBEA-105300299IBEA-90109771

IBEA-56521205IBEA-93705969

IBEA-52522477IBEA-52512555IBEA-68023599IBEA-64028051

IBEA-54018659IBEA-66522995IBEA-55026351

IBEA-69027365IBEA-52519727

IBEA-58018071IBEA-91901597

IBEA-94001059IBEA-72523131

IBEA-63006801IBEA-53011883

IBEA-57022657IBEA-63014361

IBEA-93100449IBEA-94305447IBEA-117300071IBEA-62519819

IBEA-59019649IBEA-91910389IBEA-54025823IBEA-92701313

IBEA-72015913IBEA-102200015IBEA-90403233

IBEA-94411571IBEA-56514889

IBEA-338600061IBEA-65504289

IBEA-50505995IBEA-92204601IBEA-51505815

IBEA-53001443IBEA-52002771

IBEA-69517011IBEA-71501919IBEA-65526869

IBEA-71020393IBEA-68509733IBEA-54514827

IBEA-62024259IBEA-93308617IBEA-336600337

IBEA-94306755IBEA-54014443IBEA-59524045IBEA-68010093

IBEA-71510179IBEA-92308007

IBEA-72001623IBEA-92302761

IBEA-56001307IBEA-61005157

IBEA-60012501IBEA-68505495IBEA-507100027IBEA-538900015

IBEA-58009213IBEA-517700073

IBEA-70522223IBEA-54516071

IBEA-56010599IBEA-72018869

IBEA-90600433IBEA-343800367

IBEA-93305363IBEA-103100125

IBEA-53521633IBEA-56522117

IBEA-66003741IBEA-71022603IBEA-549100025

IBEA-70008875IBEA-92911983

IBEA-92501131IBEA-107300423

IBEA-65523549IBEA-92507463IBEA-106100129

IBEA-62521873IBEA-91902297IBEA-68009825IBEA-92000643IBEA-59009945IBEA-94109555IBEA-63015719

IBEA-93409763IBEA-68528649IBEA-60505145IBEA-65027743

IBEA-52001765IBEA-54512011IBEA-65518775

IBEA-71018207IBEA-65012367

IBEA-59006963IBEA-338200347

IBEA-59508061IBEA-61510225

IBEA-54523705IBEA-50014575

IBEA-70513841IBEA-56501951IBEA-71523497

IBEA-90203447IBEA-110200473

IBEA-52515983IBEA-70528229IBEA-67006271IBEA-51507377IBEA-50506307IBEA-92300941IBEA-109300041

IBEA-69513135IBEA-72527771

IBEA-55011401IBEA-53527319

IBEA-63522865IBEA-342900045

IBEA-94201071IBEA-59024341

IBEA-54004873IBEA-90604599

IBEA-72030873IBEA-68527877

IBEA-53512505IBEA-55027843

IBEA-94404673IBEA-51507987

IBEA-51010615IBEA-52003783

IBEA-93111411IBEA-90210253

IBEA-71503031IBEA-68024237

IBEA-64019541IBEA-53019865

IBEA-67015419IBEA-68512567

IBEA-56513739IBEA-90501755

IBEA-67518715IBEA-66014481

IBEA-90404343IBEA-55528901

IBEA-69003101IBEA-92609141

IBEA-68505951IBEA-94206457IBEA-64011373

IBEA-94002165IBEA-63013661

IBEA-59009103IBEA-57510985

IBEA-69501447IBEA-93905279IBEA-91812687

IBEA-91711695IBEA-116900143IBEA-91711697

IBEA-92705827IBEA-90203751

gi15889258_Agrobacterium_tumefaciensgi17935855_Agrobacterium_tumefaciensgi15889243_Agrobacterium_tumefaciensgi17935838_Agrobacterium_tumefaciens

gi15965107_Sinorhizobium_melilotigi15965092_Sinorhizobium_meliloti

gi13470531_Mesorhizobium_lotigi13470532_Mesorhizobium_loti

gi23502128_Brucella_suisgi23502112_Brucella_suisgi17987038_Brucella_melitensisORFB02000_Brucella_ovisgi17987025_Brucella_melitensisORFB02020_Brucella_ovis

gi27380513_Bradyrhizobium_japonicumgi16125489_Caulobacter_crescentusgi16127429_Caulobacter_crescentus

gi15236220_Arabidopsis_thalianaIBEA-341600371

gi11466508_Reclinomonas_americanagi15892931_Rickettsia_conorii

gi15604507_Rickettsia_prowazekii

IBEA-341200243gi15805338_Deinococcus_radioduransgi15807044_Deinococcus_radiodurans

gi15607825_Mycobacterium_tuberculosis

IBEA-61513997IBEA-90109673

IBEA-72018133IBEA-59520255

IBEA-57523155IBEA-65513391IBEA-56017665

IBEA-94102631IBEA-63015427

High GC gram positive

α-Proteobacteria

γ-Proteobacteria

β-Proteobacteria

ε,δ-Proteobacteria

Unknown

CFB

Low GC gram positive

Cyanobacteria

Misc. Phyla

Misc. Phyla and Unknown

Chlorobia

IBEA-90012729IBEA-66500399

IBEA-68008829IBEA-91302065

IBEA-93205983IBEA-62022453

IBEA-50026099IBEA-52011705

IBEA-72014957IBEA-70009679IBEA-62517571

ORF00620_Porphyromonas_gingivalis

ORF00921_Fibrobacter_succinogenesORF03787_Fibrobacter_succinogenes

IBEA-67516231IBEA-50014715

IBEA-91501459IBEA-94505433

IBEA-65514023IBEA-56015989

IBEA-64003693gi19921738_Drosophila_melanogaster

gi25141371_Caenorhabditis_elegans

ORF01025_Dehalococcoides_ethenogenes

gi16752970_Chlamydophila_pneumoniaegi33241409_Chlamydophila_pneumoniaegi15835609_Chlamydophila_pneumoniaegi15617998_Chlamydophila_pneumoniae

gi15835213_Chlamydia_muridarumgi15605043_Chlamydia_trachomatis

gi29840456_Chlamydophila_caviae

gi17556456_Caenorhabditis_elegansgi21359837_Homo_sapiens

gi17864358_Drosophila_melanogastergi19112538_Schizosaccharomyces_pombe

gi6324761_Saccharomyces_cerevisiae

IBEA-63020287

gi34540215_Porphyromonas_gingivalisgi29348149_Bacteroides_thetaiotaomicron

d. EF-Tu

Planctomyces

Misc. Phyla

IBEA-94209989IBEA-50519451IBEA-71031009

IBEA-61018267IBEA-68000271IBEA-64004669IBEA-347700255

IBEA-59506139IBEA-65004445IBEA-94506241IBEA-93109157

IBEA-346000265IBEA-90107027IBEA-93101931IBEA-92202501

IBEA-56005817IBEA-55514325

IBEA-332300145IBEA-58510075IBEA-69522469IBEA-92111961

IBEA-55007323IBEA-67011467IBEA-93306131IBEA-92011155IBEA-90707141

IBEA-339000535IBEA-91503041

IBEA-56515861IBEA-62506985IBEA-93810619

IBEA-57022949IBEA-57016665

IBEA-91105397IBEA-94503009IBEA-55514339

IBEA-59510093IBEA-67011199IBEA-61025367IBEA-92400555IBEA-92104113IBEA-71018661IBEA-60511687

IBEA-60525371IBEA-332500511

IBEA-92203553IBEA-67508913

IBEA-346200443IBEA-62508371

IBEA-55513639IBEA-94209701

IBEA-91304661IBEA-100700391

IBEA-92801347IBEA-94108295IBEA-93703165IBEA-55012393IBEA-51013169

IBEA-91504033IBEA-90600535

IBEA-94008153IBEA-338200121IBEA-68510367IBEA-346400349IBEA-69025595

IBEA-61517805IBEA-92112393

IBEA-93308941IBEA-61516081IBEA-343400173

IBEA-58013613IBEA-59024807IBEA-55005443

IBEA-63007927IBEA-71509129

IBEA-91304305IBEA-58520905

IBEA-90104317IBEA-65503775

IBEA-63500677IBEA-63523603IBEA-56009077

IBEA-51508231IBEA-56510917IBEA-91503565IBEA-61025339IBEA-332500469

IBEA-55514445IBEA-70503837

IBEA-52522975IBEA-67506743

IBEA-61010573IBEA-332500151

IBEA-56008541IBEA-345300493

IBEA-61515595IBEA-62511491IBEA-70529463IBEA-63014943

IBEA-60021345IBEA-63026643IBEA-68522013

IBEA-90306059IBEA-334000425IBEA-93606245

IBEA-70008141IBEA-59024589

IBEA-346100155IBEA-61508631

IBEA-90600491IBEA-63526883

IBEA-93708839IBEA-57503223IBEA-72023757

IBEA-92701851IBEA-93602993IBEA-55022921IBEA-69508581

IBEA-64021643IBEA-92911583IBEA-94007061

IBEA-94308835IBEA-91103925IBEA-58510029IBEA-61012815

IBEA-55028723IBEA-61526071

IBEA-347200021IBEA-347900213

IBEA-92302917IBEA-92302289

IBEA-343100561IBEA-90302469IBEA-67025653

IBEA-59010299IBEA-56521377

IBEA-69007915IBEA-340300263IBEA-91710773IBEA-55010577IBEA-90000413IBEA-91900721IBEA-91900723IBEA-111100279IBEA-61516401IBEA-102400245IBEA-94203967IBEA-92907839

IBEA-94306347IBEA-101700223IBEA-57009269IBEA-64006779IBEA-57500599IBEA-57500579IBEA-93202553IBEA-56015549IBEA-93810183

IBEA-93702975IBEA-64007875

IBEA-93105811IBEA-116800265

IBEA-90610397IBEA-330200501IBEA-93606133IBEA-50014929

IBEA-57507081IBEA-56003713

IBEA-57501577IBEA-71522067

IBEA-91711439

IBEA-55527955IBEA-67010565IBEA-61005343IBEA-67526159

IBEA-94500839IBEA-91103591

IBEA-90510765IBEA-56501407IBEA-330800087

IBEA-60026025IBEA-343900433

IBEA-92805667

IBEA-61501715

IBEA-92102837IBEA-342200073

IBEA-63005809

IBEA-92910403

IBEA-63006819IBEA-56517241IBEA-347400213IBEA-93306939IBEA-61510049

IBEA-92201109IBEA-59525015IBEA-67517663IBEA-111800399

IBEA-66006033IBEA-349700305

IBEA-109400255IBEA-91111239IBEA-64016057IBEA-50021083

IBEA-333500175IBEA-56005937IBEA-335200239

IBEA-335300425IBEA-101900241IBEA-94111141IBEA-111700201IBEA-93905907

IBEA-102400181IBEA-93408265IBEA-93408271IBEA-92803965

IBEA-61509327IBEA-342800465IBEA-92703669IBEA-55011673

IBEA-55515021IBEA-331000171IBEA-62023439

IBEA-60026105IBEA-60015413IBEA-340400331

IBEA-91303685IBEA-67527047

IBEA-60009623IBEA-93104557

IBEA-345700145IBEA-64526309

IBEA-92704053IBEA-91800427IBEA-71030883IBEA-55021423IBEA-345100469IBEA-92103243IBEA-56519475

IBEA-55005357IBEA-50020123

IBEA-60526865IBEA-61008747

IBEA-92402429IBEA-70518157

IBEA-94503451IBEA-90707529IBEA-104300175

IBEA-55522753IBEA-104400093IBEA-91408671

IBEA-92607385IBEA-93709361

IBEA-52007383

IBEA-63004575IBEA-50501423

IBEA-60020025IBEA-60519523

IBEA-90001369

IBEA-333600321IBEA-92410859IBEA-71001471

IBEA-94207135IBEA-52022793IBEA-63519263IBEA-64014585IBEA-57514649IBEA-63003667

IBEA-63015409IBEA-56014033

IBEA-66511483IBEA-90001371

IBEA-60009045IBEA-63018833IBEA-64013283

IBEA-57515159IBEA-94108229

IBEA-102900177IBEA-64011295

IBEA-57518081IBEA-106800481

IBEA-91111191IBEA-67517445

IBEA-93003021IBEA-90208621IBEA-63518467

IBEA-53515123IBEA-58006161

IBEA-56010585

IBEA-93605675IBEA-347300517IBEA-61009441

IBEA-61517447

Lactobacillus_plantarum_WCFS1_gi28378660Bacillus_anthracis_str._Ames_gi30264385Bacillus_cereus_ATCC_14579_gi30022393

Saccharomyces_cerevisiae_gi6322505Schizosaccharomyces_pombe_gi19114371

Encephalitozoon_cuniculi_gi19074854Caenorhabditis_elegans_gi17562024

Homo_sapiens_gi24234688Arabidopsis_thaliana_gi30691626Arabidopsis_thaliana_gi15242459

Plasmodium_falciparum_3D7_gi23508542

Rickettsia_prowazekii_gi15604059Rickettsia_conorii_gi15892156

Wolbachia_sp._ORF01614

Caulobacter_crescentus_CB15_gi16124266Brucella_ovis_ATCC25840_ORFB00159Brucella_melitensis_16M_gi17988285Brucella_suis_1330_gi23502973

Figure S4e. HSP70 (tree image; taxon and sequence labels not reproduced here). Groups outlined in the tree: α-Proteobacteria, β-Proteobacteria, γ-Proteobacteria, δ-Proteobacteria, ε-Proteobacteria, Chlorobia/CFB, Chlamydia, Spirochetes, Low GC Gram positive, High GC Gram positive, Cyanobacteria, Misc. Phyla, Unknown.

Figure S4a-e. Phylogenetic trees of five phylogenetic markers commonly used in studies of bacterial phylogeny: 16S rRNA, RecA, elongation factor G (EFG), elongation factor Tu (EFTu), and heat shock protein 70 (HSP70). A tree is shown for each gene, with sequences from this study colored red and major phylogenetic groups outlined (clades of sequences that could not be assigned to any group are labeled “Unknown”). Only the bacterial portions of the trees are shown. The phylogenetic tree of rRNAs was generated by (1) aligning each Sargasso Sea rRNA of greater than 400 bp against its closest match in the alignments from the RDP II database and then using that alignment to align the new sequence to the complete RDP database; and (2) building a tree with the dnapars algorithm of the Phylip package, including all new Sargasso sequences, all sequences from complete genomes, and sequences from representatives of major phylogenetic groups. The phylogenetic trees of the four protein-coding sequences were generated as follows: (1) homologs of each protein were identified in the Sargasso predicted protein set and in complete genome sequences using blastp and HMM searches; (2) distant paralogs of each protein were excluded using a reciprocal top-match filter; (3) all sequences were aligned to each other using the HMM as a template; (4) poorly aligned regions were identified and removed using a conservation-score-based filter; (5) all sequences that did not have >50% overlap with the E. coli ortholog were excluded; (6) phylogenetic trees were generated using the protein parsimony algorithm in Phylip (parsimony was used to better deal with the limited overlap between many pairs of sequences). Only complete genomes were used for comparison so that each tree can be compared to the others without differences in species sampling complicating the comparison.
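Step (5) of the protein-tree procedure can be illustrated with a short sketch. It assumes that "overlap" means the fraction of the E. coli ortholog's aligned (non-gap) columns in which a candidate sequence also has a residue; the caption does not define the measure, and the function names, the gap character, and the toy alignment below are hypothetical.

```python
def overlap_fraction(seq: str, ref: str, gap: str = "-") -> float:
    """Fraction of the reference's non-gap columns where seq also has a residue."""
    if len(seq) != len(ref):
        raise ValueError("aligned sequences must have equal length")
    ref_columns = [i for i, residue in enumerate(ref) if residue != gap]
    if not ref_columns:
        return 0.0
    covered = sum(1 for i in ref_columns if seq[i] != gap)
    return covered / len(ref_columns)


def filter_by_overlap(alignment: dict, ref_id: str, min_overlap: float = 0.5) -> dict:
    """Keep the reference and every sequence whose overlap with it exceeds min_overlap."""
    ref = alignment[ref_id]
    return {
        name: seq
        for name, seq in alignment.items()
        if name == ref_id or overlap_fraction(seq, ref) > min_overlap
    }


# Toy alignment (hypothetical data): one reference and three fragments.
alignment = {
    "ecoli": "MKV-LAGT",
    "frag_a": "MKV-----",  # covers 3 of the 7 reference columns
    "frag_b": "--VALAGT",  # covers 5 of 7
    "frag_c": "MKV-L---",  # covers 4 of 7
}
kept = filter_by_overlap(alignment, "ecoli")
print(sorted(kept))
```

With a strict ">50%" threshold, a fragment covering exactly half of the reference columns would be dropped; whether the original filter treated that boundary the same way is not stated.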

Figure S5. Observed (black) and fitted (red) distributions of the fraction of the assembly (y-axis) covered at a given depth (x-axis), corresponding to the first model described in Table 3.

Figure S6. Accumulation curve for rpoB. Observed (black) OTU counts for rpoB (based on the fragment grouping summarized in Table 2), as well as the Chao1-corrected estimate of total species (red; see (3)). Points are mean values of 1000 shufflings of the observed data, while bars show 90% confidence intervals.
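The procedure behind such an accumulation curve can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the bias-corrected form of the Chao1 estimator, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of OTUs seen exactly once and twice (the exact variant used is given in reference (3), not here), and it takes the 5th and 95th percentiles across shufflings as the 90% interval. All function names are hypothetical.

```python
import random
from collections import Counter


def chao1(otu_labels):
    """Bias-corrected Chao1 richness estimate from per-sequence OTU labels."""
    counts = Counter(otu_labels)
    singletons = sum(1 for n in counts.values() if n == 1)
    doubletons = sum(1 for n in counts.values() if n == 2)
    return len(counts) + singletons * (singletons - 1) / (2 * (doubletons + 1))


def percentile(sorted_values, q):
    """Simple rank-based percentile (q in [0, 1]) of an already sorted list."""
    return sorted_values[round(q * (len(sorted_values) - 1))]


def summarize(values):
    """Mean plus 5th and 95th percentiles, i.e. a 90% interval."""
    values = sorted(values)
    return sum(values) / len(values), percentile(values, 0.05), percentile(values, 0.95)


def accumulation_curve(otu_labels, n_shuffles=1000, seed=0):
    """For each sample size, summarize observed OTU counts and Chao1 over shufflings."""
    rng = random.Random(seed)
    n = len(otu_labels)
    observed = [[] for _ in range(n)]
    estimated = [[] for _ in range(n)]
    for _ in range(n_shuffles):
        order = list(otu_labels)
        rng.shuffle(order)
        for size in range(1, n + 1):
            prefix = order[:size]
            observed[size - 1].append(len(set(prefix)))
            estimated[size - 1].append(chao1(prefix))
    return [summarize(v) for v in observed], [summarize(v) for v in estimated]


# Toy input (hypothetical): six sequences assigned to four OTUs.
labels = ["a", "a", "b", "c", "c", "d"]
obs_curve, chao_curve = accumulation_curve(labels, n_shuffles=100)
print(obs_curve[-1], chao_curve[-1])
```

Recomputing the estimate on every prefix is quadratic in the number of sequences, which is acceptable for a sketch; an incremental update of the singleton and doubleton counts would be preferable for large data sets.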

Figure S7. Each point in the figure corresponds to a scaffold from the assembly (restricted to scaffolds > 10kb). Scaffolds were placed in separate panels of the figure according to the most closely related organism as indicated by the BLAST searches described in the text. Within a panel, a scaffold is shown with x coordinate equal to its length, y coordinate equal to its estimated depth of coverage, and color determined by which of 6 k-mer composition clusters it was assigned to. Depth of coverage was estimated as the total base pairs in reads belonging to a given assembly piece divided by the length of the consensus sequence for the piece. K-mer composition clusters were determined by representing each scaffold as a vector of the frequencies of all possible 4-mers, considering both the forward and reverse strands of the sequence, and then applying the K-means clustering algorithm.
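The two quantities plotted in this figure can be sketched in a few lines. The code below is an illustrative reconstruction under stated assumptions, not the pipeline used for the figure: 4-mers containing ambiguous bases are skipped, counts are normalized to frequencies, and clustering uses a plain Lloyd's K-means with random initial centers. Function names and the toy sequences are hypothetical.

```python
import random
from itertools import product

K = 4
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def kmer_frequencies(seq):
    """256-element vector of 4-mer frequencies counted on both strands."""
    seq = seq.upper()
    counts = [0] * len(KMERS)
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - K + 1):
            j = INDEX.get(strand[i:i + K])  # 4-mers with N or other symbols are skipped
            if j is not None:
                counts[j] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(KMERS)


def depth_of_coverage(read_lengths, consensus_length):
    """Total base pairs in the reads divided by the consensus length."""
    return sum(read_lengths) / consensus_length


def kmeans(vectors, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per input vector."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy usage (hypothetical sequences); the figure itself uses 6 clusters.
toy = ["AATTAATTTTAAAT", "ATATTTAAATTAAT", "GGCCGCGGCCGCGC", "GCGCCGGCGGCCGC"]
print(kmeans([kmer_frequencies(s) for s in toy], k=2))
print(depth_of_coverage([800, 750, 850], 1200))
```

Counting both strands makes the vector independent of the orientation in which a scaffold happens to be reported, which is presumably why the caption specifies it.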

Multiple sequence alignment excerpt for Contig 842466 (gapped length: 555203), gapped and ungapped positions 301-600. Consensus (cns) for each 100-column window:

    301  CGCGGTCACCACGCCCAACTGGGCGCTGCCCGCCAACTTCCAGAATCACGGCAACTACGTGATTTCGCTGGGCAACGTCACGCGCTGGCTTGGACAGCAG
    401  GCCGAGGCGCTCGGTGTGGAGATCTTCCCCGGCTTCCCGGCTGCCGAGATTCTCTATAACGACGACGGCTCGGTGAAGGGTGTCGCGACCGGCAACATGG
    501  GCGTGGGCAAGGACGGCGAGCCGACCGAGAACTTCCAGCTCGGCATGGAGCTGCACGCGAAGTACACGCTGTTCGCCGAAGGCTGCCGCGGCCACCTCGG

Between 18 and 26 reads are aligned in each window. Every read matches the consensus at every column it covers, with two exceptions in the 501 window: one mismatch (c) in read (537912,537614) and one gap in read (499562,499264).

Figure S8. Scaffold 2205994, identified as Burkholderia related, has deep coverage, and the fidelity of the multiple sequence alignment is indicative of a clonal population. This image represents the structure of Scaffold 2205994 with respect to assembly. Blue segments represent contigs, green segments represent fragments, and yellow and red segments indicate stages of the assembly of fragments into the resulting contigs. Included is a representative sample of the multiple sequence alignment of Scaffold 2205994.

Multiple sequence alignment excerpt for Contig 844369 (gapped length: 38497), gapped positions 23201-23600 (ungapped 23163-23561). Consensus (cns) for each 100-column window:

    23201  CAGCTTTTGACGTTGCTGTTGAACACAATGCAGTGGATACCTGGGCAGAAATGCTCACGTTCG-CCGCACTGGTTAGTGAAAACGAAACCATGCAGCCAC
    23301  TGCTAACTGGTTCTTTAGCCAGTACTAAACTTGCTGCACTCTTTATTAGTGTGTGTGGTGAGCAAATCAATGAGCAAGGTCAGAACTTGATAAAGGTAAT
    23401  GGCTGAAAACGGTCGCTTAAAGGTACTACCTGCTGTATCTGAGCTTTTTGCTCAATACCGTAATGAGTGGGCAAAAGAAGTGGAAGCCGATGTGGTTTCT
    23501  GCAGCTGAGCTCAGCTCTGAACAACAGCAGCAGATCAGTAATTCTTTAGAGAAACGTCTCGCACGCAAAGTTAAGCTGAATTGCAGCACTGACGCCGCGC

The aligned reads fall into groups with distinct, internally consistent mismatch patterns. Reads (624726,624426), (532104,531806), (273518,273220) and (405822,405524) share the same four mismatches to the consensus in the 23201 window and the same eight in the 23301 window, whereas reads (592827,592529), (540528,540230) and (105700,105402) share a single, different mismatch in each of those windows. Reads (350663,350365), (590097,589799), (233243,232945), (363062,362764) and (1459411,1389095) match the consensus throughout. No mismatches appear in the 23401 window; in the 23501 window, six reads share two mismatches (t and g).

Figure S9. Scaffold 2205169, identified as Shewanella related, has significant coverage, and the fidelity of the multiple sequence alignment is indicative of co-assembled strains; in subsequent studies, these strains will be effectively separated by assembling with more stringent overlap criteria. This image represents the structure of Scaffold 2205169 with respect to assembly. Blue segments represent contigs, green segments represent fragments, and yellow segments indicate stages of the assembly of fragments into the resulting contigs. Note the double coverage in the interval from ~23 kb to 38 kb. The yellow bars indicate that fragments were initially assembled in several different pieces, which were collapsed to form the final contig structure. The multiple sequence alignment for this region shows distinct, separable haplotypes. Included is a representative sample of the multiple sequence alignment of Scaffold 2205169.
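What "separable haplotypes" means in such an alignment can be illustrated by grouping reads according to their mismatch pattern. The sketch below is only an illustration of that idea and is not the assembler's procedure (the caption attributes the eventual separation to more stringent overlap criteria at assembly time). It assumes rows of equal length in which "." marks agreement with the consensus; the function name and the toy rows are hypothetical.

```python
from collections import defaultdict


def group_by_mismatch_pattern(rows):
    """Group reads by what they show at columns where any read differs from the consensus.

    rows maps read id -> alignment row of equal length, where '.' means
    "matches the consensus" and any other character is a mismatch or gap.
    """
    lengths = {len(row) for row in rows.values()}
    if len(lengths) != 1:
        raise ValueError("all rows must have the same length")
    n_cols = lengths.pop()
    variable_cols = [
        i for i in range(n_cols) if any(row[i] != "." for row in rows.values())
    ]
    groups = defaultdict(list)
    for read_id, row in rows.items():
        signature = "".join(row[i] for i in variable_cols)
        groups[signature].append(read_id)
    return dict(groups)


# Toy rows (hypothetical, shortened): two reads share mismatches "t" and "g".
rows = {
    "read_1": "....t....g..",
    "read_2": "............",
    "read_3": "....t....g..",
    "read_4": "............",
}
print(group_by_mismatch_pattern(rows))
```

Real data would also need to handle reads that cover only part of a window and isolated sequencing errors, neither of which this sketch attempts.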

Figure S10. Scaffold 2217664, containing the gene encoding Proteorhodopsin. Genes are colored using color assignments described in Fig. 2, and contig boundaries are indicated with red vertical lines. In this scaffold, rhodopsin is associated with a DNA-directed RNA polymerase, sigma subunit (rpoD) originating in the CFB group.

Table S1. Sample details. Sample sites refer to established stations (indicated in Fig. 1). Precise latitude and longitude coordinates of the stations are: Station 13: 31º32.10'N, 63º35.70'W; Station 11: 31º10.50'N, 64º19.46'W; Station 3: 32º09.51'N, 64º00.61'W; Hydrostation S: 32º10.00'N, 64º30.00'W. Chemistry is provided where available. Unit abbreviations: dbar = decibar; psu = practical salinity units; rfu = relative fluorescence units.

Sample                        1            2            3            4            5            6            7            8
Site (station)                13 and 11    13 and 11    3            13           Hydro S      Hydro S      Hydro S      Hydro S
Date                          2/26/2003    2/26/2003    2/25/2003    2/25/2003    5/15/2003    5/15/2003    5/15/2003    5/15/2003
Water volume (l)              170          340          250          170          200          200          200          200
O2 (µmol/kg)                  221.3/217.4  221.3/217.4  221.3        221.3        NA           NA           NA           NA
Temp. (ºC)                    20.0/20.5    20.0/20.5    19.8         20.0         NA           NA           NA           NA
Salinity (psu)                36.6         36.6         36.6         36.6         NA           NA           NA           NA
Fluorescence (rfu)            0.230/0.108  0.230/0.108  0.371        0.230        NA           NA           NA           NA
Pressure (dbar)               6            6            6            6            NA           NA           NA           NA
Prefilter pore (µm)           0.8          0.8          0.8          0.8          20           3            0.8          0.1
Collection filter pore (µm)   0.1 (TFF)    0.22         0.22         0.22         3.0          0.8          0.1          50 kDa (TFF)
Clone insert size (Kbp)       2.5-6.0      2.0-6.0      3.5-6.0      2.0-6.0      2.0-6.0      2.0-4.0      2.0-6.0      NA
Sequences                     644,553      317,178      368,835      331,762      142,352      90,905       92,351       NA

Table S2. Summary statistics on the sequence and assembly data.

                                    Samples 1-4   Samples 5-7
Sequence reads                      1,662,328     325,608
  sequence (Mbp)                    1,361         265
Contigs                             121,477       N/A
  sequence (Mbp)                    256.0         N/A
Scaffolds                           64,398        N/A
  span (Mbp)                        400.0         N/A
Mini-scaffolds                      217,015       153,458
  span (Mbp)                        820.7         518.4
  sequence (Mbp)                    353.3         250.5
Singletons                          215,038       18,692
  sequence (Mbp)                    169.9         15.0
Total nonredundant sequence (Mbp)   779.2         265.5
Scaffolds>3x                        333           N/A
Span>3x (Mbp)                       30.9          N/A
Annotated genes                     1,001,987     212,220
16S rRNA genes                      1,164         248
rhodopsin homologs                  650           132
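
Table S2 does not say how the total nonredundant sequence is derived. As a consistency check only, the following Python sketch assumes that the total is the sum of the contig, mini-scaffold and singleton sequence rows; that relationship is an assumption, and the dictionary keys and function name are illustrative rather than taken from the source.

```python
from decimal import Decimal

# Sequence (Mbp) per category, copied from Table S2; "N/A" entries are omitted.
# Assumption: the nonredundant total is the sum of these categories.
SEQUENCE_MBP = {
    "samples_1_4": {"contigs": "256.0", "mini_scaffolds": "353.3", "singletons": "169.9"},
    "samples_5_7": {"mini_scaffolds": "250.5", "singletons": "15.0"},
}


def nonredundant_total(parts):
    """Sum per-category sequence lengths exactly, using Decimal arithmetic."""
    return sum((Decimal(value) for value in parts.values()), Decimal("0"))


totals = {name: nonredundant_total(parts) for name, parts in SEQUENCE_MBP.items()}

for name, total in totals.items():
    print(name, total)
```

Under this assumption the sums agree with the totals printed in the table (779.2 and 265.5 Mbp). Decimal is used so that the one-decimal figures are added without floating-point rounding.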

Table S3. Major scaffolds. Scaffolds with sequence coverage (n-fold) over 3 are shown. Phyla assignments are made by sequence similarity to known species. Phylogenetic markers located on scaffolds are noted as follows: HSP70(), RecA(), rRNA(), EFG(), EFTu(), RpoB(). Plasmid scaffolds are highlighted with shading.

                                         Sample composition (%)
Genbank Acc.  n-Fold  Span (kb)  SNP/kb  1      2      3      4      phyla, feature

2221900       4.8     57         2.87    21.9   15.6   39.4   20.1   Actinobacteria
2223269       3.9     66         7.09    31.9   22.5   14.8   28.2
2223246       4.5     59         9.56    29.3   23.8   14.7   30.3
2223373       4.5     58         5.56    34.4   19.9   14.9   30.7
2223927       5.7     150        7.99    6.8    26.7   42.9   22.7   ,
2220082       5.1     36         9.53    27.3   27.8   7.8    36.6
2223913       3.6     61         6.38    20.7   19.4   23.2   35.9
2220503       3.0     62         5.75    20.7   37.0   9.2    32.6   ,
2213560       3.1     60         4.61    24.2   32.2   21.5   20.8
2219467       3.9     39         8.78    28.3   40.8   5.0    24.2   ,
2213324       3.9     50         7.46    12.5   25.5   36.5   23.6   ,
2223625       4.9     43         8.94    8.3    21.6   41.1   28.2
2222752       3.7     46         10.08   11.2   46.5   18.8   23.5
2223497       7.1     70         12.68   4.0    29.4   48.5   16.0
2223616       5.4     64         8.68    6.5    27.4   44.5   20.5
2223819       3.7     67         4.68    6.4    20.5   57.3   12.7
2220665       4.5     24         8.53    14.3   22.4   34.7   28.6
2222567       3.6     43         4.76    7.3    21.3   39.3   31.3
2218471       3.2     42         5.26    8.1    31.5   45.2   12.1
2222672       3.5     30         10.52   9.2    24.8   51.4   14.7
2217601       6.8     28         1.25    4.3    18.0   59.0   16.8
2223566       4.2     101        5.81    7.2    26.1   47.5   18.4   Archaea
2224220       4.8     49         6.89    7.7    27.3   35.8   28.4   Archaea
2222847       4.1     58         5.26    9.4    29.6   37.2   20.2   Archaea
2223361       4.4     80         4.53    5.1    28.4   46.6   19.9   Archaea
2220906       4.0     55         5.00    5.7    17.5   50.7   25.8   Archaea
2217575       4.2     63         5.85    3.8    29.2   40.3   25.0   Archaea
2222387       3.8     31         5.61    13.4   30.5   34.1   19.5   Archaea
2222777       3.1     47         3.33    2.8    30.8   34.6   31.8   Archaea
2206159       15.4    45         0.81    99.2   0      0.6    0.2    Bacteria

                                         Sample composition (%)
Genbank Acc.  n-Fold  Span (kb)  SNP/kb  1      2      3      4      phyla, feature

2207

698

15.5

381.

0899

.10.

10.

80

Bac

teria

,

2220

155

6.2

340.

4998

.11.

50

0.4

Bac

teria

2204

896

7.3

180

97.1

02.

90

Bac

teria

2223

278

4.0

876.

0930

.131

.38.

928

.2B

acte

ria

2206

020

3.2

451.

5745

.222

.06.

524

.4B

acte

ria

2220

896

3.9

394.

8439

.014

.719

.823

.7B

acte

ria

2219

624

4.4

453.

2331

.916

.232

.418

.1B

acte

ria

2222

247

3.7

562.

9029

.321

.335

.112

.4B

acte

ria

2223

000

3.2

712.

8832

.222

.927

.317

.6B

acte

ria

2224

363

4.0

496.

3032

.012

.426

.327

.8B

acte

ria

2219

868

3.1

592.

6738

.816

.922

.520

.0B

acte

ria

2223

955

3.1

391.

7244

.921

.322

.89.

6B

acte

ria

2222

885

4.3

396.

7330

.629

.08.

628

.0B

acte

ria

2223

251

3.9

486.

9529

.315

.234

.019

.9B

acte

ria,

2219

030

3.6

322.

6442

.319

.214

.621

.5B

acte

ria

2223

615

4.7

374.

7330

.630

.09.

428

.9B

acte

ria

2216

981

9.1

731.

677.

541

.632

.917

.1B

acte

ria

2224

268

3.8

484.

5525

.110

.240

.621

.4B

acte

ria

2217

436

4.3

538.

8320

.326

.425

.926

.4B

acte

ria

2222

467

3.7

3410

.45

38.1

21.9

21.9

18.1

Bac

teria

2223

516

3.7

264.

2236

.723

.97.

328

.4B

acte

ria

2221

095

3.3

397.

1433

.323

.728

.114

.9B

acte

ria

2224

250

4.2

626.

6916

.646

.214

.321

.5B

acte

ria

2223

848

6.6

871.

155.

933

.643

.715

.4B

acte

ria

2206

018

5.1

158.

8635

.723

.514

.326

.5B

acte

ria

2223

244

3.5

273.

7436

.625

.815

.122

.6B

acte

ria

2223

482

3.1

712.

6814

.434

.715

.829

.7B

acte

ria

2218

890

3.6

544.

7015

.048

.113

.123

.8B

acte

ria,

2223

262

3.3

315.

5525

.432

.86.

633

.6B

acte

ria

2223

171

4.5

266.

1527

.320

.936

.413

.6B

acte

ria

2206

186

5.2

319.

4918

.220

.139

.020

.1B

acte

ria

2224

041

3.5

504.

5724

.818

.334

.921

.1B

acte

ria

2209

506

3.2

443.

4325

.742

.921

.010

.5B

acte

ria

2206

025

3.9

155.

8735

.412

.327

.721

.5B

acte

ria,

2219

767

3.3

400.

9818

.442

.413

.625

.6B

acte

ria

2221

878

3.6

1814

.22

32.9

22.9

5.7

38.6

Bac

teria

2223

891

3.1

109

1.65

6.5

13.7

61.4

16.0

Bac

teria

2224

183

4.8

758.

596.

526

.654

.611

.9B

acte

ria

2205

999

7.9

1911

.77

9.8

26.3

49.0

14.4

Bac

teria

2220

985

3.2

512.

6516

.843

.410

.629

.2B

acte

ria

2223

054

3.3

116

4.35

5.4

5.7

77.6

10.9

Bac

teria

2223

849

3.6

913.

686.

26.

973

.213

.8B

acte

ria

2221

945

4.1

445.

499.

023

.739

.726

.3B

acte

ria

2222

100

3.7

157

3.54

2.2

5.4

83.4

8.3

Bac

teria

,

2223

989

5.4

437.

236.

421

.852

.019

.3B

acte

ria

2224

322

3.3

462.

828.

329

.047

.615

.2B

acte

ria

2223

419

5.6

358.

395.

823

.350

.820

.1B

acte

ria

2223

903

3.2

111

3.25

3.3

32.7

47.9

16.2

Bac

teria

2210

775

5.0

335.

467.

419

.954

.416

.2B

acte

ria,

2221

947

3.3

322.

689.

429

.247

.214

.2B

acte

ria

                                         Sample composition (%)
Genbank Acc.  n-Fold  Span (kb)  SNP/kb  1      2      3      4      phyla, feature

2222

093

3.8

963.

162.

37.

479

.510

.7B

acte

ria

2220

466

3.8

114

4.86

2.5

5.8

79.4

11.8

Bac

teria

2222

119

5.6

4713

.22

4.1

23.6

50.0

20.0

Bac

teria

2221

794

3.3

733.

044.

138

.741

.515

.7B

acte

ria

2206

184

3.3

563.

285.

930

.950

.712

.5B

acte

ria

2223

564

4.7

387.

526.

722

.754

.615

.1B

acte

ria

2222

192

3.2

215.

0911

.629

.030

.429

.0B

acte

ria

2223

457

3.3

932.

452.

26.

187

.93.

8B

acte

ria

2221

919

3.3

105

2.56

2.3

5.0

84.4

8.3

Bac

teria

2222

078

4.4

491.

762.

735

.038

.921

.4B

acte

ria

2224

157

3.2

104

3.06

1.9

5.7

85.4

6.4

Bac

teria

2217

439

4.1

593.

291.

89.

880

.08.

4B

acte

ria

2222

403

3.7

764.

791.

513

.675

.09.

2B

acte

ria

2224

042

4.2

423.

422.

413

.770

.812

.5B

acte

ria

2218

490

3.1

383.

084.

26.

377

.99.

5B

acte

ria

2223

479

3.3

892.

571.

07.

978

.412

.0B

acte

ria

2206

179

3.4

724.

581.

45.

181

.411

.2B

acte

ria

2223

091

3.4

722.

831.

69.

181

.86.

4B

acte

ria

2223

888

3.4

551.

392.

15.

687

.54.

9B

acte

ria

2206

009

3.2

393.

362.

49.

771

.816

.1B

acte

ria,

(3)

2206

158

4.2

583.

530.

88.

585

.55.

2B

acte

ria

2223

389

3.2

581.

891.

28.

284

.75.

9B

acte

ria

2212

154

3.1

933.

250.

45.

185

.78.

8B

acte

ria

2218

269

3.2

522.

630.

86.

078

.215

.0B

acte

ria,

2224

187

3.9

423.

290

5.2

86.3

8.5

Bac

teria

2223

468

3.2

583.

660

7.9

85.4

5.3

Bac

teria

,

2215

689

3.2

611.

870

7.0

86.6

6.3

Bac

teria

2219

449

3.4

492.

580

7.2

87.0

5.8

Bac

teria

,

2222

954

5.0

660.

880

45.8

33.6

18.5

Bac

teria

2206

001

4.2

225.

230

3.8

88.7

3.8

Bac

teria

2211

103

3.2

422.

980

11.0

84.0

4.0

Bac

teria

2214

618

3.3

383.

460

4.9

90.2

2.4

Bac

teria

2217

985

4.0

1512

.96

25.0

27.6

17.1

28.9

Bor

dete

lla

2204

886

22.1

2106

0.09

99.6

t0.

4t

Bur

k

2205

994

21.8

1351

0.09

99.6

t0.

4t

Bur

k

2223

291

21.4

760

0.05

99.6

t0.

4t

Bur

k,

(2)

2219

903

21.7

684

0.09

99.6

t0.

4t

Bur

k,

2166

002

21.8

632

0.11

99.6

t0.

4t

Bur

k

2224

110

20.5

557

0.08

99.5

t0.

5t

Bur

k

2220

863

23.3

428

0.09

99.5

t0.

5t

Bur

k

2206

006

20.9

459

0.12

99.5

00.

4t

Bur

k,

2204

905

22.7

345

0.08

99.6

t0.

4t

Bur

k

2162

449

21.6

348

0.10

99.5

t0.

5t

Bur

k

2220

922

22.4

302

0.10

99.5

00.

4t

Bur

k

2216

180

21.6

286

0.16

99.6

00.

4t

Bur

k,

(2)

2216

320

21.7

252

0.08

99.5

t0.

4t

Bur

k

2223

649

21.4

222

0.13

99.1

t0.

7t

Bur

k

2219

961

14.1

320.

4498

.60.

50

0.9

Bur

k

2223

476

3.3

263.

000

8.2

86.7

5.1

Cau

loba

cter

2222

784

4.1

875.

516.

627

.744

.118

.7C

rena

rcha

eote

                                         Sample composition (%)
Genbank Acc.  n-Fold  Span (kb)  SNP/kb  1      2      3      4      phyla, feature

2223542       5.2     35         3.60    28.0   20.5   2.9    47.7   Cyanobacteria
2224023       3.1     56         1.03    5.2    48.5   29.1   17.2   Enterobacteria
2204901       6.2     18         0       97.1   0      2.9    0      Escherichia
2165346       5.6     16         2.55    94.6   0      5.4    0      Eukaryota
2206067       13.0    13         33.56   98.7   t      1.0    0.2
2204900       8.4     20         5.39    99.1   0      0.9    0
2223998       6.8     123        1.49    7.2    36.9   37.7   17.5
2220231       7.4     115        0.76    6.0    38.6   37.3   16.4   ,
2221027       3.6     107        3.94    16.6   43.5   10.5   28.0
2223448       6.9     116        1.88    6.5    39.3   37.6   16.2   ,
2221942       9.9     59         2.92    6.8    42.1   29.4   20.1
2223982       6.7     124        0.62    3.8    36.5   40.9   17.3
2221973       6.4     100        2.35    5.3    37.1   38.6   18.0
2222168       4.5     96         11.31   9.7    48.3   13.4   26.5
2214764       3.2     110        2.10    8.3    14.8   61.5   13.1   ,
2221018       3.2     113        0.99    7.4    13.2   60.3   18.1
2223232       3.3     24         1.72    32.9   19.5   17.1   26.8
2216346       3.7     44         4.29    16.8   38.5   7.0    37.1
2216186       3.6     44         2.36    15.2   41.3   15.2   24.6
2217018       7.8     38         14.04   6.4    25.4   48.5   18.1
2221917       7.3     45         9.57    5.4    25.4   48.9   19.3
2221682       5.6     59         2.12    4.5    42.7   32.8   17.5
2224279       4.3     59         8.17    7.1    27.6   48.5   16.3
2223378       5.5     44         8.94    3.3    25.5   54.6   14.4
2224077       3.9     62         0.58    5.7    12.7   68.4   13.3
2224120       3.7     30         6.97    6.5    27.2   43.5   17.4
2222122       4.7     79         4.73    0.7    5.4    90.9   2.7
2223787       3.8     86         5.23    0      10.4   80.7   8.0    , (2)
2217292       4.6     22         7.91    28.6   24.8   4.8    41.9   Haemophilus
2223327       6.3     36         7.16    33.2   23.9   9.7    32.8   Magnetococcus
2219697       3.7     30         5.21    32.8   28.6   3.4    34.5   Magnetospirillum
2221898       7.2     25         10.01   100.0  0      0      0      Microbulbifer
2160690       6.0     24         0       100.0  0      0      0      Microbulbifer
2212091       5.4     40         0.19    7.3    39.1   40.9   12.3   Microbulbifer
2206033       3.1     20         2.88    6.3    20.3   57.8   15.6   Microbulbifer,
2221052       4.1     31         1.87    2.1    2.1    90.7   3.1    Microbulbifer
2223677       4.6     33         12.81   44.0   11.5   3.8    39.6   Prochloro
2220925       5.3     29         18.46   40.4   12.3   1.8    43.9   Prochloro,
2221894       3.3     33         6.93    43.5   10.1   4.3    40.6   Prochloro
2223338       5.8     45         12.72   29.4   24.1   1.6    42.8   Prochloro
2210177       3.9     38         9.54    35.5   21.7   3.3    37.5   Prochloro
2223326       3.3     33         3.39    40.9   24.2   1.5    32.6   Prochloro
2222516       4.3     32         12.94   39.8   15.8   1.5    41.4   Prochloro
2223290       5.8     35         9.46    28.9   15.6   4.0    48.6   Prochloro
2222268       3.2     42         1.99    33.6   20.6   9.2    35.9   Prochloro
2208837       3.8     22         7.62    43.6   10.9   1.0    44.6   Prochloro
2223224       4.9     23         17.84   32.8   16.8   4.0    44.8   Prochloro
2223133       4.1     24         13.87   32.1   11.3   2.8    53.8   Prochloro
2222883       3.3     40         8.24    24.1   20.7   4.3    50.9   Prochloro

                                         Sample composition (%)
Genbank Acc.  n-Fold  Span (kb)  SNP/kb  1      2      3      4      phyla, feature

2205

038

3.3

175.

1433

.816

.90

49.3

Pro

chlo

ro

2220

012

3.1

1313

.79

30.8

11.5

1.9

55.8

Pro

chlo

ro

2217

574

22.9

180

0.14

99.3

t0.

6t

Pro

teob

2207

028

36.5

110

0.21

99.4

t0.

40.

1P

rote

ob

2204

904

22.0

180

0.16

99.5

00.

50

Pro

teob

2206

017

21.0

770.

0499

.40

0.5

tP

rote

ob

2205

165

10.4

121

2.05

98.5

0.1

1.2

0.1

Pro

teob

2161

373

6.7

820.

0310

0.0

00

0P

rote

ob

2205

847

11.2

462.

2099

.50

0.5

0P

rote

ob

2223

292

3.8

965.

2233

.729

.45.

330

.5P

rote

ob

2222

493

4.7

866.

4725

.636

.17.

331

.0P

rote

ob

2223

334

4.7

506.

1231

.627

.67.

632

.8P

rote

ob,

2223

340

5.3

866.

8512

.723

.743

.318

.8P

rote

ob

2221

061

3.1

128

2.41

15.5

44.6

10.1

28.0

Pro

teob

2223

941

3.1

645.

2835

.026

.416

.622

.1P

rote

ob

2222

038

7.5

819.

968.

127

.148

.514

.5P

rote

ob

2216

331

4.2

377.

5236

.115

.615

.631

.3P

rote

ob,

2219

891

3.6

423.

9330

.426

.718

.623

.6P

rote

ob,

2222

486

3.3

374.

7032

.830

.68.

228

.4P

rote

ob

2222

294

6.3

712.

118.

440

.432

.517

.8P

rote

ob

2220

673

3.8

593.

6817

.236

.513

.333

.0P

rote

ob

2206

076

4.7

314.

6525

.213

.740

.518

.3P

rote

ob

2223

981

3.3

417.

7225

.019

.227

.525

.8P

rote

ob

2223

282

4.4

2310

.97

29.1

25.2

20.4

22.3

Pro

teob

2164

328

4.1

270.

2124

.039

.210

.426

.4P

rote

ob

2223

443

5.4

807.

896.

919

.853

.718

.1P

rote

ob

2224

176

7.9

613.

145.

142

.931

.319

.6P

rote

ob

2211

760

4.4

3610

.96

18.2

29.7

18.2

31.1

Pro

teob

2223

874

9.1

433.

885.

842

.135

.316

.0P

rote

ob

2222

470

3.0

169

1.79

4.0

12.7

61.6

20.8

Pro

teob

,

2223

396

5.0

6611

.11

6.3

23.0

48.4

21.6

Pro

teob

2223

709

3.4

328.

1615

.234

.318

.131

.4P

rote

ob

2223

410

5.6

3910

.33

7.1

27.3

45.9

18.6

Pro

teob

2220

062

5.6

339.

906.

126

.038

.826

.5P

rote

ob

2206

012

4.6

260.

4610

.420

.053

.016

.5P

rote

ob

2223

636

3.1

486.

367.

134

.033

.322

.7P

rote

ob

2218

314

4.4

464.

744.

527

.349

.517

.7P

rote

ob

2222

348

3.2

305.

717.

623

.943

.523

.9P

rote

ob

2222

144

3.2

561.

773.

86.

483

.46.

4P

rote

ob

2222

648

6.2

421.

332.

339

.339

.317

.9P

rote

ob

2222

765

3.6

365.

262.

724

.358

.614

.4P

rote

ob

2222

834

3.0

872.

310.

42.

886

.210

.7P

rote

ob

2223

428

3.3

110

2.65

04.

088

.47.

6P

rote

ob

2220

387

4.1

643.

230

3.4

93.2

3.4

Pro

teob

,

2206

115

3.4

353.

120

2.7

90.3

7.1

Pro

teob

,

2218

993

3.1

600.

800

17.1

70.7

12.1

Pro

teob

2206

045

3.5

233.

280

7.0

80.2

12.8

Pro

teob

2223

557

3.1

123

2.28

13.4

40.6

15.4

26.9

Pse

udom

on.

2223

319

3.3

735.

5018

.839

.710

.928

.0P

seud

omon

.

2222

096

4.3

154.

4421

.411

.942

.922

.6R

alst

onia

                                         Sample composition (%)
Genbank Acc.  n-Fold  Span (kb)  SNP/kb  1      2      3      4      phyla, feature

2222

506

6.1

688.

5813

.924

.243

.218

.3R

hizo

bial

es

2165

280

8.9

452

4.01

98.8

t1.

0t

She

w

2206

005

6.9

543

0.05

99.3

00.

6t

She

w

2206

003

9.9

248

4.94

99.2

00.

60.

1S

hew

2169

979

6.4

355

0.40

99.6

00.

4t

She

w,

2204

940

6.4

341

0.02

99.2

00.

60.

1S

hew

2219

462

9.4

215

3.43

98.7

t1.

10.

2S

hew

,

2210

011

9.2

218

5.97

98.7

t1.

10.

1S

hew

2219

439

6.0

316

0.02

99.8

00.

20

She

w

2220

899

8.9

210

3.66

98.8

00.

90.

2S

hew

2220

903

9.2

203

4.82

98.9

t1.

00

She

w

2204

949

8.8

201

6.99

98.8

01.

10

She

w

2165

474

6.8

255

0.21

99.2

0.1

0.6

0S

hew

,

2211

373

9.8

164

6.08

97.8

t1.

90.

2S

hew

2205

163

10.2

151

9.34

98.4

0.1

1.5

0S

hew

2205

167

9.7

149

7.47

99.4

00.

60

She

w,

2205

174

9.6

124

6.70

99.5

00.

50

She

w

2173

780

5.8

204

0.01

99.9

00.

10

She

w

2205

995

7.5

155

1.06

98.9

00.

90.

3S

hew

2216

487

9.1

124

3.60

98.9

01.

10

She

w

2204

902

9.9

110

7.70

98.1

01.

90

She

w

2220

893

7.5

140

4.61

99.0

0.4

0.4

0.1

She

w,

2206

678

6.0

171

099

.70

0.3

0S

hew

2209

221

6.0

165

0.02

99.7

00.

30

She

w

2164

666

6.7

148

0.09

99.1

00.

90

She

w,

2219

456

6.1

157

0.01

99.3

00.

50.

2S

hew

2220

870

8.4

110

10.3

498

.20.

21.

10.

5S

hew

2204

893

5.8

153

0.03

99.6

00.

40

She

w

2216

368

5.6

150

099

.40

0.4

0.2

She

w

2216

236

6.2

131

0.02

99.4

00.

60

She

w

2206

949

9.2

846.

5698

.90.

20.

70.

2S

hew

2206

621

6.5

116

0.30

99.6

00.

40

She

w

2219

664

7.9

981.

7698

.70

1.3

0S

hew

2216

651

8.1

943.

8597

.90.

41.

00.

6S

hew

2205

920

9.6

749.

5799

.40

0.2

0.4

She

w

2204

954

6.4

111

099

.60

0.4

0S

hew

2205

164

8.4

862.

4798

.60.

21.

20

She

w

2163

072

7.4

100

6.31

98.8

01.

20

She

w

2204

937

8.4

811.

9098

.60

0.8

0.5

She

w

2207

906

5.3

121

0.07

99.2

00.

60.

2S

hew

2205

918

10.6

593.

0698

.60

1.4

0S

hew

2208

596

8.7

703.

1499

.30

0.7

0S

hew

2222

491

8.2

731.

9397

.20.

32.

00.

5S

hew

,

2205

127

5.9

970.

0198

.90

1.1

0S

hew

,

2204

979

9.7

555.

3299

.00

0.7

0.3

She

w

2210

125

6.4

840.

0799

.40.

30.

30

She

w

2205

133

6.2

860.

0499

.70

0.3

0S

hew

2206

062

7.8

694.

2198

.00

2.0

0S

hew

2216

181

5.6

900.

0399

.50

0.5

0S

hew

2204

874

6.5

750.

0199

.70

0.3

0S

hew

                                         Sample composition (%)
Genbank Acc.  n-Fold  Span (kb)  SNP/kb  1      2      3      4      phyla, feature

2204871       9.9     47         6.22    98.3   0.3    1.3    0      Shew,
2204941       9.6     48         8.23    99.0   0      1.0    0      Shew
2168940       8.9     50         12.83   97.6   0      2.1    0.3    Shew
2204920       6.6     64         0.06    99.6   0      0.4    0      Shew
2216439       5.9     73         7.06    98.0   1.1    0.9    0      Shew
2205898       7.4     55         1.74    99.1   0      0.9    0      Shew
2206989       5.3     73         0.04    99.0   0      1.0    0      Shew,
2204910       7.5     48         3.09    99.6   0      0.4    0      Shew
2221153       5.8     70         0.50    95.6   1.1    2.3    1.1    Shew
2205169       9.0     38         26.39   98.4   0      1.1    0.4    Shew
2219980       4.0     87         0.14    97.8   0.4    1.3    0.4    Shew
2206068       6.3     56         3.94    99.8   0      0.2    0      Shew
2166569       5.1     63         0.02    99.5   0      0.5    0      Shew
2205173       8.4     38         3.16    98.6   0      1.4    0      Shew
2204944       8.7     37         3.89    99.3   0      0.7    0      Shew
2205198       9.1     34         3.64    99.7   0      0.3    0      Shew
2204922       8.0     39         2.38    98.0   0      2.0    0      Shew
2216297       8.7     35         5.53    99.7   0.3    0      0      Shew
2205908       8.4     36         8.35    99.0   0      1.0    0      Shew
2204969       8.7     34         5.21    98.9   0      0.5    0.5    Shew
2204903       5.8     47         0.25    100.0  0      0      0      Shew
2222542       7.9     38         4.14    96.4   0.3    2.8    0.6    Shew
2204872       7.1     35         3.08    98.4   0      0.9    0.6    Shew
2168227       6.2     37         0       100.0  0      0      0      Shew
2169121       5.8     39         0       100.0  0      0      0      Shew
2204911       7.5     30         3.49    98.9   0      1.1    0      Shew,
2223639       3.1     76         0.28    99.6   0      0      0.4    Shew
2205846       9.0     24         21.99   97.8   0      2.2    0      Shew, (2)
2204947       8.4     23         6.98    98.8   0      0.4    0.8    Shew
2204889       7.3     25         0.61    100.0  0      0      0      Shew
2163269       6.7     26         0.12    99.1   0      0.9    0      Shew,
2216838       6.7     28         6.50    97.3   0      2.7    0      Shew
2205996       6.3     23         0.34    98.9   0      1.1    0      Shew
2173395       6.6     22         1.45    97.9   0      2.1    0      Shew
2208384       4.6     29         0.04    99.4   0.6    0      0      Shew
2205757       5.7     20         2.16    100.0  0      0      0      Shew
2205758       6.6     16         2.95    100.0  0      0      0      Shew,
2161683       5.2     20         0       98.5   0      1.5    0      Shew
2204923       5.5     19         0       100.0  0      0      0      Shew
2223559       4.9     28         0.83    90.6   4.3    2.9    0.7    Shew
2162269       5.4     19         0       100.0  0      0      0      Shew,
2204966       6.1     15         0       100.0  0      0      0      Shew
2205128       5.1     16         0       100.0  0      0      0      Shew
2205166       5.9     13         1.62    100.0  0      0      0      Shew
2204984       3.4     22         0       100.0  0      0      0      Shew
2205756       4.2     10         0.10    100.0  0      0      0      Shew
2223632       3.2     55         2.09    4.0    10.3   73.0   12.6   Shew
2217568       11.6    29         2.24    0      23.1   60.7   15.6   Shew
2222081       3.4     68         2.65    17.9   32.3   27.2   21.8   Sinorhizobium
2222388       3.3     44         5.32    21.6   26.8   35.3   15.7   Spirochaetales

                                         Sample composition (%)
Genbank Acc.  n-Fold  Span (kb)  SNP/kb  1      2      3      4      phyla, feature

2222

083

5.0

437.

7425

.719

.835

.917

.7S

trep

, (4

)

2223

293

4.0

202.

7639

.424

.516

.020

.2S

trep

,

2221

531

4.2

453.

4927

.118

.631

.419

.3S

trep

2223

578

7.1

971.

694.

345

.334

.914

.6V

ibrio

nace

ae

Table S4. Gene count breakdown by TIGR Role Category, sample comparison. Samples are described in Table S1; analyses for samples 1-4 were carried out on the pooled assembly of those samples. Note that there are 28023 genes which were classified in more than one role category.

TIGR Role Category                                          Samples 1-4  Samples 5-7  Total
Amino acid biosynthesis                                     32193        4925         37118
Biosynthesis of cofactors, prosthetic groups, and carriers  22517        3388         25905
Cell envelope                                               24510        3373         27883
Cellular processes                                          15059        2201         17260
Central intermediary metabolism                             11624        2015         13639
DNA metabolism                                              21178        4168         25346
Energy metabolism                                           59997        9721         69718
Fatty acid and phospholipid metabolism                      16282        2276         18558
Mobile and extrachromosomal element functions               969          92           1061
Protein fate                                                24296        4472         28768
Protein synthesis                                           41686        6326         48012
Purines, pyrimidines, nucleosides, and nucleotides          17321        2591         19912
Regulatory functions                                        7459         933          8392
Signal transduction                                         4205         612          4817
Transcription                                               10697        2059         12756
Transport and binding proteins                              42386        6799         49185
Unknown function                                            34104        3963         38067
Miscellaneous                                               1            1863         1864
Conserved hypothetical                                      641706       152355       794061
Total Number Of Roles Assigned                              1028190      214040       1242230
Total Number Of Genes                                       1001987      212220       1214207

Table S5. Analysis of microbial content versus eukaryotic content by sample. The proportion of sequence covered by predicted genes with similarity to known bacterial genes is shown for the unassembled singletons from each sample as well as for all scaffolds in the assembly of Samples 1-4. The larger scaffolds are derived entirely from microbial genomes and have the highest proportion (75%) of their sequences covered by known microbial genes. Assuming this value is a general upper limit for our ability to detect known microbial genes in any of the samples, we used this value to normalize the bacterial content found in the other samples, resulting in values ranging from 98% to 30%. The non-bacterial content is presumably eukaryotic DNA, as indicated by the presence of 18S rRNAs in the various samples. The eukaryotic genomes sampled here appear to be at very low coverage, since the 18S rRNAs are distinct and most are found in the unassembled reads. Indeed, even the 18S rRNAs found in the scaffolds are found only on mini-scaffolds (consisting of a read and its mate-pair). The normalized bacterial content varies inversely with the size of the pre-filter and filter, presumably because the former has the ability to exclude relatively large eukaryotic cells while the latter increases the proportion of the smaller bacterial cells collected.

                  Bacterial gene   Total sequence   Bacterial   Normalized          18S     Filter   Pre-filter
                  content (Kbp)    (Kbp)            fraction    bacterial content   rRNAs   (µm)     (µm)
Assemblies: all   217,610          299,020          0.73        0.97                6       -        -
large (>10 kb)    83,405           111,504          0.75        1.00                0       -        -
Sample 1          72,416           100,221          0.72        0.97                1       0.10     0.80
Sample 2          76,500           130,450          0.59        0.78                7       0.22     0.80
Sample 3          82,547           140,225          0.59        0.79                10      0.22     0.80
Sample 4          82,096           145,627          0.56        0.75                16      0.22     0.80
Sample 5          25,277           112,418          0.22        0.30                24      3.00     20.00
Sample 6          29,048           73,996           0.39        0.52                5       0.80     3.00
Sample 7          57,815           78,778           0.73        0.98                0       0.10     0.80
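
The normalization described in the caption of Table S5 can be sketched in a few lines of Python. This is an illustration rather than the authors' code: it assumes that each bacterial fraction is divided by the unrounded fraction of the large (>10 kb) scaffolds (about 0.75), and the dictionary keys and function names are made up for the example. Under that assumption the rounded results agree with the printed columns.

```python
# (bacterial gene content, total sequence) in Kbp, copied from Table S5.
ROWS = {
    "assemblies_all": (217_610, 299_020),
    "assemblies_large": (83_405, 111_504),
    "sample_1": (72_416, 100_221),
    "sample_2": (76_500, 130_450),
    "sample_3": (82_547, 140_225),
    "sample_4": (82_096, 145_627),
    "sample_5": (25_277, 112_418),
    "sample_6": (29_048, 73_996),
    "sample_7": (57_815, 78_778),
}


def bacterial_fraction(row):
    """Fraction of the sequence covered by known bacterial genes."""
    gene_content_kbp, total_kbp = row
    return gene_content_kbp / total_kbp


# Assumption: the upper limit used for normalization is the unrounded
# fraction observed for the large scaffolds (printed as 0.75 in the table).
REFERENCE = bacterial_fraction(ROWS["assemblies_large"])


def normalized_content(name):
    """Bacterial fraction relative to the large-scaffold upper limit."""
    return bacterial_fraction(ROWS[name]) / REFERENCE


table = {
    name: (round(bacterial_fraction(row), 2), round(normalized_content(name), 2))
    for name, row in ROWS.items()
}

for name, (fraction, normalized) in table.items():
    print(f"{name:18s} {fraction:.2f} {normalized:.2f}")
```

Dividing by the rounded value 0.75 instead would shift some entries in the second decimal place, which is why the sketch keeps the reference unrounded.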

References

1. J. R. Cole et al., Nucleic Acids Research 31, 442 (Jan 1, 2003).
2. T. P. Curtis, W. T. Sloan, J. W. Scannell, Proceedings of the National Academy of Sciences of the United States of America 99, 10494 (Aug 6, 2002).
3. A. Chao, M.-C. Ma, M. C. K. Yang, Biometrics 43, 783 (1993).