gutell 120.plos_one_2012_7_e38320_supplemental_data

The Fragmented Mitochondrial Ribosomal RNAs of Plasmodium falciparum

Jean E. Feagin1*¤a, Maria Isabel Harrell1¤b, Jung C. Lee2¤c, Kevin J. Coe1¤d, Bryan H. Sands1¤e, Jamie J. Cannone2, Germaine Tami1¤f, Murray N. Schnare3, Robin R. Gutell2

1 Seattle Biomedical Research Institute, Seattle, Washington, United States of America, 2 The Institute for Cellular and Molecular Biology, and the Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, Texas, United States of America, 3 Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada

Abstract

Background: The mitochondrial genome in the human malaria parasite Plasmodium falciparum is most unusual. Over half the genome is composed of the genes for three classic mitochondrial proteins: cytochrome oxidase subunits I and III and apocytochrome b. The remainder encodes numerous small RNAs, ranging in size from 23 to 190 nt. Previous analysis revealed that some of these transcripts have significant sequence identity with highly conserved regions of large and small subunit rRNAs, and can form the expected secondary structures. However, these rRNA fragments are not encoded in linear order; instead, they are intermixed with one another and the protein coding genes, and are coded on both strands of the genome. This unorthodox arrangement hindered the identification of transcripts corresponding to other regions of rRNA that are highly conserved and/or are known to participate directly in protein synthesis.

Principal Findings: The identification of 14 additional small mitochondrial transcripts from P. falciparum and the assignment of 27 small RNAs (12 SSU RNAs totaling 804 nt, 15 LSU RNAs totaling 1233 nt) to specific regions of rRNA are supported by multiple lines of evidence. The regions now represented are highly similar to those of the small but contiguous mitochondrial rRNAs of Caenorhabditis elegans. The P. falciparum rRNA fragments cluster on the interfaces of the two ribosomal subunits in the three-dimensional structure of the ribosome.

Significance: All of the rRNA fragments are now presumed to have been identified with experimental methods, and nearly all of these have been mapped onto the SSU and LSU rRNAs. Conversely, all regions of the rRNAs that are known to be directly associated with protein synthesis have been identified in the P. falciparum mitochondrial genome and RNA transcripts. The fragmentation of the rRNA in the P. falciparum mitochondrion is the most extreme example of any rRNA fragmentation discovered.

Citation: Feagin JE, Harrell MI, Lee JC, Coe KJ, Sands BH, et al. (2012) The Fragmented Mitochondrial Ribosomal RNAs of Plasmodium falciparum. PLoS ONE 7(6):e38320. doi:10.1371/journal.pone.0038320

Editor: Bob Lightowlers, Newcastle University, United Kingdom

Received February 15, 2012; Accepted May 3, 2012; Published June 22, 2012

Copyright: © 2012 Feagin et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: Support for these studies was provided by National Institutes of Health R01 AI40368 and The M. J. Murdoch Charitable Trust (JEF), by grant MOP-11212 to Michael W. Gray from the Canadian Institutes of Health Research (MNS), and National Institutes of Health R01 GM067317 and Welch Foundation F-1427 (RRG). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: [email protected]

¤a Current address: Departments of Pharmacy and Global Health, University of Washington, Seattle, Washington, United States of America

¤b Current address: Center for Childhood Infections and Prematurity Research, Seattle Children’s Research Institute, Seattle, Washington, United States of America

¤c Current address: Department of Physics and Chemistry, Milwaukee School of Engineering, Milwaukee, Wisconsin, United States of America

¤d Current address: Drug Discovery, Drug Metabolism and Pharmacokinetics, Johnson and Johnson, San Diego, California, United States of America

¤e Current address: Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America

¤f Current address: Atlantic Research Centre, Dalhousie University, Halifax, Nova Scotia, Canada

Introduction

The human malaria parasite Plasmodium falciparum has a most

unusual mitochondrial (mt) genome. It consists of tandem repeats

of a 5967 nt sequence which encodes open reading frames similar

to the genes for cytochrome c oxidase subunits I and III (cox1 and

cox3) and apocytochrome b (cob) of other organisms [1]. Mt DNAs

also characteristically encode the rRNAs needed for mt translation

[2]. While sequences similar to regions of large subunit (LSU) and

small subunit (SSU) rRNAs are encoded by the P. falciparum mt

genome, they are scattered across both strands of the genome,

interspersed with each other and the protein coding genes, and

correspond to small RNAs [3,4]. Fragmented rRNAs have been

reported from other organisms [5–13] and in some instances, most

notably for Chlamydomonas reinhardtii mitochondria [5], the

fragments are encoded out of order with each other. However,

the Chlamydomonas mt rRNA fragments correspond to the majority

of the eubacterial rRNA structure while those of Plasmodium species

map to a smaller percentage of the bacterial rRNA. The

Caenorhabditis elegans mt rRNAs [14] and those of kinetoplastid

protozoans [15–18] are very small but are composed of continuous

LSU and SSU rRNAs. The combination of a high degree of

fragmentation and the small size of the fragments makes the P.

falciparum mt rRNAs among the most unusual described.

PLoS ONE | www.plosone.org 1 June 2012 | Volume 7 | Issue 6 | e38320

The features of a particularly small mt genome with just three

protein coding genes and highly fragmented mt rRNAs appear to

be conserved among the phylum Apicomplexa, to which P.

falciparum belongs. The mt DNA sequence has been determined for

numerous species of Plasmodium, which demonstrate strong overall

conservation. The sequence and transcription of the mt genome of

the rodent parasite Plasmodium yoelii have been examined in some

depth [19,20], demonstrating the same structural and genetic

organization as for P. falciparum. Mt genomes of other Plasmodium

species, while not as thoroughly analyzed, share these characteristics

[21] (also described in this work), as do related species.

Theileria parva is an apicomplexan cattle parasite having a ~6.6 kb

mt genome [22,23]. The same three protein coding genes are

present as in P. falciparum and a few rRNA fragments corresponding

to domains IV and V of the LSU rRNA have been assigned by

sequence analysis in T. parva [23]. The gene order for the T. parva

mt genome differs from that of Plasmodium, however, as does the

physical form of the genome. The Plasmodium mt genome occurs in

tandem repeats but the T. parva mt genome consists of single linear

copies of the genome bounded by inverted repeat sequences [23].

The mt genomes of additional Theileria species and of several

Babesia species [24] have also been sequenced and are similar in

character to that of T. parva, consistent with these genera both

being in Piroplasmida. Those of Eimeria tenella, a coccidian, have

much the same genes but show a third variation on mt gene order

[25]. The mt genomes of Hepatocystis, Leucocytozoon, and Hemoproteus

species, lesser known apicomplexans, closely resemble Plasmodium

mt genomes in structure and content [26]; they and Plasmodium

species are all members of Order Haemosporidia. Highly

fragmented mt rRNAs thus appear to be a common feature of

apicomplexans.

Assignment of fragmented rRNAs based on DNA sequence

similarities alone can be misleading. Consequently, evidence that

the P. falciparum mt rRNA-like sequences are transcribed is an

important factor in establishing their potential as components of

functional mt ribosomes. We have previously reported a total of 20

small transcripts mapping to the P. falciparum mt genome [4]. They

range from 40 to 195 nt in length and many carry non-coded

oligo(A) tails of unknown function [27,28]. A majority of the small

RNAs are similar to regions of rRNA sequences from other

organisms [3,4], although they are encoded in the mt genome out

of linear order with respect to conventional rRNAs. Taken

together, they comprise many of the highly conserved sequences in

rRNAs. However, RNAs corresponding to many regions of the

rRNAs were not previously detected, including the GTPase center

in domain II of the LSU rRNA. The two candidate sequences

initially thought to comprise that region were each predicted to be

~100 nt long [3] but evidence for transcripts in that size range

was not found [4]. With the recent proliferation of large-scale

sequencing projects, small transcripts complementary to mt DNA

have been reported from P. falciparum cDNA libraries, and the

similarity of some of these to rRNA regions has been noted [29].

Detailed crystallographic data for ribosomes and ribosomal

subunits have become available in recent years, greatly assisting

our understanding of the functional correlates of rRNA and

protein structures. In 2000, Ban et al. [30], Wimberly et al. [31],

and Schluenzen et al. [32] published high-resolution three-dimensional

crystal structures of the Haloarcula marismortui large

subunit and the Thermus thermophilus small subunit of the ribosome,

respectively. A year later, Yusupov et al. [33] published the

structure of the assembled T. thermophilus 70S ribosome. More

detailed structural analyses have followed, including lower

resolution analyses of the mammalian [34,35] and C. elegans [36]

mt ribosomes and mammalian cytoplasmic ribosomes [37]. More

recently, higher resolution three-dimensional structures of two

eucaryotic cytoplasmic small ribosomal subunits were published,

from Saccharomyces cerevisiae [38] and Tetrahymena thermophila [39].

The mechanism of action of antibiotics has been structurally

probed [40–42], the mechanism of peptide bond formation has

been evaluated [43–49], specific interactions between rRNA and

protein components of ribosomes have been assessed [50–53], and

the structure of the polypeptide exit tunnel has been revealed [54].

This wealth of structural data, augmented with comparative

information that reveals patterns of sequence and structure

conservation [55], provides a framework for further evaluation

of the unusual Plasmodium rRNAs.

We here report mapping an additional 14 small transcripts to

the P. falciparum mt genome, yielding a total of 34 such transcripts

(Figure 1). Using multiple analyses, we have identified 27 of them

as rRNA fragments assigned to specific regions of SSU and LSU

rRNA. The regions represented are highly similar to the small but

contiguous mt rRNAs of C. elegans, and are consistent with a

functional role in mt protein synthesis for these highly fragmented

rRNAs.

Results

Plasmodium mt DNA Conservation

Complete (or nearly complete) mt genome sequences are

currently available for 25 species and one Plasmodium isolate

collected from a mandrill, and for eight species in four related

genera of hemosporidians (Table S1). All of them have the same

order and orientation of the protein coding genes and the

remaining sequence is highly conserved. We aligned representative

sequences from each Plasmodium species, the Plasmodium isolate, and

the related hemosporidians, later adding mt DNA sequences from

another 184 Plasmodium mt DNAs (Table S1). Based on the

alignments, the closest relatives of P. falciparum are parasites of

chimpanzees: P. reichenowi, P. billcollinsi, and P. billbrayi, the latter

two being recently suggested species [56]. (Gorillas harbor a

parasite that appears even more closely related to P. falciparum [57]

but a full mt DNA sequence is not available.) The mt DNA

sequences of the three chimpanzee parasites are 97.4%, 94.2%,

and 91.7% identical to that of P. falciparum (M76611), respectively.

The mt DNAs of other primate malarias, including the human

parasites P. vivax, P. ovale, P. malariae, P. knowlesi, and multiple

species that parasitize monkeys, are 87.3%–89.1% identical to P.

falciparum. Those of rodent, lizard, and avian malaria parasites are

87.8%–88.3% identical to P. falciparum. The conservation of the

non-Plasmodium hemosporidian mt DNAs relative to that of P.

falciparum is 82.6%–86.8%. Some of the variation reflects minor

size differences among the mt genomes (Table S1). Much reflects

silent or conservative substitutions in the three protein coding

genes.
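The percent identity figures quoted above can be illustrated with a short calculation over a pair of aligned sequences. This is a minimal sketch, not the method used in the study: the text does not say how alignment gaps were scored, so the function below assumes that columns containing a gap in either sequence are skipped. The function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: gapped columns are ignored rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with an insertion/deletion
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned pair (hypothetical, not real mt DNA): 9 ungapped columns, 8 matches.
    print(round(percent_identity("ACGT-ACGTA", "ACGTTACCTA"), 1))
```

Counting gapped columns as mismatches instead would lower the reported identity, which matters here because most length differences among these genomes fall at gene junctions.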

Variations between individual sequences may represent polymorphisms

or sequencing errors instead of real differences. Mt

genomes have been sequenced from numerous geographically

distinct P. falciparum [58,59] and P. vivax lines [60]. There are

comparatively few polymorphic sites within these species, principally

silent substitutions in the protein coding genes. There is far

less redundancy of mt DNA data from other Plasmodium species;

for some species, only a single sequence is available. However,

some of those less-represented species are very closely related; for

example, the mt DNA sequences of P. cynomolgi, P. fieldi, and P.

simiovale are ~99% identical to that of P. vivax so shared variation is

likely real. Some of the sequence variation reflects single nt

differences in just one of the species. Prudence is warranted when

interpreting these as they may be idiosyncratic to a specific isolate

or may reflect sequencing errors. Many of the mt DNAs were

generated by PCR amplification, adding another possible source

for variation. Additionally, the PCR-generated mt genomes

typically lack a 50–100 nt region. Because the P. falciparum mt

genome is tandemly repeated, PCR amplification with neighboring

divergent primers yields a nearly complete genome, missing

only the sequence between the primers. Sequences generated from

different isolates or species in the same study lack the same region

so these presumably reflect primer location. Such mt DNA

sequences were included in our analyses, with the missing

sequence ignored. We did not include other partial or incomplete

mt genome sequences in our analyses.
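The arithmetic behind a PCR-generated genome that lacks a short region can be sketched as follows. On a tandem repeat, the primer pointing toward higher coordinates extends through the repeat junction until it reaches the other primer's site in the next unit, so the product contains everything except the gap between the two primers. The function name and the primer positions are hypothetical; only the 5967 nt unit length comes from the text.

```python
def divergent_pcr_product(unit_len: int, left_primer_end: int,
                          right_primer_start: int) -> tuple[int, int]:
    """Return (product_length, missing_length) for divergent primers on a tandem repeat.

    left_primer_end: last position covered by the primer that extends toward
        lower coordinates (1-based, within one repeat unit).
    right_primer_start: first position covered by the primer that extends
        toward higher coordinates (1-based, within one repeat unit).
    """
    if not 1 <= left_primer_end < right_primer_start <= unit_len:
        raise ValueError("expected 1 <= left_primer_end < right_primer_start <= unit_len")
    # Only the bases strictly between the two primers are absent from the product.
    missing = right_primer_start - left_primer_end - 1
    return unit_len - missing, missing


if __name__ == "__main__":
    # Hypothetical primer positions on the 5967 nt P. falciparum repeat unit.
    product, missing = divergent_pcr_product(5967, 3000, 3076)
    print(product, missing)
```

With these assumed positions the missing stretch is 75 nt, within the 50–100 nt range described above.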

The size of the complete Plasmodium mt genomes varies from

5864 nt (P. billcollinsi and P. billbrayi) to 6014 nt (P. juxtanucleare).

The differences relate primarily to nt additions/deletions at the

junctions between genes–both mRNAs and small RNAs–or in the

281 nt of apparently non-coding sequence. This is so marked that

gaps in sequence alignments almost always correspond to the

mapped locations of transcript ends (Table 1). Other hemosporidian

mt genome lengths range from 5935 nt (L. sabrazesi) to

6684 nt (L. majoris), except for Hepatocystis, which has substantial

insertions relative to the other mt genomes (discussed below). For

each, most of the variation between sequences is due to differences

at the junctions between genes. Many of the alignment gaps occur

at an A run, with the number of 3′ A residues (or T for the

complementary strand) in mt DNAs often differing between

species. Nearly all of the P. falciparum small RNAs are

oligoadenylated after transcription so differences in the number

of encoded A residues at the 3′ end of an RNA may not matter.

Addition of A residues post-transcriptionally may thus render the

RNAs more similar among the species than are their genes.
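One way to check whether two gene copies differ only in the length of their encoded 3′ A run is to compare them after stripping terminal A residues. This is an illustrative sketch with hypothetical function names and toy sequences; it is a simplification, since it also strips any terminal A residues that belong to the RNA body rather than to the run.

```python
def core_without_a_run(seq: str) -> str:
    """Return the sequence with its terminal 3' run of A residues removed."""
    return seq.upper().rstrip("A")


def differ_only_in_terminal_a(seq_a: str, seq_b: str) -> bool:
    """True if the two sequences are identical apart from the length of the 3' A run."""
    return core_without_a_run(seq_a) == core_without_a_run(seq_b)


if __name__ == "__main__":
    # Toy 3' ends (hypothetical): same body, three versus five encoded A residues.
    print(differ_only_in_terminal_a("GGCTTCAAA", "GGCTTCAAAAA"))
    # A substitution in the body is still reported as a real difference.
    print(differ_only_in_terminal_a("GGCTTCAAA", "GGCTTGAAA"))
```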

More substantial size variation occurs at some sites, creating

gaps in the alignment (Table S2). The twelve largest gaps include

six at the junction of genes transcribed from different strands and

six with both genes on the same strand. The largest of the gaps is

found between the cox1 and cob ORFs, separated by 20 nt in P.

falciparum but by 94 in Hepatocystis. Transcript mapping has shown

that the P. falciparum cox1 and cob transcripts are immediately

adjacent [27]. If that is also the case in Hepatocystis, the

untranslated regions (UTRs) for one or both transcripts would

be larger than some of the small mt RNAs. A thirteenth gap is a

51 nt insertion in the Hepatocystis cob ORF toward the 3′ end. The

insertion maps to the 3′ end of LSUB, and may or may not be

included in the Hepatocystis LSUB RNA. If included, it would be a

3′ tail for LSUB, and easily accommodated in the structure. The

potential effect on COB is greater. The Hepatocystis amino acid

sequence is well-conserved relative to that of Plasmodium prior to

and following the insertion. The DNA sequence of the insertion

contains two stop codons but it seems unlikely that it is an intron.

These are not common in mt-encoded genes and there is no other

evidence for introns in apicomplexan mt genomes. Alternate

explanations are that Hepatocystis has a truncated but functional

COB, or that the stop codons reflect sequence errors. (Since only

one Hepatocystis sequence is available, the latter explanation seems

quite plausible.) The inserted sequence, if translated, would have a

hydrophobic character, consistent with the membrane protein

COB, and the LSUB RNA would be unaffected, other than a

possible tail.

In several cases, some or all of the gap sequence is highly

conserved among species. A run of 13 nt in the gap between cox3

and LSUF is conserved in all the species we examined, as are

seven of the eight nt immediately adjacent to LSUF. Of 21 nt

between RNA6 and RNA12, 15 are conserved in all the species we

examined. Even when there is not broad sequence conservation,

there tends to be conservation among closely related species,

consistent with related hosts (Table S1). However, species with the

same size of intergenic region (Table S2) do not necessarily have

the same sequence.

A handful of sites have 4–7 nt discontinuities in the alignment;

these are invariably located at the 3′ or 5′ end of a gene.

Differences in nt sequence also tend to cluster near the ends of the

small RNAs and in the intergenic regions. RNA10, which is

assigned to the sarcin-ricin loop (SRL), is the principal exception

(Figure S1). The 3′ part of RNA10 contains the SRL as part of a

Figure 1. Schematic map of the P. falciparum mt genome. A schematic map of the 6 kb element is shown, with genes above or below the line depending on direction of transcription (orange arrows: left to right above the line and right to left below). Because the P. falciparum mt genome is tandemly repeated, the endpoints shown are the endpoints of Genbank submission M76611, rather than actual structure. Protein coding genes are indicated by white boxes, small mt rDNA sequences are shown with blue (SSU rRNA) and green (LSU rRNA) boxes and labels, and the locations of RNA23t-RNA27t are indicated with red arrows and labels. Gene abbreviations: cox1 and cox3, cytochrome c oxidase subunits I and III; cob, cytochrome b; LA-LG, LSU rRNA fragments; SA-SF, SSU rRNA fragments; 1–22, RNA1-RNA22; 23t-27t, RNA23t-RNA27t.

doi:10.1371/journal.pone.0038320.g001

Table 1. P. falciparum small mt RNA mapping and identification.

| gene (a) | P. falciparum M76611 coordinates (b) | T. parva Z23263 coordinates | Raabe et al. [29] cDNAs (c) | size RNA/DNA (P. falciparum) (d) | subunit/order (e) |
|---|---|---|---|---|---|
| SSUA | 2023-1916 | 3666-3561 | mtR_6043/27 | 125/108 | SSU/4 |
| SSUB | 505-390 | 4182-4074 | none complete | 116/116 | SSU/6 |
| SSUD | 5446-5379 | 3800-3733 | mtR_111/36 | 65/68 | SSU/10 |
| SSUE | 1638-1680* | 2095-2057 | mtR_215/7 | 40/42 | SSU/11 |
| SSUF | 5507-5447 | 1840-1900 | mtR_11b/1105 | 74,58/61 | SSU/12 |
| LSUA | 5201-5026 | 4073-3906 | none complete | 175/176 | LSU/1 |
| LSUB | 4618*-4586 | 3667-3691 | mtR_143/27 | 30/29 | LSU/3 |
| LSUC | 225*-204 | 1824-1839 | mtR_203/2 | 23/22 | LSU/4 |
| LSUD | 5854-5772 | 3114-3196 | mtR_5773a/31 | 85/83 | LSU/8 |
| LSUE | 5771-5577 | 3199-3377 | none complete | 190/195 | LSU/9 |
| LSUF | 1516-1630 | 3507-3395 | none complete | 125,110/115 | LSU/11 |
| LSUG | 389-283 | 5667-5775 | mtR_36a/221 | 110/107 | LSU/12 |
| RNA1 | 593-506 | 3827-3905 | mtR_19/19 | 95/88 | LSU/6 |
| RNA2 | 1698-1763 | 3098-3032 | mtR_69c/15 | 75/67 | LSU/2 |
| RNA3 | 1830-1910 | 5658-5584 | mtR_5713a/91 | 85/81 | LSU/7 |
| RNA4 | 4625-4696 | | mtR_37a/410 | 75/72 | |
| RNA5 | 4716-4802 | | mtR_5822/61 | 92/87 | SSU/9 |
| RNA6 | 4808-4865 | 1643-1584 | mtR_92/41 | 58/58 | LSU/15 |
| RNA7 | 5283-5202 | 1678-1758 | mtR_5837/69 | 94/82 | |
| RNA8 | 5954-5855 | 3031-2940 | mtR_41/38 | 115/100 | SSU/5 |
| RNA9 | 72-125 | 4211-4183 | mtR_182b/7 | 62/54 | SSU/8 |
| RNA10 | 724-625 | 4359-4301 | mtR_140a/29 | 100/100 | LSU/13 |
| RNA11 | 5340-5284 | 2300-2240 | mtR_14a/12 | 78/57 | LSU/5 |
| RNA12 | 4887-4945 | | mtR_52b/35 | 68/59 | SSU/2 |
| RNA13 | 5025-4996 | 4401-4373 | mtR_5844/63 | 48/30 | LSU/10 |
| RNA14 | 5546-5508 | 2039-1997 | mtR_11c/62 | 40/39 | SSU/1 |
| RNA15 | 624-594 | 1759-1790 | mtR_76b/163 | 40/31 | |
| RNA16 | 3-33 | | mtR_60/21 | ?/31 | |
| RNA17 | 126-165 | 2192-2154 | mtR_18a/132 | ?/40 | SSU/3 |
| RNA18 | 4964-4988 | 1518-1494 | mtR_104/23 | ?/25 | LSU/14 |
| RNA19 | 5576-5547 | | mtR_42/9 | ?/30 | SSU/7 |
| RNA20 | 34-71 | | mtR_86/15 | 54/38 | |
| RNA21 | 1807*-1829 | | mtR_26/26 | ?/23 | |
| RNA22 | 5378-5341 | | mtR_59/124 | ?/38 | |
| RNA23t | 171-203 | | mtR_21/182 | | |
| RNA24t | 262-224 | | mtR_39/36 | | |
| RNA25t | 283-262 | | mtR_40b/15 | | |
| RNA26t | 1764-1806 | | mtR_49/72 | | |
| RNA27t | 4138-4085 | | mtR_84/122 | | |

(a) Naming convention reflects history of identification [4]. RNA23t through RNA27t are tentative assignments. We have not detected them in RNA blotting experiments, but sequence conservation and comparative abundance of cDNAs [29] suggest they are mature RNAs or abundant processing products.

(b) 5′ ends are based on primer extension data, except for RNA1 (RNase protection), RNAs 9 and 14–22 (5′ RACE), and RNA21 (Raabe et al. [29] cDNAs and sequence conservation). 3′ ends were determined by 3′ RACE except for SSUE, LSUB, and LSUC (inferred from 5′ end location and transcript size). We have confirming RNase protection data for many of the 5′ and 3′ ends. *, coordinates inferred.

(c) cDNAs corresponding to mapped P. falciparum mt RNAs. Each designation is name/number of cDNAs.

(d) The observed size from denaturing polyacrylamide gels is noted for the RNA size, compared to the gene size (DNA) based on the location of 5′ and 3′ ends. ?, RNA identified by RACE, not blotting.

(e) Assignment of P. falciparum small mt RNAs to large and small subunit rRNAs; numbers indicate linear order relative to conventional rRNAs.

doi:10.1371/journal.pone.0038320.t001
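The relationship between the M76611 coordinates and the DNA sizes in Table 1 can be sketched in a few lines. The function name is hypothetical, and the strand label rests on an assumption that each coordinate pair is listed from 5′ end to 3′ end, so a descending pair indicates transcription right to left relative to the M76611 numbering. The inclusive span reproduces the listed DNA size for the rows used below (SSUA, LSUE, LSUF, LSUC); a few other rows, such as LSUB with its inferred coordinate, differ from the listed size by a few nt.

```python
def gene_span(coords: str) -> tuple[int, str]:
    """Return (size_in_nt, strand) for a Table 1 coordinate pair such as '2023-1916'.

    An asterisk marks an inferred coordinate in the table and is ignored here.
    Assumption: coordinates are written 5' end first.
    """
    start_text, end_text = coords.replace("*", "").split("-")
    start, end = int(start_text), int(end_text)
    size = abs(end - start) + 1  # both endpoints are part of the gene
    strand = "forward" if start <= end else "reverse"
    return size, strand


if __name__ == "__main__":
    for name, coords in [("SSUA", "2023-1916"), ("LSUE", "5771-5577"),
                         ("LSUF", "1516-1630"), ("LSUC", "225*-204")]:
        print(name, gene_span(coords))
```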

43 nt sequence block that is nearly identical among Plasmodium

species. A 13 nt block at the 5′ end of RNA10 is identical among

species. Between these sites, there is considerable variation,

including additions/deletions. Overall, the mt DNAs of Plasmodium

and related hemosporidians are very highly conserved.

Identification of Additional P. falciparum Small mt RNAs

The GTPase center of the ribosome plays a crucial role in

translocation so assignment of corresponding RNA fragments is a

necessity to postulate a functional mt ribosome. Our attempts to

locate such fragments by northern blot analysis were initially

based on the expectation from sequence analysis that these

RNAs would be ~100 nt long; these attempts were unsuccessful

[4]. One alternate possibility was that other small RNAs from the

mt DNA might encode the GTPase functions. However, none of

the five RNAs that were unassigned to specific regions of rRNA

at that time appeared similar to the GTPase center sequences

[4]. Completion of the mt genome sequence from several other

Plasmodium species provided another means to look for small

RNAs. When we aligned the mt DNAs of Plasmodium species, we

found 100% conservation in two short regions which were

similar to the GTPase center: a region 26 nt long for helix

H1057 and 16 nt long for helix H1087 (Figure S2A). We

therefore re-examined transcription for these very small, highly

conserved regions using oligonucleotide probes complementary to

the conserved sequences. The result was detection in P. falciparum

RNA of a ~30 nt RNA (LSUB) and a ~23 nt RNA (LSUC) for

the 26 and 16 nt conserved regions, respectively (Figure S2B).

We believe that, despite their very small size, these RNAs

provide the GTPase center of the Plasmodium mt ribosome.

The difficulties we encountered in detecting the GTPase center

RNAs suggested that other small RNAs might have been missed in

prior studies. The nucleotide sequences of regions known to

encode putative rRNA fragments are typically conserved among

Plasmodium species at 90% or greater (Table S3). In comparison,

while the amino acid sequences of the three protein genes are

conserved at >95% among these species, the nucleotide sequence

for the mRNAs is only ~85% conserved (Table S3). These data

provided a criterion for re-examination of P. falciparum mt

transcripts. We analyzed all apparently non-coding mt sequences >15 nt

long that were conserved at ~90% or better among

Plasmodium species. By RNA blotting and hybridization (Table S4),

we found additional transcripts: RNA11 (assigned to LSU rRNA),

RNA12 and RNA14 (both assigned to SSU rRNA), RNA15, and

RNA16 (Figure S3). Comparison of the P. falciparum and T. parva

mt DNAs suggested another possible LSU rRNA fragment, and

RNA blotting confirmed the presence of a small RNA, now

labeled RNA13 (Figure S3). We also evaluated small regions of

very high conservation by 5′ and 3′ RACE (Table S4), confirming

six additional RNAs under 40 nt long: RNA17 and RNA19 (both

assigned to SSU rRNA), RNA18 (assigned to LSU rRNA),

RNA20, RNA21, and RNA22 (Table 1).
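
The conservation-based screen described above can be sketched in a few lines. This is only an illustration under stated assumptions: conservation is taken to be the fraction of alignment columns that are identical in every species (the measure actually used for Table S3 is not specified here), and the sequences, coordinates, and function names are hypothetical.

```python
# Sketch only: "conservation" is assumed to mean the fraction of alignment
# columns that are identical (and ungapped) in every aligned sequence.

def column_is_invariant(column):
    """True if every sequence has the same nucleotide and no gap in this column."""
    return "-" not in column and len(set(column)) == 1


def region_conservation(aligned_seqs, start, end):
    """Fraction of invariant columns in alignment columns [start, end)."""
    columns = list(zip(*aligned_seqs))[start:end]
    return sum(column_is_invariant(col) for col in columns) / len(columns)


def candidate_regions(aligned_seqs, regions, min_length=16, min_conservation=0.90):
    """Keep non-coding regions longer than 15 nt that meet the conservation cutoff."""
    return [
        (start, end)
        for start, end in regions
        if end - start >= min_length
        and region_conservation(aligned_seqs, start, end) >= min_conservation
    ]


# Toy alignment (hypothetical sequences, not Plasmodium data).
reference = "ACGTTGCA" * 5            # 40 alignment columns
variant = "TTTT" + reference[4:]      # differs from the reference in columns 0-2
alignment = [reference, variant, reference]

# Hypothetical non-coding intervals, 0-based and end-exclusive.
noncoding = [(0, 20), (20, 40), (5, 15)]
hits = candidate_regions(alignment, noncoding)
print(hits)
```

In this toy case only the second interval passes: the first falls below the 90% cutoff and the third is too short. Regions passing such a filter would still need experimental confirmation by RNA blotting or RACE, as in the text.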

After completion of our transcript mapping but prior to

completion of our sequence analysis, Raabe et al. [29] published

an analysis of non-protein coding RNAs identified from size-

selected P. falciparum cDNA libraries. They reported >100 such

RNAs corresponding to the mt genome. We analyzed their data

with ours, comparing the location of transcript ends and, for

those for which RNA blots were shown, whether the observed

transcripts were consistent with the expected size of the small

RNAs and their potential precursors. They found cDNAs

corresponding to all the small RNAs we identified (Table 1),

though their cDNAs for LSUE (190 nt), LSUA (175 nt), LSUF

(125 nt), and SSUB (116 nt) are not full-length and cDNAs for

some of the others have minor variance in 5′ and 3′ ends. Many

of the remaining cDNAs lack 5′ and/or 3′ ends consistent with

our mapped transcript ends, some include parts of adjacent small

RNAs, and all were represented by just a few copies each (<10).

In contrast, cDNAs corresponding to the RNAs we mapped

typically were abundant (>10 to 1000 copies), except that some

of the smallest RNAs were poorly represented (LSUC, SSUE,

and RNA19 each have <10 cDNAs). The abundance of a

particular cDNA is not reliably diagnostic of the corresponding

RNA abundance. However, correlation of our mapped

transcripts to the more abundant cDNAs [29] is suggestive. We

believe that the less abundant cDNAs may be products of RNA

degradation or breakage during experimental manipulation.

With identification of these new small RNAs, only 4.2% of the

P. falciparum mt genome (251 of 5967 nt, spread across 15 sites)

lacked mature transcripts. The two largest sites are the 38 nt 3′

and 58 nt 5′ flanks of LSUC. These include a 26 nt block and a

14 nt block, respectively, in which the sequence is identical among

the Plasmodium species and the other hemosporidians. Although we

did not find evidence for mature RNAs derived from these

sequences, Raabe et al. [29] reported small RNAs from both

regions (mtR_21 on the 3′ flank, and mtR_39 and mtR_40b on

the 5′ flank). Similarly, they reported a cDNA (mtR_49)

corresponding to the 43 nt between RNA2 and RNA21; sequence

conservation for this area is low (81.8%) but the variance is largely

confined to the 5′ third of the region. A 28 nt region between cox3

and LSUF is well-conserved among the Plasmodium species

(94.1%). The 3′ 15 nt of the region are included in an apparent

long-lived precursor RNA for LSUF, as discussed below. The

remaining intergenic regions range from 3–21 nt and conservation

is typically 50–70% (Table S2). Lower conservation suggests a lack

of (conserved) function.

Five of the cDNAs [29] map to well-conserved sequences where

we did not find evidence for RNAs; they are represented by .10

cDNAs, and have transcripts of size consistent with the proposed

coding region and with processing from precursor transcripts. As

noted above, four of them flank mapped rRNAs and the fifth (mtR_84) is

complementary to part of the cob gene. Although we did not detect

these transcripts, their characteristics are consistent with them

being at least long-lived processing intermediates and some may be

mature mt RNAs. We have tentatively named them RNA23t

through RNA27t (Table 1).

With the possible addition of several RNAs identified by Raabe

et al. [29], all of the small RNAs that derive from the P. falciparum

mt genome have apparently been identified. The rRNA fragments

total 2037 nt, the seven unassigned RNAs sum to 328 nt, and

three mRNAs cover 3351 nt. Together, they account for 95.3% of

the 5967 nt P. falciparum mt genome. (Overlapping sequences are

only counted once.) However, it is difficult to be absolutely certain.

Including the Raabe RNAs would raise coverage to 97.6% of the

genome. Additionally, it is clear from the search for GTPase

center rRNAs that transcripts can be as short as 23 nt; these are

not easily detected. Further, the LSUB fragment overlaps

completely the 3′ end of the cob gene (but not the previously

mentioned Hepatocystis insertion) on the opposite strand of DNA

(Figure 1). This opens the possibility of alternate strand coding for

other small RNAs; Raabe et al. [29] report several of these but only

one (mtR_84) appears to be plausible. While we have tested all

regions of the P. falciparum mt genome for transcripts on both

strands, not all have been subjected to the more rigorous scrutiny

used to detect the GTPase center fragments. These caveats aside,

we have mapped 34 small RNAs encoded by the P. falciparum mt

genome, 27 of which have been assigned to specific regions of

rRNA (Table 1).
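
The coverage figures above depend on counting overlapping sequences only once. As a minimal sketch of that bookkeeping, the function below merges intervals before summing their lengths. The interval coordinates are hypothetical and do not come from Table 1; only the genome length and the nucleotide totals are taken from the text.

```python
def covered_length(intervals):
    """Positions covered by 1-based inclusive intervals, counting overlaps once."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end + 1:
            # Close the previous merged block before starting a new one.
            if current_end is not None:
                total += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start + 1
    return total


GENOME_LENGTH = 5967  # P. falciparum mt genome length, from the text

# Hypothetical gene coordinates; the first two overlap by 11 nt.
toy_genes = [(1, 100), (90, 120), (200, 250)]
toy_covered = covered_length(toy_genes)

# Totals quoted in the text: rRNA fragments, unassigned RNAs, mRNAs.
simple_sum = 2037 + 328 + 3351
uncovered_percent = 100 * 251 / GENOME_LENGTH

print(toy_covered, simple_sum, round(uncovered_percent, 1))
```

The plain sum of the three totals is 5716 nt, slightly more than 95.3% of 5967 nt, which is what one would expect if overlapping positions (on either strand) are counted only once.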

P. falciparum Mitochondrial rRNAs

PLoS ONE | www.plosone.org 5 June 2012 | Volume 7 | Issue 6 | e38320


Transcript Mapping

A total of three protein coding genes and 20 small RNAs have

been previously shown by RNA blotting to derive from the P.

falciparum mt genome [3,4,27,28,61]; to these we have now added

14 more small RNAs. The genes on the P. falciparum mt genome

are closely packed (Figure 1) and both DNA strands are

polycistronically transcribed, with individual transcripts presumed

to be cleavage products from the precursor [62]. To assist

evaluation of the predicted secondary structure and potential

interactions of the rRNA fragments, we mapped the 5′ and 3′ ends

of the small transcripts. Identifying the 3′ end is especially

important for the P. falciparum mt RNAs since most have non-

encoded oligo(A) tails up to 25 nt long [27] and therefore the RNA

size on blots often exceeds the size of the corresponding DNA

sequence. We have previously reported the 3′ ends, including

oligo(A) tails, of some of the small mt RNAs [4] and here identify

the 3′ ends and oligoadenylation status for most of the remaining

RNAs (Table S5).

The 3′ ends of the P. falciparum small RNAs were identified by 3′

RACE, augmented in some cases by RNase protection. The

oligo(A) tails are often too short for priming cDNA synthesis with

oligo(dT) so for 3′ RACE, we enzymatically added 3′ C tails to

total RNA. This was followed by cDNA synthesis primed with

oligo(dG), then PCR amplification with oligo(dG) and gene-

specific 5′ primers [28]. The PCR products obtained were cloned

and sequenced, revealing the presence of non-encoded As at the 3′

end of most of the small transcripts (Table S5). The length of the A

tail is variable but tends to be generally characteristic for each

transcript; that is, most cDNAs for specific small RNAs have short

A tails while those for others have predominantly long A tails

(Table S5). SSUB lacks added 3′ A residues [28] and, with the

possible exceptions of SSUE, LSUB, and LSUC (see below), is the

only small mt RNA that lacks an oligo(A) tail. The significance of

oligo(A) tail presence and size is unknown. Oligoadenylation of mt

rRNAs has been previously reported for mammals and insects

[63–65] but not examined in detail. The phenomenon may simply

reflect a promiscuous mt poly(A) polymerase activity.

We obtained 5′ end locations by primer extension, RNase

protection, or 5′ RACE; in many cases, more than one mode was

used. These data are straightforward with two exceptions. The

LSUF rRNA, which corresponds to part of the peptidyltransferase

domain of LSU rRNA, has two transcripts detected on RNA blots

[4] and two primer extension products (Figure S4); the size

difference in each case is 15 nt. The precise relationship between

the two transcripts is unknown. The larger transcript could

represent an abundant RNA processing intermediate or both sizes

of the RNA could be functional, possibly in different roles. (We are

unaware of any examples of the latter.) It is notable that the 15 nt

5′ extension of the larger transcript is less conserved among

Plasmodium species (~70%) than the remainder of the LSUF RNA

(98%), due to considerable variability in the first seven nt (Figure

S4). Sequence corresponding to the conserved section of LSUF has

been identified in T. parva [23], but there sequence corresponding

to the 15 nt extension is not conserved. On balance, these data

better support the hypothesis that the larger RNA is an abundant

precursor. (Probes for SSUF also detect two sizes of transcript on

RNA blots [4] but neither primer extension nor 3′ RACE supports

an extension such as that seen for LSUF. This may reflect that

different oligonucleotides used for the RNA blot, 3′ RACE, and

primer extension might have differential ability to hybridize to

both RNAs; that hybridization dynamics differ for blot

hybridization compared to solution hybridization; or that the RNA blot

probe cross-hybridizes to another small RNA.) The second

exception involves the 5′ end of RNA21, for which detection of

apparent precursor transcripts by 5′ RACE both enlightens and

complicates data interpretation, as discussed below.

Mapping for three of the small RNAs posed technical challenges

that we were not able to overcome. The 40 nt SSUE RNA

includes a near-universally conserved 16 nt sequence,

GUACACACCGCUCGUC. Except for the underlined U, this exactly

matches the corresponding sequences from the genome of the P.

falciparum apicoplast (a relict plastid that encodes its own rRNAs

[66]) and the nuclear encoded SSU rRNAs. The small size of the

transcript limits options for positioning primers and attempts at 3′

and 5′ RACE for SSUE using total RNA have failed, likely due to

cross-hybridization of the primers to these other rRNAs. Use of mt

RNA would theoretically alleviate this problem. However,

methods for isolation of P. falciparum mitochondria [67,68] provide

relatively crude fractionation. We have found RNA from such

organelle preparations to be significantly contaminated with non-

mt sequences, and were unable to obtain an SSUE 5′ RACE

product from the “mt” RNA. Primer extension produced multiple

products, some of which also may reflect cross-hybridization to

other P. falciparum rRNAs. Consequently, the 5′ and 3′ ends of

SSUE are predictions, based on the following observations:

• Radiolabeled probes complementary to SSUE detect a 40 nt transcript [4].

• SSUE size does not alter following digestion with RNase H in the presence of oligo(dT), suggesting that the transcript lacks a significant oligo(A) tail [28].

• 5′ RACE for RNA21 produced long products which appear to derive from partially processed precursor transcripts. Two clones have a 5′ end at 1631 and two at 1638. The LSUF 3′ end from 3′ RACE maps to 1630 [28] and the 41 nt conserved region starts at 1638 (Figure S5A).

• SSUE is encoded in a 64 nt region between LSUF and RNA2. A 41 nt portion of that region is 100% conserved among Plasmodium species and varies very little in the other hemosporidians. It differs from the other P. falciparum SSU rRNAs at just two sites in the 3′ half. The conserved region is flanked by seven slightly variable nt at the 5′ end and the 18 nt 3′ flank is poorly conserved among Plasmodium species.

• The 41 nt conserved sequence fits well into the expected secondary structure for this region of SSU rRNA, including maintenance of highly conserved sequence and capacity to form helices with portions of SSUB and SSUF, creating a reasonably conventional secondary structure (Figure 2). However, the final 18 nt of the 64 nt region would not be predicted to participate in a helix with other small mt RNAs.

Taking these points together, it is likely that SSUE is

encompassed within the 41 nt highly conserved sequence, with

its 5′ end at 1638 and the 3′ end at 1680, and few, if any, As added

at its 3′ end.

Mapping LSUB and LSUC presented a different challenge. For

RACE, at least one primer must be gene-specific and success is

enhanced if nested gene-specific primers can be used. However,

the LSUB and LSUC transcripts are only 30 and 23 nt,

respectively, so the gene-specific primers are severely limited.

That the primers must be small means in turn that temperatures in

the initial rounds of PCR must be comparatively low, with

accompanying loss of specificity. Despite multiple attempts, we

have been unable to obtain RACE products for these RNAs.

Given the relative size of the transcripts and their highly conserved

sequence, any 3′ oligo(A) residues would likely be too few in

number for successful RNase H analysis for A tails. Based on

conservation of sequence between the Plasmodium species, between

P. falciparum and T. parva, and the size of the transcripts, we have

predicted the likely 5′ and 3′ ends as 4618 and 4586 for LSUB and

225 and 204 for LSUC, respectively (Table 1).

Figure 2. SSU rRNA secondary structure. The P. falciparum mt SSU rRNA secondary structure model is superimposed onto the E. coli SSU rRNA secondary structure model diagram. Regions where P. falciparum has no structure equivalent to E. coli are shown using gray circles and lines. Colored nucleotides compare the P. falciparum sequence to the Three Phylogenetic Domains/Two Organelles (aka 3P2O) consensus sequence from the Gutell lab’s Comparative RNA Web (CRW) Site [URL for CRW Site: http://www.rna.ccbb.utexas.edu/. URL for 16S rRNA conservation diagram: http://www.rna.ccbb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.3.3DOM.pdf]. Upper case colored nucleotides are conserved in at least 98% of the sequences

Transcription and Processing

Mt genomes are often polycistronically transcribed, then

cleaved to release mature RNAs [69]. We have previously

reported evidence for extensive polycistronic transcription of the

P. falciparum mt genome, including RNase protection data

consistent with multigenic precursor transcripts and detection of

overlapping, near-genome length RT-PCR products [62]. During

RACE analysis of the P. falciparum mt small RNAs, we obtained

additional evidence in support of polycistronic transcription. As

noted above, 5′ RACE with a gene-specific primer for RNA21

generated products that assisted identification of the SSUE 5′ end.

Additional products mapped to nt 1754, 1764, 1765, and 1766 of

P. falciparum mt DNA (Figure S5A). The 3′ end of RNA21 maps to

nt 1829 (Table 1) but we were unable to detect RNA21 by blotting

so could not use transcript size to assist the analysis. Gaps in the mt

DNA alignment, combined with cDNA data [29], suggest there

are two transcripts between RNA2 and RNA3. One is RNA21

(mtR-26), a 23 nt transcript with only two variable sites, and the

other is RNA26t (mtR_49), a 43 nt RNA that is >90% conserved

at all but seven residues toward the 5′ end. RNA2, RNA26t,

RNA21, and RNA3 are immediately adjacent to each other in P.

falciparum but there are 1–2 nt gaps or insertions in many other

species, located at the predicted junctions. Ironically, the primer

chosen for RNA21 overlapped its 5′ end, but that end was

identified from analysis of apparent precursors. The 5′ RACE

products mapping to 1764–1766 likely derive from the cleavage

that produces the 3′ end of RNA2/5′ end of RNA26t, followed by

minor nibbling. (Position 1754 maps within RNA2 and we suspect

it reflects an artifact.)

RNA16 and RNA20 abut each other and are transcribed from

the same DNA strand (Figure 1). While performing 3′ RACE

using a gene-specific primer for RNA16, we obtained two sizes of

PCR product. One corresponded to RNA16, including its oligo(A)

tail, and the other included RNA16 with no A tail followed by

RNA20, which had an oligo(A) tail (Figure S5B). Similarly, when

performing 5′ RACE with a gene-specific primer for RNA20, we

identified not only the 5′ end of RNA20 but also obtained longer

products consisting of the expected RNA20 sequences plus all of

RNA16 (Figure S5B). These data imply that 3′ cleavage and

oligoadenylation of RNA20 can precede the corresponding step

for RNA16 and that the 5′ end of RNA16 can form prior to its 3′

end.

The accumulated mapping data have allowed us to identify

possible sites of cleavage in the putative precursor transcript. One

caveat exists for this analysis; if the 3′ end of an RNA maps to an

A residue, it is ambiguous whether that A derives from the DNA

sequence or from addition of the oligo(A) tail. Unless there is data

suggesting otherwise, we have assumed that 3′ end A residues with

corresponding DNA sequence are encoded rather than added by

poly(A) polymerase. Based on this criterion, many of the mtRNAs

directly abut each other. Indeed, in almost every instance in which

genomic As are found at the junction between genes, it appears

most likely that they are part of the transcript of the upstream

gene. We have identified one exception. At the junction of RNA14

and SSUF (Figure 1), 3′ RACE shows that RNA14 has an oligo(A)

tail (Table S5) and there are three A residues at the junction

between the two genes. Our default interpretation would assign

those As to RNA14. However, 5′ RACE shows that SSUF starts at

those A residues. Unless the genes overlap, which is not seen

elsewhere in this genome for genes on the same DNA strand (but

see below), all of the As at the 3′ end of RNA14 must be added

rather than encoded. Similarly, the junction of RNA1 and SSUB

lies in a run of four As. 5′ RACE for SSUB showed some

transcripts starting with two As and some with three. The number

of encoded As at the 3′ end of RNA1 is therefore uncertain, but

addition of the A tail renders this moot.
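
The default rule described above (3′ terminal A residues that match the DNA are treated as encoded, and only the remainder as added) can be expressed as a short sketch. The function name, toy genome, and coordinates are hypothetical, and the sketch assumes a sense-strand clone sequence from which the artificial C tail and primer sequence have already been trimmed.

```python
def split_oligo_a_tail(transcript, genome, start):
    """Split a 3' RACE clone into encoded sequence and non-encoded oligo(A).

    transcript: sense-strand clone sequence (DNA alphabet).
    genome:     genomic sequence of the same strand.
    start:      0-based genome position of the first transcript nucleotide.
    Returns (encoded_length, tail_length). Genomic As at the junction are
    assigned to the encoded part, mirroring the default assumption in the text.
    """
    encoded = 0
    while (
        encoded < len(transcript)
        and start + encoded < len(genome)
        and transcript[encoded] == genome[start + encoded]
    ):
        encoded += 1
    tail = transcript[encoded:]
    if set(tail) - {"A"}:
        raise ValueError("unmatched 3' sequence is not purely oligo(A)")
    return encoded, len(tail)


# Hypothetical example: the gene ends in two genomic As, followed by a T.
toy_genome = "GGCTTACGAATCCG"
toy_clone = "CTTACGAA" + "AAAA"   # encoded part plus four added As
result = split_oligo_a_tail(toy_clone, toy_genome, start=2)
print(result)
```

Because the match is extended greedily, this sketch cannot distinguish cases such as the RNA14/SSUF junction, where independent 5′ RACE data for the downstream gene show that the junction As must be added rather than encoded.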

Ribosomal RNAs are typically derived by cleavage from a

polycistronic precursor RNA, often via sequential steps. We have

previously demonstrated the existence of near-genome length

polycistronic transcripts of the P. falciparum mt genome [62],

suggesting a role for cleavage to produce individual P. falciparum mt

RNAs. The mapping data indicate that there are four sets of

abutted genes, with four to 12 members per set (Table S6). For

these, a single cleavage event would suffice to create the 5′ end of

one RNA and the 3′ end of its neighbor. Polycistronic

transcription of close-packed genes, followed by cleavage to

produce individual mRNAs, is a common feature of mt genomes.

In vertebrates, most mt genes are flanked by tRNAs, and

production of mature mRNAs and rRNAs goes hand-in-hand

with tRNA processing [69]. A similar mechanism would be

consistent with the accumulated P. falciparum data. However, the P.

falciparum mt genome does not encode tRNAs so another

mechanism must guide cleavages. The junctions between

transcripts do not share any obvious specific features with each

other or with the 5′ or 3′ end sequences immediately adjacent to

other P. falciparum mt RNAs except for a tendency to have an A as

the last encoded residue. Similarly, RNAs which do not directly

abut their neighbors also show no conserved characteristics except

the tendency to have a 3′ terminal A. Of the 31 small RNAs for

which 3′ RACE data is available, the genes for 19 end with A, 2

with C, 3 with G, and 7 with T (Table S5). The composition of the

mt genome is 32.39% A, 15.69% C, 15.90% G, and 36.01% T.

The prevalence of A residues at the 3′ end of transcripts therefore

strongly exceeds that expected by chance. (This point, of course,

depends on our interpretation of which A residues are encoded

and which are added.) Overall, the mechanism for recognition of

putative cleavage sites remains unknown.
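
One way to put a number on the statement that 3′ terminal A residues exceed chance expectation is a one-sided binomial calculation using the counts and base composition quoted above. This is a sketch of our own, not an analysis from the study, and it assumes that each gene's terminal nucleotide is an independent draw from the genome-wide base composition.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Probability of observing k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts of 3' terminal nucleotides and genome A fraction, from the text.
terminal_counts = {"A": 19, "C": 2, "G": 3, "T": 7}
genome_a_fraction = 0.3239

n_transcripts = sum(terminal_counts.values())          # 31 small RNAs
expected_a = n_transcripts * genome_a_fraction         # about 10 expected by chance
p_value = binomial_upper_tail(terminal_counts["A"], n_transcripts, genome_a_fraction)

print(f"expected A-terminal genes: {expected_a:.1f}")
print(f"P(X >= {terminal_counts['A']}) = {p_value:.2g}")
```

Under these assumptions roughly 10 of the 31 genes would be expected to end in A, compared with the 19 observed. As the text notes, the result depends on which A residues are interpreted as encoded.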

Assignment of New rRNA Fragments

Prior analyses proposed that 15 of the small P. falciparum mt

RNAs corresponded to specific regions of LSU or SSU rRNAs [4].

These assignments were based on conservation of sequence and/

or the potential to form an appropriate secondary structure while

maintaining some conserved sites. Assigning fragments which

might correspond to less conserved regions of rRNA structure is

more challenging, but can be assisted by cross-species comparisons

between related organisms. The gene content and organization is

identical in all Plasmodium species for which complete mt DNAs are

available. Indeed, the overall nt conservation is so high (Table S1;

Table S3) that comparisons between these are not particularly

useful in assigning the small RNAs to positions in the SSU and

LSU rRNAs. However, the mt genome of T. parva was useful for

comparison.

used. For the P. falciparum mt rRNA fragments: red nucleotides match the 3P2O consensus sequence, light blue nucleotides differ, and black nucleotides cannot be compared to the consensus (which is conserved at less than 90%). Base pair symbols are colored when both of the paired nucleotides have the same color. Each fragment is labeled at its 5′ and 3′ ends. Each helix present in P. falciparum is labeled with its helix number in green (e.g., H500), based on the CRW Site’s helix numbering system [URL: 16S rRNA, http://www.rna.ccbb.utexas.edu/CAR/1A/Structures/h.16.b.E.coli.hlxnum.pdf]. Inset: C. elegans mt SSU rRNA secondary structure model. doi:10.1371/journal.pone.0038320.g002

The T. parva mt genome is, at ~6.6 kb, quite small and encodes

the same three protein coding genes as the Plasmodium mt genomes,

but they are arranged in a different order on a linear molecule,

flanked with inverted repeats [23,70]. The genome also encodes

small RNAs which have similarity to rRNA sequences, some of

which have been previously described [23]. Thus, although the

genome organization is quite different from that of Plasmodium

species, gene content is very similar. We undertook a detailed

examination of the T. parva mt genome for sequences

corresponding to the small RNAs mapped to the P. falciparum mt genome. T.

parva matches were found for 24 of the 27 P. falciparum small mt

rRNAs and for 2 of the unassigned RNAs (Table 1). Four of the

five previously predicted T. parva LSU rRNAs [23] match well

with our P. falciparum mapping data and predictions. The fifth, T.

parva LSU5 RNA, has only a partial overlap with its P. falciparum

counterpart, RNA3, as discussed below. With data from two

genera, it was possible to identify compensatory changes in

sequence which assisted assignment of additional P. falciparum

RNAs to specific regions of rRNA. The T. parva analysis also

assisted the identification of corresponding sequences in other

apicomplexans (Table S7). The predicted secondary structures for

fragments comprising the P. falciparum SSU and LSU rRNAs are

shown in Figures 2, 3, 4; the corresponding predicted secondary

structures for T. parva are shown in Figures S6, S7, S8.

Conservation at each position is assessed by comparison to all

available rRNA sequences from three phylogenetic domains

(Archaea, Bacteria, nuclear-encoded Eucarya) and two organelles

(mitochondria and chloroplasts), called 3P2O (three phylogenetic

domains, 2 organelles).

The P. falciparum and T. parva mt rRNA secondary structure

models are very similar to one another and to the central core of

the Escherichia coli rRNAs with one notable exception. E. coli 23S

rRNA positions 1906–1934 are contained within one of the most

highly conserved regions of LSU rRNA [55]

(http://www.rna.ccbb.utexas.edu/) and, as expected, this region (near the 5′ end of

LSUE) is conserved in the P. falciparum sequence/structure.

Surprisingly, the same region of the T. parva homologue is much

shorter, showing poor conservation [23]. This cannot be explained

as a sequencing error because the corresponding mt DNA

sequences from Babesia and Theileria species (Table S1) are also

truncated in this region.

Additional evidence to support the placement of RNA

fragments to the SSU and LSU rRNAs comes from analysis of

covariation, evaluating whether the two positions that form a

base pair have similar patterns of variation to maintain this base

pair [71]. Three categories of covariation are used to gauge the

amount of support for the placement of the RNA fragments.

Category 1 base pairs contain only pure covariations, i.e., the

patterns of variation in the two columns in the alignment are

identical; all of the base pair types (e.g., A:U, G:C, U:A, C:G)

covary with the other base pair types. Category 2 base pairs have

covariation between the two paired nucleotides, but they also

have a low frequency of the two paired positions with different

patterns of variation (e.g., base pair types A:U, G:C, with G:U

less frequent, or A:U, G:C, with A:A less frequent). Category 3

contains those positions that are either invariant or the patterns

of variation at the two paired positions are not the same (e.g.,

G:C <-> G:U). We analyzed the base pair types present for the

mt rRNA fragments from 218 apicomplexans (species of

Plasmodium and other hemosporidians, the coccidian Eimeria,

and the piroplasms Babesia and Theileria). In total, 27 of the 34(39)

transcripts for hemosporidians, 20 predicted RNAs for the

coccidian, and 24 predicted RNAs for the piroplasms were

assigned to rRNA. The fragments that were compared are

indicated in Table S7; the category 1 and 2 base pair types are

given in Table S8 and shown in Figures S9, S10, S11.
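
The three covariation categories can be illustrated with a small classifier that takes the two alignment columns of a proposed base pair. This is a sketch under our own assumptions, not the procedure used to build Table S8: "low frequency" is modeled with a hypothetical 10% cutoff (`minor_fraction`), and the sketch does not check whether the pair types are canonical.

```python
from collections import Counter


def is_pure_covariation(pair_types):
    """True if every two distinct pair types differ at both positions."""
    pair_types = list(pair_types)
    return all(
        a[0] != b[0] and a[1] != b[1]
        for i, a in enumerate(pair_types)
        for b in pair_types[i + 1:]
    )


def covariation_category(column_5p, column_3p, minor_fraction=0.10):
    """Assign a proposed base pair to category 1, 2, or 3."""
    counts = Counter(zip(column_5p, column_3p))
    if len(counts) == 1:
        return 3  # invariant pair: no covariation evidence
    if is_pure_covariation(counts):
        return 1  # both positions always change together
    total = sum(counts.values())
    # Pair types above the (assumed) frequency cutoff are treated as major.
    major = [pair for pair, n in counts.items() if n / total > minor_fraction]
    if len(major) >= 2 and is_pure_covariation(major):
        return 2  # covariation with low-frequency exceptions
    return 3      # patterns of variation at the two positions differ


# Hypothetical alignment columns, one character per sequence.
print(covariation_category("AAGG", "UUCC"))                           # A:U and G:C
print(covariation_category("A" * 9 + "G" * 11, "U" * 9 + "C" * 10 + "U"))  # rare G:U
print(covariation_category("GGGG", "CCUU"))                           # G:C and G:U
```

The three calls correspond to the examples given in the text for categories 1, 2, and 3, respectively.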

RNA14 is the best candidate among the small RNAs for the 5′

end of the SSU rRNA, being the only fragment which can form an

appropriate helix (H27) with SSUA (Figure 2). The T. parva

sequence predicted to correspond to RNA14 has little sequence

similarity to it but has capacity to pair with the T. parva SSUA-

equivalent and can form a similar predicted structure. The base-

pairing potential for the T. parva sequence is better than that

proposed for P. falciparum, thus providing more confidence in our

assignment of RNA14 as the 5′ end of SSU rRNA. Of the 17 base

pairs in the RNA14 fragment, three are category 1 and four are

category 2 (Figure S9; Table S8). RNA8 was previously proposed

to correspond to the 5′ terminal domain of SSU rRNA (prior to

the identification of the RNA14 transcript) [4]; we now show that

RNA8 contains the highly conserved 790 loop and interacts with

both the 3′ end of SSUA and the 5′ end of SSUB (Figure 2). It has

five category 2 base pairs.

Assignment of RNA17 to SSU rRNA was based upon the facts

that RNA17 and the T. parva homologue both contain a sequence

that is very similar to E. coli SSU rRNA positions 282–294 and

that they contain H289 (Figure 2). RNA12 has the potential to

interact with RNA17, providing the remainder of H240 and a

truncated H122 (Figure 2). We have paired the central region of

RNA19 with RNA9 to form the lower half of H1047. RNA17 has

one category 2 covariation; RNA12 and RNA19 have no

covariation. The 3′ portion of RNA5 has the potential to basepair

with SSUD and together they may comprise H1303. The 5′

portion of RNA5 has the capacity to form an extended helix and

large loop; sequences in the proposed helical regions are well-

conserved among Plasmodium species but the 22 nt loop region

varies in both size and sequence (Table S3). Note that since we

have not yet found T. parva homologues for RNA5, RNA12, or

RNA19, assignment of these transcripts to SSU rRNA should be

considered tentative.

RNA2 and RNA11 correspond to portions of the 5′ half of LSU

rRNA (Figure 3). RNA2 approximately spans E. coli LSU positions

944 through 1010, encompassing helices H946, H976 and the 5′

strand of H991. RNA11 pairs with RNA2, providing the 3′ strand

of H991. It also forms helix H1164 and pairs with LSUA in H812.

RNA2 consists of two hairpins with loop sequences that are very

similar to their E. coli counterparts. The T. parva homologue was

detected because the single-stranded loop sequences are identical

to those of P. falciparum. Assignment of RNA11 was accomplished

by searching the mt genomes for the conserved sequence GGUA

between H1164 and H812, followed by a sequence that could pair

with the appropriate region of LSUA. Candidate sequences were

then manually screened for the presence of a hairpin immediately

upstream of the GGUA and for the potential to interact with

RNA2. Here again the base-pairing potential for these two RNAs

was better for the T. parva homologues than for P. falciparum,

providing confirmation of our structure predictions for the known

P. falciparum transcripts. RNA1 was previously assigned [4] but we

have revised its interactions. The 5′ end of RNA1 basepairs with

the 3′ end of RNA11, forming H1196. RNA1 also forms H579

with the 5′ end of LSUA, encompasses H1276, and basepairs with

RNA3. It was not possible to predict a structure for the 3′ end of

RNA1 or the 5′ end of RNA3. Of the 34 inter- and intramolecular base pairs

in RNA2, RNA11, and RNA1, six are category 1, and six are

category 2 (Figure S10; Table S8).
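
The first step of the RNA11 search (a conserved GGUA followed by sequence able to pair with LSUA) can be sketched as a simple scan. This is an illustration only: the sequences are hypothetical, the pairing partner is assumed to lie immediately downstream of the motif, only one strand is scanned, and the 80% pairing cutoff is our own choice. The manual screening steps described in the text (an upstream hairpin and the potential to interact with RNA2) are not modeled.

```python
# Watson-Crick pairs plus G:U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_fraction(strand, partner):
    """Fraction of positions that pair when two 5'->3' strands align antiparallel."""
    matches = sum((x, y) in PAIRS for x, y in zip(strand, reversed(partner)))
    return matches / len(partner)


def find_candidates(genome_rna, partner, motif="GGUA", min_paired=0.8):
    """Return 0-based motif positions whose downstream sequence can pair with partner."""
    hits = []
    start = genome_rna.find(motif)
    while start != -1:
        begin = start + len(motif)
        downstream = genome_rna[begin:begin + len(partner)]
        if len(downstream) == len(partner) and paired_fraction(downstream, partner) >= min_paired:
            hits.append(start)
        start = genome_rna.find(motif, start + 1)
    return hits


# Hypothetical sequences (not from any mt genome).
lsua_region = "GCAUCG"
toy_genome = "AAAA" + "GGUA" + "CGAUGC" + "UUUU" + "GGUA" + "AAAAAA"
candidates = find_candidates(toy_genome, lsua_region)
print(candidates)
```

In the toy genome both GGUA occurrences are found, but only the first is followed by sequence complementary to the hypothetical LSUA region, so only that position is reported.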


Figure 3. LSU rRNA secondary structure (5′ half). The P. falciparum mt LSU rRNA (5′ half) secondary structure model is superimposed onto the E. coli LSU rRNA (5′ half) secondary structure model diagram. Regions where P. falciparum has no structure equivalent to E. coli are shown using gray


Part of the 5′ half of RNA3 forms a helix with the 3′ half of

RNA1 to form H1295 (Figure 3). RNA3 also bridges the 5′ half to

the 3′ half of the LSU rRNA (Figures 3, 4). Part of the 3′ half of

RNA3 forms the 5′ strand of the H1648 helix, while the middle

section of LSUE forms the 3′ half of this helix (Figure 4). RNA3 is

highly conserved among the Plasmodium species, varying in only

3–5 sites, and the helical region is 100% conserved for both RNA3

and LSUE. The corresponding T. parva sequence (Table 1)

partially overlaps the previously predicted LSU5 rRNA of Kairo

et al. [23] and provides compensating base change evidence in

support of the proposed interactions. While no covariations exist in

the seven basepairs in H1295 (RNA3/RNA1), two category 1 and

three category 2 basepairs occur in the 15 basepairs in H1648

(Figures S10, S11; Table S8).

RNA13 is also assigned to the 3′ half of LSU rRNA and

corresponds to the previously proposed LSU2 RNA in T. parva

[23]. This is a short but highly conserved sequence with only one

difference among the Plasmodium species, at the 3′ end. The

fragment corresponds to E. coli sequences from ~2236–2264. The

5′ end forms a shortened helix H2077 with the 3′ end of LSUE

and also encompasses helix H2246 (Figure 4). The sequence of

H2246 is identical among the Plasmodium species, including

departure from the consensus sequence at three of six highly

conserved positions in and adjacent to the loop. One category 2

base pair occurs in the H2077 helix, and no covariations exist in

H2246 (Figure S11; Table S8).

Pairwise comparisons between the P. falciparum and T. parva mt genomes and the E. coli rrnB operon also identified another short conserved sequence (positions 4964–4981 in P. falciparum, 1518–1501 in T. parva) that resembles E. coli 23S rRNA positions 2687–2710, comprising a portion of H2675 (Figure 4). This 27 nt region is identical in all 200 Plasmodium species except for one position each for P. juxtanucleare and P. fragile. RNA18 has been mapped to this region by 5′ and 3′ RACE. No covariations occur in the portion of RNA18 that forms a five base pair helix (Figure S11; Table S8).
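The descending T. parva coordinates quoted above (1518–1501) read most naturally as a feature on the opposite strand. The following is a minimal Python sketch of that reading, assuming 1-based inclusive coordinates and assuming that start > end denotes the reverse strand; both conventions, the helper names, and the toy sequence are illustrative assumptions rather than anything specified in this paper.

```python
# Illustrative only: helper names and the toy genome are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def span_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval, in either orientation."""
    return abs(end - start) + 1

def extract(genome: str, start: int, end: int) -> str:
    """Return the 1-based inclusive region of `genome`.

    If start > end, the reverse complement is returned, on the assumption
    that descending coordinates denote the opposite strand.
    """
    if start <= end:
        return genome[start - 1:end]
    return genome[end - 1:start].translate(COMPLEMENT)[::-1]

toy = "AACCGTTTAC"
print(extract(toy, 3, 6))  # forward-strand read of positions 3-6
print(extract(toy, 6, 3))  # same positions read from the opposite strand

# The two intervals quoted in the text cover the same number of positions.
print(span_length(4964, 4981), span_length(1518, 1501))
```

Under this convention the P. falciparum and T. parva intervals have the same length, which is what one would expect for homologous segments encoded on different strands.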

RNA6 is virtually identical in Plasmodium species (Table S3). The 5′ portion of RNA6 is assigned to part of the 3′ strand of H2675, basepairing with RNA18 and RNA10 (Figure 4). Of the 20 basepairs in RNA6 that interact with RNA18, RNA10, and itself, three are category 2 covariations (Figure S11; Table S8).

We were unable to assign all of the RNA transcripts to specific

regions of the SSU and LSU rRNAs. The 34 transcripts identified

here in P. falciparum (and previous analysis by Feagin and

colleagues [3,4,27,28,62]), plus the five additional transcripts

determined by Raabe et al. [29] are listed in Table 1. In total, for

P. falciparum, 27 of the 34(39) transcripts have been assigned to

specific regions of the rRNAs. The other hemosporidians have

such well-conserved mt genomes relative to P. falciparum that we

predict they will have corresponding RNAs. Twenty-six RNAs

have been predicted for T. parva (Table 1) and similar RNAs are

predicted to exist for the other piroplasms. Similarly, 20 regions

corresponding to RNA fragments have been identified in the mt

genome of the coccidian, Eimeria (Table S7).

Composition of the P. falciparum mt SSU rRNA

The rRNA secondary structures are typically compared to those of E. coli, as those are the most studied examples. Significant portions of the primary, secondary, and three-dimensional structural elements of the SSU and LSU rRNAs are conserved in all known rRNAs. The conservation secondary structure diagrams available at the Gutell lab’s Comparative RNA Web (CRW) Site [http://www.rna.ccbb.utexas.edu/SAE/2B/ConsStruc/] reveal that the mt rRNAs are, as a group, much more variable than the rRNAs within the Archaea, Bacteria, nuclear-encoded Eucarya, and chloroplasts. The mt SSU and LSU rRNAs usually have fewer nucleotides than bacterial rRNAs (Table 2). The mt rRNAs of C. elegans are contiguous and highly minimized [36,55], with its SSU and LSU rRNAs being just slightly smaller than the aggregate size of the P. falciparum mt rRNAs. The correspondence of the P. falciparum and C. elegans mt rRNAs to each other, and to E. coli rRNAs, is indicated in Table S9.

Twelve RNA fragments have similarity to regions of SSU rRNA (Table 1, Figure 2). The assignments differ slightly from what we previously proposed [4], reflecting the refinement possible with mapped as opposed to predicted RNA ends, as well as addition of new transcripts identified in this study. RNA8 better matches the 790 loop than did RNA4, which is now unassigned, and RNA5 rather than RNA9 is assigned to the 5′ strand of H1303. Together, the twelve fragments assigned to SSU rRNA total 804 nt.

The P. falciparum mt rRNAs and the C. elegans mt rRNAs are truncated in all of the major structural domains in the SSU and LSU rRNAs. P. falciparum SSU rRNA contains helices (H27, H39), the 530 loop region (helices H500, H505, H511), and H122, H240, and H289 in domain I (Figure 2; Table S9), though much of domain I is missing. The C. elegans mt SSU rRNA also lacks a large portion of domain I. It shares with P. falciparum the well-conserved helices in H27, H39, H500, H505 and H511. It lacks H122, H240, and H289 but, unlike P. falciparum, the C. elegans mt SSU rRNA has helices H9 and H17. The apex of the 530 loop forms a pseudoknot helix with a few nucleotides in the bulge loop between helices H500 and H511 [72,73], but the pseudoknot in both P. falciparum and C. elegans lacks strong basepairing potential. This is common among mt rRNAs, however [55]. The 530 loop is one of the most conserved regions of rRNA in both sequence and higher order structure. It is well conserved in the mt rRNA of Plasmodium and closely related species of other hemosporidians, coccidians, and piroplasms.

P. falciparum and C. elegans SSU rRNAs both include helices

H567, H577, H769, and H885 in domain II of the SSU rRNA

(Figure 2; Table S9). While P. falciparum has H821, C. elegans does

not, and C. elegans has part of H673 and H722 while P. falciparum

apparently does not have these helices. The 790 loop, the cap of

the helix H769, is highly conserved in all known rRNAs, including

P. falciparum.

Domain III is the most well-represented part of the P. falciparum mt SSU rRNA, and there is good correspondence between P. falciparum and C. elegans (Figure 2; Table S9). This domain contains helices H921, H939, H944, H960, H984, H1047 (only part of it is present in P. falciparum), H1303 (with a putative extra helix in P. falciparum), H1350, H1399 (truncated in both P. falciparum and C. elegans), and H1506.

circles and lines. Colored nucleotides compare the P. falciparum sequence to the Three Phylogenetic Domains/Two Organelles (aka 3P2O) consensus sequence from the Gutell lab’s Comparative RNA Web (CRW) Site [URL for CRW Site: http://www.rna.ccbb.utexas.edu/. URL for 23S rRNA (5′ half) conservation diagram: http://www.rna.ccbb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.23.3.3DOM.5.pdf]. Upper case colored nucleotides are conserved in at least 98% of the sequences used. For the P. falciparum mt rRNA fragments: red nucleotides match the 3P2O consensus sequence, light blue nucleotides differ, and black nucleotides cannot be compared to the consensus (which is conserved at less than 90%). Base pair symbols are colored when both of the paired nucleotides have the same color. Each fragment is labeled at its 5′ and 3′ ends. Each helix present in P. falciparum is labeled with its helix number in green (e.g., H500), based on the CRW Site’s helix numbering system [URL: 23S rRNA 5′ half, http://www.rna.ccbb.utexas.edu/CAR/1A/Structures/h.235.b.E.coli.hlxnum.pdf]. Inset: C. elegans mt LSU rRNA (5′ half) secondary structure model. doi:10.1371/journal.pone.0038320.g003

Figure 4. LSU rRNA secondary structure (3′ half). The P. falciparum mt LSU rRNA (3′ half) secondary structure model is superimposed onto the E. coli LSU rRNA (3′ half) secondary structure model diagram. Regions where P. falciparum has no structure equivalent to E. coli are shown using gray circles and lines. Colored nucleotides compare the P. falciparum sequence to the Three Phylogenetic Domains/Two Organelles (aka 3P2O) consensus sequence from the Gutell lab’s Comparative RNA Web (CRW) Site [URL for CRW Site: http://www.rna.ccbb.utexas.edu/. URL for 23S rRNA (3′ half) conservation diagram: http://www.rna.ccbb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.23.3.3DOM.3.pdf]. Upper case colored nucleotides are conserved in at least 98% of the sequences used. For the P. falciparum mt rRNA fragments: red nucleotides match the 3P2O consensus sequence, light blue nucleotides differ, and black nucleotides cannot be compared to the consensus (which is conserved at less than 90%). Base pair symbols are colored when both of the paired nucleotides have the same color. Each fragment is labeled at its 5′ and 3′ ends. Each helix present in P. falciparum is labeled with its helix number in green (e.g., H500), based on the CRW Site’s helix numbering system [URL: 23S rRNA 3′ half, http://www.rna.ccbb.utexas.edu/CAR/1A/Structures/h.233.b.E.coli.hlxnum.pdf]. Inset: C. elegans mt LSU rRNA (3′ half) secondary structure model. doi:10.1371/journal.pone.0038320.g004

Despite roughly similar sizes, the C. elegans mt SSU rRNA

includes portions of domains I and II that are not represented in

the P. falciparum mt SSU rRNA model. Two of these helices, H9

and H17, are present in all known SSU rRNAs. This suggests that

some of the unassigned small RNAs may correspond to those

regions. However, while the Plasmodium RNA sequences are highly

conserved relative to each other, there is insufficient conservation

to the specific regions of the SSU rRNA to assign the remaining P.

falciparum RNAs with confidence.

Composition of the P. falciparum mt LSU rRNA

Seven LSU rRNA fragments, LSUA–LSUG, were originally predicted based on sequence similarities to other rRNAs [3]. Later, two additional LSU rRNA fragments were identified, RNA1 and RNA10 [4]. In this paper, we report identification of transcripts for LSUB and LSUC, and assignment of RNA2, RNA3, RNA6, RNA11, RNA13, and RNA18 to LSU rRNA. All of these fragments have corresponding sequences in the T. parva mt genome (Table 1). The P. falciparum mt LSU rRNA currently has 15 small RNAs assigned, ranging from 23 to 190 nt long. Together, they total 1233 nt.

The 5′ half of the LSU rRNA secondary structure is significantly reduced in both P. falciparum and C. elegans relative to E. coli (Figure 3; Table S9). Domain I is missing from both. Helix H563 is between domains I and II in the LSU rRNA and is present in C. elegans. Domain III is truncated and unmodelled in C. elegans while P. falciparum has the domain III helix H1295, and H1276, a helix between domains II and III.

LSUA, LSUB, LSUC, RNA2, and RNA11 are all assigned to domain II of LSU rRNA, as is part of RNA1. As noted above, LSUB and LSUC are assigned to the GTPase center, forming helices H1057, H1087, and H1082 respectively. While nearly all of the sequence deletions relative to the E. coli rRNAs occur at the peripheral ends of an existing structural domain, two deletions, both in the LSU rRNA, occur internally. One example of this type of deletion occurs in the helix that links the H991 helix to the GTPase center (Figure 3). Helices H1011 and H1030 are significantly truncated in animal mt genomes [55]. For the C. elegans mt LSU rRNA, a few base pairs are predicted for H1030 at the top of the stem, with unpaired nucleotides bridging the gap between the GTPase center and the multi-stem loop below H991. In P. falciparum, RNA2 and RNA11 basepair to form the basal helix in the stem (H991) but there is no interaction (i.e., noticeable basepairing potential) between these RNAs and the GTPase center RNAs, LSUB and LSUC. Indeed, while each of the latter can form an intramolecular helix and can potentially interact with each other by forming an extended helix H1082 (not present in C. elegans), they are so small that there is no obvious capacity for them to interact with any of the other small RNAs. We hypothesize that the GTPase center RNAs are maintained in proper position by interactions with ribosomal proteins. RNA2 is also assigned to H946 and H976, neither of which is included in the C. elegans structure. RNA11 is also assigned to H1164 and provides the 3′ half of the H812 helix while the 5′ half of H812 is derived from LSUA. H1196 is shortened in P. falciparum. In contrast, C. elegans lacks H1164 and H1196 is also truncated. RNA1 is assigned at the junction of domains II and III, having a strongly conserved AGUA sandwiched between dinucleotides that conserve long-distance basepairing to domain IV.

The 3′ half of the LSU rRNA secondary structure is the most fully represented region, compared to E. coli, for both C. elegans and P. falciparum (Figure 4; Table S9). Parts of Domain IV interact with tRNA and the 30S ribosomal subunit. LSUD and LSUE provide most of domain IV, with RNA3 assigned to one side of helix H1648, the other side being provided by LSUE. There is no C. elegans sequence corresponding to RNA3. A set of adjacent helices (H1682, H1707, and H1752) are missing from both species and helix H1835 is reduced in C. elegans and P. falciparum to a third of its size in E. coli. Helix H1792 is similar in size in E. coli and P. falciparum but reduced by half in C. elegans. Other features of domain IV are maintained in structure and sequence conservation. LSUD and LSUE are separate RNAs in P. falciparum despite the fact that they directly abut each other in the P. falciparum mt genome. These two sequences have been found on the same transcript in T. parva (RNA1 of Kairo et al. [23]), but it is possible that this represents a precursor of LSUD and LSUE. Such a precursor RNA containing LSUD and LSUE is detected in P. falciparum by RNase protection (Mericle and Feagin, unpublished results) and by RT-PCR [62].

Structural features of domain V are also strongly conserved

between P. falciparum and C. elegans, which show similar variance

from E. coli (Figure 4). Both are missing the L1 binding domain

(H2112, H2113, H2117, H2120, and H2127) and the underlying

H2093 and H2200 helices. The 5S rRNA binding domain (helices

H2283–H2372) is deleted in both C. elegans and P. falciparum. The

bulk of domain V, including the peptidyltransferase center, is

intact and well-conserved, comprised of LSUE, LSUF, and

LSUG.

The most conserved feature of domain VI is the sarcin-ricin loop at the tip of the H2646 helix. This hairpin loop is conserved across all 3P2O LSU rRNA genes, including the P. falciparum mt RNA10 fragment (Figure 4). The initial 13 nt of RNA10 are identical among Plasmodium species and hemosporidians but have not been assigned to a structure. The next 38 nt are variable across Plasmodium species and include alignment gaps but within the more closely related subgroups, conservation is strong (Figure S1). Regardless of the variation, this sequence can form a long helix (Figure 4; 17 basepairs with an AAUU tetraloop in P. falciparum). There is no corresponding structure in C. elegans or E. coli but one has been proposed in the mt rRNA of the dinoflagellate Karlodinium micrum [74]. If a similar hairpin exists in T. parva RNA10, it is much shorter and significantly less stable. The 3′ end of RNA10 and the 5′ end of RNA18 comprise the 5′ strand of H2675. The 3′ end of RNA18 continues to begin the 3′ strand of H2675, and the rest of this strand is present in the 5′ end of RNA6 which continues to form H2735. Domain VI is also truncated in C. elegans, with strong correspondence in the regions that are present and absent in P. falciparum.

Table 2. Relative sizes of mt rRNAs.

Organism                          SSU (a)  LSU (a)  Total  Accession
Caenorhabditis elegans mt             697      953   1650  NC_001328.1
Leishmania tarentolae mt              610     1173   1783  NC_000894.1
Plasmodium falciparum mt (b)          804     1233   2037  M76611.1
Homo sapiens mt                       954     1559   2513  NC_012920.1
Polytomella mt (b)                    979     1556   2535  NC_010357.1
Chlamydomonas reinhardtii mt (b)     1200     2419   3619  NC_001638.1
Reclinomonas americana mt            1595     2751   4346  NC_001823.1
Escherichia coli                     1541     2904   4445  NC_000913.2

(a) Sizes are cited in nt. (b) rRNAs are fragmented.
doi:10.1371/journal.pone.0038320.t002
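As a quick consistency check on Table 2, the short Python sketch below recomputes each total from the SSU and LSU sizes and expresses it relative to E. coli. The sizes are transcribed from the table; the variable names and the choice of E. coli as the reference are illustrative assumptions.

```python
# SSU and LSU sizes (nt) transcribed from Table 2; names are illustrative.
sizes = {
    "Caenorhabditis elegans mt": (697, 953),
    "Leishmania tarentolae mt": (610, 1173),
    "Plasmodium falciparum mt": (804, 1233),
    "Homo sapiens mt": (954, 1559),
    "Polytomella mt": (979, 1556),
    "Chlamydomonas reinhardtii mt": (1200, 2419),
    "Reclinomonas americana mt": (1595, 2751),
    "Escherichia coli": (1541, 2904),
}

ecoli_total = sum(sizes["Escherichia coli"])

for organism, (ssu, lsu) in sizes.items():
    total = ssu + lsu
    share = 100 * total / ecoli_total
    print(f"{organism:30s} {total:5d} nt  {share:5.1f}% of E. coli")
```

The recomputed totals agree with the Total column for every row.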

As with the SSU rRNA, it is tempting to speculate that the

regions in C. elegans mt LSU rRNA that are missing from the

assigned RNAs for the P. falciparum mt LSU rRNA are among the

unassigned RNAs.

As noted above, our confidence in the placement of the RNA fragments onto the SSU and LSU rRNA secondary structures is augmented by covariations between the two nucleotides that form basepairs. Covariation analysis facilitates the identification of the base pairs in an RNA secondary structure that are similar among the sequences in the alignment dataset. When the alignments juxtapose similar structural units in each column, and they are properly interpreted with covariation analysis, the predicted structure is nearly 100% accurate [75]. Two different P. falciparum datasets were analyzed for positional covariations. The first compares the RNAs with the conservation in the 3P2O dataset. Nucleotides that are conserved in 90–98% of the 3P2O dataset are shown in Figures 2, 3, 4 as lower case red or blue letters, and those conserved in 98–100% of the same dataset are shown as upper case red or blue letters. Red indicates that the P. falciparum sequence is the same as the conserved 3P2O sequence, while blue indicates that it is different. Seven base pairs in the SSU and LSU rRNA secondary structure diagrams have both nucleotides colored blue, indicating that the P. falciparum sequence has covariation at some of the most conserved nucleotides in the rRNA (Figures 2, 3, 4).
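The display convention described above can be summarized as a small decision rule. The Python sketch below is only an illustration: the function names are hypothetical, the treatment of a value of exactly 98% as upper case is an assumption (the text gives overlapping ranges), and positions conserved at less than 90% are returned as black following the figure legends.

```python
from typing import Tuple

def display_style(nt: str, consensus_nt: str, conservation: float) -> Tuple[str, str]:
    """Return (letter, color) for one nucleotide.

    `conservation` is the percent conservation of the position in the
    3P2O dataset. Boundary handling at exactly 98% is an assumption.
    """
    if conservation < 90.0:
        return nt, "black"  # not comparable to the consensus
    color = "red" if nt.upper() == consensus_nt.upper() else "blue"
    letter = nt.upper() if conservation >= 98.0 else nt.lower()
    return letter, color

def both_blue(style_a: Tuple[str, str], style_b: Tuple[str, str]) -> bool:
    """True when both members of a base pair differ from a conserved consensus."""
    return style_a[1] == "blue" and style_b[1] == "blue"

five_prime = display_style("a", "G", 99.0)
three_prime = display_style("u", "C", 99.0)
print(five_prime, three_prime, both_blue(five_prime, three_prime))
```

A base pair for which `both_blue` is true corresponds to the situation the text describes for seven base pairs: both partners differ from highly conserved consensus nucleotides.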

The second dataset contains the predicted mt rRNAs of 218 species of Plasmodium and related apicomplexans. These covariations are shown on the SSU and LSU secondary structure diagrams (Figures S9, S10, S11; Table S8). The numbers of base pairs that have a covariation between the two paired nucleotides are: SSU rRNA, 10 category 1 and 21 category 2; LSU rRNA, 42 category 1 and 39 category 2; total rRNA, 52 category 1 and 60 category 2. While several helices do not have any covariations, the majority of the helices contain at least one. Thus, we are very confident that the majority, if not all, of the P. falciparum fragments have been accurately mapped to the rRNA secondary structure. An analysis of a more diverse collection of apicomplexan mt genomes and their RNA transcripts would further evaluate the accuracy of the fragment mapping in this manuscript, and likely facilitate the placement of a few more fragments.
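The idea behind a positional covariation can be illustrated with a minimal Python sketch. It is not the authors' procedure and does not reproduce the category 1 and category 2 distinction, which is not defined in this section; it simply flags a pair of alignment columns when at least two canonical pair types that differ at both positions are observed. The function names and the toy alignment are hypothetical. The last lines re-add the covariation counts reported above.

```python
from collections import Counter

# Watson-Crick pairs plus G:U wobble pairs.
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}

def pair_types(alignment, i, j):
    """Count the nucleotide pairs seen at columns i and j (0-based), skipping gaps."""
    return Counter(
        seq[i] + seq[j] for seq in alignment if seq[i] != "-" and seq[j] != "-"
    )

def covaries(alignment, i, j):
    """True if two canonical pairs differing at BOTH positions are observed."""
    observed = [p for p in pair_types(alignment, i, j) if p in CANONICAL]
    return any(a[0] != b[0] and a[1] != b[1] for a in observed for b in observed)

toy_alignment = [
    "GGAAACC",
    "GAAAAUC",
    "GGAAACC",
]
print(covaries(toy_alignment, 1, 5))  # G:C in two sequences, A:U in one
print(covaries(toy_alignment, 0, 6))  # invariant G:C pair

# Covariation counts reported in the text: (category 1, category 2).
counts = {"SSU": (10, 21), "LSU": (42, 39)}
totals = tuple(map(sum, zip(*counts.values())))
print(totals)
```

The summed counts match the totals given in the text (52 category 1 and 60 category 2).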

Predicted Structure of the P. falciparum mt Ribosome

In recent years, the crystal structures of large and small ribosomal subunits and of whole ribosomes have been obtained from several sources, greatly refining our understanding of ribosome function during protein synthesis. Of particular interest to the work reported here is structural data from the C. elegans mt ribosome [36,55]. Like P. falciparum, C. elegans has highly minimized rRNAs and, as discussed above, the P. falciparum mt rRNA fragments correspond closely to the C. elegans mt rRNAs. Mt ribosomes have a larger protein to RNA ratio but the polypeptide exit tunnel is still lined entirely with RNA [34]. While nothing is specifically known about the protein content of P. falciparum mt ribosomes, all of the RNA sequences that comprise the lining of the C. elegans mt ribosome polypeptide exit tunnel are represented in the RNAs that have been mapped to the 3′ portion of P. falciparum mt LSU rRNA.

The high-resolution three-dimensional crystal structures of the

30S [31] and 50S [30] ribosomal subunits revealed that the

comparative secondary structure models for the SSU and LSU

rRNA were 97–98% accurate [75]. These crystal structures also

revealed the detailed three-dimensional folding of the secondary

structure. We wanted to determine if the fragments of P. falciparum

rRNA are close together in three-dimensional space and if they

were at the known functional sites of the ribosome in protein

synthesis. All of the fragments that have been assigned to the SSU

and LSU rRNA secondary structure (Figures 2, 3, 4) were mapped

onto the three-dimensional structure of the ribosome (Figure 5A–

E; full-page versions in Figures S12, S13, S14, S15, S16). These

figures reveal that indeed nearly all of the P. falciparum rRNA

fragments map to the regions of the ribosome known to be directly

involved in protein synthesis.

Discussion

The P. falciparum mt genome encodes 34 small RNAs, 27 of

which have been assigned as parts of fragmented LSU and SSU

rRNAs. Compared to other fragmented rRNAs, the number of

separate RNAs is much larger while the fragments are far smaller,

some containing only 23 nt. Are these tiny RNAs in fact part of

functional ribosomes? Considerable data now supports this

hypothesis.

Some of the Plasmodium mt rRNA fragments were first proposed because of sequence similarities to rRNA regions [3,20,76]. A key issue became whether the predicted rRNAs corresponded to real transcripts. For a few, the answer was no [4], but the small RNAs discussed here are, by definition, transcribed. The mt genome is polycistronically transcribed [62]; larger presumed precursor RNAs can be detected by RNase protection [62], RT-PCR [62], and in 3′ and 5′ RACE analysis (this work). The small RNAs persist in the cell while non-coding mt sequences appear to be rapidly degraded, based on our inability to detect them other than by RT-PCR of presumed precursor transcripts.

The small RNAs are highly conserved across species and

between genera. Comparison of the mt genomes of 26 Plasmodium

species shows strong conservation of nucleotide sequence for the

small RNAs, to the degree seen for amino acid sequences of the

three mt protein coding genes (Table S3). Mt genome size

differences between the Plasmodium species predominantly reflect

addition or deletion of sequences in non-coding regions or at

junctions between RNAs. Sequences corresponding to many of the

P. falciparum rRNA fragments can be found in mt genomes of other

apicomplexans (Table 1; Table S3); some of them have been

confirmed as transcripts for T. parva [23]. Analysis of dinoflagellate

mt sequences has provided evidence that they also have highly

fragmented mt rRNAs [74,77,78]. It has so far been impossible to

strictly define the genetic content of the dinoflagellate mt genome.

It typically exists as multiple genome molecules carrying differing

combinations of the mt genome, including different pseudogenes

for the protein coding genes cob, cox1 and cox3. These often show

terminal deletions as if the individual DNAs are subject to a high

rate of recombination. The complex nature of dinoflagellate mt

genomes means that none can be considered complete. However,

sequences corresponding to rRNAs have been reported for several

species. Those described to date correspond to some of the best-

conserved P. falciparum mt rRNA fragments, not only in the general

identity of the region but also the size and fragmentation pattern.

Dinoflagellates are close relatives of apicomplexans, so these data

suggest that the alveolate that is ancestral to both groups also

possessed fragmented mt rRNAs.

Seven of the small RNAs have not been assigned to regions of rRNA. Their characteristics are otherwise similar to the rRNA fragments, however. They are highly conserved among Plasmodium species, are in a similar size range, and have non-encoded oligo(A) tails. We hypothesize that at least some, perhaps all, of them correspond to regions that are less well-conserved among rRNAs, especially regions present in the C. elegans mt rRNA but missing from P. falciparum rRNAs. The greatest discrepancies between P. falciparum and C. elegans lie in SSU rRNA, suggesting that a number of the unassigned RNAs have a role there. However, alternative scenarios cannot be ruled out, including possible non-rRNA functions. In that regard, it is important to note that none of the small RNAs has a size or potential secondary structure suggestive of tRNAs.

Mt rRNAs are often small compared to bacterial rRNAs but

contain key regions important to function. When arrayed in

positions corresponding to the secondary structure of canonical

rRNAs, the P. falciparum mt rRNA fragments correspond quite well

to regions present in other small mt rRNAs. In particular,

correspondence to the very small C. elegans mt rRNAs is quite high.

All of the rRNA fragments are predicted to basepair with at least

one additional rRNA, creating interactions that should help keep

the individual RNAs properly located and oriented. Mt ribosomal

proteins likely play a key role in maintaining the relationships of

the RNAs to each other since the interactions expected between

some fragments are limited.

Arraying the rRNA fragments in a 3-D model further substantiates the picture of a small but functional ribosome with the rRNA fragments clustered mostly on the interface between subunits. The critical point is that nearly all regions of the rRNAs that are associated with protein synthesis/function are present in the Plasmodium mt rRNAs. Despite their unconventional characteristics, they appear likely to function conventionally once assembled into ribosomes. How that is mediated and the role of the unassigned fragments both remain to be determined.

Materials and Methods

Parasites

The C10 clone of P. falciparum was grown in vitro by the method of Trager and Jensen [79]. Parasites were harvested by saponin lysis of infected red cells, followed by centrifugation and washing. Harvested cells were quick-frozen in liquid nitrogen and stored at −80 °C until use.

DNA Analysis

Complete mt DNA sequences from 25 Plasmodium species and an isolate from a mandrill, plus eight other hemosporidians (Table S1) were aligned in the AlignX program of Vector NTi X10 using the Clustal W function, and manually revised as needed to conform to the mapped 5′ and 3′ ends of P. falciparum RNAs. Using this alignment, the sequence conservation relative to P. falciparum was determined in each species for each of the small RNAs and for the intergenic regions. Intergenic regions showing an average conservation of ~90% or better were examined on both DNA strands for the presence of additional small RNAs by RNA blotting.
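The screening step described above (average conservation relative to P. falciparum, with a cutoff near 90%) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: conservation is taken to be simple percent identity over alignment columns where the reference is not gapped, a gap in another species counts as a mismatch, and the function names and toy sequences are hypothetical.

```python
def percent_identity(reference: str, other: str) -> float:
    """Percent of aligned columns at which `other` matches `reference`.

    Columns where the reference has a gap are skipped (an assumption);
    a gap in `other` opposite a reference base counts as a mismatch.
    """
    columns = [(r, o) for r, o in zip(reference, other) if r != "-"]
    if not columns:
        return 0.0
    return 100.0 * sum(r == o for r, o in columns) / len(columns)

def mean_conservation(reference: str, others) -> float:
    """Average percent identity of several aligned sequences to the reference."""
    return sum(percent_identity(reference, o) for o in others) / len(others)

# Toy aligned intergenic region: reference first, then three other species.
reference_region = "ACGTACGTAC"
other_species = ["ACGTACGTAC", "ACGTACGAAC", "ACGTACG-AC"]

mean = mean_conservation(reference_region, other_species)
is_candidate = mean >= 90.0  # cutoff taken from the ~90% figure in the text
print(f"{mean:.1f}% average identity; candidate for RNA blotting: {is_candidate}")
```

Regions passing the cutoff would then be probed on both strands, as the text describes.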

RNA Preparation and Analysis

Total RNA was prepared as previously described [4] by the acid-guanidinium-phenol-chloroform method of Chomczynski and Sacchi [80] or with Trizol (Invitrogen). RNA blots were prepared and analyzed as previously described [4,61]. Briefly, total RNA was electrophoresed on 12%, 16%, or 20% acrylamide, 7 M urea gels in TBE (0.1 M Tris-borate, 0.9 mM EDTA, pH 8.3) and electrophoretically transferred to nylon membrane in TAE (40 mM Tris-acetate, 1 mM EDTA, pH 8.2). Probes for the RNA blots were [32P]-labeled in vitro transcripts complementary to the gene or [32P]-end-labeled oligonucleotides, with hybridization and wash conditions as previously described for similar probes [4]. Primer extension experiments were performed as previously described [81], using unlabeled oligonucleotide primers (Table S4) and Superscript reverse transcriptase (Invitrogen) in the presence of [α-32P] dATP. 3′ RACE was employed to determine the 3′ end of the transcripts. Total RNA was treated with DNase I (Pharmacia/Amersham) in the presence of RNasin (Promega), followed by the addition of a 3′ C tail with poly(A) polymerase (Invitrogen), as previously described [28]. First strand cDNA was prepared using an oligo(dG) primer (CGGGATCCGGGGGGGGGGGG) and Superscript reverse transcriptase. It was then PCR amplified using Pfx Platinum polymerase (Invitrogen), with the oligo(dG) primer and a gene-specific 5′ primer (Table S4) for each rRNA fragment. PCR conditions were 2 min at 95 °C, followed by 4 repetitions of 30 sec at a primer-specific annealing temperature, 1 min at 72 °C, and 15 sec at 95 °C, then 29 repetitions of the same with the annealing temperature raised to 58 °C, and finally 10 min at 72 °C. For 5′ RACE, first strand cDNA was prepared with gene-specific primers, a C tail was added to the cDNA using terminal deoxynucleotidyl transferase, and PCR was carried out using a nested gene-specific primer and the oligo(dG) primer. For some RNAs, the size or sequence of the transcripts precluded making a nested primer so the cDNA primer was used for PCR as well, with mixed success. The PCR products were cloned into the pGEM T-EZ vector (Promega) and sequenced using an Applied Biosystems 3730XL Genetic Analyzer.
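The 3′ RACE cycling conditions described above can be written out as an explicit program, which makes the cycle count and the programmed hold time easy to check. The Python sketch below is illustrative: the class and function names are hypothetical, the primer-specific annealing temperature is left as `None` because the text does not state it, and ramp times between steps are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Step:
    label: str
    temp_c: Optional[float]  # None = primer-specific, not stated in the text
    seconds: int

def cycle(anneal_temp_c: Optional[float]) -> List[Step]:
    """One repetition: anneal, extend, denature, in the order given in the text."""
    return [
        Step("anneal", anneal_temp_c, 30),
        Step("extend", 72.0, 60),
        Step("denature", 95.0, 15),
    ]

program: List[Step] = (
    [Step("initial denaturation", 95.0, 120)]
    + 4 * cycle(None)    # 4 repetitions at the primer-specific temperature
    + 29 * cycle(58.0)   # 29 repetitions with annealing at 58 degrees C
    + [Step("final extension", 72.0, 600)]
)

n_cycles = sum(step.label == "anneal" for step in program)
hold_seconds = sum(step.seconds for step in program)
print(n_cycles, "cycles;", hold_seconds / 60, "min of programmed hold time")
```

Counting only the programmed holds, the protocol amounts to 33 amplification cycles in total.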

Assignment of P. falciparum mt Transcripts to Regions of rRNA Secondary Structure Models

The P. falciparum and T. parva mt genome sequences were re-analyzed by performing pairwise BLAST comparisons [82] between these two sequences and between each of these and the E. coli rrnB operon [83]. Mt rRNA homologs detected in this way were then manually folded to fit the E. coli secondary structure models (http://www.rna.ccbb.utexas.edu/) [55], taking the P. falciparum transcript mapping data into consideration. The Spin program of the Staden package [84] was then used to assign transcripts to "missing" regions of the secondary structures; we searched for short stretches of sequence that are known to be conserved among other rRNAs (see structure conservation diagrams at the CRW Site). In some searches, we also included query sequences that were predicted by the base pairing partner sequence contained within other rRNA fragments that were already assigned to the secondary structure models.
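Searching a genome for short conserved stretches on both strands can be illustrated with a simple mismatch-tolerant scan. The Python sketch below is not the algorithm used by the Spin program of the Staden package; it is a generic stand-in, and the function names, the toy genome, and the motif are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_motif(genome: str, motif: str, max_mismatches: int = 1):
    """Return (start, strand, mismatches) for each hit on either strand.

    `start` is 1-based on the forward strand; '-' hits are positions where
    the forward strand matches the reverse complement of the motif.
    """
    k = len(motif)
    hits = []
    for strand, query in (("+", motif), ("-", revcomp(motif))):
        for i in range(len(genome) - k + 1):
            d = hamming(genome[i:i + k], query)
            if d <= max_mismatches:
                hits.append((i + 1, strand, d))
    return hits

toy_genome = "TTGACGGATAACCATCCGTCAA"
for hit in find_motif(toy_genome, "GACGGAT", max_mismatches=0):
    print(hit)
```

A query predicted from a base pairing partner, as mentioned in the text, could be searched the same way by passing the complement of the partner strand as the motif.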

Figure 5. rRNA tertiary structure. The P. falciparum mt rRNAs are superimposed on a space-filling model of the three dimensional rRNA structure. Each individual P. falciparum rRNA fragment is colored and labeled; regions of the model with no P. falciparum equivalent are colored gray. Functional regions of the rRNAs are labeled in black. Full-page versions of each panel are available as Figures S12–S16. (A) P. falciparum SSU rRNA superimposed on Thermus thermophilus (PDB ID 1J5E; left = front/interface side, right = back); (B) P. falciparum LSU rRNA superimposed on Haloarcula marismortui (PDB ID 1S72; left = crown/interface side, right = back); (C–E) secondary structure diagrams for SSU, 5′ LSU, and 3′ LSU rRNAs, respectively, with each P. falciparum mt rRNA fragment color-coordinated with the fragment colors in the three-dimensional structure. doi:10.1371/journal.pone.0038320.g005

3D Model

The computer program RasMol [85] and its scripting language were used to color the regions of the SSU and LSU rRNA that are associated with the P. falciparum rRNA fragments.

Accession Numbers

Accession numbers for sequences are found in Table S1. All are mitochondrial genome sequences from apicomplexan parasites.

Supporting Information

Figure S1 RNA10 variation among Plasmodium species. The sequence of P. falciparum RNA10 is shown in the top line, and the sarcin-ricin loop sequence is indicated with a horizontal bracket. For the other Plasmodium species, positions that differ from P. falciparum are indicated with letters. Bases shown in red are conserved in all Plasmodium species. '.', conserved relative to P. falciparum; '-', no corresponding nucleotide.

(TIF)

Figure S2 Identification of P. falciparum mitochondrial GTPase center rRNAs. (A) The predicted secondary structures of the most conserved portions of the P. falciparum mt sequence corresponding to the GTPase center are shown. Strongly conserved nt are shown in color, with lower case indicating 90%+ and upper case indicating 95%+ conservation among three domains and two organelles. Red nt in the P. falciparum sequence match the consensus and the blue nt differs from consensus. Open circles are placeholders to indicate the structure of a conventional GTPase center. These sequences are identical among 26 different species of Plasmodium, with the exception of the single position in LSUB that varies from overall consensus. (B) Total P. falciparum RNA was electrophoresed on denaturing 7 M urea, 20% acrylamide gels, electrophoretically transferred to nylon membrane, and probed with radiolabeled oligonucleotides complementary to LSUB and LSUC (Table S4). Transcript sizes were determined using a ladder of small in vitro transcripts as markers.

(TIF)
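The case convention in panel A (upper case for 95%+ conservation, lower case for 90%+) can be illustrated with a short consensus-calling sketch. The thresholds come from the caption; everything else is an assumption made for illustration, including the function name, the use of "o" as a stand-in for the open-circle placeholders, and the simple treatment of each alignment column as a string of nucleotides without gap handling.

```python
from collections import Counter


def consensus_symbol(column: str, lower: float = 0.90, upper: float = 0.95) -> str:
    """Return the most common nucleotide of an alignment column in upper case
    if its frequency is at least `upper`, in lower case if at least `lower`,
    and 'o' (placeholder, an assumption of this sketch) otherwise."""
    if not column:
        raise ValueError("column must contain at least one nucleotide")
    nucleotide, count = Counter(column.upper()).most_common(1)[0]
    fraction = count / len(column)
    if fraction >= upper:
        return nucleotide.upper()
    if fraction >= lower:
        return nucleotide.lower()
    return "o"


def consensus(columns) -> str:
    """Apply consensus_symbol to each column of an alignment."""
    return "".join(consensus_symbol(col) for col in columns)


if __name__ == "__main__":
    # Toy columns, not data from the paper.
    cols = ["G" * 40, "G" * 37 + "A" * 3, "G" * 8 + "A" * 2]
    print(consensus(cols))
```

A real analysis would also need a rule for gaps and for weighting the three domains and two organelles, which the caption does not specify.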

Figure S3 Small mitochondrial RNAs of P. falciparum. Total P. falciparum RNA was electrophoresed on denaturing 7M

urea, 12% acrylamide gels, electrophoretically transferred to nylon

membrane, and probed with radiolabeled sequences (Table S4)

complementary to rDNA regions from the P. falciparum mt

genome. Transcript sizes were determined using a ladder of small

in vitro transcripts as markers. The identities of transcripts shown

on the same panel (RNA11, SSUD) were separately established

with oligonucleotide probes. Abbreviations are as shown for

Figure 1.

(TIF)

Figure S4 LSUF analysis. (A) Total P. falciparum RNA was

electrophoresed on denaturing 7M urea, 12% acrylamide gels,

electrophoretically transferred to nylon membrane, and probed

with a radiolabeled in vitro transcript complementary to LSUF.

Transcript sizes, given in nt, were determined using a ladder of

small in vitro transcripts as markers. (B) Total P. falciparum RNA

(10 ng/lane) was incubated with an unlabeled oligonucleotide

primer and Superscript reverse transcriptase (Invitrogen) in the

presence of [α-32P] dATP. The products were electrophoresed on

7M urea, 12% acrylamide gels and exposed to film. Lanes 1–3,

extensions were performed at 50°C, 55°C, and 60°C, respectively.

The position of the transcript ends is indicated relative to the P.

falciparum mt genome (GenBank M76611). (C) The P. falciparum mt

sequence for nt 1501–1530 is shown. Vertical arrows indicate the

two mapped 5′ ends for LSUF. Lower case letters indicate

positions that vary among 200 P. falciparum sequences. The number

of hemosporidian species exhibiting the specific nt at each position

is given.

(TIF)

Figure S5 Mapping P. falciparum mt RNAs. Schematic

representations are shown for two regions of the P. falciparum mt

genome, with mapped RNAs indicated by boxes and the DNA

sequence for transcript ends shown above the boxes. Transcript

termini are in bold, with the 5′ nt italicized and the 3′ nt

underlined. The nt position in the genome is located above each

terminus.

(TIF)

Figure S6 Proposed secondary structure for T. parva mt SSU rRNA. The T. parva mt SSU rRNA secondary structure

model is superimposed onto the E. coli SSU rRNA secondary

structure model diagram. The transcript ends shown are artificial,

based on the extent of sequence similarity between the mapped P.

falciparum RNAs, and the T. parva mt DNA sequence, or the

capacity to form expected secondary structure.

(TIF)

Figure S7 Proposed secondary structure for T. parva mt LSU rRNA (5′ half). The T. parva mt LSU rRNA (5′ half)

secondary structure model is superimposed onto the E. coli LSU

rRNA (5′ half) secondary structure model diagram. The transcript

ends shown are artificial, based on the extent of sequence

similarity between the mapped P. falciparum RNAs, and the T.

parva mt DNA sequence, or the capacity to form expected

secondary structure.

(TIF)

Figure S8 Proposed secondary structure for T. parva mt LSU rRNA (3′ half). The T. parva mt LSU rRNA (3′ half)

secondary structure model is superimposed onto the E. coli LSU

rRNA (3′ half) secondary structure model diagram. The transcript

ends shown are artificial, based on the extent of sequence

similarity between the mapped P. falciparum RNAs, and the T.

parva mt DNA sequence, or the capacity to form expected

secondary structure.

(TIF)

Figure S9 Location of type 1 and 2 covariations in SSU rRNA. The P. falciparum mt SSU rRNA secondary structure

model, shown in thick gray lines, is superimposed onto the E. coli

SSU rRNA secondary structure model diagram. Regions where P.

falciparum has no structure equivalent to E. coli are shown using

gray circles and lines. Category 1 covarying pairs are indicated

with red and category 2 with blue.

(TIF)

Figure S10 Location of type 1 and 2 covariations in LSU rRNA (5′ half). The P. falciparum mt LSU rRNA (5′ half)

secondary structure model, shown in thick gray lines, is

superimposed onto the E. coli LSU rRNA (5′ half) secondary

structure model diagram. Regions where P. falciparum has no

structure equivalent to E. coli are shown using gray circles and

lines. Category 1 covarying pairs are indicated with red and

category 2 with blue.

(TIF)

Figure S11 Location of type 1 and 2 covariations in LSU rRNA (3′ half). The P. falciparum mt LSU rRNA (3′ half)

secondary structure model, shown in thick gray lines, is

superimposed onto the E. coli LSU rRNA (3′ half) secondary

structure model diagram. Regions where P. falciparum has no

structure equivalent to E. coli are shown using gray circles and

lines. Category 1 covarying pairs are indicated with red and

category 2 with blue.

(TIF)

Figure S12 SSU rRNA tertiary structure. The P. falciparum

mt SSU rRNAs are superimposed on a space-filling model of the

three dimensional rRNA structure of Thermus thermophilus (PDB ID

1J5E; left = front/interface side, right = back). Each individual P.

falciparum rRNA fragment is colored and labeled; regions of the

model with no P. falciparum equivalent are colored gray. Functional

regions of the rRNAs are labeled in black. (Full page version of

Figure 5A.)

(TIF)

Figure S13 LSU rRNA tertiary structure. The P. falciparum

LSU rRNA is superimposed on a space-filling model of the three

dimensional rRNA structure of Haloarcula marismortui (PDB ID

1S72; left = crown/interface side, right = back). Each individual

P. falciparum rRNA fragment is colored and labeled; regions of the

model with no P. falciparum equivalent are colored gray. Functional

regions of the rRNAs are labeled in black. (Full page version of

Figure 5B.)

(TIF)

Figure S14 P. falciparum mt SSU rRNAs color-coded to tertiary structure. Secondary structure diagram for P.

falciparum mt SSU rRNA with each P. falciparum mt rRNA

fragment color-coordinated with the fragment colors in the

Thermus thermophilus three-dimensional structure (Figure S12). (Full

page version of Figure 5C.)

(TIF)

Figure S15 P. falciparum mt LSU rRNAs (5′ half) color-coded to tertiary structure. Secondary structure diagram for

P. falciparum mt LSU rRNA (5′ half) with each P. falciparum mt

rRNA fragment color-coordinated with the fragment colors in the

Haloarcula marismortui three-dimensional structure (Figure S13).

(Full page version of Figure 5D.)

(TIF)

Figure S16 P. falciparum mt LSU rRNAs (3′ half) color-coded to tertiary structure. Secondary structure diagram for

P. falciparum mt LSU rRNA (3′ half) with each P. falciparum mt

rRNA fragment color-coordinated with the fragment colors in the

Haloarcula marismortui three-dimensional structure (Figure S13).

(Full page version of Figure 5E.)

(TIF)

Table S1 Mt genomes from Plasmodium species and related apicomplexans.

(PDF)

Table S2 Intergenic size variation among hemosporidian mt genomes.

(PDF)

Table S3 Similarity of individual mt genes from Plasmodium species relative to P. falciparum.

(PDF)

Table S4 Oligonucleotides used in this study.

(PDF)

Table S5 P. falciparum mt rRNA 3′ end sequences.

(PDF)

Table S6 Abutted P. falciparum mt gene sets.

(PDF)

Table S7 Fragment placement in rRNA alignments for covariation analysis.

(PDF)

Table S8 Prospective covariations found in fragmented apicomplexan mt rRNA genes.

(PDF)

Table S9 Correspondence of P. falciparum and C. elegans mt rRNAs.

(PDF)

Acknowledgments

The authors thank Barbara Mericle, David Rehkopf, Tamara Reynolds,

Caitlin Tolzmann, and Michelle Wurscher for excellent technical

assistance.

Author Contributions

Conceived and designed the experiments: JEF RRG. Performed the

experiments: JEF MIH JCL KJC BHS JJC GT MNS. Analyzed the data:

JEF MIH JCL KJC BHS JJC GT MNS. Wrote the paper: JEF MNS

RRG.

References

1. Feagin JE (1992) The 6 kb element of Plasmodium falciparum encodes

mitochondrial cytochrome genes. Mol Biochem Parasitol 52: 145–148.

2. Lang BF, Gray MW, Burger G (1999) Mitochondrial genome evolution and the

origin of eukaryotes. Annu Rev Genet 33: 351–397.

3. Feagin JE, Werner E, Gardner MJ, Williamson DH, Wilson RJM (1992)

Homologies between the contiguous and fragmented rRNAs of the two

Plasmodium falciparum extrachromosomal DNAs are limited to core sequences.

Nucleic Acids Res 20: 879–887.

4. Feagin JE, Mericle BL, Werner E, Morris M (1997) Identification of additional

rRNA fragments encoded by the Plasmodium falciparum 6 kb element. Nucleic

Acids Res 25: 438–446.

5. Boer P, Gray MW (1988) Scrambled ribosomal RNA gene pieces in

Chlamydomonas reinhardtii mitochondrial DNA. Cell 55: 399–411.

6. Heinonen TYK, Schnare MN, Young PG, Gray MW (1987) Rearranged coding

segments, separated by a transfer RNA gene, specify the two parts of a

discontinuous large subunit ribosomal RNA in Tetrahymena pyriformis mitochondria.

J Biol Chem 262: 2879–2887.

7. Schnare MN, Gray MW (1990) Sixteen discrete RNA components in the

cytoplasmic ribosome of Euglena gracilis. J Mol Biol 215: 73–83.

8. Schnare MN, Heinonen TYK, Young PG, Gray MW (1986) A discontinuous

small subunit ribosomal RNA in Tetrahymena pyriformis mitochondria. J Biol

Chem 261: 5187–5193.

9. Spencer DF, Collings JC, Schnare MN, Gray MW (1987) Multiple spacer

sequences in the nuclear large subunit ribosomal RNA gene of Crithidia fasciculata.

EMBO J 6: 1063–1071.

10. Turmel M, Boulanger J, Schnare MN, Gray MW (1991) Six group I introns and

three internal transcribed spacers in the chloroplast large subunit ribosomal

RNA gene of the green alga Chlamydomonas eugametos. J Mol Biol 218: 293–311.

11. White TC, Rudenko G, Borst P (1986) Three small RNAs within the 10 kb

trypanosome rRNA transcription unit are analogous to domain VII of other

eukaryotic 28S rRNAs. Nucleic Acids Res 14: 9471–9489.

12. Milbury CA, Lee JC, Cannone JJ, Gaffney PM, Gutell RR (2010)

Fragmentation of the large subunit ribosomal RNA gene in oyster mitochondrial

genomes. BMC Genomics 11: 485.

13. Evguenieva-Hackenberg E (2005) Bacterial ribosomal RNA in pieces. Mol

Microbiol 57: 318–325.

14. Okimoto R, Macfarlane JL, Clary DO, Wolstenholme DR (1992) The

mitochondrial genomes of two nematodes, Caenorhabditis elegans and Ascaris suum.

Genetics 130: 471–498.

15. Sloof P, Van den Burg J, Voogd A, Benne R, Agostinelli M, et al. (1985) Further

characterization of the extremely small mitochondrial ribosomal RNAs from

trypanosomes: a detailed comparison of the 9S and 12S RNAs from Crithidia

fasciculata and Trypanosoma brucei with rRNAs from other organisms. Nucleic

Acids Res 13: 4171–4190.

16. de la Cruz VF, Lake JA, Simpson AM, Simpson L (1985) A minimal ribosomal

RNA: Sequence and secondary structure of the 9S kinetoplast ribosomal RNA

from Leishmania tarentolae. Proc Natl Acad Sci USA 82: 1401–1405.

17. de la Cruz VF, Simpson AM, Lake JA, Simpson L (1985) Primary sequence and

secondary structure of the 12S kinetoplast (mitochondrial) ribosomal RNA from

Leishmania tarentolae: conservation of the peptidyl transferase structural elements.

Nucleic Acids Res 13: 2337–2356.

18. Payne M, Rothwell V, Jasmer DP, Feagin JE, Stuart K (1985) Identification of

mitochondrial genes in Trypanosoma brucei and homology to cytochrome c oxidase

in two different reading frames. Mol Biochem Parasitol 15: 159–170.

19. Suplick K, Morrisey J, Vaidya AB (1990) Complex transcription from the

extrachromosomal DNA encoding mitochondrial functions of Plasmodium yoelii.

Mol Cell Biol 10: 6381–6388.

20. Vaidya AB, Akella R, Suplick K (1989) Sequences similar to genes for two

mitochondrial proteins and portions of ribosomal RNA in tandemly arrayed 6-

kilobase-pair DNA of a malarial parasite. Mol Biochem Parasitol 35: 97–108.

21. Hikosaka K, Watanabe Y, Kobayashi F, Waki S, Kita K, et al. (2011) Highly

conserved gene arrangement of the mitochondrial genomes of 23 Plasmodium

species. Parasitol Int 60: 175–180.

22. Dechering KJ, Kaan AM, Mbacham W, Wirth DF, Eling W, et al. (1998)

Isolation and functional characterization of two distinct sexual-stage-specific

promoters of the human malaria parasite Plasmodium falciparum. Mol Cell Biol 19:

967–978.

23. Kairo A, Fairlamb AH, Gobright E, Nene V (1994) A 7.1 kb linear DNA

molecule of Theileria parva has scrambled rDNA sequences and open reading

frames for mitochondrially-encoded proteins. EMBO J 13: 898–905.

24. Hikosaka K, Watanabe Y, Tsuji N, Kita K, Kishine H, et al. (2010) Divergence

of the mitochondrial genome structure in the apicomplexan parasites, Babesia

and Theileria. Mol Biol Evol 27: 1107–1116.

25. Lin RQ, Qiu LL, Liu GH, Wu XY, Weng YB, et al. (2011) Characterization of

the complete mitochondrial genomes of five Eimeria species from domestic

chickens. Gene 480: 28–33.

26. Perkins SL, Austin CC (2009) Four new species of Plasmodium from New Guinea

lizards: integrating morphology and molecules. J Parasitol 95: 424–433.

27. Rehkopf DH, Gillespie DE, Harrell MI, Feagin JE (2000) Transcriptional

mapping and RNA processing of the Plasmodium falciparum mitochondrial

mRNAs. Mol Biochem Parasitol 105: 91–103.

28. Gillespie DE, Salazar NA, Rehkopf DH, Feagin JE (1999) The fragmented

mitochondrial ribosomal RNAs of Plasmodium falciparum have short A tails.

Nucleic Acids Res 27: 2416–2422.

29. Raabe CA, Sanchez CP, Randau G, Robeck T, Skryabin BV, et al. (2010) A

global view of the nonprotein-coding transcriptome in Plasmodium falciparum.

Nucleic Acids Res 38: 608–617.

30. Ban N, Nissen P, Hansen J, Moore PB, Steitz TA (2000) The complete atomic

structure of the large ribosomal subunit at 2.4 Å resolution. Science 289: 905–

920.

31. Wimberly BT, Brodersen DE, Clemons WM, Morgan-Warren RJ, Carter AP, et

al. (2000) Structure of the 30S ribosomal subunit. Nature 407: 327–339.

32. Schluenzen F, Tocilj A, Zarivach R, Harms J, Gluehmann M, et al. (2000)

Structure of functionally activated small ribosomal subunit at 3.3 angstroms

resolution. Cell 102: 615–623.

33. Yusupov MM, Yusupova GZ, Baucom A, Lieberman K, Earnest TN, et al.

(2001) Crystal structure of the ribosome at 5.5 Å resolution. Science 292: 883–

896.

34. Mears JA, Sharma MR, Gutell RR, McCook AS, Richardson PE, et al. (2006) A

structural model for the large subunit of the mammalian mitochondrial

ribosome. J Mol Biol 358: 193–212.

35. Sharma MR, Koc EC, Datta PP, Booth TM, Spremulli LL, et al. (2003)

Structure of the mammalian mitochondrial ribosome reveals an expanded

functional role for its component proteins. Cell 115: 97–108.

36. Mears JA, Cannone JJ, Stagg SM, Gutell RR, Agrawal RK, et al. (2002)

Modeling a minimal ribosome based on comparative sequence analysis. J Mol

Biol 321: 215–234.

37. Chandramouli P, Topf M, Menetret JF, Eswar N, Cannone JJ, et al. (2008)

Structure of the mammalian 80S ribosome at 8.7 Å resolution. Structure 16:

535–548.

38. Ben-Shem A, Jenner L, Yusupova G, Yusupov M (2010) Crystal structure of the

eukaryotic ribosome. Science 330: 1203–1209.

39. Rabl J, Leibundgut M, Ataide SF, Haag A, Ban N (2011) Crystal structure of the

eukaryotic 40S ribosomal subunit in complex with initiation factor 1. Science

331: 730–736.

40. Hansen JL, Moore PB, Steitz TA (2003) Structures of five antibiotics bound at

the peptidyl transferase center of the large ribosomal subunit. J Mol Biol 330:

1061–1075.

41. Hansen JL, Ippolito JA, Ban N, Nissen P, Moore PB, et al. (2002) The structures

of four macrolide antibiotics bound to the large ribosomal subunit. Mol Cell 10:

117–128.

42. Tu D, Blaha G, Moore PB, Steitz TA (2005) Structures of MLSBK antibiotics

bound to mutated large ribosomal subunits provide a structural explanation for

resistance. Cell 121: 257–270.

43. Barta A, Dorner S, Polacek N, Berg JM, Lorsch JR, et al. (2001) Mechanism of

Ribosomal Peptide Bond Formation. Science 291: 203.

44. Hansen JL, Schmeing TM, Klein DJ, Ippolito JA, Ban N, et al. (2001) Progress

toward an understanding of the structure and enzymatic mechanism of the large

ribosomal subunit. Cold Spring Harb Symp Quant Biol 66: 33–42.

45. Hansen JL, Schmeing TM, Moore PB, Steitz TA (2002) Structural insights into

peptide bond formation. Proc Natl Acad Sci U S A 99: 11670–11675.

46. Ban N, Freeborn B, Nissen P, Penczek P, Grassucci RA, et al. (1998) A 9 Å

resolution x-ray crystallographic map of the large ribosomal subunit. Cell 93:

1105–1115.

47. Nissen P, Hansen J, Ban N, Moore PB, Steitz TA (2000) The structural basis of

ribosome activity in peptide bond synthesis. Science 289: 920–930.

48. Steitz TA (2005) On the structural basis of peptide-bond formation and

antibiotic resistance from atomic structures of the large ribosomal subunit. FEBS

Lett 579: 955–958.

49. Rodnina MV, Beringer M, Wintermeyer W (2007) How ribosomes make

peptide bonds. Trends Biochem Sci 32: 20–26.

50. Klein DJ, Moore PB, Steitz TA (2004) The contribution of metal ions to the

structural stability of the large ribosomal subunit. RNA 10: 1366–1379.

51. Klein DJ, Moore PB, Steitz TA (2004) The roles of ribosomal proteins in the

structure, assembly, and evolution of the large ribosomal subunit. J Mol Biol 340:

141–177.

52. Moore PB, Steitz TA (2002) The involvement of RNA in ribosome function.

Nature 418: 229–235.

53. Nissen P, Ippolito JA, Ban N, Moore PB, Steitz TA (2001) RNA tertiary

interactions in the large ribosomal subunit: the A-minor motif. Proc Natl Acad

Sci U S A 98: 4899–4903.

54. Voss NR, Gerstein M, Steitz TA, Moore PB (2006) The geometry of the

ribosomal polypeptide exit tunnel. J Mol Biol 360: 893–906.

55. Cannone JJ, Subramanian S, Schnare MN, Collett JR, D’Souza LM, et al.

(2002) The comparative RNA web (CRW) site: an online database of

comparative sequence and structure information for ribosomal, intron, and

other RNAs. BMC Bioinformatics 3: 2.

56. Krief S, Escalante AA, Pacheco MA, Mugisha L, Andre C, et al. (2010) On the

diversity of malaria parasites in African apes and the origin of Plasmodium

falciparum from bonobos. PLoS Pathog 6: e1000765.

57. Liu W, Li Y, Learn GH, Rudicell RS, Robertson JD, Keele BF, et al. (2010)

Origin of the human malaria parasite Plasmodium falciparum in gorillas. Nature

467: 420–425.

58. Joy DA, Feng X, Mu J, Furuya T, Chotivanich K, et al. (2003) Early origin and

recent expansion of Plasmodium falciparum. Science 300: 318–321.

59. Conway DJ, Fanello C, Lloyd JM, Al Joubori BM, Baloch AH, et al. (2000)

Origin of Plasmodium falciparum malaria is traced by mitochondrial DNA. Mol

Biochem Parasitol 111: 163–171.

60. Jongwutiwes S, Putaporntip C, Iwasaki T, Ferreira MU, Kanbara H, et al.

(2005) Mitochondrial genome sequences support ancient population expansion

in Plasmodium vivax. Mol Biol Evol 22: 1733–1739.

61. Feagin JE, Drew ME (1995) Plasmodium falciparum: Alterations in organelle

transcript abundance during the erythrocytic cycle. Exp Parasitol 80: 430–440.

62. Ji Y, Mericle BL, Rehkopf DH, Anderson JD, Feagin JE (1996) The 6 kb

element of Plasmodium falciparum is polycistronically transcribed. Mol Biochem

Parasitol 81: 211–223.

63. Dubin DT, HsuChen CC, Timko KD, Azzolina TM, Prince DL, et al. (1982) 3’

termini of mammalian and insect mitochondrial rRNAs. In: Slonimski P, Borst

P, Attardi G, editors. Mitochondrial Genes. Cold Spring Harbor: Cold Spring

Harbor Press. 89–98.

64. Dubin DT, Montoya J, Timko KD, Attardi G (1982) Sequence analysis and

precise mapping of the 3’-ends of HeLa cell mitochondrial ribosomal RNAs.

J Mol Biol 157: 1–19.

65. Van Etten RA, Bird JW, Clayton DA (1983) Identification of the 3’-ends of the

two mouse mitochondrial ribosomal RNAs. J Biol Chem 258: 10104–10110.

66. Wilson RJ, Williamson DH (1997) Extrachromosomal DNA in the Apicomplexa.

Microbiol Mol Biol Rev 61: 1–16.

67. Fry M, Beesley JE (1991) Mitochondria of mammalian Plasmodium spp.

Parasitology 102: 17–26.

68. Krungkrai J (1995) Purification, characterization and localization of mitochondrial

dihydroorotate dehydrogenase in Plasmodium falciparum, human malaria

parasite. Biochim Biophys Acta 1243: 351–360.

69. Gray MW (1993) Origin and evolution of organelle genomes. Curr Opin Genet

Dev 3: 884–890.

70. Shukla GC, Nene V (1998) Telomeric features of Theileria parva mitochondrial

DNA derived from cycle sequence data of total genomic DNA. Mol Biochem

Parasitol 95: 159–163.

71. Gutell RR, Weiser B, Woese CR, Noller HF (1985) Comparative anatomy of 16S-like

ribosomal RNA. Prog Nucleic Acid Res Mol Biol 32: 155–216.

72. Powers T, Noller HF (1991) A functional pseudoknot in 16S ribosomal RNA.

EMBO J 10: 2203–2214.

73. Gutell RR, Noller HF, Woese CR (1986) Higher order structure in ribosomal

RNA. EMBO J 5: 1111–1113.

74. Jackson CJ, Norman JE, Schnare MN, Gray MW, Keeling PJ, et al. (2007)

Broad genomic and transcriptional analysis reveals a highly derived genome in

dinoflagellate mitochondria. BMC Biol 5: 41.

75. Gutell RR, Lee JC, Cannone JJ (2002) The accuracy of ribosomal RNA

comparative structure models. Curr Opin Struct Biol 12: 301–310.

76. Joseph JT, Aldritt SM, Unnasch T, Puijalon O, Wirth DF (1989) Characterization

of a conserved extrachromosomal element isolated from the avian

malarial parasite Plasmodium gallinaceum. Mol Cell Biol 9: 3621–3629.

77. Kamikawa R, Inagaki Y, Sako Y (2007) Fragmentation of mitochondrial large

subunit rRNA in the dinoflagellate Alexandrium catenella and the evolution of

rRNA structure in alveolate mitochondria. Protist 158: 239–245.

78. Slamovits CH, Saldarriaga JF, Larocque A, Keeling PJ (2007) The highly

reduced and fragmented mitochondrial genome of the early-branching

dinoflagellate Oxyrrhis marina shares characteristics with both apicomplexan

and dinoflagellate mitochondrial genomes. J Mol Biol 372: 356–368.

79. Trager W, Jensen JB (1978) Cultivation of malarial parasites. Nature 273: 621–

622.

80. Chomczynski P, Sacchi N (1987) Single-step method of RNA isolation by acid

guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem 162:

156–159.

81. Gardner MJ, Feagin JE, Moore DJ, Spencer DF, Gray MW, et al. (1991)

Organization and expression of small subunit ribosomal RNA genes encoded by

a 35-kb circular DNA in Plasmodium falciparum. Mol Biochem Parasitol 48: 77–88.

82. Tatusova TA, Madden TL (1999) BLAST 2 Sequences, a new tool for

comparing protein and nucleotide sequences. FEMS Microbiol Lett 174: 247–250.

83. Brosius J, Dull TJ, Sleeter DD, Noller HF (1981) Gene organization and primary

structure of a ribosomal RNA operon from Escherichia coli. J Mol Biol 148: 107–

127.

84. Staden R, Beal KF, Bonfield JK (2000) The Staden package, 1998. Methods

Mol Biol 132: 115–130.

85. Bernstein HJ (2000) Recent changes to RasMol, recombining the variants.

Trends Biochem Sci 25: 453–455.

86. Cox-Singh J, Davis TM, Lee KS, Shamsul SS, Matusop A, et al. (2008)

Plasmodium knowlesi malaria in humans is widely distributed and potentially life

threatening. Clin Infect Dis 46: 165–171.

Figure S1 (alignment; the sarcin ricin loop is marked with a horizontal bracket in the original figure)

P. falciparum     UAUGUCCUGU UUCAAAUAUA UAUAU----GAAUA AUUGUACGAA UAGACAAUUG UGUUCAUAGC UAGAGUACGU AAGGAAAAGG AAAGGUUAAC CGCUAUCAAA
P. reichenowi     .......... .......... .....----..... .........U ..U....... .......... .......... .......... .......... ..........
P. billcollinsi   .......... .......... .....AU--..... .......... ..U....... .......... .......... .......... .......... ..........
P. billbrayi      .......... .......... .....AUAU..... .........U ..U....... .......... .......... .......... .......... ..........
P. gallinaceum    .......... .....G...U GU.G-----CU.AU ..G.-..... CUA.U..... .......... .......... ........A. .......... ..........
P. relictum       .......... ...C.G...U GU.G-----CU.AU ..G.-..... .UU.U..... .......... .......... ........A. .......... ..........
P. juxtanucleare  .......... .....G...U -U.G-----CU.AU ..G.-..... .UAU--.... .......... .......... ........A. .......... ..........
P. floridense     .......... ...G.U..AU GU.G-----CU.AU ..G.-..... .UAUU..... .......... .......... ........A. .......... ..........
P. mexicanum      .......... ...G....AC AU.G-----CU... UAA.-..-.. .U.U...... .......... .......... ........A. .......... ..........
P. ovale          .......... ...G.U..U. G..C-----UU... U...-..A.U A.U.U.U... ...AA..... .......... .......... .......... ..........
P. malariae       .......... .....C...U G..G-----CU.AU ..G.-..-.. .UA.--U... ...AC.A... .......... .......... .......... ..........
P. gonderi        .......... .....U.... G.. -----CU.GU ..A.-..-.. ..A.--G... ...AA.A... .......... ........A. .......... ..........
P. sp.            .......... .....C...U G..G-----CU.AU ..A.-..-.. ..C.---... ...AA.A... .......... ........A. .......... ..........
P. fragile        .......... ....UC.... G.U.-----CU.GU ..A.-.U-.. .UU.---... ..A.A.A... .......... ........A. .......... ..........
P. knowlesi       .......... ...GUCC... G...-----CU.GU ..A.-..-.. .UA.--G... ...AA.A... .......... ........-. .......... ..........
P. coatneyi       .......... ....UCC... G.U.-----CU.GU ..A.-G.-.. .UA.--G... ...AA.A... .......... ........A. .......... ..........
P. inui           .......... ....GC.... G.U.-----CU.GU ..A.-..-.. .CA.--.... ...AG..... .......... ........A. .......... ..........
P. hylobati       .......... .....C.... G.U.-----CU.GU ..A.-..-.. .GA.--.... ....A.A... .......... ........A. .......... ..........
P. vivax          .......... .....C.... G...-----CU.GU ..A.-..-.. ..AU--.... ....A.A... .......... ........A. .......... ..........
P. simium         .......... .....C.... G...-----CU.GU ..A.-..-.. ..AU--.... ....A.A... .......... ........A. .......... ..........
P. simiovale      .......... .....C.... G...-----CU.GU ..A.-..-.. .UU.--.... ...AA.A... .......... .......... .......... ..........
P. fieldi         .......... .....C.... G...-----CU.GU ..A.-..-.. .UU.--.... ...AA.A... .......... ........A. .......... ..........
P. cynomolgi      .......... .....C.... G...-----CU.GU ..A.-..-.. .UA.--.... ....A.A... .......... ........A. .......... ..........
P. yoelii         .......... ...G.U.... -C.G.----CU.G. U.A.-..-.. ..C.U.U... ...AU.A... .......... .......... .......... .........-
P. berghei        .......... ...G.U.... -C.G.----CU.G. U.A.-..-.. ..C.U.U... ...AU.A... .......... .......... .......... .........-
P. chabaudi       .......... ...G.U.... -C...----CU.AG U.A.-.U-.. .GC.U.U... ...AU.A... ....C..... .......... .......... .........-

[Figure S2 (image). Panel A: predicted secondary structures of the GTPase center regions of LSUB and LSUC, with helices H1057, H1082, and H1087 labeled. Panel B: blot of LSUB and LSUC with size labels 23 and ~30 nt.]

[Figure S3 (image). Blot panels for RNA11, SSUD, RNA12, RNA13, RNA14, RNA15, and RNA16; size labels read from the figure (nt): 78, 65, 68, 48, 40, 40, 54.]

[Figure S4 (image). Panel A: blot of LSUF with size labels 125 and 110 nt. Panel B: primer extension, lanes 1–3, with ends labeled 1501 and 1516. Panel C: P. falciparum mt sequence for nt 1501–1530 (GAGATAATGTgCCGT AAACAtATAACGGTA), with the LSUF (long) and LSUF (short) 5′ ends marked and, for each position, the number of hemosporidian species showing G, A, T, or C.]

[Figure S5 (image). Panels A and B: schematic maps of two regions of the P. falciparum mt genome. Labeled RNAs: RNA16, RNA20, RNA9, SSUE, LSUF, RNA2, RNA21, RNA26t, RNA3. Genome positions labeled at termini: 1630, 1631, 1638, 1680, 1697, 1754, 1763, 1764, 1806, 1807, 1829, 1830. Transcript-end sequences shown: AATAAATAATATCTAGCG ... TACGCCCTTAACG ... AAATGT ... CCCTATT and AAGCTTTTG ... CAATATTGAGTTGAC ... GAGTGGATTAAATGCC.]

[Figure S6 (image). Proposed secondary structure for T. parva mt SSU rRNA, with 5′ and 3′ ends marked for each fragment. Labeled fragments: SSU A, SSU B, SSU D, SSU E, SSU F, RNA 8, RNA 9, RNA 14, RNA 17.]

[Figure S7 (image). Proposed secondary structure for T. parva mt LSU rRNA (5′ half), with 5′ and 3′ ends marked for each fragment and a connector to the 3′ half. Labeled fragments: LSU A, LSU B, LSU C, RNA 1, RNA 2, RNA 3, RNA 11.]

[Figure S8 (image). Proposed secondary structure for T. parva mt LSU rRNA (3′ half), with 5′ and 3′ ends marked for each fragment and a connector to the 5′ half. Labeled fragments: LSU D, LSU E, LSU F, LSU G, RNA 3, RNA 6, RNA 10, RNA 13, RNA 18.]

[Figure S9 (image). P. falciparum mt SSU rRNA secondary structure superimposed on the E. coli SSU rRNA diagram (domains I–III, positions 10–1500). Labeled fragments: SSUA, SSUB, SSUD, SSUE, SSUF, RNA5, RNA8, RNA9, RNA12, RNA14, RNA17, RNA19. Labeled helices: H9, H17, H27, H39, H122, H240, H289, H500, H505, H511, H567, H575, H577, H655, H673, H722, H769, H821, H885, H921, H939, H944, H960, H984, H1047, H1068, H1074, H1241, H1303, H1350, H1399, H1506.]

[Figure: LSU rRNA secondary-structure diagram, domains I, II and III, showing fragment positions. Labeled fragments: LSUA, LSUB, LSUC, RNA1, RNA2, RNA3, RNA11. Connector label: 3’ half. Position ticks: 50 to 1640, and 2900. Helix labels: H563, H579, H589, H671, H687, H700, H736, H777, H812, H822, H946, H976, H991, H1030, H1057, H1082, H1087, H1164, H1196, H1262, H1276, H1295.]

[Figure: LSU rRNA secondary-structure diagram, domains IV, V and VI, showing fragment positions. Labeled fragments: LSUD, LSUE, LSUF, LSUG, RNA3, RNA6, RNA10, RNA13, RNA18. Connector label: 5’ half. Position ticks: 1650 to 2900. Helix labels: H1648, H1764, H1775, H1782, H1792, H1830, H1835, H1906, H1925, H1935, H2023, H2043, H2064, H2077, H2246, H2455, H2507, H2520, H2547, H2588, H2646, H2675, H2735.]

[Figure: SSU rRNA fragment map with functional regions labeled: 530 loop, A-site, P-site. Labeled fragments: SSUA, SSUB, SSUD, SSUE, SSUF, RNA5, RNA8, RNA9, RNA12, RNA14, RNA17, RNA19.]

[Figure: LSU rRNA fragment map with functional regions labeled: P-loop, PTL, GTPase, SRL, A-loop. Labeled fragments: LSUA, LSUB, LSUC, LSUD, LSUE, LSUF, LSUG, RNA1, RNA2, RNA3, RNA6, RNA10, RNA11, RNA13, RNA18.]

[Figure: SSU rRNA secondary-structure diagram showing fragment positions. Labeled fragments: SSUA, SSUB, SSUD, SSUE, SSUF, RNA5, RNA8, RNA9, RNA12, RNA14, RNA17, RNA19. Domains: I, II, III. Position ticks: 10 to 1500. Helix labels: H9, H17, H27, H122, H240, H289, H500, H505, H511, H567, H575, H577, H655, H673, H722, H769, H885, H921, H939, H944, H960, H984, H1047, H1068, H1074, H1303, H1350, H1399, H1506. Additional label: G.]

[Figure: LSU rRNA secondary-structure diagram, domains I, II and III, showing fragment positions. Labeled fragments: LSUA, LSUB, LSUC, RNA1, RNA2, RNA3, RNA11. Connector label: 3’ half. Position ticks: 50 to 1640, and 2900. Helix labels: H563, H579, H589, H671, H687, H700, H736, H777, H812, H822, H946, H976, H991, H1030, H1057, H1082, H1087, H1164, H1196, H1262, H1276, H1295.]

[Figure: LSU rRNA secondary-structure diagram, domains IV, V and VI, showing fragment positions. Labeled fragments: LSUD, LSUE, LSUF, LSUG, RNA3, RNA6, RNA10, RNA13, RNA18. Connector label: 5’ half. Position ticks: 1650 to 2900. Helix labels: H1648, H1764, H1775, H1782, H1792, H1830, H1835, H1906, H1925, H1935, H2023, H2043, H2064, H2077, H2246, H2455, H2507, H2520, H2547, H2588, H2646, H2675, H2735.]

Table S1. Mt genomes from Plasmodium species and related apicomplexans.

Organism^1 (Host^2)   Mt DNA size (nt)^3   Identity to M76611   Initial Set^4   Accession Data

Plasmodium (Haemosporida)

P. berghei (rodent) 5957 88.3% Y AB558173 (contig 5406, Sanger)

5961 88.0% AF014115

P. billbrayi (chimpanzee) 5945 91.8% GQ355468

5949 91.7% Y GQ355469

5864 90.3% GQ355470

5865 90.5% GQ355471

P. billcollinsi (chimpanzee) 5864 92.7% GQ355477

5864 92.8% GQ355478

5950 94.2% Y GQ355479

P. chabaudi (rodent) 5949 88.1% AB379663

5949 88.2% AB379664

5949 88.2% AB379665

5949 88.1% AB379666

5949 88.1% AB379667

5949 88.2% AB379668

5949 88.2% AB379669

5949 88.2% AB379670

5949 88.2% AB379671

5948 88.2% Y AF014116

P. coatneyi (OW monkey, Asian) 5976 88.5% Y AB354575

P. cynomolgi (OW monkey, Asian) 5983 88.8% AB434919

5982 88.8% AB444121

5985 88.8% AB444122

5986 88.9% AB444123

5983 88.8% AB444124

5983 88.8% AB444125

5993 88.9% AB444126

5986 88.9% AB444127

5986 88.9% AB444128

5991 88.8% AB444129

5984 88.8% AB444130

5984 88.8% AB444131

5884* 87.3% Y AY800108

P. falciparum (human) 5967 (self) Y M76611

5967 99.9% AJ276844 (NC_002375)

5949 99.9% AY282924

5949 99.9% AY282925

5949 99.9% AY282926

5949 99.9% AY282927

5949 99.9% AY282928

5949 99.9% AY282929

5949 99.9% AY282930

5949 99.9% AY282931

5949 99.9% AY282932

5949 99.9% AY282933

5949 99.9% AY282934

5949 99.9% AY282935

5949 99.9% AY282936

5949 99.9% AY282937

5949 99.9% AY282938

5949 99.9% AY282939

5949 99.9% AY282940

5949 99.9% AY282941

5949 99.9% AY282942

5949 99.9% AY282943

5949 99.9% AY282944

5949 99.9% AY282945

5949 99.9% AY282946

5949 99.9% AY282947

5949 99.9% AY282948

5949 99.9% AY282949

5949 99.9% AY282950

5949 100.0% AY282951

5949 100.0% AY282952

5949 99.9% AY282953

5949 99.9% AY282954

5949 99.9% AY282955

5949 99.9% AY282956

5949 99.9% AY282957

5949 99.9% AY282958

5949 99.9% AY282959

5949 100.0% AY282960

5949 99.9% AY282961

5949 99.9% AY282962

5949 99.9% AY282963

5949 99.9% AY282964

5949 99.9% AY282965

5949 99.9% AY282966

5949 99.9% AY282967

5949 99.9% AY282968

5949 99.9% AY282969

5949 100.0% AY282970

5949 100.0% AY282971

5949 99.9% AY282972

5953 99.9% AY282973

5949 99.9% AY282974

5949 99.9% AY282975

5949 99.9% AY282976

5949 99.9% AY282977

5949 99.9% AY282978

5949 99.9% AY282979

5949 99.9% AY282980

5949 99.9% AY282981

5949 99.9% AY282982

5949 100.0% AY282983

5949 99.9% AY282984

5949 100.0% AY282985

5949 99.9% AY282986

5949 100.0% AY282987

5949 100.0% AY282988

5949 99.9% AY282989

5949 99.9% AY282990

5949 99.9% AY282991

5949 99.9% AY282992

5949 99.9% AY282993

5949 99.9% AY282994

5949 99.9% AY282995

5949 99.9% AY282996

5949 99.9% AY282997

5949 99.9% AY282998

5949 99.9% AY282999

5949 99.9% AY283000

5949 99.9% AY283001

5949 99.9% AY283002

5949 99.9% AY283003

5949 99.9% AY283004

5949 99.9% AY283005

5949 99.9% AY283006

5949 99.9% AY283007

5949 99.9% AY283008

5949 99.9% AY283009

5949 99.9% AY283010

5949 99.9% AY283011

5949 99.9% AY283012

5949 100.0% AY283013

5949 100.0% AY283014

5949 100.0% AY283015

5949 99.9% AY283016

5949 99.9% AY283017

5949 100.0% AY283018

5967 99.8% DQ642845

P. fieldi (OW monkey, Asian) 5983 89.0% Y AB354574

5988 89.0% AB444132

5987 89.0% AB444133

P. floridense (lizard) 6002 88.3% Y EF079654 (NC_009961)

P. fragile (OW monkey, Asian) 5977 88.7% AB444134

5978 88.8% AB444135

5977 88.7% AB444136

5977 88.7% Y AY722799 (NC_012369)

P. gallinaceum (bird) 6003 88.3% Y AB250690 (NC_008288)

6002 88.1% AB599930

P. gonderi (OW monkey, African) 5989 89.1% AB434918

5900* 87.7% Y AY800111

P. hylobati (OW monkey, Asian) 5973 88.3% Y AB354573

P. inui (OW monkey, Asian) 5972 88.0% AB354572

5976 88.5% AB444109

5981 88.4% AB444110

5976 88.2% AB444111

5980 88.4% AB444112

5976 88.2% AB444113

5981 88.2% AB444114

5971 88.0% AB444115

5973 87.9% AB444116

5976 88.2% AB444117

5972 87.9% AB444118

5979 88.4% AB444119

5976 88.5% AB444120

5972 87.9% Y HM032052

P. juxtanucleare (bird) 6014 87.8% Y AB250415

P. knowlesi (OW monkey, Asian; human^5) 5957 88.4% AB444106

5958 88.4% AB444107

5958 88.3% AB444108

5957 88.4% Y AY722797 (NC_007232)

P. malariae (human) 5968 88.8% Y AB354570

5969 88.7% AB489192

5969 88.7% AB489193

5968 88.7% AB489194

5856 87.0% GQ355485

5856 87.0% GQ355486

P. mexicanum (lizard) 5991 87.9% AB375765

5991 87.9% Y EF079653 (NC_009960)

P. ovale (human) 5974 89.1% Y AB354571

P. reichenowi (chimpanzee) 5966 97.4% Y AJ251941 (NC_002235)

P. relictum (bird) 5996 88.4% AY733088 (NC_012426)

6003 87.8% AY733089

5997 88.8% Y AY733090

P. simiovale (OW monkey, Asian) 5987 89.0% AB434920

5898* 87.6% Y AY800109

P. simium (NW monkey) 5990 88.9% Y AY722798 (NC_007233)

5899 87.5% AY800110

P. vinckei vinckei (rodent) 5948 87.5% AB599931

P. sp. DAJ-2004 (OW monkey, African) 5896* 87.3% Y AY800112

P. vivax (human) 5991 88.9% AY598035

5990 88.9% AY598036

5991 88.9% AY598037

5991 88.9% AY598038

5991 88.9% AY598039

5991 88.9% AY598040

5994 88.9% AY598041

5989 88.9% AY598042

5990 88.9% AY598043

5991 88.9% AY598044

5991 88.9% AY598045

5989 88.9% AY598046

5989 88.9% AY598047

5992 88.9% AY598048

5991 88.9% AY598049

5989 88.9% AY598050

5990 88.9% Y AY598140 (NC_007243)

P. yoelii (rodent) 5956 88.4% Y M29000 M21313 M33978 (MALPY00209 (TIGR))

Other Haemosporida

H. sp. jb1.JA27 (bird) 5970 86.6% AY733086 (NC_012423)

H. sp. jb2.SEW5141 (bird) 5970 86.8% Y AY733087 (NC_012425)

H. columbae (bird) 5988 83.3% Y FJ168562 (NC_012448)

Hepatocystis sp. (bat) 6259 86.3% Y FJ168565

L. caulleryi (bird) 5959 85.9% Y AB302215

L. fringillinarum (bird) 5992 83.2% Y FJ168564 (NC_012451)

L. majoris (bird) 6684 83.5% Y FJ168563 (NC_012450)

L. sabrazesi (bird) 5935 82.6% Y AB299369 (NC_009336)

Parahaemoproteus vireonis (bird) 5893* 86.6% Y FJ168561 (NC_012447)

Eimeria (Coccidia)

Eimeria tenella (bird) 6213 NA AB564272

Babesia and Theileria (Piroplasmida)

B. bovis (cow) 5970 NA AB499088

6005 NA EU075182

B. caballi (horse) 5847 NA AB499086

B. gibsoni (dog) 5865 NA AB499087

T. annulata (cow) 5905 NA CR940346 (NT_167255)

T. equi (horse) 8246 NA AB499091

T. orientalis (cow) 5957 NA AB499090

T. parva (cow) 5924 NA AB499089

5895 NA Z23263 (011005)

1 P., Plasmodium; H., Haemoproteus; L., Leucocytozoon; B., Babesia; T., Theileria.

2 OW, Old World; NW, New World.

3 * mt genome sequence incomplete; presumed to reflect PCR primer choice.

4 Complete (or nearly complete) mitochondrial genomes of 25 Plasmodium species, one Plasmodium isolate, and eight related hemosporidian species used for initial alignment.

5 P. knowlesi, previously considered only a monkey parasite, has recently been found to cause human infections as well [86].

Table S2. Intergenic size variation among hemosporidian mt genomes.

Organism^a   RNA23t> <LSUC^b   RNA10< <cox3^c   cox3< >LSUF   LSUF> >SSUE   SSUE> >RNA2   RNA3> <SSUA   SSUA< >cox1   cox1^c> >cob^c   RNA4> >RNA5   RNA5> >RNA6   RNA12> >RNA18   RNA8< >RNA16

P. falciparum 0 9 28 7 16 5 13 20 19 5 18 15

P. reichenowi 0 9 28 7 16 5 13 20 19 5 18 15

P. billcollinsi 1 9 34 7 16 5 13 20 19 5 18 15

P. billbrayi 0 9 38 7 15 5 13 20 19 4 18 14

P. gallinaceum 1 10 32 10 17 5 17 14 19 7 25 15

P. relictum 0 10 32 10 14 5 17 14 19 7 25 15

P. juxtanucleare 7 11 31 15 18 5 18 14 18 7 24 15

P. floridense 0 10 31 9 17 5 17 15 19 0 25 18

P. mexicanum 1 10 33 12 16 5 13 14 19 0 18 18

P. ovale 1 9 30 11 18 5 17 7 19 4 20 9

P. malariae 1 9 31 7 18 5 19 14 19 4 20 9

P. gonderi 0 8 30 11 18 5 28 27 19 1 19 16

P. sp. DAJ-2004 2 6 30 11 17 5 29 24 19 4 19 15

P. fragile 2 6 30 11 18 5 26 21 20 4 20 10

P. knowlesi 2 6 30 11 18 5 14 18 19 4 20 6

P. coatneyi 2 6 30 11 18 5 32 21 18 3 20 6

P. inui 2 6 30 11 17 5 17 20 29 4 20 11

P. hylobati 2 6 31 11 17 5 24 21 29 4 20 9

P. vivax 2 12 30 11 18 5 24 21 19 4 20 11

P. simium 2 12 30 11 18 5 31 21 19 4 20 11

P. simiovale 2 8 30 11 18 5 28 17 19 4 20 11

P. fieldi 2 8 32 11 18 5 25 13 19 4 20 11

P. cynomolgi 2 10 30 11 18 5 23 15 19 4 20 15

P. yoelii 1 6 30 11 16 5 17 19 19 0 22 4

P. berghei 1 6 30 11 16 5 17 19 19 0 22 4

P. chabaudi 1 6 30 11 15 5 17 17 19 0 21 4

Haemoproteus sp. 4 11 32 7 17 5 13 15 0 0 25 16

Parahaemoproteus 4 12 32 7 17 5 17 15 12 0 25 NA

L. caulleryi 2 4 32 4 17 5 14 16 11 0 25 13

L. sabrazesi 2 10 32 4 4 2 13 13 11 0 27 17

L. majoris 4 11 32 6 17 3 17 15 11 4 27 14

L. fringillinarum 4 11 33 6 17 9 17 15 11 10 27 14

H. columbae 3 10 31 7 17 5 18 19 8 0 24 17

Hepatocystis sp. 20 19 62 16 18 18 46 94 24 0 0 28

a Organisms arranged by relatedness of mt genomes from Clustal W alignment, to assist pattern identification. P, Plasmodium; L, Leucocytozoon; H, Haemoproteus.

b Gene pairs flanking a gap of 10 nt or more are shown with carets indicating the direction of transcription. The number of nt in the gap is given, based on the 5’ or 3’ end of the flanking transcripts, as appropriate.

c For these genes, the gap is calculated from the end of the ORF, not the transcript end. This was done because using the transcript end would have required negative numbers for some gaps.
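
The gap arithmetic described in footnotes b and c can be sketched as follows. This is a minimal illustration only: the `Feature` class, its field names, and the coordinates are hypothetical and are not taken from M76611 or from the authors' methods. It shows how a gap is counted between the facing ends of two flanking features, how the caret labels encode transcription direction, and why overlapping features would produce a negative value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A gene or transcript on the genome (hypothetical container)."""
    name: str
    start: int   # 1-based leftmost coordinate, inclusive
    end: int     # 1-based rightmost coordinate, inclusive
    strand: str  # "+" (transcribed left to right) or "-"


def gap_nt(left: Feature, right: Feature) -> int:
    """Nucleotides strictly between two features; negative if they overlap."""
    return right.start - left.end - 1


def pair_label(left: Feature, right: Feature) -> str:
    """Label a flanking pair with carets for transcription direction."""
    def caret(f: Feature) -> str:
        return ">" if f.strand == "+" else "<"
    return f"{left.name}{caret(left)} {caret(right)}{right.name}"


# Made-up coordinates, for illustration only.
gene_a = Feature("geneA", 100, 160, "+")
gene_b = Feature("geneB", 171, 230, "-")
print(pair_label(gene_a, gene_b), gap_nt(gene_a, gene_b))
```

Measuring from an ORF end instead of a transcript end, as footnote c describes, would amount to passing a `Feature` whose `start` and `end` are the ORF boundaries.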

Table S3. Similarity of individual mt genes from Plasmodium species relative to P. falciparum.

gene   rRNA order^a   size (nt)   berghei   billbrayi   billcollinsi   chabaudi   coatneyi   cynomolgi   fieldi   floridense   fragile   gallinaceum   gonderi   hylobati   inui   juxtanucleare   knowlesi   malariae   mandrill^b   mexicanum   ovale   reichenowi   relictum   simiovale   simium   vivax   yoelii   average

RNA14 S1 39 100.0% 97.4% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 92.3% 100.0% 97.4% 97.4% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 99.5%
RNA12 S2 59 89.8% 98.5% 100.0% 86.4% 93.2% 91.5% 93.2% 93.2% 91.5% 96.6% 96.6% 93.2% 89.8% 94.9% 89.8% 91.5% 91.5% 93.2% 93.2% 100.0% 94.9% 91.5% 89.8% 89.8% 89.8% 92.7%
RNA17 S3 40 97.5% 97.6% 97.5% 97.5% 97.5% 100.0% 100.0% 100.0% 95.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 95.0% 100.0% 100.0% 100.0% 97.5% 100.0% 100.0% 100.0% 100.0% 95.0% 98.9%
SSUA S4 108 92.6% 99.1% 99.1% 92.6% 93.5% 93.5% 94.4% 94.4% 94.4% 93.5% 93.5% 94.4% 93.5% 93.5% 93.5% 93.5% 93.5% 94.4% 93.5% 100.0% 93.5% 93.5% 93.5% 93.5% 92.6% 94.3%
RNA8 S5 100 94.0% 99.1% 99.0% 95.0% 96.0% 95.0% 96.0% 91.0% 95.0% 94.0% 97.0% 96.0% 95.0% 94.0% 95.0% 96.0% 96.0% 95.0% 95.0% 99.0% 93.0% 94.0% 95.0% 95.0% 93.0% 95.3%
SSUB S6 116 97.4% 99.2% 99.1% 97.4% 99.1% 97.4% 99.1% 96.6% 96.6% 97.4% 97.4% 99.1% 97.4% 97.4% 97.4% 97.4% 97.4% 97.4% 99.1% 99.1% 97.4% 97.4% 97.4% 97.4% 97.4% 97.8%
RNA19 S7 30 100.0% 96.7% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0%
RNA9 S8 54 94.4% 98.3% 98.1% 94.4% 92.6% 92.6% 94.4% 92.6% 92.6% 94.4% 94.4% 94.4% 90.7% 94.4% 92.6% 96.3% 94.4% 100.0% 94.4% 98.1% 96.3% 96.3% 92.6% 90.7% 98.1% 94.7%
RNA5 S9 87 89.7% 99.0% 95.4% 87.4% 90.8% 88.5% 89.7% 87.4% 88.5% 86.2% 87.4% 88.5% 87.4% 86.2% 89.7% 88.5% 89.7% 85.1% 87.4% 98.9% 87.4% 88.5% 87.4% 89.7% 90.8% 89.2%
SSUD S10 68 98.5% 98.6% 100.0% 98.5% 95.6% 95.6% 95.6% 95.6% 95.6% 92.6% 95.6% 97.1% 94.1% 94.1% 95.6% 97.1% 95.6% 94.1% 97.1% 100.0% 94.1% 95.6% 95.6% 95.6% 97.1% 96.1%
SSUE S11 42 100.0% 97.6% 100.0% 100.0% 100.0% 100.0% 100.0% 97.6% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 97.6% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 99.8%
SSUF S12 61 98.4% 98.4% 100.0% 98.4% 100.0% 100.0% 100.0% 100.0% 100.0% 98.4% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 98.4% 100.0% 100.0% 100.0% 100.0% 98.4% 99.5%
average SSU rRNA 804 95.3% 97.8% 94.9% 96.0% 98.9% 95.4% 96.3% 94.8% 95.1% 96.3% 95.1% 96.0% 94.5% 95.3% 95.3% 95.5% 95.8% 95.5% 95.9% 99.3% 95.4% 95.5% 95.1% 95.3% 95.3% 95.8%
LSUA L1 176 97.7% 99.4% 99.3% 97.2% 96.6% 85.2% 97.7% 98.9% 97.7% 97.7% 85.2% 98.3% 97.2% 98.9% 97.2% 98.3% 99.3% 97.7% 98.3% 97.7% 98.9% 85.2% 97.2% 97.2% 97.7% 98.0%
RNA2 L2 67 92.5% 98.6% 98.5% 94.0% 94.0% 92.5% 95.5% 92.5% 94.0% 94.0% 92.5% 95.5% 92.5% 92.5% 91.0% 95.5% 92.5% 92.5% 95.5% 100.0% 92.5% 94.0% 94.0% 92.5% 92.5% 94.1%
LSUB L3 29 100.0% 96.6% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 96.6% 96.6% 93.1% 100.0% 96.6% 100.0% 96.6% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 99.2%
LSUC L4 22 95.5% 95.7% 90.9% 95.5% 95.5% 95.5% 95.5% 86.4% 86.4% 90.9% 95.5% 95.5% 100.0% 90.9% 90.9% 100.0% 100.0% 90.9% 90.9% 100.0% 90.9% 95.5% 95.5% 95.5% 95.5% 94.2%
RNA11 L5 57 91.2% 98.4% 98.2% 94.7% 89.5% 91.2% 93.0% 91.2% 91.2% 93.0% 93.0% 91.2% 91.2% 100.0% 89.5% 91.2% 89.5% 94.7% 93.0% 98.2% 91.2% 87.7% 87.7% 91.2% 91.2% 92.5%
RNA1 L6 88 96.6% 98.9% 98.9% 96.6% 98.9% 96.6% 97.7% 93.2% 96.6% 95.5% 96.6% 98.9% 95.5% 98.9% 97.7% 95.5% 94.3% 95.5% 96.6% 100.0% 98.9% 96.6% 96.6% 96.6% 95.5% 96.9%
RNA3 L7 81 95.1% 98.8% 100.0% 93.8% 95.1% 96.3% 96.3% 93.8% 93.8% 97.5% 96.3% 95.1% 95.1% 97.5% 96.3% 96.3% 95.1% 95.1% 96.3% 100.0% 97.5% 96.3% 96.3% 96.3% 95.1% 96.1%
LSUD L8 83 97.6% 98.8% 100.0% 97.6% 97.6% 97.6% 97.6% 97.6% 97.6% 97.6% 98.8% 97.6% 97.6% 97.6% 98.8% 97.6% 98.8% 97.6% 97.6% 100.0% 97.6% 97.6% 97.6% 97.6% 97.6% 98.0%
LSUE L9 195 98.5% 99.5% 100.0% 99.5% 98.5% 98.5% 99.0% 97.4% 99.0% 97.9% 99.0% 99.0% 98.5% 97.9% 99.0% 98.5% 99.0% 96.9% 99.0% 100.0% 97.4% 98.5% 98.5% 99.0% 99.0% 98.7%
RNA13 L10 30 96.7% 96.8% 100.0% 96.7% 100.0% 100.0% 100.0% 96.7% 96.7% 96.7% 96.7% 100.0% 100.0% 100.0% 96.7% 100.0% 96.7% 96.7% 100.0% 100.0% 96.7% 96.7% 100.0% 96.7% 96.7% 98.3%
LSUF L11 115 98.3% 99.1% 100.0% 97.4% 99.1% 98.3% 99.1% 93.9% 98.3% 96.5% 99.1% 98.3% 98.3% 97.4% 98.3% 97.4% 100.0% 97.4% 98.3% 100.0% 97.4% 98.3% 98.3% 98.3% 98.3% 98.2%
LSUG L12 107 98.1% 99.1% 99.1% 97.2% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 98.1% 99.7%
RNA10 L13 100 82.0% 99.2% 100.0% 79.0% 87.0% 83.0% 90.0% 85.0% 78.0% 83.0% 82.0% 91.0% 88.0% 82.0% 78.0% 85.0% 88.0% 86.0% 85.0% 98.0% 83.0% 83.0% 83.0% 83.0% 82.0% 89.1%
RNA18 L14 25 100.0% 96.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0%
RNA6 L15 58 96.6% 98.3% 100.0% 96.6% 96.6% 96.6% 98.3% 100.0% 98.3% 98.3% 98.3% 98.3% 98.3% 96.6% 98.3% 98.3% 98.3% 98.3% 98.3% 100.0% 100.0% 98.3% 96.6% 96.6% 96.6% 98.1%
average LSU rRNA 1233 96.3% 99.2% 96.2% 96.5% 99.4% 96.8% 97.3% 95.5% 96.4% 97.2% 96.4% 97.1% 96.4% 97.1% 96.3% 96.5% 96.9% 95.9% 96.8% 99.5% 96.6% 96.8% 96.6% 96.7% 96.3% 96.9%
average mt rRNAs 2037 95.9% 98.6% 95.7% 96.3% 99.2% 96.2% 96.9% 95.2% 95.9% 96.9% 95.9% 96.7% 95.7% 96.4% 95.9% 96.1% 96.5% 95.8% 96.4% 99.4% 96.1% 96.3% 96.0% 96.1% 95.9% 96.5%
RNA4 unassigned 72 86.1% 96.9% 98.6% 83.3% 87.5% 86.1% 86.1% 90.3% 84.7% 90.3% 86.1% 86.1% 84.7% 88.9% 76.4% 84.7% 86.1% 88.9% 86.1% 100.0% 90.3% 87.5% 84.7% 84.7% 86.1% 87.7%
RNA7 unassigned 82 97.6% 97.6% 100.0% 96.3% 96.3% ND^c 97.6% 96.3% 97.6% 96.3% ND^c 97.6% 96.3% 96.3% 96.3% 97.6% ND^c 95.1% 96.3% 100.0% 96.3% ND^c 98.8% 98.8% 97.6% 97.3%
RNA15 unassigned 31 100.0% 97.3% 100.0% 100.0% 87.1% 100.0% 87.1% 96.8% 100.0% 100.0% 100.0% 87.1% 100.0% 100.0% 100.0% 83.9% 100.0% 100.0% 87.1% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 97.2%
RNA16 unassigned 31 96.8% 97.4% 100.0% 96.8% 96.8% 96.8% 96.8% 96.8% 93.5% 96.8% 96.8% 96.8% 96.8% 96.8% 96.8% 96.8% 96.8% 90.3% 96.8% 100.0% 96.8% 96.8% 96.8% 96.8% 96.8% 96.8%
RNA20 unassigned 38 92.1% 100.0% 100.0% 92.1% 97.4% 97.4% 97.4% 100.0% 97.4% 97.4% 97.4% 97.4% 100.0% 97.4% 97.4% 97.4% 100.0% 100.0% 94.7% 100.0% 97.4% 97.4% 97.4% 97.4% 92.1% 97.4%
RNA21 unassigned 36 97.2% 100.0% 100.0% 97.2% 100.0% 97.2% 100.0% 97.2% 100.0% 94.4% 97.2% 100.0% 100.0% 91.7% 94.4% 100.0% 100.0% 97.2% 100.0% 100.0% 94.4% 97.2% 97.2% 97.2% 97.2% 97.9%
RNA22 unassigned 38 97.4% 100.0% 97.4% 94.7% 94.7% 94.7% 94.7% 94.7% 92.1% 94.7% 92.1% 94.7% 94.7% 92.1% 94.7% 92.1% 94.7% 100.0% 100.0% 97.4% 100.0% 94.7% 94.7% 94.7% 97.4% 95.5%
total unassigned RNAs 328 94.8% 99.1% 93.9% 93.9% 99.4% 95.4% 93.9% 95.4% 95.1% 93.9% 96.3% 96.3% 94.8% 96.0% 93.3% 93.0% 96.0% 95.1% 93.9% 99.7% 96.3% 96.6% 95.7% 94.8% 94.8% 95.5%
total small mt RNAs 2365 95.7% 98.7% 95.4% 96.0% 99.2% 96.1% 96.5% 95.2% 95.8% 96.4% 95.9% 96.6% 95.6% 96.3% 95.5% 95.7% 96.4% 95.7% 96.1% 99.5% 96.2% 96.4% 96.0% 95.9% 95.7% 96.3%
COB^d identical 376 89% 98% 99% 90% 91% 91% 91% 83% 91% 83% 91% 91% 88% 83% 91% 91% 90% 82% 91% 100% 83% 91% 91% 91% 89% 90%
COB^d conserved 96% 99% 100% 97% 97% 97% 97% 91% 97% 92% 97% 97% 96% 91% 97% 97% 97% 92% 96% 100% 92% 97% 97% 97% 96% 96%
COX1^d identical 477^e 93% 96% 98% 92% 94% 94% 94% 94% 94% 94% 93% 94% 93% 94% 94% 92% 93% 94% 95% 99% 94% 94% 94% 94% 92% 94%
COX1^d conserved 98% 99% 99% 97% 98% 98% 98% 98% 98% 97% 98% 98% 98% 98% 98% 98% 98% 98% 99% 100% 98% 98% 98% 98% 98% 98%
COX3^d identical 250^e 84% 86% 88% 83% 84% 84% 86% 84% 85% 84% 84% 83% 84% 82% 85% 84% 83% 83% 81% 94% 84% 86% 84% 84% 84% 85%
COX3^d conserved 91% 96% 97% 91% 92% 92% 92% 92% 93% 93% 92% 92% 92% 92% 92% 92% 92% 92% 91% 98% 93% 93% 93% 93% 91% 93%

a Linear order of rRNA fragments, based on similarity to conventional rRNAs.

b Sample collected from a mandrill, species not identified.

c ND, not done. RNA7 similarity was not calculated for species from which its sequence is largely missing, presumably due to placement of PCR primers.

d Amino acid sequence similarity is shown. We calculated similarity on the basis of identity and also allowing for conservative amino acid replacements.

e P. falciparum cox1 and cox3 lack conventional initiation codons and are presumed to use non-standard initiation codons [1], common for mitochondrial proteins.
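
Many of the RNA percentages in Table S3 are consistent with a simple per-position identity over the fragment length; for example, 38 matching positions out of 39 gives 97.4% for a 39 nt fragment. The sketch below illustrates that calculation under stated assumptions: the two sequences are already aligned and of equal length, the function name `percent_identity` and the toy sequences are hypothetical, and the way gaps or missing positions were actually handled is not stated in this table.

```python
def percent_identity(ref: str, other: str) -> float:
    """Percentage of aligned positions at which two sequences agree."""
    if not ref or len(ref) != len(other):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ref, other))
    return 100.0 * matches / len(ref)


# Toy 39 nt sequence (not a real fragment) with a single substitution.
ref = "ACGU" * 9 + "ACG"
other = ref[:10] + "C" + ref[11:]  # position 11 changed from G to C
print(f"{percent_identity(ref, other):.1f}%")
```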

Table S4. Oligonucleotides used in this study.

Analysis   Gene   RACE   Primer Sequence   Position^a

RACE Analysis SSUB 5’ cDNA GTGTTCCACCGCTAGTGTTTGC 431-452

5’ PCR CCACTTGCTTATAACTGTATGGACG 463-487

SSUE 5’ CCAACAACATAACATTTTTTAGTCCCA 1794-1820

RNA1 3’ GCTGACTTGAGTAATGATA 559-578 rc

RNA4 3’ CATTTCTGAGTATTGAGCGGAAC 4647-4669

RNA6 5’ cDNA CTTGCCAACTCCCTATCATGTC 4839-4860 rc

5’ PCR GTCTTGCTAACGGCTTGTACGG 4820-4841 rc

3’ CCGTACAAGCCGTTAGCAAGACA 4820-4843

RNA9 5’ CAATCAAATTGGATGGTGTTGGC 83-105 rc

3’ CCTAATTTACGGGTCGGTTGTGG 69-91 rc

RNA10 3’ GTACGAATAGACAATTGTGTTCATAGCTA 664-691

RNA11 5’ TTCAATTCGTACTTCCACTACCAG 5279-5302

3’ CATCGATATACGGATTTCTCCTG 5348-5370 rc

RNA12 3’ GGGATATTTGTAGTACACCTTGATTGG 4911-4937

RNA13 3’ GGGAAGTTTAGCCAGGAAGTCAGC 5001-5024 rc

RNA14 5’ CGAGTCGATCAGGAAGGTTTC 5519-5539

3’ GATGAAACCTTCCTGATCGACTCG 5519-5542 rc

RNA15 3’ CACACTTCCCTTCTCGCC 606-623

RNA16 5’ cDNA GGATGGTGTTGGCTGGGC 78-95 rc

5’ PCR GGTATCTCGTAATGTAGAACAA 9-30

3’ AAGCTTTTGGTATCTCGTAATGTAG 1-25

RNA17 3’ CTGTGTTACAAATTTTTGATCCCAGG 114-139

RNA18 3’ CGGTATTGCATGCCTGGTG 4968-4986

RNA19 3’ GTTCTTATGTGTTGGCATGG 5557-5576 rc

RNA20 5’ cDNA GGATGGTGTTGGCTGGGC 78-95 rc

5’ PCR GAAAAGGATTTGACGGTCAACTC 35-57 rc

3’ TGAGTTGACCGTCAAATCCTTTTCA 34-58

RNA21b 3’ GCATGGGACTAAAAAATGTTATGTTGTTG 1791-1820

5’ CCAACAACATAACATTTTTTAGTCCCA 1794-1821 rc

RNA22 5’ CAGGAGAAATCCGTATATCGATG 5348-5370

3’ CATCGATATACGGATTTCTCCTG 5348-5370 rc

Primer Extension LSUB TGATTACAGCTCCCAAGCAAAC 4594-4615

LSUC GAGCTATGACGCTATC 206-221

LSUF GAGCTCTATATATACTATAACC 1553-1574 rc

RNA6 GTCTTGCTAACGGCTTGTACGG 4820-4842 rc

RNA11 CAGGAGAAATCCGTATATCGATG 5348-5370

RNA12 CAATCAAGGTGTACTACAAATATC 4913-4936 rc

RNA13 GACGCTGACTTCCTGGCT 4998-5015

RNA14/RNA15 GTTTCTTTTACCTCACGAGTCGATC 5504-5528

RNA Blots LSUB TGATTACAGCTCCCAAGCAAAC 4594-4615

LSUC GAGCTATGACGCTATC 206-221

LSUF GAGCTCTATATATACTATAACC 1553-1574 rc

RNA12 CAATCAAGGTGTACTACAAATATC 4913-4936 rc

RNA13 GACGCTGACTTCCTGGCT 4998-5015

RNA14 GTTTCTTTTACCTCACGAGTCGATC 5504-5528

RNA15 TTCTATGGAAACACACTTC 595-613

RNA16 GTTCTACATTACGAGATACCAAAAGC 3-28 rc

a Position is given per GenBank entry M76611. rc, reverse complement.

b The 5’ RACE primer for RNA21 overlapped the transcript start but generated products which correspond to the ends of upstream RNAs.
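
The "rc" annotation can be checked directly against the primer pairs listed above. The following sketch uses a hypothetical helper, `reverse_complement`, written for illustration; it confirms that the two RNA22 primers, both listed at positions 5348-5370, are reverse complements of each other.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# RNA22 primers as listed in Table S4.
rna22_5p = "CAGGAGAAATCCGTATATCGATG"  # 5348-5370
rna22_3p = "CATCGATATACGGATTTCTCCTG"  # 5348-5370 rc

print(reverse_complement(rna22_3p) == rna22_5p)
```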

Table S5. Plasmodium falciparum mt rRNA 3’ end sequences

Fragment Sequence

SSUA GTATTATCCA TCCATGTCAG GCGTTAAAAG CGTTCGTTCT TATAGTGTAG

GUAUUAUCCA UCCAUGUCAG GCGUUAAAAG CGUUCGUUCU AAAAAAA

GUAUUAUCCA UCCAUGUCAG GCGUUAAAAG CGUUCGUUCU AAAAAAAAAAAAA (2)

GUAUUAUCCA UCCAUGUCAG GCGUUAAAAG CGUUCGUUCU AAAAAAAAAAAAAAAAAA

SSUB TTGTTTCATT TGATAGTAAA CACTATACCT TACCAATCTA TTTGAACTTG

UUGUUUCAUU UGAUAGUAAA CACUAUACCU UACCAAUCUA (4)

SSUD GACGTCAGGA AGTCCTGGAC GTTGAATCCA ATAGCATTGA TTAAAAGACA

GACGUCAGGA AGUCCUGGAC GUUGAAUCCA AUAGCAUUGA AAA

GACGUCAGGA AGUCCUGGAC GUUGAAUCCA AUAGCAUUGA AA

GACGUCAGGA AGUCCUGGAC GUUGAAUCCA AUAGCAUUGA

SSUF AACATGGTAG TTGACAGTGA ACTTGTAGCT GAACCAAAAA TGGCTGCTGG

AACAUGGUAG UUGACAGUGA ACUUGUAGCU GAACCAAAAA

AACAUGGUAG UUGACAGUGA ACUUGUAGCU GAACCAAAAA AAA

AACAUGGUAG UUGACAGUGA ACUUGUAGCU GAACCAAAAA AAAAAA (2)

LSUA TAGACGGTTT TCTGCGAAAT CTATTTGGAA GATATATCAT TGGGAAGTTT

UAGACGGUUU UCUGCGAAAU CUAUUUGGAA GAUAUAUCAU AAAAAAAAAAAAA

LSUD TGATGAATAT TTCAAGTTAC TGACATCTGC CCGGCATCAA TGATAAACGG

UGAUGAAUAU UUCAAGUUAC UGACAUCUGC CCGGCAUCAA AA

UGAUGAAUAU UUCAAGUUAC UGACAUCUGC CCGGCAUCAA AAA

LSUE GCCCGACGGT AAGACCCTGA GCACCTTAAC TTCCCTAAAA GTTCTTATGT

GCCCGACGGU AAGACCCUGA GCACCUUAAC UUCCCUAAAA AAAAAAAAAAAAAAA

LSUF CTATTGGCAC CTCCATGTCG TCTCATCGCA GCCTTGCAAT AAATAATATC

CUAUUGGCAC CUCCAUGUCG UCUCAUCGCA GCCUUGCAAU AAAAAA

CUAUUGGCAC CUCCAUGUCG UCUCAUCGCA GCCUUGCAAU AAAAAAAA

CUAUUGGCAC CUCCAUGUCG UCUCAUCGCA GCCUUGCAAU AAAAAAAAA

LSUG AGAACGTCTT GAGGCAGTTT GTTCCCTATC TACCGTTTTA TCTTTGCATG

AGAACGUCUU GAGGCAGUUU GUUCCCUAUC UACCGUUUUA AAAAAAAAAAA

AGAACGUCUU GAGGCAGUUU GUUCCCUAUC UACCGUUUUA AAAAAAAAAAAA

RNA1 AGCTATCCAT AGTTAATTGA TTCCGTTTTG ACCGGTCATT TTCTTTGCCT

AGCUAUCCAU AGUUAAUUGA UUCCGUUUUG ACCGGUCAUU AAAA

RNA2 TGAGATGGAA ACAGCCGGAA AGGTAATTTT ACGCCCTTAA CGTAAAGATC

UGAGAUGGAA ACAGCCGGAA AGGUAAUUUU ACGCCCUUAA AAAAAAAAAA

UGAGAUGGAA ACAGCCGGAA AGGUAAUUUU ACGCCCUUAA AAAAAAAAAAAAAAA

UGAGAUGGAA ACAGCCGGAA AGGUAAUUUU ACGCCCUUAA AAAAAAAAAAAAAAAAAAAAA

RNA3 CTGTGAGGAA ACTACATTAA AGGAACTCGA CTGGCCTACA CTATAAGAAG

CUGUGAGGAA ACUACAUUAA AGGAACUCGA CUGGCCUACA AAAAAA (2)

CUGUGAGGAA ACUACAUUAA AGGAACUCGA CUGGCCUACA AAAAAAAA (2)

RNA4 GATTGAGCGG AACAAATCAG ACCGTAAGGT TATAATTATG TACTATGATT

GAUUGAGCGG AACAAAUCAG ACCGUAAGGU UAUAAUUAUG AAA (2)

RNA5 TTACAGTATC AATCGGATTT ACATGCTCAG CCGCCAAAAA CTATAACGAT

UUACAGUAUC AAUCGGAUUU ACAUGCUCAG CCGCCAAAAA AAAAAAAAAA

UUACAGUAUC AAUCGGAUUU ACAUGCUCAG CCGCCAAAAA AAAAAA

RNA6 AAGCCGTTAG CAAGACATGA TAGGGAGTTG GCAAGTTAAA GAAGTTCTG

AAGCCGUUAG CAAGACAUGA UAGGGAGUUG GCAAGUUAAA A

AAGCCGUUAG CAAGACAUGA UAGGGAGUUG GCAAGUUAAA AA (2)

RNA7 AATCGAGAGA GATTCCATTA GTTGTCTCTA TGAATAGTGG TTATAGCCAT

AAUCGAGAGA GAUUCCAUUA GUUGUCUCUA UGAAUAGUGG AAAAAAAAAA

AAUCGAGAGA GAUUCCAUUA GUUGUCUCUA UGAAUAGUGG AAAAAAAAAAAAAAA (2)

RNA8 CCCGGGAAAC CGGCGCTTCC ATTTATAAGA AGTTAAATTA CTGGAAGCGT

CCCGGGAAAC CGGCGCUUCC AUUUAUAAGA AGUUAAAUUA AAAAAAA

CCCGGGAAAC CGGCGCUUCC AUUUAUAAGA AGUUAAAUUA AAAAAAAAAAAAA

CCCGGGAAAC CGGCGCUUCC AUUUAUAAGA AGUUAAAUUA AAAAAAAAAAAAAAAA (3)

CCCGGGAAAC CGGCGCUUCC AUUUAUAAGA AGUUAAAUUA AAAAAAAAAAAAAAAAAA

CCCGGGAAAC CGGCGCUUCC AUUUAUAAGA AGUUAAAUUA AAAAAAAAAAAAAAAAAAA

RNA9 AACACCATCC AATTTGATTG GGAATTATCT GTGTTACAAA TTTTTGATCC

AACACCAUCC AAUUUGAUUG GGAAUUAUCU GUGUUACAAA

AACACCAUCC AAUUUGAUUG GGAAUUAUCU GUGUUACAAA A

AACACCAUCC AAUUUGAUUG GGAAUUAUCU GUGUUACAAA AAA

AACACCAUCC AAUUUGAUUG GGAAUUAUCU GUGUUACAAA AAAAAA

AACACCAUCC AAUUUGAUUG GGAAUUAUCU GUGUUACAAA AAAAAAAAAA

AACACCAUCC AAUUUGAUUG GGAAUUAUCU GUGUUACAAA AAAAAAAAAAAA

RNA10 TAGAGTACGT AAGGAAAAGG AAAGGTTAAC CGCTATCAAA TGGCGAGAAG

UAGAGUACGU AAGGAAAAGG AAAGGUUAAC CGCUAUCAAA AAAAAAAAA

UAGAGUACGU AAGGAAAAGG AAAGGUUAAC CGCUAUCAAA AAAAAAAA

RNA11 TTTAGAACAG GAGAGTATAT TCTGGTAGTG GAAGTACGAA TTGAAGTGGA

UUUAGAACAG GAGAGUAUAU UCUGGUAGUG GAAGUACGAA AAAAAAAAA

RNA12 TGTATGGGAT ATTTGTAGTA CACCTTGATT GGTTTTACTA TTTATACTTA

UGUAUGGGAU AUUUGUAGUA CACCUUGAUU GGUUUUACUA AAAAAAAAAAAA (2)

UGUAUGGGAU AUUUGUAGUA CACCUUGAUU GGUUUUACUA AAAAAAAAAAAAA

UGUAUGGGAU AUUUGUAGUA CACCUUGAUU GGUUUUACUA AAAAAGAAAAAAAAAAAA

UGUAUGGGAU AUUUGUAGUA CACCUUGAUU GGUUUUACUA AAAAAAAAAAAAAAAAAAAAAAAAA

RNA13 GATATATCAT TGGGAAGTTT AGCCAGGAAG TCAGCGTCTA TATTAAAAA

5’ UGGGAAGUUU AGCCAGGAAG UCAGCGUCUA AAAAAAAAAAAAAA

RNA14 TTAAGGATGA AACCTTCCTG ATCGACTCGT GAGGTAAAAG AAACAGTCCG

5’ UUAAGGAUGA AACCUUCCUG AUCGACUCGU GAGGUAAAAG AA

UUAAGGAUGA AACCUUCCUG AUCGACUCGU GAGGUAAAAG AAA

UUAAGGAUGA AACCUUCCUG AUCGACUCGU GAGGUAAAAG AAAAAAAA

UUAAGGAUGA AACCUUCCUG AUCGACUCGU GAGGUAAAAG AAAAAAAAAA

UUAAGGAUGA AACCUUCCUG AUCGACUCGU GAGGUAAAAG AAAAAAAAAAAAAAA

RNA15 GCTATCAAAT GGCGAGAAGG GAAGTGTGTT TCCATAGAAA CCTTGTATAT

5’ U GGCGAGAAGG GAAGUGUGUU UCCAUAGAAA AAAAAAAAA

RNA16 TATAATAAAG CTTTTGGTAT CTCGTAATGT AGAACAATAT TGAGTTGACC

5’ G CUUUUGGUAU CUCGUAAUGU AGAACAAUAU A

G CUUUUGGUAU CUCGUAAUGU AGAACAAUAU AA (3)

G CUUUUGGUAU CUCGUAAUGU AGAACAAUAU AAAA (3)

G CUUUUGGUAU CUCGUAAUGU AGAACAAUAU AAAAAAAAAAAAAA

RNA17 TTTTTGATCC CAGGCTGGTA AAAAATGTAA ACTTTTAGCC CATAAGAATA

UUUUUGAUCC CAGGCUGGUA AAAAAUGUAA ACUUUUAGCC AAAAAAA

RNA18 ATACTTATCG ATAAATGTTC GGTATTGCAT GCCTGGTGTT TTTAATATAG

5’ UGUUC GGUAUUGCAU GCCUGGUGUU AAAAAAAAA (2)

UGUUC GGUAUUGCAU GCCUGGUGUU AAAAAAAAAAA

UGUUC GGUAUUGCAU GCCUGGUGUU AAAAAAAAAAAA

UGUUC GGUAUUGCAU GCCUGGUGUU AAAAAAAAAAAAAAA (2)

RNA19 TTCCCTAAAA GTTCTTATGT GTTGGCATGG TTACGAGATT AAGGATGTTT

5’ GUUCUUAUGU GUUGGCAUGG UUACGAGAUU AAAAAAAA

GUUCUUAUGU GUUGGCAUGG UUACGAGAUU AAAAAAAAA

GUUCUUAUGU GUUGGCAUGG UUACGAGAUU AAAAAAAAAAAAA

GUUCUUAUGU GUUGGCAUGG UUACGAGAUU AAAAAAAAAAAAAAA

GUUCUUAUGU GUUGGCAUGG UUACGAGAUU AAAAAAAAAAAAAAAAAA (2)

RNA20 ATTGAGTTGA CCGTCAAATC CTTTTCATTA AAAGAGTGGA TTAAATGCCC

5’ UGAGUUGA CCGUCAAAUC CUUUUCAUUA AAAGAGUGGA AAAAAAAAAAA

UGAGUUGA CCGUCAAAUC CUUUUCAUUA AAAGAGUGGA AAAAAAAAAAAAAAAAAAAA (2)

RNA21 AGCATGGGAC TAAAAAATGT TATGTTGTTG GTTTAAGCCC TATTACCATA

AGCATGGGAC TAAAAAATGT TATGTTGTTG GTTTAAGCCC AAAAAA

AGCATGGGAC TAAAAAATGT TATGTTGTTG GTTTAAGCCC AAAAAAA (2)

AGCATGGGAC TAAAAAATGT TATGTTGTTG GTTTAAGCCC AAAAAAAA

RNA22 GATTAAAAGA CATCGATATA CGGATTTCTC CTGAAAAAAC GCGAAAAACC

5’ UUAAAAGA CAUCGAUAUA CGGAUUUCUC CUGAAAAAAC AA

The top sequence (in bold italics) for each set of RACE data corresponds to the genomic DNA sequence.

Numbers in parentheses indicate how often that size of tail was found.
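The RACE rows above pair each genomic DNA sequence with the sequenced 3' ends of its transcripts. As a minimal sketch of how the encoded portion of a read might be separated from its added tail, the Python below aligns a read to the genomic sequence and reports what remains. The function names `split_tail` and `is_oligo_a` are hypothetical and are not part of the original analysis. Where the genomic sequence itself continues with A residues, the boundary between encoded and added A's is ambiguous, and this sketch counts such A's as encoded.

```python
def split_tail(genomic_dna, read):
    """Return (offset, encoded_part, tail) for one RACE read.

    Assumption: the read starts somewhere inside the genomic segment and
    matches it exactly until the non-encoded tail begins.
    """
    # Normalise both sequences to RNA letters without the 10-nt spacing.
    genome = genomic_dna.replace(" ", "").upper().replace("T", "U")
    rna = read.replace(" ", "").upper().replace("T", "U")

    best_offset, best_len = 0, 0
    for offset in range(len(genome)):
        n = 0
        # Extend the exact match as far as it goes from this offset.
        while (n < len(rna) and offset + n < len(genome)
               and genome[offset + n] == rna[n]):
            n += 1
        if n > best_len:
            best_offset, best_len = offset, n
    return best_offset, rna[:best_len], rna[best_len:]


def is_oligo_a(tail):
    """True when the tail is non-empty and consists only of A residues."""
    return tail != "" and set(tail) == {"A"}


# RNA16 genomic sequence and one of its reads, copied from the table above.
rna16_genomic = "TATAATAAAG CTTTTGGTAT CTCGTAATGT AGAACAATAT TGAGTTGACC"
rna16_read = "G CUUUUGGUAU CUCGUAAUGU AGAACAAUAU AAAA"

offset, encoded, tail = split_tail(rna16_genomic, rna16_read)
print(offset, len(encoded), tail, is_oligo_a(tail))
```

For RNA16 the genomic sequence continues with T after the encoded 3' end, so the four trailing A's are unambiguously non-encoded. For fragments such as RNA14 or RNA19, where the genomic sequence continues with A's, the tail reported by this sketch is shorter than the run of A's shown in the table.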

Page 52: Gutell 120.plos_one_2012_7_e38320_supplemental_data

Table S6. Abutted P. falciparum mt gene sets

Description Members Size (nt)

RNA16 to RNA17 4 163

RNA8 to RNA13 12 959

cox3 to LSUG 6a 1205

cox1 to RNA4 3 2660

a If RNA23t through RNA27t are included, this run is cox3 to RNA23t, 10 members, 1317 nt.

Page 53: Gutell 120.plos_one_2012_7_e38320_supplemental_data

Table S7. Fragment placement in rRNA alignments for covariation analysis

Fragment^a  rRNA^b  Hemosporidians^c  Coccidian^d  Piroplasms^e
SSUA        SSU     +^f               +            +
SSUB        SSU     +                 +            +
SSUD        SSU     +                 +            +
SSUE        SSU     +                 +            +
SSUF        SSU     +                 +            +
LSUA        LSU     +                 +            +
LSUB        LSU     +                 +            +
LSUC        LSU     +                 +            +
LSUD        LSU     +                 +            +
LSUE        LSU     +                 +            +
LSUF        LSU     +                 +            +
LSUG        LSU     +                 +            +
RNA1        LSU     +                 +            +
RNA2        LSU     +                 +            +
RNA3        LSU     +                 +            +
RNA5        SSU     +
RNA6        LSU     +                              +
RNA8        SSU     +                 +            +
RNA9        SSU     +                              +
RNA10       LSU     +                 +            +
RNA11       LSU     +                              +
RNA12       SSU     +
RNA13       LSU     +                 +            +
RNA14       SSU     +                              +
RNA17       SSU     +                 +            +
RNA18       LSU     +                 +            +
RNA19       SSU     +
COUNTS              27                20           24

a Gene names are as defined in Table 1.

b LSU, large subunit rRNA; SSU, small subunit rRNA. Fragments not assigned: RNA4, RNA7, RNA15, RNA16, RNA20, RNA21, RNA22, RNA23t, RNA24t, RNA25t, RNA26t, RNA27t.

c RNAs from Plasmodium and related species that were used for covariation analysis.

d RNAs from Eimeria tenella that were used for covariation analysis.

e RNAs from Babesia and Theileria that were used for covariation analysis.

f +, present; blank if absent.

Page 54: Gutell 120.plos_one_2012_7_e38320_supplemental_data

Table S8. Prospective covariations found in fragmented apicomplexan mt rRNA genes.

rRNA^a  Pair^b (Fragments^c)  Category^d  Hemosporidian^e  + Coccidian^f  + Piroplasm^g

LSU 581,1259

(LSUA,RNA1) 2

ug=( 16, 16.00, 94.1%)^h,i ag=( 1, 1.00, 5.9%)

ug=( 16, 16.00, 94.1%)^j ag=( 1, 1.00, 5.9%)

ug=( 16, 12.36, 72.7%) aa=( 5, 1.36, 22.7%) ag=( 1, 4.64, 4.5%)

LSU 583,1257

(LSUA,RNA1) 2

ac=(192, 192.00, 97.0%) gc=( 6, 6.00, 3.0%)

ac=(192, 192.00, 97.0%) gc=( 6, 6.00, 3.0%)

ac=(192, 184.54, 93.2%) uu=( 8, 0.31, 3.9%) gc=( 6, 5.77, 2.9%)

LSU 589,668 (LSUA)

2 cg=(198, 198.00, 100.0%) cg=(199, 199.00, 100.0%) cg=(199, 191.31, 96.1%)

au=( 5, 0.19, 2.4%) gu=( 3, 0.12, 1.4%)

LSU 590,667 (LSUA)

2 au=(198, 198.00, 100.0%)

au=(198, 198.00, 99.5%) aa=( 1, 1.00, 0.5%)

au=(198, 190.35, 95.7%) ua=( 8, 0.35, 3.9%) aa=( 1, 8.65, 0.5%)

LSU 591,666 (LSUA)

2 ua=(198, 198.00, 100.0%)

ua=(198, 198.00, 99.5%) aa=( 1, 1.00, 0.5%)

ua=(198, 190.35, 95.7%) au=( 8, 0.35, 3.9%) aa=( 1, 8.65, 0.5%)

LSU 672,808 (LSUA)

1 ug=(198, 198.00, 100.0%) ug=(199, 199.00, 100.0%) ug=(199, 191.31, 96.1%)

au=( 8, 0.31, 3.9%)

LSU 679,798 (LSUA)

2 cg=(200, 200.00, 98.5%) -g=( 3, 3.00, 1.5%)

cg=(201, 201.00, 98.5%) -g=( 3, 3.00, 1.5%)

cg=(201, 193.42, 94.8%) aa=( 8, 0.30, 3.8%) -g=( 3, 2.89, 1.4%)

LSU 680,797 (LSUA)

2 au=(200, 200.00, 98.5%) -u=( 3, 3.00, 1.5%)

au=(201, 201.00, 98.5%) -u=( 3, 3.00, 1.5%)

au=(201, 193.42, 94.8%) ua=( 8, 0.30, 3.8%) -u=( 3, 2.89, 1.4%)

LSU 693,769 (LSUA)

2

au=(200, 200.00, 98.5%) -u=( 3, 3.00, 1.5%)

au=(200, 200.00, 98.0%) -u=( 3, 3.00, 1.5%) uu=( 1, 1.00, 0.5%)

au=(200, 192.45, 94.3%) ua=( 8, 0.34, 3.8%) -u=( 3, 2.89, 1.4%) uu=( 1, 8.66, 0.5%)

LSU 694,768 (LSUA)

2

cg=(200, 200.00, 98.5%) -g=( 3, 3.00, 1.5%)

cg=(200, 200.00, 98.0%) -g=( 3, 3.00, 1.5%) ag=( 1, 1.00, 0.5%)

cg=(200, 192.45, 94.3%) ua=( 8, 0.30, 3.8%) -g=( 3, 2.89, 1.4%) ag=( 1, 0.96, 0.5%)

LSU 695,767 (LSUA)

2

gc=(200, 200.00, 98.5%) -c=( 3, 3.00, 1.5%)

gc=(200, 200.00, 98.0%) -c=( 3, 3.00, 1.5%) ac=( 1, 1.00, 0.5%)

gc=(200, 192.45, 94.3%) au=( 8, 0.34, 3.8%) -c=( 3, 2.89, 1.4%) ac=( 1, 8.66, 0.5%)

LSU 696,766 (LSUA)

2 au=(200, 200.00, 98.5%) -u=( 3, 3.00, 1.5%)

au=(201, 201.00, 98.5%) -u=( 3, 3.00, 1.5%)

au=(201, 193.42, 94.8%) ua=( 8, 0.30, 3.8%) -u=( 3, 2.89, 1.4%)

LSU 697,765 (LSUA)

1 au=(203, 203.00, 100.0%) au=(204, 204.00, 100.0%) au=(204, 196.30, 96.2%)

ua=( 8, 0.30, 3.8%)

LSU 698,763 (LSUA)

2

cg=(203, 203.00, 97.1%) ug=( 6, 6.00, 2.9%)

cg=(204, 204.00, 97.1%) ug=( 6, 6.00, 2.9%)

cg=(204, 198.39, 93.6%) ug=( 8, 8.75, 3.7%) au=( 5, 0.11, 2.3%) ua=( 1, 0.04, 0.5%)

LSU 700,732 (LSUA)

2

gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(210, 205.19, 96.3%) au=( 5, 0.19, 2.3%) ac=( 2, 6.81, 0.9%) gu=( 1, 5.81, 0.5%)

LSU 701,731 (LSUA)

1 gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(210, 202.29, 96.3%)

ua=( 8, 0.29, 3.7%)

LSU 741,756 (LSUA)

1 gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(210, 202.29, 96.3%)

cg=( 8, 0.29, 3.7%)

LSU 744,753 (LSUA)

1 ua=(209, 209.00, 100.0%) ua=(210, 210.00, 100.0%) ua=(210, 202.29, 96.3%)

cg=( 8, 0.29, 3.7%)

LSU 779,785 (LSUA)

1 ug=(209, 209.00, 100.0%) ug=(210, 210.00, 100.0%) ug=(210, 202.29, 96.3%)

au=( 8, 0.29, 3.7%)

LSU 824,833 (LSUA)

1 ua=(208, 207.00, 99.5%) au=( 1, 0.00, 0.5%)

ua=(209, 208.00, 99.5%) au=( 1, 0.00, 0.5%)

ua=(217, 216.00, 99.5%) au=( 1, 0.00, 0.5%)

LSU 946,971 (RNA2)

1 au=(209, 209.00, 100.0%)

au=(209, 208.00, 99.5%) gc=( 1, 0.00, 0.5%)

au=(209, 200.37, 95.9%) gc=( 9, 0.37, 4.1%)

LSU 950,967 (RNA2)

1 cg=(208, 207.00, 99.5%) ua=( 1, 0.00, 0.5%)

cg=(209, 208.00, 99.5%) ua=( 1, 0.00, 0.5%)

cg=(209, 200.37, 95.9%) au=( 8, 0.29, 3.7%) ua=( 1, 0.00, 0.5%)

LSU 953,964 (RNA2)

2 gg=(209, 209.00, 100.0%)

gg=(209, 209.00, 99.5%) ug=( 1, 1.00, 0.5%)

gg=(209, 201.33, 95.9%) ua=( 8, 0.33, 3.7%) ug=( 1, 8.67, 0.5%)

LSU 976,987 (RNA2)

1 ug=(209, 209.00, 100.0%) ug=(210, 210.00, 100.0%) ug=(210, 202.29, 96.3%)

au=( 6, 0.17, 2.8%) gc=( 2, 0.02, 0.9%)

LSU 977,986 (RNA2)

1 gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(210, 202.29, 96.3%)

cg=( 8, 0.29, 3.7%)

LSU 1002,1153

(RNA2,RNA11) 1

cg=(209, 209.00, 100.0%) cg=(209, 209.00, 100.0%) cg=(209, 201.29, 96.3%) aa=( 8, 0.29, 3.7%)

LSU 1164,1185 (RNA11)

1 au=(209, 209.00, 100.0%) au=(209, 209.00, 100.0%) au=(209, 201.29, 96.3%)

ua=( 8, 0.29, 3.7%)

LSU 1166,1183 (RNA11)

2 au=(209, 209.00, 100.0%) au=(209, 209.00, 100.0%) au=(209, 203.22, 96.3%)

gc=( 6, 0.22, 2.8%) gu=( 2, 7.78, 0.9%)

LSU 1196,1250

(RNA11,RNA1) 2

gu=(207, 207.00, 99.0%) uu=( 2, 2.00, 1.0%)

gu=(207, 207.00, 99.0%) uu=( 2, 2.00, 1.0%)

gu=(207, 199.37, 95.4%) ua=( 8, 0.37, 3.7%) uu=( 2, 9.63, 0.9%)

LSU 1198,1247

(RNA11,RNA1) 2

ua=(208, 208.00, 99.5%) ug=( 1, 1.00, 0.5%)

ua=(208, 208.00, 99.5%) ug=( 1, 1.00, 0.5%)

ua=(208, 200.33, 95.9%) au=( 8, 0.29, 3.7%) ug=( 1, 0.96, 0.5%)

LSU 1656,2004

(RNA3,LSUE) 2

ca=(202, 202.00, 96.7%) ua=( 6, 6.00, 2.9%) aa=( 1, 1.00, 0.5%)

ca=(203, 203.00, 96.7%) ua=( 6, 6.00, 2.9%) aa=( 1, 1.00, 0.5%)

ca=(203, 195.55, 93.1%) au=( 8, 0.33, 3.7%) ua=( 6, 5.78, 2.8%) aa=( 1, 8.67, 0.5%)

LSU 1658,2002

(RNA3,LSUE) 1

au=(209, 209.00, 100.0%) au=(210, 210.00, 100.0%) au=(210, 202.29, 96.3%) ua=( 8, 0.29, 3.7%)

LSU 1659,2001

(RNA3,LSUE) 2

cg=(198, 189.47, 94.7%) ua=( 4, 0.11, 1.9%) au=( 4, 0.08, 1.9%) ug=( 2, 5.74, 1.0%) gc=( 1, 0.00, 0.5%)

cg=(199, 190.47, 94.8%) ua=( 4, 0.11, 1.9%) au=( 4, 0.08, 1.9%) ug=( 2, 5.74, 1.0%) gc=( 1, 0.00, 0.5%)

cg=(199, 183.48, 91.3%) ua=( 12, 0.77, 5.5%) au=( 4, 0.07, 1.8%) ug=( 2, 12.91, 0.9%) gc=( 1, 0.00, 0.5%)

LSU 1661,1999

(RNA3,LSUE) 2

ug=(209, 209.00, 100.0%) ug=(209, 209.00, 99.5%) ua=( 1, 1.00, 0.5%)

ug=(209, 201.33, 95.9%) gc=( 8, 0.29, 3.7%) ua=( 1, 0.96, 0.5%)

LSU 1663,1997

(RNA3,LSUE) 1

au=(209, 209.00, 100.0%) au=(210, 210.00, 100.0%) au=(210, 202.29, 96.3%) gc=( 8, 0.29, 3.7%)

LSU 1766,1986

(LSUD,LSUE) 1

gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(210, 202.29, 96.3%) au=( 8, 0.29, 3.7%)

LSU 1767,1985

(LSUD,LSUE) 1

gc=(208, 207.00, 99.5%) au=( 1, 0.00, 0.5%)

gc=(208, 206.02, 99.0%) au=( 2, 0.02, 1.0%)

gc=(216, 214.02, 99.1%) au=( 2, 0.02, 0.9%)

LSU 1794,1825

(LSUD) 1

ua=(209, 209.00, 100.0%) ua=(209, 208.00, 99.5%) au=( 1, 0.00, 0.5%)

ua=(209, 200.37, 95.9%) au=( 9, 0.37, 4.1%)

LSU 1795,1824

(LSUD) 2

cu=(208, 208.00, 99.5%) uu=( 1, 1.00, 0.5%)

cu=(209, 209.00, 99.5%) uu=( 1, 1.00, 0.5%)

cu=(213, 209.09, 97.7%) ua=( 3, 0.07, 1.4%) uc=( 1, 0.02, 0.5%) uu=( 1, 4.91, 0.5%)

LSU 1797,1822

(LSUD) 1

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(210, 202.29, 96.3%) au=( 8, 0.29, 3.7%)

LSU 1800,1817

(LSUD) 1

au=(209, 209.00, 100.0%) au=(209, 208.00, 99.5%) cg=( 1, 0.00, 0.5%)

au=(209, 200.37, 95.9%) cg=( 9, 0.37, 4.1%)

LSU 1804,1813

(LSUD) 2

ua=(202, 201.03, 96.7%) ca=( 6, 6.97, 2.9%) cg=( 1, 0.03, 0.5%)

ua=(202, 200.08, 96.2%) ca=( 6, 7.92, 2.9%) cg=( 2, 0.08, 1.0%)

ua=(202, 192.73, 92.7%) gc=( 8, 0.29, 3.7%) ca=( 6, 7.63, 2.8%) cg=( 2, 0.07, 0.9%)

LSU 1805,1812

(LSUD) 1

ua=(209, 209.00, 100.0%) ua=(210, 210.00, 100.0%) ua=(210, 202.29, 96.3%) au=( 8, 0.29, 3.7%)

LSU 1806,1811

(LSUD) 2

cg=(208, 207.00, 99.5%) ua=( 1, 0.00, 0.5%)

cg=(208, 207.01, 99.0%) ua=( 1, 0.01, 0.5%) ug=( 1, 1.99, 0.5%)

cg=(208, 199.41, 95.4%) ua=( 9, 0.41, 4.1%) ug=( 1, 9.59, 0.5%)

LSU 1830,1975

(LSUD,LSUE) 1

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(211, 204.22, 96.8%) ua=( 7, 0.22, 3.2%)

LSU 1831,1974

(LSUD,LSUE) 1

au=(209, 209.00, 100.0%) au=(210, 210.00, 100.0%) au=(217, 216.00, 99.5%) gc=( 1, 0.00, 0.5%)

LSU 1836,1904

(LSUD,LSUE) 1

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(210, 206.07, 98.1%) uu=( 4, 0.07, 1.9%)

LSU 1837,1903

(LSUD,LSUE) 1

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(210, 206.07, 98.1%) uu=( 4, 0.07, 1.9%)

LSU 1843,1897

(LSUD,LSUE) 1

ua=(209, 209.00, 100.0%) ua=(209, 208.00, 99.5%) cg=( 1, 0.00, 0.5%)

ua=(209, 208.00, 99.5%) cg=( 1, 0.00, 0.5%)

LSU 1906,1924

(LSUE) 2

gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(210, 206.07, 98.1%) a-=( 2, 0.02, 0.9%) ua=( 2, 0.02, 0.9%)

LSU 1907,1923

(LSUE) 2

gu=(209, 209.00, 100.0%) gu=(210, 210.00, 100.0%) gu=(210, 205.12, 97.7%) cg=( 2, 0.05, 0.9%) ug=( 2, 0.05, 0.9%) ag=( 1, 0.02, 0.5%)

LSU 1909,1921

(LSUE) 2

ug=(209, 209.00, 100.0%) ug=(210, 210.00, 100.0%) ug=(211, 204.22, 96.8%) au=( 2, 0.05, 0.9%) a-=( 2, 0.05, 0.9%) ca=( 2, 0.03, 0.9%) aa=( 1, 0.07, 0.5%)

LSU 1925,1929

(LSUE) 2

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(210, 202.29, 96.3%) ac=( 3, 0.04, 1.4%) uu=( 3, 0.07, 1.4%) ua=( 2, 0.05, 0.9%)

LSU 1945,1961

(LSUE) 1

gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(210, 202.29, 96.3%) au=( 8, 0.29, 3.7%)

LSU 1947,1959

(LSUE) 2

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(210, 203.26, 96.3%) ua=( 7, 0.26, 3.2%) ug=( 1, 7.74, 0.5%)

LSU 1948,1958

(LSUE) 1

gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(214, 210.07, 98.2%) au=( 4, 0.07, 1.8%)

LSU 1949,1957

(LSUE) 1

gc=(208, 207.00, 99.5%) au=( 1, 0.00, 0.5%)

gc=(209, 208.00, 99.5%) au=( 1, 0.00, 0.5%)

gc=(209, 200.37, 95.9%) ua=( 8, 0.29, 3.7%) au=( 1, 0.00, 0.5%)

LSU 1950,1956

(LSUE) 2

gu=(206, 204.03, 98.6%) aa=( 2, 0.03, 1.0%) ga=( 1, 2.97, 0.5%)

gu=(207, 205.03, 98.6%) aa=( 2, 0.03, 1.0%) ga=( 1, 2.97, 0.5%)

gu=(207, 197.50, 95.0%) ca=( 8, 0.40, 3.7%) aa=( 2, 0.10, 0.9%) ga=( 1, 10.50, 0.5%)

LSU 2024,2039

(LSUE) 2

ua=(208, 208.00, 99.5%) ug=( 1, 1.00, 0.5%)

ua=(209, 209.00, 99.5%) ug=( 1, 1.00, 0.5%)

ua=(209, 201.33, 95.9%) gc=( 8, 0.29, 3.7%) ug=( 1, 0.96, 0.5%)

LSU 2025,2038

(LSUE) 1

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(213, 208.11, 97.7%) au=( 5, 0.11, 2.3%)

LSU 2026,2037

(LSUE) 1

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(210, 202.29, 96.3%) ua=( 8, 0.29, 3.7%)

LSU 2027,2036

(LSUE) 1

au=(209, 209.00, 100.0%) au=(209, 208.00, 99.5%) ua=( 1, 0.00, 0.5%)

au=(209, 200.37, 95.9%) ua=( 9, 0.37, 4.1%)

LSU 2047,2621

(LSUE,LSUG) 2

cg=(208, 207.00, 99.5%) ua=( 1, 0.00, 0.5%)

cg=(208, 206.02, 99.0%) ua=( 2, 0.02, 1.0%)

cg=(208, 198.46, 95.4%) ua=( 9, 0.41, 4.1%) uu=( 1, 0.05, 0.5%)

LSU 2048,2620

(LSUE,LSUG) 2

gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(210, 202.29, 96.3%) au=( 7, 0.26, 3.2%) aa=( 1, 0.04, 0.5%)

LSU 2049,2619

(LSUE,LSUG) 1

gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(210, 202.29, 96.3%) cg=( 8, 0.29, 3.7%)

LSU 2080,2240

(LSUE,RNA13) 2

ua=(209, 209.00, 100.0%) ua=(210, 210.00, 100.0%) ua=(216, 215.01, 99.1%) cg=( 1, 0.01, 0.5%) ug=( 1, 1.99, 0.5%)

LSU 2461,2489

(LSUF) 1

gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(213, 208.11, 97.7%) au=( 5, 0.11, 2.3%)

LSU 2462,2488

(LSUF) 1

ua=(206, 203.04, 98.6%) au=( 3, 0.04, 1.4%)

ua=(206, 202.08, 98.1%) au=( 4, 0.08, 1.9%)

ua=(206, 194.66, 94.5%) au=( 12, 0.66, 5.5%)

LSU 2463,2487

(LSUF) 1

au=(209, 209.00, 100.0%) au=(210, 210.00, 100.0%) au=(210, 202.09, 96.3%) ua=( 8, 0.29, 3.7%)

LSU 2466,2484

(LSUF) 2

ua=(209, 209.00, 100.0%) ua=(209, 209.00, 99.5%) ug=( 1, 1.00, 0.5%)

ua=(209, 201.33, 95.9%) au=( 8, 0.29, 3.7%) ug=( 1, 0.96, 0.5%)

LSU 2470,2480

(LSUF) 2

gc=(207, 205.02, 99.0%) au=( 2, 0.02, 1.0%)

gc=(208, 206.02, 99.0%) au=( 2, 0.02, 1.0%)

gc=(208, 198.46, 95.4%) au=( 4, 0.11, 1.8%) ug=( 3, 0.06, 1.4%) aa=( 2, 0.08, 0.9%) ua=( 1, 0.06, 0.5%)

LSU 2514,2570

(LSUF,LSUG) 1

ua=(209, 209.00, 100.0%) ua=(210, 210.00, 100.0%) ua=(210, 202.29, 96.3%) cg=( 8, 0.29, 3.7%)

LSU 2516,2568

(LSUF,LSUG) 1

gc=(209, 209.00, 100.0%) gc=(209, 208.00, 99.5%) au=( 1, 0.00, 0.5%)

gc=(209, 200.37, 95.9%) au=( 9, 0.37, 4.1%)

LSU 2517,2567

(LSUF,LSUG) 1

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(210, 202.29, 96.3%) ua=( 8, 0.29, 3.7%)

LSU 2524,2539

(LSUG) 2

gc=(208, 208.00, 99.5%) gu=( 1, 1.00, 0.5%)

gc=(209, 209.00, 99.5%) gu=( 1, 1.00, 0.5%)

gc=(209, 201.33, 95.9%) au=( 8, 0.33, 3.7%) gu=( 1, 8.67, 0.5%)

LSU 2527,2536

(LSUG) 1

cg=(195, 181.94, 93.3%) ua=( 14, 0.94, 6.7%)

cg=(196, 182.93, 93.3%) ua=( 14, 0.93, 6.7%)

cg=(204, 190.90, 93.6%) ua=( 14, 0.90, 6.4%)

LSU 2547,2561

(LSUG) 1

au=(207, 205.02, 99.0%) ua=( 2, 0.02, 1.0%)

au=(208, 206.02, 99.0%) ua=( 2, 0.02, 1.0%)

au=(216, 214.02, 99.1%) ua=( 2, 0.02, 0.9%)

LSU 2594,2599

(LSUG) 1

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(213, 208.11, 97.7%) ua=( 5, 0.11, 2.3%)

LSU 2742,2762

(RNA6) 2

ug=(195, 195.00, 93.3%) cg=( 14, 14.00, 6.7%)

ug=(195, 195.00, 93.3%) cg=( 14, 14.00, 6.7%)

ug=(195, 187.81, 89.9%) cg=( 14, 13.48, 6.5%) au=( 8, 0.29, 3.7%)

LSU 2743,2761

(RNA6) 2

au=(206, 206.00, 98.6%) uu=( 3, 3.00, 1.4%)

au=(206, 206.00, 98.6%) uu=( 3, 3.00, 1.4%)

au=(206, 198.41, 94.9%) ug=( 8, 0.41, 3.7%) uu=( 3, 10.59, 1.4%)

LSU 2744,2760

(RNA6) 2

gu=(209, 209.00, 100.0%) gu=(209, 209.00, 100.0%) gu=(209, 201.29, 96.3%) aa=( 8, 0.29, 3.7%)

SSU 27,556

(RNA14,SSUA) 2

gc=(208, 208.00, 99.5%) ac=( 1, 1.00, 0.5%)

gc=(208, 208.00, 99.5%) ac=( 1, 1.00, 0.5%)

gc=(208, 200.33, 95.9%) aa=( 8, 0.33, 3.7%) ac=( 1, 8.67, 0.5%)

SSU 28,555

(RNA14,SSUA) 1

gc=(209, 209.00, 100.0%) gc=(209, 209.00, 100.0%) gc=(209, 201.29, 96.3%) ua=( 8, 0.29, 3.7%)

SSU 30,553

(RNA14,SSUA) 2

ua=(209, 209.00, 100.0%) ua=(209, 209.00, 100.0%) ua=(209, 202.26, 96.3%) cg=( 7, 0.26, 3.2%) ug=( 1, 7.74, 0.5%)

SSU 32,552

(RNA14,SSUA) 1

au=(209, 209.00, 100.0%) au=(209, 209.00, 100.0%) au=(209, 201.29, 96.3%) ua=( 8, 0.29, 3.7%)

SSU 39,403

(RNA14) 2

cg=(205, 205.00, 98.1%) gg=( 3, 3.00, 1.4%) ag=( 1, 1.00, 0.5%)

cg=(205, 205.00, 98.1%) gg=( 3, 3.00, 1.4%) ag=( 1, 1.00, 0.5%)

cg=(205, 197.44, 94.5%) gc=( 5, 0.18, 2.3%) au=( 3, 0.06, 1.4%) gg=( 3, 7.71, 1.4%) ag=( 1, 3.85, 0.5%)

SSU 40,402

(RNA14) 2

cg=(206, 206.00, 98.6%) ag=( 3, 3.00, 1.4%)

cg=(206, 206.00, 98.6%) ag=( 3, 3.00, 1.4%)

cg=(206, 198.41, 94.9%) ua=( 8, 0.29, 3.7%) ag=( 3, 2.89, 1.4%)

SSU 41,401

(RNA14) 1

ua=(209, 209.00, 100.0%) ua=(209, 209.00, 100.0%) ua=(214, 211.04, 98.6%) au=( 3, 0.04, 1.4%)

SSU 293,304 (RNA17)

2 gc=(209, 209.00, 100.0%)

gc=(209, 209.00, 99.5%) ga=( 1, 1.00, 0.5%)

gc=(209, 201.33, 95.9%) au=( 8, 0.29, 3.7%) ga=( 1, 0.96, 0.5%)

SSU 504,541 (SSUA)

2

ug=(110, 110.00, 52.6%) cg=( 98, 98.00, 46.9%) gg=( 1, 1.00, 0.5%)

ug=(111, 111.00, 52.9%) cg=( 98, 98.00, 46.7%) gg=( 1, 1.00, 0.5%)

ug=(111, 106.93, 50.9%) cg=( 98, 94.40, 45.0%) au=( 8, 0.29, 3.7%) gg=( 1, 0.96, 0.5%)

SSU 505,526 (SSUA)

2 ga=(207, 207.00, 99.0%) aa=( 2, 2.00, 1.0%)

ga=(208, 208.00, 99.0%) aa=( 2, 2.00, 1.0%)

ga=(208, 200.37, 95.4%) ac=( 8, 0.37, 3.7%) aa=( 2, 9.63, 0.9%)

SSU 507,524 (SSUA)

2 ua=(209, 209.00, 100.0%)

ua=(209, 208.00, 99.5%) ag=( 1, 0.00, 0.5%)

ua=(209, 200.37, 95.9%) ag=( 9, 0.37, 4.1%)

SSU 511,540 (SSUA)

2 ug=(209, 209.00, 100.0%)

ug=(209, 208.00, 99.5%) cu=( 1, 0.00, 0.5%)

ug=(210, 209.04, 96.3%) ua=( 7, 6.97, 3.2%) cu=( 1, 0.00, 0.5%)

SSU 513,538 (SSUA)

2 cu=(209, 209.00, 100.0%)

cu=(209, 208.00, 99.5%) aa=( 1, 0.00, 0.5%)

cu=(209, 208.04, 95.9%) cg=( 8, 7.96, 3.7%) aa=( 1, 0.00, 0.5%)

SSU 585,756

(SSUA,RNA8) 2

ug=(206, 206.00, 99.0%) ua=( 2, 2.00, 1.0%)

ug=(207, 207.00, 99.0%) ua=( 2, 2.00, 1.0%)

ug=(207, 199.37, 95.4%) gu=( 8, 0.29, 3.7%) ua=( 2, 1.93, 0.9%)

SSU 772,807 (RNA8)

2

ug=(207, 207.00, 99.5%) ua=( 1, 1.00, 0.5%)

ug=(207, 207.00, 99.0%) ag=( 1, 1.00, 0.5%) ua=( 1, 1.00, 0.5%)

ug=(207, 199.37, 95.4%) gc=( 5, 0.18, 2.3%) gu=( 3, 0.11, 1.4%) ag=( 1, 0.96, 0.5%) ua=( 1, 0.96, 0.5%)

SSU 774,805 (RNA8)

2 ac=(208, 208.00, 100.0%)

ac=(208, 208.00, 99.5%) uc=( 1, 1.00, 0.5%)

ac=(208, 200.33, 95.9%) uu=( 8, 0.33, 3.7%) uc=( 1, 8.67, 0.5%)

SSU 778,804 (RNA8)

2 gc=(206, 206.00, 99.0%) gu=( 2, 2.00, 1.0%)

gc=(207, 207.00, 99.0%) gu=( 2, 2.00, 1.0%)

gc=(207, 199.37, 95.4%) ua=( 8, 0.29, 3.7%) gu=( 2, 1.93, 0.9%)

SSU 784,798 (RNA8)

2 ug=(201, 201.00, 96.6%) ua=( 7, 7.00, 3.4%)

ug=(201, 201.00, 96.2%) ua=( 8, 8.00, 3.8%)

ug=(201, 193.59, 92.6%) au=( 8, 0.29, 3.7%) ua=( 8, 7.71, 3.7%)

SSU 885,912 (SSUB)

2 gc=(209, 209.00, 100.0%)

gc=(209, 209.00, 99.5%) gu=( 1, 1.00, 0.5%)

gc=(209, 201.33, 95.9%) ua=( 8, 0.29, 3.7%) gu=( 1, 0.96, 0.5%)

SSU 894,905 (SSUB)

2 gu=(208, 208.00, 99.5%) gc=( 1, 1.00, 0.5%)

gu=(209, 209.00, 99.5%) gc=( 1, 1.00, 0.5%)

gu=(209, 201.33, 95.9%) ua=( 8, 0.29, 3.7%) gc=( 1, 0.96, 0.5%)

SSU 895,904 (SSUB)

2

ug=(207, 207.00, 99.0%) ua=( 2, 2.00, 1.0%)

ug=(208, 208.00, 99.0%) ua=( 2, 2.00, 1.0%)

ug=(208, 204.18, 95.4%) ua=( 5, 4.91, 2.3%) au=( 4, 0.09, 1.8%) uu=( 1, 4.91, 0.5%)

SSU 930,1387

(SSUB,SSUD) 2

cg=(208, 208.00, 99.5%) ug=( 1, 1.00, 0.5%)

cg=(208, 208.00, 99.0%) ug=( 2, 2.00, 1.0%)

cg=(208, 200.37, 95.4%) ua=( 8, 0.37, 3.7%) ug=( 2, 9.63, 0.9%)

SSU 931,1386

(SSUB,SSUD) 1

au=(209, 209.00, 100.0%) au=(210, 210.00, 100.0%) au=(213, 208.11, 97.7%) gc=( 5, 0.11, 2.3%)

SSU 933,1384

(SSUB,SSUD) 1

ag=(209, 209.00, 100.0%) ag=(209, 209.00, 100.0%) ag=(209, 201.29, 96.3%) ga=( 8, 0.29, 3.7%)

SSU 939,1344

(SSUB,SSUD) 2

gu=(208, 208.00, 99.5%) au=( 1, 1.00, 0.5%)

gu=(208, 208.00, 99.0%) au=( 2, 2.00, 1.0%)

gu=(208, 200.37, 95.4%) ua=( 8, 0.29, 3.7%) au=( 2, 1.93, 0.9%)

SSU 941,1342

(SSUB,SSUD) 1

gc=(209, 209.00, 100.0%) gc=(210, 210.00, 100.0%) gc=(210, 202.29, 96.3%) au=( 8, 0.29, 3.7%)

SSU 962,973 (SSUB)

1 ua=(106, 53.76, 50.7%) cg=(103, 50.76, 49.3%)

ua=(106, 53.50, 50.5%) cg=(104, 51.50, 49.5%)

ua=(114, 59.61, 52.3%) cg=(104, 49.61, 47.7%)

SSU 986,1219

(SSUB,RNA9) 1

au=(209, 209.00, 100.0%) au=(209, 209.00, 100.0%) au=(209, 201.29, 96.3%) ga=( 8, 0.29, 3.7%)

SSU 989,1216

(SSUB,RNA9) 2

cg=(207, 205.02, 99.0%) au=( 1, 0.00, 0.5%) ua=( 1, 0.00, 0.5%)

cg=(207, 205.02, 99.0%) au=( 1, 0.00, 0.5%) ua=( 1, 0.00, 0.5%)

cg=(207, 197.46, 95.4%) aa=( 8, 0.37, 3.7%) au=( 1, 0.04, 0.5%) ua=( 1, 0.04, 0.5%)

SSU 1411,1489

(SSUE,SSUF) 1

cg=(209, 209.00, 100.0%) cg=(210, 210.00, 100.0%) cg=(210, 202.29, 96.3%) ua=( 8, 0.29, 3.7%)

SSU 1514,1521

(SSUF) 1

au=(209, 209.00, 100.0%) au=(209, 208.00, 99.5%) gc=( 1, 0.00, 0.5%)

au=(217, 216.00, 99.5%) gc=( 1, 0.00, 0.5%)

a LSU, large subunit ribosomal RNA; SSU, small subunit ribosomal RNA.

b Base pairs with significant potential covariation are shown. Pairs are named using the E. coli rRNA reference structure model numbering.

c Fragment containing each position (not repeated if both positions are in the same fragment).

d Category 1, strong covariation signal without exceptions (red in Figures S9-S11); 2, covariation signal with exceptions (blue in Figures S9-S11).

e 200 sequences from Plasmodium and nine related hemosporidians, for a total of 209 sequences.

f All sequences from the hemosporidian alignment, plus one coccidian sequence, for a total of 210 sequences.

g All sequences from the hemosporidian and coccidian alignment, plus eight piroplasm sequences, for a total of 218 sequences.

h Background shading: red, invariant; green, coccidian does not add information to hemosporidian; blue, piroplasm does not add information to coccidian.

i Covariation data: <base pair type>=(<number of occurrences observed in dataset>, <number of occurrences expected based upon random distribution of observed nucleotides at the two positions>, <percentage of sequences containing the base pair type>).

j Text color: green, coccidian has that combination at that base pair (may overlap with hemosporidian); blue, at least one of the eight piroplasm sequences has that combination at that base pair (may overlap with hemosporidian and/or coccidian).
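Footnote i gives each entry as observed count, expected count, and percentage. The sketch below shows one way the expected count and percentage could be recomputed from the observed counts alone. It assumes that "random distribution of observed nucleotides" means the product of the two per-position nucleotide counts divided by the number of sequences; that formula is an inference that reproduces the tabulated values, not a statement from the source. The function names are hypothetical.

```python
from collections import Counter


def expected_counts(observed):
    """Expected count of each base pair type if the two positions varied
    independently, given observed counts keyed by two-character pair type.

    Assumed formula: count(5' nt) * count(3' nt) / total sequences.
    """
    total = sum(observed.values())
    five_prime = Counter()   # nucleotide counts at the first position
    three_prime = Counter()  # nucleotide counts at the second position
    for pair, n in observed.items():
        five_prime[pair[0]] += n
        three_prime[pair[1]] += n
    return {
        pair: five_prime[pair[0]] * three_prime[pair[1]] / total
        for pair in observed
    }


def percentages(observed):
    """Percentage of sequences carrying each base pair type."""
    total = sum(observed.values())
    return {pair: 100.0 * n / total for pair, n in observed.items()}


# Observed counts for LSU 581,1259 in the "+ Piroplasm" column of Table S8.
observed = {"ug": 16, "aa": 5, "ag": 1}
for pair, exp in expected_counts(observed).items():
    pct = percentages(observed)[pair]
    print(f"{pair}=({observed[pair]:3d}, {exp:.2f}, {pct:.1f}%)")
```

A pair type whose observed count is well above its expected count (for example aa at 5 observed against 1.36 expected here) is the kind of coordinated change the table is built to highlight; gap characters such as `-g` are handled like any other symbol.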

Page 65: Gutell 120.plos_one_2012_7_e38320_supplemental_data

Table S9. Correspondence of P. falciparum and C. elegans mt rRNAs.

SSU helix^a     P. falciparum^b       C. elegans^b
9               missing               present
17              missing               present
27              RNA14:SSUA            present
39              RNA14                 present
122             RNA12                 missing
240             RNA12                 missing
289             RNA17                 missing
500             SSUA                  present
505             SSUA                  present
511             SSUA                  present
567             SSUA:SSUB             present
575             SSUA:SSUB             present
577             SSUA:RNA8             present
655             missing               present; short
673             missing               present
722             missing               present
769             RNA8                  present
821             RNA8:SSUB             missing
885             SSUB                  present
921             SSUB:SSUE             present
939             SSUB:SSUD             present
944             SSUB:RNA9             present
960             SSUB                  present
984             SSUB:RNA9             present
1047            RNA19:RNA9; short     present
1068            missing               present
1074            missing               present
1303            SSUD:RNA5             present
1350            SSUD                  present
1399            SSUE:SSUF; short      present; short
1506            SSUF                  present

5' LSU helix^a  P. falciparum^b       C. elegans^b
563             missing               present
579             LSUA:RNA1             present
589             LSUA; short           present; short
671             LSUA                  present
687             LSUA                  present
700             LSUA; short           missing
736             LSUA                  present
777             LSUA                  present
812             LSUA                  present
822             LSUA                  missing
946             RNA2                  missing
976             RNA2                  missing
991             RNA2:RNA11            unstructured
1030            missing               present; short
1057            LSUB                  present
1082            LSUB:LSUC             missing
1087            LSUC                  present
1164            RNA11                 missing
1196            RNA11:RNA1; short     present; short
1262            RNA1:LSUE             present
1276            RNA1                  missing
1295            RNA1:RNA3             missing

3' LSU helix^a  P. falciparum^b       C. elegans^b
1648            RNA3:LSUE             missing
1764            LSUD:LSUE             present
1775            LSUD                  present
1782            LSUD                  present
1792            LSUD                  present; short
1830            LSUD:LSUE             present
1835            LSUD:LSUE; short      present; short
1906            LSUE                  present
1925            LSUE                  present
1935            LSUE                  present
2023            LSUE                  present
2043            LSUE:LSUG             present
2064            LSUE:LSUF             present
2077            LSUE:RNA13            present
2246            RNA13                 present
2455            LSUF                  present
2507            LSUF:LSUG             present
2520            LSUF:LSUG             present
2547            LSUG                  present
2588            LSUG                  present
2646            RNA10                 present
2675            RNA10/RNA18:RNA6      present; short
2735            RNA6                  present

a Helix numbers correspond to the first nt in the corresponding helices in E. coli rRNA secondary structures.

b Significant differences between P. falciparum and C. elegans rRNA structure (absence or altered size) are in bold italic font.