RESEARCH ARTICLE Open Access
A holistic phylogeny of the coronin gene family reveals an ancient origin of the tandem-coronin, defines a new subfamily, and predicts protein function
Christian Eckert, Björn Hammesfahr and Martin Kollmar*
Abstract
Background: Coronins belong to the superfamily of the eukaryotic-specific WD40-repeat proteins and play a role in several actin-dependent processes like cytokinesis, cell motility, phagocytosis, and vesicular trafficking. Two major types of coronins are known: first, the short coronins consisting of an N-terminal coronin domain, a unique region and a short coiled-coil region, and secondly the tandem coronins comprising two coronin domains.
Results: 723 coronin proteins from 358 species have been identified by analyzing the whole-genome assemblies of all available sequenced eukaryotes (March 2011). The organisms analyzed represent most eukaryotic kingdoms but also cover every taxon several times to provide a better statistical sampling. The phylogenetic tree of the coronin domains based on the Bayesian method is in accordance with the most recent grouping of the major kingdoms of the eukaryotes and also with the grouping of more recently separated branches. Based on this “holistic” approach the coronins group into four classes: class-1 (Type I) and class-2 (Type II) are metazoan/choanoflagellate specific classes, class-3 contains the tandem-coronins (Type III), and the new class-4 represents the coronins fused to villin (Type IV). Short coronins from non-metazoans are equally related to class-1 and class-2 coronins and thus remain unclassified.
Conclusions: The coronin class distribution suggests that the last common eukaryotic ancestor possessed a single and a tandem-coronin, and most probably a class-4 coronin of which homologs have been identified in Excavata and Opisthokonts although most of these species subsequently lost the class-4 homolog. The most ancient short coronin already contained the trimerization motif in the coiled-coil domain.
* Correspondence: [email protected] of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Goettingen, Germany
Eckert et al. BMC Evolutionary Biology 2011, 11:268
http://www.biomedcentral.com/1471-2148/11/268
© 2011 Eckert et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background
The coronin proteins, which were originally isolated as a major co-purifying protein from an actin-myosin-complex of the slime mold Dictyostelium discoideum [1], have since been identified in other protists [2,3], fungi [4], and animals [5], but are absent in plants. Coronins are a conserved family of actin binding proteins [6-8] and the first family member had been named coronin based on its strong immunolocalization to the actin rich crown-like structures of the cell cortex in Dictyostelium discoideum [1]. Coronins belong to the superfamily of the eukaryotic-specific WD40-repeat proteins [9,10] and play a role in several actin-dependent processes like cytokinesis [11], cell motility [11,12], phagocytosis [13,14], and vesicular trafficking [15].
WD-repeat motifs are minimally conserved regions of approximately 40-60 amino acids typically starting with Gly-His (GH) dipeptides 11-24 residues away from the N-terminus and ending with a Trp-Asp (WD) dipeptide at the C-terminus. WD40-repeat proteins, which are characterized by the presence of at least four consecutive WD repeats in the middle of the molecule, fold into beta propeller structures and serve as stable platforms for protein-protein interactions [9].
The coronin proteins have five canonical WD-repeat motifs located centrally. Since the region encoding the WD repeats is similar to the sequence of the beta-subunit of trimeric G-proteins, the formation of a five-bladed beta-propeller was assumed for coronins [16]. However, the determination of the structure of murine coronin-1 (MmCoro1A [17]) demonstrated that the protein, analogous to the trimeric G-proteins, forms a seven-bladed beta-propeller carrying two potential F-actin binding sites. Apart from the central WD-repeats, almost all coronin proteins have a C-terminal coiled-coil sequence that mediates homo-oligomerization [18-20], and a short N-terminal motif that contains an important regulatory phosphorylation site in coronin-1B [12]. In addition, each coronin protein has a unique region of variable length and composition following the conserved extension to the C-terminus of the beta-propeller.
Based on their domain composition coronins have originally been divided into two subfamilies, namely short and long coronins [21]. Short coronins consist of 450 - 650 amino acids containing one seven-bladed beta-propeller and a C-terminal coiled-coil region. Furthermore, the N-terminal region of most known short coronins contains 12 basic amino acids. Since this motif is only present in coronin molecules, it has been suggested as a novel coronin signature [21]. The longer types of coronin, also called POD or Coronin 7, possess two complete core domains in tandem but lack a coiled-coil motif. In the longer coronins, the sequence of the basic N-terminal motif is reduced to 5 amino acids. Based on phylogenetic relationships among the coronins, the Human Genome Organization nomenclature committee (HGNC) proposed a system in 2001 that grouped the short coronins into two classes resulting in a total of three subtypes [8]. Very recently, a new nomenclature has been suggested dividing the coronins into twelve subclasses based on the analysis of about 250 coronins from most taxa [22]. In contrast to previous systems, every mammalian coronin (and corresponding vertebrate homologs) was assigned its own class, resulting in seven vertebrate classes. Invertebrates were grouped into two classes, the fungi got their own class, coronins from alveolates were grouped with those from Parabasalids (class 10), and the remaining coronins from Amoeba, Heterolobosea, and Euglenozoa were combined into the twelfth class. This study constituted the first major phylogenetic analysis of the coronin family. However, this classification was not consistent with the latest phylogeny of the eukaryotes and homologs of some major branches like the stramenopiles were missing.
Here, we present the analysis of the complete coronin repertoires of all eukaryotic organisms sequenced and assembled so far. The distribution of all coronin homologs is in accordance with the latest taxonomy of the eukaryotes and reveals the origin of the tandem-coronin and another newly defined class in the last common ancestor of the eukaryotes.
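The textual description of a WD repeat given in the Background (40-60 residues, a GH dipeptide 11-24 residues from the repeat start, a closing WD dipeptide) can be turned into a crude scanner. The following is only a sketch under one assumed reading of that description: the function name, the interpretation of the offsets, and the synthetic test sequence are illustrative and not taken from the paper. WD repeats are minimally conserved, so real detection normally relies on profile-based methods rather than on literal dipeptide matching.

```python
import re


def find_wd_repeat_candidates(seq, min_len=40, max_len=60, gh_min=11, gh_max=24):
    """Return (start, end) spans that end in 'WD' and carry 'GH' near the start.

    Assumed reading of the prose definition: the repeat is min_len..max_len
    residues long, 'GH' begins gh_min..gh_max residues after the repeat start,
    and 'WD' are the last two residues. Coordinates are 0-based, end-exclusive.
    """
    hits = []
    for match in re.finditer("WD", seq):
        end = match.end()  # the candidate repeat ends right after the WD dipeptide
        for length in range(min_len, max_len + 1):
            start = end - length
            if start < 0:
                break  # longer windows would run off the sequence start
            window = seq[start:end]
            # Accept the window if GH sits at one of the allowed offsets.
            if any(window[o:o + 2] == "GH" for o in range(gh_min, gh_max + 1)):
                hits.append((start, end))
                break  # report only the shortest matching window per WD
    return hits


# Synthetic 42-residue example (not a real coronin sequence).
toy = "A" * 11 + "GH" + "A" * 27 + "WD"
print(find_wd_repeat_candidates(toy))
```

Such a scan would miss degenerate repeats, which is consistent with the observation above that only five canonical WD-repeat motifs are recognisable in coronins although the structure shows seven blades.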
Results
Identification and annotation of the coronin proteins
The coronin protein genes were identified by TBLASTN searches against the corresponding genome data of the different species. The list of sequenced eukaryotic species as well as access information to the corresponding genome data has been obtained from diArk [23]. Species that lacked certain orthologs in the first instance were later searched again with the presumed orthologs of other closely related species. In this iterative process all coronin family proteins have been identified or their loss in certain species or taxa was confirmed. Because verified cDNA sequences and protein predictions, which often contain mispredicted exons and introns even in the “annotated” genomes, are not available for most of the sequenced species, the protein sequences were assembled and assigned by manual inspection of the genomic DNA sequences. Exons have been confirmed by the identification of flanking consensus intron-exon splice junction donor and acceptor sequences [24]. In addition, the gene structures of all coronin genes were reconstructed using WebScipio [25,26]. Through comparison of the intron positions and splice-site phases in relation to the protein multiple-sequence alignment, several suspicious exon border predictions could be resolved and the protein sequences subsequently corrected. The genomic sequences of many species contain several gaps due to the low coverage of the sequencing or problems in the assembly process. Only some of the gaps could be closed at the amino-acid level by analyzing EST data.
The coronin dataset contains 723 sequences from 358 organisms (Table 1). 614 sequences are complete, and an additional 44 sequences are partially complete. Sequences for which a small part is missing (up to 5%) were termed “Partials”, while sequences for which a considerable part is missing were termed “Fragments”. This distinction has been introduced because Partials are not expected to considerably influence the phylogenetic analysis. Several of the genes were termed pseudogenes because they contain too many frame shifts, in-frame stop codons, and missing sequences to be attributed to sequencing or assembly errors.
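The Complete/Partial/Fragment distinction described above can be written down as a small rule. This is a minimal sketch, not the authors' procedure: the text only states the 5% limit for Partials, so the way the missing fraction is measured here (observed length against a hypothetical expected full length, for example that of a close full-length homolog) is an assumption, as are the function and parameter names.

```python
def completeness_label(observed_len, expected_len, partial_max_missing=0.05):
    """Label a sequence as 'Complete', 'Partial' or 'Fragment'.

    observed_len: residues actually recovered for the sequence.
    expected_len: assumed full length (hypothetical reference value).
    partial_max_missing: largest missing fraction still called 'Partial'
    (5%, the only threshold stated in the text).
    """
    if expected_len <= 0:
        raise ValueError("expected_len must be positive")
    missing = max(0, expected_len - observed_len)
    if missing == 0:
        return "Complete"
    if missing / expected_len <= partial_max_missing:
        return "Partial"
    return "Fragment"


# Illustrative lengths only.
for observed in (400, 390, 200):
    print(observed, completeness_label(observed, expected_len=400))
```

With a rule of this kind, the three completeness counts reported in Table 1 (614 complete, 44 Partials, 62 Fragments) partition the sequences for which any sequence data are available.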
Multiple sequence alignment, phylogenetic analysis, and classification
A multiple sequence alignment of all coronin family members has been created and extensively manually improved (Additional file 1). The basis of the alignment was the conserved coronin domain that consists of the β-propeller region and a subsequent conserved extension, which packs against the “bottom” surface of the propeller [17]. This entire domain is conserved in all coronin homologs and we would therefore suggest naming it coronin-domain. The unique regions following the coronin-domain could only be aligned for homologs of closely related species. The C-terminal predicted coiled-coil regions were aligned again for all corresponding sequences to analyze potential oligomerization patterns (see below). The second coronin-domains of the tandem-coronins were also aligned to the coronin-domains for the phylogenetic analysis. One part of the coronin-domain in coronin-1D is encoded by a cluster of mutually exclusive exons (see below) and therefore the exon with the higher sequence identity to related homologs has been included in the alignment. The phylogenetic tree of the coronin family was calculated for 764 coronin-domains, including both coronin-domains of the tandem-coronins separately, using the Bayesian (Additional file 2) and the maximum-likelihood method (Additional file 3). The resulting trees were almost identical. However, the relations of the innermost nodes representing the most ancient relationships were best resolved using the Bayesian approach (Figure 1). The resulting phylogenetic tree is in accordance with the latest phylogenetic grouping of the six kingdoms of the eukaryotes [27-29] of which five are covered by the data analyzed here. Thus, coronins of phylogenetically related species group together in the coronin family tree. In the coronin tree, not only the grouping is retained but also the evolutionary history of the branches. For example, the fungi separate as a monophyletic group before the metazoans, and after the Amoeba.
The classification into subfamilies should at best include both the phylogenetic grouping of the protein family members and the domain organisation of the respective homologs. However, because most coronins contain a unique region between the coronin-domain and the C-terminal coiled-coil regions, several sub-branch specific domain organisation patterns evolved. To keep the coronin classification as simple as possible and to provide the highest consistency with previous classification schemes, the following classification is proposed: The classification should solely be based on the phylogenetic tree of the coronin-domains because it is in accordance with the phylogeny of the eukaryotes and contains the conserved part of the proteins that is the basis of the protein family. Metazoan species encode two phylogenetically distinct groups of coronins that have historically been named class-1 and class-2 coronins. Further variants of these classes should be named alphabetically, e.g. class-1A, class-1B, etc. However, due to the independent whole-genome, genomic region, and single gene duplication events of certain phylogenetic branches these variant designations do not always refer to orthologs. For the mammalian coronins, which are the best analyzed coronins, the suggested classification is almost entirely consistent with previous classifications [8] and the HGNC nomenclature except for “CORO6” and “CORO7”, which are here classified as coronin-1D and coronin-3, respectively. Class-3 comprises the tandem coronins. All members of this class group together in the phylogenetic tree, and only single homologs have been found in all species analyzed. Class-4 is a newly defined class that contains coronins with variable numbers of C-terminal PH, gelsolin, and VHP domains, but also coronins with only very short sequences outside the coronin-domain. The other coronins group in accordance with the latest taxonomy of the species (Figure 1). In our opinion it would not add information or help the scientific community if those coronins were classified separately. In contrast to the metazoans, gene duplications in the branches of Amoeba, Excavata, and SAR are species-specific and do not warrant further subclassification at the moment. For example, instead of talking about a “class-11 coronin” and giving long explanations of what type of coronins would belong to such a class, it would be easier, shorter, and less confusing to just say a “Naegleria coronin”, an “apicomplexan coronin” or a “yeast coronin”. The distribution of the coronins analyzed here is summarized for some example species in Figure 2 including previously used names and classification schemes. The distribution of all coronins is found in Additional file 4. Coronin homologs are absent in Rhodophyta (Cyanidioschyzon, Galdieria), Viridiplantae, Microsporidia, Fornicata (Giardia), and Haptophyceae (Emiliania).
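The protein designations used in this article (for example MmCoro1A, ScCoro, CeCoro3 or Fnb_bCoro3) appear to combine a species abbreviation, the stem "Coro", an optional class digit and an optional variant letter. The sketch below parses names on that assumption. The pattern is inferred from the names that occur in the text and is not a specification given by the authors; names with further suffixes would need an extended pattern.

```python
import re

# Assumed pattern: species abbreviation (e.g. "Hs", "Muc", "Fnb_b"), the stem
# "Coro", then optionally a class digit 1-4 followed by an optional variant A-E.
NAME_RE = re.compile(
    r"^(?P<species>[A-Z][a-z]{1,2}(?:_[a-z])?)Coro"
    r"(?:(?P<cls>[1-4])(?P<variant>[A-E])?)?$"
)


def parse_coronin_name(name):
    """Split a designation such as 'HsCoro1A' into species, class and variant."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised coronin name: {name!r}")
    cls = match.group("cls")
    return {
        "species": match.group("species"),
        # Names without a class digit are treated as unclassified short coronins.
        "class": f"class-{cls}" if cls else "unclassified",
        "variant": match.group("variant"),
    }


for example in ("HsCoro1A", "ScCoro", "Fnb_bCoro3"):
    print(example, parse_coronin_name(example))
```

As noted above, the variant letters do not always denote orthologs, so a parsed variant should be read as a label within a lineage rather than as an orthology statement.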
Table 1 Data statistics

                                   Coronin
Sequence
  Total                            723
  From WGS                         700
  Domains                          7
  Amino acids
  Total pseudogenes                7
  Pseudogenes without sequence     3
Completeness
  Complete                         614
  Partials                         44
  Fragments                        62
Species
  Total                            358
  WGS-projects                     323
  EST-projects                     112
  WGS- and EST-projects            152

Figure 1 Phylogenetic tree of the coronin family. The phylogenetic tree of the coronin family was calculated from the multiple sequence alignment of the conserved coronin domain using the Bayesian method. The unrooted tree was drawn with iTOL [73] and branches were coloured according to class and taxonomic distributions. For an extended representation of the tree including all posterior probability values see Additional file 2.
[Figure 1: unrooted tree. Clade labels: Metazoa Class-1; Invertebrate Class-1; Vertebrata Class-1A, Class-1B, Class-1C, Class-1D; Fish Class-1E; Metazoa Class-2; Invertebrate Class-2; Vertebrata Class-2A, Class-2B; Class-3 N-Term; Class-3 C-Term; Class-4; Fungi; Ascomycota; Basidiomycota; Capsaspora owczarzaki (Fungi/Metazoa incertae sedis); SAR; Stramenopiles; Alveolata; Rhizaria; Amoebae; Excavata.]

Short coronins (class-1, class-2, and unclassified coronins)
The domain organisations of most short coronins (class-1, class-2, and unclassified coronins) are similar. They consist of the 390 amino-acid long coronin-domain followed by a short unique domain and a C-terminal short coiled-coil region (about 30-40 amino acids, Figure 3). The unique regions are conserved in branches (e.g. the vertebrates have similar regions, as do the arthropods, the nematodes, etc.), but are not conserved for major taxa (e.g. fungi, Metazoa, stramenopiles).
The Saccharomyces cerevisiae coronin, ScCoro (CRN1), is known to bind to microtubules via its unique region between the β-barrel domain and the coiled-coil oligomerisation region (Figure 3, [30]). Two short regions showing homology to the microtubule-binding regions of MAP1B mediate this interaction. However, the MAP1B sequence motif is very short (about ten residues) and not very specific comprising mainly glutamate and lysine residues [30]. If the corresponding motifs in ScCoro are responsible for microtubule-binding then all yeast and Schizosaccharomyces coronins should be able to bind to microtubules because they contain motifs with similar amino acid compositions. A similar motif or region could not be identified in the Pezizomycotina coronins. While these supposed microtubule-binding regions mainly consist of glutamate, lysine, proline, serine, and threonine and are not even conserved in very closely related yeast species, the Saccharomyces cerevisiae coronin, ScCoro, has very recently been described to contain a CA domain (C: central; A: acidic; [31]). This domain, with which ScCoro activates and inhibits the ARP2/3 complex depending on concentration [31], is similar to CA domains in WASP family proteins [32]. The CA domain is well conserved but distinct within the Saccharomyceta clade (Pezizomycotina and Saccharomycotina, Figure 4).
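The composition argument made above for the putative microtubule-binding motifs (about ten residues, mainly glutamate and lysine) can be illustrated with a sliding-window count. The sketch below is an illustration only: the window length follows the "about ten residues" stated in the text, while the threshold, the function names and the toy sequence are assumptions and do not reproduce the analysis in [30].

```python
def window_fractions(seq, window=10, residues="EK"):
    """Return (start, fraction) for every window of `window` residues.

    fraction is the share of residues in the window that belong to
    `residues` (glutamate and lysine by default).
    """
    if window <= 0 or len(seq) < window:
        return []
    result = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        count = sum(segment.count(r) for r in residues)
        result.append((start, count / window))
    return result


def rich_windows(seq, threshold=0.8, window=10, residues="EK"):
    """Start positions of windows whose E/K fraction reaches the threshold."""
    return [s for s, f in window_fractions(seq, window, residues) if f >= threshold]


# Synthetic example: an E/K-rich stretch embedded in alanines.
toy = "AAAAA" + "EKEKEKEKEK" + "AAAAA"
print(rich_windows(toy, threshold=0.85))
```

Because any charged low-complexity stretch passes such a filter, a hit of this kind says little about function, which is the reason for the caution expressed above regarding the MAP1B-like motifs.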
Surprisingly, the coronins of the Tremellomycetes (e.g. Filobasidiella/Cryptococcus species) that belong to the Basidiomycota encode a C-terminal dUTPase domain (deoxyuridine triphosphatase domain) instead of the coiled-coil region (Figure 3). These coronin sequences are supported by many EST/cDNA clones for several of the Filobasidiella species extending from the coronin domain to the stop-codon. In addition to this dUTPase domain, the Filobasidiella species contain a further dUTPase in the genome that is conserved in the other Basidiomycotes, and also the other fungi. The dUTPase domains of the Tremellomycetes coronins contain all characteristic dUTPase domain motifs [33] and are therefore supposed to constitute enzymatically active domains. dUTPases typically form homotrimer active site architectures with all monomers contributing conserved residues to each of the three active sites [33]. Apart from the predicted trimerization of these coronins, which could be mediated by the dUTPase domains instead of the coiled-coil domains found in the other coronins, experimental data are needed to link the function of actin filament structure remodelling by coronins to dUTP nucleotide hydrolysis in DNA repair by dUTPases.

Figure 2 Coronin repertoire of selected species of major taxa and branches. The coronins of several representative species for most eukaryotic taxa and branches are listed (for the list of all species see Additional file 4). On top, alternatively used names and classification schemes are given for better comparison and orientation.
[Figure 2: table of coronin repertoires. Rows list representative species grouped by taxon (Heterolobosea, Kinetoplastida, Parabasalia, Stramenopiles, Alveolata, Apusozoa, Amoeba, Fungi, Fungi/Metazoa incertae sedis, Choanoflagellida, Metazoa, Nematoda, Arthropoda, Actinopterygii, Mammalia, Aves). Columns list class-1 (1A to 1E), class-2 (2A to 2D), class-3, class-4 and unclassified coronins; the header rows give the HGNC names (CORO1A, CORO1B, CORO1C, CORO6, CORO2A, CORO2B, CORO7), the Morgan & Fernandez classes 1 to 12, and alternative names (p57, TACO, ClipinA, IR10, ClipinB, ClipinC, P70, POD-1, coroninSE, HCRNN4, coronin-3, CRN2, Villidin).]
Class-3 coronins
Class-3 coronins (Type III coronins) comprise homologs that encode two coronin domains arranged in tandem [8]. These two coronin domains are separated by unique regions, and class-3 coronins do not encode coiled-coil domains. As recently reported [31] the class-3 coronins also encode a CA domain similar to the CA domain of the WASP family proteins at their C-termini (Figure 3). Based on the multiple sequence alignment of 112 class-3 coronins from all major branches of the eukaryotes the position of the C-region has slightly been adjusted in comparison with a previous analysis (Figure 4; [31]). Although the C-region of the class-3 coronins is not as conserved as similar regions in the yeast short coronins or in WASP family proteins, the characteristic pattern of hydrophobic residues concluded by a basic residue is visible in the homologs of all species (Figure 4). In contrast to the short coronins, the unique region between the C-terminal coronin-domain and the conserved CA-domain is short (20-30 amino acids).
As for the short coronins, the Filobasidiella species have surprising and species-specific tandem-coronins. The Filobasidiella class-3 coronins have a D-glycerate 3-kinase domain between the two coronin-domains (Figure 3). Only the termini of the Filobasidiella class-3 coronins are supported by EST/cDNA data, but long exons bridge the N-terminal coronin-domain and the glycerate 3-kinase domain, as well as the glycerate 3-kinase domain and the C-terminal coronin-domain. As found for the dUTPase domain of the short coronins, the Filobasidiella species contain an additional D-glycerate 3-kinase that has homologs in the other fungi and also in plants. Why it is advantageous to connect an actin-filament binding function to a glycerate 3-kinase needs experimental evaluation. The glycerate 3-kinase domain is not found in the class-3 coronins of the other Basidiomycotes. Except for the Filobasidiella species only the insects have long insertions between the two coronin-domains of their class-3 coronins. These insertions are highly conserved, about 300 residues long, and do not show any homology to known domains, sequence motifs, or other proteins.
In contrast to the related species Rhizopus arrhizus and Phycomyces blakesleeanus the coronin-3 of Mucor circinelloides consists of only the second coronin-domain of the tandem. We can exclude the possibility of this being an artefact of the genome assembly for three reasons. First, the genome sequence is continuous around MucCoro3. Secondly, there is no homology to any part of the N-terminal coronin-domain of RhaCoro3 or PhbCoro3 in the genome although the sequence identity of the C-terminal coronin-domains is about 65%. And finally, there is a TATA-box shortly upstream of the MucCoro3 gene. Because a coronin-3 has already been present in the most ancient eukaryote the loss of the N-terminal coronin-domain must be specific to Mucor circinelloides.

Figure 3 Domain organisation of representative coronins. A colour key to the domain names and symbols is given on the right except for the coronin domain that is coloured in orange. The abbreviations for the domains are: WD, WD repeat; PH, pleckstrin-homology domain; LZ, leucine zipper; VHP, villin headpiece domain.
[Figure 3: schematic drawn to an amino-acid scale (0 to 1800 aa) for DdCoro, PfCoro, Fna_bCoro, HsCoro1A, ScCoro (Crn1), CeCoro1, CeCoro3 (POD-1), Fnb_bCoro3, DdCoro4 (Villidin), CoCoro4 and EhCoro4. Key entries: WD40 repeat, coiled coil, leucine zipper (LZ), MAP1B homology region, central region, acidic region, dUTPase, D-glycerate 3-kinase, PH, gelsolin, VHP.]

Figure 4 Sequence conservation in the CA domains. The sequence logos illustrate the sequence conservation within the multiple sequence alignments of the CA domains of the Saccharomycotina, the Pezizomycotina, and the class-3 coronins. The CA domains of the Saccharomycotina and the Pezizomycotina are located within the unique regions of the short coronins while the CA domain of the class-3 coronins is at the C-termini of the proteins like in WASP family proteins. The regions between the C and the A domains are of variable length.
[Figure 4: three sequence-logo panels (Saccharomycotina coronins, Pezizomycotina coronins, class-3 coronins), each with a C-domain and an A-domain; vertical axis in bits (0 to 4), horizontal axis alignment position from N to C.]
Class-4 coronins
Based on the phylogenetic tree (Figure 1) and the domain composition of the protein homologs, another coronin class can be defined for which the Dictyostelium discoideum homolog, also called villidin [34], would be a representative (Figure 3). We suggest naming members of this class class-4 coronins. Most class-4 coronins consist of an N-terminal coronin-domain followed by three to four PH domains, four to five gelsolin domains, and a C-terminal villin headpiece domain (VHP). Class-4 coronins were identified in two of the major kingdoms of the eukaryotes, in excavates and opisthokonts. Furthermore, they are found in several of the sub-branches of the opisthokonts, in amoebae, fungi, and the fungi/metazoa incertae sedis branch. Because class-4 coronins from different species often contain different numbers of PH and gelsolin domains, domain gain and loss events must have happened in the respective branches or single species. However, there are not enough coronin-4 homologs identified yet to reconstruct the evolution of these regions. In addition to these multi-domain class-4 coronins there is a group of class-4 coronins that consists of just the conserved coronin domain and has so far been found only in some Amoebae species.
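The typical class-4 architecture described above (one coronin-domain, three to four PH domains, four to five gelsolin domains, one VHP domain) can be expressed as a pattern over an ordered list of domains. The sketch below is an illustration under assumptions: the one-letter codes, the function names and the input format are hypothetical, and a failed match does not exclude class-4 membership, since the class is defined by the phylogenetic tree and also contains coronins consisting of the coronin-domain alone.

```python
import re

# Hypothetical one-letter codes for the domains named in the text.
CODES = {"coronin": "C", "PH": "P", "gelsolin": "G", "VHP": "V"}

# coronin-domain, 3-4 PH domains, 4-5 gelsolin domains, C-terminal VHP.
TYPICAL_CLASS4 = re.compile(r"^CP{3,4}G{4,5}V$")


def architecture_string(domains):
    """Encode an N- to C-terminal list of domain names as a code string."""
    return "".join(CODES[name] for name in domains)


def has_typical_class4_architecture(domains):
    """True if the domain order matches the typical multi-domain class-4 layout."""
    try:
        encoded = architecture_string(domains)
    except KeyError:
        return False  # contains a domain type outside the class-4 description
    return TYPICAL_CLASS4.match(encoded) is not None


typical = ["coronin"] + ["PH"] * 3 + ["gelsolin"] * 5 + ["VHP"]
print(has_typical_class4_architecture(typical))
print(has_typical_class4_architecture(["coronin"]))
```

Encoding architectures as strings also makes it straightforward to tabulate how the numbers of PH and gelsolin domains differ between species, which is the variation attributed above to domain gain and loss events.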
Alternatively spliced coronins
Alternative splice forms have been reported for two coronin homologs: five variants of coronin from Caenorhabditis elegans [35], CeCoro1 (Figure 5), and three variants for coronin-1C from human [36], HsCoro1C. The described splice variants do not concern the beta-barrel domain but the structurally low-complexity region prior to the coiled-coil region in CeCoro1 and elongations of the N-terminus of HsCoro1C, respectively. In the reported analysis of CeCoro1 [35] two splice sites (the alternative 3’-splice site of exon7 and the alternative 5’-splice site of exon8) do not obey the conventional splicing rules. Alternative 5’-splicing of exon8 would lead to a premature stop-codon. In the four additional Caenorhabditis species analyzed here, C. briggsae, C. japonica, C. remanei, and C. brenneri, alternative 5’-splicing of exon8 would not lead to a premature stop-codon at the same position as in C. elegans but to transcripts of various lengths. The same holds for several of the other available nematode coronin-1 genes. Given the high conservation of the nematode coronin-1 genes, especially the Caenorhabditis genes, and the completely uncommon nature of the potential splice sites, the reported alternative 3’-splice site of exon7 and 5’-splice site of exon8 are most probably artificial results. An alternative 3’-splice site has been reported for exon8 of CeCoro1 comprising two amino acids [35]. Similar splice sites were identified in the genes of the other analyzed Caenorhabditis species but not in other nematodes. This splice site is thus also either an artificial result or specific for the Caenorhabditis branch. In addition, skipping of exon8 has also been reported to lead to an alternative transcript [35]. The intron position and reading frame of exon8 of CeCoro1 are conserved in all analyzed nematode coronin-1’s except for the Strongyloides ratti coronin-1, which consists of only one exon, and the Pristionchus pacificus coronin-1, which has introns at different positions. Compared to the full-length transcript, the other alternative splice forms of CeCoro1 are of low abundance (see Figure two in [35]). Because the integrity of exon8 of CeCoro1 (intron positions around the conserved coding sequence of exon8) is not conserved in nematodes but the corresponding amino-acid sequence is, alternative splicing of nematode coronin-1 is either restricted to some sub-branches or an artificial result of the CeCoro1 analysis.
Alternative splicing of human coronin-1C results in two additional transcripts derived from alternative transcription start sites encoded by an additional upstream exon, compared to the normal start site as found and conserved in all other coronin proteins [36]. These alternative splice forms seem to be restricted to modern primates (human, chimpanzee, gorilla, orangutan, and gibbon) and have been discussed in detail elsewhere [36].
We have identified alternative splice variants for coronin-1D (Figure 5), a coronin subfamily restricted to vertebrates. A cluster of two mutually exclusively spliced exons, exon5a and exon5b, was identified in all tetrapods. The amino-acid sequences corresponding to exon5 of the fish genes are more similar to exon5b than to exon5a. Thus, exon5a is the result of an exon duplication event that occurred either after the separation of tetrapods from fishes or at the onset of the vertebrates, in which case exon5a has been lost in the ancestor of the fishes. Exon5 represents the sequence of almost the entire fourth WD repeat (fifth blade in the β-propeller) starting in the middle of the fourth β-strand of blade four. By exchanging the fourth WD repeat the vertebrates could fine-tune the function of the coronin-1D beta-barrel domain. Vertebrate coronin-1D (CORO6) has not been analyzed experimentally yet and its specific function is unknown.
Further alternative transcripts are derived from mammalian coronin-1D genes by alternative 5’-splicing of the last exon, exon10. This alternative splicing results in one additional glutamine residue and is conserved in all 22 analyzed mammalian coronin-1D’s except for Ailuropoda melanoleuca (giant panda), Loxodonta africana (elephant), Myotis lucifugus (little brown bat), and Bos taurus (cow).
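The effect of splice-site choice on the reading frame can be illustrated with a small sketch. It is not part of the original analysis: the exon sequences below are made up, and the names `translate`, `UPSTREAM_EXON`, and `DOWNSTREAM_EXON` are hypothetical. It shows that a splice site shifted by three nucleotides adds one residue in frame (comparable to the additional glutamine at exon10 of coronin-1D), whereas a shift that is not a multiple of three changes the frame and can create a premature stop-codon (as discussed for the alternative 5’-splice site of exon8 of CeCoro1).

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon ('*')."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Hypothetical toy exons (NOT coronin sequences).
UPSTREAM_EXON = "ATGGCTGAA"
DOWNSTREAM_EXON = "TTCCTGAAATGGTAA"

# Canonical splice junction.
canonical = translate(UPSTREAM_EXON + DOWNSTREAM_EXON)
# Alternative site that adds three nucleotides (one glutamine codon, CAG).
in_frame = translate(UPSTREAM_EXON + "CAG" + DOWNSTREAM_EXON)
# Alternative site that adds two nucleotides and shifts the reading frame.
shifted = translate(UPSTREAM_EXON + "TA" + DOWNSTREAM_EXON)

print(canonical, in_frame, shifted)
```

In this toy case the three-nucleotide extension yields the same protein with one extra glutamine, while the two-nucleotide extension truncates the protein after a frameshift.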
Oligomerization
Most of the short coronins have predicted coiled-coil domains at the C-terminus that are the basis for their supposed oligomerization. Initially, coronins were proposed to form dimers [16], the most common form
[Figure 5 graphic: gene-structure cartoons of CeCoro1 (gi|193211354|ref|NC_003281.8|, 5435 bp; introns scaled down by a factor of 2.24) and HsCoro1D (gi|224589808|ref|NC_000017.10|, 5687 bp; introns scaled down by a factor of 2.97; exon5a, exon5b, and the additional “Q” at exon10 marked), together with an alignment of the HsCoro1D sequence (positions 180-280) with the exon5b-encoded sequence across blades 4 to 6.]
Figure 5 Gene structures of alternatively spliced coronins. The cartoons outline the gene structures of the alternatively spliced coronin-1 gene from Caenorhabditis elegans, CeCoro1, and the coronin-1D gene from Homo sapiens, HsCoro1D. The alternatively spliced CeCoro1 gene contains a differentially included exon8, which has an additional alternative 3’-splice site, leading to three transcripts. The other two described splice sites, an alternative 3’-splice site of exon7 and an alternative 5’-splice site of exon8 [35], are most probably artificial. The HsCoro1D gene contains a cluster of two mutually exclusive spliced exons, exon5a and exon5b, and an alternative 5’-splice site of exon10. Dark grey bars and light grey bars mark exons and introns, respectively, and alternative exons and splice sites are coloured.
of coiled-coil multimerization. In the last decade, a few coronin homologs were biochemically purified and analyzed. Accordingly, the Xenopus laevis coronin-1C (XcoroninA) has been shown to form a dimer [37] while an oligomeric state has been found for human coronin-1C (coronin 3) [18], and the Saccharomyces cerevisiae coronin (CRN1) trimerizes [31]. Parallel trimer formation has also been shown in a crystal structure of the coiled-coil domain of mouse coronin-1A [20] revealing a conserved motif determining the trimeric structure: R1-[ILVM]2-X3-X4-[ILV]5-E6. In this motif arginine forms a salt-bridge with glutamate at the surface of the coiled-coil structure and the aliphatic side chain moieties of arginine and glutamate pack against the hydrophobic residues at positions 2 and 5 of the motif shielding them from solvent. Mutation of the arginine to lysine leads to a concentration-dependent equilibrium between trimers and tetramers with tetramers forming at high concentration, while mutation to alanine or norleucine leads to tetramers [20]. Mutation of the invariant arginine to glutamine in the trimerization motif of human matrilin-1 leads to tetramers [38]. Unfortunately, the switching of arginine and glutamate in the respective positions has not been analyzed yet. We would expect that such a switch should be as stable as the original motif. Thus, to predict the oligomerization state we have analyzed all coronin coiled-coil regions for the presence of the trimerization motif. Accordingly, all 233 class-1 coronins have the classical motif, except for DpCoro1B and DrpCoro1B (Drosophila pseudoobscura and persimilis; Lys at position 1), and NvCoro1 (Nematostella; Cys at position 2), and are thus predicted to form trimers. This would include the Xenopus Coro1C that has, however, been shown to exist as a dimer [37]. The situation is more diverse for the class-2 coronins. The invertebrate coronins contain the trimerization motif, except for AmqCoro2 (Amphimedon; Ser at P1), HerCoro2A (Helobdella; Lys at P1), HerCoro2B (Phe at P1), MydCoro2 (Mayetiola; Gln at P6), and the nematode class-2 coronins (Cys at P2). Almost all fish class-2 coronins contain the trimerization motif, but the other vertebrate class-2 coronins have conserved mutations. The tetrapod class-2A coronins encode a glutamine instead of the invariant arginine, which would turn them to tetramers in analogy to matrilin-1 [38]. The tetrapod class-2B coronins contain glutamine instead of the glutamate at position 6 of the motif, a substitution whose effect has not been analyzed yet.
About half of the analyzed fungal coronins have the classical trimerization motif. The most common substitutions that are found in all Schizosaccharomyces and most Basidiomycota coronins are lysines or glutamines instead of the arginine at position 1. While the coiled-coil region is conserved in general, substitutions happened in specific species but not in whole branches (except for the Schizosaccharomyces). Therefore, we would expect all fungal coronins to form trimers. All Amoeba coronins, the Stramenopiles coronins (exceptions: FrcCoro with His at P1, BhCoro_B with Asn at P6, AuaCoro_B with Lys at P1), the Trichomonas and Naegleria coronins contain the classical trimerization sequence motif in the coiled-coil region. Interestingly, the kinetoplastid coronins have the salt-bridge switched in the motif and should thus also be able to form trimers. From the Alveolata, only the Ciliophora (e.g. Tetrahymena) and Coccidia (e.g. Toxoplasma gondii) coronins contain coiled-coil domains, and only the Coccidia contain the trimerization motif.
These are, however, predictions based on the existence of the proposed trimerization motif. The motif has been identified in 86% of all short, autonomous, and parallel three-stranded coiled-coils while it is also observed in 9% of the antiparallel trimers and in 5% of the parallel and antiparallel dimers [20]. Thus, although most short coronins are predicted to form trimers some might nevertheless function in other oligomeric states in the cell. The oligomerization state can ultimately only be shown in experiments, which have, however, been done for just a few of the coronins so far.
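The motif scan described above can be sketched with regular expressions. This is a minimal illustration, not the procedure actually used for the analysis: the function name `predict_oligomer`, the returned labels, and the test peptides are hypothetical, and the labels for position-1 substitutions simply restate the mutagenesis results cited in the text [20,38] rather than a validated predictor.

```python
import re

# R1-[ILVM]2-X3-X4-[ILV]5-E6, the trimerization motif given in the text.
CLASSICAL = re.compile(r"R[ILVM]..[ILV]E")
# Assumption: "switched salt-bridge" means Arg and Glu exchange positions 1 and 6.
SWITCHED = re.compile(r"E[ILVM]..[ILV]R")
# Position-1 substitutions whose effect was reported for related coiled-coils.
P1_VARIANT = re.compile(r"([KAQ])[ILVM]..[ILV]E")

P1_LABELS = {
    "K": "trimer/tetramer equilibrium",
    "A": "tetramer",
    "Q": "tetramer (by analogy to matrilin-1)",
}


def predict_oligomer(coiled_coil: str) -> str:
    """Return a hedged oligomerization label for a coiled-coil sequence."""
    seq = coiled_coil.upper()
    if CLASSICAL.search(seq):
        return "trimer"
    if SWITCHED.search(seq):
        return "trimer (switched salt-bridge, untested)"
    match = P1_VARIANT.search(seq)
    if match:
        return P1_LABELS[match.group(1)]
    return "no prediction"


# Hypothetical peptides, not taken from any coronin.
for peptide in ("DELRLKNLEEQ", "DELELKNLREQ", "DELKLKNLEEQ", "GGSGGS"):
    print(peptide, "->", predict_oligomer(peptide))
```

As stressed in the text, the presence of the motif is only an indication; the oligomerization state has to be confirmed experimentally.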
F-actin binding
F-actin binding is one of the common properties of coronin proteins. The extended multiple sequence alignment presented here together with the recently determined crystal structure of murine coronin-1A [17] now allows a reevaluation of previous mutagenesis studies. Truncation studies have shown that the coronin domain, including the β-propeller and its C-terminal extension, is necessary for F-actin binding [30,39]. Mapping the sequence conservation within 13 short coronin members onto the surface of the crystal structure revealed two regions, one formed by blades 1, 6, and 7 and one formed by blades 6 and 7 and a portion of the C-terminal extension, to represent possible actin binding sites [17]. Subsequently, several surface-exposed charged amino acids have been mutated to alanine or substituted by reversed charges in human coronin-1B and their F-actin binding affinity has been analyzed ([40], Figure 6 red dots, see also Additional file 5). Only the R30D mutation abolished actin binding in vitro. Although an arginine is the most prevalent amino acid at this position it is often substituted by a lysine or a proline (Figure 6). The multiple sequence alignment of the coronins also does not show a trend towards a class-specific substitution. For example, while a proline is found at this position in all vertebrate class-2 coronins, arginines, lysines, asparagines, prolines, threonines, and tyrosines are found in invertebrate class-2 coronins. Negatively charged amino acids, at least, are not found in any of the coronin domains at this position.
[Figure 6 graphic: sequence logos (y-axis: bits, 0 to 4) for alignment positions 1-80, 401-480, and 481-560 of the coronin domain, each shown above the aligned sequences of ScCoro, DdCoro, and MmCoro1A; mutant labels 1-2 and R30A/D (positions 1-80), 1-11 to 1-17 (positions 401-480), and 1-18 to 1-22 (positions 481-560) mark the mutated residues.]
Figure 6 Sequence conservation within the actin binding region. The sequence logos illustrate the sequence conservation within the multiple sequence alignments of the coronin domains. Here, only the N- and C-termini of the coronin domains are shown because most of the residues implicated in actin binding map to these regions. For the representation of the entire coronin domain see Additional file 5. For better orientation, the sequences of three representative coronins are shown: the yeast coronin as the main target of mutagenesis experiments, the Dictyostelium coronin as the founding member of the protein family, and the murine coronin-1A of which the crystal structure is known. Secondary structural elements as determined from the crystal structure are drawn as yellow arrows (β-strands) and red boxes (α-helices). Green dots point to amino acids of ScCoro that have been mutated to alanine [41] and red dots highlight mutagenesis studies in HsCoro1B [40]. Light-blue boxes highlight mutations that abolished actin binding, dark-grey boxes represent mutations that did not influence actin binding, and light-grey boxes point to mutations in yeast coronins that could not be expressed and tested.
Recently, systematic mutagenesis of charged surface-exposed residues of yeast coronin revealed a patch of residues extending over the top and one side of the β-propeller that abolished actin binding when mutated to alanine (Figure 6, green dots [41]). The analysis of the conservation within the coronin proteins shows that many of the substitutions in both studies have been performed on marginally conserved residues (e.g. E215A/K, K216A/E, 1-11, 1-15, 1-16). Thus, it is not surprising that coronins with mutations of these residues are able to bind F-actin. As actin binding is one of the common functions of coronins and actins belong to the most highly conserved protein families, the actin binding surface of the coronins is also expected to be highly conserved. Most of the residues that were found to abolish actin binding when mutated to alanine are strongly conserved (Figure 6). The few residues that are highly conserved but do not influence actin binding might be interaction sites for other proteins like cofilin.
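How strongly a given alignment column is conserved can be quantified as the information content plotted in sequence logos. The following is a minimal sketch under stated assumptions: it uses the uncorrected formula log2(20) minus the Shannon entropy of the column, ignores gaps, and omits the small-sample correction that logo programs usually apply. The function name `column_information` and the example columns are hypothetical and not taken from the coronin alignment.

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 20) -> float:
    """Information content (bits) of one alignment column, ignoring gaps and X."""
    residues = [c for c in column.upper() if c not in "-X"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return math.log2(alphabet_size) - entropy


# Hypothetical columns: invariant, two residues in equal parts, mixed with a gap.
for column in ("RRRR", "RKRK", "RRKP-"):
    print(column, round(column_information(column), 2))
```

A fully invariant column reaches the maximum of log2(20), about 4.32 bits, which matches the 0 to 4 bit axis of the logos; residues in columns with low information content would be the "marginally conserved" positions referred to above.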
Discussion
Here, we have analyzed 723 coronins from 358 species. For 323 species whole genome sequence data was available allowing a “holistic” analysis of the coronin protein family. In addition, the whole genome assemblies of 69 species have been analyzed that in the end did not contain any coronin homolog. These species include Rhodophyta (Cyanidioschyzon, Galdieria), Viridiplantae, Microsporidia, Fornicata (Giardia), and Haptophyceae (Emiliania). A sequence alignment of the coronin proteins was created and extensively improved manually. The phylogenetic analysis of the conserved coronin domain, which is also included in the crystal structure [17], using the Bayesian method showed that the grouping of the coronins is completely in accordance with the latest phylogeny of the eukaryotic species (Figure 1, [27-29]). Subsequently, we analyzed the coronin tree with respect to established and proposed classifications defining subfamilies. Two major schemes are currently in use, the old one established by the HGNC [8] and a more recent one expanding the number of classes from three to twelve [22]. Essentially, the latter classification re-defines subclasses of the HGNC scheme as separate classes, e.g. 1A and 1B become class-4 and class-1, respectively, and groups some branches to new classes. However, some coronins still remained unclassified and several classes have been proposed, like the invertebrate metazoan classes 8 and 9, although the contributing members did not form monophyletic branches in the underlying protein family tree. The proposed classes 10 and 12 contain members of unrelated taxonomic branches, probably because these coronins were adjacent in the tree figure. In addition, the Entamoeba tandem-coronin did not group to the other tandem-coronins. Thus, this classification is not consistent with the taxonomy of the eukaryotes. In addition, homologs of major branches were missing in the analysis like those from stramenopiles. We do not intend to add confusion to the classification of the coronin family but want to suggest a reliable and, considering future genome sequencing projects, expandable scheme. Two major reasons support the future use of the HGNC scheme although it needs some minor adjustments. The classification by Morgan and Fernandez [22] of coronins outside the metazoans is not consistent with the latest taxonomy of the eukaryotes and therefore not adaptable to our more comprehensive coronin tree. In addition, it is well known that two whole-genome-duplications are the reason for the expansion of gene homologs at the origin of the vertebrates [42], while another whole-genome-duplication happened at the origin of the Actinopterygii [43,44]. Thus retaining the orthology between non-vertebrate and vertebrate coronins in class-numbers would be desirable but has also been abandoned by Morgan and Fernandez [22]. Here, we adapted the HGNC classification except for renaming CORO6 and CORO7 (HGNC) to coronin-1D and coronin-3, respectively, numbering additional fish coronins as coronin-1E, coronin-2C, and coronin-2D, and defining the new coronin class-4. The term “class” is equivalent to the term “Type” used by the Bear group in recent reviews [8,45]. However, we prefer the term “class” to be consistent with the terminology used for other protein families (e.g. the myosin family [46,47]) and therefore to facilitate the work with databases and search engines in the future.
Class-4 coronins represent a new type of coronins that are present in Excavata (Naegleria gruberi), Amoebae, fungi (Spizellomyces punctatus), and the Fungi/Metazoa incertae sedis branch. Most class-4 coronins consist of the N-terminal coronin domain followed by two to three PH, four to five gelsolin, and a C-terminal VHP domain. The first representative of this subfamily has been identified in Dictyostelium discoideum and called villidin because of the homology of its gelsolin and VHP domains to villin [34]. The homology of villidin’s WD-repeat region to coronin has been recognized later on [48,49] suggesting villidin’s origin through a fusion of the coronin domain with villin. Villin is the founding member of a superfamily of proteins containing three to six gelsolin domains (reviewed in [48,50]). Like villidin (class-4 coronins), villin, supervillin, and protovillin also contain a C-terminal VHP domain. Alignment of villin to the gelsolin domains of the class-4 coronins shows that the class-4 coronins have lost the first gelsolin domain of villin. The first gelsolin domain of villin is associated with dimerization, actin filament capping, nucleation, and bundling, and G-actin binding [50]. Thus, class-4 coronins do not play a role in these activities via their gelsolin domains [34]. However, villin contains three phospholipid-binding domains, two preceding the second gelsolin domain and one overlapping with the VHP domain.
These phospholipid-binding domains are conserved in class-4 coronins and are most probably responsible for their association with internal membranes like Golgi-structures and ER-membranes [34].
To reveal the evolution of the coronin family and to determine the coronin repertoire of the last common ancestor of the eukaryotes, we plotted the coronin inventory of several representative species, whose genome sequences are available and whose coronin inventories are therefore complete, on the most widely agreed tree of the eukaryotes (Figure 7). However, especially the grouping of taxa that emerged close to the origin of the eukaryotes remains highly debated. Therefore, alternative branchings are also indicated in the tree. The phylogeny of the supposed supergroup Excavata is the least understood because only a few species of this branch have been completely sequenced so far. While the grouping of the Heterolobosea, Trichomonada, and Euglenozoa into the Excavata is found in most analyses, the grouping of the Diplomonadida as separate phylum or as part of the Excavata is still debated (arrow 1 [51]). Also, some analyses group the red algae of the Rhodophyta branch to the Viridiplantae [52-54] and others support their independence (arrow 2; [28,55]). According to most of the recent phylogenetic analyses, the Alveolata, Rhizaria, and Stramenopiles form the supergroup SAR [27,54]. The placement of the Haptophyceae and Cryptophyta to the SAR is still highly debated. Although several analyses are in favour of this grouping (arrow 3; [55-57]) most analyses contradict it [27-29,53,54]. Short coronins containing the N-terminal coronin domain and the C-terminal oligomerization domain have been found in all branches except Diplomonadida, Haptophyceae, and Viridiplantae/Rhodophyta. The phylogenetic grouping of the species based on the phylogenetic tree of the coronin domains showed that the coronins with different domain compositions (containing dUTPase domains, ARP2/3 binding domains, no coiled-coil regions) are species-specific developments based on domain loss and gain events while the corresponding species correctly group together inside the respective branches. Class-3 coronins are also found in all major eukaryotic superkingdoms that contain coronins. We did not identify any species that contains exclusively a class-3 coronin suggesting that encoding a class-3 coronin is a plus for the species but not a necessity. Class-4 coronins were found in two of the four coronin-containing superkingdoms, the Excavata and the Opisthokonts. Several major sub-branches
[Figure 7 graphic: schematic tree of the Eukaryota with coronin class numbers in colour-coded boxes. Taxa shown: Haptophyceae, Cryptophyta, SAR (Alveolata with Ciliophora and Apicomplexa (Coccidia, Aconoidasida); Stramenopiles with Bacillariophyta, Oomycetes, Blastocystis, Pelagophyceae, and Phaeophyceae; Rhizaria), Excavata (Diplomonadida, Euglenozoa, Trichomonada, Heterolobosea), Opisthokonts (Metazoa, Choanoflagellida, Fungi, Fungi/Metazoa incertae sedis, Amoebozoa), Rhodophyta, and Viridiplantae (Chlorophyta, Streptophyta). Species shown: Emiliania huxleyi, Guillardia theta, Tetrahymena thermophila, Toxoplasma gondii, Plasmodium falciparum, Aureococcus anophagefferens, Phytophthora ramorum, Thalassiosira pseudonana, Ectocarpus siliculosus, Blastocystis hominis, Bigelowiella natans, Giardia lamblia, Leishmania major, Trichomonas vaginalis, Naegleria gruberi, Capsaspora owczarzaki, Monosiga brevicollis, Galdieria sulphuraria, and Cyanidioschyzon merolae.]
Figure 7 Evolution of the coronin protein family with respect to the species evolution. Schematic representation of the most widely accepted eukaryotic tree of life. Branch lengths are arbitrary. The coronin inventories of certain taxa and specific species have been plotted to the tree with class numbers given in colour-coded boxes. “O” stands for “Orphan”, the unclassified short coronins. The numbers on the arrows refer to alternative placing of the respective taxa: 1: The independence of the Diplomonadida (instead of grouping them to the superkingdom Excavata) is supported by [51]. 2: The monophyly of the Rhodophyta is supported by [28,55]. 3: Grouping the Haptophyceae and Cryptophyta to the SAR is supported by [55-57].
of the Opisthokonts contain class-4 coronins, the Amoebozoa, the Fungi, and the Fungi/Metazoa incertae sedis branch. However, the evolution of the class-4 coronins rather seems to be determined by gene-loss events. This distribution of the coronin classes demonstrates that the last common ancestor of the eukaryotes must have contained a short coronin as well as a tandem coronin (class-3), and most probably even a class-4 coronin. In the coronin-family tree (Figure 1) the C-terminal coronin-domains of the class-3 coronins group closer to the short coronins than the N-terminal coronin-domains. This suggests a three-step invention of the class-3 coronin (Figure 8): First a gene duplication of the short coronin happened (1). The new copy was subsequently copied twice but the order of these events could not be determined (2). One copy has been distributed in a different genomic region resulting in the class-4 coronin after fusion to a copy of the villin gene (2B). The other copy resulted in a tandem gene duplicate in which the new copy was placed at the 5’ side of the original gene (2A). The tandem gene duplicate subsequently fused to build the class-3 coronin prototype (3). Alternatively, the coronin domain copy that led to the class-4 coronin could have been produced as a copy of the 3’ coronin of the then already existing tandem gene duplicate (4).
At the origin of the Metazoa and Choanoflagellida branches another gene duplication event led to two distinct classes, class-1 coronins and class-2 coronins (Figure 7). The further evolution of the short coronins in the invertebrate branches is determined by species-specific gene-loss and gene-duplication events (Figure 2). This view is, however, based on the species whose genomes are available today and might change as soon as sequencing of more related species reveals subtypes of the class-1 and class-2 coronins in major invertebrate branches. At the origin of the vertebrates the two well-known whole-genome duplications (2R, [42]) resulted in several subtypes of both the class-1 and class-2 coronins. The subsequent third whole genome duplication in the fish-lineage [43,44] led to even more gene duplicates. Subsequent to this boost of coronin homologs at the onset of the vertebrates branch-specific gene deletions happened, like the loss of the class-1B variants in fishes and the class-1A loss in birds (Figure 2).
The short coiled-coil region including the trimerization motif R-[VILM]-X-X-[VIL]-E is an accomplishment of the most ancient short coronin because it is found in coronins of all branches of the eukaryotic tree. It has been retained without major mutations for a long evolutionary time. This is exemplified by the fact that changes, which might lead to other oligomerization states, are species-specific or have been introduced in very recently separated branches.
Conclusions
The phylogenetic tree based on the coronin domains of 723 homologs from 358 species allowed grouping the coronin proteins into four classes: Class-1 (Type I) and class-2 (Type II) comprise short coronins and resulted from a gene duplication of a short coronin at the onset of the holozoans. Short coronins are characterized by an N-terminal coronin domain followed by a unique
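[Figure 8 graphic: cartoon of the numbered events: (1) gene duplication of the short coronin; (2) A: tandem gene duplication, B: gene duplication (order unknown); (3) fusion of the tandem gene duplicates; (4) alternative origin of the class-4 coronin domain; resulting genes: short coronin, class-3 coronin, class-4 coronin.]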
Figure 8 Evolution of the coronin classes. The cartoon shows the different gene duplication and fusion events that led to the formation of the short coronins, the class-3 coronins, and the class-4 coronins.
domain and a C-terminal short coiled-coil region. The coiled-coil domain of almost all short coronins contains a trimerization motif that must therefore have already existed in the last common ancestor of the eukaryotes. Class-3 (Type III) coronins comprise coronins with two coronin-domains arranged in tandem and have been found in species of all eukaryotic kingdoms that contain coronins. Class-4 (Type IV) coronins represent fusions of the coronin domain to villin and have been identified in Excavata and Opisthokonts although most of these species subsequently lost the class-4 homolog. Hence, the last common ancestor of the eukaryotes must have contained a short coronin and a class-3 coronin, and most probably a class-4 coronin.
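The domain-based part of the class definitions summarized above can be written down as a simple decision rule. This is only an illustrative sketch: the function name `classify_by_domains` and the domain labels are hypothetical, and the rule deliberately cannot separate class-1 from class-2 coronins or recognize the coronin-domain-only class-4 homologs and MucCoro3, all of which were assigned on the basis of the phylogenetic tree.

```python
from collections import Counter


def classify_by_domains(domains: list[str]) -> str:
    """Assign a coarse coronin class from an ordered list of domain labels."""
    counts = Counter(domains)
    if counts["coronin"] == 0:
        return "not a coronin"
    if counts["coronin"] >= 2:
        return "class-3 (tandem coronin)"
    if counts["gelsolin"] > 0 or counts["VHP"] > 0:
        return "class-4 (coronin fused to villin)"
    # Class-1, class-2, and unclassified short coronins share this
    # architecture; separating them requires the phylogenetic tree.
    return "short coronin (class-1, class-2, or unclassified)"


# Hypothetical domain architectures following the descriptions in the text.
examples = {
    "short": ["coronin", "unique", "coiled_coil"],
    "tandem": ["coronin", "coronin"],
    "villidin-like": ["coronin", "PH", "PH", "PH",
                      "gelsolin", "gelsolin", "gelsolin", "gelsolin", "VHP"],
}
for name, architecture in examples.items():
    print(name, "->", classify_by_domains(architecture))
```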
Methods
Identification and annotation of the coronin family proteins
The coronin genes have been identified by TBLASTN searches against the sequenced eukaryotic genomes, which have been obtained via lists available from the diArk database [23,58]. All hits were manually analyzed at the genomic DNA level. Datasets of predicted proteins produced by the sequencing consortia often miss homologs, and predicted proteins contain mispredicted exons and introns in many cases, necessitating manual assembly and annotation. The correct coding sequences were identified with the help of the multiple sequence alignments of all coronin proteins. As the number of protein sequences increased (especially the number of sequences in taxa with few representatives), many of the initially predicted sequences were reanalyzed to correctly identify all exon borders. Where possible, EST data available from the NCBI EST database has been analyzed to help in the annotation process. In addition, coronin homologs from cDNA projects or single-gene analyses have been obtained by TBLASTN searches against the NCBI nr database [59]. Gene structures have been reconstructed using WebScipio [25] as far as genomic sequence data was available. All sequence related data (names, corresponding species, GenBank ID’s, alternative names, corresponding publications, domain predictions, gene structure reconstructions, and sequences) and references to genome sequencing centres are available through CyMoBase [60,61].
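Because TBLASTN typically reports one hit per exon, the hits on a contig have to be collected before a gene can be assembled manually. The sketch below shows one way to do this. It assumes the 12-column tabular output of BLAST+ (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the output format and E-value cut-off actually used are not stated in the text, and the names `Hit`, `parse_tabular`, `candidate_loci`, and the example rows are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float
    sstart: int
    send: int
    evalue: float


def parse_tabular(text: str, max_evalue: float = 1e-5) -> list[Hit]:
    """Parse 12-column tabular BLAST output and keep hits below an E-value cut-off."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        hit = Hit(f[0], f[1], float(f[2]), int(f[8]), int(f[9]), float(f[10]))
        if hit.evalue <= max_evalue:
            hits.append(hit)
    return hits


def candidate_loci(hits: list[Hit]) -> dict[str, tuple[int, int]]:
    """Group hits per contig and return the genomic span they cover."""
    per_contig = defaultdict(list)
    for hit in hits:
        per_contig[hit.subject].append(hit)
    return {
        contig: (min(min(h.sstart, h.send) for h in hs),
                 max(max(h.sstart, h.send) for h in hs))
        for contig, hs in per_contig.items()
    }


# Made-up example rows; contig names and coordinates are illustrative only.
EXAMPLE = "\n".join([
    "DdCoro\tcontig_7\t62.5\t80\t30\t0\t1\t80\t1200\t1439\t2e-30\t110",
    "DdCoro\tcontig_7\t55.0\t60\t27\t0\t90\t149\t1600\t1779\t1e-18\t75.1",
    "DdCoro\tcontig_9\t30.0\t40\t28\t0\t5\t44\t300\t181\t0.5\t28.0",
])
loci = candidate_loci(parse_tabular(EXAMPLE))
print(loci)
```

The resulting spans only delimit regions worth inspecting; exon borders still have to be determined at the genomic DNA level, as described above.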
Generating the multiple sequence alignment
The multiple sequence alignment of the coronin family has been built and extended during the process of annotating and assembling new sequences. The initial alignment has been generated from the first approximately 50 non-validated sequences obtained from NCBI using the ClustalW software with standard settings [62]. During the following correction of the sequences (removing wrongly annotated sequences and filling gaps) the alignment has been adjusted manually. Subsequently, every newly predicted sequence has been preliminarily aligned to its supposed closest relative using ClustalW, the aligned sequence added to the multiple sequence alignment of the coronins, and the coronin alignment adjusted manually during the subsequent sequence validation process. We have also retained the integrity of the primary sequence within the secondary structural elements that have been determined from the crystal structure (e.g. sequence gaps have only been introduced in known loop regions). Still, many gaps in sequences derived from low-coverage genomes remained. In those cases, the integrity of the exons surrounding the gaps has been maintained (gaps in the genomic sequence are reflected as gaps in the multiple sequence alignment). The unique and coiled-coil regions are completely divergent in sequence and length and were therefore aligned manually. The domain compositions of the short coronins, the class-3, and the class-4 coronins are different and regions outside the N-terminal coronin domain were only aligned within these groups. The C-terminal coronin domains of the class-3 coronins were separately included in the multiple sequence alignment of the coronins, in addition to being aligned as part of their class.
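The rule that alignment gaps should fall only in known loop regions lends itself to a simple automated check. The sketch below is an assumption-laden illustration, not the procedure used in the study (where the alignment was adjusted manually): it assumes a per-column structure mask derived from the crystal structure, in which the hypothetical code `L` marks loop columns, and a hypothetical helper `gaps_outside_loops`.

```python
def gaps_outside_loops(aligned_seq, structure_mask, gap="-", loop="L"):
    """Return 0-based alignment columns where a gap lies in a non-loop column.

    aligned_seq    -- one row of the multiple sequence alignment
    structure_mask -- one character per alignment column; `loop` marks loops,
                      any other character marks a secondary structure element
    """
    if len(aligned_seq) != len(structure_mask):
        raise ValueError("sequence and mask must have the same length")
    return [
        column
        for column, (residue, state) in enumerate(zip(aligned_seq, structure_mask))
        if residue == gap and state != loop
    ]


# Toy example: four strand columns, three loop columns, four strand columns.
MASK = "EEEELLLEEEE"
print(gaps_outside_loops("MKVR---FTGH", MASK))  # gaps confined to the loop
print(gaps_outside_loops("MK-R-A-FTGH", MASK))  # one gap inside a strand
```

Gaps caused by missing genomic sequence in low-coverage assemblies would also be flagged by such a check, so any hit needs to be interpreted against the source of the sequence.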
Building trees
For calculating phylogenetic trees only full-length and partial sequences were included in the alignment. The phylogenetic trees were generated based on the conserved coronin domains (corresponding to amino acids 1-386 of HsCoro1A) using two different methods: (1) Maximum likelihood (ML) using the LG model with estimated proportion of invariable sites and bootstrapping (1,000 replicates) using RAxML [63]. (2) Posterior probabilities were generated using MrBayes v3.1.2 [64] with the MPI option [65]. Two independent runs with 15,000,000 generations, four chains, and a random starting tree were computed using the mixed amino-acid option. From the 32,000th generation onwards MrBayes used the WAG model [66]. Using ProtTest [67], the LG model [68], which is, however, not implemented in MrBayes, was determined to provide a slightly better fit to the data than the WAG model. Trees were sampled every 1,000th generation and the first 25% of the trees were discarded as "burn-in" before generating a consensus tree.
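The sampling and burn-in settings translate into a simple tree count, sketched below. The function names are hypothetical, and two points are assumptions rather than statements from the text: whether the starting state is recorded as an extra sample (exposed as a parameter), and that the 25% burn-in is applied to each run separately.

```python
def sampled_trees(generations, sample_every, include_start=False):
    """Number of trees written by one run.

    include_start -- set to True if the program also records the starting
                     state as a sample (assumption; depends on the software).
    """
    count = generations // sample_every
    return count + 1 if include_start else count


def split_burnin(n_samples, burnin_fraction):
    """Return (discarded, kept) when the first fraction of samples is dropped."""
    discarded = int(n_samples * burnin_fraction)
    return discarded, n_samples - discarded


per_run = sampled_trees(15_000_000, 1_000)
discarded, kept = split_burnin(per_run, 0.25)
print(per_run, discarded, kept, 2 * kept)
```

With the values given in the text, each run yields 15,000 sampled trees, of which 3,750 are discarded and 11,250 are kept, so the two runs would contribute 22,500 trees to the consensus under these assumptions.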
Domain and motif prediction
Protein domains were predicted using the SMART [69] and Pfam [70] web servers. The leucine zipper motifs have been identified using the Prosite database [71]. The CA domains have been identified by visual inspection of the manual sequence alignment of the coronins and motif comparisons with CA domains of WASP family proteins available at CyMoBase (unpublished data, [61]). Graphical representations of the sequence patterns have been generated with WebLogo [72].
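A pattern-based leucine zipper scan of the kind provided by Prosite can be sketched with a regular expression. The sketch assumes the PROSITE leucine zipper pattern L-x(6)-L-x(6)-L-x(6)-L (entry PS00029); the text does not state which entry was used, and the converter below handles only a small subset of the PROSITE syntax. The function names and the test sequence are hypothetical.

```python
import re

LEUCINE_ZIPPER = "L-x(6)-L-x(6)-L-x(6)-L"  # assumed to correspond to PS00029

# One pattern element: x, a residue, [allowed] or {forbidden}, optional (n) or (n,m).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            regex = "."                        # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"    # any residue except those listed
        else:
            regex = core                       # single residue or [set]
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)


def find_motifs(sequence, pattern=LEUCINE_ZIPPER):
    """Return (1-based start, matched segment) for all matches, overlaps included."""
    # The lookahead makes the scan report overlapping occurrences as well.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical test sequence with leucines spaced seven residues apart.
print(find_motifs("MSLEEKVKALQDRNAELEKKSQELGK"))
```

Such a pattern is short and matches by chance in many proteins, so hits are best read together with the coiled-coil context rather than as evidence on their own.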
Additional material
Additional file 1: Sequence alignment of the coronins. The file contains the alignment of the full-length sequences of the coronins in FASTA format. The data can also be downloaded from CyMoBase [61].
Additional file 2: MrBayes tree of the coronin family. This file contains the phylogenetic tree calculated with MrBayes including posterior probability values that has been the basis for Figure 1. Here, the tree is plotted in an extended way so that every coronin can be found and compared easily.
Additional file 3: RAxML tree of the coronin family. This file contains the phylogenetic tree calculated with RAxML including bootstrap values. The tree is plotted in an extended way so that every coronin can be found and compared easily.
Additional file 4: Coronin repertoire of all eukaryotes analyzed. Complete table of the coronin inventories of 358 eukaryotes.
Additional file 5: Conserved residues in the coronin domain. This figure contains the sequence conservation of the entire coronin domain including all mutagenesis experiments as described in Cai et al. [40] and Gandhi et al. [41].
Acknowledgements
This work has been funded by grants KO 2251/3-1 and KO 2251/3-2 of the Deutsche Forschungsgemeinschaft.
Authors' contributions
CE and MK assembled coronin sequences, performed data analysis and wrote the manuscript. BH performed the phylogenetic analysis. All authors read and approved the final manuscript.
Received: 28 June 2011 Accepted: 25 September 2011 Published: 25 September 2011
References
1. de Hostos EL, Bradtke B, Lottspeich F, Guggenheim R, Gerisch G: Coronin, an actin binding protein of Dictyostelium discoideum localized to cell surface projections, has sequence similarities to G protein beta subunits. EMBO J 1991, 10:4097-4104.
2. Tardieux I, Liu X, Poupel O, Parzy D, Dehoux P, Langsley G: A Plasmodium falciparum novel gene encoding a coronin-like protein which associates with actin filaments. FEBS Lett 1998, 441:251-256.
3. Figueroa JV, Precigout E, Carcy B, Gorenflot A: Identification of a coronin-like protein in Babesia species. Ann N Y Acad Sci 2004, 1026:125-138.
4. Heil-Chapdelaine RA, Tran NK, Cooper JA: The role of Saccharomyces cerevisiae coronin in the actin and microtubule cytoskeletons. Curr Biol 1998, 8:1281-1284.
5. Suzuki K, Nishihata J, Arai Y, Honma N, Yamamoto K, Irimura T, Toyoshima S: Molecular cloning of a novel actin-binding protein, p57, with a WD repeat and a leucine zipper motif. FEBS Lett 1995, 364:283-288.
6. de Hostos EL: A brief history of the coronin family. Subcell Biochem 2008, 48:31-40.
7. Clemen CS, Rybakin V, Eichinger L: The coronin family of proteins. Subcell Biochem 2008, 48:1-5.
8. Uetrecht AC, Bear JE: Coronins: the return of the crown. Trends Cell Biol 2006, 16:421-426.
9. Smith TF: Diversity of WD-repeat proteins. Subcell Biochem 2008, 48:20-30.
10. Neer EJ, Schmidt CJ, Nambudripad R, Smith TF: The ancient regulatory-protein family of WD-repeat proteins. Nature 1994, 371:297-300.
11. de Hostos EL, Rehfuess C, Bradtke B, Waddell DR, Albrecht R, Murphy J, Gerisch G: Dictyostelium mutants lacking the cytoskeletal protein coronin are defective in cytokinesis and cell motility. J Cell Biol 1993, 120:163-173.
12. Cai L, Holoweckyj N, Schaller MD, Bear JE: Phosphorylation of coronin 1B by protein kinase C regulates interaction with Arp2/3 and cell motility. J Biol Chem 2005, 280:31913-31923.
13. Maniak M, Rauchenberger R, Albrecht R, Murphy J, Gerisch G: Coronin involved in phagocytosis: dynamics of particle-induced relocalization visualized by a green fluorescent protein Tag. Cell 1995, 83:915-924.
14. Ferrari G, Langen H, Naito M, Pieters J: A coat protein on phagosomes involved in the intracellular survival of mycobacteria. Cell 1999, 97:435-447.
15. Rybakin V, Stumpf M, Schulze A, Majoul IV, Noegel AA, Hasse A: Coronin 7, the mammalian POD-1 homologue, localizes to the Golgi apparatus. FEBS Lett 2004, 573:161-167.
16. de Hostos EL: The coronin family of actin-associated proteins. Trends Cell Biol 1999, 9:345-350.
17. Appleton BA, Wu P, Wiesmann C: The crystal structure of murine coronin-1: a regulator of actin cytoskeletal dynamics in lymphocytes. Structure 2006, 14:87-96.
18. Spoerl Z, Stumpf M, Noegel AA, Hasse A: Oligomerization, F-actin interaction, and membrane association of the ubiquitous mammalian coronin 3 are mediated by its carboxyl terminus. J Biol Chem 2002, 277:48858-48867.
19. Oku T, Itoh S, Ishii R, Suzuki K, Nauseef WM, Toyoshima S, Tsuji T: Homotypic dimerization of the actin-binding protein p57/coronin-1 mediated by a leucine zipper motif in the C-terminal region. Biochem J 2005, 387:325-331.
20. Kammerer RA, Kostrewa D, Progias P, Honnappa S, Avila D, Lustig A, Winkler FK, Pieters J, Steinmetz MO: A conserved trimerization motif controls the topology of short coiled coils. Proc Natl Acad Sci USA 2005, 102:13891-13896.
21. Rybakin V, Clemen CS: Coronin proteins as multifunctional regulators of the cytoskeleton and membrane trafficking. Bioessays 2005, 27:625-632.
22. Morgan RO, Fernandez MP: Molecular phylogeny and evolution of the coronin gene family. Subcell Biochem 2008, 48:41-55.
23. Odronitz F, Hellkamp M, Kollmar M: diArk - a resource for eukaryotic genome research. BMC Genomics 2007, 8:103.
24. Breathnach R, Chambon P: Organization and expression of eucaryotic split genes coding for proteins. Annu Rev Biochem 1981, 50:349-383.
25. Odronitz F, Pillmann H, Keller O, Waack S, Kollmar M: WebScipio: an online tool for the determination of gene structures using protein sequences. BMC Genomics 2008, 9:422.
26. Keller O, Odronitz F, Stanke M, Kollmar M, Waack S: Scipio: using protein sequences to determine the precise exon/intron structures of genes and their orthologs in closely related species. BMC Bioinformatics 2008, 9:278.
27. Parfrey LW, Grant J, Tekle YI, Lasek-Nesselquist E, Morrison HG, Sogin ML, Patterson DJ, Katz LA: Broadly sampled multigene analyses yield a well-resolved eukaryotic tree of life. Syst Biol 2010, 59:518-533.
28. Reeb VC, Peglar MT, Yoon HS, Bai JR, Wu M, Shiu P, Grafenberg JL, Reyes-Prieto A, Rummele SE, Gross J, Bhattacharya D: Interrelationships of chromalveolates within a broadly sampled tree of photosynthetic protists. Mol Phylogenet Evol 2009, 53:202-211.
29. Hampl V, Hug L, Leigh JW, Dacks JB, Lang BF, Simpson AG, Roger AJ: Phylogenomic analyses support the monophyly of Excavata and resolve relationships among eukaryotic "supergroups". Proc Natl Acad Sci USA 2009, 106:3859-3864.
30. Goode BL, Wong JJ, Butty AC, Peter M, McCormack AL, Yates JR, Drubin DG, Barnes G: Coronin promotes the rapid assembly and cross-linking of actin filaments and may link the actin and microtubule cytoskeletons in yeast. J Cell Biol 1999, 144:83-98.
31. Liu SL, Needham KM, May JR, Nolen BJ: Mechanism of a Concentration-dependent Switch between Activation and Inhibition of Arp2/3 Complex by Coronin. J Biol Chem 2011, 286:17039-17046.
32. Veltman DM, Insall RH: WASP family proteins: their evolution and its physiological implications. Mol Biol Cell 2010, 21:2880-2893.
33. Vertessy BG, Toth J: Keeping uracil out of DNA: physiological role, structure and catalytic mechanism of dUTPases. Acc Chem Res 2009, 42:97-106.
34. Gloss A, Rivero F, Khaire N, Muller R, Loomis WF, Schleicher M, Noegel AA: Villidin, a novel WD-repeat and villin-related protein from Dictyostelium, is associated with membranes and the cytoskeleton. Mol Biol Cell 2003, 14:2716-2727.
35. Yonemura I, Mabuchi I: Heterogeneity of mRNA coding for Caenorhabditis elegans coronin-like protein. Gene 2001, 271:255-259.
36. Xavier CP, Rastetter RH, Stumpf M, Rosentreter A, Muller R, Reimann J, Cornfine S, Linder S, van Vliet V, Hofmann A, Morgan RO, Fernandez MP, Schroder R, Noegel AA, Clemen CS: Structural and functional diversity of novel coronin 1C (CRN2) isoforms in muscle. J Mol Biol 2009, 393:287-299.
37. Asano S, Mishima M, Nishida E: Coronin forms a stable dimer through its C-terminal coiled coil region: an implicated role in its localization to cell periphery. Genes Cells 2001, 6:225-235.
38. Beck K, Gambee JE, Kamawal A, Bachinger HP: A single amino acid can switch the oligomerization state of the alpha-helical coiled-coil domain of cartilage matrix protein. EMBO J 1997, 16:3767-3777.
39. Oku T, Itoh S, Okano M, Suzuki A, Suzuki K, Nakajin S, Tsuji T, Nauseef WM, Toyoshima S: Two regions responsible for the actin binding of p57, a mammalian coronin family actin-binding protein. Biol Pharm Bull 2003, 26:409-416.
40. Cai L, Makhov AM, Bear JE: F-actin binding is essential for coronin 1B function in vivo. J Cell Sci 2007, 120:1779-1790.
41. Gandhi M, Jangi M, Goode BL: Functional surfaces on the actin-binding protein coronin revealed by systematic mutagenesis. J Biol Chem 2010, 285:34899-34908.
42. Van de Peer Y, Maere S, Meyer A: 2R or not 2R is not the question anymore. Nat Rev Genet 2010, 11:166.
43. Steinke D, Hoegg S, Brinkmann H, Meyer A: Three rounds (1R/2R/3R) of genome duplications and the evolution of the glycolytic pathway in vertebrates. BMC Biol 2006, 4:16.
44. Jaillon O, Aury JM, Brunet F, Petit JL, Stange-Thomann N, Mauceli E, Bouneau L, Fischer C, Ozouf-Costaz C, Bernot A, Nicaud S, Jaffe D, Fisher S, Lutfalla G, Dossat C, Segurens B, Dasilva C, Salanoubat M, Levy M, Boudet N, Castellano S, Anthouard V, Jubin C, Castelli V, Katinka M, Vacherie B, Biemont C, Skalli Z, Cattolico L, Poulain J, et al: Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. Nature 2004, 431:946-957.
45. Chan KT, Creed SJ, Bear JE: Unraveling the enigma: progress towards understanding the coronin family of actin regulators. Trends Cell Biol 2011, 21:481-488.
46. Odronitz F, Kollmar M: Drawing the tree of eukaryotic life based on the analysis of 2,269 manually annotated myosins from 328 species. Genome Biol 2007, 8:R196.
47. Berg JS, Powell BC, Cheney RE: A millennial myosin census. Mol Biol Cell 2001, 12:780-794.
48. Archer SK, Claudianos C, Campbell HD: Evolution of the gelsolin family of actin-binding proteins as novel transcriptional coactivators. Bioessays 2005, 27:388-396.
49. Xavier CP, Eichinger L, Fernandez MP, Morgan RO, Clemen CS: Evolutionary and functional diversity of coronin proteins. Subcell Biochem 2008, 48:98-109.
50. Khurana S, George SP: Regulation of cell structure and function by actin-binding proteins: villin's perspective. FEBS Lett 2008, 582:2128-2139.
51. Simpson AG, Inagaki Y, Roger AJ: Comprehensive multigene phylogenies of excavate protists reveal the evolutionary positions of "primitive" eukaryotes. Mol Biol Evol 2006, 23:615-625.
52. Keeling PJ: The endosymbiotic origin, diversification and fate of plastids. Philos Trans R Soc Lond B Biol Sci 2010, 365:729-748.
53. Burki F, Shalchian-Tabrizi K, Pawlowski J: Phylogenomics reveals a new 'megagroup' including most photosynthetic eukaryotes. Biol Lett 2008, 4:366-369.
54. Burki F, Shalchian-Tabrizi K, Minge M, Skjaeveland A, Nikolaev SI, Jakobsen KS, Pawlowski J: Phylogenomics reshuffles the eukaryotic supergroups. PLoS One 2007, 2:e790.
55. Nozaki H, Maruyama S, Matsuzaki M, Nakada T, Kato S, Misawa K: Phylogenetic positions of Glaucophyta, green plants (Archaeplastida) and Haptophyta (Chromalveolata) as deduced from slowly evolving nuclear genes. Mol Phylogenet Evol 2009, 53:872-880.
56. Keeling PJ: Chromalveolates and the evolution of plastids by secondary endosymbiosis. J Eukaryot Microbiol 2009, 56:1-8.
57. Hackett JD, Yoon HS, Li S, Reyes-Prieto A, Rummele SE, Bhattacharya D: Phylogenomic analysis supports the monophyly of cryptophytes and haptophytes and the association of rhizaria with chromalveolates. Mol Biol Evol 2007, 24:1702-1713.
58. diArk - a resource for eukaryotic genome research. [http://www.diark.org].
59. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL: NCBI BLAST: a better web interface. Nucleic Acids Res 2008, 36:W5-9.
60. Odronitz F, Kollmar M: Pfarao: a web application for protein family analysis customized for cytoskeletal and motor proteins (CyMoBase). BMC Genomics 2006, 7:300.
61. CyMoBase - a database for cytoskeletal and motor proteins. [http://www.cymobase.org].
62. Thompson JD, Gibson TJ, Higgins DG: Multiple sequence alignment using ClustalW and ClustalX. Curr Protoc Bioinformatics 2002, Chapter 2:Unit 2.3.
63. Stamatakis A, Hoover P, Rougemont J: A rapid bootstrap algorithm for the RAxML Web servers. Syst Biol 2008, 57:758-771.
64. Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 2003, 19:1572-1574.
65. Altekar G, Dwarkadas S, Huelsenbeck JP, Ronquist F: Parallel Metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference. Bioinformatics 2004, 20:407-415.
66. Whelan S, Goldman N: A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 2001, 18:691-699.
67. Abascal F, Zardoya R, Posada D: ProtTest: selection of best-fit models of protein evolution. Bioinformatics 2005, 21:2104-2105.
68. Le SQ, Gascuel O: An improved general amino acid replacement matrix. Mol Biol Evol 2008, 25:1307-1320.
69. Letunic I, Doerks T, Bork P: SMART 6: recent updates and new developments. Nucleic Acids Res 2009, 37:D229-232.
70. Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, Holm L, Sonnhammer EL, Eddy SR, Bateman A: The Pfam protein families database. Nucleic Acids Res 2010, 38:D211-222.
71. Sigrist CJ, Cerutti L, de Castro E, Langendijk-Genevaux PS, Bulliard V, Bairoch A, Hulo N: PROSITE, a protein domain database for functional characterization and annotation. Nucleic Acids Res 2010, 38:D161-166.
72. Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo generator. Genome Res 2004, 14:1188-1190.
73. Letunic I, Bork P: Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 2007, 23:127-128.
doi:10.1186/1471-2148-11-268
Cite this article as: Eckert et al.: A holistic phylogeny of the coronin gene family reveals an ancient origin of the tandem-coronin, defines a new subfamily, and predicts protein function. BMC Evolutionary Biology 2011 11:268.