<<

Eckert et al. BMC Evolutionary Biology 2011, 11:268 http://www.biomedcentral.com/1471-2148/11/268

RESEARCHARTICLE Open Access A holistic phylogeny of the family reveals an ancient origin of the tandem-coronin, defines a new subfamily, and predicts function Christian Eckert, Björn Hammesfahr and Martin Kollmar*

Abstract Background: Coronins belong to the superfamily of the eukaryotic-specific WD40-repeat and play a role in several -dependent processes like cytokinesis, motility, , and vesicular trafficking. Two major types of coronins are known: First, the short coronins consisting of an N-terminal coronin , a unique region and a short coiled-coil region, and secondly the tandem coronins comprising two coronin domains. Results: 723 coronin proteins from 358 species have been identified by analyzing the whole- assemblies of all available sequenced (March 2011). The organisms analyzed represent most eukaryotic kingdoms but also cover every taxon several times to provide a better statistical sampling. The phylogenetic tree of the coronin domains based on the Bayesian method is in accordance with the most recent grouping of the major kingdoms of the eukaryotes and also with the grouping of more recently separated branches. Based on this “holistic” approach the coronins group into four classes: class-1 (Type I) and class-2 (Type II) are metazoan/ specific classes, class-3 contains the tandem-coronins (Type III), and the new class-4 represents the coronins fused to villin (Type IV). Short coronins from non-metazoans are equally related to class-1 and class-2 coronins and thus remain unclassified. Conclusions: The coronin class distribution suggests that the last common eukaryotic ancestor possessed a single and a tandem-coronin, and most probably a class-4 coronin of which homologs have been identified in and although most of these species subsequently lost the class-4 homolog. The most ancient short coronin already contained the trimerization motif in the coiled-coil domain.

Background eukaryotic-specific WD40-repeat proteins [9,10] and play The coronin proteins, which were originally isolated as a a role in several actin-dependent processes like cytokin- major co-purifying protein from an actin-myosin-com- esis [11], cell motility [11,12], phagocytosis [13,14], and plexoftheslimemoldDictyostelium discoideum [1], vesicular trafficking [15]. have since been identified in other [2,3], fungi WD-repeat motifs are minimally conserved regions of [4], and animals [5], but are absent in . Coronins approximately 40-60 amino acids typically starting with are a conserved family of actin binding proteins [6-8] and Gly-His (GH) dipeptides 11-24 residues away from the N- the first family member had been named coronin based terminus and ending with a Trp-Asp (WD) dipeptide at on its strong immunolocalization to the actin rich crown the C-terminus. WD40-repeat proteins, which are charac- like structures of the cell cortex in discoi- terized by the presence of at least four consecutive WD deum [1]. Coronins belong to the superfamily of the repeats in the middle of the molecule, fold into beta pro- peller structures and serve as stable platforms for protein- protein interactions [9]. * Correspondence: [email protected] The coronin proteins have five canonical WD-repeat Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Goettingen, Germany motifs located centrally. Since the region encoding the

© 2011 Eckert et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and in any medium, provided the original work is properly cited. Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 2 of 17 http://www.biomedcentral.com/1471-2148/11/268

WD repeats is similar to the sequence of the beta-subu- Results nit of trimeric G-proteins the formation of a five-bladed Identification and annotation of the coronin proteins beta-propeller was assumed for coronins [16]. However, The coronin protein were identified by TBLASTN the determination of the structure of murine coronin-1 searches against the corresponding genome data of the dif- (MmCoro1A [17]) demonstrated that the protein, analo- ferent species. The list of sequenced eukaryotic species as gous to the trimeric G-proteins, forms a seven-bladed well as access information to the corresponding genome beta-propeller carrying two potential F-actin binding data has been obtained from diArk [23]. Species that sites. Apart from the central WD-repeats, almost all cor- missed certain orthologs in the first instance were later onin proteins have a C-terminal coiled-coil sequence searched again with supposed-to-be orthologs of other that mediates homo-oligomerization [18-20], and a short closely related species. In this iterative process all coronin N-terminal motif that contains an important regulatory family proteins have been identified or their loss in certain phosphorylation site in coronin-1B [12]. In addition, species or taxa was confirmed. Because verified cDNA each coronin protein has a unique region of variable sequences and protein predictions, which often contain length and composition following the conserved exten- mispredicted exons and introns even in the “annotated” sion to the C-terminus of the beta-propeller. , are not available for most of the sequenced spe- Based on their domain composition coronins have ori- cies, the protein sequences were assembled and assigned ginally been divided into two subfamilies, namely short and by manual inspection of the genomic DNA sequences. long coronins [21]. Short coronins consist of 450 - 650 Exons have been confirmed by the identification of flank- amino acids containing one seven-bladed beta-propeller ing consensus intron-exon splice junction donor and and a C-terminal coiled-coil region. Furthermore, the N- acceptor sequences [24]. In addition, the gene structures terminal region of most known short coronins contains 12 of all coronin genes were reconstructed using WebScipio basic amino acids. Since this motif is only present in coro- [25,26]. Through comparison of the intron positions and nin molecules, it has been suggested as a novel coronin sig- splice-site phases in relation to the protein multiple- nature [21]. The longer types of coronin, also called POD sequence alignment, several suspicious exon border pre- or Coronin 7, possess two complete core domains in tan- dictions could be resolved and the protein sequences sub- dem but lack a coiled-coil motif. In the longer coronins, sequently be corrected. The genomic sequences of many the sequence of the basic N-terminal motif is reduced to 5 species contain several gaps due to the low coverage of the amino acids. Based on phylogenetic relationships among sequencing or problems in the assembly process. Only the coronins, the Genome Organization nomencla- some of the gaps could be closed at the amino-acid level ture committee (HGNC) proposed a system in 2001 that by analyzing EST data. grouped the short coronins into two classes resulting in a The coronin dataset contains 723 sequences from 358 total of three subtypes [8]. Very recently, a new nomencla- organisms (Table 1). 614 sequences are complete, and an ture has been suggested dividing the coronins into twelve additional 44 sequences are partially complete. Sequences subclasses based on the analysis of about 250 coronins for which a small part is missing (up to 5%) were termed from most taxa [22]. In contrast to previous systems, every “Partials”, while sequences for which a considerable part is mammalian coronin (and corresponding vertebrate homo- missing were termed “Fragments”.Thisdifferencehas logs) was designated an own class resulting in seven verte- been introduced because Partials are not expected to con- brate classes. Invertebrates were grouped into two classes, siderably influence the phylogenetic analysis. Several of the fungi got an own class, coronins from were the genes were termed pseudogenes because they contain grouped with those from Parabasalids (class 10), and the too many frame shifts, in-frame stop codons, and missing remaining coronins from , Heterolobosea, and sequences to be attributed to sequencing or assembly were combined into the twelfth class. This errors. study constituted the first major phylogenetic analysis of the coronin family. However, this classification was not Multiple sequence alignment, phylogenetic analysis, and consistent with the latest phylogeny of the eukaryotes and classification homologs of some major branches like the stramenopiles A multiple sequence alignment of all coronin family mem- were missing. bers has been created and extensively manually improved Here, we present the analysis of the complete coronin (Additional file 1). The basis of the alignment was the con- repertoires of all eukaryotic organisms sequenced and served coronin domain that consists of the b-propeller assembled so far. The distribution of all coronin homo- region and a subsequent conserved extension, which packs logs is in accordance with the latest taxonomy of the against the “bottom” surface of the propeller [17]. This eukaryotes and reveals the origin of the tandem-coronin entire domain is conserved in all coronin homologs and and another newly defined class in the last common we would therefore suggest naming it coronin-domain. ancestor of the eukaryotes. The unique regions following the coronin-domain could Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 3 of 17 http://www.biomedcentral.com/1471-2148/11/268

Table 1 Data statistics coronin classification as simple as possible and to pro- coronin vide the highest consistency with previous classification Sequence schemes, the following classification is proposed: The Total 723 classification should solely be based on the phylogenetic From WGS 700 tree of the coronin-domains because it is in accordance Domains 7 with the phylogeny of the eukaryotes and contains the Amino acids conserved part of the proteins that is the basis of the Total pseudogenes 7 protein family. Metazoan species encode two phylogen- Pseudogenes without sequence 3 etically distinct groups of coronins that have historically Completeness been named class-1 and class-2 coronins. Further var- Complete 614 iants of these classes should be named alphabetically, Partials 44 e.g. class-1A, class-1B, etc.. However, due to the inde- Fragments 62 pendent whole-genome, genomic region, and single Species gene duplication events of certain phylogenetic branches Total 358 these variant designations do not always refer to ortho- WGS-projects 323 logs. For the mammalian coronins, which are the best EST-projects 112 analyzed coronins, the suggested classification is almost WGS- and EST-projects 152 entirely consistent with previous classifications [8] and the HGNC nomenclature except for “CORO6” and “CORO7”, which are here classified as coronin-1D and only be aligned for homologs of closely related species. coronin-3, respectively. Class-3 comprises the tandem The C-terminal predicted coiled-coil regions were aligned coronins. All members of this class group together in the again for all corresponding sequences to analyze potential phylogenetic tree, and only single homologs have been oligomerization patterns (see below). The second coronin- found in all species analyzed. Class-4 is a newly defined domains of the tandem-coronins were also aligned to the classthatcontainscoroninswithvariablenumbersof coronin-domains for the phylogenetic analysis. One part C-terminal PH, gelsolin, and VHP domains, but also cor- of the coronin-domain in coronin-1D is encoded by a onins with only very short sequences outside the coro- cluster of mutually exclusive exons (see below) and there- nin-domain. The other coronins group in accordance fore the exon with the higher sequence identity to related with the latest taxonomy of the species (Figure 1). In our homologs has been included in the alignment. The phylo- opinion it does not add information or help the scientific genetic tree of the coronin family was calculated for 764 community if those coronins were classified separately. coronin-domains, including both coronin-domains of the In contrast to the metazoans, gene duplications in the tandem-coronins separately, using the Bayesian (Addi- branches of Amoeba, Excavata, and SAR are species-spe- tional file 2) and the maximum-likelihood method (Addi- cific and do not warrant further subclassification at the tional file 3). The resulting trees were almost identical. moment. For example, instead of talking about a “class- However, the relations of the innermost nodes represent- 11 coronin” and long explanations what type of coronins ing the most ancient relationships were best resolved would belong to such a class, it would be easier, shorter, using the Bayesian approach (Figure 1). The resulting phy- and less confusing to just say a “Naegleria coronin”,an logenetic tree is in accordance with the latest phylogenetic “apicomplexan coronin” or a “yeast coronin”.Thedistri- grouping of the six kingdoms of the eukaryotes [27-29] of bution of the coronins analyzed here is summarized for which five are covered by the data analyzed here. Thus, some example species in Figure 2 including previously coronins of phylogenetic related species group together in used names and classification schemes. The distribution the coronin family tree. In the coronin tree, not only the of all coronins is found in Additional file 4. Coronin grouping is retained but also the evolutionary history of homologs are absent in Rhodophyta (Cyanidioschyzon, the branches. For example, the fungi separate as mono- Galdieria), Viridiplantae, Microsporidia, Formicata phyletic group before the metazoans, and after the (Giardia), and Haptophyceae (Emiliania). Amoeba. The classification into subfamilies should at best include Short coronins (class-1, class-2, and unclassified coronins) both the phylogenetic grouping of the protein family The domain organisations of most short coronins (class-1, members and the domain organisation of the respective class-2, and unclassified coronins) are similar. They consist homologs. However, because most coronins contain a of the 390 amino-acid long coronin-domain followed by a unique region between the coronin-domain and the short unique domain and a C-terminal short coiled-coil C-terminal coiled-coil regions, several sub-branch speci- region (about 30-40 amino acids, Figure 3). The unique fic domain organisation patterns evolved. To keep the regions are conserved in branches (e.g. the vertebrates Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 4 of 17 http://www.biomedcentral.com/1471-2148/11/268

Class-3 N-Term SAR

Psp 3

Cej 3

Cb 3

Aim 3 Cis 3 Hb 3 Caf 3 3 Str 3

E Myl 3 C

Mam 3 Ci 3 Pna 3 Pat 3 qc 3 Pah 3 Oc 3 vp 3 Bt 3 H

v 3 Caj 3 Mm 3 ac 3

Rn 3 s 3 Car 3 Car Cab 3

Ce 3

N Stp 3 Dap Rhp 3 pf 3 Pdc 3 Md 3 Nav 3 A ic 3 Cmf 3 Gg 3 Hrs 3 Bot 3 Am 3 ea 3 Tar 3 A T Ang 3 Ga 3 Xt 3 And 3 Cp A Cpq 3 Br 3 Myd 3 er 3 Da 3 Amq 3 Dy 3 Tct 3 Bf 3 Dse 3 Lg 3 D Spr 3 t 3 Dm 3 Dw 3 Dp 3 Fnd_c 3 Rg 3 Dg 3 Pug 3 Put 3 Mv 3 Dv 3 F Bh A Fnb_c 3 Mlp 3 Bh B nd_b 3 Thp Fnb_b Frc Fna_b 3 Pht Ecs Aua A rp Bad_b 3 Aua B Bad_a 3 Rha 3 S Phb 3 Pu hs Hya P hi 3 Pyc S P Phr pp 3 t

Ppp 3Co Bin P Ays 3 Tet Crm A Dif 3 3 Ch A Dcp 3Dd 3 Cp A Crm B Phs 3Ac 3 Ch B Phr 3 Cp B Phi 3 Bab T Bb Pu 3 Tep v Dse 3Cterm Srp 3 Tha Ecs 3 a Et Da 3Cterm 3 Nca Eh 3 Tg_a D Dy 3Cterm Ed 3 Tg_b ss 3Cterm Eti 3 Tg_c Dm 3Cterm Pb Der 3Cterm Plc Dw 3Cterm Ply Pk Dp 3Cterm Pv Dv 3Cterm Plg Dmo Dg3Cterm 3Cterm Plr Myd 3Cterm Pf_j Ang 3Cterm Pf_i Pf_c And 3Cterm Pf_h f_a Cpq 3Cterm P Excavata Aea 3Cterm Pf_d Tic 3Cterm Pf_b Apf 3Cterm Pf_e Ng Am 3Cterm Tv_a C Bot 3Cterm Cmf 3Cterm Tv_a A Tv_a cB A Aac 3Cterm Tr Hrs 3Cterm Trc B Nav 3Ct Tyv Rhp 3Cterm Tb Pdc 3Cterm Tbg Dap 3Cterm Crf A erm Crf B Is 3Cterm Lb Lg 3Cterm Let Cpt 3Cterm Lei Caf 3Cterm Lem Aim 3Cterm Lc Bt 3Cterm Lm Amoebae Myl 3Cterm Alveolata Ac Eqc 3Cterm Ach Rn 3Cterm Mab Mm 3Cterm Eti A Cvp 3Cterm Ed A Stramenopiles Eh A Oc 3Cterm Ed B GggHs 3Cter 3Ctermm Eh B Pp A Pat 3Cterm p B Pna 3Cterm P Pah 3Cterm Dcp Mam 3Cterm Dd Dif ys Caj 3Cterm A Ora 3Cterm Ppp Md 3Cterm Tct Aoc 3Cterm Phb Tag 3Cterm Muc Gg 3Cterm Rha A Rha B ta Class-3 C-Term Xt 3Cterm Alm Balpha Tn 3Cterm Tar 3Cterm Alm Aalpha Alm Abe Ga 3Cterm Ol a 3Cterm Spp Bad_a Br 3Cterm Bad_b Stp 3Cterm Mlp Bf 3Cterm Pug Cis 3Cterm Put Ci 3Cterm Mv Amq 3Cterm Rg Nv 3Cterm Spr Cb 3Cterm Mlg Car 3Cterm Um_a Cej 3Cterm Um_b Cab 3Cterm Tem Ce 3Cterm Fnb_b Hb 3Cterm Fna_b Psp 3Cterm Fnd_b Str 3Cterm Fnd_c Chp Co 3Cterm Spr 3Cterm Phc Ges Rg 3Cterm Dis Mv 3Cterm Tav Put 3Cterm Fp Pug 3Cterm Wc Mlp 3Cterm Fnd c 3Cterm Ppl B Fnd_b 3Cterm Ppl A Fna_b 3Cterm Ppl C Fnb_b 3Cterm Hta Fnb_c 3Cterm Sth Glt Rha 3Cterm Pus Muc 3Cterm Sll Phb 3Cterm Lab A Bad_b 3Cterm Lab B Bad_a 3Cterm Plo Spp 3Cterm Scc Tv a 3Cterm Cpc Basidomycota Abb Eh 3Cterm Ed 3Cterm Agb Yl Eti 3Cterm Ppp 3Cterm Wa Ays 3Cterm Pcp_a Pcp_b Dif 3Cterm Pia_b Dd 3Cterm Cll Dcp 3Cterm Deh Pp 3Cterm Mrg Ac 3Cterm Shs Phs 3Cterm Cap Phi 3Cterm Loe Phr 3Cterm Ct_a Pu 3Cterm Cad Srp 3Cterm Ca_a Ecs 3Cterm Ca_b Aua 3Cterm Erg Tct 3Cterm Ka Kl Dd 4 Klw Dcp 4 Lak_a Dif 4A Ppp 4A Lat Lw Ays 4B Zr Ays 4A Vp Eh 4 Cgl Ed 4 Nac Class-4 Eti 4 1.00 Sap_a Ng 4 Sc_a Spp 4 0.53 1.00 Sc_b Co 4 Sc_c Ppp 4B Smi Ays 4C Sak Dif 4B 0.87 0.60 Sab_a 0.60 Sab_b 0.58 Sj 0.88 0.89 0.75 0.61 Sp 0.44 0.83 Shc Sho Tum 0.52 0.41 Pn 0.59 0.08 0.34 Coh Alb 0.45 Ptr Pyt Fungi Bg Ged Bof Scs Crp Poa Tit Chg Ascomycota Th 0.41 0.38 Som Ned Rn 1C Nc Mm 1C 1.00 Net Ora 1C 0.29 Grc Ss 1C 0.45 Ggt Bt 1C Mag Ggg 1C 0.98 0.45 Glg Aim 1C Va Pah 1C Vd Pna 1C Hpv Pat 1C Hj Tra Cvp 1C Caf 1C 0.90 Ecf Nh Mam 1C 1.00 0.46 Gz Hs 1C 0.38 La 1C Fo Gim Caj 1C 0.41 0.98 1.00 Msp Otg 1C 0.41 0.37 Mcp Myl 1C Mg Md 1C 0.50 Myf Eqc 1C Vertebrata Class-1C Pcm Spt 1C Tls Oc 1C 1.00 1.00 Af Tag 1C Ao Gg 1C En Aoc 1C 1.00 Aec Xt 1C 1.00 An_a 0.50 Xl 1Cbeta 1.00 An_b Ast Xl 1Calpha Pch Ol_b 1Cbeta Asc Ol_a 1Cbeta Nef Ga 1Cbeta Asf_a Tn 1Cbeta Asf_b Tar 1Cbeta Pab_b Br 1Calpha 1.00 Pab_a Tn 1Calpha Pab_c Ol_a 1Calpha Ajd_a Tar 1Calpha Ajd_b Ga 1Calpha Ajc_d Br 1CbetaLa 1B Invertebrate Class-2 Ajc_e Ect A Ajc_a Ere 1B Ajc b Caf 1B Ajc_c Oc 1B Aro Arg Mim 1B Bt 1B Arb Myl 1B Te Thv Pna 1B Caj 1B Vertebrata Class-1B Trr Trt Pah 1B Hs 1B Ur Cop_b Ggg 1B Cop_c Pat 1B Coi_a Eqc 1B Coi_d Aim 1B Coi_b Mm 1B Vertebrata Class-2B Coi_c Rn 1B Co Cvp 1B Tia 2A Aoc 1B Tia 2B Ora 1D Nv 2 Md 1D Amq 2 Bt 1D Mb 2 La 1D Pro 2 Cvp 1D Cpt 2 Lg 2 Capsaspora Mim 1D Vertebrata Class-2A Eqc 1D Her 2A Spt 1D Her 2B Pah 1D Dap Is 2 Mam 1D Ayp 2 Pna 1D Hs 1D Rhp 2 Myd 2 owczarzaki Ggg 1D Vertebrata Class-1D Aea 2 Pat 1D yl 1D Cpq 2 M And 2 Caj 1D Fc 1D Ang 2 Tic 2 Aim 1D Nav 2 Caf 1D (Fungi/Metazoa Rn 1D Hrs 2 Aac 2 Mm 1D Cmf 2 Oc 1D Bot 2 Tag 1D Am 2 Gg 1D Apf 2 ) Aoc Xt1D 1D1D Trs 2 Brm 2 Tn Wb 2 Tar 1D Lol 2 Ga 1D Ov 2 Stp 2 Ol_a Br1D 1D Bf 2 Hs 1A Br 2D Br Ggg 1Aat 1A P Ga 2B2 1A Ol_a 2B B Pna Mf1A Tar 2B Tn 2B Mam 1A Xl 2B Pah 1A Xt 2B Caj 1A Aoc Eqc 1A Gg Caf 1A Ora 2B 2B Myl 1A Ss 1A Md 2B 2B Bt 1A Mm 2B im 1A Vertebrata Class-1A Rn 2B A Myl 2B Bt 2B Mim 1ALa 1A Eqc 2B Rn 1A 1A Aim 2B C Mm 1A M af 2B Cvp Oc 2Bam 2B Spt 1A Pah 2B Md 1A 1A Oc 1A Cvp 2B Dn 1An La 2B T Caj 2B ar 1A Ggg 2B T Pat 2B Ga 1A Hs 2B Br 1A Pna Cyc 1A Br 2C Ol_a 2C h 1A 1A Ol_b 2C Ol_b 1A F Ga 2C 2B Ol_a 1A Xt 1A Tar 2C Xl Fish Class-1E Tn 2C Br 2A G Aoc 1ATn 1E Ol_a 2A Tar 2A a 2A Tar 1E Tn 2A 1B Xl 2A Xt 2A Ol_b 1E Ga 1EBr 1E Aoc 2A Ol_a 1E Gg 2A Tag 2A Drp Dp 1B Md 2A L Drp 1ADp 1A Myl 2A Cvp 2A a 2A Spt 2A Dss_f 1 Mm 2A Rn 2A Otg 2A Dss_c 1 Dse 1 er 1 Eqc 2A Dss_d 1 Dm 1 Ss 2A D Dss_a 1 Dw 1 Oc 2A Dy 1 Bt 2A Metazoa Class-2 Da 1 Aim 2A v 1 Caf 2A Caj 2A Dg 1 Mam 2A Pah 2A D Ggg 2A Dmo 1 P Hs 2A Pat 2 Nv 1 Metazoa Class-1 Invertebrate Class-1 S na 2A Gom 1 Cpq 1 Aea 1 Hm yd 1 Stp 1 Ang 1 Ci 1 Cis 1 ck 1 And 1 1 Bf 1 Tic 1 hp 1 Oid 1 Trs 1 M Glp 1 M Mi 1

R Lol 1

Ov 1 Ayp 1 Brm 1 1 Cab 1 Cb 1 Cej 1 Pdc 1 Hb 1 Psp 1 Str 1 Wb 1 A m 1 h 1

Cmf Aac 1 Hrs 1 Apf 1 Dap 1B A Dap 1A Bot 1 m 1

Nav 1 g 1A S

Lg 1

Em 1B Ce 1

Cpt 1

Ecg 1B 1 Car Em 1A Scm 1

Ec

Her 1C Her 1B Her 1A

Her 1D Hmm 1B

Hmm 1A

Figure 1 Phylogenetic tree of the coronin family. The phylogenetic tree of the coronin family was calculated from the multiple sequence alignment of the conserved coronin domain using the Bayesian method. The unrooted tree was drawn with iTOL [73] and branches were coloured according to class and taxonomic distributions. For an extended representation of the tree including all posterior probability values see Additional file 2. have similar regions, as do the arthropods, the nematodes, ScCoro are responsible for -binding then all etc.), but are not conserved for major taxa (e.g. fungi, yeast and Schizosaccharomyces coronins should be able Metazoa, stramenopiles). to bind to because they contain motifs The coronin, ScCoro with similar compositions. A similar motif (CRN1), is known to bind to microtubules via its unique or region could not be identified in the Pezizomycotina region between the b-barrel domain and the coiled-coil coronins. While these supposed microtubule-binding oligomerisation region (Figure 3, [30]). Two short regions mainly consist of glutamate, lysine, proline, ser- regions showing homology to the microtubule-binding ine, and threonine and are not even conserved in very regions of MAP1B mediate this interaction. However, closely related yeast species, the Saccharomyces cerevi- the MAP1B sequence motif is very short (about ten resi- siae coronin, ScCoro, has very recently been described dues) and not very specific comprising mainly glutamate to contain a CA domain (C: central; A: acidic; [31]). and lysine residues [30]. If the corresponding motifs in This domain, with which ScCoro activates and inhibits Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 5 of 17 http://www.biomedcentral.com/1471-2148/11/268

p57 coronin HCRNN4 IR10 ClipinC P70 SE Villidin TACO coronin-3 ClipinB POD-1 ClipinA CRN2 HGNC CORO1ACORO1B CORO1C CORO6 CORO2A CORO2B CORO7 Morgan & Fernandez 489123 56 7 101112unclassified

Class-1 Class-2 Class-3 Class-4

Heterolobosea Naegleria gruberi

Kinetoplastida Leishmania major Crithidia fasciculata

Parabasalia Trichomonas vaginalis

Stramenopiles Phytophthora ramorum Aureococcus anophagefferens Thalassiosira pseudonana

Alveolata Toxoplasma gondii falciparum Theileria parva Tetrahymena thermophila Cryptosporidium hominis

Apusozoa Thecamonas trahens

Amoeba Dictyostelium fasciculatum subglobosum Entamoeba histolytica polycephalum Acanthamoeba castellanii

Fungi Ascomycota Basidiomycota ( ) Allomyces macrogynus Batrachochytrium dendrobatidis Spizellomyces punctatus Rhizopus arrhizus

Fungi/Metazoa Capsaspora owczarzaki incertae sedis

Choanoflagellida Monosiga brevicolis

Metazoa Amphimedon queenslandica Trichoplax adhaerens Nematostella vectensis Cestoda Branchiostoma floridae Strongylocentrotus purpuratus Oikopleura dioica Ciona intestinalis Helobdella robusta Lottia gigantea

Nematoda Caenorhabditis elegans Brugia malayi Meloidogyne hapla

Arthropoda Drosophila melanogaster Drosophila pseudoobscura Anopheles gambii Daphina pulex

1A1B 1C 1D 1E 2A 2B 2C 2D

Actinopterygii Takifugu rubripes Brachydanio rerio

Mammalia

Aves

Xenopus tropicalis Figure 2 Coronin repertoire of selected species of major taxa and branches. The coronins of several representative species for most eukaryotic taxa and branches are listed (for the list of all species see Additional file 4). On top, alternatively used names and classification schemes are given for better comparison and orientation. the ARP2/3 complex depending on concentration [31], Surprisingly, the coronins of the Tremellomycetes (e.g. is similar to CA domains in WASP family proteins [32]. Filobasidiella/Cryptococcus species) that belong to the The CA domain is well conserved but distinct within Basidiomycota encode a C-terminal dUTPase domain the Saccharomyceta (Pezizomycotina and Sacchar- (deoxyuridine triphosphatase domain) instead of the omycotina, Figure 4). coiled-coil region (Figure 3). These coronin sequences Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 6 of 17 http://www.biomedcentral.com/1471-2148/11/268

0 200 400 600 800 1000 1200 1400 1600 1800 aa

445 WD40 repeat PH DdCoro Gelsolin 461 LZ Hs Coro1A Central region VHP

605 Acidic region dUTPase CeCoro1 LZ Leucine Zipper MAP1B homology region 651 LZ ScCoro (Crn1) D-glycerate 3-kinase

695 Fna_bCoro

602 PfCoro

1057 CeCoro3 (POD-1)

1321 Fnb_bCoro3

1704 DdCoro4 (Villidin)

1738 CoCoro4

1602 EhCoro4

Figure 3 Domain organisation of representative coronins. A colour key to the domain names and symbols is given on the right except for the coronin domain that is coloured in orange. The abbreviations for the domains are: WD, WD repeat; PH, pleckstrin-homology domain; LZ, leucine zipper; VHP, villin headpeace domain. are supported by many EST/cDNA clones for several of also encode a CA domain similar to the CA domain of the Filobasidiella species extending from the coronin the WASP family proteins at their C-termini (Figure 3). domain to the stop-codon. In addition to this dUTPase Based on the multiple sequence alignment of 112 class- domain, the Filobasidiella species contain a further 3 coronins from all major branches of the eukaryotes dUTPase in the genome that is conserved in the other the position of the C-region has slightly been adjusted Basidiomycotes, and also the other fungi. The dUTPase in comparison with a previous analysis (Figure 4; [31]). domains of the Tremellomycetes coronins contain all Although the C-region of the class-3 coronins is not as characteristic dUTPase domain motifs [33] and are conserved as similar regions in the yeast short coronins therefore supposed to constitute enzymatically active or in WASP family proteins, the characteristic pattern domains. dUTPases typically form homotrimer active of hydrophobic residues concluded by a basic residue is site architectures with all monomers contributing con- visible in the homologs of all species (Figure 4). In con- served residues to each of the three active sites [33]. trast to the short coronins, the unique region between Except for the prediction of trimerization of these coro- the C-terminal coronin-domain and the conserved CA- nins, which could be mediated by the dUTPase domains domain is short (20-30 amino acids). instead of the coiled-coil domains in the other coronins, Like for the short coronins the Filobasidiella species have it needs experimental data to link the function of actin surprising and species-specific tandem-coronins. The Filo- filament structure remodelling by coronins to dUTP basidiella class-3 coronins have a D-glycerate 3-kinase nucleotide hydrolysis in DNA repair by dUTPases. domain between the two coronin-domains (Figure 3). Only the termini of the Filobasidiella class-3 coronins are sup- Class-3 coronins ported by EST/cDNA data, but long exons bridge the N- Class-3 coronins (Type III coronins) comprise homologs terminal coronin-domain and the glycerate 3-kinase that encode two coronin domains arranged in tandem domain, as well as the glycerate 3-kinase domain and the [8]. These two coronin domains are separated by unique C-terminal coronin-domain. As found for the dUTPase regions, and class-3 coronins do not encode coiled-coil domain of the short coronins, the Filobasidiella species domains. As recently reported [31] the class-3 coronins contain an additional D-glycerate 3-kinase that has Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 7 of 17 http://www.biomedcentral.com/1471-2148/11/268

Saccharomycotina coronins 4 3 2 bits V V D N DDL D K S L DD D T E 1 V E S N D D K SG E AP K K D N D D S I Q N E PP T K E D DS K E VK E S E A N N ED E D S A E DKR I E K S N D D S A K S P E T S D N EVI D I Q H V S Q E Q N D QN NQ DN E V R D K P N A P G D H G L G A G Q N A DV D E I E S N P Q T A V K A E S E E A G E K TP NT A C S RL S P G K R I F SK N R S A S G L E G G E D E AV E A G K D S S K T E A GGG I S E E S A T N L K G S E I Q I G G K T F SEDE L N T DA EV E K SNSVTST E T TQM I AVAMES TRKVTTQV T DSRKG DQNSTLT D 7 8 2 4 6 9 1 3 0 5 I 24 46 25 35 36 43 50 53 10 11 12 19 21 30 31 38 39 42 14 15 16 18 22 23 26 29 32 37 40 47 51 13 17 20 27 28 34 41 45 48 52 33 44 LL 49 N L K W C

C-domain A-domain

Pezizomycotina coronins 4 3 2 bits K P N D AED DS S AE 1 A S A E AD V A S F EE Q E K DG D R KA M V M S Q P E K A DE PKP A G A A V A A S D S R T T R S Q S P T I P EG G M Q G EA EE AS N K P V G N E D E AD N SD A D PE G A A K G G G G P T A H T E E G A R T M Q E VP V V K T S Q A A N I I Q G S Q G D Q N R S N T E P Q V E A D S P M P E S P S T H V R I T I D V D K Q N N HA A P K S GG E S Q I R D N Q V S N D 2 4 1 7 9 5 6 3 8 E 0 V Y 12 24 25 46 11 17 20 35 39 42 43 50 53 14 18 21 22 28 45 15 19 23 26 29 30 31 32 36 37 38 40 47 51 16 33 34 44 48 52 13 27 41 49 10 K S P N S D S F C Class-3 coronins 4 3 2 bits K K LLNAMS KL D P D 1 SDEQ E K D GN DP M E K A M D E R E D Q EN Q N E E S T Q VV G H E TK F S PRQ Q I S S P K TP E Q Y E EK S D N S T EP D G D Y D N E R D K A E T E M V F R Q I Q E S D L D E L D V D T P SA M T V V AA Q H K C L E D S A D N S T E A N S Q A E F D N D D S E S R V E H RK A E KS E D Q I D M I A T V GR D A T V G G S A A Y A G Q H I V E N EF D D K K L K L DA I Q A RA R D E N VR T A K L I Q T S A D S D L R LE T YE Q D SQ N A N S D R K W H G A D G S Y V K M Q V H K I H S P T QF A F G Q K A N S S E K H D K R P E D F S P A T V S S QR S D S V D V E E 6 7 9 2 5 0 1 3 4 8 22 18 11 25 47 52 66 12 13 14 17 24 38 39 40 41 42 43 45 48 49 50 55 59 60 69 16 21 23 10 15 20 44 53 54 57 58 61 62 63 64 67 19 46 51 56 65 68 70 26 27 28 29 30 31 32 33 34 35 36 37 N W C C-domain A-domain

Figure 4 Sequence conservation in the CA domains. The sequence logos illustrate the sequence conservation within the multiple sequence alignments of the CA domains of the Saccharomycotina, the Pezizomycotina, and the class-3 coronins. The CA domains of the Saccharomycotina and the Pezizomycotina are located within the unique regions of the short coronins while the CA domain of the class-3 coronins is at the C-termini of the proteins like in WASP family proteins. The regions between the C and the A domains are of variable length. homologs in the other fungi and also in plants. Why it is and do not show any homology to known domains, advantageous to connect an actin-filament binding function sequence motifs, and other proteins. to a glycerate 3-kinase needs experimental evaluation. The In contrast to the related species Rhizopus arrhizus glycerate 3-kinase domain is not found in the class-3 coro- and Phycomyces blakesleeanus the coronin-3 of Mucor nins of the other Basidiomycotes. Except for the Filobasi- circinelloides consists of only the second coronin- diella species only the insects have long insertions between domain of the tandem. We can exclude the possibility the two coronin-domains of their class-3 coronins. These of this being an artefact of the genome assembly for insertions are highly conserved, about 300 residues long, three reasons. First, the genome sequence is continuous Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 8 of 17 http://www.biomedcentral.com/1471-2148/11/268

around MucCoro3. Secondly, there is no homology to available nematode coronin-1 genes. Given the high con- any part of the N-terminal coronin-domain of RhaCoro3 servation of the nematode coronin-1 genes, especially the or PhbCoro3 in the genome although the sequence Caenorhabditis genes, and the completely uncommon identity of the C-terminal coronin-domains is about nature of the potential splice sites, the reported alterna- 65%. And finally, there is a TATA-box shortly upstream tive 3’-splicesiteofexon7and5’-splicesiteofexon8are of the MucCoro3 gene. Because a coronin-3 has already most probably artificial results. An alternative 3’-splice been present in the most ancient the loss of site has been reported for exon8 of CeCoro1 comprising the N-terminal coronin-domain must be specific to two amino acids [35]. Similar splice sites were identified Mucor circinelloides. in the genes of the other analyzed Caenorhabditis species but not in other nematodes. This splice site is thus also Class-4 coronins either an artificial result or specific for the Caenorhabdi- Based on the phylogenetic tree (Figure 1) and the tis branch. In addition, skipping of exon8 has also been domain composition of the protein homologs, another reported to lead to an alternative transcript [35]. The coronin class can be defined for which the Dictyostelium intron position and reading frame of exon8 of CeCoro1 is discoideum homolog, also called villidin [34], would be a conserved in all analyzed nematode coronin-1’sexcept representative (Figure 3). We suggest naming members for the Strongyloides rattii coronin-1, which consists of of this class class-4 coronins. Most class-4 coronins con- only one exon, and the Pristionchus pacificus coronin-1, sist of an N-terminal coronin-domain followed by three which has introns at different positions. Compared to the to four PH domains, four to five gelsolin domains, and a full-length transcript, the other alternative splice forms of C-terminal villin headpeace domain (VHP). Class-4 cor- CeCoro1 are of low abundance (see Figure two in [35]). onins were identified in two of the major kingdoms of Because the integrity of exon8 of CeCoro1 (intron posi- the eukaryotes, in excavates and opisthokonts. Further- tions around the conserved coding sequence of exon8) is more, they are found in several of the sub-branches of not conserved in nematodes but the corresponding the opisthokonts, in amoebae, fungi, and the fungi/meta- amino-acid sequence, alternative splicing of nematode zoa incertae sedis branch. Because class-4 coronins from coronin-1 is either restricted to some sub-branches or an different species often contain different numbers of PH artificial result of the CeCoro1 analysis. and gelsolin domains, domain gain and loss events must Alternative splicing of human coronin-1C results in have happened in the respective branches or single spe- two additional transcripts derived from alternative tran- cies. However, there are not enough coronin-4 homo- scription start sites encoded by an additional upstream logs identified yet to reconstruct the of these exon, compared to the normal start site as found and regions. In addition to these multi-domain class-4 coro- conserved in all other coronin proteins [36]. These alter- nins there is a group of class-4 coronins that just con- native splice forms seem to be restricted to modern pri- sists of the conserved coronin domain and is restricted mates (human, , gorilla, orangutan, and to some Amoebae species yet. gibbon) and have been discussed in detail elsewhere [36]. We have identified alternative splice variants for coronin- Alternatively spliced coronins 1D (Figure 5), a coronin subfamily restricted to vertebrates. Alternative splice forms have been reported for two coro- A cluster of two mutually exclusively spliced exons, exon5a nin homologs: five variants of coronin from Caenorhab- and exon5b, was identified in all tetrapods. The amino-acid ditis elegans [35], CeCoro1 (Figure 5), and three variants sequences corresponding to exon5 of the fish genes are for coronin-1C from human [36], HsCoro1C. The more similar to exon5b than to exon5a. Thus, exon5a is described splice variants do not concern the beta-barrel the result of an exon duplication event that either occurred domain but the structurally low-complexity region prior after the separation of tetrapods from fishes or at the onset to the coiled-coil region in CeCoro1 and elongations of of the vertebrates, where exon5a has been lost in the ances- the N-terminus of HsCoro1C, respectively. In the tor of the fishes. Exon5 represents the sequence of almost reported analysis of CeCoro1 [35] two splice sites (the the entire fourth WD repeat (fifth blade in the b-propeller) alternative 3’-splice site of exon7 and the alternative 5’- starting in the middle of the fourth b-strand of blade four. splice site of exon8) do not obey the conventional spli- By exchanging the fourth WD repeat the vertebrates could cing rules. Alternative 5’-splicing of exon8 would lead to fine-tune the function of the coronin-1D beta-barrel a premature stop-codon. In the four additional Caenor- domain. Vertebrate coronin-1D (CORO6) has not been habditis strains analyzed here, C. briggsae, C. japonica, analyzed experimentally yet and its specific function is C. remanei,andC.brenneri,alternative5’-splicing of unknown. exon8 would not lead to a premature stop-codon at the Further alternative transcripts are derived from mam- same position as in C. elegans but to transcripts of var- malian coronin-1D genes by alternative 5’-splicing of the ious lengths. The same accounts for several of the other last exon, exon10. This alternative splicing results in one Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 9 of 17 http://www.biomedcentral.com/1471-2148/11/268

CeCoro1

400 bps (ex.) 900 bps (in.) TAG

1 gi|193211354|ref|NC_003281.8| (5435bp) TAA For clarity introns have been scaled down by a factor of 2.24

HsCoro1D

400 bps (ex.) 1000 bps (in.) 5a 5b “Q”

1 gi|224589808|ref|NC_000017.10| (5687bp)

For clarity introns have been scaled down by a factor of 2.97

blade 4 blade 5

exon 4 exon 5a 180 190 200 210 220 230 ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| HsCoro1D DVIHSVCWNSNGSLLATTCKDKTLRIIDPRKGQVVAEQARPHEGARPLRAVFTADGKLLS Exon5b ERFAAHEGMRPMRAVFTRQGHIFT exon 5b

blade 5 blade 6

exon 5a exon 6

240 250 260 270 280 ....| ....| ....| ....| ....| ....| ....| ....| ....| HsCoro1D TGFSRMSERQLALWDPNNFEEPVALQEMDTSNGVLLPFYDPDSSI Exon5b TGFTRMSQRELGLWDP exon 5b Figure 5 Gene structures of alternatively spliced coronins. The cartoons outline the gene structures of the alternatively spliced coronin-1 gene from Caenorhabditis elegans, CeCoro1, and the coronin-1D gene from Homo sapiens, HsCoro1D. The alternatively spliced CeCoro1 gene contains a differentially included exon8, which has an additional alternative 3’-splice site, leading to three transcripts. The other two described splice sites, an alternative 3’-splice site of exon7 and an alternative 5’-splice site of exon8 [35], are most probably artificial. The HsCoro1D gene contains a cluster of two mutually exclusive spliced exons, exon5a and exon5b, and an alternative 5’-splice site of exon10. Dark grey bars and light grey bars mark exons and introns, respectively, and alternative exons and splice sites are coloured. additional glutamine residue and is conserved in all 22 Oligomerization analyzed mammalian coronin-1D’sexceptforAiluro- Most of the short coronins have predicted coiled-coil poda melanoleuca (giant panda), Loxodonta africana domains at the C-terminus that are the bases for their (elephant), Myotis lucifugus (little brown bat), and Bos supposed oligomerization. Initially, coronins have been taurus (cow). proposed to form dimers [16], the most common form Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 10 of 17 http://www.biomedcentral.com/1471-2148/11/268

of coiled-coil multimerization. In the last decade, a few species but not in whole branches (except for the Schizo- coronin homologs were biochemically purified and ana- saccharomyces). Therefore, we would expect all fungal lyzed. Accordingly, the Xenopus laevis coronin-1C coronins to form trimers. All Amoeba coronins, the Stra- (XcoroninA) has been shown to form a dimer [37] while menopiles coronins (exceptions: FrcCoro a His at P1, an oligomeric state has been found for human coronin- BhCoro_B a Asn at P6, AuaCoro_B a Lys at P1), the 1C (coronin 3; [18], and the Saccharomyces cerevisiae Trichomonas and Naegleria coronins contain the classical coronin (CRN1) trimerizes [31]. Parallel trimer forma- trimerization sequence motif in the coiled-coil region. tion has also been shown in a crystal structure of the Interestingly, the kinetoplastid coronins have the salt- coiled-coil domain of mouse coronin-1A [20] revealing bridge switched in the motif and should thus also be able a conserved motif determining the trimeric structure: to form trimers. From the Alveolata, only the Ciliophora R1-[ILVM]2-X3-X4-[ILV]5-E6.Inthismotifarginine (e.g. Tetrahymena) and Coccidia (e.g. Toxoplasma gondii) forms a salt-bridge with glutamate at the surface of the coronins contain coiled-coil domains, and only the Cocci- coiled-coil structure and the aliphatic side chain moi- dia contain the trimerization motif. eties of arginine and glutamate pack against the hydro- These are, however, predictions based on the existence phobic residues at positions 2 and 5 of the motif of the proposed trimerization motif. The motif has been shielding them from solvent. Mutation of the arginine identified in 86% of all short, autonomous, and parallel to lysine leads to a concentration-dependent equilibrium three-stranded coiled-coils while it is also observed in 9% between trimers and tetramers with tetramers forming of the antiparallel trimers and in 5% of the parallel and at high concentration, while mutation to alanine or nor- antiparallel dimers [20]. Thus, although most short coro- leucine leads to tetramers [20]. Mutation of the invar- nins are predicted to form trimers some might neverthe- iant arginine to glutamine in the trimerization motif of less function in other oligomeric states in the cell. The human matrilin-1 leads to tetramers [38]. Unfortunately, oligomerization state can ultimately only be shown in the switching of arginine and glutamate in the respective experiments, which have, however, been done for just a positions has not been analyzed yet. We would expect few of the coronins yet. that such a switch should be as stable as the original motif. Thus, to predict the oligomerization state we F-actin binding have analyzed all coronin coiled-coil regions for the pre- F-actin binding is one of the common properties of coro- sence of the trimerization motif. Accordingly, all 233 nin proteins. The extended multiple sequence alignment class-1 coronins have the classical motif, except for presented here together with the recently determined DpCoro1B and DrpCoro1B (Drosophila pseudoobscura crystal structure of murine coronin-1A [17] now allows a and persimilis; Lys at position 1), and NvCoro1 (Nema- reevaluation of previous mutagenesis studies. Truncation tostella; Cys at position 2), and are thus predicted to studies have shown that the coronin domain, including form trimers. This would include the Xenopus Coro1C the b-propeller and its C-terminal extension, is necessary that has, however, been shown to exist as a dimer [37]. for F-actin binding [30,39]. Mapping the sequence con- The situation is more diverse for the class-2 coronins. servation within 13 short coronin members onto the sur- The invertebrate coronins contain the trimerization face of the crystal structure revealed two regions, one motif, except for AmqCoro2 (Amphimedon; Ser at P1), formed by blades 1, 6, and 7 and one formed by blades 6 HerCoro2A (Helobdella; Lys at P1), HerCoro2B (Phe at and 7 and a portion of the C-terminal extension, to P1), MydCoro2 (Mayetiola; Gln at P6), and the nema- represent possible actin binding sites [17]. Subsequently, tode class-2 coronins (Cys at P2). Almost all fish class-2 several surface-exposed charged amino acids have been coronins contain the trimerization motif, but the other mutated to alanine or substituted by reversed charges in vertebrate class-2 coronins have conserved mutations. human coronin-1B and their F-actin binding affinity has The tetrapod class-2A coronins encode a glutamine been analyzed ([40], Figure 6 red dots, see also Additional instead of the invariant arginine, which would turn file 5). Only the R30D mutation abolished actin binding them to tetramers in analogy to matrilin-1 [38]. The tet- in vitro. Although an arginine is the most prevalent rapod class-2B coronins contain glutamine instead of aminoacidatthispositionitisoftensubstitutedbya the glutamate at position 6 of the motif, a substitution lysine or a proline (Figure 6). The multiple sequence whose effect has not been analyzed yet. alignment of the coronins also does not show a trend About half of the analyzed fungal coronins have the clas- towards a class-specific substitution. For example, while sical trimerization motif. The most common substitutions a proline is found at this position in all vertebrate class-2 that are found in all Schizosaccharomyces and most coronins, arginines, lysines, asparagines, prolines, threo- Basidiomyota coronins are lysines or glutamines instead of nins, and tyrosins are found in invertebrate class-2 coro- the arginine at position 1. While the coiled-coil region is nins. At least negatively charged amino acids are not conserved in general, substitutions happened in specific found in any of the coronin domains at this position. Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 11 of 17 http://www.biomedcentral.com/1471-2148/11/268

4 3

bits 2 N S Q N I A C V L V GT S F P KE D VSKN S F 1 G R F A KMM F A R N F I I I A T K R V R C PA A G H HW T R A SPS R K F L A R Q ST Q S K R L C K P L I V V K A T M S A L S N E T G F S PR W S P E I T Q Y S Y S L S G T T YK P A A E N K V A W S I L T S V T Q N S Q T P R RL Q TH G V D VS V T S M V AA T I G ES R Q A T P F P S P T S I L P D S S N T A G V E Q R E DE G S D C G N S S L I H C T R D H P W A S K A H A A I H T E S V R A Y P A S T E N L N S F E D S S D A E L N G F Q L C M N Y V S C QM N R N PA VG K ANA Q M N A T T RS Q V K Y M I I H R LH K S G N P Q Q G I Q P A S F N C P T T Q S N N L E P I S G L P Q Y I L L F C H M W S Q L V Q V A I P I S E S C C V R Q E L FAA P Y T Q M P A T G W Y T E T L G L T A A V T I G F R N S A F T A N V T S H H G YK ST T

1 2 3 4 5 6 7 8 9 T 0 10 11 12 13 14 15 16 17 18 19 20 21 SK22 23 R24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 A 56 57 58 F59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 N 10 20 30 40 50 60 70 80C ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ScCoro__fl ------MSGKFVRASKYRHVFGQAAKKEL--QYEKLKVTNNAWD------SNLLKTNGKFIAVN DdCoro__fl ------MSKVVRSSKYRHVFAAQPKKEE--CYQNLKVTKSAWD------SNYVAANTRYFGVI MmCoro1A_fl ------MSRQVVRSSKFRHVFGQPAKADQ--CYEDVRVSQTTWD------SGFCAVNPKFMALI

1-2 R30A/D

4 3

bits 2 M I Y I M L T S P NML A GN P L S F G G DD Y K E I 1 S S T GTE E L L L T K A Y E QE V S V K R V S Q L E S T K HF N F N FE A T T S A T R S H K I S S E Q C C E I T TSD N Y Y L E K V S L A D C DK E F A P V R S A L N T E Q S I F G D E P Q M V P E T S SGL S A A T I E F T S F L T T Y V I ST F Y Q S T E S SG S D N P D G M FAF V G D P Q Y T L AG G L L Y AD A V V Q Y V K F T H F T M Q R A K A T L S G M C N K V L V V D P G H Q V T E Y H L C N I V A A N D D A D G E N S S M V N S C C A H V AY G I RR Q AL S P S T Q E I G L T H K K H R S V A V S G L L V A E P G I N F T T P M GN I D N A M L N V Q R T A I L Q R S M K A K S C A N G I V L G S L A A L E I V I S I A C T A L S I H S L Q A E T I F M T S D N V T I V V C A T H Q S N V L A C S V S F Q T R E V A I V W E I F RC S A I M S P V SV N R V G T I GY G I L F 0 401 402 403 404 405 406 407 408 409 410 411 412 413 D414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 G436 K 437 438 D439 440 441 442 443 444 E445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 P462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 G478 479 480 N D C 410 420 P 430 G 440 450 460 470 480 ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ScCoro__fl IEKGDLGG-FYTVDQ-SSGILMPFYDEGNKILYLVGKGDGNIRYYEFQNDE------LFELSEFQSTEAQRGFA DdCoro__fl FTT-PLS--AQVVDS-ASGLLMPFYDADNSILYLAGKGDGNIRYYELVD------ESPYIHFLSEFKSATPQRGLC MmCoro1A_fl LEE-PLS--LQELDT-SSGVLLPFFDPDTNIVYLCGKGDSSIRYFEITS------EAPFLHYLSMFSSKESQRGMG

1-11 1-12 1-13 1-14 1-15 1-16 1-17

4 3

bits 2 I L I T GG F D D D T L M D F G K R F K R A L 1 S Y T G A NN D N A E E L C F K HG H A K L K K L N C D A K M E Q S R S P C E S S M S A S T D D A T I S A N L I D S L T E V S K T V S I D L M V T R L SS I F N T T P M G S V K A L V Q I T S V G R A T V HD N L R S Q S T F Y D I E A I D V Q S N L Y T V V Q A Y Q S TA Y W V G E V E E T E K A Y T V N I E A V A N T E LR DS F Q K S N M V C W P E L T G P E T T Q Q V I V A D AM A K M R R V Q L K P I I G Q G V V D Q Y R S Q I N G T P D V S V M M S T R A K G S T I A L T I Q F S E V Q P M V A L I I G G S I CM A A A S I G L T V W L G T N T K FD A G V I E V L W K F A M V G V R I K VI T P T S H S R T SV Y L Y E F R NAGI GA L P M VI F 0 481 P482 483 K 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 V 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 F537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 W 559 560 N P C 490 E 500 510 520PR 530 540D 550 560 ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ....| ScCoro__fl VAPKRM-VNVKENEVLKGFKTVVDQ-----RIEPVSFFVPRR------SEEFQEDIYPDA-PSNKPALTAEEWF DdCoro__fl FLPKRC-LNTSECEIARGLKVTPF------TVEPISFRVPRK------SDIFQDDIYPDT-YAGEPSLTAEQWV MmCoro1A_fl YMPKRG-LEVNKCEIARFYKLHER------KCEPIAMTVPRK------SDLFQEDLYPPT-AGPDPALTAEEWL

1-18 1-19 1-20 1-22

1-21 Figure 6 Sequence conservation within the actin binding region. The sequence logos illustrate the sequence conservation within the multiple sequence alignments of the coronin domains. Here, only the N- and C-termini of the coronin domains are shown because most of the residues implicated in actin binding map to these regions. For the representation of the entire coronin domain see Additional file 5. For better orientation, the sequences of three representative coronins are shown: the yeast coronin as the main target of mutagenesis experiments, the Dictyostelium coronin as the founding member of the protein family, and the murine coronin-1A of which the crystal structure is known. Secondary structural elements as determined from the crystal structure are drawn as yellow arrows (b-strands) and red boxes (a-helices). Green dots point to amino acids of ScCoro that have been mutated to alanine [41] and red dots highlight mutagenesis studies in HsCoro1B [40]. Light- blue boxes highlight mutations that abolished actin binding, dark-grey boxes represent mutations that did not influence actin binding, and light-grey boxes point to mutations in yeast coronins that could not be expressed and tested. Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 12 of 17 http://www.biomedcentral.com/1471-2148/11/268

Recently, systematic mutagenesis of charged surface- In addition, homologs of major branches were missing in exposed residues of yeast coronin revealed a patch of the analysis like those from stramenopiles. We do not residues extending over the top and one side of the intend to add confusion to the classification of the coronin b-propeller that abolished actin binding when mutated to familybutwanttosuggestareliableand,considering alanine (Figure 6, green dots [41]). The analysis of the future genome sequencing projects, expandable scheme. conservation within the coronin proteins shows that Two major reasons support the future use of the HGNC many of the substitutions in both studies have been per- scheme although it needs some minor adjustments. The formed on marginally conserved residues (e.g. E215A/K, classification by Morgan and Fernandez [22] of coronins K216A/E, 1-11, 1-15, 1-16). Thus, it is not surprising that outside the metazoans is not consistent with the latest tax- coronins with mutations of these residues are able to onomy of the eukaryotes and therefore not adaptable to bind F-actin. As actin binding is one of the common our more comprehensive coronin tree. In addition, it is functions of coronins and belong to the highest well known that two whole-genome-duplications are the conserved protein families the actin binding surface of reason for the expansion of gene homologs at the origin of the coronins is also expected to be highly conserved. the vertebrates [42], while another whole-genome-duplica- Most of the residues that were found to abolish actin tion happened at the origin of the Actinopterygii [43,44]. binding when mutated to alanine are strongly conserved Thus retaining the orthology between non-vertebrate and (Figure 6). The few residues thatarehighlyconserved vertebrate coronins in class-numbers would be desirable but do not influence actin binding might be interaction but has also been abandoned by Morgan and Fernandez sites for other proteins like cofilin. [22]. Here, we adapted the HGNC classification except for renaming CORO6 and CORO7 (HGNC) to coronin-1D Discussion and coronin-3, respectively, numbering additional fish cor- Here, we have analyzed 723 coronins from 358 species. For onins as coronin-1E, coronin-2C, and coronin-2D, and 323 species whole genome sequence data was available defining the new coronin class-4. The term “class” is allowing a “holistic” analysis of the coronin protein family. equivalent to the term “Type” used by the Bear group in In addition, the whole genome assemblies of 69 species recent reviews [8,45]. However, we prefer the term “class” have been analyzed that in the end did not contain any to be consistent with the terminology used for other pro- coronin homolog. These species include Rhodophyta (Cya- tein families (e.g. the myosin family [46,47]) and therefore nidioschyzon, Galdieria), Viridiplantae, Microsporidia, For- to facilitate the work with databases and search engines in micata (Giardia), and Haptophyceae (Emiliania). A the future. sequence alignment of the coronin proteins was created Class-4 coronins represent a new type of coronins that and extensively improved manually. The phylogenetic ana- are present in Excavata (Naegleria gruberi), Amoebae, lysis of the conserved coronin domain, which is also fungi (Spizellomyces punctatus), and the Fungi/Metazoa included in the crystal structure [17], using the Bayesian incertae sedis branch. Most class-4 coronins consist of the method showed that the grouping of the coronins is com- N-terminal coronin domain followed by two to three PH, pletely in accordance with the latest phylogeny of the four to five gelsolin, and a C-terminal VHP domain. The eukaryotic species (Figure 1, [27-29]). Subsequently, we first representative of this subfamily has been identified in analyzed the coronin tree with respect to established and Dictyostelium discoideum and called villidin because of the proposed classifications defining subfamilies. Two major homology of its gelsolin and VHP domains to villin [34]. schemes are currently in use, the old one established by The homology of villidins WD-repeat region to coronin the HGNC [8] and a more recent one expanding the num- has been recognized later on [48,49] suggesting villidins ber of classes from three to twelve [22]. Essentially, the origin through a fusion of the coronin domain with villin. later classification re-defines subclasses of the HGNC Villin is the founding member of a superfamily of proteins scheme as separate classes, e.g. 1A and 1B become class-4 containing three to six gelsolin domains (reviewed in and class-1, respectively, and groups some branches to new [48,50]). Like villidin (class-4 coronins), villin, supervillin, classes. However, some coronins still remained unclassified and protovillin also contain a C-terminal VHP domain. and several classes have been proposed, like the inverte- Alignment of villin to the class-4 coronins gelsolin domains brate metazoan classes 8 and 9, although the contributing shows that the class-4 coronins have lost the first gelsolin membersdidnotformmonophyleticbranchesinthe domain of villin. The first gelsolin domain of villin is asso- underlying protein family tree. The proposed classes 10 ciated with dimerization, actin filament capping, nuclea- and 12 contain members of unrelated taxonomic branches, tion, and bundling, and G-actin binding [50]. Thus, class-4 probably because these coronins were adjacent in the tree coronins do not play a role in these activities via their figure. In addition, the Entamoeba tandem-coronin did not gelsolin domains [34]. However, villin contains three phos- group to the other tandem-coronins. Thus, this classifica- pholipid-binding domains, two preceding the second gelso- tion is not consistent with the taxonomy of the eukaryotes. lin domain and one overlapping with the VHP domain. Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 13 of 17 http://www.biomedcentral.com/1471-2148/11/268

These phospholipid-binding domains are conserved in recent phylogenetic analyses, the Alveolata, Rhizaria, and class-4 coronins and are most probably responsible for Stramenopiles form the superfamily SAR [27,54]. The their association with internal membranes like Golgi-struc- placement of the Haptophyceae and Cryptophyta to the tures and ER-membranes [34]. SAR is still highly debated. Although several analyses To reveal the evolution of the coronin family and to are in favour to this grouping (arrow 3; [55-57]) most determine the coronin repertoire of the last common analyses are in contrast [27-29,53,54]. Short coronins ancestor of the eukaryotes, we plotted the coronin inven- containing the N-terminal coronin domain and the C- tory of several representative species, whose genome terminal oligomerization domain have been found in all sequences are available and whose coronin inventories branches except Diplomonadida, Haptophyceae, and Vir- are therefore complete, on the most widely agreed tree of idiplantae/Rhodophyta. The phylogenetic grouping of the the eukaryotes (Figure 7). However, especially the group- species based on the phylogenetic tree of the coronin ing of taxa that emerged close to the origin of the eukar- domains showed that the coronins with different domain yotes remains highly debated. Therefore, alternative compositions (containing dUTPase domains, ARP2/3 branchings are also indicated in the tree. The phylogeny binding domains, no coiled-coil regions) are species- of the supposed supergroup Excavata is the least under- specific developments based on domain loss and gain stood because only a few species of this branch have events while the corresponding species correctly group been completely sequenced so far. While the grouping of together inside the respective branches. Class-3 coronins the Heterolobosea, Trichomonada, and Euglenozoa into are also found in all major eukaryotic superkingdoms the Excavata is found in most analyses, the grouping of that contain coronins. We did not identify any species the Diplomonadida as separate or as part of the that contains exclusively a class-3 coronin suggesting Excavata is still debated (arrow 1 [51]). Also, some ana- that encoding a class-3 coronin is a plus for the species lyses group the red of the Rhodophyta branch to but not a necessity. Class-4 coronins were found in two the Viridiplantae [52-54] and others support their inde- of the four coronin-containing superkingdoms, the Exca- pendence (arrow 2; [28,55]). According to most of the vata and the Opisthokonts. Several major sub-branches

2 3 Monosiga brevicollis 1 2 Choanoflagellida1 3 Cyanidioschyzon merolae

Capsaspora owczarzaki Metazoa Galdieria sulphuraria 0 3 4 0 3 4 Fungi/Metazoa incertae sedis4 Aureococcus anophageferens 0 0 3 0 3 4 Phytophthora ramorum Fungi 0 3 Blastocystis hominis

Amoebozoa 0 0 Trichomonas vaginalis OomycetesBlastocystis Ectocarpus siliculosus

0 0 0 3 Pelagophyceae 3 0 3 Phaeophyceae Rhodophyta Trichomonada 3 Thalassiosira pseudonana 0 Opisthokonts Bacillariophyta Naegleria gruberi Bigelowiella natans 0 Plasmodium faliciparum 0 4 Viridiplantae 3 4 Stramenopiles Aconoidasia 0 Rhizaria 3 Heterolobosea 3 Coccidia Toxoplasma gondii Leishmania major X 2 Alveolata Ciliophora 0 0 Euglenozoa 3 4 4 3 Tetrahymena thermopila Excavata SAR X Emiliania huxleyi Giardia lamblia Diplomonadida X Haptophyceae

1 Cryptophyta Guillardia theta 0 0

Eukaryota 0 3 4 Figure 7 Evolution of the coronin protein family with respect to the species evolution. Schematic representation of the most widely accepted eukaryotic tree of life. Branch lengths are arbitrary. The coronin inventories of certain taxa and specific species have been plotted to the tree with class numbers given in colour-coded boxes. “O” stands for “Orphan”, the unclassified short coronins. The numbers on the arrows refer to alternative placing of the respective taxa: 1: The independence of the Diplomonadida (instead of grouping them to the superkingdom Excavata) is supported by [51]. 2: The monophyly of the Rhodophyta is supported by [28,55]. 3: Grouping the Haptophyceae and Cryptophyta to the SAR is supported by [55-57]. Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 14 of 17 http://www.biomedcentral.com/1471-2148/11/268

of the Opisthokonts contain class-4 coronins, the Amoe- and gene-duplication events (Figure 2). This view is, how- bozoa, the Fungi, and the Fungi/Metazoa incertae sedis ever, based on the species whose genomes are available branch. However, the evolution of the class-4 coronins today and might change as soon as sequencing of more rather seems to be determined by gene-loss events. This related species reveals subtypes of the class-1 and class-2 distribution of the coronin classes demonstrates that the coronins in major invertebrate branches. At the origin of last common ancestor of the eukaryotes must have con- the vertebrates the two well-known whole-genome duplica- tained a short coronin as well as a tandem coronin (class- tions (2R, [42]) resulted in several subtypes of both the 3), and most probably even a class-4 coronin. In the cor- class-1 and class-2 coronins. The subsequent third whole onin-family tree (Figure 1) the C-terminal coronin- genome duplication in the fish-lineage [43,44] led to even domains of the class-3 coronins group closer to the short more gene duplicates. Subsequent to this boost of coronin coronins than the N-terminal coronin-domains. This homologs at the onset of the vertebrates branch-specific suggests a three-step invention of the class-3 coronin gene deletions happened, like the loss of the class-1B var- (Figure 8): First a gene duplication of the short coronin iants in fishes and the class-1A loss in birds (Figure 2). happened (1). The new copy was subsequently copied The short coiled-coil region including the trimerization twicebuttheorderoftheseeventscouldnotbedeter- motif R-[VILM]-X-X-[VIL]-E is an accomplishment of mined (2). One copy has been distributed in a different the most ancient short coronin because it is found in cor- genomic region resulting in the class-4 coronin after onins of all branches of the eukaryotic tree. It has been fusion to a copy of the villin gene (2B). The other copy retained without major mutations for a long evolutionary resulted in a tandem gene duplicate in which the new time. This is exemplified by the fact that changes, which copy was placed at the 5’ site of the original gene (2A). might lead to other oligomerization states, are species- Thetandemgeneduplicatesubsequentlyfusedtobuild specific or have been introduced in very recently sepa- the class-3 coronin prototype (3). It could also be possi- rated branches. ble that the coronin domain copy, which led to the class- 4 coronin, would have been produced as a copy of the 3’ Conclusions coronin of the then already existing tandem gene dupli- Thephylogenetictreebasedonthecoronindomainsof cate (4). 723 homologs from 358 species allowed grouping the At the origin of the Metazoa and Choanoflagellida coronin proteins into four classes: Class-1 (Type I) and branches another gene duplication event led to two distinct class-2 (Type II) comprise short coronins and resulted classes, class-1 coronins and class-2 coronins (Figure 7). from a gene duplication of a short coronin at the onset The further evolution of the short coronins in the inverte- of the halozoans. Short coronins are characterized by brate branches is determined by species-specific gene-loss an N-terminal coronin domain followed by a unique

1 gene duplication A

short coronin A: tandem gene duplication 2 B: gene duplication (order unknown) A B

4

3 fusion of the tandem gene duplicates class-4 coronin

class-3 coronin Figure 8 Evolution of the coronin classes. The cartoon shows the different gene duplication and fusion events that led to the formation of the short coronins, the class-3 coronins, and the class-4 coronins. Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 15 of 17 http://www.biomedcentral.com/1471-2148/11/268

domain and a C-terminal short coiled-coil region. The sequences and filling gaps) the alignment has been adjusted coiled-coil domain of almost all short coronins contains manually. Subsequently, every newly predicted sequence a trimerization motif that must therefore have already has been preliminary aligned to its supposed closest rela- existed in the last common ancestor of the eukaryotes. tive using ClustalW, the aligned sequence added to the Class-3 (Type III) coronins comprise coronins with two multiple sequence alignment of the coronins, and the coro- coronin-domains arranged in tandem and have been nin alignment adjusted manually during the subsequent found in species of all eukaryotic kingdoms that contain sequence validation process. We have also retained the coronins. Class-4 (Type IV) coronins encode fusions of integrity of the primary sequence within the secondary the coronin domain to villin and have been identified in structural elements that have been determined from Excavata and Opisthokonts although most of these spe- the crystal structure (e.g. sequence gaps have only been cies subsequently lost the class-4 homolog. Hence, the introduced in known loop regions). Still, many gaps in last common ancestor of the eukaryotes must have con- sequences derived from low-coverage genomes remained. tained a short coronin and a class-3 coronin, and most In those cases, the integrity of the exons surrounding the probably a class-4 coronin. gaps has been maintained (gaps in the genomic sequence are reflected as gaps in the multiple sequence alignment). Methods The unique and coiled-coil regions are completely diver- Identification and annotation of the coronin family gent in sequence and length and were therefore aligned proteins manually. The domain compositions of the short coronin, The coronin genes have been identified by TBLASTN the class-3, and the class-4 coronins are different and searches against the sequenced eukaryotic genomes, which regions outside the N-terminal coronin domain were only have been obtained via lists available from the diArk data- aligned within these groups. The C-terminal coronin base [23,58]. All hits were manually analyzed at the geno- domains of the class-3 coronins were separately included mic DNA level. Datasets of predicted proteins produced by in the multiple sequence alignment of the coronins, in the sequencing consortia often miss homologs, and pre- addition to being aligned as part of their class. dicted proteins contain mispredicted exons and introns in many cases, necessitating manual assembly and annotation. Building trees The correct coding sequences were identified with the help For calculating phylogenetic trees only full-length and par- of the multiple sequence alignments of all coronin proteins. tial sequences were included in the alignment. The phylo- As the amount of protein sequences increased (especially genetic trees were generated based on the conserved the number of sequences in taxa with few representatives), coronin domains (corresponding to amino acids 1-386 of many of the initially predicted sequences were reanalyzed HsCoro1A) using two different methods: 1. Maximum to correctly identify all exon borders. Where possible, EST likelihood (ML) using the LG model with estimated pro- data available from the NCBI EST database has been ana- portion of invariable sites and bootstrapping (1,000 repli- lyzed to help in the annotation process. In addition, coronin cates) using RAxML [63]. 2. Posterior probabilities were homologs from cDNA projects or single-gene analyses have generated using MrBayes v3.1.2 [64] with the MPI option been obtained by TBLASTN searches against the NCBI nr [65]. Two independent runs with 15,000,000 generations, database [59]. Gene structures have been reconstructed four chains, and a random starting tree were computed using WebScipio [25] as far as genomic sequence data was using the mixed amino-acid option. From the 32,000th available. All sequence related data (names, corresponding generation MrBayes used the Wag model [66]. Using Prot- species, GenBank ID’s, alternative names, corresponding Test[67],theLGmodel[68],whichis,however,not publications, domain predictions, gene structure recon- implemented in MrBayes, was determined to provide a structions, and sequences) and references to genome slightly better fit to the data than the Wag model. Trees sequencing centres are available through CyMoBase were sampled every 1,000th generation and the first 25% [60,61]. of the trees were discarded as “burn-in” before generating a consensus tree. Generating the multiple sequence alignment The multiple sequence alignment of the coronin family has Domain and motif prediction been built and extended during the process of annotating Protein domains were predicted using the SMART [69] and assembling new sequences. The initial alignment has and [70] web server. The leucine zipper motifs have beengeneratedfromthefirstabout50non-validated been identified using the Prosite database [71]. The CA sequences obtained from NCBI using the ClustalW soft- domains have been identified by visual inspection of the ware with standard settings [62]. During the following cor- manual sequence alignment of the coronins and motif rection of the sequences (removing wrongly annotated comparisons with CA domains of WASP family proteins Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 16 of 17 http://www.biomedcentral.com/1471-2148/11/268

available at CyMoBase (unpublished data, [61]). Graphical coronin are defective in cytokinesis and cell motility. J Cell Biol 1993, representations of the sequence patterns have been gener- 120:163-173. 12. Cai L, Holoweckyj N, Schaller MD, Bear JE: Phosphorylation of coronin 1B ated with WebLogo [72]. by protein kinase C regulates interaction with Arp2/3 and cell motility. J Biol Chem 2005, 280:31913-31923. 13. Maniak M, Rauchenberger R, Albrecht R, Murphy J, Gerisch G: Coronin Additional material involved in phagocytosis: dynamics of particle-induced relocalization visualized by a green fluorescent protein Tag. Cell 1995, 83:915-924. Additional file 1: Sequence alignment of the coronins The file 14. Ferrari G, Langen H, Naito M, Pieters J: A coat protein on phagosomes contains the alignment of the full-length sequences of the coronins in involved in the intracellular survival of mycobacteria. Cell 1999, fasta-format. The data can also be downloaded from CyMoBase [61]. 97:435-447. 15. Rybakin V, Stumpf M, Schulze A, Majoul IV, Noegel AA, Hasse A: Coronin 7, Additional file 2: MrBayes tree of the coronin family This file contains the mammalian POD-1 homologue, localizes to the Golgi apparatus. the phylogenetic tree calculated with MrBayes including posterior FEBS Lett 2004, 573:161-167. probability values that has been the basis for Figure 1. Here, the tree is 16. de Hostos EL: The coronin family of actin-associated proteins. Trends Cell plotted in an extended way so that every coronin can be found and Biol 1999, 9:345-350. compared easily. 17. Appleton BA, Wu P, Wiesmann C: The crystal structure of murine coronin- Additional file 3: RAxML tree of the coronin family This file contains 1: a regulator of actin cytoskeletal dynamics in lymphocytes. Structure the phylogenetic tree calculated with RAxML including bootstrap values. 2006, 14:87-96. The tree is plotted in an extended way so that every coronin can be 18. Spoerl Z, Stumpf M, Noegel AA, Hasse A: Oligomerization, F-actin found and compared easily. interaction, and membrane association of the ubiquitous mammalian Additional file 4: Coronin repertoire of all eukaryotes analyzed coronin 3 are mediated by its carboxyl terminus. J Biol Chem 2002, Complete table of the coronin inventories of 358 eukaryotes. 277:48858-48867. 19. Oku T, Itoh S, Ishii R, Suzuki K, Nauseef WM, Toyoshima S, Tsuji T: Additional file 5: Conserved residues in the coronin domain This Homotypic dimerization of the actin-binding protein p57/coronin-1 figure contains the sequence conservation of the entire coronin domain mediated by a leucine zipper motif in the C-terminal region. Biochem J including all mutagenesis experiments as described in Cai et al. [40] and 2005, 387:325-331. Gandhi et al. [41]. 20. Kammerer RA, Kostrewa D, Progias P, Honnappa S, Avila D, Lustig A, Winkler FK, Pieters J, Steinmetz MO: A conserved trimerization motif controls the topology of short coiled coils. Proc Natl Acad Sci USA 2005, 102:13891-13896. Acknowledgements 21. Rybakin V, Clemen CS: Coronin proteins as multifunctional regulators of This work has been funded by grants KO 2251/3-1 and KO 2251/3-2 of the the and membrane trafficking. Bioessays 2005, 27:625-632. Deutsche Forschungs gemein schaft. 22. Morgan RO, Fernandez MP: Molecular phylogeny and evolution of the coronin gene family. Subcell Biochem 2008, 48:41-55. Authors’ contributions 23. Odronitz F, Hellkamp M, Kollmar M: diArk–a resource for eukaryotic CE and MK assembled coronin sequences, performed data analysis and genome research. BMC Genomics 2007, 8:103. wrote the manuscript. BH performed the phylogenetic analysis. All authors 24. Breathnach R, Chambon P: Organization and expression of eucaryotic read and approved the final manuscript. split genes coding for proteins. Annu Rev Biochem 1981, 50:349-383. 25. Odronitz F, Pillmann H, Keller O, Waack S, Kollmar M: WebScipio: an online Received: 28 June 2011 Accepted: 25 September 2011 tool for the determination of gene structures using protein sequences. Published: 25 September 2011 BMC Genomics 2008, 9:422. 26. Keller O, Odronitz F, Stanke M, Kollmar M, Waack S: Scipio: using protein References sequences to determine the precise exon/intron structures of genes and 1. de Hostos EL, Bradtke B, Lottspeich F, Guggenheim R, Gerisch G: Coronin, their orthologs in closely related species. BMC Bioinformatics 2008, 9:278. an actin binding protein of Dictyostelium discoideum localized to cell 27. Parfrey LW, Grant J, Tekle YI, Lasek-Nesselquist E, Morrison HG, Sogin ML, surface projections, has sequence similarities to beta subunits. Patterson DJ, Katz LA: Broadly sampled multigene analyses yield a well- EMBO J 1991, 10:4097-4104. resolved eukaryotic tree of life. Syst Biol 2010, 59:518-533. 2. Tardieux I, Liu X, Poupel O, Parzy D, Dehoux P, Langsley G: A Plasmodium 28. Reeb VC, Peglar MT, Yoon HS, Bai JR, Wu M, Shiu P, Grafenberg JL, Reyes- falciparum novel gene encoding a coronin-like protein which associates Prieto A, Rummele SE, Gross J, Bhattacharya D: Interrelationships of with actin filaments. FEBS Lett 1998, 441:251-256. chromalveolates within a broadly sampled tree of photosynthetic 3. Figueroa JV, Precigout E, Carcy B, Gorenflot A: Identification of a coronin- protists. Mol Phylogenet Evol 2009, 53:202-211. like protein in Babesia species. Ann N Y Acad Sci 2004, 1026:125-138. 29. Hampl V, Hug L, Leigh JW, Dacks JB, Lang BF, Simpson AG, Roger AJ: 4. Heil-Chapdelaine RA, Tran NK, Cooper JA: The role of Saccharomyces Phylogenomic analyses support the monophyly of Excavata and resolve “ ” cerevisiae coronin in the actin and microtubule . Curr Biol relationships among eukaryotic supergroups . Proc Natl Acad Sci USA 1998, 8:1281-1284. 2009, 106:3859-3864. 5. Suzuki K, Nishihata J, Arai Y, Honma N, Yamamoto K, Irimura T, 30. Goode BL, Wong JJ, Butty AC, Peter M, McCormack AL, Yates JR, Drubin DG, Toyoshima S: Molecular cloning of a novel actin-binding protein, p57, Barnes G: Coronin promotes the rapid assembly and cross-linking of with a WD repeat and a leucine zipper motif. FEBS Lett 1995, 364:283-288. actin filaments and may link the actin and microtubule cytoskeletons in 6. de Hostos EL: A brief history of the coronin family. Subcell Biochem 2008, yeast. J Cell Biol 1999, 144:83-98. 48:31-40. 31. Liu SL, Needham KM, May JR, Nolen BJ: Mechanism of a Concentration- 7. Clemen CS, Rybakin V, Eichinger L: The coronin family of proteins. Subcell dependent Switch between Activation and Inhibition of Arp2/3 Complex Biochem 2008, 48:1-5. by Coronin. J Biol Chem 2011, 286:17039-17046. 8. Uetrecht AC, Bear JE: Coronins: the return of the crown. Trends Cell Biol 32. Veltman DM, Insall RH: WASP family proteins: their evolution and its 2006, 16:421-426. physiological implications. Mol Biol Cell 2010, 21:2880-2893. 9. Smith TF: Diversity of WD-repeat proteins. Subcell Biochem 2008, 48:20-30. 33. Vertessy BG, Toth J: Keeping uracil out of DNA: physiological role, 10. Neer EJ, Schmidt CJ, Nambudripad R, Smith TF: The ancient regulatory- structure and catalytic mechanism of dUTPases. Acc Chem Res 2009, protein family of WD-repeat proteins. Nature 1994, 371:297-300. 42:97-106. 11. de Hostos EL, Rehfuess C, Bradtke B, Waddell DR, Albrecht R, Murphy J, 34. Gloss A, Rivero F, Khaire N, Muller R, Loomis WF, Schleicher M, Noegel AA: Gerisch G: Dictyostelium mutants lacking the cytoskeletal protein Villidin, a novel WD-repeat and villin-related protein from Dictyostelium, Eckert et al. BMC Evolutionary Biology 2011, 11:268 Page 17 of 17 http://www.biomedcentral.com/1471-2148/11/268

is associated with membranes and the cytoskeleton. Mol Biol Cell 2003, and the association of rhizaria with chromalveolates. Mol 14:2716-2727. Biol Evol 2007, 24:1702-1713. 35. Yonemura I, Mabuchi I: Heterogeneity of mRNA coding for 58. diArk - a resource for eukaryotic genome research. [http://www.diark.org]. Caenorhabditis elegans coronin-like protein. Gene 2001, 271:255-259. 59. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL: 36. Xavier CP, Rastetter RH, Stumpf M, Rosentreter A, Muller R, Reimann J, NCBI BLAST: a better web interface. Nucleic Acids Res 2008, 36:W5-9. Cornfine S, Linder S, van Vliet V, Hofmann A, Morgan RO, Fernandez MP, 60. Odronitz F, Kollmar M: Pfarao: a web application for protein family Schroder R, Noegel AA, Clemen CS: Structural and functional diversity of analysis customized for cytoskeletal and motor proteins (CyMoBase). novel coronin 1C (CRN2) isoforms in muscle. J Mol Biol 2009, 393:287-299. BMC genomics 2006, 7:300. 37. Asano S, Mishima M, Nishida E: Coronin forms a stable dimer through its 61. CyMoBase - a database for cytoskeletal and motor proteins. [http://www. C-terminal coiled coil region: an implicated role in its localization to cell cymobase.org]. periphery. Genes Cells 2001, 6:225-235. 62. Thompson JD, Gibson TJ, Higgins DG: Multiple sequence alignment using 38. Beck K, Gambee JE, Kamawal A, Bachinger HP: A single amino acid can ClustalW and ClustalX. Curr Protoc Bioinformatics 2002, Chapter 2:Unit 2 3. switch the oligomerization state of the alpha-helical coiled-coil domain 63. Stamatakis A, Hoover P, Rougemont J: A rapid bootstrap algorithm for the of cartilage matrix protein. EMBO J 1997, 16:3767-3777. RAxML Web servers. Syst Biol 2008, 57:758-771. 39. Oku T, Itoh S, Okano M, Suzuki A, Suzuki K, Nakajin S, Tsuji T, Nauseef WM, 64. Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference Toyoshima S: Two regions responsible for the actin binding of p57, a under mixed models. Bioinformatics 2003, 19:1572-1574. mammalian coronin family actin-binding protein. Biol Pharm Bull 2003, 65. Altekar G, Dwarkadas S, Huelsenbeck JP, Ronquist F: Parallel Metropolis 26:409-416. coupled Markov chain Monte Carlo for Bayesian phylogenetic inference. 40. Cai L, Makhov AM, Bear JE: F-actin binding is essential for coronin 1B Bioinformatics 2004, 20:407-415. function in vivo. J Cell Sci 2007, 120:1779-1790. 66. Whelan S, Goldman N: A general empirical model of protein evolution 41. Gandhi M, Jangi M, Goode BL: Functional surfaces on the actin-binding derived from multiple protein families using a maximum-likelihood protein coronin revealed by systematic mutagenesis. J Biol Chem 2010, approach. Mol Biol Evol 2001, 18:691-699. 285:34899-34908. 67. Abascal F, Zardoya R, Posada D: ProtTest: selection of best-fit models of 42. Van de Peer Y, Maere S, Meyer A: 2R or not 2R is not the question protein evolution. Bioinformatics 2005, 21:2104-2105. anymore. Nat Rev Genet 2010, 11:166. 68. Le SQ, Gascuel O: An improved general amino acid replacement matrix. 43. Steinke D, Hoegg S, Brinkmann H, Meyer A: Three rounds (1R/2R/3R) of Mol Biol Evol 2008, 25:1307-1320. genome duplications and the evolution of the glycolytic pathway in 69. Letunic I, Doerks T, Bork P: SMART 6: recent updates and new vertebrates. BMC Biol 2006, 4:16. developments. Nucleic Acids Res 2009, 37:D229-232. 44. Jaillon O, Aury JM, Brunet F, Petit JL, Stange-Thomann N, Mauceli E, 70. Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Bouneau L, Fischer C, Ozouf-Costaz C, Bernot A, Nicaud S, Jaffe D, Fisher S, Gunasekaran P, Ceric G, Forslund K, Holm L, Sonnhammer EL, Eddy SR, Lutfalla G, Dossat C, Segurens B, Dasilva C, Salanoubat M, Levy M, Boudet N, Bateman A: The Pfam protein families database. Nucleic Acids Res 2010, Castellano S, Anthouard V, Jubin C, Castelli V, Katinka M, Vacherie B, 38:D211-222. Biemont C, Skalli Z, Cattolico L, Poulain J, et al: Genome duplication in the 71. Sigrist CJ, Cerutti L, de Castro E, Langendijk-Genevaux PS, Bulliard V, teleost fish Tetraodon nigroviridis reveals the early vertebrate proto- Bairoch A, Hulo N: PROSITE, a database for functional karyotype. Nature 2004, 431:946-957. characterization and annotation. Nucleic Acids Res 2010, 38:D161-166. 45. Chan KT, Creed SJ, Bear JE: Unraveling the enigma: progress towards 72. Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo understanding the coronin family of actin regulators. Trends Cell Biol generator. Genome Res 2004, 14:1188-1190. 2011, 21:481-488. 73. Letunic I, Bork P: Interactive Tree Of Life (iTOL): an online tool for 46. Odronitz F, Kollmar M: Drawing the tree of eukaryotic life based on the phylogenetic tree display and annotation. Bioinformatics 2007, 23:127-128. analysis of 2,269 manually annotated myosins from 328 species. Genome Biol 2007, 8:R196. doi:10.1186/1471-2148-11-268 47. Berg JS, Powell BC, Cheney RE: A millennial myosin census. Mol Biol Cell Cite this article as: Eckert et al.: A holistic phylogeny of the coronin 2001, 12:780-794. gene family reveals an ancient origin of the tandem-coronin, defines a 48. Archer SK, Claudianos C, Campbell HD: Evolution of the gelsolin family of new subfamily, and predicts protein function. BMC Evolutionary Biology actin-binding proteins as novel transcriptional coactivators. Bioessays 2011 11:268. 2005, 27:388-396. 49. Xavier CP, Eichinger L, Fernandez MP, Morgan RO, Clemen CS: Evolutionary and functional diversity of coronin proteins. Subcell Biochem 2008, 48:98-109. 50. Khurana S, George SP: Regulation of cell structure and function by actin- binding proteins: villin’s perspective. FEBS Lett 2008, 582:2128-2139. 51. Simpson AG, Inagaki Y, Roger AJ: Comprehensive multigene phylogenies of excavate protists reveal the evolutionary positions of “primitive” eukaryotes. Mol Biol Evol 2006, 23:615-625. 52. Keeling PJ: The endosymbiotic origin, diversification and fate of plastids. Philos Trans R Soc Lond B Biol Sci 2010, 365:729-748. 53. Burki F, Shalchian-Tabrizi K, Pawlowski J: Phylogenomics reveals a new ‘megagroup’ including most photosynthetic eukaryotes. Biol Lett 2008, 4:366-369. Submit your next manuscript to BioMed Central 54. Burki F, Shalchian-Tabrizi K, Minge M, Skjaeveland A, Nikolaev SI, and take full advantage of: Jakobsen KS, Pawlowski J: Phylogenomics reshuffles the eukaryotic supergroups. PLoS One 2007, 2:e790. 55. Nozaki H, Maruyama S, Matsuzaki M, Nakada T, Kato S, Misawa K: • Convenient online submission Phylogenetic positions of Glaucophyta, green plants (Archaeplastida) • Thorough peer review and Haptophyta () as deduced from slowly evolving • No space constraints or color figure charges nuclear genes. Mol Phylogenet Evol 2009, 53:872-880. 56. Keeling PJ: Chromalveolates and the evolution of plastids by secondary • Immediate publication on acceptance endosymbiosis. J Eukaryot Microbiol 2009, 56:1-8. • Inclusion in PubMed, CAS, Scopus and Google Scholar 57. Hackett JD, Yoon HS, Li S, Reyes-Prieto A, Rummele SE, Bhattacharya D: • Research which is freely available for redistribution Phylogenomic analysis supports the monophyly of cryptophytes and

Submit your manuscript at www.biomedcentral.com/submit