Biochemistry and Molecular Biology 46 (2014) 17e24

Contents lists available at ScienceDirect

Insect Biochemistry and Molecular Biology

journal homepage: www.elsevier.com/locate/ibmb

Evolutionary implications of dipluran hexamerins

Wei Xie, Yun-Xia Luan*

Key Laboratory of Insect Developmental and Evolutionary Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China article info abstract

Article history: Hexamerin, as a member of the highly conserved hemocyanin superfamily, has been shown to Received 21 November 2013 be a good marker for the phylogenetic study of . However, few studies have been conducted on Received in revised form hexamerins in basal hexapods. The first hexamerin CspHex1 was reported only recently (Pick and 10 January 2014 Burmester, 2009). Remarkably, CspHex1 was suggested to have evolved from hexapod hemocyanin Accepted 11 January 2014 subunit type 2, which is very different from all insect hexamerins originated from hexapod hemocyanin subunit type 1. Does this finding suggest double or even multiple origins of hexamerins in ? To Keywords: find more evidence on the evolution of dipluran hexamerins, eight putative hexamerin gene sequences Diplura Insecta were obtained from three dipluran , as were three hemocyanin genes from two collembolan Hexamerin species. Unexpectedly, after adding the new sequences into the phylogenetic analyses, all dipluran Hemocyanin hexamerins including CspHex1 grouped together and as sister to the insect hexamerins, with high Phylogeny likelihood and Bayesian support. Our analysis supports a single origin of the hexamerins in Hexapoda, and suggests the close relationship between Diplura and Insecta. In addition, our study indicates that a relatively comprehensive taxa sampling is essential to solve some problems in phylogenetic reconstruction. Ó 2014 Elsevier Ltd. All rights reserved.

1. Introduction orders (Burmester et al., 1998, 2002; Hagner-Holler et al., 2007; Pick et al., 2008; Martins et al., 2010). The hemocyanin superfamily includes the following five classes Together with Protura and Collembola, Diplura is regarded as a of proteins with similar structures and close evolutionary re- group of basal hexapods; however, it has an uncertain relationship lationships: hemocyanins (Hc), phenoloxidases (PO), pseudohe- with Insecta (Koch, 1997; Carapelli et al., 2007). Most studies sug- mocyanins/cryptocyanins (PHc/Cc), hexamerins (Hx), and gest Diplura are members of a monophyletic Hexapoda (Bitsch and hexamerin receptors (Hxr) (Burmester, 2002). Hemocyanin is a Bitsch, 2004; Bitsch et al., 2004; Dallai et al., 2011). However, some kind of respiratory protein that only exists in some and studies based on mtDNA data and the combination of morpho- mollusks (Burmester and Hankeln, 2007). Hexamerins of hexapods logical characters and molecular markers rejected the monophyly are derived from hemocyanins, and share a similar protein struc- of Hexapoda, and found Diplura close to some crustaceans rather ture with hemocyanins; however, hexamerins do not function as than to Insecta (Carapelli et al., 2007; Giribet et al., 2001, 2004). respiratory proteins because of mutations at six histidine residues Collembola has the same problem (Nardi et al., 2003; Cook et al., of the copper-binding sites (Burmester, 2002). Members of the 2005). These conflicting results suggest the question of hexapod hemocyanin superfamily that evolved independently in Arthro- monophyly should be investigated from a different angle, with a poda are unique and practical evolutionary markers for studying new marker. arthropod phylogeny (Burmester, 2002; Ertas et al., 2009; Kusche CspHex1, the first reported hexamerin molecule of a dipluran and Burmester, 2001a, b; Terwilliger et al., 1999). In particular, species was identified by Pick and Burmester in 2009.Itwas hexamerins have recently been used in the phylogenetic inference sequenced from Campodea sp. (Campodeoidea: Campodeidae). In of insects, and the trees constructed from insect hexamerins are contrast to all known insect hexamerins, which are derived from consistent with the generally accepted relationships of insect hexapod hemocyanin subunits type 1 (Hc1), CspHex1 grouped with hexapod hemocyanin subunit type 2 (Hc2) on the gene tree, but with a long branch. Therefore, Pick and Burmester (2009) proposed that hexamerins evolved independently from hemocyanins at least * Corresponding author. two times in Hexapoda. Because only one dipluran hexamerin was E-mail address: [email protected] (Y.-X. Luan). involved in their study, we sought more data to test the claim for

0965-1748/$ e see front matter Ó 2014 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.ibmb.2014.01.003 18 W. Xie, Y.-X. Luan / Insect Biochemistry and Molecular Biology 46 (2014) 17e24 such a distinctive origin of CspHex1. In this study, we explored databases of dipluran O. japonicus, Campodea cf. fragilis and hexamerins and hemocyanins in three dipluran species, as well as collembolan F. candida. Query sequences included the hexamerin some species from Collembola. Eight new hexamerins from three genes from dipluran L. weberi (obtained by our RT-PCR cloning) and dipluran species were sequenced or identified from databases, and Campodea sp. CP-2009 (Accession: FN263375), as well as hemo- three hemocyanins were sequenced from collembolans. With these cyanin genes from collembolan Sinella curviseta (Accession: new data, a comprehensive phylogenetic analysis was conducted to FM242638) and archaeognathan Machilis germanica (Accession: elucidate the evolutionary and taxonomic implications of dipluran FM242639). The hexamerin and hemocyanin hits were then hexamerins. screened at the NCBI non-redundant (nr) protein database with the BlastX algorithm on the NCBI website. The verified cDNA fragments 2. Materials and methods were assembled into unigenes using ContigExpress of Vector NTI 10.3 (Invitrogen, Carlsbad, USA). Afterward, we applied the newly 2.1. Specimen collection discovered sequences for iterations in the next round of searching. Then, the putative hexamerin and hemocyanin candidates were Dipluran Lepidocampa weberi (Oudemans, 1890) (Campodeoi- verified through RT-PCR cloning and sequencing (sequences from dea: Campodeidae) was collected in Guangyuan, Sichuan Province C. fragilis were not verified because of the lack of specimen). The (China), and dipluran Occasjapyx japonicus (Enderlein, 1907) sequences were finally checked for similarity with reference se- (Japygoidea: ) was collected in Minhang District, quences using PSI-BLAST (Altschul et al., 1997), as well as for his- Shanghai (China). Collembolans Lepidocyrtus cyaneus (Tullberg, tidine mutations at the copper-binding sites. 1871) (Entomobryoidea: ) and Entomobrya proxima Folsom 1924 (Entomobryoidea: Entomobryidae) were collected at 2.4. Phylogenetic analysis Shanghai (China), and collembolan Folsomia candida Willem 1902 (Entomobryoidea: Isotomidae) was provided by Aarhus University, Multiple sequence alignments were assembled using MAFFT Denmark. All collembolans were cultured at 20 C in the laboratory (Katoh et al., 2005) with the L-INS-I method, in which the BLOSUM for a long period and were fed with granulated dry yeast. 62 matrix was set as default. Three different alignments for phylogenetic analyses of dipluran hexamerins were constructed 2.2. RT-PCR cloning and sequencing successively. First, to confirm the tree topology of Pick and Burmester (2009), we used exactly the same 103 protein se- Ò With Trizol reagent, total RNA of dipluran L. weberi (Lwe) and quences from the arthropod hemocyanin superfamily as in their O. japonicus (Oja) were extracted from single living juvenile and study, in which the single dipluran hexamerin (CspHex1) was used, adult specimens respectively, whereas RNA of collembolan and only one collembolan hemocyanin (ScuHc1) was included. F. candida (Fca), L. cyaneus (Lcy), and E. proxima (Epr) was extracted Detailed information on these 103 molecules is presented in from pooled specimens of living juveniles and adults. First-strand Supplementary Table 1, and the alignment is presented in cDNA synthesis was performed via SuperScriptÔ III Reverse Tran- Supplementary Fig. 1a. scriptase (Invitrogen, Carlsbad, USA) according to the manufac- Second, four new hexamerin sequences from Diplura with more turer’s instruction. Based on the conserved amino acid sites of than 400 amino acids (LweHx1, OjaHx1, CfrHx1, and CfrHx2), three hemocyanins and hexamerins in hexapods and crustaceans, new collembolan hemocyanin sequences (LcyHc1, EprHc1, and degenerate PCR primers were designed and used for PCR amplifi- FcaHc2), and one previously reported collembolan hemocyanin cation. We obtained the hexamerin gene fragment of dipluran sequence (FcaHc1, CAR85703, Pick et al., 2009) were added and L. weberi with the primer pair 50-GAGGGDSWGTTYSTBTAYGC-30and aligned together with the 103 sequences employed above (the 50-YAARGGYTTGTGGTTNAGACG-30. In addition, we obtained he- alignment of 111 sequences in Supplementary Fig. 1b). mocyanin gene fragments of collembolan L. cyaneus and E. proxima Third, to supplement work by Ertas et al. (2009), three remipede with the primer pair 50-ATGGAYTKBCCHTTCTGGTGGA-30 and 50- hemocyanins from Speleonectes tulumensis (Ertas et al., 2009)were GTNGCGGTYTCRAARTGYTCCAT-30. To avoid nonspecific products, added to the alignment of 111 sequences to generate a 114- we optimized the annealing temperature by performing gradient sequence alignment (Supplementary Fig. 1c). The alignments in PCRs. PCR products of the expected sizes were cloned after gel Supplementary Fig. 1aec were visualized using the latest version of purification, and then sequenced by the commercial company GeneDoc (Nicholas and Nicholas, 1997). Sangon (Shanghai, China). The gene-specific primers were further The appropriate model of amino acid substitution designed based on the obtained sequences, and 50 and 30 RACE were (WAG þ Gamma, Whelan and Goldman, 2001) was selected by performed through nested PCR according to the instructions in the ProtTest (Abascal et al., 2005) using the Akaike Information Crite- 50 full RACE kit and 30 full RACE kit from Takara Bio, Inc. (Dalian, rion. RAxML version 7.0.4 (Stamatakis, 2006) was run for maximum China). The pMD19-T vector from Takara was used for cDNA frag- likelihood analysis with the number of bootstrap replicates set to ment cloning. 1000. The Bayesian method was performed using MrBayes 3.1.2 (Huelsenbeck and Ronquist, 2001). Prior probabilities for all trees 2.3. Data mining of dipluran hexamerins were equal. Metropolis-coupled Markov Chain Monte Carlo sam- pling with one cold and three heated chains was run for 20,000,000 The total RNA for RNA Sequencing of dipluran O. japonicus and generations for the dataset of 103 sequences, 8,000,000 genera- Ò collembolan F. candida was prepared through the Trizol method tions for the dataset of 111 sequences, and 10,000,000 generations described above. Their transcriptomes were sequenced separately for the dataset of 114 sequences. Two independent runs continued using Illumina HiSeq 2000 and annotated by Beijing Genomics until the values of average split frequencies were stationary and Institute (Shenzhen, China) (unpublished data from our lab). In below 0.01. For different datasets, we observed different speeds of addition, the expressed sequence tag (EST) data for dipluran Cam- convergence; therefore, the burnin value was set accordingly to podea cf. fragilis (Cfr) (taxid: 383857) was downloaded from Gen- estimate the posterior probabilities. The burnin value (with sta- Bank (Meusemann et al., 2010). tionary value of average split frequencies noted) for datasets of 103, The software ncbi-blast-2.2.27þ was downloaded from the NCBI 111, and 114 sequences was 100,000 (0.003), 30,000 (0.004), and website to run the tBLASTn program locally against these three 50,000 (0.005), respectively. The starting trees were randomly W. Xie, Y.-X. Luan / Insect Biochemistry and Molecular Biology 46 (2014) 17e24 19 selected, and trees were sampled every 100 generations. In addi- 2008), and blattarian Periplaneta americana (PamHc2, CAR85702; tion, one more Bayesian analysis was performed for the dataset of Pick et al., 2009). 111 sequences using PhyloBayes (Lartillot et al., 2009) with the WAG model and CAT rate model, two runs for 1,000,000 generation, 3.2. Molecular phylogeny of dipluran hexamerins and trees sampled every 100 generations. Crustacean hemocyanin and pseudohemocyanins were used as the outgroups for all rooted The trees produced by maximum likelihood and Bayesian trees in this study. All the phylogenetic tree calculations were run methods were essentially identical. First, all 103 protein sequences on CIPRES Science Gateway (Miller et al., 2010). used by Pick and Burmester (2009) were used in the phylogenetic inference, and the consistent tree topology as in their study was 3. Results acquired. CspHex1 groups with the hexapod Hc2 sequences (ML bootstrap value ¼ 60, Bayesian posterior probabilities ¼ 1.00), far 3.1. Identification of new molecules from the clade of insect hexamerins (Supplementary Fig. 2). Then, along with the 103 sequences used above, another four First, a hexamerin gene from dipluran L. weberi (LweHx1) and dipluran hexamerins longer than 400 amino acids (LweHx1, two hemocyanin genes from collembolans L. cyaneus (LcyHc1) and OjaHx1, CfrHx1, and CfrHx2) and four collembolan hemocyanins E. proxima (EprHc1) were identified via RT-PCR and sequencing. (LcyHc1, EprHc1, FcaHc2 and FcaHc1) were included for further Second, two new hexamerin molecules of dipluran O. japonicus phylogenetic inference (Fig. 2 and Supplemental Fig. 3a). The new (OjaHx1, 2) and a hemocyanin of collembolan F. candida (FcaHc2) dataset with 111 sequences exhibited faster convergence in the were identified from transcriptome sequences (unpublished data). Bayesian analysis and higher support values for some nodes in both These transcriptome-derived identifications were further the maximum likelihood and Bayesian analyses. Remarkably, the confirmed through RT-PCR and sequencing. With the 50 and 30 RACE phylogenetic position of CspHex1 has changed. All dipluran hex- procedure, LweHx1 and OjaHx1 were successfully extended to full amerins, including CspHex1, group together (with 83%,1.00 support length cDNA sequence respectively and FcaHc2 was extended to a value) and join with the clade of insect hexamerins (with 82%, 1.00 nearly full length cDNA sequence. We failed to obtain any hexam- support value), with all these hexamerins grouping in turn with the erin molecule in Protura or Collembola, nor any hemocyanin Hc1 sequences (Fig. 2 and Supplemental Fig. 3a). molecule in Diplura or Protura. For Collembola, FcaHc1, LcyHc1, and EprHc1 group well with the Afterward, from the EST database of the dipluran C. fragilis insect Hc1 sequences, and FcaHc2 joins the insect Hc2 sequences (Meusemann et al., 2010), the verified candidates for putative he- with good support (Fig. 2 and Supplemental Fig. 3a). In particular, mocyanin members were assembled into five unigenes FcaHc2 is the first hemocyanin subunit type 2 ever reported in basal (named CfrHx1-5), which were all identified as putative hexam- hexapods. erins because of the mutations at the histidine residues of the Fig. 3 shows a simplified tree derived from Fig. 2 to display the copper-binding sites (i.e., from amino acids other than histidine at relationships among the hexapod hexamerins involved in the these sites). analysis. The dipluran hexamerins are from the base. It is also All eleven newly reported molecules are listed and annotated in interesting that with new dipluran hexamerins added, the poly- Table 1 with relevant properties. Eight of the dipluran hexamerins neopeteran hexamerins group together possess at least one amino acid other than histidine among the six (Plecoptera þ Orthroptera þ Blattria & Isoptera), which is different copper-binding sites (Table 1). from the previous studies on insect hexamerins (Burmester, 2002; The multiple sequence alignment of selected dipluran hexam- Hagner-Holler et al., 2007), although the support value is relatively erins, along with reference sequences, is shown in Fig. 1. The pu- low. tative hexamerins from dipluran L. weberi (LweHx1, JX867262), The results of Bayesian inference by PhyloBayes were also C. fragilis (CfrHx1, JX867269), and O. japonicus (OjaHx1, JX867263) consistent with those of MrBayes (data not shown). Moreover, the were compared with hexamerins from dipluran Campodea sp. unrooted tree of 111 sequences in Fig. 2 was checked in case an (CspHex1, CAX63173, Pick and Burmester, 2009) and plecopteran artificial influence occurs from outgroup selection (Supplementary Perla marginata (PmaHex1, AM690365, Hagner-Holler et al., 2007), Fig. 3b). All methods resulted in basically consistent tree topologies. as well as with hemocyanins from collembolans F. candida (FcaHc2, Finally, three remipede hemocyanins were also added to the KF670722) and S. curviseta (ScuHc1, FM242638; Pick et al., 2009), phylogenetic analysis with both maximum likelihood and Bayesian zygentoman Thermobia domestica (TdoHc2, CAQ63322; Pick et al., method (Supplemental Fig. 4). The basic phylogenetic topology is

Table 1 Properties of putative hexamerin and hemocyanin molecules reported in this study.

Species Sequence GenBank cDNA (bp) Deduced Conserved Method name accession no. amino acid histidine sites sequence for copper-binding (countable sites)

Diplura Campodeoidea: Lepidocampa weberi LweHx1 JX867262 2094 698 2 (6) PCR cloning & RACE Campodeidae Campodea cf. fragilis CfrHx1 JX867269 1422 474 2 (6) EST data mining Campodea cf. fragilis CfrHx2 JX867270 1338 446 0 (3) EST data mining Campodea cf. fragilis CfrHx3 FN207665 502 166 0 (3) EST data mining Campodea cf. fragilis CfrHx4 FN208803 682 227 0 (4) EST data mining Campodea cf. fragilis CfrHx5 FN208006 641 194 0 (3) EST data mining Japygoidea: Japygidae Occasjapyx japonicus OjaHx1 JX867263 2184 728 4 (6) PCR cloning & RACE Occasjapyx japonicus OjaHx2 JX867264 789 263 2 (3) PCR cloning Collembola Folsomia candida FcaHc2 KF670722 2106 672 6 (6) PCR cloning & RACE Lepidocyrtus cyaneus LcyHc1 JX867267 495 164 4 (4) PCR cloning Entomobrya Proxima EprHc1 JX867268 493 163 5 (5) PCR cloning 20 W. Xie, Y.-X. Luan / Insect Biochemistry and Molecular Biology 46 (2014) 17e24 W. Xie, Y.-X. Luan / Insect Biochemistry and Molecular Biology 46 (2014) 17e24 21 consistent with the above results, although support values for the (Gullan and Cranston, 2005), Diplura undergoes a less extreme some nodes have decreased. metamorphosis than do holo- or hemimetabolous insects. The many dipluran hexamerins found in this study, together with that from a silverfish (TdoHex1, Zygentoma, T. domestica; Pick et al., 4. Discussion 2008), documents the occurrence of hexamerins in ametabolous hexapods. In further studies, hexamerins from ametabolous species 4.1. Evolution of hexamerins should be used to explore their essential functions beyond metamorphosis. In the arthropod hemocyanin superfamily, insect hexamerins evolved from hexapod Hc1 and crustacean cryptocyanin derived from crustacean Hc. Both the insect hexamerins and the crustacean 4.2. Evolution of basal hexapods pseudohemocyanins (cryptocyanins) serve as storage proteins (Burmester, 1999a, 2002; Terwilliger et al., 1999) and share the Diplura is a group of soil arthropods that typically have two same function in exoskeleton building (Terwilliger et al., 2005; prolonged tails or a pair of pincer-like cerci, with no eyes or Zhou et al., 2007). This protein family has been studied little in pigment (Koch, 2009). The phylogenetic position of Diplura is still the basal hexapods. In the only previous study (Pick and Burmester, unclear (Carapelli et al., 2007). In the present study, both the dis- 2009), the dipluran hexamerin (CspHex1) grouped on the gene tree tribution and phylogenetic analysis of hexamerins support Diplura with hexapod Hc2 proteins, far away from the insect hexamerins. being a member of Hexapoda, which is consistent with most pre- However, when we added more dipluran hexamerins from two vious studies based on morphological and molecular data (Koch, major groups (Japygoidea and Campodeoidea), the phylogenetic 1997; Bitsch et al., 2004; Dell’Ampio et al., 2009; Kjer et al., 2006; analysis joined all the dipluran hexamerins with insect hexamerins Luan et al., 2004, 2005; Meusemann et al., 2010; Regier et al., with good support (Fig. 2). A single origin of the known hexamerins 2008, 2010; von Reumont et al., 2011). Our results group four in Hexapoda is now indicated. collembolan hemocyanin gene sequences with insect Hc1 and in- Compared with hexapod type 1 hemocyanins, the known sect Hc2 respectively, which also confirms the monophyly of hexapod type 2 hemocyanins have a unique string of nine amino Hexapoda. acids (Fig. 1, sites 485e493). This feature is shared by CspHex1, as Mitochondrial genes are likely of limited value for investigating presented by Pick and Burmester (2009). We examined all five such ancient relationships (Brewer et al., 2013), and ribosomal RNA available dipluran hexamerins, and only the two Campodea species sequences of diplurans likewise are divergent, likely leading to were found to have the string (i.e., CspHex1 and CfrHx1 of Table 1). long-branch-attraction problems. Conveniently the hexamerin se- Thus, this feature is not common to all dipluran hexamerins, not to quences of diplurans are not so divergent, as some of the insects all Campodeoidea (LweHx1 lacks it) or even between hexamerins show longer branches for hexamerin in the tree of Fig. 2. This could from the same species (e.g., CfrHx1 and CfrHx2). Considering that mean the relatively slowly evolving hexamerin genes are indeed we got solid support for a single origin of hexapod hexamerins, two appropriate for answering the phylogenetic questions posed in this evolutionary scenarios can account for the findings about the nine study. amino acids: (1) The string was originally absent from hexamerins, The monophyly of Diplura has been questioned mainly because its then uniquely inserted into Hx1 of the Campodea. This seems two major groups, Campodeoidea and Japygoidea, differ in cercus likely because the nine amino acids in the Campodea Hx1 are not type, sperm morphology, and ovarian structure (Stys and Bilinski, quite the same as those in the insect Hc2, nor those in the 1990; Bilinski, 1994), but many other morphological characteristics, collembolan FcaHc2 (Fig. 1). (2) The nine amino acid were present as well as some molecular studies, support dipluran monophyly in the ancestral hexapod Hc1, which evolved into some hexamerins (Koch,1997; Luan et al., 2005; Dallai et al., 2011). In the present study, of Campodea as Pick and Burmester (2009) proposed, with the nine new dipluran hexamerins from three species cover both the Japy- residues retained there, and these nine were then lost from the goidea and Campodeoidea, thus providing a relatively comprehen- hexamerins of the other diplurans and, independently from the sive view of dipluran hexamerins. Our analysis supports the hexamerins of the other insects. monophyly of Diplura with high support values (83%, 1.00 in Fig. 2). It is widely accepted that hexamerins serve as storage proteins Actually we explored the hemocyanin-family proteins in all the in the pre-molt stage of many insect species for subsequent three groups of basal hexapods (Diplura, Collembola, and Protura) metamorphosis (Scheller et al., 1990; Burmester, 1999b; Telfer and through RT-PCR cloning and transcriptome database searching; Kunkel, 1991), although they may still have other roles as carriers of however, we failed to obtain any hemocyanin gene in Diplura, any small organic molecules, such as steroid hormones (Enderle et al., hexamerin in Collembola, or any related molecule at all in Protura. 1983). However, most functional studies of hexamerins have been We are uncertain whether hemocyanins necessarily occur in all performed on Diptera and Lepidoptera (Telfer and Kunkel, 1991; basal hexapods because other respiratory proteins such as hemo- Burmester et al., 1998; Burmester and Scheller, 1999; Nagamanju globin may be employed and the tracheal system may be sufficient et al., 2003), which are holometabolous insects that have a non- for some species. Nor have we eliminated the possibility of hex- feeding pupal stage in which food stores should be required. In a amerins in collembolans or proturans. study on the hymenopteran insect, parasitoid wasp Nasonia vitri- Hemocyanin sequences suggested a close relationship between pennis, the transcriptional analysis of hexamerins suggested the Remipedia and Hexapoda in the study of Ertas et al. (2009), but physiological function beyond metamorphosis, namely egg pro- their study lacked the related molecules from basal hexapods, with duction (Cristino et al., 2010). As a basal group with epimorphosis only one Collembola Hc1 included. In this study, three remipede

Fig. 1. Multiple sequence alignment of selected dipluran hexamerins with reference sequences. The six conserved histidines at the copper-binding sites are shaded in black, other conserved residues are shaded in light gray, and the featured nine amino acid strings for insect hemocyanin type 2 are in dark gray (sites 485e493). PmaHex1: a hexamerin from Plecoptera Perla marginata (Hagner-Holler et al., 2007); LweHx1: a hexamerin from Diplura Lepidocampa weberi; OjaHx1: a hexamerin from Diplura Occasjapyx japonicus; CfrHx1: a hexamerin from Diplura Campodea cf. fragilis; CspHex1: a hexamerin from Diplura Campodea sp. (Pick and Burmester, 2009); ScuHc1: a hemocyanin from Collembola Sinella curviseta (Pick et al., 2009); TdoHc2: a hemocyanin from Zygentoma Thermobia domestica (Pick et al., 2008); PamHc2: a hemocyanin from Blattaria Periplaneta americana (Pick et al., 2009); FcaHc2: a hemocyanin from Collembola Folsomia candida. 22 W. Xie, Y.-X. Luan / Insect Biochemistry and Molecular Biology 46 (2014) 17e24

Fig. 2. ML tree of hexapod hemocyanins and hexamerins, with the crustacean molecules as the outgroup, calculated using RAxML. The phylogenetic tree was based on 111 protein sequences, from adding four dipluran hexamerins (OjaHx1, LweHx1, CfrHx1, and CfrHx2) and four collembolan hemocyanins (FcaHc1, FcaHc2, LcyHc1, and EprHc1) to the 103 sequences of Supplementary Fig. 2. The clades of endopterygote hexamerins are collapsed to simplify the topology. For original tree refer to supplemental Fig. 3a. The scale bar denotes 0.3 amino acid substitutions per site. The numbers at the nodes depict the bootstrap values of maximum likelihood/Bayesian posterior probabilities. New sequences obtained in this study are marked with *. Abbreviations: PleHc, Pacifastacus leniusculus hemocyanin subunit; GroHc1, Gammarus roeseli hemocyanin subunit 1; HamHcA, Homarus americanus hemocyanin subunit A; PelHc1-4, Palinurus elephas hemocyanin subunit 1e4; PvuHc, Palinurus vulgaris hemocyanin; PinHcA-C, Panulirus interruptus hemocyanin subunit AeC; CmaHc6, Cancer magister hemocyanin subunit 6; CsaHc, Callinectes sapidus hemocyanin subunit; PvaHc, Penaeus vannamei hemocyanin; PvaHc1, Penaeus vannamei hemocyanin subunit 1; CmaCc1, Cancer magister cryptocyanin 1; HamPHc1 and 2, Homarus americanus pseudohemocyanin 1 and 2; TdoHc1 and 2, Thermobia domestica hemocyanin subunit 1 and 2; PgrHc1 and 2, Perla grandis hemocyanin subunit 1 and 2; PmaHc1 and 2, Perla marginata hemocyanin subunit 1 and 2; HmeHc1 and 2, membranacea hemocyanin subunit 1 and 2; BduHc1 and 2; Blaptica dubia hemocyanin subunit 1 and 2; CseHc1 and 2, Cryptotermes secundus hemocyanin subunit 1 and 2; PamHc1 and 2, Periplaneta americana hemocyanin subunit 1 and 2; FcaHc1, Folsomia candida hemocyanin subunit 1; ScuHc1, Sinella curviseta hemocyanin subunit 1; CacHc1, Chelidurella acan- thopygia hemocyanin subunit 1; SamEHP, Schistocerca americana embryonic hemolymph protein; CmoHc1, Carausius morosus hemocyanin subunit 1; MgeHc1, Machilis germanica hemocyanin 1; TdoHx1, Thermobia domestica hexamerin 1; EmuHex1, Ephemerella mucronata hexamerin 1; PmaHx1, Perla marginata hexamerin 1; RflHex1 and 2, Reticulitermes flavipes hexamerin 1 and 2; PamHex12, Periplaneta americana hexamerin (allergen clone C12); BdiHex, Blaberus discoidalis hexamerin; RmiHx1 and 2, Romalea microptera hex- amerin 1 and 2; LmiJHBP, Locusta migratoria juvenile hormone binding protein; RclCyanA and B, Riptortus clavatus cyanoprotein alpha subunit and beta subunit. hemocyanins were added into our dataset of 111 sequences for until more diplura taxa were added. To assure the validity of the phylogenetic inference. The affinity of remipede hemocyanins with analysis, we performed a control study of datasets (Fig. 2 vs. hexapod hemocyanins is further supported. Supplementary Fig. 2) and phylogenetic inference with several methods and models (maximum likelihood with RAxML vs. 4.3. Phylogenetic inference and conclusion Bayesian analysis). We conclude that relatively comprehensive data of dipluran hexamerins are essential to resolve their molecular Our research demonstrates a case in which a dipluran hexam- origin (from Hc1) and diplurans’ monophyly and phylogenetic po- erin was not able to achieve a well-resolved phylogenetic position sition close to Insecta. W. Xie, Y.-X. Luan / Insect Biochemistry and Molecular Biology 46 (2014) 17e24 23

Cristino, A., Nunes, F., Barchuk, A., Aguiar-Coelho, V., Simões, Z., Bitondi, M., 2010. Organization, evolution and transcriptional profile of hexamerin genes of the parasitic wasp Nasonia vitripennis (Hymenoptera: Pteromalidae). Insect Mol. Biol. 19, 137e146. Dallai, R., Mercati, D., Carapelli, A., Nardi, F., Machida, R., Sekiya, K., Frati, F., 2011. Sperm accessory microtubules suggest the placement of Diplura as the sister- group of Insecta s.s. Arth. Struct. Dev. 40, 77e92. Dell’Ampio, E., Szucsich, N.U., Carapelli, A., Frati, F., Steiner, G., Steinacher, A., Pass, G., 2009. Testing for misleading effects in the phylogenetic reconstruction of ancient lineages of hexapods: influence of character dependence and char- acter choice in analyses of 28S rRNA sequences. Zool. Scr. 38, 155e170. Enderle, U., Käuser, G., Renn, L., Scheller, K., Koolman, J., 1983. Ecdysteroids in the hemolymph of blowfly are bound to calliphorin. In: Scheller, K. (Ed.), The Larval Serum Proteins of Insects: Function, Biosynthesis, Genetic. Thieme, Stuttgart, pp. 40e49. Ertas, B., von Reumont, B.M., Wägele, J.W., Misof, B., Burmester, T., 2009. Hemo- cyanin suggests a close relationship of Remipedia and Hexapoda. Mol. Biol. Evol. 26, 2711e2718. Giribet, G., Edgecombe, G.D., Carpenter, J.M., D’Haese, C.A., Wheeler, W.C., 2004. Is Ellipura monophyletic? A combined analysis of basal hexapod relationships with emphasis on the origin of insects. Org. Divers. Evol. 4, 319e340. Fig. 3. A simplified diagram for the evolution of hexapod hexamerins. Abbreviations: Giribet, G., Edgecombe, G.D., Wheeler, W., 2001. Arthropod phylogeny based on Hc, hemocyanins; PHc, pseudohemocyanins; Hc1, hemocyanin subunit type 1; Hc2, eight molecular loci and morphology. Nature 413, 157e161. fi hemocyanin subunit type 2; Hx, hexamerins. Gullan, P.J., Cranston, P., 2005. Insect systematics: phylogeny and classi cation. In: Gullan, P.J., Cranston, P. (Eds.), Insects: an Outline of Entomology, third ed. Wiley-Blackwell, Oxford, p. 182 (Fig. 7.2). Acknowledgments Hagner-Holler, S., Pick, C., Girgenrath, S., Marden, J.H., Burmester, T., 2007. Diversity of stonefly hexamerins and implication for the evolution of insect storage proteins. Insect Biochem. Mol. Biol. 37, 1064e1074. Our work was inspired by Prof. Er-Jun Ling from the Chinese Huelsenbeck, J.P., Ronquist, F., 2001. MRBAYES: Bayesian inference of phylogenetic Academy of Sciences. We express our gratitude to Prof. Thorsten trees. Bioinformatics 17, 754e755. Burmester for his prompt and elaborate reply to our questions Katoh, K., Kuma, K.-I., Toh, H., Miyata, T., 2005. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucl. Acids Res. 33, 511e518. regarding dipluran hexamerins. We also appreciate his series of Kjer, K.M., Carle, F.L., Litman, J., Ware, J., 2006. A molecular phylogeny of Hexapoda. works on arthropod hemocyanin family members, which made our Arthrop. Syst. Phylog. 64, 35e44. research possible. This work was supported by grants from the Koch, M., 1997. Monophyly and phylogenetic position of the Diplura (Hexapoda). Pedobiologia 41, 9e12. National Natural Science Foundation of China (Nos. 31071911, Koch, M., 2009. Diplura. In: Resh, V.H., Cardé, R.T. (Eds.), Encyclopedia of Insects, 30870282, and 31272298). second ed. Elsevier, pp. 281e283. Kusche, K., Burmester, T., 2001a. Molecular cloning and evolution of lobster he- mocyanin. Biochem. Biophys. Res. Commun. 282, 887e892. Appendix A. Supplementary data Kusche, K., Burmester, T., 2001b. Diplopod hemocyanin sequence and the phylo- genetic position of the Myriapoda. Mol. Biol. Evol. 18, 1566e1573. Lartillot, N., Lepage, T., Blanquart, S., 2009. PhyloBayes 3: a Bayesian software Supplementary data related to this article can be found at http:// package for phylogenetic reconstruction and molecular dating. Bioinformatics dx.doi.org/10.1016/j.ibmb.2014.01.003. 25, 2286e2288. Luan, Y.X., Yao, Y.G., Xie, R.D., Yang, Y.M., Zhang, Y.P., Yin, W.Y., 2004. Analysis of 18S rRNA gene of Octostigma sinensis (Projapygoidea: Octostigmatidae) supports the References monophyly of Diplura. Pedobiologia 48, 453e459. Luan, Y.X., Mallatt, J.M., Xie, R.D., Yang, Y.M., Yin, W.Y., 2005. The phylogenetic Abascal, F., Zardoya, R., Posada, D., 2005. ProtTest: selection of best-fit models of positions of three basal-hexapod groups (Protura, Diplura, and Collembola) protein evolution. Bioinformatics 21, 2104e2105. based on ribosomal RNA gene sequences. Mol. Biol. Evol. 22, 1579e1592. Altschul, S., Madden, T., Schäffer, A., Zhang, J., Zhang, Z., Miller, W., Lipman, D., 1997. Martins, J.R., Nunes, F.M.F., Cristino, A.S., Simões, Z.L.P., Bitondi, M.M.G., 2010. The Gapped BLAST and PSI-BLAST: a new generation of protein database search four hexamerin genes in the honey bee: structure, molecular evolution and programs. Nucl. Acids Res. 25, 3389e3402. function deduced from expression patterns in queens, workers and drones. Bilinski, S., 1994. The ovary of . In: Büning, J. (Ed.), The Insect Ovary. BMC Mol. Biol. 11, 23. Chapman & Hall, London, pp. 7e30. Meusemann, K., von Reumont, B.M., Simon, S., Roeding, F., Strauss, S., Kück, P., Bitsch, C., Bitsch, J., 2004. Phylogenetic relationships of basal hexapods among Ebersberger, I., Walzl, M., Pass, G., Breuers, S., Achter, V., von Haeseler, A., mandibulate arthropods: a cladistic analysis based on comparative morpho- Burmester, T., Hadrys, H., Wägele, J.W., Misof, B., 2010. A phylogenomic logical characters. Zool. Scr. 33, 511e550. approach to resolve the arthropod tree of life. Mol. Biol. Evol. 27, 2451e Bitsch, J., Bitsch, C., Bourgoin, T., D’Haese, C., 2004. The phylogenetic position of 2464. early hexapod lineages: morphological data contradict molecular data. Syst. Miller, M.A., Pfeiffer, W., Schwartz, T., 2010. Creating the CIPRES science gateway for Entomol. 29, 433e440. inference of large phylogenetic trees. In: Proceedings of the Gateway Brewer, M.S., Swafford, L., Spruill, C.L., Bond, E.J., 2013. Arthropod phylogenetics in Computing Environments Workshop (GCE), 14 November 2010, New Orleans, light of three novel millipede (Myriapoda: Diplopoda) mitochondrial genomes Louisiana, pp. 1e8. with comments on the appropriateness of mitochondrial genome sequence Nagamanju, P., Hansen, I.A., Burmester, T., Meyer, S.R., Scheller, K., Dutta-Gupta, A., data for inferring deep level relationships. PLOS One 8, e68005. 2003. Complete sequence, expression and evolution of two members of the Burmester, T., Massey Jr., H.C., Zakharkin, S.O., Benes, H., 1998. The evolution of hexamerin protein family during the larval development of the rice moth, hexamerins and the phylogeny of insects. J. Mol. Evol. 47, 93e108. Corcyra cephalonica. Insect Biochem. Mol. Biol. 33, 73e80. Burmester, T., 1999a. Identification, molecular cloning, and phylogenetic analysis of Nardi, F., Spinsanti, G., Boore, J.L., Carapelli, A., Dallai, R., Frati, F., 2003. Hexapod a non-respiratory pseudo-hemocyanin of Homarus americanus. J. Biol. Chem. origins: monophyletic or paraphyletic? Science 299, 1887e1889. 274, 13217e13222. Nicholas, K.B., Nicholas Jr., H.B., 1997. GeneDoc: Analysis and Visualization of Ge- Burmester, T., 1999b. Evolution and function of the insect hexamerins. Eur. J. netic Variation. http://www.nrbsc.org/gfx/genedoc/index.html. Entomol. 96, 213e225. Pick, C., Hagner-Holler, S., Burmester, T., 2008. Molecular characterization of he- Burmester, T., 2002. Origin and evolution of arthropod hemocyanins and related mocyanin and hexamerin from the firebrat Thermobia domestica (Zygentoma). proteins. J. Comp. Physiol. B. 172, 95e107. Insect Biochem. Mol. Biol. 38, 977e983. Burmester, T., Hankeln, T., 2007. The respiratory proteins of insects. J. Insect Physiol. Pick, C., Schneuer, M., Burmester, T., 2009. The occurrence of hemocyanin in Hex- 53, 285e294. apoda. FEBS J. 276, 1930e1941. Burmester, T., Scheller, K., 1999. Ligands and receptors: common theme in insect Pick, C., Burmester, T., 2009. A putative hexamerin from a Campodea sp. suggests an storage protein transport. Naturwissenschaften 86, 468e474. independent origin of hemocyanin-related storage proteins in Hexapoda. Insect Carapelli, A., Lio, P., Nardi, F., van der Wath, E., Frati, F., 2007. Phylogenetic analysis of Mol. Biol. 18, 673e679. mitochondrial protein coding genes confirms the reciprocal paraphyly of Hex- Regier, J.C., Shultz, J.W., Ganley, A.R., Hussey, A., Shi, D., Ball, B., Zwick, A., Stajich, J.E., apoda and Crustacea. BMC Evol. Biol. 7, S8. Cummings, M.P., Martin, J.W., Cunningham, C.W., 2008. Resolving arthropod Cook, C.E., Yue, Q., Akam, M., 2005. Mitochondrial genomes suggest that hexapods phylogeny: exploring phylogenetic signal within 41 kb of protein-coding nu- and crustaceans are mutually paraphyletic. Proc. R. Soc. B 272, 1295e1304. clear gene sequence. Syst. Biol. 57, 920e938. 24 W. Xie, Y.-X. Luan / Insect Biochemistry and Molecular Biology 46 (2014) 17e24

Regier, J.C., Shultz, J.W., Zwick, A., Hussey, A., Ball, B., Wetzer, R., Martin, J.W., Terwilliger, N.B., Ryan, M.C., Towle, D., 2005. Evolution of novel functions: crypto- Cunningham, C.W., 2010. Arthropod relationships revealed by phylogenomic cyanin helps build new exoskeleton in Cancer magister. J. Exp. Biol. 208, 2467e analysis of nuclear protein coding sequences. Nature 463, 1079e1083. 2474. Scheller, K., Fischer, B., Schenkel, H., 1990. Molecular properties, functions and von Reumont, B.M., Jenner, R.A., Wills, M.A., Dell’ampio, E., Pass, G., Ebersberger, I., developmentally regulated biosynthesis of arylphorin in Calliphora vicina. In: Meyer, B., Koenemann, S., Iliffe, T.M., Stamatakis, A., Niehuis, O., Meusemann, K., Hagedorn, H.H. (Ed.), Molecular Insect Science. Plenum, New York, pp. 155e162. Misof, B., 2011. Pancrustacean phylogeny in the light of new phylogenomic data: Stamatakis, A., 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic an- support for Remipedia as the possible sister group of Hexapoda. Mol. Biol. Evol. alyses with thousands of taxa and mixed models. Bioinformatics 22, 2688e2690. 29, 1031e1045. Stys, P., Bilinski, S., 1990. Ovariole types and the phylogeny of hexapods. Biol. Rev. Whelan, S., Goldman, N., 2001. A general empirical model of protein evolution 65, 401e429. derived from multiple protein families using a maximum-likelihood approach. Telfer, W.H., Kunkel, J.G., 1991. The function and evolution of insect storage hex- Mol. Biol. Evol. 18, 691e699. amerins. Ann. Rev. Entomol. 36, 205e228. Zhou, X., Tarver, M.R., Scharf, M.E., 2007. Hexamerin-based regulation of juvenile Terwilliger, N.B., Dangott, L.J., Ryan, M.C., 1999. Cryptocyanin, a crustacean molting hormone-dependent gene expression underlies phenotypic plasticity in a social protein: evolutionary links to arthropod hemocyanin and insect hexamerins. insect. Development 134, 601e610. Proc. Natl. Acad. Sci. 96, 2013e2018.