ARTICLE IN PRESS

Insect Biochemistry and Molecular Biology Biochemistry and Molecular Biology 37 (2007) 1064–1074 www.elsevier.com/locate/ibmb

Diversity of stonefly hexamerins and implication for the evolution of insect storage proteins

Silke Hagner-Hollera, Christian Pickb, Stefan Girgenrathc, James H. Mardenc, Thorsten Burmesterb,Ã

aInstitute of Zoology, Johannes Gutenberg University of Mainz, D-55099 Mainz, Germany bInstitute of Zoology and Zoological Museum, University of Hamburg, D-20146 Hamburg, Germany cDepartment of Biology, Pennsylvania State University, USA

Received 4 April 2007; received in revised form 30 May 2007; accepted 1 June 2007

Abstract

Hexamerins are large storage proteins of in the 500 kDa range that evolved from the copper-containing hemocyanins. Hexamerins have been found at high concentration in the hemolymph of many insect taxa, but have remained unstudied in relatively basal taxa. To obtain more detailed insight about early hexamerin evolution, we have studied hexamerins in stoneflies (). Stoneflies are also the only insects for which a functional hemocyanin is known to co-occur with hexamerins in the hemolymph. Here, we identified hexamerins in five plecopteran species and obtained partial cDNA sequences from Perla marginata (Perlidae), Nemoura sp. (Nemouridae), burksi (), vivipara (), and Diamphipnopsis samali (Diamphipnoidae). At least four distinct hexamerins are present in P. marginata. The full-length cDNA of one hexamerin subunit was obtained (PmaHex1) that measures 2475 bp and translates into a native polypeptide of 702 amino acids. Phylogenetic analyses showed that the plecopteran hexamerins are monophyletic and positioned at the base of the insect hexamerin tree, probably diverging about 360 million years ago. Within the Plecoptera, distinct hexamerin types evolved before the divergence of the families. Mapping amino acid compositions onto the phylogenetic tree shows that the accumulation of aromatic amino acids (and thus the evolution of ‘‘arylphorins’’) commenced soon after the hexamerins diverged from hemocyanins, but also indicates that hexamerins with distinct amino acid compositions reflect secondary losses of aromatic amino acids. r 2007 Elsevier Ltd. All rights reserved.

Keywords: Hexamerin; Molecular clock; Plecoptera; Phylogeny; Stoneflies; Storage protein

1. Introduction giving rise to native molecules of about 500 kDa (Telfer and Kunkel, 1991; Burmester, 1999). However, in contrast All insect species investigated so far harbor large to the respiratory hemocyanins, hexamerins do not bind proteins in their hemolymph that are referred to as oxygen because the copper-binding histidines residues have hexamerins. Sequence comparisons have shown that these been replaced by other amino acids (Beintema et al., 1994; proteins belong to the superfamily of hemocya- Burmester, 2001, 2002). nins, phenoloxidases, crustacean pseudohemocyanins and In many insect species, hexamerins accumulate in the dipteran hexamerin receptors (Beintema et al., 1994; hemolymph to extraordinarily high concentrations, reach- Burmester and Scheller, 1996; Burmester, 2001, 2002). ing up to 50% of the total salt-extractable proteins of the Like arthropod hemocyanins, hexamerins usually consist in premolt developmental stages (Scheller et al., of six identical or similar subunits in the range of 80 kDa, 1990; Telfer and Kunkel, 1991). Therefore, hexamerins are thought to act mainly as storage proteins, which are used ÃCorresponding author. Tel.: +49 40 42838 3913; as a source of energy and amino acids during non-feeding fax: +49 40 42838 3937. periods, such as pupal or nymphal stages (Telfer and E-mail address: [email protected] (T. Burmester). Kunkel, 1991; Burmester, 1999). Hexamerins have also

0965-1748/$ - see front matter r 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.ibmb.2007.06.001 ARTICLE IN PRESS S. Hagner-Holler et al. / Insect Biochemistry and Molecular Biology 37 (2007) 1064–1074 1065 been identified as constituents of the sclerotizing system of vivipara and Diamphipnopsis samali employing the guani- the insect’s cuticle (Peter and Scheller, 1991). They may dine-thiocyanate method (Chirgwin et al., 1979) or the urea also serve as carriers for ecdysteroids (Enderle et al., 1983), procedure according to Holmes and Bonner (1973).We juvenile hormone (Braun and Wyatt, 1996) and other designed a set of degenerated oligonucleotide primers organic compounds such as riboflavin (Magee et al., 1994). according to conserved amino acid sequences of insect More recent studies suggest that hexamerins function in hexamerins and hemocyanins (Burmester, 2001). The for- caste differentiation of termites via regulation of the ward primers were: 50-GAGGGNSAGTTCGTNTACGC-30, juvenile hormone levels (Zhou et al., 2006, 2007). There 50-CCNCCNCCNTAYGARRTCTACCC-30; reverse pri- is also evidence that some hexamerins play a role in the mers: 50-GAANGGYTTGTGGTTNAGRCG-30,50-TCG- insect’s humoral immune response (Phipps et al., 1994; TACTTGGGTCCNAGGAAGAC-30. Reverse-transcription Beresford et al., 1997). Molecular phylogenetic analyses (RT)-PCR experiments were carried out applying the have shown a complex pattern of hexamerin evolution in OneStep-kit according to the manufacturer’s instructions some insect orders, with several hexamerin subtypes, such (Qiagen, Hilden, Germany). PCR-fragments of the as the highly aromatic arylphorins or methionine-rich expected size were cloned into the pCR4-TOPOs-TA hexamerins occurring in parallel in a single species or pGem-T Easy cloning vectors, and sequenced bi- (Burmester et al., 1998; Burmester, 1999, 2001). Hexamerin directionally by a commercial service (GENTERPRISE, sequences have been successfully used as markers for the Mainz, Germany) or on-site using an ABI Prism 377 DNA inference of insect phylogeny (Burmester et al., 1998; sequencer. Additional 30 sequence of T. burksi and Burmester, 1999). D. samali hexamerin was obtained by the 30 rapid Although hexamerins are probably ubiquitously present amplification of cDNA ends (RACE) method using RACE in insects, most information derives from investigations of primers supplied with a kit (Roche, Indianapolis, IN, USA) dipteran and lepidopteran species, whereas data from other in combination with gene specific PCR primers. taxa are scarce (for a review see Telfer and Kunkel, 1991; Burmester, 1999). Here, we investigate for the first time the 2.3. Molecular cloning of P. marginata hexamerin cDNA hexamerins from Plecoptera (stoneflies). Stoneflies are thought to have arisen near the base of the winged insects A cDNA expression library from 3 years old P. marginata and have retained several ancient morphological and nymphs was constructed as previously described employing behavioral features (Hennig, 1969, 1981). Recently, we the STRATAGENE l-ZAP system (Hagner-Holler et al., have demonstrated that the stonefly Perla marginata 2004). PCR-fragments of various P. marginata hexamerins harbors a functional hemocyanin in its hemolymph were labeled with digoxigenin (Roche PCR labeling kit) and (Hagner-Holler et al., 2004). Thus, the Plecoptera are the used as probes to screen the cDNA library. Positive phage only known insect order which possesses both types of clones were converted to pBK-CMV plasmid vectors using hexameric proteins in their hemolymph. This observation the material provided by STRATAGENE, and sequenced prompted us to investigate the occurrence and evolution of on both strands as described (GENTERPRISE, Mainz). hexamerins in selected stonefly species. 2.4. Sequence analyses 2. Material and methods Sequences were assembled with the Vector NTI 10 2.1. Analyses of hemolymph proteins program (Invitrogen) or with the software package LasergeneTM (DNA STAR, Madison, WI, USA). The Hemolymph was withdrawn from the dorsal abdomen tools provided by the ExPASy Molecular Biology Server of living P. marginata nymphs (1–3 years) and adults by of the Swiss Institute of Bioinformatics (http://www. the aid of a syringe, and immediately diluted with an expasy.org) were used for the analyses of DNA and amino approximately equal volume of 100 mM Tris–HCl, pH 7.5, acid sequences. Signal peptides were predicted using the 10 mM Ca2+,10mMMg2+. Hemocytes and cell debris online version of SignalP V1.1 (Nielsen et al., 1997). An were removed by 10 min centrifugation at 10,000 g at alignment of the stonefly hexamerins was constructed by 4 1C. The total protein concentration in the hemolymph hand using GeneDoc 2.6 (Nicholas and Nicholas, 1997). was determined according to the method of Bradford Statistical analysis was carried out with ANOVA, as (1976). Denaturing SDS-PAGE was performed on 7.5% implemented in the Microsoft EXCEL 2003 spreadsheet gels according to standard procedures. After electrophor- program. The Plecoptera sequences were added to an esis, the gel was fixed in ethanol/acetic acid and stained alignment of insect hexamerins and hemocyanins, as it has with 0.1% Coomassie Brilliant Blue R-250. been used in previous studies (Burmester et al., 1998; Burmester, 1999). More recently available sequences have 2.2. RT-PCR cloning of stonefly hexamerins been added using ClustalX (Thompson et al., 1997) and manually adjusted with GeneDoc. Except for the plecop- Total RNA was extracted from adult and larval teran hexamerins, incomplete sequences were excluded P. marginata, Nemoura sp., Taeniopteryx burksi, Allocapnia from the analyses. A list of sequences used in this study is ARTICLE IN PRESS 1066 S. Hagner-Holler et al. / Insect Biochemistry and Molecular Biology 37 (2007) 1064–1074 provided in Supplemental Table 1. Amino acid composi- et al., 2004). We assumed that brachyceran and nemato- tions were deduced from the translated cDNA sequences. ceran Diptera diverged between 238.5 and 295.4 million years ago (MYA), Diptera and Lepidoptera between 281.5 2.5. Molecular phylogeny and 307.2 MYA, Hymenoptera and Mecopteria (i.e., Diptera+Lepidoptera) more than 299.0 MYA, Isoptera The program packages PHYLIP 3.6b (Felsenstein, and Blattaria more than 145.5 MYA, and that the first 2004), TREE-PUZZLE 5.2 (Strimmer and von Haeseler, Hexapoda appeared at least 407.0 MYA. 1997) and MrBayes 3.1 (Huelsenbeck and Ronquist, 2001) were used for the phylogenetic tree reconstructions. 3. Results Distance matrices were calculated with TREE-PUZZLE using the WAG (Whelan and Goldman, 2001) model of 3.1. Plecopteran hexamerins amino acid evolution, assuming gamma distributions of substitution rates with eight categories. Neighbor-joining Hemolymph samples from 1- to 3-year-old larval and trees were inferred with the program NEIGHBOR from adult specimens of P. marginata were withdrawn and the PHYLIP package. The reliability of the branching analyzed by SDS–PAGE. In all four stages, we observed pattern was tested by bootstrap analysis (Felsenstein, 1985) several prominent bands in the range of 75–80 kDa, which with 100 replications, employing the PUZZLEBOOT shell is the typical mass of arthropod hemocyanins and insect script (M. Holder and A. Roger). Bayesian phylogenetic hexamerins (Telfer and Kunkel, 1991; Burmester, 1999, analyses were performed by MrBayes, using the WAG 2001)(Fig. 1). The two lower bands at about 75 kDa were model and assuming a gamma distribution of substitution previously identified as hemocyanins by using specific rates. Prior probabilities for all trees were equal. Metro- antibodies in Western blotting (Hagner-Holler et al., 2004). polis-coupled Markov chain Monte Carlo (MCMCMC) In 3-years-old larvae and in adult stoneflies, multiple sampling was performed with one cold and three heated (about four) additional bands with higher molecular chains that were run for 1,000,000 generations. Starting masses (about 80 kDa) appear, which we suspect to be trees were random, trees were sampled every 100th hexamerins. Due to the lack of a specific antibody, we generation and posterior probabilities were estimated on cannot conclusively evaluate this assumption. The mole- the final 2000 trees (burnin ¼ 8000). cular evidence presented below, however, demonstrates the presence of multiple hexamerin chains. 2.6. Molecular clock estimates 3.2. Molecular cloning and analyses of P. marginata Linearized trees were calculated under the assumption of hexamerins a relaxed molecular clock employing the program R8S 1.71 (Sanderson, 1997, 2002). We applied the Langley–Fitch We used an alignment of insect hexamerins and hemo- maximum likelihood method (Langley and Fitch, 1974) cyanins to deduce degenerate oligonucleotide primers. with the Powell algorithm to estimate divergence times. These primers were applied in reverse transcription-PCR The tree was calibrated with fossils constraints (Kukalova- on total RNA from 3-years-old nymphs and adults of Peck, 1991; Ross and Jarzembowski, 1993). Stratigraphic P. marginata, which resulted in several cDNA products of information was obtained from http://www.fossilrecord. about 1000 bp lengths. Cloning and sequencing of the net (Benton and Donoghue, 2007); numerical ages derived cDNAs revealed that these fragments represent four diffe- from the ‘‘International Stratigraphic Chart’’ (Gradstein rent hexamerin sequences, which we named PmaHex1–4.

Fig. 1. Coomassie-stained SDS-gel with hemolymph from Perla marginata. About 10 mg of total hemolymph proteins from 1, 2- and 3-year-old larvae (1–3) and adult stoneflies were applied. The molecular mass standard is given on the left side; the arrow denotes the position of the hexamerins. ARTICLE IN PRESS S. Hagner-Holler et al. / Insect Biochemistry and Molecular Biology 37 (2007) 1064–1074 1067

A cDNA library had been constructed from poly(A)+ 3.3. Identification of hexamerins in other plecopteran species RNA isolated from 3 years old P. marginata nymphs (Hagner-Holler et al., 2004). By screening about Using various combinations of degenerate primers, 600,000 pfu with four different probes that are derived we obtained partial sequences from four additional from these sequences, we obtained one full-length clone plecopteran species: Nemoura sp. (Nemouridae), T. burksi (PmaHex1) and two 50 incomplete clones (PmaHex3 (Taeniopterygidae), A. vivipara (Capniidae), and D. samali and PmaHex4). PmaHex2 could not be identified in the (Diamphipnoidae). They represent two other major super- library. families of the Plecoptera. While P. marginata belongs to The cDNA sequence of PmaHex1 comprises a total of the Perloidea, Nemoura sp., T. burksi and A. vivipara are 2475 bp and covers an open reading frame of 2109 bp, members of the Nemouroidea, and D. samali belongs to the beginning with a ATG at bp 13 (Supplemental Fig 1). Eusthenioidea. Three distinct hexamerins were obtained The translation stop codon (TAA) is present at position from Nemoura sp., and one for each T. burksi, A. vivipara, bp 2119 and is followed by a polyadenylation signal and D. samali (Supplemental Fig 2). The cDNA sequences (AATAAA) at bp 2441 and poly(A) tail of 18 bp. The measured 841–1931 bp and translate into polypeptide incomplete sequences of PmaHex3 and PmaHex4 comprise fragments from 279 to 567 amino acids. Comparisons of 1899 and 1847 bp, respectively. By comparison with the amino acid sequences showed a high degree of identity. PmaHex1, we inferred that 408 and 308 bp of the 50 coding The closest relatives among hexamerins from different ends are missing in PmaHex3 and Hex4 (Supplemental species were the proteins from T. burksi (TbuHex1) and Fig 2). The partial PmaHex2 sequence covers 1086 bp of A. vivipara (AviHex1), which display 96.2% identity on the the middle region, and misses 450 bp of the 50 and nucleotide and 95.3% on the amino acid level. Within 600 bp of the 30 coding regions. Within the overlapping species, hexamerins 1 and 3 from Nemoura sp. (NspHex1 regions, the P. marginata sequences are 65.8–87.6% and 3) were found closely related, with 97.4% identity on identical on the DNA level and 54.0–83.1% identical on the nucleotide and 97.5% on the amino acid level. the amino acid level (Table 1). The deduced primary structure of PmaHex1 yields a 3.4. Phylogeny of insect hexamerins polypeptide of 702 amino acids with a calculated molecular mass of 82.52 kDa. Computer analysis of the sequence The hexamerin sequences of the Plecoptera were revealed the presence of a typical 16 amino acid signal included in an alignment of insect hemocyanins and peptide necessary for transmembrane transport and export selected other hexamerins (Supplemental Table 1). We into the hemolymph (von Heijne, 1986). Therefore, the omitted hexamerin sequences carrying long insertions or native secreted PmaHex1 subunit comprises 686 amino for which only partial database entries were available. We acids with a mass of 80.95 kDa (pI 5.68), which is further excluded several sequences of the dipteran LSP-1 in excellent agreement with the data obtained from and LSP-2 types, for which multiple similar entries from SDS-PAGE (Fig. 1). The six copper-binding sites essential closely related species were available. The final amino acid for the function in O2 binding are not conserved in alignment includes four crustacean hemocyanins, five PmaHex1, although two of the six histidines have been insect hemocyanins, 10 plecopteran and 67 other hexamer- maintained (Fig. 2). A potential N-glycosylation site ins. Phylogenetic trees were constructed employing Baye- (NXS/T) is present at Asn 207. The PmaHex1 amino acid sian inference and the neighbor-joining (NJ) method. Both sequence was used to search the databases employing the analyses show that the plecopteran hexamerins form a well- BLAST algorithm. Hexamerin 2 (RflHex2; accession supported monophyletic clade (100% bootstrap value; 1.0 no. AY572859) from the termite Reticulitermes flavipes Bayesian posterior probability) (Figs. 3 and 4). Within the (Scharf et al., 2005) was found to be the closest relative of Plecoptera, PmaHex3 was found at the basal most PmaHex1 (Fig. 2). Both hexamerins were clearly distinct position; however, the hexamerins of P. marginata were from crustacean or insect hemocyanins, as exemplified by paraphyletic, with the clade leading to PmaHex1 and 2 the absence of the copper-binding histidines. being closer to the other plecopteran hexamerins than PmaHex3 or 4 (Fig. 3). The three hexamerins from Nemoura sp. were monophyletic, and also the hexamerins Table 1 from the Nemouroidea (Nemoura sp., T. burksi and Comparison of P. marginata hexamerins A. vivipara) formed a monophylum. In neighbor-joining analysis, the plecopteran hexamerins PmaHex1 PmaHex2 PmaHex3 PmaHex4 were found to be in sistergroup position to all other insect PmaHex1 87.6 65.8 73.9 hexamerins. This relationship is well supported (94% PmaHex2 83.1 69.1 75.2 bootstrap value in NJ analysis; Fig. 3). In Bayesian PmaHex3 60.4 58.6 71.9 analysis, the plecopteran hexamerins were found as PmaHex4 67.4 67.7 54.0 sistergroup of the cockroach and termite hexamerins (data Percent identities between hexamerins were calculated from nucleotide not shown). The analyses were rerun after exclusion of (above diagonal) and amino acid sequences (below). the nine incomplete plecopteran hexamerin sequences. ARTICLE IN PRESS 1068 S. Hagner-Holler et al. / Insect Biochemistry and Molecular Biology 37 (2007) 1064–1074

Fig. 2. Multiple alignment of selected hexamerin and hemocyanin sequences. P. marginata hexamerin 1 (PmaHex1) was compared with hemocyanin subunit a from the spiny lobster Panulirus interruptus (PinHcA, accession no. P04254), hemocyanin subunit 1 from P. marginata (PmaHc1; AJ555403) and the hexamerin from the termite R. flavipes (RflHex2; AY572858). The copper-binding histidines are shaded in black, other strictly conserved residues are shaded in gray. The signal peptides are underlined, the secondary structure of PinHcA is given in the lower row.

The basal position of the PmaHex1 was recovered, but lum, which is basal to the mecopterian hexamerins. The this topology does not reach the significance levels (65% hemipteran hexamerins were sister to all hexamerins from bootstrap support in NJ analysis; Fig. 4). The other Holometabola. The recently identified hexamerins of the hexamerins display an essentially identical tree topology as termites (Isoptera) (Scharf et al., 2005) join the cockroach those published before (Burmester, 1999, 2001): the (Blattaria) hexamerins, but were not monophyletic. R. flavipes hexamerins from Diptera (LSP1 and LSP2) were mono- hexamerin 1 (RflHex1) groups with the Periplaneta phyletic, as well as the methionine-rich and aromatic americana hexamerin (PamHex12), whereas R. flavipes hexamerins (arylphorins) from the Lepidoptera. The hexamerin 2 (RflHex2) joins the Blaberus discoidalis position of the lepidopteran riboflavin (Rb)-binding hexamerin (BdiHex). hexamerins was uncertain: they were found either in sistergroup position to the other hexamerins from Holo- 3.5. Analyses of amino acid compositions metabola (NJ; Fig. 4) or form a common clade with the other lepidopteran and the dipteran hexamerins (Bayesian The proportions of the aryl-groups (Phe and Tyr) and of analyses; data not shown). Coleopteran and hymenopteran Met were deduced from the cDNA sequences for each of hexamerins form a reasonably well-supported monophy- the insect hemocyanins and hexamerins (Fig. 5). PmaHex1 ARTICLE IN PRESS S. Hagner-Holler et al. / Insect Biochemistry and Molecular Biology 37 (2007) 1064–1074 1069

Applying a relaxed clock with five distinct rates, the time of divergence of Hexapoda and malacostracan Crustacea was estimated to have occurred 490 MYA in the late Cambrian or early Ordovician period (Fig. 6). Within the insects, hexamerins evolved from the hemocyanins 468 MYA. The Plecoptera and other winged insects split 360 MYA in the Carboniferous period. The hexamerins of the Eumetabola (Hemiptera and Holometabola) diverged 340 MYA, followed by the split of mecopterian and hymenopteran+coleopteran hexamerins 317 MYA. The hexamerins from Blattaria and Isoptera diverged between 180 and 217 MYA, Hymenoptera vs. Coleoptera 297 MYA, Lepidoptera vs. Diptera 307 MYA, and Brachycera vs. Nematocera 251 MYA. Dipteran LSP1 and LSP2 hexamerins diverged 271 MYA, lepidopteran Met-rich hexamerins and arylphorins separated 291 MYA. The plecopteran hexamerins commenced to Fig. 3. Phylogeny of stonefly hexamerins. A simplified tree of insect diversify into distinct subunit types around 253 MYA hexamerins is displayed. The numbers at the nodes represent bootstrap (data not shown). support values of NJ analyses. The bar equals 0.1 substitutions per site. 4. Discussion is a moderately aromatic hexamerin with 6.8% phenylala- nine and 8.8% tyrosine (sum ¼ 15.7%); the King and 4.1. Early evolution of hexamerins within insects Jukes average (King and Jukes, 1969) is 4.0% and 3.3%, respectively (sum ¼ 7.3%). As far as sequences are known, Most previous studies that identify hexamerins in insects the other plecopteran hexamerins display similar amino have focused on the evolutionarily more derived taxa acid compositions (mean ¼ 16.5% aromatic amino acids). (Telfer and Kunkel, 1991; Burmester, 1999). In fact, there The average of content in aryl groups of the insect are only a handful of studies on hexamerins from ‘‘lower’’ hemocyanins is 10.8%, which is significantly lower than insects (e.g., Braun and Wyatt, 1996; Jamroz et al., 1996; that of the plecopteran hexamerins (or PmaHex1 alone) Wu et al., 1996; Zhou et al., 2006). Here we present—to the and other arylphorin-like hexamerins of the insects best of our knowledge—the first data on hexamerins from (Po0.001; ANOVA). The content of Met was not the stoneflies, a taxon that is thought to have arose near the significantly different. Insect hemocyanins do not signifi- base of the clade of winged insects. Like other hexamerins cantly differ in their proportion of aromatic amino acids (Telfer and Kunkel, 1991; Burmester, 1999), they com- from the crustacean hemocyanins (data not shown). Most menced to accumulate in late larval (nymphal) stages other hexamerins have similar high contents in aromatic (Fig. 1), and were also present in adults. The copper- amino acids, with three notable exceptions: the orthopteran binding sites typical for hemocyanins are absent (Fig. 2). JH-binding hexamerins, lepidopteran riboflavin-binding It is therefore reasonable to assume that the hexamerins hexamerins and lepidopteran methionine-rich hexamerins in Plecoptera have functions similar to those in other have only 10% Phe and Tyr, a content that is not insects, and most likely store amino acids and energy significantly different from that of hemocyanins. Some for non-feeding periods (Telfer and Kunkel, 1991; hexamerins are also enriched in methionine: while the Burmester, 1999). King and Jukes average of Met is 1.8%, a Met-content of Hexamerins emerged from hemocyanins probably early up to 11% was observed (BmoSSP1). The plecopteran in insect evolution. The molecular clock calculations hexamerins have slightly more Met compared to the suggest that this event took place around 468 MYA Jukes–Cantor average, but were not significantly different (Fig. 6). Therefore, hexamerins evolved already in the from the insect hemocyanins (P ¼ 0.17; ANOVA). insect stemline and it can be expected that distinct hemocyanins and hexamerins may be found in other 3.6. Molecular clock estimates ‘‘lower’’ Insecta and the endognathan Hexapoda (Collembola, Protura, Diplura). Stoneflies are the first known insects The phylogenetic tree of insect hexamerins and insect that harbor hemocyanin along with the hexamerins in their plus crustacean hemocyanins was linearized assuming a hemolymph. This fact clearly hints to distinct functions of molecular clock. Divergence dates known from the fossil these proteins. record (Kukalova-Peck, 1991; Ross and Jarzembowski, The clade leading to the plecopteran hexamerins diverged 1993; Benton and Donoghue, 2007) were employed to from that of other insects around 360 MYA (Fig. 6), while calibrate the tree using a Monte Carlo approach as the diversification of different plecopteran hexamerins types implemented in the program r8s (Sanderson, 1997, 2002). commenced around 253 MYA (not shown). This timing and ARTICLE IN PRESS 1070 S. Hagner-Holler et al. / Insect Biochemistry and Molecular Biology 37 (2007) 1064–1074

Fig. 4. NJ analyses of insect hexamerins. A phylogenetic tree was deduced from a multiple sequence alignment of selected hexamerins, using insect and crustacean hemocyanins as outgroup. The numbers at the nodes represent bootstrap support values. The bar equals 0.1 substitutions per site. the paraphyly noted above (Fig. 3) suggests that the aromatic amino acids (Phe+Tyr), Met, Glx or Asx (Telfer separation of (pre-)plecopteran hexamerins into distinct types and Kunkel, 1991). While aromatic and Met-rich hexam- preceded the emergence of the plecopteran families. The erins are widespread in different insect orders, Glx- and diversification of hexamerins by gene duplication, resulting in Asx-rich hexamerins are apparently restricted to Hyme- multiple paralogs within distinct species, is a common noptera (Wheeler and Buck, 1995; Martinez et al., 2000). phenomenon within the hexamerins (Burmester, 1999). In To refer to its particularly high content of aromatic amino some cases, this phenomenon may simply reflect the need of acids, the term ‘‘arylphorin’’ has been introduced for a additional gene copies for mass expression, in other cases hexamerin of Hyalophora cecropia (Lepidoptera) (Telfer there may be distinct roles for different hexamerins. et al., 1983). This designation has been adapted to hexamerins of other insect orders with a similar amino 4.2. Early accumulation of aromatic amino acids in acid composition (e.g. Scheller et al., 1990; de Kort and hexamerins Koopmanschap, 1994; Jamroz et al., 1996). However, the ‘‘arylphorins’’ of different insect orders are not particularly In response to particular needs during postlarval related (Fig. 4; Burmester, 1999), i.e., arylphorins may well development, many hexamerins are enriched in either be more closely related to hexamerins of their own order ARTICLE IN PRESS S. Hagner-Holler et al. / Insect Biochemistry and Molecular Biology 37 (2007) 1064–1074 1071

Fig. 5. Evolution of amino acid compositions of insect hexamerins. A simplified phylogenetic tree of insect hexamerins is displayed. The mean relative contents (in percent) of aromatic amino acids (tyrosine and phenylalanine; first number at the branch and first column of chart) and of methionine (second number at branch and second column) are mapped to the tree. The bar equals 0.1 substitutions per site. with lower content in aromatic amino acids than to has been assumed that they support female reproduction arylphorins of other orders. and egg development by enhancing the pool of sulphur- To address the question whether aromatic hexamerins containing amino acids at the time of vitellogenesis (Pan evolved multiple times independently or whether this is a and Telfer, 1996). Likewise, the juvenile-hormone binding plesiomorphic character of the hexamerins, we mapped the hexamerins of Orthoptera (cf. Braun and Wyatt, 1996) relative content in Phe+Tyr and Met on the phylogenetic have a comparatively low content of Phe+Tyr (10.2%), tree (Fig. 5). The plecopteran hexamerins were found to but are rich in Met (4.5%). The phylogenetic tree implies have high content in aromatic amino acids. Thus, the that these proteins have evolved independently from accumulation of aromatic amino acids, probably reflecting highly aromatic hexamerins. Thus, the losses of aromatic the need of building blocks for the formation of the cuticle amino acids most likely reflect secondary evolutionary (Scheller et al., 1990; Burmester, 1999), appears to have events (Fig. 5). occurred in a very early stage of the evolution of hexamerins. Therefore, the enrichment in aromatic amino 4.3. Implications for insect phylogeny acids should be considered as plesiomorphic character of these proteins. Hexamerins have been successfully applied to estimate Hexamerins with lower content in Phe+Tyr most likely evolutionary patterns and divergence times of insects reflect other requirements, e.g., in the higher extant (Burmester et al., 1998; Burmester, 2001). Assuming a Lepidoptera, two distinct types of Met-rich hexamerins relaxed molecular clock with multiple substitution rates (averages 4.2% and 7.5% Met) have been identified (Ryan and fossil constraints, we calculated that Hexapoda and et al., 1985; Tojo and Yoshiga, 1994). Met-rich hexamerins malacostracan Crustacea diverged around 490 MYA. This are more abundant in the female than in the male, and it is in good agreement with previous estimates using ARTICLE IN PRESS 1072 S. Hagner-Holler et al. / Insect Biochemistry and Molecular Biology 37 (2007) 1064–1074

Fig. 6. A linearized tree was drawn on the basis of the corrected protein distance data. Hexamerin evolution is in good agreement with known paleontological history of the insects and the assumed phylogenetic relationships. Note that the blattarian and isopteran hexamerins are mutually paraphyletic (cf. Fig. 4). hemocyanins and hexamerins, and supports the hypothesis The lack of clear morphological or molecular signals that that Hexapoda are actually closely related to Crustacea could resolve plecopteran affinities and relationships among (Burmester, 2001). Notably, the molecular clock estimates lower Neoptera can most likely be explained by the rapid show that the evolution and radiation of the winged insects diversification of most modern orders of winged insect in (Pterygota) commenced around 360 MYA and mainly the Carboniferous period 300–360 MYA (Hennig, 1969; occurred in the Carboniferous period until 300 MYA Kukalova-Peck, 1991). (Fig. 6). These calculations are in excellent agreement with Another interesting finding of the present study is the the available fossil record (Hennig, 1969; Kukalova-Peck, close relationship of cockroach and termite hexamerins. 1991; Ross and Jarzembowski, 1993) and previous The monophyly of cockroaches is disputed, and some molecular clock estimates (Gaunt and Miles, 2002). authors have concluded that the Blattaria are paraphyletic The position of the Plecoptera within the Neoptera is still with respect to the Isoptera (e.g., Hennig 1981; Lo et al., uncertain. While Hennig (1969, 1981) considered the 2000). Lo et al. (2000) presented molecular evidence that Plecoptera as a basal taxon within the Neoptera, Boudreaux Isoptera may derive from subsocial, woodfeeding cock- (1979) placed the stoneflies as sistergroup of the embiids roaches. In fact, the known blattarian and isopteran (web-spinners). On the basis of 18S rRNA sequences, Flook hexamerins form a well-supported common clade. More- and Rowell (1998) suggested that the Plecoptera form a over, they are mutually paraphyletic, showing that the common clade with the Grylloblattodea and Dermaptera. hexamerins diversified into at least two classes before the Employing ‘‘total evidence’’ analyses of molecular and Blattaria and Isoptera split. Molecular clock estimates morphological data, Wheeler et al. (2001) concluded that the suggest that the clades leading to the hexamerins of Plecoptera were sistergroup to the Embiidina, but were not B. germanica and P. americana on one hand, and of able to resolve the exact affinity of that clade. The poor R. flavipes on the other hand, diverged about 200 MYA. phylogenetic signal is the most likely reason for the Although fossil data of termites are poor, there is in fact uncertain placement of the Plecoptera which was also evidence for the presence of Isoptera in lower Cretaceous observed with hexamerins. NJ analyses placed plecopteran deposits (Ross and Jarzembowski, 1993), confirming the hexamerins at the base of the neopteran tree, thus molecular clock estimates. corroborating the view of Hennig (1969, 1981).However, Thus, our molecular clock estimates are in excellent bootstrap analyses show only weak support in NJ trees and agreement with the fossil record and show the usefulness of this topology was not supported in the Bayesian trees, which hexamerins to infer the times of insect diversification. left the basal Neoptera (‘‘Polyneoptera’’) largely unresolved. However, it should also be pointed out that (1) the ARTICLE IN PRESS S. Hagner-Holler et al. / Insect Biochemistry and Molecular Biology 37 (2007) 1064–1074 1073 orthology of various hexamerins is not always thoroughly Enderle, U., Ka¨user, G., Renn, L., Scheller, K., Koolman, J., 1983. established and that (2) molecular clock estimates usually Ecdysteroids in the hemolymph of blowfly are bound to calliphorin. have large standard errors. Therefore, all our calculations In: Scheller, K. (Ed.), The Larval Serum Proteins of Insects: Function, Biosynthesis, Genetic. Thieme, Stuttgart, New York, pp. 40–49. should only be considered as rough estimates rather than as Felsenstein, J., 1985. Confidence limits on phylogenies: an approach using exact dates. the bootstrap. Evolution 39, 783–791. Felsenstein, J., 2004. PHYLIP (Phylogeny Inference Package), version 3.6. Distributed by the author. Department of Genetics, University of Acknowledgments Washington, Seattle. Flook, P.K., Rowell, C.H.F., 1998. Inferences about orthopteroid We thank Klaus Enting and Rainer Rupprecht (Mainz) phylogeny and molecular evolution from small subunit nuclear for P. marginata specimens, and Falko Roeding and ribosomal DNA sequences. Insect Mol. Biol. 7, 163–178. Robert Scho¨pflin for the help with the implementation Gaunt, M.W., Miles, M.A., 2002. An insect molecular clock dates the origin of the insects and accords with palaeontological and biogeo- of the r8s program. This work has been supported by graphic landmarks. Mol. Biol. Evol. 19, 748–761. grants of the Deutsche Forschungsgemeinschaft (Bu956/5; Gradstein, F.M., Ogg, J.G., Smith, A.G., et al., 2004. A Geologic Bu956/9) and by NSF grant IBN-9722196. The nucleotide Timescale 2004. Cambridge University Press, Cambridge. sequences reported in this paper have been submitted to Hagner-Holler, S., Schoen, A., Erker, W., Marden, J.H., Rupprecht, R., the EMBL/GenBankTM Databases under the accession Decker, H., Burmester, T., 2004. A respiratory hemocyanin from an insect. Proc. Natl. Acad. Sci. USA 101, 871–874. nos. AM690365 to AM690371, EF620538, EF617597 and Hennig, W., 1969. Stammesgeschichte der Insekten. W. Kramer, EF617598. Frankfurt/M. Hennig, W., 1981. Insect Phylogeny. Academic Press, New York. Holmes, D.S., Bonner, J., 1973. Preparation, molecular weight, base Appendix A. Supplementary data composition, and secondary structure of giant nuclear ribonucleic acid. Biochemistry 12, 2330–2338. Supplementary data associated with this article can be Huelsenbeck, J.P., Ronquist, F., 2001. MRBAYES: Bayesian inference of found in the online version at doi:10.1016/j.ibmb.2007.06.001. phylogenetic trees. Bioinformatics 17, 754–755. Jamroz, R.C., Beintema, J.J., Stam, W.T., Bradfield, J.Y., 1996. Aromatic hexamerin subunit from adult female cockroaches References (Blaberus discoidalis): molecular cloning, suppression by juvenile hormone, and evolutionary perspectives. J. Insect Physiol. 42, Beintema, J.J., Stam, W.T., Hazes, B., Smidt, M.P., 1994. Evolution of 115–124. arthropod hemocyanins and insect storage proteins (hexamerins). Mol. King, J.L., Jukes, T.H., 1969. Non-Darwinian evolution. Science 164, Biol. Evol. 11, 493–503. 788–798. Benton, M.J., Donoghue, P.C.J., 2007. Paleontological evidence to date Kukalova-Peck, J., 1991. Fossil history and evolution of hexapodstruc- the tree of life. Mol. Biol. Evol. 24, 26–53. tures. In: Naumann, I.D. (Ed.), The Insects of Australia. Melbourne Beresford, P.J., Basinski-Gray, J.M., Chiu, J.K., Chadwick, J.S., University Press, Melbourne, Australia, pp. 141–179. Aston, W.P., 1997. Characterization of hemolytic and cytotoxic Langley, C.H., Fitch, W., 1974. An estimation of the constancy of the rate Gallysins: a relationship with arylphorins. Dev. Comp. Immunol. 21, of molecular evolution. J. Mol. Evol. 3, 161–177. 253–266. Lo, N., Tokuda, G., Watanabe, H., Rose, H., Slaytor, M., Maekawa, K., Boudreaux, H.B., 1979. Arthropod Phylogeny with Special Reference Bandi, C., Noda, H., 2000. Evidence from multiple gene sequences Process to Insects. Wiley, New York. indicates that termites evolved from wood-feeding cockroaches. Curr. Bradford, M.M., 1976. A rapid and sensitive method for quantification of Biol. 10, 801–804. microgram quantities of proteins using the principle of protein dye Magee, J., Kraynack, N., Massey Jr., H.C., Telfer, W.H., 1994. Properties binding. Anal. Biochem. 72, 248–254. and significance of a riboflavin-binding hexamerin in the hemo- Braun, R.P., Wyatt, G.R., 1996. Sequence of the juvenile hormone lymph of Hyalophora cecropia. Arch. Insect Biochem. Physiol. 25, binding protein from the hemolymph of Locusta migratoria. J. Biol. 137–157. Chem. 271, 31756–31762. Martinez, T., Burmester, T., Veenstra, J., Wheeler, D., 2000. Sequence and Burmester, T., 1999. Evolution and function of the insect hexamerins. Eur. evolution of a hexamerin from the ant, Camponotus festinatus. Insect J. Entomol. 96, 213–225. Mol. Biol. 9, 427–431. Burmester, T., 2001. Molecular evolution of the arthropod hemocyanin Nicholas, K.B., Nicholas Jr., H.B., 1997. GeneDoc: Analysis and superfamily. Mol. Biol. Evol. 18, 184–195. Visualization of Genetic Variation. /www.psc.edu/biomed/genedoc/S. Burmester, T., 2002. Origin and evolution of arthropod hemocyanins and Nielsen, H., Engelbrecht, J., Brunak, S., von Heijne, G., 1997. related proteins. J. Comp. Physiol. B 172, 95–117. Identification of prokaryotic and eukaryotic signal peptides and Burmester, T., Scheller, K., 1996. Common origin of arthropod prediction of their cleavage sites. Protein Eng. 10, 1–6. tyrosinase, arthropod hemocyanin, insect hexamerin, and dipteran Pan, M.L., Telfer, W.H., 1996. Methionine-rich hexamerin and arylphorin receptor. J. Mol. Evol. 42, 713–728. arylphorin as presursor reservoirs for reproduction and metamor- Burmester, T., Massey Jr., H.C., Zakharkin, S.O., Benes, H., 1998. The phosis in female luna moths. Arch. Insect Biochem. Physiol. evolution of hexamerins and the phylogeny of insects. J. Mol. Evol. 47, 33, 149. 93–108. Peter, M.G., Scheller, K., 1991. Arylphorin and the integument. In: Chirgwin, J.M., Przbyla, A.E., MacDonald, R.J., Rutter, W.J., 1979. Retnakaran, A., Binnington, K. (Eds.), The Physiology of Insect Isolation of biologically active ribonucleic acid from sources enriched Epidermis. CSIRO, Australia, pp. 113–122. in ribonuclease. Biochemistry 18, 5294–5299. Phipps, D.J., Chadwick, J.S., Aston, W.P., 1994. Gallysin-1, an de Kort, C.A.D., Koopmanschap, A.B., 1994. Nucleotide and deduced antibacterial protein isolated from hemolymph of Galleria mellonella. amino acid sequence of a cDNA clone encoding diapause protein 1, an Dev. Comp. Immunol. 18, 13–23. arylphorin-type storage hexamer of the Colorado potato beetle. Ross, A.J., Jarzembowski, E.A., 1993. Hexapoda. In: Benton, M.J. (Ed.), J. Insect Physiol. 40, 527–535. The Fossil Record 2. Chapman & Hall, London, pp. 363–426. ARTICLE IN PRESS 1074 S. Hagner-Holler et al. / Insect Biochemistry and Molecular Biology 37 (2007) 1064–1074

Ryan, R.O., Anderson, D.R., Grimes, W.J., Law, J.H., 1985. Arylphorin Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, from Manduca sexta: carbohydrate structure and immunological D.G., 1997. The ClustaIX windows interface: flexible strategies for studies. Arch. Biochem. Biophys. 243, 115–124. multiple sequence alignment aided by quality analysis tools. Nucleic Sanderson, M.J., 1997. A nonparametric approach to estimating Acids Res. 25, 4876–4882. divergence times in the absence of rate constancy. Mol. Biol. Evol. Tojo, S., Yoshiga, T., 1994. Purification and characterization of three 14, 1218–1231. storage proteins in the common cutworm, Spodoptera litura. Insect Sanderson, M.J., 2002. Estimating absolute rates of molecular evolution Biochem. Mol. Biol. 24, 729–738. and divergence times: a penalized likelihood approach. Mol. Biol. Von Heijne, G., 1986. A simple method for predicting signal peptide Evol. 19, 101–109. cleavage sequences. Nucleic Acids Res. 14, 4683–4690. Scharf, M.E., Wu-Scharf, D., Zhou, X., Pittendrigh, B.R., Bennett, G.W., Wheeler, D.E., Buck, N.A., 1995. Storage proteins in ants during 2005. Gene expression profiles among immature and adult reproduc- development and colony founding. J. Insect. Physiol. 41, 885–894. tive castes of the termite Reticulitermes flavipes. Insect Mol. Biol. 14, Wheeler, W.C., Whiting, M., Wheeler, Q.D., Carpenter, J.M., 2001. The 31–44. phylogeny of the extant hexapod orders. Cladistics 17, 113–169. Scheller, K., Fischer, B., Schenkel, H., 1990. Molecular properties, Whelan, S., Goldman, N., 2001. A general empirical model of protein functions and developmentally regulated biosynthesis of arylphorin in evolution derived from multiple protein families using a maximum- Calliphora vicina. In: Hagedorn, H.H., Hildebrand, J.G., Kidwell, likelihood approach. Mol. Biol. Evol. 18, 691–699. M.G., Law, J.H. (Eds.), Molecular Insect Science. Plenum Press, New Wu, C.H., Lee, M.F., Liao, S.C., Luo, S.F., 1996. Sequencing analysis of York, pp. 155–162. cDNA clones encoding the American cockroach Cr-PI allergens. Strimmer, K., von Haeseler, A., 1997. Likelihood-mapping: a simple Homology with insect hemolymph proteins. J. Biol. Chem. 271, method to visualize phylogenetic content of a sequence alignment. 17937–17943. Proc. Natl. Acad. Sci. USA 94, 6815–6819. Zhou, X., Oi, F.M., Scharf, M.E., 2006. Social exploitation of hexamerin: Telfer, W.H., Kunkel, J.G., 1991. The function and evolution of insect RNAi reveals a major caste-regulatory factor in termites. Proc. Natl. storage hexamers. Annu. Rev. Entomol. 36, 205–228. Acad. Sci. USA 103, 4499–4504. Telfer, W.H., Keim, P.S., Law, J.H., 1983. Arylphorin, a new protein from Zhou, X., Tarver, M.R., Scharf, M.E., 2007. Hexamerin-based regulation Hyalophora cecropia: comparison with calliphorin and manducin. of juvenile hormone-dependent gene expression underlies phenotypic Insect Biochem. 13, 601–613. plasticity in a social insect. Development 134, 601–610.