Insect Science (2018) 25, 33–44, DOI 10.1111/1744-7917.12375

ORIGINAL ARTICLE

Transcriptome analysis in the beet webworm, Spoladea recurvalis (Lepidoptera: Crambidae)

Jian-Cheng Chang∗ and Srinivasan Ramasamy
AVRDC – The World Vegetable Center, Shanhua, Tainan, Taiwan, China

Abstract The beet webworm, Spoladea recurvalis Fabricius, is a destructive pest of vegetable crops in the tropics and subtropics; its main host plant is amaranth. It has become imperative to develop non-chemical methods to control S. recurvalis on amaranth. However, the lack of molecular information about this species has hindered the development of novel pest management strategies. In this study, high-throughput RNA sequencing covering de novo sequence assembly, functional annotation of transcripts, and gene function classification and enrichment was performed on S. recurvalis. Assembly of the Illumina reads generated a total of 120 435 transcript contigs ranging from 201 to 22 729 bases with a mean length of 688 bases. The assembled transcripts were subjected to Basic Local Alignment Search Tool-X (BLASTX) searches to obtain annotations against the non-redundant, Swiss-Prot, Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) protein databases. A subset of 58 225 transcript sequences returned hits from known proteins in the National Center for Biotechnology Information database, and Danaus plexippus accounted for the largest share of top hits (50.43%). A total of 1217 Gene Ontology-level 3 annotations were assigned to 51 805 transcripts, while 39 650 transcripts were predicted as functional protein-coding genes in the COG database and 20 037 transcripts were assigned to KEGG pathways. We identified 40 putative genes related to pheromone production and reception in S. recurvalis, with the expression of each gene falling between 0.29 and 1141.79 fragments per kilo base per million reads (FPKM). The transcriptome sequence of S. recurvalis is a first step toward a comprehensive genomic resource that would enable better understanding of molecular mechanisms and the development of effective pest management practices for this species.

Key words pheromone; protein-coding genes; putative genes; Spoladea recurvalis; transcriptome
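The figures quoted in the abstract are of two kinds: shares of the 120 435 assembled transcripts that received an annotation, and expression values in FPKM. The following is a minimal sketch of both calculations, not the authors' pipeline. The function names and the example read counts are hypothetical; only the transcript counts are taken from the abstract, and FPKM is assumed to follow the conventional definition 10^6 × 10^3 × C / (N × L), with C fragments aligned to the gene, N total mapped fragments and L transcript length in base pairs.

```python
"""Sketch: re-deriving two kinds of figures quoted in the abstract.

All function and variable names are hypothetical illustrations; only the
transcript counts below come from the abstract.
"""


def percent_of_total(count: int, total: int) -> float:
    """Return `count` as a percentage of `total`, rounded to 2 decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 2)


def fpkm(fragments: int, total_mapped: int, length_bp: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    Assumes the conventional definition 10^6 * 10^3 * C / (N * L).
    """
    if total_mapped <= 0 or length_bp <= 0:
        raise ValueError("total_mapped and length_bp must be positive")
    return 1e6 * 1e3 * fragments / (total_mapped * length_bp)


# Counts reported in the abstract.
TOTAL_TRANSCRIPTS = 120_435
ANNOTATED = {
    "NCBI protein hits": 58_225,
    "Gene Ontology": 51_805,
    "COG": 39_650,
    "KEGG": 20_037,
}

if __name__ == "__main__":
    for label, n in ANNOTATED.items():
        print(f"{label}: {percent_of_total(n, TOTAL_TRANSCRIPTS)}%")
    # Hypothetical gene: 500 fragments on a 1000 bp transcript,
    # in a library of 20 million mapped fragments.
    print(f"FPKM: {fpkm(500, 20_000_000, 1_000)}")
```

Under these assumptions the four shares come out as 48.35%, 43.01%, 32.92% and 16.64%, and the hypothetical gene has an FPKM of 25.0. Dividing by both library size and transcript length is what makes values comparable between transcripts of different lengths.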

Correspondence: Srinivasan Ramasamy, AVRDC – The World Vegetable Center, 60 Yi-Min Liao, Shanhua, Tainan 74151, Taiwan, China. Tel: +886 6 5837801; Fax: +886 6 5830009; E-mail: [email protected]
∗Current address: Biology Division, Taiwan Sugar Research Institute, Tainan, Taiwan, China
© 2016 Institute of Zoology, Chinese Academy of Sciences

Introduction

Spoladea recurvalis Fabricius (syn. Hymenia recurvalis), commonly known as beet webworm, is a widely distributed pest of the family Crambidae. It is found all over the world, but predominantly reported in tropical and subtropical regions (CABI, 2016). Amaranth is the main host plant of S. recurvalis. The pest attacks vegetable crops, including bean, beet root, cucurbits, eggplant and sweet potato as well as various weed plants (Hsu & Srinivasan, 2012). The life span of S. recurvalis is 3–4 weeks: a 3–4 day egg stage, a 12–15 day larval stage and an 8–11 day pupal stage (Bhattacherjee & Ramdas, 1964). Spoladea recurvalis larvae feed on the cuticle, upper epidermis and palisade layer of the leaves and are protected by curling the leaves with silvery webs (Bhattacherjee & Ramdas, 1964). Leaf damage caused by S. recurvalis larvae decreases the quality of vegetable crops, making them less marketable and leading to economic loss. Complete leaf skeletonization occurs in severe infestations. Recent reports from eastern Kenya showed that infestation by S. recurvalis alone can result in 100% yield loss of amaranth (Kahuthia-Gathu, 2011). According to Miyahara (1993), S. recurvalis can migrate over long distances, and an influx of this pest may lead to serious crop damage in new locations. Use of chemical pesticides against S. recurvalis on amaranth is not advisable because of the short cropping cycle, which may result in serious pesticide residue threats (Fan et al., 2013). Hence, it has become imperative to develop alternate pest management strategies to reduce the use of synthetic pesticides on amaranth. Although several studies have already been carried out on the bio-ecology and host-plant interactions of S. recurvalis on amaranth, there is a lack of genome-level data for this species to guide the development of pest management packages.

The domesticated silkworm (Bombyx mori) is one of the model insects for research, and the first lepidopteran for which draft genome sequences became available, in 2004. Two independent data sets from Japan (Mita et al., 2004) and China (Xia et al., 2004), generated using whole-genome shotgun sequencing, were merged and re-assembled, and the resulting high-quality sequence data were released (International Silkworm Genome Consortium, 2008). High-throughput next generation sequencing (NGS) technologies have advanced over the past decade, with the entire genomes of model insect species such as Drosophila melanogaster (Adams et al., 2000), Anopheles gambiae (Neafsey et al., 2013) and Danaus plexippus (Zhan et al., 2011) available for research. RNA sequencing (RNA-seq) is an approach that uses NGS to retrieve transcriptome information (i.e., the full set of transcripts in a cell) for a specific developmental stage or physiological condition.

RNA-seq provides major advantages over other technologies, including (Wang et al., 2009): (i) supporting genomic research in non-model species without the need to detect transcripts based upon known genomic sequences (e.g., no gene-specific oligonucleotide primer is required); (ii) unravelling high-resolution sequence variations in single nucleotides; (iii) quantifying a range of gene expression levels for different developmental and metabolic changes with high sensitivity; (iv) studying gene architecture such as intron-exon boundaries and alternative splicing patterns; and (v) developing molecular markers, such as single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs), on a large scale within a single sequencing run. There have been few studies on the molecular characterization of S. recurvalis and there is no genetic resource for phylogenetic and population genetic analyses, which are usually found to be informative in families rich in intraspecific polymorphism in the order Lepidoptera (New, 2004). A transcriptome study on this economically important insect pest is needed to obtain vital genomic information.

Pheromone biosynthesis activating neuropeptide (PBAN) and pheromone binding protein (PBP) transcripts are usually absent in the abdomen. For instance, PBAN transcripts were present in the head and thoracic complex of the pod borer (Maruca vitrata), a crambid species (Chang & Ramasamy, 2014). The three PBP transcripts in the stem borer, Sesamia nonagrioides, were found to be expressed in antennae (Glaser et al., 2014). Hence, we focused on transcripts of S. recurvalis in the head (with antennae) and thoracic complex using Illumina sequencing technology. The transcriptome analysis covered de novo sequence assemblies, functional annotation of transcripts, gene function classification and enrichment, and identification of genes involved in pheromone production and reception. The resulting transcriptome data offer comprehensive genomic resources of S. recurvalis for further studies, and will lead to better understanding and control of the molecular mechanisms of S. recurvalis growth, development, metabolism and reproduction, which may improve the effectiveness of pest management strategies for this species.

Materials and methods

Ethics statement

No specific permits were required for the described field studies, and no specific permissions were required for these locations/activities. We confirm that samples were taken from non-endangered, non-protected species on open, public lands.

Insects

The larval specimens of S. recurvalis were collected from amaranth plants in the field in Shanhua, Taiwan, China. A laboratory culture was established for two generations on host amaranth plants. After pupation, each individual pupa was isolated, sexed and maintained in a plastic cup until eclosion. Male and female adults were reared separately in wire-meshed acrylic cylinders (30 cm × 15 cm) and provided with water-soaked sponges and a 10% honey solution until ready for mating. Insect samples were kept at 26 ± 1°C and 70% ± 10% relative humidity with a photoperiod of 12 : 12 (light : dark) commencing with the larval stage through the life cycle. Since
the expression of PBAN (Chang & Ramasamy, 2014) and PBP genes (Zhang et al., 2010) are relatively higher in the latter half of the pupal stages and in adults, heads and thoraces from 4- to 6-day-old pupae of both sexes were dissected and preserved in RNAlater® solution (Ambion Inc., Austin, TX, USA) for subsequent isolation of total RNA.

Total RNA extraction

Total RNA was isolated from homogenized heads and thoraces using a TRIzol Reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's protocol. The RNA extract was re-purified with phenol : chloroform : isoamyl alcohol (25 : 24 : 1; phenol phase pH = 4–5) to remove proteins and DNA contamination, then precipitated with ethanol. RNA was quantified by absorbance measurement on a spectrophotometer at 260 nm, while the purity of the RNA was determined by the ratios of A260/A280 and A260/A230.

RNA sequencing

The whole-transcriptome sequencing was performed by Genomics BioSci & Tech Co. (Taipei, Taiwan, China). To construct a cDNA library, mRNA was first enriched from a 20 µg total RNA sample using magnetic oligo(dT) beads. Next, mRNA was fragmented by divalent cation and heat treatment. Random hexamer primers were used for first-strand cDNA synthesis and second-strand cDNA was synthesized using RNase H, DNA polymerase I with buffer and deoxynucleotide triphosphates (dNTPs). Short fragments were purified with a QiaQuick polymerase chain reaction (PCR) extraction kit (Qiagen, Valencia, CA, USA) and further subjected to end-repair and addition of a poly(A) tail. After that, the short fragments were ligated with sequencing adapters. By gel selection, suitable sizes of cDNA fragments were then PCR amplified as templates. Finally, the library was sequenced using an Illumina HiSeq™ 2000 (BGI, Beijing, China). The raw transcriptome data have been deposited in the National Center for Biotechnology Information's (NCBI's) Sequence Read Archive (SRA) database (accession number: SRX648633).

Assembly and sequence annotation

All retrieved raw reads from RNA-seq were filtered, including removal of adaptor sequences, contamination and low-quality reads from raw reads. This was followed by the assembly of clean reads into transcript isoforms using the de novo transcriptome assembler Trinity (Fitch, 1970). Subsequently, a Basic Local Alignment Search Tool-X (BLASTX) search was performed against the databases of non-redundant proteins (NR), Swiss-Prot, Clusters of Orthologous Groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) with the criterion of E-value ≤ 10⁻⁵. The top BLASTX hits were selected to determine the sequence direction and to annotate functions of the transcripts.

Functional gene classification

Each transcript was annotated for Gene Ontology (GO) function, COG term and KEGG pathway enrichment using Blast2GO software (Conesa et al., 2005). The three major gene properties at the highest level of the GO system are cellular component, molecular function and biological process. Alignment to the COG database provided a rapid description of the functional characteristics of transcripts, and KEGG pathway mapping classified predicted genes into their respective pathways.

Expression analysis

Gene-level expression measurement was reported in fragments per kilo base per million reads (FPKM) using the software package Cufflinks (Mortazavi et al., 2008; Trapnell et al., 2010). FPKM of a gene is calculated with the formula:

$$\mathrm{FPKM} = \frac{10^{6} \times 10^{3} \times C}{N L},$$

where C represents the number of reads specifically aligned to the gene, N is the total number of mapped reads and L depicts length of the transcript in base pairs. The FPKM method combines normalization steps for different transcript sizes and variations of distinct sequencing runs to ensure that expression levels of different transcripts can be compared across runs.

Results and discussion

S. recurvalis, a serious pest of Amaranthus spp., is globally widespread with very little molecular information and currently no molecular markers available. The main objective of this research was to perform de novo transcriptome analysis with NGS in this non-model insect pest, S. recurvalis, at the later pupal stage. Analysis of the biological characteristics of assembled transcripts and identification
of the central genes underlying pheromone production and reception are major topics of interest, because pheromone-based monitoring and/or mass-trapping could be used as an effective component technology in integrated management of S. recurvalis.

RNA sequencing and sequencing assembly

Illumina sequencing generated a total of 89 282 522 clean reads containing 8 035 426 980 bases with an average GC content of 49.71%; 98.25% of bases in the reads had a Phred quality score equal to or greater than 20. There were 120 435 transcript contigs successfully assembled from the clean reads, covering 82 908 409 bases and a mean of 44.02% GC. Transcript length ranged from 201 to 22 729 bases with a mean length of 688 bases (Fig. 1). These transcripts were further clustered into 68 834 isoform clusters and the longest transcript was selected as the representative for each cluster. It should be noted that during the construction of the cDNA library, there were aberrations generated through deep sequencing, resulting from the fragmentation of RNA/cDNA and assembly errors, which may influence the final assembly statistics and the accuracy of the quantification of RNA assayed via NGS (Wang et al., 2009).

Fig. 1 Transcript length distribution of Trinity assembly for Spoladea recurvalis. All 120 435 assembled transcripts were included in the histogram with a threshold size of 200 nucleotides.

Sequence annotation

The assembled transcripts were subjected to BLASTX to obtain the annotation against the NR, Swiss-Prot, COG and KEGG protein databases using an E-value criterion as described in the materials and methods section. A total of 58 225 transcript sequences (48.35%) returned hits that met the cutoff E-value, while there were 62 210 singletons (51.65%) that did not match any homolog in public databases. The fact that more than half of the transcripts were unable to match sequences in public databases signifies a deficiency in genetic/genomic inputs of the Crambidae and even Lepidoptera. The majority of transcripts (54.07%) that yielded high score hits had an E-value ranging from 1.0E-50 to 1.0E-5, followed by transcripts with hits of an E-value between 1.0E-100 and 1.0E-50 (20.63%) and between 0 and 1.0E-150 (15.55%) (Fig. 2). About 24% of transcripts showed over 80% sequence similarity to known proteins, whereas 19.57% of transcript sequences had similarities of only 15% to 40%
(Fig. 3). Danaus plexippus accounted for the highest number of top hits among the transcript sequences (50.43%; Fig. 4), and 3.63% of the transcripts were matched to another lepidopteran model insect species, the silkworm (Bombyx mori). As expected, these results showed high similarity of sequences between S. recurvalis and members with complete genomes available that belong to the Ditrysia clade, which contains both butterflies and moths in the order Lepidoptera, as they are closely related evolutionary lineages.

Fig. 2 Analysis of the Expect value of transcript sequences in the public database. E-value distribution of all matched transcripts from the Basic Local Alignment Search Tool (BLAST) results.

Fig. 3 Distribution of homology across assembled transcripts. Similarities of Spoladea recurvalis transcript sequences that returned hits in the public database.

Fig. 4 Species distribution of top Basic Local Alignment Search Tool-X (BLASTX) results. Distribution of species among the top BLASTX hits against the NCBI non-redundant protein database.

S. recurvalis transcripts also showed high homology to a number of hymenopteran members, including Bombus spp. (4.04%), Megachile rotundata (3.91%), Apis spp. (3.52%), Nasonia vitripennis (3.46%), Harpegnathos saltator (3.27%), Camponotus floridanus (2.84%), Solenopsis invicta (1.74%) and Acromyrmex echinatior (1.71%). As for other insect orders, 3.64% and 1.50% of S.
recurvalis transcripts returned BLAST results with above-threshold sequence similarity to Tribolium castaneum (Coleoptera) and Drosophila spp. (Diptera), respectively. It is noteworthy that several non-Insecta species, such as the parasitic protozoan Leishmania major Friedlin strain (0.35%), which has been reported to infect lepidopteran hosts, were observed among the top BLAST hits (Li et al., 2011; Rogers et al., 2011).

Fig. 5 Gene Ontology (GO) categories of assembled transcripts. Gene ontologies ascribed to Spoladea recurvalis transcript top Basic Local Alignment Search Tool hit terms by the GO database, assigned for (A) biological process, (B) cellular component and (C) molecular function at GO level 2.

The annotation of coding sequences at the whole genome level largely relies on the homology-based identification of putative genes in the model organisms (Götz et al., 2008). However, in non-model species a certain number of sequences cannot be matched to known genes, receive ambiguous functional assignments, or show inconsistent annotations. Moreover, each species contains a unique gene lineage which may not be homologous to
others within the molecular databases used for functional annotation. Hence, more effort on expanding annotation reference sets specific to different branches of evolution by taking advantage of rapidly developed NGS technologies is a task of top priority.

GO classification

For gene product function classification, all transcript sequences were subjected to GO, a standardized vocabulary to define and annotate the functions of genes/proteins structured in hierarchical levels. A total of 1217 GO-level 3 annotations were assigned to 51 805 transcripts (43.01%). Figure 5A shows that 14 644 sequences were categorized into 18 biological processes, with the largest shares of genes belonging to metabolic (28.12%) and cellular (26.50%) processes. The 18 cellular component terms contain 8639 sequences with cell (25.93%) and cell part (25.93%) being dominant (Fig. 5B). The distribution of GO terms in 14 molecular functions showed binding (52.22%) and catalytic activity (34.99%) at the highest frequencies among 28 522 sequences (Fig. 5C). The results of GO distribution are congruent with observations in other insect transcriptome studies (Mittapalli et al., 2010; Yang et al., 2012).

The COG database consists of proteins from complete genomes of bacteria, archaea and eukaryotes (Tatusov et al., 1997). COG is used for phylogenetic classification on the basis that every protein in the database is connected through a vertical evolutionary descent, or the orthology concept (Fitch, 1970). A total of 39 650 transcripts (32.92%) were predicted as functional protein-coding genes and classified into 25 functional categories (Fig. 6). The count in each class ranged from 265 (0.60%) for 'nuclear structure' to 8271 (18.74%) for 'general function prediction only'. Proteins involved in signal transduction mechanisms (5533, 12.54%) and the category of post-translational modification, protein turnover, chaperones (3147, 7.13%) are also abundant in S. recurvalis head and thoracic tissue.

Fig. 6 Clusters of Orthologous Groups of proteins (COGs) classification. Alignment of assembled transcripts to the COG database for prediction of protein function. A: RNA processing and modification; B: chromatin structure and dynamics; C: energy production and conversion; D: cell cycle control, cell division, chromosome partitioning; E: amino acid transport and metabolism; F: nucleotide transport and metabolism; G: carbohydrate transport and metabolism; H: coenzyme transport and metabolism; I: lipid transport and metabolism; J: translation, ribosomal structure and biogenesis; K: transcription; L: replication, recombination and repair; M: cell wall/membrane/envelope biogenesis; N: cell motility; O: post-translational modification, protein turnover, chaperones; P: inorganic ion transport and metabolism; Q: secondary metabolites biosynthesis, transport and catabolism; R: general function prediction only; S: function unknown; T: signal transduction mechanisms; U: intracellular trafficking, secretion and vesicular transport; V: defense mechanisms; W: extracellular structures; Y: nuclear structure; Z: cytoskeleton.

All assembled transcripts were subjected to the KEGG pathway database. The database is a network that consists of graphical representations of metabolic and regulatory pathways, including enzyme-enzyme catalyses, protein-protein interactions and gene expression
relations (Ogata et al., 1999). This allows for deciphering higher-order functions in terms of interacting molecular networks at the system level. There were 20 037 transcripts (16.64%) assigned to KEGG pathways, including 11 groups in the 'metabolism' family (carbohydrate metabolism, energy metabolism, lipid metabolism, nucleotide metabolism, amino acid metabolism, metabolism of other amino acids, glycan biosynthesis and metabolism, metabolism of cofactors and vitamins, metabolism of terpenoids and polyketides, biosynthesis of other secondary metabolites, and xenobiotics biodegradation and metabolism), four groups in the family 'genetic information processing' (transcription, translation, folding, and replication and repair), three groups in the 'environmental information processing' family (membrane transport, signal transduction, and signaling molecules and interaction), and four groups in the family 'cellular processes' (transport and catabolism, cell motility, cell growth and death, and cell communication) (Fig. 7). Our results indicated that the three largest pathway groups were 'signal transduction' (2415, 12.05%), 'translation' (1872, 9.34%) and 'folding' (1721, 8.59%).

Fig. 7 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment in Spoladea recurvalis. Basic Local Alignment Search Tool-X (BLASTX) search against KEGG protein database assigned the assembled transcripts to a total of 22 subfamilies belonging to four different families. Dot bars: cellular processes; white bars: environmental information processing; slashed bars: genetic information processing; gray bars: metabolism.

A relatively high percentage of transcripts could not be allocated a GO term, a COG group or a KEGG pathway. Large numbers of functionally ungrouped contigs are often published in the literature of different transcriptome projects (Wang et al., 2010; Bai et al., 2011; Karatolos et al., 2011; Shen et al., 2011) and may imply the constraint of annotating potential functions in transcripts referring to closely related taxa with restricted genomic data or distantly related model species. Van Belleghem et al. (2012) postulated that unmapped short transcripts may be chimeric sequences due to disintegrated reads or artifacts of amplification. Examining the size distribution of unassigned S. recurvalis transcripts showed the range of 203–6195 bp, and the lengthy sequences are likely to denote novel genes that are not reported in other organisms. Nonetheless, further functional investigation is necessary to verify these vague transcripts.

Functional genes related to the pheromone biosynthesis regulation and reception

As lepidopterans utilize pheromones for controlling and modulating various physiological processes, mainly mating and reproduction, we focused on the genes related to the stimulation, regulation and reception of pheromones
in S. recurvalis. PBAN functions in female sex pheromone biosynthesis (Raina et al., 1989), and diapause hormone (DH) co-transcribed from the same gene was reported to activate embryonic diapause or terminate pupal diapause, which is essential for survival under extreme or unfavorable environments (Imai et al., 1991; Xu & Denlinger, 2003). PBAN is generated in the subesophageal ganglia, and is released into the hemolymph through the corpora cardiaca followed by ligand-binding to PBAN receptor (PBAN-R) in the pheromone gland (for a review, see Jurenka & Rafaeli, 2011). Hence, head and thoracic complex seem to be appropriate body parts for RNA isolation. We previously reported that PBAN transcripts are present in the head and thoracic complex of Maruca vitrata but absent from the abdomen (Chang & Ramasamy, 2014). The PBP transcripts were mostly expressed in antennae, as evidenced in S. nonagrioides (Glaser et al., 2014). There was one isoform cluster identified as a putative PBAN gene, and two isoform clusters were homologous to PBAN-R (Fig. 8). The relative expression levels were calculated by FPKM. So far, several PBAN genes from members of the Lepidoptera have been identified; however, M. vitrata is the only species from which a PBAN gene is documented in the agriculturally and economically important Crambidae, in addition to the PBAN-R study of the European corn borer (Ostrinia nubilalis) from the same family (Chang & Ramasamy, 2014). The sequence data of PBAN and PBAN-R from S. recurvalis in this study allowed an extensive analysis for functional and comparative investigation and further disclosed the mechanism of sex pheromone biosynthesis in this family, which may provide new targets for integrated pest management strategies.

Fig. 8 Fragments per kilo base per million reads (FPKM) of putative pheromone biosynthesis activating neuropeptide (PBAN) and PBAN receptor (PBAN-R) genes. Expression level of identified PBAN and PBAN-R transcripts in Spoladea recurvalis.

Fig. 9 Fragments per kilo base per million reads (FPKM) of genes encoding chemosensory protein families involved in chemoreception. FPKM of three sensory neuron membrane protein (SNMP) genes, 14 odorant binding protein (OBP) genes and 20 chemosensory protein (CSP) genes.

As for other multigene families encoding proteins with important roles in chemoreception, Figure 9 shows that three isoform clusters were identified as sensory neuron membrane proteins (SNMPs) (FPKM = 2.31, 3.34 and
1.25). SNMPs are crucial elements associated with the responses to pheromone compounds in olfactory signal transduction in insect species (Benton et al., 2007; Vogt et al., 2009). One SNMP was shown to be required for perceiving the pheromone compound cis-vaccenyl acetate in D. melanogaster (Benton et al., 2007; Jin et al., 2008). The expression of SNMPs may suggest potential targets for novel pest management strategies for S. recurvalis. The role of odorant binding proteins (OBP) or chemosensory proteins (CSP) is to interact with odor molecules and pheromones in the sensillar lymph of insects, followed by ligand binding and signal transduction (Vogt, 1981). We obtained 14 OBP isoforms (FPKM between 0.29 and 321.32) and 20 isoforms similar to the CSP family in public databases with FPKM values ranging from 0.34 to 1141.79. The number of OBPs identified in S. recurvalis is lower than those reported in transcriptomes of other lepidopterans (e.g., 18 and 20 in Manduca sexta and B. mori, respectively), whereas the putative CSP number did not differ significantly from 21 CSPs reported for M. sexta (Gong et al., 2007; Grosse-Wilde et al., 2011). It is noteworthy that the expression patterns of genes may vary at different developmental stages and tissue distribution conditions. In particular, OBPs and CSPs are not confined to the chemosensory tissues, and likely engage in other physiological responses (see review by Sánchez-Gracia et al., 2001). For example, the remarkably high expression levels observed in OBP1 and CSP1–4 from S. recurvalis (Fig. 9) support the speculation that OBPs and CSPs may be involved in the stimulation and regulation of adult development in pupae. Thus, the comparison of identified transcript numbers among transcriptomes from specific tissues or life stages may not truly reflect the comprehensive evolutionary history and diversification of these multigene families in S. recurvalis.

Conclusions

We have compiled a library with a total of 120 435 transcripts from the beet webworm, S. recurvalis. The results from this study will enrich the genomic resources for the family Crambidae in the order Lepidoptera, and will serve as a foundation for future functional studies. Functional profiling is essential to link available physiological and ecological information with molecular data; it is the future path for transcriptome research. The assembled and annotated transcriptome sequences of S. recurvalis in this study contribute valuable information with which to investigate lepidopteran genomes, as well as to shed light on genes correlated with pheromone production and reception in moths.

Acknowledgments

This study was supported with funds from the Federal Ministry for Economic Cooperation and Development, Germany. We are grateful to Fu-cheng Su, Mei-ying Lin, Yuan Liang, Li-hua Yang, Chun-chu Huang, and Ruey-chuan Hu who worked on this project and provided the insects necessary for the study. The authors thank Dr. Wu, Keh-Ming (Genomics BioSci & Tech Co., Taiwan, China) for de novo transcriptome assembly support.

Disclosure

The authors have no conflict of interest to declare.

References

Adams, M.D., Celniker, S.E., Holt, R.A., Evans, C.A., Gocayne, J.D., Amanatides, P.G., Scherer, S.E., Li, P.W., Hoskins, R.A. and Galle, R.F. et al. (2000) The genome sequence of Drosophila melanogaster. Science, 287, 2185–2195.
Bai, X., Mamidala, P., Rajarapu, S.P., Jones, S.C. and Mittapalli, O. (2011) Transcriptomics of the bed bug (Cimex lectularius). PLoS ONE, 6, e16336.
Benton, R., Vannice, K.S. and Vosshall, L.B. (2007) An essential role for a CD36-related receptor in pheromone detection in Drosophila. Nature, 450, 289–293.
Bhattacherjee, N.S. and Ramdas Menon, M.G. (1964) Bionomics, biology and control of Hymenia recurvalis (Fabricius) (Pyralidae: Lepidoptera). Indian Journal of Entomology, 26, 176–183.
Chang, J.C. and Ramasamy, S. (2014) Identification and expression analysis of diapause hormone and pheromone biosynthesis activating neuropeptide (DH-PBAN) in the legume pod borer, Maruca vitrata Fabricius. PLoS ONE, 9, e84916.
CABI (Centre for Agriculture and Bioscience International) (2016) Spoladea recurvalis (Hawaiian beet webworm) Datasheet. Invasive Species Compendium, http://www.cabi.org/isc/datasheet/28245 (accessed on April 25, 2016).
Conesa, A., Götz, S., García-Gómez, J.M., Terol, J., Talón, M. and Robles, M. (2005) Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics, 21, 3674–3676.
Fan, S.F., Zhang, F.Z., Deng, K.L., Yu, C.S., Liu, S.W., Zhao, P.Y. and Pan, C.P. (2013) Spinach or amaranth contains highest residue of metalaxyl, fluazifop-P-butyl, chlorpyrifos, and lambda-cyhalothrin on six leaf vegetables upon open field application. Journal of Agricultural and Food Chemistry, 61, 2039–2044.
© 2016 Institute of Zoology, Chinese Academy of Sciences, 25, 33–44

Fitch, W.M. (1970) Distinguishing homologous from analogous proteins. Systematic Biology, 19, 99–113.
Glaser, N., Frérot, B., Leppik, E., Monsempes, C., Capdevielle-Dulac, C., Le Ru, B., Lecocq, T., Harry, M., Jacquin-Joly, E. and Calatayud, P.A. (2014) Similar differentiation patterns between PBP expression levels and pheromone component ratios in two populations of Sesamia nonagrioides. Journal of Chemical Ecology, 40, 923–927.
Gong, D.P., Zhang, H.J., Zhao, P., Lin, Y., Xia, Q.Y. and Xiang, Z.H. (2007) Identification and expression pattern of the chemosensory protein gene family in the silkworm, Bombyx mori. Insect Biochemistry and Molecular Biology, 37, 266–277.
Götz, S., García-Gómez, J.M., Terol, J., Williams, T.D., Nagaraj, S.H., Nueda, M.J., Robles, M., Talón, M., Dopazo, J. and Conesa, A. (2008) High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Research, 36, 3420–3435.
Grosse-Wilde, E., Kuebler, L.S., Bucks, S., Vogel, H., Wicher, D. and Hansson, B.S. (2011) Antennal transcriptome of Manduca sexta. Proceedings of the National Academy of Sciences of the United States of America, 108, 7449–7454.
Holt, R.A., Subramanian, G.M., Halpern, A., Sutton, G.G., Charlab, R., Nusskern, D.R., Wincker, P., Clark, A.G., Ribeiro, J.M. and Wides, R. et al. (2002) The genome sequence of the malaria mosquito Anopheles gambiae. Science, 298, 129–149.
Hsu, Y.C. and Srinivasan, R. (2012) Desert horse purslane weed as an alternative host for amaranth leaf webber, Hymenia recurvalis in Taiwan. Formosan Entomologist, 32, 297–302.
Imai, K., Konno, T., Nakazawa, Y., Komiya, T., Isobe, M., Koga, K., Goto, T., Yaginuma, T., Sakakibara, K. and Hasegawa, K. et al. (1991) Isolation and structure of diapause hormone of the silkworm, Bombyx mori. Proceedings of the Japan Academy, Series B, 67, 98–101.
International Silkworm Genome Consortium (2008) The genome of a lepidopteran model insect, the silkworm Bombyx mori. Insect Biochemistry and Molecular Biology, 38, 1036–1045.
Jin, X., Ha, T.S. and Smith, D.P. (2008) SNMP is a signaling component required for pheromone sensitivity in Drosophila. Proceedings of the National Academy of Sciences of the United States of America, 105, 10996–11001.
Jurenka, R. and Rafaeli, A. (2011) Regulatory role of PBAN in sex pheromone biosynthesis of heliothine moths. Frontiers in Endocrinology, 2, 46.
Kahuthia-Gathu, R. (2011) Invasion of amaranth and spinach by the spotted beet webworm Hymenia species in eastern province of Kenya. Kenyatta University. 5 pp.
Karatolos, N., Pauchet, Y., Wilkinson, P., Chauhan, R., Denholm, I., Gorman, K., Nelson, D.R., Bass, C., ffrench-Constant, R.H. and Williamson, M.S. (2011) Pyrosequencing the transcriptome of the greenhouse whitefly, Trialeurodes vaporariorum reveals multiple transcripts encoding insecticide targets and detoxifying enzymes. BMC Genomics, 12, 56.
Li, Z.W., Shen, Y.H., Xiang, Z.H. and Zhang, Z. (2011) Pathogen-origin horizontally transferred genes contribute to the evolution of Lepidopteran insects. BMC Evolutionary Biology, 11, 356.
Margam, V.M., Coates, B.S., Bayles, D.O., Hellmich, R.L., Agunbiade, T., Seufferheld, M.J., Sun, W., Kroemer, J.A., Ba, M.N. and Binso-Dabire, C.L. et al. (2011) Transcriptome sequencing, and rapid development and application of SNP markers for the legume pod borer Maruca vitrata (Lepidoptera: Crambidae). PLoS ONE, 6, e21388.
Mita, K., Kasahara, M., Sasaki, S., Nagayasu, Y., Yamada, T., Kanamori, H., Namiki, N., Kitagawa, M., Yamashita, H. and Yasukochi, Y. et al. (2004) The genome sequence of silkworm, Bombyx mori. DNA Research, 11, 27–35.
Mittapalli, O., Bai, X., Mamidala, P., Rajarapu, S.P., Bonello, P. and Herms, D.A. (2010) Tissue-specific transcriptomics of the exotic invasive insect pest emerald ash borer (Agrilus planipennis). PLoS ONE, 5, e13708.
Miyahara, Y. (1993) Immigration of the Hawaiian beet webworm, Hymenia recurvalis, during the Bai-u season, 2: The arrival of the moths in early June 1992. Proceedings of the Association for Plant Protection of Kyushu (Japan).
Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. and Wold, B. (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods, 5, 621–628.
Neafsey, D.E., Christophides, G.K., Collins, F.H., Emrich, S.J., Fontaine, M.C., Gelbart, W., Hahn, M.W., Howell, P.I., Kafatos, F.C. and Lawson, D. et al. (2013) The evolution of the Anopheles 16 genomes project. G3 (Bethesda), 3, 1191–1194.
New, T.R. (2004) Moths (Insecta: Lepidoptera) and conservation: background and perspective. Journal of Insect Conservation, 8, 79–94.
Nusawardani, T., Kroemer, J.A., Choi, M.Y. and Jurenka, R.A. (2013) Identification and characterization of the pyrokinin/pheromone biosynthesis activating neuropeptide family of G protein-coupled receptors from Ostrinia nubilalis. Insect Molecular Biology, 22, 331–340.
Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H. and Kanehisa, M. (1999) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research, 27, 29–34.
Raina, A.K., Jaffe, H., Kempe, T.G., Keim, P., Blacher, R.W., Fales, H.M., Riley, C.T., Klun, J.A., Ridgway, R.L. and Hayes, D.K. (1989) Identification of a neuropeptide hormone that regulates sex pheromone production in female moths. Science, 244, 796–798.
Rogers, M.B., Hilley, J.D., Dickens, N.J., Wilkes, J., Bates, P.A., Depledge, D.P., Harris, D., Her, Y., Herzyk, P. and Imamura, H. et al. (2011) Chromosome and gene copy number variation allow major structural change between species and strains of Leishmania. Genome Research, 21, 2129–2142.
Sánchez-Gracia, A., Vieira, F.G., Almeida, F.C. and Rozas, J. (2001) Comparative genomics of the major chemosensory gene families in . Encyclopedia of Life Sciences. John Wiley & Sons, Ltd., Chichester, UK. doi:10.1002/9780470015902.a0022848.
Shen, G.M., Dou, W., Niu, J.Z., Jiang, H.B., Yang, W.J., Jia, F.X., Hu, F., Cong, L. and Wang, J.J. (2011) Transcriptome analysis of the oriental fruit fly (Bactrocera dorsalis). PLoS ONE, 6, e29127.
Tatusov, R.L., Koonin, E.V. and Lipman, D.J. (1997) A genomic perspective on protein families. Science, 278, 631–637.
Trapnell, C., Williams, B.A., Pertea, G., Mortazavi, A., Kwan, G., van Baren, M.J., Salzberg, S.L., Wold, B.J. and Pachter, L. (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology, 28, 511–515.
Van Belleghem, S.M., Roelofs, D., Van Houdt, J. and Hendrickx, F. (2012) De novo transcriptome assembly and SNP discovery in the wing polymorphic salt marsh beetle Pogonus chalceus (Coleoptera, Carabidae). PLoS ONE, 7, e42605.
Vogel, H., Heidel, A., Heckel, D. and Groot, A. (2010) Transcriptome analysis of the sex pheromone gland of the noctuid moth Heliothis virescens. BMC Genomics, 11, 29.
Vogt, R.G. and Riddiford, L.M. (1981) Pheromone binding and inactivation by moth antennae. Nature, 293, 161–163.
Vogt, R.G., Miller, N.E., Litvack, R., Fandino, R.A., Sparks, J., Staples, J., Friedman, R. and Dickens, J.C. (2009) The insect SNMP gene family. Insect Biochemistry and Molecular Biology, 39, 448–456.
Wang, X.W., Luan, J.B., Li, J.M., Bao, Y.Y., Zhang, C.X. and Liu, S.S. (2010) De novo characterization of a whitefly transcriptome and analysis of its gene expression during development. BMC Genomics, 11, 400.
Wang, Z., Gerstein, M. and Snyder, M. (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics, 10, 57–63.
Xia, Q., Zhou, Z., Lu, C., Cheng, D., Dai, F., Li, B., Zhao, P., Zha, X., Cheng, T. and Chai, C. et al. (2004) A draft sequence for the genome of the domesticated silkworm (Bombyx mori). Science, 306, 1937–1940.
Xu, W.H. and Denlinger, D.L. (2003) Molecular characterization of prothoracicotropic hormone and diapause hormone in Heliothis virescens during diapause, and a new role for diapause hormone. Insect Molecular Biology, 12, 509–516.
Xue, J., Bao, Y.Y., Li, B.L., Cheng, Y.B., Peng, Z.Y., Liu, H., Xu, H.J., Zhu, Z.R., Lou, Y.G. and Cheng, J.A. et al. (2010) Transcriptome analysis of the brown planthopper Nilaparvata lugens. PLoS ONE, 5, e14233.
Yang, P., Zhu, J.Y., Gong, Z.J., Xu, D.L., Chen, X.M., Liu, W.W., Lin, X.D. and Li, Y.F. (2012) Transcriptome analysis of the Chinese white wax scale Ericerus pela with focus on genes involved in wax biosynthesis. PLoS ONE, 7, e35719.
Zhan, S., Merlin, C., Boore, J.L. and Reppert, S.M. (2011) The monarch butterfly genome yields insights into long-distance migration. Cell, 147, 1171–1185.
Zhang, S.X., Zhang, Y., Xu, S.Q., Wang, G.X., Hu, Z.J., Zhao, C.X. and Cui, W.Z. (2010) Mapping and expression analysis of GOBP/PBP subfamily gene cluster during pupal and adult stages of the silkworm, Bombyx mori. Acta Entomologica Sinica, 53, 1069–1076.

Accepted June 7, 2016
