De Novo Construction of a Transcriptome for the Stink Bug Crop
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/2020.12.04.412270; this version posted December 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 1 Title 2 De novo construction of a transcriptome for the stink bug crop 3 pest Chinavia impicticornis during late development 4 5 Authors 6 Bruno C. Genevcius*, Tatiana T. Torres 7 8 Affiliation 9 Department of Genetics and Evolutionary Biology, University of São Paulo, São Paulo, SP, 10 Brazil 11 *correspondence: [email protected] 12 13 Abstract 14 Chinavia impicticornis is a Neotropical stink-bug (Hemiptera: Pentatomidae) with economic 15 importance for different crops. Little is known about the development of the species, as 16 well as the genetic mechanisms that may favor the establishment of populations in 17 cultivated plants. Here we conduct the first large-scale molecular study with C. 18 impicticornis. We generated RNA-seq data for males and females, at two immature 19 stages, for the genitalia separately and for the rest of the body. We assembled the 20 transcriptome and conduct a functional annotation. De novo assembled transcriptome 21 based on whole bodies and genitalia of males and females contained around 400,000 22 contigs with an average length of 688 bp. After pruning duplicated sequences and 23 conducting a functional annotation, the final annotated transcriptome comprised 39,478 24 transcripts of which 12,665 had GO terms assigned. These novel datasets will provide 25 invaluable data for the discovery of molecular processes related to morphogenesis and 26 immature biology. We hope to contribute to the growing research on stink bug evo-devo as 27 well as the development of bio-rational solutions for pest management. 28 29 Key-words: genes, green stink bug, larvae, mRNA, nextGen, nymphs, sequencing 30 31 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.04.412270; this version posted December 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 32 DATA DESCRIPTION 33 Context 34 The stink bug Chinavia impicticornis (Hemiptera, Pentatomidae) is a Neotropical species 35 with wide distribution across the South America. The species is an economically important 36 polyphagous pest reported to feed on fourteen plant species [1]. Most of its damage has 37 been attributed to soybean and closely related crops. However, the capability of switching 38 to less preferred hosts in anomalous conditions may help explain the successful 39 establishment of its populations in different crops [2,3]. 40 Not only C. impicticornis but many other species of stink bugs are responsible for 41 millions of dollars’ worth of damage worldwide every year. Taken as an example, the brown 42 marmorated stink bug (Halyomorpha halys) has caused a 37 million dollar loss only to 43 apple growers in the USA during 2011 [4]. Towards mitigating this damage, there is 44 growing applied and basic research which has certainly contributed to our knowledge of 45 the biological aspects and management strategies for stink bug pests [5]. Most efforts 46 have focused on sampling methods, taxonomic identification, insecticide effectiveness, 47 and population monitoring [6,7]. However, traditional pest control techniques have the 48 potential of inflicting serious ecological disturbances as well as the selection of resistant 49 lineages of crop pests. The development of bio-rational solutions is largely dependent on a 50 detailed comprehension of the underlying biology of these pests [8]. In this context, studies 51 on the “omics” fields offer promising species-specific and environmental-friendly tools [9]. 52 More specifically, transcriptomic approaches provide the foundation for identifying gene 53 targets associated with pheromones, pesticide resistance, and other features with potential 54 for pest managing. Conversely, only five species of stink have had full body transcriptomes 55 sequenced to date: the brown marmorated stink bug [10], the harlequin bug [9], the 56 southern green stink bug [11], the brown stink bug [12], and the predatory stink bug Arma 57 chinensis [13]. Genomic data are even more scarce, comprising four genomes assembled 58 to date: Halyomorpha halys [14], Euschistus heros, Piezodorus guildinii, and Stiretrus 59 anchorago. 60 Here we target this gap by documenting and characterizing the first transcriptome 61 for the Neotropical stink bug C. impicticornis (Hemiptera, Pentatomidae). We conduct our 62 study at the nymphal stage for two reasons. First, because management strategies are 63 known to affect nymphs and adults in different ways, and nymph transcriptomes are yet 64 scarce for stink bugs, e.g. [10]. Second, developmental transcriptomes provide invaluable 65 data for the discovery of molecular processes underlying morphogenesis. We focused on 2 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.04.412270; this version posted December 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 66 the fifth instar because this is the stage where morphological sexual differentiation takes 67 place [15]. These resources may be helpful for the growing research on genital evo-devo 68 and stink bugs development as a whole [15,16]. 69 70 Methods 71 Samples 72 Our colony of C. impicticornis was fed on green bean pods (Phaseolus vulgaris) and 73 reared under the following conditions: 26±1 °C, 65±10% RH, and 14L:10D hours of 74 photophase. We isolated RNA from male and female immatures at two developmental 75 stages: the beginning and the end of the fifth nymphal instar (two hours after and seven 76 days after molting from fourth to fifth instar, respectively). The fifth nymphal instar in our 77 controlled conditions has an average of eight days. We included three individuals per sex 78 per stage, totaling 12 specimens. We chose to remove the genitalia of each specimen to 79 extract RNA and construct the libraries of bodies and genitalia separately. Genitalia are the 80 most important traits for both species delimitation and sex identification in true bugs. 81 Therefore, with this approach, we expect to generate data to be particularly useful for 82 studying the mechanisms of sex determination and speciation. For RNA sequencing, each 83 genital sample was sequenced separately (n = 3 genitalia per sex per sample), while 84 bodies of the same sex and developmental stage were pooled to generate a single library 85 (n = 1 body per sex per stage). 86 87 RNA extraction & sequencing 88 Frozen specimens (at -80 °C) were sexed and had their genitalia separated under a 89 stereoscope. RNA extraction of genitalia and bodies were conducted with the Trizol 90 reagent following the manufacturer’s protocol (Invitrogen, Life Technologies). RNA was 91 quantified via Qubit 2.0 (Thermo Fisher Scientific, USA) and then sent for sequencing at 92 the Centro de Genômica Funcional (ESALQ-USP) using Illumina HiSeq 2500. Samples 93 were paired-end sequenced with 100 bp and a target depth of 20 million read pairs per 94 sample. 95 96 Transcriptome assembly & annotation 97 We removed redundant sequences to decrease computational usage for transcriptome 98 assembly using a custom Perl script [17] (supplementary material). De novo assembly was 3 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.04.412270; this version posted December 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 99 conducted in Trinity v. 2.4.0 [18] with concatenated samples, a minimum contig length of 100 199, and other parameters set to default values. 101 Due to the lack of a reference genome, transcriptome annotation was conducted 102 using FunctionAnnotator [19]. This is a web-based tool that blasts sequences against the 103 NCBI NR protein database. FunctionAnnotator was also used for the functional 104 characterization, which uses the B2G4PIPE engine for GO terms assignment. 105 106 Quality control 107 The quality of raw reads was assessed using the program FastQC v. 0.11.9. A quality 108 control plot was constructed using MultiQC v. 1.9 [20] and is represented in Fig. 1. Raw 109 reads of all samples had good quality. We removed low quality sequences and adapters 110 using Trimmomatic v. 0.36 with the following parameters: LEADING:3 TRAILING:3 111 SLIDINGWINDOW:4:20 MINLEN:36. An average of 11.5 ± 0.006 % of raw reads was 112 removed during the trimming process. 113 To avoid redundant transcripts in the transcriptome, we removed the shortest 114 isoforms with a build-in Trinity script. We then assessed transcriptome completeness using 115 the Benchmarking Universal Single-Copy Orthologs (BUSCO) v. 4.1.4 [21] against the 116 Hemiptera ortholog database with default parameters. BUSCO analysis revealed 117 appropriate levels of completeness for the assembled transcriptome (Complete: 90.1%, 118 Fragmented: 1.5%, Missing: 8.4%; Fig. 1). 119 120 Transcriptome characterization 121 RNA sequencing generated approximately 5 million raw reads (Table 1). The de novo 122 assembled transcriptome had almost 400k contigs, with a 35.04% GC content and a mean 123 contig length of 688 bp. After removing the shortest isoforms, 268,000 contigs remained.