Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary.

Chromosome-level genome assembly of the sweet potato weevil, Cylas formicarius (Fabricius) (Coleoptera: Brentidae), and functional characteristics of CforOBP4-6

Jinfeng Hua | Huifeng Li | Yongmei Huang | Lei Zhang | Xiaofeng Dong | Yanqing Li | Tianyuan Chen | Yonghua Han | Daifu Ma | Xiaowan Gou | Zongyun Li | Jianying Sun | Yujie Fu

1 Institute of Integrative Plant Biology, Jiangsu Key Laboratory of Phylogenomics & Comparative Genomics, School of Life Sciences, Jiangsu Normal University, Jiangsu, China
2 Sweet Potato Laboratory, Maize Research Institute, Guangxi Academy of Agricultural Sciences, Guangxi, China
3 Xuzhou Academy of Agricultural Sciences/Sweet Potato Research Institute, CAAS, Jiangsu, China

Correspondence
Zongyun Li, Institute of Integrative Plant Biology, Jiangsu Key Laboratory of Phylogenomics & Comparative Genomics, School of Life Sciences, Jiangsu Normal University, No.101 Shanghai Road, Tongshan New District, Xuzhou City, Jiangsu Province, China. E-mail: [email protected]
Daifu Ma, Xuzhou Academy of Agricultural Sciences/Sweet Potato Research Institute, CAAS, Xuhai Road, North of the high-speed railway station, Xuzhou Economic and Technological Development Zone, Xuzhou City, Jiangsu Province, China. E-mail: [email protected]
Tianyuan Chen, Sweet Potato Laboratory, Maize Research Institute, Guangxi Academy of Agricultural Sciences, No.174 Daxue Road, Nanning City, Guangxi Province, China. E-mail: [email protected]

June 25, 2021

Abstract

Cylas formicarius is one of the most important pests of sweet potato worldwide, causing considerable ecological and economic damage. To improve the effect of comprehensive management and understanding of genetic mechanisms, the genetic functions of C. formicarius have been the subject of intensive study. Using Illumina and PacBio sequencing, we obtained a chromosome-level genome assembly of adult weevils from lines inbred for 15 generations. The high-quality assembly obtained had a size of 338.84 Mb, with contig and scaffold N50 values of 14.97 Mb and 34.23 Mb, respectively. In total, 157.51 Mb of repeat sequences and 11,907 protein-coding genes were predicted. A total of 337.06 Mb of genomic sequences was located on the 11 chromosomes, and the sequence length that could be used to determine the sequence and direction accounted for 99.03% of the total length of the associated chromosome. Comparative genomic analysis showed that C. formicarius was sister to Dendroctonus ponderosae, and that C. formicarius diverged from D. ponderosae approximately 138.89 million years ago (Mya). Many important gene families that were expanded in the C. formicarius genome were involved in the chemosensory system. In an in-depth study, the binding assay results indicated that CforOBP4-6 had strong binding affinities for sex pheromones and other ligands. Overall, the high-quality C. formicarius genome provides a valuable resource to reveal the molecular ecological basis, genetic mechanism and evolutionary process of major agricultural pests, deepen the understanding of environmental adaptability and apparent plasticity, and provide new ideas and new technologies for ecologically sustainable pest control.

Keywords
Cylas formicarius, PacBio, Hi-C, chromosome-level genome, functional annotation, odorant binding proteins

1 | Introduction

Brentidae, which includes over 11,000 described extant species, are also called primitive weevils (https://bugguide.net/node/view/15740) (Bouchard et al., 2011; Gunter, Oberprieler, & Cameron, 2016; Schon & Skuhrovec, 2016). Weevils, including many of the world's pest species, are responsible for substantial economic loss, causing damage to food and cash crops, severe reduction in crop yields and losses of millions of dollars annually (Christiaens et al., 2016; Hardee, Jones, & Adams, 1999; Industry, 2016). The sweet potato weevil (SPW), Cylas formicarius (Fabricius) (Coleoptera: Brentidae), is a major destructive pest that causes drastic yield decline, resulting in considerable losses throughout tropical and subtropical regions of the world (Kyereko, Hongbo, Amoanimaa-Dede, Meiwei, & Yeboah, 2019). Although olfaction-based strategies have been used as part of integrated pest management (IPM) programmes to prevent and control infestations (Coffelt, Vick, Sower, & Mcclellan, 1978; Heath et al., 1986), these weevils exhibit a unique ability to damage sweet potato (Kyereko et al., 2019). To provide a wealth of information to improve the effect of comprehensive management and to understand the molecular ecology and evolution of this species, the genetic functions of C. formicarius have been the subject of intensive study.
Sweet potato (Ipomoea batatas [L.] Lam), the seventh most important crop in the world and the fourth most significant crop in China (Food and Agriculture Organization of the United Nations), is an important source of calories, proteins, vitamins and minerals for humans (J. Yang et al., 2017). Sweet potato has immense potential to play a major role in human nutrition, food security and poverty alleviation in developing countries (Bovell-Benjamin, 2007). C. formicarius is the major destructive pest of sweet potato throughout Asia, Africa, the Pacific islands, the Caribbean, USA, Venezuela and Guyana (Hiroyoshi, Kohama, & Reddy, 2016; Kyereko et al., 2019) and has been found in higher-latitude areas as well (Korada & Mukherjee, 2012). Although C. formicarius prefers sweet potato, more than 30 species of Ipomoea and other genera have been recorded as its host plants (McConnell & Hossner, 1991; Sutherland, 1986). In southern China (Jiangsu, Zhejiang, Jiangxi, Hunan, Fujian, Guizhou, Sichuan, Yunnan, Guangxi, Guangdong, Hainan and Taiwan), C. formicarius causes extensive loss of sweet potato (Kyereko et al., 2019; Li, Gao, & Chen, 2016).

After originating in the Indian subcontinent approximately 80-100 million years ago (Mya) (Wolfe, 1991), C. formicarius first became associated with sweet potato, which originated in or near northwestern South America, at the beginning of the sixteenth century (Austin, 1988). The species was first described by Fabricius in 1792-1794 from Trenquebar (India), and it was first reported as an economic pest in 1857 (Cockerham, Deen, Christian, & Newsom, 1954). Over the course of 150 years of research, many studies on C. formicarius management and control have been carried out, including studies on agricultural measures, chemical and biological control, host plant resistance, insect sterilization techniques and IPM. Furthermore, the sex pheromone of C. formicarius, (Z)-3-dodecen-1-yl (E)-2-butenoate, was first extracted in 1978 (Coffelt et al., 1978), and the bioactivity of the synthetic compound was tested in both laboratory and field experiments in 1986 (Heath et al., 1986). Olfaction-based approaches, using synthetic sex pheromones to monitor and interfere with the pests' ability to find suitable mates, have been used successfully in "push-pull" control strategies (Hlerema, Laurie, & Eiasu, 2017) for application in the production areas of sweet potato worldwide.

However, because these pests have overlapping generations, are active throughout the year, fly short distances and are mainly nocturnal as adults, there are no effective control methods (Kyereko et al., 2019; Ondiaka, Maniania, Nyamasyo, & Nderitu, 2008). C. formicarius can produce several generations per year and overwinter in storage or in open fields (Ma et al., 2016). Thus, there is an urgent need to develop alternative pest control methods. So far, gene expression patterns related to the development of different C. formicarius populations have been reported based on transcriptome analysis (Bin, Qu, Pu, Wu, & Lin, 2017; Ma et al., 2016); however, genomic research on C. formicarius, including on the mechanism of environmental adaptation and the molecular mechanism of olfactory recognition, has been very limited.

In the present study, we report a chromosome-level genome assembly of C. formicarius using Illumina paired-end (PE) sequencing, Pacific Biosciences (PacBio) long reads and High-throughput chromosome conformation capture technology (Hi-C) chromatin interaction maps. The high-quality genome sequence will provide a strong foundation for the study of the biological mechanisms of molecular evolution, host-plant specialization, ecological adaptation and innovative pest control.

2 | Materials and Methods

2.1 | Insects

C. formicarius (NCBI Taxonomy ID: 2611,543) (Figure 1) was collected from Nanning, Guangxi, China, followed by 15 generations of single-pair mating with fresh sweet potato roots at a temperature of 27 ± 1 °C under 60 ± 5% relative humidity (RH) and a 16:8 h light-dark (L:D) photoperiod at the School of Life Sciences, Jiangsu Normal University.

2.2 | Genome sequencing

High-molecular-weight genomic DNA was isolated from approximately 400 male and female weevils using a Trelief™ Genomic DNA Kit (TsingKe, China), and the DNA quality and quantity were assessed using a NanoDrop spectrophotometer (Thermo Fisher Scientific, USA) and a Qubit® 3.0 Fluorometer (Invitrogen, USA).
The extracted DNA was then used to construct Illumina and PacBio libraries. PE genomic libraries with an insertion length of 270 bp were constructed according to the Illumina TruSeq Nano DNA Library Prep Kit (Illumina, USA), and a total of 27.64 Gb of raw data of PE sequences (2 x 150 bp) was obtained on an Illumina HiSeq X Ten platform. For long-read sequencing, 20-kb DNA libraries were constructed according to the standard PacBio protocol (Pacific Biosciences, Menlo Park, CA, USA) and sequenced on single molecule real-time (SMRT) cells on the PacBio platform at Biomarker Technologies Corporation (Beijing, China). The original data were subjected to strict quality control before genome assembly. For Illumina data, we used the Trimmomatic program v0.33 (Trimmomatic, RRID:SCR_011848) to remove adaptor sequences and trim low-quality reads (Bolger, Lohse, & Usadel, 2014).

2.3 | Genome evaluation, assembly and correction

The size of the C. formicarius genome was estimated as follows: G = K-num/K-depth (where G represents the genome size, K-num is the total number of 19-mers, and K-depth is the average k-mer depth) (Marcais & Kingsford, 2011). The sequence data (27.64 Gb of clean reads) from 270-bp PE libraries were employed in the analysis of the k-mer (k=19) depth frequency distribution map. The 19-mer peak was at a depth of 61X. The estimated genome size was used as a reference for the subsequent genome assembly results.

PacBio long-read data were assembled using an overlap-layout-consensus method (Staden, 1980). First, the longer reads were selected and corrected with the Canu program (v1.7) using default parameters (Koren et al., 2017): unsupported bases and hairpin adapters were trimmed to obtain the maximum supported range of each read, and error-corrected reads were generated. Second, the error-corrected reads were used to obtain a draft assembly; preliminary assembly was conducted using Falcon v1.2.4 (Koren et al., 2017), and the draft assembly was then polished. To further improve the quality of the assembly, PacBio data were assembled into contigs with the Wtdbg program (Ruan & Li, 2020). We obtained a 338.84 Mb genome assembly with a contig N50 of 14.97 Mb. To further improve the accuracy of the assembly, four rounds of consensus correction were performed using Illumina reads mapped with the Burrows-Wheeler Aligner (BWA v0.7.10-r789) (H. Li & Durbin, 2009) and the Pilon program (Pilon, RRID:SCR_014731) (Walker et al., 2014). To evaluate the genomic integrity of the assembly, we used CEGMA v2.5 (http://korflab.ucdavis.edu/Datasets) (Parra, Bradnam, & Korf, 2007) and BUSCO v2.0 (http://busco.ezlab.org) (Simao, Waterhouse, Ioannidis, Kriventseva, & Zdobnov, 2015).

2.4 | Hi-C library construction and chromosome assembly based on Hi-C data

To generate the chromosome-level assembly of the C. formicarius genome, we constructed the Hi-C fragment library, ranging from 300 to 700 bp, by following previously described protocols (Nagano et al., 2015; Rao et al., 2014). Insect tissue was fixed with 2% (vol/vol) formaldehyde in PBS, and the DNA was digested with HindIII. The sticky ends were biotinylated and proximity-ligated to form chimaeric junctions, which were enriched and further sheared into 300-700 bp fragments by sonication. These chimaeric fragments were sequenced using a PE strategy on the Illumina HiSeq X Ten platform at Biomarker Technologies Corporation (Beijing, China). To obtain clean data, adapter sequences and low-quality reads were removed using FastQC software (Andrews, 2014). Then, the clean PE reads were mapped using BWA v0.7.10-r789 (H. Li & Durbin, 2009; Walker et al., 2014) (Table S4). Only uniquely aligned PE reads were considered for subsequent analysis. Identification and filtering of the invalid read pairs were performed using HiC-Pro (v2.11.1) (Servant et al., 2015) (Table S5, Figure 1). Using Lachesis software, the verified data were used to group, cluster, sort and orient the contigs into chromosome-level sequences (Burton et al., 2013).

2.5 | Genome sequence annotation

Repeat sequences are less well conserved among species and play an important role in genome evolution (Treangen & Salzberg, 2012). Based on the assembled genome, we used the LTR-FINDER v1.05 (Z. Xu & Wang, 2007), MITE-Hunter v1.0.0 (Han & Wessler, 2010), RepeatScout v1.05 (Price, Jones, & Pevzner, 2005) and PILER-DF v2.4 (Edgar & Myers, 2005) software packages to construct a de novo repeat library with default parameters.
First, we used TRF (v.4.09), RepeatMasker (v. 3.3.0; RepeatMasker, RRID:SCR_012954) and Repeat Protein Mask (v. 3.3.0) to detect and classify different types of repetitive sequences by aligning genome sequences to the Repbase library (v. 17.01) (Bao, Kojima, & Kohany, 2015). Next, we conducted a de novo analysis of the repeat sequences with RepeatModeler and RepeatMasker (v4.0.6) (Tarailo-Graovac & Chen, 2009). Then, PASTEClassifier (Wicker et al., 2007) was used to classify the de novo repeat library, and the Repbase database (Jurka et al., 2005) was used to merge the libraries. Finally, RepeatMasker v4.0.6 was used to identify the repeat sequences based on the merged libraries.

After masking the repeat sequences, de novo-based, homologue-based and RNA sequence-based methods were employed for gene prediction in the C. formicarius genome assembly. We first used the software programs Genscan v3.1 (Burge & Karlin, 1997), Augustus v3.0 (Augustus, RRID:SCR_008417) (Stanke, Steinkamp, Waack, & Morgenstern, 2004), GlimmerHMM v3.0.4 (Majoros, Pertea, & Salzberg, 2004), GeneID v1.4 (Blanco, Parra, & Guigo, 2007) and SNAP v2006-07-28 (Korf, 2004) for de novo prediction. The protein sequences of the coleopteran insects Dendroctonus ponderosae, Oryctes borbonicus, Tribolium castaneum and Anoplophora glabripennis, and of Drosophila melanogaster, were downloaded from the National Center for Biotechnology Information (NCBI) database (https://www.ncbi.nlm.nih.gov/genome/) as a reference, and homology-based prediction was performed using GeMoMa v1.3.1 (Keilwagen et al., 2016). In the process of RNA sequence-based gene prediction, eleven RNA samples were extracted from larvae and from 3-day-old adult tissues (antennae, heads, legs, thoraxes and abdomens) of males and females using a Total RNA Isolation System kit (Promega, Madison, WI, USA). The RNA-seq reads from Illumina sequences were assembled with the reference genome using Hisat V2.0.4 (Kim, Langmead, & Salzberg, 2015) and Stringtie V1.2.3 (Pertea, Kim, Pertea, Leek, & Salzberg, 2016), and TransDecoder v2.0 (http://transdecoder.github.io), GenemarkS-T v5.1 (Tang, Lomsadze, & Borodovsky, 2015) and the Program to Assemble Spliced Alignments (PASA, RRID:SCR_014656) (Haas et al., 2003) were used for RNA-seq-based gene prediction. Finally, the results of gene prediction from the three approaches were integrated by EVidenceModeler (EVM) v1.1.148 (EVM, RRID:SCR_014659) (Haas et al., 2008). To avoid the loss of some reliable genes after filtering, the genes lost in the EVM integration were added, and the predicted results were modified with PASA v2.0.2 to obtain the annotation of the whole-genome assembly.

According to the structural characteristics of different non-coding RNAs, different strategies were adopted to predict microRNAs, rRNAs and tRNAs. The tRNAs were identified using tRNAscan-SE v1.3.1 (tRNAscan-SE, RRID:SCR_010835) software with default parameters for eukaryotes (Lowe & Eddy, 1997). The rRNAs were predicted using RNAmmer v1.2 software and by aligning the C. formicarius genome to the Rfam database (release 13.0) (Kalvari et al., 2018) with Infernal 1.1 (Nawrocki & Eddy, 2013). The miRBase database (Griffiths-Jones, Grocock, van Dongen, Bateman, & Enright, 2006) was used to predict microRNAs. The results of non-coding RNA annotation are presented in Table S14.

For the annotation of pseudogenes, we searched for sequences homologous to the known protein-coding genes in the C. formicarius genome using Genblasta V1.0.4 (She, Chu, Wang, Pei, & Chen, 2009). The premature termination codons or frameshift mutations located in the above sequences were identified using Genewise V2.4.1 (RRID:SCR_015054) (Birney, Clamp, & Durbin, 2004), and pseudogenes were obtained. A total of 503 pseudogenes were annotated in the genome.

Gene functional annotation was performed by alignment to the Eukaryotic Orthologous Groups of proteins (KOG) database (Tatusov et al., 2001) (Figure S4), the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa & Goto, 2000), the nucleotide collection (nr/nt) (Marchler-Bauer et al., 2011), the TrEMBL database (Boeckmann et al., 2003) and the Swiss-Prot protein knowledgebase (http://www.expasy.org/sprot/, http://www.ebi.ac.uk/swissprot/) using Basic Local Alignment Search Tool (BLAST) v2.2.31 (Altschul, Gish, Miller, Myers, & Lipman, 1990) and KAAS v2.1 (Marcais & Kingsford, 2011). Furthermore, InterProScan v5.8-49.0 (RRID:SCR_005829) (Jones et al., 2014) was used to annotate conserved functional motifs and protein domains, and the functional annotations were aligned to the following databases: PROSITE (RRID:SCR_003457) (Bairoch, 1991), PRINTS (RRID:SCR_003412) (Attwood & Beck, 1994), SUPERFAMILY (Gough & Chothia, 2002), PANTHER (RRID:SCR_004869) (Mi et al., 2005), SMART 4.0 (RRID:SCR_005026) (Letunic et al., 2004), ProDom (RRID:SCR_006969) (Bru et al., 2005), Pfam (RRID:SCR_004726) (Finn et al., 2014), HAMAP (Lima et al., 2009), TIGRFAMs (Haft, Selengut, & White, 2003), PIRSF (C. H. Wu et al., 2004) and CATH-Gene3D (Lees et al., 2012). Finally, 1386 motifs and 24493 protein domains were annotated in the genome of C. formicarius.
2013), al., et of sequence of those whole-genome the used We et oest ne h rcs ntepyoeyo eegi n loss. birth- and gain using gene species. 2006) of Hahn, 16 phylogeny expansion & the the Demuth, of in of Cristianini, 2.7 analysis process Bie, groups the an (De homologous infer in v4.1) Computational to the by (CAFE used models performed cluster Evolution death were were Family families to species gene gene used orthologous of 16 of was contraction Analysis the and 2.0) expansion of the (version of families Comparisons OrthoMCL gene and 2013). scores ancestor contraction. Drummond, likelihood common and trees and & of evolutionary recent 2006) Xie, comparison the Suchard, a most Kumar, in (Rambaut, by The & v1.5 used runs TRACER Dudley, calibration independent in the fossil (Hedges, estimates of the database convergence parameter and the TimeTree model controls, assessed time The MCMCTREE We respec- using the derived. KEGG, estimated as 2007). was was and applied time Yang, GO were divergence (Z. using times tree, package phylogenetic divergence out and the PAML annotation carried were of functional were the model basis The genes the in site On evolving species. branch each rapidly S5). a in (Figure obtained and genes tively the single-copy 2007) of of Yang, pressure analysis (Z. selective ob- enrichment (PAML) software the to PhyML3.0 Likelihood analyse by species to Maximum applied performed each by was Analysis in sites with Phylogenetic genes across analysis in 2010). the distribution phylogenetic al., gamma connected (ML) et discrete we Maximum-likelihood (Guindon and 2002). species, in Miyata, repeats building. sets 16 & bootstrap tree gene Kuma, the 1000 phylogenetic Misawa, orthologous of for (Katoh, group 1:1 genes MAFFT orthologue super-sequences and using each single-copy tain for families families orthologous single-copy performed gene were the the (E-value the cluster alignments of Using BLASTP Multiple to all sequences using coding 2003). 
used aligned pairwise the Roos, were we of & in results orthologues, Stoeckert, gene BLASTP Li, conserved each (L. The the of OrthoMCL identify transcripts 1e-5). To longest of the tree. cut-off from phylogenetic translated a sequences infer protein and orthologs predict 2016), 2010), Genomics, Aphid migratoria 2019), custa al., et (Duan l,21)wt utbeeouinr oe.Fnly h rewsanttduigEoiwsoftware Evoview instructions the using following et generated annotated was (Tamura cDNA was method analysis. of (qRT-PCR) tree ML reaction profiles chain the the transcription-polymerase expression using Finally, construct tissue-specific X10.0 2019). to Chen, The MEGA model. used & in evolutionary were Hu, constructed Lercher, sequences suitable Gao, was acid a (Subramanian, tree (https://www.evolgenius.info/evolview) amino with phylogenetic The 2011) The al., 2020). tree. McKenna, ML & an Andersson, Schwartz, with of Schneider, (OBPs) program proteins BLAST glabripennis binding NCBI A. odorant the and using (ORs) receptors annotated manually were families taneum gene chemosensory The | | oprtv eoi analysis genomic Comparative dnicto ftehmsnoygn families gene thechemosensory of Identification Zootermopsisnevadensis T.castaneum eune sqeis(tp/batnb.l.i.o/ls.g) h rti eune folfactory of sequences protein The (http://blast.ncbi.nlm.nih.gov/Blast.cgi). queries as sequences .glabripennis A. Al)wr onoddfo eBn Adrsn eln,&Mthl,21;Mitchell, 2019; Mitchell, & Keeling, (Andersson, GenBank from downloaded were (Agla) Wn ta. 2014), al., et (Wang Rcad ta. 2008), al., et (Richards obxmori Bombyx eiuu humanus Pediculus C.elegans MKnae l,2016), al., et (McKenna Trao ta. 04,and 2014), al., et (Terrapon a sda notru.TeCdM oe Shbure l,2012) al., et (Schabauer model CodeML The outgroup. an as used was Xae l,2004), al., et (Xia D.melanogaster .formicarius C. .formicarius C. CforOBP4-36 Ptedihe l,2006), al., et (Pittendrigh Onthophagustaurus 6 .borbonicus O. 
... (Gelbart, 1992), and 15 published whole-genome sequences, namely, Apis mellifera ... T. castaneum (Tcas) ... D. ponderosae (Dpon) ... Caenorhabditis elegans ... Cimex lectularius ... Acyrthosiphon pisum ... Agrilus planipennis ... (Sequencing Consortium, 2006) ... (Consortium, 1998) ... (Choi et al., 2010) ... (Meyer et al., 2016) ... (Rosenfeld et al., ... (International ... (Keeling ... were evaluated by real-time quantitative reverse ...

... of the PrimeScript RT reagent Kit with gDNA Eraser (TaKaRa, Beijing, China). Three reference genes, namely, CforUBE4A (GenBank accession MH 716465), CforGAPDH (GenBank accession MT 512411) and Cforβ-actin (GenBank accession MT 512412), were used as the controls (Hua et al., 2021). The specific primers are listed in Table S1. qRT-PCR was performed using TB Green Premix Ex Taq II (TaKaRa, Beijing, China) in a CFX96 Real-Time PCR Detection System (Bio-Rad, Richmond, CA, USA). Thermal cycling was performed using the following parameters: initial denaturation at 94 °C for 30 s, followed by 40 cycles of 94 °C for 5 s and 60 °C for 30 s. According to the dilution concentration and the corresponding CT value, a standard curve was generated, and the amplification efficiency was calculated by the equation E = 10^(-1/slope). The means and standard errors were calculated for three biological replicates with three technical replicates of each tissue and control. The relative Ct values were analysed using the 2^(-ΔΔCT) method.

2.8 | Expression and purification of recombinant CforOBP4-6 proteins

Recombinant CforOBP4-6 were expressed in Escherichia coli cells. Prokaryotic expression primers (Table S1) were designed for CforOBP4-6 (the signal peptide was removed) and contained the BamHI and EcoRI restriction sites. The positive clone was cultured in 5 mL of Luria-Bertani (LB) liquid medium containing kanamycin (50 μg/mL) and grown overnight at 37 °C with shaking (220 r/m). The culture was then diluted to 500 mL in LB medium and cultured at 37 °C. Recombinant protein expression was induced by the addition of isopropyl-β-D-1-thiogalactopyranoside (IPTG) at a final concentration of 1 mM when the OD (600 nm) of the culture reached 0.6 to 0.8. After 4 h of incubation at 28 °C with shaking (180 r/m), the cells were harvested by centrifugation (12000 r/m, 10 min, 4 °C) and sonicated in binding buffer (20 mM sodium phosphate, 0.5 M NaCl, 20 mM imidazole; pH 7.4). The recombinant CforOBP4-6 proteins were purified using a Ni-ion affinity chromatography column (HisTrap HP; GE Healthcare, Piscataway, NJ, USA). The His-tag was removed using enterokinase (2 U/mg) (Sigma, St Louis, MO, USA) at 24 °C for 24 h. The purified protein solution was then dialyzed using a HiTrap desalting column (GE Healthcare). The purified proteins were analysed by 15% sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE), and the concentration of the protein was measured with the BCA Assay Kit (Sangong, Shanghai, China). The purified protein was stored at -80 °C.

2.9 | Fluorescence displacement binding assay

A fluorescence displacement binding assay was performed to determine the affinity of CforOBP4-6 for 102 volatiles (Table S2) according to the methods described in our previous study (Hua et al., 2021). The ligand binding experiment was carried out on an F-4700 fluorescence spectrophotometer (Hitachi, Japan) using N-phenyl-1-naphthylamine (1-NPN) as a fluorescent probe with excitation at 337 nm, and emission spectra were recorded from 400 nm to 550 nm. 1-NPN and all the ligands were dissolved in methanol. The binding constants for 1-NPN were measured by adding 1-NPN to 2 μM CforOBP4-6 in 50 mM Tris-HCl buffer (pH 7.4) to achieve final 1-NPN concentrations ranging from 2 to 20 μM. The binding affinities of the ligands were measured using 1-NPN as a fluorescence probe with a 1:1 stoichiometry (1-NPN and CforOBPs). The relative binding affinities of ligands were obtained from an average of three independent experiments with a stoichiometry of 1:1 (ligand:protein). The dissociation constant (Kd) of the CforOBP4/1-NPN complex and the binding curves were calculated from the relative Scatchard plots using GraphPad Prism 8 software (GraphPad, La Jolla, CA, USA). The binding affinity (Ki) of the competitors was calculated based on the IC50 values using the equation Ki = [IC50]/(1 + [1-NPN]/K1-NPN), where [1-NPN] is the free concentration of 1-NPN and K1-NPN is the dissociation constant of the CforOBP4/1-NPN complex.
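The Ki equation given for the competitive binding assay can be illustrated with a short calculation. The following is a minimal sketch, not the authors' analysis script: the function name and the numeric inputs are hypothetical and chosen only to make the arithmetic easy to follow, and all concentrations are assumed to be in μM.

```python
def ki_from_ic50(ic50_um: float, free_npn_um: float, kd_npn_um: float) -> float:
    """Return Ki = IC50 / (1 + [1-NPN] / K_1-NPN), with every input in micromolar.

    ic50_um     -- ligand concentration displacing 50% of the probe (IC50)
    free_npn_um -- free concentration of the 1-NPN probe
    kd_npn_um   -- dissociation constant of the protein/1-NPN complex
    """
    if kd_npn_um <= 0:
        raise ValueError("the probe dissociation constant must be positive")
    return ic50_um / (1.0 + free_npn_um / kd_npn_um)


# Hypothetical inputs (not taken from the paper): IC50 = 6 uM, free probe = 2 uM,
# probe Kd = 2 uM, so Ki = 6 / (1 + 2 / 2) = 3 uM.
example_ki = ki_from_ic50(ic50_um=6.0, free_npn_um=2.0, kd_npn_um=2.0)
print(f"Ki = {example_ki:.2f} uM")
```

A lower Ki indicates stronger binding, because less competitor is needed to displace the probe once the probe's own affinity has been accounted for.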
3 | RESULT

3.1 | Genome size evaluation

The C. formicarius genome size was estimated by k-mer analyses of the Illumina DNA data. The average k-mer depth was 61 (i.e., the main peak), and the depth at twice that of the main peak was a repetitive peak (Figure S1). The k-mer depth at half the depth of the main peak was heterozygous. The total number of k-mers obtained from the sequencing data was 24,314,168,078. After removing abnormal k-mers, a total of 22,494,390,501 k-mers were used to evaluate genome size and characteristics, and the genome size was estimated to be 364.51 Mb. The k-mer distribution analysis showed that the heterozygosity was approximately 0.43% and that the repetitive sequence content was approximately 28.60% (Table S3). There was no obvious heterozygous peak. A quality check did not find plant and microbe contamination (Figure S2). These characteristics implied that the genome of C. formicarius is a simple genome, which is conducive to the construction of a genome map.

3.2 | Illumina and PacBio sequencing and genome assembly

To assemble and annotate a draft genome of C. formicarius, we used a hybrid sequencing approach to generate short and long reads from both the Illumina and PacBio platforms. Using Illumina PE sequencing, approximately 27.64 Gb of raw data was obtained, and removal of low-quality reads and adapter sequences resulted in 27.40 Gb of clean data (Table S4). For long-read genomic sequencing on the PacBio platform, we obtained 2,884,459 reads and 34,139,672,427 bases (~100x depth) after filtering raw reads. The statistical data showed that the mean subread length was 11.84 kb, the read N50 length was 18.75 kb, and the longest read was 89.71 kb. Finally, we obtained 27.40 Gb of short clean reads and 34.14 Gb of long clean reads (Table S5), which were combined to assemble the C. formicarius genome. A 338.84 Mb assembly was obtained for C. formicarius with a contig N50 length of 14.97 Mb and a longest contig length of 27.31 Mb (Table 1, Table S6). The assembly genome size is similar to the mean assembly size of the previously published coleopteran genomes (ranging from 110 to 850 Mb) (Hlerema et al., 2017) and is comparable to the genome sizes of Hycleus phaleratus (308 Mbp) (Y. M. Wu, Li, & Chen, 2018) and Plastocerus angulosus (367 Mbp) (Kusy, Motyka, Bocek, Vogler, & Bocak, 2018).

To assess the assembly accuracy, the completeness of the draft genome was evaluated with the highly conserved core genes of eukaryotes in the CEGMA v2.5 database, with the single-copy orthologous genes of insects in BUSCO v3.0.1, and by mapping the Illumina data to the assembled genome. The CEGMA v2.5 analysis showed that 98.03% of the 458 conserved core genes of eukaryotes and 94.76% of the 248 highly conserved core genes of eukaryotes were found in the assembly (Table S8). From the BUSCO analysis, 1,054 single-copy orthologues (98.87%) were identified as complete and 0.56% were fragmented (Table S9). Furthermore, approximately 98.01% of the reads and 96.04% of the proper reads were mapped to the assembled genome sequences (Table S7). These analyses indicated that the assembled genome was a high-quality assembly.

3.3 | Chromosome sequence assembly

We obtained 32.83 Gb of clean reads from the Hi-C library sequencing, representing 97-fold coverage of the draft genome (338.84 Mb). In total, 80.04 Mb unique mapped read pairs were obtained, and 32.23 Mb of valid interaction pairs were generated after filtering by fragment size (Table S10). After the Hi-C interaction maps were generated and error correction was performed with PacBio long-read sequencing and Illumina PE sequencing, we finally obtained a final assembly comprising 221 contigs and 154 scaffolds, representing a total of 337.06 Mb in sequence length, with a contig N50 of 13.21 Mb and a scaffold N50 of 34.23 Mb. More importantly, 337.06 Mb of genome sequences, accounting for 99.42% of the assembled draft genome, were anchored to 11 pseudo-chromosomes, and 78 scaffolds, comprising 99.03% of the total, were ordered and oriented (Table S11, Figure 2). These results indicated that the C. formicarius genome assembly had a high degree of continuity and completeness.

3.4 | Repeat annotation

In total, the C. formicarius genome was found to contain 157.51 Mb of repetitive sequences (approximately 46.49% of the assembled genome), of which 41.55% were transposable elements (TEs) (Table S12). DNA transposons accounted for 21.61% of the assembled genome, and the most common classification assigned to these repetitive elements was terminal inverted repeat (TIR) DNA transposons (length of 64.78 Mb, 19.11% of assembly) (Table 2). The proportion of repetitive sequences in the C. formicarius genome was higher than that in other assembled coleopteran genomes, such as the D. ponderosae (17% and 23% of assembly for males and females, respectively) (Keeling et al., 2013), Nicrophorus vespilloides (12.85% of assembly) (Cunningham et al., 2015), Hypothenemus hampei (2.7% of assembly) (Vega et al., 2015), Leptinotarsa decemlineata (17% of assembly) (Schoville et al., 2018), and Hycleus cichorii and H. phaleratus (22.73% and 13.47% of assembly, respectively) genomes, and was similar to that in the Pyrocoelia pectoralis (44.88% of assembly) (Fu et al., 2017) and Propylea japonica (58.22%) (L. Zhang et al., 2020) genomes.
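The contig and scaffold N50 values quoted for the assembly follow a standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The snippet below is a minimal sketch of that definition, assuming only a plain list of sequence lengths; the function name and the toy lengths are illustrative and are not taken from the C. formicarius assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest, and lengths are accumulated
    until at least half of the total assembly length is covered; the length of
    the sequence that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths in Mb): total = 20, half = 10.
# Sorted: 6, 5, 4, 3, 2 -> 6 + 5 = 11 >= 10, so the N50 is 5.
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, anchoring contigs into chromosome-scale scaffolds raises the scaffold N50 without changing the contig N50 much, which matches the pattern of the values reported above.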
3.5 | Gene prediction and functional annotation

We used three different methods to predict protein-coding genes in the C. formicarius genome, namely, ab initio, RNA-seq-based and homology-based methods. Then, we used EVM v1.1.1 software to integrate the prediction results. A total of 11,907 protein-coding genes were found in the C. formicarius genome (Table 3 and Table 4), which were divided into 9,291 gene families, of which 75 were unique gene families (Table S13). With the support of the homology-based and RNA-seq-based prediction methods, the final prediction results showed a total of 11,610 genes (accounting for 97.51% of the total predicted protein-coding genes) (Figure S3), showing a good prediction effect. Compared with the other known coleopteran genomes, the number of genes in the C. formicarius genome was similar (11,373 to 14,533 genes in the other genomes; Table S13). Functional annotation statistics showed that 11,469 genes (96.32% of the total predicted protein-coding genes) were assigned to corresponding putative gene functions (Table S14; Figure 3). We also identified and annotated various non-coding RNA sequences, including 102 rRNAs, 165 tRNAs and 40 miRNAs (Table S15).

3.6 | Species phylogeny analysis

In total, 11,011 genes of C. formicarius could be clustered with those of the other 15 insects, leaving 896 unclustered genes (Table S13). Finally, 223 species-specific genes were obtained from the gene family analysis, corresponding to 75 gene families. The phylogenetic analysis showed that there were fewer species-specific genes in C. formicarius (223) than in the 15 other species of insects, except the coleopteran insect O. taurus (55) and the hymenopteran insect A. mellifera (141) (Figure 4). Among them, the proportion of species-specific genes in the six coleopteran insect genomes ranged from 0.82% to 7.38%. The one-to-one homologous genes from the gene family analysis were used to infer the phylogeny. All bootstrap values of the nodes generated were above 90%, the majority being higher than 99%. C. elegans was the most primitive as an outgroup. The coleopteran insects from the same order were clustered together. C. formicarius and D. ponderosae were clustered together and formed a clade, clearly sharing a common ancestor, which was consistent with previous studies (McKenna et al., 2016), and the differentiation time was approximately 138.89 Mya (Figure 4). The coleopteran insect A. glabripennis diverged from the common ancestor of C. formicarius and D. ponderosae 187.54 Mya.

The numbers of expanded and contracted gene families in the most recent common ancestor of the 16 species were obtained by analysing gene family expansion and contraction across a total of 132,846 gene families. C. formicarius had 27 expanded and 16 contracted gene families, which demonstrated that the number of expanded genes in C. formicarius had increased significantly. This result indicated that C. formicarius may have experienced more duplication events than its relatives; the numbers of expanded gene families in the two other coleopteran insects that clustered with it were 31 and 28, respectively (Figure S5). The genes in C. formicarius were also the most abundant based on the multicopy homologous gene number (Figure S5). In addition, we performed GO and KEGG enrichment analyses of these expanded and contracted genes in the C. formicarius genome (Figure S6). We found lineage-specific expansion of genes related to the biosynthesis of metabolites, which may affect the biosynthesis of olfaction-related proteins and improve the sensitivity of the olfactory system.
(Figure (141) 9 8.4Ma n h iegnetm between time divergence the and Mya, 187.54 h oepea insect coleopteran the , .formicarius C. eue he ieetmtost rdc protein- predict to methods different three used We D (12,102), eefmle r rbbyivle navariety a in involved probably are families gene . ponderosae .formicarius C. .formicarius C. tal. et A . eoe(al 1) sa obvious an As S16). (Table genome glabripennis 06,adtedffrnito time differentiation the and 2016), , efudta hs ee in genes these that found We . .formicarius C. eoewssmlrt htin that to similar was genome .glabripennis A. a xlrd eannotated We explored. was .borbonicus O. D (14,533), . ponderosae and rplajaponica Propylea .formicarius C. O tal. et .ponderosae D. a 1genes 61 has . .borbon- O. o7.38% to ) borbonicus .formi- C. and 2021). , .pon- D. A C. C. . Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. o nyi h eeto fsxpeooe u loi h eaiu fsacigfrhs lns(Hua plants host for searching of behaviour vitro. the involved in in CforOBP4-6 are also of CforOBP1-3 but analysis that pheromones functional in sex suggested detected of results also reception analysis was al. the to characteristic CforOBP1 in Functional whereas compared only antennae, antennae in legs. not female enriched processing. and also chemosensory in abdomen were in enriched CforOBP1-3 the involved study, significantly exclusively previous likely were our most seven In were only genes these and Thus, heads, legs. and female antennae and is, that major the found lobe, ( of we antennal abdomen seven that the , Interestingly, female from the fact 1986). data in The (Sutherland, anatomical enriched found cantly with 7). 
was consistent (Figure dimorphism is sexual antennae females where and female males to between compared differences significant antennae 24 male the in of expression 13 ( difference; antennae significant female and no showed in samples enriched CforOBP17 antennal significantly female ( were and mouthparts s male the the in of enriched analysis were three whereas , 33 CforOBP16 the of 24 females. of and CforOBP5 transcripts males The of 7). abdomens the (Figure and legs) of the and expression analysed head The comparatively (excluding we thoraxes legs, samples, antennae), tissue the among excluding differences expression of into expression insights initial obtain To 3.8 in CforOBP24) to or identical and alone were CforOBP22 and placed and group CforOBP21, were the plus-C (DponOBP26) CforOBP11, of the and of (CforOBP10, expansion motif members large OBPs 6-cysteine as a Five identified classic were comprise the pattern. OBPs radiation encode identified small in genes of chemosensation a remaining majority in The the roles play and 2005). to subfamily, Smith, known minus-C & also are Jones, that Atkinson, genes Xu, of (P. family for essential suggested an been constitute OBPs has As for number food. and large hosts findings in of difficulties numbers insect large of The typical subfamilies in radiation seven 8 paralogous group planipennis all frequent as of of placed formicarius and representatives pattern identified C. the include was follow Many ORs These of and 150). lineage 4/6 new 5). group One Figure except chemoreceptors. 
ORs S17: (Table were of CforOrco OBPs) 36 coreceptor and OR ORs between (154 activity components chemosensory 190 and the of in total involved in a families expansion which gene gene in massive families, indicating OBP and found, OR the the are of exceptions in genome SNMPs the 4 general, and In CSPs, 13 IRs, 39 GRs, (McKenna 46 (SNMPs) proteins membrane neuron CforOBP32 CforOBP27 01.T ute td h oeo frBsi latr eonto of recognition olfactory in CforOBPs of role the study further To 2021). , | CforOBP35 A iseepeso rfieof profile expression Tissue . glabripennis , 9in 79 , , , .formicarius C. CforOBP7 CforOBP19 CforOBP17 CforOBPs and a hshrortelre dnie netO eetie eas hr r 6Osin ORs 46 are there because repertoire, OR insect identified larger the harbour thus may , CforOBP28 .Nevertheless, ). .formicarius C. CforOBP33 .ponderosae D. .glabripennis A. Fgrs5ad6). and 5 (Figures CforOBP .formicarius C. , ntemi hmsnoytsu,teatna,has(h hl edcapsule head whole (the heads antennae, the tissue, chemosensory main the in .formicarius C. CforOBP9 , , CforOBP20 CforOBP19 R r ntne ras(iue5 n r eie rmrcn expansions. recent from derived are and 5) (Figure arrays tandem in are ORs , .Ms fthe of Most ). CforOBP29 a edet h motneo hmclcmuiainaogweevils. among communication chemical of importance the to due be may CforOBP16 a etitdt h anceoesr ise atna n heads) and (antennae tissues chemosensory main the to restricted was s 2 in 121 , AlOP5 Fgr 6). (Figure (AglaOBP15) , CforOBPs CforOBP10 and .formicarius C. CforOBP10 , , CforOBP21 CforOBP20 noe opnnssmlrt hs fohrweis h notable The weevils. other of those to similar components encodes .glabripennis A. CforOBP T , CforOBP32 , CforOBP CforOBP23 . .formicarius C. castaneum tal. et .formicarius C. CforOBP4 , in CforOBP5 CforOBP11 10 , .formicarius C. 
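The fold differences in tissue expression discussed above rest on the qRT-PCR calculations named in the methods: the amplification efficiency E = 10^(-1/slope) from a standard curve, and relative expression by the 2^(-ΔΔCT) method. The following is a minimal sketch of both formulas. It assumes a single reference gene and roughly 100% efficiency for simplicity, whereas the study used three reference genes; all function names and Ct values are hypothetical.

```python
def amplification_efficiency(slope: float) -> float:
    """Return E = 10^(-1/slope) from the slope of a Ct-versus-log10(dilution) curve."""
    if slope == 0:
        raise ValueError("the standard-curve slope must be non-zero")
    return 10 ** (-1.0 / slope)


def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Return the 2^(-ddCt) fold change of a target gene in a sample versus a control."""
    delta_ct_sample = ct_target_sample - ct_ref_sample      # normalise the sample
    delta_ct_control = ct_target_control - ct_ref_control   # normalise the control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# A slope of about -3.32 corresponds to a doubling per cycle (E close to 2).
print(round(amplification_efficiency(-3.3219), 3))

# Hypothetical Ct values: sample dCt = 22 - 18 = 4, control dCt = 25 - 18 = 7,
# ddCt = -3, so the target is 2^3 = 8-fold higher in the sample.
print(relative_expression(22.0, 18.0, 25.0, 18.0))
```

With several reference genes, one common choice is to normalise against the mean of their Ct values before computing ΔCT; that variant is not shown here.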
3.9 | Fluorescence binding assay

Recombinant CforOBP4-6 proteins, expressed predominantly in antennae, were produced using a prokaryotic expression system. The prokaryotic expression vectors pET30a/CforOBP4-6 were successfully expressed in E. coli (Figure 8A). Purified recombinant mature proteins were obtained by cleaving the His-Tag using enterokinase. The purified recombinant proteins were examined by SDS-PAGE, as shown in Figure 8B. To determine the binding specificity of antennae-enriched CforOBP4-6, 102 odorant compounds, including sex pheromones and host plant volatiles (Table S4), were chosen as ligands to characterize the binding properties of CforOBP4-6. The Kd values for CforOBP4-6 bound to 1-NPN were 3.295, 1.064 and 3.491 μM, respectively (Figure 9A). Representative 1-NPN displacement curves were displayed in Figure 5A. Based on these curves, the values of the median inhibitory concentration (IC50, displacement of more than 50% of 1-NPN) and the reciprocal of the dissociation constant (Ki) were calculated (Figure 9). The binding test results indicated that the Ki values of recombinant CforOBP4-6 with the sex pheromones were 1.564, 3.072 and 1.351 μM, respectively (Figure 5B-D). Among the 43 tested sweet potato volatiles, kaempferol-3-glucoside showed the highest binding affinity for CforOBP5, with a Ki of 0.289 ± 0.028 μM, followed by the affinity of kaempferol 3,7,4'-trimethyl ether for CforOBP5, with a Ki of 1.370 ± 0.093 μM, and those of tiliroside, 1-aminoanthracene and rhamnetin for CforOBP6, with Ki values of 1.690 ± 0.114, 7.05 ± 0.305 and 8.112 ± 0.151 μM, respectively (Table 5), indicating strong binding affinities (Figure 9B-D).

4 | DISCUSSION

C. formicarius is the most important pest of sweet potato and invades many of the main areas of sweet potato production throughout the tropics and subtropics (Capinera, Jansson, & Raman, 1999; Hiroyoshi et al., 2016; Reddy, Zhao, & Humber, 2014). With continued research on C. formicarius, its morphological, ecological and physiological characteristics have become clear. The mitochondrial genome (H. Yang & Li, 2019) and the transcriptome of C. formicarius (Bin et al., 2017; Ma et al., 2016) have been previously studied. However, abundant genomic resources and genome-wide molecular markers of C. formicarius and its related species are still lacking. The genome sequence of C. formicarius provides a novel resource for Brentidae, many species of which are economically and ecologically important pests in agriculture. At the same time, it provides comparative data for research on coleopteran and other insects. In this study, we report the first draft genome sequence of C. formicarius at the chromosome level using the Illumina and PacBio sequencing platforms and Hi-C technology, which provides the basis for evolutionary research. With recent developments in sequencing technology, chromosomal-level genome assembly using long-read sequencing strategies and Hi-C technology was also reported recently in insects (Y. Li, Park, Smith, & Moran, 2019; Liu et al., 2019; Meng et al., 2020; J. Yang et al., 2020). The high-quality genome that we assembled could provide an important resource for the molecular ecological development of C. formicarius.

The C. formicarius genome size was determined to be approximately 338.84 Mb using SMRT assembly. K-mer distribution analysis showed that the genome size was approximately 364.51 Mb (Figures S1 and S2). The C. formicarius genome was small and homozygous compared with the genomes of other known coleopteran insects (Evans et al., 2018; Fu et al., 2017; L. Zhang et al., 2020). The low heterozygosity indicates a low level of genetic diversity in the population; it might be due to a narrow feeding habitat or to single-pair mating.

Hi-C decodes the 3D structure of chromatin by detecting genome-wide DNA interactions (Belaghzal, Dekker, & Gibcus, 2017). Hi-C-assisted genome assembly is mainly based on two principles: the interaction between DNA fragments within chromosomes is greater than that between chromosomes, and the interaction signals and their linear distance are subject to a power law (Dekker, Rippe, Dekker, & Kleckner, 2002; Flot, Marie-Nelly, & Koszul, 2015). Therefore, this technology can be used to determine the chromosomal locations of most of the sequences in the preliminary assembled draft genome and to identify the group, order and orientation of these sequences, and it has been successfully used to assist genome assembly (Dudchenko et al., 2017; Rusk, 2014).
Using Hi-C sequence data, 221 contigs were constructed on 11 chromosomes with a genome size of 337.06 Mb, a scaffold N50 of 34.23 Mb and a contig N50 of 14.97 Mb. In addition, 11,907 protein-coding genes were predicted in the genome, and 96.32% of the genes were functionally annotated. To date, of all published coleopteran genomes found through a search in public databases, the C. formicarius assembly is the most comprehensive, with the longest N50 of 14.97 Mb (Ando et al., 2018; Y. M. Wu et al., 2018; L. Zhang et al., 2020). The completeness and high quality of this assembly are comparable to those of other high-quality insect genomes, providing a new paradigm for future assemblies of coleopteran genomes and even other insect genomes.

The C. formicarius genome assembly will facilitate evolutionary studies. The phylogenetic analysis of 16 species representing five orders (including six coleopteran species) using 745 single-copy genes showed that the coleopteran insects were clustered in the same lineages and diverged from the ancestor of the dipteran and lepidopteran insects approximately 410 Mya, consistent with the results of L. Zhang et al. (2020). The estimated divergence time between the ancestor of C. formicarius, D. ponderosae and A. glabripennis and T. castaneum was approximately 223.47 Mya. A. glabripennis diverged from the ancestor of C. formicarius and D. ponderosae approximately 187.54 Mya; C. formicarius was sister to D. ponderosae and diverged from D. ponderosae approximately 138.89 Mya, all of which are consistent with a previous phylogenetic analysis based on ML, with 523 orthologues from 15 insect species (McKenna et al., 2016), and with an analysis of 95 nuclear protein-coding genes from 373 beetle species based on two ML methods (RAxML and IQ-TREE) and a Bayesian approach (ExaBayes) (S. Q. Zhang et al., 2018).

Some key functional genes were also identified; these functional genes are associated with neurotransmission and environmental adaptation (N. Wu et al., 2019). Among these expanded genes, we identified genes that appeared to expand in C. formicarius, as well as ORs, GRs, IRs, OBPs, CSPs, and SNMPs, revealing their ability to interact with a diverse chemical environment (Keeling et al., 2013; Richards et al., 2008). These genes may play a role in environmental adaptation and are potential targets for biological control (Bargmann, 2006; Sato et al., 2008; Tiwari, Karpe, & Sowdhamini, 2019; Wurm et al., 2011). Studies have shown that the male adults have a high affinity for sex pheromones released by the females (Coffelt et al., 1978; Heath et al., 1986), but the mechanisms of sex pheromone perception, especially ORs, have not been reported. At the same time, the binding characteristics of CforOBP4-6 were identified. The fluorescent competitive binding assay results indicated that CforOBP4-6 had strong binding affinities for sex pheromones and other ligands. These putative candidates will be particularly important for further research; for example, exploration of the function of ORs will help us to clarify the odorant recognition mechanism of insects at the molecular level and provide a theoretical basis for screening more effective behavioural interference factors, which offers a new strategy for pest management by regulating odorant perception behaviour. ORs are believed to play an important role in the recognition of volatile compounds by insects (Fleischer, Pregitzer, Breer, & Krieger, 2018) and are therefore frequently used to study the relationship among ecological specialization, adaptability and gene family evolution (Andersson et al., 2019; Mitchell et al., 2020). In the future, the gene families that exhibited expansion or contraction will be further studied. The functional verification of these candidate genes will help in the study of the mechanism by which some invasive species adapt to other species and environments.

In summary, we constructed the first high-quality chromosome-level genome of C. formicarius, performed comparative genomic analysis between this species and 15 other species, and found that OR and OBP gene families were expanded in C. formicarius. These datasets not only provide a wealth of information for studying the genetics and evolutionary mechanisms of this species but also provide very valuable resources for further study on functional genes and the molecular mechanisms of stress resistance, allowing researchers to identify important genetic patterns of the population and advance our understanding of the molecular mechanisms underlying the processes of tolerance to insecticides and to abiotic and biotic stresses. They will also accelerate studies on population genetics, which will facilitate the development of IPM of C. formicarius.
male 2015; with (ExaBayes), the al., approach et (Coffelt that Bayesian (Mckenna shown a species have and (McKenna beetle IQ-TREE) Studies species 373 and insect from (RAxML 15 genes methods from protein-coding ML orthologues two nuclear 523 on 95 with based ML, and on 2016) based , analysis phylogenetic previous a and . ponderosae C C . formicarius . tal. et formicarius and and 98 Heath 1978; , .glabripennis A. C iegdfrom diverged eoeasml ilfcltt vltoaysuis h hlgntcaayi f16 of analysis phylogenetic The studies. evolutionary facilitate will assembly genome . formicarius tal. et C a prxmtl 2.7Mya. 223.47 approximately was . D formicarius tal. et prxmtl 8.4Mya; 187.54 approximately 96,bttemcaim fsxpeooepreto,especially perception, pheromone sex of mechanisms the but 1986), , . ponderosae tal. et tal. et 01.A h aetm,tebnigcaatrsiso CforOBP4- of characteristics binding the time, same the At 2011). , 09 Mitchell 2019; , 08 .Zhang L. 2018; , hs aaesntol rvd elho nomto for information of wealth a provide only not datasets These . prxmtl 3.9Ma l fwihaecnitn with consistent are which of all Mya, 138.89 approximately 12 C C . . formicarius formicarius tal. et C .glabripennis A. tal. et . formicarius 00.I h uue h eefamilies gene the future, the In 2020). , 2020). , sebyi h otcomprehensive most the is assembly opeegnm eunewill sequence genome complete A . tal. et tal. et 03 Richards 2013; , a itrto sister was iegdfo h netrof ancestor the from diverged 00.Clotrninsects Coleopteran 2020). , C . formicarius tal. et D C tal. et . 00.The 2020). , . ponderosae performed , formicarius 2008). , tal. et T . , Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. a,W,Kjm,K . oay .(05.RpaeUdt,adtbs frpttv lmnsin elements repetitive of ecology. 
[...] development of IPM of C. formicarius.

Acknowledgements

This research was supported by the National Natural Science Foundation of China (31660627), China Agriculture Research System (CARS-10-B3 and CARS-10-C19), Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), Science and Technology Development Foundation of Guangxi Academy of Agricultural Sciences (Guinongke2017JZ05 and 2018YM18), Natural Science Foundation of Guangxi Province of China (2016JJB130253), Guangxi Innovation Team Construction Project (nycytxgxcxtd-11-03), Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX18-2126) and Jiangsu Students' Platform for Innovation and Entrepreneurship Training Program (202010320111Y).

Data Accessibility

The raw genome sequencing reads and assembly were deposited in the NCBI Sequence Read Archive (SRA), with BioProject accession nos. PRJNA725324 and PRJNA725325 and BioSample accession no. SAMN18917023. The final chromosome assembly was submitted to NCBI Assembly under accession no. SRR14368409. Raw Illumina, PacBio, Hi-C and RNA-seq reads have been deposited into the NCBI SRA under accession nos. SRR14373957, SRR14368409, SRR14373634, and SRR14373956, respectively.

Author Contributions

Z.L., J.H., D.M. and T.C. designed the study; J.H. and Y.F. performed the research; [...] collated the samples; [...] analysed the data; [...] wrote the manuscript; [...] revised the manuscript. All authors approved the final manuscript.

Conflicts of Interest

The authors declare that they have no competing interests.

References

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic local alignment search tool. J Mol Biol, 215(3), 403-410. doi:10.1016/S0022-2836(05)80360-2
Andersson, M. N., Keeling, C. I., & Mitchell, R. F. (2019). Genomic content of chemosensory genes correlates with host range in wood-boring beetles (Dendroctonus ponderosae, Agrilus planipennis, and Anoplophora glabripennis). BMC Genomics, 20(690), 1-18. doi:10.21203/rs.2.11535/v1
Ando, T., Matsuda, T., Goto, K., Hara, K., Ito, A., Hirata, J., ... Niimi, T. (2018). Repeated inversions within a pannier intron drive diversification of intraspecific colour patterns of ladybird beetles. Nat Commun, 9(3843), 1-13. doi:10.1038/s41467-018-06116-1
Andrews, S. (2014). FastQC A Quality Control tool for High Throughput Sequence Data.
Attwood, T. K., & Beck, M. E. (1994). PRINTS–a protein motif fingerprint database. Protein Eng, 7(7), 841-848. doi:10.1093/protein/7.7.841
Austin, D. F. (1988). The taxonomy, evolution and genetic diversity of sweet potatoes and related wild species. Exploration Maintenance & Utilization of Sweet Potato Genetic Resources, Rep Sweet Potato Planning Conf.
Bairoch, A. (1991). PROSITE: a dictionary of sites and patterns in proteins. Nucleic Acids Res, 19 Suppl, 2241-2245. doi:10.1093/nar/19.suppl.2241
Bao, W., Kojima, K. K., & Kohany, O. (2015). Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob DNA, 6(11), 1-6. doi:10.1186/s13100-015-0041-9
Bargmann, C. I. (2006). Comparative chemosensation from receptors to ecology. Nature, 444(7117), 295-301. doi:10.1038/nature05402
(1978). e. T. C. W. Consortium, Mcclellan, & L., 7 potato L. Entomology, sweet Sower, Environmental the of W., biology K. The Vick, weevil, (1954). A., T. J. L. Newsom, Coffelt, & B., M. Christian, T., RNA O. weevil. Deen, (2016). G. L., Smagghe, K. Cockerham, . Weevil Gene Sweetpotato . (2010). African . the J. C., against 6 Niblett, Andrews, strategy Rep, A., & biopesticide Bailey, P., promising M., a Ghislain, A. I., interference: Moczek, Pertry, K., Y., Prentice, Yang, O., H., Christiaens, Tae, perspec- taurus. E., global Onthophagus beetle a Snell-Rood, horned management. the T., in pest discovery Kijimoto, potato H., Sweet J. (1999). Choi, V. K. Raman, & K., R. Chromosome- tive. Jansson, interactions. (2013). L., J. chromatin J. Shendure, on Capinera, & based O., assemblies J. genome Kitzman, doi:10.1038/nbt.2727 novo R., 1119-1125. de Qiu, P., of R. scaffolding Patwardhan, DNA. A., scale genomic Adey, human N., in J. structures Burton, gene database complete ProDom of The Prediction (1997). (2005). S. 268 human D. Karlin, 3D. & in Kahn, C., on role & Burge, emphasis S., future more Dalmar, and Y., present, families: Beausse, doi:10.1093/nar/gki034 past, domain S., (Insecta). H., its Carrere, protein Coleoptera of C. E., of review in Courcelle, Lyal, a C., names F., Bru, Family-group potato: J. Sweet Lawrence, (2011). (2007). B. A., nutrition. C. A. M. A. Smith, Alonso-Zarazaga, Bovell-Benjamin, E., . A. Davies, doi:10.3897/zookeys.88.807 . Y., data. sequence Illumina . Bousquet, for trimmer P., flexible a Bouchard, Trimmomatic: (2014). B. M. Usadel, 30 & Schneider, Bioinformatics, M., . Lohse, M., . A. . Bolger, 2003. E., in TrEMBL Gasteiger, supplement A., its Estreicher, and C., knowledgebase 31 genes. M. protein Blatter, SWISS-PROT identify The R., to Apweiler, (2003). geneid A., Using Bairoch, ex- B., and Boeckmann, (2007). R. transcriptome Guigo, Antennal & 4 Genomewise. G., Chapter (2017). and Parra, GeneWise T. E., J. Blanco, (2004). Lin, R. 
Durbin, & & Z., formicarius. M., doi:10.1101/gr.1865504 Z. Cylas Clamp, Wu, weevil E., sweetpotato H., Birney, the X. in Pu, genes Q., olfactory doi:10.1038/s41598-017-11456-x M. of analyses Qu, high-resolution for pression Y., procedure Hi-C S. optimized An Bin, 2.0: Hi-C conformation. (2017). chromosome H. of J. mapping Gibcus, genome-wide & J., Dekker, H., Belaghzal, 1,3530 doi:10.1093/nar/gkg095 365-370. (1), 1,7-4 doi:10.1006/jmbi.1997.0951 78-94. (1), h lrd noooit 82 Entomologist, Florida The 386,11.doi:10.1038/srep38836 1-11. (38836), ehia ultnLusaaArclua xeietSain 483 Station, Experiment Agricultural Louisiana Bulletin Technical ya formicarius Cylas cec,282 Science, d odNt e,52 Res, Nutr Food Adv 1,12.doi:10.1002/0471250953.bi0403s18 1-28. (1), 1) 1422.doi:10.1093/bioinformatics/btu170 2114-2120. (15), 59) 0221.doi:10.1126/science.282.5396.2012 2012-2018. (5396), lgnuu aoaoybosa n vdneframlil opnn system. component multiple a for evidence and bioassay laboratory elegantulus 5,7678 doi:10.1093/ee/7.5.756 756-758. (5), 1,15.doi:10.1016/S1043-4526(06)52001-7 1-59. (1), 1,15 doi:10.2307/3495845 125. (1), M eois 11 Genomics, BMC 14 ehd,123 Methods, uli cd e,33 Res, Acids Nucleic 66.doi:10.1016/j.ymeth.2017.04.004 56-65. , 0.doi:10.1186/1471-2164-11-703 703. , 1) 1-30. (10), eoeRs 14 Res, Genome Dtbs su) D212-215. issue), (Database urPoo Bioinformatics, Protoc Curr a itcnl 31 Biotechnol, Nat oky,88 Zookeys, c e,7 Rep, Sci ya brunneus Cylas uli cd Res, Acids Nucleic 5,988-995. (5), o Biol, Mol J 1,11073. (1), 1,1-972. (1), (12), . Sci Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. u . i . in . un . hn,S,Lu . u .(07.Ln-edsqec sebyof assembly sequence Long-read phasing (2017). and J. Hu, scaffolding . . . genomics: Q., Liu, Contact S., Zhang, W., Quan, (2015). 
Cunningham, C. B., Ji, L., Wiberg, R. A., Shelton, J., McKinney, E. C., Parker, D. J., ... Moore, A. J. (2015). The Genome and Methylome of a Beetle with Complex Social Behavior, Nicrophorus vespilloides (Coleoptera: Silphidae). Genome Biol Evol, 7(12), 3383-3396. doi:10.1093/gbe/evv194
De Bie, T., Cristianini, N., Demuth, J. P., & Hahn, M. W. (2006). CAFE: a computational tool for the study of gene family evolution. Bioinformatics, 22(10), 1269-1271. doi:10.1093/bioinformatics/btl097
Dekker, J., Rippe, K., Dekker, M., & Kleckner, N. (2002). Capturing chromosome conformation. Science, 295(5558), 1306-1311. doi:10.1126/science.1067799
Duan, J., Kuhn, K. L., Gibbs, R. A., Worley, K. C., Murali, S. C., Lee, S. L., ... Richards, S. (2019). Agrilus planipennis genome assembly v1.0. Ag Data Commons. doi:10.15482/USDA.ADC/1503806
Dudchenko, O., Batra, S. S., Omer, A. D., Nyquist, S. K., Hoeger, M., Durand, N. C., ... Aiden, E. L. (2017). De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science, 356(6333), 92-95. doi:10.1126/science.aal3327
Edgar, R. C., & Myers, E. W. (2005). PILER: identification and classification of genomic repeats. Bioinformatics, 21(S1), i152-i158. doi:10.1093/bioinformatics/bti1003
Evans, J. D., McKenna, D., Scully, E., Cook, S. C., Dainat, B., Egekwu, N., ... Huang, Q. (2018). Genome of the small hive beetle (Aethina tumida, Coleoptera: Nitidulidae), a worldwide parasite of social bee colonies, provides insights into detoxification and herbivory. Gigascience, 7(12), 1-16. doi:10.1093/gigascience/giy138
Finn, R. D., Bateman, A., Clements, J., Coggill, P., Eberhardt, R. Y., Eddy, S. R., ... Punta, M. (2014). Pfam: the protein families database. Nucleic Acids Res, 42(27), 1-9. doi:10.1093/nar/gkt1223
Fleischer, J., Pregitzer, P., Breer, H., & Krieger, J. (2018). Access to the odor world: olfactory receptors and their role for signal transduction in insects. Cell Mol Life Sci, 75(3), 485-508. doi:10.1007/s00018-017-2627-5
Flot, J. F., Marie-Nelly, H., & Koszul, R. (2015). Contact genomics: scaffolding and phasing (meta)genomes using chromosome 3D physical signatures. FEBS Lett, 589(20 Pt A), 2966-2974. doi:10.1016/j.febslet.2015.04.034
Fu, X., Li, J., Tian, Y., Quan, W., Zhang, S., Liu, Q., ... Hu, J. (2017). Long-read sequence assembly of the firefly Pyrocoelia pectoralis genome. Gigascience, 6(12), 1-7. doi:10.1093/gigascience/gix112
Gelbart, W. M. (1992). The return of the fly. Science, 257(5075), 1421-1422. doi:10.1126/science.257.5075.1421
Gough, J., & Chothia, C. (2002). SUPERFAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments. Nucleic Acids Res, 30(1), 268-272. doi:10.1093/nar/30.1.268
Griffiths-Jones, S., Grocock, R. J., van Dongen, S., Bateman, A., & Enright, A. J. (2006). miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res, 34(1), D140-D144. doi:10.1093/nar/gkj112
Guindon, S., Dufayard, J. F., Lefort, V., Anisimova, M., Hordijk, W., & Gascuel, O. (2010). New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol, 59(3), 307-321. doi:10.1093/sysbio/syq010
Gunter, N. L., Oberprieler, R. G., & Cameron, S. L. (2016). Molecular phylogenetics of Australian weevils (Coleoptera: Curculionoidea): exploring relationships in a hyperdiverse lineage through comparison of independent analyses. Austral Entomology, 55(2), 217-233. doi:10.1111/aen.12173
Haas, B. J., Delcher, A. L., Mount, S. M., Wortman, J. R., Smith, R. K., Jr., Hannick, L. I., ... White, O. (2003). Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res, 31(19), 5654-5666. doi:10.1093/nar/gkg770
Haas, B. J., Salzberg, S. L., Zhu, W., Pertea, M., Allen, J. E., Orvis, J., ... Wortman, J. R. (2008). Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol, 9(1), 1-22. doi:10.1186/gb-2008-9-1-r7
Haft, D. H., Selengut, J. D., & White, O. (2003). The TIGRFAMs database of protein families. Nucleic Acids Res, 31(1), 371-373. doi:10.1093/nar/gkg128
Han, Y., & Wessler, S. R. (2010). MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences. Nucleic Acids Res, 38(22), 1-8. doi:10.1093/nar/gkq862
Hardee, D. D., Jones, G. D., & Adams, L. C. (1999). Emergence, movement, and host plants of boll weevils (Coleoptera: Curculionidae) in the Delta of Mississippi. Journal of Economic Entomology, 92(1), 130-139. doi:10.1093/jee/92.1.130
Heath, R. R., Coffelt, J. A., Sonnet, P. E., Proshold, F. I., Dueben, B., & Tumlinson, J. H. (1986). Identification of sex pheromone produced by female sweetpotato weevil, Cylas formicarius elegantulus (Summers). J Chem Ecol, 12(6), 1489-1503. doi:10.1007/BF01012367
Hedges, S. B., Dudley, J., & Kumar, S. (2006). TimeTree: a public knowledge-base of divergence times among organisms. Bioinformatics, 22(23), 2971-2972. doi:10.1093/bioinformatics/btl505
Hiroyoshi, S., Kohama, T., & Reddy, G. V. P. (2016). Age-related sperm production, transfer, and storage in the sweet potato weevil, Cylas formicarius (Fabricius) (Coleoptera: Curculionidae). Journal of Insect Behavior, 29(6), 689-707. doi:10.1007/s10905-016-9590-0
Hlerema, I., Laurie, S., & Eiasu, B. (2017). Preliminary observations on use of Beauveria bassiana for the control of the sweet potato weevil (Cylas sp.) in South Africa. Open Agriculture, 2(1), 595-599. doi:10.1515/opag-2017-0063
Hua, J., Pan, C., Huang, Y., Li, Y., Li, H., Wu, C., ... Li, Z. (2021). Functional characteristic analysis of three Odorant-binding proteins from the sweet potato weevil (Cylas formicarius) in the perception of sex pheromones and host plant volatiles. Pest Manag Sci, 77(1), 300-312. doi:10.1002/ps.6019
Industry, D. O. P. (2016). Primitive weevils of Florida (Insecta: Coleoptera: Brentidae: Brentinae). Entomology Nematology, 87(4), 1-4.
International Aphid Genomics, C. (2010). Genome sequence of the pea aphid Acyrthosiphon pisum. PLoS Biol, 8(2), e1000313. doi:10.1371/journal.pbio.1000313
Jones, P., Binns, D., Chang, H. Y., Fraser, M., Li, W., McAnulla, C., ... Hunter, S. (2014). InterProScan 5: genome-scale protein function classification. Bioinformatics, 30(9), 1236-1240. doi:10.1093/bioinformatics/btu031
Jurka, J., Kapitonov, V. V., Pavlicek, A., Klonowski, P., Kohany, O., & Walichiewicz, J. (2005). Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res, 110(1-4), 462-467. doi:10.1159/000084979
Kalvari, I., Argasinska, J., Quinones-Olvera, N., Nawrocki, E. P., Rivas, E., Eddy, S. R., ... Petrov, A. I. (2018). Rfam 13.0: shifting to a genome-centric resource for non-coding RNA families. Nucleic Acids Res, 46(D1), D335-D342. doi:10.1093/nar/gkx1038
Kanehisa, M., & Goto, S. (2000). KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res, 28(1), 27-30. doi:10.1093/nar/28.1.27
Katoh, K., Misawa, K., Kuma, K., & Miyata, T. (2002). MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res, 30(14), 3059-3066. doi:10.1093/nar/gkf436
Keeling, C. I., Yuen, M. M., Liao, N. Y., Docking, T. R., Chan, S. K., Taylor, G. A., ... Bohlmann, J. (2013). Draft genome of the mountain pine beetle, Dendroctonus ponderosae Hopkins, a major forest pest. Genome Biol, 14(3), 1-19. doi:10.1186/gb-2013-14-3-r27
Keilwagen, J., Wenk, M., Erickson, J. L., Schattat, M. H., Grau, J., & Hartung, F. (2016). Using intron position conservation for homology-based gene prediction. Nucleic Acids Res, 44(9), 1-12. doi:10.1093/nar/gkw092
Kim, D., Langmead, B., & Salzberg, S. L. (2015). HISAT: a fast spliced aligner with low memory requirements. Nat Methods, 12(4), 357-360. doi:10.1038/nmeth.3317
Korada, R. R., & Mukherjee, A. (2012). Management of sweet potato weevil, Cylas formicarius: a world review. Fruit, Vegetable and Cereal Science and Biotechnology, 6(S1), 79-92.
Koren, S., Walenz, B. P., Berlin, K., Miller, J. R., Bergman, N. H., & Phillippy, A. M. (2017). Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res, 27(5), 722-736. doi:10.1101/gr.215087.116
Korf, I. (2004). Gene finding in novel genomes. BMC Bioinformatics, 5(59), 1-9. doi:10.1186/1471-2105-5-59
Kusy, D., Motyka, M., Bocek, M., Vogler, A. P., & Bocak, L. (2018). Genome sequences identify three families of Coleoptera as morphologically derived click beetles (Elateridae). Scientific Reports, 8(1), 17084. doi:10.1038/s41598-018-35328-0
Kyereko, W. T., Hongbo, Z., Amoanimaa-Dede, H., Meiwei, G., & Yeboah, A. (2019). The major sweet potato weevils; management and control: a review. Entomology, Ornithology & Herpetology: Current Research, 8(2), 1-9. doi:10.35248/2171-0983.8.218
Lees, J., Yeats, C., Perkins, J., Sillitoe, I., Rentzsch, R., Dessailly, B. H., & Orengo, C. (2012). Gene3D: a domain-based resource for comparative genomics, functional annotation and protein network analysis. Nucleic Acids Res, 40(s1), D465-D471. doi:10.1093/nar/gkr1181
Letunic, I., Copley, R. R., Schmidt, S., Ciccarelli, F. D., Doerks, T., Schultz, J., ... Bork, P. (2004). SMART 4.0: towards genomic data integration. Nucleic Acids Res, 32(s1), D142-D144. doi:10.1093/nar/gkh088
Li, H., & Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25(14), 1754-1760. doi:10.1093/bioinformatics/btp324
Li, L., Stoeckert, C. J., Jr., & Roos, D. S. (2003). OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res, 13(9), 2178-2189. doi:10.1101/gr.1224503
Li, Y., Park, H., Smith, T. E., & Moran, N. A. (2019). Gene family evolution in the pea aphid based on chromosome-level genome assembly. Mol Biol Evol, 36(10), 2143-2156. doi:10.1093/molbev/msz138
Lima, T., Auchincloss, A. H., Coudert, E., Keller, G., Michoud, K., Rivoire, C., ... Bairoch, A. (2009). HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot. Nucleic Acids Res, 37(s1), D471-D478. doi:10.1093/nar/gkn661
Liu, Q., Guo, Y., Zhang, Y., Hu, W., Li, Y., Zhu, D., ... Zhou, X. N. (2019). A chromosomal-level genome assembly for the insect vector for Chagas disease, Triatoma rubrofasciata. Gigascience, 8(8), 1-8. doi:10.1093/gigascience/giz089
Lowe, T. M., & Eddy, S. R. (1997). tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res, 25(5), 955-964. doi:10.1093/nar/25.5.955
Coleoptera Beutel, terrestrial that L., Cretaceous reveals C. life the Bellamy, of ing K., tree Kanda, beetle (2016). The L., S. A. (2015). Richards, Wild, . D., interface. D. . Mckenna, beetle-plant . the M., at S. innovations doi:10.1186/s13059-016-1088-8 Geib, evolutionary R., 1-18. and Kirsch, ( functional K., beetle key Hoover, longhorned reveals Y., Asian [Erratum Pauchet, the D., glyphosate of E. Genome of Scully, D., isotherms D. adsorption McKenna, pH-dependent (1991). CA103(25):208751y]. R. in doi:10.1021/jf00004a043 L. cited Hossner, document & proteins. to of S., Bryant, annotation J. functional . the McConnell, . for . Database C., Domain DeWeese-Scott, Conserved K., 39 M. a Derbyshire, Res, CDD: F., Chitsaz, (2011). occurrences B., H. of J. counting S. Anderson, parallel S., ab efficient Lu, source for A., approach Marchler-Bauer, open lock-free two fast, GlimmerHMM: A and (2011). k-mers. TigrScan C. of of Kingsford, (2004). analysis & L. G., expression Marcais, S. gene Salzberg, and & Transcriptome gene-finders. M., eukaryotic Pertea, initio (2016). H., S. W. Chen, Majoros, & B., Gao, doi:10.1093/jisesa/iew053 X., Li, formicarius R., Cylas Wang, J., Ma, 2) 9323.doi:10.1093/bioinformatics/btt509 2933-2935. (22), 1,D2-29 doi:10.1093/nar/gkq1189 D225-D229. (1), iifrais 23 Bioinformatics, iifrais 27 Bioinformatics, 1) 9620.doi:10.1038/nprot.2015.127 1986-2003. (12), euei bassiana Beauveria Clotr:Betde uigdffrn eeomn stages. development different during Brentidae) (Coleoptera: lrciagrahami Aldrichina 9,16-07 doi:10.1093/bioinformatics/btm071 1061-1067. (9), iifrais 20 Bioinformatics, 6,7470 doi:10.1093/bioinformatics/btr011 764-770. (6), and npohr glabripennis Anoplophora ytmtcEtmlg,40 Entomology, Systematic eahzu anisopliae Metarhizium naso ple ilg,153 Biology, Applied of Annals ora fArclua n odCeity 39 Chemistry, Food and Agricultural of Journal oesclyipratblowfly. important forensically a , 1) 8827.doi:10.1093/bioinformatics/bth315 2878-2879. 
(16), 18 osetptt weevil potato sweet to ,agoal infiativsv species, invasive significant globally a ), 4,8580 doi:10.1111/syen.12132 835-880. (4), eoeBooyadEouin 8 Evolution, and Biology Genome netMlBo,29 Biol, Mol Insect 1,4-8 doi:10.1111/j.1744- 41-48. (1), dlpre,C 21) Draft (2016). C. ¨ odelsperger, netSi 16 Sci, Insect J iacec,9 Gigascience, eoeBo,17 Biol, Genome uli cd e,33 Res, Acids Nucleic ya puncticollis Cylas Bioinformatics, uli Acids Nucleic 4,824-824. (4), 1,77-91. (1), 3,1-12. (3), 1,1-11. (1), Nature (227), Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. evn,N,Vrqax . aoe .R,Vaa . hn .J,Vr,J . ailt .(2015). E. homologous Barillot, identify to . BLAST enabling . GenBlastA: (2009). . N. P., Chen, & processing. J. J., sequences. Pei, data Vert, K., gene Wang, S., J., Hi-C J. Chu, C. for R., the She, Chen, pipeline of E., flexible genome Viara, and the R., doi:10.1186/s13059-015-0831-x optimized from B. insects an Lajoie, social N., HiC-Pro: Varoquaux, into N., Insights Servant, Erratum: (2006). Richards, Lepti- . beetle, G. mellifera. . potato Apis H. . Colorado honeybee H., the T. J. of Consortium, Bowsher, genome A., the Sequencing Bhandari, genomics: B., pest Chrysomelidae). J. agricultural (Coleoptera: Benoit, for decemlineata N., species notarsa M. model Andersson, A H., (2018). Y. S. Chen, D., S. Schoville, the of species Caucasus. Society. new the Computer A from IEEE (2016). Salamin, Nanophyinae) Forum. J. . Skuhrovec, Phd & . & K., Workshops . Schon, Symposium M., Processing Robinson-Rechavi, Distributed A., & Stamatakis, Parallel H., the Stockinger, C., (2012). Pacher, N. M., olfactory Valle, assemblies. Insect H., (2008). K. Schabauer, one-dimensional Touhara, & channels. B., ion improve L. 
Pertea, M., Kim, D., Pertea, G. M., Leek, J. T., & Salzberg, S. L. (2016). Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat Protoc, 11(9), 1650-1667. doi:10.1038/nprot.2016.095
Pittendrigh, B. R., Clark, J. M., Johnston, J. S., Lee, S. H., Romero-Severson, J., & Dasch, G. A. (2006). Sequencing of a new target genome: the Pediculus humanus humanus (Phthiraptera: Pediculidae) genome project. J Med Entomol, 43(6), 1103-1111. doi:10.1603/0022-2585(2006)43[1103:soantg]2.0.co;2
Price, A. L., Jones, N. C., & Pevzner, P. A. (2005). De novo identification of repeat families in large genomes. Bioinformatics, 21(S1), i351-i358. doi:10.1093/bioinformatics/bti1018
Rambaut, A., Suchard, M. A., Xie, D., & Drummond, A. J. (2013). Tracer v1.5. Available online at: http://beast.bio.ed.ac.uk/Tracer
Rao, S. S., Huntley, M. H., Durand, N. C., Stamenova, E. K., Bochkov, I. D., Robinson, J. T., ... Aiden, E. L. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell, 159(7), 1665-1680. doi:10.1016/j.cell.2014.11.021
Reddy, G. V. P., Zhao, Z. H., & Humber, R. A. (2014). Laboratory and field efficacy of entomopathogenic fungi for the management of the sweetpotato weevil, Cylas formicarius (Coleoptera: Brentidae). Journal of Invertebrate Pathology, 122, 10-15. doi:10.1016/j.jip.2014.07.009
Richards, S., Gibbs, R. A., Weinstock, G. M., Brown, S. J., Denell, R., Beeman, R. W., ... Bucher, G. (2008). The genome of the model beetle and pest Tribolium castaneum. Nature, 452(7190), 949-955. doi:10.1038/nature06784
Rosenfeld, J. A., Reeves, D., Brugler, M. R., Narechania, A., Simon, S., Durrett, R., ... Mason, C. E. (2016). Genome assembly and geospatial phylogenomics of the bed bug Cimex lectularius. Nat Commun, 7, 10164. doi:10.1038/ncomms10164
Ruan, J., & Li, H. (2020). Fast and accurate long-read assembly with wtdbg2. Nat Methods, 17(2), 155-158. doi:10.1038/s41592-019-0669-3
Rusk, N. (2014). Genomes in 3D improve one-dimensional assemblies. Nat Methods, 11(1), 5. doi:10.1038/nmeth.2795
Sato, K., Pellegrino, M., Nakagawa, T., Nakagawa, T., Vosshall, L. B., & Touhara, K. (2008). Insect olfactory receptors are heteromeric ligand-gated ion channels. Nature, 452(7190), 1002-1006. doi:10.1038/nature06850
Schabauer, H., Valle, M., Pacher, C., Stockinger, H., Stamatakis, A., Robinson-Rechavi, M., ... Salamin, N. (2012). SlimCodeML: An Optimized Version of CodeML for the Branch-Site Model. Paper presented at the Parallel & Distributed Processing Symposium Workshops & Phd Forum. IEEE Computer Society.
Schon, K., & Skuhrovec, J. (2016). A new species of Corimalia Gozis, 1885 (Coleoptera: Brentidae: Nanophyinae) from the Caucasus. Zootaxa, 4169(3), 571-578. doi:10.11646/zootaxa.4169.3.9
Schoville, S. D., Chen, Y. H., Andersson, M. N., Benoit, J. B., Bhandari, A., Bowsher, J. H., ... Richards, S. (2018). A model species for agricultural pest genomics: the genome of the Colorado potato beetle, Leptinotarsa decemlineata (Coleoptera: Chrysomelidae). Sci Rep, 8(1), 1931. doi:10.1038/s41598-018-20154-1
Sequencing Consortium, T. H. G. (2006). Erratum: Insights into social insects from the genome of the honeybee Apis mellifera. Nature, 444(7118), 512-512. doi:10.1038/nature05400
Servant, N., Varoquaux, N., Lajoie, B. R., Viara, E., Chen, C. J., Vert, J. P., ... Barillot, E. (2015). HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol, 16(1), 259-270. doi:10.1186/s13059-015-0831-x
She, R., Chu, J. S., Wang, K., Pei, J., & Chen, N. (2009). GenBlastA: enabling BLAST to identify homologous gene sequences. Genome Res, 19(1), 143-149. doi:10.1101/gr.082081.108
coffee J., the Nair, B. worldwide: Walker, E., coffee of Shen, pest H., insect devastating Chen, most hampei. M., the of com- S. genome sequencing: Brown, Draft next-generation E., and F. DNA Repetitive Vega, Erratum: (2012). solutions. L. and receptors. S. challenges olfactory Salzberg, putational insect & of J., traces prediction T. Molecular Topology Treangen, (2014). (2019). J. R. 55 Liebig, Sowdhamini, Biol, . & Struct . D., Opin . S. . W., Karpe, genome. Booth, . V., termite X., . Tiwari, a Meng, S., in L., organization Ji, B. M., social Rao, H. from alternative T., proteins Robertson, of of C., U. Shankavaram, Li, classification phylogenetic N., A., Terrapon, in T. developments Tatusova, new V., database: COG I. genomes. The Garkavtsev, complete genomic (2001). A., in V. D. elements E. Koonin, Natale, repetitive L., identify R. to Tatusov, RepeatMasker Using (2009). N. sequences. Chen, & transcripts. M., RNA Tarailo-Graovac, in molecular regions coding MEGA5: 43 protein of Res, (2011). Identification Acids (2015). S. M. Nucleic Borodovsky, & Kumar, parsimony A., maximum Lomsadze, & S., and Tang, M., distance, Nei, evolutionary G., likelihood, maximum Stecher, methods. using weevil N., analysis sweetpotato Peterson, genetics the D., evolutionary of Peterson, control for and K., webserver biology Tamura, a the v3: of Evolview review (2019). A (Fab). H. (1986). A. W. J. Chen, Sutherland, & S., Hu, trees. J., phylogenetic doi:10.1093/nar/gkz357 of M. management Lercher, and annotation, S., finding visualization, gene data. Gao, for reading B., server web gel Subramanian, a DNA AUGUSTUS: (2004). of B. Morgenstern, manipulation eukaryotes. & and S., in Waack, storage R., Steinkamp, the M., for Stanke, method BUSCO: 8 computer (2015). Res, new M. Acids A Nucleic E. (1980). Zdobnov, R. & Staden, V., E. Kriventseva, orthologs. single-copy P., with doi:10.1093/bioinformatics/btv351 Ioannidis, completeness 3210-3212. annotation M., and assembly R. 
genome Waterhouse, assessing A., F. Simao, rpclPs aa,32 Manag, Pest Tropical 1) -4 doi:10.1371/journal.pone.0112963 1-14. (11), c e,5 Rep, Sci o ilEo,28 Evol, Biol Mol urPoo iifrais hpe 4 Chapter Bioinformatics, Protoc Curr uli cd e,32 Res, Acids Nucleic uli cd eerh 29 Research, Acids Nucleic 22.doi:10.1038/srep12525 12525. , we oaoPs aaeetAGoa Perspective Global A Management Pest Potato Sweet J 1,1423 doi:10.1016/j.sbi.2019.05.014 194-203. (1), 1) 6339.doi:10.1093/nar/8.16.3673 3673-3694. (16), 1) -0 doi:10.1093/nar/gkv227 1-10. (12), 1) 7123.doi:10.1093/molbev/msr121 2731-2739. (10), 4,3435 doi:10.1080/09670878609371084 304-315. (4), aueRvesGntc,13 Genetics, Reviews Nature WbSre su) 3932 doi:10.1093/nar/gkh379 W309-312. issue), Server (Web 1,2-8 o:O 10.1093/nar/29.1.22 doi:DOI 22-28. (1), 1,11.doi:10.1002/0471250953.bi0410s25 1-14. (1), 20 a omn 5 Commun, Nat a omn 5 Commun, Nat uli cd e,47 Res, Acids Nucleic 2,1616 doi:10.1038/nrg3164 146-146. (2), Cylas 97 doi:10.1038/ncomms3957 2957. , 66 doi:10.1038/ncomms4636 3636. , ihakyt h etspecies pest the to key a with 13-43. , a e ee,8 Genet, Rev Nat iifrais 31 Bioinformatics, W) W270-W275. (W1), ya formicarius Cylas 1) 973- (12), PLoS Curr (19), Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. ih elwt akrdidctstefeunyo iCitrcinlnsfo o ohigh. from to low ranging from heatmap links the interaction of Hi-C key of frequency colour the The indicates chromosomes. red within dark and to between yellow light links Hi-C chromosomes. of distribution 11 the for matrix among Death contact (g) chromosomal males. Genome-wide and 2 females Figure in dimorphism sexual Evolutionary strong showed (2018). sensilla antennal feigning. P. The (f) Zhang, Male. (e) & Female. A., 1. Slipinski, Figure H., species. 
Wu, C. H., Nikolskaya, A., Huang, H., Yeh, L. S., Natale, D. A., Vinayaka, C. R., ... Barker, W. C. (2004). PIRSF: family classification system at the Protein Information Resource. Nucleic Acids Res, 32(s1), D112-D114. doi:10.1093/nar/gkh097
Wu, N., Zhang, S., Li, X., Cao, Y., Liu, X., Wang, Q., ... Zhan, S. (2019). Fall webworm genomes yield insights into rapid adaptation of invasive species. Nat Ecol Evol, 3(1), 105-115. doi:10.1038/s41559-018-0746-5
Wu, Y. M., Li, J., & Chen, X. S. (2018). Draft genomes of two blister beetles Hycleus cichorii and Hycleus phaleratus. Gigascience, 7(3), 1-7. doi:10.1093/gigascience/giy006
Wurm, Y., Wang, J., Riba-Grognuz, O., Corona, M., Nygaard, S., Hunt, B. G., ... Keller, L. (2011). The genome of the fire ant Solenopsis invicta. Proc Natl Acad Sci U S A, 108(14), 5679-5684. doi:10.1073/pnas.1009690108
Xia, Q., Zhou, Z., Lu, C., Cheng, D., Dai, F., Li, B., ... Biology Analysis, G. (2004). A draft sequence for the genome of the domesticated silkworm (Bombyx mori). Science, 306(5703), 1937-1940. doi:10.1126/science.1102210
Xu, P., Atkinson, R., Jones, D. N., & Smith, D. P. (2005). Drosophila OBP LUSH is required for activity of pheromone-sensitive neurons. Neuron, 45(2), 193-200. doi:10.1016/j.neuron.2004.12.031
Xu, Z., & Wang, H. (2007). LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res, 35(1), W265-W268. doi:10.1093/nar/gkm286
Yang, H., & Li, Y. (2019). Complete mitochondrial genome of Cylas formicarius (Coleoptera: Brentidae) from China. Mitochondrial DNA Part B-Resources, 4(1), 1241-1242. doi:10.1080/23802359.2019.1591247
Yang, J., Moeinzadeh, M. H., Kuhl, H., Helmuth, J., Xiao, P., Haas, S., ... Vingron, M. (2017). Haplotype-resolved sweet potato genome traces back its hexaploidization history. Nat Plants, 3(9), 696-703. doi:10.1038/s41477-017-0002-z
Yang, J., Wan, W., Xie, M., Mao, J., Dong, Z., Lu, S., ... Li, X. (2020). Chromosome-level reference genome assembly and gene editing of the dead-leaf butterfly Kallima inachus. Mol Ecol Resour, 20(4), 1080-1092. doi:10.1111/1755-0998.13185
Yang, Z. (2007). PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol, 24(8), 1586-1591. doi:10.1093/molbev/msm088
Zhang, L., Li, S., Luo, J., Du, P., Wu, L., Li, Y., ... Cui, J. (2020). Chromosome-level genome assembly of the predator Propylea japonica to understand its tolerance to insecticides and high temperatures. Mol Ecol Resour, 20(1), 292-307. doi:10.1111/1755-0998.13100
Zhang, S. Q., Che, L. H., Li, Y., Dan, L., Pang, H., Slipinski, A., & Zhang, P. (2018). Evolutionary history of Coleoptera revealed by extensive sampling of genes and species. Nat Commun, 9(205), 1-12. doi:10.1038/s41467-017-02644-4

Figure and Table Legends

Figure 1. Cylas formicarius. a-g indicates different phenotypes. (a) Egg. (b) Larva. (c) Pupa. (d) Female. (e) Male. (f) The antennal sensilla showed strong sexual dimorphism in females and males. (g) Death feigning.

Figure 2. Genome-wide chromosomal contact matrix of Cylas formicarius showing interactions among the 11 chromosomes and the distribution of Hi-C links between and within chromosomes. The number of log2 links was calculated as the interaction frequency. The colour key of the heatmap, ranging from light yellow to dark red, indicates the frequency of Hi-C interaction links from low to high.
a to any from ge- belong with (genes orthologous not (multiple-copy Specific N:N:N do families); families); that gene gene common common in in genes nes orthologous (single-copy 1:1:1 insects. Mya). of model orthology and gene and insects trees Phylogenetic databases. 4 five Figure on based annotation functional TrEMBL. of and diagram Venn 3 Figure igCnotu,2006), Consortium, cing 2016), al .Bnigante fteCoOP- rtist 0 hmclcompounds chemical 102 to proteins CforOBP4-6 the the of of affinities set Binding gene 5. methods consensus Table three the on of based Summary results 4 prediction Table gene of Statistics 3 elements Table repeat of Statistics for as 2 Table results [calculated assembly ability of binding (Z)-3-dodecen-1-yl(E)-2- Summary affinity. the with 1 significant proteins of Table exhibited three Comparison that these (E) ligands of values] 22 ligands. constants) and to dissociation butenoate compounds. the (D) CforOBP4 odorant of CforOBP6 of (reciprocals various Binding and 1/Ki (1-NPN). with (C), N-phenyl-1-naphthylamine probe CforOBP4-6 CforOBP5 fluorescent of (B), the for curves CforOBP4-6 of binding Affinity Competitive 9. B digested. Figure was vector pET/CforOBP4-6 the in 2,4,6, in expression A expression digested; protein not purification. total was expres- 1,4,7, CforOBP4–6 prokaryotic vector of pET/CforOBP4-6 pET/CforOBP4-6 the the analysis SDS-PAGE 1,3,5, of digestion B, (p-values enzyme vector; downregulation restriction sion Double represent female A, asterisks 8. in red Figure expression the the The and legs, levels. to upregulation, parts), expression compared represent mouth * high genes are asterisks including with black expressed red but log2 differentially The antennae by significant legs. (missing represented statistically heads are levels mark antennae, expression asterisks (adult The tissues abdomens). different and in thoraxes log2 as 36 of profile Expression 7. species. 
Figure insect other sequence from of OBPs ponderosae sources droctonus and The CforOBPs each. from of ORs S11. nated Table tree 5 in by Phylogenetic coleopteran- detailed here 6. expanded are represented massively Figure suffixes are the receptor 1-7 size, of groups tree explanation OR reduce and former To data in groups. lineages OR OR coleopteran specific major seven the indicate family. (OR) green), receptor (Agla, odorant the from of were tree Phylogenetic 5. Figure elegans lectularius Cimex pisum Acyrthosiphon tal. et < giu planipennis Agrilus Cnotu,1998). (Consortium, .5 ** 0.05; ya formicarius Cylas 2013), , (Richards rblu castaneum Tribolium .coli E. rblu castaneum Tribolium < (Rosenfeld npohr glabripennis Anoplophora .1 *** 0.01; E/frB46wsidcdb PG ,,,prfidpET/CforOBP4-6. purified 3,6,9, IPTG; by induced was pET/CforOBP4-6 ItrainlAhdGnmc,2010), Genomics, Aphid (International tal. et (Dpon). Locustamigratoria (Duan 2008), , tal. et < Co,red), (Cfor, oevle hwtedvrec ie rmtepeet(ilo er ago, years (million present the from times divergence the show values Node 0.001). (Tcas), .coli E. tal. et CforOBPs 2016), , nhpau taurus Onthophagus Ta,voe)and violet) (Tcas, E/frB46wsntidcdb PG ,,,ttlprotein total 2,5,8, B IPTG; by induced not was pET/CforOBP4-6 2019), , npohr chinensis Anoplophora edotnsponderosae Dendroctonus otrossnevadensis Zootermopsis (Wang (McKenna . ya formicarius Cylas obxmori Bombyx eta hwn h eaieepeso ee fCforOBP4- of level expression relative the showing Heatmap tal. et 22 nhpau taurus Onthophagus (Choi ya formicarius Cylas tal. et 2014), , ya formicarius Cylas eiuu humanus Pediculus tal. et (Xia 2016), , (Achi), Drosophilamelanogaster (Terrapon 2010), , tal. et Do,blue), (Dpon, bandb ieetmethods different by obtained rce borbonicus Oryctes iohlszeamais Sitophilus Oa,bon.Teclue arcs coloured The brown). 
(Otau, 2004), , h eetrsqecsincluded sequences receptor The ,poenmlclrmre;A marker; molecular protein M, edotnsponderosae Dendroctonus n i te coleopteran other six and tal. et genome (Pittendrigh npohr glabripennis Anoplophora psmellifera Apis R O,G,KEGG GO, KOG, NR, 2014), , Se) and (Szea), h Bsorigi- OBPs The Glat 1992), (Gelbart, (Meyer Caenorhabditis tal. et Tribolium (Sequen- 2006), , tal. et (Kee- Den- (A) , Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. al 1.Cmaio fteceoeetraddtxfiainspraiiso aiu spe- various of superfamilies detoxification and chemoreceptor the cies. of RNA. Comparison non-coding S16. predicted Table the genes. of predicted Statistics of S15 annotation Table functional of of Statistics sets S14 gene Table of comparison the of species. Statistics S13 sequences. Table repeat of Statistics assembly. assembly. S12. chromosome chromosome-level Table and for correction libraries error Hi-C for S11. data Table Hi-C of Summary S10. Table the for genes. evaluation eukaryotic BUSCO CEGs. conserved S9 highly Table 248 and the (CEGs) of genes evaluation eukaryotic the CEGMA to aligned S8 reads Table Illumina of reads. Summary clean S7 PacBio Table system. of PacBio distribution the Length from S6 system. derived Table Illumina reads the sequence using Summarized generated S5 reads Table the sequence of of analysis Summary 19-kmer S4. the Table of compounds. information chemical Statistical standard S3. analysis Table of expression source and gene Purity and S2. expression Table prokaryotic cloning, for qRT-PCR. used using pairs Primer S1. Table (b). of evolving expansion KEGG pathways. 
family rapidly enriched and gene highly for obtained (a) performed the was GO analysis enrichment of using of analysis out families enrichment gene carried and were annotation genes Functional S6 Figure gene contracted and expanded manilensis of sizes the and , species insect 16 of families. tree (KEGG) Phylogenetic Genomes S5 and Figure Genes of database Encyclopedia (KOG) Kyoto proteins and of (b) (d). groups (nr/nt) orthologous collection eukaryotic nucleotide (a), (c), database (GO) of Ontology annotation Gene functional Gene S4 method, Figure homology-based initio, ab methods: three by RNA-seq. genes library.and predicted NT of Distribution the S3 and Figure reads NGS 10000 of of comparisons Distribution the S2. in 19-mers Figure of frequency Distribution S1. Figure information Supplemental O . taurus .formicarius C. , C , A . .formicarius C. elegans . mellifera , , Z D . , . nevadensis C ponderosae asaesbiie orpeetdffrn Otrs b EGpathway KEGG (b) terms. GO different represent to subdivided are Bars . ya formicarius Cylas . lectularius and , O ya formicarius Cylas ya formicarius Cylas , A . ya formicarius Cylas borbonicus D . pisum . melanogaster 23 ed ntetp1 pce codn othe to according species 13 top the in reads . , T ya formicarius Cylas ya formicarius Cylas . ya formicarius Cylas castaneum .formicarius C. eoeasml ae n15 core 1054 on based assembly genome , eoeasml ae n48core 458 on based assembly genome a Oercmn nlsso expanded of analysis enrichment GO (a) B a efre yainett the to alignment by performed was . mori ya formicarius Cylas , A , . P h rp eit h most the depicts graph The . planipennis . eoeassembly. genome genome. humanus n 5ohrinsect other 15 and , , A L genome. . . glabripennis migratoria Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. 
al 1.Sqecsue o uligpyoeei trees. phylogenetic building for used Sequences S17. Table 24 Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. 25 Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. 26 Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. 27 Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. d eae e ae f h nenlsnil hwdsrn euldmrhs nfmlsadmls (g) males. and females in dimorphism sexual strong showed sensilla antennal The feigning. (f) Death Male. (e) Female. (d) 1. Figure ya formicarius Cylas - niae ieetphenotypes. different indicates a-g . 28 a g.()Lra c Pupa. (c) Larva. (b) Egg. (a) Posted on Authorea 25 Jun 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162460541.11570297/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. ih elwt akrdidctstefeunyo iCitrcinlnsfo o ohigh. from to low ranging from heatmap links the interaction of Hi-C key of colour frequency The the indicates chromosomes. red within dark and to yellow between light links Hi-C chromosomes. 
Table 1 Summary of assembly results of different methods for Cylas formicarius

Strategies: SMRT assembly (Contig); Hi-C scaffolding and gap filling (Contig, Scaffold)
Rows for each strategy: Genome size (bp); Total number; Max length (bp); N50 size (bp); N90 (bp)
Values in extraction order (cell assignment not recoverable): 14221; 13,216,042; 2,902,866; 27,328,994; 34,231,294; 20,733,671; -; 49,944,160; 154; 339,038,736; -; -; 339,045,436; -; 18,725; 4,519,195; 89,710; 2,884,459; -; 34,139,672,427

Table 2 Statistics of repeat elements

Columns: Classes; Type; Number; Length; Rate(%)
RNA: 491713; 103073651; 30.4
DNA: 367692; 73273338; 21.61
Total: 906151; 157507930; 46.46
Types listed under the classes: DIRS, LARD, LINE, LTR/Copia, LTR/Gypsy, LTR/Unknown, PLE, SINE, TRIM, Unknown; Crypton, Helitron, MITE, Maverick, TIR, Unknown; PotentialHostGene; SSR; Unknown
Remaining values in extraction order (cell assignment not recoverable): 6627233; 6610.11; 2.58; 362661; 8740806; 0.32; 658; 19.11; 0.02; 45612; 0.05; 64780365; 1086541; 2.27; 0.16; 323939; 73898; 153271; 7055; 7703503; 0.33; 558071; 495; 29512; 0.09; 0.03; 1124157; 476; 0.61; 2751; 321973; 2070686; 3940; 86177; 18.01; 61058986; 1693; 804; 444; 278835; 1.35; 0.96; 0.24; 8.22; 0.68; 3.14; 4594013; 3263649; 27870928; 801356; 10642243; 2301757; 129821; 9889; 5977; 2014; 55504; 6732

Table 3 Statistics of gene prediction results based on three methods

Columns: Method; Software and gene set; Gene number
Ab initio: Genscan, Augustus, GlimmerHMM, GeneID, SNAP
Homology-based: Anoplophora glabripennis, Dendroctonus ponderosae, Oryctes borbonicus, Tribolium castaneum, Drosophila melanogaster
RNA-seq: TransDecoder, GeneMarkS-T, PASA
Integration: EVM, 11,907
Remaining gene numbers in extraction order (row assignment not recoverable): 17,697; 14,203; 35,199; 23,437; 22,127; 8,104; 10,593; 36,128; 7,434; 10,489; 8,839; 10,499; 12,478

Table 4 Summary of the consensus gene set of the Cylas formicarius genome

Software: EVM
Gene number: 11,907
Gene length (bp): 119,607,661
Average gene length (bp): 10,045
Exon length (bp): 26,779,793
Average exon length (bp): 2,249
Intron length (bp): 92,827,868
Average intron length (bp): 7,796

Hosted file
Supplemental Information-2021.6.5.doc available at https://authorea.com/users/421871/articles/527720-chromosome-level-genome-assembly-of-the-sweet-potato-weevil-cylas-formicarius-fabricius-coleoptera-brentidae-and-functional-characteristics-of-cforobp4-6

Table 5. Binding affinities of the CforOBP4-6 proteins to 102 chemical compounds.

Columns: Potential ligands (a); IC50 and Ki for CforOBP4; IC50 and Ki for CforOBP5; IC50 and Ki for CforOBP6 (each reported as value ± deviation)
Potential ligands (row order and footnote markers b, c not recoverable): (Z)-3-dodecen-1-yl (E)-2-butenoate; Benzaldehyde; β-Damascenone; β-Cyclocitral; β-Ionone; Tiliroside; Kaempferol 3,7,4'-trimethyl ether; Kaempferol-3-glucoside; Ethyl butyrate; Eucalyptol; Docosane; 2-Hexanone; Butyl acetate; Rhamnetin; Butyl formate; 2-Hexanol; 3-Hexanol; 1-Hexanol; 2-Heptanone; Phenylacetaldehyde; Daucosterol; 6-Methyl-5-hepten-2-one; Eucalyptol; 1-Aminoanthracene; Octyl aldehyde

Column values in extraction order (25 entries per column; N.D. (d)):
IC50, CforOBP4: 1.614; 2.7; 1.871; 1.096; 1.007; N.D.; N.D.; N.D.; 35.8; [?]50; 47.31; [?]50; 29.34; 25.57; 32.06; N.D.; N.D.; N.D.; N.D.; N.D.; N.D.; N.D.; N.D.; N.D.; N.D.
± values: 0.168; 4.81; 0.083; 0.062; 0.073; 0.034; 6.85; 1.78; 2.58; 2.98
Ki, CforOBP4: 1.004; 1.680; 1.164; 0.682; 0.627; N.D.; N.D.; N.D.; 22.278; [?]50; 29.440; [?]50; 18.258; 19.950; 15.912; N.D.; N.D.; N.D.; N.D.; N.D.; N.D.; N.D.; N.D.; N.D.; N.D.
± values: 0.052; 0.105; 0.039; 0.045; 0.021; 2.993; 4.263; 1.108; 1.854; 1.605
IC50, CforOBP5: 3.104; 2.923; 1.934; 2.531; 1.451; N.D.; 0.4773; 2.262; 46.04; N.D.; [?]50; N.D.; 17.07; 40.27; [?]50; 36.85; [?]50; 20.85; 18.31; 16.88; N.D.; N.D.; N.D.; N.D.; N.D.
± values: 0.401; 0.193; 0.039; 0.11; 0.078; 0.047; 4.6; 0.29; 3.75; 2.92; 0.26; 0.64; 1.81; 0.0433
Ki, CforOBP5: 1.880; 1.770; 1.171; 1.533; 0.879; 0.289; N.D.; 1.370; 27.886; N.D.; [?]50; N.D.; 10.339; 24.391; [?]50; 22.320; [?]50; 12.629; 11.090; 10.224; N.D.; N.D.; N.D.; N.D.; N.D.
± values: 0.243; 0.117; 0.024; 0.063; 0.047; 0.026; 0.028; 2.786; 0.176; 2.271; 1.769; 0.157; 0.388; 1.096
IC50, CforOBP6: 2.589; 2.07; 2.513; 1.444; 0.965; N.D.; 2.658; N.D.; 16.82; N.D.; N.D.; N.D.; 23.07; 12.76; [?]50; 33.47; N.D.; 19.4; 10.67; 3.407; 46.36; 28.81; 16.38; 11.09; 5.467
± values: 0.096; 0.93; 0.252; 0.028; 0.1; 0.0303; 0.179; 0.7; 0.85; 0.39; 1.1; 0.65; 0.323; 8.15; 5.84; 0.46; 0.48; 0.546
Ki, CforOBP6: 1.646; 1.316; 1.598; 0.918; 0.613; N.D.; 1.690; N.D.; 10.693; N.D.; N.D.; N.D.; 14.666; 8.112; [?]50; 21.278; N.D.; 12.333; 6.783; 2.166; 29.472; 18.315; 10.413; 7.05; 3.476
± values: 0.305; 0.160; 0.61; 0.018; 0.064; 0.019; 0.114; 0.248; 0.413; 0.205; 0.347; 0.445; 0.540; 0.699; 0.591; 5.181; 3.713; 0.292

a: More potential ligands were tested, but the remaining 77 potential ligands did not bind any of the CforOBP4-6 proteins.
b: Sex pheromone components.
c: Sweet potato secondary metabolites.
d: N.D. is an abbreviation for not detected, which means that no significant binding was detected in the assay.
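Table 5 reports an IC50 and a Ki for each protein-ligand pair, and Figure 9E plots binding ability as 1/Ki. The following is a minimal sketch of how such values are commonly related in 1-NPN competitive binding assays, assuming the conversion Ki = IC50 / (1 + [1-NPN] / K_1-NPN). This section does not state the formula the authors used, so the relation itself, the function names, and the probe concentration and probe dissociation constant in the example are all illustrative assumptions rather than values from the paper.

```python
def ki_from_ic50(ic50, probe_conc, probe_kd):
    """Convert a competitor IC50 into a dissociation constant Ki.

    Assumed relation (not stated in this section):
        Ki = IC50 / (1 + [1-NPN] / K_1-NPN)
    All three arguments must use the same concentration unit.
    """
    if probe_kd <= 0:
        raise ValueError("probe_kd must be positive")
    if ic50 < 0 or probe_conc < 0:
        raise ValueError("concentrations must be non-negative")
    return ic50 / (1.0 + probe_conc / probe_kd)


def binding_ability(ki):
    """Binding ability as the reciprocal of Ki (the quantity named in Figure 9E)."""
    if ki <= 0:
        raise ValueError("ki must be positive")
    return 1.0 / ki


if __name__ == "__main__":
    # Hypothetical probe settings, chosen only to make the arithmetic easy.
    PROBE_CONC = 2.0  # free 1-NPN concentration (placeholder)
    PROBE_KD = 4.0    # dissociation constant of the protein/1-NPN complex (placeholder)

    ki = ki_from_ic50(3.0, PROBE_CONC, PROBE_KD)
    print(f"Ki = {ki:.3f}, 1/Ki = {binding_ability(ki):.3f}")
```

Under this relation every Ki for a given protein is its IC50 scaled by one constant factor. The first two CforOBP4 entries in Table 5 are consistent with that: 1.004 / 1.614 and 1.680 / 2.7 are both about 0.62.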