Posted on Authorea 17 Mar 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158446791.17846463 — This a preprint and has not been peer reviewed. Data may be preliminary.

Chromosome-level de novo genome assembly of Sarcophaga peregrina provides insights into the evolutionary adaptation of flesh flies

Lipin Ren, Yanjie Shang, Li Yang, Shiwen Wang, Xiang Wang, Shan Chen, Zhigui Bao, Dong An, Fanming Meng, Jifeng Cai, and Yadong Guo

1 Affiliation not available
2 Central South University
3 Xin Jiang Medical University
4 Dalian Medical University
5 East Normal University
6 OE biotech Co Ltd

May 5, 2020

Abstract

Sarcophaga peregrina is usually considered to be of great ecological, medical and forensic significance, and has biological characteristics such as the ovoviviparous reproductive pattern and adaptation to feed on carrion. However, the underlying mechanisms still remain unsolved by lack of high-quality genome. Here we present de novo-assembled genome at the chromosome-scale for S. peregrina. The final assembled genome was 560.31 Mb with contig N50 of 3.84 Mb. Hi-C scaffolding reliably anchored six pseudochromosomes, accounting for 97.76% of the assembled genome. Moreover, 45.70% of repeat elements were identified in the genome. A total of 14,476 protein-coding genes were functionally annotated, accounting for 92.14% of all predicted genes. Phylogenetic analysis indicated that S. peregrina and S. bullata diverged ~7.14 Mya. Comparative genomic analysis revealed that expanded and positively selected genes were related to biological features that aid in clarifying its ovoviviparous reproduction and necrophagous habit, such as lipid metabolism, olfactory receptor activity, chorionic membrane formation, and Dorso-ventral axis formation. This study provides a valuable genomic resource of S. peregrina, and sheds insight into further revealing the underlying molecular mechanisms of adaptive evolution.

KEYWORDS

Sarcophaga peregrina, de novo genome assembly, comparative genomics, adaptive evolution

1 INTRODUCTION

Sarcophaga peregrina (Robineau-Desvoidy, 1830) (Diptera: Sarcophagidae), commonly known as flesh fly, is closely associated with human life in ecological habits. The species is widely spread from the tropical to subtropical areas of the Palaearctic, Oriental, and Oceanian regions (Xue et al. 2011). Moreover, it is a large-sized flesh fly with significant body surface features, including the brightly red-tipped eyes, gray and black longitudinal stripes on the thorax, and a checkerboard-like pattern on the abdomen (Majumder et al. 2012) (Additional file 1: Figures S1). The common carrion-feeding flies mainly include Sarcophagidae, Calliphoridae and Muscidae family, which play a crucial role in the forensic investigations associated with decomposed corpses (Byrd & Castner 2010). Compared with the other necrophagous flies, flesh flies are characterized by the reproductive pattern of ovoviviparity (or ovolarviparity), depositing eggs which immediately hatch into larvae onto carrion (Goff et al. 1989; Majumder et al. 2012; Roberts et al. 2014). The reproductive cycle of S. peregrina comprises three definite stages, including larva, pupa and adult. The mode of reproduction appears to be the result of adaptive evolution, and reduces the development time of the larval stage (the time when eggs hatch to first larvae), making them more competitive compared to other species. Given that this species can be used to very accurately estimate the postmortem interval (PMI) of decomposed corpses, it is therefore an important insect species of the field of forensic entomology (Byrd & Castner 2010). For instance, S. peregrina is one of the most common necrophagous flesh fly species in succession patterns on cadavers, as well as colonizing on a corpse at many death scenes (Guo et al. 2011). Recent studies on the species have mainly focused on the effect of drugs and heavy metals (eg. cadmium) on growth and development of larvae (Goff et al. 1989; Wu et al. 2013), molecular identification (Wells & Stevens 2008), larval morphology (Sukontason et al. 2014; Siti Aisyah et al. 2015; Sukontason et al. 2010), cuticular hydrocarbon composition in pupal exuviae for taxonomic differentiation (Gongyin et al. 2007), as well as the developmental data collection at constant temperatures (Wang et al. 2017a), which would provide valuable data for forensic investigations, especially floating corpse and indoor death-scene cases (Tomberlin et al. 2010; Wang et al. 2017b).

S. peregrina has also profound implications for human hygiene and the livestock economy. It is an important sanitary pest and one of the vector fly species of intestinal infectious diseases and parasitic diseases in human and livestock, and, as an ectoparasite of mammals, it causes myiasis (parasitic infestation) (Lee et al. 2011). They can cause myiasis in the hospital environment, which is also called nosocomial myiasis (Miura et al. 2005). As such, the species is considered as an indicator of wound care neglect, either by the nurses or oneself (Nazni et al. 2011). Additionally, in off-shore islands, the larvae of this species are also the key pest of meat industries, which take nutrient from the meat and contaminate the food material, ultimately leading to economic losses in the livestock industries (Majumder et al. 2012).

Although S. peregrina possesses ecological, medical and forensic importance, there are few genomic resources uncovered for the family Sarcophagidae (Agrawal et al. 2010; Martinson et al. 2019), which seriously hinder the investigation of the specific mechanisms of biological phenomena from the perspective of genomics, transcriptomic and epigenetics. Fortunately, with the emergence of next-generation sequencing technology, genomics and transcriptomics have been recently developed for dipteran flies (Anstead et al. 2015; dos Santos et al. 2014; Kim et al. 2018; Scott et al. 2014), which serve as a reference for molecular studies of related species. But due to the defect of short Illumina reads, the combination of SMRT (Single Molecule Real-Time) sequencing and chromosome conformation capture (Hi-C) can anchor the scaffolds into chromosomal levels (Belton et al. 2013), which ensure the availability of high-quality reference genome assembly. Here we reported a chromosome-level de novo genome assembly of S. peregrina and perform comparative analysis with other published dipterans in order to enrich our understanding of adaptive evolution in S. peregrina.

2 MATERIALS AND METHODS

2.1 Cultivation of S. peregrina

Adult specimens of S. peregrina were trapped with pork liver bait in Changsha, Hunan Province, China, and raised at 25 ± 1 °C and 70 ± 5% relative humidity with a photoperiod regime of 12:12h light/darkness in an artificial climate chamber, employing pork liver as a medium for larvipositing and larval rearing. In order to reduce genetic variability, mating pairs of adult S. peregrina trapped in the wild were highly inbred for six generations. Afterwards, 3rd-instar larvae and newly hatched adult females were used for further studies.

2.2 Genome survey

Genomic DNA was extracted from a single adult female of S. peregrina using SDS method (Rinkevich et al. 2006). A library with insert sizes of 400 bp was generated by Illumina TruSeq Nano DNA Library Prep Kit (San Diego, CA, USA), and sequenced to 150-nt paired end reads (PE 150 bp) on the Illumina HiSeq Xten. After quality control, low-quality reads, adapters, contaminated reads, and ambiguous bases were removed, whilst duplicates were filtered out. Finally, the clean data were applied to enable the genome survey and calibrate subsequent genomic assembly. Meanwhile, in order to estimate the genome size and heterozygosity of S. peregrina, we performed the K-mer distribution (K = 17) using the Jellyfish program (Marcais & Kingsford 2011).

2.3 Genome sequencing

Genomic DNA was extracted from a single adult female of S. peregrina using SDS method (Additional file 1: Table S1) (Rinkevich et al. 2006). A 20-kb insert SMRTbell library was generated using the SMRTbell Template Prep Kits according to the size selection protocol on the BluePippin (Sage Science, MA, USA). SMRT sequencing was performed on the PacBio Sequel instrument with 11 SMRT cells at the Genome Center of Nextomics (Wuhan, China). After removal of low quality and sequencing adapters, the clean subreads were used for subsequent genome assembly. Furthermore, in order to assist genome annotation, total RNAs were extracted from a whole single adult female using the mirVana miRNA Isolation Kit (Ambion) following the manufacturer's protocol. RNA-seq libraries were then constructed using TruSeq RNA Library Preparation Kit (Illumina, CA, USA), and paired-end sequencing (PE 150 bp) was performed on HiSeq Xten platform. A total of 9.62 Gb of raw data was produced for assisting genome annotation (Additional file 1: Table S2).

2.4 De novo assembly and polishing of the genome

PacBio long subreads were originally corrected with Canu v1.6 (Koren et al. 2017). The genome assembly was performed on WTDBG v1.2.8 using the error-corrected reads. The PacBio subreads were mapped back to the raw contigs by Blasr v5.1, and the contigs were subsequently polished in Arrow v2.1.0 (Chin et al. 2013). Due to a high error ratio of PacBio raw long reads, Illumina short reads were mapped back to the improved contigs, which were further polished by Pilon v1.20 (Walker et al. 2014). In addition, we applied the GC depth analysis to evaluate whether potential contamination remained during sequencing and the coverage of the assembly. The analysis showed that the average GC content of the genome was 33.23% with a single-peaked distribution curve (Additional file 1: Figures S2 and S3). Combining the GC depth and sequencing depth of the genome indicated that there was no contamination from the other species (Additional file 1: Figure S4).

2.5 Hi-C library preparation

The Hi-C library was constructed following a previous protocol (Zhuang et al. 2019). Briefly, a single 3rd-instar larva was washed, and after removal of the dissected guts, the remaining tissues were fixed with 1% formaldehyde. Subsequently, glycine was added to quench the cross-linking reaction, and tissues were finally ground to powder and suspended in nuclei isolation buffer to obtain a nuclei suspension (Belton et al. 2012). The nuclei mixture was dissolved, the chromatin was digested with restriction enzyme (Dpn II), and the repaired DNA strands were end-labeled by incubating with Klenow enzyme and biotin-14-dCTP (Lieberman-Aiden et al. 2009), and ligated by T4 DNA polymerase. The extracted DNA was mechanically sheared to 200–300 bp sizes. DNA fragments of 150–300 bp were blunt-end-repaired and A-tailed, followed by purification through biotin–streptavidin-mediated pulldown (Burton et al. 2013). The Hi-C library was constructed by the NEBNext Ultra II DNA library Prep Kit (Lieberman-Aiden et al. 2009). After adapters were ligated to the Hi-C products, PCR amplification was performed and the PCR reaction products were purified with AMPure XP beads. The Hi-C libraries were then quantified by quantitative PCR and sequenced (150 bp paired-end reads) on the Illumina HiSeq Xten.

2.6 Pseudomolecule construction by Hi-C

To construct a chromosomal-level assembly of the genome, Hi-C raw data were first trimmed by fastp v. 0.12.6 (Chen et al. 2018). After quality control, the low-quality reads, adapter contamination and ambiguous bases as N's were removed, whilst duplicates were filtered out. High quality and clean paired-end reads were retained. The clean paired-end reads were aligned with the draft assembled genome using Juicer pipeline v. 2.3.2 (Durand et al. 2016). Afterwards, according to the location of DpnII restriction sites, the ratio of Self Circle, Dangling End and Dumped Pairs were identified so as to evaluate the validity of the paired-end reads. The contigs were then clustered, ordered and oriented using the 3D de novo assembly (3d-DNA, v. 170123) pipeline (Dudchenko et al. 2017). Hi-C contact matrix was visualized using Juicebox v. 1.9.8 (Dudchenko et al. 2018). The misassembly and misconnection were manually adjusted based on neighboring interactions. The validated assembly was used to construct the pseudomolecules using the finalize-output.sh script from 3d-DNA. Meanwhile, the completeness of the assembly was evaluated using Benchmarking Universal Single-Copy Orthologs (BUSCO) v3.1 (Simão et al. 2015).

2.7 Gene prediction and functional annotation

In order to identify the tandem repeats and transposable elements (TEs) in the S. peregrina genome, we combined the de novo and homolog-based methods. A de novo S. peregrina specific repeat library was first generated using the RepeatModeler v. 1.0.11 (Bedell et al. 2000). The repetitive sequences in the assembled genome were annotated by RepeatProteinMask v 4.0.6 with default parameters (Bedell et al. 2000). Afterwards, RepeatMasker and Tandem repeats finder (TRF, v 4.09) were used to search against the known RepBase (Allred et al. 2008; Bedell et al. 2000; Benson 1999; Jurka et al. 2005). In addition, the simple sequence repeats (SSRs) were identified as implemented in MIcroSAtellite Identification Tool (Thiel et al. 2003). The non-coding RNAs (ncRNAs) were annotated using BLAST (E-value ≤ 1e-5) from the Rfam database (Camacho et al. 2009; Kalvari et al. 2018; Robinson et al. 2018), including microRNAs (miRNAs), ribosomal RNAs (rRNAs), snRNAs and transfer RNAs (tRNAs). RNAmmer v1.2 was used to predict the rRNAs and their subunits (Lagesen et al. 2007). We also annotated the tRNAs by tRNAscan-SE v1.3.1 with default parameters (Lowe & Eddy 1997).

We combined homology searches, de novo prediction and transcriptome data-based approaches to predict protein-coding gene structures. In the homology-based method, protein sequences from five dipteran insects (A. aegypti, A. gambiae, D. melanogaster, L. cuprina, and M. domestica) were used as queries to search against the assembled genome using the GeneWise v2.4.1 (Birney & Durbin 2000). The de novo predictions were performed from the homology-based predictions to train model parameters using the Augustus v3.0 (Stanke et al. 2004), SNAP (Korf 2004), GlimmerHMM (Majoros et al. 2004), and GeneID v1.4.4 (Bromberg & Rost 2007). Meanwhile, transcriptome data was utilized to align against the genome assembly through PASA and TopHat, respectively (Haas et al. 2008; Moriya et al. 2007). Subsequently, we integrated all predicted genes to generate a consensus gene set via EVidenceModeler v1.1.1 (Haas et al. 2008). The genes containing TEs were then abandoned using the TransposonPSI package to search against the Repbase (Yagi et al. 2013). Finally, all gene sets were predicted in assembled genome. Additionally, in order to annotate gene functions, the predicted genes were aligned against the NR, Swissprot, TrEMBL, KEGG, KOG, GO, Pfam and InterProscan databases.

2.8 Gene family analysis

In order to identify gene families, protein sequences from S. peregrina and other nine diptera species, including Aedes aegypti, Anopheles gambiae, Drosophila melanogaster, Ceratitis capitata, Lucilia cuprina, Bactrocera oleae, Sarcophaga bullata, Stomoxys calcitrans, and Musca domestica, with A. aegypti and A. gambiae being the family Culicidae as an outgroup, were first aligned using DIAMOND v0.9.30 (Buchfink et al. 2015), and the aligned results were then clustered using OrthoFinder v2.7 (Emms & Kelly 2015; Xu et al. 2019). To further reveal the phylogenetic relationships among S. peregrina and other nine species mentioned above, single-copy families were aligned via the MAFFT v7.0 (Kazutaka & Standley 2013), and then trimmed using the Gblocks v0.91b (Castresana 2000). The phylogenetic tree was inferred using a maximum likelihood method as implemented in RAxML v8.2 with the GTRGAMMA model and 100 bootstrap replicates (Alexandros 2006; Stamatakis 2014). Afterwards, divergence times were estimated under a relaxed clock model by MCMCTREE program implemented in the PAML v 4.9e (Yang 2007). The molecular clocks of the family Culicidae (105.91-234.53 Mya), and of Stomoxys calcitrans and Musca domestica (26.97-36.96 Mya), were used for fossil calibration. According to the gene families and phylogenetic relationships, the results were further analyzed to identify the expanded and contracted gene families by Computational Analysis of Gene Family Evolution v4.2 (Tijl et al. 2006). Moreover, in order to identify positively selected genes in the S. peregrina genome, we retained orthologous groups among S. peregrina and the remaining seven species (after removal of the outgroup in evolutionary analysis) using Blastall (Camacho et al. 2009). Subsequently, we calculated likelihood ratio tests for selection (P < 0.05) using Codeml with the branch-site model as implemented in the PAML package (Yang 2007). Besides, we conducted the chromosome scale synteny between S. peregrina, D. melanogaster and M. domestica based on genome-ortholog alignment using MCScanX (Wang et al. 2012).

3 RESULTS

3.1 Genome sequencing and assembly

In order to survey the genome of S. peregrina, 46.8 Gb of raw Illumina data were produced, of which 40.4 Gb of clean data were retained. A total count of 17-mer was 28,804,585,532 from short clean reads. The distribution curve of 17-mer presented an unusual Poisson distribution with two upward convex signals, suggesting high heterozygosity. Given that high heterozygosity is also common in other insects, the second peak was selected to be the main peak. According to the peak 17-mer depth of 61, the genome size was estimated to be ~472 Mb, which was highly heterozygous (~3.0%) (Additional file 1: Table S3 and Figure S5) (Vurture et al. 2017). Moreover, we performed the heterozygosity analysis using SNP calling implemented in GATK v. 4.1.5.0 (Walker et al. 2018). The heterozygosity ratio was ~1.65%, which is relatively lower than that of 17-mer analysis (Additional file 1: Table S4).

We generated 58.54 Gb of raw PacBio data. After quality control, 57.83 Gb of subreads were retained for genome assembly. The average length and N50 of subreads were 8.55 kb and 13.90 kb, respectively (Additional file 1: Table S5). The initial genome assembly was 554.66 Mb in size, with contig N50 of 3.79 Mb and contig number of 2,031 (Additional file 1: Table S6). Finally, the assembly was 560.31 Mb in size, with contig N50, the longest contig and contig number of 3.84 Mb, 20.90 Mb, and 2,031, respectively (Table 1, Additional file 1: Table S6). Meanwhile, the result of completeness indicated that the genome assembly covered 97.9% of complete BUSCOs and 97.1% of single-copy BUSCOs, with only 1.4% of missing BUSCOs (Additional file 1: Table S7).

A total of 159.4 Gb of Hi-C raw data were produced (Additional file 1: Table S8). After quality control, 153.8 Gb of clean data consisting of 1,063,074,766 paired-end reads were obtained (Additional file 1: Table S9), which were used as input for the Juicer and 3d-DNA Hi-C analysis and scaffolding pipelines. Finally, six pseudochromosomes with a total length of 548.19 Mb, containing 96.45% of the contigs, were anchored, accounting for 97.76% of the draft assembled genome (Fig. 1, Additional file 1: Table S10). Although the size of the assembled genome is more than twice that of D. melanogaster, the karyotype of exactly six chromosomes based on Hi-C was identical to that of cytological observation (Agrawal et al. 2010). Based on the synteny between S. peregrina and D. melanogaster, six pseudochromosomes in the assembled genome can be aligned nearly exactly against the D. melanogaster genome (Fig. 2a, Fig. 2b). The result of completeness of the assembly indicated that the genome assembly covered 98.2% of complete BUSCOs and 97.4% of single-copy BUSCOs, with only 0.8% of duplicated BUSCOs (Additional file 1: Table S11).

3.2 Genome annotation

A total of 15,710 genes were identified; the average transcript length, average CDS length, and average exon length per gene were 7,635.25 bp, 1,404.03 bp, and 356.4 bp, respectively, with an average number of 3.94 exons per gene (Additional file 1: Table S12). Moreover, the total number of genes in assembled genome is larger than those of five published genomes mentioned above (A. aegypti, A. gambiae, D. melanogaster, L. cuprina, and M. domestica) (Additional file 1: Figure S6 and Table S13). 14,476 protein-coding genes were annotated with potential functions, accounting for 92.14% of all genes in assembled genome (Additional file 1: Table S14). We identified 11,425 genes that showed homology to proteins in the Swissprot database. A total of 7,999 genes were assigned to GO classifications. Based on KEGG analysis, we could annotate 5,236 genes and 130 KEGG metabolic pathways in the assembled genome. Additionally, 9,332 genes could be annotated in the InterPro databases.

The results of the de novo and homology-based predictions showed that 256.07 Mb of repetitive sequences were identified, covering 45.70% of the assembled genome. DNA transposons (69.21 Mb) represented the most abundant TEs, accounting for 12.35% of the genome (Additional file 1: Table S15 and S16). 456,324 SSRs were detected, including 338,634, 58,559, 43,763, 12,124, 2,835, and 409 mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide repeats, respectively (Additional file 1: Tables S17 and S18). In addition, 157 miRNAs, 50 rRNAs, 200 snRNAs, and 1,465 tRNAs were identified in the assembled genome (Additional file 1: Table S19).

3.3 Gene family identification and phylogeny analysis

Finally, 9,636 gene families were identified in assembled genome, covering 13,039 genes. Among these, thirteen gene families containing 106 genes were unique to S. peregrina. Besides, 2,662 unclustered genes were identified (Fig. 2c, Additional file 1: Table S20). We then identified 5,622 single-copy orthologs to construct phylogenetic trees (Additional file 1: Figure S7 and Table S21). Phylogenetic analysis indicated and strongly supported (ML bootstrap percentage, BP = 100) that eight fly species were clustered together into a large branch, whilst, as an outgroup taxon, A. aegypti and A. gambiae (Diptera: Culicidae) are clustered together and clearly separated (Additional file 1: Figure S8). Moreover, the family Sarcophagidae is more closely related to family Calliphoridae than to other family, and estimated divergence time between them was 32.53 Mya (95% HPD: 25.38–40.06 Mya) within the Late Paleogene epoch. This is likely consistent with that both of them constitute the main part of insect faunal succession on decomposed remains (Byrd & Castner 2010). Moreover, within the family Sarcophagidae, S. peregrina and S. bullata were clustered more closely than other species. The diversification of S. peregrina and S. bullata took place 7.14 Mya (95% HPD: 4.99–9.43 Mya) (Additional file 1: Figure S9).

3.4 Gene family expansion and contraction

This approach revealed 1,191 contracted gene families in S. peregrina, and only 568 gene families had a greater degree of expansion (Fig. 3, Additional file 1: Table S22). The corresponding genes were identified from these gene families, including 367 contracted genes and 2,805 expanded genes, which were utilized for enrichment analyses of KEGG and GO, respectively. They mainly encoded fatty acid synthase activity, olfactory receptor activity, fibroblast growth factor receptor activity, chorionic membrane formation, as well as amino acid metabolism, synthesis of hydrolases, etc. (Additional file 2: Tables S24 and S25). Besides, a total of 692 positively selected genes were identified and then analyzed by KEGG and GO enrichment. These genes encoded fatty acid alpha-hydroxylase activity, epidermis morphogenesis, transferase activity, etc. (Additional file 2: Table S26 and S27).

4 DISCUSSION

Compared with the most common pattern of oviparous reproduction in insects, viviparity is relatively unique, which is defined by the ability to undergo pseudo-placental reproduction (Meier et al. 1999), namely, nourishing intrauterine offspring from a modified accessory gland and giving birth to larvae in ovoviviparous reproduction (Attardo et al. 2006; Langley & Bursell 1980). The reproduction requires adaptive evolution of the uterus to acclimatize developing larvae, as well as adaptation of female accessory glands as nutrient synthesis and delivery system (Watanabe et al. 2014). This gland is highly specialized, extending from where it connects to the uterus throughout the fat body (Attardo et al. 2006; Ma et al. 1975; Meier et al. 1999). The glandular secretions are mainly composed of fat transferred from fat bodies during early larval development (Majumder et al. 2014). In this study, genes involved in lipid metabolism are generally conserved, with gene expansions associated with fatty acid synthase, fatty acid alpha-hydroxylase activity, phospholipid metabolic process and intracellular cholesterol transport (Additional file 2: Tables S24 and S25). This pattern leads to fewer offspring per female, but a higher level of survival for the offspring (Attardo et al. 2006; Majumder et al. 2014; Meier et al. 1999). Furthermore, genes involved in embryonic development have expanded significantly, mainly encoding chorionic membrane formation, fibroblast growth factor receptor activity and Dorso-ventral axis formation. Previous study implied that differentiation of ventral follicular cells is not a direct result of germline signal transduction, but relies on indirect signals from the dorsal follicle cells, providing a link between early and late events for dorsal-ventral axis formation in Drosophila embryos (Jordan et al. 2000). Despite the failure to clearly clarify the possible genetic mechanism of the reproductive pattern of S. peregrina, our study provides an important theoretical basis for further exploring the mechanism of reproduction of ovoviviparity.

Besides, insect feeding behavior involves a broad range of activities, such as orientation, initial activation, and feeding (Ashworth & Wall 2010). The visual, olfactory, and gustatory perceptions regulate the complicated physiological processes of identification. Olfaction plays an essential role in detecting and analyzing the semiochemicals from the environment (Field et al. 2000; Li & Liberles 2015). A complex and sensitive olfactory neural system has been developed during the long-term evolution. Necrophagous flies can colonize and breed on the decomposed corpses compared with herbivorous insects. It has been demonstrated that a functional description of olfactory cues can provide the physiological mechanisms behind host choice (Carrasco et al. 2015; Leal 2013). In this study, enrichment analysis exhibited significantly expanded genes encoding olfactory receptor activity, sensory perception of smell, and sensory perception of taste (Additional file 2: Table S24). Moreover, genes that encode neuroactive ligand-receptor interaction have significantly expanded in assembled genome. Previous study indicated that neuropeptide F (NPF) is an abundant signaling peptide in D. melanogaster, which play roles in feeding, reproduction, and coordinates larval behavioral changes during development (Nassel & Wegener 2011; Wu et al. 2003). Hence the mechanisms of host location by S. peregrina are of intrinsic interest, and our study sheds insight into the physiological mechanisms behind host choice.

5 CONCLUSIONS

In this study, we present high quality chromosomal-scale genome assembly of S. peregrina with high coverage and contiguity, combining PacBio sequencing with Hi-C mapping. The publication of the genome of S. peregrina sheds further insight into the genomic structure, function and phylogenetic diversity of flesh flies. This study not only provides important resources for revealing the evolutionary adaptations of S. peregrina and other carrion-feeding species, but also fills a gap for further study of insect evolution for the application of large-scale phylogenetic projects.

ACKNOWLEDGEMENTS

We greatly thank Prof. Lushi Chen (Guizhou Police College) for species identification of S. peregrina. We are also very grateful to editors and reviewers for valuable suggestions, which can greatly improve the quality of manuscript.

FUNDING

This study is supported by the National Natural Science Foundation of China (No. 81772026).

DATA ACCESSIBILITY

The data that support the findings of this study are openly available in the database of NCBI (National Center for Biotechnology Information) via the Bioproject ID PRJNA509973 and in the SRA (Sequence Read Archive) with accession numbers (SRR8416036-37, SRR9722726-27 and SRR10821909). All versions and parameters used for the software in this study are provided in Additional file 1 (Table S23).

AUTHORS' CONTRIBUTIONS

L.R., Y.G., F.M. and J.C. designed the study; L.R., L.Y., S.W. and Y.S. collected the samples; X.W. and S.C. extracted the DNA/RNA samples; L.R., Z.B. and D.A. worked on the genome assembly, annotation and evolution analysis; L.R. and Y.S. wrote the manuscript; Y.G. and L.R. revised the manuscript. All authors read and approved the final version of the manuscript.

Ethics approval and consent to participate

Not applicable.

Competing interests

Castresana, J. (2000). Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis.

Carrasco, D., Larsson, M. C., & Anderson, P. (2015). Insect host plant selection in complex environments. Current opinion in insect science, 8

Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., & Madden, T. L. (2009). BLAST+: architecture and applications.

Byrd, J. H., & Castner, J. L. (2010). Forensic Entomology: The Utility of Arthropods in Legal Investigations. 2nd Edition.

(2013). Chromosome-
& interactions. H., J. chromatin J. Shendure, Byrd, on & based O., assemblies J. genome Kitzman, 1119. novo R., (12), Qiu, de DIAMOND. P., of using R. alignment scaffolding Patwardhan, protein A., scale Adey, sensitive N., and J. Fast Burton, function. (2015). on H. D. polymorphisms Huson, non-synonymous 12 & of Methods, C., effect Xie, predict B., experiment. Buchfink, SNAP: annotation Drosophila (2007). 35 Research, B. the Acids Rost, in Nucleic Hi- & GeneWise Y., Using Bromberg, (2012). (2000). J. R. Dekker, Durbin, 10 Research, & & sequences. DNA E., Y., analyze Birney, to Zhan, program genomes. a N., of finder: Naumova, repeats 27 conformation Tandem H., (1999). the G. J. RepeatMasker. Benson, capture to Gibcus, to enhancement P., technique performance R. comprehensive doi:10.1016/j.ymeth.2012.05.001 a McCord, a MaskerAid: M., C: (2000). J. gland W. milk Belton, Gish, viviparous and & of yolk I., aspects of 16 Korf, Molecular regulation Bioinformatics, A., morsitans): (2006). J. to morsitans S. Bedell, cuprina (Glossina Aksoy, L. fly & and tsetse P., synthesis. sericata the Strickler-Dinglasan, protein Lucilia of N., blowflies biology B. sheep Guz, R. reproductive the M., Gasser, of G. . Responses baits. Attardo, semiochemical . of (2010). . development R. C., the Wall, and S. & odour Murali, R., R., interventions. A. J. future Jex, underpin Ashworth, to S., biology R. fly Hall, parasitic D., unlocks N. 6 genome Young, cuprina K., Lucilia P. (2015). Korhonen, A., C. mask. Anstead, protein architecture surface-layer Three-dimensional a (2008). through T. D. 1438. Schwartz, electrodeposited & nanoarrays F., Baneyx, inorganic M., Sarikaya, of A., Cheng, B., thousands D. with Allred, analyses models. Genus phylogenetic mixed the likelihood-based and of maximum taxa RAxML-VI-HPC: of Flesh of (2006). Cytogenetics S. (2010). Alexandros, H. Kurahashi, & Diptera). R., R. (Sarcophagidae: Tewari, Boettcherisca N., Bajpai, R., U. Agrawal, interests. 
competing no REFERENCES have they that declare authors The 34 doi:10.1038/ncomms8344 7344. , 2,573-580. (2), oeua ilg n vlto,17 Evolution, and Biology Molecular 1,5-0 doi:10.1038/nmeth.3176 59-60. (1), R rs,Bc Raton Boca Press, CRC 4,547-548. (4), ora fisc hsooy 52 physiology, insect of Journal 1) 1040-1041. (11), iifrais 22 Bioinformatics, 1) 3823-3835. (11), 1-7. , doi:10.4001/003.018.0221 . yooi,75 Cytologia, M iifrais 10 Bioinformatics, BMC 2) 2688. (21), 4,540-552. (4), 1-2,12-16 doi:10.1016/j.jinsphys.2006.07.007 1128-1136. (11-12), eia n eeiayEtmlg,8 Entomology, Veterinary and Medical 8 2,149-155. (2), 1,421. (1), ehd,58 Methods, aoltes 8 letters, Nano aueBoehooy 31 Biotechnology, Nature uli cd Research, Acids Nucleic 4,303-309. (4), a Commun, Nat 3,268-276. (3), 5,1434- (5), Genome Nat Posted on Authorea 17 Mar 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158446791.17846463 — This a preprint and has not been peer reviewed. Data may be preliminary. auaa . tnly .M 21) AF utpesqec lgmn otaevrin7 improve- 7: version software alignment A. sequence families. Petrov, usability. multiple RNA and MAFFT . performance non-coding (2013). in . M. for ments D. resource . Standley, & genome-centric R., K., S. a Kazutaka, Eddy, to E., shifting Rivas, P., 13.0: 46 E. research, Repbase Rfam Nawrocki, (2005). N., (2018). Quinones-Olvera, J. I. Walichiewicz, J., & Argasinska, O., I., Kohany, elements. Kalvari, P., repetitive Klonowski, eukaryotic (2000). of A., H. database notch Pavlicek, Ruohola-Baker, a through (2008). V., . Update, formation V. R. . Kapitonov, axis J. . dorso-ventral J., Wortman, D., embryonic Jurka, Stein, to J., . signalling Sen, M., EGF . A. links Morimoto, activation. mirror . A., Assemble J. gene J., Blasi, to homeobox Orvis, J., Program The N. E., Clegg, the important J. C., K. and Allen, forensically Jordan, EVidenceModeler M., of Pertea, using Identification W., annotation Alignments. Zhu, structure (2014). Spliced L., gene L. S. 
eukaryotic Wu, Salzberg, gene. Automated & J., period and J., B. COI Haas, Cai, on P., based China doi:10.1007/s00414-013-0923-7 Li, in 221-228. W., Sarcophagidae) (1), Yan, (Diptera: in L., flies composition sarcophagid Zha, hydrocarbon Cuticular Y., flies. (2007). Guo, necrophagous H. six of Cui, of rate & development differentiation 450-456. Z., the taxonomic (3), Guanghui, on for tissues Z., in exuviae Jiaying, cocaine pupal L., of Kai, Effect Y., (1989). Gongyin, R. J. Sarcophagidae). Goodbrod, & (Diptera: I., peregrina olfaction. A. Boettcherisca insect Omori, in L., studies M. Molecular Goff, (2000). J. L. Wadhams, & A., 9 comparisons J. Pickett, genome M., whole L. in Field, biases fundamental solving OrthoFinder: 2 accuracy. (2015). (2016). inference L. S. orthogroup E. improves Kelly, Aiden, dramatically & & Experiments. M., S., Hi-C D. E. E. Emms, Lander, Loop-Resolution Aiden, H., Analyzing M. Huntley, for . S., System doi:10.1016/j.cels.2016.07.002 S. . One-Click Rao, 95-98. I., . a Machol, R., Provides S., Mostofa, M. Juicer Shamim, T., C., N. N. L. Musial, Durand, under E. C., facilitates for Aiden, N. module scaffolds Tools Durand, chromosome-length . Assembly with S., Juicebox . genomes S. The Batra, . (2018). S., C., L. M. N. Durand, Shamim, M., O., Dudchenko, scaffolds. Hoeger, chromosome-length . yields K., Hi-C 356 S. using . genome N.Y.), Nyquist, aegypti York, Aedes D., the (New . A. of assembly J., Omer, novo De S., Thurmond, (2017). S. A., genome Batra, reference M. O., Crosby, 6 Dudchenko, Release B., annotations. melanogaster V. genome Drosophila Strelets, the of of L., migration doi:10.1093/nar/gku1099 introduction J. large-scale FlyBase: and Goodman, (2013). E. assembly (2014). J., E. M. A. Eichler, W. . Schroeder, Gelbart, . G., . Santos, data. C., dos Heiner, sequencing SMRT J., Drake, long-read A., from A. assemblies Klammer, preprocessor. 10 genome P., FASTQ Marks, microbial all-in-one H., finished ultra-fast D. 
Nonhybrid, an Alexander, fastp: S., C. (2018). Chin, J. Gu, & Y., 34 Chen, formatics, Y., Zhou, S., Chen, 6,5551 doi:10.1046/j.1365-2583.2000.00221.x 545-551. (6), 6,563. (6), a ee,24 Genet, Nat D) 35d4.doi:10.1093/nar/gkx1038 D335-d342. (D1), 1) 884-890. (17), eoeBooy 9 Biology, Genome 63) 29.doi:10.1126/science.aal3327 92-95. (6333), 4,4943 doi:10.1038/74294 429-433. (4), oeua ilg n vlto,30 Evolution, and Biology Molecular 1,7. (1), ora fMdclEtmlg,26 Entomology, Medical of Journal 9 $ yoeei n eoersac,110 research, genome and Cytogenetic eoeBo,16 Biol, Genome 1000. < bioRxiv uli cd eerh 43 research, acids Nucleic em > enovo de 577 doi:10.1101/254797 254797. , ora fMdclEtmlg,44 Entomology, Medical of Journal 5.doi:10.1186/s13059-015-0721- 157. , 4,772-780. (4), < /em > sebyo mammalian of assembly n ea e,128 Med, Legal J Int 2,91-93. (2), D) D690-D697. (D1), elSs,3 Syst, Cell auemethods, Nature netMlBiol, Mol Insect 14,462-467. (1-4), uli acids Nucleic Science Bioin- (1), Posted on Authorea 17 Mar 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158446791.17846463 — This a preprint and has not been peer reviewed. Data may be preliminary. ir,M,Hysk,S,Ymd,T,Hysk,Y,&Kmmr,K 20) ptammisscue by caused Ophthalmomyiasis (2005). K. Kamimura, & peregrina. Y., Boettcherisca Hayasaka, of T., larvae Yamada, S., Diptera. Hayasaka, the M., in Miura, viviparity and Ovoviviparity D. (1999). Denlinger, P. & Ferrar, bullata. & H., 74 Sarcophaga M., J. , Kotrba, Werren, Flesh R., B., the Meier, J. of Benoit, Analyses C., Transcriptomic 9 E. Ontogenetic-Based (Bethesda), Jennings, and G3 D., Genome Y. occurrences Kelkar, (2019). of J., counting L. Peyton, parallel O., efficient E. for approach Martinson, flesh lock-free of fast, biology A reproductive (2011). k-mers. The C. of (2014). Kingsford, & A. G., R. Marcais, Khan, & R., 23 Boettcherisca Sciences, H. fly, Biological Sarcophagidae). Khan, flesh of 1830)(Diptera: K., (Robineau-Desvoidy, of M. 
peregrina biology Dash, Boettcherisca The fly, R., Z. (2012). M. R. Majumder, H. ab Khan, source & open A., two R. 196. GlimmerHMM: Sarcophagidae). Khan, and 1830)(Diptera: K., (Robineau-Desvoidy, TigrScan M. peregrina fly Dash, (2004). tsetse M., L. the Majumder, S. in Salzberg, modulations Structural & gene-finders. M., eukaryotic (1975). Pertea, initio S. H., D. W. Smith, & Majoros, cycle. U., pregnancy Jarlfors, a genes L., during RNA gland D. . transfer milk of Denlinger, detection C., improved . W. for program Ma, . a tRNAscan-SE: A., (1997). sequence. Telling, R. genomic S. T., in Eddy, & Ragoczy, the M., of M., T. principles Lowe, folding Imakaev, reveals interactions L., long-range of Williams, mapping genome. L., Comprehensive human (2009). N. O. M. Berkum, Dorschner, olfaction. Van intu- through E., an attraction in and Lieberman-Aiden, myiasis Aversion nasal Nosocomial (2015). (2011). D. L. S. W. doi:10.1016/j.cub.2014.11.044 Liberles, Cho, & & P., Q., C. Li, Fung, C., Y. Lin, L., patient. T. bated enzymes. Chen, female degrading T., and adult Y. proteins, by binding Lee, synthesis receptors, of 58 milk roles Entomology, in insects: of gland in Review uterine reception Annual Odorant and (2013). body S. fat W. Leal, of Role RNAmmer: (1980). (2007). E. morsitans. W. Bursell, Glossina D. & Ussery, & A., genes. P. T., RNA Langley, Rognes, ribosomal H.-H., of Staerfeldt, annotation A., rapid E. and Rodland, consistent P., Hallin, genomes. K., novel Lagesen, in finding Gene scalable (2004). Canu: I. (2017). Korf, separation. M. repeat A. Phillippy, and & weighting H., k-mer N. Comprehensive 722-736. adaptive Bergman, (5), via (2018). R., assembly H. J. G. long-read Miller, Son, accurate K., and . Berlin, P., . B. Walenz, . S., species. K., Koren, fly S. Kim, important H., forensically J. Seo, a K., peregrina, H. doi:10.1038/sdata.2018.220 Cha, Sarcophaga E., of S. analysis Shin, Y., transcriptome H. Lim, Y., J. 
Kim, 3,1928 doi:10.1111/j.1469-185X.1999.tb00186.x 199-258. (3), iifrais 27 Bioinformatics, ora fteCieeMdclAscain 74 Association, Medical Chinese the of Journal cec,326 Science, 5,11-30 doi:10.1534/g3.119.400148 1313-1320. (5), netBohmsr,10 Biochemistry, Insect uli cd eerh 25 Research, Acids Nucleic 1,61-67. (1), iifrais 20 Bioinformatics, 6,764-770. (6), 55) 289-293. (5950), aaeeJunlo ptamlg,49 Ophthalmology, of Journal Japanese 373-391. , iseCl,7 Cell, Tissue 11.doi:10.1016/0020-1790(80)90033-5 11-17. , M iifrais 5 Bioinformatics, BMC 1) 2878-2879. (16), 5,955-964. (5), 10 2,3930 doi:10.1016/0040-8166(75)90008-7 319-330. (2), uli cd eerh 35 Research, Acids Nucleic 8,3931 doi:10.1016/j.jcma.2011.06.001 369-371. (8), agaehJunlo olg,40 Zoology, of Journal Bangladesh 9 doi:10.1186/1471-2105-5-59 59. , 2,177-179. (2), urBo,25 Biol, Curr hk nvriyJournal University Dhaka c aa 5 Data, Sci eoeRsac,27 Research, Genome 9,3100-3108. (9), ilgclReviews, Biological 3,R120-r129. (3), 180220. , 2,189- (2), Posted on Authorea 17 Mar 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158446791.17846463 — This a preprint and has not been peer reviewed. Data may be preliminary. akr .J,Ael . ha . ret . bulil . atiua,S,...Yug .K (2014). K. S. Young, . . . improvement. assembly S., genome Sakthikumar, and A., detection variant Abouelliel, microbial M., comprehensive Priest, for tool T., integrated Shea, M. an T., Schatz, Pilon: reads. Abeel, & J., J., short Gurtowski, B. from Walker, H., profiling for Fang, genome roadmap J., reference-free C. A fast Underwood, (2011). M., 2202-2204. GenomeScope: S. Nattestad, VanLaerhoven, J., (2017). & F. C. Sedlazeck, M., of W., A. entomology. study G. Tarone, forensic the Vurture, in for E., research tool M. applied computational Benbow, and a R., basic CAFE: bridging Mohr, (2006). K., W. J. M. Tomberlin, Hahn, & P., J. evolution. Demuth, family C., gene Nello, B., development D. the L.). 
Tijl, for vulgare databases (Hordeum EST barley Exploiting in 106 (2003). SSR-markers genetics, A. Forensically gene-derived Graner, of & characterization (2010). R., Varshney, and L. rate. W., Michalek, developmental K. T., and Sukontason, Thiel, morphology & K., Thailand: in Moophayak, 1055-1064. species (5), T., fly Chaiwong, flesh N., finding important gene Bunchu, for server K., web Sukontason, a AUGUSTUS: (2004). phylo- B. large Morgenstern, of eukaryotes. & S., in post-analysis Waack, and R., Steinkamp, analysis M., phylogenetic Stanke, for tool a 8: version genies. in RAxML Habitat (2014). of A. Importance Entomology. The Stamatakis, Forensic in (2015). Implications C. 52 H. Malaysia: Entomology, Chin, in Medical Carcasses & N. BUSCO: Rabbit W., of on Evans, Liu, Decomposition David (2015). of K., Ecology M. Hiromu, . the L., E. Baha, Zdobnov, S., . a Aisyah, & Siti to V., . adaptations E. D., with Kriventseva, orthologs. S. single-copy P., diseases with Ioannidis, Giers, completeness of 3210-3212. annotation M., vector G., and assembly R. global A. genome Waterhouse, assessing a Clark, A., L., D., F. domestica Bopp, Simao, Musca W., fly, L. house (2018). the Beukeboom, environment. L. septic of C., E. Genome W. Aiden, Warren, & (2014). G., P., J. J. Data. Mesirov, Scott, Hi-C H., for Thorvaldsdottir, System C., sequencing. Visualization SMRT N. Cloud-Based doi:10.1016/j.cels.2018.01.001 of Durand, a advantages D., Provides The Turner, Juicebox.js (2013). T., C. J. M. Robinson, Schatz, & O., Frequencies M. 14 (2006). States. United G. Carneiro, Biol, eastern J. J., the Scott, R. from & Nosocomial flies Roberts, P., house (2011). B. in Lazzaro, H. CYP6D1 G., and M. 15 Vssc1 S. Hisham, Biol, of Brady, Mol . alleles L., resistance R. . pyrethroid Hamm, the . L., of C., Zhang, Heo, D., W., F. unit. Chew, Rinkevich, signaling care M., intensive F L. an neuropeptide Akmar, in H., long myiasis Lee, nasal and J., signaling? 
short Jeffery, Y W., of Nazni, neuropeptide review comparative vertebrate A to genome similarities automatic (2011). doi:10.1016/j.peptides.2011.03.013 Any an C. KAAS: Wegener, invertebrates: (2007). & in M. R., Kanehisa, D. & Nassel, C., A. server. Yoshizawa, reconstruction S., pathway and Okuda, annotation M., Itoh, Y., Moriya, iifrais(xod nln) 30 England), (Oxford, Bioinformatics 7,45 doi:10.1186/gb-2013-14-6-405 405. (7), 2,1717 doi:10.1111/j.1365-2583.2006.00620.x 157-167. (2), 3,411-422. (3), uli cd eerh 32 Research, Acids Nucleic eoeBo,15 Biol, Genome iifrais 22 Bioinformatics, 1,9-23. (1), 1) 6.doi:10.1186/s13059-014-0466-3 466. (10), h aasa ora fptooy 33 pathology, of journal Malaysian The 1) 1269-1271. (10), (suppl 9,11-33 doi:10.1093/bioinformatics/btu033 1312-1313. (9), uli cd eerh 35 Research, Acids Nucleic ) 309-312. 2), 11 nulRve fEtmlg,56 Entomology, of Review Annual (suppl elSs,6 Syst, Cell etds 32 Peptides, 1,53. (1), aaiooyRsac,106 Research, Parasitology iifrais 33 Bioinformatics, ) 182-185. 2), iifrais 31 Bioinformatics, hoeia n applied and Theoretical 2,256-258.e251. (2), 6,1335-1355. (6), 401-421. , Genome Journal Insect (14), (19), Posted on Authorea 17 Mar 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158446791.17846463 — This a preprint and has not been peer reviewed. Data may be preliminary. iueS2 Figure S1 Figure 1: file of genome Additional The (2019). R. Tang, INFORMATION . SUPPORTING . . domestication. crop C., and Zhang, evolution 51 K., polyploid karyotypes, Genetics, M. legume Pandey, into J., insight Wang, provides M., peanut cultivated Yang, H., Chen, W., likelihood. Zhuang, maximum Sequence by (2013). analysis T. phylogenetic Onozaki, 4: . 24 PAML . (Diptera: L.). (2007). caryophyllus . 1990 Z. (Dianthus T., Yang, Verves carnation Harada, of K., Boettcheriscina Tanase, genome A., subtribe the Ohmiya, of of H., analysis Hirakawa, China. review S., from Kosugi, A genus server M., Yagi, web and (2011). a species J. 
OrthoVenn2: new Du, a (2019). of 303-329. & Y. descriptions Wang, G., with . Y. Sarcophagidae), Verves, species. . multiple W., . across Xue, H., clusters Guo, orthologous Z., of Wei, annotation Y., 47 research, and Luo, comparison L., whole-genome and Fang, foraging for Z., of Dong, control Developmental L., (2003). Xu, P. of Shen, analysis system. & Y-like proteomic N., H. neuropeptide A Cai, Drosophila 6273(03)00396-9 (2013). H., the Y. J. by Park, G. behavior G., Ye, social Lee, & T., A., Wen, J. Q., Wu, Cheng, exposure. C., cadmium to Hu, response X., in Gao, peregrina 225-229. K., entomology. Boettcherisca Li, forensic of Y., in midguts J. larval methods Zhu, DNA-based X., of G. Application Wu, (2008). Genome 53 R. (2014). entomology, J. Y. trypanosomiasis. of Stevens, Attardo, review African & . of D., J. vector . Wells, . morsitans): N., (Glossina Solano, fly M., tsetse Hall, 380-386. M., the Lehane, of Post- M., sequence Estimating Berriman, Boettcherisca for J., Important Significance Hattori, Forensically Watanabe, and (2017). Pattern a M. Development Wang, MCScanX: & China: Y., (2012). Insect Interval. in L. H. mortem Tao, Sarcophagidae) A. (2017). N., Paterson, Y. Y. (Diptera: Zhang, L. . peregrina F., Tao, J. . collinearity. Wang, and Y., . . Wang, synteny X., gene Wang, . of J., doi:10.1093/nar/gkr1293 analysis . Li, e49. evolutionary X., J., (7), and Tan, X. detection D., Yin, for J. China. Debarry, L., toolkit H., Shenzhen, L. Tang, Li, in Y., Wang, F., J. and Wang, human Y., 75-86. of of X. (Complete), identification remains Jiang, Meyerson, and on Y., & discovery M. succession W., the Ma, C. Y., for Whelan, Wang, T., tool hosts. Sharpe, computational eukaryotic S., customizable from doi:10.1093/bioinformatics/bty501 Bullman, a libraries 4289. I., in PathSeq: A. sequences GATK Ojesina, microbial S., (2018). C. Pedamallu, M. A., M. Walker, 9 One, PloS 8,1586-1591. (8), Ccnetdsrbtoso h sebe genome. assembled the of distributions content GC of . 
specimen adult female A . W) 5-5.doi:10.1093/nar/gkz333 W52-W58. (W1), 1) 112963. (11), 5,865. (5), ora fMdclEtmlg,54 Entomology, Medical of Journal 0-2.doi:10.1146/annurev.ento.52.110405.091423 103-120. , .peregrina S. 12 6,1491-1497. (6), . iifrais(xod nln) 34 England), (Oxford, Bioinformatics ern 39 Neuron, N eerh 21 Research, DNA oescSineItrainl 271 International, Science Forensic oeua ilg n Evolution, and Biology Molecular 1,1711 doi:10.1016/s0896- 147-161. (1), ultno netlg,66 Insectology, of Bulletin n.ScEtml r,47 Fr., Entomol. Soc Ann. uli cd eerh 40 research, acids Nucleic 3,231-241. (3), cec,344 Science, uli acids Nucleic 2) 4287- (24), Annual (6182), Nature (2), , Posted on Authorea 17 Mar 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158446791.17846463 — This a preprint and has not been peer reviewed. Data may be preliminary. al S24 Table 2: file Additional S23. Table S22. Table S21. Table S20. Table S19. Table S18. Table S17. Table S16. Table S15. Table S14. Table S13. Table S12. Table S11 Table S10 Table S9. Table S8. Table S7 Table S6. Table S5. Table S4. Table S3. Table annotation. S2. Table S1. Table S9 Figure S8 Figure genomes. insect S7 Figure peregrina S6 Figure estimation. size genome S5 Figure S4. Figure S3 Figure n te v itrnspecies. dipteran five other and h opeeeso h rf sebe eoewsasse sn BUSCO. using assessed was genome assembled draft the of completeness The . ttsia nlsso iCrwdt fe ult control. quality after Xten. data HiSeq raw Illumina Hi-C from of data analysis raw Statistical Hi-C the and library of Construction assembly. platform. genome PacBio the the of on Statistic based data calling. sequencing SNP raw using of estimated Statistic was genome assembled of heterozygosity The using estimation heterozygosity and size Genome samples. sequencing genome for Evaluation iegnetm siainars 0isc species. insect 10 across estimation time among Divergence tree . phylogenetic (ML) likelihood maximum The . of distribution depth GC The . 
Oercmn nlsso xaddgenes. expanded of analysis enrichment GO . BUSCO. using assessed was genome assembled Hi-C the chromosomes. of six completeness in The contigs . clustered the of size Genome . rqec fthe of Frequency . oprtv ersnaino rhlgu n aaoosgnsaindwt te nine other with aligned genes paralogous and orthologous of representation comparative A . oprtv nlsso rdce noaino eesrcuecaatrsisbetween characteristics structure gene of annotation predicted of analysis Comparetive . et itiuinof distribution Depth esosadmi aaeeso h otaeue nti study. this in used software the species. of insect parameters 10 main among and families Versions gene genomes. contracted insect and nine expanded other of with Statistic aligned genes orthologous Single-copy species. of insect Statistic 10 among assembly. families genome gene the of in Statistic genes non-protein-coding for annotation of Statistic classification. SSR of Summary distribution. SSR of genome. Statistic assembled in sequences repetitive assembly. of genome Annotation the in sequences repetitive of genes. Classification predicted for annotation functional species. of other Statistic with genome the assembly. of genome statistics the Comparative of annotation genes protein-coding of Summary rncitm eunigsaitc rmteIlmn ie oass rti oiggenes coding protein assist to HiSeq Illumina the from statistics sequencing Transcriptome 17 .peregrina S. mrdphdsrbto uv nteerrcretdrasue othe to used reads error-corrected the in curve distribution depth -mer .peregrina S. genome. 13 genome. 17 mranalysis. -mer .peregrina S. n te ieinsects. nine other and S. Posted on Authorea 17 Mar 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158446791.17846463 — This a preprint and has not been peer reviewed. Data may be preliminary. provides-insights-into-the-evolutionary-adaptation-of-flesh-flies changes.docx track nodes. 
articles/433277-chromosome-level-de-novo-genome-assembly-of-sarcophaga-peregrina- with calibration the manscript indicate circles Revised near red number two The each and for times, clade. file families divergence each gene Hosted the (blue) for show contracted families and numbers (red) gene black expanded The (red) significantly contracted of clade. number and the (blue) indicates branch expanded each of number the shows 3 Fig. species. selected to all relationship among evolutionary identified close families between the of clusters with of pseudochromosomes orthologous species represents Chromosomes of five respectively color. distribution 4” same the 3R, the shows 3L, with diagram 2R, lines 100-kb 2L, by the “X, linked at “Chr1-6”, is links as contact species of two number the indicated across are match which (low), white to (high) resolution. red from female. interactions Hi-C adult of an shows Densities circle species. density. the other gene in 2 (V) photo (%); The content Fig. windows. GC Kb (IV) 500 content; in transposon calculated LTR are (III) content; transposon DNA 1 Fig. LEGENDS FIGURE S27. Table S26 Table S25 Table oprtv eoi nlssamong analyses genomic Comparative eoi adcp of landscape Genomic Chromosome-level b Oercmn nlsso oiieyslce genes. selected positively of analysis enrichment GO genes. . expanded of analysis enrichment KEGG . EGercmn nlsso oiieyslce genes. selected positively of analysis enrichment KEGG a hoooeclierbok between blocks collinear Chromosome otgcnatmti fteasmldgnm.Teclrbro h ih hw h density the shows right the on bar color The genome. assembled the of matrix contact Contig enovo de .peregrina S. eoeassembly genome rmotrt ne:()szso suohoooe;(II) pseudochromosomes; 6 of sizes (I) inner: to outer From . .peregrina S. vial at available 14 .peregrina S. .peregrina S. .peregrina S. n ieohrseis h ubro h branch the on number The species. other nine and .peregrina S. and eeson.Tenmesidct gene indicate numbers The shown). 
were https://authorea.com/users/303160/ n oprtv eoeaayi with analysis genome comparative and .melanogaster D. n te is(o lrt,only clarity, (for flies other and .melanogaster D. .peregrina S. eoe.Tebest The genomes. r marked are . c Venn Posted on Authorea 17 Mar 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158446791.17846463 — This a preprint and has not been peer reviewed. Data may be preliminary. 15 Posted on Authorea 17 Mar 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158446791.17846463 — This a preprint and has not been peer reviewed. Data may be preliminary. adaptation-of-flesh-flies de-novo-genome-assembly-of-sarcophaga-peregrina-provides-insights-into-the-evolutionary- 1.docx Table file Hosted vial at available https://authorea.com/users/303160/articles/433277-chromosome-level- 16