Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. INZHOU QIAN leopardus of , assembly coral genome leopard chromosomal-scale and sequencing novo De ewiZheng Weiwei muerltdptwy.I diin eietfid51843SP ae ngnm eeunigo 54 of resequencing genome on based SNPs annotated. 5,178,453 identified functional in we were expansions addition, of 96.5% family entirety In which gene pathways. good among that a related genes, revealed indicating immune retrieved, analyses protein-coding BUSCO were genomic scaffolds. 25,248 genes pseudo-chromosomes total Comparative single-copy predicted 24 a conserved of We has consisting the assembly. assembly Mb, of 34.15 the genome 97.2% of The N50 that technologies. scaffold showed a sequencing analysis with read the Mb long 881.55 report of PacBio we length and Here (Hi-C) programs. capture for breeding resource formation molecular genomic and and genomic of of studies assembly lack biological taste, the However, its delicious countries. Oceans. hinders and many Indo-Pacific of in appearance water aquaculture subtropical body for and fish tropical appealing the its in distributing to widely Due fish reef coral carnivorous a is grouper, coral farming leopard the The in programs breeding genomic of Abstract development were the 96.5% benefit biological which and to were evolutionary among expected genetics, genes genes, is pseudo- for single-copy protein-coding it 24 resource conserved Particularly, genomic into 23,234 industry. the valuable species. oriented predicted a of this and We provides 95.3% of assembly. clustered genome that were studies the leopardus showed P. scaffolds of The analysis The entirety annotated. respectively. BUSCO functional good Mb, data. a 33.47 Hi-C and indicating valid Kb retrieved, Gb 31.8 13.39 of with N50 chromosomes scaffold 10 body 10 Gb Here and using appealing 127.36 contig genome programs. its breeding leopardus Using a P. genomic to technologies. of and Due (Hi-C) assembly studies However, biological and capture countries. its sequencing Oceans. conformation many hinders novo in Indo-Pacific leopardus de P. aquaculture the of for for report resources fish water we molecular commercial subtropical and popular carnivorous genomic a and a of become lack tropical has is the leopardus the Epinephelinae, P. taste, family in delicious Plectropomus, distributing and genus appearance widely to fish belonging reef leopardus, Plectropomus coral grouper, coral leopard The Abstract 2020 28, April 6 5 4 3 2 1 Chen elwSaFseisRsac Institute Research Fisheries Sea Yellow University Yat-Sen Sun available not Affiliation University Sciences Ocean Fishery Guangdong of Academy Chinese BGI-Shenzhen Institute, BGI-Qingdao, Research Fisheries Sea Yellow 6 .leopardus P. 1 1 iy Guo Xinyu , inh Zhang Tianshi , eoeuigacmiaino 10 of combination a using genome lcrpmsleopardusPlectropomus 2 agHuang Yang , 4 hnx Tian Changxu , 3 ayn Gao Haoyang , eogn ognsPetoou,fml Epinephelinae, family Plectropomus, genus to belonging , 1 × 3 eoisw eeae 0.0M eoeasml with assembly genome Mb 902.90 a generated we Genomics hnu Zhu Chunhua , × .leopardus P. eois ihtruhu hoooecon- chromosome high-throughput Genomics, 2 × a Xu Hao , eoisadhg-hogptchromosome high-throughput and Genomics .leopardus P. a eoeapplrcommercial popular a become has 3 arnLin Haoran , 1 hnhnLiu Shanshan , enovo de eeascae with associated were 5 n Songlin and , eunigand sequencing .leopardus P. 4 , Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. auatrrsisrcin h nert n ult fteetatdDAwseautduig1 gel (Plextech, 1% Analyzer using DNA/Protein evaluated Pultton was the a to DNA using according extracted assessed Germany) the was (Qiagen, of concentration kit quality DNA purification and The DNA integrity electrophoresis. QIAamp The a instruction. using manufacturer’s extracted was DNA Genomic sequencing and Sampling 2.1 methods and Materials conservation resource as geno- well this of as addition, studies studies, In ecological comparative Epinephelinae. aquaculture. and and in family phylogenetic biological program evolutionary, within genomic, breeding future for genomic facilitate its resource will promote crucial me efficiently a will provide and will species, grouper this coral leopard of of annotation data and genomic species. by assembly this of revealed genome 10 of exploitation chromosomal-scale mechanism using breeding a insufficient metabolic genomic report The and muscle we 2017). conservation Here the study, al., genetic and transcriptomic et the 2015), the (Mekuchi limits 2010), al., analyses largely al., et resource metabolome et this (Wang and (Zhang of morphs expression studies markers 2005) genetic colour gene microsatellite Anderson, and two of Genomic & development in 1997). (Frisch the Jones, comparison responses in & Light stress reported 1999; were physiological Carsonewart, species 1998), & (Leis (Zeller, biology biology behaviour and reproductive 2006), control. al., metabolic et and in breeding studies genomic Previous as such technology, increasing of The aquaculture aquaculture 2012). advanced (Fabinyi, the requires colour, price of which trading skin development high rapid beautiful a a with and trigger worldwide complex brown. taste demands species by and impressive commercial characterized red flesh, important also bright an high-protein are become as and They such low-fat 1995). colour, to (Ferreira, body life Due of in variety the later a of male Tonga and waters to and structures , sex-reverse Fiji subtropical social Islands, and other or Caroline females most tropical the of Like to in out eastwards 2019). inhabits and Pauly, Australia naturally & to It (Froese Japan southern Plectropomus. from genus Oceans Indo-Pacific in fish representative grouper, a groupers. coral the of leopard studies (Saad, The ( biological and grouper date, evolutionary family to giant taxonomical, However, of the ( the groupers. classification grouper species, hinders of red-spotted the taxonomy previous grouper the However, the a of and 2007). clarify on 2019) sequences with to Hastings, 16 genome needed impact & discussion in are two (Craig great in species evidence only sequences have still genomic 165 gene more groupers is approximately H3 Therefore, the of species histone 2019). and comprised of these 16S variety is of 12S, Epinephelidae and taxonomy on family abundance based The biological The classified ecosystem. important genus, ecosystem. are complex reef predators, fairly coral carnivorous this main the the in as members acting Epinephelinae), (, Groupers Introduction annotation genome development the benefit industry. to farming expected Keywords the is in it Particularly, programs species. breeding grouper genomic the of of studies biological and lutionary The individuals. × eoi n iCsqecn ehooy h elanttdgnm n h asv sequencing massive the and genome well-annotated The technology. sequencing Hi-C and Genomic epr oa grouper, coral leopard : .leopardus P. .leopardus P. lcrpmsleopardus Plectropomus eoeadvraindt rvd aubegnmcrsuc o eeis evo- genetics, for resource genomic valuable provide data variation and genome anyfcsdo pce lsicto Hrio ta. 04 Herwerden 2014; al., et (Harrison classification species on focused mainly pnpeu akaara lcrpmsleopardus Plectropomus .leopardus P. lonml oa ru rsotdcrlgopr is grouper, coral spotted or trout coral namely also , 2 G ta. 09,aeaalbe hc significantly which available, are 2019), al., et (Ge ) .leopardus P. eoesqecn,crmsmlassembly, chromosomal sequencing, genome , spooyoshrahoiim starting hermaphroditism, protogynous is pnpeu lanceolatus Epinephelus nmn sa onre n regions, and countries Asian many in .leopardus P. .leopardus P. hc a generated was which , Zo tal., et (Zhou ) a recently has Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. BSO Wtros ta. 07 n Ccnetaaye.Tesnl oyotoouso of orthologues copy single Orthologs The Single-Copy analyses. Universal content Benchmarking GC by and assessed 2017) al., were et assembly (Waterhouse genome (BUSCO) the of completeness The ntedatasml X ta. 09.Teln eune eesltit he rus nldn total including groups, three into gaps split close were to sequences correction long –r error The 0 any without 2019). reads al., –min –min long et options sequencing (Xu (with which molecule assembly reads TGS-GapCloser, single employed draft we 10x) assembly, the genome in ([?] the into depth of scaffolds accuracy low and and uses contigs integrity primary the (Durand improve the Juicebox further anchor To with to visualized al., used and et was built (Dudchenko 2016). were 24” (v170123) maps al., -c contact 3d-dna et 0 intra-chromosomal Hi-C Then -s / raw 2015). inter haploid the al., The “-m Firstly, 2012) chromosomes. et of approach. al., (Servant parameters scaffolding et (v2.8.0) with Hi-C (Luo HiC-Pro 2017) the (v1.12) with using Gapcloser filtered assembly with were chromosomal-scale filled reads a were into assembly oriented initial and the “avg in of parameters Gaps the . with 2017) al., et (Weisenfeld = We measured G was formula the distribution with k-mer calculated The was 17-mers. 10,000. size & of genome of (Marcais The count 2017). (v2.2.10) kmer al., maximum Jellyfish et N (Vurture a with GenomeScope and computed using 17 plotted was the = and frequencies k of counts using heterozygosity 2011) k-mer Kingsford, The and reads. size quality-filtered genome data the RNA-Seq the clean estimated Gb assessment We 69.34 quality and and with 2018), assembly performed al., Genome were 1). et reads 2.3 (Table raw (Zhou analyses the (Illumina, RNA-QC-Chain (BGI, further of data, Kit platform for control RNA-Seq remained Prep sequencing quality were for The BGISEQ-500 Library data. pipeline a RNA raw QC on Ultra Gb The a sequenced 70.36 NEBNext CA, USA). was producing (Invitrogen, the (Agilent, China), protocol, reagent Bioanalyzer using Shenzhen, manufacturer’s 2100 TRIzol prepared the Agilent leopard using were following the the fin, which USA) using of and libraries, evaluated tissues muscle was six RNA-Seq quantity spleen, from six and RNA skin, integrity total liver, RNA extracted gonad, USA). we including genes, protein-coding grouper, the of coral using prediction generated. the were Kb cell facilitate sequencing 20 SMAT To RNA one library of and from The size data collection protocol. and fragment Tissue manufacturer’s system, 2.2 a the Sequel following PacBio with a USA) library with (PacBio, SMRTbell sequenced 1.0 terminated was a kit and was 2018) preparation constructed fixation al., template we et the China). SMRTBell (Gong sequencing, Shenzhen, and protocol (BGI, library 1% long-read platform Hi-C For of sequencing the concentration following BGISEQ-500 prepared a a was using in library sequenced Hi-C formaldehyde DNA then A Genomic with glycine. sequencing. M fixed for 0.2 was library using Hi-C samples filtered. constructed were blood ha- we assembly, rate) reads in genome error the chromosome-scale 1% and a (representing reads obtain adaptor-contaminated 20 To the than reads, lower value duplicated quality The 2013). a al., ving lengt- et read (Zhou with QC-Chain China), Shenzhen, and (BGI, platform BGISEQ-500 2 using of hs produced were reads Raw USA). cisco, 10 A libraries. 20 sequencing [?] the amount construct to total used a with DNA USA). 17-mer enovo de × on ) n ahgopwr sdt l h orsodn lge gaps. aligned corresponding the fill to used were group each and 3), round d –min 0 idy × eoislne-edlbaywscntutduigtesadr rtcl(10 protocol standard the using constructed was library linked-read Genomics 0 p h a ed eete lee ihFsQ (http://www.bioinformatics.babraham.ac.uk/projects/fastqc) FastQC with filtered then were reads raw The bp. 100 /D 17-mer sebe h 10 the assembled hr h N the where , ac –r 0 match d .,–min 0.2, idy n=6,max ins=364, on )adraswt eghi -0b(ihotos–min options (with 2-20kb in length with reads and 3) round 17-mer × eoissotrasit otg n cfflsuigSproa(v1.2) Supernova using scaffolds and contigs into reads short Genomics stettlnme f1-es n D and 17-mers, of number total the is ac 0 –r 200 match μ n=0 n min and ins=500 ,1.8 g, < OD 3 on ) ed ihlnt ? 0k wt options (with kb 20 [?] length with reads 1), round 260/280 n=6” h rf sebywste anchored then was assembly draft The ins=260”. .leopardus P. < . n concentration a and 2.0 eoeb -e nlss using analysis, k-mer by genome 17-mer eoe h ekfrequency peak the denotes × eois a Fran- San Genomics, > d –min 0 idy 25ng/ 12.5 μ match were l - Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. h xaddadcnrce eefmle sn AE(eBee l,20) h nihetanalyses enrichment The 2006). identified al., and et 80%) Bie (De [?] identity CAFE [?]1e-5, using (E-value families orthologous Blast the gene identified all-to-all lengths contracted we with Secondly, using and regions 2009). species, expanded al., mapped three the et the these (Krzywinski with (v0.69) among constructed Circos groups were using visualizations collinearity for chromosomal Kb The 2007). (Harris, the lanceolatus of E. assembly genome the compared We analyses genomic Comparative the predict repli- to bootstrap used (http://www.timetree.org/). 500 was with of 2007) 2010) (Yang, time package al., divergent latipes PAML et phylogenetic The the (Guindon a in times. (v3.0) implemented genes, divergence PhyML program orthologous using MCMCtree single-copy model The the Bayes cations. Using with generated organisms. was Epinephelus the tree oculatus, all and Lepisosteus among protein morhua, relationships the Gadus paralogous comparing rerio, by Danio latipes, families lancelatus Oryzias gene orthologous niloticus, Oreochromis the tus, identified the we among sequences tree, cDNA phylogenetic the estimation define 2007). time To al., divergent et and can- (Elsik analysis Glean The Phylogenetic using (v5.5.0) 2.6 clean and set 2016). TransDecoder 2016) Gb gene were with al., consensus methods 69.34 al., predicted a et above of et into then the (Pertea merged (Pertea were from total (v2.1.1) predicted sequences (v2.0.10) A StringTie genes transcript HISAT2 Finally, using (https://github.com/TransDecoder/TransDecoder/). within respectively. using detected regions sequences fin, were protein-coding accu- genome and structures didate for assembled with muscle transcript 2002) constructed the spleen, putative al., libraries to the skin, RNA-Seq et aligned liver, six (Doerks were from gonad, (v2.4.0) data generated GeneWise including were using tissues, data Transcriptomic proteins six matching alignments. the spliced against rate aligned then were the to aligned latipes Oryzias akaara, Epinephelus of the sequences with protein 2006) tion, al., genes. et protein-coding of (Stanke the parameters detect employed we to with genome, dictions masked genome repeat the the on in Based annotation repeats and tandem prediction MaxPerid=2000”. the Gene and 2.5 identify Minscore=50, PI=10, to PM=80, 1999) Delta=7, Mismatch=7, (Benson, predicted “Match=2, were v4.09) repeats (TRF, Novel Finder proteins. TE-relevant the the on (v3.2.2) LTR based detect RepeatProteinMask (http://www.repeatmasker.org) to 2015). RepeatModeler al., used et using was (Bao (v14.06) RepeatMasker library TE in RepBase implemented the with 2009) Chen, & Graovac the and in database sequences repeat the repetitive in the found annotated was We contamination annotation external sequence No Repetitive filtered. 2.4 were N’s windows and 50% sliding content non-overlapping than GC Kb more The 10 genome. with harboring tool. measured windows BUSCO also the using were genome genome and the assembled across the depth against sequencing searched average were v2.0) (BUSCO, obd9 .leopardus P. idr(u&Wn,20)adRpaSot(rc ta. 05.I diin eue admRepeat Tandem used we addition, In 2005). al., et (Price RepeatScout and 2007) Wang, & (Xu Finder and and G.morhua and pnpeu akaara. Epinephelus .leopardus P. ota fteohrEiehlsseisuigteLSZto v.20)wt eal options default with (v1.02.00) tool LASTZ the using species Epinephelus other the of that to .akaara E. enovo de eeue sclbaintm,wihwsdwlae rmteTmTe database TimeTree the from downloaded was which time, calibration as used were ai ei,Tkfg urps atrsesauets pnpeu lanceolatus, Epinephelus aculeatus, Gasterosteus rubripes, Takifugu rerio, Danio rdcin.Konrpaswr dnie sn eetakr(330 (Taraio- (v3.3.0) RepeatMasker using identified were repeats Known predictions. .leopardus P. isl,t eeltercliert eainhp,w lge h hoooe of chromosomes the aligned we relationships, collinearity their reveal to Firstly, . eoeuigtLSn(-au[]e5.Tehmlgu eoesequences genome homologous The (E-value[?]1e-5). tBLASTn using genome ai rerio Danio rea v.)(ie l,20)wsue odfieteotooosand orthologous the define to used was 2006) al., et (Li (v4.0) TreeFam and .leopardus P. yolsu semilaevis Cynoglossus n ietlot,including teleosts, nine and .rerio D. riigstaddfutstig.Frhmlg-ae predic- homology-based For settings. default and set training enovo De .leopardus P. enovo de and ihtepbihdgnmso rue pce,including species, grouper of genomes published the with 4 eepeito a efre sn uuts(v2.7) Augustus using performed was prediction gene .rubripes T. oooybsdadtasrpoeasse pre- transcriptome-assisted and homology-based , eoewt ohhmlg erhn nknown in searching homology both with genome eedwlae rmNB aaaeand database NCBI from downloaded were , enovo de aiuurbie,Gseotu aculea- Gasterosteus rubripes, Takifugu .aculeatus G. eetlbaycntutdwith constructed library repeat and .rubripes T. and , > O. 2 Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. admrpas eeaigattlo 9.9M frpttv eune,ocpig3.8 ftewhole the of 33.38% occupying sequences, repetitive of Mb and 298.99 novel known, of of ( total combination assembly a a genome by generating obtained were repeats, sequences tandem repetitive non-redundant and consensus 39.65%. The of content annotation GC genes, average sequence ( orthologue an Repetitive copy with genes 3.3 concentrated, single fragmented relatively conserved 3.2% were the and depth of complete sequencing 97.2% and the retrieved of assembly 94.0% the ( including that Mb showed 41.71 ( analysis to Mb BUSCO 34.14 Mb of 15.72 N50 from scaffold ranging a lengths the and TableS1 close chromosome the Kb to with of 855.69 x, ˜20 length of pseudo-chromosomes, of total N50 24 depth contig the completeness a Finally, the a with improve TAG-Gapcloser. with reads, further using sequence To Mb, long assembly Kb. PacBio the 33.60 used in we of assembly, contig gaps chromosomal- genome a the a with of into length, accuracy oriented in and ( and (Mb) anchored approach megabase then 902.46 scaffolding were of Hi-C assembly assembly the draft the using in assembly scaffolds scale and contigs were ( The reads demonstrated dominant 2017). was short A peak Genomics analysis. 0.635%. homozygous 10x 17-kmer be the the The to with to Mb estimated corresponding 871 was be distribution heterozygosity to the k-mer size and 17 genome the the estimated of we peak data, clean the on Based to mapped assembly ends genome two ( with Chromosomal-level Gb reads, 3.2 18.05 Hi-C ( valid for scaffolding on as reads control reads Hi-C 1,255,828 Quality raw for generated total data. useful technology the Hi-C are of Gb which 10.60% ( 126.37 contigs, to data to different resulted clean amounting finally Gb pairs, data 127.36 read obtained Hi-C total we raw the a filtering, 631,842,593 producing quality linked-read obtained After China), Genomics we Shenzhen, reads. 10x sequencing, (BGI, raw A platform of Gb BGISEQ-500 sequencing. 152.61 on genome of sequenced for and used constructed was was China), library (Laizhou, Company Aquatic Mingbo a of DNA estimation Genomic set. size SNP genome final and the Sequencing in only 3.1 kept and were 0.1 SNPs, et discussion the [?] (McKenna and rate filtered GATK Results missing further using and We population-scale 0.05 depth VCFtools. a [?] of using on (MAF) quality calculated of frequency performed were criteria then the frequencies were satisfying allele SNPs calling The parameters score SNP default with quality 2010). software having BWA 2009). al., using bases assembly Durbin, 10% genome & with the to (Li reads al., mapped (4) et low- were and (Zhou reads of quality-filtered adapters, QC-Chain influence The The containing to using reads potential aligned filtered bp. (1) the reads and 300 (3) types: avoid checked of following reads; To were the duplicated size reads in platform. raw insert reads libraries analysis, 2000 an removing Pair-ended subsequent HiSeq 2013), with the fish. Illumina USA), in each the (Illumina, reads of on quality protocol tissues fin conducted standard from was the extracted sequencing to were of according DNA individuals constructed Genomic 54 were China. of total of a Province sampled Hainan we expanded resequencing, the genome For of calling implications SNP functional and identify Genome-resequencing to adjusted 2.8. test, performed exact were (Fisher’s annotations genes KEGG contracted and and GO on based ). al 4 Table .leopardus P. .Terpttv eune a opie fDAtasoosi 4.8Mb 146.68 in transposons DNA of comprised was sequences repetitive The ). niiula rud1ya-l 05kg) (0.5 year-old 1 around at individual enovo de sebe ihSproasfwr v.)(esnede al., et (Weisenfeld (v1.2) software Supernova with assembled > .,mpigquality mapping 2.0, p al 1 Table iue2b Figure 5 -value al 1 Table < ). 0.05). al 3 Table .A eut eotie rf genome draft a obtained we result, a As ). .Snl oeuesqecn ihPacBio with sequencing molecule Single ). .leopardus P. > 0 ndnie uloie Ns;(2) (N’s); nucleotides unidentified 10% al 2 Table > Fgr 1) (Figure .Tedsrbto fG content GC of distribution The ). 0 N quality SNP 40, .leopardus P. rmtofrigfcoisin factories farming two from .Tegnm sebyhad assembly genome The ). al 1 Table hc a rvddby provided was which , eoews881.55 was genome > Supplementary 0 io allele minor 30, .FrteHi-C the For ). iue2a Figure < 20. ) Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. . eoer-eunigfrSPcalling SNP for found. re-sequencing was ( Genome pathway pathways KEGG 3.7 enriched related no immune however, contracted, and were systems families immune ( of S4 linage number ( Table Epinephelus expanded a the of significantly in ancestor were enriched common 126 recent which most among its ), to in compared expanded 18,007 were and families 18,674 18,336, with with to groupers, numbers specific three were The genes the 4,427 families. in gene similar orthologous lanceolatus highly constructing E. were by families analyzed gene was of evolution family gene Furthermore, ( collinearity (2 The otype including analyses. Epinephelus), genomic (genus comparative groupers two for available tus only are sequences genome groupers Currently, other with comparison Genomic 3.6 G. selected is of the linage linage of grouper the relationships phylogenetic with orthologues, the ancestor copy revealing single common were constructed, 4,134 was of ( total tree a phylogenetic teleost identified a we teleosts, which selected on 10 based of genes and genomes the Using time divergent et and (Kalvari Phylogeny database and annotated 3.5 Rfam were sequence using RNAs (http://infernal.janelia.org/) rRNA 2013) nuclear human ( rRNA Eddy, Small 2018) the & 1,230 (Nawrocki al., respectively. against tool 2019). database, homology infernal Lowe, these 2014) searching the & by of by Griffiths-Jones, (Chan identified & one tRNAscan-SE were (Kozomara least using microRNAs miRBase at identified 324 were in and tRNAs annotated genes 843 24,364 functional of genes, successfully total non-coding were a For Finally, genes) respectively. 70.3% predicted database, and GO all 88.9% ( and and of databases InterPro (GO), out Gene3D, in Ontology protein annotated (96.48% PIRSF, Gene multiple were and genes in PRINTS, respectively. genes 2019) domains TIGRFAMs, predicted database, al., protein PANTHER, the et TrEMBL identify of (Mitchell HAMAP, and SMART) to (ProDom, KEGG Pfam, 2014) PROSITE, InterPro SwissProt, al., databases, COILS, in and of et gene 84.6% hits (Jones databases public 92.3%, (v5.0) positive domain several result, InterProScan got a in employed genes As also sequences predicted We (E-value[?]1e-5). protein the BLASTp the using of TrEMBL, comparing 96.4% and by KEGG genes SwissProt, predicted including the annotated We including species, ( teleost number ( other exon to genes and compared protein-coding were 25,248 models rubripes of gene composing predicted set the of gene non-redundant a generated from results the Combining annotation and repeats unknown prediction and Gene 4.27%), (LTRs; 3.4 Mb nuclear 38.27 ( interspersed in (9.79%) short repeats Mb 5.43%), terminal long (LINEs; 87.71 0.38%), Mb in (SINEs; 48.68 Mb in 3.41 elements in interspersed elements long assembly), the of (16.37% and .lanceolatus E. E.akaara n iue4 Figure hwn iia itiuinpten nmN egh D egh xnlnt,ito length intron length, exon length, CDS length, mRNA in patterns distribution similar showing , 4) n h hoooesnei oprsn hwdta hyhv ihlvlo genomic of level high a have they that showed comparisons syntenic chromosome the and =48), ,idctn nipoe aaiyfrrssac odsae in diseases to resistance for capacity improved an indicating ), upeetr al S3 Table Supplementary upeetr al S2 Table Supplementary iue5a Figure and orva h iiaiisaddffrne ftegoprgnms ecnutdfunctional conducted we genomes, grouper the of differences and similarities the reveal To . iue3) Figure .W on htthe that found We ). n 699gnswith genes 16,929 and , .akaara E. aculeatus al 4 Table ). enovo de .leopardus P. epciey oa f1,9 ee eesae ytetretlot and teleost, three the by shared were genes 15,497 of total A respectively. , . hc eaae ihtercmo netr˜12ma( mya ˜71.2 ancestor common their with separated which , ). oooybsdadtasrpoeasse rdcin,w successfully we predictions, transcriptome-assisted and homology-based , .leopardus P. .lanceolatus E. ). ). ( iue5b Figure .leopardus P. .akaara E. n h pnpeu rue pce aetesm kary- same the have species grouper Epinephelus the and p and < .W lofudthat found also We ). 6 .5.Teepne eefmle eesignificantly were families gene expanded The 0.05). epciey In respectively. , .akaara E. iegd˜5. ilo er g ma rmthe from (mya) ago years million 57.2 ˜ diverged h otcoeyrltdseist the to species related closely most The . .leopardus P. .leopardus P. .leopardus P. iue5d Figure .rerio D. al 5 Table oa f79gene 799 of total a , oa f1 gene 12 of total A . hrd1,7 genes 17,375 shared , , Figure4 .latipes O. Supplementary .Testatistics The ). .leopardus P. .lanceola- E. iue5c Figure ). and T. , Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. esnG(99 admrpasfidr rga oaayeDAsequences. eukaryotic DNA in analyze to elements program repetitive a of finder: 573-580. database repeats , Tandem a (1999) Update, G Repbase Benson (2015) O Kohany KK, genomes. Kojima W, Bao re-sequencing Genome PRJNA622646. SRR11482498). References BioProject (SRR9330009, at (SRA) deposited Archive been Read has Sequence data the Assembly in NCBI deposited and GCA (VJNF00000000) been accession DDBJ/ENA/GenBank GenBank at the deposited with database been has genome assembled The Statement Accessibility Data and reviewed authors All None. manuscript. the genome interests wrote and the Q.Z. of H.X. performed Conflicts Q.Z., data. and the DNA project. analyzed manuscript. genomic the H. final the managed Y. the extracted C.H.Z. approved and X.Y.G. and G.H Q.Z., S.L.C. samples. sequencing. sequencing project. the the collected conceived C.X.T. C.H.Z. and H.R.L S.L.C., [2017ASTCP- (Qingdao) contributions Technology Project. Author and Climbing Science Scholar Marine (Qingdao) Taishan for Technology Shandong Laboratory and Pro- and National Cultivation OS15] Science Talents Pilot Marine AoShan by (2016LZGC009),the for Supported Science Project gram Laboratory Fisheries Variety Laboratory National Marine Superior Guangdong Pilot Shandong for Engineering (2018-MFS-T08), Processes, Laboratory and by Production Science Supported Food Marine Southern Program and Talent of Youth Fund the (ZJW-2019-06), by (Zhanjiang) supported was work This and analysis the genetic of accurate aquaculture promote Acknowledgements in The will programs adaptions. annotations, breeding and genomic functional traits the phenotypic their their accelerate with underlying together diversity genetic variations, the genomic dissect to to first and addition the evolution in supplies genetics, genomes, annotation Epinephelidae and the assembly akaara implement genome E. and The Plectropomus technologies. genus sequencing of the read genome of long assembly PacBio genome and chromosomal-scale Hi-C a differences. provide phenotypic we the Here in the roles dissect traits, important to playing agronomical individuals work, genes corresponding future Conclusion of key the the identify of dissection In phenotypes to strains. analysis, recorded and superior and associations structure for variations genomic genomic breeding population selective these use genomic as will for such we and sweeps, studies, nonsynonymous effects selective and and of genetic synonymous identification location were the ( The 57,082 regions Based for and filtering. coding genome. 132,709 quality resource the in that after to showing locating for SNPs mapped respectively, annotated, Gb 5,178,453 reads also SNPs, 12.39 of total were averagely the total SNPs of an a these % identified with of 99.63 we data, with alignments, clean individuals, these each Gb ( on for 668.90 14.0x genome of of assembled depth the total sequencing to a mean a obtained and we fish filtering, each quality of After individuals reads. 54 of re-sequencing genome The oieDNA, Mobile hsspligipratgnmcdt o hl-eoeaayi oeuiaetepopulation the elucidate to analysis whole-genome for data genomic important supplying thus , 6 11. , 0799..Rwsqecn aafor data sequencing Raw 008729295.1. .leopardus P. al 6 Table upeetr al S5 Table Supplementary 7 .TeeSP ilpoiea motn genomic important an provide will SNPs These ). rdcdattlo ,3,5,4 a sequencing raw 2,232,057,448 of total a produced .leopardus P. .leopardus P. yitgaino 0 Genomics, 10x of integration by .Tecenraswr aligned were reads clean The ). . uli cd Research, Acids Nucleic .leopardus P. .lanceolatus E. eoehas genome and 27 Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. oe ,BnsD hn -,Fae ,L ,MAul ,MWlimH alnJ icelA Nuka A, Inter- Mitchell (2014) J, S Maslen Hunter R, H, Lopez McWilliam S-Y, C, classification. Yong McAnulla function M, protein W, Scheremetjew genome-scale A, Li Sangrador-Vegas 5: M, AF, ProScan Fraser Quinn west H-Y, S, and Pesseat Chang Contrasting east G, D, (2006) from Binns (Serranidae) MV P, Plectropomus Oppen trout Jones polymorphisms. A, coral Frisch ancestral the or SJ, of hybridisation Newman species Introgressive two G, in Australia: Carlos structure CL, genetic of Dudgeon patterns JH, hybridizing Choat Serranidae). (2014) two LV, spp., ML in Herwerden (Plectropomus Berumen discrimination DH, fish species Williamson reef and S, coral analysis Perumal of parentage H, species for Mansour multiplexes University. K, microsatellite State Ma Pennsylvania GP, of Validation Jones The KA, DNA. Feldheim genomic HB, of Harrison alignment pairwise Improved (2007) Mei RS J-F, Harris 3.0. Gui PhyML N, of methods and performance Chen algorithms the New X, (2010) assessing Li O 59 Gascuel phylogenies: J, W, maximum-likelihood Hordijk Zhang estimate M, Anisimova Y, to V, He Lefort and J-F, J, sequencing Dufayard Wu S, DNA Guindon Y, third-generation using Xiong L, P, genome Liu catfish Huang J, red-spotted yellow analysis. Wu W, of of Hi-C Q, Guo assembly genome Lin S, Chromosomal-level reference C, (2018) Xiao chromosome-level Zhou J C, Z, a Huang Dan of Hi-C. Y, G, assembly and Zhang novo Gong sequencing Z, De Wang nanopore (2019) Z, using akaara) Zhang L Y, (Epinephelus Zheng Wang grouper Z, S, Wu Huang Plectropomus M, J, publication. Shen ( Hu electronic K, trout Web Lin Wide coral H, World Ge of FishBase. species (2019) D two Pauly of R, Froese responses ). stress maculatus Physiological Plectropomus (2005) and T leopardus Australia. Reef, Anderson Barrier A, Great Frisch northern Epinepheli- and (Serranidae: leopardus central Plectropomus the trout from coral bee common nae) the honey of Reproduction a (1995) BP Creating Ferreira China. in (2007) consumption seafood GM luxury on Weinstock perspectives social DS, Conserv, and cultural Roos Historical, (2012) NV, a M Fabinyi Milshina Provides Juicebox JT, (2016) set. Reese EL gene AJ, Aiden consensus E, Mackey Zoom. Lander Unlimited CG, J, with Elsik Mesirov Maps I, Contact Hi-C Machol Lander chromosome-length for M, I, yields System Shamim Machol Hi-C Visualization J, MS, using Robinson genome Shamim N, aegypti NC, Durand Aedes Durand the protein M, of novel Hoeger assembly novo of SK, De identification scaffolds. Nyquist (2017) Systematic AD, AP Omer Aiden (2002) ES, SS, P Batra Bork O, gene CP, Dudchenko of functions. Ponting study nuclear J, the with Schultz for associated families tool RR, domain computational Copley Epinephelinae a T, CAFE: subfamily (2006) Doerks the MW Hahn of JP, groupers Demuth evolution. N, the family Cristianini of T, Epinephelini. phylogeny Bie the De molecular of classification A revised (2007) a with PA Sequences. (Serranidae) Genomic Hastings in MT, Genes Craig tRNA for Searching tRNAscan-SE: Biology, (2019) Molecular TM Lowe PP, Chan 307-321. , 39 cec NwYr,N.Y.), York, (New Science 83-92. , GigaScience, Bioinformatics, 1962 eoebiology, Genome 1-14. , 7 . 22 356 8 1269-1271. , R13. , 92. , opBohmPy o nertPhys, Integrat Mol A Phys Biochem Comp eoeresearch, Genome 8 Bioinformatics, clEvol, Ecol ctylResearch, Ichthyol ulMrSci, Mar Bull o hlgntEvol, Phylogenet Mol elsystems, Cell oeua clg resources, ecology Molecular 4 12 2046-2057. , 30 47-56. , 1236-1240. , 54 56 3 1-17. , 99-101. , 653-669. , 140 ytmtcBiology, Systematic 41 317-327. , 420-435. , ehd in Methods 0 1-9. , Environ Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. adY 21)Aayi f1SmtcodilrbsmlDAsqec aitosadpyoeei relations phylogenetic and variations fishes. sequence Serranidae DNA ribosomal some mitochondrial among 16S of Analysis (2019) YM genomes. Saad large in families repeat of identification novo De RNA-seq (2005) 21 of PA Pevzner analysis NC, expression Jones Transcript-level AL, (2016) Price Ballgown. SL and Salzberg StringTie JT, HISAT, Leek with searches. GM, experiments Pertea homology D, RNA Kim faster M, Pertea 100-fold in 1.1: InterPro Infernal (2019) (2013) 2933-2935. RD SR Finn Eddy EP, SY, Yong annotations. Nawrocki sequence SCE, A, protein Tosatto to Marchler-Bauer PD, Sigrist SC, access Thomas A, F, Potter and Sangrador-Vegas D351-d360. N, Madeira S, classification GA, Pesseat Thanki coverage, Salazar A, S, T, improving GG, C, Luciani Paysan-Lafosse El-Gebali Rivoire Sutton 2019: AP, R, LJ, HY, Pandurangan I, Richardson Lopez Chang C, Sillitoe N, Orengo I, SD, Redaschi CJA, G, Brown ND, Letunic Nuka Rawlings A, H, M, MA, Bridge Necci Huang Qureshi P, DA, DR, Bork Natale Haft H, M, Mi J, Blum Gough PC, MI, Babbitt ). Fraser leopardus TK, Plectropomus Attwood characterise ( AL, to grouper used Mitchell approaches coral Trans-omics leopard (2017) in J Kikuchi biorhythms M, nutritional Koiso next- fish T, analyzing Yamaguchi for K, Sakata framework M, MapReduce Mekuchi a Gabriel Toolkit: D, Analysis Altshuler data. of Genome K, sequencing The Garimella occurrences DNA A, (2010) Kernytsky generation of MA K, DePristo counting Cibulskis M, A, parallel Sivachenko Daly E, S, efficient Banks M, for Hanna approach Yang A, McKenna Y, lock-free Li fast, X, Y, A Liao Shi G, (2011) H, short-read Liu C Zhang k-mers. memory-efficient Z, G, Kingsford improved Wu Xiaoqian J, empirically G, S, Tang an Peng Marcais Y, SM, SOAPdenovo2: Liu Yiu Q, (2012) DW, Pan J Y, Cheung assembler. Wang Chen novo C, TW, G, de Han Lam He Y, J, J, Lu Yuan Wang B, W, H, Wang Ser- Huang C, Z, leopardus, Yu Li (Plectropomus Y, Y, trout Liu Xie B, coral Liu settled R, newly Luo in preference Habitat (1997) ranidae). GP Wong Jones L, PR, Bolund Light of transform. Z, Burrows-Wheeler trees with Zhang alignment phylogenetic read T, of short accurate Liu database and 25 Fast R, curated (2009) a R Li Durbin TreeFam: H, L, (2006) Li Osmotherly R J-K, Durbin Heriche J, families. Wang LJ, gene Indo-Pacific P, Coin animal an Dehal J, of W, Ruan larvae Zheng of A, GK-S, Serranidae). behaviour Coghlan (Pisces: settlement H, leopardus and Li Plectropomus swimming trout situ coral In the fish, (1999) coral-reef BM Carsonewart JM, Leis sequenc- deep using microRNAs confidence high annotating data. miRBase: ing (2014) S families. Griffiths-Jones RNA A, Petrov Kozomara non-coding RD, for Finn resource A, genome-centric Bateman a SR, to Eddy shifting 46 E, Rivas 13.0: EP, Rfam Nawrocki (2018) N, AI Quinones-Olvera J, Argasinska I, Kalvari ryisiM cenJ io ,CnosJ acyeR osa ,JnsS,MraM 20)Circos: (2009) MA Marra SJ, Jones D, Horsman R, genomics. Gascoyne comparative J, for Connors aesthetic I, information Birol an J, Schein M, Krzywinski i351-358. , 1754-1760. , D335-d342. , Bioinformatics, uli cd Res, Acids Nucleic oa Reefs, Coral GigaScience, uli cd research, acids Nucleic 16 27 117-126. , 764-770. , 42 f nmSci, Anim J Afr S 1 D68-d73. , eoeresearch, Genome 18. , 34 D572-D580. , 49 a Protoc, Nat 20 eoeresearch, Genome 9 80-89. , 1297-1303. , 11 1650. , 19 1639-1645. , a Biol, Mar c Rep, Sci uli cd Res, Acids Nucleic 134 Bioinformatics, uli cd Res, Acids Nucleic 7 51-64. , 9372. , Bioinformatics, Bioinformatics, 29 47 , , Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. al .Saitc fthe of Statistics 2. Table for method control quality the holistic for and data fast Sequencing 1. QC-Chain: Table (2013) K Ning W, J, Tables Li Xu for control W, A, quality Li Wang data. fast Y, sequencing and X, next-generation Niu comprehensive Su S, RNA-QC-Chain: Q, (2018) Liu K Zhou H, Ning Zhang S, Z, Chen G, Chen lanceolatus) data. Jing (Epinephelus RNA-Seq W, X, grouper Xu growth. Su giant rapid J, Q, the and Zhou Zhai of immunity assembly a H, innate genome its Xu for chromosome-level into G, A loci insights Fan (2019) provides microsatellite S Y, Chen polymorphic Zhang H, of H, Lin characterization Gao and Q, Development Zhou (2010) leopardus. Plectropomus S fish Yu leopardus reef H, Plectropomus threatened Liu trout coral J, the Zhang telemetry. of ultrasonic movement by of determined aggregations:patterns as (Serranidae) Spawning (1998) DC Zeller Likelihood. Maximum by Analysis 1586-1591. Phylogenetic , (2019) 4: PAML X (2007) Liu Z Yang L, Deng LTR X, reads (2007) long Xu third-generation H G, posons. error-prone Wang Fan using R, Z, genome genome large Zhang Xu diploid in O, Bermuda Wang of the S, determination through Gu Direct passing L, accurately (2017) Guo DB M, Jaffe Xu DM, Church P, Shah V, sequences. Kumar phylogenomics. color NI, and Zdobnov two Weisenfeld prediction EV, of Kriventseva gene G, analysis to Klioutchnikov assessments evolution, transcriptome P, quality and Ioannidis comparative from biology M, applications silico Manni BUSCO In FA, (2017) Simao (2015) EM M, Z Seppey leopardus). Meng RM, (Plectropomus Waterhouse H, trout coral Lin common L, the Guo of morphs C, Yu se- L, reads. genomic GenomeScope: Wang (2017) short MC in from Schatz J, profiling elements Gurtowski genome H, Fang repetitive reference-free CJ, Underwood fast identify M, Nattestad FJ, to Sedlazeck GW, RepeatMasker Vurture Using prediction (2009) initio (2015) N ab quences. E AUGUSTUS: Chen (2006) Barillot B M, J, Morgenstern Taraio-Graovac Dekker S, Waack E, A, Heard Hayes transcripts. I, JP, alternative processing. Gunduz Vert of data O, CJ, Hi-C Keller Chen for M, Stanke pipeline E, flexible Viara and BR, optimized an Lajoie HiC-Pro: N, Varoquaux N, Servant aBo104 14.37k 100 100 69.34 / 13.39 70.36 180.46 126.37 (bp) length read Mean RNA-Seq (Gb) data PacBio Clean Hi-C (Gb) data Raw 10 technology Sequencing uli cd Res, Acids Nucleic × urn rtcl nbioinformatics in protocols Current eois126 2.6100 127.36 152.61 Genomics eoeresearch, Genome M Genomics, BMC 35 543-548. , uli cd Res, Acids Nucleic 35 .leopardus P. IDR necetto o h rdcino ullnt T retrotrans- LTR full-length of prediction the for tool efficient an FINDER: 27 W265-268. , 757-767. , 19 lSone, PloS 144. , .leopardus P. 8 4.10.11-14.10.14. , eoeassembly. genome osr ee Resour, Genet Conserv 34 e60234. , W435-439. , a clProg, Ecol Mar 10 Bioinformatics, eoeassembly. genome oeua clg resources, ecology Molecular lSone, PloS 162 33 2 eoebiology, Genome 101-103. , 10 2202-2204. , oeua ilg n evolution, and biology Molecular 253-263. , e0145868. , G-aCoe:fs and fast TGS-GapCloser: 16 0 259. , 1-11. , Molecular . 24 Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. la la 5281,7.71628 8.41711 8.99 1,781.17 8.34 181.54 1,632.85 1,993.26 172.04 1,435.18 15,872.47 16,070.20 25,248 24,499 in effects (SNPs) polymorphisms nucleotide single the of Summary 6. Table Glean Augustus Glean RNA novo De Homolog the in predictions gene of Statistics 5. Table the in sequences repetitive of Statistics 4. Table the of result analysis BUSCO 3. Table e yeTtlnme oa egh(p 5 b)N0(p a egh(bp) length Max (bp) N90 (bp) N50 (bp) length Total contig number Total scaffold type Seq Gn e ubro ee D+nrnlnCSlneo e nrnlnEosprgene per Exons len intron len exon len CDS len CDS+intron genes of Number set #Gene rndcdr3,0 7680 ,2.4107 ,0.58.36 1,501.25 170.77 1,428.34 17,648.02 31,207 Transdecoder latipes Oryzias akaara Epinephelus semilaevis Cynoglossus lanceolatus Epinephelus aculeatus Gasterosteus rubripes Takifugu rerio Danio oa UC russearched groups BUSCO Total (M) BUSCOs Missing (F) (D) BUSCOs Fragmented BUSCOs duplicated and (S) Complete BUSCOs single-copy and Complete (C) BUSCOs Complete 40685708885667004,608,261 41,705,749 7,000 8,092 855,686 34,146,761 865,740,848 881,551,488 54,036 50,461 oa 7,9,3 30.74 0.002 9.79 4.27 275,295,833 0.38 5.43 18,438 87,713,772 32,129,629 16.37 3,413,395 48,680,003 146,681,333 33.38 Total 298,989,520 3.83 Unknown 1.31 Other genome in LTR 5.57 % SINE 34,303,179 11,710,155 LINE (bp) Length DNA 49,936,126 classification Biological genome Total of % novo De ProteinMask size RepeatMasker Repeat TRF method Identification 9622,5.21501 9.73076 8.20 9.04 8.56 3,037.64 8.26 1,845.79 8.55 2,417.60 191.37 8.01 1,766.28 177.73 1,570.11 8.41 1,687.02 177.79 1,606.08 1,948.16 169.48 1,521.26 2,913.94 170.76 1,400.68 177.75 1,460.43 23,455.32 181.24 1,424.34 16,439.61 1,524.78 19,789.49 14,232.13 14,201.45 15,087.57 19,672 23,126.00 23,142 33,866 23,869 28,700 21,677 25,787 .leopardus P. obndTsCmie TEs Combined 31.29 TEs Combined 280,223,461 .leopardus P. 11 .leopardus P. genome. ,8 100 2.8 3.2 3.1 90.9 4,584 94.0 130 147 140 4,167 Percentage 4,307 number Gene genome genome. .leopardus P. Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. xaso n otato.()Teercmn fKG noain ihteepne eefmle in families gene expanded the in with chromosomes annotations KEGG 24 of the enrichment leopardus of The (d) each contraction. indicate and expansion (I-IIXIV) bars green lanceolatus the and for (Chr1-24) used bars time divergent of the analyses indicate latus genomic se- dots Comparative red the 5. The Figure among branch. orthologues each at copy labeled single re-calibrations. were intervals bootstraps. 4134 confidential 500 using 95% with constructed teleost, tree lected Phylogenetic 4. between Figure models gene predicted the species. of teleost Comparisons 3. Figure (high). the red of to map (low) contact Hi-C The 35 (b) coverage 138 at coverage peak at minor peak The peak. of homozygous data the sequencing to assembly. Gb and 127.36 estimation from size generated Genome red 2. a Figure of photograph A 1. Figure Legends Figure and ( .akaara E. p epciey b endarmo h ee rmtetregoprseis c eefamily Gene (c) species. grouper three the from genes the of diagram Venn (b) respectively. , < 0.05). a RAlnt.()CSlnt.()Eo egh d nrnlnt.()Eo number. Exon (e) length. Intron (d) length. Exon (c) length. CDS (b) length. mRNA (a) × ) a hoooa olnaiybetween collinearity Chromosomal (a) . orsod odpiain.Teilsrtoso h ie r akdi h graph. the in marked are lines the of illustrations The duplications. to corresponds Stop 2,453,403 411,242 389,553 Non SNPs of Synonymous Number 1,740,250 Codon Start Splice Downstream Upstream Intron Intergenic Type oa 5,178,453 Total Stop .leopardus P. synonymous ot78 438 lost gained ot82 lost ie16,105 site hne7 change .leopardus P. h siae iegn ie(y,mlinyasao n the and ago) years million (Mya, time divergent estimated The oig117,662 coding .leopardus P. eoe h oo a hw h otc est rmwhite from density contact the shows bar color The genome. oig49,633 coding .leopardus P. 12 b inZhou). Qian (by × a rp of Graph (a) orsod otehtrzgu ek h minor The peak. heterozygous the to corresponds h ihs eka oeae66 coverage at peak highest The . .leopardus P. ihtogoprspecies, grouper two with k mrfeunydsrbto ( distribution frequency -mer .leopardus P. and .lanceolatus. E. eoeadother and genome .leopardus P. × h colorful The corresponds .lanceo- E. k and 17) = E. P. Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. 13 Posted on Authorea 14 Apr 2020 — CC BY 4.0 — https://doi.org/10.22541/au.158687239.91086096 — This a preprint and has not been peer reviewed. Data may be preliminary. 14