Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. igPn Xie Qing-Ping of gonochorism evolution and the hermaphroditism into insights croaker provide yellow polyactis) little ( of annotation and assembly Whole-genome iiLiang Qiqi ieisgt noteeouino emprdts n gonochorism Liang and Xie hermaphroditism of Qing-Ping ( evolution croaker the yellow into little insights of vide annotation and underpinnings assembly genetic explained aquaculture. Whole-genome the be fish into may enhancing insight polyactis provides for only L. information not valuable male genome provides in polyactis also duplication stage L. but lineage-specific hermaphrodite the evolution a the Decoding from rnf225, hermaphrodite that dmrt1. originated of and shows regulatory of it rnf223 analysis evolution upstream that paralogues Phylogenetic sequence suggesting its two the teleosts. species, and its by other teleost dmrt1 unlike in other Rnf183, of in lost expression not was analysis. but The or transcriptome Larimichthys the in polyactis. in present L. dimorphic only gene in is dmrt1 sexually gene The both genes. determination were species. protein-coding two sex rnf183 25,233 the candidate gene between and within-chromosome events the of Mb) fusion as number 4.52 or a identified fission = revealed years origin chromosome was N50 which million major the ˜25.4 analysis, scaffold without sequence genomic crocea and , translocations Larimichthys comparative genome Mb and of East enabled rearrangements the ancestor 1.21 assembly in common genome = report the high-quality species N50 we from Our (contig fish diverged Here, ago. polyactis Mb commercial L. ˜706 unclear. famous that of remain suggests most studying size analysis crocea Phylogenomic a for the Larimichthys with model as and polyactis, appealing hand, polyactis transient L. L. an other special of of polyactis) The the relationship L. On solved. evolutionary polyactis, be and (Larimichthys hermaphrodites. to croaker of mystery yellow intriguing formation little an the the is makes hermaphroditism stage and gonochorism hermaphroditic of direction evolutionary The Abstract 2021 2, August 4 3 2 1 .TeHdo lh nttt o itcnlg,Hnsil,Aaaa386 USA; 35806, Alabama Huntsville, Biotechnology, for Institute Alpha USA; Hudson 36849, 36849, The AL AL 4. Auburn, Station, Auburn, Experiment University, Agricultural Auburn Alabama Medicine, 3. Veterinary of College China; Pathobiology, 310021, USA; Hangzhou of Sciences, Department Agricultural of 2. Academy Zhejiang Hydrobiology, of Institute 1. uunUniversity Auburn University Xiamen available not Sciences Affiliation Agricultural of Academy Zhejiang 6 u Xie Yue , 2 u Xie Yue , 1# 6 1 egXu Peng , e Zhan Wei , e Zhan Wei , 2 egXu Peng , 5 ,X Wang Xu *, 1# 1 inZiShi Jian-Zhi , inziShi Jian-zhi , 3 uWang Xu , 2,3,4 ,BoLou Bao *, 6# 2 egLiu Feng , egLiu Feng , 4 n a Lou Bao and , 1 1 * 1 a-ogNiu Bao-Long , 1 a-ogNiu Bao-Long , 1 aiihhspolyactis Larimichthys 1 u He Xue , 1 u He Xue , 1 egLiu Meng , 1 egLiu Meng , 6 Qi-Qi , pro- ) 2 , Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. ytm n erdciepten.Gncoimadhrahoiimaetems motn reproductive important determination most sex the of are variety a hermaphroditism exhibit and fishes Gonochorism environment, water patterns. the reproductive in and niches systems ecological of diversity the to Due Introduction evolution hermaphrodite for Words: information valuable Key provides also but evolution aquaculture. hermaphrodite fish of enhancing underpinnings of genetic evolution the sequence into the insight by explained teleosts be other may in lost was or a analysis. Larimichthys revealed transcriptome the which in analysis, genomic Phylogenomic The comparative fusion polyactis or species. genes. enabled fission two chromosome protein-coding the assembly major between 25,233 without events genome translocations and and high-quality rearrangements Mb) within-chromosome of 4.52 Our number = N50 ago. scaffold years and Mb that suggests 1.21 analysis = of N50 relationship evolutionary (contig and Mb origin the Asia, East crocea in ( Larimichthys species croaker fish yellow commercial little famous the most makes stage polyactis hermaphroditic L. transient transcriptome, special The gonadal solved. sequence, whole-genome Abstract: croaker, yellow evolution hermaphrodite little Keywords: 8 number: Figure croaker yellow count:7701 Word little of research Genome title: China; Running 310021, Lou) Hangzhou (B. Sciences, [email protected] Agricultural of addresses: Academy E-mail Zhejiang Hydrobiology, of Institute Lou: *Bao Alabama Biotechnology, for University; Institute addresses: Auburn Alpha E-mail Hudson Medicine, The Veterinary USA; USA; 35806, of 36849, Alabama AL College Huntsville, Auburn, Pathobiology, Station; of Experiment Agricultural Department Sciences, Wang: Earth *Xu and Ocean of College addresses: Science, E-mail Environmental China; Marine 361005, of Xiamen Laboratory University, Key Xiamen State Xu: *Peng authors: Corresponding author. Corresponding * Xiamen China; Sciences, # 100016, Earth Beijing and Institute, Ocean Bioinformatics of Novogene College 6. Science, Environmental Marine China; 361005, of Xiamen Laboratory University, Key State 5. hs uhr r otiue qal oti work this to equally Contributed are authors These h xrsinof expression The . h vltoaydrcino oohrs n emprdts sa nrgigmseyt be to mystery intriguing an is hermaphroditism and gonochorism of direction evolutionary The napaigmdlfrsuyn h omto fhrahoie.O h te ad sthe as hand, other the On hermaphrodites. of formation the studying for model appealing an ) u o nohrtlotseis ugsigta toiiae rmalnaeseicduplication lineage-specific a from originated it that suggesting species, teleost other in not but iteylo rae,woegnm eune oaa transcriptome, gonadal sequence, whole-genome croaker, yellow little [email protected] [email protected] eanucer ee erpr h eoesqec of sequence genome the report we Here, unclear. remain .polyactis L. . hlgntcaayi hw httehrahoiesaei male in stage hermaphrodite the that shows analysis Phylogenetic dmrt1 Rnf183 iegdfo h omnacso of ancestor common the from diverged n t ptemrgltr gene regulatory upstream its and dmrt1 X Wang) (X. P Xu) (P. nieistoparalogues two its unlike , eewsietfida h addt e eemnto eein gene determination sex candidate the as identified was gene dmrt1 2 eoigthe Decoding . rnf223 .polyactis L. rnf183 aiihhscrocea Larimichthys .polyactis L. and eebt eulydimorphic sexually both were rnf225 eoentol provides only not genome aiihhspolyactis Larimichthys sol rsn in present only is , ihasz f˜706 of size a with , dmrt1 dmrt1 .polyactis L. 2. million ˜25.4 .polyactis L. , , rnf183 rnf183 and L. , , , Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. eess n h icvr fatasetjvnl nesxsaei ae loerce h hoyo teleost of theory the have enriches We also males sexes. in in both stage gonochorism intersex from and juvenile points transient hermaphroditism time a between development of relationship discovery gonadal the the 18 studying and for and teleosts, model organs excellent major hermaphrodite of an in patterns 13 Mb) established architecture of reproductive (706 genomic from genome its genome of obtained assembled in role reference the evolution high-quality high-quality also elucidate a and A to structure produced study, that 2019). we this genome revealed In evolution, Gil, of analysis and studies methods. karyotype differentiation Park for sex and 2020; toolkit and necessary map al., the linkage et provide established (Liu genetic will successfully 48) a was = of programme (2n report diploid breeding previous artificial A an wild in 2015, 2021). of miniaturization In spawning growing artificial resources. and by fishery imbalance, restore gender precocity, to sexual needed changes of caused quality resources has the water overfishing Although polyactis and 2004), 2010). current al. ocean al. et and et Han pollution, 2016; seawater al., et overfishing, level. (Zhang to genome due the of significantly at wild sequencing increased patterns whole-genome of reproduction 1977; Therefore, abundance sexual (Takahashi, of histology The 2016). divergence and evolutionary al., to morphology the need et in determine zebrafish Lau differences of to fascinating, individuals sex 2009; approach also all no al., which is with in et stage stage, process, Orban transient ovary development type gonadal juvenile this transitional zebrafish the of a the as nature experience from serve different male-specific may is The it system as This hermaphroditism. study. to further gonochorism warrant from mechanisms molecular underlying The in species species related in systems. closely related pattern most closely differentiation the the gonadal Because As and this family, gonochorism. that nor gonad. speculate intersex hermaphroditism the an typical forming in neither will testis, rare crocea the oocytes) is extremely primary in differentiation (early hatching) is testis EPOs post are stage, hermaphroditism (days of juvenile females dph pattern the and 80 in unique males and However, of This 43 1:1. functions between to sexual degenerate close the and is maturity, appear ratio sexual Briefly, sex reach China, 2021). the in they in al., and When process, et separated, importance reproduction (Xie species. sexual economic stage fish special hermaphroditic gonochoristic great its transient a with to male-specific due a species undergo gonochorism fish males and which hermaphroditism famous ( of most evolution croaker the of yellow direction Little of one Japan. and and Korea, Pacific western the to the to belongs evolution the corvina, whole-genome ( of analysis croaker Therefore, the yellow for Little , fishes. tool 2009). in powerful in Mank, a gonochorism methods be and and differentiation will sexual hermaphroditism Avise species, of and transitional 2014; of patterns includ- studies consortium, reproductive genome-wide methods, different Sex hermaphrodism especially with differentiation of simultaneous fishes sex Tree of and and 2014; sequencing protogyny, hermaphroditic patterns all al., of hermaphrodite across reproductive et solely protandry, Interestingly, of (Bachtrog consists hermaphrodite species. types order dioecious) gonochorism, all no or have ing and (separate-sex, Perciformes fishes, gonochoristic only documented teleost contains or been groups, of also male has phylogeny diver- instead trigger hermaphroditism this the but which that yet in pathways species across fish, teleosts molecular widely 2008). of in conserved scattered orders Liu, rapidly particular, with taxonomic evolve are and in nine coupled that Mitcheson 2016); The is mechanisms de al., development. diverse signals female (Sadovy et by sex-determining Lau determined species primary are 2014; 1,500 in females consortium, 1995; sity than and Sex Ricklefs, more males of and in Tree fact, (Renner documented 2014; In fish al., been teleost et has in Bachtrog commonly hermaphroditism 2006; occur Auld, and and organisms Jarne reproducing sexually in patterns ( .crocea L. ouain Jn19;LnadCeg20) hrfr,atfiilbedn ehiusaeurgently are techniques breeding artificial Therefore, 2004). Cheng and Lin 1996; (Jin populations sadffrnitdgncoitwt XX e eemnto Yue l,21) we 2012), al., et (You determination sex XX/XY with gonochorist differentiated a is ) aiihhspolyactis Larimichthys .polyactis L. .polyactis L. Larimichthys a eeeydcie ic h 90,adiscmeca au has value commercial its and 1980s, the since declined severely has Larimichthys L n avlclue(hne l 06 i ta. 09 i tal., et Xie 2019; al., et Liu 2016; al. et (Chen culture larval and aiySieia n re ecfre.I sacokrnative croaker a is It Perciformes. order and Sciaenidae family , . polyactis , L . polyactis hs pce eosrt w ieetreproduction different two demonstrate species these , 3 a on ob odmdlfrivsiaigthe investigating for model good a be to found was ) L . polyactis ,as ae h ml elwcokro yellow or croaker yellow small the named also ), .polyactis L. vle rmagncoitcancestor. gonochoristic a from evolved L . polyactis aegaulyrcvrd(Cheng recovered gradually have L rncitm aawere data Transcriptome . . polyactis L . L sanecessary a is polyactis Larimichthys L . polyactis . polyactis sa is L. is Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. est eei a a ul ycrmnmr(eso .7 Cthne l 00 oaco h scaffolds evaluation the BWA anchor to 2020) high- al, A Genomics evaluation et data. 10x Genome (Catchen 2014) map 1.07) the al., optical genome. (version from level using et chromonomer chromosome superscaffolds produced by (Adey a to built reads into connected fragScaff was linked were map using among scaffolds genetic resulting information scaffolded density the distance first then, and were and location libraries, contigs the PacBio to the according and 2014). superscaffolds, 2013), al., construct al., et (Walker To assembled et (v1.22) (Chin the pilon (v5.0.1) Finally, by Quiver corrected 2015). by assembly were data Scaffold al., reads PacBio bp) et the (350 2014). with deletions, (Pendleton size correction (Myers, insertions, short-insert (v1.2.4) error daligner the FALCON of Overlap-Layout-Consensus to using called using rate subjected reads algorithm were contigs the consensus preassembled contigs obtain the to obtain by according to to assembled were correction pairs (DBG2OLC) reads base error preassembled the to the between Then, subjected errors were sequencing reads and PacBio the First, 2013). al., et (Liu long-read depth) PacBio peak bp) of (350 17-mers)/(position reads of assembly short-insert number high-quality Genome (total and of = genome Gb size the 128 Genome of the of formula: characteristics total estimate following the A to 2010). on used al., depends et were K-mers (Li of distribution distribution Poisson’s The follows of values. K-mer size of genome distribution Committee the the estimate by approved To was Sciences. size which the Agricultural genome , of with of Laboratory Academy accordance Estimation of Zhejiang in Use at conducted and Experimentation were Care experiments Laboratory for of Animal addition, Guide the In genome). of the genome). regulations of the coverage of of sequence (˜616-fold coverage reference 10 obtained high-quality (˜183-fold and a 150 data mapping generate a same genome of to with BioNano used the the Gb platform including following 128 At HiSeq technologies, kit approximately Illumina mapping RSII preparation the other obtaining genome). sample Biosciences on length, a the sequenced Pacific using read was of the library library bp coverage paired-end using the bp) (˜69-fold and sequenced (350 involved recommendations, data was was short-insert manufacturer’s which PacBio a polymerase library constructed preparation, sequencing Gb the we and 48 template Finally, time, annealed, approximately was was template. yielding using step primer PacBio SMRTbell purification instrument, sequencing the first template the the on and the to Then, adapters, sequencing library, hairpin bound for Beads. of size Magnetic USA) ligation PB insert Hilden, repair, and end AMPure (Qiagen, 20-kb muscle repair, GenomicTip100 damage a the Qiagen concentration, from build DNA a extracted To using was fish DNA male platform. genomic a High-quality artificially of an blood JAFMOB000000000. from The Information, is collected Biotechnology China. number were for Province, accession 1A) Center Zhejiang (National (Figure Zhoushan, NCBI individuals to in male submitted (2016) and generation female F2 1877) bred ID: Fishbase 334908; croaker ID: yellow little The sequencing and Samples Methods and Materials differentiation. and determination sex enovo de .polyactis L. .polyactis L. assembly .polyactis L. Ntoa etrfrBoehooyIfrain[CI [NCBI] Information Biotechnology for Center (National eoesz ae nte1-e rqec itiuinuigthe using distribution frequency 17-mer the on based size genome eue ed rmpie-n irre odtriethe determine to libraries paired-end from reads used we , .polyactis L. 4 oa f˜3 bo eunigdt was data sequencing of Gb ˜431 of total A . https://www.ncbi.nlm.nih.gov × L eoislne ed,wr also were reads, linked Genomics . polyactis eoefiehsbeen has file genome ,adthe and ), Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. eemdl eeae rmEiecMdlrwt h rncit sebe yTiiy(rbere al., the et adjusting (Grabherr after Trinity structures by gene annotation structures assembled final Functional gene transcripts the all the acquire nonredundant to with merged the 2003) EVidenceModeler We generate al., 2011). to from and software. et 2008) 2011) generated 2003) (Hass al., al., models PASA al., et et used gene (Haas et we (Grabherr EVidenceModeler Finally, (Hass 2.1.1) using sets. PASA (version approaches gene Trinity by three by the structures data positions. by alignment gene RNA-seq obtained on the the based assembled 2010) we obtained al., we hand, then et one hand, (Trapnell the other 2.1.1) On (version the prediction. cufflinks On gene by structures in the gene assist obtained to to and used data were 3.0.2) RNA-seq tissues the (version different 1.0) mapped from GlimmerHMM (version data methods. RNA-seq 2000), Genescan for five addition, al., the In 2004) 2006), from et (Ian, al., results SNAP (Parra five et obtained and 1.4) Stanke we 2004) Finally, (version 2003; al., Waack, Geneid et & 2000), obtained. (Majoros to (Stanke Solovyev, were used structures 3.0.2) was and gene 2004) (version (Salamov homology-based al., Augustus The et sequences. (Birney used alignment 2.2.0) We the (version on Genewise based Then, 2012). models al., gene et predict (Darriba SOLAR with E-value results an ele- with from genome genes assembled the of floridae sequences chiostoma protein-coding longest rubripes et the (Haas used including EVidenceModeler we species, by ven prediction, prediction, integrated gene gene were homology-based genes homology-based For data. predicted of nonredundant all obtain combination then, to a and 2008) al., data, with RNA-seq genes and protein-coding methods, the predicted We Minscore=50, annotation PI=10, gene PM=80, Protein-coding Fin- Delta=7, Repeats Mismatch=7, Tandem “Match=2, using searched to parameters were MaxPeriod=2000”. parameters the repeats and default with tandem the The 1999) with level. (Benson, protein 2009) der the Chen, at and sequences (Tarailo-Graovac repeat RepeatProteinMask identify used and we Repbase addition, against In parameters default with LTR RepeatMasker and using 2005), and al., homology et of the (Price built combination we the a First, for packages. with software out (TEs) carried was elements annotation posable repeat assembly, genome After detection assembly.Repeat genome HMMER the and of 2006) completeness al., annotation the et Genome assess Burkhard genomic to 2003; Waack, novel 2013) and a al., (Stanke in with et Augustus combined genes (Mistry 1997), was conserved 248 al., 2015) of et al., of (Sim˜ao (Altschul structures series et Orthologs) set a TBLASTN exon-intron Single-Copy has reliable the Universal It assembly. highly (Benchmarking identify genome BUSCO evaluate a accurately to sequence. uses eukaryotes can of 2007) that range wide al., procedures a of et in occur (Parra that Approach) families protein Mapping conserved Genes distribution Eukaryotic depth (Core data. sequencing CEGMA sequencing The genome 15”. evaluation the -i scaffolds BUSCO of 1 the uniformity and to “-o the back CEGMA parameters: indicates reads following which size the distribution, short-insert with Poisson the 2009) a mapped al., follows we et assembly, (Li genome BWA the using of quality the evaluate To , yolsu semilaevis Cynoglossus o nlss(age l,21) is,w lge hs rti eune gis our against sequences protein those aligned we First, 2014). al., et (Wang analysis for , .crocea L. IDR(ue l,20) eod epromdhmlg-ae prediction homology-based performed we Second, 2007). al., et (Xu FINDER , rohoi niloticus Oreochromis , .polyactis L. yrnscarpio Cyprinus < e5uigTLSN(lshle l,19)adlne h alignment the linked and 1997) al., et (Altschul TBLASTN using 1e-5 enovo de eetlbayuigRpaMdlr(mte l) RepeatScout al.), et (Smit RepeatModeler using library repeat eoeuigTpa vrin201)(rpele l,2009) al., et (Trapnell 2.0.13) (version TopHat using genome enovo de , ai rerio Danio 5 , atrsesaculeatus Gasterosteus eepeito nterpa-akdgenome. repeat-masked the on prediction gene enovo de , enovo de .polyactis L. oosapiens Homo rdcinmtosuigmultiple using methods prediction eetlbaisa h N level. DNA the at libraries repeat eoe eietfidtrans- identified We genome. , , rza latipes Oryzias u musculus Mus and , Takifugu enovo de Bran- Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. musculus hlgntcrltosisa h eoelvlwr nlsdfregte species, , eighteen for analysed , were level genome found 2016). , the we al., levels at Then, et genome relationships acids. (Li and Phylogenetic OrthoMCL cutoff amino gene E-value using 30 the an families with than at gene blast into less analysis all-vs-all alignments Phylogenetic was an the running length clustered by protein we sequences Finally, protein whose 1e-7. species’ genes of 18 the the using removed matches and possible genes the of using clustering , family gene out species, carried 18 We 2016). with al., 2016) et aculeatus al., (Yang ancestor event common et last speciation (Li the a in by OrthoMCL sequence species ancestral examined single a an from of resulting sequences homologous are Orthologues cluster family Gene INFERNAL tRNAs by for search database to comparison 2005) 1997) Genome Eddy, al., and and et (Lowe database tRNAscan-SE we (Griffithsjones rRNA genomes. used Here, the Rfam we the Finally, blasting protein. the in software. by a 2009) against rRNAs al., into snRNAs for et translated and (Nawrocki searched be We siRNAs RNAs (snRNAs). cannot for ribosomal RNAs that (tRNAs), searched nuclear RNAs molecule transfer small (miRNAs), RNA and microRNAs of including (rRNAs), ncRNAs, kind of types a four is annotated (ncRNA) RNA the Noncoding find to databases 2000) annotation Apweiler, RNA and Noncoding (Bairoch gene TrEMBL the and (Ashburner, pathway SwissProt which (Kyoto (GO) determine as KEGG (Mulder Ontology symbols. to well searched databases gene Gene 4.7) as also 2000) in, We the (version al., involved descriptions. et obtained InterProScan be InterPro35 (Hiroyuki we might by Genomes) corresponding Moreover, and the 2005), Genes domains. from of 1998), al., gene and Encyclopedia al., et each et motifs for (Bru (Mi obtain terms ProDom PANTHER to 2000) 2006), and al., 2007) (Finn et 2012) al., Pfam (Hulo including al., et PROSITE databases, 1994), et 2007) First, Beck, al., species. (Letunic and et the (Attwood SMART (Mulder of InterPro PRINTS adaptation against 2000), biological search al., the to et understand proteins coding better our to used annotation we function gene out carried We , RA- in species: of , following relationships method phylogenetic the (ML) The among indivi- likelihood 2008). single-copy analysed aligned al., maximum the et were ( the with Stamatakis protein species using supergene 2006; RING-finger one 18 (Stamatakis, tree 8.0.19) concatenated phylogenetic the (version we a among software Then, xML constructed genes 2004). single-copy and Robert, alignments 590 and gene the (Edgar from MUSCLE sequences by dually acid amino the First, egbu-onn n iN loihst arxo ariedsacsetmtduigtemaximum the using estimated distances applying is pairwise by together automatically of clustered obtained matrix were taxa a search the associated heuristic to with the the the algorithms tree which for using BioNJ The tree(s) in by 2000). and Initial trees inferred branches. Kumar, neighbour-joining of was the and percentage history to (Nei The evolutionary next model shown shown. The reversible is S1. time Table log-likelihood general Supplementary highest and in method listed likelihood is maximum species and gene mykiss Oncorhynchus , , rohoi niloticus Oreochromis eou tropicalis Xenopus ai rerio Danio aiuurubripes Takifugu pnpeu coioides Epinephelus in intestinalis Ciona rza latipes Oryzias alricu milii Callorhinchus , , alsgallus Gallus rza latipes Oryzias , oosapiens Homo , rpoeismarmoratus Kryptolebias , , , .crocea L. eou laevis Xenopus nlscarolinensis Anolis rnf , , , , au morhua Gadus au morhua Gadus pnpeu lanceolatus Epinephelus atrsesaculeatus Gasterosteus , rnhotm floridae Branchiostoma aiymmeso hoooe2, chromosome on members family ) , aiuurubripes Takifugu , , u musculus Mus .polyactis L. , eou tropicalis Xenopus oosapiens Homo and , , and in intestinalis Ciona , , rohoi niloticus Oreochromis cnhpgu schlegelii Acanthopagrus .polyactis L. nlscarolinensis Anolis eou tropicalis Xenopus , , alsgallus Gallus , yolsu semilaevis Cynoglossus rza latipes Oryzias , , etorsi striata Centropristis eiotu oculatus Lepisosteus 6 , u musculus Mus , , alsgallus Gallus .crocea L. , , nlscarolinensis, Anolis rnhotm floridae Branchiostoma n13 rnf223, rnf183, , aiuurubripes Takifugu is,w bandtelnettranscript longest the obtained we First, . h eBn ceso ubro each of number accession GenBank The . , ae calcarifer Lates , , , , , rohoi niloticus Oreochromis liao sinensis Alligator , arsbergylta Labrus alricu milii Callorhinchus ootrsalbus Monopterus ai rerio Danio , rcey scripta Trachemys and and , .polyactis L. rnf225, yolsu semilaevis Cynoglossus , , mhpinocellaris Amphiprion , oosapiens Homo eou tropicalis Xenopus eiotu oculatus Lepisosteus , pnpeu merra Epinephelus , , hlnamydas Chelonia prsaurata Sparus and , au morhua Gadus , , Gasterosteus ai rerio Danio dmrt1 , .crocea L. , were Mus . , Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. )(LSP RRID:SCR (BLASTP, 8) m crocea to Similar the of Comparison the in crocea polyactis aculeatus and analyses. Gasterosteus selection branch the positive foreground were in of the found P-values groups were as three genes 2013). ran selected We al., positively set 955 we et selection. group, positive Zhang first undergone the 2005; have In to al., thought et were different (Zhang 0.05 calculated model We the branch-site using the computed 2012). ( with al., ratio respectively, et the branch, (Sun estimating including analysis by tree, phylogenetic level orthologous between genetic the sequence (dS) in DNA substitutions used nucleotide the synonymous genes at to (dN) out substitutions carried nucleotide synonymous was analysis selection Positive rates evolutionary accelerated with Genes al., et significant (Jeffery analyses. evolution a enrichment of have GO/KEGG process to their to the considered with during were branch expansion 0.05 each or The than comparing contraction 2006). less underwent By p-value they species. adjusted means other which an in change, with copies Families 2 (FDR). contraction rate than discovery less and had the expansion species branch, performed one ancestor We in copies time. 200 divergence than CAF and among the change tree using significant phylogenetic analysis a abovementioned underwent the that on families gene explored We expansion and contraction family and Gene Mya; 149˜165 species, other nine and , rubripes Takifugu iepit eeslce rmteTmTe Hde ta. 05 est (http://www.timetree.org): website 2015) al., Mya; et mcmctree (Hedges TimeTree the the by sapiens from calibrating selected CDSs Seven were with sample-frequency=2”. and points species and tree time sample-number=100,000 18 phylogenetic ”burn-in=10000, among parameters (http://abacus.gene.ucl.ac.uk/software/paml.html) the main 2007) on the (Yang, based PAML implemented of was program estimation time Divergence According estimation time 2020). Divergence al., et of Stecher nomenclature the 2018; corrected al., , we branch et tree, with phylogenetic (Kumar the scale, X species: sequences. of to rate nucleotide MEGA several results drawn The involved in the is analysis sites. to conducted This va- tree among site. were The log-likelihood per differences invariable. analyses substitutions rate superior evolutionarily of Evolutionary evolutionary number be a the model to as with to measured sites lengths topology used some the was allowed distribution model selecting variation gamma then discrete and A approach lue. (MCL) likelihood composite rohoi niloticus Oreochromis .polyactis L. oosapiens Homo eoe h 4crmsmswr lge to aligned were chromosomes 24 the genome, .polyactis L. and , and rnf225 eoe ntetidgop eset we group, third the In genome. .polyactis L. .crocea L. atrsesaculeatus Gasterosteus u musculus Mus , aiuurubripes Takifugu a enwogynamed wrongly been has 711Mya; 97˜151 , genome. χ and ttsi n dutdb h D ehd ee iha dutdpvlels than less p-value adjusted an with Genes method. FDR the by adjusted and statistic 2 P L , Vrin21 D ta. 06 rga fe leigtefmle hthdmore had that families the filtering after program 2006) al., et (De 2.1) (Version E ω stebcgon rnh n 6 oiieyslce ee eefudi the in found were genes selected positively 761 and branch, background the as . ´ ee nlddi h otatdo xaddfmle eeetatdadsubjected and extracted were families expanded or contracted the in included genes , vlewscluae sn ihrseatts n a hnajse ytefalse the by adjusted then was and test exact Fisher’s using calculated was -value alsgallus Gallus mhpinocellaris Amphiprion L .polyactis L. 0, polyactis 110mlinyasao(Mya); ago years million 61˜100 , . ω polyactis ,and 2, 000.Gn lgmnso ohncecai n mn cdsequences acid amino and acid nucleic both of alignments Gene 001010). rza latipes Oryzias au morhua Gadus , rohoi niloticus Oreochromis stebcgon rnh n 00pstvl eetdgnswr found were genes selected positively 1080 and branch, background the as and ω stefrgon rnhand branch foreground the as 1˜3 Mya; 312˜330 , ,rpeetn h hl re oerudbac,adbackground and branch, foreground tree, whole the representing 1, a 4crmsms ovsaietecnodnewt the with concordance the visualize To chromosomes. 24 has rnf183 L alricu milii Callorhinchus . crocea .polyactis L. and , .polyactis L. , rohoi niloticus Oreochromis , in yolsu semilaevis Cynoglossus ai rerio Danio genomes pnpeu lanceolatus Epinephelus 7 atrsesaculeatus Gasterosteus .crocea L. stefrgon rnhand branch foreground the as and , eoe ntescn ru,w set we group, second the In genome. alsgallus Gallus n fenohrseis 2˜6 Mya. 422˜463 species, other fifteen and rza latipes Oryzias , sn lsp(lsp- e1 b5- - 5 -v 5 -b 1e-10 -e (blastp blastp using rza latipes Oryzias .crocea L. , .polyactis L. aiuurubripes Takifugu , rnf225 atrsesaculeatus Gasterosteus and . stebcgon rnh and branch, background the as , nlscarolinensis Anolis .crocea L. 3˜5 Mya; 139˜158 , and , rpoeismarmoratus Kryptolebias n te pce based species other and rnf183 ω aiuurubripes Takifugu ausi h phylo- the in values , .polyactis L. , ntefollowing the in .crocea L. ai rerio Danio .polyactis L. ω , 225˜239 , fnon- of ) .crocea L. Homo and , and , , L. L. L. Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. igb W.Tempigrt a 74% oeig9.6 ftegnm ihahmlgu N rate ultraconserved SNP 248 homologous S5), a Table with (Supplementary assembly genome genome the the of of map- 99.36% completeness read covering was the short-insert 97.49%, assess assembly performed was To we rate the 0.03%. assembly, mapping of genome of The the 24 BWA. content of to by completeness G-C mapped scaffolds and ping and 2). were The N50 715 (Table consistency scaffolds contig the length 2). of map, evaluate total calculated (Table To genetic total the The constructed respectively A of Mb. the 88.55% Mb, 1). on ˜706 for (Table 4.52 Based accounting of S4). genome) chromosomes, and assembly S3, the final Mb Table of (Supplementary the 1.21 41.45% coverage of were (˜616-fold length N50 total obtained sequencing scaffold a was covering bp) data assembled, 10 (350 of were and size Gb map genome ˜431 short-insert (BioNano of technologies Illumina mapping technology, new sequencing and technology, (SMRT) real-time single-molecule the sequenced We annotation and P assembly at sequencing significance Genome statistical the determine to performed ANOVA, Results one-way were t-test, test sample hoc Independent USA). post (Chicago, Duncan’s v17.0 and statistics determine SPSS to uses software normalized Statistical then reads). and mapped determined million was Analysis per transcript Statistical transcripts. transcript each the of The of for kilobase abundance reads expression per parameters. the mapped (reads default estimate RPKM of to with used assembled number 2012) then the total was Salzberg, onto 2011) The Dewey, were and mapped and were (Langmead libraries (Li sample software program cDNA platform, each RSEM Bowtie HiSeq Then, from using Illumina reads sequences Bioanalyzer. the High-quality transcript on 2100 reads. sequenced Agilent paired-end and quality and removed. 150-bp its recommendations was generating and NanoDrop manufacturer’s 340, contamination (EB), a 270, bromide the DNA ethidium six 240, following genomic using with 180, had The stained constructed assessed 130, gel pool samples. 100, were agarose each an 80, three quantity on where 70, had evaluated and 60, was dph, pool each RNA 50, 40 each the where at of where at dph, pools integrity dph, gonads 30 The mixed 720 and testis indistinguishable and and 20, of ovary 400, 10, of pools 360, 7, pairs mixed at thirteen three samples from shown and whole-fish from gonads; of fishes; is and pools six tissues transcriptomes mixed had different assemble three pool from to from extracted sequenced data was also RNA clean Total were the libraries its of RNAseq and summary These genes. (EB), plat- A annotate libraries S2. HiSeq bromide cDNA Table Illumina reads. ethidium Then, the Supplementary paired-end with on Bioanalyzer. in sequenced stained 150-bp 2100 and Agilent gel generating recommendations removed. and agarose manufacturer’s form, was NanoDrop the an a contamination liver, following on kidney, using DNA constructed evaluated assessed head were genomic were kidney, was and quantity muscle, intestine, RNA and mcscan gill, and the quality ovary, including of testis, tissues, and integrity skin, of The heart, kinds 2009) eye, thirteen brain, from alignment spleen, extracted al., MUMMER was is the RNA which using Total et 2008), by out al., (Krzywinski carried sequencing et was (Tang RNA genomes Gonadal libraries 0.69) two utility the JCVI (version of using alignment (MUMMER). by The algorithm circos constructed package. in- was python synteny using chromosomes a The sex of de- plotted 2008). ( ference automatically al., The NCBI were et to acid (Tang from RAxML (https://github.com/tanghaibao/jcvi/wiki/MCscan-(Python-version) amino 2014). results enables CM011695.1 The (Stamatakis, which model The GenBank: inference. model, substitution (ML) using AUTO protein likelihood GAMMA tained best-scoring maximum the the for using termine RAxML inferred to was submitted phylogeny and concatenated were .polyactis L. Fgr A eoeuigacmiaino ieetapoce,including approaches, different of combination a using genome 1A) (Figure 8 https://www.ncbi.nlm.nih.gov/nuccore/1523709032/ × eoislne ed) total A reads). linked Genomics .crocea L. < eoewsob- was genome 0.05. ). Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. s ld,w nlsdtepyoeei eainhpo hs sex-linked these of relationship phylogenetic family, the protein in analysed flagella-associated we and clade, cilia- fish the in in gene found protein-coding also candidate a determination of Duplication sex Mb). the including development, (from gonadal of 2 and ( protein Chr. differentiation finger into gene sex ring inserted the in was of involved 8) members genes six (scaffold Interestingly, collinearity. scaffold high one had that sequences revealed 2) Chr. of 2, chromosomes sex comparisons the pairwise of in analysis Syntenic found were translocations and rearrangements of 3). number (Figure a events, fusion of and fission chromosomes the single of to quality aligned unambiguously be of genomes the chromosomes of sex Comparison the of analysis genomic Comparative positively the for and important families gene probably expanded all S14-S23). The were analyses. enrichment genes GO/KEGG selected to the on subjected families and contracted extracted 544 and The evolution. families accelerated gene underwent expanded 320 found species the polyactis 18 that the among suggests analysis the family genes of orthologous ancestor 1:1 common polyactis selected the 590 from on diverged based are analysis (snRNAs), Phylogenetic RNAs nuclear analysis small evolution and Genome (rRNAs), S13. RNAs Table Supplementary ribosomal in (tRNAs), microRNAs annotated including RNAs (ncRNAs), RNAs transfer noncoding assigned of of were SwissProt45, (miRNAs), length types Four KEGG43, genes intron S12). InterPro35, and predicted Table including bp (Supplementary the TrEMBL45 databases, 166.59 various of and of against (24086) length species alignment exon 95.5% eighteen by an Approximately annotations the had functional among S10). across genes Table genes families the (Supplementary average, orthologous gene bp On of single-copy 1D-E). 1,270.49 number (Figure 590 the diagram and plotted Venn a and families, in counted gene we 26,313 species; families, other seventeen gene 1B; 10,215 Figure of (Figure set of 1; calculated analysis (Table was of family TEs data 84.3% of Gene transcriptomic family and each or pipeline, for sequences annotation rate standard protein divergence 1C). sequence a homologous The using by S10-S11). predicted 1). Table supported (Table Supplementary were genome were genes of total genes total protein-coding the A the 25,233 of scaffolds. 88.55% of the representing total anchor pseudochromosomes, A to 24 used to was anchored map were linkage S8-S9). scaffolds of Table repeat high-density 309 (˜22%) (Supplementary long-terminal a Mb assembly, of transposons chromosome-level 157 (5.34%) DNA the of Mb of a For total 38 (9.57%) suggesting A including Mb genome, elements, S7). 67 Table our repetitive and out (Supplementary be in retrotransposons 4,304 assembly to assembled (LTR) of novo predicted completely total de was been a the sequences conducted, had of genome also completeness (93.9%) inte- was of groups high analysis level BUSCO indicating BUSCO high assembly, searched S6). our 4,584 Table in (Supplementary the found genes of were core 93.95% the and of CEGMA9, grity by examined were genes eukaryotic Larimichthys dmrt1 piwi2 rnh(iue2.Tetregop fpstv eeto nlssfudta oeta 0 genes 700 than more that found analyses selection positive of groups three The 2). (Figure branch and eeietfidi h e eemnto ein(iue4) .2M einsurrounding region Mb 4.62 A 4A). (Figure region determination sex the in identified were , to L L .polyactis L. rnf122 h eut hwdthat showed results The . . . crocea polyactis in ) eaae rmteacso of ancestor the from separated .crocea L. .polyactis L. Fgr B.Snealrenme of number large a Since 4B). (Figure eoeasml.Atog hs hoooe i o xeinechromosome experience not did chromosomes these Although assembly. genome L . polyactis , atrsesaculeatus Gasterosteus .polyactis L. a oprdt h orsodn ytncrgo in region syntenic corresponding the to compared was rnf and .crocea L. .crocea L. aiymmeshv ihdge fhmlg ewe these between homology of degree high a have members family L .polyactis L. . ee nlddi h otatdo xaddfmle were families expanded or contracted the in included genes crocea lnaegop2,L2)and LG22) 22, group (linkage 9 L ieg 2. ilo er g.Teacso of ancestor The ago. years million ˜25.4 lineage .fugu T. . eeldta 4crmsmsof chromosomes 24 that revealed crocea , ai rerio, Danio dpain Fgr ,SplmnayTables Supplementary 2, (Figure adaptations prxmtl 01mlinyasao Gene- ago. years million 90.1 approximately Fgr ) h eut eetdtehigh the reflected results The 3). (Figure rnf ee eentosre nayother any in observed not were genes rnf and .polyactis L. ee n uooa members autosomal and genes .polyactis L. .polyactis L. rnf .polyactis L. aiyadseveral and family ) n htteother the that and L .polyactis L. dnie core a identified . .crocea L. polyactis cfap157 (chromosome lineage dmrt1 could was , (4.73 and L L. . Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. n ihia,sgetn eels uigeouin(iue7A). (Figure evolution during loss gene suggesting Cichlidae, and Although amphibians. and examined, reptiles, mammals, of in history evolutionary the investigated genes, We effect. damage collateral to compared Rnf183 its in profiles transcription the Rnf183 of analysis Further from of 6C). gene, away transcripts (Figure upstream the of insertion immediate Analysis 18-bp an dimorphic. an identified has neighbourhood exon gene fourth the , that , of indicated structure results gene The The 6B). (Figure region this in in evolution genomic rapid sting related closely an the with Between rearranged, was order of gene orientation gene the that showed smarca2 map the locus in gene shown A are reptiles, species that mammals, these includes indicated of group hermaphroditism. results patterns main and These reproductive other The 6A). The gonochorists. (Figure branch. are diagram separate which a amphibians, in and distributed branches. birds, is different and in group scattered are aphrodite they and gonochorists, typical calcarifer Lates of merra branches Epinephelus the Interestingly, protogyny. crocea hermaphrodite except Perciformes, and are protandry them hermaphrodite of Most 6A). (Figure group polyactis main one in clustered for be could analysed from species constructed fish was tree phylogenetic A that conclude and we 6), (Figure Therefore, stages biased. all in testis-specific sex was of panel A and dph). clustered (50-720 them, dph) together Among (240-360 clustered maturity development sexual ovarian including of and of genes, testis stages differentiation-related functional and clustered all clustered the dph) determination- dph dph, 720 form 40 and 50 to before 400 40 From (50-180, meiosis gonads at meiosis together. early-stage of gonads before The testes initiation sex Methods). and indistinguishable and dph (see dph, 40 together; stages 30 at later and gonads 13 sex 20, indistinguishable at 10, ovaries together; conducted 7, also and at was testes samples profiling and whole-fish expression dph, gene in Gonad stages intestine. developmental and ovary, 18 testis, at skin, heart, eye, brain, thirteen gene spleen, on candidate performed sex-determining was analysis the Transcriptome of analysis expression and Identification species, two .latipe O. Larimichthys rza latipes Oryzias dmrt1 dmrt1 eelctdbtenseiswt emprdt rtnr n emprdt rtgn,including protogyny, hermaphrodite and protandry hermaphrodite with species between located were rnf223 s3k ptemof upstream kb 3 is n t ls aaousso atr frpdeouini teleosts in evolution rapid of pattern a show paralogues close its and scnevdaogmmas etls mhbas n te eess except teleosts, other and amphibians, reptiles, mammals, among conserved is ae on based eeanttdi h eoeadtasrpoedt Fgr n upeetr iueS1). Figure Supplementary and 5 (Figure data transcriptome and genome the in annotated were , rnf183 dmrt1 hwdasgicn etsba.I ersrce h ee ntesxdtriainregion, determination sex the in genes the restricted we If bias. testis significant a showed , .rubripes T. .polyactis L. dmrt1 y1aa ga rnf183, figla, cyp19a1a, and n o nayohrtlotspecies. teleost other any in not and and a nyb on nSieia,Ltde oaetia,Lbia,Saia,Salmonidae, Sparidae, Labridae, Pomacentridae, Latidae, Sciaenidae, in found be only can Fgr 6D). (Figure , rnf225, dmrt1 , t vr-pcfi xrsinmyrsl ntsi-pcfi xrsinof expression testis-specific in result may expression ovary-specific Its . ootrsalbus Monopterus rpoeismarmoratus, Kryptolebias pnpeu coioide Epinephelus dmrt1 , and .niloticus O. sstill is r lsl eae aaous rhlge o l he ee ol efound be could genes three all for Orthologues paralogues. related closely are .rupries T. dmrt1 .crocea L. otis5eos eebigta fohrtlotseis including species, teleost other of that resembling exons, 5 contains .crocea L. rnf183 n tdslyda poiesxbae atr ngndexpression gonad in pattern sex-biased opposite an displayed it and , dmrt1 and . and Fgr 4C). (Figure and rnf183 rohoi niloticus Oreochromis , dmrt1 arsbergylta Labrus Larimichthys eeisrinimdaeyusra of upstream immediately insertion gene stebs e eemnto eecandidate. gene determination sex best the is foxl2 h s ru otisgncoit ueiehermaphrodite, juvenile gonochorist, contains group fish The . .crocea L. rnf183 and eeldta hsgn a w xn n sol 17bases 3197 only is and exons two has gene this that revealed uloiesqecs h eut hwdta oto the of most that showed results The sequences. nucleotide hwda vr xrsinba,while bias, expression ovary an showed .polyactis L. dmrt1 rnf223 ootrsalbus Monopterus 10 Fgr )and 7) (Figure dmrt1 oprdwt hto t ls relative close its of that with Compared . h ptemgn otnswr ieet sugge- different, were contents gene upstream the , a orltdwt h vlto fgonochorism of evolution the with correlated was y1aa ga n13 ol,rf2,gd,amh, gsdf, rnf225, foxl2, rnf183, figla, cyp19a1a, , and etorsi striata Centropristis n t ontemgenes downstream its and ise,gl,msl,kde,ha-iny liver, head-kidney, kidney, muscle, gill, tissues, rnf183 , rnf225 rza latipes Oryzias ai rerio Danio figla rnf183 hs xrsinwsas sexually also was expression whose , iial,tecoetrltv of relative closest the Similarly, . r rsn nms s pce we species fish most in present are Fgr )wr infiatyovary significantly were 8) (Figure n on httoautosomal two that found and eog otejvnl herm- juvenile the to belongs and , cnhpgu schlegeli Acanthopagrus dmrt1 rnf183 aiuurubripes Takifugu dmrt3 .polyactis L. Larimichthys dmrt1 n25 sf amh, gsdf, rnf225, dmrt1 in speetonly present is Larimichthys , dmrt2 hog a through .crocea L. .rerio D. and dmrt1 The . and , are L. L. . , Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. n t aaouscnb dnie n noae,wihpoie nte niaino h ihqaiyof quality high Genomics the 10x of level. and indication pseudochromosome Bionano another the provides using to which mapping annotated, assembly Physical and genome 2019). identified assembly. the al., be gene of can 2019; et using al., quality paralogues Yang species et its the 2019; Ge fish and 2019; improved al., several al., further of et et Conte assembly reads Jiang 2019; genome linked al., 2019; contig the et al., The than Cai assembly. 2018; contigs et al., genome of He et number the (Gong smaller S17) of (Table a sequencing completeness with third-generation and and longer assembly accuracy much long-read is the of N50 combination ensured The correction technologies. high-quality short-read sequencing first Illumina long-read the from of is benefited assembly genome This of Our technologies. contiguous studies Genomics most biological 10x the for and assembled crucial Bionano, novo we is sequencing, study, genome long-read present reference nanopore the A In males. than . larger remarkably females polyactis L. of assembly genome high-quality A in dph) (43-80 Discussion stage transient mechanism. the hermaphrodite molecular in transient and testis the basis the formation that than in ovary indicated expressed testis. of the results all in levels the were These expressed genes expression in highly three than the more these ovary was in Interestingly, in it dph) the stages. significantly (43-80 whereas few in was stage gonad, a level hermaphrodite expression and at higher brain, its testis significantly eye, cases, the gill, a in some the maintained in in and dph, expressed dph mainly 7 40 as at testes. early in detected as than expressed ovaries and in Although higher differentiation. tissues ovarian breeding many and the in formation, entering oocyte expressed when synthesis, stages ovary oestrogen later the to and in related dph) detected (7-40 also to juvenile was the contrast figla gsdf dph). during In 720 of expression and expression testis. high (360 high season maintained mature However, and dph). to dph 50 spermatogenesis 7 (after as during early downregulated as was expressed it First, profiles. and testis. expression dph), the analyse in to expressed highly screened and were genes sex-related studied (anti-M and well-identified few A in profiles expression gene Sex-related rnf183 While duplication. of divergence (RPKM functional testis tissues. the other in and level skin in detected sion (RPKM of expression gonadal analysed, and distribution tissue with The species fish other of that teleosts. from among of not location but sex-linked of reptiles orientation and gene amphibians, rnf223 the mammals, that among showed conserved map relatively locus gene A fco ngrln lh,o h.2 and 2) Chr. on alpha, germline in (factor sebe eoei candeuigtidgnrto sequencing. third-generation using Sciaenidae in genome assembled pcaie nsxdtriain ieetaino aneac nteoayof ovary the in maintenance or differentiation determination, sex in specializes leinhroe nCr )and 6) Chr. on hormone, ullerian in ¨ > rnf183 Xenopus 120). sacmecal motn candefihseis dl s hwsxa iopim with dimorphism, sexual show fish Adult species. fish Sciaenidae important commercially a is rnf225 a xrse tlwlvl nglstse n a xrml ihyepesdi ovaries in expressed highly extremely was and gills/testes in levels low at expressed was rnf223 ugsigta hymyhv endrvdfo nacetdpiaineet The event. duplication ancient an from derived been have may they that suggesting , rnf183 a ihyepesdi il (RPKM gills in expressed highly was rnf183 Cyp19a1a and cyp19a1a > sol on in found only is 0 u oepeso nteoay(iue7) hs eut niae the indicated results These 7C). (Figure ovary the in expression no but 10) rnf225 Figla , Amh rnf223, rnf183 .polyactis L. and ctcrm 40 aiy1,sbaiyA oyetd a nCr 10), Chr. on 1a, polypeptide A, subfamily 19, family P450, (cytochrome smil xrse ntegnd,epcal nteoay twsfirst was It ovary. the in especially gonads, the in expressed mainly is ananterepeso n ucini h iladohrtissues, other and gill the in function and expression their maintain ntae t xrsinfo 0dhbfr etclrmiss(50-180 meiosis testicular before dph 40 from expression its initiated rnf223 L foxl2 gsdf and rhlge Fgr 7B). (Figure orthologues . polyactis .polyactis L. Larimichthys gndlsm-eie atr nCr 6 r gonad-specific are 16) Chr. on factor, soma-derived (gonadal rnf225 ntetsi n vr,except ovary, and testis the in foxl2 a xrse ral nmlil ise,wt moderate a with tissues, multiple in broadly expressed was n vna hssae hr een infiatdifferences significant no were there stage, this at even and , rnf183 11 Frha o 2 nCr 5 r osdrdtightly considered are 15) Chr. in L2, Box (Forkhead rsmbydet ufntoaiainatrgene after subfunctionalization to due presumably , rnf183 , rnf223 n t eenihorodi eydifferent very is neighbourhood gene its and , > n t ptemaddwsra ee is genes downstream and upstream its and 2) ihol eylwlvlo expres- of level low very a only with 120), o h.1) and 11), Chr. (on .polyactis L. foxl2 .polyactis L. rnf225 i h nerto of integration the via t5 p Fgr 8). (Figure dph 50 at cyp19a1a .polyactis L. rnf183 m,Gsdf amh, Cr 4 were 24) (Chr. a complex a has .polyactis L. a widely was Foxl2 snx to next is . rnf183 Amh was was de Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. o tdigteeouinr ieto ewe emprdts n oohrs.Wehrtesexual the Whether gonochorism. and hermaphroditism between direction evolutionary the studying for except , and the 2004; of availability Cheng The and Lin 1996; hermaphroditism. (Jin to in gonochorism decades Therefore, pattern from recent 2020). in differentiation al., grounds et gonadal spawning Lu in this disruptors that endocrine environmental above, speculate mentioned We As demonstrated ancestor. very 2021). study gonochoristic previous is al., Our Perciformes spectrum. et in reproductive gonochorism fish (Xie the and of hermaphroditism evolution from the of transition that of evolution the study the the for in for reasons important species be transitional may a These Finding disruptors population. endocrine the environmental throughout of hermaphroditism. effects ratio females to the sex as resist gonochorism balanced such to imba- maturation, a help may sexual maintain ratio or accelerated to sex males, to population-level contribute before from may meiosis a pollution genes initiating environmental in always sex-related alternatively, resulting some earlier; individuals in feminization, mature larger Mutations to cause lance. pressure, forced changes. may fish are fishing environmental disruptors and teleost to perhaps endocrine capture that adapting environmental changes, escape suggesting of to environmental likely species, process various less the from hermaphroditic are in stress of across independently widespread experiencing solely reproduction is After consists sexual hermaphroditism order own literature, evolution no their the the evolved study and in to phylogeny, reported models fish excellent As as teleost serve gonochorism. could and fish reproduc- hermaphroditism these of simultaneous and of types 2009), and all protogyny, Mank, have and hermaphrodite lack Perciformes protandry, (Avise the hermaphrodism methods. hermaphrodite to reproductive gonochorism, due two including unclear above systems, remains the gonochorism tive between and hermaphroditism species plants between transitional in to evolution of times of due many direction re-evolved gonochorism the and fish, from lost In evolve been 2010). have could SS, loci hermaphroditism Renner or exists; genes H, hermaphroditism determination also (Schaefer mutations from sex female-sterility opposite and evolve or a The reproduction, male- could of sexual 2014). on occurrence gonochorism al., the endpoints where or et investment evolutionary (Bachtrog 2009), sex-specific two increasing Mank, gradually the by and fish either are (Avise in hermaphroditism gonochorism spectrum simultaneous and reproductive and hermaphroditism of gonochorism comparisons. evolution Theoretically, pairwise and the in fission studying found chromosome for were experience translocations model not and did A rearrangements chromosomes of these number although a species, events, of close fusion genome these the of in genomes assembled the contain chromosomes the genomes of with haploid number consistent their The was and chromosomes. (TGD), 25 duplication to genome 24 teleost-specific undergone have teleosts Most con- of genome genomes the addition, the aculeatus In than identified. lower the was also were which in lanceolatus RNAs sequences, annotated non-coding repetitive were 5,671 22.28% of tained genes total protein-coding A annotations. 25,233 functional of total A w lsl eae pce hwdta lotalcrmsmsof chromosomes all of LG22 almost chromosome that sex showed The species homology. related closely two while stage, hermaphrodite processes. transient a development gonad and whereas maturity, sexual patterns reproductive in differences .crocea L. .polyactis L. .polyactis L. (25.2%). (41.01%), r h lss eaie htcnitrre Lue l,21) u hydslyremarkable display they but 2019), al., et (Liu interbreed can that relatives closest the are nqepteno etsdffrnito snihrtpclhrahoiimnrgonochorism nor hermaphroditism typical neither is differentiation testis of pattern unique ’ pnpeu.akaara Epinephelus. h.2hsa netdsaod nconclusion, In scaffold. inserted an has 2 Chr. .crocea L. .crocea L. .polyactis L. .polyactis L. eoead au orsac nsxdtriainevolution. determination sex on research to value adds genome n aytp nlss(i ta. 00 akadGl 09.B compared By 2019). Gil, and Park 2020; al., et (Liu analysis karyotype and ssxal auea n ero g.Male age. of year one at mature sexually is .crocea L. a ergre satastoa pce ntepoeso evolution of process the in species transitional a as regarded be can (43.02%), .crocea L. .polyactis L. osnt olnaiyaayi ftewoegnm fthese of genome whole the of analysis Collinearity not. does 12 sas ihyhmlgu ihCr of 2 Chr. with homologous highly also is lpotro maculatum Glyptosternon a xeine eeeoefihn n olto of pollution and overfishing severe experienced has .polyactis L. .crocea L. .polyactis L. .crocea L. ai rario Danio eoe 391o hc had which of 23,991 genome, .polyactis L. .polyactis L. 3.6) and (33.96%), and .polyactis L. ed w er oreach to years two needs savr odmodel good very a is .polyactis L. (52.5%), ed oundergo to needs vle rma from evolved a 4 which 24, was Gasterosteus Epinephelus .polyactis L. .polyactis L. a high had Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. foctsadfiueo rmrilflil omto Syle l,20) ntehrahoii wrasse hermaphroditic male-specific the the Moreover, In degeneration. 2000). oocyte of with al., change sex et the al., during (Soyal gene et decreased level formation (Liang its promoters follicle and ovary, the their primordial in of (CANNTG) failure E-box ( and the to oocytes binding of pellucida by zona of ZP3) oocyte-specific mutagenesis and Targeted of expression 1997). ZP2, the (ZP1, controls that proteins factor (ZP) transcription oocyte-specific an is FIGLA Figla stage aphroditic Figla needed. to is markers related molecular be sex-specific might on hermaphroditism that that speculated showed analysis the of Phylogenetic expression the Moreover, point. in gene determination, sex-specific sex in role important dmrt1 in chromosomes an investigation. sex the plays further to requires family Compared mechanism gene molecular this The of that maintenance. expression suggesting and/or The recently. differentiation, gonads, occurred the has in regulation dimorphism gene in innovation this that rapid regulator revealed a study is RNF present interestingly, that The clear 2016). is al., it 2020). fish, et al., in inflammatory in (Yu et pathways reported Okamoto also been molecular 2016; of have al., and but evolution RNFs et expression of (Yu tumorigenesis role gene elucidated the only downstream fully on of been not studies yet few development diseases, not a the only has underlying several Although disease mechanism molecular in this The of involved (IBD). pathophysiology disease and be RING-finger bowel inflammatory to humans, as development, such In thought conditions and 2012). is an growth al., RNF183 representing et of thereby of (Gontan aspects catalysis, inactivation ( dose-dependent diverse X-chromosome members 183 2019). through al., protein in These initiate et breakdown involved domain. to Sun REX1 pathway 2009; RING-finger widely Spyroulias, causes important a and are RNF12 contains (Chasapis and responses mammals, and environmental cases In ligases or stress ubiquitin most and E3 processes, in biological of regulators family as largest act the is family ) to compared expression gonad that in note pattern to interesting 2018). is al., It uses et the also Mustapha addition, sole 2017; In tongue al., in gene. half-smooth et found sex-determination and Lin marker a 2002), molecular as al., maintenance sex-specific et 2014) testis a Nanda gene al., achieve 2002; determination et to al., sex S oestrogen et medaka (Chen with (Matsuda of The antagonism chromosome synthesis 2013). through Y al., the the factors et inhibiting pathway (Li thus signalling development L2), synthesis and protein oestrogen box of (Forkhead expression Foxl2 the inhibits mainly Dmrt1 study. further dmrt1 warrant mechanisms molecular underlying The question. of type reproduction aihee poecilopterus Halichoeres .polyactis L. figla dmrt1 , , ol n cyp19a1a and foxl2 foxl2 n t ptemgene upstream its and a loietfidin identified also was sacnevdadciia e eemnto-addffrnito-eae rncito atr Dmrt1 factor. transcription differentiation-related and determination- sex critical and conserved a is nX aeNl iai ( tilapia Nile male XY in hwda poiecag nepeso uigsxrvra Myk ta. 02.Overexpression 2012). al., et (Miyake reversal sex during expression in change opposite an showed rnf183 and rnf183 RNF183 n ol rv h etsseicepeso of expression testis-specific the drive could and dmrt1 cyp19a1a nfihadlnaeseicls nsm pce,sc szebrafish. as such species, some in loss lineage-specific and fish in .polyactis L. susra of upstream is .polyactis L. sakde-pcfi bqii iae(aeoe l,21) h bomlexpression abnormal The 2016). al., et (Kaneko ligase ubiquitin kidney-specific a is ) and rnf183 eecniee ob he ftems motn ee o vra differentiation. ovarian for genes important most the of three be to considered were ,wihdsly eaet-aesxchange, sex female-to-male a displays which ), rnf183 .polyactis L. xrsinpten epnil o h aeseictasetherm- transient male-specific the for responsible patterns expression Figla hc sepesdi h etsadnti h vr tams n time any almost at ovary the in not and testis the in expressed is which , ilcag rmgncoimt opeehrahoiimi tl nopen an still is hermaphroditism complete to gonochorism from change will rnf183 rohoi niloticus Oreochromis r e eemnto addt ee in genes candidate determination sex are dmrt1 sol busra of upstream kb 3 only is .crocea L. nmuersle noaista eesaloigt asv depletion massive to owing small were that ovaries in resulted mouse in dmrt1 ie albiflora Nibea dmrt1 rncitm nlsssoe that showed analysis Transcriptome . nyin only r addt e eemnto ee in genes determination sex candidate are ptemgene upstream h eut hwdta h e eemnto addt gene candidate determination sex the that showed results the , h IG(elyItrsigNwGn)figrpoen( protein finger Gene) New Interesting (Really RING The . Larimichthys 13 , e oalc fmitcsemtctsa elas well as spermatocytes meiotic of lack a to led ) .crocea, L. rnf183 dmrt1 dmrt1 u o nayohrtlotseis suggesting species, teleost other any in not but n tdslyda poiesex-biased opposite an displayed it and , losoe biu euldimorphism. sexual obvious showed also and hog oltrldmg ffc.More effect. damage collateral a through dmrt1 ctpau argus Scatophagus Dmy figla .polyactis L. eergo a enrpre as reported been has region gene dmrt1 sauiu oyof copy unique a is dmrt1 a bnatyepesdin expressed abundantly was rnf183 vlto.Teeoe we Therefore, evolution. rnf183 stems important most the is n ute research further and , hw biu sexual obvious shows Sne l,2018; al., et (Sun .polyactis L. sovary-specific is dmrt1 dmrt1 rnf on Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. two,T . ek .E 19) RNSapoenmtffigrrn database. fingerprint informati- motif protein sequence PRINTS–a Amini, long-range (1994). M., E. vitro, M. Ronaghi, In Beck, 7 & L., (2014). K., Christiansen, T. J. contiguity. Attwood, A., transposase Shendure, Kumar, via & assembly R., J., genome Daza, F. novo ps://doi.org/10.1101/gr.178319.114 N., de Steemers, J. for L., on Burton, K. programs. O., Gunderson, search J. S., database Kitzman, protein A., of Adey, generation new a PSI-BLAST: research Sch and L., BLAST T. Gapped Madden, F., S. life. Altschul, daily co. seedling and aquatic materials harbor experiment Xiangshan of the convenience References from for Veterinary Xu of China Wantu College Mr. Ningbo, University County, and Auburn Xiangshan from Zheng LTD, fund Genxing start-up Mr. laboratory thank generous Station We a Experiment Medicine. Foundation Agricultural as Science Alabama well National an as 1018100), 1928770), Grant, by OIA Enabling project (NSF supported (Hatch Fellowship Agriculture is Research technology Track-4 and X.W. (grant RII and Food 2019C0001). EPSCoR of China the number science Institute (grant LQ19C190002); of granary National county number USDA Xiangshan (grant Foundation blue the Province of of Science Department Zhejiang Technology of project Natural and Foundation special Science Science National Natural and the the national 31902341); 2018YFD0901204); key number number the by (grant innovation supported was work This dimorphism expression gene sex-related Acknowledgments for foundation solid a provides also in but evolution hermaphrodite of of genome high-quality that the summary, demonstrated In results These dph. 130 before stage. samples testis and ovary figla between difference significant no the whereas however, dph, 80 stage; after hermaphroditic Interestingly, transient testis. the male-specific respectively in tilapia, levels XX lower in male 2017). to al., In female et play from Zhang Dmrt1 2016, reversion al., and sexual Foxl2 et and bind 2016). (Lau as zebrafish primarily al., early in (BPA) as et masculinization A expressed (Xie complete bisphenol is ovary or Cyp19a1a the E2 tilapia, of of action In oestrogen differentiation 2019). zebrafish, the al., maintaining In regulating drive et et by 2013). and to Wang differentiation Li development sex fish 2013; 2018; in ovarian female al., Chang, roles antagonistic promoting et XX and Li in (Wu in 2010; dph teleosts role differentiation al., 5 gonochoristic vital sex et and Guiguen in a hermaphroditic 2007; that play factor al., both documented key 1) in well a Member afterward is considered A feminization It is Subfamily 2000). oestrogen 19 Nagahama, al., differentiation, Family 2007; et ovarian al., (Qin of mutant et inducer the (Wang follicular in natural individual nucleolar phenotype a to chromatin all-male As oocytes an prefollicular CN in the cystic resulting at blocked, from was transition oocytes 2018). IB) subsequent forming (stage the oocytes However, meiosis, perinucleolar IA). underwent (stage and stage cysts (CN) 2008). in al., clustered et were (Jørgensen zebrafish, occurred In differentiation 2015). gonadal al., et (Qiu spermatids 7,8188 ( 841–848. (7), .polyactis L. .polyactis L. , foxl2 . dmrt1 25 and , 1) 3930.( 3389–3402. (17), hsi aubeifrainfrehnigfihaquaculture. fish enhancing for information valuable is This . , Sn ta. 00.Ls ffnto fcp91 yCIP/a9adTLN illa to lead will TALENs and CRISPR/Cas9 by cyp19a1a of function of Loss 2020). al., et (Song https://doi.org/10.1093/protein/7.7.841 cyp19a1a figla foxl2 , foxl2 r epnil o P omto n h aeseictasethermaphroditic transient male-specific the and formation EPO for responsible are and and cyp19a1a https://doi.org/10.1093/nar/25.17.3389 cyp19a1a ffr .A,Zag . hn,Z,Mle,W,&Lpa,D .(1997). J. D. Lipman, & W., Miller, Z., Zhang, J., Zhang, A., A. ¨ affer, ). .polyactis L. figla xrsinee ece eyhg ee t10dh u hr was there but dph, 100 at level high very a reached even expression figla eeas bnatyepesdi h vr n xrse at expressed and ovary the in expressed abundantly also were , foxl2 a xrse rm2 p,wihwslkl h iewhen time the likely was which dpf, 26 from expressed was cyp19a1a o nypoie nih notegntcunderpinnings genetic the into insight provides only not 14 and cyp19a1a ). figla figla xrsinadosrgnpouto L tal., et (Li production oestrogen and expression eoeresearch Genome uatzbas hwdta h emcells germ the that showed zebrafish mutant ee erae ihoct degeneration oocyte with decreased level eeas xrse ntetsi nthe in testis the in expressed also were foxl2 Esr2a ). and cyp19a1a oihbtteepeso and expression the inhibit to , 24 1) 0124.( 2041–2049. (12), rti engineering Protein Ctcrm P450 (Cytochrome uli acids Nucleic htt- . Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. da .C 20) UCE utpesqec lgmn ihhg cuayadhg throughput. high and accuracy high with mammalian alignment research of sequence evolution multiple acids The MUSCLE: (2006). (2004). W. C. M. R. Edgar Hahn, & N., Cristianini, E., J. families. Stajich, gene T., teleosts. Bie, De in P., J. hermaphroditism Demuth, Functional (2008). M. Liu, ( of 1-43. study and the Y.S. for tool Mitcheson, computational a De heuristics CAFE: (2006). new W. models, M. Hahn, more evolution. & P., family 2: J. gene Demuth, jModelTest N., (2012). Cristianini, T., D. Bie, De Posada, from & assemblies R., computing. Doallo, genome parallel L., microbial and G. finished Taboada, Nonhybrid, D., (2013). Darriba, al. et data. P. sequencing Marks, SMRT D., long-read Alexander, CS., Chin, abstract). and English utilizati- with cultivation rational Chinese and season Broodstock close China summer (2016). of of Effects Sciences al. (2004). ( FY. croaker et Ding, redlip JS., of Li, F., on JZ., Ling, Liu, LS., Lin, L., JH., croaker Cheng, Chen, yellow DD., small Xu, in W., (DOI: techniques Zhan, induction B., spawning Lou, discovery. drug RY., and structure enhancing Chen, ligases: (https://doi.org/10.2174/138161209789271825) ubiquitin and 3716–3731. E(3) finger 15(31), repairing RING design, (2009). for pharmaceutical A. synteny. G. Current set Spyroulias, conserved & tool T., and C. a Chasapis, maps Chromonomer: genetic (2020). of S. integration Bassham, ( through & genomes of A., assembled database Amores, ProDom J., The (2005). Catchen, D. Kahn, 3D. & on S., emphasis Dalmar, ( more Y., families: Beausse, domain Carr`ere, S., protein E., Courcelle, C., Bru, Genomewise. and GeneWise (2004). R. Durbin, ( & M., Clamp, E., sequences. Birney, DNA analyze to program ( a 573–580. finder: (2), repeats Tandem TrEM- (1999). supplement G. its Benson and database sequence protein 2000. SWISS-PROT The in (2000). BL R. Apweiler, & A., Bairoch, it? doing Con- Sex of of ways Tree (2014). many al. so et nal.pbio.1001899 TL., why Ashman, determination: SP., Otto, Sex M., Kirkpatrick, sortium CL., Peichel, ontology: JE., Mank fishes. D., Gene Bachtrog, in (2000). S., hermaphroditism on G. Lewis, Dolin- ( perspectives A., Sherlock, Evolutionary P., Review. Kasarskis, (2009). 63. & A. JE. L., M., Mank Davis, Consortium. Issel-Tarver, JC, G. P., Ontology M., Avise Gene D. Rubin, J. The Hill, Cherry, M., biology. A., Ringwald, H., of M. E., Butler, unification ps://doi.org/10.1038/75556 Harris, the J. D., T., for Richardson, Botstein, J. tool C., A., Eppig, S., J. J. S. Blake, Matese, Dwight, A., K., C. ski, Ball, M., Ashburner, doi:10.1101/2020.02.04.934711 https://doi.org/10.1093/nar/gki034 https://doi.org/10.1101/gr.1865504 10.16378/j.cnki.1003-1111.2016.03.010 https://doi.org/10.1111/j.1467-2979.2007.00266.x uli cd research acids Nucleic o:10.1159/000223079 doi: lSone PloS https://doi.org/10.1093/nar/27.2.573 . 32 ). . 5,19–77 ( 1792–1797. (5), 11 aiihhspolyactis Larimichthys Bioinformatics 5-6.( 554-560. auemethods Nature . 1 1,e5 ( e85. (1), ). ) http://en.cnki.com.cn/Article a Methods Nat . https://doi.org/10.1093/nar/gkh340 https://doi.org/10.1371/journal.pone.0000085 ). ). 28 . ). 22 1,4–8 ( 45–48. (1), , 9 1) 2917.( 1269–1271. (10), lee)rsuc ntees hn e region. Sea China east the in resource Bleeker) 8,72 ( 772. (8), I hns ihEgihabstract). English with Chinese (In ) . uli cd research acids Nucleic 10 6–6.( 563–569. , https://doi.org/10.1093/nar/28.1.45 15 ). https://doi.org/10.1038/nmeth.2109 ). suocan polyactis Pseudosciaena https://doi.org/10.1093/bioinformatics/btl097 LSBiol. PLoS en/CJFDTOTAL-ZSCK200406012.htm https://doi.org/10.1038/nmeth.2474 auegenetics Nature 12 ). . eoeresearch Genome 33 7:1089 ( (7):e1001899. Dtbs su) D212–D215. issue), (Database uli cd research acids Nucleic ). ihadFisheries and Fish . bioRxiv e Dev Sex ihSci. Fish . 25 o:10.1371/jour- doi: ora fFishery of Journal ). . ). 1,2–9 ( 25–29. (1), 14 2004.934711. , . 5,988–995. (5), 35 3 (2-3):152- :250-254. Nucleic ). (In ) , , htt- 9 27 ). : Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. aeo . ws,I,Ymsk,Y ta.(06.Gnm-ieietfiainadgn expression gene and identification Genome-wide (2016). al. et Y. Yamasaki, I., Iwase, M., Kaneko, differentiation. profiles Expression sex ( (2008). P. gonadal 25. Bjerregaard, during & 6, genes J., L. from zebrafish Rasmussen, matrices six O., Andersen, data for E., mutation 275-282. J. of 8: Morthorst, A., generation Biosciences Jørgensen, rapid the The in (1992). Applications J.M. Computer Thornton sequences. and protein W.R., hermaphroditic Taylor among D.T., self-fertilization Jones of distribution the too: up & it M., mix animals. Animals reveals Pagni, (2006). S., JR. Auld, P. life P., Langendijk-Genevaux, Jarne, E., of database. Castro, PROSITE De Tree The L., (2006). ( Cerutti, (2015). J. V., C. Bulliard, S. Sigrist, A., Bairoch, Kumar, N., Hulo, & at M., RepeatMasker yel- MJ. Flynn, Paymer, of > & A., cycle M., Smit, R., Hubley, Reproductive Suleski, diversification. (2010). J., and YJ. ps://doi.org/10.1093/molbev/msv037 Marin, Chang, speciation SP., B., clock-like Mi, S. SY., Hedges, Kim, CM., An, ( croaker MH., Wortman, to Le, & low Program R., C. the KL., Buell, and O., Han, EVidenceModeler White, using J., Orvis, annotation E., R., structure Alignments. J. gene Maiti, Allen, Spliced eukaryotic M., Assemble I., Automated Pertea, W., (2008). L. Zhu, R. Hannick, L., J. S. Arabidopsis Jr, Salzberg, the K., J., Improving B. (2003). R. Haas, O. Smith, White, & R., assemblies. pivotal L., alignment J. S. ( A transcript Wortman, Salzberg, 5666. maximal D., M., estrogens: C. using S. Town, annotation B., and Mount, genome fish. D. L., Rusch, aromatase M., in A. C. Ovarian Delcher, change Ronning, J., (2010). sex B. CF. and Haas, Chang differentiation F, sex Piferrer gonadal 10.1016/j.ygcen.2009.03.002). A, for Fostier role anno- Rfam: Y, (2005). A. Guiguen Bateman, & R., genomes. S. complete Eddy, in ( A., RNAs Khanna, M., non-coding Marshall, Fan,tating transcrip- S., Full-length X., Moxon, F., (2011). genome. Adiconis, S., Palma, A. Griffiths-Jones, reference I., di Regev, a N., Amit, . Rhind, without . A., . A., data D. N., Gnirke, Friedman, RNA-Seq ( Thompson, N., K., from Z., Hacohen, Lindblad-Toh, assembly E., J. C., tome Mauceli, Nusbaum, Levin, Z., W., M., Chen, B. Yassour, Q., Birren, J., Zeng, B. R., Raychowdhury, degradation. Haas, L., for J. G., REX1 Grootegoed, M. families targeting W., by Grabherr, IJcken, protein inactivation van the X-chromosome E., ( initiates Rentmeester, Pfam: 386–390. RNF12 S., (2014). 485(7398), (2012). T. Nature, M. Barakat, J. J., Gribnau, Punta, Demmers, & & M., A., J., E. Hetherington, Achame, Tate, A., C., L., Heger, Gontan, R., E. S. Eddy, Sonnhammer, Y., J., R. Mistry, Eberhardt, database. P., L., Coggill, J., Holm, Clements, K., A., Bateman, D., R. Finn, https://doi.org/10.1093/nar/gkj063 https://doi.org/10.1007/s12562-010-0288-5 https://doi.org/10.1093/nar/gki081 https://doi.org/10.1038/nbt.1883 . https://doi.org/10.1093/nar/gkg770 https://doi.org/10.1186/1477-7827-6-25 Evolution. uli cd research acids Nucleic aiihhspolyactis Larimichthys 60 8612.( 1816–1824. https://doi.org/10.1038/nature11070 eoebiology Genome . 42 ). https://doi.org/10.1111/j.0014-3820.2006.tb00525.x ). ). Dtbs su) 22D3.( D222–D230. issue), (Database ). nsuhr aeso oe.FseisSine 6971-980. 76 Science. Fisheries Korea. off waters southern in ). oeua ilg n evolution and biology Molecular ). , ). 9 uli cd research acids Nucleic uli cd research acids Nucleic 1,R.( R7. (1), 16 < http://www.repeatmasker.org/RepeatModeler.html erdciebooyadedciooy:RB&E : endocrinology and biology Reproductive https://doi.org/10.1186/gb-2008-9-1-r7 e opEndocrinol Comp Gen ) auebiotechnology Nature uli cd research acids Nucleic https://doi.org/10.1093/nar/gkt1223 . . 34 33 Dtbs su) D227–D230. issue), (Database Dtbs su) D121–D124. issue), (Database . 32 4,8585 ( 835–845. (4), ). . 6:5–6 (doi: 165:352–66. 29 . 31 7,644–652. (7), 1) 5654– (19), ). htt- ). , Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. i ,SiY unJ ta.(03.Etmto fgnmccaatrsisb nlzn -e rqec in frequency k-mer analyzing by characteristics genomic of Estimation (2013). al. projects. et genome J, novo Yuan de Y, Shi B, Liu i,L. hn,J.(04.A nlsso h urn iuto ffihr ilg fsmall of biology fishery of situation current the sea. of china analysis male-specific east a An the of (2004). Identification in (2017). JH. (DOI: Z. croaker Wang, Cheng, & yellow ( S., LS., croaker Sun, Lin, yellow X., Lin, large K., the Ye, ps://doi.org/10.1016/j.aquaculture.2017.08.009). in S., Xu, marker S., DNA Xiao, the A., in involved Lin, factor transcription cell-specific germ genes. a pellucida FIGalpha, zona (https://dev.biologists.org/content/124/24/4939.long). (1997). the J. of Dean, expression & coordinate M., without S. or Soyal, with L., data Liang, RNA-Seq genome. from panda quantification giant transcript accurate the genome. RSEM: of reference (2011). assembly a N. novo C. Dewey, de transform. & and B., sequence Burrows-Wheeler Li, The with (2010). al. alignment (doi: et read 463(7279):311. G, Tian short . W, accurate Fan R, and Li Fast Sun, (2009). J., R. H. Durbin, differentiati- Bioinformatics Shi, & sex Endocrinol Comp R., H., Gen in differentiation. T. Foxl2 sex Li, and Wang, plasticity and j.ygcen.2018.11.015). sexual P., 10.1016/ fish Dmrt1 in Q. (doi: estrogens of 277:9–16. of Xie, TALENs. Roles (2019) roles D. L., Wang by L, Antagonistic X. Sun demonstrated M, (2013). Jiang, Li as S. L., tilapia D. Y. in Wang, Sun, production & R., ( estrogen Y., M. via Li, L. on H., Zhou, eukaryotic H. for N., groups Yang, L. ortholog H., annotation of identification M. domain OrthoMCL: Li, protein (2003). S. the D. to Roos, & updates Jr, recent genomes. J., C. 7: Stoeckert, SMART L., (2012). Li, P. Bork, & T., (cyp19a1a) resource. Gene Doerks, Differentiation. Aromatase Ovarian I., Ovarian Failed Letunic, to Zebrafish Due of Offspring Knockout All-male (2016). to W. Leads 6 Ge, CRISPR/Cas9 M., and TALEN Qin, by Z., Zhang, 2. ES., Bowtie Lau, with alignment gapped-read Fast (2012). Genetics L. ( Evolutionary S. 357–359. Molecular Salzberg, 35:1547-1549. X: & Evolution MEGA B., and (2018). Langmead, Biology K. M. Tamura Molecular Marra, and platforms. C., & computing Knyaz J., across M., Analysis Li S. G., Jones, Stecher D., S., Kumar Horsman, genomics. R., comparative Gascoyne, for J., aesthetic information Connors, ( an I., ( Circos: Birol, (2009). 30955. J., A. Schein, 6, M., Rep Krzywinski, Sci genomes. degradation. novel in protein finding 2105-5-59 Gene reticulum (2004). I. endoplasmic Korf for ligases ubiquitin ps://doi.org/10.1038/srep30955 of profiling https://doi.org/10.1210/en.2013-1451 https://doi.org/10.1101/gr.092759.109 75.( 37357. 10.16441/j.cnki.hdxb.2004.04.009 uli cd research acids Nucleic eoeresearch Genome o:10.1038/srep37357 doi: ). https://doi.org/10.1038/nmeth.1923 , 25 M bioinformatics BMC 1) 7416.( 1754–1760. (14), 10.1038/nature08696 uniaieBiology Quantitative . 13 ) . 9,27–19 ( 2178–2189. (9), 40 ). Dtbs su) 32D0.( D302–D305. issue), (Database https://doi.org/10.1093/bioinformatics/btp324 ). I hns ihEgihabstract) English with Chinese (In ) ) . ). 12 ora fOenUiest fChina. of University Ocean of Journal aiihhscrocea Larimichthys 2.( 323. , . M bioinformatics BMC ). 35 https://doi.org/10.1101/gr.1224503 Development s13:26.( 1–3):62-67. (s 17 https://doi.org/10.1186/1471-2105-12-323 Cmrde nln) 2(4,4939–4947. 124(24), England), (Cambridge, https://arxiv.org/abs/1308.2012 ). . https://doi.org/10.1093/nar/gkr931 5 Endocrinology eoeresearch Genome Aquaculture 9 ( 59. , https://doi.org/10.1186/1471- , auemethods Nature ). ). 5(2,4814–4825. 154(12), , 480 . 19 34(4) 1-2.(htt- 116-122. , 9,1639–1645. (9), ). 565-70. : ). c Rep. Sci . Nature 9 (4), htt- ). Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. udr . pelr .(07.Itrr n nePocn ol o rti eunecasfiainand classification sequence protein for tools InterProScan: and InterPro homology (2007). R. comparison. in Apweiler, & Challenges N., fish, genes, Mulder, sex-specific hermaphroditic of (2013). protogynous profile (https://doi.org/10.2108/zsj.29.690). expression M. the and 690–701. in cloning Punta, 29(10), Molecular Dmrt1, & (2012). and H. Kuniyoshi, A., Figla & Y., Bateman, Sakai, GlimmerHMM: Y., Miyake, R., regions. and S. coiled-coil of Eddy, evolution TigrScan convergent D., ( and R. HMMER3 search: Finn, protein of J., database (2004). PANTHER Mistry, The (2005). L. A., D. Muruganujan, P. pathways. N., Thomas, and S. Guo, ( & gene-finders. S., functions H., Rabkin, Kitano, subfamilies, J., J., Salzberg, Vandergriff, families, eukaryotic M. A., Campbell, Kejariwal, O., R., & initio Doremieux, Loo, B., Lazareva-Ulitsky, Y- H., M., ab Mi, a is Pertea, source DMY ( (2002). open H., M. Sakaizumi, W. Shi- two & E., fish. S., C. medaka Majoros, Morrey, the Hamaguchi, in T., H., development Kobayashi, male Hori, C., for N., (https://doi.org/10.1038/nature751). Matsuda, required Shimizu, gene T., DM-domain Sato, S., specific A., Asakawa, genes Shinomiya, RNA N., transfer Y., bata, of Nagahama, detection improved M., for program Matsuda, a tRNAscan-SE: (1997). sequence. R. genomic S. in Eddy, & M., T. Lowe, (https://doi.org/10.1016/j.chemosphere.2020.125907) tr .(d)Agrtm nBonomtc.WB 04 etr oe nCmue cec,vl8701. vol Science, Computer in Notes Morgen- Lecture D., Brown 2014. In: WABI ( Reads. Bioinformatics. Long Heidelberg. in Noisy Berlin, amongst Algorithms Springer, Discovery (eds) Alignment B. Local stern Efficient (2014). G. Myers and risks, distribution, Seasonal contaminants emerging (2020). these Y. Will continental-scale?. Lin, at waters: & environment coastal Y., marine in Zhang, in chemicals J., risks disrupting Wu, potential endocrine C., pose of Zhang, sources J., Lu, Biol. nfih n obr,O jsu LTrne,EAdrsn OSeaso,eios rceig fthe of http://www.vliz.be/en/imis?refid=4219 Proceedings at: Available editors. 211–22. Stefansson, p. SO Andersson, (2000). Bergen E of Taranger, University GL and Research Kjesbu, gametogenesis and OS entiation Norberg, differ- sex B 6 gonadal In: of regulators fish. major in (2018). hormones: steroid L. Gonadal G. Y. Nagahama Li, & . . . P., H. Chen, W., Yang, ( T., scat H. spotted in Gu, gene H., determination Z. sex candidate Liang, , a N., is D. Dmrt1 Jiang, Male-specific F., U. Mustapha, and hybridization Interspecific sequentially two (2019). of development R.Y. male Primary Chen (2016). SX. W., Ding, B., Kang, Zhan groupers, XJ., hermaphroditic Shan, B., YY., Wang, M., Lou Liu, T. QTL and Chu, construction of map (https://doi.org/10.1007/s10499-019-00353-x). Y., linage characterization genetic Liu, first genetic A F., (2020). W. Liu, Xu B, Lou in H, 68592-0 traits Chen growth Q, for Xie mapping W, Zhan F, Liu https://doi.org/10.1093/nar/gkt263 https://doi.org/10.1093/nar/gki078 https://doi.org/10.1093/bioinformatics/bth315 th 495 nentoa ypsu nRpoutv hsooyo ih egn owy nttt fMarine of Institute Norway: Bergen, Fish. of Physiology Reproductive on Symposium International 88 5-5.(https://doi.org/10.1016/j.aquaculture.2018.06.009). 351-358. , ). 4:5863 ( (4):1598-613. ehd nmlclrbiology molecular in Methods uli cd research acids Nucleic https://doi.org/10.1111/jfb.12936 pnpeu akaara Epinephelus aiihhspolyactis Larimichthys aiihhspolyactis Larimichthys https://doi.org/10.1007/978-3-662-44753-6 ). ). . 396 . and 25 ). uli cd research acids Nucleic 97.( 59–70. , pnpeu awoara Epinephelus 5,9594 ( 955–964. (5), 18 c Rep. Sci . )and () ). https://doi.org/10.1007/978-1-59745-515-2 Bioinformatics aihee poecilopterus Halichoeres .crocea L. 15 https://doi.org/10.1093/nar/25.5.955 1()161 ( ;10(1):11621. uli cd research acids Nucleic Prioms Epinephelidae). (Perciformes: . 33 (). 5 ctpau argus Scatophagus ). Chemosphere qautInt Aquacult Nature Dtbs su) D284–D288. issue), (Database . o:10.1038/s41598-020- doi: 20 , 417 . olgclscience Zoological 1) 2878–2879. (16), 68) 559–563. (6888), 4,125907. 247, , . 27 41 ). Aquaculture 663–674. , 1) e121. (12), Fish J ). 5 ). , Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. aao,A . ooyv .V 20) biii eefidn nDoohl eoi DNA. genomic Drosophila in finding gene initio Ab for (2000). package V. V. Bioconductor analysis Solovyev, expression & a research differential A., for edgeR: A. method normalization Salamov, scaling A (2010). (2010). data. K. A. RNA-seq Oshlack, data. of G. & expression D., Smyth, M. gene Robinson, & digital J., of D. analysis ( L. Journal McCarthy, American expression Zhou, plants. D., differential & flowering the M. D., in Wang, correlates Robinson, its (doi:10.1002/j.1537-2197.1995.tb11504.x). Y., and 596-606. Dioecy Nagahama, 82 Tilapia (1995). Botany. J., Nile R.E. of Wei, Teleosts, Ricklefs, and a L., S.S. in Renner, Sun, Spermatogenesis L., Antagonizing by Wu, Differentiation T., Ovarian ( Favors Charkraborty, Figla large S., (2015). in Sun, families Y., Figla/figla Gonadogenesis. repeat of Qiu, Roles Zebrafish of (2018). During W. identification Formation Ge, (https://doi.org/10.1210/en.2018-00648). Follicle novo & N., 3699–3722. and Shirgaonkar, De 159(11), W., Development Chen, diploid Ovary W., (2005). Q. Juvenile and Wong, A. W., in Assembly Song, P. Z., Zhang, Pevzner, (2015). M., Qin, & A. C., Bashir, R. N. Durrett, Jones, . G., . genomes. . Stedman, L., Deikus, Y., M., A. technologies. A., A. Guo, single-molecule Price, Cohain, Stutz, S., via H., C. genome T., Cao, human Chin, Rausch, ( H., individual 780–786. R., O., M. an Fritz, Franzen, Altman, of H., A., C., architecture Dai, Ummat, S. A., Blanchard, W., Hastie, A. E., T., Pang, Anantharaman, R., W., Sebra, M., Pendleton, ar,G,Banm . of .(07.CGA ieiet cuaeyantt oegnsin genes core annotate accurately to pipeline a CEGMA: (2007). I. Korf, & Drosophila. K., in genomes. eukaryotic GeneID Bradnam, G., (2000). Parra, R. Guigo, & E., ( Blanco, G., Parra, Yellow Small trilimeatum, Parapristipoma Grunt, Chicken 10.12717/DR.2019.23.1.073). of zebrafish. Analysis Karyotypic in differentiation Croaker, (2019). testis HW. Gil roads: IS, winding Park molecular and of Long journal (2009). Endocrinol. PE. International Cell Olsson, Mol R., Function. Sreenivasan, RNF183, L., Biological Ligases, Orban, and Ubiquitin Tissue-Specific Disease ( of in 3921. Role 21(11), The RNF152, sciences, (2020). and M. RNF182 Encyclopedia Kaneko, Kyoto & RNF186, KEGG: K., (1999). Imaizumi, M. T., Kanehisa, Okamoto, & H., Bono, Genomes. W., and Fujibuchi, Genes K., Phylogenetics. of Sato, and alignments. S., Evolution RNA Goto, Molecular H., of Ogata, inference (2000). S. 1.0: Kumar Infernal and (2009). M. R. Nei S. Eddy, & L., region D. matics sex-determining Kolbe, the P., in E. DMRT1 Nawrocki, of N., America copy Shimizu, medaka, of T., duplicated the States Haaf, A United of Z., (2002). Shan, the chromosome M. A., Y Shimizu, Schartl, the C., & of Winkler, M., S., Schmid, Asakawa, A., U., Hornung, Shima, M., Kondo, I., Nanda, https://doi.org/10.1093/bioinformatics/btp616 niloticus Oreochromis https://doi.org/10.1101/gr.10.4.511 . . Bioinformatics 25 aiihhspolyactis Larimichthys https://doi.org/10.1038/nmeth.3454 10 1) 3513.( 1335–1337. (10), 4,5652 ( 516–522. (4), eoebiology Genome Bioinformatics 312 ). https://doi.org/10.3390/ijms2111392 uli cd research acids Nucleic . 12:54.( (1-2):35-41. lSone PloS 21 https://doi.org/10.1101/gr.10.4.516 up ,i5–38 ( i351–i358. 1, Suppl , https://doi.org/10.1093/bioinformatics/btp157 99 n rw rae,Mihhsmiy e Reprod. Dev miiuy. Miichthys Croaker, Brown and , 1) 17–18.(https://doi.org/10.1073/pnas.182314699) 11778–11783. (18), . . 04,e130.(https://doi.org/10.1371/journal.pone.0123900). e0123900. 10(4), , 11 ). 23 rza latipes Oryzias https://doi.org/10.1016/j.mce.2009.04.014 3,R5 ( R25. (3), 9,16–07 ( 1061–1067. (9), ). . ). 27 https://doi.org/10.1093/bioinformatics/bti1018 https://doi.org/10.1186/gb-2010-11-3-r25 1,2–4 ( 29–34. (1), 19 . rceig fteNtoa cdm fSine of Sciences of Academy National the of Proceedings https://doi.org/10.1093/bioinformatics/btm071 ) ). https://doi.org/10.1093/nar/27.1.29 eoeresearch Genome Bioinformatics xodUiest Press University Oxford ). auemethods Nature ). . 23 . 26 10 Endocrinology 1:37.(doi: (1):73-78. ). 1,139–140. (1), 4,511–515. (4), e York. New , . ). Bioinfor- Genome 12 ). (8), ). , Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. ag . ag . oes .E,Mn,R,Aa,M,&Ptro,A .(08.Urvln an- zebrafish, Unraveling in the collinearity in and Univ hermaphroditism Synteny ( Juvenile (2008). (2008). 486–488. (1977). H. H. 320(5875), H. A. Takahashi, Paterson, N.Y.), A. & York, M., (New Paterson, Alam, Science R., & Ming, genomes. X., plant Wang, M., E., J. Alam, Bowers, H., Tang, R., maps. Ming, gene E., angiosperm scans J. multiply-aligned ( Genome-wide through Bowers, (2013). hexaploidy P. X., Y. cient Wang, Zhang, & H., dolphins. Y., and of Tang, Y. identification adaptation Shen, sex M., aquatic D. Genetic the ( Irwin, in Q., (2018). 130–139. involved H. genes Z. Liu, P., Wang, candidate W. for & Zhou, M., B., Y. ( Cai, RING-Finger Sun, drum Z., Plant yellow on Han, the Progress A., in Research system Lin, (https://doi.org/10.1016/j.aquaculture.2018.03.042). determination S., (2019). sex M. Xiao, potential A. W., the Xie, Li, & for S., A., (MEGA) Sun, Ren, (https://doi.org/10.3390/genes10120973) Analysis I., 973. Genetics R. 10(12), Genes, Ahmed, Evolutionary Y., Proteins. Molecular Sun, 1312–1313. J., (2020). post- Sun, ( S. Evolution Kumar with and and eukaryotes Biology and 30(9), Molecular in K., macOS. analysis prediction sources. Tamura Gene external G., England), from phylogenetic Stecher (2006). hints S. uses for Waack, that (Oxford, & model Markov ( submodel. B., tool intron hidden Morgenstern, new generalized Bioinformatics a O., a a and Schoffmann, model Markov M., hidden Stanke, a with 8: prediction Gene (2003). version Bioinformatics S. phylogenies. Waack, & RAxML M., Stanke, large analy- Web RAxML the of ( phylogenetic for (2014). models. algorithm bootstrap analysis likelihood-based A. rapid mixed A Stamatakis (2008). maximum and J. Rougemont, & RAxML-VI-HPC: taxa P., servers. Hoover, A., of Stamatakis, tran- thousands cell-specific (2006). BUSCO: ( with (2015). germ M. A. ses E. orthologs. Zdobnov, a & single-copy Stamatakis V., with E. completeness FIGalpha, Kriventseva, ( annotation P., 3210–3212. and Ioannidis, (19), assembly M., genome R. Waterhouse, assessing seven A., (2000). suggests F. Asia. (Cucurbitaceae) Simao, to J. Momordica dispersal long-distance of 10.1016/j.ympev.2009.08.006). recent phylogeny and (doi: Dean, three-genome formation. monoecy 60. A to dioecy & follicle (2010). from returns SS. ovarian Renner A., H, Schaefer for Amleh, required (https://dev.biologists.org/content/127/21/4645.short). M., factor scription S. Mechanism. Bisphe- of Signalling Soyal, Estrogenicity Its for Evidence and Genetic Differentiation (2020). 10.1016/j.jhazmat.2019.121886). Gonadal W. Ge Zebrafish (doi: ESW, in Lau ZW, A Zhang K, nol Wu HJ, Lu WY, Song https://doi.org/10.1101/gr.080978.108 https://doi.org/10.1186/1471-2105-7-62 https://doi.org/10.1093/bioinformatics/btu033 https://doi.org/10.1093/bioinformatics/btl446 . 28 ytmtcbiology Systematic 5–5 ( :57–65. https://doi.org/10.1093/gbe/evs123 . 19 https://doi.org/10.1093/bioinformatics/btv351 http://hdl.handle.net/2115/23605 up ,i25i25 ( ii215–ii225. 2, Suppl . 57 5,7871 ( 758–771. (5), ). ). https://doi.org/10.1093/bioinformatics/btg1080 https://doi.org/10.1093/molbev/msz312 ). https://doi.org/10.1080/10635150802429642 ). ). 20 ) Bioinformatics ie albiflora Nibea rcyai rerio Brachydanio https://doi.org/10.1126/science.1153917 ). Development eoeResearch Genome eoebooyadevolution and biology Genome o hlgntEvol. Phylogenet Mol aadMater Hazard J . ). M bioinformatics BMC Aquaculture . 2(1,4645-4654. 127(21), , ). 22 ulFcFs Hokkaido Fish Fac Bull Bioinformatics . 2) 2688–2690. (21), ). 18 ). . 492 386:121886. . 1944–1954. , 54(2):553– 253-258. , . 7 . 5 62. , (1), . 31 ). Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. u . ag .(07.LTR (2007). H. Wang, & Early in Z., Stage Hermaphroditic Fe- Transient A Causes Xu, (2021). SF-1 B. Lou of X, Croaker, Wang Haploinsufficiency Yellow P, 11 Little Tan in F, Development Liu (2016). Gonadal W, DS. Male Zhan BB, Wang Li QP, LN, Tilapia, Xie Sun Nile Dmrt1 LL, in Reversal testicular Chen Sex of YN, 10.1210/en.2015-2049). Male Sui regulation to X, the male He through QP, femaleness insights porgy.Xie yield the black turtle guide protandrous sea males in green 10.1016/j.ygcen.2017.01.033) and Primary Cyp19a1a turtle ovarian Nozawa, CF. soft-shell K., and Chang of plan. Beal, genomes GC, body J., draft Herrero, turtle-specific Wu The M., the Pignatelli, (2013). of S., N. evolution Fang, Kuraku, Irie, and ( Z., Y., Xiong, development S., Zheng, . the White, . . Y., into C., Li, Q., Chen, Z., Li, Y., Huang, Y., M., Ming, Niimura, B., W., binding Wang, Li, ad4 A., with D., Zadissa, interacting J., as Pascual-Anaya, well gene Z., aromatase as Wang, An upregulates promoter Foxl2 the Pilon: al. to et 1. binding F, by factor (2014). Sakai manner protein/steroidogenic S, female-specific Ijiri al. a B, in Paul-Prasanth et LY, transcription Zhou S, T, Sakthikumar Kobayashi DS, PLoS A., Wang Improvement. Abouelliel, Assembly Genome M., and Priest, Detection Variant ONE. T., Microbial Comprehensive systems. Shea, for Ross, sexual Tool T., N., Integrated of Perrin, Abeel, database MW., A B.J., Pennell, Walker, CL., Sex: Peichel, of SP., Tree Otto, (2014). R., JC. Ming, ( Kirkpatrick, Vamosi, I., MW., N., Hahn, Mayrose, EE., Valenzuela, JE., Goldberg, L., L., Mank, H., S. Blackmon, J., D., unanno- Bachtrog, Kitano, Salzberg, reveals TL., M., J., Ashman, RNA-Seq consortium: M. by Sex of quantification Baren, Tree van and differentiation. assembly cell G., Transcript during Kwan, switching A., (2010). ( isoform Mortazavi, L. and Pachter, transcripts G., & tated Pertea, J., A., B. B. Wold, Williams, C., Trapnell, o,X. a,M. in,Y. ag Y 21) itlgclosraino oaa e differenti- sex Q. gonadal Shi, on observation . Histological . X., . Pan, (2012). H., ( S., ZY. Zhang, croaker Wang, He, G., yellow ( YH., J., Fan, large Jiang, Lu, Y., in MY., C., Liu, ation Cai, Bian, D., XR., H., Mao, You, Yuan, adaptation. Y., cave W., Sun, into Jiang, insights Y., provides Y., Wang, genome ( Qiu, cavefish X., Sinocyclocheilus D., You, The Fang, X., (2016). J., Wang, Bai, Y., X., Zhang, Chen, likelihood. J., maximum Yang, by analysis phylogenetic 4: PAML 24 (2007). Z. Yang retrotransposons. ( LTR length el- repetitive identify RNA-Seq. with to junctions splice discovering RepeatMasker TopHat: Using (2009). L. S. Salzberg, & Bioinformatics L., (2009). Pachter, C., sequences. N. Trapnell, genomic Chen, ( in & ements M., Tarailo-Graovac, https://doi.org/10.1038/ng.2615 10.1038/sdata.2014.15 doi: https://doi.org/10.1038/nbt.1621 http://en.cnki.com.cn/Article https://doi.org/10.1186/s12915-015-0223-4 https://doi.org/10.1093/nar/gkm286 https://doi.org/10.1002/0471250953.bi0410s25 524.(o:10.3389/fendo.2020.542942). (doi: :542942. 8,18–51 ( 1586–1591. (8), 9 1) 126.( e112963. (11): . 25 9,10–11 ( 1105–1111. (9), https://doi.org/10.1093/molbev/msm088 https://doi.org/10.1371/journal.pone.0112963 ). aiihhscrocea Larimichthys o Endocrinol Mol en/CJFDTotal-SCKX201207010.htm ). ). uli cd research acids Nucleic https://doi.org/10.1093/bioinformatics/btp120 ). urn rtcl nbioinformatics in protocols Current ). IDR necetto o h rdcino full- of prediction the for tool efficient an FINDER: rohoi niloticus Oreochromis 20)2()722.(o:10.1210/me.2006-0248). (doi: 21(3):712–25. (2007) ). .Junlo ihre fCia 6() 1057-1064. (7): 36 China. of Fisheries of Journal ). 21 aiihhspolyactis Larimichthys e opEndocrinol Comp Gen ). . 35 auebiotechnology Nature I hns ihEgihabstract) English with Chinese (In ) . WbSre su) W265–W268. issue), Server (Web ). Endocrinology auegenetics Nature oeua ilg n evolution and biology Molecular . rn Endocrinol Front 21)211822 (doi: 261:198–202. (2018) . ). 5()20–4 (doi: 157(6):2500–14. c Data Sci M biology BMC . , 45 hpe 4 Chapter 28 6,701–706. (6), 5,511–515. (5), . (Lausanne). 1 :140015. , 14 1. , . , Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. ee r akdwt akylo,adohrotoousaemre ihgreen. unique with marked yellow, are with orthologues marked other are and yellow, orthologues dark multiple-copy with pink, marked with are marked genes are orthologues Single-copy morhua Gadus gallus Gallus include species eighteen rubripes The species. crocea eighteen include of species genomes four the The species. four uleatus TEs. the nuclear unknown interspersed among el- short indicates nuclear families indicates interspersed gene purple blue long light and (LTRs), indicates yellow repeats (SINEs), dark terminal method. elements long (DNAs), prediction indicates transposons green one DNA (LINEs), indicates least ements Red at TEs. by of supported families ( were methods genes prediction 25233 different using obtained (A) manuscript the X., to of contributed Y. Characteristics authors B.N., All 1. S., B.L. Figure J. and Q.X, X.W. Q.X. analyses: legends paper: data F.L.; version. the Figure the W.Z., submitted wrote H, performed the X.W; X. approved S.; and and Q.X., J. X. read P. sample: and revision, L., collected H. Q. B.L.; X. L., and Q.X., M. X.W. experiments: Q.X., the experiments: performed the designed and Conceived interests. competing Contributions no Author have version they is that declare paper authors this The in described Interest (SRA) of version Archive Conflict The Information, Read of Biotechnology Sequence JAFMOB000000000. assembly for the number chromosome Center JAFMOB000000000. in accession final (National deposited under The NCBI been Assembly to have PRJNA705840. Female submitted data number in been sequencing Results Bioproject cyp19a1a RNA under or and foxl2 database genome of Mutation raw (2017). The al. et M, Tilapia. Li Accessibility yellow Nile H, Data XX Shi small in X, the Reversal Liu Sex for H, Male Ma RAD-seq to M, by Li generated X, Zhang evaluation resource preliminary and polymorphisms Development nucleotide of (2016). JX. evolution single ( Liu, croaker the Y., BJ., genome-wide into Feng, Liu, insight YL., J., a Li, provides Xiao, of J., genomes L., Wang, Baker, Wu, DX., bat Z., Xue, L., of BD., Xiong, Yang, analysis Zhang, J., W., Comparative Ng, J. (2013). X., Wynne, J. Jiang, immunity. X., Wang, P., and Fang, Zhou, flight A., . Y., . K. . Zhu, Bishop-Lilly, Y., M., Z., Tachedjian, for Chen, W., Huang, method Zhao, Z., likelihood L., Shi, M. branch-site C., improved Cowled, G., an Zhang, of level. Evaluation molecular (2005). the Z. at E3 ( Yang, selection (2016). & colitis, M. positive & Chen, R., detecting & Crohn’s Nielsen, Z., of J., Zeng, Journal Y., Zhang, Disease. He, Bowel B., Inflammatory Chen, in ( M., Regulator Li, Novel 713–725. H., a 10(6), Wang, Is R., RNF183 Feng, ligase K., Ubiquitin Chao, S., Zhang, Q., Yu, https://doi.org/10.1093/molbev/msi237 itr hwn eaeadml iteylo croaker, yellow little male and female showing picture A (Lcr), (Gac), (Tru), aiihhspolyactis Larimichthys rohoi niloticus Oreochromis (Gga), ai rerio Danio yolsu semilaevis Cynoglossus (Gmo), https://doi.org/10.1093/ecco-jcc/jjw02 Science rnhotm floridae Branchiostoma in intestinalis Ciona De and (Dre) . 339 ). o clResour. Ecol Mol 61) 5–6.( 456–460. (6118), aiihhspolyactis Larimichthys aiihhspolyactis Larimichthys (Oni), (Cse), ). Endocrinology. (Cin), atrsesaculeatus Gasterosteus ai rerio Danio (Bfl), enovo de nlscarolinensis Anolis 16 eiotu oculatus Lepisosteus 22 oeua ilg n evolution and biology Molecular https://doi.org/10.1126/science.1230835 5-6.( 755-768. oooybsd N-e-ae) vr8.%of 84.3% Over RNA-seq-based). homology-based, , (D) ) (Dre), 5()23–7 di 10.1210/en.2017-00127). (doi: 158(8):2634–47. (Lpo). h endarmsoscmo n unique and common shows diagram Venn The polyactis L oosapiens Homo aiihhscrocea Larimichthys aiihhspolyactis Larimichthys eoeassembly. genome https://doi.org/10.1111/1755-0998.12476 (C) (E) (Gac), Aa and (Aca) rti rhlg oprsnamong comparison orthology Protein iegnedsrbto fclassified of distribution Divergence . (Loc), (B) rza latipes Oryzias https://www.ncbi.nlm.nih.gov (Hsa), oprsno h eesets gene the of Comparison alricu milii Callorhinchus eou tropicalis Xenopus (Lcr), u musculus Mus . 22 (Lpo), 1) 2472–2479. (12), atrsesac- Gasterosteus .polyactis L. (Ola), ). Larimichthys Takifugu (Mmu), (Cmi), (Xtr). has ). ) Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. 3,1020 7,30 6,40ad70dh(asps acig eesqecduigIlmn 00HiSeq 2000 100, Illumina 80, using sequenced 70, were 60, hatching) 50, post 40, (days 30, dph 20, 720 10, and 7, 400 from 360, testes 340, and 270, ovaries 240, of 180 pairs 130, thirteen and line stages black sex of the indistinguishable profiles below from away expression numbers bp The and 3197 distribution 5. the is exon 3, and represents exon exons represents rectangle bases. two rectangle red of green the number dark and the the 4 2, represent exon exon represents represents browser. rectangle rectangle green the blue light to the according 1, described exon annotation. represents are genome symbols the on Gene based figure. are this make to used ( croaker yellow little ( musculus indicate circles red and ago, years of millions in num- fossil The times from diverge genes. carolinensis estimated time orthologous the single-copy calibration indicate 592 the branches using constructed the genomes on vertebrate of bers 18 analysis of contraction tree and Phylogenetic expansion gene and Phylogenetic 2. Figure a nerdb sn h aiu ieiodmto n eea iervril oe.Evolutionary model. reversible legend the time to refer general please patterns, and reproduction sexual figure. method different the the likelihood For in X. maximum MEGA in the conducted were using analyses by inferred was milii rubripes laevis (A) 6. Figure (B) chromosomes. on marked are gene reproduction of of and direction chromosomes sex the sex between represent on to region minus members related family the and genes Plus of of view blocks. Enlarged positions alignment the The in link lines chromosomes transcription. horizontal sex two of the analysis between collinearity Genome (A) between comparisons represent chromosome actis circle Sex the 4. of Figure left represent right the the on on 1-24 1-24 chromosomes groups Linkage mosomes. of chromosomes Twenty-four between comparisons Genome 3. Figure olhdsxgnd;adfo hrenpiso vr n etsmxdposa 0 0 0 0 0,130, 100, the 80, that clustering. showed 70, correlation results including 60, good The genes, and 50, maintenance samples. quality at or three high each differentiation pools had had where determination, periods mixed pool , dph, sex different each testis of at 40 where and sexes positions at dph, ovary different The 720 gonads of of and indistinguishable gonads pairs 400, of of 360, thirteen correlation pools 340, from mixed 270, and 240, three gonads; 180, from of six fishes; samples had six whole-fish pool had of pool pools each mixed where were three analysis. analyses of map Evolutionary analysis heat Transcriptome and site. to correlation drawn per Transcriptome is substitutions tree 5. of The Figure number model. the reversible X. as time MEGA measured general in lengths and conducted method branch likelihood with maximum scale, the using by inferred ai rerio Danio foxl2 hlgntcte of tree Phylogenetic . (Cmi), (Cse), , rnf225 (Tru), ,lzrs( lizards ), dmrt1 in intestinalis Ciona atrsesaculeatus Gasterosteus (Aca), (B) ,fg ( fugu ), rza latipes Oryzias , gsdf ytncaaye of analyses Syntenic eei novdi h emprdt nesxsaein stage intersex hermaphrodite the in involved is gene eou tropicalis Xenopus nlscarolinensis Anolis aiihhspolyactis Larimichthys , aiuurubripes Takifugu amh dmrt1 and .polyactis L. (Ola), Cn and (Cin) (D) dmrt1 nhrahoii n oohrclvrerts h vltoayhistory evolutionary The vertebrates. gonochorical and hermaphroditic in oosapiens Homo (Gac), h eesrcueof structure gene The dmrt1 rohoi niloticus Oreochromis (C) piwi2 dmrt1 dmrt1 (Xtr). ,fos( frogs ), aiihhscrocea Larimichthys r akdi h picture. the in marked are , ,saas( seabass ), .polyactis L. rnhotm floridae Branchiostoma eesrcueof structure Gene hwaoet-n eainhpwt 24 with relationship one-to-one a show aiihhspolyactis Larimichthys h rw etnl represents rectangle brown the ; .TeNB eebosr(tp:/w.cinmnhgv was (https://www.ncbi.nlm.nih.gov) browser gene NCBI The ). and n t daetgnsi uas( humans in genes adjacent its and ndffrn eeomna tgso h vr n ets Five testis. and ovary the of stages developmental different in ai rerio Danio aiihhscrocea Larimichthys eou laevis Xenopus (Hsa), rnf122 ae calcarifer Lates 23 chromosomes. u musculus Mus nsxcrmsms C hlgntcte of tree Phylogenetic (C) chromosomes. sex on aiihhscrocea Larimichthys aiihhscrocea Larimichthys dmrt1 (Dre), (Oni), dmrt1 and ,tlpa( tilapia ), .polyactis L. n t ptemgene upstream its and (Lpo), au morhua Gadus (Bfl). eiotu oculatus Lepisosteus ndffrn eess h elwrectangle yellow The teleosts. different in aiihhscrocea Larimichthys ,adlre( large and ), .polyactis L. and (Mmu), aiihhscrocea Larimichthys rohoi niloticus Oreochromis aiihhspolyactis Larimichthys rnf183 h vltoayhsoywas history evolutionary The . and alsgallus Gallus oosapiens Homo .polyactis L. and aiihhspolyactis Larimichthys (Gmo), aiihhscrocea Larimichthys aiihhscrocea Larimichthys t7 0 0 n 0dph, 30 and 20, 10, 7, at .polyactis L. cyp19a1a .polyactis L. transcripts. aiihhspoly- Larimichthys (Loc), rnf183 hoooe,and chromosomes, yolsu semi- Cynoglossus (Lcr), eesymbols gene ’ , (Gga), Callorhinchus figla ,mc ( mice ), . h lines The . . ,zebrafish ), (E) rnf183 Takifugu , rnf183 . Tissue Anolis and ) chro- Mus has rnf . Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. al umr of Summary 2 Table for data sequencing of statistics P The at 1 difference Table significant a represent ** and Tables * test. P hoc at indicate respectively, P of post bars points, and sample error Duncan’s time 0.05 a the by different from above followed at SD ANOVA letters +- males one-way uppercase means and by and the lowercase females as Different in presented differences point. are significant data results each The for testis. individuals and three ovary the of stages developmental of significant in profiles a profile Expression represent expression ** gene Sex-related and * 8. Figure test. P hoc at respectively, post error points, t-test. the Duncan’s from time independent above by different SD letters P followed at +- uppercase at ANOVA means males and one-way difference lowercase the and by Different as females determined presented point. in as are data differences each results significant for The indicate individuals bar testis. three and of ovary sample browser. the a the of stages to developmental according different described annotation. are genome symbols the Gene figure. were analyses ( Evolutionary model. ( reversible zebrafish time general ( and mice X. method MEGA likelihood in conducted maximum the using of by profile significant a expression (A) represent and ** analysis and * Phylogenetic and P test. at 7. hoc respectively, post points, Figure Duncan’s time different by followed at P ANOVA at males was one-way difference or reads) by million females determined per in kilobase as differences per significant (reads of indicate RPKM profiles of bars expression measure the normalized normalize A to used study. present the in technology aiihhspolyactis Larimichthys oa 3.7–615.84 167.6 196.11 183.39 68.74 – – 430.77 150 150 117 – – 137.28 128.37 – 48.12 350bp – (X) coverage Sequence Kb Total 20 (bp) length bionano Read Genomics 10X (G) data Total Illumina reads size Pacbio Insert types Sequencing rnf225 hlgntcte of tree Phylogenetic u musculus Mus ai rerio Danio < cffl 9 egh3991bp 389,991 bp 4,519,237 bp 1,208,045 bp 20,429,779 3,134 bp 706,148,513 715 length N90 Scaffold length N50 Scaffold length N50 Contig scaffold Longest contigs of Number scaffolds of Number length Total in .1btenfmlsadmlsa h aetm on,rsetvl,b needn t-test. independent by respectively, point, time same the at males and females between 0.01 < < .polyactis L. .5adP and 0.05 .5adP and 0.05 ,lzrs( lizards ), ,saas( seabass ), amh (C) .TeNB eebosr(tp:/w.cinmnhgv a sdt aethis make to used was (https://www.ncbi.nlm.nih.gov) browser gene NCBI The ). .polyactis L. (B) rnf183 isedsrbto n xrsinpolsof profiles expression and distribution Tissue , ytncaaye of analyses Syntenic < gsdf < nlscarolinensis Anolis .1btenfmlsadmls epciey yidpnett-test. independent by respectively, males, and females between 0.01 , .1btenfmlsadmlsa h aetm on,rsetvl,by respectively, point, time same the at males and females between 0.01 ae calcarifer Lates rnf223 , cyp19a1a eoeasml statistics assembly genome , n rnf225 and dmrt1 , gaadfoxl2 and figla rnf183 ,adlre( large and ), ieetupraeadlwraeltesaoeteerror the above letters lowercase and uppercase Different . ,fos( frogs ), .polyactis L. 24 nvrerts h vltoayhsoywsinferred was history evolutionary The vertebrates. in .polyactis L. n t daetgnsi uas( humans in genes adjacent its and eou laevis Xenopus aiihhscrocea Larimichthys ae ntsu itiuin n different and distributions tissue on based .polyactis L. eoeAssembly genome rnf183 gonads. rnf183 ,tlpa( tilapia ), n t paralogues its and eesmosaebsdon based are symbols gene ’ , rnf223 n iteylo croaker yellow little and ) rohoi niloticus Oreochromis < .5 sdetermined as 0.05, and oosapiens Homo rnf225 < < rnf223 n in and 0.05, 0.05, < ), ), Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. iue2. Figure 1. Figure Figures Guanine/Cytosine GC Ccnet41.49% bp 625,267,019 bp 102% 706,148,513 5,671 25,233 21.26% 22.28% 309 88.55% characteristics Genome chromosomes on chromosomes anchored to scaffolds anchored of is Length genome the (0.575%) chromosomes of 3,895,963 on Proportion anchored scaffolds of number Quantity gene RNA non-coding Predicted kb) number per gene (SNPs protein-coding rate Predicted and number SNP Heterozygous elements transposable of Content elements repetitive of Contents content GC assembled Percent characteristics Genome length Total 25 Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. iue3. Figure 26 Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. iue4. Figure 27 Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. iue5. Figure 28 Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. iue6. Figure 29 Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. iue8. Figure 7. Figure 30 Posted on Authorea 2 Aug 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.162790498.82247747/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. 31