<<

Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. ininLv Jianjian sex and adaptation, determination salinity evolution, its provides into trituberculatus insights of genome chromosome-level A un-inLiu Guang-Jian aoaoyfrMrn cec n ehooy o ehiRa,AsawiTw,Jm,Qingdao, Jimo, Town, Aoshanwei Road, Wenhai 1 No. Technology, and Science Marine for Laboratory c Qingdao,China. China 266071 315211, Sciences, Fishery of Academy Chinese b. Institute, Research Fisheries Sea Yellow a Liu Su Zhencheng *, Lv Jianjian determination valuable was sex a . Dmrt1 and provides swimming and of tion, only breeding of not trituberculatus, molecular P. genome genome for chromosome-level of useful A swimming genome also the is the but Decoding in studies, time addition, evolutionary region. In first and this biological the observed. in further also for gene for were determination located resource sex was genomic the key were chromosome of a members the Y growth as to its the rapid identified basis related and and of genetic closely immunity genomes, the region was strong into crab specific insights , it trituberculatus, in providing the P. suggested and (BSA) expanded in Analysis stage, crabs family Segregant adaptation larval Bulked of gene salinity flea-like and analyses only transcriptome of the with genomic the during Combined 53 comparative and be genes crabs. development into on to of protein-coding crab, embryonic evolution assembled Based found 19,981 swimming early was sequences with the in respectively, (49.43%). family scaffold expressed of Mb, protein repeats the mainly sequence 15.6 finger sequence in of and genome zinc simple species 79.99% kb the C2H2 of farmed 108.7 with report the proportion were important we Gb, values high an Here, ˜1.2 N50 a also covering scaffold and value. is and scale, economic contig trituberculatus chromosome The high P. the and chromosomes. ecological at industry. rate major assembled fisheries growth of was the rapid is which to its crab, to important swimming the due being as China as known well commonly as Brachyura), importance, : (Crustacea: trituberculatus Portunus Abstract 2020 3, August 3 2 1 ooeeBonomtc Institute Institute Bioinformatics Research Novogene Fisheries Sea Yellow University Science Ningbo Fishery of Academy Chinese . ucinLbrtr o aieFseisSineadFo rdcinPoess iga National Qingdao Processes, Production Food and Science Fisheries Marine for Laboratory Function e aoaoyo utial eeomn fMrn ihre,Mnsr fArclue P.R.China, Agriculture, of Ministry Fisheries, Marine of Development Sustainable of Laboratory Key e aoaoyo ple aieBoehooy iityo dcto,Nnb nvriy Ningbo University, Ningbo Education, of Ministry Biotechnology, Marine Applied of Laboratory Key a,c hni Wang Chunlin , 1 a,c, ogu Li Ronghua , ,RnhaLi Ronghua *, 3 hni Wang Chunlin , d, ,BounGao Baoquan *, b inLi Jian , 2 hnhn Su Zhencheng , otnstrituberculatus Portunus , b a,c 2 igLiu Ping , a,c igi Ti Xingbin , 3 aqa Gao Baoquan , 1 n inLi Jian and , 1 rvdsisgt noiseouin aiiyadapta- salinity evolution, its into insights provides a eigYan Deping , 1 1 igi Ti Xingbin , a unja Liu Guangjian , 1 eigYan Deping , d 1 Ping , , Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. v hn,Lu i 06 si&Ln 07.Sm coasas re osuyteslnt adaptation salinity the study potential to of hundreds tried salinity and in also involved 2011), scholars understood. mechanisms Liu, molecular & poorly Some complex Xu remain the H. 2007). However, tolerance Q. Lin, identified. 2013; were al., & genes et Tsai salt-tolerance-related (Lv 2016; analyses transcriptome Li, comparative & using Liu, Zhang, Lv, hsooia rcs fslnt dpain o rnpr-eae ee,icuigNa including genes, important decapod transport-related an the is Ion of transport adaptation ion adaptation. by salinity H salinity Faria, mediated of type Osmoregulation & of mechanism curiosity. (McNamara process the scientific adventurers physiological habitats, habitats aroused arboreal challenging long all and innumerous has virtually desert Crustacea the exploited species, including Given freshwater and travelers, and occupied estuarine terrestrial environment, 2012). have marine and the decapods to amphibious wellbeing The restricted and and species physiology include 2012). and abundance, earth, Zeng, distribution, on the & available influences (Romano that factor have abiotic of basis dramatic important genetic a an underlying as is interpreted 2012). their Salinity Faria, been and & has evolved (McNamara and crabs elucidated ‘carcinization’, how transforma- called been However, evolutionary a is not The by 2014). crab characterised pleon. (Scholtz, a are ventrally-folded change to crabs a morphological , and -like and carapace, a shrimps short morphotype of from “lobster” depressed, tion bodies or a elongate macruran with et the ancestral with (Haug body an Compared from compact Decapoda evolved within 2016). have al., discussion crabs et all and attention. (Haug terms, attention evolutionary research much In considerable drawn have attracted 2016). morphotypes al., have distinct and years, found aquaculture, forms. recent and are parasitic (Warner, most In or fisheries species the crustaceans planktonic in represent These the crabs and crabs Swimming living of of 1982). free groups group species. (Bowman, to commercially-important important largest many benthic families the includes from 47 group of ranging to this to one habitats, Furthermore, belonging marine belonging are crustaceans in they typical species largely most species, 6,000 worldwide, the of about of numbers comprising one of as 1977), terms known are In which crabs, Decapoda. called generally are Brachyuran swimming of breeding chromosome molecular Introduction for Y genomic useful valuable the also a is of provides but only region studies, not evolutionary genome specific of crabs. and crab the genome biological swimming further the addition, the for Decoding In in resource region. time this observed. in first also gene the determination were Analysis it Segregant for species suggested Bulked in located the and adaptation stage, was of salinity transcriptome larval members of growth with its flea-like basis rapid Combined and genetic the and genomes, crabs. the during into crab of insights in evolution and providing expanded the development (BSA) C2H2 family to embryonic of gene the related early N50 only proportion shrimps, closely the and high in scaffold was be crabs a expressed and to of and contig mainly found analyses genes was The were genomic protein-coding family ˜1.2 comparative 19,981 protein covering on chromosomes. finger with Based scale, 53 respectively, zinc chromosome (49.43%). into Mb, repeats the assembled 15.6 report sequence at we and sequences simple Here, assembled kb scaffold value. was 108.7 the economic which were high of values crab, and 79.99% rate swimming growth industry. with the rapid fisheries Gb, of its the to sequence to due genome China important the in being species as farmed well important an as importance, ecological major trituberculatus Portunus author, Abstract Wang), Corresponding * d hs uhr otiue equally contributed authors These ooeeBonomtc nttt,Biig101,China 100016, Beijing Institute, Bioinformatics Novogene + [email protected] APs,adNa and -ATPase, Cutca eaoa rcyr) omnykona h wmigca,i of is crab, swimming the as known commonly Brachyura), Decapoda: (Crustacea: + Pn i)and Liu) (Ping ,K [email protected] + 2Cl , - ornpreshv encoe n tde Gro ta. 2011; al., et (Garcon studied and cloned been have cotransporters [email protected] .trituberculatus P. 2 Ja Li), (Jian and , GaginLiu) (Guangjian .trituberculatus P. Dmrt1 [email protected] a dnie sakysex key a as identified was .trituberculatus P. + togimmunity strong , ,K + APs,V- -ATPase, (Chunlin salso is Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. irre o igemlcl eltm SR)Pci eoesqecn eecntutdacrigto according constructed were the sequencing from genome extracted PacBio (SMRT) were real-time k-mers) molecule 41 bp the was single depth (17 by for k-mer Seventeen-mers size the Libraries Finally, genome of calculated. the frequency). was size 17-mer estimate genome abnormal high each - the to of k-mers too and frequency 2010) total the or = al., and number low data k-mer et sequencing where too Li depth, (R. k-mer (with number/average method k-mer k-mers = distribution G frequency formula: following K-mer a used We bp. for for ˜350 bp assembly (M), 150 libraries of Genome muscle and Four size and platform, insert Novaseq (L), an prediction. Illumina with liver gene the generated (H), using by were sequenced heart followed reads were (B), platform, paired-end analysis blood (BSA) HiSeq Analysis (G), Illumina Segregant gill the Bulked (Tr), on thoracalis sequencing ganglia transcriptome as (Br), removed, brain were (E), rate) eight eyes error from 1% RNA a extracted (representing also with We 20 reads than Low-quality duplications. lower and scores platform. protocol quality sequences HiSeq4000 Illumina having tag-contaminated standard Illumina bases were the the of using using 10% constructed were than performed bp by more was 350 created Sequencing of was size 2008). insert that (Mardis, Bioin- an Novogene sibling with at libraries full stored (PE) Pair-ended and F12 tissue an muscle from from extracted originated was Institute. sequencing DNA formatics genomic genome Total for insemination. used artificial individual male The sequencing and collection Sample future for basis genetic Methods the and as Materials serve crab. can swimming immunity, genome the strong of This management adaptation, species. breeding salinity this of in basis determination of genetic investigations sex the and into growth, insights rapid of provided species. assembly important analyses genome economically genomic chromosome-scale this parative environmental the in growth, sizes, report rapid understood we individual to poorly Here, leading in remain mechanisms variability immunity genetic significant strong the However, and as others. resistance, such 2015), among stress al., outbreak, farming, et are disease crab (Liu and crabs breeding in stress, culture, artificial factor problems high-intensity pond of many which In years resistance, are 30 culture. Therefore, disease than body there artificial diseases. strong more large of After exhibit months and species. and of important shrimps 7-8 rate ecologically diseased outbreaks in growth on important prevent g rapid feed an effectively 400-500 they its As to can because for stocked. up shrimps and famous with reach propagated mixed is can artificially often crabs mass is swimming body and and the Its China, Japan, size. aquaculture, in China, crabs in of edible waters species coastal common crustacean the most in the distributed of widely one is and pebbles, complex or and sand Torrecilla, chromosomes crab, (Z. of crustaceans swimming numbers most The in large determined the be to cannot due analysis Gonz´alezorteg´on, however, karyotype Mart´ınezlage, Perina, Gonz´aleztiz´on, by 1959); & determination 1938, 2017). sex 1937, genomes, In Niiyama, 1995; 2008). (Ford, Noel, crus- factors environmental in and/or determination genetic aspect sex dentipes ( by intriguing of species controlled an mechanisms are some had The and crabs, always diverse 2002). has remarkably (Martins, which are biology process, taceans developmental developmental and biological evolutionary plastic in a is determination Sex eeblee ohv nX e eemnto ytmbsdo aytp nlss(ehr& (Lecher analysis karyotype on based system determination sex XY an have to believed were ) .trituberculatus P. rohi japonicus Eriocheir otnstrituberculatus Portunus .trituberculatus P. vlto n ilg,adwl eavlal eorefrcnevto and conservation for resource valuable a be will and biology, and evolution .trituberculatus P. , a siae ob . Gb. 1.2 be to estimated was eirpu sanguineus Hemigrapsus Cutca eaoa rcyr) naissaor with seafloors inhabits Brachyura), Decapoda: (Crustacea: 3 ise tdffrn eeomna stages developmental different at tissues .trituberculatus P. .trituberculatus P. , eirpu penicillatus Hemigrapsus eoeeouinadcom- and evolution Genome . r lorgre san as regarded also are 19 hsseisis species This . , nldn the including and Fg S1) (Fig. Plagusia Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. eas sdItrrSa vrin47 Jnse l,21)t banpoendmi noain nthe functional in Finally, annotations domain databases. protein 2016) obtain Consortium, Ontology to ( Gene 2014) BLASTP (The al., using (GO) et 2004) Ontology (Jones al., Gene 4.7) and et (version Interpro (Apweiler InterProScan SwissProt used Kawashima, and Sato, also (Kanehisa, We 2016), KEGG NCBI-NR, Tanabe, including & databases, Furumichi, gene public several the in EVidenceModeler in searching using genes mology predicted set 2012). the gene of al., & consensus 2008). annotations et Functional a Pachter, al., (Trapnell to (Trapnell, et 2.1.1) merged 2.0.10) (Haas then (version RNA-seq (version ncemodeler.sourceforge.net/) were cufflinks the Tophat (http://evide methods using addition, above (EVM) using In predicted the genome were using settings. Blanco, the predicted structures default Parra, Genes to gene with (G. while aligned all Geneid 2009), were 1997), 2004), Karlin, Salzberg, tissues al., & several et (Burge (Birney from Genescan SNAP data GlimmerHMM and for 2004), 2005), 2000), used Salzberg, Morgenstern, were Guigo, & & tools & five Pertea, (Stanke (version initio, 2.5.5) (Majoros, Genewise ab (version For 3.01) using Augustus (version alignments. proteins namely, spliced matching prediction, accurate the for ( structures to 2004) 1990) gene Durbin, Lipman, aligned & & then Clamp, Myers, were (Birney, Miller, 2.4.0) sequences Gish, genome (Altschul, TBLASTN Homologous using 1e-05). genome the search to purpuratus, Strongylocentrotus queries vannamei, species Penaeus homologous dariorum, of sequences including protein database, the NCBI Briefly, elegans the genome. prediction from the transcriptome-based downloaded in and genes initio, were ab protein-coding homology-based, predict used to 1999). we methods (Benson, genome, repeat-masked Finder the Repeats on Tandem Based using ascertained were genome the in the the in and TEs detect 15.02) 2005), library, Pevzner, (Repbase repeat & integrated Jones, library LTR discover an (Price, and repeat to in RepeatScout TEs 1.0.5), known used identify (Vision were a (http://www.repeatmasker.org) to RepeatModeler used methods from was Two derived 2004) was Chen, repeats. (N. which tandem 4.0.5) (version and RepeatMasker (TEs) TEs. elements transposable included the sequences in sequences repetitive The annotation and assessed prediction also of Genome settings. We Both default the 2007). 3.0). with Korf, (version performed were & BUSCO analyses Bradnam, the Parra, the (Genis (http://korflab.ucdavis.edu/datasets/cegma/) against the (CEGMA) searching of by completeness analysis the 2015) Zdobnov, ( & (BUSCO) Orthologs assembled Single-Copy the sal of completeness the assess assembly To genome. the chromosome-level (version of the chromonomer into completeness by scaffolds the constructed the Assessing then anchor was to map by 2020) errors genetic Bassham, re- remaining high-density & The the a Amores, correct and to programs. (Catchen, used 2014), all 1.07) was fragScaff al., reads for with SSPACE- short et data parameters Illumina-derived (Walker of linked-read Finally, default genomics pilon rounds software. 10x with 2014) using six 2014) al., super-scaffolds et using al., to (Adey connected et scaffolded further were (Kajitani were to ( plantanus scaffolds contigs PBjelly used sulting of of PacBio were iterations rounds Then, reads three seven PacBio 2014), and Pirovano, 2013). platform. & al., (Boetzer Sequel et LongRead PacBio (Chin size the and 5.0.1) on ligation, (smrtlink the sequenced adaptor blunt-end of were repair, contigs libraries was end assemble DNA the and genomic repair Then, molecular-weight damage by high selection. followed Briefly, fragments, company. large Biosciences into Pacific sheared the from protocols standard , rsote ia,Doohl eaoatr ahi ue,Ioe cplrs aataoatepi- Parasteatoda scapularis, Ixodes pulex, Daphnia melanogaster, Drosophila gigas, Crassostrea IDR(.X ag 07.RpaPoenak(ega useil,20)wsue to used was 2007) Quesneville, & (Bergman RepeatProteinMask 2007). Wang, & Xu (Z. FINDER .trituberculatus P. .trituberculatus P. .trituberculatus P. .trituberculatus P. eoeb oprn tt h Epoendtbs.Tne repeats Tandem database. protein TE the to it comparing by genome http://busco.ezlab.org/ eoeuigSATeoo n oihn roswt Quiver with errors polishing and SMARTdenovo, using genome eoeuigteCr uaytcGnsMpigApproach Mapping Genes Eukaryotic Core the using genome .trituberculatus P. eoewr dnie fe eoeasml.Repetitive assembly. genome after identified were genome .trituberculatus P. 4 oosapiens Homo Smo aehue onii,Kriventseva, Ioannidis, Waterhouse, (Simao, ) eoe epromdBnhakn Univer- Benchmarking performed we genome, and http://sourceforge.net/projects/pb-jelly/ rblu castaneum, Tribolium , eoewr hnanttduigho- using annotated then were genome erncu urticae Tetranychus enovo de eetlbay ul by built library, repeat E vle[]1e-05). [?] -value n eeue as used were and , Caenorhabditis E vle[?] -value ), Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. n –qv and 204-225 as such nodes, the PAML of age of the restrain program to mcmctree priors & Likeli- normal Suleski, the Stecher, as Maximum (Kumar, used Then TimeTree of be the the TMRCA from to for using selected () Mya were software. database tree points 2017) calibration 2006) phylogenetic Several Hedges, species. a (Stamatakis, 12 among constructed RAxML we with ( orthologues, model single-copy hood into specific addition, the classified In were performed. Using Genes then species). were each analysis 1.5”. in enrichment paralogous “-inflation gene KEGG and and of one comparing orthologous by (only parameter generate selected orthologues to were the 2003) single-copy genes with Roos, se- and cutoff organisms & protein paralogues, E-value Stoeckert, the an longest orthologues, Li, with all (L. the the species OrthoMCL performed among the encoded we used in relationships First, that we genes among Then, acids. locus similarities amino 1e-7. gene the 30 of identify than each to shorter at method sequences BLASTP protein models all-against-all encoding gene genes removed retained and only quence, we genes, fragmented 002113885.2) (GCF including the vannamei species, L. in other 11 families from quences gene identify results. To annotation gene consensus final the evolution as Genome used were database each in alignments best the of annotations AT lsfo h C eete e notecutrse,wihpromdioomlvlclustering isoform-level the performed using files which FASTA consensus step, high-quality full-length cluster generate and the to length used into –hq was Non-full fed was step parameters: then following polishing (CCS) 15000. Arrow were sequence the –maxLength CCS Finally, consensus (ICE). and the circular –maxDropFraction from 0.8, 50, A –minLength files –minPredictedAccuracy FASTA settings: 2, parameter software. Pacific following –minPasses by 5.1 the with 0.8, SMRTlink described files BAM using as subread from processed System, generated were Selection data Size Clontech BluePippin Sequence the the using (Iso-Seq) and 100-092-800-03). protocol Kit (PN Sequencing Synthesis Biosciences Isoform the cDNA to PCR according SMARTer prepared was library Iso-Seq The analysis discovery Iso-Seq false GO the PacBio overrepresented by lineage identify adjusted each to were in which test calculated genes, exact contracted were Fisher’s and p-values (FDR the expanded corresponding the rate used the The we on among Finally, pathways transitions lineage nodes. KEGG of each likelihood. child and probability along conditional to the families the parent calculate gene on from to in based other introduced size lambda changes was family all a the model with gene in examine model graphical in death 2 gene probabilistic to and orthologous A used birth of [?] stochastic further tree. contraction a was phylogenetic and using and model species species expansion 11 This the one the analyze of parameter. in each to and 200 used ancestors, Bie, was between (De 2006) families [?] (http://sourceforge.net/projects/cafehahnlab/) Hahn, numbers 4.0) & (version gene Demuth, tool Cristianini, with CAFE The families removed. families, were species gene extreme avoid analyses To contraction and expansion family Gene outgroup. of the TMRCA for Mya 256-429 http://abacus.gene.ucl.ac.uk/software/paml.html 000523025.1) trim < (GCA .elegans C. , 0.05). p30. 3p .niloticus O. , 003789085.1) .semilaevis C. (GCF quiver .vnae .virginalis P. - vannamei L. 000002985.6) .pulex D. , (GCF .trituberculatus P. min - P .rerio D. cuay09,–bin 0.99, accuracy . 001858045.2) trituberculatus .sinensis E. (GCA , 2-4 y o MC of TMRCA for Mya 421-447 ; and Yn,20)wsue oetmt h iegnetimes divergence the estimate to used was 2007) (Yang, ) 000187875.1) to .vectensis N. (GCA 5 .gigas C. , .sinensis E. eoe eue h uloieadpoense- protein and nucleotide the used we genome, h hlgntcte confirmed tree phylogenetic The . by 003336515.1) rmrfle –bin false, primer .rerio D. , (GCF (GCF , .virginalis P. 000297895.1) 000209225.1) .virginalis P. , (GCF .yessoensis M. 000002035.6) and , size .yessoensis M. , . b1 –qv 1, kb .vannamei, L. oecueputative exclude To (GCA - .vectensis N. .semilaevis C. , .gigas C. 002838885.1) trim n GO and (GCA p100, 5p and ; as - , Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. in h irr rprtoswr eune na luiaHsqpafr ih10b ardedreads paired-end bp 150 with platform genera- Hiseq cluster Illumina After an on instructions. sequenced manufacturer’s TruSeq were the the using preparations to System library according Generation the Cluster (Illumina), cBot tion, v3-cBot-HS a to Kit on sequences performed Cluster was assign PE samples to index-coded the added and of were Clustering S72h), codes and Index S48h recommendations. NEBNext sample. S24h, each manufacturer’s the S12h, following using (S0h, USA), generated (NEB, stress were salt libraries low 3 Sequencing of of amount h total 0–72 A after of gills h 0–24 the after PoM), haemocytes and InM, (PrM, trituberculatus P. analysis RNA-seq the content Significant using respectively. implemented software, of then 2005) threshold Wei, were & FDR Olyarchuk, analyses p-value an Cai, enrichment Hochberg adjusted with (Mao, KEGG and KOBAS an and and Benjamini on with package GO R the Genes based Goseq using expressed. model rate. differentially adjusted analysis a statistical discovery as were expression using false provides p-values assigned the the data DESeq Differential resulting control and expression The to (1.18.0). sequences gene results. approach package distribution. transcript digital R binomial mapping the from DESeq negative expression to the a the differential back from using determine mapped obtained to performed were assessments was was data conditions/groups transcript clean two and each estimated of 2011), were for sample Dewey, each count & for levels read Li expression expression gene (B. differential The levels, RSEM analyses. expression enrichment by gene KEGG are and of which GO quantification and microsatellites, the analyses, compound included as analyses enrichment in well nucleotides. Differential as SSR of microsatellites, number transcripts lncRNAs. perfect specific remove of a of to set by location used interrupted candidate the were our determines above comprised and poten- described using the transcripts tifies coding identified tools assess remaining The were three to The transcriptome the annotations. classifier the potential. after and best coding transcripts sequences the lacked all genomic construct that for high-quality to lack predicted approach was that K-mer family tial species optimized protein for an frames. known potential three uses the coding all of classifier transcripts. in protein any SVM non-coding translated eukaryotic of PLEK and transcripts NCBI occurrence The coding in the the the database identify identify in Pfam to to sequences the parameters 1e-10 in the default to domains searched with set and was used e-value was ORFs, the Pfam-scan the searches, of such mainly coding For quality CPC parameters the The database. and predict default annotations. to extent with known tools of performed the 2014) independent were sequences, assessed Zhou, triplets non-coding & nucleotide and Zhang, adjoining protein-coding Li, distinguish profiles (A. to Coding-Non-Coding- CNCI PLEK used and transcripts. 2015), we et of the al., analyses, (Kong potential et using LncRNA (CPC) (Finn performed In Pfam-scan Calculator was 2007), Potential 2015). analysis al., Coding al., TF non- (https://github.com/www-bioinfo-org/CNCI), et (long analyses. (CNCI) Zhang LncRNA Repeat) Index (H.-M. Sequence analyses, database (Simple factors) 2.0 SSR (transcription animalTFDB final and TF obtain analyses, prediction, and RNA) to CDS analyses, coding included 2002) enrichment analyses Godzik, differential structural & analyses, The structural Jaroszewski, ANGEL included Li, of mainly (W. implementation annotation. from analysis read 0.99) sequences function following long coding -aS a The protein 0.00 is the which cDNAs. determine -aL with pipeline, to 0 data ANGEL removed used was -G was The RNAseq (https://github.com/PacificBiosciences/ANGEL), reads 6 analyses. Illumina consensus subsequent corrected -T the for the using 0.95 transcripts in corrected redundancy (-c Any were CD-HIT 2014). reads by Rivals, consensus & (Salmela the software in LoRDEC errors nucleotide Additional ape rmdffrn eeomna tgs(,Z-4 ,adJ,dffrn otn stages molting different J), and M, Z1-Z4, (F, stages developmental different from samples µ fRAprsml a sda nu aeilfrteRAsml preparations. sample RNA the for material input as used was sample per RNA of g < .alginolytica V. .5wsslce steresults. the as selected was 0.05 MISA neto TBadT4)wr sdfrRAsqanalysis. RNA-seq for used were T24B) and (T0B infection ht:/gcikaesee.ems/iahm) IAiden- MISA (http://pgrc.ipkgatersleben.de/misa/misa.html). 6 ® lr N irr rpKtfrIllumina(r) for Kit Prep Library RNA Ultra < .5fudb Ee were DESeq by found 0.05 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. ucini AKsfwr vrin38 MKnae l,21) Ns(igencetd polymorphism) nucleotide (single Genotyper SNPs Unified the 2010). using al., samples et all (McKenna highest using for the 3.8) files with performed (version BAM pair was software the to GATK calling only with converted Variant in coordinates, were genome external function SAMtools retained. files identical the reference was had alignment using removed quality pairs the and were read mapping duplications -R” against multiple PCR –M If potential reads addition, 32 “rmdup”. In clean command, -k –t”. 4 align “–bS with -t to software “mem SAMtools used that parameters: was reads PCR following (BWA) paired-end by the two Aligner generated from duplicates Burrows-Wheeler of 2 removal PCR read The and putative b. 1 of read removal (N); (i.e., identical). nucleotides d. process completely unidentified construction and were 10% QC library mismatches; the The 10% [?] during (QC) bias. amplification with [?] control artificial allowing reads quality without adapter, of of and the removal reliable, series were a a. reads with through that follows: reads processed ensure as initially to were were scripts procedures C format FASTQ in-house using in procedures reads) were reads (raw paired-end bp. data bp ˜350 150 Raw was of and constructed libraries size platform, libraries of insert HiSeq4000 Four Illumina distribution an PCR. the size with real-time using amplification. end-polished, the generated using sequenced PCR then and were quantified above were additional system and Fragments described Briefly, XP and Bioanalyzer as AMPure Agilent2100 sequencing sonication. an the Illumina sample. using using using for each bp purified analyzed adapter 350 were to full-length products of sequences the the PCR size Finally, to assign to a according ligated to to USA), and fragmented added libraries (Illumina A-tailed, was were Kit Sequencing sample preparation codes respectively. DNA Sample trait, Index the HT disease DNA and Nano recommendations. growth Truseq manufacturer’s to the using according generated preparations were sample DNA 1.5 the of for amount and total individuals male A susceptible large respectively. 20 groups, 20 For by ToG and traits, and (68.24+-10.5g) identified analysis. resistance SuG individuals BSA were construct disease male by individuals For small found tolerant 20 were constructed. of 20 markers composed were and groups (142.2+-16.5g) genes BG individuals resistance-related and disease SG traits, and growth growth- study, this In method analysis same BSA the FDR. following analysis. the Iso-Seq processed control PacBio were genes for to expressed used approach differentially that p-value Hochberg’s of as adjusted and analyses an enrichment Benjamini KEGG with the and DESeq (1.18.0). using package by R adjusted identified DESeq were Genes the used using p-values commonly performed most resulting was the conditions/groups The 2010). currently two considers of al., is simultaneously analysis and et statistic expression counts, (Trapnell on Differential This read levels based the gene. expression calculated for same gene fragments was length the estimating of gene gene to for number and each was mapped method (expected depth counts of 2015) sequencing FPKM read sequenced) Huber, of the the pairs effect and & reference Then, base the gene Pyl, the gene. millions the (Anders, of each of per v0.6.1 length to index sequence the HTSeq mapped the transcript reads of build genome. of All kilobase to the number per used to the calculated. was count reads were to 2015) paired-end annotation. data used and clean Salzberg, assembly clean align & genome the following and Langmead, used genome of (Kim, were content files software low annotation (v2.0.4) GC model with Hisat2 and gene or and Q30, genome data. ploy-N, reference Q20, clean containing The clean the the adapters, step, on time, this containing based same During were reads the analyses scripts. removing downstream Perl At by in-house obtained using reads. were processed quality were reads) format (clean FASTQ in data reads) (raw data Raw generated. > 0 fbsswt he quality phred a with bases of 50% µ fDAprsml a olce rm2 niiul obidDApooling DNA build to individuals 20 from collected was sample per DNA of g .parahaemolyticus V. < ;c eoa fraswith reads of removal c. 5; 7 < .5wr sinda ieetal xrse.GO expressed. differentially as assigned were 0.05 neto xeiet,wihwr sdto used were which experiments, infection > 0ncetdsaindto aligned nucleotides 10 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. hn) C a efre ihtefloigprmtr:9 eCfr3 ,floe y4 ylso 95 of cycles 40 by followed s, 30 s. for 34 degC Discussion for 95 and degC was Results SYBR 60 parameters: qPCR and following and s China). the System 15 Dalian, with for PCR (Takara, performed degC kit Real-Time was reagent Fast PCR RT International) 7500 China). PrimeScript Biosoft a the (Premier using tool (qPCR) using 5 performed PCR synthesized Premier real-time was Primer Quantitative cDNA the USA). strand using CA, designed Carlsbad, were (Invitrogen, primers Trizol using individually extracted ”QD Expression ”QD –filter –filterExpression (settings: (settings: parameter GATK Filtration in parameter Filtration Variant > the using filtered were uasae atuainsae g-apissae g-oasae oasae eaoa tg,adjuvenile and blas- stage, stage, megalopal multicellular stage, stage, after zoea hepatopancreas egg stage, and (fertilized egg-zoea haemocytes, stages stage, stage), developmental egg-nauplius crab different stage, of gastrulation stage, materials analyzed were tula the expressions gene in sex-determination, qPCR and immunity, by development, with specifically. associated on genes CQ the focused For a were -R. Y-linked, zero –M to as equal 32 sequence analysis values a -k qPCR CQ al., classify 4 with et To screening -t Li the (H. calculated. mem of –t) then with results –bS were The genome (settings: sequences out. reference software those Durbin, split SAMtools of & of using the values Li value files used against (H. CQ BWA BAM The reads then the to female clean converted was Then, fragments 2009). and then into the removed. method male were divided were align were 2013) files from bp to scaffolds Alignment 250 alignments al., the used than of First, et was shorter number Y-linked. fragments (Hall 2009) the is and similar sequence (CQ) uses sequences, performed a masked quotient It whether were by chromosome determine procedures genes. to The (QC) chromosome data control sequence Y quality discover analysis. data, systematically BSA raw to the for using from sequenced those were generated. reads bp to were reliable ˜350 reads of more paired-end size obtain bp insert pooled To To 150 an were with groups and two China. Libraries platform in Weifang, sexuality. population HiSeq4000 individuals on Company, Illumina from breeding based AquaFarming DNA the bulks core data, Chang-Yi DNA sequencing the two the the generate from of at to selected accuracy reared and randomly efficiency were were the which improve females NO.1”, healthy “Huangxuan forty of the and between males index -fs2 healthy SNP/InDel Forty 0.2 in -fs1 analysis difference settings: quotient) The parameter (chromosome following size settings. CQ the window default with a in the scripts. index used indexes as Perl SNP/InDel we SNP/InDel Kb in-house delta general, all the 10 method the In in of of window as window. 0.8 average size calculated sliding overall The was step the We The a pools for genome. sites. the two 0.3. and index whole base calcalate SNP/InDel than Mb the the the Then 1 less of of as of index pools index taken genotype. SNP/InDel both SNP/InDel was of the sample window in the is genotype each this present index which the for to SNP/InDel number, used used number total the We was in which reads index. reads sites statistic SNP/InDel different InDels those to of the and removed number and calculate SNPs the to reference homozygous of used for the the ratio files was information for as vcf pool depth files the sample sample read GFF3 from one The one the extracted were in on func- groups. samples above based to comparison two described InDels between three tool InDels and in software and SNPs SNP/InDel efficient SNPs annotate for an homozygous to The Variation), used genome. (Annotate was reference variants, 2010) genetic Hakonarson, annotate & tionally Li, Wang, (K. ANNOVAR Coeff Inbreeding 60.0 || < MQ .,with 0.3, < 00” -G ”, 40.0 < 08”). -0.8 > 0ainet rmml aa and data, male from alignments 30 le ”GQ filter < 0,–lse idwie4 .IDl eefitrduigteVariant the using filtered were InDels ). 4, WindowSize –cluster 20”, .parahaemolyticus V. 8 < 4.0 < || 0ainet rmfml aawr filtered were data female from alignments 30 FS > ® rmxE a i Tkr,Dalian, (Takara, Kit Taq Ex Premix 200.0 neto 07 ) oa N was RNA Total h). (0–72 infection || edPosRankSum Read TbeS1) (Table < < 4.0 -20.0 First . || FS || Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. h eoe,floe yln emnlrpas(T,1.7)adDAtasoos(10.95%) transposons of DNA 27.29% and (LINE, of 17.07%) elements (LTR, that interspersed repeats than long S3) terminal were lower long Fig TEs much by abundant followed was most genome), which the the assembly, sequences, and repetitive the the KEGG, of Among SwissProt, 49.46% proteins), for non-redundant sinensis accounted (RefSeq japonica content NR the repeat of The one least databases at distribu- GO on the based with consistent annotated were other were which in respectively, features bp, gene 1,231.18 of and tions 10,484.05 were length (CDS) sequence The annotation and prediction the Genome in identified were genes conserved (85.08%) genome completeness. 211 assembled CEGMA, the to in trituberculatus 3.0). According identified the (version (CEGMA). were of Approach datasets BUSCOs completeness ping BUSCO the Arthropoda Arthropoda assessed 1,066 the also the We against of searching 84.7% by Overall, (BUSCO) Orthologs assembled Single-Copy the sal of completeness the assess To level. 90.72% SNPs performed single-base approximately homozygous We the with 47,482 BWA. obtained reads We using mapping assembly. accuracy by genome S2) the the evaluated of evaluated was completeness assembly and the generated explaining reads The genome coverage, sequence assembled short SAMtools. the the using mapped calling to we level, variant platform base single Illumina the the at genome by the of accuracy the evaluate assembly scaffolds To and and accuracy geneset high groups map, and linkage the continuity genetic between of high constructed Markers Completeness of the was Mbp). on assembly (691.44 our based length that pseudochromosomes total suggesting 53 collinearity, the N50: fine to of exhibited (contig attached 79.99% genomes were for 2019). shrimp accounted scaffolds al., for and 2,396 et published of Zhang previously ( total (X. of Mb A 605.56kb) those 15.59 N50: than, of scaffold better N50 and or kb scaffold 57.65 to, a removed comparable and and in are kb, summarized 10%, draft genome are the 108.68 than final data of on more sequence a N50 based bases All yielded contig filtered N 50%. assembly than were with genome more platform reads The 5) filtered Illumina ([?] adapters, the bases low-quality with by with reads generated reads filtered data genome the The sequence on criteria: data. Raw performed following sequencing was 264.81x of coverage) sequencing was (71.64x and coverage Gb constructed 85.97 according of was produced produced. platform were which library size 4000 data platform, linked-read HiSeq sequencing 4000 Illumina of Genomics coverage) HiSeq the (92.51x 10x Illumina on Gb out one 111.01 carried of addition, was total In sequencing A generated and were instructions. constructed bp, reads sequencing manufacturer’s were long 350 the libraries After of to of sequencing Illumina Gb size study. paired-end 120.79 insert Two this of an assembly. total with in genome sequencing, a subsequent used the (Tianjin), real-time for Novogene were used at single-molecule link-reads, and platform Bioscience’s SEQUEL Genomics PacBio Pacific 10x the and with including sequencing technologies, paired-end three Illumina’s of combination A assembly and sequencing Genome hs eetn o ooyosrt 006% n h ihacrc ftegnm sebyat assembly genome the of accuracy high the and (0.0067%) rate homozygous low a reflecting thus, , .trituberculatus P. .trituberculatus P. . TbeS6) (Table eoe( genome 6.2)(age l,22) u agnlyhge than higher marginally but 2020), al., et (Tang (61.42%) eoeecds1,8 rti-oiggns h vrg eelnt n coding and length gene average The genes. protein-coding 19,981 encodes genome al S4 Table a siae ob . b( Gb 1.2 be to estimated was utemr,˜44 fcmlt UC ee eescesul identified. successfully were genes BUSCO complete of ˜84.4% Furthermore, . .Alrslsidctdta h eoeasml a odcvrg and coverage good had assembly genome the that indicated results All ). TbeS,FgS2) Fig S5, (Table .trituberculatus P. .trituberculatus P. .trituberculatus P. 9 i S1 Fig al 2 Table eoeuigteCr uaytcGnsMap- Genes Eukaryotic Core the using genome eoe epromdBnhakn Univer- Benchmarking performed we genome, mn h rdce ee,1,6 (˜98.9%) 19,763 genes, predicted the Among . eoewt oa egho 6.5M,a Mb, 864.45 of length total a with genome ,adteeoe h vrg sequencing average the therefore, and ), .Teasml ttsiso h crab the of statistics assembly The ). ioeau vannamei Litopenaeus al 1 Table TbeS3) (Table TbeS7, (Table Fg 1). (Fig. (49.38%). Eriocheir . (Table P. . Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. atlg n oedsrcin hrfr,tecnrcino h w eefmle nca rvdscusto clues similar is provides some crab RA in shows families RA 2014). gene two Liu, ‘carcinization’. the and of self-tolerance. & of hyperplasia, mechanisms contraction immune and Zhuang, molecular the disease, inflammation the of Zhang, Therefore, synovial human study weakness, Bo, failure destruction. common muscle bone Zhang, a a skeletal and as to Q. by cartilage such linked Y. caused carcinization, are to 2019; diseases genes appearances al., autoimmune both proportion interestingly, et systemic body More (Steinz and a number (RA) crab. families gene arthritis to between WDFY crayfish, rheumatoid correlations and to positive actin by shrimp, characterized the from as including respectively, crab, attention, and special metamorphosis. stage deserve and larval development families like S14) in flea gene functions the contracted specific crab at have Two and development may embryonic crayfish, it of early shrimp, that regulation in penaeid suggesting the expressed thus, in in mainly ; genes was proteins 23 family the and gene for 3 the role 0, a identified are supports were There ( research S13) families previous test 2003). gene Com- and exact (Urrutia, contracted family, Fisher’s carbonization. morphogenesis forty-seven protein the in and finger using processes zinc expanded evolutionary and C2H2 species one genomic-level crab crab species, infer of two shrimp to analyses in two inter- time, genomic the been Comparative first has with the which 2019). or- pared for evolutionary ‘carcinization’, (Borradaile, body us, The called change compact allowed is morphological 2014). a shrimp crab dramatic (Scholtz, by a a pleon characterised towards ventrally-folded as are crustacean a preted crabs lobster-like and a lobsters, carapace from or short transformation shrimps depressed, of a with bodies ganization elongate the molting. with ribosome-related and Compared specific neuroendocrine of eyestalk carcinization number of of stages, large regulation basis molting the A different Genetic in to role 2011). related significantly important Mykles, are an endocrine & expressions play major their and (Chang they and eyestalk The molting that eyestalk, the suggesting the in stages. in in role located molting expressed is a are different (Hopkins), genes play in (XOSG) to expressed complex salt known gland mainly low X-organ/sinus is were under the most crustaceans, molting, but development, of of complex processes, analyzed stages above we infection different Furthermore, the in pathogen in pathway genes). upon ribosome (140 the and genes in of stress, genes number of largest expression the and the containing reclamation, enriched, bicarbonate significantly tubule P450 (p pathways most cytochrome proximal enriched by absorption, significantly xenobiotics and of revealed digestion metabolism also and protein analysis ribosome, enrichment adaption, contained KEGG salinity which species. molting, the ( to development, played the genome specific as genes in crab expanded such the contracted some process, in were that suggested physiological that genes stress different families expanded immune in gene the of seven roles species expression and important other the expanded of the enrichment were to Genomes) that compared families genome, gene crab 18 identified We ago years million ˜326.0 that showed of analyses ancestor phylogenetic orthologues, to single-copy related closely Using between families tree. relationship gene phylogenetic OrthoMCL phylogenetic single-copy a using 246 the evolutionary species investigate and twelve the To altogether in into families performed . insights analyses gene cluster important 32,918 family identified provides gene We in species species. related those closely of relationship between families gene Identifying Evolution Genome ute nlsso t xrsini mroi eeomn n avlmtmrhsssoe that showed metamorphosis larval and development embryonic in expression its of analysis Further . hr r 4 ,ad4atngns n 9 ,ad0WF ee npnedsrm,crayfish shrimp, penaeid in genes WDFY 0 and 1, 19, and genes, actin 4 and 8, 84, are There . .trituberculatus P. FgS5). (Fig .trituberculatus, P. al S11 Table Fg 2) (Fig. nadto,w loietfid141uiu eefmle otiig319genes 3,199 containing families gene unique 1,471 identified also we addition, In and . .sinensis E. .Toelnaeseicgn aiismycnrbt otat htare that traits to contribute may families gene lineage-specific Those ). ihadvrec ieaon 5. 7.–7.)mlinyasao The ago. years million (77.6–274.3) 153.0 around time divergence a with FgS7) (Fig eaae rmteacso of ancestor the from separated TbeS10) (Table FgS6) (Fig efudta oegnswr bqiosyexpressed ubiquitously were genes some that found We . al S12 Table 10 .trituberculatus P. mn hc,teptwyo iooewr the were ribosome of pathway the which, Among . EG(yt nylpdao ee and Genes of Encyclopedia (Kyoto KEGG . .Teol xaddgn aiywsthe was family gene expanded only The ). n te pce,w constructed we species, other and TbeS,TbeS,FgS4) Fig S9, Table S8, (Table .virginalis P. .sinensis E. and .vannamei L. a most was Fg 3) (Fig. < (Table (Table 0.01), Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. The codn otecutraayi fteepeso atrsdrn h aiiysrs,teepne and expanded the salinity stress, up-regulated the after salinity was genes the of group down-regulation during other showed group patterns the One expression while categories. the two test, into of divided were analysis DEGs cluster unique the in adaptation to salinity and shock According in resistance Heat involved stress were protein, with apoptosis 14-3-3-like associated and are addition, resistance, . and stress In identified, transport, plays ion also 2012). and that were osmoregulation, Zeng, genes in & family involved caspase (Romano was apoptosis two transport A DEGs and subunit the ion 70, from catalytic in protein identified ATPase role were proton genes important V-type unique the an 141 which, and Among genes expanded . Eight normal and DEGs. to in up- unique 11.6%) restored adaptation were and salinity (535, or that of 16 mechanisms regulated transcripts the Profiles down with understand To down- then stress, h. salt were were 72 which upon h. after that DEGs 72 stress, normal 14.6%) after of salinity to levels number (670, the returned highest following transcripts profiles and next immediately of 20 stress, the regulated number into exhibited post-salt 9.5%) h highest grouped of (436, 24 the were total 18 and contained and a 12 1 Method, that at Profile showed Clustering regulated expression STEM profiles, crab thus, gene the the the and of using Among on DEGs, Comparisons to clustered impact of substantial were relating 33.5% h). a DEGs mechanisms for 72 has accounting stress the to salt (DEG), h clarify of low expressed 0 transcriptome to that differentially (from complete indicating significantly importance stress the were practical examined salt unigenes and we low 4,603 immune study, to theoretical and present development, response great the growth, in of In their is affects osmoregulation. indirectly it their which Thus, changes, salinity system. external to vulnerable is trituberculatus crab P. of development adaptation the salinity in genes of brachyurization. be two Mechanism in the should involved of it are patterns Conversely, they expression that found different phase. suggest We the crab shrimp Thus, 2015). juvenile and 2015). al., first al., et the et (Song of (Xiaoqing to brachyurization expression megalopa the of the that period noted from key regulated the from down crab is of significantly crab development juvenile The the first carcinization. that the of characteristics to important megalopa most the the of one is Brachyurization stage. egg-nauplius the after active more metamorphosis Many and 1987; , development 2019). (Hogan, in development al., different functions in that et showed identifying roles (Yan stages, their embryo ontogenic indicating an different thus, of embryogenesis; Eight axes early principal McGinnis). during the expressed of are all genes throughout development phological cieptwyrfr oteatvto fsm hsooia rcse seta o uvvli o salt low in survival for among essential secretion, processes pancreatic and physiological apoptosis, some absorption, and of digestion others activation presentation passive protein and the including The environments, mainly processing to and stress antigen stress, mechanisms. refers salinity and pathway ”active” upon translation, processes active and physiological protein nonessential ”passive” some ribosome-mediated through of include inhibition mainly the were to by refers adjustments pathway made Such were adjustments environment. or compromises physiological some Pb and Hox FgS10) (Fig Dfd Ubx Kede ogobo,&Tsaaao,21;Q ag Zhong). Wang; Q. 2011; Tassanakajon, & Pongsomboon, (Kaeodee, eefml swl nw o t motn ucini ietn isedffrnito n mor- and differentiation tissue directing in function important its for known well is family gene eemr ciebfr h atuainsae and stage, gastrulation the before active more were and . r ekhg soeuaos(caaa&Fra 02.Teritra soi pressure osmotic internal Their 2012). Faria, & (McNamara osmoregulators high weak are Hox Abd-A Hox ee eefudi h rbgnm,o hc hi xrsin eedtce at detected were expressions their which of genome, crab the in found were genes ee a ieetlvl fepeso tdffrn tgs oee,i general, in however, stages, different at expression of levels different had genes ee,wihmil oto h hp fteadmnlsgeto rb were crab, of segment abdominal the of shape the control mainly which genes, Ubx and Abd-A smitie tahg ee hogotteesae nshrimp in stages these throughout level high a at maintained is Fg5) (Fig 11 .trituberculatus P. ae nteaoersls eseuae that speculated we results, above the on Based . .trituberculatus P. Scr TbeS15) (Table , Antp efrhraaye h expanded the analyzed further we , , Ubx . oaatt h o salt low the to adapt to , Abd-A Fg4) (Fig h eut suggest results The .trituberculatus P. .trituberculatus P. FgS9) (Fig and h results The . TbeS16) (Table FgS8) (Fig Abd-B The . were Hox Lab . Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. bnatntrlplschrd nntr.Mroe,cii ucin samjrsrcua opnn in component structural major a as functions chitin Moreover, nature. amounts substantial in contain of polysaccharide which polymer stomachs, natural linear their abundant a in is found Chitin shells chitin. crustacean of small and shellfishes be of to number known and are sulfotransferase which Unlike carbohydrate system, and production, endocrine chitinase the mammalian synthesis. and acidic neural-cadherin including metabolism, growth, acid to genes amino related 42 genes, metabolism, growth-related which carbohydrate candidate to these for identified, Among unique were or traits. growth expanded genes with were growth-related associated pathways. be candidate KEGG may secretion pathways 676 salivary these and genome, resistance in insulin strategy absorption, markers BSA (p by these enriched found significantly of were were locations SNP/InDels the growth-related on 562 During of population. total of growth. rate wild A rapid growth a such The from to population. selection contributing unselected of mechanism the regulatory generations to fast-growing five compared of after 20.12%, variety increased 2010 trituberculatus new weight in The body selected the was significance. selection, commercial 1, high No. their Huangxuan to due traits growth the trituberculatus P. growth to rapid of According mechanism differ- Regulatory re- is markers markers. resistance 453 related individual disease-resistance screened the we their analysis near differentiation, BSA disease, region disease-resistance performed sandthreeInDels includingnineSNP the for to squaretest, and in materials resistance resistance anchored i the disease were strong disease- Using to or pathogens. immune- exogenous have lated these of invasion of the expansion crab resist crabs significant ent. the to to ability the related crab’s Thus, closely the Although shaped be 2017). have to Li, may proved & families been In gene Liu, has related (0.017947197). which Gao, pathway enriched, Yu, signaling also (Ren, the Hippo was immunity (0.00418914), pathway the as phagosome Some and apoptosis such as (0.01011326), the such invasion, pathway addition, (7.33E-05). pathways, signaling pathogen system infection receptor immune to NOD-like Salmonella in overrepresented the related and were pathways families (6.50E-05), pathways gene in Legionellosis expanded involved KEGG (8.63E-08), 27 specifically trypanosomiasis in were African enriched families immune significantly the gene elucidate were expanded to families used were gene analyses Expanded genomic comparative time, crab. first the the of for mechanism Thus, 2008). Songye, the & and h, 72 and at diversity rates trituberculatus mortality phenotypic cumulative the trituberculatus to to of P. contributors According (LD50) Vibrios. important doses and 2019). lethal most WSSV al., median as et the such Zhang pathogens, of (X. some environment to one the probably in adaption is evolutionary expansion family Gene mechanism Immune 1 disulfidecoredomainprotein resistance 1( dxnsqecndph somemarkerswereselectedandverifiedbasedonwhethertheywererelatedtotheresistanceof ndexandsequencingdepth, wasinvolvedinthecircadianentrainmentandneuroactiveligand evm.model.Contig .sinensis E. − eaegnseenhrd includingDNA relatedgeneswereanchored, sntsniiet SVdsaecmae osrm Zoga hzeg ejn Weixian, Wenjun, Zhizheng, (Zhongfa, shrimp to compared disease WSSV to sensitive not is sas infiatyfse hnta of that than faster significantly also is sa motn qautr pce nCiaadbedr r atclryitrse nits in interested particularly are breeders and China in species aquaculture important an is > and 405 .carinicauda E. .vannamei L. . 11) Aoghsgns DNA .Amongthosegenes, .trituberculatus P. 1( < .parahaemolyticus V. .1 nTp Idaee elts acetcscein rti ieto and digestion protein secretion, pancreatic mellitus, diabetes II Type in 0.01) evm.model.Contig , > .trituberculatus P. β- .chinensis F. ,-ikdNaey--lcsmn GcA) n stescn most second the is and (GlcNAc), N-acetyl-D-glucosamine 1,4-linked Tbe3) (Table ( 242 al S17 Table h nidsaeaiiyo orknso rsaenwas: crustacean of kinds four of ability anti-disease the , 12 > . 11) − .sinensis E. satpclcrioosca.Teeaeotnalarge a often are There crab. carnivorous typical a is .vannamei L. damage − andproteintrappedinendoderm damage oeo h ee lydkyrlsi pathways in roles key played genes the of Some . ) Bsdnhlctoifrainfhsmres threedisease .Basedonthelocationinformationofthesemarkers, − − inducibletranscript − Fg6) (Fig eetrneatoptwy.lscKGptwywrascaewtimnt,dsae andenvironmentaladaptation disease, receptorinteractionpathways.AllsuchKEGGpathwayswereassociatedwithimmunity, n hssuysuh odtriethe determine to sought study this and , .trituberculatus P. inducibletranscript tedt oson.I addition, In shown). no data (the FgS11) (Fig TbeS8adFg8) Fig and S18 (Table FgS5) (Fig oa f47genes 427 of total A . h eut ugs that suggest results The . a togrresistance stronger has 4 aivleiteTOsgaigahaadhAtpayaha,whileproteintrappedinendoderm ORsignalingpathwayandtheAutophagypathway, wasinvolvedinthemT 4( npriua,the particular, In . evm.model.Contig .trituberculatus, P. Based . P. P. − 81

. 19)

Pfour AP W , .parahaemolyticus V. − − Finally, .F 12 molecularmarkerssignificantlyassociatedwiththeresistanceof − ( ua,Ci,Hpis Bagrodia, Hopkins, Chiu, ruman, F .parahaemolyticus V. & Abraham, eedniidsnte earsonchi wereidentifiedusingtheP 2017; Silva & Jung, 2013; .u ,Araki, e, Y X.Xu, − & Ahmed, 2012) .After .parahaemolyticus V. neto,theexpressionofthethreegenesinhemocytesandthehepatopancreasweresignificantlychanged infection, Fg7) (Fig Ths thesegenesmayplayanimportantroleintheimmunedefenseof hus, .T .trituberculatus, P. whichcouldberelatedtothedifferentiationofdiseaseresistancetraitsbetweenindividuals. Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. ee,laigt aesxa eeomn Msr ta. 02.I hce n afih hc xii the development exhibit male which induce flatfish, may dosage and gene chicken the In and sex-determining chromosome, 2002). Z XX/XY al., the the on et located In (Masaru is development DMRT1 system, sexual 1999). ZZ/ZW male Lin, to & leading Yuan, genes, conserved Tang, highly Y-linked more Moore, or the one (Meng, by system, characterized domains is which (DM) determination, sex of DNA-binding in involved patterns was gene expression developmental doublesex different The the during and females and sex analyzed, males the successfully in on different were stages, significantly based were genes crabs, (Contig475.8) eight juvenile gene ( Finally, unknown to doublesex including Z1 which, 2018). of of region al., sexes three Y-linked known et chromosome. the with Y (Lv in the stages marker predicted locating developmental in were different accuracy genes during our analyzed 10 the proving in thus, of located 2018); total all al., were A et they contigs(Lv that two found those and on further zero located To to also equal Contig475) 2018). values and CQ al., (Contig153 with contigs et two results in of mechanism(Lv on genotypes determination determination focused heterogametic We sex sex identified chromosome. of also XY mechanism studies the Those the support map. analyze male linkage in high-density (LOD markers a LG24 on using sex-associated located time QTL first significant analysis highly the karyotype a Gonz´alez-Orteg´on,for studies, by Mart´ınez-Lage, previous Perina, & determination our Torrecilla, In sex determined. (Zeltia been genomes, crustaceans in complex most determination and sex in of Gonz´alez-Tiz´on, chromosomes mechanism performed the of Moreover, 2017). be number cannot large the to Due in determination traits Sex insulin growth of to induces differentiation unique LAR the signaling. was which and insulin of growth between of by knock-down one rapid located regulated SNP and that the be one traits, suggested related and that growth data showed proteins with Those 1996), analyses associated cellular BSA . was Goldstein, is of of genes result & diabetes LAR phosphorylation The tandem Zhang, tyrosine 2 two Sale). Li, reversible & type Hodgkinson, of M. and (Mander, (LAR) (P. regulation resistance phosphatase obesity resistance the Protein-tyrosine retardation, cell insulin in 1995). stimulate Growth Phillips, role to also 2000; essential which Flier, 2001). an & metabolism, (Kahn plays lipid Kahn, resistance and & insulin glucose with mechanism (Saltiel of associated rapid regulation differentiation and the and specific for a growth important be is may signalling there Insulin that indicated which genome, in crab’s myogenesis the of & specific in Holland, one its found Bardeesy, and analyses, were (Maccalman, myogenesis genomics in family thereafter comparative role declined matrix important on which an extracellular Based myoblasts, plays the prefusion Blaschuk). N-cadherin of addition, in of In expression highest remodeling Decreased was al.). the et (MSC). expression (Grassot cells to myotubes ( satellite contributes in 5 murine fusion and as sulfotransferase skeletal cell biosynthesis, Carbohydrate such in during sulfate progression matrix, S.). myogenic steps keratan extracellular & during key (Chakravarti in the sulfates involved are of keratan myotubes is organization of in ) structural composed fusion different is and a which alignment requires lumican, Cell growth. alignment rapid myogenesis. Cell support from myogenesis. that inseparable substances is and growth energy Rapid chitin essential produce degrade to can the chitin and containing in conditions food found intestinal was that function and fiber gene can stomach dietary AMCase (AMCase) of (GlcNAc) under chitinase source produce mammalian glycosidase a to acidic protease-resistant hydrolyze as substrates that considered that major confirmed been enzymes a studies long are recent as has However, Chitinases chitin digested. 2015). system, not digestive Spaink, is mammalian & the Stougaard, In (Koch, chitin. insects and crustaceans, fungi, Fg10) (Fig . .trituberculatus. P. DMY/Dmrt1bY 2 .trituberculatus P. hc sasuc fcro,ntoe,adeeg Ms ta..Aspecific A al.). et (Misa energy and nitrogen, carbon, of source a is which , Ptdmrt i 9) Fig ( ee ftetlot eaa eecaatrzda sex-determining as characterized were medaka, teleost, the of genes otg7.3,mbSN-ie( myb/SANT-like Contig475.13), , neetnl,tetopeiul-nhrdsxmreswere markers sex previously-anchored two the Interestingly, . .trituberculatus P. eoe hs ugsigteca a ffiinl digest efficiently can crab the suggesting thus, genome, 13 Tbe4) (Table Chst5 eue Qaaye oaco h Y the anchor to analyses CQ used we , n hi xrsinpten were patterns expression their and , eeadoeepnigN-cadherin expanding one and gene .trituberculatus P. Ptmyb .trituberculatus P. otg5.8 n an and contig153.28) , > a o previously not has .trituberculatus P. 4 a identified was 14) Chst5 delayed Chst5 could Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. costegnm;4 Ccnetars h eoe ,represents genes 5, strand negative genome. of the distribution across 3, content genome; GC the across 4, genes genome; strand the positive across of distribution the 2, of genome; features Genomic PR- 1 (https://www.ncbi.nlm.nih.gov/bioproject/ Fig NCBI on table and found Figure be can datasets JNA631920/) sequence genomic manuscript. preparation. the All sample reviewed authors performed All Gao Ti manuscript. Baoquan availability Xingbin the and Data and wrote Zhencheng Liu Yan Su and Ping Deping Zhencheng Liu, Su, and analyses. Zhencheng Guangjian Lv data Li, Jianjian Ronghua research. and Lv, the experiments Jianjian designed the sequencing. and performed genome Liu.conceived the Ping performed and Su and Wang Chunlin (2018YFD0900303 Li, Jian Na- China LJNY2015002), of (CARS-48), (No. System Program contributions Project (41776160). Research Author Talent China R&D Leading Agriculture of Taishan Foundation China Key of Science & Natural Project National tional China Innovation the Agriculture of Eco Agriculture by Efficient of supported Ministry was 2018YFD0901304), research regulatory This the crab. provide clarify economic will to this genome important for reference also MAS Acknowledgements This are and which traits determination. breeding studies, of sex important biological adaptation and of and salinity growth mechanisms evolutionary the the rapid for on immunity, underlying light resources strong mechanisms shed crucial their and comparative data as evolution, and The well annotation structure, ecosystems. as Genomic genomic marine the complex crustaceans. of into aquatic biology insights of provided specie analyses the important genomic of an assembly and genome species brachyuran chromosomal-scale a of present specificity We species and complexity the sex- indicating Conclusions unknown studies, the in previous of mechanisms in pattern determination determination expression sex sex Neither the the in males. stages, involved in crab be higher juvenile to was from and and different megalopal changed, was gene the that, gene related in sex-related was However, unknown genes The phase. two Z4 stages. the crab of juvenile to expression Ptdmrt Z1 the in from between development peak throughout difference second The a reached females. the and differentiation. to stage sex the Similar of Z1 ( to periods the (6.20-times) critical compared during be 18.54-times, phase may (upregulated decreased Z3 periods stage then the egg-zoea was in the Expression expression at peak stage). its study) multicellular reaching development, this stage, embryonic in egg-nauplius of used the stages were different domain the individuals DM During male one 40 including the exons S12), and that three acids, (Fig showed female spans amino also frame (40 265 results reading males PCR open of of in the protein pairs) and amplify putative (RACE), base only ends (2,165 a cloned sequence encodes of cDNA amplification and full-length rapid The by In obtained al.). was al.). et et (Yoshimoto Nanda gene 2014; (ovary)-determining al., et Chen (S. and Ptmyb ugsigthat suggesting Ptdmrt fwihepeso nfml a infiatyhge hnta nml uigZ to Z1 during male in that than higher significantly was female in expression which of , ee h xrsinof expression the gene, Ptdmrt P.trituberculatus Ptdmrt saYlne eeadpasa motn oei e determination. sex in role important an plays and gene Y-linked a is .trituberculatus. P. .laevis X. i 13) Fig eewsepesdi l aetsustse,btnti females in not but tested, tissues male all in expressed was gene Ptmyb rmotrt ne ice:1 itiuino costhe across of distribution 1, circles: inner to outer From . n -ikdgn,D-,wsietfida ieysex likely a as identified was DM-W, gene, W-linked one , hs eut niae htteegze n zoea and egg-zoea the that indicated results Those . 14 nmlswsas infiatyhge hnta in that than higher significantly also was males in Ptmyb Ptdmrt .trituberculatus P. rteukonsxrltdgnswr found were genes sex-related unknown the or xrsinwssgicnl peuae at upregulated significantly was expression P.trituberculatus ersnaieca fthe of crab representative a , Ptmyb Fg11) (Fig a anyexpressed mainly was Fg12) (Fig hoooe.I 5, In chromosomes. .trituberculatus P. . Ptdmrt n RT- and , Ptdmrt can , Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. pce.()Heacia lseigo 22fml ee tsvnendffrn eeomna tgsin of stages Clustering developmental different 4 seventeen Fig at genes (J). family stage C2H2 crab of clustering Hierarchical P.trituberculatus of (B) expression species. and respec- branches, tree their berculatus Phylogenetic values above 3 colors bootstrap red represent Fig Gene and nodes genes. green orthologous ancestral in copy at shown single to 246 points are using (-) Red constructed contraction was tively. and which (+) species, sliding 12 expansion 0.5-Mb of family nonoverlapping tree in Phylogenetic drawn 2 are Fig 2–4 chromosomes. different at genes windows paralogous joins line each g-apissae(n,egze tg E) oasae(1Z) eaoa tg M,adjvnl crab juvenile and (M), stage megalopal (G), (Z1–Z4), stage stage gastrulation (B), zoea stage (Ez), blastula stage (J). (Mc), stage egg-zoea stage (En), multicellular including stage crabs egg-nauplius young to eggs fertilized of Expression 13 Fig 12 475) of Fig Contig sequence cDNA in of located Alignment was (B) gene domain. (Ptdmrt individuals. DM female the and in male structure in of crab features juvenile structural of to and Z1 Sequence from female 11 genes and Fig male 8 represents of Y-axis pattern and Expression values region 10 CQ a (PBC- Fig distinctive represrnts female by X-axis and differentiated genome. (PBX-PSX) be splitted male can the from of sequences individuals. alignments region Y-linked The same how results. the demonstrate CQ in to the data of sequence plot coverage PSC) and Reads 9 Fig respectively. groups, two of screening posiotion the the c. represents group. X-axis analysis. Δ BSA or SNP-index through the differentiation represent growth with of infection fast Contigs following of of tissues Identification different 8 in Fig genes related resistance disease of parahaemolyticus analysis Expression 7 Fig respectively. groups, two of screening the c. group. up Δ or represents posiotion the SNP-index ring represents the X-axis Green represent analysis. of BSA stress. through Contigs salinity differentiation resistance of during disease gene. of DEG Identification regulated unique 6 down Fig and represents ring expanded yellow the and of gene of analysis regulated qPCR Cluster indicated.(B) 5 are Fig genes the of orientation in and stages developmental position relative The (Pvi). SPidx au ffitn eut.a h N-ne rp fPXgop .teS-ne rp fPSX of graph SN-index the b. group. PBX of graph SNP-index the a. results. fitting of value (SNP-index) XT of graph SN-index the b. group. X of graph SNP-index the a. results. fitting of value (SNP-index) P.trituberculatus Ptdmrt. Hox Ptdmrt ee lsesadteEpesosa ieetDvlpetlSae in Stages Developmental Different at Expressions the and Clusters Genes ()Pyoeei reo ee fteepne eefml oprdwt h w shrimp two the with compared family gene expanded the of genes of tree Phylogenetic .(A) h lc o niae h osre Mdmi n h rylbldrgo stezn finger zinc the is region labeled gray the and domain DM conserved the indicates box black The Hox Δ Δ .trituberculatus P. trituberculatus P. mlf nml n eaeindividuals female and male in amplify SPidx rp.Tebu ahdln n upedse ierpeettecniinof condition the represent line dashed purple and line dashed blue The graph. (SNP-index) of condition the represent line dashed purple and line dashed blue The graph. (SNP-index) nldn etlzdegsae() oasae(1Z) eaoa tg M,adjuvenile and (M), stage megalopal (Z1–Z4), stage zoea (F), stage egg fertilized including ee in genes itrso te pce r rmtewbieht:/hlpcogaot#bu –use. http://phylopic.org/about/#about website the from are species other of pictures , Ptdmrt .trituberculatus P. .trituberculatus P. Δ Δ tdffrn eeomna tgs lvndffrn eeomna tgsfrom stages developmental different Eleven stages. developmental different at SPidx au feeySPlcs h e ie hwteSPidxor SNP-index the show lines red The locus. SNP every of value (SNP-index) or SNP-index the show lines red The locus. SNP every of value (SNP-index) n -xsrpeet h N-ne or SNP-index the represents Y-axis and or SNP-index the represents Y-axis and Ptdmrt . . Ptdmrt ( Ptr C2H2 ), eecnit ftreeosadtointrons. two and exons three of consists gene ee.()Ncetd n eue mn cdsequence acid amino deduced and Nucleotide (A) genes. .sinensis E. 15 aiya ieetdvlpetlsae in stages developmental different at family ( Esi Ptdmrt > ), .vannamei L. 0i 0 xeiet.I addition In experiments. 100 in 90 n N euneo otg475 Contig of sequence DNA and Δ Δ SPidx.Teclrdots color The (SNP-index). dots color The (SNP-index). ( Lva .trituberculatus. P. Hox ,and ), ee tdifferent at genes .virginalis P. .tritu- P. (A) V. Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. v.oe.otg61 yohoecoiaeasml atr6hmlgO=oospesG=O6P= SV=1 SV=2 PE=1 PE=2 GN=COA6 GN=Fbxl6 sapiens musculus OS=Homo OS=Mus SV=1 homolog 6 PE=2 6 protein GN=MFSD6 factor F-box/LRR-repeat scrofa assembly OS=Sus oxidase 6 c protein Cytochrome SV=3 domain-containing SV=2 PE=2 superfamily PE=1 GN=CHST5 facilitator GN=Kirrel sapiens norvegicus Major OS=Homo OS=Rattus 5 1 sulfotransferase protein Carbohydrate IRRE-like of Kin SV=1 - PE=1 GN=TRIM59 sapiens OS=Homo 59 - protein SV=1 motif-containing Tripartite PE=1 SV=2 GN=ine PE=1 melanogaster GN=KIAA1524 OS=Drosophila - sapiens ine OS=Homo transporter CIP2A GABA evm.model.Contig37.39 Protein chloride-dependent and Sodium- SV=1 evm.model.Contig66.12 PE=2 GN=FP1 coruscus OS=Mytilus SV=1 evm.model.Contig82.11 - protein PE=2 matrix GN=GABRB4 plaque gallus evm.model.Contig28.13 Adhesive OS=Gallus beta-4 SV=1 subunit SV=2 PE=2 receptor evm.model.Contig35.4 PE=1 GN=Prune acid GN=Chia musculus Gamma-aminobutyric musculus OS=Mus OS=Mus homolog evm.model.Contig51.19 chitinase prune mammalian Protein Acidic SV=1 evm.model.Contig1679.4 PE=1 GN=COMMD4 sapiens OS=Homo evm.model.Contig139.4 - 4 protein SV=2 domain-containing PE=1 evm.model.Contig693.3 COMM GN=CadN melanogaster SV=1 OS=Drosophila PE=2 evm.model.Contig58.27 Neural-cadherin sanguineus OS=Hemigrapsus BCRH2 SV=1 opsin evm.model.Contig3.89 PE=1 eye GN=Nc Compound SV=1 melanogaster PE=2 SV=2 OS=Drosophila evm.model.Contig639.9 sanguineus PE=1 Nc OS=Hemigrapsus GN=ZNF418 Caspase BCRH2 sapiens opsin OS=Homo evm.model.Contig3.172 eye 418 Compound protein finger evm.model.Contig397.12 Zinc evm.model.Contig1675.1 - evm.model.Contig6.77 evm.model.Contig39.49 evm.model.Contig166.1 evm.model.Contig15.40 evm.model.Contig128.21 evm.model.Contig1407.2 evm.model.Contig128.16 evm.model.Contig238.5 evm.model.Contig153.38 in Gene specific or expanded also genes, growth-related 42 3 Table scaffolding after Contig ** of statistics Assembly 2 Table the for used data Sequencing 1 Table oa 1.7–264.81 100.66 71.64 92.51 – 150 – 317.77 150 85.97 120.79 – 111.01 (X) – coverage – Sequence (bp) length 350 Read Total Genomics (G) 10X data Total reads Pacbio size Insert reads Illumina libraries Pair-end DSisrtannotation SwissProt ID 9 0321069807384 51 8,037 34 5,734 26 4,132 130,609 20 2,431,430 2,951 - 8,722,939 2,055 12,181,407 30,342 15,593,479 44,666 62,400 - 2,165 82,211 number 33,040,454 108,682 12,101 number N90 864,453,619 6,277,820 N80 N70 846,564,517 Length N60 N50 Number Length Max Total ID Sample > 20 2072,165 12,077 - - =2000 P.trituberculatus .trituberculatus P. otg*b)Saodb)Cni* Scaffold Contig** Scaffold(bp) Contig**(bp) 16 eoeassembly genome .trituberculatus P. Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. i 5Tesaitc fKG aha niheto xaddfamilies expanded of enrichment pathway KEGG of statistics species The different S5 in the Fig in genes sequence of consensus distribution the The and S4 genome the Fig the in in TEs elements of TE type identified each the novo between of calculated rate was divergence rate of Distribution S3 Fig . among features size=total gene , ’Genome of formula genome Comparison The the S2 represented by 41. Fig volume was Gb high K-mer 1.2 of and peak frequency as volume low estimated main at was The peak size errors. left-hand sequencing The random essentially occur. containing they K-mer which at frequency in the frequency against 17-mer of Distribution S1 Fig SV=1 PE=2 GN=timm10b laevis OS=Xenopus B Tim10 subunit translocase SV=2 membrane PE=1 inner GN=Lar import melanogaster Mitochondrial OS=Drosophila Lar phosphatase Tyrosine-protein SV=1 SV=1 PE=2 PE=1 GN=dok7 - GN=Arhgef11 rubripes norvegicus OS=Takifugu OS=Rattus region Dok-7 11 Y-linked Protein factor the exchange in - nucleotide predicted guanine genes Rho ten of information The - 4 Table SV=1 PE=2 GN=stk10 rerio - OS=Danio SV=2 10 PE=1 kinase GN=Lrrc16a Serine/threonine-protein musculus OS=Mus SV=1 16A PE=2 protein evm.model.Contig708.7 GN=BCAT1 repeat-containing aries Leucine-rich SV=2 OS=Ovis PE=1 cytosolic evm.model.Contig180.1 GN=SLC17A5 aminotransferase, Branched-chain-amino-acid sapiens SV=2 OS=Homo PE=1 evm.model.Contig317.10 Sialin GN=CadN melanogaster OS=Drosophila evm.model.Contig1137.1 Neural-cadherin SV=1 evm.model.Contig597.4 PE=1 GN=GBF1 griseus OS=Cricetulus evm.model.Contig89.6 - 1 factor - exchange nucleotide evm.model.Contig96.7 guanine SV=1 A-resistance PE=1 brefeldin GN=IDS evm.model.Contig188.4 Golgi-specific sapiens OS=Homo 2-sulfatase evm.model.Contig246.3 Iduronate evm.model.Contig954.14 evm.model.Contig348.16 evm.model.Contig15.39 evm.model.Contig499.7 evm.model.Contig0.31 evm.model.Contig23.18 evm.model.Contig3.60 evm.model.Contig23.12 evm.model.Contig526.3 Gene ahi pulex Daphnia irr eused. we library DSisrtannotation SwissProt ID , xdsscapularis Ixodes otg7 687868 evm.model.Contig475.13 evm.model.Contig475.11 evm.model.Contig475.10 evm.model.Contig475.9 + - evm.model.Contig475.8 866685 evm.model.Contig475.7 - 539131 865887 - 538208 Contig475 538896 + evm.model.Contig153.40 537678 evm.model.Contig153.39 Contig475 537720 - 497352 evm.model.Contig153.29 Contig475 537316 477231 Contig475 + 490225 evm.model.Contig153.28 Gene Contig475 - 2213665 369282 Strand Contig475 2210559 2212469 - Contig153 2210235 + 1515057 End Contig153 1310130 1514797 Contig153 1307889 Start Contig153 CHROM v.oe.otg.2RsrltdpoenRb3O=rspiamlngse NRb E1SV=1 PE=1 GN=Rab3 melanogaster OS=Drosophila Rab-3 protein Ras-related evm.model.Contig0.32 , aataoatepidariorum Parasteatoda otnstrituberculatus P 17 , erncu urticae Tetranychus , kmer eau vannamei Penaeus P.trituberculatus eoe h oueo -e splotted is K-mer of volume The genome. ID num/kmer , depth’ eoe h divergence The genome. rspiamelanogaster Drosophila , rblu castaneum Tribolium de Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. env uuts4,9 ,6.7845 .9266 1,847.03 246.65 3.59 884.59 5,661.67 44,493 Homology Augustus novo De prediction gene of Statistics S5 Table the of assessment F:Fragmented CEGMA BUSCOs S4 searched Table Duplicated groups BUSCO D:Complete n:Total BUSCOs BUSCOs M:Missing Single-Copy BUSCOs S:Complete BUSCOs C:Complete the of assessment BUSCO S3 Table in SNP of statistics General S2 study.xlsx Table this in used Primers S1 Table of RT-PCR genes stress S12 growth-related salt Fig of during analysis DEGs enrichment specific KEGG and stress S11 expanded salt Fig up-regulated during of DEGs enrichment specific KEGG and S10 expanded Fig down-regulated of low enrichment stage, KEGG stress molting S9 salinity development, Fig during of DEGs stages of different Trend in S8 pathway Fig ribosome infection in pathogen genes and families stress of specific salt expression identified the of enrichment S7 pathway Fig KEGG of statistics The S6 Fig %opeees h rprincr ee nterfrnecr eelibrary. gene core reference the in genes core proportion the gene; gene %completeness: core core of 4. partial Number and #Prots: complete 3. partial: completeness + a Complete with gene 2. core Complete: 1. pce opeecmlt opee+prilcmlt partial + complete partial + complete complete complete P.trituberculatus species eed2,5 0298 4.150 4.34,869.99 3,329.43 3,862.21 3,133.17 148.93 275.99 175.53 202.98 5.01 4.97 3.91 2.50 1,372.20 746.01 686.97 507.15 length(bp) intron Average 14,596.29 length(bp) 20,269.88 exon Average 11,939.92 39,969 Dme gene 28,958 per exons Average 77,306 Cgi length(bp) Cel CDS Average Genscan length(bp) transcript 5,202.15 Average Geneid Number SNAP 142,367 GlimmerHMM set Gene pce UC oainassmn results assessment notation BUSCO P.trituberculatus Species Ptdmrt oooySP4,7 0.0067 0.3604 2,586,394 47,872 0.3537 2,538,522 SNP Homology SNP Heterozygosis SNP All nml n eaetsusof tissues female and male in 8 54 1 85.08 211 %completeness 75.40 Prots # %completeness 187 Prots # 0854200 9.832 7.31,503.38 1,277.38 1,265.61 275.13 397.52 315.45 3.24 1.98 2.55 891.58 787.48 803.49 4,260.04 2,040.56 2,761.53 10,815 29,696 8,768 P.trituberculatus P.trituberculatus P.trituberculatus C:84.7%[S:82.5%,D:2.2%],F:2.9%,M:12.4%,n:1066 > 70% ubrPercentage(%) Number 18 genome genome genome .trituberculatus P. Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. V V 3938590 ,4.144 5.32,132.77 2,194.55 genome in % (bp) 2,346.33 Length 2,197.63 TRF) without (all TEs Combined 255.33 TRF) without (all 236.02 TEs Combined protiens 221.40 TE 256.30 protiens TE (Repbase+Denovo) 4.49 Repeatmasker 5.22 (Repbase+Denovo) Repeatmasker 6.28 Type sequences. 4.76 repetitive of Categorization S7 Table 1,147.41 1,231.18 1,389.51 1,219.26 8,599.08 10,484.05 23,993 databases. different 19,981 in annotation Gene S6 13,768.62 Table 9,475.94 注 44,920 set* Final EVM 23,395 set* Pasa-update* Final Pasa-update* PASA EVM RNAseq *otisteutasae region. untranslated the :*contains uik 6511,0.43096 .3475 2,700.67 467.53 6.63 3,099.64 18,303.84 56,571 length(bp) intron Average Cufflinks length(bp) exon Average Tur gene per exons Average Tca length(bp) Spu CDS Average Pva length(bp) transcript Average Ptep Number Isc Hsa Dpu set Gene nePoal1,7 98.50 98.90 Total 84.90 74.10 19,677 19,763 16,959 Annotated Total 14,806 Annotated 76.90 all KEGG NR InterPro 15,357 Swiss-Prot KEGG Swiss-Prot NR 8862745 1.022 1.41,536.27 1,340.60 1,334.03 1,349.24 1,419.70 1,252.86 1,416.63 1,493.25 416.34 390.59 428.30 302.06 368.69 360.28 303.93 279.70 2.20 2.51 2.39 3.10 2.27 1.58 2.86 2.04 914.90 981.52 1,023.85 935.76 836.84 567.46 869.14 571.79 2,754.58 3,009.75 18,806 2,878.81 22,091 3,766.29 15,073 2,639.54 26,412 1,287.92 23,660 3,503.53 47,137 2,131.13 13,798 31,971 O1,5 91.40 72.50 18,257 14,492 GO Pfam 19 ubrPret(%) Percent Number 991- 19,981 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. 38 Others 17.07 147,540,995 1.79 15,507,775 16.48 142,449,802 LTR 0.02 134,806 0 0 0.02 134,806 SINE 27.29 235,904,361 10.44 90,240,215 23.80 205,758,594 LINE 10.95 94,695,859 0.89 7,698,813 10.20 88,177,866 DNA genome in % (bp) Length genome in % (bp) Length 20 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. al 9Tedsrbto fgnsi ieetspecies different in genes of distribution The S9 Table species each in clustering family gene for used Genes S8 Table 47.08 406,978,948 13.11 113,294,828 45.45 392,875,521 Total 3.42 7,689,177 0 0 0.89 7,689,177 Unknown 0.000004 38 0 0 0.000004 s 5 491316248 15655 163 5888 1489 762 657 orthologs Other 869 Unique orthologs Cse Multiple-copy Cgi orthologs Single-copy Species Nve Cel Pvi Mye Gene Cgi Lva Esi Dpu Oni Cse Dre Ptr Name Species eaotlavectensi Nematostella elegans Caenorhabditis virginalis Procambarus yessoensis Mizuhopecten gigas Crassostrea vannamei Litopenaeus sinensis Eriocheir pulex Daphnia niloticus Oreochromis semilaevis Cynoglossus rerio Danio trituberculatus Portunus 21 24773 20060 20761 23930 27265 24825 7502 30280 21209 20211 31951 23694 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. al 1 h S nlssrsl fgot trait.xlsx growth of result analysis BSA The markers.xlsx S18 12 Table of analysis frequency Genotype DEGs.xlsx S17 unique Table and expanded of crab.xlsx Information and S16 crayfish Table stress.xlsx shrimp, salinity penaeid crab.xlsx during in and DEG families crayfish S15 WDFY shrimp, Table and penaeid actin in of family Genes protein S14 finger Table Ptr).xlsx zinc and C2H2 (Esi of crab Genes of S13 family Table gene contracted and identified.xlsx Expanded families S12 gene Table unique of information The S11 Table in families gene contracted and P.trituberculatus expanded of information The S10 Table otat141Hydrolysis 1 184 Contract differentiation and Growth 14 54 Expand e 0 6 255794 10517 7245 23564 11779 2091 4551 3452 11577 2242 319 10968 10057 5159 15747 10633 17458 7382 2925 602 564 740 2367 958 239 755 768 404 667 908 1689 854 484 Cel 811 Ptr 1004 Dre 886 Lva 889 Esi 944 Nve orthologs 925 Other Dpu 621 Unique Pvi orthologs Mye Multiple-copy Oni orthologs Single-copy Species 1 muiyrelated Immunity differentiation and growth Metabolism expression, gene of Regulation transportation Material Opsin 1 transportation Material Sulfate 10 related Immunity 11 related Growth 11 regulator response transduction Signal 211 8 differentiation related metabolism and Disease 9 Energy growth cell expression, 7258 gene 13 of Regulation 6224 4 Metabolism 6221 12 Antibiosis expression 4982 gene 16 of Regulation 4936 10 Glycometabolism 9 3920 Metabolism 18 2218 9 1691 2 1257 16 1118 3 967 8 598 410 378 303 287 147 eefml DGn ubrClasses number Gene ID family Gene 22 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. hn .(04.UigRpaMse oIetf eeiieEeet nGnmcSequences. Genomic in Elements perspectives. Repetitive our Identify and to review RepeatMasker A 5 Bioinformatics, Using molting: in Protocols crustacean (2004). of N. Regulation Chen, (2011). L. 172 Endocrinology, D. Comparative the Mykles, and in & General Opacity S., Corneal E. and Fragility Chang, Skin Assembly: Fibril Lumican. Collagen of Regulates Absence Lumican S. & assem- Chakravarti, enhancing synteny. and conserved repairing and for set maps tool genetic a of doi:10.1101/2020.02.04.934711 Chromonomer: integration by through (2020). DNA11Edited S. genomes genomic Bassham, bled & human A., in Amores, structures J., gene Catchen, complete of Prediction Cohen. (1997). E. S. F. Karlin, Crustacea. & recent C., the Burge, of Classification carcinization. (1982). of E. instance T. An Bowman, Porcellanopagurus: II. Part Crustacea. 111-126. (2019). A. L. Borradaile, draft bacterial https://europepmc.org/articles/PMC4076250?pdf=render scaffolding https://europepmc.org/articles/PMC4076250 SSPACE-LongRead: https://doi.org/10.1186/1471-2105-15-211 (2014). https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24950923/pdf/?tool=EBI information. W. sequence https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24950923/?tool=EBI Pirovano, read long & from using M., genomes genome Genomewise. Boetzer, and in GeneWise elements (2004). transposable R. Durbin, detecting doi:10.1101/gr.1865504 & and M., Clamp, Discovering L. E., S. Birney, (2007). L. H. Yeh, Quesneville, & sequences. . sequences. DNA M., analyze C. . to Bergman, program . a S., finder: Ferro, repeats Tandem 27 B., (1999). Boeckmann, G. C., Benson, W. knowledgebase. high-throughput Barker, Protein with H., Universal work doi:10.1093/nar/gkh131 C. the to Wu, framework UniProt: A., Python (2004). HTSeq–a Bairoch, search R., (2015). alignment Apweiler, W. local Basic Huber, & (1990). T., J. data. P. D. sequencing Lipman, Pyl, & S., W., Anders, (2014). E. J. Myers, Shendure, W., Miller, . W., tool. . Gish, contiguity. . F., transposase L., S. via Altschul, Christiansen, assembly A., genome novo Kumar, de R., for Daza, 24 information N., research, sequence J. long-range Burton, vitro, O., In J. Kitzman, A., Adey, References 2,5350 doi:10.1093/nar/27.2.573 573-580. (2), http://europepmc.org/abstract/MED/24950923 ora fMlclrBooy 215 Biology, Molecular of Journal rensi iifrais 8 Bioinformatics, in Briefings 1) 0124.doi:10.1101/gr.178319.114 2041-2049. (12), ora fMlclrBooy 268 Biology, Molecular of Journal iifrais(xod nln) 31 England), (Oxford, Bioinformatics ora fCl ilg,141 Biology, Cell of Journal 1,41.11.01.doi:10.1002/0471250953.bi0410s05 4.10.11-14.10.14. (1), 3,4340 doi: 403-410. (3), 6,3232 doi:10.1093/bib/bbm048 382-392. (6), 3,3330 doi:10.1016/j.ygcen.2011.04.003 323-330. (3), 1,7-4 doi: 78-94. (1), 5,1277-1286. (5), 23 2,1619 doi:10.1093/bioinformatics/btu638 166-169. (2), M iifrais 15 bioinformatics, BMC doi:10.1186/1471-2105-15-211 uli cd eerh 32 research, acids Nucleic https://doi.org/10.1016/S0022-2836(05)80360-2 ilg fCrustacea of Biology https://doi.org/10.1006/jmbi.1997.0951 eoersac,14 research, Genome bioRxiv 2020.2002.2004.934711. , . uli cd research, acids Nucleic (suppl 1.Retrieved 211. , ) D115-D119. 1), 5,988-995. (5), olg,3 Zoology, Genome Current , Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. aiai . ohmt,K,Ngci . ooa . gr,Y,Ouo . th .(04.Efficient (2014). T. Itoh, . . reads. short . shotgun M., whole-genome Okuno, from Y., genomes heterozygous Ogura, highly A., of Toyoda, assembly H., novo Noguchi, de K., Toshimoto, resistance. R., insulin Kajitani, and Obesity (2000). S. doi:10.1172/jci10842 J. Flier, 473-481. Inter- Penaeus & B., of (2014). B. response Kahn, S. and Hunter, analysis Expression 159 . Biology, (2011). stress. ular A. salinity . to Tassanakajon, genes & 14-3-3 . S., monodon C., Pongsomboon, M., McAnulla, Kaeodee, W., Li, classification. M., function doi:10.1093/bioinformatics/btu031 Fraser, protein 1240. H.-Y., genome-scale Chang, 5: D., ProScan Binns, neuroendocrinology. crustacean P., gene, of Jones, homeobox history brief mouse A the it: of have 175 expression eyes Endocrinology, The of M. patterns P. spatial Hopkins, and Developmental (1987). evolution M. Hox2.1. The L. (2016). C. B. lobster. Haug, slipper Hogan, & a A., evolve P. Saad, to G., how Y Petit, or novel doi:10.1016/j.asd.2015.08.003 F., character, Palero, Six key S., a (2013). Charbonnier, (2008). of D., Z. R. Audo, Tu, J. T., & J. Wortman, V., Haug, females. I. . and Sharakhov, males V., sequencing . independently M. 14 by Sharakhova, discovered Genomics, . mosquitoes V., Assemble Anopheles J., Timoshevskiy, in to Y., Orvis, genes Program chromosome Qi, E., B., the J. A. and Allen, Hall, EVidenceModeler M., Pertea, using W., annotation Alignments. and Zhu, A. structure Spliced glycosylation L., gene of F. S. Highlights eukaryotic Leone, J.-M. Salzberg, Automated Petit, J., & & B. F., L., Dupuy, Haas, myogenesis. A., F. in Maftah, involved C. J., genes Saliba, Fontes, related A., adhesion Silva, C., Spermine. Da (Decapoda, and J. V., ornatus Spermidine Grassot, Callinectes McNamara, Amines Crab, Biogenic L., Blue the 244 the by J. Biology, Hydrolysis of in Membrane ATP Franca, Pathway Gills of PI3K Posterior N., Modulation The the Brachyura): in M. (2017). Activity T. Na+,K+-ATPase Lucena, R. (2011). P., Abraham, & D. S., Garcon, Bagrodia, D., B. Hopkins, Disease. H., Human Chiu, crustacean? A., a (2015). D. A. feminise Fruman, Bateman, you Can . (2008). . T. future. the . A. sustainable L., for Ford, A. more tool Mitchell, a J., computational towards Mistry, a R., doi:10.1093/nar/gkv1344 database: S. CAFE: D279-D285. Eddy, families Y., protein (2006). R. Pfam W. Eberhardt, P., The M. Coggill, Hahn, D., R. & (2013). Finn, P., J. evolution. J. Korlach, family Demuth, . gene N., of . Cristianini, study . T., C., Bie, Whole-genome data. Heiner, De sequencing J., benthic (2014). SMRT Drake, J.-N. a long-read A., Volff, from to A. assemblies Klammer, adaptation 10 . genome P., and microbial Marks, . finished evolution H., Nonhybrid, D. chromosome . Alexander, sex P., C.-S., ZW Zhang, Chin, into G., insights Liu, provides Q., Huang, flatfish lifestyle. C., a Shao, of G., sequence Zhang, S., Chen, 6,5359 doi:10.1038/nmeth.2474 563-569. (6), eeomn,99 Development, aueGntc,46 Genetics, Nature 7-7.doi:10.1186/1471-2164-14-273 273-273. , el 170 Cell, 4,2421 doi:10.1016/j.cbpb.2011.05.004 244-251. (4), 3,357-366. (3), eoeBooy 9 Biology, Genome 1,92.doi:10.1007/s00232-011-9391-5 9-20. (1), 4,603-617. (4), 4,605-635. (4), 3,253-260. (3), iifrais 22 Bioinformatics, oprtv iceityadPyilg -iceity&Molec- & B-Biochemistry Physiology and Biochemistry Comparative 1,R-7 doi:10.1186/gb-2008-9-1-r7 R7-R7. (1), M eois 15 Genomics, BMC 1) 2917.doi:10.1093/bioinformatics/btl097 1269-1271. (10), 24 qai oiooy 88 Toxicology, Aquatic rhoo tutr eeomn,45 Development, & Structure Arthropod iifrais(xod nln) 30 England), (Oxford, Bioinformatics ora fCiia netgto,106 Investigation, Clinical of Journal 1,621. (1), uli cd eerh 44 research, acids Nucleic 4,316-321. (4), eea Comparative & General eoeresearch, Genome aueMethods, Nature 2,97-107. (2), ora of Journal 9,1236- (9), (D1), BMC (4), Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. v . hn,D,Lu . i .(06.Eet fslnt clmto n ysakalto nN , for + correlate Na molecular on :a ablation trituberculatus eyestalk Portunus and of acclimation gill salinity the of in Effects expression gene (2016). for cotransporter J. Identification - Li, Marker 2Cl & and , P., Mapping + Liu, K QTL D., Zhang, (2018). tritubercula- J., (Portunus J. Crab Lv, Li, Swimming & the P., in Liu, System L., Determination tus). Sex Song, XY P., Indicating trituber- Huan, Osmoregulation. Portunus Sex-Determining: D., of of Basis Sun, Analysis Molecular for J., Transcriptome the Lv, loci (2013). into J. trait Insights Li, quantitative Provides & Stress of P., 8 Salinity Chen, Identification ONE, to B., Response Gao, (2015). in Y., Y. Wang, culatus P., Du, Liu, & J., Lv, B., trituberculatus. Gao, Portunus crab F., doi:10.1111/are.12239 swimming Zhao, 860. the P., in Liu, traits J., clustering growth-related Li, up assembly speeds novo L., significantly de redundancy and Liu, some sequence Tolerating The databases. (2002). (2010). protein A. J. Wang, large Godzik, & of . L., . Jaroszewski, . W., J., Li, Cai, Overex- genome. L., by panda He, Activation H., giant Receptor Zhu, the Insulin G., of of Tian, Cells. W., Suppression Fan, Hepatoma (1996). R., in Li, J. LAR B. Phosphatase Goldstein, Protein-Tyrosine & the eukaryotic R., of for W. groups pression ortholog Zhang, of M., identification P. OrthoMCL: Li, (2003). S. D. Roos, & Jr., genomes. Processing, J., Data C. Project Stoeckert, Genome L., Li, . . transform. . SAMtools. Burrows-Wheeler N., and doi:10.1093/bioinformatics/btp352 Homer, format with Alignment/Map 2078-2079. J., alignment Sequence Ruan, The T., read Fennell, (2009). A., short S. Wysoker, accurate B., Handsaker, and H., Fast Li, (2009). 25 R. England), (Oxford, Durbin, Bioinformatics without & or with H., data RNA-Seq Li, from quantification transcript accurate genome. RSEM: reference RNAs (2011). messenger a N. and C. RNAs Dewey, non-coding & long B., predicting Li, for scheme. tool k-mer a improved PLEK: an (2014). Crustacea. on Z. of Zhou, based & DNA J., nuclear Zhang, A., and Li, Chromosomes (1995). D. 27 Reproduction, D. Timetrees, Invertebrate Timelines, Noel, for & Resource P., A TimeTree: Lecher, (2017). B. S. Times. Hedges, & Divergence protein- M., and the Suleski, assess G., Stecher, CPC: machine. S., (2007). vector Kumar, G. support Gao, biological and & of features L., number sequence Wei, S.-Q., growing using 35 Zhao, transcripts the X.-Q., of of Liu, potential track Z.-Q., coding Ye, research. Keeping Y., biomedical (2015). Zhang, in P. L., require- partners Kong, H. interaction memory Spaink, its low & and with chitin J., aligner of Stougaard, spliced functions V., fast E. a HISAT: B. Koch, (2015). L. S. Salzberg, & ments. resource B., reference Langmead, a as D., KEGG Kim, (2016). annotation. M. Tanabe, protein & and M., Furumichi, gene M., for Kawashima, Y., Sato, M., Kanehisa, 24 WbSre su) 35W4.doi:10.1093/nar/gkm391 W345-W349. issue), Server (Web doi:10.1101/gr.170720.113 1384-1395. (8), rnir nGntc,9 Genetics, in Frontiers aueMtos 12 Methods, Nature 1) 815 doi:10.1371/journal.pone.0082155 e82155. (12), eoersac,13 research, Genome M iifrais 12 bioinformatics, BMC oeua ilg n vlto,34 Evolution, and Biology Molecular aue 463 Nature, iifrais 18 Bioinformatics, 4,3730 doi:10.1038/nmeth.3317 357-360. (4), 37.doi:10.3389/fgene.2018.00337 (337). 2,85-114. (2), 9,27-19 doi:10.1101/gr.1224503 2178-2189. (9), uli cd eerh 44 research, acids Nucleic M iifrais 15 bioinformatics, BMC 1) 7416.doi:10.1093/bioinformatics/btp324 1754-1760. (14), 77) 1-1.doi:10.1038/nature08696 311-317. (7279), 2-2.doi:10.1186/1471-2105-12-323 323-323. , 1,7-2 doi:10.1093/bioinformatics/18.1.77 77-82. (1), 25 D) 47D6.doi:10.1093/nar/gkv1070 D457-D462. (D1), 7,11-89 doi:10.1093/molbev/msx116 1812-1819. (7), 1,3131 doi:10.1186/1471-2105-15-311 311-311. (1), iifrais(xod nln) 25 England), (Oxford, Bioinformatics lcbooy 25 Glycobiology, ellrSgaln,8 Signalling, Cellular qautr eerh 46 Research, Aquaculture nentoa ora of Journal International uli cd research, acids Nucleic 5,469-482. (5), 7,467-473. (7), 4,850- (4), PLoS (16), Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. oteSxCrmsmsi lgsadnie eHa n rohi aoiu eHa Wt lt VIII Plate (With Haan de japonicus Reference Eriocheir Special and with Haan Crustacea, de Decapod Textfigures). dentipes the 7 Plagusia in and Heterogamety in Male Sex-Chromosomes of the Problem to the gene The between regulatory synteny (1937). Conserved male H. G. determination. the Niiyama, Goodwin, sex includes avian . 9 on . chromosome (re)view human . comparative and F., a proteases- Tzner, chromosome uuml, a sex Gr, is Z Z., chitinase chicken Shan, mammalian E., Acidic Zend-Ajusch, I., Onuki. is Nanda, system. terra, . digestive gene, . mouse doublesex-related . in Drosophila Miyazaki, glycosidase A Haruko, resistant Kimura, (1999). Masahiro, transport S. Ohno, Lin, ion Misa, & gill vertebrates. B., Yuan, in and somitogenesis H., patterns Tang, in involved B., osmoregulatory Moore, of A., Evolution Meng, M. (2012). DePristo, 182 Physiology, C. review. DNA Environmental . a S. and next-generation Crustacea: Faria, decapod analyzing . the & for in mechanisms . framework C., A., MapReduce J. Kernytsky, a McNamara, K., Toolkit: Cibulskis, Analysis A., Genome Sivachenko, data. The sequencing E., a Banks, is (2010). M., DMY A. genes. Hanna, (2002). box S. homeo A., Nobuyoshi, mouse McKenna, two . of . fish. expression medaka . Region-specific the K., W. in Tohru, development McGinnis, M., male Chika, for flowers. S., required the Tadashi, gene and S., DM-domain - Ai, Y-specific bees N., Yoshitaka, the M., and Masaru, birds The (2002). J. D. Martins, Methods. Sequencing DNA 9 Next-Generation identi- Genetics, pathway (2008). and R. annotation E. vocabulary. genome Mardis, controlled Automated a (2005). as induces (KO) L. Wei, phosphatase Orthology & doi:10.1093/bioinformatics/bti430 tyrosine KEGG G., protein the J. using LAR Olyarchuk, fication of T., ab Knock-down Cai, X., source J. Mao, G. open two Sale, GlimmerHMM: & and P., resistance. TigrScan C. insulin Hodgkinson, (2004). A., L. Mander, S. Salzberg, & gene-finders. M., eukaryotic Pertea, initio H., W. differentiation. terminal Majoros, myoblast during regulation levels developmental mRNA Noncoordinate fibronectin 195 W. O. and Dyn, Blaschuk, integrin, & N-CAM, C., P. N-cadherin, Holland, of N., Bardeesy, D., C. Maccalman, trait. salt-tolerant iym,H 15) NX- E-EHNS NTEML FADCPDCRUSTACEA DECAPOD A OF MALE THE IN SEX-MECHANISM XX-Y AN BENEDICT. PRINCEPS CERVIMUNIDA (1959). H. HAAN). Niiyama, (de sanguineus Hemigrapsus 14 shore-crab, Genetics, the of of chromosomes Journal X-Y The (1938). H. Niiyama, ar,G,Banm . of .(07.CGA ieiet cuaeyantt oegnsi eukaryotic in genes core annotate accurately to pipeline Drosophila. a CEGMA: (2007). in I. genomes. Korf, GeneID & K., (2000). Bradnam, G., Parra, R. Guig´o, & E., doi:10.1101/gr.10.4.511 Blanco, G., Parra, iifrais 23 Bioinformatics, 2,127-132. (2), 1,3742 doi:10.1146/annurev.genom.9.081307.164359 387-402. (1), eoersac,20 research, Genome 579 elSrs Chaperones & Stress Cell 北 海 1) 0-3028. (14), 道 34-38. , 帝 9,16-07 doi:10.1093/bioinformatics/btm071 1061-1067. (9), 大 iifrais 20 Bioinformatics, 理 部 要 8,9711.doi:10.1007/s00360-012-0665-8 997-1014. (8), 5 , 9,19-33 doi:10.1101/gr.107524.110 1297-1303. (9), eeomn,126 Development, 7,1077-1081. (7), 北 1-8. , 海 道 ora fCmaaiePyilg -iceia Systemic B-Biochemical Physiology Comparative of Journal 1) 8827.doi:10.1093/bioinformatics/bth315 2878-2879. (16), 大 水 26 yoeei eoeRsac,89 Research, Genome & Cytogenetic 部 研 究 6,1259-1268. (6), 10 , 2,106-112. (2), nulRve fGnmc n Human and Genomics of Review Annual eoersac,10 research, Genome iifrais 21 Bioinformatics, cec,235 Science, aue 417 Nature, 49) 1379-1382. (4794), < 12,67-78. (1-2), i 1) 3787-3793. (19), 68) 559-563. (6888), > DMRT1: 4,511-515. (4), Japanese < Dev /i > Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. rpel . ahe,L,&Slbr,S .(09.Tpa:dsoeigslc ucin ihRNA-Seq. with junctions splice discovering TopHat: (2009). L. S. Salzberg, in & 25 system Bioinformatics, L., chromosome Pachter, sex C., X1X1X2X2/X1X2Y Trapnell, a reveals species chromosome Palaemon sex elegans. Y marine Palaemon 2 of X 1 analysis 2/X cytogenetic X 2 Gonz´alezorteg´on, X A., Gonz´aleztiz´on, Mart´ınezlage, Comparative Perina, 1 & Z., (2017). A., X Torrecilla, E., M. 1 A. X a elegans. reveals Palaemon species in Palaemon system marine resources. of and analysis cytogenetic knowledgebase rative Ontology Gonz´alez-Orteg´on, A., Gonz´alez-Tiz´on, Compa- Gene Mart´ınez-Lage, Perina, & (2017). Z., A., the E., M. Torrecilla, Assembly A. Genome of High-Quality Expansion 45 (2020). research, Y. (2016). acids Li, Consortium. Nucleic . Ontology Evolution. . . Genome Gene (2019). X., Unique T. The Li, Its J. S., Reveals Jiang, Lanner, sinensis H., . japonica Zhang, Eriocheir . Q., of . Liu, arthritis. Z., E., Wang, rheumatoid Ahlstrand, B., in Tang, J., weakness A. muscle Cheng, skeletal that K., promote eukaryotes doi:10.1172/jci.insight.126347 Olsson, 16. actin in B., on prediction Aresh, hotspots M., gene Oxidative Persson, for M., server M. web Steinz, a AUGUSTUS: (2005). constraints. B. user-defined Morgenstern, allows & M., thousands Stanke, with models. analyses mixed phylogenetic provides and analysis likelihood-based taxa maximum transcriptomic sinensis. of RAxML-VI-HPC: Comparative Eriocheir (2006). in (2015). lifestyle A. X. benthic Stamatakis, Li, to & adaptation Y., and brachyurization Li, 558 of Y., basis Gene, Liu, molecular BUSCO: M., the (2015). Hui, into insights Z., M. Cui, E. C., Zdobnov, Song, & V., orthologs. E. single-copy Kriventseva, with doi:10.1093/bioinformatics/btv351 completeness P., 3210-3212. annotation convergence. Ioannidis, and of assembly M., genome example assessing R. Waterhouse, prime (2013). A., a U. Sim˜ao, F. of J. Jung, deconstruction & and M., L. history Silva, metabolism. - lipid crabs and 83 Zoology, glucose of to of Evolution Contributions regulation (2014). the and G. signalling Scholtz, Insulin (2001). R. C. 414 Kahn, Nature, & correction. R., error A. read Saltiel, long efficient and accurate LoRDEC: (2014). 30 E. Rivals, & L., Salmela, pro- aquaculture pathogen- exposure. to ammonia their implications elevated and crustaceans: with interactions caspases decapod 334 and in three improvement Osmoregulation potential of (2012). for S. Characterization methods ductivity, C. (2017). Zeng, J. & geno- N., Li, large Romano, & in trituberculatus. P., Portunus families Liu, in repeat pattern B., of expression Gao, identification induced novo X., De Yu, (2005). X., A. Ren, P. Pevzner, Tolerance. & Glucose C., and N. mes. Mass Jones, Muscle L., Adult A. to Price, Growth Fetal of Relation 12 (1995). Medicine, W. I. D. Phillips, 2) 5631.doi:10.1093/bioinformatics/btu538 3506-3514. (24), iifrais 21 Bioinformatics, 22.doi:10.1016/j.aquaculture.2011.12.035 12-23. , 1,8-8 doi:10.1016/j.gene.2014.12.048 88-98. (1), 66) 9-0.doi:10.1038/414799a 799-806. (6865), 8,686-690. (8), rnir nZooy 14 Zoology, in Frontiers 9,10-11 doi:10.1093/bioinformatics/btp120 1105-1111. (9), (suppl iifrais 22 Bioinformatics, D) 31D3.doi:10.1093/nar/gkw1108 D331-D338. (D1), rnir nZooy 14 Zoology, in Frontiers 2,87-105. (2), uli cd eerh 33 research, acids Nucleic ) 31i5.doi:10.1093/bioinformatics/bti1018 i351-i358. 1), uohg n Immunity and Autophagy 1,47. (1), 2) 6829.doi:10.1093/bioinformatics/btl446 2688-2690. (21), 27 1,47. (1), ih&Sels muooy 66 Immunology, Shellfish & Fish (suppl . ) 45W6.doi:10.1093/nar/gki458 W465-W467. 2), rnir nGntc,10 Genetics, in Frontiers iifrais 31 Bioinformatics, c nih,4 Insight, Jci 189-197. , Bioinformatics, Aquaculture, 1340. , Diabetic (19), (9), Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. hn,X,Ya,J,Sn . i . a,Y,Y,Y,...Xag .(09.Pnedsrm geno- shrimp Penaeid (2019). J. Xiang, . . . Y., Yu, Y., Gao, molting. AnimalTFDB frequent S., (2015). and A.-Y. Li, adaptation Guo, benthic Y., doi:10.1038/s41467-018-08197-4 . into Sun, insights . provides J., . me Yuan, W., factors. transcription X., Liu, X., Zhang, of study Zhang, functional S., and prediction Song, 43 expression, research, C.-J., for Liu, resource a 105 T., America, 2.0: Liu, of laevis. W-linked States H.-M., A Xenopus United Zhang, M. in the Ito, development of . ovary Sciences . primary . of C., Academy in Nishida-Umehara, National participates Y., Uno, DM-W, K., gene, Tamura, H., DM-domain Umemoto, E., Okada, S., the Yoshimoto, Clarifies Likelihood. Sequence Maximum Genome by Clam Analysis (2019). Phylogenetic 24 4: D. PAML Li, (2007). . Z. . Yang, Diversity. . Color L., Shell Yan, Extraordinary Z., and Li, Adaptation doi: Benthic J., Its Ding, of Z., Basis Huo, Molecular H., Nie, X., Yan, immunity. and metabolism LTR linking (2007). mTOR, posons. H. Wang, (2012). & R. Z., Xu, Ahmed, & K., Araki, 24 L., Immunology, exposed Ye, trituberculatus Portunus X., crab swimming Xu, the of profiles Their expression stress. Gene and (2011). salinity Genes Y. to Liu, Hox & (2015). H., Q. X. Xu, Jianhai, & L., Fuhua, vannamei. Litopenaeus Z., of 045 Xiaojun, Development Early Y., in Jianbo, Pattern mitten Expression W., Chinese Jiankai, the S., crabs. of Xiaoqing, of response biology antibacterial (1977). the F. G. to Warner, linked from variants genes genetic Apoptosis nm23: of crab. and annotation Caspase functional (2014). ANNOVAR: Q. K. Wang, (2010). S. H. Young, Hakonarson, . data. & sequencing . M., high-throughput . Li, improvement. S., assembly K., Sakthikumar, genome Wang, A., and detection Abouelliel, variant M., microbial 9 comprehensive Priest, ONE, for PLoS T., proteins. tool Shea, integrated T., an repressor Pilon: Abeel, J., zinc-finger B. Walker, KRAB-containing (2003). crabs doi:10.1186/gb-2003-4-10-231 euryhaline R. 13 of gills Urrutia, the in Na+,K+-ATPase and H+-ATPase acclimation. V-type salinity (2007). during (2010). H.-C. L. Lin, Pachter, & switching . J.-R., isoform . Tsai, and . J., transcripts M. unannotated Baren, reveals differentiation. van cell RNA-Seq G., Kwan, by during A., quantification Mortazavi, and G., assembly Pertea, Transcript A., B. gene Williams, Differential Cufflinks. C., (2012). Trapnell, and L. TopHat Pachter, . with . experiments . doi:10.1038/nprot.2012.016 RNA-seq R., 562-578. D. of Kelley, (3), analysis D., Kim, expression G., transcript Pertea, and L., Goff, A., Roberts, C., Trapnell, 8,18-51 doi:10.1093/molbev/msm088 1586-1591. (8), https://doi.org/10.1016/j.isci.2019.08.049 08,52-62. (008), iegnee us 2 Bugs, Bioengineered uli cd eerh 35 research, acids Nucleic Dtbs su) 7-8.doi:10.1093/nar/gku887 D76-D81. issue), (Database (11). 6,429-435. (6), aieBooy 158 Biology, Marine aueBoehooy 28 Biotechnology, Nature ora fEprmna ilg,210 Biology, Experimental of Journal 3,174-177. (3), IDR necetto o h rdcino ullnt T retrotrans- LTR full-length of prediction the for tool efficient an FINDER: uli cd eerh 38 research, acids Nucleic WbSre su) 25W6.doi:10.1093/nar/gkm286 W265-W268. issue), Server (Web 1) 1127.doi:10.1007/s00227-011-1721-8 2161-2172. (10), air eBooi aie 19 Marine, Biologie de Cahiers 28 5,5155 doi:10.1038/nbt.1621 511-515. (5), 1) 14e6.doi:10.1093/nar/gkq603 e164-e164. (16), 4,6067 doi:10.1242/jeb.02684 620-627. (4), eidclo ca nvriyo China, of University Ocean of Periodical 7,2469-2474. (7), aueCmuiain,10 Communications, Nature 1,115-115. (1), oeua ilg n Evolution, and Biology Molecular eoeBooy 4 Biology, Genome Sine 19 iScience, auePooos 7 Protocols, Nature rceig fthe of Proceedings uli acids Nucleic eiasin Seminars 1225-1237. , 1,356. (1), 1) 8. (10), Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. adaptation-and-sex-determination level-genome-of-portunus-trituberculatus-provides-insights-into-its-evolution-salinity- 2.pdf Fig file Hosted Arthri- Rheumatoid ETS1) with ( Patients Specific-1 Transformation- Chinese E26 in (2014). Polymorphisms P. R. WDFY4) Liu, ( & 4 C., Member Zhuang, tis. H., Family Zhang, WDFY L., and Bo, Q., Y. Zhang, fws npruutiuecltsfo i-utr od fpan n crabs. and of effect ponds lethal mud mix-culture of from study from Quantitative gene trituberculatus (2008). Sinica G. 70 portunu Songye, on protein & H., wssv shock Weixian, of X., heat Wenjun, W., cytosolic Zhizheng, W., a Zhongfa, of analysis expression serrata. and Scylla cloning crab Molecular M. Zhong, nentoa ora fMlclrSine,15 Sciences, Molecular of Journal International 2,90-95. (2), vial at available ih&Sels muooy 34 Immunology, Shellfish & Fish https://authorea.com/users/348321/articles/473714-a-chromosome- 2,21-71 doi:10.3390/ijms15022712 2712-2721. (2), 29 5,1306-1314. (5), caooi tLimnologia et Oceanologia Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. 30 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. 31 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. 32 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. 33 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. 34 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. 35 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. 36 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. *contig Contig Contig contig153.39 475 153 475 . . . 11 28 9 37 *Contig475.8 *Contig contig153.40 Contig 475 475 . 10 . 13 Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary. A B

1003

1093

1183 1294 AAATTCCGCACACAGTCTTCATCTGCTTGCAGGAAACTGGTAACGCGATAAATTGCCTTGCGTTCAAACTATGTAACGCTACATATGATTGAATATGTTTTGACAAGGTTTGTAGAGTAA 1414 TCCTTGCAATGTTTAGATAAAGAAACAACATAATCAGATATATAAATGTTGACTGAAAATGCTCAGTTAGTAAGAATACAATGATAACGTCTACCTACGAATAACCCTCGCTAA 1534 TTTTCCTCATAGAGACAACCACTCTTATAATAAGTAAACAATAGGTCTTTAAAGAGATAAAACTAAAAAAAAAAAAAATATATCAACCGGGATGCTACCTTCAGGAATTTATGCTGATTA 1654 GATGATCGTTCATTCTATGAGCTAAAGCCAAATTGAATTTCTTGATGTCAGATATCCAATTTGAAAATAGAACAGCACCCCTTACCCAGCGAGCAACTGTGTGGGTTAATGTTTGTGTCG 17 1894 2014 2134 ATAACAACAATAATAATAATAATAATAATAATAATAATAATAATAATAATAATAATGATAATAATAATAATAATAATAATAATAATAATAATGATAATAATAATAATAATAATAATAACA 2254 A 2374 CACGTACGCCTCCTGCCTCTTGTTTGCTCGGTTTACACTTGTTGTGTAGTGTGCTACCGTAAATACACTTTCATAATACCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

121 CCCCATCGTGCCCTGCCGTCAGTGTCCTGTCACCCTTAATATAAAATAACTTACCTAGTCAGAAATTTATCTCGCATCAGTTGTATGTATAGTGACAATCCACCAGGAATACACCAATCC 241 ACCGTCTACTTATCAGCTGCTACAACCATAACTATTTTGATAATGTGTTTTGGTAGTGATCGATGAGTGTGTCAGACAAGTGACTTCTTTACATGTTAATTGTTGGCATAGTGACTCTGC

361

463

553

643

733

823

913

74 GGGTGTGCAGGATTACAATTATCTATTGCAATGGAGCGAGGCGCCAGTTAAGCGACCACCTGCATTACATAGATTTGATATAAATCTTATTCACAACAATTACTTAAACAATAGTGGCAG Contig Contig Ptdmrt

1 CTACTTCTCCTAAAGCACCGTGTCACTGTTGCCATATGGAGCATGTGGTGATGTCGAACGCTGCTGCTCCACCTATTTCGTAAGACGGGCTGGCTTTCTTGGTGATGACATCTCTGATCA

CCAAGACGCATGTAAGATGCCGTACATTTAGCAGCTGACACCACTTACCGCTGGACTAGGGAAGGCATACAGGCTGTGAAAAGTGGCACTAGGATTATACTCCATTCGCCGATATCCAAT

CCACCATCTCTTCATTTCCGTGTACTCGACAGGTGAGGTAAGGTTGTCAGC ATG GCG TCG GCG CCT AAG AAG AGA GAC TTT TTT AAT GGA TCA CAA AGA CT AGA CAA TCA GGA AAT TTT TTT GAC AGA AAG AAG CCT GCG TCG GCG ATG CCACCATCTCTTCATTTCCGTGTACTCGACAGGTGAGGTAAGGTTGTCAGC

AAC TTA ACG GAG ACA CAG GAA GAG AGT ATG GTT CCT GAG CAA AAT GTC ACC AGT GGA GGA AGA AAT GGC TTC TCT TCC TCC CAG GAT TT GAT CAG TCC TCC TCT TTC GGC AAT AGA GGA GGA AGT ACC GTC AAT CAA GAG CCT GTT ATG AGT GAG GAA CAG ACA GAG ACG TTA AAC

CTT GGG CAC TCT TCA GGC GAC TGT CCT TTC CCA TCT TTC TCC AAT AGA CAA AGT TCT CCT GTG TAC AAT TGT CAA GAC AGC TGC CCA TC CCA TGC AGC GAC CAA TGT AAT TAC GTG CCT TCT AGT CAA AGA AAT TCC TTC TCT CCA TTC CCT TGT GAC GGC TCA TCT CAC GGG CTT

CTC TCT TCA CGA TAT AGT ATC CGA ACG AAA AGG GAC GAA TAC GAA AAC TCC AAA GGC ACT TCA TAC GAG TCT AAT TCA GTT CCA GAA GAA CCA GTT TCA AAT TCT GAG TAC TCA ACT GGC AAA TCC AAC GAA TAC GAA GAC AGG AAA ACG CGA ATC AGT TAT CGA TCA TCT CTC

TTA TCT TCG GCG GGC TCC GCT GAA GTT GCT TCC TTA GCC AAG CAC GAT GGA AAC TAC GAA AGG AAG AAG AAA GTG GAA AAG AGA AA AGA AAG GAA GTG AAA AAG AAG AGG GAA TAC AAC GGA GAT CAC AAG GCC TTA TCC GCT GTT GAA GCT TCC GGC GCG TCG TCT TTA

AGG TGT CGA TTG TGC GCC AAT CAC GGC AAG TAC GAA GAG ATT AAA GGT CAC AAG TGG TAC TGC GAA TAC AGG AAA CCA CAA CAC AAG TG AAG CAC CAA CCA AAA AGG TAC GAA TGC TAC TGG AAG CAC GGT AAA ATT GAG GAA TAC AAG GGC CAC AAT GCC TGC TTG CGA TGT AGG

TCC CTG TGT GAA ATC ACC CAC AAG AAA CGA CTC TTC CTA CCG AAA AAT GAG ATC AGA CGC AAA CAG CAT GAC CAA GAG CAG CAG CTA CA CTA CAG CAG GAG CAA GAC CAT CAG AAA CGC AGA ATC GAG AAT AAA CCG CTA TTC CTC CGA AAA AAG CAC ACC ATC GAA TGT CTG TCC

CAA CAA TTA AAC GTG CAG AAC CGA TCG ACA GAT GAA CCA TGG CTG GAT GGT TAT GGT CGG GTC GGC TTA CCA CCC TCG CCA ACT ACC ACC ACT CCA TCG CCC CCA TTA GGC GTC CGG GGT TAT GGT GAT CTG TGG CCA GAA GAT ACA TCG CGA AAC CAG GTG AAC TTA CAA CAA

CAC ATA GAC TTC CCC CGG CTA CAG GAA CTA GTA GAA GAG ACA GCA TCT ATC TTG GAC GAC GAG GAT TTG TTC CGC CAA ATC AAT GA AAT ATC CAA CGC TTC TTG GAT GAG GAC GAC TTG ATC TCT GCA ACA GAG GAA GTA CTA GAA CAG CTA CGG CCC TTC GAC ATA CAC

ATC CCG TTG AAT GTC CTT CAA CAT TGA TTACCTACTGTCCTCCCTAATCTACCTACCTGGTCCAGCGGAAGAATCGATGCTTGTAAAGGGATGCAGCTATTTC TGA CAT CAA CTT GTC AAT TTG CCG ATC

CCACGTGCGCATCCATACTAACAAAGATGATTATTATATGCCACACCGAGGGTCTTCTCTGAGACTGAAGTATTATTTTTACCAGTGATTAATTCTCATAACTACTAAATCTTAAAGTG

L E Q E M P Q V S G N F S Q D Q S S S F G N R G G S T V N Q E P V M S E E Q T E T L N

L G H S S G D C P F P S F S N R Q S S P V Y N C Q D S C P P C S D Q C N Y V P S S Q R N S F S P F P C D G S S H G L

S R S R K D Y N K T Y S S P E P V S N S E Y S T G K S N E Y E D R K T R I S Y R S S L

S A S E A L K D N E K K V K K K R E Y N G D H K A L S A V E A S G A S S L

C L A H K E I G K Y E R P H K H Q P K R Y E C Y W K H G K I E E Y K G H N A C L R C R

L E T K R F P N I R R I E N K P L F L R K K H T I E C L S

Q N Q R T E W D Y R G P S T T T P S P P L G V R G Y G D L W P E D T S R N Q V N L Q Q

I F R Q L E T S L D D F Q N E N I Q R F L D E D D L I S A T E E V L E Q L R P F D I H

I

TAATAACAATGATGATGATACTACTACTATGTTAGCATGTCCTTTATTTTCCCATACTCCTTTTCTGTCTGACGTCACGTCTTTCCCCTCAGTCCTCGCTCGCTGCCGCCCGAGGTTGA

P

L cDNA 475 475 DNA

V Q * * H Q L V N

38

M

A

A A S

P

K

K Q H D Q E Q Q L L Q Q E Q D H Q K

K

R

F N S R L R Q S G N F F D

K K K R K E

CACGTGC

TCGGAT

A CA A

G CG G

GA

TA

T

A

G

C

C

T

C

A

C

F

A

S

C

Q

C

R

E

Q

Y

Posted on Authorea 3 Aug 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.159646761.15797764 — This a preprint and has not been peer reviewed. Data may be preliminary.

Relative expression of Ptdmrt Female 39 Male Control Ptdmrt Control Ptdmrt Control Ptdmrt Control Ptdmrt