ARTICLE
DOI: 10.1038/s41467-017-00653-x
OPEN

Convergent evolution of Y chromosome gene content in flies

Shivani Mahajan1 & Doris Bachtrog1

Sex-chromosomes have formed repeatedly across Diptera from ordinary autosomes, and X-chromosomes mostly conserve their ancestral genes. Y-chromosomes are characterized by abundant gene-loss and an accumulation of repetitive DNA, yet the nature of the gene repertoire of fly Y-chromosomes is largely unknown. Here we trace gene-content evolution of Y-chromosomes across 22 Diptera species, using a subtraction pipeline that infers Y genes from male and female genome and transcriptome data. Few genes remain on old Y-chromosomes, but the number of inferred Y-genes varies substantially between species. Young Y-chromosomes still show clear evidence of their autosomal origins, but most genes on old Y-chromosomes are not simply remnants of genes originally present on the proto-sex-chromosome that escaped degeneration, but instead were recruited secondarily from autosomes. Despite almost no overlap in Y-linked gene content in different species with independently formed sex-chromosomes, we find that Y-linked genes have evolved convergent gene functions associated with testis expression. Thus, male-specific selection appears as a dominant force shaping gene-content evolution of Y-chromosomes across fly species.

1 Department of Integrative Biology, University of California Berkeley, Berkeley, California 94720, USA. Correspondence and requests for materials should be addressed to D.B. (email: [email protected])

NATURE COMMUNICATIONS | 8: 785 | DOI: 10.1038/s41467-017-00653-x | www.nature.com/naturecommunications

X and Y chromosomes are involved in sex determination in many species1. Sex chromosomes are derived from ordinary autosomes, yet old X and Y chromosomes contain a vastly different gene repertoire. In particular, X chromosomes often closely resemble the autosome from which they were derived, with only a few changes to their gene content2. In contrast, Y chromosomes dramatically remodel their gene repertoire3–5. Y evolution is characterized by massive gene decay, with the vast majority of the genes originally present on the Y disappearing, and Y degeneration is often accompanied by the acquisition of repetitive DNA4. Old Y chromosomes typically contain only a few genes, and some lineages have lost their Y chromosome entirely6. The ultimate cause for Y degeneration is a lack of recombination on Y chromosomes, which renders natural selection inefficient4. However, while X chromosomes have been characterized and sequenced in many species, much less is known about Y gene content evolution beyond these very general patterns. Labor-intensive sequencing of Y chromosomes in a few mammal species has revealed a surprisingly dynamic history of Y chromosomes, with palindromes retarding Y degeneration in primates7, or meiotic conflicts driving gene acquisition on the mouse Y8. However, the repeat-rich nature of Y chromosomes has hampered their evolutionary studies in most organisms.

Dipteran flies have multiple independent originations of sex chromosomes9. In particular, flies typically have XY sex chromosomes and a conserved karyotype consisting of six chromosomal arms (five large rods and a small dot; termed Muller elements A-F10). Interestingly, we recently showed that superficially similar karyotypes conceal the true extent of sex chromosome variation in Diptera: whole-genome analysis in 37 fly species belonging to 22 families identified over a dozen different sex chromosome configurations in flies based on gene content conservation of the X chromosome9. The small dot chromosome was repeatedly used as a sex chromosome, but we detected species with undifferentiated sex chromosomes, others in which a different chromosome replaced the dot as a sex chromosome or in which multiple chromosomal elements became incorporated into the sex chromosomes, and others yet with female heterogamety (ZW sex chromosomes)9.

However, no Y-linked genes were identified in our previous analysis, due to the difficulty in assembling genes from the often highly repeat-rich Y chromosome. Several Y-linked protein-coding genes in Drosophila melanogaster, for example, carry mega-base sized introns consisting of repetitive transposable element (TE) and satellite-derived DNA11, making it impossible to assemble them using next-generation sequencing approaches12, 13 (though the application of long-read PacBio technology has proven useful in assembling Y-linked genes and genomic regions in D. melanogaster14, 15). Intriguingly, most Y-linked genes in Drosophila are not simply remnants of genes present on the autosome that became the sex chromosome; instead, they all appear to have been acquired secondarily on the Y, after it evolved its male-limited transmission12, 13, 16, 17. Y-linked genes in D. melanogaster all have male-specific functions and have adapted testis-specific expression, which

between them, and the amount of sequence homology progressively declines for older fusions as Y chromosomes degenerate4, 22–24. This contrast enables us to infer the selective regime under which Y chromosomes evolve initially when still containing most of their ancestral genes, and their long-term evolutionary dynamics after most of their original genes have been lost. In particular, our sampling scheme allows us to compare

Step 1: Map male RNA-seq reads to female genome assembly
Step 2: Assemble transcriptome from unmapped male RNA-seq reads
Step 3: Map transcripts to female reference genome assembly. Discard transcripts (>90% length and >98% identity)
Step 4: Map female RNA-seq reads to unmapped transcripts. Discard transcripts (>50% length with reads ≤2 mismatches)
Step 5: Merge and extend transcripts
Step 6: Map male and female genomic reads separately. Discard transcripts (≤60% coverage in males and >10% coverage in females, for reads with ≤2 mismatches)
Step 7: Filter transcripts by male versus female expression. Discard transcripts (female FPKM > 0.5 times male FPKM)
Step 8: Build repeat libraries and map them to transcripts. Discard mapped transcripts (BLAT score <50)
Step 9: Filter transcripts by effective length to obtain final list. Discard transcripts (effective length < 0.6 × transcript length)

Fig. 1 Bioinformatic subtraction pipeline to infer Y-linked transcripts. Male RNA-seq reads are mapped to genomic scaffolds built from female genomic reads (Step 1); unmapped male RNA-seq reads are used to build a de novo transcriptome (Step 2), and transcripts that map either to the female genome assembly (Step 3) or to female RNA-seq reads (Step 4) are discarded. Remaining transcripts are merged (Step 5), and merged transcripts are kept only if they show mapping to male genomic reads and no mapping to female genomic reads (Step 6) and show expression in males but not females (Step 7). Transcripts that mapped to a de novo repeat library were discarded (Step 8), and only transcripts with an effective length (as calculated by the software eXpress) greater than 0.6 times the transcript length were kept in the final list (Step 9)

Fig. 2 CG41561, a new protein-coding gene on the D. melanogaster Y chromosome. a Intron/exon structure of CG41561 (grey are non-coding exons, green are coding exons). b Mapping of five male (blue) and five female (red) Drosophila genomic reads to CG41561 (for strain information see Supplementary Table 2). c Expression profile of CG41561 across developmental stages (top; samples are ordered by developmental time) and larval and adult tissues (bottom); colors in heatmap refer to expression level. CG41561 is first expressed in third instar larvae, and shows maximum expression in pupae and young adult males (it is not expressed in females). Across tissues, CG41561 is expressed in imaginal discs of third instar larvae and most highly in adult male testis. Expression profiles are taken from FlyBase. CNS, central nervous system; dig_sys, digestive system; fat, fat body; imag_disc, imaginal disc; saliv, salivary gland; acc_gland, accessory gland; 1d, 1-day; 4d, 4 days; 20d, 20 days
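The discard criteria listed for the subtraction pipeline in Fig. 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the read mapping, assembly, and quantification steps have already been run, and that their results are summarized per transcript in a hypothetical `TranscriptMetrics` record whose field names are invented here. Three simplifications are assumptions of this sketch: Steps 3 and 4 are applied to the same record as the later steps (in Fig. 1 they precede merging in Step 5), the Step 6 criterion is read as two separate grounds for discarding (low male coverage, or detectable female coverage), and the repeat-library hit of Step 8 is taken as a precomputed flag, so the BLAT score cutoff is not modelled.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class TranscriptMetrics:
    """Hypothetical per-transcript summary; all field names are assumptions."""
    name: str
    length: int                     # transcript length in bp
    female_asm_aligned_frac: float  # Step 3: fraction of length aligned to female assembly
    female_asm_identity: float      # Step 3: identity of that alignment (0 to 1)
    female_rna_covered_frac: float  # Step 4: fraction covered by female RNA-seq reads
    male_dna_covered_frac: float    # Step 6: fraction covered by male genomic reads
    female_dna_covered_frac: float  # Step 6: fraction covered by female genomic reads
    male_fpkm: float                # Step 7: expression in males
    female_fpkm: float              # Step 7: expression in females
    hits_repeat_library: bool       # Step 8: assumed precomputed repeat-library hit
    effective_length: float         # Step 9: effective length (e.g. as reported by eXpress)


def first_failed_step(t: TranscriptMetrics) -> Optional[int]:
    """Return the Fig. 1 step at which a transcript is discarded, or None if kept."""
    # Step 3: transcript is essentially present in the female genome assembly.
    if t.female_asm_aligned_frac > 0.90 and t.female_asm_identity > 0.98:
        return 3
    # Step 4: more than half of the transcript is covered by female RNA-seq reads.
    if t.female_rna_covered_frac > 0.50:
        return 4
    # Step 6 (assumed reading): too little male coverage OR too much female coverage.
    if t.male_dna_covered_frac <= 0.60 or t.female_dna_covered_frac > 0.10:
        return 6
    # Step 7: female expression exceeds half of male expression.
    if t.female_fpkm > 0.5 * t.male_fpkm:
        return 7
    # Step 8: transcript matches the de novo repeat library.
    if t.hits_repeat_library:
        return 8
    # Step 9: effective length is below 0.6 times the transcript length.
    if t.effective_length < 0.6 * t.length:
        return 9
    return None


def putative_y_linked(transcripts: List[TranscriptMetrics]) -> List[str]:
    """Names of transcripts that pass every filter."""
    return [t.name for t in transcripts if first_failed_step(t) is None]


if __name__ == "__main__":
    candidate = TranscriptMetrics(
        name="t1", length=1000,
        female_asm_aligned_frac=0.10, female_asm_identity=0.99,
        female_rna_covered_frac=0.0,
        male_dna_covered_frac=0.95, female_dna_covered_frac=0.0,
        male_fpkm=10.0, female_fpkm=0.0,
        hits_repeat_library=False, effective_length=800.0,
    )
    # Same transcript, but with substantial expression in females.
    female_expressed = replace(candidate, name="t2", female_fpkm=9.0)
    print(putative_y_linked([candidate, female_expressed]))
```

Returning the number of the first failed step, rather than a plain boolean, makes it easy to tabulate how many transcripts each filter removes, which is useful when the thresholds are varied.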