Received: 21 August 2018 | Revised: 21 October 2018 | Accepted: 25 October 2018 DOI: 10.1111/mec.14928

ORIGINAL ARTICLE

A fast‐evolving X‐linked duplicate of importin‐α2 is overexpressed in sex‐ratio drive in neotestacea

Kathleen E. Pieper1 | Robert L. Unckless2 | Kelly A. Dyer1

1Department of Genetics, University of Georgia, Athens, Georgia Abstract 2Department of Molecular Selfish genetic elements that manipulate gametogenesis to achieve a transmission Biosciences, University of Kansas, Lawrence, advantage are known as meiotic drivers. Sex‐ratio X chromosomes (SR) are meiotic Kansas drivers that prevent the maturation of Y‐bearing sperm in male carriers to result in Correspondence the production of mainly female progeny. The spread of an SR chromosome can af‐ Kathleen E. Pieper, Department of Genetics, University of Georgia, Athens, GA. fect host genetic diversity and genome evolution, and can even cause host extinction Email: [email protected] if it reaches sufficiently high prevalence. Meiotic drivers have evolved independently

Funding information many times, though only in a few cases is the underlying genetic mechanism known. National Institute of General Medical In this study we use a combination of transcriptomics and to Sciences, Grant/Award Number: R00GM114714 and T32GM007103 ; identify widespread expression differences between the standard (ST) and sex‐ratio Division of Environmental Biology, Grant/ (SR) X chromosomes of the Drosophila neotestacea. We found the X chromosome Award Number: 1149350 and 1457707 ; Society for the Study of Evolution; National is enriched for differentially expressed transcripts and that many of these X‐linked Institutes of Health, Grant/Award Number: differentially expressed transcripts had elevated K /K values between ST and SR, T32GM007103 and R00GM114714; a s National Science Foundation, Grant/Award indicative of potential functional differences. We identified a set of candidate tran‐ Number: 1149350 and 1457707; University scripts, including a testis‐specific, X‐linked duplicate of the nuclear transport gene of Georgia importin‐α2 that is overexpressed in SR. We find suggestions of positive selection in the lineage leading to the duplicate and that its molecular evolutionary patterns are consistent with relaxed purifying selection in ST. As these patterns are consistent with involvement in the mechanism of drive in this species, this duplicate is a strong candidate worthy of further functional investigation. Nuclear transport may be a common target for genetic conflict, as the mechanism of the autosomal Segregation Distorter drive system in D. melanogaster involves the same pathway.

KEYWORDS genetic conflict, meiotic drive, nuclear transport, spermatogenesis, testes

1 | INTRODUCTION but recent developments in engineering synthetic drive systems give understanding the dynamics and mechanisms of natural driving Genetic conflict occurs when one portion of the genome promotes systems a new urgency (Lindholm et al., 2016; Sandler & Novitski, its own transmission to the detriment of another portion of genome. 1957). Sex‐ratio (SR) meiotic drive involves selfish elements located Conflict is pervasive and potentially a major evolutionary force (Burt on the X chromosome that reduce the transmission of Y‐bearing & Trivers, 2006; Lindholm et al., 2016; Rice, 2013). Meiotic drive sperm in males. This form of drive is particularly interesting because is a type of genetic conflict where selfish genes manipulate game‐ it can bias the sex ratio of the offspring towards daughters, poten‐ togenesis to subvert Mendel’s law of equal segregation and make tially leading to population or species extinction (Carvalho & Vaz, their way into over 50% of gametes. The phenomenon itself and its 1999; Hamilton, 1967). Sex‐ratio drive is also the most common form potential evolutionary consequences have been known for decades, of chromosomal meiotic drive known, having evolved independently

Molecular Ecology. 2018;27:5165–5179. wileyonlinelibrary.com/journal/mec © 2018 John Wiley & Sons Ltd | 5165 5166 | PIEPER et al. dozens of times in dipterans (Jaenike, 2001). The sex chromosomes this species, as a large number of X‐linked genes were found to may be particularly prone to conflict (Hurst & Pomiankowski, 1991), be differentially expressed between the wild‐type and selfish X though ascertainment bias also likely contributes to the large num‐ chromosomes (Reinhardt et al., 2014). Though driving X chromo‐ ber of known sex‐ratio drive systems. somes are found at relatively high frequencies in this species, Though the first meiotic drive systems were discovered in the they appear to have low nucleotide variation and extremely lim‐ 1920 s and a variety are known today, only a few have been dissected ited recombination with ST, although recombination may occur mechanistically (reviewed in Burt & Trivers, 2006). Of the known SR between SR chromosomes (Christianson, Brand, & Wilkinson, drive systems, two have at least part of their genetic and mechanis‐ 2011; Paczolt, Reinhardt, & Wilkinson, 2017). Loci involved in tic basis identified. Both systems are found in Drosophila simulans drive or very closely linked to drive are expected to show pat‐ (Helleu et al., 2016; Tao, Araripe, et al., 2007). In the Paris system, terns similar to a selective sweep (Derome, Baudry, Ogereau, drive has been pinpointed to a protein that binds to the heterochro‐ Veuille, & Montchamp‐Moreau, 2008; Kingan, Garrigan, & Hartl, matin of the Y chromosome during meiosis to cause nondisjunction 2010). Unfortunately, the very tight linkage and low polymor‐ events that result in inviable Y‐bearing sperm, although there is an‐ phism on the driving X chromosome suggests that all loci across other locus associated with this system that is still unknown (Helleu the X chromosome—even those not involved in drive—are ex‐ et al., 2016). In the independently evolved Winters sex‐ratio system, pected to show the same molecular evolutionary pattern as the the genetic loci of two SR distorters, one a duplicate copy of the drive‐associated loci in this species, making its genetic dissec‐ other, have been identified (Lin et al., 2018; Tao, Araripe, et al., 2007; tion difficult. Tao, Masly, Araripe, Ke, & Hartl, 2007). In this system, the mechanis‐ The sex‐ratio (SR) X chromosome of D. neotestacea is similarly tic basis of drive itself remains unclear, but autosomal suppression characterized by large inversions and significant genetic differen‐ occurs through small RNA interference (Lin et al., 2018; Tao, Araripe, tiation with the standard (ST) X chromosome, but critically there is et al., 2007; Tao, Masly, et al., 2007). One of the best‐studied male substantial genetic variation present on SR (Pieper & Dyer, 2016). meiotic drive systems is Segregation Distorter (SD) in D. melanogaster In a survey of nucleotide variation on SR and ST at 11 arbitrarily (Larracuente & Presgraves, 2012). Though it is an autosomal driver, chosen loci uninvolved in drive, there were no fixed differences it operates in much the same way as sex‐ratio X chromosomes in that between the two chromosome types, and linkage disequilibrium half of the sperm fail to develop properly. The distorter gene (sd‐ on SR was consistent with regular recombination between SR RanGAP) is a truncated duplicate of RanGAP (Merrill, Bayraktaroglu, chromosomes (Pieper & Dyer, 2016). This species is broadly dis‐ Kusano, & Ganetzky, 1999). Wild‐type RanGAP stays in the cyto‐ tributed through temporal and boreal forests in North America, plasm and powers the nuclear transport cycle, but the sd‐RanGAP and there is high gene flow across geographic regions (Dyer, 2012; protein mislocalizes to the nucleus, which disrupts nuclear transport Pieper & Dyer, 2016). SR is also found at long‐term high frequen‐ to cause spermatogenesis to fail (Kusano, Staber, & Ganetzky, 2001, cies in some populations, and there are no identified segregating 2002; Larracuente & Presgraves, 2012). suppressors (Dyer, 2012; Dyer, Bray, & Lopez, 2013; James & In spite of significant interest, the main reason so little is known Jaenike, 1990). Drive is also extremely effective in this species, about the mechanism of drive systems is that most meiotic drivers with SR males producing 98% daughters; the only sons are ster‐ are associated with chromosomal inversions (Burt & Trivers, 2006; ile and presumed to be XO males that result from nondisjunction Jaenike, 2001). Inversions are thought to accumulate on driving chro‐ (James & Jaenike, 1990). mosomes because they link together interacting elements that are Altogether, these characteristics make the D. neotestacea SR required for drive (Charlesworth & Hartl, 1978). Unfortunately, the system an excellent choice for using genomic approaches to inves‐ suppression of recombination caused by such inversions also makes tigate the mechanistic basis of an inversion‐rich drive system. In classical genetic analysis difficult. The Drosophila drive systems dis‐ this study, we used high‐throughput sequencing to compare gene cussed above are not associated with inversions, allowing for their expression in the testes of males carrying SR or ST in the same genetic dissection using traditional mapping techniques. The t‐hap‐ genetic background. We determined which chromosomes carry lotype of Mus musculus remains the only male meiotic driver (either differentially expressed transcripts and identified nucleotide dif‐ autosomal or sex‐linked) with a known mechanistic basis that is as‐ ferences between transcripts from ST and SR. We evaluated the sociated with inversions (Bauer, Veron, Willert, & Herrmann, 2007; potential functional impact of these differences and generated a Bauer, Willert, Koschorz, & Herrmann, 2005; Herrmann, Koschorz, list of candidates that were rapidly evolving. We used microscopy Wertz, McLaughlin, & Kispert, 1999; Lyon, 1991, 2003). to compare spermatogenesis in ST and SR males. We then car‐ Potential mapping strategies that circumvent chromosomal ried out a population genetic study of the candidates using wild inversions include large‐scale transcriptomic or genomic analy‐ ST and SR males and identified molecular evolutionary patterns at sis comparing sex‐ratio and wild‐type chromosomes. To date, the five loci consistent with direct involvement with the mechanism of only such investigation is a comparison of gene expression be‐ drive. One of these top candidates is a novel X‐linked duplicate of tween wild‐type and sex‐ratio males in the stalk‐eyed fly Teleopsis the autosomal gene importin‐ α2, a key part of the nuclear import dalmanni (Reinhardt et al., 2014). This analysis showed that mei‐ pathway. We argue this candidate is highly likely to be involved in otic drive has had a significant impact on X‐linked evolution in the meiotic drive mechanism. PIEPER et al. | 5167

2 | METHODS The lowest 30% of total expressed transcripts were filtered out to increase the power, leaving 25,484 transcripts in the differen‐ 2.1 | Samples, sequencing and transcriptome tial expression analysis. These transcripts were ranked according assembly to their false discovery rate (FDR), and then, every transcript ex‐ cept the 10,000 most differentially expressed was chosen for the The ST and SR D. neotestacea males used in this study come from RUVg empirical normalization control. These remaining 15,484 laboratory stocks maintained by K. Dyer that were originally col‐ transcripts were used to estimate unwanted variance, since we lected in New York in 1990. The SR stock was initiated with a single assume they are not differentially expressed. That estimation of wild‐caught SR male and is maintained by crossing to an inbred variance was then included in the general linear model of differ‐ ST laboratory stock. Every generation, ST/Y males are crossed to ential expression between ST and SR in DESeq2 and edgeR (Love, SR/SR females to generate SR/Y males and SR/Y males are also Huber, & Anders, 2014; Robinson, McCarthy, & Smyth, 2010). crossed to SR/SR females to produce more SR/SR females. Thus, FDR was calculated to correct for multiple testing, and FDR ≤0.01 the only genetic difference between ST and SR males is the X was required for significance (Benjamini & Hochberg, 1995). chromosome. Transcripts were considered significantly differentially expressed cDNA libraries were made from sexually mature adult males. if they had an FDR ≤0.01 in both the DESeq2 and edgeR analyses. Whole testes were dissected from 25 ST or SR males at 4–7 days All analyses were performed in the R programming language in after eclosion and flash‐frozen for each library. The carcass (i.e., the RStudio (https://www.rstudio.com/) (RCoreTeam, 2014). body without the testes) was also prepared for RNA extraction. RNA The same procedure was carried out for the carcass data, but was extracted using a Qiagen RNEasy Kit (Qiagen, Germantown, as there were only two biological replicates each for ST and SR, the MD, USA). Six testes cDNA libraries (three ST and three SR biolog‐ power of this analysis was low, and no transcripts met the crite‐ ical replicates) and four carcass libraries (two ST and two SR) were ria for significance. However, the mean expression estimates from prepared for high‐throughput 75 bp single‐end sequencing using the DESeq2 were used to determine the tissue specificity of tran‐ Illumina TruSeq kit (Illumina, Inc, San Diego, CA, USA). All libraries scripts expressed in the testes; an estimated carcass base mean were run two independent times on a single Illumina HiSeq lane (San expression of <2 counts was used as the criterion for transcript Diego, CA) and demultiplexed at the Cornell Sequencing Center. testis specificity. The total number of testes‐specific transcripts Base quality was evaluated using FASTX‐Toolkit (http://hannonlab. was 14,392. cshl.edu/fastx_toolkit/commandline.html), and sequencing adaptors and low‐quality bases were trimmed from the ends of reads, and reads with <99% of bases with a call quality score of at least 20 were 2.3 | Assignment of transcripts to chromosomes discarded. To identify X‐linked transcripts, homology to the D. melanogaster and All filtered sequencing reads were combined and used to de D. virilis genomes was used. The Release 6 D. melanogaster whole novo assemble the transcriptome using the Trinity pipeline (Haas genome assembly was obtained from Flybase (http://flybase.org/) et al., 2013). For transcripts that had multiple isoforms identified, (Hoskins et al., 2015), and a version of the D. virilis genome assembly the isoform with the highest quality blastx hit (i.e., lowest e‐value) with scaffolds assigned to Muller elements was obtained from Yasir against UniProtKB (http://www.uniprot.org/) was included in fur‐ Ahmed‐Braimah (personal communication). NCBI’s tblastx with cut‐ ther analyses (Camacho et al., 2009; The UniProt Consortium, off values of e‐value <1e–20 and length >50 bp was used to identify 2014). Reads were then split back into their respective libraries transcript homologs in the D. melanogaster and D. virilis genomes. If and aligned to the transcriptome using Bowtie2 with the default no D. virilis homolog was available or the transcript mapped to a scaf‐ parameters (Langmead & Salzberg, 2012). SR‐only and ST‐only fold, the location of the D. melanogaster homolog was used. Synteny transcriptome assemblies were created in the same way. For between Muller elements was used to assign genomic location of each library, read abundance per transcript was quantified using transcripts in D. neotestacea (Camacho et al., 2009; Schaeffer et al., RSEM (Li & Dewey, 2011). After analysis showed no differences 2008). The X chromosome is homologous in all three species and in the abundance of the technical replicates, they were combined unlinked to other Muller elements in D. neotestacea (Pieper & Dyer, together. 2016). GO term enrichment analysis was carried out using the GOseq 2.2 | Differential and tissue‐specific program in the Trinity package (Haas et al., 2013). Transcript homo‐ expression analysis logs were identified in the SwissProt database using blastx (e‐value <1e–20) (Camacho et al., 2009; The UniProt Consortium, 2014). GO We used the program RUVg to normalize abundance counts be‐ term enrichment compared to the entire transcriptome was deter‐ tween all testes libraries (Risso, Ngai, Speed, & Dudoit, 2014). mined for five sets of transcripts (DE transcripts, DE transcripts with First, a differential expression analysis between ST and SR was SR‐biased expression, DE transcripts with ST‐biased expression, tes‐ carried out in DESeq using the unnormalized read counts and the tis‐specific DE transcripts with ST‐biased expression and testis‐spe‐ total 63,821 transcripts in the assembly (Anders & Huber, 2012). cific DE transcripts with SR‐biased expression). 5168 | PIEPER et al.

coding region of each transcript using KaKs_Calculator (Zhang et 2.4 | Nucleotide differences in transcripts from al., 2006). ST and SR

Transcripts with nucleotide sequence differences between ST and 2.5 | Sequencing, gene tree and population genetic SR were identified using samtools (Li et al., 2009). The mpileup func‐ analyses of candidates tion was used to make a vcf file compiling all the variant sites in the testes libraries. Vcftools was used to remove any variant sites (i.e., Ten candidate loci were chosen and sequenced in each of 10 SNPs) with a minor allele frequency of <50% to remove sites with wild‐caught ST and SR D. neotestacea males as well as in one or variation within ST or SR libraries (e.g., one ST library has an A, two two ST individuals of the closely related species D. testacea and ST libraries have a G and all three SR libraries have a G) (Danecek et D. orientacea. These D. neotestacea males were randomly chosen al., 2011). Further filtering was performed to remove any nucleotide from the range‐spanning data set used in Pieper and Dyer (2016) sites with coverage <100× across all samples and any transcripts (See Supporting information Table S1 of that paper) and had been with less than 100 sites and with at least 100× coverage. Finally, identified as carrying an ST or SR X chromosome by the propor‐ any sites with heterozygotes called in any library were removed to tion of female offspring they produced (Dyer, 2012). As the re‐ ensure that the final set of sites contained only one allele in every SR sults will show, one of these candidates is an X‐linked duplicate sample and a different allele in every ST sample, representing a con‐ of the autosomal importin‐α2 gene that we named X‐importin‐α2. servative estimate of the sequence divergence between the chro‐ X‐importin‐α2 was sequenced in an additional seven wild‐caught mosomes. The percentage sequence difference between ST and SR ST males from the same data set, and both duplicate copies were was calculated for each transcript by taking the number of different sequenced in one or two individuals of several closely related sites over the total length of the transcript and multiplying by 100. species (D. testacea, D. orientacea, D. putrida and D. bizonata). PCR Of the 1,349 transcripts that had at least one sequence difference, primers were designed with Primer3 in Geneious (Supporting in‐ 1,067 mapped to the X chromosome using homology, 32 mapped to formation Table S2)(Kearse et al., 2012; Untergasser et al., 2012), the autosomes, and 250 had unknown genomic locations because and fragments were amplified using standard PCR protocols they did not have homology to a known gene in D. melanogaster or and sequenced on an Applied Biosystems (Foster City, CA, USA) D. virilis (Supporting information Table S1). The only genetic differ‐ 3,730xl DNA Analyzer at the Georgia Genomics Facility. Base calls ences between ST and SR should be on the X chromosome due to the were confirmed using Geneious, and sequences were aligned by crossing scheme of the stocks, indicating these transcripts can be as‐ hand (Kearse et al., 2012). signed to the X chromosome. We further validated this assumption The coding sequences of importin‐α1, importin‐α2, importin‐α3 by performing a blastn search of these 250 transcripts against an and X‐importin‐α2 sequences from D. neotestacea, D. melanogaster unpublished assembly of the D. innubila genome (R. Unckless, unpub‐ and D. virilis were aligned in Geneious and corrected by hand and lished), a species from the quinaria group that is more closely related then used to build an unrooted, neighbour‐joining phylogenetic to D. neotestacea than either D. melanogaster or D. virilis (Perlman, tree (Kearse et al., 2012). Sequences of importin‐α1, importin‐α2 Spicer, Shoemaker, & Jaenike, 2003). Although only 34% of the un‐ and importin‐α3 from D. melanogaster and D. virilis were obtained known transcripts matched to the D. innubila genome, 87% of those from Flybase (http://flybase.org/), and the D. neotestacea se‐ that did matched to the X chromosome. We therefore reassigned quences were identified from the transcriptome using blastx the 250 unknown transcripts and 32 originally autosome‐mapped (Camacho et al., 2009). We used the codeml function in Paml v4.8a transcripts with sequence differences to the X chromosome. This (Yang, 2007) to infer the rate of protein evolution (ω, or dN/dS) for increased the total number of transcripts assigned to the X chromo‐ the importin phylogeny. We compared a model with one ω value some from 2,748 to 3,030. across the entire phylogeny versus a free‐ratio model where ω For each of the 1,349 transcripts with a sequence difference was estimated independently for each branch. These models were between SR and ST, GATK was used to create a duplicate set of evaluated using a likelihood ratio test, with the p‐value inferred transcript sequences that carried the alternate alleles at each vari‐ using a χ2‐distribution. able site (using the original transcriptome assembly as the refer‐ Analyses of DNA polymorphism and divergence at these candi‐ ence) (Van der Auwera et al., 2013). The matched alternate and date loci were carried out in the program DnaSP (Librado & Rozas, reference allele‐containing transcripts were then carefully par‐ 2009). Population genetic data from a set of five X‐linked (marf, mof, titioned between ST and SR based on the assignment of variant pdg, rpl and spk) and seven autosomal protein coding loci (esc, gl, ntid, sites in the vcf file identifying sequence differences. Open reading mago, tpi, sia and wee) that were arbitrarily chosen with respect to frames were identified using Transdecoder (Haas et al., 2013), and SR were used as a comparison; these data are from Pieper and Dyer the coding sequences were extracted from these transcripts using (2016). These markers are referred to as the “noncandidate” X‐linked bedtools (Quinlan & Hall, 2010). The validity of the open reading markers. Hudson–Kreitman–Aguadé (HKA) tests were carried out in frames was confirmed by eye in a small subset of transcripts using the program MLHKA using an MCMC length of 1,000,000 (Wright & Geneious (Kearse et al., 2012). The rate of protein evolution, or Charlesworth, 2004), with significance determined using likelihood

Ka/Ks, was calculated between the SR and ST sequences for the ratio tests. PIEPER et al. | 5169

equalled those with SR‐biased expression. In contrast, the other 2.6 | Sperm microscopy four large autosomes combined carry only 18% (131 transcripts) of The testes of 1‐day‐old ST and SR males from the same stock as was DE transcripts, despite having a total number of mapped transcripts used for the RNAseq were dissected and stained with DAPI (4,6‐di‐ nearly quadruple that of the X chromosome. These mapped rela‐ amidino‐2‐phenylindole) to identify the heads of developing sperma‐ tively uniformly across the autosomes. No differentially expressed tids. Testes were dissected in PBS (phosphate‐buffered saline), and transcripts were found to map to the dot chromosome (Muller ele‐ the developing spermatids were gently removed and spread apart. ment F), the Y chromosome, or the mitochondria (Figure 1a). The The slide was dried at 60°C for 5–10 min, fixed in 3:1 methanol and remaining 40.5% (295 transcripts) of DE transcripts were not able to glacial acetic acid for 5 min, rinsed three times in PBS and labelled be mapped to any region of the genome using homology with D. viri‐ with 0.5 mg/ml DAPI in glycerol. The developing 64‐spermatid bun‐ lis and D. melanogaster or sequence differences between ST and dles in postmeiosis were identified using 650× magnification. SR. A chi‐squared test comparing DE transcripts with total mapped transcripts for the five large Muller elements was highly significant (χ2 = 868.01, df = 4, p < 2.2e–16), indicating the X chromosome is 3 | RESULTS enriched for DE transcripts. This is expected as the ST and SR males used are genetically identical except for the X chromosome; autoso‐ 3.1 | SR and ST chromosomes cause expression mal DE genes may be due to trans‐acting X‐linked genes. differences for genes primarily on the X

In total, the transcriptome contained 63,821 transcripts with an av‐ 3.2 | Amount of sequence differences between erage contig length of 376.66 bp and a contig N50 value of 514 bp. ST and SR is as expected The number of transcripts in this data set overestimates the actual number of genes expressed in all tissues due to fragmented assem‐ Of all 2,748 transcripts that initially mapped to the X chromosome bly. After filtering and normalization, 729 transcripts were identified based on homology, 38.8% (1,067 transcripts) had at least one nu‐ as differentially expressed (DE) in the testes of ST and SR males (FDR cleotide difference between ST and SR (Supporting information ≤0.01 in both DESeq2 and edgeR). Of these, 306 had SR‐biased ex‐ Table S1, Figure 2a). Because of the stringent requirements used to pression patterns, and 423 were ST‐biased (Figure 1). Additionally, identify sequence differences, this likely underestimates the differ‐ 19 transcripts were expressed in SR but absent in ST, and 149 tran‐ entiation between ST and SR. Among transcripts with a sequence scripts were expressed in ST but absent in SR. Forty‐two percent difference, the nucleotide sequence divergence ranged from 0.03% (303 transcripts) of DE transcripts mapped to the X chromosome to 4.5% with a mean difference of 0.53% (median, 0.41%) (Figure 2a). (Figure 1); of these, transcripts with ST‐biased expression about As a comparison, the average percentage sequence divergence in the coding regions of five arbitrarily chosen X‐linked noncandidate genes in a large population genetic sample of ST males ranged from

0.10 ST-biased expression 0.16% to 1.25%, with an average of 0.56% (Pieper & Dyer, 2016). 167 SR-biased expression Among transcripts that contained at least one nucleotide difference 0.08 between ST and SR, there was a very weak correlation between the

absolute value of log2 fold change in expression between ST and 0.06 SR and the percentage sequence difference (r = 0.125, p = 0.017, Pearson’s correlation) (Figure 2a, Supporting information Figure S1). 0.04 136 213 3.3 | Identification of transcripts with elevated 0.02 15 11 Ka/Ks between ST and SR 8 9 82 13 25 26 24

Proportion of transcripts DE (FDR ≤ 0.01) 0 3,030 3,360 3,479 3,440 4,312 119 28 10 7,699 Ka/Ks values were calculated between ST and SR for each transcript XBCD EFY mito unk to evaluate the potential functional consequences of these nucleo‐ Muller Element tide differences. Of the 1,349 transcripts with sequence differences, 1,116 transcripts had annotated open reading frames containing nu‐ FIGURE 1 Differentially expressed (DE) transcripts are enriched on the X chromosome. The total number of transcripts that mapped cleotide differences. The transcripts with at least one synonymous to each chromosome is listed beneath the bars. The large Muller substitution (N = 1,001) had an average Ka/Ks value of 0.23, and 31 elements are X, B, C, D and E. F is the small, non‐recombining dot of these transcripts had Ka/Ks >1 (Figure 2b). Combining these 31 chromosome, mito is the mitochondria, Y is the Y chromosome, and transcripts with those that had at least three nucleotide differences unk stands for unknown location. Dark and light grey indicate the but no synonymous differences (N = 15) yielded a total of 46 tran‐ proportion of ST‐ and SR‐biased transcripts, respectively, given the total number of transcripts that mapped to that chromosome. The scripts with an elevated number of nonsynonymous differences. number of DE transcripts in each category is printed within each Notably, these transcripts were not the transcripts with the most bar [Colour figure can be viewed at wileyonlinelibrary.com] substitutions (Figure 2b). Using the transcripts with Ka/Ks <1 as the 5170 | PIEPER et al.

(a) (b)

r

FIGURE 2 Sequence differences between ST and SR in transcripts. (a) There is weak relationship between differential expression between ST and SR (shown as the absolute value of log2 fold change) and sequence differences between ST and SR (Pearson's correlation, r = 0.125). Percent sequence difference is calculated as the total number of differences divided by the length of the transcript times 100. Transcripts with significant differential expression are marked in red. Only transcripts meeting the minimum coverage criteria for detecting sequence differences and had at least one difference are included. The mean percent difference of transcripts with nucleotide differences was 0.53%. (b) Transcripts with the highest number of differences are not the same as those with the highest Ka/Ks values. Only transcripts with more than three differences and at least one synonymous difference are included in the figure. Transcripts with more than three differences but no synonymous differences were also included in the Ka/Ks > 1 set. The identified candidates are marked in red; one of these had no synonymous differences and is not pictured. The inset Venn diagram shows the criteria used to identify Ka/Ks candidates and the number of transcripts in each category. The total number of transcripts with Ka/Ks > 1 or more than three synonymous differences was 46, the total number of ST‐SR DE transcripts was 729, and the total number of testes‐specific transcripts was 14,392. Ka/Ks was calculated between ST and SR [Colour figure can be viewed at wileyonlinelibrary.com]

expectation, transcripts with elevated Ka/Ks were not overrepre‐ specificity is not necessarily a requirement for participation in sented among loci that were DE (χ2 = 2.407, p = 0.121; Table 1). More the mechanism of drive, we chose this criterion to help narrow of these transcripts than expected were testes‐specific (χ2 = 10.383, down the field of candidates. Six transcripts matching these cri‐ p = 0.001; Table 1). teria were identified (Supporting information Table S3). Two of them (TR24932 and TR5481) had no identifiable orthologs in D. melanogaster and D. virilis. Transcript TR6297 was identified 3.4 | Identification of candidate genes as an ortholog of the D. melanogaster gene CG7366, which is lo‐ To narrow down our list of candidate transcripts for involvement cated on an autosome in D. melanogaster and highly expressed in the drive mechanism, we focused on those that have Ka/Ks > 1 in the testes with an unknown function (http://flybase.org/re‐ (or over three nonsynonymous differences and no synonymous ports/FBgn0035855.html). Transcript TR23125 is a homolog differences), are expressed only in the testes and are differen‐ of an X‐linked gene lethal(1)1Bi, which is expressed during the tially expressed between ST and SR (Figure 2b). While testis mitosis stage of spermatogenesis (http://flybase.org/reports/ FBgn0001341.html). This gene contains armadillo DNA‐binding TABLE 1 Chi‐squared tests examining enrichment of positively repeats, which often function in intracellular signalling and cy‐ selected transcripts for (a) testes‐specific expression and (b) toskeletal regulation. differential expression (DE) between ST and SR The final two candidate transcripts (TR10603 and TR2814) 2 Ka/Ks > 1 Ka/Ks < 1 χ p‐value both mapped to an autosomal gene called importin‐α2 (http:// Testes‐specific 21 241 10.383 0.001 flybase.org/reports/FBgn0267727.html). Also known as Pendulin, Not testes‐specific 27 829 this gene is involved in nuclear transport and also contains sev‐ eral armadillo DNA‐binding repeats (Goldfarb, Corbett, Mason, DE 8 94 2.407 0.121 Harreman, & Adam, 2004). Aligning these two transcripts to the Not DE 40 976 D. melanogaster importin‐α2 sequence revealed that they mapped Notes. Each test had one degree of freedom. Values larger than expected to sequential sections of the protein (Supporting information are bolded. The set of Ka/Ks > 1 includes transcripts with Ka/Ks > 1 as well as those that have three or more nonsynonymous substitutions but no Figure S2). Searching the transcriptome data for other transcripts that mapped to importin‐α2 revealed transcript TR37105 that synonymous substitutions. Ka/Ks was calculated between ST and SR. PIEPER et al. | 5171 aligned upstream and partially overlapped the other two tran‐ 3.5 | X‐importin‐α2 is a rapidly evolving scripts. TR37105 also has sequence differences between ST and member of the importin‐α gene family SR but has a Ka/Ks value of only 0.35 (Supporting information Table S4). It also has SR‐biased expression like the other two tran‐ Importin‐α2 is a member of the importin‐α gene family along with scripts, but it was not considered significantly DE in the earlier importin‐ α1 and importin‐α3. Full‐length transcripts of D. neotesta‐ analysis because the adjusted p‐value is 0.013. To identify the 3’ cea importin‐α1 (TR22571) and importin‐α3 (TR6773) were identified end of the gene, we queried the SR‐and ST‐specific assemblies from the transcriptome data. Neither of these transcripts were DE using blastn (Camacho et al., 2009). or had sequence differences between ST and SR, consistent with Searching for transcripts with homology to importin‐α2 also re‐ their autosomal location. A neighbour‐joining unrooted phylogenetic vealed a full‐length transcript (TR7043) that had no sequence dif‐ tree was built from all members of the importin‐ αfamily in D. mela‐ ferences between ST and SR and was not differentially expressed or nogaster, D. virilis and D. neotestacea, including SR and ST X‐importin‐ testis‐specific. This transcript also had much higher sequence similar‐ α2. Clearly, both X‐importin‐α2 sequences are most closely related ity with the D. melanogaster and D. virilis importin‐α2 sequences com‐ to the importin‐α2 copy in D. neotestacea, indicating the X‐linked and pared to the candidate transcripts (Figure 3, Supporting information autosomal copies are paralogs (Figure 3). Figure S2). Therefore, TR7043 must represent the homolog of impor‐ A likelihood ratio test supported a model where each branch tin‐α2, an autosomal gene in D. melanogaster, whereas the three candi‐ of the tree has an independent dN/dS value over a model with one date transcripts in D. neotestacea represent an X‐linked duplicate copy value across the entire phylogeny (χ2 = 485.5, df = 18, p < 0.0001). of this gene. Supporting this, Sanger sequencing of the X‐linked copy The estimated dN/dS values are uniformly quite low within the im‐ was never heterozygous in males, whereas this was not true for the portin‐α1 and importin‐α3 clades, indicating strong purifying selec‐ autosomal importin‐α2. We hereafter refer to the X‐linked duplicate as tion (Figure 3). The same is true of the importin‐α2 branches for the X‐importin‐α2. The transcript is 2,106 bp long in SR and 1,886 bp long two outgroups and the autosomal copy in D. neotestacea. However, in ST; the 3’ UTR is longer in SR. The open reading frame of X‐importin‐ the branch leading from the autosomal copy in D. neotestacea to the

α2 is the same length in ST and SR, but it is 16 amino acids shorter than split between the X‐linked copies has a dN/dS value of 1.03, and the the open reading frame of the autosomal importin‐α2. This shortened branches leading to the individual ST and SR sequences have rela‐ end does not affect any of the protein domains (Supporting informa‐ tively high values of 0.86 and 0.49, respectively. Clearly, this section tion Figure S3). Visual inspection suggests the C‐terminal end of X‐im‐ of the tree is evolving faster than the rest. portin‐α2 is more diverged from the autosomal copy than the rest of Sanger sequencing of X‐importin‐α2 in a selection of D. neotesta‐ the open reading frame (Supporting information Figure S3). cea, D. testacea, D. orientacea, D. putrida and D. bizonata individuals showed that the X‐linked duplication is specific to the testacea‐

D. neotestacea D. species group. It was found in the very closely related D. neotes‐ tacea, D. testacea and D. orientacea, but not in the more distantly D. melanogaster D. virilis related D. putrida or D. bizonata species (Dyer, White, Bray, Pique, & Betancourt, 2011). In the three testacea group species, some PCR primers designed for X‐importin‐α2 coincidentally captured variants 0.23 from both X‐importin‐α2 and importin‐α2. However, in D. putrida or importin-α3 0.20 D. bizonata, these same primers only detected variants from the

D. virilis autosomal copy of the gene. It is possible that the X‐linked copy is

0.48 D. melanogaster present in these species but too far diverged to be detectable with

0.45 our primers, and more work is needed to pinpoint the origin of the 1.02 D. neotestacea D. neotestacea duplication. 0.86 ST X-importin-α2 0.49SR X-importin-α2 D. virilis 3.6 | Top candidate loci have high nonsynonymous importin-α1 variation on ST and low variation on SR importin-α2 We examined the population genetic patterns of the entire open

D. melanogaster reading of X‐importin‐α2, the four other candidate loci described

FIGURE 3 X‐importin‐α2 is a fast‐evolving X‐linked duplicate of above and five additional transcripts that had positive Ka/Ks val‐ the autosomal gene importin‐α2. Branch lengths were estimated ues but did not otherwise meet the candidacy criteria (Tables 2, with a neighbour‐joining tree of 1,000 bootstraps. All nodes had Supporting information Table S3). Five of these 10 loci had no fixed bootstrap support >99. Each of the three importin‐α clades is differences between ST and SR in the population genetic sample labelled. X‐importin‐α2 includes TR10603, TR2814 and TR37105. (Table 2). These were thus excluded from further analyses, where Estimated dN/dS values larger than 0.1 are labelled, and branches the set of five “top candidates” were amended to include all se‐ with dN/dS values >0.60 are marked in red [Colour figure can be viewed at wileyonlinelibrary.com] quenced loci with fixed differences between ST and SR (Table 2). 5172 | PIEPER et al. ‐ ‐ sam

nn 1.00 S 0.97 0.95 1.00 1.00 0.92 0.76 0.92 0.86 0.76 of

ST 0.33 0.34 0.33 0.67 0.84 0.09 0.19 0.40 0.16 0.13 number K

the

is

a 1.18 (%) D 0.94 0.26 2.05 0.60 0.41 0.37 1.58 0.51 0.22 Indicated

sites.

0 Shared Shared mutations 8 0 0 0 18 2 5 6 0 5 2 0 4 3 0 0 0 0 0 NS fixed diffs nonsynonymous

for

stands

Total fixed Total diffs 13 2 1 5 4 0 0 0 0 0 NS

and

NS D −1.11 1.75 −0.47 1.00 0.39 −1.06 −0.25 −0.03 1.59 −0.26 −0.60 −0.75 −0.33 sites,

Silent D −0.85 0.14 NA NA −0.78 0.07 −1.28 NA NA −1.60 −1.48 −1.11 −1.03 NA NA −1.05 −0.75 1.30 −0.42 −1.13−1.91 −1.67 −0.56 −1.69 −1.09 −1.13 −1.30 synonymous s

π / a and 1.39 0.34 0.22 0.22 3.71 0.83 1.14 0.59 0.75 0.68 0.90 0.50 0.87

π NA NA NA NA are multiplied by 100. a 1.4 0.46 0.1 NS π (%) 1.2 1.4 1.1 0.3 1.1 0.1 1.8 1.7 0.9 1.0 1.4 1.3 0.8 0.5 0.68 noncoding

3.0 0.3 Silent Silent π (%) 5.2 1.0 0.3 1.3 1.3 0.1 0.1 1.12 3.1 2.2 0.8 2.0 0.21.4 0.61.6 3.66 1.1 0.1 0.7 include

0 0 0 0 0 0 0 0 4 6 sites

14 10 10 10 37 31 15 25 29 12 50 43 142 Silent ); number of fixed differences between ST and SR across all (total) and nonsynonymous sites; number of net nucleotide substitu

9 8 3 8 9 8 1 0 8 9 9 8 0 5 5 6 6 12 NS 10 10 10 10 10 10 17 11 12 D marked.

are

); Tajima’s D ( ); Tajima’s ST SR X‐type D. testacea SR ST ST SR SR ST SR ST ST SR ST SR ST SR SR ST ST SR π calculated between ST and SR. Values of π and D mechanism

nn 917 375 592 355 378 573 272 294 391 183 1,156 NS sites and S driving

ST 67 93 99 the 176 745 247 116 269

268 345 122 in Silent Silent sites

); nucleotide diversity ( diversity nucleotide ); S 826 541 477 394 770 280 649 648 343 1,905 Total sites 1,262 involvement

‐ for

2 α ) between ST and SR; and K tin‐ a Transcript Transcript ID X‐impor TR261 TR11103 TR24932 TR37304 TR50351 TR23125 TR1778 TR5481 TR6297 D Population genetic summary statistics of sequenced loci sequenced of statistics summary genetic Population candidates

); number of segregating sites ( Top

. N Top candidate Yes Yes Yes Yes Yes No No No No No TABLE 2 TABLE ples ( tions per site ( Notes PIEPER et al. | 5173

The presence of loci with fixed differences between ST and SR candidate markers have significantly lower silent polymorphism rel‐ is striking because previous work in this system found no fixed ative to the noncandidates (U = 2, n1 = n2 = 5, p = 0.034, two‐tailed differences in 11 arbitrarily chosen X‐linked loci (Pieper & Dyer, Mann–Whitney U test; Figure 4a). There was no dif ference in nonsyn‐ 2016). Population genetic data from five protein coding loci in this onymous polymorphism on SR between the top candidates and non‐ previously analysed data set was used as a comparison here—these candidates (U = 14, n1 = n2 = 5, p = 0.83, two‐tailed Mann–Whitney loci are hereafter referred to as “noncandidates” (Pieper & Dyer, U test; Figure 4b). However, considering only ST chromosomes, the 2016) (Supporting information Table S6). Additionally, it is notable top candidate markers do not have reduced polymorphism relative that even though TR23135 had no fixed differences and is not a to the noncandidates at silent sites (U = 10, n1 = n2 = 5, p = 0.69, top candidate, it appears to be evolving non‐neutrally as the ST two‐tailed Mann–Whitney U test; Figure 4a), but they do have samples have no synonymous polymorphisms but eight nonsyn‐ higher polymorphism at nonsynonymous sites (U = 22, n1 = n2 = 5, onymous polymorphisms. Considering only the five top candidate p = 0.056, two‐tailed Mann–Whitney U test; Figure 4b). Though not loci that have fixed differences, it is striking that they also have a statistically significant, this elevated segregating nonsynonymous low number of shared mutations between ST and SR. This differ‐ variation in ST at X‐importin‐α2 and the other top candidates can be entiation is reflected in the KST values between ST and SR of these easily viewed in Figure 4a,d and Supporting information Figure S5. top candidates, where the mean KST between ST and SR (0.43, Additionally, one top candidate had a nonsynonymous variant that SD = 0.27) was significantly higher than in the set of noncandi‐ resulted in a premature stop codon (Supporting information Figure date X‐linked markers (mean = 0.10, SD = 0.10; t = 2.74, df = 6.62, S6). p = 0.03; Supporting information Figure S4). Examining the polymorphism at X‐importin‐α2 also reveals the The top candidate markers, including X‐importin‐α2, showed a presence of two separate haplotypes at roughly equal frequency in pattern characterized by very low diversity on SR, but high diver‐ SR (Figure 4d), whereas many singletons are observed in the ST sam‐ sity on ST, particularly at nonsynonymous sites (Figure 4; Supporting ples. This pattern holds more generally across loci: Within SR chro‐ information Table S6). Considering only SR chromosomes, the top mosomes, Tajima’s D is somewhat higher on the top candidates than

(a) (b) (c)

(d)

FIGURE 4 Molecular evolutionary patterns of top candidates are consistent with positive selection on SR and relaxed purifying selection on ST. (a) Top candidates (top) on SR have significantly lower silent polymorphism than on ST. Silent sites include synonymous sites as well as noncoding sites. The star denotes statistical significance at p < 0.05, two‐tailed Mann–Whitney U test. (b) The top candidates have higher nonsynonymous polymorphism on ST than the noncandidates, though not significantly so (p = 0.056, two‐tailed Mann–Whitney U test). (c) Tajima's D is elevated in the top candidates on SR compared to the noncandidates, but not significant (p = 0.071, two‐tailed Mann–Whitney U test). In panels A through C, SR is represented by white boxes and ST by grey boxes. All data points are shown; the edges of the boxes are the first and third quartiles, and the middle line is the median. (d) Haplotype structure of X‐importin‐α2. Each row is a chromosome, with ST phenotype males above the solid black line and SR phenotype males below. Each column is a single segregating site. Dark grey represents the individual carries the major allele, and light grey is the minor allele. Some sites have a third segregating allele, which is represented by white. Sites with a gap are marked with an X. Nonsynonymous sites are denoted with an N. Unlabelled sites are either synonymous or noncoding. Fixed differences between ST and SR are marked with a star. Sites in the intron and UTR are labelled. Sites located in specific protein domains are also labelled by grey blocks: the importin‐ß binding (IBB) domain, the cargo binding domains (Armadillo [ARM] repeats 1 through 9) and the nuclear export factor (CAS) binding domain (ARM repeat 10) (Goldfarb et al., 2004) [Colour figure can be viewed at wileyonlinelibrary.com] 5174 | PIEPER et al.

noncandidate loci, though not statistically significant (U = 14, n1 = 3, genes in this study likely represent activation of pathways down‐ n2 = 5, p = 0.071, two‐tailed Mann–Whitney U test; Figure 4c). stream of drive like apoptosis (see Supporting information Table S8 However, the sample size is very small for this test because two of for results of GO terms enrichment analysis). This is especially true the top candidates lacked any segregating sites on SR and Tajima’s D for DE transcripts that map to the autosomes, as the ST and SR lines could not be calculated. For the ST chromosomes, Tajima’s D is sim‐ in this study have identical autosomes. In other cases, differential ilar between the candidates and noncandidates (U = 16, n1 = n2 = 5, expression on the X chromosome may represent gene expression p = 0.55, two‐tailed Mann–Whitney U test; Figure 4c). These trends levels that are diverging neutrally because of suppressed recombi‐ also hold when only synonymous or nonsynonymous variation is nation with ST and their subsequent separate evolutionary histo‐ considered (Supporting information Figure S7). ries (Harrison, Wright, & Mank, 2012). Although SR‐ST nucleotide We used HKA tests using divergence between ST and SR to divergence is not correlated with expression differences (Figure 2a), ask whether positive selection could explain the polymorphism differences may accumulate in cis‐regulatory regions that are not differences between the top candidates and the noncandidate X‐ captured in this analysis. Diverging expression due to drift is also linked loci. When only considering SR samples, a maximum‐likeli‐ interesting given that reduced male fertility is the only known phe‐ hood analysis supported a model where the five top candidate loci notypic difference between individuals who carry SR vs ST. Similar are under positive selection compared to the noncandidate loci widespread expression differences were seen in the stalk‐eyed fly, (LRT, χ2 = 13.58, df = 6, p = 0.035). This result is likely due to the although in that system there are clear pleiotropic consequences to high divergence between ST and SR combined with the low diver‐ carrying the driver (Cotton, Foldvari, Cotton, & Pomiankowski, 2014; sity at the candidates on SR (Supporting information Table S7). For Reinhardt et al., 2014). Similarly, it may also be the case that there ST samples, however, no model outperformed the one with all of has been selection for gene expression to mitigate any deleterious the top candidate and noncandidate loci evolving neutrally (LRT, effects of SR in D. neotestacea. The high number of identified DE χ2 = 0.001, df = 1, p = 0.97; Supporting information Table S7). transcripts and the fact that many of them are likely not involved in

drive make the use of additional criteria (particularly elevated Ka/Ks and to a lesser degree testis specificity) necessary for identifying 4 | DISCUSSION candidates.

4.1 | Expected divergence and differential expression, with some exceptions 4.2 | Molecular evolutionary patterns of candidate transcripts suggest involvement in drive We identified differentially expressed transcripts between the testes of D. neotestacea males carrying a wild‐type X chromosome Previous work found that none of 11 arbitrarily chosen X‐linked and those carrying the selfish sex‐ratio X chromosome. We found markers had fixed nucleotide differences between ST and SR (Pieper widespread differentiation between ST and SR, including differential & Dyer, 2016). In contrast, of the 10 candidate loci we assayed for expression and nucleotide sequence differences (Figures 1 and 2). population genetic variation, five had fixed nucleotide differences Nearly half of all transcripts that mapped to the X chromosome had between SR and ST (Table 2, Figure 4d). Population genetic patterns at least one sequence difference between ST and SR (Supporting suggest these five “top candidate” loci are under positive selection information Table S1), and some showed an elevated Ka/Ks sugges‐ on SR but under relaxed purifying selection on ST. tive of positive selection or relaxed purifying selection. Any two X The relative location of the top candidates on the X chromo‐ chromosomes from natural populations would be expected to have some is unknown, but a possible reason for the low variation on SR many sequence differences between them, and the average per‐ at these loci is that they are involved in the driving mechanism or centage difference between ST and SR is very close to the average tightly linked to loci involved in drive. Involvement in drive or tight percentage difference at five loci within a large sample of ST chro‐ linkage to the driver is expected to result in molecular evolutionary mosomes (Figure 2a) (Pieper & Dyer, 2016). It is difficult to say how patterns similar to a selective sweep (Derome et al., 2008; Kingan many transcripts have divergence that falls above a neutral expecta‐ et al., 2010). Consistent with this, three of these five loci harbour tion, but the tail of highly diverged transcripts likely indicates regions little to no polymorphism on SR (Figure 4). However, the other two of particularly strong differentiation between SR and ST. Previous loci show evidence of multiple segregating haplotypes on SR; thus, work on a smaller scale showed that differentiation between ST and any selective sweep that may have occurred at these loci was not SR was both widespread and variable across the chromosome, and recent. Between loci, these SR haplotypes are not in phase with this genome‐scale investigation supports these conclusions (Pieper one another, suggesting recombination may be occurring between & Dyer, 2016). them (Supporting information Figure S5, Figure 4d). This is consis‐ There are a large number of DE transcripts between ST and SR, tent with previous findings of evidence for recombination in SR/ most of which are found on the X chromosome (Figure 1). Although SR females, which are fully fertile in this system and estimated to the presence of inversions on SR suggests the mechanism of drive account for 2%–3% of all females in the wild (Dyer, 2012; Pieper may involve more than one gene, most of the 729 DE transcripts & Dyer, 2016; Pinzone & Dyer, 2013). The average population fre‐ are probably not involved in the drive phenotype. Many of the DE quency of SR across North America is 15%; in some populations, PIEPER et al. | 5175 as many as 30% of X chromosomes are SR (Dyer, 2012). Cycling of genetic innovation as it allows new genes to evolve new functions drive haplotypes has been observed in the autosomal SD system of while still maintaining the original function of the gene (Lynch & D. melanogaster (Brand, Larracuente, & Presgraves, 2015), and re‐ Walsh, 2007). combination in SR/SR females may eliminate deleterious mutations linked to drive loci. 4.4 | X‐importin‐α2 suggests nuclear transport may In contrast to SR, ST chromosomes have a high level of segregat‐ be a target of drive ing nonsynonymous variation at the top candidates (Figure 4b), sug‐ gestive of purifying selection failing to remove slightly deleterious We argue that while its molecular evolutionary patterns and variation. One sample in the top candidate TR261 contained a single status as a rapidly evolving duplicate make X‐importin‐α2 a base pair insertion that resulted in a premature stop codon, and an‐ good candidate for the mechanism of drive, the function of its other sample has an entire codon deleted in X‐importin‐α2 (Figure 4d, parent gene, importin‐ α2, makes it an excellent candidate. All Supporting information Figure S6). If these loci are solely functioning importin‐α proteins play a key role in the nuclear transport path‐ as drivers, there may be no selection to maintain them on the ST X way (Goldfarb et al., 2004; Matsuura & Stewart, 2005; Stewart, chromosome and they are thus free to accumulate mutations. There 2007; Sun, Fu, Ciziene, Stewart, & Musser, 2013). Nuclear trans‐ are also numerous fixed nonsynonymous differences between ST port has been previously implicated in genetic conflict in the and SR in the top candidate loci, further suggesting that these se‐ male germline, most prominently in the autosomal SD meiotic quences may have divergent functions or phenotypic effects in SR vs driver of D. melanogaster (Larracuente & Presgraves, 2012). ST (Table 2, Figure 4d, Supporting information Figure S5). During spermatogenesis, the haploid nuclei of sperm share a cy‐ toplasm until the individualization stage, when half of the sperm in SD males die (De Cuevas, Lilly, & Spradling, 1997; Fuller, 1993; 4.3 | X‐importin‐α2 is a rapidly evolving X‐linked Larracuente & Presgraves, 2012). The distorter locus of SD (sd‐ duplicate RanGAP) is a truncated, duplicated copy of RanGAP that mislocal‐ We argue that X‐importin‐α2 is a strong candidate for involve‐ izes to the nucleus instead of the cytoplasm (Kusano et al., 2001; ment in meiotic drive. X‐importin‐α2 is overexpressed in sex‐ratio Larracuente & Presgraves, 2012). Although the exact mechanism males, testes‐specific and highly differentiated between ST and is unknown, sd‐RanGAP is enzymatically active and causes se‐ SR (Supporting information Table S4 and Figure 4d, Supporting lective failure of sperm through disruption of the GTP gradient information Figure S5). Furthermore, it is evolving differently required for nuclear transport (Kusano et al., 2002; Larracuente from the rest of the importin‐α gene family (Figure 3). While & Presgraves, 2012). most of the tree appears to be under strong purifying selec‐ Of the three canonical importin‐α genes in D. melanogas‐ tion, the dN/dS value of 1.03 estimated for the X‐importin‐α2 ter, importin‐α2 is primarily expressed in the testes and plays a branch is comparatively extremely high, suggesting selection critical role in spermatogenesis (Mason, Fleming, & Goldfarb, for functional changes between the two copies of the gene. 2002). Male homozygous null importin‐α2 are sterile and Phylogenetic analyses place the origin of the duplication be‐ the sperm fail at the individualization checkpoint (Giarrè et al., tween the split between the lineage leading to D. neotestacea, 2002; Mason et al., 2002). In D. neotestacea, the autosomal copy D. orientacea and D. testeacea and that leading to D. putrida (Dyer of importin‐α2 is under purifying selection and is expressed at et al., 2011). We note that sex‐ratio meiotic drive is also present the same high level in the testes of both ST and SR males and in D. orientacea (K. Dyer, unpublished data) and D. testacea (Keais, thus likely retains the ancestral function as observed in D. mela‐ Hanson, Gowen, & Perlman, 2017); thus, the same three closely nogaster (Figure 3). The divergence of X‐importin‐α2 from the related species that have sex‐ratio drive in the testacea group parent copy suggests that it may have taken on a new function. also have X‐importin‐α2. The duplication event resulted from a In D. neotestacea, microscopy visualizing the late stages of sper‐ retrotransposition event involving a poorly spliced transcript: In matogenesis shows that in SR males, a portion of spermatids D. melanogaster, there are three small introns in importin‐ α2, but begin to fail during the elongation phase prior to individualiza‐ only one of these in included in X‐importin‐α2 in D. neotestacea tion (Figure 5), although a GO terms enrichment analysis of SR‐ (Supporting information Figure S8). biased, testis‐specific transcripts found the biological process Duplication events have been implicated in the origin of mul‐ “sperm individualization” was significantly enriched (Supporting tiple other meiotic drive systems. For instance, this includes the information Table S8). More work is necessary to fully describe distorter locus of the SD system of D. melanogaster (Larracuente & the cellular spermatogenesis phenotype of SR in D. neotestacea, Presgraves, 2012), both the distorter and an autosomal suppres‐ but the evidence we have currently suggests the involvement sor in the Winters sex‐ratio system of D. simulans (Tao, Araripe, et of X‐importin‐α2 is highly plausible. We also examined several al., 2007; Tao, Masly, et al., 2007), and a segmental duplication known interacting partners of importin‐α2 (e.g., importin‐ß, CAS, spanning six genes that includes the distorter in the Paris sex‐ Ran, RanGTP, RanGEF, Nup153, Nup50) in our data set, but we ratio system of D. simulans (Fouvry, Ogereau, Berger, Gavory, & found no indication of DE between ST and SR, elevated Ka/Ks or Montchamp‐Moreau, 2011). Gene duplication is a potent force for duplication (data not shown). 5176 | PIEPER et al.

ST SR FIGURE 5 Developing bundle of 64 spermatids in the testes of ST males (left) and SR males (right) at 650X magnification. The DNA is stained with DAPI, revealing the heads of the spermatids. In SR, roughly half of the sperm do not develop properly, which are presumably Y‐bearing spermatids. Arrows point out the heads of sperm that are not maturing properly [Colour figure can be viewed at wileyonlinelibrary.com]

Duplication of an importin‐α gene is not unique to the testacea other associated genes in D. melanogaster suggests a long history group. At least three independent duplication events of importin‐α2 of genetic conflict over nuclear transport (Presgraves, 2007). and importin‐α3 have been identified in other Drosophila lineages, Further experiments with X‐importin‐α2 could provide evidence and like X‐importin‐α2, these duplicates are expressed primarily in that certain pathways and processes are particularly susceptible the testes and at least one of them has signatures of positive se‐ to genetic conflict. lection in certain lineages (Phadnis, Hsieh, & Malik, 2011). Genes primarily expressed during spermatogenesis tend to be fast‐evolv‐ ACKNOWLEDGEMENTS ing, and new duplicate genes often have testis‐specific expression patterns (Betrán, Thornton, & Long, 2002; Emerson, Kaessmann, This project was funded by a Rosemary Grant Award from the Betrán, & Long, 2004; Haerty et al., 2007). This is often attributed to Society for the Study of Evolution, a Presidential Fellowship from sexual selection, but another hypothesis may be that spermatogene‐ the University of Georgia and a National Institutes of Health (NIH) sis is frequently the site of genetic conflict over a fair meiosis (Haerty award under T32GM007103 to KEP; NIH award R00GM114714 et al., 2007; Kleene, 2005). Phadnis et al. (2011) suggest that the to RLU; and National Science Foundation grants 1149350 and repeated duplication and evolution of new importin‐α family mem‐ 1457707 and funds from the University of Georgia Research bers are due to evolutionary pressure to maintain wild‐type nuclear Foundation to KAD. We thank Yasir Ahmed‐Braimah for providing import against selfish genetic elements like SD. It may be the case us with the D. virilis genome assembly, Brooke White and Ariane that the ST copy of X‐importin‐α2 was once part of this defence sys‐ Wong for laboratory assistance and three anonymous reviewers tem, but is no longer functional, resulting in the signatures of relaxed and members of the Dyer, Hall, Bergman and White laboratories purifying selection we observed. If X‐importin‐α2 is involved in drive at UGA for feedback on the manuscript. in D. neotestacea, this defence system may have been subverted and co‐opted. More work is needed to untangle the evolutionary relationship between the ST and SR copies of X‐importin‐α2 and the AUTHOR CONTRIBUTIONS autosomal copy of importin‐ 2, including a larger and more complete α K.E.P., R.L.U. and K.A.D. designed the experiments, conducted the gene tree that includes sequences from the autosomal paralog as analyses and wrote the manuscript. well as both X‐linked SR and ST alleles from D. orientacea and D. tes‐ tacea (Keais et al., 2017). In summary, we find a strong pattern of widespread expres‐ DATA ACCESSIBILITY sion and molecular evolutionary differences between SR and ST The data presented in this study can be accessed on Dryad with the of D. neotestacea, supporting the results of previous work with a Accession No.10.5061/dryad.9b21271. smaller set of loci (Pieper & Dyer, 2016). We also identified a set of candidates for involvement in the mechanism of drive based on their expression and unique molecular evolutionary patterns. ORCID Particularly notable is the X‐linked, fast‐evolving duplicate of im‐ Kelly A. Dyer https://orcid.org/0000-0001-7480-2055 portin‐α2 called X‐importin‐α2 that is overexpressed in SR males. This appealing candidate is worthy of further investigation, includ‐ ing comparisons of expression in different SR lines and eventually REFERENCES knockdown experiments to functionally confirm its involvement Anders, S., & Huber, W. (2012). Differential expression of RNA‐Seq data in SR. If X‐importin‐α2 is involved in drive, it would be a remark‐ at the gene level–the DESeq package. Heidelberg: European Molecular able example of convergent evolution with SD in D. melanogaster Biology Laboratory (EMBL). of manipulation of the nuclear transport pathway (Larracuente & Bauer, H., Veron, N., Willert, J., & Herrmann, B. G. (2007). The t‐com‐ Presgraves, 2012). The rapid sequence evolution of RanGAP and plex‐encoded guanine nucleotide exchange factor Fgd2 reveals that PIEPER et al. | 5177

two opposing signaling pathways promote transmission ratio distor‐ Fouvry, L., Ogereau, D., Berger, A., Gavory, F., & Montchamp‐Moreau, tion in the mouse. Genes & Development, 21(2), 143–147. https://doi. C. (2011). Sequence analysis of the segmental duplication respon‐ org/10.1101/gad.414807. sible for Paris sex‐ratio drive in Drosophila simulans. G3: Genes, Bauer, H., Willert, J. R., Koschorz, B., & Herrmann, B. G. (2005). The t Genomes, Genetics, 1(5), 401–410. https://doi.org/10.1534/ complex‐encoded GTPase‐activating protein Tagap1 acts as a trans‐ g3.111.000315. mission ratio distorter in mice. Nature Genetics, 37(9), 969–973. Fuller, M. (1993). Spermatogenesis. Plainview, NY: Cold Spring Harbor https://doi.org/10.1038/ng1617. Laboratory Press. Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: Giarrè, M., Török, I., Schmitt, R., Gorjánácz, M., Kiss, I., & Mechler, B. A practical and powerful approach to multiple testing. Journal of the M. (2002). Patterns of importin‐α expression during Drosophila sper‐ Royal Statistical Society. Series B (Methodological), 57(1), 289–300. matogenesis. Journal of Structural Biology, 140(1–3), 279–290. https:// Betrán, E., Thornton, K., & Long, M. (2002). Retroposed new genes out doi.org/10.1016/S1047-8477(02)00543-9 of the X in Drosophila. Genome Research, 12(12), 1854–1859. https:// Goldfarb, D. S., Corbett, A. H., Mason, D. A., Harreman, M. T., & Adam, doi.org/10.1101/gr.6049 S. A. (2004). Importin α: A multipurpose nuclear‐transport recep‐ Brand, C. L., Larracuente, A. M., & Presgraves, D. C. (2015). Origin, evolu‐ tor. Trends in Cell Biology, 14(9), 505–514. https://doi.org/10.1016/j. tion, and population genetics of the selfish Segregation Distorter gene tcb.2004.07.016 duplication in European and African populations of Drosophila mela‐ Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., nogaster. Evolution; International Journal of Organic Evolution, 69(5), Bowden, J., … Regev, A. (2013). De novo transcript sequence recon‐ 1271–1283. https://doi.org/10.1111/evo.12658. struction from RNA‐seq using the Trinity platform for reference gen‐ Burt, A., & Trivers, R. (2006). Genes in conflict: The biology of selfish ge‐ eration and analysis. Nature Protocols, 8(8), 1494–1512. https://doi. netic elements. Cambridge, MA: Belknap Press of Harvard University org/10.1038/nprot.2013.084. Press. Haerty, W., Jagadeeshan, S., Kulathinal, R. J., Wong, A., Ram, K. R., Sirot, Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, L. K., … Civetta, A. (2007). Evolution in the fast lane: Rapidly evolv‐ K., & Madden, T. L. (2009). BLAST+: Architecture and applications. BMC ing sex‐related genes in Drosophila. Genetics, 177(3), 1321–1335. Bioinformatics, 10, 421. https://doi.org/10.1186/1471-2105-10-421. https://doi.org/10.1534/genetics.107.078865 Carvalho, A. B., & Vaz, S. C. (1999). Are Drosophila SR drive chromosomes Hamilton, W. D. (1967). Extraordinary sex ratios. Science, 156(3774), always balanced? Heredity, 83, 221–228. https://doi.org/10.1038/ 477–488. sj.hdy.6886100. Harrison, P. W., Wright, A. E., & Mank, J. E. (2012). The evolution of Charlesworth, B., & Hartl, D. L. (1978). Population‐dynamics of segre‐ gene expression and the transcriptome–phenotype relationship. gation distorter polymorphism of Drosophila melanogaster. Genetics, Seminars in Cell & Developmental Biology, 23(2), 222–229. https://doi. 89(1), 171–192. org/10.1016/j.semcdb.2011.12.004 Christianson, S. J., Brand, C. L., & Wilkinson, G. S. (2011). Reduced Helleu, Q., Gérard, P. R., Dubruille, R., Ogereau, D., Prud’homme, B., polymorphism associated with X chromosome meiotic drive in the Loppin, B., & Montchamp‐Moreau, C. (2016). Rapid evolution of a stalk‐eyed fly Teleopsis dalmanni. PLoS One, 6(11), 1–7. https://doi. Y‐chromosome heterochromatin protein underlies sex chromosome org/10.1371/journal.pone.0027254. meiotic drive. Proceedings of the National Academy of Sciences USA, Cotton, A. J., Foldvari, M., Cotton, S., & Pomiankowski, A. (2014). Male 113(15), 4110–4115. https://doi.org/10.1073/pnas.1519332113. eyespan size is associated with meiotic drive in wild stalk‐eyed Herrmann, B. G., Koschorz, B., Wertz, K., McLaughlin, K. J., & Kispert, A. flies (Teleopsis dalmanni). Heredity, 112(4), 363–369. https://doi. (1999). A protein kinase encoded by the t complex responder gene org/10.1038/hdy.2013.131. causes non‐Mendelian inheritance. Nature, 402(6758), 141–146. Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., DePristo, https://doi.org/10.1038/45970. M. A., … Durbin, R. (2011). The variant call format and VCFtools. Hoskins, R. A., Carlson, J. W., Wan, K. H., Park, S., Mendez, I., Galle, S. Bioinformatics, 27(15), 2156–2158. https://doi.org/10.1093/ E., … Svirskas, R. (2015). The Release 6 reference sequence of the bioinformatics/btr330. Drosophila melanogaster genome. Genome Research, 25(3), 445–458. De Cuevas, M., Lilly, M., & Spradling, A. (1997). Germline cyst formation Hurst, L. D., & Pomiankowski, A. (1991). Causes of sex‐ratio bias may in Drosophila. Annual Review of Genetics, 31(1), 405–428. https://doi. account for unisexual sterility in hybrids ‐ a new explanation of org/10.1146/annurev.genet.31.1.405 Haldane's rule and related phenomena. Genetics, 128(4), 841–858. Derome, N., Baudry, E., Ogereau, D., Veuille, M., & Montchamp‐Moreau, Jaenike, J. (2001). Sex chromosome meiotic drive. Annual Review of C. (2008). Selective sweeps in a 2‐locus model for sex‐ratio meiotic Ecology and Systematics, 32, 25–49. https://doi.org/10.1146/annurev. drive in Drosophila simulans. Molecular Biology and Evolution, 25(2), ecolsys.32.081501.113958. 409–416. https://doi.org/10.1093/molbev/msm269. James, A. C., & Jaenike, J. (1990). Sex‐ratio meiotic drive in Drosophila Dyer, K. A. (2012). Local selection underlies the geographic dis‐ testacea. Genetics, 126(3), 651–656. tribution of sex‐ratio drive in Drosophila neotestacea. Evolution; Keais, G. L., Hanson, M. A., Gowen, B. E., & Perlman, S. J. (2017). X‐chro‐ International Journal of Organic Evolution, 66(4), 973–984. https://doi. mosome drive in a widespread Palearctic woodland fly, Drosophila org/10.1111/j.1558-5646.2011.01497.x. testacea. Journal of Evolutionary Biology, 30(6), 1185–1194. https:// Dyer, K. A., Bray, M. J., & Lopez, S. J. (2013). Genomic conflict drives doi.org/10.1111/jeb.13089. patterns of X‐linked population structure in Drosophila neotesta‐ Kearse, M., Moir, R., Wilson, A., Stones‐Havas, S., Cheung, M., Sturrock, cea. Molecular Ecology, 22(1), 157–169. https://doi.org/10.1111/ S., … Drummond, A. (2012). Geneious basic: An integrated and ex‐ mec.12097. tendable desktop software platform for the organization and analy‐ Dyer, K. A., White, B. E., Bray, M. J., Pique, D. G., & Betancourt, A. J. sis of sequence data. Bioinformatics, 28(12), 1647–1649. https://doi. (2011). Molecular evolution of a Y chromosome to autosome gene org/10.1093/bioinformatics/bts199. duplication in Drosophila. Molecular Biology and Evolution, 28(3), Kingan, S. B., Garrigan, D., & Hartl, D. L. (2010). Recurrent selection on 1293–1306. https://doi.org/10.1093/molbev/msq334. the Winters sex‐ratio genes in Drosophila simulans. Genetics, 184(1), Emerson, J., Kaessmann, H., Betrán, E., & Long, M. (2004). Extensive 253–265. https://doi.org/10.1534/genetics.109.109587. gene traffic on the mammalian X chromosome. Science, 303(5657), Kleene, K. C. (2005). Sexual selection, genetic conflict, selfish genes, 537–540. and the atypical patterns of gene expression in spermatogenic cells. 5178 | PIEPER et al.

Developmental Biology, 277(1), 16–26. https://doi.org/10.1016/j. Phadnis, N., Hsieh, E., & Malik, H. S. (2011). Birth, death, and replacement ydbio.2004.09.031 of karyopherins in Drosophila. Molecular Biology and Evolution, 29(5), Kusano, A., Staber, C., & Ganetzky, B. (2001). Nuclear mislocaliza‐ 1429–1440. https://doi.org/10.1093/molbev/msr306 tion of enzymatically active RanGAP causes segregation distor‐ Pieper, K. E., & Dyer, K. A. (2016). Occasional recombination of a selfish tion in Drosophila. Developmental Cell, 1(3), 351–361. https://doi. X‐chromosome may permit its persistence at high frequencies in the org/10.1016/S1534-5807(01)00042-9 wild. Journal of Evolutionary Biology, 29(11), 2229–2241. https://doi. Kusano, A., Staber, C., & Ganetzky, B. (2002). Segregation distortion org/10.1111/jeb.12948. induced by wild‐type RanGAP in Drosophila. Proceedings of the Pinzone, C. A., & Dyer, K. A. (2013). Association of polyandry and sex‐ National Academy of Sciences of the United States of America, 99(10), ratio drive prevalence in natural populations of Drosophila neotesta‐ 6866–6870. https://doi.org/10.1073/pnas.102165099. cea. Proceedings of the Royal Society B‐Biological Sciences, 280(1769), Langmead, B., & Salzberg, S. L. (2012). Fast gapped‐read alignment with 20131397. https://doi.org/10.1098/rspb.2013.1397. Bowtie 2. Nature Methods, 9(4), 357–359. https://doi.org/10.1038/ Presgraves, D. C. (2007). Does genetic conflict drive rapid molecular nmeth.1923. evolution of nuclear transport genes in Drosophila? BioEssays, 29(4), Larracuente, A. M., & Presgraves, D. C. (2012). The selfish segregation 386–391. https://doi.org/10.1002/bies.20555 distorter gene complex of Drosophila melanogaster. Genetics, 192(1), Quinlan, A. R., & Hall, I. M. (2010). BEDTools: A flexible suite of utili‐ 33–53. https://doi.org/10.1534/genetics.112.141390. ties for comparing genomic features. Bioinformatics, 26(6), 841–842. Li, B., & Dewey, C. N. (2011). RSEM: Accurate transcript quantifica‐ https://doi.org/10.1093/bioinformatics/btq033. tion from RNA‐Seq data with or without a reference genome. BMC R CoreTeam (2014). R: A language and environment for statistical com‐ Bioinformatics, 12, 323. https://doi.org/10.1186/1471-2105-12-323. puting. Vienna, Austria: R Foundation for Statistical Computing. Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Retrieved from http://www.R-project.org … Durbin, R. (2009). The sequence Alignment/Map format Reinhardt, J. A., Brand, C. L., Paczolt, K. A., Johns, P. M., Baker, R. H., & and SAMtools. Bioinformatics, 25(16), 2078–2079. https://doi. Wilkinson, G. S. (2014). Meiotic drive impacts expression and evo‐ org/10.1093/bioinformatics/btp352. lution of X‐linked genes in Stalk‐Eyed flies. Plos Genetics, 10(5), 14. Librado, P., & Rozas, J. (2009). DnaSP v5: A software for comprehensive https://doi.org/10.1371/journal.pgen.1004362. analysis of DNA polymorphism data. Bioinformatics, 25(11), 1451– Rice, W. R. (2013). Nothing in genetics makes sense except in light of 1452. https://doi.org/10.1093/Bioinformatics/Btp187. genomic conflict. Annual Review of Ecology and Systematics, 44(1), Lin, C. J., Hu, F., Dubruille, R., Vedanayagam, J., Wen, J., Smibert, 217–237. https://doi.org/10.1146/annurev-ecolsys-110411-160242. P., … Lai, E. C. (2018). The hpRNA/RNAi pathway is essential Risso, D., Ngai, J., Speed, T. P., & Dudoit, S. (2014). Normalization of to resolve intragenomic conflict in the drosophila male ger‐ RNA‐seq data using factor analysis of control genes or samples. mline. Developmental Cell, 46(3), 316–326.e315. https://doi. NatureBiotechnology, 32(9), 896–902. https://doi.org/10.1038/ org/10.1016/j.devcel.2018.07.004. nbt.2931 Lindholm, A. K., Dyer, K. A., Firman, R. C., Fishman, L., Forstmeier, W., Robinson, M. D., McCarthy, D. J., & Smyth, G. K. (2010). edgeR: A Holman, L., … Price, T. A. R. (2016). The ecology and evolutionary dy‐ Bioconductor package for differential expression analysis of digital namics of meiotic drive. Trends in Ecology & Evolution, 31(4), 315–326. gene expression data. Bioinformatics, 26(1), 139–140. https://doi. https://doi.org/10.1016/j.tree.2016.02.001. org/10.1093/bioinformatics/btp616. Love, M. I., Huber, W., & Anders, S. (2014). Moderated estimation of fold Sandler, L., & Novitski, E. (1957). Meiotic drive as an evolution‐ change and dispersion for RNA‐seq data with DESeq2. GenomeBiology, ary force. American Naturalist, 91(857), 105–110. https://doi. 15(12), 1. https://doi.org/10.1186/s13059-014-0550-8 org/10.1086/281969. Lynch, M., & Walsh, B. (2007). The origins of genome architecture (Vol. 98). Schaeffer, S. W., Bhutkar, A. U., McAllister, B. F., Matsuda, M., Matzkin, L. Sunderland, MA: Sinauer Associates. M., O'Grady, P. M., … Kaufman, T. C. (2008). Polytene chromosomal Lyon, M. F. (1991). The genetic basis of transmission‐ratio distortion and maps of 11 Drosophila species: The order of genomic scaffolds in‐ male sterility due to the t complex. American Naturalist, 349–358. ferred from genetic and physical maps. Genetics, 179(3), 1601–1655. https://doi.org/10.1086/285168 https://doi.org/10.1534/genetics.107.086074 Lyon, M. F. (2003). Transmission ratio distortion in mice. Annual Stewart, M. (2007). Molecular mechanism of the nuclear protein import Review of Genetics, 37, 393–408. https://doi.org/10.1146/annurev. cycle. NatureReviews Molecular Cell Biology, 8(3), 195. https://doi. genet.37.110801.143030. org/10.1038/nrm2114 Mason, D. A., Fleming, R. J., & Goldfarb, D. S. (2002). Drosophila melano‐ Sun, C., Fu, G., Ciziene, D., Stewart, M., & Musser, S. M. (2013). gaster importin α1 and α3 can replace importin α2 during spermato‐ Choreography of importin‐α/CAS complex assembly and disassem‐ genesis but not oogenesis. Genetics, 161(1), 157–170. bly at nuclear pores. Proceedings of the National Academy of Sciences, Matsuura, Y., & Stewart, M. (2005). Nup50/Npap60 function in nu‐ 110(17), E1584–E1593. clear protein import complex disassembly and importin recycling. Tao, Y., Araripe, L., Kingan, S. B., Ke, Y., Xiao, H., & Hartl, D. L. (2007). A The EMBO Journal, 24(21), 3681–3689. https://doi.org/10.1038/ sex‐ratio meiotic drive system in Drosophila simulans. II: An X‐linked sj.emboj.7600843 distorter. Plos Biology, 5(11), e293. https://doi.org/10.1371/journal. Merrill, C., Bayraktaroglu, L., Kusano, A., & Ganetzky, B. (1999). pbio.0050293. Truncated RanGAP encoded by the segregation distorter locus of Tao, Y., Masly, J. P., Araripe, L., Ke, Y., & Hartl, D. L. (2007). A sex‐ratio Drosophila. Science, 283(5408), 1742–1745. https://doi.org/10.1126/ meiotic drive system in Drosophila simulans. I: An autosomal sup‐ science.283.5408.1742. pressor. Plos Biology, 5(11), e292. https://doi.org/10.1371/journal. Paczolt, K. A., Reinhardt, J. A., & Wilkinson, G. S. (2017). Contrasting pat‐ pbio.0050292. terns of X‐chromosome divergence underlie multiple sex‐ratio poly‐ The UniProt Consortium (2014). UniProt: A hub for protein infor‐ morphisms in stalk‐eyed flies. Journal of Evolutionary Biology, 30(9), mation. Nucleic Acids Research, 43(D1), D204–D212. https://doi. 1772. https://doi.org/10.1111/jeb.13140. org/10.1093/nar/gku989. Perlman, S. J., Spicer, G. S., Shoemaker, D. D., & Jaenike, J. (2003). Associations Untergasser, A., Cutcutache, I., Koressaar, T., Ye, J., Faircloth, B. C., between mycophagous Drosophila and their nematode Remm, M., & Rozen, S. G. (2012). Primer3—new capabilities and parasites: A worldwide phylogenetic shuffle. Molecular Ecology, 12(1), interfaces. Nucleic Acids Research, 40(15), e115–e115. https://doi. 237–249. https://doi.org/10.1046/j.1365-294X.2003.01721.x. org/10.1093/nar/gks596. PIEPER et al. | 5179

Van der Auwera, G. A., Carneiro, M. O., Hartl, C., Poplin, R., Del Angel, G., SUPPORTING INFORMATION Levy‐Moonshine, A., DePristo, M. A. (2013). From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices Additional supporting information may be found online in the pipeline. Current Protocols in Bioinformatics, 11(1110), 11 10 11–11 10 Supporting Information section at the end of the article. 33. https://doi.org/10.1002/0471250953.bi1110s43. Wright, S. I., & Charlesworth, B. (2004). The HKA test revisited: A max‐ imum‐likelihood‐ratio test of the standard neutral model. Genetics, How to cite this article: Pieper KE, Unckless RL, Dyer KA. A 168(2), 1071–1076. https://doi.org/10.1534/genetics.104.026500 fast‐evolving X‐linked duplicate of importin‐α2 is Yang, Z. (2007). Paml : 4 Phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution, 24(8), 1586–1591. https://doi. overexpressed in sex‐ratio drive in Drosophila neotestacea. Mol org/10.1093/molbev/msm088. Ecol. 2018;27:5165–5179. https://doi.org/10.1111/mec.14928 Zhang, Z., Li, J., Zhao, X. Q., Wang, J., Wong, G. K., & Yu, J. (2006). KaKs_Calculator: Calculating Ka and Ks through model selection and model averaging. Genomics Proteomics Bioinformatics, 4(4), 259–263. https://doi.org/10.1016/s1672-0229(07)60007-2.