YALE JOURNAL OF BIOLOGY AND MEDICINE 89 (2016), pp.499-512. REVIEW

What Drives Positive Selection in the Drosophila piRNA Machinery? The Genomic Autoimmunity Hypothesis

Justin P. Blumenstiel*, Alexandra A. Erwin, and Lucas W. Hemmer

Department of Ecology and , University of Kansas, Lawrence, KS

In animals, PIWI-interacting RNAs (piRNAs†) play a crucial role in defense. Moreover, because piRNAs can be maternally transmitted, they contribute to the epigenetic profile of inheritance. Multiple studies, especially in Drosophila, have demonstrated that the machinery of piRNA biogenesis is often the target of positive selection. Because transposable elements (TEs) are a form of genetic parasite, positive se- lection in the piRNA machinery is often explained by analogy to the signatures of positive selection com- monly observed in genes that play a role in host-parasite dynamics. However, the precise mechanisms that drive positive selection in the piRNA machinery are not known. In this review, we outline several mecha- nistic models that might explain pervasive positive selection in the piRNA machinery of Drosophila species. We propose that recurrent positive selection in the piRNA machinery can be partly explained by an ongoing tension between selection for sensitivity required by genome defense and selection for specificity to avoid the off-target effects of maladaptive genic silencing by piRNA.

INTRODUCTION a Red Queen evolutionary arms race has been proposed Transposable elements (TEs) are selfish elements to explain this signature of adaptive . Under this that encode the capacity to copy themselves across scenario, recurrent evolution in the piRNA machinery and they pose a significant threat in the form of would be driven by ongoing evolution on the part of TEs . Moreover, because of their repetitive nature, to evade piRNA silencing. In turn, the piRNA machinery they can mediate harmful chromosome rearrangements. would show a strong signature of adaptation driven by In response to these threats many utilize recurrent selection to maintain TE silencing. mechanisms of RNA silencing to limit TE proliferation. While plausible, no specific mechanism supports a In the animal gonad, a specialized form of RNA silenc- Red Queen model for the evolution of the piRNA ma- ing mediated by PIWI-interacting RNAs (piRNAs) plays chinery. In this review, we outline several mechanistic a critical role in maintaining TE repression across gener- models that might explain pervasive positive selection in ations. Strikingly, multiple studies, especially in the piRNA machinery of Drosophila species. Finally, we Drosophila, have demonstrated that the machinery of discuss the evolution of the piRNA machinery in light of piRNA biogenesis is often the target of recurrent posi- off-target effects. While most piRNAs are dedicated to tive selection. This signature is often explained by anal- TE silencing, recent studies have shown that there are ogy to the signatures of positive selection commonly significant off-target effects of gene silencing. Under observed in genes that play a role in host-parasite dy- varying circumstances, piRNAs can target and silence namics. Since TEs can be considered genomic parasites, genes as well as TEs. This indicates that there is a cost to

*To whom all correspondence should be addressed: Justin Blumenstiel, 1200 Sunnyside Ave., Lawrence, KS, tel: 785-218-2896, email: [email protected].

†Abbreviations: TE, Transposable element; piRNA, PIWI-interacting RNA; siRNA, small interfering RNA; miRNA, microRNA; RNAi, RNA interference.

Keywords: piRNA, transposable element, selfish elements, positive selection, Red Queen, host-parasite

Author Contributions: J.P.B wrote the paper. AAE and LWH contributed to analysis.

Copyright © 2016 499 500 Blumenstiel et al.: Positive selection in the piRNA machinery

TE control by piRNAs. We designate this form of off-tar- cancer have been found to be caused by insertion of L1 get gene silencing as "genomic autoimmunity." Genomic elements into the APC tumor suppressor [12]. TEs can autoimmunity is analogous to classic forms of autoimmu- also harm the host by mechanisms independent of inser- nity, which are caused by an immune response that incor- tional mutation. For example, overexpression of Alu RNA rectly targets self. In the case of genomic autoimmunity, can cause macular degeneration by activating the NLRP3 genes rather than TEs become the target of repression. We inflammasome [13]. Finally, due to their repetitive nature, propose that the strong evolutionary tension driven by ge- TEs pose a challenge to genome stability [14,15]. Ectopic nomic autoimmunity contributes to the signature of adap- recombination among dispersed repeat sequences can lead tive evolution observed in the piRNA machinery. We to significant chromosomal damage [16]. In fact, natural argue that recurrent positive selection in the piRNA ma- selection against chromosomal damage caused by ectopic chinery can be partly explained by ongoing tension that recombination is proposed to be a major force limiting leads to cycles of selection for sensitivity required by TEs from completely overwhelming genomes [14]. genome defense and selection for specificity to avoid the off-target effects of maladaptive genic silencing by piRNA. GENOME DEFENSE BY PIWI PROTEINS In the face of this threat, different mechanisms of genome defense have evolved to protect the genome from TRANSPOSABLE ELEMENTS: TEs. The primary modes of genome defense are chromatin ENDOGENOUS SELFISH MUTAGENS modification, DNA methylation and RNA silencing, Sexual reproduction plays an important role in pro- though there are others [17,18]. These modes of genome ducing variation that enables hosts to adapt to rapidly defense are mechanistically connected and also function evolving parasites [1]. Therefore, it is an evolutionary as epigenetic regulators of . In fact, many irony that sexual reproduction also establishes a condition of the epigenetic mechanisms discussed in this issue were ripe for exploitation by genetic parasites known as TEs likely derived from ancestral mechanisms of genome de- [2]. TEs come in two major types. Retrotransposons fense. (known as Class I elements) transpose via RNA that is re- Genome defense against TEs by RNA silencing is verse transcribed into DNA and inserted elsewhere into seen widely across eukaryotes. In plants, fungi, and meta- the genome. Class I retrotransposons are classified as long zoans, diverse pathways of genome defense by RNA si- terminal repeat (LTR) elements, which are similar to retro- lencing are centered on the Argonaute proteins [19]. viruses; LINE-like retrotransposons, which lack LTRs; Argonaute proteins are characterized by their capacity to and SINE-like elements, which do not code for reverse carry a small 20 to 30 nt guide RNA that mediates recog- transcriptase (RT) and instead hijack the RT from other nition of reverse complement RNA. Upon target recogni- families. DNA transposons (known as Class II elements) tion, Argonaute proteins have the capacity to slice the move via DNA intermediates. The most well understood recognized RNA. In this way, RNA silencing by Arg- DNA transposons encode transposases that "cut-and- onaute proteins can function as an adaptive immune sys- paste" insertions from one location to the next. If DNA tem. The slicing function of Argonaute proteins can target transposons move during DNA replication, copy number a wide array of harmful RNA species, both viruses and may increase if elements jump ahead of the replication TEs, via diverse guide RNAs. fork. A second class of DNA transposons, Helitrons, repli- Despite the core similarity of small RNA-based cate via rolling circle amplification. genome defense shared across eukaryotes, these pathways TEs are major determinants of variation in genome have significantly diversified. Some of this diversification architecture [3] and there is a positive relationship be- can be explained by the fact that multicellular animals, un- tween bulk TE content and genome size [4]. For example, like many other eukaryotes, commonly have a germline salamander genomes are notoriously large and much of specified early in development. This (with some excep- this can be explained by a great proportion of LTR TEs tions) results in a soma that does not contribute to the next [5,6]. It has also been estimated that nearly 70 percent of generation. This has important consequences for the evo- the human genome is comprised of repetitive sequences, lution of genome defense in animals. Because TEs are many of which are derived from TEs [7]. TEs can some- transmitted vertically through genomes passed to off- times be exapted for functions beneficial to the host. For spring, transposition in the soma does not confer an ad- example, TE sequences play critical roles in telomere vantage to a TE lineage. Instead, somatic harm caused by function in Drosophila [8] and have been recruited to me- TE insertions can only reduce the chance that genomic diate V(D)J recombination in jawed vertebrates [9]. copies will be represented in the next generation. For this Nonetheless, TEs are mostly harmful. In humans, TEs reason, in animals with an early specified germline, natu- can cause a variety of diseases [10]. For example, a case ral selection will act on TE lineages to be active only of haemophilia has been caused by an insertion of an L1 within the germline compartment. In turn, when TEs are retroelement [11]. In addition, several cases of colorectal primarily active within the germline, selection will favor Blumenstiel et al.: Positive selection in the piRNA machinery 501

genome defense mechanisms that are also specified to the mediated genome defense in somatic cells on the ovary germline. Consistent with this expectation, a unique mode exterior; the anti-sense piRNAs derived from the flamenco of genome defense has specialized within the reproduc- locus target gypsy, idefix and ZAM, as well as other ele- tive tissues of animals. This mode of genome defense is ments [25-28]. A unique property of these elements is that mediated by a specialized class of Argonaute proteins they can all be considered endogenous retroviruses with known as the PIWI proteins. the capacity to move between cells. For example, gypsy PIWI proteins are a specialized clade of Argonaute elements are known to form virus-like particles that can proteins that play a critical role in genome defense in an- cross cell membranes [29,30]. Thus, control of somatic el- imals and, for the most part, are only found in reproduc- ements by piRNAs derived from the flamenco locus can tive tissues [20,21]. Similar to other Argonautes, PIWI protect the germline against invasion of endogenous retro- proteins carry small guide RNAs and mediate a slicing re- viruses derived from flanking follicle cells. action of the target RNA. Guide RNAs loaded into PIWI Within the germline proper, a more complex pathway proteins are commonly derived from TEs and, when de- of piRNA biogenesis maintains genome defense. Here, rived from an anti-sense TE transcript, can target sense piRNAs are derived from dual-strand, rather than single TE mRNAs for destruction. PIWI proteins are also unique strand clusters. This leads to the production of both sense in how they receive their guide RNA. While other Arg- and anti-sense piRNAs that target sense TE mRNAs in a onaute proteins receive either siRNAs or miRNAs via biogenesis loop denoted as ping-pong biogenesis. Anti- Dicer proteins, PIWI proteins receive their guide RNAs sense piRNAs loaded into the Drosophila Aubergine pro- (known as PIWI-interacting RNAs, or piRNAs) from a tein destroy sense TE mRNAs by post-transcriptional gene complex machinery of piRNA biogenesis [21]. piRNAs silencing (PTGS). And while anti-sense piRNAs loaded are significantly longer than siRNAs and miRNAs, rang- into Drosophila Piwi protein can also mediate PTGS, they ing from 23 to 30 or more nucleotides. also enter the nucleus and target TE insertions for tran- How are anti-sense piRNAs generated from TEs, scriptional gene silencing (TGS). Since Drosophila Piwi rather than genes? To provide defense against only TEs, is a nuclear protein and found in both the somatic follicle has identified the Achilles heel of TE bi- cells and the germline, this form of transcriptional silenc- ology. Unlike other coding sequences in the genome, TEs ing is shared across these two compartments of the ovary. increase in number by copying themselves to other loca- In the germline, this form of transcriptional silencing is tions. Thus, TEs can be distinctly recognized by their ca- also transmitted to the next generation via maternally pro- pacity to land in different regions of the genome. TEs that visioned of Piwi-piRNA complexes. transpose into a piRNA "trap", known as a piRNA cluster, are thus recognized. piRNA clusters typically reside in the boundary between euchromatin and heterochromatin and piRNA SILENCING AS AN EPIGENETIC PHENOMENON have distinct chromatin signatures. They are also replete with diverse TE fragments, representing the history of TE Before the discovery of piRNA silencing, syndromes invasion within a species. TE sequences, residing in these of hybrid dysgenesis demonstrated an important connec- piRNA clusters, then become the source of anti-sense TE tion between genome defense and epigenetic control of piRNAs. Loaded into PIWI-proteins, they can mediate re- gene expression. This is because TEs inherited only via pression of sense TE mRNAs. the paternal germline were known to become activated in The mechanisms underlying piRNA cluster designa- progeny and cause sterility. Genetically identical recipro- tion, transcript processing and piRNA biogenesis have cal progeny, with maternally inherited TEs, were com- been extensively reviewed elsewhere [20-24]. An pletely fertile. These observations indicated that while overview of the somatic and germline pathways, as un- sperm failed to maintain TE control, the maternal lineage derstood in female Drosophila, is shown in Figure 1. maintained TE repression across generations. This mater- Many components of the piRNA biogenesis pathway nal effect was designated a 'cytotype' [31,32] and it is now evolve rapidly and this rapid evolution is also demon- known that the cytotype is the presence of piRNAs tar- strated in the figure. Several salient features of piRNA geting strain-specific TE populations [33]. Since females biogenesis should be noted. First, genome defense by exclusively transmit the cytoplasm in the egg, piRNAs piRNA is specialized across different parts of the female that target TEs in the germline are transmitted alongside gonad. Within the somatic follicle cells that surround the genomic TE insertions in females. In contrast, piRNAs are ovary, piRNAs are derived from loci that produce single- not transmitted through male sperm and TE families only stranded anti-sense transcripts. These yield piRNAs that residing in the paternal genome find themselves in an egg are found exclusively in Piwi, the sole Drosophila PIWI from a mother lacking piRNAs that target the TE family. protein expressed in these cells. The most well understood In this way, the female germline maintains TE repression locus that is the source of these somatic piRNAs is the fla- across generations in an epigenetic fashion. menco locus. Examination of the TE fragments that com- The way TEs are silenced at the transcriptional level prise the flamenco locus explains the existence of piRNA also has connections to epigenetic control of gene expres- 502 Blumenstiel et al.: Positive selection in the piRNA machinery

Figure 1. The piRNA biogenesis and silencing pathway. dN/dS estimates, indicated with varying intensity of red, be- tween D. melanogaster and D. simulans piRNA pathway components were obtained from the flyDIVaS web server [81], with the exception of estimates for ago3, deadlock, Yb, maelstrom, rhino, squash, vasa, and valois. dN/dS esti- mates for these missing genes were obtained by through our analysis.

sion. Nuclear piRNAs, in complex with Piwi, mediate local genic chromatin states, with genes closer to TEs transcriptional silencing by directing methylation more likely to yield genic piRNAs and enrichment for si- at TE insertion sites. This transcriptional silencing is fa- lencing by H3K9 tri-methylation [42,43]. For these rea- cilitated by asterix/DmGTSF-1 [34-36] and panoramix/Si- sons, piRNA mediated genome defense can be considered lencio [37,38]. While these mechanisms of transcriptional an important epigenetic regulator of gene expression. In silencing can efficiently maintain TE repression, there are this way, piRNA-mediated genome defense is costly, also significant off-target effects of piRNA mediated tran- since genes important for host fitness can be caught in the scriptional silencing on genes. Off-target gene silencing crossfire by this form of "genomic autoimmunity". We by piRNA can be considered a form of "genomic autoim- propose that the cost of genomic autoimmunity con- munity" because the piRNA machinery targets self rather tributes to the pervasive signature of positive selection in than non-self. For example, in some strains of D. virilis, the piRNA machinery. the center divider gene has become silenced by being con- verted into a piRNA cluster, presumably caused by flank- ing TE sequences located within the telomere [39-41]. POSITIVE SELECTION IN THE piRNA MACHINERY This conversion is associated with local H3K9 tri-methy- lation, a mark of heterochromatin. It has been further Despite the essential function of piRNA mediated shown that TE insertions exert an epigenetic effect on genome defense, a number of studies, especially in Blumenstiel et al.: Positive selection in the piRNA machinery 503

Table 1. Previous studies that have identified signatures of positive selection in the Drosophila piRNA machinery. Study Species Test of Neutrality Genes w/significant Genes w/non- tests significant tests

Lee and Langley [44] mel and/or sim MK test armi, aub, krimper, ago3, hen1, mael, piwi, spn-E, vasa rhino, squash, zuc

Vermaak et al. [45] mel, sim Sliding window rhino N/A dN/dS, MK test

mel to pseudo PAML: M1 vs M2, M7 rhino N/A vs. M8

Lewis et al. [46] Nematocera, PAML: M8a vs M8 N/A/ ago3, piwi/aub Brachycera

Heger and Ponting [47] mel, sim, yak, ere, SLR spn-E N/A ana

Obbard et al. [49] mel, sim MK test aub, armi N/A

Kolaczkowski et al. [50] mel SweepFinder armi, aub, hen1, piwi, ago3, krimper, mael rhino, spn-E, squash, zuc

12 species PAML: Branch-sites aub, krimper, spn-E, ago3, armi, mael, hen1, squash, zuc piwi, rhino

Obbard et al. [48] mel, sim MK test armi, aub, piwi, mael, ago3, zuc, squash, spn-E hen1, krimper

Simkin et al. [52] mel aub, armi, rhino, spn- ago3, krimper, piwi, E, vasa, zuc squash

mel,sim MK test aub, armi, krimper, ago3, piwi, rhino, spn-E, vasa squash, zuc

mel, sim, sech, PAML: M7 vs. M8 ago3, aub, piwi, rhino armi, krimper, spn-E, yak, ere, ana squash, vasa, zuc

Drosophila, have demonstrated that the machinery of tests identify positive selection across different time piRNA biogenesis and function is rapidly evolving [44- scales. 52]. Moreover, this rapid evolution can largely be ex- What explains this pervasive signal of positive selec- plained by the action of positive selection. The first of tion in the piRNA machinery? Since TEs can be consid- these studies identified a strong signature of positive se- ered genomic parasites, an analogy is frequently made to lection in the rhino protein, even before piRNAs were the adaptation often observed in immune systems. This known [45]. Since this study, multiple studies have used model proposes that hosts and TEs are in an evolutionary diverse approaches to characterize natural selection on the arms race and positive selection is driven by Red Queen piRNA machinery at different time scales. These range dynamics of adaptation and counter-adaptation. Most de- from strict population-based studies of patterns of linked scriptions of the Red Queen process describe it as a co- polymorphism to McDonald-Kreitman based tests of evolutionary process. In the case of host-parasite polymorphism and divergence to molecular evolutionary relationships, positive selection on host immune systems based tests of synonymous and non-synonymous substi- will occur when parasites evolve strategies to evade the tution. Table 1 indicates that this signature of adaptation host immune system. In turn, selection favors immune is pervasive across the machinery of piRNA biogenesis. In strategies that maintain control of parasites. While this some cases, such as for spn-E, diverse and independent analogy to other host-parasite dynamics seems a plausi- 504 Blumenstiel et al.: Positive selection in the piRNA machinery

Figure 2. Models for TE evasion of the piRNA machiney. A. General pathway for piRNA silencing with cluster insert that becomes source of anti-sense piRNA, followed by transcriptional and post-transcriptional gene silencing. B. Di- rect antagonism. A TE encoded antagonist of a piRNA effector. Here, a TE encoded protein (red) interferes with a PIWI protein (blue) directed slicing reaction. An evolutionary change in the PIWI protein (blue) allows it to silence the mRNA of the TE. C. Evasion. Specialized localization of the TE mRNA (either temporal or physical) allows the TE mRNA to avoid being a target. Separation in space or time is indicated with the green line. Adaptation in the PIWI pro- tein (blue) allows it re-localize and silence the TE mRNA. D. Cluster evasion. A newly arriving TE proliferates within the genome, but avoids inserting into a cluster. A change in protein sequence (perhaps in rhino, as has been pro- posed [52]) facilitates TE insertion into the cluster, leading to subsequent piRNA silencing of the TE family. ble explanation for recurrent positive selection in the Here, a TE-encoded protein may directly antagonize a piRNA machinery, it lacks a clear mechanistic explana- PIWI protein and disrupt the association of the PIWI pro- tion. tein with a guide piRNA. This would lead to global loss of TE piRNA silencing by the target protein. We have pro- posed that TE-encoded suppressors of piRNA silencing MECHANISMS TO EXPLAIN POSITIVE may explain the general release of diverse elements dur- SELECTION IN THE piRNA MACHINERY ing hybrid dysgenesis [53]. Viruses in plants and animals If positive selection is driven by a canonical host-par- are known to antagonize RNA silencing pathways in this asite Red Queen process, then TEs and the host piRNA fashion. In plants, virally encoded suppressors of RNA si- machinery must be co-evolving. Specifically, adaptation lencing (VSRs) can act through diverse mechanisms that in TE lineages must enable TE evasion of piRNA silenc- include sequestering siRNAs and directly antagonizing ing, followed by corresponding adaptation in the piRNA Argonaute proteins [54-56]. Different modes of VSR ac- machinery to the new TE variant. Several mechanisms of tion, including the antagonism of Argonaute proteins and TE evasion are plausible and described in Figure 2. Im- their corresponding slicer activity, are also observed in in- portantly, since we are mostly concerned with evolution of sect viruses [57-59]. Natural selection to escape VSRs has the piRNA machinery itself, we restrict our discussion to been proposed to explain a strong signature of positive se- adaptation in protein sequence. lection in the RNAi machinery that is devoted to defense One possibility is that TEs directly antagonize the against viruses [48,60]. Thus, if TEs encode suppressors piRNA machinery. This is demonstrated in Figure 2B. of piRNA silencing, natural selection on the piRNA ma- Blumenstiel et al.: Positive selection in the piRNA machinery 505

chinery to avoid TE antagonism may drive the pervasive localize to regions just flanking the oocyte nucleus within signature of positive selection [52]. the developing egg chamber [66]. This shows that natural To date, however, there are few examples of TEs that selection may act on TE lineages to ensure patterns of lo- encode anti-silencing function. One explanation for this calization that enable proliferation. Similar mechanisms contrast with viruses is that TE encoded suppressors of may also enable evasion of piRNA silencing. Such phys- the piRNA machinery may have a net negative effect on ical evasion of piRNA silencing by TEs may lead to pos- TE fitness. Virally encoded suppressors of RNA silenc- itive selection on piRNA proteins to target new ing may reduce host fitness by increasing susceptibility to compartments within the germline. This form of selection other viruses, but this cost may not significantly burden is borne out by the fact that Aubergine and Piwi proteins the virus lineage encoding the VSR. A weakened host are paralogs that have evolved specialized localization with a higher viral titer may be tolerable to the viral line- within the germline and the soma of the ovary, as well as age since viruses have the capacity to move from dying the cytoplasm and the nucleus [25]. hosts to healthy ones. TEs do not have this luxury. A TE Lastly, evasion by TEs may simply happen by avoid- encoded suppressor of piRNA silencing that releases ing entry into the piRNA pathway altogether (Figure 2D). global TE control may only reduce host fertility, thereby TEs become targets for piRNA biogenesis by inserting reducing the chance of transmission to the next genera- into designated piRNA clusters. Dual strand clusters are tion. Therefore, if TE-encoded suppressors of piRNA si- designated by the Rhino-Deadlock-Cutoff (RDC) com- lencing exist, it seems most likely that they would act on plex [67-69]. Therefore, if TEs are capable of moving specialized, rather than general components of the piRNA freely within the genome without landing in any of these machinery. dual strand clusters, they may avoid detection by the The absence of known TE-encoded suppressors of piRNA machinery altogether. This may place strong se- piRNA silencing may also be explained by lower adap- lection on the RDC complex to entice TE insertions into tive capacity of TEs, as compared to viruses. Viruses can these dual strand clusters [52,67]. In fact, even before piR- achieve very large population sizes. However, at least in NAs were known, a model of TE entrapment was pro- Drosophila where the signatures of positive selection in posed to explain positive selection on rhino [45]. the piRNA pathway are pervasive, there may only be dozens of TE copies of each TE family per genome. More- over, since each copy likely jumps about every 100 to CONSIDERATIONS OF MODES OF ADAPTATION IN THE piRNA MACHINERY 10,000 generations [61,62], the functional effective pop- ulation size of the TE lineage (which measures the adap- A standard model of Red Queen driven adaptation in tive capacity) may be smaller than the host itself. Thus, an immune system proposes that both host and parasite compared to viruses, TEs may have less capacity to evolve co-evolve with each other. As mentioned earlier, and in suppressors of piRNA silencing. Nonetheless, the piRNA contrast to viruses, the capacity of a TE lineage to evolve machinery might experience positive selection not through to evade the piRNA machinery may be limited. Therefore, its function in genome defense, but through pleiotropic a co-evolutionary dynamic may not directly influence the function to protect against viruses. In mosquitoes, the evolution of the piRNA machinery. However, it is well piRNA machinery appears to play an important role in so- known in Drosophila that genomes are constant targets to matic defense against viruses [63,64]. However, in D. horizontal invasion by TEs [70-72]. For example, the P melanogaster, recent studies have shown that the piRNA element likely became established in D. melanogaster machinery provides little protection against viral challenge through horizontal transfer from D. willistoni [73]. There- [65]. Alternately, positive selection in the piRNA ma- fore, adaptation in the piRNA machinery may not be chinery may be driven by the indirect effects of viral sup- driven by newly evolved strategies of resident TEs. pressors of RNA silencing that serendipitously target the Rather, adaptation in the piRNA machinery may be driven piRNA machinery [50]. Overall, future studies are needed by novel strategies of transposition carried by horizontally to formally test whether TEs or viruses can antagonize the transferred TEs. TE diversity itself has been proposed to piRNA machinery directly. be a driver of adaptation in the piRNA machinery [44,51]. While positive selection in response to TE mediated This seems especially plausible given our recent simula- suppression of genome defense may be an unlikely ex- tion study demonstrating that horizontal transfer may be a planation for adaptive evolution of the piRNA machinery, more important determinant of TE abundance and diver- several other mechanisms are plausible. Figure 2C demon- sity than drift [74]. strates that TE evasion of the piRNA machinery may be We have performed a preliminary test of this hypoth- achieved through temporal or positional regulation within esis [75]. If TE diversity was a key determinant of adap- the germline. TEs are known to carry regulatory sequences tation in the piRNA machinery, one might expect that the that can drive exquisite localization within the germline. piRNA machinery might evolve more quickly in species For example, the I, G2 and jockey elements all carry of Drosophila with a greater TE load. However, in a mo- gurken-like mRNA localization signals that allow them to lecular evolutionary analysis of rates of evolution in the 506 Blumenstiel et al.: Positive selection in the piRNA machinery

piRNA machinery across different species of Drosophila, that determine specificity and sensitivity lead to natural we found the opposite to be true. In species with a higher selection that propagates through the entire network of TE burden, and also greater TE diversity, we found that piRNA biogenesis. the piRNA machinery is evolving more slowly. However, In Figure 1, we identify the fastest evolving compo- we found that there is a striking relationship between TE nents of the piRNA machinery in a comparison between abundance and codon bias in the piRNA machinery [75]. D. melanogaster and D. simulans using results provided It is widely known that highly expressed genes frequently from the flyDIVAS database (http://igem.temple.edu/fly- show a greater degree of codon bias [76,77]. In species divas/index.html) [81]. This analysis shows that the fastest with a greater TE burden, codon bias in the piRNA ma- components are not the effectors of piRNA silencing chinery (but not the rest of the genome) is increased. This (aubergine and piwi). Rather, they are krimper, rhino, suggests that the primary response to increased TE abun- deadlock, cutoff and Yb. The fast evolution of rhino, dead- dance is not an increased rate of adaptive evolution. lock and cutoff is striking since these code for the three Rather, an increased TE burden appears to select for in- proteins that comprise the Rhino-Deadlock-Cutoff (RDC) creased function of the piRNA machinery. This result sug- complex [68,82]. The RDC complex plays an essential gests an alternate model to explain the rapid and adaptive role in determining the identity of dual-strand clusters that evolution of the piRNA machinery. are the source of piRNAs in the germline. We propose that fast evolution of proteins in the RDC complex is driven by an especially strong tension over the costs of transcrip- THE GENOMIC AUTOIMMUNITY HYPOTHE- tional genic silencing that can spread into flanking genes; SIS FOR POSITIVE SELECTION IN THE in contrast to PTGS, TGS is extremely dangerous. While piRNA MACHINERY off-target effects of cytoplasmic gene silencing can lead to Genomic autoimmunity is likely to be a significant the destruction of a single mRNA molecule, TGS can lead cost of genome defense [42,43,78]. And because piRNA to the silencing of the sole genomic copies that the cell re- silencing can be transmitted as a maternal effect, this col- lies on. Natural selection against the effects of TE medi- lateral silencing can be transmitted across generations as ated TGS on neighboring genes has been seen in both corn a "paramutation" in the absence of the original piRNA [78] and Drosophila [43]. A dual-strand cluster may be source [39-41,79,80]. This may lead to a popula- especially potent in this respect. Thus, if TEs become tion-level "paramutation load". Further insight into this more common within a species, natural selection may act tension is provided by studies of hybrid dysgenesis caused on components of the RDC complex to become more dis- by TEs. As in the case of D. melanogaster [31], investi- criminating in their capacity to define dual-strand clusters. gations of sterility induction in D. virilis have shown a How might fast evolution in some components of the strong relationship between the dose of paternal chromo- piRNA machinery drive pervasive adaptation across the somes carrying TEs and the induction of sterility [41]. whole pathway? We further propose that the key evolu- Correspondingly, there is also a strong dose effect of ma- tionary innovations in proteins that modulate the balance ternal chromosomes carrying TE insertions that produce between specificity and sensitivity also select for com- piRNAs [53]. These studies of hybrid dysgenesis suggest pensatory changes throughout the machinery of piRNA that the paternal dose of inherited TEs must be appropri- biogenesis. A propagating consequence of adaptive di- ately matched with a maternal piRNA dose that maintains vergence in the piRNA machinery has also been proposed TE silencing. These dosage effects can explain our obser- by others [50,83]. If adaptive changes lead to selection for vation that the piRNA machinery evolves increased codon compensatory changes in other components of the piRNA bias in species of Drosophila with higher TE burden [75]. machinery, we would expect that components of the Species with a higher TE burden presumably experience piRNA machinery show a correlated pattern of amino acid increased selection for an increased dose of piRNA de- divergence across species. In particular, we would expect fense and this may be facilitated by increased codon bias that this correlated divergence would be greater than that in the piRNA machinery. expected to arise simply due to the effects of changing de- It seems likely that an increased dose of the piRNA mography and population size. We can test this hypothe- machinery comes at a cost of increased off-target gene si- sis by performing an Evolutionary Rate Covariation lencing. We propose that ongoing fluctuation in the TE analysis of piRNA biogenesis factors [84-86] (Figure 3). load leads to cycles of oscillating selection favoring either In this analysis, we utilized the Evolutionary Covariation specificity or sensitivity in piRNA-mediated genome de- Rate webserver (http://csb.pitt.edu/erc_analysis/) to de- fense. Rather than being driven by co-evolution and adap- termine the level of evolutionary covariation of the piRNA tation in TE lineages, we propose this oscillating mode of machinery within the 12 well sequenced species of the selection is driven by fluctuating TE exposure by hori- Drosophila genus. This method determines branch-spe- zontal transfer. In this way, natural selection drives the cific levels of amino acid divergence and tests whether machinery continuously, as both TE load increases and gene-by-gene pairwise divergence covaries. Statistical sig- decreases. We further propose that evolution in factors nificance is determined empirically against the rest of the Blumenstiel et al.: Positive selection in the piRNA machinery 507

Figure 3. Evolutionary Rate Covariation analysis of piRNA proteins across the Drosophila genus. Results were ob- tained from [84]. Proteins in the middle of the table share the greatest signature of co-evolution. Evolutionary Rate Covariation (ERC) analysis measures the degree to which changes in one protein (relative to background) are corre- lated with changes in another (also relative to background). The analysis is performed by estimating branch specific amino acid divergence across a phylogeny. Here, branch specific amino acid divergence was estimated for 12 se- quenced species of the Drosophila genus. Significance is measured relative to the genomic background of all pair- wise covariation estimates. Some proteins are not available due to lack of clear orthologs in divergent species. Above the diagonal: ERC values. Values range from -1 to 1, with values closer to 1 indicate higher levels of covariation. Red intensity above diagonal scales with strength of correlation. Below the diagonal: P values. P values were determined empirically relative to background. Red intensity below diagonal scales with degree of significance. Significant co- variation across the piRNA machinery is demonstrated by a Z-score of observed P values being equal to -15.4 (P << .001).

genome-wide pairwise correlations. If positively selected tion between sensitivity and specificity? One possible way amino acid substitutions in some components of the of testing this is to examine the evolution of piRNA piRNA machinery lead to selection for compensatory lengths. Little is known about the specificity of piRNA changes in other components, we would expect to see a targeting and how many mismatches may be tolerated be- significant pattern of evolutionary covariation in the tween piRNA and target [87]. However, it is reasonable to piRNA machinery. While some genes, such as rhino and propose that piRNAs of increasing length have increased deadlock, are unfortunately excluded from this analysis specificity. In an analysis of oxidized small RNA se- due to their extreme divergence, there is strong evidence quencing data obtained from wild type Drosophila for co-evolution among piRNA biogenesis factors (Con- melanogaster [88], we determined the number of tran- trast to global pairwise ERC values, P << .001). Perhaps scripts targeted by ~4,000,000 26 bp piRNAs and com- most interesting and also perplexing, the effector proteins pared this to the number of transcripts targeted by the Piwi and Aubergine show only a weak signal of co-evo- same pool of piRNA with one base removed from the 3' lution with other factors. end (anti-sense mappers with no mismatches). We found Is there a way to explicitly test the hypothesis that that the 25 bp small RNAs could potentially target ~8 per- fluctuating TE loads can lead to oscillating natural selec- 508 Blumenstiel et al.: Positive selection in the piRNA machinery

Table 2. Comparison of piRNA length between D. virilis and D. melanogaster. Mode was the most com- mon length class among small RNAs 23 to 30 nt, post adapter trimming and filtering against ncRNA, miRNA, miscRNA and tRNA. Only including one of each replicate library for the comparison (indicated with * for each library included), the difference in small RNA mode is significant (p < .01, t test among modes). § indicates a library generated by oxidation by sodium periodate.

Species Sample Tissue Source Small RNA mode

Drosophila virilis Strain 160 (rep 1) 0-2 hr embryos Erwin et al. [41] 26*

Drosophila virilis Strain 160 (rep 2) 0-2 hr embryos Erwin et al. [41] 26

Drosophila virilis Strain 9 (rep 1) 0-2 hr embryos Erwin et al. [41] 26*

Drosophila virilis Strain 9 (rep 2) 0-2 hr embryos Erwin et al. [41] 26

Drosophila virilis wt follicle cells Chirn et al. [93] 27*

Drosophila virilis wt (rep 1) 0-4 hr unfertilized embryos Ninova et al.[96] 27*

Drosophila virilis wt (rep 2) 0-4 hr unfertilized embryos Ninova et al.[96] 27

Drosophila virilis Strain 9 0-2 hr embryos Rozhkov et al. [97] 26*

Drosophila melanogaster Oregon-R follicle cells Chirn et al. [93] 26*

Drosophila melanogaster wt (rep 1) 0-4 hr unfertilized embryos Ninova et al.[96] 26*

Drosophila melanogaster wt (rep 2) 0-4 hr unfertilized embryos Ninova et al.[96] 26

Drosophila melanogaster Ral 437 ovaries§ Song et al.[98] 26*

Drosophila melanogaster Ral 375 ovaries§ Song et al.[98] 26*

Drosophila melanogaster Ral 714 ovaries§ Song et al.[98] 25*

Drosophila melanogaster Ral 707 ovaries§ Song et al.[98] 25*

Drosophila melanogaster Ral 427 ovaries§ Song et al.[98] 25*

Drosophila melanogaster Ral 313 ovaries§ Song et al.[98] 25*

Drosophila melanogaster Ral 732 ovaries§ Song et al.[98] 25*

Drosophila melanogaster Ral 391 ovaries§ Song et al.[98] 25*

cent more transcripts (2284 transcripts vs. 2477 tran- piRNAs that are bound to MIWI2 and likely directly scripts). methylate TE insertions are longer than piRNAs that are The inference that longer piRNAs have increased cytoplasmic and bound to MILI [89]. Strikingly, this pat- specificity is also supported by several aspects of RNA si- tern of silencing RNAs that target chromatin being longer lencing biology. As previously mentioned, TGS is signif- than cytoplasmic small RNAs is also observed in plants. icantly more dangerous for the host than PTGS. In In plants, there are two classes of small RNAs (plants lack accordance with this hypothesis, piRNAs, which can me- piRNAs). 21/22 nt small RNAs mediate PTGS whereas diate TGS, are longer than miRNAs and siRNAs. More- 24 nt small RNAs have a primary role in RNA-directed over, in Drosophila, the class of piRNAs that directly DNA methylation and TGS [90,91] (but see [92]). Again, mediate TGS (those bound to Piwi protein in the nucleus) it appears that small RNAs with primary roles in nuclear are longer than those with exclusive roles in the cytoplasm silencing are longer. This is consistent with the hypothe- (those bound to Aubergine) [25]. A similar difference is sis that costs associated with off-target TGS require addi- observed in mice among pre-pachytene piRNAs. Nuclear tional specificity. Blumenstiel et al.: Positive selection in the piRNA machinery 509

Based on these observations, one might predict that CONCLUSIONS AND OUTLOOK natural selection would favor increasing specificity of We propose that the pervasive signature of positive piRNA silencing when the genomic TE burden is large. selection in the piRNA machinery is in part driven by se- A true phylogenetic comparative analysis of piRNA size lection that fluctuates alternately for sensitivity and speci- and TE abundance has yet to be performed, but some in- ficity in genome defense. An ever changing TE teresting trends arise when comparing available piRNA assemblage, dynamically driven by horizontal transfer, profiles across species. In Drosophila, the most common may lead to ongoing positive selection. This would not piRNAs are 25 to 26 nts. However, in mammals ranging represent a formal Red Queen process, because TE line- from macaques and marmosets to mice and pigs, piRNAs ages would not co-evolve with the piRNA machinery. are considerably longer, with the most common piRNAs Rather, the piRNA machinery would evolve in response to being 28 to 30 nts [93]. This coincides with the fact mam- a fluctuating TE burden. This hypothesis is supported by malian genomes carry a significantly greater TE burden. the current observation that species with a greater TE bur- This trend of greater TE burden being associated with den have piRNA length distributions that favor specificity longer piRNAs is also observed more narrowly within over sensitivity. Dipterans. For example, mosquito genomes carry a sig- It is important to consider that evolution can happen nificantly greater burden of TEs compared to Drosophila, at both the protein sequence level and at the expression and piRNAs in mosquitoes are longer than those in level. Our examination of codon bias suggests that in- Drosophila [94,95]. Even more taxonomically narrow, creasing expression of the piRNA machinery may be an- within the genus Drosophila, this pattern holds. D. virilis other adaptive response to increasing TE burden. This has a significantly greater TE burden compared to D. hypothesis has yet to be tested experimentally. In fact, melanogaster and D. virilis has a correspondingly longer changing size distributions of piRNAs could be driven ei- piRNA profile (Table 2). Together, this overall pattern ther by altered mechanisms of piRNA biogenesis, or suggests that increased TE burden may select for in- changing expression levels of different Piwi proteins. For creased specificity of the piRNA machinery. This is likely example, an increased size distribution of piRNAs in D. mediated by changes in the proteins that comprise the virilis could be explained either by protein evolution lead- piRNA biogenesis pathway and define piRNA length dis- ing to preference for longer piRNAs in Piwi or increased tributions. expression of Piwi, which carries longer piRNAs com- This signature supports a model of piRNA machin- pared to Aubergine and Ago3 [25]. A test of the genomic ery evolution driven by an ongoing tension between suf- autoimmunity hypothesis for positive selection in proteins ficient genome defense and the costs of off-target gene of the piRNA machinery will require teasing apart these silencing. As TE burden fluctuates within a species, we two modes of adaptive evolution. propose the piRNA machinery must be constantly fine- tuned. In this case, natural selection will follow a mode of Acknowledgments: This work was funded by NSF MCB- 1413532 and the University of Kansas positive selection when the TE burden increases but also when the TE burden decreases, driving protein sequence REFERENCES evolution in specific components of the piRNA machin- 1. Morran LT, Schmidt OG, Gelarden IA, Parrish RC, 2nd, ery. In turn, this would lead to cascading positive selection Lively CM. Running with the Red Queen: host-parasite co- on other parts of the machinery to compensate for these evolution selects for biparental sex. Science. changes. This correlated response across diverse compo- 2011;333(6039):216-8. nents of the piRNA machinery is supported by the strong 2. Hickey DA. Selfish DNA: A sexually-transmitted nuclear par- asite. Genetics. 1982;101(3-4):519-31. signature of correlated evolution. 3. Gaut BS, Ross-Ibarra J. Perspective - Selection on major com- A weakness of the argument outlined here is that ev- ponents of angiosperm genomes. Science. idence for the tension between specificity and sensitivity 2008;320(5875):484-6. lacks a phylogenetic perspective. To test this general hy- 4. Munoz-Diez C, Vitte C, Ross-Ibarra J, Gaut BS, Tenaillon MI. Using nextgen sequencing to investigate genome size pothesis, it will be critical to use more phylogenetically variation and transposable element content. Plant transpos- informed approaches to the analysis of evolution of the able elements: Springer: Berlin Heidelberg, 2012. p. 41-58. piRNA machinery. For example, future studies must test 5. Metcalfe CJ, Casane D. Accommodating the load: The trans- posable element content of very large genomes. Mob Genet whether a reduced TE load coincides with changes in sen- Elements. 2013;3(2):e24775. sitivity in the form of smaller piRNAs. It will also be crit- 6. Sun C, Shepard DB, Chong RA, Lopez Arriaza J, Hall K, Cas- ical to perform functional analysis of the amino acid toe TA, et al. LTR retrotransposons contribute to genomic changes that appear to have been driven by selection for gigantism in plethodontid salamanders. Genome Biol Evol. 2012;4(2):168-83. increased or decreased specificity. The acid test of any 7. de Koning AP, Gu W, Castoe TA, Batzer MA, Pollock DD. proposed explanation for recurrent positive selection in Repetitive elements may comprise over two-thirds of the the piRNA machinery will be to identify the functional human genome. PLoS Genet. 2011;7(12):e1002384. consequences of amino acid substitutions that have been 8. Casacuberta E, Pardue ML. Transposon telomeres are widely distributed in the Drosophila genus: TART elements in the fixed by natural selection. 510 Blumenstiel et al.: Positive selection in the piRNA machinery

virilis group. Proc Natl Acad Sci U S A. 2003;100(6):3363- 29. Chalvet F, Teysset L, Terzian C, Prud'homme N, Santamaria 8. P, Bucheton A, et al. Proviral amplification of the Gypsy en- 9. Huang S, Tao X, Yuan S, Zhang Y, Li P, Beilinson HA, et al. dogenous retrovirus of Drosophila melanogaster involves Discovery of an Active RAG Transposon Illuminates the env-independent invasion of the female germline. EMBO J. Origins of V(D)J Recombination. Cell. 2016;166(1):102-14. 1999;18(9):2659-69. 10. Hancks DC, Kazazian HH. Roles for retrotransposon inser- 30. Syomin BV, Fedorova LI, Surkov SA, Ilyin YV. The en- tions in human disease. Mobile DNA. 2016;7(1):1-28. dogenous Drosophila melanogaster retrovirus gypsy can 11. Kazazian HH, Jr., Wong C, Youssoufian H, Scott AF, propagate in Drosophila hydei cells. Mol Gen Genet. Phillips DG, Antonarakis SE. Haemophilia A resulting from 2001;264(5):588-94. de novo insertion of L1 sequences represents a novel mech- 31. Engels WR. Hybrid dysgenesis in Drosophila melanogaster anism for mutation in man. Nature. 1988;332(6160):164-6. - Rules of inheritance of female sterility. Gen Res. 12. Miki Y, Nishisho I, Horii A, Miyoshi Y, Utsunomiya J, Kin- 1979;33(3):219-36. zler KW, et al. Disruption of the APC gene by a retrotrans- 32. Kidwell MG. Hybrid dysgenesis in Drosophila melanogaster posal insertion of L1 sequence in a colon cancer. Cancer Res. - The genetics of cytotype determination in a neutral strain. 1992;52(3):643-5. Genetics. 1981;98(2):275-90. 13. Tarallo V, Hirano Y, Gelfand BD, Dridi S, Kerur N, Kim Y, 33. Brennecke J, Malone CD, Aravin AA, Sachidanandam R, et al. DICER1 loss and Alu RNA induce age-related macu- Stark A, Hannon GJ. An Epigenetic Role for Maternally In- lar degeneration via the NLRP3 inflammasome and MyD88. herited piRNAs in Transposon Silencing. Science. Cell. 2012;149(4):847-59. 2008;322(5906):1387-92. 14. Charlesworth B, Langley CH. The population genetics of 34. Donertas D, Sienski G, Brennecke J. Drosophila Gtsf1 is an Drosophila transposable elements. Annu Rev Genet. essential component of the Piwi-mediated transcriptional si- 1989;23:251-87. lencing complex. Genes Dev. 2013;27(15):1693-705. 15. Langley CH, Montgomery E, Hudson R, Kaplan N, 35. Ohtani H, Iwasaki YW, Shibuya A, Siomi H, Siomi MC, Charlesworth B. On the role of unequal exchange in the con- Saito K. DmGTSF1 is necessary for Piwi-piRISC-mediated tainment of transposable element copy number. Genet Res. transcriptional transposon silencing in the Drosophila ovary. 1988;52(3):223-35. Genes Dev. 2013;27(15):1656-61. 16. Montgomery EA, Huang SM, Langley CH, Judd BH. Chro- 36. Muerdter F, Guzzardo PM, Gillis J, Luo Y, Yu Y, Chen C, mosome rearrangement by ectopic recombination in et al. A genome-wide RNAi screen draws a genetic frame- Drosophila melanogaster: genome structure and evolution. work for transposon control and primary piRNA biogenesis Genetics. 1991;129(4):1085-98. in Drosophila. Mol Cell. 2013;50(5):736-48. 17. Muckenfuss H, Hamdorf M, Held U, Perkovic M, Lower J, 37. Sienski G, Batki J, Senti KA, Donertas D, Tirian L, Meixner Cichutek K, et al. APOBEC3 proteins inhibit human LINE- K, et al. Silencio/CG9754 connects the Piwi-piRNA com- 1 retrotransposition. J Biol Chem. 2006;281(31):22161-72. plex to the cellular heterochromatin machinery. Genes Dev. 18. Jacobs FMJ, Greenberg D, Nguyen N, Haeussler M, Ewing 2015;29(21):2258-71. AD, Katzman S, et al. An evolutionary arms race between 38. Yu Y, Gu J, Jin Y, Luo Y, Preall JB, Ma J, et al. Panoramix KRAB zinc-finger genes ZNF91/93 and SVA/L1 retrotrans- enforces piRNA-dependent cotranscriptional silencing. Sci- posons. Nature. 2014;516(7530):242-5. ence. 2015;350(6258):339-42. 19. Hutvagner G, Simard MJ. Argonaute proteins: key players in 39. Rozhkov NV, Aravin AA, Sachidanandam R, Hannon GJ, RNA silencing. Nat Rev Mol Cell Biol. 2008;9(1):22-32. Sokolova ON, Zelentsova ES, et al. The RNA interference 20. Ku H-Y, Lin H. PIWI proteins and their interactors in piRNA system differently responds to the same mobile element in biogenesis, germline development and gene expression. Nat distant Drosophila species. Dokl Biochem Biophys. Sci Rev. 2014;1(2):205-18. 2010;431:79-81. 21. Czech B, Hannon GJ. One Loop to Rule Them All: The Ping- 40. Le Thomas A, Marinov GK, Aravin AA. A transgenerational Pong Cycle and piRNA-Guided Silencing. Trends Biochem process defines piRNA biogenesis in Drosophila virilis. Cell Sci. 2016;41(4):324-37. Rep. 2014;8(6):1617-23. 22. Iwasaki YW, Siomi MC, Siomi H. PIWI-Interacting RNA: 41. Erwin AA, Galdos MA, Wickersheim ML, Harrison CC, Its Biogenesis and Functions. Annu Rev Biochem. Marr KD, Colicchio JM, et al. piRNAs Are Associated with 2015;84:405-33. Diverse Transgenerational Effects on Gene and Transposon 23. Theron E, Dennis C, Brasset E, Vaury C. Distinct features of Expression in a Hybrid Dysgenic Syndrome of D. virilis. the piRNA pathway in somatic and germ cells: from piRNA PLoS Genet. 2015;11(8):e1005332. cluster transcription to piRNA processing and amplification. 42. Shpiz S, Ryazansky S, Olovnikov I, Abramov Y, Kalmykova Mob DNA. 2014;5(1):28. A. Euchromatic transposon insertions trigger production of 24. Le Thomas A, Toth KF, Aravin AA. To be or not to be a novel Pi- and endo-siRNAs at the target sites in the piRNA: genomic origin and processing of piRNAs. Genome Drosophila germline. PLoS Genet. 2014;10(2):e1004138. Biol. 2014;15(1):204. 43. Lee YC. The Role of piRNA-Mediated Epigenetic Silenc- 25. Brennecke J, Aravin AA, Stark A, Dus M, Kellis M, Sachi- ing in the Population Dynamics of Transposable Elements danandam R, et al. Discrete small RNA-generating loci as in Drosophila melanogaster. PLoS Genet. master regulators of transposon activity in Drosophila. Cell. 2015;11(6):e1005269. 2007;128(6):1089-103. 44. Lee YC, Langley CH. Long-term and short-term evolution- 26. Zanni V, Eymery A, Coiffet M, Zytnicki M, Luyten I, Ques- ary impacts of transposable elements on Drosophila. Genet- neville H, et al. Distribution, evolution, and diversity of ics. 2012;192(4):1411-32. retrotransposons at the flamenco locus reflect the regulatory 45. Vermaak D, Henikoff S, Malik HS. Positive selection drives properties of piRNA clusters. Proc Natl Acad Sci U S A. the evolution of rhino, a member of the Heterochromatin 2013;110(49):19842-7. Protein 1 family in Drosophila. PLoS Genet. 2005;1(1):96- 27. Mével-Ninio M, Pelisson A, Kinder J, Campos AR, 108. Bucheton A. The flamenco Locus Controls the gypsy and 46. Lewis SH, Salmela H, Obbard DJ. Duplication and Diversi- ZAM Retroviruses and Is Required for Drosophila Oogene- fication of Dipteran Argonaute Genes, and the Evolutionary sis. Genetics. 2007;175(4):1615-24. Divergence of Piwi and Aubergine. Genome Biol Evol. 28. Goriaux C, Théron E, Brasset E, Vaury C. History of the dis- 2016;8(3):507-18. covery of a master locus producing piRNAs: the fla- 47. Heger A, Ponting CP. Evolutionary rate analyses of or- menco/COM locus in Drosophila melanogaster. Front Genet. thologs and paralogs from 12 Drosophila genomes. Gen Res. 2014;5. 2007;17(12):1837-49. Blumenstiel et al.: Positive selection in the piRNA machinery 511

48. Obbard DJ, Gordon KHJ, Buck AH, Jiggins FM. The evo- 68. Mohn F, Sienski G, Handler D, Brennecke J. The rhino-dead- lution of RNAi as a defence against viruses and transposable lock-cutoff complex licenses noncanonical transcription of elements. Philos Trans R Soc Lond B Biol Sci. dual-strand piRNA clusters in Drosophila. Cell. 2009;364(1513):99-115. 2014;157(6):1364-79. 49. Obbard DJ, Welch JJ, Kim K-W, Jiggins FM. Quantifying 69. Chen YC, Stuwe E, Luo Y, Ninova M, Le Thomas A, Adaptive Evolution in the Drosophila Immune System. Rozhavskaya E, et al. Cutoff Suppresses RNA Polymerase II PLoS Genet. 2009;5(10):e1000698. Termination to Ensure Expression of piRNA Precursors. 50. Kolaczkowski B, Hupalo DN, Kern AD. Recurrent Adapta- Mol Cell. 2016;63(1):97-109. tion in RNA Interference Genes Across the Drosophila Phy- 70. Daniels SB, Peterson KR, Strausbaugh LD, Kidwell MG, logeny. Mol Biol Evol. 2010;28(2):1033-42. Chovnick A. Evidence for Horizontal Transmission of the 51. Yi M, Chen F, Luo M, Cheng Y, Zhao H, Cheng H, et al. P-Transposable Element between Drosophila Species. Ge- Rapid evolution of piRNA pathway in the teleost fish: im- netics. 1990;124(2):339-55. plication for an adaptation to transposon diversity. Genome 71. Bartolome C, Bello X, Maside X. Widespread evidence for Biol Evol. 2014;6(6):1393-407. horizontal transfer of transposable elements across 52. Simkin A, Wong A, Poh YP, Theurkauf WE, Jensen JD. Re- Drosophila genomes. Genome Biol. 2009;10(2):R22. current and recent selective sweeps in the piRNA pathway. 72. Schaack S, Gilbert C, Feschotte C. Promiscuous DNA: hor- Evolution. 2013;67(4):1081-90. izontal transfer of transposable elements and why it matters 53. Blumenstiel JP, Hartl DL. Evidence for maternally trans- for eukaryotic evolution. Trends Ecol Evol. 2010;25(9):537- mitted small interfering RNA in the repression of transposi- 46. tion in Drosophila virilis. Proc Natl Acad Sci U S A. 73. Daniels SB, Peterson KR, Strausbaugh LD, Kidwell MG, 2005;102(44):15965-70. Chovnick A. Evidence for horizontal transmission of the P 54. Kasschau KD, Xie ZX, Allen E, Llave C, Chapman EJ, transposable element between Drosophila species. Genetics. Krizan KA, et al. P1/HC-Pro, a viral suppressor of RNA si- 1990;124(2):339-55. lencing, interferes with Arabidopsis development and 74. Groth SB, Blumenstiel JP. Horizontal transfer can drive a miRNA function. Dev Cell. 2003;4(2):205-17. greater transposable element in large populations. J Hered. 55. Burgyan J, Havelda Z. Viral suppressors of RNA silencing. 2016. Published online August 24, 2016. Trends Plant Sci. 2011;16(5):265-72. 75. Castillo DM, Mell JC, Box KS, Blumenstiel JP. Molecular 56. Guo W, Liew JY, Yuan YA. Structural insights into the arms evolution under increasing transposable element burden in race between host and virus along RNA silencing pathways Drosophila: a speed limit on the evolutionary arms race. in Arabidopsis thaliana. Biol Rev Camb Philos Soc. BMC Evol Biol. 2011;11:258. 2014;89(2):337-55. 76. Quax TE, Claassens NJ, Soll D, van der Oost J. Codon Bias 57. van Mierlo JT, Overheul GJ, Obadia B, van Cleef KWR, as a Means to Fine-Tune Gene Expression. Mol Cell. Webster CL, Saleh M-C, et al. Novel Drosophila Viruses En- 2015;59(2):149-61. code Host-Specific Suppressors of RNAi. PLoS Pathog. 77. Shah P, Gilchrist MA. Explaining complex codon usage pat- 2014;10(7):e1004256. terns with selection for translational efficiency, mutation 58. Li H, Li WX, WeiDing S. Induction and Suppression of bias, and . Proc Natl Acad Sci U S A. RNA silencing by an Animal Virus. Science. 2011;108(25):10231-6. 2002;296:1319-20. 78. Hollister JD, Gaut BS. Epigenetic silencing of transposable 59. Chao JA, Lee JH, Chapados BR, Debler EW, Schneemann elements: A trade-off between reduced transposition and A, Williamson JR. Dual modes of RNA-silencing suppres- deleterious effects on neighboring gene expression. Gen Res. sion by Flock House virus protein B2. Nat Struct Mol Biol. 2009;19(8):1419-28. 2005;12(11):952-7. 79. de Vanssay A, Bouge AL, Boivin A, Hermant C, Teysset L, 60. Murray GG, Kosakovsky Pond SL, Obbard DJ. Suppressors Delmarre V, et al. Paramutation in Drosophila linked to of RNAi from plant viruses are subject to episodic positive emergence of a piRNA-producing locus. Nature. selection. Proc Biol Sci. 2013;280(1765):20130965. 2012;490(7418):112-5. 61. Nuzhdin SV, Mackay TFC. The genomic rate of transposable 80. Ronsseray S, Lemaitre B, Coen D. Maternal inheritance of P- element movement in Drosophila melanogaster. Mol Biol cytotype in Drosophila melanogaster - A pre-P cytotype is Evol. 1995;12(1):180-1. strictly extra-chromosomally transmitted. Mol Gen Genet. 62. Nuzhdin SV. Sure facts, speculations, and open questions 1993;241(1-2):115-23. about the evolution of transposable element copy number. 81. Stanley CE, Jr., Kulathinal RJ. flyDIVaS: A Comparative Genetica. 1999;107(1-3):129-37. Genomics Resource for Drosophila Divergence and Selec- 63. Morazzani EM, Wiley MR, Murreddu MG, Adelman ZN, tion. G3 (Bethesda). 2016;6(8):2355-63. Myles KM. Production of virus-derived ping-pong-depen- 82. Zhang Z, Wang J, Schultz N, Zhang F, Parhad SS, Tu S, et dent piRNA-like small RNAs in the mosquito soma. PLoS al. The HP1 homolog rhino anchors a nuclear complex that Pathog. 2012;8(1):e1002470. suppresses piRNA precursor splicing. Cell. 64. Schnettler E, Donald CL, Human S, Watson M, Siu RW, Mc- 2014;157(6):1353-63. Farlane M, et al. Knockdown of piRNA pathway proteins 83. Kelleher ES, Edelman NB, Barbash DA. Drosophila inter- results in enhanced Semliki Forest virus production in mos- specific hybrids phenocopy piRNA-pathway mutants. PLoS quito cells. J Gen Virol. 2013;94(Pt 7):1680-9. Biol. 2012;10(11):e1001428. 65. Petit M, Mongelli V, Frangeul L, Blanc H, Jiggins F, Saleh 84. Wolfe NW, Clark NL. ERC analysis: web-based inference of M-C. piRNA pathway is not required for antiviral defense gene function via evolutionary rate covariation. Bioinfor- in Drosophila melanogaster. Proc Natl Acad Sci U S A. matics. 2015;31(23):3835-7. 2016;113(29):E4218-27. 85. Clark NL, Alani E, Aquadro CF. Evolutionary rate covaria- 66. Van De Bor V, Hartswood E, Jones C, Finnegan D, Davis I. tion reveals shared functionality and coexpression of genes. gurken and the I factor retrotransposon RNAs share com- Genome Res. 2012;22(4):714-20. mon localization signals and machinery. Dev Cell. 86. Findlay GD, Sitnik JL, Wang W, Aquadro CF, Clark NL, 2005;9(1):51-62. Wolfner MF. Evolutionary rate covariation identifies new 67. Klattenhoff C, Xi HL, Li CJ, Lee S, Xu J, Khurana JS, et al. members of a protein network required for Drosophila The Drosophila HP1 Homolog Rhino Is Required for Trans- melanogaster female post-mating responses. PLoS Genet. poson Silencing and piRNA Production by Dual-Strand 2014;10(1):e1004108. Clusters. Cell. 2009;138(6):1137-49. 512 Blumenstiel et al.: Positive selection in the piRNA machinery

87. Post C, Clark JP, Sytnikova YA, Chirn GW, Lau NC. The ca- pacity of target silencing by Drosophila PIWI and piRNAs. RNA. 2014;20(12):1977-86. 88. Song J, Liu J, Schnakenberg SL, Ha H, Xing J, Chen KC. Variation in piRNA and transposable element content in strains of Drosophila melanogaster. Genome Biol Evol. 2014;6(10):2786-98. 89. Aravin AA, Sachidanandam R, Bourc'his D, Schaefer C, Pezic D, Toth KF, et al. A piRNA pathway primed by indi- vidual transposons is linked to de novo DNA methylation in mice. Mol Cell. 2008;31(6):785-99. 90. Qi Y, He X, Wang X-J, Kohany O, Jurka J, Hannon GJ. Dis- tinct catalytic and non-catalytic roles of ARGONAUTE4 in RNA-directed DNA methylation. Nature. 2006;443(7114):1008-12. 91. Lewsey MG, Hardcastle TJ, Melnyk CW, Molnar A, Valli A, Urich MA, et al. Mobile small RNAs regulate genome-wide DNA methylation. Proc Natl Acad Sci U S A. 2016;113(6):E801-E10. 92. Nuthikattu S, McCue AD, Panda K, Fultz D, DeFraia C, Thomas EN, et al. The initiation of epigenetic silencing of active transposable elements is triggered by RDR6 and 21- 22 nucleotide small interfering RNAs. Plant Physiol. 2013;162(1):116-31. 93. Chirn G-W, Rahman R, Sytnikova YA, Matts JA, Zeng M, Gerlach D, et al. Conserved piRNA Expression from a Dis- tinct Set of piRNA Cluster Loci in Eutherian Mammals. PLoS Genet. 2015;11(11):e1005652. 94. Arensburger P, Hice RH, Wright JA, Craig NL, Atkinson PW. The mosquito Aedes aegypti has a large genome size and high transposable element load but contains a low pro- portion of transposon-specific piRNAs. BMC Genomics. 2011;12(1):1-23. 95. George P, Jensen S, Pogorelcnik R, Lee J, Xing Y, Brasset E, et al. Increased production of piRNAs from euchromatic clusters and genes in Anopheles gambiae compared with Drosophila melanogaster. Epigenetics Chromatin. 2015;8(1):1-21. 96. Ninova M, Ronshaugen M, Griffiths-Jones S. MicroRNA evolution, expression, and function during short germband development in Tribolium castaneum. Genome Res. 2016;26(1):85-96. 97. Rozhkov NV, Aravin AA, Zelentsova ES, Schostak NG, Sachidanandam R, McCombie WR, et al. Small RNA-based silencing strategies for transposons in the process of invad- ing Drosophila species. RNA. 2010;16(8):1634-45. 98. Song J, Liu J, Schnakenberg SL, Ha H, Xing J, Chen KC. Variation in piRNA and Transposable Element Content in Strains of Drosophila melanogaster. Genome Biol Evol. 2014;6(10):2786-98.