Downloaded from genome.cshlp.org on October 6, 2021 - Published by Cold Spring Harbor Laboratory Press

1 Retrotransposons and pseudogenes regulate mRNAs and lncRNAs via the piRNA pathway

2 in the germline

3

4 Toshiaki Watanabe*, Ee-chun Cheng, Mei Zhong, and Haifan Lin*

5 Yale Stem Cell Center and Department of Cell Biology, Yale University School of Medicine, New Haven,

6 Connecticut 06519, USA

7

8 Running Title: Pachytene piRNAs regulate mRNAs and lncRNAs

9

10 Key Words: retrotransposon, pseudogene, lncRNA, piRNA, Piwi, spermatogenesis

11

12 *Correspondence: [email protected]; [email protected]

13

14 ABSTRACT

15 The eukaryotic genome has vast intergenic regions containing transposons, pseudogenes, and other

16 repetitive sequences. They produce numerous long non-coding RNAs (lncRNAs) and PIWI-interacting

17 RNAs (piRNAs), yet the functions of the vast intergenic regions remain largely unknown. Mammalian

18 piRNAs are abundantly expressed in late spermatocytes and round spermatids, coinciding with the

19 widespread expression of lncRNAs in these cells. Here, we show that piRNAs derived from transposons

20 and pseudogenes mediate the degradation of a large number of mRNAs and lncRNAs in mouse late

21 spermatocytes. In particular, they have a large impact on the lncRNA transcriptome, as a quarter of

22 lncRNAs expressed in late spermatocytes are up-regulated in mice deficient in the piRNA pathway.

23 Furthermore, our genomic and in vivo functional analyses reveal that retrotransposon sequences in the

24 3´UTR of mRNAs are targeted by piRNAs for degradation. Similarly, the degradation of spermatogenic

25 cell-specific lncRNAs by piRNAs is mediated by retrotransposon sequences. Moreover, we show that

26 pseudogenes regulate mRNA stability via the piRNA pathway. The degradation of mRNAs and lncRNAs

27 by piRNAs requires PIWIL1 (also known as MIWI) and, at least in part, depends on its slicer activity.

28 Together, these findings reveal the presence of a highly complex and global RNA regulatory network

29 mediated by piRNAs with retrotransposons and pseudogenes as regulatory sequences.

30

31 INTRODUCTION

32 The genomes of complex organisms contain vast intergenic regions that are mostly composed of

33 repetitive DNA, with transposons as a major constituent (Taft et al. 2007). Although protein-coding genes

34 have been extensively studied over the past half century and many genomes have been sequenced in the

35 past decade, the functions of the vast majority of the genome are largely unknown. Although it is now

36 apparent that some transposons and pseudogenes participate in gene regulation (Lunyak et al. 2007;

37 Goodier and Kazazian 2008; Tam et al. 2008; Watanabe et al. 2008; Lynch et al. 2011; Schmidt et al.

38 2012), the function of the majority of these sequences remains unknown. The recent advent of deep

39 sequencing technology further revealed that a large portion of the mammalian genome is transcribed into

40 an immense number of non-coding RNAs, including over ten thousand long non-coding RNAs

41 (lncRNAs), thousands of expressed pseudogenes, and millions of Piwi-interacting RNAs (piRNAs)—a

42 class of 21-32 nucleotide small non-coding RNAs predominantly expressed in the germline (Kim et al.

43 2009; Cabili et al. 2011; Gong and Maquat 2011; Juliano et al. 2011; Ishizu et al. 2012;

44 Kalyana-Sundaram et al. 2012; Kelley and Rinn 2012). The discovery of these new types of RNAs

45 reveals additional levels of genome complexity and presents unprecedented challenges to understanding

46 the function and inter-relatedness of the non-coding regions of the genome.

47 piRNAs bind to Piwi proteins, which represent a subfamily of the Argonaute protein family (Kim

48 et al. 2009; Juliano et al. 2011; Ishizu et al. 2012; Pillai and Chuma 2012; Ross et al. 2014). piRNAs are

49 generated from various portions of longer single-stranded piRNA precursors, which are transcribed from

50 hundreds of genomic loci named piRNA clusters that are mostly located in the intergenic regions of the

51 genome. Piwi proteins find their target RNAs by using piRNAs as guides, and cleave the targets through

52 the RNase (slicer) activity of the PIWI domain (Saito et al. 2006; Reuter et al. 2011). Additionally, they

53 are believed to interact with factors involved in RNA degradation and chromatin modification (Rouget et

54 al. 2010; Gou et al. 2014; Ross et al. 2014). Across the animal kingdom, Piwi proteins and piRNAs

55 suppress retrotransposons in order to prevent excessive in the germline genome (Grimson et al.

56 2008). Although the regulation of protein-coding genes by piRNAs has been suggested by a few studies in

57 Drosophila and C. elegans (Ishizu et al. 2012), little is known about piRNA function beyond

58 retrotransposon suppression.

59 During mouse spermatogenesis, the three Piwi proteins, PIWIL1, PIWIL2 and PIWIL4 (a.k.a.

60 MIWI, MILI and MIWI2) and piRNAs are highly expressed in two phases: the gonocyte stage and the

61 pachytene spermatocyte to round spermatid stage (Pillai and Chuma 2012). PIWIL2 and PIWIL4 are

62 expressed in gonocytes and bind to gonocyte piRNAs (Aravin et al. 2008; Kuramochi-Miyagawa et al.

63 2008). PIWIL1 and PIWIL2 are expressed in pachytene spermatocytes and round spermatids, where they

64 bind to pachytene piRNAs (Robine et al. 2009; Reuter et al. 2011; Beyret et al. 2012). Ectopic expression

65 of artificial pachytene piRNAs leads to the degradation of the complementary reporter RNA in pachytene

66 spermatocytes and round spermatids (Yamamoto et al. 2013), suggesting that pachytene piRNAs degrade

67 target RNAs during this period of spermatogenesis. LINE-1 (L1) retrotransposon suppression by PIWIL1

68 and pachytene piRNAs has been reported (Xu et al. 2008; Reuter et al. 2011); however, other targets of

69 pachytene piRNAs are unknown. Here we report the systematic identification of the target RNAs of

70 pachytene piRNAs and the regulatory sequences in the target RNAs. These analyses, together with

71 functional analyses using several newly generated mouse strains, allowed us to discover that transposons

72 and pseudogenes regulate mRNAs and lncRNAs via piRNAs.

75 RESULTS

76 Protein-coding genes are negatively regulated by the piRNA pathway

77 Pachytene piRNAs are expressed in late spermatocytes and round spermatids, and bind to

78 PIWIL1 and PIWIL2 (Figure 1A). In the adult testis, PIWIL1-bound piRNAs, mostly 29-30 nt in length,

79 are more abundant than PIWIL2-bound piRNAs that are 25-26 nt in length (Supplemental Figure 1A).

80 Moreover, most of the pachytene piRNAs expressed in late spermatocytes (mid-pachytene through

81 diakinesis stages) are lost in Piwil1−/− mice (Supplemental Figure 1B). Together, these results indicate that

82 the expression of a large portion of piRNAs in late spermatocytes is PIWIL1-dependent. To identify

83 endogenous target RNAs of PIWIL1-piRNA complexes, we performed RNA-seq analysis in Piwil1+/− and

84 Piwil1−/− mice (Table S1). To minimize the background signal from cells not expressing PIWIL1, we

85 isolated late spermatocytes by fluorescence-activated cell sorting (FACS) from 21−23 days post-partum

86 (dpp) mouse testes (Supplemental Figures 1C, D). Although changes in RNA expression in Piwil1−/− mice

87 were expected to be larger at the round spermatid stage, we selected the late spermatocyte stage, the

88 initial stage of PIWIL1 expression, to minimize secondary effects due to developmental abnormalities

89 (Deng and Lin 2002). We found that the L1 and RLTR4I_MM retrotransposon RNAs were increased in

90 Piwil1−/− late spermatocytes as compared with Piwil1+/− late spermatocytes (Figures 1B-E). Interestingly,

91 the expression of protein-coding mRNAs was also affected in Piwil1−/− late spermatocytes (Figure 1F).

92 Here, 172 genes were significantly up-regulated (>2-fold increase, P < 0.025, RPKM > 0.5), and 11 were

93 significantly down-regulated (>2-fold decrease, P < 0.025, RPKM > 0.5). We did not analyze

94 down-regulated mRNAs in this study, because the decrease in these mRNAs is probably a secondary effect of

95 developmental abnormalities (Supplemental results and Supplemental Figures 1E-G). Of 14 randomly

96 chosen up-regulated genes, 13 were also significantly increased in q-PCR analysis (Supplemental Figure

97 1H), confirming that the expression of a subset of mRNAs in spermatocytes is increased in Piwil1−/−

98 mice.
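
The thresholds quoted above (>2-fold change, P < 0.025, RPKM > 0.5) can be written as a small filter. The following is a minimal sketch only, not the authors' pipeline: the `GeneStat` record, the `classify` function, the pseudocount, and the choice to apply the RPKM cutoff to the higher of the two genotypes are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GeneStat:
    """Hypothetical per-gene summary; field names are illustrative."""
    name: str
    rpkm_het: float  # expression in the Piwil1+/- control
    rpkm_ko: float   # expression in the Piwil1-/- mutant
    p_value: float   # significance of the difference


def classify(gene, fold=2.0, p_cut=0.025, rpkm_min=0.5, pseudo=0.01):
    """Label a gene 'up', 'down', 'unchanged', or 'low'.

    Assumption: the RPKM cutoff must be passed in at least one genotype,
    and a small pseudocount guards against division by zero.
    """
    if max(gene.rpkm_het, gene.rpkm_ko) <= rpkm_min:
        return "low"
    if gene.p_value >= p_cut:
        return "unchanged"
    ratio = (gene.rpkm_ko + pseudo) / (gene.rpkm_het + pseudo)
    if ratio > fold:
        return "up"
    if ratio < 1.0 / fold:
        return "down"
    return "unchanged"
```

For example, a gene with RPKM 1.0 in the control and 4.0 in the mutant at P = 0.001 would be labelled `"up"` under these assumed rules.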

99 To investigate whether mRNA up-regulation could also be observed in other piRNA pathway

100 mutants, we analyzed the transcriptome of Mov10l1 CKO mice (Mov10l1flox/-, Stra8-Cre). MOV10L1 is

101 an RNA helicase essential for piRNA biogenesis (Frost et al. 2010; Zheng et al. 2010), and piRNAs in late

102 spermatocytes were almost entirely absent in Mov10l1 CKO mice (Supplemental Figure 1I). These mice

103 displayed a Piwil1−/−-like defect with spermatogenesis arrested after meiosis at the early round spermatid

104 stage (Supplemental Figures 1J, K) (Deng and Lin 2002). A set of 81 genes was up-regulated (>2-fold) in

105 both Mov10l1 CKO and Piwil1 KO late spermatocytes (Figure 1G). No specific GO term in the biological

106 process ontology or the cellular component ontology was enriched among this set, but in the molecular

107 function ontology, “hydrolase activity, acting on glycosyl bonds” was frequently observed (six of the 81

108 genes, P = 2.2 × 10⁻³). Several genes important for spermatogenesis (Man2a2, Morc1, Tex101, and Tdrd1)

109 were among the set of 81 up-regulated genes. Additionally, several retrotransposons including

110 RLTR4I_MM were up-regulated in Mov10l1 CKO mice (Figure 1H).

111

112 Testis-enriched lncRNAs are negatively regulated by the piRNA pathway

113 Recent transcriptome analyses in mammals revealed that a large and diverse set of lncRNAs are

114 specifically expressed in the testis (Cabili et al. 2011; Soumillon et al. 2013). Consistently, we found a

115 large number of lncRNAs (24.4%; 1074/4401) registered in Ensembl were most highly expressed in testes

116 as compared to other tissues and cells (Figure 1I) (Flicek et al. 2014). Many of the testis-enriched

117 Ensembl lncRNAs were initially expressed at 15−22 dpp, when the first wave of spermatogenesis reaches

118 the pachytene spermatocyte to early round spermatid stages (Figure 1J). Thus, their expression overlaps

119 with that of PIWIL1 and pachytene piRNAs. To analyze the impact of the piRNA pathway on lncRNA

120 expression, we first identified 6,014 lncRNAs (RPKM > 0.5 either in Piwil1+/− or in Piwil1−/−) from

121 intergenic regions using our late spermatocyte RNA-seq data. Many of these lncRNAs are preferentially

122 expressed in the testis, with expression first detected at 15-18 dpp (Supplemental Figures 1L, M); this

123 suggests that these lncRNAs are expressed from the late spermatocyte stage. However, very few (2.4%)

124 were derived from pachytene piRNA clusters, suggesting that most of them are not piRNA precursors.

125 Remarkably, 485 of the 6,014 identified lncRNAs were significantly increased in Piwil1−/− mice (>2-fold

126 increase, P < 0.025), while only 32 were significantly decreased (>2-fold decrease, P < 0.025) (Figure 1K).

127 In Mov10l1 CKO mice, 473 of 4872 lncRNAs (RPKM > 0.5 either in Mov10l1 CKO or in control) were

128 significantly increased (>2-fold increase, P < 0.025), and many of the increased lncRNAs were also

129 up-regulated in Piwil1−/− mice (Figure 1L). In addition, 20.4% (1229/6014) and 21.5% (1051/4872) of

130 lncRNAs were up-regulated by more than 1.5-fold in Piwil1 KO and Mov10l1 CKO mice, respectively

131 (Figure 2B). When comparing RNA levels between Piwil1 and Mov10l1 mutant mice, a lower correlation

132 of fold-change in lncRNA levels (R = 0.57) was observed as compared with mRNA levels (R = 0.74).

133 This is at least partly due to the relatively lower expression level of lncRNAs (Supplemental Figure 1N),

134 because the expression levels of lowly expressed RNAs tend to be variable. Additionally, many of the

135 piRNA precursors are up-regulated in Mov10l1 CKO mice, but only mildly affected in Piwil1 KO mice

136 (see below) (Figure 1L).
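
The comparison of fold-changes between the two mutants can be illustrated with a plain Pearson correlation. This is a sketch under stated assumptions: the text reports only the R values, so the log2 transformation, the pseudocount, and the helper names `log2_fold_change` and `pearson_r` are illustrative choices rather than the method used in the study.

```python
import math


def log2_fold_change(control, mutant, pseudo=0.01):
    """log2(mutant / control) with an assumed pseudocount to avoid log(0)."""
    return math.log2((mutant + pseudo) / (control + pseudo))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)
```

In use, one list would hold the per-transcript fold-changes in Piwil1 mutants and the other the fold-changes for the same transcripts in Mov10l1 CKO mice, in the same order.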

137

138 The regulation of mRNAs and lncRNAs by PIWIL1 is piRNA-dependent

139 To determine if PIWIL1 regulates the potential target RNAs in a piRNA-sequence-dependent

140 manner, we mapped the published PIWIL1-associated piRNAs (Robine et al. 2009) against mRNAs and

141 lncRNAs expressed in late spermatocytes and counted the number of piRNA reads that were matched to

142 each mRNA and lncRNA in the antisense orientation (See Supplemental results and Supplemental Figure

143 2 for details on mapping and counting). Notably, piRNAs were preferentially matched to the up-regulated

144 mRNAs and lncRNAs (Figures 2A-D), suggesting that piRNAs guide PIWIL1 to target RNAs. In

145 addition, the average number of piRNA matches to the moderately up-regulated (1−2-fold) transcripts

146 was much higher than that to the down-regulated (<1-fold) transcripts. This result suggests that some

147 moderately up-regulated RNAs are also direct targets of PIWIL1 and piRNAs. We attempted to pull-down

148 PIWIL1 target RNAs by immunoprecipitation using the anti-PIWIL1 antibody. However, the target RNAs

149 were not enriched in the immunoprecipitated RNAs, possibly due to a transient association between

150 PIWIL1 and the target RNAs that are degraded (Supplemental results and Supplemental Figure 3).
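
Counting piRNA reads that match a transcript in the antisense orientation can be sketched as follows. This is a deliberately simplified illustration: it accepts only perfect matches, whereas the actual mapping and counting rules are described in the Supplemental results and are not reproduced here. The function names and the read-count dictionary are hypothetical.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (RNA input is converted to DNA)."""
    dna = seq.upper().replace("U", "T")
    return "".join(_COMPLEMENT[base] for base in reversed(dna))


def count_antisense_reads(transcript, pirna_reads):
    """Sum the read counts of piRNAs that are antisense to the transcript.

    pirna_reads maps a piRNA sequence to its read count. A piRNA is
    antisense when its reverse complement occurs in the transcript.
    Assumption: perfect matches only.
    """
    target = transcript.upper().replace("U", "T")
    total = 0
    for seq, count in pirna_reads.items():
        if reverse_complement(seq) in target:
            total += count
    return total
```

With the toy transcript `GATTACAGGCTTAACG`, a piRNA `AGCCTGTA` is counted because its reverse complement `TACAGGCT` lies within the transcript, whereas the sense-strand sequence `TACAGGCT` is not counted.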

151 piRNA precursors were up-regulated in Mov10l1 CKO mice but only mildly affected or

152 unaffected in Piwil1−/− mice (Figure 1L). Consistent with this, there was a positive correlation between

153 the extent of mRNA and lncRNA up-regulation in Mov10l1 CKO mice and the number of unique sense

154 piRNAs (Figures 2E-G). This correlation was modest or almost absent in Piwil1−/− mice (Figures 2E, G).

155 These results suggest that the expression of a subset of mRNAs and lncRNAs up-regulated in Mov10l1

156 CKO mice is regulated by the piRNA precursor processing machinery upstream of PIWIL1.

157

158 Pseudogene-derived piRNAs regulate their cognate functional gene

159 The most up-regulated gene in Piwil1−/− late spermatocyte RNA-seq was Stambp, which

160 encodes a deubiquitinase required for neuronal cell survival (Ishii et al. 2001). PIWIL1-associated

161 piRNAs were mapped to several regions within this mRNA in the antisense orientation (Figure 3A).

162 Specifically, most of these piRNAs were mapped to Stambp mRNA with mismatches, but were perfectly

163 and uniquely matched to its processed pseudogene (Stambp-ps1) in a piRNA cluster. This suggests that

164 piRNAs from Stambp-ps1 regulate Stambp mRNA. To explore this hypothesis, we generated two mouse

165 strains from ES lines with gene-trap insertions just upstream of Stambp-ps1 in the piRNA cluster (Figure

166 3B). In spermatocytes from mice homozygous for the insertion (Stambp-ps1Gt1/Gt1 and Stambp-ps1Gt2/Gt2),

167 both putative piRNA precursor transcripts and piRNAs from the Stambp-ps1 region were dramatically

168 decreased (Figures 3B-D). As predicted, Stambp mRNA was significantly increased in late spermatocytes

169 from Stambp-ps1Gt1/Gt1 and Stambp-ps1Gt2/Gt2 mice, but Stambp pre-mRNA was not changed (Figure 3E).

170 Although we did not find any phenotype in Stambp-ps1Gt1/Gt1 mice (Supplemental Figures 4A-D), a

171 marked increase in Stambp protein was observed in Stambp-ps1Gt1/Gt1 late spermatocytes and round

172 spermatids (Figures 3F, G). Together, these results indicate piRNAs from the Stambp pseudogene regulate

173 the Stambp gene at the post-transcriptional level. Inspired by this example, we aligned all mRNAs to

174 piRNA clusters expressed in spermatocytes, and identified six other pseudogenes that potentially targeted

175 their functional cognate mRNAs. Of the six cognate mRNAs, three were significantly up-regulated in

176 both Piwil1−/− and Mov10l1 CKO spermatocytes (Supplemental Figure 4E).

177

178 Transcripts containing retrotransposon sequences are enriched in the up-regulated RNAs

179 To examine the piRNA target sequences within the regulated RNAs, we identified

180 piRNA-complementary and, therefore, potential target sequences within mRNAs. Almost all of the

181 piRNA-complementary sequences were located within the 3´UTR rather than 5´UTR and coding DNA

182 sequence (CDS) of mRNAs (Figure 4A). The piRNA-complementary sequences were much more densely

183 observed in the 3´UTR of the up-regulated mRNAs than the down-regulated mRNAs, consistent with the

184 results of preferential mapping of piRNAs to the up-regulated RNAs (Figure 2). Many (68.5%) of the

185 piRNA-complementary sequences within the up-regulated mRNAs corresponded to retrotransposon

186 sequences (Figure 4B), of which 83.2% corresponded to SINE, non-autonomous retrotransposons with a

187 length of 150-300 nt (Goodier and Kazazian 2008). When repeat sequence-derived piRNAs and

188 non-repeat sequence-derived piRNAs were separately mapped to mRNAs, strong preferential mapping to

189 the up-regulated mRNAs was observed only for the repeat sequence-derived piRNAs and these were

190 mostly retrotransposon-derived (Figure 4C). A similar result was obtained for lncRNAs (Figure 4D).

191 These results raise the possibility that retrotransposon sequences in mRNAs and lncRNAs are targeted by

192 PIWIL1 and piRNAs.

193 To evaluate the importance of retrotransposon sequences in piRNA-mediated mRNA and

194 lncRNA regulation, we classified mRNAs and lncRNAs based on the extent of up-regulation in the

195 Piwil1−/− mice. Remarkably, the frequencies of mRNAs and lncRNAs containing retrotransposon

196 sequences were significantly higher among the up-regulated RNAs than the down-regulated RNAs for all

197 retrotransposon classes analyzed (Figures 4E, F). For example, 46.4% of the up-regulated mRNAs but

198 only 12.9% of the down-regulated mRNAs contained SINE sequences. In addition, the density of

199 retrotransposon sequences was considerably higher in the 3´UTR of the up-regulated mRNAs (Figure 4G).

200 For lncRNAs, a higher density was observed across the entire region of the up-regulated lncRNAs (Figure

201 4H). This correlation implies that many of the retrotransposon sequences in the up-regulated mRNAs and

202 lncRNAs are targeted by PIWIL1 for degradation.

203

204 PIWIL1 degrades target mRNAs and lncRNAs in a slicer-dependent fashion

205 Among the 13 known piRNA pathway genes (Pillai and Chuma 2012), Tdrd1, Piwil2, and

206 Piwil1 mRNAs contain SINE sequences potentially targeted by piRNAs (Figure 5A and Supplemental

207 Figure 5A). Moreover, these mRNAs were up-regulated in Piwil1−/− or Mov10l1 CKO mice

208 (Supplemental Figure 5B). In the 3´UTR of Tdrd1 mRNA, two SINE sequences (PB1D10 and ID2) were

209 found (Figure 5A). Interestingly, we observed a sharp increase in RNA-seq reads at one site in the

210 PB1D10 sequence of Tdrd1 mRNA (Figure 5B, arrow). The reason for this sharp increase can be inferred

211 from our RNA-seq method: we prepared strand-specific RNA-seq libraries by fragmenting RNAs and

212 ligating the 5´ ends of the resulting RNA fragments to an RNA linker. The library was then sequenced

213 from the linker. Therefore, if there were many RNA ends at a particular site generated by RNase cleavage

214 in vivo in addition to in vitro fragmentation, more reads would be produced from the ends at that site.

215 Because the sharp increase in RNA-seq reads at the PB1D10 sequence was observed only in Piwil1+/− but

216 not Piwil1−/− mice (Figure 5B), it likely resulted from the PIWIL1-mediated cleavage. In support of this,

217 PIWIL1-associated piRNAs were mapped predominantly to one region within the PB1D10 sequence in

218 the antisense orientation, and the site of sharp increase occurred between the 10th and 11th positions of

219 the mapped piRNAs (Figures 5A, C), which corresponds exactly to the cleavage site of PIWIL1 and these

220 piRNAs (Brennecke et al. 2007; Gunawardane et al. 2007; Reuter et al. 2011). Additionally, the 5´ end of

221 one of the piRNAs generated from Tdrd1 mRNA exactly corresponds to the site of sharp increase

222 (Figures 5B, C), showing characteristics of secondary piRNAs, which are generated from 5´ ends of

223 RNAs cleaved by Piwi-piRNAs (Juliano et al. 2011; Ishizu et al. 2012; Pillai and Chuma 2012). This

224 further supports the PIWIL1-mediated cleavage at the site of sharp increase. In addition, expression of

225 Tdrd1 and other target RNAs was increased in late spermatocytes from Piwil1 slicer mutant (Piwil1-/ADH)

226 mice (Figure 5D and Supplemental results) (Reuter et al. 2011), indicating that slicer activity contributes

227 to the target RNA degradation.
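
The positional argument above can be made concrete. Because the guide piRNA is antisense to the target, its 5´ nucleotide pairs with the 3´-most base of the matched target segment, so a cut between guide positions 10 and 11 falls ten bases upstream of the segment's end. The sketch below assumes 0-based target coordinates; the function names and toy read positions are hypothetical, and it only illustrates the geometry rather than the analysis performed in the study.

```python
from collections import Counter


def predicted_cleavage(target_start, pirna_length):
    """Return the two target positions flanking the predicted cut.

    target_start: 0-based index of the first target base covered by the
    antisense piRNA. Guide position k (1-based, from the piRNA 5' end)
    pairs with target index target_start + pirna_length - k.
    """
    if pirna_length < 11:
        raise ValueError("piRNA must span at least 11 nucleotides")
    paired_with_11 = target_start + pirna_length - 11
    paired_with_10 = target_start + pirna_length - 10
    # The cut lies between these two bases; the downstream fragment's
    # 5' end is the base paired with guide position 10.
    return paired_with_11, paired_with_10


def most_frequent_5p_end(read_starts):
    """Position at which the largest number of reads begin."""
    return Counter(read_starts).most_common(1)[0][0]
```

For a 30-nt piRNA covering target bases 100 to 129, the cut is predicted between bases 119 and 120, so an excess of read 5´ ends at position 120 would be consistent with slicing at that site.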

228

229 PIWIL1 promotes the decay of a reporter mRNA containing a B1 or B2 SINE sequence at its

230 3´UTR

231 To further demonstrate that retrotransposon sequences in a target mRNA indeed mediate the

232 degradation of the target mRNA by the PIWIL1-piRNA pathway, we generated mice with a single-copy

233 transgene that expresses mCherry mRNA with a retrotransposon sequence in its 3´UTR (Figure 6A and

234 Supplemental Figure 7). The retrotransposon sequence in mCherry mRNA was flanked by LoxP sites, and

235 therefore, can be removed in the presence of Cre protein. We chose B1 and B2 SINE retrotransposons,

236 because they are frequently found in the up-regulated mRNAs and lncRNAs in Piwil1−/− mice (Figures 4E,

237 F and Supplemental Figure 7). Upon introduction of Stra8-Cre, B1 and B2 sequences in the transgenes

238 were deleted in late spermatocytes (Figures 6B, C), and mCherry fluorescence signal in late

239 spermatocytes increased by 2.2-fold (mCherry-B1flox/0 vs. mCherry-B1flox/0, Stra8-CreTg/0) and 3.2-fold

240 (mCherry-B2flox/0 vs. mCherry-B2flox/0, Stra8-CreTg/0) (Figures 6D, E left). A similar level of increase

241 (2.8-fold in mCherry-B2flox/0 and 2.7-fold in mCherry-B1flox/0) was also observed at the mRNA level;

242 however, no change was observed at the pre-mRNA level (Figure 6E center and right). These results

243 indicate that B1 and B2 sequences confer instability to mature mRNAs in late spermatocytes.

244 To examine if the observed instability caused by B1 and B2 sequences is PIWIL1-dependent,

245 mCherry-B1flox/0 and mCherry-B2flox/0 mice were crossed with Piwil1 mutant mice. Elevated levels of

246 mCherry mRNA were observed in Piwil1−/− late spermatocytes as compared with the Piwil1+/− control in

247 both mCherry-B1flox/0 and mCherry-B2flox/0 mice (Figure 6F right; 1.8-fold for mCherry-B1flox/0 and

248 3.4-fold for mCherry-B2flox/0), suggesting that PIWIL1 is involved in the degradation of B1- and

249 B2-containing mCherry mRNAs. The larger increase observed for B2-containing mCherry mRNA as

250 compared to B1-containing mCherry mRNA in Piwil1−/− mice is consistent with the preferential

251 enrichment of the B2 sequence in highly up-regulated RNAs in the RNA-seq analysis (Figures 4E, F, 6F).

252 When B1 and B2 sequences were removed (mCherry-B1Δlox/0 and mCherry-B2Δlox/0), no increase was

253 observed in Piwil1−/− mice compared with Piwil1+/− mice (Figure 6G). These observations indicate that

254 PIWIL1 targets B1 and B2 sequences in the mCherry mRNAs for degradation.

255

256 MIWI promotes the decay of an endogenous mRNA containing a conserved retrotransposon sequence

257 in its 3´UTR

258 To directly validate that retrotransposon sequences regulate the stability of endogenous mRNAs

259 via a MIWI-piRNA-mediated mechanism, we deleted such retrotransposon sequences in the 3´UTR of an

260 endogenous mRNA, Prelid1 mRNA, and examined its expression in wildtype versus Miwi mutant

261 backgrounds. The two most frequently observed retrotransposon sequences in pachytene piRNAs were

262 MIR and LINE2 (L2) repeats (Supplemental Figure 8A), inactive retrotransposons that were propagated

263 before the mammalian radiation and widely exist in mammalian genomes (Silva et al. 2003). Their

264 sequences are often found in mRNAs, as 6.3% and 2.5% of mouse Ref-Seq mRNAs contain MIR and L2a

265 sequences, respectively (Faulkner et al. 2009). Prelid1 mRNA contains both MIR and L2a sequences in

266 the 3´UTR, and PIWIL1-associated piRNAs were mapped predominantly to the MIR and L2a sequences

267 in the antisense orientation (Figure 7A). Up-regulation of this mRNA in Piwil1−/− mice as compared to

268 Piwil1+/− mice (approximately 4-fold) was observed only at the mRNA level, but not at the pre-mRNA

269 level (Supplemental Figure 8B), indicating that PIWIL1 degrades this mRNA.

270 To generate mice expressing Prelid1 mRNA without the retrotransposon sequences, we inserted

271 a synthetic polyadenylation signal sequence just upstream of the two retrotransposon sequences in the

272 3´UTR using the CRISPR-Cas9 system, because the polyadenylation signal sequence of this mRNA is in

273 the L2a retrotransposon sequence, which is located immediately downstream of the MIR retrotransposon

274 sequence (Figure 7A). Mice homozygous for the mutant allele (Prelid1pA/pA) generated truncated Prelid1

275 mRNA lacking the retrotransposon sequences (Figure 7B). Although we did not find any phenotype in

276 Prelid1pA/pA mice, Prelid1 mRNA, but not its pre-mRNA, was increased approximately six-fold in late

277 spermatocytes from Prelid1pA/pA mice as compared to wild-type (Prelid1+/+) control mice (Figure 7C).

278 Such an increase was not observed in the liver where PIWIL1 is not expressed (Figure 7C). Furthermore,

279 in contrast to the full-length Prelid1 mRNA, the truncated Prelid1 mRNA was not increased in Piwil1−/−

280 late spermatocytes (Supplemental Figure 8C), indicating that PIWIL1 targets the retrotransposon

281 sequences in the 3´UTR of Prelid1 mRNA.

282 During spermatogenesis, Prelid1 mRNA is expressed at high levels until the early-pachytene

283 stage, and it is down-regulated after the expression of pachytene piRNAs at the mid-pachytene stage,

284 reaching a very low level at the round spermatid stage (Figure 7D). Although the expression of many of

285 the mRNAs that are down-regulated from the early spermatocyte stage to the late spermatocyte stage

286 was not affected in Miwi−/− late spermatocytes (data not shown), a subset of these mRNAs may be

287 cleared or repressed by pachytene piRNAs in late spermatocytes and round spermatids, as observed for

288 Prelid1 mRNA (Figure 7D). This low level of Prelid1 expression in round spermatids, together with

289 strong derepression of Stambp expression in Piwil1−/− round spermatids (Figure 3F), suggests that genes

290 that function in round spermatids are not repressed by the piRNA pathway. Consistent with this idea,

291 PIWIL1-associated piRNAs were less preferentially matched to mRNAs expressed in round spermatids

292 (Rspd mRNAs) compared with mRNAs expressed in early spermatocytes (Espc mRNAs, Figures 7E, F).

293 Furthermore, retrotransposon sequences were found less often in Rspd mRNAs than in Espc mRNAs

294 (Figure 7G). This low frequency is likely due to the shorter 3´UTRs of Rspd mRNAs, which have a

295 median 3´UTR length of 328 nt compared to 729 nt and 973 nt for all mRNAs and Espc mRNAs,

296 respectively (Figure 7H). Therefore, it is tempting to speculate that round spermatid genes might have

297 evolved to possess short 3´UTRs in order to escape retrotransposon insertions that result in deleterious

298 effects on these important reproductive genes.

299

300 DISCUSSION

301 Through our systematic search for the potential target RNAs of pachytene piRNAs, we have identified a

302 global RNA regulatory mechanism in which retrotransposons and pseudogenes regulate mRNAs and

303 lncRNAs. Pachytene piRNAs are abundantly expressed in mouse late spermatocytes

304 and round spermatids, and their deficiency leads to spermatogenic arrest at the early round spermatid

305 stage. Therefore, they have been proposed to play an important role during spermatogenesis; however, the

306 mechanisms and targets of their regulation were largely unknown. The present study reveals that

307 pachytene piRNAs and PIWIL1 negatively regulate protein-coding mRNAs and lncRNAs as well as L1

308 retrotransposons in late spermatocytes, with piRNAs acting as guide molecules to identify target RNAs

309 via sequence complementarity. The degradation of target RNAs by PIWIL1 and piRNAs is mostly

310 dependent on the slicer activity of PIWIL1. Furthermore, our results reveal two mechanisms for the

311 piRNA guidance (Figure 7I). First, piRNAs generated from pseudogene sequences in piRNA clusters

312 target mRNAs transcribed from their functional cognate genes. Second, piRNAs from retrotransposon

313 sequences target RNAs that contain retrotransposon sequences.

314 Our results reveal that PIWIL1 and piRNAs have a large impact on the expression of lncRNAs.

315 Many of them are probably regulated via retrotransposon sequences. Given that 66% of mouse lncRNAs

316 contain transposable elements (Kelley and Rinn 2012), the large effect on the lncRNA transcriptome is

317 not surprising. In late spermatocytes and round spermatids, standard histones are replaced with several

318 histone variants associated with an open chromatin structure, and a genome-wide low level of H3K9me2

319 and high level of H3K4me2 are observed (Kimmins and Sassone-Corsi 2005; Tachibana et al. 2007;

320 Soumillon et al. 2013). The open chromatin structure presumably facilitates the replacement of the

321 histones with transition proteins (Doenecke et al. 1997). Although we cannot rule out the possibility that

322 widespread expression of lncRNAs in late spermatocytes and early round spermatids has some

323 functional purposes such as producing a diverse and large set of pachytene piRNAs (Ha et al. 2014), most

324 of these intergenic RNAs could be a “by-product” of extensive chromatin remodeling occurring in these

325 cells (Soumillon et al. 2013). Nonfunctional RNAs generated from intergenic regions often contain

326 retrotransposon sequences, since ~40% of the mouse genome is derived from retrotransposon sequences.

327 Some of them might have harmful effects on spermatogenesis. The piRNA pathway would be ideal for

328 regulating such lncRNAs, as it mainly targets retrotransposon sequences. Because the expression of

329 PIWIL1 and pachytene piRNAs occurs at the same time as the chromatin remodeling and the resultant

330 pervasive transcription, it is possible that piRNAs are abundantly expressed in late spermatocytes and

331 round spermatids in part to degrade “leaked” intergenic transcripts that may cause harmful effects.

332 Our results reveal that piRNAs derived from inactive as well as active retrotransposons regulate

333 mRNAs, including Tdrd1 and Prelid1 mRNA. piRNAs are generated from two biogenesis pathways, the

334 primary and secondary pathways (Kim et al. 2009; Juliano et al. 2011; Ishizu et al. 2012; Pillai and

335 Chuma 2012). The secondary pathway specifically amplifies piRNAs derived from active

336 retrotransposons. Given that amplification of active retrotransposon-derived piRNAs depends on high

337 similarity among active retrotransposon copies, inactive retrotransposon-derived piRNAs are unlikely to

338 be efficiently amplified by the secondary pathway. Mouse pachytene piRNAs are known to be generated

339 mostly through the primary pathway (Reuter et al. 2011; Beyret et al. 2012; Pillai and Chuma 2012).

340 Similarly, it has just been reported that adult testicular piRNAs in the common marmoset are generated

341 mainly via the primary pathway, and both young and old retrotransposon sequences equally contribute to

342 the production of these piRNAs (Hirano et al. 2014). In addition, that study identified

343 pseudogene-derived piRNAs that potentially regulate parental mRNAs. Thus, generation of piRNAs from

344 inactive retrotransposons and pseudogenes via the primary pathway in adult testes seems to be a

345 conserved phenomenon in mammals. Although it is unclear whether this is due to the predominant

346 presence of cellular mechanisms activating the primary pathway and/or the involvement of only a small

347 subset of pachytene piRNAs in silencing active retrotransposons, it is possible that pachytene piRNA

348 generation highly depends on the primary pathway in order to generate sufficient unamplifiable piRNAs

349 such as inactive retrotransposon-derived piRNAs to regulate mRNAs and lncRNAs.

350 In summary, our genome-wide analysis of pachytene piRNA targets in late spermatocytes reveals

351 a regulatory network in which the stability of many lncRNAs and mRNAs is regulated by

352 retrotransposons and pseudogenes via their derivative piRNAs. In addition, retrotransposons auto-regulate

353 the turnover of their own RNAs also via the piRNA pathway. Interestingly, the expression of an unusually

354 large number of lncRNAs as well as pachytene-like piRNAs in the testis is a well conserved phenomenon

355 among mammals and birds (Li et al. 2013; Soumillon et al. 2013; Hirano et al. 2014). Therefore the

356 genome-wide inter-regulatory mechanisms reported here for piRNAs, lncRNAs, mRNAs, pseudogene

357 RNAs, and transposon RNAs may be well conserved in the testes of vertebrates.

358

359 METHODS

360 RNA-seq library construction and sequencing

361 Total RNAs were isolated from FACS-isolated cells or testes using TRIzol (Invitrogen).

362 RNA-seq data from several tissues and cell types that were sequenced in Bing Ren’s laboratory were

363 obtained from the Gene Expression Omnibus (GEO) database under accession number GSE36026. See

364 Supplemental Methods for details on RNA-seq library preparation, mapping, calculation of RPKM values

365 and lncRNA identification.
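
As an illustration of the RPKM unit used throughout (Reads Per Kilobase of transcript per Million mapped reads), the following is a minimal Python sketch. It is not the pipeline used in this study, which is described in the Supplemental Methods; the function name `rpkm` and the example counts are hypothetical.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / kilobases / millions


# Hypothetical example: 500 reads on a 2-kb transcript
# in a library of 10 million mapped reads.
print(rpkm(500, 2_000, 10_000_000))
```

Normalizing by both transcript length and library size is what allows expression values to be compared between transcripts and between samples.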

366

367 Flow cytometry and sorting of spermatogenic cells

368 To prepare single-cell suspensions, decapsulated testes were treated with a digestion solution (0.5 mg/ml

369 collagenase type IV, 0.05% trypsin-EDTA, and 25 μg/ml DNaseI in DMEM) at 35°C for 10 min. See

370 Supplemental methods for details.

371

372 Generation of knock-in, gene-trap, and transgenic mice

373 An oligo DNA (50 nt homologous sequence − polyadenylation signal

374 sequence − 50 nt homologous sequence), Cas9 mRNA, and chimeric guide RNA were mixed and then

375 injected into the cytoplasm of 1-cell embryos of B6;SJL F2 mice. See Supplemental methods for details

376 about CRISPR-mediated knock-in mouse, gene-trap mouse and transgenic mouse generation.

377

378 Mapping of PIWIL1-associated piRNAs to mRNAs and lncRNAs

379 PIWIL1-associated small RNA reads (SRR033657) were annotated as previously described (Watanabe et

380 al. 2011), and miRNAs and degradation products of abundant noncoding RNAs (rRNA, tRNA, snRNA,

381 snoRNA, scRNA, srpRNA and RNA) were removed. The remaining small RNAs were mapped to the

382 masked mRNAs and lncRNAs using SeqMap software

383 (http://www-personal.umich.edu/~jianghui/seqmap/) allowing up to four mismatches, including up to two indels.

384 See Supplemental methods for details about piRNA mapping analyses.
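
The matching criterion above can be illustrated with a small Python sketch. This is not SeqMap and does not reproduce its algorithm. It assumes that "up to four mismatches including up to two indels" means at most four edits in total, of which at most two may be insertions or deletions, and it compares a piRNA against one candidate site of similar length rather than scanning a transcript. All function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def min_edits(query, site, max_indels=2):
    """Fewest edits (substitutions + indels) aligning query to site
    end to end, using at most max_indels insertions/deletions.
    Returns float('inf') if no such alignment exists."""
    inf = float("inf")
    n, m = len(query), len(site)
    # dp[k][i][j]: fewest edits aligning query[:i] to site[:j] with exactly k indels
    dp = [[[inf] * (m + 1) for _ in range(n + 1)] for _ in range(max_indels + 1)]
    dp[0][0][0] = 0
    for k in range(max_indels + 1):
        for i in range(n + 1):
            for j in range(m + 1):
                best = dp[k][i][j]
                if i > 0 and j > 0:  # match or substitution
                    cost = 0 if query[i - 1] == site[j - 1] else 1
                    best = min(best, dp[k][i - 1][j - 1] + cost)
                if k > 0:  # spend one indel
                    if i > 0:
                        best = min(best, dp[k - 1][i - 1][j] + 1)
                    if j > 0:
                        best = min(best, dp[k - 1][i][j - 1] + 1)
                dp[k][i][j] = best
    return min(dp[k][n][m] for k in range(max_indels + 1))


def is_antisense_match(pirna, site, max_edits=4, max_indels=2):
    """True if the piRNA is complementary to the site within the edit limits."""
    return min_edits(reverse_complement(pirna), site, max_indels) <= max_edits


# Hypothetical sequences: a perfectly complementary piRNA/site pair.
print(is_antisense_match("TGACCTAGGA", "TCCTAGGTCA"))
```

Tracking the indel count as a separate dimension keeps the indel limit independent of the overall edit limit, which a plain edit-distance calculation would not do.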

385

386 Small RNA analysis in Stambp pseudogene mutant mice

387 Small RNA libraries were constructed using total RNAs isolated from 23 dpp Stambp-ps1+/Gt1 and

388 Stambp-ps1Gt1/Gt1 mouse testes. See Supplemental methods for details.

389

390 Analyses of retrotransposon sequences in mRNAs and lncRNAs

391 The mRNA (lncRNA) sequences with 5' and 3' flanking intergenic sequences were obtained from the

392 UCSC Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables). Retrotransposon sequences in the

393 downloaded sequences were identified using RepeatMasker (http://www.repeatmasker.org). See

394 Supplemental methods for details about retrotransposon analyses.
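
The Figure 4 legend describes dividing the examined regions into 0.1 kb bins (window size = 100 bases, step size = 100 bases). A minimal Python sketch of such a per-bin retrotransposon density calculation follows. It assumes repeat annotations have already been parsed into half-open (start, end) intervals in region coordinates; the function name and the example intervals are hypothetical, and the actual procedure is described in the Supplemental Methods.

```python
def repeat_density_per_bin(region_length, repeat_intervals, bin_size=100):
    """Fraction of bases covered by repeats in each non-overlapping bin.

    repeat_intervals: iterable of half-open (start, end) positions,
    0-based, relative to the start of the region.
    """
    covered = [False] * region_length
    for start, end in repeat_intervals:
        # Clip to the region; overlapping intervals are counted only once.
        for pos in range(max(start, 0), min(end, region_length)):
            covered[pos] = True
    densities = []
    for bin_start in range(0, region_length, bin_size):
        window = covered[bin_start:bin_start + bin_size]
        densities.append(sum(window) / len(window))
    return densities


# Hypothetical 300-base region with two overlapping repeat annotations.
print(repeat_density_per_bin(300, [(50, 150), (120, 200)]))
```

A per-base mask is used for clarity rather than speed; it keeps overlapping annotations from being counted twice.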

395

396 Identification of pseudogenes in piRNA clusters

397 Pseudogenes in piRNA clusters were identified by aligning mRNA sequences to piRNA cluster sequences. See

398 Supplemental methods for details.

399

400 DATA ACCESS

401 The sequence data from this study have been submitted to the Gene Expression Omnibus (GEO)

402 database (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE42004.

403

404

405 ACKNOWLEDGEMENTS

406 We thank Timothy Nottoli and Carole Pease for generating transgenic and

407 CRISPR-Cas9-mediated KO mice, Michael Reuter and Ramesh Pillai for Piwil1 slicer mutant mice,

408 Robert Frost and Eric Olson for Mov10l1 mutant mice, Sandra Martin for anti-L1 ORF1, Celina Juliano,

409 Robert Ross, Molly Weiner and Anastasios Vourekas for comments on the manuscript, Yoshiaki Tanaka and

410 Mengjie Chen for discussion about bioinformatics analysis, and Chia-ling Hsieh for help with sample

411 preparation. This work was supported by the NIH grant R37HD42012 and a Mathers Foundation award to

412 H.L. T.W. was a Fellow of the Japan Society for the Promotion of Science (JSPS) and The Uehara

413 Memorial Foundation during part of this work. E.C. was supported by Connecticut's Stem Cell Research

414 grant (10SCA13). T.W. performed the experiments and conducted bioinformatics analysis. E.C. was

415 involved in FACS sorting. M.Z. performed the sequencing. T.W. and H.L. designed the study and wrote

416 the paper.

417

418

419 FIGURE LEGENDS

420 Figure 1. piRNAs negatively regulate mRNAs in spermatocytes

421 (A) A schematic drawing of mouse spermatogenesis, with the expression patterns of PIWIL1, PIWIL2,

422 pre-pachytene piRNAs and pachytene piRNAs. The stages of developmental arrest in Piwil1−/− and

423 Piwil2−/− mice are shown.

424 (B, F, K) RNA-seq comparison of retrotransposon (B), mRNA (F), and spermatocyte lncRNA (K)

425 expression between Piwil1+/− and Piwil1−/− late spermatocytes. Lines flanking the diagonal represent

426 2-fold differences. Significantly changed genes (P < 0.025, Student's t-test) are indicated in red (n = 3).

427 RPKM stands for Reads Per Kilobase of transcript per Million mapped reads.

428 (C, D) Northern blot analysis of L1 full-length RNA (C) and Western blot analysis of L1 ORF1 (D) in

429 23-dpp testes of Piwil1+/− and Piwil1−/− mice. Two biological replicates from different mice were

430 analyzed.

431 (E) Immunohistochemical analysis of L1 ORF1 in sections of 28-dpp testes from Piwil1+/− and Piwil1−/−

432 mice. Spermatocytes (Spc) and round spermatids (RS) are indicated.

433 (G, L) Piwil1 and Mov10l1 mutations have similar impacts on mRNA (G) and spermatocyte lncRNA (L)

434 expression. lncRNAs derived from the top 50 pachytene piRNA clusters are indicated in red.

435 (H) RNA-seq analysis of retrotransposon expression in late spermatocytes from Mov10l1 CKO and

436 control mice. See (B) for more information.

437 (I) Approximately 25% of lncRNAs registered in Ensembl are preferentially expressed in the testis.

438 Shown are the expression levels of 4401 Ensembl lncRNAs (rows) across tissues and cells (columns).

439 (J) Expression levels of 1074 testis-enriched Ensembl lncRNAs during spermatogenesis.

440

441 Figure 2. The extent of mRNA and lncRNA up-regulation is strongly correlated with the number of

442 piRNA matches.

443 (A, B) piRNAs are preferentially mapped to the up-regulated mRNAs (A) and lncRNAs (B). Y-axis

444 represents the numbers (RPM) of piRNAs mapped in the antisense orientation by allowing up to four

445 mismatches including up to 2 indels. Protein-coding genes and lncRNAs on the X-axis are sorted based

446 on the extent of changes in the RNA-seq analysis in Piwil1 KO. RPM stands for Reads Per Million

447 sequenced reads.

448 (C, D) A box plot analyzing the number of piRNA matches in the antisense orientation. Protein-coding

449 genes (C) and lncRNAs (D) are classified into five groups based on the extent of up-regulation in

450 Piwil1−/− mice.

451 (E) MA plot analysis of mRNA (left) and lncRNA (right) expression in Piwil1 KO and Mov10l1 CKO.

452 The top 200 mRNAs and lncRNAs generating many sense unique piRNAs are highlighted in red. The top

453 200 mRNAs and lncRNAs having many antisense piRNA hits (four mismatch condition) are highlighted

454 in blue.

455 (F) piRNAs are preferentially generated from the up-regulated mRNAs (top) and lncRNAs (bottom) in

456 Mov10l1 CKO. Protein-coding genes and lncRNAs on the X-axis are sorted based on the extent of

457 changes in the RNA-seq analysis in Mov10l1 CKO.

458 (G) Box plots analyzing the number of unique sense piRNAs generated from mRNAs (top) and lncRNAs

459 (bottom). mRNAs and lncRNAs are classified into five groups based on the extent of up-regulation in

460 Mov10l1 CKO (left) and Piwil1 KO (right) mice.
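
The five-group classification used in the box plots can be sketched in Python as follows. The group labels (<1, 1-1.25, 1.25-1.5, 1.5-2, >2) are taken from the figure axis labels; how values falling exactly on a boundary are assigned is an assumption made here, and the function name is hypothetical.

```python
def fold_change_group(mutant_rpkm, control_rpkm):
    """Assign a transcript to one of five groups by mutant/control fold change."""
    if control_rpkm <= 0:
        raise ValueError("control expression must be positive")
    fold_change = mutant_rpkm / control_rpkm
    if fold_change < 1:
        return "<1"
    if fold_change < 1.25:
        return "1-1.25"
    if fold_change < 1.5:
        return "1.25-1.5"
    if fold_change <= 2:  # assumption: exactly 2-fold falls in the 1.5-2 group
        return "1.5-2"
    return ">2"


# Hypothetical expression values (RPKM) in mutant and control cells.
for mutant, control in [(5, 10), (12, 10), (30, 10)]:
    print(mutant, control, fold_change_group(mutant, control))
```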

461

462 Figure 3. The Stambp pseudogene regulates its cognate mRNA via the piRNA pathway

463 (A) Distribution of PIWIL1-associated piRNAs matched to Stambp mRNA in the antisense orientation.

464 CDS is represented by the hatched region in mRNA.

465 (B) Structure of the Stambp-ps1 piRNA cluster. The Stambp pseudogene (Stambp-ps1) is indicated in

466 green. The insertion sites of the two gene trap lines (Gt1 and Gt2) are indicated by arrows and brown bars.

467 piRNAs generated from this locus and the Chr17 piRNA cluster (control) in Stambp-ps1+/Gt1 and

468 Stambp-ps1Gt1/Gt1 mice are shown. Watson (Crick) strand piRNAs are shown in red (blue).

469 (C) Northern blotting analysis of a piRNA derived from Stambp-ps1. The location of the probe used is

470 shown in (B).

471 (D) q-PCR analysis of the putative piRNA precursor in Stambp-ps1Gt1/Gt1 (left) and Stambp-ps1Gt2/Gt2

472 (right) late spermatocytes. The locations of the primers used are shown in (B). Error bars represent the SD

473 (n = 4). *P < 0.01 (Student's t-test).

474 (E) q-PCR analysis of Stambp mRNA and pre-mRNA in Stambp-ps1Gt1/Gt1, Stambp-ps1Gt2/Gt2, Piwil1−/−,

475 and Mov10l1 CKO late spermatocytes. Error bars represent the SE (n = 3 for Mov10l1 CKO and n = 4 for

476 others). *P < 0.01 (Student's t-test).

477 (F) Immunofluorescence analysis of Stambp protein. Round spermatids (RS) are located at the center of

478 seminiferous tubules and indicated.

479 (G) Western blot analysis of Stambp protein. Late spermatocytes and round spermatids were

480 FACS-isolated from 26-dpp mouse testes.

481

482 Figure 4. Retrotransposon sequences are enriched in the up-regulated mRNAs and lncRNAs

483 (A) Most piRNA-complementary sequences were observed within 3´UTR. The average number (RPM) of

484 antisense piRNA matches at each position was calculated using a moving window of 100 bases (step size

485 = 100 bases). TSS represents transcription start site.

486 (B) Many of the piRNA-complementary sequences in mRNAs were SINE sequences. Each position in

487 piRNA-complementary sequences within the up-regulated mRNAs was annotated.

488 (C, D) The number of matches to repeat-derived piRNAs was correlated with the extent of up-regulation.

489 Box plots analyzing the number of matches to nonrepeat-derived piRNAs (left) and repeat-derived

490 piRNAs (right) are shown.

491 (E, F) The up-regulated mRNAs (E) and lncRNAs (F) frequently contain retrotransposon sequences. The

492 number of genes and lncRNAs in each class is shown above the bar on the left graph. P-values between

493 the <1 and >2 groups were calculated by the R prop.test.

494 (G, H) High density of retrotransposon sequences in the 3´UTR of the up-regulated mRNAs (G) and

495 across the entire region of the up-regulated lncRNAs (H). Regions examined are divided into 0.1 kb bins

496 (window size = 100 bases, step size = 100 bases).
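
The comparison of proportions between the <1 and >2 groups was made with the R prop.test function. For readers working in Python, the sketch below implements a two-sample chi-square test of equal proportions with a continuity correction, which we understand to correspond to the default behavior of prop.test for two groups; this correspondence is an assumption, and the counts in the example are hypothetical rather than taken from the figure.

```python
import math


def two_proportion_test(hits_a, total_a, hits_b, total_b):
    """Chi-square test (1 df, continuity-corrected) that two proportions are equal.

    Returns (chi_square_statistic, p_value).
    """
    observed = [[hits_a, total_a - hits_a], [hits_b, total_b - hits_b]]
    row_totals = [total_a, total_b]
    col_totals = [hits_a + hits_b, (total_a - hits_a) + (total_b - hits_b)]
    grand_total = total_a + total_b
    if min(row_totals + col_totals) <= 0:
        raise ValueError("every row and column total must be positive")
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            diff = abs(observed[i][j] - expected)
            diff -= min(0.5, diff)  # continuity correction, capped at the difference
            statistic += diff ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: 30 of 100 genes in one group and 60 of 100 in the other
# contain a retrotransposon sequence.
stat, p = two_proportion_test(30, 100, 60, 100)
print(f"chi-square = {stat:.2f}, P = {p:.2g}")
```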

497

498 Figure 5. PIWIL1 slicer activity contributes to the degradation of target RNAs

499 (A) PIWIL1-associated piRNAs are predominantly mapped to PB1D10 SINE sequences (gray box) in the

500 3´UTR of Tdrd1 mRNA.

501 (B) The sharp increase in RNA-seq reads at the PB1D10 region. Shown are RNA-seq reads and

502 PIWIL1-associated piRNAs derived from this region (below).

503 (C) A magnified view of the site of the sharp increase. The location is indicated with the blue line in (A).

504 (D) q-PCR analysis of several target mRNAs (top) and lncRNAs (bottom) in Piwil1 KO (blue) and in

505 Piwil1 slicer mutant (red) mice. The locations of PCR amplicons are shown in Figure 3A (Stambp),

506 Figure 5A (Tdrd1) and Figure 7A (Prelid1). Error bars represent the SE (n = 3).

507

508 Figure 6. B1 and B2 SINE sequences confer instability to mRNAs

509 (A) A schematic representation of a transgene construct. The PEST motif was fused to the C terminus of

510 the mCherry protein, which results in shorter half-life of the protein.

511 (B) FACS analysis of testicular cells from 28 dpp mice using Hoechst 33342.

512 (C) Deletion of B1 (top) and B2 (bottom) sequences from transgenes in late spermatocytes from

513 Stra8-Cre mice. A set of PCR primers spanning LoxP sites was used for PCR.

514 (D) The increase of mCherry fluorescence upon deletion of retrotransposon sequences.

515 (E) mCherry fluorescence (left), mCherry mRNA (center) and mCherry pre-mRNA (right) levels in the

516 absence (blue) or the presence (red) of Stra8-Cre allele in late spermatocytes are shown. RNA levels were

517 measured by q-PCR. Error bars represent the SE (n = 3). *P < 0.05 (Student's t-test).

518 (F, G) PIWIL1 targets B1 and B2 sequences for degradation. mCherry fluorescence (left) and mCherry

519 mRNA (right) levels in Piwil1+/− (blue) and Piwil1−/− (red) late spermatocytes are shown. Error bars

520 represent the SE (n = 3). *P < 0.05 (Student's t-test).

521

522 Figure 7. The mammalian conserved retrotransposon sequence in Prelid1 mRNA regulates its

523 stability in late spermatocytes.

524 (A) The distribution of PIWIL1-associated piRNAs mapped to Prelid1 mRNA in the antisense orientation.

525 PIWIL1-associated piRNAs are mapped to the MIR and L2 retrotransposon sequences (gray boxes) in the

526 antisense orientation.

527 (B) 3´RACE analysis of Prelid1 mRNA in Prelid1pA/pA late spermatocytes. The arrows indicate 3´RACE

528 PCR products amplified from the 3´ ends of wildtype and truncated Prelid1 mRNAs. The insertion site of

529 the polyadenylation signal sequence and the location of the primer used are shown in (A).

530 (C) q-PCR analysis of Prelid1 mRNA and pre-mRNA in Prelid1pA/pA late spermatocytes (left) and livers (right).

531 Error bars represent the SE (n = 3). *P < 0.05 (Student's t-test). The locations of the primers used for the

532 analyses of the mRNA are shown in (A).

533 (D) Published microarray data (1431420_a_at probe) are shown (Fallahi et al. 2010).

534 (E) Clustering analysis of the relative expression levels of 12,077 Ensembl protein-coding genes among

535 early spermatocytes (Espc), late spermatocytes (Lspc) and round spermatids (RS) is shown. Expression

536 levels were determined by RNA-seq in individual cell types.

537 (F) The read number (RPM) of PIWIL1-associated piRNAs complementary to mRNA(s) of each gene is

538 shown.

539 (G) Retrotransposons are less frequently found in round spermatid (RS) mRNAs than in early spermatocyte

540 (Espc) mRNAs. The number of genes in each class is shown above the bar on the left graph. P-values

541 between the <1 and >2 groups were calculated using the R prop.test.

542 (H) RS mRNAs have a short 3´UTR. Box plots analyzing the length of 5´UTR, CDS and 3´UTR of all

543 Ensembl mRNAs, Espc mRNAs and RS mRNAs are shown.

544 (I) PIWIL1 and piRNAs derived from pseudogenes and retrotransposons in piRNA clusters degrade

545 spermatocyte-expressed mRNAs and lncRNAs at the mature RNA level via piRNA-complementary

546 sequences in the RNAs. The PIWIL1 and piRNA-induced degradation is mostly dependent on the slicer

547 activity of PIWIL1.

548

549

550 REFERENCES

Aravin AA, Sachidanandam R, Bourc'his D, Schaefer C, Pezic D, Toth KF, Bestor T, Hannon GJ. 2008. A piRNA pathway primed by individual transposons is linked to de novo DNA methylation in mice. Molecular cell 31(6): 785-799.

Beyret E, Liu N, Lin H. 2012. piRNA biogenesis during adult spermatogenesis in mice is independent of the ping-pong mechanism. Cell research 22(10): 1429-1439.

Brennecke J, Aravin AA, Stark A, Dus M, Kellis M, Sachidanandam R, Hannon GJ. 2007. Discrete small RNA-generating loci as master regulators of transposon activity in Drosophila. Cell 128(6): 1089-1103.

Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, Rinn JL. 2011. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes & development 25(18): 1915-1927.

Deng W, Lin H. 2002. miwi, a murine homolog of piwi, encodes a cytoplasmic protein essential for spermatogenesis. Developmental cell 2(6): 819-830.

Doenecke D, Drabent B, Bode C, Bramlage B, Franke K, Gavenis K, Kosciessa U, Witt O. 1997. Histone gene expression and chromatin structure during spermatogenesis. Advances in experimental medicine and biology 424: 37-48.

Fallahi M, Getun IV, Wu ZK, Bois PRJ. 2010. A global expression switch marks pachytene initiation during mouse male meiosis. Genes 1: 469-483.

Faulkner GJ, Kimura Y, Daub CO, Wani S, Plessy C, Irvine KM, Schroder K, Cloonan N, Steptoe AL, Lassmann T et al. 2009. The regulated retrotransposon transcriptome of mammalian cells. Nature genetics 41(5): 563-571.

Flicek P, Amode MR, Barrell D, Beal K, Billis K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fitzgerald S et al. 2014. Ensembl 2014. Nucleic acids research 42(Database issue): D749-755.

Frost RJ, Hamra FK, Richardson JA, Qi X, Bassel-Duby R, Olson EN. 2010. MOV10L1 is necessary for protection of spermatocytes against retrotransposons by Piwi-interacting RNAs. Proceedings of the National Academy of Sciences of the United States of America 107(26): 11847-11852.

Gong C, Maquat LE. 2011. lncRNAs transactivate STAU1-mediated mRNA decay by duplexing with 3' UTRs via Alu elements. Nature 470(7333): 284-288.

Goodier JL, Kazazian HH, Jr. 2008. Retrotransposons revisited: the restraint and rehabilitation of parasites. Cell 135(1): 23-35.

Gou LT, Dai P, Yang JH, Xue Y, Hu YP, Zhou Y, Kang JY, Wang X, Li H, Hua MM et al. 2014. Pachytene piRNAs instruct massive mRNA elimination during late spermiogenesis. Cell research 24(6): 680-700.

Grimson A, Srivastava M, Fahey B, Woodcroft BJ, Chiang HR, King N, Degnan BM, Rokhsar DS, Bartel DP. 2008. Early origins and evolution of microRNAs and Piwi-interacting RNAs in animals. Nature 455(7217): 1193-1197.

Gunawardane LS, Saito K, Nishida KM, Miyoshi K, Kawamura Y, Nagami T, Siomi H, Siomi MC. 2007. A slicer-mediated mechanism for repeat-associated siRNA 5' end formation in Drosophila. Science 315(5818): 1587-1590.

Ha H, Song J, Wang S, Kapusta A, Feschotte C, Chen KC, Xing J. 2014. A comprehensive analysis of piRNAs from adult human testis and their relationship with genes and mobile elements. BMC genomics 15: 545.

Hirano T, Iwasaki YW, Lin ZY, Imamura M, Seki NM, Sasaki E, Saito K, Okano H, Siomi MC, Siomi H. 2014. Small RNA profiling and characterization of piRNA clusters in the adult testes of the common marmoset, a model primate. RNA 20(8): 1223-1237.

Ishii N, Owada Y, Yamada M, Miura S, Murata K, Asao H, Kondo H, Sugamura K. 2001. Loss of neurons in the hippocampus and cerebral cortex of AMSH-deficient mice. Molecular and cellular biology 21(24): 8626-8637.

Ishizu H, Siomi H, Siomi MC. 2012. Biology of PIWI-interacting RNAs: new insights into biogenesis and function inside and outside of germlines. Genes & development 26(21): 2361-2373.

Juliano C, Wang J, Lin H. 2011. Uniting germline and stem cells: the function of Piwi proteins and the piRNA pathway in diverse organisms. Annual review of genetics 45: 447-469.

Kalyana-Sundaram S, Kumar-Sinha C, Shankar S, Robinson DR, Wu YM, Cao X, Asangani IA, Kothari V, Prensner JR, Lonigro RJ et al. 2012. Expressed pseudogenes in the transcriptional landscape of human cancers. Cell 149(7): 1622-1634.

Kelley D, Rinn J. 2012. Transposable elements reveal a stem cell-specific class of long noncoding RNAs. Genome biology 13(11): R107.

Kim VN, Han J, Siomi MC. 2009. Biogenesis of small RNAs in animals. Nature reviews Molecular cell biology 10(2): 126-139.

Kimmins S, Sassone-Corsi P. 2005. Chromatin remodelling and epigenetic features of germ cells. Nature 434(7033): 583-589.

Kuramochi-Miyagawa S, Watanabe T, Gotoh K, Totoki Y, Toyoda A, Ikawa M, Asada N, Kojima K, Yamaguchi Y, Ijiri TW et al. 2008. DNA methylation of retrotransposon genes is regulated by Piwi family members MILI and MIWI2 in murine fetal testes. Genes & development 22(7): 908-917.

Li XZ, Roy CK, Dong X, Bolcun-Filas E, Wang J, Han BW, Xu J, Moore MJ, Schimenti JC, Weng Z et al. 2013. An ancient transcription factor initiates the burst of piRNA production during early meiosis in mouse testes. Molecular cell 50(1): 67-81.

Lunyak VV, Prefontaine GG, Nunez E, Cramer T, Ju BG, Ohgi KA, Hutt K, Roy R, Garcia-Diaz A, Zhu X et al. 2007. Developmentally regulated activation of a SINE B2 repeat as a domain boundary in organogenesis. Science 317(5835): 248-251.

Lynch VJ, Leclerc RD, May G, Wagner GP. 2011. Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Nature genetics 43(11): 1154-1159.

Pillai RS, Chuma S. 2012. piRNAs and their involvement in male germline development in mice. Development, growth & differentiation.

Reuter M, Berninger P, Chuma S, Shah H, Hosokawa M, Funaya C, Antony C, Sachidanandam R, Pillai RS. 2011. Miwi catalysis is required for piRNA amplification-independent LINE1 transposon silencing. Nature 480(7376): 264-267.

Robine N, Lau NC, Balla S, Jin Z, Okamura K, Kuramochi-Miyagawa S, Blower MD, Lai EC. 2009. A broadly conserved pathway generates 3'UTR-directed primary piRNAs. Current biology : CB 19(24): 2066-2076.

Ross RJ, Weiner MM, Lin H. 2014. PIWI proteins and PIWI-interacting RNAs in the soma. Nature 505(7483): 353-359.

Rouget C, Papin C, Boureux A, Meunier AC, Franco B, Robine N, Lai EC, Pelisson A, Simonelig M. 2010. Maternal mRNA deadenylation and decay by the piRNA pathway in the early Drosophila embryo. Nature 467(7319): 1128-1132.

Saito K, Nishida KM, Mori T, Kawamura Y, Miyoshi K, Nagami T, Siomi H, Siomi MC. 2006. Specific association of Piwi with rasiRNAs derived from retrotransposon and heterochromatic regions in the Drosophila genome. Genes & development 20(16): 2214-2222.

Schmidt D, Schwalie PC, Wilson MD, Ballester B, Goncalves A, Kutter C, Brown GD, Marshall A, Flicek P, Odom DT. 2012. Waves of retrotransposon expansion remodel genome organization and CTCF binding in multiple mammalian lineages. Cell 148(1-2): 335-348.

Silva JC, Shabalina SA, Harris DG, Spouge JL, Kondrashov AS. 2003. Conserved fragments of transposable elements in intergenic regions: evidence for widespread recruitment of MIR- and L2-derived sequences within the mouse and human genomes. Genetical research 82(1): 1-18.

Soumillon M, Necsulea A, Weier M, Brawand D, Zhang X, Gu H, Barthes P, Kokkinaki M, Nef S, Gnirke A et al. 2013. Cellular source and mechanisms of high transcriptome complexity in the mammalian testis. Cell reports 3(6): 2179-2190.

Tachibana M, Nozaki M, Takeda N, Shinkai Y. 2007. Functional dynamics of H3K9 methylation during meiotic prophase progression. The EMBO journal 26(14): 3346-3359.

Taft RJ, Pheasant M, Mattick JS. 2007. The relationship between non-protein-coding DNA and eukaryotic complexity. BioEssays : news and reviews in molecular, cellular and developmental biology 29(3): 288-299.

Tam OH, Aravin AA, Stein P, Girard A, Murchison EP, Cheloufi S, Hodges E, Anger M, Sachidanandam R, Schultz RM et al. 2008. Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes. Nature 453(7194): 534-538.

Watanabe T, Chuma S, Yamamoto Y, Kuramochi-Miyagawa S, Totoki Y, Toyoda A, Hoki Y, Fujiyama A, Shibata T, Sado T et al. 2011. MITOPLD is a mitochondrial protein essential for nuage formation and piRNA biogenesis in the mouse germline. Developmental cell 20(3): 364-375.

Watanabe T, Totoki Y, Toyoda A, Kaneda M, Kuramochi-Miyagawa S, Obata Y, Chiba H, Kohara Y, Kono T, Nakano T et al. 2008. Endogenous siRNAs from naturally formed dsRNAs regulate transcripts in mouse oocytes. Nature 453(7194): 539-543.

Xu M, You Y, Hunsicker P, Hori T, Small C, Griswold MD, Hecht NB. 2008. Mice deficient for a small cluster of Piwi-interacting RNAs implicate Piwi-interacting RNAs in transposon control. Biology of reproduction 79(1): 51-57.

Yamamoto Y, Watanabe T, Hoki Y, Shirane K, Li Y, Ichiiyanagi K, Kuramochi-Miyagawa S, Toyoda A, Fujiyama A, Oginuma M et al. 2013. Targeted gene silencing in mouse germ cells by insertion of a homologous DNA into a piRNA generating locus. Genome research 23(2): 292-299.

Zheng K, Xiol J, Reuter M, Eckardt S, Leu NA, McLaughlin KJ, Stark A, Sachidanandam R, Pillai RS, Wang PJ. 2010. Mouse MOV10L1 associates with Piwi proteins and is an essential component of the Piwi-interacting RNA (piRNA) pathway. Proceedings of the National Academy of Sciences of the United States of America 107(26): 11841-11846.

676

Figure 1 (panels A to L; image not recoverable from the extracted text). Scatter-plot axes: Piwil1+/- versus Piwil1-/- expression (RPKM) and Mov10l1 fl/+ versus Mov10l1 conditional KO expression (RPKM) in late spermatocytes; fold change in Piwil1 KO (-/- vs. +/-) versus fold change in Mov10l1 conditional KO (flox/-, Stra8-Cre vs. flox/+). Correlation coefficients shown in the plots: R = 0.74 and R = 0.57.

Figure 2 (panels A to G; image not recoverable from the extracted text). Axes: PIWIL1-bound piRNA hits (RPM), antisense (4 mismatches) or sense (unique), for Ensembl protein-coding genes and spermatocyte lncRNAs (RPKM > 0.5) ranked or grouped by fold change (Piwil1-/- / Piwil1+/- or Mov10l1 cKO / Mov10l1 fl/+; groups <1, 1-1.25, 1.25-1.5, 1.5-2, >2); log2 fold change versus log2 expression level (RPKM).

Figure 3

A Downloaded from genome.cshlp.org onB October 6, 2021Stambp - Published pseudogene by Cold (Stambp-ps1) Spring Harbor piRNA Laboratory cluster PressChr17 piRNA cluster Stambp 1000 2000 mG An 2 kb 1800 _ 200 PIWIL1-bound piRNAs RT-PCR (E) chr6: 128,394,000 128,396,000 128,398,000 Antisense (4 mismatches) 70 -

rpm +/Gt1 rpm +/Gt1 0 _ 0 rpm

0

1800 _ C 21 dpp testes 70 -

Gt1/Gt1 rpm

rpm Gt1/Gt1 +/Gt1 Gt1/Gt1 +/+ Gt2/Gt2

No. of unique small RNA reads No. of unique small RNA 0 _ Stambp pseudogene 0 piRNA Stambp-ps1 Gene trap insertion Gt1 (IST11216A6) Chr17 piRNA Gt2 (IST12340E11) No. of unique small RNA reads (Watsons strand) reads (Watsons No. of unique small RNA RT-PCR primer (D) 1 2 43 Northern probe (C) D F Late spermatocytes 28 dpp testes (Green: STAMBP, Blue: DAPI) 1.6 1.4 +/Gt1 +/+ +/Gt1 Gt1/Gt1 Gt1/Gt1 1.2 Gt2/Gt2 1.2 RS 1.0 RS RS 0.8 0.8 0.6 RS RS 0.4 RS

(Relative to +/Gt1) 0.4 0.2 * * * * * *

Precursor RNA level / Gapdh mRNA level level / Gapdh mRNA Precursor RNA 0 0 1 2 3 4 1 2 3 4 Primer set Upstream of Downstream of Upstream of Downstream of Insertion Insertion Insertion Insertion Stambp-ps1 Stambp-ps1 region region

E Late spermatocytes 12 20 20 20 G Late spermatocytes * * +/Gt1 Gt1/Gt1 10 * +/+ 16 Mov10l1 fl/+ +/Gt1 16 16 Piwil1+/- STAMBP Gt1/Gt1 Gt2/Gt2 8 Piwil1-/- Mov10l1 cKO 12 12 12 (fl/-, Stra8-Cre) GAPDH 6 8 8 8 STAMBP/GAPDH 1 6.27 4

(Relative to +/Gt1) 4 4 4 2 Round spermatids +/Gt1 Gt1/Gt1 RNA level / Gapdh mRNA level level / Gapdh mRNA RNA 0 0 0 0 Stambp pre-mRNAStambp pre-mRNA Stambp pre-mRNA Stambp pre-mRNAStambp pre-mRNA Stambp mRNA Stambp mRNAStambp pre-mRNAStambp pre-mRNA Stambp mRNAStambp pre-mRNA Stambp mRNA STAMBP (Exon7-Intron7) (Exon7-Intron7) (Exon6-Intron6) (Exon6-Intron6)(Exon7-Intron7) (Exon6-Intron6)(Exon7-Intron7) (Exon6-Intron6)

GAPDH

STAMBP/GAPDH 1 5.49 Figure 4

A B Downloaded from genome.cshlp.org on October 6, 2021 - Published by Cold Spring Harbor Laboratory Press 5’ UTR CDS 3’ UTR piRNA-complementary sequences SINE 120 within mRNAs upregulated in Piwil1-/- Antisense (4 mismatches) Piwil1-/- Piwil1+/- LTR 80 >2, P < 0.025 (172) LINE 23.5% 1.5-2, P < 0.025 (227)

(RPM) 1.25-1.5 (704) Other repeats 40 8.0% 57.0% 1-1.25 (3291) Nonrepeat <1 (5910) 8.9% 0 0.50 -0.5 0 -11 0 -11 0 (kb) 2.6% TSS Start codon polyA Average number of PIWIL1-bound piRNA hits Average number of PIWIL1-bound piRNA

E C Protein-coding genes All retrotransposons SINE LINE LTR B1 (SINE) B2 (SINE) B3 (SINE) Nonrepeat-derived piRNAs Repeat-derived piRNAs -16 -16 -2 P < 2.2 x 10 P < 2.2 x 10 8 -12 -16 -16 -11 60 50 P = 4.8 x 10 14 P = 5.0 x 10 35 P < 2.2 x 10 12 P < 2.2 x 10 14 P = 2.3 x 10 10000 172 45 7 227 1000 12 30 12 50 704 40 10 6

100010010 35 10 25 10 40 8 10010 30 5 8 20 8 30 25 4 6 3291 6 15 6 20 3 4 20 5910 15 2 4 10 4 1

1 10 2 10 1 2 5 2

% of mRNAs contatinig 5 retrotransposon sequence 0 0 0 0 0 <1 1.5 - 2 (P < 0.025) >2 (P < 0.025) 0.1 1 - 1.25 1.25 - 1.5 <1 1 - 1.25 1.25 - 1.5 1.5 - 2 (P < 0.025) >2 (P < 0.025) <1 1 - 1.25 1.25 - 1.5 1.5 - 2 (P < 0.025) >2 (P < 0.025) <1 1 - 1.25 1.25 - 1.5 1.5 - 2 (P < 0.025) >2 (P < 0.025) <1 1.5 - 2 (P < 0.025) >2 (P < 0.025) 1 - 1.25 1.25 - 1.5 <1 1 - 1.25 1.25 - 1.5 1.5 - 2 (P < 0.025) >2 (P < 0.025) <1 1 - 1.25 1.25 - 1.5 1.5 - 2 (P < 0.025) >2 (P < 0.025) 0.1 0 0 PIWIL1-bound piRNA hits (RPM) PIWIL1-bound piRNA Piwil1-/-

Piwil1-/- <1 <1 1.5-2 1.5-2 Piwil1+/- 1-1.25 1-1.25 >2 Piwil1+/- >2 1.25-1.5 P < 0.025 1.25-1.5 P < 0.025

D lncRNAs F All retrotransposons SINE LINE LTR B1 (SINE) B2 (SINE) B3 (SINE) -16 -16 Nonrepeat-derived piRNAs Repeat-derived piRNAs P < 2.2 x 10-16 P < 2.2 x 10-16 P = 2.3 x 10-12 P < 2.2 x 10-16 P < 2.2 x 10 P < 2.2 x 10-16 P < 2.2 x 10 100 70 40 50 45 25 25 485 45

10000 90 40 286 60 35 10000 80 40 35 20 20 674 30 50 35 100010010 70 1526 30 100010010 25 60 2585 30 15 15 40 25 50 20 25 20 30 20 40 15 10 10 15 30 20 15 10 20 10 10 5 5 10

1 5 1 10 5 5 % of lncRNAs contatinig retrotransposon sequence 0 0 0 <1 1 - 1.25 1.25 - 1.5 1.5 - 2 (P < 0.025) >2 (P < 0.025) <1 1 - 1.25 1.25 - 1.5 1.5 - 2 (P < 0.025) >2 (P < 0.025) <1 1 - 1.25 1.25 - 1.5 1.5 - 2 (P < 0.025) >2 (P < 0.025) <1 1.5 - 2 (P < 0.025) >2 (P < 0.025) <1 1.5 - 2 (P < 0.025) >2 (P < 0.025) 1 - 1.25 1.25 - 1.5 <1 1 - 1.25 1.25 - 1.5 1.5 - 2 (P < 0.025) >2 (P < 0.025) <1 1 - 1.25 1.25 - 1.5 1.5 - 2 (P < 0.025) >2 (P < 0.025) 0 1 - 1.25 1.25 - 1.5 0 0 0 0.1 0.1 PIWIL1-bound piRNA hits (RPM) PIWIL1-bound piRNA Piwil1-/- Piwil1-/- Piwil1+/- <1 <1 1.5-2 1.5-2 1-1.25 1-1.25 >2

Piwil1+/- >2 1.25-1.5 1.25-1.5 P < 0.025 P < 0.025

G H >2, P < 0.025 (172) >2, P < 0.025 (485) 1.5-2, P < 0.025 (227) >1.5, P < 0.025 (286) Piwil1-/- Piwil1-/- 1.25-1.5 (704) 1.25-1.5 (674) Piwil1+/- Piwil1+/- 1-1.25 (3291) 1-1.25 (1526) <1 (5910) <1 (2585) Flanking Flanking Flanking Flanking genomic sequence 5’ UTR CDS 3’ UTR genomic sequence genomic sequence lncRNA genomic sequence 50 35 30 40 25 30 20 15 20 10

density (%) 10 density (%) 5 Retrotransposon Retrotransposon 0 0 25 20

20 15 15 10 10 5 5 SINE density (%) SINE density (%) 0 0 -1 0 0.5-0.5 0 -0.50.5 0 -0.50.5 0 1 -1.5 0 1 -1 0 1.5 TSS Start codon Stop codon polyA (kb) TSS polyA (kb) Figure 5 A C PB1D10 piRNA hit (antisense) RT-PCR (D) DownloadedPrimer1 Primer2 from genome.cshlp.orgPrimer3 Primer4 on October 6, 2021 - Published by Cold Spring Harbor Laboratory Press Tdrd1 2,000 4,000 mG An 200 PIWIL1-bound piRNA hit PB1D10 ID2 Antisense (4 mismatches) rpm 10 nt 0 RNA-seq (Piwil1_het1) B PB1D10 ID2 RPM 8 _ PIWIL1 IP piRNA (sense, unique hit) Piwil1_het1 D 18 3 6 16 8 _ 2.5 5 Piwil1_het2 14 12 2 4 10 1.5 3 8 8 _ Piwil1_het3 6 1 2 4 0.5 1 2

24_ level Fold change in mRNA 0 0 0

(normalized by Gapdh mRNA) Primer1 Primer2 Primer3 Piwil1_KO1 Stambp Primer1 Primer2 Primer3 Primer4 mRNA Tdrd1 mRNA Prelid1 mRNA

24 _ Piwil1_KO2 6 5 2.5 3 2.5 5 4 2

24 _ 4 2 3 1.5 Piwil1_KO3 3 1.5 Piwil1-/- vs. Piwil1+/- 2 1 2 1 Piwil1-/ADH vs. Piwil1+/- 1 1 0.5 0.5 PIWIL1-bound piRNAs (sense, unique hit) 0 0 0 0 Fold change in lincRNA level Fold change in lincRNA

(normalized by Gapdh mRNA) Gm15389 CUFF22389 CUFF13988 CUFF27295 lncRNA lncRNA lncRNA lncRNA Figure 6 B C A CAG mCherry-B1flox/0 mCherry PEST loxP loxP Downloaded from genome.cshlp.org on OctoberpolyA 6, 2021 - Published300K by Cold Spring Harbor Laboratory Press - Stra8-CreTg/0 SINE (B1 or B2) flox D Late-spermatocyte 100 100 Δlox WT Late-spermatocyte WT Late-spermatocyte 200K flox/0 flox/0 mCherry-B1 mCherry-B2 80 80 flox/0 flox/0 flox/0 mCherry-B1 mCherry-B2 mCherry-B2 Tg/0

Tg/0 Hoechst Blue-A Stra8-Cre Stra8-Cre 60 60 100K - Stra8-CreTg/0 Count Count flox 40 40 Δlox 0 20 20 0 100K 200K 300K 0 0 Hoechst Red-A 0 1 2 3 4 0 1 2 3 4 10 10 10 10 10 10 10 10 10 10 mCherry fluorescence mCherry fluorescence

4.5 4.5 4.5 E mCherry fluorescence mCherry mRNA mCherry pre-mRNA 4 4 4

* 3.5 3.5 3.5 * 3 3 * 3 - 2.5 2.5 2.5 * Stra8-CreTg/0 2 2 2

1.5 1.5 1.5

1 1 1

0.5 Realative mCherry mRNA level 0.5 0.5 Realative mCherry pre-mRNA level (Nomalized by Gapdh mRNA level) (Nomalized by Gapdh mRNA level) 0 0 0 (normalized by WT fluorescence level) flox/0 flox/0 flox/0 flox/0 flox/0 mCherry-B1flox/0 mCherry-B2 mCherry-B1 mCherry-B2 mCherry-B1 mCherry-B2 Relative mCherry fluorescence level (median)

F G 4.5 mCherry fluorescence 4.5 mCherry mRNA 4.5 mCherry fluorescence 4.5 mCherry mRNA * 4 4 4 4

3.5 3.5 3.5 3.5

3 3 3 3 Piwil1+/- Piwil1+/- 2.5 2.5 2.5 2.5 * * Piwil1-/- Piwil1-/- 2 2 2 2

1.5 * 1.5 1.5 1.5

1 1 1 1 Realative mCherry mRNA level 0.5 Realative mCherry mRNA level 0.5 0.5 0.5 (Nomalized by Gapdh mRNA level) (Nomalized by Gapdh mRNA level)

0 (normalized by WT fluorescence level) 0 0

(normalized by WT fluorescence level) 0 flox/0 flox/0 Δlox/0 Δlox/0 Δlox/0 Δlox/0 flox/0 mCherry-B2flox/0 mCherry-B1 mCherry-B2 Relative mCherry fluorescence level (median) mCherry-B1 mCherry-B2 mCherry-B1 mCherry-B2 Relative mCherry fluorescence level (median) mCherry-B1 Figure 7 Late spermatocyte Liver B C 2 A 3’ RACE 8 * Downloaded from genome.cshlp.orgPolyA insertion on October (pA) 6, 2021 - Prelid1Published by Cold Spring Harbor Laboratory Press Prelid1+/+ * Prelid1pA/pA 3’ RACE primer (B) +/+ pA/pA 6 Prelid1 500 1000 mG An MIR L2a 4 1 50 PIWIL1-bound piRNAs Antisense (4 mismatches)

rpm 2

0 Realative mRNA level RT-PCR 0 0 (C) Primer1 Primer2 Primer3 (Nomalized by Gapdh mRNA level) Primer Primer2 pre-mRNA Primer Primer2 pre-mRNA Prelid1 mRNA Prelid1 mRNA

D E PIWIL1 and F pachytene piRNAs PIWIL1 and 1 pachytene piRNAs Espc Lspc Rspd

Espc mRNAs (3796 genes) 0. 8 Early-spermatocyte (Espc) mRNAs 12000 3796 genes 8000 Antisense (4 mismatches) 0.6 (e.g. meiotic recombination genes such as Rad51, Dmc1 and Sycp1) 4000

0.4 0

0.2 12000 Rspd mRNAs (1309 genes)

8000 Antisense (4 mismatches) Leptotene/Zygotene Early pachytene 0 Spermatogonia Preleptotene Mid pachytene Late pachytene Diplotene Round spermatid Round spermatid (Rspd) mRNAs

(Relative to Spermatogonia) 4000

Expression level of Prelid1 mRNA 1309 genes PIWIL1-bound piRNA hits (RPM) PIWIL1-bound piRNA

12077 Ensembl protein-coding genes (e.g. acrosome component genes 0 such as Acr, Acrv1 and Izumo1) mRNAs sorted by the number piRNA hits

H 5’ UTR CDS 3’ UTR G All SINE LINE LTR B1 (SINE) B2 (SINE) B3 (SINE) retrotransposons -15 -16 -4 -16 -5 P = 6.6 x 10 P < 2.2 x 10 P = 2.2 x 10 P = 0.012 P < 2.2 x 10 P = 2.7 x 10 P = 1.4 x 10-4 30 25 5 7 14 3.5 5 3796 6 12 3 25 20 4 4 5 10 2.5 20 15 3 3 4 8 2 Length (nt) 15 1309 6 10 2 3 1.5 2 10 2 4 1 5 1 1 5 1 2 0.5 Espc Rspd 0 Espc Rspd 0 0 0 0 Espc Rspd 0 0 Espc Rspd Espc Rspd Espc Rspd Espc Rspd 0 1000 2000 3000 4000 0 1000 2000 3000 4000 5000 0 200 400 600 800 % of mRNAs contatinig retrotransposon sequence All Espc Rspd All Espc Rspd All Espc Rspd

I

piRNA cluster piRNA cluster 5’ UTR CDS 3’ UTR Pseudogene Retrotransposon

piRNA mG An

PIWIL1 piRNA complex mG An mRNA lncRNA

mG An

Degradation Downloaded from genome.cshlp.org on October 6, 2021 - Published by Cold Spring Harbor Laboratory Press

Retrotransposons and pseudogenes regulate mRNAs and lncRNAs via the piRNA pathway in the germline

Toshiaki Watanabe, Ee-chun Cheng, Mei Zhong, et al.

Genome Res. published online December 5, 2014 Access the most recent version at doi:10.1101/gr.180802.114

Supplemental Material: http://genome.cshlp.org/content/suppl/2014/12/23/gr.180802.114.DC1

Accepted Manuscript: Peer-reviewed and accepted for publication but not copyedited or typeset; accepted manuscript is likely to differ from the final, published version.

Creative Commons License: This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

Advance online articles have been peer reviewed and accepted for publication but have not yet appeared in the paper journal (edited, typeset versions may be posted when available prior to final publication). Advance online articles are citable and establish publication priority; they are indexed by PubMed from initial publication. Citations to Advance online articles must include the digital object identifier (DOI) and date of initial publication.

Published by Cold Spring Harbor Laboratory Press