Supplemental Information

SUPPLEMENTAL MATERIALS AND METHODS

Vector construction

For the construction of pEF4-3xFlag A/B/C vectors, DNA fragment of 3xFlag-tag sequence cloned from pRH3xFlag vector (Saito et al. 2005) was inserted into KpnI and

BamHI site of pEF4-Myc/His A/B/C vectors (Life Technologies, V942-20). The

FLAG-MILI expression vector was produced by insertion of the Mili cDNAs, amplified form C57BL/6J line mouse testis. Primer sequences are described in Supplemental

Table S10. The FLAG-MIWI and -MIWI2 expression vectors were kindly gifted by Dr.

Kuramochi-Miyagawa (Kojima et al. 2009).

Immunofluorescence

Adult marmoset testes were fixed in 4 % (w/v) paraformaldehyde overnight at 4 °C.

Testes were the soaked in 15 % and 30 % (w/v) sucrose sequentially. Testes embedded in OCT compound (Sakura Finetech, 4583) or SCEM (Section-Lab, SCEM) were cut into 7 µm sections using cryostat (Leica, CM3050S). After air-drying, sections were placed in Target Retrieval Solution (Dako, S1699) and subjected to antigen retrieval using autoclave at 105 °C for at least 5 min and then blocked for 1 hour at room temperature in 2 % Block Ace (DS Pharma Biomedical, UK-B40). Primary antibody incubation was done overnight at 4 °C in blocking reagent with 0.1 % (v/v) Triton X-100. After washing in PBS, the appropriate secondary antibody (1:1000 dilution,

Alexa Fluor series, Molecular Probes) and 4’,6-diamidino-2-phenylindole (DAPI) solution (1:10000 dilution; Dojindo, 340-07971) was added and incubated at room temperature for 1 hour in blocking reagent with 0.1 % Triton X-100. Sections were mounted on glass slides in PermaFluor Aqueous Mounting Medium (Thermo,

TA-006-FM). Confocal laser microscopy (Zeiss, LSM710) was used to acquire images.

The following primary antibodies were used: cultured supernatant of mouse IgG1 and

IgG2b monoclonal anti-MARWI hybridoma cells (1:50 dilution, 2D9 and 4A1 line, respectively), rabbit polyclonal anti-DDX4 (VASA) (1:100 dilution for marmoset and

1:1000 dilution for mouse, Abcam, ab13840), rabbit polyclonal anti-DAZL (1:200 dilution for marmoset and 1:1000 dilution for mouse, abcam, ab34139), mouse IgG1 monoclonal anti-phospho-Histone H2A.X (Ser139) (γH2AX) (1:500 dilution, Millipore,

05-636).

Western blotting

Western blot analysis was performed as described previously (Miyoshi et al. 2010). Ten

µg of from tissue samples or immunoprecipitates was subjected to SDS-PAGE.

Culture supernatants of anti-MARWI (1A5) hybridoma cells (1:10 dilution), mouse monoclonal anti-FLAG M2 (1:2000 dilution, Sigma, F1804) and mouse monoclonal anti-β-TUBULIN (1:5,000 dilution, DSHB, E7) were used.

Immunoprecipitation Immunoprecipitation details have been described previously (Saito et al. 2009). Briefly, fragmented testes tissues were homogenized in binding buffer (30 mM HEPES, pH 7.3,

150 mM potassium acetate, 2 mM magnesium acetate, 5 mM dithiothreitol (DTT), 0.1%

Nonidet P-40 (Calbiochem, 492016), 2 mg/mL Leupeptin (Sigma, L9783), 2 mg/mL

Pepstatin (Sigma, P5718) and 0.5% Aprotinin (Sigma, A6279)). MARWI and MIWI were immunopurified using anti-MARWI (1A5) immobilized on Dynabeads protein G (Life Technologies, 10004D). Reaction mixtures were incubated at 4°C for 3 h and beads were rinsed twice with binding buffer then five times with wash buffer

(binding buffer with 0.5 M NaCl).

β-elimination

Periodate oxidation and β-elimination was performed as described (Kirino and

Mourelatos 2007; Ohara et al. 2007; Saito et al. 2007; Simon et al. 2011). A 100 µL mixture consisting of purified RNAs and 10 mM NaIO4 (Wako, 199-02401) was incubated at 4°C for 40 min in the dark. The oxidized RNAs were then incubated with 1

M Lys-HCl at 45°C for 90 min to achieve β-elimination.

Northern blotting

Northern blot analysis was carried out essentially as previously described (Saito et al.

2010). Total RNAs of testis or MARWI immunoprecipitates were isolated using

ISOGEN (NIPPON , 311-02502), run on 12% acrylamide-denatured gels, and transferred onto Hybond N+ or N membranes (GE Healthcare, RPN303N or RPN303B, respectively). RNAs were crosslinked with 280 nm UV or

1-ethyl-3-(3-dimenthylaminopropyl) carbodiimide (EDC)-mediated chemical crosslinking methods (Pall and Hamilton 2008). The hybridizations were performed at

42°C in 0.2 M sodium phosphate (pH 7.2), 1mM EDTA (pH8.0) (Life Technologies,

15575-038) and 7% SDS, and washed at 42°C in 2 × saline sodium citrate and 0.1%

SDS. The sequence of northern probes is described in Supplemental Table S10.

Preparation of small RNA libraries

Preparation of MARWI-piRNA and total small RNA libraries used the NEBNext Small

RNA Library Sample Prep Set (New England BioLabs (NEB), E7330) with slight modifications. Gel-purified small RNAs obtained using the Small RNA Gel Extraction kit (TaKaRa, RR065) underwent 3’ ligation at 16°C for 18 hour, then free 3’ adaptors were degraded using 5’ deadenylase (NEB, M0331) and RecJf (NEB, M0264). 3’ ligated RNAs underwent 5’ adaptor ligation at 16°C for 18 hour. RNAs were reverse transcribed using SuperScriptIII (Life Technologies, 18080-044) and were

PCR-amplified using LongAmp Taq 2xMaster mix (NEB, M0287) with 18 cycles.

In vitro Slicer assay

The in vitro slicer assay was performed as described previously (Miyoshi et al. 2008).

32P-radio labeled piR-2 target RNAs were incubated with MARWI immunoprecipitates at 35°C in cleavage buffer (25 mM Hepes-KOH (pH 7.5), 50 mM KOAc, 5 mM

Mg(OAc)2, 5 mM DTT, and 0.4 unit RNasin plus (Promega, N2611)) for 4 hour. The sequence of piR-2 target RNA is described in Supplemental Table S10.

Comparison of two replicate samples for RNA-seq and MARWI-IP small RNA

-seq

To analyze MARWI associated piRNAs and total RNA, we generated two libraries from two different samples (51 month- and 41 month-old adult marmoset testis).

Expression level of each gene was calculated for each of the libraries and plotted as dot plot for RNA-seq (R2 = 0.50), where the number of reads obtained from each unique sequence was counted for each of the libraries and plotted as a dot plot for MARWI-IP small RNA-seq (R2 = 0.89). If the sequence is only present in one of the libraries, the read count was considered to be zero for the other library.

Annotation of the reads

The annotation of marmoset genome mapped reads was determined as previously described (Kawamura et al. 2008). Briefly, the overlap between mapped regions of the reads was examined using feature track data of the UCSC Genome Browser (Meyer et al. 2013) and Ensembl databases (Flicek et al. 2013). Reads were assigned to a feature when the length of its overlap was longer than 90% of the small RNA. We defined the priority of the feature assignment to avoid any conflict of assignment. The priority was in the following order: miRNA, ncRNA, rRNA, tRNA, snRNA, snoRNA, to remove small RNA contaminants, protein coding , transposons, and repeats. The assignments to miRNA, ncRNA, rRNA, snRNA, snoRNA, and coding genes were performed using Ensembl tracks (Flicek et al. 2013). We used UCSC Genome Browser tRNA tracks for tRNAs, and UCSC Genome Browser RepeatMasker (repeats) and

Simple repeats tracks for repeats (Meyer et al. 2013; Smit et al. 2010). Transposons were defined by a combination of Ensembl retrotransposed tracks (Flicek et al. 2013), and the RepeatMasker track (transposon) from the UCSC Genome Browser (Meyer et al.

2013; Smit et al. 2010).

RNA-seq and data processing

RNA-seq libraries were prepared with the directional mRNA-seq library prep protocol using the TruSeq Small RNA Sample Prep Kit (Illumina, RS-200-0012) by TaKaRa

Dragon Genomics. Two replicate libraries were prepared and sequenced using

HiSeq2000, and obtained reads were combined resulting in total of 168,398,076 reads.

Reads were mapped to the marmoset reference genome (calJc3, WUSTL 3.2) using

Bowtie (Langmead et al. 2009) with default parameters. Expression of each transcript was determined using TopHat (Trapnell et al. 2009) and Cufflinks (Trapnell et al. 2010), except for Argonaute proteins as these sequences were unavailable within the Ensembl data (C_jacchus3.2.1)(Flicek et al. 2013) (see below).

Calculation of RPKM for Argonaute/Piwi expression

Because of the lack of marmoset Argonaute/Piwi mRNA sequence in Ensembl and

GenBank databases (Benson et al. 2013; Flicek et al. 2013), we determined the expression of marmoset Argonaute/Piwi mRNAs by calculating the number of reads that uniquely map to the transcript sequence of each Argonaute/Piwi mRNA. These mapped reads were normalized by the length of each transcript and the total number of genome mapped reads to calculate Reads per Kilobase per Million mapped reads

(RPKM).

Sequence logo and heatmap

We used motifStack bioconductor package to produce sequence logos. Sequences were aligned to the 5’ end to search for nucleotide bias. Heatmaps were depicted using Java

TreeView software (Saldanha 2004).

Distribution of RNA-seq and small RNA-seq data

RNA-seq and MARWI-IP small RNA-seq distribution was determined using the reads that were mapped 1–5 times within the genome. We normalized the reads by dividing the number of mapped reads by the number of positions mapped to the genome.

Regions with genomic sequences consisting of unidentified nucleotides (N), are shown as gaps. Transposons, coding genes, and pseudogenes were defined using annotations from the Ensembl database (Flicek et al. 2013), UCSC Genome Browser (Meyer et al.

2013), and manual curation of the pseudogenes according to the host gene sequence.

Some piRNA clusters were merged within the MARWI-IP small RNA-seq distribution since the gaps within the marmoset genomic sequence seem to have divided the single cluster into two.

miRNA prediction

We used total small RNA-seq reads 21–23-nt in length to predict miRNAs (1,258,550 reads). Marmoset miRNA prediction was performed using miRDeep2 software

(Friedländer et al. 2012). The reads were mapped using mapper2.pl script (included in miRDeep2 software), with default settings (accepting reads mapped to 1–5 positions within the genome, and seed match length of 18-nt with no mismatch).

To perform a miRDeep2 search, the following information was required: (1) marmoset mature miRNA sequences, (2) marmoset miRNA precursor sequences, and

(3) marmoset-relative mature miRNA sequences (miRNA sequences from primates other than marmoset). For marmoset-relative mature miRNA sequences, we used 4,460 primate-originating miRNAs from the miRBase (release 19)(Kozomara and

Griffiths-Jones 2011). For marmoset miRNA precursor sequences, we extracted 1,439 miRNA precursor sequences from marmoset Ensembl gtf miRNA tracks, because of the unavailability of marmoset miRNA sequences in miRBase. Because mature marmoset miRNA sequences are not available and the data used for the annotation of genomic regions only provide information for miRNA precursor sequences, we defined marmoset mature miRNAs in the following way: 1,439 marmoset miRNA precursor sequences from Ensembl miRNA tracks were extracted. We performed an NCBI blast search (Blastn with -F F -U F -S 1 options) against these miRNA precursor sequences using 4,460 primate mature miRNA reads to extract mature miRNAs from precursor sequences. Mismatches up to two nucleotides were permitted, because of differences in mature miRNA sequences between relative primate species. In this way, we extracted 927 mature miRNA sequences from the precursor sequences and defined them as marmoset mature miRNAs.

Endo-siRNA search

To identify endo-siRNAs, we used 21–23-nt long total small RNA-seq reads which mapped to genomic regions other than the location of miRNA genes (476,689 reads). We searched for complementary small RNAs with 2-nucleotide 3’ overhangs corresponding to products excised from long double-stranded precursors by Dicer in endo-siRNA biogenesis, and also manually searched for genomic regions that can potentially give rise to phased endo-siRNAs by folding into long stem-loop secondary structures (Kawamura et al. 2008). However, we were unable to detect endo-siRNA-like small RNAs with significant reads.

Calculation of strand-bias, read-frequency and ping-pong signature

Strand-bias, read-frequency (Fig. 5A; Supplemental Fig. S10C) and ping-pong signature

(10-base binding partners) (Supplemental Fig. S9A) were calculated as described previously (Saito et al. 2009), with several modifications. For calculation of strand-bias and read-frequency, MARWI-piRNA sequences were mapped to a set of transposable elements obtained from UCSC Genome Browser RepeatMasker tracks (Meyer et al.

2013). The transposable element sequences from RepeatMasker include each of the sequences originating from different genomic positions (Smit et al. 2010). We searched this dataset for transposable elements that perfectly aligned to piRNA-reads. Those reads aligned to multiple sequences were divided by the number of times they were aligned. The MARWI-piRNA read count for each transposon sequence was gathered according to their subfamily names, so transposons originating from multiple regions within the genome were integrated as one. We took the top 100 transposable elements with a higher number of mapped piRNA reads to depict a heatmap showing the fraction of entire transposable element-mapped piRNA reads that was aligned to each transposable element, and the strand bias. The genomic copy number was defined as the number of positions from which a transposable element originates. For the ping-pong signature, we determined the threshold of the reads as the value at which the frequency of 2–11 nt 10-base binding partners was lower than 0.05%, because of the depth of the reads within this analysis compared to the previous study (Saito et al. 2009). Using these reads, we performed a sequence similarity search using the NCBI BLASTN program among small RNAs to detect potential binding partners that bind each other with 10-base full complementarity.

piRNA cluster analysis

The piRNA cluster was defined as described previously (Girard et al. 2006), with some minor modifications. A 5 kb window was slid for every 100 bp, searching for regions with the number of piRNA reads exceeding the threshold; bordering windows were merged as a single cluster. For the threshold, we normalized the previously described piRNA cluster threshold, 5 piRNAs / 5 kb, according to our total number of genome mapped reads (134,519,473). This resulted in a new threshold of 8,681 piRNAs / 5 kb. We only allowed clusters with over 200 unique sequences per 5 kb to be used, to avoid defining artifact reads as piRNA clusters. For read counts, we accepted those reads that mapped to the genome 1–5 times, and normalized this by dividing the number of reads by the number of mapped genomic positions.

Synteny region of piRNA clusters

Marmoset piRNA clusters were analyzed for their syntenic conservation in humans and mice. piRNA cluster information in the mouse and marmoset was obtained from previous research (Girard et al. 2006). We obtained syntenic regions of the genome for the marmoset, mouse, and human using UCSC genome browser liftOver. Clusters overlapping each other within their syntenic region were defined as a syntenic cluster.

TE-derivcd piRNA target prediction

To determine whether TE-derived piRNAs can target TEs other than their originating subfamily members, we searched for the sequences that perfectly align at least within positions 2–21 of piRNAs. We first determined the TE subfamily that each piRNA sequence originates from by aligning each piRNA sequence to TEs with a perfect match.

This assigned each TE-derived piRNA sequence to one of the TE subfamilies. The pooled piRNA sequences for each TE subfamily were then searched for target TEs against those subfamilies other than the one they originated from. This was repeated for piRNA pools in all TE subfamilies, and the ratio of TE-derived piRNA reads with targets other than their originating subfamily was calculated. The alignment was performed using Bowtie (Langmead et al. 2009) because of the large number of obtained sequences.

Accession number

The GenBank accession numbers for the genes used in this paper are as follows:

PIWIL1/MARWI (XM_002753158), PIWIL2/MARLI (XM_002756872),

PIWIL3/MARWI3 (XM_002743622), LOC100410008/MARWI2 (XM_002807291),

EIF2C1/AGO1 (XM_002750611), EIF2C2/AGO2 (XM_002759200), EIF2C3/AGO3

(XM_002806988), EIF2C4/AGO4 (XM_002750676), URI1/URI1 (XM_002761974),

WDR1/WDR1 (XM_002745989), BUB1/BUB1 (XM_002757493), CACYBP/CACYBP

(XM_002760341), SSBP3/SSBP3 (XM_002750861), ANO6/ANO6 (XM_002807084),

TPM1/TPM1 (XM_002761858), CBL/CBL (XM_002754434), MLEC/MLEC

(XM_002753092), C1H9orf64/C1H9orf64 (XM_002742790), Piwil1/Miwi

(NM_021311), Piwil2/Mili (NM_021308), Piwil4/Miwi2 (NP_067283), PIWIL1/HIWI

(NM_004764), PIWIL2/HILI (NM_001135721), PIWIL3/HIWI3 (NM_001008496) and

PIWIL4/HIWI2 (NM_152431).

The Ensembl accession numbers for the pseudogenes used in this paper are as follows: CACYBP-P1 (ENSCJAG00000001398), SSBP3-P1 (ENSCJAG00000010873), and TMP1-P1 (ENSCJAG00000015442).

SUPPLEMENTAL REFERENCES

Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. 2013. GenBank. Nucleic Acids Res 41: D36–42.

Flicek P, Ahmed I, Amode MR, Barrell D, Beal K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fairley S, et al. 2013. Ensembl 2013. Nucleic Acids Res 41: D48–55.

Friedländer MR, Mackowiak SD, Li N, Chen W, Rajewsky N. 2012. miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades. Nucleic Acids Res 40: 37–52.

Girard A, Sachidanandam R, Hannon GJ, Carmell MA. 2006. A germline-specific class of small RNAs binds mammalian Piwi proteins. Nature 442: 199–202.

Han J, Pedersen JS, Kwon SC, Belair CD, Kim Y-K, Yeom K-H, Yang W-Y, Haussler D, Blelloch R, Kim VN. 2009. Posttranscriptional Crossregulation between Drosha and DGCR8. Cell 136: 75–84.

Kawamura Y, Saito K, Kin T, Ono Y, Asai K, Sunohara T, Okada TN, Siomi MC, Siomi H. 2008. Drosophila endogenous small RNAs bind to Argonaute#2 in somatic cells. Nature 453: 793–797.

Kirino Y, Mourelatos Z. 2007. Mouse Piwi-interacting RNAs are 2'-O-methylated at their 3' termini. Nat Struct Mol Biol 14: 347–348.

Kojima K, Kuramochi-Miyagawa S, Chuma S, Tanaka T, Nakatsuji N, Kimura T, Nakano T. 2009. Associations between PIWI proteins and TDRD1/MTR-1 are critical for integrated subcellular localization in murine male germ cells. Genes Cells 14: 1155–1165.

Kozomara A, Griffiths-Jones S. 2011. miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res 39: D152–7.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the . Genome Biol 10: R25. Lin ZY-C, Imamura M, Sano C, Nakajima R, Suzuki T, Yamadera R, Takehara Y, Okano HJ, Sasaki E, Okano H. 2012. Molecular signatures to define spermatogenic cells in common marmoset (Callithrix jacchus). Reproduction 143: 597–609.

Meyer LR, Zweig AS, Hinrichs AS, Karolchik D, Kuhn RM, Wong M, Sloan CA, Rosenbloom KR, Roe G, Rhead B, et al. 2013. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res 41: D64–9.

Miyoshi K, Miyoshi T, Hartig JV, Siomi H, Siomi MC. 2010. Molecular mechanisms that funnel RNA precursors into endogenous small-interfering RNA and microRNA biogenesis pathways in Drosophila. RNA 16: 506–515.

Miyoshi K, Uejima H, Nagami-Okada T, Siomi H, Siomi MC. 2008. In vitro RNA cleavage assay for Argonaute-family proteins. Methods Mol Biol 442: 29–43.

Ohara T, Sakaguchi Y, Suzuki T, Ueda H, Miyauchi K, Suzuki T. 2007. The 3' termini of mouse Piwi-interacting RNAs are 2'-O-methylated. Nat Struct Mol Biol 14: 349–350.

Pall GS, Hamilton AJ. 2008. Improved northern blot method for enhanced detection of small RNA. Nat Protoc 3: 1077–1084.

Saito K, Inagaki S, Mituyama T, Kawamura Y, Ono Y, Sakota E, Kotani H, Asai K, Siomi H, Siomi MC. 2009. A regulatory circuit for piwi by the large Maf gene traffic jam in Drosophila. Nature 461: 1296–1299.

Saito K, Ishizu H, Komai M, Kotani H, Kawamura Y, Nishida KM, Siomi H, Siomi MC. 2010. Roles for the Yb body components Armitage and Yb in primary piRNA biogenesis in Drosophila. Genes Dev 24: 2493–2498.

Saito K, Ishizuka A, Siomi H, Siomi MC. 2005. Processing of pre-microRNAs by the Dicer-1-Loquacious complex in Drosophila cells. PLoS Biol 3: e235.

Saito K, Sakaguchi Y, Suzuki T, Suzuki T, Siomi H, Siomi MC. 2007. Pimet, the Drosophila homolog of HEN1, mediates 2'-O-methylation of Piwi-interacting RNAs at their 3' ends. Genes Dev 21: 1603–1608. Saldanha AJ. 2004. Java Treeview—extensible visualization of microarray data. Bioinformatics 20: 3246–3248.

Simon B, Kirkpatrick JP, Eckhardt S, Reuter M, Rocha EA, Andrade-Navarro MA, Sehr P, Pillai RS, Carlomagno T. 2011. Recognition of 2′-O-methylated 3′-end of piRNA by the PAZ domain of a Piwi protein. Structure 19: 172–180.

Smit A, Hubley R, Green P. 2010. RepeatMasker. http://www.repeatmasker.org.

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25: 1105–1111.

Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. 2010. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnol 28: 511–515.

Zhang R, Peng Y, Wang W, Su B. 2007. Rapid evolution of an X-linked microRNA cluster in primates. Genome Res 17: 612–617.

Hirano_Figure S1

A B 21~23nt 24~33nt 6.68% 83.46% 25 Unique region 2~9 regions 20 4.10% > 10 regions

15 Multiply mapped 24.00%

10 Uniquely mapped 71.90%

Fraction of reads (%) 5 0 20 40 60 80

0 Fraction of reads (%) 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 Read length (nt)

C Cellular ncRNAs 14 9.11% Coding genes 1.68%

Repetitive Ratio (%) Elements miRNA 3.60 19.28% tRNA 4.65 Other ncRNA 0.86 Unannotated Coding region 1.06 69.96% UTR 0.62 Transposon 17.07 Repeat 2.21 Unannotated 69.96

Supplemental Fig. S1. Small RNAs in the marmoset testis. (A) The size distribution of total small RNAs from the marmoset testis is shown. The small RNA library represents two major peaks of ~22-nt-long (miRNA-like) and 29–30-nt-long (piRNA-like). In this library, the miRNA-like fraction (21–23-nt) and piRNA-like fraction (24–33-nt) occupy 6.68% and 83.46% of the reads, respectively. (B) Ratio of the total small RNA reads that mapped either to unique (71.9%) or multiple (28.1%) regions within the genome. TEs mapping to multiple genomic positions implies that they are derived from transposable elements or repeats. (C) Annotations of genomic regions where total small RNA reads are perfectly mapped. Unannotated (mostly intergenic) regions account for 69.96% of mapped regions, and transposable elements and repeats for 19.28%. Note that the ratios do not add up to 100% due to the rounding error. Hirano_Figure S2

A B 2 Ratio (%) miRNA 43.68 miRNAs tRNA 0.44 1 43.68% Other ncRNA bits 1.21 Coding region 0.52 UTR 0.35 0 Transposon 8.72 1 5 10 15 20 Repeat 5.52 nt position from 5’ end (nt) potential novel Unannotated 39.56 miRNAs

C D miRNAs derived from URI1 coding region Clustered miRNAs derived from DNA transposons U C 3' 5' U A A mir-MER91C-21 G U G G C A A G U G G 5’ A G G G A A C C A U U U A U G A C A G U U A A U U 3’ G C G G A C A C C U U C C U U G G U A A G U A C U G U C A A U U G U G C A U G C U U U C U C A A G A A G G U G G C A U U C A U G U C A G U G C A A U G U A U G A A G A G U U U U U C C G C U G U A A G U A C A U C A C A A G U U G A A A A Chr22: 24603044 - 24603123 ChrX: 134414895 - 134414972 Mature miRNA Reads Mature miRNA Reads GUGGCAAGUUCAGAACCAUUUAGUGAUCAGUUGAAUAGUCAGUUGAACUGUUCAGUGAAUGGUUCCAGUUCUUACCACAG UGCAGUGCCUUUCUCAAGAAGGUGGCAUUCAUGUGAGCUAAAACAUGAAUGUCGCCUUUUUGAGAAGAGUAAUGUACA ...... CAGAACCAUUUAGUGAUCAGUU...... 97 ...... UGAAUGUCGCCUUUUUGAGAAGA...... 6733 ...... CAGAACCAUUUAGUGAUCAGU...... 21 ...... UUCUCAAGAAGGUGGCAUUCAUG...... 6257 ...... CAGAACCAUUUAGUGAUCAGUUa...... 3 ...... UGAAUGUCGCCUUUUUGAGAAG...... 2535 ...... UUCUCAAGAAGGUGGCAUUCAU...... 1074 ...... CAGAACCAUUUAGUGAUCAGUa...... 2 ...... UGAAUGUCGCCUUUUUGAGAA...... 889 ...... CAGAACCAUUUAGUGAUCAGUg...... 2

30 3200 21-23nt 21-23nt 0 0 RNA RNA * * * * * * * * * 3200 * * * * * * * * 30 3200 * 30 24-33nt 24-33nt 0 RNA 0 RNA 30 3200 600 125 RNA-seq 0 RNA-seq 0 600 125 14 MER91C URI1 isoforms Transposon chr22 24530000 24550000 24600000 chrX 134300000 134400000 134460000

E Testis Liver Testis Liver

miR-MER91C miR-MER91C -21-3p -5-3p

U6 U6

Supplemental Fig. S2. microRNAs in the marmoset testis. (A) Nucleotide bias of 21–23-nt-long small RNAs. The preference of uracil and adenine at the first position of the 5’ end is observed. Together with the nucleotide length, which peaks at 22-nt, these sequences possess characteristics of miRNAs. (B) Annotation of 21–23-nt-long small RNAs in the testis. miRNAs are the largest class of 21–23-nt RNA, and 43.68% of 21–23-nt-long small RNAs in the testis are annotated to known miRNAs. Other sequences, which map to various regions including transposable elements, are candidates of novel miRNAs. (C) URI1 exon-derived miRNAs. Putative secondary structure of miRNA precursor derived from URI1 protein coding region is shown with obtained reads. Red sequence indicates mature miRNA sequence; blue sequence indicates miRNA* sequence which derive from opposite arm of mature miRNA. Lower panel shows the density of small RNA-seq and RNA-seq reads within the genomic region (blue, sense; yellow, antisense). Gaps in the x-axis indicate unidentified genomic regions. This data indicate that URI1 mRNAs produce both URI1 proteins and miRNAs, and that URI1 mRNAs may be regulated by the miRNA biogenesis pathway, as shown previously (Han et al. 2009). (D) X-linked MER91C miRNA cluster. miRNAs mainly originated from MER91C DNA transposons are clustered within the X . The X-linked MER91C miRNA cluster contains 29 pre-piRNA-like hairpins which conceivably result in 17 distinct mature miRNAs from the 5’ arms and 19 miRNAs from the 3’ arms of ~22-nt RNA duplexes from the reiteration of identical sequences (also see Supplemental Table S3). The mir-MER91C-21 secondary structure and genomic distributions of smRNA-seq and RNA-seq reads are shown. Asterisks over the smRNA-seq read distribution indicate members of MER91C-derived miRNA precursors that overlap the MER91C transposon. X-linked MER91C-miRNA cluster sequences are shown in Supplemental Table S3. (E) Northern blotting of MER91C cluster miRNAs (miR-MER91C-21-3p, miR-MER91C-5-3p). These miRNAs are expressed in the testis, but not in the liver. miR-MER91C-5-3p shares the mature sequence with miR-MER91C-6-3p, -15-3p, -19-3p, and -22-3p. Hirano_Figure S3

A

R2 = 0.50 5

3 (Log) Marmoset #2 (51 month old) 1

1 3 5 Marmoset #1 (41 month old) (Log)

B

AGO1 0.85

AGO2 3.58

AGO3 0.91

AGO4 8.56

PIWIL1/MARWI 42.75

PIWIL2/MARLI 10.12

PIWIL3/MARWI3 0.68

PIWIL4/MARWI2 4.03

0 10 20 30 40 50

RPKM in marmoset testis

Supplemental Fig. S3. Expression of Argonaute family Genes in the marmoset testis. (A) Two individual directional RNA-seq libraries are similar. RNA sequences were plotted as a pairwise comparison between two individual directional RNA-seq libraries from 41 month and 51 month old marmosets. R2 = 0.50 indicates that individual libraries are related to each other. (B) The read frequency of Argonaute family members (AGO subfamily, AGO1 to AGO4; PIWI subfamily, PIWIL1/MARWI, PIWIL2/MAWRLI, PIWIL3/MARWI3 and PIWIL4/MIRWI2) was counted and normalized (shown in RPKM value), using RNA-seq data from marmoset adult testis. MARWI mRNA is most abundantly expressed in the testis (RPKM = 42.75). MARWI-piRNA sequencingdata(MARWI-piRNA withthelargestnumber ofreadswastRNA-derived piRNA). piR-2 andpiR-3 areMARWI-piRNAs,which occupiedsecondand third largestnumberofreads accordingtothe against piR-2 andpiR-3fromMARWIimmunoprecipitates. piRNA signals weredetected,confirming MARWI-IP-seqdata. - -terminiofMARWI-piRNAs is2’ 3’ - without 2’ ( produced. TogetherwithFig. 1 GST-fused N-terminalregionoffourmarmosetPIWI proteins usingananti-MARWImonoclonalantibody(1A5)we Supplemental Fig.S4.MARWIproteinandassociated piRNAs. B ) A B C - 26 mer RNA + O - -methyl modificationwere used ascontrols.TheunchangedMARWI-piRNAsignals indicatethatthe 26 mer RNA-2’ -O-Me + - MARWI-piRNA + - A

data,thisshowsthattheanti-MARWI antibodyspecificallyrecognizesMARWI proteins. MIWI-piRNA + O -methylated, asthecharacteristic ofpiRNAs. -elimination

GST-MARWI-N 100 - 20 - 30 - 40 - 50 - 75 - nt GST-MARLI-N

Total RNA GST-MARWI3-N ( A

) Westernblottingwasperformedagainstthe Non-immune

IP GST-MARWI2-N

anti-MARWI anti-GST anti-MARWI MARWI piR-2 ( C ) Northern blottingwasperformed 100 - 20 - 30 - 40 - 50 - 75 - nt

Total RNA

Non-immune Hirano_Figure S4 IP anti-MARWI MARWI piR-3 Hirano_Figure S5

R2 = 0.89

3

1 (Log) Marmoset #2 (51 month old) -1

-2 0 2 4 Marmoset #1 (41 month old) (Log)

Supplemental Fig. S5. Two individual MARWI-piRNA libraries are remarkably similar. piRNA sequences were plotted as a pairwise comparison between two individual MARWI-piRNA libraries from 41 month and 51 month old marmosets. R2 = 0.89 indicates that individual libraries are closely related to each other, and the depth of MARWI-piRNA sequencing was satisfied. Furthermore, the senescence in marmoset testes between 41 and 51 months has a moderate or no influence in the MARWI-piRNA population. from thedeep sequencing. derived from 5’ regionofAsp-tRNAbut not forLeuandValtRNAs, whichwasconsistantwith thenumberofreadsobtained immunoprecipitates withprobestargeting 5’regionoftRNAsforAsp, Leu,andVal.Thesignals indicatethatpiRNAsare Supplemental Fig.S6.MARWI associateswithtRNA-derivedpiRNAs. 100 - nt 30 - 40 - 50 - 75 - 20 - piRNA-reads 78,376 Total RNA

anti-MARWI IP

Non-immune AspGUC 5’ tRNA- 100 - nt 20 - 30 - 40 - 50 - 75 - piRNA-reads 6 Total RNA

anti-MARWI IP

Non-immune LeuCAG 5’ tRNA- Northernblottingwasperformed fromMARWI 100 - nt 30 - 40 - 50 - 75 - 20 - piRNA-reads 43 Total RNA

anti-MARWI IP

Non-immune ValCAC 5’ tRNA- Hirano_Figure S6 Hirano_Figure S7

A 1200 MARWI -piRNA 0

1200 400

RNA-seq 0

400 CBL isoforms chr11 1687000 1690000 1696000

1000 MARWI -piRNA 0

1000 500

RNA-seq 0

500 MLEC isoforms chr9 111550000 111560000 111566000

B 25000 MARWI -piRNA 0 25000 400

RNA-seq 0

400 C1H9orf64

chr1 84314000 84320000 84330000

Supplemental Fig. S7. MARWI associates with genic piRNAs. (A) mRNA 3’ untranslated region (3’ UTR) derived piRNAs. CBL and MLEC (LOC100387153) mRNAs preferentially generate piRNAs in the 3’ UTRs. (B) piRNAs originating across the whole C1H9orf64 (LOC100392619) mature mRNA transcript. These data indicate that genic-piRNA pathway is conserved among animal phyla (Robine et al. 2009; Saito et al. 2009). Hirano_Fig. S8 A anti-MARWI (2D9) anti-MARWI (4A1) Merge

100m B MARWI VASA Merge

100m C H2AX DAZL Merge

100m

D MARWI DAZL Merge

100m E MARWI H2AX Merge

100m Supplemental Fig. S8. MARWI expresses from pachytene sprematocyte to elongating spermatid. (A) Adult marmoset testis is stained using two different monoclonal anti-MARWI antibodies (2D9: IgG1, 4A1: IgG2b), and detected by different secondary antibodies that recognize distinct immunoglobulin isotypes. The staining patterns of 2D9 and 4A1 antibodies are identical. (B) Co-immunostaining with MARWI and VASA (DDX4). In the previous report, VASA signals were detect from spermatocytes to round spermatids (Lin et al. 2012). MARWI signals are merged VASA signals in the cytoplasm except luminal cells. (C) Co-immunostaining with germ line markers, DAZL and H2AX, is shown. DAZL positive cells are present in the base cells of seminiferous epithelium (spermatogonia) to some layers of spermatocyte cells, as reported in the previous report (Lin et al. 2012). H2AX is expressed in the meiotic prophase. During early spermatocyte (leptotene and zygotene), H2AX displays punctate-appearance. At pachytene spermatocyte, H2AX is restricted to the sex body. DAZL positive cells are present both in the seminiferous epithelium and the cells which dot-shaped H2AX signals are detected. Thus, DAZL expresses from spermatogonia to pachytene spermatocytes in the marmoset adult testis. MARWI is stained with DAZL (D) and H2AX (E). MARWI is partly overlapped with DAZL positive cells. (E) MARWI signals are detected from the cells with dot-shaped H2AX signals to H2AX negative luminal cells of seminiferous tuble. Thus MARWI expresses from pachytene spermatocyte to elongating spermatid stage. Hirano_Figure S9

A Total small RNA Total Total small RNA Total MARWI-piRNA

1-10 nt 2.46 1-10 nt 0.21 2.91 MARWI Total -piRNA 2-11 nt 0 2-11 nt 0.04 0.04

0% 12.25% 25%

B

Potential piRNA Potential piRNA target TEs target genes Antisense Sense Bi-directional Antisense Sense Bi-directional

Sense 3.52 0.08 0 Sense 0.01 0.01 0

Antisense 0.87 5.88 0.06 Antisense 0.02 0.03 0 TE-piRNAs TE-piRNAs Bidirectional 0 0 0.18 Bidirectional 0 0 0

0% 5% 10%

Supplemental Fig. S9. TE-derived MARWI-piRNAs might regulate oppositely oriented non-self transposon transcripts without the ping-pong pathway. (A) Heat maps showing the degree to which complementary 5’ 10 mers are found in each pairwise comparison. The analysis was performed for both total small RNA and MARWI-piRNA sequences. Control analysis used the 10 mer from position 2-11. Together with Fig. 1E, these data indicate that piRNAs from marmoset adult testes are mostly independent from the ping-pong pathway. (B) Heat maps showing the degree to which complementary target transposons permitting up to two mismatches are found in each pairwise comparison. Targets were searched against transposable elements other than those of the subfamily which piRNAs originate from. The right panel shows the control search for complementary target protein coding genes in the same conditions. Results indicate that definite proportion of TE-sense piRNAs target antisense TE transcripts and TE-antisense piRNAs target sense TE transcripts. Hirano_Figure S10 A C 40 LINE X = 85101 20

0 40 piRNA read per kb (log10) SINE X = 52093

4 3 2 1 0 TE copies Ratio of clusters in piRNA 20 L1:L1PA7 L1:L1MB7 L1:L1MB8 0 L2:L2 L2:L2b 40 L1:L1PB1 LTR X = 15005 L2:L2c L2:L2a 20 L1:L1P2 L1:L1M5 CR1:L3 L1:L1M4 0 L1:L1ME1

Fraction of Transposons (%) Transposons Fraction of L1:L1ME4a 40 L1:L1M2 DNA X = 4092 L1:L1P3 L1:L1MA7 LINE 20 RTE:L4 L1:L1PB4 L1:HAL1 CR1:L3b 0 L1:L1MD2 L1:L1ME3A 0..1 1..2 2..3 3..4 4..5 5..6 6..7 L1:L1MC L1:L1PA13 Mapped piRNA reads (log10) L1:L1ME3E L1:L1PA16 L1:L1MA6 B L1:L1MC2 L1:L1MC5 L1:L1MA8 LINE SINE L1:L1MEd L1:L1PREC2 250 250 L1:L1MEc R2 = 0.873 R2 = 0.936 L1:L1MCa MIR:MIRb MIR:MIR SINE 200 200 MIR:MIR3 MIR:MIRc Alu:AluJr Alu:AluSg 150 150 Alu:AluSx Alu:AluJb ERV1:HERVS71-int ERV1:HUERS-P2-int 100 100 ERV1:LTR1 ERV1:HERVE-int ERVK:LTR22B 50 50 ERV1:MER4E ERVL:MER21A ERVL-MaLR:MLT1C n = 146 n = 52 ERV1:LTR10E 0 0 ERV1:LTR10C Copy number in piRNA clusters Copy number in piRNA clusters 0 50 100 150 200 250 0 50 100 150 200 250 ERVL-MaLR:THE1B ERV1:MER49 Genomic copy number (1000copy) Genomic copy number (1000copy) ERVL-MaLR:MSTB ERVL-MaLR:MSTA ERV1:PABL_A LTR DNA ERVL-MaLR:MLT1A0 ERV1:HERVH-int ERV1:MER66D 50 50 ERV1:HERV9-int R2 = 0.359 R2 = 0.736 ERV1:LTR19A ERV1:MER52A ERVL-MaLR:MLT1J

40 40 ERV1:MER51A LTR ERVK:LTR14 ERVL-MaLR:MLT1F-int ERV1:MER50-int 30 30 ERVL:MLT2A2 ERV1:LTR31 ERV1:LTR26 20 ERV1:LTR2752 20 ERV1:MER90a ERVL-MaLR:MLT1J-int ERVL-MaLR:MLT1D 10 10 ERV1:LTR7C ERVL-MaLR:THE1D n = 512 n = 208 ERV1:MER51B 0 0 ERV1:MER50 ERV1:MER61A

0 10 20 30 40 50 Copy number in piRNA clusters 0 10 20 30 40 50 ERV1:LTR9D Copy number in piRNA clusters ERVL-MaLR:MSTA-int Genomic copy number (1000copy) Genomic copy number (1000copy) ERV1:HERV3-int ERVL-MaLR:THE1A-int ERV1:MER4D0 ERVL:LTR62 D E ERVL-MaLR:THE1B-int ERV1:LTR7B ERV1:MER52-int Colored bars: TEs in piRNA clusters piRNA cluster #1 ERV1:MER66A Gray bar: TEs out of piRNA clusters ERVL:MER21B ERV1:HERVFH21-int 80 LINE ERVK:HERVK14-int hAT-Charlie:MER103C DNA hAT-Tip100:MER115 TcMar-Mariner:HSMAR1 40 hAT-Tip100:MER45A hAT-Charlie:MER5A hAT-Charlie:Charlie18a 0 piRNA cluster #2 Ratio of TE copies in piRNA clusters 80 SINE 0% 2% >4%

40 0.11% (Ratio of piRNA cluster in the genome)

0 80 LTR piRNA cluster F 40 4000 MARWI 0 0 -piRNA 80 4000 DNA 800 Fraction of Transposons in each group (%) Transposons Fraction of 40 RNA-seq 0 800 0 0..1 1..2 2..3 3..4 4..5 5..6 6..7 Transposon chr5 Mapped piRNA reads (log10) 137550000 L1:L1PA7 137585000 Transposon Hirano_Figure S10, continued

Supplemental Fig. S10. Genomic copy number, number of copies in piRNA cluster, piRNA reads per kb of TEs and number of mapped piRNA reads for each transposon. (A) The number of mapped piRNA reads for each transposon was counted, and their distribution is shown as histograms for LINEs, SINEs, LTRs, and DNA transposons. The average number of mapped piRNA reads is also indicated. Larger fractions of LINE and SINE transposable elements have over 1,000 piRNAs mapped to an element. The lower peak for SINE transposons represents AluY transposable elements, indicating that a small number or no piRNAs are mapped to AluY transposable elements. (B) Dot plot analysis comparing the genomic copy number of each transposable element, and the copy number of transposable elements located within piRNA clusters. A strong correlation was observed for LINEs and SINEs. (C) Heat map shows the ratio of transposon copies found in piRNA clusters for each TE subfamily (white, lower; red, higher). Ratio of piRNA cluster region within the whole genome (0.11%) is indicated in the color-scale. Mapped piRNA reads normalized by the total TE length is shown in the bar graph (log scale). LTR retrotransposable elements are enriched in piRNA clusters, and can efficiently produce piRNAs. (D) Histogram showing the fraction of transposons with or without insertions into piRNA clusters that produce indicated amounts of piRNAs, related to Fig. 5C. Colored lines indicate transposons with insertions in piRNA clusters; gray lines indicate transposons without insertions. X-axis indicates the amount of piRNAs mapped to each transposable element; y-axis indicates the ratio of transposable elements within each group. (E) Each member of a TE subfamily was inserted into piRNA clusters with a sense/antisense bias. This was correlated with the observation that piRNAs corresponding to one subfamily are predominantly antisense to the TE, whereas piRNAs corresponding to another subfamily are predominantly sense, suggesting that the direction of TE insertion into the piRNA clusters is a major factor for setting the strong bias to the production of these piRNAs. (F) Example of an L1PA7 transposon oppositely inserted into a unidirectional piRNA cluster. The LINE L1PA7, to which the largest number of TE-derived piRNAs was mapped, has a copy integrated in one of the piRNA clusters in the antisense orientation. This results in the production of piRNAs that are antisense to L1PA7. A total of 59 L1PA7 copies were observed in the piRNA clusters. Hirano_Figure S11

A B C D IP from mouse tesits IP from mouse tesits Mouse testis NIH/3T3 Flag-Miwi Flag-Mili Flag-Miwi2 Mouse testis MW Non-immune anti-PIWIL1 kDa MW Non-immune anti-PIWIL1 anti-PIWIL1 only anti-MARWI kDa 175 - /PIWIL1 nt 220 - 150 - anti-FLAG 160 - 100 - 80 - anti-MARWI 80 - /PIWIL1 120 - 60 - anti- -TUBULIN 100 - 58 - 50 - 90 - - MIWI 40 - 80 - 46 - 70 - 30 - - MIWI-piRNA 60 - Heavy chain 20 - 30 - 50 -

40 - anti--TUBULIN 10 -

30 - Light chain

E anti-MARWI/PIWIL1 (2D9) anti-DAZL Merge

100m

F anti-MARWI/PIWIL1 (2D9) anti-MVH Merge

100m

Supplemental Fig. S11. Anti-MARWI/PIWIL1 monoclonal antibodies cross-react with orthologous mouse MIWI/PIWIL1 protein. (A) Western blotting was performed against mouse testis and embryonic fibroblast cell line NIH/3T3 using anti-MARWI/PIWIL1 monoclonal antibody (1A5). (B) Western blotting was performed against 293T cells expressed with Flag-tagged mouse Piwi proteins, using anti-MARWI/PIWIL1 (1A5) (upper), anti-FLAG (middle), and C) Silver staining of immunoprecipitant from mouse testis with anti-MARWI/PIWIL1 antibody (1A5). (D) MIWI bound RNA was visualized by 32P-labeling. These data show that anti-MARWI/PIWIL1 antibody (1A5) also recognizes orthologous mouse MIWI/PIWIL1 protein. Immunostaining was carried out against mouse testis using anti-MARWI/PIWIL1 (2D9) and anti-DAZL (E) or anti-DDX4/mouse VASA homologue (MVH) (F) antibodies. Note that anti-MARWI/PIWIL1 (4A1) antibody could not detect significant signals in mouse testis (also see Supplemental Fig. S8). Hirano_ Table S1

Supplemental Table S1. Summary of sequencing analysis. Mapped Read number Mapped read frequency (%) Total small RNA-seq (smRNA-seq) Total 18,832,688 21-23nt fraction 1,258,550 1,026,395 81.55 24-33nt fraction 15,718,112 12,792,323 81.39 MARWI-piRNA-seq (RIP-seq) (used 24-33 nt products) MARWI-piRNAs 140,662,017 134,519,473 95.63 Marmoset #1 80,836,991 Marmoset #2 59,825,026 Directional RNA-seq (RNA-seq) 168,398,076 127,728,428 64.59 Marmoset #1 91,307,038 Marmoset #2 77,091,038 Hirano_ Table S3 & S4

Supplemental Table S3. X-linked MER91C cluster miRNAs. 5' arm 3' arm Pre-miRNA name Total read Read Seed * Sequence ** Read Seed * Sequence ** cja-mir-MER91C-1 2728 2443 UCUCAAG a UUCUCAAGAAGGUGGCAUUCAU 285 GAUGACA l UGAUGACACCUUUUUGAGAAGA cja-mir-MER91C-2 9268 8796 UCUCAAG a UUCUCAAGAAGGUGGCAUUCAUG 472 GAUGUCA m UGAUGUCACUUUUUUGAGAAGA cja-mir-MER91C-3 3800 3575 UCACAAG b UUCACAAGGAGGUGUCAUUCAUG 225 GAUGACA l UGAUGACACCUUCUUGAGAAGA cja-mir-MER91C-4 3816 3575 UCACAAG b UUCACAAGGAGGUGUCAUUCAUG 241 GAAUGCC n UGAAUGCCGCCUUCUUGAGAAGA cja-mir-MER91C-5 9832 3575 UCACAAG b UUCACAAGGAGGUGUCAUUCAUG 6257 GAAUGUC o UGAAUGUCGCCUUCUUGAGAAGA cja-mir-MER91C-6 6634 377 UACAAGA c UUACAAGAAGGUGCCAUUCAUG 6257 GAAUGUC o UGAAUGUCGCCUUCUUGAGAAGA cja-mir-MER91C-7 7525 7497 UCACAAG b UUCACAAGGAGCUGUCAUUCAUG 28 GAAUGUC o UGAAUGUCACCUUCUUGAGAAGA cja-mir-MER91C-8 3656 3575 UCACAAG b UUCACAAGGAGGUGUCAUUCAUG 81 GAAUCCC p UGAAUCCCACCUUCUUGAGAAGA cja-mir-MER91C-9 3404 2774 UCUCAAG a UUCUCAAGAAGGUGGUAUUCAUG 630 GAUAUCA q UGAUAUCACCUUUUUGAGAAGA cja-mir-MER91C-10 4547 3575 UCGCAGG d UUCGCAGGGAGGUGUCAUUCAUG 972 GAACGUC r UGAACGUCACCUUCUUGAGAAGA cja-mir-MER91C-11 3404 2774 UCUCAAG a UUCUCAAGAAGGUGGUAUUCAUG 630 GAUAUCA q UGAUAUCACCUUUUUGAGAAGA cja-mir-MER91C-12 364 123 CCGCAGG e UCCGCAGGGAGGUGUCAUUCAUG 241 GAAUGCC n UGAAUGCCGCCUUCUUGAGAAGA cja-mir-MER91C-13 3404 2774 UCUCAAG a UUCUCAAGAAGGUGGUAUUCAUG 630 GAUAUCA q UGAUAUCACCUUUUUGAGAAGA cja-mir-MER91C-14 3780 3575 UCACAAG b UUCACAAGGAGGUGUCAUUCAUG 205 GAAUGCU s UGAAUGCUGCCUUCUUGAGAAGA cja-mir-MER91C-15 6640 377 UACAAGA c UUACAAGAAGGUGCCAUUCAUG 6263 GAAUGUC o UGAAUGUCGCCUUCUUGAGAAGA cja-mir-MER91C-16 105 52 UCACAAG b UUCACAAGGUGGUGUCAUUCA 53 GAAUGCC n UGAAUGCCACCUUUUUGAGAAGA cja-mir-MER91C-17 8992 8796 UCUCAAG a UUCUCAAGAAGGUGGCAUUCAUG 196 GAUGUCA m UGAUGUCACCUUCUUGAGAGGA cja-mir-MER91C-18 482 10 UCACAAG b UUCACAAGGAGGUGUCAUUUA 472 GAUGUCA m UGAUGUCACUUUUUUGAGAAGA cja-mir-MER91C-19 9832 3575 UCGCAGG d UUCGCAGGGAGGUGUCAUUCAUG 6257 GAAUGUC o UGAAUGUCGCCUUCUUGAGAAGA cja-mir-MER91C-20 7738 7497 UCACAAG b UUCACAAGGAGCUGUCAUUCAUG 241 GAAUGCC n UGAAUGCCGCCUUCUUGAGAAGA cja-mir-MER91C-21 20334 8796 UCUCAAG a UUCUCAAGAAGGUGGCAUUCAUG 11538 GAAUGUC o UGAAUGUCGCCUUUUUGAGAAGA cja-mir-MER91C-22 9832 3575 UCGCAGG d UUCGCAGGGAGGUGUCAUUCAUG 6257 GAAUGUC o UGAAUGUCGCCUUCUUGAGAAGA cja-mir-MER91C-23 9770 507 GUUCAGG f UGUUCAGGAAGGUGUUUUUCAU 9263 AAUGGCA t GAAUGGCACCCUUCUGAGCAGA cja-mir-MER91C-24 2489 1192 ACUCAAG g UACUCAAGAGGGGGCAAUCAUG 1297 GAUUGAC u UGAUUGACACCUCUGUGAGUGG cja-mir-MER91C-25 146 0 CUCGAGA h ACUCGAGAGAGUAGCAAUCACA 146 GAUUGAA v UGAUUGAAACCUCUGUGAGUAG cja-mir-MER91C-26 3150 1921 GCUGCAG i UGCUGCAGAGUGGGCAAUCAUU 1229 GAUUGAC u UGAUUGACACAUCUGCAGGUAG cja-mir-MER91C-27 2553 39 GCUGCAG i UGCUGCAGAGAGUGGCAAUCAUU 2514 GAUUGAC u UGAUUGACACGUCUGCAGGUAG cja-mir-MER91C-28 1360 1263 UCUUAUG j UUCUUAUGGUCUUCCUAUCCUU 97 GAUAGGA w GGAUAGGAAACCCAUAAGAACU cja-mir-MER91C-29 18122 12 ACUCCAG k UACUCCAGAGAGUAGCAAUCAU 18110 UUGAAAC x AUUGAAACCUCUGAGAGUAGA X-linked MER91C cluster miRNAs encoded in the minus strand of chromosome X (chx:134296736 - 134465757: -) are shown. cja-miR-MER91C-25-5p counts no read in our sequencing. * miRNAs that share same seed sequence mark same alphabets (a-x). Seed type alphabets in this Table links Supplementary Table 4. ** miRNAs that share mature sequence mark same Greek letters (

Supplemental Table S4. Seed conservation of X-linked MER91C cluster miRNAs, related to Table S3. 5' arm miRNA seed UCUCAAG UCACAAG UACAAGA UCGCAGG CCGCAGG GUUCAGG ACUCAAG CUCGAGA GCUGCAG UCUUAUG ACUCCAG (Seed type) a b c d e f g h (No read) i j k Human 1 1 0 0 0 0 0 0 0 0 0 Chimpanzee 0 1 0 0 0 0 0 0 0 0 0 Siamang 0 2 0 0 0 0 0 0 0 0 0 Rhesus monkey 0 2 0 0 0 0 0 0 0 0 0 Ygm 0 1 0 0 0 0 0 0 0 0 0 Spider monkey 2 3 2 0 0 0 0 0 0 0 1 Marmoset 7 9 2 3 1 1 1 1 2 1 1 3' arm miRNA seed GAUGACA GAUGUCA GAAUGCC GAAUGUC GAAUCCC GAACGUC GAUAUCA GAAUGCU AAUGGCA GAUUGAC GAUUGAA GAUAGGA UUGAAAC (Seed type) l m n o p q r s t u v w x Human 0 0 0 0 0 0 0 0 0 0 0 0 0 Chimpanzee 0 1 0 0 0 0 0 0 0 0 0 0 0 Siamang 0 0 0 0 0 0 0 0 0 0 0 0 0 Rhesus monkey 0 0 0 0 0 0 0 0 0 0 0 0 0 Ygm 0 0 0 0 0 0 0 0 0 0 0 0 0 Spider monkey 0 0 0 0 0 0 0 0 0 2 0 0 0 Marmoset 2 3 4 7 1 1 3 1 1 3 1 1 1 Alphabet letter links the seed types in Supplemental Table S3. Mature miRNA sequences of X-linked miRNAs in primates without marmoset are reffered to Zhang et al. 2007. Hirano_ Table S9

Supplemental Table S9. List of common marmoset examined in this study.

Series No. Developmental stage Age Gender Usage I3945 Adult 3y5m (41 month) male Total small RNA-seq, MARWI-piRNA-seq, RNA-seq, Western blot I3814 Adult 4y3m (51 month) male MARWI-piRNA-seq, RNA-seq, Northern blot I3814 Adult 4y3m (51 month) male Immunoprecipitation, Western blot, Northern blot I3219 Adult 5y11m (71 month) male Immunoprecipitation, Western blot, Northern blot I678 Adult 2y4m (28 month) male Immunoprecipitation, Western blot, Northern blot I4279M Adult 2y2m (26 month) male Immunohistochemistry Hirano_ Table S10

Experiment Vector name Primer name Primer sequence MARWI_F_EcoRI AAGAATTCATGACTGGGCGAGCCAGA pGEX-5x1-MARWI_1-612nt MARWI_R612stop_XhoI GGGCTCGAGTTATGTGGGTGGAAGCTCATTT MARLI_F_EcoRI CCGAATTCATGGATCCTTTCCGACCATC pGEX-5x1-MARLI_1-603nt GST expression MARLI_R603stop_XhoI GGGCTCGAGTTAAGGGTGATCTGGAGAGTGCA vector MARWI3_F_EcoRI AAGAATTCATGACTGGTAGGGCCAGGAC pGEX-5x1-MARWI3_1-597nt MARWI3_R597stop_XhoI AAACTCGAGTTACGCGAGTTCTTTGGAAAACT MARWI2_F_EcoRI AAGAATTCATGAGTGGAAGGGCCCGA pGEX-5x1-MARWI2_1-636nt MARWI2_R636_XhoI CCCCTCGAGTTAACTTGATGGCAGCTCCCTC 3xFlag expression 3xFlag_F_KpnI pEF4-3xFlag A/B/C AAGGTACCATGGACTACAAAGACCATGAC vector 3xFlag_R_BamHI CCGGATCCCTTGTCATCGTCATCCTT 3xFlag-Mili Mili_F_EcoRI pEF4-3xFlag-Mili TGAATTCAATGGATCCTGTCAGGCCG expression vector Mili_R_full_NotI CACGCGGCCGCTTACAGGAAGAACAGGTTCCCA

Experiment Primer name Primer sequence anti-miR-MER91C-20-3p TCTTCTCAAAAAGGCGACATTCA anti-miR-MER91C-5-3p TCTTCTCAAGAAGGCGACATTCA anti-U6 snRNA GCTTCACGAATTTGCGTGTCATCCT anti-piR-2 CATGGTTGTGCACACATTTTTACAATCTAA Northern anti-piR-3 GCTGCCAGCAACAGTCCATGCAAAAGGAA probe anti-tRNA-GluGAG_5' GCCGAATCCTAACCACTAGACCACCAGGGA anti-tRNA-GluGAG_3' TGGTTCCCTGACCGGGAATCGACCCGGGCC anti-tRNA-AspGAC_5' GGGGATACTCACCACTATACTAACGAGGA anti-tRNA-LeuCTG_5' AGCGCCTTAGACCGCTCGGCCATCCTGAC anti-tRNA-ValGTG_5' GAACGTGATAACCACTACACTACGGAAAC

Experiment Oligo name Sequence 26mer RNA UCGAAGUAUUCCGCGUACGUGAUGUU -elimination 26mer RNA-2'-O-Me UCGAAGUAUUCCGCGUACGUGAUGUUm Slicer assay piR-2 target RNA AAUCGCAUGGUUGUGCACACAUUUUUACAAUCUAAUUGCC O-methyl modification Supplemental Tables . in .xls format Supplemental Table S2. Known and novel miRNAs expressed in marmoset testis. Supplemental Table S5. piRNA sequences mapped to tRNAs. Supplemental Table S6. Genes with piRNAs mapped to their 3’ UTR sequence. Supplemental Table S7. Marmoset piRNA clusters. Supplemental Table S8. piRNAs mapped to each transposable element.