bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Maternal Plag1 deficiency delays zygotic genome activation and two-cell stage
2 embryo development
3 Running title Plag1 controls genome activation
4 Elo Madissoon1§, Anastasios Damdimopoulos2§, Shintaro Katayama1§, Kaarel Krjutskov3,4,
5 Elisabet Einarsdottir1,4, Katariina Mamia5, Bert De Groef6, Outi Hovatta5, Juha Kere1,7,8*,
6 Pauliina Damdimopoulou1,5,9*
7 1 Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
8 2 Bioinformatics and Expression Analysis core facility, Department of Biosciences and
9 Nutrition, Karolinska Institutet, Stockholm, Sweden.
10 3 Competence Centre on Health Technologies, Tartu, Estonia.
11 4 Molecular Neurology Research Program, University of Helsinki and Folkhälsan Institute of
12 Genetics, Helsinki, Finland
13 5 Department of Clinical Science, Intervention and Technology, Karolinska Institutet,
14 Stockholm, Sweden
15 6 Department of Physiology, Anatomy and Microbiology, La Trobe University, Bundoora,
16 Victoria, Australia
17 7 Research Programs Unit, Molecular Neurology, University of Helsinki, and Folkhälsan
18 Institute of Genetics, Helsinki, Finland.
19 8 School of Basic and Medical Biosciences, King’s College London, Guy’s Hospital, London,
20 UK
21 9 Swetox, Karolinska Institutet, Unit of Toxicological Sciences, Södertälje, Sweden
22 § Equal contribution
23 *Corresponding authors: [email protected] and [email protected]
1 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
24 Keywords: pleomorphic adenoma gene 1; preimplantation embryo; cleavage-stage embryo;
25 zygotic genome activation
26
27 Abbreviations
28 2c, 4c, 8c 2-cell, 4-cell, 8-cell
29 BMA Best match average
30 DEG Differentially expressed gene
31 dpc Days post coitus
32 GO Gene ontology
33 HET Heterozygous
34 hpf Hours post fertilization
35 KO Knockout
36 matPlag1KO Maternal Plag1 knockout
37 PLAG1 Pleomorphic adenoma gene 1
38 RNA-seq RNA sequencing
39 SINE Short interspersed nuclear element
40 WT Wildtype
41 ZGA Zygotic genome activation
42
43
2 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
44 Abstract
45 Pleomorphic adenoma gene 1 (PLAG1) is a transcription factor that is involved in cancer and
46 growth. We discovered a de novo DNA motif containing a PLAG1 binding site, located within
47 the promoters of genes activated during zygotic genome activation (ZGA) in human embryos.
48 This site frequently overlapped with Alu elements and was conserved in mouse B1 elements.
49 We show that Plag1 is essential for timely preimplantation embryo development. Mouse
50 oocytes lacking maternally loaded Plag1 gave rise to embryos that spent significantly longer
51 time in 2-cell stage, displayed delayed regulation of 1,089 genes and in contrast to wildtype,
52 expressed Plag1 from the paternal allele. The PLAG1 motif was enriched in promoters of
53 mouse delayed-activation genes that also showed a significant overlap with human ZGA
54 genes. By ontology, these mouse and human genes were connected to ribosome biogenesis
55 and protein synthesis. Our data suggests that PLAG1/Plag1 affects ZGA through a
56 conserved DNA motif within retrotransposons influencing ribosomes and protein synthesis, a
57 mechanism that might also explain its roles in cancer and growth.
3 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
58 Introduction
59 Early preimplantation embryo development is dependent on a number of processes,
60 including chromatin remodeling, maternal RNA degradation and zygotic genome activation
61 (ZGA) (Jukam et al., 2017; Niakan et al., 2012). Transcription from the newly formed zygotic
62 genome starts gradually and a major increase in transcriptional output, also known as the
63 major ZGA, takes place during the zygote to 2-cell (2c) transition in mice and 4c to 8c
64 transition in humans (Jukam et al., 2017; Niakan et al., 2012). Successful ZGA is a
65 prerequisite for the formation of a totipotent embryo with the capacity to grow, develop to all
66 embryonic and extra-embryonic tissues, and implant into the receptive uterine endometrium.
67 Knowledge about gene expression programs during the first steps of embryonic development
68 is essential for understanding totipotency, lineage differentiation, and infertility, as well as for
69 regenerative medicine. This has prompted many laboratories, including ours, to map
70 transcriptional programs during preimplantation development (De Iaco et al., 2017;
71 Petropoulos et al., 2016; Tohonen et al., 2015; Vassena et al., 2011; Xue et al., 2013; Yan et
72 al., 2013).
73 In a previous study, we used an RNA sequencing (RNA-seq) method based on the detection
74 of the 5’ ends of transcripts to study gene expression during the first three days of human
75 preimplantation development. Our data revealed 129 upregulated transcription start sites (or
76 transcript far 5’-ends, TFEs) during human major ZGA and de novo motif analyses of the
77 associated promoters led to our discovery of PRD-like homeodomain transcription factors as
78 regulators of ZGA in humans (Jouhilahti et al., 2016; Madissoon et al., 2016; Tohonen et al.,
79 2015). Several additional significant de novo motifs were discovered, and some of them
80 harbored known transcription factor binding sites, including that of pleomorphic adenoma
81 gene 1 (PLAG1).
82 PLAG1 encodes a C2H2 zinc finger transcription factor that was first characterized in 1997
83 through positional cloning in pleomorphic adenomas of the salivary gland (Kas et al., 1997).
4 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
84 It belongs to the same protein family as the imprinted tumor suppressor PLAG-like 1
85 (PLAGL1) and the oncogene PLAG-like 2 (PLAGL2). PLAG1 and PLAGL2 have overlapping
86 functions; they can bind to the same DNA elements, induce insulin-like growth factor (IGF2)
87 expression, and stimulate mitogenic signaling and tumor growth in mice (Hensen et al.,
88 2002). Ectopic expression of PLAG1 and PLAGL2 resulting from chromosomal translocation
89 events can be found in malignant tumors (Juma et al., 2016). Associations with cancer have
90 been the focus of most studies on PLAG1 family transcription factors and consequently, less
91 is known about their role in normal physiology. There are no reports on PLAG family genes in
92 preimplantation embryo development or pluripotency.
93 Plag1 knockout (KO) mice are growth restricted and subfertile; Plag1KO males have
94 impaired spermatogenesis (Juma et al., 2017) and Plag1KO females give birth to small litters
95 (Hensen et al., 2004). The poor breeding success of the Plag1KO mice combined with our
96 discovery of PLAG1 binding sites in the promoters of human ZGA genes prompted us to
97 study the reproductive phenotype of Plag1KO animals. Here, we describe the role of Plag1 in
98 the murine female reproductive tract and preimplantation embryos. Our results show that
99 Plag1 is dispensable in normal ovarian and uterine function but essential for ZGA.
5 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
100 Results
101 A de novo motif containing a PLAG1 binding site is enriched in the promoters of
102 human ZGA genes
103 Analysis of the 129 promoters upregulated during major ZGA in human embryos revealed 13
104 significant de novo DNA motifs (Tohonen et al., 2015), including a 31 bp element harboring a
105 known PLAG1 binding site (MA0163.1 in JASPAR, p=0.0076 by TomTom) (Fig. 1a). This
106 PLAG1 de novo motif was found in 74 of the 129 promoters upregulated during ZGA
107 (E=2.6×10-646 by MEME; File S1) and similar sites were found in 93 altogether (p<0.0001 by
108 MAST; Fig. 1b & File S1). On average, each promoter contained 4.3 motifs and they were
109 preferably located approximately 1,250 bp upstream of the transcription start site (Fig. 1b). In
110 addition, the PLAG1 de novo motif frequently overlapped with Alu elements in a manner
111 similar to the PRD-like homeodomain transcription factor motif we discovered in our previous
112 study (Tohonen et al., 2015) (Fig. 1b, c & File S1). In fact, these two motifs are both located
113 within a conserved region of Alu elements in close vicinity to an RNA polymerase III promoter
114 element, A-box (Ludwig et al., 2005) (Fig. 1c). Alu elements are primate-specific
115 retrotransposable short interspersed nuclear elements (SINE) that emerged from a
116 duplication of the 7SL RNA gene (Ullu and Tschudi, 1984) and their evolutionary
117 counterparts in rodents are the B1 elements (Labuda et al., 1991). Interestingly, the segment
118 of Alu elements containing the PRD and PLAG1 de novo motifs was well conserved in rodent
119 B1 elements and the putative PLAG1/Plag1 binding within the elements was nearly identical
120 to the consensus PLAG1 binding motif (Voz et al., 2000) (Fig 1c). Human PLAG1 and mouse
121 Plag1 proteins are 94% identical by sequence and their DNA binding domains are identical
122 (Text S1). Collectively, these data suggested that PLAG1/Plag1 could be involved in mouse
123 and human ZGA through binding to Alu and B1 elements. If true, disruption of PLAG1/Plag1
124 could lead to reproductive failure.
125 Plag1 knockout affects growth and reproduction in mice
6 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
126 To study the role of Plag1 in fertility, we obtained Plag1 knockout (KO) mice (Hensen et al.,
127 2004). Pups from heterozygous intercrosses (HET×HET) were genotyped to study the effect
128 of Plag1 deficiency on embryonic lethality. Genotype ratios from 16 litters encompassing 164
129 pups did not significantly deviate from the expected Mendelian distribution, although 43%
130 fewer KO pups were born (n=27) than expected (n=47) (Fig. 2a). Weights recorded at
131 weaning (3 weeks of age) confirmed growth retardation in KO animals: Plag1KO females
132 were 40% smaller and Plag1KO males 41% smaller than WT pups (Fig 2b) (Hensen et al.,
133 2004). Compared to HET intercrosses, breeding pairs with a Plag1KO female and a
134 Plag1HET male produced significantly fewer pups per litter (Fig. 2c). When we reversed
135 parental genotypes (Plag1KO males with Plag1HET females), litter size was not affected
136 (Fig. 2c). Homozygous KO×KO intercrosses produced the smallest litters (Fig. 2c). The
137 significant reduction in litter size of Plag1KO mothers was seen as early as 7.5−8.5 days post
138 coitum (dpc), when fewer implantation marks were observed in the uterus compared to the
139 Plag1HET and WT mothers (Fig. 2d, e).
140 The number of litters per month over a three-month continuous breeding period did not differ
141 between HET×HET crosses and pairs consisting of a Plag1KO female and a Plag1HET male
142 (Fig. 2f). When we reversed the parental genotypes and crossed Plag1KO males with
143 Plag1HET females, two of the three pairs did not manage to maintain the approximate one-
144 litter-per-month rate (Fig. 2f). When we intercrossed homozygous Plag1KO mice, litter
145 frequency was significantly reduced compared to Plag1HET intercrosses: the three KO×KO
146 breeding pairs produced only two litters during the entire three-month test period (Fig. 2f).
147 In summary, Plag1KO females produced small litters with normal frequency; Plag1KO males
148 had normal litter size but irregular frequency, and KOxKO pairs produced very small and very
149 infrequent litters.
150 Plag1 deficiency does not affect the ovaries or uterus
7 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
151 To study whether the reduced litter size of Plag1KO females could depend on impaired
152 ovarian function, we carried out superovulation experiments and dissected the ovaries for
153 histological evaluation. Oocyte yield and ovarian weights did not differ between WT and
154 Plag1KO females (Fig. 3a,b). The general histology of Plag1KO ovaries was similar to WT:
155 the number of follicles in different stages and corpora lutea did not differ between the
156 genotypes when normalized by cross section surface area (Fig. 3c,d). The Plag1KO uteri
157 were significantly smaller than WT uteri, but all normal uterine structures were present (Fig.
158 3e,f). When the weights were corrected for body weight the significance between genotypes
159 was lost (p=0.07 by Student’s t-test). To investigate possible differences at the transcriptomic
160 level, we performed RNA-seq from a piece of uterine horn, plus mucosa scraped from the
161 endometrial surface, and from outer muscle layer myometrium (Fig. S1). Principal
162 component analysis separated the samples by tissue type (uterus, mucosa, myometrium) but
163 not by genotype (Fig. 3g). Taken together, these data show that Plag1KO and WT ovaries
164 and uteri were not significantly different.
165 Maternal Plag1 deficiency delays 2-cell stage embryo development
166 We used embryos derived from Plag1KO females crossed with WT males to study the effect
167 of maternal Plag1 deficiency on embryonic development. These breeding pairs produced
168 Plag1HET embryos that lack the maternal Plag1 allele, and will hereafter be referred to as
169 matPlag1KO embryos. We observed preimplantation development of WT and matPlag1KO
170 embryos by time-lapse microscopy from zygote up to late morula or early blastocyst stages
171 (Movie 1 and 2). We recorded the time each embryo spent in different developmental stages,
172 and discovered that matPlag1KO embryos spent significantly longer time at the 2c stage
173 compared to WT embryos (Fig. 4). The time spent at the zygote, 2c and 4c stages was
174 34.1±0.5, 26.1±0.4 and 14.8±0.3 h (mean±SEM) in the matPlag1KO embryos compared to
175 33.1±0.3, 23.1±0.6 and 14.3±0.4 h in the WT, respectively. When 50% of WT embryos had
176 processed to the 4c stage, all matPlag1KO embryos were still arrested at stage 2c (Fig 4,
177 middle panel). When the matPlag1KO embryos proceeded beyond the 2c stage, normal
8 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
178 developmental pace was regained and the time spent at the 4c stage was not significantly
179 different from WT (Fig. 4). The survival of the embryos (zygote to morula) did not differ
180 between genotypes (73% in matPlag1KO, 60% in WT). In conclusion, maternal Plag1
181 deficiency caused a delay in 2c stage embryo development without compromising embryo
182 viability or further preimplantation development.
183 Two-cell stage matPlag1KO embryos exhibit delayed transcriptional program and
184 express Plag1 from paternal allele
185 In order to better understand the effect of maternal Plag1 deficiency on embryo development,
186 we collected MII oocytes (21−23 h post chorionic gonadotropin treatment), 2c embryos
187 (45−47 h post fertilization, hpf) and 8c embryos (71−73 hpf) for single embryo RNA-seq.
188 Altogether, we sequenced and analyzed 45 WT and 45 matPlag1KO oocytes and embryos
189 collected at three separate time points (Table S1). We used spike-in RNA for data
190 normalization to correct for the large general changes in cellular RNA content during these
191 developmental stages (Tohonen et al., 2015). The size of the spike-in reference RNA library
192 recovered by RNA-seq did not vary between different genotypes or developmental stages,
193 showing the robustness of the normalization strategy (Fig. S2). In contrast, the spike-in
194 normalized embryo RNA-seq libraries showed a dramatic drop in embryonic poly(A) mRNA
195 content from oocyte to 2c stage and a subsequent increase during 2c−8c transition (Fig. 5a).
196 There were no significant differences in general mRNA amount between matPlag1KO and
197 WT embryos at any stage (Fig. 5a). Analysis of differentially expressed genes (DEGs)
198 showed that the majority of DEGs were downregulated in both genotypes from oocyte to 2c
199 stage when maternal RNA degradation is known to take place. In that transition, there were
200 6,832 and 6,044 DEGs downregulated in WT and matPlag1KO compared to 2,910 and 2,851
201 being upregulated, respectively (Fig. 5b). More DEGs were upregulated in the 2c−8c
202 transition, 3,636 in WT and 3,586 in matPlag1KO, when the zygotic genome becomes fully
203 active (Fig. 5b).
9 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
204 We then compared WT and matPlag1KO embryos, and found that they differed most at the
205 2c stage. The 2c matPlag1KO embryos had 530 downregulated and 559 upregulated DEGs
206 compared to 2c WT embryos (Fig. 5c). A heat map clustered the embryos into three primary
207 groups by developmental stage (MII oocyte, 2c, 8c) but at the 2c stage, the embryos also
208 clustered by genotype (Fig. 5d). Principal component analysis likewise separated the
209 embryos primarily by developmental stage, but also by genotype at the 2c stage (Fig. 5e).
210 Finally, cell trajectory (pseudotime) analysis yielded a similar pattern, and further showed
211 that the development of the matPlag1KO embryos transcriptionally lagged behind WT at the
212 2c stage (Fig. 5f).
213 Next, we studied the expression pattern of the DEGs in matPlag1KO 2c stage embryos by
214 plotting their mean expression levels from oocyte to 8c stage. We found that the DEGs
215 upregulated in matPlag1KO embryos as compared to WT (n=559) were actually maternally
216 loaded transcripts whose degradation was delayed in the matPlag1KO embryos as
217 compared to WT embryos (Fig. 6a). Similarly, the DEGs downregulated in matPlag1KO
218 embryos at the 2c stage (n=530) were in fact genes whose upregulation was delayed in the
219 matPlag1KO embryos as compared to WT (Fig. 6a). We hereafter refer to these two sets of
220 genes as “delayed-degradation” and “delayed-activation” genes. Interestingly, the expression
221 level of the delayed-activation and delayed-degradation genes in matPlag1KO reached WT
222 levels by the 8c stage, suggesting that the transcriptional dysregulation caused by the lack of
223 maternal Plag1 allele was temporary (Fig. 6a).
224 We next considered the expression profile of Plag1 itself in the embryos. Plag1 transcripts
225 were present in WT oocytes, showing that Plag1 is a maternally loaded factor (Fig. 6b). We
226 also stained ovarian tissue sections and observed positive staining in secondary follicles
227 within the oocyte nucleoli, confirming expression in growing oocytes (Fig. S3). In WT
228 embryos, the maternally loaded Plag1 transcripts were degraded by the 2c stage, and there
229 was no further expression at the 8c stage, suggesting that Plag1 is not normally expressed
230 from the zygotic genome this early (Fig. 6c). A similar expression pattern was confirmed in
10 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
231 two independent embryo RNA-seq datasets in human, where PLAG1 transcripts were
232 present from oocyte to 4c stage, when ZGA takes place, and then downregulated (Fig. S4).
233 As expected, Plag1 transcripts were not detected in matPlag1KO oocytes. Surprisingly, they
234 were present in the 2c stage matPlag1KO embryos (Fig. 6b). As there is no maternal Plag1
235 allele in these embryos, the expression must stem from the paternal allele. Expression of
236 paternal Plag1 returned to baseline by the 8c stage (Fig. 6b). The expression levels of the
237 other Plag1 family members Plagl1 and Plagl2 were unaffected in matPlag1KO embryos as
238 compared with WT (Fig. S5).
239 We next carried out gene ontology (GO) analysis on the delayed-activation and delayed-
240 degradation gene sets to understand their functions. The delayed-activation genes clustered
241 into categories relevant for ribosome biogenesis, RNA processing, and translation (Fig. 6c,
242 File S2). The delayed-degradation genes showed less clear clustering. The largest clusters
243 represented diverse GO categories, such as neurogenesis, cell communication,
244 developmental and morphogenesis events (Fig 6c, File S2). We then compared these GO
245 categories to those associated with normal mouse embryo development by GO-annotating
246 the genes significantly up- and downregulated during 2c−8c transition in WT (File S3). The
247 delayed-activation gene GOs showed higher semantic similarity with genes typically
248 upregulated during 2c−8c transition in the WT [best match average (BMA) 0.792] than with
249 those downregulated (BMA 0.531), while the delayed-degradation gene GOs showed higher
250 similarity to genes normally downregulated between 2c−8c stages (BMA 0.714) than to those
251 upregulated (BMA 0.445). These data confirmed that embryos that lack the maternal Plag1
252 allele were characterized by delayed regulation of both maternal mRNA degradation and
253 ZGA compared to WT embryos.
254 Mouse delayed-activation genes overlap with human ZGA genes and show enrichment
255 for the PLAG1 de novo motif
11 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
256 In order to compare our earlier human embryo RNA-seq dataset (Tohonen et al., 2015) to
257 our current mouse data, we converted the mouse and human genes to orthologous gene as
258 well as to proteins, and used both gene and protein family-based approaches in subsequent
259 comparisons. First, we compared gene expression changes during major ZGA between
260 humans (4c−8c transition) and mice (2c−8c transition) in general using gene set analysis,
261 and found highly significant overall similarities (Fig. 7a). We then compared the human ZGA
262 genes with the genes expressed in mouse embryos at the 2c stage, and discovered that they
263 were significantly enriched among the delayed-activation genes (Fig. 7b). We obtained the
264 same result when using the more conservative χ2 test that is based on significant DEGs only
265 (whereas gene set analyses are based on ranked expression of all detected genes) (Fig. 7c).
266 DEGs upregulated during human major ZGA showed a significant overlap with the
267 matPlag1KO delayed-activation genes but not with delayed-degradation genes (Fig. 7c).
268 Comparison of protein families yielded similar results (Fig. 7d).
269 We next studied the promoters of the delayed-activation genes. When enrichment of the
270 database PLAG1 binding site (MA0163.1 in JASPAR) was studied, no significant differences
271 were found between delayed-activation, delayed-degradation and not regulated genes (Fig.
272 7e). Interestingly, when we used our PLAG1 de novo motif instead, significant enrichment
273 was discovered in promoters of delayed-activation genes as compared to genes whose
274 regulation was not affected by maternal lack of Plag1 (Fig. 7e). Altogether, 132 delayed-
275 activation genes had at least one PLAG1 de novo motif within −2,000 to +500 bp of their
276 transcription start site (File S4). We then compared these mouse genes with those 94 human
277 ZGA genes that contained a PLAG1 de novo motif (Fig 1b, File S1). Gene set analysis
278 revealed that there was a significant overlap between the mouse and human genes,
279 suggesting that ZGA genes containing a PLAG1 de novo motif are conserved between mice
280 and humans (Fig 7f).
281 We finally wanted to know the functions of these conserved mouse and human genes. We
282 GO-annotated them and compared the semantic similarity between the top 100 GO
12 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
283 categories associated to each gene set. Hierarchical clustering revealed that human ZGA
284 and mouse delayed-activation genes clustered together to categories representing
285 ribosomes, protein transport, translation, RNA processing, and protein metabolism (Fig 7g,
286 File S5). Conversely, the GOs associated to mouse delayed-degradation genes clustered to
287 groups representing morphogenesis, p53 signalling, and immune responses and showed
288 little overlap with the human ZGA clusters (Fig 7g, File S5).
289 Collectively, our findings show that there is a conserved sets of genes harboring PLAG1
290 binding motifs in their promoters that are activated during ZGA in mouse and human. Our
291 data further suggests that these genes play roles in central cellular processes that relate to
292 ribosomes, RNA and protein metabolism.
293
294 Discussion
295 In the present study, we have discovered a new role for oncogene PLAG1 as a regulator of
296 ZGA. We identified a motif containing a PLAG1 binding site among the promoters
297 upregulated during ZGA in human embryos, and showed that the lack of Plag1 leads to an
298 incomplete ZGA in mice on a transcriptional level with consequences for the timing of
299 cleavage stage development.
300 Our studies on Plag1KO mice confirmed many of the earlier observations, including growth
301 restriction and subfertility (Hensen et al., 2004). Plag1KO females produced small litters
302 regardless of the paternal genotype, implying that this phenotype was caused by the lack of
303 Plag1 allele in the mother. Still, we could not find any significant differences in histology or
304 function of reproductive organs between Plag1KO and WT females. Their ovaries contained
305 normal numbers of follicles at all developmental stages, an equivalent number of oocytes
306 were ovulated, and their litter frequency was not affected. Plag1KO uteri had normal
307 histology and did not transcriptionally differ from WT. We note that the Plag1KO uteri
308 remained slightly smaller than WT even after body weight adjustment, which could perhaps
13 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
309 impose spatial constrains to implantation. However, “uterine crowding” has not been found to
310 affect fetal viability in rabbits or mice (Argente et al., 2017; Bruce and Wellstead, 1992),
311 suggesting that smaller uterine size does not necessarily reduce implantation rate or
312 compromise post-implantation survival. We conclude that Plag1 is not required for normal
313 ovarian or uterine function. This suggests that embryos derived from Plag1KO females,
314 regardless of paternal genotype, do not have the same developmental potential as WT
315 embryos do, and are even different from HET embryos that have inherited the mutant allele
316 from the father. In order to focus on the maternal effect of Plag1, we compared Plag1KO
317 oocytes fertilized by WT studs (matPlag1KO embryos) with WT embryos.
318 We discovered that matPlag1KO embryos spent significantly more time in the 2c stage
319 compared to WT embryos. Interestingly, major ZGA takes place in the 2c stage of mouse
320 development, and failure to activate the zygotic genome leads to a developmental arrest. For
321 example, knockdown of the pluripotency factor Lin28 in mouse embryos leads to arrest at the
322 2c−4c stage (Vogt et al., 2012), and KO of the maternal-effect gene Mater to arrested
323 embryonic development at the 2c stage (Tong et al., 2000). ZGA is accomplished with the
324 help of maternally loaded factors, i.e., transcripts and proteins that are deposited into the
325 oocyte. We showed that Plag1 was a maternally loaded factor that could be detected in
326 nucleoli of oocytes within growing follicles by staining, and in mature mouse and human
327 oocytes by RNA-seq. MatPlag1KO embryos lacked maternally loaded Plag1 and were
328 delayed at 2c stage development suggesting that Plag1 is required for normal cleavage-
329 stage embryo development.
330 Despite lacking maternal Plag1, matPlag1KO embryos developed to the blastocyst stage
331 with equal efficiency as WT embryos. Interestingly, matPlag1KO embryos started expressing
332 Plag1 from the paternal allele, which WT embryos did not do, and this may have rescued
333 their development. Although these data argue that Plag1 is essential for embryo
334 development, we noted that even homozygous Plag1KO intercrosses occasionally produced
335 litters. We hypothesize that maternally loaded Plagl2, the Plag1 family member with
14 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
336 redundant activity to Plag1, might rescue embryo development in some cases. Testing the
337 hypothesis would require generation of Plag1/PlagL2 double KO mice, which is impossible
338 due to the severe phenotype of PlagL2 mice. Plagl2KO pups die shortly after birth to
339 starvation due to their inability to absorb chylomicrons (Van Dyck et al., 2007).
340 Although matPlag1KO embryos regained normal developmental pace after exiting the 2c
341 stage, their developmental success was not the same as that of WT embryos. Interestingly,
342 human embryos can be scored after in vitro fertilization through morphokinetic
343 measurements, and the time an embryo spends in the 2c and 4c stages is a significant
344 determinant of developmental potential (Wong et al., 2010). The disrupted regulation of over
345 1,000 genes at the 2c stage in matPlag1KO embryos, when not only ZGA but also epigenetic
346 changes take place, could have permanent effects on the competence of the blastocysts.
347 Moreover, synchronous preparation of both the embryo and endometrium for implantation is
348 a prerequisite for successful implantation during the window of receptivity. Therefore, a
349 simple delay in preimplantation development could lead to some blastocysts missing this
350 critical window (Wang and Dey, 2006).
351 The PLAG1 de novo motif in humans frequently localized within Alu elements in the
352 promoters of the ZGA genes. Alu elements are transposable elements ubiquitously present
353 in primate genomes with involvement in gene regulation by various mechanisms (Elbarbary
354 et al., 2016). The counterparts of Alu elements in the mouse genome are B1 elements
355 (Labuda et al., 1991). Mouse B1 and primate Alu have split from the common ancestor
356 7SLRNA over 80 million years ago and have evolved and retrotransposed independently
357 ever since (Ullu and Tschudi, 1984). Despite independent evolution, the densities of these
358 elements in promoters of orthologous genes in humans and mice are surprisingly correlated
359 (Mouse Genome Sequencing et al., 2002; Polak and Domany, 2006), and genes containing
360 Alu and B1 elements belong to highly correlated functional categories (Tsirigos and
361 Rigoutsos, 2009). Both Alu and B1 elements are highly enriched in promoters that are
362 activated during ZGA, and the number of copies in the promoter correlates with the level of
15 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
363 gene upregulation (Ge, 2017). It is interesting that the component of the Alu element
364 containing the PLAG1 binding site was well-conserved in the mouse B1 element.
365 Collectively, these findings suggested that the SINEs Alu and B1 play a role in ZGA in
366 humans and mice by attracting transcription factors such as PLAG1 to gene promoters. Our
367 data further showed that the database PLAG1 binding site alone did not show any
368 enrichment in the promoters of delayed-activation genes. This suggested that PLAG1 works
369 in concert with other transcription factors and co-regulators and therefore the short binding
370 site does not accurately reflect the DNA sequence required for PLAG1-stimulated gene
371 expression. This is a largely unexplored area that should be a focus of follow-up studies.
372 Functional annotation categories associated to delayed-activation genes in the matPlag1KO
373 embryos were mainly related to ribosome biogenesis, maturation and function. Even when
374 we restricted the analysis to only those genes containing a PLAG1 de novo motifs, GO-
375 analyses suggested roles in ribosome biogenesis, RNA and protein metabolism. Importantly,
376 human ZGA genes with PLAG1 de novo motif had a very similar GO profile. Alu elements
377 have been suggested to function as enhancer elements that regulate gene transcription (Su
378 et al., 2014), and Alu elements are enriched in promoters associated to ribosome biogenesis,
379 protein biosynthesis and RNA metabolism (Polak and Domany, 2006) as well as to ZGA (Ge,
380 2017). It is a plausible hypothesis that Alu elements in the promoters of ribosomal function
381 genes attract PLAG1 that then regulates the associated gene. Many oncogenes have effects
382 on ribosome biogenesis, which enables cancerous cells to increase protein synthesis and
383 grow rapidly (Pelletier et al., 2018). Although PLAG1 is an oncogene, its potential role in
384 ribosome biogenesis and protein synthesis has not been evaluated. Interestingly, one of the
385 most striking phenotypes of the Plag1KO mice is their small size (Hensen et al., 2004). In
386 addition, PLAG1 polymorphisms associate with body size and growth in farm animals and
387 humans (Fink et al., 2017; Rubin et al., 2012; Utsunomiya et al., 2013; Zhang et al., 2016).
388 Based on our data, we hypothesize that PLAG1/Plag1 associated growth effects could result
389 from modulation of protein synthesis that affects cell size and division rate.
16 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
390 We conclude that the lack of maternally loaded Plag1 leads to delayed ZGA and 2c stage
391 development, and reduced embryo competence for implantation. The effect on ZGA genes
392 arose through retrotransposition of Alu and B1 elements that harbour conserved PLAG1
393 binding sites. The genes affected by the lack of PLAG1 have roles in ribosome biogenesis,
394 RNA and protein metabolism. Follow-up studies should focus on Alu and B1 elements in
395 preimplantation development and a deeper analysis of PLAG1 involvement in protein
396 synthesis, as this is a mechanism that would explain many of the reported biological activities
397 of PLAG1, including tumorigenicity, cell proliferation, and growth.
398
17 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
399 Materials and Methods
400 Mouse colony and handling
401 All experiments were approved by the Swedish Board of Agriculture (#S5-14) and performed
402 in accordance to the ethical licence. Plag1KO mice (Hensen et al., 2004), backcrossed to the
403 CD1 strain, were a kind gift from Prof. Wim Van de Ven (University of Leuven, Belgium) and
404 Dr. Carol Schuurmans (University of Calgary, Canada). The local colony at Karolinska
405 Institutet was established through embryo transfers. HET animals were kept in continuous
406 breeding, and the litters were earmarked, weaned and weighed at the age of 3−4 weeks. Ear
407 punches were used for genotyping with primers Plag1MT_F (5'-
408 CAGTTCCCAGGTGTCCAACAAG-3'), Plag1MT_R (5'-AATGTGAGCGAGTAACAACCCG-
409 3'), Plag1WT_F (5'- CGGAAAGACCATCTGAAGAATCAC -3'), and Plag1WT_R (5'-
410 CGTTCGCAGTGCTCACATTG -3'). The animals were housed in standard conditions
411 (19−21°C, 55% humidity, lights 6:00 am−6:00 pm) with free access to feed (irradiated Global
412 18% diet 2918; Envigo, Huntingdon, UK) and tap water. In all experiments, animals were
413 sacrificed by cervical dislocation.
414 Superovulation and embryo imaging
415 Sexually mature 1-3 months old Plag1KO and WT females were superovulated by i.p.
416 injections of 5 IU pregnant mare serum (Folligon; Intervet, Dublin, Ireland), followed two days
417 later by 5 IU human chorionic gonadotropin (hCG) (Chorulon; Intervet, Dublin, Ireland), and
418 mated with trained WT studs. MII oocytes (no mating) and zygotes were collected from
419 oviducts the following morning and their numbers recorded. Cumulus cells were removed
420 with hyaluronidase (0.3 mg/ml, Sigma-Aldrich, St.Louis, MO, USA) treatment. For imaging,
421 zygotes were placed 2−4 per well into Primo Vision embryo culture dishes (Vitrolife,
422 Göteborg, Sweden) under a Nikon Ti-E spinning disk wide-field microscope with a live-cell
423 imaging incubator. The microscope was programmed to take bright-field images every 30
424 min with an Andor EM-CCD camera using the 20× objective for a total of 90 h 30 min. The
425 imaging was repeated three times with embryos from a total of 5 Plag1KO (providing 103
18 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
426 embryos) and 6 WT (providing 89 embryos) females with both genotypes present at every
427 session. Cleavage events until the 8c stage were scored manually from the images by a
428 researcher blind to the genotypes.
429 Histological assessment of ovaries and uteri
430 Ovaries and uteri were collected during superovulation experiments, their weights recorded,
431 and tissues stored in 4% (w/v) paraformaldehyde. Fixative was changed to 70% ethanol,
432 tissue embedded in paraffin and processed to 4-µm hematoxylin-eosin stained sections, and
433 the slides were digitalized with Mirax Slide Scanner (Zeiss, Göttingen, Germany). Transverse
434 sections from the middle of the uterine horn were used for histological examination. One
435 section from the middle of the ovary was used for assessment of follicles in different
436 developmental categories (preantral, antral, atretic and corpora lutea) and their number
437 adjusted for the ovary surface area using Pannoramic Viewer software (3DHistech,
438 Budapest, Hungary). Altogether, 22 Plag1KO and 13 WT animals were used for ovary
439 assessment, and 8 Plag1KO and 8 WT for uterus histology.
440 X-gal staining
441 Ovaries were collected from 3 Plag1KO and 3 WT females, aged 5 months, fixed,
442 cryosectioned and subjected to X-gal staining and Nuclear Fast Red counterstaining as
443 described before (Juma et al., 2017).
444 Implantation study
445 Plag1KO and WT females, aged 2-4 months, were mated with trained WT studs. The uteri
446 were collected 7.5−8.5 dpc and the number of implantation sites counted. The experiment
447 was carried out twice, with a total of 8 Plag1KO and 8 WT females.
448 Single-embryo RNA-seq
449 Zygotes collected from superovulated females (see “superovulation and embryo imaging”
450 above) were treated with hyaluronidase and placed in KSOM medium (EMD Millipore,
451 Nottingham, UK) under ovoil-100 (Vitrolife) in IVF 4-well plates (Sigma-Aldrich) for culture at
452 37°C under 5% CO2/95% air. Single embryos or uncultured MII oocytes were picked into 4 μl
19 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
453 library lysis buffer containing 5 mM Tris-HCl pH 8.0 (Sigma-Aldrich), 2 mM dNTP mixture
454 (ThermoFisher Scientific, Waltham, MA, USA), 10 mM DTT (Sigma-Aldrich), 0.05% Triton X-
455 100 (Sigma-Aldrich), 400 nM anchored oligo(dT) primer biotin-
456 TTAAGCAGTGGTATCAACGCAGAGTCGAC(T)29V where V is LNA nucleotide (Exiqon,
457 Vedbaek, Denmark), and 4 U RiboLock RNase inhibitor (ThermoFisher). Embryos were
458 picked 21−23 h post hCG treatment (MII oocytes, no culture), 45−47 hpf (2c stage) and
459 71−73 hpf (8c stage) on three different occasions, yielding a total of 16 zygotes, 14 2c stage
460 embryos, and 15 8c stage embryos for both matPlag1KO and WT (Table S1). Two separate
461 46-plex libraries (libA and libB) were made as described previously (Krjutskov et al., 2016)
462 with the following modifications: barcoded 10 µM template-switching oligonucleotides were
463 added prior to reverse transcriptase, ERCC spike-in Mix A was diluted 15,300-fold with clean
464 water, and 1 µl was taken per library reverse transcriptase master mix. Twenty cycles of PCR
465 were used for the first round of amplification and ten additional cycles for the second round to
466 introduce Illumina-compatible universal sequences. Both libraries contained all
467 developmental stages and genotypes.
468 Uterus RNA-seq
469 Uterine horns from 8 Plag1KO and 8 WT females used in the superovulation experiments
470 were collected in RNAlater (Ambion, Foster City, CA, USA). At dissection, uterine horns were
471 cut in half longitudinally, and the endometrial side was gently scraped with a scalpel (Fig.
472 S1). RNA was extracted with the RNeasy mini kit (Qiagen, Hilden, Germany) and quality
473 measured with the Agilent 2100 BioAnalyzer (Agilent Technologies, Santa Clara, CA, USA).
474 Ten nanograms of high-quality RNA (RIN>8) was used for RNA-seq. The RNA-seq was
475 performed according to the modified STRT protocol (Krjutskov et al., 2016).
476 RNA-seq data analysis
477 The analysis was performed as described previously (Krjutskov et al., 2016). In short, the
478 reads were filtered, samples de-multiplexed, UMI-s joined, reads trimmed and mapped to the
479 reference mouse genome mm9 by TopHat (Kim et al., 2013). The resulting bam files were
480 converted to tag directories employing Homer (Heinz et al., 2010) and were subsequently 20 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
481 used to estimate the reads in all annotated genes. Annotations were in GTF file format
482 retrieved from UCSC and were concatenated to a GTF file with the ERCC annotations. Gene
483 counts were then imported to R (R Development Core Team, 2010) and libraries with a
484 median gene expression of log2 counts per million (cpm) under 0 were excluded from further
485 analysis. Cell libraries, excluding the ERCC spike-in counts, were normalized with EdgeR
486 (Robinson et al., 2010) using the TMM normalization method. The ERCC counts were used
487 for normalization between the various embryonic cell stages by scaling the library sizes.
488 EdgeR was also employed for the subsequent differential gene expression analysis, which
489 was performed on genes that had 1 cpm in at least five or more samples and the rest of the
490 genes were filtered out. After removing the low expressing/unexpressed genes, the gene
491 counts were renormalized.
492 Principal component analysis was performed in R by using the genes that were significant in
493 any of the comparisons between the genotypes. Heatmaps were plotted on TMM normalized
494 counts exported from EdgeR and gene expression was standardized across all samples
495 (mean=0 and sd=1). Samples and genes were clustered using hierarchical clustering in R
496 and plotted employing the ComplexHeatmap library (Gu et al., 2016). The same gene set
497 was used for the cell trajectory (pseudotime) analysis by the monocle package (Trapnell et
498 al., 2014).
499 Gene ontology analysis was carried out in R (R Development Core Team, 2010) with the
500 topGO library. To identify enriched gene ontology terms, the classic algorithm and Fisher
501 statistic were used and analysis was carried out on up- and downregulated genes separately.
502 Semantic similarity between the GO terms was calculated using the Wang algorithm in
503 GOSemSim bioconductor package (Yu et al., 2010). The result is given as the best match
504 average (BMA) score that ranges from 0 to 1.
505 Gene set enrichment analysis was conducted to test whether the mouse genes homologous
506 to the human genes regulated between the 4c and 8c stage were also regulated in mouse
507 development. To calculate the p-values, the geneSetTest function from the limma package
508 (Ritchie et al., 2015) was used. The moving average of the enrichment was calculated with
21 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
509 the tricubeMovingAverage function and plotted with ggplot2. Homologene from NCBI was
510 used to convert the human genes to the homologous mouse genes. The significance in
511 overlap of the human genes with the mouse genes regulated by the Plag1KO at the 2c stage
512 was calculated with the Fisher test in R. Genes were also converted to protein families using
513 the bioconductor libraries for genome-wide annotation for human and mouse
514 (org.Hs.eg.db/org.Mm.eg.db). For genes that had more than one protein family annotated to
515 them, only one of the protein families was used in order not to inflate the number of
516 overlapping or non-overlapping families between the different gene groups.
517 Promoter analyses
518 Human embryo promoter sequence analysis was performed as previously described
519 (Tohonen et al., 2015). In brief, the de novo motif was compared with known motifs by
520 TomTom (Gupta et al., 2007). We applied MEME (Bailey and Elkan, 1994) for motif analysis
521 within the upregulated promoters, and identified similar sequences with the PLAG1 motif
522 (MA0163.1 in JASPAR (Meng et al., 2005; Sandelin et al., 2004)) by MAST (Bailey and
523 Gribskov, 1998). The location of Alu elements within the promoters was based on the
524 RepeatMasker track in the UCSC Genome Browser.
525 Human and mouse SINE repetitive elements DF0000002 (AluY), DF0000051 (AluSz),
526 DF0000034 (AluJo), DF0000144 (FLAM_C), DF0000016 (7SLRNA), DF0003101 (PB1) and
527 DF0001733 (B1_Mm) were retrieved from the Dfam database (Hubley et al., 2016) and
528 aligned and highlighted according to the percent identity by JalView2.
529 Mouse embryo promoter analysis was carried out with Homer (Heinz et al., 2010). The
530 JASPAR (Sandelin et al., 2004) PLAG1 motif (MA0163.1) (Meng et al., 2005) and the de
531 novo motifs (Tohonen et al., 2015) were used, and all annotated transcriptional start sites
532 were scanned for the presence of the motifs from 2,000 bp upstream to 500 bp downstream
533 of the transcriptional start site. Enrichment was analysed using Fischer’s exact test.
534 Statistical analysis
22 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
535 Continuous data were analyzed with Student’s t-test or one-way ANOVA (one categorical
536 predictor) or two-way ANOVA (two categorical predictors) followed by Fischer LSD post hoc
537 testing when necessary. Normality was tested with the Shaphiro Wilks test and
538 homoscedasticity with Levene’s test. Categorical data were analyzed with the χ2 test. The
539 exit time of embryos from each developmental stage was plotted as an empirical distribution
540 function (ecdf) using ggplot2. To test the significance between the exit times of the WT and
541 matPlag1KO embryos, the Kolmogorov-Smirnov test was used. All analyses were carried out
542 using R (R Development Core Team, 2010). All p-values are two-tailed and were considered
543 significant when p<0.05.
544 Data Availability
545 RNA-seq data have been deposited to Gene Expression Omnibus data repository as a
546 SuperSeries record under the reference GSE111040.
547
548 Acknowledgements and funding
549 Prof Wim Van de Ven is particularly acknowledged for giving us access to the Plag1KO
550 animal model, and Dr Schuurmans for sending the mice to us. Stanisław Wawrzyczek is
551 thanked for assistance with X-gal staining and Ingegerd Fransson for mouse genotyping and
552 library preparation. This study was performed in part at the Preclinical Laboratory (PKL)
553 animal facility and the Live Cell Imaging unit, Department of Biosciences and Nutrition,
554 Karolinska Institutet, Sweden, supported by grants from the Knut and Alice Wallenberg
555 Foundation, the Swedish Research Council, the Centre for Innovative Medicine and the
556 Jonasson donation to the School of Technology and Health, Royal Institute of Technology,
557 Sweden. The morphological phenotype analysis (FENO) core facility and Tarja Schröder are
558 thanked for the preparation of the histological specimens. The computations were performed
559 on resources provided by SNIC through Uppsala Multidisciplinary Center for Advanced
560 Computational Science (UPPMAX) under projects b2010037 and snic2017-7-317. The study
561 was supported by Knut and Alice Wallenberg Foundation (KAW2015.0096) and by the
23 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
562 Distinguished Professor Award by Karolinska Institutet to JK. Essential funding was also
563 provided by Jane & Aatos Erkko Foundation to PD.
564
24 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
565 Figure legends
566
567 Figure 1. A de novo DNA motif that harbors a known PLAG1 binding site is found in
568 the promoters of genes upregulated during ZGA in humans.
569 a) Comparison of the JASPAR PLAG1 binding motif (MA0163.1, top) and the PLAG1 de
570 novo motif (bottom). b) Location of source sites for the PLAG1 de novo motifs (red dots) and
571 similar sites (yellow dots) in the promoters of human major ZGA genes. Promoters of the 93
572 genes containing the motifs are stacked and aligned ranging from 2,000 upstream bases to
573 500 downstream bases around transcription start site (TSS, dashed line). The genes are
574 listed in File S1. Locations of AluJ, AluS and AluY elements are highlighted in grey. The
575 curve on top of the figure illustrates the density of the PLAG1 de novo motif along the
576 promoter. c) Comparisons of our earlier ZGA de novo motif (i) (Tohonen et al., 2015), the
577 current PLAG1 de novo motif (ii), known PDR-like transcription factor binding site (iii) (Jolma
578 et al., 2013) and PLAG1 JASPAR (iv) (Meng et al., 2005) and consensus (v) (Voz et al.,
579 2000) binding sites with human and mouse short interspersed nuclear elements (SINE). The
580 consensus motif (v) is described by IUPAC notation; R is purine (G or A), and K is keto (G or
581 T). The internal RNA polymerase III promoter, A-box, is indicated. The de novo motifs and
582 binding motifs are reverse-complemented. Sequences of AluY, AluSz, AluJo, FLAM_C,
583 7SLRNA, PB1 and B1_Mm were extracted from the Dfam database of repetitive DNA
584 elements. ZGA, zygotic genome activation; TF, transcription factor
585
586 Figure 2. Plag1 deficiency affects body weight, litter size, and breeding success.
587 a) Observed and expected numbers of pup genotypes in litters from heterozygous
588 intercrosses. b) Body weights of female and male pups at weaning. c) Mean number of pups
589 in litters from different parent genotypes. d) Mean number of implanted embryos on 7.5−8.5
25 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
590 dpc, and e) representative photos of uteri at dissection. Asterisks indicate implanted
591 embryos. Scale bars are 1 cm. f) Frequency of litters from breeding pairs of different
592 genotypes maintained in continuous breeding for 90 days. Every breeding pair is represented
593 with a horizontal line and the birth of a litter is illustrated with a circle. Quantification of the
594 data is shown on the right. The data are presented as means + SEM and the number of
595 observations is shown inside the columns. Statistical analysis by χ2 test (a) or one-way
596 ANOVA followed by Fisher LSD post hoc test (b-f). * p<0.05, ** p<0.01, *** p<0.001. F,
597 female; HET, heterozygous; KO, knockout; M, male; WT, wild type.
598
599 Figure 3. Plag1 deficiency does not have significant effects on ovaries and uterus.
600 a) Number of oocytes collected after superovulation presented as violin plots. The median is
601 marked as a red line. b) Absolute and relative (body weight-adjusted) weights of ovaries,
602 means + SEM. c) Representative images of hematoxylin-eosin stained ovarian cross
603 sections from WT and Plag1KO mice. The scale bar is 1 mm. d) Numbers of preantral, antral
604 and atretic follicles as well as corpora lutea (CL) per mm2 of ovary. Data are presented as
605 median (red line), interquartile ranges (box), and non-outlier ranges (whisker). Outliers are
606 depicted as dots. e) Representative photos and hematoxylin-eosin stained histological
607 images of WT and Plag1KO uteri. The scale bar is 500 µm. f) Absolute and relative (body
608 weight-adjusted) uterine weights presented as means + SEM (n=8). g) Principal component
609 analysis of global gene expression in WT and Plag1KO uterus samples divided into “uterus”
610 (a piece of uterine horn) (n=8), “mucosa” (cells scraped from the endometrial surface of the
611 uterine horn) (n=8), and “myometrium” (tissue left after mucosa has been removed) (n=8).
612 Statistical analysis by t-test (a, b, f) or two-way ANOVA (d), ***p<0.001. CL, corpus luteum;
613 KO, knockout; PC, principal component; WT, wild type.
614
26 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
615 Figure 4: MatPlag1KO embryos spend significantly more time in the 2c stage
616 compared to WT embryos.
617 The time spent in each developmental stage for each embryo (exit time) was plotted as an
618 empirical distribution function. Plots show the cumulative proportion of embryos (y-axis, 0−1)
619 that have exited the corresponding developmental stage at each time point (x-axis, hours).
620 To test the significance between the exit times of the WT (n=53) and matPlag1KO (n=75)
621 embryos, the Kolmogorov-Smirnov test was used.
622
623 Figure 5. Single-embryo RNA-seq distinguishes matPlag1KO embryos from WT
624 embryos at the 2c stage.
625 a) Size of the spike-in normalized RNA-seq libraries by genotype and developmental stage.
626 The line represents the median, the box first and third quartiles, and the whisker non-outlier
627 range. Outliers are depicted as dots and defined as data points >1.5 times interquartile range
628 outside range. One sequencing reaction equals one embryo, and the number of embryos
629 (biological replicates) used for sequencing is given in the figure. b) Number of differentially
630 expressed genes from one developmental stage to another (horizontal arrows) in
631 matPlag1KO and WT embryos. c) Number of differentially expressed genes between
632 matPlag1KO and WT embryos at MII oocyte, 2c and 8c stage. d) Heat map, e) principal
633 component analysis, and f) pseudotime (cell trajectory) analysis based on all differentially
634 expressed genes in the libraries. 2c, 2-cell stage; 8c, 8-cell stage; DEG, differentially
635 expressed gene; KO, knockout; PC, principal component; WT, wild type.
636
637 Figure 6. MatPlag1KO embryos have a delayed transcriptional program, start
638 expressing Plag1 from the paternal allele at the 2c stage, and catch up with WT
639 embryos by the 8c stage.
27 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
640 a) Mean normalized expression (Z) pattern of the 559 genes that are significantly
641 upregulated (“delayed-degradation”) and the 503 that are significantly downregulated
642 (“delayed-activation”) in matPlag1KO embryos compared to WT embryos at the 2c stage. b)
643 Normalized expression of Plag1 in the embryos at MII oocyte, 2c and 8c stages. c) Heat
644 maps displaying semantic similarity among the top 150 significantly enriched GO terms
645 associated to the delayed-degradation (left) and delayed-activation (right) genes. Five largest
646 clusters are depicted (cl1−cl5), and common denominators among GO terms belonging to
647 these clusters are shown. The full GO lists are provided in File S2. 2c, 2-cell stage; 8c, 8-cell
648 stage; CPM, counts per million; GO, gene ontology; KO knockout; WT, wildtype.
649
650 Figure 7. MatPlag1KO delayed-activation genes show a significant overlap with human
651 ZGA genes as well as enrichment of the PLAG1 de novo motif in their promoters.
652 a) Gene set enrichment analysis comparing gene expression changes from mouse 2c−8c
653 stage (x-axis) with human 4c−8c gene expression changes (y-axis). b) Gene set enrichment
654 analysis comparing genes that are expressed in mouse embryos (WT and matPlag1KO) at
655 the 2c stage (x-axis) with human 4c−8c gene expression changes (y-axis). Black vertical
656 lines shows the location of human genes among the ranked mouse genes, and curve depicts
657 the enrichment. Red dotted line indicates “no enrichment” level. c) Venn diagram showing
658 the overlap between matPlag1KO delayed-activation and delayed-degradation genes with
659 the human major ZGA genes. d) Venn diagram showing the same overlaps based on protein
660 families instead of genes. e) Presence of PLAG1 database binding sites and PLAG1 de novo
661 motifs in the promoters of genes expressed in mouse 2c stage embryos. Total number of
662 genes in different categories as well as number of unique genes with the motif are shown.
663 Enrichment over not affected genes was analyzed with Fischer’s exact test. ***p<0.001. f)
664 Gene set enrichment analysis as in b) but only displaying those human and mouse genes
665 with a PLAG1 de novo motif in their promoters. g) t-SNE plot demonstrating similarity among
28 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
666 the top 100 GO categories associated to human ZGA, mouse delayed-activation and mouse
667 delayed-degradation genes that contain PLAG1 de novo motifs. Eight largest clusters with
668 the most common words within the clusters are shown. 2c, 2-cell stage; 4c, 4-cell stage; 8c,
669 8-cell stage; KO, knockout; OR, odds ratio; WT, wild type.
670
671 Movie 1: Time-lapse video of WT embryo development
Movie 1.avi 672
673 A representative movie of WT mouse embryo development from zygote to blastocyst. The
674 video time line is indicated (0-89 h) as well as the time spent in 2c stage.
675
676 Movie 2: Time-lapse video of matPlag1KO embryo development
Movie 2.avi 677
678 A representative movie of matPlag1KO mouse embryo development from zygote to
679 blastocyst. The video time line is indicated (0-89 h) as well as the time spent in 2c stage.
680
29 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
681 References
682 Argente, M. J., Calle, E. W., Garcia, M. L. and Blasco, A. (2017). Correlated response in litter size 683 components in rabbits selected for litter size variability. J Anim Breed Genet 134, 505-511. 684 Bailey, T. L. and Elkan, C. (1994). Fitting a mixture model by expectation maximization to discover 685 motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol 2, 28-36. 686 Bailey, T. L. and Gribskov, M. (1998). Combining evidence using p-values: application to sequence 687 homology searches. Bioinformatics 14, 48-54. 688 Bruce, N. W. and Wellstead, J. R. (1992). Spacing of fetuses and local competition in strains of mice 689 with large, medium and small litters. J Reprod Fertil 95, 783-789. 690 De Iaco, A., Planet, E., Coluccio, A., Verp, S., Duc, J. and Trono, D. (2017). DUX-family transcription 691 factors regulate zygotic genome activation in placental mammals. Nat Genet 49, 941-945. 692 Elbarbary, R. A., Lucas, B. A. and Maquat, L. E. (2016). Retrotransposons as regulators of gene 693 expression. Science 351, aac7247. 694 Fink, T., Tiplady, K., Lopdell, T., Johnson, T., Snell, R. G., Spelman, R. J., Davis, S. R. and Littlejohn, 695 M. D. (2017). Functional confirmation of PLAG1 as the candidate causative gene underlying 696 major pleiotropic effects on body weight and milk characteristics. Sci Rep 7, 44793. 697 Ge, S. X. (2017). Exploratory bioinformatics investigation reveals importance of "junk" DNA in early 698 embryo development. BMC Genomics 18, 200. 699 Gu, Z., Eils, R. and Schlesner, M. (2016). Complex heatmaps reveal patterns and correlations in 700 multidimensional genomic data. Bioinformatics 32, 2847-2849. 701 Gupta, S., Stamatoyannopoulos, J. A., Bailey, T. L. and Noble, W. S. (2007). Quantifying similarity 702 between motifs. Genome Biol 8, R24. 703 Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Y. C., Laslo, P., Cheng, J. X., Murre, C., Singh, H. 704 and Glass, C. K. (2010). Simple combinations of lineage-determining transcription factors 705 prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell 38, 706 576-589. 707 Hensen, K., Braem, C., Declercq, J., Van Dyck, F., Dewerchin, M., Fiette, L., Denef, C. and Van de 708 Ven, W. J. (2004). Targeted disruption of the murine Plag1 proto-oncogene causes growth 709 retardation and reduced fertility. Dev Growth Differ 46, 459-470. 710 Hensen, K., Van Valckenborgh, I. C., Kas, K., Van de Ven, W. J. and Voz, M. L. (2002). The 711 tumorigenic diversity of the three PLAG family members is associated with different DNA 712 binding capacities. Cancer Res 62, 1510-1517. 713 Hubley, R., Finn, R. D., Clements, J., Eddy, S. R., Jones, T. A., Bao, W., Smit, A. F. and Wheeler, T. J. 714 (2016). The Dfam database of repetitive DNA families. Nucleic Acids Res 44, D81-89. 715 Jolma, A., Yan, J., Whitington, T., Toivonen, J., Nitta, K. R., Rastas, P., Morgunova, E., Enge, M., 716 Taipale, M., Wei, G., et al. (2013). DNA-binding specificities of human transcription factors. 717 Cell 152, 327-339. 718 Jouhilahti, E. M., Madissoon, E., Vesterlund, L., Tohonen, V., Krjutskov, K., Plaza Reyes, A., 719 Petropoulos, S., Mansson, R., Linnarsson, S., Burglin, T., et al. (2016). The human PRD-like 720 homeobox gene LEUTX has a central role in embryo genome activation. Development 143, 721 3459-3469. 722 Jukam, D., Shariati, S. A. M. and Skotheim, J. M. (2017). Zygotic Genome Activation in Vertebrates. 723 Dev Cell 42, 316-332. 724 Juma, A. R., Damdimopoulou, P. E., Grommen, S. V., Van de Ven, W. J. and De Groef, B. (2016). 725 Emerging role of PLAG1 as a regulator of growth and reproduction. J Endocrinol 228, R45-56. 726 Juma, A. R., Grommen, S. V. H., O'Bryan, M. K., O'Connor, A. E., Merriner, D. J., Hall, N. E., Doyle, S. 727 R., Damdimopoulou, P. E., Barriga, D., Hart, A. H., et al. (2017). PLAG1 deficiency impairs 728 spermatogenesis and sperm motility in mice. Sci Rep 7, 5317.
30 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
729 Kas, K., Voz, M. L., Roijer, E., Astrom, A. K., Meyen, E., Stenman, G. and Van de Ven, W. J. (1997). 730 Promoter swapping between the genes for a novel zinc finger protein and beta-catenin in 731 pleiomorphic adenomas with t(3;8)(p21;q12) translocations. Nat Genet 15, 170-174. 732 Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R. and Salzberg, S. L. (2013). TopHat2: accurate 733 alignment of transcriptomes in the presence of insertions, deletions and gene fusions. 734 Genome Biol 14, R36. 735 Krjutskov, K., Katayama, S., Saare, M., Vera-Rodriguez, M., Lubenets, D., Samuel, K., Laisk-Podar, 736 T., Teder, H., Einarsdottir, E., Salumets, A., et al. (2016). Single-cell transcriptome analysis of 737 endometrial tissue. Hum Reprod 31, 844-853. 738 Labuda, D., Sinnett, D., Richer, C., Deragon, J. M. and Striker, G. (1991). Evolution of mouse B1 739 repeats: 7SL RNA folding pattern conserved. J Mol Evol 32, 405-414. 740 Ludwig, A., Rozhdestvensky, T. S., Kuryshev, V. Y., Schmitz, J. and Brosius, J. (2005). An unusual 741 primate locus that attracted two independent Alu insertions and facilitates their 742 transcription. J Mol Biol 350, 200-214. 743 Madissoon, E., Jouhilahti, E. M., Vesterlund, L., Tohonen, V., Krjutskov, K., Petropoulous, S., 744 Einarsdottir, E., Linnarsson, S., Lanner, F., Mansson, R., et al. (2016). Characterization and 745 target genes of nine human PRD-like homeobox domain genes expressed exclusively in early 746 embryos. Sci Rep 6, 28995. 747 Meng, X., Brodsky, M. H. and Wolfe, S. A. (2005). A bacterial one-hybrid system for determining the 748 DNA-binding specificity of transcription factors. Nat Biotechnol 23, 988-994. 749 Mouse Genome Sequencing, C., Waterston, R. H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J. F., 750 Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., et al. (2002). Initial sequencing 751 and comparative analysis of the mouse genome. Nature 420, 520-562. 752 Niakan, K. K., Han, J., Pedersen, R. A., Simon, C. and Pera, R. A. (2012). Human pre-implantation 753 embryo development. Development 139, 829-841. 754 Pelletier, J., Thomas, G. and Volarevic, S. (2018). Ribosome biogenesis in cancer: new players and 755 therapeutic avenues. Nat Rev Cancer 18, 51-63. 756 Petropoulos, S., Edsgard, D., Reinius, B., Deng, Q., Panula, S. P., Codeluppi, S., Plaza Reyes, A., 757 Linnarsson, S., Sandberg, R. and Lanner, F. (2016). Single-Cell RNA-Seq Reveals Lineage and X 758 Chromosome Dynamics in Human Preimplantation Embryos. Cell 165, 1012-1026. 759 Polak, P. and Domany, E. (2006). Alu elements contain many binding sites for transcription factors 760 and may play a role in regulation of developmental processes. BMC Genomics 7, 133. 761 R Development Core Team (2010). R: A language and environment for statistical computing. Vienna, 762 Austria: R Foundation for Statistical Computing. 763 Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W. and Smyth, G. K. (2015). limma powers 764 differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids 765 Res 43, e47. 766 Robinson, M. D., McCarthy, D. J. and Smyth, G. K. (2010). edgeR: a Bioconductor package for 767 differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140. 768 Rubin, C. J., Megens, H. J., Martinez Barrio, A., Maqbool, K., Sayyab, S., Schwochow, D., Wang, C., 769 Carlborg, O., Jern, P., Jorgensen, C. B., et al. (2012). Strong signatures of selection in the 770 domestic pig genome. Proc Natl Acad Sci U S A 109, 19529-19536. 771 Sandelin, A., Alkema, W., Engstrom, P., Wasserman, W. W. and Lenhard, B. (2004). JASPAR: an 772 open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res 773 32, D91-94. 774 Su, M., Han, D., Boyd-Kirkup, J., Yu, X. and Han, J. D. (2014). Evolution of Alu elements toward 775 enhancers. Cell Rep 7, 376-385. 776 Tohonen, V., Katayama, S., Vesterlund, L., Jouhilahti, E. M., Sheikhi, M., Madissoon, E., Filippini- 777 Cattaneo, G., Jaconi, M., Johnsson, A., Burglin, T. R., et al. (2015). Novel PRD-like 778 homeodomain transcription factors and retrotransposon elements in early human 779 development. Nat Commun 6, 8207.
31 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
780 Tong, Z. B., Gold, L., Pfeifer, K. E., Dorward, H., Lee, E., Bondy, C. A., Dean, J. and Nelson, L. M. 781 (2000). Mater, a maternal effect gene required for early embryonic development in mice. 782 Nat Genet 26, 267-268. 783 Trapnell, C., Cacchiarelli, D., Grimsby, J., Pokharel, P., Li, S., Morse, M., Lennon, N. J., Livak, K. J., 784 Mikkelsen, T. S. and Rinn, J. L. (2014). The dynamics and regulators of cell fate decisions are 785 revealed by pseudotemporal ordering of single cells. Nat Biotechnol 32, 381-386. 786 Tsirigos, A. and Rigoutsos, I. (2009). Alu and b1 repeats have been selectively retained in the 787 upstream and intronic regions of genes of specific functional classes. PLoS Comput Biol 5, 788 e1000610. 789 Ullu, E. and Tschudi, C. (1984). Alu sequences are processed 7SL RNA genes. Nature 312, 171-172. 790 Utsunomiya, Y. T., do Carmo, A. S., Carvalheiro, R., Neves, H. H., Matos, M. C., Zavarez, L. B., Perez 791 O'Brien, A. M., Solkner, J., McEwan, J. C., Cole, J. B., et al. (2013). Genome-wide association 792 study for birth weight in Nellore cattle points to previously described orthologous genes 793 affecting human and bovine height. BMC Genet 14, 52. 794 Van Dyck, F., Braem, C. V., Chen, Z., Declercq, J., Deckers, R., Kim, B. M., Ito, S., Wu, M. K., Cohen, 795 D. E., Dewerchin, M., et al. (2007). Loss of the PlagL2 transcription factor affects lacteal 796 uptake of chylomicrons. Cell Metab 6, 406-413. 797 Wang, H. and Dey, S. K. (2006). Roadmap to embryo implantation: clues from mouse models. Nat 798 Rev Genet 7, 185-199. 799 Vassena, R., Boue, S., Gonzalez-Roca, E., Aran, B., Auer, H., Veiga, A. and Izpisua Belmonte, J. C. 800 (2011). Waves of early transcriptional activation and pluripotency program initiation during 801 human preimplantation development. Development 138, 3699-3709. 802 Vogt, E. J., Meglicki, M., Hartung, K. I., Borsuk, E. and Behr, R. (2012). Importance of the 803 pluripotency factor LIN28 in the mammalian nucleolus during early embryonic development. 804 Development 139, 4514-4523. 805 Wong, C. C., Loewke, K. E., Bossert, N. L., Behr, B., De Jonge, C. J., Baer, T. M. and Reijo Pera, R. A. 806 (2010). Non-invasive imaging of human embryos before embryonic genome activation 807 predicts development to the blastocyst stage. Nat Biotechnol 28, 1115-1121. 808 Voz, M. L., Agten, N. S., Van de Ven, W. J. and Kas, K. (2000). PLAG1, the main translocation target in 809 pleomorphic adenoma of the salivary glands, is a positive regulator of IGF-II. Cancer Res 60, 810 106-113. 811 Xue, Z., Huang, K., Cai, C., Cai, L., Jiang, C. Y., Feng, Y., Liu, Z., Zeng, Q., Cheng, L., Sun, Y. E., et al. 812 (2013). Genetic programs in human and mouse early embryos revealed by single-cell RNA 813 sequencing. Nature 500, 593-597. 814 Yan, L., Yang, M., Guo, H., Yang, L., Wu, J., Li, R., Liu, P., Lian, Y., Zheng, X., Yan, J., et al. (2013). 815 Single-cell RNA-Seq profiling of human preimplantation embryos and embryonic stem cells. 816 Nat Struct Mol Biol 20, 1131-1139. 817 Yu, G., Li, F., Qin, Y., Bo, X., Wu, Y. and Wang, S. (2010). GOSemSim: an R package for measuring 818 semantic similarity among GO terms and gene products. Bioinformatics 26, 976-978. 819 Zhang, W., Li, J., Guo, Y., Zhang, L., Xu, L., Gao, X., Zhu, B., Gao, H., Ni, H. and Chen, Y. (2016). Multi- 820 strategy genome-wide association studies identify the DCAF16-NCAPG region as a 821 susceptibility locus for average daily gain in cattle. Sci Rep 6, 38073.
822
32
Figure 1 a c of 129 up-regulated promoters;MA0163.1 E=2.6e-646) is similar PLAG1 t_1_MA0163.1 22 bioRxiv preprint doi: https://doi.org/10.1101/305375; this(i) versionPRD-like posted TF de April novo 22, motif 2018. in ZGA The copyright(iii) holder for this preprint(iv) (which was not
1 certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. bits (ii) PLAG1 de novo motif in ZGA CAA 1 G C A GT C A TT T T CGG GA G TA Bits T G T GG T G T T G CG A C T C AC A G G GG A A A A T G A A C C 0 T T A T CC G C 2 3 4 5 6 7 8 9 1 G C G C C T C TA A A A C GA T T G G T CGA G A G T T CG GA C G TA 10 11 12 13 14 C 0 CC C C (iii) PRD-like TF binding site TAAT G GG T GGGG 22 (iv) PLAG1 binding site (JASPAR: MA0163.1) T G T C CC TA A A C GA T GAT CGA G TA TCG 1
bits A CG CG T G T G C C T C CT T AGGAC CAA TA TAGA AT ACC 1 G C G T T TG G TT T T T A T GTGC G A TGG TAT C T A T ACT G ACC C A CCGAT CC A C AA TC AA CC TCAA GA C GCAAC GG GG Bits (v) PLAG1 consensus site (i) C A TTTGGG GGC A GAT CG CG A C C T A A G G T G T T C T AT A G A G C C G A GC G T T T T A T A G T T T C T T G A T G T G T T G C C T C CT T AGGAC CAA TA TAGA AT ACC G C G T G G TT T TT T A TTG GTGC G A TGG TAT C C T 0 T A C G C ACT G ACC C A CC AT C A A TC AA CC GA 2 3 4 5 6 7 8 9 1 C C C A TCA G A(ii) C C G T A C G T G C T T T TC GA TG TTT T G AT TCA C CT C C A A G G C CCG G CC CGC G TGG G G G GAT G TGT G T T A T G G A T T G G T G CT G TA A T A G G A C C AT C AT CGA C AC A C A C A G A A C C C C C C A AAC A A A A A A G A A A C C A T TG A G 0 AT 10 11 T12 13 14 G15 C16 C17 18 C19 C20 21 22 23 A24 25 26 27 28 T29 30 G31 GC AC T T TC C GAC TT GGGAGGC A AT q_2 T Tomtom 16.9.2013 06:26 G G A A T GAG C
9 promoters; E=9.1e-158) is similar PLAG1 (p=0.0057). Primate Alu evolution TG C T CT G T G T T T A TG TTT T AT TCA C CT C A A C CCG G CC CGC G TGG G G G GAT G TGT G T T G G A T T T G G T CT G TA A T A G G A C C AT C AT CGA C AC C A C AC G C A C A C C C A AAC A A A A A A G A A A
b GGC AC TGTA TC CAGCT TT GGAGGCCA Density
● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● GRGGC-6~8N-RGGK ● ● ● ● Rodent B1 evolution (v) ● ● ● ● ● ● A-box ● ● ● ● ● ● ● ● ● ● Core Cluster ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● promoters;● E=2.0e-81)● ● is similar PLAG1 (p=9.228e-5) and Plagl1 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● a ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 44 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●
–2,000−2000−1500 –1,500−1000−500 –1,000 0 –500 500 0 500 Distance from TSS AluJ AluS Source site for de novo motif discovery AluY Similar site with the de novo motif
b –1,500–2,000–1,000 –500 0 500
D-like homeobox genes in early development. Since the embryonic genestend to have more species-specif c expression. D-like homeodomain TF-binding motif is over-represented The conserved 14 TFs can also be grouped on the basis of h high signif cance in both early and late waves of EGA evolutionary conservation asbeingpresent already in Eumetazoa Figure 2 a b c d X2=3.87, df=2, p=0.14 Genotype F(2,53)=13.3, p=0.00002 F(3,55)=9.83, p=0.00003 F(2,10)=10.8, p=0.003 100 14 18 16 bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted16 April 22, 2018. The copyright holder for this14 preprint (which was not 80 certified by peer12 review) is the author/funder. All rights reserved. No reuse allowed without permission. 14 12 10 12
** * 60 10 ** 8 10 ** 8
6 8 *** 40 6
Litter size 6 4 4
Body weight (g) 4 Number of pups 20 2 2 Implantation sites (n) 2 7 14 5 9 19 5 29 12 15 3 3 6 4 0 0 0 0 WT HET KO Females Males F: HETKO HET KO F: WT HET KO Observed Expected WT HET KO M: HET HET KO KO M: WT WT WT
f F(3,10)=4.83, p=0.025 F x M e 1.2 HET x HET 1 * * * * KO x HET 0.8 * * * * * * * 0.6 * * * * * * * * * * HET x KO 0.4 * * * * KO
Litters per month 0.2 34 3 4 3 3 3 3 WT KO x KO 0 F: HET KO HET KO 0 30 60 90 Time (days) M: HET HET KO KO Figure 3 a b c d 10 WT n=13 n=18 1 n=8 WT 1.0 WT 1 mm KO n=22 KO bioRxivn=30 preprint doi: https://doi.org/10.1101/3053750.90,9 KO n=8 ; this version posted April 22, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 60 0.80,040,8 KO 7.5 WT 2 0.70,7 Rel weight Rel 0,6 0.030,03 40 0,5 5 0,4 KO 0.020,020,3
20 0,2 Number / mm Number of oocytes 2.5 0,1 0,01 0.010 Ovary Ovary/BW Abs weight (g) Superovulated oocytes 0 0 OvaryOvary weightsOvary/BW Preantral Antral Atretic CL g e Uterus f Mucosa WT 44 Myometrium WT n=8 0.2 0,4 KO n=8 KO WT 33 500 p=0.07 0.1 1000 0,3 1500
22
KO KO 0,2 0.2 PC2 0
1 0.5 mm 0.10,1 *** -0.1 0 Uterus Uterus/BW Abs weight (g) Rel weight 0 UterusUterus weightsUterus/BW -0.2
-0.16 -0.15 PC1 Figure 4
Zygote 2-cell 4-cell
1.00 bioRxiv preprint doi: https://doi.org/10.1101/3053751.00 ; this version posted April 22, 2018. The copyright1.00 holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
0.75 0.75 0.75
0.50 0.50 0.50 Proportion
0.25 D=0.5 0.25 p=9.993e-08 0.25
matPlag1KO WT 0.00 0.00 0.00
30 40 50 15 20 25 30 35 10 15 20 Time of exit (hours) Figure 5 a b 21-23 h post hCG +2,910 45-47 hpf +3,636 71-73 hpf 200 WT matPlag1KO WT y size y bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018.-6,832 The copyright holder for this -4,301preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
brar 150 matPlag1KO +2,851 +3,586 -6,044 -4,454
alized li alized 100 MII 2-cell 8-cell m
Down in matPlag1KO Up in matPlag1KO in nor in 50 c MII
Spike- n=16 n=16 n=14 n=14 n=15 n=15 MII 2c 8c 2-cell d 8-cell
-600 -400 -200 0 200 400 600 Number of DEGs
e WT_MII matPlag1KO_MII WT_2c 8-cell matPlag1KO_2c 0.1 WT_8c matPlag1KO_8c
WT matPlag1KO
2-cell 0.0 PC2
-0.1
MII
0.08 0.10 0.12 PC1 f WT_MII 10 matPlag1KO_MII 2-cell WT_2c matPlag1KO_2c WT_8c matPlag1KO_8c
5 ponent 2 Com 0
8-cell Oocyte
-20 -10 0 10 20 Component 1 Figure 6 a Genes upregulated in matPlag1KO at 2c Genes downregulated in matPlag1KO at 2c b Expression of Plag1 "Delayed degradation" "Delayed activation" bioRxiv preprint doi: https://doi.org/10.1101/305375; this version posted April 22, 2018. The copyright holder for this preprint (which was not certifiedWT by peer review) is the author/funder.WT All rights reserved. No reuse allowed without permission. WT matPlag1KO matPlag1KO 40 matPlag1KO 1 1 30
0 0
CPM 20
10 -1 -1 Mean normalized expression Mean normalized expression
0 MII 2c 8c MII 2c 8c MII 2c 8c c GO clusters: delayed degradation genes (n=559) GO clusters: delayed activation genes (n=530)
cl1 cl4 Similarity Similarity 1 cl2 1 cl2 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 cl5 Clusters cl3 Clusters cl1 cl1 cl2 cl2 cl3 cl3 cl4 cl4 cl3 cl5 cl5 cl5 cl6 cl6 cl7 cl8 cl4 cl1
cl1: ribonucleoside metabolism, negative regulation of transcription and metabolism cl1: ribosome biogenesis, cellular organelle assembly cl2: development and morphogenesis, cell proliferation cl2: rRNA maturation, RNA processing and slicing cl3: protein localization, hormone secretion and transport, calcium transport cl3: cellular macromolecule biosynthesis, nucleoside metabolism, translation cl4: regulation of cell proliferation, hormone and second messenger signalling cl4: nitrogen compound metabolism, organic substance biosynthesis cl5: regulation of neurogenesis, cell communication cl5: RNA nuclear export, establishment of protein localization Figure 7 a b
4 Significantly down from 2c to 8c 4 Delayed-activation genes bioRxivExpressed preprint but notdoi: different https://doi.org/10.1101/305375 ; this version posted April 22, 2018. TheExpressed copyright but holder not different for this preprint (which was not Significantly up fromcertified 2c to by8c peer review) is the author/funder. All rights reserved. No reuseDelayed-degradation allowed without permission. genes
3 3
2 2
p=1.9*10-11 p=4.7*10-10
1 1 Enrichment of human 4c to 8c genes 0 Enrichment of human 4c to 8c genes 0
0 5,000 10,000 15,000 0 5,000 10,000 15,000 Ranked expressed genes: WT mouse 2c to 8 c Ranked expressed genes: mouse 2c stage c Genes d Protein families
Mouse delayed-activation Mouse delayed-degradation Mouse delayed-activation Mouse delayed-degradation
107 505 0 558 305 377 0 91 25 1 98 12
68 p=2.9*10-7 40 p=1.3*10-8 Human ZGA OR=26 Human ZGA OR=2.3 e 2c: not affected 2c: delayed-degradation 2c: delayed-activation (15,432 genes) (559 genes) (530 genes)
PLAG1 site JASPAR 1,169 (7.6%) 52 (9.3%) 27 (5.1%) PLAG1 de novo motif 2,509 (16.3%) 81 (14.5%) 132 (24.9%)*** f g 50 Morphogenesis Only genes with PLAG1 de novo motif Development Biogenesis Neuron Ribosome Delayed activation Complex Expressed but not different 4 Delayed degradation 25 3 Regulation Signaling p53
Regulation 2 0 Protein Response Acetylation Immunity Ubiquitination Protein Transport Organelle Metabolic 1 Regulation -25 Translation with PLAG1 de novo motif
Processing Enrichment of human 4c to 8c genes 0 RNA Maturation
-50 Human ZGA Mouse delayed-activation Mouse delayed-degradation 0 1,000 2,000 Ranked expressed genes: mouse 2c stage -50 -25 0 25 50 with PLAG1 de novo motif: