UNIVERSITY OF COPENHAGEN FACULTY OF SCIENCE

PhD Thesis Qing Zhou

Maternal and paternal contribution to the development of human preimplantation embryos

Supervisor: Karsten Kristiansen

Submitted on: June 2020

Dissertation for the degree of philosophiae doctor (PhD)

Department of Biology, University of Copenhagen, Copenhagen, Denmark

and

BGI-Shenzhen, Shenzhen, China

June, 2020

Author: Qing Zhou
Title: Maternal and paternal contribution to the development of human preimplantation embryos
Academic advisor: Karsten Kristiansen, Department of Biology, University of Copenhagen

This thesis has been submitted to the PhD School of The Faculty of Science, University of Copenhagen

ACKNOWLEDGMENT

Firstly, I would like to express my sincere gratitude to my principal supervisor, Prof. Karsten Kristiansen. It was fantastic to have the opportunity to take my PhD journey with you. Your patience, precision, motivation, encouragement and continuous support are greatly appreciated. I sincerely thank you for all your valuable guidance and critical review of the multiple research projects and manuscripts. You have been my mentor and a better advisor than I could have imagined.

Very special gratitude goes to Beijing Genomics Institute and the University of Copenhagen for providing the funding that supported the PhD program and courses. It was a precious experience to study in Denmark.

I would also like to express my gratitude to the following colleagues at BGI: Wen-Jing Wang, Xi Yang, Zhongzhen Liu, Taifu Wang, Jinghua Sun and Yanru Xing, for your kind help and suggestions. I would also like to thank all the co-authors and collaborators in our projects. Without your time and efforts, my PhD research could not have been completed.

I am grateful to all the people I met during my PhD courses. I will always remember the time we spent together on group work and discussions. I cannot list all your names here, but I would like to thank you all for your patience, tolerance, encouragement and kind help.

I am also grateful for all the support and help from the PhD School. Thanks to the PhD secretary and coordinator, Jannike Dyrskjøt and Anders Løbner-Olesen, for your kind administrative support. I would also like to thank my colleagues at BGI-college, Qi Zeng and Xiaojiao Zhan, for your kind and unfailing support and assistance.

I also want to take this opportunity to thank my friends. I would like to thank Huanzi for your academic suggestions on my PhD project and for your company during our stay in Copenhagen. I must express my gratitude to Jihua, Huainian, Mingtong and Dongli, who helped me a lot during the time I stayed in Copenhagen.

And finally, last but by no means least, I would like to express special gratitude to my husband Lin: thank you for your love and support. To my parents: without your help and support, my PhD work would not have been accomplished. I am also grateful to my other family members and friends who supported me along the way.

ABSTRACT

Infertility is a critical component of reproductive health and remains a highly prevalent global condition. In many cases, couples with infertility may choose assisted reproductive technology (ART) to increase their chances of conception. However, the increasing application of ART treatment worldwide has also generated both scientific and public interest in its efficacy and safety. Since successful development of embryos in vitro is one of the key steps in ART and has a crucial influence on pregnancy outcome, an improved understanding of this process will help to uncover the mechanisms underlying ART failure due to poor embryonic development, especially for couples who have had no normal embryos for transfer over several cycles.

In this thesis, I discuss the maternal and paternal contribution to the development of human preimplantation embryos. Firstly, I give a brief introduction to the present clinical status of infertility and the major concerns in the field of ART. Secondly, I systematically introduce the developmental process of human preimplantation embryos, including maturation of gametes, the process of fertilization, the dynamic reprogramming of the epigenome upon fertilization, the transcriptional activation of the embryonic genome, and the lineage differentiation of embryos at the blastocyst stage. Thirdly, I summarize the clinical situation and scientific findings for couples suffering from recurrent ART failure due to poor embryonic development. I then present our findings on the maternal and paternal contribution to human embryogenesis based on the dynamics of parent-of-origin sex chromosomes, as well as the effects of parental genetic background on embryonic development. The distinct activation and silencing of sex chromosomes result in an imbalanced dosage of expression and thus have potential effects on the sex-specific behaviour of early embryos. The analysis of couples with recurrent poor embryonic development indicates that they are highly heterogeneous both phenotypically and genetically.

These results will further knowledge within the ART field and possibly improve the treatment of infertility. Our findings also enable the potential application of whole genome sequencing to clinical diagnosis, which could help to improve genetic counselling of couples with a history of ART failure. Hence, the results may expand the capabilities of ART and enhance reproductive health.

CONTENTS

1 INTRODUCTION

1.1 Infertility

1.2 Assisted reproductive technology (ART)

1.2.1 Introduction of ART

1.2.2 Efficacy and safety of ART

1.2.3 Sex ratio in ART

1.3 Development of preimplantation embryos

1.3.1 Maturation of gametes

1.3.2 Fertilization

1.3.3 Reprogramming of epigenetics

1.3.4 Embryonic genome activation (EGA)

1.3.5 Lineage differentiation

1.3.6 Single-cell technology in embryonic study

1.4 Failure of embryonic development in ART

1.4.1 Poor embryonic development

1.4.2 Abnormality of embryos

1.4.3 Parental genetic effects

2 OBJECTIVES

3 LIST OF SCIENTIFIC PAPERS

4 RESULTS

3.1 Contribution of sex chromosomes during human embryogenesis

3.1.1 Sex-specific behaviour of IVF embryos

3.1.2 Transcriptional differences on sex chromosomes

3.1.3 Dynamic behaviour of sex chromosomes

3.1.4 Contribution of imbalanced expression dosage

3.2 Parental genetic effects on embryonic development failure

3.2.1 Recurrent ART failure

3.2.2 Clinical characteristics

3.2.3 Genetic findings

3.2.4 Paternal genetic factors

5 CONCLUSION AND FUTURE PERSPECTIVES

6 REFERENCES

7 APPENDICES

LIST OF ABBREVIATIONS

ACMG  American College of Medical Genetics and Genomics
AI  Artificial intelligence
ART  Assisted reproductive technology
BMI  Body mass index
ChIP-seq  Chromatin immunoprecipitation sequencing
CHM  Complete hydatidiform mole
DEGs  Differentially expressed genes
EGA  Embryonic genome activation
EPI  Epiblast
GCs  Granulosa cells
GO  Gene ontology
GV  Germinal vesicle
H3K27me3  Trimethylated lysine 27 on histone H3
H3K4me2  Dimethylated lysine 4 on histone H3
H3K4me3  Trimethylated lysine 4 on histone H3
ICM  Inner cell mass
ICMART  The International Committee for Monitoring Assisted Reproductive Technologies
ICSI  Intracytoplasmic sperm injection
IVF  In vitro fertilization
lncRNA  Long noncoding RNA
LP  Likely pathogenic
MGI  Mouse Genome Informatics
MII  Metaphase II
miRNA  MicroRNA
ncRNAs  Noncoding RNAs
PADs  Polycomb-associating domains
PB  Polar body
PE  Primitive endoderm
piRNA  Piwi-interacting RNA
RNA-seq  RNA sequencing
SNVs  Single nucleotide variants
SSCs  Spermatogonial stem cells
SVs  Structural variants
TADs  Topologically associating domains
TDI  Therapeutic donor insemination
TE  Trophectoderm
tRNA  Transfer RNA
t-SNE  T-distributed stochastic neighbour embedding
WES  Whole exome sequencing
WGS  Whole genome sequencing
WHO  World Health Organization
VUS  Variants of uncertain significance
Xp  Paternal X
ZGA  Zygotic genome activation
ZP  Zona pellucida

1 INTRODUCTION

1.1 Infertility

Infertility, or the inability to conceive, is a critical component of reproductive health and remains a highly prevalent global condition. The World Health Organization (WHO) has calculated that it affects over 186 million reproductive-aged couples worldwide, translating into one in every four couples (excluding data from China) (Inhorn et al, 2015). The estimated global average prevalence of infertility is 15% (Boivin et al, 2007; Mascarenhas et al, 2012), with rates varying between regions and populations (Boivin et al, 2007; Ombelet et al, 2008) (Figure 1).

Figure 1. Prevalence of primary infertility (unable to have a first live birth) and secondary infertility (unable to have an additional live birth), presented as the percent of women who seek a child, and as the percent of all women of reproductive age (20-44), in 1990 and 2010 (Mascarenhas et al, 2012).

Infertility affects women as much as it affects men, and it is often due to a combination of male and female factors. However, male infertility is usually underreported because, in some countries, men decline to undergo fertility evaluation. The percentage of infertility due to male factor ranges from 20% to 70% among regions (Agarwal et al, 2015) (Figure 2). The inability to have children can lead to distress and depression, affecting the mental health, social functioning and behaviour of those concerned (Chachamovich et al, 2010). Fortunately, owing to the development of reproductive technology, couples with infertility have a number of medical options for having a healthy baby.

Figure 2. World map showing the percentage of infertility cases per region that are due to male factor (Agarwal et al, 2015).

1.2 Assisted reproductive technology (ART)

1.2.1 Introduction of ART

In many cases, couples with infertility may choose assisted reproductive technology (ART) to increase their chances of conception. In 2017, the International Committee for Monitoring Assisted Reproductive Technologies (ICMART) released The International Glossary on Infertility and Fertility Care (Zegers-Hochschild et al, 2017). According to this glossary, ART is defined as “All interventions that include the in vitro handling of both human oocytes and sperm or of embryos for the purpose of reproduction”. It comprises many procedures, including retrieving gametes from both the woman and the man, manipulating oocytes and sperm, fertilizing via in vitro fertilization (IVF) or intracytoplasmic sperm injection (ICSI), culturing embryos in vitro, and transferring embryos back to the woman’s uterus for implantation. The first in vitro fertilized baby was born in 1978, and Robert Geoffrey Edwards, who pioneered the technique, was awarded the Nobel Prize in 2010 for his contribution to the development of human reproductive therapy. Globally, more than seven million children have been born with the help of ART, an innovation described as “a milestone in modern medicine” (Zegers-Hochschild et al, 2009; Adamson et al, 2018; Berntsen et al, 2019).

1.2.2 Efficacy and safety of ART

The increasing application of ART treatment worldwide has also generated both scientific and public interest in its efficacy and safety (Adamson et al, 2006; Dickey et al, 2007). ART outcomes differ among regions of the world. In 2012-2013, live birth rates of fresh cycles were highest in the United States (29%) and lowest in Japan (5%) (Kushnir et al, 2017). A recent study of women in the United Kingdom undergoing IVF treatment indicates that the birth rate after 6 IVF ovarian stimulation cycles is 65.3%, with variation by age and treatment type (Smith et al, 2015). Although ICSI can bypass natural barriers to fertilization and is increasingly used in patients without male factor infertility, there is insufficient evidence that it improves postfertilization reproductive outcomes (Practice Committees of the American Society for Reproductive et al, 2012; Boulet et al, 2015). In short, the low success rate of ART remains a problem in clinical practice, and further research is needed to improve ART outcomes.

So far, there have been many prospective epidemiologic studies of the perinatal outcomes of ART-conceived babies. These studies reveal an increased risk of multiple births, higher odds of preterm birth, low and very low birthweight, being small for gestational age, and higher perinatal mortality for ART babies (Dickey et al, 2007; Romundstad et al, 2008; Kalra et al, 2011; Davies et al, 2012; Pandey et al, 2012; Kulkarni et al, 2013; Kulkarni et al, 2017). The majority of the published studies evaluate height, weight, and body mass index (BMI). Although these studies have not found differences between children by method of conception (Wilson et al, 2011; Halliday et al, 2014), adverse birth outcomes may have long-term consequences for adult health. For example, ART-conceived children have an increased risk of developing neurological problems and cardiometabolic disease (Stromberg et al, 2002; Sandin et al, 2013; Yeung et al, 2013). In addition, children born with the help of ART have an altered frequency of dynamic mutations (Zheng et al, 2013) and altered epigenetic profiles (Song et al, 2015; Choux et al, 2017).

Conception using ART involves several processes that are quite different from spontaneous conception, such as hyperstimulation, manipulation of gametes, and exposure of embryos to culture medium during the earliest stages of development. Although these ART treatments may expose embryos to various chemical and physical stressors, investigating the mechanisms responsible for the long-term health consequences remains a major challenge, because it is difficult to distinguish the contribution of the underlying infertility from that of the ART treatment. To fully investigate these adverse outcomes and their underlying mechanisms, both clinical studies and basic molecular research are needed.

1.2.3 Sex ratio in ART

Besides the short- and long-term health effects in offspring, a skewed sex ratio is also a matter of great concern in connection with ART (Maalouf et al, 2014; Orzack et al, 2015). The sex ratio is male-biased at both implantation and delivery and differs significantly among ART treatment types: between 1.03 and 1.50 in IVF cycles and 1.00 on average in ICSI cycles (Tarín et al, 2014) (Table 1). Using ICSI decreases the percentage of male offspring, while for both ICSI and IVF, transferring embryos at the blastocyst stage results in more males than transferring them at the cleavage stage (Lin et al, 2010; Bu et al, 2014).


Table 1. Sex ratio (XY/XX) at birth of singleton deliveries (data from Tarín et al, 2014).

Method of       Transfer type       Day of embryo     Sex ratio (XY/XX)
fertilization                       transfer
IVF             Cleavage-stage      ≤ Day 3           0.98 (1929/1968) a
IVF             Cleavage-stage      ≤ Day 3           1.08 (2084/1932) b
IVF             Cleavage-stage      Total: ≤ Day 3    1.03 (4013/3900)
IVF             Blastocyst-stage    > Day 3           1.22 (1030/846) a
IVF             Blastocyst-stage    > Day 3           1.28 (1088/852) b
IVF             Blastocyst-stage    Total: > Day 3    1.25 (2118/1698)
ICSI            Cleavage-stage      ≤ Day 3           0.94 (3047/3236) a
ICSI            Cleavage-stage      ≤ Day 3           0.95 (2414/2542) b
ICSI            Cleavage-stage      Total: ≤ Day 3    0.95 (5461/5778)
ICSI            Blastocyst-stage    > Day 3           0.98 (1542/1566) a
ICSI            Blastocyst-stage    > Day 3           1.10 (1289/1167) b
ICSI            Blastocyst-stage    Total: > Day 3    1.04 (2831/2733)

a Large-sample surveys from United States
b Assisted reproductive databases from Australia and New Zealand

Several mechanisms have been proposed to account for the skewed sex ratio: 1) the physical manipulation in ART may play a role in selecting Y-bearing spermatozoa, or Y-bearing spermatozoa may have an advantage over X-bearing spermatozoa in fertilization in vitro; 2) male embryos have a higher developmental ability after fertilization, as they have been reported to display higher metabolic activity and to cleave at a faster rate than female embryos (Alfarawati et al, 2011; Huang et al, 2018); 3) ICSI may decrease the number of trophectoderm (TE) cells in female blastocysts and thus impair the developmental competence of female embryos (Dumoulin et al, 2005); 4) extended exposure to culture medium up to the blastocyst stage may influence the process of X chromosome dosage compensation in females, which occurs at the late blastocyst stage (Petropoulos et al, 2016), resulting in preferential female mortality at early post-implantation stages (Sullivan et al, 2003; Tan et al, 2016).
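The ratios in Table 1 follow directly from the reported counts of male (XY) and female (XX) singleton births. As a minimal sketch in Python, assuming only the counts listed in the table (the variable and function names are illustrative and are not taken from Tarín et al, 2014), the pooled ratios can be recomputed as follows:

```python
# Counts of (XY, XX) singleton births transcribed from Table 1.
# Keys and names are illustrative, not part of the original data set.
COUNTS = {
    ("IVF", "cleavage"): [(1929, 1968), (2084, 1932)],
    ("IVF", "blastocyst"): [(1030, 846), (1088, 852)],
    ("ICSI", "cleavage"): [(3047, 3236), (2414, 2542)],
    ("ICSI", "blastocyst"): [(1542, 1566), (1289, 1167)],
}


def pooled_sex_ratio(groups):
    """Return the XY/XX ratio after summing counts over all data sources."""
    males = sum(xy for xy, _ in groups)
    females = sum(xx for _, xx in groups)
    return males / females


for (method, stage), groups in COUNTS.items():
    print(f"{method:4s} {stage:10s} {pooled_sex_ratio(groups):.2f}")
```

Pooling the counts before dividing, rather than averaging the two source-specific ratios, weights each data source by its sample size; the summed counts match the "Total" rows of Table 1, which suggests the totals were derived in this way.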


Recently, several mammalian and human studies have molecularly and functionally characterized the development of male and female IVF embryos (Aiken et al, 2004; Serdarogullari et al, 2014). Even though some observations point to sex-specific differences in early embryos, the molecular mechanisms governing these differences remain to be established. The underlying causes of increased anomalies in ART offspring are unknown, although alterations in the epigenetic state of gametes or embryos caused by ART treatment have been proposed as potential contributors (Cutfield et al, 2007; Lim et al, 2008). Understanding the development of early embryos at the molecular level is therefore important for ART and may improve the treatment of infertility.

1.3 Development of preimplantation embryos

1.3.1 Maturation of gametes

Oocyte maturation is a crucial event required for successful fertilization and embryonic development. During development in the fetal ovary, oocytes enter the early stage of meiosis and arrest at the germinal vesicle (GV) stage (Figure 3). The stored oocytes have already accomplished homologous chromosome recombination, which ensures the genetic diversity of offspring. Triggered by the preovulatory surge of luteinizing hormone, the oocyte grows and increases in size, accumulating all the storage materials needed for future fertilization and for the development of the early embryo. During maturation, the fully grown oocyte resumes meiosis with the extrusion of the first polar body (Maalouf et al, 2014). A microtubule spindle assembles around the chromosomes, then migrates to the oocyte surface and segregates half of the homologous chromosomes into the polar body. As a result, the fertilizable oocyte has a haploid complement of DNA and remains arrested at the metaphase II (MII) stage until fertilization (Clift et al, 2013).


Figure 3. Procedures of oocyte maturation and corresponding chromosome configurations (Clift and Schuh, 2013).

Due to the restricted availability of human oocytes, most information about the maturation process has been obtained from animal models. However, humans may have a different developmental pattern from that of mice. The recent development of single-cell sequencing technology has made it possible to characterize human oocytes molecularly at multiple levels. Zhang et al. performed single-cell RNA sequencing (RNA-seq) of human oocytes and corresponding granulosa cells (GCs) at five key stages of follicular development. By exploring the transcriptomic landscape of folliculogenesis from the primordial to the preovulatory stage, they provided key insights into the transcriptional regulation and molecular interactions that coordinate the stepwise maturation of the human oocyte (Zhang et al, 2018). By applying whole-genome amplification to single human oocytes and polar bodies, Hou et al. carried out the first comprehensive analyses of female meiosis and recombination (Hou et al, 2013). Similarly, Ottolini et al. generated a genome-wide map of recombination and chromosome segregation in human oocytes (Ottolini et al, 2015). Their results show that the recombination rate in oocytes is similar to that estimated from population studies and that recombination can also affect the fate of sister chromatids at the MII stage. Recently, the epigenetic landscape of the oocyte has also been established. For example, the DNA methylation level of mature oocytes is about 54.5% (Zhu et al, 2018), slightly higher than that of GV oocytes. Late-stage mouse oocytes have a unique chromatin organization, Polycomb-associating domains (PADs), which are marked by trimethylated lysine 27 on histone H3 (H3K27me3). Regulated by Polycomb group proteins, PADs disassemble after meiotic resumption and reappear on the maternal genome upon fertilization (Du et al, 2019). In contrast to mouse oocytes, trimethylated lysine 4 on histone H3 (H3K4me3) occurs as strong peaks at promoters in human GV and MII oocytes, and H3K27me3 is enriched at crucial developmental gene promoters and methylated regions (Xia et al, 2019). All of this genetic and epigenetic information, as well as the transcripts and proteins stored in the cytoplasm, will pass to the embryo and influence its development. Still, other aspects of the oocyte landscape, such as the profiles of small RNAs or of other histone modifications, need further investigation.

Spermatogenesis is a highly regulated and complex cellular differentiation process that is essential for male reproduction. It starts with the mitotic division of the spermatogonial stem cells (SSCs), in which DNA methylation has already set up the paternal-specific imprints (Figure 4). One of the division products continuously self-renews and thereby maintains the male germline throughout life. The other, differentiating product is the primary spermatocyte. Primary spermatocytes complete DNA duplication and homologous recombination and then divide meiotically to become secondary spermatocytes. During the second meiotic division, each secondary spermatocyte divides into two haploid round spermatids (Clermont et al, 1972). In the final step, spermiogenesis, the spermatids undergo extensive chromatin remodelling and develop into mature spermatozoa, also known as sperm cells. To transcriptionally inactivate and protect the paternal DNA, the spermatids undergo a histone-to-protamine transition that forms a specialized epigenome for mature spermatozoa. Protamines are small basic proteins that bind DNA to form tightly packed structures. This high-level compaction of nuclear chromatin is critical for DNA transport in the mature sperm head (Balhorn et al, 2000). During spermatogenesis, protamines replace 85–95% of histones in the sperm (Oliva et al, 1991), resulting in a distinct chromatin structure that supports the male gamete. The remaining histones are highly acetylated and are not randomly distributed along the genome (Hammoud et al, 2014).


Figure 4. Processes of spermatogenesis and epigenetic modification events at each stage (adapted from Zamudio et al, 2008; Rahman et al, 2013).

Male germ cell development involves various steps including cell proliferation, apoptosis, self-renewal, and differentiation. Although much has been learned about this series of states in rodents, these processes in humans are just beginning to be unravelled. Using single-cell RNA-seq, researchers have described the spermatogenic cell types of the human testis and revealed the transcriptional programs underlying cell fate transitions during ongoing spermatogenesis (Hermann et al, 2018; Wang et al, 2018; Sohni et al, 2019). In addition, based on whole genome sequencing (WGS) of single sperm cells, two studies have characterized the genomic diversity of one individual’s gametes by creating a genome-wide recombination map and by measuring the de novo mutation rate in human sperm (Lu et al, 2012; Wang et al, 2012). Similar to the oocyte, the DNA in sperm is hypermethylated: the median methylation level is approximately 70% in bulk sperm (Molaro et al, 2011) and about 82.0% in single-cell data (Zhu et al, 2018). Interestingly, hypomethylated promoters in mature sperm overlap greatly with promoters of developmental transcription factors (Boyer et al, 2005). Gene-specific methylation in sperm may establish the paternal-specific imprints and play a role in regulating transcription during early embryogenesis. Although the majority of histones are replaced by protamines during the histone-protamine exchange, the remaining histones carry epigenetic modifications and are enriched at important developmental loci. The histone profile of bulk sperm shows that both dimethylated lysine 4 on histone H3 (H3K4me2) and H3K4me3 are enriched at promoters of certain developmental regulators, including imprinted gene clusters, HOX clusters, microRNA (miRNA) clusters and signalling factors. Additionally, H3K27me3, a marker of transcriptional repression, localizes to promoters of genes that are silent in early embryos (Hammoud et al, 2009). Thus, these epigenetic modifications could also pass to the offspring and regulate transcription in early embryos.

Taken together, after the maturation of gametes, both oocyte and sperm are prepared for fertilization: they have acquired a diverse haploid genome (except for the MII oocyte, which completes meiosis only upon fertilization), established epigenetic marks of DNA methylation and histone modification, and stored the necessary transcripts and proteins in the cytoplasm.

1.3.2 Fertilization

Fertilization is the fusion of sperm and oocyte. The sperm initially binds to the zona pellucida (ZP) of the egg and releases a specialized secretory vesicle, changing the structure of the ZP to prevent the entry of other sperm (Bleil et al, 1981; Wassarman et al, 2008). Once binding is established, the sperm triggers the second meiotic division of the MII oocyte (Figure 3). The oocyte segregates half of the sister chromatids into the second polar body through this meiotic division. Thus, the zygote contains two haploid pronuclei, one of maternal and one of paternal origin (Clift and Schuh, 2013).

1.3.3 Reprogramming of epigenetics

Upon fertilization, the maternal genome is transcriptionally silent and arrested at the MII stage, containing sister chromatids for each chromosome. The paternal genome is tightly compacted. The oocyte then resumes the second meiotic division, excluding half of the sister chromatids, and forms a maternal pronucleus (Figure 3). Meanwhile, during the formation of the paternal pronucleus, the highly compacted sperm chromatin reorganizes and the sperm DNA expands to about three times the size of the mature sperm nucleus (Jenkins et al, 2012). In the reverse of the histone-to-protamine transition during spermatogenesis, the protamines are completely removed from the paternal chromatin and replaced by maternally derived histones from the oocyte (McLay et al, 2003; Estella et al, 2011) (Figure 5). This is a key conserved event that occurs post-fertilization. The exchange process is completed between 2 and 4 hours post-fertilization in pig (Nakazawa et al, 2002) and occurs as late as 8 hours post-fertilization in mouse (Nonchev et al, 1990). Although protamine removal in humans has been reported to be completed within 1 hour of ICSI (Estella et al, 2011), technical and ethical restrictions make it difficult to study this process in humans. As a key step in activating the paternal pronucleus, this process must be completed prior to paternal DNA replication and mitosis (Nonchev and Tsanev, 1990), generating transcriptionally competent DNA to support subsequent embryonic development.

Figure 5. Overview of the development of human post-fertilization embryos before implantation, including alterations of the epigenome and transcriptome along the embryonic timeline (edited from Jenkins and Carrell, 2012).

As mentioned in the previous section, the epigenetic landscapes of oocyte and sperm are distinct from each other, as well as from those of somatic cells. After fertilization, both maternal and paternal pronuclei need to reprogram their epigenetic modifications in order to re-establish an appropriate state for embryonic development. Although the DNA methylation level in sperm is much higher than in the oocyte, the global DNA methylation of the paternal genome is mostly erased soon after fertilization, except at imprinted clusters and retrotransposons (Hales et al, 2011), whereas the maternal DNA demethylates passively in a replication-dependent manner (Figure 5). The DNA methylation levels of the early paternal and maternal pronuclei in humans decrease from 82% to 52.9% and from 54.5% to 50.7%, respectively. The subsequent global demethylation occurs from the late zygote to the blastocyst in a stepwise and wave-like manner in human preimplantation embryos (Zhu et al, 2018). With the fusion of the maternal and paternal pronuclei, the diploid zygote is formed and prepares for the next mitosis, or cleavage. Although the parental genomes have equivalently structured nucleosomes, they also retain some parent-of-origin-specific patterns of histone modification or DNA methylation, which contribute to subsequent transcriptional regulation.
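The size of the two demethylation steps can be compared with simple arithmetic. The following Python sketch assumes the pairing implied by the gamete values cited earlier (about 82% in sperm and 54.5% in oocytes; Zhu et al, 2018), so the drop from 82% to 52.9% is assigned to the paternal pronucleus and the drop from 54.5% to 50.7% to the maternal pronucleus; the names are illustrative only:

```python
# Methylation levels (%) before and after early pronuclear demethylation,
# as quoted in the text. Assigning each pair to a parental genome is an
# assumption based on the gamete levels cited earlier (sperm ~82%, oocyte ~54.5%).
LEVELS = {
    "paternal": (82.0, 52.9),
    "maternal": (54.5, 50.7),
}


def methylation_loss(before, after):
    """Return (absolute loss in percentage points, relative loss in %)."""
    absolute = before - after
    relative = 100.0 * absolute / before
    return round(absolute, 1), round(relative, 1)


for genome, (before, after) in LEVELS.items():
    absolute, relative = methylation_loss(before, after)
    print(f"{genome}: -{absolute} points ({relative}% of the starting level)")
```

Under this assumption, the paternal genome loses roughly 29 percentage points and the maternal genome about 4, which is in line with the rapid erasure of paternal methylation and the slower, replication-dependent maternal demethylation described above.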

1.3.4 Embryonic genome activation (EGA)

Rapidly developing cleavage-stage embryos undergo cell divisions without cell growth. This means that the size of the embryo remains the same while successive mitoses produce cells of equal but progressively smaller size. There are three cleavage divisions in human embryos, from 1 cell to 8 cells (Niakan et al, 2012). The first rapid divisions are driven exclusively by materials from the maternal cytoplasm, without any contribution from the zygotic genome. Embryos then pass through a major developmental transition, the maternal-to-zygotic transition (MZT), during which developmental control is handed from maternal stores of gene products to those produced from the zygotic genome. This transition includes a series of events that make the embryonic genome fully transcriptionally active and deplete many of the maternally stored mRNAs and proteins (Lee et al, 2014; Jukam et al, 2017).

The major wave of transcriptional activation is named embryonic genome activation (EGA) or zygotic genome activation (ZGA). This activation is a conserved event that occurs at different times after fertilization in different species. In 1988, Braude et al first determined that the timing of EGA in humans is between the 4-cell and 8-cell stages (Braude et al, 1988), corresponding to embryonic Day 3 (Figure 5). Moreover, several studies have revealed minor transcriptional activity before the major wave of EGA (Dobson et al, 2004), as early as the zygote stage (Yan et al, 2013). To date, some crucial EGA activators have been identified in model animals. In Drosophila embryos, the maternal transcription factor Zelda functions as a pioneer factor that binds to nucleosomes near the promoters of hundreds of zygotic genes and promotes chromatin opening, thereby facilitating transcriptional activation (Liang et al, 2008; Harrison et al, 2011; Sun et al, 2015).
Although Zelda has no ortholog in fish or mammals, the transcription factors Nanog, SoxB1, and Oct4 can initiate the first major phase of embryonic gene activation in zebrafish (Lee et al, 2013; Leichsenring et al, 2013). Mouse EGA is controlled by several factors, including Yap1, which is highly expressed in oocytes and whose maternal deletion downregulates approximately 3000 EGA transcripts (Abbassi et al, 2016); the Zscan transcription factors, which are expressed exclusively in embryos at the time before EGA (2-cell stage for mouse) (Falco et al, 2007); NFYα, which activates several developmental genes at the 2-cell stage (Lu et al, 2016); and the DUX homeobox family genes specific to placental mammals (De Iaco et al, 2017). In human embryos, the DUX4 family plays a conserved role similar to that in mouse and activates several hundred endogenous genes during EGA (Hendrickson et al, 2017). Notably, the DUX4 family itself is zygotically expressed, so its potential maternal regulators remain to be investigated. Further “master” transcription factors responsible for human EGA also remain to be identified. Another mechanism mediating EGA is the dynamic chromatin state, especially chromatin accessibility and histone modifications (Dahl et al, 2016; Lu et al, 2016; Wu et al, 2016). Accessible chromatin allows transcription factors to bind, the transcriptional machinery to be recruited and transcription to be initiated. Consistent with gene expression, sharp open-chromatin peaks appear near the promoters or cis-regulatory sequences of activated genes from the 2-cell stage to the blastocyst in mouse embryos (Lu et al, 2016; Wu et al, 2016). The overlap of widespread accessible chromatin regions with cis-regulatory sequences and transposable elements is a conserved principle underlying the chromatin transition during EGA in both mouse and human (Lu et al, 2016; Wu et al, 2016; Gao et al, 2018; Wu et al, 2018; Liu et al, 2019). However, open regions with OCT4 binding motifs are enriched at the time of EGA in human embryos, but not in mice (Gao et al, 2018). Additionally, a large proportion of accessible chromatin loci are already detectable before EGA at CpG-rich promoters, as well as at DUX4 binding sites (Wu et al, 2018; Liu et al, 2019). Histone modifications also mark genes for expression before activation. Recent studies have identified non-canonical broad H3K4me3 domains at promoters in mouse oocytes, which are restricted to the maternal genome upon fertilization. Active removal of these broad domains is required for normal EGA, and canonical H3K4me3 is re-established more rapidly than the repressive mark H3K27me3 (Dahl et al, 2016; Liu et al, 2016; Zhang et al, 2016). In contrast to mouse oocytes, human oocytes have the canonical pattern of H3K4me3, exhibiting sharp peaks at promoters. Before EGA, human embryos acquire widespread H3K4me3 in CpG-rich regions while H3K27me3 is globally depleted (Xia et al, 2019). Taken together, as a key event during embryogenesis, genome-wide activation might be regulated by various molecular mechanisms, including but not limited to transcription factors and epigenetic landscapes. After chromatin remodelling, the parental genomes become transcriptionally active during EGA. 
Because inherited or re-established epigenetic patterns differ between the maternal and paternal genomes, it is not clear whether the two genomes contribute equally to genome-wide transcription. Using single-cell RNA-seq, Deng et al generated a global map of allelic expression in individual cells of mouse embryos. They concluded that embryonically activated genes show a dynamic and random monoallelic expression pattern during mouse EGA (Deng et al, 2014). Similarly, based on paternal genetic information, Xue et al determined the parent of origin for approximately 15-20% of detected transcripts in human preimplantation embryos and identified a stage-specific monoallelic expression pattern for a subset of transcripts (Xue et al, 2013). Additional research revealed that most genes are already biallelically expressed at the morula stage (Zhang et al, 2019). Recently, Leng et al demonstrated parent-of-origin effects on human preimplantation development by comparing the transcriptomes of uniparental and biparental embryos. Their results revealed that maternally biased genes contributed to the initiation of EGA, while paternally biased genes affected embryo compaction and lineage specification of TE cells (Leng et al, 2019). However, studies addressing the genome-wide allelic expression pattern in human embryos are still lacking. Apart from random allele-specific expression, a set of genes is always expressed from a single parental allele, a phenomenon named imprinting. That is, these genes are expressed from either the maternal or the paternal chromosome (Bartolomei et al, 2009). This monoallelic expression is essential to mammalian development and placental biology before birth (Abu-Amero et al, 2006). More recently, it has been established that imprinted genes also have major effects on postnatal development, growth and survival, as well as on common adult diseases such as diabetes and cancer (Peters et al, 2014). The current number of human imprinted genes is 257 (http://www.geneimprint.com/site/genes-by-species), and more imprinted genes have been identified using single-cell technology (Santoni et al, 2017). Generally, imprinted status is maintained by DNA methylation or noncoding RNA (ncRNA) (Bartolomei 2009; Lee et al, 2013). Maternal- or paternal-specific DNA methylation clusters, together with transcription factors and cis-regulators, regulate the monoallelic expression of nearby genes. 
In the ncRNA model, the ncRNA competes with cis-regulated genes, either by competitively binding the transcriptional machinery or by repressing the recruitment of RNA polymerase II to the promoters of target genes (Figure 6). Notably, allele-specific transcription of the ncRNA is itself regulated by DNA methylation (Lee and Bartolomei 2013). Zhang et al have demonstrated another potential imprinting mechanism in human preimplantation embryos that is independent of DNA methylation and relies on the histone modification H3K27me3 (Zhang et al, 2019). However, the imprinted genes have not yet been fully catalogued, nor have the molecular mechanisms maintaining genomic imprinting been fully characterized. Studying monoallelic expression during human embryogenesis not only gives an overview of the contribution of maternal and paternal alleles during EGA but also offers the opportunity to identify new imprinted genes in early stages and, more importantly, to understand the underlying regulatory mechanisms of inherited or re-established epigenetic marks.

Figure 6. Mechanisms of genomic imprinting: DNA methylation and non-coding RNA model (Lee and Bartolomei 2013).

1.3.5 Lineage differentiation

After the early cleavage divisions, the blastomeres undergo a process known as compaction, increasing the surface area of their cell contacts and starting cell-cell communication. Later, embryos develop to the morula stage and then differentiate into blastocysts. Around Day 5, the first cell fates are specified as trophectoderm (TE) and inner cell mass (ICM), which will give rise to the extraembryonic tissues (placenta) and the embryo, respectively (De Paepe et al, 2014). The second lineage differentiation occurs in the ICM, resulting in primitive endoderm (PE) and epiblast (EPI) on embryonic Day 7 (Figure 5). These lineages have both specific transcriptome patterns and distinct re-establishment features in the DNA methylome (Zhou et al, 2019). For female embryos, dosage compensation of the X chromosome starts around this time and occurs in all three lineages (Petropoulos et al, 2016). Unlike the paternal-specific X chromosome inactivation (XCI) in mouse, biallelic expression dampening occurs in female human embryos at embryonic Day 7 to reach dosage balance between male and female embryos (Petropoulos et al, 2016). Additionally, random XCI in human embryos is initiated around embryonic Day 12 (Zhou et al, 2019). Although specific markers have been identified for these three lineages (Blakeley et al, 2015), whether the molecular differences are a consequence or a cause of lineage differentiation is poorly understood.

1.3.6 Single-cell technology in embryonic study

Recently developed single-cell and low-input sequencing technologies have made it possible to characterize individual cells at multiple levels (Table 2). Transcriptome analysis has revealed the detailed process of maternal transcript clearance and genome activation (Xue et al, 2013; Yan et al, 2013; Petropoulos et al, 2016). To date, the epigenetic landscapes, including DNA methylation (Smith et al, 2014; Zhu et al, 2018; Zhou et al, 2019), chromatin accessibility (Gao et al, 2018; Wu et al, 2018; Liu et al, 2019), histone modifications (Xia et al, 2019) and higher-order chromatin structure (Chen et al, 2019), have shown their potential function in regulating gene expression and development. Furthermore, given the importance of early development of human preimplantation embryos and its crucial influence on pregnancy outcome, an understanding of the molecular programs and cellular dynamics of normal embryonic development also promises to clarify the causes of failure in ART, thus helping to improve the treatment of patients undergoing assisted reproductive therapies.

Table 2. Single cell and low-input methods for assaying human preimplantation embryos.

Assay | Samples | Reference

Transcriptome*
Single-cell RNA-seq | Single cell | (Yan et al, 2013)
Smart-seq2 | Single cell | (Petropoulos et al, 2016)

DNA methylation
Reduced representation bisulfite sequencing (RRBS) | Small number of cells | (Guo et al, 2014)
Post-bisulfite adaptor tagging (PBAT) | Single cell | (Zhu et al, 2018)

Chromatin accessibility
DNase-seq | 50-100 cells | (Gao et al, 2018)
Assay for transposase accessible chromatin (ATAC-seq) | 20 cells | (Wu et al, 2018)
Assay for transposase accessible chromatin (ATAC-seq) | 10 cells | (Liu et al, 2019)

Histone modification
Cleavage under targets and release using nuclease (CUT&RUN) | 50 cells | (Xia et al, 2019)

Three-dimensional chromatin structure
Low-input genome-wide chromosome conformation capture (Hi-C) | 50-100 cells | (Chen et al, 2019)

DNA methylation & chromatin accessibility
Single-cell chromatin overall omic-scale landscape sequencing (scCOOL-seq) | Single cell | (Li et al, 2018)

Transcriptome & DNA methylation
Single-cell triple omics sequencing (scTrio-seq) | Single cell | (Zhou et al, 2019)

*Transcriptome studies of a specific stage or of only a few samples are not included.

1.4 Failure of embryonic development in ART

1.4.1 Poor embryonic development

After fertilization and in vitro culture, suitable embryos must be chosen for transfer. In clinical practice, embryos are normally scored on several aspects: 1) cell number and developmental stage relative to the embryonic day; 2) proportion of fragmentation between cells; 3) size symmetry of the cells; 4) the morphology of the embryo. Only embryos of excellent quality are chosen for transfer and implantation. The most common features that lead to poor embryonic development in ART are embryo fragmentation, developmental arrest and unsuccessful blastocyst formation. Embryo fragmentation is “The process during which one or more blastomeres shed membrane vesicles containing cytoplasm and occasionally whole chromosomes or chromatin” (Zegers-Hochschild et al, 2017). Transfer of embryos with >50% fragmentation significantly affects the clinical pregnancy and implantation rates (Ebner et al, 2001). Developmental arrest is a condition in which embryos stop developing at a certain stage, have fewer cells than expected for their developmental time, and usually fail to form a normal blastocyst. The failure of embryonic development in ART decreases the number of embryos available for transfer and thus reduces the chance of a successful pregnancy.

1.4.2 Abnormality of embryos

Many studies have investigated poorly developed embryos to uncover the underlying causes. Notably, fragmented or arrested embryos show significantly different levels of chromosomal abnormalities (Munne et al, 1994; Munne et al, 1995; Dekel-Naftali et al, 2013), as well as of mitochondrial DNA released into the culture medium (Stigliani et al, 2013). In addition, spindle abnormalities (Chatzimeletiou et al, 2005), abnormal nuclei (Kort et al, 2015), and mitochondrial dysfunction (Thouas et al, 2004) could also result in embryonic aneuploidy and arrest. At the molecular level, higher expression of a wide range of apoptosis-related genes is found in fragmented embryos (Jurisicova et al, 2003; Metcalfe et al, 2004). Unsuccessful activation of the embryonic genome also contributes to abnormal development (Song et al, 2009). Moreover, it has been reported that poor embryo quality is in part due to aberrant DNA methylation status, arising either from inheritance of an immature landscape from the gametes or from improper epigenetic reprogramming (Kishigami et al, 2006). All in all, any error or abnormality during the key events of development can result in low-quality embryos.

1.4.3 Parental genetic effects

Immaturity of gametes is another key factor causing poor embryonic development. Several maternal genetic causes of abnormal oocytes and embryonic development have been reported (Table 3). Women with biallelic mutations in the oocyte-specific translational repressor PATL2 cannot produce fertilizable mature oocytes, as all their oocytes are immature and arrest at the GV stage (Chen et al, 2017; Maddirevula et al, 2017; Christou-Kent et al, 2018). Similarly, homozygous or heterozygous mutations in TUBB8 interfere with oocyte maturation by causing oocyte meiotic arrest (Feng et al, 2016). Normally, mammalian oocytes are surrounded by the ZP, a glycoprotein matrix that is formed by ZP family proteins and is essential for oogenesis, fertilization and preimplantation development (Conner et al, 2005; Avella et al, 2014). Many studies have indicated that sequence variations in the ZP family genes may cause zona anomalies and result in female infertility (Margalit et al, 2012; Avella et al, 2014; Huang et al, 2014; Chen et al, 2017; Liu et al, 2017). In addition to immaturity or changes in physical structure, some oocytes gradually degenerate or die soon after retrieval, a phenotype termed “oocyte death”. Maternal heterozygous mutations in PANX1 have been found to be responsible for this subtype of female infertility (Sang et al, 2019). Even when oocytes with normal morphology are retrieved, maternal homozygous mutations in WEE2 may cause fertilization failure and thus impair fertility (Sang et al, 2018). After fertilization, embryonic lethality or arrest is one of the major features of poor embryonic development. Several genetic determinants, such as PADI6 (Xu et al, 2016) and TLE6 (Alazami et al, 2015), have been identified through genomic sequencing of patients who have experienced several failed ART cycles. 
Homozygous mutations in KHDC3L are pathogenic causes of complete hydatidiform mole (CHM) (Fallahian et al, 2013), and they are also detected in patients whose embryos all arrest at the morula stage (Wang et al, 2018). Biallelic mutations in NLRP2 and NLRP5 are also detected in female infertility characterised by early embryonic arrest (Mu et al, 2019). However, these findings come from studies based on family pedigrees or several sporadic cases; research in larger populations is needed to find more genetic factors. Moreover, present studies mainly focus on the maternal genetic background, so it is also necessary to investigate the paternal genetic influence on gamete quality and embryonic development.

Table 3. Summary of pathogenic genes of female infertility and the reported phenotype.

Phenotype | Gene | Inheritance | Reference

Oocytes
Arrest at germinal vesicle | PATL2 | AR | (Chen et al, 2017; Maddirevula et al, 2017; Christou-Kent et al, 2018)
Meiotic arrest | TUBB8 | AR/AD | (Feng et al, 2016)
Oocyte death | PANX1 | AD | (Sang et al, 2019)

Embryos
Fertilization failure | ZP1 | AR | (Huang et al, 2014)
Fertilization failure | ZP2 | AD | (Avella et al, 2014; Liu et al, 2017)
Fertilization failure | ZP3 | AD | (Chen et al, 2017; Liu et al, 2017)
Fertilization failure | WEE2 | AR | (Sang et al, 2018)
Developmental arrest | PADI6 | AR | (Xu et al, 2016)
Developmental arrest | TLE6 | AR | (Alazami et al, 2015)
Developmental arrest | KHDC3L | AR | (Wang et al, 2018)
Developmental arrest | NLRP2 | AR | (Mu et al, 2019)
Developmental arrest | NLRP5 | AR | (Mu et al, 2019)

In summary, basic studies of human embryonic development advance our understanding of how genetics and epigenetics correlate with molecular dynamics during embryogenesis. Since the successful development of embryos is one of the key steps in assisted reproductive treatment, an improved understanding of these processes will help to elucidate the mechanisms underlying ART failure, especially for couples who have no embryos available for transfer over several treatment cycles. Knowing the maternal and paternal contribution to embryonic development at the molecular level, as well as the parental genetic effects on abnormal embryonic development, will advance knowledge in the ART field and possibly improve the diagnosis and treatment of infertility.


2 OBJECTIVES

The aim of this PhD study was to understand the maternal and paternal contribution to the molecular programs and cellular dynamics of human preimplantation embryos, and thereby to identify factors that lead to abnormal embryonic development and result in ART failure. The two specific aims are summarized here:

1. Using single-cell RNA-seq data to determine the molecular influences of maternal and paternal sex chromosomes on early embryonic development.
a) Does the expression of sex chromosomes differ during the early developmental stages of human preimplantation embryos?
b) Is there an association between gene expression and sex-specific developmental behaviour?

2. Using WGS to understand the maternal and paternal genetic effects on poor embryonic development in ART.
a) What are the clinical features of couples suffering from recurrent ART failure?
b) Which kinds of variants in the maternal and paternal genomes could lead to embryonic developmental arrest?
c) Could the application of WGS benefit the clinical diagnosis of couples with a history of ART failure?


3 LIST OF SCIENTIFIC PAPERS

Scientific papers

1. Qing Zhou, Taifu Wang, Lizhi Leng, Wei Zheng, Jinrong Huang, Fang Fang, Ling Yang, Fang Chen, Ge Lin, Wen-Jing Wang, and Karsten Kristiansen. Single-cell RNA-seq reveals distinct dynamic behavior of sex chromosomes during early human embryogenesis. Molecular Reproduction and Development, 2019; 86: 871–882

2. Qing Zhou, Wei Zheng, Wen-Jing Wang, Karsten Kristiansen, Ge Lin. The phenotypic and genetic landscape of an infertility cohort with recurrent embryonic developmental arrest. Genetics in Medicine. Preliminary Manuscript


Publications not included in this thesis (2016 ~ 2020)

Publication

1. Taifu Wang, Jinghua Sun, Xiuqing Zhang, Wen-Jing Wang, Qing Zhou. CNV-PG: a machine-learning framework for accurate copy number variation predicting and genotyping. bioRxiv, 2020

2. Jun Wang, Yaxiong Cui, Zhenyang Yu, Wen-Jing Wang, Xuan Cheng, Wenliang Ji, Shuyue Guo, Qing Zhou, Ning Wu, Yan Chen, Ying Chen, Xiaopeng Song, Hui Jiang, Yanxiao Wang, Yu Lan, Bin Zhou, Lanqun Mao, Jin Li, Huanming Yang, Weixiang Guo and Xiao Yang, Lactate Homeostasis Regulated by Brain Endothelial Cells is Critical for Adult Hippocampal Neurogenesis. Cell Stem Cell, 2019. 25(6): p. 754-767.

3. Yang, Xi, Taifu Wang, Sujun Zhu, Juan Zeng, Yanru Xing, Qing Zhou, Zhongzhen Liu, Haixiao Chen, Jinghua Sun, Liqiang Li, Jinjin Xu, Chunyu Geng, Xun Xu, Jian Wang, Huanming Yang, Shida Zhu, Fang Chen, and Wen-Jing Wang, PALM-Seq: integrated sequencing of cell-free long RNA and small RNA. bioRxiv, 2019: p. 686055.

4. Lin, Lin, Yong Liu, Fengping Xu, Jinrong Huang, Tina Fuglsang Daugaard, Trine Skov Petersen, Bettina Hansen, Lingfei Ye, Qing Zhou, Fang Fang, Ling Yang, Shengting Li, Lasse Fløe, Kristopher Torp Jensen, Ellen Shrock, Fang Chen, Huanming Yang, Jian Wang, Xin Liu, Xun Xu, Lars Bolund, Anders Lade Nielsen, and Yonglun Luo, Genome-wide determination of on-target and off-target characteristics for RNA-guided DNA methylation by dCas9 methyltransferases. GigaScience, 2018. 7(3): p. giy011.

Conference

1. Qing Zhou, Jinghua Sun, Sujun Zhu, Juan Zeng, Taifu Wang, Yanru Xing, Zunmin Wan, Xi Yang, Zhongzhen Liu, Wen-Jing Wang. Diagnosis of Rothmund-Thomson syndrome by whole genome sequencing. European Society of Human Genetics Meeting, 2020.6

2. Qing Zhou, Shiqi Lin, Jiahua Mu, Jie Guo, Xi Yang, Wen-Jing Wang. Transcriptome analysis reveals potential role of autophagy in embryo fragmentation. The 35th Annual Meeting of the European Society of Human Reproduction and Embryology, 2019.6

3. Jinghua Sun, Qing Zhou, Taifu Wang, Xi Yang, Zhongzhen Liu, Yanru Xing, Wen-Jing Wang. Diagnosis of Noonan syndrome for an aborted fetus by whole genome sequencing. European Society of Human Genetics Meeting, 2019.6

4. Qing Zhou, Taifu Wang, Jinghua Sun, Wen-Jing Wang, Karsten Kristiansen. Single-cell RNA-seq reveals distinct dynamic behavior of sex chromosomes during human early embryogenesis. The 34th Annual Meeting of the European Society of Human Reproduction and Embryology, 2018.6


4 RESULTS

3.1 Contribution of sex chromosomes during human embryogenesis

3.1.1 Sex-specific behaviour of IVF embryos

As mentioned above, the sex ratio is an important issue in ART, especially when embryos are exposed to extended culture and transferred at the blastocyst stage. This phenomenon is partly due to the different behaviour of male and female embryos during the early developmental stages. Results from animal studies have clearly demonstrated differences between the sexes at the early stages. For example, mouse embryos carrying a Y chromosome have a higher culture rate at the 2-cell stage (Sato et al, 1995) and develop more quickly (Valdivia et al, 1993). Sex not only affects the metabolism of early embryos (Kochhar et al, 2001; Alomar et al, 2008), but also influences their response to the environment. For bovine embryos, when the culture medium is supplemented with embryonic colony stimulating factor 2, more female embryos survive at the morula stage (Hansen et al, 2016). Similarly, human preimplantation embryos display different behaviours between the sexes. Male embryos have been reported to have an increased number of cells, higher metabolic activity, and a significantly faster cleavage rate than female embryos (Ray et al, 1995; Alfarawati et al, 2011; Huang et al, 2018). Later, at the blastocyst stage, female embryos need to initiate a female-specific epigenetic event, dosage compensation of the X chromosome, to balance the expression of X-linked genes between males and females (van den Berg et al, 1998).

From the moment of fertilization, the sex of an embryo is determined by the sex chromosome carried by the spermatozoon, either an X or a Y chromosome (Alomar et al, 2008; Setti et al, 2012). The regulatory role of sex chromosomes and the molecular mechanisms governing the observed differences are still not clear. Although several animal studies have demonstrated sex-specific differences in gene expression in preimplantation embryos (Kobayashi et al, 2006; Lowe et al, 2015), Y-chromosome-driven effects on gene expression in humans have only been shown in pluripotent stem cells (Ronen et al, 2014). A systematic profiling of gene expression comparing male and female human embryos is therefore needed. Moreover, sex chromosomes are the genomic feature that differs most by parental origin. Female embryos have both maternal and paternal X chromosomes, while male embryos inherit a maternal X chromosome and a paternal Y chromosome. To investigate the maternal and paternal contribution to early embryonic development, it is necessary to understand the different behaviour and function of sex chromosomes at the molecular level. It is also important to understand their potential effects on sex-specific developmental behaviours.

3.1.2 Transcriptional differences on sex chromosomes

Recently published single-cell RNA-seq studies have generated comprehensive transcriptional atlases of human preimplantation embryos, from the moment of fertilization to the late blastocyst stage (Xue et al, 2013; Yan et al, 2013; Petropoulos et al, 2016). These studies provide the opportunity to further understand the dynamics of gene expression during embryogenesis. In this study, we collected public transcriptome data of human preimplantation embryos from two datasets (Yan et al, 2013; Petropoulos et al, 2016), comprising 1607 individual cells covering developmental stages from the 4-cell stage to the late blastocyst (Figure 7), and then performed a comprehensive comparison of the transcriptomes of male and female embryos during these early developmental stages.

Figure 7. Summary of sequencing data from two public datasets, including the development stages and the statistics of cells and embryos (in parentheses) within each stage. TE: trophectoderm; EPI: epiblast; PE: primitive endoderm. E2~E7: in vitro culture time of embryos between embryonic Day 2~Day 7.


Firstly, the global transcriptional map of early embryos shows no obvious differences between male and female embryos within each developmental stage (Figure 8). In the two-dimensional t-distributed stochastic neighbour embedding (t-SNE), cells cluster clearly according to their developmental stage. This means that the primary segregating factor for these cells is the developmental time point, rather than sex. Furthermore, male and female embryos cannot be distinguished within each stage, as they cluster closely together.

Figure 8. Global transcriptional map of early embryos. The t-SNE result for all cells is calculated from the expression of all genes. Colours represent the embryonic day for male (triangle) and female (dot) embryos. t-SNE: two-dimensional t-distributed stochastic neighbour embedding.

Secondly, differences in gene expression between the sexes become apparent at the chromosome level. Although the autosomes show a consistent transcription pattern between male and female embryos, the sex chromosomes display significant differences as early as the 8-cell stage (Day 3), the time point just after EGA (Figure 9). These differences persist until the late blastocyst stage at Day 7, when differences on the autosomes also become apparent.


Figure 9. Gene expression at the chromosome level in male (light blue) and female (pink) embryos at embryonic Day 3 (E3) and Day 7 (E7). *significant with p < 10^-5 (Mann-Whitney-Wilcoxon test).

3.1.3 Dynamic behaviour of sex chromosomes

Since gene expression from the sex chromosomes differs between the sexes, we performed further analysis to profile the expression of all X-linked and Y-linked genes. Interestingly, only a few genes, such as RPS4Y1 and DDX3Y, are transcriptionally active during the initial activation of the Y chromosome (Figure 10). Their expression is much higher than that of other Y-linked genes. Furthermore, embryos could be classified into male and female groups based on the expression value of the single gene RPS4Y1 at the 8-cell stage. This consistent and high expression of specific genes is observed in all male cells. As a result, RPS4Y1 has the potential to serve as a sex-specific expression marker to distinguish male and female human cleavage-stage embryos.
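The marker-based classification described above can be illustrated with a minimal sketch. It assumes that per-cell RPS4Y1 expression values are already available and that an embryo is called male when the mean expression across its cells exceeds a cutoff. The function name, the cutoff and the example values are hypothetical illustrations and are not taken from the published analysis.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cutoff in the expression unit of the input (e.g. RPKM);
# an illustrative assumption, not a value reported in this thesis.
RPS4Y1_CUTOFF = 1.0


def call_embryo_sex(cells, cutoff=RPS4Y1_CUTOFF):
    """Assign 'male' or 'female' to each embryo from RPS4Y1 expression.

    cells: iterable of (embryo_id, rps4y1_expression) pairs, one per cell.
    An embryo is called male when the mean RPS4Y1 expression across its
    cells exceeds the cutoff, and female otherwise.
    """
    per_embryo = defaultdict(list)
    for embryo_id, expression in cells:
        per_embryo[embryo_id].append(expression)
    return {
        embryo_id: "male" if mean(values) > cutoff else "female"
        for embryo_id, values in per_embryo.items()
    }


# Made-up example values, not data from the analysed datasets.
example_cells = [
    ("embryo_1", 120.0), ("embryo_1", 95.5),
    ("embryo_2", 0.0), ("embryo_2", 0.3),
]
sex_calls = call_embryo_sex(example_cells)
```

Averaging over the cells of one embryo is a design choice of this sketch; a per-cell call with the same cutoff would work equally well if all male cells express the marker consistently, as reported above.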

Figure 10. Expression profile of the Y chromosome and sex-specific markers during early developmental stages. a) Heatmap showing the expression of detected Y-linked genes ordered by their genomic loci. b) The sex-specific marker genes are expressed at much higher levels than the average of Y-linked genes in all male cells. c) Embryos can be classified into male (blue) and female (red) groups according to the expression of RPS4Y1 as early as the 8-cell stage after EGA. EGA: embryonic genome activation.

In contrast, the X chromosomes are widely activated, as the majority of X-linked genes are upregulated during EGA. When the embryonic genome is activated, genes located broadly along the X chromosome display increased expression in both male and female embryos (Figure 11). Notably, the expression of X-linked genes is higher in females than in males. According to the allelic analysis, both alleles of heterozygous sites can be detected in the transcription data of female embryos, while in male embryos approximately 100% of the reads support a single allele. Combined with their higher expression after EGA, this suggests that genes with biallelic expression are transcribed from both the maternal and paternal X chromosomes. Together, these results indicate that both copies of the X chromosome in female embryos are transcriptionally activated during EGA.
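The allelic read ratios referred to above can be illustrated with a short sketch. It assumes that, for each heterozygous site, the numbers of RNA-seq reads supporting the reference and alternative alleles have already been counted. The function names, the minimum depth and the 0.9 cutoff are hypothetical choices made for illustration and are not parameters reported in this thesis.

```python
def major_allele_fraction(ref_reads, alt_reads):
    """Fraction of reads supporting the better-covered allele at one site."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("site is not covered by any read")
    return max(ref_reads, alt_reads) / total


def classify_site(ref_reads, alt_reads, min_depth=10, monoallelic_cutoff=0.9):
    """Label a site as 'biallelic', 'monoallelic' or 'low_coverage'.

    min_depth and monoallelic_cutoff are illustrative assumptions.
    A fraction near 0.5 means both alleles are expressed; a fraction
    near 1.0 means that reads come from a single allele.
    """
    if ref_reads + alt_reads < min_depth:
        return "low_coverage"
    if major_allele_fraction(ref_reads, alt_reads) >= monoallelic_cutoff:
        return "monoallelic"
    return "biallelic"


# Made-up read counts, not data from the analysed datasets.
female_like_site = classify_site(52, 48)   # both alleles detected
male_like_site = classify_site(100, 0)     # single allele detected
```

Under these assumptions, a site with roughly equal support for both alleles is labelled biallelic, matching the pattern described for female embryos, whereas a site with all reads on one allele is labelled monoallelic, matching the pattern described for male embryos.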


Figure 11. Expression landscape and allelic analysis of X-linked genes during EGA. Heatmap showing the average expression of genes on the X chromosome in male (E*_M) and female (E*_F) embryos during EGA. All genes are sorted by genomic location. Two representative examples indicate the biallelic expression of X-linked genes in female embryos. The heatmap bars under the bar chart show the increase in their expression in all embryos, and the ratio of detected reads supporting each allele is approximately 50% in females and close to 100% in males.

As described above, RPS4Y1 is highly expressed in male embryos at the time point when embryos complete EGA. It has a paralogous gene, RPS4X, which is located on the X chromosome and encodes a functionally equivalent protein (Zinn et al, 1994). It has been assumed that normal human development requires two copies of RPS4X in female cells and one RPS4X and one RPS4Y in male cells, as haploinsufficiency of these genes may lead to Turner syndrome (Watanabe et al, 1993). Owing to the activation of both copies of the X chromosome, RPS4X gains a two-fold dosage in female embryos. The high expression of RPS4Y1 in males balances this dosage between the sexes at the early stage. However, the situation is different for another marker gene, DDX3Y, whose active expression might be expected to balance the dosage of its paralogous gene, DDX3X. Although the proteins encoded by these two genes both belong to the RNA helicase family and share high similarity, their functions are quite different (Rosner et al, 2006). Given these functions, early activation of DDX3Y in male embryos may lead to a specific RNA metabolism, which subsequently affects downstream regulation such as neuronal differentiation (Vakilian et al, 2015). In fact, most genes cannot reach a balanced expression dosage between the sexes. As a result, the sex-specific presence of sex chromosomes and their distinct activation patterns during EGA result in an unbalanced dosage of gene expression between male and female embryos.

Thus, an unbalanced expression dosage of sex chromosomes exists between male and female embryos after EGA. Embryos then start the process of compaction and differentiation. During blastocyst formation, the Y chromosome in males is consistently and stably active, while expression of X-linked genes in females tends to decrease, especially in TE cells (Figure 12). Although it has been reported that biallelic expression dampening occurs in female embryos at the late blastocyst stage to reach a dosage balance between male and female embryos (Petropoulos et al, 2016), and that random XCI in human embryos is initiated around Day 12 (Zhou et al, 2019), the exact time at which dosage compensation is completed in human embryos is still unknown. The dynamic behaviour of sex chromosomes during early embryogenesis, especially from EGA to the completion of dosage compensation, may lead to imbalanced dosages of various genes.

Figure 12. Dynamic behaviour of sex chromosomes during blastocyst formation. Similar to autosomes (chrA), the Y chromosome has consistent expression at the blastocyst stage. Unlike the stable pattern in male embryos, the total expression value of the X chromosome in female embryos declines from embryonic Day 5 (E5) to Day 7 (E7).


3.1.4 Contribution of imbalanced expression dosage

To evaluate the dosage imbalance within each embryonic stage, we performed differential expression analysis comparing male and female embryos. In total, approximately 500~2500 differentially expressed genes (DEGs) are identified at each stage (Figure 13). Notably, the majority of DEGs are located or enriched on the X chromosome at the 8-cell stage, and more DEGs encoded on the autosomes are detected at later stages.
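The enrichment of DEGs on a given chromosome can be tested with Fisher's exact test, which is the test named in the caption of Figure 13. The following minimal sketch computes a one-sided p-value from the hypergeometric distribution. The one-sided alternative, the function name and the way the gene counts are set up are assumptions made for illustration, and the numbers in the example are made up.

```python
from math import comb


def enrichment_p_value(degs_on_chr, total_degs, genes_on_chr, total_genes):
    """One-sided Fisher's exact test for over-representation.

    Returns the probability of observing at least `degs_on_chr` DEGs on a
    chromosome carrying `genes_on_chr` of the `total_genes` tested genes,
    when `total_degs` DEGs are drawn at random without replacement.
    """
    upper = min(genes_on_chr, total_degs)
    favourable = sum(
        comb(genes_on_chr, i) * comb(total_genes - genes_on_chr, total_degs - i)
        for i in range(degs_on_chr, upper + 1)
    )
    return favourable / comb(total_genes, total_degs)


# Made-up toy numbers: 10 genes in total, 4 of them on the chromosome,
# 3 DEGs overall, 2 of which lie on the chromosome.
p_toy = enrichment_p_value(degs_on_chr=2, total_degs=3,
                           genes_on_chr=4, total_genes=10)
```

Exact integer arithmetic with `math.comb` avoids rounding problems for genome-scale counts; a chromosome would be flagged as enriched when the resulting p-value falls below the chosen threshold.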

Figure 13. Number of DEGs on each chromosome from embryonic Day 3 to Day 7 in the two datasets (bottom: Petropoulos et al., 2016; top: Yan et al., 2013). Significant enrichment is marked with a star (Fisher's exact test, p < 0.001). DEG: differentially expressed gene.

These genes show a stage-specific enrichment of various biological functions (Figure 14). For example, the main regulated biological functions at the 8-cell stage include cell cycle control and cell division, and later, at the morula stage, the process of cellular component organization is affected. During blastocyst formation, these genes are mainly involved in chromatin assembly, translation elongation, metabolism and lipid transport. Since male IVF embryos are reported to develop faster than female embryos and to display higher metabolic activity (Alfarawati et al, 2011; Huang et al, 2018), these enriched biological processes may suggest a potential role of sex-specifically expressed genes in regulating the particular behaviour of early embryos.

Figure 14. Gene Ontology annotation and function clusters of identified DEGs. a) Circos plot showing the DEGs across multiple stages. Blue lines link different genes that fall into the same enriched ontology term. b) Heatmap of the top 20 function clusters across multiple stages. White block: no significant enrichment; DEGs: differentially expressed genes.

In summary, this paper investigates the maternal- and paternal-specific contribution to early embryonic development by examining the dynamic transcriptional behaviour and potential function of sex chromosomes. By analysing public single-cell transcriptome data of human preimplantation embryos, it provides a comprehensive comparison of gene expression between male and female embryos during early development. The dynamic and distinct patterns of activation and silencing of sex chromosomes result in different dosages of gene expression and thus regulate various biological functions that may affect the sex-specific behaviour of IVF embryos during early embryogenesis.

For this paper, I conceived the original idea and designed the study of reanalysing public datasets for data mining under the guidance of my supervisor. With the assistance of my colleagues at BGI, I performed the data analysis, wrote the first version of the manuscript, prepared all the figures, revised the manuscript with my supervisor, and took charge of submission. After several rounds of revision in response to my supervisor and the reviewers, the paper was published in Molecular Reproduction and Development in 2019 (Zhou et al, 2019).

3.2 Parental genetic effects on embryonic development failure

3.2.1 Recurrent ART failure

In clinical practice, a major problem of ART treatment is the low pregnancy success rate, whether due to embryonic development failure or to a higher probability of miscarriage after implantation. The most common features of embryonic development failure are embryo fragmentation, developmental arrest and unsuccessful blastocyst formation. In each ART cycle, approximately ten to twenty oocytes are retrieved, depending on the woman's ovarian response. Poor embryonic development is a common phenotype of embryos during in vitro culture. Most couples undergoing ART eventually obtain several candidate embryos that meet the requirements for transfer, and one or two of these embryos may be chosen for transfer. However, a subgroup of couples undergo several ART cycles and fail to achieve pregnancy because they cannot obtain normal embryos for transfer. Most of them suffer from embryonic developmental arrest. As a result, they might lose the chance to have a healthy baby.

Genetic studies have identified several maternal genetic factors that affect the structure or quality of gametes, impair subsequent fertilization or cause embryonic developmental arrest, thus resulting in the failure of embryonic development (Table 3). Despite these findings from family pedigrees, the genetic factors leading to embryonic development failure have not been systematically evaluated, let alone the effects of the paternal genetic background. Therefore, in this study, we comprehensively describe the clinical phenotypes and genetic features of a subpopulation of couples suffering from repeated ART failures due to recurrent embryonic developmental arrest.

3.2.2 Clinical characteristics

We performed clinical assessment, WGS and variant analysis in 58 women who had suffered more than two ART failures because all of their in vitro cultured embryos arrested before the blastocyst stage. The recruited women were from 22 to 44 years old, with an average five-year history of infertility (Table 4). The proportions of ART cycles using IVF, ICSI and a combination of IVF and ICSI were 60.17%, 34.75% and 5.08%, respectively. In a small number of cycles (17/118), the woman received therapeutic donor insemination (TDI) because her husband had been clearly diagnosed as infertile with no functional sperm. The number of oocytes retrieved per cycle ranged from 4 to 35. On average, the fertilization rate was approximately 47.62% for IVF or ICSI treatment. Although an average of four fertilized embryos were generated in each ART cycle, none of the couples obtained normal embryos for transfer at the blastocyst stage.

Table 4. Summary of main clinical features of the infertility cohort in this study.

Main clinical features           Number of cases/average value (a)
Age                              Median 30; range 20-44
BMI                              Median 21.36; range 16.36-29.29
Years of infertility             Median 5; range 2-18
Number of oocytes                Median 9; range 4-35
TDI                              17/118 (14.41%)
Fertilization
  IVF                            71/118 (60.17%)
  ICSI                           41/118 (34.75%)
  IVF+ICSI                       6/118 (5.08%)
Fertilization rate               Mean 47.62%; range 0-100%
Number of fertilized embryos     Mean 4; range 0-17
Number of embryos for transfer   0

(a) Value collected and calculated per cycle; BMI: body mass index; IVF: in vitro fertilization; ICSI: intracytoplasmic sperm injection; TDI: therapeutic donor insemination
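The per-cycle percentages in Table 4 follow directly from the cycle counts. The short check below recomputes them from the counts given in the table; the variable and function names are illustrative and are not taken from the study's analysis code.

```python
# Cycle counts as reported in Table 4 (values are per cycle, 118 cycles in total).
TOTAL_CYCLES = 118
fertilization_counts = {"IVF": 71, "ICSI": 41, "IVF+ICSI": 6}
tdi_cycles = 17


def percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


# The three fertilization methods should account for every cycle.
assert sum(fertilization_counts.values()) == TOTAL_CYCLES

fertilization_share = {
    method: percent(count, TOTAL_CYCLES)
    for method, count in fertilization_counts.items()
}
tdi_share = percent(tdi_cycles, TOTAL_CYCLES)

for method, share in fertilization_share.items():
    print(f"{method}: {share}%")
print(f"TDI: {tdi_share}%")
```

Because the denominators are cycles rather than women, these shares describe the 118 treatment cycles and not the 58 patients.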

3.2.3 Genetic findings

We obtained genetic information for all of these women by genome sequencing. In addition, we performed whole exome sequencing (WES) to obtain exonic genetic information for 16 women. In the genetic data, we detected only novel variants in reported genes (summarized in Table 3), rather than recurrent variants at reported loci. We identified likely pathogenic (LP) causative variants in about 6.9% (4/58) of the patients (Table 5), and variants of uncertain significance (VUS) in another 8.6% (5/58). As expected, we detected pathogenic variants only in genes with a reported phenotype of developmental arrest, such as TLE6, PADI6 and KHDC3L, rather than in genes influencing the maturation of gametes or fertilization. These results are consistent with the clinical features of our cohort, since the average fertilization rate of oocytes is normal while all embryos arrest at a certain developmental stage.


Table 5. Summary of main clinical features and genetic findings in infertile women.

Columns: No.; Sample; Age; Cycles; Number of embryos (oocytes); Fertilization; Main clinical features; Gene; Reported inheritance pattern; Genomic variant(s) (zygosity) [transcript]; ACMG category.

1. Sample I1*; age 29; 2 cycles; 18 embryos (18 oocytes); IVF. Main clinical features: all embryos arrest at 4C or 5C.
   TLE6; AR; c.222G>C; p.(Q74H) (hom) [NM_001143986]; likely pathogenic
   PADI6; AR; 2bp-del (hom), upstream cis-element; uncertain significance

2. Sample I2*; age 33; 2 cycles; 9 embryos (21 oocytes); IVF, ICSI. Main clinical features: embryos arrest from 3C to 6C with fragmentation.
   TUBB8; AR, AD; c.1055C>T; p.(A352V) (het) [NM_177987]; likely pathogenic

3. Sample I173*; age 27; 2 cycles; 8 embryos (31 oocytes); IVF. Main clinical features: all embryos arrest from 4C to 8C with compaction.
   TUBB8; AR, AD; c.993C>G; p.(F331L) (het) [NM_177987]; likely pathogenic
   PADI6; AR; 2bp-del (hom), upstream cis-element; uncertain significance

4. Sample I211; age 38; 2 cycles; 10 embryos (21 oocytes); IVF, ICSI. Main clinical features: TDI; all embryos arrest from 2C to 5C.
   KHDC3L; AR; c.245A>T; p.(N82I) (het) [NM_001017361]; likely pathogenic
   KHDC3L; AR; c.*18_*315del (het) [NM_001017361]; likely pathogenic

ACMG, American College of Medical Genetics and Genomics; AD, autosomal dominant; AR, autosomal recessive; bp, base pair; BMI, body mass index; del, deletion; het, heterozygous; hom, homozygous; ICSI, intracytoplasmic sperm injection; IVF, in vitro fertilization; TDI, therapeutic donor insemination; 2~6C, embryo with 2~6 cleavage cells.

* Exonic variant results confirmed by both WES and WGS


By analysing the WGS data, we confirmed the majority of the results found in the WES data and detected two additional structural variants (SVs) located within cis-regulatory elements of a reported pathogenic gene, PADI6, in patients I1 and I173. Although they were classified as VUS according to the guidelines of the American College of Medical Genetics and Genomics (ACMG) (Richards et al, 2015), they might provide a basis for further validation experiments and new insights into the pathogenic mechanism. TLE6 and PADI6 are both reported pathogenic genes in infertile women with embryonic developmental arrest; consistent with this, the fertilization rate of patient I1 was 100% (18/18), while all embryos arrested before the 8-cell stage. In patient I173, another likely pathogenic variant was found in TUBB8, which is involved in maintaining the normal function of oocytes. Although patient I173 had more oocytes during her two ART cycles, the fertilization rate was as low as 25.8% (8/31). All embryos arrested between the 4-cell and 8-cell stages with compaction. Another patient (I2) with a variant in TUBB8 also had a low fertilization rate and displayed embryonic arrest as well as fragmentation. Interestingly, KHDC3L is a reported autosomal recessive pathogenic gene associated with recurrent hydatidiform mole (Parry et al, 2011; Fallahian et al, 2013) and developmental arrest (Wang et al, 2018). Among the single nucleotide variants (SNVs), one heterozygous missense mutation of this gene was detected in patient I211. In addition, we found a small structural deletion in this gene via WGS. Together, these compound variants may affect both copies of KHDC3L and thus lead to the embryonic developmental arrest in patient I211.
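The reasoning that links zygosity to the reported inheritance pattern, including the compound heterozygous case in patient I211, can be sketched as a simple rule over the entries of Table 5. This is an illustrative simplification and not the variant interpretation pipeline used in the study: the function name and category labels are hypothetical, it ignores the ACMG category of each variant, and it cannot tell whether two heterozygous variants lie on different copies of the gene (in trans), which would require phasing or parental data.

```python
from collections import defaultdict

# Variants transcribed from Table 5:
# (patient, gene, reported inheritance patterns, zygosity)
TABLE5_VARIANTS = [
    ("I1", "TLE6", {"AR"}, "hom"),
    ("I1", "PADI6", {"AR"}, "hom"),
    ("I2", "TUBB8", {"AR", "AD"}, "het"),
    ("I173", "TUBB8", {"AR", "AD"}, "het"),
    ("I173", "PADI6", {"AR"}, "hom"),
    ("I211", "KHDC3L", {"AR"}, "het"),
    ("I211", "KHDC3L", {"AR"}, "het"),
]


def genotype_fit(variants):
    """Label each (patient, gene) pair by how its zygosity fits the
    reported inheritance pattern of the gene."""
    zygosities = defaultdict(list)
    patterns = {}
    for patient, gene, inheritance, zygosity in variants:
        zygosities[(patient, gene)].append(zygosity)
        patterns[(patient, gene)] = inheritance

    labels = {}
    for key, observed in zygosities.items():
        inheritance = patterns[key]
        if "hom" in observed and "AR" in inheritance:
            labels[key] = "homozygous, fits AR"
        elif observed.count("het") >= 2 and "AR" in inheritance:
            # Two hits in one recessive gene: both copies are affected
            # only if the variants are in trans (not checked here).
            labels[key] = "candidate compound heterozygous, fits AR if in trans"
        elif "het" in observed and "AD" in inheritance:
            labels[key] = "heterozygous, fits AD"
        else:
            labels[key] = "single heterozygous variant, does not fit AR alone"
    return labels


for (patient, gene), label in genotype_fit(TABLE5_VARIANTS).items():
    print(f"{patient} {gene}: {label}")
```

Under this rule, the two heterozygous KHDC3L variants of patient I211 are flagged as a candidate compound heterozygous genotype, which matches the interpretation given in the text.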

3.2.4 Paternal genetic factors

We also profiled the genetic landscape to identify male factors that cause poor embryonic development. Although the genetic basis of male infertility due to abnormal spermatogenesis has been extensively reported (Skakkebaek et al, 2016; Ray et al, 2017), the association between paternal genetic factors and poor embryonic development is not clear. In fact, sperm proteins are functionally involved in the processes of fertilization and preimplantation embryonic development (Castillo et al, 2018).

To investigate the paternal factors, we selected 21 of these women and performed WGS for their husbands. According to gene ontology (GO) annotation and null-mouse data from the Mouse Genome Informatics (MGI) database (Blake et al, 2000), 103 proteins were identified as involved in the process of fertilization and 59 sperm proteins were selected with related roles in embryonic development (Castillo et al, 2018). We then summarised the genetic findings for these genes in our paternal genomic data. In total, we found LP or VUS variants in 16 samples, a detection rate of 76.2%. Most of them were heterozygous missense variants in genes involved in multiple key processes of early embryonic development (data not shown).

In summary, we provide the first description of the clinical phenotypes and genetic features of a large population of infertile couples suffering from recurrent ART failure with a phenotype of embryonic developmental arrest. Although embryonic developmental arrest is the main phenotype of this population, our results show that it is highly heterogeneous both phenotypically and genetically. By combining high-resolution WGS analysis with the paternal genetic data, we have obtained a comprehensive genetic landscape of this infertility cohort. Our findings support the potential application of WGS in clinical diagnosis, which could contribute to improved genetic counselling of couples with a history of recurrent ART failure.

This is a collaborative project with a reproductive hospital in China. For reasons of confidentiality, only parts of the results are shown in this thesis, and the manuscript in the appendix is a rewritten version. I conceived and designed this study together with the collaborators. Additionally, I was responsible for analysing the genomic sequencing data, summarising the variant results, and writing and revising the manuscript.


5 CONCLUSION AND FUTURE PERSPECTIVES

Our comprehensive analysis of the transcriptional atlases of a large number of human preimplantation embryos reveals the dynamic and distinct transcription and silencing behaviour of sex chromosomes during embryogenesis. These findings provide evidence of prominent expression differences between maternal and paternal sex chromosomes. Since male gonadal differentiation in the embryo first occurs post-implantation (Haqq et al, 1994), the sex-determining SRY gene is not expressed at these early stages. The specific high expression of RPS4Y1 on the Y chromosome helps to balance the dosage and could also serve as a potential expression marker for male embryos. In contrast, X chromosomes are widely activated after EGA. It has been reported that dosage compensation by biallelic expression dampening occurs in female embryos at embryonic Day 7 (Petropoulos et al, 2016) and that random XCI in human embryos is initiated around Day 12 (Zhou et al, 2019). We found a rapid decline of gene expression on the X chromosome in TE cells, where the first interaction between the embryo and the maternal endometrium occurs. This may result in a balanced gene expression dosage and thus be beneficial to later implantation. In mice, the imprinted inactivation of the paternal X chromosome (Xp) is maintained in the TE of the blastocyst, but is reversed randomly in the ICM (Mak et al, 2004; Okamoto et al, 2004). Key genes such as Atrx, which are involved in chromatin remodelling and play a crucial role in XCI, have been found to be expressed in TE cells but not in other cell types (Okamoto et al, 2004). Our finding of a rapid process of dosage compensation in TE cells also raises the question of whether lineage-specific factors, similar to those found in mouse, regulate this process in human.
Additionally, the dynamics of sex chromosomes lead to different dosages of gene expression between male and female embryos and thus regulate various biological functions, including cell cycle, chromatin assembly and metabolism. These molecular processes may affect the sex-specific behaviour of IVF embryos during early embryogenesis, such as the male-biased development rate and metabolic activity at the blastocyst stage (Aiken et al, 2004; Kobayashi et al, 2006; Alfarawati et al, 2011; Huang et al, 2018). Since differences in sex chromosome content, gene expression and proteins exist between X- and Y-bearing sperm, treatment of mouse sperm with an X-retarding chemical can slow the motility of X-bearing sperm without impairing their fertilization ability (Umehara et al, 2019). By sorting the gametes according to their motility, researchers could selectively generate majority-female or majority-male mouse embryos. Such sperm-sorting techniques have been widely used in the livestock sector, as well as in human assisted reproduction (Karabinus et al, 2014), especially for couples at risk of having children with sex-linked disorders. However, the present technology carries a potential risk of damaging the DNA of the sperm (Caroppo et al, 2013). Our research helps to understand the basic molecular effects governing sex differences during human embryogenesis. It may provide a basis for improving culture conditions to balance the sex ratio of ART-conceived babies, or for selectively generating male or female embryos to avoid sex-linked disorders. Knowing the sex-specific contribution of maternal and paternal sex chromosomes will possibly expand the capabilities of ART and may enhance reproductive health.

Due to the technical limitations and stochastic nature of single-cell RNA-seq technology, the present research mainly profiles mature messenger RNAs. Lowly expressed genes or other types of transcripts could be missed. However, besides protein-coding genes, noncoding RNAs, including long noncoding RNA (lncRNA) (Guttman et al, 2012), miRNA (Ambros et al, 2004; Kloosterman et al, 2006), Piwi-interacting RNA (piRNA) (Aravin et al, 2006; Grivna et al, 2006) and transfer RNA (tRNA) (Lee et al, 2009; Keam et al, 2015), are also essential factors with wide regulatory functions.
Recent studies have found that tRNA fragments in sperm can be affected by paternal diet and may mediate metabolic disorders in offspring (Chen et al, 2016; Sharma et al, 2016). The piRNAs, which prevent genomic damage caused by transposable element reactivation (Aravin et al, 2007), show tissue-specific expression in the male germline (Aravin et al, 2006; Girard et al, 2006; Grivna et al, 2006). Their newly found function of activating mRNA translation reveals a central role in the acrosome formation required for spermatid development (Dai et al, 2019). However, their biogenesis and function in embryonic development are unclear. Thus, to fully understand the process of early embryonic development, we still need to improve existing single-cell sequencing technologies and develop new ones, in order to generate highly informative whole transcriptomes of early embryos and understand the comprehensive transcriptional regulation network.

Similarly, another technical barrier to research on early development is profiling the epigenetic landscape at the single-cell level or for low-input samples. Changes in chromatin and methylation state could influence global transcription or regulate the expression of specific genes. In humans, the dynamic DNA methylation changes in pre- and post-implantation embryos have been investigated (Guo et al, 2014; Zhu et al, 2018; Zhou et al, 2019), as well as those in the prenatal germline (Gkountela et al, 2015; Guo et al, 2015). Resetting of the histone modifications H3K4me3 and H3K27me3 in early embryos has been demonstrated, revealing a pattern distinct from that in mouse (Xia et al, 2019), while changes in other types of histone modification still need further research. If chromatin immunoprecipitation sequencing (ChIP-seq) for transcription factors could be realized in embryos, it would clearly reveal the regulatory function of specific factors, helping to find the pioneer regulators that initiate EGA in human embryos. Additionally, chromatin is organized into topologically associated domains (TADs), which play important roles in regulating transcription (Dekker et al, 2016). Recent work on global chromatin remodelling in mouse shows that mature sperm have a TAD structure with long-range inter-chromosomal interactions, whereas mature oocytes have a smaller proportion of distal interactions (Du et al, 2017; Flyamer et al, 2017; Ke et al, 2017).
Due to the chromatin remodelling event after fertilization, the well-defined TADs disappear in 2-cell embryos and gradually reorganize through the cleavage stages, with clear TAD structure arising after the major EGA at the 4-cell stage in mouse embryos (Du et al, 2017; Ke et al, 2017). When the EPI transitions to a state of primed pluripotency around implantation, enhancers in ectoderm are preferentially pre-accessible and a strong bivalency state of H3K4me3 and H3K27me3 clusters at developmental gene promoters, with enhanced spatial interactions (Xiang et al, 2019). Unlike mouse sperm, human mature sperm do not have TAD structures. Upon fertilization, TADs are established during ZGA in human embryos owing to the expression of CTCF (Chen et al, 2019). However, the chromatin landscape of human post-implantation embryos remains poorly understood. We expect further development and application of single-cell technologies to study the various landscapes of human pre- and post-implantation embryos comprehensively. Most importantly, it is necessary to move from cataloguing dynamic patterns by stage toward understanding the causality between epigenetic changes and gene expression, especially the maternal- and paternal-specific inherited patterns and their contribution to early embryogenesis.

Basic studies of the genetic and epigenetic correlation with molecular dynamics during embryogenesis will further help us to elucidate the mechanisms underlying ART failure. In clinical practice, there is a subpopulation of couples who have tried several cycles of ART but cannot obtain normally developed embryos for transfer. Several maternal genetic causes have been identified in studies of family pedigrees. These pathogenic variants lead to gamete abnormality or embryonic developmental arrest (Table 3). Taking advantage of WGS technology and high-resolution analysis, our research presents a comprehensive genetic landscape of a large population of infertile couples suffering from recurrent ART failure, revealing their phenotypic and genetic heterogeneity. Clinical WES has emerged as a powerful genetic diagnostic tool since it dramatically increases the diagnostic yield for suspected genetic disorders compared to multigene panel sequencing (Lee et al, 2014; Trujillano et al, 2017). Moreover, WGS offers improved uniformity of coverage compared to WES, which could increase the accuracy of detecting variants in exonic regions (Lelieveld et al, 2015).
Our data demonstrate the ability of WGS not only to identify the diagnostic variants detected by WES, but also to interpret deep intronic and other noncoding SNVs, as well as small copy number variants (CNVs). They provide evidence for incorporating WGS into the clinical workup of people with a history of ART failure, and WGS also has the potential to identify new pathogenic variants or genes in this infertility cohort. Our findings support the potential application of WGS in clinical diagnosis, which could improve the genetic counselling of couples in relation to reproductive health. Still, further understanding of the individual’s whole genomic information and studies of larger populations are needed.

Apart from the extreme phenomenon of abnormal embryonic development, transferring embryos with good developmental potential will dramatically improve the success rate of ART. Therefore, how to select embryos of good quality becomes a critical question. Currently, morphological grading of blastocysts by embryologists is used clinically for embryo selection (Meseguer et al, 2011; Irani et al, 2017). Besides conventional blastocyst grading, artificial intelligence (AI) and time-lapse imaging have been used in attempts to improve the evaluation of human blastocyst morphology. Deep learning models trained on time-lapse images or videos can reduce variation between embryologists and can also predict pregnancy outcome (Khosravi et al, 2019; Tran et al, 2019). If such models are trained on larger datasets in the future, the evaluation system will perform better and enhance our ability to assess embryo viability. Another non-invasive material for embryo selection is the culture medium used during in vitro development. Several biomarkers of blastocyst potential have been found in the culture medium, such as the content of mitochondrial DNA (Stigliani et al, 2014), the expression of specific miRNAs (Borges et al, 2016; Capalbo et al, 2016) and the content of certain genes (Alfaidy et al, 2016; Bouvier et al, 2017). Since it is extremely difficult to detect the very low concentrations of nucleic acids in the culture medium, only a limited number of biomarkers have been identified. Similarly, if more sensitive technologies are developed and studies of large-scale samples are carried out, more reliable biomarkers could be found to predict the outcome of implantation and pregnancy and thus serve as non-invasive biomarkers for human embryo selection in ART treatment.
More importantly, when assisted reproductive treatment succeeds, the short- and long-term health of the offspring becomes of great concern. During early development, embryos have to erase the inherited parental epigenetic landscape and re-establish an appropriate state for embryonic development (Hales et al, 2011). In ART treatment, embryos accomplish these processes during in vitro culture. It has been reported that ART-conceived children have altered epigenetic profiles, and these alterations may be one of the key factors resulting in adverse child outcomes after ART (Berntsen et al, 2019). However, large-scale systematic studies of the genetic and epigenetic changes in ART-conceived children are still lacking. Additionally, the epigenetic state is susceptible to environmental modulation, both in embryos and in parental gametes. As mentioned previously, tRNA fragments in sperm can be affected by the paternal diet and may mediate metabolic disorders in offspring (Chen et al, 2016; Sharma et al, 2016). Similarly, vitamin C is required for proper DNA demethylation in the mammalian maternal germline. Deficiency of maternal vitamin C during pregnancy does not affect overall embryonic development, but reduces the number of germ cells and fecundity in adult offspring (DiTroia et al, 2019). Exposure to paracetamol during pregnancy, both through metabolism from environmental exposure and through pharmacological use, can affect the reproductive health of female offspring (Holm et al, 2016). Therefore, in addition to the parental genetic background, maternal and paternal environmental exposure may also influence embryonic development as well as the health of the offspring.

In summary, for basic scientific studies, we expect to break through present technical barriers to fully understand the developmental process of human pre- and post-implantation embryos. Knowing the maternal and paternal contribution to embryonic development at the molecular level will further knowledge within the ART field. Studying the parental genetic and epigenetic effects on abnormal embryonic development could improve the diagnosis and treatment of infertility. Meanwhile, we need more clinical research based on large-scale populations to expand the capabilities of ART and enhance reproductive health.


6 REFERENCES

1. M. C. Inhorn and P. Patrizio (2015). "Infertility around the globe: new thinking on gender, reproductive technologies and global movements in the 21st century." Hum Reprod Update 21(4): 411-426.
2. J. Boivin, L. Bunting, J. A. Collins and K. G. Nygren (2007). "International estimates of infertility prevalence and treatment-seeking: potential need and demand for infertility medical care." Hum Reprod 22(6): 1506-1512.
3. M. N. Mascarenhas, S. R. Flaxman, T. Boerma, S. Vanderpoel and G. A. Stevens (2012). "National, regional, and global trends in infertility prevalence since 1990: a systematic analysis of 277 health surveys." PLoS Med 9(12): e1001356.
4. W. Ombelet, I. Cooke, S. Dyer, G. Serour and P. Devroey (2008). "Infertility and the provision of infertility medical services in developing countries." Hum Reprod Update 14(6): 605-621.
5. A. Agarwal, A. Mulgund, A. Hamada and M. R. Chyatte (2015). "A unique view on male infertility around the globe." Reprod Biol Endocrinol 13: 37.
6. J. R. Chachamovich, E. Chachamovich, H. Ezer, M. P. Fleck, D. Knauth and E. P. Passos (2010). "Investigating quality of life and health-related quality of life in infertility: a systematic review." J Psychosom Obstet Gynaecol 31(2): 101-110.
7. F. Zegers-Hochschild, G. D. Adamson, S. Dyer, C. Racowsky, J. de Mouzon, R. Sokol, et al. (2017). "The International Glossary on Infertility and Fertility Care, 2017." Fertil Steril 108(3): 393-406.
8. F. Zegers-Hochschild, G. D. Adamson, J. de Mouzon, O. Ishihara, R. Mansour, K. Nygren, et al. (2009). "The International Committee for Monitoring Assisted Reproductive Technology (ICMART) and the World Health Organization (WHO) Revised Glossary on ART Terminology, 2009." Hum Reprod 24(11): 2683-2687.
9. G. D. Adamson, J. de Mouzon, G. M. Chambers, F. Zegers-Hochschild, R. Mansour, O. Ishihara, et al. (2018). "International Committee for Monitoring Assisted Reproductive Technology: world report on assisted reproductive technology, 2011." Fertility and Sterility 110(6): 1067-1080.
10. S. Berntsen, V. Soderstrom-Anttila, U. B. Wennerholm, H. Laivuori, A. Loft, N. B. Oldereid, et al. (2019). "The health of children conceived by ART: 'the chicken or the egg?'." Hum Reprod Update 25(2): 137-158.
11. G. D. Adamson, J. de Mouzon, P. Lancaster, K. G. Nygren, E. Sullivan and F. Zegers-Hochschild (2006). "World collaborative report on in vitro fertilization, 2000." Fertil Steril 85(6): 1586-1622.
12. R. P. Dickey (2007). "The relative contribution of assisted reproductive technologies and ovulation induction to multiple births in the United States 5 years after the Society for Assisted Reproductive Technology/American Society for Reproductive Medicine recommendation to limit the number of embryos transferred." Fertility and Sterility 88(6): 1554-1561.
13. V. A. Kushnir, D. H. Barad, D. F. Albertini, S. K. Darmon and N. Gleicher (2017). "Systematic review of worldwide trends in assisted reproductive technology 2004-2013." Reprod Biol Endocrinol 15(1): 6.
14. A. Smith, K. Tilling, S. M. Nelson and D. A. Lawlor (2015). "Live-Birth Rate Associated With Repeat In Vitro Fertilization Treatment Cycles." JAMA 314(24): 2654-2662.
15. Practice Committees of the American Society for Reproductive Medicine and Society for Assisted Reproductive Technology (2012). "Intracytoplasmic sperm injection (ICSI) for non-male factor infertility: a committee opinion." Fertil Steril 98(6): 1395-1399.
16. S. L. Boulet, A. Mehta, D. M. Kissin, L. Warner, J. F. Kawwass and D. J. Jamieson (2015). "Trends in Use of and Reproductive Outcomes Associated With Intracytoplasmic Sperm Injection." JAMA 313(3): 255-263.
17. L. B. Romundstad, P. R. Romundstad, A. Sunde, V. von During, R. Skjaerven, D. Gunnell, et al. (2008). "Effects of technology or maternal factors on perinatal outcome after assisted fertilisation: a population-based cohort study." Lancet 372(9640): 737-743.
18. S. K. Kalra and K. T. Barnhart (2011). "In vitro fertilization and adverse childhood outcomes: what we know, where we are going, and how we will get there. A glimpse into what lies behind and beckons ahead." Fertility and Sterility 95(6): 1887-1889.
19. M. J. Davies, V. M. Moore, K. J. Willson, P. Van Essen, K. Priest, H. Scott, et al. (2012). "Reproductive Technologies and the Risk of Birth Defects." New England Journal of Medicine 366(19): 1803-1813.
20. S. Pandey, A. Shetty, M. Hamilton, S. Bhattacharya and A. Maheshwari (2012). "Obstetric and perinatal outcomes in singleton pregnancies resulting from IVF/ICSI: a systematic review and meta-analysis." Hum Reprod Update 18(5): 485-503.
21. A. D. Kulkarni, D. J. Jamieson, H. W. Jones, Jr., D. M. Kissin, M. F. Gallo, M. Macaluso, et al. (2013). "Fertility treatments and multiple births in the United States." N Engl J Med 369(23): 2218-2225.
22. A. D. Kulkarni, E. Y. Adashi, D. J. Jamieson, S. B. Crawford, S. Sunderam and D. M. Kissin (2017). "Affordability of Fertility Treatments and Multiple Births in the United States." Paediatr Perinat Epidemiol 31(5): 438-448.
23. C. L. Wilson, J. R. Fisher, K. Hammarberg, D. J. Amor and J. L. Halliday (2011). "Looking downstream: a review of the literature on physical and psychosocial health outcomes in adolescents and young adults who were conceived by ART." Hum Reprod 26(5): 1209-1219.
24. J. Halliday, C. Wilson, K. Hammarberg, L. W. Doyle, F. Bruinsma, R. McLachlan, et al. (2014). "Comparing indicators of health and development of singleton young adults conceived with and without assisted reproductive technology." Fertil Steril 101(4): 1055-1063.
25. B. Stromberg, G. Dahlquist, A. Ericson, O. Finnstrom, M. Koster and K. Stjernqvist (2002). "Neurological sequelae in children born after in-vitro fertilisation: a population-based study." Lancet 359(9305): 461-465.
26. S. Sandin, K.-G. Nygren, A. Iliadou, C. M. Hultman and A. Reichenberg (2013). "Autism and Mental Retardation Among Offspring Born After In Vitro Fertilization." JAMA 310(1): 75-84.
27. E. H. Yeung and C. Druschel (2013). "Cardiometabolic health of children conceived by assisted reproductive technologies." Fertil Steril 99(2): 318-326.
28. Y. M. Zheng, L. Li, L. M. Zhou, F. Le, L. Y. Cai, P. Yu, et al. (2013). "Alterations in the frequency of trinucleotide repeat dynamic mutations in offspring conceived through assisted reproductive technology." Hum Reprod 28(9): 2570-2580.
29. S. Song, J. Ghosh, M. Mainigi, N. Turan, R. Weinerman, M. Truongcao, et al. (2015). "DNA methylation differences between in vitro- and in vivo-conceived children are associated with ART procedures rather than infertility." Clinical Epigenetics 7(1): 41.
30. C. Choux, C. Binquet, V. Carmignac, C. Bruno, C. Chapusot, J. Barberet, et al. (2017). "The epigenetic control of transposable elements and imprinted genes in newborns is affected by the mode of conception: ART versus spontaneous conception without underlying infertility." Human Reproduction 33(2): 331-340.
31. W. E. Maalouf, M. N. Mincheva, B. K. Campbell and I. C. Hardy (2014). "Effects of assisted reproductive technologies on human sex ratio at birth." Fertil Steril 101(5): 1321-1325.
32. S. H. Orzack, J. W. Stubblefield, V. R. Akmaev, P. Colls, S. Munne, T. Scholl, et al. (2015). "The human sex ratio from conception to birth." Proc Natl Acad Sci U S A 112(16): E2102-2111.
33. J. J. Tarín, M. A. García-Pérez, C. Hermenegildo and A. Cano (2014). "Changes in sex ratio from fertilization to birth in assisted-reproductive-treatment cycles." Reproductive Biology and Endocrinology 12(1): 56.
34. P.-Y. Lin, F.-J. Huang, F.-T. Kung, L.-J. Wang, S. Y. Chang and K.-C. Lan (2010). "Comparison of the Offspring Sex Ratio Between Cleavage Stage Embryo Transfer and Blastocyst Transfer." Taiwanese Journal of Obstetrics and Gynecology 49(1): 35-39.
35. Z. Bu, Z. J. Chen, G. Huang, H. Zhang, Q. Wu, Y. Ma, et al. (2014). "Live birth sex ratio after in vitro fertilization and embryo transfer in China--an analysis of 121,247 babies from 18 centers." PLoS One 9(11): e113522.

36. S. Alfarawati, E. Fragouli, P. Colls, J. Stevens, C. Gutiérrez-Mateo, W. B. Schoolcraft, et al. (2011). "The relationship between blastocyst morphology, chromosomal abnormality, and embryo gender." Fertility and Sterility 95(2): 520-524. 37. B. Huang, X. Ren, L. Zhu, L. Wu, H. Tan, N. Guo, et al. (2018). "Is differences in embryo morphokinetic development significantly associated with human embryo sex?" Biology of Reproduction: ioy229-ioy229. 38. J. C. M. Dumoulin, J. G. Derhaag, M. Bras, A. P. A. Van Montfoort, A. D. M. Kester, J. L. H. Evers, et al. (2005). "Growth rate of human preimplantation embryos is sex dependent after ICSI but not after IVF." Human Reproduction 20(2): 484-491. 39. S. Petropoulos, D. Edsgard, B. Reinius, Q. Deng, S. P. Panula, S. Codeluppi, et al. (2016). "Single-Cell RNA-Seq Reveals Lineage and X Chromosome Dynamics in Human Preimplantation Embryos." Cell 167(1): 285. 40. A. E. Sullivan, T. Lewis, M. Stephenson, R. Odem, J. Schreiber, C. Ober, et al. (2003). "Pregnancy outcome in recurrent miscarriage patients with skewed X chromosome inactivation." Obstet Gynecol 101(6): 1236-1242. 41. K. Tan, L. An, K. Miao, L. Ren, Z. Hou, L. Tao, et al. (2016). "Impaired imprinted X chromosome inactivation is responsible for the skewed sex ratio following in vitro fertilization." Proc Natl Acad Sci U S A 113(12): 3197-3202. 42. C. E. Aiken, P. P. Swoboda, J. N. Skepper and M. H. Johnson (2004). "The direct measurement of embryogenic volume and nucleo-cytoplasmic ratio during mouse pre-implantation development." Reproduction 128(5): 527-535. 43. M. Serdarogullari, N. Findikli, C. Goktas, O. Sahin, U. Ulug, E. Yagmur, et al. (2014). "Comparison of gender-specific human embryo development characteristics by time-lapse technology." Reprod Biomed Online 29(2): 193-199. 44. W. S. Cutfield, P. L. Hofman, M. Mitchell and I. M. Morison (2007). "Could Epigenetics Play a Role in the Developmental Origins of Health and Disease?" Pediatric Research 61: 68R.
45. D. Lim, S. C. Bowdin, L. Tee, G. A. Kirby, E. Blair, A. Fryer, et al. (2008). "Clinical and molecular genetic features of Beckwith–Wiedemann syndrome associated with assisted reproductive technologies." Human Reproduction 24(3): 741-747. 46. D. Clift and M. Schuh (2013). "Restarting life: fertilization and the transition from meiosis to mitosis." Nat Rev Mol Cell Biol 14(9): 549-562. 47. Y. Zhang, Z. Yan, Q. Qin, V. Nisenblat, H. M. Chang, Y. Yu, et al. (2018). "Transcriptome Landscape of Human Folliculogenesis Reveals Oocyte and Granulosa Cell Interactions." Mol Cell 72(6): 1021-1034 e1024. 48. Y. Hou, W. Fan, L. Yan, R. Li, Y. Lian, J. Huang, et al. (2013). "Genome Analyses of Single Human Oocytes." Cell 155(7): 1492-1506. 49. C. S. Ottolini, L. Newnham, A. Capalbo, S. A. Natesan, H. A. Joshi, D. Cimadomo, et al. (2015). "Genome-wide maps of recombination and chromosome segregation in human oocytes and embryos show selection for maternal recombination rates." Nat Genet 47(7): 727-735. 50. P. Zhu, H. Guo, Y. Ren, Y. Hou, J. Dong, R. Li, et al. (2018). "Single-cell DNA methylome sequencing of human preimplantation embryos." Nature Genetics 50(1): 12-19. 51. Z. Du, H. Zheng, Y. K. Kawamura, K. Zhang, J. Gassler, S. Powell, et al. "Polycomb Group Proteins Regulate Chromatin Architecture in Mouse Oocytes and Early Embryos." Molecular Cell. 52. W. Xia, J. Xu, G. Yu, G. Yao, K. Xu, X. Ma, et al. (2019). "Resetting histone modifications during human parental-to-zygotic transition." Science 365(6451): 353. 53. Y. Clermont (1972). "Kinetics of spermatogenesis in mammals: seminiferous epithelium cycle and spermatogonial renewal." Physiol Rev 52(1): 198-236. 54. R. Balhorn, L. Brewer and M. Corzett (2000). "DNA condensation by protamine and arginine-rich peptides: analysis of toroid stability using single DNA molecules." Mol Reprod Dev 56(2 Suppl): 230-234. 55. R. Oliva and G. H. Dixon (1991). "Vertebrate protamine genes and the histone-to-protamine replacement reaction." Prog Nucleic Acid Res Mol Biol 40: 25-94. 56. S. S. Hammoud, D. H. Low, C. Yi, D. T. Carrell, E. Guccione and B. R. Cairns (2014). "Chromatin and transcription transitions of mammalian adult germline stem cells and spermatogenesis." Cell Stem Cell 15(2): 239-253. 57. N. M. Zamudio, S. Chong and M. K. O'Bryan (2008). "Epigenetic regulation in male germ cells." Reproduction 136(2): 131-146. 58. M. S. Rahman, J. S. Lee, W. S. Kwon and M. G. Pang (2013). "Sperm proteomics: road to male fertility and contraception." Int J Endocrinol 2013: 360986. 59. B. P. Hermann, K. Cheng, A. Singh, L. Roa-De La Cruz, K. N. Mutoji, I. C. Chen, et al. (2018). "The Mammalian Spermatogenesis Single-Cell Transcriptome, from Spermatogonial Stem Cells to Spermatids." Cell Rep 25(6): 1650-1667 e1658. 60. M. Wang, X. Liu, G. Chang, Y. Chen, G. An, L. Yan, et al. (2018). "Single-Cell RNA Sequencing Analysis Reveals Sequential Cell Fate Transition during Human Spermatogenesis." Cell Stem Cell 23(4): 599-614 e594. 61. A. Sohni, K. Tan, H.-W. Song, D. Burow, D. G. de Rooij, L. Laurent, et al. (2019). "The Neonatal and Adult Human Testis Defined at the Single-Cell Level." Cell Reports 26(6): 1501-1517.e1504. 62. S. Lu, C. Zong, W. Fan, M. Yang, J. Li, A. R. Chapman, et al. (2012). "Probing meiotic recombination and aneuploidy of single sperm cells by whole-genome sequencing." Science 338(6114): 1627-1630. 63. J. Wang, H. C. Fan, B. Behr and S. R. Quake (2012). "Genome-wide single-cell analysis of recombination activity and de novo mutation rates in human sperm." Cell 150(2): 402-412. 64. A. Molaro, E. Hodges, F. Fang, Q. Song, W. R. McCombie, G. J. Hannon, et al. (2011). "Sperm methylation profiles reveal features of epigenetic inheritance and evolution in primates." Cell 146(6): 1029-1041. 65. L. A. Boyer, T. I. Lee, M. F. Cole, S. E. Johnstone, S. S. Levine, J. P. Zucker, et al. (2005). "Core transcriptional regulatory circuitry in human embryonic stem cells." Cell 122(6): 947-956. 66. S. S. Hammoud, D. A. Nix, H. Zhang, J. Purwar, D. T. Carrell and B. R. Cairns (2009). "Distinctive chromatin in human sperm packages genes for embryo development." Nature 460(7254): 473-478.

67. J. D. Bleil, C. F. Beall and P. M. Wassarman (1981). "Mammalian sperm-egg interaction: Fertilization of mouse eggs triggers modification of the major zona pellucida glycoprotein, ZP2." Developmental Biology 86(1): 189-197. 68. P. M. Wassarman and E. S. Litscher (2008). "Mammalian fertilization: the egg's multifunctional zona pellucida." Int J Dev Biol 52(5-6): 665-676. 69. T. G. Jenkins and D. T. Carrell (2012). "Dynamic alterations in the paternal epigenetic landscape following fertilization." Front Genet 3: 143. 70. D. W. McLay and H. J. Clarke (2003). "Remodelling the paternal chromatin at fertilization in mammals." Reproduction (Cambridge, England) 125(5): 625-633. 71. L. J. Estella, O. Z. Andrei and A. Z. Irina (2011). "Protamine Withdrawal from Human Sperm Nuclei Following Heterologous ICSI into Hamster Oocytes." Protein & Peptide Letters 18(8): 811-816. 72. Y. Nakazawa, A. Shimada, J. Noguchi, I. Domeki, H. Kaneko and K. Kikuchi (2002). "Replacement of nuclear protein by histone in pig sperm nuclei during in vitro fertilization." Reproduction 124(4): 565-572. 73. S. Nonchev and R. Tsanev (1990). "Protamine-histone replacement and DNA replication in the male mouse pronucleus." Mol Reprod Dev 25(1): 72-76. 74. B. F. Hales, L. Grenier, C. Lalancette and B. Robaire (2011). "Epigenetic programming: from gametes to blastocyst." Birth Defects Res A Clin Mol Teratol 91(8): 652-665. 75. K. K. Niakan, J. Han, R. A. Pedersen, C. Simon and R. A. Pera (2012). "Human pre-implantation embryo development." Development 139(5): 829-841. 76. M. T. Lee, A. R. Bonneau and A. J. Giraldez (2014). "Zygotic genome activation during the maternal-to-zygotic transition." Annu Rev Cell Dev Biol 30: 581-613. 77. D. Jukam, S. A. M. Shariati and J. M. Skotheim (2017). "Zygotic Genome Activation in Vertebrates." Dev Cell 42(4): 316-332. 78. P. Braude, V. Bolton and S. Moore (1988). 
"Human gene expression first occurs between the four- and eight-cell stages of preimplantation development." Nature 332(6163): 459-461.

79. A. T. Dobson, R. Raja, M. J. Abeyta, T. Taylor, S. Shen, C. Haqq, et al. (2004). "The unique transcriptome through day 3 of human preimplantation development." Human Molecular Genetics 13(14): 1461-1470. 80. L. Yan, M. Yang, H. Guo, L. Yang, J. Wu, R. Li, et al. (2013). "Single-cell RNA-Seq profiling of human preimplantation embryos and embryonic stem cells." Nat Struct Mol Biol 20(9): 1131-1139. 81. H.-L. Liang, C.-Y. Nien, H.-Y. Liu, M. M. Metzstein, N. Kirov and C. Rushlow (2008). "The zinc-finger protein Zelda is a key activator of the early zygotic genome in Drosophila." Nature 456: 400. 82. M. M. Harrison, X.-Y. Li, T. Kaplan, M. R. Botchan and M. B. Eisen (2011). "Zelda Binding in the Early Drosophila melanogaster Embryo Marks Regions Subsequently Activated at the Maternal-to-Zygotic Transition." PLOS Genetics 7(10): e1002266. 83. Y. Sun, C. Y. Nien, K. Chen, H. Y. Liu, J. Johnston, J. Zeitlinger, et al. (2015). "Zelda overcomes the high intrinsic nucleosome barrier at enhancers during Drosophila zygotic genome activation." Genome Res 25(11): 1703-1714. 84. M. T. Lee, A. R. Bonneau, C. M. Takacs, A. A. Bazzini, K. R. DiVito, E. S. Fleming, et al. (2013). "Nanog, Pou5f1 and SoxB1 activate zygotic gene expression during the maternal-to-zygotic transition." Nature 503(7476): 360-364. 85. M. Leichsenring, J. Maes, R. Mössner, W. Driever and D. Onichtchouk (2013). "Pou5f1 Transcription Factor Controls Zygotic Gene Activation In Vertebrates." Science 341(6149): 1005. 86. L. Abbassi, S. Malki, K. Cockburn, A. Macaulay, C. Robert, J. Rossant, et al. (2016). "Multiple Mechanisms Cooperate to Constitutively Exclude the Transcriptional Co-Activator YAP from the Nucleus During Murine Oogenesis." Biology of Reproduction 94(5). 87. G. Falco, S.-L. Lee, I. Stanghellini, U. C. Bassey, T. Hamatani and M. S. H. Ko (2007). "Zscan4: A novel gene expressed exclusively in late 2-cell embryos and embryonic stem cells." Developmental Biology 307(2): 539-550. 88. F. Lu, Y.
Liu, A. Inoue, T. Suzuki, K. Zhao and Y. Zhang (2016). "Establishing Chromatin Regulatory Landscape during Mouse Preimplantation Development." Cell 165(6): 1375-1388. 89. A. De Iaco, E. Planet, A. Coluccio, S. Verp, J. Duc and D. Trono (2017). "DUX-family transcription factors regulate zygotic genome activation in placental mammals." Nature Genetics 49: 941. 90. P. G. Hendrickson, J. A. Doráis, E. J. Grow, J. L. Whiddon, J.-W. Lim, C. L. Wike, et al. (2017). "Conserved roles of mouse DUX and human DUX4 in activating cleavage-stage genes and MERVL/HERVL retrotransposons." Nature Genetics 49: 925. 91. J. A. Dahl, I. Jung, H. Aanes, G. D. Greggains, A. Manaf, M. Lerdrup, et al. (2016). "Broad histone H3K4me3 domains in mouse oocytes modulate maternal-to-zygotic transition." Nature 537(7621): 548-552. 92. J. Wu, B. Huang, H. Chen, Q. Yin, Y. Liu, Y. Xiang, et al. (2016). "The landscape of accessible chromatin in mammalian preimplantation embryos." Nature 534(7609): 652-657. 93. L. Gao, K. Wu, Z. Liu, X. Yao, S. Yuan, W. Tao, et al. (2018). "Chromatin Accessibility Landscape in Human Early Embryos and Its Association with Evolution." Cell 173(1): 248-259 e215. 94. J. Wu, J. Xu, B. Liu, G. Yao, P. Wang, Z. Lin, et al. (2018). "Chromatin analysis in human early development reveals epigenetic transition during ZGA." Nature 557(7704): 256-260. 95. L. Liu, L. Leng, C. Liu, C. Lu, Y. Yuan, L. Wu, et al. (2019). "An integrated chromatin accessibility and transcriptome landscape of human pre-implantation embryos." Nat Commun 10(1): 364. 96. X. Liu, C. Wang, W. Liu, J. Li, C. Li, X. Kou, et al. (2016). "Distinct features of H3K4me3 and H3K27me3 chromatin domains in pre-implantation embryos." Nature 537(7621): 558-562. 97. B. Zhang, H. Zheng, B. Huang, W. Li, Y. Xiang, X. Peng, et al. (2016). "Allelic reprogramming of the histone modification H3K4me3 in early mammalian development." Nature 537(7621): 553-557.

98. Q. Deng, D. Ramsköld, B. Reinius and R. Sandberg (2014). "Single-Cell RNA-Seq Reveals Dynamic, Random Monoallelic Gene Expression in Mammalian Cells." Science 343(6167): 193. 99. W. Zhang, Z. Chen, Q. Yin, D. Zhang, C. Racowsky and Y. Zhang (2019). "Maternal-biased H3K27me3 correlates with paternal-specific gene expression in the human morula." Genes Dev 33(7-8): 382-387. 100. L. Leng, J. Sun, J. Huang, F. Gong, L. Yang, S. Zhang, et al. (2019). "Single-Cell Transcriptome Analysis of Uniparental Embryos Reveals Parent-of-Origin Effects on Human Preimplantation Development." Cell Stem Cell. 101. M. S. Bartolomei (2009). "Genomic imprinting: employing and avoiding epigenetic processes." Genes Dev 23(18): 2124-2133. 102. S. Abu-Amero, D. Monk, S. Apostolidou, P. Stanier and G. Moore (2006). "Imprinted genes and their role in human fetal growth." Cytogenetic and Genome Research 113(1-4): 262-270. 103. J. Peters (2014). "The role of genomic imprinting in biology and disease: an expanding view." Nat Rev Genet 15(8): 517-530. 104. F. A. Santoni, G. Stamoulis, M. Garieri, E. Falconnet, P. Ribaux, C. Borel, et al. (2017). "Detection of Imprinted Genes by Single-Cell Allele-Specific Gene Expression." Am J Hum Genet 100(3): 444-453. 105. J. T. Lee and M. S. Bartolomei (2013). "X-inactivation, imprinting, and long noncoding RNAs in health and disease." Cell 152(6): 1308-1323. 106. C. De Paepe, M. Krivega, G. Cauffman, M. Geens and H. Van de Velde (2014). "Totipotency and lineage segregation in the human embryo." Mol Hum Reprod 20(7): 599-618. 107. F. Zhou, R. Wang, P. Yuan, Y. Ren, Y. Mao, R. Li, et al. (2019). "Reconstituting the transcriptome and DNA methylome landscapes of human implantation." Nature 572(7771): 660-664. 108. P. Blakeley, N. M. Fogarty, I. del Valle, S. E. Wamaitha, T. X. Hu, K. Elder, et al. (2015). "Defining the three cell lineages of the human blastocyst by single-cell RNA-seq." Development 142(18): 3151-3165. 109. Z. Xue, K. Huang, C. Cai, L. Cai, C. Y. Jiang, Y. Feng, et al. (2013). "Genetic programs in human and mouse early embryos revealed by single-cell RNA sequencing." Nature 500(7464): 593-597. 110. Z. D. Smith, M. M. Chan, K. C. Humm, R. Karnik, S. Mekhoubad, A. Regev, et al. (2014). "DNA methylation dynamics of the human preimplantation embryo." Nature 511(7511): 611-615. 111. X. Chen, Y. Ke, K. Wu, H. Zhao, Y. Sun, L. Gao, et al. (2019). "Key role for CTCF in establishing chromatin structure in human embryos." Nature. 112. H. Guo, P. Zhu, L. Yan, R. Li, B. Hu, Y. Lian, et al. (2014). "The DNA methylation landscape of human early embryos." Nature 511(7511): 606-610. 113. L. Li, F. Guo, Y. Gao, Y. Ren, P. Yuan, L. Yan, et al. (2018). "Single-cell multi-omics sequencing of human early embryos." Nature Cell Biology 20(7): 847-858. 114. T. Ebner, C. Yaman, M. Moser, M. Sommergruber, W. Pölz and G. Tews (2001). "Embryo fragmentation in vitro and its impact on treatment and pregnancy outcome." Fertility and Sterility 76(2): 281-285. 115. S. Munne, J. Grifo, J. Cohen and H. U. Weier (1994). "Chromosome abnormalities in human arrested preimplantation embryos: a multiple-probe FISH study." Am J Hum Genet 55(1): 150-159. 116. S. Munne, M. Alikani, G. Tomkin, J. Grifo and J. Cohen (1995). "Embryo morphology, developmental rates, and maternal age are correlated with chromosome abnormalities." Fertil Steril 64(2): 382-391. 117. M. Dekel-Naftali, A. Aviram-Goldring, T. Litmanovitch, J. Shamash, H. Yonath, A. Hourvitz, et al. (2013). "Chromosomal integrity of human preimplantation embryos at different days post fertilization." J Assist Reprod Genet 30(5): 633-648. 118. S. Stigliani, P. Anserini, P. L. Venturini and P. Scaruffi (2013). "Mitochondrial DNA content in embryo culture medium is significantly associated with human embryo fragmentation." Hum Reprod 28(10): 2652-2660. 119. K. Chatzimeletiou, E. E. Morrison, N. Prapas, Y.
Prapas and A. H. Handyside
(2005). "Spindle abnormalities in normally developing and arrested human preimplantation embryos in vitro identified by confocal laser scanning microscopy." Hum Reprod 20(3): 672-682. 120. D. H. Kort, G. Chia, N. R. Treff, A. J. Tanaka, T. Xing, L. B. Vensand, et al. (2015). "Human embryos commonly form abnormal nuclei during development: a mechanism of DNA damage, embryonic aneuploidy, and developmental arrest." Human Reproduction 31(2): 312-323. 121. G. A. Thouas, A. O. Trounson, E. J. Wolvetang and G. M. Jones (2004). "Mitochondrial dysfunction in mouse oocytes results in preimplantation embryo arrest in vitro." Biol Reprod 71(6): 1936-1942. 122. A. Jurisicova, M. Antenos, S. Varmuza, J. L. Tilly and R. F. Casper (2003). "Expression of apoptosis-related genes during human preimplantation embryo development: potential roles for the Harakiri gene product and Caspase-3 in blastomere fragmentation." Mol Hum Reprod 9(3): 133-141. 123. A. D. Metcalfe, H. R. Hunter, D. J. Bloor, B. A. Lieberman, H. M. Picton, H. J. Leese, et al. (2004). "Expression of 11 members of the BCL-2 family of apoptosis regulatory molecules during human preimplantation embryo development and fragmentation." Mol Reprod Dev 68(1): 35-50. 124. B. S. Song, S. H. Lee, S. U. Kim, J. S. Kim, J. S. Park, C. H. Kim, et al. (2009). "Nucleologenesis and embryonic genome activation are defective in interspecies cloned embryos between bovine ooplasm and rhesus monkey somatic cells." BMC Dev Biol 9: 44. 125. S. Kishigami, N. Van Thuan, T. Hikichi, H. Ohta, S. Wakayama, E. Mizutani, et al. (2006). "Epigenetic abnormalities of the mouse paternal zygotic genome associated with microinsemination of round spermatids." Dev Biol 289(1): 195-205. 126. B. Chen, Z. Zhang, X. Sun, Y. Kuang, X. Mao, X. Wang, et al. (2017). "Biallelic Mutations in PATL2 Cause Female Infertility Characterized by Oocyte Maturation Arrest." Am J Hum Genet 101(4): 609-615. 127. S. Maddirevula, S. Coskun, S. Alhassan, A. Elnour, H. S. 
Alsaif, N. Ibrahim, et al.
(2017). "Female Infertility Caused by Mutations in the Oocyte-Specific Translational Repressor PATL2." Am J Hum Genet 101(4): 603-608. 128. M. Christou-Kent, Z. E. Kherraf, A. Amiri-Yekta, E. Le Blevec, T. Karaouzene, B. Conne, et al. (2018). "PATL2 is a key actor of oocyte maturation whose invalidation causes infertility in women and mice." EMBO Mol Med 10(5). 129. R. Feng, Q. Sang, Y. Kuang, X. Sun, Z. Yan, S. Zhang, et al. (2016). "Mutations in TUBB8 and Human Oocyte Meiotic Arrest." N Engl J Med 374(3): 223-232. 130. S. J. Conner, L. Lefievre, D. C. Hughes and C. L. Barratt (2005). "Cracking the egg: increased complexity in the zona pellucida." Hum Reprod 20(5): 1148-1152. 131. M. A. Avella, B. Baibakov and J. Dean (2014). "A single domain of the ZP2 zona pellucida protein mediates gamete recognition in mice and humans." J Cell Biol 205(6): 801-809. 132. M. Margalit, G. Paz, H. Yavetz, L. Yogev, A. Amit, T. Hevlin-Schwartz, et al. (2012). "Genetic and physiological study of morphologically abnormal human zona pellucida." Eur J Obstet Gynecol Reprod Biol 165(1): 70-76. 133. H. L. Huang, C. Lv, Y. C. Zhao, W. Li, X. M. He, P. Li, et al. (2014). "Mutant ZP1 in familial infertility." N Engl J Med 370(13): 1220-1226. 134. T. Chen, Y. Bian, X. Liu, S. Zhao, K. Wu, L. Yan, et al. (2017). "A Recurrent Missense Mutation in ZP3 Causes Empty Follicle Syndrome and Female Infertility." Am J Hum Genet 101(3): 459-465. 135. W. Liu, K. Li, D. Bai, J. Yin, Y. Tang, F. Chi, et al. (2017). "Dosage effects of ZP2 and ZP3 heterozygous mutations cause human infertility." Hum Genet 136(8): 975-985. 136. Q. Sang, Z. Zhang, J. Shi, X. Sun, B. Li, Z. Yan, et al. (2019). "A pannexin 1 channelopathy causes human oocyte death." Science Translational Medicine 11(485): eaav8731. 137. Q. Sang, B. Li, Y. Kuang, X. Wang, Z. Zhang, B. Chen, et al. (2018). "Homozygous Mutations in WEE2 Cause Fertilization Failure and Female Infertility." Am J Hum Genet 102(4): 649-657. 138. Y. Xu, Y. Shi, J. 
Fu, M. Yu, R. Feng, Q. Sang, et al. (2016). "Mutations in PADI6
Cause Female Infertility Characterized by Early Embryonic Arrest." Am J Hum Genet 99(3): 744-752. 139. A. M. Alazami, S. M. Awad, S. Coskun, S. Al-Hassan, H. Hijazi, F. M. Abdulwahab, et al. (2015). "TLE6 mutation causes the earliest known human embryonic lethality." Genome Biol 16: 240. 140. M. Fallahian, N. J. Sebire, P. M. Savage, M. J. Seckl and R. A. Fisher (2013). "Mutations in NLRP7 and KHDC3L confer a complete hydatidiform mole phenotype on digynic triploid conceptions." Hum Mutat 34(2): 301-308. 141. X. Wang, D. Song, D. Mykytenko, Y. Kuang, Q. Lv, B. Li, et al. (2018). "Novel mutations in genes encoding subcortical maternal complex proteins may cause human embryonic developmental arrest." Reproductive BioMedicine Online 36(6): 698-704. 142. J. Mu, W. Wang, B. Chen, L. Wu, B. Li, X. Mao, et al. (2019). "Mutations in NLRP2 and NLRP5 cause female infertility characterised by early embryonic arrest." Journal of Medical Genetics 56(7): 471. 143. E. Sato, M. Xian, R. P. Valdivia and Y. Toyoda (1995). "Sex-linked differences in developmental potential of single blastomeres from in vitro-fertilized 2-cell stage mouse embryos." Horm Res 44 Suppl 2: 4-8. 144. R. P. Valdivia, T. Kunieda, S. Azuma and Y. Toyoda (1993). "PCR sexing and developmental rate differences in preimplantation mouse embryos fertilized and cultured in vitro." Mol Reprod Dev 35(2): 121-126. 145. H. P. Kochhar, J. Peippo and W. A. King (2001). "Sex related embryo development." Theriogenology 55(1): 3-14. 146. M. Alomar, H. Tasiaux, S. Remacle, F. George, D. Paul and I. Donnay (2008). "Kinetics of fertilization and development, and sex ratio of bovine embryos produced using the semen of different bulls." Animal Reproduction Science 107(1): 48-61. 147. P. J. Hansen, K. B. Dobbs, A. C. Denicol and L. G. Siqueira (2016). "Sex and the preimplantation embryo: implications of sexual dimorphism in the preimplantation period for maternal programming of embryonic development." 
Cell Tissue Res 363(1): 237-247.
148. P. F. Ray, J. Conaghan, R. M. Winston and A. H. Handyside (1995). "Increased number of cells and metabolic activity in male human preimplantation embryos following in vitro fertilization." J Reprod Fertil 104(1): 165-171. 149. I. M. van den Berg, J. S. E. Laven, M. Stevens, I. Jonkers, R.-J. Galjaard, J. Gribnau, et al. (1998). "X Chromosome Inactivation Is Initiated in Human Preimplantation Embryos." The American Journal of Human Genetics 84(6): 771-779. 150. A. S. Setti, R. C. Figueira, D. P. Braga, A. Iaconelli, Jr. and E. Borges, Jr. (2012). "Gender incidence of intracytoplasmic morphologically selected sperm injection-derived embryos: a prospective randomized study." Reprod Biomed Online 24(4): 420-423. 151. S. Kobayashi, A. Isotani, N. Mise, M. Yamamoto, Y. Fujihara, K. Kaseda, et al. (2006). "Comparison of Gene Expression in Male and Female Mouse Blastocysts Revealed Imprinting of the X-Linked Gene at Preimplantation Stages." Current Biology 16(2): 166-172. 152. R. Lowe, C. Gemma, V. K. Rakyan and M. L. Holland (2015). "Sexually dimorphic gene expression emerges with embryonic genome activation and is dynamic throughout development." BMC Genomics 16(1): 295. 153. D. Ronen and N. Benvenisty (2014). "Sex-dependent gene expression in human pluripotent stem cells." Cell Rep 8(4): 923-932. 154. Q. Zhou, T. Wang, L. Leng, W. Zheng, J. Huang, F. Fang, et al. (2019). "Single-cell RNA-seq reveals distinct dynamic behavior of sex chromosomes during early human embryogenesis." Molecular Reproduction and Development 86(7): 871-882. 155. A. R. Zinn, R. K. Alagappan, L. G. Brown, I. Wool and D. C. Page (1994). "Structure and function of ribosomal protein S4 genes on the human and mouse sex chromosomes." Molecular and Cellular Biology 14(4): 2485-2492. 156. M. Watanabe, A. R. Zinn, D. C. Page and T. Nishimoto (1993). "Functional equivalence of human X– and Y–encoded isoforms of ribosomal protein S4 consistent with a role in Turner syndrome." Nature Genetics 4: 268.
157. A. Rosner, G. Paz and B. Rinkevich (2006). "Divergent roles of the DEAD-box protein BS-PL10, the urochordate homologue of human DDX3 and DDX3Y proteins, in colony astogeny and ontogeny." Dev Dyn 235(6): 1508-1521. 158. H. Vakilian, M. Mirzaei, M. Sharifi Tabar, P. Pooyan, L. Habibi Rezaee, L. Parker, et al. (2015). "DDX3Y, a Male-Specific Region of Y Chromosome Gene, May Modulate Neuronal Differentiation." Journal of Proteome Research 14(9): 3474-3483. 159. S. Richards, N. Aziz, S. Bale, D. Bick, S. Das, J. Gastier-Foster, et al. (2015). "Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology." Genet Med 17(5): 405-424. 160. D. A. Parry, C. V. Logan, B. E. Hayward, M. Shires, H. Landolsi, C. Diggle, et al. (2011). "Mutations causing familial biparental hydatidiform mole implicate c6orf221 as a possible regulator of genomic imprinting in the human oocyte." Am J Hum Genet 89(3): 451-458. 161. N. E. Skakkebaek, E. Rajpert-De Meyts, G. M. Buck Louis, J. Toppari, A. M. Andersson, M. L. Eisenberg, et al. (2016). "Male Reproductive Disorders and Fertility Trends: Influences of Environment and Genetic Susceptibility." Physiol Rev 96(1): 55-97. 162. P. F. Ray, A. Toure, C. Metzler-Guillemain, M. J. Mitchell, C. Arnoult and C. Coutton (2017). "Genetic abnormalities leading to qualitative defects of sperm morphology or function." Clin Genet 91(2): 217-232. 163. J. Castillo, M. Jodar and R. Oliva (2018). "The contribution of human sperm proteins to the development and epigenome of the preimplantation embryo." Hum Reprod Update 24(5): 535-555. 164. J. A. Blake, J. T. Eppig, J. E. Richardson and M. T. Davisson (2000). "The Mouse Genome Database (MGD): expanding genetic and genomic resources for the laboratory mouse. The Mouse Genome Database Group." Nucleic Acids Res 28(1): 108-111. 165. C. M. Haqq, C. Y. King, E. Ukiyama, S. Falsafi, T. N. Haqq, P. K. Donahoe, et al. (1994).
"Molecular basis of mammalian sexual determination: activation of Mullerian inhibiting substance gene expression by SRY." Science 266(5190): 1494-1500.

166. W. Mak, T. B. Nesterova, M. de Napoles, R. Appanah, S. Yamanaka, A. P. Otte, et al. (2004). "Reactivation of the paternal X chromosome in early mouse embryos." Science 303(5658): 666-669. 167. I. Okamoto, A. P. Otte, C. D. Allis, D. Reinberg and E. Heard (2004). "Epigenetic dynamics of imprinted X inactivation during early mouse development." Science 303(5658): 644-649. 168. T. Umehara, N. Tsujita and M. Shimada (2019). "Activation of Toll-like receptor 7/8 encoded by the X chromosome alters sperm motility and provides a novel simple technology for sexing sperm." PLoS Biol 17(8): e3000398. 169. D. S. Karabinus, D. P. Marazzo, H. J. Stern, D. A. Potter, C. I. Opanga, M. L. Cole, et al. (2014). "The effectiveness of flow cytometric sorting of human sperm (MicroSort(R)) for influencing a child's sex." Reprod Biol Endocrinol 12: 106. 170. E. Caroppo (2013). "Sperm sorting for selection of healthy sperm: is it safe and useful?" Fertility and Sterility 100(3): 695-696. 171. M. Guttman and J. L. Rinn (2012). "Modular regulatory principles of large non-coding RNAs." Nature 482(7385): 339-346. 172. V. Ambros (2004). "The functions of animal microRNAs." Nature 431(7006): 350-355. 173. W. P. Kloosterman and R. H. Plasterk (2006). "The diverse functions of microRNAs in animal development and disease." Dev Cell 11(4): 441-450. 174. A. Aravin, D. Gaidatzis, S. Pfeffer, M. Lagos-Quintana, P. Landgraf, N. Iovino, et al. (2006). "A novel class of small RNAs bind to MILI protein in mouse testes." Nature 442(7099): 203-207. 175. S. T. Grivna, E. Beyret, Z. Wang and H. Lin (2006). "A novel class of small RNAs in mouse spermatogenic cells." Genes Dev 20(13): 1709-1714. 176. Y. S. Lee, Y. Shibata, A. Malhotra and A. Dutta (2009). "A novel class of small RNAs: tRNA-derived RNA fragments (tRFs)." Genes Dev 23(22): 2639-2649. 177. S. P. Keam and G. Hutvagner (2015). "tRNA-Derived Fragments (tRFs): Emerging New Roles for an Ancient RNA in the Regulation of Gene Expression."
Life (Basel) 5(4): 1638-1651. 178. Q. Chen, M. Yan, Z. Cao, X. Li, Y. Zhang, J. Shi, et al. (2016). "Sperm tsRNAs contribute to intergenerational inheritance of an acquired metabolic disorder." Science 351(6271): 397-400. 179. U. Sharma, C. C. Conine, J. M. Shea, A. Boskovic, A. G. Derr, X. Y. Bing, et al. (2016). "Biogenesis and function of tRNA fragments during sperm maturation and fertilization in mammals." Science 351(6271): 391-396. 180. A. A. Aravin, G. J. Hannon and J. Brennecke (2007). "The Piwi-piRNA pathway provides an adaptive defense in the transposon arms race." Science 318(5851): 761-764. 181. A. Girard, R. Sachidanandam, G. J. Hannon and M. A. Carmell (2006). "A germline-specific class of small RNAs binds mammalian Piwi proteins." Nature 442(7099): 199-202. 182. P. Dai, X. Wang, L.-T. Gou, Z.-T. Li, Z. Wen, Z.-G. Chen, et al. (2019). "A Translation-Activating Function of MIWI/piRNA during Mouse Spermiogenesis." Cell 179(7): 1566-1581.e1516. 183. S. Gkountela, K. X. Zhang, T. A. Shafiq, W. W. Liao, J. Hargan-Calvopina, P. Y. Chen, et al. (2015). "DNA Demethylation Dynamics in the Human Prenatal Germline." Cell 161(6): 1425-1436. 184. F. Guo, L. Yan, H. Guo, L. Li, B. Hu, Y. Zhao, et al. (2015). "The Transcriptome and DNA Methylome Landscapes of Human Primordial Germ Cells." Cell 161(6): 1437-1452. 185. J. Dekker and L. Mirny (2016). "The 3D Genome as Moderator of Chromosomal Communication." Cell 164(6): 1110-1121. 186. Z. Du, H. Zheng, B. Huang, R. Ma, J. Wu, X. Zhang, et al. (2017). "Allelic reprogramming of 3D chromatin architecture during early mammalian development." Nature 547(7662): 232-235. 187. I. M. Flyamer, J. Gassler, M. Imakaev, H. B. Brandão, S. V. Ulianov, N. Abdennur, et al. (2017). "Single-nucleus Hi-C reveals unique chromatin reorganization at oocyte-to-zygote transition." Nature 544: 110.

188. Y. Ke, Y. Xu, X. Chen, S. Feng, Z. Liu, Y. Sun, et al. (2017). "3D Chromatin Structures of Mature Gametes and Structural Reprogramming during Mammalian Embryogenesis." Cell 170(2): 367-381 e320. 189. Y. Xiang, Y. Zhang, Q. Xu, C. Zhou, B. Liu, Z. Du, et al. (2019). "Epigenomic analysis of gastrulation identifies a unique chromatin state for primed pluripotency." Nature Genetics. 190. H. Lee, J. L. Deignan, N. Dorrani, S. P. Strom, S. Kantarci, F. Quintero-Rivera, et al. (2014). "Clinical exome sequencing for genetic identification of rare Mendelian disorders." JAMA 312(18): 1880-1887. 191. D. Trujillano, A. M. Bertoli-Avella, K. Kumar Kandaswamy, M. E. Weiss, J. Koster, A. Marais, et al. (2017). "Clinical exome sequencing: results from 2819 samples reflecting 1000 families." Eur J Hum Genet 25(2): 176-182. 192. S. H. Lelieveld, M. Spielmann, S. Mundlos, J. A. Veltman and C. Gilissen (2015). "Comparison of Exome and Genome Sequencing Technologies for the Complete Capture of Protein-Coding Regions." Hum Mutat 36(8): 815-822. 193. M. Meseguer, J. Herrero, A. Tejera, K. M. Hilligsøe, N. B. Ramsing and J. Remohí (2011). "The use of morphokinetics as a predictor of embryo implantation." Human Reproduction 26(10): 2658-2671. 194. M. Irani, D. Reichman, A. Robles, A. Melnick, O. Davis, N. Zaninovic, et al. (2017). "Morphologic grading of euploid blastocysts influences implantation and ongoing pregnancy rates." Fertil Steril 107(3): 664-670. 195. P. Khosravi, E. Kazemi, Q. Zhan, J. E. Malmsten, M. Toschi, P. Zisimopoulos, et al. (2019). "Deep learning enables robust assessment and selection of human blastocysts after in vitro fertilization." NPJ Digit Med 2: 21. 196. D. Tran, S. Cooke, P. J. Illingworth and D. K. Gardner (2019). "Deep learning as a predictive tool for fetal heart pregnancy following time-lapse incubation and blastocyst transfer." Human Reproduction 34(6): 1011-1018. 197. S. Stigliani, L. Persico, C. Lagazio, P. Anserini, P. L. Venturini and P.
Scaruffi (2014). "Mitochondrial DNA in Day 3 embryo culture medium is a novel, non-invasive

68 biomarker of blastocyst potential and implantation outcome." Mol Hum Reprod 20(12): 1238-1246. 198. E. Borges, Jr., A. S. Setti, D. P. Braga, M. V. Geraldo, R. C. Figueira and A. Iaconelli, Jr. (2016). "miR-142-3p as a biomarker of blastocyst implantation failure - A pilot study." JBRA Assist Reprod 20(4): 200-205. 199. A. Capalbo, F. M. Ubaldi, D. Cimadomo, L. Noli, Y. Khalaf, A. Farcomeni, et al. (2016). "MicroRNAs in spent blastocyst culture medium are derived from trophectoderm cells and can be explored for human embryo reproductive competence assessment." Fertil Steril 105(1): 225-235 e221-223. 200. N. Alfaidy, P. Hoffmann, P. Gillois, A. Gueniffey, C. Lebayle, H. Garcin, et al. (2016). "PROK1 Level in the Follicular Microenvironment: A New Noninvasive Predictive Biomarker of Embryo Implantation." J Clin Endocrinol Metab 101(2): 435- 444. 201. S. Bouvier, O. Paulmyer-Lacroix, N. Molinari, A. Bertaud, M. Paci, A. Leroyer, et al. (2017). "Soluble CD146, an innovative and non-invasive biomarker of embryo selection for in vitro fertilization." PLoS One 12(3): e0173724. 202. S. P. DiTroia, M. Percharde, M. J. Guerquin, E. Wall, E. Collignon, K. T. Ebata, et al. (2019). "Maternal vitamin C regulates reprogramming of DNA methylation and germline development." Nature 573(7773): 271-275. 203. J. B. Holm, S. Mazaud-Guittot, N. B. Danneskiold-Samsoe, C. Chalmey, B. Jensen, M. M. Norregard, et al. (2016). "Intrauterine Exposure to Paracetamol and Aniline Impairs Female Reproductive Development by Reducing Follicle Reserves and Fertility." Toxicol Sci 150(1): 178-189.


7 APPENDICES

Papers included in this thesis

1. Qing Zhou, Taifu Wang, Lizhi Leng, Wei Zheng, Jinrong Huang, Fang Fang, Ling Yang, Fang Chen, Ge Lin, Wen-Jing Wang, and Karsten Kristiansen. Single-cell RNA-seq reveals distinct dynamic behavior of sex chromosomes during early human embryogenesis. Molecular Reproduction and Development, 2019; 86: 871–882

2. Qing Zhou, Wei Zheng, Wen-Jing Wang, Karsten Kristiansen, Ge Lin. The phenotypic and genetic landscape of an infertility cohort with recurrent embryonic developmental arrest. Genetics in Medicine. Preliminary Manuscript


Received: 14 February 2019 | Revised: 21 March 2019 | Accepted: 11 April 2019 DOI: 10.1002/mrd.23162

RESEARCH ARTICLE

Single‐cell RNA‐seq reveals distinct dynamic behavior of sex chromosomes during early human embryogenesis

Qing Zhou1,2,3* | Taifu Wang1,2,4* | Lizhi Leng5,6 | Wei Zheng7 | Jinrong Huang1,2 | Fang Fang1,2 | Ling Yang1,2 | Fang Chen1,2,3 | Ge Lin5,6,7,8 | Wen‐Jing Wang1,2† | Karsten Kristiansen1,2,3†

1 BGI‐Shenzhen, Shenzhen, China
2 China National GeneBank, BGI‐Shenzhen, Shenzhen, China
3 Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, Copenhagen, Denmark
4 BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, China
5 Institute of Reproductive and Stem Cell Engineering, School of Basic Medical Science, Central South University, Changsha, China
6 Key Laboratory of Reproductive and Stem Cells Engineering, Ministry of Health, Changsha, China
7 Reproductive & Genetic Hospital of CITIC‐Xiangya, Changsha, China
8 National Engineering and Research Center of Human Stem Cell, Changsha, China

Abstract
Several animal and human studies have demonstrated that sex affects kinetics and metabolism during early embryo development. However, the mechanism governing these differences at the molecular level before the expression of the sex‐determining gene SRY is unknown. We performed a systematic profiling of gene expression comparing male and female embryos using available single‐cell RNA‐sequencing data of 1607 individual cells from 99 human preimplantation embryos, covering development stages from 4‐cell to late blastocyst. We observed consistent chromosome‐wide transcription of autosomes, whereas expression from sex chromosomes exhibits significant differences after embryonic genome activation (EGA). Activation of the Y chromosome is initiated by expression of two genes, RPS4Y1 and DDX3Y, whereas the X chromosome is widely activated, with both copies in females being activated after EGA. In contrast to the stable activation of the Y chromosome, expression of X‐linked genes in females declines at the late blastocyst stage, especially in trophectoderm cells, revealing a rapid process of dosage compensation. This dynamic behavior results in a dosage imbalance between male and female embryos, which influences genes involved in cell cycle, protein translation and metabolism. Our results reveal the dynamics of sex chromosomes expression and silencing during early embryogenesis. Studying sex differences during human embryogenesis, as well as understanding the process of X chromosome inactivation and their effects on the sex bias development of in vitro fertilized embryos, will expand the capabilities of assisted reproductive technology and possibly improve the treatment of infertility and enhance reproductive health.

KEYWORDS assisted reproductive technology, dosage compensation, embryogenesis, sex differences, single‐cell RNA‐seq

Correspondence
Qing Zhou and Karsten Kristiansen, BGI‐Shenzhen, Shenzhen 518083, China, Department of Biology, University of Copenhagen, Universitetsparken 13, 2100 Copenhagen, Denmark. Email: [email protected] (Q.Z.); [email protected] (K.K.); Wen‐Jing Wang, BGI‐Shenzhen, Shenzhen 518083, China. Email: [email protected]

Funding information
Ministry of Science and Technology of the People's Republic of China, Grant/Award Number: 2018YFC1004900; National Natural Science Foundation of China, Grant/Award Number: 81300075; Natural Science Foundation of Guangdong Province, Grant/Award Number: 2014A030313795; Shenzhen Municipal Government of China, Grant/Award Numbers: JCYJ20160429174400950, JCYJ20170412152854656

Abbreviations: ART, assisted reproductive technology; DEGs, differentially expressed genes; EGA, embryonic genome activation; EPI, epiblast; GO, Gene Ontology; IVF, in vitro fertilization; PE, primitive endoderm; RNA‐seq, RNA sequencing; SNV, single nucleotide variant; t‐SNE, t‐distributed stochastic neighbor embedding; TE, trophectoderm; XCI, X chromosome inactivation; Xp, paternal X chromosome.

*Qing Zhou and Taifu Wang contributed equally to this work.

† Wen‐Jing Wang and Karsten Kristiansen are co‐senior authors.

Mol Reprod Dev. 2019;86:871-882. wileyonlinelibrary.com/journal/mrd © 2019 Wiley Periodicals, Inc.

1 | INTRODUCTION

From the moment of fertilization in mammals, the sex of the preimplantation embryo is determined by the spermatozoon carrying either an X or Y chromosome (Alomar et al., 2008; Setti, Figueira, Braga, Iaconelli, & Borges, 2012). However, a skewed sex ratio is an issue of great concern in relation to assisted reproductive technology (ART; Orzack et al., 2015). In recent years, several mammalian and human studies have aimed to molecularly and functionally characterize male and female embryos during in vitro development (Aiken, Swoboda, Skepper, & Johnson, 2004; Serdarogullari et al., 2014). There are three main aspects that differ between male and female embryos: 1) patterns of development, including morphology and gene transcription (B. Huang et al., 2018); 2) kinetics and timing of development, including growth rates (Sato, Xian, Valdivia, & Toyoda, 1995; Tan, Wang, Zhang, An, & Tian, 2016; Valdivia, Kunieda, Azuma, & Toyoda, 1993); and 3) mortality during intrauterine development (Legato, 2017). At the 2‐cell stage the percentage of successful culturing differs between male and female mouse embryos (Sato et al., 1995). Mouse embryos that carry a Y chromosome develop more quickly in vitro than XX embryos (Valdivia et al., 1993). For bovine embryos, addition of embryonic colony stimulating‐factor 2 (CSF2) to the culture medium increases the survival of female embryos at the morula stages, but not male embryos (Hansen, Dobbs, Denicol, & Siqueira, 2016). Moreover, several animal studies have demonstrated that sex affects metabolism during early embryonic development (Alomar et al., 2008; Holm et al., 1998; Kochhar, Peippo, & King, 2001). Thus, much evidence regarding sex differences comes from animal models. For human embryos derived via ART, it has been reported that transferring embryos at a later stage (blastocyst) may increase the percentage of male offspring (Bu et al., 2014; Maalouf, Mincheva, Campbell, & Hardy, 2014; Sotiroska et al., 2015), but this effect was not observed in other studies (P.‐Y. Lin et al., 2010; Weston, Osianlis, Catt, & Vollenhoven, 2009). Male in vitro fertilized (IVF) embryos have been reported to display an increased number of cells and higher metabolic activity than female embryos and to develop at a significantly faster rate (Alfarawati et al., 2011; B. Huang et al., 2018; Ray, Conaghan, Winston, & Handyside, 1995). Thus, even though some observations clearly point to developmental differences between male and female embryos, the molecular mechanisms governing these differences before the expression of the sex‐determining gene SRY remain to be established.

At the stage before implantation, sex‐specific differences in gene expression become apparent. These have been demonstrated initially in genes on the sex chromosomes (at the 8‐cell stage in mouse [Lowe, Gemma, Rakyan, & Holland, 2015]), and later in the autosomes (blastocyst in mouse [Kobayashi et al., 2006]). In bovine embryos, expression of key enzymes involved in establishing genome methylation, as well as histone methylation, is higher in male blastocysts compared to their female counterparts (Bermejo‐Álvarez, Rizos, Rath, Lonergan, & Gutierrez‐Adan, 2008). For humans, although Y‐chromosome‐driven effects have been detected in pluripotent stem cells in a transcriptional study (Ronen & Benvenisty, 2014), a systematic profiling of gene expression comparing male and female embryos during early development is needed.

Another female‐specific epigenetic event during early embryogenesis is X chromosome inactivation (XCI), which occurs to balance the X‐linked gene dosage between males and females (van den Berg et al., 1998). Impaired XCI is one of the major epigenetic barriers preventing correct developmental competence of female embryos (Tan, An et al., 2016), and normally results in early miscarriage and embryonic lethality (Lanasa, Hogge, Kubik, Blancato, & Hoffman, 1999; Sullivan et al., 2003). In mouse, a period of double X chromosome activation occurs between the 4‐cell and 16‐cell stages following a paternal‐specific XCI (Deng, Ramskold, Reinius, & Sandberg, 2014; Gardner, Larman, & Thouas, 2010). The burst of transcription from both X chromosomes results in a proteome exhibiting distinct differences between the sexes. In human, it has been reported that dosage compensation in vitro occurs in all three lineages of blastocyst embryos on Day 7, and that the expression of both X chromosomes is reduced before the random silencing of an entire X chromosome (Petropoulos et al., 2016). However, detailed information on the precise temporal activation and the dosage compensation of the X chromosome during early development is still lacking.

The recent development of single‐cell sequencing technology has allowed us to characterize individual embryonic cells at multiple levels (Gao et al., 2018; Gkountela et al., 2015; F. Guo et al., 2015; Y. Hou et al., 2013; Wu et al., 2018), including the generation of comprehensive transcriptional atlases (Deng et al., 2014; Petropoulos et al., 2016; Xue et al., 2013; Yan et al., 2013). Here we aimed to determine to what extent male and female embryos differ in relation to gene expression levels during early development. By analyzing available transcriptome data, we reveal a dynamic pattern of expression for the sex chromosomes and the potential functional effects of the sex‐specific developmental behavior.

2 | RESULTS

2.1 | Transcriptional profiling reveals differences in expression pattern between sex chromosomes as early as embryonic genome activation

To examine to what extent sex differences affect transcriptional patterns during early human embryogenesis, we collected public sequencing data on the transcriptome of 1607 individual cells from 99 human preimplantation embryos ranging from the 4‐cell stage to late blastocyst (Petropoulos et al., 2016; Yan et al., 2013; Figure 1a; E2–E7). A total of 3–17 embryos and 12–466 cells were analyzed per developmental stage (Figure 1b). We first generated a comprehensive transcriptional map of early embryos during these stages. Dimensionality reduction by t‐distributed stochastic neighbor embedding (t‐SNE) revealed that the primary segregating factor was the time point of development, not the sex, as samples were clearly separated according to embryonic day (Figure 1c). We then investigated expression at the chromosome level comparing male and female embryos. Following embryonic genome activation (EGA), differences in gene expression became apparent. At the 8‐cell stage we observed significant differences in transcription of genes on the sex chromosomes (Figure 1d; p < 10⁻⁵; Mann Whitney Wilcoxon test), and sex‐dependent differences in expression of genes on autosomes became detectable during later stages, especially in late blastocysts (Figure S1; p < 10⁻⁵, Mann Whitney Wilcoxon test).

2.2 | Initial transcription of the Y chromosome is initiated from few marker genes

To further investigate the temporal expression patterns of genes on the sex chromosomes, we profiled the expression of Y‐linked genes. In total, expression of 33 Y‐linked genes was detected during the early embryonic stages (Figure 2a). For example, the transcript from the PCDH11Y gene could be detected in E2 embryos before EGA, whereas downregulation and transcriptional silencing were observed during later development. Notably, the sex‐determining SRY gene was as expected not expressed at these early stages, reflecting that male gonadal differentiation in the developing embryo first occurs postimplantation (Haqq et al., 1994). The majority of the Y‐linked genes exhibited low levels of expression, except two genes, RPS4Y1 and DDX3Y, exhibiting high expression immediately after EGA. These two genes were highly expressed in all male cells, and the specific expression was maintained and even accentuated in the following stages, especially for RPS4Y1 (Figure 2b). Furthermore, we could classify embryos at the 8‐cell stage into two separate groups according to sex solely based on the expression of the RPS4Y1 gene (Figure 2c). These results reveal that only few genes are highly transcribed during the initial activation of the Y chromosome, and the consistent and high expression in all male cells indicates that the RPS4Y1 gene could serve as a potential sex‐specific expression marker in human cleavage embryos.

2.3 | X chromosomes are widely activated during EGA both in male and female embryos

For the X chromosome, we examined the dynamic changes for all expressed genes during the genomic activation process. These X‐linked genes, distributed along the entire chromosome, exhibited activation after EGA both in male and female embryos (Figure 2d). Furthermore, their expression was much higher in females than in males at E3 and E4. Considering the extra X chromosome in females, we used allelic information of X‐linked genes to further investigate whether the higher level of expression reflected the activation of both copies. We performed allelic expression analysis for each common single nucleotide variant (SNV) present in the dbSNP database within each cell using the data set of Yan et al. (2013). Thus, female embryos at the 8‐cell stage showed bi‐allelic expression. Taking the HNRNPH2 gene as an example, both a T and a G allele could be identified from the RNA‐seq data, whereas the transcripts in male embryos harbored only a T allele after EGA (Figure 2e). This was also the case for DDX3X, a gene escaping from X‐inactivation, with an increased expression and approximately 50% of the reads representing expression of the alternative allele in each female cell after EGA. All the above results demonstrate that all X chromosomes, both in male and female, exhibit wide transcriptional activity during the process of EGA. As a result, the transcription of the two copies in females may result in an unbalanced dosage between male and female embryos in these early stages.

2.4 | Sex chromosomes show dynamic behavior during blastocyst formation

For adult females, there is a random XCI to equalize the expression of X‐linked genes with males (Payer & Lee, 2008; Wutz, 2011). As we observed an unbalanced dosage of X chromosome between males and females in the early embryonic stages, we further focused on the process of dosage compensation in females. We evaluated X chromosome expression dynamics in cells of trophectoderm (TE), primitive endoderm (PE), and epiblast (EPI) at the blastocyst stage. As expected, autosomes showed comparable expression in all cells. Contrasting the stable activation of the Y chromosome, expression of the X‐linked genes in female tended to be downregulated with time during the formation of the blastocyst, especially in TE cells (Figure 3a). Since the DNA methylation landscape of specific marker regions on the X chromosome can also reflect the status of gene activation or


inactivation (Allen, Zoghbi, Moseley, Rosenblatt, & Belmont, 1992), we collected methylation data for early embryos from 4‐cell to postimplantation stage (H. Guo et al., 2014) and investigated the DNA methylation level of four reported markers: AR, ZDHHC15, SLITRK4, and PCSK1N (Bertelsen, Tumer, & Ravn, 2011). As expected, the informative region near to PCSK1N was hemimethylated in postimplantation embryos, as one of the X chromosomes had completed the inactivation and became methylated after implantation (Figure S2). Interestingly, we also discovered a low methylation level for these DNA sites in TE cells, comparing

FIGURE 1 Global transcriptome profiling of male and female embryos reveals differences during development. (a) Sequencing data of embryos from different developmental stages included in this study: trophectoderm (TE); epiblast (EPI) and primitive endoderm (PE). (b) Table showing the number of male and female cells and embryos analyzed in this study within each embryonic stage. The number of cells is listed in the table and the numbers in parentheses refer to the number of embryos for each stage and data set. The embryos of the 4‐cell stage are classified as neither male nor female. (c) Two‐dimensional t‐distributed stochastic neighbor embedding (t‐SNE) results of all cells represented by the expression of total genes. Different colors are used to indicate the embryonic day for male (triangle) and female (dot) embryos. (d) Genome‐wide expression per chromosome in the male (light blue) and female (pink) embryos at the E3 stage. Chromosomal RPKM values are calculated as chromosomal reads per kilobase of transcript per million reads mapped (Methods). The significant differences of sex chromosome are defined as p < 10⁻⁵ (two‐sided Mann Whitney Wilcoxon test) and marked with red stars. RPKM: reads per kilobase per million mapped reads


FIGURE 3 Dynamic behavior and imbalanced dosage of sex chromosomes during the formation of blastocyst. (a) Boxplots of total expression of genes on sex chromosomes in individual cells from E5 to E7 blastocysts, including expression from the X chromosomes within the lineages of epiblast (EPI), primitive endoderm (PE), and trophectoderm (TE). An example of chr10 from E6 is presented here as a negative control of autosomes (chrA); p‐value: two‐sided Mann Whitney Wilcoxon test. (b) Bar chart shows the number of DEGs on each chromosome from E3 to E7 in the two datasets (bottom: Petropoulos et al., 2016; top: Yan et al., 2013). The significant enrichment of DEGs on sex chromosomes is marked with star (Fisherʼs exact test, p < 0.001) and the result of chr1 is presented as an example of autosomes on the top. DEG: differentially expressed genes [Color figure can be viewed at wileyonlinelibrary.com]
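The enrichment test named in panel (b) can be illustrated with a small calculation. The sketch below is not the authors' code: it assumes a one-sided Fisher's exact test (the caption does not state the sidedness), implemented as the upper tail of the hypergeometric distribution using only the Python standard library. The function name `enrichment_p_value` and all gene counts are hypothetical.

```python
from math import comb


def enrichment_p_value(degs_on_chrom, genes_on_chrom, total_degs, total_genes):
    """One-sided Fisher's exact test for over-representation.

    Returns the probability of seeing at least `degs_on_chrom` DEGs on a
    chromosome carrying `genes_on_chrom` genes, when `total_degs` DEGs are
    drawn at random from `total_genes` expressed genes.
    """
    upper = min(genes_on_chrom, total_degs)
    # Sum the hypergeometric numerators with exact integers, divide once at the end.
    numerator = sum(
        comb(genes_on_chrom, k) * comb(total_genes - genes_on_chrom, total_degs - k)
        for k in range(degs_on_chrom, upper + 1)
    )
    return numerator / comb(total_genes, total_degs)


# Hypothetical counts for illustration only (not taken from the paper).
p = enrichment_p_value(degs_on_chrom=40, genes_on_chrom=800,
                       total_degs=500, total_genes=20000)
print(f"one-sided p = {p:.3g}")
```

A chromosome would be flagged as enriched when this value falls below the chosen cutoff (the caption uses p < 0.001).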

FIGURE 2 Distinct activation pattern of sex chromosomes during the process of genome activation. (a) Heatmap showing the expression of detected Y‐linked genes in the early stages. Genes with RPKM > 0.5 in at least one cell are defined as expressed and they are sorted by their genomic loci. The mean value of RPKM within each stage is calculated and scaled to z‐score range in [−4,4]; RPKM: reads per kilobase per million mapped reads. (b) Boxplot of the two most highly expressed genes on the Y chromosome indicates high expression in male cells from E2 to E7. The boxes with blue color represent the expression of RPS4Y1 and DDX3Y, respectively. The mean values of all other Y‐linked genes except these two markers are calculated and drawn as gray boxes. (c) Hierarchical clustering for E3 male (blue) and female (red) embryos using the expression of RPS4Y1 shows a consistent classification pattern. The embryos with name starting with “Y” are from the data set of Yan et al. (2013), and the others are from Petropoulos et al. (2016). (d) Heatmap of the average expression of X‐linked genes in males and females during the process of genome activation (from E2 to E4). These genes are sorted by their genomic location and the mean value of RPKM within each stage is scaled to z‐score. A schematic of the X chromosome marked with genomic location of presented genes (gray line) is drawn on the left. E2: embryos at E2; E3_M: male embryos at E3; E3_F: female embryos at E3; E4_M: male embryos at E4; E4_F: female embryos at E4. (e) Histogram of reads ratio of the two representative genes on the X chromosome with biallelic expression in female embryos after EGA. The exact read numbers originating from each allele are marked above each bar, and the heatmap under the bars represents their expression in each cell (with a range from blue to red to show the increase of expression). The two informative loci are rs41307260 and rs5963597. RPKM: reads per kilobase per million mapped reads

with the nonmethylated landscape in PE cells (or inner cell mass (ICM)). The significant decline in expression and the specific methylation pattern reveal a rapid process of dosage compensation of X chromosome in TE cells of E7 embryos.

2.5 | Dynamic behavior of sex chromosomes leads to differentially expressed genes and regulate multiple biological processes

As shown above RPS4Y1 exhibits high expression in male embryos at the time of EGA. We also found that its paralogous gene RPS4X, which is the first gene on the long arm of the X chromosome known to escape from X inactivation, had significantly higher expression in female embryos (Figure S3A; Mann Whitney Wilcoxon test). Interestingly, these two ribosomal protein S4 genes exhibited a balanced dosage, especially for E3 and E4 embryos (Figure S3B). However, when analyzing the expression of all ribosome related genes, we found a group of genes that differed in dosage between the two sexes (Figure S3C). These results indicate that even though there is a sex‐specific expression pattern of the sex chromosomes, the dosage for a single gene or gene family may vary.

Next, we performed differential expression analysis comparing male and female cells within each embryonic stage. To adjust for batch effects between different datasets, we defined differentially expressed genes (DEGs) on each data set separately. In agreement with previously reported results (Petropoulos et al., 2016), the majority of DEGs defined in the dataset of Petropoulos et al. (2016) were located on the X chromosome from E3 to E5 (Figure 3b; Tables S1 and S2), whereas more DEGs encoded on the autosomes were detected at E6 and E7. In the data set of Yan et al. (2013), the absolute number of DEGs on autosomes and sex chromosomes was comparable, but still there was a significant enrichment of DEGs on the sex chromosomes (Figure 3b; Figure S4A‐B; p < 0.001; Fisherʼs exact test).

In total, we defined between 500–2500 DEGs at each stage, with a subset appearing at multiple stages (Figure 4a). Functional annotation based on Gene Ontology (GO) revealed stage‐specific functions for these DEGs (Figure S5; Table S3). At the 8‐cell stage, these genes are mainly involved in cell cycle control, cell division, and chromosomal segregation. Later in the morula stage, DEGs are associated with processes involved in chromatin organization. During formation of the blastocyst, sex‐dependent differences in gene expression are related to chromatin assembly, translation elongation, metabolism, lipid transport, and neuron differentiation. The ontology clusters similarly revealed a stage‐specific enrichment of various functions (Figure 4b; Table S4). The most enriched clusters showed a network mainly controlling the cell cycle, metabolic processes, and development morphogenesis (Figure 4c). All these results indicate that expression differences between males and females are manifest already during early developmental stages of embryogenesis, and thus, regulate various biological processes of development.

3 | DISCUSSION

Our study provides comprehensive information on expression differences between male and female IVF embryos during early development. The inclusion of a large number of embryos provides evidence of prominent transcriptional differences between sex chromosomes. Our analysis demonstrates distinct activation patterns of the sex chromosomes during early embryogenesis, with initial activation of few genes on the Y chromosome and activity of a broad region on the X chromosome. Thus, RPS4Y1 exhibits high expression at the time of EGA and shows a sex‐specific expression pattern. In humans, RPS4Y1 is one of the variants encoding the ribosomal protein S4 (RPS4), and its paralogous gene RPS4X is the first gene on the long arm of the X chromosome known to escape from X inactivation (Fisher et al., 1990). The amino acid differences between the proteins encoded by these two genes result in the generation of two distinct, but functionally equivalent, forms of ribosomes (Zinn, Alagappan, Brown, Wool, & Page, 1994). In contrast to the silencing of the homologous genes in mouse (Zinn et al., 1991), it has been assumed that normal human development requires at least two RPS4 genes per cell; two RPS4X in female cells and one RPS4X and one RPS4Y in male cells. It has been reported that haploinsufficiency of the RPS4 genes may play a role in Turner syndrome (Watanabe, Zinn, Page, & Nishimoto, 1993). The high transcription of RPS4Y1 helps to balance the dosage between the sexes (Andrés et al., 2008) at the early stage in spite of the two‐fold dosage of RPS4X in female cells after EGA due to the activation of both X chromosomes. Thus, expression of RPS4Y1 may be used as a potential expression marker to distinguish embryo sex at these stages, earlier than the expression of other sex‐determining genes. The other first activated gene on the Y chromosome, DDX3Y, belongs to the RNA helicase family. The protein encoded by this gene shares high similarity to DDX3X, on the X chromosome, while their functions differ (Rosner, Paz, & Rinkevich, 2006). As a result, activation of this gene in the early stage may lead to a male‐specific RNA metabolism and downstream regulation, such as neuronal differentiation (Vakilian et al., 2015).

Our study revealed that most X‐linked genes become transcriptionally active concomitant with completion of EGA in all embryos. In addition, both copies of the X‐chromosomes in females are activated. It has generally been assumed that the germline‐inactivated X might be passed onto the offspring, as repetitive elements on the paternal X chromosome are suppressed in two‐cell mouse embryos (Cooper, 1971; Huynh & Lee, 2003). However, de novo inactivation of the paternal X chromosome in mouse embryos has been reported (Okamoto et al., 2005; Okamoto, Otte, Allis, Reinberg, & Heard, 2004), with a reinactivation taking place after the 4‐cell stage (Deng et al., 2014). For humans, we know that beyond completion of EGA at E4, female cells possess two active X chromosomes (Petropoulos et al., 2016). From the comparison of transcription and the allelic expression analysis, our analysis demonstrates that the two copies of X chromosomes in females are widely activated immediately after genome activation from the 4‐cell to the 8‐cell stage at E3.
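The allelic expression analysis referred to here amounts to comparing, per cell, the reads that support each allele at a heterozygous SNV. The following is a minimal sketch under stated assumptions, not the pipeline used in the study: the function name `classify_allelic_expression`, the minimum read depth and the minor-allele cutoff are all hypothetical choices, and the read counts are invented for illustration.

```python
def classify_allelic_expression(ref_reads, alt_reads,
                                min_reads=10, min_minor_fraction=0.1):
    """Label one SNV in one cell as biallelic, monoallelic or uninformative.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    total = ref_reads + alt_reads
    if total < min_reads:
        # Too few reads to tell allelic dropout from true monoallelic expression.
        return "uninformative"
    minor_fraction = min(ref_reads, alt_reads) / total
    return "biallelic" if minor_fraction >= min_minor_fraction else "monoallelic"


# Hypothetical (reference, alternative) read counts per cell.
cells = {"female_cell_1": (12, 11), "male_cell_1": (30, 0), "low_depth_cell": (2, 1)}
for name, (ref, alt) in cells.items():
    print(name, classify_allelic_expression(ref, alt))
```

With a fraction near 0.5 for the alternative allele, as reported for DDX3X in female cells, a locus would be labeled biallelic under any reasonable cutoff; the cutoff matters mainly for low-coverage cells.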

[Figure 4, panels (a)–(c). Panel (b) heatmap columns: E3, E4, E5, E6, E7. Rows (enriched terms): Cell Cycle; Eukaryotic Translation Elongation; organelle fission; negative regulation of cellular component organization; vasculature development; microtubule−based process; Cell cycle Checkpoints; The citric acid (TCA) cycle and respiratory electron transport; tissue morphogenesis; extracellular structure organization; response to wounding; positive regulation of cell death; response to toxic substance; mitochondrion organization; gland development; purine ribonucleoside monophosphate metabolic process; peptide biosynthetic process; apoptotic signaling pathway; transmembrane receptor protein tyrosine kinase signaling pathway; cytokine production]

FIGURE 4 Stage‐specific function and network of enriched Gene Ontology. (a) The Circos plot shows the overlap of DEGs from multiple stages. Dark orange color and purple lines represent the same DEGs that appear in multiple stages. Blue lines link the different genes where they fall into the same enriched ontology term (with size no larger than 100). (b) Heatmap of top 20 function clusters with their representative enriched terms across multiple stages, colored by terms. A white block indicates no significant enrichment at that stage for a given cluster. Process enrichment analysis was carried out by Metascape (Methods). (c) Network of enriched clusters, where nodes that share the same cluster identity are typically close to each other. The size is proportional to the number of genes included in the term and the thickness of the edge represents the similarity score. The color of each cluster represents its cluster identity, the same as labels in (b). DEGs: differentially expressed genes [Color figure can be viewed at wileyonlinelibrary.com]
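The overlap shown in panel (a), DEGs that recur at more than one stage, can be computed from per-stage gene lists with a simple count. This is only a sketch of that bookkeeping step; the helper name `shared_degs` and the placeholder gene identifiers are hypothetical and do not come from the study's tables.

```python
from collections import Counter


def shared_degs(degs_by_stage, min_stages=2):
    """Return {gene: number of stages} for genes that are DEGs in at least `min_stages` stages."""
    # set() guards against a gene being listed twice within one stage.
    counts = Counter(gene for genes in degs_by_stage.values() for gene in set(genes))
    return {gene: n for gene, n in counts.items() if n >= min_stages}


# Placeholder gene lists for illustration only.
example = {
    "E3": ["GENE_A", "GENE_B"],
    "E4": ["GENE_B", "GENE_C"],
    "E5": ["GENE_A", "GENE_B"],
}
print(shared_degs(example))
```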

The extensive datasets investigated in the study indicate an imbalanced dosage of sex chromosomes from the EGA to the forming of the blastocyst; dosage compensation of the X chromosome in females then occurs, especially in TE cells. In mice, the imprinted inactivation of the paternal X (Xp) chromosome occurs beyond the 4‐cell stage (Deng et al., 2014). Inactivity of the Xp is maintained in the TE but is reversed randomly in the ICM of the blastocyst (Mak et al., 2004; Okamoto et al., 2004). Key genes, including Atrx, which are involved in chromatin remodeling and heterochromatin formation and play a central role in the X‐chromosome inactivation process, have been found to be expressed in TE cells, but not in other cell types (EPI). For humans, it has been reported that dosage compensation of the X chromosome occurs in all three lineages at E7, with a biallelic expression dampening (Petropoulos et al., 2016). Our finding of the rapid decline of the expression of X‐linked genes in TE cells raises the question as to whether lineage‐specific factors, similar to the situation in mouse, can regulate this process. The fast dosage compensation of the X chromosome in TE cells, where the first interaction between embryos and uterus occurs during implantation, may result in a balanced dosage between the embryo and the maternal endometrium. Thus, it may be beneficial in relation to implantation as skewed X‐chromosome inactivation is associated with recurrent miscarriage (Dasoula et al., 2008). Understanding the relationship between the biallelic expression dampening of human IVF embryos and the random XCI in adults is an important question for the future.

The existence of sex‐specific dynamic behavior during early embryogenesis may lead to various dosages of gene expression. Thus, part of the DEGs represents genes involved in translation elongation and metabolism. The main biological processes regulated by DEGs further include cell cycle, chromatin assembly, and lipid transport. Since a bias in the sex‐ratio in connection with ART has been reported and male IVF embryos are reported to display a faster development rate and higher metabolic activity than female embryos (Alfarawati et al., 2011; B. Huang et al., 2018; Ray et al., 1995), our findings may suggest a potential correlation between sex‐specific gene expression pattern and the particular behavior of early embryos. However, due to the biological and technical stochastic nature of single‐cell data, some low‐expressed genes or allele information could be missed. Thus, single‐cell sequencing technology still needs to be improved and further validation experiments are required.

4 | CONCLUSIONS

In conclusion, we provide a comprehensive comparison of the transcriptional atlases of male and female human preimplantation embryos and reveal the dynamics of sex chromosomes expression and silencing during embryogenesis. Precocious imbalanced dosage of X chromosome and decrease in the development rate for IVF female embryos may account for the observed preferential female mortality at early stages and the biased sex ratio in connection with ART (Alfarawati et al., 2011; B. Huang et al., 2018; Ray et al., 1995; Tarin, Garcia‐Perez, Hermenegildo, & Cano, 2014).

The importance of sex differences in relation to disease risk or symptoms and responses to medical treatment is now recognized (S. Lin et al., 2019; Yang et al., 2019). Studying sex differences during human embryogenesis, as well as understanding the process of dosage compensation of the X chromosome and the effects on sex biases in development of IVF embryos, will further knowledge within the field of ART and possibly improve the treatment of infertility and enhance reproductive health. In addition, this study of sex differences in gene expression in early embryos and the potentially functional effects will also provide a basis for further experiments on how environmental impact during early developmental stages can elicit profound and lasting effects that are different in male and female offspring.

5 | MATERIALS AND METHODS

5.1 | Ethical approval

Analyses performed at BGI comprised bioinformatics analysis of public sequencing data, approved by the Institutional Review Board on Bioethics and Biosafety of BGI (IRB 13067).

5.2 | Data collection

Single‐cell RNA‐seq data, processed data, and raw reads reported in connection with human early embryonic development were downloaded from two publicly available datasets: GSE36552 (78 cells from 4‐cell to late blastocyst; Yan et al., 2013) and ArrayExpress: E‐MTAB‐3929 (1529 cells from E3 to E7 embryos; Petropoulos et al., 2016). The DNA methylation data of human embryos was from GSE49828 (covering developmental stages from 4‐cell stage to postimplantation; H. Guo et al., 2014).

5.3 | Sequencing data processing and expression profiling

For RNA‐seq data of Yan et al. (2013), raw reads were mapped to the human reference genome (hg19) using TopHat (Pollier, Rombauts, & Goossens, 2013; Trapnell, Pachter, & Salzberg, 2009) with default settings after removing the low‐quality reads. Only uniquely mapped reads were kept for further analysis. Raw read counts per gene were calculated by HTSeq (Anders, Pyl, & Huber, 2015; Shahriyari, 2017). Then the table was combined with the raw count file from Petropoulos et al. (2016), and the expression values (RPKM, reads per kilobase per million mapped reads) were estimated using Cufflinks (Pollier et al., 2013) with the annotation of RefSeq (Table S5). Chromosome‐level expression was calculated as chromosomal RPKM, that is, reads per kilobase of coding region within the chromosome per million mapped reads.

5.4 | Inference of embryonic sex and cell lineage

Information on sex for each cell and embryo after EGA was classified as previously described (Petropoulos et al., 2016). Embryos at E2 were considered neither male nor female. The embryos in the DNA methylation data set were classified based on the number of detected loci on the Y chromosome, using the average number in oocytes and sperm as a baseline for female and male samples, respectively (Figure S2A). We considered the three lineages of cells at the blastocyst stage as reported by the authors (Petropoulos et al., 2016).
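The chromosome‐level expression measure described in Section 5.3 can be illustrated with a minimal Python sketch. It assumes that the library size is the total number of mapped reads in a cell and that the coding length of each chromosome is known in base pairs; the function names and the read counts and lengths below are hypothetical and are not taken from the study.

```python
def chromosome_rpkm(read_count, coding_length_bp, total_mapped_reads):
    """Reads per kilobase of coding region per million mapped reads."""
    if coding_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("coding length and library size must be positive")
    kilobases = coding_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)


def per_chromosome_rpkm(reads_by_chrom, coding_bp_by_chrom):
    """Chromosomal RPKM for one cell.

    Assumption: the library size is the sum of reads over all chromosomes.
    """
    total = sum(reads_by_chrom.values())
    return {
        chrom: chromosome_rpkm(count, coding_bp_by_chrom[chrom], total)
        for chrom, count in reads_by_chrom.items()
    }


# Hypothetical toy cell with two chromosomes (illustrative numbers only).
reads = {"chrX": 2_000, "chr1": 8_000}
coding_bp = {"chrX": 1_000_000, "chr1": 2_000_000}
rpkm = per_chromosome_rpkm(reads, coding_bp)
```

Normalizing by both coding length and library size makes the values comparable between chromosomes of different gene content and between cells sequenced to different depths, which is what the per‐chromosome male versus female comparison relies on.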

5.5 | Analyses of allelic expression

The alignment of raw sequencing reads to the human genome was performed by BWA (Li & Durbin, 2010). We then used the mpileup function in SAMtools (Li, 2011) to retrieve allelic read counts in the RNA‐seq data for common variants in dbSNP 151 (Sherry et al., 2001), and intergenic SNVs were excluded using ANNOVAR (Wang, Li, & Hakonarson, 2010). To obtain the total read counts for each site, we ran the mpileup program without base quality correction and filtering.

5.6 | Sex differential expression analysis

Differential expression analysis was performed for each stage comparing male and female cells. p values were calculated using DESeq2 (Anders & Huber, 2010), and a significance cut‐off of adjusted p < 0.05 was used. A cut‐off of a two‐fold change in expression was used to define DEGs. We performed this analysis for each data set and stage separately, and then used the union of DEGs within each stage for further annotation.

5.7 | Function annotation and process enrichment analyses

The functional annotation was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID; D. W. Huang, Sherman, & Lempicki, 2008) Bioinformatics Resource. GO terms for each stage were plotted by the GOplot package in R and summarized to a representative term. The process enrichment analyses were performed using Metascape (http://metascape.org/gp/index.html#/main/Step1; Tripathi et al., 2015), with ontology sources including Reactome Gene Sets, Canonical Pathways, BioCarta Gene Sets, GO Biological Processes, Hallmark Gene Sets, and KEGG Pathway. Terms with p < 0.01 and a minimum count of three were collected and grouped into clusters based on their membership similarities. In the network cluster results, terms with a similarity score > 0.3 were linked by an edge. The network was visualized with Cytoscape (v3.1.2) with “force‐directed” layout and with edges bundled for clarity.

5.8 | Statistical analysis

Mann‐Whitney‐Wilcoxon analyses were performed in R. For identification of DEGs, we used an adjusted p < 0.05 as cut‐off. In the functional analysis, only GO terms with p < 0.01 were included.

ACKNOWLEDGEMENTS

The authors thank Z. Liu, X. Yang, Y. Xing, J. Sun, and members of the group for helpful discussions; C. Ye for data download and management; H. Zhong for constructive advice about the final manuscript. This project is supported by the National Key R&D Program of China (2018YFC1004900), National Natural Science Foundation of China (81300075), Natural Science Foundation of Guangdong Province (2014A030313795), Shenzhen Municipal Government of China (JCYJ20170412152854656, JCYJ20160429174400950).

CONFLICTS OF INTEREST

The authors declare that there are no conflicts of interest.

AUTHOR CONTRIBUTIONS

Q. Z., W. J. W., and K. K. conceived and designed the study; Q. Z., T. W., J. H., L. Y., F. F., L. L., and W. Z. performed the data analysis; F. C. and G. L. oversaw the study; Q. Z. and T. W. prepared the figures; Q. Z. wrote the first version of the manuscript; Q. Z., W. J. W., and K. K. revised the manuscript; all authors reviewed the final version of the manuscript.

ORCID

Qing Zhou http://orcid.org/0000-0002-9900-3353
Ge Lin http://orcid.org/0000-0002-3877-2546

REFERENCES

Aiken, C. E. M., Swoboda, P. P. L., Skepper, J. N., & Johnson, M. H. (2004). The direct measurement of embryogenic volume and nucleo‐cytoplasmic ratio during mouse pre‐implantation development. Reproduction, 128(5), 527–535. https://doi.org/10.1530/rep.1.00281
Alfarawati, S., Fragouli, E., Colls, P., Stevens, J., Gutiérrez‐Mateo, C., Schoolcraft, W. B., … Wells, D. (2011). The relationship between blastocyst morphology, chromosomal abnormality, and embryo gender. Fertility and Sterility, 95(2), 520–524. https://doi.org/10.1016/j.fertnstert.2010.04.003
Allen, R. C., Zoghbi, H. Y., Moseley, A. B., Rosenblatt, H. M., & Belmont, J. W. (1992). Methylation of HpaII and HhaI sites near the polymorphic CAG repeat in the human androgen‐receptor gene correlates with X chromosome inactivation. American Journal of Human Genetics, 51(6), 1229–1239.
Alomar, M., Tasiaux, H., Remacle, S., George, F., Paul, D., & Donnay, I. (2008). Kinetics of fertilization and development, and sex ratio of bovine embryos produced using the semen of different bulls. Animal Reproduction Science, 107(1), 48–61. https://doi.org/10.1016/j.anireprosci.2007.06.009
Anders, S., & Huber, W. (2010). Differential expression analysis for sequence count data. Genome Biology, 11(10), R106. https://doi.org/10.1186/gb‐2010‐11‐10‐r106
Anders, S., Pyl, P. T., & Huber, W. (2015). HTSeq‐‐a Python framework to work with high‐throughput sequencing data. Bioinformatics, 31(2), 166–169. https://doi.org/10.1093/bioinformatics/btu638
Andrés, O., Kellermann, T., López‐Giráldez, F., Rozas, J., Domingo‐Roura, X., & Bosch, M. (2008). RPS4Y gene family evolution in primates. BMC Evolutionary Biology, 8, 142–142. https://doi.org/10.1186/1471‐2148‐8‐142
van den Berg, I. M., Laven, J. S. E., Stevens, M., Jonkers, I., Galjaard, R.‐J., Gribnau, J., & Hikke van Doorninck, J. (1998). X Chromosome Inactivation Is Initiated in Human Preimplantation Embryos. The American Journal of Human Genetics, 84(6), 771–779. https://doi.org/10.1016/j.ajhg.2009.05.003
Bermejo‐Álvarez, P., Rizos, D., Rath, D., Lonergan, P., & Gutierrez‐Adan, A. (2008). Epigenetic differences between male and female bovine blastocysts produced in vitro. Physiological Genomics, 32(2), 264–272.

Bertelsen, B., Tumer, Z., & Ravn, K. (2011). Three new loci for determining x chromosome inactivation patterns. Journal of Molecular Diagnostics, 13(5), 537–540. https://doi.org/10.1016/j.jmoldx.2011.05.003
Bu, Z., Chen, Z. J., Huang, G., Zhang, H., Wu, Q., Ma, Y., … Sun, Y. (2014). Live birth sex ratio after in vitro fertilization and embryo transfer in China‐‐an analysis of 121,247 babies from 18 centers. PLoS One, 9(11), e113522. https://doi.org/10.1371/journal.pone.0113522
Cooper, D. W. (1971). Directed genetic change model for X chromosome inactivation in eutherian mammals. Nature, 230(5292), 292–294.
Dasoula, A., Kalantaridou, S., Sotiriadis, A., Pavlou, M., Georgiou, I., Paraskevaidis, E., … Syrrou, M. (2008). Skewed X‐Chromosome Inactivation in Greek Women with idiopathic recurrent miscarriage. Fetal Diagnosis and Therapy, 23(3), 198–203.
Deng, Q., Ramskold, D., Reinius, B., & Sandberg, R. (2014). Single‐cell RNA‐seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science, 343(6167), 193–196. https://doi.org/10.1126/science.1245316
Fisher, E. M. C., Beer‐Romero, P., Brown, L. G., Ridley, A., McNeil, J. A., Lawrence, J. B., … Page, D. C. (1990). Homologous ribosomal protein genes on the human X and Y chromosomes: Escape from X inactivation and possible implications for turner syndrome. Cell, 63(6), 1205–1218. https://doi.org/10.1016/0092‐8674(90)90416‐C
Gao, L., Wu, K., Liu, Z., Yao, X., Yuan, S., Tao, W., … Liu, J. (2018). Chromatin accessibility landscape in human early embryos and its association with evolution. Cell, 173(1), 248–259.e215. https://doi.org/10.1016/j.cell.2018.02.028
Gardner, D. K., Larman, M. G., & Thouas, G. A. (2010). Sex‐related physiology of the preimplantation embryo. Molecular Human Reproduction, 16(8), 539–547. https://doi.org/10.1093/molehr/gaq042
Gkountela, S., Zhang, K. X., Shafiq, T. A., Liao, W. W., Hargan‐Calvopina, J., Chen, P. Y., & Clark, A. T. (2015). DNA demethylation dynamics in the human prenatal germline. Cell, 161(6), 1425–1436. https://doi.org/10.1016/j.cell.2015.05.012
Guo, F., Yan, L., Guo, H., Li, L., Hu, B., Zhao, Y., … Qiao, J. (2015). The transcriptome and DNA methylome landscapes of human primordial germ cells. Cell, 161(6), 1437–1452. https://doi.org/10.1016/j.cell.2015.05.015
Guo, H., Zhu, P., Yan, L., Li, R., Hu, B., Lian, Y., … Qiao, J. (2014). The DNA methylation landscape of human early embryos. Nature, 511(7511), 606–610. https://doi.org/10.1038/nature13544
Hansen, P. J., Dobbs, K. B., Denicol, A. C., & Siqueira, L. G. (2016). Sex and the preimplantation embryo: Implications of sexual dimorphism in the preimplantation period for maternal programming of embryonic development. Cell and Tissue Research, 363(1), 237–247. https://doi.org/10.1007/s00441‐015‐2287‐4
Haqq, C. M., King, C. Y., Ukiyama, E., Falsafi, S., Haqq, T. N., Donahoe, P. K., & Weiss, M. A. (1994). Molecular basis of mammalian sexual determination: Activation of Mullerian inhibiting substance gene expression by SRY. Science, 266(5190), 1494–1500.
Holm, P., Shukri, N. N., Vajta, G., Booth, P., Bendixen, C., & Callesen, H. (1998). Developmental kinetics of the first cell cycles of bovine in vitro produced embryos in relation to their in vitro viability and sex. Theriogenology, 50(8), 1285–1299. https://doi.org/10.1016/S0093‐691X(98)00227‐1
Hou, Y., Fan, W., Yan, L., Li, R., Lian, Y., Huang, J., … Qiao, J. (2013). Genome analyses of single human oocytes. Cell, 155(7), 1492–1506. https://doi.org/10.1016/j.cell.2013.11.040
Huang, B., Ren, X., Zhu, L., Wu, L., Tan, H., Guo, N., … Jin, L. (2018). Is differences in embryo morphokinetic development significantly associated with human embryo sex? Biology of Reproduction, 100, ioy229–ioy229. https://doi.org/10.1093/biolre/ioy229
Huang, D. W., Sherman, B. T., & Lempicki, R. A. (2008). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols, 4, 44–57. https://doi.org/10.1038/nprot.2008.211
Huynh, K. D., & Lee, J. T. (2003). Inheritance of a pre‐inactivated paternal X chromosome in early mouse embryos. Nature, 426(6968), 857–862. https://doi.org/10.1038/nature02222
Kobayashi, S., Isotani, A., Mise, N., Yamamoto, M., Fujihara, Y., Kaseda, K., … Okabe, M. (2006). Comparison of gene expression in male and female mouse blastocysts revealed imprinting of the X‐linked gene at preimplantation stages. Current Biology, 16(2), 166–172. https://doi.org/10.1016/j.cub.2005.11.071
Kochhar, H. P., Peippo, J., & King, W. A. (2001). Sex related embryo development. Theriogenology, 55(1), 3–14.
Lanasa, M. C., Hogge, W. A., Kubik, C., Blancato, J., & Hoffman, E. P. (1999). Highly skewed X‐chromosome inactivation is associated with idiopathic recurrent spontaneous abortion. The American Journal of Human Genetics, 65(1), 252–254. https://doi.org/10.1086/302441
Legato, M. J. (2017). Principles of gender‐specific medicine: gender in the genomic era. Elsevier: Academic Press, 292–293.
Li, H. (2011). A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics, 27(21), 2987–2993. https://doi.org/10.1093/bioinformatics/btr509
Li, H., & Durbin, R. (2010). Fast and accurate long‐read alignment with Burrows–Wheeler transform. Bioinformatics, 26(5), 589–595. https://doi.org/10.1093/bioinformatics/btp698
Lin, P.‐Y., Huang, F.‐J., Kung, F.‐T., Wang, L.‐J., Chang, S. Y., & Lan, K.‐C. (2010). Comparison of the offspring sex ratio between cleavage stage embryo transfer and blastocyst transfer. Taiwanese Journal of Obstetrics and Gynecology, 49(1), 35–39. https://doi.org/10.1016/S1028‐4559(10)60006‐X
Lin, S., Liu, Y., Goldin, L. R., Lyu, C., Kong, X., Zhang, Y., … Gao, Y. (2019). Sex‐related DNA methylation differences in B cell chronic lymphocytic leukemia. Biology of Sex Differences, 10(1), 2. https://doi.org/10.1186/s13293‐018‐0213‐7
Lowe, R., Gemma, C., Rakyan, V. K., & Holland, M. L. (2015). Sexually dimorphic gene expression emerges with embryonic genome activation and is dynamic throughout development. BMC Genomics, 16(1), 295. https://doi.org/10.1186/s12864‐015‐1506‐4
Maalouf, W. E., Mincheva, M. N., Campbell, B. K., & Hardy, I. C. (2014). Effects of assisted reproductive technologies on human sex ratio at birth. Fertility and Sterility, 101(5), 1321–1325. https://doi.org/10.1016/j.fertnstert.2014.01.041
Mak, W., Nesterova, T. B., de Napoles, M., Appanah, R., Yamanaka, S., Otte, A. P., & Brockdorff, N. (2004). Reactivation of the paternal X chromosome in early mouse embryos. Science, 303(5658), 666–669. https://doi.org/10.1126/science.1092674
Okamoto, I., Otte, A. P., Allis, C. D., Reinberg, D., & Heard, E. (2004). Epigenetic dynamics of imprinted X inactivation during early mouse development. Science, 303(5658), 644–649. https://doi.org/10.1126/science.1092727
Okamoto, I., Arnaud, D., Le Baccon, P., Otte, A. P., Disteche, C. M., Avner, P., & Heard, E. (2005). Evidence for de novo imprinted X‐chromosome inactivation independent of meiotic inactivation in mice. Nature, 438(7066), 369–373. https://doi.org/10.1038/nature04155
Orzack, S. H., Stubblefield, J. W., Akmaev, V. R., Colls, P., Munne, S., Scholl, T., … Zuckerman, J. E. (2015). The human sex ratio from conception to birth. Proceedings of the National Academy of Sciences of the United States of America, 112(16), E2102–E2111. https://doi.org/10.1073/pnas.1416546112
Payer, B., & Lee, J. T. (2008). X Chromosome Dosage Compensation: How Mammals Keep the Balance. Annual Review of Genetics, 42(1), 733–772. https://doi.org/10.1146/annurev.genet.42.110807.091711
Petropoulos, S., Edsgard, D., Reinius, B., Deng, Q., Panula, S. P., Codeluppi, S., … Lanner, F. (2016). Single‐cell RNA‐seq reveals lineage and X chromosome dynamics in human preimplantation embryos. Cell, 167(1), 285. https://doi.org/10.1016/j.cell.2016.08.009

Pollier, J., Rombauts, S., & Goossens, A. (2013). Analysis of RNA‐seq data with TopHat and cufflinks for genome‐wide expression analysis of jasmonate‐treated plants and plant cultures. Methods in Molecular Biology, 1011, 305–315. https://doi.org/10.1007/978‐1‐62703‐414‐2_24
Ray, P. F., Conaghan, J., Winston, R. M., & Handyside, A. H. (1995). Increased number of cells and metabolic activity in male human preimplantation embryos following in vitro fertilization. Journal of Reproduction and Fertility, 104(1), 165–171.
Ronen, D., & Benvenisty, N. (2014). Sex‐dependent gene expression in human pluripotent stem cells. Cell Reports, 8(4), 923–932. https://doi.org/10.1016/j.celrep.2014.07.013
Rosner, A., Paz, G., & Rinkevich, B. (2006). Divergent roles of the DEAD‐box protein BS‐PL10, the urochordate homologue of human DDX3 and DDX3Y proteins, in colony astogeny and ontogeny. Developmental Dynamics, 235(6), 1508–1521. https://doi.org/10.1002/dvdy.20728
Sato, E., Xian, M., Valdivia, R. P., & Toyoda, Y. (1995). Sex‐linked differences in developmental potential of single blastomeres from in vitro‐fertilized 2‐cell stage mouse embryos. Hormone Research, 44(Suppl 2), 4–8.
Serdarogullari, M., Findikli, N., Goktas, C., Sahin, O., Ulug, U., Yagmur, E., & Bahceci, M. (2014). Comparison of gender‐specific human embryo development characteristics by time‐lapse technology. Reproductive BioMedicine Online, 29(2), 193–199. https://doi.org/10.1016/j.rbmo.2014.03.026
Setti, A. S., Figueira, R. C., Braga, D. P., Iaconelli, A., Jr., & Borges, E., Jr. (2012). Gender incidence of intracytoplasmic morphologically selected sperm injection‐derived embryos: A prospective randomized study. Reproductive BioMedicine Online, 24(4), 420–423. https://doi.org/10.1016/j.rbmo.2012.01.007
Shahriyari, L. (2017). Effect of normalization methods on the performance of supervised learning algorithms applied to HTSeq‐FPKM‐UQ data sets: 7SK RNA expression as a predictor of survival in patients with colon adenocarcinoma. Briefings in Bioinformatics. https://doi.org/10.1093/bib/bbx153
Sherry, S. T., Ward, M. H., Kholodov, M., Baker, J., Phan, L., Smigielski, E. M., & Sirotkin, K. (2001). dbSNP: The NCBI database of genetic variation. Nucleic Acids Research, 29(1), 308–311.
Sotiroska, V., Petanovski, Z., Dimitrov, G., Hadji‐Lega, M., Shushleski, D., Saltirovski, S., … Johansson, L. (2015). The day of embryo transfer affects delivery rate, birth weights, female‐to‐male ratio, and monozygotic twin rate. Taiwanese Journal of Obstetrics & Gynecology, 54(6), 716–721. https://doi.org/10.1016/j.tjog.2015.06.011
Sullivan, A. E., Lewis, T., Stephenson, M., Odem, R., Schreiber, J., Ober, C., & Branch, D. W. (2003). Pregnancy outcome in recurrent miscarriage patients with skewed X chromosome inactivation. Obstetrics and Gynecology, 101(6), 1236–1242.
Tan, K., Wang, Z., Zhang, Z., An, L., & Tian, J. (2016). IVF affects embryonic development in a sex‐biased manner in mice. Reproduction, 151(4), 443–453. https://doi.org/10.1530/REP‐15‐0588
Tan, K., An, L., Miao, K., Ren, L., Hou, Z., Tao, L., … Tian, J. (2016). Impaired imprinted X chromosome inactivation is responsible for the skewed sex ratio following in vitro fertilization. Proceedings of the National Academy of Sciences of the United States of America, 113(12), 3197–3202. https://doi.org/10.1073/pnas.1523538113
Tarin, J. J., Garcia‐Perez, M. A., Hermenegildo, C., & Cano, A. (2014). Changes in sex ratio from fertilization to birth in assisted‐reproductive‐treatment cycles. Reproductive Biology and Endocrinology, 12, 56. https://doi.org/10.1186/1477‐7827‐12‐56
Trapnell, C., Pachter, L., & Salzberg, S. L. (2009). TopHat: Discovering splice junctions with RNA‐Seq. Bioinformatics, 25(9), 1105–1111. https://doi.org/10.1093/bioinformatics/btp120
Tripathi, S., Pohl, M. O., Zhou, Y., Rodriguez‐Frandsen, A., Wang, G., Stein, D. A., … Chanda, S. K. (2015). Meta‐ and orthogonal integration of influenza "OMICs" data defines a role for UBR4 in virus budding. Cell Host & Microbe, 18(6), 723–735. https://doi.org/10.1016/j.chom.2015.11.002
Vakilian, H., Mirzaei, M., Sharifi Tabar, M., Pooyan, P., Habibi Rezaee, L., Parker, L., … Salekdeh, G. H. (2015). DDX3Y, a male‐specific region of Y chromosome gene, may modulate neuronal differentiation. Journal of Proteome Research, 14(9), 3474–3483. https://doi.org/10.1021/acs.jproteome.5b00512
Valdivia, R. P., Kunieda, T., Azuma, S., & Toyoda, Y. (1993). PCR sexing and developmental rate differences in preimplantation mouse embryos fertilized and cultured in vitro. Molecular Reproduction and Development, 35(2), 121–126. https://doi.org/10.1002/mrd.1080350204
Wang, K., Li, M., & Hakonarson, H. (2010). ANNOVAR: Functional annotation of genetic variants from high‐throughput sequencing data. Nucleic Acids Research, 38(16), e164–e164. https://doi.org/10.1093/nar/gkq603
Watanabe, M., Zinn, A. R., Page, D. C., & Nishimoto, T. (1993). Functional equivalence of human X– and Y–encoded isoforms of ribosomal protein S4 consistent with a role in Turner syndrome. Nature Genetics, 4, 268–271. https://doi.org/10.1038/ng0793‐268
Weston, G., Osianlis, T., Catt, J., & Vollenhoven, B. (2009). Blastocyst transfer does not cause a sex‐ratio imbalance. Fertility and Sterility, 92(4), 1302–1305. https://doi.org/10.1016/j.fertnstert.2008.07.1784
Wu, J., Xu, J., Liu, B., Yao, G., Wang, P., Lin, Z., … Sun, Y. (2018). Chromatin analysis in human early development reveals epigenetic transition during ZGA. Nature, 557(7704), 256–260. https://doi.org/10.1038/s41586‐018‐0080‐8
Wutz, A. (2011). Gene silencing in X‐chromosome inactivation: Advances in understanding facultative heterochromatin formation. Nature Reviews Genetics, 12(8), 542–553. https://doi.org/10.1038/nrg3035
Xue, Z., Huang, K., Cai, C., Cai, L., Jiang, C., Feng, Y., … Fan, G. (2013). Genetic programs in human and mouse early embryos revealed by single‐cell RNA sequencing. Nature, 500(7464), 593–597. https://doi.org/10.1038/nature12364
Yan, L., Yang, M., Guo, H., Yang, L., Wu, J., Li, R., … Tang, F. (2013). Single‐cell RNA‐Seq profiling of human preimplantation embryos and embryonic stem cells. Nature Structural & Molecular Biology, 20(9), 1131–1139. https://doi.org/10.1038/nsmb.2660
Yang, W., Warrington, N. M., Taylor, S. J., Whitmire, P., Carrasco, E., Singleton, K. W., … Rubin, J. B. (2019). Sex differences in GBM revealed by analysis of patient imaging, transcriptome, and survival data. Science Translational Medicine, 11(473), eaao5253. https://doi.org/10.1126/scitranslmed.aao5253
Zinn, A. R., Alagappan, R. K., Brown, L. G., Wool, I., & Page, D. C. (1994). Structure and function of ribosomal protein S4 genes on the human and mouse sex chromosomes. Molecular and Cellular Biology, 14(4), 2485–2492.
Zinn, A. R., Bressler, S. L., Beer‐Romero, P., Adler, D. A., Chapman, V. M., Page, D. C., & Disteche, C. M. (1991). Inactivation of the Rps4 gene on the mouse X chromosome. Genomics, 11(4), 1097–1101.

SUPPORTING INFORMATION

Additional supporting information may be found online in the Supporting Information section at the end of the article.

How to cite this article: Zhou Q, Wang T, Leng L, et al. Single‐cell RNA‐seq reveals distinct dynamic behavior of sex chromosomes during early human embryogenesis. Mol Reprod Dev. 2019;86:871–882. https://doi.org/10.1002/mrd.23162

Supplementary figures

Figure S1. (A-D) Plots of additional stages (E4-E7) for genome-wide expression per chromosome in female (pink box) and male (light blue box) cells. Chromosomal RPKM values were calculated as chromosomal reads per kilobase of transcript per million reads mapped. Chromosomes exhibiting significant differences are marked with a red star if p < 10⁻⁵ in the Mann-Whitney-Wilcoxon test.

Figure S2. (A) Bar chart indicating the number of CpG islands detected on the Y chromosome for all embryos in the DNA methylation dataset. (B-E) Integrative Genomics Viewer (IGV) view of the DNA methylation level near the four reported marker regions (AR, ZDHHC15, SLITRK4, PCSK1N) for determining X chromosome inactivation or activation, including stages from the 8-cell stage to post-implantation. The height of the bars shows the percentage of methylation at each locus, ranging from 0% to 100%. Genomic coordinates: chrX:66,765,297-66,765,584 (AR); chrX:74,694,462-74,694,958 (ZDHHC15); chrX:142,722,666-142,723,065 (SLITRK4); chrX:48,693,322-48,693,661 (PCSK1N).

Figure S3. (A-B) Boxplots showing the expression level of RPS4X (A) and of both RPS4 genes combined (RPS4X and RPS4Y1) (B) in males (light blue) and females (pink) at each embryonic day (E3-E7). Significant differences are marked if p < 0.001 in the Mann-Whitney-Wilcoxon test. (C) Heatmap showing the expression of ribosomal genes that are differentially expressed between male and female embryos at E6.

Figure S4. (A-B) Number of DEGs on each chromosome comparing males and females at the 8-cell stage and late blastocyst in the dataset of Yan et al., stratified by autosomes (green), X chromosome (red), and Y chromosome (blue). (C-G) Distribution of DEGs on each chromosome from E3 to E7 in the other dataset (Petropoulos et al.). Significant enrichment of DEGs on sex chromosomes is marked with a red star (Fisher’s exact test, p < 0.001).

Figure S5. Gene Ontology enrichment results of DEGs at each stage, representing GO terms for biological processes (red bubbles) and molecular function (blue bubbles). The most significant results are summarized at the bottom. x-axis: z-score; y-axis: negative logarithm of the adjusted p-value (provided by DAVID); area of a circle: number of genes assigned to the term.
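The enrichment test named in the legend of Figure S4 can be sketched in Python as a one-sided Fisher's exact test, computed here as the upper tail of the hypergeometric distribution. The legend does not state whether the test was one-sided or two-sided, so the one-sided form is an assumption; the function name and all counts below are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_enrichment_p(deg_on_chrom, deg_total, genes_on_chrom, genes_total):
    """One-sided Fisher's exact p-value for over-representation.

    Probability of observing at least `deg_on_chrom` DEGs on a chromosome
    when `deg_total` DEGs are drawn at random from `genes_total` genes, of
    which `genes_on_chrom` lie on that chromosome (hypergeometric upper tail).
    """
    upper = min(deg_total, genes_on_chrom)
    denominator = comb(genes_total, deg_total)
    tail = sum(
        comb(genes_on_chrom, k) * comb(genes_total - genes_on_chrom, deg_total - k)
        for k in range(deg_on_chrom, upper + 1)
    )
    return tail / denominator


# Hypothetical counts: 20,000 expressed genes, 800 of them X-linked,
# 300 DEGs at one stage, 40 of which are X-linked.
p_x = fisher_enrichment_p(deg_on_chrom=40, deg_total=300,
                          genes_on_chrom=800, genes_total=20_000)
significant = p_x < 0.001  # threshold quoted in the Figure S4 legend
```

Using exact integer binomial coefficients avoids floating-point underflow for the very small tail probabilities that arise when a chromosome carries far more DEGs than expected by chance.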

Figure S1 (image). Panels A-D, titled "Expression of all chromosomes at E4", "at E5", "at E6", and "at E7". One axis lists chr1-chr22, chrX, and chrY; the other shows RPKM of each chromosome. Group sizes given in the panels: M, n=206 and F, n=171; M, n=176 and F, n=290; M, n=202 and F, n=243; M, n=114 and F, n=92.

Figure S2 (image). Panel A: bar chart; y-axis: number of loci (0 to 9000); x-axis: samples (MII oocytes, sperm, cleavage-stage embryos, morula, ICM, TE, postimplantation). Panels B-E: methylation tracks for AR, ZDHHC15, SLITRK4, and PCSK1N across the 4-cell, 8-cell, morula, ICM, TE, and postimplantation samples.

Figure S3 (image). Panel A: RPS4X; panel B: RPS4X+RPS4Y1; y-axis: log2(RPKM); x-axis: E3 to E7. Panel C: heatmap, male versus female, color scale −4 to 4. Genes shown: MRPL13, MRPS10, RPS6KB1, MRPL1, RPS4Y1, MRPL35, MRPL30, MRPL41, MRP63, MRPL55, RPL17, MRPL4, MRPL54, MRPL52, MRPS21, RPS26, RPL37, RPL38, RPL39, RPS28, RPS21, RPL37A, RPL41, RPL31, RPS15A, RPS25, RPS6KB2, RPL7A, RPL3, RPS4X, RPS3, RPL10, RPSA, MRPS12, RPS5, RPS2, RPS19, RPL8, RPL36, RPL35, RPL27, RPL32, RPL19, RPS11, RPS14, RPS18, RPS16, RPL23A, RPL27A, RPS15, RPLP2, RPL18A, RPL28, RPLP1, RPL18, RPL13, RPS10, FAU, RPS9.

Figure S4 (image). Panels A: 8-cell; B: late blastocyst; C: E3; D: E4; E: E5; F: E6; G: E7. x-axis: chromosomes 1-22, X, Y; y-axis: number of significant genes (0 to 120).

Figure S5 (image). Panels: E3, E4, E5, E6, E7. x-axis: z-score; y-axis: −log(adj p-value); a horizontal line marks the threshold. Labeled terms: regulation of cell morphogenesis, nucleosome organization, translational elongation, regulation of lipid transport, response to hyperoxia, cell cycle, sexual reproduction, neuron differentiation, cell cycle phase, chromatin assembly, cell cycle process, epithelial tube morphogenesis, mitotic cell cycle, organelle fission, structural constituent of ribosome, enzyme binding, translation, chromosome organization, generation of precursor metabolites. Summary terms: cell cycle, cell division, tube development, regulation of cell morphogenesis, chromatin assembly, nucleosome organization, translational elongation, metabolism, neuron differentiation, regulation of lipid transport.

The phenotypic and genetic landscape of an infertility cohort with recurrent embryonic developmental arrest

Qing Zhou, MS1,2,*, Wei Zheng, MD3,*, Wen-Jing Wang, PhD1, Karsten Kristiansen, PhD1,2, Ge Lin, MD3,4.

1 BGI-Shenzhen, Shenzhen 518083, China.

2 Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, Copenhagen, Denmark.

3 Reproductive & Genetic Hospital of CITIC-Xiangya, Changsha, China.

4 Institute of Reproductive and Stem Cell Engineering, School of Basic Medical Science, Central South University, Changsha, China.

Corresponding authors:

Ge Lin
Reproductive and Genetic Hospital of CITIC-Xiangya, Changsha, 410078, China
Email: [email protected]

Karsten Kristiansen
Department of Biology, University of Copenhagen, Universitetsparken 13, 2100 Copenhagen, Denmark
BGI-Shenzhen, Shenzhen 518120, China
Email: [email protected]

*these authors contributed equally to this work

ABSTRACT

Purpose: We present the first description of the clinical phenotypes and genetic features of a large cohort with recurrent failure of assisted reproductive technology (ART) treatment due to embryonic developmental arrest. We also discuss paternal genetic influences on poor embryonic development. Methods: We performed both exome sequencing (ES) and genome sequencing (GS) to characterize the parental genetic landscape of 58 clinically well-characterized couples with at least two ART failures due to embryonic developmental arrest. Results: This population is highly heterogeneous, both phenotypically and genetically. We identified likely pathogenic (LP) causative variants in about 6.9% (4/58) of the patients and variants of uncertain significance (VUS) in another 8.6% (5/58). In the paternal data, we found LP or VUS variants in 16 of 21 samples, a detection ratio of 76.2%. Combined with high-resolution GS analysis, this gave an improved diagnostic rate of 15% in our study population. Conclusions: Our data provide evidence for incorporating GS into the clinical workup of people with a history of ART failure and show its potential to identify new pathogenic variants or genes in this population. Our findings suggest that GS is at least equivalent to ES, and in some respects superior, and that it could improve the genetic counselling of couples in relation to reproductive health.

Keywords: infertility, genome sequencing, assisted reproductive technology, embryonic developmental arrest, structural variant

INTRODUCTION

Infertility is a critical component of reproductive health. Its global prevalence is estimated at about 15% on average (refs. 1,2), with rates varying between regions and populations1,3. In the clinic, assisted reproductive technology (ART) has given couples with infertility more chances of conception. However, any abnormality during the ART process can affect the outcome4, and the low pregnancy success rate remains a major concern in ART treatment. After fertilization and in vitro culture, only embryos of excellent morphological quality are chosen for transfer and implantation. The most common features of poor embryonic development are embryo fragmentation, developmental arrest and unsuccessful blastocyst formation. Developmental arrest is a condition in which embryos stop developing at a certain stage, have fewer cells than expected for their developmental time, and usually cannot form a normal blastocyst5. Poor embryonic development in vitro reduces the number of embryos available for transfer and thus the chance of a successful pregnancy, resulting in recurrent ART failure.

Many studies have investigated poorly developed embryos to understand the underlying mechanisms. Notably, fragmented or arrested embryos differ significantly in their levels of chromosomal abnormalities6-8 and in the amount of mitochondrial DNA released into the culture medium9. Spindle abnormalities10, abnormal nuclei11 and mitochondrial dysfunction12 can also result in embryonic aneuploidy and developmental arrest. At the molecular level, higher expression of a wide range of apoptosis-related genes is found in fragmented embryos13,14. Failure of embryonic genome activation (EGA) also contributes to abnormal development15. Moreover, poor embryo quality has been reported to be due in part to aberrant DNA methylation, either inherited as an immature landscape from the gametes or arising from improper epigenetic reprogramming16. In short, a mistake or abnormality during any key developmental event can result in low-quality embryos.

Several maternal genetic causes of abnormal oocytes or embryonic developmental failure have now been reported. Women with biallelic mutations in PATL2, which encodes an oocyte-specific translational repressor, cannot produce fertilizable oocytes because all of their oocytes are immature and arrest at the germinal vesicle (GV) stage17-19. Similarly, homozygous or heterozygous mutations in TUBB8 interfere with oocyte maturation by causing oocyte meiotic arrest20. Mammalian oocytes are normally surrounded by a zona pellucida (ZP), a glycoprotein matrix that is formed by ZP family proteins and is essential for oogenesis, fertilization and preimplantation development21,22. Many studies have indicated that sequence variations in the ZP family genes may cause ZP anomalies and result in female infertility22-26. In addition to immaturity or changes in physical structure, some oocytes gradually degenerate or die soon after retrieval, a phenotype termed “oocyte death”. Maternal heterozygous mutations in PANX1 have been found to be responsible for this subtype of infertility27. Even when oocytes with normal morphology are retrieved, maternal homozygous mutations in WEE2 may cause fertilization failure and thus infertility28. After fertilization, embryonic lethality or developmental arrest is one of the major features of poor embryonic development. Several genetic determinants, such as PADI6 (ref. 29) and TLE6 (ref. 30), have been identified through genomic sequencing of patients with repeated failed ART cycles. Homozygous mutations in KHDC3L are a pathogenic cause of complete hydatidiform mole (CHM)31, and they have also been detected in patients whose embryos arrested at the morula stage32. Biallelic mutations in NLRP2 and NLRP5 have likewise been found in female infertility characterised by early embryonic arrest33.

In this study, we comprehensively describe the clinical phenotypes and genetic features of a cohort of 58 clinically well-characterized couples with at least two ART failures due to embryonic developmental arrest. We performed both exome sequencing (ES) and genome sequencing (GS) to characterize the parental genetic landscapes. Our data provide evidence for incorporating GS into the clinical workup of people with a history of ART failure and show its potential to identify new pathogenic variants or genes for this subtype of infertility.

MATERIALS AND METHODS

Patient recruitment

This study was approved and guided by the ethics committee of the Reproductive & Genetic Hospital of CITIC-Xiangya. Couples with at least two ART failures due to embryonic developmental arrest were recruited at the hospital between 2014 and 2018. Clinical information was collected anonymously from electronic clinical records. All samples were collected with written informed consent signed by the recruited couples.

Sequencing, variant detection and annotation

Genomic DNA was extracted from blood with the DNeasy Blood & Tissue Kit. Libraries were prepared following the standard protocol of the BGISeq-500 library kit and analyzed on a Fragment Analyzer to verify fragment size and distribution. After quality control, the library of each sample was sequenced on a BGISeq-500 platform with a minimum of ~600 million paired-end 100 base pair reads (~30X coverage with ~120 Gb raw data per sample). Variants were detected using Burrows–Wheeler Aligner (version 0.7.16; ref. 34) and Genome Analysis Toolkit (version 3.4-46; refs. 35,36). Structural variants (SVs) were identified using SpeedSeq (ref. 37) and CNVnator (ref. 38), which were used to detect small SVs and copy number variants (CNVs) larger than 100 kb, respectively. Common variants, defined as those with an allele frequency above 1% in either a public or an in-house population database, were filtered out, and the remaining variants were annotated and categorized with ANNOVAR (ref. 39).
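The allele-frequency filter described above can be sketched as follows. This is a minimal illustration in Python, not the pipeline used in the study: the `Variant` record, the database labels `public_db` and `in_house_db`, and the gene names and frequencies are all hypothetical. A variant is treated as common, and dropped, if any database reports an allele frequency above 1%; a variant absent from every database is kept.

```python
from dataclasses import dataclass, field
from typing import Dict, List

AF_THRESHOLD = 0.01  # the 1% cutoff stated in the text


@dataclass
class Variant:
    """Hypothetical variant record used only for this illustration."""
    gene: str
    change: str
    # Allele frequency per population database; a database that never
    # observed the variant is simply absent from the mapping.
    allele_freqs: Dict[str, float] = field(default_factory=dict)


def is_rare(variant: Variant, threshold: float = AF_THRESHOLD) -> bool:
    """True if no database reports a frequency above the threshold."""
    return all(af <= threshold for af in variant.allele_freqs.values())


def filter_common(variants: List[Variant],
                  threshold: float = AF_THRESHOLD) -> List[Variant]:
    """Drop variants that are common in at least one database."""
    return [v for v in variants if is_rare(v, threshold)]


# Made-up example records (not data from the cohort).
variants = [
    Variant("GENE_A", "c.100A>G", {"public_db": 0.15, "in_house_db": 0.12}),
    Variant("GENE_B", "c.200C>T", {"public_db": 0.0004}),
    Variant("GENE_C", "c.300G>A", {"public_db": 0.002, "in_house_db": 0.03}),
    Variant("GENE_D", "c.400del", {}),  # not seen in any database
]

rare = filter_common(variants)
print([v.gene for v in rare])
```

Here GENE_C is removed even though it is rare in the public database, because the filter applies to either database; only variants passing this step would go on to annotation.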

Variant classification and validation

Candidate pathogenic genes for females were selected from the published literature (Table 1). Since no variant or gene involved in embryonic developmental arrest has been reported for males, paternal candidate genes were collected from sperm proteins functionally involved in preimplantation embryonic development40 and from null-mouse data in the Mouse Genome Informatics (MGI) database41. Variants affecting known genes potentially related to the phenotype were interpreted according to the American College of Medical Genetics and Genomics (ACMG) guidelines42. Selected variants were validated by Sanger sequencing on an AB3730 capillary sequencer. De novo and compound variants were also confirmed in the patients and their parents.

RESULTS

Clinical characteristics of the cohort

The recruited women were 22 to 44 years old and had, on average, a five-year history of infertility of unknown cause (Table 2). Approximately nine oocytes were retrieved in each ART cycle, ranging from 2 to 35 depending on ovarian response. A small number of women received therapeutic donor insemination (TDI) when the husband had been clearly diagnosed with infertility. In vitro fertilization (IVF), intracytoplasmic sperm injection (ICSI) and a combination of IVF and ICSI accounted for 60.17%, 34.75% and 5.08% of ART cycles, respectively. Regardless of fertilization method, the average fertilization rate was about 47.62%. Although an average of four fertilized embryos were generated per cycle, they all arrested at the 4-cell or 8-cell stage, before blastocyst formation. As a result, none of the couples obtained embryos for transfer at the blastocyst stage.

Genetic findings

To investigate genetic influences on embryonic developmental arrest, we obtained genetic information for all of these women using GS. For 21 of the couples, we additionally performed GS on the husband. From the literature, we compiled a list of pathogenic genes associated with embryonic developmental failure (Table 1) and categorized the variants detected in our cohort accordingly. We detected only novel variants in known genes, rather than recurrent variants at reported loci. We identified likely pathogenic (LP) causative variants in about 6.9% (4/58) of the patients and variants of uncertain significance (VUS) in another 8.6% (5/58). As expected, we detected likely pathogenic variants only in genes with a reported phenotype of developmental arrest, such as TLE6, PADI6 and KHDC3L, rather than in genes influencing gamete maturation or fertilization (Table 3). These findings are consistent with the clinical characteristics of these patients: the average fertilization rate of their oocytes is normal, whereas their embryos arrest at a later developmental stage.
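As a check on the arithmetic, the yields quoted above can be recomputed from the counts given in the text. The sketch below is illustrative Python; the helper `percent` is a hypothetical name, and treating the sum of LP and VUS carriers (9/58) as the basis of the diagnostic rate of about 15% reported in this study is an assumption, since that calculation is not spelled out.

```python
def percent(count: int, total: int, digits: int = 1) -> float:
    """Return count/total as a percentage rounded to `digits` decimals."""
    return round(100 * count / total, digits)


# Counts taken from the text.
n_women = 58   # women in the cohort
n_lp = 4       # women with a likely pathogenic (LP) variant
n_vus = 5      # women with a variant of uncertain significance (VUS)

lp_rate = percent(n_lp, n_women)
vus_rate = percent(n_vus, n_women)

# Assumption: the overall figure counts LP and VUS carriers together.
combined_rate = percent(n_lp + n_vus, n_women)

print(f"LP: {lp_rate}%  VUS: {vus_rate}%  LP+VUS: {combined_rate}%")
```

Under that assumption the combined figure is 15.5%, which is in line with the rounded 15% quoted for the cohort.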

By analysing the GS data, we confirmed the majority of the ES results and additionally detected two SVs located within cis-regulatory elements of a reported pathogenic gene, PADI6, in patients I1 and I173. Although these were classified as VUS, they may provide a basis for further validation experiments and new insights into the pathogenic mechanism. In patient I1, we also detected a homozygous LP variant in another gene, TLE6, a reported pathogenic gene in infertile women with embryonic developmental arrest. Notably, the fertilization rate for patient I1 was 100% (18/18), yet all embryos arrested before the 8-cell stage. In patient I173, another likely pathogenic variant was found in TUBB8, which is involved in maintaining normal oocyte function. Although patient I173 yielded more oocytes during her two ART cycles, her fertilization rate was as low as 25.8% (8/31), and all embryos arrested between the 4-cell and 8-cell stages with compaction. Another patient (I2) carrying a TUBB8 variant also had a low fertilization rate and showed embryonic arrest as well as fragmentation. KHDC3L is a reported autosomal recessive pathogenic gene associated with recurrent hydatidiform mole31,43 and developmental arrest32. SNV analysis detected a heterozygous missense mutation in this gene in patient I211, and the GS data additionally revealed a small deletion in the same gene. Together, these compound variants may affect both copies of KHDC3L and thus lead to embryonic developmental arrest.
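The reasoning applied to KHDC3L, where two different heterozygous variants in a recessive gene may together affect both copies, can be sketched as a simple screen. The calls below are transcribed from Table 3; the `Call` record and the function `biallelic_candidates` are hypothetical names, and the gene set is restricted to three autosomal recessive genes from Table 1 purely for illustration. The screen cannot tell whether two heterozygous variants lie on different copies of the gene, so it flags them only as possible compound heterozygotes; confirming this requires phasing, for example by testing the parents.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class Call:
    """Hypothetical record for one variant call in one sample."""
    sample: str
    gene: str
    variant: str
    zygosity: str  # "hom" or "het"


def biallelic_candidates(calls: Iterable[Call],
                         recessive_genes: Set[str]) -> Dict[Tuple[str, str], str]:
    """Flag (sample, gene) pairs where both copies of a recessive gene may be hit."""
    grouped = defaultdict(list)
    for call in calls:
        if call.gene in recessive_genes:
            grouped[(call.sample, call.gene)].append(call)

    flagged = {}
    for key, group in grouped.items():
        if any(c.zygosity == "hom" for c in group):
            flagged[key] = "homozygous"
        elif len({c.variant for c in group if c.zygosity == "het"}) >= 2:
            # Two distinct het variants: phase is unknown at this point.
            flagged[key] = "possible compound heterozygous"
    return flagged


# Illustrative subset of autosomal recessive genes from Table 1.
RECESSIVE_GENES = {"TLE6", "PADI6", "KHDC3L"}

# Calls transcribed from Table 3.
calls = [
    Call("I1", "TLE6", "c.222G>C", "hom"),
    Call("I2", "TUBB8", "c.1055C>T", "het"),
    Call("I211", "KHDC3L", "c.245A>T", "het"),
    Call("I211", "KHDC3L", "c.*18_*315del0", "het"),
]

flagged = biallelic_candidates(calls, RECESSIVE_GENES)
for (sample, gene), reason in sorted(flagged.items()):
    print(sample, gene, reason)
```

The single heterozygous TUBB8 call in I2 is not flagged, because TUBB8 is not in the recessive set used here and one heterozygous variant alone would not satisfy a recessive model.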

Paternal genetic factors

We also performed GS for the male partners in our cohort. Although the genetic basis of male infertility due to abnormal spermatogenesis has been extensively reported44,45, the association between paternal genetic factors and poor embryonic development is not clear. Sperm proteins are, however, functionally involved in fertilization and preimplantation embryonic development40. Based on Gene Ontology annotation and null-mouse data from the Mouse Genome Informatics (MGI) database41, 103 proteins have been identified as involved in fertilization, and 59 sperm proteins were selected as having related roles in embryonic development40. We then summarised the genetic findings for these genes in our paternal genomic data. In total, we found LP or VUS variants in 16 samples, a detection ratio of 76.2%. Most were heterozygous missense variants in genes involved in multiple key processes of early embryonic development.

DISCUSSION

Taking advantage of GS technology and high-resolution analysis, we present the first description of the clinical phenotypes and genetic features of a large population of couples with recurrent ART failure due to embryonic developmental arrest. By combining high-resolution GS analysis with paternal genetic data, we profiled the landscape of variants in reported pathogenic genes, including SNVs, indels and SVs. In addition, we describe genetic variants in crucial paternal genes regulating early embryonic development, providing new insight into parental genetic influences on recurrent embryonic developmental failure.

Clinical ES has emerged as a powerful genetic diagnostic tool because it dramatically increases the diagnostic yield for suspected genetic disorders compared with multigene panel sequencing46,47. GS, in turn, offers more uniform coverage than ES and can increase the accuracy of variant detection in exonic regions48. Our data demonstrate the ability of GS not only to identify the diagnostic variants detected by ES, but also to interpret deep intronic and other noncoding SNVs, as well as functional cis-regulatory SVs in intergenic regions. Our findings suggest that GS is at least equivalent to ES, and in some respects superior, with an improved diagnostic rate of 15% in our study population.

In contrast to the extensively reported genetic basis of male infertility, only a few studies have reported genetic causes of female infertility. The effects of these pathogenic variants include arrest of oocyte maturation, abnormal oocyte structure, oocyte death, fertilization failure and early embryonic developmental arrest (Table 1). When we interpreted all of these reported genes in our cohort, we found no variants in genes leading to oocyte immaturity or abnormal physical structure. This is consistent with the phenotype of our recruited couples, all of whom failed ART treatment because of recurrent embryonic developmental arrest, with normal oocyte morphology and fertilization rate. Interestingly, we identified LP variants in various genes, indicating high genetic heterogeneity. In addition, compared with other patients, individuals with variants in TUBB8, a gene reported to maintain normal oocyte function20,49,50, had lower fertilization rates during ART treatment and showed embryonic fragmentation or compaction in addition to developmental arrest. Combined with the clinical features, our interpretation results reveal a clear genotype-phenotype association in the diagnosed individuals.

Our data also provide evidence for incorporating GS into the clinical workup of people with a history of ART failure and show its potential to identify new pathogenic variants or genes in the infertile population. Our findings support the potential application of GS in clinical diagnosis, where it could improve the genetic counselling of couples in relation to reproductive health. Still, a fuller understanding of each individual’s whole genomic information and studies of larger populations are needed.

In summary, although the main phenotype of this population is embryonic developmental arrest, our results show that it is highly heterogeneous, both phenotypically and genetically. By combining high-resolution GS analysis with paternal genetic data, we have characterized the genetic landscape of these patients. Our findings support the potential application of GS in clinical diagnosis, which could contribute to improved genetic counselling of couples with a history of ART failure.

ACKNOWLEDGMENTS

The authors thank the members of the research group at BGI for helpful discussions, the colleagues at China National GeneBank for their kind assistance with sequencing, and the recruited patients for participating in this study and donating samples.

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest.

REFERENCES

1. Boivin J, Bunting L, Collins JA, Nygren KG. International estimates of infertility prevalence and treatment-seeking: potential need and demand for infertility medical care. Hum Reprod. 2007;22(6):1506-1512.
2. Mascarenhas MN, Flaxman SR, Boerma T, Vanderpoel S, Stevens GA. National, regional, and global trends in infertility prevalence since 1990: a systematic analysis of 277 health surveys. PLoS Med. 2012;9(12):e1001356.
3. Ombelet W, Cooke I, Dyer S, Serour G, Devroey P. Infertility and the provision of infertility medical services in developing countries. Hum Reprod Update. 2008;14(6):605-621.
4. Zegers-Hochschild F, Adamson GD, de Mouzon J, et al. The International Committee for Monitoring Assisted Reproductive Technology (ICMART) and the World Health Organization (WHO) Revised Glossary on ART Terminology, 2009. Hum Reprod. 2009;24(11):2683-2687.
5. Gardner DK, Lane M. Culture and selection of viable blastocysts: a feasible proposition for human IVF? Hum Reprod Update. 1997;3(4):367-382.
6. Munne S, Grifo J, Cohen J, Weier HU. Chromosome abnormalities in human arrested preimplantation embryos: a multiple-probe FISH study. Am J Hum Genet. 1994;55(1):150-159.
7. Munne S, Alikani M, Tomkin G, Grifo J, Cohen J. Embryo morphology, developmental rates, and maternal age are correlated with chromosome abnormalities. Fertil Steril. 1995;64(2):382-391.
8. Dekel-Naftali M, Aviram-Goldring A, Litmanovitch T, et al. Chromosomal integrity of human preimplantation embryos at different days post fertilization. J Assist Reprod Genet. 2013;30(5):633-648.
9. Stigliani S, Anserini P, Venturini PL, Scaruffi P. Mitochondrial DNA content in embryo culture medium is significantly associated with human embryo fragmentation. Hum Reprod. 2013;28(10):2652-2660.
10. Chatzimeletiou K, Morrison EE, Prapas N, Prapas Y, Handyside AH. Spindle abnormalities in normally developing and arrested human preimplantation embryos in vitro identified by confocal laser scanning microscopy. Hum Reprod. 2005;20(3):672-682.
11. Kort DH, Chia G, Treff NR, et al. Human embryos commonly form abnormal nuclei during development: a mechanism of DNA damage, embryonic aneuploidy, and developmental arrest. Hum Reprod. 2015;31(2):312-323.
12. Thouas GA, Trounson AO, Wolvetang EJ, Jones GM. Mitochondrial dysfunction in mouse oocytes results in preimplantation embryo arrest in vitro. Biol Reprod. 2004;71(6):1936-1942.
13. Jurisicova A, Antenos M, Varmuza S, Tilly JL, Casper RF. Expression of apoptosis-related genes during human preimplantation embryo development: potential roles for the Harakiri gene product and Caspase-3 in blastomere fragmentation. Mol Hum Reprod. 2003;9(3):133-141.
14. Metcalfe AD, Hunter HR, Bloor DJ, et al. Expression of 11 members of the BCL-2 family of apoptosis regulatory molecules during human preimplantation embryo development and fragmentation. Mol Reprod Dev. 2004;68(1):35-50.
15. Song BS, Lee SH, Kim SU, et al. Nucleologenesis and embryonic genome activation are defective in interspecies cloned embryos between bovine ooplasm and rhesus monkey somatic cells. BMC Dev Biol. 2009;9:44.
16. Kishigami S, Van Thuan N, Hikichi T, et al. Epigenetic abnormalities of the mouse paternal zygotic genome associated with microinsemination of round spermatids. Dev Biol. 2006;289(1):195-205.
17. Chen B, Zhang Z, Sun X, et al. Biallelic Mutations in PATL2 Cause Female Infertility Characterized by Oocyte Maturation Arrest. Am J Hum Genet. 2017;101(4):609-615.
18. Maddirevula S, Coskun S, Alhassan S, et al. Female Infertility Caused by Mutations in the Oocyte-Specific Translational Repressor PATL2. Am J Hum Genet. 2017;101(4):603-608.
19. Christou-Kent M, Kherraf ZE, Amiri-Yekta A, et al. PATL2 is a key actor of oocyte maturation whose invalidation causes infertility in women and mice. EMBO Mol Med. 2018;10(5).
20. Feng R, Sang Q, Kuang Y, et al. Mutations in TUBB8 and Human Oocyte Meiotic Arrest. N Engl J Med. 2016;374(3):223-232.
21. Conner SJ, Lefievre L, Hughes DC, Barratt CL. Cracking the egg: increased complexity in the zona pellucida. Hum Reprod. 2005;20(5):1148-1152.
22. Avella MA, Baibakov B, Dean J. A single domain of the ZP2 zona pellucida protein mediates gamete recognition in mice and humans. J Cell Biol. 2014;205(6):801-809.
23. Margalit M, Paz G, Yavetz H, et al. Genetic and physiological study of morphologically abnormal human zona pellucida. Eur J Obstet Gynecol Reprod Biol. 2012;165(1):70-76.
24. Huang HL, Lv C, Zhao YC, et al. Mutant ZP1 in familial infertility. N Engl J Med. 2014;370(13):1220-1226.
25. Chen T, Bian Y, Liu X, et al. A Recurrent Missense Mutation in ZP3 Causes Empty Follicle Syndrome and Female Infertility. Am J Hum Genet. 2017;101(3):459-465.
26. Liu W, Li K, Bai D, et al. Dosage effects of ZP2 and ZP3 heterozygous mutations cause human infertility. Hum Genet. 2017;136(8):975-985.
27. Sang Q, Zhang Z, Shi J, et al. A pannexin 1 channelopathy causes human oocyte death. Sci Transl Med. 2019;11(485):eaav8731.
28. Sang Q, Li B, Kuang Y, et al. Homozygous Mutations in WEE2 Cause Fertilization Failure and Female Infertility. Am J Hum Genet. 2018;102(4):649-657.
29. Xu Y, Shi Y, Fu J, et al. Mutations in PADI6 Cause Female Infertility Characterized by Early Embryonic Arrest. Am J Hum Genet. 2016;99(3):744-752.
30. Alazami AM, Awad SM, Coskun S, et al. TLE6 mutation causes the earliest known human embryonic lethality. Genome Biol. 2015;16:240.
31. Fallahian M, Sebire NJ, Savage PM, Seckl MJ, Fisher RA. Mutations in NLRP7 and KHDC3L confer a complete hydatidiform mole phenotype on digynic triploid conceptions. Hum Mutat. 2013;34(2):301-308.
32. Wang X, Song D, Mykytenko D, et al. Novel mutations in genes encoding subcortical maternal complex proteins may cause human embryonic developmental arrest. Reprod Biomed Online. 2018;36(6):698-704.
33. Mu J, Wang W, Chen B, et al. Mutations in NLRP2 and NLRP5 cause female infertility characterised by early embryonic arrest. J Med Genet. 2019;56(7):471.
34. Li H, Durbin R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics. 2010;26(5):589-595.
35. DePristo MA, Banks E, Poplin R, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43:491.
36. Ren S, Bertels K, Al-Ars Z. Efficient Acceleration of the Pair-HMMs Forward Algorithm for GATK HaplotypeCaller on Graphics Processing Units. Evol Bioinform Online. 2018;14:1176934318760543.
37. Chiang C, Layer RM, Faust GG, et al. SpeedSeq: ultra-fast personal genome analysis and interpretation. Nat Methods. 2015;12(10):966-968.
38. Abyzov A, Urban AE, Snyder M, Gerstein M. CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Res. 2011;21(6):974-984.
39. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164.
40. Castillo J, Jodar M, Oliva R. The contribution of human sperm proteins to the development and epigenome of the preimplantation embryo. Hum Reprod Update. 2018;24(5):535-555.
41. Blake JA, Eppig JT, Richardson JE, Davisson MT. The Mouse Genome Database (MGD): expanding genetic and genomic resources for the laboratory mouse. The Mouse Genome Database Group. Nucleic Acids Res. 2000;28(1):108-111.
42. Richards S, Aziz N, Bale S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17(5):405-424.
43. Parry DA, Logan CV, Hayward BE, et al. Mutations causing familial biparental hydatidiform mole implicate c6orf221 as a possible regulator of genomic imprinting in the human oocyte. Am J Hum Genet. 2011;89(3):451-458.
44. Skakkebaek NE, Rajpert-De Meyts E, Buck Louis GM, et al. Male Reproductive Disorders and Fertility Trends: Influences of Environment and Genetic Susceptibility. Physiol Rev. 2016;96(1):55-97.
45. Ray PF, Toure A, Metzler-Guillemain C, et al. Genetic abnormalities leading to qualitative defects of sperm morphology or function. Clin Genet. 2017;91(2):217-232.
46. Lee H, Deignan JL, Dorrani N, et al. Clinical exome sequencing for genetic identification of rare Mendelian disorders. JAMA. 2014;312(18):1880-1887.
47. Trujillano D, Bertoli-Avella AM, Kumar Kandaswamy K, et al. Clinical exome sequencing: results from 2819 samples reflecting 1000 families. Eur J Hum Genet. 2017;25(2):176-182.
48. Lelieveld SH, Spielmann M, Mundlos S, Veltman JA, Gilissen C. Comparison of Exome and Genome Sequencing Technologies for the Complete Capture of Protein-Coding Regions. Hum Mutat. 2015;36(8):815-822.
49. Feng R, Yan Z, Li B, et al. Mutations in TUBB8 cause a multiplicity of phenotypes in human oocytes and early embryos. J Med Genet. 2016;53(10):662-671.
50. Yuan P, Zheng L, Liang H, et al. A novel mutation in the TUBB8 gene is associated with complete cleavage failure in fertilized eggs. J Assist Reprod Genet. 2018;35(7):1349-1356.

Table 1. Summary of pathogenic genes of female infertility and reported phenotypes.

Phenotype | Gene | Inheritance | Reference
Oocytes: arrest at germinal vesicle | PATL2 | AR | Chen et al, 2017; Maddirevula et al, 2017; Christou-Kent et al, 2018
Oocytes: meiotic arrest | TUBB8 | AR/AD | Feng et al, 2016
Oocytes: oocyte death | PANX1 | AD | Sang et al, 2019
Fertilization failure | ZP1 | AR | Huang et al, 2014
Fertilization failure | ZP2 | AD | Avella et al, 2014; Liu et al, 2017
Fertilization failure | ZP3 | AD | Chen et al, 2017; Liu et al, 2017
Fertilization failure | WEE2 | AR | Sang et al, 2018
Embryos: developmental arrest | PADI6 | AR | Xu et al, 2016
Embryos: developmental arrest | TLE6 | AR | Alazami et al, 2015
Embryos: developmental arrest | KHDC3L | AR | Wang et al, 2018
Embryos: developmental arrest | NLRP2 | AR | Mu et al, 2019
Embryos: developmental arrest | NLRP5 | AR | Mu et al, 2019

AR: autosomal recessive; AD: autosomal dominant.

Table 2. Summary of main clinical features of our infertility cohort in this study.

Main clinical features | Number of cases/average value
Age, median (range) | 30 (20-44)
BMI, median (range) | 21.36 (16.36-29.29)
Years of infertility, median (range) | 5 (2-18)
Number of oocytes, median (range) | 9 (4-35)
TDI | 17/118 (14.41%)
Fertilization: IVF | 71/118 (60.17%)
Fertilization: ICSI | 41/118 (34.75%)
Fertilization: IVF+ICSI | 6/118 (5.08%)
Fertilization rate, mean (range) | 47.62% (0-100%)
Number of fertilized embryos, mean (range) | 4 (0-17)
Number of embryos for transfer | 0

a, value collected and calculated per cycle; BMI: body mass index; IVF: in vitro fertilization; ICSI: intracytoplasmic sperm injection; TDI: therapeutic donor insemination

Table 3. Summary of main clinical features and genetic findings in infertile women.

No. | Sample | Age | Cycles | Number of embryos (oocytes) | Fertilization | Main clinical features | Gene | Reported inheritance pattern | Genomic variant(s) (zygosity) [transcript] | ACMG category
1 | I1* | 29 | 2 | 18 (18) | IVF | All embryos arrest at 4C or 5C. | TLE6 | AR | c.222G>C; p.(Q74H) (hom) [NM_001143986] | Likely pathogenic
1 | I1* | | | | | | PADI6 | AR | 2bp-del (hom); upstream cis-element | Uncertain significance
2 | I2* | 33 | 2 | 9 (21) | IVF, ICSI | Embryos arrest from 3C to 6C with fragmentation | TUBB8 | AR, AD | c.1055C>T; p.(A352V) (het) [NM_177987] | Likely pathogenic
3 | I173* | 27 | 2 | 8 (31) | IVF | All embryos arrest from 4C to 8C with compaction | TUBB8 | AR, AD | c.993C>G; p.(F331L) (het) [NM_177987] | Likely pathogenic
3 | I173* | | | | | | PADI6 | AR | 2bp-del (hom); upstream cis-element | Uncertain significance
4 | I211 | 38 | 2 | 10 (21) | IVF, ICSI | TDI; all embryos arrest from 2C to 5C | KHDC3L | AR | c.245A>T; p.(N82I) (het) [NM_001017361] | Likely pathogenic
4 | I211 | | | | | | KHDC3L | AR | c.*18_*315del0 (het) [NM_001017361] | Likely pathogenic

ACMG, American College of Medical Genetics and Genomics; AD, autosomal dominant; AR, autosomal recessive; bp, base pair; BMI, body mass index; del, deletion; het, heterozygous; hom, homozygous; ICSI, intracytoplasmic sperm injection; IVF, in vitro fertilization; TDI, therapeutic donor insemination.

* Exonic variant results confirmed by both ES and GS